Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Maya.

Asked: January 18, 2001 - 11:37 pm UTC

Last updated: November 01, 2006 - 8:44 am UTC

Version: 8.0.4

Viewed 1000+ times

You Asked

Hello,

We use CBO (FIRST_ROWS)
When trying to tune SQL statements we found that using the RULE hint make ,most of the time, a better plan.

1.
I tried that hint on a simple small table with an unique index, with a statment contains a "where" clause using index columns. Somehow the RULE hint does not create the expected plan, with an index scan, not full table scan.

What am I missing here?

here is what I've done:

SQL> CREATE TABLE MYTAB (
2 USER_ID VARCHAR2 (7) NOT NULL,
3 CR_ID NUMBER (8) NOT NULL,
4 NOTIFICATIONS NUMBER (8) NOT NULL)
5 TABLESPACE USERS;

Table created.

SQL> CREATE UNIQUE INDEX myind1 ON mytab (USER_ID,CR_ID);

Index created.

SQL> set autotrace traceonly explain
SQL> SELECT
2 *
3 FROM
4 mytab
5 WHERE USER_ID = 1 and CR_ID = 1 --and
6 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=1 Bytes=49)

1 0 TABLE ACCESS (FULL) OF 'MYTAB' (Cost=1 Card=1 Bytes=49)


SQL> SELECT /*+RULE*/
2 *
3 FROM
4 mytab
5 WHERE USER_ID = 1 and CR_ID = 1
6 /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 TABLE ACCESS (FULL) OF 'MYTAB'





2.
For the same table I created an index myind (instead myind1), which is almost the same but the order of columns is different.
why do I get a different plan from the above, the optimizer makes an index scan?


SQL> drop index myind1;

Index dropped.

SQL> CREATE UNIQUE INDEX myind1 ON mytab (CR_ID,USER_ID);

Index created.
SQL> SELECT
2 *
3 FROM
4 mytab
5 WHERE USER_ID = 1 and CR_ID = 1;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=1 Bytes=49)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'MYTAB' (Cost=1 Card=1 By
tes=49)

2 1 INDEX (RANGE SCAN) OF 'MYIND1' (UNIQUE) (Cost=2 Card=1)


Thanks,

Maya

and Tom said...

My experience, given that the statistics on the tables are current, is the opposite in 8.x -- but anyway... here are the answers to the questions posed:

1) because user_id is a varchar2(7) and the query:

SQL> SELECT /*+RULE*/
2 *
3 FROM
4 mytab
5 WHERE USER_ID = 1 and CR_ID = 1
6 /


is the same as:

SQL> SELECT /*+RULE*/
2 *
3 FROM
4 mytab
5 WHERE TO_NUMBER(USER_ID) = 1 and CR_ID = 1
6 /

you are implicitly applying a function to the user_id column, making the use of an index impossible in this case. You should ALWAYS avoid implicit conversions -- here is a small example why:

tkyte@TKYTE816> create table t ( x varchar2(1), y int );
Table created.

tkyte@TKYTE816> insert into t values ( 'a', 1 );
1 row created.

tkyte@TKYTE816> insert into t values ( '1', 2 );
1 row created.

tkyte@TKYTE816> select *
2 from t
3 where x = 1 and y = 2
4 /

X Y
- ----------
1 2

tkyte@TKYTE816> select *
2 from t
3 where y = 2 and x = 1
4 /
where y = 2 and x = 1
*
ERROR at line 3:
ORA-01722: invalid number

Note in the RBO -- the first query works. Y=2 is evaluated AND THEN to_number(x) = 1 is evaluated. Since when y=2, x=1 -- this works.

On the other hand, the second query fails. Here to_number(x)=1 is evaluated FIRST and since X='a' for some rows -- the to_number fails.

Just to show you cannot rely on the order of evaluation of a predicate in any circumstances -- consider this:


tkyte@TKYTE816> analyze table t compute statistics;
Table analyzed.

tkyte@TKYTE816> select *
2 from t
3 where x = 1 and y = 2
4 /
where x = 1 and y = 2
*
ERROR at line 3:
ORA-01722: invalid number


tkyte@TKYTE816> select *
2 from t
3 where y = 2 and x = 1
4 /

X Y
- ----------
1 2

tkyte@TKYTE816>

The CBO will evaulate the predicate is a totally different order! The order of evaluation, since it is by definition NOT DEFINED in SQL cannot be relied on. The only correct way to write the above query safely is:

select * from t where y=2 and x = '1';

Compare STRINGS to STRINGS, DATES to DATES and NUMBERS to NUMBERS. Never let an implicit conversion happen!




2) for this one, the optimizer found CR_ID on the leading edge of the index and was able to use that. It is NOT using USER_ID in this index (as evidenced by the fact that it is doing an index range scan on this UNIQUE index). That query is alot like:


select *
from ( select * from mytab where cr_id = 1 )
where to_number(user_id) = 1
/

The optimizer is using the index to do the inner query and THEN it applies the to_number function to the user_id column and sees if that equals 1.


Rating

  (11 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A twist on this

Steve McKee, February 12, 2002 - 11:02 am UTC

Hi Tom,

I understand that we cannot rely on the order in which
predicates will be evaluated. However, why can't we rely on the following:

select to_number(x)
from t
where y=2;


We have this exact situation and are getting conversion
errors even though we eliminate "invalid" rows in the
where clause. Note that the behaviour is dependent on whether or not there are statistics on the table(s).

Can you explain why (in some cases) every row is hit with the to_number() function even though we filter out the invalid rows.



Tom Kyte
February 12, 2002 - 11:04 am UTC

can you show us how to reproduce easily?

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( x varchar2(5), y int );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 1, 2 );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 'a', 3 );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select to_number(x)
  2    from t
  3   where y = 2;

TO_NUMBER(X)
------------
           1

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table t compute statistics;

Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select to_number(x)
  2    from t
  3   where y = 2;

TO_NUMBER(X)
------------
           1


 

twist - followup

Steve McKee, February 12, 2002 - 3:32 pm UTC

I apologize in advance if the formatting is a mess.


Here's the query that is failing:


select count(*)
from ppcdrdata pp,
     (
        select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME
        from   site_data
        where  name='CDE::FROM_DATE'
     ) CDE_START,
     (
        select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
       from   site_data
       where  name='CDE::TO_DATE'
    ) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
      pp.dws_creation_date <  CDE_STOP.STOP_TIME

SQL> /

select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
                              *
ERROR at line 9:
ORA-01847: day of month must be between 1 and last day of month






The "raw" value of "value" column for rows with 
name="CDE::TO_DATE"  and  name="CDE::FROM_DATE"
is:


SQL> select '[ ' || value || ' ]' 
     from   site_data  
     where  name='CDE::TO_DATE';

'['||VALUE||']'
--------------------------
[ 29-Jan-2002 06:54:04 ]




SQL>  select '[ ' || value || ' ]' 
      from   site_data  
      where  name='CDE::FROM_DATE';

'['||VALUE||']'
------------------
[ 29-Jan-2000 06:54:04 ]





Note that there are numeric (i.e. non date) values in the "value" field of other rows.




Here are the table definitions



SQL> desc site_data
 Name                Null?    Type
 ------------------- -------- --------------
 DATA_TYPE           NOT NULL NUMBER(5)
 VALUE                        VARCHAR2(250)
 NAME                         VARCHAR2(250)
 DESCRIPTION                  VARCHAR2(1024)

SQL> desc ppcdrdata
 Name                         Null?    Type
 ---------------------------- -------- ----------------------------
 <hide column name>           NOT NULL NUMBER(16)
 <hide column name>           NOT NULL VARCHAR2(40)
 <hide column name>           NOT NULL NUMBER(2)
 <hide column name>                    VARCHAR2(30)
 <hide column name>                    NUMBER(15)
 <hide column name>                    DATE
 <hide column name>                    VARCHAR2(40)
 <hide column name>                    VARCHAR2(20)
 <hide column name>                    VARCHAR2(40)
 <hide column name>                    NUMBER(1)
 <hide column name>                    NUMBER(10)
 <hide column name>                    NUMBER(10)
 <hide column name>                    NUMBER(1)
 <hide column name>                    NUMBER(3)
 <hide column name>                    VARCHAR2(40)
 <hide column name>                    NUMBER(15)
 <hide column name>                    VARCHAR2(15)
 DWS_CREATION_DATE                     DATE




I have opened a TAR on this and the "final answer" from 
Oracle was:

Steve,

Development has determined that this is not a bug. 
Here is what was state by the developer:

This is not a bug . We can not guarantee how predicates will be applied. 
The best way to guarantee Such queries would work is to
use a DECODE around the datatype column, and return some predetermined
value for non-date values that would be recognised as a data to be
excluded.

Please review the bug tomorrow to see the
select count(*) from ppcdrdata pp, 
(select decode( name, 'CDE::FROM_DATE',
to_date(value,'DD-Mon-YYYY HH24:MI:SS'), null ) START_TIME
from site_data 
where name='CDE::FROM_DATE') CDE_START,
(select decode( name, 'CDE::TO_DATE',
to_date(value,'DD-Mon-YYYY HH24:MI:SS'), null ) STOP_TIME 
from site_data 
where name='CDE::TO_DATE') CDE_STOP 
where pp.dws_creation_date >= CDE_START.START_TIME
and pp.dws_creation_date < CDE_STOP.STOP_TIME;

We will inactivate this tar with 2 week closure.
Thanks




I'm sure they are correct. And I'm sure their workaround
will work. But I still don't understand
what the execution plan was doing.

Note that they were able to reproduce the problem with the
export data I sent them.

 

Tom Kyte
February 12, 2002 - 4:01 pm UTC

That makes sense -- if you take the query:

select count(*)
from ppcdrdata pp,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME
from site_data
where name='CDE::FROM_DATE'
) CDE_START,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
from site_data
where name='CDE::TO_DATE'
) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME



it can be rewritten in a variety of ways -- one of which would have the predicates:

where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME

getting pushed DOWN into the inline views - so that the

select to_date( .... ) start_time
from site_data
where name = 'CDE::FROM_DATE'

is more like:

select to_date( .... ) start_time
from site_data
where name = 'CDE::FROM_DATE'
and pp.dws_creation_date >= start_time (and start_time is really just to_date)


hence, view merging is causing the predicates to be evaluated in a random order.


One "trick" that will fix this will be to add ROWNUM to the inline views:

select count(*)
from ppcdrdata pp,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME, rownum r1
from site_data
where name='CDE::FROM_DATE'
) CDE_START,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME, rownum r2
from site_data
where name='CDE::TO_DATE'
) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME

Now, the inline views will be materialized and then joined to -- instead of having the query predicates get merged. Things like rownum, aggregates, order bys in the inline views preclude us from being able to merge -- HOWEVER -- the solution offered by support is the BETTER way to do this! It always will work.

just a minor doubt

A reader, February 13, 2002 - 2:10 am UTC

Hi Tom, Your comment "Compare STRINGS to STRINGS,
DATES to DATES and NUMBERS to NUMBERS. Never let
an implicit conversion happen!".

Query # A :
select *
from ALL_OBJECTS
where CREATED >= '01-JAN-2001'
and CREATED <= '31-JAN-2001'

Query # B :
select *
from ALL_OBJECTS
where CREATED >= TO_DATE('01-JAN-2001','DD-MON-YYYY')
and CREATED <= TO_DATE('31-JAN-2001','DD-MON-YYYY')

In the Query # A, isn't the column CREATED (a date datatype)
being internally converted by Oracle to a string datatype
for comparison i.e. to_char(CREATED,'DD-MON-YYYY')?

'01-JAN-2001' will just be a character string to Oracle
irrespective of the format we have given to it?

A lot of my colleagues prefer / insist on using Query # A.
But will not Query # B cost less as there are no implicit
conversions?

Pls share your insights / opinion with us on this.

Tom Kyte
February 13, 2002 - 3:11 am UTC

Your colleagues are lazy.  Your colleagues will get totally burned when someone changes the NLS_DATE_FORMAT.  Your colleagues are wrong.

We'll implicitly convert the STRING into a DATE for this one, consider:


ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from all_objects where created > 'x';
select * from all_objects where created > 'x'
                                          *
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected

it didn't do the date to a string.  However, you should NOT rely on that implicit conversion ever.  Be explicity, hit the keyboard a couple of extra times to get the TO_DATE with YOUR format there!
 

Type casting

Mikito Harakiri, February 14, 2002 - 5:40 pm UTC

If 'val' is of type CHAR then the predicate val=1 is ambiguous. It can be transformed either to

TO_NUMBER(val)=1

or

val=TO_CHAR(1).

Oracle seems to always prefer the first way. Wouldn't the rule "cast constant expression first" resolve the issue?


Tom Kyte
February 22, 2002 - 9:20 am UTC

It would remove the error BUT would introduce massive ambiguities and other "incorrect" (percieved INCORRECT) behavior.

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( val varchar2(5) );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( '1' );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( ' 1' );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where val >= to_char(1);

VAL
-----
1

ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where to_number(val) >= 1;

VAL
-----
1
 1


They are not equivalent and most people seem to "expect" the second answer.  When comparing to a number -- convert to a number.  The same with dates.  Since Strings compare SOOO different then number/dates do, we convert to the more specific type and then compare.


The rule "cast the constant" would not work since you can easily have:

select * from t where c1 = c2;

and c1 is a string, c2 a number.  Same problem -- no constants.


Back to MY RULE.  Never ever in any language rely on implict conversions.  In EVERY language I've ever worked with over the years -- I've been burned over and over by not following that very simple rule.  Just be explicit, you'll write more correct programs.  They might be a tad "longer", a few extra keystrokes, but they will be predicable and correct.

You can wish the machine could do it all for you (as you seem to constantly want it to) but it always comes back to "better SAFE then really really SORRY".   

narrowing vs. widening

Mikito Harakiri, February 14, 2002 - 6:01 pm UTC

Correction to the rule: when there is ambiguity between narrowing one type or widening the other, always use widening, because it gurantees no runtime surprises. If there is a conflict that prevents consolidating the types, then report query compilation error instead of nondeterministic failure at runtime confusing unsuspected end user!

Tom Kyte
February 22, 2002 - 9:22 am UTC

talk about confusing!! look at the above results of the two different queries where one widens and the other narrows. Tell me -- which one is "correct". I'll gladly argue the other side, no matter which you pick.

Sure, widening avoids the ERROR but gives what many would say is the wrong answer (silently, without ERROR -- that I believe to be worse then getting the error).

agreed

Mikito Harakiri, February 25, 2002 - 5:57 pm UTC

Hmm... Even more convincing example would be:

'2'<'11' vs. 2<11

A reader, June 24, 2004 - 10:41 am UTC


Is Documentation Correct?

Andrew Markiewicz, August 18, 2004 - 12:12 pm UTC

Tom.
Is the documentation wrong or am I interpretting it incorrectly?

Excerpt from Oracle 9i release 2 SQL Reference:

The following rules govern the direction in which Oracle makes implicit datatype conversions:

During INSERT and UPDATE operations, Oracle converts the value to the datatype of the affected column.
During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.


The SQL example you gave above
"ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from all_objects where created >
'x';
select * from all_objects where created > 'x'
*
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected"

shows conversion of the literal to the column data type, opposite from what the documentation states (and apparently more correct since that's exactly what did happen).
Or is this second point implying that columns selected are converted such as in 'select ... into' cursors?

Tom Kyte
August 18, 2004 - 12:22 pm UTC

you have no target variable there.

what they are saying is -- regardless of the host bind variable type -- the data will be converted from its input type TO THE DATABASE type, eg:

ops$tkyte@ORA9IR2> create table t ( x varchar2(25) );
 
Table created.
 
ops$tkyte@ORA9IR2> variable x number
ops$tkyte@ORA9IR2> exec :x := 42
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> insert into t values ( :x );
 
1 row created.

<b>
:x the number was converted into a string and then inserted.  conversely:</b>

ops$tkyte@ORA9IR2> begin
  2  select x into :x from t;
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> print x
 
         X
----------
        42


<b>the string t.x was converted into a number and placed into the bind variable</b>



the docs were not talking about type promotion and comparisons in that text, they were talking "inputs and outputs"
 

plan

A reader, October 31, 2006 - 8:40 am UTC

select * from t where x between 1 and 10;

select * from t where x >=1 and x <= 10;

are they do the same plan ?

Tom Kyte
October 31, 2006 - 8:50 am UTC

maybe yes
maybe no

probably.  between is just shorthand for <= and >= 

ops$tkyte%ORA10GR2> create table t ( x int );

Table created.

ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from t where x between 1 and 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("X">=1 AND "X"<=10)

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> select * from t where x >= 1 and x <= 10;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("X">=1 AND "X"<=10)

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> set autotrace off
 

plan between and >=, <=

A reader, October 31, 2006 - 9:16 am UTC

may be no ?, do you have simple case to show me..

are there any cases that show 'between' better than >=,<= or opposite?

Thanks

Tom Kyte
October 31, 2006 - 10:42 am UTC

did you see my example above, see the filter line?  between "does not really exist, it is >= <= - see what the optimizer did to your predicate???"


between is syntactic sugar.  Just like "in"

ops$tkyte%ORA10GR2> select * from t where x in ( 1, 2 );

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("X"=1 OR "X"=2)

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> select * from t where x = 1 or x = 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("X"=1 OR "X"=2)

Note
-----
   - dynamic sampling used for this statement


is really OR. 

plan

A reader, November 01, 2006 - 12:32 am UTC

Thanks Tom, just confusing with your word: "may be no"

Tom Kyte
November 01, 2006 - 8:44 am UTC

will they always generate the same plan?

maybe yes, maybe no, PROBABLY (yes)

it is very likely they will.

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.