A twist on this
Steve McKee, February 12, 2002 - 11:02 am UTC
Hi Tom,
I understand that we cannot rely on the order in which
predicates will be evaluated. However, why can't we rely on the following:
select to_number(x)
from t
where y=2;
We have this exact situation and are getting conversion
errors even though we eliminate "invalid" rows in the
where clause. Note that the behaviour is dependent on whether or not there are statistics on the table(s).
Can you explain why (in some cases) every row is hit with the to_number() function even though we filter out the invalid rows.
February 12, 2002 - 11:04 am UTC
can you show us how to reproduce easily?
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( x varchar2(5), y int );
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 1, 2 );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( 'a', 3 );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select to_number(x)
2 from t
3 where y = 2;
TO_NUMBER(X)
------------
1
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table t compute statistics;
Table analyzed.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select to_number(x)
2 from t
3 where y = 2;
TO_NUMBER(X)
------------
1
twist - followup
Steve McKee, February 12, 2002 - 3:32 pm UTC
I apologize in advance if the formatting is a mess.
Here's the query that is failing:
select count(*)
from ppcdrdata pp,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME
from site_data
where name='CDE::FROM_DATE'
) CDE_START,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
from site_data
where name='CDE::TO_DATE'
) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME
SQL> /
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
*
ERROR at line 9:
ORA-01847: day of month must be between 1 and last day of month
The "raw" value of "value" column for rows with
name="CDE::TO_DATE" and name="CDE::FROM_DATE"
is:
SQL> select '[ ' || value || ' ]'
from site_data
where name='CDE::TO_DATE';
'['||VALUE||']'
--------------------------
[ 29-Jan-2002 06:54:04 ]
SQL> select '[ ' || value || ' ]'
from site_data
where name='CDE::FROM_DATE';
'['||VALUE||']'
------------------
[ 29-Jan-2000 06:54:04 ]
Note that there are numeric (i.e. non date) values in the "value" field of other rows.
Here are the table definitions
SQL> desc site_data
Name Null? Type
------------------- -------- --------------
DATA_TYPE NOT NULL NUMBER(5)
VALUE VARCHAR2(250)
NAME VARCHAR2(250)
DESCRIPTION VARCHAR2(1024)
SQL> desc ppcdrdata
Name Null? Type
---------------------------- -------- ----------------------------
<hide column name> NOT NULL NUMBER(16)
<hide column name> NOT NULL VARCHAR2(40)
<hide column name> NOT NULL NUMBER(2)
<hide column name> VARCHAR2(30)
<hide column name> NUMBER(15)
<hide column name> DATE
<hide column name> VARCHAR2(40)
<hide column name> VARCHAR2(20)
<hide column name> VARCHAR2(40)
<hide column name> NUMBER(1)
<hide column name> NUMBER(10)
<hide column name> NUMBER(10)
<hide column name> NUMBER(1)
<hide column name> NUMBER(3)
<hide column name> VARCHAR2(40)
<hide column name> NUMBER(15)
<hide column name> VARCHAR2(15)
DWS_CREATION_DATE DATE
I have opened a TAR on this and the "final answer" from
Oracle was:
Steve,
Development has determined that this is not a bug.
Here is what was state by the developer:
This is not a bug . We can not guarantee how predicates will be applied.
The best way to guarantee Such queries would work is to
use a DECODE around the datatype column, and return some predetermined
value for non-date values that would be recognised as a data to be
excluded.
Please review the bug tomorrow to see the
select count(*) from ppcdrdata pp,
(select decode( name, 'CDE::FROM_DATE',
to_date(value,'DD-Mon-YYYY HH24:MI:SS'), null ) START_TIME
from site_data
where name='CDE::FROM_DATE') CDE_START,
(select decode( name, 'CDE::TO_DATE',
to_date(value,'DD-Mon-YYYY HH24:MI:SS'), null ) STOP_TIME
from site_data
where name='CDE::TO_DATE') CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME
and pp.dws_creation_date < CDE_STOP.STOP_TIME;
We will inactivate this tar with 2 week closure.
Thanks
I'm sure they are correct. And I'm sure their workaround
will work. But I still don't understand
what the execution plan was doing.
Note that they were able to reproduce the problem with the
export data I sent them.
February 12, 2002 - 4:01 pm UTC
That makes sense -- if you take the query:
select count(*)
from ppcdrdata pp,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME
from site_data
where name='CDE::FROM_DATE'
) CDE_START,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME
from site_data
where name='CDE::TO_DATE'
) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME
it can be rewritten in a variety of ways -- one of which would have the predicates:
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME
getting pushed DOWN into the inline views - so that the
select to_date( .... ) start_time
from site_data
where name = 'CDE::FROM_DATE'
is more like:
select to_date( .... ) start_time
from site_data
where name = 'CDE::FROM_DATE'
and pp.dws_creation_date >= start_time (and start_time is really just to_date)
hence, view merging is causing the predicates to be evaluated in a random order.
One "trick" that will fix this will be to add ROWNUM to the inline views:
select count(*)
from ppcdrdata pp,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') START_TIME, rownum r1
from site_data
where name='CDE::FROM_DATE'
) CDE_START,
(
select to_date(value,'DD-Mon-YYYY HH24:MI:SS') STOP_TIME, rownum r2
from site_data
where name='CDE::TO_DATE'
) CDE_STOP
where pp.dws_creation_date >= CDE_START.START_TIME and
pp.dws_creation_date < CDE_STOP.STOP_TIME
Now, the inline views will be materialized and then joined to -- instead of having the query predicates get merged. Things like rownum, aggregates, order bys in the inline views preclude us from being able to merge -- HOWEVER -- the solution offered by support is the BETTER way to do this! It always will work.
just a minor doubt
A reader, February 13, 2002 - 2:10 am UTC
Hi Tom, Your comment "Compare STRINGS to STRINGS,
DATES to DATES and NUMBERS to NUMBERS. Never let
an implicit conversion happen!".
Query # A :
select *
from ALL_OBJECTS
where CREATED >= '01-JAN-2001'
and CREATED <= '31-JAN-2001'
Query # B :
select *
from ALL_OBJECTS
where CREATED >= TO_DATE('01-JAN-2001','DD-MON-YYYY')
and CREATED <= TO_DATE('31-JAN-2001','DD-MON-YYYY')
In the Query # A, isn't the column CREATED (a date datatype)
being internally converted by Oracle to a string datatype
for comparison i.e. to_char(CREATED,'DD-MON-YYYY')?
'01-JAN-2001' will just be a character string to Oracle
irrespective of the format we have given to it?
A lot of my colleagues prefer / insist on using Query # A.
But will not Query # B cost less as there are no implicit
conversions?
Pls share your insights / opinion with us on this.
February 13, 2002 - 3:11 am UTC
Your colleagues are lazy. Your colleagues will get totally burned when someone changes the NLS_DATE_FORMAT. Your colleagues are wrong.
We'll implicitly convert the STRING into a DATE for this one, consider:
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from all_objects where created > 'x';
select * from all_objects where created > 'x'
*
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected
it didn't do the date to a string. However, you should NOT rely on that implicit conversion ever. Be explicity, hit the keyboard a couple of extra times to get the TO_DATE with YOUR format there!
Type casting
Mikito Harakiri, February 14, 2002 - 5:40 pm UTC
If 'val' is of type CHAR then the predicate val=1 is ambiguous. It can be transformed either to
TO_NUMBER(val)=1
or
val=TO_CHAR(1).
Oracle seems to always prefer the first way. Wouldn't the rule "cast constant expression first" resolve the issue?
February 22, 2002 - 9:20 am UTC
It would remove the error BUT would introduce massive ambiguities and other "incorrect" (percieved INCORRECT) behavior.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( val varchar2(5) );
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( '1' );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( ' 1' );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where val >= to_char(1);
VAL
-----
1
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where to_number(val) >= 1;
VAL
-----
1
1
They are not equivalent and most people seem to "expect" the second answer. When comparing to a number -- convert to a number. The same with dates. Since Strings compare SOOO different then number/dates do, we convert to the more specific type and then compare.
The rule "cast the constant" would not work since you can easily have:
select * from t where c1 = c2;
and c1 is a string, c2 a number. Same problem -- no constants.
Back to MY RULE. Never ever in any language rely on implict conversions. In EVERY language I've ever worked with over the years -- I've been burned over and over by not following that very simple rule. Just be explicit, you'll write more correct programs. They might be a tad "longer", a few extra keystrokes, but they will be predicable and correct.
You can wish the machine could do it all for you (as you seem to constantly want it to) but it always comes back to "better SAFE then really really SORRY".
narrowing vs. widening
Mikito Harakiri, February 14, 2002 - 6:01 pm UTC
Correction to the rule: when there is ambiguity between narrowing one type or widening the other, always use widening, because it gurantees no runtime surprises. If there is a conflict that prevents consolidating the types, then report query compilation error instead of nondeterministic failure at runtime confusing unsuspected end user!
February 22, 2002 - 9:22 am UTC
talk about confusing!! look at the above results of the two different queries where one widens and the other narrows. Tell me -- which one is "correct". I'll gladly argue the other side, no matter which you pick.
Sure, widening avoids the ERROR but gives what many would say is the wrong answer (silently, without ERROR -- that I believe to be worse then getting the error).
agreed
Mikito Harakiri, February 25, 2002 - 5:57 pm UTC
Hmm... Even more convincing example would be:
'2'<'11' vs. 2<11
A reader, June 24, 2004 - 10:41 am UTC
Is Documentation Correct?
Andrew Markiewicz, August 18, 2004 - 12:12 pm UTC
Tom.
Is the documentation wrong or am I interpretting it incorrectly?
Excerpt from Oracle 9i release 2 SQL Reference:
The following rules govern the direction in which Oracle makes implicit datatype conversions:
During INSERT and UPDATE operations, Oracle converts the value to the datatype of the affected column.
During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.
The SQL example you gave above
"ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from all_objects where created >
'x';
select * from all_objects where created > 'x'
*
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected"
shows conversion of the literal to the column data type, opposite from what the documentation states (and apparently more correct since that's exactly what did happen).
Or is this second point implying that columns selected are converted such as in 'select ... into' cursors?
August 18, 2004 - 12:22 pm UTC
you have no target variable there.
what they are saying is -- regardless of the host bind variable type -- the data will be converted from its input type TO THE DATABASE type, eg:
ops$tkyte@ORA9IR2> create table t ( x varchar2(25) );
Table created.
ops$tkyte@ORA9IR2> variable x number
ops$tkyte@ORA9IR2> exec :x := 42
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> insert into t values ( :x );
1 row created.
<b>
:x the number was converted into a string and then inserted. conversely:</b>
ops$tkyte@ORA9IR2> begin
2 select x into :x from t;
3 end;
4 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> print x
X
----------
42
<b>the string t.x was converted into a number and placed into the bind variable</b>
the docs were not talking about type promotion and comparisons in that text, they were talking "inputs and outputs"
plan
A reader, October 31, 2006 - 8:40 am UTC
select * from t where x between 1 and 10;
select * from t where x >=1 and x <= 10;
are they do the same plan ?
October 31, 2006 - 8:50 am UTC
maybe yes
maybe no
probably. between is just shorthand for <= and >=
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from t where x between 1 and 10;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 13 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X">=1 AND "X"<=10)
Note
-----
- dynamic sampling used for this statement
ops$tkyte%ORA10GR2> select * from t where x >= 1 and x <= 10;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 13 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X">=1 AND "X"<=10)
Note
-----
- dynamic sampling used for this statement
ops$tkyte%ORA10GR2> set autotrace off
plan between and >=, <=
A reader, October 31, 2006 - 9:16 am UTC
may be no ?, do you have simple case to show me..
are there any cases that show 'between' better than >=,<= or opposite?
Thanks
October 31, 2006 - 10:42 am UTC
did you see my example above, see the filter line? between "does not really exist, it is >= <= - see what the optimizer did to your predicate???"
between is syntactic sugar. Just like "in"
ops$tkyte%ORA10GR2> select * from t where x in ( 1, 2 );
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01
|* 1 | TABLE ACCESS FULL| T | 1 | 13 | 2 (0)| 00:00:01
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=1 OR "X"=2)
Note
-----
- dynamic sampling used for this statement
ops$tkyte%ORA10GR2> select * from t where x = 1 or x = 2;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01
|* 1 | TABLE ACCESS FULL| T | 1 | 13 | 2 (0)| 00:00:01
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"=1 OR "X"=2)
Note
-----
- dynamic sampling used for this statement
is really OR.
plan
A reader, November 01, 2006 - 12:32 am UTC
Thanks Tom, just confusing with your word: "may be no"
November 01, 2006 - 8:44 am UTC
will they always generate the same plan?
maybe yes, maybe no, PROBABLY (yes)
it is very likely they will.