I've known for a while that the order of predicate evaluation in a query is non deterministic, as posted by Tom here (
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:696647495510 ). I even wrote my own blog about it a while back (
http://www.wegobeyond.co.uk/blog/search/predicate.html ) yet for some reason I was still surprised when I encountered it for real in some production code at one of our clients. Unfortunately the LiveSQL link doesn't replicate the issue (after all, it's non-deterministic! :)) but hopefully it demonstrates my issue.
The client was using a generic table from their E-Business Suite ERP system, which holds
attributes against an attribute type. These attributes can be of any format (which the application validates).
In very simple terms, the structure is something like this
Create Table xxjktst As
Select Case When Level <= 100 Then 'DATE_ATTR' Else 'ATTR' || To_Char(Mod(Level,100),'00') End attr_type,
Case When Level <= 100 Then To_Char(Trunc(Sysdate,'MM')+Mod(Level,10),'YYYY/MM/DD HH24:MI:SS') Else 'String ' || To_Char(Level,'0000') End attr1
From dual
Connect By Level <= 1000;
The developers had written a query which (roughly) was as follows.
Select * From (Select attr_type, To_Date(attr1,'YYYY/MM/DD HH24:MI:SS') dt From xxjktst Where attr_type = 'DATE_ATTR') v
Where v.dt <= Sysdate;
However when running it gave "ORA-01840: input value not long enough for date format". The code must have run at some point in the past because it is live and has been for a long time. I recreated the problem using a simple example of
Select *
From xxjktst
Where To_Date(attr1,'YYYY/MM/DD HH24:MI:SS') <= Sysdate
And attr_type = 'DATE_ATTR';
Which failed. Then I did this:
Select /*+ordered_predicates*/ *
From xxjktst
Where attr_type = 'DATE_ATTR'
And To_Date(attr1,'YYYY/MM/DD HH24:MI:SS') <= Sysdate;
which worked fine. So it was clear that in the original query the To_Date() was getting pushed into the subquery and evaluated prior to the predicate on attr_type.
So I have a number of questions.
1) Can you share any information on the internal workings of the optimizer and how it decides on the predicate order? Is it actually non deterministic or are there some rules? I.e. did my LiveSQL example work purely by chance?
2) When I run the query "Select to_date(col) From tab Where attr_type=..." then it doesn't throw the error - I
assume this is because functions/conversions are always applied
after predicates? Is that the case?
3) Other than using correct datatypes (this is a ERP product which we have no control of) or the ordered_predicates hint (which could cause suboptimal plans for all other predicates) is there a foolproof way of ensuring this doesn't fail? I.e. if I do this:
Create View xxjktst_v As
Select attr_type, To_Date(attr1,'YYYY/MM/DD HH24:MI:SS') dt From xxjktst Where attr_type = 'DATE_ATTR';
Select * From xxjktst_v Where dt <= Sysdate;
4) I can think of at least several other pieces of code that do something <i>similar</i> to the above - is it again <b>purely</b> by chance that they work, or is the optimizer doing something internally to try and avoid such issues?
Can it be guaranteed to work, or is there a chance that the predicate could be pushed into the view by the optimizer (without being explicitly asked)?
Thanks
I've got a video on this too ;)
There isn't a published list of the order the optimizer applies predicates. Even if there were, this would likely change between versions. And rewrites such as subquery unnesting can change the query significantly. So it could still apply predicates in a different order than you were expecting...
Long story short: You can't rely on the optimizer processing predicates in any particular order.
Your approach of:
Select * From (Select attr_type, To_Date(attr1,'YYYY/MM/DD HH24:MI:SS') dt From xxjktst Where attr_type = 'DATE_ATTR') v
Where v.dt <= Sysdate
works because you got lucky! It's possible that at some future date this query won't work. Particularly if the indexes on the table change...
Other approaches you could investigate:
- Create a function-based index which maps non-date attrs to null and does to_date for those that are. Then update your queries to match this expression
- Create a virtual column which does the same (maps non-dates to null) and apply your predicates to that.
Note: ordered_predicates is undocumented, so use with caution!