excellent, as usual
January 10, 2006 - 10am Central time zone
Reviewer: A reader
Hard and Soft Parse
January 10, 2006 - 10am Central time zone
Reviewer: P.Karthick from India
Tom,
Have i understood your review properly. You mean to say by doing a hard parese the performance has
increased right.
But http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2588723819082 does say some thing else.
This says soft parse is the best.
Followup January 10, 2006 - 10am Central time zone:
search this site for
bind variable peeking
to get an idea of what I was demonstrating. The "where 1=1" was a 'red herring'. I was trying to
demonstrate that the adding of 1=1 did not make the query go "faster", it was the fact we
accidently hard parsed it with different bind variables and the optimizer came up with an entirely
different plan - meaning, you have a bind variable peeking issue that you erroneously believe you
have fixed with 1=1 (until the next time the 1=1 query gets hardparsed itself! I can "prove" (do
not take that seriously) that adding 1=1 will decrease performance, watch
(same setup as before):
ops$tkyte@ORA9IR2> variable n number
ops$tkyte@ORA9IR2> exec :n := 1;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> set autotrace traceonly statistics
ops$tkyte@ORA9IR2> select * from t where id = :n;
1 row selected.
Statistics
----------------------------------------------------------
4 consistent gets
1 rows processed
ops$tkyte@ORA9IR2> exec :n := 99;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select * from t where id = :n AND 1=1;
28410 rows selected.
Statistics
----------------------------------------------------------
2267 consistent gets
28410 rows processed
ops$tkyte@ORA9IR2> exec :n := 1;
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select * from t where id = :n;
1 row selected.
Statistics
----------------------------------------------------------
4 consistent gets
1 rows processed
ops$tkyte@ORA9IR2> select * from t where id = :n AND 1=1;
1 row selected.
Statistics
----------------------------------------------------------
406 consistent gets
1 rows processed
ops$tkyte@ORA9IR2> set autotrace off
see, now 1=1 is "bad" - no, not really, it was bind variable peeking coming up with different
plans depending on what got parsed first...

January 10, 2006 - 10am Central time zone
Reviewer: Dave from UK
No, Tom was saying that you current plan was not optimal. By adding 1=1 you forced the optimizer
to do a hard parse which in turn gave you a better query.
Next time you run it you will get a soft parse with the same good execution plan
however, note
January 10, 2006 - 11am Central time zone
Reviewer: A reader
that there are cases where adding
"and rownum > 0"
to a query dramatically may change plan and
execution time
Followup January 10, 2006 - 1pm Central time zone:
that is very different.
adding where 1=0 is very different to - that makes queries go really fast.

January 10, 2006 - 5pm Central time zone
Reviewer: A reader
>>and we said...
>> http://www.webster.com/cgi-bin/dictionary
it was hillarious..
1=0
January 11, 2006 - 10am Central time zone
Reviewer: A reader
excellent !
my queries are flying !
I have to remember that
adding "and 1=1" vs. adding "and rownum>0"
January 11, 2006 - 3pm Central time zone
Reviewer: Matthias Rogel from Kaiserslautern, Germany
you write "that is very different."
I disagree.
a. for every row in every table or view in every Oracle-database in the world, there is, always
was, and always will be 1=1
b. for every row in every table or view in every Oracle-database in the world, there is, always
was, and always will be rownum > 0
so, adding "and 1=1" is totally equivalent to adding "and rownum>0", correct ?
where is the difference ?
hold on, of course there is a difference:
Oracle optimizer happens to know a. a priori
Oracle optimizer happens NOT to know b. a priori - though this is not documented
(at least in current releases)
is that what you mean by "that is very different." ?
Followup January 12, 2006 - 10am Central time zone:
You are free to disagree, but rownum is a set operation. It is "special", it is "magic". It must
be done deterministically. Adding rownum to a query fundementally changes it.
In this very narrow case, it just so happens that
1=1
rownum>0
will have the same net effect, but then again:
select *
from t
where dt = (select max(dt) from t t2 where t2.id = t.id);
select *
from (select t, max(dt) over (partition by id) max_dt from t)
where dt = max_dt;
return the same results - but are very different.
Fundamental? or Indicative?
January 12, 2006 - 1pm Central time zone
Reviewer: Duke Ganote from a cube on the 2nd floor of the tallest building in all of Clermont County, Ohio USA
I'd have to side with Matthias on this. "ROWNUM>0" is an undocumented 'hint' like MATERIALIZE
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:52733181746448#52821944767865 where you're helping the optimizer use a better plan than it can figure out on its own. I wonder
what information the optimizer lacks?
Conceptually, a perfect optimizer would know if 2 queries (like your example above) return the same
result, and provide the same, optimal plan. Just as it (now) does for
select count(*) from x
and
select count(1) from x
http://asktom.oracle.com/pls/ask/f?p=4950:8:8754503271729407215::NO::F4950_P8_DISPLAYID,F4950_P8_CRI
TERIA:1156159920245
It may change actually...
January 13, 2006 - 6pm Central time zone
Reviewer: Kirurgs from Latvia
>> Reviewer: A reader
>> that there are cases where adding
>> "and rownum > 0"
>> to a query dramatically may change plan and
>> execution time
I would like to indicate that it may be TRUE as well...
For example if statistics are not up to date, some data are not known to optimizer or for whatever
reason CBO merged exists clause into query.
1. select y.x from y where exists (select 1 from z where z.b=y.b)
2. select y.x from y where exists (select 1 from z where z.b=y.b and ROWNUM>0)
I don't pretend to most selfexplanatory queries in the world, but I think it's more or less clear
what I was trying to say :) Both queries can have totally different execution plans/times at the
same time/data.
I tested it for my query
January 31, 2006 - 9pm Central time zone
Reviewer: Ravi from USA
The results where amazing
Excellent
different plans
April 18, 2006 - 10am Central time zone
Reviewer: Mark from NY
Tom,
Adding 1=1 to get a new plan - that's amusing. But it got me thinking - it's interesting how
different values for bind variables can give you different plans. Do you think it could be useful
to do something like:
1) Experiment with different values for the bind variables. Given knowledge of the table data and
common values for the bind variables, find which values give you the best general-purpose plan.
2) On database startup (do not open it yet), run a script that runs all these queries once, so that
the shared pool contains the optimal plans for each. Or maybe not ALL the queries depending on how
long they take to complete. A good sample at least.
Obviously step 2 might not always be a good thing - for example if users are waiting to get into
the database and/or you need to minimize downtime. And I agree with you that restarting the
database because "that's what you do" is a bad thing. But if you HAVE to restart for whatever
reason (for example patching), it seems like this idea could work.
Any thoughts?
Thanks,
Mark
Followup April 18, 2006 - 2pm Central time zone:
option 0) do not have statistics that would cause bind variable peeking to do anything. eg: in
your transactional system - where binds are somewhat "the rule" - do you really want histograms,
are they desirable in that system....
bind variable peeking issues would absolutely be the exception - not the rule. I would not suggest
your steps 1 and 2 for every query. For the exceptions I would consider
a) do not bind that specific column, if you have a Y/N flag for example, or code number check (code
in (1,2,3,4,5) ) - and the flag/code are skewed - you have so unique values, you need not bind that
one necessarily.
b) don't generate the statistics that cause the plans to change (eg: histograms, they are expensive
to compute in the first place)
c) use your domain knownledge of the data - if the user inputs a date within the last 30 days -
"open this query", else if the date is > 30 days (a different plan would be used) then "open this
other_query"...
d) resort to cursor_sharing = similar for that query
e) stored outlines
Thanks!
April 19, 2006 - 9am Central time zone
Reviewer: Mark from NY
Thanks for your input.
|