Skip to Main Content
  • Questions
  • Rows Estimate without executing the query which has joins on many tables?

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Jagadish.

Asked: November 09, 2001 - 1:17 pm UTC

Last updated: August 30, 2011 - 5:05 pm UTC

Version: 8.1.6.0.0 Release 2

Viewed 10K+ times! This question is

You Asked

I heard that some of applications on SQLserver shows an estimated or correct rows without executing the query. I am not sure how that was getting collected in SQL server. I wonder is there any kind of such mechanism exists in Oracle to find out the same row estimates without executing the query. I did watch the execution plan it only provides cost or cardinality. But cardinality is not correct when you run a query with a joins of many tables.

Source:- One of SQL server Application I saw that provides such kind of metrics when mouse moves to an estimate kind of button. That returns a quick row estimates and without executing the query. If we press the button almost all the same estimated rows data is being displayed through the application. So the investigation started how to achieve this in Oracle.

and Tom said...

The explain plan gives this -- as long as you are using the CBO (cost based optimizer). Doesn't matter if you have 1 or 50 tables. The last part of the plan is the estimate output.

consider:

scott@ORA717DEV.US.ORACLE.COM> set autotrace traceonly explain
scott@ORA717DEV.US.ORACLE.COM> select e1.*, e2.* from emp e1, emp e2;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=196 Bytes=12544)
1 0 MERGE JOIN (CARTESIAN) (Cost=15 Card=196 Bytes=12544)
2 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=448)
3 1 SORT (JOIN) (Cost=14 Card=14 Bytes=448)
4 3 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=448)

196 = 14 * 14 -- thats right....

scott@ORA717DEV.US.ORACLE.COM> select emp.ename, mgr.ename from
2 emp, emp mgr
3 where emp.mgr = mgr.empno;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=13 Bytes=208)
1 0 MERGE JOIN (Cost=5 Card=13 Bytes=208)
2 1 SORT (JOIN) (Cost=3 Card=14 Bytes=112)
3 2 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=112)
4 1 SORT (JOIN) (Cost=3 Card=14 Bytes=112)
5 4 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=112)

Here is estimated that about 13 rows would be returned...


scott@ORA717DEV.US.ORACLE.COM> select count(*) from emp;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1)
1 0 SORT (AGGREGATE)
2 1 INDEX (FULL SCAN) OF 'EMP_PK' (UNIQUE) (Cost=1 Card=14)

one row...


scott@ORA717DEV.US.ORACLE.COM> select ename, dname
2 from emp, dept where emp.deptno = dept.deptno;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=14 Bytes=252)
1 0 NESTED LOOPS (Cost=5 Card=14 Bytes=252)
2 1 TABLE ACCESS (FULL) OF 'DEPT' (Cost=1 Card=4 Bytes=44)
3 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=98)

14 rows...

1 select ename, dname
2* from emp, dept where emp.deptno = dept.deptno and emp.ename like '%A%'
scott@ORA717DEV.US.ORACLE.COM> /

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=18)
1 0 NESTED LOOPS (Cost=2 Card=1 Bytes=18)
2 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=7)
3 1 TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (Cost=1 Card=4 Bytes=44)
4 3 INDEX (RANGE SCAN) OF 'DEPT_IDX' (NON-UNIQUE)

1 rows.....


and this was all done without executing the query. See EXPLAIN plan and the PLAN_TABLE.

Rating

  (40 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Estimate rows Without query execution

Jagadish Nalluri, November 12, 2001 - 1:17 pm UTC

I am working with this kind of query ( below ), but I am not getting the correct Cardinality. The result of cardinality shows 4 and when I execute the same query I am getting around 264 rows as result set. Why this is happening. You can tailor the same kind of query on three different tables, Might be the nature of Query is an issue in this. Anyway I am not getting the same result as cardinality shows

Query I am using on some of my tables:- 
----------------------------------------


SQL> set autotrace traceonly explain;
SQL> select
  2   pi.name_family, ep.code_gender
  3  from eh_person_identity pi,
  4   eh_entity_person ep,
  5   eh_entity_roles er
  6  where pi.name_family = 'SMITH'
  7   and ep.entity_id = pi.entity_id
  8   and ep.code_gender = 'M'
  9   and er.entity_id = pi.entity_id
 10   and er.code_entity_role = 'OFNDR';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=19 Card=4 Bytes=840)
   1    0   NESTED LOOPS (Cost=19 Card=4 Bytes=840)
   2    1     NESTED LOOPS (Cost=11 Card=4 Bytes=496)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EH_PERSON_IDENTITY'
          (Cost=7 Card=4 Bytes=288)

   4    3         INDEX (RANGE SCAN) OF 'EH_PERSON_IDENT_NAME_FAMILY'
          (NON-UNIQUE) (Cost=3 Card=4)

   5    2       INDEX (UNIQUE SCAN) OF 'PK_EH_ENTITY_ROLES' (UNIQUE) (
          Cost=1 Card=2023 Bytes=105196)

   6    1     TABLE ACCESS (BY INDEX ROWID) OF 'EH_ENTITY_PERSON' (Cos
          t=2 Card=13038 Bytes=1121268)

   7    6       INDEX (UNIQUE SCAN) OF 'PK_EH_ENTITY_PERSON' (UNIQUE)
          (Cost=1 Card=13038) 
 

Tom Kyte
November 12, 2001 - 1:32 pm UTC

It is called an ESTIMATE, it is not called a promise ;)

It is an ESTIMATE based on the statistics. If it knew the honest, actual answer -- it would not even have to run the query. It is the best guess of the optimizer based on the data, the statistics.

It is a guess. It will be RARE to get exactly the right number of rows.

estimating cardinality fro complex queries

Mikito Harakiri, November 12, 2001 - 1:54 pm UTC

I assume that the problem with this query is that if user enters a criteria that is not selective enough -- for example: pi.name_family like 'S%' then your query would be spinning for a long time. This is why you want an estimate of the number of rows returned, right?

I would suggest the following workaround. In your case eh_person_identity is the driving table, so fire simple query

select count(1) from eh_person_identity
where pi.name_family like <crit>

first. With index on pi.name_family it's guaranteed to be fast. Now, the number of rows returned is a good approximation of what you primary query must return, or just make a correction based upon some assumptions about how the result will be transformed by the joins and additional filters.

Jagadish, November 12, 2001 - 1:57 pm UTC

That's what exactly my earlier point. Some how Oracle optimizer is unable to provide at least a near by estimate result when we ran such quieries. But on the other hand my observation remained on how the things pretty smart with SQL server database enginge. Might be the same techniques may evolve soon with Oracle Optimizer too. If we could able to find at least a nearby estimate values then it could be a great deal of information through the optimizer.

Thanks for the inputs sharing with me. Now I need not search/wait for a long on this topic. I agreed with your final point(s).

Tom Kyte
November 12, 2001 - 3:24 pm UTC

You are using a single case.

It can be very accurate, IF it has the proper information.  IF you can look at the data yourself and figure out how many rows it returns.  Trust me -- SQLServer isn't always accurate on this either.

Lets look at your query again:

SQL> select
  2   pi.name_family, ep.code_gender
  3  from eh_person_identity pi,
  4   eh_entity_person ep,
  5   eh_entity_roles er
  6  where pi.name_family = 'SMITH'
  7   and ep.entity_id = pi.entity_id
  8   and ep.code_gender = 'M'
  9   and er.entity_id = pi.entity_id
 10   and er.code_entity_role = 'OFNDR';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=19 Card=4 Bytes=840)
   1    0   NESTED LOOPS (Cost=19 Card=4 Bytes=840)
   2    1     NESTED LOOPS (Cost=11 Card=4 Bytes=496)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EH_PERSON_IDENTITY'(Cost=7 Card=4 Bytes=288)
   4    3         INDEX (RANGE SCAN) OF 'EH_PERSON_IDENT_NAME_FAMILY' (NON-UNIQUE) (Cost=3 Card=4)
   5    2       INDEX (UNIQUE SCAN) OF 'PK_EH_ENTITY_ROLES' (UNIQUE) (Cost=1 Card=2023 Bytes=105196)
   6    1     TABLE ACCESS (BY INDEX ROWID) OF 'EH_ENTITY_PERSON' (Cost=2 Card=13038 Bytes=1121268)
   7    6       INDEX (UNIQUE SCAN) OF 'PK_EH_ENTITY_PERSON' (UNIQUE)(Cost=1 Card=13038) 



Here, it goes after EH_PERSON_IDENTITY first by PI.NAME_FAMILY.  

It looks at the stats (which you haven't told me HOW they are computed)

It guesses "4 rows", that was wrong, it was closer to 264 rows.  Did you compute histograms?  Does the optimizer understand the data is skewed?  Did you give it the information it needs to work with?  Probably not.  You probably just ran "analyze table EH_PERSON_IDENTITY compute statistics".  Thats not good enough for what you are doing.

First of all, I PRAY that you do not actually use literals in your code all over (very bad practice inherited from sqlserver programmers).  Bind variables are the way to go almost always.

Secondly, if you give it more stats -- it'll do better in this case.  Consider:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t as select object_name from all_objects;

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t select 'SMITH' from t where rownum <= 196;

ops$tkyte@ORA817DEV.US.ORACLE.COM> select count(*) from t;

  COUNT(*)
----------
     17303

ops$tkyte@ORA817DEV.US.ORACLE.COM> create index t_idx on t(object_name);

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table t compute statistics;

ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace traceonly explain;
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where object_name = 'SMITH';

Execution Plan
----------------------------------------------------------
   0  SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=48)
   1    0  INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=2 Bytes=48)


<b>with just "stats", it has no idea about the skew of the columns, lets try 
that again an have it compute the histograms</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table t compute statistics for table for all indexes
  2  for ALL INDEXED COLUMNS size 254;
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where object_name = 'SMITH';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=207 Bytes=4968)
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=2 Card=207 Bytes=4968)

ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t where object_name = 'FOO';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=48)
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=2 Bytes=48)


So, given the proper amount of information -- it makes the correct assumptions. 

Estimate as pertains to Discoverer

Connor, November 12, 2001 - 2:55 pm UTC

Presumably Discoverer uses the same method to present its "This query will take ... seconds to complete". I was wondering if you have any 'insider' information on what assumptions its making to transforms "rows" into (predicted) "elapsed time".

Cheers
Connor

Tom Kyte
November 12, 2001 - 3:27 pm UTC

Yes, discoverer is using this information in an attempt to predict the runtime (as does the resource manager feature in 9i which allows you to abort a query before it even STARTS in the database based on a predicated runtime).

I haven't any "insider" information about the algorithm used to calculate this value - sorry.

physical versus logical

Mikito Harakiri, November 12, 2001 - 3:48 pm UTC

I wouldn't rely on optimizer to get correct cardinality. You would rely upon some physical implementation features, that could change (hopefully for the better;-) between releases.

Look at bug #1846440:
-- Good Cardinality
select a from x where b like 'C%';
-- Good Cardinality
select a from x where b like 'CE%';
-- Bad Cardinality
select a from x where b like 'CEL%';
It isn't even monothonic!




Jagadish, November 12, 2001 - 4:00 pm UTC

Thanks to make me perfect understanding about use of Histograms and how they differ with estimated statistics. As you said I simply used analyze table compute statistics. This is very useful peace of information.

Thanks for full exploration

compute statistics

Ramesh, November 20, 2001 - 5:37 pm UTC

Is that mean we have to always use "compute statistics for table for all indexes for ALL INDEXED COLUMNS size xxx" to enable optimiser to get full statistics ?

Please clarify

Tom Kyte
November 20, 2001 - 6:06 pm UTC

No, it means in this particular case, because of the nature of the skewed data in his table we needed to compute histograms to feed the optimizer the proper information -- not only to get the best plan but to allow him to get a better estimate of rows that will be returned.

If you want the complete set of stats, you would include these clauses (you can actually get more, i just got the indexed columns -- you can get all columns, hidden columns and so on)

compute statistics

Ramesh, November 21, 2001 - 8:56 am UTC

Is that mean we have to always use "compute statistics for table for all indexes for ALL INDEXED COLUMNS size xxx" to enable optimiser to get full statistics ?

Please clarify

doubts on statistics

David Rodriguez, December 27, 2003 - 5:33 pm UTC

Hi,

It´s David from Argentina.
I have question about cardinality shown in explain plan, it seems they are quite correct when the query is simple but when it gets complex the data arent so good:

select ZZ.PROFILE_CM_SEQ
from sysadm.PS_BO_ROLE_CM ZZ
inner join sysadm.PS_BO_CM_PROFL_DTL Z
on Z.BO_ID =ZZ.BO_ID
and Z.PROFILE_CM_SEQ = ZZ.PROFILE_CM_SEQ
where Z.CM_TYPE_ID in (1, 2)
and Z.PRIMARY_IND ='Y'
and Z.EFFDT < sysdate Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=296563 Card=3 Bytes=195)
1 0 MERGE JOIN (Cost=296563 Card=3 Bytes=195)
2 1 SORT (JOIN) (Cost=80008 Card=3236059 Bytes=126206301)
3 2 TABLE ACCESS (FULL) OF 'PS_BO_CM_PROFL_DTL' (Cost=21513 Card=3236059 Bytes=126206301)
4 1 SORT (JOIN) (Cost=216555 Card=13036685 Bytes=338953810)
5 4 INDEX (FAST FULL SCAN) OF 'PS_BO_ROLE_CM' (UNIQUE) (Cost=7581 Card=13036685 Bytes=338953810)

select table_name, num_rows
from dba_tables where table_name in ('PS_BO_ROLE_CM', 'PS_BO_CM_PROFL_DTL')

TABLE_NAME NUM_ROWS
-------------------- ----------
PS_BO_CM_PROFL_DTL 12944235
PS_BO_ROLE_CM 13036685

The explain plan suggests 3 million rows roughly

TABLE ACCESS (FULL) OF 'PS_BO_CM_PROFL_DTL' (Cost=21513 Card=3236059 Bytes=126206301)

but I have 13 million rows!

Also the

MERGE JOIN (Cost=296563 Card=3 Bytes=195)

suggest that 3 rows will be returned? (the fact is this returns roughly 4 million rows)

Tom Kyte
December 27, 2003 - 5:44 pm UTC

this query plan is saying:

we will:

TABLE ACCESS (FULL) OF 'PS_BO_CM_PROFL_DTL' (Cost=21513 Card=3236059 Bytes=126206301)

of all of the possible rows in 'PS_BO_CM_PROFL_DTL' (which you say is about 13,000,000) we will return from the full scan 3.2 million. the predicate:

where Z.CM_TYPE_ID in (1, 2)
and Z.PRIMARY_IND ='Y'
and Z.EFFDT < sysdate

is expected to return just 3.2 million records.


Well sort that data.

Then, we'll full scan that index:

INDEX (FAST FULL SCAN) OF 'PS_BO_ROLE_CM' (UNIQUE) (Cost=7581 Card=13036685 Bytes=338953810)

and since there is no predicate that can be applied yet -- we'll get all 13,036,685 rows.


then, based on the join condition:

on Z.BO_ID =ZZ.BO_ID
and Z.PROFILE_CM_SEQ = ZZ.PROFILE_CM_SEQ

we expect to take that 3.2 million rows and 13 million rows and after joining, just get 3.

So, this would be the place where the problem is. (before this, it looks dead on).

The question to you would be -- how do you gather statistics exactly, the EXACT commands used.

exact command

David Rodriguez, December 27, 2003 - 7:22 pm UTC

Hi

I use

DBMS_STATS.GATHER_SCHEMA_STATS (ownname => 'SYSADM',
estimate_percent => 20,
block_sample => TRUE,
method_opt =>NULL,
degree => 5,
granularity => 'ALL',
cascade => TRUE
);

anything wrong with that?

Note: PS_BO_ROLE_CM is hash partitioned

Tom Kyte
December 28, 2003 - 10:37 am UTC

you have no histograms and block sampling can be not as good as row sampling.

I'd look at the effect histograms might have on this and consider row over block sampling.

this is getting worse

David Rodriguez, December 28, 2003 - 3:22 pm UTC

Hi

It´s david from Argentina

I followed your advice and analyzed the table with this command

exec dbms_stats.gather_table_stats(ownname=>'SYSADM',-
tabname=>'PS_BO_CM_PROFL_DTL',-
degree=>4,-
method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO',-
estimate_percent=>20,-
granularity=>'ALL',-
cascade=>TRUE)

However I am still getiing bad cardinality as you can see:

1 select ZZ.PROFILE_CM_SEQ
2 from sysadm.PS_BO_ROLE_CM ZZ
3 inner join sysadm.PS_BO_CM_PROFL_DTL Z
4 on Z.BO_ID =ZZ.BO_ID
5 and Z.PROFILE_CM_SEQ = ZZ.PROFILE_CM_SEQ
6 where Z.CM_TYPE_ID in (1, 2)
7 and Z.PRIMARY_IND ='Y'
8* and Z.EFFDT < sysdate

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=453188 Card=2 Bytes=130)
1 0 MERGE JOIN (Cost=453188 Card=2 Bytes=130)
2 1 SORT (JOIN) (Cost=237925 Card=9934470 Bytes=387444330)
3 2 TABLE ACCESS (FULL) OF 'PS_BO_CM_PROFL_DTL' (Cost=21513 Card=9934470 Bytes=387444330)
4 1 SORT (JOIN) (Cost=215263 Card=12959460 Bytes=336945960)
5 4 INDEX (FAST FULL SCAN) OF 'PS_BO_ROLE_CM' (UNIQUE) (Cost=7529 Card=12959460 Bytes=336945960)


getting cardinality of 2 now...?!

1 select table_name, num_rows, partitioned
2* from dba_tables where table_name in ('PS_BO_ROLE_CM', 'PS_BO_CM_PROFL_DTL')


TABLE_NAME NUM_ROWS PAR
------------------------------ ---------- ---
PS_BO_CM_PROFL_DTL 12927275 NO
PS_BO_ROLE_CM 12959460 YES

It should show 9 millions rows though

1 select count(ZZ.PROFILE_CM_SEQ)
2 from sysadm.PS_BO_ROLE_CM ZZ
3 inner join sysadm.PS_BO_CM_PROFL_DTL Z
4 on Z.BO_ID =ZZ.BO_ID
5 and Z.PROFILE_CM_SEQ = ZZ.PROFILE_CM_SEQ
6 where Z.CM_TYPE_ID in (1, 2)
7 and Z.PRIMARY_IND ='Y';

COUNT(ZZ.PROFILE_CM_SEQ)
------------------------
9950707

This is worrying me because in one of big queries these two tables are joined first then from the rows returned a nested loop join is performed (via local index scan), that means Oracle has to do 9950707 INDEX UNIQUE SCAN, TABLE ACCESS BY LOCAL INDEX ROWID then PARTITION HASH ITERATOR!

Any comments is appreciated

TIA

Tom Kyte
December 28, 2003 - 5:46 pm UTC

you are reading the plan wrong there -- what it is doing is

a) reading 'PS_BO_CM_PROFL_DTL' and keeping some of the records -- sorting them by the join key.

b) reading 'PS_BO_ROLE_CM' index as if it were a table and sorting them by the join key

c) merging these two results together -- most likely from temp.


I cannot comment on "another" plan of which I've never ever seen before or the associcated query. I cannot make the same leap you did in the last paragraph there.

(you gathered histograms on indexed columns. are all of the columns in the join condition indexed?)

yes the the join columns are indexed

David Rodriguez, December 28, 2003 - 6:30 pm UTC

Hi

There is a composite index on these four columns in PS_BO_CM_PROFL_DTL

BO_ID
CM_TYPE_ID
EFFDT
PROFILE_CM_SEQ

composite index on PS_BO_ROLE_CM

BO_ID
ROLE_TYPE_ID
PROFILE_CM_SEQ

The thing is Oracle doesnt seem to get the right cardinality after merge join, instead of showing roughly 9 millions rows it shows 2 I dont understand why

Thank you

Tom Kyte
December 29, 2003 - 9:51 am UTC

if you like, you can take a 10053 trace, compress it and email it to me.

go into sqlplus and:

set the event on (search for 10053 on this site for details)
set autotrace traceonly explain
do the query here
exit



What does Cost means ?

Riaz Shahid, December 29, 2003 - 7:29 am UTC

Very usefull information as usual. I need a little clarification:

SQL>  set autotrace traceonly explain
SQL> select count(*) from test;
Elapsed: 00:00:00.00

Execution Plan
==========================================================
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1121 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'TEST' (Cost=1121 Card=158510)



What is meant by Cost=1121 ? and How can we estimate the running time for a query ?

Regards 

Tom Kyte
December 29, 2003 - 10:37 am UTC

the cost is the internal number used by the optimizer to pick the "best plan".

many factors influence the cost -- from system statistics, to hints, to init.ora parameters (eg: play with your db_file_multiblock_read_count setting using alter session and watch the cost go up and down)

consider this other example:

big_table@ORA9IR2> set autotrace traceonly explain
big_table@ORA9IR2>
big_table@ORA9IR2> alter session set optimizer_index_cost_adj = 100;

Session altered.

big_table@ORA9IR2> select count(subobject_name) from big_table where id < 5000;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=86 Card=1 Bytes=22)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=86 Card=5039 Bytes=110858)
3 2 INDEX (RANGE SCAN) OF 'BIG_TABLE_PK' (UNIQUE) (Cost=13 Card=5039)



big_table@ORA9IR2> alter session set optimizer_index_cost_adj = 1;

Session altered.

big_table@ORA9IR2> select count(subobject_name) from big_table where id < 5000;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=22)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=2 Card=5039 Bytes=110858)
3 2 INDEX (RANGE SCAN) OF 'BIG_TABLE_PK' (UNIQUE) (Cost=13 Card=5039)



big_table@ORA9IR2> set autotrace off


Now, the optimizer index cost adj changes nothing about the runtime performance of that query. NOTHING. But, its setting influences the cost numbers the optimizer assigns to various phases in the query plan. Hence, we can influence the optimizer one way or the other by tweaking these various settings.

But -- this example should also hilight that you cannot really 100% estimate time from a query given the cost :) those plans are identical, the runtimes for both are going to be identical, yet the costs are widely different.


If you have acces to my new book "Effective Oracle By Design", i cover this sort of stuff in some detail in chapter 6.

Information on resource manager?

Steve Mitchell, December 29, 2003 - 1:08 pm UTC

Earlier in this thread you said "the resource manager feature in 9i which allows you to abort a query before it even STARTS in the database based on a predicated runtime"

I would very much like to add this kind of functionality to my application. (i.e, "The report you're about to may take 2 hours, do you want to continue?"). But I do not see information about the specifics of using the resource manager as you describe in my Oracle docs. Can you elaborate or point me to the docs I should be reading?


Tom Kyte
December 29, 2003 - 1:11 pm UTC

the timing information is not currently exposed. it is not available to you externally.

the resource manager is documented in the admin guide.

Interesting

Sami, December 29, 2003 - 1:55 pm UTC


how to return the cardinality to a user?

Ryan, December 29, 2003 - 5:43 pm UTC

I want to pass back an 'estimated' number of rows to the user. Per your new book, you state that you should not get counts for pagination and should use v$sql_plan.

what does the query look like to get the cardinality of a query i want to run or just ran. I need one that has the best response time. I have a very strict SLA

Tom Kyte
December 29, 2003 - 6:45 pm UTC

well, if you uniquely "tag" these queries, you can use this technique:

big_table@ORA9IR2> declare
2 cursor c1 is select /* unique_tag1 */ * from big_table where object_name like 'ABC%';
3 cursor c2 is select /* unique_tag2 */ * from big_table where object_name like 'jav%';
4 l_card number;
5 begin
6 open c1;
7 open c2;
8
9 select cardinality into l_card
10 from v$SQL_PLAN_STATISTICS_ALL
11 where (address,hash_value) = ( select distinct address,hash_value
12 from v$open_cursor
13 where upper(sql_text) like 'SELECT /* UNIQUE_TAG1 */%'
14 and sid = ( select sid from v$mystat where rownum=1) )
15 and child_number = 1
16 and id = 1
17 and rownum = 1;
18
19 dbms_output.put_line( 'tag1 ' || l_card );
20
21 select cardinality into l_card
22 from v$SQL_PLAN_STATISTICS_ALL
23 where (address,hash_value) = ( select distinct address,hash_value
24 from v$open_cursor
25 where upper(sql_text) like 'SELECT /* UNIQUE_TAG2 */%'
26 and sid = ( select sid from v$mystat where rownum=1) )
27 and child_number = 1
28 and id = 1
29 and rownum = 1;
30
31 dbms_output.put_line( 'tag1 ' || l_card );
32 end;
33 /
tag1 64
tag1 84444

PL/SQL procedure successfully completed.

big_table@ORA9IR2>
big_table@ORA9IR2> set autotrace traceonly explain
big_table@ORA9IR2> select * from big_table where object_name like 'ABC%';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=66 Card=64 Bytes=6400)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=66 Card=64 Bytes=6400)
2 1 INDEX (RANGE SCAN) OF 'BIG_TABLE_IDX' (NON-UNIQUE) (Cost=3 Card=64)



big_table@ORA9IR2> select * from big_table where object_name like 'jav%';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1379 Card=84444 Bytes=8444400)
1 0 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=1379 Card=84444 Bytes=8444400)



big_table@ORA9IR2> set autotrace off


if you are using dbms_sql, we might have a better way (as the cursor handle returned can be used as a more direct path to the query, obviating the need to "tag" them)

latch contention

Ryan, December 30, 2003 - 8:32 am UTC

Thanks Tom. This is going to be a very high transaction database. I have read that hitting the v$views in production repeatedly can cause latch contention.

Do I have any concerns with this method?

Tom Kyte
December 30, 2003 - 11:04 am UTC

there is no "free" method to get this information. this will cost you -- yes.

the cost of this is fairly substantial in terms of latching and a runtime hit during each parse.

Of course, since the cardinality doesn't change from parse to parse -- you could easily cache this information.

also, you don't need this to EVERY query -- in fact, it would be the exception here, not the rule wouldn't it?


im getting no data returned

A reader, December 30, 2003 - 8:53 am UTC

I have been testing your code. I simplified it for test purposes because it keeps failing. Not sure why.

I tested it with all your code, then started simplifying it in hopes that I can figure out what is wrong. Im on Oracle 9.2 and I'm in a DBA privileged account. The query does return data. 'small_table' is just user_objects.

I also tried commented out child_number=1 and id=1 for test purposes. no data returned.


1 declare
2 cursor c1 is select /* unique_tag1 */ * from user_objects;
3 l_card number;
4 begin
5 open c1;
6 select cardinality into l_card
7 from v$SQL_PLAN_STATISTICS_ALL
8 where (address,hash_value) = ( select distinct address,hash_value
9 from v$open_cursor
10 where --upper(sql_text) like 'SELECT /* UNIQUE_TAG1 */%'
11 sid = ( select sid from v$mystat
12 where rownum=1) )
13 and child_number = 1
14 and id = 1
15 and rownum = 1;
16 dbms_output.put_line( 'tag1 ' || l_card );
17* end;
18 /
declare
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at line 6


Tom Kyte
December 30, 2003 - 11:23 am UTC

looks like not everything is going into there -- it goes into v$sql_plan, but that is more expensive to query (and it looks like I should have had child_number = 0 -- not one)

Tanel Poder, December 30, 2003 - 9:20 am UTC

You have to use CBO for getting statistics in v$sql_plan_statistics(_all)


Tanel Poder, December 30, 2003 - 9:30 am UTC

No, sorry about misinfo, a quick test showed that execution statistics are shown for RBO as well.

The point is, that v$sql_plan_statistics show execution statistics - for already executed statements. V$SQL_PLAN is able to predict row amount, but since it's only CBO estimation based on existing stats, it'll get quite unaccurate for more complex cases.

This is my understanding, can you comment whether it's correct?

Tom Kyte
December 30, 2003 - 11:42 am UTC

the cardinality in both is a guess -- the statistics_all view was cheaper to query.

A reader, December 30, 2003 - 12:48 pm UTC

I wasn't precise enough before, actually I was referring to last_output_rows column in v$sql_plan_statistics(_all) views. This should come directly from query execution statistics, but of course it's valid only for this single query with same (bind) variable values.

update on pagination

Ryan Gaffuri, December 30, 2003 - 2:22 pm UTC

I do need it for every query on a search screen. We are expecting this to be hit alot.(no firm estimates yet, we are stress it with a 100 or so selects/second) Im planning on setting it up similiar to the code from your book 'Effective Oracle by Design' p. 341-348. I would add the 'row estimate' for just the very first page. When you users go to the next 'page' they wouldnt get it.

however, it could be alot. Not sure what else to do. The count is required.

I posted this question on ORACLE-L. Wolfgang Breitling made an interesting point. Id appreciate your take on this.

'v$sql_plan_statistics (and consequently v$sql_plan_statistics_all) only
have data to show if statistics_level is set to ALL. You can set that at
the session level.
Has anyone done measurements on a busy system to evaluate what the impact
is of setting that system-wide. The impression I have is that it is not
something I want to set in production all the time.'

Tom Kyte
December 30, 2003 - 2:37 pm UTC

there are only two ways I know to get the count.

a) using the explain plan command. that is a HARD PARSE for each and every query plus IO to the plan table plus a query to get it out. ugh.

b) using v$sql_plan or more optimally v$sql_plan_statistics_all (queries better against the v$sql_plan_statistics_all -- v$sql_plan seems suboptimal, mixture of lots of x$, and real tables in that one, less so on v$sql_plan_statistics).


if you are doing 100 or so queries/second, they must not be returning large result sets -- what sizes are you expecting here?

compute statistics

David Rodriguez, December 30, 2003 - 2:55 pm UTC

Hi

It's David from Argentina, I did another test today by computing statistics and still getting two rows as cardinality after merge join when we are expecting 4 million roughly....

I tried to read the 10053 output but I am totally lost...

TIA

Tanel Poder, December 30, 2003 - 2:57 pm UTC

In my 9.2.0.4 on W2k, the v$sql_plan_statistics seems to be the simplest view, querying only from QES view x$qesrstat and v$sql_plan_statistics all is based on several tables.

Or did you measure the latching & stats?

Thanks.


Tom Kyte
December 30, 2003 - 3:20 pm UTC

v$sql_plan_statistics doesn't have the cardinality, so we can only compare

v$sql_plan
v$sql_plan_statistics_ALL



Ryan Gaffuri

Ryan Gaffuri, December 30, 2003 - 8:33 pm UTC

Thanks Tom...

Not sure how big the result sets will be. I only have development data right now. Its supposed to be similiar to production data. The largest result sets have been 2,400 records which is too large to return all at once and meet the SLA.

I was going to implement something similiar to the 'pagination' procedures in your second book. I'll return 25 rows at a time via a REF CURSOR. The very first time the query is issued, I'll get an estimate of the total number of rows as well.

It's the best way I can think of to do it.

Tom Kyte
December 31, 2003 - 9:19 am UTC

the only ways I know to get the estimation are from the v$ tables or explain plan.

explain plan will be hugely hugely expensive.
v$ will just be expensive



to David Rodriguez

A reader, January 03, 2004 - 4:13 pm UTC

Hello

I suspect you are hitting bug 2109825, INCORRECT CARDINALITY ESTIMATE WITH OR PREDICATES AND NULL VALUES

Fix, 10g.... I feel your pain sorry

Tom Kyte
January 03, 2004 - 4:56 pm UTC

with lots of backports to 8iR3, 9iR1 and 9iR2 (2 years ago). contact support if you are having pain with this one.

Can't grab from v$_plan_statistics_all

J, March 11, 2004 - 1:58 pm UTC

Hi Tom.

I've taken the above recommendation and modified it to run a test case here - Alas, I'm having difficulties with v$_plan_statistics_all. I can see it, but get blown out when selecting from it. Sorry, but I've looked at this for some time, and I just can't see what's missing. Can you please review this for me?  Many Thanks!

SQL> select * from v$version;

BANNER                                                                          
----------------------------------------------------------------                
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production                
PL/SQL Release 9.2.0.4.0 - Production                                           
CORE    9.2.0.3.0    Production                                                       
TNS for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production                
NLSRTL Version 9.2.0.4.0 - Production                                           

SQL> 
SQL> select count(*)
  2  from   addresses
  3  where  cntry_cd = 'US';

  COUNT(*)                                                                      
----------
313052      

SQL> 
SQL> declare
  2      lv_card number;
  3      cursor c1 is
  4         select /* tag1 */ *
  5         from   addresses
  6         where  cntry_cd = 'US';
  7  begin
  8      open c1;
  9  
 10      select cardinality
 11      into   lv_card
 12      from   v$sql_plan_statistics_all
 13      where  (address, hash_value) =
 14             (select distinct address, hash_value
 15          from   v$open_cursor
 16          where  upper(sql_text) like 'SELECT /* TAG1 */%'
 17          and    sid =
 18                 (select sid
 19              from   v$mystat
 20              where  rownum = 1)
 21             )
 22      and    child_number = 1
 23      and    id = 1
 24      and    rownum = 1;
 25  
 26      dbms_output.put_line('Approx Rowcount: '||lv_card );
 27  
 28  end;
 29  /
declare
*
ERROR at line 1:
ORA-01403: no data found 
ORA-06512: at line 10 


SQL> 
SQL> select * from v$sql_plan_statistics_all;

no rows selected

SQL> 
SQL> select * from v$open_cursor
  2  where sid = (select sid from v$mystat where rownum = 1)
  3  and rownum = 1;

SADDR                   SID USER_NAME                      ADDRESS              
---------------- ---------- ------------------------------ ----------------     
HASH_VALUE SQL_TEXT                                                             
---------- ------------------------------------------------------------         
070000000A4CC608         32 ORS                            0700000010602788     
1876698668 select * from v$open_cursor where sid = (select sid from v$m         
 

Tom Kyte
March 11, 2004 - 2:15 pm UTC

what is your statistics level (init.ora)

Statistics Level: Type = 2, Value = 'TYPICAL'

J, March 11, 2004 - 2:28 pm UTC

Wowza, that was a quick response! What brand of coffee do you drink? (Gotta get some of THAT stuff)... =o)

Thanks!
- J

Tom Kyte
March 11, 2004 - 2:38 pm UTC

ops$tkyte@ORA9IR2> l
  1  SELECT STATISTICS_NAME,
  2         SESSION_STATUS,
  3         SYSTEM_STATUS,
  4         ACTIVATION_LEVEL,
  5         SESSION_SETTABLE
  6    FROM v$statistics_level
  7*  ORDER BY 1
ops$tkyte@ORA9IR2> /
 
                               Session    System     Activation Session
Statistics Name                Status     Status     Level      Settable
------------------------------ ---------- ---------- ---------- ----------
Buffer Cache Advice            ENABLED    ENABLED    TYPICAL    NO
MTTR Advice                    ENABLED    ENABLED    TYPICAL    NO
PGA Advice                     ENABLED    ENABLED    TYPICAL    NO<b>
Plan Execution Statistics      ENABLED    ENABLED    ALL        YES</b>
Segment Level Statistics       ENABLED    ENABLED    TYPICAL    NO
Shared Pool Advice             ENABLED    ENABLED    TYPICAL    NO
Timed OS Statistics            ENABLED    ENABLED    ALL        YES
Timed Statistics               ENABLED    ENABLED    TYPICAL    YES
 
8 rows selected.


(black coffee, takes too long to add sugar/milk ;) 

Timed Statistics - Aha!

J, March 11, 2004 - 2:54 pm UTC

Thanks for the lesson, coach!  Excuse me, but I've got to discuss things with the DBA....

SQL> select statistics_name, session_status,
  2         system_status, activation_level
  3         session_settable
  4  from   v$statistics_level
  5* order by 1;


STATISTICS_NAME                 SESSION_ SYSTEM_S SESSION                                          
------------------------------- -------- -------- -------                                          
Buffer Cache Advice             ENABLED  ENABLED  TYPICAL                                          
MTTR Advice                     ENABLED  ENABLED  TYPICAL                                          
PGA Advice                      ENABLED  ENABLED  TYPICAL                                          
Plan Execution Statistics       DISABLED DISABLED ALL                                              
Segment Level Statistics        ENABLED  ENABLED  TYPICAL                                          
Shared Pool Advice              ENABLED  ENABLED  TYPICAL                                          
Timed OS Statistics             DISABLED DISABLED ALL                                              
Timed Statistics                ENABLED  ENABLED  TYPICAL  

statistics_level=ALL adds significant overhead

Vlado Barun, June 16, 2004 - 6:22 pm UTC

Below is a test I have done on an AIX server, where I timed the execution of a query for different statistics_level settings (without timed_os_statistics). I have no question, just sharing my results :))

SQL> alter system set statistics_level=basic;

System altered.

SQL> 
SQL> SELECT  STATISTICS_NAME
  2         , SYSTEM_STATUS
  3  FROM    v$statistics_level;

STATISTICS_NAME                                                  SYSTEM_S                                      
---------------------------------------------------------------- --------                                      
Buffer Cache Advice                                              DISABLED                                      
MTTR Advice                                                      DISABLED                                      
Timed Statistics                                                 ENABLED                                       
Timed OS Statistics                                              DISABLED                                      
Segment Level Statistics                                         DISABLED                                      
PGA Advice                                                       DISABLED                                      
Plan Execution Statistics                                        DISABLED                                      
Shared Pool Advice                                               DISABLED                                      

8 rows selected.

SQL> 
SQL> set timing on;
SQL> 
SQL> declare
  2   i int;
  3       h int;
  4  begin
  5       i:=10;
  6       for a in 1..i loop
  7           SELECT  count(*)
  8           INTO    H
  9           FROM    msorg.participant p
 10             , its.part_transaction_interest pti
 11             , its.interest_table it
 12           WHERE   p.org_id = pti.fk_org_id
 13           AND     pti.fk_interest_table_id = it.interest_table_id;
 14           dbms_output.put_line('number of records returned => ' || h);
 15       end loop;
 16  end;
 17  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:37.26
SQL> set timing off;
SQL> 
SQL> alter system set statistics_level=typical;

System altered.

SQL> 
SQL> SELECT  STATISTICS_NAME
  2         , SYSTEM_STATUS
  3  FROM    v$statistics_level;

STATISTICS_NAME                                                  SYSTEM_S                                      
---------------------------------------------------------------- --------                                      
Buffer Cache Advice                                              DISABLED                                      
MTTR Advice                                                      DISABLED                                      
Timed Statistics                                                 ENABLED                                       
Timed OS Statistics                                              DISABLED                                      
Segment Level Statistics                                         ENABLED                                       
PGA Advice                                                       ENABLED                                       
Plan Execution Statistics                                        DISABLED                                      
Shared Pool Advice                                               ENABLED                                       

8 rows selected.

SQL> 
SQL> 
SQL> 
SQL> declare
  2   i int;
  3       h int;
  4  begin
  5       i:=10;
  6       for a in 1..i loop
  7           SELECT  count(*)
  8           INTO    H
  9           FROM    msorg.participant p
 10             , its.part_transaction_interest pti
 11             , its.interest_table it
 12           WHERE   p.org_id = pti.fk_org_id
 13           AND     pti.fk_interest_table_id = it.interest_table_id;
 14           dbms_output.put_line('number of records returned => ' || h);
 15       end loop;
 16  end;
 17  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:39.98
SQL> set timing off;
SQL> 
SQL> alter system set statistics_level=ALL;

System altered.

SQL> alter system set TIMED_OS_STATISTICS=0;

System altered.

SQL> 
SQL> SELECT  STATISTICS_NAME
  2         , SYSTEM_STATUS
  3  FROM    v$statistics_level;

STATISTICS_NAME                                                  SYSTEM_S                                      
---------------------------------------------------------------- --------                                      
Buffer Cache Advice                                              DISABLED                                      
MTTR Advice                                                      DISABLED                                      
Timed Statistics                                                 ENABLED                                       
Timed OS Statistics                                              DISABLED                                      
Segment Level Statistics                                         ENABLED                                       
PGA Advice                                                       ENABLED                                       
Plan Execution Statistics                                        ENABLED                                       
Shared Pool Advice                                               ENABLED                                       

8 rows selected.

SQL> 
SQL> set timing on;
SQL> declare
  2   i int;
  3       h int;
  4  begin
  5       i:=10;
  6       for a in 1..i loop
  7           SELECT  count(*)
  8           INTO    H
  9           FROM    msorg.participant p
 10             , its.part_transaction_interest pti
 11             , its.interest_table it
 12           WHERE   p.org_id = pti.fk_org_id
 13           AND     pti.fk_interest_table_id = it.interest_table_id;
 14           dbms_output.put_line('number of records returned => ' || h);
 15       end loop;
 16  end;
 17  /

PL/SQL procedure successfully completed.

Elapsed: 00:03:36.48
SQL> 
SQL> 
SQL> set timing off;
SQL> alter system set statistics_level=typical;

System altered.

SQL> 
SQL> spool off
 

Mathew Butler, April 04, 2006 - 9:36 am UTC

I've just been playing around with the queries in the content of this thread. I'm looking for a means of estimating the number of rows returned from a query, but without adding in a tag to the sql. As far as I can see to do this reliably I need to use something other than v$open_cursor ( since this doesn't store the entire SQL statement ). I'm not sure that I can use v$sql - address, hash_value and child_number are in here, but I'm not clear that these will uniquely identify a row, nor how I might filter to identify only my data ( no session info in here to join to ).

Why am I concerned about using a tag in the SQL? I think for invalid reasons:
- The tag should be unique ( but only within the session )
- It's something that the developer needs to remember to do ( but this can be turned into a utility that manages the tagging )
So, possibly some sloppy thinking here.


The next best thing, would be to use explain plan as per </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4456921593765#5881975252271 <code>

However, this requires a plan table to exist, I would prefer a solution that has fewer dependencies. I think this solution also requires an extra hard parse.


Just looking for a lead really to checking the entire sql. Is this possible?.

Having written all the above, I'm thinking that the v$sql_plan_statistics_all solution may be the best ( and only ) approach.

Question: How do I switch "Plan Execution Statistics" on in order for the v$sql_plan_statistics_all view to be populated but avoid switching on "Timed OS Statistics"


Tom Kyte
April 04, 2006 - 7:21 pm UTC

database version?

Whoops...

Mathew Butler, April 05, 2006 - 8:22 am UTC

Win2K 9.2.0.6

Tom Kyte
April 05, 2006 - 6:07 pm UTC

bummer, in 10g - you have this:

</code> http://docs.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_xplan.htm#sthref11613 <code>

so, after the cursor is executed, this would get you the right stuff.


I don't think you want to turn on the extended statistics stuff just for this, it would add a certain amount of run time overhead - and the way to get that particular view populated is somewhat version dependent (changes with the dot releases until 10gr2)...

Something to look forward to...

Mathew Butler, April 10, 2006 - 9:37 am UTC

Thanks for the pointer - no current plan to move to 10G though so the "explain plan" and plan table implementation appears to be the way to go - I agree with your summary.

A cheeky question:

I've just found out that we have been running with the optimizer_index_cost_adj and optimizer_index_caching parameters set to incorrect values ( some initial testing shows that we can get significant performance improvement by setting these back to the default ). The implementation plan is to set these back to the default at the same time as deploying system stats.

As a result of changing the optimizer_index_* parameters one highlighted problem querys runtime went from 6 minutes down to 30 seconds.

My question: What is the strategy for identifying the optimal runtime performance for a given query?. Basically, we tuned the SQL and got an improved time ( before the param changes ). However, ideally we should have identified that the plan itself was was not optimal given the data and found that there was an issue with the optimizers configuration.

This might be one for when you are accepting questions.




Alexander the ok, June 13, 2006 - 1:54 pm UTC

Tom,

Is there anyway to get an expected amount of time for a query to execute, prior to running it?

This could be used for a status bar in an application, to show how much % complete the query is...maybe.

Tom Kyte
June 13, 2006 - 5:03 pm UTC

version?

9.2.0

Alexander the ok, June 13, 2006 - 6:46 pm UTC


Tom Kyte
June 14, 2006 - 8:39 am UTC

in 10g, yes, in v$sql_plan.

In 9i, there is no real estimate "exposed", the only time you ever see the guess is when you use the resource manager and have a query that is estimated to run longer than you are allowed.

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6161932829690#7718237225556 <code>



Some queries cause NULL cardinality in v$SQL_PLAN_STATISTICS_ALL

Robert Piras, August 03, 2006 - 5:02 am UTC

(Version: Oracle9i Enterprise Edition Release 9.2.0.6.0)

In general, your code sample (based on v$SQL_PLAN_STATISTICS_ALL) returns pretty good estimates. However, some types of queries (for example, IN or EXISTS with correlated subquery) return NULL; but the method using cardinality from plan_table still returns a value in these instances. 
Reproducible test cases below; I know the queries don't make much logical sense, but they still demonstrate the point. (My original query where I noticed this problem is a valid query from our application).
Any comments appreciated.

------------------------------------------


CREATE OR REPLACE PACKAGE BODY Query_Veto AS
/

    /*
     * Estimate number of rows returned for given query.
     * This uses the v$statistics views with the open cursor.
     */
  FUNCTION estimate_num_rows(p_Query IN VARCHAR2) RETURN NUMBER IS

    query_cv GENERICCURTYP;
    l_card NUMBER;
    
    BEGIN
    
      OPEN query_cv FOR p_Query;
      
      /*
       * NB v$open_cursor.sql_text is 60 chars - so compare first
       * 60 chars of query only.
       */
      SELECT cardinality 
      INTO l_card
      FROM v$SQL_PLAN_STATISTICS_ALL
      WHERE (address,hash_value) = 
               (SELECT DISTINCT address,hash_value
                FROM v$open_cursor
                WHERE upper(sql_text) like '%'||UPPER(SUBSTR(p_Query,1,60))||'%'
                AND sid = (SELECT sid FROM v$mystat WHERE rownum=1))
      AND child_number = 0
      AND id = 1
      AND rownum = 1;

      
      CLOSE query_cv;
      
    RETURN l_card;
    
  END;
  

  /*
   * Estimate number of rows returned for given query.
   * This usses the Explain Plan Table results.
   */
  FUNCTION estimate_num_rows_alt(p_Query IN VARCHAR2) RETURN NUMBER IS

    l_card NUMBER;
    PRAGMA AUTONOMOUS_TRANSACTION;
    l_stmtid varchar2(25) default 'ESTIMATED';
    
    BEGIN
    
      EXECUTE IMMEDIATE
        'EXPLAIN PLAN SET STATEMENT_ID= ''' || l_stmtid || ''' FOR  ' || p_query;

      SELECT cardinality INTO l_card
      FROM plan_table
      WHERE statement_id = l_stmtid
      AND id = 0;

      DELETE FROM plan_table WHERE STATEMENT_ID = l_stmtid;
      COMMIT;
  
    RETURN l_card;
    
  END;

  
END Query_Veto;
/


SQL> create table myObjects as select * from user_objects;

Table created.

SQL>  create table MySource as select * from User_Source;

Table created.

SQL> analyze table MySource compute statistics;

Table analyzed.

SQL> analyze table MyObjects compute statistics;

Table analyzed.

SQL> create table MyIndexes as select * from user_indexes;

Table created.

SQL> analyze table  MyIndexes compute statistics;

Table analyzed.

SQL> variable n number

SQL> begin
  2  :n := QUERY_VETO.Estimate_Num_Rows(
  3  'SELECT o.Object_Name from MyObjects o, MySource s where o.OBJECT_NAME
  4  = s.name');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> print :n

         N
----------
     13622


SQL> begin
  2  :n := QUERY_VETO.Estimate_Num_Rows(
  3  'select o.OBJECT_NAME from MyObjects o, MySource s where o.OBJECT_NAME = s.Name
  4  AND o.OBJECT_TYPE LIKE s.NAME||''%''');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> print :n

         N
----------
       681     


SQL> begin
  2  :n := QUERY_VETO.Estimate_Num_Rows(
  3  'select o.OBJECT_NAME from MyObjects o, MySource s where o.OBJECT_NAME = s.Name
  4  AND EXISTS (SELECT 1 FROM MYIndexes i WHERE i.TABLESPACE_NAME LIKE s.NAME||''%'')')
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> print :n

         N
----------


SQL> begin
  2  :n := QUERY_VETO.Estimate_Num_Rows(
  3  'select o.OBJECT_NAME from MyObjects o, MySource s where o.OBJECT_NAME = s.Name
  4  AND o.OBJECT_TYPE IN (SELECT i.INDEX_TYPE FROM MYIndexes i
  5  WHERE i.TABLESPACE_NAME LIKE s.NAME||''%'')');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> print :n

         N
----------


SQL> begin
  2  :n := QUERY_VETO.Estimate_Num_Rows_Alt(
  3  'select o.OBJECT_NAME from MyObjects o, MySource s where o.OBJECT_NAME = s.Name
  4  AND o.OBJECT_TYPE IN (SELECT i.INDEX_TYPE FROM MYIndexes i
  5  WHERE i.TABLESPACE_NAME LIKE s.NAME||''%'')');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> print :n

         N
----------
       681
       
(End).

 

addendum to previous entry

Robert Piras, August 03, 2006 - 5:54 am UTC

Sorry, you also needthe package spec:-

CREATE OR REPLACE PACKAGE Query_Veto AS

/*
* Estimate number of rows returned for given query.
* This uses the v$statistics views with the open cursor.
*/
FUNCTION estimate_num_rows(p_Query IN VARCHAR2) RETURN NUMBER;

/*
* Estimate number of rows returned for given query.
* This usses the Explain Plan Table results.
*/
FUNCTION estimate_num_rows_alt(p_Query IN VARCHAR2) RETURN NUMBER;

TYPE GenericCurTyp IS REF CURSOR;

END Query_Veto;


Tom Kyte
August 03, 2006 - 9:36 am UTC

Interesting, v$sql_plan* views are missing that - if you trim the test case down (eg: the package isn't really needed, just confuses the issue) please file this with support. The information is in fact missing

Associated Bug raised by Oracle Support

R.Piras, August 10, 2006 - 4:48 am UTC

Thank you for the response. Oracle Support have assigned this to Development as Associated Bug 5454641.


interview question

A reader, November 15, 2006 - 7:17 pm UTC

how can we estimate execution time of a query without excuting the query.

any ideas ?????

Tom Kyte
November 16, 2006 - 3:02 pm UTC

ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from dual;

Execution Plan
----------------------------------------------------------
Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     2 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
 

Estimate rows; with bind variables

dayneo, August 28, 2011 - 9:51 am UTC

Hi Tom,  

I am trying to implement this "row estimate" method as you have demonstrated. At the same time, I would like to be using bind variables to reduce security risks. My understanding was that the optimizer utilises bind variable peeking, but I am finding that this is not happening on my environment and I can't figure out why; or perhaps need to understand how to ask the optimizer to do it.

Ora 10.2.0.1 (and 10.1.0.4)

[code]
dayneo@STAGE> SET AUTOTRACE ON EXPLAIN
dayneo@STAGE> ALTER SYSTEM FLUSH SHARED_POOL;
dayneo@STAGE> VARIABLE BV_NULL VARCHAR2(10);
dayneo@STAGE> VARIABLE BV_CHAR VARCHAR2(10);
dayneo@STAGE> EXEC :BV_CHAR := 'MR%';

PL/SQL procedure successfully completed.

dayneo@STAGE> 
dayneo@STAGE> select /* DPO-UNIQUE_TAG-5 */
  2       COUNT(*)
  3  from mims_material_src
  4  where road_no like 'MR%'
  5  /

  COUNT(*)
----------
       370


Execution Plan
----------------------------------------------------------
Plan hash value: 974776596

--------------------------------------------------------------------------------
-------

| Id  | Operation         | Name              | Rows  | Bytes | Cost (%CPU)| Tim
e     |

--------------------------------------------------------------------------------
-------

|   0 | SELECT STATEMENT  |                   |     1 |     7 |     3   (0)| 00:
00:01 |

|   1 |  SORT AGGREGATE   |                   |     1 |     7 |            |
      |

|*  2 |   INDEX RANGE SCAN| MIMS_MAT_SRC_IX03 |   368 |  2576 |     3   (0)| 00:
00:01 |

--------------------------------------------------------------------------------
-------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ROAD_NO" LIKE 'MR%')
       filter("ROAD_NO" LIKE 'MR%')

dayneo@STAGE> 
dayneo@STAGE> select /* DPO-UNIQUE_TAG-7 */
  2       COUNT(*)
  3  from mims_material_src T
  4  where road_no like :BV_CHAR
  5  /

  COUNT(*)
----------
       370


Execution Plan
----------------------------------------------------------
Plan hash value: 974776596

--------------------------------------------------------------------------------
-------

| Id  | Operation         | Name              | Rows  | Bytes | Cost (%CPU)| Tim
e     |

--------------------------------------------------------------------------------
-------

|   0 | SELECT STATEMENT  |                   |     1 |     7 |     2   (0)| 00:
00:01 |

|   1 |  SORT AGGREGATE   |                   |     1 |     7 |            |
      |

|*  2 |   INDEX RANGE SCAN| MIMS_MAT_SRC_IX03 |    65 |   455 |     2   (0)| 00:
00:01 |

--------------------------------------------------------------------------------
-------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ROAD_NO" LIKE :BV_CHAR)
       filter("ROAD_NO" LIKE :BV_CHAR)

dayneo@STAGE> 
dayneo@STAGE> select /* DPO-UNIQUE_TAG-6 */
  2       COUNT(*)
  3  from mims_material_src
  4  where road_no like :BV_NULL
  5  /

  COUNT(*)
----------
         0


Execution Plan
----------------------------------------------------------
Plan hash value: 974776596

--------------------------------------------------------------------------------
-------

| Id  | Operation         | Name              | Rows  | Bytes | Cost (%CPU)| Tim
e     |

--------------------------------------------------------------------------------
-------

|   0 | SELECT STATEMENT  |                   |     1 |     7 |     2   (0)| 00:
00:01 |

|   1 |  SORT AGGREGATE   |                   |     1 |     7 |            |
      |

|*  2 |   INDEX RANGE SCAN| MIMS_MAT_SRC_IX03 |    65 |   455 |     2   (0)| 00:
00:01 |

--------------------------------------------------------------------------------
-------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ROAD_NO" LIKE :BV_NULL)
       filter("ROAD_NO" LIKE :BV_NULL)

[/code]

From the above example, my first statement uses literals and the explain plan very closely estimates the number of matched rows. The next statement uses a bind variable that has the same value as the first statement. However, this time the optimizer estimates the rows to be only 65. In the third statement, I use a bind variable that has not been initialized with a value and the optimizer gives a row estimate of 65; the same as the second statement. I would say, bind variable peeking is not happening here.
 
Is it possible to get better estimates with bind variables?

Tom Kyte
August 30, 2011 - 5:05 pm UTC

explain plan DOES NOT BIND PEEK.

http://asktom.oracle.com/Misc/when-explanation-doesn-sound-quite.html


do this instead:

ops$tkyte%ORA11GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where object_name like 'AB%';

OWNER                          OBJECT_NAME
------------------------------ ------------------------------
SUBOBJECT_NAME                  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE
------------------------------ ---------- -------------- -------------------
CREATED   LAST_DDL_ TIMESTAMP           STATUS  T G S  NAMESPACE
--------- --------- ------------------- ------- - - - ----------
EDITION_NAME
------------------------------
XDB                            ABSPATH
                                    57031                OPERATOR
05-SEP-10 05-SEP-10 2010-09-05:15:49:15 VALID   N N N          1


PUBLIC                         ABSPATH
                                    57034                SYNONYM
05-SEP-10 05-SEP-10 2010-09-05:15:48:25 VALID   N N N          1



ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable x varchar2(20)
ops$tkyte%ORA11GR2> exec :x := 'AB%'

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> select * from t where object_name like :x;

OWNER                          OBJECT_NAME
------------------------------ ------------------------------
SUBOBJECT_NAME                  OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE
------------------------------ ---------- -------------- -------------------
CREATED   LAST_DDL_ TIMESTAMP           STATUS  T G S  NAMESPACE
--------- --------- ------------------- ------- - - - ----------
EDITION_NAME
------------------------------
XDB                            ABSPATH
                                    57031                OPERATOR
05-SEP-10 05-SEP-10 2010-09-05:15:49:15 VALID   N N N          1


PUBLIC                         ABSPATH
                                    57034                SYNONYM
05-SEP-10 05-SEP-10 2010-09-05:15:48:25 VALID   N N N          1



ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
SQL_ID  dusy9vva55s0a, child number 0
-------------------------------------
select * from t where object_name like :x

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   288 (100)|          |
|*  1 |  TABLE ACCESS FULL| T    |     2 |   194 |   288   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME" LIKE :X)


18 rows selected.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t where object_name like :x;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  3608 |   341K|   288   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T    |  3608 |   341K|   288   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME" LIKE :X)

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> 



using dbms_xplan.display cursor will get the plan of the last executed sql statement in your session from v$sql - explain plan is often wrong.

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.