## Question and Answer

## You Asked

Hi Tom,

Please can you explain what the term cardinality really means, what its importance is, what role it plays in various situations, and how it influences the optimizer? I have searched far and wide, but the real meaning and importance has eluded me so far.

Please help..

Regards,

Aru.


## and Tom said...

1 entry found for cardinality.

cardinality

<mathematics> The number of elements in a set. If two sets

have the same number of elements (i.e. there is a bijection

between them) then they have the same cardinality. A

cardinality is thus an isomorphism class in the category

of sets.

---------------------------------------------------------------

It literally means that: in a plan, Card=NNN is the number of rows the optimizer expects to flow from one row source operation to the next:

new 1: explain plan for select * from emp, dept where emp.deptno =dept.deptno

Explained.

scott@ORA10G> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT

----------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes |

----------------------------------------------------------------

| 0 | SELECT STATEMENT | | 14 | 798 |

| 1 | MERGE JOIN | | 14 | 798 |

| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 |

| 3 | INDEX FULL SCAN | PK_DEPT | 4 | |

|* 4 | SORT JOIN | | 14 | 518 |

| 5 | TABLE ACCESS FULL | EMP | 14 | 518 |

----------------------------------------------------------------

The optimizer thinks that a full scan of emp in this case will result in 14 rows, an index full scan to read dept will result in 4, and, after joining emp to dept, there will be 14 rows.
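As a back-of-the-envelope sketch of where a join Card value can come from, here is the textbook CBO join-cardinality model applied to this plan. The NDV figures are assumptions about the standard SCOTT demo data (emp.deptno has 3 distinct values, dept.deptno has 4); they are not printed in the transcript:

```python
# Textbook join-cardinality model (a sketch, not Oracle's exact code):
#   card(join) ~= card(A) * card(B) / max(ndv(A.join_col), ndv(B.join_col))
card_emp = 14        # rows expected from the full scan of EMP
card_dept = 4        # rows expected from the index full scan of DEPT
ndv_emp_deptno = 3   # assumed: distinct DEPTNO values present in EMP
ndv_dept_deptno = 4  # assumed: distinct DEPTNO values in DEPT

est_join = card_emp * card_dept / max(ndv_emp_deptno, ndv_dept_deptno)
print(est_join)  # 14.0 -- matching Rows=14 on the MERGE JOIN line
```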

If the cardinality is way off, it can be disastrous. Consider:

ops$tkyte@ORA10G> create table t as select * from all_objects;

Table created.

ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T', method_opt=>'for all indexed columns');

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> create index t_idx on t(object_id);

Index created.

ops$tkyte@ORA10G> exec dbms_stats.gather_index_stats( user, 'T_IDX' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G>

ops$tkyte@ORA10G> set autotrace traceonly explain

ops$tkyte@ORA10G> select * from t where object_id = 42;

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=482 Bytes=44826)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=482 Bytes=44826)

2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=193)

**Right now, due to the lack of any column stats (we gathered them for all indexed columns BEFORE creating the index in this example!), the optimizer got the estimated cardinality way wrong. In this case, "so what" -- but what if this were a subquery, and the subquery is thought to return almost 500 rows, not the ONE row we know it returns?**

ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'T', method_opt=>'for all indexed columns');

PL/SQL procedure successfully completed.

ops$tkyte@ORA10G> select * from t where object_id = 42;

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=93)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=93)

2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=1)

ops$tkyte@ORA10G> set autotrace off

**Now it got it right, and the resulting plans that incorporate this query will be better for it.**


# Comments

Can the optimizer estimate the cardinality of a combination of predicates on the same table with reasonable accuracy ?

For example:

select *

from t

where c1 = 1

and c2 = 20

;

Analyzing the table t would provide cardinality estimates for predicates [c1 = 1] and [c2 = 20]. But what about the cardinality of the combination of both predicates ? How does the optimizer guess this ? How do we improve the optimizer guess about this ?

The cardinality of the combination is the most important info here, because if table t were part of a larger query (with other tables and joins), and if the predicates above were evaluated first on table t, then the cardinality estimate of the predicate combination might dictate what operation (hash join, nested loops, etc.) the optimizer chooses to join the row source from table t (after applying the predicates) with other row sources.


This is where SQL Profiles in 10g come in really nice. You can in effect "analyze a query". But even before that, dynamic sampling can be used to great effect. It comes into play during the hard parse phase (optimization). Consider:

ops$tkyte@ORA9IR2> create table t ( x int, y int );

Table created.

ops$tkyte@ORA9IR2> insert into t
  2  select mod(rownum,100), mod(rownum,250)
  3  from all_objects;

27951 rows created.

ops$tkyte@ORA9IR2> update t set y = 42 where x = 55;

279 rows updated.

ops$tkyte@ORA9IR2> commit;

Commit complete.

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all columns size 254' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where y = 42 and x = 55;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=3 Bytes=18)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=9 Card=3 Bytes=18)

**It thinks 3, based on x having 100 values and y having 250. If we dynamically sample at level 4 (see the Performance Tuning Guide for what the levels are), it runs a sampling query to discover this information in order to parse:**

ops$tkyte@ORA9IR2> select /*+ DYNAMIC_SAMPLING(t 4) */ * from t where y = 42 and x = 55;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=279 Bytes=1674)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=9 Card=279 Bytes=1674)

**Quite different -- below we'll see why.**

ops$tkyte@ORA9IR2> select * from t where y = 42 and x = 56;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=3 Bytes=18)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=9 Card=3 Bytes=18)

ops$tkyte@ORA9IR2> select /*+ DYNAMIC_SAMPLING(t 4) */ * from t where y = 42 and x = 56;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=6)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=9 Card=1 Bytes=6)

ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> select count(*) from t where y = 42 and x = 55;

  COUNT(*)
----------
       279

ops$tkyte@ORA9IR2> select count(*) from t where y = 42 and x = 56;

  COUNT(*)
----------
         0
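To see why the unsampled estimate comes out near 3, here is a sketch of the independence arithmetic. The per-value row counts are derived from the data-generation logic above (279 rows with x = 55, and roughly 112 + 279 = 391 rows with y = 42 after the update); they are inferred, not printed in the transcript:

```python
# Independence assumption for ANDed predicates (a sketch):
#   est = n * sel(x = 55) * sel(y = 42)
n = 27951              # rows in T
rows_x55 = 279         # rows with x = 55 (all were updated to y = 42)
rows_y42 = 112 + 279   # inferred: ~112 "natural" y = 42 rows plus the 279 updated

est = n * (rows_x55 / n) * (rows_y42 / n)
print(round(est, 1))   # ~3.9, consistent with Card=3 in the plan
print(rows_x55)        # 279 -- the count dynamic sampling actually finds
```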

Thanks for the info. Dynamic sampling does open a whole new avenue for estimating single table cardinality for multiple predicates.

1. Since you mention that dynamic sampling occurs during hard parse, how is it done for query with bind variables ? Does bind variable peeking happen and sampling done for bind variable values ? If so one should be aware that the cardinality estimates might be totally off-mark when the same plan is re-used for different bind variable values.

2. In the Oracle 9i R2 database performance guide and reference manual page 6-8, the following is mentioned:

"Dynamic sampling is repeatable if no rows have been inserted, deleted, or updated in the table being sampled". What does the above statement mean ?

3. Also what does dynamic sampling at the cursor level (i.e. with hint /*+ DYNAMIC_SAMPLING(<integer>) */ do ? Does it means that dynamic sampling is done for all tables (with predicates applied) in the query ?


1) it'll use bind variable peeking with all that entails and implies. The first one to hard parse sets the plan.

2) It means (my interpretation) that if you hard parse a query, and then 5 minutes later hard parse the same query, the plan could be different, since the dynamic sample depends on the data that exists RIGHT THEN. It is like doing a mini-analyze during each hard parse. (It is more than "like" that -- you could say it is a small analyze.)

3) see the docs for the hint -- it changes the meaning of level. chapter 5 page 40 in the 9ir2 doc set.
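The "mini-analyze" described in 2) can be sketched as follows: instead of trusting stored column statistics, scan a small slice of the table at parse time and extrapolate. This simulation uses a deterministic leading slice rather than Oracle's block sampling, and the data mimics the earlier T(x, y) example:

```python
# Simulated parse-time sampling (a sketch; Oracle samples blocks, and a
# repeated hard parse may pick different blocks if the data has changed).
n = 27951
rows = [(i % 100, 42 if i % 100 == 55 else i % 250) for i in range(1, n + 1)]

sample = rows[:1000]   # stand-in for a small sample of the table
hits = sum(1 for x, y in sample if x == 55 and y == 42)
est = hits / len(sample) * n   # extrapolate sample selectivity to the table

print(hits, round(est, 1))  # 10 279.5 -- close to the true count of 279
```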


That was a nice definition of cardinality (where did it come from?). What I at least THINK I've observed is that the term gets confused with selectivity, which I think is used in two different ways. When one talks about an index's selectivity, I've seen it defined as the number of distinct rows divided by the total number of rows (the cardinality of the table?), so that as you approach 1, you have a selective index. I think I've also seen it as how many rows come back on average out of the total, so that in the case of a table with 100 distinct values, a selectivity of .01 would "be good" -- that is, approaching zero is "good", as opposed to 1. Would you agree with this observation? Also, cardinality is the number of rows within the context of an operation, and has nothing to do with the number of distinct rows, correct?

I've been using http://www.dictionary.com/ recently. Fast, minimal UI.

cardinality has to be taken in context.

There is the cardinality of a table (number of rows in table)

There is the cardinality of a row source operation (step in a plan) -- the number of rows flowing out of that step.

A result set (which technically is a table!) has a cardinality.

The steps in the plan have cardinality.

cardinality is the count.


Hi Tom,

I was thinking that CARDINALITY is a collection function like here in the link.

http://www.cis.unisa.edu.au/oracle/server.101/b10759/functions014.htm#i1269375

But here it is discussed as a HINT for the optimizer.

I am guessing that they are not related. You mentioned that cardinality was in 9i itself (as a statement to my earlier post on SQL*Loader). Which cardinality did you mean by that?

Thank you,

Rahul


cardinality in that context is a function that returns the count of the elements in a collection.

Tom,

This is on your follow up to the "original" question.

The cardinality shown here (below) is way off. But, how come the cost is accurate? I mean, how did the optimiser make the right guess that it just has to do 1 LIO to get the row? Is it from the stats in user_indexes? (leaf_blocks?)

Pasted below (from your follow up)

ops$tkyte@ORA10G> select * from t where object_id = 42;

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=482 Bytes=44826)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=482

Bytes=44826)

2 1 INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=193)

Thank You,


The only stats that were missing were the column stats; everything else (table, index) was there.

clustering factor, height, all of that stuff.


So Tom, how does the optimizer calculate the cost for individual operations (FTS, IRS, IFFS, etc.)?

It takes the various parameter settings in place and, using the built-in model, assigns costs. There is no single constant.

Consider the following example:

create table t
as
select trunc(rownum/10) c1,
       trunc(rownum/10) + 5 c2
       -- c1 and c2 values go together, i.e. when c1 = 5 then c2 = 10
       -- and when c2 = 10 then c1 = 5
  from all_objects;

exec dbms_stats.gather_table_stats(ownname=>'GSP_DBO',tabname=>'T',method_opt=>'FOR ALL COLUMNS SIZE 254');

SQL> show parameter dynamic

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
optimizer_dynamic_sampling           integer     1

explain plan for select * from t where c1 = 5;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    10 |    70 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |    10 |    70 |     3 |
--------------------------------------------------------------------

explain plan for select * from t where c2 = 10;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    10 |    70 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |    10 |    70 |     3 |
--------------------------------------------------------------------

explain plan for
select * from t
 where c1 = 5
   and c2 = 10   -- one of the conditions is redundant
;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |     7 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |     1 |     7 |     3 |
--------------------------------------------------------------------

^^^^ The optimizer cardinality estimate is way off here, since it does not know that the cardinality of the combination of predicates is the same as for either one of those predicates alone.

explain plan for
select /*+ DYNAMIC_SAMPLING(t 1) */ *
  from t
 where c1 = 5
   and c2 = 10
;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    10 |    70 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |    10 |    70 |     3 |
--------------------------------------------------------------------

Q1. The cardinality is right here, presumably because of the dynamic sampling hint -- but this is dynamic sampling level 1, and the level 1 conditions are not satisfied here. So how come the cardinality is correct? In fact, the optimizer comes up with the correct cardinality with dynamic sampling hints at levels 2 and 3 also. The definition of level 2 does not apply here; level 3, I am not sure if it applies.

Q2. If dynamic sampling level 1 does apply here, why wasn't it applied without the hint, since optimizer_dynamic_sampling = 1? In fact, changing optimizer_dynamic_sampling to 2 or 3 has no effect -- the cardinality is still incorrect. Only changing optimizer_dynamic_sampling to 4 results in the correct cardinality (level 4 does apply here: single-table predicates that reference 2 or more columns).

Q3. In cases where such SQL (as demonstrated in the example above) is issued by a third-party application, we have no control over the SQL and hence cannot insert hints. To get the correct cardinality, is there any option (in Oracle 9i) other than changing the optimizer_dynamic_sampling parameter?

Excerpt from the Oracle 9.2 Performance Tuning Guide
----------------------------------------------------
The sampling levels are as follows if the dynamic sampling level used is from a cursor hint or from the optimizer_dynamic_sampling parameter:

Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of dynamic sampling blocks (32).

Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is the default number of dynamic sampling blocks.

Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks.

Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns. The number of blocks sampled is the default number of dynamic sampling blocks.
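The "way off" estimate above is the independence assumption at work again. A rough sketch of the arithmetic (the ~28,000 row count is an assumption for a table built from all_objects; each c1 value and each c2 value covers about 10 rows):

```python
# With perfectly correlated c1 and c2, multiplying the individual
# selectivities under-estimates badly (a sketch of the arithmetic):
n = 28000            # assumed row count for the table built from all_objects
rows_per_value = 10  # trunc(rownum/10) gives ~10 rows per distinct value

sel_c1 = rows_per_value / n
sel_c2 = rows_per_value / n

est = n * sel_c1 * sel_c2       # independence: ~100/n, far below one row
card = max(1, round(est))       # the plan never shows fewer than 1 row

print(round(est, 4), card)  # 0.0036 1 -- versus the true 10 rows
```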

if you trace the hinted one:


SELECT /*+ ALL_ROWS IGNORE_WHERE_CLAUSE */ NVL(SUM(C1),0), NVL(SUM(C2),0)

FROM

(SELECT /*+ IGNORE_WHERE_CLAUSE NOPARALLEL("T") */ 1 AS C1, CASE WHEN

"T"."C1"=5 AND "T"."C2"=10 THEN 1 ELSE 0 END AS C2 FROM "T" "T") SAMPLESUB

it did sample it.

You need to page down in the performance guide:

...

The sampling levels are as follows if the dynamic sampling level **used is from a table hint:**

Level 0: Do not use dynamic sampling.

Level 1: The number of blocks sampled is the default number of dynamic

sampling blocks (32).

You put a table in there and said "DO IT"

Thanks for pointing that out.

Could you please respond to Question 3 in my post above.


But even the optimizer_dynamic_sampling parameter won't do it; it is the hint that lets you put the table in there.

10g - SQL Profiles, yes.


But changing optimizer_dynamic_sampling to 4 will sample table t in the query (with predicates on c1 and c2) without the dynamic sampling hint.

give it a try (it is what I would do, test your theory out)

Ok, here is the proof:

create table t
as
select trunc(rownum/10) c1,
       trunc(rownum/10) + 5 c2
       -- c1 and c2 values go together, i.e. when c1 = 5 then c2 = 10
       -- and when c2 = 10 then c1 = 5
  from all_objects;

exec dbms_stats.gather_table_stats(ownname=>'<Schema Name>',tabname=>'T',method_opt=>'FOR ALL COLUMNS SIZE 254');

SQL> show parameter dynamic

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
optimizer_dynamic_sampling           integer     1

explain plan for
select * from t
 where c1 = 5
   and c2 = 10   -- one of the conditions is redundant
;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |     7 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |     1 |     7 |     3 |
--------------------------------------------------------------------

alter session set optimizer_dynamic_sampling=4;

SQL> show parameter dynamic

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
optimizer_dynamic_sampling           integer     4

explain plan for
select * from t
 where c1 = 5
   and c2 = 10
;

--------------------------------------------------------------------
| Id  | Operation            | Name        | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |    10 |    70 |     3 |
|*  1 |  TABLE ACCESS FULL   | T           |    10 |    70 |     3 |
--------------------------------------------------------------------

there you go then.

Oracle 9.2.0.6 with CBO

select * from foo

where col1 in (1,2,3)

select * from foo

where col1 in (select /*+ cardinality(t 3) */ c1 from bar t)

Why is the execution plan different for the 2 statements above? The first one does an INLIST ITERATOR for col1 (indexed), so everything is great.

The second one does some crazy things and slows things down.

As far as the CBO knows, the IN is going to give it 3 rows, so why doesn't it do the INLIST ITERATOR thing for both cases?

Thanks


I need a full example as to what the "crazy" thing is -- but the INLIST ITERATOR is for hard-coded in-lists. I would expect the plans to be different, but sort of the same, like this:

ops$tkyte@ORA9IR2> create table foo ( col1 int primary key, col2 int, data char(80) );

Table created.

ops$tkyte@ORA9IR2> drop table bar;

Table dropped.

ops$tkyte@ORA9IR2> create table bar ( c1 int );

Table created.

ops$tkyte@ORA9IR2> insert /*+ append */ into foo select rownum, rownum, rownum from all_objects;

28429 rows created.

ops$tkyte@ORA9IR2> insert /*+ append */ into bar select rownum from all_objects where rownum <= 3;

3 rows created.

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'FOO', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'BAR', cascade=>true );

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from foo
  2  where col1 in (1,2,3);

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=3 Bytes=270)
1 0 INLIST ITERATOR
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'FOO' (Cost=3 Card=3 Bytes=270)
3 2 INDEX (RANGE SCAN) OF 'SYS_C009866' (UNIQUE) (Cost=2 Card=3)

ops$tkyte@ORA9IR2> select * from foo
  2  where col1 in (select /*+ cardinality(t 3) */ c1 from bar t);

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=52 Card=3 Bytes=279)
1 0 NESTED LOOPS (Cost=52 Card=3 Bytes=279)
2 1 SORT (UNIQUE)
3 2 TABLE ACCESS (FULL) OF 'BAR' (Cost=2 Card=3 Bytes=9)
4 1 TABLE ACCESS (BY INDEX ROWID) OF 'FOO' (Cost=1 Card=1 Bytes=90)
5 4 INDEX (UNIQUE SCAN) OF 'SYS_C009866' (UNIQUE)

ops$tkyte@ORA9IR2> set autotrace off

Hi Tom,

Please can you help me getting to the bottom of why the cardinalities are incorrect in the following query.

Here are all the details ...

user01@MYDB>select count(*) from test_rship;

COUNT(*)

----------

2875047

Elapsed: 00:00:00.78

user01@MYDB>select count(*) from test_rship

2 where prod_parent_type = 'BP'

3 and prod_child_type = 'CUA';

COUNT(*)

----------

683442

Elapsed: 00:00:02.18

user01@MYDB>select count(*) from test_rship

2 where prod_parent_type = 'CUA'

3 and prod_child_type = 'CUV';

COUNT(*)

----------

741223

Elapsed: 00:00:02.26

1 begin

2 dbms_stats.gather_table_stats(user,'TEST_RSHIP',

3 estimate_percent => dbms_stats.auto_sample_size,

4 method_opt => 'for all columns size auto');

5* end;

user01@MYDB>/

PL/SQL procedure successfully completed.

Elapsed: 00:01:22.23

user01@MYDB>select column_name,num_distinct,to_char(last_analyzed,'dd/mm/yyyy')

2 from user_tab_col_statistics

3 where table_name = 'TEST_RSHIP'

4 and column_name in ('PROD_PARENT_TYPE','PROD_CHILD_TYPE');

COLUMN_NAME NUM_DISTINCT TO_CHAR(LA

------------------------------ ------------ ----------

PROD_PARENT_TYPE 3 31/07/2006

PROD_CHILD_TYPE 4 31/07/2006

user01@MYDB>explain plan

2 for

3 select *

4 from

5 (

6 select prod_parent_id pb_prod_gentd_id

7 , parent_iso_ctry_code pb_ctry_code

8 , parent_auth_status pb_auth_status

....

....

....

26 , child_ean cua_ean

27 from test_rship

28 where prod_parent_type = 'BP'

29 and prod_child_type = 'CUA'

30 ) pb_cua

31 join

32 ( select prod_parent_id cua_prod_gentd_id2

33 , parent_iso_ctry_code cua_ctry_code2

34 , prod_child_id cuv_prod_gentd_id

....

.....

....

42 , child_last_auth_upd_dt cuv_last_auth_upd_dt

43 from test_rship

44 where prod_parent_type = 'CUA'

45 and prod_child_type = 'CUV'

46 ) cua_cuv

47 on pb_cua.cua_prod_gentd_id1 = cua_cuv.cua_prod_gentd_id2

48 and pb_cua.cua_ctry_code1 = cua_cuv.cua_ctry_code2;

user01@MYDB>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 105K| 15M| | 8283 |

|* 1 | HASH JOIN | | 105K| 15M| 16M| 8283 |

|* 2 | TABLE ACCESS FULL | TEST_RSHIP | 163K| 14M| | 3798 |

|* 3 | TABLE ACCESS FULL | TEST_RSHIP | 385K| 22M| | 3798 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

---------------------------------------------------

1 - access("TEST_RSHIP"."PROD_CHILD_ID"="TEST_RSHIP"."PROD_PARENT_ID"

AND "TEST_RSHIP"."CHILD_ISO_CTRY_CODE"="TEST_RSHIP"."PARENT_ISO_CT

RY_CODE")

2 - filter("TEST_RSHIP"."PROD_PARENT_TYPE"='BP' AND

"TEST_RSHIP"."PROD_CHILD_TYPE"='CUA')

3 - filter("TEST_RSHIP"."PROD_PARENT_TYPE"='CUA' AND

"TEST_RSHIP"."PROD_CHILD_TYPE"='CUV')

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

Note: cpu costing is off

21 rows selected.

So, as can be seen, the cardinality values for the full scans on the TEST_RSHIP table after the filter conditions have been applied are way off from the actual values. I have collected the stats/histograms, but still no joy. I have even run the query and looked at the cardinality values in V$SQL_PLAN, and they are the same as in the explain plan.

Please can you tell me why the cardinality is not right and what I can do to get it as near as the actual values.

Thanks

Please can you help me getting to the bottom of why the cardinalities are incorrect in the following query.

Here are all the details ...

user01@MYDB>select count(*) from test_rship;

COUNT(*)

----------

2875047

Elapsed: 00:00:00.78

user01@MYDB>select count(*) from test_rship

2 where prod_parent_type = 'BP'

3 and prod_child_type = 'CUA';

COUNT(*)

----------

683442

Elapsed: 00:00:02.18

user01@MYDB>select count(*) from test_rship

2 where prod_parent_type = 'CUA'

3 and prod_child_type = 'CUV';

COUNT(*)

----------

741223

Elapsed: 00:00:02.26

1 begin

2 dbms_stats.gather_table_stats(user,'TEST_RSHIP',

3 estimate_percent => dbms_stats.auto_sample_size,

4 method_opt => 'for all columns size auto');

5* end;

user01@MYDB>/

PL/SQL procedure successfully completed.

Elapsed: 00:01:22.23

user01@MYDB>select column_name,num_distinct,to_char(last_analyzed,'dd/mm/yyyy')

2 from user_tab_col_statistics

3 where table_name = 'TEST_RSHIP'

4 and column_name in ('PROD_PARENT_TYPE','PROD_CHILD_TYPE');

COLUMN_NAME NUM_DISTINCT TO_CHAR(LA

------------------------------ ------------ ----------

PROD_PARENT_TYPE 3 31/07/2006

PROD_CHILD_TYPE 4 31/07/2006

user01@MYDB>explain plan

2 for

3 select *

4 from

5 (

6 select prod_parent_id pb_prod_gentd_id

7 , parent_iso_ctry_code pb_ctry_code

8 , parent_auth_status pb_auth_status

....

....

....

26 , child_ean cua_ean

27 from test_rship

28 where prod_parent_type = 'BP'

29 and prod_child_type = 'CUA'

30 ) pb_cua

31 join

32 ( select prod_parent_id cua_prod_gentd_id2

33 , parent_iso_ctry_code cua_ctry_code2

34 , prod_child_id cuv_prod_gentd_id

....

.....

....

42 , child_last_auth_upd_dt cuv_last_auth_upd_dt

43 from test_rship

44 where prod_parent_type = 'CUA'

45 and prod_child_type = 'CUV'

46 ) cua_cuv

47 on pb_cua.cua_prod_gentd_id1 = cua_cuv.cua_prod_gentd_id2

48 and pb_cua.cua_ctry_code1 = cua_cuv.cua_ctry_code2;

user01@MYDB>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes |TempSpc| Cost |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 105K| 15M| | 8283 |

|* 1 | HASH JOIN | | 105K| 15M| 16M| 8283 |

|* 2 | TABLE ACCESS FULL | TEST_RSHIP | 163K| 14M| | 3798 |

|* 3 | TABLE ACCESS FULL | TEST_RSHIP | 385K| 22M| | 3798 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

---------------------------------------------------

1 - access("TEST_RSHIP"."PROD_CHILD_ID"="TEST_RSHIP"."PROD_PARENT_ID"

AND "TEST_RSHIP"."CHILD_ISO_CTRY_CODE"="TEST_RSHIP"."PARENT_ISO_CT

RY_CODE")

2 - filter("TEST_RSHIP"."PROD_PARENT_TYPE"='BP' AND

"TEST_RSHIP"."PROD_CHILD_TYPE"='CUA')

3 - filter("TEST_RSHIP"."PROD_PARENT_TYPE"='CUA' AND

"TEST_RSHIP"."PROD_CHILD_TYPE"='CUV')

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------

Note: cpu costing is off

21 rows selected.

So, as can be seen, the cardinality values for the full scans of the TEST_RSHIP table, after the filter conditions have been applied, are way off from the actual values. I have collected the stats/histograms but still no joy. I have even run the query and looked at the cardinality values in the V$SQL_PLAN table, and they are the same as in the explain plan.

Please can you tell me why the cardinality is not right and what I can do to get it as near to the actual values as possible.

Thanks

the optimizer sees "discrete things". It gets statistics on

o A table

o A column in that table

o An index

o The system

all individual statistics. It then uses 12th grade statistics to guess from there.

Suppose you had 1000 people in a room (example gratuitously borrowed from Jonathan Lewis). You poll the room and ask "how many of you are born in the month of December?" - about 1/12th of the room will raise their hands. Then you ask "How many of you are of the zodiac sign Pisces?" Again, about 1/12th of the room will raise their hands. Now, you ask - "and how many of you Pisces are born in December?". The optimizer will guess 1/144th of the room (about 7). The truth? ZERO - all Pisces are born in Feb/Mar. There is a correlation between the two columns and the optimizer doesn't see it because the optimizer has statistics about individual things.

Enter, perhaps, dynamic sampling. Consider:

ops$tkyte@ORA10GR2> create table t
  2  as select decode( mod(rownum,2), 0, 'N', 'Y' ) flag1,
  3            decode( mod(rownum,2), 0, 'Y', 'N' ) flag2, a.*
  4    from all_objects a
  5  /

Table created.

ops$tkyte@ORA10GR2> create index t_idx on t(flag1,flag2);

Index created.

ops$tkyte@ORA10GR2> begin
  2     dbms_stats.gather_table_stats
  3     ( user, 'T',
  4       method_opt=>'for all indexed columns size 254' );
  5  end;
  6  /

PL/SQL procedure successfully completed.

<b>we have a table such that when flag1 = 'Y', flag2 will be 'N' and vice versa. NEVER are the two columns set to Y,Y or N,N. There are:</b>

ops$tkyte@ORA10GR2> select num_rows, num_rows/2, num_rows/2/2
  2    from user_tables where table_name = 'T';

  NUM_ROWS NUM_ROWS/2 NUM_ROWS/2/2
---------- ---------- ------------
     50295    25147.5     12573.75

<b>50,000 rows - when we use "where flag1 = 'N'" the optimizer will rightly guess about 25,000 rows. It knows "50% of the rows are flag1='N'". It knows "50% of the rows are flag2='N'". It will therefore guess that about 25% of the rows are flag1=N and flag2=N. We can see this:</b>

ops$tkyte@ORA10GR2> select * from t where flag1='N';

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 25268 |  2393K|   167   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T    | 25268 |  2393K|   167   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLAG1"='N')

ops$tkyte@ORA10GR2> select * from t where flag2='N';

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 25027 |  2370K|   167   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T    | 25027 |  2370K|   167   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLAG2"='N')

ops$tkyte@ORA10GR2> select * from t where flag1='N' and flag2='N';

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 12573 |  1190K|   167   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T    | 12573 |  1190K|   167   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("FLAG2"='N' AND "FLAG1"='N')

<b>that last bit isn't right - it is way off. Enter dynamic sampling:</b>

ops$tkyte@ORA10GR2> select /*+ dynamic_sampling(t 3) */ * from t where flag1='N' and flag2='N';

Execution Plan
----------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     4 |   388 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     4 |   388 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     4 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("FLAG1"='N' AND "FLAG2"='N')

Note
-----
   - dynamic sampling used for this statement

<b>here the optimizer ran a query to hard parse our query - it ran this query because we had dynamic sampling set to 3 and the optimizer knew it had "guessed" at a cardinality. It did a quick scan of the data to figure out how close it was and adjusted the cardinality appropriately</b>
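The independence assumption at work here can be sketched outside the database. This is a hypothetical illustration in Python (not anything Oracle runs): multiplying the per-column selectivities predicts ~25% of the rows for flag1='N' and flag2='N', while the true answer for this perfectly anti-correlated data is zero.

```python
# Sketch of the optimizer's independence assumption vs. reality
# for the flag1/flag2 example (illustrative only).

num_rows = 50000
# build the data: even rows are (N, Y), odd rows are (Y, N) -- never (N, N)
rows = [('N', 'Y') if i % 2 == 0 else ('Y', 'N') for i in range(num_rows)]

# per-column selectivities, as individual column statistics would see them
sel_flag1_n = sum(1 for f1, _ in rows if f1 == 'N') / num_rows   # 0.5
sel_flag2_n = sum(1 for _, f2 in rows if f2 == 'N') / num_rows   # 0.5

# independence assumption: multiply the individual selectivities
estimated = num_rows * sel_flag1_n * sel_flag2_n   # ~12500 rows

# actual: count rows where both columns really are 'N'
actual = sum(1 for f1, f2 in rows if f1 == 'N' and f2 == 'N')

print(estimated, actual)   # 12500.0 0
```

Dynamic sampling closes exactly this gap: instead of multiplying guesses, it looks at a sample of the real data.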

Respected Mr. Tom,

I have already read three (3) threads of your site - "Cardinality", "density" and "Index rebuild". But still my idea is not clear on the following terms. Please explain them in a very simple way, in your inimitable style. I have read as follows from the Oracle documentation.

Cardinality represents the number of rows in a row set. - I think it's a COUNT of ROWS.

The first measure, selectivity, represents a fraction of rows from a row set. - Another form of count, maybe a percentage.

So on reading those lines, my questions are as follows.

Q1> What are low selectivity, poor selectivity and low cardinality? Is there any relationship with "density"? If yes, how?

Q2> Quote from Oracle 9i Concepts (Chapter 10) - "The optimizer can use an existing index to build another index. This results in a much faster index build."

Does it speak about Index Rebuild? If yes, in what situation does the optimizer itself build a new index?

cardinality is simply the "count of elements in a set". Many times however, we use cardinality meaning "distinct cardinality" when discussing selectivity. But, cardinality is defined in the original answer above.

q1) selectivity is a measure, a ratio if you will, of how many tuples/rows out of a set a where clause will return.

For example, using ALL_OBJECTS:

select * from all_objects where owner = 'SYS'

where owner = 'SYS' could be said to be "not very selective, having low selectivity" because many rows are such that "owner='SYS'" is true. However

where owner = 'SCOTT'

could be said to have high selectivity, it is very selective, because few rows are such that owner = 'SCOTT' is true.

low selectivity, not very selective, poor selectivity are all the same.

low cardinality is "few rows", typically would be synonymous with "highly selective"

density, from an Oracle cost based optimizer perspective, is a number used to guess the cardinality. It is a measure of how many rows are expected back from a predicate of the general form 'where column = constant/:bind'. If you do not have histograms, Oracle will use the 'density' to guess how many rows that predicate will return.
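As a back-of-the-envelope sketch of that guess (hypothetical arithmetic, not an Oracle API): without a histogram, the density for a column is roughly 1/num_distinct, and the guessed cardinality for "where column = value" is num_rows * density. Using the TEST_RSHIP figures from earlier in this thread:

```python
# Rough sketch of how density drives a cardinality guess for
# "where column = constant" when no histogram is present.
# (Illustrative arithmetic only; Oracle's internals are more involved.)

num_rows = 2875047          # rows in TEST_RSHIP, from the count(*) above
num_distinct = 3            # PROD_PARENT_TYPE had 3 distinct values

density = 1 / num_distinct  # without a histogram, density ~ 1/NDV
guessed_card = num_rows * density

print(round(guessed_card))  # 958349 -- a uniform guess, blind to skew
```

A uniform guess like this is exactly what histograms (or dynamic sampling) are meant to correct when the data is skewed.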

q2) it is referring to what it will do when you issue a create index statement, the optimizer itself doesn't choose to build a new index.

Tom,

I have a table with over 30M records (31,705,533) with an index on field prmk (OPT_MEMBER_ATTRIBUTE_IDX2), and the number of unique keys for this index is 60. So when running the query "SELECT * FROM facets.opt_member_attribute WHERE prmk = '538';" I would not expect Oracle to use the index at all; however, the explain plan looks like:

------------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

------------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 422K| 40M| 4240 (1)|

| 1 | TABLE ACCESS BY INDEX ROWID| OPT_MEMBER_ATTRIBUTE | 422K| 40M| 4240 (1)|

|* 2 | INDEX RANGE SCAN | OPT_MEMBER_ATTRIBUTE_IDX28 | 422K| | 486 (0)|

------------------------------------------------------------------------------------------------

Both the table and the index are analyzed and the total number of records in the table is 55M and the clustering factor for the index is 1,421,538.

And in fact when I run the query it works in a couple of seconds. I was actually expecting a full table scan.

I am using optimizer_index_caching = 90 and optimizer_index_cost_adj = 30.

Can you give me your ideas on why this happens? Shouldn't Oracle ignore the index because of the low cardinality? Isn't this very lousy selectivity for the index? And a poor choice for the index?

As always,

Thanks for your help.

Sincerely yours,

Orlando Reyes
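One factor worth noting here is the parameters mentioned in the question. The sketch below uses hypothetical costs (only the scaling rule reflects the documented effect of optimizer_index_cost_adj, which treats index access paths as costing the given percentage of their computed cost), so it illustrates why a low setting can tip a borderline decision toward the index, not how this exact plan was costed:

```python
# Sketch: how a low optimizer_index_cost_adj can tip the scales
# toward an index path. Costs below are made up for illustration.

optimizer_index_cost_adj = 30          # percent, as set in the question above

full_scan_cost = 10000                 # hypothetical full table scan cost
index_path_cost = 14000                # hypothetical unadjusted index path cost

# index path costs are scaled by optimizer_index_cost_adj / 100
adjusted_index_cost = index_path_cost * optimizer_index_cost_adj / 100

print(adjusted_index_cost)                   # 4200.0
print(adjusted_index_cost < full_scan_cost)  # True: the index path now wins
```

With the unadjusted costs the full scan would win; after the 30% scaling the index path looks far cheaper, even for a low-cardinality key.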

The first set of output is from production and the second set is from testing. The number of records retrieved is 3,500. The table has fewer than 1,000,000 records.

Searched individually on just the indexed column (without the other filters), we get 70,000 rows in both Prod and Test. Other than the cardinality for the index and the cost for the succeeding table access, everything looks the same. Production runs in the 4-10 minute range and Testing in the 4-10 second range.

We are absolutely stumped. Kindly advise.

regards

OPERATION         OPTIONS                OBJECT_NAME       OPTIMIZER  COST  CARDINALITY
----------------- ---------------------- ----------------- ---------- ----- -----------
SELECT STATEMENT                                           CHOOSE       873
 NESTED LOOPS                                                           873            1
  TABLE ACCESS    BY INDEX ROWID         VH_PSF            ANALYZED     871            1
   INDEX          RANGE SCAN             IDX_VH_PSF_TYPE   ANALYZED      11         2623
  TABLE ACCESS    BY GLOBAL INDEX ROWID  GM_VIN            ANALYZED       2            1
   INDEX          UNIQUE SCAN            SYS_C0014163      ANALYZED       1            1

SELECT STATEMENT                                           CHOOSE         7
 NESTED LOOPS                                                             7            1
  TABLE ACCESS    BY INDEX ROWID         VH_PSF            ANALYZED       5            1
   INDEX          RANGE SCAN             IDX_VH_PSF_TYPE   ANALYZED       3            4
  TABLE ACCESS    BY GLOBAL INDEX ROWID  GM_VIN            ANALYZED       2            1
   INDEX          UNIQUE SCAN            SYS_C0020641      ANALYZED       1            1

i cannot read that.

but to tune a query, you want a tkprof, not an explain plan.

but to tune a query, you want a tkprof, not an explain plan.

The above output is from v$sql_plan. Getting tkprof is not easy as we are only given "read only" access to the database. We will have to wait to get the tkprof output.

query v$sql_plan_statistics

Tom,

If the optimizer chooses to sample a table (e.g. using the default of 32 dynamic sampling blocks), will it read the first 32 blocks of the table? What if the data in the table is ordered chronologically? For example, I have an orders table with order_date, delivery_date etc., and the table contains order history for the last five years. So if my query filters orders for the last two weeks and the optimizer samples the table by reading the first 32 blocks, that will not help it choose the best plan.

it will not read the first N, it'll randomly sample.

Hi Tom,

I was wondering how Oracle estimates cardinality for unknown values for a column. Please see the example below:

SQL> create table test_card_number as
  2  select *
  3  from
  4  (
  5  select rownum test_num
  6  from dual
  7  connect by 1=1
  8  )
  9  where rownum < 100000;

Table created.

SQL> create index test_card_number_ix on test_card_number(test_num);

Index created.

SQL> exec dbms_stats.gather_table_stats('KALITA','TEST_CARD_NUMBER');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly;
SQL> select *
  2  from test_card_number a,(select ( rownum + 100000) rn from dual )b
  3  where test_num >= b.rn;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2093273829

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |  5000 | 85000 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |                     |  5000 | 85000 |     4   (0)| 00:00:01 |
|   2 |   VIEW             |                     |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |    COUNT           |                     |       |       |            |          |
|   4 |     FAST DUAL      |                     |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   INDEX RANGE SCAN | TEST_CARD_NUMBER_IX |  5000 | 20000 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("TEST_NUM">="B"."RN")

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        329  bytes sent via SQL*Net to client
        373  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

SQL> select *
  2  from test_card_number a,(select (rownum + 100000) rn from dual )b
  3  where test_num <= b.rn;

99999 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2093273829

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |  5000 | 85000 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |                     |  5000 | 85000 |     4   (0)| 00:00:01 |
|   2 |   VIEW             |                     |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |    COUNT           |                     |       |       |            |          |
|   4 |     FAST DUAL      |                     |     1 |       |     2   (0)| 00:00:01 |
|*  5 |   INDEX RANGE SCAN | TEST_CARD_NUMBER_IX |  5000 | 20000 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("TEST_NUM"<="B"."RN")

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       6875  consistent gets
          0  physical reads
          0  redo size
    1835957  bytes sent via SQL*Net to client
      73710  bytes received via SQL*Net from client
       6668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      99999  rows processed

For both >= and <= the cardinality is estimated to be 5000 (5% of the rows). I have tried a few more cases and it looks like Oracle estimates the cardinality to be 5% of the rows for unknown values of a column. My problem is similar to the <= case in my example above. I get a date from a table which is evaluated with <= against the other table; the expected cardinality is approximately 5% of the total rows but the actual cardinality is close to 50%. So, instead of doing a full table scan, it is doing an index range scan. Is there any way of fixing this issue other than hinting the queries?

Thanks,
Kalita

the issue is, the plan is generated before the query is executed. So, in the case of a query like:

select *

2 from test_card_number a,(select (rownum + 100000) rn from dual )b

3 where test_num <= b.rn;

a) it cannot presume, or assume, just one row from B.

b) it does not know what that value is at all, when building the plan.

so, it has to say "we do a join", and "we do not know what the value is, hence we have to use a heuristic"

I would prefer to write a query like that as:

select * from t where col <= (select X from something_else);

you'll get basically a similar plan and estimated card= values, but at least we know "one row". But since "select x from something_else" is not known until AFTER the plan is already generated, the estimate is still a guess.
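The fixed-guess heuristic observed above boils down to simple arithmetic (an illustration of the guess, not Oracle code): for a range predicate against a value unknown at parse time, the estimate defaults to 5% of the rows.

```python
# Sketch of the fixed-guess heuristic for "col <= :unknown_value":
# with no usable value at optimization time, a default selectivity
# (5% here, matching the plans above) is applied regardless of the data.

num_rows = 99999            # rows in TEST_CARD_NUMBER
default_range_selectivity = 0.05

estimated_card = round(num_rows * default_range_selectivity)
print(estimated_card)       # 5000 -- matches the Rows column in both plans
```

That is why the estimate is 5000 whether the predicate really returns zero rows or all 99,999 of them.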

Hello Tom,

From this example before:

ops$tkyte@ORA10G> select * from t where object_id = 42;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=482 Bytes=44826)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE) (Cost=2 Card=482 Bytes=44826)
   2    1     INDEX (RANGE SCAN) OF 'T_IDX' (INDEX) (Cost=1 Card=193)

How come the index range scan with card 193 produces a table access by index rowid with card 482? I would think it is at most one row per rowid, so it should be 193. Any idea why? Thanks!

After reading up on this hint I found some queries (over skewed data in very large tables) performed way faster when using the hint /*+ dynamic_sampling(3) */.

F.i:

With hint:

107867 records selected.

Elapsed: 00:22:20.07

Uitvoeringsplan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=211975 Card=101631 Bytes=21342510)
   1    0   SORT (GROUP BY) (Cost=211975 Card=101631 Bytes=21342510)
   2    1     HASH JOIN (OUTER) (Cost=207337 Card=101631 Bytes=21342510)
   3    2       HASH JOIN (Cost=183935 Card=101631 Bytes=6199491)
   4    3         HASH JOIN (Cost=7787 Card=124222 Bytes=5714212)
   5    4           TABLE ACCESS (FULL) OF 'PMR_CURRENT_STATUS' (Cost=1512 Card=124222 Bytes=2981328)
   6    4           TABLE ACCESS (FULL) OF 'PAN_RESPONDENTS' (Cost=4269 Card=1103180 Bytes=24269960)
   7    3         TABLE ACCESS (FULL) OF 'PAN_RESPONDENT_VALUES' (Cost=174638 Card=902553 Bytes=13538295)
   8    2       VIEW (Cost=22597 Card=59183 Bytes=8818267)
   9    8         SORT (GROUP BY) (Cost=22597 Card=59183 Bytes=3610163)
  10    9           HASH JOIN (Cost=21717 Card=59183 Bytes=3610163)
  11   10             TABLE ACCESS (FULL) OF 'PMR_CURRENT_STATUS' (Cost=1510 Card=124222 Bytes=3726660)
  12   10             TABLE ACCESS (BY INDEX ROWID) OF 'PAN_PROJECT_MEMBERS' (Cost=39 Card=478 Bytes=10516)
  13   12               NESTED LOOPS (Cost=19459 Card=243521 Bytes=7549151)
  14   13                 VIEW OF 'index$_join$_003' (Cost=36 Card=510 Bytes=4590)
  15   14                   HASH JOIN
  16   15                     INDEX (RANGE SCAN) OF 'PMS_SAMP_PANL_FK' (NON-UNIQUE) (Cost=4 Card=510 Bytes=4590)
  17   15                     INDEX (FAST FULL SCAN) OF 'PMS_SAMP_PK' (UNIQUE) (Cost=4 Card=510 Bytes=4590)
  18   13                 INDEX (RANGE SCAN) OF 'PMS_PBR_MULTI_IDX1' (NON-UNIQUE) (Cost=7 Card=1809)

Statistics
----------------------------------------------------------
         17  recursive calls
          0  db block gets
    1122897  consistent gets
     916871  physical reads
      69668  redo size
    7988329  bytes sent via SQL*Net to client
      81634  bytes received via SQL*Net from client
       7193  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
     107867  rows processed

without hint:

107869 records selected.

Elapsed: 01:20:04.02

Uitvoeringsplan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=40760 Card=5174 Bytes=1086540)
   1    0   SORT (GROUP BY) (Cost=40760 Card=5174 Bytes=1086540)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'PAN_RESPONDENT_VALUES' (Cost=5 Card=2 Bytes=30)
   3    2       NESTED LOOPS (Cost=40521 Card=5174 Bytes=1086540)
   4    3         HASH JOIN (OUTER) (Cost=26761 Card=3438 Bytes=670410)
   5    4           HASH JOIN (Cost=5790 Card=3438 Bytes=158148)
   6    5             TABLE ACCESS (FULL) OF 'PMR_CURRENT_STATUS' (Cost=1512 Card=3438 Bytes=82512)
   7    5             TABLE ACCESS (FULL) OF 'PAN_RESPONDENTS' (Cost=4269 Card=1103180 Bytes=24269960)
   8    4           VIEW (Cost=20970 Card=355 Bytes=52895)
   9    8             SORT (GROUP BY) (Cost=20970 Card=355 Bytes=21655)
  10    9               HASH JOIN (Cost=20969 Card=355 Bytes=21655)
  11   10                 TABLE ACCESS (FULL) OF 'PMR_CURRENT_STATUS' (Cost=1510 Card=3438 Bytes=103140)
  12   10                 TABLE ACCESS (BY INDEX ROWID) OF 'PAN_PROJECT_MEMBERS' (Cost=39 Card=90 Bytes=1980)
  13   12                   NESTED LOOPS (Cost=19459 Card=46085 Bytes=1428635)
  14   13                     VIEW OF 'index$_join$_003' (Cost=36 Card=510 Bytes=4590)
  15   14                       HASH JOIN
  16   15                         INDEX (RANGE SCAN) OF 'PMS_SAMP_PANL_FK' (NON-UNIQUE) (Cost=4 Card=510 Bytes=4590)
  17   15                         INDEX (FAST FULL SCAN) OF 'PMS_SAMP_PK' (UNIQUE) (Cost=4 Card=510 Bytes=4590)
  18   13                     INDEX (RANGE SCAN) OF 'PMS_PBR_MULTI_IDX1' (NON-UNIQUE) (Cost=7 Card=1809)
  19    3         INDEX (RANGE SCAN) OF 'PMS_REVA_RESP_FK' (NON-UNIQUE) (Cost=4 Card=2)

Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
    1507643  consistent gets
     766208  physical reads
      26840  redo size
    7988422  bytes sent via SQL*Net to client
      81599  bytes received via SQL*Net from client
       7193  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
     107869  rows processed

(2 records were added during tests, so that explains the different results)

With hint the plan looks like I would expect it to be.

(I don't want an Index range scan over a table containing almost 200 million records...it's 'db file sequential reading' forever then)

All tables are analyzed daily (GATHER_STALE) and indexed but quite skewed over certain FK-columns. So sampling (level = 3) is only applied for (quoting from doc.:) "all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate"?

(When I use level = 2, the optimizer chooses the 'wrong' plan again.)

Is this a 'safe' hint approach to follow when tuning large queries that perform badly due to skew?

Or should special histograms be preferred?

(Or am I comparing apples to toaster ovens now?)

I now tend to use this hint more often, but would like to know if there are any caveats?

I'm really amazed by the results: almost one hour faster, just by adding this hint...thanks!

Oracle fancies big brutal hash joins the most, i guess ;)

Regards,

Martijn

Suppose I have a query typically used in web-based reporting tools that allow for optional search filters.

Consider the following 3 variants of this type of query.

1.

where 1=1 and <join conditions> and (:b1 is null or column in (select ...))

2.

where 1=1 and <join conditions> and (column in (:b2,:b3,:b4))

3.

where 1=1 and <join conditions> and (:b1 is null or column in (select /*+ cardinality(t 3) */ ...))

The user input is both optional and variable length. So I use the table() function to convert the comma-separated string into the in-list, using the code you posted on this site.

When :b1 is not null, I thought #3 would have the same execution plan/performance as #2 (which performs the best) because the optimizer would optimize away the :b1 is null part and the cardinality hint would tell it that the SELECT returns 3 values, which is the same as entering the 3 values into an IN list, as in #2.

But it doesn't do that. It does a hash join instead of an INLIST iterator and the overall performance is much slower.

Any idea why?

Thanks

in general

where c in (a,b,c)

will use what is known as an inlist iterator - it knows at least three and at most three, it is 100% sure of everything

where (:b1 is null or c in (set))

is completely different - hugely different. And you would need to verify that the estimated card was really 3 - remember - the cardinality hint is "undocumented", it might not be 'taking' there.
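One way to check whether the hint is "taking" is to look at the Rows/Card estimate for the collection row source in the plan. A minimal sketch, assuming a str2tbl-style function like the one referenced above (the alias t, str2tbl, and the hint value 3 are just placeholders, not a prescription):

```sql
explain plan for
select *
  from emp
 where empno in (select /*+ cardinality(t 3) */ column_value
                   from table(str2tbl(:b1)) t);

select * from table(dbms_xplan.display);
```

If the hint took, the row source for the collection step should show an estimate of 3 rows; if it still shows the default block-based guess, the hint was ignored for that step.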

> it might not be 'taking' there.

When I see the explain plan (card= for that step), I do see it taking, yet the overall execution plan is different.

Is there a way I can force the inlist iterator to be used?

Sometimes, for other queries, it does work, I see something called 'inlist iterator pickler fetch' in the plan and it works great.

Besides, given a predicate like (1=2 or c in (set)), i.e. any predicate where the first part is constant and evaluates to false, isn't it equivalent to not specifying that predicate at all? In other words, all else remaining the same, the execution plan for the above predicate and for (c in (set)) should be identical, right? But it isn't. Why is this?

where (:x is null or c in (set))

is radically, hugely different than

where (c in (a,b,c))

where (c in (set))

is radically, hugely different than

where (c in (a,b,c))

... the

execution plan for the above predicate and for (c in (set)) should be

identical, right? ...

why ? why would they be identical?

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14211/ex_plan.htm#sthref1416

inlist iterator is for in's with lists.

ops$tkyte%ORA10GR2> create table t
  2  as
  3  select *
  4  from all_objects;

Table created.

ops$tkyte%ORA10GR2> alter table t add constraint t_pk primary key(object_id);

Table altered.

ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from t where object_id in (42, 55);

Execution Plan
----------------------------------------------------------
Plan hash value: 2629880895

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |    22 |  2816 |     8   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |      |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T    |    22 |  2816 |     8   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | T_PK |   163 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("OBJECT_ID"=42 OR "OBJECT_ID"=55)

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> select * from t
  2  where object_id in (select 42 from dual union all select 55 from dual);

Execution Plan
----------------------------------------------------------
Plan hash value: 3667190664

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |     2 |   262 |     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |          |     2 |   262 |     6   (0)| 00:00:01 |
|   2 |   VIEW                        | VW_NSO_1 |     2 |     6 |     4   (0)| 00:00:01 |
|   3 |    HASH UNIQUE                |          |     2 |       |     4  (50)| 00:00:01 |
|   4 |     UNION-ALL                 |          |       |       |            |          |
|   5 |      FAST DUAL                |          |     1 |       |     2   (0)| 00:00:01 |
|   6 |      FAST DUAL                |          |     1 |       |     2   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID | T        |     1 |   128 |     1   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN          | T_PK     |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   8 - access("OBJECT_ID"="$nso_col_1")

Note
-----
   - dynamic sampling used for this statement

it would be able to nested loop (which is a lot like an inlist iterator) or hash join or sort merge or any of the JOIN techniques.

OK my actual use case is for an Apex report region where a list of ids is stored in a collection. A page item :x determines whether or not the collection is populated. So the predicate takes the form

(:x is null or c in (select /*+ cardinality(t 10) */ c001 from apex_collections t where t.collection_name='X'))

I *have* to use the cardinality hint when using Apex collections because it is a generic view used by all applications; only I know how much data is expected in any given situation. This query doesn't perform too well - how would one go about speeding it up? Thanks

Are the estimated card= values in the plan for all of the other steps anywhere near "correct"? If the result set is small - it would opt for nested loops; big - hash join. I'm guessing the problem lies elsewhere, not with the collection.

use the gather_plan_statistics technique outlined here

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743#1929856400346576884

and show us
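That technique boils down to running the query with the gather_plan_statistics hint and then comparing estimated to actual row counts. A minimal sketch (your own query goes in place of the select; t and the predicate are placeholders):

```sql
select /*+ gather_plan_statistics */ *
  from t
 where collection_name = 'X';

-- show the plan of the last statement executed, with estimated (E-Rows)
-- next to actual (A-Rows) row counts for each step
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

The steps where E-Rows and A-Rows diverge by orders of magnitude are the ones to look at.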

I asked "Besides, given a predicate like (1=2 or c in (set)), i.e. any predicate where the first part is constant and evaluates to false, isn't it equivalent to not specifying that predicate all? In other words, all else remaining the same, the execution plan for the above predicate and for (c in (set)) should be identical, right? But it isn't. Why is this?"

I am not sure I understood your response. Adding a constant, known-at-parse-time expression that evaluates to false, i.e. 1=2, with an OR into a predicate should have absolutely no effect on the query plan, right? But it does.

Here is a (trivial) example

SQL> create table x as select level i from dual connect by level<=10000;

Table created.

SQL> create table y as select level i from dual connect by level<=10000;

Table created.

SQL> create index i1 on x(i);

Index created.

SQL> create index i2 on y(i);

Index created.

SQL> exec dbms_stats.gather_table_stats(user,'x',cascade=>true);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'y',cascade=>true)

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly exp
SQL> select * from x where (1=2 or x.i in (select y.i from y));

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=503 Card=500 Bytes=1500)
   1    0    FILTER
   2    1      TABLE ACCESS (FULL) OF 'X' (Cost=3 Card=500 Bytes=1500)
   3    1      INDEX (RANGE SCAN) OF 'I2' (NON-UNIQUE) (Cost=1 Card=1 Bytes=3)

SQL> select * from x where x.i in (select y.i from y);

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=10000 Bytes=60000)
   1    0    NESTED LOOPS (SEMI) (Cost=4 Card=10000 Bytes=60000)
   2    1      INDEX (FULL SCAN) OF 'I1' (NON-UNIQUE) (Cost=22 Card=10000 Bytes=30000)
   3    1      INDEX (RANGE SCAN) OF 'I2' (NON-UNIQUE)

Shouldn't the plan for the 2 queries be the same since the constant predicate 1=2 isn't really affecting the answer? But the cost of the query with the OR is much higher. Why? I realize that the CBO (probably) doesn't really evaluate the constant predicate but with bind variable peeking, it could do that and optimize away the expression thus arriving at the lower cost plan, couldn't it?

... I am not sure I understood your response. ...

my response was that

where (c in (a,b,c))

is not in any way the same as

where (c in (set))

that is all - forget the 1=2, it goes away.

they were asking why "c in (a,b,c)" results in inlist iterator and "c in (set)" does not.


That wasn't the question. I posted it again to clarify. It was about the 1=2 (or any expression that evaluates to false and has a OR in the predicate). You said the "1=2 goes away" but it doesn't appear to. I gave an example to show that execution plan with it and without are different. Why is that?

well, it does go away - you chopped off the predicate bit. the plans are different (that is permitted, there is no reason that might not happen).

but the 1=2 is not there.

ops$tkyte%ORA10GR2> set autotrace traceonly exp
ops$tkyte%ORA10GR2> select * from x where (1=2 or x.i in (select y.i from y));

Execution Plan
----------------------------------------------------------
Plan hash value: 858363292

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 | 30000 |     4   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| X    | 10000 | 30000 |     3   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | I2   |     1 |     3 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ */ 0 FROM "Y" "Y" WHERE "Y"."I"=:B1))
   3 - access("Y"."I"=:B1)

ops$tkyte%ORA10GR2> select * from x where ( x.i in (select y.i from y));

Execution Plan
----------------------------------------------------------
Plan hash value: 3518811015

-------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      | 10000 | 60000 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN SEMI       |      | 10000 | 60000 |     7  (15)| 00:00:01 |
|   2 |   INDEX FAST FULL SCAN| I1   | 10000 | 30000 |     3   (0)| 00:00:01 |
|   3 |   INDEX FAST FULL SCAN| I2   | 10000 | 30000 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("X"."I"="Y"."I")

ops$tkyte%ORA10GR2> set autotrace off

Tom what was the script you ran that did these steps please

>new 1: explain plan for select * from emp, dept where >emp.deptno =dept.deptno

>Explained.

great explanation on cardinality - one of the most commonly confused terms regarding databases

just a really old script - with dbms_xplan/autotrace, I don't use it anymore.

It just did a delete on the plan table

then explain plan for &1

then a query against the plan table to format it.

just use

set autotrace traceonly explain

select ....

set autotrace off

much easier. better plan display too.
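Or equivalently, using dbms_xplan directly (this is the approach used at the top of this thread; the query is just the emp/dept example from there):

```sql
explain plan for
select * from emp, dept where emp.deptno = dept.deptno;

select * from table(dbms_xplan.display);
```

Same information, no plan-table housekeeping required - dbms_xplan.display reads and formats the most recent EXPLAIN PLAN for you.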


Thanks very much

Hi Tom,

What could be the other reasons for the explain plan to show incorrect cardinality even if all my tables are analyzed?

Hi Tom,

Thanks a lot.

Still I have one question: if a query joins multiple tables, how do I decide which table to put the dynamic sampling hint on?

Lets say i have below query :

select * from a,b,c,d where .....

should I give the hint as /*+ dynamic_sampling(a 0) */ or /*+ dynamic_sampling(b 0) */ or /*+ dynamic_sampling(c 0) */

or does it not matter?


well, what do you want to have happen?

there is no single answer here - do you want dynamic sampling to happen? or not? it is up to you.
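For reference, the hint can be given at the statement level or per table; level 0 (as in the example above) disables dynamic sampling, and higher levels (up to 10) sample more blocks. A sketch using the same placeholder tables a, b, c, d:

```sql
-- statement level: applies to every table in the query
select /*+ dynamic_sampling(4) */ * from a, b, c, d where ...;

-- table level: applies only to the named table (here, a)
select /*+ dynamic_sampling(a 4) */ * from a, b, c, d where ...;
```

So in the table-level form, which table you name does matter - it controls which table the optimizer samples (or, with level 0, does not sample).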


Hi Tom,

I have following query and I generated execution plan for it. I am not able to understand the cardinality.

Query :

SELECT DISTINCT NPF_ACCOUNT.AC_SK,

NPF_AC_POSN_SUM.ACCT_BAL,

T_ACCOUNT.AC_ID

FROM T_ACCOUNT,

T_ACCT_SUM

WHERE T_ACCOUNT.AC_SK IN (167056,276364,292510,410216,550806,584324,584325,585607,594305,594306,375386,375387,597632,597633,597634,597636,597637,597638,597639,597640,597641,597642,597643,597644,597645,597646,597647,597648,597649,597650,597651,597652,597653,597654,597655,597665,597656,597657,597658,597659,597660,597661,597662,597663,597664,597625,597626,597627,608683,709753,739855,769598,769599,769600,789267,831812,831816,831815,831814,831813,831817,886799,891301,891302,896746,905441,928452,952683,971664,971665,971666,971667,971668,971669,971670,597628,597629,986885,987360,987859,992765,1069330,1070973,1075750,1201860,1201862,1201861,1283345,1283346,1312703,1312704,1312702,1312578,1312579,1415629,1446799,1446798,1520530,1834159,1859876,1894444,1932920,1964377,2033976,2077465,2185914)

AND T_ACCT_SUM.ACCT_BAL <= 1000.00

AND T_ACCOUNT.AC_SK = T_ACCT_SUM.AC_SK

AND T_ACCT_SUM.BAL_DATE = '31-OCT-13';

Execution plan :

--------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|

--------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 132 | 4752 | 124 (2)|

| 1 | HASH UNIQUE | | 132 | 4752 | 124 (2)|

| 2 | MERGE JOIN | | 132 | 4752 | 123 (1)|

| 3 | PARTITION RANGE SINGLE | | 128K| 2501K| 85 (0)|

| 4 | INLIST ITERATOR | | | | |

|* 5 | TABLE ACCESS BY LOCAL INDEX ROWID| T_ACCT_SUM | 128K| 2501K| 85 (0)|

|* 6 | INDEX RANGE SCAN | PK_T_ACCT_SUM | 5 | | 32 (0)|

|* 7 | SORT JOIN | | 106 | 1696 | 39 (3)|

| 8 | INLIST ITERATOR | | | | |

| 9 | TABLE ACCESS BY INDEX ROWID | T_ACCOUNT | 106 | 1696 | 38 (0)|

|* 10 | INDEX UNIQUE SCAN | PK_T_ACCOUNT | 106 | | 29 (0)|

--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

5 - filter("NPF_AC_POSN_SUM"."ACCT_BAL"<=1000.00)

6 - access(("NPF_AC_POSN_SUM"."AC_SK"=167056 OR "NPF_AC_POSN_SUM"."AC_SK"=276364 OR

"NPF_AC_POSN_SUM"."AC_SK"=292510 OR "NPF_AC_POSN_SUM"."AC_SK"=375386 OR "NPF_AC_POSN_SUM"."AC_SK"=375387 OR

"NPF_AC_POSN_SUM"."AC_SK"=410216 OR "NPF_AC_POSN_SUM"."AC_SK"=550806 OR "NPF_AC_POSN_SUM"."AC_SK"=584324 OR

"NPF_AC_POSN_SUM"."AC_SK"=584325 OR "NPF_AC_POSN_SUM"."AC_SK"=585607 OR "NPF_AC_POSN_SUM"."AC_SK"=594305 OR

"NPF_AC_POSN_SUM"."AC_SK"=594306 OR "NPF_AC_POSN_SUM"."AC_SK"=597625 OR "NPF_AC_POSN_SUM"."AC_SK"=597626 OR

"NPF_AC_POSN_SUM"."AC_SK"=597627 OR "NPF_AC_POSN_SUM"."AC_SK"=597628 OR "NPF_AC_POSN_SUM"."AC_SK"=597629 OR

"NPF_AC_POSN_SUM"."AC_SK"=597632 OR "NPF_AC_POSN_SUM"."AC_SK"=597633 OR "NPF_AC_POSN_SUM"."AC_SK"=597634 OR

"NPF_AC_POSN_SUM"."AC_SK"=597636 OR "NPF_AC_POSN_SUM"."AC_SK"=597637 OR "NPF_AC_POSN_SUM"."AC_SK"=597638 OR

"NPF_AC_POSN_SUM"."AC_SK"=597639 OR "NPF_AC_POSN_SUM"."AC_SK"=597640 OR "NPF_AC_POSN_SUM"."AC_SK"=597641 OR

"NPF_AC_POSN_SUM"."AC_SK"=597642 OR "NPF_AC_POSN_SUM"."AC_SK"=597643 OR "NPF_AC_POSN_SUM"."AC_SK"=597644 OR

"NPF_AC_POSN_SUM"."AC_SK"=597645 OR "NPF_AC_POSN_SUM"."AC_SK"=597646 OR "NPF_AC_POSN_SUM"."AC_SK"=597647 OR

"NPF_AC_POSN_SUM"."AC_SK"=597648 OR "NPF_AC_POSN_SUM"."AC_SK"=597649 OR "NPF_AC_POSN_SUM"."AC_SK"=597650 OR

"NPF_AC_POSN_SUM"."AC_SK"=597651 OR "NPF_AC_POSN_SUM"."AC_SK"=597652 OR "NPF_AC_POSN_SUM"."AC_SK"=597653 OR

"NPF_AC_POSN_SUM"."AC_SK"=597654 OR "NPF_AC_POSN_SUM"."AC_SK"=597655 OR "NPF_AC_POSN_SUM"."AC_SK"=597656 OR

"NPF_AC_POSN_SUM"."AC_SK"=597657 OR "NPF_AC_POSN_SUM"."AC_SK"=597658 OR "NPF_AC_POSN_SUM"."AC_SK"=597659 OR

"NPF_AC_POSN_SUM"."AC_SK"=597660 OR "NPF_AC_POSN_SUM"."AC_SK"=597661 OR "NPF_AC_POSN_SUM"."AC_SK"=597662 OR

"NPF_AC_POSN_SUM"."AC_SK"=597663 OR "NPF_AC_POSN_SUM"."AC_SK"=597664 OR "NPF_AC_POSN_SUM"."AC_SK"=597665 OR

"NPF_AC_POSN_SUM"."AC_SK"=608683 OR "NPF_AC_POSN_SUM"."AC_SK"=709753 OR "NPF_AC_POSN_SUM"."AC_SK"=739855 OR

"NPF_AC_POSN_SUM"."AC_SK"=769598 OR "NPF_AC_POSN_SUM"."AC_SK"=769599 OR "NPF_AC_POSN_SUM"."AC_SK"=769600 OR

"NPF_AC_POSN_SUM"."AC_SK"=789267 OR "NPF_AC_POSN_SUM"."AC_SK"=831812 OR "NPF_AC_POSN_SUM"."AC_SK"=831813 OR

"NPF_AC_POSN_SUM"."AC_SK"=831814 OR "NPF_AC_POSN_SUM"."AC_SK"=831815 OR "NPF_AC_POSN_SUM"."AC_SK"=831816 OR

"NPF_AC_POSN_SUM"."AC_SK"=831817 OR "NPF_AC_POSN_SUM"."AC_SK"=886799 OR "NPF_AC_POSN_SUM"."AC_SK"=891301 OR

"NPF_AC_POSN_SUM"."AC_SK"=891302 OR "NPF_AC_POSN_SUM"."AC_SK"=896746 OR "NPF_AC_POSN_SUM"."AC_SK"=905441 OR

"NPF_AC_POSN_SUM"."AC_SK"=928452 OR "NPF_AC_POSN_SUM"."AC_SK"=952683 OR "NPF_AC_POSN_SUM"."AC_SK"=971664 OR

"NPF_AC_POSN_SUM"."AC_SK"=971665 OR "NPF_AC_POSN_SUM"."AC_SK"=971666 OR "NPF_AC_POSN_SUM"."AC_SK"=971667 OR

"NPF_AC_POSN_SUM"."AC_SK"=971668 OR "NPF_AC_POSN_SUM"."AC_SK"=971669 OR "NPF_AC_POSN_SUM"."AC_SK"=971670 OR

"NPF_AC_POSN_SUM"."AC_SK"=986885 OR "NPF_AC_POSN_SUM"."AC_SK"=987360 OR "NPF_AC_POSN_SUM"."AC_SK"=987859 OR

"NPF_AC_POSN_SUM"."AC_SK"=992765 OR "NPF_AC_POSN_SUM"."AC_SK"=1069330 OR "NPF_AC_POSN_SUM"."AC_SK"=1070973 OR

"NPF_AC_POSN_SUM"."AC_SK"=1075750 OR "NPF_AC_POSN_SUM"."AC_SK"=1201860 OR "NPF_AC_POSN_SUM"."AC_SK"=1201861 OR

"NPF_AC_POSN_SUM"."AC_SK"=1201862 OR "NPF_AC_POSN_SUM"."AC_SK"=1283345 OR "NPF_AC_POSN_SUM"."AC_SK"=1283346 OR

"NPF_AC_POSN_SUM"."AC_SK"=1312578 OR "NPF_AC_POSN_SUM"."AC_SK"=1312579 OR "NPF_AC_POSN_SUM"."AC_SK"=1312702 OR

"NPF_AC_POSN_SUM"."AC_SK"=1312703 OR "NPF_AC_POSN_SUM"."AC_SK"=1312704 OR "NPF_AC_POSN_SUM"."AC_SK"=1415629 OR

"NPF_AC_POSN_SUM"."AC_SK"=1446798 OR "NPF_AC_POSN_SUM"."AC_SK"=1446799 OR "NPF_AC_POSN_SUM"."AC_SK"=1520530 OR

"NPF_AC_POSN_SUM"."AC_SK"=1834159 OR "NPF_AC_POSN_SUM"."AC_SK"=1859876 OR "NPF_AC_POSN_SUM"."AC_SK"=1894444 OR

"NPF_AC_POSN_SUM"."AC_SK"=1932920 OR "NPF_AC_POSN_SUM"."AC_SK"=1964377 OR "NPF_AC_POSN_SUM"."AC_SK"=2033976 OR

"NPF_AC_POSN_SUM"."AC_SK"=2077465 OR "NPF_AC_POSN_SUM"."AC_SK"=2185914) AND

"NPF_AC_POSN_SUM"."VALN_AS_OF_DATE"='31-OCT-13')

7 - access("NPF_ACCOUNT"."AC_SK"="NPF_AC_POSN_SUM"."AC_SK")

filter("NPF_ACCOUNT"."AC_SK"="NPF_AC_POSN_SUM"."AC_SK")

10 - access("NPF_ACCOUNT"."AC_SK"=167056 OR "NPF_ACCOUNT"."AC_SK"=276364 OR "NPF_ACCOUNT"."AC_SK"=292510 OR

"NPF_ACCOUNT"."AC_SK"=375386 OR "NPF_ACCOUNT"."AC_SK"=375387 OR "NPF_ACCOUNT"."AC_SK"=410216 OR

"NPF_ACCOUNT"."AC_SK"=550806 OR "NPF_ACCOUNT"."AC_SK"=584324 OR "NPF_ACCOUNT"."AC_SK"=584325 OR

"NPF_ACCOUNT"."AC_SK"=585607 OR "NPF_ACCOUNT"."AC_SK"=594305 OR "NPF_ACCOUNT"."AC_SK"=594306 OR

"NPF_ACCOUNT"."AC_SK"=597625 OR "NPF_ACCOUNT"."AC_SK"=597626 OR "NPF_ACCOUNT"."AC_SK"=597627 OR

"NPF_ACCOUNT"."AC_SK"=597628 OR "NPF_ACCOUNT"."AC_SK"=597629 OR "NPF_ACCOUNT"."AC_SK"=597632 OR

"NPF_ACCOUNT"."AC_SK"=597633 OR "NPF_ACCOUNT"."AC_SK"=597634 OR "NPF_ACCOUNT"."AC_SK"=597636 OR

"NPF_ACCOUNT"."AC_SK"=597637 OR "NPF_ACCOUNT"."AC_SK"=597638 OR "NPF_ACCOUNT"."AC_SK"=597639 OR

"NPF_ACCOUNT"."AC_SK"=597640 OR "NPF_ACCOUNT"."AC_SK"=597641 OR "NPF_ACCOUNT"."AC_SK"=597642 OR

"NPF_ACCOUNT"."AC_SK"=597643 OR "NPF_ACCOUNT"."AC_SK"=597644 OR "NPF_ACCOUNT"."AC_SK"=597645 OR

"NPF_ACCOUNT"."AC_SK"=597646 OR "NPF_ACCOUNT"."AC_SK"=597647 OR "NPF_ACCOUNT"."AC_SK"=597648 OR

"NPF_ACCOUNT"."AC_SK"=597649 OR "NPF_ACCOUNT"."AC_SK"=597650 OR "NPF_ACCOUNT"."AC_SK"=597651 OR

"NPF_ACCOUNT"."AC_SK"=597652 OR "NPF_ACCOUNT"."AC_SK"=597653 OR "NPF_ACCOUNT"."AC_SK"=597654 OR

"NPF_ACCOUNT"."AC_SK"=597655 OR "NPF_ACCOUNT"."AC_SK"=597656 OR "NPF_ACCOUNT"."AC_SK"=597657 OR

"NPF_ACCOUNT"."AC_SK"=597658 OR "NPF_ACCOUNT"."AC_SK"=597659 OR "NPF_ACCOUNT"."AC_SK"=597660 OR

"NPF_ACCOUNT"."AC_SK"=597661 OR "NPF_ACCOUNT"."AC_SK"=597662 OR "NPF_ACCOUNT"."AC_SK"=597663 OR

"NPF_ACCOUNT"."AC_SK"=597664 OR "NPF_ACCOUNT"."AC_SK"=597665 OR "NPF_ACCOUNT"."AC_SK"=608683 OR

"NPF_ACCOUNT"."AC_SK"=709753 OR "NPF_ACCOUNT"."AC_SK"=739855 OR "NPF_ACCOUNT"."AC_SK"=769598 OR

"NPF_ACCOUNT"."AC_SK"=769599 OR "NPF_ACCOUNT"."AC_SK"=769600 OR "NPF_ACCOUNT"."AC_SK"=789267 OR

"NPF_ACCOUNT"."AC_SK"=831812 OR "NPF_ACCOUNT"."AC_SK"=831813 OR "NPF_ACCOUNT"."AC_SK"=831814 OR

"NPF_ACCOUNT"."AC_SK"=831815 OR "NPF_ACCOUNT"."AC_SK"=831816 OR "NPF_ACCOUNT"."AC_SK"=831817 OR

"NPF_ACCOUNT"."AC_SK"=886799 OR "NPF_ACCOUNT"."AC_SK"=891301 OR "NPF_ACCOUNT"."AC_SK"=891302 OR

"NPF_ACCOUNT"."AC_SK"=896746 OR "NPF_ACCOUNT"."AC_SK"=905441 OR "NPF_ACCOUNT"."AC_SK"=928452 OR

"NPF_ACCOUNT"."AC_SK"=952683 OR "NPF_ACCOUNT"."AC_SK"=971664 OR "NPF_ACCOUNT"."AC_SK"=971665 OR

"NPF_ACCOUNT"."AC_SK"=971666 OR "NPF_ACCOUNT"."AC_SK"=971667 OR "NPF_ACCOUNT"."AC_SK"=971668 OR

"NPF_ACCOUNT"."AC_SK"=971669 OR "NPF_ACCOUNT"."AC_SK"=971670 OR "NPF_ACCOUNT"."AC_SK"=986885 OR

"NPF_ACCOUNT"."AC_SK"=987360 OR "NPF_ACCOUNT"."AC_SK"=987859 OR "NPF_ACCOUNT"."AC_SK"=992765 OR

"NPF_ACCOUNT"."AC_SK"=1069330 OR "NPF_ACCOUNT"."AC_SK"=1070973 OR "NPF_ACCOUNT"."AC_SK"=1075750 OR

"NPF_ACCOUNT"."AC_SK"=1201860 OR "NPF_ACCOUNT"."AC_SK"=1201861 OR "NPF_ACCOUNT"."AC_SK"=1201862 OR

"NPF_ACCOUNT"."AC_SK"=1283345 OR "NPF_ACCOUNT"."AC_SK"=1283346 OR "NPF_ACCOUNT"."AC_SK"=1312578 OR

"NPF_ACCOUNT"."AC_SK"=1312579 OR "NPF_ACCOUNT"."AC_SK"=1312702 OR "NPF_ACCOUNT"."AC_SK"=1312703 OR

"NPF_ACCOUNT"."AC_SK"=1312704 OR "NPF_ACCOUNT"."AC_SK"=1415629 OR "NPF_ACCOUNT"."AC_SK"=1446798 OR

"NPF_ACCOUNT"."AC_SK"=1446799 OR "NPF_ACCOUNT"."AC_SK"=1520530 OR "NPF_ACCOUNT"."AC_SK"=1834159 OR

"NPF_ACCOUNT"."AC_SK"=1859876 OR "NPF_ACCOUNT"."AC_SK"=1894444 OR "NPF_ACCOUNT"."AC_SK"=1932920 OR

"NPF_ACCOUNT"."AC_SK"=1964377 OR "NPF_ACCOUNT"."AC_SK"=2033976 OR "NPF_ACCOUNT"."AC_SK"=2077465 OR

"NPF_ACCOUNT"."AC_SK"=2185914)

Table T_ACCT_SUM is range-partitioned on the BAL_DATE column. The ACCT_BAL column has a height-balanced histogram.

The database is version 11.2.0.3, a 4-node RAC system.

The execution plan was collected using the "SET AUTOTRACE ON" command.

At plan step 6, the calculated cardinality is "5" for the index range scan. But at step 5, the cardinality has increased to 128K for the same table. My assumption was that the cardinality returned by the index should be greater than or the same as the cardinality for its corresponding table access. Here, the increase is huge. Can you please help me understand what I am missing?

Thanks!


"NPF_ACCOUNT"."AC_SK"=831815 OR "NPF_ACCOUNT"."AC_SK"=831816 OR "NPF_ACCOUNT"."AC_SK"=831817 OR

"NPF_ACCOUNT"."AC_SK"=886799 OR "NPF_ACCOUNT"."AC_SK"=891301 OR "NPF_ACCOUNT"."AC_SK"=891302 OR

"NPF_ACCOUNT"."AC_SK"=896746 OR "NPF_ACCOUNT"."AC_SK"=905441 OR "NPF_ACCOUNT"."AC_SK"=928452 OR

"NPF_ACCOUNT"."AC_SK"=952683 OR "NPF_ACCOUNT"."AC_SK"=971664 OR "NPF_ACCOUNT"."AC_SK"=971665 OR

"NPF_ACCOUNT"."AC_SK"=971666 OR "NPF_ACCOUNT"."AC_SK"=971667 OR "NPF_ACCOUNT"."AC_SK"=971668 OR

"NPF_ACCOUNT"."AC_SK"=971669 OR "NPF_ACCOUNT"."AC_SK"=971670 OR "NPF_ACCOUNT"."AC_SK"=986885 OR

"NPF_ACCOUNT"."AC_SK"=987360 OR "NPF_ACCOUNT"."AC_SK"=987859 OR "NPF_ACCOUNT"."AC_SK"=992765 OR

"NPF_ACCOUNT"."AC_SK"=1069330 OR "NPF_ACCOUNT"."AC_SK"=1070973 OR "NPF_ACCOUNT"."AC_SK"=1075750 OR

"NPF_ACCOUNT"."AC_SK"=1201860 OR "NPF_ACCOUNT"."AC_SK"=1201861 OR "NPF_ACCOUNT"."AC_SK"=1201862 OR

"NPF_ACCOUNT"."AC_SK"=1283345 OR "NPF_ACCOUNT"."AC_SK"=1283346 OR "NPF_ACCOUNT"."AC_SK"=1312578 OR

"NPF_ACCOUNT"."AC_SK"=1312579 OR "NPF_ACCOUNT"."AC_SK"=1312702 OR "NPF_ACCOUNT"."AC_SK"=1312703 OR

"NPF_ACCOUNT"."AC_SK"=1312704 OR "NPF_ACCOUNT"."AC_SK"=1415629 OR "NPF_ACCOUNT"."AC_SK"=1446798 OR

"NPF_ACCOUNT"."AC_SK"=1446799 OR "NPF_ACCOUNT"."AC_SK"=1520530 OR "NPF_ACCOUNT"."AC_SK"=1834159 OR

"NPF_ACCOUNT"."AC_SK"=1859876 OR "NPF_ACCOUNT"."AC_SK"=1894444 OR "NPF_ACCOUNT"."AC_SK"=1932920 OR

"NPF_ACCOUNT"."AC_SK"=1964377 OR "NPF_ACCOUNT"."AC_SK"=2033976 OR "NPF_ACCOUNT"."AC_SK"=2077465 OR

"NPF_ACCOUNT"."AC_SK"=2185914)

Table T_ACCT_SUM is range-partitioned on the BAL_DATE column. The ACCT_BAL column has a height-balanced histogram.

The database is version 11.2.0.3 on a 4-node RAC system.

The execution plan was collected using the "SET AUTOTRACE ON" command.

On line 6, the calculated cardinality is 5 for the index range scan, but on line 5 the cardinality has increased to 128K for the same table. My assumption was that the cardinality returned by an index should be greater than or equal to the cardinality of its corresponding table access, but here the increase is huge. Can you please help me understand what I am missing?

Thanks!

I normally don't write reviews, but I had to say this: the answers could be better if they were written with simpler vocabulary. Whenever I was looking up various Oracle questions, this website always came up first, so I would read the answer here and then have to move on to other websites because I don't comprehend the style of writing that well.

Team,

Looking at the histogram, the estimated cardinality for the bind value SYS is 77043 - 41873 = 35170, but the explain plan reports the cardinality as 4075 (which is num_rows/NDV, i.e. 77426/19). Is there any reason why the explain plan doesn't make use of histograms to report the correct cardinality?

demo@ORA12C> create table t as
  2  select *
  3  from all_objects;

Table created.

demo@ORA12C> create index t_idx on t(owner);

Index created.

demo@ORA12C> begin
  2    dbms_stats.gather_table_stats(
  3      ownname=>user,
  4      tabname=>'T',
  5      method_opt=>'for all indexed columns size 20');
  6  end;
  7  /

PL/SQL procedure successfully completed.

demo@ORA12C> select count(*),count(distinct owner) from t;

  COUNT(*) COUNT(DISTINCTOWNER)
---------- --------------------
     77426                   19

1 row selected.

demo@ORA12C> select num_distinct,num_buckets,histogram,num_nulls
  2  from user_tab_col_statistics
  3  where table_name ='T'
  4  and column_name ='OWNER';

NUM_DISTINCT NUM_BUCKETS HISTOGRAM       NUM_NULLS
------------ ----------- --------------- ----------
          19          19 FREQUENCY                0

1 row selected.

demo@ORA12C> column endpoint_actual_value format a20
demo@ORA12C> select endpoint_number,endpoint_actual_value
  2  from user_tab_histograms
  3  where table_name ='T'
  4  and column_name ='OWNER';

ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALU
--------------- --------------------
            248 APEX_040200
            356 CTXSYS
            365 DEMO
            382 DVF
            393 DVSYS
            412 GSMADMIN_INTERNAL
            494 LBACSYS
           1658 MDSYS
           1676 OLAPSYS
           1681 ORDDATA
           1686 ORDPLUGINS
           4786 ORDSYS
           4793 OUTLN
          41869 PUBLIC
          41873 SCOTT
          77043 SYS
          77056 SYSTEM
          77126 WMSYS
          77426 XDB

19 rows selected.

demo@ORA12C> variable x varchar2(20);
demo@ORA12C> exec :x := 'SYS';

PL/SQL procedure successfully completed.

demo@ORA12C> explain plan for
  2  select * from t
  3  where owner =:x;

Explained.

demo@ORA12C> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
Plan hash value: 767293772

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |  4075 |   461K|   141   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED| T     |  4075 |   461K|   141   (0)| 00:00:01 |
|*  2 |  INDEX RANGE SCAN                   | T_IDX |  4075 |       |    10   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"=:X)

14 rows selected.

demo@ORA12C> exec :x := 'SCOTT';

PL/SQL procedure successfully completed.

demo@ORA12C> explain plan for
  2  select * from t
  3  where owner =:x;

Explained.

demo@ORA12C> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
Plan hash value: 767293772

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |  4075 |   461K|   141   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED| T     |  4075 |   461K|   141   (0)| 00:00:01 |
|*  2 |  INDEX RANGE SCAN                   | T_IDX |  4075 |       |    10   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"=:X)

14 rows selected.
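The arithmetic behind those two numbers can be sketched outside the database. This small Python snippet (an illustration only, not Oracle's code) contrasts the frequency-histogram estimate for 'SYS' (the gap between its cumulative endpoint number and the previous one) with the density-based default of num_rows/NDV that the explain plan reported:

```python
# Illustration only -- not Oracle code. A frequency histogram stores a
# cumulative row count (ENDPOINT_NUMBER) per value; the estimate for a
# value is the gap back to the previous endpoint.

# A subset of the USER_TAB_HISTOGRAMS rows from the demo above:
endpoints = [(41869, 'PUBLIC'), (41873, 'SCOTT'),
             (77043, 'SYS'), (77056, 'SYSTEM')]

def histogram_estimate(value, endpoints):
    """Rows estimated for `value` from cumulative endpoint numbers."""
    prev = 0
    for ep, val in sorted(endpoints):
        if val == value:
            return ep - prev
        prev = ep
    return None

num_rows, ndv = 77426, 19

print(histogram_estimate('SYS', endpoints))    # 77043 - 41873 = 35170
print(histogram_estimate('SCOTT', endpoints))  # 41873 - 41869 = 4
print(round(num_rows / ndv))                   # density-based guess: 4075
```

Because EXPLAIN PLAN does not know the bind value, it cannot pick which endpoint gap applies, so it falls back to the density-based 4075 regardless of the value you put in :x.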

An explain plan *command* doesn't peek at binds. It's one of the flaws of using it. Better to actually run the query and use dbms_xplan.display_cursor:

SQL> drop table t purge;

Table dropped.

SQL> create table t as
  2  select *
  3  from all_objects;

Table created.

SQL> create index t_idx on t(owner);

Index created.

SQL> begin
  2    dbms_stats.gather_table_stats(
  3      ownname=>user,
  4      tabname=>'T',
  5      method_opt=>'for all indexed columns size 35');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*),count(distinct owner) from t;

  COUNT(*) COUNT(DISTINCTOWNER)
---------- --------------------
    101321                   39

1 row selected.

SQL> select num_distinct,num_buckets,histogram,num_nulls
  2  from user_tab_col_statistics
  3  where table_name ='T'
  4  and column_name ='OWNER';

NUM_DISTINCT NUM_BUCKETS HISTOGRAM       NUM_NULLS
------------ ----------- --------------- ----------
          39          35 TOP-FREQUENCY           0

1 row selected.

SQL> column endpoint_actual_value format a20
SQL> select endpoint_number,endpoint_actual_value
  2  from user_tab_histograms
  3  where table_name ='T'
  4  and column_name ='OWNER';

ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALU
--------------- --------------------
           3021 APEX_040200
           6378 APEX_050000
           9542 APEX_050100
           9564 APEX_LISTENER
           9724 ASKTOM
           9733 AUDSYS
           9741 BI
          10142 CTXSYS
          10197 DBSNMP
          10216 DVF
          10508 DVSYS
          10520 FLOWS_FILES
          10624 GSMADMIN_INTERNAL
          10658 HR
          10895 LBACSYS
          14055 MCDONAC
          16154 MDSYS
          16171 OJVMSYS
          16196 OLAPSYS
          16470 ORDDATA
          16480 ORDPLUGINS
          19640 ORDSYS
          19848 ORDS_METADATA
          19856 OUTLN
          57418 PUBLIC
          57428 RASADM
          57444 SCOTT
          57735 SH
          57743 SI_INFORMTN_SCHEMA
          57790 SOE
          99915 SYS
         100513 SYSTEM
         100531 WHS_MD
         100902 WMSYS
         101306 XDB

35 rows selected.

SQL> variable x varchar2(20);
SQL> exec :x := 'SCOTT';

PL/SQL procedure successfully completed.

SQL> select * from t
  2  where owner =:x;

...

16 rows selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
SQL_ID  7yvn3fqx5x8tp, child number 0
-------------------------------------
select * from t where owner =:x

Plan hash value: 767293772

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |       |       |     2 (100)|          |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED| T     |    16 |  1824 |     2   (0)| 00:00:01 |
|*  2 |  INDEX RANGE SCAN                   | T_IDX |    16 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OWNER"=:X)

19 rows selected.
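The Rows=16 in that plan is the same endpoint-gap arithmetic: because the query was actually executed, the optimizer could peek at the bind value 'SCOTT' and use the TOP-FREQUENCY histogram, giving 57444 - 57428 = 16. A minimal Python sketch (illustration only, using a subset of the endpoints listed above):

```python
# Illustration only -- not Oracle code. With the bind value peeked at hard
# parse, the estimate for a value present in the TOP-FREQUENCY histogram is
# again the gap between consecutive cumulative endpoint numbers.
endpoints = [(57418, 'PUBLIC'), (57428, 'RASADM'),
             (57444, 'SCOTT'), (57735, 'SH')]

def endpoint_gap(value, endpoints):
    """Rows estimated for `value` from cumulative endpoint numbers."""
    prev = 0
    for ep, val in sorted(endpoints):
        if val == value:
            return ep - prev
        prev = ep
    return None

print(endpoint_gap('SCOTT', endpoints))  # 57444 - 57428 = 16, the Rows=16 above
```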