Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Marat.

Asked: January 04, 2010 - 11:10 pm UTC

Last updated: August 31, 2011 - 9:05 am UTC

Version: 10.2.0.3

Viewed 50K+ times! This question is

You Asked

Dear Tom,

I didn't find the "+ cardinality" hint in official Oracle documentation, but you use it in your answers since 9i. Is it really "undocumented" hint, and one is not supposed to use it?

Thank you!

and Tom said...

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3779680732446#15740265481549

correct it is not documented

however, it is - in my opinion - one of the few 'safe' undocumented things to use.

Because it's use will not lead to data corruption, wrong answers, unpredictable outcomes. If it works - it will influence a query plan, if it doesn't - it won't. That is all - it is rather 'safe' in that respect.

Rating

  (9 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

mfz, January 05, 2010 - 4:03 pm UTC

With the introduction of dynamic sampling in 10g , there is no need for cardinality hint.


Tom Kyte
January 06, 2010 - 7:32 am UTC

umm, think again

and read the supplied link above and tell me what dynamic sampling would do for you in that case?

answer: not a single thing.


dynamic sampling was introduced in 9iR2 by the way.



dynamic sampling is great, don't get me wrong:

https://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html


but it is scarcely a replacement for this particular tool

One caveat about the cardinality hint...

Steve C, January 05, 2010 - 4:29 pm UTC

One of the challenges of using the cardinality hint is that you are locking in a specific operation cardinality. Most data tends to grow or the distribution of data change over time and I find it's easy for this hint to get overlooked until something breaks.

Our best success has come from using the dynamic_sampling hint which works especially well in OLAP environments. We have hundreds of queries where Oracle just doesn't have the stats on a particular expression think (UPPER, SUBSTR, TO_CHAR, etc.), histograms are not gathered, etc. Where it makes sense to use DYNAMIC_SAMPLING.

Of course there is no perfect answer. As Tom says all the time, every features it there for a reason so this is not a review that is advocating to use DYNAMIC_SAMPLING *instead* of CARDINALITY just passing on some of our experience.


Tom Kyte
January 06, 2010 - 7:46 am UTC

One of the challenges of using the cardinality hint is that you are locking in
a specific operation cardinality.


or more precisely, you are locking in a specific cardinality "value". In other write ups I've discussed this. If you are doing something like a global temporary table (gtt) OR a collection (pipelined function or otherwise) - the cardinality hint can be a great bonus.

for example, sometimes you put 100 rows into the gtt, sometimes 1000, and sometimes 100000 (eg: small, big, extra large). You would like to have the opportunity to have a small plan, big plan and extra large plan (be reasonable and thoughtful all - if you put 50 rows in there, that is small - if you put 123 rows in there, that is small - if you put in 732 rows that is big (near 1000) and so on - you will not use the exact number of rows, you will use only 100, 1000, 100000 for small, big and extra large). You will generate up to 3 queries with 3 potentially different plans.

With dynamic sampling - the first person that runs the query will do the sample and cause the plan to be stuck at small, big or extra large - every one else will use that plan.



We have hundreds of queries where Oracle
just doesn't have the stats on a particular expression think (UPPER, SUBSTR,
TO_CHAR, etc.), histograms are not gathered, etc. Where it makes sense to use
DYNAMIC_SAMPLING.


absolutely - in a warehouse/reporting system - it probably makes sense to use level 3 for pretty much all of the reporting queries, especially ones without binds.



cardinality on nested table

Marat Tolgambayev, January 05, 2010 - 5:05 pm UTC

Thank you Tom!


mfz, Steve C,
I use Cardinality hint where I use TABLE(CAST(<nested_table>)). Dynamic sampling is not working for that.

Chris Curzon, August 04, 2011 - 12:23 pm UTC

Is there a way to implement a dynamic cardinality hint without using dynamic SQL? It would be good to avoid the parsing step forced by the use of dynamic SQL.
Tom Kyte
August 04, 2011 - 6:47 pm UTC

but, if you wanted the "dynamic cardinality" to be used - we'd have to hard parse it to see if a different plan were warranted.

catch 22, it would still be a parse - a hard one.

Dynamic SQL and parse cost

Chris Curzon, August 04, 2011 - 2:56 pm UTC


I think I was too brief. Let me elaborate on my question.

One can write a procedure

create procedure myquery (...)
begin
for curs in
select /*+ cardinality(t,100) */ <columns>
from temptab t
where ...
loop
<do stuff>
end loop;
end;

In this case, the query is parsed when the procedure is created and then simply used thereafter.

But if one is going to use different cardinality hints in different calls, one has to do something like this:

create procedure myquery2 (cardhint in varchar2, ...)
begin
if cardhint='SMALL' then
sqlstmt := 'select /*+ cardinality(t,100) */ ...';
elsif cardhint='MED' then
sqlstmt := 'select /*+ cardinality(t,10000) */ ...';
elsif cardhint='LARGE' then
sqlstmt := 'select /*+ cardinality(t,1000000) */ ...';
end if;

open cv for sqlstmt
loop
<do stuff...>
end loop;
end;

In the second case, there will be a parse step with each call (I think?). Wouldn't that parse cost possibly overwhelm the savings of having a more efficient plan?

What would be the best way to handle this?

Thanks for your ideas, and your great explanations!

-- Chris Curzon

Tom Kyte
August 04, 2011 - 6:59 pm UTC

you need some reparsing - because you want to have potentially different plans each time.

so, you will have to have SOME reparsing.

To minimize it - IF you know the result set will fit nominally into memory - you can use native dynamic sql with bulk collect. This will make it so that if you execute the same statement twice in a row - plsql will have cached it just like static sql.

So, for example:

create or replace procedure p( p_cardhint in varchar2 )
is
    l_sql varchar2(4000);

    type array is table of dual%rowtype;
    l_data  array;
begin
    if (p_cardhint = 'SMALL')
    then
        l_sql := 'select /*+ cardinality(t 10) */ * from dual t';
    else
        l_sql := 'select /*+ cardinality( t 1000 ) */ * from dual t';
    end if;
    execute immediate l_sql bulk collect into l_data;
end;
/



if we call that like this:

exec p('SMALL');
exec p('SMALL');
exec p('SMALL');
exec p('SMALL');
exec p('LARGE');
exec p('LARGE');
exec p('SMALL');


we'll discover that the small cursor was parsed twice and the large was parsed once:

select /*+ cardinality(t 10) */ * from dual t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      5      0.00       0.00          0          0          0           0
Fetch        5      0.00       0.00          0         10          0           5
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       12      0.00       0.00          0         10          0           5
********************************************************************************
select /*+ cardinality( t 1000 ) */ * from dual t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.00       0.00          0          4          0           2



Again, this is valid approach ONLY if you know the query is to return a modest number of rows - hundreds, MAYBE thousands at most.

Thanks

Chris Curzon, August 04, 2011 - 7:05 pm UTC

I always appreciate how clear your examples are. Thank you very much!

Estimated Cardinality on External Tables

Rajeshwaran, Jeyabal, August 25, 2011 - 12:15 pm UTC

Tom:

I have an External Table 'EMP_EXT' mapped to a datafile having data's from scott.emp table. (I using Oracle 10GR2)

1) How can the Optimizer be provided with proper cardinality information? (say in this scenario the actual card is 3 and the estimated is 8168 which is far away from actual) ?

2)I know that we can provide proper cadinality information using CARDINALITY hint. But how this will be appropriate when External table is mapped to different data files. say one file is having few records and other file is having Millions of records to process

rajesh@ORA10GR2> select deptno,cast(collect(ename) as array) as cnt
  2    from emp_ext e
  3    group by deptno
  4  /
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3800725879

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |  8168 |   159K|    26   (8)| 00:00:01 |
|   1 |  SORT GROUP BY              |         |  8168 |   159K|    26   (8)| 00:00:01 |
|   2 |   EXTERNAL TABLE ACCESS FULL| EMP_EXT |  8168 |   159K|    25   (4)| 00:00:01 |
---------------------------------------------------------------------------------------

rajesh@ORA10GR2>


3) So before joining the External table to Transaction table,
a) get the count(*) from External table
b) use dbms_stats.set_table_stats to set the cardinalty
Is this What you suggest for me?

begin
 dbms_stats.set_table_stats(
 ownname=>user,
 tabname=>'EMP_EXT',
 numrows=>3);
end;
/

rajesh@ORA10GR2> select deptno,cast(collect(ename) as array) as cnt
  2  from emp_ext e
  3  group by deptno
  4  /
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3800725879

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     3 |    60 |    25   (4)| 00:00:01 |
|   1 |  SORT GROUP BY              |         |     3 |    60 |    25   (4)| 00:00:01 |
|   2 |   EXTERNAL TABLE ACCESS FULL| EMP_EXT |     3 |    60 |    24   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

rajesh@ORA10GR2>

Tom Kyte
August 30, 2011 - 4:23 pm UTC

1) cardinality hint is good, you can also gather statistics on it.

2) Just like you would gather stats after truncating and loading a table - if the size of the underlying data file changes radically - you'd need to update the stats.


ops$tkyte%ORA11GR2> create table all_objects_unload
  2  organization external
  3  ( type oracle_datapump
  4    default directory TMP
  5    location( 'allobjects.dat' )
  6  )
  7  as
  8  select a.*, dbms_random.random rnd from stage a;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from all_objects_unload;

Execution Plan
----------------------------------------------------------
Plan hash value: 2716118922

-------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                    |  8168 |  1363K|    29   (0)| 00:00:01 |
|   1 |  EXTERNAL TABLE ACCESS FULL| ALL_OBJECTS_UNLOAD |  8168 |  1363K|    29   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'ALL_OBJECTS_UNLOAD' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from all_objects_unload;

Execution Plan
----------------------------------------------------------
Plan hash value: 2716118922

-------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                    | 72109 |  7323K|   252   (1)| 00:00:04 |
|   1 |  EXTERNAL TABLE ACCESS FULL| ALL_OBJECTS_UNLOAD | 72109 |  7323K|   252   (1)| 00:00:04 |
-------------------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off

Estimated Cardinality on External Tables

Rajeshwaran, Jeyabal, August 30, 2011 - 6:48 pm UTC

Tom:

I asked

2)I know that we can provide proper cadinality information using CARDINALITY hint. But how this will be appropriate when External table is mapped to different data files. say one file is having few records and other file is having Millions of records to process

and you said,

2) Just like you would gather stats after truncating and loading a table - if the size of the underlying data file changes radically - you'd need to update the stats

Also your example is based on oracle_datapump, How ever mine is based on oracle_loader. I am getting this error message, when gathering stats like you.

rajesh@ORA10GR2>
rajesh@ORA10GR2> exec dbms_stats.gather_table_stats(user,'EMP_EXT');
BEGIN dbms_stats.gather_table_stats(user,'EMP_EXT'); END;

*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "RAJESH"."EMP_EXT", sampling on external table is not supported
ORA-06512: at "SYS.DBMS_STATS", line 20337
ORA-06512: at "SYS.DBMS_STATS", line 20360
ORA-06512: at line 1


Elapsed: 00:00:00.23
rajesh@ORA10GR2>


Now the question is, I will joining this External table with Transaction table's having 600M records. If the External Table cardinality is less then Optimizer goes to Nested loop and if cardinality is Huge then Hash Join. So How can I provider proper cardinality to the optimizer? Also this External table is mapped to different data files each time (say one file is having few records and other file is having Millions of records to process )
Tom Kyte
August 31, 2011 - 9:05 am UTC

did you read the error message? don't sample:

ops$tkyte%ORA10GR2> CREATE or replace DIRECTORY load_dir AS '/home/tkyte'
  2  /

Directory created.

ops$tkyte%ORA10GR2> CREATE TABLE EMP_ET
  2  (
  3    "EMPNO" NUMBER(4),
  4    "ENAME" VARCHAR2(10),
  5    "JOB" VARCHAR2(9),
  6    "MGR" NUMBER(4),
  7    "HIREDATE" DATE,
  8    "SAL" NUMBER(7,2),
  9    "COMM" NUMBER(7,2),
 10    "DEPTNO" NUMBER(2)
 11  )
 12  ORGANIZATION external
 13  ( TYPE oracle_loader
 14    DEFAULT DIRECTORY load_dir
 15    ACCESS PARAMETERS
 16    ( RECORDS DELIMITED BY NEWLINE
 17      FIELDS TERMINATED BY "|" LDRTRIM
 18    )
 19    location ( 'emp.dat' )
 20  )
 21  /

Table created.

ops$tkyte%ORA10GR2> select * from emp_et where rownum=1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2857770134

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    87 |     2   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY              |        |       |       |            |          |
|   2 |   EXTERNAL TABLE ACCESS FULL| EMP_ET |     1 |    87 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM=1)

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from emp_et;

Execution Plan
----------------------------------------------------------
Plan hash value: 1672942242

-------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |  8168 |   693K|    39   (3)| 00:00:01 |
|   1 |  EXTERNAL TABLE ACCESS FULL| EMP_ET |  8168 |   693K|    39   (3)| 00:00:01 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'EMP_ET', estimate_percent=> null );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select * from emp_et;

Execution Plan
----------------------------------------------------------
Plan hash value: 1672942242

-------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |    14 |   518 |     2   (0)| 00:00:01 |
|   1 |  EXTERNAL TABLE ACCESS FULL| EMP_ET |    14 |   518 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'EMP_ET', numrows => 1000000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select * from emp_et;

Execution Plan
----------------------------------------------------------
Plan hash value: 1672942242

-------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |  1000K|    35M|    51  (97)| 00:00:01 |
|   1 |  EXTERNAL TABLE ACCESS FULL| EMP_ET |  1000K|    35M|    51  (97)| 00:00:01 |
-------------------------------------------------------------------------------------




if you want to use set_table_stats - you can use utl_file.fgetattr to get the size of the file and estimate the numrows from there.

cardinality hint with named query block

Naresh, September 25, 2011 - 12:09 am UTC

Hello Tom,

I used a cardinality hint in combination with the qb_name hint as in below:

select /*+ NO_USE_HASH(X) INDEX_RS(X,order_id_IDX)
cardinality(@sq 100) */
*
FROM DTL_FACT X WHERE X.ORDER_ID IN
( SELECT /*+ qb_name(sq) parallel(orderes_temp1,2) */ ORDER_ID
FROM ORDERS_TEMP1
WHERE FEED_ID IN (7, 8) MINUS
SELECT /*+ parallel ( HDR_TEMP,2) */ ORDER_ID
FROM HDR_TEMP)
/

I wanted the plan to use a NL operation for the "X" table but it was always going to a hash join
(reason being a high cardinality estimate of 700K for the "minus sub-query") before I added the
cardinality hint .

I kno wit worked but question is: Is there any reason to not use this form of the hint - meaning
cardinality(@sq) - is it "illegal" in any way to use an explicitly named query block instead of a
table?

oracle version 11.2.0.2.0.