Ptn stats
David Scott, December  30, 2004 - 1:34 pm UTC
 
 
Thanks for the quick and useful answer! My (long and ugly) query plan shows a PSTART of 1 and a PSTOP of 90. Based on your response, I would  expect the partition stats to be used, but I'm extrapolating this from your example. Is this still correct? 
 
December  30, 2004 - 2:20 pm UTC 
 
No, because partitions 1 through 90 are involved, so....
table stats are used.
the criteria are:
a) pstart and pstop do not say "KEY"
b) pstart = pstop (a single partition is hit)
So, just to be completely complete:
ops$tkyte@ORA9IR2> explain plan for
  2  select * from t where dt >= to_date('12-mar-2003');
 
Explained.
 
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
------------------------------------
 
-------------------------------------------------------------------------------
| Id  | Operation            |  Name  | Rows  | Bytes | Cost  | Pstart| Pstop |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |    12 |   264 |     2 |       |       |
|   1 |  PARTITION RANGE ALL |        |       |       |       |     1 |     3 |
|*  2 |   TABLE ACCESS FULL  | T      |    12 |   264 |     2 |     1 |     3 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("T"."DT">=TO_DATE('2003-03-12 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
 
Note: cpu costing is off
 
15 rows selected.
 
<b>so, pstart = 1, pstop = 3 ===>>> global table stats are used</b>
 
 
 
 
 
Thanks
David Scott, December  30, 2004 - 2:49 pm UTC
 
 
Tom,
Thanks for clarifying this - I always appreciate your ability to make things more understandable.
 
 
 
clarification
reader, December  30, 2004 - 3:05 pm UTC
 
 
From above results, how do you interpret "global table stats are used" or partition stats are used? Thanks.
 
 
December  30, 2004 - 3:27 pm UTC 
 
you don't, it was a statement.
I was showing that "if you see this in the explain plan, global stats will be used.  On the other hand, if you see this (pstart=pstop=NUMBER), partition stats will be used"
If more than one partition could be accessed (start != stop)
OR the precise partition to be accessed is not known (start is not a number)
then global stats kick in.
Only when the precise partition to be accessed is known by the optimizer will the local partition stats be used. 
 
 
 
Global Stats Vs Partition Stats
A reader, December  30, 2004 - 4:02 pm UTC
 
 
Hi Tom,
In a DW environment the queries are submitted by our friendly business users, and we don't have much control. Though we know that "most of the time" users will query with the partition key, they can occasionally query without it. Do you think we should also gather global stats?
Thanks
 
 
December  30, 2004 - 4:32 pm UTC 
 
hard to say -- if your data is 'nice' you could just set the global stats from the individual partitions (eg: num rows => sum(num rows), num blks => sum(num blks))
it is mostly the column level stuff that gives issues.  If partition 1 has 100 distinct values of X, and so do partitions 2, 3, 4, and 5, do you have
a) 100 distinct values of X
b) 500 distinct values of X
c) some number between 100 and 500
d) all of the above
the answer, to the optimizer, is (d).  so, you know your data better than I (hopefully :).  do you think you suffer from that in a wide-ranging fashion? 
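If the data is "nice" in that sense, the rollup could be sketched as below. This is a hypothetical example only -- the table name T and the idea of simply summing the partition stats are illustrative, and the technique deliberately sidesteps the column-level problem described above:

```sql
-- Sketch: set global table stats by summing partition-level stats
-- of an assumed table T.  Only sensible when per-partition skew
-- is not an issue; column stats are NOT handled here.
declare
    l_numrows number;
    l_numblks number;
begin
    select sum(num_rows), sum(blocks)
      into l_numrows, l_numblks
      from user_tab_partitions
     where table_name = 'T';

    dbms_stats.set_table_stats
    ( ownname => user,
      tabname => 'T',
      numrows => l_numrows,
      numblks => l_numblks );
end;
/
```

Note that the distinct-value question in (a)-(d) is exactly what a rollup like this cannot answer; it only covers the row and block counts.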
 
 
 
Stats
A reader, December  30, 2004 - 11:57 pm UTC
 
 
Hi Tom,
Thanks for quick response.
As per the data distribution, the variation in distinct values between partitions is not much. However, as we don't gather table level stats (i.e., all table level stats - num_rows etc. - are null), how will the optimizer behave if the partition key is not part of the where clause? 
 
December  31, 2004 - 10:56 am UTC 
 
are you sure?  when all of the partitions have been analyzed, it should be rolled up:
ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   int
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) tablespace users,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) tablespace tools,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t select to_date( '12-mar-2003', 'dd-mon-yyyy' ) + mod(rownum,3), 1, 1 from all_users;
 
22 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'PART1', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'PART2', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'JUNK', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                      22
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
PART1                                   7
PART2                                   8
JUNK                                    7
 
ops$tkyte@ORA9IR2>
 
 
 
 
 
Peter, December  31, 2004 - 7:32 am UTC
 
 
Hi Tom,
the setup of our tables is similar in so far as our main tables are partitioned by a 'rolling time window' on a date column, whereby one partition is dropped and a new partition is added every day. Data is then loaded into the new partition during the day. The data in the different partitions is fairly similar.
For the different queries run against them, the CBO does in some cases know which partition to hit, but in many cases it doesn't. Therefore we need statistics at the partition level (especially for the new partition once data has been loaded into it), but we need global statistics as well.
Gathering statistics at the global table level takes too long to run on a daily basis (up to 20 hours, depending on the workload).
My question now is, will it be sufficient to gather stats for the new partition only after the data load (this is fast), relying on global statistics which are gathered only on a monthly basis?
Many thanks for your advice, and I wish you a brilliant New Year. 
 
June      23, 2011 - 7:46 am UTC 
 
only you can answer the last question, for it takes knowledge of your data to properly answer that.
the largest problems will center around column statistics -- high and low values.  If you have monotonically increasing values and the global column stats are way out of whack -- the optimizer might think it's:
a) going to get tons and tons of rows when in fact it'll get none.
b) going to get no rows when in fact it'll get tons and tons.
those will mostly be the cases you need to worry about. 
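For the high/low value problem specifically, one possible workaround is to push the global high value forward after each load instead of re-gathering everything. This is a sketch only -- the table T, date column DT, and the chosen dates are assumed names for illustration, not anything from the poster's schema:

```sql
-- Sketch: nudge the global high value of a monotonically increasing
-- column (assumed table T, column DT) without a full re-gather.
declare
    l_srec    dbms_stats.statrec;
    l_distcnt number;
    l_density number;
    l_nullcnt number;
    l_avgclen number;
    l_dates   dbms_stats.datearray;
begin
    -- read the current column stats so only the min/max change
    dbms_stats.get_column_stats
    ( ownname => user, tabname => 'T', colname => 'DT',
      distcnt => l_distcnt, density => l_density,
      nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen );

    -- new low/high: keep the (illustrative) old low, push the high to "now"
    l_dates := dbms_stats.datearray
               ( to_date('12-mar-2003','dd-mon-yyyy'), sysdate );
    dbms_stats.prepare_column_values( l_srec, l_dates );

    dbms_stats.set_column_stats
    ( ownname => user, tabname => 'T', colname => 'DT',
      distcnt => l_distcnt, density => l_density,
      nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen );
end;
/
```

The idea is the same one raised elsewhere in this thread: the load process has already *seen* the data, so it can tell the dictionary what the new high value is far more cheaply than a gather can.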
 
 
 
A reader, January   02, 2005 - 2:26 pm UTC
 
 
Thanks Tom.
 
 
 
Roll up - partition level stats
A reader, January   02, 2005 - 4:54 pm UTC
 
 
Hi Tom,
We don't gather stats for all partitions.  We gather stats for a partition as we move forward from month to month.  Initially, we created 24 monthly partitions, and to start with we gathered stats for the first partition.  For the next month's partition stats, we use min(partition_position) where num_rows is null, and gather stats through a DB procedure.  With this approach, as there are partitions with null stats, the partition level stats are ***not*** rolled up.  Given the way you demonstrated it with your example, do we have to change our mechanism for gathering the partition level stats in order for them to roll up?
Thanks 
 
January   02, 2005 - 8:00 pm UTC 
 
suggestion:  gathering stats on empty partitions is really fast -- so do it? 
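A sketch of that suggestion, using an assumed table name T: loop over every partition that has never been analyzed and gather stats on it -- the empty ones take almost no time:

```sql
-- Sketch: gather stats on every partition of an assumed table T
-- that has never been analyzed (num_rows is null).
begin
    for p in ( select partition_name
                 from user_tab_partitions
                where table_name = 'T'
                  and num_rows is null )
    loop
        dbms_stats.gather_table_stats
        ( ownname     => user,
          tabname     => 'T',
          partname    => p.partition_name,
          granularity => 'PARTITION' );
    end loop;
end;
/
```

Once every partition has stats, the rollup to the table level happens as shown in the demonstration above.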
 
 
 
(start != stop) and (start is not a number) -- will partition elimination work?
A reader, January   02, 2005 - 9:40 pm UTC
 
 
If global stats are used, will partition elimination still be done?
If global stats are used, will the query have to go through all 90 partitions, or will it identify the partition range and go only to the corresponding partitions?  Which is true here?
Simply put: when (start != stop) or (start is not a number), does partition elimination still work?
Thanks a lot 
 
January   02, 2005 - 10:42 pm UTC 
 
yes, absolutely (that was sort of the crux of the answer -- when a single KNOWN partition is identified -- pstart = pstop and pstart is a NUMBER -- then local stats, else if we just KNOW a partition will be accessed but not which one..... we use global)
when you see "KEY KEY" in the plan -- partition elimination is happening, but we cannot use a single partition's statistics (we don't know which one)
when you see "3 9" in the plan -- we are using just partitions 3 through 9 (out of thousands perhaps) but we cannot use an individual partition's statistics.....
 
 
 
 
A reader, January   03, 2005 - 12:45 am UTC
 
 
fantastic reply ..
Lets take your above plan
-------------------------------------------------------------------------------
| Id  | Operation            |  Name  | Rows  | Bytes | Cost  | Pstart| Pstop |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |    12 |   264 |     2 |       |       |
|   1 |  PARTITION RANGE ALL |        |       |       |       |     1 |     3 |
|*  2 |   TABLE ACCESS FULL  | T      |    12 |   264 |     2 |     1 |     3 |
-------------------------------------------------------------------------------
there are only 3 partitions and it's doing range all, partitions 1 to 3. In this case, it will use the global stats and does partition elimination. 
So, partition elimination is occurring but the partition stats are not used. 
Since it's using the global stats, if an index is used will it use the global (prefixed) index on the table or the local index?
I have a similar situation in my case here. I don't see the local index in my plan at all. I feel that, since it has used the global stats, it could not use the local index - am I right?
Thanks for your time 
 
January   03, 2005 - 8:19 am UTC 
 
well, given that there are only 3 partitions
and it says "pstart = 1, pstop = 3", we didn't really have the ability to eliminate anything.
either index could be used -- IF you submit a query that could use an index, which we did not here.
global stats are "TABLE stats", don't confuse them -- just because we use 'TABLE STATS' (not partition stats) does not preclude a GLOBAL index, or a LOCAL index; either one is possible.
that we used "global table stats" has no bearing on the selection or exclusion of a certain class of indices. 
 
 
 
Alberto Dell'Era, January   03, 2005 - 8:24 am UTC
 
 
>when you see "KEY KEY" in the plan -- partition elimination is happening, 
>but we cannot use a single partition statistics (we don't know which one)
Not true in 9.2.0.6 with bind variable peeking, if I'm not mistaken:
create table tb (
  key int,
  x   int
)
partition by range (key)
(
  partition p_oneval   values less than (10),
  partition p_distinct values less than (99999999)
);
-- insert 10,000 identical values in p_oneval
insert into tb partition (p_oneval) (key, x) 
select 1        , rownum 
  from all_objects where rownum <= 10000;
-- insert 10,000 different values in p_distinct
insert into tb partition (p_distinct) (key, x) 
select 10+rownum-1, rownum 
  from all_objects where rownum <= 10000;
-- a local index on key
-- it is very selective on p_distinct,
-- not selective at all on p_oneval
create index tb_key on tb (key) LOCAL;
-- gather stats (global + local)
exec dbms_stats.gather_table_stats (user, 'tb', cascade=>true);
-- same statement, different bind values:
alter session set sql_trace=true;
variable key number
exec :key := 1; -- hits partition p_oneval
select /* key=1 */ x from tb where  key = :key;
Rows     Row Source Operation
-------  ---------------------------------------------------
  10000  PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=690 r=0 w=0 time=66914 us)
  10000   TABLE ACCESS FULL TB PARTITION: KEY KEY (cr=690 r=0 w=0 time=37510 us)
exec :key := 100; -- hits partition p_distinct
select /* key=100 */ x from tb where  key = :key;
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=4 r=0 w=0 time=137 us)
      1   TABLE ACCESS BY LOCAL INDEX ROWID OBJ#(97233) PARTITION: KEY KEY (cr=4 r=0 w=0 time=117 us)
      1    INDEX RANGE SCAN OBJ#(97236) PARTITION: KEY KEY (cr=3 r=0 w=0 time=83 us)(object id 97236)
Different plans for different bind values => bind variable peeking 
perfect plan for the selected partition => partition stats are used
Plans fetched from v$sql_plan show the same thing; explain plan (wrongly, imho) always shows the second plan. 
 
 
what each one stands for ?
A reader, January   04, 2005 - 8:15 am UTC
 
 
Tom,
Can you please give me a link or tell me what the below means in explain plan.
PARTITION RANGE ITERATOR PARTITION: KEY KEY 
PARTITION RANGE ALL PARTITION: KEY KEY 
PARTITION RANGE SINGLE PARTITION: KEY KEY 
I am getting any of the above in my tkprof.  It is not clear what each one stands for - requesting your inputs, please.
Thanks a lot
 
 
January   04, 2005 - 8:47 am UTC 
 
range iterator -- we are going to ITERATE over a range of partitions, we know that we'll hit 0, 1 or more but won't know until query execution time how many.
range all -- we are going to hit em all
range single -- we are going to hit one of them.
ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   int
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) tablespace users,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) tablespace tools,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> variable x varchar2(30)
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where dt between to_date(:x) and to_date(:x)+1;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=35)
   1    0   FILTER
   2    1     PARTITION RANGE (ITERATOR)
   3    2       TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=35)
 
<b>the between on DT (partition key) lets us partition eliminate, but we don't know WHICH ONES - we'll hit 0, 1 or more partitions when we execute this query with different :x values</b>
 
ops$tkyte@ORA9IR2> select * from t where dt = to_date(:x);
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=35)
   1    0   PARTITION RANGE (SINGLE)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=35)
 
<b>that knows "exactly ONE partition" will be hit</b>
 
ops$tkyte@ORA9IR2> select * from t where x > 5;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=12 Bytes=420)
   1    0   PARTITION RANGE (ALL)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=12 Bytes=420)
 
 
 
ops$tkyte@ORA9IR2> set autotrace off
<b>that knows "all partitions will be hit" since x>5 could be in any and all - we need to look at them all</b>
 
 
 
 
 
Thank you !!!
A reader, January   04, 2005 - 9:46 am UTC
 
 
 
 
no chance of partition elimination --> PARTITION RANGE (ALL)?
A reader, January   04, 2005 - 9:52 am UTC
 
 
Tom,
So, when i see PARTITION RANGE (ALL), it means there is no chance of partition elimination - right?
Thanks
 
 
January   05, 2005 - 8:04 am UTC 
 
it means it is iterating over each partition, all will be peeked at. 
 
 
 
Empty partition Stats
A reader, January   04, 2005 - 3:44 pm UTC
 
 
Hi Tom,
Regarding:
suggestion:  gathering stats on empty partitions is really fast -- so do it? 
________________
That is fine, but we have this logic: once the monthly load is finished, get the name of the partition with min(partition_position) where num_rows is null, and then analyze that partition. Is there any way to get the name of the partition that needs to be analyzed - the one the data has just been loaded into? We have a control table which holds a key like 200412, on which the partitions are based, i.e.,
tables are partitioned as 
table a
partition a1 values < 200411
partition a2 values < 200412
partition a3 values < 200501
......
...
and control table keeps only one record based upon the month key just processed.
Regards
   
 
January   05, 2005 - 9:03 am UTC 
 
not following you -- all I'm saying is have the empty partitions analyzed -- analyze all partitions that haven't been analyzed.
you can still use whatever logic you like right now to analyze the partition you just loaded into - I'm just saying "make sure all partitions have stats" 
 
 
 
Alberto Dell'Era, January   06, 2005 - 5:12 pm UTC
 
 
January   06, 2005 - 7:24 pm UTC 
 
oh, the only reason I didn't comment on your correct observations is because it looked so much like the way I would have proved to you that it worked that way!
I can honestly say -- every day, each and every day (sat/sun included) I learn something new about Oracle myself.  That was one of them. 
 
 
 
Alberto Dell'Era, January   06, 2005 - 7:40 pm UTC
 
 
>oh, the only reason I didn't comment on your correct observations is because 
>it looked so much like the way I would have proved to you that it worked that 
>way!
Looks like you have shaped my mind in some way ;)
(actually the investigation method itself is the most important thing I've learnt from you)
Many thanks! 
 
 
ora 6550 and ora 922 during import  February 25, 2003 
Randal Singh, January   07, 2005 - 7:43 am UTC
 
 
I would rather suggest a different approach to the problem which I was facing during the import of a table. 
The suggestion you had provided worked well, but I would suggest an alternative if you have two machines (M1, M2) running different Oracle 9i releases, e.g. M1 with Oracle 9i Release 9.1 and M2 with Oracle 9i Release 9.2. If you take an export of a table t1 from M2 and want to import it into M1, you will hit error ORA-00922. The solution is: in the tnsnames.ora of M1, add an entry to connect to the M2 machine (the way you do for a distributed database). Staying on the M1 machine, take the export of the t1 table from the M2 machine, i.e. from the Oracle 9i Release 9.2 database, with this command (exp scott/tiger@m2 file=abc.dmp grants=n log=abc.log). After the data is exported, import it from the M1 machine; this will solve the problem.  
 
January   07, 2005 - 9:25 am UTC 
 
huh?  which part of this thread are you referring to? 
 
 
 
global statistics 
steve, January   11, 2005 - 5:01 pm UTC
 
 
Hi Tom,
a. Regarding Peter's post above about the daily rolling window,
   we have the same situation. 4 months of data, partitioned
   by day, every day one partition is marked as stale.... 
   Like Peter, we cannot collect statistics on the whole 
   table every day (too long). The question is, when do we need 
   to refresh the global statistics?
   I understood your answer, but it doesn't help in deciding 
   when to collect global statistics. Is there any way to 
   determine if the 'global statistics' have gone stale?
b. When you execute:
   exec dbms_stats.gather_schema_stats('TEST', 
                                        estimate_percent=>5, 
                                        options=>'gather stale', 
                                        cascade=>true, 
                                        granularity=>'PARTITION'
                                      );
    By specifying    
     granularity=>'PARTITION'  
    does that only gather statistics for partitioned tables and 
    ignore non partitioned tables? I know the parameter is only pertinent
    to partitioned tables, but does specifying it ignore nonpartitioned
    tables? It seems to. Because, prior to running it, I listed the 
    stale objects and got:
    TABLE ==> PP_TRANSACTIONS partition(CD_TRANS_13_29)
    TABLE ==> T partition()
  And after running it I got:
    TABLE ==> T partition()
  If this is true, that would mean we would need 2 passes, one with 
  granularity=>'PARTITION'  and one without in order to analyze a schema?
Thanks Tom.
Steve
 
 
January   11, 2005 - 7:04 pm UTC 
 
a) you know better than I, you actually know the data.
globally -- things like num rows -- they can be rolled up pretty accurately.  Things like "number of distinct values in a column" -- well, they cannot.
So, you'd be the best at knowing whether that presents an issue for you or not.
You know, will you get bit by things like this:
ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   int
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) tablespace users,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) tablespace tools,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t select to_date( '12-mar-2003', 'dd-mon-yyyy'), rownum, rownum from all_objects;
 
27816 rows created.
 
ops$tkyte@ORA9IR2> insert into t select to_date( '13-mar-2003', 'dd-mon-yyyy'), rownum, rownum from all_objects;
 
27816 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where x > 30000;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28 Card=2 Bytes=34)
   1    0   PARTITION RANGE (ALL)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=28 Card=2 Bytes=34)
 
 
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t select to_date( '15-mar-2003', 'dd-mon-yyyy'), 30000+rownum, rownum from all_objects;
 
27816 rows created.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'JUNK', GRANULARITY => 'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where x > 30000;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=28 Card=2 Bytes=34)
   1    0   PARTITION RANGE (ALL)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=28 Card=2 Bytes=34)
 
<b>whoops, didn't know about the new ID's we loaded up last week at the global level</b> 
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select * from t where x > 30000;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=40149 Bytes=682533)
   1    0   PARTITION RANGE (ALL)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=42 Card=40149 Bytes=682533)
 
 
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
<b>of course, if you say "yes -- there are two or three columns like that", there is always "dbms_stats.SET......"; your loader (which has *seen* the data) can in fact tell us things...</b>
b) well done, exact test I would have set up :) (and was getting ready to -- but then read that you had already done it...)
you would need to run it twice if you said "partition" the first time.  Worst case with two runs -- the second one would have nothing to do, so it is rather harmless to run two.
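The two-pass approach from the question could then look like this (a sketch; the schema name TEST and the parameters are taken from the question above):

```sql
-- Pass 1: stale objects at partition granularity (partitioned tables).
-- Pass 2: default granularity picks up whatever is left --
--         worst case, it finds nothing to do.
begin
    dbms_stats.gather_schema_stats
    ( ownname          => 'TEST',
      estimate_percent => 5,
      options          => 'gather stale',
      cascade          => true,
      granularity      => 'PARTITION' );

    dbms_stats.gather_schema_stats
    ( ownname          => 'TEST',
      estimate_percent => 5,
      options          => 'gather stale',
      cascade          => true );
end;
/
```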
 
 
 
 
 
What about  statistics when exchanging partitions?
Seth, January   11, 2005 - 5:49 pm UTC
 
 
Could you please complement your previous explanation with a discussion of what happens to statistics when exchanging partitions into a table?  I've been struggling with this for a while.  I find that in those cases, statistics do not "roll up" into the next level, and I would like to find a way to avoid re-analyzing what are often large and numerous partitions in our warehouse.
 
To illustrate, I initially create a range-hash composite partitioned table CLAIM with one initial range and then gather stats with granularity=>'all' and cascade=>true.  This table has a few local indexes defined and no global indexes.  I then add a partition P_PLAN1 and analyze it with granularity=>'PARTITION' and cascade.  For discussion purposes, I'll call the partition column plan_id and the hash column pat_id, and the first defined partition has range plan_id < 1, and the second added partition has range plan_id < 2.  (BTW, version is 9.2.0.5 on Solaris 9.)
 
So far so good: I see analysis dates (and rows=0) at all levels - global, partition, subpartition (from the views dba_tables, dba_tab_partitions, dba_tab_subpartitions, and the corresponding views for indexes).
 
I now create a table PLAN1 designed to be exchanged in.  It is hashed on pat_id (#partitions in PLAN1 = #subpartitions in CLAIM), and the value of plan_id in every row will be '1'.  I then populate PLAN1 with sqlldr and analyze with granularity=>'ALL' and cascade=>true.
 
I then exchange PLAN1 in into CLAIM:
 ALTER table CLAIM
   EXCHANGE partition P_PLAN1
     WITH table PLAN1
     INCLUDING indexes  WITHOUT validation
 
The data from PLAN1 is now available in CLAIM.  However, when I look at the stats through dba views, I find the following:
- dba_tab_subpartitions and dba_ind_subpartitions do reflect the stats from PLAN1 correctly
- dba_tab_partitions, dba_ind_partitions, dba_tables, and dba_indexes do NOT show any change (they still show the original dates and rows=0 from prior to the exchange).
This has obviously been a problem for the CBO when we perform queries, and our only solution has been to re-analyze after the exchange.  I'm hoping you have some insight into this and can suggest a more efficient solution that is not in the documentation.
Incidentally, what happens to column statistics in my example?  Do they "rollup" as well?  What about in your example?
Many Thanks,
-- Seth
 
 
January   11, 2005 - 7:26 pm UTC 
 
exchanging partitions will preserve the partition level stats just fine.
global stats, however, will be affected in the same fashion as above -- column distributions are not easily derived from partition level information. 
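When re-gathering after the exchange, a lighter-weight option is to refresh only the table level, since the partition and subpartition stats came across with the exchange. This is a sketch, reusing the CLAIM table name from the question; the sample percentage is illustrative:

```sql
-- Sketch: after an exchange, refresh only the table-level (global)
-- stats instead of re-analyzing every partition of CLAIM.
begin
    dbms_stats.gather_table_stats
    ( ownname          => user,
      tabname          => 'CLAIM',
      granularity      => 'GLOBAL',
      estimate_percent => 5 );
end;
/
```

This still scans the table, but it skips the per-partition and per-subpartition passes, which is where most of the re-analysis time goes.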
 
 
 
dynamic_sampling
Rob, January   11, 2005 - 8:42 pm UTC
 
 
Tom:
If I have a partitioned table but only global stats, no partition stats, and the query specifies a single value for the partition key in the predicate, will the global stats be used or will dynamic_sampling take place assuming the query has the /*+ dynamic_sampling(3) */ hint? 
Thanks,
Rob 
 
January   12, 2005 - 8:16 am UTC 
 
it has statistics -- having no partition stats does not make the table "unanalyzed".
I tried a couple of variants -- joins, complex predicates.  I could see it dynamically sample the table being joined to -- but not the individual partition.
even when there are no stats on the partitioned table at all, the dynamic sampling query would simply include the predicate -- which has the "where partition_column = CONSTANT" -- meaning, it would not sample a "partition" so much as the sampling query itself would use partition elimination to sample just a single partition. 
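For reference, the hinted query shape being discussed would look something like this (table T and its partition key DT are placeholder names, carried over from the earlier examples in this thread):

```sql
-- Sketch: dynamic sampling hint on a query against a partitioned table.
-- If sampling kicks in, the recursive sampling query inherits the
-- "dt = constant" predicate and so partition-eliminates itself.
select /*+ dynamic_sampling(3) */ *
  from t
 where dt = to_date( '13-mar-2003', 'dd-mon-yyyy' );
```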
 
 
 
A couple of additional questions 
Rob, January   12, 2005 - 9:10 am UTC
 
 
Tom:
As a followup to your answer I have a couple of additional questions.
1) How can you tell that dynamic sampling is taking place?
2) Any way to force dynamic_sampling in a query even if the table does have stats. 
Rob 
 
January   12, 2005 - 9:55 am UTC 
 
1) sql_trace=true, you'll see the sql in the tkprof.
2) it only kicks in when it needs stats; if it finds all of the stats it needs in the dictionary, it will not dynamically sample.  
 
 
 
 
Thanks
steve mckee, January   12, 2005 - 9:16 am UTC
 
 
Thank you very much Tom for answering my question above.
Cheers.
Steve 
 
 
What about statistics when exchanging partitions? 
Seth, January   12, 2005 - 3:46 pm UTC
 
 
Um... Well, my experience doesn't bear that out.  Here is a log of the sequence of steps you can reproduce:
 
drop table many_parts;
drop table part1;
 
column last_analyzed format A18
column table_name format A12
column part_name format A12
column subpart_name format A12
 
CREATE or REPLACE VIEW VIEW_LEVELS   /* for diagnostic purposes */
AS
  select  table_name, 'TABLE' as llevel, null as part_name, null as subpart_name, num_rows, last_analyzed
  from user_tables
union ALL
  select  table_name, 'PARTITION' as llevel, PARTITION_NAME, null as subpart_name, num_rows, last_analyzed
  from user_tab_partitions
union ALL
  select  table_name, 'SUBPARTITION' as llevel, PARTITION_NAME, SUBPARTITION_NAME, num_rows, last_analyzed
  from user_tab_subpartitions
/
 
 
-- Main table
 
CREATE TABLE MANY_PARTS
  PARTITION BY RANGE (PLAN_ID)
  SUBPARTITION BY HASH (PAT_ID)
    SUBPARTITIONS 2
  (PARTITION P_PART0 VALUES LESS THAN (1)
  )
AS
   SELECT 0 as PLAN_ID,
          ROWNUM as PAT_ID,
          DO.*
   FROM DBA_OBJECTS DO
   WHERE OWNER = 'SYS'
/
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                             
MANY_PARTS   PARTITION    P_PART0                              
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217            
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218            
 
 
begin
  dbms_stats.gather_table_stats(user, tabname=>'MANY_PARTS',granularity=>'ALL');
end;
/
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
 
 
-- Table to be exchanged in 
 
CREATE TABLE PART1
  PARTITION BY HASH (PAT_ID)
    PARTITIONS 2
AS
   SELECT 1 as PLAN_ID,
          ROWNUM as PAT_ID,
          DO.*
   FROM DBA_OBJECTS DO
   WHERE OWNER = 'SYS'
/
 
 
select * from view_levels where table_name = 'PART1';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
PART1        TABLE                                             
PART1        PARTITION    SYS_P4219                            
PART1        PARTITION    SYS_P4220                            
begin
  dbms_stats.gather_table_stats(user, tabname=>'PART1',granularity=>'ALL');
end;
/
 
 
select * from view_levels where table_name = 'PART1';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
PART1        TABLE                                        4043 
PART1        PARTITION    SYS_P4219                       2037 
PART1        PARTITION    SYS_P4220                       2006 
 
 
ALTER TABLE MANY_PARTS
  ADD PARTITION P_PART1   VALUES LESS THAN (2)
/
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   PARTITION    P_PART1                              
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4221            
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4222            
 
 
begin
  dbms_stats.gather_table_stats(user, tabname=>'MANY_PARTS',granularity=>'SUBPARTITION', partname=>'P_PART1');
end;
/
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   PARTITION    P_PART1                            0 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4221          0 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4222          0 
 
     Note: So far, all this makes sense
 
 
ALTER TABLE MANY_PARTS
  EXCHANGE partition P_PART1
    WITH table PART1
    INCLUDING indexes  WITHOUT validation
/
 
 
 
SELECT COUNT(*) FROM MANY_PARTS;
 
  COUNT(*)
----------
      8086
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   PARTITION    P_PART1                            0 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4221       2037 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4222       2006 
 
 
   Note above: Exchanging the partition does not alter num_rows for partition P_PART1, and the table level does not reflect the correct total either (should be 8086).
               
 
 
select * from view_levels where table_name = 'PART1';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
PART1        TABLE                                        4043 
PART1        PARTITION    SYS_P4219                          0 
PART1        PARTITION    SYS_P4220                          0 
 
 
  Note above:  Table level does not reflect the fact that there are no rows anymore.
 
 
begin
  dbms_stats.gather_table_stats(user, tabname=>'MANY_PARTS',granularity=>'SUBPARTITION', partname=>'P_PART1');
end;
/
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   PARTITION    P_PART1                         4043 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4221       2037 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4222       2006 
 
 
  Note above: Just to see what would happen, I re-analyzed P_PART1 at the subpartition level. The partition-level row count is now correct, but the table level is not.
 
 
begin
  dbms_stats.gather_table_stats(user, tabname=>'MANY_PARTS',granularity=>'PARTITION', partname=>'P_PART1');
end;
/
 
 
select * from view_levels where table_name = 'MANY_PARTS';
 
TABLE_NAME   LLEVEL       PART_NAME    SUBPART_NAME   NUM_ROWS 
------------ ------------ ------------ ------------ ---------- 
MANY_PARTS   TABLE                                        4043 
MANY_PARTS   PARTITION    P_PART0                         4043 
MANY_PARTS   PARTITION    P_PART1                         4043 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4217       2037 
MANY_PARTS   SUBPARTITION P_PART0      SYS_SUBP4218       2006 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4221       2037 
MANY_PARTS   SUBPARTITION P_PART1      SYS_SUBP4222       2006 
 
  Note above:  So now I re-analyze at the partition level in hopes of seeing an effect at the table level. No cigar.
 
 
So my questions are:
 
- Is this the correct functionality, or am I missing something?
 
- Is it unreasonable for me to expect stats to get "rolled up" and "rolled down" automatically as I exchange partitions in and out?
 
- Is this perhaps a bug? 
 
In any event, it appears that if I want correct num_rows at the table level, I have no choice but to re-analyze the entire table at the global level (at a minimum).  With extremely large tables, this seems like an undesirable result.  Any suggested workarounds?
Thanks,
-- Seth
 
 
 
January   12, 2005 - 5:39 pm UTC 
 
yes, it is not maintaining the global stats (partition level stats are fine, global stats are not)
short of analyzing the entire table -- we could use dbms_stats.set_table_stats, eg:
ops$tkyte@ORA9IR2> select * from v;
 
WHAT                           TABLE_NAME     NUM_ROWS
------------------------------ ------------ ----------
TABLE                          T                    24
TABLE                          T_NEW                40
PART                           PART1                 8
PART                           PART2                 8
PART                           JUNK                  8
<b>part1, part2, junk belong to T</b>
 
ops$tkyte@ORA9IR2> alter table t
  2  exchange partition junk with table t_new;
 
Table altered.
 
ops$tkyte@ORA9IR2> select * from v;
 
WHAT                           TABLE_NAME     NUM_ROWS
------------------------------ ------------ ----------
TABLE                          T                    24
TABLE                          T_NEW                 8
PART                           PART1                 8
PART                           PART2                 8
PART                           JUNK                 40
 
ops$tkyte@ORA9IR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 56 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select * from v;
 
WHAT                           TABLE_NAME     NUM_ROWS
------------------------------ ------------ ----------
TABLE                          T                    56
TABLE                          T_NEW                 8
PART                           PART1                 8
PART                           PART2                 8
PART                           JUNK                 40
 
<b>and do that for num blocks and other statistics you want to update.  things that cannot be rolled up -- "distinct values" -- will be problematic</b>
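The roll-up Tom describes can be scripted. A minimal sketch (assuming a range-partitioned table T whose partition-level stats are current; summing works only for additive statistics like row and block counts, which is why "distinct values" remain problematic):

```sql
-- Roll partition-level stats up to the table (global) level.
-- Only additive statistics (num_rows, blocks) can be summed this way;
-- distinct values and histograms cannot be rolled up like this.
declare
  l_rows   number;
  l_blocks number;
begin
  select sum(num_rows), sum(blocks)
    into l_rows, l_blocks
    from user_tab_partitions
   where table_name = 'T';

  dbms_stats.set_table_stats( ownname => user,
                              tabname => 'T',
                              numrows => l_rows,
                              numblks => l_blocks );
end;
/
```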
 
 
 
 
 
Effect of gather stale on partition level stats
Rob, February  01, 2005 - 9:35 am UTC
 
 
Tom:
If I am using dbms_stats with the gather stale option and table and index monitoring is turned on, what happens in the following scenario:
A new partition is created and loaded with data. The amount of data in the new partition is less than 10% of the total in the table. Will gather stale cause this table to be skipped when I analyze, thus resulting in no partition level stats for the new partition and no update of the global stats?
 
 
 
What is rowid |row ?
A reader, February  03, 2005 - 11:31 pm UTC
 
 
What does the plan below mean? I have pasted only the partitioned-table portion of the explain plan.
It looks like it's using global stats and not partition stats.
What is the meaning of ROWID | ROW there?
----------------------------------------------------------------------------------------------------
| Id  | Operation                  |  Name               | Rows  | Bytes | Cost  | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
PARTITION RANGE ITERATOR           |                     |       |       |       |   KEY |   KEY |
 TABLE ACCESS BY GLOBAL INDEX ROWID| my_partition_tab    |  2034 | 53186 |     2 | ROWID | ROW 
  INDEX RANGE SCAN                 | my_partiton_index   |  3537 |       |     1 |   KEY |   KEY |
Thanx for all your help. 
 
February  04, 2005 - 2:06 am UTC 
 
it is saying "we will do partition elimination at run time"
we don't know right now what partitions we need (KEY, KEY); we'll need anywhere from 0 to all of them.  the inputs to the query will tell us what we need ultimately.
so, this plan is doing N index range scans (N will be determined when the inputs are supplied and it can figure out the lo and hi partitions)
For each hit in the index, we'll do a table access by index rowid.
it uses global stats since it does not know at optimize time what partitions will be used (and partition stats are used when a single partition is identified to be used and that partition is known) 
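To see the KEY/KEY behavior yourself, a sketch (assuming the range-partitioned table T from earlier in the thread; the bind variable is the key point, since it prevents the optimizer from knowing the partition boundaries at parse time):

```sql
-- A bind variable in the partitioning predicate forces runtime pruning:
-- the plan should show PARTITION RANGE ITERATOR with Pstart = KEY and
-- Pstop = KEY, and global stats are used.
variable b1 varchar2(20)

explain plan for
select * from t where dt >= to_date(:b1, 'dd-mon-yyyy');

select * from table(dbms_xplan.display);
```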
 
 
 
Rob, try it and see
Peter, February  04, 2005 - 4:04 am UTC
 
 
We are just about to change our stats collection method to "gather stale" schema stats. On our TEST system (you must have one?) we ran a set of tests involving our batch data load, inspecting last_analyzed from user_tables and user_tab_partitions (and just for fun looked at how user_tab_modifications changed before and after the data load and also after the gather stats). Of course we also timed the runs!
Well... we were happy with what we saw! 
 
 
What is better for performance and querying and DMLs ?
A reader, February  05, 2005 - 2:56 am UTC
 
 
1) Having a range-partitioned table but all partitions (say, 20 in total) in just one tablespace
or
2) Having a partitioned table with all partitions in different tablespaces
And the same argument for the partitioned index on the above partitioned table.
Thanks. 
 
February  05, 2005 - 5:34 am UTC 
 
trick question.
tablespaces aren't about performance.
tablespaces ARE about administration.
If you use SAME (stripe and mirror everything) or ASM (new 10g feature, a database file system), then it really just "doesn't matter" if you have 1 or 100 tablespaces - they are all sharing the same disks.
Even if you aren't doing that (SAME or ASM), the same thing applies.  A single tablespace with files from disk1, disk2, disk3, disk4, .... diskN will "stripe" naturally extent by extent.
In fact, if you put a tablespace per disk and put each partition on its own disk, that could be lots worse than striping over everything - if you have a popular partition (say the current data), that disk would be really hot.  If you striped -- all disks would be doing IO.
tablespaces are not about performance.
tablespaces are about ease of administration.
Say these range-partitioned tables are DW tables.  Of the 20 partitions -- 19 are "read only" effectively (all of their data has been loaded).  You could (as an admin) alter them read only, back them up for the last time and forget about them.  So, putting them into separate tablespaces would make sense for that reason.
So, think about this from an admin perspective. 
 
 
 
Thanx for above Partiton answer But Q
A reader, February  05, 2005 - 11:40 am UTC
 
 
It is always said to use separate tablespaces for indexes and their tables.
Can I infer from the above that this is just for ease of administration, with no performance gains during I/O or index updates?
Thanks 
 
February  06, 2005 - 3:04 am UTC 
 
think about it.
what is a tablespace?
a bunch of files.  nothing more, nothing less.  
You can achieve precisely the same IO distribution over all disks using
a) one tablespace
b) hundreds of tablespaces
in fact, you'll find it perhaps easier with a) and some nice striping of disks.
It is a myth (with some historical foundations in facts from the late 1980's, before striping, back when a 4 disk system "was really big", when you had schemas with 5 or 10 tables for an entire application) that you separate indexes and data for performance.
 
 
 
 
Missing link
SFT, February  07, 2005 - 3:54 am UTC
 
 
Hi Tom,
The link to this thread from the OTN start page [surprisingly] points to Chris Chalk's J2EE Weblog. That's a bit confusing, and the number of visitors to your site may drop by a few dozen unfortunates who could not figure out how to find it. That's not a problem for you, but for the poor creatures out there ... :) 
 
 
Please help!
Sandra Johns, February  10, 2005 - 11:41 pm UTC
 
 
I know that this is not the correct area for my question; BUT, I really need help with this one!
I am getting an:
ORA-03116 invalid buffer length passed to a conversion routine
It happened in a Pro*COBOL program during a FETCH of a cursor, while other routines were accessing this table using a SELECT. This table is accessed using PROC ACCESS views. The OS is HP-UX with an Oracle 9i database.
Thank you so much! 
 
February  11, 2005 - 7:55 pm UTC 
 
please contact support, i've no experience whatsoever with pro*cobol. 
 
 
 
What about statistics when exchanging partitions? 
Seth Goodman, March     15, 2005 - 10:03 am UTC
 
 
Sorry for posting again.  It didn't seem to take yesterday.
Tom,
I wanted to thank you for your help (see Jan 12th) and share my experience with others.  It's taken me a while to get to this point.  This was my first time dealing with many of the procedures required in the solution.
 
One question first:  What does the 'no_invalidate' parameter do, as found in various dbms_stats procedures like 'set_table_stats'?
 
My example above was used just to illustrate the problem.  My master database in fact has 5 tables, 4 of them range-hash, and the other just range.  All of them have somewhere between 2 to 11 local indexes.
 
Before an exchange, I have to add a partition or split an existing partition in the master table.  Adding is easy.  Splitting is a headache because of dealing with the TS for the indexes.  Although Oracle was good about defining the TS for the tables, it kept the indexes in the same TS as the one defined for the existing partition.  Nothing I could do would change that (it's not clear if this is correct behaviour).  I was forced to REBUILD the indexes and define the TS I wanted at that time (I'm using the same TS as the table).  Fortunately the new partition is empty, so it doesn't take long to execute, but it required a fair amount of PL/SQL to build the statements and execute immediate.  Maybe someone can suggest a better built-in method...
 
The script I use for exchanging the partitions ended up having to do the following:
1) save the all current stats using export_table_stats (w/cascade) (table PART1)
2) edit the exported stats to match partition/subpartition and other criteria for importing into the master table (for Table, Index, and Column)
3) exchange the partition (PART1 <--> MANY_PARTS)
4) import the partition level stats using import_table_stats (w/cascade) (into MANY_PARTS)
5) Calculate global table stats using the already-exchanged partitions and set_table_stats.  Obviously, this couldn't be done for column stats.
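A skeleton of those steps in SQL*Plus / PL/SQL (a sketch only; the stat-table name STATS_STAGE is an assumption, and the editing of exported rows in step 2 is schema-specific and not shown):

```sql
-- 1) export the current stats of the table to be exchanged in
exec dbms_stats.create_stat_table( user, 'STATS_STAGE' );
exec dbms_stats.export_table_stats( user, 'PART1', stattab => 'STATS_STAGE', cascade => true );

-- 2) edit the exported rows in STATS_STAGE so they reference the target
--    partition/subpartition names (schema-specific updates, not shown)

-- 3) exchange the partition
alter table many_parts exchange partition p_part1
  with table part1 including indexes without validation;

-- 4) import the edited stats at the partition level
exec dbms_stats.import_table_stats( user, 'MANY_PARTS', partname => 'P_PART1', -
  stattab => 'STATS_STAGE', cascade => true );

-- 5) recompute global num_rows/num_blocks from the partition stats and
--    apply them with dbms_stats.set_table_stats / set_index_stats
```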
 
When dealing with range-hash stats, I only needed to deal with the global-to-partition-level exchange. Using my example, the stats in the partitions of PART1 were automatically exchanged into the subpartitions of MANY_PARTS.  The exercise described above was only needed to exchange the global stats of PART1 into the partition level of MANY_PARTS.  This is unnecessary if PART1 is not partitioned, in which case the global stats automatically exchange into the partition level of MANY_PARTS.
 
I had to calculate the global row stats individually for each object exchanged (tables and indexes).  This could be done by re-analyzing MANY_PARTS after each exchange, but this is time consuming.  So I use set_table_stats (as per Tom's suggestion) and set_index_stats after calculating the new number of rows and blocks etc. using the partition stats.  Eventually, the tables will need to be re-analyzed every so often, but this has worked ok so far.  Query performance was crappy at first, until I did my first analyze after about 6 partitions were exchanged: the CBO needed some global column stats to formulate good plans.
 
In conclusion, it would seem to me that Oracle should help streamline some of these steps automatically in the 'exchange partition' function.  It may happen some day.
 
I hope my description has been clear enough and of use to others.
  
Thanks,
 
-- Seth
 
 
March     15, 2005 - 10:13 am UTC 
 
no invalidate would cause the stats to be put in place, but not invalidate the dependent SQL in the shared pool -- only subsequent hard parses of sql would see the new stats, the existing parsed SQL would be in place as it exists. 
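For example, a sketch of passing that parameter (whether you want existing cursors to keep their current plans is a judgment call per system):

```sql
-- With no_invalidate => true, cursors already in the shared pool keep
-- their existing plans; only subsequent hard parses see the new stats.
begin
  dbms_stats.set_table_stats( ownname       => user,
                              tabname       => 'T',
                              numrows       => 56,
                              no_invalidate => true );
end;
/
```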
 
 
 
Subpartition histograms
Robert, March     17, 2005 - 5:03 pm UTC
 
 
Tom,
We have noticed something that raises our suspicion about histograms on subpartitioned tables.  We initially gathered table statistics when the tables were first built, and once we started loading data we gathered on the exchanged subpartitions. This approach works very well, as we can see that the subpartition statistics are rolling up to both the partition level and the table level.
We discovered that if we gathered stats initially with a granularity of ALL, then the stats did not roll up.
Here is the way we have been gathering stats on our subpartitioned tables:
dbms_stats.gather_table_stats(ownname           => piTableOwner,
                              tabname           => piTableName,
                              partname          => piPartitionName,
                              estimate_percent  => DBMS_STATS.AUTO_SAMPLE_SIZE,
                              granularity       => 'SUBPARTITION',
                              method_opt        => 'FOR ALL COLUMNS SIZE AUTO',
                              cascade           => true);
We use the following scripts to check on this periodically.  The only time we have noticed discrepancies between the different levels for the last_analyzed times or num_rows has been when we encountered an error during the exchange process.
  1  select table_name,
  2         Table_lvl,
  3         partition_name,
  4         Part_lvl,
  5         subpartition_name,
  6         Subpart_lvl
  7    from (select A.table_name,
  8                 A.last_analyzed Table_lvl,
  9                 B.partition_name,
 10                 B.last_analyzed Part_lvl,
 11                 C.subpartition_name,
 12                 C.last_analyzed Subpart_lvl,
 13                 max(B.last_analyzed) OVER (partition by B.table_name) max_part_la,
 14                 max(C.last_analyzed) OVER (partition by C.table_name) max_subpart_la
 15            from dba_tables              A,
 16                 dba_tab_partitions      B,
 17                 dba_tab_subpartitions   C
 18           where A.owner = B.table_owner
 19             and A.table_name = B.table_name
 20             and A.table_name = C.table_name
 21             and B.table_name = C.table_name
 22             and B.partition_name = C.partition_name)
 23   where (Table_lvl != max_part_la or Table_lvl != max_subpart_la)
 24     and Part_lvl = max_part_la
 25     and Subpart_lvl = max_subpart_la
 26*  order by table_name desc nulls last
sposp> /
Press <RETURN> To Continue...
no rows selected
  1  select distinct (A.table_name) table_name,
  2         A.num_rows Table_rows,
  3         sum(B.num_rows) over (partition by B.table_name) /
  4         count(*) over (partition by C.table_name, C.partition_name) Part_rows,
  5         sum(C.num_rows) over (partition by C.table_name) Subpart_rows
  6    from dba_tables              A,
  7         dba_tab_partitions      B,
  8         dba_tab_subpartitions   C
  9   where A.owner = B.table_owner
 10     and A.owner = C.table_owner
 11     and A.table_name = B.table_name
 12     and A.table_name = C.table_name
 13     and B.table_name = C.table_name
 14*    and B.partition_name = C.partition_name
sposp> /
Press <RETURN> To Continue...
Table
Name                           TABLE_ROWS  PART_ROWS SUBPART_ROWS
------------------------------ ---------- ---------- ------------
D_DAILY_DC_INVENTORY_FACT          968164     968164       968164
D_DAILY_POS_FACT                 29656244   29656244     29656244
D_DAILY_RETAIL_INVEN_FACT        81379154   81379154     81379154
D_WEEKLY_POS_FACT                64066815   64066815     64066815
If the optimizer only looks at the subpartition stats when it knows the ONE subpartition it needs to access, then most of the time we are going to use global stats.  And since most of the time we are going to traverse multiple ranges, the global stats we will use are at the table level.  
  1  select distinct column_name,
  2         count(*)
  3    from dba_subpart_histograms
  4   where owner = 'POSDM'
  5     and table_name = 'D_DAILY_RETAIL_INVEN_FACT'
  6     and subpartition_name = 'P200503_NPPC_PETSMART_USA'
  7*  group by column_name
sposp> /
Press <RETURN> To Continue...
Column
Name                             COUNT(*)
------------------------------ ----------
CREATE_D                                2
CURRENCY_ID                             2
CURRENCY_SOURCE_ID                      2
D_ITEM_ID                             201
D_RETAILER_PARTITION_ID                 1
D_RETAIL_LOC_ID                         2
INDICATOR_ID                            2
IN_TRANSIT_Q                            2
LAST_UPDATE_D                           2
ON_HAND_Q                              62
ON_ORDER_Q                              2
RETAILER_NET_COST_DOLLAR_A              2
RETAILER_NET_COST_EURO_A                2
RETLR_NET_COST_LCL_CRNCY_A              2
SESSION_RUN_ID                          2
SYSTEM_CALENDAR_KEY_N                   1
TOTAL_INVENTORY_Q                       2
  1  select distinct column_name,
  2         count(*)
  3    from dba_part_histograms
  4   where owner = 'POSDM'
  5     and table_name = 'D_DAILY_RETAIL_INVEN_FACT'
  6     and partition_name = 'P200503'
  7*  group by column_name
sposp> /
Press <RETURN> To Continue...
Column
Name                             COUNT(*)
------------------------------ ----------
CREATE_D                                2
CURRENCY_ID                             2
CURRENCY_SOURCE_ID                      2
D_ITEM_ID                               2
D_RETAILER_PARTITION_ID                 2
D_RETAIL_LOC_ID                         2
INDICATOR_ID                            2
LAST_UPDATE_D                           2
ON_HAND_Q                               2
ON_ORDER_Q                              2
RETAILER_NET_COST_DOLLAR_A              2
RETLR_NET_COST_LCL_CRNCY_A              2
SESSION_RUN_ID                          2
SYSTEM_CALENDAR_KEY_N                   2
TOTAL_INVENTORY_Q                       2
  1  select distinct column_name,
  2         count(*)
  3    from dba_tab_histograms
  4   where owner = 'POSDM'
  5     and table_name = 'D_DAILY_RETAIL_INVEN_FACT'
  6*  group by column_name
sposp> /
Press <RETURN> To Continue...
Column
Name                             COUNT(*)
------------------------------ ----------
CREATE_D                                2
CURRENCY_ID                             2
CURRENCY_SOURCE_ID                      2
D_ITEM_ID                               2
D_RETAILER_PARTITION_ID                 2
D_RETAIL_LOC_ID                         2
INDICATOR_ID                            2
LAST_UPDATE_D                           2
ON_HAND_Q                               2
ON_ORDER_Q                              2
RETAILER_NET_COST_DOLLAR_A              2
RETLR_NET_COST_LCL_CRNCY_A              2
SESSION_RUN_ID                          2
SYSTEM_CALENDAR_KEY_N                   2
TOTAL_INVENTORY_Q                       2
If the optimizer looks at the histogram information at the table level, then the number of buckets and the associated endpoint values are not accurate when compared to the same column at the subpartition level.  Are you aware of any issues with this?  I have searched everywhere and I haven't found anything yet to indicate that this is a problem.
Thanks,
Robert D. Ware 
 
March     17, 2005 - 6:03 pm UTC 
 
I'm not sure what "this" is in regard to "issues with this"
if we cannot eliminate down to a single partition -- global stats, else partition stats. 
 
 
 
To the Point
Robert, March     21, 2005 - 10:41 am UTC
 
 
Tom,
Do you know whether or not the use of partitioning obviates the optimizer's ability to use histograms when it uses global/table level stats.
 
 
March     22, 2005 - 10:45 am UTC 
 
No, it'll either
a) use global histograms
b) local histograms
depending on the query. 
 
 
 
Am I reading this right?
A reader, April     26, 2005 - 11:33 am UTC
 
 
So basically, if a query does partition elimination and it turns out that only a few partitions will be looked at, I should gather table-level stats, not partition-level stats?
Here's why I am asking. I have a table with 1.2 billion rows. This table is partitioned on a POST_DATE column. The partitions range from 01-01-1993 up to 12-31-2010. This table is used to collect new rows on a daily basis. All rows inserted have a POST_DATE of yesterday so we don't insert or update any rows with a POST_DATE < yesterday. So, we always "append" new rows. Most queries against this table have a date range on the POST_DATE using BETWEEN and the optimizer usually looks at a number of partitions.
Recently I decided to gather stats on the most current partition only (the partition that is currently being filled). Is this a good idea? Should I gather stats on the entire table and not just the most current partition? Does it even make sense to gather table stats when the "old" rows are static?  How should I gather stats??!!
 
 
April     26, 2005 - 1:35 pm UTC 
 
if your queries do not boil down to a single partition, known at parse time (eg: they have a pstart=pstop=something_known) -- global stats are used.
If a single partition is known to be hit, those partition stats will be used.
You can gather or you can SET, the optimizer needs the facts, it could well be you know the facts (high bounds of columns, number of rows and other facts) 
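A sketch of the SET approach for a freshly loaded partition (the partition name P_CURRENT and the row/block counts are assumptions; substitute the facts your load process already knows):

```sql
-- Set partition-level stats directly instead of re-gathering, when the
-- load process already knows the row and block counts for the partition.
begin
  dbms_stats.set_table_stats( ownname  => user,
                              tabname  => 'T',
                              partname => 'P_CURRENT',
                              numrows  => 1000000,
                              numblks  => 12800 );
end;
/
```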
 
 
 
Partitions and granularity
A reader, April     26, 2005 - 2:25 pm UTC
 
 
Thanks for your reply. 
I still have two questions regarding gather_table_stats.
1. If I specify a partition name and use the default granularity, does that mean Oracle will gather global stats and partition stats on the partition I specified only?
2. If I don't specify a partition, and use the default granularity, will Oracle gather global stats and partition stats on ALL partitions?
Please let me know.  Thank you very much. 
 
April     26, 2005 - 2:45 pm UTC 
 
ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) tablespace users,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) tablespace tools,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t select to_date('10-mar-2003','dd-mon-yyyy')+mod(rownum,10), 1, 1 from all_users;
 
24 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
PART1                                   8
PART2                                   3
JUNK                                   13
 
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                      24
 
ops$tkyte@ORA9IR2> delete from t;
 
24 rows deleted.
 
ops$tkyte@ORA9IR2> commit;
 
Commit complete.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'PART1' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
PART1                                   0
PART2                                   3
JUNK                                   13
 
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                       0
 
<b>so, only the partition stats for A partition and the table stats for the entire table were modified by that gather.... the old partition stats would stick</b>
 
 
 
 
 
Confusion
A reader, June      06, 2005 - 11:02 pm UTC
 
 
1. Regarding your latest followup "so, only the partition stats for A partition and the table stats for the 
entire table were modified by that gather.... the old partition stats would stick", so why does it show
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                       0
shouldn't that be 13+3=16?
2. I have a DW where monthly partitions are added to a table from the OLTP system. Partition key is the yyyymm date column. Stats are gathered on the table and it is exchanged into the partitioned table so the partitioned table has partition level stats.
Most queries use only 1 partition at a time (pstart=pstop), but some of them use more than 1.
There seems to be some confusion as to whether partition level stats rollup to the table level or not.
Quoting you from earlier on the page:
"when all of the partitions have been analyzed, it should be 
rolled up"
But then when someone proved otherwise, you admitted that
"yes, it is not maintaining the global stats (partition level stats are fine, global stats are not)"
So, which is it? Do partition level stats roll up to the table level or not?
3. Gathering table stats after each partition is added is hugely expensive.
If partition stats indeed do rollup properly to the table level, is there even a need to gather table level stats explicitly? Yes, column level stats complicate things, but what is the solution?
What is a workable solution to this problem? Use dbms_stats.set_*? How exactly?
Thanks 
 
June      07, 2005 - 8:07 am UTC 
 
1) because as stated:
"so, only the partition stats for A partition 
and the table stats for the 
entire table were modified by that gather.... the old partition stats would 
stick"
the TABLE stats were modified
the PART1 stats were modified
the remaining stats were not touched.
2) well, I showed that when all of the partitions are analyzed, it was rolled up.
but when the partition was swapped in, it changed behavior ("admitted" sounds like there is some culpability, something false was said, you can observe all of this behavior yourself)
It depended on how the information was all gathered.
So, look at your system -- you tell us, what do YOU see.  Seems you should have sufficient information to make this determination.
It was two different cases
case 1) analyze individual partitions in a partitioned table, what happens to global stats.
case 2) exchange a partition, what happens to global stats
case 1 and 2 are different.  And you can see what happens by putting the two examples together:
                                                                                                                                                      
ops$tkyte@ORA9IR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   int
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION jan_2005 VALUES LESS THAN (to_date('01-feb-2005','dd-mon-yyyy')),
 10    PARTITION feb_2005 VALUES LESS THAN (to_date('01-mar-2005','dd-mon-yyyy')),
 11    PARTITION mar_2005 VALUES LESS THAN (to_date('01-apr-2005','dd-mon-yyyy')),
 12    PARTITION apr_2005 VALUES LESS THAN (to_date('01-may-2005','dd-mon-yyyy'))
 13  )
 14  /
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t
  2  select to_date('01-jan-2005','dd-mon-yyyy')+mod(rownum,to_date('30-apr-2005','dd-mon-yyyy')-to_date('01-jan-2005','dd-mon-yyyy')+1),
  3         rownum, rownum
  4    from all_objects
  5  /
 
27986 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'jan_2005', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'feb_2005', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'mar_2005', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'apr_2005', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                   27986
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JAN_2005                             7249
FEB_2005                             6524
MAR_2005                             7223
APR_2005                             6990
<b>ok, as the last partition was analyzed, rollup</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t add partition may_2005 values less than (to_date('01-jun-2005','dd-mon-yyyy'));
 
Table altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table temp ( dt date, x int, y int );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into temp
  2  select to_date('01-may-2005')+mod(rownum,30) dt, rownum x, rownum y
  3    from all_objects
  4   where rownum <= 5000;
 
5000 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'TEMP' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t exchange partition may_2005 with table temp;
 
Table altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                   27986
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
MAY_2005                             5000
JAN_2005                             7249
FEB_2005                             6524
MAR_2005                             7223
APR_2005                             6990
<b>Now, the table T was not analyzed, it was exchanged into.  Note that the partition level statistics came over but the global table stats were not modified, we haven't gathered stats on anything related to table T yet</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', 'may_2005', granularity=>'PARTITION' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select table_name, num_rows from user_tables where table_name = 'T';
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                   32986
 
ops$tkyte@ORA9IR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
 
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
MAY_2005                             5000
JAN_2005                             7249
FEB_2005                             6524
MAR_2005                             7223
APR_2005                             6990
 
<b>so.... there you go.  you need to run dbms_stats on that table to have the global information rolled up</b>
3) what is the solution?  you are the only one who can answer that.  are the rolled-up stats *sufficient for you, given your data*?  If there were a single solution, there would be no choices to make.
If you have information about the true "number of distinct values", or "high value, low value" for columns -- you can set this information using the documented set_* functions -- rather than gather it.
Statistics need only be good enough for the optimizer to understand reality, what is there. 
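To sketch that set_* approach (all names and values below are invented for illustration; see the dbms_stats documentation for the full signatures):

```sql
-- A sketch only: supply statistics you know to be representative
-- of your data, rather than gathering them.  Table T, column X and
-- every number here are hypothetical.
begin
    dbms_stats.set_table_stats(
        ownname => user,
        tabname => 'T',
        numrows => 33000,     -- known/estimated row count
        numblks => 500 );     -- known/estimated block count

    -- column-level information, e.g. distinct values and nulls:
    dbms_stats.set_column_stats(
        ownname => user,
        tabname => 'T',
        colname => 'X',
        distcnt => 33000,
        nullcnt => 0 );
end;
/
```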
 
 
 
 
 
A reader, June      07, 2005 - 12:28 pm UTC
 
 
"the TABLE stats were modified"
OK so the table had 24 rows in 3 partitions, your DELETE removed all of them.
But then running 
exec dbms_stats.gather_table_stats( user, 'T', 'PART1' );
shows
PART1                                   0
PART2                                   3
JUNK                                   13
but at the global level shows
T                                       0
So, what I meant was, running gather_table_stats( user, 'T', 'PART1' ) had the desired effect of accurately reflecting the stats for PART1, but at the expense of the undesirable side-effect of setting the global num_rows to 0.
Isn't this bad? Specifying the partname=> parameter to gather_table_stats should only modify partition-level stats for that partition and not touch the global stats, right?
Thanks 
 
June      07, 2005 - 1:13 pm UTC 
 
but the delete deleted ALL ROWS from the ENTIRE TABLE.  it did the right thing here.  There were no rows left.
 
 
 
 
A reader, June      07, 2005 - 1:58 pm UTC
 
 
Right, it's just that things don't add up, i.e. the partition stats don't add up to the table stats (13+3+0 != 0).
Not sure if any CBO decision is based on things adding up in this fashion, but just thought I would point it out.
Thanks 
 
June      07, 2005 - 2:01 pm UTC 
 
because the command I used said "just do this one partition" and let the granularity default (that partition and that table).  the database did exactly what it was told to do. 
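In other words (a sketch, reusing the T/PART1 names from above):

```sql
-- default granularity: gathers the named partition AND refreshes
-- the global (table-level) statistics
exec dbms_stats.gather_table_stats( user, 'T', 'PART1' );

-- partition granularity: touches only PART1's statistics and
-- leaves the global stats alone
exec dbms_stats.gather_table_stats( user, 'T', 'PART1', granularity => 'PARTITION' );
```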
 
 
 
Randall, August    24, 2005 - 3:03 pm UTC
 
 
In your example above you show:
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> variable x varchar2(30)
ops$tkyte@ORA9IR2> delete from plan_table;
 
2 rows deleted.
 
ops$tkyte@ORA9IR2> explain plan for
  2  select * from t where dt = to_date(:x,'dd-mon-yyyy');
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
--------------------------------------
 
-------------------------------------------------------------------------------
| Id  | Operation              |  Name| Rows  | Bytes | Cost  | Pstart| Pstop |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |     1 |    22 |     2 |       |       |
|   1 |  PARTITION RANGE SINGLE|      |       |       |       |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL    | T    |     1 |    22 |     2 |   KEY |   KEY |
-------------------------------------------------------------------------------
Q:  With bind variable peeking, shouldn't it know that you will hit just one partition and which partition to hit?
Thanks..
 
 
August    25, 2005 - 3:18 am UTC 
 
explain plans do not know about bind variables.  autotrace and explain plan do not do "bind variable peeking".
(and in current releases of the software, the to_date will preclude bind variable peeking from taking place) 
 
 
 
Partition statistics and local index
Serge Shmygelsky, April     27, 2006 - 5:00 am UTC
 
 
Hello Tom,
I have a partitioned table of the following structure:
RASH@RASH> desc msc_data
           Name                            Null?    Type
           ------------------------------- -------- ----------------------------
    1      FILE_ID                         NOT NULL NUMBER
    2      CALL_TYPE_ID                             NUMBER
    3      RECORD_TYPE_ID                           NUMBER
    4      CUSTOMER_TYPE_ID                         NUMBER
    5      CALL_DATE                                DATE
    6      DURATION                                 NUMBER
    7      MSC_NO                                   VARCHAR2(8)
    8      IS_DELETED                               VARCHAR2(1)
    9      LAC                                      VARCHAR2(10)
   10      CID                                      VARCHAR2(4)
   11      IMEI                                     VARCHAR2(20)
   12      IMSI                                     VARCHAR2(20)
   13      DN_NUM                                   VARCHAR2(40)
   14      MSRN                                     VARCHAR2(40)
   15      B_NUMBER                                 VARCHAR2(40)
   16      B_MSRN                                   VARCHAR2(40)
   17      TRANS_B_NUMBER                           VARCHAR2(40)
   18      C_NUMBER                                 VARCHAR2(40)
   19      TRUNK_IN                                 VARCHAR2(8)
   20      TRUNK_OUT                                VARCHAR2(8)
   21      VOLUME_UP                                NUMBER
   22      VOLUME_DOWN                              NUMBER
Table is partitioned by call_date (a partition for each day) and then by call_type_id (5 more subpartitions).
I have created the following index:
CREATE  INDEX msc_data_idx
ON      msc_data
        (
        call_date,
        call_type_id,
        dn_num,
        b_number,
        trans_b_number
        )
LOCAL
NOLOGGING
COMPRESS 2
PARALLEL
/
I'm collecting statistics for the table partition and rebuilding the index partitions after the daily data load. The problem is that I cannot get the following query to use the index:
RASH@RASH> explain plan for select call_date, call_type_id, dn_num from msc_data where call_date between  '01.04.2006' and '03.04.2006' and call_type_id = 1 and dn_num = 'adfdf' and b_number = 'aaaa' and trans_b_number = 'aaa';
Plan is:
RASH@RASH> @p
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3415524828
-------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     1 |    41 | 24565  (30)| 00:04:55 |       |       |
|   1 |  PX COORDINATOR      |          |       |       |            |          |       |       |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |     1 |    41 | 24565  (30)| 00:04:55 |       |       |
|   3 |    PX BLOCK ITERATOR |          |     1 |    41 | 24565  (30)| 00:04:55 |   KEY |   KEY |
|*  4 |     TABLE ACCESS FULL| MSC_DATA |     1 |    41 | 24565  (30)| 00:04:55 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------
When I use hint:
RASH@RASH> explain plan for select /*+ INDEX (msc_data msc_data_idx) */ call_date, call_type_id, dn_num from msc_data where call_date between  '01.04.2006' and '03.04.2006' and call_type_id = 1 and dn_num = 'adfdf' and b_number = 'aaaa' and trans_b_number = 'aaa';
RASH@RASH> @p
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2064827886
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |              |     1 |    41 | 61043  (31)| 00:12:13 |       |       |
|   1 |  PX COORDINATOR            |              |       |       |            |          |       |       |
|   2 |   PX SEND QC (RANDOM)      | :TQ10000     |     1 |    41 | 61043  (31)| 00:12:13 |       |       |
|   3 |    PX PARTITION LIST SINGLE|              |     1 |    41 | 61043  (31)| 00:12:13 |   KEY |   KEY |
|*  4 |     INDEX SKIP SCAN        | MSC_DATA_IDX |     1 |    41 | 61043  (31)| 00:12:13 |   KEY |   KEY |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("CALL_DATE">=TO_DATE('2006-04-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
              "CALL_TYPE_ID"=1 AND "DN_NUM"='adfdf' AND "B_NUMBER"='aaaa' AND "TRANS_B_NUMBER"='aaa' AND
              "CALL_DATE"<=TO_DATE('2006-04-03 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
       filter("DN_NUM"='adfdf' AND "TRANS_B_NUMBER"='aaa' AND "B_NUMBER"='aaaa' AND
              "CALL_TYPE_ID"=1)
So the questions are:
1. Why doesn't Oracle use the index? From my understanding, it should be used, as the query fits all the rules: it contains predicates on the indexed columns and selects only indexed columns - there is no need to read the table at all.
2. Why is the cost so different in the two plans? To me it is obviously wrong, as the query with the hint takes about 5 seconds to finish while without the hint it runs for hours.
Please advise.
 
 
 
Partition statistics and local index - ctd
Serge Shmygelsky, May       12, 2006 - 5:59 am UTC
 
 
Hello Tom,
sorry for bothering you, but I cannot find a way out of this index situation. Now it's getting even stranger: when I run a query as the schema owner, the index is used. Any other user who has 'select' on this table and runs the same query gets a full table scan. It looks like no one else can see the statistics. How can that be? And what am I doing wrong?
Hope this time you'll hear me crying... 
 
May       12, 2006 - 9:29 am UTC 
 
do a 10053 trace as the person who can "use the index" and as the person "who cannot use the index" and compare them.
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:63445044804318
I have a feeling "these are not the same tables" perhaps (you have confused yourself with synonyms or views maybe) or there is fine grained access control involved or .... 
Indexes are not something you have to "grant" on.  They are available to everyone.  
 
 
Partition statistics and local index - ctd 
Serge Shmygelsky, May       17, 2006 - 9:10 am UTC
 
 
Hi Tom,
thanks for the information. Unfortunately, I cannot reproduce the situation. I gathered the statistics for the table and the index. The index is local prefixed (you can see that from my previous post). And now the index is not used at all in any case. I'm going to lose my faith in humanity :).
Below is the output from the query with the 10053 event on. Maybe you could suggest something based on that? 
select count(*) from msc_data where call_date between '15.05.2006' and '16.06.2006' and call_type_id = 1
*********************
QUERY BLOCK SIGNATURE
*********************
qb name was generated
signature (optimizer): qb_name=SEL$1 nbfros=1 flg=0
  fro(0): flg=0 objn=58191 hint_alias="MSC_DATA"@"SEL$1"
*****************************
SYSTEM STATISTICS INFORMATION
*****************************
  Using NOWORKLOAD Stats
  CPUSPEED: 72 millions instruction/sec
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM: 10 milliseconds (default is 10)
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: MSC_DATA  Alias: MSC_DATA  (Using composite stats)
  (making adjustments for partition skews)
  ORIGINAL VALUES::    #Rows: 4460853857  #Blks:  23103220  AvgRowLen:  114.00
  SUBPARTITIONS::
  PRUNED: 33
  ANALYZED: 33  UNANALYZED: 0
    #Rows: 4460853857  #Blks:  183616  AvgRowLen:  114.00
Index Stats::
  Index: FILE_ID_IDX  Col#: 1
    USING COMPOSITE STATS
    LVLS: 2  #LB: 4847987  #DK: 2115  LB/K: 2292.00  DB/K: 11636.00  CLUF: 24610820.00
  Index: MSC_DATA_IDX  Col#: 5 2 13 15 17
    USING COMPOSITE STATS
    LVLS: 2  #LB: 12607386  #DK: 4574215695  LB/K: 1.00  DB/K: 1.00  CLUF: 4510454397.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#2): CALL_TYPE_ID(NUMBER)
    AvgLen: 4.00 NDV: 16 Nulls: 0 Density: 1.1097e-10 Min: -111 Max: 910
    Histogram: Freq  #Bkts: 16  UncompBkts: 5454  EndPtVals: 16
  Column (#5): CALL_DATE(DATE)
    AvgLen: 8.00 NDV: 2880147 Nulls: 0 Density: 3.4720e-07 Min: 2453827 Max: 2453872
  Table: MSC_DATA  Alias: MSC_DATA     
    Card: Original: 4460853857  Rounded: 24176128  Computed: 24176127.82  Non Adjusted: 24176127.82
  Access Path: TableScan
    Cost:  61301.30  Resp: 8514.07  Degree: 0
      Cost_io: 49731.00  Cost_cpu: 9972979094
      Resp_io: 6907.08  Resp_cpu: 1385135985
  Access Path: index (index (FFS))
    Index: MSC_DATA_IDX
    resc_io: 3414502.00  resc_cpu: 921084943807
    ix_sel: 0.0000e+00  ix_sel_with_filters: 1
  Access Path: index (FFS)
    Cost:  4483112.02  Resp: 622654.45  Degree: 8
      Cost_io: 3414502.00  Cost_cpu: 921084943807
      Resp_io: 474236.39  Resp_cpu: 127928464418
kkofmx: index filter:"MSC_DATA"."CALL_TYPE_ID"=1 AND "MSC_DATA"."CALL_DATE">=TO_DATE('2006-05-15 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "MSC_DATA"."CALL_DATE"<=TO_DATE('2006-06-16 00:00:00', 'yyyy-mm-dd hh24:mi:ss')
  Access Path: index (skip-scan)
    SS sel: 1.1148e-04  ANDV (#skips): 5591029
    SS io: 5591029.00 vs. index scan io: 259330.00
    Skip Scan rejected
  Access Path: index (IndexOnly)
    Index: MSC_DATA_IDX
    resc_io: 259332.00  resc_cpu: 21697496356
    ix_sel: 0.02057  ix_sel_with_filters: 0.0054196
    Cost: 284504.66  Resp: 39514.54  Degree: 8
  Best:: AccessPath: TableScan
         Cost: 8514.07  Degree: 8  Resp: 8514.07  Card: 24176127.82  Bytes: 0
***************************************
 
 
May       17, 2006 - 9:32 pm UTC 
 
one would want to see the table, the index create and the dbms_xplan output. 
 
 
 
Partition statistics and local index - ctd
Serge Shmygelsky, May       18, 2006 - 11:19 am UTC
 
 
Hi Tom,
thanks for you attention. Here is the information:
Table:
CREATE  TABLE &owner..msc_data
        (
    file_id        NUMBER NOT NULL,
        call_type_id    NUMBER,
        record_type_id  NUMBER,
        customer_type_id        NUMBER,
        call_date       DATE,
        duration        NUMBER,
        msc_no          VARCHAR2(8),
    is_deleted    VARCHAR2(1),
    lac        VARCHAR2(10),
        cid             VARCHAR2(4),
        imei            VARCHAR2(20),
        imsi            VARCHAR2(20),
        dn_num          VARCHAR2(40),
    msrn        VARCHAR2(40),
        b_number        VARCHAR2(40),
    b_msrn        VARCHAR2(40),
    trans_b_number    VARCHAR2(40),
    c_number    VARCHAR2(40),
        trunk_in        VARCHAR2(8),
        trunk_out       VARCHAR2(8),
    volume_up    NUMBER,
    volume_down    NUMBER
    )
PARTITION BY RANGE ( call_date )
SUBPARTITION BY LIST ( call_type_id )
        (
        PARTITION part_18022006 VALUES LESS THAN ( TO_DATE( '18.02.2006', 'DD.MM.YYYY' ) ) 
    PCTFREE 0
    STORAGE
        (
        INITIAL 16M
        NEXT 16M
        )
    COMPRESS
                (
                SUBPARTITION part_18022006_1 VALUES (30),
                SUBPARTITION part_18022006_2 VALUES (31),
                SUBPARTITION part_18022006_3 VALUES (1),
                SUBPARTITION part_18022006_4 VALUES (2),
                SUBPARTITION part_18022006_5 VALUES (DEFAULT)
                )
        )
/
Each day a partition is added by:
ALTER TABLE MSC_DATA
ADD PARTITION part_18052006
VALUES LESS THAN ( TO_DATE( '19.05.2006', 'DD.MM.YYYY' ) )  PCTFREE 0 STORAGE ( INITIAL 16M NEXT 16M ) COMPRESS
( SUBPARTITION part_18052006_1 VALUES (1),
 SUBPARTITION part_18052006_2 VALUES (2),
 SUBPARTITION part_18052006_3 VALUES (30),
 SUBPARTITION part_18052006_4 VALUES (31),
 SUBPARTITION part_18052006_5 VALUES ( DEFAULT ))
Index:
CREATE  INDEX msc_data_idx
ON      &owner..msc_data
        (
        call_date,
        call_type_id,
        dn_num,
        b_number,
        trans_b_number
        )
LOCAL
NOLOGGING
COMPRESS 2
PARALLEL
/
Query and dbms_xplan output:
RASH@RASH> explain plan for select count(*) from msc_data where call_date between '15.05.2006' and '16.06.2006' and call_type_id = 1;
RASH@RASH> @p
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 969686288
---------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |    12 | 25997  (20)| 00:05:12 |       |       |
|   1 |  SORT AGGREGATE        |          |     1 |    12 |            |          |       |       |
|   2 |   PX COORDINATOR       |          |       |       |            |          |       |       |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |    12 |            |          |       |       |
|   4 |     SORT AGGREGATE     |          |     1 |    12 |            |          |       |       |
|   5 |      PX BLOCK ITERATOR |          |    78M|   899M| 25997  (20)| 00:05:12 |   KEY |   KEY |
|*  6 |       TABLE ACCESS FULL| MSC_DATA |    78M|   899M| 25997  (20)| 00:05:12 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - filter("CALL_DATE"<=TO_DATE('2006-06-16 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
RASH@RASH> explain plan for select /*+ INDEX (msc_data msc_data_idx) */ count(*) from msc_data where call_date between '15.05.2006' and '16.06.2006' and call_type_id = 1;
RASH@RASH> @p
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 247641948
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |     1 |    12 |   119K  (9)| 00:23:51 |       |       |
|   1 |  SORT AGGREGATE              |              |     1 |    12 |            |          |       |       |
|   2 |   PX COORDINATOR             |              |       |       |            |          |       |       |
|   3 |    PX SEND QC (RANDOM)       | :TQ10000     |     1 |    12 |            |          |       |       |
|   4 |     SORT AGGREGATE           |              |     1 |    12 |            |          |       |       |
|   5 |      PX PARTITION LIST SINGLE|              |    78M|   899M|   119K  (9)| 00:23:51 |   KEY |   KEY |
|*  6 |       INDEX RANGE SCAN       | MSC_DATA_IDX |    78M|   899M|   119K  (9)| 00:23:51 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("CALL_DATE">=TO_DATE('2006-05-15 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
              "CALL_TYPE_ID"=1 AND "CALL_DATE"<=TO_DATE('2006-06-16 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
       filter("CALL_TYPE_ID"=1)
The statistics were gathered 5 minutes ago and no changes have been made since.
One thought just hit me: maybe this is because the fields can contain nulls?  
 
May       19, 2006 - 9:54 am UTC 
 
that is using partition elimination and choosing to full scan the partitions using multiblock IO instead of using single block IO for an index range scan.
it wanted to use multiblock IO; you have parallel query enabled.
I see nothing inherently wrong here.  seems most of the data would be hit, the table is compressed, it might even be smaller than the index.
so, does the index plan actually *go faster*? 
 
 
 
Partition statistics and local index - ctd 
Serge Shmygelsky, May       26, 2006 - 11:19 am UTC
 
 
Hello Tom,
I've done some tests according to your suggestions. Here are the results:
RASH@RASH> explain plan for select /*+ INDEX (msc_data msc_data_idx ) */ b_number, record_type_id, duration from rash.msc_data where call_date between '15.05.2006' and '22.05.2006' and call_type_id = 1 and dn_num = '80503111011';
RASH@RASH> @p
PLAN_TABLE_OUTPUT
-------------------
Plan hash value: 3241126277
---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              |    31 |  1271 |   290K  (9)| 00:58:07 |       |       |
|   1 |  PX COORDINATOR                      |              |       |       |            |          |       |       |
|   2 |   PX SEND QC (RANDOM)                | :TQ10000     |    31 |  1271 |   290K  (9)| 00:58:07 |       |       |
|   3 |    PX PARTITION LIST SINGLE          |              |    31 |  1271 |   290K  (9)| 00:58:07 |   KEY |   KEY |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| MSC_DATA     |    31 |  1271 |   290K  (9)| 00:58:07 |   KEY |   KEY |
|*  5 |      INDEX RANGE SCAN                | MSC_DATA_IDX |    31 |       |   290K  (9)| 00:58:07 |   KEY |   KEY |
---------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("CALL_DATE">=TO_DATE('2006-05-15 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "CALL_TYPE_ID"=1 AND
              "DN_NUM"='80503111011' AND "CALL_DATE"<=TO_DATE('2006-05-22 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
       filter("DN_NUM"='80503111011' AND "CALL_TYPE_ID"=1)
Elapsed: 00:00:00.19
RASH@RASH> select /*+ INDEX (msc_data msc_data_idx ) */ b_number, record_type_id, duration from rash.msc_data where call_date between '15.05.2006' and '22.05.2006' and call_type_id = 1 and dn_num = '80503111011';
Elapsed: 00:02:21.04
Statistics
----------------------------------------------------------
         61  recursive calls
          3  db block gets
     644696  consistent gets
     644624  physical reads
       5264  redo size
       1876  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         15  sorts (memory)
          0  sorts (disk)
         81  rows processed
RASH@RASH> explain plan for select b_number, record_type_id, duration from rash.msc_data where call_date between '15.05.2006' and '22.05.2006' and call_type_id = 1 and dn_num = '80503111011';
RASH@RASH> @p
PLAN_TABLE_OUTPUT
--------------------
Plan hash value: 3415524828
-------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |    31 |  1271 | 76016  (27)| 00:15:13 |       |       |
|   1 |  PX COORDINATOR      |          |       |       |            |          |       |       |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |    31 |  1271 | 76016  (27)| 00:15:13 |       |       |
|   3 |    PX BLOCK ITERATOR |          |    31 |  1271 | 76016  (27)| 00:15:13 |   KEY |   KEY |
|*  4 |     TABLE ACCESS FULL| MSC_DATA |    31 |  1271 | 76016  (27)| 00:15:13 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("DN_NUM"='80503111011' AND "CALL_DATE"<=TO_DATE('2006-05-22 00:00:00',
              'yyyy-mm-dd hh24:mi:ss'))
RASH@RASH> select b_number, record_type_id, duration from rash.msc_data where call_date between '15.05.2006' and '22.05.2006' and call_type_id = 1 and dn_num = '80503111011';
Elapsed: 00:04:19.58
Statistics
----------------------------------------------------------
         45  recursive calls
          3  db block gets
    1630781  consistent gets
    1378072  physical reads
        680  redo size
       1912  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
         81  rows processed
So index access was almost two times faster. It looks like for some reason Oracle chooses the incorrect plan (from a human point of view). Why could that happen? 
 
May       27, 2006 - 9:24 pm UTC 
 
got lucky due to the fewer physical IO's in this case.  So, do you have system statistics in place on your system - does the optimizer have information about how IO's perform on your system and such?
And do you really want your tables parallel enabled (usually, the answer is "not really") 
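If system statistics turn out to be missing, a sketch of gathering them over a representative workload window (the 60-minute interval is only illustrative):

```sql
-- capture workload system statistics over a window you consider
-- representative; 60 minutes here is just an example
exec dbms_stats.gather_system_stats( gathering_mode => 'INTERVAL', interval => 60 );

-- or bracket the workload manually:
exec dbms_stats.gather_system_stats( gathering_mode => 'START' );
-- ... run a representative workload, then:
exec dbms_stats.gather_system_stats( gathering_mode => 'STOP' );
```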
 
 
 
Partition statistics and local index - ctd
Serge Shmygelsky, May       29, 2006 - 4:22 am UTC
 
 
Hello Tom,
I've already told our DBA that he needs to turn system statistics on. I hope your suggestion will help me persuade him.
As to parallelism - the main idea of this table is to be used as a data source for some MViews. The process is as follows: the calls for an entire day are loaded once per day, the MViews are refreshed, and after that there are no modifications/deletions. Nobody actually selects from the table itself, only from the MViews (except in rare cases). It is a kind of small data warehouse. So I thought it would be good to have parallelism in order to refresh these MViews as fast as possible. But in the meantime, in case I need data from the table itself, I built the aforementioned index to speed up queries. Is this approach correct?
 
 
 
Tables with Partitions
dav, April     15, 2007 - 10:31 pm UTC
 
 
Hi, 
We have a table partitioned by month, with data from Jan '05 through the current month of '07. When we analyze the table, it analyzes all the partitions, but the data changes only in the recent month(s)' partitions. How can we configure the analyze to cover only the partitions with recently changed data?
Thanks in advance. 
April     16, 2007 - 1:14 pm UTC 
 
don't use analyze
use dbms_stats.  you can tell it what partitions to gather statistics for.
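A minimal sketch of that dbms_stats call (the table and partition names are hypothetical): gather statistics for just the partition whose data changed.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => user,
    tabname     => 'SALES',          -- hypothetical month-partitioned table
    partname    => 'SALES_2007_03',  -- only the recently changed partition
    granularity => 'PARTITION'       -- partition-level statistics only
  );
END;
/
```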
 
 
 
Entire Schema
Dav, April     16, 2007 - 5:28 pm UTC
 
 
As of today, we analyze the entire schema. Do we need to change this to loop through the tables in that schema and run individual analyze/dbms_stats calls? Please let me know.  
April     17, 2007 - 9:44 am UTC 
 
you don't *need* to do anything - there is a chance you might want to do something differently.
You could, for example, use table monitoring (on by default in 10g, alter it on table by table in 9i) and use dbms_stats to gather just "stale" statistics.  Maybe, it might be appropriate, it might not.
Not knowing your needs, or anything about what you do - no one can say... 
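A sketch of the monitoring-plus-stale approach described above (the table name is hypothetical; monitoring is on by default in 10g):

```sql
-- In 9i, enable monitoring table by table first:
ALTER TABLE some_table MONITORING;  -- hypothetical table name

-- Then gather statistics only for objects whose data has changed
-- enough (roughly 10% or more) to be considered stale:
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => user,
                                 options => 'GATHER STALE');
END;
/
```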
 
 
DM, April     18, 2007 - 1:47 am UTC
 
 
Hi Tom
I am new to performance tuning. Here is the explain plan for the query I am running.
I am aware that low COST = GOOD.
Please let me know how I can tune this query for optimal performance, as the cost is very high.
OPERATION          OPTIONS          OBJECT NAME       OBJECT INSTANCE       OPTIMIZER                                    COST      CARDINALITY  BYTES
==========================================================================================================================================================
SELECT STATEMENT                                                             FIRST_ROWS             0         3192385    3192385   1833117      1688300757
NESTED LOOPS                                                                                        1     0   1          3192385   1833117      1688300757
NESTED LOOPS                                                                                        2     1   1          1359268   1833117      1629641013
NESTED LOOPS                                                                                        3     2   1          466498    892770       277651470
TABLE ACCESS       FULL             TABLE_BUS_ORG        3                   ANALYZED               4     3   1          146       58294        3789110
TABLE ACCESS       BY INDEX ROWID   TABLE_SITE           4                   ANALYZED               5     3   2          8         15           3690
INDEX              RANGE SCAN       IND_PRIMARY2BUS_ORG         NON-UNIQUE   ANALYZED         1     6     5   1          2         94
TABLE ACCESS       BY INDEX ROWID   TABLE_ADDRESS        2                                          7     2   2          1         2            1156
INDEX              UNIQUE SCAN      ADDRESS_OBJINDEX            UNIQUE                        1     8     7   1                    1
TABLE ACCESS       BY INDEX ROWID   TABLE_COUNTRY        1                   ANALYZED               9     1   2          1         1            32
INDEX              UNIQUE SCAN      COUNTRY_OBJINDEX            UNIQUE       ANALYZED         1     10    9   1                    1
select table_site.objid, table_site.site_id, table_site.S_site_id,
 table_site.name, table_site.S_name, table_site.type,
 table_address.address, table_address.S_address, table_address.city, table_address.S_city,
 table_address.state, table_address.S_state, table_country.code,
 table_site.site_type, table_site.status,
 table_bus_org.objid, table_bus_org.org_id, table_bus_org.S_org_id,
 table_bus_org.name, table_bus_org.S_name, table_bus_org.type,
 table_country.name, table_country.S_name, table_address.zipcode, table_address.S_zipcode,
 table_country.objid, table_address.objid,
 table_site.region, table_site.S_region, table_site.district, table_site.S_district,
 table_address.address_2, table_site.x_manned_owner,
 table_site.x_op_support_site, table_site.S_x_op_support_site, table_site.x_delivery_party_name, table_site.S_x_delivery_party_name,
 table_site.phone, table_site.x_gsa_cust_ref_no, table_site.S_x_gsa_cust_ref_no,
 table_site.site_type, table_site.x_subtype,
 table_site.x_fins_currency, table_site.x_fins_ban_type,
 table_address.x_county, table_address.S_x_county
 from  table_country,table_address, 
  table_bus_org,table_site
 where table_country.objid = table_address.address2country
 AND table_bus_org.objid = table_site.primary2bus_org
 AND table_address.objid = table_site.cust_primaddr2address
 Please let me know If I missed anything.
 Thanks 
 DM
 IND 
April     18, 2007 - 12:04 pm UTC 
 
... I am aware that low COST = GOOD.  ....
hmmm, it depends, just always say "it depends" 
 
 
What about partition stats on a partitioned index?
Robert, April     23, 2007 - 3:44 pm UTC
 
 
Tom,
What about partition level statistics on a partitioned index (which is on a partitioned table)?
Given...
Partitioned table (on account_id by range).
Partitioned index (on contact_date by year).
Query with predicate "contact_date between '01/01/2006' and '12/31/2006'"
The query is using the contact_date index when it should be doing a full table scan (i.e., the FTS runs 3x faster).
I've tried all kinds of combinations of statistics: global, partitioned, on the contact_date index, and a histogram on the contact_date column in the table... but the CBO will not choose the 'correct' path.
In desperation I set db_file_multiblock_read_count from 16 to 24, and then it did choose the correct path (but that caused bad side effects elsewhere in the system).
The 2006 data is approximately 10% of the data in the table, and that is several million rows. It is fairly 'intuitive' to me that a full table scan would be much faster than millions of index lookups. What information do I need to provide the CBO so it will come to the same conclusion?
Thanks,
Robert. 
April     23, 2007 - 4:51 pm UTC 
 
what does the explain plan say (card=) compared to a tkprof row source operation (reality)
are they close.  what kind of statistics do you have on contact_date, is it skewed, do you do that bad thing of "put in the year 9999 for nulls" stuff? 
 
 
Information you requested.....
Robert, April     23, 2007 - 6:43 pm UTC
 
 
Tom,
Here is the query and the explain plans using both DB_MRC of 16 and 24.
No, we are not doing the null = year 9999 stuff.
The INTERACTIONS table has ~600M rows
There are ~110M rows where contact_date between 1/1/06 and 12/31/06 (~20%).
The contact_date index is partitioned by contact_date year (~12 partitions).
16:37:18 SQL> alter session set db_file_multiblock_read_count=16;
Session altered.
16:37:28 SQL> SELECT NA.ALPHA_NAME
16:37:32   2       , INT.TECHNIQUE ORIGINAL_TECH
16:37:32   3       , INT.SOLICITOR
16:37:32   4       , INT.USER_INSERTING
16:37:32   5       , DECODE(INT.TECHNIQUE, 'AN','CC', 'BQ','CC', 'CP','PH', 'CU','PH', 'CV','PH', 'IB','CC'
16:37:32   6                             , 'OD','CC', 'PF','CC', 'PR','PR', 'PT','VI', 'TD','PH', 'TN','PH'
16:37:32   7                             , 'TR','CC', 'WC','WC', 'IP','IP' ) TECHNIQUE 
16:37:32   8    FROM NAMES NA
16:37:32   9       , BRIEF_NAMES BN
16:37:32  10       , INTERACTIONS INT 
16:37:32  11   WHERE ':B1' = BN.BRIEF_NAME 
16:37:32  12     AND NA.ACCOUNT_ID = BN.ACCOUNT_ID 
16:37:32  13     AND NA.NAME_ID = BN.NAME_ID 
16:37:32  14     AND INT.STS != 'D' 
16:37:32  15     AND INT.CONTACT_DATE BETWEEN to_date('01/01/06', 'DD/MM/RR') and to_date('12/31/06', 'DD/MM/RR') 
16:37:32  16     AND INT.TECHNIQUE IN ('IP','PT','CP','CU','CV','WC','PR', 'BQ','OD','PF','TR','AN','IB','TD', 'TN') 
16:37:32  17     AND INT.CONTACT_CATEGORY||NULL IN ('GP','FN','OP') 
16:37:32  18     AND ( ( 'SO' = 'SO' AND INT.SOLICITOR = ':B1' ) OR ( 'SO' = 'ST' AND INT.USER_INSERTING = ':B1' ) )
16:37:32  19  /
Execution Plan <--------- <<<<< W R O N G ---- USING CONTACT_DATE INDEX >>>>>>
----------------------------------------------------------
 0          SELECT STATEMENT Optimizer=CHOOSE (Cost=2856874 Card=1 Bytes=65)
 1        0   FILTER
 2        1     NESTED LOOPS (Cost=2856874 Card=1 Bytes=65)
 3        2       NESTED LOOPS (Cost=4 Card=1 Bytes=41)
 4        3         TABLE ACCESS (BY INDEX ROWID) OF 'BRIEF_NAMES' (Cost=2 Card=1 Bytes=16)
 5        4           INDEX (UNIQUE SCAN) OF 'UQ_AMBRF_BRIEF_NAME' (UNIQUE) (Cost=1 Card=1)
 6        3         TABLE ACCESS (BY INDEX ROWID) OF 'NAMES' (Cost=3 Card=1 Bytes=25)
 7        6           INDEX (UNIQUE SCAN) OF 'PK_NAMES' (UNIQUE) (Cost=2 Card=1)
 8        2       PARTITION RANGE (ITERATOR)
 9        8         TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'INTERACTIONS' (Cost=2856870 Card=1 Bytes=24)
10        9           INDEX (RANGE SCAN) OF 'IX_MSINT_CONTACT_DATE' (NON-UNIQUE) (Cost=178434 Card=30085221)
16:37:33 SQL> alter session set db_file_multiblock_read_count=24;
Session altered.
16:37:39 SQL> SELECT NA.ALPHA_NAME
16:37:43   2       , INT.TECHNIQUE ORIGINAL_TECH
16:37:43   3       , INT.SOLICITOR
16:37:43   4       , INT.USER_INSERTING
16:37:43   5       , DECODE(INT.TECHNIQUE, 'AN','CC', 'BQ','CC', 'CP','PH', 'CU','PH', 'CV','PH', 'IB','CC'
16:37:43   6                             , 'OD','CC', 'PF','CC', 'PR','PR', 'PT','VI', 'TD','PH', 'TN','PH'
16:37:43   7                             , 'TR','CC', 'WC','WC', 'IP','IP' ) TECHNIQUE 
16:37:43   8    FROM NAMES NA
16:37:43   9       , BRIEF_NAMES BN
16:37:43  10       , INTERACTIONS INT 
16:37:43  11   WHERE ':B1' = BN.BRIEF_NAME 
16:37:43  12     AND NA.ACCOUNT_ID = BN.ACCOUNT_ID 
16:37:43  13     AND NA.NAME_ID = BN.NAME_ID 
16:37:43  14     AND INT.STS != 'D' 
16:37:43  15     AND INT.CONTACT_DATE BETWEEN to_date('01/01/06', 'DD/MM/RR') and to_date('12/31/06', 'DD/MM/RR') 
16:37:43  16     AND INT.TECHNIQUE IN ('IP','PT','CP','CU','CV','WC','PR', 'BQ','OD','PF','TR','AN','IB','TD', 'TN') 
16:37:43  17     AND INT.CONTACT_CATEGORY||NULL IN ('GP','FN','OP') 
16:37:43  18     AND ( ( 'SO' = 'SO' AND INT.SOLICITOR = ':B1' ) OR ( 'SO' = 'ST' AND INT.USER_INSERTING = ':B1' ) )
16:37:43  19  /
Execution Plan  <----- <<<<<<  C O R R E C T  --- FULL TABLE SCAN ON INTERACTIONS >>>>>
----------------------------------------------------------
0          SELECT STATEMENT Optimizer=CHOOSE (Cost=2772568 Card=1 Bytes=65)
1        0   FILTER
2        1     NESTED LOOPS (Cost=2772568 Card=1 Bytes=65)
3        2       NESTED LOOPS (Cost=4 Card=1 Bytes=41)
4        3         TABLE ACCESS (BY INDEX ROWID) OF 'BRIEF_NAMES' (Cost=2 Card=1 Bytes=16)
5        4           INDEX (UNIQUE SCAN) OF 'UQ_AMBRF_BRIEF_NAME' (UNIQUE) (Cost=1 Card=1)
6        3         TABLE ACCESS (BY INDEX ROWID) OF 'NAMES' (Cost=3 Card=1 Bytes=25)
7        6           INDEX (UNIQUE SCAN) OF 'PK_NAMES' (UNIQUE) (Cost=2 Card=1)
8        2       PARTITION RANGE (ALL)
9        8         TABLE ACCESS (FULL) OF 'INTERACTIONS' (Cost=2772564 Card=1 Bytes=24)
16:37:43 SQL> 
What do you think?
Thanks,
Robert.
 
 
April     24, 2007 - 10:12 am UTC 
 
what we need is for you to compare the card= values in the above to the ACTUALS reported in a tkprof.
are they close to reality 
 
 
Something changed(?!)
Robert, April     26, 2007 - 10:39 am UTC
 
 
Tom,
I've been trying to get the tkprof you requested of the statements, but something has changed and I am unable to duplicate the results (!?). Will try to duplicate situation and get you the data you requested soon.
Thank you,
Robert. 
 
Here is some of the info.....
Robert, April     27, 2007 - 11:00 am UTC
 
 
 
Tom,
I cannot reproduce the problem *exactly* now... but here is enough info (I think) for what you were asking (showing Oracle's *actual* cardinality).
Both DBFMBRC settings are giving exactly the same explain plan now... but here are the results of the tkprof for the query:
SELECT /*+ db_file_mbrc=24 */ NA.ALPHA_NAME
     , INT.TECHNIQUE ORIGINAL_TECH
     , INT.SOLICITOR
     , INT.USER_INSERTING
     , DECODE(INT.TECHNIQUE, 'AN','CC', 'BQ','CC', 'CP','PH', 'CU','PH', 'CV','PH', 'IB','CC'
                           , 'OD','CC', 'PF','CC', 'PR','PR', 'PT','VI', 'TD','PH', 'TN','PH'
                           , 'TR','CC', 'WC','WC', 'IP','IP' ) TECHNIQUE
  FROM NAMES NA
     , BRIEF_NAMES BN
     , INTERACTIONS INT
 WHERE '???????' = BN.BRIEF_NAME
   AND NA.ACCOUNT_ID = BN.ACCOUNT_ID
   AND NA.NAME_ID = BN.NAME_ID
   AND INT.STS != 'D'
   AND INT.CONTACT_DATE BETWEEN to_date('01/01/06', 'MM/DD/RR') and to_date('12/31/06', 'MM/DD/RR')
   AND INT.TECHNIQUE IN ('IP','PT','CP','CU','CV','WC','PR', 'BQ','OD','PF','TR','AN','IB','TD', 'TN')
   AND INT.CONTACT_CATEGORY||NULL IN ('GP','FN','OP')
   AND ( ( 'SO' = 'SO' AND INT.SOLICITOR = '???????' ) OR ( 'SO' = 'ST' AND INT.USER_INSERTING = '???????' ) )
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2   1380.98    2800.48    7960338    8866779          0           7
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4   1381.00    2800.50    7960338    8866779          0           7
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 587
Rows     Row Source Operation
-------  ---------------------------------------------------
      7  FILTER
      7   NESTED LOOPS
      1    NESTED LOOPS
      1     TABLE ACCESS BY INDEX ROWID OBJ#(36372)
      1      INDEX UNIQUE SCAN OBJ#(36563) (object id 36563)
      1     TABLE ACCESS BY INDEX ROWID OBJ#(34465)
      1      INDEX UNIQUE SCAN OBJ#(73436) (object id 73436)
      7    PARTITION RANGE ITERATOR PARTITION: KEY KEY
      7     TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(145334) PARTITION: ROW LOCATION ROW LOCATION
111828135      INDEX RANGE SCAN OBJ#(584500) PARTITION: KEY KEY (object id 584500)
09:54:41 SQL> select object_name
09:54:45   2    from dba_objects where object_id = 584500;
OBJECT
NAME
----------------------------------------
IX_MSINT_CONTACT_DATE
The point is that I think this query should be using a full table scan instead of the ix_msint_contact_date index.
(note: I changed actual name in query to '???????')
Thanks,
Robert.
 
 
April     27, 2007 - 11:19 am UTC 
 
my goal was to verify that the estimated card= values were even close to the real, actual values (as reported by tkprof).
Are they? (We don't even need to see the query to get the answer to that...) 
 
 
Cardinality <> #rows
Robert, April     27, 2007 - 11:38 am UTC
 
 
  
Tom,
Explain plan card =  30,085,221
Actual rows       = 111,828,135
I think I need to update my statistics(?)
Would this suggest I need a histogram? (the query predicate/index in question is using a date range).
Thank you,
Robert.
 
April     27, 2007 - 11:48 am UTC 
 
look at each bit of the plan - are the various STEPS correct. 
 
 
Not getting it?
Robert, April     27, 2007 - 12:13 pm UTC
 
 
Tom,
Here is the plan.
The only 'significant' table here (where 99% of the work is being done) is the INTERACTIONS table.
My understanding of how the steps work together is this (in part): the result set from the INTERACTIONS table is gathered (i.e., once) and combined with the result sets of the other two tables (brief_names and names). Since the cardinality for these two tables is relatively small, the main 'bottleneck' would be gathering the result set from the INTERACTIONS table. Therefore this is the place to focus (i.e., the best way to get the result set from the interactions table: INDEX access vs. full table scan)... Correct?
If so..... what stats (i.e. 'truth') do I need to give CBO in order for it to make correct decision?
Thanks!
----------------------------------------------------------
 0          SELECT STATEMENT Optimizer=CHOOSE (Cost=2856874 Card=1 Bytes=65)
 1        0   FILTER
 2        1     NESTED LOOPS (Cost=2856874 Card=1 Bytes=65)
 3        2       NESTED LOOPS (Cost=4 Card=1 Bytes=41)
 4        3         TABLE ACCESS (BY INDEX ROWID) OF 'BRIEF_NAMES' (Cost=2 Card=1 Bytes=16)
 5        4           INDEX (UNIQUE SCAN) OF 'UQ_AMBRF_BRIEF_NAME' (UNIQUE) (Cost=1 Card=1)
 6        3         TABLE ACCESS (BY INDEX ROWID) OF 'NAMES' (Cost=3 Card=1 Bytes=25)
 7        6           INDEX (UNIQUE SCAN) OF 'PK_NAMES' (UNIQUE) (Cost=2 Card=1)
 8        2       PARTITION RANGE (ITERATOR)
 9        8         TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'INTERACTIONS' (Cost=2856870 Card=1 
Bytes=24)
10        9           INDEX (RANGE SCAN) OF 'IX_MSINT_CONTACT_DATE' (NON-UNIQUE) (Cost=178434 
Card=30085221)
 
April     27, 2007 - 4:38 pm UTC 
 
Ok, the optimizer guessed 30,085,221 in the last step
it got 111,828,135
off by a factor of 3 to 4.
(are you missing a join somewhere by the way?  INT seems cartesian joined to the other two?)
It could be that histograms on contact date could help it get the right card=value for that index range scan.
but given this predicate:
AND INT.STS != 'D'
   AND INT.CONTACT_DATE BETWEEN to_date('01/01/06', 'MM/DD/RR') and to_date('12/31/06', 'MM/DD/RR')
   AND INT.TECHNIQUE IN ('IP','PT','CP','CU','CV','WC','PR', 'BQ','OD','PF','TR','AN','IB','TD', 
'TN')
   AND INT.CONTACT_CATEGORY||NULL IN ('GP','FN','OP')
   AND ( ( 'SO' = 'SO' AND INT.SOLICITOR = '???????' ) OR ( 'SO' = 'ST' AND INT.USER_INSERTING = 
'???????' ) )
if we could add technique, contact_category, sts to the end of that index to avoid millions of table access's - you would probably find the index to be the right path 
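A sketch of that suggestion, with a hypothetical index name (the columns are taken from the predicate above): with technique, contact_category and sts appended, the range scan can reject non-matching rows within the index itself and skip millions of table accesses.

```sql
-- Hypothetical name; appends the filter columns to the contact_date index
CREATE INDEX ix_msint_cd_filter
  ON interactions (contact_date, technique, contact_category, sts);
```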
 
 
Alberto Dell'Era, April     27, 2007 - 5:29 pm UTC
 
 
Have you noticed (good formatting is a blessing; indenting takes you to Heaven):
 AND ( 
     ( 'SO' = 'SO' AND INT.SOLICITOR      = '???????' ) 
  OR ( 'SO' = 'ST' AND INT.USER_INSERTING = '???????' ) 
     ) 
as written, SOLICITOR wants its place in the index as well, probably in the first position or second after CONTACT_DATE.
Probably another identical index, changing SOLICITOR to USER_INSERTING, is useful to optimize the sibling query that I'm sure is going to be found in the system as well:
 AND ( 
     ( 'SO' = 'ST' AND INT.SOLICITOR      = '???????' ) 
  OR ( 'SO' = 'SO' AND INT.USER_INSERTING = '???????' ) 
     ) 
 
 
A reader, March     09, 2009 - 4:54 pm UTC
 
 
We have a partitioned table and we are doing some certification of the stats job with our online users. CPU usage was good until the stats job ran. Our stats job executes the following for the 4 partitioned tables:
SYS.DBMS_STATS.UNLOCK_TABLE_STATS(owner, tablename);
ALTER TABLE <owner>.<tablename> MOVE PARTITION <partition_name> COMPRESS UPDATE GLOBAL INDEXES PARALLEL;
ALTER TABLE <owner>.<tablename> MODIFY PARTITION <partition_name> REBUILD UNUSABLE LOCAL INDEXES;
SYS.DBMS_STATS.gather_table_stats(owner, tablename, partition_name);
SYS.DBMS_STATS.LOCK_TABLE_STATS(owner, tablename);
After this job ran, the online application started behaving badly. Could this be due to a bad explain plan because of the stats, or something else? I even tried flushing the shared pool, but to no avail. I'm out of ideas on what else to try. Can you help?
 
March     10, 2009 - 2:33 am UTC 
 
well, the defaults for dbms_stats change from version to version, so I'm not sure what statistics you may or may not have gathered there, but yes, it sounds like it *could* be a plan change.
So, a question for you: do you have your history? Are you using AWR? (If so, yes, you do, and you can find what the plans used to be and see what changed.) If not, do you have statspack from when it was "OK"? 
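Assuming AWR is licensed and captured the statement, one way to pull the historical plans for a given statement (the sql_id below is a placeholder) is:

```sql
-- Show all plans AWR has recorded for this statement over time
SELECT * FROM table(DBMS_XPLAN.DISPLAY_AWR('0abcd123efgh4'));  -- placeholder sql_id
```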
 
 
A reader, March     13, 2009 - 3:30 pm UTC
 
 
We have a set of new partitioned tables with a new application. To gather statistics on these tables we are just running
dbms_stats.gather_table_stats(<owner>,<table>,<partition>);
Does only the partition get affected by this, or does something happen to the table as well? What other options should we use for the partition stats? 
March     13, 2009 - 3:59 pm UTC 
 
 
 
analyze and partitions
A reader, March     14, 2009 - 4:46 pm UTC
 
 
greetings thomas,
and thanks like always,
1) if i had dropped or truncated a partition, what is best: to analyze the affected partition or the entire table?
2) if i had added a partition, do i need to analyze it, or even analyze the table? if yes, why?
thanks in advance 
March     16, 2009 - 9:31 am UTC 
 
read link above, the optimizer magic one.
the answer is "it depends"
if you truncate a partition, the statistics are not reset, for example - you'd sort of need to "fix" that:
ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('01-jan-2008','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('01-jan-2009','dd-mon-yyyy'))
 11  )
 12  /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
  2  select to_date('1-jan-2007')+mod(rownum,365*2), rownum, rownum
  3    from dual
  4  connect by level <= 100000
  5  /
100000 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name = 'T'
  2  union all
  3  select partition_name, num_rows from user_tab_partitions where table_name = 'T';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                  100000
PART1                               50004
PART2                               49996
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table t truncate partition part1;
Table truncated.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name = 'T'
  2  union all
  3  select partition_name, num_rows from user_tab_partitions where table_name = 'T';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                  100000
PART1                               50004
PART2                               49996
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'PART1' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name = 'T'
  2  union all
  3  select partition_name, num_rows from user_tab_partitions where table_name = 'T';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                   49996
PART1                                   0
PART2                               49996
and if you drop the partition, you might need to zap them again, to get accurate information:
ops$tkyte%ORA10GR2> alter table t drop partition part2;
Table altered.
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name = 'T'
  2  union all
  3  select partition_name, num_rows from user_tab_partitions where table_name = 'T';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T                                   49996
PART1                                   0
you could fix that in a few ways - read the associated link to get some ideas. 
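One such fix, sketched against the T table from the example above: re-gather just the table-level (global) statistics, so num_rows reflects the truncated/dropped data without re-scanning every partition.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => user,
    tabname     => 'T',
    granularity => 'GLOBAL'  -- refresh only the table-level statistics
  );
END;
/
```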
 
 
Order of dictionary table updates while gathering stats at the subpartition level
A reader, June      04, 2010 - 10:56 am UTC
 
 
If we are gathering stats at the subpartition level with the default options, can you please tell us which dictionary tables get updated, and in what order? The reason I am asking is that stats gathering for one of the partitions in the table is taking longer than usual, and we want to see where the time is going.
 
June      09, 2010 - 7:46 am UTC 
 
you would be assuming you could use that knowledge somehow to peek at that session and see where it is. That would mean the stats-gathering session would have to commit (else you could not see it), but it won't. That approach cannot work.
Not that anyone could reliably tell you the order of anything, since it would change from release to release, patch to patch.
You can use ASH/AWR if you have access to them to see what it was waiting for/doing.
You can use sql_trace to see detailed information about what it did and what order it did it.
You can use statspack at an aggregate level if this is the only job running to gain insight as well - take a snap, gather stats, take a snap and report. 
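A minimal sketch of the sql_trace approach (the table and subpartition names are hypothetical): trace the gathering session, then format the trace with tkprof to see what it did, in what order, and where the time went.

```sql
ALTER SESSION SET sql_trace = TRUE;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(user, 'T',
                                partname    => 'SUBPART_1',     -- hypothetical
                                granularity => 'SUBPARTITION');
END;
/

ALTER SESSION SET sql_trace = FALSE;
-- then, from the OS: tkprof <tracefile> report.txt sys=yes sort=exeela
```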
 
 
Impact of number of rows returned 
A reader, June      22, 2011 - 3:38 pm UTC
 
 
In a comment you made on this thread, you mentioned that inaccurate statistics would make Oracle 
a) going to get tons and tons of rows when in fact it'll get none.
b) going to get no rows when in fact it'll get tons and tons.
What kind of performance impact would there be if the statistics led Oracle to pick the same execution path despite the wrong row estimates? Would Oracle allocate the wrong amount of CPU or RAM and thus imperil the elapsed time? 
June      23, 2011 - 7:48 am UTC 
 
You need to put the words "think it is" in front of those.
If we were to accidentally get the right plan even with bad statistics - then it would perform exactly as if we got it right on purpose.  There would be no difference. 
 
 
Jess, October   07, 2011 - 9:43 am UTC
 
 
Hi Tom,
Running a select from one partition, I would expect the optimizer to pick 'partition range single' and use partition stats, since it knows exactly which partition it's hitting, regardless of whether the stats were up to date (provided there were some stats).
select count (*) from fact_transactions where 
txn_date >= to_date('19-SEP-2011 00:00:00','DD-MON-YYYY HH24:MI:SS') and 
txn_date <= to_date('19-SEP-2011 23:59:59','DD-MON-YYYY HH24:MI:SS')
When I get the query plan for a select from a partition that does NOT have up-to-date stats, it shows access with 'partition range single', which is what I would expect for partition pruning, as it would use that partition's stats (leaving aside for the minute the fact that the stats aren't representative of the data; clearly Oracle doesn't know that at this point).
---------------------------------------------------------------------------------
|Id|Operation               | Name          |Rws|Bts|Cst%CPU|Time    |Pstrt|Pstp|
---------------------------------------------------------------------------------
| 0|SELECT STATEMENT        |               |  1|  9|  0 (0)|00:00:01|     |    |
| 1| SORT AGGREGATE         |               |  1|  9|       |        |     |    |
| 2|  PARTITION RANGE SINGLE|               |  1|  9|  0 (0)|00:00:01|  794| 794|
|*3|   INDEX RANGE SCAN     |FTX_CALKEYTXNID|  1|  9|  0 (0)|00:00:01|  794| 794|
---------------------------------------------------------------------------------
But when I run the same count(*) for a similarly sized partition that's up-to-date on stats, 
I get 'iterator' instead.  (Not comparing cost figures because of differences in stats).
----------------------------------------------------------------------------------
|Id|Operation            |Name     |Rows |Bts|Cst%CPU |Time    |Pstrt|Pstp|IN-OUT|
----------------------------------------------------------------------------------
| 0|SELECT STATEMENT     |         |   1 | 8 |2443 (1)|00:00:44|     |    |      |
| 1| SORT AGGREGATE      |         |   1 | 8 |        |        |     |    |      |
| 2|  PX COORDINATOR     |         |     |   |        |        |     |    |      |
| 3|   PX SEND QC (RANDM)|:TQ10000 |   1 | 8 |        |        |     |    | P->S |
| 4|    SORT AGGREGATE   |         |   1 | 8 |        |        |     |    | PCWP |
| 5|     PX BLOCK ITERTR |         |7270K|55M|2443 (1)|00:00:44|  828| 828| PCWC |
|*6|      TBL ACCESS FULL|FACT_TXNS|7270K|55M|2443 (1)|00:00:44|  828| 828| PCWP |
----------------------------------------------------------------------------------
Does that mean it's not partition pruning properly or am I misinterpreting block iterator?
Are single partition statistics actually used in this case?  Why is it trying to iterate when there's only one partition to speak of?
I thought that 'range single' was a better plan than 'block iterator'.  Is that not so?
Also, this seems to be running in parallel whereas the first one isn't.  Is there a reason for this difference in the plans?  I am presuming I want the former (albeit with accurate stats).  If so, does that mean I need to force the optimizer to do it?
Thank you.
 
October   07, 2011 - 2:04 pm UTC 
 
It started using parallel query, so you got an entirely different plan. It is hitting but one partition; it is doing so in parallel, iterating over the parallel chunks. Nothing amiss here.
PX BLOCK = parallel block.
Both plans are appropriate for their respective data sets.
 
 
 
Jess, October   10, 2011 - 6:33 am UTC
 
 
Thanks Tom! 
 
partition stats useful if index used?
A reader, October   26, 2011 - 12:22 pm UTC
 
 
This is a snippet from a plan that runs fast - snipped a bit so it would fit
PARTITION RANGE ALL |           |     1 14 |                                   
|  TABLE ACCESS FULL    | | APARTITIONEDTABLE   |  8809K|   243M| 
In this case, the partitioned table is made up of 12 partitions, one for each of the last 12 years. The queries always span at most 2 of the twelve partitions, but not on the boundaries: the partitions are Jan 1 - Dec 31, but the queries often span from July to July of the following year, for example. In this case, is the partitioned table being full scanned and *then* pruned? I am 99% sure the partition range scan refers to the partitioned table that is being full scanned on the indented step. Which kind of stats would be used for that? You mentioned many posts ago that a partition range scan would "check" all the partitions - what kind of work does that entail?
Now here is a similar snippet when the range is expanded just a couple months but still only uses two partitions.
PARTITION RANGE ALL     1 |    14 |                                                                  INDEX RANGE SCAN    ALOCALINDEXONPARTITIONEDTABLE|    78 |       | 
And this runs horribly. Is there any way to tell whether this "method" of going to a partition range all is less effective? Are global stats or local stats used? 
October   26, 2011 - 1:17 pm UTC 
 
This is saying there are 14 partitions - you are hitting partition 1 through 14.  Are you sure about 12?
Look at the pstart/pstop columns in the plan - they tell you what partitions will be hit.  In this case, it is going to full scan 1-14, no pruning (unless there is a 15th partition I don't know about - if so, it was pruned)
global stats would be used for any query that hits more than one partition 
 
 
Thank you - 
A reader, October   26, 2011 - 2:19 pm UTC
 
 
So Tom.. any query that hits more than one partition, will use global stats.  Any query that hits more than one subpartition with not use subpartition stats in likewise fashion? 
Thanks,
- D 
October   26, 2011 - 2:39 pm UTC 
 
local stats are used if and only if we can identify at hard parse time that the query hits exactly one partition and we know what that partition is.
else global stats are used. 
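To illustrate the rule with the range-partitioned T table as created in the earlier example (PART2 holds the 2008 dates): a predicate the optimizer can resolve to exactly one known partition at hard parse time shows matching Pstart/Pstop values, and that partition's local statistics apply.

```sql
EXPLAIN PLAN FOR
SELECT count(*)
  FROM t
 WHERE dt >= to_date('01-jan-2008','dd-mon-yyyy')
   AND dt <  to_date('01-jan-2009','dd-mon-yyyy');  -- pins exactly PART2

SELECT * FROM table(DBMS_XPLAN.DISPLAY);
-- expect Pstart = Pstop = 2, so PART2's local statistics are used
```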
 
 
sorry about spelling error
A reader, October   26, 2011 - 2:20 pm UTC
 
 
I meant will not use.. not with not use.. sorry. 
 
"hitting a partition"
A reader, October   26, 2011 - 2:24 pm UTC
 
 
Can you define "hitting a partition"? Does that mean the equivalent of a full table scan of a partitioned table?
Thanks. 
October   26, 2011 - 2:40 pm UTC 
 
hitting a partition means we will process that partition, we will read that partition, we will touch that partition.
It might be a full scan of the partition
It might be an index access by local rowid
It might be an index access by global rowid
It just means we'll be reading - or have the chance to be reading - something from that partition. 
 
 
Thanks again Tom - this will be in the plan?
A reader, October   26, 2011 - 3:08 pm UTC
 
 
Tom - 
could mean full table scan, 
could mean index range scan,
could mean rowid - 
That would show up in our explain plan or tkprof? Or not always?  
October   26, 2011 - 3:39 pm UTC 
 
yes, it would show up in your plan - that is the entire goal of a plan - to show you the access method being used.
 
 
 
Jaruwan Na Ranong, June      15, 2012 - 4:00 pm UTC
 
 
I have a 1TB reporting database that performs partition exchanges multiple times daily (based on region, i.e. AMER, APAC, etc.), exchanging 300+ regular tables from an ETL database into 300+ partitioned tables partitioned by region.  I notice that the 11g AUTOTASK job on the reporting database attempts to analyze the partitioned tables at the global level each night and runs out of its 4-hour window. My questions are:
1. Is there any benefit to letting AUTOTASK collect global-level stats each night? The reporting database is designed to be globally available to all users in different timezones. Or is there any benefit to having the global stats up to date?
2. Is there a way to stop AUTOTASK from collecting global stats on the partitioned tables, or to trick AUTOTASK into thinking the global stats are up to date? The reporting database is planned to add more regions and grow to 2TB by the end of this year.  If we have to gather global stats every night, which may sometimes coincide with a Transportable Tablespace operation, it will be time and resource consuming.
Thank you. 
June      16, 2012 - 4:22 am UTC 
 
in a database like this - you KNOW when the stats need to be gathered, and in fact you can gather the stats ahead of time.  Why wouldn't your statistics updates be coordinated with when you attach the new data - that is when you need them!
You don't mention a version here at all :( In 11g you would be using the new incremental statistics and never gathering global statistics again. 
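For reference, a sketch of how 11g incremental statistics would be set up. The table name T and partition name AMER are placeholders standing in for the poster's 300+ region-partitioned tables:

```sql
-- Sketch: enable incremental statistics so global stats are derived
-- from per-partition synopses instead of full-table re-scans.
begin
  dbms_stats.set_table_prefs(user, 'T', 'INCREMENTAL', 'TRUE');
end;
/

-- After each partition exchange, gather just the affected partition;
-- AUTO granularity lets Oracle maintain the global stats incrementally.
begin
  dbms_stats.gather_table_stats(user, 'T',
                                partname    => 'AMER',
                                granularity => 'AUTO');
end;
/
```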
 
 
Elaborate 
A reader, January   29, 2019 - 8:33 am UTC
 
 
Tom said in the first post: "This query knows it'll partition eliminate, but not down to "which partition", therefore the optimizer will use the table level statistics to optimize this since it doesn't know "partition 1, 2 or 3""
Is this still true on 12.1?
How can I use binds and still force the optimizer to use partition-level stats?
In my case, every query always selects from a single partition, never combining more than one. 
February  06, 2019 - 1:09 am UTC 
 
You should be ok, because we can peek at the values
SQL> create table t ( x int, y int )
  2  partition by list ( x )
  3  (
  4  partition p1 values (1),
  5  partition p2 values (2)
  6  );
Table created.
SQL>
SQL> insert into t s
  2  select 1 , rownum from dual
  3  connect by level <= 1000;
1000 rows created.
SQL>
SQL> insert into t s
  2  select 2 , rownum from dual
  3  connect by level <= 50000;
50000 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_stats.gather_table_stats('','T');
PL/SQL procedure successfully completed.
SQL>
SQL> select partition_name, num_rows
  2  from user_tab_partitions
  3  where table_name = 'T';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P1                                   1000
P2                                  50000
SQL>
SQL> variable v number
SQL> exec :v := 1
PL/SQL procedure successfully completed.
SQL>
SQL> select max(y)
  2  from t
  3  where x = :v;
    MAX(Y)
----------
      1000
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0haq2z8y5nah1, child number 0
-------------------------------------
select max(y) from t where x = :v
Plan hash value: 3783435421
-----------------------------------------------
| Id  | Operation             | Name | E-Rows |
-----------------------------------------------
|   0 | SELECT STATEMENT      |      |        |
|   1 |  PARTITION LIST SINGLE|      |      1 |
|   2 |   SORT AGGREGATE      |      |      1 |
|   3 |    TABLE ACCESS FULL  | T    |   1000 |
-----------------------------------------------
But you would need adaptive cursor sharing to kick in if you had identical queries which just changed by the bind variable. 
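One way to observe whether adaptive cursor sharing has kicked in is to check the cursor flags in v$sql after a few executions with different bind values. A sketch (the sql_id below is from the example run above; yours will differ):

```sql
-- Sketch: has adaptive cursor sharing kicked in for this statement?
select child_number, is_bind_sensitive, is_bind_aware, executions
from   v$sql
where  sql_id = '0haq2z8y5nah1';

-- IS_BIND_SENSITIVE = 'Y' marks the cursor as a candidate; a new
-- bind-aware child (IS_BIND_AWARE = 'Y') appears only after repeat
-- executions with sufficiently different bind selectivities.
```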
 
 
 
Pls elaborate differently 
A reader, February  06, 2019 - 7:17 am UTC
 
 
What I expected is that you set the stats only at the partition level in order to test; the question is about partitions. Does the optimizer find accurate E-Rows when the global stats are stale and only the partition-level stats are correct, when using binds?
2. Please elaborate on how to configure adaptive cursor sharing.
Thanks.  
February  07, 2019 - 2:01 am UTC 
 
You can see from the example that it used partition stats - the estimates matched the partition numbers.  If it was global, you would not get the exact numbers estimated.
SQL> exec dbms_stats.set_table_stats('','T',numrows=>2000);
PL/SQL procedure successfully completed.
SQL>
SQL> variable v number
SQL> exec :v := 1
PL/SQL procedure successfully completed.
SQL>
SQL> select max(y)
  2  from t
  3  where x = :v;
    MAX(Y)
----------
      1000
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  0haq2z8y5nah1, child number 0
-------------------------------------
select max(y) from t where x = :v
Plan hash value: 3783435421
-----------------------------------------------
| Id  | Operation             | Name | E-Rows |
-----------------------------------------------
|   0 | SELECT STATEMENT      |      |        |
|   1 |  PARTITION LIST SINGLE|      |      1 |
|   2 |   SORT AGGREGATE      |      |      1 |
|   3 |    TABLE ACCESS FULL  | T    |   1000 |
-----------------------------------------------
Adaptive cursor sharing is just there by default - nothing needed to be configured 
 
 
i did the test 
Moris, February  07, 2019 - 2:12 pm UTC
 
 
 create table t ( x int, y int )
     partition by list ( x )
    (
     partition p1 values (1),
    partition p2 values (2)
   );
insert into t s
   select 1 , rownum from dual
 connect by level <= 1000;
  insert into t s
   select 2 , rownum from dual
   connect by level <= 50000;
variable v number
  exec :v := 1
select   max(y)
  from t 
   where x = :v 
  ;
==> 1000
select * from table(dbms_xplan.display_cursor(format =>'ALLSTATS LAST'));  
==> E-Rows 1 .. normal so far (no stats gathered yet).
SQL_ID  06hjyp6hvr9c5, child number 0
-------------------------------------
select   max(y)  from t    where x = :v
 
Plan hash value: 746180402
 
-----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.01 |      37 |
|   1 |  SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.01 |      37 |
|   2 |   PARTITION LIST SINGLE|      |      1 |      1 |   1000 |00:00:00.01 |      37 |
|   3 |    TABLE ACCESS FULL   | T    |      1 |      1 |   1000 |00:00:00.01 |      37 |
-----------------------------------------------------------------------------------------
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
Now I force the stats:
exec dbms_stats.set_table_stats('','T','P1',numrows=>1000);
select   max(y)  from t    where x = :v ;
==> 1000
the E-rows remain the same !!
SQL_ID  06hjyp6hvr9c5, child number 0
-------------------------------------
select   max(y)  from t    where x = :v
 
Plan hash value: 746180402
 
-----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.01 |      37 |
|   1 |  SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.01 |      37 |
|   2 |   PARTITION LIST SINGLE|      |      1 |      1 |   1000 |00:00:00.01 |      37 |
|   3 |    TABLE ACCESS FULL   | T    |      1 |      1 |   1000 |00:00:00.01 |      37 |
-----------------------------------------------------------------------------------------
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
 
 
February  08, 2019 - 6:32 am UTC 
 
Because we can peek at the values of the bind variable. This is why I *ran* the statement and then used "display_cursor". That gives me the *real* plan the optimizer used.
If I just use plain old explain plan, it "lies" to me, because I see a plan that is absent of bind peeking (and hence not really what is happening)
SQL> variable v number
SQL> exec :v := 1
PL/SQL procedure successfully completed.
SQL> explain plan for select max(y) from t where x = :v;
Explained.
SQL> select * from dbms_xplan.display();
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
Plan hash value: 3783435421
----------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |     8 |   275   (1)| 00:00:01 |       |       |
|   1 |  PARTITION LIST SINGLE|      |     1 |     8 |            |          |   KEY |   KEY |
|   2 |   SORT AGGREGATE      |      |     1 |     8 |            |          |       |       |
|   3 |    TABLE ACCESS FULL  | T    | 25500 |   199K|   275   (1)| 00:00:01 |   KEY |   KEY |
----------------------------------------------------------------------------------------------
 
 
 
complements previaus post 
Moris, February  07, 2019 - 2:20 pm UTC
 
 
But if I specify the partition, I can see the accurate E-Rows:
select   max(y)  from t   partition for(1) ; -- where x = 1 ;
--> 1000
SQL_ID  3km2uwam7b49p, child number 0
-------------------------------------
select   max(y)  from t   partition for(1)
 
Plan hash value: 2831600127
 
-----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.01 |      37 |
|   1 |  SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.01 |      37 |
|   2 |   PARTITION LIST SINGLE|      |      1 |   1000 |   1000 |00:00:00.01 |      37 |
|   3 |    TABLE ACCESS FULL   | T    |      1 |   1000 |   1000 |00:00:00.01 |      37 |
-----------------------------------------------------------------------------------------
 
So HOW can I make the optimizer get the correct E-Rows for a given partition without specifying the PARTITION clause?
Because in my package all selects use binds,
i.e. .. WHERE key = :1  (where :1 is always the partition key value)
 
 
Misunderstood 
Moris, February  08, 2019 - 12:55 pm UTC
 
 
Hi Connor,
No, it was about the execution plan, not explain plan - as you can also see the E- and A-Rows.
You can do exactly what I did and you'll find the same result.  I didn't get the reason; please re-read my previous two posts. 
Thanks for the help 
February  11, 2019 - 4:49 am UTC 
 
You're missing out a crucial step :-)
SQL> create table t ( x int, y int )
  2  partition by list ( x )
  3  (
  4  partition p1 values (1),
  5  partition p2 values (2)
  6  );
Table created.
SQL> insert into t s
  2  select 1 , rownum from dual
  3  connect by level <= 1000;
1000 rows created.
SQL>
SQL> insert into t s
  2  select 2 , rownum from dual
  3  connect by level <= 50000;
50000 rows created.
SQL> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
SQL> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
P1
P2
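That missing step, sketched out: gather statistics first, so NUM_ROWS is populated and the optimizer no longer falls back to dynamic sampling (the "dynamic statistics used" note in the plans above).

```sql
-- Sketch: gather stats before testing, so the optimizer has real
-- partition-level statistics instead of dynamic sampling.
exec dbms_stats.gather_table_stats(user, 'T');

select partition_name, num_rows
from   user_tab_partitions
where  table_name = 'T';
-- P1 should now show 1000 and P2 50000; re-running the bound query
-- and dbms_xplan.display_cursor then shows partition-level E-Rows.
```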
 
 
 
Continue test
Moris, February  11, 2019 - 7:03 am UTC
 
 
In your last post, just continue the test,
as I did above using dbms_stats.set_table_stats. Do not set stats on the table at the global level, only on the partition, via set_table_stats.
Set the stats for partition P1 and then you can see the execution plan.