Incomplete statistics?
Jayaraman Ashok, June      23, 2002 - 8:44 pm UTC
 
 
Hi Tom
You have replied that dbms_stats does not gather information on chained rows,unused data blocks and average free space available in blocks. Does it mean that to get the above mentioned information we have to run analyze command again? Also you mention that dbms_stats gathers statistics required by the CBO. Are there any special statistics that are gathered apart from number of rows, number of blocks occupied? If it is only the above two, then i believe that analyze command already accomplishes it. The only feature that is of use may be the set_stats procedure for tuning queries before using them in production.
Thanks & regards,
Ashok 
 
June      23, 2002 - 9:32 pm UTC 
 
 
dbms_stats only gathers things the CBO cares about.  The CBO does not care about that information. 
In the future, analyze will not gather CBO info, only these other things CBO doesn't care about.  DBMS_STATS currently and will only gather CBO info.
I think you need to reread the answer.  dbms_stats gathers everything that the analyze command does but.... can do it in parallel, can do it for stale tables, can export/import stats, can analyze things analyze cannot (the system for example) and so on.  There are lots more stats then just num rows and num blocks -- and both commands collect them.  There are things dbms_stats will do that analyze does not (especially on a partitioned table).  
Recommendation: use dbms_stats.   
 
 
 
Dave, June      24, 2002 - 9:43 am UTC
 
 
When I "CREATE INDEX my_index ... COMPUTE STATISTICS" is the analyze being performed by the DBMS_STATS method, or the ANALYZE INDEX method?
If by the latter method, am I likely to gain by running DBMS_STATS post-CREATE or post-REBUILD mostly for partitioned segments (ref. your comments on gathering global statistics), with "ANALYZE ..." still being a reasonable choice at the moment for non-partitioned segments? 
 
June      24, 2002 - 9:48 am UTC 
 
 
It is closer to the analyze method -- only because it is DDL (same results from analyze, dbms_stats, create index ultimately).
But since you only create an index once, it is good only for the initial build, after that -- you are back to dbms_stats.
It is a general recommendation to use dbms_stats, period.  There will come a time when dbms_stats is the only way. 
 
 
 
external table, index and dbms_stats
Suresh, June      24, 2002 - 10:39 am UTC
 
 
Tom,
   You have mentioned that DBMS_STATS can analyze external tables. I tried to create an index on an external table and it came up with an error "not supported on external table".
My question is if dbms_stats could analyze external tables, why not allow an index on external table as well and analyze it with dbms_stats. 
 
 
June      25, 2002 - 8:00 am UTC 
 
 
you cannot index something that we have no control over.
Consider if we did.  say the file started with:
1,a,hello
2,b,goodbye
and we indexed that (the first column).  Now, you come along and replace that file with:
2,b,goodbye
1,a,hello
and issue: select * from external_table where column1 = 1;
what do you think would happen if
a) we used an index
b) we full scanned
I think we would get two answers.  We cannot index a file that is changeable (and changeable by forces outside of our control)
 
 
 
 
Queries on Analyze
Sanjay Jha, June      24, 2002 - 4:24 pm UTC
 
 
Tom,
The ongoing discussion has generated few more related queries in me. I am listing them below, for you crisp and clear answers :
1. What are different packages for running analyze (eg. dbms_utulity, dbms_stats, db_maintenance etc.) and what are the advantages of each?
2. What is the best way to analyze one particular table from CBO's perspective(BTW, what are all possible ways)?
3. If someone has to automate the analyze, what do you recommend and why - scheduling through cron or dba_jobs? (For that matter what is your opinion on scheduling any job for database- OS level or through Oracle jobs?)
4. This question is not related to analyze but to the Oracle supplied packages- Where in a databse one can get info about these packages (like v$ views etc.) or give me an url for reference please.
Thanks.
 
 
June      25, 2002 - 8:39 am UTC 
 
 
1) dbms_stats is the one you should use. dbms_utility has some very very primitive functions left in there for backwards compatibility only.
2) for that, please read the ADMIN guide, it covers the methods and when to use what.
You know -- if there was a "best way", it would be the only way...  
3) If you have my book -- you know my opinion on this.  Do it in the database.
Cron -- can cron tell if the database is up and running?  cron will try to analyze even when the database is not ready.  DBMS_JOBS is the right tool for this in my opinion (barring that, OEM would be the right place).  I want all of my database related stuff going on in the database -- one place to look for issues (my alert log or OEM logs) and so on.
4) how about the supplied packages guide.  Instead of a direct link, I'll teach you to fish.
goto otn.oracle.com
click on documentation
pick what database version you have
go after the supplied packages guide (in 8i and before, this was with the "database" set of docs.  In 9i -- there is just one big list)
 
 
 
 
DBMS_STATS and statistics
Arun Gupta, May       02, 2003 - 7:10 pm UTC
 
 
Tom
I did the following:
a) Using dbms_stats, analyzed all the tables in schema X of database A and using dbms_stats.export_schema_stats, exported the statistics into stattab.
b) Extracted the table/index creation scripts for schema X, excluding the stattab table.
c) Using Oracle export (exp), exported the stattab table.
d) Using the scripts in b), I created all the tables and indexes in schema Y of database B. These tables do not contain even a single row of data.
e) Using Oracle import(imp), imported the stattab into schema Y of database B. 
f) Using dbms_stats.import_schema_stats, I imported all the statistics from stattab.
I cannot see the statistics like num_rows, avg_row_len etc. in user_tables. How does the CBO read these statistics?
Thanks 
 
May       02, 2003 - 8:11 pm UTC 
 
 
dbms_stats was designed to move the stats for user A from db1 to user A in db2.
It is the X to Y that is messing you up.  This demonstrates what you are seeing, but shows how to adapt it to work anyway:
ops$tkyte@ORA920LAP> drop user a cascade;
User dropped.
ops$tkyte@ORA920LAP> drop user b cascade;
User dropped.
ops$tkyte@ORA920LAP> grant dba to a identified by a;
Grant succeeded.
ops$tkyte@ORA920LAP> grant dba to b identified by b;
Grant succeeded.
ops$tkyte@ORA920LAP> @connect a/a
a@ORA920LAP> create table t as select * from all_objects;
Table created.
a@ORA920LAP> create index t_idx on t(object_id);
Index created.
a@ORA920LAP> analyze table t compute statistics
  2  for table for all indexes for all indexed columns;
Table analyzed.
a@ORA920LAP> exec dbms_stats.create_stat_table( user, 'st' );
PL/SQL procedure successfully completed.
a@ORA920LAP> exec dbms_stats.export_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
a@ORA920LAP> !exp userid=a/a tables=st
Export: Release 9.2.0.3.0 - Production on Fri May 2 20:08:56 2003
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Export done in WE8ISO8859P1 character set and AL16UTF16 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table                             ST         78 rows exported
Export terminated successfully without warnings.
a@ORA920LAP> @connect b/b
b@ORA920LAP> create table t as select * from all_objects where 1=0;
Table created.
b@ORA920LAP> create index t_idx on t(object_id);
Index created.
b@ORA920LAP> !imp userid=b/b fromuser=a touser=b
Import: Release 9.2.0.3.0 - Production on Fri May 2 20:08:56 2003
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Export file created by EXPORT:V09.02.00 via conventional path
Warning: the objects were exported by A, not by you
import done in WE8ISO8859P1 character set and AL16UTF16 NCHAR character set
. . importing table                           "ST"         78 rows imported
Import terminated successfully without warnings.
b@ORA920LAP> exec dbms_stats.import_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
b@ORA920LAP> select table_name, num_rows from user_tables;
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
ST
T
<b>that is what you currently see but:</b>
b@ORA920LAP> update st
  2  set c5 = USER
  3  /
78 rows updated.
b@ORA920LAP> exec dbms_stats.import_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
b@ORA920LAP> select table_name, num_rows from user_tables;
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
ST
T                                   30557
b@ORA920LAP>
<b>that'll "rename the user" in the stats tab so it'll be looking for B.T, not A.T
</b>
 
 
 
 
 
It works
Arun Gupta, May       02, 2003 - 9:16 pm UTC
 
 
After updating c5 in stattab to new schema name, it works. 
Thanks...!! 
 
 
What about execution statistics
Arun Gupta, May       06, 2003 - 9:56 am UTC
 
 
Tom
I am a bit unclear about the purpose of importing statistics from one database to another. My questions are:
a) If I set a table with 0 rows with the statistics of a 1 million row table, can I run a query against this 0 row table and actually see the same buffer gets as I would see against a 1 million row table? 
b) When the query actually hits a 1 million row table, can Oracle change the plan? 
 
May       06, 2003 - 10:05 am UTC 
 
 
a) no
b) not sure what you mean.  if you mean "will the plan for a table with 900,000 rows be different then a plan for 1,000,000" -- the answer is "yes, it could be".  If you mean "if the statistics say there are N rows but there are really M rows -- will Oracle change the plan after finding there are not N rows?" -- the answer is "no, once formulated -- the plan is the plan"
One use of the export/import stats trick is
a) you have a logical standby or some reporting instance that is a close copy of production (using replication, whatever)
b) you run stats on the copy.  move them into production.
 
 
 
 
Problems in anlayze
SUBBARAO., May       07, 2003 - 1:20 am UTC
 
 
There is much useful information on ANALYZE in this article. It is helpful. My scenario is like this, looks strange but it is.
  In my production system, we have some problems in getting some reports from the application.
For one report query has 8 tables, out of which 2 tables have huge number of records, both these tables are not analyzed, rest all the tables are analyzed. This report is not launching.
We have now analyzed the entire schema. Then the report started working. The cost of the query before and after analyze has drastically come down. Some tables will be getting different kinds of data, and some tables will be getting same type of data.  I am not sure how long this report will work and when I should analyze tables. I have read about the table monitoring feature and document says that Oracle will automatically analyze the table if it requires analyze. 
My questions:
1) So enabling this feature, what will be the side effects in performance. I know that if we want some thing we should scarifice few things, but I would like to highlight these side effects to my boss. 
2) If a query is having 5 tables and out of which 3 tables are not analyzed and 2 tables are analyzed, will the cost calculation be accurate? I mean will the optimizer gives a better plan.
Is it required that all the tables in the query are to be analyzed or all not to be analyzed.
 
 
May       07, 2003 - 7:32 am UTC 
 
 
1) monitoring is extremely low impact, not even measurable.  YOU still have to analyze, you just use dbms_stats with the gather stale option.  it'll collect stats only on tables you need.
2) of course the costing won't be accurate.  you should either have all or no objects refered to by a query analyzed. 
 
 
 
Excellent and Timely discussion....
A reader, May       07, 2003 - 10:22 am UTC
 
 
Hi Tom:
As usual, your answers allows us (DBA's) to have a good night's sleep.
On the issue of monitoring...
What criteria does oracle use to determine that the stats are stale ?
Can these thresholds be changed/set by the user ?
Are the stats in the user_tab_modifications cumulative ?
If I need to preserve the stats in the above view, can you please suggest a procedure/script as to how I go about saving these. I am interested in preserving the amount of dml activity by segment.
Thanks
Mohsin 
 
May       07, 2003 - 10:46 am UTC 
 
 
o when about 10% of the table is changed
o no
o yes
o you would have to "insert into some_other_table select sysdate, a.* from the modifications table A" before gathering stats.  Make that part of your 'process'. 
 
 
 
Excessive Recursive calls after stats
A reader, May       07, 2003 - 11:53 am UTC
 
 
Hi Tom:
I have observed that the statistics section of the sql+ trace reports excessive recursive call count for queries agains the tables just after the tables have been analyzed.
What does this mean and what are the side effects of stats gathering ? Has the stats gathering invalidated the parsed sql in the shared pool ?
Would this be a good way of implementing my own thresholds:
I'd turn on monitoring for the segments.
Collect the dml activity from the user_tab_modifications to my own table and then have my procedure re-compute the stats after say x% of change to the segment and not use the gather stale option of the dbms_stats.
What is the timestamp column for ? If the values are cummulative then how does it determine the incremental from the last stale stats run ?
Thanks. 
 
May       07, 2003 - 1:45 pm UTC 
 
 
hows about an example ? 
 
 
 
Here it is.
A reader, May       07, 2003 - 2:42 pm UTC
 
 
09:11:53 SQL> connect scott/tiger
Connected.
09:16:14 SQL> 
09:16:14 SQL> 
09:16:14 SQL> analyze table junk compute statistics for table for all indexed columns for all indexes;
Table analyzed.
09:17:06 SQL> show user
USER is "SCOTT"
14:36:01 SQL> set autotrace traceonly explain statistics;
14:36:21 SQL> 
14:36:23 SQL> select * from junk where owner='MMJ';
1 row selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=86)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'JUNK' (Cost=4 Card=1 Byt
          es=86)
   2    1     INDEX (RANGE SCAN) OF 'IDX_JUNK_OWNER' (NON-UNIQUE) (Cos
          t=3 Card=1)
Statistics
----------------------------------------------------------
        330  recursive calls
          0  db block gets
         73  consistent gets
          3  physical reads
          0  redo size
       1114  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed
14:36:37 SQL> /
1 row selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=86)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'JUNK' (Cost=4 Card=1 Byt
          es=86)
   2    1     INDEX (RANGE SCAN) OF 'IDX_JUNK_OWNER' (NON-UNIQUE) (Cos
          t=3 Card=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       1114  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
14:36:58 SQL>  
 
 
May       08, 2003 - 9:17 am UTC 
 
 
analyzing the table will invalidate open cursors on that object in the shared pool.  the next parse against that object will be "hard".  hard does lots more recursive sql then "soft"
It is not the analyze so much as "you did something to cause hard parsing to have to take place"
here is an example using the rbo and EMP:
ops$tkyte@ORA817DEV> select * from emp;
14 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
          0  recursive calls
         12  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1979  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         14  rows processed
ops$tkyte@ORA817DEV> alter table emp add x number;
Table altered.
ops$tkyte@ORA817DEV> select * from emp;
14 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
        148  recursive calls
         15  db block gets
         26  consistent gets
          0  physical reads
          0  redo size
       2042  bytes sent via SQL*Net to client
        425  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         14  rows processed
Of course, SHUTDOWN/STARTUP would have the same effect.
normal, natural, expected and one time -- you hard parse and soft after that. 
 
 
 
 
Some more info....
A reader, May       07, 2003 - 3:32 pm UTC
 
 
Hi Tom.
Here is some more info that might help you.
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE    9.2.0.1.0       Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
The junk table is a mirror of dba_objects view.
And there was no activity in the d/b between the time 9:16  and 14:36 in my sql+ session results I posted earlier.
Please let me know if you need more info.
Thanks
Mohsin 
 
 
effect of partial analyze
kumar, May       08, 2003 - 7:26 am UTC
 
 
Tom,
You have mentioned in one of your replies that we should not analyze certain objects used in a query and leave the rest un-analyzed. I have read similar statements in lots of places, but i could not really understand why. Can you please explain why is it so dangerous to do so? 
 
May       08, 2003 - 9:57 am UTC 
 
 
analogy time.
You are a manager now.
I give you a team of 4 people.
I tell you all about 2 of them -- their capabilities, experience, level of comprehension, lots of stuff.
I tell you NOTHING about the other two.
I give you a problem to solve using these 4 people.  You need to make assignments based on your knowledge of them.  You have good knowledge of 2, no knowledge of the other 2.  You look at them and size them up.  Maybe you don't like the way the 4th guy dressed -- he "looks dumb" to you.
You dole out responsibilities -- not giving much hard stuff to #4.  Unfortunately for you he was the smartest in the group and if he had been given 99% of the work -- it would have been a success.  As it was, due to inefficient resource allocation -- you never finish your project!
So it is with statistics.  You give the optimizer only a little information, it has to "make up the rest" and can come to some very bad decisions as to the correct approach. 
 
 
 
Hat's off to you...
A reader, May       08, 2003 - 11:20 am UTC
 
 
Hi Tom:
Excellent analogy about the CBO....
I will confirm your point, but I am positive nothing was happening on my d/b to have caused invalidation of my cursors resulting in hard parse. I conducted the test on my laptop, so I know what was happening.
Could the Oracle version have made the difference in this case?
I was using Oracle 9i Rel 2, and from your SQL+ prompt it seems you were using 8i.
Thanks
Mohsin 
 
 
Ahhh the language barrier strikes again...
A reader, May       08, 2003 - 11:27 am UTC
 
 
Sorry Tom:
I read and re-read your response. I get it.
Once in a while in between Swahili, Gujarati, Katchi, Urdu, Hindi and a sprinkling of Arabic I forget my English...
As they say in Swahili - Hakuna Matata (No Problem)
Kwaheri.
Thanks again.
 
 
 
Analogy worth more than 5 stars!!!!
A reader, May       08, 2003 - 11:53 am UTC
 
 
  
 
Both Analyze and DBMS_STATS?
Dan Jatnieks, May       16, 2003 - 7:41 pm UTC
 
 
In your original response you stated there are some statistics that DBMS_STATS does not collect:
"...but not the number of chained rows, average free 
space, or number of unused data blocks"
I would like to switch to using DBMS_STATS for CBO statistics on my 9i database, but I also want to collect the statistics that DBMS_STATS does not; I have monitoring software that collects and uses chained rows, avg free space and unused data blocks. 
Can I use Analyze first and then DBMS_STATS (in that order) to accomplish this? 
The reasoning would be that first using Analyze will collect the data that DBMS_STATS does not; then DBMS_STATS will overwrite the CBO statistics that Analyze collected, but leave alone the other Analyze data.
Will that work?
 
 
May       17, 2003 - 10:02 am UTC 
 
 
it would work but it would be like using two cars to goto work -- it would be double the work in most cases (partitioned tables being the exception).
can the monitoring software use "list chained rows" instead and can it use dbms_space to get the other info. 
 
 
 
Statistics for Partioned tables
Nathan, February  05, 2004 - 10:08 am UTC
 
 
Dear Tom, 
 Your answers have been comprehensive and helpful all the time, just another quick one ( hopefully) from me.
When we the Estimate_percent parameter and sent it to 0.00001 in DBMS_STATS.GATHER_TABLE_STATS , would it mean only that percent of the blocks be scanned for analysis ? Our gather_stats takes with estimate takes hours to complete , My Senior DBA is skeptical of the percentage option and reckons that it could be a bug , he concluded by watching V$session_longops ( as the same number of blocks were scanned ) . I have not checked it yet but apparently the percentage does work on non-partitioned tables . Is this true ? Is there a bug in oracle 9i release 1.0 software
Thanks and Regards
Vish 
 
February  05, 2004 - 7:18 pm UTC 
 
 
hows about you share the details -- such as the entire, full command you are using plus the type of table you are doing this to and some facts about the table itself.
0.00001 would be a "not so good number" to use, pretty darn small it would be. 
 
 
 
Analyze and DBMS_STATS
chandu singh, February  06, 2004 - 12:28 am UTC
 
 
Hi 
 i think it is a best to have the percentage of minium to 10% . is that right if not pls explain. 
 
February  07, 2004 - 12:10 pm UTC 
 
 
why -- whats the reasoning (not that you are wrong, but -- why do you think that :)
it could be 1%
it could be 10%
it could be 100%
it just depends on "how evenly distributed the data is in general".  If the data is large-ish AND pretty much evenly distributed, or you are not computing histograms -- 1% might be dandy.
Me, I like to compute whenever possible personally.  Only when it gets really too large -- and then estimate as high as you can (realizing of course anything over 49% is really compute in disguise) 
 
 
 
Analyze and DBMS_STATS
chandu singh, February  06, 2004 - 12:28 am UTC
 
 
Hi 
 i think it is a best to have the percentage of minium to 10% . is that right if not pls explain. 
 
 
DBMS_STATS.AUTO_SAMPLE_SIZE Issues
RyuO, February  06, 2004 - 12:14 pm UTC
 
 
My experience was that DBMS_STATS is wonderful, but requires a lot of wrapping. Here is the wrapper proc I evolved:
CREATE OR REPLACE PROCEDURE analyze_schema(
   p_schema_name IN VARCHAR2 := USER,
   p_Estimate_Pct IN PLS_INTEGER := NULL) IS
   -- like DBMS_UTILITY.ANALYZE_SCHEMA, but using DBMS_STATS
   c_proc_name          CONSTANT VARCHAR2(128) := UPPER('analyze_schema');
   c_Def_Estimate_Pct   CONSTANT PLS_INTEGER := 35;
   c_Min_Degree         CONSTANT PLS_INTEGER := 1;
   c_Def_Degree         CONSTANT PLS_INTEGER := LEAST(c_Min_Degree,DBMS_STATS.DEFAULT_DEGREE);
      -- default degree is somehow 32767 at the moment
   c_Def_Options        CONSTANT VARCHAR2(30) := 'GATHER AUTO';
      -- 'GATHER' means analyze all objects.
      -- 'GATHER AUTO' means only objects that have changed.
   v_Estimate_Pct PLS_INTEGER;
      -- v_Estimate_Pct can also be set to:
      --    DBMS_STATS.AUTO_SAMPLE_SIZE (Oracle sets this, but it has been known to be 0...)
      --    NULL (which means, look at everything, e.g., 'COMPUTE')
   v_schema_name        ALL_TABLES.OWNER%TYPE := NVL(UPPER(p_schema_name),USER);
   v_Msg_Text           VARCHAR2(2000);
BEGIN
   CASE p_Estimate_Pct
      WHEN NULL THEN
         v_Estimate_Pct := 100;
      WHEN 100 THEN
         v_Estimate_Pct := 100;
      ELSE
         IF DBMS_STATS.AUTO_SAMPLE_SIZE = 0 THEN
            IF p_Estimate_Pct BETWEEN 1 AND 99 THEN
               v_Estimate_Pct := p_Estimate_Pct;
            ELSE
               v_Estimate_Pct := c_Def_Estimate_Pct;
            END IF;
         ELSE
            v_Estimate_Pct := DBMS_STATS.AUTO_SAMPLE_SIZE;
         END IF;
   END CASE;
   DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname => v_schema_name,
      estimate_percent => v_Estimate_Pct,
      cascade => TRUE,
      degree => c_Def_Degree,
      options => c_Def_Options);
      -- "cascade" means analyze indexes too.
   v_Msg_Text := c_proc_name||'-> Schema['||v_schema_name||'], Option['||c_Def_Options||'], Sampling['||v_Estimate_Pct||'%]';
   DBMS_OUTPUT.PUT_LINE(v_Msg_Text);
EXCEPTION
   WHEN OTHERS THEN
      v_Msg_Text := c_proc_name||'-> EXCEPTION: '||SQLERRM;
      RAISE_APPLICATION_ERROR(-20199,v_Msg_Text);
END analyze_schema;
My questions are related to AUTO_SAMPLE_SIZE:
1. How come it is always 0? That's what I always see, generally in 9.2.0.3 and 9.2.0.4 on Solaris.
2. What should the default estimate percentage be, if DBMS_STATS can't tell me? Anecdotal evidence suggests that the ANALYZE/ESTIMATE percentage should be 35-40%, but then ANALYZE and DBMS_STATS are different... 
 
February  07, 2004 - 1:34 pm UTC 
 
 
requires....
OR "permits".
it requires NO WRAPPING.
It quite handily permits it easily.
you'll have to provide me a reproducible, small, simple, yet 100% complete test case to demonstrate what you mean by "0"
ops$tkyte@ORA9IR2>  exec dbms_stats.gather_table_stats( user,'BIG_TABLE',estimate_percent => dbms_stats.auto_sample_size );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select sample_size from user_tables where table_name = 'BIG_TABLE';
 
SAMPLE_SIZE
-----------
     100000
(which also shows "does not require", but would "permit" wrapping.... ) 
 
 
 
 
understanding sample_percent
A reader, February  25, 2004 - 3:55 pm UTC
 
 
Oracle 9.2.0.2
I observed that when estimate_percent is specified, the gather_schema_stats is still doing a compute statistics. The sample_size=num_rows. Am I incorrect in concluding that its ignoring the 25% and estimating 100% of rows ? If so, why ?
OR have I overlooked at any setting ?
Please look at the following Case :
------- # of rows from T --------
SQL> select count(*) from T
  COUNT(*)
  --------
    600980
---- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME            MON   NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T                     YES     600980      600980 15-39-16
------ inserting data ----------
SQL> insert into T select * from T
600980 rows created.
SQL> commit;
Commit complete.
--------- row count from T ---------
SQL> select count(*) from T;
  COUNT(*)
----------
   1201960
-------- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME            MON   NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T                     YES     600980      600980 15-39-16
---------- gather_schema_stats ---------
begin
  dbms_stats.gather_schema_stats(
       ownname          => 'TEST',
       estimate_percent => 25,
       method_opt       => 'FOR ALL INDEXED COLUMNS SIZE 1',
       degree           => 2,
       cascade          => FALSE,
       options          => 'GATHER AUTO'
       );
  end;
/ 
PL/SQL procedure successfully completed.
-------- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME            MON   NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T                     YES    1201960     1201960 15-48-45
----------------------------------- 
 
 
February  25, 2004 - 7:13 pm UTC 
 
 
straight from the docs here:
GATHER AUTO: Gathers all necessary statistics automatically. Oracle implicitly determines which objects need new statistics, and determines how to gather those statistics. When GATHER AUTO is specified, the only additional valid parameters are ownname, stattab, statid, objlist and statown; all other parameter settings are ignored. Returns a list of processed objects.
 
 
 
 
got it ... but
A reader, February  26, 2004 - 3:38 pm UTC
 
 
Tom, I understand it better now. 
I took out the invalid parameters.
But my question still lingers :
Why is the gather_schema_stats analyzing all 100% of rows. I changed only 10%(50K of rows) out of 450K rows.
My dbms_job wakes up every few days to execute gather_schema_stats. Now, if I have 10-15 tables which have changed by more than 10%, then analyzing these tables 100%(compute) will take lot of time.
I know oracle automatically calculates how much percentage to analyze. But everytime why is it analyzing all 100% of the rows? Why is it not estimating ? 
=========
begin
   dbms_stats.gather_schema_stats(
         ownname          => 'TEST',
         options          => 'GATHER AUTO'
         );
    end;
/
==============
 
 
February  26, 2004 - 4:24 pm UTC 
 
 
it does not take that long to analyze 1,000,000 rows.
gather stale does not re-analyze just "changed rows"
gather stale re-analyzes the entire object.
If you don't want AUTO, don't use AUTO, use gather stale and fill in the remaining blanks (but for such small tables, i would get it all) 
 
 
 
hmmm
A reader, February  27, 2004 - 2:52 pm UTC
 
 
Hi Tom,
the thing am fighting is I like to use "stale" because I can specify the percentage to analyze and take advantage of the degree. AUTO is good because it automatically analyzes newly added tables.
Best of both worlds would be :
1) analyze newly added tables -> AUTO does this. STALE doesn't.
2) analyze only percentage of rows, not 100% -> AUTO is  analyzing 100%. STALE can do percentage.
Is it possible ?
Thanks as always 
 
February  27, 2004 - 3:07 pm UTC 
 
 
sounds like you might want:
a) alter schema tab monitoring
b) gather empty
c) gather stale
for your degree of control.  consider:
ops$tkyte@ORA920PC> create table t1 as select * from all_users;
 
Table created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec dbms_stats.alter_schema_tab_monitoring
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1
 
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather empty' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                     45
<b>that picked it up, this</b>
 
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                     45
<b>did nothing..</b>
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create table t2 as select * from all_users;
 
Table created.
 
ops$tkyte@ORA920PC> insert into t1 select * from t1;
 
45 rows created.
 
ops$tkyte@ORA920PC> insert into t1 select * from t1;
 
90 rows created.
 
ops$tkyte@ORA920PC> insert into t1 select * from t1;
 
180 rows created.
 
ops$tkyte@ORA920PC> commit;
 
Commit complete.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec dbms_stats.alter_schema_tab_monitoring
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                     45
T2
 
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather empty' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                     45
T2                                     45
<b>that got t2 and...</b>
 
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
 
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                    360
T2                                     45
 
ops$tkyte@ORA920PC>
<b>that got t1...</b>
 
 
 
 
 
perfecto
A reader, February  27, 2004 - 10:56 pm UTC
 
 
pretty GOOD !
Thank you much.
Have a good weekend. 
 
 
gather_schema_stats('SYS');
A reader, April     27, 2004 - 9:46 am UTC
 
 
Tom,
Please let us know, when we need to do the following
execute dbms_stats.gather_schema_stats('SYS');
at what point of time, do you recommend us executing the above sql
Thanks a lot 
 
April     28, 2004 - 12:40 pm UTC 
 
 
in your test system before going production, to ensure you are not going to mess up an existing system.
in 10g, it'll be done for you out of the box.
in 9i, it is "ok and safe in most cases to do" -- but TEST it, don't just DO IT
in 8i and before, I would not do it, not necessary and probably would lead only to troubles.  but again, you would TEST it of course. 
 
 
 
Confusing situation
Bruno Di Rei Araujo, May       14, 2004 - 2:21 pm UTC
 
 
Tom,
I've read you're advice about utilizing DBMS_STAT instead of ANALYZE statement. But, I had an experience that makes me get confused: with analyze, the cost got Higher, but the response time was the best; with DBMS_STATS, little tiny costs, but 12 times the response time (comparing TKPROF results). Is there any reason for that situation?
(TKPROF of ANALYZEd query)
********************************************************************************
SELECT a.seq_mov, a.empresa, a.filial, a.LOCAL, a.produto, a.qtde_mov,
a.vlr_tot_mov, a.num_docto, a.cod_oper, a.dt_mov, a.historico,
a.lote_cont, a.lcto_direto, a.disp_trigger, a.dt_registro,
a.qtde_val,
b.entrada_saida
FROM ce_movest a, ce_operacoes b
WHERE (:b5 IS NULL OR a.empresa = :b5)
AND (:b4 IS NULL OR a.filial = :b4)
AND (:b3 IS NULL OR a.produto = :b3)
AND (:b2 IS NULL OR a.dt_mov >= :b2)
AND (:b1 IS NULL OR a.dt_mov <= (TO_DATE (:b1) + .99999))
AND b.empresa = a.empresa
AND b.cod_oper = a.cod_oper
order by a.empresa, a.filial, a.produto, a.dt_mov, b.ajuste_estoque,
      b.entrada_saida /*decode(b.entrada_saida, 'E', decode(b.transferencia, 'S', 3, 1), 2)*/,
    a.local, a.qtde_mov
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.27       5.99         55      19061          0           4
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.27       5.99         55      19061          0           4
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 47  
Rows     Row Source Operation
-------  ---------------------------------------------------
      4  SORT ORDER BY (cr=19061 r=55 w=0 time=5997116 us)
      4   HASH JOIN  (cr=19061 r=55 w=0 time=5996843 us)
    103    INDEX FULL SCAN CE_OPERACOES_AK1 (cr=1 r=0 w=0 time=288 us)(object id 553481)
      4    TABLE ACCESS FULL CE_MOVEST (cr=19060 r=55 w=0 time=5992324 us)
********************************************************************************
(TKPROF of DBMS_STATSed)
********************************************************************************
SELECT a.seq_mov, a.empresa, a.filial, a.LOCAL, a.produto, a.qtde_mov,
a.vlr_tot_mov, a.num_docto, a.cod_oper, a.dt_mov, a.historico,
a.lote_cont, a.lcto_direto, a.disp_trigger, a.dt_registro,
a.qtde_val,
b.entrada_saida
FROM ce_movest a, ce_operacoes b
WHERE (:b5 IS NULL OR a.empresa = :b5)
AND (:b4 IS NULL OR a.filial = :b4)
AND (:b3 IS NULL OR a.produto = :b3)
AND (:b2 IS NULL OR a.dt_mov >= :b2)
AND (:b1 IS NULL OR a.dt_mov <= (TO_DATE (:b1) + .99999))
AND b.empresa = a.empresa
AND b.cod_oper = a.cod_oper
order by a.empresa, a.filial, a.produto, a.dt_mov, b.ajuste_estoque,
      b.entrada_saida /*decode(b.entrada_saida, 'E', decode(b.transferencia, 'S', 3, 1), 2)*/,
    a.local, a.qtde_mov
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     13.03      74.99          0     104482          0           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     13.03      75.00          0     104482          0           2
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 47  
Rows     Row Source Operation
-------  ---------------------------------------------------
      2  SORT ORDER BY (cr=104482 r=0 w=0 time=74999502 us)
      2   TABLE ACCESS BY INDEX ROWID CE_MOVEST (cr=104482 r=0 w=0 time=74999276 us)
1276536    NESTED LOOPS  (cr=4901 r=0 w=0 time=29219344 us)
    103     TABLE ACCESS FULL CE_OPERACOES (cr=3 r=0 w=0 time=728 us)
1276432     INDEX RANGE SCAN CE_MOVEST_FK3 (cr=4898 r=0 w=0 time=10600999 us)(object id 29012)
********************************************************************************
As we can see, the time for the query havin tables DBMS_STATSed were 75 seconds; when ANALYZEd, it took about 6 seconds... Is there a reason for that?
Thanks,
Bruno 
 
May       15, 2004 - 12:14 pm UTC 
 
 
umm, exact commands used please? 
 
 
 
Related questions for a data load(9.2.0.4.0)
Matt, May       16, 2004 - 9:39 pm UTC
 
 
I have a third party system that initially has no data. I need to load alot of data into this system. The product provides a number of interface tables into which I can load my data. I then initiate a third party supplied procedure that will pick up the data in the interface tables, validate  and "interalize"  the data.
I want the load to run efficiently. I believe my best bet is to re-generate stats (on the products internal tables) whilst the load is in progress (a 10-20 hrs process). Here are my questions:
Will open cursors necessarily be reparsed when the stats have been generated?
I plan to also generate system stats (io, cpu etc). I am also wondering about generating stats on 'SYS' and 'SYSTEM'. Is this recommended on 9.2.0.4.0?
Many Thanks.
 
 
May       17, 2004 - 7:19 am UTC 
 
 
"Will open cursors necessarily be reparsed when the stats have been generated?"
if not, why bother generating stats at all -- they will have their entire working set already cached in memory (in the shared pool).  if we did not invalidate the cursors, the stats would "not be used".  dbms_stats does have an invalidate flag you can use to control that --- but, I would ask the vendor for their best practice with their product (make sure they support the cbo first and foremost)
as long as you always do gather stats on sys and system, it is definitely a supported option in 9ir2 -- in 10g, it will be "the only way" in fact. 
 
 
 
What is the difference between the following options..
Matt, May       28, 2004 - 10:20 am UTC
 
 
to dbms_stats.gather_schema_stats:
method_opt => 'FOR ALL INDEXED COLUMNS'
and
method_opt => 'FOR ALL COLUMNS SIZE AUTO' ?
I just drastically improved the performance of a query by using the first. The first appears to have difference use different bucket sizes and generates more stats. Can you please comment on what should be "best practice" as far as generating stats is concerned?
Many Thanks.
I've noted the follwing difference in stats:
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 VMETERID                                  NOT NULL NUMBER(38)
 CUI_TID                                   NOT NULL NUMBER(38)
 INTERVAL_ENDDT                            NOT NULL DATE
 INTERVAL_STARTDT                          NOT NULL DATE
 VALUE                                              FLOAT(126)
 READ_TYPE_FLAG                                     VARCHAR2(5)
 TRANS_TID                                          NUMBER(38)
 TRANS_UID                                          NUMBER(38)
 TRANS_STARTTS                                      DATE
method_opt => 'FOR ALL INDEXED COLUMNS':
=======================================
COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS NUM_BUCKETS          DENSITY LAST_ANALYZED                                   
------------------------------ ------------ ---------- ----------- ---------------- -------------------                             
CUI_TID                                   1          0           1     1.0000000000 05/28/2004 23:03:14                             
INTERVAL_ENDDT                          796          0           1      .0012562814 05/28/2004 23:03:14                             
INTERVAL_STARTDT                        792          0           1      .0012626263 05/28/2004 23:03:14                             
READ_TYPE_FLAG                            3          0           1      .3333333333 05/28/2004 23:03:14                             
TRANS_STARTTS                          7894          0           1      .0001266785 05/28/2004 23:03:14                             
TRANS_TID                                 1          0           1     1.0000000000 05/28/2004 23:03:14                             
TRANS_UID                                 1          0           1     1.0000000000 05/28/2004 23:03:14                             
VALUE                                 57746          0           1      .0000173172 05/28/2004 23:03:14                             
VMETERID                             487822          0           1      .0000020499 05/28/2004 23:03:14    
method_opt => 'FOR ALL COLUMNS SIZE AUTO':
=========================================
 Name                                                                     Null?    Type
 ------------------------------------------------------------------------ -------- -------------------------------------------------
 VMETERID                                                                 NOT NULL NUMBER(38)
 CUI_TID                                                                  NOT NULL NUMBER(38)
 INTERVAL_ENDDT                                                           NOT NULL DATE
 INTERVAL_STARTDT                                                         NOT NULL DATE
 VALUE                                                                             FLOAT(126)
 READ_TYPE_FLAG                                                                    VARCHAR2(5)
 TRANS_TID                                                                         NUMBER(38)
 TRANS_UID                                                                         NUMBER(38)
 TRANS_STARTTS                                                                     DATE
INDEXES FOR TABLE: VMC_DATA
================================================
INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION                                                       
------------------------------ ------------------------------ ---------------                                                       
PKVMC_DATA                     VMETERID                                     1                                                       
                               CUI_TID                                      2                                                       
                               INTERVAL_STARTDT                             3                                                       
TABLE STATISTICS FOR : VMC_DATA
===================================================
  NUM_ROWS     BLOCKS AVG_ROW_LEN LAST_ANALYZED                                                                                     
---------- ---------- ----------- -------------------                                                                               
   4680488      35452         100 05/28/2004 22:27:10                                                                               
COLUMN STATISTICS FOR : VMC_DATA
====================================================
COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS NUM_BUCKETS          DENSITY LAST_ANALYZED                                   
------------------------------ ------------ ---------- ----------- ---------------- -------------------                             
CUI_TID                                   1          0           1     1.0000000000 05/28/2004 22:27:10                             
INTERVAL_STARTDT                        792          0          68      .0012626263 05/28/2004 22:27:10                             
VMETERID                             487489          0          75      .0000031279 05/28/2004 22:27:10  
 
 
Confusing Situation (Continued)
Bruno Di Rei Araujo, June      05, 2004 - 1:16 pm UTC
 
 
Tom,
As you asked for (from my Review above), the exact commands were:
DBMS Stats
----------
exec dbms_stats.delete_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.delete_table_stats('SAGE', 'CE_OPERACOES');
exec dbms_stats.gather_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.gather_table_stats('SAGE', 'CE_OPERACOES');
ANALYZE
-------
exec dbms_stats.delete_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.delete_table_stats('SAGE', 'CE_OPERACOES');
analyze table ce_movest delete statistics;
analyze table ce_operacoes delete statistics;
analyze table ce_movest estimate statistics for table for all indexes for all indexed columns;
analyze table ce_operacoes estimate statistics for table for all indexes for all indexed columns;
Note that my approach was, first, remove all relevant statistics for the method used and then gather them brand new.
Just can't understand why is it so different... Can you see any reason for it?
Thanks again,
Bruno. 
 
June      05, 2004 - 1:40 pm UTC 
 
 
add 
cascade => true 
to the dbms_stats and retry (so you compare apples to apples).  you have no index stats. 
 
 
 
Statistics
Aru, September 21, 2004 - 2:34 am UTC
 
 
Hi Tom,
Please can you clarify in what circumstances and why would you use the 3 different(listed below ) methods to analyze tables,
1). a@ORA920LAP> analyze table t compute statistics
    for table for all indexes for all indexed columns;
2). a@ORA920LAP> analyze table t compute statistics;
3). dbms_stats(....
I,m using oracle8i.
Q1) Why would I need to compute stats for 
all indexed columns when the whole table is getting 
analyzed.
Q2) Does dbms_stats analyze the table, it's indexes and the indexed columns too or what exactly does it do.If I specify a schema then does it to all the above mentioned statistics.
Thanks lots as always Tom,
Regards,
Aru.
 
 
September 21, 2004 - 7:43 am UTC 
 
 
1) i would not (do not) -- i have old examples that do but promise never to do it in an example again
2) see #1
3) yes.
q1) because you wanted to have histograms for all indexed columns (not saying you do, but if you did "for all indexed columns" that is the outcome -- you have histograms for indexed columns)
q2) see the documentation -- it'll tell you that method_opt defaults to 'for all columns size 1' and cascade (to indexes) defaults to FALSE, so you would have to pass TRUE to get indexes done. 
 
 
 
cascade=true -- > local indexes and global indexes?
Pravesh Karthik from India, January   02, 2005 - 7:09 am UTC
 
 
Tom,
For a partitioned table, if i say cascade=true. It will collect stats on all local indexes and global indexes?
If yes,i dont need to use GATHER_INDEX_STATS - right?
Please confirm.
Thanks a lot
Pravesh Karthik  
 
January   02, 2005 - 11:05 am UTC 
 
 
depends on your granularity -- but left to the default, 
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
will get table, table partition, index, index partition statistics.
 
 
 
 
if yes -- then index statistics in which mode?
Pravesh Karthik from India, January   02, 2005 - 7:51 am UTC
 
 
January   02, 2005 - 11:14 am UTC 
 
 
that is a broad sweeping generalization that is not really accurate (or well stated -- "... to COMPUTE index statistics in ESTIMATED mode..." -- bummer.  You either COMPUTE or you ESTIMATE)
and to say the CBO is almost based on index statistcs -- well, I sent an email to the author of the note asking them to remove that paragraph. 
 
 
 
awesome !!
Pravesh Karthik from India, January   02, 2005 - 11:37 am UTC
 
 
 
 
chong, January   03, 2005 - 2:49 am UTC
 
 
Hi Tom,
1)exec dbms_stats.gather_database_stats(method_opt=>'FOR ALL INDEXED COLUMNS',degree=>7,cascade=>true);
2)exec dbms_stats.gather_database_stats(method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO',degree=>7,cascade=>true);
Is the first command will collecting histogram? If yes what is the different between first and second command?
thanks 
 
January   03, 2005 - 8:45 am UTC 
 
 
see
  http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:5792247321358#7417227783861  
they are very different.  size auto sort of needs a historical workload to peek at (queries) as well -- that is part of it's algorithm.  
consider:
ops$tkyte@ORA9IR2> create table t as select * from all_objects;
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx1 on t(object_id);
 
Index created.
 
ops$tkyte@ORA9IR2> create index t_idx2 on t(owner);
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns', cascade => true );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
 
COLUMN_NAME            COUNT(*)
-------------------- ----------
OBJECT_ID                    76
OWNER                        22
 
<b>histograms collected for both...</b>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size auto', cascade => true );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
 
COLUMN_NAME            COUNT(*)
-------------------- ----------
OBJECT_ID                     2
OWNER                         2
<b>no histograms -- just hi/lo values and such.  there were no queries that ever referenced either column so histograms were not collected</b>
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size skewonly', cascade => true );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
 
COLUMN_NAME            COUNT(*)
-------------------- ----------
OBJECT_ID                     2
OWNER                        22
 
<b>it looked at the columns and said "object id, unique -- no histograms needed there", owner -- not so unique and pretty skewed, lets get them</b>
But see that link, the meanings are defined there. 
 
 
 
 
chong, January   03, 2005 - 10:26 pm UTC
 
 
Hi Tom,
Based on the link, size auto means:
It'll collect histograms in memory only for those columns which are used by your applications (those columns appearing in a predicate involving an equality, range, or like operators). we know that a particular column was used by an application because at parse time, we'll store workload information in SGA. Then we'll store histograms in the data dictionary only if it has skewed data (and it worthy of a histogram)
What i understand from the above definition is (please correct me if i'm wrong), oracle will only collect histogram on those tables' columns which are ever been used since database startup and has skewed data, it will not collect histogram for others tables' columns that never been used since database startup even skewed data exists. If my understanding is true, how long oracle will store those table parsing information in SGA (so that dbms_stats will know which column to analyze)? Does it means this option is not appropriate for a fresh database which is just starting up? 
I see some of the oracle notes highly recommend to used this option to analyze the histogram information, is it true?
 
 
January   03, 2005 - 10:45 pm UTC 
 
 
traveling right now but this is worthy of followup later after some research - so, check back maybe later this week.
(see, I don't know everything off the top of my head :)
Update:
looks like sys.col_usage$, it is persistently stored in the dictionary.  The name of the columns that represent "flags" in there says it all.
 
 
 
 
dbms_stats.gather_schema_stats - Oracle9i
Altieri, January   04, 2005 - 4:19 pm UTC
 
 
Tom,
    I'm using dbms_stats.gather_schema_stats( ownname=>'XXX') to analyze a whole schema. 
    I dont want to analyze just a table in this schema. What I have to do ?. 
    I have been verifying all parameters, but and I didnt find any options yet. 
Thanks,
Altieri
 
 
January   05, 2005 - 9:06 am UTC 
 
 
there is no "anti gather" parameter.
you would have to use gather table stats on the tables you wanted stats on, or move the table to another schema.
seems strange to not want stats on a single table doesn't it? 
 
 
 
How to determine if statistics were collected by ANALYZE?
Alexander, January   07, 2005 - 3:19 pm UTC
 
 
Tom, how to determine if the statistics were collected by ANALYZE command or by the package?
 
 
January   08, 2005 - 4:15 pm UTC 
 
 
if you look at columns that are filled in by ANALYZE but not dbms_stats - avg_space is one - you might be able to tell if ANALYZE was at some time in the past used:
ops$tkyte@ORA9IR2> create table t as select * from all_Users;
 
Table created.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
 
 AVG_SPACE
----------
         0
 
ops$tkyte@ORA9IR2> analyze table t compute statistics;
 
Table analyzed.
 
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
 
 AVG_SPACE
----------
      7488
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
 
 AVG_SPACE
----------
      7488
but dbms_stats won't overwrite the value -- so it is not conclusive proof.
 
 
 
 
 
followup on 3 jan 2005
chong, January   11, 2005 - 2:19 am UTC
 
 
Hi Tom,
Do you find out any research on 3rd jan 2005 thread which relate to dbms_stats with size auto option? 
thanks 
 
 
dbms stats failure
Rob H, January   11, 2005 - 12:18 pm UTC
 
 
I have the database setup (9i) to gather statistics however I am running into a strange error
BEGIN DBMS_STATS.GATHER_DATABASE_STATS(cascade=>TRUE); END;
*
ERROR at line 1:
ORA-00904: : invalid identifier 
ORA-06512: at "SYS.DBMS_STATS", line 9375 
ORA-06512: at "SYS.DBMS_STATS", line 9857 
ORA-06512: at "SYS.DBMS_STATS", line 10041 
ORA-06512: at "SYS.DBMS_STATS", line 10134 
ORA-06512: at "SYS.DBMS_STATS", line 10114 
ORA-06512: at line 1 
I don't understand what 
"ORA-00904: : invalid identifier" means with nothing between the 2 colon's.  Any ideas?  I think I know the schema its failing in. 
 
January   11, 2005 - 1:48 pm UTC 
 
 
trace it.
any 'bad' sql will be in the trace file.  if you think you know the schema, just gather schema stats and see if it repros in there, tracefile would be smaller. 
 
 
 
structure not sql
Rob H, January   11, 2005 - 1:56 pm UTC
 
 
Because I'm doing an analyze, isn't it an issue of object structure, not sql?  Regardless, I am tracing it now. 
 
January   11, 2005 - 2:03 pm UTC 
 
 
dbms_stats runs sql
ora-904 is raised from sql. 
 
 
 
interesting....
Rob H, January   12, 2005 - 12:21 pm UTC
 
 
after analyzing the tkprof for the failure I found that the table that was failing.  What was interesting was that I found that the table being analyzed was failing on a specific index.  One that is incredibly old (also note that table contains hourly transactions in the realm of only 6 million rows).  Dropping the index and using dbms_stats again worked.  Adding the index back and using dbms_stats worked.
TKPROF
********************************************************************************
select /*+ cursor_sharing_exact dynamic_sampling(0) no_monitoring no_expand 
  index(t,"INDX_TAB_CAMP") */ count(*) as nrw,count(distinct 
  sys_op_lbid(30519,'L',t.rowid)) as nlb,count(distinct "CAMP") as ndk,
  sys_op_countchg(substrb(t.rowid,1,15),1) as clf 
from
 "KCAGENT"."TAB_ALL"  sample block 
  (5.77406383513445589344056295551646142247)  t where "CAMP" is not null
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          1         14          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          1         14          0           0
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 43     (recursive depth: 1)
Rows     Row Source Operation
-------  ---------------------------------------------------
      0  SORT GROUP BY (cr=0 r=0 w=0 time=0 us)
      1   INDEX SAMPLE FAST FULL SCAN INDX_TAB_ALL_ID_DATE (cr=14 r=1 w=0 time=6851 us)(object id 49563)
********************************************************************************
But the index in the hint is different than the index in the explain.  
Regardless, the immediate problem is solved, thank you. 
 
 
will the statistics be automatically be updated ?
Chenna, January   30, 2005 - 12:18 pm UTC
 
 
I'm interested in knowing in the following
1.Lets say table t has 1000 rows
2.Generate statistics for table t and the associated indexes.
3.Now lets truncate table t.
--will the statistics be automatically be updated ?
4.In the next scenario , lets delete all the rows without truncating.
--will the statistics be automatically be updated ?
If they dont can we do something to update the statistics immedietly based on a certain operation.
Thanks
 
 
January   30, 2005 - 12:36 pm UTC 
 
 
when you tried it, what happened?  I mean, the scenario you scoped out here would be pretty *trivial* to test?  That is what I'm trying to teach here -- you "spec"ed out the question very well, the test case you would need to see would be very easy.
statistics are gathered when you ask us to gather them, or some job you run gathers them.   
 
 
 
Very interesting discussion.
Bob, May       22, 2005 - 9:57 pm UTC
 
 
Thanks everyone.
My question - 10g's "black box" statistics collection mechanism, does it collect histograms and if so, through what parameters?
I've been reasonably satisfied with the execution plans we're getting and collecting system statistics in particular has been a boon.  However there still are some zingers and they typically involve tables with what I know to be rather skewed data distribution on joined columns.  I'd like to keep the process as "black box" as possible (our DBAs have bigger problems to worry about) so I've been wondering how I could verify and/or incorporate histogram collection into the black box, or whether I need to disable the auto collection and wrap up my own.
Thanks in advance - this forum is great! 
 
May       23, 2005 - 8:21 am UTC 
 
 
it uses "auto" and "auto" uses sys.col_usage$ to determine what columns have predicates applied to them and how the predicates use the column (equality, range scans.
so the set of columns that histograms are collected on can vary over time by simply introducing a new query into the system that "where's" or joins on columns that were never used that way before. 
 
 
 
A reader, May       25, 2005 - 8:39 am UTC
 
 
Once a table is set up as Monitoring, how do I disable monitoring, if I need to? 
 
May       25, 2005 - 11:43 am UTC 
 
 
ops$tkyte@ORA9IR2> alter table t monitoring;
                                                                                
Table altered.
                                                                                
ops$tkyte@ORA9IR2> alter table t NOmonitoring;
                                                                                
Table altered.
 
 
 
 
 
Ravi, May       25, 2005 - 8:39 am UTC
 
 
How do I disable MONITORING on a table? 
 
 
DBMS_STATS with low sample size produces wrong DISTINCT_KEYS on nonunique index
Marco Coletti, May       27, 2005 - 3:59 pm UTC
 
 
DBMS_STATS seems very unaccurate at estimating number of distinct keys on nonunique indexes.
I think it could be a bug.
/*
  Test case:
  DBMS_STATS gives wrong DISTINCT_KEYS for nonunique index   
*/
select * from V$VERSION;
/*
BANNER
------------------------------------------------------------Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE    9.2.0.6.0       Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
*/
create table NUMBERS as
select mod(N,2000001) N from (select ROWNUM N from DUAL connect by ROWNUM <= 6000000)
;
create index IX_N on NUMBERS (N) compute statistics;
select count(distinct N) from NUMBERS;
/*
COUNT(DISTINCTN)
----------------
         2000001
*/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
      2000001     6000000 2005-05-25 16:25:12
*/
-- begin bug ------------------------------------
begin
  DBMS_STATS.GATHER_INDEX_STATS (
    ownname => user, 
    indname => 'IX_N',
    estimate_percent => 20,
    degree => 1);
end;
/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
       414785     1188855 2005-05-25 16:27:39
*/
-- end bug ---------------------------------------
analyze index IX_N estimate statistics sample 20 percent; 
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
      2023171     1415367 2005-05-25 16:47:17
*/
begin
  DBMS_STATS.GATHER_INDEX_STATS (
    ownname => user, 
    indname => 'IX_N',
    estimate_percent => NULL,
    degree => 1);
end;
/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
      2000001     6000000 2005-05-25 16:48:53
*/
 
 
May       27, 2005 - 5:32 pm UTC 
 
 
it is being computed the way the intended (believe it or not).
do you have a compelling case where by the number of distinct values results in a really wrong plan?
There is an enhancement request in to have a "re-look" at this, but if you have a compelling "bad plan", I'll be glad to take a look and see if we cannot get that into the system. 
 
 
 
No compelling bad plan due to bad NDV for nonunique index
Marco Coletti, May       31, 2005 - 9:25 am UTC
 
 
We have lots of almost-unique indexes in our system.
Probably this is a simptom of bad design, but in development we were heavily constrained by the requirements, which are basically:
* a linear chain of four external systems that exchange messages bidirectionally about entities via Tibco bus; the systems do not employ an entity omogeneus unique key, but only entity unique keys specific for each system (i.e. each system marks an entity with its own key format), and correlation keys from system to system
* collect entities and related events from these systems intercepting the messages in near real-time
* aggregate the events every 1 hour
* make a entity flat warehouse table to be queried by a GUI; refresh it every 1 hour
* delete "dead" entities each day
* the average number of live entities is about 3.5 million per system
I don't remember a specific query which performs bad due to this problem.
 
 
May       31, 2005 - 10:20 am UTC 
 
 
that is the issue though.
dbms_stats uses block sampling to quickly guestimate these statistics.  the block sampling tends to read "sorted data" in the index.  So, the number of distinct values is hard to compute.  Analyze on the other hand used a totally different technique.
But unless you can show this is leading to "bad things happening", that the NDV is off materially affects somethnig -- there isn't an issue.  The sampling is happening the way they intended it to happen, the bug was opened and closed on this.  They would need a compelling case in order to re-look at it.  
Is NDV accurate?  No, observably so.
Is that fact an issue?  That is what they would need to see. 
 
 
 
External tables
A reader, June      06, 2005 - 9:49 am UTC
 
 
We thought we'd create a DDL trigger to automatically turn on Monitoring for new tables, but this DID NOT work for External tables.
Is there some way, (considering it is an AFTER trigger) to know, for SURE, that the current table
1)Is NOT an External table and set them to monitoring
or
2)IS an External table and dont set Monitoring for it?
CREATE OR REPLACE TRIGGER object_change AFTER
CREATE ON SCHEMA
DECLARE
alt_mont varchar2(1000);
BEGIN
 if (ora_sysevent in  ('CREATE') and
                        ora_dict_obj_type = 'TABLE')  then
 alt_mont :='ALTER TABLE ' ||ora_dict_obj_name||' MONITORING ';
  execute immediate alt_mont;
 end if;
end;
/ 
 
June      06, 2005 - 10:50 am UTC 
 
 
either query the dictionary or just "try" and catch the exception (expect it, ignore it) 
 
 
 
A reader, June      06, 2005 - 2:12 pm UTC
 
 
Thanks, accept catching exception is a brilliant solution, but  can't find anything in the Data Dictionery that suggests the table being created is an External one.
By looking at xxx_objects or xxx_tables, what is the DEFINITIVE way to categorise Tables? 
 
June      06, 2005 - 3:04 pm UTC 
 
 
ops$tkyte@ORA9IR2> select * from user_external_tables;
 
no rows selected
 
 
 
 
 
replacing  analyze with dbms_stats  in  9.2.0.6
James K. Purtzer, July      18, 2005 - 1:09 pm UTC
 
 
Currently using:
DBMS_UTILITY.ANALYZE_SCHEMA('ESD', 'COMPUTE');
It takes about 70 minutes to run on about 30 Gb of Data warehouse weekly.
Would like to replace with:
dbms_stats.gather_schema_stats(ownname=>'ESD',
estimate_percent=>dbms_stats.auto_sample_size,
method_opt=>'AUTO', cascade=>TRUE);
1. I'm struggling with how to assess impact on the production server. I can run the new
query on an RMAN restore of the data, without a representative connection load. Is that a fair test per your earlier recommendations? 
2. Does the old analyze data need to be dropped before I run the new statement? 
3.Looking at the DBMS_STATS statement above, as its specified
It will only gather stale statistics from objects the CBO is using?
 
 
July      18, 2005 - 1:41 pm UTC 
 
 
1) I'd be looking to compare plans.  Biggest thing will be auto in the method_opt, in order for that to work, you need to have had some queries going over time.  auto (you have the syntax wrong, see below) will not work "good" until it knows the predicates you use:
ops$tkyte-ORA9IR2> create table ttt as select object_name, owner, object_type from all_objects;
 
Table created.
 
ops$tkyte-ORA9IR2> create index t_idx1 on ttt(owner);
 
Index created.
 
ops$tkyte-ORA9IR2> exec dbms_stats.gather_schema_stats( ownname=> user, estimate_percent=>dbms_stats.auto_sample_size, method_opt=>'for all columns size AUTO',cascade=>true );
 
PL/SQL procedure successfully completed.
 
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> set autotrace traceonly explain
ops$tkyte-ORA9IR2> select * from ttt t1 where owner = 'SCOTT';
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1271 Bytes=48298)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TTT' (Cost=12 Card=1271 Bytes=48298)
   2    1     INDEX (RANGE SCAN) OF 'T_IDX1' (NON-UNIQUE) (Cost=3 Card=1271)
 
<b>see the card = 1271</b> 
 
ops$tkyte-ORA9IR2> set autotrace off
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'TTT' group by column_name;
 
COLUMN_NAME                      COUNT(*)
------------------------------ ----------
OBJECT_NAME                             2
OBJECT_TYPE                             2
OWNER                                   2
 
<b>only two buckets, no histograms really</b>
ops$tkyte-ORA9IR2> exec dbms_stats.gather_schema_stats( ownname=> user, estimate_percent=>dbms_stats.auto_sample_size, method_opt=>'for all columns size AUTO',cascade=>true );
 
PL/SQL procedure successfully completed.
 
ops$tkyte-ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'TTT' group by column_name;
 
COLUMN_NAME                      COUNT(*)
------------------------------ ----------
OBJECT_NAME                             2
OBJECT_TYPE                             2
OWNER                                  22
 
<b>but it saw the predicate now, so it got them</b>
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> set autotrace traceonly explain
ops$tkyte-ORA9IR2> select * from ttt t2 where owner = 'SCOTT';
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=40 Bytes=1520)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TTT' (Cost=2 Card=40 Bytes=1520)
   2    1     INDEX (RANGE SCAN) OF 'T_IDX1' (NON-UNIQUE) (Cost=1 Card=40)
 
 
 
ops$tkyte-ORA9IR2> set autotrace off
ops$tkyte-ORA9IR2> spool off
<b>and that card= is more accurate</b>
2) no
3) no, it'll gather them all.  you would need to use the alter table T monitoring command and then do a gather stale option. 
 
 
 
 
so method_opt updates as queries run on new stats?
James K. Purtzer, July      18, 2005 - 3:26 pm UTC
 
 
How does the method_opt=>'for all column size AUTO' update as the CBO parses queries? Your example showed cardiality decreasing as more queries run on the table, correct?  
 
July      18, 2005 - 4:02 pm UTC 
 
 
ctl-f for col_usage$ on this page. 
 
 
 
transport stats to another table
valent, July      18, 2005 - 4:35 pm UTC
 
 
Hello Tom,
let's say i want to transport statistics from one big table to another. 
create table t2 as select * from big_table
big table has up to date statistics and i want to transport them to t2
import_table_stats will not work, because it uses same table name.
thanks for your option 
 
July      18, 2005 - 5:08 pm UTC 
 
 
you can use SET_TABLE_STATS  (you do know the stats will be different right -- t2 will be nicely compacted, no whitespace, the clustering factor on indexes will possibly be different.....)
the stats are not made to go from table to table.
but if you look at the stats table you export to, you'll see table names in there, it is not hard to figure out what you would have to update in the stats table after you export to it to touch up the table name. 
 
 
 
valent, July      19, 2005 - 4:22 am UTC
 
 
thank you Tom,
we have large reporting table (no updates) with 15+ million rows par partition, created with CTAS. Statistics should be very similar for my new table copy. 
 
July      19, 2005 - 7:34 am UTC 
 
 
go for it then 
 
 
 
suggestion..!
Reader, July      20, 2005 - 5:05 am UTC
 
 
Tom,
Every week we are analyzing the database schema using DBMS_STAT package as a batch process in our production environment. 
But the this job is getting failed often with ORA-01555 Snapshot too old.We could understand the cause of the problem that is,this analyze job and another job which is schedule in the batch running parallely getting clashed at some point and end with ORA-01555 error.Here my question is,if we re-run the DBMS_STAT package after it failed,its analyzing all the objects from the begining.But we want to analyze objects from the point it failed previously.Is there any option in DBMS_STAT package?.Pls advice.I hope you understand the scenario .
 
 
 
 
July      20, 2005 - 6:42 am UTC 
 
 
why don't we fix the problem?  the 1555?  make your UNDO permanently larger -- sized right to fullful the requirements of your system
the other path will be "write code" as dbms_stats doesn't have a "remember where you were and start over", we'd have to help it out that way. 
 
 
 
dbms_stats hanging,
sns, July      20, 2005 - 9:23 am UTC
 
 
Lately we are experiencing an issue with DBMS_STATS package when we try to analyze the table.
This is what we did:
BEGIN dbms_stats.gather_table_stats(ownname => 'EURO_TSTAT',  estimate_percent => 15,  tabname => 'CALL_SYSTEM_EURO',  block_sample => FALSE,  method_opt => 'FOR ALL INDEXED COLUMNS SIZE 1',  degree => 4,  granularity => 'ALL',  cascade => TRUE ); END;
What are the things we need to see while dbms_stats are in hung state?
Any steps to resolve this?
thanks, 
 
July      20, 2005 - 12:49 pm UTC 
 
 
you could trace it, you could query v$ views (eg: v$session_event) to see what it is waiting on
sql_trace with waits (10046 trace) could be used.
 
 
 
 
need to analyze index separately..!
Reader, August    06, 2005 - 10:17 pm UTC
 
 
Tom,
I have noted that gather schema stats would not analyze indexes until we specify cascade=>true,
is it true?
here the procesure I'm using,kindly give suggestion on this to do the normal process with indexes as well..
exec dbms_stats.gather_schema_stats( -
ownname          => 'OMSTNX', -
options          => 'GATHER AUTO', -
estimate_percent => dbms_stats.auto_sample_size, -
degree           => DBMS_STATS.DEFAULT_DEGREE  -
)
 
 
August    07, 2005 - 9:14 am UTC 
 
 
that is the documented purpose of cascade, yes.   
 
 
 
Trying to understand this behavior...
Mark, August    15, 2005 - 10:02 am UTC
 
 
Hi Tom,
We have this as a job run Weekly (Sunday, Early AM):
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4'
, estimate_percent => 40
, block_sample => FALSE
, method_opt => 'FOR ALL COLUMNS SIZE 1'
, degree => NULL
, granularity => 'DEFAULT'
, cascade => TRUE
);
Now, I came into work Monday and the performance was 'yukky', so I did this:
DECLARE
BEGIN
   EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_INSURANCE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
   EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_PROCEDURE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
   EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_PROCEDURE_BALANCE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
   EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_FINANCIAL_TRANSACTION ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
   EXECUTE IMMEDIATE 'ANALYZE TABLE ALL_NAME ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXCEPTION
   WHEN OTHERS
   THEN
      NULL;
END;
... And all was good again. Any ideas what is incorrect perhaps on the DBMS_STATS.GATHER_SCHEMA_STATS parameter settings, and could you suggest a good method using this procedure?
Thanks! 
 
August    15, 2005 - 4:00 pm UTC 
 
 
you would need to tell us what happened between sunday early am and monday morning.
and why you have a when other then null???????? 
erase that and stop using it, you want to just silently ignore ERRORS!!!!!???
I can say that the dbms_stats -- you told it "no histograms", and the analyze you let it "get histograms" on the indexed columns -- that jumps right out. 
 
 
 
Yes, ok.
Mark, August    17, 2005 - 2:36 pm UTC
 
 
Thanks Tom.
The Exception handler was quick adhoc code and I don't usually code that way.
Anyway, this is what I've done:
1) turned table monitoring on
2) Via JOB at 1AM:
/* Gather Stale Stats Nightly */
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4',
estimate_percent => 40, block_sample => FALSE, method_opt =>
'FOR ALL INDEXED COLUMNS', degree => NULL, granularity =>
'DEFAULT', options => 'GATHER STALE', cascade => TRUE);
3) Once per week (Sunday Early AM):
/* Gather All Stats Weekly */
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4',
estimate_percent => 40, block_sample => FALSE, method_opt =>
'FOR ALL INDEXED COLUMNS', degree => NULL, granularity =>
'DEFAULT', options => 'GATHER', cascade => TRUE);
This seems to give me a more consistent performance.
Thanks! 
 
 
collecting initial stats using dbms_stats
JK Purtzer, September 02, 2005 - 1:14 pm UTC
 
 
running the package mentioned above in my previous review
begin
dbms_stats.gather_schema_stats(ownname=> 'ESD',estimate_percent=>dbms_stats.auto_sample_size, 
method_opt=>'for all columns size 
AUTO',cascade=>true );
commit;
end;
I can't seem to verify with
</code>  
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/stats.htm#33117  <code>
SELECT TABLE_NAME, NUM_ROWS, BLOCKS, AVG_ROW_LEN,
       TO_CHAR(LAST_ANALYZED, 'MM/DD/YYYY HH24:MI:SS')
     FROM DBA_TABLES
     WHERE TABLE_NAME IN ('SO_LINES_ALL','SO_HEADERS_ALL','SO_LAST_ALL');
I get 0 rows
thinking I'm missing something about initially gathering stats, I looked at ch6 of effective oracle by design also
the procedure only takes 1 sec to run, it can't be gathering stats quite that fast or can it?  
 
September 03, 2005 - 7:40 am UTC 
 
 
well, that dbms_stats will gather stats for the ESD schema
could it take 1 second?  depends on the size of the ESD schema.
That query will return rows for those exact table names, if nothing is coming back, you don't have any tables by those names. 
 
 
 
Nice
Catherine, September 15, 2005 - 2:10 pm UTC
 
 
Hi Tom,
How Statistics gathering is useful for the Optimizer??
In what ways it uses the statistics??
Please provide a detailed answer.
Bye!!
 
 
September 15, 2005 - 2:35 pm UTC 
 
 
 
 
Support of Analyze
Bill, September 28, 2005 - 9:26 am UTC
 
 
Hi Tom,
Both you and the Oracle documentation (9i) state the cost-based optimizer will eventually use only statistics that have been collected by DBMS_STATS.
Do you know when this will be the case?
Also, is the DBMS_DDL.ANALYZE_OBJECT procedure equivalent to ANALYZE TABLE?
Thanks! 
 
September 28, 2005 - 10:25 am UTC 
 
 
The optimizer expects the statistics to be gathered by dbms_stats, the stats generated by analyze can and will be different than dbms_stats, but the optimizer is assuming "dbms_stats" did that.
There is no "end of life" on analyze for statistics gathering 
 
 
 
JK Purtzer, October   03, 2005 - 5:14 pm UTC
 
 
I ran the sql below to gather stats:
begin
dbms_stats.gather_schema_stats(ownname=> 'schema',
estimate_percent=>dbms_stats.auto_sample_size, 
cascade=>true); 
commit;
end;
This caused several of the plans for procedures that are "unhinted" and were using indexes to stop using them and do full table scans, what would you suggest to look at to find out why.
Thanks again for the great book from the OTN lounge at OOW05 
 
October   03, 2005 - 9:01 pm UTC 
 
 
tell me you tested this in a test system first please????
you just turned on the CBO, entirely.  Some things will go better - some will be unchanged and invariably, there will be others that do not go so well the first time (they are the ones you NOTICE, the others you do not).
They are something to be analyzed in a testing environment.  Without version info and other stuff - no comment. 
 
 
 
analyze and cursor invalidation
Al, October   04, 2005 - 11:05 am UTC
 
 
In a followup above you state:
----------------
analyzing the table will invalidate open cursors on that object in the shared pool.  the next parse against that object will be "hard".  hard does lots more recursive sql then "soft"
-----------------
I don't fully understand what you mean by 'invalidate open cursors".  An anaylze of a table will not cause some other session's executing sql to fail, will it?
For complicated reasons I will need to run a bunch of dbms_stats.gather_table_stats after an application upgrade, and there will be real users on the application at that point.  I am assuming that as the stats are gathered, any parse (soft or hard) of any sql on an analyzed object will cause complete re-optimization, but that in no case will an executing statement fail because of the analyze.  Is this correct?
The goal here, obviously, is to get users on the application as quickly as possible after the upgrade, even if performance isn't what we'd like; then gather the stats as the system runs to bring it back to its 'normal' statistics state and 'normal' performance. 
 
October   04, 2005 - 4:43 pm UTC 
 
 
it'll invalidate the shared sql, the cursors in the shared pool so they next time they are executed, they'll hard parse.
it will not break "already running" stuff. 
 
 
 
anaylze table, open cursors and parsing ... 
Catalin CIONTU, October   04, 2005 - 2:23 pm UTC
 
 
Tom, please correct me if i'm wrong, but by analyzing the table, its corresponding open cursors will be invalidated in the shared_pool and the only difference is that the cursors have to do a hard parse instead of a soft parse . The process will complete, but will be less effective, this time. Regards, catalin .  
 
October   04, 2005 - 6:24 pm UTC 
 
 
the process will complete, yes, as to it "effectiveness", one of three things can happen..... (slower, faster, the same) 
 
 
 
dbms_stats quibble
JK Purtzer, October   04, 2005 - 6:40 pm UTC
 
 
9.2.0.6 DB
We did some testing,(see general question above, sorry about that one, panic'd)Here is a more specific one.
We are now using the DBMS_STATS package for collecting stats. We were previously using the old DBMS_UTILITY.ANALYZE_SCHEMA method. 
The following is the new method we are using to collect stats:
dbms_stats.gather_schema_stats(
ownname=>XXX', 
estimate_percent=>dbms_stats.auto_sample_size, 
method_opt=>'for all columns size AUTO', 
cascade=>true);
 
Since we changed over to the new statistics gathering, we have seen significant degradation in some of our existing queries. In analyzing one of the queries, we found some strange behavior by the CBO when evaluating a TO_DATE conversion in the where predicate:
When the date string includes a 4-digit year such as this:
WHERE OA.CREATE_DATE >= TO_DATE('7/28/2005', 'MM/DD/RR')
It uses the index on the CREATE_DATE column which is desired.
 
When the date string uses a 2-digit year such as this:
 
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
 
It does NOT use the index on the CREATE_DATE column even though this also converts the string to 7/28/2005.
 
The difference in execution times between these two execution plans is less than a second when the index is used to minutes when the index is not used.
This is quite confusing and was hoping you could shed some light on this.
 
 
October   04, 2005 - 9:01 pm UTC 
 
 
WHERE OA.CREATE_DATE >= TO_DATE('7/28/2005', 'MM/DD/RR')
THAT is a constant....
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
THAT is an "unknown", could be 1905, 2005, 2105 - depends on when you execute it (I know, a human would be able to sort of make a good guess - but the date format is not constant)
Are you sure this is due to stats?  can we see an autotrace traceonly explain?
 
 
 
 
date quibble with explain plans
JK Purtzer, October   04, 2005 - 9:47 pm UTC
 
 
Hope you can read this, my email cut and paste got rid of the formatting, need a clarification
 statement that the 2-digit year is an unknown. I am issuing the query today and by Oracle's rules that would mean an 05 year with an RR date format elelment would return year 2005. Tha makes it a constant as of right now doesn't it?
Using a 4-digit year in the to_date string:
 
 
  
  
 Cost
 Cardinality
 Bytes
 
SELECT STATEMENT, GOAL = CHOOSE
  
  
 3102
 207
 40986
 
 SORT ORDER BY
  
  
 3102
 207
 40986
 
  HASH JOIN OUTER
  
  
 3094
 207
 40986
 
   HASH JOIN OUTER
  
  
 2717
 206
 30488
 
    NESTED LOOPS
  
  
 2714
 206
 29046
 
     HASH JOIN OUTER
  
  
 2302
 206
 26162
 
      NESTED LOOPS OUTER
  
  
 2227
 206
 23484
 
       NESTED LOOPS
  
  
 2227
 206
 21836
 
        HASH JOIN
  
  
 2021
 206
 16686
 
         TABLE ACCESS BY INDEX ROWID
 WAREHOUSE
 ORDER_AGGREGATE
 1988
 206
 14214
 
          INDEX RANGE SCAN
 WAREHOUSE
 OA_CREATE_DATE_IDX
 30
 5041
  
 
         TABLE ACCESS FULL
 WAREHOUSE
 PUBLISHER_PROFILE
 32
 11898
 142776
 
        TABLE ACCESS BY INDEX ROWID
 ESD
 PUBLISHERS
 1
 1
 25
 
         INDEX UNIQUE SCAN
 ESD
 PUBLISHERS_PK
  
 1
  
 
       INDEX UNIQUE SCAN
 ESD
 SALES_AGENT_ID_PK
  
 1
 8
 
      TABLE ACCESS FULL
 WAREHOUSE
 FULFILLMENTS
 74
 48036
 624468
 
     TABLE ACCESS BY INDEX ROWID
 WAREHOUSE
 ORDER_EXT_AGGREGATE
 2
 1
 14
 
      INDEX UNIQUE SCAN
 WAREHOUSE
 ORDERS_EXT_AGG_PK
 1
 1
  
 
    TABLE ACCESS FULL
 WAREHOUSE
 CURRENCIES
 2
 169
 1183
 
   VIEW
 WAREHOUSE
  
 375
 38293
 1914650
 
    SORT GROUP BY
  
  
 375
 38293
 1799771
 
     HASH JOIN
  
  
 66
 38293
 1799771
 
      TABLE ACCESS FULL
 WAREHOUSE
 RETURN_REASONS
 3
 696
 20880
 
      TABLE ACCESS FULL
 WAREHOUSE
 RETURNS
 62
 38293
 650981
 
 
Using a 2-digit year in the to_date string:
 
05, RR
  
  
  
  
  
 
SELECT STATEMENT, GOAL = CHOOSE
  
  
 21420
 234
 46332
 
 SORT ORDER BY
  
  
 21420
 234
 46332
 
  FILTER
  
  
  
  
  
 
   HASH JOIN OUTER
  
  
 21411
 234
 46332
 
    HASH JOIN OUTER
  
  
 21034
 233
 34484
 
     NESTED LOOPS
  
  
 21031
 233
 32853
 
      HASH JOIN OUTER
  
  
 20565
 233
 29591
 
       NESTED LOOPS OUTER
  
  
 20490
 233
 26562
 
        NESTED LOOPS
  
  
 20490
 233
 24698
 
         HASH JOIN
  
  
 20257
 233
 18873
 
          TABLE ACCESS FULL
 WAREHOUSE
 ORDER_AGGREGATE
 20224
 233
 16077
 
          TABLE ACCESS FULL
 WAREHOUSE
 PUBLISHER_PROFILE
 32
 11898
 142776
 
         TABLE ACCESS BY INDEX ROWID
 ESD
 PUBLISHERS
 1
 1
 25
 
          INDEX UNIQUE SCAN
 ESD
 PUBLISHERS_PK
  
 1
  
 
        INDEX UNIQUE SCAN
 ESD
 SALES_AGENT_ID_PK
  
 1
 8
 
       TABLE ACCESS FULL
 WAREHOUSE
 FULFILLMENTS
 74
 48036
 624468
 
      TABLE ACCESS BY INDEX ROWID
 WAREHOUSE
 ORDER_EXT_AGGREGATE
 2
 1
 14
 
       INDEX UNIQUE SCAN
 WAREHOUSE
 ORDERS_EXT_AGG_PK
 1
 1
  
 
     TABLE ACCESS FULL
 WAREHOUSE
 CURRENCIES
 2
 169
 1183
 
    VIEW
 WAREHOUSE
  
 375
 38293
 1914650
 
     SORT GROUP BY
  
  
 375
 38293
 1799771
 
      HASH JOIN
  
  
 66
 38293
 1799771
 
       TABLE ACCESS FULL
 WAREHOUSE
 RETURN_REASONS
 3
 696
 20880
 
       TABLE ACCESS FULL
 WAREHOUSE
 RETURNS
 62
 38293
 650981
 
 
 
October   05, 2005 - 6:59 am UTC 
 
 
... That makes it a constant as of right now doesn't it? ...
no more than SYSDATE is a constant as of right now.  5 minutes from now - sysdate can be different (can be, doesn't have to be! you can freeze time)
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
Tell me, what date is that - no matter what you answer, I'll say "nope, I was thinking of another date"
would need to see the plan part that represented the estimated cardinality for this step.
(you can format plans skinny, I do it all of the time...) 
 
 
 
How do you know if a table was analyzed with dbms_stats or analyze table?
Peter Tran, December  15, 2005 - 7:38 pm UTC
 
 
Hi Tom,
Someone asked me this today, but I haven't been able to figure out the answer.
Is there anyway you can tell whether someone analyzed a table using "DBMS_STATS" vs "ANALZE TABLE"?
My gut call is no.
Thanks,
-Peter 
 
December  16, 2005 - 7:45 am UTC 
 
 
it won't be "definitive", but you can take an educated guess.
I'll use dbms_stats, then analyze - compare, then dbms_stats and recompare..
ops$tkyte@ORA10GR2> create table t ( x int, y varchar2(4000), z varchar2(4000) );
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into t (x) select rownum from all_objects where rownum <= 10;
10 rows created.
ops$tkyte@ORA10GR2> commit;
Commit complete.
ops$tkyte@ORA10GR2> update t set y = rpad('*',4000,'*'), z = rpad('*',4000,'*');
10 rows updated.
ops$tkyte@ORA10GR2> commit;
Commit complete.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> create table tstats as
  2  select 'DBMS_STATS' name, a.* from user_tables a where table_name = 'T';
Table created.
ops$tkyte@ORA10GR2> analyze table t compute statistics;
Table analyzed.
ops$tkyte@ORA10GR2> insert into tstats
  2  select 'ANALYZE' name, a.* from user_tables a where table_name = 'T';
1 row created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> column cname format a20
ops$tkyte@ORA10GR2> column dbms_stats format a25
ops$tkyte@ORA10GR2> column analyze format a25
ops$tkyte@ORA10GR2> select a.cname, a.val dbms_stats, b.val analyze
  2    from table( cols_as_rows( q'|select *
  3                                   from tstats
  4                                  where name = 'DBMS_STATS' |' ) ) a,
  5         table( cols_as_rows( q'|select *
  6                                   from tstats
  7                                  where name = 'ANALYZE' |' ) ) b
  8       where a.cname = b.cname
  9             and decode( a.val, b.val, 0, 1 ) = 1
 10  /
CNAME                DBMS_STATS                ANALYZE
-------------------- ------------------------- -------------------------
NAME                 DBMS_STATS                ANALYZE
EMPTY_BLOCKS         0                         3
AVG_SPACE            0                         1878
CHAIN_CNT            0                         10
AVG_ROW_LEN          8005                      8022
GLOBAL_STATS         YES                       NO
6 rows selected.
<b>dbms_stats won't gather things not used by the CBO such as chain_cnt/avg_space in particular.  If they are 0 or null - it is unlikely that analyze has been used in the past - dbms_stats would be the one.  
But global_stats for a non-partitioned table could be a "tell", avg_space, etc is not "good enough" since:</b>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> delete from tstats where name = 'DBMS_STATS';
1 row deleted.
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> insert into tstats
  2  select 'DBMS_STATS' name, a.* from user_tables a where table_name = 'T';
1 row created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select a.cname, a.val dbms_stats, b.val analyze
  2    from table( cols_as_rows( q'|select *
  3                                   from tstats
  4                                  where name = 'DBMS_STATS' |' ) ) a,
  5         table( cols_as_rows( q'|select *
  6                                   from tstats
  7                                  where name = 'ANALYZE' |' ) ) b
  8       where a.cname = b.cname
  9             and decode( a.val, b.val, 0, 1 ) = 1
 10  /
CNAME                DBMS_STATS                ANALYZE
-------------------- ------------------------- -------------------------
NAME                 DBMS_STATS                ANALYZE
AVG_ROW_LEN          8005                      8022
GLOBAL_STATS         YES                       NO
<b>dbms_stats won't 'reset' them, their being set indicates analyze was used at some time in the past - but since analyze and dbms_stats set and reset global_stats - that could be a "tell" for many normal tables...</b>
 
 
 
 
 
Wow...Awesome.
Peter Tran, December  16, 2005 - 12:10 pm UTC
 
 
Perfect...
Thanks again Tom. 
 
 
CHAIN_CNT
A reader, January   05, 2006 - 10:05 am UTC
 
 
Hi
Since dbms_stats doesnt get CHAIN_CNT for us, if we want to find out if we have chained rows on whic tablas how can we achieve it? By analyzing all tablas manually and overwrtting statistics gathered by dbms_stats? 
 
January   05, 2006 - 11:04 am UTC 
 
 
ops$tkyte@ORA10GR2> @?/rdbms/admin/utlchain
Table created.
ops$tkyte@ORA10GR2> analyze table t list chained rows;
Table analyzed.
ops$tkyte@ORA10GR2> desc chained_rows
 Name                                     Null?    Type
 ---------------------------------------- -------- ----------------------------
 OWNER_NAME                                        VARCHAR2(30)
 TABLE_NAME                                        VARCHAR2(30)
 CLUSTER_NAME                                      VARCHAR2(30)
 PARTITION_NAME                                    VARCHAR2(30)
 SUBPARTITION_NAME                                 VARCHAR2(30)
 HEAD_ROWID                                        ROWID
 ANALYZE_TIMESTAMP                                 DATE
 
 
 
 
 
A reader, January   05, 2006 - 10:48 pm UTC
 
 
Tom,
What is the reason for dbms_stats not gathering info about chained rows whereas analyze command does?
Thanks. 
 
January   06, 2006 - 1:37 pm UTC 
 
 
dbms_stats gathers only that which is used by the optimizer - it gathers no other bits of information. 
 
 
 
Ramesh, January   06, 2006 - 9:48 am UTC
 
 
Tom,
How setting "mbrc" through dbms_stats and  parameter db_file_multiblock_read_count  influence differently.
Is that some thing like "mbrc" used during cost evaluation phase and db_file_multiblock_read_count used during execution phase.
Thanks for your time.
Ramesh. 
 
January   06, 2006 - 2:11 pm UTC 
 
 
the system statistics mbrc is the "observed actual multi-block read count on the system, regardless of the db file multiblock read count init.ora parameter"
the optimizer will use the "observed value", the "retrieval" engine will use db_file_multiblock_read_count to do the actual IO however.
So, if you tell the system via dbms_stats "we only get 8 blocks on multi-block reads" but set the multiblock read count to the max (say 128 blocks) - when the optimizer decides to to a full scan (multi-block read), we'll do as large an IO as possible, but the costing engine would consider "8" to be the number we would expect.
 
 
 
 
dbms_stats in 10gR2 
Oraboy, February  21, 2006 - 5:21 pm UTC
 
 
Guess I am really tired and my eyes cant catch what I am missing here..Could anyone please help?
Issue: The index associated with Primary key Constraint never gets updated with correct statistics and hence my query is throwing wrong results
Commodity_DIM is a table with 700+ rows
<b>-- Version check </b>
SQL> select * from V$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
<b>-- Table details </b>
SQL> select table_name,num_rows,sample_size,last_analyzed from user_tables
  2  where table_name='COMMODITY_DIM'
  3  /
TABLE_NAME                       NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------------ ---------- ----------- ---------
COMMODITY_DIM                         730         730 21-FEB-06
SQL> select count(*) from commodity_dim;
  COUNT(*)
----------
         0
<b>-- Index details </b>
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
  2  where table_name='COMMODITY_DIM';
INDEX_NAME                       NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071                             0             0 VALID
<b>-- Index associates to Primary Key constraint in the table </b>
SQL> select constraint_name,constraint_type from user_constraints
  2  where table_name='COMMODITY_DIM' and index_name='SYS_C009071';
CONSTRAINT_NAME                C
------------------------------ -
SYS_C009071                    P
<b>-- and the constraint is Valid </b>
SQL> select constraint_name,constraint_type,status from user_constraints
  2  where table_name='COMMODITY_DIM' and index_name='SYS_C009071';
CONSTRAINT_NAME                C STATUS
------------------------------ - --------
SYS_C009071                    P ENABLED
<b>-- analyzing the index - no help </b>
SQL> analyze index SYS_C009071 compute statistics;
Index analyzed.
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
  2  where table_name='COMMODITY_DIM';
INDEX_NAME                       NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071                             0             0 VALID
<b>-- dbms_stats ; perform compute  </b>
SQL> exec dbms_stats.gather_index_stats(ownname=>'GOLDDEMO',-
> indname=>'SYS_C009071',estimate_percent=>NULL);
PL/SQL procedure successfully completed.
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
  2  where table_name='COMMODITY_DIM';
INDEX_NAME                       NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071                             0             0 VALID
SQL> desc commodity_dim
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 COMM_DIM_PK                               NOT NULL NUMBER
 COMM_LOW_CATG_CODE                        NOT NULL VARCHAR2(10)
 COMM_LOW_CATG_DESC                                 VARCHAR2(50)
 COMM_MID_CATG_CODE                        NOT NULL VARCHAR2(50)
 COMM_MID_CATG_DESC                                 VARCHAR2(50)
 COMM_HI_CATG_CODE                         NOT NULL VARCHAR2(50)
 COMM_HI_CATG_DESC                                  VARCHAR2(50)
 COMM_ALL_CATG_CODE                                 VARCHAR2(10)
 COMM_ALL_CATG_DESC                                 VARCHAR2(50)
<b>-- Here is the problem </b>
SQL> select count(*) from commodity_dim;
  COUNT(*)
----------
         0
SQL> select count(distinct comm_dim_pk) from commodity_dim;
COUNT(DISTINCTCOMM_DIM_PK)
--------------------------
                         0
SQL> select /*+ FULL(c)*/
  2         count(distinct comm_dim_pk) from commodity_dim c;
COUNT(DISTINCTCOMM_DIM_PK)
--------------------------
                       730
<b>-- Verifying Optimizer plans  </b>
SQL> explain plan for select count(*) from commodity_dim;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 818636992
------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |     1 |     0   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |             |     1 |            |          |
|   2 |   INDEX FULL SCAN| SYS_C009071 |   730 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------
SQL> explain plan for 
  2  select /*+ FULL(c)*/
  3         count(distinct comm_dim_pk) from commodity_dim c;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 3961897095
-------------------------------------------------------------------------
| Id  | Operation          | Name          | Rows  | Bytes | Cost (%CPU)| T
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |     1 |     4 |     8   (0)| 0
|   1 |  SORT GROUP BY     |               |     1 |     4 |            |  
|   2 |   TABLE ACCESS FULL| COMMODITY_DIM |   730 |  2920 |     8   (0)| 0
---------------------------------------------------------------------------
<b>-- parameter listing , if this helps  </b>
SQL> show parameter opt
NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------
filesystemio_options                 string
object_cache_optimal_size            integer     102400
optimizer_dynamic_sampling           integer     2
optimizer_features_enable            string      10.2.0.1
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     100
optimizer_mode                       string      ALL_ROWS
optimizer_secure_view_merging        boolean     TRUE
plsql_optimize_level                 integer     2 
 
 
February  22, 2006 - 8:28 am UTC 
 
 
analyze table commodity_dim validate structure cascade;
what does that return. 
 
 
 
Ora-1499
oraboy, February  22, 2006 - 9:31 am UTC
 
 
Thanks Tom.. Looks like there is much bigger problem here.
SQL> analyze table commodity_dim validate structure;
Table analyzed.
SQL> analyze table commodity_dim validate structure cascade;
analyze table commodity_dim validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
I 'll follow-up this in metalink.
Thanks for your pointers, as always.
 
 
 
 
Weird but worked
Oraboy, February  22, 2006 - 9:39 am UTC
 
 
SQL> analyze table commodity_dim validate structure cascade;
analyze table commodity_dim validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> alter table commodity_dim move;
Table altered.
SQL> select index_name from user_indexes
  2  where table_name='COMMODITY_DIM';
INDEX_NAME
------------------------------
SYS_C009071
SQL> alter index SYS_C009071 rebuild ;
Index altered.
SQL> analyze table commodity_dim validate structure cascade;
Table analyzed.
SQL> select count(*) from commodity_dim;
  COUNT(*)
----------
       730
SQL> explain plan for select count(*) from commodity_dim;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2809477265
-----------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |             |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| SYS_C009071 |   730 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------
 
 
 
 
Comparable test, different results...
Ben, March     09, 2006 - 2:52 pm UTC
 
 
This is a great discussion, it has brought to light the concept of histograms and generating statistics overall for me. The argument for how bright that light is, is still up in the air ;)
After reading the portion, dated Jan 05, that gave examples of the differences between the different method_opt choices, I decided to experiment a little for myself. I am using a test data warehouse that is composed of only 7 tables, each of which have a primary key index and two of which have an additional one column non-unique index. I show those below.
I found my results to be un-expectantly different from yours and I wondered if you could help me to understand why. 
The first test was using 'FOR ALL INDEXED COLUMNS'. The second test was with 'FOR ALL INDEXED COLUMNS SIZE AUTO'. Finally, the third test was using 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY'. The third test gave the most confusing results to me. I will include my observation and question following the results from each test, within the spooled listing below. 
SQL> show user
USER is "DMR"
SQL> select USER || '.' || table_name, index_name, column_name
  2  from user_ind_columns;
USER||'.'||TABLE_NAME INDEX_NAME             COLUMN_NAME
--------------------- ---------------------- -------------------------
DMR.BRANCH_BO         BRANCH_BO_1            REPORT_DATE 
DMR.BRANCH_BO         BRANCH_BO_1            PRODUCT  
DMR.BRANCH_BO         BRANCH_BO_1            BRANCH_PLANT
DMR.BRANCH_BO         BRANCH_BO_1            TRX_CLASS 
DMR.INV_FACT          INV_FACT_1             REPORT_DATE  
DMR.INV_FACT          INV_FACT_1             PRODUCT 
DMR.INV_FACT          INV_FACT_1             BRANCH_PLANT 
DMR.INV_FACT          INV_FACT_1             CATEGORY      
DMR.INV_FACT          INV_FACT_1             OBSOLETE 
DMR.ITEM              ITEM_1                 PRODUCT 
DMR.ITEM              ITEM_2**               D_BUS_UNIT
DMR.ITEM_BRANCH       ITEM_BRANCH_1          PRODUCT 
DMR.ITEM_BRANCH       ITEM_BRANCH_1          BRANCH_PLANT
DMR.ITEM_FILL         ITEM_FILL_1            REPORT_DATE 
DMR.ITEM_FILL         ITEM_FILL_1            PRODUCT  
DMR.ITEM_FILL         ITEM_FILL_2**          PRODUCT 
DMR.OSB_FACT          OSB_FACT_1             REPORT_DATE 
DMR.OSB_FACT          OSB_FACT_1             PRODUCT
DMR.OSB_FACT          OSB_FACT_1             TRX_CLASS 
DMR.PROD_FACT         PROD_FACT_1            REPORT_DATE
DMR.PROD_FACT         PROD_FACT_1            PRODUCT 
DMR.PROD_FACT         PROD_FACT_1            BRANCH_PLANT
DMR.TIME              TIME_1                 CURDATE 
** signifies non unique indexes/columns
23 rows selected.
############## 
The following shows the skewdness of d_bus_unit, curdate, and obsolete
##############
SQL> select count(distinct D_BUS_UNIT)
  2  from item;
 
COUNT(DISTINCTD_BUS_UNIT)
-------------------------
                       12
 
SQL> select count(*)
  2  from item;
 
  COUNT(*)
----------
     65119
 
SQL> select D_BUS_UNIT, count(*)
  2  from item
  3  group by D_BUS_UNIT;
 
D_BUS_UNIT                       COUNT(*)
------------------------------ ----------
                                      617
Acute Care                          29298
Corporate                              10
Drop Ship Fee                           1
Freight                                 2
Min Order Qty Fee                       1
NonStock Expense Item                   8
OEM                                  1920
Patient Care                        30547
REBATE CREDIT                           1
Raw Material Samples                    2
 
D_BUS_UNIT                       COUNT(*)
------------------------------ ----------
Wound Care                           2699
 
12 rows selected.
SQL> select count(distinct obsolete)
  2  from inv_fact;
 
COUNT(DISTINCTOBSOLETE)
-----------------------
                      2
 
SQL> select count(*)
  2  from inv_fact;
 
  COUNT(*)
----------
   2743569
 
SQL> select obsolete, count(*)
  2  from inv_fact
  3  group by obsolete;
 
O   COUNT(*)
- ----------
N    2649249
Y      94320
SQL> select count(*)
  2  from time;
 
  COUNT(*)
----------
       156
 
SQL> select count(distinct curdate)
  2  from time;
 
COUNT(DISTINCTCURDATE)
----------------------
                   156
#################################
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
  2  from user_tab_histograms
  3  group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
    ( ownname         =>'DMR', 
      cascade         => TRUE, 
      estimate_percent     => 10, 
      method_opt         => 'FOR ALL INDEXED COLUMNS');
PL/SQL procedure successfully completed.
SQL> l
  1  select table_name, column_name, count(*)
  2  from user_tab_histograms
  3* group by table_name, column_name
SQL> /
TABLE_NAME            COLUMN_NAME     COUNT(*)             
--------------------- --------------  ----------       
ITEM                  PRODUCT         2             
ITEM                  D_BUS_UNIT      7             
TIME                  CURDATE         76             
INV_FACT              PRODUCT         76             
INV_FACT              CATEGORY        2             
INV_FACT              OBSOLETE        2             
INV_FACT              REPORT_DATE     74             
INV_FACT              BRANCH_PLANT    25             
OSB_FACT              PRODUCT         76             
OSB_FACT              TRX_CLASS       6             
OSB_FACT              REPORT_DATE     74             
BRANCH_BO             PRODUCT         76             
BRANCH_BO             TRX_CLASS       6             
BRANCH_BO             REPORT_DATE     62             
BRANCH_BO             BRANCH_PLANT    24             
ITEM_FILL             PRODUCT         76             
ITEM_FILL             REPORT_DATE     3             
PROD_FACT             PRODUCT         76             
PROD_FACT             REPORT_DATE     44             
PROD_FACT             BRANCH_PLANT    32             
ITEM_BRANCH           PRODUCT         76             
ITEM_BRANCH           BRANCH_PLANT    33             
22 rows selected.
######### OBSERVATIONS/QUESTIONS ########
When you did this test on table "T" that was built from 
create table t as select * from all_objects
you got a 76 bucket histogram on object_id, a unique column, and a 22 bucket histogram on owner, a non-unique column.
In my sample item.product is the primary key, I would have expected more than a two bucket histogram.
ITEM.D_BUS_UNIT is non-unique and very skewed but yet I only got a 7 bucket histogram.
What could be the cause of this? It seems like I got the opposite of what your example shows.
################# END of Q1
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
  2  from user_tab_histograms
  3  group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
    ( ownname        => 'DMR', 
      cascade         => TRUE, 
      estimate_percent     => 10, 
      method_opt         => 'FOR ALL INDEXED COLUMNS SIZE AUTO');
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
  2  from user_tab_histograms
  3  group by table_name, column_name;
TABLE_NAME                     COLUMN_NAME                 COUNT(*)             
------------------------------ ------------------------- ----------             
ITEM                           PRODUCT                            2             
ITEM                           D_BUS_UNIT                         2             
TIME                           CURDATE                            2             
INV_FACT                       PRODUCT                            2             
INV_FACT                       CATEGORY                           2             
INV_FACT                       OBSOLETE                           2             
INV_FACT                       REPORT_DATE                        2             
INV_FACT                       BRANCH_PLANT                       2             
OSB_FACT                       PRODUCT                            2             
OSB_FACT                       TRX_CLASS                          2             
OSB_FACT                       REPORT_DATE                        2             
BRANCH_BO                      PRODUCT                            2             
BRANCH_BO                      TRX_CLASS                          2             
BRANCH_BO                      REPORT_DATE                        2             
BRANCH_BO                      BRANCH_PLANT                       2             
ITEM_FILL                      PRODUCT                            2             
ITEM_FILL                      REPORT_DATE                        2             
PROD_FACT                      PRODUCT                            2             
PROD_FACT                      REPORT_DATE                        2             
PROD_FACT                      BRANCH_PLANT                       2             
ITEM_BRANCH                    PRODUCT                            2             
ITEM_BRANCH                    BRANCH_PLANT                       2             
22 rows selected.
##########################
No confusion here - I got the same results as your example. This is a test db and hasn't had any SQL run against it for some time.
###########################
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
  2  from user_tab_histograms
  3  group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
    ( ownname        =>'DMR', 
      cascade         => TRUE, 
      estimate_percent     => 10, 
      method_opt         => 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY');
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
  2  from user_tab_histograms
  3  group by table_name, column_name;
TABLE_NAME                     COLUMN_NAME                 COUNT(*)             
------------------------------ ------------------------- ----------             
ITEM                           PRODUCT                            2             
ITEM                           D_BUS_UNIT                         2             
TIME                           CURDATE                           76             
INV_FACT                       PRODUCT                          201             
INV_FACT                       CATEGORY                           2             
INV_FACT                       OBSOLETE                           2             
INV_FACT                       REPORT_DATE                       99             
INV_FACT                       BRANCH_PLANT                      48             
OSB_FACT                       PRODUCT                          201             
OSB_FACT                       TRX_CLASS                          6             
OSB_FACT                       REPORT_DATE                      100             
BRANCH_BO                      PRODUCT                          201             
BRANCH_BO                      TRX_CLASS                          6             
BRANCH_BO                      REPORT_DATE                       89             
BRANCH_BO                      BRANCH_PLANT                      24             
ITEM_FILL                      PRODUCT                          102             
ITEM_FILL                      REPORT_DATE                        3             
PROD_FACT                      PRODUCT                          201             
PROD_FACT                      REPORT_DATE                      108             
PROD_FACT                      BRANCH_PLANT                      31             
ITEM_BRANCH                    PRODUCT                          201             
ITEM_BRANCH                    BRANCH_PLANT                      57             
22 rows selected.
################################
Here's where I really get confused. I listed some statements at the top that show the skewedness of ITEM.D_BUS_UNIT, TIME.CURDATE, and INV_FACT.OBSOLETE. ITEM.PRODUCT is the primary key of ITEM, so it's unique, and from your example I would expect the 2-bucket histogram; but D_BUS_UNIT is non-unique and very skewed. Why no histograms there? TIME.CURDATE is unique, so why does it have histograms? INV_FACT.OBSOLETE is non-unique and very skewed, but again, no histograms. What could be causing this?
Also, I might be a little confused with the terminology in calling that count "buckets". I saw where you were saying that 2 meant no histogram, just hi/lo endpoints, and that >2 meant a real histogram.
Thank you for any explanations you could give me, and let me know if you need to see other settings "maybe init.ora" or anything else.
 
 
 
March     09, 2006 - 3:58 pm UTC 
 
 
what version here - dbms_stats behavior is driven by that greatly.
Also, I likely won't address everything here - this is sort of a review followup area, this is "big" (and I cannot really reproduce, I do not have your data, that makes it "hard")
 
 
 
 
in addition to above...
Ben, March     09, 2006 - 3:51 pm UTC
 
 
Sorry, I forgot to include the banner information
SQL> select banner from v$version;
 
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
PL/SQL Release 9.2.0.5.0 - Production
CORE    9.2.0.6.0       Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.5.0 - Production
NLSRTL Version 9.2.0.5.0 - Production 
 
 
 
another addition - your response wasn't posted before my last one
Ben, March     09, 2006 - 4:17 pm UTC
 
 
Sorry again for the lack of data. I did think of something else that might impact this.
In my init.ora the compatible parameter is set to 8.1.0, BUT optimizer_features_enable is set to 9.2.0.
I just recently inherited the database, so I can't tell you why these are set the way they are.
 
March     09, 2006 - 4:25 pm UTC 
 
 
I'll take a look later when I get back from travel if I get a chance.
Meanwhile and faraway, can you try to generate a test case using mod, rownum, decode whatever to generate dummy data that we can test with ? 
 
 
 
a closer comparison
Ben, March     10, 2006 - 1:28 pm UTC
 
 
I re-created your exact example using the all_objects table.
Those stats were almost exact. I think the difference with SKEWONLY is probably due to my db having fewer schemas than the one you were using.
I'm still confused about some things with my specific data tables.
The TIME table has a record for each day, with CURDATE being the actual date and the primary key. The other fields are day, month, year, day of week, day of year, etc. This is a test schema, so there may not be a record for every day from beginning to end. Could this cause the need for histograms on that column? For example, the min date could be 1/1/03 and the max 1/1/06, but it could be missing three months in between, or even all of 2005.
SQL> show user
USER is "DMR"
SQL> create table t as select * from all_objects;
Table created.
SQL> create index t_idx1 on t(object_id);
Index created.
SQL> create index t_idx2 on t(owner);
Index created.
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns', cascade => true );
PL/SQL procedure successfully completed.
SQL> col column_name for a20
SQL> select column_name, count(*)
  2  from user_tab_histograms
  3  where table_name = 'T'
  4  group by column_name
SQL> /
COLUMN_NAME            COUNT(*)                                                 
-------------------- ----------                                                 
OBJECT_ID                    76                                                 
OWNER                         4                                                 
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size auto', cascade => true );
PL/SQL procedure successfully completed.
SQL> /
COLUMN_NAME            COUNT(*)                                                 
-------------------- ----------                                                 
OBJECT_ID                     2                                                 
OWNER                         2                                                 
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size skewonly', cascade => true );
PL/SQL procedure successfully completed.
SQL> /
COLUMN_NAME            COUNT(*)                                                 
-------------------- ----------                                                 
OBJECT_ID                     2                                                 
OWNER                         4                                                 
SQL> exec dbms_stats.gather_table_stats( user, 'TIME', method_opt=> 'for all indexed columns size skewonly', cascade => true );
PL/SQL procedure successfully completed.
SQL> select column_name, count(*)
  2  from user_tab_histograms
  3  where table_name = 'TIME'
  4  group by column_name
SQL> /
COLUMN_NAME            COUNT(*)                                                 
-------------------- ----------                                                 
CURDATE                      76 
 
 
March     10, 2006 - 8:29 pm UTC 
 
 
we flipped from T to TIME?  I didn't follow.  What is TIME exactly? 
 
 
 
explanation
Ben, March     14, 2006 - 8:42 am UTC
 
 
I tried to explain in the last paragraph at the top, before the example. I went through and recreated your example with almost exact results. I then posted numbers for a TIME.CURDATE field that is a primary key holding the date. In production there is a record for every day, but the test instance has some gaps. I was asking if that is why I would get histograms on that column.
If I am running dbms_stats.gather_table_stats( user, 'TIME', method_opt => 'for all indexed columns size skewonly', cascade => true );
why would a single-column primary key of varchar2 datatype get a histogram generated? That could explain a lot to me. I thought a unique column would only need one bucket. Is that not true for varchar2 columns?
If, on a smaller scale, my column looks like:
10-245
10-456
10-984
10-999
15-875
50-3000
50-450
50-451
50-780
90-800
That is a unique column but would it get histograms generated if I used "for all indexed columns size skewonly"?
On the other hand, I also have a char(1) column that is either Y or N. Most of the 100,000 records have an N, yet I only get a 1-bucket histogram for this column. I wouldn't think that is sufficient. For example:
N
N
N
Y
N
N
N
N
N
Y
Y
N
N
I am only getting a 1-bucket histogram for a column like this (on a much larger scale, of course).
If one value out of the two possible makes up less than 5% or 10% of the rows, are histograms not created?
 
 
 
analyze, and gather_stats schemas
Mohammed Abdul Samad, April     24, 2006 - 4:59 am UTC
 
 
This is good and useful, but I need more detail with an example: if we are gathering stats every week, how do we schedule that with dbms_job? Can you give me complete examples?
Hope you got the point......
 
April     24, 2006 - 5:45 am UTC 
 
 
what issue do you have with dbms_job?  It is pretty straightforward.
You write a procedure to gather stats (or just call dbms_stats).
You schedule it.
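For example, a sketch using dbms_job (the schema name and schedule are purely illustrative; this assumes the job queue is running, i.e. job_queue_processes > 0):

variable jobno number
begin
    dbms_job.submit(
       job       => :jobno,
       what      => 'dbms_stats.gather_schema_stats( ownname => ''ODS'', options => ''GATHER STALE'', cascade => true );',
       next_date => trunc(next_day(sysdate,'SUNDAY'))+2/24,        -- next Sunday, 2am
       interval  => 'trunc(next_day(sysdate,''SUNDAY''))+2/24' );  -- and every Sunday thereafter
    commit;
end;
/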
 
 
 
 
Problem with table analyze.
Abdul Seerae, May       02, 2006 - 4:15 pm UTC
 
 
Dear Tom,
We have the following script running daily (AT 2.00 AM) to gather statistics on a specific schema.
[ dbms_stats.gather_schema_stats(ownname=>'ABCDOWNER',cascade=>true); ]
We are facing a problem with this script. The gather stops at a specific table (say PROBLEM_TBL) and will not proceed any further. It gives no error message and updates no logs; it just stays there forever. Checking through OEM or sqlplus, we can see the last-analyzed date updated for all tables before PROBLEM_TBL in this schema.
This PROBLEM_TBL is not very big (only 16,000 rows) and we have many bigger ones in this schema.
When we gather stats on this table individually, we get the same result - it hangs forever.
[ dbms_stats.gather_table_stats(ownname=> 'ABCDOWNER', tabname=> 'PROBLEM_TBL', partname=> NULL); ].
We can analyze all other tables without any problem. To rule out table corruption, we exported and imported this table to our development server; the data seems OK.
Once we restart the database, the problem goes away for a few weeks before it reappears.
Any suggestions / thoughts in this respect will be highly appreciated.
O/S:  Windows 2003.
Oracle : 9i (9.2.0.6.0)
 
 
May       03, 2006 - 1:33 am UTC 
 
 
(this does seem like an OBVIOUS support issue, I always wonder why that path isn't taken...)
I would peek in the v$ views to see what this session is waiting for when it gets "stuck".
v$session_wait
v$session_event 
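For example (a sketch - :stuck_sid stands for the SID of the stuck session, which you would look up in v$session):

select sid, event, state, seconds_in_wait
  from v$session_wait
 where sid = :stuck_sid;

select event, total_waits, time_waited
  from v$session_event
 where sid = :stuck_sid
 order by time_waited desc;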
 
 
 
Locks
Ik, May       08, 2006 - 6:53 am UTC
 
 
Tom,
Two questions:
Does ANALYZE or DBMS_STATS against a table lock the table in any mode? I came across an old Oracle Press book that said so, but could not reproduce it (10g R2). Was it true in older versions of Oracle?
If the table is locked by another session, would analyze/dbms_stats wait for that lock to be released?
Thanks,
 
 
May       08, 2006 - 8:22 am UTC 
 
 
certain variations of analyze can (to validate structures and such), but dbms_stats just runs queries basically. 
 
 
 
Doubts
Ik, May       09, 2006 - 10:53 am UTC
 
 
Thanks a lot Tom.
I have some related questions.
1. Why do we have the force parameter for gather_table_stats? It is FALSE by default. Does this mean dbms_stats.gather_table_stats errors out (or does it wait?) if the table is locked?
2. The no_invalidate clause -- the documentation says "Does not invalidate the dependent cursors if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE."
Invalidating dependent cursors means all the queries that have used this table and are present in the shared pool will be invalidated and will be hard parsed if executed again?
Was this the default behaviour with ANALYZE?
3. DBMS_STATS.GATHER_TABLE_STATS
BEGIN
DBMS_STATS.GATHER_TABLE_STATS     
   (     
   OWNNAME => 'TEST'     
   ,TABNAME => 'RTEST'     
   ,ESTIMATE_PERCENT => 1
   ,GRANULARITY => 'GLOBAL'     
   ,CASCADE => FALSE     
   ); 
END; 
This table has 140 partitions, 6018191841 rows and 80538932 blocks.
The above command takes 80 minutes. 
If i change block_sample => TRUE, it completes in less than a minute.
I found this query (modified) to be the most time consuming one and waits on "db file scattered reads"
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact 
 use_weak_name_resl dynamic_sampling(0) no_monitoring */ * 
 from rtest sample  (1) t 
When the block_sample is made TRUE, query changes to 
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact 
 use_weak_name_resl dynamic_sampling(0) no_monitoring */ * 
 from rtest sample block (1) t 
and is much faster.
Tom, i found sample rows and sample block to return different amount of rows. Which is more accurate... when used within dbms_stats , and why?
Thanks a lot for your time.
Thanks,
 
 
May       09, 2006 - 11:08 am UTC 
 
 
1) force overrides statistics that were "locked", you can "lock" statistics - make it so a normal "gather" won't overwrite them.  Maybe they could have called it "freezing" statistics instead of locking them.
....
force     When value of this argument is TRUE, deletes column statistics even if locked
.......
2) correct. analyze would invalidate cursors as well.
3) You need to understand the difference between block and row sampling.
One of them samples 1% of the BLOCKS and processes any rows it finds on them.
The other samples 1% of the rows (read the rows, but only look at 1 out of every 100 of them).
You would certainly expect them to return different statistics since they look at radically different data in very different ways. 
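A sketch of the two against a hypothetical BIG_TABLE:

-- row sampling: reads the table, keeps roughly 1 out of every 100 rows
select count(*) from big_table sample (1);

-- block sampling: reads roughly 1% of the blocks, keeps every row it finds on them
select count(*) from big_table sample block (1);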
 
 
 
Thanks
Ik, May       09, 2006 - 11:23 am UTC
 
 
Many thanks Tom for the very fast and precise response.
When row sampling (1%) is used, oracle reads every row from the table, but considers only 1 out of every hundred rows. In effect the whole table is read.
The difference between this and COMPUTE statistics would then be that COMPUTE considers all the rows whereas sample rows considers only 1% of the rows. But, in both cases the entire table is read.
So, we save mostly on CPU by using row sampling when compared to COMPUTE? Am I correct in saying so?
Also, the Oracle documentation says "Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated." How is random block sampling more efficient?
I am sorry for asking too many questions, Tom, but these are doubts very few people have an answer to.
 
 
May       09, 2006 - 12:12 pm UTC 
 
 
you saw how random block sampling is more efficient - it only read 1% of the blocks, it ran a lot faster.  
But it might get entirely the wrong answer by accident since it reads 1% of the blocks and could be subject to skew (load sorted data and it'll see lots of "the same values" on any given block which could be unnatural)
 
 
 
 
Alberto Dell'Era, May       09, 2006 - 11:30 am UTC
 
 
> Maybe they could have called it "freezing" statistics 
> instead of locking them.
It's worth mentioning the different behaviour when trying to gather stats on a locked table versus the schema/db that contains it:
dellera@ORACLE10> exec  dbms_stats.lock_table_stats (user, 't');
PL/SQL procedure successfully completed.
dellera@ORACLE10> exec dbms_stats.gather_table_stats (user, 't');
BEGIN dbms_stats.gather_table_stats (user, 't'); END;
*
ERROR at line 1:
ORA-20005: object statistics are locked (stattype = ALL)
ORA-06512: at "SYS.DBMS_STATS", line 13056
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
dellera@ORACLE10> exec dbms_stats.gather_schema_stats (user);
PL/SQL procedure successfully completed.
The first is definitely a "lock" as in "you bang your head into the locked door" ;) - as a non-native speaker would express it ... 
 
May       09, 2006 - 12:12 pm UTC 
 
 
but not to be confused with a table lock or a row lock :) 
 
 
 
David Aldridge, May       09, 2006 - 12:50 pm UTC
 
 
>> But it might get entirely the wrong answer by accident since it reads 1% of the 
blocks and could be subject to skew (load sorted data and it'll see lots of "the 
same values" on any given block which could be unnatural)
It's also sensitive to skewed numbers of rows per block, and the 1% sample is by no means guaranteed
</code>  
http://oraclesponge.blogspot.com/2005/06/statistics-by-block-sampling.html  <code> 
 
 
dbms_stats for IOTs
Ik, May       16, 2006 - 8:11 am UTC
 
 
Tom,
This is 10g R2.
I noticed that when gathering stats on 10g using dbms_stats (specifying partname, granularity => 'PARTITION' and estimate_percent => <some number>)
performance on IOTs is very poor compared to tables of similar or much bigger size.
The SQL it is waiting for is 
INSERT /*+ append*/ into sys.<temp table>
SELECT .... FROM <iot> sample (0.001) t
WHERE TBL$OR$IDX$PART$NUM(<iot>,0,4,0,ROWID) = :objn
It is waiting on 'db file sequential read', and the P1 column of v$session_wait shows almost every datafile this IOT spans - including other partitions. My question is that the granularity keyword seems to have no impact on IOTs.
Is this how it should be?
This command is taking more than an hour to run. We have around 100 partitions.
Tom, whenever a granularity of PARTITION is specified for a table, index or IOT, shouldn't it go against that partition alone?
When granularity => 'PARTITION', would Oracle also attempt to gather global statistics by simply rolling up from the individual partitions each time?
Please let us know. 
 
 
A reader, June      03, 2006 - 10:05 pm UTC
 
 
Hi Tom,
        A basic question, but I am a bit confused about it. I create a schema with some tables in it. Do I need to gather stats for them before using them? Or will Oracle take care of it once they start getting populated? I am a bit confused about whether we need to gather stats on empty tables. I am asking because I was asked to do so; maybe I didn't understand it properly and might have to go back and confirm.
Thx
Santosh. 
 
June      04, 2006 - 8:16 am UTC 
 
 
The only answers to all questions are:
a) why 
b) it depends
The answer to this is "it depends"
are you using an older release and using the rule based optimizer (RBO) - then the answer is no, because you are using the RBO and the RBO doesn't want statistics
are you using current software and using the cost based optimizer (CBO)?  Then we need to know how current - is it 10g current where dynamic sampling will kick in automatically or is it 8i/9i current where sampling does not kick in automatically.  
In 8i/9i - you would be responsible for collecting the statistics.  In 10g, it'll dynamically sample the table the first time it hard parses a query and if you are using the automatic job that installs with the database, it'll gather statistics for you on these objects in the future as the job runs over time.
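You can see the 10g behavior for yourself with autotrace on a freshly created, never-analyzed table - a sketch (T is just a demo table; output abbreviated):

SQL> create table t ( x int );
SQL> set autotrace traceonly explain
SQL> select * from t where x = 42;
...
Note
-----
   - dynamic sampling used for this statement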
 
 
 
 
Dbms_stats.gather_schema_stats
Vikas, June      04, 2006 - 9:49 am UTC
 
 
Hi Tom,
We are currently managing a database with 461G of data. The database server is a Linux machine with 4 CPUs, 8 disk devices on 4 disk controllers, and 16G of RAM.
We have been collecting statistics - to help the optimizer generate optimized execution plans - using the dbms_stats procedure below, but recently, with data growing to the tune of 1G every day, gathering statistics has really slowed down and now takes 5 hrs to complete.
Create or replace procedure gather_db_stats AS
BEGIN
declare
    Path                     VARCHAR2(100) ;
    Filename                 VARCHAR2(100) := 'Dbmsstats.txt';
    Output_file              utl_file.file_type;
    l_Start                  NUMBER;
    l_tname                 VARCHAR2(30);
    l_partition_name         VARCHAR2(30);
    rebuild_partition_index  VARCHAR2(500);
    err_num                  NUMBER;
    err_msg                  VARCHAR2(500);
BEGIN 
    Select value into path from v$parameter where name = 'utl_file_dir';
    output_file := utl_file.fopen (path,filename, 'a');
    for x in (Select * from DWUSER.Rolling_tables where status = 'Y') loop
        l_tname := x.table_name;
        for table_specific_partitions in (Select all_ind_partitions.partition_name part_name from all_indexes,all_ind_partitions 
                                          Where  all_indexes.index_name    = all_ind_partitions.index_name 
                      And    all_indexes.table_name    = l_tname
                                          And    all_ind_partitions.Status = 'UNUSABLE') loop
        begin 
               l_partition_name := table_specific_partitions.part_name;
               rebuild_partition_index := 'Alter table ' || l_tname || ' modify partition ' || l_partition_name || ' rebuild unusable local indexes';
               execute immediate rebuild_partition_index;
           EXCEPTION
             When OTHERS then
                err_num := SQLCODE;
                err_msg := Substr(SQLERRM,1,500);
                utl_file.put_line (output_file,'Date : ' || to_Char(sysdate,'dd/mm/yyyy hh24:mi:ss') || ' Error generated while rebuilding the UNUSABLE local indexes with an error message ' || err_msg);
                utl_file.fclose(output_file);
                return;
           end;
        end loop;
     end loop;
     for x in (select Username from dba_users where username not in ('SYS','SYSTEM'))
     lOOP
        l_start := dbms_utility.get_time;
        dbms_stats.gather_schema_stats(ownname           => x.username,
                                       estimate_percent  => dbms_stats.auto_sample_size,
                                       method_opt        => 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY',
                                       degree            => 16,
                                       cascade           => true,
                                       options           => 'GATHER AUTO');
        utl_file.put_line (output_file,'Date : ' || to_Char(sysdate,'dd/mm/yyyy hh24:mi:ss') || ' The total time spent to gather stats for the schema owner : ' || x.username || ' is ' || round((dbms_utility.get_time - l_start)/100,2) || 'seconds');
     end loop;
     utl_file.fclose(output_file);   -- close the log file on the success path as well
END;
END;
/
We just don't have a 5-hr window to get this done. We have 4 partitioned tables which hold most of the data (could be 350M).
Can we use gather_table_stats instead of gather_schema_stats, for only those partitions whose LAST_ANALYZED date is NULL, since our application performs no delete or insert activity once the end of day completes? We range-partition by date and have daily partitions.
Will this help? Or let us know if we can configure some parameter values for gather_schema_stats to get the job done much faster. We don't need to build histograms for the columns.
Thanks in anticipation 
 
 
June      04, 2006 - 4:16 pm UTC 
 
 
The answer is as always one of two:
a) why
b) it depends
your answer is b), it depends.
Can you just do partition stats?  do your queries always hit a SINGLE partition, and is that partition known at hard parse time?
If your query hits more than one partition OR your query doesn't let us figure out which partition it will hit - then GLOBAL TABLE stats are used to optimize (not local partition statistics).
And the global table stats could be slightly "wrong" for columns if you don't maintain them over time.
eg: if you have 2 partitions - and each partition has 500,000 rows - it is easy enough to know there are 1,000,000 rows - that works OK from the local partition statistics.
however, if in those 2 partitions there is a column X and it has 10,000 distinct values in partition1 and 10,000 distinct values in partition2 - how many distinct values does it have "globally"?  somewhere between 10,000 and 20,000 - which can be the difference between "night and day" to the optimizer.
There is no cut and dry answer here - you might have to "help" on some columns using dbms_stats.set_***  to set the column stats for us if you know them (and you are gathering histograms there - you said you don't need to so why?) 
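For example, a minimal sketch (the table/column names and the distinct count are purely illustrative - you would supply the values you know to be true):

begin
    dbms_stats.set_column_stats(
       ownname => user,
       tabname => 'T',
       colname => 'X',
       distcnt => 15000 );   -- the global number of distinct values you know
end;
/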
 
 
 
dbms_stats
geepee, July      05, 2006 - 4:56 am UTC
 
 
Dear Tom,
We have around 150 fast-growing/changing transactional tables in our production system.
The database is 10gR2.
These tables grow from 0.1 million to 1 million rows in six months.
A few are partitioned.
What is the best way to analyze these fast-growing tables - dynamic sampling, or a scheduled dbms_stats run against them? Please suggest.
Regards.
 
 
July      08, 2006 - 7:51 am UTC 
 
 
dynamic sampling will happen only at the first hard parse time (so if you never shutdown - you never get different plans - assuming you have used bind variables - which you must be since you said "we are transactional").  It would not be appropriate.
More likely would be to use dbms_stats to gather stats on STALE tables (tables are all set to monitoring in 10g you can use "gather stale" as the option, to avoid gathering stats on tables that haven't changed much). 
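A sketch of that (schedule it as often as your data churn warrants):

begin
    dbms_stats.gather_schema_stats(
       ownname => user,
       options => 'GATHER STALE',
       cascade => true );
end;
/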
 
 
 
Table statistics and 3113 end-of-file on comm channel
Jon Waterhouse, July      12, 2006 - 11:38 am UTC
 
 
I ran into the following problem on XE, which I find difficult to explain. I had the following query:
select count(*),urban
from  case@caps s 
inner join res_address@caps r ON r.res_address_id = s.res_address_id
inner join ref.good_postal_code p ON p.postal_code=r.postal_code 
WHERE s.adults=1 and s.children > 0  -- is single parent
AND exists (select * from payment@caps p
           WHERE s.case_id = p.case_id
           AND p.payment_date between to_date('200605','yyyymm') and to_date('200606','yyyymm')-1)
GROUP BY urban
which worked fine. Note that there is one table on XE and the other two tables are remote. I then modified the query to take out the two items that restricted the query to single parents. I got a 3113 error running an explain plan. I found that adding in some restrictions (e.g. s.adults in (1,2,3)) would allow the query to run -- even though this in no way restricts the query - all records would satisfy this. I also found that running an analyze table on ref.good_postal_code allowed the query to run.
Since I wasn't sure if it was possible to delete stats, I then did a CTAS to create a new table exactly the same as good_postal_code but with no stats. Surprisingly, the query ran with no problem.
I read somewhere that resource limits can generate 3113s. Is that what is happening here? i.e. the explain plan process decides that the query is going to be too big and returns the 3113?  
 
July      12, 2006 - 5:04 pm UTC 
 
 
doubtful - can you reproduce in "not XE"? 
 
 
 
ZB, July      24, 2006 - 12:44 pm UTC
 
 
system@oip1> create or replace procedure ods_analyze
  2  is
  3  begin
  4  execute immediate 'execute DBMS_STATS.GATHER_SCHEMA_STATS(''ODS'')';
  5  end;
  6  /
Procedure created.
system@oip1> exec ods_analyze;
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-00900: invalid SQL statement
ORA-06512: at "SYSTEM.ODS_ANALYZE", line 4
ORA-06512: at line 1
Tom, why did I get the error?
 
 
July      24, 2006 - 2:54 pm UTC 
 
 
why would you try to dynamically execute plsql - plsql that never changes???????
...
is
begin
   dbms_stats.gather_schema_stats( 'ODS' );
end;
you have plsql
you are writing plsql
just call it
execute is a "sqlplus-ism".  you can see this from your own example:
system@oip1> exec ods_analyze;
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-00900: invalid SQL statement
see - it says "BEGIN ods_analyze; END;" - exec is just shorthand for sqlplus to glue begin and end on it.  execute is NOT plsql - it is not part of the language at all. 
 
 
 
ZB, July      24, 2006 - 3:07 pm UTC
 
 
Thanks Tom, now I got the procedure compiled.
But got a new error:
system@oip1> exec ods_analyze
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-20000: Insufficient privileges to analyze an object in Schema
ORA-06512: at "SYS.DBMS_STATS", line 7649
ORA-06512: at "SYS.DBMS_STATS", line 7779
ORA-06512: at "SYS.DBMS_STATS", line 7830
ORA-06512: at "SYS.DBMS_STATS", line 7809
ORA-06512: at "SYSTEM.ODS_ANALYZE", line 4
ORA-06512: at line 1
system@oip1>   exec DBMS_STATS.GATHER_SCHEMA_STATS('ODS');
PL/SQL procedure successfully completed.
But I could do it directly. 
 
July      24, 2006 - 4:17 pm UTC 
 
 
STOP USING SYSTEM
you have no reason to use system, it is "ours", stop using it.  
create your own account 
and read:
</code>  
http://asktom.oracle.com/Misc/RolesAndProcedures.html  <code>
grant it "analyze any" directly (no roles)
or better yet....
just write this as the ODS user, schedule it as the ODS user.  
but don't use system!  
 
 
 
Can a table be ANALYZE during the DML operations?
Ravi, August    09, 2006 - 1:01 am UTC
 
 
Hi Tom,
1. Can a table be analyzed (to collect statistics) during DML (INSERT, UPDATE, DELETE) operations?
2. If the column in the WHERE clause has an index on it, do the statistics matter much for using that index during DML?
3. Will having statistics versus not having statistics affect the use of the index during DML? How?
4. If, within the same database, we use ANALYZE for a few tables and the DBMS_STATS package for others, will there be a performance issue from using the two together?
Thanks,
Ravi 
 
August    09, 2006 - 10:42 am UTC 
 
 
1) you may use dbms_stats to gather statistics while DML is happening, yes.
2) sure.  It is all about estimated cardinality.
3) you need to have statistics to use the cost based optimizer (CBO) safely.  If you do not, you will not be able to use the CBO safely.
4) do not use analyze, use dbms_stats 
 
 
 
DBMS_STATS & ANALYZE statement Oracle10g R2
Suvendu, August    19, 2006 - 7:06 am UTC
 
 
Hi Tom,
Below is a test case which appears to show wrong information from DBMS_STATS in Oracle 10g R2; it shows the correct information in Oracle 9i R2. As per Metalink, the CBO gets exact information when we go with COMPUTE STATISTICS, where the NUM_ROWS and SAMPLE_SIZE columns show the same values. I also checked query execution.
SQL> create table t2 as select * from t1;
Table created.
SQL> select count(1) from t2;
  COUNT(1)
----------
    735392
SQL>  exec DBMS_STATS.GATHER_TABLE_STATS(user, TABNAME => 'T2',GRANULARITY => 'DEFAULT' ,METHOD_OPT => 'FOR COLUMNS');
PL/SQL procedure successfully completed.
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T2';
TABLE_NAME                       NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T2                                 735392       51451
SQL> REM Here the sample size (51451) does not match the number of rows (735392).
SQL> analyze table t2 compute statistics;
Table analyzed.
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T2';
TABLE_NAME                       NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T2                                 735392      735392
SQL> REM Here all values are accurate.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SQL>
======================
Oracle 9i R2
======================
SQL> select count(1) from t1;
  COUNT(1)
----------
  27525120
Elapsed: 00:01:07.05
SQL>  exec DBMS_STATS.GATHER_TABLE_STATS(user, TABNAME => 'T1',GRANULARITY => 'DEFAULT' ,METHOD_OPT => 'FOR COLUMNS', DEGREE=>8);
PL/SQL procedure successfully completed.
Elapsed: 00:00:25.22
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T1';
TABLE_NAME                       NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T1                               27525120    27525120
Elapsed: 00:00:00.08
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE    9.2.0.6.0       Production
TNS for HPUX: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
Elapsed: 00:00:00.08
albprod@ALBBPM>
Could you please, elaborate the difference of DBMS_STATS in Oracle 10g ?
Thanking you.
Regards,
Suvendu
 
 
 
August    19, 2006 - 9:15 am UTC 
 
 
that is not "wrong"
That is "correct" actually.
If you check out the docs for dbms_stats, you'll find the specification of gather table stats in 10gr2 is:
  http://docs.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#sthref8129   
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2, 
   tabname          VARCHAR2, 
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type 
                                                (get_param('ESTIMATE_PERCENT')), 
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'), 
   cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),
   stattab          VARCHAR2 DEFAULT NULL, 
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL,
   no_invalidate    BOOLEAN  DEFAULT  to_no_invalidate_type (
                                     get_param('NO_INVALIDATE')),
   force            BOOLEAN DEFAULT FALSE);
and that:
ops$tkyte%ORA10GR2> select dbms_stats.get_param( 'estimate_percent' ) from dual;
DBMS_STATS.GET_PARAM('ESTIMATE_PERCENT')
-------------------------------------------------------------------------------
DBMS_STATS.AUTO_SAMPLE_SIZE
you are using the auto_sample_size, therefore, it is free to sample.
It performed as documented - behavior just changes between releases (and note that the row count is pretty much dead on in 10gR2 anyway...) 
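If you want the old "compute" behavior from dbms_stats, say so explicitly rather than inheriting the AUTO_SAMPLE_SIZE default - a sketch (a NULL estimate_percent means compute):

SQL> exec dbms_stats.gather_table_stats( user, 'T2', estimate_percent => NULL );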
 
 
 
 
Help needed
Anil P, August    23, 2006 - 5:09 am UTC
 
 
Hello,
This query is taking more than 12 hrs to execute. Can you let me know which areas I need to look at to tune it?
Thanks
Anil P
SQL> select z.IND_ID,
  2         z.BLK_OPEN_DT,
  3         z.ORG_SRC_CD,
  4         z.PRM_SRC_CD,
  5         c.calendar_month_name_abbr ACC_OPEN_MTH_ABR,
  6         c.calendar_month ACC_OPEN_MTH,
  7         c.calendar_year ACC_OPEN_YEAR
  8    from mrtlookup.calendar_date c,
  9         (select ind.individual_id IND_ID,
 10                 x.BLK_OPEN_DT,
 11                 ind.original_source_cd ORG_SRC_CD,
 12                 ind.primary_source_cd PRM_SRC_CD
 13            from mrtcustomer.individual ind,
 14                 (select prp.individual_id IND_ID, min(prp.open_dt) BLK_OPEN_DT
 15                    from mrtcustomer.proprietary_account prp,
 16                         mrtcustomer.individual_segment isg,
 17                         (select (c.calendar_month_end_dt + 1) BLK_OPEN_START_DT,
 18                                 (select max(c.calendar_month_end_dt)
 19                                    from mrtlookup.calendar_date c,
 20                                         (SELECT PERIOD_DATES.WEEK_END_DT computation_dt
 21                                            FROM MRTLOOKUP.PERIOD_DATES PERIOD_DATES) e
 22                                   where c.calendar_month_end_dt <=
 23                                         e.computation_dt) END_DT
 24                            from mrtlookup.calendar_date c
 25                           where c.calendar_date =
 26                                 (select min(ind.individual_create_dt)
 27                                    from mrtcustomer.individual ind
 28                                   where ind.original_source_cd = 'TEL')) y
 29                   where y.BLK_OPEN_START_DT <= prp.open_dt
 30                     and y.END_DT >= prp.open_dt
 31                     and prp.division_ind is not null
 32                     and isg.individual_id = prp.individual_id
 33                     and isg.marketing_segment_cd = 'MAILABLE'
 34                   group by prp.individual_id) x
 35           where ind.primary_source_cd = 'GEC'
 36             and ind.individual_id = x.IND_ID) z
 37   where c.calendar_date = z.BLK_OPEN_DT;
1143156 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=204854 Card=37843 Bytes=2194894)
   1    0   HASH JOIN (Cost=204854 Card=37843 Bytes=2194894)
   2    1     TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Cost=68 Card=14640 Bytes=278160)
   3    1     NESTED LOOPS (Cost=204785 Card=37843 Bytes=1475877)
   4    3       VIEW (Cost=166771 Card=37843 Bytes=832546)
   5    4         SORT (GROUP BY) (Cost=166771 Card=37843 Bytes=2005679)
   6    5           NESTED LOOPS (Cost=58475 Card=37843 Bytes=2005679)
   7    6             NESTED LOOPS (Cost=27419 Card=31019 Bytes=1085665)
   8    7               TABLE ACCESS (BY INDEX ROWID) OF 'CALENDAR_DATE' (TABLE) (Cost=2 Card=1 Bytes=16)
   9    8                 INDEX (UNIQUE SCAN) OF 'PK_CD' (INDEX (UNIQUE)) (Cost=1 Card=1)
  10    9                   SORT (AGGREGATE)
  11   10                     PARTITION RANGE (ALL) (Cost=107721 Card=1302415 Bytes=15628980)
  12   11                       TABLE ACCESS (FULL) OF 'INDIVIDUAL' (TABLE) (Cost=107721 Card=1302415 Bytes=15628980)
  13    7               TABLE ACCESS (FULL) OF 'PROPRIETARY_ACCOUNT' (TABLE) (Cost=27417 Card=31019 Bytes=589361)
  14   13                 SORT (AGGREGATE)
  15   14                   NESTED LOOPS (Cost=78 Card=732 Bytes=11712)
  16   15                     TABLE ACCESS (FULL) OF 'PERIOD_DATES' (TABLE) (Cost=9 Card=1 Bytes=8)
  17   15                     TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Cost=69 Card=732 Bytes=5856)
  18    6             PARTITION RANGE (ITERATOR) (Cost=1 Card=1 Bytes=18)
  19   18               INDEX (UNIQUE SCAN) OF 'PK_ISEG_A' (INDEX (UNIQUE)) (Cost=1 Card=1 Bytes=18)
  20    3       PARTITION RANGE (ITERATOR) (Cost=2 Card=1 Bytes=17)
  21   20         TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'INDIVIDUAL' (TABLE) (Cost=2 Card=1 Bytes=17)
  22   21           INDEX (UNIQUE SCAN) OF 'PK_IND' (INDEX (UNIQUE)) (Cost=1 Card=1)
Statistics
----------------------------------------------------------
        274  recursive calls
          4  db block gets
   14526033  consistent gets
    1299590  physical reads
        232  redo size
   38571751  bytes sent via SQL*Net to client
     838832  bytes received via SQL*Net from client
      76212  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
    1143156  rows processed
 
 
 
August    27, 2006 - 3:23 pm UTC 
 
 
are the ESTIMATED cardinalities in the explain plan above "realistic"? (I'd say "no", based on the rows actually returned versus the estimated rows)
are the statistics "up to date"? 
 
 
 
Thanks
Anil P, August    29, 2006 - 9:20 am UTC
 
 
Hello Tom,
Yes, the statistics are current and realistic. And it's a production db on Oracle 10g. Here's the latest explain plan output.
This query is generated by a Cognos tool.
Can you tell me which areas need to be looked into?
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3257 Card=1 Bytes=103)
   1    0   FILTER
   2    1     SORT (GROUP BY) (Cost=3257 Card=1 Bytes=103)
   3    2       MERGE JOIN (CARTESIAN) (Cost=1344 Card=445 Bytes=45835)
   4    3         NESTED LOOPS (Cost=1281 Card=1 Bytes=77)
   5    4           NESTED LOOPS (Cost=1280 Card=1 Bytes=59)
   6    5             NESTED LOOPS (Cost=369 Card=455 Bytes=15925)
   7    6               TABLE ACCESS (BY INDEX ROWID) OF 'CALENDAR_DATE' (TABLE) (Cost=2 Card=1 Bytes=16)
   8    7                 INDEX (UNIQUE SCAN) OF 'PK_CD' (INDEX (UNIQUE)) (Cost=1 Card=1)
   9    8                   SORT (AGGREGATE)
  10    9                     TABLE ACCESS (FULL) OF 'INDIVIDUAL' (TABLE) (Cost=1847 Card=81250 Bytes=975000)
  11    6               TABLE ACCESS (FULL) OF 'PROPRIETARY_ACCOUNT' (TABLE) (Cost=367 Card=455 Bytes=8645)
  12   11                 SORT (AGGREGATE)
  13   12                   NESTED LOOPS (Cost=66 Card=740 Bytes=11840)
  14   13                     TABLE ACCESS (FULL) OF 'PERIOD_DATES' (TABLE) (Cost=2 Card=1 Bytes=8)
  15   13                     TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Cost=64 Card=740 Bytes=5920)
  16    5             TABLE ACCESS (BY INDEX ROWID) OF 'INDIVIDUAL' (TABLE) (Cost=2 Card=1 Bytes=24)
  17   16               INDEX (UNIQUE SCAN) OF 'PK_IND' (INDEX (UNIQUE)) (Cost=1 Card=1)
  18    4           INDEX (UNIQUE SCAN) OF 'PK_ISEG' (INDEX (UNIQUE)) (Cost=1 Card=1 Bytes=18)
  19    3         BUFFER (SORT) (Cost=1344 Card=14792 Bytes=384592)
  20   19           TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Cost=63 Card=14792 Bytes=384592)
Thanks
Anil 
 
August    29, 2006 - 3:28 pm UTC 
 
 
eh?  this is totally different from above.
anyway - I will not be able to automagically fix this 
 
 
 
Initial Load and statistics
A reader, September 08, 2006 - 8:51 am UTC
 
 
Hi Tom,
I am doing a massive initial load. During this load I am inserting into several tables and at the same time selecting from these tables - this is why I kept the indexes usable. In addition, when I am 50% through the initial load I gather statistics using:
 DBMS_STATS.GATHER_SCHEMA_STATS (schema_name, 30)
I wonder if you would advise me to do this or not.
I want to remove this call to DBMS_STATS, but want to be sure that doing so will improve the performance of my process.
Thanks for your advice. 
 
September 09, 2006 - 11:46 am UTC 
 
 
depends on whether the queries you issue would benefit from this regather.
It is quite possible that you need not gather stats - but since YOU are doing the load (you know the stats already!!! you know the number of rows loaded, you know the average row width, you can guess at the number of blocks and so on) you could just call SET statistics and provide the correct values. 
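A minimal sketch of that (the numbers are illustrative - you would plug in what you know from the load):

begin
    dbms_stats.set_table_stats(
       ownname => user,
       tabname => 'T',
       numrows => 5000000,   -- rows you just loaded
       numblks => 80000,     -- your estimate from row width and block size
       avgrlen => 120 );     -- average row length in bytes
end;
/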
 
 
 
GATHER STALE
Neeraj Nagpal, September 15, 2006 - 2:49 pm UTC
 
 
Hi Tom,
I have a question for you regarding the GATHER STALE option of DBMS_STATS. Here is a quick little background about my database setup: I have 4 very large composite-partitioned tables -- with 50 partitions and 64 (hash) sub-partitions in each range partition. Each one of these 4 tables has on an average around 120 Million rows, with somewhere around 250 columns each.
Now, the data in these 50 partitions is pretty skewed in terms of the number of records in them. And each of these partitions has its own peculiar update cycle -- some states are updated (much) more than others and some see more inserts than others. What I want to do, instead of indiscriminately issuing
"exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );", is get just a listing of the stale partitions within the tables and selectively analyze the ones that need it most. Is there a system table which can tell me which particular partitions of a table are STALE?
Thanks so much for your help,
Neeraj
 
 
September 16, 2006 - 2:27 pm UTC 
 
 
look at "LIST STALE" as the option 
 
 
 
Please answer
Reader, September 21, 2006 - 3:50 am UTC
 
 
Sir,
Please clarify my doubt.
q1) Why is the CUSLED_KEY_1 index being used instead of the CUSLED_PK primary key index?
q2) Why is it filtering on ACCOUNT and AGE_PERIOD instead of using the PK?
q3) Or is whatever is happening correct?
But when I disable the CUSLED_KEY_1 index, it uses the CUSLED_PK index.
SQL> select index_name, substr(column_name,1,40) cname from all_ind_columns
  2  where table_name = 'CUSLED';
INDEX_NAME                     CNAME
------------------------------ ----------------------------------------
CUSLED_PK                      LEDGER
CUSLED_PK                      ACCOUNT
CUSLED_PK                      AGE_PERIOD
CUSLED_PK                      DOC_REF
CUSLED_KEY_1                   LEDGER
CUSLED_KEY_1                   DOC_REF
CUSLED_KEY_2                   LEDGER
CUSLED_KEY_2                   ACCOUNT_REF
CUSLED_KEY_3                   LEDGER
CUSLED_KEY_3                   CHEQUE_NO
CUSLED_ACCOUNT_FK              ACCOUNT
SQL> SELECT *
  2  FROM CUSLED
  3  WHERE CUSLED.LEDGER = 'UD'
  4  AND CUSLED.ACCOUNT = '0000070101'
  5  AND CUSLED.AGE_PERIOD = 'A510'
  6  AND CUSLED.DOC_REF = '30001013'
  7  AND CUSLED.REC_STA != 'X';
Execution Plan
----------------------------------------------------------
Plan hash value: 2871354732

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |   405 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| CUSLED       |     1 |   405 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | CUSLED_KEY_1 |     3 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CUSLED"."ACCOUNT"='0000070101' AND "CUSLED"."AGE_PERIOD"='A510' AND
              "CUSLED"."REC_STA"<>'X')
   2 - access("CUSLED"."LEDGER"='UD' AND "CUSLED"."DOC_REF"='30001013')
SQL> ALTER INDEX CUSLED_KEY_1 UNUSABLE;
Index altered.
SQL> SELECT *
  2  FROM CUSLED
  3  WHERE CUSLED.LEDGER = 'UD'
  4  AND CUSLED.ACCOUNT = '0000070101'
  5  AND CUSLED.AGE_PERIOD = 'A510'
  6  AND CUSLED.DOC_REF = '30001013'
  7  AND CUSLED.REC_STA != 'X';
Execution Plan
----------------------------------------------------------
Plan hash value: 4156143435

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |   405 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| CUSLED    |     1 |   405 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | CUSLED_PK |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CUSLED"."REC_STA"<>'X')
   2 - access("CUSLED"."LEDGER"='UD' AND "CUSLED"."ACCOUNT"='0000070101' AND
              "CUSLED"."AGE_PERIOD"='A510' AND "CUSLED"."DOC_REF"='30001013')
Thanks 
 
 
September 22, 2006 - 1:50 am UTC 
 
 
the costs in this case were so close that they "tied" - either index was, in this case, equally appropriate. 
 
 
 
Still awaiting your reply
Reader, September 22, 2006 - 12:29 am UTC
 
 
Still awaiting your reply 
 
September 22, 2006 - 2:51 am UTC 
 
 
gee, thanks for letting me know?
I was hungry last night and went out for food....   
 
 
 
LIST STALE PARTITIONS
Neeraj Nagpal, September 26, 2006 - 2:35 pm UTC
 
 
Tom,
Thanks for your answer to my question -- titled "GATHER STALE" -- asked on September 15, 2006. Briefly searching through your website, I came across this procedure that lists the STALE tables -- changed since the last-analyzed date -- but I am more interested in finding the STALE partitions rather than just the stale tables. How can one display the names of the stale partitions using the following script?
 declare
        l_objList  dbms_stats.objectTab;
begin
        dbms_stats.gather_schema_stats
        ( ownname        => USER,
          options        => 'LIST STALE',
          objlist        => l_objList );
        for i in 1 .. l_objList.count
        loop
                dbms_output.put_line( l_objList(i).objType );
                dbms_output.put_line( l_objList(i).objName );
        end loop;
end;
/
Appreciate Your Help,
Neeraj 
 
September 26, 2006 - 5:03 pm UTC 
 
 
Ok, my help:
when you glanced at the documentation to see what else might be available in the dbms_stats.objectTable type - what did you see?
</code>  
http://docs.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#sthref7889   <code>
I saw this:
TYPE ObjectElem IS RECORD (
  ownname     VARCHAR2(30),     -- owner
  objtype     VARCHAR2(6),      -- 'TABLE' or 'INDEX'
  objname     VARCHAR2(30),     -- table/index
  partname    VARCHAR2(30),     -- partition
  subpartname VARCHAR2(30),     -- subpartition
  confidence  NUMBER);          -- not used
type ObjectTab is TABLE of ObjectElem;
which would indicate that other attributes of use exist there :)  
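so the loop in your script could become (a sketch):

for i in 1 .. l_objList.count
loop
        dbms_output.put_line( l_objList(i).objtype  || ' ' ||
                              l_objList(i).objname  || ' ' ||
                              nvl( l_objList(i).partname, '(table level)' ) );
end loop;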
 
 
 
Thanks Tom
Neeraj Nagpal, September 26, 2006 - 6:31 pm UTC
 
 
Thanks Tom, Yes ..it was my shortsighted search on the topic!
Neeraj Nagpal 
 
 
analyze vs. dbms_stats
Rob, October   10, 2006 - 4:12 pm UTC
 
 
Tom,
Very useful - Thank You.
I have seen in your examples the use of:
analyze table t compute statistics for table for all indexes for all indexed columns
but then you'll say something like
In real life, however, one would want to use DBMS_STATS instead.
Could you please supply the equivalent statement using dbms_stats ?
Thanks 
 
October   10, 2006 - 8:21 pm UTC 
 
 
those are all old examples, many of which pre-date the existence of dbms_stats.
dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all indexed columns', cascade => true );
would be the "same" as that particular analyze. 
 
 
 
DBMS_STATS - Locking 
A reader, November  02, 2006 - 7:09 pm UTC
 
 
Tom,
Somewhere up there in this thread, you said "certain variations of analyze can lock (to validate structures and such), but dbms_stats just runs queries basically."
However, while running the following command,
        DBMS_STATS.gather_table_stats (
            ownname          => 'ADBPROD',
            tabname          => 'OP_LAYER',
            estimate_percent => dbms_stats.auto_sample_size,
            partname         => 'CALIFORNIA',
            granularity      => 'PARTITION');
I noticed intermittently that the table OP_LAYER appeared to be locked by the gather_table_stats command, and a few users who wanted to acquire locks on this table had to wait.
See the screen dump:
SQL> Select a.username, b.owner||'.'||b.name, b.type, b.mode_held, b.mode_requested
  2    from v$session a, dba_ddl_locks b
  3   where a.sid = b.session_id
  4     and b.mode_held = 'None' and b.mode_requested <> 'None'
11:29:19   5  /

USERNAME  B.OWNER||'.'||B.NAME   TYPE                  MODE_HELD MODE_REQU
--------- ---------------------- --------------------- --------- ---------
ADBPROD   ADBPROD.OPINION_LAYER  Table/Procedure/Type  None      Share     (<- Share lock requested on the table)
ADBPROD   ADBPROD.OPINION_LAYER  Table/Procedure/Type  None      Share
ADBPROD   ADBPROD.OPINION_LAYER  Table/Procedure/Type  None      Share
ADBPROD   ADBPROD.OPINION_LAYER  Table/Procedure/Type  None      Share
ADBPROD   ADBPROD.OPINION_LAYER  Table/Procedure/Type  None      Share

11:37:24 ADBPROD@PROD SQL> Select a.username, b.owner||'.'||b.name, b.type, b.mode_held, b.mode_requested
12:02:21   2    from v$session a, dba_ddl_locks b
12:02:21   3   where a.sid = b.session_id
12:02:21   4     and b.mode_held = 'None' and b.mode_requested <> 'None'
12:02:21   5  /

no rows selected   <- NO LOCKS ON THE TABLE, AFTER A SHORT INTERVAL
Could you please tell me: is this one of those "certain variations" of DBMS_STATS that can lock the table being analyzed?
 
 
 
November  02, 2006 - 9:07 pm UTC 
 
 
would have been useful to see the blocker and their lock mode? 
 
 
 
Blockers
A reader, November  03, 2006 - 1:58 pm UTC
 
 
Well, I did check for blockers/waiters using the query below, but got back -- no rows returned --
SELECT sid,
       DECODE( block, 0, 'NO', 'YES' ) blocker,
       DECODE( request, 0, 'NO', 'YES' ) waiter
  FROM v$lock
 WHERE request > 0 OR block > 0
 ORDER BY block DESC;
Although I didn't see any blockers or waiters in the database at the time, users were still getting briefly blocked (for 2-3 minutes, then released) on the table OP_LAYER. Not sure why. And I know the only process accessing the table OP_LAYER was DBMS_STATS.
 
 
November  04, 2006 - 12:08 pm UTC 
 
 
v$sesstat
v$session_event
would be useful if you see it again. 
 
 
 
Is Commit after each table Analyze can make difference?
Maulesh Jani, November  21, 2006 - 12:08 pm UTC
 
 
HI TOM, 
    We are gathering stats on the schema in the following way:
for rec_table_name in ( select table_name from user_tables
                        minus
                        select table_name from user_external_tables )
loop
    dbms_stats.gather_table_stats( 'XXX', rec_table_name.table_name,
                                   estimate_percent => 40,
                                   degree           => 4,
                                   cascade          => TRUE );
end loop;
Now, according to you, if I put a commit after each loop iteration, can that reduce the overhead on the UNDO/REDO/TEMP tablespaces, or is it nothing to do with that?
 
 
November  22, 2006 - 3:47 pm UTC 
 
 
according to me, it would be a waste of time, since dbms_stats already commits.
ops$tkyte%ORA10GR2> create table t1 ( x int );
Table created.
ops$tkyte%ORA10GR2> create table t2 ( x int );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 values ( 1 );
1 row created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> rollback;
Rollback complete.
ops$tkyte%ORA10GR2> select * from t1;
         X
----------
         1
and there is no such thing as "overhead on redo/undo/temp" - I am not familiar with what you might be thinking. 
 
 
 
 
Can you please help me on this ..
Maulesh Jani, November  22, 2006 - 12:19 am UTC
 
 
Hi Tom ,
       I need your help on this. As mentioned in my question above, we have had a gather-schema-stats process in our production system for 3 years. Now suddenly this process fails with the error below:
Error While Analyzing Schema tables :
declare
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10$"
too small
ORA-06512: at line 12
Can you please tell me: is a large amount of redo generated during stats gathering? I have not been able to find any convincing document on it. And the error is the typical snapshot-too-old error - how did it occur during stats gathering?
No other process runs in parallel with this one.
 
Regards
Maulesh  
 
November  24, 2006 - 12:29 pm UTC 
 
 
that has nothing to do with redo
everything to do with undo.
snapshot too old happens when you have a query (and that is what dbms_stats does, runs queries) that is in need of undo that was overwritten.
increase your undo retention and/or review v$undostats to ensure you are not expiring undo prematurely. 
 
 
 
Mayday
Apurva, November  30, 2006 - 1:46 pm UTC
 
 
We are handling massive volumes of data in our warehouse. Most of the fact tables will be appended with more than 200 million rows each month, and in the steady state the facts will have close to 2.5 billion rows.
The end-users of our tables are some analysts -- so to give these guys a low downtime (or, sleepless nights - speaking dysphemistically), we came up with the following strategy:
* We'll have two sets of tables
  -> Temporary tables - in which we will load each month's data; these tables will not be partitioned 
  -> Final tables - each month we'll append Temporary tables' content to these tables through partition-exchange; Final tables will we range-partitioned on Month_ID
* As indexing is a time-intensive process -- we will build all the indexes in the world on the set of Temporary tables and have 'including indexes' clause in the partition-exchange statement
Now, here's where I am stumped (assuming you have played cricket)
 * As my tables are pretty big in size (though their sheer size makes them ugly), and they will be used by analysts -- it's almost mandatory to gather statistics for these tables.   Unfortunately, there is no option for a clause like 'including statistics' in partition-exchange statement -- by which I'll gather statistics for the Temporary table and carry the stats to the Final table (it doesn't make much sense either). So, the only option I am left with is to dump data in the Final table, and analyze the entire Final table for statistics -- but with 2.5 billion rows it'll be eons before it gets analyzed. Or is there a way of escaping this?, which'll help me realize my sadistic objectives.
Thanks Tom.
 
 * We load each month's data in a Temporary table
 * We build all the indexes we want on the temporary table
 * Partition-exchange Temporary table with the Final table 
 
November  30, 2006 - 3:32 pm UTC 
 
 
from above:
...
you can import/export/set statistics directly with dbms_stats
.....
                      ^^^
the partition stats will go, you'll just roll up the global stats you want to roll up - using your knowledge of the data
ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int not null,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
Table created.
ops$tkyte%ORA10GR2> insert into t select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), user_id, username from all_users;
31 rows created.
ops$tkyte%ORA10GR2> create index t_idx on t(x) local;
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
        31
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name =
  2  'T';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JUNK                                   10
PART1                                  10
PART2                                  11
ops$tkyte%ORA10GR2> select num_rows from user_indexes where table_name = 'T';
  NUM_ROWS
----------
        31
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_ind_partitions where index_name =
  2  'T_IDX';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JUNK                                   10
PART1                                  10
PART2                                  11
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table temp (dt date, x int not null, y varchar2(30) );
Table created.
ops$tkyte%ORA10GR2> insert into temp
  2  select to_date( '12-mar-2003'), user_id, username from all_users;
31 rows created.
ops$tkyte%ORA10GR2> create index temp_idx on temp(x);
Index created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'TEMP' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table t
  2  exchange partition part1
  3  with table temp
  4  including indexes
  5  without validation
  6  /
Table altered.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
  2          for x in (select sum(num_rows) nrows from user_tab_partitions where table_name = 'T')
  3          loop
  4                  dbms_stats.set_table_stats( user, 'T', numrows => x.nrows );
  5                  dbms_stats.set_index_stats( user, 'T_IDX', numrows => x.nrows );
  6          end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
        52
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JUNK                                   10
PART1                                  31
PART2                                  11
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select num_rows from user_indexes where table_name = 'T';
  NUM_ROWS
----------
        52
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_ind_partitions where index_name =
  2  'T_IDX';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JUNK                                   10
PART1                                  31
PART2                                  11
 
 
 
 
 
Thanks gazillion
Apurva, December  01, 2006 - 1:15 am UTC
 
 
 
 
Local vs. (Global + Local) statistics
Apurva, December  06, 2006 - 6:00 am UTC
 
 
Tom,
Consider this scenario:
My table has 12 partitions, with ~200 million rown in each partition. I have statistics gathered for each individual partition, but don't have the Global statistics for the table.
Can I expect to achieve a significant increase in query performance with Global indexes gathered? 
Or, gathering Global stats will only be too consuming without much improvement in query response (as I already have stats for individual partitions)?
I understand this question is a bit abstract, and that I can test it myself -- but I don't enough data or resources to test it.
Please share your invaluable views.
Thanks 
 
December  07, 2006 - 8:33 am UTC 
 
 
...
Can I expect to achieve a significant increase in query performance with Global 
indexes gathered? 
....
one of three things can be expected:
a) increase in performance
b) decrease in performance (unlikely, but possible)
c) no change at all in performance
in other words - IT DEPENDS.
if your queries all include the partition key in such a way that the optimizer knows "only one partition will be hit", local statistics will be used to optimize that query.
global statistics may never be used (meaning C is likely)
on the other hand, if you run queries that do not use partition elimination down to a single partition - then global statistics will be used...
Now, are you sure you don't have SOME global stats :)
ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
  2  select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), object_id, object_name
  3    from all_objects;
50122 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'PART1', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'PART2', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'JUNK', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
     50122
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
PARTITION_NAME                   NUM_ROWS
------------------------------ ----------
JUNK                                16707
PART1                               16707
PART2                               16708
 
 
 
 
 
Thanks Tom!
Apurva, December  07, 2006 - 9:33 am UTC
 
 
 
 
Off-the-topic
Apurva, December  07, 2006 - 4:15 pm UTC
 
 
"What gets us into trouble is not what we don't know;
it's what we know for sure that just ain't so."
-- Twain 
 
 
sys_op_countchg - Usage
Arindam Mukherjee, January   04, 2007 - 4:27 am UTC
 
 
Sir,
 Today I was reading a chapter "The Clustering Factor" written by Mr. Jonathan Lewis in his new book (Chapter - 5) and have found one function "sys_op_countchg ()". I have found this explanation rather opaque due to my poor knowledge on Oracle. Could you kindly explain its usage with your inimitable style for us? 
Why and How to use this function in live or production database Oracle 9i and 10g?
 
January   05, 2007 - 9:03 am UTC 
 
 
you do not, it is not documented, do not use it. 
 
 
Last_analyzed shows the incorrect date
Maulesh, January   22, 2007 - 7:19 am UTC
 
 
Hi TOM, 
       Thanks for the support and sharing knowledge about Oracle. We have implemented Schedule Job for statistics gathering . For which we are using Analyze statement . 
But currently we come to know that values for Last_analyzed is not geeting updated daily and it is just remain as it is . Can you please help me on this ?
Thanks  
 
system stats need updating?
steve, February  21, 2007 - 11:53 am UTC
 
 
Oracle 10G
Solaris 9
Regarding the "out of the box" gathering of system statistics, I'm wondering why this query is so slow. It takes well over 1 minute to run and there's not that many connections/processes (< 500).
select p.username pu,
       s.username su,
       s.status stat,
       s.sid ssid,
       s.serial# sser,
       lpad(p.spid,7) spid,
       to_char(logon_time,'YYYY-MM-DD HH24:MI:SS') logon_time,
       substr(sa.sql_text,1,540) txt
from v$process p,  v$session s, v$sqlarea sa
where    p.addr=s.paddr
and      s.username is not null
and      s.sql_address=sa.address(+)
and      s.sql_hash_value=sa.hash_value(+)
order by 1,2,7
The explain plan is:
====================
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1091464685
----------------------------------------------------------------------------------------------
| Id  | Operation                  | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                 |     1 |   738 |   100   (4)| 00:00:02 |
|   1 |  SORT ORDER BY             |                 |     1 |   738 |   100   (4)| 00:00:02 |
|*  2 |   HASH JOIN OUTER          |                 |     1 |   738 |    99   (4)| 00:00:02 |
|   3 |    NESTED LOOPS            |                 |     1 |   204 |    73   (2)| 00:00:01 |
|*  4 |     HASH JOIN              |                 |     1 |   191 |    49   (3)| 00:00:01 |
|*  5 |      FIXED TABLE FULL      | X$KSUSE         |     1 |   142 |    24   (0)| 00:00:01 |
|*  6 |      FIXED TABLE FULL      | X$KSUPR         |     1 |    49 |    24   (0)| 00:00:01 |
|*  7 |     FIXED TABLE FIXED INDEX| X$KSLED (ind:2) |     1 |    13 |    24   (0)| 00:00:01 |
|*  8 |    VIEW                    | GV$SQLAREA      |   100 | 53400 |    25   (4)| 00:00:01 |
|*  9 |     FILTER                 |                 |       |       |            |          |
|  10 |      SORT GROUP BY         |                 |   100 | 72300 |    25   (4)| 00:00:01 |
|  11 |       FIXED TABLE FULL     | X$KGLOB         |   100 | 72300 |    24   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("S"."KSUSESQL"="ADDRESS"(+) AND "S"."KSUSESQH"="HASH_VALUE"(+))
   4 - access("ADDR"="S"."KSUSEPRO")
   5 - filter("S"."KSUUDLNA" IS NOT NULL AND "S"."INST_ID"=:B1 AND
              BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0)
   6 - filter("INST_ID"=:B1 AND BITAND("KSSPAFLG",1)<>0)
   7 - filter("S"."KSUSEOPC"="E"."INDX")
   8 - filter("INST_ID"(+)=:B1)
   9 - filter(SUM(DECODE("KGLOBT09",0,"KGLOBT02",0))<>0)
30 rows selected.
SQL> select count(*) from X$KSUSE;
  COUNT(*)
----------
      1200
SQL> select count(*) from X$KSUPR;
  COUNT(*)
----------
       700
SQL> select count(*) from X$KSLED;
  COUNT(*)
----------
       812
The last count(*) took a long time!
====================================
SQL> set timing on
SQL>  select count(*) from X$KGLOB;
  COUNT(*)
----------
    933900
Elapsed: 00:01:08.83
 
February  21, 2007 - 12:27 pm UTC 
 
 
you might want up to date stats on the fixed tables, yes. 
 
 
DBMS_STAT
AD, February  21, 2007 - 1:56 pm UTC
 
 
Hi Tom,
Some of my team members think that we should have a wrapper (package) that would call DBMS_STATS to gather statistics instead of using the DBMS_STATS directly. What is your opinion about this approach.
Regards
 
February  21, 2007 - 3:12 pm UTC 
 
 
depends on the goal of the wrapper package
if it simply calls dbms_stats - call for call - it would not be "smart"
if it does something "intelligent" (and you have to define intelligent in your context), then perhaps - but it would not really be a wrapper package at that point, but a set of procedural logic that gathers statistics on your system in the manner you find best. 
 
 
SEan, February  25, 2007 - 8:59 am UTC
 
 
Tom,
i used dbms_stats.gather_database_stats as :
begin sys.DBMS_STATS.GATHER_DATABASE_STATS (); end;
and found out some of the schemas ware not gathered.
does it gather all the schemas in the database as supposed?
TIA
 
February  26, 2007 - 1:24 pm UTC 
 
 
umm, 
versions?
what schemas?
got example? 
 
 
Sean, February  27, 2007 - 10:21 am UTC
 
 
it's 9.2.0.7 
the job completed via dbms_job and also checked no errors in alert.log. the job have a few time after Feb 25.
the schemas ware not completed:
SQL> select owner, count(*) not_c from dba_tables where last_analyzed <'24-FEB-07' group by owner;
OWNER                              NOT_C
------------------------------ ---------
BIM                                   61
BIS                                  102
ENI                                   46
FII                                   76
GL                                     1
HRI                                   53
ISC                                   10
JTF                                    6
OPI                                   24
OSM                                    1
POA                                   36
PORTAL30_DEMO                          1
SYS                                  316
13 rows selected.
the follwoing t_count is the total no. of the tables owned by the schema.
SQL> select b.owner, count(*) t_count from dba_tables b where b.owner in (select a.owner from dba_ta
bles a
  2  where a.last_analyzed <'24-FEB-07') group by b.owner;
OWNER                            T_COUNT
------------------------------ ---------
BIM                                  261
BIS                                  460
ENI                                  140
FII                                  278
GL                                   176
HRI                                  185
ISC                                   37
JTF                                  403
OPI                                  122
OSM                                  198
POA                                  138
PORTAL30_DEMO                          9
SYS                                  350
13 rows selected.
SQL> 
  
February  27, 2007 - 11:03 am UTC 
 
 
so, lets see the JOB. 
 
 
Sean, February  27, 2007 - 1:16 pm UTC
 
 
those are not LOBs. not sure why dbms_stats.gather_database_stats does not pick up those.
analyze table does work for those...
SQL> select b.owner, b.object_name, b.object_type from 
  2  dba_objects b, dba_tables a 
  3  where a.owner=b.owner
  4  and b.object_name=a.table_name
  5  and a.last_analyzed <'24-FEB-07' and a.owner='SYS'
  6  and rownum<3
  7  /
OWNER    OBJECT_NAME               OBJECT_T
-------- ------------------------- --------
SYS      FILE$                     TABLE
SYS      BOOTSTRAP$                TABLE
SQL> desc BOOTSTRAP$     
ERROR:
ORA-04043: object BOOTSTRAP$ does not exist
SQL> desc sys.BOOTSTRAP$
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- -----------------------
 LINE#                                                 NOT NULL NUMBER
 OBJ#                                                  NOT NULL NUMBER
 SQL_TEXT                                              NOT NULL VARCHAR2(4000)
SQL> select last_analyzed from dba_tables where table_name='BOOTSTRAP$';
LAST_ANAL
---------
07-OCT-06
SQL> analyze table sys.BOOTSTRAP$ compute statistics;
Table analyzed.
SQL>  select last_analyzed from dba_tables where table_name='BOOTSTRAP$';
LAST_ANAL
---------
27-FEB-07
SQL> 
SQL> select job
  2  , log_user
  3  , priv_user
  4  , schema_user
  5  ,       last_date
  6  , last_sec
  7  ,       this_date
  8  , this_sec
  9  ,       next_date
 10  , next_sec
 11  ,       total_time
 12  , broken
 13  , interval
 14  ,       failures
 15  , what
 16  from sys.dba_jobs where log_user='SYS'
 17  /
        LOG      PRIV     SCHEMA
    JOB USER     USER     USER     LAST_DATE LAST_SEC THIS_DATE THIS_SEC
------- -------- -------- -------- --------- -------- --------- --------
                    TOTAL
NEXT_DATE NEXT_SEC   TIME B INTERVAL             FAIL WHAT
--------- -------- ------ - -------------------- ---- --------------------
   
  40704 SYS      SYS      SYS      26-FEB-07 19:33:42
04-MAR-07 19:33:42  41484 N sysdate + 6             0 begin sys.DBMS_STATS
                                                      .GATHER_DATABASE_STA
                                                      TS (); end;
 
February  27, 2007 - 2:24 pm UTC 
 
 
LOBS?  I said "job", just wanted to see the parameters.
where those tables EVER analyzed? 
 
 
Sean, February  27, 2007 - 2:50 pm UTC
 
 
use portal30_demo as example:
SQL> select table_name ,last_analyzed from dba_tables where owner='PORTAL30_DEMO';
TABLE_NAME                     LAST_ANAL
------------------------------ ---------
EMP                            27-FEB-07
DEPT                           27-FEB-07
MLOG$_EMP                      14-NOV-04
RUPD$_EMP
EMP_SNAPSHOT                   27-FEB-07
PEOPLE_INFO$                   27-FEB-07
TASK_CATEGORY$                 27-FEB-07
TASK_DEVELOPER$                27-FEB-07
TASK$                          27-FEB-07
there are two tables were not picked by dbms_stats.gather_database_stats. 
February  27, 2007 - 3:39 pm UTC 
 
 
system generated things, so - got version?
older releases didn't really want statistics on some things. 
 
 
it's 9.2.0.7
Sean, February  27, 2007 - 5:53 pm UTC
 
 
 
February  28, 2007 - 3:39 pm UTC 
 
 
gather database stats skips those two particular segments. 
 
 
dynamic sampling
Amit, March     01, 2007 - 12:38 pm UTC
 
 
Hi Tom,
Thanks for sharing your vast experience and knowledge with us.
From discussions about dynamic sampling above, I gathered the following information-
1)In 9i we have to collect the stats ourself,
2)In 10g, Oracle calc the stats for us (when we first fire the query during its hard parse)
Assuming point 2 above is right, I tried a test in 10g R2
But what I have noticed is dynamic sampling is not kicked in automatically as you can see in Area 1 below ?
and other issues in area 2 and area 3
10g Rel 2
=========================================================
AREA 0)
SQL> create table temp
  2  (a number,
  3  b number);
Table created.
SQL> declare
  2  begin
  3  for i in 1..100000
  4  loop
  5  insert into temp values(i,i+1);
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
============================================================
AREA 1)
SQL> explain plan for select a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
-----------------------------------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| TEMP |
----------------------------------
Note
-----
   - 'PLAN_TABLE' is old version
   - rule based optimizer used (consider using cbo)
12 rows selected.
===========================================================
So in Above it is not using dynamic_sampling automatically because it is using RBO and not CBO
But if I hint it to use dynamic sampling (as follows) then it shift to CBO and Rows are guesses nearly Right
=======================================
AREA 2)
SQL> explain plan for select /*+ dynamic_sampling (t 10) */ a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
----------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 83932 |  1065K|    57 |
|   1 |  TABLE ACCESS FULL| TEMP | 83932 |  1065K|    57 |
----------------------------------------------------------
Note
-----
   - 'PLAN_TABLE' is old version
11 rows selected.
========================================================
Now dynamic sampling is done in previous step so it should hard parsed and plan is ready in shared pool. This same plan should be used if I fire syntactically same query again but again it is going back to RBO (as follows)
===============================================
AREA 3)
SQL> explain plan for select a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|   1 |  TABLE ACCESS FULL| TEMP |
----------------------------------
Note
-----
   - 'PLAN_TABLE' is old version
   - rule based optimizer used (consider using cbo)
12 rows selected.
================================================
 
March     02, 2007 - 12:48 pm UTC 
 
 
1) you have dynamic sampling in 9i as well - it is just defaulting to a "lower level" than 10g did.
the RBO doesn't sample because the rbo would never use the information.  Do not use the RBO in 10g. 
 
 
Tables with stale stats ignored by dbms_stats
A reader, March     01, 2007 - 4:35 pm UTC
 
 
Tom,
This is 10gr2. We use the following command to gather statistics:
DBMS_STATS.GATHER_SCHEMA_STATS(ownname=>'ABC',GRANULARITY=>'ALL',CASCADE
=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS size 1',DEGREE=>DBMS_STATS.AUTO_DEGREE);
The tables are in monitoring mode. We were told by Oracle support that the above method will ignore tables with stale statistics. Here is is quote from analyst:
=============
It is because some of the objects become stale and are ignored while gathering schema statistics. It is better to use 
options=>'GATHER STALE' in the procedure. 
=============
I always thought that if I do not mention any option, it will gather statistics for all tables in the schema, stale or not. 
Thanks
 
 
Analyze process is taking very long
Apurva, March     05, 2007 - 11:16 am UTC
 
 
Tom,
I am using the following piece of PL-SQL code to analyze a 20-column table which has ~650 million rows. 
 
<SQL>
        begin
        dbms_stats.gather_table_stats(ownname => 'ops\$datamart', tabname => 'FCT_SALES', estimate_percent => 15 ,
                                        granularity  => 'ALL', cascade => true, force => true);
        end;
        /
</SQL>
The problem is that the analyze process is taking close to 21 hours!, although we have a very good architecture. 
What would be your take on this?
Thanks in anticipation
 
March     05, 2007 - 2:25 pm UTC 
 
 
... although we have a very good architecture.  ...
well, a pretty building won't make anything run faster (sorry, I could not resist - although we have a very good architecture... don't even know what to make of that!! :)
if the table is dense, you could try block sampling (do 15% of the IO to read).
You could also introduce "parallel" into it.
even if you do not use parallel on the table itself, you could run more than one gather against the table + indexes, instead of cascade...
or if you are on 10g, you might not even need cascade if you are re-creating the indexes after a load since the indexes collect stats during a create now.
but - you really don't give us much to work with.  20 columns and 650 million rows and a "very good archicture" plus $1 will get you a cup of coffee (not at starbucks of course).
you don't mention why you gather statistics (what do you do that causes this, a truncate and reload, a load of a new partition, "time" - what)
you don't tell us about the structure of this table
you don't tell us very much........ 
 
 
Table hung when dbms_stats
A reader, March     08, 2007 - 2:38 pm UTC
 
 
Hi Tom,
We have one partitioned table with 3986259 rows, we have script scheduled (twice a week) to gather stats on this table. This table usually takes maximum 10Mins to gather stats without any issues. But yesterday this job did not finished  at all, we had to kill that process after 14Hours. When this was running any query on this table was hanging, even "desc" from sqlplus hung. As soon as this process was killed all was normal. Due to this reason any application that was trying to access this table was just freezing and brought down the whole system.
DBMS_STATS.gather_table_stats ('AUSER',
                                'ATABLE',
                                  estimate_percent      => 5,
                                  CASCADE               => TRUE
                                 );
Do you know what could be the problem?
Many Thanks
 
 
March     08, 2007 - 3:34 pm UTC 
 
 
please utilize support for this one. 
 
 
A reader, March     08, 2007 - 2:56 pm UTC
 
 
Sorry forgot to mention that this Oracle 9.2.0.6 RAC on Sun Solaris
Thanks 
 
A reader, March     12, 2007 - 4:22 am UTC
 
 
Thanks 
 
Thanks, but...
Apurva, March     16, 2007 - 4:19 pm UTC
 
 
Tom,
Thanks a lot for all your help, and sorry for following-up a bit late: the system as well the building it's housed in have a wonderful architecture :-) 
 * 8 CPUs
 * 32 GB RAM
 * Oracle 10g R2 - both client and server run on Linux
 * 1071, Fifth Avenue at 89th Street
Here are the details of how we load out fact table:
 * The table is partitioned by MONTH (date format)
 * In every refresh we first truncate the entire table
 * Then using SQL loader we populate historical as well as current month's data [and in a given month we load about 200 million rows]
 * Then we build 12 local bitmap indexes on 12 non-fact columns
 * Then we would gather the statistics of the table using the following procedure:
<SQL>
    begin
    dbms_stats.gather_table_stats(ownname => 'ops\$datamart', tabname => 'FCT_SALES', estimate_percent => 15 ,
                          granularity => 'ALL', cascade => true, force => true);
    end;
    /
</SQL> 
...but it would take close to a million years (~21 hours)
I came to you and as per your suggestion I removed 'force => true' option, but Oracle is misbehaving and is still gathering statistics for the indexes?! such that the overall and the time taken is still ~20 hours. Moreover, it's index analyzation that is eating most of the time - we found that out of 20 odd hours that the entire process takes, Oracle spends close to 18 hrs to gather stats for indexes...and quite obviously it's not a good idea as - as you remarked - in 10g index stats are automatically gathered when the index is built.
Seems it's an Oracle bug? What are your thoughts?
Thanks in anticipation
 
March     17, 2007 - 4:02 pm UTC 
 
 
cascade is what affects indexes, oracle is not misbehaving, it is simply doing what you told it to.
cascade=>false, not force.
did you try ANY of the other things I mentioned??!!??
 
 
 
Apurva, March     17, 2007 - 11:09 pm UTC
 
 
that's what happens when you write your mails in anticipation of St. Patrick's day...
I simply removed 'cascade=>true' (and not 'force=>true' - as I miswrote) and did not explicitly write 'cascade=>false', in the fond hope that default value for cascade parameter is 'false'
I tried your 'block sampling' suggestion and found that it did not affect analyze time substantially - index-analyzation is the killer
Tom, 
Thanks *A LOT* again. 
 
 
it's working!!!!!!
Apurva, March     17, 2007 - 11:22 pm UTC
 
 
 
 
wrapper for DBMS_STATS
Chris Gould, March     20, 2007 - 7:10 am UTC
 
 
I noticed recently that if you use DBMS_STATS.GATHER_TABLE_STAS with CASCADE=>TRUE then it gathers stats for the table and each index in turn. This can take a long time, especially if there are a lot of indexes on the table (eg. in a warehouse). One way to speed this up is to use DBMS_SCHEDULER to create and run jobs for each index on the table simultaneously.  Scheduler and the resource manager should ensure the database is not swamped and the jobs all complete as efficiently as possible.  The following code indicates the idea :
begin
  for r in (select ui.index_name
                  ,uip.partition_name
                  ,dbms_scheduler.generate_job_name('IXAZ_') as jobname
              from user_indexes ui
            inner join user_ind_partitions uip
             on (uip.index_name = ui.index_name)
            where ui.table_name = 'ASSET'
              and uip.partition_name = 'P200703'
              order by ui.index_name,uip.partition_name )
  loop
  
        execute immediate
'BEGIN
   DBMS_SCHEDULER.CREATE_JOB (
   job_name             => :jobname, 
   job_type             => :jobtype,
   job_action           => :jobaction,
   start_date           => NULL,
   repeat_interval      => NULL, 
   end_date             => NULL,
   enabled              => TRUE,
   auto_drop            => TRUE,
   comments             => :jobcomments);
END;'
using 
        r.jobname
      ,'PLSQL_BLOCK'
      ,'begin dbms_stats.gather_index_stats(
         ownname=>'''||user||'''
         ,indname=>'''||r.index_name||'''
         ,partname=>'''||r.partition_name||''''||');end;'
      ,'user='||user||',index='||r.index_name||',ptn='||r.partition_name;
      
end loop;
end;
  The code above only considers the indexes for one partition on the table, but the total time to analyze all the indexes will be approx the time taken for the largest index if analyzed individually. 
 
 
wrapper for DBMS_STATS - additional info
Chris Gould, March     21, 2007 - 5:28 am UTC
 
 
One suggestion I'd make regarding the DBMS_SCHEDULER wrapper to DBMS_STATS is not to specify a degree of parallelism > 1 when analyzing.  Using a large degree of parallelism seems to confuse the resource manager - I guess because it thinks it's executing one job but this then spawns multiple sessions for the additional degrees of parallelism - and you can quickly exhaust the supply of available sessions.  
 
Need Guidance on Partitioned tables
Vk, September 20, 2007 - 5:45 am UTC
 
 
Hi Tom,
We have huge patririoned tables which are continually loaded on a daily basis based on Range partition, and we do maintain 30 days worth of retention. The data for the past 29 days is static for the 30th day and we don't want to re-compute the partition level statistics again for them as they have become stale.
However we drop the old partition and adds a new partition on a daily basis , what is termed as Rolling Window.
Now please provide your suggestions:
1. We are doing gather_table_stats in a loop for the complete schema using 
dbms_stats.gather_table_stats(ownname=> l_username,
tabname=> l_table_name,
estimate_percent  => 5,
method_opt        => l_method_opt,
degree            => 16,
cascade           => true);
Where l_method_opt is :
'for columns ' || p_col_name || ' SIZE 1'; for the partitioned tables and p_col_name is the column on which the data is partitioned.
However if the table is non-partitioned then 
l_method_opt is :
'for all indexed columns SIZE 1';
Now this complete gathering of stats takes around 3 hrs and for us it is quite significant.
Can you please provide us suggestions as to how we can capture global and partition level statistics without recomputing them all over again and again.
Your advice/help is appreciated. 
September 24, 2007 - 7:15 am UTC 
 
 
why would you gather stats on the partition key only?  that seems a little strange, what is the thought process behind that?
If you gather local stats on only the single new partition - we'll do our best to "guess" the global stats.  In many cases - we'll get it right (eg: number of total rows).  In other cases, we might not - eg: number of distinct values - if you have 30 partitions and X has 10 distinct values in each partition - you either have a) 10 distinct values of X, b) 300 distinct values of X, or c) between 11 and 299 distinct values of X - the only way to truly know would be to analyze everything OR have you use dbms_stats.set_xxxxx routines to tell us. 
 
 
Thought Process of only gather stats on Partitioned Column
Vk, September 24, 2007 - 1:53 pm UTC
 
 
Hi Tom,
Since every query which comes to us has to have the WHERE clause suppoted by the PARTITIONED COLUMN we thought we don't need to gather stats on any other column, since this will drive the rows flowing from this step.
I am pretty okay in gathering only the stats for PARTITITIONED COLUMN and not on all the columns of the table as is done by Gather Stats by default. One reson is the time it takes to gather stats is very high on nearly a 1TB of database.
I am also thinking to gather the partition level stats only for the latest partition and not capture global level stats if Oracle's CBO can guess it 90% correct, I am happy with it.
Thanks,
Vikas khanna 
September 26, 2007 - 8:30 pm UTC 
 
 
umm, that is flawed
if you only get column stats on the partition column, what would the estimated card= value be for something like:
select from t where x in (select Y from partitioned_table where key=val and SOME_COL='abc')
if we know nothing about some_col, we cannot get the card=value.
so, please - go into more detail as to why you did this?  what facts did you use in coming to this conclusion?
the "partition_key=value" genearally is used for partition elimination.  after that, we still - well - need statistics. 
 
 
Analyze Vs DBMS_STATS
Karthick, October   17, 2007 - 12:40 am UTC
 
 
for a particular table iam able to use the ANALYZE command but not DBMS_STATS.
see the test case below.
Enter user-name: sysadm/sysadm@inlabtst
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> select count(1) from HX_SSN_GEN_STG_TBL;
  COUNT(1)
----------
    162439
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL');
BEGIN DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "SYSADM"."HX_SSN_GEN_STG_TBL", insufficient
privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13046
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
SQL> ANALYZE TABLE HX_SSN_GEN_STG_TBL COMPUTE STATISTICS;
Table analyzed.
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_NAMES_STG');
PL/SQL procedure successfully completed.
just for that one table iam not able to use DBMS_STATS for other tables i can. but i can use ANALYZE command on that table. any idea what could be the problem.
 
 
 
Found the answer
Karthick, October   17, 2007 - 12:53 am UTC
 
 
Sorry for posting the pervious one tom. After some search in orafaq got the answer..
SQL>  select username, privilege from user_sys_privs;
USERNAME                       PRIVILEGE
------------------------------ ----------------------------------------
SYSADM                         UNLIMITED TABLESPACE
SYSADM                         CREATE TABLE
SYSADM                         CREATE SESSION
SYSADM                         ANALYZE ANY
SQL> GRANT SELECT ANY TABLE TO SYSADM;
Grant succeeded.
SQL> select username, privilege from user_sys_privs;
USERNAME                       PRIVILEGE
------------------------------ ----------------------------------------
SYSADM                         UNLIMITED TABLESPACE
SYSADM                         SELECT ANY TABLE
SYSADM                         CREATE TABLE
SYSADM                         CREATE SESSION
SYSADM                         ANALYZE ANY
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL');
PL/SQL procedure successfully completed.
 
 
 
About the first entry
Alvin, February  21, 2008 - 7:39 pm UTC
 
 
On a 10g box.
Will running an analyze table overwrite the statistics gathered by dbms_stats earlier ?
Dbms_stats ran at 10am 
analyze table ran at 11am
Do both (analyze table and DBMS_Stat) store statistics at the same location ? (hence the earlier question of overwritting). 
February  21, 2008 - 10:43 pm UTC 
 
 
statistics are stored in one place. so, sure they would.
 
 
 
Killind DBMS_STATS in the middle of a large table
Ricardo Masashi, March     16, 2008 - 10:16 am UTC
 
 
Hi, Tom,
I was wondering what happen if we kill a DBMS_STATS running job, by some reason, in the middle of gathering statistics of a large/huge table.
That table would contain stale, last analyzed or no stats ?
My concern is if the dbms_stats:
- Delete stats before collecting;
- Collect before overwriting;
Thanks! 
March     24, 2008 - 8:26 am UTC 
 
 
well, statistics are gathered and updated and gathered and updated - and then finally committed at the end - as a transaction.
So, dbms_stats works like DDL in that sense, all or nothing. 
 
 
different execution plans when using ANALYZE versus DBMS_STATS
A reader, May       16, 2008 - 10:24 am UTC
 
 
hi tom,
i analyzed one table, one time with ANALYZE the other time with DBMS_STATS.GATHER_TABLE_STATS (commands below). i got different executions plans for the same query. good with analyze, bad with DBMS_STATS.GATHER_TABLE_STATS. when comparing all statistics i found they to be the same with the excpetion of the histograms having different slot sizes (see below). 
analyze table gfm.SUBCONFIG_CONFIGURABLE_OBJ_T estimate statistics
for table
for all indexes
for all indexed columns
sample 40 percent;
exec dbms_stats.gather_table_stats ( -
  'GFM', 'SUBCONFIG_CONFIGURABLE_OBJ_T', estimate_percent=>40 -
 ,method_opt=>'for all indexed columns', cascade=>true);
SQL> select * from (
  2  select 'ANALYZE' meth, a.* from system.t_analyze_hist a
  3  UNION ALL
  4  select 'DBMSSTATS', s.* from system.t_stats_hist s
  5  ) where column_name = 'SUBCONFIG_ID'
  6  order by 1;
METH       OWNER      TABLE_NAME                     COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- ------------------------------ -------------------- --------------- ----------
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                     782              0
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    1662              1
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    2546          19371
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    3423          19390
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    4315          19411
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    5191          19471
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    6075          19490
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    6943          19510
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    7826          19531
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    8720          19571
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    9636          19591
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   10531          19611
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   11414          19631
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   12301          19651
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   13184          19671
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   14056          19673
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   14930          19691
ANALYZE    GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   15816          19711
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    1526              0
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    3033              1
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    4396          19371
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    5745          19390
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    7119          19411
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    8486          19471
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                    9894          19490
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   11234          19510
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   12627          19531
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   13996          19571
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   15363          19591
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   16761          19611
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   18149          19631
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   19540          19651
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   20926          19671
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   22320          19673
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   23722          19691
DBMSSTATS  GFM        SUBCONFIG_CONFIGURABLE_OBJ_T   SUBCONFIG_ID                   25042          19711
36 rows selected.
Q1) could the slot size difference of the histograms be the reason for the different plan?
Q2) what should i do in the future? Going back to ANALYZE?
 
 
May       19, 2008 - 3:10 pm UTC 
 
 
q1) yes
q2) ummm, no.  your output is hard to read - not sure how many buckets each one generated, but use something that generates the same buckets
and realize that since you have histograms, you'll get different plans for different "inputs" over time.  sometimes (since you do not have a bucket PER VALUE) dbms_stats will get the better plan than analyze did - just change your "predicate" and you'll see - you have an imperfect picture of your data here. 
 
 
different execution plans when using ANALYZE versus DBMS_STATS
A reader, May       20, 2008 - 10:30 am UTC
 
 
hi tom,
thanks for your answer. while doing further investigations on ANALYZE versus DBMS_STATS i found that i get the same histograms on COMPUTE statistics, but different ones on ESTIMATE SAMPLE x percent.
i would assume now that the internal sampling is done differently by ANALYZE than DBMS_STATS. correct? 
May       20, 2008 - 11:45 am UTC 
 
 
if you ran analyze over and over - you would discover IT generates different statistics.
they are random samples, if you randomly sample something over and over - you'll get different results....
 
 
 
how oracle will use the statistics after rename a table 
Yuna, May       30, 2008 - 9:24 am UTC
 
 
Hi Tom,
We are going to delete tons of rows from a table. We will do the following:
1) create a temp table;
2) copy the good data  from old table (A) to the temp table;
3) create index and trigger in the temp table;
4) rename old table to A_archive table;
5) rename temp table to A old table;
6) trucate A_archive table;
Thw question is after the temp is rename to A old table ( the same name). Whether Oralce will use the old statistic from the old table? in that case we need to analyze the new table right way?
 
May       30, 2008 - 3:07 pm UTC 
 
 
the statistics are tracked by object ids, not the 'name'.
ops$tkyte%ORA10GR2> create table t1 as select * from all_objects;
Table created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T1' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> create table t2 as select * from t where owner = 'SCOTT';
Table created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name in ( 'T1', 'T2' );
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
T1                                  49904
T2                                     16
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table T1 rename to A_archive;
Table altered.
ops$tkyte%ORA10GR2> alter table T2 rename to T1;
Table altered.
ops$tkyte%ORA10GR2> truncate table A_archive;
Table truncated.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name in ( 'T1', 'T2', 'A_ARCHIVE' );
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
A_ARCHIVE                           49904
T1                                     16
 
 
 
 
about the statisitcs
yuna, June      10, 2008 - 2:18 pm UTC
 
 
Thank you very much Tom!
I have the following questions.
1) After creating a table T1 and doing work for a while.
a:  Without anylyze it :  select table_name, num_rows from user_tables where table_name in = 'T1';
No rows returned. But we can see the statistics from ORACLE enterprise manager console. 
b: Anylyze it :  select table_name, num_rows from user_tables where table_name in = 'T1';
Row will be returned.
Does this mean after creating a table and never analyze that table manually, Oracle database still getting the related statistics for it?
2) If I want to calculate the table size using user_extents, whether I need to analyze the table in order to get the correct number? Whether the statisitcs by analyze or dbms_ stats is only for execution plan?
Thank you again for your wisdom.
Best regards, Yuna 
 
June      10, 2008 - 2:59 pm UTC 
 
 
a:  Without anylyze it :  select table_name, num_rows from user_tables where 
table_name in = 'T1';
No rows returned. But we can see the statistics from ORACLE enterprise manager 
console. 
sorry, but that is not true, that returns a row as soon as the table is 'there', you have made a mistake somewhere.
2) user extents shows you allocated space, it is always accurate about what is allocated for the table.   
 
 
still the statistics after creating that table (2)
Yuna, June      11, 2008 - 9:48 am UTC
 
 
Hi Tom, 
Please see the following scripts about creating table without statistics in user_tables before analyzing.
SQL S3 >create table temptest2 as select * from JCL_SCHEDULE;
SQL S3 >select count(*) from temptest2;
  COUNT(*)
----------
        10
SQL S3  >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
  NUM_ROWS
----------
 SQL S3  >analyze table temptest2 compute statistics for table for all indexed columns for all
indexes;  2
Table analyzed.
 SQL S3  >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
  NUM_ROWS
----------
        10
Thank you very much for your comments.
Best regards 
June      11, 2008 - 9:55 am UTC 
 
 
yes, so, this is how it is supposed to work and you got a record EACH TIME - both times you got a record.
so, what is the question?
(do not use analyze to gather statistics, use dbms_stats) 
 
 
please read it again
A reader, June      11, 2008 - 10:51 am UTC
 
 
Please read the process again. There is no record for the first time run of :
SQL S3  >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
  NUM_ROWS
----------
Thank you!
Best regards, Yuna
 
June      11, 2008 - 11:19 am UTC 
 
 
sure there is.
select ROWNUM, num_rows from user_tables where table_name = 'TEMPTEST2';
you cannot "see" null.
ops$tkyte%ORA9IR2> select null from dual where 1=0;
no rows selected
ops$tkyte%ORA9IR2> select null from dual where 1=1;
N
-
ops$tkyte%ORA9IR2>
see the difference. 
 
 
 
no record ( or null record) returned 
A reader, June      11, 2008 - 1:55 pm UTC
 
 
Hi Tom,
Thank you for the explanation. maybe my question can be satted as this way, Why before analyzing the table ( with 10 rows in that table), running "select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';" returns NUll ( no records) from the SQLPLUS, but the result will show in DB console. After analyzing the table and run the same query, it will return the real num_rows to the SQLPLUS and db console? 
SQL S3 >create table temptest2 as select * from JCL_SCHEDULE;
SQL S3 >select count(*) from temptest2; ( this is the real number)
  COUNT(*)
----------
        10
SQL S3  >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2'; ( NULL returned before analyzing)
  NUM_ROWS
----------
 SQL S3  >analyze table temptest2 compute statistics for table for all indexed columns for all
indexes;  2
Table analyzed.
 SQL S3  >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2'; ( the real number returned).
  NUM_ROWS
----------
        10
Thank you! 
Best regards, Yuna
 
June      11, 2008 - 5:50 pm UTC 
 
 
(please stop using analyze)
tell me the precise steps you undertake in dbconsole from the home page in order to see what you say you see. 
 
 
analize table
A reader, June      12, 2008 - 9:01 am UTC
 
 
Thank you very much Tom for the answer.
I am sorry that I gave you wrong info about the db console. I could not see the stats either until I use dbms_stats to gather the infomation (exec  dbms_stats.gather_table_stats(user, 'TEMPTEST2'). There was another long existing table before. I could get info from the dbconsole, but not from the user_tables. So I applied that for the newly created table.
the step is schema-table-and the specific table.
So I could not get the stats either from user_tables or the DB console before analyzing the newly created table.
Best regards, Yuna 
 
privileges 
A reader, June      12, 2008 - 11:47 am UTC
 
 
Hi Tom,
I'd like to know what specific privilege is used to analyze a table?
Thank you!
Best regards, Yuna 
June      12, 2008 - 1:06 pm UTC 
 
 
 
 
MV logs need gather stats ?
Megala, June      22, 2008 - 6:54 pm UTC
 
 
Tom:
Does materialized view log tables need to be analyzed or gather stats (dbms_stats)? Thanks 
June      22, 2008 - 9:52 pm UTC 
 
 
they could, sure, they are a table, in 10g if queried without - we will dynamically sample them at parse time.
so you probably do not need to do anything special for them, we'll sample them at parse time and given how simply they are used - that is probably more than sufficient. 
 
 
MV log table analyze and estimate_percent => null question
Megala, June      22, 2008 - 11:04 pm UTC
 
 
Thanks Tom.
In Oracle 9i,Does MV log table always needs to be analyzed with current stats because after MVs refreshed, MV log will be emptied anyway. ?
Will there be performance problem if MV log tables are not analyzed ?
Thanks for your insight.
2) In oracle 9.2.0.6, When estimate_percent parameter is not specified, Will Oracle gather 100% table stats or ?
Example:
exec dbms_stats.gather_table_stats('CUR_DBA2', tabname => 'CTBILLINGTABLE', degree => dbms_stats.default_degree, cascade =>
true ) ;
 
June      23, 2008 - 7:42 am UTC 
 
 
1) probably not.  In 9i, dynamic sampling is set to 1
"Sample tables that have not been analyzed if there is more than one table in the query, the table in question has not been analyzed and it has no indexes, and the optimizer determines that the query plan would be affected based on the size of this object."
Hence, if you are using the CBO and we access this table using the CBO in a multi-table query, and this mlog$ table is 'large' (allocated space is big), it'll sample it.  for example:
ops$tkyte%ORA9IR2> create table t ( x int primary key, y int );
Table created.
ops$tkyte%ORA9IR2> insert into t select rownum, rownum from all_users;
35 rows created.
ops$tkyte%ORA9IR2> create materialized view log on t;
Materialized view log created.
ops$tkyte%ORA9IR2> select object_name, object_type from user_objects order by object_name;
OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------
MLOG$_T                        TABLE
RUPD$_T                        TABLE
SYS_C004615                    INDEX
T                              TABLE
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into mlog$_t
  2  select rownum, sysdate, 'x', 'x', utl_raw.cast_to_raw( rpad('x',255,'x') ) from all_objects;
30662 rows created.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2> delete from mlog$_t;
30662 rows deleted.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> show parameter optimizer_dynamic_sampling
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     1
ops$tkyte%ORA9IR2> set autotrace traceonly explain
ops$tkyte%ORA9IR2> select * from t, mlog$_t where t.x = mlog$_t.x;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=161)
   1    0   NESTED LOOPS (Cost=3 Card=1 Bytes=161)
   2    1     TABLE ACCESS (FULL) OF 'MLOG$_T' (Cost=2 Card=1 Bytes=155)
   3    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=6)
   4    3       INDEX (UNIQUE SCAN) OF 'SYS_C004615' (UNIQUE)
ops$tkyte%ORA9IR2> alter session set optimizer_dynamic_sampling = 0;
Session altered.
ops$tkyte%ORA9IR2> select * from t, mlog$_t where t.x = mlog$_t.x;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=82 Bytes=13202)
   1    0   HASH JOIN (Cost=5 Card=82 Bytes=13202)
   2    1     TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=35 Bytes=210)
   3    1     TABLE ACCESS (FULL) OF 'MLOG$_T' (Cost=2 Card=82 Bytes=12710)
ops$tkyte%ORA9IR2> set autotrace off
see how the plan is with and without dynamic sampling at it's default level...
2) the documentation is actually quite good for that:
http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96612/d_stats2.htm#1003993
.. estimate_percent - "Percentage of rows to estimate (NULL means compute). The valid range .."
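For illustration, a minimal sketch of the two equivalent calls in 9.2 (assuming a table T in your schema); both end up computing exact statistics. Note that in 10g and later the default changes: it comes from dbms_stats.get_param('ESTIMATE_PERCENT'), which is AUTO_SAMPLE_SIZE out of the box.
exec dbms_stats.gather_table_stats( user, 'T' );                           -- estimate_percent defaults to NULL = compute in 9.2
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => 100 );  -- a 100% sample, same net effect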
 
 
 
estimate_percent => null
A reader, June      24, 2008 - 6:36 pm UTC
 
 
Tom,
I just need to clarify:
estimate_percent => null (default value) is equivalent to 
estimate_percent=> 100  in dbms_stats.gather_table_stats ?
Thanks
  
 
100% table analyze
A reader, June      24, 2008 - 11:40 pm UTC
 
 
Tom:
My goal is to analyze the table 100%.
Can you suggest whether I should use the default "compute" or estimate_percent => 100 for any table?
Is there any difference between estimate(100) and compute in dbms_stats.gather_table_stats?
Under what circumstances should one use "compute" over estimate_percent => 100?
June      25, 2008 - 8:29 am UTC 
 
 
they are the same, which pleases you more?  null or 100?
 
 
 
Lock Schema or Table Stats
A reader, July      29, 2008 - 1:48 pm UTC
 
 
Which data dictionary view contains the info about whether a table or a schema's statistics is currently locked?
 
August    01, 2008 - 10:22 am UTC 
 
 
ops$tkyte%ORA10GR2> select stattype_locked from user_tab_statistics where table_name = 'T';
STATT
-----
ops$tkyte%ORA10GR2> exec dbms_stats.lock_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select stattype_locked from user_tab_statistics where table_name = 'T';
STATT
-----
ALL
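To check a whole schema at once, a sketch assuming you have access to the DBA views ('SCOTT' is a hypothetical schema name); stattype_locked is null when nothing is locked:
select owner, table_name, stattype_locked
  from dba_tab_statistics
 where owner = 'SCOTT'
   and object_type = 'TABLE'
   and stattype_locked is not null;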
 
 
 
 
confusing stats
a reader, September 04, 2008 - 4:49 pm UTC
 
 
Hey Tom:
    I created a table.
SQL>  select count(*) from MYBITMAP;
  COUNT(*)
----------
   1270064
SQL> analyze table mybitmap estimate statistics;
Table analyzed.
Elapsed: 00:00:01.44
SQL> select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
  NUM_ROWS     BLOCKS EMPTY_BLOCKS   PCT_FREE   PCT_USED  CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
   1279153      31156          204         10                     0         172
SQL> exec dbms_stats.gather_table_stats(ownname => 'JOSH', tabname => 'MYBITMAP', estimate_percent => 10);
PL/SQL procedure successfully completed.
SQL> select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
  NUM_ROWS     BLOCKS EMPTY_BLOCKS   PCT_FREE   PCT_USED  CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
   1267080      31156          204         10                     0         169
So how come the number of rows is different between analyze and the dbms_stats package, and why does neither of them match the real number of rows in the table?
Thank you very much!
 
September 05, 2008 - 7:40 am UTC 
 
 
because you used estimate on the analyze (you saw that right? the word "estimate")
and you used a 10% estimate with dbms_stats (estimate... and estimate)
and estimate is - a guesstimate, an approximation, a guess.  Consider:
ops$tkyte%ORA10GR2> create table t as select * from all_objects;
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 10 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
     49600
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 10 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
  NUM_ROWS
----------
     50730
just run dbms_stats and you'll get slightly different answers time after time.  It is normal, expected, anticipated even.  
Do not use analyze to gather statistics on segments; the optimizer is built to use the information dbms_stats gathers, and dbms_stats gathers things differently than analyze does.  dbms_stats has been kept up to date over the years, whereas analyze has not, as far as statistics gathering goes. 
 
 
 
confusing stats
A reader, September 04, 2008 - 5:17 pm UTC
 
 
Sorry, I get it now: it is because they were only estimating the stats; after I do "compute" the number of rows is correct. But why is the average row length from the package different from analyze?
SQL> analyze table mybitmap compute statistics;
Table analyzed.
Elapsed: 00:02:28.66
SQL>  select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
  NUM_ROWS     BLOCKS EMPTY_BLOCKS   PCT_FREE   PCT_USED  CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
   1270064      31156          204         10                     0         172
 
 
September 05, 2008 - 8:03 am UTC 
 
 
because analyze is legacy code that is not maintained; you should only use dbms_stats to gather statistics. 
 
 
DBMS_STATS  in parallel 
A reader, October   02, 2008 - 7:29 am UTC
 
 
Hi Tom,
If stats are being gathered for one partition and there is no index on the table, can Oracle use a degree of parallelism? If yes, what happens in parallel?
Thanks
 
October   02, 2008 - 7:49 am UTC 
 
 
yes, it can use parallel.
The queries that dbms_stats executes can use parallel execution.  That is what would be done in parallel - the sampling queries dbms_stats runs.
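A minimal sketch of such a call, assuming a partitioned table T with a partition named P1 (both hypothetical):
exec dbms_stats.gather_table_stats( user, 'T', partname => 'P1', degree => 4 );
-- the sampling queries dbms_stats issues against that partition can
-- themselves be executed with a parallel degree of 4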
 
 
degree of parallelism
A reader, October   04, 2008 - 9:45 pm UTC
 
 
Hi Tom,
Thanks.
If the table is not defined with a parallel clause, and the degree of parallelism is not specified in the dbms_stats call, does Oracle automatically use a degree based upon spfile parameters?
Regards, 
October   06, 2008 - 2:47 pm UTC 
 
 
you would see what the default is - which varies by version.
check out the documentation for your release of the database. 
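In 10g and later, for example, you can be explicit instead of relying on the default; a sketch (table T hypothetical):
exec dbms_stats.gather_table_stats( user, 'T', degree => dbms_stats.default_degree ); -- use the degree declared on the table
exec dbms_stats.gather_table_stats( user, 'T', degree => dbms_stats.auto_degree );    -- let Oracle choose, based on object size and init parameters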
 
 
In which case do both analyze and dbms_stats do the same job (from the optimizer's perspective)?
Mahesh, October   06, 2008 - 1:34 pm UTC
 
 
Tom,
Will the following do the same thing?
ANALYZE TABLE <tab_name> COMPUTE STATISTICS;
and
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          => <owner>,
   tabname          => <tab_name>,
   estimate_percent => 100 );  --> 100%
I am concerned because one of our applications requires full statistics on certain IOT tables, but a few IOT tables are erroring out (ORA-600) due to bug# 4763768.
So I am thinking of using dbms_stats with estimate_percent => 100.
Will that be the same case as ANALYZE with COMPUTE STATISTICS?
 
 
Query Perf Slow after Analyze...
A reader, October   28, 2008 - 5:26 pm UTC
 
 
Hi Tom,
In our database :
Version: 9.2.0.6
Optimizer mode: FIRST_ROWS
Day #1: dbms_stats was used to gather stats; query "A" takes 10 sec to complete.
Day #2: the dbms_job failed and no stats were gathered; query "A" still takes 10 sec.
....
Day #50: still no stats gathered since day #1; query "A" still takes 10 sec.
Day #51: we used the analyze command (analyze table .... compute statistics) to gather stats.
Day #52: we noticed that query "A", which finished in 10 sec before, now takes 1 hr.
Day #52: added the hint /*+ RULE */ to query "A" and it now takes 12 sec.
Question:
a) If we gather stats today (day #52) using dbms_stats, will the same query finish in ~12 sec?
b) Could the CBO not efficiently use the stats gathered via analyze table .... compute statistics? Your comments please.
 
  
October   28, 2008 - 7:10 pm UTC 
 
 
try it. You should not be using analyze; the optimizer is written to use the values dbms_stats provides, which are different from what analyze will produce.
 
 
 
Query Perf Slow after Analyze...   
A reader, October   28, 2008 - 8:49 pm UTC
 
 
Hi Tom
We have gathered stats using dbms_stats (gather auto) and it has finished gathering the stats.
There are 1000 tables in our schema. All were analyzed using the "analyze" command on day #52 as mentioned above.
Now, after running dbms_stats.gather_schema_stats (gather auto), we found last_analyzed set to today on only 10 tables.
Questions:
a) Would all the necessary stats the optimizer looks for have been gathered?
If not,
b) Do we need to delete statistics and then run dbms_stats again to gather the schema stats?
c) Query "A" still takes 1 hr to complete.
 
October   28, 2008 - 9:47 pm UTC 
 
 
you only gathered on 10 things.  So, you can probably answer (a) yourself.
were any of the objects query A accesses in that set of 10? 
 
 
Query Perf Slow after Analyze...   
A reader, October   28, 2008 - 9:57 pm UTC
 
 
continued.....
d) If I run
exec dbms_stats.delete_schema_stats('SCHEMA_OWNER')
and then run
exec dbms_stats.gather_schema_stats with gather auto on the above schema,
then what stats will the optimizer use after the delete and before the statistics are regathered?
e) Is it safe to do d) above on a live server?
Regards.
 
October   29, 2008 - 7:47 am UTC 
 
 
d) the ones the documentation says would be there - in short, if you delete them all, gather will gather all of them.
and I'm not sure you want to use auto, just gather_schema_stats and let as many things default as possible
and do this IN A TEST INSTANCE, IF YOU DO THIS ON PRODUCTION WITHOUT TESTING - PLEASE DON'T COME BACK HERE.  You never make major changes in a live system. 
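In a test instance, that could look like the following sketch (schema name 'ZZ' taken from the example above):
exec dbms_stats.delete_schema_stats( 'ZZ' );
exec dbms_stats.gather_schema_stats( ownname => 'ZZ' );  -- options defaults to 'GATHER', everything else left at its default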
 
 
Query Perf Slow after Analyze...   
A user, October   28, 2008 - 10:02 pm UTC
 
 
...you only gathered on 10 things. So, you can probably answer (a) yourself. 
Sorry Tom
I did not make things clear earlier.
I ran
exec dbms_stats.gather_schema_stats(ownname=>'ZZ', degree=>4, cascade=>true, options=>'gather auto');
....were any of the objects query A accesses in that set of 10?
NO 
 
October   29, 2008 - 7:48 am UTC 
 
 
you said above "and last analyzed was today for only 10 things" - so apparently, you gathered statistics for 10 things, out of 1000
please stop playing in production right now - I'll not follow up anymore until you tell me "we are doing this in test now" 
 
 
Query Perf Slow after Analyze...   
A reader, October   29, 2008 - 12:34 am UTC
 
 
Hi Tom,
>you only gathered on 10 things. So, you can probably answer (a) yourself. 
You are right. I used the 'gather' option to collect stats on all the tables/indexes.
Now query "A" has stats, gathered via dbms_stats, for all the objects it accesses.
But it still takes long to execute. Does that mean the query has a problem and the stats are correct?
  
 
Query Perf Slow after Analyze...   
A Reader, October   29, 2008 - 2:56 pm UTC
 
 
Sorry Tom,
I understand the point which you made above.
Regards.
 
 
Analyze versus DBMS_STATS
Edson, December  17, 2008 - 8:18 am UTC
 
 
Hello, my name is Edson and I'm from Brazil.
I have been having problems using DBMS_STATS: when I use analyze on 4 tables, my query improves by 20 seconds, but when I use DBMS_STATS it improves by only 13 seconds.
I think I am using DBMS_STATS wrong.
The way I am using it is:
EXECUTE SYS.DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'USER',TABNAME=>'TABLE',GRANULARITY=>'ALL',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1', ESTIMATE_PERCENT=>NULL,CASCADE=>TRUE, DEGREE=>6);
Please Help me. 
December  29, 2008 - 12:28 pm UTC 
 
 
do you actually get different plans?
show us the TKPROFS with wait events included.  
The difference between 13 and 20 could be "not different" for all we know.  Show us the row source operations from a tkprof for the two. 
 
 
Query Performance in 10g
Suzanne, December  31, 2008 - 2:01 pm UTC
 
 
We use Siebel as our CRM and I've just upgraded the Oracle database from 9.2.0.7 to 10.2.0.4.  The database performance has been uneven but generally slower under 10g than 9i.  One of the solutions Siebel asked us to implement was the removal of all of the statistics on any empty table (there are many in our Siebel implementation, but we don't use them, which is why they are empty).  This seems to have improved our performance significantly.  This runs contrary to all I know about Oracle and statistics.  Why did the removal of statistics on empty tables give us a significant performance boost? 
January   05, 2009 - 10:20 am UTC 
 
 
are they really empty?
or - are they empty usually - filled by the application - and then used.
I would say "probably, they are empty when you analyze and filled by the application at run time"
If you never query them - then removing the statistics would do nothing for you.
So, you must query them... And if you query them, they probably have data (else you would always get zero rows...).  
In 9i, you had no statistics on these tables - and if they were not indexed, dynamic sampling would kick in.  Dynamic sampling would have told the optimizer "there are 10,000 rows in here".  We'd get the right plan.
In 10g, the auto job would find these unanalyzed tables and analyze them.  Putting into the dictionary the "fact" that there are 0 rows.  However, at run time, there would actually be 10,000 rows in there.  When we have the wrong 'card=' values - we get the wrong plan.  So, because we think 0 records - we get the wrong plan.
Remove the statistics and in 10g, we'd dynamically sample at hard parse time.  We would discover "hey, there are 10,000 rows".  We'd get the right plan. 
https://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html  
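In 10g you can also make that state stick: delete the statistics and then lock them, so the nightly auto job cannot put the "0 rows" fact back. A minimal sketch (the table name is hypothetical):
exec dbms_stats.delete_table_stats( user, 'S_SOME_SIEBEL_TABLE' );
exec dbms_stats.lock_table_stats( user, 'S_SOME_SIEBEL_TABLE' );
-- with no stats and the stats locked, the auto job skips the table and
-- dynamic sampling supplies the cardinality estimate at hard parse time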
 
 
 
RE:Analyze versus DBMS_STATS
Edson, January   05, 2009 - 7:56 am UTC
 
 
Hi Tom,
My argument that the analyze cost plan is better than the DBMS_STATS one was wrong ... I have checked in my DB and DBMS_STATS produces the same cost plan.
Thank you for your answer.
And I want to tell you that this topic is very rich and has helped me a great deal with my research.
Once more, I am really grateful for your help. 
 
Oramag Article
S, January   17, 2009 - 1:03 pm UTC
 
 
Tom,
In the Oracle Magazine article On Dynamic Sampling, under the title When the Optimizer Guesses, you wrote:
"After gathering statistics, you ask the optimizer to estimate how many people were born in December, and it would almost certainly be able to come up with a very accurate estimate of 1/12 of the data (assuming a normal distribution of birth dates)."
It should probably read: (assuming a uniform distribution of birth dates).
Regards 
January   17, 2009 - 1:28 pm UTC 
 
 
hah, you are correct, I guess if it were "normal", it would have most people in the summer and very few in the winter.
Unless you are in Australia and then it would be the opposite...
thanks and I agree. 
 
 
DBMS_STATS  Problem !!!!
Indranil DAS, March     09, 2009 - 6:10 am UTC
 
 
Hi Tom,
We are not able to find out why the error below occurred:
SQL> BEGIN dbms_stats.GATHER_TABLE_STATS('MEA_REPORTS','RPT_OUTER_CARTON_T', estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE, block_sample=>TRUE, method_opt=>
 'for all indexed columns size AUTO', granularity=>'ALL',cascade=>TRUE, degree=>4); END;
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
ORA-06512: at "SYS.DBMS_STATS", line 10502
ORA-06512: at "SYS.DBMS_STATS", line 10516
ORA-06512: at line 1
Tom, what went wrong here? Please suggest.
Thanks in advance for your help.
Indranil
Oracle DBA,IBM 
 
March     09, 2009 - 12:59 pm UTC 
 
 
please utilize support/metalink for that.
I remember something vaguely relating to a function based index in early releases causing this. 
 
 
Extended Statistics
Arvind Mishra, May       04, 2009 - 8:21 pm UTC
 
 
Hello Tom,
a) Can you please tell me how I can effectively use extended statistics?
b) If I use expression statistics, is a function-based index still useful? When should I use a function-based index and when should I use expression statistics?
c) How can I see which multi-column statistics have been gathered on a table?
d) Is this a new feature in 11g?
Thanks,
ARVIND 
May       09, 2009 - 10:32 am UTC 
 
 
a) by gathering statistics on expressions or by gathering statistics on correlated columns...  In the same manner you would for an INDIVIDUAL column.
b) sure, if you gathered statistics on EMPNO in EMP, but did not have a primary key on EMPNO in EMP and EMP had a lot of records - would 
select * from emp where empno = :x;
run any faster?  Of course not :) you would certainly create an index in order to retrieve that single record quickly. 
Same with an expression!!!
ops$tkyte%ORA11GR1> create table emp
  2  as
  3  select object_id empno,
  4         trunc( dbms_random.value( 1000, 4000 ) ) sal,
  5         trunc( dbms_random.value( 5000, 10000 ) ) comm,
  6             decode( mod(rownum,2), 0, 'N', 'Y' ) flag1,
  7             decode( mod(rownum,2), 0, 'Y', 'N' ) flag2
  8    from stage
  9  /
Table created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> create index flag_idx on emp(flag1,flag2);
Index created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
<b>Now, without any extended statistics:</b>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> begin
  2          for x in ( select /*+ gather_plan_statistics */  * from  emp where flag1 = 'N' )
  3          loop
  4                  null;
  5          end loop;
  6          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
  7          loop
  8                  dbms_output.put_line( '.'||x.plan_table_output );
  9          end loop;
 10          for x in ( select /*+ gather_plan_statistics */  * from  emp where flag2 = 'N' )
 11          loop
 12                  null;
 13          end loop;
 14          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
 15          loop
 16                  dbms_output.put_line( '.'||x.plan_table_output );
 17          end loop;
 18          for x in ( select /*+ gather_plan_statistics */  * from  emp where flag1 = 'N' and flag2 = 'N' )
 19          loop
 20                  null;
 21          end loop;
 22          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
 23          loop
 24                  dbms_output.put_line( '.'||x.plan_table_output );
 25          end loop;
 26  end;
 27  /
.SQL_ID  6ukxu8bf6rrbc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG1 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT  |      |      1 |        |  34232 |00:00:00.01 |     552 |
.|*  1 |  TABLE ACCESS FULL| EMP  |      1 |  34232 |  34232 |00:00:00.01 |     552 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   1 - filter("FLAG1"='N')
.
.SQL_ID  1hwwuw5p2v7s8, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG2 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT  |      |      1 |        |  34232 |00:00:00.01 |     552 |
.|*  1 |  TABLE ACCESS FULL| EMP  |      1 |  34232 |  34232 |00:00:00.01 |     552 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   1 - filter("FLAG2"='N')
.
.SQL_ID  00s1uatbzv2j2, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG1 = 'N' AND
.FLAG2 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT  |      |      1 |        |      0 |00:00:00.01 |     211 |
.|*  1 |  TABLE ACCESS FULL| EMP  |      1 |  17116 |      0 |00:00:00.01 |     211 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   1 - filter(("FLAG1"='N' AND "FLAG2"='N'))
.
PL/SQL procedure successfully completed.
<b>it gets the right estimated cardinalities for flag1=N, for flag2=N, but totally messes up flag1=N and flag2=N.  Add that pair:</b>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec dbms_output.put_line( dbms_stats.create_extended_stats( user, 'EMP', '(FLAG1,FLAG2)' ) );
SYS_STUGVNB7PTIYWAVPJX#YT77WGD
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> begin
  2          for x in ( select /*+ gather_plan_statistics */  * from  emp EMP2 where flag1 = 'N' and flag2 = 'N' )
  3          loop
  4                  null;
  5          end loop;
  6          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
  7          loop
  8                  dbms_output.put_line( '.'||x.plan_table_output );
  9          end loop;
 10  end;
 11  /
.SQL_ID  769uyry623wf9, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP EMP2 WHERE FLAG1 = 'N'
.AND FLAG2 = 'N'
.
.Plan hash value: 3400700027
.
.--------------------------------------------------------------------------------------------------
.| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.--------------------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT            |          |      1 |        |      0 |00:00:00.01 |       2 |
.|   1 |  TABLE ACCESS BY INDEX ROWID| EMP      |      1 |      1 |      0 |00:00:00.01 |       2 |
.|*  2 |   INDEX RANGE SCAN          | FLAG_IDX |      1 |      1 |      0 |00:00:00.01 |       2 |
.--------------------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   2 - access("FLAG1"='N' AND "FLAG2"='N')
.
PL/SQL procedure successfully completed.
<b> cardinality is right, plan changes - all good
Now, do an expression:</b>
ops$tkyte%ORA11GR1> begin
  2          for x in ( select /*+ gather_plan_statistics */  * from  EMP where ln(sal+comm) > 9 )
  3          loop
  4                  null;
  5          end loop;
  6          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
  7          loop
  8                  dbms_output.put_line( '.'||x.plan_table_output );
  9          end loop;
 10  end;
 11  /
.SQL_ID  f03rzhdyzy1gc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE LN(SAL+COMM) > 9
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT  |      |      1 |        |  58463 |00:00:08.42 |     794 |
.|*  1 |  TABLE ACCESS FULL| EMP  |      1 |   3423 |  58463 |00:00:08.42 |     794 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   1 - filter(LN("SAL"+"COMM")>9)
.
PL/SQL procedure successfully completed.
<b>well, we can see that is wrong (the estimated row count) </b>
ops$tkyte%ORA11GR1> alter table EMP add (comp as (ln(sal+comm)));
Table altered.
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> begin
  2          for x in ( select /*+ gather_plan_statistics */  * from  EMP where ln(sal+comm) > 9 )
  3          loop
  4                  null;
  5          end loop;
  6          for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
  7          loop
  8                  dbms_output.put_line( '.'||x.plan_table_output );
  9          end loop;
 10  end;
 11  /
.SQL_ID  f03rzhdyzy1gc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE LN(SAL+COMM) > 9
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
.------------------------------------------------------------------------------------
.|   0 | SELECT STATEMENT  |      |      1 |        |  58463 |00:00:05.61 |     794 |
.|*  1 |  TABLE ACCESS FULL| EMP  |      1 |  58481 |  58463 |00:00:05.61 |     794 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
.   1 - filter(LN("SAL"+"COMM")>9)
.
PL/SQL procedure successfully completed.
<b>card = is now correct...</b>
c) query the dictionary:
ops$tkyte%ORA11GR1> select table_name, column_name from user_tab_col_statistics where table_name = 'EMP';
TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------
EMP                            EMPNO
EMP                            SAL
EMP                            COMM
EMP                            FLAG1
EMP                            FLAG2
EMP                            SYS_STUGVNB7PTIYWAVPJX#YT77WGD
EMP                            COMP
7 rows selected.
that sys_fadfadafds thing was dbms_output'ed above - that is flag1,flag2 - our virtual column looks like a normal one - COMP.
ops$tkyte%ORA11GR1> SELECT extension_name, extension
  2  FROM   dba_stat_extensions
  3  WHERE  table_name = 'EMP';
EXTENSION_NAME
------------------------------
EXTENSION
-------------------------------------------------------------------------------
SYS_STUGVNB7PTIYWAVPJX#YT77WGD
("FLAG1","FLAG2")
COMP
("SAL"+"COMM")
d) yes 
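Should you ever need to remove an extension, there is a drop call as well; a sketch against the (FLAG1,FLAG2) example above:
exec dbms_stats.drop_extended_stats( user, 'EMP', '(FLAG1,FLAG2)' );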
 
 
 
CHAIN_CNT affecting query plan ... ?
Gary, August    19, 2009 - 11:23 am UTC
 
 
Tom, 
You said in an earlier post in this thread (December 16, 2005) that "dbms_stats won't gather things not used by the CBO such as chain_cnt/avg_space in particular." We ran into an odd situation recently where removing the chain_cnt statistic seemed to result in an improved plan.
The job stream was running ANALYZE followed by DBMS_STATS, since that was the only way the development team knew to get a count of chained rows.
We decided on a simplified test case, since the developer had been rebuilding the table with a higher pctfree to eliminate the chained rows; here is what I tried:
Use DBMS_STATS.EXPORT_TABLE_STATS to preserve existing stats on two tables used by the problem query.
ANALYZE TABLE <name> DELETE STATISTICS ; -- for both tables
Use IMPORT_TABLE_STATS to put the exported statistics back into place
This should have had the net effect of removing only the stats gathered by ANALYZE TABLE, including CHAIN_CNT and AVG_SPACE. The plan changed very much for the better after this procedure was done. The old plan used index access to hit the two tables; the new plan full scanned them then used them in hash joins. The tables were only 2 MB and 14 MB in size, so the hash join approach seems like a no brainer chained rows or no. 
The developer seemed pretty sure before this that CHAIN_CNT was affecting the query plan ... now I'm not sure what to tell him. (Maybe some other stat gathered by ANALYZE TABLE is to blame?)
Thanks, 
Gary 
August    24, 2009 - 5:00 pm UTC 
 
 
did you export column and index statistics as well?
else you deleted A LOT of stuff. 
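For reference, a sketch of the full round trip; note that export_table_stats and import_table_stats take a cascade parameter (TRUE by default) that carries the index and column statistics along. STATS_BACKUP is a hypothetical stat table name:
exec dbms_stats.create_stat_table( user, 'STATS_BACKUP' );
exec dbms_stats.export_table_stats( user, 'T', stattab => 'STATS_BACKUP', cascade => true );
analyze table t delete statistics;
exec dbms_stats.import_table_stats( user, 'T', stattab => 'STATS_BACKUP', cascade => true );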
 
 
stats
sam, August    27, 2009 - 12:43 pm UTC
 
 
Tom:
I want to set up a nightly job to update stats in a database.
What is the standard statement you would use? Is it
dbms_stats.gather_schema_stats('MYAPP');
I noticed this did not update the stats in USER_INDEXES and the column statistics views. 
August    28, 2009 - 4:57 pm UTC 
 
 
Sam/SMK
you know my answer is going to be
it depends.
You noticed something incorrectly.  It does.
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(x);
Index created.
ops$tkyte%ORA10GR2> insert into t select rownum from all_users;
42 rows created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows from user_tables where table_name = 'T'
  2  union all
  3  select 'i', num_rows from user_indexes where index_name = 'T_IDX'
  4  union all
  5  select 'c', count(*) from user_tab_histograms where table_name = 'T';
'   NUM_ROWS
- ----------
t
i
c          0
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_schema_stats( user );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows from user_tables where table_name = 'T'
  2  union all
  3  select 'i', num_rows from user_indexes where index_name = 'T_IDX'
  4  union all
  5  select 'c', count(*) from user_tab_histograms where table_name = 'T';
'   NUM_ROWS
- ----------
t         42
i         42
c          2
how cool is that, 42 :) that was a pleasant surprise. 
 
 
 
stats
A reader, August    29, 2009 - 1:51 pm UTC
 
 
Tom:
1. You are looking at the num_rows column.
I was looking at the LAST_ANALYZED date stamp in USER_INDEXES.
The LAST_ANALYZED in USER_TABLES gets updated correctly to the date you run the statement, but not in USER_INDEXES.
2. On a second note, I was reading a book that says collecting stats every day may not be helpful, depending on the data.
The Oracle optimizer will generate a new query plan every time you collect stats, so it will not reuse old plans.
Is this correct, and do you advise scheduling it on a weekly basis? 
August    29, 2009 - 7:27 pm UTC 
 
 
1) sam, perhaps you should have looked again?
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(x);
Index created.
ops$tkyte%ORA10GR2> insert into t select rownum from all_users;
42 rows created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2> select 't', num_rows, last_analyzed from user_tables where table_name = 'T'
  2  union all
  3  select 'i', num_rows, last_analyzed from user_indexes where index_name = 'T_IDX'
  4  union all
  5  select 'c', count(*), to_date( null) from user_tab_histograms where table_name = 'T';
'   NUM_ROWS LAST_ANAL
- ---------- ---------
t
i
c          0
ops$tkyte%ORA10GR2> exec dbms_stats.gather_schema_stats( user );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows, last_analyzed from user_tables where table_name = 'T'
  2  union all
  3  select 'i', num_rows, last_analyzed from user_indexes where index_name = 'T_IDX'
  4  union all
  5  select 'c', count(*), to_date( null) from user_tab_histograms where table_name = 'T';
'   NUM_ROWS LAST_ANAL
- ---------- ---------
t         42 29-AUG-09
i         42 29-AUG-09
c          2
2) .. The Oracle optimizer will generate new query plan every time you collect stats 
so it will not reuse old plans.
 ..
sam - you do understand that the major goal of collecting statistics is....
to cause plans to change because the size and shape of the data has changed.
so, (this is dead serious, I'm not sarcastic here at all), so do you want your plans to change like that - or not.
Only you can answer the question "how often should I analyze"
you know your data
you know how your data changes
you know how your applications use your data
you know this, if you don't, stop - and get that information 
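If the answer turns out to be "only re-gather what actually changed", one common middle ground is to gather only stale objects; a sketch, schema name from the question above:
exec dbms_stats.gather_schema_stats( ownname => 'MYAPP', options => 'GATHER STALE' );
-- only objects whose monitored modifications exceed roughly 10% since the
-- last gather are re-analyzed; in 10g, table monitoring is on by default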
 
 
 
index creation on table with LOCKED stats
psha, November  04, 2009 - 1:35 pm UTC
 
 
Hi Tom,
We are using Oracle 10g R2. We have one table with LOCKED stats. Now if I create an index on it, the index is not analyzed. We are planning to migrate to 11g; is there any way in 11g to lock table stats such that whenever we create an index, the index gets analyzed as soon as it is created?
Thanks
Reader 
November  11, 2009 - 9:31 am UTC 
 
 
no, when you lock, you lock.
you would unlock in order to gather, so just unlock, create index, lock
ops$tkyte%ORA11GR2> create table t
  2  as
  3  select * from all_objects;
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create index t_idx on t(object_name);
Index created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec dbms_stats.lock_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select 't', table_name, num_rows, last_analyzed from user_tables where table_name = 'T'
  2  union all
  3  select 'i', index_name, num_rows, last_analyzed from user_indexes where table_name = 'T';
' TABLE_NAME                       NUM_ROWS LAST_ANALYZED
- ------------------------------ ---------- --------------------
t T                                   71588 11-nov-2009 10:30:13
i T_IDX                               71588 11-nov-2009 10:30:13
ops$tkyte%ORA11GR2><b>
ops$tkyte%ORA11GR2> exec dbms_stats.unlock_table_stats( user, 'T' );
</b>
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> create index t_idx2 on t(owner,object_type);
Index created.
<b>
ops$tkyte%ORA11GR2> exec dbms_stats.lock_table_stats( user, 'T' );
</b>
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select 't', table_name, num_rows, last_analyzed from user_tables where table_name = 'T'
  2  union all
  3  select 'i', index_name, num_rows, last_analyzed from user_indexes where table_name = 'T';
' TABLE_NAME                       NUM_ROWS LAST_ANALYZED
- ------------------------------ ---------- --------------------
t T                                   71588 11-nov-2009 10:30:13
i T_IDX2                              71588 11-nov-2009 10:30:14
i T_IDX                               71588 11-nov-2009 10:30:13
 
 
 
 
A reader, January   24, 2010 - 3:43 pm UTC
 
 
After I issue "alter table move" or "index rebuild", do I need to analyze tables or use dbms_stats to gather statistics?
From your book I understood that column-level histograms and avg_row_len statistics are lost?
January   26, 2010 - 1:48 am UTC 
 
 
from *my* book? where did you read that - so I can fix it.
ops$tkyte%ORA11GR1> create table t as select * from all_users;
Table created.
ops$tkyte%ORA11GR1> create index t_idx on t(username);
Index created.
ops$tkyte%ORA11GR1> begin
  2  dbms_stats.gather_table_stats
  3  ( user, 'T', method_opt => 'for all indexed columns size 254' );
  4  end;
  5  /
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select avg_row_len from user_tables where table_name = 'T';
AVG_ROW_LEN
-----------
         20
ops$tkyte%ORA11GR1> select count(*) from user_tab_histograms where table_name = 'T';
  COUNT(*)
----------
        38
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> alter table t move;
Table altered.
ops$tkyte%ORA11GR1> alter index t_idx rebuild;
Index altered.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select avg_row_len from user_tables where table_name = 'T';
AVG_ROW_LEN
-----------
         20
ops$tkyte%ORA11GR1> select count(*) from user_tab_histograms where table_name = 'T';
  COUNT(*)
----------
        38
Now, that aside, you may well have to gather statistics after a reorganization since major metrics can and will change - like number of blocks for example (just for example, the others might be affected as well) 
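A sketch of what that regather could look like after such a reorganization (cascade picks up the index statistics as well):
alter table t move;
alter index t_idx rebuild;
exec dbms_stats.gather_table_stats( user, 'T', cascade => true );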
 
 
 
A reader, February  20, 2010 - 11:31 pm UTC
 
 
Hi
Suppose the statistics of two databases are exactly the same and the database version is also the same, but there is a big difference in the amount of data.
Will the response time of the SQL queries be exactly the same?
 
March     01, 2010 - 5:36 am UTC 
 
 
... Will the response time of the SQL queries be exactly the same? ...
of course not.  Use a little critical thinking and your knowledge of "physics" and you should be able to come up with an example.  Here is a simple one:
select * from t;
In both databases, statistics say "there are 1,000 records".  Everything is the same between the two - with the exception that in database 1 there are 1,000 records and in database 2 there are 1,000,000,000 records.
Will they perform the same?
Even with the same volumes of data - even with exactly the SAME NUMBER OF ROWS - even with precisely the same exact bits and bytes, you would not, should not, expect the performance to be the same. 
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743#737753700346729628
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t1;
ops$tkyte%ORA10GR2> drop table t2;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1 as select * from all_objects where 1=0 order by object_id;
ops$tkyte%ORA10GR2> create table t2 as select * from all_objects where 1=0 order by dbms_random.random;
ops$tkyte%ORA10GR2> create index t1_idx on t1(object_id);
ops$tkyte%ORA10GR2> create index t2_idx on t2(object_id);
ops$tkyte%ORA10GR2> insert into t1 select * from all_objects order by object_id;
ops$tkyte%ORA10GR2> insert into t2 select * from all_objects order by dbms_random.random;
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 70000, numblks => 1000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 70000, numblks => 1000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_index_stats( user, 'T1_IDX', numrows => 70000, numlblks => 100, numdist=>70000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_index_stats( user, 'T2_IDX', numrows => 70000, numlblks => 100, numdist=>70000 );
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set linesize 1000
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2> select * from t1 where object_id between 1 and 500;
456 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 546753835
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |   175 | 17500 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |   175 | 17500 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IDX |   315 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID">=1 AND "OBJECT_ID"<=500)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         70  consistent gets
          0  physical reads
          0  redo size
      19275  bytes sent via SQL*Net to client
        730  bytes received via SQL*Net from client
         32  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        456  rows processed
ops$tkyte%ORA10GR2> select * from t2 where object_id between 1 and 500;
456 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4244861920
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |   175 | 17500 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2     |   175 | 17500 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_IDX |   315 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID">=1 AND "OBJECT_ID"<=500)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        490  consistent gets
          0  physical reads
          0  redo size
      19275  bytes sent via SQL*Net to client
        730  bytes received via SQL*Net from client
         32  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        456  rows processed
ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2>
see, same exact data, exactly same data, same stats, exactly the same, same plans, same everything
except the amount of work performed - since the organization of data on disk is different. 
 
 
 
A reader, March     12, 2010 - 8:19 pm UTC
 
 
Mmmm.
Thanks sir.
Suppose a UAT database is refreshed from production.
Is there any benefit in transferring statistics from production to UAT? 
March     15, 2010 - 9:51 am UTC 
 
 
depends on how you refresh.
if you refresh by dump and load - just remember - UAT and production are only as alike as your image in a funhouse mirror is like you.  You should gather stats on UAT, as the order of rows will be different, sizes of things will be different - everything will be different except for the VALUES of the data.  Do expect different plans.  You could likely get the same plans if you move the stats over - however - you have no idea if that plan is good, bad, or indifferent - since the data is organized differently.
if you refresh by restoring a backup, you are done, just use it.   
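If you do decide to move the stats over after a dump and load refresh, the usual sketch is a stat table plus export/import (names hypothetical):
-- on production
exec dbms_stats.create_stat_table( user, 'PROD_STATS' );
exec dbms_stats.export_schema_stats( 'MYAPP', stattab => 'PROD_STATS' );
-- move the PROD_STATS table to UAT with exp/imp or data pump, then on UAT:
exec dbms_stats.import_schema_stats( 'MYAPP', stattab => 'PROD_STATS' );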
 
 
analyze
A reader, July      08, 2010 - 2:28 am UTC
 
 
Greetings Thomas,
and thanks as always.
exec dbms_stats.gather_schema_stats('A');
exec dbms_stats.gather_table_stats('A','ACTIONS',estimate_percent => 100, cascade => true);
The problem is that when Oracle accesses the table (ACTIONS), it issues an analyze (I traced the session and found "analyze table A.ACTIONS compute statistics" in the trace file).
table_name: A.ACTIONS
number of rows: 0
number of columns: 148
number of indexes: 113
I think it is related to the number of indexes, because when I dropped them it worked fine. But my question is: why is that happening?
July      08, 2010 - 12:22 pm UTC 
 
 
you think what is related to what???
what version are you on.
 
 
 
analyze
A reader, July      15, 2010 - 4:57 pm UTC
 
 
Greetings Thomas,
and thanks as always.
Regarding the above post:
-----you think what is related to what???
Lots of columns and lots of indexes in one table.
But that is a guess, and that is why I am asking you.
------what version are you on. 
10.2.0.1 
July      19, 2010 - 1:10 pm UTC 
 
 
1,000 columns, 1,000 indexes, no analyze
I'd need a test case to reproduce with
drop table t;
declare
        l_create long := 'create table t ( a int primary key';
begin
        for i in 1 .. 999
        loop
                l_create := l_create || ', c' || i || ' int';
        end loop;
        execute immediate l_create || ')';
        for i in 1 .. 999
        loop
                execute immediate 'create index t_idx' || i || ' on t(c' || i || ')';
        end loop;
end;
/
alter session set sql_trace=true;
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 100, cascade=> true );
I did not see analyze in a trace file anywhere. 
 
 
 
_optimizer_invalidation_period
Ravi B, September 14, 2010 - 1:56 pm UTC
 
 
Hi Tom,
I read somewhere that the hidden parameter "_optimizer_invalidation_period" influences the NO_INVALIDATE setting in DBMS_STATS. By default it is 18000 seconds (5 hrs). What does it mean? Does the (in)validation not occur for 5 hrs after it is set? I don't see any documentation on the Oracle support site either.
Thanks,
Ravi 
September 15, 2010 - 7:36 am UTC 
 
 
that is because it is undocumented, I don't discuss things like that really, you don't need to set them - except in response to support telling you to do so.  Their meaning can, will, has changed from dot release to dot release - it would be unwise to play with them.   
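What you can control, documented, is the no_invalidate parameter itself; a sketch (table T hypothetical):
exec dbms_stats.gather_table_stats( user, 'T', no_invalidate => false );
-- forces dependent cursors to be invalidated immediately after the gather;
-- the 10g default, dbms_stats.auto_invalidate, spreads the invalidations
-- out over a window of time instead of all at once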
 
 
Oracle 11g : Enterprise vs Standard Editions
Cj, February  04, 2011 - 2:26 am UTC
 
 
Hello,
I have a few questions regarding Oracle 11g Standard Edition and Enterprise Edition:
1. Where can I find a detailed document on the differences between the Enterprise and Standard Editions of Oracle 11g?
2. In Standard Edition, is it possible to use the packages DBMS_STATS and DBMS_UTILITY?
Thanks in advance 
February  04, 2011 - 9:32 am UTC 
 
 
see
http://www.oracle.com/pls/db112/homepage
Every release has a License guide that details what comes with each edition.
yes, dbms_stats and dbms_utility are available with SE 
 
 
 
Importing Statistics
Shravani, February  15, 2011 - 1:22 pm UTC
 
 
Hi Tom,
In our application, we have 2 schemas (A and B) with exactly the same tables and data. During the data load process we connect users to the B copy. We gather stats in the A schema using DBMS_STATS and then import the stats into the B schema.
However, we are seeing execution plans and timings that are totally different between the A and B schemas. A query executed in the A copy returns results quickly compared to the same query in B.
Can you please let me know what would make this happen, and what areas to look at?
Regards
Shravani. 
February  15, 2011 - 4:54 pm UTC 
 
 
show us a tkprof and a 'plan' (with estimated cardinalities - like from autotrace) of both.
Also - show us the method you use to copy the statistics. 
 
 
Import Statistics
Shravani, February  16, 2011 - 11:51 am UTC
 
 
Thanks for your reply Tom.
Here is the query we are executing in the A and B copies:
SELECT DISTINCT product_matrix.patient_case_key
FROM    date_time_dim init_rcvd_date_condition,
 dm_load_dt dm_load_dt_condition,
 product_matrix,
 product_dim product_dim_suspect
WHERE (    product_matrix.suspect_product_key =
         product_dim_suspect.product_key
 AND product_dim_suspect.role_indic = 'SUSPECT')
    AND (product_matrix.init_recv_dt_key =
        init_rcvd_date_condition.year_month_day)
    AND ((    dm_load_dt_condition.increment_dt >=
product_matrix.effective_dt AND dm_load_dt_condition.increment_dt < product_matrix.end_dt))
    AND (    (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') =     dm_load_dt_condition.reporting_dt)
  AND dm_load_dt_condition.active = 'Y'
        )
    AND (    (   product_dim_suspect.generic_nm IN
 ('CELECOXIB')
       OR 'ALL' IN ('CELECOXIB')
      )
  AND (dm_load_dt_condition.reporting_dt >=
    SYS_CONTEXT ('pfast', 'mbss_release')
      )
  AND ((init_rcvd_date_condition.calendar_dt) >=
   TO_DATE ('01-JAN-2010', 'DD-MON-YYYY')
      )
  AND ((init_rcvd_date_condition.calendar_dt) <=
   TO_DATE ('25-SEP-2010', 'DD-MON-YYYY')
      )
        );
And here is the TKPROF for A
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.05          0         23          0           0
Execute      1      0.00       0.02          0        128          0           0
Fetch      210      0.00       5.41          0         36          0        3127
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      212      0.01       5.49          0        187          0        3127
Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 73  (FSTADM)
Rows     Row Source Operation
-------  ---------------------------------------------------
   3127  PX COORDINATOR  (cr=164 pr=0 pw=0 time=1302 us)
      0   PX SEND QC (RANDOM) :TQ10002 (cr=0 pr=0 pw=0 time=0 us cost=171 size=106 card=1)
      0    HASH UNIQUE (cr=0 pr=0 pw=0 time=0 us cost=171 size=106 card=1)
      0     PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0      PX SEND HASH :TQ10001 (cr=0 pr=0 pw=0 time=0 us)
      0       FILTER  (cr=0 pr=0 pw=0 time=0 us)
      0        NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us)
      0         NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us cost=170 size=106 card=1)
      0          NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us cost=170 size=276 card=3)
      0           BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
      0            PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0             PX SEND BROADCAST :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
     31              MERGE JOIN CARTESIAN (cr=36 pr=0 pw=0 time=2280 us cost=10 size=58 card=1)
     31               TABLE ACCESS BY GLOBAL INDEX ROWID PRODUCT_DIM PARTITION: ROW LOCATION ROW LOCATION (cr=33 pr=0 pw=0 time=2100 us cost=8 size=39 card=1)
     39                INDEX RANGE SCAN UX_PD_GNM_ROLE_PK (cr=3 pr=0 pw=0 time=0 us cost=2 size=0 card=12)(object id 8689572)
     31               BUFFER SORT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
      1                TABLE ACCESS BY INDEX ROWID DM_LOAD_DT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
      1                 INDEX RANGE SCAN IX_RPT_DATE_DM_LOAD_DT (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 8691296)
      0           PX PARTITION HASH ALL PARTITION: 1 4 (cr=0 pr=0 pw=0 time=0 us cost=170 size=170 card=5)
      0            TABLE ACCESS BY LOCAL INDEX ROWID PRODUCT_MATRIX PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us cost=170 size=170 card=5)
      0             BITMAP CONVERSION TO ROWIDS (cr=0 pr=0 pw=0 time=0 us)
      0              BITMAP AND  (cr=0 pr=0 pw=0 time=0 us)
      0               BITMAP INDEX SINGLE VALUE BMAK_MATRIX_SPPRODKEY PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8695173)
      0               BITMAP MERGE  (cr=0 pr=0 pw=0 time=0 us)
      0                BITMAP INDEX RANGE SCAN BMAK_MATRIX_EFFDT PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8694549)
      0          INDEX UNIQUE SCAN PK_DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 8689529)
      0         TABLE ACCESS BY INDEX ROWID DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=14 card=1)
unable to set optimizer goal
ORA-01986: OPTIMIZER_GOAL is obsolete
parse error offset: 33
Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: ALL_ROWS
   3127   PX COORDINATOR
      0    PX SEND (QC (RANDOM)) OF ':TQ10002' [:Q1002]
      0     HASH (UNIQUE) [:Q1002]
      0      PX RECEIVE [:Q1002]
      0       PX SEND (HASH) OF ':TQ10001' [:Q1001]
      0        FILTER [:Q1001]
      0         NESTED LOOPS [:Q1001]
      0          NESTED LOOPS [:Q1001]
      0           NESTED LOOPS [:Q1001]
      0            BUFFER (SORT) [:Q1001]
      0             PX RECEIVE [:Q1001]
      0              PX SEND (BROADCAST) OF ':TQ10000'
     31               MERGE JOIN (CARTESIAN)
     31                TABLE ACCESS   GOAL: ANALYZED (BY 
                         GLOBAL INDEX ROWID) OF 'PRODUCT_DIM' (TABLE) 
                           PARTITION:ROW LOCATION
     39                 INDEX   GOAL: ANALYZED (RANGE SCAN) 
                            OF 'UX_PD_GNM_ROLE_PK' (INDEX (UNIQUE))
     31                BUFFER (SORT)
      1                 TABLE ACCESS   GOAL: ANALYZED (BY 
                            INDEX ROWID) OF 'DM_LOAD_DT' (TABLE)
      1                  INDEX   GOAL: ANALYZED (RANGE SCAN)
                              OF 'IX_RPT_DATE_DM_LOAD_DT' (INDEX)
      0            PX PARTITION HASH (ALL) [:Q1001] PARTITION: 
                       START=1 STOP=4
      0             TABLE ACCESS   GOAL: ANALYZED (BY LOCAL 
                      INDEX ROWID) OF 'PRODUCT_MATRIX' (TABLE) [:Q1001] 
                        PARTITION:KEY STOP=88
      0              BITMAP CONVERSION (TO ROWIDS) [:Q1001]
      0               BITMAP AND [:Q1001]
      0                BITMAP INDEX (SINGLE VALUE) OF 
                         'BMAK_MATRIX_SPPRODKEY' (INDEX (BITMAP)) [:Q1001] 
                           PARTITION:KEY STOP=88
      0                BITMAP MERGE [:Q1001]
      0                 BITMAP INDEX (RANGE SCAN) OF 
                          'BMAK_MATRIX_EFFDT' (INDEX (BITMAP)) [:Q1001] 
                            PARTITION:KEY STOP=88
      0           INDEX   GOAL: ANALYZED (UNIQUE SCAN) OF 
                      'PK_DATE_TIME_DIM' (INDEX (UNIQUE)) [:Q1001]
      0          TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF 
                     'DATE_TIME_DIM' (TABLE) [:Q1001]
=========================================
Here is TKProf for B
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.04          0         26          0           0
Execute      1      0.00       0.02          0        100          0           0
Fetch      210      0.00      69.29          0         36          0        3127
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      212      0.02      69.36          0        162          0        3127
Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 76  (FSTBDM)
Rows     Row Source Operation
-------  ---------------------------------------------------
   3127  PX COORDINATOR  (cr=136 pr=0 pw=0 time=1693 us)
      0   PX SEND QC (RANDOM) :TQ10002 (cr=0 pr=0 pw=0 time=0 us cost=517 size=114 card=1)
      0    HASH UNIQUE (cr=0 pr=0 pw=0 time=0 us cost=517 size=114 card=1)
      0     PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0      PX SEND HASH :TQ10001 (cr=0 pr=0 pw=0 time=0 us)
      0       FILTER  (cr=0 pr=0 pw=0 time=0 us)
      0        NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us)
      0         NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us cost=516 size=114 card=1)
      0          NESTED LOOPS  (cr=0 pr=0 pw=0 time=0 us cost=516 size=368 card=4)
      0           BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
      0            PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0             PX SEND BROADCAST :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
     31              MERGE JOIN CARTESIAN (cr=36 pr=0 pw=0 time=780 us cost=99 size=58 card=1)
     31               TABLE ACCESS BY GLOBAL INDEX ROWID PRODUCT_DIM PARTITION: ROW LOCATION ROW LOCATION (cr=33 pr=0 pw=0 time=600 us cost=97 size=39 card=1)
     39                INDEX RANGE SCAN UX_PD_GNM_ROLE_PK (cr=3 pr=0 pw=0 time=0 us cost=2 size=0 card=189)(object id 8743581)
     31               BUFFER SORT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
      1                TABLE ACCESS BY INDEX ROWID DM_LOAD_DT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
      1                 INDEX RANGE SCAN IX_RPT_DATE_DM_LOAD_DT (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 8743601)
      0           PX PARTITION HASH ALL PARTITION: 1 4 (cr=0 pr=0 pw=0 time=0 us cost=516 size=238 card=7)
      0            TABLE ACCESS BY LOCAL INDEX ROWID PRODUCT_MATRIX PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us cost=516 size=238 card=7)
      0             BITMAP CONVERSION TO ROWIDS (cr=0 pr=0 pw=0 time=0 us)
      0              BITMAP AND  (cr=0 pr=0 pw=0 time=0 us)
      0               BITMAP INDEX SINGLE VALUE BMAK_MATRIX_SPPRODKEY PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8748879)
      0               BITMAP MERGE  (cr=0 pr=0 pw=0 time=0 us)
      0                BITMAP INDEX RANGE SCAN BMAK_MATRIX_EFFDT PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8748990)
      0          INDEX UNIQUE SCAN PK_DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 8743544)
      0         TABLE ACCESS BY INDEX ROWID DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=22 card=1)
unable to set optimizer goal
ORA-01986: OPTIMIZER_GOAL is obsolete
parse error offset: 33
Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: ALL_ROWS
   3127   PX COORDINATOR
      0    PX SEND (QC (RANDOM)) OF ':TQ10002' [:Q1002]
      0     HASH (UNIQUE) [:Q1002]
      0      PX RECEIVE [:Q1002]
      0       PX SEND (HASH) OF ':TQ10001' [:Q1001]
      0        FILTER [:Q1001]
      0         NESTED LOOPS [:Q1001]
      0          NESTED LOOPS [:Q1001]
      0           NESTED LOOPS [:Q1001]
      0            BUFFER (SORT) [:Q1001]
      0             PX RECEIVE [:Q1001]
      0              PX SEND (BROADCAST) OF ':TQ10000'
     31               MERGE JOIN (CARTESIAN)
     31                TABLE ACCESS   GOAL: ANALYZED (BY 
                         GLOBAL INDEX ROWID) OF 'PRODUCT_DIM' (TABLE) 
                           PARTITION:ROW LOCATION
     39                 INDEX   GOAL: ANALYZED (RANGE SCAN) 
                            OF 'UX_PD_GNM_ROLE_PK' (INDEX (UNIQUE))
     31                BUFFER (SORT)
      1                 TABLE ACCESS   GOAL: ANALYZED (BY 
                            INDEX ROWID) OF 'DM_LOAD_DT' (TABLE)
      1                  INDEX   GOAL: ANALYZED (RANGE SCAN)
                              OF 'IX_RPT_DATE_DM_LOAD_DT' (INDEX)
      0            PX PARTITION HASH (ALL) [:Q1001] PARTITION: 
                       START=1 STOP=4
      0             TABLE ACCESS   GOAL: ANALYZED (BY LOCAL 
                      INDEX ROWID) OF 'PRODUCT_MATRIX' (TABLE) [:Q1001] 
                        PARTITION:KEY STOP=88
      0              BITMAP CONVERSION (TO ROWIDS) [:Q1001]
      0               BITMAP AND [:Q1001]
      0                BITMAP INDEX (SINGLE VALUE) OF 
                         'BMAK_MATRIX_SPPRODKEY' (INDEX (BITMAP)) [:Q1001] 
                           PARTITION:KEY STOP=88
      0                BITMAP MERGE [:Q1001]
      0                 BITMAP INDEX (RANGE SCAN) OF 
                          'BMAK_MATRIX_EFFDT' (INDEX (BITMAP)) [:Q1001] 
                            PARTITION:KEY STOP=88
      0           INDEX   GOAL: ANALYZED (UNIQUE SCAN) OF 
                      'PK_DATE_TIME_DIM' (INDEX (UNIQUE)) [:Q1001]
      0          TABLE ACCESS (BY INDEX ROWID) OF 'DATE_TIME_DIM' 
                     (TABLE) [:Q1001]
=========================================================
We are using "dbms_stats.export_schema_stats" to export the stats from the A copy, and then "dbms_stats.import_table_stats" to import the stats into the B copy.
Thanks Again
Regards
Shravani
 
February  16, 2011 - 2:11 pm UTC 
 
 
same plans, same row source operations, same tim= values (similar enough to be the same)...
are you sure in real life the queries take this long - if you run them in sqlplus - do they run really fast in one schema and really slow (repeatedly) in the other?
This looks more like a strange tkprof issue - the TIM= information isn't matching the elapsed column at all. 
 
 
Importing Stats
Shravani, February  16, 2011 - 3:35 pm UTC
 
 
Yes Tom, please believe me, the query execution is faster in the A copy where we gather the statistics. Can you please let us know where to look to find the root cause of this discrepancy? 
February  16, 2011 - 5:27 pm UTC 
 
 
can you show me a sqlplus cut/paste with timing on (run with set autotrace traceonly statistics too - we don't need to see the data, just the query, the row counts and the timing from sqlplus)
are the data sets on the same devices?
is there other stuff going on on the machine?
do you have ash/awr access? 
 
 
Import Statistics
Shravani, February  17, 2011 - 10:15 am UTC
 
 
Thanks so much Tom. Here is the required input
Query :
SELECT 'hints', patient_case_dim.case_nbr, therapy_profile_dim.dose_range,
       meddra_dim.llt_cd_desc, meddra_dim.pt_cd_desc,
       meddra_dim.primary_soc_cd_desc, meddra_dim.hlt_cd_desc,
       meddra_dim.hlgt_cd_desc, therapy_profile_dim.multiple_route_admin,
       therapy_profile_dim.multiple_formulation,
       therapy_profile_dim.multiple_indication,
       product_event_fact.interacting_drug_indic_desc,
       patient_demographics_dim.gender_cd_desc,
       patient_demographics_dim.age_group,
       case_profile_dim.case_country_cd_desc,
       case_profile_dim.co_suspect_prod_presence, patient_case_dim.age,
       case_profile_dim.source_derived, case_profile_dim.conmed_prod_presence,
       dm_load_dt.meddra_version_nbr,
       product_event_profile_dim.latency_time_2, 'Default',
       case_profile_dim.patient_history_presence, product_dim.generic_nm,
       patient_case_dim.age_unit_cd_desc, case_profile_dim.case_completeness,
       product_therapy_fact.first_daily_dose_qty,
       CONCAT (RTRIM (patient_case_dim.case_nbr),
               RTRIM (patient_case_dim.source_system_nm)
              ),
       patient_case_dim.patient_case_key, product_dim.drl_cd,
       product_event_fact.latency_days_calculated,
       therapy_start_date.calendar_dt, therapy_end_date.calendar_dt,
       event_profile_dim.event_seriousness_rep_cd_desc,
       patient_demographics_dim.elderly_case_indic,
       onset_date_dim.calendar_dt,
       product_event_profile_dim.ib_labeling_indic,
       product_event_profile_dim.cds_labeling_indic,
       product_event_profile_dim.spc_labeling_indic,
       product_event_profile_dim.uspi_only_labeling_indic,
       meddra_dim.dme_indic, case_profile_dim.case_outcome_hierarchy,
       case_profile_dim.case_seriousness_co_cd_desc,
       CASE
          WHEN (meddra_dim.pt_cd_desc) IS NULL
             THEN (product_event_fact.verbatim_event_term) || '*'
          ELSE (meddra_dim.pt_cd_desc)
       END,
       CASE
          WHEN (meddra_dim.primary_soc_cd_desc) IS NULL
             THEN 'SOC Not Yet Coded'
          ELSE (meddra_dim.primary_soc_cd_desc)
       END,
       product_event_profile_dim.dechallenge_cd_desc,
       event_profile_dim.treatment_rcvd_desc,
       event_profile_dim.event_outcome_rep_cd_desc,
       product_event_profile_dim.change_in_dosage_cd_desc,
       patient_case_dim.age_in_years,
       patient_demographics_dim.age_group_cd_desc,
       case_xref_fact.case_reftype_desc, case_xref_fact.case_reference_id,
       patient_case_dim.first_avail_eff_dt, therapy_start_date.derived_dt,
       therapy_end_date.derived_dt,
       product_event_profile_dim.latency_unit_cd_desc,
       product_event_fact.latency, onset_date_dim.derived_dt
  FROM product_event_fact,
       product_therapy_fact,
       therapy_profile_dim,
       patient_case_dim,
       product_dim,
       case_profile_dim,
       meddra_dim,
       product_event_profile_dim,
       date_time_dim therapy_start_date,
       date_time_dim therapy_end_date,
       date_time_dim onset_date_dim,
       event_profile_dim,
       patient_demographics_dim,
       dm_load_dt,
       case_xref_fact
 WHERE (product_therapy_fact.therapy_profile_key =
                                       therapy_profile_dim.therapy_profile_key
       )
   AND (meddra_dim.meddra_key = product_event_fact.reaction_key)
   AND (event_profile_dim.event_profile_key =
                                          product_event_fact.event_profile_key
       )
   AND (product_event_profile_dim.product_event_profile_key =
                                  product_event_fact.product_event_profile_key
       )
   AND (patient_case_dim.patient_case_key =
                                           product_event_fact.patient_case_key
       )
   AND (patient_case_dim.case_profile_key = case_profile_dim.case_profile_key
       )
   AND (onset_date_dim.year_month_day(+) = product_event_fact.onset_dt_key)
   AND (product_therapy_fact.therapy_start_dt_key = therapy_start_date.year_month_day(+))
   AND (product_therapy_fact.therapy_end_dt_key = therapy_end_date.year_month_day(+))
   AND (product_event_fact.patient_demographics_key =
                             patient_demographics_dim.patient_demographics_key
       )
   AND (    product_event_fact.patient_case_key = product_therapy_fact.patient_case_key(+)
        AND product_event_fact.product_key = product_therapy_fact.product_key(+)
        AND product_event_fact.seq_product = product_therapy_fact.seq_product(+)
       )
   AND (    product_event_fact.product_key = product_dim.product_key
        AND product_dim.role_indic = 'SUSPECT'
       )
   AND ((    dm_load_dt.increment_dt >= patient_case_dim.effective_dt
         AND dm_load_dt.increment_dt < patient_case_dim.end_dt
        )
       )
   AND (case_xref_fact.patient_case_key(+) = patient_case_dim.patient_case_key)
   AND (    (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') = dm_load_dt.reporting_dt
            )
        AND dm_load_dt.active = 'Y'
       )
   AND (    (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') = dm_load_dt.reporting_dt
            )
        AND dm_load_dt.active = 'Y'
       )
   AND (    (    patient_case_dim.case_logical_del_indic = 0
             AND (   case_profile_dim.historical_case_indic IS NULL
                  OR case_profile_dim.historical_case_indic = 2
                 )
            )
        AND (product_dim.generic_nm IN ('CELECOXIB') OR 'ALL' IN
                                                                ('CELECOXIB')
            )
        AND (patient_case_dim.case_nbr NOT IN (
                SELECT clm_list_item.aer_no
                  FROM bo_sec.clm_list_item, bo_sec.clm_list
                 WHERE (clm_list.list_id = clm_list_item.list_id)
                   AND (clm_list.list_name IN ('NONE'))
                   AND (clm_list.list_type = 'BYPASS'))
            )
        AND patient_case_dim.patient_case_key IN (
               SELECT DISTINCT product_matrix.patient_case_key
                          FROM date_time_dim init_rcvd_date_condition,
                               dm_load_dt dm_load_dt_condition,
                               product_matrix,
                               product_dim product_dim_suspect
                         WHERE (    product_matrix.suspect_product_key =
                                               product_dim_suspect.product_key
                                AND product_dim_suspect.role_indic = 'SUSPECT'
                               )
                           AND (product_matrix.init_recv_dt_key =
                                       init_rcvd_date_condition.year_month_day
                               )
                           AND ((    dm_load_dt_condition.increment_dt >=
                                                   product_matrix.effective_dt
                                 AND dm_load_dt_condition.increment_dt <
                                                         product_matrix.end_dt
                                )
                               )
                           AND (    (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') =
                                             dm_load_dt_condition.reporting_dt
                                    )
                                AND dm_load_dt_condition.active = 'Y'
                               )
                           AND (    (   product_dim_suspect.generic_nm IN
                                                                ('CELECOXIB')
                                     OR 'ALL' IN ('CELECOXIB')
                                    )
                                AND (dm_load_dt_condition.reporting_dt >=
                                         SYS_CONTEXT ('pfast', 'mbss_release')
                                    )
                                AND ((init_rcvd_date_condition.calendar_dt) >=
                                        TO_DATE ('01-JAN-2010', 'DD-MON-YYYY')
                                    )
                                AND ((init_rcvd_date_condition.calendar_dt) <=
                                        TO_DATE ('25-SEP-2010', 'DD-MON-YYYY')
                                    )
                               ))
       )
=========================
Statistics from A Schema
=====================================================
21:06:53 SQL> @query1
12659 rows selected.
Elapsed: 00:10:58.07
Statistics
----------------------------------------------------------
       5690  recursive calls
          0  db block gets
   13040813  consistent gets
     454569  physical reads
       1876  redo size
    1883128  bytes sent via SQL*Net to client
      16153  bytes received via SQL*Net from client
        845  SQL*Net roundtrips to/from client
       1755  sorts (memory)
          0  sorts (disk)
      12659  rows processed
21:17:55 SQL> spool off;
=====================================================
Statistics from B schema
21:07:46 SQL> @query1
12659 rows selected.
Elapsed: 00:28:13.07
Statistics
----------------------------------------------------------
        785  recursive calls
          0  db block gets
   16474030  consistent gets
     935724  physical reads
     479484  redo size
    1999291  bytes sent via SQL*Net to client
      16153  bytes received via SQL*Net from client
        845  SQL*Net roundtrips to/from client
         19  sorts (memory)
          0  sorts (disk)
      12659  rows processed
21:36:06 SQL> spool off;
===================================================
The data in both schemas is exactly the same. The only difference is that in the A copy the data gets inserted via an ETL process, while in the B schema we truncate the current partition and copy the data from the A schema. Once the copy process is done, we export the statistics gathered in A using the DBMS_STATS package procedures. The execution plan for the query in the B copy is very different from what appears for A.
And Sir, when we executed the queries in the A and B copies there were no background jobs or any other processes running.
Can you please let me know what should be done to remove the discrepancy?
Regards
Shravani 
 
February  17, 2011 - 12:07 pm UTC 
 
 
... The data in both the schema is exactly same ...
Nope, sorry, but it is not.  Look at the consistent gets.  The data VALUES might be the same - but the ordering of the rows on disk is undoubtedly different (compare the statistics in both schemas - look for index clustering factors that are very different - that'll prove that out)
Here is an example - SAME DATA - exactly the SAME DATA - but watch the IO differences:
ops$tkyte%ORA11GR2> create table organized
  2  as
  3  select x.*
  4    from (select * from stage order by object_name) x
  5  /
Table created.
ops$tkyte%ORA11GR2> create table disorganized
  2  as
  3  select x.*
  4    from (select * from stage order by dbms_random.random) x
  5  /
Table created.
ops$tkyte%ORA11GR2> create index organized_idx on organized(object_name);
Index created.
ops$tkyte%ORA11GR2> create index disorganized_idx on disorganized(object_name);
Index created.
ops$tkyte%ORA11GR2> begin
  2  dbms_stats.gather_table_stats
  3  ( user, 'ORGANIZED',
  4    estimate_percent => 100,
  5    method_opt=>'for all indexed columns size 254'
  6  );
  7  dbms_stats.gather_table_stats
  8  ( user, 'DISORGANIZED',
  9    estimate_percent => 100,
 10    method_opt=>'for all indexed columns size 254'
 11  );
 12  end;
 13  /
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select table_name, blocks, num_rows, 0.05*num_rows, 0.10*num_rows from user_tables
  2  where table_name like '%ORGANIZED' order by 1;
TABLE_NAME                         BLOCKS   NUM_ROWS 0.05*NUM_ROWS 0.10*NUM_ROWS
------------------------------ ---------- ---------- ------------- -------------
DISORGANIZED                         1064      72759       3637.95        7275.9
ORGANIZED                            1064      72759       3637.95        7275.9
ops$tkyte%ORA11GR2> select table_name, index_name, clustering_factor from user_indexes
  2  where table_name like '%ORGANIZED' order by 1;
TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
DISORGANIZED                   DISORGANIZED_IDX                           72581
ORGANIZED                      ORGANIZED_IDX                               1038
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace on
ops$tkyte%ORA11GR2> select /*+ index( organized organized_idx) */
  2    count(subobject_name)
  3    from organized;
COUNT(SUBOBJECT_NAME)
---------------------
                 1106
Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572
----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |    17 |  1402   (1)| 00:00:17 |
|   1 |  SORT AGGREGATE              |               |     1 |    17 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     | 72759 |  1207K|  1402   (1)| 00:00:17 |
|   3 |    INDEX FULL SCAN           | ORGANIZED_IDX | 72759 |       |   363   (1)| 00:00:05 |
----------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1400  consistent gets
          0  physical reads
          0  redo size
        436  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
ops$tkyte%ORA11GR2> select /*+ index( disorganized disorganized_idx) */
  2    count(subobject_name)
  3    from disorganized;
COUNT(SUBOBJECT_NAME)
---------------------
                 1106
Execution Plan
----------------------------------------------------------
Plan hash value: 3014554493
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |     1 |    17 | 72966   (1)| 00:14:36 |
|   1 |  SORT AGGREGATE              |                  |     1 |    17 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| DISORGANIZED     | 72759 |  1207K| 72966   (1)| 00:14:36 |
|   3 |    INDEX FULL SCAN           | DISORGANIZED_IDX | 72759 |       |   363   (1)| 00:00:05 |
-------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      72943  consistent gets
          0  physical reads
          0  redo size
        436  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
ops$tkyte%ORA11GR2> set autotrace off
One of your schemas did more than twice the physical IOs and 3 million more logical IOs - the data is different.
Try gathering statistics on B directly and compare the statistics of the tables, indexes and columns - you should see they are "different" 
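A minimal sketch of that comparison - assuming fresh statistics have been gathered in both schemas, and using A_SCHEMA and B_SCHEMA as placeholder names:
select a.index_name,
       a.clustering_factor a_cf,
       b.clustering_factor b_cf
  from dba_indexes a, dba_indexes b
 where a.owner = 'A_SCHEMA'
   and b.owner = 'B_SCHEMA'
   and a.index_name = b.index_name
 order by abs(a.clustering_factor - b.clustering_factor) desc;
Indexes whose clustering factors differ wildly between the two schemas point to the tables whose rows are laid out differently on disk.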
 
 
 
Shravani, February  17, 2011 - 1:57 pm UTC
 
 
Dear Tom,
Thanks so much for this answer. Yes, after gathering the stats I can see the index clustering factor is very different.
However, I am still not getting why the data in the B copy is disorganized. Do we need to make changes in the copy process?
The current logic is: if the table is partitioned, we truncate the current partition in the B copy and then select-insert the current partition's data from A into B. For the non-partitioned tables we just truncate and insert.
One more thing: each day we drop all the indexes in B and recreate them. So to summarize, the A-B copy process works in the following way:
1. Drop all indexes in B copy
2. Truncate the current partition or a complete table in B 
3. Insert the data from A copy
4. Create all the indexes
5. Clone the statistics from A using DBMS_STATS.Export_Schema_Stats
6. Copy the statistics into B using DBMS_STATS.Import_Schema_Stats
Do you think one of the above steps has led to the disorganized data in the B copy? And can you please give us a hint on the probable resolution for this?
Thanks 
Shravani 
February  17, 2011 - 2:11 pm UTC 
 
 
show me the code for step 3
tell me about the tablespaces - are they automatic segment space managed or manual segment space managed? 
 
 
Importing Statistics
Shravani, February  17, 2011 - 3:30 pm UTC
 
 
Hi Tom,
The tablespaces are automatic segment space managed.
Here is the code
IF clonetype = 1 THEN
   EXECUTE IMMEDIATE ('alter table ' || targetschema || '.' || in_table || ' truncate subpartition ' || in_subpart );
ELSIF clonetype = 2 THEN
   EXECUTE IMMEDIATE ('alter table ' || targetschema || '.'  || in_table || ' truncate partition ' || in_subpart );
ELSIF clonetype = 3 THEN
   EXECUTE IMMEDIATE ('truncate table ' || targetschema || '.' || in_table);
END IF;
/*   Now, insert the data in Direct Path mode...                                         */
diag_section := 'APPEND DATA';
mysql := 'insert /*+ APPEND NOLOGGING */ into ' || targetschema || '.' || in_table;
IF clonetype = 1 THEN
   mysql := mysql || ' subpartition ('  || in_subpart || ') (select * from '|| sourceschema || '.';
   mysql := mysql || in_table || ' subpartition (' || in_subpart || '))';
ELSIF clonetype = 2 THEN
   mysql := mysql || ' partition ('  || in_subpart || ') (select * from ' || sourceschema || '.';
   mysql := mysql || in_table || ' partition (' || in_subpart || '))';
ELSIF clonetype = 3 THEN
   mysql := mysql || '  (select * from ' || sourceschema || '.' || in_table || ')';
END IF;
Regards
Shravani 
February  17, 2011 - 6:02 pm UTC 
 
 
you know that NOLOGGING is meaningless.  Nologging is not a hint, it is not usable in SQL selects/inserts - it is only useful in an ALTER TABLE in that context.  
that should be inserting the data in MORE OR LESS the same sort of order - automatic segment space management spreads data out by design though, so it won't be exact.  Also - is parallel involved anywhere?
But if the clustering factors are very different - then, we know that it is laid out differently on disk.
You can use an order by on the insert as select to cause the rows to be organized in some fashion if you so desire. 
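A minimal sketch of such an ordered copy - the schema, table, partition and column names here are placeholders; ordering by the leading column of the most heavily used index keeps that index's clustering factor low:
insert /*+ APPEND */ into b_schema.t partition (p_current)
select *
  from a_schema.t partition (p_current)
 order by leading_index_column;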
 
 
Thanks so much Tom
shravani, February  21, 2011 - 7:31 am UTC
 
 
Thanks for the clarification Tom. We are now changing the INSERT method by adding the ORDER BY clause.
 
 
granularity with where condition
A reader, March     25, 2011 - 8:50 am UTC
 
 
Is it possible to gather table stats based on where condition
granularity=>"some_column=some_value" 
March     25, 2011 - 11:21 am UTC 
 
 
no, why???  what is your goal. 
 
 
gather_stats
A reader, March     26, 2011 - 4:25 am UTC
 
 
I do not want to subpartition further. I would like to gather statistics on a subset of the data. 
March     29, 2011 - 2:40 am UTC 
 
 
We can only gather statistics on a segment.  If you do not want to partition further, you have to gather statistics on what you have.   
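If the subset happens to line up with an existing subpartition, you can gather statistics for just that segment - a sketch, with placeholder names:
begin
   dbms_stats.gather_table_stats
   ( ownname     => user,
     tabname     => 'T',               -- placeholder table name
     partname    => 'SUBPART_NAME',    -- placeholder subpartition name
     granularity => 'SUBPARTITION' );
end;
/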
 
 
Analyze
A reader, July      15, 2011 - 3:49 pm UTC
 
 
Hi Tom - I thought multiple DML statements against a table should be followed by an analyze of the table and its indexes. I did as below but don't see any effect. So does the index stay intact even after multiple DML on the table - is the analyze even needed?
-- create an empty copy of all_objects with an extra indexed column (indx_col)
create table Test_Index as select a.*,a.OBJECT_ID indx_col from all_objects a where 1=2;
--create an index on the table
create index indx_indx_col  on test_index(indx_col );
-- analyze the (empty) table - nothing to analyze yet
analyze table Test_Index compute statistics
  for table
  for all indexes
  for all indexed columns;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
-- delete from the table
delete from test_index;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
-- analyze the table again, now that it has data
analyze table Test_Index compute statistics
  for table
  for all indexes
  for all indexed columns;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
Please suggest. 
July      18, 2011 - 10:14 am UTC 
 
 
you should never use analyze to gather statistics, please ONLY use dbms_stats.
a gathering of statistics MAY affect the plan, it may not.  It gives an opportunity for a plan change, it does not mean "the plan will change, the plan has to change".  It simply means "plans might change, then again, plans might stay the same"
In general, you DO NOT want to gather statistics after multiple DML, it is not necessary, you do it when the data changes greatly.
when I ran the above (changing analyze to dbms_stats), it just always decides to use the index, nothing wrong or suspicious there. 
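For reference, the dbms_stats call used in place of the analyze above would be something like:
begin
   dbms_stats.gather_table_stats
   ( ownname => user,
     tabname => 'TEST_INDEX',
     cascade => true );   -- cascade gathers the index statistics as well
end;
/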
 
 
Any Example
A reader, July      19, 2011 - 6:19 pm UTC
 
 
Hi Tom - Are there any specific examples/links on your site where I can look up what you meant by "
In general, you DO NOT want to gather statistics after multiple DML, it is not necessary, you do it when the data changes greatly. "
I am interested in gathering more information on "data changes greatly".  
Please help. 
Thanks as always 
July      22, 2011 - 12:33 pm UTC 
 
 
 
 
Another use for dbms_stats?
Raj, August    19, 2011 - 3:20 am UTC
 
 
Hi Tom,
I was trying to find another use for dbms_stats exports by trying to trend a table's growth (row count) over time. If I had a daily export (taken before a new gather), I'm having trouble figuring out which columns in my export table are reliable. I realize that this isn't totally accurate - it's more a problem of not understanding the mapping. Eg.
select count(*) from test.a;
COUNT(*)
----------
107929
exec dbms_stats.gather_table_stats(ownname=>'test',tabname=>'a',estimate_percent=>0.1,granularity=>'ALL',degree=>2,cascade=>true);
select count(*) from test.a;
COUNT(*)
----------
107929
exec dbms_stats.create_stat_table('test','a_statexp','test');
exec dbms_stats.export_table_stats(ownname=>'test', tabname=>'a', stattab=>'a_statexp', statown=>'test');
I find 3 columns with the value 107929:
select count(*) from test.a_statexp where N1=107929;
    COUNT(*)
  ----------
           5
select count(*) from test.a_statexp where N3=107929;
    COUNT(*)
  ----------
           1
select count(*) from test.a_statexp where N8=107929;
    COUNT(*)
  ----------
           5
When I look at the value of C1 in the a_statexp table, I see that it's the primary key rather than the table itself.  It sort of makes sense, but doesn't seem very trustworthy.
Is there a better way? 
 
Re: Another use for dbms_stats?
Raj, August    19, 2011 - 10:37 am UTC
 
 
I forgot to include the concluding statement I was wondering about...
select n1 from test.a_statexp where c1 = pk_index_of_tab_a;
That's the statement that seems a little flaky to me (rather than finding the column that represents the num_rows on the table itself... if it exists!). 
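For what it is worth: the stat table format is undocumented and subject to change between releases, so none of the following is guaranteed. The observed convention, however, is that the TYPE column identifies the kind of row ('T' = table, 'I' = index, 'C' = column) and C1 holds the object name, so a query along these lines would pull the table-level row count rather than an index's:
select n1 num_rows
  from test.a_statexp
 where type = 'T'    -- 'T' = table row (observed convention, not documented)
   and c1 = 'A';     -- the object name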
 
analyze and dbms_stats
Sushil, August    23, 2011 - 1:36 pm UTC
 
 
Tom, I tried to analyze a couple of tables in an Oracle 11g database - I could analyze some but not others. Also, if I run the following gather table stats, it is successful for some tables and I get an error for others.
BEGIN DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'RPAPPL',TABNAME=>'ECHO_ZHOURLY',estimate_percent=>10,
method_opt=>'FOR ALL COLUMNS SIZE 1',Degree=>4, GRANULARITY=>'AUTO', CASCADE=>TRUE); END;
/
error:
BEGIN DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'RPAPPL',TABNAME=>'ECHO_ZHOURLY',estimate_percent=>10,
method_opt=>'FOR ALL COLUMNS SIZE 1',Degree=>4, GRANULARITY=>'AUTO', CASCADE=>TRUE); END;
Error at line 7
ORA-20001: Invalid or inconsistent input values
ORA-06512: at "SYS.DBMS_STATS", line 18408
ORA-06512: at "SYS.DBMS_STATS", line 18429
ORA-06512: at line 1
Script Terminated on line 7.
can you please help me to identify the issue? 
August    30, 2011 - 3:25 pm UTC 
 
 
I would highly recommend in 11g using the defaults for estimate_percent - do not sample anymore, let it figure out what it wants to do.  I highly recommend that.
You would need to give me a complete example if you want me to comment. I need to be able to reproduce the issue. 
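On the first point, that means a call along these lines - either omit estimate_percent entirely or pass dbms_stats.auto_sample_size explicitly:
begin
   dbms_stats.gather_table_stats
   ( ownname          => 'RPAPPL',
     tabname          => 'ECHO_ZHOURLY',
     estimate_percent => dbms_stats.auto_sample_size,
     cascade          => true );
end;
/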
 
 
will be of more use
venkata, October   10, 2011 - 5:14 am UTC
 
 
Tom,
I have observed you use @connect user/password in all your replies. Can I know what you have written in that connect.sql? Hope this is not classified ;) 
October   10, 2011 - 10:48 am UTC 
 
 
it isn't - and I've published it before - but since 10g it isn't necessary anymore.
It was simply:
set termout off
connect &1
@login
set termout on
because earlier releases of sqlplus didn't re-run the login.sql file (I set my prompt in that).  But sqlplus does that now so I don't use it at all anymore. 
 
 
rbo query
sandeep, December  11, 2011 - 11:34 am UTC
 
 
Hi Tom,
We have a 9i database which is in RBO mode. One query (shown below) which used to take 1 hr to complete took 32 hours to complete after running dbms_stats on those tables.
explain plan for
SELECT /*+ USE_NL(CFA) */
      SUM( CFA.TRNSCTN_AMNT) NOB67AMNT
FROM
      (SELECT C.CLM_ID,
            TO_CHAR(RPTD_DATE, 'YYYY') RPT_YEAR,
            C.ACCNT_PRGRM_PRD_ID
      FROM CLM_R C,
            PLCY_SRVC P,
            LOSS_R L
      WHERE C.CLM_ID > 0 AND
            C.PLCY_SRVC_ID = P.PLCY_SRVC_ID AND
            P.LOB_CODE LIKE '%DD' AND
            C.LOSS_ID = L.LOSS_ID AND
            TO_CHAR(L.RPTD_DATE, 'YYYY') > TO_CHAR(TO_NUMBER(TO_CHAR(TRUNC(SYSDATE),'YYYY'))-10) AND
            C.LMTBL_PRDCT_ID =
            (SELECT LMTBL_PRDCT_ID
            FROM LMTBL_PRDCT
            WHERE LMTBL_PRDCT_SHRT_NAME = 'WC')) C,
      SVNGS_FCTR_HSTRY D,
      NTR_OF_BNFT_R NOB,
      CLM_FNNCL_ACTVTY_R CFA,
      ENTRY_CODE_R E
WHERE CFA.CLM_FNNCL_ACTVTY_ID > 0 AND
      E.ENTRY_CODE_NAME NOT LIKE 'NH%' AND
      CFA.ENTRY_CODE_ID = E.ENTRY_CODE_ID AND
      C.CLM_ID = CFA.CLM_ID AND
      CFA.NTR_OF_BNFT_ID = NOB.NTR_OF_BNFT_ID AND
      D.ACCNT_PRGRM_PRD_ID = C.ACCNT_PRGRM_PRD_ID AND
      NOB.NTR_OF_BNFT_CODE = '67' AND
      TO_CHAR(CFA.ROW_INSRTD_DATE, 'YYYY') >= C.RPT_YEAR AND
      TO_CHAR(D.VLTN_MNTH_YEAR,'MM/YYYY')= TO_CHAR(ADD_MONTHS(TRUNC(SYSDATE), -1),'MM/YYYY') AND
      TO_DATE(CFA.ROW_INSRTD_DATE, 'DD-MON-YYYY') <= TO_DATE(LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1)), 'DD-MON-YYYY');
---------------------------------------------------------------------------------------------
| Id  | Operation                          |  Name                  | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                        |     1 |   140 |  2656 |
|   1 |  SORT AGGREGATE                    |                        |     1 |   140 |       |
|   2 |   NESTED LOOPS                     |                        |     1 |   140 |  2656 |
|   3 |    NESTED LOOPS                    |                        |    11 |  1375 |  2655 |
|   4 |     NESTED LOOPS                   |                        |    11 |  1254 |  2654 |
|   5 |      NESTED LOOPS                  |                        |   920 | 68080 |  2470 |
|   6 |       NESTED LOOPS                 |                        |  5784 |   338K|  1892 |
|   7 |        NESTED LOOPS                |                        |   422 | 10550 |    14 |
|   8 |         TABLE ACCESS BY INDEX ROWID| NTR_OF_BNFT_R          |     1 |     9 |     1 |
|*  9 |          INDEX UNIQUE SCAN         | NTR_OF_BNFT_CK01       |     1 |       |       |
|* 10 |         INDEX FULL SCAN            | SVNGS_FCTR_HSTRY_CK01  |   422 |  6752 |   246 |
|* 11 |        TABLE ACCESS BY INDEX ROWID | CLM_R                  |    14 |   490 |     5 |
|* 12 |         INDEX RANGE SCAN           | CLM_R_PF01             |   192 |       |     2 |
|  13 |         TABLE ACCESS BY INDEX ROWID| LMTBL_PRDCT            |     1 |    13 |     1 |
|* 14 |          INDEX UNIQUE SCAN         | LMTBL_PRDCT_CK02       |     1 |       |       |
|* 15 |       TABLE ACCESS BY INDEX ROWID  | PLCY_SRVC              |     1 |    14 |     1 |
|* 16 |        INDEX UNIQUE SCAN           | PLCY_SRVC_PK           |     1 |       |     1 |
|* 17 |      TABLE ACCESS BY INDEX ROWID   | CLM_FNNCL_ACTVTY_R     |     1 |    40 |     1 |
|* 18 |       INDEX RANGE SCAN             | CLM_FNNCL_ACTVTY_PF02  |     1 |       |     3 |
|* 19 |     TABLE ACCESS BY INDEX ROWID    | ENTRY_CODE_R           |     1 |    11 |     1 |
|* 20 |      INDEX UNIQUE SCAN             | SYS_C0010550           |     1 |       |       |
|* 21 |    TABLE ACCESS BY INDEX ROWID     | LOSS_R                 |     1 |    15 |     1 |
|* 22 |     INDEX UNIQUE SCAN              | LOSS_R_PK              |     1 |       |     1 |
---------------------------------------------------------------------------------------------
When I apply the RULE hint, it completes within 1 hour. 
explain plan for
SELECT /*+ RULE */
      SUM( CFA.TRNSCTN_AMNT) NOB67AMNT
FROM
      (SELECT C.CLM_ID,
            TO_CHAR(RPTD_DATE, 'YYYY') RPT_YEAR,
            C.ACCNT_PRGRM_PRD_ID
      FROM CLM_R C,
            PLCY_SRVC P,
            LOSS_R L
      WHERE C.CLM_ID > 0 AND
            C.PLCY_SRVC_ID = P.PLCY_SRVC_ID AND
            P.LOB_CODE LIKE '%DD' AND
            C.LOSS_ID = L.LOSS_ID AND
            TO_CHAR(L.RPTD_DATE, 'YYYY') > TO_CHAR(TO_NUMBER(TO_CHAR(TRUNC(SYSDATE),'YYYY'))-10) AND
            C.LMTBL_PRDCT_ID =
            (SELECT LMTBL_PRDCT_ID
            FROM LMTBL_PRDCT
            WHERE LMTBL_PRDCT_SHRT_NAME = 'WC')) C,
      SVNGS_FCTR_HSTRY D,
      NTR_OF_BNFT_R NOB,
      CLM_FNNCL_ACTVTY_R CFA,
      ENTRY_CODE_R E
WHERE CFA.CLM_FNNCL_ACTVTY_ID > 0 AND
      E.ENTRY_CODE_NAME NOT LIKE 'NH%' AND
      CFA.ENTRY_CODE_ID = E.ENTRY_CODE_ID AND
      C.CLM_ID = CFA.CLM_ID AND
      CFA.NTR_OF_BNFT_ID = NOB.NTR_OF_BNFT_ID AND
      D.ACCNT_PRGRM_PRD_ID = C.ACCNT_PRGRM_PRD_ID AND
      NOB.NTR_OF_BNFT_CODE = '67' AND
      TO_CHAR(CFA.ROW_INSRTD_DATE, 'YYYY') >= C.RPT_YEAR AND
      TO_CHAR(D.VLTN_MNTH_YEAR,'MM/YYYY')= TO_CHAR(ADD_MONTHS(TRUNC(SYSDATE), -1),'MM/YYYY') AND
      TO_DATE(CFA.ROW_INSRTD_DATE, 'DD-MON-YYYY') <= TO_DATE(LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1)), 'DD-MON-YYYY');
----------------------------------------------------------------------------------------------
| Id  | Operation                           |  Name                  | Rows  | Bytes | Cost  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                        |       |       |       |
|   1 |  SORT AGGREGATE                     |                        |       |       |       |
|*  2 |   FILTER                            |                        |       |       |       |
|   3 |    NESTED LOOPS                     |                        |       |       |       |
|   4 |     NESTED LOOPS                    |                        |       |       |       |
|   5 |      NESTED LOOPS                   |                        |       |       |       |
|   6 |       NESTED LOOPS                  |                        |       |       |       |
|   7 |        NESTED LOOPS                 |                        |       |       |       |
|   8 |         NESTED LOOPS                |                        |       |       |       |
|   9 |          TABLE ACCESS BY INDEX ROWID| NTR_OF_BNFT_R          |       |       |       |
|* 10 |           INDEX UNIQUE SCAN         | NTR_OF_BNFT_CK01       |       |       |       |
|* 11 |          TABLE ACCESS BY INDEX ROWID| CLM_FNNCL_ACTVTY_R     |       |       |       |
|* 12 |           INDEX RANGE SCAN          | CLM_FNNCL_ACTVTY_FK01  |       |       |       |
|* 13 |         TABLE ACCESS BY INDEX ROWID | ENTRY_CODE_R           |       |       |       |
|* 14 |          INDEX UNIQUE SCAN          | SYS_C0010550           |       |       |       |
|  15 |        TABLE ACCESS BY INDEX ROWID  | CLM_R                  |       |       |       |
|* 16 |         INDEX UNIQUE SCAN           | CLM_PK                 |       |       |       |
|* 17 |       TABLE ACCESS BY INDEX ROWID   | LOSS_R                 |       |       |       |
|* 18 |        INDEX UNIQUE SCAN            | LOSS_R_PK              |       |       |       |
|* 19 |      TABLE ACCESS BY INDEX ROWID    | PLCY_SRVC              |       |       |       |
|* 20 |       INDEX UNIQUE SCAN             | PLCY_SRVC_PK           |       |       |       |
|* 21 |     INDEX RANGE SCAN                | SVNGS_FCTR_HSTRY_CK01  |       |       |       |
|  22 |    TABLE ACCESS BY INDEX ROWID      | LMTBL_PRDCT            |       |       |       |
|* 23 |     INDEX UNIQUE SCAN               | LMTBL_PRDCT_CK02       |       |       |       |
----------------------------------------------------------------------------------------------
We have a job to run the dbms_stats weekly as shown below.
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','CLM_FNNCL_ACTVTY_R',estimate_percent =>25,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','CLM_R',estimate_percent =>100,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','LOSS_R',estimate_percent =>100,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','PLCY_SRVC',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','SVNGS_FCTR_HSTRY',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','LMTBL_PRDCT',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','ENTRY_CODE_R',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','NTR_OF_BNFT_R',estimate_percent =>100,degree =>1,cascade => TRUE);
But last week I ran dbms_stats for clm_r in the way shown below, and after that the weekly analyze job also ran. I think this is the reason CLM_R comes within the nested loop before CLM_FNNCL_ACTVTY_R, which is causing so much delay. 
dbms_stats.gather_table_stats (
    ownname => user, 
    tabname => 'CLM_R', 
    estimate_percent => 100, 
    method_opt => 'for all columns size auto',
    degree => 4,
    cascade => TRUE 
    );
The size of the tables involved in the above query is given below.
OWNER                          SEGMENT_NAME            SEGMENT_TYPE               mb
------------------------------ ----------------------- ------------------ ----------
PSTPP                          PLCY_SRVC               TABLE                     120
PSTPP                          NTR_OF_BNFT_R           TABLE                    .125
PSTPP                          CLM_FNNCL_ACTVTY_R      TABLE                   23552
PSTPP                          ENTRY_CODE_R            TABLE                    .125
PSTPP                          LOSS_R                  TABLE                    4736
PSTPP                          LMTBL_PRDCT             TABLE                    .125
PSTPP                          SVNGS_FCTR_HSTRY        TABLE                   1.875
PSTPP                          CLM_R                   TABLE                    5504
Can you please suggest how I should run dbms_stats to get a consistent plan? The plan is changing and causing the query to finish in 30 hours instead of within 1 hour. 
December  11, 2011 - 2:56 pm UTC 
 
 
why do you have a hint in it?
are the actual cardinalities (number of rows) flowing through the steps of the plan anywhere close to the estimated cardinalities?
it seems you had a method for getting good stats before and you changed your method and it didn't work out so well - is that correct? 
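One way to answer the cardinality question on 10g and later (the 9i database here would have to rely on the tkprof row source counts instead) is to run the query once with runtime statistics and compare the E-Rows (estimated) and A-Rows (actual) columns step by step:
select /*+ gather_plan_statistics */ count(*)
  from clm_r;                                  -- stand-in; add the hint to the real query

select *
  from table( dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST') );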
 
 
rbo query
sandeep, December  14, 2011 - 10:14 pm UTC
 
 
Hi Tom,
Thanks for your reply. This query with the use_nl hint has been used for a long time (maybe some years) in our database, so the developers are reluctant to change the code. Previously we were collecting stats using the "analyze table .. estimate statistics" method. Now I have changed it to dbms_stats.
Now I have deleted the statistics of one table - clm_r - and went back to the old "analyze table .. estimate statistics" method for that particular table. The rest are still done the dbms_stats way. Now when I run the query the result comes back within 45 mins. I am a bit confused about which way to collect the stats. Can you please suggest which method I should follow?
Many thanks in advance.
sandeep 
December  16, 2011 - 6:13 am UTC 
 
 
the code is not working, why are they reluctant?
get rid of it if you want to discuss it with me.
 
 
 
dbms_stats and analyze result in different plan
A reader, February  26, 2012 - 9:23 pm UTC
 
 
Hi Tom,
Could you please help elaborate on why 'dbms_stats' and 'analyze' result in different plans?
Below is the test scripts for your reference.
create table test as select rownum id, 'test' text from dual connect by level <= 10000;
create index test_i on test(id);
delete test where id <= 9990;
commit;
exec dbms_stats.gather_table_stats(ownname=>user, tabname=> 'TEST', cascade => true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
select * from test where id between 1 and 10000;
-- index range scan 
analyze index test_i compute statistics;
select * from test where id between 1 and 10000;
-- full table scan. Even though there are only 10 rows now, the HWM is still high, so a FTS is bad. 
February  28, 2012 - 6:26 am UTC 
 
 
because analyze has not been touched in many years as far as stats gathering goes, because analyze is deprecated for gathering stats, because dbms_stats is the only way to gather stats for the optimizer - the optimizer is coded expecting the numbers to have come from dbms_stats.
In short, they generate different statistics, different statistics often lead to different plans.
also, your two commands are entirely different - you gathered entirely different sets of stats. 
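To see it in the test case above, capture the index statistics after the dbms_stats call and again after the analyze, and compare - a sketch:
select num_rows, leaf_blocks, distinct_keys, clustering_factor
  from user_indexes
 where index_name = 'TEST_I';
Different numbers in those columns feed directly into the optimizer's costing, which is how the two methods end up with different plans.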
 
 
Analyze any sys priv in 11gr2
A reader, March     19, 2012 - 10:51 am UTC
 
 
Tom,
I recently encountered a strange behavior. After upgrading from 10.2.0.3 to 11gr2, a package stopped working. This package, owned by OWNER1, uses dbms_stats to gather stats on tables owned by another user, OWNER2. It was working fine in 10g, but in 11gr2 I started getting:
dbms_stats.gather_table_stats(ownname=>'OWNER2',tabname=>'T_1',cascade=>true);
ORA-20000: Unable to analyze TABLE "OWNER2"."T_1", insufficient privileges or does not exist
When I grant "ANALYZE ANY" privilege to package owner OWNER1, then I do not get any error. 
Question is: Is this a change in 10g to 11gr2? In 10g the same command worked without granting "ANALYZE ANY" privilege.
Thanks... 
 
Analyze any sys priv in 11gr2
A reader, March     20, 2012 - 7:39 am UTC
 
 
Upgrade was done via cold backup of 10g, restore on 11g, startup in upgrade mode, run catupgrd. So no 10g privileges were modified. The 10g database is still intact and I compared all the user privileges between 10g and 11g. They are the same. I have checked all roles assigned to OWNER1 and they match between 10g and 11g. Somehow, OWNER1 can analyze OWNER2's objects without the "ANALYZE ANY" privilege in 10g. 
I just wanted to know if you were aware of something which would cause this to happen in 10g, like any initialization parameter or something which I might have missed. 
Based on your reply, I think my best option is to open an SR with support.
Thanks... 
March     20, 2012 - 9:52 am UTC 
 
 
show us the set of privileges granted to the user in 10g - reproduce this.
Do this before opening an SR.  
show me how, in 10g, you were able to do this - what grants did you give the user in 10g that made this possible? 
 
 
 Analyze any sys priv in 11gr2
A reader, March     20, 2012 - 10:21 am UTC
 
 
Seems like a 10g bug...
C:\>sqlplus system
SQL*Plus: Release 10.2.0.4.0 - Production on Tue Mar 20 10:58:35 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Enter password:
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> drop user USER1 cascade;
drop user USER1 cascade
          *
ERROR at line 1:
ORA-01918: user 'USER1' does not exist
SQL> drop user USER2 cascade;
drop user USER2 cascade
          *
ERROR at line 1:
ORA-01918: user 'USER2' does not exist
SQL> create user USER1 identified by USER1;
User created.
SQL> grant create session to USER1;
Grant succeeded.
SQL> create user USER2 identified by USER2;
User created.
SQL> grant create session, create table to USER2;
Grant succeeded.
SQL> alter user USER2 quota unlimited on users_data;
User altered.
SQL> conn USER2/USER2
Connected.
SQL> create table t1 as select * from all_objects;
Table created.
SQL> conn USER1/USER1
Connected.
SQL> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
PL/SQL procedure successfully completed.
SQL>
-- 11g
C:\Users\oradba>sqlplus system
SQL*Plus: Release 11.2.0.2.0 Production on Tue Mar 20 11:03:20 2012
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters and Automatic Storage Management options
SQL> drop user USER1 cascade;
drop user USER1 cascade
          *
ERROR at line 1:
ORA-01918: user 'USER1' does not exist
SQL> drop user USER2 cascade;
drop user USER2 cascade
          *
ERROR at line 1:
ORA-01918: user 'USER2' does not exist
SQL> create user USER1 identified by USER1;
User created.
SQL> grant create session to USER1;
Grant succeeded.
SQL> create user USER2 identified by USER2;
User created.
SQL> grant create session, create table to USER2;
Grant succeeded.
SQL> alter user USER2 quota unlimited on users;
User altered.
SQL> conn USER2/USER2
Connected.
SQL> create table t1 as select * from all_objects;
Table created.
SQL> conn USER1/USER1
Connected.
SQL> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or
does not exist
ORA-06512: at "SYS.DBMS_STATS", line 23427
ORA-06512: at "SYS.DBMS_STATS", line 23489
ORA-06512: at line 1
SQL>
 
 
March     20, 2012 - 3:19 pm UTC 
 
 
It doesn't work like that by default in 10g
ops$tkyte%ORA10GR2> drop user USER1 cascade;
User dropped.
ops$tkyte%ORA10GR2> drop user USER2 cascade;
User dropped.
ops$tkyte%ORA10GR2> create user USER1 identified by USER1;
User created.
ops$tkyte%ORA10GR2> grant create session to USER1;
Grant succeeded.
ops$tkyte%ORA10GR2> create user USER2 identified by USER2 default tablespace users quota unlimited on users;
User created.
ops$tkyte%ORA10GR2> grant create session, create table to USER2;
Grant succeeded.
ops$tkyte%ORA10GR2> conn USER2/USER2
Connected.
user2%ORA10GR2> create table t1 as select * from all_objects where rownum = 1;
Table created.
user2%ORA10GR2> conn USER1/USER1
Connected.
user1%ORA10GR2> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or
does not exist
ORA-06512: at "SYS.DBMS_STATS", line 15017
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1
I believe (firmly) that your 10g database had analyze any granted to PUBLIC.  Please check out what privs PUBLIC has in your 10g database.
Nothing changed... 
 
 
 
A reader, March     20, 2012 - 8:11 pm UTC
 
 
SQL> select * from dba_sys_privs where grantee='PUBLIC';
no rows selected
SQL>
I have tested in 3 different 10.2.0.3 databases and get the same behavior. I can run gather stats against a different user's schema even without any privilege.
I am sure it must be a 10.2.0.3 bug which was fixed, maybe in 10.2.0.4 or later. If it were a wrongly granted system privilege, it would have carried over into the 11g database, since the 11g database was created from a cold backup of 10g.
 
 
March     20, 2012 - 9:17 pm UTC 
 
 
I don't have a 10.2.0.3 instance to test with me this week...
Can anyone else out there with 10.2.0.3 try this out?  I don't see any bugs that would have let this happen. 
 
 
A reader, March     20, 2012 - 9:41 pm UTC
 
 
I have a bug number but no details. It might not be a published bug.
Bug 12714396: USER IS ABLE TO EXECUTE DBMS_STATS WITHOUT ANY PRIVILEGE 
March     20, 2012 - 10:16 pm UTC 
 
 
that was a bug closed as NOT A BUG
the bug was opened because someone could execute dbms_stats to gather stats on a schema without the "analyze any" privilege.
but what in fact was happening was that NOTHING in that schema needed stats, so nothing was gathered.  if something needed stats - it failed as it should.
that bug does not apply here at all. 
 
 
Analyze any sys priv in 11gr2 
A reader, March     21, 2012 - 8:15 am UTC
 
 
Thanks for looking at the bug. Since this is not very urgent, please see if you can find a 10.2.0.3 database to test as per your convenience. 
I will also try to create a 10.2.0.3 database and check if it works out of the box. Then I will upgrade to 10.2.0.4 and test whether it is resolved. I will have to start with the OS install, so it will take a couple of days.
Thanks for following up on this one even with your extremely busy schedule. 
March     21, 2012 - 10:32 am UTC 
 
 
If you can, please do this in 10g and report back the entire set of apparent system privs the newly created user has:
PLEASE REVIEW CAREFULLY - it operates as sysdba to create a new "dictionary"-like view that shows all of the roles granted to someone
user1%ORA10GR2> connect / as sysdba
Connected.
sys%ORA10GR2>
sys%ORA10GR2> create or replace view user_role_hierarchy
  2  as
  3  select u2.name granted_role
  4    from ( select *
  5             from sys.sysauth$
  6          connect by prior privilege# = grantee#
  7            start with grantee# = uid or grantee# = 1) sa,
  8         sys.user$ u2
  9   where u2.user#=sa.privilege#
 10  union all select user from dual
 11  union all select 'PUBLIC' from dual
 12  /
View created.
sys%ORA10GR2> grant select on user_role_hierarchy to public;
Grant succeeded.
sys%ORA10GR2> create or replace public synonym user_role_hierarchy for user_role_hierarchy;
Synonym created.
sys%ORA10GR2>
sys%ORA10GR2> drop user user1 cascade;
User dropped.
sys%ORA10GR2> create user user1 identified by user1;
User created.
sys%ORA10GR2> grant create session to user1;
Grant succeeded.
sys%ORA10GR2> grant select on dba_sys_privs to user1;
Grant succeeded.
sys%ORA10GR2>
sys%ORA10GR2> connect user1/user1;
Connected.
user1%ORA10GR2>
user1%ORA10GR2> select * from user_role_hierarchy;
GRANTED_ROLE
------------------------------
USER1
PUBLIC
user1%ORA10GR2>
user1%ORA10GR2> select distinct PRIVILEGE
  2    from dba_sys_privs
  3   where grantee in ( select * from user_role_hierarchy )
  4  /
PRIVILEGE
----------------------------------------
CREATE SESSION
user1%ORA10GR2>
 
 
 
 
A reader, March     21, 2012 - 11:49 am UTC
 
 
C:\>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Mar 21 12:45:20 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> create or replace view user_role_hierarchy
  2  as
  3  select u2.name granted_role
  4    from ( select *
  5           from sys.sysauth$
  6           connect by prior privilege# = grantee#
  7           start with grantee# = uid or grantee# = 1) sa,
  8           sys.user$ u2
  9               where u2.user#=sa.privilege#
 10   union all
 11          select user from dual
 12   union all
 13          select 'PUBLIC' from dual
 14  /
View created.
SQL> grant select on user_role_hierarchy to public;
Grant succeeded.
SQL> create or replace public synonym user_role_hierarchy for user_role_hierarchy;
Synonym created.
SQL> drop user user1 cascade;
User dropped.
SQL> create user user1 identified by user1;
User created.
SQL> grant create session to user1;
Grant succeeded.
SQL> grant select on dba_sys_privs to user1;
Grant succeeded.
SQL> connect user1/user1
Connected.
SQL> select * from user_role_hierarchy;
GRANTED_ROLE
------------------------------
USER1
PUBLIC
SQL> select distinct PRIVILEGE
  2      from dba_sys_privs
  3     where grantee in ( select * from user_role_hierarchy )
  4  /
PRIVILEGE
----------------------------------------
CREATE SESSION
SQL>
 
 
March     21, 2012 - 10:44 pm UTC 
 
 
that is perplexing, I would contact support at this point - you should not be able to analyze ANYONES tables - period.
can anyone else with 10.2.0.3 available reproduce? 
 
 
Analyze any sys priv in 11gr2
A reader, March     22, 2012 - 8:08 am UTC
 
 
Tom,
I really would like to express my sincere thanks to you for working on this issue. I have seen your busy schedule on the calendar. It is really amazing that you still find time to help Oracle users.
I did open a SR and the only thing I got out of support is:
"In 11g we tightened security on ANALYZE. It is a behavior change between versions."
Since your test on higher versions of 10g shows that dbms_stats fails without privileges, I have to disagree with that statement. Also, if it were a behavior change, there would be some documentation of it in the new features guide or the PL/SQL Packages and Types Reference.
Tomorrow I am going to create a VM and test with various versions of 10g to observe the behavior out of the box. 
 
10.2.0.1.0 - 10.2.0.3.0 Results
Mark Williams, March     22, 2012 - 6:21 pm UTC
 
 
Here's some results from versions 10.2.0.1.0, 10.2.0.2.0, and 10.2.0.3.0 on a test VM (Windows Server 2003 32-bit).
Quick summary: I couldn't reproduce the issue.
Common tasks for each version:
create user USER1 identified by USER1
default tablespace users
temporary tablespace temp
quota unlimited on users;
grant create session to USER1;
create user USER2 identified by USER2
default tablespace users
temporary tablespace temp
quota unlimited on users;
grant create session, create table to USER2;
conn USER2/USER2
create table t1 as select * from all_objects;
conn USER1/USER1
exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
Here are the results for each version with the common tasks snipped out for space:
SQL> @get_version
PRODUCT                                          VERSION
------------------------------------------------ ----------
NLSRTL                                           10.2.0.1.0
Oracle Database 10g Enterprise Edition           10.2.0.1.0
PL/SQL                                           10.2.0.1.0
TNS for 32-bit Windows:                          10.2.0.1.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13046
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
SQL> @get_version
PRODUCT                                          VERSION
------------------------------------------------ ----------
NLSRTL                                           10.2.0.2.0
Oracle Database 10g Enterprise Edition           10.2.0.2.0
PL/SQL                                           10.2.0.2.0
TNS for 32-bit Windows:                          10.2.0.2.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13149
ORA-06512: at "SYS.DBMS_STATS", line 13179
ORA-06512: at line 1
SQL> @get_version
PRODUCT                                          VERSION
------------------------------------------------ ----------
NLSRTL                                           10.2.0.3.0
Oracle Database 10g Enterprise Edition           10.2.0.3.0
PL/SQL                                           10.2.0.3.0
TNS for 32-bit Windows:                          10.2.0.3.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13172
ORA-06512: at "SYS.DBMS_STATS", line 13202
ORA-06512: at line 1
 
 
March     22, 2012 - 7:04 pm UTC 
 
 
Thanks Mark - I'm pretty sure there is something funky that I just cannot see with the original poster's database.  
That would have been a pretty large hole had it existed. 
 
 
A reader, July      23, 2012 - 1:27 pm UTC
 
 
Tom,
When I gather table stats using a procedure that's scheduled using dbms_job, I get the error below. If I gather stats on the table directly I don't get that error. Any reason why? Your reply is greatly appreciated.
Unable to analyze TABLE , insufficient 
privileges or does not exist
 
July      30, 2012 - 9:00 am UTC 
 
 
 
 
A reader, July      30, 2012 - 12:39 pm UTC
 
 
That doesn't tell me what to do if I run the procedure through the job scheduler. I submitted a TAR with Oracle support, and the support analyst said I have to grant the "analyze any table" / "analyze any" schema privileges to the DBSNMP user. I can gather stats if I run it using my ID, which has the DBA privilege. 
July      30, 2012 - 4:05 pm UTC 
 
 
same thing, you have no roles.
you need the privileges granted directly to you.  You'll be running from inside a stored procedure environment, where roles are disabled. 
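A quick way to see the effect for yourself - a minimal sketch, reusing the USER2/T1 names from the test case above, and assuming your session normally gets its privileges through a role such as DBA:
set role none;   -- definer's rights PL/SQL runs with roles disabled; this simulates it
exec dbms_stats.gather_table_stats( ownname => 'USER2', tabname => 'T1' );
-- expect the same ORA-20000 "insufficient privileges" error the scheduled job raises.
-- the fix is a direct grant from a suitably privileged account, for example:
-- grant analyze any to some_user;   -- some_user is a placeholder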
 
 
Analyze Command to be used?
Ramki, July      31, 2012 - 1:04 pm UTC
 
 
Tom
   I saw your explanation of the difference between using the Analyze command and dbms_stats.gather_table_stats.
I recently joined a data warehouse team where the DBAs use the Analyze command as part of their periodic weekly stats-gathering process for important tables.
The Oracle version used is 11g (Release 1102000100).
I referred to the Oracle 11g release notes, which recommend using the dbms_stats.gather_table_stats package.
After seeing your recommendations I suggested they change to dbms_stats.gather_table_stats.
They (the DBAs) responded by stating: "Actually oracle internally runs the “analyze” command within the dbms_stats package. Only difference is the cardinality will be more if you use the dbms_stats. Also running dbms_stats takes longer time than the regular analyze statement."
Is the above DBA statement correct? Can you please share your thoughts on this so that I can follow up with the DBAs to correct them? 
July      31, 2012 - 1:08 pm UTC 
 
 
They(DBA's) responded by stating that "Actually oracle internally runs the 
“analyze” command within the dbms_stats package. Only difference is the 
cardinality will be more if you use the dbms_stats. Also running dbms_stats 
takes longer time than the regular analyze statement."
that is false on every level
first, enable sql trace, run dbms_stats, and then tkprof the trace file, show me a single ANALYZE in the trace?  you cannot.  show me a ton of SQL run against the objects?  you'll find it all over the place.
dbms_stats does not run analyze, it doesn't.
(and if there is a difference - as they say there will be - doesn't that sort of rule out dbms_stats 'just runs analyze' - if they return different numbers?????)
and they should benchmark dbms_stats versus analyze.  I think they'll be surprised.
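If they want to run that test, a minimal sketch (T is a placeholder for one of their tables; the trace file name will vary):
alter session set sql_trace = true;
exec dbms_stats.gather_table_stats( user, 'T' );
alter session set sql_trace = false;
-- then, on the server, format the trace file found in user_dump_dest:
--   tkprof <trace_file>.trc report.txt
-- report.txt will contain no ANALYZE statement at all - just lots of recursive
-- SELECTs against the table, which is what dbms_stats really runs.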
 
 
 
dbms_stats gather taking ages.
A Reader, January   23, 2013 - 3:08 am UTC
 
 
Tom,
we are at 10.2.0.4
the job:
SQL> exec dbms_stats.gather_table_stats(user,'TABLE_1',cascade=>true);
it has been running for more than 24 hrs...
table TABLE_1 size - 45 GB
index IDX_NAME_UI size - 28 GB
table sys.ORA_TEMP_1_DS_23485 has 0 rows in it.
IDX_NAME_UI is a function-based index on the NAME column, with the definition:
create index IDX_NAME_UI on TABLE_1 ( UPPER(name) );
SQL> desc TABLE_1;
 Name                                                                                                      Null?    Type
 ----------------------------------------------------------------------------------------------------------------- -------- 
 MANAGEDSERVICEID                                                                                          NOT NULL NUMBER
 NAME                                                                                                               VARCHAR2(650 CHAR)
 MS2MANAGEDSERVICETYPE                                                                                              NUMBER
 MS2MARKETTYPE                                                                                                      NUMBER
 OMSID                                                                                                              VARCHAR2(2000 CHAR)
 FULLNAME                                                                                                           VARCHAR2(50 CHAR)
 RELATIVENAME                                                                                                       VARCHAR2(50 CHAR)
 ALIAS1                                                                                                             VARCHAR2(50 CHAR)
 ALIAS2                                                                                                             VARCHAR2(50 CHAR)
 OBJECTID                                                                                                           VARCHAR2(64 CHAR)
 SUBTYPE                                                                                                            VARCHAR2(50 CHAR)
 SUBSTATUS                                                                                                          VARCHAR2(50 CHAR)
 DESCRIPTION                                                                                                        VARCHAR2(2000 CHAR)
 NOTES                                                                                                              VARCHAR2(2000 CHAR)
 CREATEDDATE                                                                                                        DATE
 LASTMODIFIEDDATE                                                                                                   DATE
 CREATEDBY2DIMUSER                                                                                                  NUMBER
 LASTMODIFIEDBY2DIMUSER                                                                                             NUMBER
 MS2PROVISIONSTATUS                                                                                                 NUMBER
 MS2FUNCTIONALSTATUS                                                                                                NUMBER
 MARKEDFORDELETE                                                                                                    NUMBER(1)
 ISVISIBLE                                                                                                          NUMBER
 LABEL                                                                                                              NUMBER
 X_CTCV_CONV_ID                                                                                                     VARCHAR2(50 CHAR)
 X_CONV_RUN_NO                                                                                                      NUMBER
we have seen it is currently executing the statements below.
 
 
 sid for the above job is - 1023
 
SQL> select event,state,p1,p2,p3 from v$session_wait where sid=1023;
EVENT                                                            STATE                             P1         P2         P3
---------------------------------------------------------------- ------------------- ---------------- ---------- ----------
latch: cache buffers chains                                      WAITED SHORT TIME     23568357320        122          1
SQL> select  segment_name,
  2              segment_type
  3  from     dba_extents
  4  where  file_id = 23568357320
  5       and  122  between
  6      block_id and block_id + blocks -1;
no rows selected
-- no file with file_id = 23568357320 !
-- current statement being executed ..
SQL> explain plan for
  2  SELECT COUNT (rep)
  3    FROM (  SELECT                   /*+ leading(v1 v2) use_nl_with_index(v2) */
  4                  COUNT (v1.val) rep
  5              FROM (SELECT rn, val
  6                      FROM (SELECT ROWNUM rn, val
  7                              FROM (  SELECT /*+ no_merge no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */
  8                                            SUBSTRB (
  9                                                SYS_DS_ALIAS_26,
 10                                                1,
 11                                                32)
 12                                                val
 13                                        FROM sys.ora_temp_1_ds_23485 t
 14                                       WHERE SUBSTRB (SYS_DS_ALIAS_26, 1, 32)
 15                                                IS NOT NULL
 16                                    GROUP BY SUBSTRB (SYS_DS_ALIAS_26, 1, 32)
 17                                      HAVING COUNT (
 18                                                SUBSTRB (SYS_DS_ALIAS_26, 1, 32)) =
 19                                                1))
 20                     WHERE ORA_HASH (rn) <= 4015864698) v1,
 21                   (SELECT                                      /*+ index(t2) */
 22                          UPPER ("NAME") val
 23                      FROM "CRAMER"."TABLE_1" t2
 24                     WHERE TBL$OR$IDX$PART$NUM ("CRAMER"."TABLE_1",
 25                                                0,
 26                                                4,
 27                                                0,
 28                                                "ROWID") = :objn) v2
 29             WHERE v2.val LIKE v1.val || '%'
 30          GROUP BY v1.val
 31            HAVING COUNT (v1.val) <= 2);
Explained.
SQL> set linesize 200
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------- 
Plan hash value: 3648757343
-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |    13 |   100K  (4)| 00:20:06 |    |          |
|   1 |  SORT AGGREGATE             |                     |     1 |    13 |            |          |    |          |
|   2 |   VIEW                      |                     |    30M|   374M|   100K  (4)| 00:20:06 |    |          |
|*  3 |    FILTER                   |                     |       |       |            |          |    |          |
|   4 |     HASH GROUP BY           |                     |    30M|  1871M|   100K  (4)| 00:20:06 |    |          |
|   5 |      NESTED LOOPS           |                     |    30M|  1871M| 97232   (1)| 00:19:27 |    |          |
|*  6 |       VIEW                  |                     |   408 | 12648 |    21  (10)| 00:00:01 |    |          |
|   7 |        COUNT                |                     |       |       |            |          |    |          |
|   8 |         VIEW                |                     |   408 |  7344 |    21  (10)| 00:00:01 |    |          |
|*  9 |          FILTER             |                     |       |       |            |          |    |          |
|  10 |           HASH GROUP BY     |                     |   408 |   130K|    21  (10)| 00:00:01 |    |          |
|* 11 |            TABLE ACCESS FULL| ORA_TEMP_1_DS_23485 |   408 |   130K|    20   (5)| 00:00:01 |    |          |
|  12 |       PARTITION RANGE SINGLE|                     | 74008 |  2457K|   238   (1)| 00:00:03 |   KEY |   KEY |
|* 13 |        INDEX RANGE SCAN     | IDX_NAME_UI          | 74008 |  2457K|   238   (1)| 00:00:03 |   KEY |   KEY |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(COUNT("VAL")<=2)
   6 - filter(ORA_HASH("RN")<=4015864698)
   9 - filter(COUNT(SUBSTRB("SYS_DS_ALIAS_26",1,32))=1)
  11 - filter(SUBSTRB("SYS_DS_ALIAS_26",1,32) IS NOT NULL)
  13 - access(UPPER("NAME") LIKE "VAL"||'%')
       filter(UPPER("NAME") LIKE "VAL"||'%')
30 rows selected.
SQL>
questions
a) where is it waiting?
b) what does P1 = 23568357320 from v$session_wait refer to here?
c) would a system-level stats gather help?
d) any other thoughts... could this be a bug?
 
 
January   30, 2013 - 12:28 pm UTC 
 
 
-- no file with file_id = 23568357320 !
p1 = address of latch, not file
ops$tkyte%ORA11GR2> select parameter1 p1, parameter2 p2, parameter3 p3
  2  from v$event_name
  3  where name = 'latch: cache buffers chains'
  4  /
P1
----------------------------------------------------------------
P2
----------------------------------------------------------------
P3
----------------------------------------------------------------
address
number
tries
use ASH reports to see what it was doing.
do you have the resources to run in parallel (IO, CPU and memory)? 
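If you are licensed for the Diagnostics Pack, the canned ASH report script is the easiest way to do that - a sketch:
@?/rdbms/admin/ashrpt.sql
-- it prompts for the report type and a begin/end time (e.g. -1:00 for "the last
-- hour") and shows the top wait events, top SQL and top sessions for that window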
 
 
 
..dbms_stats job taking ages
A Reader, January   23, 2013 - 5:16 am UTC
 
 
Tom
to add further to the above
I did 
alter session set events '10046 trace name context forever,level 12';
on the above session running the stats gather.
excerpts from the trace file (the job had been running for many hrs..):
this obj#=202725 belongs  to
|* 11 |            TABLE ACCESS FULL| ORA_TEMP_1_DS_23485 |   408 |   130K|    20   (5)| 00:00:01 |    |          |
WAIT #20: nam='db file scattered read' ela= 41817 file#=1009 block#=6154 blocks=128 obj#=202725 tim=3368259988396
WAIT #20: nam='db file scattered read' ela= 3920 file#=1009 block#=6282 blocks=19 obj#=202725 tim=3368260153260
|* 13 |        INDEX RANGE SCAN     | IDX_NAME_UI          | 74008 |  2457K|   238   (1)| 00:00:03 |   KEY |   KEY |
obj#=194946 belongs to IDX_NAME_UI
seen many events as below 
WAIT #23: nam='db file sequential read' ela= 467 file#=260 block#=1432646 blocks=1 obj#=194946 tim=3369425036064
WAIT #23: nam='db file sequential read' ela= 533 file#=277 block#=215812 blocks=1 obj#=194946 tim=3369426851963
WAIT #23: nam='db file sequential read' ela= 442 file#=277 block#=215813 blocks=1 obj#=194946 tim=3369426852683
WAIT #23: nam='db file sequential read' ela= 403 file#=277 block#=215814 blocks=1 obj#=194946 tim=3369426853244
Trace file was not updating:
below is the line ... where it was waiting:
*** 2013-01-23 21:35:21.609
WAIT #23: nam='db file sequential read' ela= 13780 file#=269 block#=242690 blocks=1 obj#=194946 tim=3369458683230
I am sure it is this operation.
|   5 |      NESTED LOOPS           |                     |    30M|  1871M| 97232   (1)| 00:19:27 |    |          |
So,
is the nested loop the nasty part here?
 
 
..dbms_stats job taking forever
A Reader, January   25, 2013 - 1:53 am UTC
 
 
Tom,
sorry to keep adding  the info as it comes.
Here are more details to above problem
 
I ran the dbms_stats.gather_table_stats job again without parallel (no degree clause specified)
and traced the session. below excerpts from the tracefile from udump.
...snip
... snip
WAIT #22: nam='db file sequential read' ela= 27129 file#=258 block#=1454180 blocks=1 obj#=194946 tim=3423691838220
WAIT #22: nam='db file sequential read' ela= 40410 file#=260 block#=1453019 blocks=1 obj#=194946 tim=3423691914858
WAIT #22: nam='db file sequential read' ela= 7393 file#=260 block#=1453060 blocks=1 obj#=194946 tim=3423691928244
<no updates to the trace file for the last 11 hrs... even though the job is still running..>
checking what the session is waiting for:
SQL> select event,state,p1,p2,p3 from v$session_wait where sid=1045;
EVENT                                                            STATE                       P1      P2    P3
---------------------------------------------------------------- ------------------- ---------- ---------- ----------
db file sequential read                                          WAITED KNOWN TIME          260    1453060          1
SQL> select  segment_name,
  2                  segment_type
  3     from     dba_extents
  4       where  file_id = 260
  5           and 1453060   between
  6          block_id and block_id + blocks -1;
SEGMENT_NAME                   SEGMENT_TYPE
------------------------------ --------------------------------------------------
IDX_NAME_UI                    INDEX PARTITION
 
- obj#=194946 in the trace file refers to the index partition segment IDX_NAME_UI
when I looked at AWR for the last 1 hr (4 CPUs, dual core):
 
Top 5 Timed Events

Event                           Waits    Time(s)  Avg Wait(ms)  % Total Call Time  Wait Class
CPU time                                  38,824                              100
control file parallel write    36,658        187             5                0.5  System I/O
control file sequential read  198,347        168             1                0.4  System I/O
direct path read               62,568         34             1                0.1  User I/O
enq: CF - contention              120         31           259                0.1  Other

SQL ordered by Elapsed Time

Elapsed Time (s)  CPU Time (s)  Executions  Elap per Exec (s)  % Total DB Time  SQL Id         SQL Module  SQL Text
          38,686        38,683           0                               99.61  4wx2na79mm0jv  SQL*Plus    BEGIN DBMS_STATS.GATHER_TABLE_...
          38,686        38,683           0                               99.61  6kd805142rv5w  SQL*Plus    select count(rep) from (select...
questions
a) is it because of sluggish IO?
b) we have seen that the subquery (running on behalf of the DBMS_STATS.GATHER_TABLE_... job) is CPU intensive?
   the full query is:
   select count(rep) from (select /*+ leading(v1 v2) use_nl_with_index(v2) */ count(v1.val) rep from (select rn, val from (select rownum rn, val from (select /*+ no_merge no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */substrb("NAME", 1, 32) val from sys.ora_temp_1_ds_23528 t where substrb("NAME", 1, 32) is not null group by substrb("NAME", 1, 32) having count(substrb("NAME", 1, 32)) = 1)) where ora_hash(rn) <= 3909847333) v1, (select /*+ index(t2) */ "NAME" val from "user"."TABLE_1" t2 where TBL$OR$IDX$PART$NUM("user"."TABLE_1", 0, 4, 0, "ROWID") = :objn) v2 where v2.val like v1.val||'%' group by v1.val having count(v1.val) <= 2)
   
c) what is the way forward? do I need to raise a bug?
 
 
 
..dbms_stats job taking ages..
A Reader, January   26, 2013 - 3:48 pm UTC
 
 
Tom
I got the tkprof for it.
snip...here
********************************************************************************
select count(rep) 
from
 (select /*+ leading(v1 v2) use_nl_with_index(v2) */ count(v1.val) rep from 
  (select rn, val from (select rownum rn, val from (select /*+ no_merge 
  no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact 
  use_weak_name_resl dynamic_sampling(0) no_monitoring */substrb("NAME",1,32) 
  val from sys.ora_temp_1_ds_23528 t where substrb("NAME",1,32) is not null 
  group by substrb("NAME",1,32) having count(substrb("NAME",1,32)) = 1)) 
  where ora_hash(rn) <= 3909847333) v1, (select /*+ index(t2) */ "NAME" val 
  from "user"."TABLE_1" t2 where 
  TBL$OR$IDX$PART$NUM("CRAMER"."TABLE_1",0,4,0,"ROWID") = :objn) v2 
  where v2.val like v1.val||'%' group by v1.val having count(v1.val) <= 2)
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0        297          0           0
Fetch        1  86487.88   84518.20       3001  662474171          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3  86487.89   84518.20       3001  662474468          0           1
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: 72     (recursive depth: 1)
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=662474171 pr=3001 pw=0 time=2913827502 us)
   1123   VIEW  (cr=662474171 pr=3001 pw=0 time=2913846548 us)
   1123    FILTER  (cr=662474171 pr=3001 pw=0 time=2913845415 us)
   2005     HASH GROUP BY (cr=662474171 pr=3001 pw=0 time=2913821602 us)
 611873      NESTED LOOPS  (cr=662474171 pr=3001 pw=0 time=111974935129 us)
   2005       VIEW  (cr=116 pr=0 pw=0 time=152402 us)
   2197        COUNT  (cr=116 pr=0 pw=0 time=102234 us)
   2197         VIEW  (cr=116 pr=0 pw=0 time=91204 us)
   2197          FILTER  (cr=116 pr=0 pw=0 time=80192 us)
   2308           HASH GROUP BY (cr=116 pr=0 pw=0 time=78997 us)
   4308            TABLE ACCESS FULL ORA_TEMP_1_DS_23528 (cr=116 pr=0 pw=0 time=4440 us)
 611873       PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289249313 us)
 611873        INDEX RANGE SCAN IDX_NAME_UI PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289129215 us)(object id 194945)
 
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                      3001        0.06         28.32  
  
*********************
here Oracle created sys.ora_temp_1_ds_23528 using the TABLE_1 definition.
index clustering factor, table row count, and number of blocks:
  
SQL>  select  column_name, data_type , data_length  from user_TAB_COLUMNS where   column_name in (select  column_name   from dba_ind_columns where  index_name ='IDX_NAME_UI') and table_name ='TABLE_1';
COLUMN_NAME                    DATA_TYPE                      DATA_LENGTH
------------------------------ ------------------------------ -----------
NAME                           VARCHAR2                               650
SQL> select index_name, last_analyzed, clustering_factor, num_rows , distinct_keys from dba_indexes where table_name ='TABLE_1';
INDEX_NAME                     LAST_ANAL CLUSTERING_FACTOR   NUM_ROWS DISTINCT_KEYS
------------------------------ --------- ----------------- ---------- -------------
IDX_NAME_UI                     13-JAN-13         230094384  306871930     304700852
SQL> select  sum(blocks)  from dba_segments where  segment_name ='TABLE_1';
SUM(BLOCKS)
-----------
    5895680
observations/questions
a) each row from the operation --> 2005       VIEW  (cr=116 pr=0 pw=0 time=152402 us)
is looked up in the index IDX_NAME_UI and the data is read from table TABLE_1; only 611873 matching rows are found.
due to the poor clustering factor, is each block being read many, many times?
b) has the optimizer chosen the nested loop join correctly, with the real issue being the poor clustering factor?
   the table in question has stale stats.. so to get the proper plan we need stats... a sort of circular, chicken-and-egg condition
c) what is the way out here - improve the clustering factor? we could do an alter table move or an online redefinition... but would that put the table data in order of the NAME column key? the table has another column, CREATEDDATE, for which I can imagine most key values ending up together in a block.. but not for the NAME column..
d) the nested loop operation has the stats below:
    611873      NESTED LOOPS  (cr=662474171 pr=3001 pw=0 time=111974935129 us)
    here 111974 seconds!  ~ 31 hrs
    by comparison, 611873 rows taking 31 hrs is huge.. I see the inner table lookup taking long here, adding all that time to the NL operation?
  
 611873       PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289249313 us)
 611873        INDEX RANGE SCAN IDX_NAME_UI PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289129215 us)(object id 194945)
  the time taken is around 18 hrs (65289249313 us).
 
 
e) how can I probe further whether the poor clustering factor or the nested loop itself is the issue? any other tests...? 
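One probe I can think of - a sketch comparing the clustering factor to the table's blocks and rows (using the numbers already shown: 230,094,384 vs 5,895,680 blocks vs 306,871,930 rows):
select i.index_name, i.clustering_factor, i.num_rows, t.blocks
from   user_indexes i, user_tables t
where  t.table_name = i.table_name
and    i.index_name = 'IDX_NAME_UI';
-- a clustering factor close to NUM_ROWS (as here) rather than to BLOCKS means
-- consecutive index entries almost always point at different table blocks, so
-- each index-driven lookup tends to hit a "new" block - consistent with the
-- huge cr= figures above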
 
 
..dbms_stats taking ages
A Reader, January   27, 2013 - 5:48 pm UTC
 
 
Tom,
I see the above issue is closely related to
Bug 8651614 : GATHERING STATISTICS TAKE A LONG TIME WITH DBMS_STATS
and will pursue it.
Thanks
 
 
Execute DBMS_STATS.GATHER_TABLE_STATS inside a procedure
Raghu, April     03, 2013 - 7:20 pm UTC
 
 
Tom,
We have 2 tables that get about 5-6 million transactions each day, and the stats are gathered at the end of each day (all tables, not just these 2).
We have a couple of stored procedures that access these tables but are run before the stats are gathered.
We have had to kill the session (since it sometimes takes more than 7 hrs, which it shouldn't), gather stats and re-run the stored procs (they run really fast the second time around) quite often..
Is it a good idea to gather stats only for those 2 tables inside the stored proc before every run (once every day)?
procedure abcd
is
begin
  dbms_stats.gather_table_stats( ownname => user, tabname => 'T', cascade => true );  -- T stands in for each table
  -- our code here
end;
Thanks in advance 
April     22, 2013 - 1:58 pm UTC 
 
 
... Is it a good idea to gather stats only for those 2 tables inside the stored 
proc before every run (once everyday)? ...
it would be a good idea to either have already gathered statistics OR to have SET them (using dbms_stats) or to have copied them from yesterday's data - or to LOCK the stats on these tables and never gather again.
what you want to have in place are statistics that represent the data in the table - you could do that by gathering stats ONCE when the table is full, locking them and never doing it again.
you could do that by setting stats on the table - tell us "there are 6 million rows in here and they look sort of like this"
you could do that by gathering stats after the data has been loaded.
but doing what you are doing doesn't make sense... 
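A minimal sketch of the "set" and "lock" options just described (the table name and counts below are placeholders - use numbers representative of your fully loaded table):
begin
   -- tell the optimizer what the table looks like when full
   dbms_stats.set_table_stats( ownname => user, tabname => 'T',
                               numrows => 6000000, numblks => 100000 );
   -- freeze representative stats so nothing gathers over them later
   dbms_stats.lock_table_stats( ownname => user, tabname => 'T' );
end;
/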
 
 
Execute DBMS_STATS.GATHER_TABLE_STATS - 2 methods
Linda, June      28, 2013 - 4:47 pm UTC
 
 
How can I tell whether the statistics of a table were gathered with either of these two methods?
DBMS_STATS.GATHER_TABLE_STATS(
   ownname          => <table owner>,
   tabname          => <table_name>,
   estimate_percent => NULL,
   block_sample     => FALSE,
   degree           => DBMS_STATS.AUTO_DEGREE,
   cascade          => TRUE );
And:
DBMS_STATS.GATHER_TABLE_STATS(
   ownname          => <table owner>,
   tabname          => <table_name>,
   estimate_percent => NULL,
   block_sample     => FALSE,
   method_opt       => 'FOR ALL INDEXED COLUMNS',
   degree           => DBMS_STATS.AUTO_DEGREE,
   cascade          => FALSE );
 
Pawan Tyagi, October   23, 2013 - 12:37 pm UTC
 
 
I need to see the old dbms_stats data for a table in an Oracle database. I'm able to see when stats last ran, and the history dates as well, but how can I see the old and new statistic values for a table? 
 
Analyzing tables using DBMS_STATS or compute statistics - which is better?
Malaya, October   31, 2013 - 2:47 pm UTC
 
 
Hi,
I am confused. Generally I use DBMS_STATS to optimize DB performance, but in the last 2 months I encountered a severe performance issue. Then I used compute statistics and performance improved.
Please advise me, what should I use in the future? Please help me.
Malaya 
November  01, 2013 - 9:27 pm UTC 
 
 
only use dbms_stats.  
analyze table for gathering statistics is a deprecated bit of functionality.
use dbms_stats and make sure you are gathering the same set of statistics you did with analyze (you didn't share your analyze or dbms_stats commands so I cannot comment further).  what happened is that you probably gathered entirely DIFFERENT types of statistics with the two commands.  Make sure they are gathering the same amount and types of statistics. 
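For example, if the old process was "analyze table t compute statistics", a rough dbms_stats counterpart would be the sketch below - not an exact equivalence, since the two commands never collected identical statistics, and T is a placeholder:
begin
   dbms_stats.gather_table_stats(
      ownname          => user,
      tabname          => 'T',
      estimate_percent => 100,                      -- "compute" = read everything
      method_opt       => 'for all columns size 1', -- basic column stats, no histograms
      cascade          => true );                   -- gather the index statistics too
end;
/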
 
 
Degree level in DBMS_Stats
Mahes, November  18, 2013 - 11:10 am UTC
 
 
hi Tom
I am using the following call in a loading stored procedure, which runs every day and loads data into a target table.
Oracle Version: 11.2
DBMS_STATS.Gather_Table_Stats( ownname => 'XYZ', tabname => V_tbl_name, cascade => TRUE, degree => 3 );
degree is given as a number here - on what basis is it used, and what range of values can be used?
I saw DBMS_STATS.AUTO_DEGREE mentioned on this page; can you explain how that is allocated?
Also, please explain cascade.
Thanks
 
 
ENDPOINT_VALUE holding VARCHAR
Andre, January   10, 2014 - 1:54 pm UTC
 
 
Hi Tom,
A Very Happy New Year 2014 !!
+
Is there a way to decode the ENDPOINT_VALUE from histogram view where the value is VARCHAR2..?
I have tried to search for answers for days - got a link to Jonathan Lewis and others - but ...
... it seems that decoding the first 6 bytes is OK, but beyond that there are problems.
And ... the field "ENDPOINT_ACTUAL_VALUE" - sometimes shows such values - but then they also somehow vanish...
Thanks in advance
Andre 
 
ENDPOINT_VAL for VARCHAR2
Andre, January   12, 2014 - 11:03 am UTC
 
 
Hi Tom,
More searching through Jonathan Lewis' papers and I have located his paper on this subject.
It appears that Oracle ONLY encodes the first 5 characters + possibly the 6th (although this is not 100% correct) but beyond this the remaining values are garbage.
JL explained it pretty well - and it appears that the ACTUAL value column that is mostly NULL in the HISTOGRAM view stores the values if the first 6 are not unique. I have not tested such cases yet - and probably will when time allows.
But I am curious as to Oracle's strategy... i.e. why only the first 5 bytes. Does it change in 12c..?
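For reference, the decoding approach from that paper looks roughly like the sketch below (table and column names are placeholders); as noted, only the first five or six characters come back cleanly:
select endpoint_number,
       to_char( endpoint_value, 'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ) hex_val,
       endpoint_actual_value
from   user_tab_histograms
where  table_name  = 'T'
and    column_name = 'NAME'
order  by endpoint_number;
-- each leading pair of hex digits is (approximately) one character code, e.g.
-- chr( to_number( substr( hex_val, 1, 2 ), 'xx' ) ) recovers the first character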
Thanks
Best wishes
A 
 
indexing a partitioned table
jason, January   22, 2014 - 10:17 pm UTC
 
 
Could you please let me know the best way to find out which columns of a table are used most frequently? I just partitioned a table using range (GL date), but I would like to create indexes on the column or columns that are used most in queries. Is there a way for me to get this information? I have used SYS.COL_USAGE$, but I'm not quite sure that's what is needed.  
 
performance impact of stats collection
Ravi B, April     24, 2014 - 9:40 pm UTC
 
 
Hi Tom,
We have several "big" tables in our data warehouse. After each ETL run we gather stats on the tables. Each of the tables is partitioned and sub-partitioned. The complete process of gathering stats takes hours to finish.
currently, this is what we do for each table.
EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);
Do you recommend we do the following instead, as recommended at
https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables
EXEC DBMS_STATS.SET_TABLE_PREFS(<SCHEMA_NAME>,<TABLE_NAME>,'INCREMENTAL','TRUE');
EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);
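Spelled out as a sketch on a single table (BIG_TAB is a placeholder), this is what we would try first:
exec dbms_stats.set_table_prefs( user, 'BIG_TAB', 'INCREMENTAL', 'TRUE' );
exec dbms_stats.gather_table_stats( user, 'BIG_TAB', granularity => 'AUTO' );
-- only (sub)partitions changed since the last gather get re-scanned; global
-- stats are derived from stored synopses instead of a full pass over the table.
-- to confirm the preference took effect:
select dbms_stats.get_prefs( 'INCREMENTAL', user, 'BIG_TAB' ) from dual;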
Thanks! 
 
 
Undo analyze table 
Praveen Darla, May       27, 2019 - 10:05 am UTC
 
 
Hi Tom,
I analyzed a table without first saving its previous statistics. I want to undo or revert the
analyze table tablename compute statistics. Please assist with a solution.
Thanks
 
May       29, 2019 - 6:23 am UTC 
 
 
You can't undo an analyze (which is why we recommend using dbms_stats).
If you *do* have some historical stats, you can do:
exec dbms_stats.delete_table_stats( ownname=>'MYUSER', tabname=>'MYTABLE' );
exec dbms_stats.restore_table_stats( ownname=>'MYUSER', tabname=>'MYTABLE', as_of_timestamp=>'12/11/19 12:12:12,123456 +08:00' );
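To check how far back restore_table_stats can reach, a quick sketch using the stats history functions:
select dbms_stats.get_stats_history_availability from dual;  -- oldest restorable point
select dbms_stats.get_stats_history_retention    from dual;  -- retention in days (default 31)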
 
 
 
 
gabriel, November  14, 2022 - 5:39 pm UTC
 
 
hello. I have a problem and I don't know how to figure it out. Can you help me please? I have this requirement: count the total number of books and magazines rented, and the total number of members
who used the library in the selected month.
 
my code: select sum(transaction_id) from ledgers where month(issue_date) = 7;
the result: ORA-00904: "MONTH": invalid identifier
 
thank you. 
November  15, 2022 - 7:06 am UTC 
 
 
month(issue_date) = 7;
is probably 
extract(month from issue_date) = 7
or you might consider something to ensure issue_date is unaffected, eg
issue_date >= date '2022-07-01' and
issue_date <  date '2022-08-01' 
(assuming you want data just for the current year)
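Also note that sum(transaction_id) adds up the ID values themselves; for "count the total number rented" a count is probably what is wanted. A sketch, assuming July 2022 and that each LEDGERS row is one rental:
select count(*)
from   ledgers
where  issue_date >= date '2022-07-01'
and    issue_date <  date '2022-08-01';
-- leaving issue_date unwrapped also keeps any index on issue_date usable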