Thanks for the question, Benny.

Asked: August 31, 2001 - 12:53 pm UTC

Last updated: June 17, 2020 - 8:22 am UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

Can you explain the differences between the following methods in analyzing objects, and if you have a preference, thanks!

alter table/index ... compute statistics;
dbms_utility.analyze_schema
dbms_stat

and Tom said...

For quick and dirty examples -- I use analyze table t compute statistics for table for all indexes for all indexed columns.

In real life, one would tend to use dbms_stats instead. It supports the gathering of statistics in parallel (and also the manual setting of stats) which analyze does not. It is easier to automate with dbms_job as it is a plsql interface and doesn't require the use of dynamic sql.

Dbms_utility.analyze_schema is sort of a "left over" from 7.x -- before the DBMS_STATS package was introduced.

Rating

(21 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, August 31, 2001 - 1:54 pm UTC

Thanks for the answer, just a quick follow-up, would using
the analyze command and dbms_stat produce the same statistical values?

August 31, 2001 - 2:38 pm UTC

As long as you used them in the equivalent fashion, yes they should.

Bhawna, September 05, 2001 - 3:14 am UTC

Hi Tom,
Due to the difference in time-zones, when we come to office, we always see the message "Sorry, I already have a backlog of questions, ask later." on the home-page.
Can we have some email ID, which we could use for posting questions to you ?
Thanks.

September 07, 2001 - 12:10 pm UTC

Well, the way it works is like this (i take 10 questions at a time):

in the morning (about 7-8am east coast US time) I do a batch. Usually about 9am-10am or so, I am open for questions. I take 10 more.

After lunch, if I can, I do another batch. Usually around 1-2pm east coast time, I take some more.

If I'm really lucky -- or the questions are very easy, I sometimes do another in the afternoon (sometimes)... So 5pm east coast time is another good time to look.

After I put the kids to bed -- I might do another set. So, 9-10pm east coast time is another great time to look....

Now, I ask people to put locations in when they submit questions. What they put into there is free form -- but a quick query sorted by count shows me:

... (thousands of records deleted)
Kuwait 60
Hyderabad 63
Mumbai 64
Malaysia 66
Pakistan 68
Ny 69
Atlanta 75
Nj 90
Bangalore 98
New Jersey 113
Singapore 197
San Jose 211
India 577

If I just look at locations with "india" in them:

1 select count(*)
2 from WWC_ASK_SUBMITTED_QUESTIONS$
3* where location is not null and upper(location) like '%INDIA%'
ask_tom@OSI816> /

COUNT(*)
----------
1403

ask_tom@OSI816> select count(*) from WWC_ASK_SUBMITTED_QUESTIONS$;

COUNT(*)
----------
11159

about 12% of the questions come from there (i think almost all of the 10pm questions come from there actually)...

I've played with other schemes (eg: take 1 question every hour, take 30 all at once, etc) but the current scheme seems to work well as far as getting questions from all over.

Analyze Any system privilege

D Tran, September 07, 2001 - 6:45 pm UTC

I do like the solution of dbms_stats and dbms_job to gather schema statistics.
I submit the job as system to analyze another schema but get insufficient privilege errors in the alert log -- the job is submitted cleanly but get failures.
Only after granting the above sys priv to sys and system do they run. Now sys and system should get Analyze Any through the DBA role already, shouldn't they?

September 07, 2001 - 6:52 pm UTC

See
</code> http://asktom.oracle.com/Misc/RolesAndProcedures.html <code>
for why privs in the stored procedure must be granted directly, not via a role.

A reader, October 01, 2002 - 1:31 pm UTC

"WWC_ASK_SUBMITTED_QUESTIONS$" - Is the $ sign there signify anything or is it just the way you named it? (It is always curious to have a peek into your applications internals....!!)

October 02, 2002 - 9:49 am UTC

thats the way Joel Kallman named it - he did the Marvel layer for me (the skin and branching from page to page)

It is just the way their group names things. (i find it annoying cause my telnet window doesn't select the $ when I double click on the table name -- making me type the $ in manually !)

DBMS_STATS PACK VERY SLOW COMPARED TO OLD ''ANALYZE''

Billy L., December 09, 2003 - 4:39 pm UTC

We are trying to convert all our statistics collection from "analyze" to "DBMS_STATS" pack, because this will be the only statistics collection tools being enhanced and also strongly recommended by Oracle. However, the time it
takes to gather statistics with this package (gather_schema_stats, gather_table_stats) is a lot longer than using the old "analyze" statement.
For example, to collect statistics on one of our smaller schemas:
1. Analyze (10%) - 23 minutes
2. Gather_schema_stats (10%, degree 2) - 41 minutes
3. Gather_schem_stats (Anto_sample_size, degree 2) - 2 hours 51 minutes
Option 3 (auto_sample_size) is recommended by Oracle, but it takes more than 7 times that of the analyze statement. It also runs with degree 2 (and spawns up to 4 parallel processes) that consumes more resources when it runs. Even doing a sample of 10% takes almost twice as long while consuming a lot more
resources.

Our concern is that with this kind of performance, we wouldn't be able to gather whole schema statistics without impacting our online applications. It does consume significant amount of system resources.

I understand there is an option to gather only on tables with stale stats. However, we still need/want to gather all stats initially, and also once for a while. In your opinion, based on what you read, what can cause this?

December 10, 2003 - 3:19 am UTC

dbms_stats does things with sql (thats why you can parallelize it). it does things differently.

with gather stale, you would significantly (probably) reduce the amount of time it takes from day to day ( you would need to do an initial gather of stats on all objects -- but if you wanted, you could kick start that with analyze -- but after that you would never gather all stats on all objects )

Remeber reading a bug with dbms_stats on 9.2.0.3

Scott Watson, December 10, 2003 - 9:09 am UTC

I had to revert to analyze commands as dbms_stats in 9.2.0.3 was taking too long. After digging on Metalink for awhile I came across a bug logged against 9.2.0.3 for dbms_stats. If I remember correctly the performance problem was fixed in 9.2.0.4. However, I have not upgraded yet to test this.

December 10, 2003 - 3:38 pm UTC

i looked around -- there is a permanent change in dbms_stats in 9ir2 as to the way it gathers index stats (used to use analyze in 9ir1 and before, uses SQL in 9ir2 and after).

Bug noticed on which platform?

A reader, December 10, 2003 - 2:52 pm UTC

Hi Scott,
We are also planning to move from 8.1.6.3 to 9iR2 on AIX platform, and our scripts are using dbms_stats on Oracle 8.1.6.3. Could you tell on which platform you faced this bug?

December 10, 2003 - 4:18 pm UTC

(its not a bug)

Suggestion

A reader, December 10, 2003 - 8:14 pm UTC

Tom,
Thanks for confirming. It means we need not to worry about scripts written with dbms_stats after upgrade.
Thanks

December 11, 2003 - 5:27 am UTC

it means that a serial execution of dbms_stats may be penalized to the benefit of parallelized dbms_stats actually - more operations are parallelizable now.

Uh, right...

fred, December 11, 2003 - 2:14 am UTC

It means with the major changes, dbms_stats will run slower than an analyze.

Ouch!

A reader, December 11, 2003 - 3:35 pm UTC

In Oracle 8.1.6.3, I changed scripts to use dbms_stats from analyze, and one of the main reason was to use parallelism, being a large DW environments. Does 9.2.0.4 or greater use parallelism?

December 11, 2003 - 4:38 pm UTC

all dbms_stats versions do? or capable of doing so.

this just allows more operations to be done in parallel

Clarifications

Sanjaya Balasuriya, April 30, 2004 - 4:57 am UTC

Hi Tom,

I'm planing to schedule stats collection for a production database. 9i. This is an OLTP which have the load between 8.00 am to 7.00 pm. Peak load will be in between 1.00 pm to 5.00 pm. All the tables (912 tables) used in the system are in a single schema.

This my plan:

1. Enable monitoring on all the tables.

2. Run dbms_stats.gather_schema_stats with cascade => true at 1.00 am daily.

3. Run dbms_stats.gather_schema_stats to collect staled stats at every 30 minutes.

4. Run dbms_stats.gather_system_stats at each 30 minutes.

Any problems with plan ?

Thanks in advance.

April 30, 2004 - 7:49 am UTC

seems "overkill"

seems like

a) gather all table/index stats once
b) alter all tables to be monitoring
c) maybe once a day gather stale (never gather all again). Also, consider EXPORTING the stats to a table before you gather, so if something "goes awry" and you suspect "bad plans because of last nights gather", you can prove it by restoring yesterdays stats. If problem goes away, bad plans because of stats is true and we need to figure out "why".

system stats are something you generally gather less frequently, when the system is under typical load. Unless you radically change the execution mix of transactions or change hardware..... they should be relatively constant after the first couple of gathers.

About stats collection

Sanjaya Balasuriya, May 02, 2004 - 10:59 pm UTC

Hi Tom,

You say;

"so if something "goes awry"
and you suspect "bad plans because of last nights gather", you can prove it by
restoring yesterdays stats. If problem goes away, bad plans because of stats is
true and we need to figure out "why"."

How a database can be messed up by stats collection ?

Thanks in advance.

-Sanjaya

May 03, 2004 - 7:16 am UTC

stats drive the query plans.

query plans drive the performance of they query.

bad plan -> poor performance.

collecting stats can change plans, plans may become better, plans may become worse

Sir please give a solid example to convince in single attempt

cc, May 04, 2004 - 4:50 am UTC

Our DBA's are not ready to follow dbms_stats and they are asking for solid reason imean some examples, a good technical reason to use dbms_stats

Please provide the benefit of dbms_stats with examples

May 04, 2004 - 7:29 am UTC

if you have an existing system
and you are not adding anything to it
leave it be (but, you need not more than a part time dba at that point)

dbms_stats is more automat-able (it is plsql already, procedural)

it is parallelizable

it can gather stale stats only (saving many years of processing time) <<== perhaps the biggest reason

location?

Nasir, May 04, 2004 - 12:39 pm UTC

Why so many questions (12%) from India? BTW, Hyderabad, Mumbai are also in India.

Gathering system statistics

Arun Mathur, July 14, 2004 - 11:51 am UTC

Tom,

In your latest book, you have a section on gathering system statistics using the dbms_stats.gather_system_stats procedure. You demonstrate with examples how it can be used as an alternative to the OPTIMIZER_INDEX_COST_ADJ and OPTIMIZER_INDEX_CACHING parameters. I, along with the 9i database server I support, truly gained much from this section, and was curious to know if a similar type approach exists for Oracle versions running under 9i?

I look forward to reading your book. Also, I either owe you a round of beer or a Braves hat for each time you've set me straight. Thanks again.

Arun

July 14, 2004 - 12:14 pm UTC

system stats does not exist prior to 9ir2...

DBMS_STATS issue

Tapas, April 20, 2006 - 5:00 pm UTC

Hi Tom:

So, I decided on 10gR2 database to 
EXEC DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB'); to disable the automated job and use the GATHER_SCHEMA_STATS of my own and schedule it. Since all the tables are in "monitoring" mode in 10G, I thought to use GATHER STALE option with estimate percent 10.

Here is what I did:

1. EXEC DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB'); 
2. EXEC dbms_stats.delete_schema_stats('ADBASE'); 
   -- to delete the schema stats
3. BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS (
     ownname => 'ADBASE',
     estimate_percent => 10,
     block_sample => FALSE,
     method_opt => 'FOR ALL INDEXED COLUMNS SIZE 1',
     degree => NULL,
     granularity => 'DEFAULT',
     cascade => TRUE,
     stattab => 'STATTABLE',
     statid => 'ADBASE',
     options => 'GATHER STALE',
     statown => NULL,
     no_invalidate => FALSE,
     force => FALSE);
END;
/
4. Then I checked with the following query, but it is no rows:
SQL> select max(last_analyzed) from dba_tables where owner='ADBASE' ;

MAX(LAST_
---------

5. But if I change it to GATHER, then the above query shows that the stats are collected.

Am I doing something wrong?

Also, the above syntax give me the following error on Oracle 9iR2:

ERROR at line 2:
ORA-06550: line 2, column 4:
PLS-00306: wrong number or types of arguments in call to 'GATHER_SCHEMA_STATS'
ORA-06550: line 2, column 4:
PL/SQL: Statement ignored

Can you please advise, what am I doing wrong?
Thanks.

April 20, 2006 - 5:46 pm UTC

you didn't have stats in the first place, therefore they are not "stale".

the automatic job will gather on stale AND tables that haven't been done yet (which will let the stale tracking start...)

in 9i, there were fewer parameters. force for example did not exist back then.

DBMS_STATS issue

Tapas Mishra, April 21, 2006 - 1:27 pm UTC

Thanks Tom; So, is it good to do the following then (?):
1. Do a GATHER first time and only one time. This will initialize the stats. While doing this, 10 percent is good or calculating the entire thing is good, since it is one-time?

2. Once above is done, then swithc to the 10 percent and GATHER STALE approach and schedulethat job?

Thanks, I'm sorry, I should have checked the manual for 9iR2 before asking you the question, I just simply "assumed" that options are equal! And assumptions are always bad - you taught me :-)

Thanks again.
-tapas-

April 21, 2006 - 3:34 pm UTC

1) depends on the size of the segment. 10% or everything would in general "be good"

2) depends - say you have a 1,000,000 row table. You have a date column that is populated by sysdate. You had an id column that is populated by a sequence.

Today is 21-apr-2006, no data before this date exists. also, select max(id) from table returns 2,000,000.

That is what the current statistics say.

You add 50,000 rows over the next week (same number every day). Your queries are:

select * from t where dt >= to_date('21-apr-2006');
select * from t where id >= 2,000,000

optimizer things "very very few records"
reality says "50,000 records"

but the table isn't stale yet - that is one (obvious) "got you" with stale. Whether this is a problem for you or not is something only you can answer.

(and remember, you can use dbms_stats to SET the column stats - you need not gather every day to "correct" this!)

DBMS_STATS issue

Tapas Mishra, April 24, 2006 - 2:18 pm UTC

Thanks Tom for explaining with an example, and I certainly know that there will be 5/10 odd tables in my schema of 600+ tables, that will most likely go the route of the scenario you explained. They will be certainly having more rows enough to drive the execution plans differently while still not being "stale" per the 10 % rule, in a weeks period of time.

This could possibly be a case to "got you" for the staleness route. Considering this, and since I've time to gather stats on the week-ends, I'm planning to gather stats using GATHER (and not GAHTER STALE) on weekly basis.

Please see above I'm using the ALL INDEXED COLUMNS SIZE 1, to avoaid creating histograms and use it, since the app was not initially designed with this. Is this the correct route to do this, since all will be in one bucket and Oracle will not sue Histograms? And certaiinly, where needed I can set the stats using the SET stats procs.

Thanks.
Best regards,
-tapas-

April 24, 2006 - 2:20 pm UTC

if you don't want histograms - you don't want them (and won't be getting them).

stats

Tapas Mishra, April 24, 2006 - 4:24 pm UTC

No, No I understand what you are saying, all I wanted to know and just reveirfy since there is no options for "not creating histograms", the ALL INDEXED COLUMNS SIZE 1 is the right approach or not?

Yes, at this moment I do not want to create histograms, but if in future I do, then either I will gather them correctly or set them.

Please advise.
Thanks.
-tapas-

April 25, 2006 - 12:27 am UTC

size 1 is in effect "no histograms". you get the high/low values and number of values with that.

stats - "no histogram"

Tapas Mishra, April 25, 2006 - 11:30 am UTC

Thanks Tom for reverifying and explainin what goes in there when we use SIZE 1.
Regards,
-tapas-

online stats gathering in 12c

Rajeshwaran, Jeyabal, June 16, 2020 - 12:10 pm UTC

Team:

reading through this link about the restrictions on "online stats gathering" - but dont see any of those applied to this use case.

https://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#GUID-CDDB5A54-0991-4E68-A9D7-2305777B608B

could you please let us know - why online stats gathering doesn't kick in these cases? ( all these were run from 12.2 database)

(Is this due to the SYS owned object referenced in the ddl - causing the online stats gathering not applied here?)

demo@PDB1> create table t1
  2  nologging
  3  parallel 4
  4  as
  5  select a.*, rownum as id
  6  from all_objects a,
  7     all_users b ;

Table created.

demo@PDB1> select stale_stats,num_rows,blocks,last_analyzed
  2  from user_tab_statistics
  3  where table_name ='T1'     ;

STA   NUM_ROWS     BLOCKS LAST_ANALYZ
--- ---------- ---------- -----------


demo@PDB1> exec dbms_stats.gather_table_stats(user,'T1',degree=>4);

PL/SQL procedure successfully completed.

demo@PDB1> select stale_stats,num_rows,blocks,last_analyzed
  2  from user_tab_statistics
  3  where table_name ='T1';

STA   NUM_ROWS     BLOCKS LAST_ANALYZ
--- ---------- ---------- -----------
NO     3569124      75312 16-JUN-2020

demo@PDB1> drop table t2 purge;

Table dropped.

demo@PDB1> create table t2
  2  nologging
  3  parallel 4
  4  as
  5  select a.*, rownum as id
  6  from all_objects a;

Table created.

demo@PDB1> select stale_stats,num_rows,blocks,last_analyzed
  2  from user_tab_statistics
  3  where table_name ='T2'     ;

STA   NUM_ROWS     BLOCKS LAST_ANALYZ
--- ---------- ---------- -----------


demo@PDB1> create table t3
  2  nologging
  3  parallel 4
  4  as
  5  select a.*
  6  from all_objects a;

Table created.

demo@PDB1> select stale_stats,num_rows,blocks,last_analyzed
  2  from user_tab_statistics
  3  where table_name ='T3'     ;

STA   NUM_ROWS     BLOCKS LAST_ANALYZ
--- ---------- ---------- -----------


demo@PDB1>

June 17, 2020 - 8:22 am UTC

That looks like a 12.2 bug to me, because it was fine 12.1 and fine in 19...

C:\Users\hamcdc>sqlplus connor/connor@db121pdb1

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Jun 17 16:15:42 2020
Version 19.7.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Last Successful login time: Wed Jun 17 2020 16:14:49 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options


SQL> create table t1
  2      nologging
  3      parallel 4
  4      as
  5      select a.*, rownum as id
  6      from all_objects a,
  7         all_users b ;

Table created.

SQL>
SQL> select stale_stats,num_rows,blocks,last_analyzed
  2      from user_tab_statistics
  3      where table_name ='T1'     ;

STA   NUM_ROWS     BLOCKS LAST_ANAL
--- ---------- ---------- ---------
NO     3950980      69912 17-JUN-20

SQL>
SQL>



C:\Users\hamcdc>sqlplus connor/connor@db122pdb1


SQL*Plus: Release 19.0.0.0.0 - Production on Wed Jun 17 16:18:41 2020
Version 19.7.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Last Successful login time: Fri Jun 05 2020 17:17:30 +08:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production


SQL> create table t1
  2      nologging
  3      parallel 4
  4      as
  5      select a.*, rownum as id
  6      from all_objects a,
  7         all_users b ;

Table created.

SQL>
SQL> select stale_stats,num_rows,blocks,last_analyzed
  2      from user_tab_statistics
  3      where table_name ='T1'     ;

STA   NUM_ROWS     BLOCKS LAST_ANAL
--- ---------- ---------- ---------




C:\Users\hamcdc>sqlplus connor/connor@db19pdb1

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Jun 17 16:20:11 2020
Version 19.7.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Last Successful login time: Wed Jun 17 2020 14:13:53 +08:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.7.0.0.0



SQL> create table t1
  2      nologging
  3      parallel 4
  4      as
  5      select a.*, rownum as id
  6      from all_objects a,
  7         all_users b ;

Table created.

SQL>
SQL> select stale_stats,num_rows,blocks,last_analyzed
  2      from user_tab_statistics
  3      where table_name ='T1'     ;

STALE_S   NUM_ROWS     BLOCKS LAST_ANAL
------- ---------- ---------- ---------
NO         4173400      87288 17-JUN-20

SQL>

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Analyze Any system privilege

DBMS_STATS PACK VERY SLOW COMPARED TO OLD ''ANALYZE''

Remeber reading a bug with dbms_stats on 9.2.0.3

Bug noticed on which platform?

Suggestion

Uh, right...

Ouch!

Clarifications

About stats collection

Sir please give a solid example to convince in single attempt

location?

Gathering system statistics

DBMS_STATS issue

DBMS_STATS issue

DBMS_STATS issue

stats

stats - "no histogram"

online stats gathering in 12c

More to Explore

PL/SQL demos

PL/SQL docs

best method for statistics

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Analyze Any system privilege

DBMS_STATS PACK VERY SLOW COMPARED TO OLD ''ANALYZE''

Remeber reading a bug with dbms_stats on 9.2.0.3

Bug noticed on which platform?

Suggestion

Uh, right...

Ouch!

Clarifications

About stats collection

Sir please give a solid example to convince in single attempt

location?

Gathering system statistics

DBMS_STATS issue

DBMS_STATS issue

DBMS_STATS issue

stats

stats - "no histogram"

online stats gathering in 12c

More to Explore

PL/SQL demos

PL/SQL docs