Incomplete statistics?
Jayaraman Ashok, June 23, 2002 - 8:44 pm UTC
Hi Tom
You have replied that dbms_stats does not gather information on chained rows, unused data blocks, and average free space available in blocks. Does it mean that to get the above mentioned information we have to run the analyze command again? Also, you mention that dbms_stats gathers the statistics required by the CBO. Are there any special statistics gathered apart from the number of rows and the number of blocks occupied? If it is only those two, then I believe the analyze command already accomplishes it. The only feature that may be of use is the set_stats procedure, for tuning queries before using them in production.
Thanks & regards,
Ashok
June 23, 2002 - 9:32 pm UTC
dbms_stats only gathers things the CBO cares about. The CBO does not care about that information.
In the future, analyze will not gather CBO info, only these other things the CBO doesn't care about. DBMS_STATS currently gathers, and will always gather, only CBO info.
I think you need to reread the answer. dbms_stats gathers everything that the analyze command does but.... can do it in parallel, can do it for stale tables, can export/import stats, can analyze things analyze cannot (the system, for example) and so on. There are lots more stats than just num rows and num blocks -- and both commands collect them. There are things dbms_stats will do that analyze does not (especially on a partitioned table).
Recommendation: use dbms_stats.
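A minimal sketch of the switch, assuming a simple heap table T in the current schema (the table name is illustrative, not from the thread):

```sql
-- Old style: ANALYZE collects CBO stats plus chained rows, free space, etc.
analyze table t compute statistics;

-- Recommended: dbms_stats gathers only what the CBO uses, and can run in
-- parallel, restrict itself to stale tables, and export/import its results.
begin
    dbms_stats.gather_table_stats(
        ownname => user,
        tabname => 'T',
        cascade => true );   -- gather stats on T's indexes as well
end;
/
```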
Dave, June 24, 2002 - 9:43 am UTC
When I "CREATE INDEX my_index ... COMPUTE STATISTICS" is the analyze being performed by the DBMS_STATS method, or the ANALYZE INDEX method?
If by the latter method, am I likely to gain by running DBMS_STATS post-CREATE or post-REBUILD mostly for partitioned segments (ref. your comments on gathering global statistics), with "ANALYZE ..." still being a reasonable choice at the moment for non-partitioned segments?
June 24, 2002 - 9:48 am UTC
It is closer to the analyze method -- only because it is DDL (same results from analyze, dbms_stats, create index ultimately).
But since you only create an index once, it is good only for the initial build, after that -- you are back to dbms_stats.
It is a general recommendation to use dbms_stats, period. There will come a time when dbms_stats is the only way.
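So a typical sequence might be sketched as follows (table and index names are illustrative; syntax as in 8i/9i):

```sql
-- Initial build: COMPUTE STATISTICS piggybacks stats collection on the DDL,
-- so the index is born with valid statistics.
create index t_idx on t(object_id) compute statistics;

-- Thereafter, refresh via dbms_stats rather than re-running DDL or ANALYZE:
begin
    dbms_stats.gather_index_stats( ownname => user, indname => 'T_IDX' );
end;
/
```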
external table, index and dbms_stats
Suresh, June 24, 2002 - 10:39 am UTC
Tom,
You have mentioned that DBMS_STATS can analyze external tables. I tried to create an index on an external table and it came up with an error "not supported on external table".
My question is if dbms_stats could analyze external tables, why not allow an index on external table as well and analyze it with dbms_stats.
June 25, 2002 - 8:00 am UTC
you cannot index something that we have no control over.
Consider if we did. say the file started with:
1,a,hello
2,b,goodbye
and we indexed that (the first column). Now, you come along and replace that file with:
2,b,goodbye
1,a,hello
and issue: select * from external_table where column1 = 1;
what do you think would happen if
a) we used an index
b) we full scanned
I think we would get two answers. We cannot index a file that is changeable (and changeable by forces outside of our control)
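You can still gather optimizer statistics on an external table with dbms_stats, even though you cannot index it. A sketch, assuming an external table named EXT_T (the name is illustrative):

```sql
-- ANALYZE and CREATE INDEX both fail on external tables, but dbms_stats works.
-- Sampling is not supported on external tables, so estimate_percent is NULL
-- (a full read of the file each time the stats are gathered).
begin
    dbms_stats.gather_table_stats(
        ownname          => user,
        tabname          => 'EXT_T',
        estimate_percent => null );
end;
/
```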
Queries on Analyze
Sanjay Jha, June 24, 2002 - 4:24 pm UTC
Tom,
The ongoing discussion has generated a few more related queries for me. I am listing them below, for your crisp and clear answers:
1. What are the different packages for running analyze (eg. dbms_utility, dbms_stats, db_maintenance etc.) and what are the advantages of each?
2. What is the best way to analyze one particular table from the CBO's perspective (BTW, what are all the possible ways)?
3. If someone has to automate the analyze, what do you recommend and why - scheduling through cron or dba_jobs? (For that matter, what is your opinion on scheduling any job for the database - OS level or through Oracle jobs?)
4. This question is not related to analyze but to the Oracle supplied packages - where in a database can one get info about these packages (like v$ views etc.), or give me a url for reference please.
Thanks.
June 25, 2002 - 8:39 am UTC
1) dbms_stats is the one you should use. dbms_utility has some very very primitive functions left in there for backwards compatibility only.
2) for that, please read the ADMIN guide, it covers the methods and when to use what.
You know -- if there was a "best way", it would be the only way...
3) If you have my book -- you know my opinion on this. Do it in the database.
Cron -- can cron tell if the database is up and running? cron will try to analyze even when the database is not ready. DBMS_JOB is the right tool for this in my opinion (barring that, OEM would be the right place). I want all of my database related stuff going on in the database -- one place to look for issues (my alert log or OEM logs) and so on.
4) how about the supplied packages guide. Instead of a direct link, I'll teach you to fish.
goto otn.oracle.com
click on documentation
pick what database version you have
go after the supplied packages guide (in 8i and before, this was with the "database" set of docs. In 9i -- there is just one big list)
DBMS_STATS and statistics
Arun Gupta, May 02, 2003 - 7:10 pm UTC
Tom
I did the following:
a) Using dbms_stats, analyzed all the tables in schema X of database A and using dbms_stats.export_schema_stats, exported the statistics into stattab.
b) Extracted the table/index creation scripts for schema X, excluding the stattab table.
c) Using Oracle export (exp), exported the stattab table.
d) Using the scripts in b), I created all the tables and indexes in schema Y of database B. These tables do not contain even a single row of data.
e) Using Oracle import(imp), imported the stattab into schema Y of database B.
f) Using dbms_stats.import_schema_stats, I imported all the statistics from stattab.
I cannot see the statistics like num_rows, avg_row_len etc. in user_tables. How does the CBO read these statistics?
Thanks
May 02, 2003 - 8:11 pm UTC
dbms_stats was designed to move the stats for user A from db1 to user A in db2.
It is the X to Y that is messing you up. This demonstrates what you are seeing, but shows how to adapt it to work anyway:
ops$tkyte@ORA920LAP> drop user a cascade;
User dropped.
ops$tkyte@ORA920LAP> drop user b cascade;
User dropped.
ops$tkyte@ORA920LAP> grant dba to a identified by a;
Grant succeeded.
ops$tkyte@ORA920LAP> grant dba to b identified by b;
Grant succeeded.
ops$tkyte@ORA920LAP> @connect a/a
a@ORA920LAP> create table t as select * from all_objects;
Table created.
a@ORA920LAP> create index t_idx on t(object_id);
Index created.
a@ORA920LAP> analyze table t compute statistics
2 for table for all indexes for all indexed columns;
Table analyzed.
a@ORA920LAP> exec dbms_stats.create_stat_table( user, 'st' );
PL/SQL procedure successfully completed.
a@ORA920LAP> exec dbms_stats.export_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
a@ORA920LAP> !exp userid=a/a tables=st
Export: Release 9.2.0.3.0 - Production on Fri May 2 20:08:56 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Export done in WE8ISO8859P1 character set and AL16UTF16 NCHAR character set
About to export specified tables via Conventional Path ...
. . exporting table ST 78 rows exported
Export terminated successfully without warnings.
a@ORA920LAP> @connect b/b
b@ORA920LAP> create table t as select * from all_objects where 1=0;
Table created.
b@ORA920LAP> create index t_idx on t(object_id);
Index created.
b@ORA920LAP> !imp userid=b/b fromuser=a touser=b
Import: Release 9.2.0.3.0 - Production on Fri May 2 20:08:56 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to: Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
Export file created by EXPORT:V09.02.00 via conventional path
Warning: the objects were exported by A, not by you
import done in WE8ISO8859P1 character set and AL16UTF16 NCHAR character set
. . importing table "ST" 78 rows imported
Import terminated successfully without warnings.
b@ORA920LAP> exec dbms_stats.import_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
b@ORA920LAP> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
ST
T
<b>that is what you currently see but:</b>
b@ORA920LAP> update st
2 set c5 = USER
3 /
78 rows updated.
b@ORA920LAP> exec dbms_stats.import_schema_stats( user, 'st' );
PL/SQL procedure successfully completed.
b@ORA920LAP> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
ST
T 30557
b@ORA920LAP>
<b>that'll "rename the user" in the stats tab so it'll be looking for B.T, not A.T
</b>
It works
Arun Gupta, May 02, 2003 - 9:16 pm UTC
After updating c5 in stattab to new schema name, it works.
Thanks...!!
What about execution statistics
Arun Gupta, May 06, 2003 - 9:56 am UTC
Tom
I am a bit unclear about the purpose of importing statistics from one database to another. My questions are:
a) If I set a table with 0 rows with the statistics of a 1 million row table, can I run a query against this 0 row table and actually see the same buffer gets as I would see against a 1 million row table?
b) When the query actually hits a 1 million row table, can Oracle change the plan?
May 06, 2003 - 10:05 am UTC
a) no
b) not sure what you mean. if you mean "will the plan for a table with 900,000 rows be different than a plan for 1,000,000" -- the answer is "yes, it could be". If you mean "if the statistics say there are N rows but there are really M rows -- will Oracle change the plan after finding there are not N rows?" -- the answer is "no, once formulated -- the plan is the plan"
One use of the export/import stats trick is
a) you have a logical standby or some reporting instance that is a close copy of production (using replication, whatever)
b) you run stats on the copy. move them into production.
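One way to sketch that workflow (the schema name APP and stat table name STATTAB are illustrative):

```sql
-- On the reporting copy: gather stats, then export them to a regular table.
exec dbms_stats.gather_schema_stats( 'APP' );
exec dbms_stats.create_stat_table( 'APP', 'STATTAB' );
exec dbms_stats.export_schema_stats( 'APP', 'STATTAB' );

-- Move STATTAB to production with exp/imp (or a database link), then load
-- the statistics into the production dictionary:
exec dbms_stats.import_schema_stats( 'APP', 'STATTAB' );
```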
Problems in analyze
SUBBARAO., May 07, 2003 - 1:20 am UTC
There is much useful information on ANALYZE in this article. It is helpful. My scenario is like this; it looks strange, but here it is.
In my production system, we have some problems in getting some reports from the application.
For one report, the query has 8 tables, of which 2 have a huge number of records; both of these tables are not analyzed, while all the rest are analyzed. This report is not launching.
We have now analyzed the entire schema, and the report started working. The cost of the query came down drastically after the analyze. Some tables get different kinds of data and some get the same type of data. I am not sure how long this report will keep working or when I should analyze the tables. I have read about the table monitoring feature, and the document says that Oracle will automatically analyze a table if it requires analysis.
My questions:
1) If we enable this feature, what will be the side effects on performance? I know that if we want something we have to sacrifice a few things, but I would like to highlight these side effects to my boss.
2) If a query has 5 tables, of which 3 are not analyzed and 2 are analyzed, will the cost calculation be accurate? I mean, will the optimizer give a better plan?
Is it required that all the tables in the query be analyzed, or all left unanalyzed?
May 07, 2003 - 7:32 am UTC
1) monitoring is extremely low impact, not even measurable. YOU still have to analyze, you just use dbms_stats with the gather stale option. it'll collect stats only on tables you need.
2) of course the costing won't be accurate. You should have either all or none of the objects referred to by a query analyzed.
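The monitoring-plus-stale approach might be sketched like this (names illustrative):

```sql
-- Turn on monitoring so Oracle tracks DML activity per table:
alter table t monitoring;

-- Periodically re-gather stats only where enough of the table has changed:
begin
    dbms_stats.gather_schema_stats(
        ownname => user,
        options => 'GATHER STALE',
        cascade => true );
end;
/
```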
Excellent and Timely discussion....
A reader, May 07, 2003 - 10:22 am UTC
Hi Tom:
As usual, your answers allow us (DBAs) to have a good night's sleep.
On the issue of monitoring...
What criteria does oracle use to determine that the stats are stale ?
Can these thresholds be changed/set by the user ?
Are the stats in the user_tab_modifications cumulative ?
If I need to preserve the stats in the above view, can you please suggest a procedure/script as to how I go about saving these. I am interested in preserving the amount of dml activity by segment.
Thanks
Mohsin
May 07, 2003 - 10:46 am UTC
o when about 10% of the table is changed
o no
o yes
o you would have to "insert into some_other_table select sysdate, a.* from the modifications table A" before gathering stats. Make that part of your 'process'.
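That preservation step might look like this sketch (the history table name is illustrative); the idea is to snapshot the counters before each stats run, since the modification records for a table are cleared once its stats are re-gathered:

```sql
-- One-time setup: a history table shaped like the view, plus a timestamp.
create table tab_mod_history as
select sysdate snap_time, m.* from user_tab_modifications m where 1=0;

-- Before each gather-stats run:
insert into tab_mod_history
select sysdate, m.* from user_tab_modifications m;
commit;
```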
Excessive Recursive calls after stats
A reader, May 07, 2003 - 11:53 am UTC
Hi Tom:
I have observed that the statistics section of the SQL*Plus autotrace report shows an excessive recursive call count for queries against the tables just after the tables have been analyzed.
What does this mean and what are the side effects of stats gathering ? Has the stats gathering invalidated the parsed sql in the shared pool ?
Would this be a good way of implementing my own thresholds:
I'd turn on monitoring for the segments.
Collect the dml activity from user_tab_modifications into my own table, and then have my procedure re-compute the stats after, say, x% of change to the segment -- and not use the gather stale option of dbms_stats.
What is the timestamp column for? If the values are cumulative, then how does it determine the increment since the last stale-stats run?
Thanks.
May 07, 2003 - 1:45 pm UTC
hows about an example ?
Here it is.
A reader, May 07, 2003 - 2:42 pm UTC
09:11:53 SQL> connect scott/tiger
Connected.
09:16:14 SQL>
09:16:14 SQL>
09:16:14 SQL> analyze table junk compute statistics for table for all indexed columns for all indexes;
Table analyzed.
09:17:06 SQL> show user
USER is "SCOTT"
14:36:01 SQL> set autotrace traceonly explain statistics;
14:36:21 SQL>
14:36:23 SQL> select * from junk where owner='MMJ';
1 row selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=86)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'JUNK' (Cost=4 Card=1 Byt
es=86)
2 1 INDEX (RANGE SCAN) OF 'IDX_JUNK_OWNER' (NON-UNIQUE) (Cos
t=3 Card=1)
Statistics
----------------------------------------------------------
330 recursive calls
0 db block gets
73 consistent gets
3 physical reads
0 redo size
1114 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
14:36:37 SQL> /
1 row selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=86)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'JUNK' (Cost=4 Card=1 Byt
es=86)
2 1 INDEX (RANGE SCAN) OF 'IDX_JUNK_OWNER' (NON-UNIQUE) (Cos
t=3 Card=1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
1114 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
14:36:58 SQL>
May 08, 2003 - 9:17 am UTC
analyzing the table will invalidate open cursors on that object in the shared pool. the next parse against that object will be "hard". hard does lots more recursive sql than "soft"
It is not the analyze so much as "you did something to cause hard parsing to have to take place"
here is an example using the rbo and EMP:
ops$tkyte@ORA817DEV> select * from emp;
14 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
6 consistent gets
0 physical reads
0 redo size
1979 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
ops$tkyte@ORA817DEV> alter table emp add x number;
Table altered.
ops$tkyte@ORA817DEV> select * from emp;
14 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (FULL) OF 'EMP'
Statistics
----------------------------------------------------------
148 recursive calls
15 db block gets
26 consistent gets
0 physical reads
0 redo size
2042 bytes sent via SQL*Net to client
425 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
14 rows processed
Of course, SHUTDOWN/STARTUP would have the same effect.
normal, natural, expected and one time -- you hard parse once and soft parse after that.
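One way to watch the invalidation happen is to look at the cursor in the shared pool before and after the DDL or analyze; a sketch (requires SELECT access on V$SQL):

```sql
select sql_text, loads, invalidations, parse_calls
  from v$sql
 where sql_text like 'select * from emp%';
-- After the ALTER TABLE (or ANALYZE), INVALIDATIONS increments, and the
-- next execution of the statement reloads it -- a hard parse, with its
-- attendant burst of recursive SQL.
```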
Some more info....
A reader, May 07, 2003 - 3:32 pm UTC
Hi Tom.
Here is some more info that might help you.
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
The junk table is a mirror of dba_objects view.
And there was no activity in the d/b between the time 9:16 and 14:36 in my sql+ session results I posted earlier.
Please let me know if you need more info.
Thanks
Mohsin
effect of partial analyze
kumar, May 08, 2003 - 7:26 am UTC
Tom,
You have mentioned in one of your replies that we should not analyze some of the objects used in a query while leaving the rest un-analyzed. I have read similar statements in lots of places, but I could not really understand why. Can you please explain why it is so dangerous to do so?
May 08, 2003 - 9:57 am UTC
analogy time.
You are a manager now.
I give you a team of 4 people.
I tell you all about 2 of them -- their capabilities, experience, level of comprehension, lots of stuff.
I tell you NOTHING about the other two.
I give you a problem to solve using these 4 people. You need to make assignments based on your knowledge of them. You have good knowledge of 2, no knowledge of the other 2. You look at them and size them up. Maybe you don't like the way the 4th guy dressed -- he "looks dumb" to you.
You dole out responsibilities -- not giving much hard stuff to #4. Unfortunately for you, he was the smartest in the group, and if he had been given 99% of the work -- it would have been a success. As it was, due to inefficient resource allocation -- you never finished your project!
So it is with statistics. You give the optimizer only a little information, it has to "make up the rest" and can come to some very bad decisions as to the correct approach.
Hat's off to you...
A reader, May 08, 2003 - 11:20 am UTC
Hi Tom:
Excellent analogy about the CBO....
I will confirm your point, but I am positive nothing was happening on my d/b to have caused invalidation of my cursors resulting in hard parse. I conducted the test on my laptop, so I know what was happening.
Could the Oracle version have made the difference in this case?
I was using Oracle 9i Rel 2, and from your SQL+ prompt it seems you were using 8i.
Thanks
Mohsin
Ahhh the language barrier strikes again...
A reader, May 08, 2003 - 11:27 am UTC
Sorry Tom:
I read and re-read your response. I get it.
Once in a while in between Swahili, Gujarati, Katchi, Urdu, Hindi and a sprinkling of Arabic I forget my English...
As they say in Swahili - Hakuna Matata (No Problem)
Kwaheri.
Thanks again.
Analogy worth more than 5 stars!!!!
A reader, May 08, 2003 - 11:53 am UTC
Both Analyze and DBMS_STATS?
Dan Jatnieks, May 16, 2003 - 7:41 pm UTC
In your original response you stated there are some statistics that DBMS_STATS does not collect:
"...but not the number of chained rows, average free
space, or number of unused data blocks"
I would like to switch to using DBMS_STATS for CBO statistics on my 9i database, but I also want to collect the statistics that DBMS_STATS does not; I have monitoring software that collects and uses chained rows, avg free space and unused data blocks.
Can I use Analyze first and then DBMS_STATS (in that order) to accomplish this?
The reasoning would be that first using Analyze will collect the data that DBMS_STATS does not; then DBMS_STATS will overwrite the CBO statistics that Analyze collected, but leave alone the other Analyze data.
Will that work?
May 17, 2003 - 10:02 am UTC
it would work, but it would be like using two cars to go to work -- it would be double the work in most cases (partitioned tables being the exception).
can the monitoring software use "list chained rows" instead, and can it use dbms_space to get the other info?
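Those two alternatives might be sketched like this (table name T is illustrative; the utlchain.sql script ships with the database and creates the CHAINED_ROWS table):

```sql
-- Chained/migrated rows without a full ANALYZE ... STATISTICS:
-- run @?/rdbms/admin/utlchain.sql once to create CHAINED_ROWS first.
analyze table t list chained rows into chained_rows;
select count(*) from chained_rows where table_name = 'T';

-- Unused-block information via dbms_space instead of ANALYZE:
declare
    l_total_blocks   number;
    l_total_bytes    number;
    l_unused_blocks  number;
    l_unused_bytes   number;
    l_last_file_id   number;  -- last used extent file id
    l_last_block_id  number;  -- last used extent block id
    l_last_block     number;  -- last used block
begin
    dbms_space.unused_space( user, 'T', 'TABLE',
        l_total_blocks, l_total_bytes,
        l_unused_blocks, l_unused_bytes,
        l_last_file_id, l_last_block_id, l_last_block );
    dbms_output.put_line( 'unused blocks: ' || l_unused_blocks );
end;
/
```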
Statistics for Partitioned tables
Nathan, February 05, 2004 - 10:08 am UTC
Dear Tom,
Your answers have been comprehensive and helpful all the time, just another quick one ( hopefully) from me.
When we set the estimate_percent parameter to 0.00001 in DBMS_STATS.GATHER_TABLE_STATS, does it mean only that percentage of the blocks is scanned for the analysis? Our gather_stats with estimate takes hours to complete. My senior DBA is skeptical of the percentage option and reckons it could be a bug; he concluded this by watching V$SESSION_LONGOPS (the same number of blocks were scanned). I have not checked it yet, but apparently the percentage does work on non-partitioned tables. Is this true? Is there a bug in the Oracle 9i Release 1 software?
Thanks and Regards
Vish
February 05, 2004 - 7:18 pm UTC
hows about you share the details -- such as the entire, full command you are using plus the type of table you are doing this to and some facts about the table itself.
0.00001 would be a "not so good number" to use, pretty darn small it would be.
Analyze and DBMS_STATS
chandu singh, February 06, 2004 - 12:28 am UTC
Hi
I think it is best to have a minimum percentage of 10%. Is that right? If not, please explain.
February 07, 2004 - 12:10 pm UTC
why -- whats the reasoning (not that you are wrong, but -- why do you think that :)
it could be 1%
it could be 10%
it could be 100%
it just depends on "how evenly distributed the data is in general". If the data is large-ish AND pretty much evenly distributed, or you are not computing histograms -- 1% might be dandy.
Me, I personally like to compute whenever possible. Only when it gets really too large do I estimate -- and then estimate as high as you can (realizing of course that anything over 49% is really compute in disguise).
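In dbms_stats terms, the two ends of that spectrum look like this (the table name is illustrative):

```sql
-- Compute: read every row (a NULL estimate_percent means 100%):
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => NULL );

-- Estimate: sample roughly 10% of the rows:
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => 10 );
```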
DBMS_STATS.AUTO_SAMPLE_SIZE Issues
RyuO, February 06, 2004 - 12:14 pm UTC
My experience was that DBMS_STATS is wonderful, but requires a lot of wrapping. Here is the wrapper proc I evolved:
CREATE OR REPLACE PROCEDURE analyze_schema(
p_schema_name IN VARCHAR2 := USER,
p_Estimate_Pct IN PLS_INTEGER := NULL) IS
-- like DBMS_UTILITY.ANALYZE_SCHEMA, but using DBMS_STATS
c_proc_name CONSTANT VARCHAR2(128) := UPPER('analyze_schema');
c_Def_Estimate_Pct CONSTANT PLS_INTEGER := 35;
c_Min_Degree CONSTANT PLS_INTEGER := 1;
c_Def_Degree CONSTANT PLS_INTEGER := LEAST(c_Min_Degree,DBMS_STATS.DEFAULT_DEGREE);
-- default degree is somehow 32767 at the moment
c_Def_Options CONSTANT VARCHAR2(30) := 'GATHER AUTO';
-- 'GATHER' means analyze all objects.
-- 'GATHER AUTO' means only objects that have changed.
v_Estimate_Pct PLS_INTEGER;
-- v_Estimate_Pct can also be set to:
-- DBMS_STATS.AUTO_SAMPLE_SIZE (Oracle sets this, but it has been known to be 0...)
-- NULL (which means, look at everything, e.g., 'COMPUTE')
v_schema_name ALL_TABLES.OWNER%TYPE := NVL(UPPER(p_schema_name),USER);
v_Msg_Text VARCHAR2(2000);
BEGIN
CASE -- searched CASE: a simple "CASE p_Estimate_Pct WHEN NULL" never matches, since NULL = NULL is not true
WHEN p_Estimate_Pct IS NULL THEN
v_Estimate_Pct := 100;
WHEN p_Estimate_Pct = 100 THEN
v_Estimate_Pct := 100;
ELSE
IF DBMS_STATS.AUTO_SAMPLE_SIZE = 0 THEN
IF p_Estimate_Pct BETWEEN 1 AND 99 THEN
v_Estimate_Pct := p_Estimate_Pct;
ELSE
v_Estimate_Pct := c_Def_Estimate_Pct;
END IF;
ELSE
v_Estimate_Pct := DBMS_STATS.AUTO_SAMPLE_SIZE;
END IF;
END CASE;
DBMS_STATS.GATHER_SCHEMA_STATS(
ownname => v_schema_name,
estimate_percent => v_Estimate_Pct,
cascade => TRUE,
degree => c_Def_Degree,
options => c_Def_Options);
-- "cascade" means analyze indexes too.
v_Msg_Text := c_proc_name||'-> Schema['||v_schema_name||'], Option['||c_Def_Options||'], Sampling['||v_Estimate_Pct||'%]';
DBMS_OUTPUT.PUT_LINE(v_Msg_Text);
EXCEPTION
WHEN OTHERS THEN
v_Msg_Text := c_proc_name||'-> EXCEPTION: '||SQLERRM;
RAISE_APPLICATION_ERROR(-20199,v_Msg_Text);
END analyze_schema;
My questions are related to AUTO_SAMPLE_SIZE:
1. How come it is always 0? That's what I always see, generally in 9.2.0.3 and 9.2.0.4 on Solaris.
2. What should the default estimate percentage be, if DBMS_STATS can't tell me? Anecdotal evidence suggests that the ANALYZE/ESTIMATE percentage should be 35-40%, but then ANALYZE and DBMS_STATS are different...
February 07, 2004 - 1:34 pm UTC
requires....
OR "permits".
it requires NO WRAPPING.
It quite handily permits it easily.
you'll have to provide me a reproducible, small, simple, yet 100% complete test case to demonstrate what you mean by "0"
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user,'BIG_TABLE',estimate_percent => dbms_stats.auto_sample_size );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select sample_size from user_tables where table_name = 'BIG_TABLE';
SAMPLE_SIZE
-----------
100000
(which also shows "does not require", but would "permit" wrapping.... )
understanding sample_percent
A reader, February 25, 2004 - 3:55 pm UTC
Oracle 9.2.0.2
I observed that when estimate_percent is specified, gather_schema_stats is still doing a compute statistics: sample_size = num_rows. Am I incorrect in concluding that it's ignoring the 25% and estimating on 100% of the rows? If so, why?
OR have I overlooked at any setting ?
Please look at the following Case :
------- # of rows from T --------
SQL> select count(*) from T
COUNT(*)
--------
600980
---- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME MON NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T YES 600980 600980 15-39-16
------ inserting data ----------
SQL> insert into T select * from T
600980 rows created.
SQL> commit;
Commit complete.
--------- row count from T ---------
SQL> select count(*) from T;
COUNT(*)
----------
1201960
-------- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME MON NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T YES 600980 600980 15-39-16
---------- gather_schema_stats ---------
begin
dbms_stats.gather_schema_stats(
ownname => 'TEST',
estimate_percent => 25,
method_opt => 'FOR ALL INDEXED COLUMNS SIZE 1',
degree => 2,
cascade => FALSE,
options => 'GATHER AUTO'
);
end;
/
PL/SQL procedure successfully completed.
-------- Querying the dba_tables ---------
SQL> select table_name, monitoring, num_rows, sample_size, to_char(last_analyzed, 'HH24-mi-ss') ANALYZED from dba_tables;
TABLE_NAME MON NUM_ROWS SAMPLE_SIZE ANALYZED
--------------------- --- ---------- ----------- --------
T YES 1201960 1201960 15-48-45
-----------------------------------
February 25, 2004 - 7:13 pm UTC
straight from the docs here:
GATHER AUTO: Gathers all necessary statistics automatically. Oracle implicitly determines which objects need new statistics, and determines how to gather those statistics. When GATHER AUTO is specified, the only additional valid parameters are ownname, stattab, statid, objlist and statown; all other parameter settings are ignored. Returns a list of processed objects.
got it ... but
A reader, February 26, 2004 - 3:38 pm UTC
Tom, I understand it better now.
I took out the invalid parameters.
But my question still lingers :
Why is gather_schema_stats analyzing 100% of the rows? I changed only 10% (50K rows) out of 450K rows.
My dbms_job wakes up every few days to execute gather_schema_stats. Now, if I have 10-15 tables which have changed by more than 10%, then analyzing these tables 100% (compute) will take a lot of time.
I know Oracle automatically calculates what percentage to analyze. But why is it analyzing 100% of the rows every time? Why is it not estimating?
=========
begin
dbms_stats.gather_schema_stats(
ownname => 'TEST',
options => 'GATHER AUTO'
);
end;
/
==============
February 26, 2004 - 4:24 pm UTC
it does not take that long to analyze 1,000,000 rows.
gather stale does not re-analyze just "changed rows"
gather stale re-analyzes the entire object.
If you don't want AUTO, don't use AUTO, use gather stale and fill in the remaining blanks (but for such small tables, i would get it all)
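Filling in those blanks for GATHER STALE might look like this sketch (the values are illustrative, carried over from the earlier attempt in this thread):

```sql
-- GATHER STALE honors the other parameters that GATHER AUTO ignores:
begin
    dbms_stats.gather_schema_stats(
        ownname          => 'TEST',
        options          => 'GATHER STALE',
        estimate_percent => 25,
        method_opt       => 'FOR ALL INDEXED COLUMNS SIZE 1',
        degree           => 2,
        cascade          => true );
end;
/
```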
hmmm
A reader, February 27, 2004 - 2:52 pm UTC
Hi Tom,
the thing I am fighting is that I like to use "stale" because I can specify the percentage to analyze and take advantage of the degree setting. AUTO is good because it automatically analyzes newly added tables.
Best of both worlds would be :
1) analyze newly added tables -> AUTO does this. STALE doesn't.
2) analyze only percentage of rows, not 100% -> AUTO is analyzing 100%. STALE can do percentage.
Is it possible ?
Thanks as always
February 27, 2004 - 3:07 pm UTC
sounds like you might want:
a) alter schema tab monitoring
b) gather empty
c) gather stale
for your degree of control. consider:
ops$tkyte@ORA920PC> create table t1 as select * from all_users;
Table created.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec dbms_stats.alter_schema_tab_monitoring
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather empty' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 45
<b>that picked it up, this</b>
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 45
<b>did nothing..</b>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create table t2 as select * from all_users;
Table created.
ops$tkyte@ORA920PC> insert into t1 select * from t1;
45 rows created.
ops$tkyte@ORA920PC> insert into t1 select * from t1;
90 rows created.
ops$tkyte@ORA920PC> insert into t1 select * from t1;
180 rows created.
ops$tkyte@ORA920PC> commit;
Commit complete.
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> exec dbms_stats.alter_schema_tab_monitoring
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 45
T2
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather empty' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 45
T2 45
<b>that got t2 and...</b>
ops$tkyte@ORA920PC> exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA920PC> select table_name, num_rows from user_tables;
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 360
T2 45
ops$tkyte@ORA920PC>
<b>that got t1...</b>
perfecto
A reader, February 27, 2004 - 10:56 pm UTC
pretty GOOD !
Thank you much.
Have a good weekend.
gather_schema_stats('SYS');
A reader, April 27, 2004 - 9:46 am UTC
Tom,
Please let us know, when we need to do the following
execute dbms_stats.gather_schema_stats('SYS');
at what point of time, do you recommend us executing the above sql
Thanks a lot
April 28, 2004 - 12:40 pm UTC
in your test system before going to production, to ensure you are not going to mess up an existing system.
in 10g, it'll be done for you out of the box.
in 9i, it is "ok and safe in most cases to do" -- but TEST it, don't just DO IT
in 8i and before, I would not do it, not necessary and probably would lead only to troubles. but again, you would TEST it of course.
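(For reference, the call being asked about -- in 9i, run it on a TEST system first, exactly as above. The cascade parameter is my addition here; index statistics are not gathered by default.)

```sql
-- 9iR2: supported, but verify your plans on a TEST system before production
exec dbms_stats.gather_schema_stats( ownname => 'SYS', cascade => true );
```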
Confusing situation
Bruno Di Rei Araujo, May 14, 2004 - 2:21 pm UTC
Tom,
I've read your advice about using DBMS_STATS instead of the ANALYZE statement. But I had an experience that left me confused: with ANALYZE, the cost was higher, but the response time was the best; with DBMS_STATS, tiny costs, but 12 times the response time (comparing TKPROF results). Is there any reason for that situation?
(TKPROF of ANALYZEd query)
********************************************************************************
SELECT a.seq_mov, a.empresa, a.filial, a.LOCAL, a.produto, a.qtde_mov,
a.vlr_tot_mov, a.num_docto, a.cod_oper, a.dt_mov, a.historico,
a.lote_cont, a.lcto_direto, a.disp_trigger, a.dt_registro,
a.qtde_val,
b.entrada_saida
FROM ce_movest a, ce_operacoes b
WHERE (:b5 IS NULL OR a.empresa = :b5)
AND (:b4 IS NULL OR a.filial = :b4)
AND (:b3 IS NULL OR a.produto = :b3)
AND (:b2 IS NULL OR a.dt_mov >= :b2)
AND (:b1 IS NULL OR a.dt_mov <= (TO_DATE (:b1) + .99999))
AND b.empresa = a.empresa
AND b.cod_oper = a.cod_oper
order by a.empresa, a.filial, a.produto, a.dt_mov, b.ajuste_estoque,
b.entrada_saida /*decode(b.entrada_saida, 'E', decode(b.transferencia, 'S', 3, 1), 2)*/,
a.local, a.qtde_mov
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 2.27 5.99 55 19061 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 2.27 5.99 55 19061 0 4
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 47
Rows Row Source Operation
------- ---------------------------------------------------
4 SORT ORDER BY (cr=19061 r=55 w=0 time=5997116 us)
4 HASH JOIN (cr=19061 r=55 w=0 time=5996843 us)
103 INDEX FULL SCAN CE_OPERACOES_AK1 (cr=1 r=0 w=0 time=288 us)(object id 553481)
4 TABLE ACCESS FULL CE_MOVEST (cr=19060 r=55 w=0 time=5992324 us)
********************************************************************************
(TKPROF of DBMS_STATSed)
********************************************************************************
SELECT a.seq_mov, a.empresa, a.filial, a.LOCAL, a.produto, a.qtde_mov,
a.vlr_tot_mov, a.num_docto, a.cod_oper, a.dt_mov, a.historico,
a.lote_cont, a.lcto_direto, a.disp_trigger, a.dt_registro,
a.qtde_val,
b.entrada_saida
FROM ce_movest a, ce_operacoes b
WHERE (:b5 IS NULL OR a.empresa = :b5)
AND (:b4 IS NULL OR a.filial = :b4)
AND (:b3 IS NULL OR a.produto = :b3)
AND (:b2 IS NULL OR a.dt_mov >= :b2)
AND (:b1 IS NULL OR a.dt_mov <= (TO_DATE (:b1) + .99999))
AND b.empresa = a.empresa
AND b.cod_oper = a.cod_oper
order by a.empresa, a.filial, a.produto, a.dt_mov, b.ajuste_estoque,
b.entrada_saida /*decode(b.entrada_saida, 'E', decode(b.transferencia, 'S', 3, 1), 2)*/,
a.local, a.qtde_mov
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 13.03 74.99 0 104482 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 13.03 75.00 0 104482 0 2
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 47
Rows Row Source Operation
------- ---------------------------------------------------
2 SORT ORDER BY (cr=104482 r=0 w=0 time=74999502 us)
2 TABLE ACCESS BY INDEX ROWID CE_MOVEST (cr=104482 r=0 w=0 time=74999276 us)
1276536 NESTED LOOPS (cr=4901 r=0 w=0 time=29219344 us)
103 TABLE ACCESS FULL CE_OPERACOES (cr=3 r=0 w=0 time=728 us)
1276432 INDEX RANGE SCAN CE_MOVEST_FK3 (cr=4898 r=0 w=0 time=10600999 us)(object id 29012)
********************************************************************************
As we can see, the query took 75 seconds when the tables were gathered with DBMS_STATS; when ANALYZEd, it took about 6 seconds... Is there a reason for that?
Thanks,
Bruno
May 15, 2004 - 12:14 pm UTC
umm, exact commands used please?
Related questions for a data load(9.2.0.4.0)
Matt, May 16, 2004 - 9:39 pm UTC
I have a third-party system that initially has no data. I need to load a lot of data into this system. The product provides a number of interface tables into which I can load my data. I then initiate a third-party-supplied procedure that will pick up the data in the interface tables, validate it and "internalize" it.
I want the load to run efficiently. I believe my best bet is to re-generate stats (on the product's internal tables) whilst the load is in progress (a 10-20 hour process). Here are my questions:
Will open cursors necessarily be reparsed when the stats have been generated?
I plan to also generate system stats (io, cpu etc). I am also wondering about generating stats on 'SYS' and 'SYSTEM'. Is this recommended on 9.2.0.4.0?
Many Thanks.
May 17, 2004 - 7:19 am UTC
"Will open cursors necessarily be reparsed when the stats have been generated?"
if not, why bother generating stats at all -- they will have their entire working set already cached in memory (in the shared pool). if we did not invalidate the cursors, the stats would "not be used". dbms_stats does have an invalidate flag you can use to control that --- but, I would ask the vendor for their best practice with their product (make sure they support the cbo first and foremost)
as long as you always do gather stats on sys and system, it is definitely a supported option in 9ir2 -- in 10g, it will be "the only way" in fact.
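(The "invalidate flag" referred to above is the no_invalidate parameter of the gather procedures; a sketch, with T as a placeholder table name:)

```sql
begin
  -- no_invalidate => false invalidates dependent cached cursors,
  -- so their next execution is re-parsed against the fresh statistics
  dbms_stats.gather_table_stats(
      ownname       => user,
      tabname       => 'T',
      cascade       => true,
      no_invalidate => false );
end;
/
```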
What is the difference between the following options..
Matt, May 28, 2004 - 10:20 am UTC
to dbms_stats.gather_schema_stats:
method_opt => 'FOR ALL INDEXED COLUMNS'
and
method_opt => 'FOR ALL COLUMNS SIZE AUTO' ?
I just drastically improved the performance of a query by using the first. The first appears to use different bucket sizes and generates more stats. Can you please comment on what should be "best practice" as far as generating stats is concerned?
Many Thanks.
I've noted the following difference in stats:
Name Null? Type
----------------------------------------- -------- ---------------
VMETERID NOT NULL NUMBER(38)
CUI_TID NOT NULL NUMBER(38)
INTERVAL_ENDDT NOT NULL DATE
INTERVAL_STARTDT NOT NULL DATE
VALUE FLOAT(126)
READ_TYPE_FLAG VARCHAR2(5)
TRANS_TID NUMBER(38)
TRANS_UID NUMBER(38)
TRANS_STARTTS DATE
method_opt => 'FOR ALL INDEXED COLUMNS':
=======================================
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY LAST_ANALYZED
------------------------------ ------------ ---------- ----------- ---------------- -------------------
CUI_TID 1 0 1 1.0000000000 05/28/2004 23:03:14
INTERVAL_ENDDT 796 0 1 .0012562814 05/28/2004 23:03:14
INTERVAL_STARTDT 792 0 1 .0012626263 05/28/2004 23:03:14
READ_TYPE_FLAG 3 0 1 .3333333333 05/28/2004 23:03:14
TRANS_STARTTS 7894 0 1 .0001266785 05/28/2004 23:03:14
TRANS_TID 1 0 1 1.0000000000 05/28/2004 23:03:14
TRANS_UID 1 0 1 1.0000000000 05/28/2004 23:03:14
VALUE 57746 0 1 .0000173172 05/28/2004 23:03:14
VMETERID 487822 0 1 .0000020499 05/28/2004 23:03:14
method_opt => 'FOR ALL COLUMNS SIZE AUTO':
=========================================
INDEXES FOR TABLE: VMC_DATA
================================================
INDEX_NAME COLUMN_NAME COLUMN_POSITION
------------------------------ ------------------------------ ---------------
PKVMC_DATA VMETERID 1
CUI_TID 2
INTERVAL_STARTDT 3
TABLE STATISTICS FOR : VMC_DATA
===================================================
NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
---------- ---------- ----------- -------------------
4680488 35452 100 05/28/2004 22:27:10
COLUMN STATISTICS FOR : VMC_DATA
====================================================
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY LAST_ANALYZED
------------------------------ ------------ ---------- ----------- ---------------- -------------------
CUI_TID 1 0 1 1.0000000000 05/28/2004 22:27:10
INTERVAL_STARTDT 792 0 68 .0012626263 05/28/2004 22:27:10
VMETERID 487489 0 75 .0000031279 05/28/2004 22:27:10
Confusing Situation (Continued)
Bruno Di Rei Araujo, June 05, 2004 - 1:16 pm UTC
Tom,
As you asked for (from my Review above), the exact commands were:
DBMS Stats
----------
exec dbms_stats.delete_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.delete_table_stats('SAGE', 'CE_OPERACOES');
exec dbms_stats.gather_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.gather_table_stats('SAGE', 'CE_OPERACOES');
ANALYZE
-------
exec dbms_stats.delete_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.delete_table_stats('SAGE', 'CE_OPERACOES');
analyze table ce_movest delete statistics;
analyze table ce_operacoes delete statistics;
analyze table ce_movest estimate statistics for table for all indexes for all indexed columns;
analyze table ce_operacoes estimate statistics for table for all indexes for all indexed columns;
Note that my approach was, first, remove all relevant statistics for the method used and then gather them brand new.
Just can't understand why is it so different... Can you see any reason for it?
Thanks again,
Bruno.
June 05, 2004 - 1:40 pm UTC
add
cascade => true
to the dbms_stats and retry (so you compare apples to apples). you have no index stats.
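(That is, the dbms_stats side of the test would become something like:)

```sql
exec dbms_stats.delete_table_stats('SAGE', 'CE_MOVEST');
exec dbms_stats.delete_table_stats('SAGE', 'CE_OPERACOES');
-- cascade => true gathers index statistics too, matching the
-- "for all indexes" clause of the analyze commands
exec dbms_stats.gather_table_stats('SAGE', 'CE_MOVEST', cascade => true);
exec dbms_stats.gather_table_stats('SAGE', 'CE_OPERACOES', cascade => true);
```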
Statistics
Aru, September 21, 2004 - 2:34 am UTC
Hi Tom,
Please can you clarify in what circumstances and why would you use the 3 different(listed below ) methods to analyze tables,
1). a@ORA920LAP> analyze table t compute statistics
for table for all indexes for all indexed columns;
2). a@ORA920LAP> analyze table t compute statistics;
3). dbms_stats(....
I,m using oracle8i.
Q1) Why would I need to compute stats for
all indexed columns when the whole table is getting
analyzed.
Q2) Does dbms_stats analyze the table, it's indexes and the indexed columns too or what exactly does it do.If I specify a schema then does it to all the above mentioned statistics.
Thanks lots as always Tom,
Regards,
Aru.
September 21, 2004 - 7:43 am UTC
1) i would not (do not) -- i have old examples that do but promise never to do it in an example again
2) see #1
3) yes.
q1) because you wanted to have histograms for all indexed columns (not saying you do, but if you did "for all indexed columns" that is the outcome -- you have histograms for indexed columns)
q2) see the documentation -- it'll tell you that method_opt defaults to 'for all columns size 1' and cascade (to indexes) defaults to FALSE, so you would have to pass TRUE to get indexes done.
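(Spelled out, those documented defaults amount to something like:)

```sql
-- equivalent to calling gather_table_stats with the defaults:
-- base column stats only (size 1 = no histograms), no index stats
exec dbms_stats.gather_table_stats( user, 'T', -
  method_opt => 'for all columns size 1', cascade => false );
```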
cascade=true -- > local indexes and global indexes?
Pravesh Karthik from India, January 02, 2005 - 7:09 am UTC
Tom,
For a partitioned table, if i say cascade=true. It will collect stats on all local indexes and global indexes?
If yes,i dont need to use GATHER_INDEX_STATS - right?
Please confirm.
Thanks a lot
Pravesh Karthik
January 02, 2005 - 11:05 am UTC
depends on your granularity -- but left to the default,
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
will get table, table partition, index, index partition statistics.
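(If you want to be explicit about granularity rather than rely on the default, something like the following -- parameter values per the 9i documentation:)

```sql
-- 'ALL' explicitly requests global, partition and subpartition
-- statistics; the default granularity covers global plus partition
exec dbms_stats.gather_table_stats( user, 'T', -
  granularity => 'ALL', cascade => true );
```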
if yes -- then index statistics in which mode?
Pravesh Karthik from India, January 02, 2005 - 7:51 am UTC
January 02, 2005 - 11:14 am UTC
that is a broad sweeping generalization that is not really accurate (or well stated -- "... to COMPUTE index statistics in ESTIMATED mode..." -- bummer. You either COMPUTE or you ESTIMATE)
and to say the CBO is almost entirely based on index statistics -- well, I sent an email to the author of the note asking them to remove that paragraph.
awesome !!
Pravesh Karthik from India, January 02, 2005 - 11:37 am UTC
chong, January 03, 2005 - 2:49 am UTC
Hi Tom,
1)exec dbms_stats.gather_database_stats(method_opt=>'FOR ALL INDEXED COLUMNS',degree=>7,cascade=>true);
2)exec dbms_stats.gather_database_stats(method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO',degree=>7,cascade=>true);
Will the first command collect histograms? If yes, what is the difference between the first and the second command?
thanks
January 03, 2005 - 8:45 am UTC
see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:5792247321358#7417227783861
they are very different. size auto sort of needs a historical workload to peek at (queries) as well -- that is part of its algorithm.
consider:
ops$tkyte@ORA9IR2> create table t as select * from all_objects;
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx1 on t(object_id);
Index created.
ops$tkyte@ORA9IR2> create index t_idx2 on t(owner);
Index created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns', cascade => true );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 76
OWNER 22
<b>histograms collected for both...</b>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size auto', cascade => true );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 2
OWNER 2
<b>no histograms -- just hi/lo values and such. there were no queries that ever referenced either column so histograms were not collected</b>
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size skewonly', cascade => true );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'T' group by column_name;
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 2
OWNER 22
<b>it looked at the columns and said "object id, unique -- no histograms needed there", owner -- not so unique and pretty skewed, lets get them</b>
But see that link, the meanings are defined there.
chong, January 03, 2005 - 10:26 pm UTC
Hi Tom,
Based on the link, size auto means:
It'll collect histograms in memory only for those columns which are used by your applications (those columns appearing in a predicate involving an equality, range, or like operators). we know that a particular column was used by an application because at parse time, we'll store workload information in SGA. Then we'll store histograms in the data dictionary only if it has skewed data (and it worthy of a histogram)
What I understand from the above definition is (please correct me if I'm wrong): Oracle will only collect histograms on those columns that have actually been used since database startup and that have skewed data; it will not collect histograms on columns that have never been used since startup, even if skewed data exists. If my understanding is correct, how long will Oracle store that parse information in the SGA (so that dbms_stats knows which columns to analyze)? Does it mean this option is not appropriate for a fresh database that has just started up?
I see some Oracle notes highly recommend using this option to gather histogram information -- is that true?
January 03, 2005 - 10:45 pm UTC
traveling right now but this is worthy of followup later after some research - so, check back maybe later this week.
(see, I don't know everything off the top of my head :)
Update:
looks like sys.col_usage$, it is persistently stored in the dictionary. The name of the columns that represent "flags" in there says it all.
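(A sketch of peeking at that table -- it is SYS-owned, so treat this as a diagnostic query only; the predicate-counter columns shown are the "flags" mentioned above:)

```sql
select o.name  table_name,
       c.name  column_name,
       u.equality_preds, u.range_preds, u.like_preds
  from sys.col_usage$ u, sys.obj$ o, sys.col$ c
 where o.obj#    = u.obj#
   and c.obj#    = u.obj#
   and c.intcol# = u.intcol#;
```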
dbms_stats.gather_schema_stats - Oracle9i
Altieri, January 04, 2005 - 4:19 pm UTC
Tom,
I'm using dbms_stats.gather_schema_stats( ownname=>'XXX') to analyze a whole schema.
I don't want to analyze one particular table in this schema. What do I have to do?
I have been checking all the parameters, but I haven't found any option for this yet.
Thanks,
Altieri
January 05, 2005 - 9:06 am UTC
there is no "anti gather" parameter.
you would have to use gather table stats on the tables you wanted stats on, or move the table to another schema.
seems strange to not want stats on a single table doesn't it?
How to determine if statistics were collected by ANALYZE?
Alexander, January 07, 2005 - 3:19 pm UTC
Tom, how to determine if the statistics were collected by ANALYZE command or by the package?
January 08, 2005 - 4:15 pm UTC
if you look at columns that are filled in by ANALYZE but not dbms_stats - avg_space is one - you might be able to tell if ANALYZE was at some time in the past used:
ops$tkyte@ORA9IR2> create table t as select * from all_Users;
Table created.
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
AVG_SPACE
----------
0
ops$tkyte@ORA9IR2> analyze table t compute statistics;
Table analyzed.
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
AVG_SPACE
----------
7488
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> select avg_space from user_tables where table_name = 'T';
AVG_SPACE
----------
7488
but dbms_stats won't overwrite the value -- so it is not conclusive proof.
followup on 3 jan 2005
chong, January 11, 2005 - 2:19 am UTC
Hi Tom,
Did your research turn up anything on the 3rd Jan 2005 thread above, relating to dbms_stats with the size auto option?
thanks
dbms stats failure
Rob H, January 11, 2005 - 12:18 pm UTC
I have the database setup (9i) to gather statistics however I am running into a strange error
BEGIN DBMS_STATS.GATHER_DATABASE_STATS(cascade=>TRUE); END;
*
ERROR at line 1:
ORA-00904: : invalid identifier
ORA-06512: at "SYS.DBMS_STATS", line 9375
ORA-06512: at "SYS.DBMS_STATS", line 9857
ORA-06512: at "SYS.DBMS_STATS", line 10041
ORA-06512: at "SYS.DBMS_STATS", line 10134
ORA-06512: at "SYS.DBMS_STATS", line 10114
ORA-06512: at line 1
I don't understand what "ORA-00904: : invalid identifier" means with nothing between the two colons. Any ideas? I think I know the schema it's failing in.
January 11, 2005 - 1:48 pm UTC
trace it.
any 'bad' sql will be in the trace file. if you think you know the schema, just gather schema stats and see if it repros in there, tracefile would be smaller.
structure not sql
Rob H, January 11, 2005 - 1:56 pm UTC
Because I'm doing an analyze, isn't it an issue of object structure, not sql? Regardless, I am tracing it now.
January 11, 2005 - 2:03 pm UTC
dbms_stats runs sql
ora-904 is raised from sql.
interesting....
Rob H, January 12, 2005 - 12:21 pm UTC
After analyzing the TKPROF of the failure, I found the table that was failing. What was interesting was that it was failing on a specific index -- an incredibly old one (also note that the table contains hourly transactions, in the realm of only 6 million rows). Dropping the index and running dbms_stats again worked. Adding the index back and running dbms_stats also worked.
TKPROF
********************************************************************************
select /*+ cursor_sharing_exact dynamic_sampling(0) no_monitoring no_expand
index(t,"INDX_TAB_CAMP") */ count(*) as nrw,count(distinct
sys_op_lbid(30519,'L',t.rowid)) as nlb,count(distinct "CAMP") as ndk,
sys_op_countchg(substrb(t.rowid,1,15),1) as clf
from
"KCAGENT"."TAB_ALL" sample block
(5.77406383513445589344056295551646142247) t where "CAMP" is not null
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 1 14 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 1 14 0 0
Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 43 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
0 SORT GROUP BY (cr=0 r=0 w=0 time=0 us)
1 INDEX SAMPLE FAST FULL SCAN INDX_TAB_ALL_ID_DATE (cr=14 r=1 w=0 time=6851 us)(object id 49563)
********************************************************************************
But the index in the hint is different than the index in the explain.
Regardless, the immediate problem is solved, thank you.
will the statistics be automatically be updated ?
Chenna, January 30, 2005 - 12:18 pm UTC
I'm interested in knowing in the following
1.Lets say table t has 1000 rows
2.Generate statistics for table t and the associated indexes.
3.Now lets truncate table t.
--will the statistics be automatically be updated ?
4.In the next scenario , lets delete all the rows without truncating.
--will the statistics be automatically be updated ?
If they dont can we do something to update the statistics immedietly based on a certain operation.
Thanks
January 30, 2005 - 12:36 pm UTC
when you tried it, what happened? I mean, the scenario you scoped out here would be pretty *trivial* to test? That is what I'm trying to teach here -- you "spec"ed out the question very well, the test case you would need to see would be very easy.
statistics are gathered when you ask us to gather them, or some job you run gathers them.
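(The trivial test being described would look something like the following -- stats are not maintained automatically:)

```sql
create table t as select * from all_users;
exec dbms_stats.gather_table_stats( user, 'T' );
select num_rows from user_tables where table_name = 'T';  -- the row count
truncate table t;
select num_rows from user_tables where table_name = 'T';  -- unchanged:
-- the statistics stay stale (for truncate or delete) until you re-gather
exec dbms_stats.gather_table_stats( user, 'T' );
```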
Very interesting discussion.
Bob, May 22, 2005 - 9:57 pm UTC
Thanks everyone.
My question - 10g's "black box" statistics collection mechanism, does it collect histograms and if so, through what parameters?
I've been reasonably satisfied with the execution plans we're getting and collecting system statistics in particular has been a boon. However there still are some zingers and they typically involve tables with what I know to be rather skewed data distribution on joined columns. I'd like to keep the process as "black box" as possible (our DBAs have bigger problems to worry about) so I've been wondering how I could verify and/or incorporate histogram collection into the black box, or whether I need to disable the auto collection and wrap up my own.
Thanks in advance - this forum is great!
May 23, 2005 - 8:21 am UTC
it uses "auto" and "auto" uses sys.col_usage$ to determine which columns have predicates applied to them and how the predicates use the column (equality, range scans, and so on).
so the set of columns that histograms are collected on can vary over time by simply introducing a new query into the system that "where's" or joins on columns that were never used that way before.
A reader, May 25, 2005 - 8:39 am UTC
Once a table is set up as Monitoring, how do I disable monitoring, if I need to?
May 25, 2005 - 11:43 am UTC
ops$tkyte@ORA9IR2> alter table t monitoring;
Table altered.
ops$tkyte@ORA9IR2> alter table t NOmonitoring;
Table altered.
Ravi, May 25, 2005 - 8:39 am UTC
How do I disable MONITORING on a table?
DBMS_STATS with low sample size produces wrong DISTINCT_KEYS on nonunique index
Marco Coletti, May 27, 2005 - 3:59 pm UTC
DBMS_STATS seems very inaccurate at estimating the number of distinct keys on nonunique indexes.
I think it could be a bug.
/*
Test case:
DBMS_STATS gives wrong DISTINCT_KEYS for nonunique index
*/
select * from V$VERSION;
/*
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE 9.2.0.6.0 Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
*/
create table NUMBERS as
select mod(N,2000001) N from (select ROWNUM N from DUAL connect by ROWNUM <= 6000000)
;
create index IX_N on NUMBERS (N) compute statistics;
select count(distinct N) from NUMBERS;
/*
COUNT(DISTINCTN)
----------------
2000001
*/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
2000001 6000000 2005-05-25 16:25:12
*/
-- begin bug ------------------------------------
begin
DBMS_STATS.GATHER_INDEX_STATS (
ownname => user,
indname => 'IX_N',
estimate_percent => 20,
degree => 1);
end;
/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
414785 1188855 2005-05-25 16:27:39
*/
-- end bug ---------------------------------------
analyze index IX_N estimate statistics sample 20 percent;
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
2023171 1415367 2005-05-25 16:47:17
*/
begin
DBMS_STATS.GATHER_INDEX_STATS (
ownname => user,
indname => 'IX_N',
estimate_percent => NULL,
degree => 1);
end;
/
select i.DISTINCT_KEYS, i.SAMPLE_SIZE, i.LAST_ANALYZED
from USER_INDEXES i
where i.INDEX_NAME = 'IX_N'
;
/*
DISTINCT_KEYS SAMPLE_SIZE LAST_ANALYZED
------------- ----------- -------------------
2000001 6000000 2005-05-25 16:48:53
*/
May 27, 2005 - 5:32 pm UTC
it is being computed the way they intended (believe it or not).
do you have a compelling case where by the number of distinct values results in a really wrong plan?
There is an enhancement request in to have a "re-look" at this, but if you have a compelling "bad plan", I'll be glad to take a look and see if we cannot get that into the system.
No compelling bad plan due to bad NDV for nonunique index
Marco Coletti, May 31, 2005 - 9:25 am UTC
We have lots of almost-unique indexes in our system.
Probably this is a symptom of bad design, but in development we were heavily constrained by the requirements, which are basically:
* a linear chain of four external systems that exchange messages bidirectionally about entities via the Tibco bus; the systems do not employ a homogeneous entity unique key, only entity unique keys specific to each system (i.e. each system marks an entity with its own key format), plus correlation keys from system to system
* collect entities and related events from these systems by intercepting the messages in near real time
* aggregate the events every hour
* build an entity flat warehouse table to be queried by a GUI; refresh it every hour
* delete "dead" entities each day
* the average number of live entities is about 3.5 million per system
I don't remember a specific query which performs bad due to this problem.
May 31, 2005 - 10:20 am UTC
that is the issue though.
dbms_stats uses block sampling to quickly guesstimate these statistics. The block sampling tends to read "sorted data" in the index, so the number of distinct values is hard to compute. Analyze, on the other hand, uses a totally different technique.
But unless you can show this is leading to "bad things happening" -- that the NDV being off materially affects something -- there isn't an issue. The sampling is happening the way they intended it to happen; the bug was opened and closed on this. They would need a compelling case in order to re-look at it.
Is NDV accurate? No, observably so.
Is that fact an issue? That is what they would need to see.
External tables
A reader, June 06, 2005 - 9:49 am UTC
We thought we'd create a DDL trigger to automatically turn on Monitoring for new tables, but this DID NOT work for External tables.
Is there some way, (considering it is an AFTER trigger) to know, for SURE, that the current table
1)Is NOT an External table and set them to monitoring
or
2)IS an External table and dont set Monitoring for it?
CREATE OR REPLACE TRIGGER object_change AFTER
CREATE ON SCHEMA
DECLARE
alt_mont varchar2(1000);
BEGIN
if (ora_sysevent in ('CREATE') and
ora_dict_obj_type = 'TABLE') then
alt_mont :='ALTER TABLE ' ||ora_dict_obj_name||' MONITORING ';
execute immediate alt_mont;
end if;
end;
/
June 06, 2005 - 10:50 am UTC
either query the dictionary or just "try" and catch the exception (expect it, ignore it)
A reader, June 06, 2005 - 2:12 pm UTC
Thanks, catching the exception is a brilliant solution, but I can't find anything in the Data Dictionary that suggests the table being created is an External one.
By looking at xxx_objects or xxx_tables, what is the DEFINITIVE way to categorize tables?
June 06, 2005 - 3:04 pm UTC
ops$tkyte@ORA9IR2> select * from user_external_tables;
no rows selected
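(A sketch of the dictionary-check variant of the poster's trigger; whether DDL is permitted inside a system trigger can vary, so the catch-the-exception route suggested above may still be the simpler one:)

```sql
create or replace trigger object_change after create on schema
declare
  l_cnt number;
begin
  if ora_sysevent = 'CREATE' and ora_dict_obj_type = 'TABLE' then
    -- external tables appear here; skip them
    select count(*) into l_cnt
      from user_external_tables
     where table_name = ora_dict_obj_name;
    if l_cnt = 0 then
      execute immediate
        'alter table "' || ora_dict_obj_name || '" monitoring';
    end if;
  end if;
end;
/
```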
replacing analyze with dbms_stats in 9.2.0.6
James K. Purtzer, July 18, 2005 - 1:09 pm UTC
Currently using:
DBMS_UTILITY.ANALYZE_SCHEMA('ESD', 'COMPUTE');
It takes about 70 minutes to run on about 30 Gb of Data warehouse weekly.
Would like to replace with:
dbms_stats.gather_schema_stats(ownname=>'ESD',
estimate_percent=>dbms_stats.auto_sample_size,
method_opt=>'AUTO', cascade=>TRUE);
1. I'm struggling with how to assess impact on the production server. I can run the new
query on an RMAN restore of the data, without a representative connection load. Is that a fair test per your earlier recommendations?
2. Does the old analyze data need to be dropped before I run the new statement?
3.Looking at the DBMS_STATS statement above, as its specified
It will only gather stale statistics from objects the CBO is using?
July 18, 2005 - 1:41 pm UTC
1) I'd be looking to compare plans. Biggest thing will be auto in the method_opt, in order for that to work, you need to have had some queries going over time. auto (you have the syntax wrong, see below) will not work "good" until it knows the predicates you use:
ops$tkyte-ORA9IR2> create table ttt as select object_name, owner, object_type from all_objects;
Table created.
ops$tkyte-ORA9IR2> create index t_idx1 on ttt(owner);
Index created.
ops$tkyte-ORA9IR2> exec dbms_stats.gather_schema_stats( ownname=> user, estimate_percent=>dbms_stats.auto_sample_size, method_opt=>'for all columns size AUTO',cascade=>true );
PL/SQL procedure successfully completed.
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> set autotrace traceonly explain
ops$tkyte-ORA9IR2> select * from ttt t1 where owner = 'SCOTT';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=1271 Bytes=48298)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TTT' (Cost=12 Card=1271 Bytes=48298)
2 1 INDEX (RANGE SCAN) OF 'T_IDX1' (NON-UNIQUE) (Cost=3 Card=1271)
<b>see the card = 1271</b>
ops$tkyte-ORA9IR2> set autotrace off
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'TTT' group by column_name;
COLUMN_NAME COUNT(*)
------------------------------ ----------
OBJECT_NAME 2
OBJECT_TYPE 2
OWNER 2
<b>only two buckets, no histograms really</b>
ops$tkyte-ORA9IR2> exec dbms_stats.gather_schema_stats( ownname=> user, estimate_percent=>dbms_stats.auto_sample_size, method_opt=>'for all columns size AUTO',cascade=>true );
PL/SQL procedure successfully completed.
ops$tkyte-ORA9IR2> select column_name, count(*) from user_tab_histograms where table_name = 'TTT' group by column_name;
COLUMN_NAME COUNT(*)
------------------------------ ----------
OBJECT_NAME 2
OBJECT_TYPE 2
OWNER 22
<b>but it saw the predicate now, so it got them</b>
ops$tkyte-ORA9IR2>
ops$tkyte-ORA9IR2> set autotrace traceonly explain
ops$tkyte-ORA9IR2> select * from ttt t2 where owner = 'SCOTT';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=40 Bytes=1520)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TTT' (Cost=2 Card=40 Bytes=1520)
2 1 INDEX (RANGE SCAN) OF 'T_IDX1' (NON-UNIQUE) (Cost=1 Card=40)
ops$tkyte-ORA9IR2> set autotrace off
ops$tkyte-ORA9IR2> spool off
<b>and that card= is more accurate</b>
2) no
3) no, it'll gather them all. you would need to use the alter table T monitoring command and then do a gather stale option.
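A minimal sketch of that approach (table name T is hypothetical; monitoring must be enabled before the changes you care about accumulate):

```sql
-- turn on modification monitoring for the table (9i; on by default in 10g)
alter table t monitoring;

-- later, re-gather statistics only for tables whose monitoring
-- counters mark them as stale
begin
   dbms_stats.gather_schema_stats(
      ownname => user,
      options => 'GATHER STALE',
      cascade => true );
end;
/
```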
so method_opt updates as queries run on new stats?
James K. Purtzer, July 18, 2005 - 3:26 pm UTC
How does the method_opt=>'for all column size AUTO' update as the CBO parses queries? Your example showed cardiality decreasing as more queries run on the table, correct?
July 18, 2005 - 4:02 pm UTC
ctl-f for col_usage$ on this page.
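For reference, sys.col_usage$ is where the database records which columns have appeared in predicates -- that is what "size AUTO" consults. A hedged sketch of a query against it (run as a suitably privileged user; col_usage$ is an undocumented dictionary table, so verify its columns on your version):

```sql
select o.name table_name, c.name column_name,
       u.equality_preds, u.range_preds, u.like_preds
  from sys.col_usage$ u, sys.obj$ o, sys.col$ c
 where o.obj# = u.obj#
   and c.obj# = u.obj#
   and c.col# = u.intcol#
   and o.name = 'TTT';
```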
transport stats to another table
valent, July 18, 2005 - 4:35 pm UTC
Hello Tom,
let's say i want to transport statistics from one big table to another.
create table t2 as select * from big_table
big table has up to date statistics and i want to transport them to t2
import_table_stats will not work, because it uses same table name.
thanks for your opinion
July 18, 2005 - 5:08 pm UTC
you can use SET_TABLE_STATS (you do know the stats will be different right -- t2 will be nicely compacted, no whitespace, the clustering factor on indexes will possibly be different.....)
the stats are not made to go from table to table.
but if you look at the stats table you export to, you'll see table names in there, it is not hard to figure out what you would have to update in the stats table after you export to it to touch up the table name.
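A sketch of that touch-up, assuming BIG_TABLE and T2 are in the current schema (the C1 column of the stats table is where the table name lands, but that layout is undocumented -- verify it on your version before relying on it):

```sql
begin
   dbms_stats.create_stat_table( user, 'STATS_TAB' );
   dbms_stats.export_table_stats( user, 'BIG_TABLE', stattab => 'STATS_TAB' );
end;
/

-- point the exported rows at the copy
update stats_tab set c1 = 'T2' where c1 = 'BIG_TABLE';
commit;

exec dbms_stats.import_table_stats( user, 'T2', stattab => 'STATS_TAB' )
```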
valent, July 19, 2005 - 4:22 am UTC
thank you Tom,
we have a large reporting table (no updates) with 15+ million rows per partition, created with CTAS. Statistics should be very similar for my new table copy.
July 19, 2005 - 7:34 am UTC
go for it then
suggestion..!
Reader, July 20, 2005 - 5:05 am UTC
Tom,
Every week we analyze the database schema using the DBMS_STATS package as a batch process in our production environment.
But this job often fails with ORA-01555 Snapshot too old. We understand the cause: this analyze job and another job scheduled in the batch, running in parallel, clash at some point and end with the ORA-01555 error. My question is: if we re-run the DBMS_STATS package after it fails, it analyzes all the objects from the beginning, but we want it to analyze objects from the point where it previously failed. Is there any option in the DBMS_STATS package for this? Please advise. I hope you understand the scenario.
July 20, 2005 - 6:42 am UTC
why don't we fix the problem? the 1555? make your UNDO permanently larger -- sized right to fulfill the requirements of your system
the other path will be "write code" as dbms_stats doesn't have a "remember where you were and start over", we'd have to help it out that way.
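A hedged sketch of such "write code" -- a restartable wrapper that remembers which tables are done (STATS_DONE is a made-up bookkeeping table):

```sql
create table stats_done( table_name varchar2(30) primary key );

begin
   for x in ( select table_name
                from user_tables
               where table_name not in ( select table_name from stats_done ) )
   loop
      dbms_stats.gather_table_stats( user, x.table_name, cascade => true );
      insert into stats_done values ( x.table_name );
      commit;
   end loop;
end;
/
-- after a failure, simply re-run the block; truncate stats_done
-- when you want a full pass again
```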
dbms_stats hanging,
sns, July 20, 2005 - 9:23 am UTC
Lately we are experiencing an issue with DBMS_STATS package when we try to analyze the table.
This is what we did:
BEGIN dbms_stats.gather_table_stats(ownname => 'EURO_TSTAT', estimate_percent => 15, tabname => 'CALL_SYSTEM_EURO', block_sample => FALSE, method_opt => 'FOR ALL INDEXED COLUMNS SIZE 1', degree => 4, granularity => 'ALL', cascade => TRUE ); END;
What are the things we need to look at while dbms_stats is in a hung state?
Any steps to resolve this?
thanks,
July 20, 2005 - 12:49 pm UTC
you could trace it, you could query v$ views (eg: v$session_event) to see what it is waiting on
sql_trace with waits (10046 trace) could be used.
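A sketch of both approaches (the :sid bind is whatever v$session shows for the session running the gather):

```sql
-- from another session: what has the gathering session been waiting on?
select event, total_waits, time_waited
  from v$session_event
 where sid = :sid
 order by time_waited desc;

-- or, in the session about to run dbms_stats, enable a 10046 trace
-- with wait events before starting the gather
alter session set events '10046 trace name context forever, level 8';
```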
need to analyze index separately..!
Reader, August 06, 2005 - 10:17 pm UTC
Tom,
I have noted that gather_schema_stats will not analyze indexes unless we specify cascade=>true --
is that true?
here is the procedure I'm using; kindly suggest how to have it handle the indexes as well..
exec dbms_stats.gather_schema_stats( -
ownname => 'OMSTNX', -
options => 'GATHER AUTO', -
estimate_percent => dbms_stats.auto_sample_size, -
degree => DBMS_STATS.DEFAULT_DEGREE -
)
August 07, 2005 - 9:14 am UTC
that is the documented purpose of cascade, yes.
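So the call above, with the indexes included, would simply gain a cascade parameter:

```sql
exec dbms_stats.gather_schema_stats( -
   ownname          => 'OMSTNX', -
   options          => 'GATHER AUTO', -
   estimate_percent => dbms_stats.auto_sample_size, -
   degree           => DBMS_STATS.DEFAULT_DEGREE, -
   cascade          => TRUE -
)
```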
Trying to understand this behavior...
Mark, August 15, 2005 - 10:02 am UTC
Hi Tom,
We have this as a job run Weekly (Sunday, Early AM):
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4'
, estimate_percent => 40
, block_sample => FALSE
, method_opt => 'FOR ALL COLUMNS SIZE 1'
, degree => NULL
, granularity => 'DEFAULT'
, cascade => TRUE
);
Now, I came into work Monday and the performance was 'yukky', so I did this:
DECLARE
BEGIN
EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_INSURANCE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_PROCEDURE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_PROCEDURE_BALANCE ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXECUTE IMMEDIATE 'ANALYZE TABLE PAT_FINANCIAL_TRANSACTION ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXECUTE IMMEDIATE 'ANALYZE TABLE ALL_NAME ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXED COLUMNS FOR ALL INDEXES';
EXCEPTION
WHEN OTHERS
THEN
NULL;
END;
... And all was good again. Any ideas what is incorrect perhaps on the DBMS_STATS.GATHER_SCHEMA_STATS parameter settings, and could you suggest a good method using this procedure?
Thanks!
August 15, 2005 - 4:00 pm UTC
you would need to tell us what happened between sunday early am and monday morning.
and why do you have a when others then null????????
erase that and stop using it -- do you really want to just silently ignore ERRORS!!!!!???
I can say that the dbms_stats -- you told it "no histograms", and the analyze you let it "get histograms" on the indexed columns -- that jumps right out.
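If the goal is to have dbms_stats collect histograms comparable to what that analyze gathered, a hedged sketch (the method_opt value is the key difference -- 'SIZE 1' means no histograms):

```sql
begin
   dbms_stats.gather_schema_stats(
      ownname          => 'HT4',
      estimate_percent => 40,
      method_opt       => 'FOR ALL INDEXED COLUMNS SIZE AUTO',
      cascade          => TRUE );
end;
/
```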
Yes, ok.
Mark, August 17, 2005 - 2:36 pm UTC
Thanks Tom.
The Exception handler was quick adhoc code and I don't usually code that way.
Anyway, this is what I've done:
1) turned table monitoring on
2) Via JOB at 1AM:
/* Gather Stale Stats Nightly */
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4',
estimate_percent => 40, block_sample => FALSE, method_opt =>
'FOR ALL INDEXED COLUMNS', degree => NULL, granularity =>
'DEFAULT', options => 'GATHER STALE', cascade => TRUE);
3) Once per week (Sunday Early AM):
/* Gather All Stats Weekly */
DBMS_STATS.GATHER_SCHEMA_STATS(ownname =>'HT4',
estimate_percent => 40, block_sample => FALSE, method_opt =>
'FOR ALL INDEXED COLUMNS', degree => NULL, granularity =>
'DEFAULT', options => 'GATHER', cascade => TRUE);
This seems to give me a more consistent performance.
Thanks!
collecting initial stats using dbms_stats
JK Purtzer, September 02, 2005 - 1:14 pm UTC
running the package mentioned above in my previous review
begin
dbms_stats.gather_schema_stats(ownname=> 'ESD',estimate_percent=>dbms_stats.auto_sample_size,
method_opt=>'for all columns size
AUTO',cascade=>true );
commit;
end;
I can't seem to verify with
</code>
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/stats.htm#33117 <code>
SELECT TABLE_NAME, NUM_ROWS, BLOCKS, AVG_ROW_LEN,
TO_CHAR(LAST_ANALYZED, 'MM/DD/YYYY HH24:MI:SS')
FROM DBA_TABLES
WHERE TABLE_NAME IN ('SO_LINES_ALL','SO_HEADERS_ALL','SO_LAST_ALL');
I get 0 rows
thinking I'm missing something about initially gathering stats, I looked at ch6 of effective oracle by design also
the procedure only takes 1 sec to run, it can't be gathering stats quite that fast or can it?
September 03, 2005 - 7:40 am UTC
well, that dbms_stats will gather stats for the ESD schema
could it take 1 second? depends on the size of the ESD schema.
That query will return rows for those exact table names, if nothing is coming back, you don't have any tables by those names.
Nice
Catherine, September 15, 2005 - 2:10 pm UTC
Hi Tom,
How is statistics gathering useful to the Optimizer?
In what ways does it use the statistics?
Please provide a detailed answer.
Bye!!
September 15, 2005 - 2:35 pm UTC
Support of Analyze
Bill, September 28, 2005 - 9:26 am UTC
Hi Tom,
Both you and the Oracle documentation (9i) state the cost-based optimizer will eventually use only statistics that have been collected by DBMS_STATS.
Do you know when this will be the case?
Also, is the DBMS_DDL.ANALYZE_OBJECT procedure equivalent to ANALYZE TABLE?
Thanks!
September 28, 2005 - 10:25 am UTC
The optimizer expects the statistics to be gathered by dbms_stats, the stats generated by analyze can and will be different than dbms_stats, but the optimizer is assuming "dbms_stats" did that.
There is no "end of life" on analyze for statistics gathering
JK Purtzer, October 03, 2005 - 5:14 pm UTC
I ran the sql below to gather stats:
begin
dbms_stats.gather_schema_stats(ownname=> 'schema',
estimate_percent=>dbms_stats.auto_sample_size,
cascade=>true);
commit;
end;
This caused several of the plans for procedures that are "unhinted" and were using indexes to stop using them and do full table scans. What would you suggest looking at to find out why?
Thanks again for the great book from the OTN lounge at OOW05
October 03, 2005 - 9:01 pm UTC
tell me you tested this in a test system first please????
you just turned on the CBO, entirely. Some things will go better - some will be unchanged and invariably, there will be others that do not go so well the first time (they are the ones you NOTICE, the others you do not).
They are something to be analyzed in a testing environment. Without version info and other stuff - no comment.
analyze and cursor invalidation
Al, October 04, 2005 - 11:05 am UTC
In a followup above you state:
----------------
analyzing the table will invalidate open cursors on that object in the shared pool. the next parse against that object will be "hard". hard does lots more recursive sql then "soft"
-----------------
I don't fully understand what you mean by "invalidate open cursors". An analyze of a table will not cause some other session's executing sql to fail, will it?
For complicated reasons I will need to run a bunch of dbms_stats.gather_table_stats after an application upgrade, and there will be real users on the application at that point. I am assuming that as the stats are gathered, any parse (soft or hard) of any sql on an analyzed object will cause complete re-optimization, but that in no case will an executing statement fail because of the analyze. Is this correct?
The goal here, obviously, is to get users on the application as quickly as possible after the upgrade, even if performance isn't what we'd like; then gather the stats as the system runs to bring it back to its 'normal' statistics state and 'normal' performance.
October 04, 2005 - 4:43 pm UTC
it'll invalidate the shared sql, the cursors in the shared pool so they next time they are executed, they'll hard parse.
it will not break "already running" stuff.
analyze table, open cursors and parsing ...
Catalin CIONTU, October 04, 2005 - 2:23 pm UTC
Tom, please correct me if I'm wrong, but by analyzing the table, its corresponding open cursors will be invalidated in the shared_pool, and the only difference is that the cursors have to do a hard parse instead of a soft parse. The process will complete, but will be less effective this time. Regards, catalin.
October 04, 2005 - 6:24 pm UTC
the process will complete, yes, as to it "effectiveness", one of three things can happen..... (slower, faster, the same)
dbms_stats quibble
JK Purtzer, October 04, 2005 - 6:40 pm UTC
9.2.0.6 DB
We did some testing (see the general question above -- sorry about that one, panic'd). Here is a more specific one.
We are now using the DBMS_STATS package for collecting stats. We were previously using the old DBMS_UTILITY.ANALYZE_SCHEMA method.
The following is the new method we are using to collect stats:
dbms_stats.gather_schema_stats(
ownname=>'XXX',
estimate_percent=>dbms_stats.auto_sample_size,
method_opt=>'for all columns size AUTO',
cascade=>true);
Since we changed over to the new statistics gathering, we have seen significant degradation in some of our existing queries. In analyzing one of the queries, we found some strange behavior by the CBO when evaluating a TO_DATE conversion in the where predicate:
When the date string includes a 4-digit year such as this:
WHERE OA.CREATE_DATE >= TO_DATE('7/28/2005', 'MM/DD/RR')
It uses the index on the CREATE_DATE column which is desired.
When the date string uses a 2-digit year such as this:
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
It does NOT use the index on the CREATE_DATE column even though this also converts the string to 7/28/2005.
The difference in execution times between these two execution plans is less than a second when the index is used to minutes when the index is not used.
This is quite confusing and was hoping you could shed some light on this.
October 04, 2005 - 9:01 pm UTC
WHERE OA.CREATE_DATE >= TO_DATE('7/28/2005', 'MM/DD/RR')
THAT is a constant....
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
THAT is an "unknown", could be 1905, 2005, 2105 - depends on when you execute it (I know, a human would be able to sort of make a good guess - but the date format is not constant)
Are you sure this is due to stats? can we see an autotrace traceonly explain?
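One way to sidestep the ambiguity entirely, assuming the intent is always a 21st-century date, is to spell the century out so the literal really is a constant:

```sql
WHERE OA.CREATE_DATE >= TO_DATE( '07/28/2005', 'MM/DD/YYYY' )
```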
date quibble with explain plans
JK Purtzer, October 04, 2005 - 9:47 pm UTC
Hope you can read this -- my email cut and paste got rid of the formatting. I need a clarification on your statement that the 2-digit year is an unknown. I am issuing the query today, and by Oracle's rules that would mean an 05 year with an RR date format element would return year 2005. That makes it a constant as of right now, doesn't it?
Using a 4-digit year in the to_date string:

Operation (owner.object)                                    Cost   Card    Bytes
---------------------------------------------------------  ------ ------ -------
SELECT STATEMENT, GOAL = CHOOSE                              3102    207   40986
SORT ORDER BY                                                3102    207   40986
HASH JOIN OUTER                                              3094    207   40986
HASH JOIN OUTER                                              2717    206   30488
NESTED LOOPS                                                 2714    206   29046
HASH JOIN OUTER                                              2302    206   26162
NESTED LOOPS OUTER                                           2227    206   23484
NESTED LOOPS                                                 2227    206   21836
HASH JOIN                                                    2021    206   16686
TABLE ACCESS BY INDEX ROWID WAREHOUSE.ORDER_AGGREGATE        1988    206   14214
INDEX RANGE SCAN WAREHOUSE.OA_CREATE_DATE_IDX                  30   5041
TABLE ACCESS FULL WAREHOUSE.PUBLISHER_PROFILE                  32  11898  142776
TABLE ACCESS BY INDEX ROWID ESD.PUBLISHERS                      1      1      25
INDEX UNIQUE SCAN ESD.PUBLISHERS_PK                             1
INDEX UNIQUE SCAN ESD.SALES_AGENT_ID_PK                         1      8
TABLE ACCESS FULL WAREHOUSE.FULFILLMENTS                       74  48036  624468
TABLE ACCESS BY INDEX ROWID WAREHOUSE.ORDER_EXT_AGGREGATE       2      1      14
INDEX UNIQUE SCAN WAREHOUSE.ORDERS_EXT_AGG_PK                   1      1
TABLE ACCESS FULL WAREHOUSE.CURRENCIES                          2    169    1183
VIEW WAREHOUSE                                                375  38293 1914650
SORT GROUP BY                                                 375  38293 1799771
HASH JOIN                                                      66  38293 1799771
TABLE ACCESS FULL WAREHOUSE.RETURN_REASONS                      3    696   20880
TABLE ACCESS FULL WAREHOUSE.RETURNS                            62  38293  650981
Using a 2-digit year ('05' with RR) in the to_date string:

Operation (owner.object)                                    Cost   Card    Bytes
---------------------------------------------------------  ------ ------ -------
SELECT STATEMENT, GOAL = CHOOSE                             21420    234   46332
SORT ORDER BY                                               21420    234   46332
FILTER
HASH JOIN OUTER                                             21411    234   46332
HASH JOIN OUTER                                             21034    233   34484
NESTED LOOPS                                                21031    233   32853
HASH JOIN OUTER                                             20565    233   29591
NESTED LOOPS OUTER                                          20490    233   26562
NESTED LOOPS                                                20490    233   24698
HASH JOIN                                                   20257    233   18873
TABLE ACCESS FULL WAREHOUSE.ORDER_AGGREGATE                 20224    233   16077
TABLE ACCESS FULL WAREHOUSE.PUBLISHER_PROFILE                  32  11898  142776
TABLE ACCESS BY INDEX ROWID ESD.PUBLISHERS                      1      1      25
INDEX UNIQUE SCAN ESD.PUBLISHERS_PK                             1
INDEX UNIQUE SCAN ESD.SALES_AGENT_ID_PK                         1      8
TABLE ACCESS FULL WAREHOUSE.FULFILLMENTS                       74  48036  624468
TABLE ACCESS BY INDEX ROWID WAREHOUSE.ORDER_EXT_AGGREGATE       2      1      14
INDEX UNIQUE SCAN WAREHOUSE.ORDERS_EXT_AGG_PK                   1      1
TABLE ACCESS FULL WAREHOUSE.CURRENCIES                          2    169    1183
VIEW WAREHOUSE                                                375  38293 1914650
SORT GROUP BY                                                 375  38293 1799771
HASH JOIN                                                      66  38293 1799771
TABLE ACCESS FULL WAREHOUSE.RETURN_REASONS                      3    696   20880
TABLE ACCESS FULL WAREHOUSE.RETURNS                            62  38293  650981
October 05, 2005 - 6:59 am UTC
... That makes it a constant as of right now doesn't it? ...
no more than SYSDATE is a constant as of right now. 5 minutes from now - sysdate can be different (can be, doesn't have to be! you can freeze time)
WHERE OA.CREATE_DATE >= TO_DATE('7/28/05', 'MM/DD/RR')
Tell me, what date is that - no matter what you answer, I'll say "nope, I was thinking of another date"
would need to see the plan part that represented the estimated cardinality for this step.
(you can format plans skinny, I do it all of the time...)
How do you know if a table was analyzed with dbms_stats or analyze table?
Peter Tran, December 15, 2005 - 7:38 pm UTC
Hi Tom,
Someone asked me this today, but I haven't been able to figure out the answer.
Is there anyway you can tell whether someone analyzed a table using "DBMS_STATS" vs "ANALZE TABLE"?
My gut call is no.
Thanks,
-Peter
December 16, 2005 - 7:45 am UTC
it won't be "definitive", but you can take an educated guess.
I'll use dbms_stats, then analyze - compare, then dbms_stats and recompare..
ops$tkyte@ORA10GR2> create table t ( x int, y varchar2(4000), z varchar2(4000) );
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into t (x) select rownum from all_objects where rownum <= 10;
10 rows created.
ops$tkyte@ORA10GR2> commit;
Commit complete.
ops$tkyte@ORA10GR2> update t set y = rpad('*',4000,'*'), z = rpad('*',4000,'*');
10 rows updated.
ops$tkyte@ORA10GR2> commit;
Commit complete.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> create table tstats as
2 select 'DBMS_STATS' name, a.* from user_tables a where table_name = 'T';
Table created.
ops$tkyte@ORA10GR2> analyze table t compute statistics;
Table analyzed.
ops$tkyte@ORA10GR2> insert into tstats
2 select 'ANALYZE' name, a.* from user_tables a where table_name = 'T';
1 row created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> column cname format a20
ops$tkyte@ORA10GR2> column dbms_stats format a25
ops$tkyte@ORA10GR2> column analyze format a25
ops$tkyte@ORA10GR2> select a.cname, a.val dbms_stats, b.val analyze
2 from table( cols_as_rows( q'|select *
3 from tstats
4 where name = 'DBMS_STATS' |' ) ) a,
5 table( cols_as_rows( q'|select *
6 from tstats
7 where name = 'ANALYZE' |' ) ) b
8 where a.cname = b.cname
9 and decode( a.val, b.val, 0, 1 ) = 1
10 /
CNAME DBMS_STATS ANALYZE
-------------------- ------------------------- -------------------------
NAME DBMS_STATS ANALYZE
EMPTY_BLOCKS 0 3
AVG_SPACE 0 1878
CHAIN_CNT 0 10
AVG_ROW_LEN 8005 8022
GLOBAL_STATS YES NO
6 rows selected.
<b>dbms_stats won't gather things not used by the CBO such as chain_cnt/avg_space in particular. If they are 0 or null - it is unlikely that analyze has been used in the past - dbms_stats would be the one.
But global_stats for a non-partitioned table could be a "tell", avg_space, etc is not "good enough" since:</b>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> delete from tstats where name = 'DBMS_STATS';
1 row deleted.
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> insert into tstats
2 select 'DBMS_STATS' name, a.* from user_tables a where table_name = 'T';
1 row created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select a.cname, a.val dbms_stats, b.val analyze
2 from table( cols_as_rows( q'|select *
3 from tstats
4 where name = 'DBMS_STATS' |' ) ) a,
5 table( cols_as_rows( q'|select *
6 from tstats
7 where name = 'ANALYZE' |' ) ) b
8 where a.cname = b.cname
9 and decode( a.val, b.val, 0, 1 ) = 1
10 /
CNAME DBMS_STATS ANALYZE
-------------------- ------------------------- -------------------------
NAME DBMS_STATS ANALYZE
AVG_ROW_LEN 8005 8022
GLOBAL_STATS YES NO
<b>dbms_stats won't 'reset' them, their being set indicates analyze was used at some time in the past - but since analyze and dbms_stats set and reset global_stats - that could be a "tell" for many normal tables...</b>
Wow...Awesome.
Peter Tran, December 16, 2005 - 12:10 pm UTC
Perfect...
Thanks again Tom.
CHAIN_CNT
A reader, January 05, 2006 - 10:05 am UTC
Hi
Since dbms_stats doesn't get CHAIN_CNT for us, if we want to find out which tables have chained rows, how can we achieve it? By analyzing all tables manually and overwriting the statistics gathered by dbms_stats?
January 05, 2006 - 11:04 am UTC
ops$tkyte@ORA10GR2> @?/rdbms/admin/utlchain
Table created.
ops$tkyte@ORA10GR2> analyze table t list chained rows;
Table analyzed.
ops$tkyte@ORA10GR2> desc chained_rows
Name Null? Type
---------------------------------------- -------- ----------------------------
OWNER_NAME VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
CLUSTER_NAME VARCHAR2(30)
PARTITION_NAME VARCHAR2(30)
SUBPARTITION_NAME VARCHAR2(30)
HEAD_ROWID ROWID
ANALYZE_TIMESTAMP DATE
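Once populated, the CHAINED_ROWS table can be summarized per table, for example:

```sql
select owner_name, table_name, count(*) chained_rows
  from chained_rows
 group by owner_name, table_name
 order by count(*) desc;
```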
A reader, January 05, 2006 - 10:48 pm UTC
Tom,
What is the reason for dbms_stats not gathering info about chained rows whereas analyze command does?
Thanks.
January 06, 2006 - 1:37 pm UTC
dbms_stats gathers only that which is used by the optimizer - it gathers no other bits of information.
Ramesh, January 06, 2006 - 9:48 am UTC
Tom,
How do setting "mbrc" through dbms_stats and the parameter db_file_multiblock_read_count influence things differently?
Is it something like: "mbrc" is used during the cost evaluation phase and db_file_multiblock_read_count during the execution phase?
Thanks for your time.
Ramesh.
January 06, 2006 - 2:11 pm UTC
the system statistics mbrc is the "observed actual multi-block read count on the system, regardless of the db file multiblock read count init.ora parameter"
the optimizer will use the "observed value", the "retrieval" engine will use db_file_multiblock_read_count to do the actual IO however.
So, if you tell the system via dbms_stats "we only get 8 blocks on multi-block reads" but set the multiblock read count to the max (say 128 blocks) -- when the optimizer decides to do a full scan (multi-block read), we'll do as large an IO as possible, but the costing engine would consider "8" to be the number we would expect.
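A sketch of working with those system statistics (the values here are illustrative only):

```sql
-- gather observed system statistics, including mbrc, over a workload window
exec dbms_stats.gather_system_stats( gathering_mode => 'START' )
-- ... run a representative workload for a while ...
exec dbms_stats.gather_system_stats( gathering_mode => 'STOP' )

-- or set mbrc explicitly
exec dbms_stats.set_system_stats( 'MBRC', 8 )

-- inspect the current values
select sname, pname, pval1 from sys.aux_stats$;
```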
dbms_stats in 10gR2
Oraboy, February 21, 2006 - 5:21 pm UTC
Guess I am really tired and my eyes can't catch what I am missing here.. Could anyone please help?
Issue: The index associated with Primary key Constraint never gets updated with correct statistics and hence my query is throwing wrong results
Commodity_DIM is a table with 700+ rows
<b>-- Version check </b>
SQL> select * from V$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
<b>-- Table details </b>
SQL> select table_name,num_rows,sample_size,last_analyzed from user_tables
2 where table_name='COMMODITY_DIM'
3 /
TABLE_NAME NUM_ROWS SAMPLE_SIZE LAST_ANAL
------------------------------ ---------- ----------- ---------
COMMODITY_DIM 730 730 21-FEB-06
SQL> select count(*) from commodity_dim;
COUNT(*)
----------
0
<b>-- Index details </b>
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
2 where table_name='COMMODITY_DIM';
INDEX_NAME NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071 0 0 VALID
<b>-- Index associates to Primary Key constraint in the table </b>
SQL> select constraint_name,constraint_type from user_constraints
2 where table_name='COMMODITY_DIM' and index_name='SYS_C009071';
CONSTRAINT_NAME C
------------------------------ -
SYS_C009071 P
<b>-- and the constraint is Valid </b>
SQL> select constraint_name,constraint_type,status from user_constraints
2 where table_name='COMMODITY_DIM' and index_name='SYS_C009071';
CONSTRAINT_NAME C STATUS
------------------------------ - --------
SYS_C009071 P ENABLED
<b>-- analyzing the index - no help </b>
SQL> analyze index SYS_C009071 compute statistics;
Index analyzed.
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
2 where table_name='COMMODITY_DIM';
INDEX_NAME NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071 0 0 VALID
<b>-- dbms_stats ; perform compute </b>
SQL> exec dbms_stats.gather_index_stats(ownname=>'GOLDDEMO',-
> indname=>'SYS_C009071',estimate_percent=>NULL);
PL/SQL procedure successfully completed.
SQL> select index_name,num_rows,DISTINCT_KEYS,status from user_indexes
2 where table_name='COMMODITY_DIM';
INDEX_NAME NUM_ROWS DISTINCT_KEYS STATUS
------------------------------ ---------- ------------- --------
SYS_C009071 0 0 VALID
SQL> desc commodity_dim
Name Null? Type
----------------------------------------- -------- ----------------------------
COMM_DIM_PK NOT NULL NUMBER
COMM_LOW_CATG_CODE NOT NULL VARCHAR2(10)
COMM_LOW_CATG_DESC VARCHAR2(50)
COMM_MID_CATG_CODE NOT NULL VARCHAR2(50)
COMM_MID_CATG_DESC VARCHAR2(50)
COMM_HI_CATG_CODE NOT NULL VARCHAR2(50)
COMM_HI_CATG_DESC VARCHAR2(50)
COMM_ALL_CATG_CODE VARCHAR2(10)
COMM_ALL_CATG_DESC VARCHAR2(50)
<b>-- Here is the problem </b>
SQL> select count(*) from commodity_dim;
COUNT(*)
----------
0
SQL> select count(distinct comm_dim_pk) from commodity_dim;
COUNT(DISTINCTCOMM_DIM_PK)
--------------------------
0
SQL> select /*+ FULL(c)*/
2 count(distinct comm_dim_pk) from commodity_dim c;
COUNT(DISTINCTCOMM_DIM_PK)
--------------------------
730
<b>-- Verifying Optimizer plans </b>
SQL> explain plan for select count(*) from commodity_dim;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 818636992
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 0 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FULL SCAN| SYS_C009071 | 730 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------
SQL> explain plan for
2 select /*+ FULL(c)*/
3 count(distinct comm_dim_pk) from commodity_dim c;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
Plan hash value: 3961897095
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| T
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 8 (0)| 0
| 1 | SORT GROUP BY | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| COMMODITY_DIM | 730 | 2920 | 8 (0)| 0
---------------------------------------------------------------------------
<b>-- parameter listing , if this helps </b>
SQL> show parameter opt
NAME TYPE VALUE
------------------------------------ ----------- ----------
filesystemio_options string
object_cache_optimal_size integer 102400
optimizer_dynamic_sampling integer 2
optimizer_features_enable string 10.2.0.1
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_mode string ALL_ROWS
optimizer_secure_view_merging boolean TRUE
plsql_optimize_level integer 2
February 22, 2006 - 8:28 am UTC
analyze table commodity_dim validate structure cascade;
what does that return.
Ora-1499
oraboy, February 22, 2006 - 9:31 am UTC
Thanks Tom.. Looks like there is much bigger problem here.
SQL> analyze table commodity_dim validate structure;
Table analyzed.
SQL> analyze table commodity_dim validate structure cascade;
analyze table commodity_dim validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
I 'll follow-up this in metalink.
Thanks for your pointers, as always.
Weird but worked
Oraboy, February 22, 2006 - 9:39 am UTC
SQL> analyze table commodity_dim validate structure cascade;
analyze table commodity_dim validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> alter table commodity_dim move;
Table altered.
SQL> select index_name from user_indexes
2 where table_name='COMMODITY_DIM';
INDEX_NAME
------------------------------
SYS_C009071
SQL> alter index SYS_C009071 rebuild ;
Index altered.
SQL> analyze table commodity_dim validate structure cascade;
Table analyzed.
SQL> select count(*) from commodity_dim;
COUNT(*)
----------
730
SQL> explain plan for select count(*) from commodity_dim;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2809477265
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| SYS_C009071 | 730 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Comparable test, different results...
Ben, March 09, 2006 - 2:52 pm UTC
This is a great discussion, it has brought to light the concept of histograms and generating statistics overall for me. The argument for how bright that light is, is still up in the air ;)
After reading the portion, dated Jan 05, that gave examples of the differences between the different method_opt choices, I decided to experiment a little for myself. I am using a test data warehouse that is composed of only 7 tables, each of which have a primary key index and two of which have an additional one column non-unique index. I show those below.
I found my results to be unexpectedly different from yours and I wondered if you could help me to understand why.
The first test was using 'FOR ALL INDEXED COLUMNS'. The second test was with 'FOR ALL INDEXED COLUMNS SIZE AUTO'. Finally, the third test was using 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY'. The third test gave the most confusing results to me. I will include my observation and question following the results from each test, within the spooled listing below.
SQL> show user
USER is "DMR"
SQL> select USER || '.' || table_name, index_name, column_name
2 from user_ind_columns;
USER||'.'||TABLE_NAME INDEX_NAME COLUMN_NAME
--------------------- ---------------------- -------------------------
DMR.BRANCH_BO BRANCH_BO_1 REPORT_DATE
DMR.BRANCH_BO BRANCH_BO_1 PRODUCT
DMR.BRANCH_BO BRANCH_BO_1 BRANCH_PLANT
DMR.BRANCH_BO BRANCH_BO_1 TRX_CLASS
DMR.INV_FACT INV_FACT_1 REPORT_DATE
DMR.INV_FACT INV_FACT_1 PRODUCT
DMR.INV_FACT INV_FACT_1 BRANCH_PLANT
DMR.INV_FACT INV_FACT_1 CATEGORY
DMR.INV_FACT INV_FACT_1 OBSOLETE
DMR.ITEM ITEM_1 PRODUCT
DMR.ITEM ITEM_2** D_BUS_UNIT
DMR.ITEM_BRANCH ITEM_BRANCH_1 PRODUCT
DMR.ITEM_BRANCH ITEM_BRANCH_1 BRANCH_PLANT
DMR.ITEM_FILL ITEM_FILL_1 REPORT_DATE
DMR.ITEM_FILL ITEM_FILL_1 PRODUCT
DMR.ITEM_FILL ITEM_FILL_2** PRODUCT
DMR.OSB_FACT OSB_FACT_1 REPORT_DATE
DMR.OSB_FACT OSB_FACT_1 PRODUCT
DMR.OSB_FACT OSB_FACT_1 TRX_CLASS
DMR.PROD_FACT PROD_FACT_1 REPORT_DATE
DMR.PROD_FACT PROD_FACT_1 PRODUCT
DMR.PROD_FACT PROD_FACT_1 BRANCH_PLANT
DMR.TIME TIME_1 CURDATE
** signifies non unique indexes/columns
23 rows selected.
##############
The following shows the skewness of d_bus_unit, curdate, and obsolete
##############
SQL> select count(distinct D_BUS_UNIT)
2 from item;
COUNT(DISTINCTD_BUS_UNIT)
-------------------------
12
SQL> select count(*)
2 from item;
COUNT(*)
----------
65119
SQL> select D_BUS_UNIT, count(*)
2 from item
3 group by D_BUS_UNIT;
D_BUS_UNIT COUNT(*)
------------------------------ ----------
617
Acute Care 29298
Corporate 10
Drop Ship Fee 1
Freight 2
Min Order Qty Fee 1
NonStock Expense Item 8
OEM 1920
Patient Care 30547
REBATE CREDIT 1
Raw Material Samples 2
D_BUS_UNIT COUNT(*)
------------------------------ ----------
Wound Care 2699
12 rows selected.
SQL> select count(distinct obsolete)
2 from inv_fact;
COUNT(DISTINCTOBSOLETE)
-----------------------
2
SQL> select count(*)
2 from inv_fact;
COUNT(*)
----------
2743569
SQL> select obsolete, count(*)
2 from inv_fact
3 group by obsolete;
O COUNT(*)
- ----------
N 2649249
Y 94320
SQL> select count(*)
2 from time;
COUNT(*)
----------
156
SQL> select count(distinct curdate)
2 from time;
COUNT(DISTINCTCURDATE)
----------------------
156
#################################
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
2 from user_tab_histograms
3 group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
( ownname =>'DMR',
cascade => TRUE,
estimate_percent => 10,
method_opt => 'FOR ALL INDEXED COLUMNS');
PL/SQL procedure successfully completed.
SQL> l
1 select table_name, column_name, count(*)
2 from user_tab_histograms
3* group by table_name, column_name
SQL> /
TABLE_NAME COLUMN_NAME COUNT(*)
--------------------- -------------- ----------
ITEM PRODUCT 2
ITEM D_BUS_UNIT 7
TIME CURDATE 76
INV_FACT PRODUCT 76
INV_FACT CATEGORY 2
INV_FACT OBSOLETE 2
INV_FACT REPORT_DATE 74
INV_FACT BRANCH_PLANT 25
OSB_FACT PRODUCT 76
OSB_FACT TRX_CLASS 6
OSB_FACT REPORT_DATE 74
BRANCH_BO PRODUCT 76
BRANCH_BO TRX_CLASS 6
BRANCH_BO REPORT_DATE 62
BRANCH_BO BRANCH_PLANT 24
ITEM_FILL PRODUCT 76
ITEM_FILL REPORT_DATE 3
PROD_FACT PRODUCT 76
PROD_FACT REPORT_DATE 44
PROD_FACT BRANCH_PLANT 32
ITEM_BRANCH PRODUCT 76
ITEM_BRANCH BRANCH_PLANT 33
22 rows selected.
######### OBSERVATIONS/QUESTIONS ########
When you did this test on table "T" that was built from
create table t as select * from all_objects
you got a 76 bucket histogram on object_id, a unique column, and a 22 bucket histogram on owner, a non-unique column.
In my sample item.product is the primary key, I would have expected more than a two bucket histogram.
ITEM.D_BUS_UNIT is non-unique and very skewed but yet I only got a 7 bucket histogram.
What could be the cause of this? It seems like I got the opposite of what your example shows.
################# END of Q1
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
2 from user_tab_histograms
3 group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
( ownname => 'DMR',
cascade => TRUE,
estimate_percent => 10,
method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO');
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
2 from user_tab_histograms
3 group by table_name, column_name;
TABLE_NAME COLUMN_NAME COUNT(*)
------------------------------ ------------------------- ----------
ITEM PRODUCT 2
ITEM D_BUS_UNIT 2
TIME CURDATE 2
INV_FACT PRODUCT 2
INV_FACT CATEGORY 2
INV_FACT OBSOLETE 2
INV_FACT REPORT_DATE 2
INV_FACT BRANCH_PLANT 2
OSB_FACT PRODUCT 2
OSB_FACT TRX_CLASS 2
OSB_FACT REPORT_DATE 2
BRANCH_BO PRODUCT 2
BRANCH_BO TRX_CLASS 2
BRANCH_BO REPORT_DATE 2
BRANCH_BO BRANCH_PLANT 2
ITEM_FILL PRODUCT 2
ITEM_FILL REPORT_DATE 2
PROD_FACT PRODUCT 2
PROD_FACT REPORT_DATE 2
PROD_FACT BRANCH_PLANT 2
ITEM_BRANCH PRODUCT 2
ITEM_BRANCH BRANCH_PLANT 2
22 rows selected.
##########################
No confusion here; I got exactly the same results as your sample. This is a test DB and hasn't had any SQL run against it for some time.
###########################
SQL> exec dbms_stats.delete_schema_stats( 'DMR' );
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
2 from user_tab_histograms
3 group by table_name, column_name;
no rows selected
SQL> exec dbms_stats.gather_schema_stats
( ownname =>'DMR',
cascade => TRUE,
estimate_percent => 10,
method_opt => 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY');
PL/SQL procedure successfully completed.
SQL> select table_name, column_name, count(*)
2 from user_tab_histograms
3 group by table_name, column_name;
TABLE_NAME COLUMN_NAME COUNT(*)
------------------------------ ------------------------- ----------
ITEM PRODUCT 2
ITEM D_BUS_UNIT 2
TIME CURDATE 76
INV_FACT PRODUCT 201
INV_FACT CATEGORY 2
INV_FACT OBSOLETE 2
INV_FACT REPORT_DATE 99
INV_FACT BRANCH_PLANT 48
OSB_FACT PRODUCT 201
OSB_FACT TRX_CLASS 6
OSB_FACT REPORT_DATE 100
BRANCH_BO PRODUCT 201
BRANCH_BO TRX_CLASS 6
BRANCH_BO REPORT_DATE 89
BRANCH_BO BRANCH_PLANT 24
ITEM_FILL PRODUCT 102
ITEM_FILL REPORT_DATE 3
PROD_FACT PRODUCT 201
PROD_FACT REPORT_DATE 108
PROD_FACT BRANCH_PLANT 31
ITEM_BRANCH PRODUCT 201
ITEM_BRANCH BRANCH_PLANT 57
22 rows selected.
################################
Here's where I really get confused. I listed some statements at the top that show the skewness of
item.d_bus_unit, time.curdate, and inv_fact.obsolete. item.product is the primary key of item, so it's unique, and from your example I would expect the 2-bucket histogram; but d_bus_unit is non-unique and very skewed. Why no histograms here? time.curdate is unique,
so why does it have histograms? inv_fact.obsolete is non-unique and very skewed, but yet again, no histograms. What could be causing this?
Also, I might be a little confused with the terminology and referring to that count as buckets. I saw where you were saying that 2 meant no histograms, just hi/lo endpoints, and that >2 meant histograms.
Thank you for any explanations you could give me, and let me know if you need to see other settings "maybe init.ora" or anything else.
March 09, 2006 - 3:58 pm UTC
what version here - dbms_stats behavior is driven by that greatly.
Also, I likely won't address everything here - this is sort of a review followup area, this is "big" (and I cannot really reproduce, I do not have your data, that makes it "hard")
in addition to above...
Ben, March 09, 2006 - 3:51 pm UTC
Sorry, I forgot to include the banner information
SQL> select banner from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
PL/SQL Release 9.2.0.5.0 - Production
CORE 9.2.0.6.0 Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.5.0 - Production
NLSRTL Version 9.2.0.5.0 - Production
another addition - your response wasn't posted before my last followup
Ben, March 09, 2006 - 4:17 pm UTC
sorry, again for the lack of data. I did think of something else that might impact this.
In my init.ora the compatible parameter is set to 8.1.0 BUT the optimizer_features_enable is set to 9.2.0
I just recently inherited the database so I can't give you a reason why these are the way they are.
March 09, 2006 - 4:25 pm UTC
I'll take a look later when I get back from travel if I get a chance.
Meanwhile and faraway, can you try to generate a test case using mod, rownum, decode whatever to generate dummy data that we can test with ?
a closer comparison
Ben, March 10, 2006 - 1:28 pm UTC
I re-created your exact example by using the all_objects table.
Those stats were almost exact. I think the difference with using skewonly is probably due to my db having fewer schemas than the one you were using.
I'm still confused on some things with my specific data tables.
The time table has a record for each day, with curdate being the actual date and the primary key. The other fields are day, month, year, day of week, day of year, etc. This is a test schema, so there may not be a record for every day from beginning to end. Could this cause the need for histograms on that column? For example, the min date could be 1/1/03 and the max 1/1/06, but it could be missing three months in between, or even all of 2005.
SQL> show user
USER is "DMR"
SQL> create table t as select * from all_objects;
Table created.
SQL> create index t_idx1 on t(object_id);
Index created.
SQL> create index t_idx2 on t(owner);
Index created.
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns', cascade => true );
PL/SQL procedure successfully completed.
SQL> col column_name for a20
SQL> select column_name, count(*)
2 from user_tab_histograms
3 where table_name = 'T'
4 group by column_name
SQL> /
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 76
OWNER 4
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size auto', cascade => true );
PL/SQL procedure successfully completed.
SQL> /
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 2
OWNER 2
SQL> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size skewonly', cascade => true );
PL/SQL procedure successfully completed.
SQL> /
COLUMN_NAME COUNT(*)
-------------------- ----------
OBJECT_ID 2
OWNER 4
SQL> exec dbms_stats.gather_table_stats( user, 'TIME', method_opt=> 'for all indexed columns size skewonly', cascade => true );
PL/SQL procedure successfully completed.
SQL> select column_name, count(*)
2 from user_tab_histograms
3 where table_name = 'TIME'
4 group by column_name
SQL> /
COLUMN_NAME COUNT(*)
-------------------- ----------
CURDATE 76
March 10, 2006 - 8:29 pm UTC
we flipped from T to TIME? I didn't follow? What is TIME exactly?
explanation
Ben, March 14, 2006 - 8:42 am UTC
I tried to explain in the last paragraph at the top before the example. I went through and recreated your example with almost exact results. I then posted numbers from a TIME.CURDATE field that is a primary key that holds the date. In production there is a record for every day, but the test instance has some gaps. I was asking if that is why I would get histograms on that column.
If I am running dbms_stats.gather_schema_stats(user, 'TIME', method_opt=> 'for all
indexed columns size skewonly', cascade => true );
Why would a single-column primary key of VARCHAR2 datatype get a histogram generated? That could explain a lot to me. I thought if a column is unique it would only need one bucket. Is that not true for VARCHAR2 columns?
If, on a smaller scale, my column looks like:
10-245
10-456
10-984
10-999
15-875
50-3000
50-450
50-451
50-780
90-800
That is a unique column but would it get histograms generated if I used "for all indexed columns size skewonly"?
On the other hand, I also have a CHAR(1) column that is either Y or N. Most of the 100,000 records have an N, yet I only get a 1-bucket histogram for this column. I wouldn't think that is sufficient. Example:
N
N
N
Y
N
N
N
N
N
Y
Y
N
N
I am only getting a 1-bucket histogram for a column like this, of course on a much larger scale.
If one value out of two possible makes up less than 5% or 10%, are histograms not created?
analyze, and gather_stats schemas
Mohammed Abdul Samad, April 24, 2006 - 4:59 am UTC
This is good and useful, but I need more detail with an example. Say we are gathering stats every week - how do we schedule that with dbms_job? Can you give me a complete example? Hope you got the point.
April 24, 2006 - 5:45 am UTC
what issue do you have with dbms_job? It is pretty straightforward.
You have a procedure you write to gather stats (or just call dbms_stats)
You schedule it.
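A minimal sketch of that - the schema name, procedure name, and the Sunday 2:00 AM timing below are placeholders to adapt:

```sql
-- Sketch only: wrap the gather in a procedure, then schedule it weekly
-- with dbms_job. 'APP_OWNER' and the timing expression are examples.
create or replace procedure weekly_stats
as
begin
   dbms_stats.gather_schema_stats( ownname => 'APP_OWNER', cascade => true );
end;
/

variable jobno number
begin
   dbms_job.submit(
      job       => :jobno,
      what      => 'weekly_stats;',
      next_date => trunc(next_day(sysdate, 'SUNDAY')) + 2/24,
      interval  => 'trunc(next_day(sysdate, ''SUNDAY'')) + 2/24' );
   commit;
end;
/
```

The commit is what makes the job visible to the job queue processes.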
Problem with table analyze.
Abdul Seerae, May 02, 2006 - 4:15 pm UTC
Dear Tom,
We have the following script running daily (AT 2.00 AM) to gather statistics on a specific schema.
[ dbms_stats.gather_schema_stats(ownname=>'ABCDOWNER',cascade=>true); ]
We are facing a problem with this script. The table analyze stops at a specific table
(say 'Problem_tbl') and will not proceed any further. It gives no error message and does not update any logs - it just stays there forever. Checking through OEM or SQL*Plus, we can see the 'last analyzed' date updated for all tables before the 'Problem_tbl' table in this schema.
This problem_tbl is not very big (only 16,000 rows) and we have many bigger ones in this schema.
When we analyze this table individually, we get the same result: it hangs forever.
[ dbms_stats.gather_table_stats(ownname=> 'ABCDOWNER', tabname=> 'PROBLEM_TBL', partname=> NULL); ].
We can analyze all other tables without any problem. To rule out any table corruption, we exported and imported this table to our development server. The data seems OK.
Once we restart the database, this problem goes away for a few weeks before it reappears.
Any suggestions / thoughts in this respect will be highly appreciated.
O/S: Windows 2003.
Oracle : 9i (9.2.0.6.0)
May 03, 2006 - 1:33 am UTC
(this does seem like an OBVIOUS support issue, I always wonder why that path isn't taken...)
I would peek in the v$ views to see what this session is waiting for when it gets "stuck".
v$session_wait
v$session_event
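For instance (sid 123 below is a placeholder for the stuck session's SID):

```sql
-- Current wait of the "stuck" session
select sid, event, p1text, p1, p2text, p2, seconds_in_wait
  from v$session_wait
 where sid = 123;

-- Where that session has spent its time overall
select event, total_waits, time_waited
  from v$session_event
 where sid = 123
 order by time_waited desc;
```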
Locks
Ik, May 08, 2006 - 6:53 am UTC
Tom,
Two questions:
Does ANALYZE or DBMS_STATS against a table lock the table in any mode? I came across an old Oracle Press book which said so, but could not simulate it (10g R2). Was it true in older versions of Oracle?
If the table is locked by another session, would analyze/dbms_stats wait for that lock to be released?
Thanks,
May 08, 2006 - 8:22 am UTC
certain variations of analyze can (to validate structures and such), but dbms_stats just runs queries basically.
Doubts
Ik, May 09, 2006 - 10:53 am UTC
Thanks a lot Tom.
I have some related questions.
1. Why do we have the force parameter for gather_table_stats? It is FALSE by default. Does this mean dbms_stats.gather_table_stats errors out (or does it wait?) if the table is locked?
2. no_invalidate clause -- The documentation says "Does not invalidate the dependent cursors if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE."
Invalidating dependent cursors means, all the queries that have used this table and are present in the shared pool will be invalidated and will be hard parsed if executed again?
Was this the default behaviour with ANALYZE?
3. DBMS_STATS.GATHER_TABLE_STATS
BEGIN
DBMS_STATS.GATHER_TABLE_STATS
(
OWNNAME => 'TEST'
,TABNAME => 'RTEST'
,ESTIMATE_PERCENT => 1
,GRANULARITY => 'GLOBAL'
,CASCADE => FALSE
);
END;
This table has 140 partitions, 6018191841 rows and 80538932 blocks.
The above command takes 80 minutes.
If i change block_sample => TRUE, it completes in less than a minute.
I found this query (modified) to be the most time consuming one and waits on "db file scattered reads"
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */ *
from rtest sample (1) t
When the block_sample is made TRUE, query changes to
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */ *
from rtest sample block (1) t
and is much faster.
Tom, I found that sample rows and sample block return different numbers of rows. Which is more accurate when used within dbms_stats, and why?
Thanks a lot for your time.
Thanks,
May 09, 2006 - 11:08 am UTC
1) force overrides statistics that were "locked", you can "lock" statistics - make it so a normal "gather" won't overwrite them. Maybe they could have called it "freezing" statistics instead of locking them.
....
force When value of this argument is TRUE, deletes column statistics even if locked
.......
2) correct. analyze would invalidate cursors as well.
3) You need to understand the difference between block and row sampling.
One of them samples 1% of the BLOCKS and processes any rows it finds on them.
The other samples 1% of the rows (read the rows, but only look at 1 out of every 100 of them).
You would certainly expect them to return different statistics since they look at radically different data in very different ways.
Thanks
Ik, May 09, 2006 - 11:23 am UTC
Many thanks Tom for the very fast and precise response.
When row sampling (1%) is used, Oracle reads every row from the table but considers only 1 out of every hundred rows. In effect, the whole table is read.
The difference between this and COMPUTE statistics would then be that COMPUTE considers all the rows whereas sample rows considers only 1% of the rows. But, in both cases the entire table is read.
So we save mostly on CPU by using sample rows compared to COMPUTE? Am I correct in saying so?
Also, the Oracle documentation says "Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated." How is random block sampling more efficient?
I am sorry for asking too many questions, Tom, but these are doubts to which very few people have an answer.
May 09, 2006 - 12:12 pm UTC
you saw how random block sampling is more efficient - it only read 1% of the blocks, it ran a lot faster.
But it might get entirely the wrong answer by accident since it reads 1% of the blocks and could be subject to skew (load sorted data and it'll see lots of "the same values" on any given block which could be unnatural)
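To make that concrete, here is a small simulation - plain Python, not Oracle internals, with made-up block size and row counts - showing that on perfectly sorted data a 1% block sample sees far fewer distinct values than a 1% row sample, even though both examine the same number of rows:

```python
# Toy model: 1000 "blocks" of 100 rows each, loaded sorted so every row
# in a block carries the same value (1000 distinct values in total).
import random

random.seed(42)

ROWS_PER_BLOCK = 100
N_BLOCKS = 1000
blocks = [[b] * ROWS_PER_BLOCK for b in range(N_BLOCKS)]
all_rows = [v for blk in blocks for v in blk]

# Row sampling: examine 1% of the rows, drawn from the whole table.
row_sample = random.sample(all_rows, len(all_rows) // 100)

# Block sampling: read 1% of the blocks and take every row on them.
block_sample = [v for blk in random.sample(blocks, N_BLOCKS // 100)
                for v in blk]

print(len(row_sample), len(block_sample))   # both examine 1000 rows
print(len(set(row_sample)))                 # hundreds of distinct values seen
print(len(set(block_sample)))               # only 10 distinct values seen
```

With the data loaded sorted, the block sample's view of the column is wildly unrepresentative - exactly the skew effect described above.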
Alberto Dell'Era, May 09, 2006 - 11:30 am UTC
> Maybe they could have called it "freezing" statistics
> instead of locking them.
It's worth mentioning the different behaviour when trying to gather stats on a locked table versus the schema/db that contains it:
dellera@ORACLE10> exec dbms_stats.lock_table_stats (user, 't');
PL/SQL procedure successfully completed.
dellera@ORACLE10> exec dbms_stats.gather_table_stats (user, 't');
BEGIN dbms_stats.gather_table_stats (user, 't'); END;
*
ERROR at line 1:
ORA-20005: object statistics are locked (stattype = ALL)
ORA-06512: at "SYS.DBMS_STATS", line 13056
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
dellera@ORACLE10> exec dbms_stats.gather_schema_stats (user);
PL/SQL procedure successfully completed.
The first is definitely a "lock" as in "you bang your head into the locked door" ;) - as a non-native speaker would express it ...
May 09, 2006 - 12:12 pm UTC
but not to be confused with a table lock or a row lock :)
David Aldridge, May 09, 2006 - 12:50 pm UTC
>> But it might get entirely the wrong answer by accident since it reads 1% of the
blocks and could be subject to skew (load sorted data and it'll see lots of "the
same values" on any given block which could be unnatural)
It's also sensitive to skewed numbers of rows per block, and the 1% sample is by no means guaranteed
http://oraclesponge.blogspot.com/2005/06/statistics-by-block-sampling.html
dbms_stats for IOTs
Ik, May 16, 2006 - 8:11 am UTC
Tom,
This is 10g R2.
I noticed that when gathering stats on 10g using dbms_stats (specifying partname, granularity => 'PARTITION' and estimate_percent => <some number>)
performance on IOTs is very poor compared to tables of similar or much bigger size.
The SQL it is waiting for is
INSERT /*+ append*/ into sys.<temp table>
SELECT .... FROM <iot> sample (0.001) t
WHERE TBL$OR$IDX$PART$NUM(<iot>,0,4,0,ROWID) = :objn
It is waiting on 'db file sequential read', and the P1 column of v$session_wait shows almost every datafile that this IOT spans, including other partitions. My point is that the granularity keyword seems to have no impact on IOTs.
Is this how it should be?
This command is taking more than an hour to run. We have around 100 partitions.
Tom, whenever a granularity of PARTITION is specified for a table, index, or IOT, shouldn't it go against that partition alone?
In cases where granularity => 'PARTITION', would Oracle attempt to gather global statistics as well, by simply rolling up from the individual partitions each time?
Please let us know.
A reader, June 03, 2006 - 10:05 pm UTC
Hi Tom,
A basic question, but I am a bit confused about it. I created one schema with some tables in it. Do I need to gather stats for them before using them, or will Oracle take care of it once they start getting populated? I am a bit confused about whether we need to gather stats for empty tables. I am asking because I was asked to do so; maybe I didn't understand it properly and might have to go back and confirm.
Thx
Santosh.
June 04, 2006 - 8:16 am UTC
The only answers to all questions are:
a) why
b) it depends
The answer to this is "it depends"
are you using an older release and using the rule based optimizer (RBO) - then the answer is no, because you are using the RBO and the RBO doesn't want statistics
are you using current software and using the cost based optimizer (CBO)? Then we need to know how current - is it 10g current where dynamic sampling will kick in automatically or is it 8i/9i current where sampling does not kick in automatically.
In 8i/9i - you would be responsible for collecting the statistics. In 10g, it'll dynamically sample the table the first time it hard parses a query and if you are using the automatic job that installs with the database, it'll gather statistics for you on these objects in the future as the job runs over time.
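(As an aside, dynamic sampling can also be requested explicitly for a given query via a hint - a sketch, with a made-up table name:)

```sql
-- Level-2 dynamic sampling for this query only; NEW_TABLE is a placeholder
select /*+ dynamic_sampling(t 2) */ count(*)
  from new_table t;
```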
Dbms_stats.gather_schema_stats
Vikas, June 04, 2006 - 9:49 am UTC
Hi Tom,
We are currently managing a database with 461 GB of data. The database server is hosted on a Linux machine with 4 CPUs, 8 disk devices on 4 disk controllers, and 16 GB of RAM.
We have been running the package below, which uses dbms_stats.gather_schema_stats, to collect the statistics that help the optimizer generate optimized execution plans. But recently, with the data growing by about 1 GB every day, gathering statistics has really slowed down and is taking 5 hours to complete.
Create or replace procedure gather_db_stats AS
BEGIN
declare
Path VARCHAR2(100) ;
Filename VARCHAR2(100) := 'Dbmsstats.txt';
Output_file utl_file.file_type;
l_Start NUMBER;
l_tname VARCHAR2(30);
l_partition_name VARCHAR2(30);
rebuild_partition_index VARCHAR2(500);
err_num NUMBER;
err_msg VARCHAR2(500);
BEGIN
Select value into path from v$parameter where name = 'utl_file_dir';
output_file := utl_file.fopen (path,filename, 'a');
for x in (Select * from DWUSER.Rolling_tables where status = 'Y') loop
l_tname := x.table_name;
for table_specific_partitions in (Select all_ind_partitions.partition_name part_name from all_indexes,all_ind_partitions
Where all_indexes.index_name = all_ind_partitions.index_name
And all_indexes.table_name = l_tname
And all_ind_partitions.Status = 'UNUSABLE') loop
begin
l_partition_name := table_specific_partitions.part_name;
rebuild_partition_index := 'Alter table ' || l_tname || ' modify partition ' || l_partition_name || ' rebuild unusable local indexes';
execute immediate rebuild_partition_index;
EXCEPTION
When OTHERS then
err_num := SQLCODE;
err_msg := Substr(SQLERRM,1,500);
utl_file.put_line (output_file,'Date : ' || to_Char(sysdate,'dd/mm/yyyy hh24:mi:ss') || ' Error generated while rebuilding the UNUSABLE local indexes with an error message ' || err_msg);
utl_file.fclose(output_file);
return;
end;
end loop;
end loop;
for x in (select Username from dba_users where username not in ('SYS','SYSTEM'))
lOOP
l_start := dbms_utility.get_time;
dbms_stats.gather_schema_stats(ownname => x.username,
estimate_percent => dbms_stats.auto_sample_size,
method_opt => 'FOR ALL INDEXED COLUMNS SIZE SKEWONLY',
degree => 16,
cascade => true,
options => 'GATHER AUTO');
utl_file.put_line (output_file,'Date : ' || to_Char(sysdate,'dd/mm/yyyy hh24:mi:ss') || ' The total time spent to gather stats for the schema owner : ' || x.username || ' is ' || round((dbms_utility.get_time - l_start)/100,2) || 'seconds');
end loop;
END;
END;
/
We just don't have a 5-hour window to get this done. We have 4 partitioned tables which hold most of the data (could be 350M).
Can we use gather_table_stats instead of gather_schema_stats, for only those partitions whose last-analyzed date is NULL, since our application doesn't perform any delete or insert activity once the end of day completes? We have range partitioning based on date, with daily partitions.
Will this help? Or let us know if we can configure some of the gather_schema_stats parameter values to get the job done much faster. We don't need to build histograms for the columns.
Thanks in anticipation
June 04, 2006 - 4:16 pm UTC
The answer is as always one of two:
a) why
b) it depends
your answer is b), it depends.
Can you just do partition stats? do your queries always hit a SINGLE partition, and is that partition known at hard parse time?
If your query hits more than one partition OR your query doesn't let us figure out which partition it will hit - then GLOBAL TABLE stats are used to optimize (not local partition statistics).
And the global table stats could be slightly "wrong" for columns if you don't maintain them over time.
eg: if you have 2 partitions - and each partition has 500,000 rows - it is easy enough to know there are 1,000,000 rows - that works OK from the local partition statistics.
however, if in those 2 partitions there is a column X that has 10,000 distinct values in partition 1 and 10,000 distinct values in partition 2 - how many distinct values does it have "globally"? Somewhere between 10,000 and 20,000 - which can be the difference between "night and day" to the optimizer.
There is no cut-and-dried answer here - you might have to "help" on some columns using the dbms_stats.set_*** procedures to set the column stats for us if you know them (and you are gathering histograms there - you said you don't need to, so why?)
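A sketch of what that "helping" could look like - the table name, column name, and the 15,000 figure below are placeholders for values you actually know to be right:

```sql
-- Sketch: manually set the global distinct-value count for one column
begin
   dbms_stats.set_column_stats(
      ownname => user,
      tabname => 'BIG_TABLE',   -- placeholder
      colname => 'X',           -- placeholder
      distcnt => 15000,         -- the global NDV you know to be correct
      density => 1/15000 );
end;
/
```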
dbms_stats
geepee, July 05, 2006 - 4:56 am UTC
Dear Tom,
We have around 150 fast growing/changing transactional data tables in our production system.
Database is 10gR2.
Growth of these tables is from 0.1 million to 1 million in six months.
A few tables are partitioned tables.
Which is the better way to analyze these fast-growing tables: dynamic sampling, or scheduling a dbms_stats run for these tables? Please suggest.
Regards.
July 08, 2006 - 7:51 am UTC
dynamic sampling will happen only at the first hard parse time (so if you never shutdown - you never get different plans - assuming you have used bind variables - which you must be since you said "we are transactional"). It would not be appropriate.
More likely, you would use dbms_stats to gather stats on STALE tables (tables are all set to monitoring in 10g; you can use "gather stale" as the option to avoid gathering stats on tables that haven't changed much).
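That could look something like this (the schema name is a placeholder):

```sql
-- Sketch: gather only on tables whose monitoring info marks them stale
begin
   dbms_stats.gather_schema_stats(
      ownname => 'APP_OWNER',     -- placeholder
      options => 'GATHER STALE',  -- skip tables that haven't changed much
      cascade => true );
end;
/
```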
Table statistics and 3113 end-of-file on comm channel
Jon Waterhouse, July 12, 2006 - 11:38 am UTC
I ran into the following problem on XE, which I find difficult to explain. I had the following query:
select count(*),urban
from case@caps s
inner join res_address@caps r ON r.res_address_id = s.res_address_id
inner join ref.good_postal_code p ON p.postal_code=r.postal_code
WHERE s.adults=1 and s.children > 0 -- is single parent
AND exists (select * from payment@caps p
WHERE s.case_id = p.case_id
AND p.payment_date between to_date('200605','yyyymm') and to_date('200606','yyyymm')-1)
GROUP BY urban
which worked fine. Note that one table is local to XE and the other two are remote. I then modified the query to take out the two predicates that restricted it to single parents, and got a 3113 error running an explain plan. I found that adding some restrictions back (e.g. s.adults in (1,2,3)) would allow the query to run -- even though this in no way restricts the query; all records satisfy it. I also found that running ANALYZE TABLE on ref.good_postal_code allowed the query to run.
Since I wasn't sure if it was possible to delete stats, I then did a CTAS to create a new table exactly like good_postal_code but with no stats. Surprisingly, the query ran with no problem.
I read somewhere that resource limits can generate 3113s. Is that what is happening here? I.e., does the explain plan process decide that the query is going to be too big and return the 3113?
July 12, 2006 - 5:04 pm UTC
doubtful - can you reproduce in "not XE"?
ZB, July 24, 2006 - 12:44 pm UTC
system@oip1> create or replace procedure ods_analyze
2 is
3 begin
4 execute immediate 'execute DBMS_STATS.GATHER_SCHEMA_STATS(''ODS'')';
5 end;
6 /
Procedure created.
system@oip1> exec ods_analyze;
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-00900: invalid SQL statement
ORA-06512: at "SYSTEM.ODS_ANALYZE", line 4
ORA-06512: at line 1
Tom, why did I get the error?
July 24, 2006 - 2:54 pm UTC
why would you try to dynamically execute plsql - plsql that never changes???????
...
is
begin
dbms_stats.gather_schema_stats( 'ODS' );
end;
you have plsql
you are writing plsql
just call it
execute is a "sqlplus-ism". you can see this from your own example:
system@oip1> exec ods_analyze;
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-00900: invalid SQL statement
see - it says "BEGIN ods_analyze; END;" - exec is just shorthand for sqlplus to glue begin and end on it. execute is NOT plsql - it is not part of the language at all.
ZB, July 24, 2006 - 3:07 pm UTC
Thanks Tom, now I got the procedure compiled.
But got a new error:
system@oip1> exec ods_analyze
BEGIN ods_analyze; END;
*
ERROR at line 1:
ORA-20000: Insufficient privileges to analyze an object in Schema
ORA-06512: at "SYS.DBMS_STATS", line 7649
ORA-06512: at "SYS.DBMS_STATS", line 7779
ORA-06512: at "SYS.DBMS_STATS", line 7830
ORA-06512: at "SYS.DBMS_STATS", line 7809
ORA-06512: at "SYSTEM.ODS_ANALYZE", line 4
ORA-06512: at line 1
system@oip1> exec DBMS_STATS.GATHER_SCHEMA_STATS('ODS');
PL/SQL procedure successfully completed.
But I could do it directly.
July 24, 2006 - 4:17 pm UTC
STOP USING SYSTEM
you have no reason to use system, it is "ours", stop using it.
create your own account
and read:
http://asktom.oracle.com/Misc/RolesAndProcedures.html
grant it "analyze any" directly (no roles)
or better yet....
just write this as the ODS user, schedule it as the ODS user.
but don't use system!
Can a table be ANALYZE during the DML operations?
Ravi, August 09, 2006 - 1:01 am UTC
Hi Tom,
1. Can a table be ANALYZE during (to collect statistics) the DML (INSERT,
UPDATE, DELETE) operations?
2. If the column in WHERE caluse has an index on it, whether the statistics are
much concerns to use the index during DMLs?
3. Having statistics and not having statisttics will affect the use of the
index during DMLs? how?
4. If we use (within the same database) ANALYZE for few tables and DBMS_STATS
package to gather statistics for some other tables, will there be a performance
issue by using these two together?
Thanks,
Ravi
August 09, 2006 - 10:42 am UTC
1) you may use dbms_stats to gather statistics while DML is happening, yes.
2) sure. It is all about estimated cardinality.
3) you need to have statistics to use the cost based optimizer (CBO) safely. If you do not, you will not be able to use the CBO safely.
4) do not use analyze, use dbms_stats
DBMS_STATS & ANALYZE statement Oracle10g R2
Suvendu, August 19, 2006 - 7:06 am UTC
Hi Tom,
Below is a test case which appears to show wrong information from DBMS_STATS in Oracle 10g R2; the information is correct in Oracle 9i R2. As per Metalink, the CBO will get exact information when we go with COMPUTE STATISTICS, where the NUM_ROWS and SAMPLE_SIZE columns will show the same values. I also checked via query execution.
SQL> create table t2 as select * from t1;
Table created.
SQL> select count(1) from t2;
COUNT(1)
----------
735392
SQL> exec DBMS_STATS.GATHER_TABLE_STATS(user, TABNAME => 'T2',GRANULARITY => 'DEFAULT' ,METHOD_OPT => 'FOR COLUMNS');
PL/SQL procedure successfully completed.
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T2';
TABLE_NAME NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T2 735392 51451
SQL> REM Here the sample size (51451) does not match the row count of 735392.
SQL> analyze table t2 compute statistics;
Table analyzed.
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T2';
TABLE_NAME NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T2 735392 735392
SQL> REM Here all values are accurate.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SQL>
======================
Oracle 9i R2
======================
SQL> select count(1) from t1;
COUNT(1)
----------
27525120
Elapsed: 00:01:07.05
SQL> exec DBMS_STATS.GATHER_TABLE_STATS(user, TABNAME => 'T1',GRANULARITY => 'DEFAULT' ,METHOD_OPT => 'FOR COLUMNS', DEGREE=>8);
PL/SQL procedure successfully completed.
Elapsed: 00:00:25.22
SQL> select table_name, num_rows, sample_size from user_tables where table_name='T1';
TABLE_NAME NUM_ROWS SAMPLE_SIZE
------------------------------ ---------- -----------
T1 27525120 27525120
Elapsed: 00:00:00.08
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE 9.2.0.6.0 Production
TNS for HPUX: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
Elapsed: 00:00:00.08
albprod@ALBBPM>
Could you please elaborate on this difference in DBMS_STATS behavior in Oracle 10g?
Thanking you.
Regards,
Suvendu
August 19, 2006 - 9:15 am UTC
that is not "wrong"
That is "correct" actually.
If you check out the docs for dbms_stats, you'll find the specification of gather table stats in 10gr2 is:
http://docs.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#sthref8129
DBMS_STATS.GATHER_TABLE_STATS (
ownname VARCHAR2,
tabname VARCHAR2,
partname VARCHAR2 DEFAULT NULL,
estimate_percent NUMBER DEFAULT to_estimate_percent_type
(get_param('ESTIMATE_PERCENT')),
block_sample BOOLEAN DEFAULT FALSE,
method_opt VARCHAR2 DEFAULT get_param('METHOD_OPT'),
degree NUMBER DEFAULT to_degree_type(get_param('DEGREE')),
granularity VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
cascade BOOLEAN DEFAULT to_cascade_type(get_param('CASCADE')),
stattab VARCHAR2 DEFAULT NULL,
statid VARCHAR2 DEFAULT NULL,
statown VARCHAR2 DEFAULT NULL,
no_invalidate BOOLEAN DEFAULT to_no_invalidate_type (
get_param('NO_INVALIDATE')),
force BOOLEAN DEFAULT FALSE);
and that:
ops$tkyte%ORA10GR2> select dbms_stats.get_param( 'estimate_percent' ) from dual;
DBMS_STATS.GET_PARAM('ESTIMATE_PERCENT')
-------------------------------------------------------------------------------
DBMS_STATS.AUTO_SAMPLE_SIZE
you are using AUTO_SAMPLE_SIZE; therefore, it is free to sample.
It performed as documented - but things do change (and note that the row count it arrived at is pretty much dead on in 10gR2 anyway....)
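If you want the 9i-style result shown above (sample size equal to the row count), you can bypass the AUTO_SAMPLE_SIZE default by passing estimate_percent explicitly - a sketch against the T2 table from the example:

```sql
-- Force a 100% compute so NUM_ROWS and SAMPLE_SIZE will match
exec DBMS_STATS.GATHER_TABLE_STATS( user, tabname => 'T2', estimate_percent => 100 );

select table_name, num_rows, sample_size
  from user_tables
 where table_name = 'T2';
```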
Help needed
Anil P, August 23, 2006 - 5:09 am UTC
Hello,
This query is taking more than 12 hours to execute. Can you let me know which areas I need to look at to tune it?
Thanks
Anil P
SQL> select z.IND_ID,
2 z.BLK_OPEN_DT,
3 z.ORG_SRC_CD,
4 z.PRM_SRC_CD,
5 c.calendar_month_name_abbr ACC_OPEN_MTH_ABR,
6 c.calendar_month ACC_OPEN_MTH,
7 c.calendar_year ACC_OPEN_YEAR
8 from mrtlookup.calendar_date c,
9 (select ind.individual_id IND_ID,
10 x.BLK_OPEN_DT,
11 ind.original_source_cd ORG_SRC_CD,
12 ind.primary_source_cd PRM_SRC_CD
13 from mrtcustomer.individual ind,
14 (select prp.individual_id IND_ID, min(prp.open_dt) BLK_OPEN_DT
15 from mrtcustomer.proprietary_account prp,
16 mrtcustomer.individual_segment isg,
17 (select (c.calendar_month_end_dt + 1) BLK_OPEN_START_DT,
18 (select max(c.calendar_month_end_dt)
19 from mrtlookup.calendar_date c,
20 (SELECT PERIOD_DATES.WEEK_END_DT computation_dt
21 FROM MRTLOOKUP.PERIOD_DATES PERIOD_DATES) e
22 where c.calendar_month_end_dt <=
23 e.computation_dt) END_DT
24 from mrtlookup.calendar_date c
25 where c.calendar_date =
26 (select min(ind.individual_create_dt)
27 from mrtcustomer.individual ind
28 where ind.original_source_cd = 'TEL')) y
29 where y.BLK_OPEN_START_DT <= prp.open_dt
30 and y.END_DT >= prp.open_dt
31 and prp.division_ind is not null
32 and isg.individual_id = prp.individual_id
33 and isg.marketing_segment_cd = 'MAILABLE'
34 group by prp.individual_id) x
35 where ind.primary_source_cd = 'GEC'
36 and ind.individual_id = x.IND_ID) z
37 where c.calendar_date = z.BLK_OPEN_DT;
1143156 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=204854 Card=37843
Bytes=2194894)
1 0 HASH JOIN (Cost=204854 Card=37843 Bytes=2194894)
2 1 TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Cost=68
Card=14640 Bytes=278160)
3 1 NESTED LOOPS (Cost=204785 Card=37843 Bytes=1475877)
4 3 VIEW (Cost=166771 Card=37843 Bytes=832546)
5 4 SORT (GROUP BY) (Cost=166771 Card=37843 Bytes=200567
9)
6 5 NESTED LOOPS (Cost=58475 Card=37843 Bytes=2005679)
7 6 NESTED LOOPS (Cost=27419 Card=31019 Bytes=108566
5)
8 7 TABLE ACCESS (BY INDEX ROWID) OF 'CALENDAR_DAT
E' (TABLE) (Cost=2 Card=1 Bytes=16)
9 8 INDEX (UNIQUE SCAN) OF 'PK_CD' (INDEX (UNIQU
E)) (Cost=1 Card=1)
10 9 SORT (AGGREGATE)
11 10 PARTITION RANGE (ALL) (Cost=107721 Card=
1302415 Bytes=15628980)
12 11 TABLE ACCESS (FULL) OF 'INDIVIDUAL' (T
ABLE) (Cost=107721 Card=1302415 Bytes=15628980)
13 7 TABLE ACCESS (FULL) OF 'PROPRIETARY_ACCOUNT' (
TABLE) (Cost=27417 Card=31019 Bytes=589361)
14 13 SORT (AGGREGATE)
15 14 NESTED LOOPS (Cost=78 Card=732 Bytes=11712
)
16 15 TABLE ACCESS (FULL) OF 'PERIOD_DATES' (T
ABLE) (Cost=9 Card=1 Bytes=8)
17 15 TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (
TABLE) (Cost=69 Card=732 Bytes=5856)
18 6 PARTITION RANGE (ITERATOR) (Cost=1 Card=1 Bytes=
18)
19 18 INDEX (UNIQUE SCAN) OF 'PK_ISEG_A' (INDEX (UNI
QUE)) (Cost=1 Card=1 Bytes=18)
20 3 PARTITION RANGE (ITERATOR) (Cost=2 Card=1 Bytes=17)
21 20 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'INDIVIDUAL'
(TABLE) (Cost=2 Card=1 Bytes=17)
22 21 INDEX (UNIQUE SCAN) OF 'PK_IND' (INDEX (UNIQUE)) (
Cost=1 Card=1)
Statistics
----------------------------------------------------------
274 recursive calls
4 db block gets
14526033 consistent gets
1299590 physical reads
232 redo size
38571751 bytes sent via SQL*Net to client
838832 bytes received via SQL*Net from client
76212 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1143156 rows processed
August 27, 2006 - 3:23 pm UTC
are the ESTIMATED cardinalities in the explain plan above "realistic" (I'd say "no" based on the returned rows and the estimated returned rows)
are the statistics "up to date"
Thanks
Anil P, August 29, 2006 - 9:20 am UTC
Hello Tom,
Yes, the statistics are current and realistic. And it's a production database on Oracle 10g. Here's the latest explain plan output.
This query was generated by a Cognos tool.
Can you tell me which areas need to be looked into?
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3257 Card=1 Bytes=
103)
1 0 FILTER
2 1 SORT (GROUP BY) (Cost=3257 Card=1 Bytes=103)
3 2 MERGE JOIN (CARTESIAN) (Cost=1344 Card=445 Bytes=45835
)
4 3 NESTED LOOPS (Cost=1281 Card=1 Bytes=77)
5 4 NESTED LOOPS (Cost=1280 Card=1 Bytes=59)
6 5 NESTED LOOPS (Cost=369 Card=455 Bytes=15925)
7 6 TABLE ACCESS (BY INDEX ROWID) OF 'CALENDAR_DAT
E' (TABLE) (Cost=2 Card=1 Bytes=16)
8 7 INDEX (UNIQUE SCAN) OF 'PK_CD' (INDEX (UNIQU
E)) (Cost=1 Card=1)
9 8 SORT (AGGREGATE)
10 9 TABLE ACCESS (FULL) OF 'INDIVIDUAL' (TAB
LE) (Cost=1847 Card=81250 Bytes=975000)
11 6 TABLE ACCESS (FULL) OF 'PROPRIETARY_ACCOUNT' (
TABLE) (Cost=367 Card=455 Bytes=8645)
12 11 SORT (AGGREGATE)
13 12 NESTED LOOPS (Cost=66 Card=740 Bytes=11840
)
14 13 TABLE ACCESS (FULL) OF 'PERIOD_DATES' (T
ABLE) (Cost=2 Card=1 Bytes=8)
15 13 TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (
TABLE) (Cost=64 Card=740 Bytes=5920)
16 5 TABLE ACCESS (BY INDEX ROWID) OF 'INDIVIDUAL' (T
ABLE) (Cost=2 Card=1 Bytes=24)
17 16 INDEX (UNIQUE SCAN) OF 'PK_IND' (INDEX (UNIQUE
)) (Cost=1 Card=1)
18 4 INDEX (UNIQUE SCAN) OF 'PK_ISEG' (INDEX (UNIQUE))
(Cost=1 Card=1 Bytes=18)
19 3 BUFFER (SORT) (Cost=1344 Card=14792 Bytes=384592)
20 19 TABLE ACCESS (FULL) OF 'CALENDAR_DATE' (TABLE) (Co
st=63 Card=14792 Bytes=384592)
Thanks
Anil
August 29, 2006 - 3:28 pm UTC
eh? this is totally different from above.
anyway - I will not be able to automagically fix this
Initial Load and statistics
A reader, September 08, 2006 - 8:51 am UTC
Hi Tom,
I am doing a massive initial load. During this initial load I am inserting into several tables and at the same time selecting from these tables. This is why I kept the indexes usable. In addition, when I am 50% of the way through my initial load I compute statistics using:
DBMS_STATS.GATHER_SCHEMA_STATS (schema_name, 30)
I wonder if you would advise me to do this or not?
I want to remove this call to DBMS_STATS, but I want to be sure that this will improve the performance of my process.
Thanks for your advice
September 09, 2006 - 11:46 am UTC
depends on whether the queries you issue would benefit from this regather.
It is quite possible that you need not gather stats - but since YOU are doing the load (you know the stats already!!! you know the number of rows loaded, you know the average row width, you can guess at the number of blocks and so on) you could just call SET statistics and provide the correct values.
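As a sketch of that idea (the table name and values below are hypothetical placeholders - you would plug in the counts you actually know from your own load):

```sql
begin
   -- The loading process knows what it loaded, so it can SET the
   -- statistics directly instead of re-gathering them.
   dbms_stats.set_table_stats(
      ownname => user,
      tabname => 'MY_LOAD_TABLE',  -- hypothetical table name
      numrows => 1000000,          -- rows you know you just loaded
      numblks => 25000,            -- your estimate of the block count
      avgrlen => 180 );            -- average row width known from the data
end;
/
```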
GATHER STALE
Neeraj Nagpal, September 15, 2006 - 2:49 pm UTC
Hi Tom,
I have a question for you regarding the GATHER STALE option of DBMS_STATS. Here is some quick background about my database setup: I have 4 very large composite-partitioned tables - with 50 range partitions and 64 hash subpartitions in each range partition. Each of these 4 tables has on average around 120 million rows, with somewhere around 250 columns each.
Now, the data in these 50 partitions is pretty skewed in terms of the number of records in them. And each of these partitions has its own peculiar update cycle - some states are updated much more than others, and some get more inserts than others. Now, what I want to do is that instead of indiscriminately issuing
"exec dbms_stats.gather_schema_stats( user, options => 'gather stale' );" I want to get just a listing of the skewed partitions within the tables and selectively analyze the partitions that need it most. Is there a system table which can give me information about the particular partitions that are STALE in a table?
Thanks so much for your help,
Neeraj
September 16, 2006 - 2:27 pm UTC
look at "LIST STALE" as the option
Please answer
Reader, September 21, 2006 - 3:50 am UTC
Sir,
Please clarify my doubt.
q1) Why is CUSLED_KEY_1 Index being used instead of CUSLED_PK Primary Key index?
q2) Why its filtering ACCOUNT and AGE_PERIOD instead of using the PK?
q3) Or whatever is happening is correct?
But when I disable CUSLED index it uses CUSLED_PK index.
SQL> select index_name, substr(column_name,1,40) cname from all_ind_columns
2 where table_name = 'CUSLED';
INDEX_NAME CNAME
------------------------------ ----------------------------------------
CUSLED_PK LEDGER
CUSLED_PK ACCOUNT
CUSLED_PK AGE_PERIOD
CUSLED_PK DOC_REF
CUSLED_KEY_1 LEDGER
CUSLED_KEY_1 DOC_REF
CUSLED_KEY_2 LEDGER
CUSLED_KEY_2 ACCOUNT_REF
CUSLED_KEY_3 LEDGER
CUSLED_KEY_3 CHEQUE_NO
CUSLED_ACCOUNT_FK ACCOUNT
SQL> SELECT *
2 FROM CUSLED
3 WHERE CUSLED.LEDGER = 'UD'
4 AND CUSLED.ACCOUNT = '0000070101'
5 AND CUSLED.AGE_PERIOD = 'A510'
6 AND CUSLED.DOC_REF = '30001013'
7 AND CUSLED.REC_STA != 'X';
Execution Plan
----------------------------------------------------------
Plan hash value: 2871354732
--------------------------------------------------------------------------------
------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)
| Time |
--------------------------------------------------------------------------------
------------
| 0 | SELECT STATEMENT | | 1 | 405 | 2 (0)
| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| CUSLED | 1 | 405 | 2 (0)
| 00:00:01 |
|* 2 | INDEX RANGE SCAN | CUSLED_KEY_1 | 3 | | 1 (0)
| 00:00:01 |
--------------------------------------------------------------------------------
------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CUSLED"."ACCOUNT"='0000070101' AND "CUSLED"."AGE_PERIOD"='A510' A
ND
"CUSLED"."REC_STA"<>'X')
2 - access("CUSLED"."LEDGER"='UD' AND "CUSLED"."DOC_REF"='30001013')
SQL> ALTER INDEX CUSLED_KEY_1 UNUSABLE;
Index altered.
SQL> SELECT *
2 FROM CUSLED
3 WHERE CUSLED.LEDGER = 'UD'
4 AND CUSLED.ACCOUNT = '0000070101'
5 AND CUSLED.AGE_PERIOD = 'A510'
6 AND CUSLED.DOC_REF = '30001013'
7 AND CUSLED.REC_STA != 'X';
Execution Plan
----------------------------------------------------------
Plan hash value: 4156143435
--------------------------------------------------------------------------------
---------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| T
ime |
--------------------------------------------------------------------------------
---------
| 0 | SELECT STATEMENT | | 1 | 405 | 2 (0)| 0
0:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| CUSLED | 1 | 405 | 2 (0)| 0
0:00:01 |
|* 2 | INDEX UNIQUE SCAN | CUSLED_PK | 1 | | 1 (0)| 0
0:00:01 |
--------------------------------------------------------------------------------
---------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CUSLED"."REC_STA"<>'X')
2 - access("CUSLED"."LEDGER"='UD' AND "CUSLED"."ACCOUNT"='0000070101' AND
"CUSLED"."AGE_PERIOD"='A510' AND "CUSLED"."DOC_REF"='30001013')
Thanks
September 22, 2006 - 1:50 am UTC
the costs in this case were so close that they "tied" - either index was, in this case, equally appropriate.
Still awaiting for your reply
Reader, September 22, 2006 - 12:29 am UTC
Still awaiting for your reply
September 22, 2006 - 2:51 am UTC
gee, thanks for letting me know?
I was hungry last night and went out for food....
LIST STALE PARTITIONS
Neeraj Nagpal, September 26, 2006 - 2:35 pm UTC
Tom,
Thanks for your answer to my question titled "GATHER STALE", asked on September 15, 2006. Briefly searching through your website, I came across this procedure that lists the STALE TABLES - those changed since the last-analyzed date - but I am more interested in finding out the STALE PARTITIONS rather than just the STALE TABLES. How can one display the names of the STALE PARTITIONS using the following script?
declare
l_objList dbms_stats.objectTab;
begin
dbms_stats.gather_schema_stats
( ownname => USER,
options => 'LIST STALE',
objlist => l_objList );
for i in 1 .. l_objList.count
loop
dbms_output.put_line( l_objList(i).objType );
dbms_output.put_line( l_objList(i).objName );
end loop;
end;
/
Appreciate Your Help,
Neeraj
September 26, 2006 - 5:03 pm UTC
Ok, my help:
when you glanced at the documentation to see what else might be available in the dbms_stats.objectTable type - what did you see?
</code>
http://docs.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#sthref7889 <code>
I saw this:
TYPE ObjectElem IS RECORD (
ownname VARCHAR2(30), -- owner
objtype VARCHAR2(6), -- 'TABLE' or 'INDEX'
objname VARCHAR2(30), -- table/index
partname VARCHAR2(30), -- partition
subpartname VARCHAR2(30), -- subpartition
confidence NUMBER); -- not used
type ObjectTab is TABLE of ObjectElem;
which would indicate that other attributes of use exist there :)
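Building on that record type, the earlier script can be extended to print the partition and subpartition names as well - a sketch (the NVL defaults are my addition):

```sql
declare
    l_objList dbms_stats.objectTab;
begin
    dbms_stats.gather_schema_stats
    ( ownname => USER,
      options => 'LIST STALE',
      objlist => l_objList );
    for i in 1 .. l_objList.count
    loop
        -- partname/subpartname are null for non-partitioned objects
        dbms_output.put_line
        ( l_objList(i).objType || ' ' ||
          l_objList(i).objName || ' ' ||
          nvl( l_objList(i).partName,    '-' ) || ' ' ||
          nvl( l_objList(i).subpartName, '-' ) );
    end loop;
end;
/
```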
Thanks Tom
Neeraj Nagpal, September 26, 2006 - 6:31 pm UTC
Thanks Tom, Yes ..it was my shortsighted search on the topic!
Neeraj Nagpal
analyze vs. dbms_stats
Rob, October 10, 2006 - 4:12 pm UTC
Tom,
Very useful - Thank You.
I have seen in your examples the use of:
analyze table t compute statistics for table for all indexes for all indexed columns
but then you'll say something like
In real life, however, one would want to use DBMS_STATS instead.
Could you please supply the equivalent statement using dbms_stats ?
Thanks
October 10, 2006 - 8:21 pm UTC
those are all old examples, many of which pre-date the existence of dbms_stats.
dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all indexed columns', cascade => true );
would be the "same" as that particular analyze.
DBMS_STATS - Locking
A reader, November 02, 2006 - 7:09 pm UTC
Tom,
Somewhere up there in this thread, you said "certain variations of analyze can lock (to validate structures and such), but dbms_stats just runs queries basically."
However while running the following command,
DBMS_STATS.gather_table_stats (
   ownname          => 'ADBPROD',
   tabname          => 'OP_LAYER',
   estimate_percent => dbms_stats.auto_sample_size,
   partname         => 'CALIFORNIA',
   granularity      => 'PARTITION');
I noticed intermittently that the table 'OP_LAYER' was being locked by the Gather_Table_Stats command, and a few users, who wanted to acquire locks on this table, had to wait.
See the screen dump.
(Pardon the bad text formatting here.)
SQL> Select a.username, b.owner||'.'||b.name,b.type, b.mode_held,b.mode_requested
from v$session a, dba_ddl_locks b
where a.sid=b.session_id
and b.mode_held = 'None' and b.mode_requested <> 'None'
11:29:19 5 /
USERNAME B.OWNER||'.'||B.NAME
------------------------------ -------------------------------------------------------------
TYPE MODE_HELD MODE_REQU
---------------------------------------- --------- ---------
ADBPROD ADBPROD.OPINION_LAYER
Table/Procedure/Type None Share  (<-Share lock requested on the table)
ADBPROD ADBPROD.OPINION_LAYER
Table/Procedure/Type None Share  (<-Share lock requested on the table)
ADBPROD ADBPROD.OPINION_LAYER
Table/Procedure/Type None Share  (<-Share lock requested on the table)
ADBPROD ADBPROD.OPINION_LAYER
Table/Procedure/Type None Share  (<-Share lock requested on the table)
ADBPROD ADBPROD.OPINION_LAYER
Table/Procedure/Type None Share  (<-Share lock requested on the table)
11:37:24 ADBPROD@PROD SQL> Select a.username, b.owner||'.'||b.name,b.type, b.mode_held,b.mode_reques
ted
12:02:21 2 from v$session a, dba_ddl_locks b
12:02:21 3 where a.sid=b.session_id
12:02:21 4 and b.mode_held = 'None' and b.mode_requested <> 'None'
12:02:21 5 /
no rows selected <-NO LOCKS ON THE TABLE, AFTER A SHORT INTERVAL
Could you please tell me that is it one of those certain variations of the DBMS_STATS which can lock the table being analyzed???
November 02, 2006 - 9:07 pm UTC
would have been useful to see the blocker and their lock mode?
Blockers
A reader, November 03, 2006 - 1:58 pm UTC
Well I did check the blockers/waiters using the query below
, but got back -- no rows returned --
SELECT sid, DECODE( block, 0, 'NO', 'YES' ) BLOCKER, DECODE( request, 0, 'NO', 'YES' ) WAITER FROM v$lock WHERE request > 0 OR block > 0 ORDER BY block DESC;
Although I didn't see any blockers or waiters in the database at that time, the users were still getting briefly blocked on the table OP_LAYER (for 2-3 minutes, then released). I'm not sure why. And I know that the only process accessing the table 'OP_LAYER' was DBMS_STATS.
November 04, 2006 - 12:08 pm UTC
v$sesstat
v$sessevent
would be useful if you see it again.
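(The documented view names are V$SESSTAT and V$SESSION_EVENT. A sketch of the kind of query that would help, with a placeholder SID for the waiting session:)

```sql
-- What has session 123 (placeholder SID) actually been waiting on?
select event, total_waits, time_waited
  from v$session_event
 where sid = 123
 order by time_waited desc;
```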
Can a commit after each table's analyze make a difference?
Maulesh Jani, November 21, 2006 - 12:08 pm UTC
HI TOM,
We are analyzing the schema in the following way:
for rec_table_name in (
select table_name from user_tables
minus
select table_name from user_external_tables
) loop
dbms_stats.gather_table_stats('XXX',''||rec_table_name.TABLE_NAME||'',estimate_percent=>40,DEGREE=>4,CASCADE => TRUE);
end loop;
Now, according to you, if I put a commit after each loop iteration, can it reduce overhead on the UNDO/REDO/TEMP tablespaces, or is that unrelated?
November 22, 2006 - 3:47 pm UTC
according to me, it would be a waste of time, since dbms_stats already commits.
ops$tkyte%ORA10GR2> create table t1 ( x int );
Table created.
ops$tkyte%ORA10GR2> create table t2 ( x int );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t1 values ( 1 );
1 row created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> rollback;
Rollback complete.
ops$tkyte%ORA10GR2> select * from t1;
X
----------
1
and there is no such thing as "overhead on redo/undo/temp" - I am not familiar with what you might be thinking of.
Can you please help me on this ..
Maulesh Jani, November 22, 2006 - 12:19 am UTC
Hi Tom ,
I need your help on this. As I mentioned in my question above, we have had an analyze-schema stats process in our production system for 3 years. Now this process suddenly fails with the error below:
Error While Analyzing Schema tables :
declare
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10$"
too small
ORA-06512: at line 12
Can you please tell me whether a large amount of redo is generated during stats gathering? I have not been able to find any convincing document on it. And the error is a typical snapshot-too-old error - how can it occur while analyzing table stats?
No other process executes in parallel with this one.
Regards
Maulesh
November 24, 2006 - 12:29 pm UTC
that has nothing to do with redo
everything to do with undo.
snapshot too old happens when you have a query (and that is what dbms_stats does, runs queries) that is in need of undo that was overwritten.
increase your undo retention and/or review v$undostats to ensure you are not expiring undo prematurely.
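A sketch of the kind of V$UNDOSTAT review suggested (SSOLDERRCNT counts ORA-01555 hits in each collection interval; the column names are from the documented view):

```sql
-- Have any intervals already produced ORA-01555 or run out of space,
-- and how long do the longest queries run relative to undo_retention?
select begin_time, end_time, undoblks, maxquerylen,
       ssolderrcnt, nospaceerrcnt
  from v$undostat
 order by begin_time;

show parameter undo_retention
```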
Mayday
Apurva, November 30, 2006 - 1:46 pm UTC
We are handling massive volumes of data in our warehouse. Most of the fact tables will be appended with more than 200 million rows each month, and in the steady state the facts will have close to 2.5 billion rows.
The end-users of our tables are some analysts -- so to give these guys a low downtime (or, sleepless nights - speaking dysphemistically), we came up with the following strategy:
* We'll have two sets of tables
-> Temporary tables - in which we will load each month's data; these tables will not be partitioned
-> Final tables - each month we'll append Temporary tables' content to these tables through partition-exchange; Final tables will we range-partitioned on Month_ID
* As indexing is a time-intensive process -- we will build all the indexes in the world on the set of Temporary tables and have 'including indexes' clause in the partition-exchange statement
Now, here's where I am stumped (assuming you have played cricket)
* As my tables are pretty big (though their sheer size makes them ugly), and they will be used by analysts - it is almost mandatory to gather statistics for these tables. Unfortunately, there is no 'including statistics' clause in the partition-exchange statement by which I could gather statistics on the Temporary table and carry the stats over to the Final table (it doesn't make much sense either). So the only option I am left with is to dump the data into the Final table and analyze the entire Final table for statistics - but with 2.5 billion rows it will be eons before it gets analyzed. Or is there a way of escaping this, which would help me realize my sadistic objectives?
Thanks Tom.
* We load each month's data in a Temporary table
* We build all the indexes we want on the temporary table
* Partition-exchange Temporary table with the Final table
November 30, 2006 - 3:32 pm UTC
from above:
...
you can import/export/set statistics directly with dbms_stats
.....
^^^
the partition stats will go, you'll just roll up the global stats you want to roll up - using your knowledge of the data
ops$tkyte%ORA10GR2> CREATE TABLE t
2 (
3 dt date,
4 x int not null,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 /
Table created.
ops$tkyte%ORA10GR2> insert into t select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), user_id, username from all_users;
31 rows created.
ops$tkyte%ORA10GR2> create index t_idx on t(x) local;
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
31
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name =
2 'T';
PARTITION_NAME NUM_ROWS
------------------------------ ----------
JUNK 10
PART1 10
PART2 11
ops$tkyte%ORA10GR2> select num_rows from user_indexes where table_name = 'T';
NUM_ROWS
----------
31
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_ind_partitions where index_name =
2 'T_IDX';
PARTITION_NAME NUM_ROWS
------------------------------ ----------
JUNK 10
PART1 10
PART2 11
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table temp (dt date, x int not null, y varchar2(30) );
Table created.
ops$tkyte%ORA10GR2> insert into temp
2 select to_date( '12-mar-2003'), user_id, username from all_users;
31 rows created.
ops$tkyte%ORA10GR2> create index temp_idx on temp(x);
Index created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'TEMP' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table t
2 exchange partition part1
3 with table temp
4 including indexes
5 without validation
6 /
Table altered.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
2 for x in (select sum(num_rows) nrows from user_tab_partitions where table_name = 'T')
3 loop
4 dbms_stats.set_table_stats( user, 'T', numrows => x.nrows );
5 dbms_stats.set_index_stats( user, 'T_IDX', numrows => x.nrows );
6 end loop;
7 end;
8 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
52
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
PARTITION_NAME NUM_ROWS
------------------------------ ----------
JUNK 10
PART1 31
PART2 11
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select num_rows from user_indexes where table_name = 'T';
NUM_ROWS
----------
52
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_ind_partitions where index_name =
2 'T_IDX';
PARTITION_NAME NUM_ROWS
------------------------------ ----------
JUNK 10
PART1 31
PART2 11
Thanks gazillion
Apurva, December 01, 2006 - 1:15 am UTC
Local vs. (Global + Local) statistics
Apurva, December 06, 2006 - 6:00 am UTC
Tom,
Consider this scenario:
My table has 12 partitions, with ~200 million rows in each partition. I have statistics gathered for each individual partition, but I don't have global statistics for the table.
Can I expect to achieve a significant increase in query performance with global statistics gathered?
Or will gathering global stats only be too time-consuming, without much improvement in query response (as I already have stats for the individual partitions)?
I understand this question is a bit abstract, and that I could test it myself - but I don't have enough data or resources to test it.
Please share your invaluable views.
Thanks
December 07, 2006 - 8:33 am UTC
...
Can I expect to achieve a significant increase in query performance with global
statistics gathered?
....
one of three things can be expected:
a) increase in performance
b) decrease in performance (unlikely, but possible)
c) no change at all in performance
in other words - IT DEPENDS.
if your queries all include the partition key in such a way that the optimizer knows "only one partition will be hit", local statistics will be used to optimize that query.
global statistics may never be used (meaning C is likely)
on the other hand, if you run queries that do not use partition elimination down to a single partition - then global statistics will be used...
Now, are you sure you don't have SOME global stats :)
ops$tkyte%ORA10GR2> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
2 select to_date( '12-mar-2003','dd-mon-yyyy')+mod(rownum,3), object_id, object_name
3 from all_objects;
50122 rows created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'PART1', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'PART2', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', partname => 'JUNK', granularity => 'PARTITION' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
50122
ops$tkyte%ORA10GR2> select partition_name, num_rows from user_tab_partitions where table_name = 'T';
PARTITION_NAME NUM_ROWS
------------------------------ ----------
JUNK 16707
PART1 16707
PART2 16708
Thanks Tom!
Apurva, December 07, 2006 - 9:33 am UTC
Off-the-topic
Apurva, December 07, 2006 - 4:15 pm UTC
"What gets us into trouble is not what we don't know;
it's what we know for sure that just ain't so."
-- Twain
sys_op_countchg - Usage
Arindam Mukherjee, January 04, 2007 - 4:27 am UTC
Sir,
Today I was reading the chapter "The Clustering Factor" written by Mr. Jonathan Lewis in his new book (Chapter 5) and found the function sys_op_countchg(). I find the explanation rather opaque due to my limited knowledge of Oracle. Could you kindly explain its usage, in your inimitable style, for us?
Why and how would one use this function in a live or production database, Oracle 9i or 10g?
January 05, 2007 - 9:03 am UTC
you do not, it is not documented, do not use it.
Last_analyzed shows the incorrect date
Maulesh, January 22, 2007 - 7:19 am UTC
Hi TOM,
Thanks for the support and for sharing knowledge about Oracle. We have implemented a scheduled job for statistics gathering, for which we are using the ANALYZE statement.
But we have now noticed that the LAST_ANALYZED value is not getting updated daily - it just remains as it is. Can you please help me with this?
Thanks
system stats need updating?
steve, February 21, 2007 - 11:53 am UTC
Oracle 10G
Solaris 9
Regarding the "out of the box" gathering of system statistics, I'm wondering why this query is so slow. It takes well over 1 minute to run and there aren't that many connections/processes (< 500).
select p.username pu,
s.username su,
s.status stat,
s.sid ssid,
s.serial# sser,
lpad(p.spid,7) spid,
to_char(logon_time,'YYYY-MM-DD HH24:MI:SS') logon_time,
substr(sa.sql_text,1,540) txt
from v$process p, v$session s, v$sqlarea sa
where p.addr=s.paddr
and s.username is not null
and s.sql_address=sa.address(+)
and s.sql_hash_value=sa.hash_value(+)
order by 1,2,7
The explain plan is:
====================
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1091464685
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 738 | 100 (4)| 00:00:02 |
| 1 | SORT ORDER BY | | 1 | 738 | 100 (4)| 00:00:02 |
|* 2 | HASH JOIN OUTER | | 1 | 738 | 99 (4)| 00:00:02 |
| 3 | NESTED LOOPS | | 1 | 204 | 73 (2)| 00:00:01 |
|* 4 | HASH JOIN | | 1 | 191 | 49 (3)| 00:00:01 |
|* 5 | FIXED TABLE FULL | X$KSUSE | 1 | 142 | 24 (0)| 00:00:01 |
|* 6 | FIXED TABLE FULL | X$KSUPR | 1 | 49 | 24 (0)| 00:00:01 |
|* 7 | FIXED TABLE FIXED INDEX| X$KSLED (ind:2) | 1 | 13 | 24 (0)| 00:00:01 |
|* 8 | VIEW | GV$SQLAREA | 100 | 53400 | 25 (4)| 00:00:01 |
|* 9 | FILTER | | | | | |
| 10 | SORT GROUP BY | | 100 | 72300 | 25 (4)| 00:00:01 |
| 11 | FIXED TABLE FULL | X$KGLOB | 100 | 72300 | 24 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("S"."KSUSESQL"="ADDRESS"(+) AND "S"."KSUSESQH"="HASH_VALUE"(+))
4 - access("ADDR"="S"."KSUSEPRO")
5 - filter("S"."KSUUDLNA" IS NOT NULL AND "S"."INST_ID"=:B1 AND
BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0)
6 - filter("INST_ID"=:B1 AND BITAND("KSSPAFLG",1)<>0)
7 - filter("S"."KSUSEOPC"="E"."INDX")
8 - filter("INST_ID"(+)=:B1)
9 - filter(SUM(DECODE("KGLOBT09",0,"KGLOBT02",0))<>0)
30 rows selected.
SQL> select count(*) from X$KSUSE;
COUNT(*)
----------
1200
SQL> select count(*) from X$KSUPR;
COUNT(*)
----------
700
SQL> select count(*) from X$KSLED;
COUNT(*)
----------
812
The last count(*) took a long time!
====================================
SQL> set timing on
SQL> select count(*) from X$KGLOB;
COUNT(*)
----------
933900
Elapsed: 00:01:08.83
February 21, 2007 - 12:27 pm UTC
you might want up to date stats on the fixed tables, yes.
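For what it's worth, in 10g the fixed object (X$) statistics can be gathered explicitly - a sketch, run as SYS, ideally while a representative workload is on the system:

```sql
-- Gather optimizer statistics on the X$ fixed tables (10g and later).
-- Run as SYS, ideally while the system is under a representative load.
BEGIN
   DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
END;
/

-- Check that the fixed tables of interest now have statistics:
SELECT table_name, last_analyzed
  FROM dba_tab_statistics
 WHERE object_type = 'FIXED TABLE'
   AND table_name IN ('X$KSUSE', 'X$KGLOB');
```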
DBMS_STAT
AD, February 21, 2007 - 1:56 pm UTC
Hi Tom,
Some of my team members think that we should have a wrapper (package) that would call DBMS_STATS to gather statistics instead of using the DBMS_STATS directly. What is your opinion about this approach.
Regards
February 21, 2007 - 3:12 pm UTC
depends on the goal of the wrapper package
if it simply calls dbms_stats - call for call - it would not be "smart"
if it does something "intelligent" (and you have to define intelligent in your context), then perhaps - but it would not really be a wrapper package at that point, but a set of procedural logic that gathers statistics on your system in the manner you find best.
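As an illustration only (the procedure name and the policy it encodes are hypothetical), such a "not just a wrapper" might capture a site decision like "gather only what monitoring says is stale":

```sql
-- Hypothetical wrapper encoding a site policy; the name and the
-- parameter choices are illustrative, not a recommendation for
-- every system.
CREATE OR REPLACE PROCEDURE gather_app_stats( p_owner IN VARCHAR2 )
AS
BEGIN
   DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname => p_owner,
      options => 'GATHER STALE',  -- only objects flagged stale by monitoring
      cascade => TRUE );
END;
/
```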
SEan, February 25, 2007 - 8:59 am UTC
Tom,
i used dbms_stats.gather_database_stats as :
begin sys.DBMS_STATS.GATHER_DATABASE_STATS (); end;
and found that statistics for some of the schemas were not gathered.
Isn't it supposed to gather all the schemas in the database?
TIA
February 26, 2007 - 1:24 pm UTC
umm,
versions?
what schemas?
got example?
Sean, February 27, 2007 - 10:21 am UTC
it's 9.2.0.7
The job completed via dbms_job, and I also checked that there are no errors in alert.log. The job has run a few times since Feb 25.
These schemas were not completed:
SQL> select owner, count(*) not_c from dba_tables where last_analyzed <'24-FEB-07' group by owner;
OWNER NOT_C
------------------------------ ---------
BIM 61
BIS 102
ENI 46
FII 76
GL 1
HRI 53
ISC 10
JTF 6
OPI 24
OSM 1
POA 36
PORTAL30_DEMO 1
SYS 316
13 rows selected.
the following t_count is the total number of tables owned by each schema.
SQL> select b.owner, count(*) t_count from dba_tables b where b.owner in (select a.owner from dba_ta
bles a
2 where a.last_analyzed <'24-FEB-07') group by b.owner;
OWNER T_COUNT
------------------------------ ---------
BIM 261
BIS 460
ENI 140
FII 278
GL 176
HRI 185
ISC 37
JTF 403
OPI 122
OSM 198
POA 138
PORTAL30_DEMO 9
SYS 350
13 rows selected.
SQL>
February 27, 2007 - 11:03 am UTC
so, lets see the JOB.
Sean, February 27, 2007 - 1:16 pm UTC
those are not LOBs. I am not sure why dbms_stats.gather_database_stats does not pick them up.
ANALYZE TABLE does work on them...
SQL> select b.owner, b.object_name, b.object_type from
2 dba_objects b, dba_tables a
3 where a.owner=b.owner
4 and b.object_name=a.table_name
5 and a.last_analyzed <'24-FEB-07' and a.owner='SYS'
6 and rownum<3
7 /
OWNER OBJECT_NAME OBJECT_T
-------- ------------------------- --------
SYS FILE$ TABLE
SYS BOOTSTRAP$ TABLE
SQL> desc BOOTSTRAP$
ERROR:
ORA-04043: object BOOTSTRAP$ does not exist
SQL> desc sys.BOOTSTRAP$
Name Null? Type
----------------------------------------------------- -------- -----------------------
LINE# NOT NULL NUMBER
OBJ# NOT NULL NUMBER
SQL_TEXT NOT NULL VARCHAR2(4000)
SQL> select last_analyzed from dba_tables where table_name='BOOTSTRAP$';
LAST_ANAL
---------
07-OCT-06
SQL> analyze table sys.BOOTSTRAP$ compute statistics;
Table analyzed.
SQL> select last_analyzed from dba_tables where table_name='BOOTSTRAP$';
LAST_ANAL
---------
27-FEB-07
SQL>
SQL> select job
2 , log_user
3 , priv_user
4 , schema_user
5 , last_date
6 , last_sec
7 , this_date
8 , this_sec
9 , next_date
10 , next_sec
11 , total_time
12 , broken
13 , interval
14 , failures
15 , what
16 from sys.dba_jobs where log_user='SYS'
17 /
LOG PRIV SCHEMA
JOB USER USER USER LAST_DATE LAST_SEC THIS_DATE THIS_SEC
------- -------- -------- -------- --------- -------- --------- --------
TOTAL
NEXT_DATE NEXT_SEC TIME B INTERVAL FAIL WHAT
--------- -------- ------ - -------------------- ---- --------------------
40704 SYS SYS SYS 26-FEB-07 19:33:42
04-MAR-07 19:33:42 41484 N sysdate + 6 0 begin sys.DBMS_STATS
.GATHER_DATABASE_STA
TS (); end;
February 27, 2007 - 2:24 pm UTC
LOBS? I said "job", just wanted to see the parameters.
were those tables EVER analyzed?
Sean, February 27, 2007 - 2:50 pm UTC
use portal30_demo as example:
SQL> select table_name ,last_analyzed from dba_tables where owner='PORTAL30_DEMO';
TABLE_NAME LAST_ANAL
------------------------------ ---------
EMP 27-FEB-07
DEPT 27-FEB-07
MLOG$_EMP 14-NOV-04
RUPD$_EMP
EMP_SNAPSHOT 27-FEB-07
PEOPLE_INFO$ 27-FEB-07
TASK_CATEGORY$ 27-FEB-07
TASK_DEVELOPER$ 27-FEB-07
TASK$ 27-FEB-07
there are two tables that were not picked up by dbms_stats.gather_database_stats.
February 27, 2007 - 3:39 pm UTC
system generated things, so - got version?
older releases didn't really want statistics on some things.
Sean, February 27, 2007 - 5:53 pm UTC
it's 9.2.0.7
February 28, 2007 - 3:39 pm UTC
gather database stats skips those two particular segments.
dynamic sampling
Amit, March 01, 2007 - 12:38 pm UTC
Hi Tom,
Thanks for sharing your vast experience and knowledge with us.
From discussions about dynamic sampling above, I gathered the following information-
1) In 9i we have to collect the stats ourselves,
2) In 10g, Oracle calculates the stats for us (when we first fire the query, during its hard parse)
Assuming point 2 above is right, I tried a test in 10g R2
But what I have noticed is that dynamic sampling does not kick in automatically, as you can see in Area 1 below,
and there are other issues in Areas 2 and 3.
10g Rel 2
=========================================================
AREA 0)
SQL> create table temp
2 (a number,
3 b number);
Table created.
SQL> declare
2 begin
3 for i in 1..100000
4 loop
5 insert into temp values(i,i+1);
6 end loop;
7 end;
8 /
PL/SQL procedure successfully completed.
============================================================
AREA 1)
SQL> explain plan for select a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
-----------------------------------------------------------
| Id | Operation | Name |
----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| TEMP |
----------------------------------
Note
-----
- 'PLAN_TABLE' is old version
- rule based optimizer used (consider using cbo)
12 rows selected.
===========================================================
So above it is not using dynamic sampling automatically, because it is using the RBO and not the CBO.
But if I hint it to use dynamic sampling (as follows) then it shifts to the CBO and the Rows estimate is nearly right.
=======================================
AREA 2)
SQL> explain plan for select /*+ dynamic_sampling (t 10) */ a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------
| 0 | SELECT STATEMENT | | 83932 | 1065K| 57 |
| 1 | TABLE ACCESS FULL| TEMP | 83932 | 1065K| 57 |
----------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
11 rows selected.
========================================================
Now that dynamic sampling was done in the previous step, the query should be hard parsed and its plan ready in the shared pool. The same plan should be used if I fire a syntactically identical query again, but it goes back to the RBO (as follows)
===============================================
AREA 3)
SQL> explain plan for select a from temp;
Explained.
SQL> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
----------------------------------
| Id | Operation | Name |
----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| TEMP |
----------------------------------
Note
-----
- 'PLAN_TABLE' is old version
- rule based optimizer used (consider using cbo)
12 rows selected.
================================================
March 02, 2007 - 12:48 pm UTC
1) you have dynamic sampling in 9i as well - it is just defaulting to a "lower level" than 10g did.
the RBO doesn't sample because the rbo would never use the information. Do not use the RBO in 10g.
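A quick way to see this in the test above (assuming the 10g default optimizer_dynamic_sampling level of 2): force the CBO for the session and re-run the explain plan.

```sql
-- Force the CBO; dynamic sampling then kicks in automatically
-- for the unanalyzed table (level 2 is the 10g default).
ALTER SESSION SET optimizer_mode = all_rows;

EXPLAIN PLAN FOR SELECT a FROM temp;
SELECT * FROM TABLE( dbms_xplan.display );
-- The Note section of the plan should now report that
-- dynamic sampling was used for this statement.
```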
Tables with stale stats ignored by dbms_stats
A reader, March 01, 2007 - 4:35 pm UTC
Tom,
This is 10gr2. We use the following command to gather statistics:
DBMS_STATS.GATHER_SCHEMA_STATS(ownname=>'ABC',GRANULARITY=>'ALL',CASCADE
=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS size 1',DEGREE=>DBMS_STATS.AUTO_DEGREE);
The tables are in monitoring mode. We were told by Oracle support that the above method will ignore tables with stale statistics. Here is a quote from the analyst:
=============
It is because some of the objects become stale and are ignored while gathering schema statistics. It is better to use
options=>'GATHER STALE' in the procedure.
=============
I always thought that if I do not mention any option, it will gather statistics for all tables in the schema, stale or not.
Thanks
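For what it's worth, DBMS_STATS can also be asked to list what it currently considers stale, without gathering anything - a sketch (monitoring must be enabled, and serveroutput on to see the list):

```sql
-- List the objects DBMS_STATS currently considers stale.
SET SERVEROUTPUT ON
DECLARE
   l_list   DBMS_STATS.OBJECTTAB;
BEGIN
   DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname => 'ABC',
      options => 'LIST STALE',   -- list only, do not gather
      objlist => l_list );
   FOR i IN 1 .. l_list.COUNT LOOP
      DBMS_OUTPUT.PUT_LINE( l_list(i).objname );
   END LOOP;
END;
/
```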
Analyze process is taking very long
Apurva, March 05, 2007 - 11:16 am UTC
Tom,
I am using the following piece of PL/SQL code to analyze a 20-column table which has ~650 million rows.
<SQL>
begin
dbms_stats.gather_table_stats(ownname => 'ops\$datamart', tabname => 'FCT_SALES', estimate_percent => 15 ,
granularity => 'ALL', cascade => true, force => true);
end;
/
</SQL>
The problem is that the analyze process is taking close to 21 hours, although we have a very good architecture.
What would be your take on this?
Thanks in anticipation
March 05, 2007 - 2:25 pm UTC
... although we have a very good architecture. ...
well, a pretty building won't make anything run faster (sorry, I could not resist - although we have a very good architecture... don't even know what to make of that!! :)
if the table is dense, you could try block sampling (do 15% of the IO to read).
You could also introduce "parallel" into it.
even if you do not use parallel on the table itself, you could run more than one gather against the table + indexes, instead of cascade...
or if you are on 10g, you might not even need cascade if you are re-creating the indexes after a load since the indexes collect stats during a create now.
but - you really don't give us much to work with. 20 columns and 650 million rows and a "very good architecture" plus $1 will get you a cup of coffee (not at starbucks of course).
you don't mention why you gather statistics (what do you do that causes this, a truncate and reload, a load of a new partition, "time" - what)
you don't tell us about the structure of this table
you don't tell us very much........
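A sketch of the block sampling + parallel + no-cascade combination suggested above (the parameter values are illustrative only):

```sql
BEGIN
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => 'OPS$DATAMART',
      tabname          => 'FCT_SALES',
      estimate_percent => 15,
      block_sample     => TRUE,    -- sample 15% of the blocks, not the rows
      degree           => 8,       -- illustrative degree of parallelism
      cascade          => FALSE ); -- gather index stats separately
END;
/
```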
Table hung when dbms_stats
A reader, March 08, 2007 - 2:38 pm UTC
Hi Tom,
We have one partitioned table with 3,986,259 rows, and a script scheduled (twice a week) to gather stats on it. Gathering stats on this table usually takes at most 10 minutes without any issues. But yesterday the job did not finish at all; we had to kill the process after 14 hours. While it was running, any query on this table was hanging - even "desc" from sqlplus hung. As soon as the process was killed, all was normal. Because of this, any application that tried to access this table would just freeze, which brought down the whole system.
DBMS_STATS.gather_table_stats ('AUSER',
'ATABLE',
estimate_percent => 5,
CASCADE => TRUE
);
Do you know what could be the problem?
Many Thanks
March 08, 2007 - 3:34 pm UTC
please utilize support for this one.
A reader, March 08, 2007 - 2:56 pm UTC
Sorry forgot to mention that this Oracle 9.2.0.6 RAC on Sun Solaris
Thanks
A reader, March 12, 2007 - 4:22 am UTC
Thanks
Thanks, but...
Apurva, March 16, 2007 - 4:19 pm UTC
Tom,
Thanks a lot for all your help, and sorry for following up a bit late: the system as well as the building it's housed in have a wonderful architecture :-)
* 8 CPUs
* 32 GB RAM
* Oracle 10g R2 - both client and server run on Linux
* 1071, Fifth Avenue at 89th Street
Here are the details of how we load our fact table:
* The table is partitioned by MONTH (date format)
* In every refresh we first truncate the entire table
* Then using SQL loader we populate historical as well as current month's data [and in a given month we load about 200 million rows]
* Then we build 12 local bitmap indexes on 12 non-fact columns
* Then we would gather the statistics of the table using the following procedure:
<SQL>
begin
dbms_stats.gather_table_stats(ownname => 'ops\$datamart', tabname => 'FCT_SALES', estimate_percent => 15 ,
granularity => 'ALL', cascade => true, force => true);
end;
/
</SQL>
...but it would take close to a million years (~21 hours)
I came to you and, as per your suggestion, removed the 'force => true' option, but Oracle is misbehaving and still gathering statistics for the indexes?! The overall time taken is still ~20 hours. Moreover, it is the index analysis that is eating most of the time - we found that out of the 20-odd hours the entire process takes, Oracle spends close to 18 hours gathering stats for the indexes... and quite obviously that is not a good idea since - as you remarked - in 10g index stats are automatically gathered when the index is built.
Seems it's an Oracle bug? What are your thoughts?
Thanks in anticipation
March 17, 2007 - 4:02 pm UTC
cascade is what affects indexes, oracle is not misbehaving, it is simply doing what you told it to.
cascade=>false, not force.
did you try ANY of the other things I mentioned??!!??
Apurva, March 17, 2007 - 11:09 pm UTC
that's what happens when you write your mails in anticipation of St. Patrick's day...
I simply removed 'cascade=>true' (and not 'force=>true', as I miswrote) and did not explicitly write 'cascade=>false', in the fond hope that the default value for the cascade parameter is 'false'.
I tried your 'block sampling' suggestion and found that it did not affect analyze time substantially - index analysis is the killer.
Tom,
Thanks *A LOT* again.
it's working!!!!!!
Apurva, March 17, 2007 - 11:22 pm UTC
wrapper for DBMS_STATS
Chris Gould, March 20, 2007 - 7:10 am UTC
I noticed recently that if you use DBMS_STATS.GATHER_TABLE_STATS with CASCADE=>TRUE then it gathers stats for the table and each index in turn. This can take a long time, especially if there are a lot of indexes on the table (eg. in a warehouse). One way to speed this up is to use DBMS_SCHEDULER to create and run jobs for each index on the table simultaneously. Scheduler and the resource manager should ensure the database is not swamped and the jobs all complete as efficiently as possible. The following code indicates the idea:
begin
for r in (select ui.index_name
,uip.partition_name
,dbms_scheduler.generate_job_name('IXAZ_') as jobname
from user_indexes ui
inner join user_ind_partitions uip
on (uip.index_name = ui.index_name)
where ui.table_name = 'ASSET'
and uip.partition_name = 'P200703'
order by ui.index_name,uip.partition_name )
loop
execute immediate
'BEGIN
DBMS_SCHEDULER.CREATE_JOB (
job_name => :jobname,
job_type => :jobtype,
job_action => :jobaction,
start_date => NULL,
repeat_interval => NULL,
end_date => NULL,
enabled => TRUE,
auto_drop => TRUE,
comments => :jobcomments);
END;'
using
r.jobname
,'PLSQL_BLOCK'
,'begin dbms_stats.gather_index_stats(
ownname=>'''||user||'''
,indname=>'''||r.index_name||'''
,partname=>'''||r.partition_name||''''||');end;'
,'user='||user||',index='||r.index_name||',ptn='||r.partition_name;
end loop;
end;
The code above only considers the indexes for one partition of the table, but the total time to analyze all the indexes will be approximately the time the largest index would take if analyzed individually.
wrapper for DBMS_STATS - additional info
Chris Gould, March 21, 2007 - 5:28 am UTC
One suggestion I'd make regarding the DBMS_SCHEDULER wrapper to DBMS_STATS is not to specify a degree of parallelism > 1 when analyzing. Using a large degree of parallelism seems to confuse the resource manager - I guess because it thinks it's executing one job but this then spawns multiple sessions for the additional degrees of parallelism - and you can quickly exhaust the supply of available sessions.
Need Guidance on Partitioned tables
Vk, September 20, 2007 - 5:45 am UTC
Hi Tom,
We have huge partitioned tables which are continually loaded on a daily basis, range-partitioned by date, and we maintain 30 days' worth of retention. The data for the past 29 days is static, and we don't want to re-compute the partition-level statistics for them since the data has not changed.
However, we drop the oldest partition and add a new partition on a daily basis - what is termed a rolling window.
Now please provide your suggestions:
1. We are doing gather_table_stats in a loop for the complete schema using
dbms_stats.gather_table_stats(ownname=> l_username,
tabname=> l_table_name,
estimate_percent => 5,
method_opt => l_method_opt,
degree => 16,
cascade => true);
Where l_method_opt is :
'for columns ' || p_col_name || ' SIZE 1'; for the partitioned tables and p_col_name is the column on which the data is partitioned.
However if the table is non-partitioned then
l_method_opt is :
'for all indexed columns SIZE 1';
Now this complete gathering of stats takes around 3 hours, which for us is quite significant.
Can you please suggest how we can capture global and partition-level statistics without recomputing them over and over again?
Your advice/help is appreciated.
September 24, 2007 - 7:15 am UTC
why would you gather stats on the partition key only? that seems a little strange, what is the thought process behind that?
If you gather local stats on only the single new partition - we'll do our best to "guess" the global stats. In many cases - we'll get it right (eg: number of total rows). In other cases, we might not - eg: number of distinct values - if you have 30 partitions and X has 10 distinct values in each partition - you either have a) 10 distinct values of X, b) 300 distinct values of X, or c) between 11 and 299 distinct values of X - the only way to truly know would be to analyze everything OR have you use dbms_stats.set_xxxxx routines to tell us.
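A sketch of the dbms_stats.set_xxxxx approach mentioned above, for the case where you know the true global number of distinct values (the table, column, and value here are made up):

```sql
-- Tell the optimizer the known global NDV of column X directly,
-- instead of re-scanning all 30 partitions to compute it.
BEGIN
   DBMS_STATS.SET_COLUMN_STATS(
      ownname => user,
      tabname => 'PARTITIONED_TABLE',
      colname => 'X',
      distcnt => 10 );  -- asserted global number of distinct values
END;
/
```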
Thought Process of only gather stats on Partitioned Column
Vk, September 24, 2007 - 1:53 pm UTC
Hi Tom,
Since every query that comes to us has to have a WHERE clause on the partition column, we thought we don't need to gather stats on any other column, since that predicate drives the rows flowing from this step.
I am pretty okay gathering stats only for the partition column and not for all the columns of the table as gather stats does by default. One reason is that the time it takes to gather stats is very high on what is nearly a 1TB database.
I am also thinking to gather the partition level stats only for the latest partition and not capture global level stats if Oracle's CBO can guess it 90% correct, I am happy with it.
Thanks,
Vikas khanna
September 26, 2007 - 8:30 pm UTC
umm, that is flawed
if you only get column stats on the partition column, what would the estimated card= value be for something like:
select * from t where x in (select y from partitioned_table where key = val and SOME_COL = 'abc')
if we know nothing about some_col, we cannot get the card=value.
so, please - go into more detail as to why you did this? what facts did you use in coming to this conclusion?
the "partition_key=value" generally is used for partition elimination. after that, we still - well - need statistics.
Analyze Vs DBMS_STATS
Karthick, October 17, 2007 - 12:40 am UTC
For a particular table I am able to use the ANALYZE command but not DBMS_STATS.
See the test case below.
Enter user-name: sysadm/sysadm@inlabtst
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> select count(1) from HX_SSN_GEN_STG_TBL;
COUNT(1)
----------
162439
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL');
BEGIN DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "SYSADM"."HX_SSN_GEN_STG_TBL", insufficient
privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13046
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
SQL> ANALYZE TABLE HX_SSN_GEN_STG_TBL COMPUTE STATISTICS;
Table analyzed.
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_NAMES_STG');
PL/SQL procedure successfully completed.
Just for that one table I am not able to use DBMS_STATS; for other tables I can. But I can use the ANALYZE command on that table. Any idea what the problem could be?
Found the answer
Karthick, October 17, 2007 - 12:53 am UTC
Sorry for posting the previous one, Tom. After some searching on orafaq I got the answer:
SQL> select username, privilege from user_sys_privs;
USERNAME PRIVILEGE
------------------------------ ----------------------------------------
SYSADM UNLIMITED TABLESPACE
SYSADM CREATE TABLE
SYSADM CREATE SESSION
SYSADM ANALYZE ANY
SQL> GRANT SELECT ANY TABLE TO SYSADM;
Grant succeeded.
SQL> select username, privilege from user_sys_privs;
USERNAME PRIVILEGE
------------------------------ ----------------------------------------
SYSADM UNLIMITED TABLESPACE
SYSADM SELECT ANY TABLE
SYSADM CREATE TABLE
SYSADM CREATE SESSION
SYSADM ANALYZE ANY
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => 'SYSADM', tabname => 'HX_SSN_GEN_STG_TBL');
PL/SQL procedure successfully completed.
About the first entry
Alvin, February 21, 2008 - 7:39 pm UTC
On a 10g box.
Will running an analyze table overwrite the statistics gathered by dbms_stats earlier ?
Dbms_stats ran at 10am
analyze table ran at 11am
Do both (analyze table and DBMS_STATS) store statistics in the same location? (hence the earlier question of overwriting).
February 21, 2008 - 10:43 pm UTC
statistics are stored in one place. so, sure they would.
Killing DBMS_STATS in the middle of a large table
Ricardo Masashi, March 16, 2008 - 10:16 am UTC
Hi, Tom,
I was wondering what happens if, for some reason, we kill a running DBMS_STATS job in the middle of gathering statistics on a large/huge table.
Would that table end up with stale stats, the previously gathered stats, or no stats?
My concern is whether dbms_stats:
- Deletes stats before collecting;
- Collects before overwriting;
Thanks!
March 24, 2008 - 8:26 am UTC
well, statistics are gathered and updated and gathered and updated - and then finally committed at the end - as a transaction.
So, dbms_stats works like DDL in that sense, all or nothing.
different execution plans when using ANALYZE versus DBMS_STATS
A reader, May 16, 2008 - 10:24 am UTC
hi tom,
I analyzed one table, once with ANALYZE and once with DBMS_STATS.GATHER_TABLE_STATS (commands below). I got different execution plans for the same query: good with ANALYZE, bad with DBMS_STATS.GATHER_TABLE_STATS. When comparing all the statistics I found them to be the same, with the exception of the histograms, which have different bucket sizes (see below).
analyze table gfm.SUBCONFIG_CONFIGURABLE_OBJ_T estimate statistics
for table
for all indexes
for all indexed columns
sample 40 percent;
exec dbms_stats.gather_table_stats ( -
'GFM', 'SUBCONFIG_CONFIGURABLE_OBJ_T', estimate_percent=>40 -
,method_opt=>'for all indexed columns', cascade=>true);
SQL> select * from (
2 select 'ANALYZE' meth, a.* from system.t_analyze_hist a
3 UNION ALL
4 select 'DBMSSTATS', s.* from system.t_stats_hist s
5 ) where column_name = 'SUBCONFIG_ID'
6 order by 1;
METH OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- ------------------------------ -------------------- --------------- ----------
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 782 0
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 1662 1
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 2546 19371
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 3423 19390
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 4315 19411
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 5191 19471
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 6075 19490
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 6943 19510
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 7826 19531
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 8720 19571
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 9636 19591
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 10531 19611
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 11414 19631
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 12301 19651
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 13184 19671
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 14056 19673
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 14930 19691
ANALYZE GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 15816 19711
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 1526 0
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 3033 1
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 4396 19371
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 5745 19390
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 7119 19411
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 8486 19471
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 9894 19490
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 11234 19510
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 12627 19531
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 13996 19571
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 15363 19591
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 16761 19611
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 18149 19631
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 19540 19651
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 20926 19671
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 22320 19673
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 23722 19691
DBMSSTATS GFM SUBCONFIG_CONFIGURABLE_OBJ_T SUBCONFIG_ID 25042 19711
36 rows selected.
Q1) Could the bucket size difference of the histograms be the reason for the different plan?
Q2) What should I do in the future? Go back to ANALYZE?
May 19, 2008 - 3:10 pm UTC
q1) yes
q2) ummm, no. your output is hard to read - not sure how many buckets each one generated, but use something that generates the same buckets
and realize that since you have histograms, you'll get different plans for different "inputs" over time. sometimes (since you do not have a bucket PER VALUE) dbms_stats will get the better plan than analyze did - just change your "predicate" and you'll see - you have an imperfect picture of your data here.
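One way to get the same number of buckets from dbms_stats is to request it explicitly in method_opt (254 is the maximum; a bucket per value is only possible when the number of distinct values is at most 254) - the other parameter values here are taken from the commands above:

```sql
BEGIN
   DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => 'GFM',
      tabname          => 'SUBCONFIG_CONFIGURABLE_OBJ_T',
      estimate_percent => 40,
      method_opt       => 'for all indexed columns size 254',
      cascade          => TRUE );
END;
/
```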
different execution plans when using ANALYZE versus DBMS_STATS
A reader, May 20, 2008 - 10:30 am UTC
hi tom,
Thanks for your answer. While doing further investigation of ANALYZE versus DBMS_STATS, I found that I get the same histograms with COMPUTE statistics, but different ones with ESTIMATE ... SAMPLE x PERCENT.
I would now assume that the internal sampling is done differently by ANALYZE than by DBMS_STATS. Correct?
May 20, 2008 - 11:45 am UTC
if you ran analyze over and over - you would discover IT generates different statistics.
they are random samples, if you randomly sample something over and over - you'll get different results....
how oracle will use the statistics after rename a table
Yuna, May 30, 2008 - 9:24 am UTC
Hi Tom,
We are going to delete tons of rows from a table. We will do the following:
1) create a temp table;
2) copy the good data from old table (A) to the temp table;
3) create index and trigger in the temp table;
4) rename old table A to A_archive;
5) rename the temp table to A;
6) truncate the A_archive table;
The question is: after the temp table is renamed to A (the same name), will Oracle use the old statistics from the old table? In that case, do we need to analyze the new table right away?
May 30, 2008 - 3:07 pm UTC
the statistics are tracked by object ids, not the 'name'.
ops$tkyte%ORA10GR2> create table t1 as select * from all_objects;
Table created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T1' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> create table t2 as select * from t1 where owner = 'SCOTT';
Table created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name in ( 'T1', 'T2' );
TABLE_NAME NUM_ROWS
------------------------------ ----------
T1 49904
T2 16
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter table T1 rename to A_archive;
Table altered.
ops$tkyte%ORA10GR2> alter table T2 rename to T1;
Table altered.
ops$tkyte%ORA10GR2> truncate table A_archive;
Table truncated.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select table_name, num_rows from user_tables where table_name in ( 'T1', 'T2', 'A_ARCHIVE' );
TABLE_NAME NUM_ROWS
------------------------------ ----------
A_ARCHIVE 49904
T1 16
about the statisitcs
yuna, June 10, 2008 - 2:18 pm UTC
Thank you very much Tom!
I have the following questions.
1) After creating a table T1 and doing work for a while.
a: Without analyzing it: select table_name, num_rows from user_tables where table_name = 'T1';
No rows returned. But we can see the statistics in the Oracle Enterprise Manager console.
b: After analyzing it: select table_name, num_rows from user_tables where table_name = 'T1';
A row will be returned.
Does this mean that after creating a table and never analyzing it manually, the Oracle database still gets the related statistics for it?
2) If I want to calculate the table size using user_extents, do I need to analyze the table in order to get the correct number? Are the statistics from analyze or dbms_stats only for execution plans?
Thank you again for your wisdom.
Best regards, Yuna
June 10, 2008 - 2:59 pm UTC
a: Without analyzing it: select table_name, num_rows from user_tables where
table_name = 'T1';
No rows returned. But we can see the statistics in the Oracle Enterprise Manager
console.
sorry, but that is not true, that returns a row as soon as the table is 'there', you have made a mistake somewhere.
2) user extents shows you allocated space, it is always accurate about what is allocated for the table.
still the statistics after creating that table (2)
Yuna, June 11, 2008 - 9:48 am UTC
Hi Tom,
Please see the following script, which shows a newly created table having no statistics in user_tables before it is analyzed.
SQL S3 >create table temptest2 as select * from JCL_SCHEDULE;
SQL S3 >select count(*) from temptest2;
COUNT(*)
----------
10
SQL S3 >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
NUM_ROWS
----------
SQL S3 >analyze table temptest2 compute statistics for table for all indexed columns
  2  for all indexes;
Table analyzed.
SQL S3 >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
NUM_ROWS
----------
10
Thank you very much for your comments.
Best regards
June 11, 2008 - 9:55 am UTC
yes, so, this is how it is supposed to work and you got a record EACH TIME - both times you got a record.
so, what is the question?
(do not use analyze to gather statistics, use dbms_stats)
please read it again
A reader, June 11, 2008 - 10:51 am UTC
Please read the process again. There is no record returned the first time I run:
SQL S3 >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';
NUM_ROWS
----------
Thank you!
Best regards, Yuna
June 11, 2008 - 11:19 am UTC
sure there is.
select ROWNUM, num_rows from user_tables where table_name = 'TEMPTEST2';
you cannot "see" null.
ops$tkyte%ORA9IR2> select null from dual where 1=0;
no rows selected
ops$tkyte%ORA9IR2> select null from dual where 1=1;
N
-
ops$tkyte%ORA9IR2>
see the difference.
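One way to make the NULL itself visible in SQL*Plus (a sketch using NVL):

```sql
-- num_rows is NULL until statistics are gathered; nvl makes that obvious
select table_name,
       nvl( to_char(num_rows), '(no statistics yet)' ) num_rows
  from user_tables
 where table_name = 'TEMPTEST2';
```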
no record ( or null record) returned
A reader, June 11, 2008 - 1:55 pm UTC
Hi Tom,
Thank you for the explanation. Maybe my question can be stated this way: why, before analyzing the table (with 10 rows in it), does running "select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2';" return NULL (no value) in SQL*Plus, while the result shows in the DB console? After analyzing the table, the same query returns the real num_rows to both SQL*Plus and the DB console.
SQL S3 >create table temptest2 as select * from JCL_SCHEDULE;
SQL S3 >select count(*) from temptest2; ( this is the real number)
COUNT(*)
----------
10
SQL S3 >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2'; ( NULL returned before analyzing)
NUM_ROWS
----------
SQL S3 >analyze table temptest2 compute statistics for table for all indexed columns for all indexes;
Table analyzed.
SQL S3 >select NUM_ROWS from user_tables where TABLE_NAME ='TEMPTEST2'; ( the real number returned).
NUM_ROWS
----------
10
Thank you!
Best regards, Yuna
June 11, 2008 - 5:50 pm UTC
(please stop using analyze)
tell me the precise steps you undertake in dbconsole from the home page in order to see what you say you see.
analyze table
A reader, June 12, 2008 - 9:01 am UTC
Thank you very much Tom for the answer.
I am sorry that I gave you wrong info about the db console. I could not see the stats there either until I used dbms_stats to gather the information (exec dbms_stats.gather_table_stats(user, 'TEMPTEST2')). There was another long-existing table before; I could get info on it from the db console, but not from user_tables. So I assumed the same for the newly created table.
The steps are: Schema, then Tables, then the specific table.
So I could not get the stats either from user_tables or the DB console before analyzing the newly created table.
Best regards, Yuna
privileges
A reader, June 12, 2008 - 11:47 am UTC
Hi Tom,
I'd like to know what specific privilege is used to analyze a table?
Thank you!
Best regards, Yuna
June 12, 2008 - 1:06 pm UTC
MV logs need gather stats ?
Megala, June 22, 2008 - 6:54 pm UTC
Tom:
Does materialized view log tables need to be analyzed or gather stats (dbms_stats)? Thanks
June 22, 2008 - 9:52 pm UTC
they could, sure - they are tables. In 10g, if they are queried without statistics, we will dynamically sample them at parse time.
so you probably do not need to do anything special for them; we'll sample them at parse time and, given how simply they are used, that is probably more than sufficient.
MV log table analyze and estimate_percent => null question
Megala, June 22, 2008 - 11:04 pm UTC
Thanks Tom.
In Oracle 9i, does an MV log table always need to be analyzed with current stats, given that after the MVs are refreshed the MV log is emptied anyway?
Will there be a performance problem if MV log tables are not analyzed?
Thanks for your insight.
2) In Oracle 9.2.0.6, when the estimate_percent parameter is not specified, will Oracle gather 100% table stats?
Example:
exec dbms_stats.gather_table_stats('CUR_DBA2', tabname => 'CTBILLINGTABLE', degree => dbms_stats.default_degree, cascade =>
true ) ;
June 23, 2008 - 7:42 am UTC
1) probably not. In 9i, dynamic sampling is set to 1
"Sample tables that have not been analyzed if there is more than one table in the query, the table in question has not been analyzed and it has no indexes, and the optimizer determines that the query plan would be affected based on the size of this object."
Hence, if you are using the CBO and we access this table using the CBO in a multi-table query, and this mlog$ table is 'large' (allocated space is big), it'll sample it. for example:
ops$tkyte%ORA9IR2> create table t ( x int primary key, y int );
Table created.
ops$tkyte%ORA9IR2> insert into t select rownum, rownum from all_users;
35 rows created.
ops$tkyte%ORA9IR2> create materialized view log on t;
Materialized view log created.
ops$tkyte%ORA9IR2> select object_name, object_type from user_objects order by object_name;
OBJECT_NAME OBJECT_TYPE
------------------------------ ------------------
MLOG$_T TABLE
RUPD$_T TABLE
SYS_C004615 INDEX
T TABLE
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into mlog$_t
2 select rownum, sysdate, 'x', 'x', utl_raw.cast_to_raw( rpad('x',255,'x') ) from all_objects;
30662 rows created.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2> delete from mlog$_t;
30662 rows deleted.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> show parameter optimizer_dynamic_sampling
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 1
ops$tkyte%ORA9IR2> set autotrace traceonly explain
ops$tkyte%ORA9IR2> select * from t, mlog$_t where t.x = mlog$_t.x;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=161)
1 0 NESTED LOOPS (Cost=3 Card=1 Bytes=161)
2 1 TABLE ACCESS (FULL) OF 'MLOG$_T' (Cost=2 Card=1 Bytes=155)
3 1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=6)
4 3 INDEX (UNIQUE SCAN) OF 'SYS_C004615' (UNIQUE)
ops$tkyte%ORA9IR2> alter session set optimizer_dynamic_sampling = 0;
Session altered.
ops$tkyte%ORA9IR2> select * from t, mlog$_t where t.x = mlog$_t.x;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=82 Bytes=13202)
1 0 HASH JOIN (Cost=5 Card=82 Bytes=13202)
2 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=35 Bytes=210)
3 1 TABLE ACCESS (FULL) OF 'MLOG$_T' (Cost=2 Card=82 Bytes=12710)
ops$tkyte%ORA9IR2> set autotrace off
see how the plan differs with and without dynamic sampling at its default level...
2) documentation is actually quite good for that
http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96612/d_stats2.htm#1003993 .. estimate_percent
Percentage of rows to estimate
(NULL means compute) The valid range ..
estimate_percent => null
A reader, June 24, 2008 - 6:36 pm UTC
Tom,
I just need to clarify:
is estimate_percent => null (the default value) equivalent to
estimate_percent => 100 in dbms_stats.gather_table_stats?
Thanks
100% table analyze
A reader, June 24, 2008 - 11:40 pm UTC
Tom:
my goal is to analyze the table 100%.
Can you suggest whether I should use the default "compute" or estimate_percent => 100 for any table?
Is there any difference between estimate(100) and compute in dbms_stats.gather_table_stats?
Under what circumstance should one use "compute" over estimate_percent => 100?
June 25, 2008 - 8:29 am UTC
they are the same, which pleases you more? null or 100?
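In other words, the two calls below should gather identical statistics (a sketch):

```sql
-- null means "compute", i.e. read every row - same as 100
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => null );
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => 100 );
```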
Lock Schema or Table Stats
A reader, July 29, 2008 - 1:48 pm UTC
Which data dictionary view contains the info about whether a table or a schema's statistics is currently locked?
August 01, 2008 - 10:22 am UTC
ops$tkyte%ORA10GR2> select stattype_locked from user_tab_statistics where table_name = 'T';
STATT
-----
ops$tkyte%ORA10GR2> exec dbms_stats.lock_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select stattype_locked from user_tab_statistics where table_name = 'T';
STATT
-----
ALL
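To reverse the lock shown above, the corresponding unlock calls are (a sketch):

```sql
exec dbms_stats.unlock_table_stats( user, 'T' );   -- table level
exec dbms_stats.unlock_schema_stats( user );       -- or schema level

-- schema-wide check: stattype_locked is non-null for locked objects
select table_name, stattype_locked
  from user_tab_statistics
 where stattype_locked is not null;
```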
confusing stats
a reader, September 04, 2008 - 4:49 pm UTC
Hey Tom:
I created a table.
SQL> select count(*) from MYBITMAP;
COUNT(*)
----------
1270064
SQL> analyze table mybitmap estimate statistics;
Table analyzed.
Elapsed: 00:00:01.44
SQL> select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
NUM_ROWS BLOCKS EMPTY_BLOCKS PCT_FREE PCT_USED CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
1279153 31156 204 10 0 172
SQL> exec dbms_stats.gather_table_stats(ownname => 'JOSH', tabname => 'MYBITMAP', estimate_percent => 10);
PL/SQL procedure successfully completed.
SQL> select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
NUM_ROWS BLOCKS EMPTY_BLOCKS PCT_FREE PCT_USED CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
1267080 31156 204 10 0 169
So how come the number of rows is different between analyze and the dbms_stats package, and neither of them matches the real number of rows in the table?
Thank you very much!
September 05, 2008 - 7:40 am UTC
because you used estimate on the analyze (you saw that right? the word "estimate")
and you used a 10% estimate with dbms_stats (estimate... and estimate)
and estimate is - a guesstimate, an approximation, a guess. Consider:
ops$tkyte%ORA10GR2> create table t as select * from all_objects;
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 10 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
49600
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 10 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select num_rows from user_tables where table_name = 'T';
NUM_ROWS
----------
50730
just run dbms_stats and you'll get slightly different answers time after time. It is normal, expected, anticipated even.
Do not use analyze to gather statistics on segments, the optimizer is built to use information dbms_stats gathers and dbms_stats gathers things differently than analyze does. dbms_stats has been kept up to date over the years whereas analyze isn't as far as statistics gathering goes.
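If the goal is simply "good statistics" rather than a specific percentage, letting dbms_stats pick the sample size is usually the path of least resistance (a sketch; AUTO_SAMPLE_SIZE is the documented default, and in 11g it uses an improved algorithm that approaches compute accuracy at sample speed):

```sql
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => dbms_stats.auto_sample_size );
```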
confusing stats
A reader, September 04, 2008 - 5:17 pm UTC
Sorry, I get it now - it is because they are only estimating the stats; after I do "compute" the number of rows is correct. But why is the average row length from the package different from analyze?
SQL> analyze table mybitmap compute statistics;
Table analyzed.
Elapsed: 00:02:28.66
SQL> select NUM_ROWS, blocks, empty_blocks,PCT_FREE,PCT_USED,CHAIN_CNT,AVG_ROW_LEN from user_tables where table_name='MYBITMAP';
NUM_ROWS BLOCKS EMPTY_BLOCKS PCT_FREE PCT_USED CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- ---------- -----------
1270064 31156 204 10 0 172
September 05, 2008 - 8:03 am UTC
because analyze is legacy code, not maintained and you should only use dbms_stats to gather statistics.
DBMS_STATS in parallel
A reader, October 02, 2008 - 7:29 am UTC
Hi Tom,
If stats are being gathered for one partition and there is no index on the table, can Oracle use a degree of parallelism? If yes, what happens in parallel?
Thanks
October 02, 2008 - 7:49 am UTC
yes it can use parallel.
The queries that dbms_stats execute can use parallel execution. That is what would be done in parallel - the sampling queries dbms_stats runs
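A single-partition gather with an explicit degree might look like this (a sketch; the table and partition names are hypothetical):

```sql
begin
   dbms_stats.gather_table_stats(
      ownname  => user,
      tabname  => 'SALES',          -- hypothetical table
      partname => 'SALES_2008_Q3',  -- just this partition
      degree   => 4 );              -- sampling queries run with parallel 4
end;
/
```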
degree of parallellism
A reader, October 04, 2008 - 9:45 pm UTC
Hi Tom,
Thanks.
If the table is not defined with a parallel clause, and the degree of parallelism is not specified in the dbms_stats call, does Oracle automatically use a degree based upon spfile parameters?
Regards,
October 06, 2008 - 2:47 pm UTC
you would see what the default is - which varies by version.
check out the documentation for your release of the database.
in which case do Analyze and dbms_stats do the same job (from the optimizer's perspective)?
Mahesh, October 06, 2008 - 1:34 pm UTC
Tom,
Will following do the same thing ?
ANALYZE TABLE <tab_name> COMPUTE STATISTICS;
and
DBMS_STATS.GATHER_TABLE_STATS (
ownname => <owner>,
tabname =><tab_name>,
estimate_percent => 100 ); --> 100%
I am concerned because one of our applications requires full statistics on certain IOT tables. But a few IOT tables error out (ORA-600) due to bug# 4763768.
So I am thinking of using dbms_stats with estimate_percent => 100.
Will that be the same as ANALYZE with COMPUTE STATISTICS?
Query Perf Slow after Analyze...
A reader, October 28, 2008 - 5:26 pm UTC
Hi Tom,
In our database :
Ver : 9.2.0.6
Optimizer Mode: FIRST_ROWS
Day #1: dbms_stats was used to gather stats, and query "A" takes 10 sec to complete.
Day #2: the dbms_job failed and no stats were gathered; query "A" takes 10 sec to complete.
....
....
Day #50: still no stats gathered since day #1; query "A" takes 10 sec to complete.
Day #51: we then used the analyze command (analyze table ... compute statistics) to gather stats.
Day #52: we noticed that query "A", which finished in 10 sec earlier, now takes 1 hr.
Day #52: added the hint /*+ Rule */ to query "A" and it now takes 12 sec.
Question:
a) If we gather stats today, on day #52, using dbms_stats, will the same query finish in ~12 sec?
b) Could the CBO not use the stats gathered by "analyze table ... compute statistics" efficiently? Your comments please.
October 28, 2008 - 7:10 pm UTC
try it, you should not be using analyze, the optimizer is written to use the values dbms_stats provides - which are different than analyze will produce.
Query Perf Slow after Analyze...
A reader, October 28, 2008 - 8:49 pm UTC
Hi Tom
We have gathered stats using dbms_stats (gather auto) and it has finished.
There are 1000 tables in our schema. All were analyzed using the "analyze" command on day #52 as mentioned above.
Now, after running dbms_stats.gather_schema_stats (gather auto), we found last_analyzed updated today on only 10 tables.
Question:
a) Would all the necessary stats which the optimizer looks for have been gathered?
If not,
b) do we need to delete statistics and then run dbms_stats again to gather the schema stats?
c) Query "A" still takes 1 hr to complete.
October 28, 2008 - 9:47 pm UTC
you only gathered on 10 things. So, you can probably answer (a) yourself.
were any of the objects query A accesses in that set of 10?
Query Perf Slow after Analyze...
A reader, October 28, 2008 - 9:57 pm UTC
continued.....
d) If I run
exec dbms_stats.delete_schema_stats('SCHEMA_OWNER')
and then run
exec dbms_stats.gather_schema_stats with gather auto on that schema,
what stats will the optimizer use after the delete and before the gather?
e) Is it safe to do d) above on a live server?
Regards.
October 29, 2008 - 7:47 am UTC
d) the ones the documentation says would be there - in short, if you delete them all, gather will gather all of them.
and I'm not sure you want to use auto, just gather_schema_stats and let as many things default as possible
and do this IN A TEST INSTANCE, IF YOU DO THIS ON PRODUCTION WITHOUT TESTING - PLEASE DON'T COME BACK HERE. You never make major changes in a live system.
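Letting as much as possible default, as suggested, reduces the call to something like (a sketch):

```sql
-- in a TEST instance first: gather for the whole schema with defaults
exec dbms_stats.gather_schema_stats( ownname => 'ZZ' );
```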
Query Perf Slow after Analyze...
A user, October 28, 2008 - 10:02 pm UTC
...you only gathered on 10 things. So, you can probably answer (a) yourself.
Sorry Tom
I was not making things clear earlier.
I ran
exec dbms_stats.gather_schema_stats(ownname=>'ZZ', degree=>4, cascade=>true, options=>'gather auto');
....were any of the objects query A accesses in that set of 10?
NO
October 29, 2008 - 7:48 am UTC
you said above "and last analyzed was today for only 10 things" - so apparently, you gathered statistics for 10 things, out of 1000
please stop playing in production right now - I'll not follow up anymore until you tell me "we are doing this in test now"
Query Perf Slow after Analyze...
A reader, October 29, 2008 - 12:34 am UTC
HI Tom.
>you only gathered on 10 things. So, you can probably answer (a) yourself.
You are right. I used the 'gather' option to collect stats on all the tables/indexes.
Now query "A" has dbms_stats statistics for all the objects it accesses.
But it still takes long to execute. Does that mean the query itself has a problem and the stats are correct?
Query Perf Slow after Analyze...
A Reader, October 29, 2008 - 2:56 pm UTC
Sorry Tom,
I understand the point which you made above.
Regards.
Analyze versus DBMS_STATS
Edson, December 17, 2008 - 8:18 am UTC
Hello my name is Edson and I'm from Brazil.
So I have been having problems using DBMS_STATS: when I use Analyze on 4 tables, my query improves by 20 seconds, but when I use DBMS_STATS it improves by only 13 seconds.
I think I am using DBMS_STATS wrong.
The way I am using it is:
EXECUTE SYS.DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'USER',TABNAME=>'TABLE',GRANULARITY=>'ALL',METHOD_OPT=>'FOR ALL COLUMNS SIZE 1', ESTIMATE_PERCENT=>NULL,CASCADE=>TRUE, DEGREE=>6);
Please Help me.
December 29, 2008 - 12:28 pm UTC
do you actually get different plans?
show us the TKPROFS with wait events included.
The difference between 13 and 20 could be "not different" for all we know. Show us the row source operations from a tkprof for the two.
Query Performance in 10g
Suzanne, December 31, 2008 - 2:01 pm UTC
We use Siebel as our CRM and I've just upgraded the Oracle database from 9.2.0.7 to 10.2.0.4. The database performance has been uneven but generally slower under 10g than 9i. One of the solutions Siebel asked us to implement was the removal of all statistics on any empty table (there are many in our Siebel implementation, but we don't use them - that is why they are empty). This seems to have improved our performance significantly. This runs contrary to all I know about Oracle and statistics. Why did the removal of statistics on empty tables give us a significant performance boost?
January 05, 2009 - 10:20 am UTC
are they really empty?
or - are they empty usually - filled by the application - and then used.
I would say "probably, they are empty when you analyze and filled by the application at run time"
If you never query them - then removing the statistics would do nothing for you.
So, you must query them... And if you query them, they probably have data (else you would always get zero rows...).
In 9i, you had no statistics on these tables - and if they were not indexed, dynamic sampling would kick in. Dynamic sampling would have told the optimizer "there are 10,000 rows in here". We'd get the right plan.
In 10g, the auto job would find these unanalyzed tables and analyze them. Putting into the dictionary the "fact" that there are 0 rows. However, at run time, there would actually be 10,000 rows in there. When we have the wrong 'card=' values - we get the wrong plan. So, because we think 0 records - we get the wrong plan.
Remove the statistics and in 10g, we'd dynamically sample at hard parse time. We would discover "hey, there are 10,000 rows". We'd get the right plan.
https://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html
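A common way to make this arrangement stick in 10g - so the nightly auto-gather job does not put the misleading "zero rows" statistics back - is to delete and then lock the statistics on those tables. A sketch, with a hypothetical table name:

```sql
-- remove the misleading "0 rows" statistics...
exec dbms_stats.delete_table_stats( user, 'S_UNUSED_TABLE' );
-- ...and lock them, so the auto job skips the table; dynamic sampling
-- then kicks in at hard parse time and sees the real row count
exec dbms_stats.lock_table_stats( user, 'S_UNUSED_TABLE' );
```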
RE:Analyze versus DBMS_STATS
Edson, January 05, 2009 - 7:56 am UTC
Hi Tom,
My argument that the plan cost with Analyze is better than with DBMS_STATS was wrong... I have checked my DB and DBMS_STATS gets the same plan cost.
Thank you for your answer.
And I want to tell you that this topic is very rich and has helped me a great deal with my research.
Once more, I am really grateful for your help.
Oramag Article
S, January 17, 2009 - 1:03 pm UTC
Tom,
In the Oracle Magazine article On Dynamic Sampling, under the title When the Optimizer Guesses, you wrote:
"After gathering statistics, you ask the optimizer to estimate how many people were born in December, and it would almost certainly be able to come up with a very accurate estimate of 1/12 of the data (assuming a normal distribution of birth dates)."
It should probably read: (assuming the uniform distribution of birth dates).
Regards
January 17, 2009 - 1:28 pm UTC
hah, you are correct, I guess if it were "normal", it would have most people in the summer and very few in the winter.
Unless you are in Australia and then it would be the opposite...
thanks and I agree.
DBMS_STATS Problem !!!!
Indranil DAS, March 09, 2009 - 6:10 am UTC
Hi Tom,
We are not able to figure out why the error below occurs:
SQL> BEGIN dbms_stats.GATHER_TABLE_STATS('MEA_REPORTS','RPT_OUTER_CARTON_T', estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE, block_sample=>TRUE, method_opt=>
'for all indexed columns size AUTO', granularity=>'ALL',cascade=>TRUE, degree=>4); END;
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
ORA-06512: at "SYS.DBMS_STATS", line 10502
ORA-06512: at "SYS.DBMS_STATS", line 10516
ORA-06512: at line 1
Tom, what went wrong here? Please suggest.
Thanks in advance for your help.
Indranil
Oracle DBA,IBM
March 09, 2009 - 12:59 pm UTC
please utilize support/metalink for that.
I remember something vaguely relating to a function based index in early releases causing this.
Extended Statistics
Arvind Mishra, May 04, 2009 - 8:21 pm UTC
Hello Tom,
a) Please can you tell me how I can effectively use extended statistics?
b) If I use expression statistics, is a function-based index still useful? When should I use a function-based index and when should I use expression statistics?
c) How can I see which multicolumn statistics have been gathered on a table?
d) Is this a new feature in 11g?
Thanks,
ARVIND
May 09, 2009 - 10:32 am UTC
a) by gathering statistics on expressions or by gathering statistics on correlated columns... In the same manner you would for an INDIVIDUAL column.
b) sure, if you gathered statistics on EMPNO in EMP, but did not have a primary key on EMPNO in EMP and EMP had a lot of records - would
select * from emp where empno = :x;
run any faster? Of course not :) you would certainly create an index in order to retrieve that single record quickly.
Same with an expression!!!
ops$tkyte%ORA11GR1> create table emp
2 as
3 select object_id empno,
4 trunc( dbms_random.value( 1000, 4000 ) ) sal,
5 trunc( dbms_random.value( 5000, 10000 ) ) comm,
6 decode( mod(rownum,2), 0, 'N', 'Y' ) flag1,
7 decode( mod(rownum,2), 0, 'Y', 'N' ) flag2
8 from stage
9 /
Table created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> create index flag_idx on emp(flag1,flag2);
Index created.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
<b>Now, without any extended statistics:</b>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> begin
2 for x in ( select /*+ gather_plan_statistics */ * from emp where flag1 = 'N' )
3 loop
4 null;
5 end loop;
6 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
7 loop
8 dbms_output.put_line( '.'||x.plan_table_output );
9 end loop;
10 for x in ( select /*+ gather_plan_statistics */ * from emp where flag2 = 'N' )
11 loop
12 null;
13 end loop;
14 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
15 loop
16 dbms_output.put_line( '.'||x.plan_table_output );
17 end loop;
18 for x in ( select /*+ gather_plan_statistics */ * from emp where flag1 = 'N' and flag2 = 'N' )
19 loop
20 null;
21 end loop;
22 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
23 loop
24 dbms_output.put_line( '.'||x.plan_table_output );
25 end loop;
26 end;
27 /
.SQL_ID 6ukxu8bf6rrbc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG1 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 34232 |00:00:00.01 | 552 |
.|* 1 | TABLE ACCESS FULL| EMP | 1 | 34232 | 34232 |00:00:00.01 | 552 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 1 - filter("FLAG1"='N')
.
.SQL_ID 1hwwuw5p2v7s8, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG2 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 34232 |00:00:00.01 | 552 |
.|* 1 | TABLE ACCESS FULL| EMP | 1 | 34232 | 34232 |00:00:00.01 | 552 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 1 - filter("FLAG2"='N')
.
.SQL_ID 00s1uatbzv2j2, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE FLAG1 = 'N' AND
.FLAG2 = 'N'
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 211 |
.|* 1 | TABLE ACCESS FULL| EMP | 1 | 17116 | 0 |00:00:00.01 | 211 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 1 - filter(("FLAG1"='N' AND "FLAG2"='N'))
.
PL/SQL procedure successfully completed.
<b>it gets the right estimated cardinalities for flag1=N, for flag2=N, but totally messes up flag1=N and flag2=N. Add that pair:</b>
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec dbms_output.put_line( dbms_stats.create_extended_stats( user, 'EMP', '(FLAG1,FLAG2)' ) );
SYS_STUGVNB7PTIYWAVPJX#YT77WGD
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> begin
2 for x in ( select /*+ gather_plan_statistics */ * from emp EMP2 where flag1 = 'N' and flag2 = 'N' )
3 loop
4 null;
5 end loop;
6 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
7 loop
8 dbms_output.put_line( '.'||x.plan_table_output );
9 end loop;
10 end;
11 /
.SQL_ID 769uyry623wf9, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP EMP2 WHERE FLAG1 = 'N'
.AND FLAG2 = 'N'
.
.Plan hash value: 3400700027
.
.--------------------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.--------------------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 | 2 |
.| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 1 | 0 |00:00:00.01 | 2 |
.|* 2 | INDEX RANGE SCAN | FLAG_IDX | 1 | 1 | 0 |00:00:00.01 | 2 |
.--------------------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 2 - access("FLAG1"='N' AND "FLAG2"='N')
.
PL/SQL procedure successfully completed.
<b> cardinality is right, plan changes - all good
Now, do an expression:</b>
ops$tkyte%ORA11GR1> begin
2 for x in ( select /*+ gather_plan_statistics */ * from EMP where ln(sal+comm) > 9 )
3 loop
4 null;
5 end loop;
6 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
7 loop
8 dbms_output.put_line( '.'||x.plan_table_output );
9 end loop;
10 end;
11 /
.SQL_ID f03rzhdyzy1gc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE LN(SAL+COMM) > 9
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 58463 |00:00:08.42 | 794 |
.|* 1 | TABLE ACCESS FULL| EMP | 1 | 3423 | 58463 |00:00:08.42 | 794 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 1 - filter(LN("SAL"+"COMM")>9)
.
PL/SQL procedure successfully completed.
<b>well, we can see that is wrong (the estimated row count) </b>
ops$tkyte%ORA11GR1> alter table EMP add (comp as (ln(sal+comm)));
Table altered.
ops$tkyte%ORA11GR1> exec dbms_stats.gather_table_stats( user, 'EMP', method_opt => 'for all columns size 254', estimate_percent=>100);
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> begin
2 for x in ( select /*+ gather_plan_statistics */ * from EMP where ln(sal+comm) > 9 )
3 loop
4 null;
5 end loop;
6 for x in ( select plan_table_output from table(dbms_xplan.display_cursor(null, null, 'allstats last')) )
7 loop
8 dbms_output.put_line( '.'||x.plan_table_output );
9 end loop;
10 end;
11 /
.SQL_ID f03rzhdyzy1gc, child number 0
.-------------------------------------
.SELECT /*+ gather_plan_statistics */ * FROM EMP WHERE LN(SAL+COMM) > 9
.
.Plan hash value: 3956160932
.
.------------------------------------------------------------------------------------
.| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
.------------------------------------------------------------------------------------
.| 0 | SELECT STATEMENT | | 1 | | 58463 |00:00:05.61 | 794 |
.|* 1 | TABLE ACCESS FULL| EMP | 1 | 58481 | 58463 |00:00:05.61 | 794 |
.------------------------------------------------------------------------------------
.
.Predicate Information (identified by operation id):
.---------------------------------------------------
.
. 1 - filter(LN("SAL"+"COMM")>9)
.
PL/SQL procedure successfully completed.
<b>card = is now correct...</b>
c) query the dictionary:
ops$tkyte%ORA11GR1> select table_name, column_name from user_tab_col_statistics where table_name = 'EMP';
TABLE_NAME COLUMN_NAME
------------------------------ ------------------------------
EMP EMPNO
EMP SAL
EMP COMM
EMP FLAG1
EMP FLAG2
EMP SYS_STUGVNB7PTIYWAVPJX#YT77WGD
EMP COMP
7 rows selected.
that SYS_STU... generated name was dbms_output'ed above - that is the (flag1,flag2) extension. Our virtual column looks like a normal one - COMP.
ops$tkyte%ORA11GR1> SELECT extension_name, extension
2 FROM dba_stat_extensions
3 WHERE table_name = 'EMP';
EXTENSION_NAME
------------------------------
EXTENSION
-------------------------------------------------------------------------------
SYS_STUGVNB7PTIYWAVPJX#YT77WGD
("FLAG1","FLAG2")
COMP
("SAL"+"COMM")
d) yes
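For completeness, an extension can be removed again with drop_extended_stats (a sketch; the extension must match the one created above):

```sql
exec dbms_stats.drop_extended_stats( user, 'EMP', '(FLAG1,FLAG2)' );
```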
CHAIN_CNT affecting query plan ... ?
Gary, August 19, 2009 - 11:23 am UTC
Tom,
You said in an earlier post in this thread (December 16, 2005) that "dbms_stats won't gather things not used by the CBO such as chain_cnt/avg_space in particular." We ran into an odd situation recently where removing the chain_cnt statistic seemed to result in an improved plan.
The job stream was running ANALYZE followed by DBMS_STATS, since that was the only way the development team knew to get a count of chained rows.
We decided on a simplified test case, since the developer had been rebuilding the table with a higher pctfree to eliminate the chained rows; here is what I tried:
Use DBMS_STATS.EXPORT_TABLE_STATS to preserve existing stats on two tables used by the problem query.
ANALYZE TABLE <name> DELETE STATISTICS ; -- for both tables
Use IMPORT_TABLE_STATS to put the exported statistics back into place
This should have had the net effect of removing only the stats gathered by ANALYZE TABLE, including CHAIN_CNT and AVG_SPACE. The plan changed very much for the better after this procedure was done. The old plan used index access to hit the two tables; the new plan full scanned them then used them in hash joins. The tables were only 2 MB and 14 MB in size, so the hash join approach seems like a no brainer chained rows or no.
The developer seemed pretty sure before this that CHAIN_CNT was affecting the query plan ... now I'm not sure what to tell him. (Maybe some other stat gathered by ANALYZE TABLE is to blame?)
Thanks,
Gary
August 24, 2009 - 5:00 pm UTC
did you export column and index statistics as well?
else you deleted A LOT of stuff.
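For reference, the round trip described above might be sketched like this (STATTAB is a made-up stat table name; the key detail is that cascade => true, the default, is what carries the index and column statistics along):

```sql
-- a sketch, assuming a table T in the current schema; STATTAB is illustrative
exec dbms_stats.create_stat_table( user, 'STATTAB' );

-- cascade => true (the default) exports the index stats; column stats come along too
exec dbms_stats.export_table_stats( user, 'T', stattab => 'STATTAB', cascade => true );

analyze table t delete statistics;

exec dbms_stats.import_table_stats( user, 'T', stattab => 'STATTAB', cascade => true );

exec dbms_stats.drop_stat_table( user, 'STATTAB' );
```

If only the table-level stats were exported, the ANALYZE ... DELETE would have wiped the column histograms and index statistics with nothing to restore them from - which by itself could explain a plan change.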
stats
sam, August 27, 2009 - 12:43 pm UTC
Tom:
I want to setup a nightly job to update stats in a database.
What is the standard statement you would do. is it
dbms_stats.gather_schema_stats('MYAPP');
I noticed this did not update the stats in USER_INDEXES and COLUMNS.
August 28, 2009 - 4:57 pm UTC
Sam/SMK
you know my answer is going to be
it depends.
You noticed something incorrectly. It does.
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(x);
Index created.
ops$tkyte%ORA10GR2> insert into t select rownum from all_users;
42 rows created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows from user_tables where table_name = 'T'
2 union all
3 select 'i', num_rows from user_indexes where index_name = 'T_IDX'
4 union all
5 select 'c', count(*) from user_tab_histograms where table_name = 'T';
' NUM_ROWS
- ----------
t
i
c 0
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_schema_stats( user );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows from user_tables where table_name = 'T'
2 union all
3 select 'i', num_rows from user_indexes where index_name = 'T_IDX'
4 union all
5 select 'c', count(*) from user_tab_histograms where table_name = 'T';
' NUM_ROWS
- ----------
t 42
i 42
c 2
how cool is that, 42 :) that was a pleasant surprise.
stats
A reader, August 29, 2009 - 1:51 pm UTC
Tom:
1. You are looking at num_rows column.
I was looking at LAST_ANALYZED date stamp in USER INDEXES.
The LAST_ANALYZED in USER_TABLES gets updated correctly to the date you run the statement, but not in USER_INDEXES.
2. On a second note, I was reading a book that says collecting stats may not be helpful every day, depending on the data.
The Oracle optimizer will generate new query plan every time you collect stats so it will not reuse old plans.
Is this correct, and do you advise scheduling it on a weekly basis?
August 29, 2009 - 7:27 pm UTC
1) sam, perhaps you should have looked again?
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(x);
Index created.
ops$tkyte%ORA10GR2> insert into t select rownum from all_users;
42 rows created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
ops$tkyte%ORA10GR2> select 't', num_rows, last_analyzed from user_tables where table_name = 'T'
2 union all
3 select 'i', num_rows, last_analyzed from user_indexes where index_name = 'T_IDX'
4 union all
5 select 'c', count(*), to_date( null) from user_tab_histograms where table_name = 'T';
' NUM_ROWS LAST_ANAL
- ---------- ---------
t
i
c 0
ops$tkyte%ORA10GR2> exec dbms_stats.gather_schema_stats( user );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select 't', num_rows, last_analyzed from user_tables where table_name = 'T'
2 union all
3 select 'i', num_rows, last_analyzed from user_indexes where index_name = 'T_IDX'
4 union all
5 select 'c', count(*), to_date( null) from user_tab_histograms where table_name = 'T';
' NUM_ROWS LAST_ANAL
- ---------- ---------
t 42 29-AUG-09
i 42 29-AUG-09
c 2
2) .. The Oracle optimizer will generate new query plan every time you collect stats
so it will not reuse old plans.
..
sam - you do understand that the major goal of collecting statistics is....
to cause plans to change because the size and shape of the data has changed.
so (this is dead serious, I'm not being sarcastic here at all): do you want your plans to change like that - or not?
Only you can answer the question "how often should I analyze"
you know your data
you know how your data changes
you know how your applications use your data
you know this, if you don't, stop - and get that information
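If, after answering those questions, a weekly schedule turns out to be the right call, one way to set it up is with DBMS_SCHEDULER - a sketch; the job name and timing here are made up:

```sql
-- a sketch: gather only stale statistics for schema MYAPP every Sunday at 2am
begin
  dbms_scheduler.create_job(
    job_name        => 'WEEKLY_MYAPP_STATS',   -- hypothetical name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin dbms_stats.gather_schema_stats( ''MYAPP'', options => ''GATHER STALE'' ); end;',
    repeat_interval => 'freq=weekly; byday=sun; byhour=2',
    enabled         => true );
end;
/
```

options => 'GATHER STALE' (with table monitoring enabled) limits the work to tables whose data has changed enough to matter, rather than re-gathering everything.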
index creation on table with LOCKED stats
psha, November 04, 2009 - 1:35 pm UTC
Hi Tom,
We are using Oracle 10g R2. We have one table with LOCKED stats. Now if I create an index on it, the index is not analyzed. We are planning to migrate to 11g; is there any way in 11g to lock table stats such that whenever we create an index, the index gets analyzed as soon as it is created?
Thanks
Reader
November 11, 2009 - 9:31 am UTC
no, when you lock, you lock.
you would unlock in order to gather, so just unlock, create index, lock
ops$tkyte%ORA11GR2> create table t
2 as
3 select * from all_objects;
Table created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create index t_idx on t(object_name);
Index created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec dbms_stats.lock_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select 't', table_name, num_rows, last_analyzed from user_tables where table_name = 'T'
2 union all
3 select 'i', index_name, num_rows, last_analyzed from user_indexes where table_name = 'T';
' TABLE_NAME NUM_ROWS LAST_ANALYZED
- ------------------------------ ---------- --------------------
t T 71588 11-nov-2009 10:30:13
i T_IDX 71588 11-nov-2009 10:30:13
ops$tkyte%ORA11GR2><b>
ops$tkyte%ORA11GR2> exec dbms_stats.unlock_table_stats( user, 'T' );
</b>
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> create index t_idx2 on t(owner,object_type);
Index created.
<b>
ops$tkyte%ORA11GR2> exec dbms_stats.lock_table_stats( user, 'T' );
</b>
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select 't', table_name, num_rows, last_analyzed from user_tables where table_name = 'T'
2 union all
3 select 'i', index_name, num_rows, last_analyzed from user_indexes where table_name = 'T';
' TABLE_NAME NUM_ROWS LAST_ANALYZED
- ------------------------------ ---------- --------------------
t T 71588 11-nov-2009 10:30:13
i T_IDX2 71588 11-nov-2009 10:30:14
i T_IDX 71588 11-nov-2009 10:30:13
A reader, January 24, 2010 - 3:43 pm UTC
After I issue "alter table move" or "index rebuild", do I need to analyze tables or use dbms_stats to gather statistics?
From your book I understood that column-level histograms and avg_row_len statistics are lost?
January 26, 2010 - 1:48 am UTC
from *my* book? where did you read that - so I can fix it.
ops$tkyte%ORA11GR1> create table t as select * from all_users;
Table created.
ops$tkyte%ORA11GR1> create index t_idx on t(username);
Index created.
ops$tkyte%ORA11GR1> begin
2 dbms_stats.gather_table_stats
3 ( user, 'T', method_opt => 'for all indexed columns size 254' );
4 end;
5 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select avg_row_len from user_tables where table_name = 'T';
AVG_ROW_LEN
-----------
20
ops$tkyte%ORA11GR1> select count(*) from user_tab_histograms where table_name = 'T';
COUNT(*)
----------
38
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> alter table t move;
Table altered.
ops$tkyte%ORA11GR1> alter index t_idx rebuild;
Index altered.
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select avg_row_len from user_tables where table_name = 'T';
AVG_ROW_LEN
-----------
20
ops$tkyte%ORA11GR1> select count(*) from user_tab_histograms where table_name = 'T';
COUNT(*)
----------
38
Now, that aside, you may well have to gather statistics after a reorganization since major metrics can and will change - like number of blocks for example (just for example, the others might be affected as well)
A reader, February 20, 2010 - 11:31 pm UTC
Hi
Suppose the statistics of two databases are exactly the same and the database version is also the same, but there is a big difference in terms of the amount of data.
Will the response time of the SQL queries be exactly the same?
March 01, 2010 - 5:36 am UTC
... Does the response time of the sql queries will be exactly same ? ...
of course not. Use a little critical thinking and your knowledge of "physics" and you should be able to come up with an example. Here is a simple one:
select * from t;
In both databases, statistics say "there are 1,000 records". Everything is the same between the two - with the exception that in database 1 there are 1,000 records and in database 2 there are 1,000,000,000 records.
Will they perform the same?
Even with the same volumes of data - even with exactly the SAME NUMBER OF ROWS - even with precisely the same exact bits and bytes, you would not, should not, expect the performance to be the same.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743#737753700346729628
ops$tkyte%ORA10GR2> set echo on
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t1;
ops$tkyte%ORA10GR2> drop table t2;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1 as select * from all_objects where 1=0 order by object_id;
ops$tkyte%ORA10GR2> create table t2 as select * from all_objects where 1=0 order by dbms_random.random;
ops$tkyte%ORA10GR2> create index t1_idx on t1(object_id);
ops$tkyte%ORA10GR2> create index t2_idx on t2(object_id);
ops$tkyte%ORA10GR2> insert into t1 select * from all_objects order by object_id;
ops$tkyte%ORA10GR2> insert into t2 select * from all_objects order by dbms_random.random;
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 70000, numblks => 1000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 70000, numblks => 1000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_index_stats( user, 'T1_IDX', numrows => 70000, numlblks => 100, numdist=>70000 );
ops$tkyte%ORA10GR2> exec dbms_stats.set_index_stats( user, 'T2_IDX', numrows => 70000, numlblks => 100, numdist=>70000 );
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set linesize 1000
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2> select * from t1 where object_id between 1 and 500;
456 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 546753835
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 175 | 17500 | 6 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 175 | 17500 | 6 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 315 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">=1 AND "OBJECT_ID"<=500)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
70 consistent gets
0 physical reads
0 redo size
19275 bytes sent via SQL*Net to client
730 bytes received via SQL*Net from client
32 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
456 rows processed
ops$tkyte%ORA10GR2> select * from t2 where object_id between 1 and 500;
456 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4244861920
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 175 | 17500 | 6 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T2 | 175 | 17500 | 6 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T2_IDX | 315 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">=1 AND "OBJECT_ID"<=500)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
490 consistent gets
0 physical reads
0 redo size
19275 bytes sent via SQL*Net to client
730 bytes received via SQL*Net from client
32 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
456 rows processed
ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2>
see, same exact data, exactly same data, same stats, exactly the same, same plans, same everything
except the amount of work performed - since the organization of data on disk is different.
A reader, March 12, 2010 - 8:19 pm UTC
Mmmm.
Thanks sir.
Suppose uat database is refreshed from production.
Is there any benefit for transferring statistics from production to UAT?
March 15, 2010 - 9:51 am UTC
depends on how you refresh.
if you refresh by dump and load - just remember - UAT and production are only as alike as your image in a funhouse mirror is like you. You should gather stats on UAT as the order of rows will be different, sizes of things will be different - everything will be different except for the VALUES of the data. Do expect different plans. You could likely get the same plans if you move the stats over - however - you have no idea if that plan is good bad or indifferent - since the data is organized differently.
if you refresh by restoring a backup, you are done, just use it.
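If you do decide to carry the production statistics over (say, to reproduce production plans on UAT for diagnosis), the supported mechanism is a statistics table - a sketch, with STATTAB as an illustrative name:

```sql
-- on production: stage the schema stats in a regular table
exec dbms_stats.create_stat_table( user, 'STATTAB' );
exec dbms_stats.export_schema_stats( user, stattab => 'STATTAB' );

-- move STATTAB to UAT with data pump (or any table-copy method), then on UAT:
exec dbms_stats.import_schema_stats( user, stattab => 'STATTAB' );
```

But per the caveat above: matching stats does not guarantee matching performance when the physical organization of the data differs.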
analyze
A reader, July 08, 2010 - 2:28 am UTC
greetings Thomas,
and thanks like always.
exec dbms_stats.gather_schema_stats('A');
exec dbms_stats.gather_table_stats('A','ACTIONS',estimate_percent => 100, cascade => true);
but the problem is that when Oracle accesses the table (ACTIONS),
it issues an analyze (I traced the session and found "analyze table A.ACTIONS compute statistics" in the trace file).
table_name: A.ACTIONS
number of rows: 0
number of columns: 148
number of indexes: 113
I think it is related to the number of indexes, because when I dropped them it worked fine. But my question is: why is that happening?
July 08, 2010 - 12:22 pm UTC
you think what is related to what???
what version are you on.
analyze
A reader, July 15, 2010 - 4:57 pm UTC
Greetings thomas,
and thanks like always.
regarding the above post:
-----you think what is related to what???
Lots of columns and lots of indexes in one table.
But that is a guess, and that is why I am asking you.
------what version are you on.
10.2.0.1
July 19, 2010 - 1:10 pm UTC
1,000 columns, 1,000 indexes, no analyze
I'd need a test case to reproduce with
drop table t;
declare
l_create long := 'create table t ( a int primary key';
begin
for i in 1 .. 999
loop
l_create := l_create || ', c' || i || ' int';
end loop;
execute immediate l_create || ')';
for i in 1 .. 999
loop
execute immediate 'create index t_idx' || i || ' on t(c' || i || ')';
end loop;
end;
/
alter session set sql_trace=true;
exec dbms_stats.gather_table_stats( user, 'T', estimate_percent=> 100, cascade=> true );
i did not see analyze in a trace file anywhere.
_optimizer_invalidation_period
Ravi B, September 14, 2010 - 1:56 pm UTC
Hi Tom,
I read somewhere that the hidden parameter "_optimizer_invalidation_period" influences the NO_INVALIDATE setting in DBMS_STATS. By default it is 18000 (5 hrs). What does it mean? That the (in)validation doesn't occur for 5 hrs after stats are gathered? I don't see any documentation on the Oracle support site either.
Thanks,
Ravi
September 15, 2010 - 7:36 am UTC
that is because it is undocumented, I don't discuss things like that really, you don't need to set them - except in response to support telling you to do so. Their meaning can, will, has changed from dot release to dot release - it would be unwise to play with them.
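The documented way to influence cursor invalidation is the NO_INVALIDATE parameter of the gathering procedures themselves - a sketch:

```sql
-- invalidate dependent cursors immediately after gathering
exec dbms_stats.gather_table_stats( user, 'T', no_invalidate => false );

-- or let Oracle spread the invalidations out over a period of time (the default)
exec dbms_stats.gather_table_stats( user, 'T', no_invalidate => dbms_stats.auto_invalidate );
```

Stick to the documented parameter; the hidden one is not yours to tune.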
Oracle 11g : Enterprise vs Standard Editions
Cj, February 04, 2011 - 2:26 am UTC
Hello,
I have few question regarding Oracle 11g Standard Edition and Enterprise Edition
1- Where could I find a very detailed document of the differences between Enterprise and Standard Editions on Oracle 11g?
2. In Standard Edition, is it possible to use the packages DBMS_STATS and DBMS_UTILITY?
Thanks in advance
February 04, 2011 - 9:32 am UTC
see
http://www.oracle.com/pls/db112/homepage every release has a License guide that details what comes with each edition.
yes, dbms_stats and dbms_utility are available with SE
Importing Statistics
Shravani, February 15, 2011 - 1:22 pm UTC
Hi Tom,
In our application, we have 2 schemas (A and B) with exactly the same tables and data. During the data load process we connect users to the B copy. We gather stats on the A schema using DBMS_STATS and then import the stats into the B schema.
However, we are seeing the execution plans and timings differ completely between the A and B schemas. A query executed in the A copy returns results quickly compared to the same query in B.
Can you please let me know what would make this happen? And what are the areas to look at?
Regards
Shravani.
February 15, 2011 - 4:54 pm UTC
show us a tkprof and a 'plan' (with estimated cardinalities - like from autotrace) of both.
Also - show us the method you use to copy the statistics.
Import Statistics
Shravani, February 16, 2011 - 11:51 am UTC
Thanks for your reply Tom.
Here is the query we are executing in A and B Copy
SELECT DISTINCT product_matrix.patient_case_key
FROM date_time_dim init_rcvd_date_condition,
dm_load_dt dm_load_dt_condition,
product_matrix,
product_dim product_dim_suspect
WHERE ( product_matrix.suspect_product_key =
product_dim_suspect.product_key
AND product_dim_suspect.role_indic = 'SUSPECT')
AND (product_matrix.init_recv_dt_key =
init_rcvd_date_condition.year_month_day)
AND (( dm_load_dt_condition.increment_dt >=
product_matrix.effective_dt AND dm_load_dt_condition.increment_dt < product_matrix.end_dt))
AND ( (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') = dm_load_dt_condition.reporting_dt)
AND dm_load_dt_condition.active = 'Y'
)
AND ( ( product_dim_suspect.generic_nm IN
('CELECOXIB')
OR 'ALL' IN ('CELECOXIB')
)
AND (dm_load_dt_condition.reporting_dt >=
SYS_CONTEXT ('pfast', 'mbss_release')
)
AND ((init_rcvd_date_condition.calendar_dt) >=
TO_DATE ('01-JAN-2010', 'DD-MON-YYYY')
)
AND ((init_rcvd_date_condition.calendar_dt) <=
TO_DATE ('25-SEP-2010', 'DD-MON-YYYY')
)
);
And here is the TKPROF for A
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.05 0 23 0 0
Execute 1 0.00 0.02 0 128 0 0
Fetch 210 0.00 5.41 0 36 0 3127
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 212 0.01 5.49 0 187 0 3127
Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 73 (FSTADM)
Rows Row Source Operation
------- ---------------------------------------------------
3127 PX COORDINATOR (cr=164 pr=0 pw=0 time=1302 us)
0 PX SEND QC (RANDOM) :TQ10002 (cr=0 pr=0 pw=0 time=0 us cost=171 size=106 card=1)
0 HASH UNIQUE (cr=0 pr=0 pw=0 time=0 us cost=171 size=106 card=1)
0 PX RECEIVE (cr=0 pr=0 pw=0 time=0 us)
0 PX SEND HASH :TQ10001 (cr=0 pr=0 pw=0 time=0 us)
0 FILTER (cr=0 pr=0 pw=0 time=0 us)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us cost=170 size=106 card=1)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us cost=170 size=276 card=3)
0 BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
0 PX RECEIVE (cr=0 pr=0 pw=0 time=0 us)
0 PX SEND BROADCAST :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
31 MERGE JOIN CARTESIAN (cr=36 pr=0 pw=0 time=2280 us cost=10 size=58 card=1)
31 TABLE ACCESS BY GLOBAL INDEX ROWID PRODUCT_DIM PARTITION: ROW LOCATION ROW LOCATION (cr=33 pr=0 pw=0 time=2100 us cost=8 size=39 card=1)
39 INDEX RANGE SCAN UX_PD_GNM_ROLE_PK (cr=3 pr=0 pw=0 time=0 us cost=2 size=0 card=12)(object id 8689572)
31 BUFFER SORT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
1 TABLE ACCESS BY INDEX ROWID DM_LOAD_DT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
1 INDEX RANGE SCAN IX_RPT_DATE_DM_LOAD_DT (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 8691296)
0 PX PARTITION HASH ALL PARTITION: 1 4 (cr=0 pr=0 pw=0 time=0 us cost=170 size=170 card=5)
0 TABLE ACCESS BY LOCAL INDEX ROWID PRODUCT_MATRIX PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us cost=170 size=170 card=5)
0 BITMAP CONVERSION TO ROWIDS (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP AND (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP INDEX SINGLE VALUE BMAK_MATRIX_SPPRODKEY PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8695173)
0 BITMAP MERGE (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP INDEX RANGE SCAN BMAK_MATRIX_EFFDT PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8694549)
0 INDEX UNIQUE SCAN PK_DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 8689529)
0 TABLE ACCESS BY INDEX ROWID DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=14 card=1)
unable to set optimizer goal
ORA-01986: OPTIMIZER_GOAL is obsolete
parse error offset: 33
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
3127 PX COORDINATOR
0 PX SEND (QC (RANDOM)) OF ':TQ10002' [:Q1002]
0 HASH (UNIQUE) [:Q1002]
0 PX RECEIVE [:Q1002]
0 PX SEND (HASH) OF ':TQ10001' [:Q1001]
0 FILTER [:Q1001]
0 NESTED LOOPS [:Q1001]
0 NESTED LOOPS [:Q1001]
0 NESTED LOOPS [:Q1001]
0 BUFFER (SORT) [:Q1001]
0 PX RECEIVE [:Q1001]
0 PX SEND (BROADCAST) OF ':TQ10000'
31 MERGE JOIN (CARTESIAN)
31 TABLE ACCESS GOAL: ANALYZED (BY
GLOBAL INDEX ROWID) OF 'PRODUCT_DIM' (TABLE)
PARTITION:ROW LOCATION
39 INDEX GOAL: ANALYZED (RANGE SCAN)
OF 'UX_PD_GNM_ROLE_PK' (INDEX (UNIQUE))
31 BUFFER (SORT)
1 TABLE ACCESS GOAL: ANALYZED (BY
INDEX ROWID) OF 'DM_LOAD_DT' (TABLE)
1 INDEX GOAL: ANALYZED (RANGE SCAN)
OF 'IX_RPT_DATE_DM_LOAD_DT' (INDEX)
0 PX PARTITION HASH (ALL) [:Q1001] PARTITION:
START=1 STOP=4
0 TABLE ACCESS GOAL: ANALYZED (BY LOCAL
INDEX ROWID) OF 'PRODUCT_MATRIX' (TABLE) [:Q1001]
PARTITION:KEY STOP=88
0 BITMAP CONVERSION (TO ROWIDS) [:Q1001]
0 BITMAP AND [:Q1001]
0 BITMAP INDEX (SINGLE VALUE) OF
'BMAK_MATRIX_SPPRODKEY' (INDEX (BITMAP)) [:Q1001]
PARTITION:KEY STOP=88
0 BITMAP MERGE [:Q1001]
0 BITMAP INDEX (RANGE SCAN) OF
'BMAK_MATRIX_EFFDT' (INDEX (BITMAP)) [:Q1001]
PARTITION:KEY STOP=88
0 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF
'PK_DATE_TIME_DIM' (INDEX (UNIQUE)) [:Q1001]
0 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'DATE_TIME_DIM' (TABLE) [:Q1001]
=========================================
Here is TKProf for B
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.04 0 26 0 0
Execute 1 0.00 0.02 0 100 0 0
Fetch 210 0.00 69.29 0 36 0 3127
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 212 0.02 69.36 0 162 0 3127
Misses in library cache during parse: 1
Optimizer goal: ALL_ROWS
Parsing user id: 76 (FSTBDM)
Rows Row Source Operation
------- ---------------------------------------------------
3127 PX COORDINATOR (cr=136 pr=0 pw=0 time=1693 us)
0 PX SEND QC (RANDOM) :TQ10002 (cr=0 pr=0 pw=0 time=0 us cost=517 size=114 card=1)
0 HASH UNIQUE (cr=0 pr=0 pw=0 time=0 us cost=517 size=114 card=1)
0 PX RECEIVE (cr=0 pr=0 pw=0 time=0 us)
0 PX SEND HASH :TQ10001 (cr=0 pr=0 pw=0 time=0 us)
0 FILTER (cr=0 pr=0 pw=0 time=0 us)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us cost=516 size=114 card=1)
0 NESTED LOOPS (cr=0 pr=0 pw=0 time=0 us cost=516 size=368 card=4)
0 BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
0 PX RECEIVE (cr=0 pr=0 pw=0 time=0 us)
0 PX SEND BROADCAST :TQ10000 (cr=0 pr=0 pw=0 time=0 us)
31 MERGE JOIN CARTESIAN (cr=36 pr=0 pw=0 time=780 us cost=99 size=58 card=1)
31 TABLE ACCESS BY GLOBAL INDEX ROWID PRODUCT_DIM PARTITION: ROW LOCATION ROW LOCATION (cr=33 pr=0 pw=0 time=600 us cost=97 size=39 card=1)
39 INDEX RANGE SCAN UX_PD_GNM_ROLE_PK (cr=3 pr=0 pw=0 time=0 us cost=2 size=0 card=189)(object id 8743581)
31 BUFFER SORT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
1 TABLE ACCESS BY INDEX ROWID DM_LOAD_DT (cr=3 pr=0 pw=0 time=0 us cost=2 size=19 card=1)
1 INDEX RANGE SCAN IX_RPT_DATE_DM_LOAD_DT (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 8743601)
0 PX PARTITION HASH ALL PARTITION: 1 4 (cr=0 pr=0 pw=0 time=0 us cost=516 size=238 card=7)
0 TABLE ACCESS BY LOCAL INDEX ROWID PRODUCT_MATRIX PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us cost=516 size=238 card=7)
0 BITMAP CONVERSION TO ROWIDS (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP AND (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP INDEX SINGLE VALUE BMAK_MATRIX_SPPRODKEY PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8748879)
0 BITMAP MERGE (cr=0 pr=0 pw=0 time=0 us)
0 BITMAP INDEX RANGE SCAN BMAK_MATRIX_EFFDT PARTITION: KEY 88 (cr=0 pr=0 pw=0 time=0 us)(object id 8748990)
0 INDEX UNIQUE SCAN PK_DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 8743544)
0 TABLE ACCESS BY INDEX ROWID DATE_TIME_DIM (cr=0 pr=0 pw=0 time=0 us cost=0 size=22 card=1)
unable to set optimizer goal
ORA-01986: OPTIMIZER_GOAL is obsolete
parse error offset: 33
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: ALL_ROWS
3127 PX COORDINATOR
0 PX SEND (QC (RANDOM)) OF ':TQ10002' [:Q1002]
0 HASH (UNIQUE) [:Q1002]
0 PX RECEIVE [:Q1002]
0 PX SEND (HASH) OF ':TQ10001' [:Q1001]
0 FILTER [:Q1001]
0 NESTED LOOPS [:Q1001]
0 NESTED LOOPS [:Q1001]
0 NESTED LOOPS [:Q1001]
0 BUFFER (SORT) [:Q1001]
0 PX RECEIVE [:Q1001]
0 PX SEND (BROADCAST) OF ':TQ10000'
31 MERGE JOIN (CARTESIAN)
31 TABLE ACCESS GOAL: ANALYZED (BY
GLOBAL INDEX ROWID) OF 'PRODUCT_DIM' (TABLE)
PARTITION:ROW LOCATION
39 INDEX GOAL: ANALYZED (RANGE SCAN)
OF 'UX_PD_GNM_ROLE_PK' (INDEX (UNIQUE))
31 BUFFER (SORT)
1 TABLE ACCESS GOAL: ANALYZED (BY
INDEX ROWID) OF 'DM_LOAD_DT' (TABLE)
1 INDEX GOAL: ANALYZED (RANGE SCAN)
OF 'IX_RPT_DATE_DM_LOAD_DT' (INDEX)
0 PX PARTITION HASH (ALL) [:Q1001] PARTITION:
START=1 STOP=4
0 TABLE ACCESS GOAL: ANALYZED (BY LOCAL
INDEX ROWID) OF 'PRODUCT_MATRIX' (TABLE) [:Q1001]
PARTITION:KEY STOP=88
0 BITMAP CONVERSION (TO ROWIDS) [:Q1001]
0 BITMAP AND [:Q1001]
0 BITMAP INDEX (SINGLE VALUE) OF
'BMAK_MATRIX_SPPRODKEY' (INDEX (BITMAP)) [:Q1001]
PARTITION:KEY STOP=88
0 BITMAP MERGE [:Q1001]
0 BITMAP INDEX (RANGE SCAN) OF
'BMAK_MATRIX_EFFDT' (INDEX (BITMAP)) [:Q1001]
PARTITION:KEY STOP=88
0 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF
'PK_DATE_TIME_DIM' (INDEX (UNIQUE)) [:Q1001]
0 TABLE ACCESS (BY INDEX ROWID) OF 'DATE_TIME_DIM'
(TABLE) [:Q1001]
=========================================================
We are using "dbms_stats.export_schema_stats" to export the stats from A copy. And then "dbms_stats.import_table_stats" to import the stats into B copy.
Thanks Again
Regards
Shravani
February 16, 2011 - 2:11 pm UTC
same plans, same row source operations, same tim= values (similar enough to be the same)...
are you sure in real life the queries take this long - if you run them in sqlplus - do they run really fast in one schema and really slow (repeatedly) in the other?
This looks more like a strange tkprof issue - the TIM= information isn't matching the elapsed column at all.
Importing Stats
Shravani, February 16, 2011 - 3:35 pm UTC
Yes Tom, please believe me, the query execution is faster in the A copy where we gather the statistics. Can you please let us know where to look to find the root cause of this discrepancy?
February 16, 2011 - 5:27 pm UTC
can you show me a sqlplus cut/paste with timing on (run with set autotrace traceonly statistics too - we don't need to see the data, just the query, the row counts and the timing from sqlplus)
are the data sets on the same devices?
is there other stuff going on on the machine?
do you have ash/awr access?
Import Statistics
Shravani, February 17, 2011 - 10:15 am UTC
Thanks so much Tom. Here is the required input
Query :
SELECT 'hints', patient_case_dim.case_nbr, therapy_profile_dim.dose_range,
meddra_dim.llt_cd_desc, meddra_dim.pt_cd_desc,
meddra_dim.primary_soc_cd_desc, meddra_dim.hlt_cd_desc,
meddra_dim.hlgt_cd_desc, therapy_profile_dim.multiple_route_admin,
therapy_profile_dim.multiple_formulation,
therapy_profile_dim.multiple_indication,
product_event_fact.interacting_drug_indic_desc,
patient_demographics_dim.gender_cd_desc,
patient_demographics_dim.age_group,
case_profile_dim.case_country_cd_desc,
case_profile_dim.co_suspect_prod_presence, patient_case_dim.age,
case_profile_dim.source_derived, case_profile_dim.conmed_prod_presence,
dm_load_dt.meddra_version_nbr,
product_event_profile_dim.latency_time_2, 'Default',
case_profile_dim.patient_history_presence, product_dim.generic_nm,
patient_case_dim.age_unit_cd_desc, case_profile_dim.case_completeness,
product_therapy_fact.first_daily_dose_qty,
CONCAT (RTRIM (patient_case_dim.case_nbr),
RTRIM (patient_case_dim.source_system_nm)
),
patient_case_dim.patient_case_key, product_dim.drl_cd,
product_event_fact.latency_days_calculated,
therapy_start_date.calendar_dt, therapy_end_date.calendar_dt,
event_profile_dim.event_seriousness_rep_cd_desc,
patient_demographics_dim.elderly_case_indic,
onset_date_dim.calendar_dt,
product_event_profile_dim.ib_labeling_indic,
product_event_profile_dim.cds_labeling_indic,
product_event_profile_dim.spc_labeling_indic,
product_event_profile_dim.uspi_only_labeling_indic,
meddra_dim.dme_indic, case_profile_dim.case_outcome_hierarchy,
case_profile_dim.case_seriousness_co_cd_desc,
CASE
WHEN (meddra_dim.pt_cd_desc) IS NULL
THEN (product_event_fact.verbatim_event_term) || '*'
ELSE (meddra_dim.pt_cd_desc)
END,
CASE
WHEN (meddra_dim.primary_soc_cd_desc) IS NULL
THEN 'SOC Not Yet Coded'
ELSE (meddra_dim.primary_soc_cd_desc)
END,
product_event_profile_dim.dechallenge_cd_desc,
event_profile_dim.treatment_rcvd_desc,
event_profile_dim.event_outcome_rep_cd_desc,
product_event_profile_dim.change_in_dosage_cd_desc,
patient_case_dim.age_in_years,
patient_demographics_dim.age_group_cd_desc,
case_xref_fact.case_reftype_desc, case_xref_fact.case_reference_id,
patient_case_dim.first_avail_eff_dt, therapy_start_date.derived_dt,
therapy_end_date.derived_dt,
product_event_profile_dim.latency_unit_cd_desc,
product_event_fact.latency, onset_date_dim.derived_dt
FROM product_event_fact,
product_therapy_fact,
therapy_profile_dim,
patient_case_dim,
product_dim,
case_profile_dim,
meddra_dim,
product_event_profile_dim,
date_time_dim therapy_start_date,
date_time_dim therapy_end_date,
date_time_dim onset_date_dim,
event_profile_dim,
patient_demographics_dim,
dm_load_dt,
case_xref_fact
WHERE (product_therapy_fact.therapy_profile_key =
therapy_profile_dim.therapy_profile_key
)
AND (meddra_dim.meddra_key = product_event_fact.reaction_key)
AND (event_profile_dim.event_profile_key =
product_event_fact.event_profile_key
)
AND (product_event_profile_dim.product_event_profile_key =
product_event_fact.product_event_profile_key
)
AND (patient_case_dim.patient_case_key =
product_event_fact.patient_case_key
)
AND (patient_case_dim.case_profile_key = case_profile_dim.case_profile_key
)
AND (onset_date_dim.year_month_day(+) = product_event_fact.onset_dt_key)
AND (product_therapy_fact.therapy_start_dt_key = therapy_start_date.year_month_day(+))
AND (product_therapy_fact.therapy_end_dt_key = therapy_end_date.year_month_day(+))
AND (product_event_fact.patient_demographics_key =
patient_demographics_dim.patient_demographics_key
)
AND ( product_event_fact.patient_case_key = product_therapy_fact.patient_case_key(+)
AND product_event_fact.product_key = product_therapy_fact.product_key(+)
AND product_event_fact.seq_product = product_therapy_fact.seq_product(+)
)
AND ( product_event_fact.product_key = product_dim.product_key
AND product_dim.role_indic = 'SUSPECT'
)
AND (( dm_load_dt.increment_dt >= patient_case_dim.effective_dt
AND dm_load_dt.increment_dt < patient_case_dim.end_dt
)
)
AND (case_xref_fact.patient_case_key(+) = patient_case_dim.patient_case_key)
AND ( (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') = dm_load_dt.reporting_dt
)
AND dm_load_dt.active = 'Y'
)
AND ( (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') = dm_load_dt.reporting_dt
)
AND dm_load_dt.active = 'Y'
)
AND ( ( patient_case_dim.case_logical_del_indic = 0
AND ( case_profile_dim.historical_case_indic IS NULL
OR case_profile_dim.historical_case_indic = 2
)
)
AND (product_dim.generic_nm IN ('CELECOXIB') OR 'ALL' IN
('CELECOXIB')
)
AND (patient_case_dim.case_nbr NOT IN (
SELECT clm_list_item.aer_no
FROM bo_sec.clm_list_item, bo_sec.clm_list
WHERE (clm_list.list_id = clm_list_item.list_id)
AND (clm_list.list_name IN ('NONE'))
AND (clm_list.list_type = 'BYPASS'))
)
AND patient_case_dim.patient_case_key IN (
SELECT DISTINCT product_matrix.patient_case_key
FROM date_time_dim init_rcvd_date_condition,
dm_load_dt dm_load_dt_condition,
product_matrix,
product_dim product_dim_suspect
WHERE ( product_matrix.suspect_product_key =
product_dim_suspect.product_key
AND product_dim_suspect.role_indic = 'SUSPECT'
)
AND (product_matrix.init_recv_dt_key =
init_rcvd_date_condition.year_month_day
)
AND (( dm_load_dt_condition.increment_dt >=
product_matrix.effective_dt
AND dm_load_dt_condition.increment_dt <
product_matrix.end_dt
)
)
AND ( (TO_DATE ('23-SEP-2010', 'DD-MON-YYYY') =
dm_load_dt_condition.reporting_dt
)
AND dm_load_dt_condition.active = 'Y'
)
AND ( ( product_dim_suspect.generic_nm IN
('CELECOXIB')
OR 'ALL' IN ('CELECOXIB')
)
AND (dm_load_dt_condition.reporting_dt >=
SYS_CONTEXT ('pfast', 'mbss_release')
)
AND ((init_rcvd_date_condition.calendar_dt) >=
TO_DATE ('01-JAN-2010', 'DD-MON-YYYY')
)
AND ((init_rcvd_date_condition.calendar_dt) <=
TO_DATE ('25-SEP-2010', 'DD-MON-YYYY')
)
))
)
=========================
Statistics from A Schema
=====================================================
21:06:53 SQL> @query1
12659 rows selected.
Elapsed: 00:10:58.07
Statistics
----------------------------------------------------------
5690 recursive calls
0 db block gets
13040813 consistent gets
454569 physical reads
1876 redo size
1883128 bytes sent via SQL*Net to client
16153 bytes received via SQL*Net from client
845 SQL*Net roundtrips to/from client
1755 sorts (memory)
0 sorts (disk)
12659 rows processed
21:17:55 SQL> spool off;
=====================================================
Statistics from B schema
21:07:46 SQL> @query1
12659 rows selected.
Elapsed: 00:28:13.07
Statistics
----------------------------------------------------------
785 recursive calls
0 db block gets
16474030 consistent gets
935724 physical reads
479484 redo size
1999291 bytes sent via SQL*Net to client
16153 bytes received via SQL*Net from client
845 SQL*Net roundtrips to/from client
19 sorts (memory)
0 sorts (disk)
12659 rows processed
21:36:06 SQL> spool off;
===================================================
The data in both schemas is exactly the same. The only difference is that in schema A the data gets inserted via an ETL process, while in schema B we truncate the current partition and copy the data from schema A. Once the copy process is done we export the statistics gathered in A using the DBMS_STATS package procedures. The execution plan for the query in the B copy is very different from the one that appears for A.
And Sir, when we executed the queries in the A and B copies there were no background jobs or other processes running.
Can you please let me know what should be done to remove the discrepancy?
Regards
Shravani
February 17, 2011 - 12:07 pm UTC
... The data in both the schema is exactly same ...
Nope, sorry, but it is not. Look at the consistent gets. The data VALUES might be the same - but the ordering of the rows on disk is undoubtedly different (compare the statistics in both schemas - look for index clustering factors that are very different - that'll prove that out)
Here is an example - SAME DATA - exactly the SAME DATA - but watch the IO differences:
ops$tkyte%ORA11GR2> create table organized
2 as
3 select x.*
4 from (select * from stage order by object_name) x
5 /
Table created.
ops$tkyte%ORA11GR2> create table disorganized
2 as
3 select x.*
4 from (select * from stage order by dbms_random.random) x
5 /
Table created.
ops$tkyte%ORA11GR2> create index organized_idx on organized(object_name);
Index created.
ops$tkyte%ORA11GR2> create index disorganized_idx on disorganized(object_name);
Index created.
ops$tkyte%ORA11GR2> begin
2 dbms_stats.gather_table_stats
3 ( user, 'ORGANIZED',
4 estimate_percent => 100,
5 method_opt=>'for all indexed columns size 254'
6 );
7 dbms_stats.gather_table_stats
8 ( user, 'DISORGANIZED',
9 estimate_percent => 100,
10 method_opt=>'for all indexed columns size 254'
11 );
12 end;
13 /
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select table_name, blocks, num_rows, 0.05*num_rows, 0.10*num_rows from user_tables
2 where table_name like '%ORGANIZED' order by 1;
TABLE_NAME BLOCKS NUM_ROWS 0.05*NUM_ROWS 0.10*NUM_ROWS
------------------------------ ---------- ---------- ------------- -------------
DISORGANIZED 1064 72759 3637.95 7275.9
ORGANIZED 1064 72759 3637.95 7275.9
ops$tkyte%ORA11GR2> select table_name, index_name, clustering_factor from user_indexes
2 where table_name like '%ORGANIZED' order by 1;
TABLE_NAME INDEX_NAME CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
DISORGANIZED DISORGANIZED_IDX 72581
ORGANIZED ORGANIZED_IDX 1038
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set autotrace on
ops$tkyte%ORA11GR2> select /*+ index( organized organized_idx) */
2 count(subobject_name)
3 from organized;
COUNT(SUBOBJECT_NAME)
---------------------
1106
Execution Plan
----------------------------------------------------------
Plan hash value: 2296712572
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 1402 (1)| 00:00:17 |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| ORGANIZED | 72759 | 1207K| 1402 (1)| 00:00:17 |
| 3 | INDEX FULL SCAN | ORGANIZED_IDX | 72759 | | 363 (1)| 00:00:05 |
----------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1400 consistent gets
0 physical reads
0 redo size
436 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA11GR2> select /*+ index( disorganized disorganized_idx) */
2 count(subobject_name)
3 from disorganized;
COUNT(SUBOBJECT_NAME)
---------------------
1106
Execution Plan
----------------------------------------------------------
Plan hash value: 3014554493
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 72966 (1)| 00:14:36 |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
| 2 | TABLE ACCESS BY INDEX ROWID| DISORGANIZED | 72759 | 1207K| 72966 (1)| 00:14:36 |
| 3 | INDEX FULL SCAN | DISORGANIZED_IDX | 72759 | | 363 (1)| 00:00:05 |
-------------------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
72943 consistent gets
0 physical reads
0 redo size
436 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
ops$tkyte%ORA11GR2> set autotrace off
One of your schemas did more than twice the physical I/Os and 3 million more logical I/Os - the data is different.
Try gathering statistics on B directly and compare the statistics of the tables, indexes and columns - you should see they are "different"
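A hedged sketch of that comparison - assuming the schemas are literally named A and B (substitute your real schema names), the clustering factors could be lined up side by side like this:

```sql
-- Hedged sketch: compare index clustering factors between the two schemas.
-- 'A' and 'B' below are placeholders for your actual schema names.
select a.index_name,
       a.clustering_factor as a_clustering_factor,
       b.clustering_factor as b_clustering_factor
  from dba_indexes a
  join dba_indexes b on b.index_name = a.index_name
 where a.owner = 'A'
   and b.owner = 'B'
 order by abs(a.clustering_factor - b.clustering_factor) desc;
```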
Shravani, February 17, 2011 - 1:57 pm UTC
Dear Tom,
Thanks so much for this answer. Yes, after gathering the stats I can see the index clustering factor is very different.
However, I still do not understand why the data in the B copy is disorganized. Do we need to change the copy process?
The current logic is: if the table is partitioned, we truncate the current partition in the B copy and then select-insert the current partition data from A into B. For the non-partitioned tables we just truncate and insert.
One more thing: each day we drop all the indexes in B and recreate them. To summarize, the A-to-B copy process works in the following way:
1. Drop all indexes in B copy
2. Truncate the current partition or a complete table in B
3. Insert the data from A copy
4. Create all the indexes
5. Clone the statistics from A using DBMS_STATS.Export_Schema_Stats
6. Copy the statistics into B using DBMS_STATS.Import_Schema_Stats
Do you think one of the above steps has led to disorganized data in the B copy? And can you please give us a hint on the probable resolution?
Thanks
Shravani
February 17, 2011 - 2:11 pm UTC
show me the code for step 3
tell me about the tablespaces - they are automatic segment space managed or manual segment space managed?
Importing Statistics
Shravani, February 17, 2011 - 3:30 pm UTC
Hi Tom,
The tablespace are automatic segment space managed.
Here is the code
IF clonetype = 1 THEN
EXECUTE IMMEDIATE ('alter table ' || targetschema || '.' || in_table || ' truncate subpartition ' || in_subpart );
ELSIF clonetype = 2 THEN
EXECUTE IMMEDIATE ('alter table ' || targetschema || '.' || in_table || ' truncate partition ' || in_subpart );
ELSIF clonetype = 3 THEN
EXECUTE IMMEDIATE ('truncate table ' || targetschema || '.' || in_table);
END IF;
/* Now, insert the data in Direct Path mode... */
diag_section := 'APPEND DATA';
mysql := 'insert /*+ APPEND NOLOGGING */ into ' || targetschema || '.' || in_table;
IF clonetype = 1 THEN
mysql := mysql || ' subpartition (' || in_subpart || ') (select * from '|| sourceschema || '.';
mysql := mysql || in_table || ' subpartition (' || in_subpart || '))';
ELSIF clonetype = 2 THEN
mysql := mysql || ' partition (' || in_subpart || ') (select * from ' || sourceschema || '.';
mysql := mysql || in_table || ' partition (' || in_subpart || '))';
ELSIF clonetype = 3 THEN
mysql := mysql || ' (select * from ' || sourceschema || '.' || in_table || ')';
END IF;
Regards
Shravani
February 17, 2011 - 6:02 pm UTC
you know that NOLOGGING is meaningless. Nologging is not a hint, it is not usable in SQL selects/inserts - it is only useful in an ALTER TABLE in that context.
that should be inserting the data in MORE OR LESS the same sort of order - automatic segment space management spreads data out by design though, so it won't be exact. Also - is parallel involved anywhere?
But if the clustering factors are very different - then, we know that it is laid out differently on disk.
You can use an order by on the insert as select to cause the rows to be organized in some fashion if you so desire.
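Applied to the copy procedure shown earlier, that change might look like the following sketch (table, partition, and column names are placeholders, not from the original code):

```sql
-- Hypothetical sketch only: add an ORDER BY to the direct-path insert so
-- rows in B land in roughly the same physical order as A's critical index.
insert /*+ APPEND */ into b.my_table partition (p_current)
select *
  from a.my_table partition (p_current)
 order by my_indexed_column;  -- order by the column(s) of the critical index
```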
Thanks so much Tom
shravani, February 21, 2011 - 7:31 am UTC
Thanks for the clarification Tom. We are now changing the INSERT method by adding the ORDER BY clause.
granularity with where condition
A reader, March 25, 2011 - 8:50 am UTC
Is it possible to gather table stats based on where condition
granularity=>"some_column=some_value"
March 25, 2011 - 11:21 am UTC
no, why??? what is your goal.
gather_stats
A reader, March 26, 2011 - 4:25 am UTC
I do not want to subpartition further. I would like to gather statistics on subset of data.
March 29, 2011 - 2:40 am UTC
We can only gather statistics on a segment. If you do not want to partition further, you have to gather statistics on what you have.
Analyze
A reader, July 15, 2011 - 3:49 pm UTC
Hi Tom - I thought multiple DML statements should be followed by an analyze of the table and its indexes. I did as below but don't see any effect. So does the index stay intact even after multiple DML on the table - is the analyze still needed?
-- create a table with a timestamp column
create table Test_Index as select a.*,a.OBJECT_ID indx_col from all_objects a where 1=2;
--create an index on the table
create index indx_indx_col on test_index(indx_col );
--analyze the table , nothing to analyze
analyze table Test_Index compute statistics
for table
for all indexes
for all indexed columns;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
-- delete from the table
delete from test_index;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- insert into the table
insert into test_index select a.*,a.OBJECT_ID indx_col from all_objects a;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
--analyze the table again, now that it has data
analyze table Test_Index compute statistics
for table
for all indexes
for all indexed columns;
-- see the sql plan
set autot traceonly
select * from test_index where indx_col=&1;
Please suggest.
July 18, 2011 - 10:14 am UTC
you should never use analyze to gather statistics, please ONLY use dbms_stats.
a gathering of statistics MAY affect the plan, it may not. It gives an opportunity for a plan change, it does not mean "the plan will change, the plan has to change". It simply means "plans might change, then again, plans might stay the same"
In general, you DO NOT want to gather statistics after multiple DML, it is not necessary, you do it when the data changes greatly.
when I ran the above (changing analyze to dbms_stats), it just always decides to use the index, nothing wrong or suspicious there.
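For reference, the analyze block in the test above maps to a dbms_stats call along these lines (one reasonable set of parameters, not the only option):

```sql
-- Sketch: dbms_stats replacement for the
-- "analyze table ... for table for all indexes for all indexed columns" step.
begin
   dbms_stats.gather_table_stats
   ( ownname    => user,
     tabname    => 'TEST_INDEX',
     cascade    => true,                               -- index stats too
     method_opt => 'for all indexed columns size auto' -- indexed-column stats
   );
end;
/
```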
Any Example
A reader, July 19, 2011 - 6:19 pm UTC
Hi Tom - Is there any specific example/link on your site where I can look up what you meant by "
In general, you DO NOT want to gather statistics after multiple DML, it is not necessary, you do it when the data changes greatly."
I am interested in gathering more information on "data changes greatly".
Please help.
Thanks as always
July 22, 2011 - 12:33 pm UTC
Another use for dbms_stats?
Raj, August 19, 2011 - 3:20 am UTC
Hi Tom,
I was trying to find another use for dbms_stats exports by trying to trend a table's growth (row count) over time. If I had a daily export (taken before a new gather), I'm having trouble figuring out which columns in my export table are reliable. I realize that this isn't totally accurate; it's more a problem of not understanding the mapping. E.g.
select count(*) from test.a;
COUNT(*)
----------
107929
exec dbms_stats.gather_table_stats(ownname=>'test',tabname=>'a',estimate_percent=>0.1,granularity=>'ALL',degree=>2,cascade=>true);
select count(*) from test.a;
COUNT(*)
----------
107929
exec dbms_stats.create_stat_table('test','a_statexp','test');
exec dbms_stats.export_table_stats(ownname=>'test', tabname=>'a', stattab=>'a_statexp', statown=>'test');
I find 3 columns with the value 107929:
select count(*) from test.a_statexp where N1=107929;
COUNT(*)
----------
5
select count(*) from test.a_statexp where N3=107929;
COUNT(*)
----------
1
select count(*) from test.a_statexp where N8=107929;
COUNT(*)
----------
5
When I look at the value of C1 in the a_statexp table, I see that it's the primary key rather than the table itself. It sort of makes sense, but doesn't seem very trustworthy.
Is there a better way?
Re: Another use for dbms_stats?
Raj, August 19, 2011 - 10:37 am UTC
I forgot to include the concluding statement I was wondering about...
select n1 from test.a_statexp where c1 = pk_index_of_tab_a;
That's the statement that seems a little flaky to me (rather than finding the column that represents the num_rows on the table itself... if it exists!).
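For what it's worth, the user stat table layout is undocumented and can change between releases, but in releases I have seen, rows with TYPE = 'T' carry the table-level statistics and N1 appears to hold num_rows. A hedged sketch of the trending query (verify the mapping on your own release before relying on it):

```sql
-- Hedged sketch (undocumented stattab layout, verify on your release):
-- TYPE = 'T' marks the table-level row; N1 appears to hold num_rows.
select c1 as table_name, n1 as num_rows
  from test.a_statexp
 where type = 'T'
   and c1 = 'A';
```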
analyze and dbms_stats
Sushil, August 23, 2011 - 1:36 pm UTC
Tom, I tried to analyze a couple of tables in an Oracle 11g database; I could analyze some but not others. Also, if I run the following gather table stats, it succeeds for some tables and I get an error for others.
BEGIN DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'RPAPPL',TABNAME=>'ECHO_ZHOURLY',estimate_percent=>10,
method_opt=>'FOR ALL COLUMNS SIZE 1',Degree=>4, GRANULARITY=>'AUTO', CASCADE=>TRUE); END;
/
error:
BEGIN DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'RPAPPL',TABNAME=>'ECHO_ZHOURLY',estimate_percent=>10,
method_opt=>'FOR ALL COLUMNS SIZE 1',Degree=>4, GRANULARITY=>'AUTO', CASCADE=>TRUE); END;
Error at line 7
ORA-20001: Invalid or inconsistent input values
ORA-06512: at "SYS.DBMS_STATS", line 18408
ORA-06512: at "SYS.DBMS_STATS", line 18429
ORA-06512: at line 1
Script Terminated on line 7.
can you please help me to identify the issue?
August 30, 2011 - 3:25 pm UTC
In 11g I would highly recommend using the default for estimate_percent - do not sample anymore; let it figure out what it wants to do. I highly recommend that.
You would need to give me a complete example if you want me to comment. I need to be able to reproduce the issue.
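For what it's worth, the recommended 11g form of the failing call would simply drop estimate_percent so it defaults to DBMS_STATS.AUTO_SAMPLE_SIZE - a sketch:

```sql
-- Sketch: let 11g choose the sample size itself by omitting estimate_percent.
begin
   dbms_stats.gather_table_stats
   ( ownname => 'RPAPPL',
     tabname => 'ECHO_ZHOURLY',
     cascade => true );
end;
/
```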
will be of more use
venkata, October 10, 2011 - 5:14 am UTC
Tom,
I have observed that you use @connect user/password in all your replies. Can I know what you have written in that connect.sql? I hope this is not classified ;)
October 10, 2011 - 10:48 am UTC
it isn't - and I've published it before - but since 10g it isn't necessary anymore.
It was simply:
set termout off
connect &1
@login
set termout on
because earlier releases of sqlplus didn't re-run the login.sql file (I set my prompt in that). But sqlplus does that now so I don't use it at all anymore.
rbo query
sandeep, December 11, 2011 - 11:34 am UTC
Hi Tom,
We have a 9i database which is in RBO mode. One query (shown below) which used to take 1 hr to complete took 32 hours to complete after running dbms_stats on those tables.
explain plan for
SELECT /*+ USE_NL(CFA) */
SUM( CFA.TRNSCTN_AMNT) NOB67AMNT
FROM
(SELECT C.CLM_ID,
TO_CHAR(RPTD_DATE, 'YYYY') RPT_YEAR,
C.ACCNT_PRGRM_PRD_ID
FROM CLM_R C,
PLCY_SRVC P,
LOSS_R L
WHERE C.CLM_ID > 0 AND
C.PLCY_SRVC_ID = P.PLCY_SRVC_ID AND
P.LOB_CODE LIKE '%DD' AND
C.LOSS_ID = L.LOSS_ID AND
TO_CHAR(L.RPTD_DATE, 'YYYY') > TO_CHAR(TO_NUMBER(TO_CHAR(TRUNC(SYSDATE),'YYYY'))-10) AND
C.LMTBL_PRDCT_ID =
(SELECT LMTBL_PRDCT_ID
FROM LMTBL_PRDCT
WHERE LMTBL_PRDCT_SHRT_NAME = 'WC')) C,
SVNGS_FCTR_HSTRY D,
NTR_OF_BNFT_R NOB,
CLM_FNNCL_ACTVTY_R CFA,
ENTRY_CODE_R E
WHERE CFA.CLM_FNNCL_ACTVTY_ID > 0 AND
E.ENTRY_CODE_NAME NOT LIKE 'NH%' AND
CFA.ENTRY_CODE_ID = E.ENTRY_CODE_ID AND
C.CLM_ID = CFA.CLM_ID AND
CFA.NTR_OF_BNFT_ID = NOB.NTR_OF_BNFT_ID AND
D.ACCNT_PRGRM_PRD_ID = C.ACCNT_PRGRM_PRD_ID AND
NOB.NTR_OF_BNFT_CODE = '67' AND
TO_CHAR(CFA.ROW_INSRTD_DATE, 'YYYY') >= C.RPT_YEAR AND
TO_CHAR(D.VLTN_MNTH_YEAR,'MM/YYYY')= TO_CHAR(ADD_MONTHS(TRUNC(SYSDATE), -1),'MM/YYYY') AND
TO_DATE(CFA.ROW_INSRTD_DATE, 'DD-MON-YYYY') <= TO_DATE(LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1)), 'DD-MON-YYYY');
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 140 | 2656 |
| 1 | SORT AGGREGATE | | 1 | 140 | |
| 2 | NESTED LOOPS | | 1 | 140 | 2656 |
| 3 | NESTED LOOPS | | 11 | 1375 | 2655 |
| 4 | NESTED LOOPS | | 11 | 1254 | 2654 |
| 5 | NESTED LOOPS | | 920 | 68080 | 2470 |
| 6 | NESTED LOOPS | | 5784 | 338K| 1892 |
| 7 | NESTED LOOPS | | 422 | 10550 | 14 |
| 8 | TABLE ACCESS BY INDEX ROWID| NTR_OF_BNFT_R | 1 | 9 | 1 |
|* 9 | INDEX UNIQUE SCAN | NTR_OF_BNFT_CK01 | 1 | | |
|* 10 | INDEX FULL SCAN | SVNGS_FCTR_HSTRY_CK01 | 422 | 6752 | 246 |
|* 11 | TABLE ACCESS BY INDEX ROWID | CLM_R | 14 | 490 | 5 |
|* 12 | INDEX RANGE SCAN | CLM_R_PF01 | 192 | | 2 |
| 13 | TABLE ACCESS BY INDEX ROWID| LMTBL_PRDCT | 1 | 13 | 1 |
|* 14 | INDEX UNIQUE SCAN | LMTBL_PRDCT_CK02 | 1 | | |
|* 15 | TABLE ACCESS BY INDEX ROWID | PLCY_SRVC | 1 | 14 | 1 |
|* 16 | INDEX UNIQUE SCAN | PLCY_SRVC_PK | 1 | | 1 |
|* 17 | TABLE ACCESS BY INDEX ROWID | CLM_FNNCL_ACTVTY_R | 1 | 40 | 1 |
|* 18 | INDEX RANGE SCAN | CLM_FNNCL_ACTVTY_PF02 | 1 | | 3 |
|* 19 | TABLE ACCESS BY INDEX ROWID | ENTRY_CODE_R | 1 | 11 | 1 |
|* 20 | INDEX UNIQUE SCAN | SYS_C0010550 | 1 | | |
|* 21 | TABLE ACCESS BY INDEX ROWID | LOSS_R | 1 | 15 | 1 |
|* 22 | INDEX UNIQUE SCAN | LOSS_R_PK | 1 | | 1 |
---------------------------------------------------------------------------------------------
When I apply the RULE hint, it completes within 1 hour.
explain plan for
SELECT /*+ RULE */
SUM( CFA.TRNSCTN_AMNT) NOB67AMNT
FROM
(SELECT C.CLM_ID,
TO_CHAR(RPTD_DATE, 'YYYY') RPT_YEAR,
C.ACCNT_PRGRM_PRD_ID
FROM CLM_R C,
PLCY_SRVC P,
LOSS_R L
WHERE C.CLM_ID > 0 AND
C.PLCY_SRVC_ID = P.PLCY_SRVC_ID AND
P.LOB_CODE LIKE '%DD' AND
C.LOSS_ID = L.LOSS_ID AND
TO_CHAR(L.RPTD_DATE, 'YYYY') > TO_CHAR(TO_NUMBER(TO_CHAR(TRUNC(SYSDATE),'YYYY'))-10) AND
C.LMTBL_PRDCT_ID =
(SELECT LMTBL_PRDCT_ID
FROM LMTBL_PRDCT
WHERE LMTBL_PRDCT_SHRT_NAME = 'WC')) C,
SVNGS_FCTR_HSTRY D,
NTR_OF_BNFT_R NOB,
CLM_FNNCL_ACTVTY_R CFA,
ENTRY_CODE_R E
WHERE CFA.CLM_FNNCL_ACTVTY_ID > 0 AND
E.ENTRY_CODE_NAME NOT LIKE 'NH%' AND
CFA.ENTRY_CODE_ID = E.ENTRY_CODE_ID AND
C.CLM_ID = CFA.CLM_ID AND
CFA.NTR_OF_BNFT_ID = NOB.NTR_OF_BNFT_ID AND
D.ACCNT_PRGRM_PRD_ID = C.ACCNT_PRGRM_PRD_ID AND
NOB.NTR_OF_BNFT_CODE = '67' AND
TO_CHAR(CFA.ROW_INSRTD_DATE, 'YYYY') >= C.RPT_YEAR AND
TO_CHAR(D.VLTN_MNTH_YEAR,'MM/YYYY')= TO_CHAR(ADD_MONTHS(TRUNC(SYSDATE), -1),'MM/YYYY') AND
TO_DATE(CFA.ROW_INSRTD_DATE, 'DD-MON-YYYY') <= TO_DATE(LAST_DAY(ADD_MONTHS(TRUNC(SYSDATE), -1)), 'DD-MON-YYYY');
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | SORT AGGREGATE | | | | |
|* 2 | FILTER | | | | |
| 3 | NESTED LOOPS | | | | |
| 4 | NESTED LOOPS | | | | |
| 5 | NESTED LOOPS | | | | |
| 6 | NESTED LOOPS | | | | |
| 7 | NESTED LOOPS | | | | |
| 8 | NESTED LOOPS | | | | |
| 9 | TABLE ACCESS BY INDEX ROWID| NTR_OF_BNFT_R | | | |
|* 10 | INDEX UNIQUE SCAN | NTR_OF_BNFT_CK01 | | | |
|* 11 | TABLE ACCESS BY INDEX ROWID| CLM_FNNCL_ACTVTY_R | | | |
|* 12 | INDEX RANGE SCAN | CLM_FNNCL_ACTVTY_FK01 | | | |
|* 13 | TABLE ACCESS BY INDEX ROWID | ENTRY_CODE_R | | | |
|* 14 | INDEX UNIQUE SCAN | SYS_C0010550 | | | |
| 15 | TABLE ACCESS BY INDEX ROWID | CLM_R | | | |
|* 16 | INDEX UNIQUE SCAN | CLM_PK | | | |
|* 17 | TABLE ACCESS BY INDEX ROWID | LOSS_R | | | |
|* 18 | INDEX UNIQUE SCAN | LOSS_R_PK | | | |
|* 19 | TABLE ACCESS BY INDEX ROWID | PLCY_SRVC | | | |
|* 20 | INDEX UNIQUE SCAN | PLCY_SRVC_PK | | | |
|* 21 | INDEX RANGE SCAN | SVNGS_FCTR_HSTRY_CK01 | | | |
| 22 | TABLE ACCESS BY INDEX ROWID | LMTBL_PRDCT | | | |
|* 23 | INDEX UNIQUE SCAN | LMTBL_PRDCT_CK02 | | | |
----------------------------------------------------------------------------------------------
We have a job to run the dbms_stats weekly as shown below.
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','CLM_FNNCL_ACTVTY_R',estimate_percent =>25,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','CLM_R',estimate_percent =>100,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','LOSS_R',estimate_percent =>100,degree =>4,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','PLCY_SRVC',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','SVNGS_FCTR_HSTRY',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','LMTBL_PRDCT',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','ENTRY_CODE_R',estimate_percent =>100,degree =>1,cascade => TRUE);
exec sys.DBMS_STATS.GATHER_TABLE_STATS ('PSTPP','NTR_OF_BNFT_R',estimate_percent =>100,degree =>1,cascade => TRUE);
But last week I ran dbms_stats for clm_r in the following way, and after that the weekly analyze job also ran. I think this is the reason clm_r comes within the nested loop before CLM_FNNCL_ACTVTY_R, which is causing so much delay.
dbms_stats.gather_table_stats (
ownname => user,
tabname => 'CLM_R',
estimate_percent => 100,
method_opt => 'for all columns size auto',
degree => 4,
cascade => TRUE
);
The size of the tables involved in the above query is given below .
OWNER SEGMENT_NAME SEGMENT_TYPE mb
------------------------------ ----------------------- ------------------ ----------
PSTPP PLCY_SRVC TABLE 120
PSTPP NTR_OF_BNFT_R TABLE .125
PSTPP CLM_FNNCL_ACTVTY_R TABLE 23552
PSTPP ENTRY_CODE_R TABLE .125
PSTPP LOSS_R TABLE 4736
PSTPP LMTBL_PRDCT TABLE .125
PSTPP SVNGS_FCTR_HSTRY TABLE 1.875
PSTPP CLM_R TABLE 5504
Can you please suggest how I should run dbms_stats to give a consistent plan? The plan is changing and causing the query to finish in 30 hours instead of within 1 hour.
December 11, 2011 - 2:56 pm UTC
why do you have a hint in it?
are the actual cardinalities (number of rows) flowing through the steps of the plan anywhere close to the estimated cardinalities?
it seems you had a method for getting good stats before and you changed your method and it didn't work out so well - is that correct?
rbo query
sandeep, December 14, 2011 - 10:14 pm UTC
Hi Tom,
Thanks for your reply. This query with the use_nl hint has been in use for a long time (maybe some years) in our database, so the developers are reluctant to change the code. Previously we were collecting stats using the "analyze table .. estimate statistics" method. Now I have changed it to dbms_stats.
Now I have deleted the statistics of one table - clm_r - and analyzed that particular table the old way, with "analyze table .. estimate statistics". The rest are gathered with dbms_stats. When I run the query now, the result comes back within 45 minutes. I am a bit confused about which way to collect the stats. Can you please suggest which method I should follow?
Many thanks in advance.
sandeep
December 16, 2011 - 6:13 am UTC
the code is not working, why are they reluctant?
get rid of it if you want to discuss it with me.
dbms_stats and analyze result in different plan
A reader, February 26, 2012 - 9:23 pm UTC
Hi Tom,
Could you please elaborate on why 'dbms_stats' and 'analyze' result in different plans?
Below are the test scripts for your reference.
create table test as select rownum id, 'test' text from dual connect by level <= 10000;
create index test_i on test(id);
delete test where id <= 9990;
commit;
exec dbms_stats.gather_table_stats(ownname=>user, tabname=> 'TEST', cascade => true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
select * from test where id between 1 and 10000;
--IndexRangeScan
analyze index test_i compute statistics;
select * from test where id between 1 and 10000;
--FTS. Even though there are only 10 rows now, the HWM is still high, so a FTS is bad.
February 28, 2012 - 6:26 am UTC
because analyze has not been touched in many years as far as stats gathering goes, because analyze is deprecated for gathering stats, because dbms_stats is the only way to gather stats for the optimizer - the optimizer is coded expecting the numbers to have come from dbms_stats.
In short, they generate different statistics, different statistics often lead to different plans.
also, your two commands are entirely different - you gathered entirely different sets of stats..
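To gather comparable statistics on the index through the supported interface, the "analyze index test_i compute statistics" step could be replaced with something like this sketch:

```sql
-- Sketch: dbms_stats equivalent of "analyze index test_i compute statistics".
begin
   dbms_stats.gather_index_stats
   ( ownname => user,
     indname => 'TEST_I' );
end;
/
```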
Analyze any sys priv in 11gr2
A reader, March 19, 2012 - 10:51 am UTC
Tom,
I recently encountered a strange behavior. After upgrading from 10.2.0.3 to 11gr2, a package stopped working. This package, owned by OWNER1, uses dbms_stats to gather stats on tables owned by another user, OWNER2. It was working fine in 10g but in 11gr2 I started getting:
dbms_stats.gather_table_stats(ownname=>'OWNER2',tabname=>'T_1',cascade=>true);
ORA-20000: Unable to analyze TABLE "OWNER2"."T_1", insufficient privileges or does not exist
When I grant "ANALYZE ANY" privilege to package owner OWNER1, then I do not get any error.
The question is: is this a change from 10g to 11gr2? In 10g the same command worked without granting the "ANALYZE ANY" privilege.
Thanks...
Analyze any sys priv in 11gr2
A reader, March 20, 2012 - 7:39 am UTC
The upgrade was done via a cold backup of 10g, restored on 11g, started up in upgrade mode, with catupgrd run. So no 10g privileges were modified. The 10g database is still intact, and I compared all the user privileges between 10g and 11g. They are the same. I have checked all roles assigned to OWNER1 and they match between 10g and 11g. Somehow, OWNER1 can analyze OWNER2's objects without the "ANALYZE ANY" privilege in 10g.
I just wanted to know if you were aware of something which would cause this to happen in 10g, like any initialization parameter or something which I might have missed.
Based on your reply, I think my best option is to open an SR with Support.
Thanks...
March 20, 2012 - 9:52 am UTC
show us the set of privileges granted to the user in 10g - reproduce this.
Do this before opening an SR.
show me how, in 10g, you were able to do this - what grants did you give the user in 10g that made this possible?
Analyze any sys priv in 11gr2
A reader, March 20, 2012 - 10:21 am UTC
Seems like a 10g bug...
C:\>sqlplus system
SQL*Plus: Release 10.2.0.4.0 - Production on Tue Mar 20 10:58:35 2012
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Enter password:
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> drop user USER1 cascade;
drop user USER1 cascade
*
ERROR at line 1:
ORA-01918: user 'USER1' does not exist
SQL> drop user USER2 cascade;
drop user USER2 cascade
*
ERROR at line 1:
ORA-01918: user 'USER2' does not exist
SQL> create user USER1 identified by USER1;
User created.
SQL> grant create session to USER1;
Grant succeeded.
SQL> create user USER2 identified by USER2;
User created.
SQL> grant create session, create table to USER2;
Grant succeeded.
SQL> alter user USER2 quota unlimited on users_data;
User altered.
SQL> conn USER2/USER2
Connected.
SQL> create table t1 as select * from all_objects;
Table created.
SQL> conn USER1/USER1
Connected.
SQL> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
PL/SQL procedure successfully completed.
SQL>
-- 11g
C:\Users\oradba>sqlplus system
SQL*Plus: Release 11.2.0.2.0 Production on Tue Mar 20 11:03:20 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Enter password:
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters and Automatic Storage Management options
SQL> drop user USER1 cascade;
drop user USER1 cascade
*
ERROR at line 1:
ORA-01918: user 'USER1' does not exist
SQL> drop user USER2 cascade;
drop user USER2 cascade
*
ERROR at line 1:
ORA-01918: user 'USER2' does not exist
SQL> create user USER1 identified by USER1;
User created.
SQL> grant create session to USER1;
Grant succeeded.
SQL> create user USER2 identified by USER2;
User created.
SQL> grant create session, create table to USER2;
Grant succeeded.
SQL> alter user USER2 quota unlimited on users;
User altered.
SQL> conn USER2/USER2
Connected.
SQL> create table t1 as select * from all_objects;
Table created.
SQL> conn USER1/USER1
Connected.
SQL> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or
does not exist
ORA-06512: at "SYS.DBMS_STATS", line 23427
ORA-06512: at "SYS.DBMS_STATS", line 23489
ORA-06512: at line 1
SQL>
March 20, 2012 - 3:19 pm UTC
It doesn't work like that by default in 10g
ops$tkyte%ORA10GR2> drop user USER1 cascade;
User dropped.
ops$tkyte%ORA10GR2> drop user USER2 cascade;
User dropped.
ops$tkyte%ORA10GR2> create user USER1 identified by USER1;
User created.
ops$tkyte%ORA10GR2> grant create session to USER1;
Grant succeeded.
ops$tkyte%ORA10GR2> create user USER2 identified by USER2 default tablespace users quota unlimited on users;
User created.
ops$tkyte%ORA10GR2> grant create session, create table to USER2;
Grant succeeded.
ops$tkyte%ORA10GR2> conn USER2/USER2
Connected.
user2%ORA10GR2> create table t1 as select * from all_objects where rownum = 1;
Table created.
user2%ORA10GR2> conn USER1/USER1
Connected.
user1%ORA10GR2> exec dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats. gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or
does not exist
ORA-06512: at "SYS.DBMS_STATS", line 15017
ORA-06512: at "SYS.DBMS_STATS", line 15049
ORA-06512: at line 1
I believe (firmly) that your 10g database had analyze any granted to PUBLIC. Please check out what privs PUBLIC has in your 10g database.
Nothing changed...
A reader, March 20, 2012 - 8:11 pm UTC
SQL> select * from dba_sys_privs where grantee='PUBLIC';
no rows selected
SQL>
I have tested in 3 different 10.2.0.3 databases and get the same behavior: I can gather stats against another user's schema even without any privilege.
I am sure it must be a 10.2.0.3 bug that was fixed, perhaps in 10.2.0.4 or later. If it were a wrongly granted system privilege, it would have carried over into the 11g database, since the 11g database was created from a cold backup of the 10g one.
March 20, 2012 - 9:17 pm UTC
I don't have a 10.2.0.3 instance to test with this week...
Can anyone else out there with 10.2.0.3 try this out? I don't see any bugs that would have let this happen.
A reader, March 20, 2012 - 9:41 pm UTC
I have a bug number but no details. It might not be a published bug.
Bug 12714396: USER IS ABLE TO EXECUTE DBMS_STATS WITHOUT ANY PRIVILEGE
March 20, 2012 - 10:16 pm UTC
that was a bug closed as NOT A BUG
the bug was opened because someone could execute dbms_stats to gather stats on a schema without the "analyze any" privilege.
but what in fact was happening was that NOTHING in that schema needed stats, so nothing was gathered. if something needed stats - it failed as it should.
that bug does not apply here at all.
Analyze any sys priv in 11gr2
A reader, March 21, 2012 - 8:15 am UTC
Thanks for looking at the bug. Since this is not very urgent, please see if you can find a 10.2.0.3 database to test as per your convenience.
I will also try to create a 10.2.0.3 database and check if it works out of the box. Then I will upgrade to 10.2.0.4 and test whether it is resolved. I will have to start with the OS install, so it will take a couple of days.
Thanks for following up on this one even with your extremely busy schedule.
March 21, 2012 - 10:32 am UTC
If you can, please do this in 10g and report back the entire set of apparent system privs the newly created user has:
PLEASE REVIEW CAREFULLY: it operates as SYSDBA to create a new "dictionary"-like view that shows all of the roles granted to someone.
user1%ORA10GR2> connect / as sysdba
Connected.
sys%ORA10GR2>
sys%ORA10GR2> create or replace view user_role_hierarchy
2 as
3 select u2.name granted_role
4 from ( select *
5 from sys.sysauth$
6 connect by prior privilege# = grantee#
7 start with grantee# = uid or grantee# = 1) sa,
8 sys.user$ u2
9 where u2.user#=sa.privilege#
10 union all select user from dual
11 union all select 'PUBLIC' from dual
12 /
View created.
sys%ORA10GR2> grant select on user_role_hierarchy to public;
Grant succeeded.
sys%ORA10GR2> create or replace public synonym user_role_hierarchy for user_role_hierarchy;
Synonym created.
sys%ORA10GR2>
sys%ORA10GR2> drop user user1 cascade;
User dropped.
sys%ORA10GR2> create user user1 identified by user1;
User created.
sys%ORA10GR2> grant create session to user1;
Grant succeeded.
sys%ORA10GR2> grant select on dba_sys_privs to user1;
Grant succeeded.
sys%ORA10GR2>
sys%ORA10GR2> connect user1/user1;
Connected.
user1%ORA10GR2>
user1%ORA10GR2> select * from user_role_hierarchy;
GRANTED_ROLE
------------------------------
USER1
PUBLIC
user1%ORA10GR2>
user1%ORA10GR2> select distinct PRIVILEGE
2 from dba_sys_privs
3 where grantee in ( select * from user_role_hierarchy )
4 /
PRIVILEGE
----------------------------------------
CREATE SESSION
user1%ORA10GR2>
A reader, March 21, 2012 - 11:49 am UTC
C:\>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Mar 21 12:45:20 2012
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
SQL> create or replace view user_role_hierarchy
2 as
3 select u2.name granted_role
4 from ( select *
5 from sys.sysauth$
6 connect by prior privilege# = grantee#
7 start with grantee# = uid or grantee# = 1) sa,
8 sys.user$ u2
9 where u2.user#=sa.privilege#
10 union all
11 select user from dual
12 union all
13 select 'PUBLIC' from dual
14 /
View created.
SQL> grant select on user_role_hierarchy to public;
Grant succeeded.
SQL> create or replace public synonym user_role_hierarchy for user_role_hierarchy;
Synonym created.
SQL> drop user user1 cascade;
User dropped.
SQL> create user user1 identified by user1;
User created.
SQL> grant create session to user1;
Grant succeeded.
SQL> grant select on dba_sys_privs to user1;
Grant succeeded.
SQL> connect user1/user1
Connected.
SQL> select * from user_role_hierarchy;
GRANTED_ROLE
------------------------------
USER1
PUBLIC
SQL> select distinct PRIVILEGE
2 from dba_sys_privs
3 where grantee in ( select * from user_role_hierarchy )
4 /
PRIVILEGE
----------------------------------------
CREATE SESSION
SQL>
March 21, 2012 - 10:44 pm UTC
that is perplexing. I would contact support at this point - you should not be able to analyze ANYONE'S tables - period.
can anyone else with 10.2.0.3 available reproduce?
Analyze any sys priv in 11gr2
A reader, March 22, 2012 - 8:08 am UTC
Tom,
I would really like to express my sincere thanks to you for working on this issue. I have seen your busy schedule on the calendar. It is really amazing that you still find time to help Oracle users.
I did open a SR and the only thing I got out of support is:
"In 11g we tightened security on ANALYZE. It is a behavior change between versions."
Since your tests on other versions of 10g show that dbms_stats fails without privileges, I have to disagree with that statement. Also, if it were a behavior change, there would be some documentation of it in the New Features guide or the PL/SQL Packages and Types Reference.
Tomorrow I am going to create a VM and test with various versions of 10g to observe the behavior out of the box.
10.2.0.1.0 - 10.2.0.3.0 Results
Mark Williams, March 22, 2012 - 6:21 pm UTC
Here's some results from versions 10.2.0.1.0, 10.2.0.2.0, and 10.2.0.3.0 on a test VM (Windows Server 2003 32-bit).
Quick summary: I couldn't reproduce the issue.
Common tasks for each version:
create user USER1 identified by USER1
default tablespace users
temporary tablespace temp
quota unlimited on users;
grant create session to USER1;
create user USER2 identified by USER2
default tablespace users
temporary tablespace temp
quota unlimited on users;
grant create session, create table to USER2;
conn USER2/USER2
create table t1 as select * from all_objects;
conn USER1/USER1
exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
Here are the results for each version with the common tasks snipped out for space:
SQL> @get_version
PRODUCT VERSION
------------------------------------------------ ----------
NLSRTL 10.2.0.1.0
Oracle Database 10g Enterprise Edition 10.2.0.1.0
PL/SQL 10.2.0.1.0
TNS for 32-bit Windows: 10.2.0.1.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13046
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1
SQL> @get_version
PRODUCT VERSION
------------------------------------------------ ----------
NLSRTL 10.2.0.2.0
Oracle Database 10g Enterprise Edition 10.2.0.2.0
PL/SQL 10.2.0.2.0
TNS for 32-bit Windows: 10.2.0.2.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13149
ORA-06512: at "SYS.DBMS_STATS", line 13179
ORA-06512: at line 1
SQL> @get_version
PRODUCT VERSION
------------------------------------------------ ----------
NLSRTL 10.2.0.3.0
Oracle Database 10g Enterprise Edition 10.2.0.3.0
PL/SQL 10.2.0.3.0
TNS for 32-bit Windows: 10.2.0.3.0
4 rows selected.
SQL> exec dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1');
BEGIN dbms_stats.gather_table_stats (ownname=>'USER2',tabname=>'T1'); END;
*
ERROR at line 1:
ORA-20000: Unable to analyze TABLE "USER2"."T1", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 13172
ORA-06512: at "SYS.DBMS_STATS", line 13202
ORA-06512: at line 1
March 22, 2012 - 7:04 pm UTC
Thanks Mark - I'm pretty sure there is something funky that I just cannot see with the original poster's database.
That would have been a pretty large hole had it existed.
A reader, July 23, 2012 - 1:27 pm UTC
Tom,
When I gather table stats using a procedure that's scheduled via dbms_job, I get the error below. If I gather stats on the table directly, I don't get that error. Any reason why? Your reply is greatly appreciated.
Unable to analyze TABLE , insufficient
privileges or does not exist
July 30, 2012 - 9:00 am UTC
A reader, July 30, 2012 - 12:39 pm UTC
That doesn't tell me what to do if I run the procedure through the job scheduler. I submitted a TAR with Oracle support, and the support analyst said I have to grant the ANALYZE ANY privilege to the DBSNMP user. I can gather stats if I run it using my id, which has the DBA role.
July 30, 2012 - 4:05 pm UTC
same thing, you have no roles.
you need the privileges directly granted to you. You'll be running from inside of a stored procedure environment.
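Tom's point about roles being disabled inside definer's-rights PL/SQL can be sketched as follows; the user name (JOBUSER) and procedure name here are hypothetical, not taken from the thread:

```sql
-- Inside a definer's-rights procedure (the default, AUTHID DEFINER),
-- roles are disabled, so a privilege that arrives only via a role
-- (e.g. through DBA) is invisible when the scheduler runs the job.
create or replace procedure gather_other_schema
authid definer
as
begin
    -- fails with ORA-20000 if ANALYZE ANY was granted only via a role
    dbms_stats.gather_table_stats(ownname => 'USER2', tabname => 'T1');
end;
/

-- The fix: grant the privilege directly to the procedure's owner.
grant analyze any to jobuser;
```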
Analyze Command to be used?
Ramki, July 31, 2012 - 1:04 pm UTC
Tom
I saw your explanation of the difference between the ANALYZE command and dbms_stats.gather_table_stats.
I recently joined a data warehouse team where the DBAs use the ANALYZE command as part of their weekly stats-gathering process for important tables.
The Oracle version used is 11g (11.2.0.1.0).
I referred to the Oracle 11g release notes, which recommend using the dbms_stats.gather_table_stats package.
After seeing your recommendations, I suggested they change to dbms_stats.gather_table_stats.
They(DBA's) responded by stating that "Actually oracle internally runs the “analyze” command within the dbms_stats package. Only difference is the cardinality will be more if you use the dbms_stats. Also running dbms_stats takes longer time than the regular analyze statement."
Is the above DBA statement correct? Can you please share your thoughts so that I can follow up with the DBAs and correct them?
July 31, 2012 - 1:08 pm UTC
They(DBA's) responded by stating that "Actually oracle internally runs the
“analyze” command within the dbms_stats package. Only difference is the
cardinality will be more if you use the dbms_stats. Also running dbms_stats
takes longer time than the regular analyze statement."
that is false on every level
first, enable sql trace, run dbms_stats, and then tkprof the trace file. Show me a single ANALYZE in the trace? You cannot. Show me a ton of SQL run against the objects? You'll find it all over the place.
dbms_stats does not run analyze, it doesn't.
(and if there is a difference - as they say there will be - doesn't that sort of rule out dbms_stats 'just runs analyze' - if they return different numbers?????)
and they should benchmark dbms_stats versus analyze. I think they'll be surprised.
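Tom's trace/tkprof test can be run as a short script; a sketch, assuming a table T1 in your own schema:

```sql
-- Trace a gather and confirm dbms_stats issues plain SQL, never ANALYZE.
alter session set tracefile_identifier = 'stats_test';
alter session set sql_trace = true;

exec dbms_stats.gather_table_stats(user, 'T1');

alter session set sql_trace = false;

-- Then, from the OS:
--   tkprof <trace_file> stats_test.prf
-- Searching the .prf for "analyze table" finds nothing; searching for
-- the recursive SELECTs dbms_stats ran against T1 finds plenty.
```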
dbms_stats gather taking ages
A Reader, January 23, 2013 - 3:08 am UTC
Tom,
we are at 10.2.0.4
the job:
SQL> exec dbms_stats.gather_table_stats(user,'TABLE_1',cascade=>true);
it has been running for more than 24 hrs...
table TABLE_1 size - 45 gb
Index IDX_NAME_UI size - 28 gb
table - sys.ORA_TEMP_1_DS_23485 has 0 rows in it.
IDX_NAME_UI is a function-based index on the NAME column, defined as:
create index IDX_NAME_UI on TABLE_1 ( UPPER(name) );
SQL> desc TABLE_1;
Name Null? Type
----------------------------------------------------------------------------------------------------------------- --------
MANAGEDSERVICEID NOT NULL NUMBER
NAME VARCHAR2(650 CHAR)
MS2MANAGEDSERVICETYPE NUMBER
MS2MARKETTYPE NUMBER
OMSID VARCHAR2(2000 CHAR)
FULLNAME VARCHAR2(50 CHAR)
RELATIVENAME VARCHAR2(50 CHAR)
ALIAS1 VARCHAR2(50 CHAR)
ALIAS2 VARCHAR2(50 CHAR)
OBJECTID VARCHAR2(64 CHAR)
SUBTYPE VARCHAR2(50 CHAR)
SUBSTATUS VARCHAR2(50 CHAR)
DESCRIPTION VARCHAR2(2000 CHAR)
NOTES VARCHAR2(2000 CHAR)
CREATEDDATE DATE
LASTMODIFIEDDATE DATE
CREATEDBY2DIMUSER NUMBER
LASTMODIFIEDBY2DIMUSER NUMBER
MS2PROVISIONSTATUS NUMBER
MS2FUNCTIONALSTATUS NUMBER
MARKEDFORDELETE NUMBER(1)
ISVISIBLE NUMBER
LABEL NUMBER
X_CTCV_CONV_ID VARCHAR2(50 CHAR)
X_CONV_RUN_NO NUMBER
We can see it is currently executing the statements below.
sid for the above job is - 1023
SQL> select event,state,p1,p2,p3 from v$session_wait where sid=1023;
EVENT STATE P1 P2 P3
---------------------------------------------------------------- ------------------- ---------------- ---------- ----------
latch: cache buffers chains WAITED SHORT TIME 23568357320 122 1
SQL> select segment_name,
2 segment_type
3 from dba_extents
4 where file_id = 23568357320
5 and 122 between
6 block_id and block_id + blocks -1;
no rows selected
-- no file with file_id = 23568357320 !
-- current statement being executed ..
SQL> explain plan for
2 SELECT COUNT (rep)
3 FROM ( SELECT /*+ leading(v1 v2) use_nl_with_index(v2) */
4 COUNT (v1.val) rep
5 FROM (SELECT rn, val
6 FROM (SELECT ROWNUM rn, val
7 FROM ( SELECT /*+ no_merge no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */
8 SUBSTRB (
9 SYS_DS_ALIAS_26,
10 1,
11 32)
12 val
13 FROM sys.ora_temp_1_ds_23485 t
14 WHERE SUBSTRB (SYS_DS_ALIAS_26, 1, 32)
15 IS NOT NULL
16 GROUP BY SUBSTRB (SYS_DS_ALIAS_26, 1, 32)
17 HAVING COUNT (
18 SUBSTRB (SYS_DS_ALIAS_26, 1, 32)) =
19 1))
20 WHERE ORA_HASH (rn) <= 4015864698) v1,
21 (SELECT /*+ index(t2) */
22 UPPER ("NAME") val
23 FROM "CRAMER"."TABLE_1" t2
24 WHERE TBL$OR$IDX$PART$NUM ("CRAMER"."TABLE_1",
25 0,
26 4,
27 0,
28 "ROWID") = :objn) v2
29 WHERE v2.val LIKE v1.val || '%'
30 GROUP BY v1.val
31 HAVING COUNT (v1.val) <= 2);
Explained.
SQL> set linesize 200
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
Plan hash value: 3648757343
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 100K (4)| 00:20:06 | | |
| 1 | SORT AGGREGATE | | 1 | 13 | | | | |
| 2 | VIEW | | 30M| 374M| 100K (4)| 00:20:06 | | |
|* 3 | FILTER | | | | | | | |
| 4 | HASH GROUP BY | | 30M| 1871M| 100K (4)| 00:20:06 | | |
| 5 | NESTED LOOPS | | 30M| 1871M| 97232 (1)| 00:19:27 | | |
|* 6 | VIEW | | 408 | 12648 | 21 (10)| 00:00:01 | | |
| 7 | COUNT | | | | | | | |
| 8 | VIEW | | 408 | 7344 | 21 (10)| 00:00:01 | | |
|* 9 | FILTER | | | | | | | |
| 10 | HASH GROUP BY | | 408 | 130K| 21 (10)| 00:00:01 | | |
|* 11 | TABLE ACCESS FULL| ORA_TEMP_1_DS_23485 | 408 | 130K| 20 (5)| 00:00:01 | | |
| 12 | PARTITION RANGE SINGLE| | 74008 | 2457K| 238 (1)| 00:00:03 | KEY | KEY |
|* 13 | INDEX RANGE SCAN | IDX_NAME_UI | 74008 | 2457K| 238 (1)| 00:00:03 | KEY | KEY |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(COUNT("VAL")<=2)
6 - filter(ORA_HASH("RN")<=4015864698)
9 - filter(COUNT(SUBSTRB("SYS_DS_ALIAS_26",1,32))=1)
11 - filter(SUBSTRB("SYS_DS_ALIAS_26",1,32) IS NOT NULL)
13 - access(UPPER("NAME") LIKE "VAL"||'%')
filter(UPPER("NAME") LIKE "VAL"||'%')
30 rows selected.
SQL>
Questions:
a) where is it waiting?
b) what is P1 = 23568357320 from v$session_wait referring to here?
c) would gathering system-level stats help?
d) anything else... a bug?
January 30, 2013 - 12:28 pm UTC
-- no file with file_id = 23568357320 !
p1 = address of latch, not file
ops$tkyte%ORA11GR2> select parameter1 p1, parameter2 p2, parameter3 p3
2 from v$event_name
3 where name = 'latch: cache buffers chains'
4 /
P1
----------------------------------------------------------------
P2
----------------------------------------------------------------
P3
----------------------------------------------------------------
address
number
tries
use ASH reports to see what it was doing.
do you have the resources to run in parallel (IO, CPU and memory)?
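If the IO/CPU/memory headroom exists, parallelism can be requested explicitly on the gather; a minimal sketch (the degree choice is illustrative, not a recommendation for this system):

```sql
-- Let dbms_stats parallelize its scans.  AUTO_DEGREE lets Oracle pick
-- a degree from the object size and the parallel_* parameters; a
-- literal number would force that degree instead.
begin
    dbms_stats.gather_table_stats(
        ownname => user,
        tabname => 'TABLE_1',
        degree  => dbms_stats.auto_degree,
        cascade => true);
end;
/
```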
..dbms_stats job taking ages
A Reader, January 23, 2013 - 5:16 am UTC
Tom
further to add to above
I did
alter session set events '10046 trace name context forever,level 12';
on the session running the stats gather.
Excerpts from the trace file (the job had been running for many hours):
this obj#=202725 belongs to
|* 11 | TABLE ACCESS FULL| ORA_TEMP_1_DS_23485 | 408 | 130K| 20 (5)| 00:00:01 | | |
WAIT #20: nam='db file scattered read' ela= 41817 file#=1009 block#=6154 blocks=128 obj#=202725 tim=3368259988396
WAIT #20: nam='db file scattered read' ela= 3920 file#=1009 block#=6282 blocks=19 obj#=202725 tim=3368260153260
|* 13 | INDEX RANGE SCAN | IDX_NAME_UI | 74008 | 2457K| 238 (1)| 00:00:03 | KEY | KEY |
obj#=194946 belongs to IDX_NAME_UI
seen many events as below
WAIT #23: nam='db file sequential read' ela= 467 file#=260 block#=1432646 blocks=1 obj#=194946 tim=3369425036064
WAIT #23: nam='db file sequential read' ela= 533 file#=277 block#=215812 blocks=1 obj#=194946 tim=3369426851963
WAIT #23: nam='db file sequential read' ela= 442 file#=277 block#=215813 blocks=1 obj#=194946 tim=3369426852683
WAIT #23: nam='db file sequential read' ela= 403 file#=277 block#=215814 blocks=1 obj#=194946 tim=3369426853244
The trace file then stopped updating; below is the last line, where it was waiting:
*** 2013-01-23 21:35:21.609
WAIT #23: nam='db file sequential read' ela= 13780 file#=269 block#=242690 blocks=1 obj#=194946 tim=3369458683230
I am sure it is this operation.
| 5 | NESTED LOOPS | | 30M| 1871M| 97232 (1)| 00:19:27 | | |
So, is the nested loop the problem here?
..dbms_stats job taking forever
A Reader, January 25, 2013 - 1:53 am UTC
Tom,
sorry to keep adding the info as it comes.
Here are more details to above problem
I ran the dbms_stats.gather_table_stats job again without parallelism (no degree clause specified)
and traced the session. Below are excerpts from the trace file in udump.
...snip
... snip
WAIT #22: nam='db file sequential read' ela= 27129 file#=258 block#=1454180 blocks=1 obj#=194946 tim=3423691838220
WAIT #22: nam='db file sequential read' ela= 40410 file#=260 block#=1453019 blocks=1 obj#=194946 tim=3423691914858
WAIT #22: nam='db file sequential read' ela= 7393 file#=260 block#=1453060 blocks=1 obj#=194946 tim=3423691928244
<no updates to the trace file for the last 11 hrs, even though the job is still running>
Checking what the session is waiting for:
SQL> select event,state,p1,p2,p3 from v$session_wait where sid=1045;
EVENT STATE P1 P2 P3
---------------------------------------------------------------- ------------------- ---------- ---------- ----------
db file sequential read WAITED KNOWN TIME 260 1453060 1
SQL> select segment_name,
2 segment_type
3 from dba_extents
4 where file_id = 260
5 and 1453060 between
6 block_id and block_id + blocks -1;
SEGMENT_NAME SEGMENT_TYPE
------------------------------ --------------------------------------------------
IDX_NAME_UI INDEX PARTITION
- obj#=194946 in the trace file refers to the index partition segment IDX_NAME_UI
Looking at AWR for the last hour (4 dual-core CPUs):
Top 5 Timed Events
Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
CPU time 38,824 100
control file parallel write 36,658 187 5 0.5 System I/O
control file sequential read 198,347 168 1 0.4 System I/O
direct path read 62,568 34 1 0.1 User I/O
enq: CF - contention 120 31 259 0.1 Other
SQL ordered by Elapsed Time
Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
38,686 38,683 0 99.61 4wx2na79mm0jv SQL*Plus BEGIN DBMS_STATS.GATHER_TABLE_...
38,686 38,683 0 99.61 6kd805142rv5w SQL*Plus select count(rep) from (select...
Questions:
a) is it because of sluggish IO?
b) we see that the subquery (running on behalf of the DBMS_STATS.GATHER_TABLE_STATS job) is CPU intensive - is that expected?
full query is:
select count(rep) from (select /*+ leading(v1 v2) use_nl_with_index(v2) */ count(v1.val) rep from (select rn, val from (select rownum rn, val from (select /*+ no_merge no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring */substrb("NAME", 1, 32) val from sys.ora_temp_1_ds_23528 t where substrb("NAME", 1, 32) is not null group by substrb("NAME", 1, 32) having count(substrb("NAME", 1, 32)) = 1)) where ora_hash(rn) <= 3909847333) v1, (select /*+ index(t2) */ "NAME" val from "user"."TABLE_1" t2 where TBL$OR$IDX$PART$NUM("user"."TABLE_1", 0, 4, 0, "ROWID") = :objn) v2 where v2.val like v1.val||'%' group by v1.val having count(v1.val) <= 2)
c) what is the way forward? Do I need to raise a bug?
..dbms_job taking ages..
A Reader, January 26, 2013 - 3:48 pm UTC
Tom
I got the tkprof output for it.
snip...
********************************************************************************
select count(rep)
from
(select /*+ leading(v1 v2) use_nl_with_index(v2) */ count(v1.val) rep from
(select rn, val from (select rownum rn, val from (select /*+ no_merge
no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact
use_weak_name_resl dynamic_sampling(0) no_monitoring */substrb("NAME",1,32)
val from sys.ora_temp_1_ds_23528 t where substrb("NAME",1,32) is not null
group by substrb("NAME",1,32) having count(substrb("NAME",1,32)) = 1))
where ora_hash(rn) <= 3909847333) v1, (select /*+ index(t2) */ "NAME" val
from "user"."TABLE_1" t2 where
TBL$OR$IDX$PART$NUM("CRAMER"."TABLE_1",0,4,0,"ROWID") = :objn) v2
where v2.val like v1.val||'%' group by v1.val having count(v1.val) <= 2)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 297 0 0
Fetch 1 86487.88 84518.20 3001 662474171 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 86487.89 84518.20 3001 662474468 0 1
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: 72 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=662474171 pr=3001 pw=0 time=2913827502 us)
1123 VIEW (cr=662474171 pr=3001 pw=0 time=2913846548 us)
1123 FILTER (cr=662474171 pr=3001 pw=0 time=2913845415 us)
2005 HASH GROUP BY (cr=662474171 pr=3001 pw=0 time=2913821602 us)
611873 NESTED LOOPS (cr=662474171 pr=3001 pw=0 time=111974935129 us)
2005 VIEW (cr=116 pr=0 pw=0 time=152402 us)
2197 COUNT (cr=116 pr=0 pw=0 time=102234 us)
2197 VIEW (cr=116 pr=0 pw=0 time=91204 us)
2197 FILTER (cr=116 pr=0 pw=0 time=80192 us)
2308 HASH GROUP BY (cr=116 pr=0 pw=0 time=78997 us)
4308 TABLE ACCESS FULL ORA_TEMP_1_DS_23528 (cr=116 pr=0 pw=0 time=4440 us)
611873 PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289249313 us)
611873 INDEX RANGE SCAN IDX_NAME_UI PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289129215 us)(object id 194945)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 3001 0.06 28.32
*********************
here Oracle created sys.ora_temp_1_ds_23528 using the TABLE_1 definition.
index clustering factor, table count, number of blocks---
SQL> select column_name, data_type , data_length from user_TAB_COLUMNS where column_name in (select column_name from dba_ind_columns where index_name ='IDX_NAME_UI') and table_name ='TABLE_1';
COLUMN_NAME DATA_TYPE DATA_LENGTH
------------------------------ ------------------------------ -----------
NAME VARCHAR2 650
SQL> select index_name, last_analyzed, clustering_factor, num_rows , distinct_keys from dba_indexes where table_name ='TABLE_1';
INDEX_NAME LAST_ANAL CLUSTERING_FACTOR NUM_ROWS DISTINCT_KEYS
------------------------------ --------- ----------------- ---------- -------------
IDX_NAME_UI 13-JAN-13 230094384 306871930 304700852
SQL> select sum(blocks) from dba_segments where segment_name ='TABLE_1';
SUM(BLOCKS)
-----------
5895680
Observations/questions:
a) each row from the operation --> 2005 VIEW (cr=116 pr=0 pw=0 time=152402 us)
is looked up in the index IDX_NAME_UI and the data is read from table TABLE_1; only 611873 matching rows are found.
Due to the poor clustering factor, is each block read many times over?
b) has the optimizer chosen the nested loop join correctly, with the real issue being the poor clustering factor?
The table in question has stale stats, and to get the proper plan we need stats... a sort of chicken-and-egg condition.
c) what is the way forward here - improve the clustering factor? We could do an alter table move or an online redefinition, but would that order the table's data by the NAME key? The table has another column, CREATEDDATE, for which I can imagine most keys sitting together in a block, but that is not true for NAME.
d) the nested loop operation has these stats:
611873 NESTED LOOPS (cr=662474171 pr=3001 pw=0 time=111974935129 us)
That is 111974 seconds, ~31 hours; 611873 rows taking 31 hours is huge. I see the inner index lookup taking most of the time, dominating the total for the NL operation:
611873 PARTITION RANGE SINGLE PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289249313 us)
611873 INDEX RANGE SCAN IDX_NAME_UI PARTITION: KEY KEY (cr=662474055 pr=3001 pw=0 time=65289129215 us)(object id 194945)
The time taken there is around 18 hours (65289249313 us).
e) how can I probe further whether the poor clustering factor or the nested loop is the issue? Any other test?
..dbms_stats taking ages
A Reader, January 27, 2013 - 5:48 pm UTC
Tom,
I see the above is closely related to
Bug 8651614 : GATHERING STATISTICS TAKE A LONG TIME WITH DBMS_STATS
and will pursue it.
Thanks
Execute DBMS_STATS.GATHER_TABLE_STATS inside a procedure
Raghu, April 03, 2013 - 7:20 pm UTC
Tom,
We have 2 tables that get about 5-6 million transactions each day, and stats are gathered at the end of each day (on all tables, not just these 2).
We have a couple of stored procedures that access these tables but run before the stats are gathered.
We have quite often had to kill the session (since it sometimes takes more than 7 hrs, which it shouldn't), gather stats, and re-run the stored procs (they run really fast the second time around).
Is it a good idea to gather stats for just those 2 tables inside the stored proc before every run (once every day)?
procedure abcd
is
begin
DBMS_STATS.GATHER_TABLE_STATS(owner, table, cascade=>true);
Our Code here;
end;
Thanks in advance
April 22, 2013 - 1:58 pm UTC
... Is it a good idea to gather stats only for those 2 tables inside the stored
proc before every run (once everyday)? ...
it would be a good idea to either have already gathered statistics OR to have SET them (using dbms_stats) or to have copied them from yesterday's data - or to LOCK the stats on these tables and never gather again.
what you want to have in place are statistics that represent the data in the table - you could do that by gathering stats ONCE when the table is full, locking them and never doing it again.
you could do that by setting stats on the table - tell us "there are 6 million rows in here and they look sort of like this"
you could do that by gathering stats after the data has been loaded.
but doing what you are doing doesn't make sense...
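The "set them and lock them" option could look like this sketch; the table name and the row/block counts are illustrative, not taken from the question:

```sql
-- Tell the optimizer what the fully loaded table typically looks like,
-- then lock the stats so later gathers cannot overwrite them.
begin
    dbms_stats.set_table_stats(
        ownname => user,
        tabname => 'MY_TXN_TABLE',   -- hypothetical table name
        numrows => 6000000,          -- "about 6 million rows"
        numblks => 80000);           -- illustrative block count
    dbms_stats.lock_table_stats(user, 'MY_TXN_TABLE');
end;
/
-- A subsequent gather on this table raises ORA-20005 unless
-- force => true is passed - exactly the protection wanted here.
```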
Execute DBMS_STATS.GATHER_TABLE_STATS 2 methods
Linda, June 28, 2013 - 4:47 pm UTC
How can I tell which of these two methods was used to gather a table's statistics?
DBMS_STATS.GATHER_TABLE_STATS(
 ownname => <table owner>
,tabname => <table_name>
,estimate_percent => NULL
,block_sample => FALSE
,degree => DBMS_STATS.AUTO_DEGREE
,cascade => TRUE
);
And:
DBMS_STATS.GATHER_TABLE_STATS(
 ownname => <table owner>
,tabname => <table_name>
,estimate_percent => NULL
,block_sample => FALSE
,method_opt => 'FOR ALL INDEXED COLUMNS'
,degree => DBMS_STATS.AUTO_DEGREE
,cascade => FALSE
);
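This question went unanswered in the thread, but one practical approach is to infer the method_opt used from the column-level statistics it left behind; a sketch, assuming a table named T1:

```sql
-- 'FOR ALL INDEXED COLUMNS' collects column statistics only for
-- indexed columns, so non-indexed columns end up with no column
-- stats; the default method_opt covers every column.
select column_name, num_distinct, num_buckets, histogram, last_analyzed
  from user_tab_col_statistics
 where table_name = 'T1'
 order by column_name;
-- Columns of T1 missing from this view suggest the restricted
-- 'FOR ALL INDEXED COLUMNS' variant was the one that ran.
```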
Pawan Tyagi, October 23, 2013 - 12:37 pm UTC
I need to see the old dbms_stats data for a table in an Oracle database. I can see when stats were last gathered, and the history dates as well, but how can I see the old and new values for a table?
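One way to see old versus new values is dbms_stats.diff_table_stats_in_history, which reports both sets side by side; a sketch, with placeholder owner/table names:

```sql
-- Stats history retention (default 31 days) is what makes this possible.
select dbms_stats.get_stats_history_retention from dual;

-- List the retained versions for the table.
select stats_update_time
  from dba_tab_stats_history
 where owner = 'SCOTT' and table_name = 'EMP';

-- Report old vs. new values between two points in time.
select report, maxdiffpct
  from table(dbms_stats.diff_table_stats_in_history(
                 ownname      => 'SCOTT',
                 tabname      => 'EMP',
                 time1        => systimestamp,
                 time2        => systimestamp - interval '7' day,
                 pctthreshold => 0));
```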
Analyzing tables using DBMS_STATS or COMPUTE STATISTICS - which is better?
Malaya, October 31, 2013 - 2:47 pm UTC
Hi,
I am confused. Generally I use DBMS_STATS to optimize DB performance, but over the last 2 months I have encountered a severe performance issue. I then used COMPUTE STATISTICS and performance improved.
Please advise: what should I use in future? Please help me.
Malaya
November 01, 2013 - 9:27 pm UTC
only use dbms_stats.
analyze table for gathering statistics is a deprecated bit of functionality.
use dbms_stats and make sure you are gathering the same set of statistics you did with analyze (you didn't share your analyze or dbms_stats commands so I cannot comment further). what happened is that you probably gathered entirely DIFFERENT types of statistics with the two commands. Make sure they are gathering the same amount and types of statistics.
Degree level in DBMS_Stats
Mahes, November 18, 2013 - 11:10 am UTC
Hi Tom,
I am using the following call in a loading stored procedure, which runs every day and loads data into a target table.
Oracle version: 11.2
DBMS_STATS.Gather_Table_Stats(ownname => 'XYZ', tabname => v_tbl_name, cascade => TRUE, degree => 3);
Degree is given as a number here. On what basis is it chosen, and what range of values can be used?
I saw AUTO_DEGREE mentioned on this page. Can you explain how that is allocated?
Also, please explain cascade.
Thanks
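The question went unanswered here, so as a rough guide only: degree is the degree of parallelism dbms_stats uses for its scans (any positive integer your parallel settings allow, or AUTO_DEGREE to let Oracle choose), and cascade => TRUE gathers the table's index statistics in the same call. A hedged sketch with placeholder names:

```sql
-- degree => N                      : force a DOP of N for the scans
-- degree => dbms_stats.auto_degree : Oracle picks a DOP based on
--                                    object size and parallel_* params
-- cascade => true                  : also gather the indexes' stats
begin
    dbms_stats.gather_table_stats(
        ownname => 'XYZ',            -- placeholder owner
        tabname => 'TARGET_TABLE',   -- placeholder table
        degree  => dbms_stats.auto_degree,
        cascade => true);
end;
/
```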
ENDPOINT_VALUE holding VARCHAR
Andre, January 10, 2014 - 1:54 pm UTC
Hi Tom,
A very happy New Year 2014!
Is there a way to decode the ENDPOINT_VALUE from the histogram view where the value is VARCHAR2?
I have searched for answers for days - got a link to Jonathan Lewis and others - but it seems that decoding the first 6 bytes is OK; beyond that there are problems.
And the field ENDPOINT_ACTUAL_VALUE sometimes shows such values - but then they also somehow vanish...
Thanks in advance
Andre
ENDPOINT_VAL for VARCHAR2
Andre, January 12, 2014 - 11:03 am UTC
Hi Tom,
After more searching through Jonathan Lewis' papers, I located his paper on this subject.
It appears that Oracle only encodes the first 5 characters, plus possibly the 6th (although this is not 100% exact); beyond that the remaining values are garbage.
JL explained it pretty well, and it appears that the ENDPOINT_ACTUAL_VALUE column, which is mostly NULL in the histogram view, stores the values when the first 6 characters are not unique. I have not tested such cases yet, and probably will when time allows.
But I am curious about Oracle's strategy, i.e. the first 5 bytes. Does it change for 12c?
Thanks
Best wishes
A
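The decoding approach Jonathan Lewis describes can be sketched roughly as follows (an untested sketch, not a definitive recipe: the table and column names are placeholders, and the HEXTORAW/UTL_RAW trick only recovers the leading bytes because the endpoint is stored rounded to roughly 15 significant decimal digits, which is about 6 bytes of real information):

```sql
-- Convert the numeric endpoint to hex, then cast the leading 6 bytes
-- (12 hex digits) back to characters. Bytes beyond that are noise.
SELECT endpoint_number,
       utl_raw.cast_to_varchar2(
          hextoraw(
             substr(to_char(endpoint_value,
                            'fmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'),
                    1, 12))) AS decoded_prefix
FROM   user_tab_histograms
WHERE  table_name  = 'MYTABLE'     -- placeholder
AND    column_name = 'MYCOLUMN';   -- placeholder
```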
indexing a partitioned table
jason, January 22, 2014 - 10:17 pm UTC
Could you please let me know the best way to find out which columns of a table are used most frequently? I just partitioned a table by range (gl_date), and I would like to create indexes on the column or columns that are used most often in queries. Is there a way for me to get this information? I have used SYS.COL_USAGE$, but I'm not quite sure that's what is needed.
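SYS.COL_USAGE$ is indeed where predicate usage is tracked, but from 11.2 onwards there is a documented wrapper that formats it for you; a sketch (the table name is a placeholder):

```sql
-- DBMS_STATS.REPORT_COL_USAGE returns a CLOB summarising how often each
-- column of the table appeared in equality, range, LIKE and join
-- predicates -- a reasonable starting point for choosing index columns.
SET LONG 100000
SELECT dbms_stats.report_col_usage(ownname => user,
                                   tabname => 'MY_PARTITIONED_TABLE')
FROM   dual;
```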
performance impact of stats collection
Ravi B, April 24, 2014 - 9:40 pm UTC
Hi Tom,
We have several "big" tables in our data warehouse. After each ETL run we gather stats on the tables. Each table is partitioned and sub-partitioned. The complete process of gathering stats takes hours to finish.
currently, this is what we do for each table.
EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);
Do you recommend us doing the following instead, as described at
https://blogs.oracle.com/optimizer/entry/maintaining_statistics_on_large_partitioned_tables
EXEC DBMS_STATS.SET_TABLE_PREFS(<SCHEMA_NAME>,<TABLE_NAME>,'INCREMENTAL','TRUE');
EXEC DBMS_STATS.GATHER_TABLE_STATS(<SCHEMA_NAME>,<TABLE_NAME>);
Thanks!
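After setting the preference it is worth confirming it took effect before the next gather; a sketch using the documented GET_PREFS function (schema and table names are placeholders):

```sql
-- Returns the current value of the INCREMENTAL preference for the table
-- as a string ('TRUE' once the SET_TABLE_PREFS call above has been run).
SELECT dbms_stats.get_prefs(pname   => 'INCREMENTAL',
                            ownname => 'MY_SCHEMA',      -- placeholder
                            tabname => 'MY_BIG_TABLE')   -- placeholder
       AS incremental_setting
FROM   dual;
```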
a review
A reader, March 31, 2016 - 5:09 pm UTC
a review
Undo analyze table
Praveen Darla, May 27, 2019 - 10:05 am UTC
Hi Tom,
I ran analyze table tablename compute statistics without first saving the table's previous statistics. I want to undo or revert the analyze. Please assist with a solution.
Thanks
May 29, 2019 - 6:23 am UTC
You can't undo an analyze (which is why we recommend using dbms_stats).
If you *do* have some historical stats, you can do:
exec dbms_stats.delete_table_stats(ownname=>'MYUSER', tabname=>'MYTABLE');
exec dbms_stats.restore_table_stats(ownname=>'MYUSER',tabname=>'MYTABLE',as_of_timestamp=>'12/11/19 12:12:12,123456 +08:00');
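To find out whether a usable restore point exists at all, you can check the statistics history retention window and the saved versions for the table first; a sketch using the documented history views and function (owner and table names are placeholders):

```sql
-- Oldest timestamp for which statistics history is still retained
SELECT dbms_stats.get_stats_history_availability AS oldest_restorable
FROM   dual;

-- The save points recorded for the table (one row per stats update)
SELECT table_name, stats_update_time
FROM   dba_tab_stats_history
WHERE  owner      = 'MYUSER'       -- placeholder
AND    table_name = 'MYTABLE'      -- placeholder
ORDER  BY stats_update_time;
```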
gabriel, November 14, 2022 - 5:39 pm UTC
Hello. I have a problem and I don't know how to figure it out. Can you help me please? I have this requirement: count the total number of books and magazines rented, and the total number of members
that used the library in the selected month.
My code: Select sum(transaction_id) from ledgers where month(issue_date) = 7;
The result: ORA-00904: "MONTH": invalid identifier
Thank you.
November 15, 2022 - 7:06 am UTC
month(issue_date) = 7;
is probably
extract(month from issue_date) = 7
or you might consider something to ensure issue_date is unaffected, eg
issue_date >= date '2022-07-01' and
issue_date < date '2022-08-01'
(assuming you want data just for the current year)
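Note too that the stated requirement is a count, so COUNT rather than SUM(transaction_id) is probably what is wanted. A sketch against the ledgers table from the question (the member_id column is an assumption, since the question doesn't show the table's structure):

```sql
-- Rentals and distinct members for July 2022, using the index-friendly
-- date-range predicate from the answer above.
SELECT COUNT(*)                  AS items_rented,
       COUNT(DISTINCT member_id) AS members_used   -- member_id assumed
FROM   ledgers
WHERE  issue_date >= DATE '2022-07-01'
AND    issue_date <  DATE '2022-08-01';
```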