Skip to Main Content
  • Questions
  • Incremental stats on large partitioned table

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Lakshmi.

Asked: April 24, 2017 - 5:23 pm UTC

Last updated: August 27, 2021 - 4:19 am UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

we have our a reporting database which is being cloned from our production database daily. The source database is a mission critical database where we can't make any changes(can't set table preferences for stats gathering ). We have a range partitioned table which has more than 600+ partitions.
Every day after the clone we need to gather stats on this table to ensure our SQL's follow optimal path. Here is our challange.Full stats gathering takes 3.5 hours. Every day few partitions have new data.

Incremental stats is not possible since we clone the database from source daily. The incremental stats option is not set at source database and we can't change this.

I am thinking about an option which will help to gather stats quickly.

Partition level stats gathering - only for partitions which are in stale status - however global stats will not be updated if we follow this approach.

Is there any other elegant way of doing this stats collection. we just need to find the parititons which are stale and then gather stats and update global stats in a lean window.

Update ****
Clone is a copy of all data files from the source system. we then create a control file from source then rename the database on target and mount the database.
****


and Connor said...

Since you are cloning the files, the stats are being copied over *exactly* as they are on the source node. So I'm assuming what you are saying is that your stats are "bad" on the source node (where "bad" means they are not good for your reporting needs, but they are ok for your source node needs.


Based on that, this means you are free to do whatever you want with the stats on the reporting node - so my first piece of advice would be - stop *gathering* so many stats.

What I mean by that is

a) if only a few partitions are stale, just gather stats at the partition level for these. This should be efficient.

b) Large partitioned tables rarely have their stats *distribution* change dramatically. If your 5 billion row table has 200 distinct values for (say) a location column, then tomorrow, when there is 5.1 billion rows...then you'll probably *still* have 200 locations...or *maybe* 1 more.

So do less *gathering* and more *setting* of statistics, eg

- low distinct value columns (like the location example above), set once and revisit every 6 months

- incrementing values (timestamps, sequences etc). Easy - just get the max value and use that that set the column high water mark. For many dates, you dont even need to query, because the hwm is "today".

- num_rows - the same information that tells you what partitions are stale (xxx_TAB_MODIFICATIONS) shows you how many new rows were added. Use that to bump up num_rows.

- blocks - you'll already have avg_row_len...so 'n' new rows can easily be mapped into the blocks increment.

Once a partition table gets huge... its typically reached a steady state. Not steady in terms of *growth*, but steady in terms of the data distribution and the patterns of the data within it. Why make the database burn hours of cpu to tell you something you already know.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, April 27, 2017 - 10:55 am UTC

Thank you. Useful insight to this issue.

INCREMENTAL STATS on partition table and new columns

Paul, August 24, 2021 - 5:36 am UTC

Greetings !

Database: 12.2.0.1 JAN2021 Release updates.

We have a large partition table which is enabled with incremental stats and it has daily partition.

After schema change was applied (3 new columns added to the table), we are seeing all partitions showing as STALE even though there was no data changes in old partitions.


Questions -

1) How do we prevent thie autostats jobs recollecting stats all over again the partition stats.

2)
For my table, incremental_staleness setr as 'NULL'

What s the recommended value for incremental_staleness parameter value for the table ?


Thanks
Connor McDonald
August 27, 2021 - 4:19 am UTC

Can we get a test case? Here's mine

SQL> create table t
  2  partition by list ( owner ) automatic
  3  ( partition p1 values ('SYS')
  4  )
  5  as select * from dba_objects;

Table created.

SQL>
SQL> exec dbms_stats.set_table_prefs('','T','INCREMENTAL','TRUE');

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

SQL> select partition_name, stale_stats
  2  from   user_tab_statistics
  3  where  table_name = 'T';

PARTITION_NAME                 STALE_S
------------------------------ -------
                               NO
SYS_P21579                     NO
SYS_P21596                     NO
SYS_P21564                     NO
SYS_P21569                     NO
SYS_P21571                     NO
SYS_P21585                     NO
...
...


SQL> alter table t add x int;

Table altered.

SQL> alter table t add y int;

Table altered.

SQL> alter table t add z int;

Table altered.

SQL> select partition_name, stale_stats
  2  from   user_tab_statistics
  3  where  table_name = 'T';

PARTITION_NAME                 STALE_S
------------------------------ -------
                               NO
SYS_P21579                     NO
SYS_P21596                     NO
SYS_P21564                     NO
SYS_P21569                     NO
SYS_P21571                     NO
SYS_P21585                     NO
...
...


So there's something missing in the picture here