Skip to Main Content
  • Questions
  • Peculiar partitioning method and automatic multiple column list partitioning.

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question.

Asked: May 15, 2020 - 10:07 am UTC

Answered by: Chris Saxon - Last updated: May 26, 2020 - 3:57 pm UTC

Category: SQL - Version: 12.2.0.2

Viewed 100+ times

You Asked

Hi


I am a great fan of Oracle partitioning. In our environment there is however few shortcomings
- unability to partition data by more than one date column
- unability to subpartition data by non fixed number of partitions.
(without any human interaction when new partitions are needed).

Then second could be overcome by using hash partitioning.
The first one might be overcome by using technique I have name as 'multiple day partitioning'; using greates value of these date columns as partitioning column.

As an wild thought I tried to use Oracle 12.2 feature where one could use automatic list partition by several columns to overcome both of these problems.

Well, it is not working; ran into some Oracle error, but that is not the point of my question.

How is this automatic list partitioning working ?
If one would partition by correlelated dates by month (e.g 5) by status (e.g. 4 values) products (e.g. 500) then for 5 years there would be 5*5*4*50 = 50000 partitions.
(this is quite a big table, rows inserted in one month would be hundred of millions).

Are these partitions scanned as 'full table' and matching partitions used?
Is there any indexing method? Does order of partitioning columns have any effect.

Is there absolutely any idea in this ?
(this is environment where no cloud, no enginered system is available).


create table lh_order_test( 
 ID number(20) not null, 
 PRODUCT varchar(6) not null, 
 STATUS varchar(10) not null, 
 ORDERDATE date not null, 
 WISHED_DELIVERY_DATE date not null, 
 BIGGER_THAN_ANY_DATE date generated always as (add_months(trunc(greatest("ORDERDATE","WISHED_DELIVERY_DATE"),'MM'),1)) not null, 
 SMALLEST_PARTITIONING_MONTH date generated always as (trunc(least("ORDERDATE","WISHED_DELIVERY_DATE"),'MM')) not null 
 ) 
 PARTITION BY LIST (PRODUCT, STATUS, BIGGER_THAN_ANY_DATE, SMALLEST_PARTITIONING_MONTH) AUTOMATIC( 
 partition example values ('000000','DELETED', to_date('20200301','yyyymmdd'),to_date('20200201','yyyymmdd') 
 ) 
 ) 
 ; 


 alter table lh_order_test add (constraint c1 check (ORDERDATE >= SMALLEST_PARTITIONING_MONTH)); 
 alter table lh_order_test add (constraint c2 check (ORDERDATE < BIGGER_THAN_ANY_DATE)); 
 alter table lh_order_test add (constraint c3 check (WISHED_DELIVERY_DATE >= SMALLEST_PARTITIONING_MONTH)); 
 alter table lh_order_test add (constraint c4 check (WISHED_DELIVERY_DATE < BIGGER_THAN_ANY_DATE)); 

 insert into lh_order_test(id,product,status, orderdate, wished_delivery_date) values (-1,'000000','DELETED',sysdate, sysdate+1); 

 ORA-01899: bad precision specifier 

 insert into lh_order_test(id, product, status, orderdate, wished_delivery_date) 
 select level, to_char(mod(level,1000),'FM099999'), 'ACTIVE',SYSDATE+level/6, SYSDATE+ level/12 from dual connect by level < 1000; 

 select * from lh_order_test where wished_delivery_date >= add_months(trunc(sysdate,'mm'),5); 



lh

and we said...

I'm confused as to what you're trying to do here!

- unability to partition data by more than one date column


You can?

create table t (
  c1 date, c2 date
) partition by range ( c1, c2 ) (
  partition p1 values less than ( 
    date'2020-01-01', date'2020-01-01' 
  )
);

select partition_name, high_value 
from   user_tab_partitions
where  table_name = 'T';

PARTITION_NAME    HIGH_VALUE                                                                                                                                                                 
P1                TO_DATE(' 2020-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'), TO_DATE(' 2020-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')    


- unability to subpartition data by non fixed number of partitions.
(without any human interaction when new partitions are needed).


True, if you want to change the number of subpartitions you need someone to go in and do that. If you have a reasonable idea of how many subpartitions you want for each partition you could pre-create them though. Or perhaps set up a job that adds subpartitions as needed.

How is this automatic list partitioning working?

The database creates new partitions on demand - like with interval partitioning.

If one would partition by correlelated dates by month (e.g 5) by status (e.g. 4 values) products (e.g. 500) then for 5 years there would be 5*5*4*50 = 50000 partitions.

If you'll have a row for every product and status for every month/year combination, then yes you'll end up with 50,000 partitions. But this is an upper bound.

Are these partitions scanned as 'full table' and matching partitions used?

If there are no indexes, then yes you'll get full table (partition) scans. The database can use partition elimination when reading them. A bit like indexes, this is most effective if the left-most partitioning columns are in the where clause:

drop table t 
  cascade constraints purge;
  
create table t (
  c1 date, c2 date
) partition by list ( c1, c2 )
  automatic (
  partition p1 values ( 
    date'2020-01-01', date'2020-01-01' 
  )
);

insert into t values ( trunc ( sysdate ), trunc ( sysdate ) );
insert into t values ( trunc ( sysdate ), trunc ( sysdate ) + 1 );
insert into t values ( trunc ( sysdate ) + 1, trunc ( sysdate ));
insert into t values ( trunc ( sysdate ) + 1, trunc ( sysdate ) + 1 );

set serveroutput off
select /*+ gather_plan_statistics */* from t
where  c1 = trunc ( sysdate );

select * 
from   table(dbms_xplan.display_cursor(null, null, 'ROWSTATS LAST +PARTITION'));

-----------------------------------------------------------------------------------    
| Id  | Operation               | Name | Starts | E-Rows | Pstart| Pstop | A-Rows |    
-----------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT        |      |      1 |        |       |       |      2 |    
|   1 |  PARTITION LIST ITERATOR|      |      1 |      1 |   KEY |   KEY |      2 |    
|*  2 |   TABLE ACCESS FULL     | T    |      2 |      1 |   KEY |   KEY |      2 |    
-----------------------------------------------------------------------------------  

select /*+ gather_plan_statistics */* from t
where  c2 = trunc ( sysdate );

select * 
from   table(dbms_xplan.display_cursor(null, null, 'ROWSTATS LAST +PARTITION'));

---------------------------------------------------------------------------------------    
| Id  | Operation                   | Name | Starts | E-Rows | Pstart| Pstop | A-Rows |    
---------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT            |      |      1 |        |       |       |      2 |    
|   1 |  PARTITION LIST MULTI-COLUMN|      |      1 |      1 |KEY(MC)|KEY(MC)|      2 |    
|*  2 |   TABLE ACCESS FULL         | T    |      5 |      1 |KEY(MC)|KEY(MC)|      2 |    
--------------------------------------------------------------------------------------- 


The first query filters on the leftmost partition column. This only starts 2 full partition scans.

Whereas the next query filters on the second column in the partition key. So this starts 5 full partition scans.

Is there any indexing method?

The usual indexing rules apply.

But... I'm still unclear what your goal is here. Clarifying the business objective you're hoping to achieve with this would help!

and you rated our response

  (1 rating)

Reviews

Trying to clarify my original question.

May 20, 2020 - 5:29 am UTC

Reviewer: A reader

Hi


Sorry about not being able to express myself clearly.

I have been under the impression, that if range partitioning is by three columns (date1, date2, date3) then
if in where clause there is date3 >= trunc(sysdate,'MM') and no reference to columns date1 and date2, partitioning is not very effective.

This thought was implement method where each separate data column could be used independently to query recent data by pruning old partitions away.
Accessed partitions would also contain non relevant rows, but that would be accessed. This method could also be used for rows (with little revision)
for rows which have null values for some dateN values.

Also the goal is to have partitioning method were no human intervention is needed for adding partitions when time passes and when data values change (e.g. new product ).

My question was meant to be about handling of metadata with multicolumn list partitioning. How it handled when determining which partitions are to be accessed? What if there are thousands of partitions ?
'Full table' type scan of metadata ? Indexing ? Does order of columns in multicolumn list partitioning have any effects.


I just would like to try to 'stretch' existing Oracle features.




Chris Saxon

Followup  

May 26, 2020 - 3:57 pm UTC

At 50k partitions you're still well short of the maximum of 2**20 partitions limit.

Some people with colossal databases have gotten close to that limit - notably CERN :)

https://cds.cern.ch/record/2676517/files/ATL-SOFT-SLIDE-2019-228.pdf

I'm not too familiar with the inner working of partitioning, but given the numbers you've quoted I wouldn't be too worried about data dictionary performance.

I don't know of any customers using multi-column list partitioning as you've described. As discussed, partition elimination is likely to be less effective. But it does still happen.

Really you want the leading partition columns to be present in 90%+ of the queries you run. Personally I'd try to (sub)partition on only the 1-2 most common columns.

Ultimately the best way to get confidence is to test, test, and test some more ;)

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database