Hi Tom,
We are creating a table that will be loaded daily. We want to partition the table by as of date. Is the best practice to create the partition using the partition by list automatic or partition by range using 1 day interval? Does one option give better performance over the other? Are there advantages/disadvantages of one option over the other? Thanks for your help and insight.
CREATE TABLE STG_PRODUCT1
(
AS_OF_DATE DATE,
PRODUCT_ID NUMBER(14),
TOTAL_BALANCE NUMBER(22,3)
)
PARTITION BY LIST (AS_OF_DATE) AUTOMATIC
(
PARTITION P_AS_OF_DATE VALUES (TO_DATE('31-MAR-2021'))
);
CREATE UNIQUE INDEX STG_PRODUCT1_IX ON STG_PRODUCT1
(AS_OF_DATE, PRODUCT_ID);
CREATE TABLE STG_PRODUCT2
(
AS_OF_DATE DATE,
PRODUCT_ID NUMBER(14),
TOTAL_BALANCE NUMBER(22,3)
)
PARTITION BY RANGE (AS_OF_DATE) INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(PARTITION P_AS_OF_DATE VALUES LESS THAN (TO_DATE(' 2022-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')));
CREATE UNIQUE INDEX STG_PRODUCT2_IX ON STG_PRODUCT2
(AS_OF_DATE, PRODUCT_ID);
List partitioning by date is generally a bad idea.
This is because the database needs a partition for each unique value. Dates in Oracle Database always include the time, so you could have a partition for each second. That's up to 86,400 partitions
every day!
Comparing loading a row for each hour over a couple of days into the list and range partitioned tables:
create table stg_product1 (
as_of_date date,
product_id number(14),
total_balance number(22,3)
) partition by list (as_of_date) automatic (
partition p_as_of_date values (to_date('31-mar-2022', 'dd-mon-yyyy'))
);
create table stg_product2 (
as_of_date date,
product_id number(14),
total_balance number(22,3)
) partition by range (as_of_date) interval (numtodsinterval(1, 'day')) (
partition p_as_of_date values less than (
to_date(' 2022-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss', 'nls_calendar=gregorian'))
);
insert into STG_PRODUCT1
with rws as (
select date'2022-03-31' + ( level / 24 ) dt
from dual
connect by level <= 48
)
select dt, 1, 1 from rws;
insert into STG_PRODUCT2
with rws as (
select date'2022-03-31' + ( level / 24 ) dt
from dual
connect by level <= 48
)
select dt, 1, 1 from rws;
select table_name, count(*)
from user_tab_partitions
where table_name like 'STG_P%'
group by table_name;
TABLE_NAME COUNT(*)
------------ ----------
STG_PRODUCT1 49
STG_PRODUCT2 3
There are 49 partitions for the list partitioned table vs just 3 for the range partitioned table.
Range partitions are also easier to manage if you want to change the granularity. For example, go to monthly or hourly partitions. With list partitioning you have to state all the values to go in each. Using range partitioning all you need is the upper bound and interval.
List partitioning is best suited to cases when either:
- There is a relatively small number of values, with new values being relatively rare. Think things like countries, currencies, status codes, etc.
- You want to split out a small number of popular values into their own partitions; e.g. you want to put the 10 customers with the most rows each into their own partition. All rows for the remaining customers should go in the same (default) partition