So as a claim status type changes, what happens to the data, does the data "move" from one sub partition to the other sub partition within a year partition, with updates to the primary and secondary indexes?If you update the partitioning column(s) for a table, then yes, the affected rows will move to another (sub)partition.
You need to enable row movement on the table to be able to do this.
If the partition scheme is reversed, where the Claim Types are used as the primary partitions and the years as the sub partitions, would there be any query performance advantage?If you partition by year, subpartition by claim type, then finding all the rows for a claim type, the database will:
- Iterate through all the year partitions
- Access only the relevant type subpartition within this
If you flip the scheme around, then the query will
- Access only the partition for the claim type
- Read all the year subpartitions in this
To answer which is faster, the best thing to do is test!
Here's partitioning by year, subpartitioning by type:
create table t (
id integer,
ins_year integer,
status varchar2(2),
stuff varchar2(100)
default rpad ( 'stuff', 100, 'f' )
) partition by range ( ins_year )
interval ( 1 )
subpartition by list ( status )
subpartition template (
subpartition sub_bc values ( 'BC' ),
subpartition sub_dc values ( 'DC' ),
subpartition sub_ac values ( 'AC' ),
subpartition sub_rc values ( 'RC' )
) (
partition p2016 values less than ( 2017 )
) enable row movement;
insert into t ( id, ins_year, status )
with rws as (
select level x from dual
connect by level <= 100000
)
select x,
ceil ( x / 25000 ) + 2016,
case mod ( x, 4 )
when 0 then 'BC'
when 1 then 'DC'
when 2 then 'AC'
when 3 then 'RC'
end
from rws;
commit;
set serveroutput off
alter session set statistics_level = all;
select count(*) from t
where status = 'AC';
select *
from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST +PARTITION'));
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Pstart| Pstop | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 1 |00:00:00.01 | 628 |
| 1 | SORT AGGREGATE | | 1 | 1 | | | 1 |00:00:00.01 | 628 |
| 2 | PARTITION RANGE ALL | | 1 | 24962 | 1 |1048575| 25000 |00:00:00.01 | 628 |
| 3 | PARTITION LIST SINGLE| | 5 | 24962 | KEY | KEY | 25000 |00:00:00.01 | 628 |
| 4 | TABLE ACCESS FULL | T | 5 | 24962 | KEY | KEY | 25000 |00:00:00.01 | 628 |
----------------------------------------------------------------------------------------------------------
select count(*) from t
where ins_year = 2018;
select *
from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST +PARTITION'));
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Pstart| Pstop | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 1 |00:00:00.01 | 596 |
| 1 | SORT AGGREGATE | | 1 | 1 | | | 1 |00:00:00.01 | 596 |
| 2 | PARTITION RANGE SINGLE| | 1 | 20455 | 3 | 3 | 25000 |00:00:00.01 | 596 |
| 3 | PARTITION LIST ALL | | 1 | 20455 | 1 | 4 | 25000 |00:00:00.01 | 596 |
|* 4 | TABLE ACCESS FULL | T | 4 | 20455 | 9 | 12 | 25000 |00:00:00.01 | 596 |
----------------------------------------------------------------------------------------------------------
Flipping the partitioning schemes around, loading the same data and running the same queries gives:
drop table t
cascade constraints purge;
create table t (
id integer,
ins_year integer,
status varchar2(2),
stuff varchar2(100)
default rpad ( 'stuff', 100, 'f' )
) partition by list ( status )
subpartition by range ( ins_year )
subpartition template (
subpartition p2016 values less than ( 2017 ),
subpartition p2017 values less than ( 2018 ),
subpartition p2018 values less than ( 2019 ),
subpartition p2019 values less than ( 2020 )
) (
partition sub_bc values ( 'BC' ),
partition sub_dc values ( 'DC' ),
partition sub_ac values ( 'AC' ),
partition sub_rc values ( 'RC' )
) enable row movement;
select count(*) from t
where status = 'AC';
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Pstart| Pstop | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 1 |00:00:00.01 | 628 |
| 1 | SORT AGGREGATE | | 1 | 1 | | | 1 |00:00:00.01 | 628 |
| 2 | PARTITION LIST SINGLE| | 1 | 20455 | KEY | KEY | 25000 |00:00:00.01 | 628 |
| 3 | PARTITION RANGE ALL | | 1 | 20455 | 1 | 4 | 25000 |00:00:00.01 | 628 |
| 4 | TABLE ACCESS FULL | T | 4 | 20455 | | | 25000 |00:00:00.01 | 628 |
---------------------------------------------------------------------------------------------------------
select count(*) from t
where ins_year = 2018;
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Pstart| Pstop | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | | 1 |00:00:00.01 | 660 |
| 1 | SORT AGGREGATE | | 1 | 1 | | | 1 |00:00:00.01 | 660 |
| 2 | PARTITION LIST ALL | | 1 | 28637 | 1 | 4 | 25000 |00:00:00.01 | 660 |
| 3 | PARTITION RANGE SINGLE| | 4 | 28637 | 3 | 3 | 25000 |00:00:00.01 | 660 |
|* 4 | TABLE ACCESS FULL | T | 4 | 28637 | | | 25000 |00:00:00.01 | 660 |
-----------------------------------------------------------------------------------------------------------
So the work done (buffers) is similar for all the queries. Querying on the partition key comes out slightly faster than the subpartition key in both cases, but the difference is small. YMMV
Would the "update" overhead be too much of a trade off?Only you can answer that!
From your description, it sounds to me like you'll have up to three status (and thus partition changing) updates to a row in its life.
I suspect you'll query each row many times more than this. Meaning it's likely you'll get much more benefit from partition elimination speeding up your queries than an extra overhead for updates will cost you. But it really comes down to the details of how exactly and how often you'll be querying and updating.
Test, test, test!
One other important consideration is partition maintenance.
You can only have interval partitioning at the top-level, not subpartitions. So if you choose to make year the subpartition, you'll need to update the subpartition templates regularly to add the new years - or pre-create a huge number.
Just in case this is overlooked one year, you'll also want a maxvalue subpartition to avoid any data loss. Which means splitting rows into subpartitions in the event this is missed.
Whereas if you partition by year, subpartition by type, you can use interval partitioning. So you'll only need to do partition maintenance in the (rare?) event you add more statuses.