
Question and Answer

Connor McDonald

Thanks for the question, Alex.

Asked: January 10, 2018 - 8:05 pm UTC

Answered by: Connor McDonald - Last updated: January 23, 2019 - 3:51 am UTC

Category: Database Development - Version: 11.2.0.1.0


You Asked

The new 11g Reference Partitioning option requires a foreign key on the child table, which requires a unique key on the parent table.
To create a unique key on a partitioned table, you either need to include the partitioning key to get a local index, or end up with a global index.
Including the partitioning key on the foreign key is what this option tries to avoid (column duplication). Ending up with a global index is really not practical with big tables.
Is this option then practically limited to small tables?

and we said...

"Ending up with a global index is really not practical with big tables"

Why do you say that? I know of a site running a partitioned table with a global index that has billions of rows.

There are benefits to global indexes, and there are drawbacks. And equally, there are benefits to local indexes, and there are drawbacks.

Avoiding column duplication is not the only motivator here. Reference partitioning enforces strict equi-partitioning between parent and child tables, which is critical for getting the performance benefits of partition-wise joins.

Don't get me wrong - sometimes, to get the best of both worlds, applications need to make some physical design compromises (for example, adding an extra column to the primary key to allow a local index).

One thing I will say is that if you are seriously considering reference partitioning, I would be using 12c, where a number of improvements were made in this area.
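
For readers new to the feature, here is a minimal reference-partitioning sketch (table and column names are made up). Note that the parent's primary key here is enforced by a nonpartitioned (effectively global) index, which is exactly the trade-off the question raises:

create table orders
( order_id   number primary key
, order_date date not null
)
partition by range (order_date)
( partition p2017 values less than (date '2018-01-01')
, partition p2018 values less than (date '2019-01-01')
);

create table order_items
( order_id number not null   -- the FK columns must be NOT NULL for reference partitioning
, item     varchar2(30)
, constraint oi_fk foreign key (order_id) references orders (order_id)
)
partition by reference (oi_fk);   -- inherits the parent's partitioning; no order_date column needed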


Reviews

12c improvements

January 11, 2018 - 7:17 am UTC

Reviewer: Rajeshwaran, Jeyabal

"...if you are seriously considering reference partitioning, I would be using 12c, where a number of improvements were made in this area..."

We know about the partitioning improvements in 12c (like read-only partitions, reference partitioning on top of interval partitioning, partial indexes for partitioned tables, multi-column list partitioning, auto-list partitioning, filtered partition maintenance operations, partitioning for external tables, and CREATE TABLE FOR EXCHANGE with partitioned tables).

Could you provide some more insight into the reference partitioning improvements in Oracle 12c?
Connor McDonald

Followup  

January 12, 2018 - 12:13 am UTC

From the 12c new features section in the VLDB/Partitioning manual:


Interval-Reference Partitioning
This feature enables a reference-partitioned table to use interval partitioning as its top-level partitioning strategy, which allows better partition modeling. Interval-partitioned tables can be used as parent tables for reference partitioning. Partitions in the reference-partitioned table corresponding to interval partitions in the parent table are created upon insert into the reference-partitioned table.

Cascade Functionality for TRUNCATE PARTITION and EXCHANGE PARTITION
This feature provides a CASCADE option for TRUNCATE PARTITION and EXCHANGE PARTITION operations for reference and interval-reference partitioned tables, which cascades the operation to the reference-partitioned child tables. This enables simplified application development by letting child tables inherit partition maintenance operations from the parent. The cascade options are off by default, so they do not affect compatibility.
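
A minimal sketch of both features together (12c syntax; table names are made up):

create table orders
( order_id   number primary key
, order_date date not null
)
partition by range (order_date)
interval (numtoyminterval(1, 'MONTH'))      -- 12c: an interval parent is now allowed
( partition p0 values less than (date '2018-01-01') );

create table order_items
( order_id number not null
, item     varchar2(30)
, constraint oi_fk foreign key (order_id) references orders (order_id)
)
partition by reference (oi_fk);             -- child partitions are created as rows arrive

-- 12c: cascade a maintenance operation from parent to child in one statement
alter table orders truncate partition p0 cascade;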

March 08, 2018 - 4:36 pm UTC

Reviewer: Alex A

Hello everyone.

We have a very large table we'd like to partition so we can move older partitions to cheaper storage. Unfortunately, it doesn't have a nice natural key to use; it has only two columns and 17 billion rows. There is another table that could potentially be used to grab date information, if we were to add a column.

Questions:
1) Is this a candidate for reference partitioning?
2) Can any type of partitioning move partitions to different storage tiers or is that just range?
3) How would you approach this?

Thanks. In case you couldn't tell, I'm inexperienced when it comes to partitioning (thanks to the extra license cost).
Connor McDonald

Followup  

March 09, 2018 - 2:57 am UTC

Hi, can you tell us a bit more about the table? Does it have a key at all? Can you share the DDL with us and tell us about each column?

March 09, 2018 - 4:35 pm UTC

Reviewer: A reader

Here's the DDL

CREATE TABLE LIST_HISTORY
(
  LIST_ID NUMBER(38, 0) NOT NULL 
, ITEM_VALUE VARCHAR2(256 CHAR) NOT NULL 
, CONSTRAINT LIST_PK_IOT PRIMARY KEY 
  (
    ITEM_VALUE 
  , LIST_ID 
  )
  ENABLE 
)


The data looks like this

LIST_ID ITEM_VALUE
41143   0.0.0.0/0

Connor McDonald

Followup  

March 11, 2018 - 1:49 am UTC

OK, a lot of this will come down to what business requirements you want.

Reference partitioning is fine, but you will be making a hard/permanent link between the two tables. So some things to keep in mind:

- If in the past these tables have been managed in different ways, then you might need to change some business processes

- There is a small but measurable cost: you will now have a foreign key back to the table containing the date, so every insert must validate it and the population cost goes up slightly.

So you need to weigh that against the alternative of simply adding the date column to the LIST_HISTORY table and partitioning it standalone, sketched below.
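
The standalone alternative might look like this (a sketch; DATE_SOURCE is a placeholder for whatever table holds the dates):

-- add a date column to LIST_HISTORY and backfill it from the other table
alter table list_history add ( create_date date );

update list_history lh
set    lh.create_date = ( select d.create_date
                          from   date_source d
                          where  d.list_id = lh.list_id );
-- (backfilling 17 billion rows is itself a big operation; batching may be needed)

-- then range partition LIST_HISTORY on create_date, e.g. via online
-- redefinition or a create-table-as-select plus rename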

March 12, 2018 - 3:54 pm UTC

Reviewer: Alex A

Thanks Connor.

Are we able to store old partitions on cheaper storage with the reference approach? Also, is there a reason you didn't suggest range partitioning on the LIST_ID? I'm guessing there are some limitations because of the compound pk?
Connor McDonald

Followup  

March 13, 2018 - 2:20 am UTC

"Also, is there a reason you didn't suggest range partitioning on the LIST_ID?"

Because there's no information in the question that suggests the value of LIST_ID is related to date, so the definition of "old" is indeterminate. From the information *in the question*, LIST_ID=1 could be the most recent, or it could be the oldest, or anywhere in between.

"Are we able to store old partitions on cheaper storage with the reference approach?"

Each partition is its own segment, which means it can be put on a tablespace of your choosing, hence the option for cheaper storage.
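
Moving an aged partition to cheaper storage is then just a segment operation, along these lines (a sketch; the tablespace and index names are made up, and local index partitions go UNUSABLE on a move, hence the rebuild):

alter table list_history move partition p2015 tablespace cheap_ts;

alter index list_history_ix rebuild partition p2015 tablespace cheap_ts;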

April 11, 2018 - 6:45 pm UTC

Reviewer: Alex A

We have determined that it would not be a good idea for us to partition on the date column because of how the application works. By default it loads all previous versions of files, and would therefore need to scan all the partitions anyway. The development team instead recommended an ID column to range partition on. My question to you guys is: how can I determine what range size to use for such a column? For example, the data starts at 2200 and has a max value of around 53000000. The data is pretty evenly distributed.

TIA
Connor McDonald

Followup  

April 12, 2018 - 1:47 am UTC

When partitioning by a range of numbers, you need a way of determining the granularity you are after, and that is governed by business and technical requirements. I'll explain with two examples from past client work:

Example 1: Chronology.

We observed that approx 2000m sequence numbers equated to about 12 months of data. So we opted for range sizes of about 500m to give us "3 month" sized partitions.

Example 2: Maintenance

For reasons of file management, backups, etc, we wanted partition sizes of around 16G; larger than that introduced some pain points for this particular client. So we did some analysis on what numeric range yielded approximately 16G and range partitioned along those lines. We also added monitoring: if we saw a partition growing abnormally fast, meaning our range of (say) 10 million was too large, we'd split the partition at a numeric point NOT yet in use, ie, slightly larger than the current sequence high-water mark, and start a fresh partition.

April 12, 2018 - 9:09 pm UTC

Reviewer: Alex A

Our requirements are a bit thin. We're basically looking to partition it because it's a large table in terms of size (GB), it's frequently accessed, and we want to improve query performance against it. In terms of rows it's not large, about 2 million, but there are two XML CLOB columns consuming most of the space. After thinking about your last reply, I think it makes sense to organize the partitions so they break the rows up to distribute the storage consumption evenly. The problem is I don't know of a way to do that.
Connor McDonald

Followup  

April 13, 2018 - 2:53 am UTC

That brings me back to my previous comment:

"We also added monitoring: if we saw a partition growing abnormally fast, meaning our range of (say) 10 million was too large, we'd split the partition at a numeric point NOT yet in use, ie, slightly larger than the current sequence high-water mark, and start a fresh partition."


Start off with a single partition plus a maxvalue partition, eg

partition by range ( pk_seq )
( partition p000 values less than (0)
, partition pmax values less than (maxvalue)
)

Each (say) night, check the size of pmax. If it is less than your limit (say 20G) then do nothing. If it *does* exceed your limit, then:

find the current max value of pk_seq (let's say it is 12000)
split pmax into p001 (values less than 12100) and a new empty pmax (values less than maxvalue)

Now p001 is capped at roughly 20G and pmax is empty.

Repeat each night into p002, p003 etc
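
In SQL, the nightly check and split might look like this (a sketch; the table name T, the 20G limit and the values are assumptions):

-- how big is pmax right now?
select bytes/1024/1024/1024 as gb
from   user_segments
where  segment_name = 'T'
and    partition_name = 'PMAX';

-- if over the limit: split just above the current max(pk_seq), say 12000
alter table t split partition pmax at (12100)
  into ( partition p001, partition pmax );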

April 17, 2018 - 3:31 pm UTC

Reviewer: A reader

Thanks for walking me through this, Connor. But I'm still not quite sure I understand. Did you do this in production?

I'm looking to do a POC in a DEV environment just so I can see if this is worth it. So this table won't be growing at all.
Connor McDonald

Followup  

April 18, 2018 - 2:09 am UTC

"Did you do this in production?"

Yes.

"I'm looking to do a POC in a DEV environment just so I can see if this is worth it. So this table won't be growing at all."

In that case, use NUM_ROWS/AVG_ROW_LEN from the table statistics to get a rough idea of how many rows equates to your desired partition size, and then choose partition ranges accordingly.
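
A rough sizing query along those lines (a sketch; a 20G partition target is assumed). One caveat: AVG_ROW_LEN does not count out-of-line LOB data, so for a CLOB-heavy table like yours, also factor in the LOB segment sizes from USER_SEGMENTS:

select num_rows,
       avg_row_len,
       round( 20 * power(1024,3) / avg_row_len ) as rows_per_20g_partition
from   user_tables
where  table_name = 'LIST_HISTORY';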

Didn't get it

April 18, 2018 - 9:45 am UTC

Reviewer: Gh

Reading the posts, I didn't catch what partitioning has to do with the CLOBs. It's said the table has around 2 million rows, so what does partitioning it have to do with segment size? Query performance is related to the number of rows fetched, ie, selectivity. On the other hand, it is worth knowing the storage properties of your LOBs: are they BasicFiles or SecureFiles? LOBs are accessed via pointers.

global indexes vs local with reference partitions

October 05, 2018 - 6:03 pm UTC

Reviewer: Andy from Albany, NY USA

You mention drawbacks to the global indexes, but no specifics. The one issue I am seeing with them is that when we drop a partition (outdated data), the index becomes unusable and must be rebuilt. This creates additional time requirements and overhead for maintenance, whereas a locally indexed partition does not. So, do I understand correctly that I would need those additional columns (based on the parent PK) in order to make use of the local indexing scheme?

Thanks.
Connor McDonald

Followup  

October 06, 2018 - 6:14 am UTC

A couple of things:

a) they can be big... not just because they span the whole table, but because a global index must store entries pointing to more than a single segment. Each rowid in a global index leaf block is 10 bytes, whereas a local index entry stores a 6-byte rowid. That can be significant depending on the size of the key.

b) maintenance (as you've already described)
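
And yes, you understand correctly: to enforce uniqueness with a local index, the partitioning key must be part of the index key. A quick sketch (names made up):

create table t
( id      number
, created date not null
)
partition by range (created)
( partition p1 values less than (date '2019-01-01') );

create unique index t_uk on t (id) local;
-- fails with ORA-14039: partitioning columns must form a subset of
-- key columns of a UNIQUE index

create unique index t_uk on t (id, created) local;   -- succeeds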


Asynchronous Global index maintenance

October 08, 2018 - 6:15 am UTC

Reviewer: Rajeshwaran, Jeyabal

....
You mention drawbacks to the global indexes, but, no specifics. The one issue I am seeing with them is that when we drop a partition (out-dated data) the index becomes unusable and must be rebuilt. This creates additional time requirements and overhead for maintenance,
....


But that improved in 12c with the "asynchronous global index maintenance" feature:

https://docs.oracle.com/database/121/NEWFT/chapter12101.htm#NEWFT198
Connor McDonald

Followup  

October 10, 2018 - 1:33 am UTC

It did, but don't forget - we are not *reducing* that work, we are simply deferring it to another time of day. The index entries still need to get cleaned up - we just do it later.
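
For example (a sketch; the schema and table names are made up):

-- 12c: the drop returns quickly and the global index stays USABLE,
-- with the orphaned entries merely flagged for later cleanup
alter table sales drop partition p_2014 update global indexes;

-- the deferred cleanup still has to run, either via the scheduled
-- SYS.PMO_DEFERRED_GIDX_MAINT_JOB or manually:
exec dbms_part.cleanup_gidx('SALES_OWNER', 'SALES')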

January 08, 2019 - 10:52 pm UTC

Reviewer: Alex

Hi team,

I have a problem with a table we're trying to partition that doesn't have any natural partitioning key. It's about 20 billion rows and over 1TB. We are mostly interested in the ILM aspect of partitioning because it does not get queried much.

We found a table that we can use reference partitioning with, but unfortunately it seems the date column is defined in a way that won't allow partitioning because we get this error:

SQL> CREATE TABLE list_history
(
    ID NUMBER(38,0) NOT NULL,
    LIST_ID NUMBER(38,0) NOT NULL,
    SYNC_ID NUMBER(38,0) NOT NULL,
    CREATED_BY VARCHAR2(100 CHAR) NOT NULL,
    CREATE_DATE TIMESTAMP (6) WITH TIME ZONE DEFAULT SYSDATE NOT NULL ENABLE,
    PRIMARY KEY (ID),
    CONSTRAINT UNIQ_LIST_SYNC UNIQUE (LIST_ID, SYNC_ID),
    CONSTRAINT FK_LIST_HISTORY FOREIGN KEY (LIST_ID)
      REFERENCES NETWORK_LIST (ID)
)
PARTITION BY RANGE (create_date)
INTERVAL(NUMTOYMINTERVAL(1, 'YEAR'))
(
  PARTITION part_nl2015 VALUES LESS THAN (TO_DATE('01-01-2016', 'DD-MM-YYYY')),
  PARTITION part_nl2016 VALUES LESS THAN (TO_DATE('01-01-2017', 'DD-MM-YYYY')),
  PARTITION part_nl2017 VALUES LESS THAN (TO_DATE('01-01-2018', 'DD-MM-YYYY'))
);
PARTITION BY RANGE (create_date)
                    *
ERROR at line 13:
ORA-03001: unimplemented feature


The referencing table would look like this:

CREATE TABLE LIST_HISTORY_ITM
( 
    LIST_ID NUMBER(38,0) NOT NULL, 
 ITEM_VALUE VARCHAR2(256 CHAR) NOT NULL, 
   PRIMARY KEY (LIST_ID, ITEM_VALUE),
   CONSTRAINT FK_LIST_HISTORY_ITM FOREIGN KEY (LIST_ID)
     REFERENCES LIST_HISTORY (ID)
)
PARTITION BY REFERENCE(FK_LIST_HISTORY_ITM); 


Do you have any ideas how we can solve this?

Thank you.
Chris Saxon

Followup  

January 09, 2019 - 2:19 pm UTC

You can't use columns with the datatype timestamp with time zone as a partition key.

The only supported types are:

CHAR, NCHAR, VARCHAR2, NVARCHAR2, VARCHAR, NUMBER, FLOAT, DATE, TIMESTAMP, TIMESTAMP WITH LOCAL TIME ZONE, or RAW.

So if you want to partition by this, you'll have to change the data type.

Or you could create a virtual column, casting create_date to a supported data type. e.g.:

CREATE TABLE list_history
(
  ID NUMBER(38,0) NOT NULL,
  LIST_ID NUMBER(38,0) NOT NULL,
  SYNC_ID NUMBER(38,0) NOT NULL,
  CREATED_BY VARCHAR2(100 CHAR) NOT NULL,
  CREATE_DATE TIMESTAMP (6) WITH TIME ZONE DEFAULT SYSDATE NOT NULL ENABLE,
  CREATE_DATE_VC DATE GENERATED ALWAYS AS ( 
    CAST ( CREATE_DATE AS DATE )
  ),
  PRIMARY KEY (ID),
  CONSTRAINT UNIQ_LIST_SYNC UNIQUE (LIST_ID, SYNC_ID)
) PARTITION BY RANGE (create_date_vc) INTERVAL(NUMTOYMINTERVAL(1, 'YEAR')) (
 PARTITION part_nl2015 VALUES LESS THAN (TO_DATE('01-01-2016', 'DD-MM-YYYY')),
 PARTITION part_nl2016 VALUES LESS THAN (TO_DATE('01-01-2017', 'DD-MM-YYYY')),
 PARTITION part_nl2017 VALUES LESS THAN (TO_DATE('01-01-2018', 'DD-MM-YYYY'))
);  
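
You can then sanity-check that rows land where you expect with something like:

select partition_name, high_value
from   user_tab_partitions
where  table_name = 'LIST_HISTORY'
order  by partition_position;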

January 15, 2019 - 9:04 pm UTC

Reviewer: Alex

Thanks Chris. I went with the vc option and it seems to work ok. I created it and ran dbms_redefinition to move the original table to the partitioned version.

I tried doing the same thing for the reference-partitioned table, and dbms_redefinition runs to completion, but afterwards the table has no rows in it. Do you guys have any idea what the problem could be here? There's nothing for me to go on.

SQL> set timing on
SQL>
SQL> exec dbms_redefinition.start_redef_table('API', 'LIST_HISTORY_ITM', 'LIST_HISTORY_ITM_P');

PL/SQL procedure successfully completed.

Elapsed: 11:57:06.17
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00

SQL> select count(*) from API.LIST_HISTORY_ITM_P;

  COUNT(*)
----------
         0

SQL> select count(*) from API.LIST_HISTORY_ITM;

  COUNT(*)
----------
2.0894E+10


Chris Saxon

Followup  

January 18, 2019 - 2:41 pm UTC

What's the complete list of commands you ran up to this point? Can you make a simple demo we can replicate?
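
For reference, a complete online redefinition normally looks roughly like this (a sketch using the names above; index/constraint handling may differ):

declare
  l_errors pls_integer;
begin
  dbms_redefinition.can_redef_table('API', 'LIST_HISTORY_ITM');
  dbms_redefinition.start_redef_table('API', 'LIST_HISTORY_ITM', 'LIST_HISTORY_ITM_P');
  dbms_redefinition.copy_table_dependents(
      'API', 'LIST_HISTORY_ITM', 'LIST_HISTORY_ITM_P',
      num_errors => l_errors );
  dbms_redefinition.finish_redef_table('API', 'LIST_HISTORY_ITM', 'LIST_HISTORY_ITM_P');
end;
/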

January 22, 2019 - 5:07 pm UTC

Reviewer: Alex

Hi,

That is all I am running. I suspected it was space-related and that the error maybe wasn't being propagated for whatever reason, so I tried an insert:

SQL> insert into api.LIST_HISTORY_ITM_P select * from api.LIST_HISTORY_ITM;

insert into api.LIST_HISTORY_ITM_P select * from api.LIST_HISTORY_ITM
                   *
ERROR at line 1:
ORA-01628: max # extents (32765) reached for rollback segment
_SYSSMU10_1172599362$


I don't understand this. I haven't seen a rollback segment error in ages.

SQL> show parameter undo

NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
temp_undo_enabled                    boolean                          FALSE
undo_management                      string                           AUTO
undo_retention                       integer                          2500000
undo_tablespace                      string                           UNDOTBS1


UNDO tablespace is 200GB. This database is not being used by anyone but me.
Connor McDonald

Followup  

January 23, 2019 - 3:51 am UTC

Yeah, ORA-01628 should not happen with automatic undo (or should happen only in extreme circumstances).

MOS note 1580182.1 contains some potential causes and solutions, but there have also been bugs in this area in pre-12c releases, so I'd advise you to work with Support here.
