

Question and Answer

Chris Saxon

Thanks for the question.

Asked: June 12, 2019 - 11:34 am UTC

Last updated: January 14, 2020 - 3:16 pm UTC


Viewed 1000+ times

You Asked

I have a huge table with two LOB columns, one CLOB and one BLOB. The total size of the table is 300 GB (80 GB table + 220 GB LOB segments).
We have a requirement to keep only the latest 3 years of data. What would be the best way to achieve this?

My solution:
1. Disable constraints/triggers/indexes
2. Create backup table for 3 years (via CTAS)
3. Truncate main table
4. Insert data into main table from backup table (via bulk collect)
5. Enable constraints/triggers/indexes

Kindly comment if this can be done in a much better way, as my solution is taking a very long time.


and we said...

There's no need for steps 1, 3, 4, & 5.

Instead, do:

1. Create backup table for 3 years (via CTAS)
2. Create constraints/indexes on this table
3. Drop the main table
4. Rename the backup table to the main table

This saves you doing the (I'm guessing very slow) insert.

Plus with this method you can use online table redefinition. This helps eliminate/reduce downtime. And it includes processes to copy the indexes etc. over to the new table.
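As a rough sketch of those four steps (table, column, and constraint names are all placeholders):

```sql
-- Hypothetical names: big_tab is the original table, created_date the
-- column driving the 3-year retention rule.
create table big_tab_keep
  nologging
  as
  select *
  from   big_tab
  where  created_date >= add_months ( sysdate, -36 );

-- Recreate constraints/indexes on the new table
alter table big_tab_keep
  add constraint big_tab_keep_pk primary key ( id );

-- The swap itself is instant dictionary work
drop table big_tab;
rename big_tab_keep to big_tab;
```

There is no slow insert anywhere in this flow; the only data movement is the one CTAS write.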




Why not mention interval partitioning

Joseph, June 12, 2019 - 3:31 pm UTC

I would have expected a mention of Interval Partitioning in response to this question.
Chris Saxon
June 12, 2019 - 4:08 pm UTC

Yes, some form of partitioning would make the process much easier in the future.
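For example, an interval-partitioned version of the table might look like this (all names and the date column are hypothetical):

```sql
-- Oracle creates a new monthly partition automatically as data arrives.
create table big_tab (
  id           integer,
  created_date date,
  doc          clob
)
partition by range ( created_date )
interval ( numtoyminterval ( 1, 'MONTH' ) ) (
  partition p0 values less than ( date'2016-01-01' )
);

-- Archiving a month is then a near-instant dictionary operation:
alter table big_tab
  drop partition for ( date'2016-06-15' )
  update global indexes;
```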

Harry, June 16, 2019 - 6:20 am UTC

The solution you have provided is not useful in my case, as these are important tables in our application and it is risky to drop them. They also have SYS indexes created on them.
In my solution the insert operation is taking a long time, which is what we need to solve. I have tried insert with the append hint, but that doesn't help either, as direct-path loading does not work with LOB columns. Could you please suggest a better solution?
Chris Saxon
June 17, 2019 - 8:32 am UTC

And truncating the table isn't risky?!

As the previous review suggests, the fastest, easiest way to archive old data ongoing is to partition the table. Then you can drop/truncate old partitions.

You can use dbms_redefinition to do this online (though not to delete the old rows as I previously suggested). Then archive off the old partitions when this is complete.

Both old and new tables still exist when the redefinition finishes. So you can manually remove whichever you don't want when the process is done.
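A minimal sketch of that online redefinition flow, assuming an interim table BIG_TAB_PART has already been created with the desired partitioned structure (all names are placeholders):

```sql
begin
  -- check the table is a candidate for redefinition first
  dbms_redefinition.can_redef_table ( user, 'BIG_TAB' );

  dbms_redefinition.start_redef_table (
    uname      => user,
    orig_table => 'BIG_TAB',
    int_table  => 'BIG_TAB_PART'
  );
end;
/

declare
  l_errors pls_integer;
begin
  -- copy indexes, triggers, constraints, and grants to the interim table
  dbms_redefinition.copy_table_dependents (
    uname      => user,
    orig_table => 'BIG_TAB',
    int_table  => 'BIG_TAB_PART',
    num_errors => l_errors
  );

  -- swap the tables; only a brief lock is needed at this point
  dbms_redefinition.finish_redef_table ( user, 'BIG_TAB', 'BIG_TAB_PART' );
end;
/
```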

Also they have sys indexes created on it.

Whatever you do, this seems like a good opportunity to move these indexes out of sys.

A reader, June 18, 2019 - 6:58 am UTC

Sorry for not highlighting this before: we can't use partitioning on that table as we don't have a partition key. We use another table as the reference for the 3 years of data.
Chris Saxon
June 18, 2019 - 10:08 am UTC

We are using another table to take reference of 3 years of data.

Can you use reference partitioning?
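With reference partitioning the child inherits its partitioning scheme from the parent through the foreign key, so the child needs no partition key column of its own. A sketch, assuming the "another table" holds the dates (all names are hypothetical):

```sql
create table ref_dates (
  id     integer primary key,
  ref_dt date
)
partition by range ( ref_dt ) (
  partition p2016 values less than ( date'2017-01-01' ),
  partition p2017 values less than ( date'2018-01-01' )
);

-- The FK must be NOT NULL for reference partitioning
create table big_tab (
  id     integer primary key,
  ref_id integer not null,
  doc    clob,
  constraint big_tab_ref_fk foreign key ( ref_id )
    references ref_dates ( id )
)
partition by reference ( big_tab_ref_fk );
```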

A reader, June 18, 2019 - 7:02 am UTC

Can we use dbms_parallel_execute here in step 4 in my solution for better performance ? What could be best and fast way to insert data in LOB columns ?
Chris Saxon
June 18, 2019 - 10:16 am UTC

Using parallel dml may help - there's no need for dbms_parallel_execute:

alter session enable parallel dml;
insert /*+ parallel */ into ...
  select /*+ parallel */ ...

But I repeat - using create-table-as-select is likely to be the fastest way to complete this operation. You could parallelize this too if you want.

If dropping the table at step three is too scary for you, instead:

1. Create backup table for 3 years (via CTAS)
2. Create constraints/indexes on this table
3. Rename the main table to a temporary name
4. Rename the backup table to the main table

Then, once you're happy everything's worked, drop the original table.

This is also way less risky than truncating the main table if you ask me!
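The switch itself is just two renames, something like (names are placeholders):

```sql
rename t1 to t1_old;
rename t1_backup to t1;

-- Rollback, if needed, is just as quick:
--   rename t1 to t1_backup;
--   rename t1_old to t1;

-- And once you're satisfied everything works:
--   drop table t1_old purge;
```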

A reader, June 18, 2019 - 12:19 pm UTC

Thanks I will try your suggestions and let you know.

A reader, June 19, 2019 - 2:49 pm UTC

I am facing performance issue now when I do CTAS (step 1) as sample code below :

create table t1_backup nologging parallel as
select /*+ parallel(4) */ * from t1 where id in (
select id from t2 where daterange between date1 and date2);

Volume of select data - 95 million with LOB columns

Time to complete - 100 minutes

Is there any way to improve the performance ?
Chris Saxon
June 19, 2019 - 3:27 pm UTC

Trace your session to find out what's going on:

exec DBMS_monitor.session_trace_enable ( null, null, true, true );

create table ...
  select ...

exec DBMS_monitor.session_trace_disable;

Then review the formatted trace file (using tkprof or similar) to find the execution plan for the statement & see what it's doing.

If you're still struggling, paste your findings (the execution plan for the CTAS along with wait information) in a review and we'll see how we can help.
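If you're unsure which trace file to format, you can ask the database (11g and later):

```sql
-- Location of the trace file for the current session
select value
from   v$diag_info
where  name = 'Default Trace File';
```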

If you want to know more about tracing, see the SQL trace and TKPROF documentation.

How about this?

Surren, June 20, 2019 - 4:40 am UTC

I never thought I'd come out of the woodwork after so long!
Create a synonym with the same name as the original table, point it to the original table, then use this to flip between your two tables. This way you'll have table independence and can use many methods to move your data around. Are your SYS$ indexes on your CLOB/BLOB storage or on constraint columns? As a tip, always name all your own objects.
Chris Saxon
June 20, 2019 - 11:03 am UTC

The synonym's an interesting idea - you just have to be sure everyone uses it!

Also, try this....

Surren, June 20, 2019 - 4:53 am UTC

Also, check your PQ statement. A PQ pipeline is best maintained end-to-end with the same degree of parallelism. For example, your IN clause has no PQ hint, so I'm not sure it will parallelise. Also consider reforming your selection as a join rather than the inlist iterator.
Chris Saxon
June 20, 2019 - 11:04 am UTC

It's possible the optimizer has already transformed the IN; we need to see what the execution plan says!

A reader, June 21, 2019 - 2:56 am UTC

Thanks for your above suggestions. I am really getting some knowledge here.
The sad part is I have very limited access to our database environment, so I can't trace the session.
I didn't get a clear understanding of how the synonym will work here?

Apart from that I am facing performance issue when inserting data from backup table to main table as per steps 4
in my solution via parallel hint as sample code below :

insert /*+ parallel(4) */ into t1 select /*+ parallel(4) */ * from t1_backup;

It is taking around 16 hours to insert 80 million rows.

Please suggest how to improve the performance?
Chris Saxon
June 21, 2019 - 9:49 am UTC

Let's step back.

How long do each of steps 2, 3, & 4 take?

Assuming step 4 takes a significant amount of time, why are you still trying to insert data back into the original table instead of just switching them over?

I didn't get clear understanding how synonym will work here ?

The idea is you create the synonym:

create synonym tab for main_table;

And change your code to use the synonym (or rename the main table, and name the synonym with the original table's name).

Then after your CTAS for the backup, you can
create or replace synonym tab for backup_table;

And voila, your tables are swapped over!

A reader, June 21, 2019 - 3:14 am UTC

A little correction: 16 hours to complete steps 2, 3 and 4.
I made a change to the truncate like below:
truncate table t1 reuse storage;
Does it have any impact on performance?

A reader, June 24, 2019 - 3:44 pm UTC

Please find below the approximate time taken for each step in my solution:

Step 1 : 1 sec
Step 2 : 100 mins
Step 3 : 1 sec
Step 4 : 8 hrs 30 mins ( via Bulk collect limit 50000)
Step 5 : 30 mins

We can't swap the main table and backup table, as I mentioned earlier: the main table is used
everywhere and in some places we don't have access. Also we can't risk hampering production stability.
Also we have SYS indexes on the LOB columns of the main table.

Please advise how I can improve performance for a heavy table with LOB columns.
Chris Saxon
June 25, 2019 - 7:46 am UTC

Remember: the absolute fastest way to do something is not to do it at all!

Renaming tables is an instant operation. No matter how much you tune your insert, you're not going to get anywhere near this fast.

And if you discover you made a mistake in the switch, rollback is fast and easy too:

Just rename the tables back again!

(assuming you discover this before going live with real data!)

We can't swap between main table and backup table as I mentioned earlier main table is used everywhere and someplaces we don't have access

Yes, switching is risky. But the whole process is risky!

The current process has you truncating the main table. Which destroys your rollback position!

For me this is a far greater risk than switching the tables over.

As long as you've copied all the indexes, constraints, triggers, and grants from the original table to the new one, everything should just work.

create table t1 as 
  select level c1, 'T1' c2 from dual
  connect by level <= 8;
create table t2 as 
  select c1, 'T2' c2 from t1
  where  c1 <= 4;

create or replace procedure p as
begin
  for rws in ( 
    select * from t1
  ) loop
    dbms_output.put_line ( rws.c2 || ' ' || rws.c1 );
  end loop;
end p;
/

exec p;

T1 1
T1 2
T1 3
T1 4
T1 5
T1 6
T1 7
T1 8

rename t1 to told;
rename t2 to t1;

exec p;

T2 1
T2 2
T2 3
T2 4

But if you must persist with the current approach (which I strongly discourage)...

Step 4 : 8 hrs 30 mins ( via Bulk collect limit 50000)

Why are you using bulk collect? Why not just:

insert into main
  select * from backup

A reader, June 25, 2019 - 2:22 pm UTC

We have 1 billion rows to be inserted into the main table from the backup table.
If we do that it will consume more undo and temp segments, as we are inserting
into a table with LOB columns. This is my understanding.
Please let me know if my understanding is wrong?
Chris Saxon
June 25, 2019 - 3:43 pm UTC

If it's a straightforward insert into select * from, you shouldn't need much temp.

If you're committing between bulk collect/forall batches, then you'll increase the resources used!

If not, I'd expect minimal difference.

But I reiterate: DON'T DO THE INSERT! Just switch the tables over. The time taken, undo and temp used, and any other metric you can think of will be essentially zero!

without extra effort

siva, June 25, 2019 - 5:00 pm UTC

Hi,

What about using the below method?

create table t (
  c1 int
);

insert into t values ( 1 );
insert into t values ( 2 );

select * from t;

alter table t
  move including rows where c1 > 1; -- put the rows you want to keep in the where clause

select * from t;
Chris Saxon
June 25, 2019 - 5:18 pm UTC

They're on 11.2. Filtered move is a 12.2 operation :(

A reader, June 26, 2019 - 1:15 pm UTC

I am getting a tablespace issue, ORA-30036: unable to extend segment by 4 in undo tablespace, when doing this directly. As I mentioned, we are inserting a billion rows.
Chris Saxon
June 26, 2019 - 4:03 pm UTC

OK, here's a different approach. If switching the tables over is "too risky", how about partition exchange?

If you create the backup table with a single partition, you can use partition exchange to swap the data out:

create table tmain as 
  select level c1 from dual
  connect by level <= 100;
create table tbackup ( c1 ) 
partition by list ( c1 ) ( 
  partition pdef values ( default ) 
) as 
  select * from tmain
  where  c1 <= 10;
select count(*) from tmain;


alter table tbackup 
  exchange partition pdef 
  with table tmain;
select count(*) from tmain;


This will be substantially faster than truncate + insert. And (unlike truncate) you've still got all the original data in the backup table. So your rollback is just exchanging back again. Or inserting a (hopefully small) number of rows.

A reader, July 02, 2019 - 1:51 pm UTC

Sorry to come back after some days. Partitioning is disabled in our database, so we can't use it here.
So I don't have much choice but to stick with my plan, as the below things didn't work for me because of the LOB columns:
1) Append hint
2) Parallel hint
3) Nologging at table level

In my solution the bulk insert is still taking around 12 hours to insert 90 million rows.
I am doing like below :

open c1 for select * from main_table_backup;
loop
  fetch c1 bulk collect into ... limit 50000;
  insert into main_table ...;
  exit when c1%notfound;
end loop;

Could you please guide me on how to improve this?
Connor McDonald
July 03, 2019 - 1:19 am UTC

There have been a lot of options provided here, but what we're still missing is the key data to help with tuning: where the time is being lost.

So can we do this:

exec dbms_monitor.session_trace_enable(waits=>true);

OPEN c1 for select * from main_table_backup where rownum <= 1000000;  --ie, a subset
  FETCH....LIMIT 50000;
    INSERT INTO main_table .....
  EXIT WHEN c1%notfound;

and then extract the key elements from the tkprof-formatted trace file.

We need to see where the time is being lost.

A reader, July 02, 2019 - 3:16 pm UTC

We are also facing ORA-01555 : snapshot too old issue here.
Connor McDonald
July 03, 2019 - 1:27 am UTC

Whenever you do:

open cursor C ...
  fetch from C;
end loop;

you are running that risk because every *fetch* command must get data from the cursor as of the time the cursor was *opened*.

So if I open my cursor at 9am, then do some processing for an hour, the next fetch will need to unwind an hour's worth of changes to the source tables in the cursor. If you can't do that unwind, then out pops ORA-01555.


There are two ways around this:

1) Get all the data from the cursor in one hit: either a big bulk collect (watch out for memory) or store it in a temporary table

2) make your code ora-1555 aware, eg

open cursor C ...
  fetch from C;

  if ora-1555 then
     close cursor;
     open cursor;
  end if;
end loop;

Obviously this only works as long as the reopened cursor has some concept of where to start off from.
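A PL/SQL sketch of option 2, assuming the rows carry an ascending ID column so the cursor knows where to restart (table and column names are placeholders):

```sql
declare
  cursor c ( p_last_id number ) is
    select id, payload
    from   main_table_backup
    where  id > p_last_id
    order  by id;
  type t_rows is table of c%rowtype;
  l_rows    t_rows;
  l_last_id number := 0;
  e_snapshot_too_old exception;
  pragma exception_init ( e_snapshot_too_old, -1555 );
begin
  loop
    begin
      open c ( l_last_id );
      loop
        fetch c bulk collect into l_rows limit 50000;
        exit when l_rows.count = 0;
        forall i in 1 .. l_rows.count
          insert into main_table values l_rows ( i );
        -- remember how far we got, so a restart can resume here
        l_last_id := l_rows ( l_rows.count ).id;
        commit;
      end loop;
      close c;
      exit;  -- all rows processed
    exception
      when e_snapshot_too_old then
        if c%isopen then
          close c;
        end if;
        -- loop back round: reopen the cursor from l_last_id
    end;
  end loop;
end;
/
```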

A reader, July 03, 2019 - 5:22 am UTC

We are not making any changes to the backup table from anywhere; we just created it from the main table in step 1 of my solution. Then why is the ORA-01555 issue coming up while doing the bulk insert? Also, we are running the database in noarchivelog mode.
Chris Saxon
July 03, 2019 - 12:29 pm UTC

It doesn't matter if you make changes to the backup table. You need to undo the insert from this back to the main table.

Like Connor says, we need to see the output from a formatted trace file to see what's going on.

But personally I think you're wasting your time trying to tune this. Just switch the tables over!

Your current process:

- Destroys your rollback position. What happens if, just after truncating the main table, you realise there's a mistake in the where clause for the CTAS? And you need to revert?

You're going to have to restore from backup...

- Writes the same data twice. So it must take at least twice as long as just CTAS + renaming. In practice (as you're finding) it'll more than double, because one of the writes is an insert, which generates significantly more redo and needs more undo than CTAS.

A reader, July 03, 2019 - 1:35 pm UTC

Yes, I agree my solution is not so good.
But how will we switch the tables, as we have lots of constraints and indexes, including SYS indexes on the LOB columns? How will the DBMS_REDEFINITION package help me?
Is there any other way to do it?
Chris Saxon
July 04, 2019 - 10:03 am UTC

DBMS_REDEFINITION won't help here - my mistake.

Dear "A Reader"

Another reader, July 04, 2019 - 12:32 am UTC

All LOB columns have a SYS index. It's a byproduct of creating a table with a LOB. There's nothing you need to convert or move over for a SYS-named index on a LOB column.

Your other constraints and indexes can be created with the CTAS by adding them inline in the create table portion of the statement. It's a little tedious perhaps. Or they can be created afterwards, with a little more risk but much greater speed on the CTAS.

Finally, I love how you are all worried about the integrity of this database and what Chris and Connor are proposing is far too risky, but then you mention your DB is in noarchivelog mode!! I truly laughed out loud when I saw that.
Chris Saxon
July 04, 2019 - 10:03 am UTC

then you mention your DB is in noarchivelog mode!! I truly laughed out loud when I saw that.


A reader, July 04, 2019 - 8:51 am UTC

We have kept the database in noarchivelog mode to increase performance, as this is a one-time job. In case of any issues we will restore our backup. Anything to worry about?

We also can't change any constraint or index names, so we can't create them with the CTAS.
Chris Saxon
July 04, 2019 - 10:09 am UTC

Anything to worry about ?

You've lost point-in-time recovery. If something goes wrong you can only restore to your last full backup. Not sometime after.

we also can't change any constraints or indexes names.

Really? It's rare for these names to matter.

Anyway, you can rename indexes & constraints. So after you've done the switchover, you can rename constraints on the new table as needed.
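For example (all object names are placeholders):

```sql
-- Rename the constraint; the index backing it is a separate rename
alter table t1 rename constraint t1_pk_new to t1_pk;
alter index t1_pk_new rename to t1_pk;
```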

A reader, July 04, 2019 - 11:57 am UTC

If I do your steps with little change will it work :

1. Create backup table for 3 years (via CTAS)
2. Drop the main table
3. Create same constraints/indexes on backup table
4. Rename the backup table to the main table

Do you see any issues here?
Chris Saxon
July 04, 2019 - 3:09 pm UTC

You also need to ensure you copy any grants or triggers on the original table at step 3.

Also the constraints/indexes will need different names to the originals. So if you want the same names, you'll have to rename these after step 4.

Or switch the order of steps 3 & 4.
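One way to find the DDL for the grants and triggers you need to copy is DBMS_METADATA (the table name is a placeholder; note each call raises ORA-31608 if the table has no dependent objects of that type):

```sql
select dbms_metadata.get_dependent_ddl ( 'OBJECT_GRANT', 'T1' ) from dual;
select dbms_metadata.get_dependent_ddl ( 'TRIGGER', 'T1' ) from dual;
```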

Purge Rows for more than 5 Years

Nidhika Vaidh, July 18, 2019 - 2:54 am UTC

Hi Tom,

I need your suggestion. I have a task to purge data older than 5 years from the db. We are using Oracle 12c, and in my application the tables have complex parent-child relationships (table C depends on table B, and table B depends on table A), and we have composite primary keys as well. Could you please suggest how I can proceed with this task in Oracle 12c?

Chris Saxon
July 18, 2019 - 2:33 pm UTC

What have you tried? How many rows are you removing? What exactly are your table structures?

Nidhika, July 20, 2019 - 5:52 pm UTC

We have a parent table which is list partitioned and multiple child tables which are all reference partitioned. We are neither able to enable on delete cascade on the child tables nor able to drop the constraint, so my challenge is how I can drop a partition of the parent table.

Could you please give the best way to resolve this issue?
Chris Saxon
July 22, 2019 - 10:03 am UTC

If you have reference partitioning, then dropping the parent partition will also remove this from all its children.

So... I'm not sure what the issue is here?
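For example, with reference partitioning in place, archiving could look like this (names are placeholders; with reference partitioning, partition maintenance on the parent cascades to the children, so the matching child partitions go too):

```sql
alter table parent_tab
  drop partition p_2014
  update indexes;
```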

AV, January 07, 2020 - 10:42 am UTC

Hi Tom

Thanks for your help.

Our Oracle DB version is
There are 2 big tables with Parent-child relationship. There is NO "on delete cascade".
Both have "INTERVAL RANGE" partitioning on RC_DT (date type) column.

My task is to purge them to store only last 2 years of data and both above tables have to be retained.

Table P - rows 161397436
Table C - rows 395170380

Planned steps :
1. Create 2 new partitioned tables P_new and C_new with 2 years of data.
2. RENAME P & C to P_history & C_history respectively.
3. RENAME P_new & C_new to P & C respectively.
4. Create a scheduler job to keep purging data from P_new & C_new periodically to ensure only 2 years data is retained in new tables.

Here is the progress till now in non-production region:

1) I created the P_new table via INSERT. It took 11 hrs. I did not use CTAS as I wanted to keep the INTERVAL RANGE partitioning in place so that data would keep moving into its respective partitions.

Upon reading your posts above, I think I could have used CTAS and then created the partitions & indexes/constraints. Not sure how much time it would have saved.

Anyway, table P_new is in place now.

2) Then I tried an INSERT to populate C_new from C, but it quickly failed on the foreign key (FK) constraint.
I disabled the FK and tried again, but the INSERT failed with a snapshot too old error after running for 30 hrs.

So now I am pondering the following options.

Kindly provide your input:

1. Keep FK disabled for C_new, INSERT partition-wise data from C into C_new for 2 yrs partitions.
Then enable the FK with " EXCEPTIONS INTO MY_EXCEPTIONS;" .

then follow remaining of planned steps. But a caveat here is the ongoing transactions in the tables, which we will have to stop by taking a downtime, anyway.

2. Or If above does not work, make copies of P & C via expdp/impdp as _history tables.
Perform deletes on main tables P & C intermittently to remove data older than 2 years. Not deleting in single shot, but in chunks, may be spread over days.

As there is no on delete cascade, I will have to ensure that no children exists before deleting from parent table.

3. Or a workable & good method as per your advice.

Please help.
Chris Saxon
January 07, 2020 - 2:17 pm UTC

Have you tried using CTAS to populate the new tables? You can specify partitions & constraints when you do this:

create table t ( x ) 
  partition by range ( x ) 
  interval ( 10 ) (
    partition p0 values less than ( 1 )
  ) as
  select level x from dual
  connect by level <= 100;
select partition_name, high_value 
from   user_tab_partitions
where  table_name = 'T';

P0                1             
SYS_P1269         11            
SYS_P1270         21            
SYS_P1271         31            
SYS_P1272         41            
SYS_P1273         51            
SYS_P1274         61            
SYS_P1275         71            
SYS_P1276         81            
SYS_P1277         91            
SYS_P1278         101

AV, January 07, 2020 - 2:27 pm UTC

Thanks Tom.
Will definitely try this.

But can you please also tell the definite way to take care of referential integrity relationship in this case.

Chris Saxon
January 08, 2020 - 10:43 am UTC

You'll have to add these afterwards with alter table ... add constraint.

You can make this process fast by adding the constraint without validating it:

alter table ... 
  add constraint ... novalidate;

You'll want to validate the constraint afterwards with:

alter table ... 
  validate constraint ...;

This will take a while. But it is an online operation. So you can run it while your application is up.

Oren Nakdimon discusses this more in his constraint optimization series.

Testing the Import of Ref constraint

AV, January 12, 2020 - 1:35 pm UTC

Hi Tom

In reference to my above requirement, I tested these steps on Oracle DB version and observed some behaviour which I seek your help to understand:

Basically, I did following things:

1. Export the 2 tables (this exports data, indexes, and the PK & FK constraints)
2. Rename the constraints so that there is no name conflict on import
3. Import the 2 tables (this imports data, indexes, and the PK & FK constraints). Here I found something unexpected with the "referenced constraint name" in the dba_constraints view.

Here is sequence of steps for this test:

drop table HR.child;
drop table HR.child_history;
drop table HR.parent;
drop table HR.parent_history;




insert into HR.parent values ('1',to_date('31-AUG-2014','DD-MON-YYYY') );
insert into HR.parent values ('2',to_date('31-AUG-2018','DD-MON-YYYY') );
insert into HR.parent values ('3',to_date('31-AUG-2019','DD-MON-YYYY') );
insert into HR.parent values ('4',to_date('31-AUG-2020','DD-MON-YYYY') );
insert into HR.parent values ('5',to_date('31-DEC-2020','DD-MON-YYYY') );
insert into HR.parent values ('6',to_date('30-DEC-2020','DD-MON-YYYY') );

insert into HR.CHILD values ('1','3',to_date('31-AUG-2014','DD-MON-YYYY') );
insert into HR.CHILD values ('2','3',to_date('31-AUG-2018','DD-MON-YYYY') );
insert into HR.CHILD values ('3','5',to_date('31-AUG-2019','DD-MON-YYYY') );
insert into HR.CHILD values ('4','5',to_date('31-AUG-2020','DD-MON-YYYY') );
insert into HR.CHILD values ('5','6',to_date('31-DEC-2020','DD-MON-YYYY') );
insert into HR.CHILD values ('6','6',to_date('30-DEC-2020','DD-MON-YYYY') );

So after this, "select * from dba_indexes where table_name in ('PARENT','CHILD') order by table_name;" will show me following 2 indexes:


Constraints are as follows: "select * from dba_constraints where table_name in ('PARENT','CHILD') order by table_name;" :


Now , I rename the Indexes & Constraints :

alter index CHILD_PK_NEW rename to CHILD_PK_NEW_1;
alter index PARENT_PK_NEW rename to PARENT_PK_NEW_1;

alter table CHILD rename constraint CHILD_PK_NEW to CHILD_PK_NEW_1;
alter table CHILD rename constraint CHILD_FK_NEW to CHILD_FK_NEW_1;
alter table PARENT rename constraint PARENT_PK_NEW to PARENT_PK_NEW_1;

So my Indexes & constraints are renamed now with _1 appended to them.

Now the import, remapping PARENT & CHILD tables as PARENT_HISTORY & CHILD_HISTORY respectively :

impdp \"/ as sysdba\" directory=test_dir dumpfile=my_oe_code.dmp REMAP_TABLE=HR.PARENT:PARENT_HISTORY,HR.CHILD:CHILD_HISTORY

While checking indexes on my new tables 'PARENT_HISTORY','CHILD_HISTORY' :
"select * from dba_indexes where table_name in ('PARENT_HISTORY','CHILD_HISTORY') order by table_name;"
gave me EXPECTED result:


But, I got something UNEXPECTED while checking constraints:
select * from dba_constraints where table_name in ('PARENT_HISTORY','CHILD_HISTORY') order by table_name;

CHILD_FK_NEW (to my surprise this is referring to PARENT_PK_NEW_1 as per R_CONSTRAINT_NAME column , I was expecting PARENT_PK_NEW)

Due to this incorrect referencing, I am able to delete records from PARENT_HISTORY with no referential constraint violation error.

(I understand that another method is to get tables imported first & creating the constraints afterwards, but I am trying to achieve similar result via above method)

Can you please advise what I may be missing here to understand w.r.t. value in R_CONSTRAINT_NAME column ?
Chris Saxon
January 14, 2020 - 3:16 pm UTC

It's Chris (still) but hey :)

I'm not able to reproduce this, but it looks like it's not applying the remapping to the constraint name.

If this is the case, you could get around it by renaming the existing tables. So you can skip remapping during import.

Though I confess: I don't understand why you're using import?

It's quicker and easier to use either create-table-as-select or insert-as-select.

AV, January 21, 2020 - 8:32 am UTC

Thanks Chris.

Above I gave sample tables/data, but actually the parent & child tables are very large in my case, and the INSERT took forever before erroring out with a snapshot too old error.

I will use the alternate way in this case as suggested above.


Follow-up - SOLUTION

AV, February 24, 2020 - 3:24 pm UTC

Hi all,

Thanks for your help & guidance.

This post is in response to my original one posted on "January 07, 2020 - 10:42 am UTC".

Just sharing the way the given requirement was met.

Requirement: purge 2 large partitioned tables, with a foreign key relationship, to contain only the last 2 years of data.

I had tried the CTAS & INSERT methods, but given the sheer size of the data, they timed out or gave snapshot too old errors.

Let's call the tables P (parent) & C (child), with around 15 years of data. Let's call P_NEW & C_NEW the tables with the last 2 years of data from P & C respectively.

1) P_NEW required data from all partitions of P based upon a WHERE clause. So I created a par file with QUERY clause & ran expdp. This took 3 hrs & produced 1 dmp file.

2) C_NEW just required the last 2 years' partitions worth of data, so I exported it partition-wise. There were 26 partitions to consider. I ran 4 simultaneous expdp commands (with the expdp from step 1 already running), dividing the partitions between the exports (6, 6, 6, 8) with parallel=4.
This took 3.15 hrs (overlapping) and produced 4 dmp files.

3) Rename Primary, Unique, check etc constraints & indexes of P & C.

4) Import dmp file from Step 1 in P_NEW table with parallel=4, constraints=y indexes=y transform=disable_archive_logging:y. Took 35 mins.

5) Import all dmp files from Step 2 sequentially with parallel=6, indexes=y constraints=y EXCLUDE=REF_CONSTRAINT transform=disable_archive_logging:y.
Except 1st import in this step, remaining 3 imports had additional table_exists_action=append parameter.
Please note that here I excluded REF constraint for successful import.
Total time taken 1 hr 45 mins.

6) After this, I had P_NEW and C_NEW ready with data and indexes & all constraints , except REF constraint.

7) Create a FOREIGN KEY constraint on C_NEW referencing P_NEW with NOVALIDATE.
This took 50 mins. The REF constraint was in DISABLED NOVALIDATE mode after this.

8) Now to ENABLE VALIDATE this REF constraint was tricky.

I tried a bulk insert but it was failing: given that for every key in P_NEW there were multiple records in C_NEW, every other attempted insert into P_NEW for the same key failed with a primary key constraint violation on P_NEW.

So I did it in a loop; each failing duplicate insert into P_NEW from P was sent to the EXCEPTION section and the loop continued with the next key.
This took around 10 hrs (there may be some more efficient way for this that I might be missing here).

So total work of creating a REF key constraint took approx 11 hrs.
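For what it's worth, the per-key loop in step 8 can usually be replaced by one set-based statement. A sketch, assuming P's key column is ID and C_NEW's foreign key column is P_ID (both names hypothetical):

```sql
-- Insert, in one pass, every parent row that C_NEW references
-- but P_NEW does not yet contain.
insert into p_new
select p.*
from   p
where  p.id in ( select c.p_id from c_new c )
and    not exists (
         select null from p_new pn where pn.id = p.id
       );
```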

9) At the end, I had P_NEW & C_NEW tables with a VALIDATED foreign key relationship; these were quickly renamed to P & C and the requirement was met.

This was done in the TEST region. Next will be the PROD region; as data is loaded only once a week, they can tolerate the hours taken by the above work.

