Archival strategy from huge data tables

Connor McDonald

Thanks for the question, Anvesh.

Asked: March 23, 2017 - 6:39 am UTC

Last updated: August 09, 2021 - 5:46 am UTC

Version: 11g

Viewed 10K+ times

You Asked

Hi Tom,

We have a requirement to archive data from some huge tables into backup tables, based on a date range or a number of rows.
Per the stats I extracted, these tables contain around 275 MM records.
Can you please let me know the most effective way to push records into the backup tables?
Please also let me know any extra tips I need to take care of when dealing with such large tables.


and Connor said...

Partitioning is generally the best method, because you can typically archive data by simply doing an "exchange" command to archive off old data.

Data doesn't "move" in that scenario, it simply gets 'detached' from the table via data dictionary manipulation.
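As a minimal sketch of that exchange idea (all the names here are hypothetical, assuming a SALES table range-partitioned by month):

```sql
-- an empty table with the same structure as SALES
create table sales_stage as
  select * from sales where 1=0;

-- detach the June 2016 rows: a data dictionary operation,
-- no rows are physically copied or moved
alter table sales
  exchange partition p_2016_06 with table sales_stage;

-- SALES_STAGE now holds the June 2016 data; back it up or drop it at leisure
```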

But if that is not an option, then direct mode operations, and potentially parallel operations, are the most effective ways of moving lots of data, ie:

create table BACKUP_TABLE parallel as
select /*+ parallel */ * from MY_TABLE
where [archival criteria]

or

alter session enable parallel dml;

insert /*+ append parallel */ into BACKUP_TABLE
select /*+ parallel */ * from MY_TABLE
where [archival criteria]


But copying the data is the easy part...the big question then becomes - how do you efficiently *remove* the data from your active table? DELETEs are very expensive, so deleting millions of records can take a *long* time and burn a lot of resources.

It may often be the case that it is more efficient to flip the process on its head, and write a query that saves the rows you want to *keep*, ie

create table NEW_MY_TABLE as
select * from MY_TABLE
where [NON-archival criteria]

rename MY_TABLE to MY_TABLE_OLD;
rename NEW_MY_TABLE to MY_TABLE;

(and then copy the grants, indexes, etc from the old table to the new table, and drop MY_TABLE_OLD once you're satisfied).


But in the general sense:

"Please also let me know any extra tips I need to take care of when dealing with such large tables."

Partitioning....Partitioning....Partitioning....Partitioning

Rating

  (7 ratings)


Comments

Anvesh Soma, March 28, 2017 - 6:18 am UTC

Hi Connor,

Thanks for the answer.

create table NEW_MY_TABLE as
select * from MY_TABLE
where [NON-archival criteria]

rename NEW_MY_TABLE to MY_TABLE;

It seems the above suggestion will be used as a one-time approach.

We are looking for an incremental approach - we will be repeating this archival logic periodically. Can you please suggest new approaches, or changes to the above approach, for periodic archiving?

Please correct me if I'm wrong.
Connor McDonald
March 28, 2017 - 12:21 pm UTC

"It seems the above suggestion will be used as a one-time approach."

Not at all. There is nothing to stop you building an automation process for that.

But don't get me wrong - partitioning is perhaps the best way for an archival process.

Anvesh Soma, March 28, 2017 - 5:18 pm UTC

Hi Connor,
Thanks for the follow-up.

"It seems the above suggestion will be used as a one-time approach."
"Not at all. There is nothing to stop you building an automation process for that."

1) Can you please clarify what you mean by an "automation process"?
2) If I use this as a periodic approach, do you want me to create a new table every time, for each subsequent run?
3) If not, how do I handle the next run? Using an insert statement for the new table / a delete statement for the old table?

"But don't get me wrong - partitioning is perhaps the best way for an archival process."
Thanks for the suggestion, but due to some dependency constraints we are currently going with the archival process.
Connor McDonald
March 29, 2017 - 1:22 am UTC


SQL> create table t as
  2  select date '2010-01-01'+rownum x, rownum y from dual connect by level <= 365*5;

Table created.

SQL>
SQL> alter table t add constraint PK primary key ( x );

Table altered.

SQL>
SQL> create table t_bkp as select * from t where 1=0;
create table t_bkp as select * from t where 1=0
             *
ERROR at line 1:
ORA-00955: name is already used by an existing object

(t_bkp already existed from an earlier run of this demo, hence the ORA-00955 - the procedure below assumes it exists)

SQL>
SQL> create or replace
  2  procedure archiver(p_date date) is
  3  begin
  4    begin
  5      execute immediate 'drop table t_temp purge';
  6    exception
  7      when others then null;
  8    end;
  9
 10    insert /*+ APPEND */ into t_bkp
 11    select * from t
 12    where x <= p_date;
 13
 14    dbms_output.put_line('Copied '||sql%rowcount||' rows to backup table');
 15
 16    execute immediate
 17      'create table t_temp as select * from t where x > date '''||to_char(p_date,'yyyy-mm-dd')||'''';
 18    dbms_output.put_line('Created t_temp with remaining rows');
 19
 20    execute immediate 'alter table t rename constraint PK to PK_OLD';
 21    execute immediate 'alter index PK rename to PK_OLD';
 22    dbms_output.put_line('Renamed constraint');
 23
 24    execute immediate
 25      'alter table t_temp add constraint PK primary key ( x ) ';
 26    dbms_output.put_line('Added constraint to t_temp');
 27
 28    execute immediate
 29      'rename t to t_gone';
 30    dbms_output.put_line('Renamed table t to t_gone');
 31
 32    execute immediate
 33      'rename t_temp to t';
 34    dbms_output.put_line('Renamed table t_temp to t');
 35
 36    execute immediate
 37      'drop table t_gone purge';
 38    dbms_output.put_line('t_gone is gone ');
 39
 40  end;
 41  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec archiver(date '2012-01-01');
Copied 730 rows to backup table
Created t_temp with remaining rows
Renamed constraint
Added constraint to t_temp
Renamed table t to t_gone
Renamed table t_temp to t
t_gone is gone

PL/SQL procedure successfully completed.

SQL>
SQL> select min(x) from t;

MIN(X)
---------
02-JAN-12

1 row selected.

SQL> exec archiver(date '2013-01-01');
Copied 366 rows to backup table
Created t_temp with remaining rows
Renamed constraint
Added constraint to t_temp
Renamed table t to t_gone
Renamed table t_temp to t
t_gone is gone

PL/SQL procedure successfully completed.

SQL> select min(x) from t;

MIN(X)
---------
02-JAN-13

1 row selected.

SQL>


Partition exchange

Devender Rawat, March 12, 2018 - 2:09 pm UTC

Hi Connor,

We have been using Exadata. You mentioned partition exchange; however, I don't see it as an option for repeating the process more than once for incremental extracts. What I found by testing is that it swaps back the data which was exchanged the first time. Is there a better way to leverage partition exchange to move the data to the historical tables?

Thanks,
Devender Rawat.
Connor McDonald
March 13, 2018 - 2:17 am UTC

Not sure what you mean. A typical archival process would be:

"Archive June"

a- create table X as empty
b- exchange June partition in source table with X
c- so "June" data is now gone from the source table
d- create empty June partition in Archive table
e- exchange June partition in Archive table with X
f- "June" data is not in the Archive table, X is empty

You repeat that for each month you want to archive.

If you are (say) archiving monthly partitions into (say) yearly archive partitions, it would be similar, but needs a little more work, ie

a- create table X as empty
b- exchange June partition in source table with X
c- so "June" data is now gone from the source table
d- insert /*+ APPEND */ the contents of X into the existing yearly archive partition
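Step (d) might look like this - a sketch only, where X is the exchanged-out staging table and ARCHIVE_T is a hypothetical yearly-partitioned archive table:

```sql
-- direct-path, optionally parallel, load into the archive table
alter session enable parallel dml;

insert /*+ APPEND parallel */ into archive_t
select * from x;

commit;

-- X can now be truncated and reused for the next month
truncate table x;
```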


To Vickey Panjiyar

J. Laurindo Chiappa, August 28, 2018 - 7:45 pm UTC

Hello - just to add my experience: in the past I also had this need to archive and (on demand) recover old data. If, like me, you do not have all the disk storage needed to keep all the data online (in different partitions for performance reasons, but online), maybe you can do what I did: have the required table partitioned, transform the partitions containing old data into tables (see PARTITION EXCHANGE, it is a very quick operation), back up this new 'table' containing the old data, and drop the 'table'... Later, when some auditor requests to see that old data, your DBA restores the backup containing it, and (if needed) the restored table can be exchanged back in as a partition....
In my case we were using a very old version, so our only option at the time was to use export to back up a table and import to restore it, but nowadays we have RMAN Tablespace Point-in-Time Recovery (TSPITR) and/or Transportable Tablespaces: if your tables/partitions each use a different tablespace, maybe you could use one of these (much more performant) options to back up and restore tables....

Regards,

Chiappa

Minor correction

Peter, January 25, 2019 - 7:33 pm UTC

I think you meant;

f- "June" data is now in the Archive table, X is empty
Connor McDonald
January 28, 2019 - 3:25 pm UTC

Yes - "not" instead of "now" was not a good typo :-)

How to use partitioning in this case?

Aniket, August 09, 2021 - 2:41 am UTC

Hi,

You suggested that partitioning is the best way to solve the problem - would it be possible for you to give an example of how this should be implemented?

Thanks!
Connor McDonald
August 09, 2021 - 5:46 am UTC

A typical archival process would be:

"Archive June"

a- create table X as empty
b- exchange June partition in source table with X
c- so "June" data is now gone from the source table
d- create empty June partition in Archive table
e- exchange June partition in Archive table with X
f- "June" data is now in the Archive table, X is empty

You repeat that for each month you want to archive.
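In DDL terms, the steps above might be sketched like this (SALES, SALES_ARCHIVE and the partition names are hypothetical, assuming both tables are range-partitioned by month and the archive table's high partition allows a new one to be added):

```sql
-- a) empty staging table with the same structure as the source
create table x as select * from sales where 1=0;

-- b,c) "June" leaves the source table via dictionary manipulation
alter table sales exchange partition p_2016_06 with table x;

-- d) a matching empty partition in the archive table
alter table sales_archive add partition p_2016_06
  values less than (date '2016-07-01');

-- e,f) "June" arrives in the archive table; X is empty again
alter table sales_archive exchange partition p_2016_06 with table x;

-- X can be dropped, or kept for the next cycle
drop table x purge;
```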

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

Documentation set on VLDB and Partitioning.