Data archival and purging for OLTP database

Question and Answer

Connor McDonald

Thanks for the question, vara.

Asked: October 07, 2011 - 3:58 am UTC

Last updated: October 21, 2019 - 12:18 pm UTC

Version: 11g r2

Viewed 10K+ times

You Asked

Hi Tom,

I need your suggestion regarding a data archival and purging solution for an OLTP database. Currently we are planning the approach below. The database is 150 GB in size and we plan to run the jobs monthly.

1) Generate flat files from each table based on its purging requirement; every table has its own purging requirement.
2) Delete the data from the table.
3) Move the flat files to tape/disk for archival.

Each table has its own retention requirement and purging policy. So partitioning alone is not sufficient.

Is there any better approach for archival and purging than this flat file approach?

thank you.

regards,
vara




and Tom said...

but partitioning should definitely be part of this - delete is just about the most expensive operation possible.

I have a feeling that MANY of your tables, if not most or all of them, could use partitioning to facilitate this aging of information and would allow you to retain the data right where it belongs - IN THE DATABASE.

To archive, you would exchange an obsolete partition (the data to be 'purged') with an empty table. Now the purged data is in a table by itself and you can drop the now-empty partition. You could then add the table of purged data to a history table (exchange it into the history).

No flat files

Online access to the historical data (which can be made READ ONLY for backup/recovery purposes)

Very little code (almost none)

Easy on the database (DDL, not DML)

That would be the better approach.
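
For illustration, a minimal sketch of that exchange-based purge, assuming a range-partitioned table SALES with an obsolete partition SALES_2010; the staging and history table names are made up for the example:

-- all object names here are hypothetical
-- empty staging table with the same column structure as SALES
create table sales_2010_stage as
  select * from sales where 1 = 0;

-- swap the obsolete partition's segment with the empty table (pure DDL, no row movement)
alter table sales
  exchange partition sales_2010 with table sales_2010_stage;

-- the partition is now empty and can be dropped
alter table sales drop partition sales_2010;

-- optionally exchange the staged data into a matching partition of a history table
alter table sales_history
  exchange partition sales_2010 with table sales_2010_stage;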


Comments

Alexander, October 07, 2011 - 3:45 pm UTC

If there are not too many tables, could they not break them out into their own tablespaces and set up retention policies using Flashback Data Archive?
Tom Kyte
October 07, 2011 - 4:22 pm UTC

they could - but for purging (equals massive data movement), I'd rather still use DDL over DML if at all possible.

To use Total Recall, they would have to delete the records (messy), and then the FBDA process would read the undo generated, undelete the records (roll back the delete, get the before image of each row) and insert them into the archive.

For normal day to day deletes - fine, for a massive purge - not as fine.
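
For reference, a minimal sketch of the Flashback Data Archive setup Alexander describes, assuming a hypothetical tablespace FBA_TS, table ORDERS and a 7-year retention:

-- object names are hypothetical
-- create a flashback archive with a fixed retention period
create flashback archive orders_fba
  tablespace fba_ts
  retention 7 year;

-- start tracking history for a table; changed and deleted rows are kept
-- in the archive until the retention period expires
alter table orders flashback archive orders_fba;

-- query the table as of an earlier point in time
select *
  from orders as of timestamp (systimestamp - interval '30' day);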

Data archival and purging for OLTP database

vara Reddy, October 10, 2011 - 12:32 am UTC

Hi Tom,

Thanks for the suggestion. We have to retain the data for 1 year in the OLTP system and for 7 years in offline storage; there is no archival DB or online access to historical data. Retrieval of data is on a request basis and is a rare operation.

Currently we are using Oracle 11g R2 for the DB. Please suggest.

Regards,
Prasad.
Tom Kyte
October 10, 2011 - 10:41 am UTC

I've already suggested, I despise the idea of having data but not having access to it.

Retrieval of the data would be near impossible - if you just kept it *in the database*, it would always be there and you might find people actually start using it.

So, my recommendation stands, I don't have any experience moving data out of the database - I'm pretty much out of the picture when that happens.

Where licensing crashes into best practices

David Weigel, October 10, 2011 - 1:00 pm UTC

If only partitioning weren't licensed separately on top of Enterprise Edition. I once tried to replace a delicate and long-running archiving process littered with create-table-as-select-and-deletes: I wrote a nice little exchange-partition style of archiving, intending to put it into our end-of-month process, before discovering that most of our customers couldn't use it.
Tom Kyte
October 10, 2011 - 1:34 pm UTC

Partitioned Views

Rob Burton, October 10, 2011 - 1:57 pm UTC

Reviewer: David Weigel from Wayne, PA USA
"If only partitioning wasn't licensed separately on top of Enterprise Edition."

You can still use Oracle 7-style partitioned views for very little extra effort. The data would exist in separate tables, say one per month, and a view exists across all the tables, with check constraints to guide the optimizer.

Tom has covered their usage well on this site, I believe.

I regularly use features like this, along with Statspack and manual archive log copying for standbys, as it is pretty much a requirement that everything works on SE/SE1/XE.
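
A minimal sketch of such a partitioned view, assuming hypothetical monthly tables ORDERS_2011_09 and ORDERS_2011_10 keyed on an ORDER_DATE column:

-- object names are hypothetical
-- one table per month, each with a check constraint on its date range
create table orders_2011_09 (
  order_id   number,
  order_date date check (order_date >= date '2011-09-01'
                     and order_date <  date '2011-10-01')
);

create table orders_2011_10 (
  order_id   number,
  order_date date check (order_date >= date '2011-10-01'
                     and order_date <  date '2011-11-01')
);

-- the "partitioned view": the check constraints let the optimizer skip
-- branches that cannot contain rows for a given date predicate
create or replace view orders_v as
  select * from orders_2011_09
  union all
  select * from orders_2011_10;

-- purging a month is then just a DROP TABLE (plus recreating the view)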

How to Purge huge OLTP table

Durga, October 09, 2019 - 11:04 am UTC


Hi Connor,
We have a table in our database which gets updated ~50 million times a day.
The table is not partitioned or analyzed (and management will not partition or analyze it, for their own performance reasons).
This table has grown huge (> 3 billion rows).
It gets purged on a daily basis, but our purge program cannot keep up with this data volume.
We also have a special purge of this table once a quarter, where we create a new table, export the data from the old table to the new table and then recreate the indexes on the new table.

Is there any way we can have more data purged on a daily/weekly basis?
Our target is to get the table down to ~1 billion rows so that the normal purges can keep up.
Is there a way to make the table size manageable?
Chris Saxon
October 10, 2019 - 2:40 pm UTC

Partitioning really is the best way to deal with this; drop/truncate partition makes removing old data simple and fast.

But if this is definitely ruled out...

A filtered table move (added in 12.2) may help. This keeps only the rows matching the where clause:

create table t as 
  select level c1 from dual
  connect by level <= 100;
  
select count (*) from t;

COUNT(*)   
        100 

alter table t
  move including rows 
  where c1 > 90;
  
select count (*) from t;

COUNT(*)   
         10


We also have a special purge of this table once a quarter, where we create a new table, export data from old table to new table and then recreate indexes on new table.

What do you mean by export? Create-table-as-select or something else? If you're doing something else, CTAS should help speed up this process.
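
For what it's worth, a rough sketch of a CTAS-based purge, assuming a hypothetical BIG_TABLE with an ACTIVE_FLAG column marking the rows to keep:

-- object names are hypothetical
-- build a new table containing only the rows to keep (direct-path load)
create table big_table_new
  nologging
as
  select *
    from big_table
   where active_flag = 'Y';

-- recreate the indexes, constraints and grants on BIG_TABLE_NEW,
-- then swap the names during a quiet period
alter table big_table     rename to big_table_old;
alter table big_table_new rename to big_table;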

Is there any way we can have more data purged on daily/weekly basis?

You can run a purge as often as you like - provided you have the resources to do so. It's not clear from what you've said how exactly the purge works, or what's stopping you from running it more often.

If you share more details on how the process works and what your current system limitations are, we may be able to suggest more options.

How to Purge huge OLTP table

A reader, October 14, 2019 - 10:59 am UTC

Hi Chris,
I am very happy to see a reply.

Apologies, some of the information I gave was not correct. I checked again and this is what I found:

The huge table in our database is ~4 billion rows and it is range partitioned.
It has 16 partitions and it was last analyzed in Oct 2014.
Management will not analyze it again because it might change the optimizer's execution plans and cause the selects to run poorly.

We cannot run the daily purges more often as they take up a lot of resources and create load on our primary DB.
We have a primary DB where this big table resides, and it gets updated more than 50 million times a day.
The data from the primary DB gets replicated to 14 other read-only DBs (continuous replication through GoldenGate).
The external systems (clients) hit these 14 read-only DBs to get the information they want.

During the special purge, we create a replica of the big table on a separate database, say DB-old (DB-old doesn't receive requests from external systems).
(You suggested that we should use CTAS to create this replica. Yes, this will improve performance.)
From DB-old, this big table is exported to our primary DB as well as the 14 read-only DBs.
But the real problem is the length of time it takes to import the big table into each of the databases and build the indexes.
This entire process takes ~2.5 days.
Our primary DB has to undergo a downtime of ~2.5 days during the special purge activity.
Management won't let us take updates down for the length of time required to perform the special purge we've been running once in a while.
So even though special purges help us reduce the number of rows in our big table, we cannot run them more often due to the huge downtime.


Is there any way to bypass the import?
Is there any way to make the special purge fast and reduce overall system downtime?
We want to delete ~1.8 billion rows.
We delete inactive data from the big table.
I am not sure if we can drop/truncate partitions.

I don't have much knowledge or experience with Oracle DBs, but I am curious to know how things can be worked out in this situation.
Connor McDonald
October 21, 2019 - 12:18 pm UTC

As we've already said, your approach does not sound sustainable in the long term - you really need to explore partitioning. But anyway,

So if I'm reading this correctly, the workflow is:

- create replica of big table
- export replica
- import to the 14 DBs

You can refine this by changing this to:

- create replica of big table *in its own tablespace*
- export the tablespace using transportable option (ie, just the metadata)
- copy the tablespace datafiles to the 14 targets
- import the tablespace using transportable option (ie, just the metadata)

This should be a lot faster.
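
A rough sketch of that transportable flow using Data Pump; the tablespace name BIG_TAB_TS, directory object DP_DIR and datafile path below are all assumptions:

-- on the source (DB-old): the tablespace must be read only before it can be transported
alter tablespace big_tab_ts read only;

-- export just the metadata for the tablespace (hypothetical names/paths)
$ expdp system directory=dp_dir dumpfile=big_tab_ts.dmp transport_tablespaces=big_tab_ts

-- copy the datafile(s) plus the dump file to each target, then on each target:

$ impdp system directory=dp_dir dumpfile=big_tab_ts.dmp transport_datafiles='/u01/oradata/big_tab_ts01.dbf'

-- make the tablespace writable again on the target if you need to
alter tablespace big_tab_ts read write;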

How to Purge huge OLTP table

Durga, October 14, 2019 - 1:21 pm UTC

I just forgot to mention that we are using the 11g version of the Oracle database.

Thank you.

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

Documentation set on VLDB and Partitioning.