Thanks for the question, Harbinder.
Asked: April 17, 2016 - 2:42 pm UTC
Last updated: April 19, 2016 - 4:24 am UTC
Version: 11.2
You Asked
Hi Tom,
We inherited the application from another company. It's an OLTP database with highly indexed tables, some with 17-20 indexes. We have a requirement to purge data older than 10 years, which we'll be doing in 3 phases.
1. Phase 1 will delete from tables that do not have FK constraints but are big (>200 million records) and highly indexed. No issues with this one, as we are dropping the indexes, deleting the data and then recreating the indexes. It takes around 6 hours.
2. Phase 2 - This is where we are having a problem. The parent table here has about 220 million records, and it has 28 child tables (which is why I couldn't give you the create scripts), all with FK cascade delete. We will be deleting close to 16 million rows from the parent table, and each parent record may or may not have child records. All the tables are highly indexed, so a simple delete statement was generating tons of redo/undo and taking forever to complete.
The second approach we took was to disable the constraints, delete from the child tables and then delete from the parent, which did not work very well either. It took almost 13 hours to delete from the child tables: 4 huge child tables with lots of composite indexes took 12 hours and 56 minutes, while the remaining 24 tables took 4-5 minutes. Re-enabling the constraints then took 14 hours. We created a single-column (indexed) temporary table holding the cases (the primary key of the parent table) that need to be deleted. Each child-table delete references this temp table, and each delete statement looks like this:
DELETE /*+ use_hash(z) parallel(z, 16) */
FROM   CHILD_TABLE_1 z
WHERE  case IN (SELECT /*+ parallel(x, 16) */ case FROM temp_table x);
COMMIT;

DELETE /*+ use_hash(z) parallel(z, 16) */
FROM   CHILD_TABLE_2 z
WHERE  case IN (SELECT /*+ parallel(x, 16) */ case FROM temp_table x);
COMMIT;

..... (and so on for each of the remaining child tables)
The outage window (Sunday only) we can afford is only 6-8 hours maximum.
The third approach we took was to create a scheduled job that deletes 1 million records each day, so that we'd be caught up in 16 days. The job uses a FORALL DELETE from the parent table by rowid, committing after every 50K records (a minimal sketch of this pattern follows the phase list below). Sadly, we didn't have exactly the same environment as production: the job was taking about 2 hours in the test environment, so we moved it to prod, where it is only deleting 100K records in 2 hours due to interference with other high-I/O jobs running around the same time. We cannot allow the job to run longer than 2 hours, and we don't have any other scheduling window due to the high volume of users hitting the system from all time zones. Could you please suggest the best approach for this purge? PS: We are thinking about partitioning the data by year, but we are in a time crunch right now; from next year's purge we might move toward partitioning.
3. Phase 3 is dependent on phase 1 & 2 to finish and I do not think this will be an issue.
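For reference, a minimal sketch of this kind of batched FORALL delete by rowid, committing every 50K rows (PARENT_TABLE, PURGE_CASES and CASE_ID are placeholder names, not the real schema):

DECLARE
  -- drive off the rowids of the parent rows whose keys are in the purge list
  CURSOR c_purge IS
    SELECT p.ROWID rid
    FROM   parent_table p
    WHERE  p.case_id IN (SELECT case_id FROM purge_cases);

  TYPE t_rid_tab IS TABLE OF ROWID;
  l_rids t_rid_tab;
BEGIN
  OPEN c_purge;
  LOOP
    FETCH c_purge BULK COLLECT INTO l_rids LIMIT 50000;
    EXIT WHEN l_rids.COUNT = 0;

    FORALL i IN 1 .. l_rids.COUNT
      DELETE FROM parent_table WHERE ROWID = l_rids(i);

    COMMIT;   -- commit every 50K rows, as described above
  END LOOP;
  CLOSE c_purge;
END;
/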
Please help us out with Phase 2.
Thanks in Advance.
and Connor said...
You said:
"Second approach that we took was to 'Disable' the constraints, delete from child tables and then delete from Parent, which did not work very well either."
Can you elaborate on why that did not work very well? Because it would seem the most logical approach. So what was the drama with that approach?
=================
Addenda:
Yeah, you're in a bit of a tight spot. Some options you could consider, none of which are particularly palatable:
1) Continuous trickle delete. 1,000,000 rows per day works out to roughly a dozen rows per second. Rather than organizing scheduling windows and hammering the box, perhaps negotiate to do the deletion continuously, for example along the lines of "delete 500 rows, sleep 30 seconds", using your table of primary keys as the source (see sketch 1 below). You're still going to have a big bucket of redo to deal with, but hopefully this is a one-off exercise.
2) Home-grown parallelism. You said you got 100K records in 2 hours. Perhaps you could do better than that by using (say) dbms_scheduler to fire up multiple deletes across different PK ranges during your scheduled window (see sketch 2 below). The benefit here really depends on how your server/storage resources scale to handle it, and obviously you'll be taking a larger slice of the server pie, so the impact will be more visible to users.
3) Start the journey to a better place. You mentioned possibly moving to partitioning. Rather than delete, build materialized views that are replicas of your tables *minus* the data to be purged (and they could also be partitioned at the same time). The cost here is materialized view logs plus (temporarily) double the space, but you periodically refresh the MVs until a logical point at which you take a small outage and cut over, turning the MVs into tables. (This is effectively what dbms_redefinition does behind the scenes.) It's doable, but not a trivial undertaking; sketch 3 below shows the moving parts.
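Sketch 1 - continuous trickle delete. This is only an outline, assuming the purge keys sit in a table PURGE_CASES with key column CASE_ID matching PARENT_TABLE (all placeholder names); DBMS_LOCK.SLEEP needs an execute grant on DBMS_LOCK in 11.2.

DECLARE
  l_deleted PLS_INTEGER;
BEGIN
  LOOP
    -- delete a small batch of rows whose keys are in the purge-list table
    DELETE FROM parent_table p
    WHERE  p.case_id IN (SELECT case_id FROM purge_cases)
    AND    ROWNUM <= 500;

    l_deleted := SQL%ROWCOUNT;
    COMMIT;

    EXIT WHEN l_deleted = 0;    -- nothing left to purge

    DBMS_LOCK.SLEEP(30);        -- back off so the normal workload keeps breathing
  END LOOP;
END;
/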
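Sketch 2 - home-grown parallelism via dbms_scheduler. PURGE_RANGE is a hypothetical procedure that deletes the purge keys falling inside the slice it is given; the four slices of 4 million keys are only an illustration based on the ~16 million rows mentioned.

BEGIN
  -- fire up four one-off jobs, each working a different slice of the key range
  FOR i IN 0 .. 3 LOOP
    DBMS_SCHEDULER.CREATE_JOB(
      job_name   => 'PURGE_SLICE_' || i,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'BEGIN purge_range(' || (i * 4000000) || ', ' ||
                    ((i + 1) * 4000000) || '); END;',
      enabled    => TRUE,       -- start immediately
      auto_drop  => TRUE);      -- job is removed once it completes
  END LOOP;
END;
/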
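Sketch 3 - the materialized view route, heavily simplified. CREATED_DATE and the cutoff date are placeholders, and fast refresh has restrictions that would need to be verified against the real table before committing to this.

-- 1) MV log so the replica can be refreshed while users keep working
CREATE MATERIALIZED VIEW LOG ON parent_table
  WITH PRIMARY KEY, ROWID (created_date)
  INCLUDING NEW VALUES;

-- 2) a replica holding only the rows to keep
--    (this is also the point at which it could be created partitioned by year)
CREATE MATERIALIZED VIEW parent_table_keep
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  AS SELECT *
     FROM   parent_table
     WHERE  created_date >= DATE '2006-01-01';   -- placeholder 10-year cutoff

-- 3) refresh periodically until the cut-over weekend
BEGIN
  DBMS_MVIEW.REFRESH('PARENT_TABLE_KEEP', method => 'F');
END;
/

-- 4) at cut-over (short outage): drop the MV but keep its container table,
--    then rename it and rebuild constraints/indexes as required
DROP MATERIALIZED VIEW parent_table_keep PRESERVE TABLE;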
Hope this helps.