Daily deletion of a large amount of data on a partitioned table, but not by the partition key


Question and Answer

Connor McDonald

Thanks for the question, Yosi.

Asked: May 08, 2017 - 4:17 pm UTC

Last updated: May 09, 2017 - 9:22 am UTC

Version: 11.2.0.3

Viewed 1000+ times

You Asked

Hi Tom,
First, thanks for the great effort answering everyone's questions.

Lately I'm struggling with deleting data in our system (OLTP).

The tables we need to delete from are all partitioned by start_time (which is used by almost every query in the system).
The problem is that a specific customer wants to delete (and only delete) the old data using the db_insertion_time column plus other search properties (event_type, for example).
There is also an option to preserve specific records via a preserve_indication flag in each event record (let's say 1K per day).
One more thing: the database time zone is configured as UTC, but the customer wants the deletion to be done with local time (UTC-2) timings.
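
For illustration, a minimal sketch of how a local-time cutoff could be converted to the database's UTC time zone before filtering on db_insertion_time (the 30-day cutoff and the fixed '-02:00' offset are assumptions taken from the description above):

<
-- Hypothetical predicate: compute the retention cutoff in local time
-- (UTC-2) and convert it to UTC, the database time zone, so it can be
-- compared against db_insertion_time directly.
SELECT event_id, start_time
  FROM t_events
 WHERE db_insertion_time <
       CAST(FROM_TZ(CAST(TRUNC(SYSDATE) - 30 AS TIMESTAMP), '-02:00')
            AT TIME ZONE 'UTC' AS DATE);
/>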

Event rate entering the system: 1M per day.
Event retention time: 30 days.

Tables participating in the deletion:
1. t_event (partitioned by start_time) - PK (event_id + start_time) + a global index on event_id, needed for OGG bidirectional replication (by event_id).
2. t_event_not_partitioned - PK (event_id + start_time) - a smaller table (a sub-copy of t_event; not all columns exist), with a lot of indexes for every possible search in the system that does not include start_time, including the db_insertion_time index.

...and a lot of satellite tables (partitioned by start_time) for the events - PK (event_id + start_time + XXXX#_id).
Most tables have 2-5 records per event;
one has 10 records per event;
one has 20 records per event.

3. t_event_XXXX1
4. t_event_XXXX2
5. t_event_XXXX3
6. t_event_XXXX4
7. t_event_...
-------------------
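
To make the layout concrete, here is a hypothetical sketch of t_event as described above (column types, partition names, and daily range partitioning are assumptions; only the key structure comes from this description):

<
CREATE TABLE t_event (
  event_id            NUMBER      NOT NULL,
  start_time          DATE        NOT NULL,
  db_insertion_time   DATE        NOT NULL,
  event_type          NUMBER,
  preserve_indication NUMBER(1),
  CONSTRAINT t_event_pk PRIMARY KEY (event_id, start_time)
)
PARTITION BY RANGE (start_time)
( PARTITION p_20170401 VALUES LESS THAN (DATE '2017-04-02')
, PARTITION p_20170402 VALUES LESS THAN (DATE '2017-04-03')
);

-- the global index needed for OGG bidirectional replication:
CREATE INDEX t_event_gidx ON t_event (event_id);
/>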

Before the deletion process we create a result set which contains all the events that need to be deleted (94% of the events are in the same partition, 5% are in the previous day's partition (the UTC issue), and a few are in other partitions).

Our first approach was:
1. Create the result set (rs_id = 1).
<
insert into t_result_set
select 1 rs_id, event_id, start_time
from t_events
WHERE db_insertion_time < ..... AND ....
/>
2. DELETE statements, for example:
<
DELETE FROM t_event WHERE (event_id, start_time) in (select event_id, start_time from t_result_set where rs_id = 1);
DELETE FROM t_event_search WHERE (event_id, start_time) in (select event_id, start_time from t_result_set where rs_id = 1);
COMMIT; -- commit here so step 2 can be retried from this point

for each table in satellite tables loop
  DELETE FROM t_event_XXXX WHERE (event_id, start_time) in (select event_id, start_time from t_result_set where rs_id = 1);
  COMMIT; -- commit here so step 2 can be retried from this point
end loop;
/>
This approach generates a lot of redo/undo plus I/O, and took a lot of time.

So the second approach was to create the result set with a row_seq column (ordered by start_time).
This time:
1. The result set is created as:
<
insert into t_result_set
select 1 rs_id, event_id, start_time, rownum row_seq
from (select event_id, start_time
        from t_events
       WHERE db_insertion_time < ..... AND ....
       ORDER BY start_time)
/>
2. The deletion is done in chunks of 50K rows at a time using the row_seq column (same as before, but with a WHILE loop over row_seq - sketched below).
The redo/undo is smaller, but there is still a lot of I/O.
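
A minimal sketch of what that chunked loop could look like for one of the tables (the variable names and loop structure are illustrative; the 50K chunk size and per-chunk commit come from the approach above):

<
DECLARE
  l_lo  PLS_INTEGER := 1;
  l_hi  PLS_INTEGER;
  l_max PLS_INTEGER;
BEGIN
  SELECT MAX(row_seq) INTO l_max FROM t_result_set WHERE rs_id = 1;

  WHILE l_lo <= l_max LOOP
    l_hi := l_lo + 49999;  -- 50K rows per chunk

    DELETE FROM t_event t
     WHERE (t.event_id, t.start_time) IN
           (SELECT event_id, start_time
              FROM t_result_set
             WHERE rs_id = 1
               AND row_seq BETWEEN l_lo AND l_hi);

    COMMIT;  -- commit per chunk, as in the first approach
    l_lo := l_hi + 1;
  END LOOP;
END;
/
/>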

The third approach, which I haven't tested yet, is to use partition exchange (see the sketch after this list):
1. Create the result set.
2. Find in the t_event table which events need to stay (the preserved events).
3. Delete from the t_event_not_partitioned table by result set.
4. For each table in the satellite tables, copy the preserved events to a temp table + create the constraints.
5. Perform the partition exchange.
6. Drop the new temp table.
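
A rough sketch of steps 4-6 for a single satellite table, assuming daily partitions (the partition name p_20170401, the staging table name, and the WITHOUT VALIDATION clause are assumptions):

<
-- Step 4: copy the rows that must survive into a staging table.
CREATE TABLE t_event_xxxx1_keep AS
  SELECT *
    FROM t_event_xxxx1 PARTITION (p_20170401)
   WHERE (event_id, start_time) NOT IN
         (SELECT event_id, start_time FROM t_result_set WHERE rs_id = 1);

-- ... create the matching constraints/indexes on t_event_xxxx1_keep ...

-- Step 5: swap the staging table with the partition; this is a
-- segment swap, so almost no redo/undo is generated.
ALTER TABLE t_event_xxxx1
  EXCHANGE PARTITION p_20170401 WITH TABLE t_event_xxxx1_keep
  WITHOUT VALIDATION;

-- Step 6: the staging table now holds the deleted rows; drop it.
DROP TABLE t_event_xxxx1_keep;
/>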

This will probably work better, but it will leave us facing a new problem (the global index on t_event, with TX locks during the exchange).
I read somewhere that for this table we could instead copy the preserved events to a temp table, delete the records in t_event, truncate the partition, and reinsert the preserved records - this would avoid invalidating the global index, but would not be so good for performance.
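
A sketch of that truncate-and-reinsert idea for t_event (the UPDATE GLOBAL INDEXES clause and the names are assumptions; it keeps the global index usable at the cost of copying the preserved rows twice):

<
-- Keep the rows that must survive the purge.
CREATE TABLE t_event_preserved AS
  SELECT * FROM t_event PARTITION (p_20170401)
   WHERE preserve_indication = 1;

-- Truncate the partition; UPDATE GLOBAL INDEXES keeps the global
-- index on event_id valid instead of leaving it UNUSABLE.
ALTER TABLE t_event TRUNCATE PARTITION p_20170401 UPDATE GLOBAL INDEXES;

-- Put the preserved rows back.
INSERT INTO t_event SELECT * FROM t_event_preserved;
COMMIT;
DROP TABLE t_event_preserved;
/>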

Please help me find the best solution. I know there isn't one magic solution for all the tables together, but the biggest problems are the global index plus the tables with a large amount of records (15M).

What do you think about the third approach? Maybe a fourth one?

Thanks.

and Connor said...

Ultimately, there are three ways to shift data in Oracle:

1) don't "really" shift it (i.e., truncate, partition exchange, drop, etc.)
- minimal work
- the db/app design must support the intended operation
- if you have any kind of global index, then most of those benefits are lost because of the index maintenance required either during or after the operation

2) move it via standard SQL
- simple, but typically lots of undo/redo

3) move it via optimized SQL (e.g., direct mode, nologging, parallel DML, deferred index maintenance, etc.)
- much less redo
- typically locking/concurrency restrictions
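
As one sketch of option 3 (an illustration, not a prescription - the pre-created empty t_event_keep table and the degree of parallelism are assumptions): keep the surviving rows with a direct-path, parallel insert rather than deleting the doomed ones.

<
ALTER SESSION ENABLE PARALLEL DML;

-- Direct-path (APPEND) parallel insert of the rows to keep into a
-- pre-created empty copy of the table; this generates far less
-- redo/undo than a conventional delete, but locks the target table
-- for the duration of the transaction.
INSERT /*+ APPEND PARALLEL(k, 8) */ INTO t_event_keep k
SELECT /*+ PARALLEL(e, 8) */ *
  FROM t_event e
 WHERE (e.event_id, e.start_time) NOT IN
       (SELECT event_id, start_time FROM t_result_set WHERE rs_id = 1);
COMMIT;
/>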

It sounds to me like you are going to have some sort of pain point no matter what you do, because your db physical design does not support the operation being requested. Similarly, if you cannot afford any kind of locking, then your options become more limited.

So sorry - I don't have a lot of options for you - you'll just need to do some benchmarking and try to come up with the best compromise between impact and speed of archival.

Perhaps a good question is also - why do you need to delete?
