Most effective way to delete a large number of rows from an online table on a daily basis


Question and Answer

Chris Saxon

Thanks for the question, J2a5c1k2.

Asked: April 14, 2011 - 9:03 pm UTC

Last updated: December 11, 2018 - 3:32 pm UTC

Version: 10.2.0.1

Viewed 50K+ times!

You Asked

I need to write a cleanup script that deletes old data (1-2 million rows) based on a date, on a daily basis. Almost the same number of rows are inserted into the table daily as well. Any suggestions on the most efficient way of doing that?

Table details:
1) The table is non-partitioned.
2) Row count is around 200 million on any given day.
3) The table is in use throughout the day (no downtime for deletion).
4) There are a couple of indexes on the table.

Current pseudocode in use:
loop
  delete from largetbl where somedate < mydate and rownum < 100;
  exit when sql%rowcount = 0;  -- "rowsdeleted = 0" in the original pseudocode
end loop;

and Tom said...

The obvious answer is to change fact #1 and make the table a partitioned table. Then you can use truncate and it would be done immediately (unless you needed global indexes, in which case you would have to wait for the index maintenance to complete).
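
As a minimal sketch only (the table name largetbl_part, the column list and the one-partition-per-day scheme are assumptions, not something from the original thread), the idea could look like this:

-- hypothetical daily range partitions on the purge column
create table largetbl_part (
  id       number,
  somedate date,
  payload  varchar2(100)
)
partition by range (somedate) (
  partition p20110414 values less than (to_date('2011-04-15','yyyy-mm-dd')),
  partition p20110415 values less than (to_date('2011-04-16','yyyy-mm-dd')),
  partition pmax      values less than (maxvalue)
);

-- the daily purge then becomes near-instant DDL instead of a large delete;
-- "update global indexes" keeps any global indexes usable, at the index
-- maintenance cost mentioned above
alter table largetbl_part truncate partition p20110414 update global indexes;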

Your delete loop is a really bad idea. Why do you do that? Just to make it take longer and perform more work?

You probably want a full scan and a single delete,

delete from largetbl where somedate < mydate;

and that is all.

Rating

  (7 ratings)


Comments

Alexander, April 15, 2011 - 1:17 pm UTC

Tom,

I have a similar situation, only it's a one-time thing: we need to purge a bunch of old data, mostly LOBs, amounting to somewhere in the 100 GB range.

I hadn't considered your partition approach; I like it. However, that's a licensed option, so would we need one to essentially "swap" it back and forth (assuming you can even do that, I have only read about partitioning)?
Tom Kyte
April 15, 2011 - 1:32 pm UTC

.. so would we need one to essentially "swap" it back and forth
(assuming you can even do that, I have only read about partitioning.) ..

not sure what you mean?



tell me, how big is what you have now versus what you'll have at the end.

Alexander, April 15, 2011 - 1:46 pm UTC

Make it a partitioned table temporarily to truncate the partitions with the date ranges I want to get rid of, then put the table back the way it was, i.e. not partitioned anymore.

I don't have the criteria at the moment; I have to get that from the application team as to how much we're deleting.
Tom Kyte
April 15, 2011 - 1:50 pm UTC

You'd still need the partitioning option - and it would require a rewrite of the entire table to partition it. I don't know why you'd do it only to put it back; you'd end up writing and rewriting and rewriting it.


If you are going to delete a large percentage of it - then you should consider:

create table new_table as select the data you want to keep;
drop table old_table;

or if you need this to be online:

create new_table (....);
create materialized view log on old_table
create materialized view new_table on prebuilt table as select data you want to keep
index new_table
constrain new_table
grant on new_table

then - offline:
drop materialized view (table stays)
drop old_table
rename new_table to old_table;
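
As a rough, hedged illustration of those steps (the column list, the keep-predicate and the refresh details are assumptions; see the link further down this page for the real worked example):

-- new, empty copy of the table structure
create table new_table (
  id       number primary key,
  somedate date,
  payload  varchar2(100)
);

-- mv log so the copy can be refreshed incrementally while old_table stays online
create materialized view log on old_table with primary key;

-- materialized view layered over the prebuilt new_table, keeping only the rows you want
-- (some predicates may not be fast refreshable; a complete refresh still works)
create materialized view new_table
  on prebuilt table
  refresh fast on demand
  as select id, somedate, payload
       from old_table
      where somedate >= to_date('2011-01-01','yyyy-mm-dd');

-- index, constrain and grant on new_table as needed, refreshing periodically
exec dbms_mview.refresh( 'NEW_TABLE', 'F' );

-- then, during a short outage window:
drop materialized view new_table;   -- the underlying prebuilt table stays
drop table old_table;
rename new_table to old_table;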

LargeTableDelete...

J2a5c1k2 Bhatia, April 15, 2011 - 3:57 pm UTC

I agree partitioning is the next logical step and will be working towards it.

On your last comment ("You probably want a full scan and a single delete"): wouldn't a large single delete (about 1-2 million rows) generate huge undo vs smaller chunks of deletes?
Tom Kyte
April 15, 2011 - 5:04 pm UTC

your smaller chunks generate MORE undo and MORE redo than a single statement would have.
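
If you want to measure that for yourself, one way (assuming you can query v$mystat and v$statname) is to snapshot the session statistics before and after each approach and compare the deltas:

select sn.name, ms.value
  from v$mystat ms
  join v$statname sn on sn.statistic# = ms.statistic#
 where sn.name in ('redo size', 'undo change vector size');
-- run the single delete (or the chunked loop), then re-run this query and diff the values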


delete rows

A reader, April 18, 2011 - 9:04 am UTC


Most effective to delete the large table rows

A reader, June 09, 2011 - 6:02 am UTC

Good Day Tom

create new_table (....);
create materialized view log on old_table

why do we need to create the materialized view log on old_table?
create materialized view new_table on prebuilt table as select data you want to keep
why do we need to create the materialized view on new_table?
If you could give us a real example, it would be very helpful to me.

index new_table
constrain new_table
grant on new_table


then - offline:
drop materialized view (table stays)
drop old_table
rename new_table to old_table;


Could you please explain the reason behind this? I have a similar situation where the "delete" is in a loop and, as you said, it is a really bad idea; it's giving the error "unable to extend the tablespace".

Tom Kyte
June 09, 2011 - 10:15 am UTC

why we need to create the materalized view log on old_table?


so we can incrementally refresh the new table. Otherwise we'd have to completely refresh it on every sync up we did.

why we need to create the materalized view on new_table?


because we sort of want to be able to refresh it???? We need the new table to have all of the data we want to keep from the old table.


The steps show how to do an "online reorganization of a table and keep JUST the rows you want to keep" instead of deleting them.

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:17309172009134


MV

A reader, December 12, 2011 - 10:43 am UTC


Hi Tom,

I have the same situation as stated in the very first post on this page. I tried your method as stated in the above URL, but I hit the below error while doing a complete refresh on the T4 table.

*
ERROR at line 1:
ORA-12008: error in snapshot refresh path
ORA-01631: max # extents (249) reached in table RIM_OWNER.ITEMS_DUP
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 617
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 674
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 654
ORA-06512: at line 1

In my case, the materialized view must hold 10,000,000 rows (i.e. one month of data). How do we come out of this situation?
Tom Kyte
December 13, 2011 - 7:08 am UTC

wow, I haven't seen that message in a decade or more, not since 7.3 of the database!

what is your setup? why do you even have a limit on extents? aren't you using locally managed tablespaces? If you are using old fashioned dictionary managed ones - use a larger extent size (alter your table) or set extents unlimited.

10,000,000 rows is tiny, you must be using an exceptionally small extent size.
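
For illustration, the kind of fix being described might look like this (items_dup comes from the error message above; the tablespace name users_lmt is made up):

-- raise the extent limit on the dictionary-managed segment
alter table rim_owner.items_dup storage (maxextents unlimited);

-- or, better, move the table into a locally managed tablespace
-- (its indexes go unusable after a move and need a rebuild)
alter table rim_owner.items_dup move tablespace users_lmt;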


oh, and I read this page. No one else had anything remotely similar to what you are experiencing?

similar case

A reader, December 04, 2018 - 10:56 am UTC

hi,

I am facing the same problem. We need to delete 50,000 records that are older than a year, every 3 minutes [the job is scheduled].

Query for your reference:
DELETE FROM table1 WHERE OBJECTID < (SELECT MIN (OBJECTID) FROM table1 WHERE STARTTIME < SYSDATE-365) + 50000;

On both columns, OBJECTID and STARTTIME, we have normal indexes, but this query is taking a very long time.

Can you suggest a performance improvement?
Chris Saxon
December 11, 2018 - 3:32 pm UTC

I don't understand why you have the subquery. Just delete everything older than a year:

DELETE FROM table1 WHERE STARTTIME < SYSDATE-365;


But the fastest way is the one Tom suggested at the start: partition your table. Then you can drop/truncate the oldest partition instantly.
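
As a hedged sketch of that (the monthly range scheme, the partition names and the table name table1_part are assumptions based on the one-year retention described above):

create table table1_part (
  objectid  number,
  starttime date
  -- plus the remaining columns of table1
)
partition by range (starttime) (
  partition p2017_11 values less than (to_date('2017-12-01','yyyy-mm-dd')),
  partition p2017_12 values less than (to_date('2018-01-01','yyyy-mm-dd')),
  partition pmax     values less than (maxvalue)
);

-- the scheduled purge then simply drops the oldest partition
alter table table1_part drop partition p2017_11 update global indexes;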