
Question and Answer

Tom Kyte

Thanks for the question, David.

Asked: June 25, 2008 - 12:07 am UTC

Last updated: June 28, 2008 - 1:22 pm UTC

Version: 10.2.0

Viewed 1000+ times

You Asked

Hi, Tom. I read the articles about the bulk fetches to move large numbers of rows. What I did not see was a great way of DELETING stale data. I need to take large numbers of rows and place them into an archive table (both need to be online, but the base table generates 100K rows per day so gets rather hard to manage after a few days).

The tables in question are identical in structure, but have different storage and the like. A create table structure might be:

create table table_a (
col_a varchar2(30), run_date date );

create table table_a_archive (
col_a varchar2(30), run_date date );

In reality, the table is quite large and subject to change without much notice, making explicit column names rather difficult. Oh, and did I mention that there is absolutely NO primary key? Also, there are already issues with rollback segments, so I don't want to muck those up. Folks around here get cranky when their DB shuts down.

Moving data is no problem. I can easily move data from A to B using something like the following. Short of using more traditional methods with delete, though, I don't see a way to use the bulk data features.

DECLARE

CURSOR delSel
IS
SELECT *
FROM table_a
WHERE TRUNC(RUN_DATE) < TRUNC(SYSDATE) - 30;

TYPE delSelType IS TABLE OF table_a%ROWTYPE;

delSelArray delSelType;

BEGIN

OPEN delSel;

LOOP
FETCH delSel
BULK COLLECT
INTO delSelArray
LIMIT 5000;

EXIT WHEN delSelArray.COUNT = 0;

FORALL indx IN 1 .. delSelArray.COUNT
INSERT INTO table_a_archive VALUES delSelArray (indx);

COMMIT;

END LOOP;

CLOSE delSel;
END;

What I would like to do is add another statement along the lines of:

FORALL indx IN delSelArray.FIRST .. delSelArray.LAST
delete from table_a where current of delSelArray(indx);

which obviously doesn't work.

Given this, is there some better way around this?

and Tom said...

declare
l_date date := trunc(sysdate)-30;
begin
insert into arch select * from src where run_date < l_date;
delete from src where run_date < l_date;
end;


that is all

NO CODE
seriously: NO PROCEDURAL LOGIC AT ALL


and please remember that when Y is a date at midnight - as trunc(sysdate) - 30 is - the predicate

trunc(x) < Y

is equivalent to

x < Y

(trunc(x) <= x, so x < Y implies trunc(x) < Y; and trunc(x) < Y means trunc(x) is at least a day earlier, so x < trunc(x) + 1 <= Y). Do not apply functions to columns unless you need to, and here you have no need to.
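So the archival predicate can be written directly against the bare column, which also leaves an ordinary index on run_date usable:

```sql
-- equivalent to trunc(run_date) < trunc(sysdate) - 30,
-- but the bare column lets an index on run_date be used
select *
  from table_a
 where run_date < trunc(sysdate) - 30;
```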



but the most important point here is:

there is absolutely no need for any procedural code at all.


Comments

David Aldridge, June 25, 2008 - 5:31 pm UTC

If finding the rows to delete is expensive (requires a full table scan to identify a small subset of rows, for example) or between the insert and the delete another process might add more rows that meet your archival condition, then you might consider changing that first insert into a multitable insert that logs the rowid's that you're going to want to delete into a dedicated temporary table. Then you can delete rows from the source table where the rowid is in the temporary table. If you made it a true global temporary table then I suppose there'd be no cleanup afterwards either. (Unless you hit some bug on multi table inserts into permanent and temporary tables, of course).
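A sketch of that idea, with the table and column names taken from the original question (the temporary table name del_rids is made up for illustration):

```sql
-- one-time setup: a true global temporary table to hold the rowids
create global temporary table del_rids ( rid rowid )
on commit delete rows;

-- log each archived row's rowid as part of the same multitable insert
insert all
  into table_a_archive ( col_a, run_date ) values ( col_a, run_date )
  into del_rids        ( rid )             values ( rid )
select rowid as rid, col_a, run_date
  from table_a
 where run_date < trunc(sysdate) - 30;

-- delete exactly the rows that were archived
delete from table_a
 where rowid in ( select rid from del_rids );

commit;
```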
Tom Kyte
June 26, 2008 - 10:20 am UTC

the condition was based on run_date - old data, more than 30 days old - so rows added between the insert and the delete would not meet the archival condition.

and we could use flashback query instead of global temporary tables or a serializable transaction (although, you should limit the size of serializable transactions - short, small)


so in this case, with run_dates in the past and archiving 30 day old data - I don't see a problem with just a simple insert and delete

but we could:

get_scn
delete where rowid in (select rowid as of that_scn)
insert into arch select * from t as of that_scn;
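A sketch of those three steps, using DBMS_FLASHBACK to capture the SCN (table names src and arch as in the answer above):

```sql
declare
  l_scn  number;
  l_date date := trunc(sysdate) - 30;
begin
  -- capture the current SCN before touching the data
  l_scn := dbms_flashback.get_system_change_number;

  -- delete the rows that qualified as of that SCN
  delete from src
   where rowid in ( select rowid
                      from src as of scn l_scn
                     where run_date < l_date );

  -- flashback query still sees the deleted rows as of that SCN
  insert into arch
    select * from src as of scn l_scn
     where run_date < l_date;

  commit;
end;
```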


Close.....

David Coburn, June 27, 2008 - 1:09 am UTC

Thanks for the response. One concern I have, though, is blowing through the undo logs; they have been problematic in the past. One thing I did not mention is that the row is quite large (including a lob), and there is a LOT of activity in this database.

You can't do this without using at least *some* procedural code, I don't think. I got around this by using a loop around the statement

delete from src
where run_date < l_date
and rownum < 5000;

committing after each delete and terminating when it finds no more rows. This restricts it to 5K rows each time (well, 4,999 if you want to be picky) and makes the logging requirements lighter. I was hoping for something more elegant (read: speedier), though.
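The loop described might look like this (a sketch, with src and l_date as in the answer above):

```sql
declare
  l_date date := trunc(sysdate) - 30;
begin
  loop
    delete from src
     where run_date < l_date
       and rownum < 5000;          -- at most 4,999 rows per batch

    exit when sql%rowcount = 0;    -- no more qualifying rows
    commit;                        -- keep each transaction's undo small
  end loop;
end;
```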

Thanks again!
Tom Kyte
June 27, 2008 - 9:12 am UTC

lobs do not generate significant undo. Lobs are versioned in the lob tablespace. The undo would be for the lob index only.

ops$tkyte%ORA10GR2> create table t
  2  as
  3  select *
  4    from all_objects;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> delete from t;

50171 rows deleted.

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
      1441

<b>no lob, needs 1441 blocks</b>

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert /*+ append */ into t select * from all_objects;

50162 rows created.

ops$tkyte%ORA10GR2> alter table t add c clob;

Table altered.

ops$tkyte%ORA10GR2> declare
  2          l_data long := rpad('*',32000,'*');
  3  begin
  4          update t set c = l_data;
  5          commit;
  6  end;
  7  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select sum(dbms_lob.getlength(c)), count(*) from t;

SUM(DBMS_LOB.GETLENGTH(C))   COUNT(*)
-------------------------- ----------
                1605184000      50162

<b>so, that is 1.3GB of lob data...</b>

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> delete from t;

50162 rows deleted.

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
      2905

<b>That isn't enough undo blocks to hold 1.3gb - you'd need about 200,000 undo blocks for that (8k blocks as I have)</b>


ops$tkyte%ORA10GR2> commit;

Commit complete.





You can in fact do this without using procedural code.
We call it "sizing your system for what you actually do"

disk is cheap
your time is not
writing code is introducing bugs
bugs cost $$$$$$$
code has to be maintained
less code = less bugs


I will not budge from that position. If your undo configuration is undersized for what you do - you do not change what you do, you size your undo correctly.

you want fast?
you want the least amount of undo generated?
you want the least amount of redo generated?
you want the best index structures in place after the operation?

then you will use precisely two statements, insert+delete.


Another approach

johan, June 27, 2008 - 5:37 pm UTC

Another approach to consider (but take into account all the related issues it could introduce) is to partition the table by date. This will allow you to quickly swap old partitions between tables and/or may eliminate the need to do this at all - at least if you intend to keep the data in the same database. It also adds the benefit of transportable tablespaces to move the old data to a completely different database, if you plan your tablespaces appropriately.
Tom Kyte
June 28, 2008 - 1:22 pm UTC

absolutely, forest for the trees.

I should have said that.

Yes, instead of moving bits and bytes, we could just "swap names of segments" to 'virtually' move tons of data with a single ddl command - no undo, no redo really to speak of.

How'bout exchanging partitions

Asif Momen, June 30, 2008 - 5:44 am UTC

Another approach could be to partition both tables (table_a and table_a_archive) on the RUN_DATE column and then simply detach the required partition from table_a and attach it to table_a_archive. Something like this:

SQL> create table table_a (
  2    col_a varchar2(30), run_date date )
  3  partition by range(run_date)
  4  (partition table_a_Jan_2008 values less than (to_date('01-FEB-2008', 'DD-MON-YYYY')),
  5  partition table_a_Feb_2008 values less than (to_date('01-MAR-2008', 'DD-MON-YYYY')),
  6  partition table_a_Mar_2008 values less than (to_date('01-APR-2008', 'DD-MON-YYYY')),
  7  partition table_a_Apr_2008 values less than (to_date('01-MAY-2008', 'DD-MON-YYYY')))
  8  ;

Table created.

SQL>
SQL> create table table_a_archive (
  2    col_a varchar2(30), run_date date )
  3  partition by range(run_date)
  4  (partition table_a_arch_Jan_2008 values less than (to_date('01-FEB-2008', 'DD-MON-YYYY')),
  5  partition table_a_arch_Feb_2008 values less than (to_date('01-MAR-2008', 'DD-MON-YYYY')),
  6  partition table_a_arch_Mar_2008 values less than (to_date('01-APR-2008', 'DD-MON-YYYY')),
  7  partition table_a_arch_Apr_2008 values less than (to_date('01-MAY-2008', 'DD-MON-YYYY')))
  8  ;

Table created.

SQL>
SQL> create table table_a_dummy as select * from table_a where 1 = 2;

Table created.

SQL>
SQL> insert into table_a
  2    select 'dummy' || level,
  3           to_date('01-JAN-2008', 'DD-MON-YYYY') + level - 1
  4      from dual
  5      connect by level <= 120;

120 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a group by trunc(run_date, 'Mon');

TRUNC(RUN   COUNT(*)
--------- ----------
01-MAR-08         31
01-APR-08         29
01-JAN-08         31
01-FEB-08         29

SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a_archive group by trunc(run_date, 'Mon');

no rows selected

SQL>
SQL> select count(*) from table_a_dummy;

  COUNT(*)
----------
         0

SQL>
SQL> ALTER TABLE table_a
  2    EXCHANGE PARTITION table_a_Jan_2008
  3    WITH TABLE table_a_dummy ;

Table altered.

SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a group by trunc(run_date, 'Mon');

TRUNC(RUN   COUNT(*)
--------- ----------
01-MAR-08         31
01-APR-08         29
01-FEB-08         29

SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a_archive group by trunc(run_date, 'Mon');

no rows selected

SQL>
SQL> select count(*) from table_a_dummy;

  COUNT(*)
----------
        31

SQL>
SQL> ALTER TABLE table_a_archive
  2    EXCHANGE PARTITION table_a_arch_Jan_2008
  3    WITH TABLE table_a_dummy ;

Table altered.

SQL>
SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a group by trunc(run_date, 'Mon');

TRUNC(RUN   COUNT(*)
--------- ----------
01-MAR-08         31
01-APR-08         29
01-FEB-08         29

SQL>
SQL> select trunc(run_date, 'Mon'), count(*) from table_a_archive group by trunc(run_date, 'Mon');

TRUNC(RUN   COUNT(*)
--------- ----------
01-JAN-08         31

SQL>
SQL> select count(*) from table_a_dummy;

  COUNT(*)
----------
         0

SQL>



This approach will be fast and, since no data is actually moved (the exchange is a data-dictionary operation), it generates next to no undo or redo.