Using Frequent Truncate in Application

Question and Answer

Tom Kyte

Thanks for the question, Michal.

Asked: September 15, 2005 - 10:59 am UTC

Last updated: November 09, 2006 - 2:18 pm UTC

Version: 10.1.0.4

Viewed 10K+ times

You Asked

We are designing an application that has a very high volume of transactions (1,000 transactions per minute). Most database operations are INSERTs. Performance is critical.

The application has two main tables:
The first is partitioned by minute, so we have a total of 60*24 = 1,440 partitions. The second is partitioned by hour, so we have a total of 24 partitions.

The data in both tables is not required after several hours, so there is a process that TRUNCATEs the old data partitions every minute / hour.

Our concern is that the TRUNCATE will impact application performance. Which approach has less effect on INSERT performance? Should we TRUNCATE frequently, or is it better to wait and TRUNCATE larger partitions?

Thanks a lot.


and Tom said...

I looked at this and asked Jonathan Lewis to take a peek as well. The answer below is a mixture - where it gets really in-depth about the truncate processing, that is Jonathan.

1,000 transactions per minute is not an excessively high volume; to partition a table by minute in this case (1,000 transactions in a single partition) sounds a little "excessive". It seems hourly would be sufficient for either table and would cut down drastically on the work performed here.

Truncate will affect you: it will checkpoint any outstanding dirty blocks and it will invalidate cursors. Jonathan Lewis went a bit further:

-------------------------------------------------------
The effects are also likely to change with the version of Oracle, particularly as you cross from 9i to 10g, because of the changes to the partitioning infrastructure.

The truncate invalidates current cursors that are dependent on the object being truncated - this could lead to a lot of library cache latch activity, and subsequent re-parse costs. (It can be monitored in v$librarycache as invalidations.)

Truncate requires any dirty blocks in the segment to be written to disk (which probably won't matter in your case because there will probably be none a few hours down the line), but it does require dbwr to walk the checkpoint chain (latching).

Truncate requires any clean blocks in the segment that are in the buffer to have their buffer header marked as free. Oracle does not have an efficient mechanism
for finding clean blocks from an object - and this operation may cause a frenzy of cache buffers chains latch activity. The more indexes you have on the table, the worse this gets.

Truncating 60 partitions once per hour simply means doing 60 truncates very rapidly one after the other, so it isn't really saving any work. Your choice following the current design is really to identify the lesser of two evils. If you are on 1,000 transactions per minute, then hitting 60 segments once per hour may cause the system to come to a virtual standstill for many seconds whereas hitting one segment per minute may simply slow things down a bit, but not enough that people really notice.
-------------------------------------------------------
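To see these effects while testing, one rough approach (a minimal sketch only - v$librarycache, v$latch and their columns are the standard ones, but exact latch names vary by version, and you would snapshot the values before and after the truncates and compare the deltas):

select namespace, invalidations, reloads
  from v$librarycache
 where namespace = 'SQL AREA';

select name, gets, misses, sleeps
  from v$latch
 where name in ('cache buffers chains', 'checkpoint queue latch', 'library cache');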


So, if you are doing 1,000 transactions per minute (about what I do on asktom during busy periods of the day), minute-wise partitions are probably a bit too much. You'd want to look at perhaps just hourly - and you'll definitely want to test this with the generated load hitting it. The truncate will affect you (the cursor invalidation is bad by itself).
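For illustration only - the table, column and partition names below are made up, and on this version the application would have to populate the hour-of-day column itself - an hourly round-robin scheme could look something like this, with the purge job truncating the slot that is about to be reused:

create table txn_log
( ts      date          not null,
  hr      number(2)     not null,   -- hour of day, 0..23, set by the application
  payload varchar2(100)
)
partition by range (hr)
( partition h00 values less than ( 1), partition h01 values less than ( 2),
  partition h02 values less than ( 3), partition h03 values less than ( 4),
  partition h04 values less than ( 5), partition h05 values less than ( 6),
  partition h06 values less than ( 7), partition h07 values less than ( 8),
  partition h08 values less than ( 9), partition h09 values less than (10),
  partition h10 values less than (11), partition h11 values less than (12),
  partition h12 values less than (13), partition h13 values less than (14),
  partition h14 values less than (15), partition h15 values less than (16),
  partition h16 values less than (17), partition h17 values less than (18),
  partition h18 values less than (19), partition h19 values less than (20),
  partition h20 values less than (21), partition h21 values less than (22),
  partition h22 values less than (23), partition h23 values less than (24)
);

-- hourly purge job: empty the slot that is about to be refilled
alter table txn_log truncate partition h05 drop storage;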

Rating

  (24 ratings)


Comments

invalid cursors

Doug, September 16, 2005 - 11:13 am UTC

So does this mean if you have a statement with a bind variable that was being used on partition 59MIN for example, as well as 58MIN, 57MIN, etc., and then 59MIN gets truncated - it has to be reparsed even though it is in use on the newer partitions?

Tom Kyte
September 16, 2005 - 1:58 pm UTC

ops$tkyte@ORA10G> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(25)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /
 
Table created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> insert into t select to_date('12-mar-2003','dd-mon-yyyy')+mod(rownum,5), rownum, rownum from all_users;
 
39 rows created.
 
ops$tkyte@ORA10G> commit;
 
Commit complete.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> alter session set sql_trace=true;
 
Session altered.
 
ops$tkyte@ORA10G> select count(*) from t where dt = to_date('12-mar-2003','dd-mon-yyyy');
 
  COUNT(*)
----------
         7
 
ops$tkyte@ORA10G> select count(*) from t where dt = to_date('12-mar-2003','dd-mon-yyyy');
 
  COUNT(*)
----------
         7
 
ops$tkyte@ORA10G> select count(*) from t where dt = to_date('12-mar-2003','dd-mon-yyyy');
 
  COUNT(*)
----------
         7
 
ops$tkyte@ORA10G> alter table t truncate partition junk;
 
Table truncated.
 
ops$tkyte@ORA10G> select count(*) from t where dt = to_date('12-mar-2003','dd-mon-yyyy');
 
  COUNT(*)
----------
         7



tkprof says:


select count(*)
from
 t where dt = to_date('12-mar-2003','dd-mon-yyyy')
                                                                                                                                                       
                                                                                                                                                       
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        4      0.00       0.00          0          2          0           0
Execute      4      0.00       0.00          0          0          0           0
Fetch        8      0.00       0.00          0         28          0           4
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       16      0.00       0.00          0         30          0           4
Misses in library cache during parse: 2
                              ^^^^^^^^^^^^^^
Optimizer mode: ALL_ROWS
Parsing user id: 170
                                                                                                                                                       
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=7 pr=0 pw=0 time=107 us)
      7   PARTITION RANGE SINGLE PARTITION: 1 1 (cr=7 pr=0 pw=0 time=87 us)
      7    TABLE ACCESS FULL T PARTITION: 1 1 (cr=7 pr=0 pw=0 time=60 us)



 

Unexpected invalidation

Jonathan Lewis, September 16, 2005 - 12:52 pm UTC

Just a quick demo - with a bit of 'trust me'.

Session 1: (ordinary user)
SQL> l
  1* select id, grp from pt_range  partition(p600) where id >= :b1
SQL> /

        ID        GRP
---------- ----------
       550          3

Session 2: (SYS)
SQL> select invalidations, sql_text from v$sql
  2  where sql_text like 'select id, grp%p600%';

INVALIDATIONS
-------------
SQL_TEXT
------------------------------------------------------------
            0
select id, grp from pt_range  partition(p600) where id >= :b1



Session 1 (ordinary user)
SQL> alter table pt_range truncate partition p400;

Table truncated.

Session 2 (SYS)
SQL> /

INVALIDATIONS
-------------
SQL_TEXT
------------------------------------------------------------
            1
select id, grp from pt_range  partition(p600) where id >= :b1



The statement got invalidated - even though it used partition-specific code, and we truncated a different partition.

 

Tom Kyte
September 16, 2005 - 2:01 pm UTC

didn't see that before I did my tkprof - which includes a bit of trust me (that the cursor didn't get aged out or something)

Tom + Jonathan

Alberto Dell'Era, September 16, 2005 - 4:23 pm UTC

Nice thing is, this four-hands answering by Tom & Jonathan - are we going to see more of this ?
I know that you have already done that in the past, just wondering if you're going to strengthen the integration (you are somewhat complementary in your approaches, the total greater than the sum of the individuals).

Tom Kyte
September 16, 2005 - 6:03 pm UTC

Not really, I mean not more than usual -- on this one he would have some very specific input. I had a gut feel, he supplied the details.

Why Invalidation in this case ?

Jagjeet Singh, September 17, 2005 - 6:08 am UTC

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production

SQL> set echo on
SQL> start a
SQL> drop table t;

Table dropped.

SQL> 
SQL> create table  t ( id     int, t_date    date)
  2  partition by range (id )
  3  (
  4  partition p1  values less than (10),
  5  partition p2 values  less than (20))
  6  /

Table created.

SQL> 
SQL> insert into t select rownum r,sysdate from dict where rownum < 20;

19 rows created.

SQL> commit;

Commit complete.

SQL> 
SQL> select * from t where id = 1;

    ID T_DATE
---------- ---------
     1 01-JAN-99


SQL> select sql_text,invalidations from v$sql where sql_text like 'select * from t%'
  2  /

SQL_TEXT                 INVALIDATIONS
---------------------------------------- -------------
select * from t where id = 1                 0

-- I know it does not have any primary key

SQL> alter table t drop primary key ;

alter table t drop primary key
*
ERROR at line 1:
ORA-02441: Cannot drop nonexistent primary key

SQL> start b

SQL> select sql_text,invalidations from v$sql where sql_text like 'select * from t%'
  2  /

SQL_TEXT                 INVALIDATIONS
---------------------------------------- -------------
select * from t where id = 1                 1

Thanks,
Jagjeet Singh 

Tom Kyte
September 17, 2005 - 8:04 am UTC

The failed ddl seems to have invalidated it -- the ddl must have invalidated the dependent SQL and then failed, apparently.

the dark side of doing DDL frequently in a live system.

Use ADD / DROP Partition or Separate Tables

Michal Mor, September 17, 2005 - 8:01 am UTC

First, thank you both Tom and Jonathan for the great info.

Now that I understand the effect of the TRUNCATE operation, I see we have a big problem with the design.

First, I made a serious mistake when typing the question. The transaction volume is 1,000 per second and not 1,000 per minute as I wrote.
Second, performance is critical. The application is almost a real-time application, and response time should be stable and quick.

Is it better to use DROP / ADD partition instead of TRUNCATE? Or maybe better to shift to a separate table per time period?

Thanks a lot!


Tom Kyte
September 17, 2005 - 8:25 am UTC

still, that is only 360,000 per hour - so minute wise partitions..... still too much.

one word for you:

benchmark
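For that benchmark, the two alternatives under discussion look roughly like this (a sketch only - the table and partition names are made up; note that with range partitioning a new partition can only be added above the current high bound, so drop/add suits an ever-ascending key such as a timestamp, while truncate suits a fixed set of reused slots):

-- variant 1: keep a fixed set of partitions and empty the old one
alter table txn_log truncate partition h05 drop storage;

-- variant 2: roll the window by dropping the oldest partition and adding a new one
alter table txn_log drop partition p_2005091706;
alter table txn_log add  partition p_2005091807
      values less than (to_date('18-sep-2005 08', 'dd-mon-yyyy hh24'));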

Contention problems

Jonathan Lewis, September 17, 2005 - 10:23 am UTC

Still running the double-hander here.

(Sorry Alberto, I'm just tracking this to see if any of my thoughts were relevant - AskTom is not going to become a double-act.)

Are you keeping a fixed number of partitions and doing a round-robin to re-use them, or will you be adding partitions at the top and dropping them at the bottom ? (The latter will be nasty with 1,440 partitions in 9.2 with your rate of work).

If you are re-using them, then perhaps you want to look at loading the data into a spare table, doing an exchange partition, and then dropping the spare table a little while later. (Details to be handled with extreme care).

I mention this, because at 1,000 transactions per second, if you have someone querying data in 'the top few partitions' as new data is going in, then their queries are going to be working like crazy to make blocks read consistent so that they can't see the rows that are arriving as their transaction runs. Loading data by hiding it until you exchange it stops this overhead.

But whatever you do - exchange, truncate, etc. - you will invalidate cursors, and even if it takes only 5 hundredths of a second (say) to complete the DDL, that's still a few dozen transactions that will jam, collide with the next few dozen, and give you extra problems with contention: buffer busy waits, redo copy latches, etc.
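A minimal sketch of that load-then-exchange approach (all names here are made up, and the "extreme care" details - matching indexes, constraints, statistics - are glossed over):

-- 1. load the next period's rows into a standalone spare table, out of sight of queries
create table txn_spare as select * from txn_log where 1 = 0;

insert /*+ append */ into txn_spare
select sysdate, 5, rpad('x', 100, 'x') from all_objects where rownum <= 1000;
commit;

-- 2. swap the spare table in; this is a data-dictionary operation, no data moves
alter table txn_log
  exchange partition h05 with table txn_spare
  including indexes without validation;

-- 3. the previous contents of h05 now live in txn_spare; drop that table
--    "a little while later", once queries already reading it have finished
drop table txn_spare;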


Tom Kyte
September 17, 2005 - 11:34 am UTC

this is going to sound strange (maybe not) - but have you (the poster) considered TimesTen...

This data sounds very "time is of the essence and after a short period of time, we don't really care about it anymore"

Deduct points for arithmetic

Jonathan Lewis, September 17, 2005 - 12:03 pm UTC

Tom,

Arithmetic error !!
Proof:

SQL> select 1000 * 3600 from dual;

 1000*3600
----------
   3600000

1 row selected.

That's 3.6 M per hour, not 360,000.



Tom Kyte
September 17, 2005 - 1:37 pm UTC

doh! thanks

still - for a partition, 3.6 million isn't entirely large, and 60,000 per minute is too small.

How many millions in a Partition ?

Reader, September 17, 2005 - 9:21 pm UTC

It would be nice to know how many rows should be in a partition. Or should it be by segment size, e.g. 3GB per segment?

Thanks

Tom Kyte
September 17, 2005 - 11:51 pm UTC

none of the above.

It depends on your needs, your goals, your reason for using partitioning.

Sometimes a 100,000,000 row table should be partitioned, and sometimes not.

You have to ask "what is my goal with partitioning"

Here, it is to purge data on a 24 hour revolving window.

Reader, September 17, 2005 - 11:23 pm UTC

"....60,000/minute is too small.."

How do you arrive at these values?

Tom Kyte
September 17, 2005 - 11:52 pm UTC

60,000 is just a very small number - some of my data dictionaries are larger than that "row wise"

Not a significant amount of data.

TimesTen

Michal Mor, September 18, 2005 - 12:27 am UTC

You are absolutely right regarding the recommendation for TimesTen.
We have several production systems running with TimesTen, and we need to develop a version with Oracle as well (for smaller customers).

The answers are very useful, thanks a lot!


Tom Kyte
September 18, 2005 - 12:39 am UTC

that is something I could not have guessed (that you already used it). For really transient data like this, it might be the thing to look at.

What is "Times Ten"? Don't get it

A reader, September 18, 2005 - 5:16 am UTC


Alberto Dell'Era, September 18, 2005 - 6:52 am UTC

Tom Kyte
September 18, 2005 - 11:27 am UTC

Alberto - thanks!

Helena Marková, September 19, 2005 - 7:49 am UTC


extending rolling window

Jaromir D.B. Nemec, September 19, 2005 - 7:03 pm UTC

<quote>If you are re-using them, then perhaps you want to look at loading the data into
a spare table, doing an exchange partition, and then dropping the spare table a
little while later.
</quote>
Very nice formulation, "little while later" - I suppose the drop is delayed to give the active queries reading the data from the exchanged partition (now in a spare table) a chance to finish.
This gives an interesting extension to the definition of a rolling window. For example: a table contains one year of historical data PLUS six hours for long-running queries.

Jaromir D.B. Nemec


Impact of Delete stmt rather than Truncate?

John Skrabak, September 19, 2005 - 9:50 pm UTC

I work with an SAP Business Warehouse system running on 9.2. A recent feature SAP added to the system is an optional setting that we can use to either truncate a table or run a delete. SAP suggests that for small tables (their suggested value is tables with < 1,000 rows) it is faster to run the delete rather than a truncate. Given the earlier comments in the thread, that seems like it makes sense. What do you think?

Do you think that will hold true in 10g given some of the changes?

Size of the partition

reader, September 22, 2005 - 8:23 am UTC

Oracle9i Database Concepts
Release 2 (9.2)
Part Number A96524-01

The document says that Tables greater than 2GB should always be considered for partitioning.

What are your comments on this? Because it says 'should be considered', it seems it takes some experience to decide.

Please help.

Thanks

Tom Kyte
September 22, 2005 - 2:03 pm UTC

give me a link into otn please.


I think the "should be considered" gives wide latitude here.

The Link

S, September 22, 2005 - 3:10 pm UTC

Tom Kyte
September 22, 2005 - 6:05 pm UTC

...
When to Partition a Table

Here are some suggestions for when to partition a table:

* Tables greater than 2GB should always be considered for partitioning.
* Tables containing historical data, in which new data is added into the newest partition. A typical example is a historical table where only the current month's data is updatable and the other 11 months are read-only.

.......

I don't see anything wrong with that suggestion - "suggestions", "be considered"

Seems very "generic" to me. 2GB is sort of small these days; the threshold might be a bit low now.



truncate and load partition question

A reader, September 22, 2005 - 3:27 pm UTC

Here's a scenario:

We get data from various feeds on an irregular basis. We'd like to be able to truncate and load a table based on the feed provider. This table is very skinny, with a state, feed_id, hashed_id and file_ref.

A problem is, we want to load and truncate (partition) by feed_id for ease/speed of loading. The app will be down while we load. The feeds all have very different amounts of data. One feed provider can have 1 Billion rows, and one feed provider could have 60 Million rows. I immediately thought list partitioning (we have a set number of feed_id's) for the trunc/load procedure. This would lead to an uneven distribution of data. One billion rows in one partition, 60 million rows in another. Even if we were able to range on the feed_id, and hash on something else, we'd still have a very skewed partitioning scheme on these tables.

The feeds would contain new data, data we currently have (which would be updates to that data - merge came to mind for this), but we'd also have to delete records (an unknown volume at this time) which are no longer present in the feed files. Merge with delete doesn't sound very efficient over 1 billion rows, and we'd still have to stage the data.

Access (selects) on the table will be based on (hashed_id and state) or (feed_id and state).

Table would look like:
feed_lookup
(feed_id char(3) Not Null,
state char(2) not null,
hashed_id char(8) not null,
file_ref number not null)

However, since loads are only a small (yet normal, and intensive when done) part of the application maintenance, I wonder if it's even worth the pain for ease of load.

Just wondering if there was something outside the box I'm not seeing for this particular scenario, since I'm so far inside the box at this point.

Tom Kyte
September 22, 2005 - 6:08 pm UTC

partitioning is a tool, one that can do many things

partitions do not have to be evenly loaded, it is not a requirement, nor even a "goal" really in general.

What is the goal you are trying to get to - ease of loading? This could definitely achieve that.

The application doesn't have to be down during the load, you could load a separate table and when loaded, indexed, verified, whatever - use exchange partition to make it "current"
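A rough sketch of what that could look like for the table described above, list partitioned by feed_id so each feed can be loaded and published independently (the feed_id values and the staging step are illustrative, and the uneven partition sizes don't matter here):

create table feed_lookup
( feed_id   char(3) not null,
  state     char(2) not null,
  hashed_id char(8) not null,
  file_ref  number  not null
)
partition by list (feed_id)
( partition feed_aaa  values ('AAA'),
  partition feed_bbb  values ('BBB'),
  partition feed_misc values (default)
);

-- per feed: load into a standalone staging table while the application stays up,
-- then publish the new data with a single exchange
create table feed_aaa_stage as select * from feed_lookup where 1 = 0;
-- ... load, index and verify feed_aaa_stage here ...
alter table feed_lookup exchange partition feed_aaa
      with table feed_aaa_stage including indexes without validation;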

How about list partitioning by state?

Logan Palanisamy, September 22, 2005 - 4:28 pm UTC

Since "state" is used in both of your queries, why not LIST partition by state. It might also eliminate the problem of uneven partition sizes.

Going further, you could even sub-partition by the hashed_id field.



Truncating partitions

A reader, May 11, 2006 - 6:20 pm UTC

Hi Tom, I have a table that uses list partitioning where each partition represents a day, so I have 31 partitions named D1, D2, D3 ... D31.
I am trying to create a stored procedure that will truncate the partition that I don't need anymore, and I cannot get the execute immediate to work. I tried to use bind variables for the partition name and it did not work; then I tried to do the execute immediate without using bind variables and it did not work either. Could you tell me what I am doing wrong here?

Thanks a lot for any help you can give me with this.

version without using bind variables
====================================
create or replace procedure sp_purging is
wpart varchar2(4):=' ';
sqlstr long;
begin
select 'd'||to_char(to_number(to_char(trunc(sysdate)-15,'dd'))) into wpart from dual;
sqlstr:='alter table details truncate partition '|| wpart ||' drop storage';
execute immediate sqlstr;
end;
/

version using bind variables
============================
create or replace procedure sp_purging is
wpart varchar2(4):=' ';
begin
select 'd'||to_char(to_number(to_char(trunc(sysdate)-15,'dd'))) into wpart from dual;
execute immediate 'alter table details truncate partition :wpart drop storage' using wpart;
end;
/


Tom Kyte
May 11, 2006 - 8:24 pm UTC

... I tried to use
bind variables for the partition name and it did not work,
.......


you cannot bind in DDL. period.


but you give us no COMPLETE (yet concise) test case to play with - so I can only assume the DDL statement you generated was not valid.
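For what it's worth, a sketch of the non-bind version with the generated statement echoed, so that whatever DDL is actually produced can be inspected when it fails (same logic as the first version posted above, just with dbms_output added):

create or replace procedure sp_purging
is
  wpart  varchar2(4);
  sqlstr varchar2(200);
begin
  wpart  := 'd' || to_char(to_number(to_char(trunc(sysdate) - 15, 'dd')));
  sqlstr := 'alter table details truncate partition ' || wpart || ' drop storage';
  dbms_output.put_line(sqlstr);   -- see exactly what is about to be run
  execute immediate sqlstr;
end;
/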

Usage of TimesTen

Pravin Ningoo, November 09, 2006 - 8:31 am UTC

Hi Tom,

Ours is a critical application (like everyone else's) :-)

But the challenging bit for us is
1 - 300-400 TPS (inserts) with clobs 4K to 5K in table 1
1 - 300-400 TPS (inserts) with clobs 20K to 25K in table 2
2 - 600-900 TPS (updates) to maintain the status of these records in table 1.

all of the above happen simultaneously. (there are batch jobs running overnight in addition to this).

Any recommendations on this?

Few questions...
- Can TimesTen take this load?
- Does TimesTen have clobs?
- is it sensible to think of using TimesTen at such a large scale?
- Current suggested design is updating a bitmap indexed column (at the rate of 600-900 TPS) --- wise thing to do?
- The 5K clob table (table 1) is an IOT (it may need a mapping table for bitmap indexes) --- wise thing to do?

TIA
Pravin

Tom Kyte
November 09, 2006 - 2:18 pm UTC

o probably, and probably Oracle too. don't forget, (2) above must involve some reads too - very important to add that to your spec (either you read and update, or you update - and the update has to read)

o no

o sure

o hahahahahaha that'll never work in a billion years. No, you cannot have a bitmap index on these tables - period.

o no bitmaps, forget about them, they will not work

a single bitmap key entry points to MANY rows (concurrency issues)
bitmaps are not at all suitable for single row update/insert environments (even in 10g, they are better but not suitable)

Bhagat Singh, November 10, 2006 - 6:54 am UTC

Wonderful explanation. My understanding on this is:
truncate = DDL
drop constraint = DDL
so concluding that DDL on partitioned tables is going to exhibit the above-mentioned behaviour.


truncate vs delete

shddba, January 14, 2007 - 5:40 am UTC

Hi,
We have a table that undergoes 10,000 to 20,000 inserts per 30 minutes. However, there is a need to schedule a job every 30 minutes which keeps ONLY the last minute of data in this table and deletes the rest, while at the same time the online application is inserting new records.
The table has a column time_in of DATE data type which will have the sysdate value whenever a new record is inserted.

So we thought of two ways of PURGING DATA every 30 minutes, i.e. purging data for (30-1 = 29) minutes.

1. Create a PL/SQL procedure that runs a DELETE command
against this table with a filter clause deleting records older than (30-1) minutes. Execute this PL/SQL procedure every 30 minutes from the database scheduler.

2. Add a column to the table with NUMBER data type, and have every insert populate it with seconds from midnight.
Then create 48 range partitions on that column.
After which, every 1/2 hour, run a database scheduler job to just truncate the partition previous to the current one.

Questions:
1. Which way is better? One DBA suggests that since there are not that many transactions - about 10K in 1/2 hour - the delete works fine.
Would such a delete affect the inserts? Maybe block-level locking in memory (latching), e.g. an insert will hang because the same block has rows which are being deleted, etc.?
The argument is that creating partitions and running truncate is heavier than just using the simple delete.

What is your opinion?

PS: We cannot simulate the real-time load, so we can't stress test it.






