Question and Answer

Tom Kyte

Thanks for the question, Tansel.

Asked: January 31, 2001 - 4:25 pm UTC

Last updated: February 06, 2013 - 2:32 pm UTC

Version: 8.1.5

Viewed 100K+ times!

You Asked

Hi Tom,

Thanks again for your valuable help to the Oracle developer community:

We will develop an application where a very large number of inserts will take place. We will be inserting records into our database as we read them from our data source. There will be only one application inserting records. The table has only a few columns. Our goal is to perform 15,000-20,000 inserts a second.

If we use PL/SQL bulk inserts we get the best performance, but this option requires more coding. My question is:

How can we optimize the insert performance of our database for a table as described above?

Thanks,

Tansel

and Tom said...

Use OCI or Pro*C. Compiled C code will be faster than PL/SQL.

Array processing is the way to go. The amount of extra code needed for bulk binds versus non-bulk binds is very small (especially when compared to the PERFORMANCE increase you get).

Instead of coding:

tkyte@TKYTE816> begin
2 for j in 1 .. 5000 loop
3 insert into insert_into_table values ( j );
4 end loop;
5 commit;
6 end;
7 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.74

you code:

tkyte@TKYTE816> declare
2 type numTab is table of number(4) index by binary_integer;
3 data numTab;
4 empty numTab;
5 begin
6 for j in 1 .. 5000 loop
7 data(data.count+1) := j;
8
9 if ( mod(data.count,1000) = 0 )
10 then
11 forall i in 1 .. data.count
12 insert into insert_into_table values ( data(i) );
13 data := empty;
14 end if;
15
16 end loop;
17 if ( data.count is not null )
18 then
19 forall i in 1 .. data.count
20 insert into insert_into_table values ( data(i) );
21 end if;
22 commit;
23 end;
24 /


PL/SQL procedure successfully completed.
Elapsed: 00:00:00.40


you replace the insert with:

7 data(data.count+1) := j;

and add:

9 if ( mod(data.count,1000) = 0 )
10 then
11 forall i in 1 .. data.count
12 insert into insert_into_table values ( data(i) );
13 data := empty;
14 end if;

after the assignments, to insert every 1,000 rows for example. It does not add significantly to the code OR the logic at all (it doesn't really affect the logic -- unless you expect a high number of REJECTS, in which case you need lots of error handling logic, but you would need that anyway....)

Given that it takes it from 3+ seconds to 0.40 seconds -- it is well worth it.

OCI and Pro*C would be a little faster in my opinion in this case.


Comments

Good to find a statistic

Steve Pilgrim, June 26, 2001 - 5:39 am UTC

I am about to optimise an update, and this is the sort of performance improvement I would like.

I am using Pro*C with an update of 20-30 attributes per customer, and the forall looks like the way to go.

How do we go about it when the table has around 20-25 fields?

Sushil Vyas, April 11, 2002 - 9:55 am UTC

Hi Tom,

The example you have provided is good, as usual.
I am in the process of writing a procedure which has to push around 0.1 million records over the LAN using a DB link, at an interval of 30 minutes, from 20 different tables.
This is a sort of replication based on snapshot logs, but for security reasons I have to push the data from the propagator DB instead of allowing the receiver DB to pull the data.

The number of fields in the tables is around 20 on average.
As of now my flow is:
Open a cursor. Go into a for loop.
Fetch field values (on the basis of rowid) into variables.
Insert into a table using the DB link.
Return to the fetch.
Commit outside the for loop.

Time taken is 21 minutes for 128,000 records.

Can you help me make it perform better? I tried to follow what you have shown by creating PL/SQL types for each table, but that did not work with my understanding.

I would be grateful if you could provide an example.

Tks
Sushil

Tom Kyte
April 11, 2002 - 9:57 am UTC

use insert into ... select from

don't use a cursor for what can easily be done in a single statement.
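
For illustration, a hedged sketch of what that single statement might look like for this case -- every name here (remote_t, my_link, local_t, last_changed) is made up, not from the thread:

insert into remote_t@my_link ( col1, col2, col3 /* , ... */ )
select col1, col2, col3 /* , ... */
  from local_t
 where last_changed >= sysdate - 30/(24*60);  -- whatever identifies the rows to push
commit;

The whole open-cursor / fetch / single-row-insert loop collapses into one statement over the link.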

But I have to check for the existence of records, if any

Sushil Vyas, April 11, 2002 - 10:06 am UTC

Hi Tom,

Sorry, I did not mention that there is also a possibility that the record already exists in the DB. Therefore I have to use a cursor, and in case the record exists I go for updating that record.

Tks
Sushil.

Tom Kyte
April 11, 2002 - 10:36 am UTC

A snapshot style refresh is easiest done with two commands:

delete from remote_table
 where rowid_pointer_column in ( select rowid from snapshot_log where dml_type in ( 'U', 'D' ) );

insert into remote_table
select * from local_table
 where rowid in ( select rowid from snapshot_log where dml_type in ( 'U', 'I' ) );

Don't even try to update, just DELETE all modified/deleted records. Insert all new ones.

Tuning insert into ... select * from ...

Marc, April 11, 2002 - 4:19 pm UTC

Hi Tom

Your page is very, very good. Thanks for your work. I have a problem tuning large inserts. I am preparing all my data within a global temporary table, and then put them into the big "live" table. The table is partitioned. For a few million rows this takes a while.

For better performance I have tried the append hint on the insert statement, but I get a deadlock error when running parallel processes on the same table.

How can I speed up an insert into ... select * from ...? Thanks for your answer.

Tom Kyte
April 11, 2002 - 7:30 pm UTC

the /*+ append */ hint doesn't make sense really for a global temporary table -- it already has most of those attributes.

A select into temp table will go as fast as

a) the query
b) your disks

use TKPROF to see where the bottleneck resides

http://asktom.oracle.com/~tkyte/tkprof.html

What about forms?

Steve Kiteley, June 07, 2002 - 6:13 am UTC

What version of PL/SQL does this require? I tried to use the forall command from within a Forms trigger and it would not compile. I tried Forms 6i (PL/SQL v 8.0.6) and Forms 5 (PL/SQL v 2.3.4).



Tom Kyte
June 07, 2002 - 7:12 am UTC

forall was added to the 8i release of the database -- you need the 8.1 version of plsql or better.

Why does insert /*+ append */ lock the table?

A reader, July 23, 2002 - 4:56 pm UTC

Hi Tom

If I do an insert /*+ append */ into a table and wait with the commit, then start a second session doing the same insert, a lock occurs. See my sample:

create table test as select * from dual;
insert /*+ append */ into test select * from test;

now start the insert a second time in a separate session:
insert /*+ append */ into test select * from test;

and you have to wait for commit.

If I run the same insert without the append hint, no lock happens. Is this a feature or a bug? Thanks for your answer.

Tom Kyte
July 23, 2002 - 9:09 pm UTC

designed behavior. To do a parallel direct path load using SQL -- you use parallel query in the original insert.

Just like doing a parallel direct path load.
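
For illustration, a sketch of that parallel direct-path insert done in one statement rather than from two concurrent sessions -- the table names (big_tab, source_tab) and the degree of 4 are assumptions, not from the thread:

alter session enable parallel dml;

insert /*+ append parallel(t1, 4) */ into big_tab t1
select /*+ parallel(t2, 4) */ *
  from source_tab t2;

commit;  -- a direct-path loaded segment cannot be re-queried in this session until you commit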

This can work for update too ?

Ruben, July 24, 2002 - 9:16 pm UTC

Tom, we want to delete thousands of records from
different tables, but it takes 1 hour 20 minutes to delete
850,000 records. Can Pro*C or OCI help us minimize
the time for this purpose?

Tom Kyte
July 24, 2002 - 11:05 pm UTC

I don't know. Depends on the logic used to delete, the indexes in place and many other considerations.

Shahrouz DARVISHZADEH, March 24, 2003 - 9:20 am UTC

Tom, in the case of a bulk insert, is there a difference if I use the NOLOGGING option? Is there a difference if I use temporary tables? A concrete case:
-insert without BULK INSERT:

Elapsed: 00:00:10.05
-Insert with BULK INSERT:

Elapsed: 00:00:03.02
- BULK INSERT with table in NOLOGGING mode:

Elapsed: 00:00:02.05
- BULK INSERT in a GLOBAL TEMPORARY TABLE:

Elapsed: 00:00:06.02

Why is the insert into the temporary table so slow? It should be faster, shouldn't it?

Shahrouz

Tom Kyte
March 24, 2003 - 10:14 am UTC

that is as concrete as this:

1
23432
232
2235432532

I have 4 numbers without an example....

NOLOGGING affects only one type of bulk modification -- direct path inserts such as those achieved with the direct path loader, CTAS, insert /*+ append */, or the direct path API in OCI.

It does NOT affect conventional path inserts at all.


A global temporary table can avoid some amount of REDO generation. It does however generate UNDO (has to) and that UNDO does generate REDO. Further, a GTT is just a table that truncates itself post commit or session - otherwise it is a table -- indexes must be maintained, blocks have to be formatted.


No reproducible example, no explanation from me.


In this case -- I would say you are probably testing on a system with other users and the fluctuations in response time are due to them using the system at the same time you are -- or your redo logs are tiny and you are waiting for checkpoints, and if you kept running the examples you would get very spiky response times, sometimes things would take a second, sometimes ten -- because you are waiting on IO.



Good info! ......Snapshot type refresh method.

Robert, March 24, 2003 - 11:59 am UTC

Good information on snapshot style refresh in 2 steps...(delete, insert), vs. 1 step (update).

Shahrouz DARVISHZADEH, March 25, 2003 - 4:37 am UTC

I didn't want to bother you with my insert procedure, which is about 1,400 lines, therefore I wrote only the timing statistics. Here is only the insert part:
loop
    fetch x_cursor_var bulk collect into l_record.r_STRUCTUREPRIV_ID
        LIMIT l_array_size;
    l_done := x_cursor_var%NOTFOUND;
    forall i in 1..l_record.r_STRUCTUREPRIV_ID.count
        insert into sis_structurepriv_user(id, structure_id, user_id, structurepriv_id, admin_option, cuserid, cdate, version)
        values(seq_sis_structurepriv_user.nextval, l_record.r_STRUCTUREPRIV_ID(i), R.user_id, x_structpriv_select, 'NO', 3, sysdate, 1);
    .....
    exit when (l_done);
    l_cnt := l_cnt + l_array_size;
end loop;
close x_cursor_var;

If I understand you properly there is no difference whether I use NOLOGGING or not in this case (but I swear IT IS faster with NOLOGGING). The elapsed time statistics I sent you are reproducible every time. I am the only user on a local machine.
Temporary tables are exactly the thing we need in this case. We delete everything in this table and insert the values anew every time. The content of the table is not usable by other users. But GTTs are the slowest thing in my inserts, as you can see. I read the chapter in your book about GTTs and didn't find much about how I can tune this sort of table. Thus my question: am I doing something wrong?

Thank you in advance
Shahrouz DARVISHZADEH

Logging in bulk operations

Sagar, October 01, 2003 - 4:31 am UTC

Hi Tom,
If I am doing any bulk insert/update/delete, will there be logging? If yes, can I change the behaviour to no-logging at the statement level itself (e.g. in a FORALL insert)?

Regards,
Sagar

Tom Kyte
October 01, 2003 - 8:41 am UTC


update and delete = ALWAYS logging (and you are mistaken if you think you don't want it)

INSERT /*+ APPEND */ as select can skip logging.


INSERT with a VALUES clause = ALWAYS logging -- even with "forall", it is a single row insert and those are not done in "direct" mode.
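
For illustration, a minimal sketch of that distinction -- the table name t and the use of ALL_OBJECTS as a row source are assumptions, not from the thread:

create table t ( x number ) nologging;

-- direct path: with the table in NOLOGGING mode, redo for the table blocks can be skipped
insert /*+ append */ into t select rownum from all_objects;
commit;

-- conventional path (single row, even inside a FORALL): always fully logged
insert into t values ( 0 );
commit;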

Dan, June 14, 2004 - 11:31 am UTC

Hi Tom
First of all thanks for your superb support.

Question 1

In my SP I would like to use bulk insert. In my case, if the record exists in the table I will update only one column, else I will do an insert. Here I have seen the "snapshot style refresh" -- I think it won't help in this situation. I will explain in detail.

For E.g.

I have a table

Create table Test (A number(3), B number(3), C number(3), D number(3));

ALTER TABLE Test ADD PRIMARY KEY(A);

CREATE TYPE Function_detail as object(
A NUMBER(3),
B NUMBER(3),
C NUMBER(3),
D NUMBER(3));
/

CREATE TYPE FUNCTION_DETAIL_ARRAY IS TABLE OF Function_detail;
/

------

I will check for the record; if it exists then I will update only one column, "D". If the record does not exist then I will do an insert.

Right now I am doing it like this:
Create Procedure B_Bind(
    P_FUNCTION_PLAN_ARRAY FUNCTION_DETAIL_ARRAY)
AS
BEGIN
    FOR I IN 1..P_FUNCTION_PLAN_ARRAY.COUNT
    LOOP
        UPDATE Test SET D = P_FUNCTION_PLAN_ARRAY(I).D
         WHERE A = P_FUNCTION_PLAN_ARRAY(I).A;

        IF SQL%NOTFOUND THEN
            INSERT INTO Test VALUES(
                P_FUNCTION_PLAN_ARRAY(I).A,
                P_FUNCTION_PLAN_ARRAY(I).B,
                P_FUNCTION_PLAN_ARRAY(I).C,
                P_FUNCTION_PLAN_ARRAY(I).D);
        END IF;

    END LOOP;
END;

I would like to use a bulk bind here. Applications will send a high volume of data.
Please suggest a way to do this.

Question 2.

In my application the Java programmer will send one array as an input parameter to the stored procedure. Sometimes the array may contain more than 100 objects. In that case, what are the performance bottlenecks? I heard that if the array has more than 100 objects it would cause some problems. Is that true?

Thanks in advance.
Dan


Tom Kyte
June 14, 2004 - 11:50 am UTC

1) merge....

ops$tkyte@ORA9IR2> declare
  2      p_function_plan_array  FUNCTION_DETAIL_ARRAY := FUNCTION_DETAIL_ARRAY(
  3      Function_detail( 1, 1, 1, 1 ),
  4      Function_detail( 2, 2, 2, 2 ),
  5      Function_detail( 3, 3, 3, 3 ) );
  6  begin
  7      merge into test t1
  8      using ( select * from table( cast( p_function_plan_array as FUNCTION_DETAIL_ARRAY ) ) ) t2
  9      on ( t1.a = t2.a )
 10      when matched then update set t1.d = t2.d
 11      when not matched then insert ( a,b,c,d) values ( t2.a, t2.b, t2.c, t2.d );
 12  end;
 13  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select * from test;
 
         A          B          C          D
---------- ---------- ---------- ----------
         3          3          3          3
         1          1          1          1
         2          2          2          2
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> declare
  2      p_function_plan_array  FUNCTION_DETAIL_ARRAY := FUNCTION_DETAIL_ARRAY(
  3      Function_detail( 1, 1, 1, 11 ),
  4      Function_detail( 3, 3, 3, 13 ),
  5      Function_detail( 4, 4, 4, 4 ) );
  6  begin
  7      merge into test t1
  8      using ( select * from table( cast( p_function_plan_array as FUNCTION_DETAIL_ARRAY ) ) ) t2
  9      on ( t1.a = t2.a )
 10      when matched then update set t1.d = t2.d
 11      when not matched then insert ( a,b,c,d) values ( t2.a, t2.b, t2.c, t2.d );
 12  end;
 13  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> select * from test;
 
         A          B          C          D
---------- ---------- ---------- ----------
         3          3          3         13
         1          1          1         11
         2          2          2          2
         4          4          4          4
 
ops$tkyte@ORA9IR2>



2) 100 is considered the "sweet spot" for bulk (array) binding.  It is a "rule of thumb".  It keeps things flowing nicely (java builds 100 somethings, database processes 100 somethings, java builds another 100 somethings -- while the database is doing other stuff like writing out buffered redo, flushing dirty blocks, etc).....

It is ok to go over 100; I would not like to see 1,000s of things at once (memory, network, flood of work). Better to "stream"
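
For illustration, the same "work in batches of about 100" idea expressed in PL/SQL -- source_tab, dest_tab and the column x are made-up names:

declare
    cursor c is select x from source_tab;
    type numTab is table of number index by binary_integer;
    l_data numTab;
begin
    open c;
    loop
        fetch c bulk collect into l_data limit 100;   -- stream roughly 100 rows at a time
        exit when l_data.count = 0;
        forall i in 1 .. l_data.count
            insert into dest_tab ( x ) values ( l_data(i) );
    end loop;
    close c;
    commit;
end;
/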

Ohhh So Fast

dan, June 14, 2004 - 12:12 pm UTC

Hi Tom,

Thanks For your immediate reply.

But I am using Oracle 8.1.7, so I can't use the MERGE statement. Please suggest some other way.

Thanks in advance
Dan

Tom Kyte
June 14, 2004 - 1:12 pm UTC

you cannot in 8i, not without moving all of the data out of the array of records - into a record of arrays.

your data structures won't permit you to perform bulk operations.
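
For illustration, a hedged sketch of that "record of arrays" reshaping, reusing the FUNCTION_DETAIL_ARRAY type from the question above; the local names (l_a, n_a, ...) are illustrative, and it assumes SQL%BULK_ROWCOUNT is available in your 8.1.7 release:

declare
    p_in FUNCTION_DETAIL_ARRAY := FUNCTION_DETAIL_ARRAY(
             Function_detail( 1, 1, 1, 11 ),
             Function_detail( 4, 4, 4, 4 ) );

    type numTab is table of number index by binary_integer;
    l_a numTab;  l_b numTab;  l_c numTab;  l_d numTab;
    n_a numTab;  n_b numTab;  n_c numTab;  n_d numTab;   -- rows still to insert
    k   pls_integer := 0;
begin
    -- 1) unpack the array of records into a record of arrays
    for i in 1 .. p_in.count loop
        l_a(i) := p_in(i).a;  l_b(i) := p_in(i).b;
        l_c(i) := p_in(i).c;  l_d(i) := p_in(i).d;
    end loop;

    -- 2) bulk update the rows that already exist
    forall i in 1 .. l_a.count
        update test set d = l_d(i) where a = l_a(i);

    -- 3) anything the update did not touch gets collected and bulk inserted
    for i in 1 .. l_a.count loop
        if sql%bulk_rowcount(i) = 0 then
            k := k + 1;
            n_a(k) := l_a(i);  n_b(k) := l_b(i);
            n_c(k) := l_c(i);  n_d(k) := l_d(i);
        end if;
    end loop;

    if k > 0 then
        forall i in 1 .. k
            insert into test (a,b,c,d) values ( n_a(i), n_b(i), n_c(i), n_d(i) );
    end if;
end;
/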

Bulk insert

Khairati Lal, October 15, 2004 - 1:13 am UTC

Great tips.

Performance - Querying and Inserting Into the Same Table

Su Baba, February 13, 2006 - 1:10 am UTC

In general, is the following type of operation to be avoided, since it's querying and inserting into the same table? Is there any way to get around this kind of construct?

INSERT INTO tab_a
SELECT *
FROM tab_b B
WHERE NOT EXISTS (
SELECT 1
FROM tab_a A
WHERE A.join_column = B.join_column
);



Tom Kyte
February 13, 2006 - 8:17 am UTC

no, this is not to be avoided; there is nothing wrong with reading and writing to the same object - it is done "all of the time".


In what way is the /*+ APPEND */ hint on a GTT useful?

VKOUL, March 16, 2006 - 1:35 pm UTC

Hi Tom,

I have a test case and I want to get some insight about it.

Test Case
--------------
SQL> create table t as select * from all_objects;

Table created.

SQL> 
SQL> create global temporary table gt on commit delete  rows as select * from t;

Table created.

SQL> 
SQL> select count(*) from gt;

  COUNT(*)
----------
         0

SQL> 
SQL> @auto_on
SQL> 
SQL> insert /*+ append */ into gt select * from t;

29530 rows created.

Elapsed: 00:00:00.01

Statistics
----------------------------------------------------------
          2  recursive calls
         15  db block gets
        209  consistent gets
          0  physical reads
        224  redo size
        604  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      29530  rows processed

SQL> truncate table gt;

Table truncated.

Elapsed: 00:00:00.00
SQL> 
SQL> insert into gt select * from t;

29530 rows created.

Elapsed: 00:00:00.01

Statistics
----------------------------------------------------------
          0  recursive calls
       1161  db block gets
        410  consistent gets
          0  physical reads
     105988  redo size
        620  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      29530  rows processed

SQL> 
--------------------------------------------

In this post, you said the /*+ APPEND */ hint does not really make sense for a GTT.

Q1. Could you please explain the difference in statistics?

Q2. After the COMMIT or session termination, the temp segments of a GTT are dropped (correct me if I am wrong).
In the case of a non-temporary table, the APPEND hint puts rows into the table above the current HWM. Could you please explain, in the case of a GTT, what the HWM is and how the APPEND hint affects/applies to it. Does each session have its own HWM for a GTT?

Thanks
 

Tom Kyte
March 16, 2006 - 2:54 pm UTC

An insert generates the least amount of undo of any DML anyway (it just puts "delete+rowid" into the undo tablespace), and since you have to commit after an append before you can re-read the data, the append is sort of "not a good thing" here.

a whopping 100k of redo - not really worth the hassle of the append, is it.



In what way is the /*+ APPEND */ hint on a GTT useful?

VKOUL, March 16, 2006 - 3:31 pm UTC

Thanks Tom for your prompt answer.

Does the HWM make sense in a GTT with ON COMMIT PRESERVE/DELETE ROWS?



Tom Kyte
March 16, 2006 - 3:38 pm UTC

not really, the gtt disappears after your session or transaction is over.

Does the HWM come into play - sure, but only within your session or transaction

VKOUL, March 16, 2006 - 6:13 pm UTC

Thanks again Tom

Why doesn't it work with XMLTYPE?

B., September 27, 2007 - 11:38 am UTC

Dear Tom,

Can you tell me what I did wrong?

set timing off
/
drop table insert_into_table
/
create table insert_into_table ( id xmltype )
/
set timing on

declare
poXML Clob;
begin
poXML := '<?xml version="1.0" standalone="yes"?>
<mydocument>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
</mydocument>';
for j in 1 .. 5000 loop
insert into insert_into_table values ( xmltype(poXML) );
end loop;
commit;
end;
/

truncate table insert_into_table
/

declare
type numTab is table of xmltype index by binary_integer;
data numTab;
empty numTab;
poXML Clob;
begin
poXML := '<?xml version="1.0" standalone="yes"?>
<mydocument>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
<myTag>abcdefghijklmnopqrstyvwxyz1234567890</myTag>
</mydocument>';
for j in 1 .. 5000 loop
data(data.count+1) := xmltype(poXML);
if ( mod(data.count,1000) = 0 )
then
forall i in 1 .. data.count
insert into insert_into_table values( data(i) );
data := empty;
end if;
end loop;
if ( data.count is not null )
then
forall i in 1 .. data.count
insert into insert_into_table values( data(i) );
end if;
commit;
end;
/

Results :

Table dropped.


Table created.


PL/SQL procedure successfully completed.

Elapsed: 00:00:12.25

Table truncated.

Elapsed: 00:00:05.31

PL/SQL procedure successfully completed.

Elapsed: 00:00:24.96

It is twice as slow using bulk insert....

Best regards
B.
Tom Kyte
September 28, 2007 - 4:18 pm UTC

you are basically timing how long it takes plsql to create 5000 xmltype instances in memory in the second case.

If you look at the time to load only (using tkprof) you'd find a very different story:

INSERT INTO INSERT_INTO_TABLE T1 VALUES ( XMLTYPE(:B1 ) )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   5000      8.74       8.57          0      20411       8347        5000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     5001      8.74       8.57          0      20411       8347        5000

INSERT INTO INSERT_INTO_TABLE T2 VALUES ( :B1 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      5      2.27       2.26          2       5370       8284        5000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      2.28       2.26          2       5370       8284        5000

I added correlation names t1 and t2 to the inserts to differentiate them.

xmltype

B., October 04, 2007 - 12:13 pm UTC

Dear Tom,

Is it correct to say that the 2nd test uses fewer resources than the first one even if it's slower? Do you think we should use bulk insert with XMLType? I'm a little confused, because we have a lot of data to load and not a lot of time to do it.

Best regards,
B.

Regarding BULK INSERT

Hemal Deshmukh, October 16, 2007 - 6:04 pm UTC

Hello Tom,
I was referring to the bulk insert example you mentioned at the top of this page.

I have a small query related to it.

Currently we are using a "CURSOR FOR LOOP" for insertion of the data.
Before inserting the record we store the record id (for example, employee_id) in a variable.
If any error occurs, then in the exception section we identify that particular record using
the employee_id present in that variable.


Now, using bulk insert, suppose an error occurs while inserting a record -- how should we identify that particular record in the exception section?

Please advise.
Thanks and Best Regards
-Hemal
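
For reference, a hedged sketch of one common way to surface per-row errors from a bulk insert -- FORALL ... SAVE EXCEPTIONS (available from 9i on) -- reusing the insert_into_table example from the top of the page; the collection and output names are illustrative:

declare
    type numTab is table of number index by binary_integer;
    data        numTab;
    bulk_errors exception;
    pragma exception_init( bulk_errors, -24381 );
begin
    for j in 1 .. 5000 loop
        data(j) := j;
    end loop;

    begin
        forall i in 1 .. data.count save exceptions
            insert into insert_into_table values ( data(i) );
    exception
        when bulk_errors then
            for i in 1 .. sql%bulk_exceptions.count loop
                -- error_index points back into DATA, so the offending
                -- value (your employee_id, say) can be recovered here
                dbms_output.put_line(
                    'failed value ' || data( sql%bulk_exceptions(i).error_index ) ||
                    ', ORA-' || sql%bulk_exceptions(i).error_code );
            end loop;
    end;
    commit;
end;
/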

extending bulk insert PL/SQL with OCCI caller

Shane Miller, May 12, 2009 - 9:46 pm UTC

Like the original poster I also have a situation in which we'll read from a datasource and insert at a high rate into a table of ~50 columns where most columns are integers plus some varchars.

First, consider a single column. The code to bulk insert an array of values in PL/SQL for a single-column table is well disseminated.

What isn't clear to me is how this would be done for 50-column table in which the rows are coming from a OCI/OCCI/ODBC client side application. That is, suppose I have a C-struct :

struct T { int c1; int c2; ... int c50; };
struct T t[100];

and my goal is to bulk insert 100 instances of T per commit with a C or C++ based OCI/OCCI application. As I understand it, there's no way for the application to do this without decomposing t into 50 one-hundred-element arrays of integers, so that my PL/SQL procedure must take 50 array arguments.

Even if I,

create table Tbl( c1 number, c2 number, ... c50 number);

creating a natural 1:1 correspondence between the C-struct and Tbl, and even if the stored procedure accepts an argument with type Tbl%rowtype, I still cannot pass t into the stored procedure.

The only workarounds appear to be Oracle objects (which look fancy but slow - where is the bulk insert?). Alternatively I can pass in t as an image (or other byte array) of 100*sizeof(struct T) bytes and have the stored procedure decipher the array back into 100 rows of 50 columns, stuffed into local arrays and bulk inserted.

Probably the most practical is to create a stored procedure that takes 50 arguments. I had hoped to avoid this due to its unwieldy size and the maintenance burden of keeping Tbl's columns and the related stored procedure arguments in the right order.

Performance of INSERT INTO ... SELECT FROM

Amit, February 06, 2013 - 11:49 am UTC

Hi Tom,

I am working on an assignment to purge old data. There are 4 tables targeted for purging. These are range partitioned. While purging old data, the rule is to purge records older than 90 days. But there are certain types of old records (which can go back to year 2006) which should not be purged. This criteria depends upon certain data sources for which data can never be deleted; also, for other data sources, month-end summary data (identified by either the last calendar day or the last business day) cannot be purged. Hence we cannot drop old partitions. Also, the client is not ready to change the partitioning strategy now (I am not sure why).
1. So my first question is, is it possible to change the partitioning strategy so that the records that cannot be purged are kept in separate partitions, without changing any logic? One option I can think of is to partition by list (data source) and then sub-partition by range (date).
Since it is very difficult to get the partitioning strategy changed, we did an analysis and found that almost 80-85% of the data will be deleted/purged when the purge happens for the first time (these tables have never been purged before). So the approach decided on was to create new tables with the same structure as the existing tables and copy the data we want to keep into these new tables. Identifying the records to keep requires joins with some other tables, hence the INSERT INTO ... SELECT FROM query has become complex, and the execution plan showed that all the tables involved in the SELECT query are doing FTS. Now, since the data being selected is around 15-20%, I am guessing the optimizer decides it is faster to do a full table scan than an index scan (the join conditions use indexed columns). These original tables contain almost 300-400 million records each, and after seeing the FTS our DBAs have raised concerns that the queries may take forever or even fail, and that TEMP space and UNDO may not be enough for this kind of query. They are asking us to split each INSERT INTO ... SELECT FROM query so that the smaller queries each target a small number of partitions.
2. So my second question is, what approach would you suggest for copying 20% of the data to new tables where the original table has 300-400 million records? I was thinking of using one INSERT INTO ... SELECT FROM statement per table, but should I be worried about the FTS happening? Will it help if I create multiple queries that target different partitions? My assumption was that every partition is queried independently and then the results are combined into one (of course, that is not true if I am using aggregate functions).
I was not able to understand why an INSERT INTO ... SELECT FROM query would use the TEMPORARY tablespace -- maybe for joining the tables? If so, can I calculate how much TEMP space will be required for each query so that it can be added before starting the operation?
I am using the APPEND hint, so UNDO will not be generated. Am I correct?

Thanks Tom!
Tom Kyte
February 06, 2013 - 2:32 pm UTC

1) yes, you described how - composite partitioning.




... the tables involved in SELECT query are doing FTS. ...

good, that is what you want. you are wanting 15-20% of the data, that is a TON, you don't want indexes. optimizer did good here.



your temp space should be able to hold 15-20% of your table, heck, it should be able to hold many times that.

and if you use insert /*+ append */ (direct path insert) there will be zero undo and potentially zero redo

but it would be ok to do this partition by partition - sure, that is easily managed and easily restarted if something fails.


2) full scans rule, they are awesome, love em, bring them on!!!!

use direct path (append).

explain plan will give you a guess-timate of temp space required.
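
For illustration, a sketch of pulling that estimate out of the plan -- the table and column names here (new_tab, big_tab, lookup_tab, source_id, keep_flag) are made up, not Amit's actual schema:

explain plan for
insert /*+ append */ into new_tab
select t.*
  from big_tab t, lookup_tab l
 where t.source_id = l.source_id
   and l.keep_flag = 'Y';

select * from table(dbms_xplan.display);

Where the plan needs temp (hash joins, sorts), the TempSpc column of the displayed plan carries the estimate.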

Performance of INSERT INTO ... SELECT FROM

Amit, February 08, 2013 - 10:52 am UTC

Thanks a lot, Tom ! You are the best !!!

Insert Statement Tuning

Narendra Kuamr, September 27, 2013 - 12:47 pm UTC

I have used the loop suggestion for inserting the bulk data.
My query, which was like Insert into table1 (select /*+ Append */ column1, col2.... from table2), was taking around 3.5 hours, but with the looping concept it takes less than 1 minute.

Thanks a lot for your valuable post.
Tom, you rock always. Many thanks :)

A reader, November 14, 2013 - 10:52 am UTC

I think you made an error in your query:
Insert into table1 (select /*+ Append */ column1, col2.... from table2)
it should be:
Insert /*+ Append */ into table1 (select column1, col2.... from table2)
and the query as above is quite fast ;-)

regards,
Adam
