Skip to Main Content
  • Questions
  • dml single insert/select or bulk collect/forall

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Sam.

Asked: April 07, 2008 - 1:39 pm UTC

Last updated: June 21, 2012 - 7:29 am UTC

Version: 10g rel2

Viewed 10K+ times! This question is

You Asked

Tom,

i am currently designing a new database 10g Rel2 and would appreciate your advice on the following

Which method would you advise for the following business process

OPTION 1
########

DECLARE
CURSOR c1 IS
SELECT .............
FROM big_table_a a,
big_table_b b ,
big_table_c c,
small_ref_table_d d
WHERE a.col1 = b.col1
...............
...............;
type a_type is table of ....;
a_array a_type;

BEGIN

OPEN c1;
FETCH c1 bulk collect INTO a_array;
CLOSE c1;

FORALL i in 1 .. a_array.count
save exceptions
INSERT into big_table_x
(.....)
VALUES(a_array(i));
END;

OPTION 2
########

DECLARE
.....
BEGIN

SELECT ..........
BULK COLLECT INTO a_array;
.................
FROM big_table_a a,
big_table_b b ,
big_table_c c,
small_ref_table_d d
WHERE a.col1 = b.col1
...............
...............;

FORALL i IN 1...a_array.count
INSERT into big_table_x
()
VALUES
(a_array(i));

OPTION 3
########

DECLARE
BEGIN

INSERT into big_table_x
()
SELECT ..........
..........
FROM big_table_a a,
big_table_b b ,
big_table_c c,
small_ref_table_d d
WHERE a.col1 = b.col1
...............
...............;

If there is a different solution please feel free to elaboarte - i know u will think of something :)


Thanks in advance.

Sam

and we said...

the mantra:

o You should do it in a single SQL statement if at all possible.
o If you cannot do it in a single SQL Statement, then do it in PL/SQL.
o If you cannot do it in PL/SQL, try a Java Stored Procedure.
o If you cannot do it in Java, do it in a C external procedure.
o If you cannot do it in a C external routine, you might want to seriously
think about why it is you need to do itÂ…


So, go with the last one, a single sql statement... with /*+ APPEND */ if you understand what that does and if it applies....

Rating

  (25 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Mantra

Brian Tkatch, April 08, 2008 - 7:57 am UTC

Tom, thank you for the mantra.

The simple tips you put in to your answers, are always appreciated.

cursor loop

A reader, April 08, 2008 - 10:08 am UTC

Why is processing in a loop far slower than single statement processing ?

After all single statements are also processed one at a time within Oracle ?


Tom Kyte
April 09, 2008 - 2:22 pm UTC

because procedural code is slower than a single sql statement in general.

insert into t2 select * from t1;

it is all done efficiently, in the server, in bulk - we told the server everything we wanted to do and it could optimize the heck out of it.

for x in (select * from t1)
loop
insert into t2 values ( .... );
end loop


now we have to slow by slow send rows out of the database, only to turn around and send them back in.


more code = more bugs
more code = slower
more code = undesirable

Comment

sam, April 08, 2008 - 10:31 am UTC

Tom

In which situations would you advise using OPTION 1 or OPTION 2. I agree with you i always try and do everything in a single select (ideally with *+ APPEND */ and parallel if possible). I was just concerned with loading the SGA with millions of rows as the big tables are a couple of million each, hence was thinking of the bulk collect with limit ? I think this is what you were thinking when you mentioned "If you cannot do it in a single SQL Statement, then do it in PL/SQL" If u have any examples specific to my business scenario please let me know

Thanks in advance

Sam


Tom Kyte
April 09, 2008 - 2:28 pm UTC

I always advise 1 and 2, in all circumstances, not sure what you mean.

append - be careful with that, it is not necessarily faster than noappend and only applies in somewhat rare circumstances.

parallel - see append - but even less often.

the SGA takes care of itself, do not worry about protecting it. In fact, if anything - slow by slow processing will have a more negative effect on sga utilization than anything done in bulk (you tend to not have any nice efficient full scans - or less of them - and we end up trying to cache all of it for your procedural code, even though no one will reuse it - whereas with big bulk operations - we know to age it right out).

When I said "when you cannot do it in SQL, use as little plsql as possible", I was being quite precise and very literal.

when you cannot do it in sql. period, just when you CANNOT CODE the sql to do it.

loops

Clep, April 08, 2008 - 10:46 am UTC

Please don't use "u"

Use loops if and only if that is a must - where it cannot be done in single sqls.

And if loops are a must, bulk processing in a loop, always helps

Comment

sam, April 08, 2008 - 11:05 am UTC

Sorry about the 'u', i was multitasking in another same time window. Just to wrap up, you do not forsee any issues of caching millions of rows in the sga using a single select ? if PL/SQL is the only option then bulk collect and forall or would you also consider a varray for the business scenario i mentioned

Thanks alot your information is greatly appreciated.

Sam

Tom Kyte
April 09, 2008 - 2:34 pm UTC

... you do not forsee any issues of caching millions of rows in the sga
using a single select ? ...

sigh, the SGA is pretty well handled by the software.

you could trash the contents of a buffer cache much easier with slow by slow code than you would with BULK CODE.

If you end up reading every row in a table via an index - we'll end up trying to cache the entire table and index in the SGA - even though YOU are going to read it ONCE for your batch process.

if you end up full scanning segments - we smartly get them in and out of the buffer cache really fast - without wiping out lots of useful data

so your hypothesis actually works against you in this case - your presumption is not accurate.

Is Option 1 and Option2 work for oracle 9i

Sandip Basu Mallik, April 21, 2009 - 10:39 pm UTC

Hi,

Is option 1 and option 2 work for Oracle 9i?
I try this but getting error ORA-0947 not enough values.
I check my arrays column is same as target table column

Is option1 and option 2 will work for 9i?

Sandip Basu Mallik, April 21, 2009 - 10:42 pm UTC

Hi , Thanks for suggestion. I am getting error ORA-00947 not enough values . I check arrays and target table have same number of column.But still i am getting this.
Can you help? I am using oracle 9i

Sandip Basu Mallik, April 21, 2009 - 11:48 pm UTC

Hi Its works in 9i.Just now i check.

dml single insert/select or bulk collect/forall

jacewuk, May 26, 2009 - 10:15 am UTC

I have a simlar problem, but i have 800 million rows i need to move to another table. I can do it in one select but the undo tablespace size is nowhere near big enough. For this reason I'd like to split it into chunks and commit each chunk, therefore I'm looking to use a method similar to option 1 or option 2. Do you agree, or would you suggest something else?
Tom Kyte
May 26, 2009 - 10:25 am UTC

it should take almost no undo????

you would be using a direct path insert, I would presume the indexes would be disabled or non-existent.

ops$tkyte%ORA10GR2> create table t
  2  ( data varchar2(1000) )
  3  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert  /*+ append */ into t
  2  select rpad( '*', 1000, '*' )
  3  from dual
  4  connect by level <= 100000
  5  /

100000 rows created.

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
         1

ops$tkyte%ORA10GR2> commit;

Commit complete.

ops$tkyte%ORA10GR2> insert into t select rpad( '*', 1000, '*' )
  2  from dual connect by level <= 100000;

100000 rows created.

ops$tkyte%ORA10GR2> select used_ublk from v$transaction;

 USED_UBLK
----------
       237




so, use a single statement.

Oracle Support Suggest- Bulk Load instead of single SQL

A reader, September 28, 2009 - 5:58 pm UTC

Hi Sir,

We have not special situation but have hired oracle consultant and what he suggest is to use bulk insert instead of single sql.

We are using 10.2.0.3 DB or spacr 64 bit. Some of our tables are huge and also joins to another big tables and from that single SQL we use to load data in one intermediate table using insert append. But it's taking very long time and so he suggested us to use bulk insert which i think will be slower. But i can't provide test case for this. I tried creating and it gave me near to equal result.

set serveroutput on;
alter session set timed_statistics = true;
alter session set statistics_level=all;
alter session set max_dump_file_size = unlimited;
spool test1.txt
EXEC DBMS_MONITOR.session_trace_enable(waits=>TRUE, binds=>FALSE);
exec runstats_pkg.rs_start;
insert /*+ append */ into t1 select owner,object_name, count(*) as cnt from big_table group by owner,object_name;
commit;
exec runstats_pkg.rs_middle;

declare
TYPE ARRAY IS TABLE OF t2%ROWTYPE;
l_data ARRAY;
CURSOR c IS select owner,object_name, count(*) as cnt from big_table group by owner,object_name;
BEGIN
    OPEN c;
    LOOP
    FETCH c BULK COLLECT INTO l_data LIMIT 10000;
    FORALL i IN 1..l_data.COUNT
  INSERT /*+ append */ INTO t2 VALUES l_data(i);
  commit;
    EXIT WHEN c%NOTFOUND;
    END LOOP;
    CLOSE c;
end;
/

exec runstats_pkg.rs_stop();
spool off;

Run1 ran in 67155 hsecs                                                                                                                                                             
Run2 ran in 48388 hsecs                                                                                                                                                             
run 1 ran in 138.78% of the time        

Run1 latches total versus runs -- difference and pct                                                                                                                                
Run1        Run2        Diff       Pct                                                                                                                                              
6,707,646   6,253,849    -453,797    107.26%        

Some major Ones

LATCH.shared pool                   33,600      12,686     -20,914                                                                                                                  
LATCH.session allocation            43,486      15,366     -28,120                                                                                                                  
STAT...session uga memory          130,752      65,216     -65,536                                                                                                                  
LATCH.multiblock read objects      221,426     155,864     -65,562                                                                                                                  
STAT...IMU Redo allocation siz           0      93,348      93,348                                                                                                                  
STAT...undo change vector size      14,028     165,132     151,104                                                                                                                  
STAT...physical reads cache pr     783,094     533,063    -250,031                                                                                                                  
STAT...free buffer requested       835,490     568,929    -266,561                                                                                                                  
STAT...physical reads cache        835,444     568,692    -266,752                                                                                                                  
STAT...physical reads              835,444     568,692    -266,752                                                                                                                  
LATCH.object queue header oper   2,260,046   1,816,241    -443,805                                                                                                                  
LATCH.cache buffers chains       3,948,969   3,346,205    -602,764                                                                                                                  
LATCH.process queue reference            0     741,202     741,202                                                                   
                                              



Tom Kyte
October 02, 2009 - 8:05 am UTC

... We have not special situation but have hired oracle consultant and what he suggest is to use bulk insert instead of single sql. ...

why? do they want it to take longer?


... so he suggested us to use bulk insert which i think will be slower. ...

you sir, are correct. But you know what, you should ask this expert to SHOW YOU the numbers, not the other way around. They should PROVE TO YOU that what they say is correct.



In your test, I would surmise that run two ran faster only because run1 did the work of getting all of the blocks into the buffer cache - as evidenced by the reduced IO's - run the test the other way and see what you see then :)


writing code to replace that single insert would be *not smart*, as in *not intelligent*, as in *the wrong way*


I hope you do know that:

...
INSERT /*+ append */ INTO t2 VALUES l_data(i);
.....


is identical to

INSERT INTO t2 VALUES l_data(i);

(except in 11gR1 where append actually does 'work' by accident, fixed in 11gr2). Direct path loads for single row inserts would be deadly - they do not happen fortunately.

also, the indentation of your 'commit' makes me think you think it is "in" the forall loop - it is not. It is after - the forall is not really a loop - it is a single statement.


Let's do your test - removing the complex query and just loading data (so we are comparing "loading to loading" and not the performance of the query loading from from execution to execution.


ops$tkyte%ORA10GR2> drop table t1;
Table dropped.

ops$tkyte%ORA10GR2> drop table t2;
Table dropped.

ops$tkyte%ORA10GR2> create table t1 as select owner, object_name, object_id from all_objects where 1=0;
Table created.

ops$tkyte%ORA10GR2> create table t2 as select owner, object_name, object_id from all_objects where 1=0;
Table created.


ops$tkyte%ORA10GR2> exec runstats_pkg.rs_start;
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> begin
  2  for i in 1 .. 10
  3  loop
  4      insert /*+ append */ into t1 select owner,object_name, object_id from all_objects;
  5      commit;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec runstats_pkg.rs_middle;
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> declare
  2  TYPE ARRAY IS TABLE OF t2%ROWTYPE;
  3  l_data ARRAY;
  4  CURSOR c IS select owner,object_name, object_id from all_objects;
  5  BEGIN
  6  for i in 1 .. 10
  7  loop
  8      OPEN c;
  9      LOOP
 10          FETCH c BULK COLLECT INTO l_data LIMIT 10000;
 11          FORALL i IN 1..l_data.COUNT
 12              INSERT INTO t2 VALUES l_data(i);
 13          commit;
 14          EXIT WHEN c%NOTFOUND;
 15      END LOOP;
 16      CLOSE c;
 17  end loop;
 18  end;
 19  /
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec runstats_pkg.rs_stop();
Run1 ran in 1835 cpu hsecs
Run2 ran in 3624 cpu hsecs
run 1 ran in 50.63% of the time

Name                                  Run1        Run2        Diff
...
LATCH.session allocation           104,502     105,622       1,120
LATCH.enqueues                         532       1,758       1,226<b>
STAT...DB time                       1,909       3,634       1,725
STAT...Elapsed Time                  1,906       3,635       1,729
STAT...recursive cpu usage           1,838       3,618       1,780
STAT...CPU used when call star       1,841       3,627       1,786
STAT...CPU used by this sessio       1,838       3,627       1,789</b>
STAT...active txn count during           4       2,109       2,105
STAT...cleanout - number of kt           4       2,110       2,106
STAT...consistent gets - exami     202,590     204,708       2,118
LATCH.undo global data                 306       2,813       2,507
STAT...calls to kcmgcs                  41       2,770       2,729
STAT...db block gets direct          2,740           0      -2,740
STAT...physical writes direct        2,740           0      -2,740
STAT...physical writes non che       2,740           0      -2,740
STAT...physical writes               2,740           0      -2,740
STAT...commit cleanouts succes          96       2,878       2,782
STAT...commit cleanouts                100       2,883       2,783
STAT...free buffer requested           147       3,067       2,920
STAT...redo subscn max counts           59       2,985       2,926
STAT...calls to get snapshot s     115,094     118,047       2,953
LATCH.cache buffer handles           2,860       6,568       3,708
STAT...table scan rows gotten    1,101,167   1,104,931       3,764
LATCH.object queue header oper         155       6,284       6,129
LATCH.SQL memory manager latch          20      11,379      11,359
STAT...redo entries                  1,191      16,061      14,870
STAT...db block changes              1,579      21,969      20,390
STAT...db block gets                 4,520      26,343      21,823
STAT...db block gets from cach       1,780      26,343      24,563
STAT...physical read bytes          40,960      16,384     -24,576
STAT...IMU undo allocation siz       2,544      44,452      41,908
LATCH.row cache objects            241,283     308,810      67,527
STAT...IMU Redo allocation siz         320     523,212     522,892
LATCH.simulator hash latch          85,225     702,283     617,058
LATCH.SQL memory manager worka       1,804     786,059     784,255
STAT...physical read total byt     860,160      16,384    -843,776
STAT...undo change vector size      44,512   1,381,504   1,336,992
STAT...buffer is pinned count      729,915   4,218,843   3,488,928
STAT...index scans kdiixs1         651,953   4,140,973   3,489,020
STAT...no work - consistent re   1,242,934   4,779,892   3,536,958
STAT...consistent gets from ca   1,447,391   4,989,399   3,542,008
STAT...consistent gets           1,447,391   4,989,399   3,542,008
STAT...session logical reads     1,451,911   5,015,742   3,563,831
LATCH.cache buffers chains       2,698,761   9,865,284   7,166,523
STAT...physical write bytes     22,446,080           0 -22,446,080
STAT...physical write total by  22,446,080           0 -22,446,080<b>
STAT...redo size                   176,692  23,781,996  23,605,304</b>

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct<b>
3,570,177  12,234,835   8,664,658     29.18%
</b>
PL/SQL procedure successfully completed.





I use cpu hsecs now (replace dbms_utility.get_time with get_CPU_time in runstats)


but as you can see - it takes longer and uses more resources.


YOU CANNOT GET MORE "BULK LOAD" THAN A SINGLE SQL STATEMENT - INSERT INTO SELECT IS THE ULTIMATE BULK LOAD - YOU CANNOT WRITE BULKIER CODE YOURSELF, NO WAY NO HOW. 


so, if the consultant believes bulk loading is good (it is), then they went the WRONG DIRECTION didn't they?

Superb as usual

A reader, October 02, 2009 - 3:31 pm UTC

Sir,

As usual you made by doubts clear except this two things.
1.

I hope you do know that:

...
INSERT /*+ append */ INTO t2 VALUES l_data(i);
.....


is identical to

INSERT INTO t2 VALUES l_data(i);

(except in 11gR1 where append actually does 'work' by accident, fixed in 11gr2). Direct path loads for single row inserts would be deadly - they do not happen fortunately.


I thought that single insert also will do direct load. I know one think like direct load will load above high watermark.
Can you please explain this more and if possible little example.

2.

<1>
also, the indentation of your 'commit' makes me think you think it is "in" the forall loop - it is not. It is after - the forall is not really a loop - it is a single statement.


For this i am not clear can you please explain algorithm for this kind of loops

like what i understood that you told was right

open cursor (Lets say 100 rows)
if limit is 20 get 20 rows
commit
then check for more rows if it's there
if yes
loop again
Tom Kyte
October 07, 2009 - 3:51 pm UTC

1) explain what a little more?

insert /*+ APPEND */ works only with SELECT, not with VALUES. (except as mentioned in 11g Release 1 where there is a bug, in 11g Release 2 you can do it with values with a new hint, append_values - but append will NOT direct path with values)

2) forall is not a loop, that is all, it is not a loop.

it seemed that you thought it was and that the commit was in the loop. There is no loop. Your code:

    FETCH c BULK COLLECT INTO l_data LIMIT 10000;
    FORALL i IN 1..l_data.COUNT
        INSERT /*+ append */ INTO t2 VALUES l_data(i);
        commit;
    EXIT WHEN c%NOTFOUND;
    END LOOP;


Look at that lining up of the code - the commit is lined up with the insert, it just makes one THINK that you THINK the commit and insert are in a "loop", that is all I'm saying.



All I'll say is:


YOU CANNOT GET MORE "BULK LOAD" THAN A SINGLE SQL STATEMENT - INSERT INTO SELECT IS THE ULTIMATE
BULK LOAD - YOU CANNOT WRITE BULKIER CODE YOURSELF, NO WAY NO HOW.


doing this procedurally is "dumb"

Got the point

A reader, October 08, 2009 - 1:18 pm UTC

Thank you very very much. I got the both point which your are trying to say.

Anantha Lakshmi, October 14, 2009 - 9:31 am UTC

I have searched on this topic and this page was a godsend.
I tried bulkcollect in a cursor,which had select from multiple tables,I cant understand the declaration that should be made here..

My code is like

cursor c1 is select * from table t1,t2,... where ....

I just want to know how I should declare this

TYPE <> IS TABLE OF <>%ROWTYPE;
what should I give in the <>%ROWTYPE

What if we must use FORALL to insert?

Tom Canty, January 15, 2010 - 5:05 am UTC

Hi - I didn't see this directly answered, so I thought I'd ask. If we don't have the option to do this in a single SQL statement (in my case I'm reading from a file into a LOB (to process some XML), then filling a PL/SQL table, and finally doing a insert), do you have any recommendations for helping FORALL along? In particular, my array is about 20% sparse (that is, not sequential), so I'm using the INDICIES clause of FORALL. But I'm finding that the larger the array gets, the slower the FORALL becomes, to an extreme degree (not just 1:1).

Can you recommend anything? For example, should I try increasing the array size in SQL*Plus, do periodic inserts once the array has reached a certain size, use hints, etc.? Would having a sequential array help? Interestingly, filling the array doesn't appear to be the slow part - it's only when I go to insert it, in a single FORALL, that it gets disproportionately slow. If it matters I'm on 10.1.0.3, which was the terminal release for this platform (Mac PowerPC - ya, I know).

Thanks!
Tom Kyte
January 18, 2010 - 5:24 pm UTC

I directly answered that:

...
the mantra:

o You should do it in a single SQL statement if at all possible.
o If you cannot do it in a single SQL Statement, then do it in PL/SQL.
o If you cannot do it in PL/SQL, try a Java Stored Procedure.
o If you cannot do it in Java, do it in a C external procedure.
o If you cannot do it in a C external routine, you might want to seriously
think about why it is you need to do it?

.....


for bulk processing you should always

a) make the array size a parameter, so you can adjust it in your code
b) start with 100 and go upto about 1000 but stick in that range.

You don't want to do all of step A, filling up the array (in memory, lots of resources) and then doing step B

You do want to do a little of A, a little of B, a little of A, a little of B and so on - spread the work out, no big bumps in the processing timeline.

sqlplus arraysize setting has *nothing* to do with plsql, it will not affect anything in plsql - forall processing is procedural, you control the arraysize yourself.


bulk

A reader, January 18, 2010 - 7:34 pm UTC


One quetion on option 3

Poonam, July 26, 2011 - 8:30 am UTC

I want to use bulk insert in my code. as given in option-3.
Now, i want to know here i how can i handle exception? can i use save exception here? if not then how should i handle exception here?

OPTION 3
########

DECLARE
BEGIN

INSERT into big_table_x
()
SELECT ..........
..........
FROM big_table_a a,
big_table_b b ,
big_table_c c,
small_ref_table_d d
WHERE a.col1 = b.col1
...............
...............;
Tom Kyte
July 28, 2011 - 6:33 pm UTC

option 3 does NOT use a bulk insert.

It uses, correctly, a single SQL statement to do all of the work.


If you need to capture errors and deal with them later - you are looking for dml error logging.

http://asktom.oracle.com/Misc/how-cool-is-this.html
http://asktom.oracle.com/Misc/how-cool-is-this-part-ii.html

Insert and update - Merge vs Bulk Collect

AT, August 15, 2011 - 11:15 am UTC

Tom,

It was very insightful to go through this thread. However, I still have a doubt with respect to using merge (single SQL) vs bulk collect (Procedural).

Consider two tables, one staging table and other target table. Target table has millions of records (~150 million) and it is partitioned. Staging table too has millions of records (~30million).

Now, data from staging table needs to be moved to target table. If a specific criteria is matched, target table will be updated if not it will be insert.

I know this can be easily done using merge. But considering number of records is merge really useful here?

What you will go with?

Option 1:
Merge into t_smpl using ....

Option 2:

create table atrivedi.t_smpl(id number, tab char(1), tabdate date, constraint pk_ts primary key(id, tab));


create table atrivedi.t_smpl_lod(id number, tab char(1), tabdate date);

insert into t_smpl values (1, 'Y', sysdate);
commit;

insert into t_smpl_lod values (1, 'N', sysdate);
commit;


create or replace type l_v_smpl_typ as object
(ID number, tab char(1), tabdate date); 


create or replace type  l_v_smpl_tab as table of l_v_smpl_typ;


declare
 l_v_smpl_ins_or_upd  l_v_smpl_tab;
    
    
    begin
     select l_v_smpl_typ(id, tab,tabdate) bulk collect into l_v_smpl_ins_or_upd from t_smpl_lod ;
     forall i in 1 .. l_v_smpl_ins_or_upd.count
      merge into t_smpl t1 using (select * from table(l_v_smpl_tab(l_v_smpl_ins_or_upd(i))) t ) t
       on (t.id=t1.id)
      when matched then update set t1.tab =t.tab
      when not matched then insert (id, tab, tabdate) values (t.id, t.tab, t.tabdate);
   end;


Thanks
AT


Tom Kyte
August 15, 2011 - 8:42 pm UTC

I know this can be easily done using merge. But considering number of records is merge really useful here?


ummm, you seem to be missing the basic point of SQL - processing sets of data is what it does best.

SQL performance beats slow by slow performance more and more as the volume of data increases.


You want to full scan stage, full scan target, hash outer join them in a merge.

YOU DO NOT want to do 30,000,000 index lookups which is probably what would happen with your slow by slow merge.

and even if it didn't use an index - how would you full scanning stage into memory and then using that collection in a merge be *better* than just doing it straight away???

Insert into select

Madhavan.A, September 14, 2011 - 1:22 am UTC

I have a temp table which has 3 million records. I want to move these records into target table. I am using insert into select statement to perform this operation. But the query is running for 4 hours but didn't completed. The target table has 7 indexes. There is no difference even after dropping the indexes. Any suggestion to improve the performance.
My query is
insert into target_tbl
select * from temp_tbl
Tom Kyte
September 14, 2011 - 7:16 pm UTC

you'll need to describe the machine. My laptop can do that in a few seconds.


ops$tkyte%ORA11GR2> create table t as select * from big_table.big_table where 1=0;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set timing on
ops$tkyte%ORA11GR2> insert into t
  2  select * from big_table.big_table union all
  3  select * from big_table.big_table union all
  4  select * from big_table.big_table ;

3000000 rows created.

Elapsed: 00:00:33.30



does your machine have the ability to do IO. describe machine.

how big it 3,000,000 rows in bytes (3,000,000 rows is rather meaningless as a metric when you think about it).

have you considered using nologging on the target table and a direct path load? alter table target_tbl nologging; insert /*+ append */ into target_tbl ...; alter table target_tbl logging; BACKUP now.

have you dug into an ash report to see what it was doing?



Direct Path Load

Siby, October 26, 2011 - 11:21 pm UTC

Hi Tom,

When a table is created and populated using "CREATE TABLE T1 as SELECT <very complicated query>", does it do an insert using direct path load (Insert /*+append*/)?

Thanks.
Tom Kyte
October 27, 2011 - 11:15 am UTC

create table as select is a direct path load, yes.

for vs forall

pranav, May 25, 2012 - 8:08 pm UTC

Hi Tom,

Can you please clarify below question.

Is there any difference between using for vs forall during bulk collect?

For ex.,
1)
open test_curs
loop
fetch test_curs bulk collect into test_type limit 2000;
for i in 1..test_type.COUNT LOOP
BEGIN
insert into temp1(col1,col2,col3) values (test_type(i).col1,test_type(i).col2,test_type(i).col3);
END;
END LOOP;
END LOOP;
close test_curs;

VS

2)
open test_curs
loop
fetch test_curs bulk collect into test_type limit 2000;
forall i in 1..test_type.COUNT
insert into temp1(col1,col2,col3) values (test_type(i).col1,test_type(i).col2,test_type(i).col3);
END LOOP;
close test_curs;

Database version is 11g.

Thanks for all your replies.

Tom Kyte
May 26, 2012 - 2:02 am UTC

....Is there any difference between using for vs forall during bulk collect?...

well, first - you do not use forall during a bulk collect. bulk collect is just a bulk collect - it doesn't use forall

forall is a way to bulk up other DML statements like insert, update, delete and merge.

For would do these statements one by one
Forall does them in bulk.


If you were to trace your stuff above, tkprof would show you doing an execute for every single insert in the for loop, it would show you doing an execute for every 2000 inserts in the second loop



ops$tkyte%ORA11GR2> create table t1 as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> create table t2 as select * from t1 where 1=0;

Table created.

ops$tkyte%ORA11GR2> create table t3 as select * from t1 where 1=0;

Table created.

ops$tkyte%ORA11GR2> create table t4 as select * from t1 where 1=0;

Table created.

ops$tkyte%ORA11GR2> @trace
ops$tkyte%ORA11GR2> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte%ORA11GR2> declare
  2          type array is table of t1%rowtype index by binary_integer;
  3          cursor test_curs is select * from t1;
  4          l_data array;
  5  begin
  6          open test_curs;
  7          loop
  8                  fetch test_curs bulk collect into l_data limit 2000;
  9                  for i in 1..l_data.COUNT
 10                  LOOP
 11                          insert into t2 values l_data(i);
 12                  END loop;
 13                  exit when test_curs%notfound;
 14          END LOOP;
 15          close test_curs;
 16  end;
 17  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> declare
  2          type array is table of t1%rowtype index by binary_integer;
  3          cursor test_curs is select * from t1;
  4          l_data array;
  5  begin
  6          open test_curs;
  7          loop
  8                  fetch test_curs bulk collect into l_data limit 2000;
  9                  forall i in 1..l_data.COUNT
 10                          insert into t3 values l_data(i);
 11                  exit when test_curs%notfound;
 12          END LOOP;
 13          close test_curs;
 14  end;
 15  /

PL/SQL procedure successfully completed.


ops$tkyte%ORA11GR2> insert into t4 select * from t1;

72839 rows created.

*****************************************************************************

INSERT INTO T2 VALUES
 (:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 ,:B9 ,:B10 ,:B11 ,:B12 ,:B13 ,:B14 , :B15 )


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  72839     30.51      35.24          0       1535      82646       72839
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    72840     30.51      35.24          0       1535      82646       72839
********************************************************************************
INSERT INTO T3 VALUES
 (:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 ,:B9 ,:B10 ,:B11 ,:B12 ,:B13 ,:B14 , :B15 )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     37      0.24       0.33          0       2298      10018       72839
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       38      0.24       0.33          0       2298      10018       72839
********************************************************************************
insert into t4 select * from t1
  
  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.14       0.24        716       2980       9326       72839
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.15       0.25        716       2982       9326       72839




pretty compelling - slow by slow is slow. bulk by bulk is faster. single sql statement = perfection.

for vs forall

pranav, May 26, 2012 - 2:57 pm UTC

Thank you very much Tom. Cleared my confusion :-)

Appreciate all your help.

On a side note, how can you reply to 10 different questions at the same time(most of them are internals)? :-) I spent most of my time in your blog.
I have seen people mastering some particular topics in one subject. But you are true inspiration.

Thanks once again for all your help to the Oracle community.

How comes APPEND can reduce UNDO?

A reader, May 27, 2012 - 10:18 pm UTC

Hi Tom,

I am a little bit confused about 'append'.
As i know it is to insert directly after the HWM, together with nologging can reduce REDO significiantly.
But from your example with sql 'select used_ublk from v$transaction', the UNDO is also reduced much.

Just curious how comes UNDO is reduced? My thought is anyway, it should support rollback with same amout of UNDO.

Is it due to those UNDO's for recursive SQLs which are necessary without APPEND hint?

Plus, during my testing, i tried to watch the HWM, and accidently found allocated blocks for extents is 8 from begining and
then from some certain point of time, it changed to 128. Any idea? Will it be consistent under uniformed allocation?


create table test as select * from dba_objects where 1=2;
insert /*+ append */ into test select * from dba_objects union all select * from dba_objects union all select * from dba_objects;
insert into test select * from dba_objects union all select * from dba_objects union all select * from dba_objects;
select blocks, block_id from dba_extents where segment_name='TEST';
BLOCKS BLOCK_ID
---------- ----------
8 19848
8 19856
8 19864
8 19872
8 19912
128 19968
128 20096
128 20224
Tom Kyte
May 28, 2012 - 2:25 am UTC

we write directly to the blocks above the HWM - we know that in order to rollback - all we have to do is "unadvance" the HWM - we don't need to undo the block changes - we just need to undo the dictionary update.

hence we do not need to store undo for each and every block above the HWM - we just need undo to put the HWM back where it was.




as for the 8, to 128 and so on - that is the algorithm behind a locally managed tablespace with system allocated extent sizes - we allocated a set of 8 block extents, then 128, then bigger and bigger. the larger the table becomes, the larger the extent we'll allocate for it over time.

if you use uniform extent sizes - yes, they would be 'uniform' - they are always the same size in a uniform extent tablespace.

Renjith, June 08, 2012 - 12:32 am UTC

Hi Tom,
I have a table called emp5 with 1000020 rows and i was trying to insert that into another table called emp6.Emp6 has a primary key on employee_id.In emp5 there is one duplicate value for employee_id column. When I tried to insert using append and dml logging it is not working. but it is working with normal insert.
<Case 1>
SQL> insert /*+ append */ into emp6 select * from emp5;
insert /*+ append */ into emp6 select * from emp5
*
ERROR at line 1:
ORA-00001: unique constraint (RENJITH.PKKK) violated


Elapsed: 00:00:06.54
SQL> select count(*) from emp6;

  COUNT(*)
----------
         0
Case<2>
SQL> insert into emp6 select * from emp5
  2   log errors reject limit unlimited;

1000019 rows created

<Version>
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for 32-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production.

Could you please guide me How to resolve this .
Thanks In Advance.

Tom Kyte
June 08, 2012 - 7:16 am UTC

direct path inserts + unique constraint = no dml error logging. Just use a conventional path insert for such a teeny tiny bit of data


Or write the select statement in a fashion to get just the single "correct" record for each employee.

Renjith, June 08, 2012 - 11:12 am UTC

Thank you very much Tom.

For BLOB data

Samiran Banerjee, June 20, 2012 - 5:18 am UTC

I need to move data from differnt data base server amd the table containing BLOB data.
When I am executing the code where dblink is used it is failing because of BLOB data. Any other approach to move blob data.
declare
type array is table of test_work%ROWTYPE;
L_DATA array;
cursor C is select * from pc_work@prpctrg;
begin
open C;
LOOP
FETCH C bulk collect into L_DATA limit 100;
FORALL I in 1..L_DATA.COUNT
insert into t1_test_work values L_DATA(I);
commit;
EXIT when C%NOTFOUND;
end LOOP;
close C;
end;

Thanks in advance!!
Samiran
Tom Kyte
June 21, 2012 - 7:29 am UTC

insert into t1_test_work select * from pc_work@prpctrg;



why would you even consider writing anything else? Why do people always seem to go for the most complex, resource intensive approach every single time? Is someone teaching this in school? at work? where does this come from?????



More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library