Skip to Main Content
  • Questions
  • Optimize insert or update million records in a table

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Sam.

Asked: March 08, 2017 - 10:16 pm UTC

Last updated: April 30, 2019 - 3:01 am UTC

Version: 10

Viewed 10K+ times! This question is

You Asked

Hi Tom,

i need your help in optimizing a procedure in less time which is originally taking 40 mins to insert/update 14+million records into a table. We have a query with a table (20+ million records) which inserts/update(14 million records) into another table every day. Attaching the procedure which i am using

EXECUTE IMMEDIATE 'alter table TEST nologging';
declare
cursor c_data is
SELECT /*+ FIRST_ROWS(n) */ G.C1, G.C2, G.C3, G.C4,G.C5,
tab_to_uniq_string_limit (CAST (COLLECT (G.C6 ORDER BY G.C7 )
AS t_varchar2_tab), ',',1000) AS C6,
tab_to_uniq_string_limit (CAST (COLLECT (G.C7 ORDER BY G.C7 )
AS t_varchar2_tab), ',', 500) AS C7,
MAX(g.row_last_modified) AS C8,
G.C9
FROM table1 g
GROUP BY G.GE_ID, G.ENTRY_PARENTAGE, G.ENTRY_PEDIGREE, G.MATERIAL_TYPE, G.VARIETY_NAME, G.PEDIGREE_SECURED
MINUS
SELECT C1, C2, C3, C4, C5, C6, C7, C8, C9
FROM table2;

type t__data is table of c_data%rowtype index by binary_integer;
t_data t__data;
begin
open c_data;
loop
fetch c_data bulk collect into t_data limit 10000;
exit when t_data.count = 0;

for idx in t_data.first .. t_data.last loop
UPDATE table2 SET
C2 = t_data(idx).C2,
C3 = t_data(idx).C3,
C4 = t_data(idx).C4,
C5 = t_data(idx).C5,
C6 = t_data(idx).C6,
C7 = t_data(idx).C7,
C8 = t_data(idx).C8,
C1 =t_data(idx).C1
WHERE C1 = t_data(idx).C2;

IF (SQL%ROWCOUNT = 0) THEN

INSERT /*+ append */ into table2
(C1, C2,C3,C4,C5,C6,C7,C8,C9)
VALUES
(t_data(idx).C1, t_data(idx).C2,t_data(idx).C3, t_data(idx).C4, t_data(idx).C5, t_data(idx).C6, t_data(idx).C7, t_data(idx).C8, t_data(idx).C9);

END IF;
end loop;
commit;
end loop;
close c_data;
end;
COMMIT;
EXECUTE IMMEDIATE 'alter table TEST logging';

Please suggest a good solution to optimize it. I tried using merge but even it's taking too long.

and Connor said...

Well, we need to identify the problem area. A merge (SQL or logical code equivalent as above) does:

a- get the source rows
b- match with target rows
c- perform the insert/update


So firstly I would do a timing test on:

set timing
begin 
 for i in c  loop
    null;
 end loop;
end;
/


where C is your cursor above. That tells you time to find the rows.

Then dump the rows from C into a global temporary table (GTT). Then time a join between GTT and your target table. That's the cost of matching the rows.

Then time a test of inserting some/all of the GTT rows into the target - that's an insertion test.

Then you'll know where to focus your efforts. But I will say this - it is *far* more likely that a MERGE is going outperform a home-grown equivalent.

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Insert Append

John Keymer, March 09, 2017 - 8:06 am UTC

Also this...

INSERT /*+ append */ into table2 


Is a bad idea for single row inserts.
Connor McDonald
March 09, 2017 - 11:28 pm UTC

Nicely spotted.

Ideally plsql wont be needed at all - because if it is, there's plenty of room to improve this code.

append insert

Rajeshwaran, Jeyabal, March 10, 2017 - 2:37 am UTC

Are we saying that insert with append hint is causing the blocking/slowness? but looks like inserts with values(..) clause silently ignores the append hint here.
demo@ORA11G> create table t(x int);

Table created.

demo@ORA11G> insert /*+ append */ into t(x) values(55);

1 row created.

demo@ORA11G> select * from t;

         X
----------
        55

demo@ORA11G>

instead if they have used append_values hint, then blocking occurs.
demo@ORA11G> insert /*+ append_values */ into t(x) values(55);

1 row created.

demo@ORA11G> select * from t;
select * from t
              *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


demo@ORA11G>

Connor McDonald
March 11, 2017 - 1:48 am UTC

No, my point was the "append" creates an illusion that we're gaining something from it, whereas we get nothing for a single row insert.

Insert into table via loop taking time

ORA_Learner, April 24, 2019 - 7:45 pm UTC

Hi Conner,

I am trying to insert rows in a table in a loop which runs 15K times only, but it is taking nearly hour.
Below is what I am trying to do in pl/sql block
The total number of rows to be loaded in final table are 28,619.

Can you please look into it, and suggest your expert advice, to help reduce time to 10 mins. This process is expected to finish in max 10 mins.


create or replace procedure proc_grp_ld(in_proc_id in number,in_proc_name in varchar2,in_class in varchar2,in_year in number,in_period in number,user_id in varchar2) as

begin
for i in (select distinct * from table_A where class=in_class and grp_typ='N')
Loop
if field_typ='2' then
proc_1(in_proc_id,in_proc_name,i.class,i.object_typ,in_year,in_period,i.t_id,i.group_id,i.eff_dt,user_id);
elsif if field_typ='2' then
proc_2(in_proc_id,in_proc_name,i.class,i.object_typ,in_year,in_period,i.t_id,i.group_id,i.eff_dt,user_id);
else
null;
end if;
end loop;
commit;
end proc_grp_ld;

Below is the definition of the 2 procs that are called inside:

create or replace procedure proc_1( in_process_id in number,in_process_name in varchar2,in_class in varchar2,
in_object_typ in varchar2,in_year in number,in_period in number,in_t_id in varchar2,
in varchar2,in_group_id in varchar2,eff_dt date,user_id in varchar2)
as
t_sql varchar2(2000);
s_tbl_nm varchar2(30 char);
s_fld_nm varchar2(20 char);
BEGIN
if object_typ in('BU','BUL','AFF') THEN
s_tbl_nm :='ABC_TBL';
s_fld_nm :='B_U_L';
ELSIF object_typ='AFF_ND' THEN
s_tbl_nm :='DEF_TBL';
s_fld_nm :='VAL_CH';
ELSIF object_typ='ACC' THEN
s_tbl_nm :='GHI_TBL';
s_fld_nm :='ACC_T';
ELSIF object_typ='OPR' THEN
s_tbl_nm :='JKL_TBL';
s_fld_nm :='OU';
ELSE
s_tbl_nm :=NULL;
END IF;
--
t_sql:='insert into FINAL_GRP_TBL(T_ID,GROUP_ID,CLASS,A_VALUE,OBJECT_TYPE,YEAR,PERIOD) '||
'SELECT DISTINCT A.T_ID,A.GROUP_ID,'||''''||in_class||''''||',B.A_VALUE,'||''''||in_object_typ||''''||','||in_year||','||in_period||' FROM '||
'table_A,table_B,'||s_tbl_nm||' C '||
'WHERE A.T_ID='||''''||in_t_id||''''||' AND '||
'A.GROUP_ID='||''''||in_group_id||''''||' AND '||
'A.GROUP_ID=B.GROUP_ID AND '||
'C.EFFDT=(SELECT MAX(C_EFF.EFFDT) '||
'FROM '||s_tbl_nm||' C_EFF '||
'WHERE C.'||s_fld_nm||' = C_EFF.'||v_fld_nm||' AND '||
'C_EFF.EFFDT<= TO_DATE('||''''||in_period||'-'||in_year||''''||','||''''||'MM-YYYY'||''''||')) AND '||
'C.EFF_STATUS='||''''||'A'||''''||' AND '||
'B.A_VALUE=C.'||s_fld_nm;
--
EXECUTE IMMEDIATE t_sql;
commit;
END proc_1;

-----------****------------

create or replace procedure proc_2( in_process_id in number,in_process_name in varchar2,in_class in varchar2,
in_object_typ in varchar2,in_year in number,in_period in number,in_t_id in varchar2,
in varchar2,in_group_id in varchar2,eff_dt date,user_id in varchar2)
as
t_sql varchar2(2000);
s_tbl_nm varchar2(30 char);
s_fld_nm varchar2(20 char);
BEGIN
if object_typ in('BU','BUL','AFF') THEN
s_tbl_nm :='ABC_TBL';
s_fld_nm :='B_U_L';
ELSIF object_typ='AFF_ND' THEN
s_tbl_nm :='DEF_TBL';
s_fld_nm :='VAL_CH';
ELSIF object_typ='ACC' THEN
s_tbl_nm :='GHI_TBL';
s_fld_nm :='ACC_T';
ELSIF object_typ='OPR' THEN
s_tbl_nm :='JKL_TBL';
s_fld_nm :='OU';
ELSE
s_tbl_nm :=NULL;
END IF;
--
t_sql:='insert into FINAL_GRP_TBL(T_ID,GROUP_ID,CLASS,A_VALUE,OBJECT_TYPE,YEAR,PERIOD) '||
'SELECT DISTINCT A.T_ID,A.GROUP_ID,'||''''||in_class||''''||',C.'||s_fld_nm||',' ||
''''||in_object_typ||''''||','||in_year||','||in_period||' FROM '||
'table_A,table_BC,'||s_tbl_nm||' C ,table_D00 D '||
'WHERE A.T_ID='||''''||in_t_id||''''||' AND '||
'C.YEAR='||in_year||' AND '||
'C.PERIOD='||in_period||' AND '||
'A.GROUP_ID=B.GROUP_ID AND '||
'B.A_VALUE=C.TREE_NODE AND '||
'A.GROUP_ID='||''''||in_group_id||''''||' AND '||
'A.T_ID=D.T_ID AND '||
'D.EFFDT=(SELECT MAX(D_EFF.EFFDT) '||
'FROM table_D00 D_EFF '||
'WHERE D.T_ID=D_EFF.T_ID AND D.O_UNIT=D_EFF.O_UNIT AND '||
'D_EFF.EFFDT<= TO_DATE('||''''||in_period||'-'||in_year||''''||','||''''||'MM-YYYY'||''''||')) AND '||
'D.EFF_STATUS='||''''||'A'||''''||' AND '||
'D.O_UNIT=C.'||s_fld_nm;
--
EXECUTE IMMEDIATE t_sql;
commit;
END proc_2;

Thank You !!
Connor McDonald
April 30, 2019 - 3:01 am UTC

Step 1 - time how long this takes

for i in (select distinct * from table_A where class=in_class and grp_typ='N')
Loop
  null;
end loop;


That tells you if you need to tune the outer query. Let's assume its quick

Step 2 - time these two

for i in (select distinct * from table_A where class=in_class and grp_typ='N')
Loop
if field_typ='2' then
proc_1(in_proc_id,in_proc_name,i.class,i.object_typ,in_year,in_period,i.t_id,i.groupend if;
end loop;


for i in (select distinct * from table_A where class=in_class and grp_typ='N')
Loop
if if field_typ='2' then
proc_2(in_proc_id,in_proc_name,i.class,i.object_typ,in_year,in_period,i.t_id,i.group_id,i.eff_dt,user_id);
end loop;


That tells you which of proc1 and proc2 to focus on first to get the biggest gains. I'll assume proc2 but since they are similar procs, the process is similar

Step 3 - Quick fixes

You're building a dynamic SQL every time (I assume because the table name is dynamic). You *might* get some benefits just by lowering the parse costs by sharing cursors. So put this at the start of proc_grp_ld

execute immediate 'alter session set cursor_sharing = force'

and reset at the end

execute immediate 'alter session set cursor_sharing = exact'

Step 4 - Refactoring code

In reality you only have 4 possible tables here (ABC, DEF, etc).

So you could write this as distinct inserts without all the dynamic stuff, ie the first insert would be:

insert into FINAL_GRP_TBL(T_ID,GROUP_ID,CLASS,A_VALUE,OBJECT_TYPE,YEAR,PERIOD) 
SELECT DISTINCT A.T_ID,A.GROUP_ID,in_class,B.A_VALUE,in_object_typ,in_year,in_period 
FROM  table_A,
      table_B,
      ABC C,
      ( select distinct * from table_A where class=in_class and grp_typ='N' 
        and object_typ in('BU','BUL','AFF')  <==== so just focussing on rows that yield ABC as the join table
        ) x
WHERE A.T_ID=x.in_t_id AND 
A.GROUP_ID=x.in_group_id AND 
A.GROUP_ID=B.GROUP_ID AND 
C.EFFDT=(
  SELECT MAX(C_EFF.EFFDT) 
  FROM s_tbl_nm C_EFF 
  WHERE C.s_fld_nm = C_EFF.v_fld_nm 
  AND   C_EFF.EFFDT<= TO_DATE(in_period-in_year,MM-YYYY)) 
  AND  C.EFF_STATUS=A 
  AND  B.A_VALUE= c.BUL
  )


and then repeat for the other variations

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library