Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, rekha.

Asked: October 10, 2007 - 5:01 pm UTC

Last updated: May 11, 2009 - 8:51 am UTC

Version: 10

Viewed 10K+ times! This question is

You Asked

Hi

Im new to merge..
I have a table with 140 million records.
each and every month my staging table populates with 10 million records which either i should update or insert based on conditions.

What is the efficient way of merge for that
Is merge after disabling index is faster or merge with index on table is faster.

Update

Is update after disabling index is faster or update with index on table is faster.

Thanks & Regards
Rekha

and Tom said...

the only answer possible is

it depends.


You are only modifying about 7% of the data, so it is likely that it would be more efficient to leave the indexes in place and do a merge (and if many of the 10 million records are to be inserted, perhaps with the append hint to add the new records above the high water mark and make the index maintenance easier.

if you disable the index, the merge itself would likely go faster (presuming we did not use the index to FIND the rows and given the index doesn't exist - finding the rows takes longer). But the overall process of

a) disable
b) merge
c) rebuild

might take longer than

a) merge


I would suggest a benchmark is entirely called for! simulate this on your test system and evaluate both approaches.

Rating

  (6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Index

karthick, October 11, 2007 - 8:01 am UTC

does that mean Rebuilding index is much faster than udpating the index while the table is getting modified.

Followup to karthick

A reader, October 11, 2007 - 11:31 am UTC

karthick,

It looks like you did not follow Tom's answer. He never mentioned that rebuild of index will be faster. You have to test which one will take longer.

1. updating indexes for 7% of the table rows.
2. Rebuilding index for the whole table (140 million rows)

In my opinion, rebuilding indexes for the table will take more resource and time. But still yoo needs to test in your environment as all environments are different.

Regards

Sorry got that wrong!!

karthick, October 12, 2007 - 1:01 am UTC

Yes you are correct. I got that wrong. 

But in my case actually iam performing an insert into a table. This insert will put millions of records. And the table has lots of indexes. 

So i tried two things.

1. Just perform the insert with index on place.
2. drop index
   perform the insert
   recreate index.

In my environment the 2nd one was faster. But not sure why.

below is the test result.

SQL> drop table obj1;

Table dropped.

Elapsed: 00:00:00.09
SQL>
SQL> create table obj1 as select rownum rno, a.* from all_objects a where 1=2;

Table created.

Elapsed: 00:00:00.00
SQL>
SQL>
SQL> create index idx_obj1_1 on obj1(owner);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_2 on obj1(object_name);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_3 on obj1(subobject_name);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_4 on obj1(object_id);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_5 on obj1(data_object_id);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_6 on obj1(object_type);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_7 on obj1(created);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_8 on obj1(last_ddl_time);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_9 on obj1(timestamp);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_10 on obj1(status);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_11 on obj1(temporary);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_12 on obj1(generated);

Index created.

Elapsed: 00:00:00.00
SQL> create index idx_obj1_13 on obj1(secondary);

Index created.

Elapsed: 00:00:00.00
SQL>
SQL> insert into obj1 select rownum rno, a.* from all_objects a union all select rownum rno, a.* from all_objects a;

87690 rows created.

Elapsed: 00:00:37.03

Statistics
----------------------------------------------------------
      17353  recursive calls
     491270  db block gets
     212545  consistent gets
        854  physical reads
  155563848  redo size
        862  bytes sent via SQL*Net to client
        672  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      87690  rows processed

SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL>
SQL> truncate table obj1;

Table truncated.

Elapsed: 00:00:00.06
SQL> BEGIN
  2     EXECUTE IMMEDIATE 'drop index idx_obj1_1';
  3     EXECUTE IMMEDIATE 'drop index idx_obj1_2';
  4     EXECUTE IMMEDIATE 'drop index idx_obj1_3';
  5     EXECUTE IMMEDIATE 'drop index idx_obj1_4';
  6     EXECUTE IMMEDIATE 'drop index idx_obj1_5';
  7     EXECUTE IMMEDIATE 'drop index idx_obj1_6';
  8     EXECUTE IMMEDIATE 'drop index idx_obj1_7';
  9     EXECUTE IMMEDIATE 'drop index idx_obj1_8';
 10     EXECUTE IMMEDIATE 'drop index idx_obj1_9';
 11     EXECUTE IMMEDIATE 'drop index idx_obj1_10';
 12     EXECUTE IMMEDIATE 'drop index idx_obj1_11';
 13     EXECUTE IMMEDIATE 'drop index idx_obj1_12';
 14     EXECUTE IMMEDIATE 'drop index idx_obj1_13';
 15
 16     insert into obj1 select rownum rno, a.* from all_objects a union all select rownum rno, a.* from all_objects a;
 17
 18     EXECUTE IMMEDIATE 'create index idx_obj1_1 on obj1(owner)';
 19     EXECUTE IMMEDIATE 'create index idx_obj1_2 on obj1(object_name)';
 20     EXECUTE IMMEDIATE 'create index idx_obj1_3 on obj1(subobject_name)';
 21     EXECUTE IMMEDIATE 'create index idx_obj1_4 on obj1(object_id)';
 22     EXECUTE IMMEDIATE 'create index idx_obj1_5 on obj1(data_object_id)';
 23     EXECUTE IMMEDIATE 'create index idx_obj1_6 on obj1(object_type)';
 24     EXECUTE IMMEDIATE 'create index idx_obj1_7 on obj1(created)';
 25     EXECUTE IMMEDIATE 'create index idx_obj1_8 on obj1(last_ddl_time)';
 26     EXECUTE IMMEDIATE 'create index idx_obj1_9 on obj1(timestamp)';
 27     EXECUTE IMMEDIATE 'create index idx_obj1_10 on obj1(status)';
 28     EXECUTE IMMEDIATE 'create index idx_obj1_11 on obj1(temporary)';
 29     EXECUTE IMMEDIATE 'create index idx_obj1_12 on obj1(generated)';
 30     EXECUTE IMMEDIATE 'create index idx_obj1_13 on obj1(secondary)';
 31  END;
 32
 33  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:20.08 

Tom Kyte
October 14, 2007 - 9:29 pm UTC

you start with nothing in the table??

you modified 100% of the data. All of it.

I said "you are modifying about 7% of the data, might be better to leave them in place"

besides, 20 seconds vs 37 seconds, you are not doing very much work at all... So, you are not benchmarking anything "big" here. You might want to simulate what you actually intend on doing.

INSERT/MERGE APPEND index maintenance

Richard Smith, October 12, 2007 - 8:43 am UTC

Picking up on part of the first response to this question: "perhaps with the append hint to add the new records above the high water mark and make the index maintenance easier".

I know why the APPEND hint would make the table inserts quicker, but why would it affect the index maintenance - what the rowids in the index are surely wouldn't affect how the index is maintained,..,. would they?
Tom Kyte
October 14, 2007 - 9:11 pm UTC

when we direct path - we maintain "mini indexes" for the newly loaded data and AFTER the load is done, we merge in bulk these mini indexes with the main index.

Instead of adding the index entries one by one as we load. It is done in bulk, less 'damaging' to the index structure that way.

INSERT/MERGE APPEND index maintenance

Richard Smith, October 15, 2007 - 8:49 am UTC

Wow, I'm glad I asked the question now, I hadn't realised that the way the index was maintained was different. Think I'd best do some more reading up on this,..,.

Index maintenance and sequential read

Jakub, May 05, 2009 - 10:29 am UTC

Dear Tom,

Could you please clarify whether the "index maintenance" part of the merge command could cause the session to wait on "db file sequential read"?

We are performing a batch update of large partitioned table each day (on 10g/RAC); the update changes around 10% of the whole table. The update is performed via MERGE statement with UPDATE clause only and we run it independently for each partition in the table.

Sometimes we notice that MERGE into one partition takes significantly more time than MERGE into other partitions (hour versus minutes), although the partition sizes and number of changed records are very similar and the data distributions are quite uniform across partitions.

When we traced the session we found out that most of the time is spent waiting on the "db file sequential read" event in the HASH JOIN part of the command (according to the GV$SESSION_LONGOPS). This is what puzzles me - apart from the "INDEX MAINTENANCE" there is no index access in the explain plan (below) and I have no idea why is the session waiting.

This is the runtime explain plan from GV$SQLPLAN:

MERGE /*+ USE_HASH(SRC TRGT) PARALLEL(TRGT) NOREWRITE*/ INTO OWNER_DWH.FT_INSTALMENT_HEAD_AD PARTITION(PARTITION_01_20031231) TRGT  USING  (  SELECT  /* PARALLEL (FT,4) */  ROWID RID,  CODE_SOURCE_SYSTEM,  ID_SOURCE,  DATE_EFFECTIVE,  ...  
 PX COORDINATOR  
  PX SEND/QC (RANDOM)/SYS.:TQ10004 8166580065 8198
   INDEX MAINTENANCE/OWNER_DWH.FT_INSTALMENT_HEAD_AD  
    PX RECEIVE 8166580065 8198
     PX SEND/RANGE/SYS.:TQ10003 8166580065 8198
      MERGE/OWNER_DWH.FT_INSTALMENT_HEAD_AD  
       PX RECEIVE 8166580065 8198
        PX SEND/HYBRID (ROWID PKEY)/SYS.:TQ10002 8166580065 8198
         VIEW  
          HASH JOIN/BUFFERED 8166580065 8198
           PX RECEIVE 7889148879 7287
            PX SEND/HASH/SYS.:TQ10000 7889148879 7287
             PX BLOCK/ITERATOR/./1-1 7889148879 7287
              TABLE ACCESS/FULL/OWNER_DWH.FT_INSTALMENT_HEAD_AD/1-1 7889148879 7287
           PX RECEIVE 223147460 911
            PX SEND/HASH/SYS.:TQ10001 223147460 911
             PX BLOCK/ITERATOR/./1-1 223147460 911
              TABLE ACCESS/FULL/OWNER_DWH.FT_INSTALMENT_HEAD_AD/1-1 223147460 911


Thanks for the clarification,
Jakub
Tom Kyte
May 11, 2009 - 8:51 am UTC

... Could you please clarify whether the "index maintenance" part of the merge command could cause the session to wait on "db file sequential read"? ...

simple: YES, of course.

If maintaining the index requires the index be read in from disk - then you will wait for db file sequential reads (single block IO's) to complete.