Insert 400 million faster from one table to another table
Connor McDonald

Thanks for the question, Jaganath.

Asked: February 25, 2021 - 6:30 am UTC

Last updated: March 05, 2021 - 4:39 am UTC

Version: 12

Viewed 100+ times

You Asked

Hi Tom,

We are having trouble loading 400 million records from one table (a temp table) into the main table. Previously we received 200 million rows from upstream and were able to load them in 3 hours using the DBMS_PARALLEL_EXECUTE approach. Now the volume has grown to 400 million rows and we expect the load to take 6 to 7 hours. Is there any other approach to insert the data faster? Below is the code currently used.

DECLARE
    -- Declarations added so the block is self-contained;
    -- names and values here are assumptions, adjust as needed
    L_TASK_NAME   VARCHAR2 (128) := 'LOAD_T_PROFILE_OFFER';
    L_SQL         VARCHAR2 (4000);
    L_DEGREE      PLS_INTEGER := 8;    -- parallel level
    L_RETRIES     PLS_INTEGER := 5;    -- maximum resume attempts
    L_ATTEMPTS    PLS_INTEGER := 0;
BEGIN
    -- Drop any leftover task from a previous run
    BEGIN
        DBMS_PARALLEL_EXECUTE.DROP_TASK (L_TASK_NAME);
    EXCEPTION
        WHEN OTHERS
        THEN
            NULL;
    END;

    -- Create the task; it is visible in USER_PARALLEL_EXECUTE_TASKS
    DBMS_PARALLEL_EXECUTE.CREATE_TASK (L_TASK_NAME);

    -- Split the source table into ROWID chunks of ~500,000 rows each
    DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID (L_TASK_NAME,
                                                  'TIREC',
                                                  'TMP_PROFILE_OFFER_ORG',
                                                  TRUE,
                                                  500000);

    L_SQL :=
        'INSERT INTO T_PROFILE_OFFER (PROFILE_ID,
                                  ENROLLMENT_ID,
                                  MEM_OFFER_ID,
                                  OFFER_ID,
                                  KIT_ID,
                                  OFFER_CODE,
                                  START_DATE,
                                  END_DATE,
                                  OFFER_PRIORITY,
                                  ACTIVATION_STATUS,
                                  STATUS,
                                  CREATE_USER,
                                  UPDATE_USER)
    SELECT PROFILE_ID,
           ENROLLMENT_ID,
           MEM_OFFER_ID,
           OFFER_ID,
           KIT_ID,
           OFFER_CODE,
           START_DATE,
           END_DATE,
           OFFER_PRIORITY,
           ACTIVATION_STATUS,
           STATUS,
           CREATE_USER,
           UPDATE_USER
      FROM TMP_PROFILE_OFFER_ORG
     WHERE ROWID BETWEEN :start_id AND :end_id
            LOG ERRORS INTO ERR$_T_PROFILE_OFFER_MO
                     (''B109: Insert'')
                     REJECT LIMIT UNLIMITED';

    -- Run the chunked statement across multiple job sessions,
    -- one chunk at a time per session, up to PARALLEL_LEVEL sessions
    DBMS_PARALLEL_EXECUTE.RUN_TASK (TASK_NAME        => L_TASK_NAME,
                                    SQL_STMT         => L_SQL,
                                    LANGUAGE_FLAG    => DBMS_SQL.NATIVE,
                                    PARALLEL_LEVEL   => L_DEGREE);

    -- If the task did not finish, resume it until it finishes
    -- or the retry limit is exhausted
    LOOP
        EXIT WHEN    DBMS_PARALLEL_EXECUTE.TASK_STATUS (L_TASK_NAME) =
                     DBMS_PARALLEL_EXECUTE.FINISHED
                  OR L_ATTEMPTS > L_RETRIES;

        L_ATTEMPTS := L_ATTEMPTS + 1;

        DBMS_PARALLEL_EXECUTE.RESUME_TASK (L_TASK_NAME);
    END LOOP;
END;
/


Thanks,
Jaganath K

and we said...

Before you look at changing anything, I think you should do some tracing because it should not take hours to load that volume of data.

Here's a small replica of yours running on my laptop.

SQL> select count(*) from TMP_PROFILE_OFFER_ORG;

  COUNT(*)
----------
 200000000   -- 200 million

SQL> set timing on
SQL> insert into T_PROFILE_OFFER (
  2      PROFILE_ID,
  3      ENROLLMENT_ID,
  4      MEM_OFFER_ID,
  5      OFFER_ID,
  6      KIT_ID,
  7      OFFER_CODE,
  8      START_DATE,
  9      END_DATE,
 10      OFFER_PRIORITY,
 11      ACTIVATION_STATUS,
 12      STATUS,
 13      CREATE_USER,
 14      UPDATE_USER
 15  )
 16  SELECT PROFILE_ID,
 17         ENROLLMENT_ID,
 18         MEM_OFFER_ID,
 19         OFFER_ID,
 20         KIT_ID,
 21         OFFER_CODE,
 22         START_DATE,
 23         END_DATE,
 24         OFFER_PRIORITY,
 25         ACTIVATION_STATUS,
 26         STATUS,
 27         CREATE_USER,
 28         UPDATE_USER
 29  FROM TMP_PROFILE_OFFER_ORG
 30  /

200000000 rows created.

Elapsed: 00:06:10.66
SQL> set timing off
SQL> truncate table T_PROFILE_OFFER reuse storage;

Table truncated.

SQL> set timing on
SQL> insert /*+ APPEND */ into T_PROFILE_OFFER (
  2      PROFILE_ID,
  3      ENROLLMENT_ID,
  4      MEM_OFFER_ID,
  5      OFFER_ID,
  6      KIT_ID,
  7      OFFER_CODE,
  8      START_DATE,
  9      END_DATE,
 10      OFFER_PRIORITY,
 11      ACTIVATION_STATUS,
 12      STATUS,
 13      CREATE_USER,
 14      UPDATE_USER
 15  )
 16  SELECT PROFILE_ID,
 17         ENROLLMENT_ID,
 18         MEM_OFFER_ID,
 19         OFFER_ID,
 20         KIT_ID,
 21         OFFER_CODE,
 22         START_DATE,
 23         END_DATE,
 24         OFFER_PRIORITY,
 25         ACTIVATION_STATUS,
 26         STATUS,
 27         CREATE_USER,
 28         UPDATE_USER
 29  FROM TMP_PROFILE_OFFER_ORG
 30  /

200000000 rows created.

Elapsed: 00:02:00.80
SQL> commit;

Commit complete.

Elapsed: 00:00:00.05
SQL>


You should be able to load hundreds of millions of rows in minutes, not hours, even on modest hardware.

So it's time to trace your load. Skip dbms_parallel_execute for the moment: run a simple serial insert of (say) 1 million rows with tracing enabled, and see where the time is being lost.
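A minimal sketch of such a traced test run, using DBMS_MONITOR for the current session (the tracefile identifier and the 1 million row cutoff are illustrative choices, and the column lists are abbreviated with "..." to stand for the full lists shown above):

```sql
-- Tag the trace file so it is easy to find in the trace directory
ALTER SESSION SET tracefile_identifier = 'profile_offer_load';

-- Enable SQL trace for this session, including wait events and bind values
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(waits => TRUE, binds => TRUE);

-- Serial insert of a 1 million row sample
INSERT INTO T_PROFILE_OFFER (PROFILE_ID, ENROLLMENT_ID, ... , UPDATE_USER)
SELECT PROFILE_ID, ENROLLMENT_ID, ... , UPDATE_USER
  FROM TMP_PROFILE_OFFER_ORG
 WHERE ROWNUM <= 1000000;

EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE;
```

The resulting trace file can then be formatted with tkprof to show where the elapsed time is going (table insert, index maintenance, trigger code, waits, and so on).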


Comments

Insert 400 million faster from one table to another table

Jaganath, March 04, 2021 - 2:19 pm UTC

Thanks for taking the time to answer my question. One follow-up question: will it take longer if the T_PROFILE_OFFER table has a trigger on it and also has a few indexes?
Connor McDonald
March 05, 2021 - 4:39 am UTC

Yes indeed.

Triggers add overhead.
Indexes add overhead.

This is why I said it's worth tracing to find out *where* the slowdown is occurring.
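If tracing confirms that trigger and index maintenance dominate the load, one common pattern is to disable them for the duration of the bulk load and restore them afterwards. A sketch, assuming the trigger and index names (which are made up here) and that whatever the trigger does can be safely skipped or applied some other way for this data:

```sql
-- Disable the trigger for the bulk load (hypothetical trigger name)
ALTER TRIGGER trg_t_profile_offer DISABLE;

-- Mark the index unusable so the insert does not maintain it row by row
-- (hypothetical index name)
ALTER INDEX t_profile_offer_ix1 UNUSABLE;
ALTER SESSION SET skip_unusable_indexes = TRUE;

-- Direct-path load
INSERT /*+ APPEND */ INTO T_PROFILE_OFFER
SELECT * FROM TMP_PROFILE_OFFER_ORG;
COMMIT;

-- Rebuild the index (optionally in parallel) and re-enable the trigger
ALTER INDEX t_profile_offer_ix1 REBUILD PARALLEL 8;
ALTER TRIGGER trg_t_profile_offer ENABLE;
```

Note the trade-off: any logic the trigger would have applied to these rows is skipped during the load, and the index is unavailable to queries until the rebuild completes.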
