
Question and Answer

Connor McDonald

Thanks for the question, David.

Asked: December 26, 2011 - 7:27 pm UTC

Last updated: February 02, 2018 - 12:10 am UTC

Version: 11.2.0.2

Viewed 100K+ times!

You Asked

Tom,

I have a table with 2.8 million rows, and one of the columns is a BLOB because this table holds binary attachments. I need to convert these BLOBs to their plain-text equivalent to index the file contents in a system external to Oracle. I am successfully using the CTX_DOC.POLICY_FILTER() procedure to accomplish this, but when run in a serial cursor loop, the stored proc processes about 5 BLOBs per second. At this rate, I am halfway through a 6.5-day run for a very simple task.

My DB server has a little juice, boasting 16 cores and 128 GB RAM. When running my procedure, one core is getting utilized at a steady 30%. My code is very simple in that it creates a cursor to loop over all the rows in the table with the BLOB data; then, for each one, it calls CTX_DOC.POLICY_FILTER() to run the BLOB through the AUTO_FILTER and inserts the plain-text output into a new table.

How can I make use of my hardware to speed up this process? I would love to see this thing use every core (or at least 14 or 15 cores) at 100% utilization and finish the job in less than 24 hours.

I understand how to set the parallel degree in the DBMS_JOB package, but I'm not sure that my procedure (looping over a cursor) can take advantage of that.

Thanks,
Dave

and Tom said...

this would be a good time to look at dbms_parallel_execute!

right now your code looks something like:

procedure p
is
begin
  for x in (select * from t) 
  loop
     process(x);
  end loop;
end;


we are going to rewrite that to be:


procedure p( lo_rowid in rowid, hi_rowid in rowid )
is
begin
  for x in (select * from t where rowid between lo_rowid and hi_rowid) 
  loop
     process(x);
  end loop;
end;



and then we can run many of these at the same time. Here is an excerpt from my last book that details how to do this. You might consider breaking the table up into a few hundred pieces and running somewhere between 32 and 48 concurrent processes. (I based that on your 30% CPU utilization - you don't want to kill the machine; in general you don't want to be much over 90% utilization - really close, but not too much over.)

<quote>


Do-It-Yourself Parallelism

Say we have that same process as in the preceding section: the serial, simple procedure. We cannot afford a rather extensive rewrite of the implementation, but we would like to execute it in parallel. What can we do?

Starting with Oracle Database 11g Release 2, we have a new way to implement parallelism: the DBMS_PARALLEL_EXECUTE built-in package. Using it, you can execute a SQL or PL/SQL statement in parallel by taking the data to be processed and breaking it up into multiple, smaller streams. You can implement the same approach yourself, manually, and I show how in the section on "Old School Do-It-Yourself Parallelism" that follows. The beauty of the new package, though, is that it eliminates much of the tedious work that you otherwise need to perform.

Let's start with the premise that we have a SERIAL routine that we'd like to execute in parallel against some large table. We'd like to do it with as little work as possible - modify as little code as possible and be responsible for generating very little new code. Enter DBMS_PARALLEL_EXECUTE. We will not cover every possible use of this package (it is fully documented in the Oracle Database PL/SQL Packages and Types Reference 11g Release 2 (11.2) guide) - but rather use just enough of it to implement the process I've just described.

Assuming we start with the empty table T2 - we'll modify our serial process now to look like this - the additions to the original, simple serial process are the ROWID parameters and the ROWID range predicate:
big_table%ORA11GR2> create or replace
  2  procedure serial( p_lo_rid in rowid, p_hi_rid in rowid )
  3  is
  4  begin
  5      for x in ( select object_id id, object_name text
  6                   from big_table
  7                  where rowid between p_lo_rid
  8                                  and p_hi_rid )
  9      loop
 10          -- complex process here
 11          insert into t2 (id, text, session_id )
 12          values ( x.id, x.text, sys_context( 'userenv', 'sessionid' ) );
 13      end loop;
 14  end;
 15  /

Procedure created.


That was it – just add the ROWID inputs and the predicate. The modified code has not changed much at all. I am using SYS_CONTEXT to get the SESSIONID so we can monitor how much work was done by each ‘thread’, each parallel session.

Now, to start the process, we first need to break up the table into small pieces. We can do this by some numeric range (useful for tables that use a SEQUENCE to populate their primary key), by any arbitrary SQL you want to code, or by ROWID ranges. We'll use the ROWID range; I find that to be the most efficient - it creates non-overlapping ranges of the table (contention free) and doesn't require querying the table to decide the ranges; it just uses the data dictionary. So, we'll make the following API calls:

big_table%ORA11GR2> begin
  2      dbms_parallel_execute.create_task('PROCESS BIG TABLE');
  3      dbms_parallel_execute.create_chunks_by_rowid
  4      ( task_name   => 'PROCESS BIG TABLE',
  5        table_owner => user,
  6        table_name  => 'BIG_TABLE',
  7        by_row      => false,
  8        chunk_size  => 10000 );
  9  end;
 10  /
PL/SQL procedure successfully completed.


We started by creating a named task - 'PROCESS BIG TABLE' in this case. That is just a unique name we'll use to refer to our big process. Second, we invoked the CREATE_CHUNKS_BY_ROWID procedure. That procedure does exactly what its name implies: it 'chunks up' a table by ROWID ranges in a manner similar to what we just did above. We told the procedure to read the information about the currently logged in user's table named BIG_TABLE and to break it up into chunks of no more than about 10,000 blocks (CHUNK_SIZE). The parameter BY_ROW was set to false, which implies in this case that CHUNK_SIZE is not a count of rows to create the ROWID ranges by, but rather a count of blocks.

We can see the number of chunks and information about each chunk immediately after this block of code executes by querying DBA_PARALLEL_EXECUTE_CHUNKS – a new view:

big_table%ORA11GR2> select *
  2    from (
  3  select chunk_id, status, start_rowid, end_rowid
  4    from dba_parallel_execute_chunks
  5   where task_name = 'PROCESS BIG TABLE'
  6   order by chunk_id
  7         )
  8   where rownum <= 5
  9  /

  CHUNK_ID STATUS               START_ROWID        END_ROWID
---------- -------------------- ------------------ ------------------
       590 UNASSIGNED           AAAXZ0AAEAAAAKAAAA AAAXZ0AAEAAAAZ/CcP
       591 UNASSIGNED           AAAXZ0AAEAAAAsAAAA AAAXZ0AAEAAAAt/CcP
       592 UNASSIGNED           AAAXZ0AAEAAAAuAAAA AAAXZ0AAEAAAAv/CcP
       593 UNASSIGNED           AAAXZ0AAEAAAAwAAAA AAAXZ0AAEAAAAx/CcP
       594 UNASSIGNED           AAAXZ0AAEAAAAyAAAA AAAXZ0AAEAAAAz/CcP


The query in this example shows the first five rows in the view - in my case there were 209 total rows in there for the table in question, each representing a non-overlapping chunk of the table to process. This does not mean we'll be processing the table in "parallel 209" - just that we have 209 chunks in total to process. We are now ready to run our task; that is accomplished via this API call:

big_table%ORA11GR2> begin
  2      dbms_parallel_execute.run_task
  3      ( task_name      => 'PROCESS BIG TABLE',
  4        sql_stmt       => 'begin serial( :start_id, :end_id ); end;',
  5        language_flag  => DBMS_SQL.NATIVE,
  6        parallel_level => 4 );
  7  end;
  8  /
PL/SQL procedure successfully completed.


Here we asked to run our task ‘PROCESS BIG TABLE’ – which points to our chunks. The SQL statement we want to execute is 'begin serial( :start_id, :end_id ); end;' – a simple call to our stored procedure with the ROWID range to process. The PARALLEL_LEVEL I decided to use was four – meaning, we’ll have four parallel threads/processes executing this. Even though there were 209 chunks – we’ll only do four at a time. Internally, this package uses the DBMS_SCHEDULER package to run these threads in parallel.

Once our task starts running, it will create four jobs, and each job is told to process the chunks identified by the key value 'PROCESS BIG TABLE' and run the stored procedure SERIAL against each chunk. So, these four sessions start and each reads a chunk from the DBA_PARALLEL_EXECUTE_CHUNKS view, processes it, and updates the STATUS column. If the chunk is successful, the row will be marked as PROCESSED; if a given chunk cannot be processed for any reason, it will be marked as PROCESSED_WITH_ERROR and other columns will contain the detailed error message indicating the cause of the error. In either case, the session will then retrieve another chunk and process it, and so on. Eventually these four jobs will have processed all of the chunks and the task will complete.

If any of the chunks failed, you can correct the underlying cause of the error and resume the task. That will cause it to reprocess the failed chunks.
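A minimal resume call would look like this (a sketch, reusing the task name from above):

begin
    dbms_parallel_execute.resume_task( 'PROCESS BIG TABLE' );
end;
/

When all of the chunks complete successfully: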

big_table%ORA11GR2> select *
  2    from (
  3  select chunk_id, status, start_rowid, end_rowid
  4    from dba_parallel_execute_chunks
  5   where task_name = 'process big table'
  6   order by chunk_id
  7         )
  8   where rownum <= 5
  9  /

  CHUNK_ID STATUS               START_ROWID        END_ROWID
---------- -------------------- ------------------ ------------------
       590 PROCESSED            AAAXZ0AAEAAAAKAAAA AAAXZ0AAEAAAAZ/CcP
       591 PROCESSED            AAAXZ0AAEAAAAsAAAA AAAXZ0AAEAAAAt/CcP
       592 PROCESSED            AAAXZ0AAEAAAAuAAAA AAAXZ0AAEAAAAv/CcP
       593 PROCESSED            AAAXZ0AAEAAAAwAAAA AAAXZ0AAEAAAAx/CcP
       594 PROCESSED            AAAXZ0AAEAAAAyAAAA AAAXZ0AAEAAAAz/CcP


You are done. You can either retain the task for “history” or remove it. The following example shows how to remove the task:

big_table%ORA11GR2> begin
  2        dbms_parallel_execute.drop_task('process big table' );
  3  end;
  4  /
PL/SQL procedure successfully completed.

If we review our own application table – we can see that in fact the job was done using parallel 4 – and each of the four processed about the same number of rows:
big_table%ORA11GR2> select session_id, count(*)
  2    from t2
  3   group by session_id
  4   order by session_id;

SESSION_ID   COUNT(*)
---------- ----------
  21855337    2412538
  21855338    2478642
  21855339    2639919
  21855340    2468901

If you aren’t yet up to Oracle Database 11g Release 2, then you can implement the same sort of parallelism using the more labor-intensive approach in the next section. However, the new package provides a rich API (which we’ve just touched on here) that has much more functionality than the manual implementation does.
</quote>





Comments

Awesome

David Webb, December 27, 2011 - 11:17 am UTC

Great answer...makes perfect sense...I can't wait to try it this afternoon.

Thanks Tom!

dbms_parallel_execute.create_chunks_by_rowid

A reader, May 09, 2012 - 10:46 am UTC

If I have a big table and in my serial procedure the cursor only processes 5% of the big table (based on the where clause), how do I create the CHUNKS in this case without reading the big table twice?

Sample Serial Procedure Code
CREATE OR REPLACE PROCEDURE sample_serial_procedure AS
   L_cnt      NUMBER(10) := 1000;
   CURSOR CUR IS
      SELECT column1, column2
      FROM big_table
      WHERE (some_condition);
      
  TYPE TBL_column1 IS TABLE of big_table.column1%type;
  t_column1   TBL_column1 := TBL_column1();
  
  TYPE TBL_column2 IS TABLE of big_table.column2%type;
  t_column2   TBL_column2 := TBL_column2();
BEGIN
   OPEN CUR;
   LOOP
      FETCH CUR BULK COLLECT INTO t_column1, t_column2 LIMIT L_cnt;
      FOR i IN 1 .. t_column1.COUNT LOOP
         ...Some Process
         ......
      END LOOP;
      EXIT WHEN CUR%NOTFOUND;
   END LOOP;
   CLOSE CUR;
END;

Tom Kyte
May 10, 2012 - 7:48 am UTC

create the chunks by rowid - that doesn't read the table (only reads the dictionary)

and then make your query be:

where (some_condition) AND rowid between :start and :stop;
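For example, a minimal sketch of the reworked procedure (keeping the original filter and adding the chunk bounds; the names echo the sample code above):

CREATE OR REPLACE PROCEDURE sample_serial_procedure
   ( p_lo_rid IN ROWID, p_hi_rid IN ROWID )
AS
BEGIN
   FOR x IN ( SELECT column1, column2
                FROM big_table
               WHERE (some_condition)
                 AND ROWID BETWEEN p_lo_rid AND p_hi_rid )
   LOOP
      NULL;  -- some process here
   END LOOP;
END;

Each chunk scans only its rowid range and applies the 5% filter within it.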


Good but missing troubleshooting

Matt Doll, May 11, 2012 - 10:01 am UTC

Good article, but it lacks detail on 'PROCESSED_WITH_ERROR' - exception handling and explanations.
Tom Kyte
May 11, 2012 - 11:51 am UTC

well, if it processed with error, you look at said error (it'll be there as well) and then fix the condition that caused the error and re-execute it.

chunking

a reader, May 11, 2012 - 2:28 pm UTC

Hi Tom,

when chunking, do you chunk so that the data is clustered by its location in the source table - even though sessions may then have contention when multiple sessions try to insert/update the same blocks in the target table - or do you sacrifice the retrieval side and chunk so as to achieve less contention when saving to the target table?

Can you please share your strategy? Thanks.

DBMS_PARALLEL_EXECUTE

Akash, May 15, 2012 - 9:55 am UTC

Hi Tom,

Our database version is 11g Release 2 (3-node RAC). We have a PL/SQL procedure which does row-by-row processing because we need to do complex processing for each row.

Since it is running slow and our server has capacity, I am planning to use DBMS_PARALLEL_EXECUTE. But our support team has a policy of not running jobs using DBMS_JOB, and therefore the application schema doesn't have the CREATE JOB privilege, and also has no execute privilege on DBMS_PARALLEL_EXECUTE.

1) Do you see any drawbacks of scheduling jobs using the DBMS_JOB package? Are they difficult to support?

2) Do you see any issues in using the DBMS_PARALLEL_EXECUTE package in a RAC environment? My DBA is saying it is not good to use it in a RAC environment!!

Personally, I think that DBMS_PARALLEL_EXECUTE is good thing and we can meet our performance requirement without changing much code.

Please help me on this

Thanks,

Akash
Tom Kyte
May 15, 2012 - 1:56 pm UTC

dbms_job doesn't need that privilege - but dbms_scheduler does and dbms_parallel_execute uses the scheduler.

1) I have a problem with a "support team" saying "no" in a unilateral fashion. There is no reason you shouldn't be able to use dbms_parallel_execute if it would save you a ton of work - which, frankly, it will.

Yes, I see drawbacks of doing it yourself with dbms_job, the drawback being you'll spend a lot of money and time reinventing the wheel.


2) Ask your DBA for facts, for numbers, for some science. When you get it - we can debate it here.

DBMS_PARALLEL_EXECUTE

Akash, May 15, 2012 - 5:50 pm UTC

Hi Tom,

Sorry, I got confused between DBMS_JOB and DBMS_SCHEDULER. Perhaps yesterday's GOOD discussion with the DBAs is the reason.

My question is:

If we have to schedule some jobs which are nothing but executions of some Oracle procedures, would you prefer DBMS_SCHEDULER or other tools like UNIX utilities or $Universe? Do you see any issue in managing jobs scheduled using DBMS_SCHEDULER over a long period of time?

Thanks,

Akash.

Tom Kyte
May 16, 2012 - 1:09 am UTC

If they are jobs that need the database to be up and running - they are database jobs - it would only make sense to use dbms_job or dbms_scheduler - that way they only run when the database is in fact up.

And you can monitor them easily, have a nice history of execution, etc - from anywhere since all of the tools have interfaces to this stuff and you don't need any OS access to review what is going on.

DBMS_SCHEDULER

Akash, May 16, 2012 - 9:07 am UTC

Thanks Tom

Stopping a Task that's running...

A reader, July 03, 2012 - 4:01 pm UTC

If I use dbms_parallel_execute.STOP_TASK, is it going to kill the jobs that are currently running (associated with the task), or will it wait for the chunks currently being processed to complete and not process any more chunks that are UNASSIGNED?
Tom Kyte
July 03, 2012 - 7:18 pm UTC

The running job threads will run to completion and be marked appropriately (processed, in error, whatever)

No new threads will be started.

Already run threads will just be left as they were.

Main Procedure to execute DBMS_PARALLEL_EXECUTE

A reader, July 05, 2012 - 11:02 am UTC

I have a main procedure/package that runs the DBMS_PARALLEL_EXECUTE. I want to come out of the main procedure only when all the chunks are Finished or Finished with errors.

In the sample code below, is it better to SLEEP for a few seconds (DBMS_LOCK.SLEEP(2))?
If I don't have the sleep statement, is it going to burn a lot of CPU time because it continuously checks the TASK_STATUS?
CREATE OR REPLACE PROCEDURE main_procedure AS
  .............variable declaration
  l_status NUMBER;
BEGIN
  Code to execute DBMS_PARALLEL_EXECUTE.RUN_TASK...........
  --Exit the main procedure when all the chunks are finished
  l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
  WHILE l_status NOT IN (DBMS_PARALLEL_EXECUTE.FINISHED, DBMS_PARALLEL_EXECUTE.FINISHED_WITH_ERROR) LOOP
     DBMS_LOCK.SLEEP(2);
     l_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
  END LOOP;
END main_procedure;

Tom Kyte
July 05, 2012 - 11:43 am UTC

it would burn a lot of cpu checking the status unless you sleep for a bit, yes.

APPEND hint while using PARALLEL processing

SK, July 09, 2012 - 2:04 am UTC

Hi TOM,
Currently we are using Oracle 11g R2. We have day-wise partitioned tables (ONE TABLE for ONE DAY), created using PARALLEL, into which we are loading around 6 million records per day. And we are using 16-core processors.

But our system's default parallel_automatic_tuning is not set to true

SQL> show parameter automatic;

parallel_automatic_tuning            boolean                          FALSE

But while inserting in to the tables we are using "append" hint

 INSERT /*+ append */  INTO DAILY_TABLE
 
After the above changes, we are facing slowness in the database inserts. Sometimes it takes only a few seconds, but sometimes 15 to 20 mins ....


I am thinking the reasons below may be causing this problem:

1.  parallel_automatic_tuning is false; that is why the "append" hint is not using PARALLEL processing for inserts, causing the slowness.
2.  According to your explanation in one of the threads,
  It would cause all modifications to serialize. No one else could insert/update/delete or merge into this table until the transaction that direct paths commits.
  
  
Please advise:
 do I really need to use the "append" hint?
 do I need to set parallel_automatic_tuning before using PARALLEL processing on tables?

Thanks 
SK

Tom Kyte
July 11, 2012 - 12:52 pm UTC

if you want to do a parallel insert, the "append" part is implied. parallel inserts are direct path inserts.

and if you are doing a parallel insert - no one else can be modifying the table - so you would a) block until all outstanding transactions commit, blocking all others from starting, and then b) block all others from starting until it commits or rolls back.

do you really need append? do you really need parallel? If you are doing parallel - you are going to be the only thing doing anything to that table. Is that what you want?

You do not need parallel automatic tuning set, no, you do need to do this:

http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel003.htm#CACFJJGG
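In short, a parallel direct-path insert requires parallel DML to be enabled at the session level. A minimal sketch (STAGING_TABLE here is a hypothetical source):

-- parallel DML must be enabled per session before a parallel insert
alter session enable parallel dml;

-- request degree 4 on the target; APPEND (direct path) is implied by the parallel insert
insert /*+ parallel(d, 4) */ into daily_table d
select * from staging_table;

commit;  -- the direct-path loaded rows are not usable until you commit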

Excellent

Helena Marková, July 09, 2012 - 2:35 am UTC


Sree, July 21, 2012 - 8:26 am UTC

Hi Tom,

I would like to know the exception handling part in DBMS_PARALLEL_EXECUTE.

I created the task,
DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask8');

created the chunk
DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(task_name => 'mytask8',table_owner => 'RENJITH',table_name => 'EMPLOYEES',by_row => TRUE,chunk_size => 10);

then sql statement.
v_sql_stmt := 'update /*+ ROWID (dda) */ EMPLOYEES e
SET e.salary = e.salary + 30
WHERE rowid BETWEEN :start_id AND :end_id';

and run the task;

DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask8', v_sql_stmt, DBMS_SQL.NATIVE,
parallel_level => 4);

then dropped the task.

things are working fine if there are no errors.

I created a trigger to raise an error if the salary is more than 24000.

CREATE OR REPLACE TRIGGER TRGG BEFORE UPDATE ON EMPLOYEES FOR EACH ROW
BEGIN
IF :NEW.SALARY >24000 THEN
raise_application_error(-20434,'TEST FOR PARALLEL EXECUTE FUNCTION');
END IF;
END;

I was trying to update all the new salaries which are less than 24000 and log the failed rows to a table. How can I achieve this?

Could you please advise me.

Thanks in advance
Renji

Tom Kyte
July 30, 2012 - 8:18 am UTC

use dml error logging

ops$tkyte%ORA11GR2> create table t
  2  as
  3  select rownum id, all_objects.*
  4    from all_objects
  5  /

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> alter table t add constraint t_pk primary key(id);

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create or replace trigger t_trigger
  2  after update on t for each row
  3  begin
  4          if (mod(:new.object_id,1000) = 0)
  5          then
  6                  raise_application_error( -20000, 'nope' );
  7          end if;
  8  end;
  9  /

Trigger created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_errlog.create_error_log( 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_parallel_execute.create_task( 'test' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_parallel_execute.create_chunks_by_rowid( task_name=>'test', table_owner=> user, table_name => 'T', by_row => FALSE, chunk_size => 1000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> begin
  2          dbms_parallel_execute.run_task
  3          ( task_name => 'test',
  4            sql_stmt  => 'update t set object_id = object_id+1 where rowid between :start_id and :end_id log errors reject limit unlimited',
  5            language_flag => dbms_sql.native,
  6            parallel_level => 4
  7          );
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_parallel_execute.drop_task('test' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select ora_err_number$, ora_err_mesg$, id from err$_t where rownum <= 5;

ORA_ERR_NUMBER$ ORA_ERR_MESG$
--------------- ----------------------------------------
ID
-------------------------------------------------------------------------------
          20000 ORA-20000: nope
                ORA-06512: at "OPS$TKYTE.T_TRIGGER", lin
                e 4
                ORA-04088: error during execution of tri
                gger 'OPS$TKYTE.T_TRIGGER'
3834

          20000 ORA-20000: nope
                ORA-06512: at "OPS$TKYTE.T_TRIGGER", lin
                e 4
                ORA-04088: error during execution of tri
                gger 'OPS$TKYTE.T_TRIGGER'
25989

          20000 ORA-20000: nope
                ORA-06512: at "OPS$TKYTE.T_TRIGGER", lin
                e 4
                ORA-04088: error during execution of tri
                gger 'OPS$TKYTE.T_TRIGGER'
27264

          20000 ORA-20000: nope
                ORA-06512: at "OPS$TKYTE.T_TRIGGER", lin
                e 4
                ORA-04088: error during execution of tri
                gger 'OPS$TKYTE.T_TRIGGER'
27989

          20000 ORA-20000: nope
                ORA-06512: at "OPS$TKYTE.T_TRIGGER", lin
                e 4
                ORA-04088: error during execution of tri
                gger 'OPS$TKYTE.T_TRIGGER'
29264


ops$tkyte%ORA11GR2> 

A reader, August 04, 2012 - 8:16 am UTC

Hello Tom,

I was trying to understand DBMS_PARALLEL_EXECUTE in 11g R2, but it is not working for me. I tried to execute an insert statement with the same package, but it is not inserting any records into my target table. Could you please help/guide me on how to proceed.

If possible, could you please give examples for INSERT and DELETE execution with DBMS_PARALLEL_EXECUTE? Also, is there any option to capture the failed transactions - something like: all the correct transactions have to go to the target table and the failed ones to a log table.

----- Table creation scripts ----------------
create table src as
(select level id from dual connect by level <=100000);

create table tgt (id number primary key);
------------------ check for failed transactions---
truncate table tgt;
insert into tgt values (50000);

Now if I execute the above block I should get all the rows inserted except 50000, and this entry has to be logged into a table (INSERT/UPDATE/DELETE). I tried googling but couldn't find any solution.

I tried with CURSOR LIMIT, BULK COLLECT, FORALL in a loop and it is working fine. But people advised me not to commit in a loop; it will decrease performance.
Thanks In advance,


Tom Kyte
August 17, 2012 - 10:09 am UTC

give the full example please. make it small, as small as you can, but 100% complete.

like mine!

A reader, August 04, 2012 - 8:18 am UTC

Sorry, forgot to include my attempt...

DECLARE
   pct_in             NUMBER;
   retries_in         PLS_INTEGER DEFAULT 2;
   c_update_statement CONSTANT VARCHAR2(1000) := 'insert /*+ ROWID (dda) */ into tgt
      select * from src
      WHERE ROWID BETWEEN :starting_rowid AND :ending_rowid';
   c_task_name        CONSTANT VARCHAR2(20) := 'Give Raise';
   l_attempts         PLS_INTEGER := 1;
BEGIN
   DBMS_PARALLEL_EXECUTE.CREATE_TASK(c_task_name);

   DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(task_name   => c_task_name,
                                                table_owner => USER,
                                                table_name  => 'SRC',
                                                by_row      => TRUE,
                                                chunk_size  => 1000);

   DBMS_PARALLEL_EXECUTE.RUN_TASK(task_name      => c_task_name,
                                  sql_stmt       => c_update_statement,
                                  language_flag  => DBMS_SQL.NATIVE,
                                  parallel_level => 10);

   LOOP
      EXIT WHEN DBMS_PARALLEL_EXECUTE.TASK_STATUS(c_task_name) = DBMS_PARALLEL_EXECUTE.FINISHED
                OR l_attempts > retries_in;
      l_attempts := l_attempts + 1;
      DBMS_PARALLEL_EXECUTE.RESUME_TASK(c_task_name);
   END LOOP;
   DBMS_PARALLEL_EXECUTE.DROP_TASK(c_task_name);
END;

Thanks,
Renjith
Tom Kyte
August 17, 2012 - 10:42 am UTC

why do you think you'd get all but 50000?

the entire insert that inserts a range of rows would *fail*; you'd have to use dml error logging to have just the one single row fail.




but in any event, a simple query - select * from dba_parallel_execute_chunks; - would solve this for you.



Give Raise
PROCESSED_WITH_ERROR AAAeb2AAEAAAAlIAAA AAAeb2AAEAAAAlPCcP
           TASK$_4308_1
17-AUG-12 11.32.59.938174 AM
17-AUG-12 11.32.59.938666 AM
     -1006
ORA-01006: bind variable does not exist



don't be so fast to drop the task, debug it first.

use :start_id, :end_id
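Putting the two corrections together - the right bind names plus DML error logging - the statement would look something like this sketch (it assumes an error log table created beforehand with dbms_errlog.create_error_log('TGT'), which creates ERR$_TGT):

c_update_statement CONSTANT VARCHAR2(1000) := 'insert into tgt
   select * from src
   WHERE ROWID BETWEEN :start_id AND :end_id
   log errors reject limit unlimited';

Failed rows (such as the duplicate key on 50000) would then land in ERR$_TGT instead of failing the whole chunk.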

APPEND hint while using PARALLEL

Sandeep, August 25, 2012 - 11:34 pm UTC

Hi Tom,
This is SK from India; this is a follow-up question to the post I made on July 9, 2012 - 2am (APPEND hint while using PARALLEL processing). http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4248554900346593542#5167241500346078817

We have removed the /*+ append */ hint from all the inserts, and the parallel_degree_policy initialization parameter has been set to true, which means direct inserts are enabled (from your response to my post).

We are loading into a single table (CONTEXT INDEX SYNC on COMMIT) from 20 different servers, and we are using 20 different LOCAL partitions, one for each server.

For the first 5-6 days loading worked properly; now, after 10 days, loading has become very slow.

Thanks
SK

Doubt on DBMS_PARALLEL_EXECUTE

manish, October 18, 2012 - 6:30 am UTC

This is a very good feature in Oracle 11g.

I want to use this feature in a data migration project,
but I have the doubts below.

How do I track failed records from a chunk?

If any one record fails, will the whole chunk be marked as a failure? Will it roll back the transactions for all the other successfully processed records from the same chunk?

Will RESUME_TASK run only for the failed records of a failed chunk, or will it run for all the records of the failed chunk?

Could you please provide some sample examples for all the above queries?

Thanks
manish

Tom Kyte
October 18, 2012 - 8:33 am UTC

it would run for the entire chunk; the chunk is the transaction - a chunk should entirely succeed or entirely fail.

Now, there are ways to make it so that this isn't the case. You would use a stored procedure to process a chunk. The stored procedure could mark rows as processed (or just outright delete them when done with them) and do incremental commits. The rows that 'fail' would not be marked as done (nor deleted). The stored procedure would end by "failing" if any row was unprocessed in its chunk. The stored procedure would end by succeeding if all rows in the chunk succeeded.

So the stored procedure would query up its rows using something like:


select * from t where rowid between :x and :y and processed = 'N';

and as it processed those records, it would update processed to 'Y' and commit every now and then. Those processed records would not be processed again.
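A sketch of such a restartable chunk procedure (this assumes a PROCESSED flag column, defaulting to 'N', has been added to the table):

create or replace procedure process_chunk( p_lo_rid in rowid, p_hi_rid in rowid )
as
    l_failed number := 0;
    l_done   number := 0;
begin
    for x in ( select rowid rid, t.*
                 from t
                where rowid between p_lo_rid and p_hi_rid
                  and processed = 'N' )
    loop
        begin
            -- complex per-row process goes here
            update t set processed = 'Y' where rowid = x.rid;
            l_done := l_done + 1;
        exception
            when others then
                l_failed := l_failed + 1;  -- leave this row marked 'N'
        end;
        if mod(l_done, 100) = 0 then
            commit;  -- incremental commit; a re-run skips rows already marked 'Y'
        end if;
    end loop;
    commit;
    if l_failed > 0 then
        -- fail the chunk so it is marked PROCESSED_WITH_ERROR and can be resumed
        raise_application_error( -20001, l_failed || ' row(s) in this chunk failed' );
    end if;
end;

On resume, only the rows still marked 'N' in the failed chunks get processed again.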

Doubt on DBMS_PARALLEL_EXECUTE

manish, October 19, 2012 - 12:09 am UTC

Thanks for responding :)


Tarun Agrawal, November 21, 2012 - 8:15 am UTC

Hi Tom,

I need your help to understand the problem below.

In dbms_parallel_execute.run_task I'm giving the parameter value parallel_level => 10, and I have a total of 110 chunks to be processed by scheduler slaves.

My code executed perfectly fine.

The view dba_scheduler_job_run_details shows 10 jobs were created, with job names like 'TASK$XXX_XX'.

But when I query dba_parallel_execute_chunks, it shows that all chunks were processed by only two scheduler slaves instead of 10.

Please help
Tom Kyte
November 21, 2012 - 9:41 am UTC

example? especially your evidence that only two jobs ran.

Please find an example

A reader, November 23, 2012 - 1:01 am UTC

As requested, I have created a working example of the above problem.

Two tables are created:
1. temp_recd -> contains chunks details used in DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL. Total 10 rows are created as below:
Start_id End_id
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10

2.Out_recd -> to capture sid_number and chunk id.

Below is the code

DECLARE
l_task_name varchar2(100) := 'TEST_TASK';
l_sql clob;
BEGIN
l_sql := q'|

declare
l_start_id number := :start_id;
l_end_id number := :end_id;
l_sid number;
begin
select sid into l_sid from v$mystat where rownum =1;
insert into out_recd values (l_start_id,l_sid);
dbms_lock.sleep(1);
end;

|';
BEGIN
DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => l_task_name);
EXCEPTION
WHEN OTHERS THEN
NULL;
DBMS_OUTPUT.PUT_LINE('TASK NOT EXISTS: '|| l_task_name);
END;

DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => l_task_name);
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL(task_name => l_task_name, sql_stmt => 'select start_id, end_id from temp_recd', by_rowid => FALSE);
DBMS_PARALLEL_EXECUTE.RUN_TASK(task_name => l_task_name, sql_stmt => l_sql, language_flag => DBMS_SQL.NATIVE, parallel_level => 5 );

END;

Below are the outputs:

select task_name, chunk_type, status, job_prefix from dba_parallel_execute_tasks
task_name chunk_type status job_prefix
TEST_TASK NUMBER_RANGE FINISHED TASK$_1059

select task_name, start_id, end_id, job_name, status from dba_parallel_execute_chunks
task_name start_id end_id job_name status
TEST_TASK 10 10 TASK$_1059_2 PROCESSED
TEST_TASK 9 9 TASK$_1059_1 PROCESSED
TEST_TASK 8 8 TASK$_1059_2 PROCESSED
TEST_TASK 7 7 TASK$_1059_1 PROCESSED
TEST_TASK 6 6 TASK$_1059_2 PROCESSED
TEST_TASK 5 5 TASK$_1059_1 PROCESSED
TEST_TASK 4 4 TASK$_1059_2 PROCESSED
TEST_TASK 3 3 TASK$_1059_1 PROCESSED
TEST_TASK 2 2 TASK$_1059_2 PROCESSED
TEST_TASK 1 1 TASK$_1059_1 PROCESSED

select job_name, status,run_duration, session_id, slave_pid from dba_scheduler_job_run_details
job_name status run_duration session_id slave_pid
TASK$_1059_2 SUCCEEDED +00 00:00:05.000000 174,41985 12001
TASK$_1059_3 SUCCEEDED +00 00:00:00.000000 174,41987 12001
TASK$_1059_5 SUCCEEDED +00 00:00:00.000000 174,41989 12001
TASK$_1059_1 SUCCEEDED +00 00:00:05.000000 224,3909 24221
TASK$_1059_4 SUCCEEDED +00 00:00:00.000000 224,3911 24221

Tom Kyte
November 29, 2012 - 6:59 am UTC

I don't see any example I can run - do you?


oh, by the way, I hate your code:


 WHEN OTHERS THEN
        NULL;
        DBMS_OUTPUT.PUT_LINE('TASK NOT EXISTS: '|| l_task_name);
    END;



i truly hate your code, why why why would you do that? just to make it harder to debug? that is the only thing you accomplish with that (and don't say "it is just a demo" - that doesn't cut it; why put it into a demo??)


and I don't see in your output any indication of any problem????? You have 10 chunks in dba_parallel_execute_chunks. I see 5 entries for 5 jobs in dba_scheduler_job_run_details.

I don't see you query your table at all, so no clue what was there.

so what is the problem?????


(and where are your create tables and inserts??? I could not run your code - but nothing seems wrong to me here yet?????)


Uneven distribution of rows in chunks

Rajeshwaran, Jeyabal, November 30, 2012 - 5:12 am UTC

Tom,

I was working with dbms_parallel_execute on BIG_TABLE (having 10M records) and set chunking by rows with chunk_size as 1000.

But what I got is: for chunk_ids between 1 and 10, the number of rows per chunk picked is < 500; for chunk_ids between 64 and 74, the number of rows per chunk is around 1500 to 3000.

As per the docs, chunk_size is the approximate number of rows to process, but my question is:

1) why does the approximation vary so drastically between chunks? (Say, for chunk_id=4 it is well below 1000, but for chunk_id=67 it is far above it.)
http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_parallel_ex.htm#CHDGCHFA
Docs, says this
by_row=> TRUE if chunk_size refers to the number of rows
chunk_size=> Approximate number of rows/blocks to process for each commit cycle
rajesh@ORA11G> begin
  2     dbms_parallel_execute.create_task('PROCESS_BIG_TABLE');
  3     dbms_parallel_execute.create_chunks_by_rowid
  4     ( task_name=>'PROCESS_BIG_TABLE',
  5       table_owner=>user,
  6       table_name=>'BIG_TABLE',
  7       by_row=>true,
  8       chunk_size=>1000 );
  9  end;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.36
rajesh@ORA11G>
rajesh@ORA11G>
rajesh@ORA11G> select t.*,
  2    ( select count(*)
  3      from big_table
  4      where rowid between start_rowid and end_rowid) cnt
  5  from  (
  6  select chunk_id,task_name,status,start_rowid,end_rowid
  7  from user_parallel_execute_chunks
  8  order by chunk_id
  9        ) t
 10  where rownum <= 10
 11  /

  CHUNK_ID TASK_NAME          STATUS     START_ROWID        END_ROWID                 CNT
---------- ------------------ ---------- ------------------ ------------------ ----------
         1 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAAVwAAA AAASkZAAFAAAAV3CcP        294
         2 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAAV4AAA AAASkZAAFAAAAV/CcP        267
         3 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAmXwAAA AAASkZAAFAAAmX3CcP        260
         4 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAmX4AAA AAASkZAAFAAAmX/CcP        259
         5 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWAAAA AAASkZAAFAAAsWHCcP        551
         6 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWIAAA AAASkZAAFAAAsWPCcP        522
         7 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWQAAA AAASkZAAFAAAsWXCcP        576
         8 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWYAAA AAASkZAAFAAAsWfCcP        517
         9 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWgAAA AAASkZAAFAAAsWnCcP        477
        10 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAsWoAAA AAASkZAAFAAAsWvCcP        448

10 rows selected.

Elapsed: 00:00:00.10
rajesh@ORA11G>
rajesh@ORA11G> select t.*,
  2    ( select count(*)
  3      from big_table
  4      where rowid between start_rowid and end_rowid) cnt
  5  from  (
  6  select chunk_id,task_name,status,start_rowid,end_rowid
  7  from user_parallel_execute_chunks
  8        ) t
  9  where chunk_id between 64
 10  and 74 ;

  CHUNK_ID TASK_NAME          STATUS     START_ROWID        END_ROWID                 CNT
---------- ------------------ ---------- ------------------ ------------------ ----------
        64 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtGAAAA AAASkZAAFAAAtGxCcP       3085
        65 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtGyAAA AAASkZAAFAAAtHjCcP       3254
        66 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtHkAAA AAASkZAAFAAAtH/CcP       1772
        67 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtIAAAA AAASkZAAFAAAtIxCcP       3105
        68 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtIyAAA AAASkZAAFAAAtJjCcP       3218
        69 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtJkAAA AAASkZAAFAAAtJ/CcP       1804
        70 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtKAAAA AAASkZAAFAAAtKxCcP       3057
        71 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtKyAAA AAASkZAAFAAAtLjCcP       3217
        72 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtLkAAA AAASkZAAFAAAtL/CcP       1781
        73 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtMAAAA AAASkZAAFAAAtMHCcP        513
        74 PROCESS_BIG_TABLE  UNASSIGNED AAASkZAAFAAAtOAAAA AAASkZAAFAAAtOxCcP       3087

11 rows selected.

Elapsed: 00:00:00.14
rajesh@ORA11G>

Tom Kyte
November 30, 2012 - 6:36 am UTC

we do not cross extent boundaries when chunking by rowid, so the first few chunks will contribute a small number of rows since they come from really small extents - then the extents get larger and larger.

On my one million row "big_table" (copy of all objects over and over to get a million rows) with a chunk size of 100,000 you might see something like:

ops$tkyte%ORA11GR2> begin
  2     dbms_parallel_execute.create_task('PROCESS_BIG_TABLE');
  3     dbms_parallel_execute.create_chunks_by_rowid
  4     ( task_name=>'PROCESS_BIG_TABLE',
  5       table_owner=>'BIG_TABLE',
  6       table_name=>'BIG_TABLE',
  7       by_row=>true,
  8       chunk_size=>100000 );
  9  end;
 10  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> set linesize 10000
ops$tkyte%ORA11GR2> column task_name format a17
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select t.*,
  2    ( select count(*)
  3      from big_table.big_table
  4      where rowid between start_rowid and end_rowid) cnt
  5  from  (
  6  select chunk_id,start_rowid,end_rowid
  7  from user_parallel_execute_chunks
  8  order by chunk_id
  9        ) t
 10  /

  CHUNK_ID START_ROWID        END_ROWID                 CNT
---------- ------------------ ------------------ ----------
      2595 AAAdoMAADAAAACAAAA AAAdoMAADAAAACHCcP        407
      2596 AAAdoMAADAAAACIAAA AAAdoMAADAAAACPCcP        619
      2597 AAAdoMAADAAAACQAAA AAAdoMAADAAAACXCcP        551
      2598 AAAdoMAADAAAACYAAA AAAdoMAADAAAACfCcP        618
      2599 AAAdoMAADAAAACgAAA AAAdoMAADAAAACnCcP        532
      2600 AAAdoMAADAAAACoAAA AAAdoMAADAAAACvCcP        623
      2601 AAAdoMAADAAAACwAAA AAAdoMAADAAAAC3CcP        536
      2602 AAAdoMAADAAAAC4AAA AAAdoMAADAAAAC/CcP        668
      2603 AAAdoMAADAAAADAAAA AAAdoMAADAAAADHCcP        549
      2604 AAAdoMAADAAAADIAAA AAAdoMAADAAAADPCcP        608
      2605 AAAdoMAADAAAADQAAA AAAdoMAADAAAADXCcP        515
      2606 AAAdoMAADAAAADYAAA AAAdoMAADAAAADfCcP        596
      2607 AAAdoMAADAAAADgAAA AAAdoMAADAAAADnCcP        496
      2608 AAAdoMAADAAAADoAAA AAAdoMAADAAAADvCcP        616
      2609 AAAdoMAADAAAADwAAA AAAdoMAADAAAAD3CcP        527
      2610 AAAdoMAADAAAAD4AAA AAAdoMAADAAAAD/CcP        591
      2611 AAAdoMAADAAAAEAAAA AAAdoMAADAAAAF/CcP       8763
      2612 AAAdoMAADAAAAGAAAA AAAdoMAADAAAAH/CcP       8447
      2613 AAAdoMAADAAAAIAAAA AAAdoMAADAAAAJ/CcP       8501
      2614 AAAdoMAADAAAAKAAAA AAAdoMAADAAAAL/CcP       8411
      2615 AAAdoMAADAAAAMAAAA AAAdoMAADAAAAN/CcP       8452
      2616 AAAdoMAADAAAAOAAAA AAAdoMAADAAAAP/CcP       8671
      2617 AAAdoMAADAAAAQAAAA AAAdoMAADAAAAR/CcP       8811
      2618 AAAdoMAADAAAASAAAA AAAdoMAADAAAAT/CcP       9007
      2619 AAAdoMAADAAAAUAAAA AAAdoMAADAAAAV/CcP       9168
      2620 AAAdoMAADAAAAWAAAA AAAdoMAADAAAAX/CcP       8369
      2621 AAAdoMAADAAAAYAAAA AAAdoMAADAAAAZ/CcP       8553
      2622 AAAdoMAADAAAAaAAAA AAAdoMAADAAAAb/CcP       8409
      2623 AAAdoMAADAAAAcAAAA AAAdoMAADAAAAd/CcP       8446
      2624 AAAdoMAADAAAAeAAAA AAAdoMAADAAAAf/CcP       8665
      2625 AAAdoMAADAAAAgAAAA AAAdoMAADAAAAh/CcP       8677
      2626 AAAdoMAADAAAAiAAAA AAAdoMAADAAAAj/CcP       8593
      2627 AAAdoMAADAAAAkAAAA AAAdoMAADAAAAl/CcP       9549
      2628 AAAdoMAADAAAAmAAAA AAAdoMAADAAAAn/CcP       8587
      2629 AAAdoMAADAAAAoAAAA AAAdoMAADAAAAp/CcP       8465
      2630 AAAdoMAADAAAAqAAAA AAAdoMAADAAAAr/CcP       8493
      2631 AAAdoMAADAAAAsAAAA AAAdoMAADAAAAt/CcP       8407
      2632 AAAdoMAADAAAAuAAAA AAAdoMAADAAAAv/CcP       8457
      2633 AAAdoMAADAAAAwAAAA AAAdoMAADAAAAx/CcP       8763
      2634 AAAdoMAADAAAAyAAAA AAAdoMAADAAAAz/CcP       8700
      2635 AAAdoMAADAAAA0AAAA AAAdoMAADAAAA1/CcP       9209
      2636 AAAdoMAADAAAA2AAAA AAAdoMAADAAAA3/CcP       9019
      2637 AAAdoMAADAAAA4AAAA AAAdoMAADAAAA5/CcP       8389
      2638 AAAdoMAADAAAA6AAAA AAAdoMAADAAAA7/CcP       8545
      2639 AAAdoMAADAAAA8AAAA AAAdoMAADAAAA9/CcP       8410
      2640 AAAdoMAADAAAA+AAAA AAAdoMAADAAAA//CcP       8439
      2641 AAAdoMAADAAABAAAAA AAAdoMAADAAABB/CcP       8687
      2642 AAAdoMAADAAABCAAAA AAAdoMAADAAABD/CcP       8698
      2643 AAAdoMAADAAABEAAAA AAAdoMAADAAABF/CcP       8708
      2644 AAAdoMAADAAABGAAAA AAAdoMAADAAABH/CcP       9459
      2645 AAAdoMAADAAABIAAAA AAAdoMAADAAABJ/CcP       8492
      2646 AAAdoMAADAAABKAAAA AAAdoMAADAAABL/CcP       8507
      2647 AAAdoMAADAAABMAAAA AAAdoMAADAAABN/CcP       8448
      2648 AAAdoMAADAAABOAAAA AAAdoMAADAAABP/CcP       8430
      2649 AAAdoMAADAAABQAAAA AAAdoMAADAAABR/CcP       8450
      2650 AAAdoMAADAAABSAAAA AAAdoMAADAAABT/CcP       8767
      2651 AAAdoMAADAAABUAAAA AAAdoMAADAAABV/CcP       8629
      2652 AAAdoMAADAAABWAAAA AAAdoMAADAAABX/CcP       9375
      2653 AAAdoMAADAAABYAAAA AAAdoMAADAAABZ/CcP       8887
      2654 AAAdoMAADAAABaAAAA AAAdoMAADAAABb/CcP       8415
      2655 AAAdoMAADAAABcAAAA AAAdoMAADAAABd/CcP       8531
      2656 AAAdoMAADAAABeAAAA AAAdoMAADAAABf/CcP       8413
      2657 AAAdoMAADAAABgAAAA AAAdoMAADAAABh/CcP       8423
      2658 AAAdoMAADAAABiAAAA AAAdoMAADAAABj/CcP       8688
      2659 AAAdoMAADAAABkAAAA AAAdoMAADAAABl/CcP       8794
      2660 AAAdoMAADAAABmAAAA AAAdoMAADAAABn/CcP       8860
      2661 AAAdoMAADAAABoAAAA AAAdoMAADAAABp/CcP       9391
      2662 AAAdoMAADAAABqAAAA AAAdoMAADAAABr/CcP       8367
      2663 AAAdoMAADAAABsAAAA AAAdoMAADAAABt/CcP       8552
      2664 AAAdoMAADAAABuAAAA AAAdoMAADAAABv/CcP       8410
      2665 AAAdoMAADAAABwAAAA AAAdoMAADAAABx/CcP       8449
      2666 AAAdoMAADAAAByAAAA AAAdoMAADAAABz/CcP       8525
      2667 AAAdoMAADAAAB0AAAA AAAdoMAADAAAB1/CcP       8715
      2668 AAAdoMAADAAAB2AAAA AAAdoMAADAAAB3/CcP       8596
      2669 AAAdoMAADAAAB4AAAA AAAdoMAADAAAB5/CcP       9509
      2670 AAAdoMAADAAAB6AAAA AAAdoMAADAAAB7/CcP       8741
      2671 AAAdoMAADAAAB8AAAA AAAdoMAADAAAB9/CcP       8453
      2672 AAAdoMAADAAAB+AAAA AAAdoMAADAAAB//CcP       8494
      2673 AAAdoMAADAAACAAAAA AAAdoMAADAAACB/CcP       8473
      2674 AAAdoMAADAAACCAAAA AAAdoMAADAAACR/CcP      70262
      2675 AAAdoMAADAAACSAAAA AAAdoMAADAAACh/CcP      70305
      2676 AAAdoMAADAAACiAAAA AAAdoMAADAAACx/CcP      70292
      2677 AAAdoMAADAAACyAAAA AAAdoMAADAAADB/CcP      70288
      2678 AAAdoMAADAAADCAAAA AAAdoMAADAAADR/CcP      70293
      2679 AAAdoMAADAAADSAAAA AAAdoMAADAAADh/CcP      70231
      2680 AAAdoMAADAAADiAAAA AAAdoMAADAAADx/CcP      23566

86 rows selected.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec DBMS_PARALLEL_EXECUTE.DROP_TASK('PROCESS_BIG_TABLE');

PL/SQL procedure successfully completed.


it grows by the extent size...

pipelined functions

david, November 30, 2012 - 2:30 pm UTC

can't we achieve the parallel processing with pipelined functions too?
Tom Kyte
December 03, 2012 - 8:08 am UTC

sure, in the same chapter that I picked this from - I show how to do a parallel pipelined function - what the requirements are.

It is *similar* but different.

dbms_parallel_execute breaks one big mega transaction into lots of small ones (each independent of each other). This has pros and cons of course.

database parallelism would keep it as one big transaction. This has pros and cons of course.
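For reference, the general shape of a parallel-enabled pipelined function is sketched below (T2_ROW and T2_TAB are hypothetical types; the real processing is elided):

create or replace type t2_row as object ( id number, text varchar2(30) );
/
create or replace type t2_tab as table of t2_row;
/
create or replace function process_rows( p_cur in sys_refcursor )
return t2_tab
pipelined
parallel_enable ( partition p_cur by any )  -- lets parallel query drive the function
as
    l_id   number;
    l_text varchar2(30);
begin
    loop
        fetch p_cur into l_id, l_text;
        exit when p_cur%notfound;
        -- complex process here
        pipe row ( t2_row( l_id, l_text ) );
    end loop;
    return;
end;
/

-- one big parallel statement, one big transaction:
insert /*+ append */ into t2 ( id, text )
select *
  from table( process_rows( cursor(
         select /*+ parallel(bt,4) */ object_id, object_name from big_table bt ) ) );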

chunking by rowid

A reader, December 03, 2012 - 12:31 pm UTC

Hi Tom,

Chunking by rowid is efficient for the chunking process itself, but inserting into the target table could have contention because different chunks could have the same keys on the target table. So chunking by rowid is not necessarily the most efficient way to break up data for the whole process; can you please comment on that?

Thanks
Tom Kyte
December 04, 2012 - 7:37 am UTC

but inserting to the
target table could have contention because different chunks could have same
keys on the target table.


explain please? need more information.

I don't know what your processing involves - so I cannot really analyze if there are any issues

give us a "for example" to work with

A reader, December 05, 2012 - 10:24 am UTC

This thought of possible contention on the inserting/updating side just occurred to me as I was reading this question thread. For example,

chunk1: rowid1 - rowid10, the rows in the range have keys KEY1, KEY2, KEY3, KEY4

chunk2: rowid11 - rowid20, the rows in the range have keys KEY1, KEY2, KEY3, KEY4

as the 2 threads work on chunk1 and chunk2 separately, they would update the same keys and have write contention.

If we chunk by key value, then we perhaps can have

chunk1: KEY1, KEY2
chunk2: KEY3, KEY4

then the 2 working threads won't bump into each other, but you pay on the chunking side, because you have to scan the source to separate them.

If we chunk by key value

Rajeshwaran, Jeyabal, December 14, 2012 - 3:40 am UTC

If we chunk by key value, then we perhaps can have

chunk1: KEY1, KEY2
chunk2: KEY3, KEY4


If you want your keys to be chunked this way, please look at "Chunk by User-Provided SQL" at the link below:

http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_parallel_ex.htm#autoId6
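A minimal sketch of that approach for the example above (assuming a numeric KEY column on a hypothetical table T):

begin
    dbms_parallel_execute.create_task( 'BY_KEY' );
    dbms_parallel_execute.create_chunks_by_sql
    ( task_name => 'BY_KEY',
      sql_stmt  => 'select key start_id, key end_id from (select distinct key from t)',
      by_rowid  => false );
end;
/

Each distinct key becomes its own chunk, and the processing statement would then use "where key between :start_id and :end_id". Note the one scan of the source to compute the chunks - that is the price the previous reviewer mentioned.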

dbms_parallel_execute

HARIHARAN. J, April 19, 2013 - 7:38 am UTC

While executing dbms_parallel_execute.run_task, how do I set the instance?

E.g.:
If I run a parallel task with level 5, the 5 jobs run on different instances. How do I make all the jobs run on one instance? I expect that, implicitly, Oracle will run them on different instances.
Tom Kyte
April 22, 2013 - 8:00 pm UTC

we are just using the scheduler in the background.

there is currently no way to say "use this one instance" with dbms_parallel_execute

http://docs.oracle.com/cd/E11882_01/appdev.112/e16760/d_parallel_ex.htm#CHDIBHHB


How to chunk data of a subpartition using DBMS_PARALLEL_EXECUTE

Ankit, April 25, 2013 - 11:59 am UTC

Hi,

In Oracle version 11.2.0.2, I have to migrate data from a table containing over a billion rows. This table is range partitioned based on date and further Hash subpartitioned based on an ID column.
Each of my subpartitions holds approx 200 million rows. I wanted to use DBMS_PARALLEL_EXECUTE to chunk the data subpartition by subpartition on the basis of ROWID.

Unfortunately DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID doesn't take a subpartition name as an input.
Can you suggest a way in which I can chunk the data of the subpartition?

Thanks,
Ankit
Tom Kyte
April 25, 2013 - 1:29 pm UTC

what is involved in this migration?

why not just a single DDL statement to create new table as select? no DML, all DDL? in parallel


in short, why chunk it up - what is the reasoning for that here?


you can always use create_chunks_by_sql

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10498431232211

there is a query that given a segment name - returns a set of non-overlapping rowid pairs to completely cover that segment. You can use that as a starting place for the sql you need to chunk your subpartition if you really want to go that route.
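The general shape of that dictionary-driven chunking query is sketched below (owner, table, and subpartition names are placeholders); the resulting statement would be handed to create_chunks_by_sql with by_rowid => true:

select dbms_rowid.rowid_create( 1, o.data_object_id, e.relative_fno,
                                e.block_id, 0 ) start_rowid,
       dbms_rowid.rowid_create( 1, o.data_object_id, e.relative_fno,
                                e.block_id + e.blocks - 1, 32767 ) end_rowid
  from dba_extents e, dba_objects o
 where e.owner          = 'YOUR_SCHEMA'
   and e.segment_name   = 'YOUR_TABLE'
   and e.partition_name = 'YOUR_SUBPARTITION'
   and o.owner          = e.owner
   and o.object_name    = e.segment_name
   and o.subobject_name = e.partition_name;

Because it only reads the dictionary, it never scans the 200 million row subpartition just to decide the ranges.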

dbms_parallel_execute from docs (11.2.0.1)

Rajeshwaran, Jeyabal., May 31, 2013 - 10:09 am UTC

Tom:

I was reading about dbms_parallel_execute from docs at this link
http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_parallel_ex.htm#autoId21
<quote> This procedure deletes all the processed chunks whose status is PROCESSED or PROCESSED_WITH_ERROR </quote>

Tom: can you tell me why I am unable to delete chunks in "PROCESSED_WITH_ERROR" ? I am on 11.2.0.1 on 32-bit windows
rajesh@ORA11G> select chunk_id,task_name,status,error_code
  2  from user_parallel_execute_chunks
  3  /

  CHUNK_ID TASK_NAME            STATUS               ERROR_CODE
---------- -------------------- -------------------- ----------
     10148 TEST01               PROCESSED_WITH_ERROR      -1006
     10149 TEST01               PROCESSED_WITH_ERROR      -1006
     10150 TEST01               PROCESSED_WITH_ERROR      -1006
 
Elapsed: 00:00:00.00
rajesh@ORA11G> exec dbms_parallel_execute.purge_processed_chunks('TEST01');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
rajesh@ORA11G> select chunk_id,task_name,status,error_code
  2  from user_parallel_execute_chunks
  3  /

  CHUNK_ID TASK_NAME            STATUS               ERROR_CODE
---------- -------------------- -------------------- ----------
     10148 TEST01               PROCESSED_WITH_ERROR      -1006
     10149 TEST01               PROCESSED_WITH_ERROR      -1006
     10150 TEST01               PROCESSED_WITH_ERROR      -1006
 
Elapsed: 00:00:00.01
rajesh@ORA11G>

Tom Kyte
May 31, 2013 - 3:06 pm UTC

give the entire example to reproduce with

get rowid ranges in a block

A reader, June 07, 2013 - 6:43 pm UTC

Hi Tom,

Given the file number and a block number in this file, how can we get the rowids for all the rows in this block?

Thanks!
Tom Kyte
June 07, 2013 - 8:48 pm UTC

use dbms_rowid.rowid_create with a row_number of 0 and another call with a row_number of 10000.

that'll give you the lowest possible rowid for that block and a rowid that is higher than the largest possible rowid for that block.
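For example, a quick sketch (the binds being the table's DATA_OBJECT_ID from DBA_OBJECTS plus your file and block numbers):

select dbms_rowid.rowid_create( 1, :data_object_id, :file_no, :block_no, 0 )     low_rid,
       dbms_rowid.rowid_create( 1, :data_object_id, :file_no, :block_no, 10000 ) high_rid
  from dual;

-- then: select * from your_table where rowid between :low_rid and :high_rid;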

Number of parallel threads

reader, July 17, 2013 - 6:56 pm UTC

Hi Tom
I tried the initial example you posted to try parallelism using the dbms_parallel_execute package.
I applied the same parameter values.

begin
   dbms_parallel_execute.run_task
   ( task_name      => 'PROCESS BIG TABLE',
     sql_stmt       => 'begin SERIAL_INSERT( :start_id, :end_id ); end;',
     language_flag  => DBMS_SQL.NATIVE,
     parallel_level => 4 );   -- <- parallel level of 4
end;

Even after setting the parallel level to 4, when I queried

select status, count(1)
from dba_parallel_execute_chunks
where task_name = 'PROCESS KW_IMREX_CUST_ORDER_DETAILS_W'
group by status

STATUS       COUNT
ASSIGNED         3   <- only 3 chunks are in the ASSIGNED state
UNASSIGNED       8
PROCESSED      176

Does parallel_level => 4 not ensure that 4 chunks of rowid ranges execute in parallel? Should 4 chunks always be in the ASSIGNED state during the lifetime of this procedure execution, until all the chunks are executed?

Oracle Documentation says "The default number of Job slaves is computed as the product of Oracle parameters cpu_count and parallel_threads_per_cpu"

Currently the database has
cpu_count integer 32
parallel_threads_per_cpu integer 2

Is this behaviour caused by these parameters?
Tom Kyte
July 17, 2013 - 9:06 pm UTC

what is job_queue_processes set to?

and was it always 3 (did you query repeatedly)? you could have gotten "lucky" and queried just when one finished and another hadn't started yet.

and are there any other jobs running?

Number of parallel threads

reader, July 18, 2013 - 5:50 pm UTC

The job_queue_processes parameter is set to 4.

job_queue_processes integer 4

There were no other processes running. Yes, I queried repeatedly.

Also, on querying the target table for the sessions and counts, I see only 3 sessions after the load.

select session_id,count(1) from T2
group by session_id;

Session_id Count(*)
19624546 316161
19624514 713987
19624513 706318

Tom Kyte
July 18, 2013 - 6:05 pm UTC

I cannot reproduce. When I set it to 4, 4 jobs run.

please verify using ps that there are four job queue processes

query the jobs and scheduler views to make sure nothing else is running in the background that you are not aware of (materialized view refreshes, EM jobs, etc)

please verify and show us using cut and paste right from sqlplus all of this information - job queue processes, that there are 4 job processes fired up, that there are no other background jobs.


Number of parallel threads

reader, July 18, 2013 - 7:06 pm UTC

Not sure why. I ran the same process in production and I see 4 threads processing the chunks in parallel.

The test database is still processing the chunks (with three threads only).

I will get the sqlplus details from the test database, which runs with three threads.

I think the number of sessions grouped on the target table tells how many parallel sessions get opened up to process the chunks in parallel.

The only difference I see between the test database and production is the CPU count:

It is 32 in the test database vs 48 in production.
Tom Kyte
July 19, 2013 - 12:51 pm UTC

cpu count is not relevant.

job_queue_processes is.

make sure there are 4
make sure 4 are running at the OS level (ps)
make sure no one else has the 4th one doing something

Hello Tom

Umakanth, July 29, 2013 - 3:04 pm UTC

Hi,


I am currently using Oracle 11gR2

I am developing a parallel process using the dbms_parallel_execute package. I tried to create chunks by using DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL('mytask', l_chunk_sql, false) but I am unsuccessful to create a required chunk_sql. As a workaround I created a number column to get the start_id and end_id and am using dbms_parallel_execute.create_chunks_by_number_col(),
but to generate this number (ID) column I am using a PL/SQL block to generate the start_id and end_id. If I can achieve it using SQL then it will improve performance a little bit more.


Table A

Col1   col2   ....   coln   ID
123    100                  1
123    100                  1
123    100                  1
456    101                  2
456    101                  2
456    101                  2
789    102                  3
789    102                  3
789    102                  3

As I mentioned above, I need to get the same ID number for each group of col1 values, and an ID value cannot be reused for a different group. Is there any way to achieve this in a SQL query?


To achieve this I am using the PL/SQL below:

declare
   cursor c1 is select * from tableA;
   v_id number := 0;
begin
   for i in c1 loop
      v_id := v_id + 1;
      update tableA set id = v_id where col1 = i.col1;
   end loop;
end;


Please advise


Thanks

Tom Kyte
August 02, 2013 - 5:35 pm UTC

... but I am unsuccessful to create a required chunk_sql. ....

what does unsuccessful mean?

you sort of need to give an example, just like I do - from start to finish - create table, generate data, show us what you are doing and what unsuccessful means.

Same person as above

Umakanth, July 31, 2013 - 8:56 am UTC

DENSE_RANK() over (order by col1) worked. Please ignore the above question.
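For the record, a single-statement sketch of that ID assignment against the poster's tableA:

merge into tableA t
using ( select rowid rid,
               dense_rank() over (order by col1) new_id
          from tableA ) s
   on ( t.rowid = s.rid )
 when matched then
    update set t.id = s.new_id;

One pass over the table replaces the row-by-row update loop.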

insufficient privilege error when I run a concurrent program

Ravi, August 20, 2013 - 6:47 pm UTC

Hi Tom,

When I try to run a similar example to the one you gave through a concurrent program, I get the following error:
'ERROR: Unexpected error while re-processing data: ORA-27486: insufficient privileges'.

The chunks are created, the task is also created but the run_task is throwing the error.

Please help me.

here is my code snippet


-----------------------------------------------------------------------------------------------------
--Procedure called by XXEXPD_OM_OMS_VALIDATE_IMPORT concurrent program which is invoked every 5 minutes
-----------------------------------------------------------------------------------------------------
PROCEDURE IMPORT_ORDER_VALIDATE_CONC( retcode OUT VARCHAR2,
errbuf OUT VARCHAR2,
p_processed_flag IN VARCHAR2,
p_debug_flag IN VARCHAR2,
p_chunk_size IN NUMBER,
p_parallel_threads IN NUMBER
)
IS
lv_chunk_size NUMBER:=0;
lv_parallel_threads NUMBER :=0;
/* chunk data and do things in parallel */

BEGIN
lv_chunk_size :=p_chunk_size;
lv_parallel_threads :=p_parallel_threads;
--STEP 1 CHUNK DATA
CHUNK_ORDER_HDR_REC_TASK(lv_chunk_size);

--STEP 2 Execute the task
PROCESS_CHUNK_ORDER_VAL_REC(lv_parallel_threads);

--STEP 3 Drop Task
CHUNK_ORDER_HDR_DROP_TASK;
EXCEPTION
WHEN OTHERS
THEN
retcode := 1;
errbuf := 'Unexpected error while re-processing data: ' || SQLERRM;
fnd_file.put_line (
fnd_file.output,
'ERROR: Unexpected error while re-processing data: ' || SQLERRM);
RETURN;

END IMPORT_ORDER_VALIDATE_CONC;


-----------------------------------------------------------------------------------------------------
--Procedure called to chunk the data in the staging header table
-----------------------------------------------------------------------------------------------------
PROCEDURE CHUNK_ORDER_HDR_REC_TASK (p_chunk_size IN NUMBER)
IS
BEGIN
   dbms_parallel_execute.create_task ('PROCESS ORDER HEADER TABLE');
   dbms_parallel_execute.create_chunks_by_rowid (
      task_name   => 'PROCESS ORDER HEADER TABLE',
      table_owner => 'XXEXPD',
      table_name  => 'XXEXPD_OM_OMS_ORDER_HDR_T',
      by_row      => FALSE,
      chunk_size  => p_chunk_size );
END;


-----------------------------------------------------------------------------------------------------
--Procedure called to run the task to validate the staging header table ( chunked data)
-----------------------------------------------------------------------------------------------------
PROCEDURE PROCESS_CHUNK_ORDER_VAL_REC (p_parallel_threads IN NUMBER)
IS
BEGIN
   dbms_parallel_execute.run_task (
      task_name      => 'PROCESS ORDER HEADER TABLE',
      sql_stmt       => 'begin XXEXPD_OM_NWORDERIMPORT_PKG.WRAPPER_VALIDATE_ORDER( :start_id, :end_id ); end;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => p_parallel_threads );
END;

-----------------------------------------------------------------------------------------------------
--Procedure called to run the task to validate the staging header table ( chunked data)
-----------------------------------------------------------------------------------------------------
PROCEDURE PROCESS_CHUNK_ORDER_HDR_REC (p_parallel_threads IN NUMBER)
IS
BEGIN
   dbms_parallel_execute.run_task (
      task_name      => 'PROCESS ORDER HEADER TABLE',
      sql_stmt       => 'begin XXEXPD_OM_NWORDERIMPORT_PKG.WRAPPER_CALL_PROCESS_ORDER( :start_id, :end_id ); end;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => p_parallel_threads );
END;
-----------------------------------------------------------------------------------------------------
--Wrapper procedure for the validate procedure
-----------------------------------------------------------------------------------------------------
PROCEDURE WRAPPER_VALIDATE_ORDER ( p_lo_rid IN ROWID,
                                   p_hi_rid IN ROWID )
IS
   p_processed_flag VARCHAR2(1) := 'N';
   lv_status        VARCHAR2(10);
   lv_error_msg     VARCHAR2(3000);
BEGIN
   VALIDATE_ORDER (p_lo_rid,
                   p_hi_rid,
                   p_processed_flag,
                   lv_status,
                   lv_error_msg);
END;

thanks,
Ravi
Tom Kyte
August 28, 2013 - 5:37 pm UTC

EXCEPTION
WHEN OTHERS
THEN
retcode := 1;
errbuf := 'Unexpected error while re-processing data: ' || SQLERRM;
fnd_file.put_line (
fnd_file.output,
'ERROR: Unexpected error while re-processing data: ' || SQLERRM);


how much do I hate your code :( why don't you have a RAISE after this?????


see:

http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/d_parallel_ex.htm#CHDGJGDF

you need to have CREATE JOB granted directly to the owner of this procedure.
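A sketch of the two fixes (assuming, as in your code, that XXEXPD owns the package):

EXCEPTION
   WHEN OTHERS THEN
      retcode := 1;
      errbuf  := 'Unexpected error while re-processing data: ' || SQLERRM;
      fnd_file.put_line (
         fnd_file.output,
         'ERROR: Unexpected error while re-processing data: ' || SQLERRM);
      RAISE;   -- re-raise so the failure is not silently swallowed

and, as a suitably privileged user:

GRANT CREATE JOB TO xxexpd;   -- a direct grant, not via a role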

Changing job_prefix in dbms_parallel_execute

Yossi, December 28, 2014 - 9:57 am UTC

Hi,
I understand that the task prefix can be generated by:
dbms_parallel_execute.generate_task_name;

But the prefix of the jobs that get created is hardcoded as "TASK$_", as can be seen in DBA_PARALLEL_EXECUTE_TASKS.

I know that dbms_scheduler has the option to change its prefix via
dbms_scheduler.generate_job_name( prefix => 'MY_EXAMPLE_JOB_');

and I see that dbms_parallel_execute.run_internal_worker takes job_name as one of its parameters:

procedure run_internal_worker( task_name in varchar2,
                               job_name  in varchar2 );

Is there a way to change the job prefix?

Lower version mechanism is missing

User, October 06, 2015 - 5:43 am UTC

I read this thread a while back. As I recall, Tom also explained do-it-yourself parallelism for versions prior to 11g, but that part now seems to be missing. Can you please add it again?
Connor McDonald
October 06, 2015 - 10:00 am UTC

Sorry - I don't understand. The DIY parallelism is in the original answer - I can still see it, i.e.,

"Do-It-Yourself Parallelism

!Say we have that same process as in the...."

Awesome

Rupesh, January 16, 2017 - 8:30 am UTC

Sorry if I am asking a basic question, but please enlighten me: why does the query below produce different results on every execution, just before dropping the task? Every time, one of the jobs processes chunks heavily while the others do comparatively less work.

SELECT job_name, COUNT (1)
FROM dba_parallel_execute_chunks
WHERE task_name = 'PROCESS BIG TABLE'
GROUP BY job_name;

Even the t2 data distribution varies, anywhere between 20000 and 30000 rows per session.

SELECT session_id, COUNT (*)
FROM t2
GROUP BY session_id
ORDER BY session_id;


Below is the template I'm using for verifying results:

DROP TABLE big_table CASCADE CONSTRAINTS;

DROP TABLE T2 CASCADE CONSTRAINTS;

CREATE TABLE T2
(
ID NUMBER NOT NULL,
TEXT VARCHAR2(128 BYTE) NOT NULL,
SESSION_ID NUMBER
);

CREATE TABLE big_table
AS
SELECT ROWNUM id,
OWNER,
OBJECT_NAME,
SUBOBJECT_NAME,
OBJECT_ID,
DATA_OBJECT_ID,
OBJECT_TYPE,
CREATED,
LAST_DDL_TIME,
TIMESTAMP,
STATUS,
TEMPORARY,
GENERATED,
SECONDARY
FROM all_objects a
WHERE 1 = 0
/

ALTER TABLE big_table
NOLOGGING;

BEGIN
INSERT /*+ append */
INTO big_table
SELECT ROWNUM,
OWNER,
OBJECT_NAME,
SUBOBJECT_NAME,
OBJECT_ID,
DATA_OBJECT_ID,
OBJECT_TYPE,
CREATED,
LAST_DDL_TIME,
TIMESTAMP,
STATUS,
TEMPORARY,
GENERATED,
SECONDARY
FROM all_objects a
WHERE ROWNUM <= 100000;

COMMIT;

END;
/

ALTER TABLE big_table
ADD CONSTRAINT big_table_pk PRIMARY KEY (id)
/

BEGIN
DBMS_STATS.gather_table_stats (ownname => USER,
tabname => 'BIG_TABLE',
cascade => TRUE);
END;
/

CREATE OR REPLACE PROCEDURE serial (p_lo_rid IN ROWID, p_hi_rid IN ROWID)
IS
BEGIN
FOR x IN (SELECT object_id id, object_name text
FROM big_table
WHERE ROWID BETWEEN p_lo_rid AND p_hi_rid)
LOOP
-- complex process here
INSERT INTO t2 (id, text, session_id)
VALUES (x.id, x.text, SYS_CONTEXT ('userenv', 'sessionid'));
END LOOP;
END;
/

BEGIN
DBMS_PARALLEL_EXECUTE.drop_task ('PROCESS BIG TABLE');
DBMS_PARALLEL_EXECUTE.create_task ('PROCESS BIG TABLE');
DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid (
task_name => 'PROCESS BIG TABLE',
table_owner => USER,
table_name => 'BIG_TABLE',
by_row => FALSE,
chunk_size => 10000);
DBMS_PARALLEL_EXECUTE.run_task (
task_name => 'PROCESS BIG TABLE',
sql_stmt => 'begin serial( :start_id, :end_id ); end;',
language_flag => DBMS_SQL.NATIVE,
parallel_level => 4);
END;
/


Connor McDonald
January 18, 2017 - 2:02 am UTC

Sorry - I don't see that behaviour

SQL>
SQL> CREATE TABLE T2
  2  (
  3  ID NUMBER NOT NULL,
  4  TEXT VARCHAR2(128 BYTE) NOT NULL,
  5  SESSION_ID NUMBER
  6  );

Table created.

SQL>
SQL> CREATE TABLE big_table
  2  AS
  3  SELECT ROWNUM id,
  4  OWNER,
  5  OBJECT_NAME,
  6  SUBOBJECT_NAME,
  7  OBJECT_ID,
  8  DATA_OBJECT_ID,
  9  OBJECT_TYPE,
 10  CREATED,
 11  LAST_DDL_TIME,
 12  TIMESTAMP,
 13  STATUS,
 14  TEMPORARY,
 15  GENERATED,
 16  SECONDARY
 17  FROM all_objects a
 18  WHERE 1 = 0
 19  /

Table created.

SQL>
SQL> ALTER TABLE big_table
  2  NOLOGGING;

Table altered.

SQL>
SQL> BEGIN
  2  INSERT /*+ append */
  3  INTO big_table
  4  SELECT ROWNUM,
  5  OWNER,
  6  OBJECT_NAME,
  7  SUBOBJECT_NAME,
  8  OBJECT_ID,
  9  DATA_OBJECT_ID,
 10  OBJECT_TYPE,
 11  CREATED,
 12  LAST_DDL_TIME,
 13  TIMESTAMP,
 14  STATUS,
 15  TEMPORARY,
 16  GENERATED,
 17  SECONDARY
 18  FROM dba_objects a
 19  WHERE ROWNUM <= 100000;
 20
 21  COMMIT;
 22
 23  END;
 24  /

PL/SQL procedure successfully completed.

SQL>
SQL> ALTER TABLE big_table
  2  ADD CONSTRAINT big_table_pk PRIMARY KEY (id)
  3  /

Table altered.

SQL>
SQL> BEGIN
  2  DBMS_STATS.gather_table_stats (ownname => USER,
  3  tabname => 'BIG_TABLE',
  4  cascade => TRUE);
  5  END;
  6  /

PL/SQL procedure successfully completed.

SQL>
SQL> CREATE OR REPLACE PROCEDURE serial (p_lo_rid IN ROWID, p_hi_rid IN ROWID)
  2  IS
  3  BEGIN
  4  FOR x IN (SELECT object_id id, object_name text
  5  FROM big_table
  6  WHERE ROWID BETWEEN p_lo_rid AND p_hi_rid)
  7  LOOP
  8  -- complex process here
  9  INSERT INTO t2 (id, text, session_id)
 10  VALUES (x.id, x.text, SYS_CONTEXT ('userenv', 'sessionid'));
 11  END LOOP;
 12  END;
 13  /

Procedure created.

SQL> exec DBMS_PARALLEL_EXECUTE.drop_task ('PROCESS BIG TABLE');
BEGIN DBMS_PARALLEL_EXECUTE.drop_task ('PROCESS BIG TABLE'); END;

*
ERROR at line 1:
ORA-29498: task not found
ORA-06512: at "SYS.DBMS_PARALLEL_EXECUTE_INTERNAL", line 88
ORA-06512: at "SYS.DBMS_PARALLEL_EXECUTE_INTERNAL", line 185
ORA-06512: at "SYS.DBMS_PARALLEL_EXECUTE", line 98
ORA-06512: at line 1


SQL>
SQL> exec DBMS_PARALLEL_EXECUTE.create_task ('PROCESS BIG TABLE');

PL/SQL procedure successfully completed.

SQL>
SQL> BEGIN
  2  DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid (
  3  task_name => 'PROCESS BIG TABLE',
  4  table_owner => USER,
  5  table_name => 'BIG_TABLE',
  6  by_row => FALSE,
  7  chunk_size => 10000);
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL>
SQL> begin
  2  DBMS_PARALLEL_EXECUTE.run_task (
  3  task_name => 'PROCESS BIG TABLE',
  4  sql_stmt => 'begin serial( :start_id, :end_id ); end;',
  5  language_flag => DBMS_SQL.NATIVE,
  6  parallel_level => 4);
  7  END;
  8  /

PL/SQL procedure successfully completed.

SQL> SELECT session_id, COUNT (*)
  2  FROM t2
  3  GROUP BY session_id
  4  ORDER BY session_id;

SESSION_ID   COUNT(*)
---------- ----------
   3773474      28316
   3773476      27159
   3773478      24055
   3773480      20470

4 rows selected.

SQL>
SQL> col job_name format a40
SQL> SELECT job_name, COUNT (1)
  2  FROM dba_parallel_execute_chunks
  3  WHERE task_name = 'PROCESS BIG TABLE'
  4  GROUP BY job_name;

JOB_NAME                                   COUNT(1)
---------------------------------------- ----------
TASK$_7604_2                                      6
TASK$_7604_4                                      8
TASK$_7604_1                                      7
TASK$_7604_3                                      6

4 rows selected.

SQL>


but having said that, you may well see some skew. When chunking by rowid, things like extent sizes and the like can affect how we carve up the distribution. The load should be *roughly* spread across the tasks.
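For what it's worth, the chunk view used above also exposes the rowid ranges and per-chunk status, so you can inspect the carve-up directly:

SELECT chunk_id, start_rowid, end_rowid, status, job_name
FROM   dba_parallel_execute_chunks
WHERE  task_name = 'PROCESS BIG TABLE'
ORDER BY chunk_id;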

Parallel Processing

Jean, January 17, 2018 - 1:20 pm UTC

Hi,

Would like to know if the tables we use for this parallel execution must have PARALLEL processing enabled. Can you please throw some light on that as well?

Regards,
Jean
Chris Saxon
January 17, 2018 - 2:01 pm UTC

You mean using dbms_parallel_execute?

No, this is a "roll your own" method for parallel processing, no need to add the parallel option.

Also, if you're on 11.2+ and using Auto Degree of Parallelism for regular parallel query, the database ignores the table settings:

https://blogs.oracle.com/datawarehousing/configuring-and-controlling-auto-dop

Thanks for the response but I have a question

Jean, January 18, 2018 - 5:07 pm UTC

Hi,

Thank you for your quick response. I admit that my query was not clear. Let me add the details:

Requirement:
Create a custom table that pulls data from two huge base tables through a select query along with other data and additional who columns. The select query would fetch almost 48 million records. This takes a long time to execute.

Thanks to this post, I got an idea about processing in parallel the records that I will be populating into the table I just spoke of - requirement 2 (processing the records from the custom table).

But concerning the initial requirement of populating the table with huge data, I came across one of your threads:

https://asktom.oracle.com/pls/asktom/f?p=100:11:::::p11_question_id:1415454871121

So I am looking forward to putting it to use. But first I would like to get your suggestion on using the parallel processing approach for populating the data.

My pseudo code:

SET SERVEROUTPUT ON;
DECLARE
   CURSOR cur_rec IS
      SELECT /*list of columns*/
      FROM   ##### cii,
             ##### msib
      WHERE  1 = 1
      AND    /*join conditions*/
      AND    /*other essential conditions*/
      AND    EXISTS (
                SELECT *
                FROM   ##### msn
                WHERE  /*join conditions and other conditions*/ )
      ORDER BY 1 DESC;

   TYPE t_cur_typ IS TABLE OF cur_rec%ROWTYPE INDEX BY PLS_INTEGER;

   l_cur_rec     t_cur_typ;
   l_count       NUMBER;
   l_prog_status VARCHAR2(100);
   l_msg         VARCHAR2(100);
   l_limit       NUMBER := 100;
   l_cnt         NUMBER := 0;
   l_err_cnt     NUMBER;
   dml_errors    EXCEPTION;
   -- ORA-24381 is the "error(s) in array DML" raised by FORALL ... SAVE EXCEPTIONS
   PRAGMA exception_init(dml_errors, -24381);
BEGIN
   OPEN cur_rec;
   LOOP
      FETCH cur_rec BULK COLLECT INTO l_cur_rec LIMIT l_limit;
      EXIT WHEN l_cur_rec.COUNT = 0;   -- avoid a FORALL over an empty collection
      BEGIN
         FORALL x IN l_cur_rec.FIRST .. l_cur_rec.LAST SAVE EXCEPTIONS
            INSERT INTO ##### (
               /*list of required columns*/
            )
            VALUES (
               l_cur_rec(x).column1,
               l_cur_rec(x).column2,
               /*.......*/
               SYSDATE,
               SYSDATE );
         COMMIT;
      EXCEPTION
         WHEN dml_errors THEN
            l_err_cnt := SQL%BULK_EXCEPTIONS.COUNT;
            l_cnt     := l_cnt + l_err_cnt;
            FOR i IN 1 .. l_err_cnt LOOP
               dbms_output.put_line (
                  'Error occurred during iteration '
                  || SQL%BULK_EXCEPTIONS(i).ERROR_INDEX
                  || ' Oracle error is '
                  || SQL%BULK_EXCEPTIONS(i).ERROR_CODE );
            END LOOP;
      END;
   END LOOP;
   CLOSE cur_rec;
END;

Please let me know what best I can do using either the 'append' or 'parallel' clause - I am still not very clear about this and I earnestly require your suggestions.

And for the second requirement, I read the 'Do-It-Yourself Parallelism' section and I would like to know if I can put it to use / execute it as a concurrent program.

Hope I have conveyed my point; I would certainly be grateful for your help.

Regards,
Jean
Chris Saxon
January 19, 2018 - 11:07 am UTC

Why the need for bulk collection?

An insert ... select ... is likely to be faster:

insert into #### ( /*list of required columns*/ )
  select /*list of columns*/ 
  from   ##### cii , 
         ##### msib 
  where  1 =1 
  and    /*join conditions*/ 
  and    /*other essential conditions*/ 
  and    exists  (
    select * 
    from   ##### msn 
    where  /*join conditions and othre conditions*/ 
  )


You could then use regular parallel query/DML to load the data, instead of a long-winded DIY method.

If you need to capture failing rows, you can use the log errors clause to catch these:
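A minimal sketch, assuming your target table is called T1:

BEGIN
   DBMS_ERRLOG.create_error_log ( dml_table_name => 'T1' );   -- creates ERR$_T1
END;
/

INSERT INTO t1 ( /*list of required columns*/ )
  SELECT /*list of columns*/
  FROM   ...
  LOG ERRORS INTO err$_t1 ('load 1') REJECT LIMIT UNLIMITED;

Failing rows land in ERR$_T1 along with the Oracle error, instead of failing the whole statement.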


Thanks for the response but I have a question

Jean, January 18, 2018 - 5:19 pm UTC

Adding the table creation script, in addition to the previous details:

CREATE TABLE /*custom table*/
(
/*List of required columns*/
attribute1 VARCHAR2(100),
attribute2 VARCHAR2(100),
attribute3 VARCHAR2(100),
attribute4 VARCHAR2(100),
attribute5 VARCHAR2(100),
creation_date DATE,
created_by VARCHAR2(20),
last_update_date DATE,
last_update_by VARCHAR2(20),
last_update_login NUMBER
);

Thanks for the response but I have a question

Jean, January 19, 2018 - 4:44 pm UTC

Thank you again, but as you see, I have other columns that I would like to insert: the ID column from a sequence and then the WHO columns.
That's the reason I used bulk collect and forall. I have to load almost 48 million records from the base tables into the custom table (the script I posted yesterday). So I am seeking advice from you all; I also see that I am encountering the following error:

Error report -
ORA-24381: error(s) in array DML
ORA-06512: at line 52
24381. 00000 - "error(s) in array DML"
*Cause: One or more rows failed in the DML.
*Action: Refer to the error stack in the error handle.

I actually want to load this table in parallel but only with unique records; how can I do that? I tried adding the custom table to the select query of the cursor, to check that rows already existing in the custom table are not picked up by the cursor again, but that didn't help much.

Thanks for the video, I'll incorporate it, but I need all the unique data to be populated in the custom table without unique constraint violations during parallel processing. Need your help.

Regards,
Jean
Chris Saxon
January 19, 2018 - 4:59 pm UTC

Thank you again, but as you see, I have other columns that I would like to insert, the ID column with the sequence and then the WHO columns.

I don't understand what the issue is. You can include sequences, functions, etc. in your select!

insert into #### ( /*list of required columns*/ )
  select /*list of columns*/, seq.nextval, sysdate, user 
  from   ##### cii , 
         ##### msib 
  where  1 =1 
  and    /*join conditions*/ 
  and    /*other essential conditions*/ 
  and    exists  (
    select * 
    from   ##### msn 
    where  /*join conditions and othre conditions*/ 
  )

Thanks for the response but I have a question

Jean, January 19, 2018 - 5:05 pm UTC

Great!! I'll try that. So in this case, can I omit the bulk collect and forall loop for faster insertion?

How can I do parallel processing for the insert into the custom table?

Also I tried 'exec dbms_errlog.create_error_log('owner.table##');' - the procedure completed successfully, but I was not able to describe or query the table; it threw an error saying:

Error report -
ORA-44002: invalid object name
ORA-06512: at "SYS.DBMS_ASSERT", line 383
ORA-06512: at "SYS.DBMS_ERRLOG", line 114
ORA-06512: at line 1
44002. 0000 - "invalid object name"
*Document: Yes
*Cause: The input parameter string was not a qualified
SQL identifier of an existing SQL object.
*Action: Make sure the string matches an existing SQL object that
is accessible by the current schema.


Can you please help me with what hint i can use for parallel processing of inserts and also the parallel statement that I can use during custom table creation.

Thanks a ton for your response. It's a blessing to me.

Regards,
Jean

Please Review - code not working need suggestions

Jean, January 19, 2018 - 10:03 pm UTC

Hi,

I tried the following, but the rows were not inserted; I understand that I have missed something. Please suggest:

SET SERVEROUTPUT ON;
CREATE OR REPLACE PROCEDURE p1
IS
   l_count       NUMBER;
   l_prog_status VARCHAR2(100);
   l_msg         VARCHAR2(100);
   l_limit       NUMBER := 10000;
   l_cnt         NUMBER := 0;
   l_err_cnt     NUMBER;
BEGIN
   DBMS_OUTPUT.PUT_LINE('Begin');
   BEGIN
      INSERT /*+ PARALLEL(SPRN.SPRN_CSI_INST_DF_TBL_J4,2) */
      INTO t1 (
         --columns
      )
      ( SELECT /*+ PARALLEL(4) */
               --columns
        FROM   ## cii,
               ## msib
        WHERE  1 = 1
        AND    -- join and other conditions
        AND    EXISTS
               --conditions
        AND    ROWNUM < 100000 );
      COMMIT;
      DBMS_OUTPUT.PUT_LINE('Within loop after insertion and commit');
   END;
END p1;
/

BEGIN
   dbms_parallel_execute.create_task ('task1');
   dbms_parallel_execute.create_chunks_by_rowid (
      task_name   => 'task1',
      table_owner => USER,
      table_name  => 't1',
      by_row      => FALSE,
      chunk_size  => 10000 );
END;
/

BEGIN
   dbms_parallel_execute.run_task (
      task_name      => 'task1',
      sql_stmt       => 'BEGIN p1; END;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 4 );
END;
/

The procedure executed, but no records were inserted.

Regards,
Jean
Connor McDonald
January 22, 2018 - 2:18 am UTC

You need to understand the way dbms_parallel_execution is designed to work.

If I have task "X" I need to perform against a huge table, it *might* be faster to run several concurrent executions of that task, with each one accessing a *subset* of the table.

For example - lets say my table T has 100 million records, and every record has a sequence number from 1 to 100,000,000. And I need to update every single row.

To do this faster *manually* I might fire up 4 sqlplus sessions and do:

sqlplus session 1
=================
update T set ...
where seq between 1 and 25,000,000

sqlplus session 2
=================
update T set ...
where seq between 25,000,001 and 50,000,000

sqlplus session 3
=================
update T set ...
where seq between 50,000,001 and 75,000,000

sqlplus session 4
=================
update T set ...
where seq between 75,000,001 and 100,000,000


That is all that dbms_parallel_execute does. You give it

- a task to do (in my case it was an update, but it can be a procedure call etc)
- a means to carve up the large table.
- a way of passing that information to each session

Please have a read of Tim's excellent blog post on this here

https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2

and you'll see that you need to pass parameters to your P1 procedure, and use those parameters to access subsets of the table.
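For example - a sketch only, with placeholder table and column names - the procedure takes the chunk boundaries and actually uses them, and the chunks are created on the table being *read*:

CREATE OR REPLACE PROCEDURE p1 ( p_lo_rid IN ROWID, p_hi_rid IN ROWID )
IS
BEGIN
   INSERT INTO t1 ( /* columns */ )
   SELECT /* columns */
   FROM   source_table
   WHERE  ROWID BETWEEN p_lo_rid AND p_hi_rid;   -- only this chunk's rows
   COMMIT;
END;
/

BEGIN
   dbms_parallel_execute.create_task ('task1');
   dbms_parallel_execute.create_chunks_by_rowid (
      task_name   => 'task1',
      table_owner => USER,
      table_name  => 'SOURCE_TABLE',   -- the table being read, not the target t1
      by_row      => FALSE,
      chunk_size  => 10000 );
   dbms_parallel_execute.run_task (
      task_name      => 'task1',
      sql_stmt       => 'begin p1( :start_id, :end_id ); end;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 4 );
END;
/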

Thanks for the response but I have a question

Jean, January 23, 2018 - 8:34 am UTC

Hi,

Thank you for the response. I would like to speed up the insertion, so I would like to know if the following script would serve the purpose. Also, I am not sure whether I can use the /*+ PARALLEL */ hint for both the insert and select clauses, or whether I should use /*+ APPEND */ in the insert clause alone.

I will pass the parameters though:
SET SERVEROUTPUT ON;
CREATE OR REPLACE PROCEDURE p1
IS
   l_count       NUMBER;
   l_prog_status VARCHAR2(100);
   l_msg         VARCHAR2(100);
   l_limit       NUMBER := 10000;
   l_cnt         NUMBER := 0;
   l_err_cnt     NUMBER;
BEGIN
   DBMS_OUTPUT.PUT_LINE('Begin');
   BEGIN
      INSERT /*+ PARALLEL(SPRN.SPRN_CSI_INST_DF_TBL_J4,2) */
      INTO t1 (
         --columns
      )
      ( SELECT /*+ PARALLEL(4) */
               --columns
        FROM   ## cii,
               ## msib
        WHERE  1 = 1
        AND    -- join and other conditions
        AND    EXISTS
               --conditions
        AND    ROWNUM < 100000 );
      COMMIT;
      DBMS_OUTPUT.PUT_LINE('Within loop after insertion and commit');
   END;
END p1;
/


Require your suggestions.

Regards,
Jean

Awaiting Response

Jean, January 24, 2018 - 9:33 am UTC

Hi Team,

I am not sure if I have misunderstood the concept; I do have questions posted, and I sincerely seek your suggestions. As mentioned in one of your threads regarding 'Huge Insert', I tried using a single insert statement without cursors or loops, but that did not work for me. Need your suggestions.

Regards,
Jean
Connor McDonald
January 24, 2018 - 1:51 pm UTC

Did you look at the link we sent you ?

*Every* example in that link, plus the review we sent you, talks about the need to have two parameters (the lower and upper bound of the chunk).

Yet every time you post "P1" it never moves on from the original.

We can't write the code for you - you have to meet us halfway, and show us that you're looking at the examples we show or link to, and that you are trying some things yourself here.

Question on Insert

Jean, February 01, 2018 - 1:11 pm UTC

Hi,

I had tried inserting into a table, as follows:

BEGIN
   dbms_parallel_execute.run_task (
      task_name => 'task1',
      sql_stmt  => 'DECLARE
                       l_start_id ROWID := :START_ID;
                       l_end_id   ROWID := :END_ID;
                       l_cnt      NUMBER;
                    BEGIN
                       INSERT /*+ append */
                       INTO t1 (
                          /* column list */
                       )
                       ( SELECT /* column list */
                         FROM   cii,
                                msib
                         WHERE  1 = 1
                         AND    /* join conditions and other conditions*/ );
                       COMMIT;
                    END;',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 5 );
END;
/

But the table got populated with 177 million rows, of which only 44 million were distinct; the rest were redundant. The above code executed for more than 24 hours and then I killed it. How can I avoid redundant data with this approach?


Chris Saxon
February 01, 2018 - 4:43 pm UTC

Presumably there's a problem with the query:

SELECT /* column list */ 
FROM cii , 
msib 
WHERE 1 =1 
AND /* join conditions and other conditions*/ 


Verify that's returning the correct information.

Though I don't see why you're using DIY parallel for this. Surely regular parallel query/DML is good enough here?
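For reference, a minimal sketch of that alternative (the degree of 4 is arbitrary; parallel DML must be enabled in the session, and the direct-path insert needs a commit before the table can be queried again):

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t1, 4) */ INTO t1 ( /* column list */ )
SELECT /*+ PARALLEL(cii, 4) PARALLEL(msib, 4) */ /* column list */
FROM   cii,
       msib
WHERE  /* join conditions and other conditions */;

COMMIT;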


Question on Insert

A reader, February 01, 2018 - 6:08 pm UTC

Hi,

Thanks for the reply; I am trying to insert the data in the fastest way. There are around 46M records that have to be inserted.
My main requirement is to load the table without redundant data.
Now I tried this approach: (Ref - https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1132417600346069010 )

CREATE TABLE t1
as SELECT /* column list */
FROM csi.csi_item_instances cii ,
apps.MTL_SYSTEM_ITEMS_B msib
WHERE 1 =1
AND /* join conditions and other conditions*/ ;

I get that my questions are unstructured and that I am trying any number of things, but I am just beginning and am trying to cover this requirement in the best way I can. I really need your help.


Connor McDonald
February 02, 2018 - 12:10 am UTC

You got redundant data because your statement never used the :start_id / :end_id values to restrict the rows it processed. Hence *every* execution of the slaves did the entire job, so you got duplicates.

CREATE TABLE t1  parallel
as SELECT /*+ parallel */  [columns]
FROM csi.csi_item_instances cii , 
     apps.MTL_SYSTEM_ITEMS_B msib 
WHERE 1 =1 
AND  [join conditions and other conditions]


is probably where you want to be here

