Skip to Main Content
  • Questions
  • Multithreaded Processing on Partitioned Table

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Debashis.

Asked: June 25, 2008 - 2:31 am UTC

Last updated: June 08, 2009 - 12:16 pm UTC

Version: 10.1.0.4

Viewed 1000+ times

You Asked

Hi Tom
Good Morning.
I've a partitioned table(let's say TAB_1) and a stored procedure(let's say SP_1). Could you please suggest me a way through which SP_1 can concurrently(Multithreaded) process all the partitions of TAB_1 ?
Let me refresh. Suppose, there are 4 partitions, part_1, part_2, part_3 and part_4. Is there any way so that while SP_1 is processing part_1 data, it can also process part_2, part_3 and part_4 simultaneously ?
Thanks in advance for your time.

[Version : Oracle Database 10g Enterprise Edition Release 10.1.0.4.]



Thanks
Debashis


and Tom said...

you can use the job queues or scheduler to do this.

here is an example using the job queues. Just have your process routine accept a pair of partition keys that select out data for ONE partition (we'll eliminate the others and do the right thing). do not be tempted to use dynamic sql and the extended partition name, that'll just make the code really ugly.


ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2008','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2008','dd-mon-yyyy')) ,
 11    PARTITION part3 VALUES LESS THAN (to_date('15-mar-2008','dd-mon-yyyy'))
 12  )
 13  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> insert into t
  2  select to_date('12-mar-2008','dd-mon-yyyy')+mod(rownum,3), object_id, object_name
  3    from all_objects
  4  /

49924 rows created.

ops$tkyte%ORA10GR2> select 'part1', count(*) from t partition(part1) union all
  2  select 'part2', count(*) from t partition(part2) union all
  3  select 'part3', count(*) from t partition(part2);

'PART   COUNT(*)
----- ----------
part1      16641
part2      16642
part3      16642

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table job_parms
  2  ( job        number primary key,
  3    lodate     date,
  4    hidate     date,
  5    scheduled  date,
  6    completed  date
  7  ) organization index;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace procedure process_data( p_job in number )
  2  as
  3      l_rec job_parms%rowtype;
  4  begin
  5      select * into l_rec from job_parms where job = p_job;
  6
  7      for x in (select rowid rid, y from t where dt >= l_rec.lodate and dt < l_rec.hidate)
  8      loop
  9          update t set y = lower(x.y) where rowid = x.rid;
 10      end loop;
 11
 12      update job_parms set completed = sysdate where job = p_job;
 13      commit;
 14  end;
 15  /

Procedure created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace procedure submit_jobs
  2  as
  3      l_job number;
  4      l_dt  date := sysdate;
  5      l_n   number;
  6  begin
  7      for i in 1 .. 3
  8      loop
  9          dbms_job.submit( l_job, 'process_data( JOB );' );
 10          insert into job_parms (job,lodate,hidate,scheduled)
 11          values ( l_job,
 12                   to_date('12-mar-2008','dd-mon-yyyy')+i-1,
 13                   to_date('12-mar-2008','dd-mon-yyyy')+i,
 14                   l_dt );
 15      end loop;
 16
 17      commit; -- they can run now....
 18      loop
 19          dbms_lock.sleep(3);
 20          select count(*) into l_n from job_parms where scheduled = l_dt and completed is null and rownum = 1;
 21          exit when (l_n = 0);
 22      end loop;
 23  end;
 24  /

Procedure created.

ops$tkyte%ORA10GR2> show parameter job_queue

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
job_queue_processes                  integer     10
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select count( case when lower(y) = y then 1 end ) already_lower, count(*) from t;

ALREADY_LOWER   COUNT(*)
------------- ----------
          890      49924

ops$tkyte%ORA10GR2> exec submit_jobs

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select count( case when lower(y) = y then 1 end ) now_lower, count(*) from t;

 NOW_LOWER   COUNT(*)
---------- ----------
     49924      49924

ops$tkyte%ORA10GR2> select * from job_parms;

       JOB LODATE               HIDATE               SCHEDULED            COMPLETED
---------- -------------------- -------------------- -------------------- --------------------
       248 12-mar-2008 00:00:00 13-mar-2008 00:00:00 25-jun-2008 09:32:14 25-jun-2008 09:32:21
       249 13-mar-2008 00:00:00 14-mar-2008 00:00:00 25-jun-2008 09:32:14 25-jun-2008 09:32:21
       250 14-mar-2008 00:00:00 15-mar-2008 00:00:00 25-jun-2008 09:32:14 25-jun-2008 09:32:21





I liked this question :) A nice "how do I do this" question ....

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

I liked this answer :)

Laurent Schneider, June 25, 2008 - 10:33 am UTC

A nice "how do I do this" answer ....

Using dbms_scheduler instead of job queues

Joe, June 26, 2008 - 9:38 am UTC

Our DBA is requiring that we use dbms_scheduler on our 10gR2 database instead of the job queues. I agree with him on this, but I have problems converting examples using the job queues to dbms_scheduler.

A lot of this example will probably contain the same code no matter which process (job queues or dbms_scheduler) is used, but I'm sure there are some minor (but important) differences.

Could you comment on the changes to this excellent example if one were to use the dbms_scheduler instead of the job queues?

Thanks.
Tom Kyte
June 26, 2008 - 4:20 pm UTC

Not all can be converted. The dbms_scheduler package has a deficiency in my eyes.

It commits. It is not transactional, there are many places you cannot use the scheduler safely.

So, I disagree with your DBA, even on 11g, I constantly use dbms_job. In fact, dbms_job in 11g uses the same basic framework as dbms_scheduler.


A reader, June 26, 2008 - 9:44 am UTC

Where do you get the value for l_job (248,249,250)?
Tom Kyte
June 26, 2008 - 4:21 pm UTC

dbms_job.submit - right in the code, you can see me get it:

  2  as
  3      l_job number;
  4      l_dt  date := sysdate;
  5      l_n   number;
  6  begin
  7      for i in 1 .. 3
  8      loop
  9          dbms_job.submit( l_job, 'process_data( JOB );' );
 10          insert into job_parms (job,lodate,hidate,scheduled)
 11          values ( l_job,
 12                   to_date('12-mar-2008','dd-mon-yyyy')+i-1,
 13                   to_date('12-mar-2008','dd-mon-yyyy')+i,
 14                   l_dt );
 15      end loop;



dbms_job CREATED the job id, it is an out parameter.

parallel PL/SQL processing

Fredx, June 05, 2009 - 6:26 am UTC

Hi Tom,

I agree with you regarding your comments on using dbms_job package even in 11g version of Oracle database, but I¿m concerned because Oracle is constantly direct us to replace dbms_job package with a newer one dbms_scheduler.

How can I possibly in that case continue to write my parallel PL/SQL procedures which are all based on dbms_job and dbms_pipe packages?

Should I start to learn Oracle Streams to continue to write parallel PL/SQL procedures?

Thanks for your answer and I hope you¿ll wisit Croatia soon.




Tom Kyte
June 08, 2009 - 12:16 pm UTC

streams?? no. not sure how you put them together....


dbms_jobs is now a thin layer on top of the scheduler. and you could use the scheduler directly for running your plsql in parallel if you wanted (it runs things in the background like jobs)

More to Explore

VLDB

If you are new to partitioning, check out Connor McDonald's introduction series here.

VLDB

Documentation set on VLDB and Partitioning.