Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Sonu.

Asked: January 13, 2005 - 2:45 pm UTC

Last updated: December 24, 2006 - 9:00 am UTC

Version: 9.2.4

Viewed 1000+ times

You Asked

I would like to have your comments on what would be a good solution for the following problem:

Let an Employee table have 500,000 employee_ids and an Salary table have their corresponding salary amounts.

From a Java program using 10 threads(for example) making jdbc calls I would like to assign 50,000 employee_ids to each thread and each of these threads would independently update salaries of 50,000 employee they correspond to.

Do I have to use the pagination example and create select statement for each thread in java code or there is a better way in which , on database end, I will get all the 500,000 employee_ids and then assigns 50,000 to each thread so that they can update the salary.


and Tom said...

why would you have to do such a small amount of work in parallel at all??

ops$tkyte@ORA9IR2> create table emp
2 as
3 select id empno, 0 sal
4 from big_table.big_table
5 where rownum <= 500000;

Table created.


ops$tkyte@ORA9IR2> create table emp_sals
2 as
3 select id empno, object_id new_sal
4 from big_table.big_table
5 where rownum <= 500000;

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table emp_sals add constraint emp_sals_pk primary key(empno);

Table altered.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set timing on
ops$tkyte@ORA9IR2> update (select sal, new_sal
2 from emp, emp_sals
3 where emp.empno = emp_sals.empno )
4 set sal = new_sal;

500000 rows updated.

Elapsed: 00:00:33.08
ops$tkyte@ORA9IR2>


that was on a tiny desktop machine, nothing fancy I would not be writing code at all (complex multi-threaded code in particular) to do something that takes seconds to accomplish.


Or, if you feel "parallel must be done", use the database:

ops$tkyte@ORA9IR2> alter table emp parallel 4;
Table altered.

ops$tkyte@ORA9IR2> alter table emp_sals parallel 4;
Table altered.

ops$tkyte@ORA9IR2> alter session enable parallel dml;
Session altered.


Although on my single disk, single cpu system that takes about the same amount of time.



Trying to do this particular thing in your own code by employee id ranges would be not very performant either. You would have 10 things doing a FULL scan of the employee table (either that, or even worse -- 10 RANGE scans of the employee table) and 10 full scans of the employee salary table (or worse, they might use an index) and updating rows next to each other and so on. It would introduce as much contention as you could.


I think a single update is all you need here.

Rating

  (3 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Sonu, January 13, 2005 - 4:23 pm UTC

I think I have over simplified the example, what I wanted to know was how best to use java multi-threading to process a large(eg : 5,000,000) Oracle resultset, because java code knows the complex logic of updating the salary.



Tom Kyte
January 13, 2005 - 5:08 pm UTC

rowid ranges. suggestion: write a single program and run 10 of them. single threaded = comprehendable by humans. multi-thread = technically really cool sounding but geez, when it goes wrong -- bam.

You are going to have 10 separate connections anyhow...


see
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10498431232211 <code>

5,000,000 employees, impressive ;)

I find most logic, when exposed -- is not so complex.... i'd still look at SQL. You'd be surprised what it can do.

Java Threads in Oracle Server Side Database

olerag, December 22, 2006 - 8:27 pm UTC

I've done something similar pertaining to this topic. A Java class with 20 threads (each with its own connection) invokes a PLSQL packaged procedure to retrieve information and insert the results in CLOB rows, used for web search/display capabilities. In so doing, the overall processing time decreased by approx 30%.

Q1. Can you confirm that this manipulation, executed in a single Unix process and using 20 independent connections, is running 20 separate sessions? Enterprise Manager seems to confirm this?!?

Q2. When the Java source is transfered to the database as a stored Java class and executed, only one session for the job appears in Ent Mgr. Do you know if multi-thread Java source runs differently on an Oracle Server-Side database than from a client-side machine?

Note. For Q2 only, the "default" JDBC connection is being utilized.
Tom Kyte
December 24, 2006 - 9:00 am UTC

if you have 20 threads and each thread has it's own connection, well - yes you have 20 separate sessions.

a multi-threaded java application in the database uses a single thread, there is NO concurrency, you do not really want to do that.

Merry XMAS Tom

olerag, December 24, 2006 - 12:10 pm UTC

Can't believe it - XMAS eve (on a Sunday yet) and Tom replies; you absolutely "rock". Thanx for the reply; it is as I expected and appreciate your confirmation. Have a great holiday season!!