Large row by row table update with values from another table using cursor

Chris Saxon

Thanks for the question, Altus.

Asked: November 07, 2012 - 5:29 am UTC

Last updated: September 09, 2016 - 8:23 am UTC

Version: 10.2.0

Viewed 10K+ times

You Asked

Hello,

I have a requirement to update two columns (A, B) in an arbitrarily (very, very) large table. The values that need to be updated rely on values from another table and must be consistent with that table. Since the values in A differ from those in B, I currently have to do two updates to the table for each record in the cursor. This is obviously inefficient, and it can take longer than 12 hours to complete (confirmed). There surely must be some way to improve the process. All my reading and trawling of the web seems to point to using FETCH, BULK COLLECT, LIMIT etc. Is this the way to go? Can you please furnish an example?

See the code example below:
DECLARE
  CURSOR tran IS
    SELECT oid, user_account_number FROM user_account001;
BEGIN
  FOR t IN tran
  LOOP
    -- two single-row updates per fetched record:
    -- one for the payer column, one for the payee column
    UPDATE transaction001
    SET payer_account_number = t.user_account_number
    WHERE payer_account_oid  = t.oid;

    UPDATE transaction001
    SET payee_account_number = t.user_account_number
    WHERE payee_account_oid  = t.oid;
  END LOOP;
END;


Thanks!
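
(For reference: the FETCH / BULK COLLECT / LIMIT pattern the question mentions would look something like the sketch below, reusing the question's tables and columns. It batches the row-by-row work, but is still procedural; the answer that follows replaces it with set-based SQL entirely.)

declare
  cursor tran is
    select oid, user_account_number from user_account001;
  type t_oid_tab  is table of user_account001.oid%type;
  type t_acct_tab is table of user_account001.user_account_number%type;
  l_oids  t_oid_tab;
  l_accts t_acct_tab;
begin
  open tran;
  loop
    -- fetch in batches of 1,000 rows instead of one at a time
    fetch tran bulk collect into l_oids, l_accts limit 1000;
    exit when l_oids.count = 0;

    -- one bulk-bound statement per column instead of one update per row
    forall i in 1 .. l_oids.count
      update transaction001
         set payer_account_number = l_accts(i)
       where payer_account_oid = l_oids(i);

    forall i in 1 .. l_oids.count
      update transaction001
         set payee_account_number = l_accts(i)
       where payee_account_oid = l_oids(i);
  end loop;
  close tran;
end;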

and Tom said...

why is this anything more than:

begin

merge into transaction001 t
using user_account001 ua
on ( t.payer_account_oid = ua.oid )
when matched then update set payer_account_number = ua.user_account_number;

merge into transaction001 t
using user_account001 ua
on ( t.payee_account_oid = ua.oid )
when matched then update set payee_account_number = ua.user_account_number;

end;
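
(An aside: the two merges can also be folded into one pass over transaction001 as a single update with correlated subqueries; a sketch, with nvl() keeping unmatched rows unchanged. Whether it beats the two set-based merges depends on the data and indexes, so test both.)

-- note: this touches every row of transaction001, matched or not,
-- so it generates undo/redo for the whole table
update transaction001 t
   set payer_account_number = nvl( (select ua.user_account_number
                                      from user_account001 ua
                                     where ua.oid = t.payer_account_oid),
                                   t.payer_account_number ),
       payee_account_number = nvl( (select ua.user_account_number
                                      from user_account001 ua
                                     where ua.oid = t.payee_account_oid),
                                   t.payee_account_number );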



Comments

Thank you

Altus, November 07, 2012 - 11:36 pm UTC

Thank you for taking the time to answer my question! I've taken your approach and can see a massive performance increase, nearly halving the execution time.
Tom Kyte
November 08, 2012 - 8:10 am UTC

can you tell me - are you updating most/many of the rows in this table? how many database blocks out of how many do you think this update would affect?

It might be *much* more efficient to:



create table new_table
as
select ....,
       case when ua1.oid is null then t.payer_account_number else ua1.user_account_number end payer_account_number,
       case when ua2.oid is null then t.payee_account_number else ua2.user_account_number end payee_account_number
  from transaction001 t,
       user_account001 ua1,
       user_account001 ua2
 where t.payer_account_oid = ua1.oid(+)
   and t.payee_account_oid = ua2.oid(+);


drop table transaction001;
rename new_table to transaction001;
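
(One caveat: the new table starts with none of the original's indexes, grants, triggers, or constraints apart from NOT NULL ones, so those need to be recreated after the rename.)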



Reply to follow up

Altus, November 09, 2012 - 12:22 am UTC

Hi there,

The goal is to update _all_ the records in the table (t), since all the records in the other table (ua) have been updated/changed, and thus I imagine it will affect most if not all data blocks. For one other table we already chose the create-table-as-select method, since it was MASSIVE, only one column's data needed to change, and it did / does not matter what the new value is. Unfortunately in this case, it very much does. But, as you've now pointed out, it seems the same route can be followed.

Thank you again for providing valuable input. You've certainly broadened my knowledge tenfold over the past couple of days!

Best Regards.
Tom Kyte
November 09, 2012 - 6:33 am UTC

use create table as select.


it'll be a lot better than reading every block of the tables into the buffer cache, generating undo and redo, probably not having sufficient room in the cache to hold everything (hence writing it out) and having to read it back in again (perhaps many times).


the create table as select will skip undo, can skip redo, will just do direct path reads into your PGA and direct path writes out to the table.
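
(A sketch of that rebuild with the redo minimised too. nologging is an assumption here: it needs DBA agreement, is ignored if the database or tablespace forces logging, and leaves the new table unrecoverable until the next backup is taken.)

create table new_table nologging
as
select t.payer_account_oid, t.payee_account_oid,  -- plus the table's other columns
       nvl(ua1.user_account_number, t.payer_account_number) payer_account_number,
       nvl(ua2.user_account_number, t.payee_account_number) payee_account_number
  from transaction001 t, user_account001 ua1, user_account001 ua2
 where t.payer_account_oid = ua1.oid(+)
   and t.payee_account_oid = ua2.oid(+);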

Jess, September 07, 2016 - 11:27 am UTC

Hi Tom,
We have a similar scenario in that we have updates coming into a large(ish) table that we're not sure how to make more efficient.

There's a table, say, 'order details'. It's got about 15 columns, most of them description-type varchar2(256)s. This table does not keep history.
As part of the business process, each record gets updated at least twice after it's been inserted, as follows:
- for a given order id,
- every record in that order gets its status updated (3-char column) by the system (once); usually 1,000 records in 1 shot
- then individual records in that order may get updated by the business one to umpteen times across 1 description column and 3 flags over the next few days
- once this step in the process has finished, the records are never modified again (though occasionally order-based deletes happen as part of another process)

The table is about 500,000 rows now, expected to grow to 10-15M over the course of the project.

NOTE: The first (full order) update we can do in one shot. For individual updates, we cannot receive them together from the application (i.e., even if a user changes 2 records and clicks 'submit', we will get them as 2 individual update statements to 2 records).

We were worried about constantly updating records like that in a table that size, so our original approach was to create a 'draft' table containing just the updateable values. When we insert into 'order details', we insert into 'draft' as well. All the updates are done there.
Then the question became what to do with them at the end of the process--do we write them back to the original table, or do we just take the contents and insert them into a new 'order details final amendments' table... Writing back to the original table would be cleaner, but seems to defeat the purpose.
1. Would doing direct updates in the base table be a problem in the long term (so there's value in having a draft table and potentially not writing back to the base table) or should we just update the base table over and over?
2. If we had to have a draft table for other reasons (say the business don't want to see transient changes in reports run against 'order details'), would that change the decision on what to do with it in the end--write finalised changes back to the base table or keep them in a separate one?

Thank you as always.

Chris Saxon
September 07, 2016 - 7:49 pm UTC

Why can't you merge the two changes into a single update?

I don't see what benefit the draft table gives you. You'll still have the same undo/redo (assuming the same indexes).

If you want to ensure reports view the table at a point in time (before updates) you could look into using Flashback Data Archive. This enables you to view the table at a particular date with "as of timestamp" queries.
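
A sketch of what that could look like; the archive name, tablespace, retention period, and the order_details/order_id names are all assumptions:

create flashback archive orders_fda
  tablespace users
  retention 1 year;

alter table order_details flashback archive orders_fda;

-- reports can then query the table as it stood before the updates:
select *
  from order_details as of timestamp systimestamp - interval '1' day
 where order_id = 12345;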

Chris

Jess, September 08, 2016 - 8:24 am UTC

Hi Chris,
Not entirely sure what you mean by 'merge two changes into a single update', as the two happen at different times by different processes: when an order is open for investigation, the status of every record is updated by the system (the big starting update). Then business users can make some changes to some records, save their work, go away, come back the next day, make changes, etc. Nothing can be combined with anything into a single update other than what gets 'saved' at the same time. They don't tend to update the same record multiple times, but they can update 20 records in an order, saving each one as they go (thus resulting in 20 updates to the db).

The draft table is very skinny--only the fields that can be updated--but the indexes are mostly the same; no one searches on the bulk of the columns in the main table, they're for display only. We thought the draft table would give us less 'churn' on the main table, as the main table will be big. The draft table could be updated over and over, but it would stay small, as records are deleted once the investigation/updating of details is done. Are you saying that won't matter?

We would rather not get into flashback data archive, as reports are run by multiple applications against the db; some are not Oracle and just want to run a simple query against a table. We are not in control of those.

Chris Saxon
September 09, 2016 - 8:23 am UTC

Sorry, I misread your statement about updates.

If the draft table is skinny (compared to the original) you may have less row chaining/migration as a result of the updates. But it depends upon what you're doing. Otherwise the work Oracle does updating the skinny table will be similar or the same as the work it does updating the wide table.

And with the draft table you will have the extra work of inserting to and deleting from it. And the general development overhead of managing this.

I just don't see how the draft table helps you in a substantial way. It makes your application more complicated. Which is generally A Bad Thing.

So unless you can prove that using a draft table gives a big reduction in undo/redo or other metrics I wouldn't bother.
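
A quick way to compare, as a sketch: snapshot your session's statistics before and after running each approach and diff the values. v$mystat and v$statname are standard dynamic performance views, though you need SELECT privilege on them.

select sn.name, ms.value
  from v$mystat ms
  join v$statname sn on sn.statistic# = ms.statistic#
 where sn.name in ('redo size', 'undo change vector size');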

Chris