  • Trying to avoid fetching across a commit


Question and Answer

Tom Kyte

Thanks for the question, Craig.

Asked: February 03, 2003 - 2:48 pm UTC

Last updated: October 18, 2006 - 8:01 am UTC

Version: 9.0.1.2

Viewed 1000+ times

You Asked

Hi, Tom -

I'm working on our nightly ETL process, grabbing all the prod data that is new or newly updated from the day before. Of course this means the primary key from the source system is no longer unique. My insert uses a sequence to generate a new primary key.

Now the code I have inherited loops through a cursor and commits every 1000 rows. I've been reading this site for some time, and I know you loathe fetching across a commit, as it leads to ORA-01555 and other such calamities.

Option A) So I remove the loop, and thereby the commit interval, and do it as one big bulk insert (as many as 100k rows on some nights)

Option B) I loop through the entire cursor and insert row by row, and then commit once I'm done.

Both options work, and both result in my DBA saying "you need to commit every n rows!! A 100k-row unit of work is too big for one commit." What can/should I do? Is there some option I've yet to consider? Your word carries weight around here...maybe you can convince him.

Thanks for all your help, Tom.

and Tom said...

your DBA needs to worry about backup and recovery and leave transactional integrity to you as they do not have a clue in that regard (apparently).

A unit of work is never dictated by SIZE -- it is dictated by business requirements.

Committing frequently will only:

o increase the probability of an error
o cause you to run slower
o make your code more breakable

is that what the DBA wants to do? Make you run slower, with more errors and stand a chance of logically corrupting the data????


100k rows? that is toy sized -- small, small potatoes, nothing. I do tons more than that on my laptop.


Tell the DBA, Ok, I'll commit every N rows -- N=10,000,000 -- no worries!


If you can, use insert /*+ append */ and bypass some of the UNDO generation (index undo will still be generated)! That is the only thing "committing" would "release"....


You should find option (a) to be the most expedient, easiest to code, trivial to maintain.
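
For what it's worth, here is a minimal sketch of option (a) as a single statement. The table, column and sequence names (SRC_PROD_DATA, DW_TARGET, DW_SEQ, LAST_UPDATED) are made up purely for illustration -- substitute your own:

insert /*+ append */ into dw_target
       ( dw_id, src_pk, col1, col2, load_date )
select dw_seq.nextval,                           -- new surrogate primary key
       s.src_pk, s.col1, s.col2, sysdate
  from src_prod_data s
 where s.last_updated >= trunc(sysdate) - 1;     -- "new or newly updated from the day before"

commit;

One statement, one unit of work, one commit at the end. Whether the direct-path load actually kicks in depends on things like triggers and foreign keys on the target -- but the transactional point stands either way.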








Comments

Trying to avoid fetching across a commit

subbu, February 03, 2003 - 4:28 pm UTC

Hi Tom, I live in an old-database world with 8i and a few 7.3.4 instances, where I face problems with rollback segments depending on the commit frequency. Does this hold good for my environment also? We have ETL jobs running daily and on a monthly basis.

Tom Kyte
February 03, 2003 - 5:03 pm UTC

doesn't matter how old your database is -- it has always been true that frequent commits:

o slow you down
o probably break your data integrity
o make your code harder to write

version 6 on....


All you need to do is actually SIZE your rbs for the work you do. RBS is not "overhead", not something to minimize, not something to skimp on. It is an integral component of the database -- getting it right (sized) is crucial.
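
By way of illustration only (the segment name, tablespace and sizes below are invented -- size to your own workload), "getting it right" in a manual-undo database means creating rollback segments big enough for the batch and then checking what the batch really used:

create rollback segment rbs_big
    tablespace rbs
    storage ( initial 10m next 10m minextents 20 optimal 200m );

alter rollback segment rbs_big online;

-- after running a representative batch, see how much undo it actually wrote:
select n.name, s.writes, s.wraps, s.extends, s.hwmsize
  from v$rollname n, v$rollstat s
 where n.usn = s.usn;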

Conflicting info from Oracle

Craig, February 03, 2003 - 5:40 pm UTC

Thanks for your amazingly quick replies!

I was referred to Oracle PL/SQL Tips & Techniques, pp. 166-168, by my DBA. The author recommends committing every 1000 rows "to reduce the number of rollback segment switches, as well as to ensure all transactions within that set can be processed by one rollback segment."

Are we talking about apples and oranges here? The text does not support your conclusion in this case. Forgive me, but it can be confusing when two sets of experts give conflicting advice.

Other pertinent info that may impact your response: our db currently has Undo tablespaces, with undo management set up. The Undo Tablespace is 3GB.

Thanks again!

Tom Kyte
February 03, 2003 - 6:52 pm UTC

That book is wrong wrong wrong. So wrong wrong wrong

Ask the DBA where in the book it scientifically shows that this is "good" -- you know, "proves it".

Then you can point him to my book "Expert one on one Oracle", pages 143-148, the section entitled "Bad Transaction Habits", where I prove this is a terrible idea for more than one reason.

What text doesn't support my scientifically provable fact?

What the heck is a "rollback segment 'switch'" anyway? That is not even terminology we use when talking about rollback segments.

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4951966319022

or, if you don't believe me
http://www.ixora.com.au/newsletter/2001_09.htm
(steve adams)

http://www.experts-exchange.com/Databases/Oracle/Q_20328566.html

this one is really cool, shows about a 400% decrease in performance by committing every 500/1000 rows:
http://home.clara.net/dwotton/dba/big_insert.htm#results


just search google for

"frequent commits" oracle


to read others.


Do not believe any tuning advice or information until you see factual, concrete, reproducible numbers to back it up. Everything else is conjecture, hypothesis and mostly mythology. "If the book doesn't prove it, don't believe it" is my mantra.
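
In that spirit, a rough, self-contained test is easy to knock up (the table names and the 100,000-row count below are arbitrary) -- time commit-per-row against commit-once and look at the numbers yourself:

create table t1 ( x number );
create table t2 ( x number );

declare
    l_start number;
begin
    l_start := dbms_utility.get_time;
    for i in 1 .. 100000 loop
        insert into t1 values ( i );
        commit;                                    -- commit every row
    end loop;
    dbms_output.put_line( 'commit per row: ' || (dbms_utility.get_time - l_start) || ' hsecs' );

    l_start := dbms_utility.get_time;
    for i in 1 .. 100000 loop
        insert into t2 values ( i );
    end loop;
    commit;                                        -- commit once
    dbms_output.put_line( 'commit once   : ' || (dbms_utility.get_time - l_start) || ' hsecs' );
end;
/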

Excellent explanation...

Shailesh, February 04, 2003 - 2:34 am UTC

Hi Tom,

I agree with your opinion, but there are a few circumstances where you require a commit and will face the ORA-1555 error. I have faced this "fetch across commit" situation.

We have added one ID column to all tables of our application database. This ID will have a unique value, so I have created a sequence and written the following stored procedure. There are around 1200 tables, and record counts range from 100 to 10 million. The DB size is 5GB. So I have to commit after each table's ID update.

CREATE OR REPLACE PROCEDURE UpdateAllId IS
    CURSOR AllTables IS
        SELECT table_name
          FROM user_tables
         WHERE table_name != 'PLAN_TABLE';

    Row_Rec AllTables%ROWTYPE;
    Stmt    VARCHAR2(200);
BEGIN
    OPEN AllTables;

    LOOP
        FETCH AllTables INTO Row_Rec;
        EXIT WHEN AllTables%NOTFOUND;
        Stmt := 'UPDATE ' || Row_Rec.table_name || ' SET ID = SEQID.NEXTVAL';
        EXECUTE IMMEDIATE Stmt;
        COMMIT;
    END LOOP;

    CLOSE AllTables;
END UpdateAllId;
/


WHAT WOULD YOU DO IN THIS SITUATION?

Thanks & Regards,

Shailesh



Tom Kyte
February 04, 2003 - 7:55 am UTC

Hope and pray I don't get an ora-1555. Cause if I do, well, I'm hosed, aren't I...

I don't know what tables I processed.
I don't know what tables I have left to go.

I'm in a HOPELESS situation -- I'd have to start over and reupdate everything.

Your solutions (in my order of preference):

#1 size your RBS to do the work your system needs to do
#2 if you don't want to do #1, rethink your decision
#3 make sure your process is RESTARTABLE. That would be:




create table tablesToProcess as select table_name from user_tables where 1=0;

create or replace procedure updateAllId
is
begin
    for x in ( select * from tablesToProcess )
    loop
        dbms_application_info.set_client_info( x.table_name );
        execute immediate 'update ' || x.table_name || ' set id = seqid.nextval';
        delete from tablesToProcess where table_name = x.table_name;
        commit;
    end loop;
end;
/


And then to run this, you:

insert into tablesToProcess
select table_name
from user_tables
where table_name <> 'PLAN_TABLE';
commit;

exec updateAllId;


Now, if updateAllId fails for any reason -- down system, whatever - just run updateAllId again and it picks up where it left off.

Took more brain power.
Gotta test it (it'll probably work).
Harder..
But solves the restart problem (which most people ignore)....




How to handle bad data?

A reader, February 04, 2003 - 10:13 am UTC

Option A) So I remove the loop, and thereby the commit interval, and do it as one big bulk insert (as many as 100k rows on some nights)

You should find option (a) to be the most expedient, easiest to code, trivial to maintain.

Tom,

In the above example/solution, how do you recommend handling situations with bad data (example: trying to insert a record with a bad date)?


Tom Kyte
February 04, 2003 - 10:31 am UTC

If they are fetching from one table to another -- should not be an issue.

If you need that "row by row" stuff, bulk operations WITHOUT a commit (and that is the issue here -- the commit) would be useful: array (bulk) fetch and array insert 100 or so rows at a time (without committing), doing whatever error handling you need, as sketched below.
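
If it helps, a rough sketch of that pattern (the SRC_T and DEST_T table names are hypothetical, and the record-style FORALL insert needs a reasonably recent release -- on older versions you would use one collection per column):

declare
    cursor src_cur is select * from src_t;
    type src_tab_t is table of src_t%rowtype;
    l_rows      src_tab_t;
    bulk_errors exception;
    pragma exception_init( bulk_errors, -24381 );
begin
    open src_cur;
    loop
        fetch src_cur bulk collect into l_rows limit 100;      -- array fetch
        exit when l_rows.count = 0;
        begin
            forall i in 1 .. l_rows.count save exceptions      -- array insert
                insert into dest_t values l_rows(i);
        exception
            when bulk_errors then
                -- log the bad rows and carry on; note there is still NO commit here
                for j in 1 .. sql%bulk_exceptions.count loop
                    dbms_output.put_line( 'row ' || sql%bulk_exceptions(j).error_index ||
                                          ' failed: ' || sqlerrm( -sql%bulk_exceptions(j).error_code ) );
                end loop;
        end;
    end loop;
    close src_cur;
    commit;      -- one commit for the whole unit of work
end;
/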

questions

A reader, June 24, 2003 - 5:30 pm UTC

Tom,

In one of the articles referenced on experts-exchange.com, one of the posters suggested using an ORDER BY 1 clause to prevent the read-consistency issue, since the data would all be stored in a temp segment. Your thoughts?

Secondly, most of your examples revolve around update statements and the ORA-01555 being caused by them. What about insert and delete statements inside a for loop with commits -- how could they run into an ORA-01555 error, assuming a single-user database? The for loop uses a cursor which selects from tables that are not referenced at all in the insert statements. Could it be that some recursive SQL is being executed in the background which requires some undo generated by the inserts/deletes?

Tom Kyte
June 25, 2003 - 11:40 am UTC

well, you really want to copy the data? aren't you just working around the "we don't want lots of undo" thing with "we love gobs and gobs of temp"?

No, I do not "approve", you NEED to size your undo for your system, period, plain and simple, done.

besides, what happens when we decide to use an index and skip the sort? now what?

AND you only avoid the 1555 (maybe) for your query -- not for all of the other poor users of your system; your frequent commits are going to wrap the rbs, causing them to get 1555s they would not normally get.

bad fix.


it is the COMMIT, not the update that causes the 1555 here. change the update to insert/update/merge and same thing happens. It is the commit.

Commit cycle

A Reader, May 12, 2004 - 9:30 am UTC

Hi Tom

Currently I am extracting data into a reporting data mart
from a production database. The database is Oracle 9i with undo management mode set to auto and an undo tablespace defined. Now, with these settings, can I just have one commit for each process populating thousands of records, instead of defining commit cycles within a loop?

Thanks

Tom Kyte
May 12, 2004 - 6:48 pm UTC

you could always (should always perhaps) have that capability.


automatic undo is not significantly different under the covers from manual - the major feature change is the undo-retention setting, something that was not possible in 8i with manual.
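
For illustration (the retention value and tablespace name below are placeholders), the automatic-undo "knobs" boil down to a couple of parameters and one view:

-- in the init.ora / spfile:
--   undo_management = AUTO
--   undo_tablespace = UNDOTBS1

-- how long committed undo is kept available for consistent reads, in seconds:
alter system set undo_retention = 10800;

-- how much undo the workload actually generates, in 10-minute buckets:
select begin_time, end_time, undoblks, txncount, maxquerylen
  from v$undostat
 order by begin_time;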



Commit cycles

A reader, May 13, 2004 - 3:58 am UTC

Many Thanks

committing after inserting every n rows

GRJ, October 18, 2006 - 2:10 am UTC

Tom,

Are there any performance issues or errors like ORA-01555 if we commit after inserting every n = 100, 200, ... 1000 rows, as shown in the code below?

FOR qoh_rec IN SRC_STORAGE_ITEM_CUR
LOOP
    bRecordExists := 'F';
    src_storage_item_no := qoh_rec.storage_item_no;

    IF (UPPER(sInsertAll) = 'N') THEN
        OPEN DES_QOH_CUR;
        FETCH DES_QOH_CUR INTO dest_storageitem_no;

        IF DES_QOH_CUR%FOUND THEN
            bRecordExists := 'T';
        ELSE
            bRecordExists := 'F';
        END IF;

        CLOSE DES_QOH_CUR;
    END IF;

    -- Insert only when there is no record in the QOH table
    IF (bRecordExists = 'F') THEN
        record_count := record_count + 1;

        INSERT /*+ Append */ INTO QOH_BEGIN_BALANCE
            ( storage_item_no,
              qoh_beg_balance,
              bin_location_no,
              bin_location_id,
              location_code,
              created_date_time )
        VALUES
            ( qoh_rec.storage_item_no,
              qoh_rec.quantity_on_hand,
              qoh_rec.bin_location_no,
              qoh_rec.bin_location_id,
              qoh_rec.location_code,
              qoh_rundate );

        -- Commit every 100 records
        IF MOD(record_count, 100) = 0 THEN
            COMMIT;
        END IF;
    END IF;
END LOOP;


Tom Kyte
October 18, 2006 - 8:01 am UTC

you know that

insert /*+ append */ into .....

is just as effective as:

insert /*+ I_DONT_UNDERSTAND_HOW_APPEND_WORKS */ into ....


insert append works ONLY for insert as select.
insert append would be "not smart" for single row inserts as it always writes above the high water mark (if you got what you asked for, well, you would achieve a whopping 1 row per block - be thankful we ignored it for you!)

You know what I want to know about your code - what happens when it FAILS part way through? what happens when you are 50% of the way through the cursor and some error is raised??????

How do you recover?
How do you restart safely?
How do you avoid re-inserting rows you've already inserted? (maybe that slow by slow check you have there, you know, the one that just eats resources for no good reason?)

Can you tell me why this procedural code exists at all????

Why isn't it just:

insert /*+ append */ into T (columns....)
select columns.....
from t2
where storage_item_no not in (select storage_item_no from some_table where storage_item_no is not null );


?????
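
As a quick sanity check (illustrative only -- T here is just a scratch copy of ALL_OBJECTS), you can tell whether a direct-path insert really happened: after a true insert /*+ append */ ... select, the same session cannot read the table again until it commits.

create table t as select * from all_objects where 1=0;

insert /*+ append */ into t select * from all_objects;
select count(*) from t;
-- ORA-12838: cannot read/modify an object after modifying it in parallel
-- that error is the tell-tale sign the direct-path load really took place
rollback;

insert /*+ append */ into t ( object_name ) values ( 'X' );
select count(*) from t;
-- no error -- the hint was silently ignored; that was a conventional single-row insert
rollback;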
