A reader, December 10, 2010 - 2:42 pm UTC
Hi Tom,
don't you think the OP's loop would hit an ORA-01555 error, since it is updating the same table it is looping over?
Thanks
December 10, 2010 - 2:50 pm UTC
probably not...
I'll do this as a challenge :) Can anyone see why this code:
for x in (select * from big_table order by a,b,c,d)
loop
    process...
    update big_table set x=y, z=m where a=x.a and b=x.b and c=x.c and d=x.d;
    commit;
end loop;
probably would not be subject to an Ora-1555? There is a sound technical reason why in THIS CASE - it would probably work.
I'm not a fan of committing after each row - but in this case it probably doesn't hurt performance too much since in PLSQL - commits are asynchronous - we don't wait for lgwr.
Although - if it takes, say, 1/10,000 of a second to commit - over 27,000,000 rows the commits alone would add about 45 minutes of runtime! Bulking up to 1,000 rows at a time will knock that down a lot - to about 2.7 seconds of commit time.
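A bulked-up version of that loop might look like the following - just a sketch, assuming big_table has the columns shown and that the per-row processing fills in the new x and z values in memory:

```sql
declare
    cursor c is
        select * from big_table order by a, b, c, d;
    type t_rows is table of big_table%rowtype;
    l_rows t_rows;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 1000;  -- 1,000 rows per fetch
        exit when l_rows.count = 0;
        -- process each row in l_rows here, computing the new x and z ...
        forall i in 1 .. l_rows.count
            update big_table
               set x = l_rows(i).x, z = l_rows(i).z
             where a = l_rows(i).a and b = l_rows(i).b
               and c = l_rows(i).c and d = l_rows(i).d;
        commit;  -- one commit per 1,000 rows instead of one per row
    end loop;
    close c;
end;
/
```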
A reader, December 10, 2010 - 3:26 pm UTC
"probably would not be subject to an Ora-1555? There is a sound technical reason why in THIS CASE - it would probably work."
Is it since there is order by a,b,c,d?
Thanks
December 10, 2010 - 3:39 pm UTC
not going to say - until you tell us "why"
Small transactions!
Venzi, December 10, 2010 - 3:40 pm UTC
Because you do a commit after every row, you have a lot of tiny transactions. Therefore the rollback segments (RBS) will most likely not even need to grow.
Oracle can reuse the RBS that just got freed up by the commit and therefore keeps the other undo data which is accessed by the select statement.
December 10, 2010 - 3:47 pm UTC
but that is the CAUSE of the ora-1555 - "Oracle can reuse the RBS that just got freed up by the commit" - and what gets reused may be exactly the undo data which is still needed by the select statement.
that is exactly what CAUSES an ora-1555!!!!
your read components of statements receive an ora-1555 when the undo they need to provide a read consistent copy of the data is no longer there.
the query "select * from big_table order by a,b,c,d" must return the last row (of 27,000,000 rows) "as of the same time" as the first row. Suppose each iteration of that 27,000,000-iteration loop takes 1/1000th of a second - that is 7.5 hours of runtime. That means that 7.5 hours after this query begins - it could require undo that is 7.5 hours old - and if that undo had been overwritten.... bam - ora-1555.
no reading what we write
Flado, December 10, 2010 - 3:44 pm UTC
We won't read the rows we updated and committed, so no problem - in isolation. We can still get an ORA-1555 if other sessions are busy updating and committing at the same time.
December 10, 2010 - 3:54 pm UTC
ora-1555 happens at the BLOCK level - not the row level. You could be reading and re-reading the same blocks over and over and over again.
for x in ( select * from t where indexed_column between ? and ? )
loop
    update t set ....;
    commit;
end loop;
is just asking for an ora-1555 - even in a single-user database.
that, and...
Flado, December 10, 2010 - 3:53 pm UTC
of course, by the time we fetch the first 100 rows in the for loop we will have done all the consistent read gets we would need (because of the order by) :-)
December 10, 2010 - 3:55 pm UTC
keep going - tell us why that is so :)
temp segment
bfirgi, December 10, 2010 - 4:03 pm UTC
Because of the "order by", the data will be in the temp segment.
by definition of sort...
Flado, December 10, 2010 - 4:05 pm UTC
to output the first row, we must be sure that it is indeed the first (based on the order by), so we need to have compared it to all other rows - which means we have already read every other row via consistent read - bang! all CRs are done. The rows are safely in private memory or temp... even if there are concurrent commits.
BTW my first hunch applies only to an implicit cursor without order by, I think...
December 10, 2010 - 4:13 pm UTC
bingo (and the last reviewer too)
that is it - but it would not be solely due to an implicit cursor - it would be for any cursor.
In general, for a query like:
select * from big_table order by a,b,c,d
the plan would be:
full scan -> sort order by
So, in order to get the FIRST row, we have to actually get the last row (into temp) and temp doesn't need consistent reads from undo since we already DID the consistent read.
If someone changed the optimizer mode to first rows - AND there was an index on a,b,c,d AND one of a,b,c,d was NOT NULL - then - that code would almost certainly blow up with an ora-1555 - because the optimizer would use the index to read the data sorted (really really SLOWLY).
And the odds are we'd have to read the same blocks over and over and over again - with long periods of time in between the reads and we'd need to apply the UNDO we ourselves generated hours ago and it might well not exist anymore.
But here - the supposition would be full scan -> sort order by
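You can verify which plan you will get before running such a loop - a quick sketch, assuming big_table exists in your schema:

```sql
explain plan for select * from big_table order by a, b, c, d;
select * from table(dbms_xplan.display);
-- expected: TABLE ACCESS FULL of big_table feeding a SORT ORDER BY
-- an INDEX FULL SCAN here instead would be the ora-1555-prone plan
```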
Ditch the Loop
Matt, December 10, 2010 - 4:12 pm UTC
Surely the goal would be to attempt to remove the loop completely and replace it with a single SQL update statement (as per Tom's own mantra).
Or even better, get rid of the update completely and do the calculations when the record is first posted - if that can't be done, maybe delete/insert would be better than a big correlated update.
December 10, 2010 - 4:49 pm UTC
I looked briefly at the logic - it would be hard since there are rolling totals/summations happening there. It would take a MERGE of a query against the table (using analytics and such) into the table itself again - a ton of work. This is a rare case where it MIGHT (stress that word) be better to make the single pass and update back by rowid IN BULK.
migrating the code to be bulky should be rather easy - and would have great benefit - it can be done fast. The merge statement would have taken a while - if it could be done (haven't looked at it that closely)
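That single-pass "update back by rowid in bulk" approach might be sketched like this - hypothetical, assuming the rolling totals/summations land in column x:

```sql
declare
    type t_rids is table of rowid;
    type t_vals is table of big_table.x%type;
    l_rids t_rids;
    l_x    t_vals;
    cursor c is select rowid, x from big_table order by a, b, c, d;
begin
    open c;
    loop
        fetch c bulk collect into l_rids, l_x limit 1000;
        exit when l_rids.count = 0;
        -- compute the rolling totals/summations into l_x here ...
        forall i in 1 .. l_rids.count
            update big_table set x = l_x(i)
             where rowid = l_rids(i);  -- rowid is the fastest way to find the row
        commit;
    end loop;
    close c;
end;
/
```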
index?
Flado, December 10, 2010 - 4:13 pm UTC
thinking further about it, we could still get ORA-1555, if we don't sort - by accessing the table via a perfect index full scan. In which case we still may visit the same block many times. That takes care of the 'probably' part, I hope.
December 10, 2010 - 4:50 pm UTC
yes, that is it - if they changed the optimizer goal to first rows and we used an index full scan - that would ora-1555 almost for sure.
oops
Matt, December 10, 2010 - 4:14 pm UTC
... no correlation but still a big update...
without order by...
Flado, December 10, 2010 - 4:41 pm UTC
Yay!
now that the time is up, I can clarify what I meant by 'we don't read what we write':
if we didn't have an order by, and in the absence of any predicates that could invoke an indexed access, we'd just read every block of the table (via a full scan) just once before updating it (many times). We would be safe from ORA-1555 in isolation, but would still be vulnerable from concurrent commits to blocks near the "end" of the table.
more on "probably"
Flado, December 10, 2010 - 4:56 pm UTC
... which brings me to the observation that even in the 'full scan-sort order by' case, we ARE vulnerable to ORA-1555 due to concurrent commits while the full scan runs.
Phew!
Flado, December 10, 2010 - 5:37 pm UTC
"has a good chance of ora-1555'ing - in a single user database!"
-- yeees, given lots of delayed block cleanouts on the table, but doesn't automatic undo try hard to reduce that "good chance" somewhat?
December 10, 2010 - 6:35 pm UTC
automatic undo management *can* definitely help reduce it - IF and ONLY IF
a) the DBA had the foresight to set the undo retention realistically for the work performed on the system
AND
b) the DBA had the fortitude to allocate sufficient space to the undo tablespace to hold that much undo
I many times find neither A nor B.
I frequently see A but not B.
I rarely see A and B :)
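For reference, the two knobs are along these lines (the numbers and file name below are placeholders, not recommendations):

```sql
-- a) retention: must cover the longest-running query/transaction on the system
alter system set undo_retention = 28800;  -- seconds (8 hours)

-- b) space: the undo tablespace must be big enough to actually keep that undo
alter database datafile '/u01/oradata/undotbs01.dbf'  -- hypothetical file
    autoextend on maxsize 32g;
```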
Single SQL statement
A reader, December 11, 2010 - 3:52 pm UTC
Tom:
I have 10 processes running insert, update, delete into the same table around 500,000 times. The total time it takes is around 3:30 minutes. The total number of records inserted is around 30 million. We are trying to change it to insert all at one time (a simple insert statement with hash joins), update all at one time, and delete all at the same time.
The insert takes around 30 min,
the update takes around 60 min - we have dropped some of the indexes being updated,
the delete takes around 60 min.
Additional processing takes around 30 min. Basically we are not saving much time. The prototype we did initially showed promising results; however, when the number of rows increased, I guess something is amiss. The difference between the original run and this run is NL joins versus hash joins in the latter. Can you outline what we are missing?
It looks as if the 10 different processes simply run more efficiently.
Thanks,
December 12, 2010 - 8:10 am UTC
3:30 minutes is ambiguous to me. Is that 3 hours 30 minutes or 3 minutes 30 seconds? Sounds like 3 hours 30 minutes given the other information. It is a very strange way to present timings to someone.
Are you comparing a parallel operation with a non-parallel operation here? You don't say if you employed parallel query in the bulk operation. If not, why NOT? if you did, why don't you say so???????????
Any reason this cannot be done with a MERGE statement, which is capable of inserting new rows, updating existing ones, and removing others?? You give very little information regarding the actual processing.
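For illustration, a single MERGE can do all three - a sketch against hypothetical tables txn (target) and stg (source):

```sql
merge into txn t
using stg s
   on (t.id = s.id)
 when matched then
      update set t.amount = s.amount
      delete where (s.status = 'OBSOLETE')  -- purge matched rows flagged obsolete
 when not matched then
      insert (id, amount) values (s.id, s.amount);
```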
Based on the previous...
A reader, December 11, 2010 - 6:27 pm UTC
All our freelists and freelist groups are all adjusted accordingly.
Basically the essentials of batch processing are all met...
Thanks,
A reader, December 12, 2010 - 8:21 am UTC
Tom:
1. Inserting into a permanent process table. This takes 30 min.
2. Updating the main transaction table based on the process table (this is a merge). Takes more than 60 minutes - more if the indexes are present.
3. Deleting the records from the process table. That takes around 60 minutes.
Yes we did enable parallelism of 10.
Thanks,
December 13, 2010 - 2:05 am UTC
and how many cpus do you have. and how much memory. and what is your pga aggregate target. and did you do any analysis using AWR/statspack to see if the advisors had anything interesting to say.
I'm going to guess that your pga is set too low and you are doing many multi-pass pga operations that go to disk.
why is step 3 not a truncate?
there is really entirely insufficient detail here - "we insert into t1, we update t2 from t0, we delete from t0" - that is what this seems to boil down to - not much to go on.
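For comparison - a sketch, assuming the process table (call it process_t here) is emptied completely each run:

```sql
-- delete is DML: it generates undo and redo for every row removed
delete from process_t;

-- truncate is DDL: it resets the segment almost instantly,
-- generating almost no undo/redo - but it cannot be rolled back
truncate table process_t;
```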
A reader, December 12, 2010 - 8:26 am UTC
Tom:
The insert into the process table joins around 6 tables (some of them as big as 90 million rows).
Thanks,
Still subject to Delayed Block Cleanout?
Brian B, December 13, 2010 - 12:22 pm UTC
"There is a sound technical reason why in THIS CASE - it would probably work. "
...
"So, in order to get the FIRST row, we have to actually get the last row (into temp) and temp doesn't need consistent reads from undo since we already DID the consistent read. "
That makes a lot of sense as to why this query itself doesn't get an ORA-01555. You said "probably" in the original challenge... could this query still set up conditions for another reader to get an ORA-01555 due to delayed block cleanout?
December 14, 2010 - 3:38 am UTC
sure - if the optimizer was using first_rows(n) optimization and there existed an index on the columns we were ordering by - the optimizer might choose to read the table via the index - which means the last row would be retrieved perhaps hours after the first row - and might actually reside on the block where the first row was retrieved from - leading to a probable ora-1555.
Understandings !!
Rajeshwaran, Jeyabal, December 14, 2010 - 3:26 am UTC
for x in (select * from big_table order by a,b,c,d)
loop
    process...
    update big_table set x=y, z=m where a=x.a and b=x.b and c=x.c and d=x.d;
    commit;
end loop;
In order to get the FIRST row, we have to actually get the last row into temp. (Reading back from temp requires no consistent reads and no before image, so no ORA-1555.) So we will not hit ORA-1555 here!
for x in (select * from big_table)
loop
    process...
    update big_table set x=y, z=m where a=x.a and b=x.b and c=x.c and d=x.d;
    commit;
end loop;
In this scenario, a consistent read is needed for each row retrieved from big_table, by reconstructing blocks from the undo segment (before image). If the requested before image is unavailable in the undo segment, then we will hit ORA-1555.
Is my understanding correct, Tom?
December 14, 2010 - 7:21 am UTC
consistent reads are needed for each BLOCK retrieved from big_table and may need undo to reconstruct those blocks. If that undo isn't available - then ora-1555
Understandings !!
Rajeshwaran, Jeyabal, December 15, 2010 - 1:37 am UTC
Thanks for correcting me Tom!!
consistent reads are needed for each BLOCK retrieved from big_table