Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Jane.

Asked: June 05, 2000 - 3:48 pm UTC

Answered by: Tom Kyte - Last updated: March 05, 2020 - 3:14 am UTC

Category: Database - Version: oracle 8i

Viewed 100K+ times! This question is

You Asked

Tom,
Would you tell me what snapshot too old error. When does it happen? What's the possible causes? How to fix it?

Thank you very much.

Jane

and we said...

I think support note <Note:467872.1> covers this topic very well:


There are many good notes on ORA-1555 causes and solutions. In order to be able to resolve a problem, we must first understand the problem.

We will examine an extended example of the most common cause of the ORA-1555.

EXAMPLE:

Time 1: Session #1 starts a query on table A
Time 2: Session #2 updates row X in table A
Time 3: The query from session #1 arrives at row X and discovers the last updated time
(based on the SCN number) of the row to be later than Time 1 ... so the UNDO or
LOB segments are used to get the read consistent view (before image) of the row
and then the query proceeds
Time 4: Session #2 updates row Y in table A and then COMMITs (thus making it possible
for this transaction slot to be overwritten)
Time 5: Session #2 updates row Z in table B and COMMITs ... either due to space pressure
(using AUM) or bad luck (using rollback) in the read consistent view for the update
of row Y in table A at time 4 ... is overwritten (we wont examine why the overwrite
occurs at this point)
Time 6: The query from session #1 arrives at row Y and and discovers the last updated time
(based on the SCN number) of the row to be later than Time 1 ... so the UNDO or
LOB segments are examined to find the read consistent view ... BUT ... the
transaction slot containing the view ... was overwritten at time 5 ... so no read
consistent view is available ... so an ORA-1555 occurs

NOTE ... UNDO is a generic term that can refer to either UNDO (AUM) or Rollback segments

==============================================================================================

How can AUM allow a read consistent view of a transaction slot for a committed row to be overwritten?

1) The transaction slot for the row has expired.

This means that the current time minus the commit time of the row is greater than the
UNDO_RETENTION. Once a transaction slot of a committed row is 'expired', it is available for
reuse. Often we get customers who ask:

Why do my queries sometimes run so much longer than my UNDO_RETENTION before
encountering an ORA-1555 and other times it occurs soon after?

The answer is all left up to chance and how much pressure the undo tablespace is under.

2) The transaction slot for the row has NOT expired and yet was still overwritten.

This condition occurs under two conditions:

a) The undo tablespace has become full and must 'steal' transaction slots from committed, unexpired
rows

b) A bug

3) The LOB segment read consistent copy of the LOB is no longer available

Investigate how the LOB column is configured, in-row or out-of-row. In-row LOBs should be utilizing
normal UNDO algorithms in the UNDO tablespace. Out-of-row LOBs use the information below.

Read consistency with LOBs is controlled by 2 methods

a) Use of PCTVERSION (old method)

Specify the maximum percentage of overall LOB storage space used for maintaining old versions of
the LOB. The default value is 10, meaning that older versions of the LOB data are not overwritten
until they consume 10% of the overall LOB storage space.

The downfall to this method is customers who do frequent updates or deletes of tables with LOBs
often exceed even 100% PCTVERSION ...

b) Use of RETENTION (current method for use with AUM)

Use this clause to indicate that Oracle Database should retain old versions of this LOB column.
Oracle Database uses the value of the UNDO_RETENTION initialization parameter to determine
the amount of committed undo data to retain in the database.

This method uses the same expiration as undo segments

If an ORA-1555 occurs with a LOB segment. then this means that either:

PCTVERSION was exceeded and the read consistent view of the LOB was overwritten

OR

the LOB has exceeded RETENTION and the row LOB was overwritten sometime during
the query that generated the ORA-1555

New information

LOB problems with ORA-1555 can be very difficult to fix. Updates to out-of-row LOBs
should be minimized. Deletes and inserts have much less impact to UNDO operations than
updates. PCTVERSION is the preferred approach with 10.2.x and AUM. However,
100% will mean that lots of disk space is required to keep up with UNDO "copies" of LOB
information in environments with frequent LOB updates.

In some cases it has helped to move the LOB column to a tablespace where the Segment
Space Management is manual.

NOTE: transaction slots for UNCOMMITTED rows cannot be overwritten

Note 422826.1 How To Identify LOB Segment Use PCTVERSION Or RETENTION
Note 563470.1 Lob retention not changing when undo_retention is changed
Note 800386.1 ORA-1555 - UNDO_RETENTION is silently ignored if the LOB
resides in a MSSM tablespace
=======================================================================================

Now that we understand why the ORA-1555 occurs and some aspects about how they can occur we need to examine the following:

How can we logically determine and resolve what has occurred to cause the ORA-1555?

1) Determine if UNDO_MANAGEMENT is MANUAL or AUTO

If set to MANUAL, it is best to move to AUM. If it is not feasible to switch to AUM see

Note 69464.1 Rollback Segment Configuration & Tips

to attempt to tune around the ORA-1555 using V$ROLLSTAT

If set to AUTO, proceed to #2

2) Gather the basic data

a) Acquire both the error message from the user / client ... and the message in the alert log

User / Client session example:

ORA-01555: snapshot too old: rollback segment number 9 with name "_SYSSMU1$" too small

Alert log example

ORA-01555 caused by SQL statement below (Query Duration=9999 sec, SCN:0x000.008a7c2d)

b) Determine the QUERY DURATION from the message in the alert log

From our example above ... this would be 9999

c) Determine the undo segment name from the user / client message

From our example above ... this would be _SYSSMU1$

d) Determine the UNDO_RETENTION of the undo tablespace

show parameter undo_retention

3) Determine if the ORA-1555 is occurring with an UNDO or a LOB segment

If the undo segment name is null ...

ORA-01555: snapshot too old: rollback segment number with name "" too small

or the undo segment is unknown

ORA-01555: snapshot too old: rollback segment number # with name "???" too small

then this means this is a read consistent failure on a LOB segment

If the segment_name or the undo segment is known the error is occurring with an UNDO segment.

==============================================================================================

What to do if an ORA-1555 is occurring with an UNDO segment
-------------------------------------------------------------------------------------------------

1) QUERY DURATION > UNDO_RETENTION

There are no guarantees that read consistency can be maintained after the transaction slot for the
committed row has expired (exceeded UNDO_RETENTION)

Why would one think that the transaction slot's time has exceeded UNDO_RETENTION?

Lets answer this with an example

If UNDO_RETENTION = 900 seconds ... but our QUERY DURATION is 2000 seconds ...

This says that our query has most likely encountered a row that was committed more than 900
seconds ago ... and has been overwritten as we KNOW that the transaction slot being examined
no longer matches the row we are looking for

The reason we say "most likely" is that it is possible that an unexpired committed transaction slot was
overwritten due to either space pressure on the undo segment or this is a bug

SOLUTION:

The best solution is to tune the query can to reduce its duration. If that cannot be done then increase
UNDO_RETENTION based on QUERY DURATION to allow it to protect the committed
transaction slots for a longer period of time

NOTE 1: Increasing UNDO_RETENTION requires additional space in the UNDO tablespace. Make
sure to accommodate for this space. One method of doing this is to set AUTOEXTEND on one or
more of the UNDO tablespace datafiles for a period of time to allow for the increased space. Once the
size has stabilized, AUTOEXTEND can be removed.

See the solution for #2 below for more options

NOTE 2: It is NOT recommended to have a mix of AUTOEXTENSIBLE and NON-AUTOEXTENSIBLE files within an UNDO tablespace as this can cause TUNED_UNDORETENTION to miscalculate


2) QUERY DURATION <= UNDO_RETENTION

This case is most often due to the UNDO tablespace becoming full sometime during the time when the
query was running

How do we tell if the UNDO tablespace has become full during the query?

Examine V$UNDOSTAT.UNXPSTEALCNT for the period while the query that generated the
ORA-1555 occurred.

This column shows committed transaction slots that have not exceeded UNDO_RETENTION but
were overwritten due to space pressure on the undo tablespace (IE became full).

If UNEXPSTEACNT > 0 for the time period during which the query was running then this shows
that the undo tablespace was too small to be able to maintain UNDO_RETENTION. Unexpired
blocks were over written, thus ending read consistency for those blocks for that time period.

set pagesize 25
set linesize 120

select inst_id,
to_char(begin_time,'MM/DD/YYYY HH24:MI') begin_time,
UNXPSTEALCNT "# Unexpired|Stolen",
EXPSTEALCNT "# Expired|Reused",
SSOLDERRCNT "ORA-1555|Error",
NOSPACEERRCNT "Out-Of-space|Error",
MAXQUERYLEN "Max Query|Length"
from gv$undostat
where begin_time between
to_date('<start time of the ORA-1555 query>','MM/DD/YYYY HH24:MI:SS')
and
to_date('<time when ORA-1555 occured>','MM/DD/YYYY HH24:MI:SS')
order by inst_id, begin_time;

NOTE: Start time of the query can be determined by subtracting the query duration from the timestamp
of the ORA-1555


EXAMPLE:

Tue May 26 16:16:57 2009
ORA-01555 caused by SQL statement below (SQL ID: 54yn3n36w24ft, Query Duration=922 sec, SCN: 0x0007.8a55f4e3)

922 seconds is 15 min 22 seconds

So the start time of the query would be May 26 16:01:35

If after researching, you find no existing bugs seem to match your case, then additional info will be
needed to file a bug.

Starting with release 9.2.0.4 an event, 10442, can be set to dump all the necessary diagnostics
information to one trace file. NOTE: Event 10442 does not trigger diagnostics in cases where the
ORA-1555 error is associated with out-of-row LOB undo.

The event can be enabled using the following settting in init.ora

event="10442 trace name context forever, level 10"

or

Alter system set events '10442 trace name context forever , level 10';

Reproduce the ORA-1555 error with a new connection to trigger the trace diagnostics file.

NOTE: There have been issues with 10g and 11g on RAC using 10442. The event can contribute to factors leading to a hang on one or more nodes on the RAC. Refer to Bug 10051817 for more fixes for these problems on RAC.


Another thing to consider is whether Oracle is behaving as designed as far as "steal logic" is concerned

Here is the logic for allocating space for a new transaction--based on Bug:4070480

* Allocate new extent in the undo tablespace
* If failed, steal an expired extent from another undo segment. This involves shrinking from the
other segment and adding to the current undo segment.
* If failed, autoextend a datafile (if enabled)
* If failed, reuse an unexpired extent from the current undo segment.
* If failed, steal an unexpired extent from another undo segment.
* If failed, report an "Out-Of-Space" error.

What to do if an ORA-1555 is occurring with a LOB segment
---------------------------------------------------------------------------------------------

1) Often the ORA-1555 with the null undo segment is accompanied by an ORA-22924

Error: ORA-22924
Text: snapshot too old
---------------------------------------------------------------------------
Cause: The version of the LOB value needed for the consistent read was already overwritten by another
writer.

2) Determine if PCTVERSION or RETENTION is used

Examine DBA_LOBS.PCTVERSION and DBA_LOBS.RETENTION
(or investigate through the data dictionary--see below)

NOTE: Only one of these may be set, so this will determine which method of undo management is
used for this LOB

There is a bug with these two columns prior to 10.2 and as such the data dictionary must be
examined to determine which of these is being used

Note 422826.1 How To Identify LOB Segment Use PCTVERSION Or RETENTION From
Data Dictionary

If PCTVERSION is used
------------------------------------

This method of LOB undo management is akin to use of rollback segments and as such there is little
control to tune undo usage.

The only parameter we have control over with this method is PCTVERSION itself.

SOLUTION:

Increase PCTVERSION


ALTER TABLE **table name** MODIFY LOB (**lob column name**) (PCTVERSION 100)

If the parameter is at 100% then the only solution will be to move to use of RETENTION


ALTER TABLE **table name** MODIFY LOB (**lob column name**) (RETENTION)

If RETENTION is used
---------------------------------

You may have RETENTION in place, but it may not be working as you expect.

See Note 800386.1 ORA-1555 - UNDO_RETENTION is silently ignored if the LOB
resides in a MSSM tablespace
Note 563470.1 Lob retention not changing when undo_retention is changed

If RETENTION is used the same rules apply for the LOB segments as do the UNDO segments

1) QUERY DURATION > UNDO_RETENTION

There are no guarantees that read consistency can be maintained after the old lob segment for
the committed row has expired (exceeded UNDO_RETENTION)

SOLUTION:

The best solution is to tune the query to reduce its duration. If that cannot be done then
increase UNDO_RETENTION based on QUERY DURATION to allow it to protect the
committed transaction slots for a longer period of time. A change to UNDO_RETENTION
does not automatically change RETENTION. See Note 563470.1 Lob retention not
changing when undo_retention is changed.

NOTE 1: Increasing UNDO_RETENTION requires additional space in the LOB segments
tablespace. Make sure to accommodate for this. One method of doing this is to set
AUTOEXTEND on one of more of the lob segments tablespace datafiles. There may or
may not be a stabilization of size like the UNDO tablespace. If it does stabilize then the
AUTOEXTEND can be removed. See the solution for #2 below for more options.

NOTE 2: RETENTION for LOBs does not function the same as it does for undo. LOB segments
will not automatically extend to allow for retention of read consistent copies of a lob. As
such, at times like these, PCTVERSION 100 is a better solution.

2) QUERY DURATION <= UNDO_RETENTION

This case is most often due to the LOB segments tablespace becoming full sometime during the time
when the query was running.

How do we tell if the LOB segment tablespace has become full during the query?

Unfortunately there is no easy way to do this. V$UNDOSTAT does not contain info about the LOB
segments. There may be a way to use Enterprise Manager and examine the high watermark
(HWM) of the tablespace in question.

There is an enhancement request for this info to be included in the future

Bug:3200789 Abstract: VISIBILITY OF LOB SEGMENT USAGE FOR UNDO

SOLUTION:

Provide more space in the LOB segments tablespace so that transaction slots do not have to be
stolen. This can be done by:

* Using AUTOEXTEND on an existing LOB segments tablespace datafile
* Increase the size of an existing LOB segments tablespace datafile
* Add a datafile to the LOB segments tablespace


and you rated our response

  (280 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Snapshot too Old.

March 20, 2001 - 11:13 pm UTC

Reviewer: Ganesh Raja from Chennai, Tamil Nadu India.

This Article Was real Good. Thanks Tom for Highlighting this.

Snapshot too old error

March 29, 2001 - 10:43 pm UTC

Reviewer: KP from India

yr reply is ultimate.


Simply Superb

August 07, 2001 - 5:49 am UTC

Reviewer: Nikhil S Bidwalkar from Singapore

Tom your reply was just terrific ... I need to know more about how oracle makes a read consistent views using rollback segment if more then 2 transactions are reading same block which are updated by another transactions .

Snapshot too old error

October 31, 2001 - 4:21 am UTC

Reviewer: AC from Deutschland

Superb! Excellent article and carefully explained.

Snapshot too old

November 09, 2001 - 10:59 am UTC

Reviewer: Jo?o Paulo from Brazil

Finnaly I completely understand the rules for the snapshot too old.

Thanks Tom.

"Order by" alternative

January 14, 2002 - 9:47 pm UTC

Reviewer: walt from CT

I have found some situations where "Order By" is a reasonable way to eliminate snapshot too old errors. This can force all the I/O to occur before any rows are returned. This seems particularly useful when cursor processing has time consuming nested logic.

Snapshot too old error

February 07, 2002 - 11:50 pm UTC

Reviewer: Prasath Srinivasan from Chennai,India

Tom
I came across this site only on 07/02/2001
The informations you provide and the way you approach is really amazing and extremely useful.
Thanks for ur work


Snapshot too old error

February 07, 2002 - 11:51 pm UTC

Reviewer: Prasath Srinivasan from Chennai,India

Tom
I came across this site only on 07/02/2001
The informations you provide and the way you approach is really amazing and extremely useful.
Thanks for ur work


facing same problem.

March 31, 2002 - 6:11 pm UTC

Reviewer: Sudhanshu Jain from India, Delhi

Hi

I am facing the same problem in my application. My process involves fetch across commit.

I will implement the suggestion as you mentioned.

Thanks.

Great Job!!!

August 01, 2002 - 1:36 pm UTC

Reviewer: Fiza from Canada

Thanks Tom your reply is more than useful...wonderful

Snapshot too old

January 09, 2003 - 12:40 am UTC

Reviewer: K Mangesh from Pune, India

Thanks a lot Tom, This is really great.....

Extremely Useful Information

February 23, 2003 - 12:39 pm UTC

Reviewer: ik from BG

Tom,

Two Questions -

1) Further to the concept of delayed block cleanout - If assuming that transactions happen on the database during the night to sync up with another system, would it be advisable enough to do a SELECT /*+ FULL */ COUNT(*) FROM updated_table after the nightly bulk inserts/updates?
This is to ensure that table is 'cleaned out'. Otherwise the next day regular queries will find the updated blocks as uncommitted and would have to read the rollback segments. Objective is to reduce work done by the regular queries on the database.

2) Would the query lock the rollback segment for consistent reads?

Thanks as always.

Tom Kyte

Followup  

February 23, 2003 - 12:46 pm UTC

1) i would not, no. The concept here is that the work is so neglible to the guys the next day -- that it won't be noticed.

2) no

Good article

February 26, 2003 - 6:48 am UTC

Reviewer: Victor Oosterbaan from Netherlands

Very good article, beginning to understand RBs now :P


more help needed in understanding ora-01555

May 15, 2003 - 12:06 am UTC

Reviewer: Jerry from USA

I read some of the postings on this website and your book. I still feel it difficult to piece together the reason for ora-01555. Two questions:

1. How many slots the transaction table has for a rbs?
2. Could you help explain the second example in the article?

ROLLBACK TRANSACTION SLOT OVERWRITTEN

rem * 1555_b.sql - Example of getting ora-1555 "Snapshot too old" by
rem * overwriting the transaction slot in the rollback
rem * segment header. This just uses one session.

drop table bigemp;
create table bigemp (a number, b varchar2(30), done char(1));

rem * Populate demo table.
begin
for i in 1..200 loop
insert into bigemp values (mod(i,20), to_char(i), 'N');
if mod(i,100) = 0 then
commit; (Q: ==> why commit here? I recall you mentioned in another posting that it's better to bite the bullet and size the rbs properly. Is it for demo purpose to avoid the maxextents reached? Or is it suggested to do commit every n rows?)
end if;
end loop;
commit;
end;
/

drop table mydual;
create table mydual (a number);
insert into mydual values (1);
commit;

rem * Cleanout demo table.
select count(*) from bigemp;

declare

cursor c1 is select * from bigemp;

begin

-- The following update is required to illustrate the problem if block
-- cleanout has been done on 'bigemp'. If the cleanout (above) is commented
-- out then the update and commit statements can be commented and the
-- script will fail with ORA-1555 for the block cleanout variant. (Q: ==> could you please expand on this and explain why update is needed here?

update bigemp set b = 'aaaaa';
commit;

for c1rec in c1 loop (Q: ==> The cursor here already got the commited changes from bigemp, right?)
for i in 1..20 loop
update mydual set a=a; ( Q: ==>Does fetch across cursor means that the update/fetch is on the same table? Why ora-01555 occurs here?)
commit;
end loop;
end loop;
end;
/


Tom Kyte

Followup  

May 15, 2003 - 9:20 am UTC

see
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>
for my example of this. I did not do the example above, yes, I would not have filled the table in that fashion. i would have used insert into select .... from all_objects myself.

the number of transaction entries varies by blocksize.

If you have my book "Expert one on one Oracle" -- I spend lots of time on this topic with lots of examples. Basically, we get into a situation where we cannot tell if the version of the block we have access to is "current enough" yet "not too current" for our queries result set. the information we needed to tell that is wiped out of the rbs.

nifty article

May 30, 2003 - 4:22 am UTC

Reviewer: Anirudh Sharma from New Delhi, India

Hi Tom,
The article about snapshot too old error was very good but I have a small doubt:
What if a procedure which is declared Autonomous transaction is called in a loop and that procedure has commit in it. Can that be a cause for ORA 01555 as well.
Thanks

Tom Kyte

Followup  

May 30, 2003 - 8:09 am UTC

the only CAUSE of a 1555 is improperly sized rollback segments.


committing in a cursor for loop is a way to experience the symptoms all by yourself (without any outside help).

June 09, 2003 - 4:13 am UTC

Reviewer: A reader

Tom,
Here is the situation I run into.

I start with an empty database and insert millions of rows.

After the load is done and the updates are committed, I run a very long query that will run for hours.

While the query is running other batch jobs are loading other tables. The table that I am querying is not updated again.

After running for a long time my query fails with snapshot too old.

I assume this is due to the delayed block cleanout.

Do you agree?

If the data blocks were updated, committed and not cleaned out and the rollback segments can be overwritten because it is committed how do the blocks ever get cleaned out in this situation?




Tom Kyte

Followup  

June 09, 2003 - 7:20 am UTC

yes.

your query is cleaning the blocks out itself.

point 6 of note

August 08, 2003 - 2:41 am UTC

Reviewer: A reader

Hi

In the note you provided in point 6 of solutions of case 1 it states this:

6. Ensure that the outer select does not revisit the same block at different
times during the processing. This can be achieved by :

- Using a full table scan rather than an index lookup
- Introducing a dummy sort so that we retrieve all the data, sort it and then sequentially visit these data blocks.


How can a sort avoid this error?

Tom Kyte

Followup  

August 10, 2003 - 11:30 am UTC



Umm, by RETRIEVING all of the data (before the first row is returned -- before a commit happens)

ORA-01555 during export

August 25, 2003 - 9:12 am UTC

Reviewer: Sunil Gururaj from Cyprus

Hi Tom,

I am performing an export of around 80 Million records on a table which is not used by anybody else except the export itself.

I was successful in performing such an export many times in the past few days.

I am using the below command:

gzexp test1/test1 file = exp_RATED_EVENT_XXX.dmp log = exp_RATED_EVENT_XXX.log tables = RATED_EVENT_XXX feedback = 1000000 buffer = 40960000 grants = n constraints = n indexes = n compress = n direct = y

Note: gzexp is a piping script to overcome the 2GB limit.

However, today, after exporting about 70 million records, the export failed with an ORA-01555 error.

The only operation that was performed on this table after the export began was "alter table nologging"

My question is whether the "alter table nologging" is potential enough to cause a "snapshot too old" error or am I missing something here.

Thanks,

Tom Kyte

Followup  

August 25, 2003 - 9:30 am UTC

you don't even need that -- there can be NO activity on this table and you can get an ora-1555. it just means whilst you were querying there were lots of other little transactions all committing and wiping out undo you needed to ensure the consistent read.

it can happen on a READ ONLY tablespace even.

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>

for example demonstrates that. it means your RBS is too small for what you do on your system.

if you restart, it should be OK now since at least 7/8ths of the table has had its blocks cleaned out.

August 25, 2003 - 3:36 pm UTC

Reviewer: A reader


what are systemTables ? what is the main usage ?

September 09, 2003 - 7:31 am UTC

Reviewer: raja from India


hi, tom

what are the systemtables and usages ?

how to catch the for..loop Exception's ?

i am writing one simple procedure , in that procedure executable block ,2 nd statement raises an exception, it goes to exceptions block and after that i am interested to execute again 3 rd statement's onwards...also how to do this ?



Tom Kyte

Followup  

September 09, 2003 - 11:50 am UTC

system tables are the data dictionary.


for ...
loop
statement 1
begin
statement 2
exception
when named_exception then ....
end
statement 3
end loop

that'll execute statement 1, then statement 2. if statement 2 raises some exception and you catch it, handle it -- then statement 3 will execute.

OPTIMAL parameter for rollback segments

September 11, 2003 - 1:48 am UTC

Reviewer: Mohan from bangalore

Hi Tom,

I am confused about whether to specify the optimal parameter if storage cluase of "create rollback segment". The article above says if a value is specified for the OPTIMAL parameter then it can cause "snapshot too old error". When I don't specify a value for the optimal parameter then the rollback segment ocuupies the entire tablespace, never shrinks. How to overcome this.

Mohan


Tom Kyte

Followup  

September 11, 2003 - 8:36 am UTC

let the rollback segment stay that size?!?

i mean, apparently, they need to be that big. why keep shrinking them if they really want to grow that large. it is expensive to let them grow. It is expensive to let them shrink. let them be the size they need to be for your system.

rebuild index and snapshot too old

October 08, 2003 - 8:38 am UTC

Reviewer: A reader

Hi

Recently we got snapshot too old errors when rebuild an index (not very big, around 1GB). How so.... we rebuild index using conventional methods not using ONLINE clause so no DML is allowed we dont understand why we can get ORA-01555. Can it be delayed block clean out? If so how can we avoid that?

Tom Kyte

Followup  

October 08, 2003 - 10:51 am UTC

it is the delayed cleanout -- yes.

you increase the size of your permanently allocated RBS to be large enough so as to not wrap around during your longest running statement.

Breaking my head. Please help me out !

October 10, 2003 - 9:31 am UTC

Reviewer: Tony from India

I'm in the process of tuning pro*c programs. The existing code does the following:

1. Opens a cursor for 10000 records.
2. Bulk fetch 100 records at time.
3. Process the data and update, insert...etc
4. Commit for every 500 records.
5. Colse the cursor and Open the cursor for another 10000 records.
6. step 1 to 5 again
........
.........

The above actions are repeated many times as the table has millions of records. The application(month end process) is very slow but doesn't through any error.

Since cursor reopen for every 10000 records and frequent commit for every 500 records are identified as the main cause for the slow down of the application, Its decided to 1. Remove cursor reopen,
2. Open the cursor only once with all records.
3. Bulk fetch 100 records at a time.
4. Process the data, update /insert ....
5. Create a large rollback segment.
6. Increase the commit interval (Commit for every 1 lack record (100000) instead of every 500 record).

Now there is a significant improvement in the performance.
But ORA-01555 "Snapshot too old" error is encountered after commiting certain number of records.

The tablespace for the large rollback segment size is above 3GB. And the large rollback segment has the following values:
Initial extent= 50 MB,
min extent = 2,
next extents = 2 MB,
max extent = unlimited.

Now I encounter ORA-01555 "Snapshot too old" error, but rollback tablespace still has so much of free space and only 6 extents are allocated for the rollback segment.

What could be the reason for the error. I'm breaking my head for the past 3 days. I've increased the initial extent to 512 and Still I face the problem. Tom, Please help me out.

Thanks in advance.

Than





Tom Kyte

Followup  

October 10, 2003 - 10:46 am UTC

the probability of a 1555 is directly related to the SIZE OF THE SMALLEST RBS. Your query needs to READ from ALL rbs's -- your transaction might be writing to one, but your query needs them all. And they are wrapping around.

You want to make ALL of your RBS's big enough so they do not wrap around during the course of this processing.

You want this to go really fast? remove the procedural code as much as possible (eg: single SQL insert/updates - NO QUERY) or even consider putting it into plsql instead of bringing it all of the way OUT of the database just to put it back INTO the database.




Why query needs rollback segment?

October 11, 2003 - 12:40 am UTC

Reviewer: Tony from India

Tom, Thanks for your answer for my previous question on
ORA-01555. I have few more doubts.

1.As per I know, only insert, update and delete need rollback segs. Why the query needs rollback segments?.
2.The pro*c programs are forced to use the BIG rollback segment and the error ORA-01555 is raised for the BIG rollback segment and not for the samll ones. then, why should I increase the size of small rollback segs?




Tom Kyte

Followup  

October 11, 2003 - 10:22 am UTC

1) read:
</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96524/c21cnsis.htm#17882 <code>

to learn all about the COOLEST feature in Oracle. It explains why rollback is NOT just for modifications.

This is something that EVERYONE needs to understand. It is the fundemental "thing" about Oracle.

2) if you have my book expert one on one Oracle -- I wrote on this extensively.

STOP committing until your TRANSACTION is complete.


Question.

November 10, 2003 - 9:07 am UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada.

Tom,

Can we say that we cannot get ORA-1555 more times then the sum of the WRAPS column in v$rollstat for all rollback segments ? (given that we have not droped/created/offlined any of them, and excluding the system rollback).

If not, can you explain why?

Thanks.


Tom Kyte

Followup  

November 10, 2003 - 12:06 pm UTC

no -- those wraps are different then "wrap arounds" -- they are wraps from one extent to another.

Ah !

November 10, 2003 - 12:53 pm UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

Ah ... confusion again. I found a question about wraps, that explains it thanks.

Is there anyway to see wrap arrounds via a statistic ?

Is there are any usefull info we can get, via such a statistic?

Tom Kyte

Followup  

November 10, 2003 - 3:03 pm UTC

with undo segments -- it is easy. with undo segments -- you don't need to worry about such calculations.

with old fashioned ones, not so. you could look at writes (bytes written) to see how much activity it generated. a statspack would give you this for a discrete window of time. You can see how fast you generate undo and the compute the theoretical wrap arounds from there.

If DML session starts first, is it possible to get ORA-01555?

November 12, 2003 - 7:39 pm UTC

Reviewer: John from San Jose

Hi Tom,

Feel guilty everytime I post here - thinking you are being bombarded with questions from all over the world. Such is the price of fame, yes? I typically come here only when I have issues that I absolutely can not resolve myself. So here you go...

If I have only 2 sessions running in a system, One running DML, one doing query. If the DML session starts first, is it possible for the querying session to get ORA-01555?

Here is my analysis:

1. Session 1 modifies 100 blocks in a table at time T1. 100 blocks copied to rollback segment with SCN marked as, say, SCN1.
2. Session 2 comes in at time T2 and select one of the blocks, let's say block1. Since the data is being modified by session 1, session 2 goes to the rollback for the block. Let's say the current SCN is SCN2.
3. At time T3 session 1 commits. The rollback is now up for grab. Let's say it is overwritten before session 2 finishes the long runnig query.
4. At time T4 session 2 ask for another data block, say block100.

Here is the critical point. What happens when session 2 ask for block100 at T4? It looks up block100 and notice the data in the block has been committed and the SCN is SCN1 which is older than SCN2, the starting SCN of the query. This is OK for read consistency purpose. It then proceeded to read the data block from the table. So, even though
the rollback data is now gone, it does not matter. Because session 1 does not need it anymore.

What happens if session 2 at T4 asked for block1, the same block it visitied at T2? Again it looks up the data block in the table, noticed the data has been committed, SCN is older than its starting SCN and decided to read from it. So, session 2 read block1 from rollback at T2 and from table itslef at T4. There is no problem with this, it there?

Do you see any problems with my thinking? Am I correct in saying that the querying session will not get ORA-01555, as long as it starts AFTER the DML session? Thanks much!


John




Tom Kyte

Followup  

November 13, 2003 - 6:44 am UTC

is this a SINGLE QUERY visiting blocks 1 and 100 or separate queryies?

if these are the result of a single query-- then the premise that the SCN1 is old enough is wrong.

if they are individual queries, then no ora-1555 unless you were using serializable which freezes your scn as of the time of the first statement you execute in the transaction.

great

November 13, 2003 - 8:48 am UTC

Reviewer: mengyp from China

ths Tom,
if in a long updating,most block is writlen back to
disk,then commit,and no transaction that visits any block affected by the update to 'tidy up' the block.

1.days after,what happening to those blocks?(if select need to visit)
2.restart instance,what happening to those blocks?
are those blocks need to 'tidy up'(if select need to visit
those blocks)


Question about delayed block cleanout

November 13, 2003 - 5:33 pm UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

When a given block is not cleaned out, and it's rollback transaction information is lost due to high commit activity, how does oracle know what SCN to assign to that block when cleaning it out?



Tom Kyte

Followup  

November 13, 2003 - 9:24 pm UTC

blocks don't have scn's really. rows on the blocks do. the scn is already on the block (left behind from the transaction that modified it in the first place)

ROW SCNs???

November 14, 2003 - 1:07 am UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

That's strange.

I thought that there's one SCN per block, am I wrong ?

Also, how is it possible for the SCN to be already on the modified block if it is not cleaned out ? I also thought that all blocks modified by one transaction should get the System SCN as of the time of the commit, which is unknown when the blocks are been modofied. Is that wrong ?

Here's how I see the process of updating a row

1. Finds block P containing row N
2. Lock row N (if possible)
3. Make a copy of the block in the rollback segment
4. Updates header of block P with a pointer to a rollback segment transaction, which poins to the previous made copy
5. Make the changes to the row and the block
6. Returns to user input

When the user commits, the following happens:

7. Gets the system current SCN
8. Marks the transaction as commited in the rollback segment header, writing the SCN we got.
9. Cleans out some of the blocks (possibly not all)
10. Returns to user input

Is my understanding correct ? Can you correct any incorect steps ?


Tom Kyte

Followup  

November 14, 2003 - 8:10 am UTC

sorry -- i knew after I posted this i mispoke -- was in a hurry last night.

But in any case, the transaction information on the block header is what we need and that is all there -- it is just that the transaction information is "stale" and needs to be refreshed. The transaction header is whats vital here and it is all there -- it is just that we have to peek to see if the transaction that appears to have the rows locked is still active or not (it would appear in that looking at the block that it is but peeking at the active transactions will tell us it is not).

sorry about that.

So where do we get the SCN?

November 14, 2003 - 9:04 am UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

It's ok as long as we clear out the story.

So how does a block get the commit SCN when it is in the "delayed block cleanout " state, if it's not revisited after the commit and it's transaction information can be lost ?



Tom Kyte

Followup  

November 14, 2003 - 10:18 am UTC

you do know "at least how old" it is. it has the base scn on the block as of the modification to the block.. this is one of the reasons this:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>

can happen

more

November 14, 2003 - 11:13 am UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

So when such a block (with lost transactional info) is been cleaned out, will it keep that base SCN ?

Does that mean, that a block with a lost transaction information will be stamped with a SCN earlier then the SCN of the commit? If so, does it mean that different block modified by the same transaction can have different SCNs depending on the cleanout?


Tom Kyte

Followup  

November 14, 2003 - 5:03 pm UTC

see, it is more complex then just a simple "scn on a block".  It really is "row related", "transaction related" (trying to say "it is bigger then a bread box")

things really are at the row level (based on information in the transaction header) at the lowest level.  there are base scn's, wrap scn's on the block, there are commit times in the transaction headers -- the scn base is used in that case.

conceptually, it is easy to think of "block has scn"
physically, it is not so cut and dry.

curious -- why do you ask?

you can see this is more or less at the "row level" via a simple simulation.  we'll create a table, two rows.  both rows -- same block.  open cursors on each row (but don't fetch).  update one row 10,000 times and commit each.  Now, print data.  Review the number of consistent gets for each cursor:

ops$tkyte@ORA920PC> create table t ( x int, data char(10) );
 
Table created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> insert into t values ( 1, 'x' );
 
1 row created.
 
ops$tkyte@ORA920PC> insert into t values ( 2, 'x' );
 
1 row created.
 
ops$tkyte@ORA920PC> commit;
 
Commit complete.
 
ops$tkyte@ORA920PC> select x, dbms_rowid.rowid_block_number(rowid) from t;
 
         X DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ------------------------------------
         1                                   53
         2                                   53
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> variable x refcursor
ops$tkyte@ORA920PC> variable y refcursor
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> @trace
ops$tkyte@ORA920PC> alter session set events '10046 trace name context forever, level 12';
 
Session altered.
 
ops$tkyte@ORA920PC> begin
  2          open :x for select * from t where x = 1;
  3          open :y for select * from t where x = 2;
  4  end;
  5  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> begin
  2          for i in 1 .. 10000
  3          loop
  4                  update t set data = 'x' where x = 1;
  5                  commit;
  6          end loop;
  7  end;
  8  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> print x
 
         X DATA
---------- ----------
         1 x
 
ops$tkyte@ORA920PC> print y
 
         X DATA
---------- ----------
         2 x


<b>tkprof says:</b>


SELECT * from t where x = 1
                                                                                                                                                            
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.05       0.27        113      10002          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.05       0.27        113      10002          0           1
********************************************************************************
SELECT * from t where x = 2
                                                                                                                                                            
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          7          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          7          0           1


So, only the query that hit the block that required a ROW changed after the query began did the massive consistent reads needed to restore it.

It is more complex then simply "there is an scn on the block"


 

Wow

November 14, 2003 - 8:35 pm UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

Wow, that was impressive.

I could've sweared that both queries would do 10 000 consistent reads.

Why I ask ? I've dedicated my self to solve every single "mystery" in Oracle that I encouter. Just recently I was trying to explain to a friend of mine the real reason why he was gething 1555, and that it's not just because he was doing a commit accross fetch.

Anyways, I was thinking about the reasons, and that question came to my mind. How does Oracle know the SCN of the block, when it's info is lost and a 1555 has to be returned ...

Anyway , your example is most impressive, seriously why there's no need for a consistent read on the second row ?

Just when I think I fully understand it, you show me I don't and there's a lot of missing parts.

I am most interested in the aditional details on the "scn per block" concept.


Tom Kyte

Followup  

November 15, 2003 - 9:04 am UTC

hmmm, not so impressive after all. i should have been more thorough. I found a test that proved what I wanted to see :)

reverse the prints -- print y then x -- and you'll see what I mean.

In short, there is scn data on the block but each transaction entry when cleaned out fills in information as well. if not cleaned out, they use the scn base which is "at least old enough".

as for:

"How does Oracle know the SCN of the block, when it's info is lost and a 1555 has
to be returned ..."

thats the issue, we don't KNOW at that point (search for

"alter tablespace" readonly 1555

on this site to read about a read only tablespace throwing a 1555 for this very reason.


I shouldn't have rushed this answer -- so to answer:

"Anyway , your example is most impressive, seriously why there's no need for a
consistent read on the second row ?"

it is because it was in the buffer cache -- we didn't have to walk the undo chain to reconstruct it, the first query did (we can see that by reversing the print statements -- it works in reverse just as well)

time for you to start looking at block dumps maybe :) -- i tend to totally avoid them, get too bogged down in the weeds.

Responibility for ORA-1555: developer or dba?

December 03, 2003 - 10:50 am UTC

Reviewer: Olga from Vienna

I have a transaction, where a 1555 occurs sometimes while I do a very big insert into a temporary table in a pl/sql-Block like this:

begin
insert into temp_parm_table ...;

execute immediate( truncate temp_table1 );

insert into temp_table1
select ... from temp_parm_table, big_table where ...
-<BANG ORA-1555 occurs!>

commit;

end;

For the most entries in temp_parm_table the select runs a few seconds. But for some entries it does take a lot of time. And then sometimes a 1555 occurs.

You said:

"the only CAUSE of a 1555 is improperly sized rollback segments."

I told it the DBA, but he said that my code is wrong (without seeing it) and said also, that there would have been a lot of free space in the rollback segment when my programm generated the ORA-01555. Could that be true?

I think, in his opinion the only CAUSE of a 1555 is the wrong coding of the developers.


Tom Kyte

Followup  

December 03, 2003 - 11:06 am UTC

he is wrong.

you have a query that is running for N minutes.

he has configured the system to hold (N-M) minutes of undo.


Your N minute long query will fail because they have not sized sufficient undo space.

Thanks for yorur Answer!

December 03, 2003 - 11:36 am UTC

Reviewer: Olga from Vienna

Many thanks for your answer, Tom.

I put it to my "private oracle rules":

An ORA-01555 will never be raised, when there is free space in the rollback-segment.





Tom Kyte

Followup  

December 03, 2003 - 4:41 pm UTC

thats not true!

say you have 15 rollback segments.

you start a query.

while that is running, you have a big transaction that fills up rbs1 (almost). it commits.

query still running. rbs2..rbs15 haven't been touched.

Now you have 30 small transactions. 2 each to the 15 rbs's. The first one in rbs1 causes it to WRAP around and reuse some of the space of the big transaction. the second overwrites a tiny bit more.

unfortunately, your still running query needs the undo generated by big transaction. parts of big transaction are gonzo.

ora-1555, even though rbs2..rbs15 havent really been "touched" very much.

Ok, I should say it clearer

December 05, 2003 - 11:32 am UTC

Reviewer: Olga from Vienna

Yes, that was my understanding. But if I have a very long running query and nothing in a cursor for loop and a ORA-1555 occurs the only possiblity for me is to make the rbs bigger. Ok? I understand the read consistency. But

Now I have an other ORA-1555 question:

I start a pl/sql-block:

BEGIN

update prg_log_table
set program_state = 1, run_begin = sysdate
where prg_id = 1
;

-- everybody should see that the prg is running
-- and working and nobody should start the same program
-- so I need a commit

commit;

Now some big inserts started and were running for a while.

-- At the end I must "give the program free":

update prg_log_table
set program_state = 2, run_end = sysdate
where prg_id = 1
;

And finally:

commit;

END;

At the last update I run into 1555. I could not understand it. Why does Oracle need the rollback-information for this last update on my log-table?

Excuse me, ask you the same questions you answered hundred times before. But when it does not seem logical for me in all details I would not understand it (Perhaps it might make you happier when I tell you that I ordered your new book some days ago ;-))

And what can I do? I need the first commit because I don't want that anbody else runs the program while it is running. And with the commit


Tom Kyte

Followup  

December 05, 2003 - 12:27 pm UTC

use dbms_application_info to notify external sessions of what you are doing.

you can set 3 columns in v$session
you can set a row in v$session_longops

if you wanted to serialize this process, you would just use dbms_lock (actually -- your UPDATE is quite enough to make sure that two processes do not run at the same time!!)


I doubt the 1555 was coming from the update itself -- but it could. updates need read consistency to do their reading just like a select.


don't commit, you don't want to commit, use the other features (dbms_application_info). If you commit -- and fail in the middle -- then you have a manual restart needed (you need to manually clean up the row). Just use dbms_application_info -- and no worries.

What if no transaction revisits the block.

December 15, 2003 - 11:58 am UTC

Reviewer: Kamal

Tom,
.....
"When the user commits the transaction Oracle does
NOT go back and revisit these blocks to make the change permanent. It is left
for the next transaction that visits any block affected by the update to 'tidy
up' the block (hence the term 'delayed block cleanout').
"
.....

What if the block does not get revisted for some reason for a long time, Then how will get cleaned out.

Tom Kyte

Followup  

December 15, 2003 - 3:52 pm UTC

if no one goes there, no need to clean it out then is there?

if it did not get cleaned out naturally (we clean most blocks out, only really big transactions won't get them cleaned) it'll just stay that way until its revisited -- 1 minute, 1 day, 1 week, 1 year, 1 decade in the future.

Avoiding Block Cleanouts

December 30, 2003 - 8:17 am UTC

Reviewer: Vivek Sharma from Bombay, India

Dear Tom,

Thanks for your knowledge sharing. I would like to know how to minimize this block cleanouts. Since the session doing the cleanout consumes resources how do i minimize this. I have a database in which i see redos of around 24-25 mb generated by those sessions which are only querying the database. I assume, due to these the performance of the database is poor. What care should be taken inorder to minimize this. Please clarify ?

At present, I have planned to increase the buffer cache size and will be reboucing the database tomorrow morning. The database version is 8.1.7.4.

Another reason for increasing the buffer cache is to eliminate cache buffer chain latch contention. But will this minimize the block cleanouts ?

Thanks and Regards
Vivek


Tom Kyte

Followup  

December 30, 2003 - 10:31 am UTC

don't assume.

prove.

then come back and we can study it futher!

(to minimize block clean outs -- use BULK OPERATIONS on your loads (not slow by slow -- opps meant row by row processing). CREATE table as select -- no dirty blocks. insert /*+ append */ -- ditto. direct path loads -- you got it, no dirty blocks. use a larger cache (the threshold for cleaning the blocks is 10% of the buffer cache -- if you have large transactions that do more than that, not all of the blocks get cleaned)


increasing the buffer cache won't do anything for cbc latching if you have a hot block(s).



Minimizing Block Cleanout

December 30, 2003 - 10:47 am UTC

Reviewer: Vivek Sharma from Bombay, India

Dear Tom,

Thanks for your prompt reply.

Many sessions doing block cleanouts and generating at an average 25 MB of redos. You mean to say it will not degrade the performance of the database. My main reason of Increasing the Buffer Cache is :

1. Minimize Block Cleanouts
2. Reduce Cache Buffer Chain Latch.

For point no.2, I have read from your site itself that more the number of block buffers will eliminate the contention on Cache Buffer Chain.
The site address is :
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1229436447262, <code>

I know, i need to tune my queries in order to eliminate contention on hot blocks. So If I increase the number of blocks will it have any impact on Cache Buffer Chain ?

For delayed cleanouts, I cannot change the big procedures immediately since they are developed by third party and will require major changes in the procedures. I have initiated this process of changing the procedures one by one but as a immediate workaround I have decided to increase the buffer cache.

Regards
Vivek


Tom Kyte

Followup  

December 30, 2003 - 11:46 am UTC

I just made the counter point that IF the cbc latching is due to a hot block(s), it will matter not if you have 100, 1000, or 1000000 blocks. It *CAN* reduce cbc latching IF the contention is for the lists of blocks, not specific blocks (eg: increasing the number of lists of blocks will do nothing to reduce contention for the list that contains the hot blocks).

That was what I said in the related note as well -- there is a big "if" statement in that answer.

"Many sessions doing block cleanouts and generating at an average 25 MB of redos.
You mean to say it will not degrade the performance of the database."

I mean to say that -- unless you show us that this is causing some performance hit, there is nothing to look at.

Unsure of approach to avoid this error for this scenario

December 31, 2003 - 3:07 pm UTC

Reviewer: Mark from USA

BANNER
Oracle8i Enterprise Edition Release 8.1.7.3.0 - 64bit Production
PL/SQL Release 8.1.7.3.0 - Production
CORE 8.1.7.0.0 Production
TNS for HPUX: Version 8.1.7.3.0 - Production
NLSRTL Version 3.4.1.0.0 - Production


ACCOUNT REFRESH PROCEDURE
Oracle packaged procedure runs once a day, has a cursor that reads the account table (1,000,000 rows) and returns accounts which need to be updated from the mainframe. We expect this cursor to find 80,000 account refresh candidates. There are 102 columns in the account table, 45 of which may be updated from the mainframe.

Within the cursor loop we call an Oracle Java SP which communicates with the mainframe to get updated account values. We are tied to this method.

Updating the account table directly from Java was too slow (1.3 seconds per account). I know this is supposed to be much faster in Oracle9i, but we won't be upgraded until late 2004 (and it doesn't solve the basic problem we are having). The Java program sends the update SQL statement back to the calling procedure which then does EXECUTE IMMEDIATE to do the update (0.35 seconds per account).

When this runs the users are also working these accounts via a GUI, and will be reading and possibly updating the same accounts in the account table (different columns). Our SLA is 2 seconds response time at the GUI. The refresh program issues a COMMIT for each account, within the loop.

The obvious problem here is the "snapshot too old". In addition, we'd like to speed this up further. Do you have any ideas for a better approach? We are considering a temp table (or non-temp acting as a temp) to resolve the ORA-01555 but not sure if this would prevent it (I thought I understood this problem but after reading your website found that it can happen even if the update is on a different table from that in the cursor loop..?), seperating the "get the stuff from the mainframe for all accounts" from the "update the account table" piece instead of "get one, update one"...

Any ideas we can try, much appreciated.
Mark.

Tom Kyte

Followup  

December 31, 2003 - 3:24 pm UTC

well, updating the row in java isn't any "slower" per say in java then in plsql. it is just SQL after all?

to have a single row update take 0.35 seconds is way too long as well.

You have a lot of slow by slow (woops - meant row by row) processing that you say "you are tied to".

think about 1,000,000 java sp calls.
and 1,000,000 mainframe calls

1,000,000 of anything takes a long long time. the largest impact will be to employ bulk processing where ever possible -- and perhaps parallelize the process.

So, can you hit the mainframe with more than one session?
(you are using bind variables right????)

More info...

December 31, 2003 - 3:40 pm UTC

Reviewer: Mark from USA

Well it's only 80,000 accounts out of the 1,000,000 that this process will update... but yes, it's row by row.
Bind vars - I need to check what the Java program is sending back. This isn't my program, I'm trying to help out.
Stuff I should've mentioned:
We are already running 4 sessions to cut down the overall elapsed time. The 0.35 - there is some additional processing of other tables within the loop when we do the update. It's really not bad, but we'd like to do better. I couldn't come up with a way to 'bulk process' this...

Tom Kyte

Followup  

December 31, 2003 - 3:58 pm UTC

but -- you make 1,000,000 mainframe calls to find these 80k right?

the only way to speed this up will be to bulk it up. array fetches, array updates, not slow by slow.

80k is a huge number, 1,000,000 is larger -- doing anything that many times is going to be slow. that's what we have to cut down on.

have you tkprofed it
have you dbms_profiled it.

bulking up

December 31, 2003 - 4:09 pm UTC

Reviewer: Mark from USA

No, the cursor returns 80,000 rows from a 1,000,000 row table. For each of the 80,000 we call the Java SP to get 'fresh' data from the mainframe, then we update the account table.
I have not TKPROF'ed it because I figured we need to rethink the approach in order to avoid the snapshot too old problem. This is the main problem with this program - while fixing it I want to see if we can make it go faster too. I figured it's pointless tuning something that we may rewrite.
If we bulk up, we need to do it in a way that we don't lock any account for more than a few seconds.

Tom Kyte

Followup  

December 31, 2003 - 5:02 pm UTC

if you make it go faster -- you'll help avoid the 1555 as well.

i always start with a good tkprof to see if there is any obvious low hanging fruit.

but one way to avoid the 1555 would be to

insert into gtt select the 80k rows


and process them from there -- a global temporary table (gtt) in temp doesn't need consistent reads -- it uses direct reads (since the data is known to be consistent)

That is exactly what I needed to know - thanks for the fast input

December 31, 2003 - 5:32 pm UTC

Reviewer: Mark from USA


About the off-line rollback segment

January 08, 2004 - 12:18 am UTC

Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka

Hi Tom,

I have created 180 rollback segments and I have set max_rollback_segments=180.
All of the rollback segments are public.

But whenever I restart the database only 125 rollback segments are on-line.
This is verified in v$resource_limit as well.

Why are the rest of the segments off-line ?

Thanks in advance.


Tom Kyte

Followup  

January 08, 2004 - 1:21 pm UTC

</code> http://docs.oracle.com/docs/cd/A87860_01/doc/server.817/a76956/rollbak.htm#498 <code>

...
Then, when an instance opens a database, it attempts to acquire at least TRANSACTIONS/ TRANSACTIONS_PER_ROLLBACK_SEGMENT rollback segments to handle the maximum amount of transactions.
.......


max is the max, the number of actually acquired is a function of your transactions.

Got the point

January 10, 2004 - 9:58 am UTC

Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka

Hi Tom,

Thanks a lot.

Got the point. I have really forgotten "TRANSACTIONS/ TRANSACTIONS_PER_ROLLBACK_SEGMENT".



very good article

February 10, 2004 - 2:16 am UTC

Reviewer: Ravi Chander Kondoori from INDIA

Its really a good article with indepth explanation.

unlimited maxextents for rbs

February 20, 2004 - 11:29 am UTC

Reviewer: ana from PA, USA

doc 50380.1 on metalink recommends not setting the MAXEXTENTS value to UNLIMITED. Could you please explain in simple words why?

Thanks

Tom Kyte

Followup  

February 20, 2004 - 1:10 pm UTC

even in LMTS where the default (and in fact only value) is unlimited for segments -- an exception of 32k is made for RBS's (they are not unlimited, they are limited to 32k extents)

Has to do with the way these things are managed, big circular buffers.

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96524/b_deprec.htm#635261 <code>

"optimal" would have been killer on a DMT managed rbs with thousands of extents.

Some more clarifications pls

March 10, 2004 - 3:54 am UTC

Reviewer: Zee from Singapore

Thanks for the indepth explanation of rollback segments. Can you give some solution to the following issue?
We have two batch processes running in UAT and Production. They basically move data from staging to main tables. On days when the volume of data is high (like mondays) we get the "Snapshot too old" error only in UAT. I have checked the rollback segments in both UAT n Production. The optimum size in UAT is 50 MB and 860 MB in Production. Can I safely assume that this is the reason why we get the problem only in UAT? I know you dont like using optimal while defining rollback segments, but I dont have much of an option here as I am only a developer and the DBAs insist on using this feature.
Coming back to my problem, can you have a look at my procedure and suggest how I can change the code to avoid this problem in UAT? Due to space issues, I am not at a liberty to change the size of the Rollback Segments.

The procedure is as follows:
***start of code*****

DELETE TPH_FRM_TRD_DATA where trunc(STAMP_ADD_DZ)= trunc( Sysdate );
commit;

INSERT INTO TPH_FRM_TRD_DATA
(
tph_frm_trans_cd,
tph_frm_rec_id_c,
tph_frm_seq_nbr,
tph_frm_acct_id_c,
tph_frm_acct_typ,
tph_frm_cusip_id_c,
tph_frm_corr_q,
tph_frm_rr_q,
tph_frm_tot_pos,
tph_frm_stk_splt,
tph_frm_book_cost,
tph_frm_div,
tph_frm_avg_px,
tph_frm_mkt_value,
tph_frm_curr_px,
tph_frm_user_px,
tph_frm_ai_csh_div,
tph_frm_dly_sls_crdt,
tph_frm_prev_day_tot_tkt,
tph_frm_cum_pnl_dly,
tph_frm_td_cum_pnl_mtd,
tph_frm_td_mtd_tot_tkt,
tph_frm_td_trd_int,
tph_frm_td_unrlz_pnl_mtd_chng,
tph_frm_td_rlzd_pnl_mtd_chng,
tph_frm_td_sls_crdt_mtd,
tph_frm_persh_chrg_mtd,
tph_frm_cum_pnl_mtd,
tph_frm_mtd_tot_tkt,
tph_frm_trd_int,
tph_frm_unrlz_pnl_mtd_chg,
tph_frm_rlzd_pnl_mtd_chg,
tph_frm_sls_crdt_mtd,
tph_frm_dte_of_data,
user_id_add_c,
stamp_add_dz,
user_id_updt_c,
stamp_updt_dz
)
SELECT
TPH_FRM_TRANS_CD,
TPH_FRM_REC_ID_C,
TPH_FRM_SEQ_NBR,
TPH_FRM_ACCT_ID_C,
TPH_FRM_ACCT_TYP,
TPH_FRM_CUSIP_ID_C,
TPH_FRM_CORR_Q,
TPH_FRM_RR_Q,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TOT_POS)||nvl(TPH_FRM_TOT_POS_SGN,' '))/100000 as TPH_FRM_TOT_POS,
TPH_FRM_STK_SPLT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_BOOK_COST)||nvl(TPH_FRM_BOOK_COST_SGN,' '))/100 as TPH_FRM_BOOK_COST,
TPH_FRM_DIV,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_AVG_PX)||nvl(TPH_FRM_AVG_PX_SGN,' '))/1000000000 as TPH_FRM_AVG_PX,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_MKT_VALUE)||nvl(TPH_FRM_MKT_VALUE_SGN,' '))/100 as TPH_FRM_MKT_VALUE,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_CURR_PX)||nvl(TPH_FRM_CURR_PX_SGN,' '))/1000000000 as TPH_FRM_CURR_PX,
TPH_FRM_USER_PX,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_AI_CSH_DIV)||nvl(TPH_FRM_AI_CSH_DIV_SGN,' '))/100 as TPH_FRM_AI_CSH_DIV,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_DLY_SLS_CRDT)||nvl(TPH_FRM_DLY_SLS_CRDT_SGN,' '))/100 as TPH_FRM_DLY_SLS_CRDT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_PREV_DAY_TOT_TKT)||nvl(TPH_FRM_PREV_DAY_TOT_TKT_SGN,' '))/100 as TPH_FRM_PREV_DAY_TOT_TKT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_CUM_PNL_DLY)||nvl(TPH_FRM_CUM_PNL_DLY_SGN,' '))/100 as TPH_FRM_CUM_PNL_DLY,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_CUM_PNL_MTD)||nvl(TPH_FRM_TD_CUM_PNL_MTD_SGN,' '))/100 as TPH_FRM_TD_CUM_PNL_MTD,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_MTD_TOT_TKT)||nvl(TPH_FRM_TD_MTD_TOT_TKT_SGN,' '))/100 as TPH_FRM_TD_MTD_TOT_TKT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_TRD_INT)||nvl(TPH_FRM_TD_TRD_INT_SGN,' '))/100 as TPH_FRM_TD_TRD_INT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_UNR_PNL_MTD_CHG)||nvl(TPH_FRM_TD_UNR_PNL_MTD_CHG_SGN,' '))/100 as TPH_FRM_TD_UNR_PNL_MTD_CHG,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_RL_PNL_MTD_CHG)||nvl(TPH_FRM_TD_RL_PNL_MTD_CHG_SGN,' '))/100 as TPH_FRM_TD_RL_PNL_MTD_CHG,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TD_SLS_CRDT_MTD)||nvl(TPH_FRM_TD_SLS_CRDT_MTD_SGN,' '))/100 as TPH_FRM_TD_SLS_CRDT_MTD,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_PERSH_CHRG_MTD)||nvl(TPH_FRM_PERSH_CHRG_MTD_SGN,' '))/100 as TPH_FRM_PERSH_CHRG_MTD,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_CUM_PNL_MTD)||nvl(TPH_FRM_CUM_PNL_MTD_SGN,' '))/100 as TPH_FRM_CUM_PNL_MTD,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_MTD_TOT_TKT)||nvl(TPH_FRM_MTD_TOT_TKT_SGN,' '))/100 as TPH_FRM_MTD_TOT_TKT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_TRD_INT)||nvl(TPH_FRM_TRD_INT_SGN,' '))/100 as TPH_FRM_TRD_INT,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_UNRLZ_PNL_MTD_CHG)||nvl(TPH_FRM_UNRLZ_PNL_MTD_CHG_SGN,' '))/100 as TPH_FRM_UNRLZ_PNL_MTD_CHG,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_RLZD_PNL_MTD_CHG)||nvl(TPH_FRM_RLZD_PNL_MTD_CHG_SGN,' '))/100 as TPH_FRM_RLZD_PNL_MTD_CHG,
TPH_UTIL_PKG.GET_SIGN((TPH_FRM_SLS_CRDT_MTD)||nvl(TPH_FRM_SLS_CRDT_MTD_SGN,' '))/100 as TPH_FRM_SLS_CRDT_MTD,
TPH_FRM_DTE_OF_DATA,
USER,
SYSDATE,
USER,
SYSDATE
FROM TPH_FRM_TRD_DATA_STG;

COMMIT;
***end of code*****
Thanks in advance.

Z.


Tom Kyte

Followup  

March 10, 2004 - 9:02 am UTC

yes, if uat has significantly smaller rbs's and is getting 1555 (which are the result of rbs being too small), that would be the cause.


You are being prevented from doing your job. $5 of disk space is causing your company $$$$ in lost productivity. Has anyone pointed that out to anyone?


You could try making the query portion (which is what is getting the 1555) faster. I see alot of plsql function calls in there. Each one adds to the runtime.

what is the code of "get_sign"




Thanks for the response Tom!

March 11, 2004 - 12:07 am UTC

Reviewer: Zee from Singapore

For quantity fields, the last character of the input string indicates the sign. (The data is coming from a third party feed provider.) The code of get_sign is as follows:

**********start of code**********

v_temp_sign := SUBSTR(i_value,-1);
v_temp_no := TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1));
--
IF v_temp_sign IN (' ','+') THEN
RETURN (v_temp_no);
ELSIF v_temp_sign = '-' THEN
RETURN (v_temp_no*-1);
ELSIF v_temp_sign = '{' THEN
RETURN (v_temp_no*10);
ELSIF v_temp_sign = '}' THEN
RETURN (v_temp_no*-10);
ELSIF v_temp_sign BETWEEN 'A' AND 'I' THEN
RETURN ((v_temp_no*10)+(ASCII(v_temp_sign)-64));
ELSIF v_temp_sign BETWEEN 'J' AND 'R' THEN
RETURN ((v_temp_no*-10)-(ASCII(v_temp_sign)-73));
END IF;

**********end of code****************

Can you suggest how to modify the previous SQL statement to prevent this error?

Some people have suggested to replace with the "insert into --- select ---" chunk with a cursor which will fetch all the data from the staging table and while inserting into the main table, do a commit for every 1000 odd rows. Is this the right solution?

Please advise.

Tom Kyte

Followup  

March 11, 2004 - 8:27 am UTC

use decode( SUBSTR(i_value,-1),
' ', TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1)),
'+', TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1)),
'-', -1*TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1)),
'{', 10*TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1)),
'}', -10*TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1)),
case when substr(i_value,-1) between 'A' and 'I'
then TO_NUMBER(SUBSTR(i_value,1,length(i_value)-1))*10+.... )
else ....
end )

(or just case -- upto you) instead of plsql.

Only follow that last bit of advice (to incrementally commit) if you want to

a) have this run longer
b) use more resources
c) but most CERTAINLY assure that you will hit the ora-1555

that commit will pretty much guarantee your 1555 will happen, but after running lots and lots of code, consuming much resources.


And then -- well, and then you are "hosed", cause you have to figure out "how would I restart this"


no, that commmit every 1,000 is really bad advice.

Great

March 11, 2004 - 9:55 am UTC

Reviewer: Zee from Singapore

Thanks for your advice on the commit per 1000 rows. I will modify the sql to avoid the pl/sql calls and test it in UAT.
Thanks again for your help!!

Some doubts ..

March 17, 2004 - 11:22 am UTC

Reviewer: Jagjeet Singh from INDIA

Hi,

         Using Redhat 8.0 + Oracle 9.2.0
         -------------------------------

Sir, You mentioned three reasons for "Snapshot too Old" in 
     your book. I wast just trying to reproduce this error
     with this test.


$ sqlplus /nolog

SQL*Plus: Release 9.2.0.1.0 - Production on Tue Mar 16 11:17:00 2004

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

SQL> conn / as sysdba
Connected.
SQL>
SQL> Create Undo tablespace undo2 datafile '/tmp/undo2.dbf'
  2  size 2m reuse;

Tablespace created.

SQL> alter system set undo_tablespace=undo2;

System altered.

SQL>
SQL> Select file_name,user_blocks from dba_data_files
  2  where file_name = '/tmp/undo2.dbf' ;

FILE_NAME                      USER_BLOCKS
------------------------------ -----------
/tmp/undo2.dbf                         992


--#  We got 900+ blocks for storing undo image...

SQL>
SQL> conn rnd
Enter password:
Connected.
SQL> create table t (x    int,y  char(1800));

Table created.

SQL> insert into t select rownum r,'y' from dict where rownum
  2  < 1000 ;

999 rows created.

SQL> commit;

Commit complete.

SQL> select count(*),
  2  count(distinct(dbms_rowid.rowid_block_number(rowid)))
  3  from t;

  COUNT(*) COUNT(DISTINCT(DBMS_ROWID.ROWID_BLOCK_NUMBER
--------- -------------------------------------------------
999       999




--#  One row in One Block ...


SQL>
SQL> update t set y=y where rownum < 525;
update t set y=y where rownum < 525
*
ERROR at line 1:
ORA-30036: unable to extend segment by 512 in undo tablespace 'UNDO2'


--#  The First Question is .. 
--#  previous image for 525 blks are not fitting in 
     undo tablespace
--#  with 900+ available blocks .. ????????
--#  How much undo will be genrated for 
--#  this statement, exactly 524 blocks or 525+ ... 


       

--#  changing 525 to 480 blocks ... 

SQL> c  /525/480
  1* update t set y=y where rownum < 480
SQL> /

479 rows updated.

SQL> Select Used_ublk from v$transaction;

 USED_UBLK
----------
       479   << ----- exactly 479 blocks ...

SQL> commit;

Commit complete.

--#   trying to genrate "Snapshot too old"


SQL>
SQL>
SQL> begin
  2     for i in ( select x,
  3     dbms_rowid.rowid_block_number(rowid) blk from t )
  4          loop
  5
  6        dbms_output.put_line(' -------  '||i.blk);
  7
  8            update t set y=y where rownum < 480 ;
  9                       commit;
 10          end loop;
 11  end;
 12  /
-------  165
-------  166
-------  167
-------  168
-------  169
-------  170
begin
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 12 with name "_SYSSMU12$"
too small
ORA-06512: at line 2


--# Sir,If Previous image for 525 blocks are not fitting
    in undo tablespace then I think 480 undo blocks will 
    consume the 90% of undo tablespace. Then this error
    should come after third record.


--#   Oracle fills undo segments in Round-Robin fashion
--#   My assumptions ..

o Undo Seg   |-s1-|-s2-|-s3-|-s4-|-s5-|-s6-|-s7-|-s8-|-s9-|
o Undo Tabsp |--------------------------------------------|
o First Upd. |11111111111111111111111111111111111---------|
  fill up 90%                                             
o Sec. upd                                       222222222|
             |2222222222222222222222222                   |
o Third upd.                           3333333333333333333|
             | 333333333333333                             


--# Oracle will try  to read the blocks with "1" update
    previous image.. But third update will totaly overwrite
    the "1" image blocks then this error should come after 
    third record ... ????????


--#   change 480 blocks to 100 blocks....
      Ora-1555 after 8 update ... 
      

SQL> ed
Wrote file afiedt.buf

  1  begin
  2     for i in ( select x,
  3     dbms_rowid.rowid_block_number(rowid) blk from t )
  4          loop
  5           dbms_output.put_line(' -------  '||i.blk);
  6               update t set y=y where rownum < 100 ;
  7                       commit;
  8          end loop;
  9* end;
 10  /
-------  165
-------  166
-------  167
-------  168
-------  169
-------  170
-------  171
-------  172
begin
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 11 with name "_SYSSMU11$"
too small
ORA-06512: at line 2


--##  change 100 to 50
      Ora-1555 after 9 update ...  


SQL> ed
Wrote file afiedt.buf

  1  begin
  2          for i in ( select x,
  3      dbms_rowid.rowid_block_number(rowid) blk from t )
  4          loop
  5           dbms_output.put_line(' -------  '||i.blk);
  6            update t set y=y where rownum < 50 ;
  7                       commit;
  8          end loop;
  9* end;
SQL> /
-------  165
-------  166
-------  167
-------  168
-------  169
-------  170
-------  171
-------  172
-------  173
begin
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 13 with name "_SYSSMU13$"
too small
ORA-06512: at line 2

Thanks,
Js 



 

Tom Kyte

Followup  

March 17, 2004 - 11:35 am UTC

you have an undo tablespace.

it is 900 some blocks "big"

IT has many rollback segments in it.

Each of them is smaller than 900 blocks (much much smaller as the entire 900 some odd blocks are allocated to the undo tablespace and you have N rbs's in there -- like 2 at least I think and maybe more)

So, you do not have 900 some odd blocks available for a single transaction, but rather for all concurrent transactions and the biggest a single transaction can have cannot span the size of the rollback segments created therein. so that tells you why the update of all rows failed -- it used MORE undo than any single rbs in your undo tablespace had space for.




snapshot too old

March 25, 2004 - 5:41 pm UTC

Reviewer: reader from Southborough, MA

The topic has been discussed extremly well.

A Problem

April 19, 2004 - 2:00 pm UTC

Reviewer: Anirudh from NJ, USA

Hi Tom,
I am facing a problem with one of my scripts. The query in question is following:
Table product_doument_au (id_prod number, doc_class_cd char(3), document LONG)

SELECT
decode(DOC_CLASS_CD,'PRS','PRL',DOC_CLASS_CD),DOCUMENT
FROM
PRODUCT_DOCUMENT_AU
WHERE DOC_CLASS_CD IN ( 'UKD','ANO','DES','BAP','BCV','FEA',
'NTE','BUL','TOC','RVW','PRE','PRS',
'WEB','URL','ABI','AUD','TEC','ULE' , 'LDI'
) AND
ID_PROD=:p1

------------
Id_prod gets passed in the loop. The process is running through a perl code and product_document_au table has appx 300,000 records out of which 3/4th recs are fetched.
This script throws me a snapshot too old err.
When asked the DBA he says we need to fine tune the query and that the RBS is well sized. Please tell me whether in any way we can fine tune this query (I really am doubtfull)..or we need to make it strict to the DBA to increase the size of RBS. Also the time at which this process runs , no other process is accessing/altering the table.
Plz clarify
Thanks,
Anirudh

Tom Kyte

Followup  

April 19, 2004 - 3:09 pm UTC

why are you doing this in a loop? why not a single query?

pulling 75% of 300,000 records is "not alot" -- how long is this taking?

A Problem contd...

April 19, 2004 - 7:32 pm UTC

Reviewer: Anirudh from NJ, USA

Hi Tom,
This process takes one hr to complete. This is written in perl and the functionality is to query and write the results to a file. The psuedo-code, if we go by, is
1. open a file
2. query the main table (select id_prod from product where ...). This query yields appx 250,000 records.
3. take the above id_prods one by one and pass it to another query (Original query as in my previous review) and as the records are fetched one by one write them to the file in the following datatype format:- NUMBER|CHAR|LONG
This process as a whole takes 1 hr to complete.
Please let me know if anything else is needed

Thanks,
Anirudh

Tom Kyte

Followup  

April 19, 2004 - 10:54 pm UTC

like I said -- sounds like a single query to me.

anytime I see

#2
#3

it means "you should have a single query".

And if it takes an hour to dump 350k records, you might have a problem with a "slow client". You might want to look at C or something faster in this case for a data dumper.

</code> http://asktom.oracle.com/~tkyte/flat/index.html <code>

would not take too much work to make that work for you (and just just a SINGLE query, that is all it takes, let the database join).

Can a create table made to "SnapShot Too Old" Error

April 26, 2004 - 6:41 am UTC

Reviewer: Sushanta Panda from INDIA

Tom,
Can this statement lead to "Snapshot too old error"
---------------------------------------------
CREATE TABLE T AS SELECT * FROM EQ_SELECT;
---------------------------------------------
In what ever be the Condition (Rollback segment is small
or there is a massive update and delete on the EQ_SELECT
or whatever be the condition)

How the statement
CREATE TABLE T AS SELECT * FROM EQ_SELECT;
is processing (Internally), is the whole of thedata is
loaded into the memory (from the table EQ_SELECT and then
write the contents into the hard disk and is the select
statement is searching the data in the ROLLBACK Segment
if at the time of select it fond that the data is
modified ???

What should be the solution to copy a table so that
"SNAPSHOT TOO OLD" cannot be occured in any time.

Regards
Sushanta

Tom Kyte

Followup  

April 26, 2004 - 6:58 am UTC

sure, anything that READS is subject to read consistency.

just as select * from t can ora-1555, create table tt as SELECT * FROM T can ora-1555 for the same reason.

avoiding the ora-1555 is as simple as sizing rbs properly. use undo-retention and AUM in 9i, size right in 8i and before.

READ CONSISTENCY FOR CLEANED OUT BLOCKS

May 19, 2004 - 5:22 pm UTC

Reviewer: Sri from MI

Tom,
I have been trying to hunt answer for this.
If a block has been cleaned out, then the next
transaction requiring the older version of the block
will have no way of knowing which RBS contains the rollback info ? How will the system build a read consistent image then ? Will it scan the transaction table of all the rollback segments ? Consider the following.
A query Q1 beigins at SCN 100 and is doing full tablescan.
An update transaction updates the last block of the table
and its commit SCN is 101. Also since only one block is updated the same TXN cleans out the block (< 10% of buffer blocks , say).
The query Q1 comes to the last block and finds the SCN to be 101. It now needs rollback info. SInce the block is cleaned out the transaction header no longer has rollback info ? (Is it so ?)
In order to rollback to 100, now the system has to scan the transaction tables of all RBS.
Could you please explain what really happens ?


Tom Kyte

Followup  

May 19, 2004 - 6:15 pm UTC

it doesn't need the entire transaction history, just the last transaction.

it'll walk the chain from there. Just need to get to the "last change"

How does it last transaction details ?

May 19, 2004 - 10:06 pm UTC

Reviewer: Sri from MI

Tom,

Thanks for a quick response. But still I am not clear about how it builds the consistent image ? because the block is cleaned out. So it does not contain the pointer to rollback segment header containing the undo info. Does it have to scan the transaction tables of all the rollback segments to determine which rollback segment contains the undo info ?

Tom Kyte

Followup  

May 20, 2004 - 10:20 am UTC

the last stuff is always there on the block. it is marked "this is committed" instead of "this might not be committed" is all during the cleanout.

But our query needs the block as of earlier SCN

May 20, 2004 - 6:03 pm UTC

Reviewer: Sri from MI

Tom,

Sorry for following up on this Sir. But our query needs
committed info prior to the most recent commit. Following is my question which I am pasting again.

"A query Q1 beigins at SCN 100 and is doing full tablescan.
An update transaction updates the last block of the table
and its commit SCN is 101. Also since only one block is updated the same TXN cleans out the block (< 10% of buffer blocks , say).
The query Q1 comes to the last block and finds the SCN to be 101. It now needs rollback info (because we need the block as of SCN 100). Since the block is cleaned out the transaction header no longer has rollback info (Is it so ?)

In order to rollback the block to SCN 100, now the system has to scan the transaction tables of all RBS. Is this what happens ?"

Sir appreciate if you could provide your feedback assuming my question is clear now.




Tom Kyte

Followup  

May 20, 2004 - 8:46 pm UTC

system just has to walk the rbs chain. last transaction -> that block shows us -> next to last -> that block shows us -> next to next to last and so on.


no, it does not have to scan the transaction tables of all.


run this:

set echo on
clear screen

drop table t;

create table t ( x int, y int ) tablespace users;
insert into t values (1,1);
commit;
select * from t;
pause
clear screen


variable a refcursor
variable b refcursor
variable c refcursor

alter session set events '10046 trace name context forever, level 12';
begin
open :a for select * from t a;
open :b for select * from t b;
open :c for select * from t c;
end;
/
pause
clear screen
print a
pause
clear screen

begin
for i in 1 .. 10000
loop
update t set x = x+1;
commit;
end loop;
end;
/

pause
clear screen
print b
update t set x = x+1;
commit;

print c
pause
clear screen



tkprof the results and see what you see.

Questions about hmmmmm.....everything

July 02, 2004 - 12:00 am UTC

Reviewer: Rahul Dixit from New Zealand

Hi Tom,

From Oracle Concepts Manual :-
"Oracle automatically provides read consistency to a query so that all the data that the query sees comes from a single point in time (statement-level read consistency). Oracle can also provide read consistency to all of the queries in a transaction (transaction-level read consistency)."

Now basically in a pl/sql bock like the following where emp table has say- 10000000 records, and is using a small rollback segment.(like in your book - 1 on 1).Present SCN - 50.

for i in (select empid from emp where deptno = 10) loop
update emp set sal= sal + 100 where empid = i.empid;
commit;
end loop;
end;

This can also be written as :-

for i in (select empid from emp where deptno = 10) loop
update
(select empid from emp where empid = i.empid)
set sal = sal + 100;
commit; -- ends transaction.
end loop;
end;

1).Here each transaction (from what I understand ) will end on the commit.According to Oracle commit will always end a transaction.The first loop end's the transaction and the SCN is allocated.
That means this block does 10000000 transactions.Right ? and also new 10000000 SCN's ?

2).Now that means the cursor (select empid from emp where deptno = 10) after the first loop
will try to get a consistent image as of SCN 50 or SCN 51? If 50 then why? After all the first transaction
is already over.Why would it need the old image now that the row is updated and made permanent by the commit. Suppose the sal was 10 and is now 100 for empid = 1 then fine, that's what oracle does.
I don't understand why oracle would like the row to be shown again as sal = 10.
I would understand if the user wants to rollback then he would like to have a read-consistent view
and read from rollback segment but not in this case , Would he?

3).For knowledge sake - would the cursor be refreshed everytime in the loop or just stay the same ?
In short how does the cursor behave in this senerio?

Am a begineer but don't want to stay ignorent blissfully.
Thanks,
RD.



Tom Kyte

Followup  

July 02, 2004 - 8:51 am UTC

1) correct.

2) the query will be as of the scn in place when opened. if that scn was 50, it'll be 50.

You don't see the same row twice? I don't know where you are going with that concept. Once fetched, the row is fetched, we don't fetch it again.


think about this though -- what if the query was:

select empid from emp where deptno = 10 order by sal;

and say there was an index on (deptno,sal) and we used that to both find the records AND order the data.

you do your update of sal from 10 to 110. Now there is a new 110 in that index. As we are scanning through there -- WHAT WOULD HAPPEN IF WE SAW THAT NEW INDEX ENTRY? Well, you'd give the guy a 100 raise of course and chang 110 to 210. Can you spell infinite loop?

That is why we utilize read consistency there -- in order to not see things we should not see.


3) the cursor is not "refreshed", the cursors result set was pre-ordained at the moment it was opened. The data will be retrieved from disk/cache as needed -- using the rollback segments to avoid seeing data that was added/modified (or to see data that was deleted) since the result set was opened.

Few more of the many more ......

July 02, 2004 - 8:49 pm UTC

Reviewer: Rahul Dixit from New Zealand

Thanks Tom,

1).Now if there is no index(s) in my earlier pl/sql block then
should I take it for granted that there is no chance of the
dreaded ORA-01555? I can't think of a scenerio where it's
possible.
2). You said "the cursor is not "refreshed", the cursors result set was pre-ordained at the
moment it was opened. The data will be retrieved from disk/cache as needed --
using the rollback segments to avoid seeing data that was added/modified (or to
see data that was deleted) since the result set was opened."


Does it mean when the second update is about to happen then the cursor gets the data from the buffer/rollback segment (if necessary) then itself and fetched by the query part of the update? Why can't it just select the data and keep it in memory instead once and for all and keep reading from there itself? Afterall that's as read consistent as it gets
instead of reading multiple versions to get the data as of point in time.Also it would eleminate the ORA-01555 error emenating from the frequent commits.

Thanks in advance.

Regards.

Tom Kyte

Followup  

July 03, 2004 - 10:13 am UTC

1) </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>

example that

a) uses NO indexes
b) gets the 1555 on a read only tablespace....


if you have access to my book "Expert one on one Oracle" - I tried to beat this topic (1555) to a virtual death.

2) think about this: "select * from ten_billion_row_table". do you really want to pre-fetch all of that? How about this:

open cursor 'select * from big_table';
fetch just_the_first_row
close cursor

do you really want to pre-fetch all of that. It is just not reasonable nor feasible to pre-fetch all of the data.

"think big", think bigger than emp and dept. think "page through results on the web" for example (get me the first 10 rows ASAP from this 1,000,000 row result set). think bigger than 10 rows.l

July 09, 2004 - 1:31 pm UTC

Reviewer: A reader

Tom,

I am getting SNAPSHOT TOO OLD error while analyzing the tables. The program that analyzes the table runs in a loop by analying one table at a time.
I got the error three times so far.
First time, I added a new file to UNDO tablespace thinking the space might be problem. Second time I decreased the undo_retention parameter from 43k to 3600. This time I don't know what to do.

The tables are analyzed using DBMS_STATS by setting CASCADE=>true.

Any clue why this error during analyzing the table?

thanks,

Tom Kyte

Followup  

July 09, 2004 - 1:58 pm UTC

is the snapshot too old happening on the query itself -- the one generating the list of tablenames - or dbms_stats.


you increased retention from 43,000 to 3,600?

July 09, 2004 - 2:14 pm UTC

Reviewer: A reader

I decreased the undo_retention from 43,000 to 3,600.

I saw the alert log file on ../bdump folder, it mentioned

Fri Jul 9 14:03:23 2004
ORA-01555 caused by SQL statement below (Query Duration=2313981 sec, SCN: 0x0001.d5b2f2b3):
Fri Jul 9 14:03:23 2004
SELECT table_name
FROM user_tables

In my package the cursor to read the table is defined like the above SQL statement. Within a loop, I execute dbms_stats by passing in the table name which I get from cursor.

Any clue now?



Tom Kyte

Followup  

July 09, 2004 - 4:08 pm UTC

by decreasing the undo retention, you INCREASED the probability of a 1555 on a long running operation.

you can check your v$ undo stats view to see if extents are expired "prematurely".

you want your undo retention to be longer than the time it takes for you to run a process your longest running query.

If this user has a reasonable number of tables, you could just bulk collect the table names into an array and use that.

or, simply use gather schema stats.

July 09, 2004 - 3:49 pm UTC

Reviewer: A reader

Accordng to my previous reply, do you think increasing the undo_retention from 3600 is a good option or have you seen such problem when running dbms_stats package?

Please let me know


Tom Kyte

Followup  

July 09, 2004 - 4:17 pm UTC

it is not dbms_stats -- it is the fact that you are running a query for a long long time

for x in (select * from user_tables ) <<<===
loop
do something that takes a long time


that query is running for hours perhaps. Oracle needs to have a consistent read on that query. Oracle needs rollback to do that consistent read. You are reuseing the rollback at some point.

increasing undo retention is one method.

array fetching out the table names (making the query run for a short time) is another.

using gather schema stats is (easiest) another..


Urgent - help needed

July 29, 2004 - 4:18 pm UTC

Reviewer: A reader

Hi Tom,

We have an environment where a VC++ application calls a few procedures in Oracle. We are getting an error
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old
when we execute this application. This error has been happening only from yesterday before which it has been executing successfully.

We set "alter session set sql_trace=TRUE" in the logon trigger but everything seems to be ok in the trace file. We added a few insert debug statements in our procedure and we cannot pinpoint the statement which produces this error.

I am just giving a few statements which are performed before this error is arised.

Stmt 1 in procedure 1:

select templateid, templatetype from template
where bio1id = 4147971;

Stmt 2 in procedure 2:

select bio1id, workstationid, imagesize from bio1
where bio1id = 4147971;

Stmt 3 in procedure 3:

select image from bio1
where bio1id = 4147971;

We want to know which statement is causing this error (ora-01555) and how to solve it.

Pls help.

Thanks.

Tom Kyte

Followup  

July 29, 2004 - 4:22 pm UTC

the error stack should have the exact line number of the statement in it.

1555 is caused by one thing -- rbs is "too small", did your DBA change something recently.

1555 after bulk loading

October 11, 2004 - 12:54 am UTC

Reviewer: abhi from india

Hi,

We are loading daily 4 gb data using external tables
insert /*+APPEND*/ into main_tab select * from external_tab
We load data at night and in morning.. our users query this table for reporting purpose.And they are frequently
getting snapshot too old. Out undo tablespace size is 10gb and retention period 3 hrs. There is not other updation in
database.

Could you pls. explain the reason and solution.

Thanks

Tom Kyte

Followup  

October 11, 2004 - 8:05 am UTC

if you think the database isn't doing transactions - you would be wrong, there are tons of background processes, pmon, smon, jobs, aq, etc -- transactions are happening all of the time here.....

it is probably happening on the indexes of this table, since the insert itself generates no undo for the table, but the indexes always do.

Soooo. how long is your longest running query here?

and are you seeing extents getting expired before their time? (read about v$undostat or search for it here to see examples) that is, is 10gig sufficient for your transactional load (your insert) AND your queries at the same time.... might well not be -- 4gig of data -- the undo for the indexes could be quite literally "huge"


For example, a modest set of three indexes:


big_table@ORA9IR2> exec show_space( 'BIG_TABLE' );
Free Blocks............................. 0
Total Blocks............................ 15,360
Total Bytes............................. 125,829,120
Total MBytes............................ 120
Unused Blocks........................... 997
Unused Bytes............................ 8,167,424
Last Used Ext FileId.................... 9
Last Used Ext BlockId................... 14,345
Last Used Block......................... 27

PL/SQL procedure successfully completed.

big_table@ORA9IR2>
big_table@ORA9IR2> set echo off
Wrote file /tmp/xtmpx.sql
Datatypes for Table big_table

Data Data
Column Name Type Length Nullable
------------------------------ -------------------- ---------- --------
ID NUMBER not null
OWNER VARCHAR2 30 not null
OBJECT_NAME VARCHAR2 30 not null
SUBOBJECT_NAME VARCHAR2 30 null
OBJECT_ID NUMBER not null
DATA_OBJECT_ID NUMBER null
OBJECT_TYPE VARCHAR2 18 null
CREATED DATE 7 not null
LAST_DDL_TIME DATE 7 not null
TIMESTAMP VARCHAR2 19 null
STATUS VARCHAR2 7 null
TEMPORARY VARCHAR2 1 null
GENERATED VARCHAR2 1 null
SECONDARY VARCHAR2 1 null


Indexes on big_table

Index Is
Name Unique COLUMNS
------------------------------ ------ --------------------------------
T_IDX1 No OWNER, OBJECT_TYPE, OBJECT_NAME
T_IDX2 No LAST_DDL_TIME
BIG_TABLE_PK Yes ID


Triggers on big_table
big_table@ORA9IR2>
big_table@ORA9IR2> insert /*+ append */
2 into big_table
3 select -id, owner, object_name, subobject_name, object_id,
4 data_object_id, object_type, created, last_ddl_time,
5 timestamp, status, temporary, generated, secondary
6 from big_table;

1000000 rows created.

big_table@ORA9IR2>
big_table@ORA9IR2> select used_ublk*8/1024 from v$transaction;

USED_UBLK*8/1024
----------------
83.2421875

big_table@ORA9IR2> exec show_space( 'BIG_TABLE' );
Free Blocks............................. 0
Total Blocks............................ 29,696
Total Bytes............................. 243,269,632
Total MBytes............................ 232
Unused Blocks........................... 15,333
Unused Bytes............................ 125,607,936
Last Used Ext FileId.................... 9
Last Used Ext BlockId................... 14,345
Last Used Block......................... 27

PL/SQL procedure successfully completed.

so, about 110meg of new data -- 85meg of undo. Add a couple more indexes -- or just indexes that split lots -- and this would go up and up and up...



Re:

October 12, 2004 - 5:04 am UTC

Reviewer: Abhi

Hi,
Thanks for you reply.
We load data only in night. It takes only 2+ hour. We have
two big ( partitioned ) table. Two sessions run their query
in morning 9 am to 6 pm. At this time there is no other
activity except query.

They spool the data in file. The longest query takes 40 minutes to 1 hour and other query takes 20 minutes.

Sir, Is this due to " Delayed block cleaning"
Undo/Redo due to select ) as you described in
"Expert one on one"


Thanks,
Abhi




Tom Kyte

Followup  

October 12, 2004 - 8:14 am UTC

you said above though "at night and in the morning".


so, are you sure? are you sure there isn't some program you are not aware of that is doing lots of transactional stuff (that is the cause of the delayed stuff, lots of commits).

Re:

October 12, 2004 - 11:53 am UTC

Reviewer: Abhi from India

Yes ..

I am 100% sure .. there is no other transaction except two query sessions. One more thing I would like to tell you ..
We load data using APPEND hint

Insert /*+APPEND*/ into table select * from external_tab.

Thanx,
Abhi Malhotra

Tom Kyte

Followup  

October 12, 2004 - 1:11 pm UTC

so, see how many commits you've done/are doing during that period. a simple query against the v$tables will tell you that.

assuming table has an index, that is generating gobs and gobs of undo that will be used by queries. append skips it for the table blocks, not so for index.

how to find longest running query to avoid ORA-1555 error in undo management auto.

November 02, 2004 - 5:59 am UTC

Reviewer: SREENIVASA RAO from INDIA

Hi,

recently i faced ORA-1555 having auto undo management in my production with 9.2.0.1.0.So that one dbms_job failed and background process(ora_j000_sid) hanging over bounces of the database.So that i kept that job in broken status and rerun that job.After struggle for 24 hours , my production running fine.

I referred my favourite book EFFECTIVE DESIGN BY THOMAS KYTE.
Here i found to set my UNDO_RETENTION TO longest running query time and size accordingly.
1. but How to find longest running query in 100 sessions of my production for whole day.
2. am i right in resolving the failure of dbms_job or something effected now?

pls answer my 2 questions.
thanks in advance.



Tom Kyte

Followup  

November 02, 2004 - 8:44 am UTC

you can peek at elapsed time in v$sql before and after (subtract from after the before value) and divide by executions to get an "average".

you can monitor v$undostat (search for that on this site) to see how well you are "doing" in that area.


if you have a background ora-j000-sid "hanging over" across a restart, that is a problem, but I don't think you did.

i think you had a job, it was running, it failed with 1555, it got restarted. if you bounced the database, the database would just restart the job again, that is normal and expected.

November 12, 2004 - 6:14 pm UTC

Reviewer: Ravi

Tom,
Have read Oracle note:
</code> http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=40689.1 <code>

In that doc, at QENV50, I start a query that brings back following record

Empno ->50
City ->'Chicago' where empno is primary key
for Session 1

Session 2
Queries and updates Empno50 to make City->New York
say SCN51, Commits it really.

But Session 1 is not finished yet and lets' say it queries
Empno again, perhaps with a UNION in the select:

1)Will the City now be Chicago or New York

2) Is Oracle at this point trying to retrive Block 500 at QENV50 to get the Empno 50 record and is this exactly the Snapshot mechanism?

3)And is it at this precise point Ora-1555 occurs, if RBS 500.3 had been overwritten by Session 2 or any other session?

4) Now is another important question, assume its Session 1 which did the commit and SCN51, and the rollback was overwritten by other sessions.

a) Again, I try to retrieve Empno 50, Would Oracle NOT give ORA-1555 here? Because it is Read consistent, as it has the current correct value of New York which had been changed by Session 1.

b) In other words, for Session 1 to get Ora-1555 should it be retrieving a record in block 500 that is NOT for Empno 50?

Ravi

Tom Kyte

Followup  

November 12, 2004 - 8:51 pm UTC

1) if the query in session 1 was opened PRIOR to the commit by sessions 2 -- it'll see chicago.

if the query in session 1 was opened AFTER the commit by session 2 AND the session 1 isolation level is read committed, it'll see New York.

2) that is the read consistency mechanism, all rows retrieved by that query are "as of the same point in time"

3) 1555's happen when a query tried to get a read consistent version of a block, and the undo needed to do that doesn't exist anymore (yes in short)

4) does not matter WHO overwrote the rbs data (it is overwritten after all, we don't know by who really -- just that it IS).

Ravi

November 14, 2004 - 12:46 pm UTC

Reviewer: Ravi from Edinburgh, UK

Tom,

create table test (empno number(10) primary key, city varchar2(100));

insert into test values(50,'New York');
commit;


begin
for i in (Select city,empno from test UNION ALL select city, empno from test) loop
dbms_output.put_line(i.city);
Update test b set city = decode(i.city,'New York','Chicago','Chicago','Miami','X')
where empno = i.empno;
commit;
end loop;
end;
/

I’ve tested above to find New York being output, but assuming we introduce a monstrous delay between the first and the second record, can you confirm

1) If the decode statement were to be changed from decode(i.city,… to decode(b.city, the city gets changed to Miami. Is this because Update retrieves record in “Current” gets?
2) And if the code was as it is the end result after running the above PL/SQL would be a value of Chicago in test.emp, is this the “Consistent” mode of operation?
3) Behind the scenes, the Block where the record is modified and the Rollback Segment moves from “Active” to “Commit” after the first update and commit in the loop. During the second record, Oracle actually, gets the data from the Rollback Segment, and temporarily gets a Snapshot (to change it from Chicago to New York) and display “New York”?
4) And finally, when the update statement in the second iteration is reached, Update again gets to see the Current state of city, which is Chicago and updates it to Miami?

Tom, can you confirm how much of the four points of mechanism (all my assumptions) are true?

Cheers

Ravi


Tom Kyte

Followup  

November 14, 2004 - 1:03 pm UTC

the results of

for i in (Select city,empno from test UNION ALL select city, empno from test)

are preordained at the point in time that query is first opened. I don't care how many minutes, hours, whatever you wait between fetches, every row that comes from that query will come from the database as of the point in time the query began (in read committed, serializable/readonly make it as of the point in time the transaction began)

The UPDATE follows the same rules - but it is run "as of a different point in time" from the query -- and could therefore see totally different data than the query did. Heck, the row the query fetched might well have been DELETED by some other session -- the update would not see it.


Your code is an example of "lost update" type of processing. You are reading information from the database -- making some decisions based on it (you don't in this example, but you would be in real life) and then blindly updating the data in the database -- without checking to see if someone else may have modified it as well.


If you have my book "Expert One on One Oracle" -- i cover these topics and more in the first 1/3 of the book -- concurrency control, locking, blocking, deadlocks, redo, rollback, transaction control, and so on.


1) no, it wasn't really the current get in as much as it was the fact that the update runs at a different point in time than the query. The consistent read on that update saw the change made by the first update.

2) this is a bad example for many reasons -- the lost update being one of them (bad coding practice).

3) Oracle may well have performed a read asided to the rollback segment in the first query -- yes, depends on which version (but not because consistent read processing changed -- but rather because plsql pre-fetches in 10g, but that is another story...)

4) the update would see the modification in both its consistent read AND current mode read (assuming no one else is playing with the record) because of the point in time at which the query is executed (the update)

November 14, 2004 - 3:13 pm UTC

Reviewer: A reader

By point on your last feedback, when you said the Update sees both Current and Consistent, did you mean that

Update emp i set city =
decode(i.city,'New York','Chicago','Chicago','Miami','X')
where
(A rather long running query...)

1) The update sees the query results in consistent mode, ie at time of the start of the query?

2)Say if this query runs for 10 minutes and the value of City had changed to 'Chicago' by this time, the Update sees its Current value of Chicago and not New York (which was at the start of the Query)?

3) In the example of your feedback, assuming version of Oracle is 8.1.7, would the Rollback Segment slot become 'COMMITTED' and lets say the block cleanout hadn't happened, and the block has a modified value of 'Chicago'

Would Oracle now use the Rollback Segment slot ('COMMITTED') whose value I guess could be 'New York' ?

Tom Kyte

Followup  

November 14, 2004 - 3:58 pm UTC

1) the update has two parts to it

a) a query "where ......"
b) a modification "set ...."


the query is done in read consistent mode, we find the rows using read consistency, upon finding a row you want to update -- we get that block in current mode in order to update the most recent version. It gets very interesting when the current mode read differs in values from the consistent read. see
</code> http://asktom.oracle.com/~tkyte/wc.html <code>
for details on what I mean by that.

2) see #1 and read the link.

3) block clean out and read consistency are not related really -- the same thing would happen regardless of block cleanout's or not -- you would get the read consistent version of the block and if that meant "you gotta roll back -- undo the changes" -- well, we would.

fetch across commit

November 24, 2004 - 12:31 pm UTC

Reviewer: tester from usa

Great thread, I have a question on it.
We have package a which has 1 procedure and 2 functions.
procedure a
cursor a select from taba
open a loop fetch
call function 1;
end loop;

function 1
cursor b select from tabb
open b loop fetch
call function 2
end loop

function 2
insert into tabc
commit

We're getting 1555's. Is this a case of fetch across commit, even though tabc (the committed table) isn't in
the orirginal cursors?

Basically, cursor select, open another cursor on another table, cursor select and insert and commit into yet another table.


Tom Kyte

Followup  

November 24, 2004 - 1:46 pm UTC

inline the code and you get:

for X in a
loop
for Y in b
loop
insert into table...
commit;
end loop
end loop


so yes, you are not only fetching over a cursor (two of them) causing a very much self inflicted ora-1555, you probably are not "restartable" either -- meaning when you get the 1555 you are in a whole heap of trouble since you cannot really "pick up" where you left off (eg: transactional integrity = 0)

the only CAUSE of a 1555 is improperly sized rollback segments.

December 14, 2004 - 11:29 am UTC

Reviewer: Nameless Non-DBA from Dublin, IR

Tom, you state that "the only CAUSE of a 1555 is improperly sized rollback segments." Our "corporate" DBAs (and I use the term loosely) responded to our request to resolve a 1555 error with the following:

"ERROR at line 4:
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old
ORA-02063: preceding 2 lines from DBPRD

There is not a way to fix the Snapshot too old error, This error does not
give a rollback in the error message.
This data is corrupt, and it thinks the data is in the rollback segment, but
it goes to check the rollback segment and the segment is not a valid segment
(because this segment is long aged out of the system). That is why there is
no rollback segment name in the error message.

Plus you (the developers) can pull out Documents of 30 MB without getting
this error. Therefore it is not a resizing problem. It is a data corruption
problem. So no amount of resizing a rollback segments will fix this problem.
In my opinion , these BLOBS are corrupt and can not be retrieved."

I don't buy it and your statement is also to the contrary of this DBA. Is there any other way that 1555 can be caused? Mis-sized PCTVERSION perhaps? Corrupted BLOBs?

Tom Kyte

Followup  

December 15, 2004 - 12:31 pm UTC

is that from a distributed query?
The RBS number isn't passed back

but not to worry -- because the probability of a 1555 is directly proportional to the size of the SMALLEST RBS segment *and* all RBS's (manual RBS) should be the same size anyway -- so if ONE is too small, they ALL are too small.


If these are blobs that are raising this then the DBA's need to check out "pctversion" -- blobs are versioned in the blob segment itself, THEY control this as well -- it is up to them.


(so I guess there is another way to get the 1555 -- this is true, LOBS with insufficient pctversion set)

thanks for the clarification....

snapshot too old error

December 19, 2004 - 6:58 am UTC

Reviewer: Vithal from India


Hi Tom,

I had a question, If i put set transaction use rollback segment <rbs_name> and that is also in a loop is there any possibilities to get this error.
Ex;
declare
cursor a is select * from emp;
cnt number:=0;
begin
for a_rec in a
loop
insert into emp1
values(a_rec.empno,a_rec.ename);
cnt := cnt + 1;

if cnt = 1000 then
commit;
set transaction use rollback segment rbs_1
end if;

end loop;
end;

can u please update me for this.

Thanks

Tom Kyte

Followup  

December 19, 2004 - 11:39 am UTC

yes. absolutely.

especially if you do that, you'll most likely INCREASE the change of the 1555!

that code should be nothing more than:

insert into emp1 select * from emp;

period.

Why ORA-1555 can't be addressed by 'set transaction use rollback segment'

December 19, 2004 - 4:09 pm UTC

Reviewer: Mark J. Bobak from Belleville, MI

To further clarify, ORA-1555 can NEVER be addressed by 'set transaction use rollback segment xxx'. To think that it can, is a misconception.

See:
</code> http://www.jlcomp.demon.co.uk/faq/settrans.html <code>
for a full explanation.

-Mark

ORA-01555

December 22, 2004 - 1:20 am UTC

Reviewer: Vithal from India

Thanks Tom/Mark

In Practical exp.. I am doing some transaction and gating the data from around 11 tables which contains more then 34 million records and after filtering though I am gating 7 minions records and every time I am getting the same error(ORA-01555) so I tried to use a rollback segment which has a more space and now the problem is been solved but at the time of transaction I tried to see how many rollback segment is using currently by the transaction and it will increase up to one level and will decrease zero and when i am heating a select query at the same time for the same table I am getting the ORA-01555 error, as per the oracle doc. the select query is not using the rollback segment then why I am getting the error. and also I goon through one book where they mentioned after completion of the traction if I am heating the same rollback segment then it will reallocate..is it some thing wrong...

Tom Kyte

Followup  

December 22, 2004 - 8:47 am UTC

your probability of 1555 is proportional to the size of your smallest RBS.


You are missing this -- 1555 happens to the QUERY, it is not happening to the modificatoin. use rollback segment affects the modification -- and will IN FACT MASSIVELY INCREASE the chance of the 1555!!!!!!


the select has to use the RBS's where the undo it needs is -- that would be ALL rbs's!


You need to

a) stop using "use rollback segment"
b) permanently increase the size of all RBS's

Need Clarification

January 04, 2005 - 12:47 pm UTC

Reviewer: Reader from Detroit

Tom ~ I got this error for the first time today on a procedure that's been running for a few weeks without a problem. It seems similar to your examples.

create table completed_files (file_name varchar2(8), dir_cd number, completed varchar2(1));

insert into completed_files (file_name, dir_cd)
(select user_id from user_info);
commit;

-- 500 records are created with distinct file_names and all are within 14 directories.
-- example records:
file_x1, 1, NULL
file_x2, 1, NULL
file_x3, 2, NULL
file_x4, 3, NULL
...

<session 1>
BEGIN
OPEN l_user_cursor FOR SELECT file_name from completed_files where dir_cd = 1;

LOOP

FETCH l_user_cursor INTO l_user_info;
EXIT WHEN l_user_cursor%NOTFOUND;
--open a file, write data to file, close file.

update completed_files
set completed = 'Y'
where file_name = l_user_info.file_name;
COMMIT;
END LOOP;

<session 2>
-- does the same as session 1 but with a different directory in the cursor. These procedures would be running concurrently.

Why do I get the snapshot too old error on this?

Tom Kyte

Followup  

January 05, 2005 - 8:53 am UTC

search this site for 1555 and read

see "Expert one on one Oracle" -- I wrote tons about that.


commit in cursor for loop -> 1555 easily.


You are modifying the table you are reading. You are committing the modifications. Once you commit the changes, you are saying "it is OK to overwrite my UNDO now -- go for it". Unfortunately for you -- your long running query NEEDS that undo you just released for reuse.

In other words, you just shot yourself in your own foot.

You have insufficient UNDO configured
AND you are committing in a cursor for loop.


but...

January 06, 2005 - 10:32 am UTC

Reviewer: Lee from Detroit

How can I keep track of the all the files that I have created then without committing in the loop?

Can I update a different table that I'm not pulling from and commit inside the loop? or is committing inside a loop just a very bad practice?

I need to be able to open another session and say:

Select directory_cd, count(file_name) from table_name
where completed is NULL;

dir_cd count
------ -----
5 15
6 22
7 60

If I ran that same SQL 2 minutes later, the counts would be decremented and I could easily tell how many files are left to be created.

Also, If the server crashes or I get the 01555 (like I did last week), I can easily start up where I left off because I know how many files I have created so far.

Tom Kyte

Followup  

January 06, 2005 - 11:26 am UTC

commit when you are done creating files.

use dbms_application_info.set_session_longops to let you monitor a job from another session -- don't "commit" to do that.  search this site or see "Expert One on One Oracle" for details on dbms_application_info.  every program should call that stuff (set client info, set action, set module at the very least and set session longops for long running processes)


basically -- if you sized your RBS correctly for the work you do, the commit in the cursor for loop wouldn't hurt you either.  


and if you don't mind getting the 1555, just program for it, when it happens, catch it and start over.

or, run more, but shorter lived queries:


ops$tkyte@ORA9IR2> declare
  2      l_rid rowid;
  3      l_data t.x%type;
  4  begin
  5      loop
  6      begin
  7          select rowid, x into l_rid, l_data
  8            from t
  9           where completed_flag = 'N'
 10             and rownum = 1
 11             for update;
 12      exception
 13          when no_data_found then EXIT;
 14      end;
 15
 16          -- process, call dbms_application_info here too!
 17
 18          update t
 19             set completed_flag = 'Y'
 20           where rowid = l_rid;
 21
 22          commit;
 23      end loop;
 24  end;
 25  /
 
PL/SQL procedure successfully completed.


not much of a 1555 chance there and easily "restartable" 

how to minimize the undo generation (DB Ver. 920)

January 07, 2005 - 11:55 am UTC

Reviewer: Jianhui from VA

Tom,
In data warehouse load process,

1)any tips for developers to minimize undo generation? Such as using truncate instead of delete when it's possible.

2)Will frequent commit save some space in undo(i.e commit per xxxx records instead of whole load, etc).

Thank you.

Tom Kyte

Followup  

January 08, 2005 - 3:40 pm UTC

1) use operations that can bypass undo:

truncate
insert /*+ append */
disable indexes, rebuild afterwards...


2) laughing out loud. frequent commits do precisely this:

a) make you run slower
b) generate MORE undo
c) generate MORE redo
d) break your transactional integrity.

hmm, lets see -- anything good to say about frequent commits

<insert large silence right here>


Nope, nothing.

not like what you say, frequent commit and undo tablespace usage

January 10, 2005 - 12:29 pm UTC

Reviewer: jianhui from VA

<<
2) laughing out loud.  frequent commits do precisely this:
a) make you run slower
b) generate MORE undo
c) generate MORE redo
d) break your transactional integrity.
>>

please dont laugh,

a) agree, but goal is to limit undo tablespace usage, performance is acceptable.
b) agree,
but there are two concepts need to clearify
1) total undo generated by the process
2) total undo tablespace needed by the process
your comment applies to case 1), yes ,total undo is bigger by frequent commit, but what we care about is case 2), total undo tablespace needed is smaller.
frequent commit can reuse undo, while 1 commit can not reuse undo until the end, this way the total UNDO tablespace size can be smaller although the generated undo data is larger. i.e total undo usage is 10G, I have undo tablespace 5G, single commit wont be able to fit the space, but 10 commits will do, althougt 10 commits could generate 10+G in total, but each commit only consumes approx. 1G, subsequent commits can reuse the space as well.

Here is my test example, although i did it in 10G but it should be enough to prove the point for this case

SQL>ed
Wrote file afiedt.buf

  1  select tablespace_name, sum(bytes)/1024/1024 mb
  2  from dba_data_files
  3  where tablespace_name='UNDO'
  4* group by tablespace_name
SQL>/

TABLESPACE_NAME                        MB
------------------------------ ----------
UNDO                                   20

SQL>alter system set undo_tablespace='undo';

System altered.

SQL>desc t
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C                                                  CHAR(2000)
 I                                                  NUMBER(38)

SQL>analyze table t;
analyze table t
              *
ERROR at line 1:
ORA-01490: invalid ANALYZE command


SQL>analyze table t compute statistics;

Table analyzed.

SQL>select sum(blocks)*8/1024 mb from user_tables
  2  where table_name='T';

        MB
----------
39.7578125

SQL>select min(i), max(i), count(distinct i)
  2  from t;

    MIN(I)     MAX(I) COUNT(DISTINCTI)
---------- ---------- ----------------
         1      15000            15000

SQL>select count(*) from t; // comment: PK column i has distinct values from 1 to 15000

  COUNT(*)
----------
     15000

SQL>update t set c='one commit';
update t set c='one commit'
       *
ERROR at line 1:
ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDO'  // comment: 1 commit can not reuse undo


SQL>rollback;

Rollback complete.

// comment, let's do it in 100 small chunks (150 rows each)
SQL>begin
  2   for n in 1..100 loop
  3    update t set c='100 commits'
  4    where i>=(n-1)*150+1
  5    and i<=n*150;
  6    commit;
  7   end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.    // comment: 100 commits can reuse undo and succeeded

SQL>select distinct c from t;

C
----------------------------------------------------
100 commits

c) agree
d) we dont break the transaction to do the commit, it's batch data load process.
 

Tom Kyte

Followup  

January 10, 2005 - 1:59 pm UTC

a) why? why is the goal to limit undo tablespace usage. And if performance is just acceptable now, you'll make it go slower which falls into "unacceptable"



no so tech related explaination.

January 10, 2005 - 2:44 pm UTC

Reviewer: jianhui from VA

a) why? why is the goal to limit undo tablespace usage. And if performance is
just acceptable now, you'll make it go slower which falls into "unacceptable"

simple, not enough space available to increase the UNDO tablespace, so we sacrifice time for space. And time(performance) is still within acceptable range, just a trade off. Its better to get job done slower than not able to get job done at all, that's why. You would ask why dont i just increase the UNDO tablespace size, i 'd love to. A simple analogy can explain it, if a manager gives me 1000$ to do a project, i can do it either by means A faster but needs more than 1000$, or i can do it by means B slower but I can get it done, what would you do? I would get job done first by means B and let manager know if i had 1500$, i could get it done by means A and faster.

Tom Kyte

Followup  

January 10, 2005 - 4:14 pm UTC

just make sure you are restartable too, so that when you fail 25% of the way through, you can pick up where you left off "safely"

ORA-01555 "Snapshot too old" - Detailed Explanation

January 18, 2005 - 9:40 am UTC

Reviewer: Sudhanshu Gautam from mumbai

IT's clear whole funda of rollback segment and ORA-01555

ORA-1555 with COPY command

January 19, 2005 - 4:08 pm UTC

Reviewer: A reader

Hi,

I getting this error using SQL Plus COPY command. This command has to copy about 20 GB data (incl. LONG's)from table A to empty (new) table B. I get this error allways at the same point, when 17 GB of data is copied. Tried to increase initial size of the rbs, but doesn't help. COPY command looks like :

copy from <user_id>/<user_pw>@<DB> TO <user_id>/<user_pw>@<DB> insert table_B(a,b,c,d) using select <SEQ>.nextval,A.b,A.c,0 from table_A A;

I have 13 rbs within one .dbf file. Each rbs has initial size of 4 Mb and next extend of 512 kb.

Suggestions?

Thank You.



Tom Kyte

Followup  

January 20, 2005 - 10:05 am UTC

you have insufficient UNDO configured on the database you are reading from.


period. It is not the size the rbs can grow to -- it is the PERMANENTLY allocated size of the rbs that counts here.

they are too small, they are being reused -- wrapping around -- in less time than it take to run your queries.

Can "alter system switch logfile" solve the dealyed block cleanout problem?

February 03, 2005 - 6:19 am UTC

Reviewer: A reader

I wonder whether "alter system switch logfile" can solve the problem since it triggers checkpoint and it flush all the data from buffer to files.

Thanks

Tom Kyte

Followup  

February 03, 2005 - 1:46 pm UTC

that would increase the probability you have blocks clean out later.

not decrease it.

Why increase?

February 04, 2005 - 7:23 am UTC

Reviewer: A reader

Hi Tom,

Could you explain why using "alter system switch logfile" will increase the chance of getting the cleanout delay problem?

Thanks

Tom Kyte

Followup  

February 04, 2005 - 11:47 am UTC

initiates a checkpoint
checkpoint flushes blocks to disk
blocks get flushed before I can come back and clean them out on commit.


if they were in the buffer cache when I commit - I might be able to clean them, you flush them and I won't be able to.

February 21, 2005 - 7:15 am UTC

Reviewer: A reader

Hi Tom,

Does this mean that we can only perform the delay cleanout when the blocks are in buffer cache? If yes, could you explain why we cannot do the cleanout if the blocks are already on the disk? And how to solve the cleanout delay problem if the blocks have been flushed to disk?

Thanks
David

Tom Kyte

Followup  

February 21, 2005 - 11:00 am UTC

correct.

there is nothing to "fix", it is working as designed.

You can avoid dirty blocks by doing things like using direct path operations in your data warehousing processing (instead of slow by slow), but slow by slow updates to lots of data will necessarily cause this phenonmena to happen.

March 04, 2005 - 7:01 am UTC

Reviewer: A reader

Hi Tom,

Some questions regarding the delay cleanout.

1) If the blocks are on the disk, does this mean that the uncommitted change in the block's header is always in the header (assume no database shutdown)?

2) If another transaction want to update this data block later, does Oracle allow it to do so? Or return ORA-1555? If Oracle does not allow us to update this data block, does this mean that this data block can never be read again (assume no database shutdown) since you said there is nothing to "fix" this problem.

3) You mentioned that we can avoid dirty blocks (i.e. uncommitted change) by direct path operations. What are the direct path operations?

Thanks
David

Tom Kyte

Followup  

March 04, 2005 - 8:26 am UTC

1) in the block header is "stuff we use to manage the block", it is not really committed or uncommitted stuff -- it is just stuff.

2) sure it lets it do it. the database wouldn't even barely work otherwise.

1555 is not returned to a modification, 1555 is the exclusive domain of "reads" (the read component of an update might get a 1555 - the part that finds rows -- but the update component will not get 1555)


3) direct path loads (direct=y with sqlldr), insert /*+ append */ as select, create table as select, alter table t move;

things that bypass the buffer cache and just format data blocks and write them.

How can you pinpoint it?

March 17, 2005 - 5:30 am UTC

Reviewer: Dave Martin from The Hague, Netherlands

ORA-1555 is down to having not enough RBS's or too small RBS's.

How can you pinpoint which one it is?

Tom Kyte

Followup  

March 17, 2005 - 8:48 am UTC

the odds of a 1555 are directly proportional to the size of your smallest RBS.


You can either

a) make each of the N rbs's you have "bigger" or
b) add M more rbs's of the same size so you have N+M of them

either approach makes them not wrap around so fast. The goal is to have sufficient permanently configured undo available so as to not reuse it in the time it takes to run your longest running query.

You can achieve that in either way -- more rbs's or same number of rbs's only bigger.


Automatic Undo Management makes this much easier, you set the undo retention period instead of trying to figure out how many and how large.

How can you pinpoint it?

March 17, 2005 - 5:31 am UTC

Reviewer: Dave Martin from The Hague, Netherlands

ORA-1555 is down to having not enough RBS's or too small RBS's.

How can you pinpoint which one it is?

Snapshot too old error

March 17, 2005 - 2:11 pm UTC

Reviewer: veera from Gaithersburg

Tom,

I could not understand step 6 from your case study. Once we commit a transaction in step 5 we do not expect oracle to get data from the pre committed stage and we do not want also. I may be missing some point here can you please explain this why oracle needs precommitted data at step 6.

"CASE 1 - ROLLBACK OVERWRITTEN

This breaks down into two cases: another session overwriting the rollback that the current session requires or the case where the current session overwrites the rollback information that it requires. The latter is discussed in this article because this is usually the harder one to understand.

Steps:

1. Session 1 starts query at time T1 and QENV 50
2. Session 1 selects block B1 during this query
3. Session 1 updates the block at SCN 51
4. Session 1 does some other work that generates rollback information.
5. Session 1 commits the changes made in steps '3' and '4'.
(Now other transactions are free to overwrite this rollback information)
6. Session 1 revisits the same block B1 (perhaps for a different row).
Now, Oracle can see from the block's header that it has been changed and it is later than the required QENV (which was 50). Therefore we need to get an image of the block as of this QENV.

If an old enough version of the block can be found in the buffer cache then we will use this, otherwise we need to rollback the current block to generate another version of the block as at the required QENV.

It is under this condition that Oracle may not be able to get the required rollback information because Session 1's changes have generated rollback information that has overwritten it and returns the ORA-1555 error."



Tom Kyte

Followup  

March 17, 2005 - 2:23 pm UTC

for x in
( select rowid, t.*
from t
where indexed_column = <value> ) <<== this is read, the query in step 1
loop
-- steps 2 and 6 happen here --

update t set y = <something> where rowid = x.rowid; --- step 3
update another_t ....; --- step 4
commit; --- step 5
end loop;
/

since we probably red the table T via the index, we'll read

block B1
block B2
block B55
block B2
block B1
block B55
..........

we keep coming back to the block, but each time we come back we need the version of the block that existed WHEN OUR QUERY began -- but we cannot get that version anymore because the undo we generated on the first iteration is "gone"



Hmmmm

March 18, 2005 - 3:07 am UTC

Reviewer: Dave Martin from The Hague, Netherlands

Tom,
Surely it comes down to either:

1) Your undo is being overwritten due to your RBSs being too small - solution - make 'em bigger or make more RBS's

2) The old transaction slots are being overwritten in the RBS header - solution - more RBS's (increase the size of your existing RBSs wont help here)

Im tracking the curext (current extent) in v$rollstat to see for each RBS, how often it wraps around and uses that same curext -if its once or twice a week, its exceptional, so I have a feeling its case 2 - I just want to be able to pinpoint it.

Tom Kyte

Followup  

March 18, 2005 - 7:01 am UTC

(aum)

aum?

March 18, 2005 - 8:26 am UTC

Reviewer: Dave Martin from The Hague, Netherlands

FAIR WARNING
if you use 'IM' speak (you know, u wrt like this w plz and thnks), I will make fun of you. I will point out that it is not very professional or readable

(aum)?

AAAAHA Auto Undo Management.......
(OOOOOH - we have another three months under 8i before we go to 9i)

- are you telling me theres no easy way to see whether I need more RBSs or bigger ones? (with jolly old 8i?)

Tom Kyte

Followup  

March 18, 2005 - 10:04 am UTC

it was just a gentle hint (aum)


You'd have to know if you were getting block cleanout issues or just wraparounds. You can probably "know" that because block cleanouts would be an issue shortly after a large mass load or bulk update -- and then go away. 1555's would happen constantly.

Wouldn't caching of rows returned by a cursor help solve ORA-1555.

April 21, 2005 - 1:44 am UTC

Reviewer: Arun from Hyderabad, India

Hi Tom,

I am pretty new to PL/SQL, just about a month into it, but I’ve learnt a lot from this site. Thanks a ton.

Somehow when I read about cursors, I thought that on opening a cursor, the result set is fetched and cached somewhere. But it looks like my assumption is wrong, as some examples of ORA-1555 seem to depend on the fact that on looping around, the next row needs to fetched / recreated using undo logs from the table.

I have 2 questions:
1. Why aren’t the rows retrieved cached on opening the cursor?
2. What if the query involves a sort? Then all the rows have to be retrieved before returning the first row. Does that mean that ORA-1555 will not occur in this case?

Thanks and regards,

Arun


Tom Kyte

Followup  

April 21, 2005 - 6:14 am UTC

Here is an excerpt from Effective Oracle by Design that I wrote on this topic. (as for #1, think about "select * from one_billion_row_table", do you want that "cached"? somewhere)

(1)Queries from Start to Finish

In this section, we’ll look at two queries to observe some of the ways in which Oracle accomplishes the processing of a query. These two queries are both very simple, but they are handled differently:
o One query can start returning rows long before it ever gets near the last row.
o The other query must wait until many (or all) rows have been processed before returning a single record.

Both queries use BIG_TABLE, a table in which we have well over a million rows. We’ll query it twice—once without an ORDER BY and again with an ORDER BY—and compare the results.

Ask Tom
“How many rows of data will come back from my query?”
In general, Oracle doesn’t know until you actually fetch the last row of data how many rows will be returned. Oracle can guess (the AUTOTRACE facility shows us that guess), but it is nothing more than that.
End Ask Tom

(2)A Quick-Return Query

In this case, we will submit a rather simple block of PL/SQL code. It simply declares a cursor, opens it, fetches the first row, and closes that cursor.

alter session set sql_trace=true;
declare
cursor c is select * from big_table;
l_rec big_table%rowtype;
begin
open c;
fetch c into l_rec;
close c;
end;
/
Reviewing the TKPROF report from that execution, we would see something similar to this:

SELECT * from big_table

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----- ---------- ---------- ------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 196 4 0
Fetch 1 0.00 0.10 4 4 0 1
------- ------ -------- ---------- ----- ---------- ---------- ------
total 3 0.00 0.11 4 200 4 1

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL BIG_TABLE

As you can see, we got our first row back very quickly and performed very little I/O in doing so. It should be obvious from the report that Oracle doesn’t know yet how many rows we might be returning, as it could not have possibly looked at all of them. Here, Oracle is getting the data in response to our fetch calls; the more rows we read, the more work it does. We can see that by fetching more than just the first row.

Let’s use a modified block that executes five separate queries but fetches 1,000, 2,000, 3,000, 4,000, and then 5,000 rows.

declare
c sys_refcursor;
type array is table of big_table%rowtype index by binary_integer;
l_rec array;
begin
for i in 1 .. 5
loop
open c for select * from big_table;
fetch c bulk collect into l_rec limit i*1000;
close c;
end loop;
end;
/

TKPROF shows us the amount of work performed by each fetch (note – I used aggregate=NO on the TKPROF command line to get the separate FETCH results):

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----- ------- ------- ----------
Fetch 1 0.01 0.01 1 14 0 1000
Fetch 1 0.03 0.03 15 25 0 2000
Fetch 1 0.05 0.06 24 35 0 3000
Fetch 1 0.06 0.06 41 47 0 4000
Fetch 1 0.08 0.08 47 58 0 5000

Note: The use of the ref cursor was intentional in this example. My goal was to see the amount of work performed by five separate queries. If I had used standard static SQL in PL/SQL, PL/SQL would have cached my cursor for me, making it impossible for TKPROF to separate the statistics. Using this technique, I was able to use AGGREGATE = NO on the TKPROF command line in order to have TKPROF give me five sets of statistics.

So, the more data we asked for, the more work Oracle does. Unless and until we ask Oracle for the “last row” in our query, Oracle has no idea how many rows will be returned.

(2)A Slow-Return Query

Next, we’ll execute a query with an ORDER BY in it. We’ll simply add ORDER BY OWNER to the original example:

declare
c sys_refcursor;
type array is table of big_table%rowtype index by binary_integer;
l_rec array;
begin
for i in 1 .. 5
loop
open c for select * from big_table ORDER BY OWNER;
fetch c bulk collect into l_rec limit i*1000;
close c;
end loop;
end;

TKPROF now shows us the following:

SELECT * from big_table order by owner
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----- ------- ------- --------
Fetch 1 12.51 29.40 21858 22004 42 1000
Fetch 1 12.40 25.95 21851 22002 42 2000
Fetch 1 12.32 25.32 21859 22002 42 3000
Fetch 1 12.34 23.23 21866 22002 42 4000
Fetch 1 12.40 25.57 21859 22002 42 5000

This is very different from the other query. Here, Oracle read the entire table, sorted it, and then gave us the first rows back. We waited over 25 seconds to see that single, first row, as opposed to the instantaneous response we got with the previous version. Also of note is the fact that the amount of work, (CPU time and query mode block gets) performed by this query was more or less constant – even as the number of rows retrieved increased. This is as opposed to the prior example where the amount of CPU time and query mode gets increased as the amount of data fetched increased. This second case then is an example of a query where Oracle answered the entire question, and then started returning the results, rather than just returning results.

There are many cases where this will be true. For example, with aggregation, Oracle often answers the entire question and then returns the results. However, if Oracle can use an index to efficiently perform the aggregation without sorting, for example, you may find you can get the first row before Oracle generates the entire result set.

In this particular case, I did have an index on the OWNER column in the database. So, you might wonder why Oracle didn’t use that index. In this case, it is because Oracle had no idea we were interested in only the first rows; we didn’t tell Oracle that fact. We could let Oracle in on that particular detail, using a session setting or a hint in the query (see Chapter 6 for details on hints and when hinting might be an appropriate development technique).

declare
cursor c is select /*+ FIRST_ROWS(1) */ * from big_table order by owner;
l_rec big_table%rowtype;
begin
open c;
fetch c into l_rec;
close c;
end;
/

TKPROF now shows this report:
SELECT /*+ FIRST_ROWS(1) */ * from big_table order by owner

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----- ----- ---------- -----
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ----- ----- ---------- -----
total 3 0.00 0.00 0 4 0 1

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID BIG_TABLE
1 INDEX FULL SCAN BIG_TABLE_IDX

You can see we are back to the case where Oracle did not process the entire result set. However, before you got into the mindset that Oracle should have used the index, you would need to investigate what would happen in the case where you fetched all of the rows from the result set (a sensible assumption; you would typically fetch all of the data, not just a first row). When we do that, running both queries and exhausting their result sets, we discover this:
SELECT /*+ FIRST_ROWS(1) */ * from big_table order by owner

call count cpu elapsed disk query current rows
------- ------ ------ --------- ------- ---------- ------- --------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1833793 80.41 101.78 44386 3667790 0 1833792
------- ------ ------ --------- ------- ---------- ------- --------
total 1833795 80.41 101.78 44386 3667790 0 1833792

Rows Row Source Operation
------- ---------------------------------------------------
1833792 TABLE ACCESS BY INDEX ROWID BIG_TABLE
1833792 INDEX FULL SCAN BIG_TABLE_IDX

SELECT * from big_table order by owner

call count cpu elapsed disk query current rows
------- ------ ------ ------- ------- ------- ------- --------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 196 4 0
Fetch 1833793 60.77 81.26 40861 22002 31 1833792
------- ------ ------ ------- ------- ------- ------- --------
total 1833795 60.77 81.27 40861 22198 35 1833792

Rows Row Source Operation
------- ---------------------------------------------------
1833792 SORT ORDER BY
1833792 TABLE ACCESS FULL BIG_TABLE

Oracle did the right thing, in general.

No question’s a stupid question….

April 21, 2005 - 11:54 am UTC

Reviewer: Arun from Hyderabad, India

But maybe here’s one….

SELECT * from big_table order by owner
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ----- ------- ------- --------
Fetch 1 12.51 29.40 21858 22004 42 1000
Fetch 1 12.40 25.95 21851 22002 42 2000
Fetch 1 12.32 25.32 21859 22002 42 3000
Fetch 1 12.34 23.23 21866 22002 42 4000
Fetch 1 12.40 25.57 21859 22002 42 5000

Why was the first fetch of only 1000 rows so different from the others? I mean more cpu, elapsed, disk and query, when I’d expect the reverse.

Thanks for your prompt reply Tom.

Regards,
Arun


Tom Kyte

Followup  

April 22, 2005 - 9:02 am UTC

the CPU wasn't statistically different as far as I'm concerned.

but.... I was running on a cooked file system. With buffered unix IO. The PIO's (physical IO) probably were "true" PIO's the first time and not "true" PIO's the next times.

The physical IO's for the 2, 3, 4, 5th run of the query were hitting the unix buffer cache in all likely hood.

Thanks Tom

April 22, 2005 - 11:19 am UTC

Reviewer: A reader


May 05, 2005 - 6:12 am UTC

Reviewer: A reader

Hi Tom,

I have questions in the delayed block cleanout.

1) From Oracle metalink, Oracle suggested to prevent ORA-01555 due to delayed block cleanout, you can use select count(*) from table to force the block cleanout.

If I have a batch job that update a lot of records, and the modified data blocks do not revisit in long period of time. Does this mean that the modified block will always has a pointer to point the rollback segment?

2) If question 1 is yes, apart from revisit the block by using select count(*) from table, is there any events in Oracle that will make the block cleanout?

3) You mentioned that the block cleanout will only occured if the block is in buffer cache, does this mean that during the block flush to disk, Oracle will do the block cleanout?


Thanks
David

Tom Kyte

Followup  

May 05, 2005 - 7:34 am UTC

1) the blocks will look like there is a transaction in process against them. When we read it, we'll ask "is this transaction still in process" and then clean out the block. Now, after a short period of time -- you won't get the 1555 because the transaction you ask about is so old, that we just know it is over.

If you have Expert one on one Oracle, I cover this in great detail.

2) you could analyze (gather stats) the table, or avoid dirtying the block by using direct path operations.

3) no, that during the commit processing, Oracle will revisit the blocks in the cache. it is not a flush to disk process.


May 05, 2005 - 8:11 am UTC

Reviewer: A reader

Hi Tom,

Thanks for your answer. But for question 1, you said "Now, after a short period of time -- you won't get the 1555 because the transaction you ask about is so old, that we just know it is over". How Oracle determines the "short period of time" and knows it is so old and not need to visit the rollback segment? And is your answer only valid for the blocks in buffer cache, and not on the disk?

Thanks
David

Tom Kyte

Followup  

May 05, 2005 - 9:12 am UTC

it is variable -- it is "workload based". Eventually, the transaction is so old, it is known that the data on the block is "good enough"

You read blocks from the buffer cache in general, even if they aren't in there when you need them, you put them there and then read them.

So, the blocks are coming from the cache.

Amount of work required to undo changes

May 19, 2005 - 10:13 pm UTC

Reviewer: Reader from Russia

The most often used example to demostrate amount of work Oracle must do to UNDO the chages is:

1. Open cursor.
2. Update the data needs by that cursor N-times.
3. Fetch and in TKPROF point to Query columns - Oracle is read block N-times in order to get the required version of the block.

It seems what this example is not 100% correct. Is five versions of block can be get by Oracle for free (at least in 9.2.0.6)? Consider:

SQL>  create table t1 (n number);

Table created.

SQL> insert into t1 values (0);

1 row created.

SQL> commit;

Commit complete.

Here we'll use slighty modified classic example. I've added a cursor fetch before update actually starts - to see amount of work without need for undo changes.

SQL> DECLARE
  2   CURSOR l_cur1 IS SELECT * FROM t1 before_update_one_row;
  3   CURSOR l_cur2 IS SELECT * FROM t1 after_update_one_row;
  4   l_n NUMBER;
  5  BEGIN
  6   OPEN l_cur1;
  7   OPEN l_cur2;
  8   FETCH l_cur1 INTO l_n;
  9   
 10   FOR i IN 1 .. 100
 11   LOOP
 12    UPDATE t1 SET n=i;
 13    COMMIT;
 14   END LOOP;
 15  
 16   FETCH l_cur2 INTO l_n;
 17   CLOSE l_cur1;
 18   CLOSE l_cur2;
 19  END;
 20  /

PL/SQL procedure successfully completed.

Now, we insert a second row into table and see what happens.

SQL> insert into t1 values (0);

1 row created.

SQL> commit;

Commit complete.

SQL> DECLARE
  2   CURSOR l_cur1 IS SELECT * FROM t1 before_update_two_row;
  3   CURSOR l_cur2 IS SELECT * FROM t1 after_update_two_row;
  4   l_n NUMBER;
  5  BEGIN
  6   OPEN l_cur1;
  7   OPEN l_cur2;
  8   FETCH l_cur1 INTO l_n;
  9   
 10   FOR i IN 1 .. 100
 11   LOOP
 12    UPDATE t1 SET n=i;
 13    COMMIT;
 14   END LOOP;
 15  
 16   FETCH l_cur2 INTO l_n;
 17   CLOSE l_cur1;
 18   CLOSE l_cur2;
 19  END;
 20  /

PL/SQL procedure successfully completed.

In TKPROF Query columns we see:

1) SELECT * FROM T1 BEFORE_UPDATE_ONE_ROW: 5
2) SELECT * FROM T1 AFTER_UPDATE_ONE_ROW:  100
3) SELECT * FROM T1 BEFORE_UPDATE_TWO_ROW: 5
4) SELECT * FROM T1 AFTER_UPDATE_TWO_ROW:  195

The example used to demonstrate "hey, i updated this block 100 times, so Oracle will undo changes 100 times" - seems to be a little flawed.

In the first case oracle have to read five blocks (total number of blocks in table). If Oracle needs to do undo 100 times for 100 updates - why in second case whe Query number is not 105? Where is the "lost" 5 block reads here? Or do the 5 first versions of the block is for free? Look at the last Query count - 195. Again, if "Oracle needs 100 times to undo" - why it is not 200?

So, i think the conclusion is - to undo the first five versions of the block Oracle doesn't need to re-read block in sequental fashion to get the version it needs.

So, for SELECT * FROM T1 AFTER_UPDATE_ONE_ROW we got: 
5 (number of block in table)+100 (times to undo 100 block changes for one row)-5 ("free of charge undo" of one row)=100.

For SELECT * FROM T1 AFTER_UPDATE_TWO_ROW we got:
5 (number of block in table)+200*2 (times to undo 100 block changes for two rows)-5*2 ("free of charge undo" of two rows)=195.

Am i correct or there is something i missed up? 

Tom Kyte

Followup  

May 20, 2005 - 7:47 am UTC

The numbers are "about", for you see, it is called multi-versioning and read consistency.


there are multiple versions of blocks available.

but the case shows the point -- if we have to undo, we do and you can see the measurable work performed. That is more than sufficient to understand what happens under the covers.

I could really mess up the test by going multi-user and running the queries in other sessions -- you might see 5, 5, 5, 5 or some other sequence of numbers in that case.

You are demonstrating "work" performed


Ghost Transaction

June 01, 2005 - 7:59 am UTC

Reviewer: marrec

Hi there,

I have a prob with one of my rollback-segments.
this is a big segment, which is used for some special transactions.

Yesterday a transaction failed.
We tried to set the rbs offline. After this the database was very slow, and some employees could'nt work.


The alert.log file said:
SMON: about to recover undo segment xy
SMON: mark undo segment xy as available
.
.
.
These two lines were repeated thousand times and more in the alert.log.
RBS had the state "Partly Available".
After setting the state back to online, the message disappered.

Now we tried to set the rbs offline to drop and recreate it, but the state is now "pending offline".

In v$rollstat i can see that one transaction is active, but i can't see any transactions in v$transaction.
In v$rollstat the column writes is = 0 !!!

SELECT name, xacts "ACTIVE TRANSACTIONS"
FROM v$rollname, v$rollstat
WHERE status = 'PENDING OFFLINE'
AND v$rollname.usn = v$rollstat.usn;

I did the following:

-- You can Query the V$ROLLSTAT (column WRITES) to find out if there are any UNDO being generated for this session. If the session is active and doing the update, then there will be some undo being generated.

-- The view V$SESSION_LONGOPS will also show the progress of the session. Query this view for the session executing the update statement and this will show if there are any progress being made.

-- Check the V$SESSION_WAIT view to find out if the session is waiting on any particular resource.

select event,p1,p2,p3,seconds_in_wait from v$session_wait where sid = SID and wait_time = 0;

-- The V$SESSTAT will also help in identifying the progress made by the session. Run the Query given below and check if the value is increasing or not.

select st.name,sess.value from v$statname st,v$sesstat sess where st.statistic#=sess.statistic# and st.name = 'redo blocks written' and sess.sid=SID of hanging session;

What I found out was following:

v$rollstat "writes" is 0 !
(the transaction has been running for hours now and there's no undo generated)

in v$session_longops is no record which has column sofar < totalwork --> so every work should be done

In v$session_wait are a lot of session waiting for something:
SQL*Net message from client has state "WAITED UNKNOWN TIME"
rdbms ipc message,
pmon_timer,
buffer_busy_waits,
db file sequential read,
Wait for stopper event to be increased,
PX Deq: Txn Recovery Start


select st.name,sess.value from v$statname st,v$sesstat sess where st.statistic#=sess.statistic# and st.name = 'redo blocks written' and sess.sid=SID

there are 34 rows with
redo blocks written 0

and the last one has:
redo blocks written 31341470
this session is from a background process.

There's one session, which is a system background process, that has:
"Wait for stopper event to be increased"
in v$session_wait - event.

This session has also a TX lock in v$locks. (and some ps locks)

We use oracle 8.1.7 on win200 advanced server

I'd really appreciate any helps,

Thx so far
marrec



Tom Kyte

Followup  

June 01, 2005 - 10:11 am UTC

same answer I gave somewhere else -- please contact support.

consistent read

June 14, 2005 - 12:24 pm UTC

Reviewer: Lizhuohua from China

Hi tom,
Q1
Look the test:
Session 1:
SQL> select sid from v$mystat where rownum=1;

       SID                                                                                                                        
----------                                                                                                                        
        14                                                                                                                        

Elapsed: 00:00:00.00
SQL> alter session set isolation_level=serializable;

Session altered.

Elapsed: 00:00:00.00
SQL> select * from t1;
 
         X DATA
---------- --------------------
         1 x
 
Elapsed: 00:00:00.00


                                         Session 2:
                                         SQL> begin
                                           2  for i in 1..1000
                                           3  loop
                                           4  update t1 set data='x';
                                           5  commit;
                                           6  end loop;
                                           7  end;
                                           8  /
                                          
                                         PL/SQL procedure successfully completed.
 
                                         Elapsed: 00:00:02.06
                                         SQL> select value from v$sesstat where sid=14 and statistic#=41;
                                          
                                              VALUE
                                         ----------
                                               1111
                                          
                                         Elapsed: 00:00:00.00
                                         
Session 1:
SQL> /
 
         X DATA
---------- --------------------
         1 x
 
Elapsed: 00:00:00.01
                                         Session 2:
                                         SQL> select value as current_value ,
                                              value-&before_value as consistent_gets
                                         from v$sesstat
                                         where sid=14
                                            and statistic#=41 --consistent gets  2    3    4    5
                                           6  /
                                         Enter value for before_value: 1111
                                         old   2:      value-&before_value as consistent_gets
                                         new   2:      value-1111 as consistent_gets
                                          
                                         CURRENT_VALUE CONSISTENT_GETS
                                         ------------- ---------------
                                                  2110             999
                                          
                                         Elapsed: 00:00:00.02
Session 1:
SQL> /
 
         X DATA
---------- --------------------
         1 x
 
Elapsed: 00:00:00.01
                                         Session 2:
                                         SQL> /
                                         Enter value for before_value: 2114
                                         old   2:      value-&before_value as consistent_gets
                                         new   2:      value-2114 as consistent_gets
                                          
                                         CURRENT_VALUE CONSISTENT_GETS
                                         ------------- ---------------
                                                  2118               4
                                          
                                         Elapsed: 00:00:00.02
Session 1:
SQL> set autot trace
SQL> /
 
Elapsed: 00:00:00.15
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T1'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL> /
 
Elapsed: 00:00:00.16
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'T1'
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1004  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Q: Why after set autot trace the consistent gets become big?

Q2:
About CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN
I have a 5G rollback segment RSBIG and small db cache size
Session 1:
SQL> select value/8192 from v$parameter where name='db_cache_size';
 
VALUE/8192
----------
      2048
 
Elapsed: 00:00:00.01
SQL> select blocks from dba_segments where segment_name='BIG2';
 
    BLOCKS
----------
      3328
 
Elapsed: 00:00:00.18
SQL> set transaction use rollback segment rsbig;--usn is 2
 
Transaction set.
 
Elapsed: 00:00:00.00
SQL> select count(*) from v$transaction where xidusn=2;
 
  COUNT(*)
----------
         1
 
Elapsed: 00:00:00.01
SQL> commit;
 
Commit complete.
 
Elapsed: 00:00:00.00
Then I use a java class generate 100 transaction use rsbig;
two of them throw ora-1554 out of transaction slots 
in transaction tables exception

SQL> select count(*) from v$transaction where xidusn=2;
 
  COUNT(*)
----------
        98
 
Elapsed: 00:00:00.01
---1 minutes later
SQL> /
 
  COUNT(*)
----------
         0
 
Elapsed: 00:00:00.01
SQL> select count(*) from big2;
 
Elapsed: 00:00:02.46
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=310 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'BIG2' (Cost=310 Card=235648)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       6425  consistent gets
       3209  physical reads
     192540  redo size
        381  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
192540  redo size Show the delayed cleanout,but why there no ora-1555.
Thanks &
Best Regards. 

Tom Kyte

Followup  

June 14, 2005 - 4:15 pm UTC

They became big before the autotrace too. They are read asides to the rollback segment.

Oracle has the ability to store multiple versions of any block in the buffer cache -- if we find the version we need in the cache, we don't need to roll it back. If we don't -- we do need to roll it back. So, sometimes we found the version we needed, sometimes we did not.


why do you believe a 1555 would happen there?

consistent reads

June 14, 2005 - 9:01 pm UTC

Reviewer: Lizhuohua from China

Hi,tom
Thanks for you review
1.In my test 1,when I query v$sesstat ,first time consistent gets is 999 because of rollback(reconstruct the before image),second time cosistent gets is 4 because of the cache.After set autot trace, first consistent gets is 4 too,but second time the consistent gets is 1004 even try again it's also 1004,the cache buffer OVERWRITTEN,why?
2.Because I think the 98 transactions have OVERWRITTEN the transaction slots,and when delayed cleanout ,oracle can not find the commit scn.Thats didn't get from transaction slots or the rollback segment header protected by undo,or something else?

Thanks &
Best Regards.

Tom Kyte

Followup  

June 15, 2005 - 3:19 am UTC

1) why not? what else is going on in the system. anything can happen (please don't say "nothing else is going on", because there is ALWAYS something else going on, smon, the job queues, many many backgrounds going on)

2) but the query started AFTER everything else. so why?


consistent read

June 15, 2005 - 9:49 am UTC

Reviewer: Lizhuiohua from China

Hi,Tom
  1.I test this several times and get the same resault,If i query v$sesstat the consistent gets is 4(after the first times) even I query a big table in other session,but after set autot trace first time consistent gets is 4 but after the first time is big.I know autot trace will do many dml on plan_table ,but I don't think that will affect so much.
2.Yes the query after everything ,but I think there was a delayed cleanout ,and that need the commit scn ,but the transaction slots have already OVERWRITTEN,so oracle cannot find the commit scn will throw 0ra-1555 .I test this In serializable transaction and other approach like :
SQL> begin
  2  open :x for select count(*) from big2;
  3  end;
  4  /
 
PL/SQL procedure successfully completed.
 
Elapsed: 00:00:00.01
SQL> set transaction use rollback segment rsbig;
 
Transaction set.
 
Elapsed: 00:00:00.00
SQL> update big2 set owner=owner;
 
235648 rows updated.
 
Elapsed: 00:00:17.06
SQL> commit;
 
Commit complete.
 
Elapsed: 00:00:00.01
--after java code concurrently generate many transactions use same rollback segment 
SQL> print x
 
  COUNT(*)
----------
    235648
 
Elapsed: 00:00:02.76

Please point where I am misunderstand the CASE 2 - ROLLBACK TRANSACTION SLOT OVERWRITTEN.

Many thanks.

Lizh. 

Tom Kyte

Followup  

June 15, 2005 - 10:12 am UTC

2) but obviously that didn't come into play so no 1555. and there have been many enhancements over the years, when modifying my book for 9i/10g, I needed to use multiple sessions (five of them) to simulate the 1555 due to the block cleanout.

and i don't see where you overwrote all of them, just that you used them up. this will happen when

a) i start a long running query (soon after a massive update that definitely left blocks in the table without cleaning them out)

b) THEN lots of transactions happen and clobber the transaction information in the rbs

c) and THEN the query from a) hits a block that it is uncertain about and cannot find the transaction information.


it takes lots of stuff happening in a specific order for this to occur.

consistent read

June 15, 2005 - 9:32 pm UTC

Reviewer: Lizhuohua from China

Hi,Tom
In my first post I say the java code concurrently generate 100 transactions and two of them throw ora-1554 out of transaction slots in transaction tables exception,I think this show that the transaction slots have OVERWRITTEN(These done after I commit the massive update).
I have read your test case for ora-1555 use small rollback segment ,But I think thats overwrote all the rollback segment not only the header information,so I test use a big rollback segment ,In fact first I test this use the method like
<quote>
tkyte@TKYTE816> begin
2 commit;
3 set transaction use rollback segment rbs_small;
4 update t
5 set object_type = lower(object_type);
6 commit;
7 execute immediate 'alter tablespace users read only';
8
9 for x in ( select * from t )
10 loop
11 for i in 1 .. 20
12 loop
13 set transaction use rollback segment rbs_small;
14 update small set x = x+1, y = x;
15 commit;
16 end loop;
17 end loop;
18 end;
19 /
</quote>
but use big rbs ,It run a long time and didn't get the ora-1555,I think it may be not orverwrote all the slots ,so I write a multithread class and do the test I have post.
If any thing I am wrong ,Please point it.
Thanks,
Lizh

Tom Kyte

Followup  

June 16, 2005 - 3:42 am UTC

Look -- it is not throwing a 1555, therefore, you have not done anything to warrant a 1555.

I don't know the entire scenario behind your test case here. It would take the sequence of:

a) make DARN SURE a table is stored on disk with lots of blocks that did not get cleaned out (did you do that)

b) open a query against that (did you do that before the java code)

c) run lots of small transactions that over write the rbs transaction slots (running out of them, don't care, not relevant. perform LOTS of small transactions)

d) then start fetching from the query opened in b) it might hit a block that it cannot figure out if it is old enough or not.

consistent reads

June 16, 2005 - 11:31 am UTC

Reviewer: Lizhuohua from China

Hi,Tom
These are some information about my test:
SQL> select blocks from user_segments where segment_name='BIG2';
 
    BLOCKS
----------
      3328
SQL> select count(*) from big2;
 
  COUNT(*)
----------
    235648
SQL> SELECT VALUE/8192 FROM V$PARAMETER WHERE NAME='db_cache_size';
 
VALUE/8192
----------
      2048

I think the big2 is big enough.

<quote>
 open a query against that (did you do that before the java code)
</quote>
That's mean if open the query after lots transaction (or even after the massive update&commit) the delayed cleanout may use the new scn not commit scn?

I don't know why should run lots transaction ,if I generate enough concurrent transaction 
it will over write the transaction slots(That's why I am use multithread java code generate only 100 transactions.).

<quote>
and i don't see where you overwrote all of them, just that you used them up. 
</quote>
You mean that use up <> overwrote? My massive update has commited,
then if all the slots used up ,I think the slots that the massive update used has
been overwritten.

Follow is spool from sqlplus:
SQL> var x refcursor;
SQL> begin
  2  open :x for select count(*) from big2;
  3  end;
  4  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL> set transaction use rollback segment rsbig;

Transaction set.

Elapsed: 00:00:00.00
SQL> update big2 set owner=owner;

235648 rows updated.

Elapsed: 00:00:16.58
SQL> commit;

Commit complete.

Elapsed: 00:00:00.02
SQL> begin
  2    for i in 1..200000
  3    loop
  4      set transaction use rollback segment rsbig;
  5      update t2 set a=a+1;
  6      commit;
  7    end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:04:06.93
SQL> select value
  2  from v$mystat s,v$statname n
  3  where s.statistic#=n.statistic#
  4       and n.name='redo size';

     VALUE                                                                                                                        
----------                                                                                                                        
 321115516                                                                                                                        

Elapsed: 00:00:00.01
SQL> print x

  COUNT(*)                                                                                                                        
----------                                                                                                                        
    235648                                                                                                                        

Elapsed: 00:00:18.15
SQL> select value
  2  from v$mystat s,v$statname n
  3  where s.statistic#=n.statistic#
  4       and n.name='redo size';

     VALUE                                                                                                                        
----------                                                                                                                        
 321308056                                                                                                                        

Elapsed: 00:00:00.00
SQL> spool off

I think the redo size(321308056-321115516=192540) 
show there are delayed cleanout,correct?
Does my test have any error?
BTW: I am test on 9204, and set undo_management manual,does it matter? 

Tom Kyte

Followup  

June 16, 2005 - 1:13 pm UTC

just please do a,b,c,d if you want to try to emulate this.

just do "LOTS" of transactions, "LOTS AND LOTS" of transactions. don't try to "do the perfect number" just do LOTS.

Period -- you open the query (which sets the as of SCN for that query) AND THEN do lots and lots of transactions. And then, that query (which uses the old scn) may,might, could hit an ora-1555 due to hitting a block with an old uncleaned out transaction.



consistent read

June 16, 2005 - 9:52 pm UTC

Reviewer: Lizhuohua from China

Hi , tom
I think may be oracle use undo protect the rollback segment header (say system rollback segment),when the information on the system rollback segment overwritten then oralce cannot rollback the rollback segment header then we will encounter the ora-1555.correct?
Thanks for you patient.
Lizhuohua.

Tom Kyte

Followup  

June 16, 2005 - 11:32 pm UTC

no, the rollback data is not "read consistent" like that. It is when you start a query at SCN 123 and you KNOW for a fact that everything from transaction XYZ and before is OK for you to see,

meanwhile you are running lots of transactions - so the SCN is going up and up and up and the transaction information is getting overwriten.

you hit a block and just cannot figure out if it is OK for you to see it -- the transaction is newer than XYZ but the transaction information is no longer in the RBS, you stop, ora-1555

if you have Expert one on one Oracle, I go into more detail on this.

read consistent

June 17, 2005 - 11:55 am UTC

Reviewer: Lizhuohua from China

Hi,Tom
  Thanks for your patient again.May be I am wrong somewhere ,Just now I test again,
SQL> begin
  2  update big2 set owner=owner;
  3  commit;
  4  for v in (select * from big2) loop
  5    for i in 1..20 loop
  6    update t2 set a=a+1;
  7    commit;
  8    end loop;
  9  end loop;
 10  end;
 11  /
 
PL/SQL procedure successfully completed.
 
Elapsed: 01:20:07.01
There only one non-system and small rollback segment(initial 128k) online,but no  ora-1555,may be my big2 doesn't big enough, but in  my prior test there are redo generate against the last query:(.
Ok,I'll check it and think and test again.
Thanks ,hope I am not waste your time.
Sorry for my poor English ;)
Lizhuohua 

Tom Kyte

Followup  

June 17, 2005 - 4:12 pm UTC

Ok, here is an (unedited) excerpt from the next release of my book on this topic in 10g:

<quote>

Delayed Block Cleanout

This cause of the ORA-01555 is harder to eliminate entirely, but is rare, as the circumstances under which it occurs do not happen frequently (at least not in Oracle 8i and above anymore). We have already discussed the block cleanout mechanism, but to summarize, it is the process whereby the next session to access a block after it has been modified, may have to check to see if the transaction that last modified the block is still active. Once it determines that it is not active, it cleans out the block so that the next session to access it does not have to go through the same process again. In order to clean out the block, Oracle determines the rollback segment used for the previous transaction (from the blocks header), and then determines whether the rollback header indicates whether it has been committed or not. This confirmation is accomplished in one of two ways. One way is that Oracle can determine that the transaction committed a long time ago, even though its transaction slot has been overwritten in the rollback segment transaction table. The other way is that the COMMIT SCN is still in the transaction table of the rollback segment, meaning the transaction committed a short period of time ago, and its transaction slot hasn't been overwritten.

In order to receive the ORA-01555 from a delayed block cleanout, all of the following conditions must be met:

* A modification is made and COMMITed, and the blocks are not cleaned out automatically (for example, it modified more blocks than can be fitted in 10 percent of the SGA block buffer cache).
* These blocks are not touched by another session, and will not be touched until our unfortunate query below hits it.
* A 'long running' query begins. This query will ultimately read some of those blocks from above. This query starts at SCN t1. This is the read consistent SCN it must roll data back to, in order to achieve read consistency. The transaction entry for the modification transaction is still in the rollback segment transaction table when we began.
* During the query, many commits are made in the system. These transactions do not touch the blocks in question (if they did, then we wouldn't have the impending problem).
* The transaction tables in the rollback segments roll around and reuse slots due to the high degree of COMMITs. Most importantly, the transaction entry for the original modification transaction is cycled over and reused. In addition, the system has reused rollback segment extents, so as to prevent a consistent read on the rollback segment header block itself.
* Additionally, the lowest SCN recorded in the rollback segment now exceeds t1 (it is higher than the read consistent SCN of the query), due to the large amount of commits.


Now, when our query gets to the block that was modified and committed before it began, it is in trouble. Normally, it would go to the rollback segment pointed to by the block and find the status of the transaction that modified it (in other words, find the COMMIT SCN of that transaction). If the COMMIT SCN is less than t1, our query can use this block. If the COMMIT SCN is greater than t1, our query must roll back that block. The problem is however, that our query is unable to determine in this particular case if the COMMIT SCN of the block is greater than or less than t1. It is unsure as to whether it can use it or not. The ORA-01555 then results.

In order to see this, what we will do is create many blocks in a table that need to be cleaned out. We will then open a cursor on that table and allow many small transactions to take place against some other table - not this table we just updated and opened the cursor on.  Then, finally we will attempt to fetch the data for the cursor.  Now, we know that the data required by the cursor will be "ok" - we should be able to see all of it since the modifications to the table will have taken place and been committed before we open the cursor.  When we get an ORA-01555 this time, it will be because of the above described problem.  In order to set up for this, we'll be using

* 2 MB undo_small undo tablespace again
* 4 MB buffer cache, enough to hold about 500 blocks.  This is so we can get some dirty blocks flushed to disk to observe this phenomena.

Before we start, we'll create the big table we'll be querying:

ops$tkyte@ORA10G> create table big
  2  as
  3  select a.*, rpad('*',1000,'*') data
  4    from all_objects a;                                                                                                            
Table created.
                                                                                                               
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'BIG' );
PL/SQL procedure successfully completed.

That table will have lots of blocks as we get about 6 or 7 rows per block using that big data field.  Next, we'll create the small table the many little transactions will modify:

ops$tkyte@ORA10G> create table small ( x int, y char(500) );
Table created.
                                                                                                                 
ops$tkyte@ORA10G> insert into small select rownum, 'x' from all_users;
38 rows created.

ops$tkyte@ORA10G> commit;
Commit complete.
                                                                                                                 
ops$tkyte@ORA10G> exec dbms_stats.gather_table_stats( user, 'SMALL' );
PL/SQL procedure successfully completed.


Now, we'll dirty that table up.  I have a very small undo tablespace - so what I want to do is update as many blocks of this big table as possible - all while generating the least amount of undo possible.  I'm using a fancy update statement to do that.  Basically, the subquery below is finding the "first" rowid of a row on every block - that subquery will return a rowid for each and every database block identifying a single row on it.  We'll update that row setting a 1 character field.  This will let us update all of the blocks in the table (some 8,000 plus in my example) - flooding the buffer cache with dirty blocks that will have to be written out (we only have room for 500 right now).  We'll make sure we are using that small undo tablespace as well:
NOTE: See the chapter on Analytic functions for details on what the ROW_NUMBER() function below does and how it works.

ops$tkyte@ORA10G> alter system set undo_tablespace = undo_small;
System altered.
                                                                                                                 
ops$tkyte@ORA10G> update big
  2     set temporary = temporary
  3   where rowid in
  4  (
  5  select r
  6    from (
  7  select rowid r, row_number() over 
            (partition by dbms_rowid.rowid_block_number(rowid) order by rowid) rn
  8    from big
  9         )
 10   where rn = 1
 11  )
 12  /
8045 rows updated.
                                                                                                                 
ops$tkyte@ORA10G> commit;
Commit complete.

Ok, so now we know that we have lots of "dirty" blocks on disk - we definitely wrote some of them out, we just did not have the room to hold them all.  Next, we opened a cursor - but did not yet fetch a single row.  Remember, when we open the cursor, the result set is preordained, even though Oracle did not actually process a row of data, the act of opening that result set fixed the point in time the results must be 'as of'.  Now since we will be fetching the data we just updated and committed - and we know no one else is modifying the data - we should be able to retrieve the rows without needing any undo at all.  But that is where the Delayed Block cleanout raises it's head.  The transaction that modified these blocks is so new - that Oracle will be obliged to verify that it committed before we began - and if we overwrite that information (also stored in the undo tablespace) the query will fail.  So, here is the open of the cursor:

ops$tkyte@ORA10G> variable x refcursor
ops$tkyte@ORA10G> exec open :x for select * from big;
PL/SQL procedure successfully completed.
                                                                                                                 
ops$tkyte@ORA10G> !./run.sh

Now, that run.sh is a shell script.  It simply fired off  9 SQL*Plus sessions using a command:

$ORACLE_HOME/bin/sqlplus / @test2 1  &

Where each SQL*Plus session was passed a different number (that was number 1, there was a 2,3 and so on).  The script test2.sql they each ran was:

begin
    for i in 1 .. 1000
    loop
        update small set y = i where x= &1;
        commit;
    end loop;
end;
/
exit

So, we had 9 sessions, inside of a tight loop, initiate many transactions. The run.sh script waited for the 9 SQL*Plus sessions to complete their work - and then we returned to our session, the one with the open cursor.  Upon attempting to print it out, I observed:

ops$tkyte@ORA10G> print x
ERROR:
ORA-01555: snapshot too old: rollback segment number 23 with name "_SYSSMU23$"
too small
no rows selected

As I said, the above is a rare case. It took a lot of conditions, that all must exist simultaneously to occur. We needed blocks that were in need of a cleanout to exist and these blocks are rare in Oracle8i and above.  A DBMS_STATS call to collect statistics, gets rid of them so the most common causes, large mass updates and bulk loads, should not be a concern, since the tables need to be analyzed after such operations anyway. Most transactions tend to touch less then 10 percent of the block the buffer cache and hence, do not generate blocks that need to be cleaned out. In the event that you believe this issue is encountered, whereby a SELECT against a table that has no other DML applied to it is raising the ORA-01555, the things to try are:

* Ensure you are using 'right sized' transactions in the first place. Make sure you are not committing more frequently than you should.
* Use DBMS_STATS to scan the related objects, cleaning them out, after the load. Since the block cleanout is the result of a very large mass UPDATE or INSERT, this needs to be done anyway.
* Allow the undo tablespace to grow by giving it the room to grow and increasing the undo retention. This decreases the likelihood of a rollback segment transaction table slot being overwritten during the course of your long running query. This is the same as the solution for the other cause of an ORA-01555 (in fact, the two are very much related; you are experiencing rollback segment reuse during the processing of your query).  In fact, I re-ran the above example with the undo tablespace set to autoextend 1 M at a time, with an undo retention of 900 seconds.  The query against the table BIG completed successfully.
* Reduce the run-time of your query (tune it). This is always a good thing if possible, so it might be the first thing you try.

</quote>


 

Too large INITIAL extents ?

June 21, 2005 - 4:51 pm UTC

Reviewer: AP from nj

I have two instances, very much similar to each other. Here are their RBS config.

Db1 Db2
==== ====
# of RBS 4 4
Size (each) 40M 40M
INITIAL ext. 512K 5M
NEXT 5M 5M
MIN 2 8
max 32k 32k
OPTIMAL 40M 40M

While my ANALYZE statement succeeds in DB1, it fails with ORA-01555 in Db2.

1. Is is because on db2, very high INITIAL has been set ( 8 * 5M = 40m ) ??

2. Does it makes sense to set OPTIMAL to 40M for RBS=40M ?

3. What will alleviate the problem the most - more # of rbs OR larger rbs OR both ?

Tom Kyte

Followup  

June 21, 2005 - 9:05 pm UTC

are the transaction loads on each near the same?


3) either or, sounds like you have more transactions on database 2, so I would suggest "more rbs's" if that is the case.

Too large INITIAL extent ?

June 21, 2005 - 4:52 pm UTC

Reviewer: AP from nj

An update on above : RBS on both database are on DMT tablespace.

High INITIAL extent size ?

June 22, 2005 - 9:11 am UTC

Reviewer: AP from NJ

Apart from transaction load, could it be possible that very high INITIAL extent size ( min=8, size=5M; total = 8*5=40M ) is causing 01555 ?

Also, Does it makes sense to set OPTIMAL to 40M for RBS=40M ?


Tom Kyte

Followup  

June 22, 2005 - 4:36 pm UTC

doubtful, it is that you are overwriting undo data.

ORA-01555: snapshot too old: rollback segment number 10 too small

July 14, 2005 - 6:19 am UTC

Reviewer: Pradeep from Reading , U.K

Hi Tom

We had a problem with one of our batch job on our production systems , were in the batch was consistently failing with the snapshot too old :
rollback segment number too small error .We tried increasing the rollback segment size and also tried assigning the transaction to a specific rollback segment by using the SET TRANSACTION statement , with no success .
Later on after lot of investigation we were able to identify the actually cause of this to be , one of the updates statment, which was updating some 1 million record .
My question is :
During the execution of a batch is there a way to identify the SQL with is filling up the ROLLBACK SEGMENT .


Tom Kyte

Followup  

July 14, 2005 - 10:50 am UTC

well, the session that GETS the ora-1555 isn't one that needs to be associated with a "big rbs", that never works -- it cannot work.

And the "need" for a big RBS means you'll get 1555's because the odds of a 1555 are directly related to the size of your SMALLEST rbs, not biggest.

You need to figure out how much undo you generate in a period of time (not an individual sql statement really, just "how much do I generate in 30 minutes"). Then you need to figure out "how long do I need to keep it" - that is easy, what is your longest running query. Now, add some pad to that time and size your rbs to that -- using EQUI-SIZED rbs.

Or, just figure out the time and use automatic undo management and let the database figure out "how big" and let it steal extents and create big rbs's for things that need it and so on. I'd opt for #2 every time.


Yes, you can see what TRANSACTIONS have undo requirements by querying v$transction, used_ublk shows you what they are using.


that million row update would only be the root cause of the 1555 if it committed every N records (which means it is not a 1 million row update but a series of small transactions)

Same issue occasionally

October 05, 2005 - 9:51 am UTC

Reviewer: Mark from Boston, MA

Hi Tom,

I get the same issue occasionally when gathering stats with DBMS_STATS:

*** SESSION ID:(136.49334) 2005-10-05 04:33:36.179
*** 2005-10-05 04:33:36.179
ORA-12012: error on auto execute of job 25051
ORA-01555: snapshot too old: rollback segment number 74 with name "RBS_34" too small
ORA-06512: at "SYS.DBMS_DDL", line 179
ORA-06512: at "SYS.DBMS_STATS", line 4469
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at "SYS.DBMS_STATS", line 4611
ORA-06512: at "SYS.DBMS_STATS", line 4721
ORA-06512: at "SYS.DBMS_STATS", line 4703
ORA-06512: at line 1
*** SESSION ID:(136.49382) 2005-10-05 08:57:57.611
*** 2005-10-05 08:57:57.611
ORA-12012: error on auto execute of job 25051
ORA-01555: snapshot too old: rollback segment number 8 with name "RBS_SMALL_11" too small
ORA-06512: at "SYS.DBMS_DDL", line 179
ORA-06512: at "SYS.DBMS_STATS", line 4469
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at "SYS.DBMS_STATS", line 4611
ORA-06512: at "SYS.DBMS_STATS", line 4721
ORA-06512: at "SYS.DBMS_STATS", line 4703
ORA-06512: at line 1

This happens when I execute DBMS_STATS dirrectly from a JOB. So, I've written a wrapper proc so I can assign a very large RBS to the transaction:

CREATE OR REPLACE PROCEDURE GATHER_HT4_STATS
IS
BEGIN
SET TRANSACTION USE ROLLBACK SEGMENT import_rbs_vlarge;
/* Gather All Stats Nightly */
DBMS_STATS.gather_schema_stats (
ownname => 'MY_SCHEMA'
,estimate_percent => 40
,block_sample => FALSE
,method_opt => 'FOR ALL INDEXED COLUMNS'
,degree => NULL
,granularity => 'DEFAULT'
,options => 'GATHER'
,cascade => TRUE
);
COMMIT;
EXCEPTION
WHEN OTHERS
THEN
raise_application_error (-20001, SQLERRM);
END;

Do you think this is a reasonable approach to avoid this problem until we get around to sizing all our RBS's the same?

Great Site! Keep doing what you are doing!


Tom Kyte

Followup  

October 05, 2005 - 11:27 am UTC

umm, that does nothing really - it is pure coincidence if it works. and it'll "stop" working anytime in the future (cause it doesn't fix the problem)


you wrote:

...This happens when I execute DBMS_STATS dirrectly from a JOB....


No, this happens when your rollback segments are too small for the amount of work you do. You get the 1555 on a QUERY here - dbms_stats is reading data, not generating gobs of undo. It is the OTHER jobs, other transactions that are causing the ora1555.


And the probability of a 1555 - controlled by the size of your SMALLEST rollback segment.

hmmm...

October 05, 2005 - 1:31 pm UTC

Reviewer: Mark from Boston, MA

Must be late in the day...confused.

You said....

"No, this happens when your rollback segments are too small for the amount of work you do." - I understand that

"You get the 1555 on a QUERY here - dbms_stats is reading data, not generating gobs of undo." - I understand that as well, as DBMS_STATS queries my database to develop stats.

"It is the OTHER jobs, other transactions that are causing the ora1555." - ??

Ok, assume you are right (and, you nearly are every time), then *how* do I discover what Query or Job is causing this condition? My error message clearly states it was this particular job causing the error.

Is this correct: I get 'snapshot too old error' when a rollback segment holding a consistent read version of part of my database that my job needs is overwritten by another transaction?

Help me understand this better.

I understand the solution now, so that is good - size all RBS' homogenously to be large enough to handle the largest transaction.

Thanks.

Tom Kyte

Followup  

October 05, 2005 - 1:50 pm UTC

ALL of the jobs are - not a single one, EVERY transaction is a contributor - there is no single individual.


In order to avoid 1555 you must have permanently allocated (no optimal stuff to shrink small) rollback space such that in the period of time of your longest running query - you don't wrap around and reuse rollback.


Say you have 10 rbs's of 50 meg apiece (500 meg total)

You generate 100 meg of undo per minute (just assume) over many separate transactions (spread over all rbs's)

You can run a query for up to about 5 minutes in duration before having a high probability of getting a 1555 - since the other transactions will have generated 500 mb of undo and need to reuse some of it (and you might well NEED what they are reusing).


Your statement sums it up:


I get 'snapshot too old error' when a rollback segment holding
a consistent read version of part of my database that my job needs is
overwritten by another transaction?



You should size the rbs to be big enough not to wrap over and be reused in the time it takes to run your longest running query.

automatic UNDO (AUM) makes this *much easier* as you set an undo retention period.

Snapshot too old

November 22, 2005 - 3:42 am UTC

Reviewer: Raghav from India

Hi Tom

The following error returned when the procedure is being executed. This procedure deletes the data from the materialised view table and inserts back the data from live table to this table.

"SQL Error ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10$" too small updating table MV_SPR. Calling procedure is proc_update_4_hrs. Update failed"

I have gone through the article and it is very good and very useful. In Oracle 8i, I am able to create different rollback segments which my application uses and had never got such error. In our oracle9i, when I tried to increase the space of system rollback segment or tried to create one more rollback segment as RLS01. The system was giving an error saying:

"Ora-30019:Illegal rollback segment operation in automatic undo mode".

I understand from the above that in my database the rollback segments were set to auto mode.

But, I am getting the "snapshot" error now and then when the processes were executed.

Please let me know, what can we do to overcome this problem?

Thanks and Regards
Raghav



Tom Kyte

Followup  

November 22, 2005 - 8:35 am UTC

you would set your undo_retention higher and ensure your undo tablespace is sized large enough to hold that much undo.

You need to have the undo retention set longer than the duraction of your longest running query.

And you need to have sufficient disk space to hold that much undo.

undo tablespace

December 09, 2005 - 6:07 am UTC

Reviewer: Raghav from India

Hi Tom

Thanks for your reply. For our application we are using two spaces "Cpms_data" for data and temp_cpms for temporary tablespaces. We have assigned only these two tablespaces while creating the user cpms.
I found a tablespace "UNDOTBS" (created by the system). How can we know which tablespace is being used by the application while doing the process.

Is it by default take the UNDOTBS for undo transactions? This was set to auto extend mode. Is that we still increase the space?
The system is not allowing me to increase the size of the "system" rollback segment.

Please clarify.

Thanks and Regards
Raghav


Tom Kyte

Followup  

December 09, 2005 - 6:51 am UTC

undotbs is used as your undo segment, every process will use this at some point.


are you the DBA? if not, this is "dba" stuff. If so, you can check out the admin guide that goes through managing your undo and "howto"

table space set to auto extend

December 10, 2005 - 1:13 am UTC

Reviewer: Raghav from India

Hi Tom

Thanks for your quick reply. Please provide me any link to the admin guide if available. Still my question is - even if the tablespace is set to auto extend mode, do we need to increase the size manually? (if we keep aside the performance issue like the process becomes slow when the tablespace is being increased automatically).

Thanks and Regards
Raghav


Tom Kyte

Followup  

December 10, 2005 - 5:20 am UTC

if the tablespace is set to autoextend and space exists in the OS, it'll autoextend by itself (that is the entire sole goal of that particular piece of syntax)

ORA-01555

January 12, 2006 - 5:06 pm UTC

Reviewer: Prerak Mehta from New York, NY, USA

Hi Tom,

As you mentioned about ORA-01555 in detail in the doc, i had a question. To give little background of my problem. I have a query which gives ora-01555 once in a day. My undo ts size is 34 gb and undo retention is 54000. I calculated my undo ts (estimated) size using the following query...

SELECT MAX(UNDOPERSEC) * 8 * 54000/(1024*1024) -- IN GIGS
FROM (SELECT UNDOBLKS/((END_TIME - BEGIN_TIME)*86400) UNDOPERSEC FROM V$UNDOSTAT );

8 - Size of my block
54000 - 15 hrs undo retention parameter
Max(undopersec) comes to around 180 blocks...

and the result comes to around 75gb.

Now the main question is can Oracle overwrite committed blocks in undo ts even if there are free blocks in undo ts?If yes what can we do to prevent this?

Tom Kyte

Followup  

January 13, 2006 - 10:56 am UTC

oracle will attempt to use any free space it can in the undo tablespace - even by stealing extents from other undo segements when needed and moving them around.

you don't have enough undo space setup to satisfy your retention request.

ora-01555

April 03, 2006 - 8:03 am UTC

Reviewer: A reader

If someone is receving ora-01555 error while doing complete refresh e.g 'ORA-06512: at "SYS.DBMS_SNAPSHOT",'

Tring to refresh instance A from instance B:

And if B receives above error then this is due to small size of rollback of A or from due to small size of rollback of B.

Tom Kyte

Followup  

April 04, 2006 - 9:28 am UTC

likely the remote (b) site.

you'd need to know the query that failed...

DBMS_STATS

April 12, 2006 - 10:47 pm UTC

Reviewer: Rajiv Kumar from India

I am running dbms_stats.gather_table_stats with cascade and histograms and getting snapshot too old error. The table is big and might be getting changed while I do stats gathering. I have already set the undo_retension to 10000 seconds ( that is what I can afford on this db ), is there any other way of avoiding it.

Tom Kyte

Followup  

April 13, 2006 - 7:37 am UTC

you may well have set it to 10,000 - but did you configure sufficient UNDO tablespace to hold 10,000 seconds?

Are the extents in your undo tablespace getting expired prematurely? Refer to v$undostat

DBMS_STATS

April 13, 2006 - 9:33 pm UTC

Reviewer: Kirtikumar Deshpande from Carrollton, TX, USA

Did you check alert.log file for that ORA-1555 error?
It will report how long (Query Duration)the query ran before aborting. You will need to bump up undo_retention time at least to more than this value.

Since the query did not complete successfully you will still not know how long it would have run.

So, the way to avoid this error could be to try to avoid stats gathering when there is DML performed on the table, or use a smallish sample, and/or use parallel operation, if that's possible.

HTH..

- Kirti

why 1st rollback segment (USN=1) has much higher wraps?

April 22, 2006 - 12:28 pm UTC

Reviewer: Mike from CA

Hi Tom,
Unfortunately we are still using manual rollback segment in 9205. The question is that from statspack daily report, RBS01(USN=1) always have highest wraps count, USN=2 seems next to it, and so on.

Originally we thought that oracle always starts from USN=1 for round robin algorithm, instead of starts from last USN being used. So it seems reasonable to have higher wraps for rollback segments with lower USN.

However, the test seems telling different story, it seems that Oracle does pick up next rollback segment starting from last one being used. In test scenario, I have a single user enviroment test database, each time I have a new transaction, I can see USN increased in v$transaction.

So question one, does oracle pick up rollback segment from USN 1 or last one being used for a new transaction?

Question two, if it always picks up rollback segment from USN1 for new transaction, then it seems that USN1 is most likely being overwritten which may cause ORA-1555?

Tom Kyte

Followup  

April 22, 2006 - 3:25 pm UTC

"bad luck" could be the answer OR "set transaction use rollback segment"



testing result

April 22, 2006 - 12:37 pm UTC

Reviewer: mike from CA

Hi Tom,
Sorry, I forgot to paste the testing result. Since I am the single user in the database, so i expected to see my XIDUSN always equals to 1, however, not the case.

SQL> desc tst
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 C                                                  CHAR(2000)

SQL> select xidusn from v$transaction;

no rows selected

SQL> update tst set c=lower(c) where rownum<=1;

1 row updated.

SQL> select xidusn from v$transaction;

    XIDUSN
----------
        16

SQL> rollback;

Rollback complete.

SQL> insert into tst values ('adsf');

1 row created.

SQL> select xidusn from v$transaction;

    XIDUSN
----------
        17

SQL> rollback;

Rollback complete.

SQL> update tst set c=lower(c) where rownum<=1;

1 row updated.

SQL> select xidusn from v$transaction;

    XIDUSN
----------
        18

SQL> rollback;

Rollback complete.

SQL> update tst set c=lower(c) where rownum<=1;

1 row updated.

SQL> select xidusn from v$transaction;


    XIDUSN
----------
        19

SQL> SQL> rollback;

Rollback complete.

SQL> update tst set c=lower(c) where rownum<=1;

1 row updated.

SQL> select xidusn from v$transaction;

    XIDUSN
----------
        20

SQL>
 

to Mike

April 22, 2006 - 8:16 pm UTC

Reviewer: A reader

But are the XIDUSN's round robin across separate sessions? Perhaps sessions (and their transactions) do not live long enough for other sessions to go somewhere other than rbs#1 for some particular transaction profile?

what do you mean by "bad luck"

April 22, 2006 - 8:56 pm UTC

Reviewer: Mike from CA

Hi Tom,
Sorry, I dont understand your answers. Let me rephrase my questions.
First one, does Oracle always starts with USN1 when doing round robin for rollback segment? Or does oracle starts from the last USN being used, then the next one will be USN+1?

Second one, if oracle always starts fron USN1, then the first rollback segment is going to be very hot, how to avoid this?

Tom Kyte

Followup  

April 23, 2006 - 5:22 am UTC

"bad luck" would be one option (eg: it is purely a coincidence)

"use rollback segment" could be another option (you have an application that forces itself to use this RBS)

Oracle does not always start from one - you have demonstrated that yourself.


remember - wraps is not an indication of "use", it is not showing how "hot" a rollback segment is.  It shows how often the rbs had to wrap a transaction into another extent.

It could also be that usn1 has smaller extents than the others.  In the following example - one rbs has 64k extents, the rest have 1m extents:


ops$tkyte@ORA9IR2> create table t
  2  as
  3  select * from v$transaction where 1=0;

Table created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t2 as select * from all_objects where rownum <= 200;

Table created.

ops$tkyte@ORA9IR2> create index t2_idx on t2(object_name,owner,object_type);

Index created.

ops$tkyte@ORA9IR2> alter table t2 modify object_name varchar2(40);

Table altered.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select usn, wraps from v$rollstat;

       USN      WRAPS
---------- ----------
         0          0
        11          0
        12          0
        13          0
        14          0
        15          0

6 rows selected.

<b>started cold, all stats reset... now do 1,000 transactions:</b>

ops$tkyte@ORA9IR2> begin
  2          for i in 1 .. 1000
  3          loop
  4                  update t2 set object_name = systimestamp;
  5                  insert into t select * from v$transaction;
  6                  commit;
  7          end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> select xidusn, sum(used_ublk), count(*) from t group by xidusn;

    XIDUSN SUM(USED_UBLK)   COUNT(*)
---------- -------------- ----------
        11           2000        200
        12           2010        201
        13           2001        200
        14           2001        200
        15           1992        200



<b>each RBS used about the same number of blocks and had about the same number of transactions - but:</b>

ops$tkyte@ORA9IR2> select usn, wraps from v$rollstat;

       USN      WRAPS
---------- ----------
         0          0
        11         25
        12         25
        13        421
        14         25
        15         25

6 rows selected.

<b>rbs with USN=13 'wrapped' more, meaning that our transaction spanned extents more often. (because my transaction was about 10 or 11 blocks of undo and that rbs had 8 blocks per extent....)</b> 

April 23, 2006 - 12:52 pm UTC

Reviewer: Mike from CA

Hi Tom,
Thank you very much for spending time with the detail. Since the size of the RBS is the same, not the case as you demenstrated. As for "bad luck", it seems not the case either, we have many databases showing similary pattern, which is hard to say it's random bad luck.

So the only possibility is 'SET TRANSACTION', although it's very unlikely since we are busy OLTP systems with many small transactions, and it makes no sense for applications to forse RBS01 since all RBS are same size.
But, I would like to how to find this out without scanning application codes, not sure it's possible or not?

For you following comment, i think it may not apply to our OLTP systems with pretty much similar size small transactions, if transaction sizes are the same, does not high wraps mean more transaction data being allocated to this rollback segment, hence more transactions?

<quote>
remember - wraps is not an indication of "use", it is not showing how "hot" a
rollback segment is. It shows how often the rbs had to wrap a transaction into
another extent.
</quote>

Sincerely,

Tom Kyte

Followup  

April 23, 2006 - 2:41 pm UTC

"size" != "extents" please demonstrate conclusively that the extents are the same size. I can have two rbs's that are 50 meg - one using 64k extents and one using 1m extents - no problem.

In order to see if people are using set transaction - you would have to see the code.

wraps are about transactions spanning rollback segment extents, they are not a measure of "usage" of a rbs. It is when a transaction spans extents, it is mostly about "size of extents and size of transactions"



OLTP system with same rollback segment extent size, number of extents

April 23, 2006 - 7:39 pm UTC

Reviewer: Mike from CA

Hi Tom,
I understood what you said. However it's just not the case of our system.

<quote>
wraps are about transactions spanning rollback segment extents, they are not a
measure of "usage" of a rbs. It is when a transaction spans extents, it is
mostly about "size of extents and size of transactions"
</quote>

Under the circumstance that all rollback segments have same number of extents, and same extent size, same minextents, and we have pretty much uniform OLTP type small transaction size, does not number of wraps reflect number of transactions?

Which in turn reflects which RBS has most transactions allocated during the statspack time window? (we never have more than minextent in rollback segments, extends=0).

Unless big size transactions always get allocated to RBS01, but how could that be possible. Each rollback segments should get equal chance no matter big or small size transactions get assigned to them.

That's why it puzzles me.
Best Regards,



Tom Kyte

Followup  

April 24, 2006 - 12:02 am UTC

I don't believe you have the same sized transactions - there are lots of transactions happening in the background that you are not responsible for. And I'm sure you have some other sized transactions (that you felt the only way to find out if someone was using "set transaction" indicates there are perhaps things going on you don't know about).




does transaction sizes matter?

April 24, 2006 - 2:57 pm UTC

Reviewer: Mike from CA

Hi Tom,
Let's suppose that we have different size transactions in our system, does that really matter? Because eventually no matter big or small transactions get equal chance of all available rollback segments? Why RBS01 gets more chance of either more transactions or big transactions, it just makes no sense to me.

Secondly, for 'SET TRANSACTION' scernaio, I would be supprised they are developers in deed doing this for no reason, since all rollback segments are same size, why bother to SET TRANSACTIOn to rbs01, but it seems only possibility for me to rule out in order to solve this puzzle.

Thank you always!

Tom Kyte

Followup  

April 24, 2006 - 3:09 pm UTC

Just commenting on the fact you said they are all the same size - I doubt that.

The only thing that is popping into my head would be different sized extents (which you say they are, but haven't shown they are...) and a set transaction statement.

RBS Anomaly

April 24, 2006 - 4:16 pm UTC

Reviewer: Jonathan Lewis from UK

Mike,

Your results look a little odd.

Theoretically, all rollback segments will tend to grow to the same size and suffer the same amount of activity - over a long enough period of time.

I haven't done any experiments with allocation for a long time - but the algorithm __seemed__ to be a modified round-robin (using the order of rollback segment names in the rollback_segments parameter, if set) that picked the next rollback segment in the sequence that had the current minimum number of active transactions.

Apart from deliberate 'set transaction', and the luck of the odd massive batch processes which could distort the results in the short term, the principle is that no rollback segment gets more rollback written than any other rollback segment so that the chance of a 1555 is kept to a minimum.

How frequently do you restart the database - if ever ?
Do you ever take rollback segments offline and bring them back online ?

It might be helpful to see the result of the following query:

column usn format 999
column extents format 999 heading "Ext"
column gets format 999,999
column waits format 9,999

select
usn,
extents,
rssize,
writes,
gets,
waits,
optsize,
extends,
aveshrink
from
v$rollstat
where
status = 'ONLINE'
;




pretty much rule out "SET TRANSACTION"

April 24, 2006 - 9:04 pm UTC

Reviewer: Mike from CA

Hi Tom and John,
Thank you very much for your help. Indeed we even tried to drop RBS01, if application had SET TRANSACTION they would have failed, however, not the case. I totally agree what you guy said about how rollback segment should be assigned, and that's what i expected. That's why it puzzles me when the fact tells me different story. This is one of our databases having this pattern in statspack report. I even guessed thats some bugs in statspack and manully get data from v$rollstat and its consistent that RBS01 has highest hits.
Maybe some bugs in database.

statspack from 00:00 - 15:00 window

        Trans Table       Pct   Undo Bytes
RBS No      Gets        Waits     Written        Wraps  Shrinks  Extends
------ -------------- ------- --------------- -------- -------- --------
     0          251.0    0.00               0        0        0        0
     1   31,279,296.0    1.20    -698,128,476      185        0        0
     2    6,563,535.0    0.32     730,348,910       37        0        0
     3    3,192,161.0    0.15     361,163,990       19        0        0
     4    2,934,321.0    0.14     337,923,108       17        0        0
     5    2,946,402.0    0.14     335,525,984       17        0        0
     6    2,966,193.0    0.14     336,334,926       17        0        0
     7    2,954,845.0    0.14     332,669,000       17        0        0
     8    2,944,023.0    0.14     334,236,136       17        0        0
     9    2,967,883.0    0.14     336,181,944       17        0        0
    10    2,896,844.0    0.14     333,535,124       17        0        0
    11    2,946,669.0    0.14     334,020,522       17        0        0
    12    2,881,340.0    0.14     335,648,526       17        0        0
    13    2,963,002.0    0.14     338,949,248       17        0        0
    14    2,928,279.0    0.14     336,286,612       17        0        0
    15    2,962,280.0    0.14     335,311,682       17        0        0
    16    2,970,973.0    0.14     338,463,368       18        0        0
    17    2,917,708.0    0.14     336,439,644       17        0        0
    18    2,941,931.0    0.14     336,330,720       17        0        0
    19    2,956,033.0    0.14     335,757,558       17        0        0
    20    2,916,111.0    0.14     335,019,482       17        0        0
          -------------------------------------------------------------



SQL> select segment_name, initial_extent, next_extent, min_extents, max_extents, pct_increase, status from dba_rollback_segs;

SEGMENT_NAME                   INITIAL_EXTENT NEXT_EXTENT MIN_EXTENTS MAX_EXTENTS PCT_INCREASE STATUS
------------------------------ -------------- ----------- ----------- ----------- ------------ ----------------
SYSTEM                                  57344       57344           2         505            0 ONLINE
RBS01                                20971520    20971520          20         150            0 ONLINE
RBS02                                20971520    20971520          20         150            0 ONLINE
RBS03                                20971520    20971520          20         150            0 ONLINE
RBS04                                20971520    20971520          20         150            0 ONLINE
RBS05                                20971520    20971520          20         150            0 ONLINE
RBS06                                20971520    20971520          20         150            0 ONLINE
RBS07                                20971520    20971520          20         150            0 ONLINE
RBS08                                20971520    20971520          20         150            0 ONLINE
RBS09                                20971520    20971520          20         150            0 ONLINE
RBS10                                20971520    20971520          20         150            0 ONLINE
RBS11                                20971520    20971520          20         150            0 ONLINE
RBS12                                20971520    20971520          20         150            0 ONLINE
RBS13                                20971520    20971520          20         150            0 ONLINE
RBS14                                20971520    20971520          20         150            0 ONLINE
RBS15                                20971520    20971520          20         150            0 ONLINE
RBS16                                20971520    20971520          20         150            0 ONLINE
RBS17                                20971520    20971520          20         150            0 ONLINE
RBS18                                20971520    20971520          20         150            0 ONLINE
RBS19                                20971520    20971520          20         150            0 ONLINE
RBS20                                20971520    20971520          20         150            0 ONLINE
 

April 25, 2006 - 9:40 am UTC

Reviewer: candba from Toronto

Tom ,
We have been encountering 'snapshot too old error ' in odd cases recently - we have a database storing application usage data . This database has some tables which are created by a third party tool . One of the tables have grown to about 3.4GB in size & approx 25M records . Now this table keeps on giving us 'snapshot too old' error to us whenever :
1.We try to export this table
2.We try to calculate stats for this table using either dbms_stats or analyze

I have never seen this error happening for the above cases.

Can you please help me in understanding what might be the cause ?

We are using 9206/solaris 8 and our undo tablespace is 8GB in size.

Thanks .

Tom Kyte

Followup  

April 25, 2006 - 10:25 am UTC

what is your undo_retention set to.

April 25, 2006 - 11:25 am UTC

Reviewer: Candba from Canada

NAME TYPE VALUE
------------------------------------ ----------- ---------------------------
undo_management string AUTO
undo_retention integer 900
undo_suppress_errors boolean FALSE
undo_tablespace string undo_tbs

Tom Kyte

Followup  

April 25, 2006 - 2:21 pm UTC

900 seconds = 15 minutes.

Probably a tad "short" for undo retention isn't it? You want undo retention to be longer than your longest running query at least.

how long does it take to read this table from start to finish...

April 25, 2006 - 4:00 pm UTC

Reviewer: candba

Tom ,
How can I find out the time needed to read this table start to finish ?

A " select count(*) from table_name ; " took 04 minutes 18 seconds to return the result.


Thanks.

Tom Kyte

Followup  

April 25, 2006 - 11:54 pm UTC

well, it is not just the time to read it - it is the time to fetch the entire row to the client (exp) and write it to disk.

check out v$undostat - it has this metric (maxquerylen)

RBS Anomaly revisited

April 26, 2006 - 11:07 am UTC

Reviewer: jonathan lewis from UK

Mike,

Coincidentally, I've been kicking a 10g database quite hard with large numbers of small transactions that (historically) should have rotated through the available rollback segments.

In fact, they displayed the behaviour you have described - about 60% of the work took place in the first segment, 25% in the second, and 15% in the third.

I don't have time to investigate it at present, but I think it has to fall under one of two headings:
a) Bug
b) Deliberate strategy change - possibly brought in with the implementation of automatic undo - that biases the system towards using the low numbered rollback segments so that higher numbered rollback segments can be shrunk and taken off line. (Automatic undo adds segments pretty aggressively, so there has to be a mechanism that allows them to become redundant - this might be it).



April 26, 2006 - 7:30 pm UTC

Reviewer: candba

Increasing the undo_retention to 3hrs from 15 minutes seems to have solved all of " snapshot too old " problems .
We can export/collects stats successfully now .

Thanks for the help , Tom.

Does "fetch across commit" occured in delete operation?

April 28, 2006 - 10:07 am UTC

Reviewer: A reader

Hi Tom,

1) If delete and commit in the cursor, does this also generate the ORA-01555?

2) Does this mean that the chance of hitting this ORA-01555 is higher if the record is smaller (since one block contains more record)?

Thanks
David

Tom Kyte

Followup  

April 28, 2006 - 12:43 pm UTC

tell me how you "delete and commit in the cursor" first - I didn't get that.

the probability of a 1555 is directly related to the amount of undo generated in a period of time and the amount of undo space you have configured for the system.

if you delete lots of small records, it'll generate as much undo as a couple of big records would - it is about the volume of undo generated - not the size of the records.

April 28, 2006 - 9:52 pm UTC

Reviewer: A reader

Hi Tom,

I mean:

set rollback segment big_rollback
open cursor with
select * from a, b where a.id
between 1 and 100 and a.id = b.id
delete from a
number of record = number of record + 1
if number of record > 10
then commit and set rollback segment big_rollback
close cursor

Does this also causes the ORA-01555?

Thanks
David

Tom Kyte

Followup  

April 29, 2006 - 3:05 am UTC

the probability of a 1555 is related to the size of your smallest rollback segment.


you are missing some sort of loop there - logic doesn't make sense.


Yes, you would be setting yourself up for a 1555.

Let us say the logic you meant was this:

set transaction use big_rollback;
for x in ( your_cursor_that_reads_A )
loop
delete from a where key = x.something;
cnt = cnt+1
if (mod(cnt,10) = 0)
then
commit
set transaction use big_rollback;
end if;
end loop;
commit;


NOW -
IF big_rollback is big enough to hold all of the undo you need for read consistency on table A in the cursor

THEN
one must really question WHY you are committing - what is the point?
other than to waste time, and increase resource usage

BECAUSE
the rollback segment must be big enough to NOT wrap around on itself
and therefore, it could have held all of the undo and hence you did not
need to "commit to free up the undo"



So, why would you be commiting like that one wonders. You must have sufficient undo configured to hold it all without overwriting it in the first place.

April 29, 2006 - 4:06 am UTC

Reviewer: A reader

Hi Tom,

Thanks again for your information.

The reason why we use commit inside the loop is the rollback segment is not large enough for the whole "delete" operation. In fact, the program delete about 10,000,000 records from 30,000,000 records. This table also has 5 indexes. The time to execute this program is very slow (about 10-20 hours).

However, we found that this program can execute successfully when the rollback segment was set to 1G. But the last time this program executed with ORA-01555, the rollback segment was set to 10G!!! The number of records to delete in these two executions was nearly the same, and only this transaction was in the database. Later we found that there is a "commit" in the cursor loop. We don't know whether this is the main reason why ORA-01555 occured, but what puzzled us is why this error was shown with larger rollback segment on the same program.





Tom Kyte

Followup  

April 30, 2006 - 4:48 am UTC

that is due to the fact you are NEEDING the data you have placed into the undo segment. But, since you commited the transaction - we are free to reuse them - and so we do.


it is exactly what I said above - you are reading table A. You are also modifying it. For as long as you read it - you need the undo you generated against it, but you commit and free it. That is the surest way to a 1555.


You might consider that instead of deleting 30,000,000 records, you look at a better way to accomplish that (that does not involve delete) such as

a) partitioning - so you can just truncate
b) using create table as select <records to keep> + index + grant + whatever and then drop old table rename new table

for example


In 2006, 10gb is but a proverbial drop in the bucket - that is another thing to consider, it just isn't that much disk anymore. Meaning "10G!!!" looks funny - it just isn't that unusually big (meaning also, 20G is ok, 40G could be ok)

April 29, 2006 - 5:34 am UTC

Reviewer: Mohamed from France

Dear Tom,

If this person has read your book Expert-one-to-one(particularly chapter 4) he will get the answer to his question. The difference between ora-01555 and ora-01562 is clearly stated their. I have just to add that even if you set this
set transaction use big_rollback;
You are not 100% sure that your transaction will be using big_rollback segment

Anyway, when based on your book and validated by my experience,I explain to DBA that we don't have to commit in side the loop, they said ok but each, say N transactions, close your loop and commit. And again open your cursor loop and close your loop and commit and so on...

What's your opinion Tom

Thanks a lot

Tom Kyte

Followup  

April 30, 2006 - 4:50 am UTC

my opinion is the DBA's have just told you how to set "slow = true" in your code.

May 01, 2006 - 9:13 am UTC

Reviewer: A reader

Hi Tom,

As mentioned about, we are using delete DML and commit statement inside cursor loop. We found that ORA-01555 occured even the rollback segment is set to 10G (but 1G may sometiems OK). Could you help me to clarify whether my understanding is correct to explain why such error occured:

Assume record 1, 4, 7 is in data block 1.
record 2, 5, 8 is in data block 2.
record 3, 6, 9 is in data block 3.

Transaciton begins, deletes two records and now commits. At that time, assume RBS allocates two extents for this transaction so far.

Later, the transaction want to delete the 3rd record. RBS sees there are no acive transaction in extent 1 and so it wraps to extent 1 for the transaction.

Then the transaction want to delete the 4th record. As this record is in the same data block with 1st record that deleted earlier, and because of extent 1 is overwritten by the same transaction, ORA-01555 shown (assume data block is not in buffer cache).

Am my understanding is correct to explain one of the possible reasons that ORA-01555 occured even the rollback segment is set to 10G?

Thanks,
David


Tom Kyte

Followup  

May 02, 2006 - 2:53 am UTC

The reason you get the 1555 is because your rollback segment is not permanently allocated large enough to hold all of the undo generated during your processing and you are overwriting some information you need - yourself. You are shooting yourself in the foot.

That you NEED to commit because if you don't you "run out of rollback space" proves that pretty much conclusively. If you

a) read from table A
b) modify table A
c) MUST commit because while doing A and B you run out of rollback

then - you are a prime candidate for the "classic" cause of an ora-1555.

You are doing exactly that.


Your example (confusing to follow - but I think when you said "transaction begins and deletes two records" - you meant - "transaction begins and deletes record #1 in block 1, record #2 in block 2" - not just any two records) would be an example.

More simply put:

o at the beginning of your transaction you removed a row from block #1.

o you commit - allowing the undo that protected this modification to be overwritten.

o you in fact DO overwrite that undo - as the rollback segment wraps around.

o later you need to REVISIT block #1, but you cannot as the undo you need has been overwritten BY YOU

Implying: you need larger, permanently configured rollback OR you need to totally change your programs logic so as to not hold the cursor open across commits - leading to infinitely more complex code that runs slower.


I opt for option 1 - disk is cheap, our time is expensive.

May 02, 2006 - 6:35 am UTC

Reviewer: A reader

Hi Tom,

Thanks for the information.

Just want to clarify, for the option 1 you suggested, should we first remove the "commit" statement that's inside the cursor loop, and then use a large rollback segment for the whole delete operation? i.e. using a large rollback segment cannot solve ORA-01555 error in fetch across commit.

Thanks,
David

Tom Kyte

Followup  

May 02, 2006 - 7:29 am UTC

using a large rollback statement with a commit in a cursor for loop is no assurance of not getting a 1555 - that is correct.


You either need to either:

a) remove the commit and let the undo survive for the duration of the transaction.

b) entirely rewrite your algorithm so as to NOT include having a cursor open across your commits (since your cursor is reading the data you are committing against and releasing the undo for). This'll generally be hard to write, hard to maintain, hard to debug, and run slower.

May 02, 2006 - 9:05 am UTC

Reviewer: A reader

Hi Tom,

Thanks so much for your help.

Chances of getting Snap shot too old error on a table where no dml

May 15, 2006 - 5:57 pm UTC

Reviewer: Sarma from Herndon, VA

Hi Tom,

I am querying a big table after making sure there will not be any dml operations on that table while my query is running. Would this gurantee me that I will not hit snap shot too old error?

Thanks
Sarma

Tom Kyte

Followup  

May 15, 2006 - 6:19 pm UTC

no, it would not.

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>



still not clear

May 16, 2006 - 11:04 am UTC

Reviewer: Venu from herndon va 20171

Hi Tom, here is the scenario:
1. we're running on 10g R1.
2. there are no dmls going on this table while my query on this table is running.
3. are there any chances that I would get snap shot too old error? If yes, could you please explain us?

Thank you,
Venu

Tom Kyte

Followup  

May 16, 2006 - 11:20 am UTC

YES, the answer is YES.

the link I pointed you to demonstrates YES.

A 1555 happened on a table in a READ ONLY tablespace.

May 16, 2006 - 11:24 am UTC

Reviewer: venugopala raju dandu from herndon va 20171

in the link you pointed me , there a table named "t" dml was done on it but delayed clean out was not done. If the clean out was done before putting the tablespace in read only. The snap shot too old error won't occur. Is that correct?

Regards,
Venu

Tom Kyte

Followup  

May 16, 2006 - 11:40 am UTC

correct.

the 1555 happened due to the fact the blocks had not yet been cleaned out and Oracle couldn't decide if they had been modified since the statement performing the query began.

delayed block clean out

May 17, 2006 - 12:23 pm UTC

Reviewer: venugopala raju dandu from herndon va US

Hi Tom,

A question related to delayed block clean out explained in your book "Expert one on one" page 194:

The following is the excerpt from the paragraph next to the bullet point 6 is as follows:

"If commit SCN is less than t1, our query can use this block. If commit SCN is greater than t1, our query
must rollback that block. The problem is however, that our query is unable to determine in this particular case
if the commit SCN of the block is greater than or less than t1. It is unsure as to whether it can use it or not.
The or-1555 then results"

could you please explain what happens to this block after throwing the error ora-1555?
If I open a new session and query the same data(data in that block), would I get the same error(ora-1555) again?

Regards,
Venu


Tom Kyte

Followup  

May 18, 2006 - 10:09 am UTC

you would not get the 1555 the second time, T1 is "reset" and we will know now that the commit scn associated with the block is just fine.

May 18, 2006 - 11:25 am UTC

Reviewer: venugopala raju dandu from herndon va US

Hi Tom,

I'm eager to know what exactly happens (the series of events) in the database when we're thrown with ora-1555 error. which value "t1" will be "reset" to?

Regards,
Venu

Tom Kyte

Followup  

May 19, 2006 - 9:54 am UTC

"right now", "t1" is "start of query"

May 18, 2006 - 11:29 am UTC

Reviewer: venugopala raju dandu from herndon va US

Hi Tom,

the reason for getting this error was as it was not able to find the commit scn for that block in the rollback segment header. How will we know that the commit scn associated with the block is just fine?

Regards,
Venu

Tom Kyte

Followup  

May 19, 2006 - 9:57 am UTC

because we restart and the time is "now" and we know the state of the system - the maximum commit scn is much further advanced NOW then it was when we first tried.

We'll know the transaction that left the block dirty is way committed this second time around

To rephrase Tom's comments...

May 19, 2006 - 12:23 pm UTC

Reviewer: Roderick

When you issue a query, the query will try to make sure all blocks it visits has an SCN < the SCN of the time query started (T1). So, when your query visits a block that needs cleanout, if Oracle cannot determine the actual commit SCN (because undo segments are too small), it will calculate a safest estimate SCN (T2) and write that value into the block, thereby cleaning it out. If T2 > T1, then you recieve an ORA-1555 error. Even in the worst case, T2 will be equal to the current SCN. If you re-execute the query, your new query SCN (T3) is guaranteed to be >= T2 that was stored in the cleaned out data block since the current SCN will be the same or will have advanced by the time you re-execute. [One exception would be if you use Flashback Query and specify an "AS OF timestamp or SCN" that is < T2.] If you get yet another ORA-1555 error with query SCN T3, then it is likely that the system is write intensive with undo segments that are too small, and the query encountered yet another block that needed cleanout.

does oracle support read as-is

May 27, 2006 - 5:06 pm UTC

Reviewer: jianhui from CA

Hi Tom,

In some scenarios, my database users don't care read consistency. Although very unlikely I am still wondering whether it's supported by oracle to read data as is so these users will not get ORA-1555 error.(data consistency is less important than getting data out of database in this scenario).

Best Regards,


Tom Kyte

Followup  

May 27, 2006 - 9:52 pm UTC

If getting data is out - then the answer is "size your undo tablespace". That is the way Oracle works.

transaction slot overwritten case

May 28, 2006 - 5:53 pm UTC

Reviewer: Jianhui from CA

Hi Tom,
Thanks for the reply. However, we are still at stone age, yep we are still using manaual rollback segment management.

Statspack report shows average rollback segment extent get cycled in more than 3 hours (calculated by #wraps / #extents for each RBS) during busy business hour. These query usually takes less than 1 hour to run, so it's unlikely to be the case that rollback segment itself being overwritten. We already increased total RBS to nearly 100G.

At the same time, this is very busy OLTP database with thousands of executions per second and many small transactions. Long queries are getting data from same tables the OLTP applications are busy writting to.

So that leads me to think it's more the case that transaction slot gets overwritten. The question is that whether the max number of transaction slot is fixed per rollback segment or not? otherwise, increasing RBS size is not going to help, adding more RBS seems to make more sense sicne it adds more transaction slots.

Thanks

Tom Kyte

Followup  

May 28, 2006 - 6:28 pm UTC

so? you can still SIZE your rbs properly for the work you do, has been possible since day 1.

#wraps is not "wrap arounds", it is a count of transactions that span rollback segment extents (wrap from one to the other extent).

You need to look at

a) amount of undo generated per hour
b) say roughly double it (to be safe)
c) permanently have allocated AT LEAST that much.



May 28, 2006 - 9:02 pm UTC

Reviewer: jianhui from CA

Hi Tom,
I never said wrap is wrapping around. That's why I mentioned #wraps/#extents is the number of wrapping around.

For instance, statspack 3 hours window, and I have 150 wraps with 160 min extents(never got extended), wraps/exetents is less than 1 meanning RBS never gets wrapped around in 3 hours.

So I was suspecting transaction slot being over written scenario, is max number of transaction slot per rollback segment fixed?

Sincerely,

Tom Kyte

Followup  

May 28, 2006 - 9:57 pm UTC

but #wraps / #extents is *not* the number of wrapping arounds? Either that or I don't know what you mean by wrapping around.


#wraps is number of times a transaction spanned more than one extent in an RBS.


What do YOU mean by "wrapped around"

If you mean "rollback advanced and reused stuff" - that isn't what this statistic (wraps) is counting.

May 29, 2006 - 1:20 pm UTC

Reviewer: jianhui from CA

I understand wraps is number of times transaction enters into a new extent, for example, from extent#1 into extent#2 then wraps gets increased by 1. Since the rollback segment in my database has fixed amount of extents(always stays at minextents, never got extended), so total number of wraps divided by number of extents is the number of times that rollbask segment gets full cycled(wrapping around).

For example, if a rollback segment has 20 mininum extents and never gets extended during statspack time window, let's say 3 hours, and number of wraps is 60, so the average time for a full cycle is around 1 hour, meaning the possibility rollback segment being overwritten is greater than 1 hour. If my queries run less than 1 hour, hypothetically, I should not hit ORA-1555 because of rollback segment being overwritten. Since I still get ORA-1555 error, so I suspect the transaction slot gets overwitten because we have many small transactions going on at the same time against the same table and they can not be stopped due to business reasons.

So is the max number of transaction slot of each rollback segment a fixed number or is it proportional to the size of rollback segment? If it's fixed, then increasing size of rollback segment is not going to help, but adding more rollback segments seems making more sense.

Hope that explains what i meant at the beginning.

Sincerely

Tom Kyte

Followup  

May 30, 2006 - 9:08 am UTC

we (you and I) have no control/true insight into the number of slots - adding more rbs would give you more slots.

Question though, these queries that take an hour, are they making inappropriate use of indexes - should they perhaps be "faster" (by full scanning)

I'll try more rollback segments approach

May 30, 2006 - 5:53 pm UTC

Reviewer: jianhui from CA

Tom,
Thank you for the reply. As for index scan or full table scan, yes, we've already asked developers to use full table scan hints in these queries. Besides, this is probably the time to say goodbye to manual rollback segment management, we'll take an outage for that :-)
Sincerely

does increasing size of rollback segment give more slot?

May 30, 2006 - 5:57 pm UTC

Reviewer: Jianhui from CA

Tom,
Of course adding more rollback segments will give us more transaction slot? How about increasing size of rollback segment? Does it give us more slot or unknown?(My guess is no, since the header block is fixed in size)
Cheers,

Tom Kyte

Followup  

May 30, 2006 - 7:12 pm UTC

The size of RBS's vary all of the time (they grow and shrink) in general - the size would be the same.



RBS transaction slots

May 30, 2006 - 9:31 pm UTC

Reviewer: Roderick

The number of transaction slots in an RBS header is fixed no matter how large or small the RBS is, but Oracle will save "undo" for the transaction slot changes in the same RBS where the changes are made, so larger RBSes will store more transaction slot change history.

You should be looking at shorter statspack report over shorter periods. Even, for example, if it took 3 hours for 60 wraps, it still means it was feasible that the first 20 wraps occurred within 30 minutes and caused an ORA-1555 for a long running query running at the same time. Then the remaining 40 wraps may have leisurely occurred over the last 2.5 hours of the reporting period. This might be more true if the environment does heavy updates in sporadic batches as opposed to a in a nice steady stream.

undo saved for transaction slot

May 30, 2006 - 11:43 pm UTC

Reviewer: jianhui from CA

Roderick,
Thank you for the note. Could you post any official document talking about saving transaction slot information in the same rollback segment?

As for the statspack period, I have smaller time window, 3 hours was just an analogy. Our peak business hour generates 10G undo and rollback segments already 60GB in total.

Regards,

Official documents?

June 01, 2006 - 1:33 am UTC

Reviewer: Roderick

Unfortunately, I don't know of any papers from Oracle that cover the topic in that level of detail.

Great Work... Saved me a lot of time

August 02, 2006 - 1:31 am UTC

Reviewer: Ash LeClair from Palmyra, NJ USA

Had a procedure working for years that started producing the snapshot error. After reading the article it was clear that the table size of the target table had "outgrown" the commits inside of the loops of the procedure.

A reader from India

September 14, 2006 - 6:10 am UTC

Reviewer: prem from INDIA

Hi tom,

The above information where very precise and useful.

ORA-01555 error

September 14, 2006 - 6:29 am UTC

Reviewer: A reader from Australia

Hi Tom,

First of all thank you very much for your great help to DBA world.

We have been getting ORA-01555 error from couple of days. I would like to resolve this problem by modifying rollback sgment storage.

We have 35 rollback segments with following storage clause

initial=2M,next=2M, minimum extents=20 maxextents=32657 (This is unlimited if I am not wrong? DB_BLOCK_SIZE=8K)

From databse side what do I need to do to minimise chances of getting ORA-01555? We are on 9.2.0.5 (Though using rollback segment) and I am sure that the error is coming from badly written code but it is not something which I can fix in a day that is why I have decided to change storage clause of rollback segments!

My questions are as under
(1) Would it help us if we would modify our rollback segment storage as under?

INITIAL=10M (instead of 2M)
NEXT=10M (instead of 2M)
MINIMUM extents=40 (Instead of 20)

(2) Will increasing number of rollback segments help in avoiding ORA-01555 error? (We have just alocated 9GB space to rollback tablespace)

(3) In my wonder (Or lack of knowledge)Oracle did not put any error message in ALERT log file(Though application cought the error from procedure) is it a bug?


Cheers,

Tom Kyte

Followup  

September 14, 2006 - 9:13 am UTC

if you are using manual rollback segments you need to figure out:

what is your rate of undo generation.
what is the duration of your longest running query.

ensure that enough permanently configured undo space is allocated (based on rate of generation) to get your through the period of time of your longest queries.

1555's are application errors, not logged in alert log, no bug.

One more question please

September 14, 2006 - 9:29 am UTC

Reviewer: A reader

Tom,

Thanks for the quick response.
Some how I did not get your point about permanent undo space, did you mean initial extents or minimum extents?


Cheers,

Tom Kyte

Followup  

September 14, 2006 - 9:55 am UTC

after you issue the create rollback segment statements (so that initial,next and minextents are all allocated) - the sum of the sizes.

that is your "permanently allocated undo space - the MINIMUM about of undo space your system will ever have allocated"

Committed insert

November 07, 2006 - 11:06 pm UTC

Reviewer: Martyn from Brisbane, Australia

Tom - with regards your response to "A Reader" on June 09, 2003.
We have a large insert that is then committed.
A query against that table follows after the commit and the table being queried is not updated again.
How can that cause an ORA-01555, surely the data being queried remains consistent throughout the query?

Tom Kyte

Followup  

November 08, 2006 - 8:23 am UTC

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>

Fetch across commits

November 30, 2006 - 11:57 pm UTC

Reviewer: Shibdas Roy from India

Hi Tom,

I have gone through your wonderful explanation of this topic (ORA 01555) in "Expert Oracle Architecture".

1. "It is the very ACT of cleaning out a block that did not get cleaned out that causes the 1555. I can get an ora-1555 against a table EVEN THOUGH there have been no modifications to the table since my query began (the last modification to that table committed before my query began and NO ONE else has touched that table)."

Could you please put more light on this? Is it only due to delayed block cleanout that we can face 1555 against a table EVEN THOUGH we have not modified it?

2. What is "immediate cleanout" and how is it different from delayed block cleanout?

3. Could you please elaborate on "Fetch across commits" as one of the reasons for this error, the 2 cases being where i) we are updating the same table that our cursor queries on and
ii) we are updating a table other than the one our cursor queries on.
The latter of these is what I am not thoroughly clear about. In this context, I find almost all texts saying that we should avoid committing in a LOOP. Does that mean we cannot get 1555 at all if we commit outside the LOOP? In your book, I guess you have mentioned that we might get error saying cannot extend rollback segment in case we don't commit inside a long cursor FOR loop. And this error doesn't lead us to an inconsistent and non-restartable state, whereas 1555 does. But, does that ensure 1555 will not be faced at all?

Tom Kyte

Followup  

December 01, 2006 - 5:24 am UTC

1) </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429 <code>

2) it is immediate, not delayed - how else to say it? It happens with the transaction, it is not delayed until later.

3) I really did cover this as best as I could in the book - if that didn't convey the material - you'll have to find someone else to say it in a different way.


does that mean you cannot get a 1555 if you commit outside the loop - of course not, a 1555 is raised when undo you need no longer exists. You or ANYONE ELSE may have overwritten it. It is a fact that if you read and modify the same table and commit frequently - you increase the chance that YOU cause the 1555 for YOURSELF. But not committing will NOT prevent the 1555 from occurring - only properly sized undo will do that for you.



Fetching across commits

December 01, 2006 - 12:15 am UTC

Reviewer: Shibdas Roy from India

This is what you have mentioned in your book w.r.t. my 3rd question above:

The major differences between the two errors are as follows:
• The ORA-01555 example left my update in a totally unknown state. Some of the work had been done; some had not.
• There is absolutely nothing I can do to avoid the ORA-01555, given that I committed in the cursor FOR loop.
• The ORA-30036 error can be avoided by allocating appropriate resources in the system.

This error is avoidable by correct sizing; the first error is not. Further, even if I don’t avoid this error, at least the update is rolled back and the database is left in a known, consistent state—I’m not left halfway through some large update.
The bottom line here is that you cannot “save” on undo space by committing frequently — you need that undo. I was in a single-user system when I received the ORA-01555 error. It takes only one session to cause that error, and many times even in real life it is a single session causing
its own ORA-01555 errors. Developers and DBAs need to work together to size these segments adequately for the jobs that need to be done. There can be no short-changing here.

December 17, 2006 - 6:05 am UTC

Reviewer: A reader

1) When we do a select count(*) from big_table to clean out the blocks, and we
have started the query at a QENV that is greater than the SCN of a Rollback
segment of an updated block. Now the Slot has been overwritten as the data has
been committed.
The question is, the select count(*) ASSUMES that the slot has been overwritten
because the data has been committed and cleans out the block i.e. sets the blocks header as
+----+--------------+
| tx | None |
+----+--------------+

2) I am assuming answer to one is yes, if right, this is pretty much a followup.
Does the 1555 happen for a Query with the cause as delayed block cleanout with
the sole purpose of indicating the query results CANNOT be consistent.

That is at start of query when QENV was 50, the RBS's SCN was indicated to be
51, but by the time it reaches Block B, which indicates it has not been cleaned
up, the RBS's SCN has moved to say 54. And the transaction slot used by the data
block B has been overwritten. Now although we can presume we have committed the
data (Question 1), the query cannot gaurantee data consistency, hence generates
1555?


ORA-01555 on SELECT

January 25, 2007 - 2:19 pm UTC

Reviewer: Sanji from Shelton, CT

Tom,

We are encountering ora-1555 error almost everyday at a particular time.

ORA-01555 caused by SQL statement below (SCN: 0x0001.dd102ccb):
Thu Jan 25 01:52:10 2007
select /*+ INDEX_ASC(WSC WSCJAN1)*/ WSC.WYNNE_ID, WSC.R_NAME, WSC.R_GROUP, WSC.RMAN_LOC, WSC.R_MENU, WSC.MENU_DESC, WSC.R_OPTION, W
SC.OPTION_DESC from CDB.WYNNESEC WSC where WSC.WYNNE_ID>=:1 order by WSC.WYNNE_ID, WSC.R_MENU, WSC.R_OPTION

At around the same time, a hot backup of the database is conducted. I am presuming that since hot backup would trigger checkpoints, the dirty blocks are being flushed to disk before they are cleaned by transactions requiring them.

Is my understanding correct ?

If it is the case, then how can we avoid this issue. Resizing segments "might" not solve the problem because the blocks are being written to disk before they are cleaned.
The only solution i feel, is to defer the hot backup.

Any suggestions ?

Rgds
Sanji

ora-01555 error from a table where there is no transaction.

February 12, 2007 - 10:09 pm UTC

Reviewer: sean from NJ, USA

Hi Tom,

We have proceudure like this:

--------------------------------------
procedure p1 is
cursor c_emp as select * from emp;
begin
for r_emp in c_emp loop -- line 4.
-- some codes:
commit;
end loop

end;
-----------------------------------------

The loop run more than 10 hr, our undo_retention time is 10 hr.

We got error like this:

ORA-01555: snapshot too old: rollback segment number 14 with name "_SYSSMU14$"
too small
ORA-06512: at "p1", line 4.

But since there is no update, delete or insert on emp table, the cursor should not get any data of emp from undo tablespace( our undo tablespace is not full). Why do we still get this error? Oracle9207

Thanks so much for your help.

Sean

ora-01555 error from a table where there is no transaction

February 13, 2007 - 12:18 pm UTC

Reviewer: Sean from NJ, USA

Hi Tom,

Thanks for the information. Because of the error we got, we just created the table for this procedure, like ¿create table emp as select * from emp_temp¿. There has never been any update, delete or insert on this table.
Tom Kyte

Followup  

February 13, 2007 - 12:30 pm UTC

did you read my example though, we got an ora-1555 on a read only tablespace. No dml happening during the execution of that query.

but I would hazard a guess that in fact there were DML operations on the table in question - unlikely to see this on a table created via create table as select with no dml.

ora-01555 error from a table where there is no transaction

February 13, 2007 - 2:24 pm UTC

Reviewer: Sean from NJ, USA

Hi Tom,

I need to clarify one thing, there has never been DML on this table, why do we still get snap shot too old error in the cursor, as I mentioned above? Thanks so much for your help. -- Sean
Tom Kyte

Followup  

February 13, 2007 - 4:18 pm UTC

I would guess there is in fact DML you are unaware of - not see or having an entirely reproducible process from end to end - I cannot really speculate more than that.

get 1555 error with cursor table without any dml

February 14, 2007 - 6:12 pm UTC

Reviewer: Sean from NJ, USA

The case you mentioned is that there is dml on that table before the tablespace becomes read only. The developer said that they just created this table for this procedure to avoid 1555 error. Do you think this code will generate 1555 error, assuming undo tablespace is not full? Thanks so much. -- Sean

SQL> create table emp as select * from emp_temp;
SQL>declare
begin
for r_emp in (select * from emp)
loop
-- some codes, the whole cursor will takes longer to finish than undo_retention time
commit;
end loop
end;

Tom Kyte

Followup  

February 15, 2007 - 9:54 am UTC

They can say over and over "no dml"

and I can say over and over "maybe there wasn't, maybe there was". For you see - I cannot count how many times in my life I've heard "nothing changed..." only to discover after digging through that something was changed. Or "we didn't do that" only to find - they did. It happens all of the time, all of the time, constantly.


So, sometimes I just say "maybe you did, maybe you didn't"

it is unlikely that code above would 1555.

Ora-01555

February 15, 2007 - 10:43 am UTC

Reviewer: Mohamed from France

Yes, this code

SQL> create table emp as select * from emp_temp;
SQL>declare
begin
for r_emp in (select * from emp)
loop
-- some codes, the whole cursor will takes longer to finish than undo_retention time
commit;
end loop
end;

is candidate to the ora-01555 error. One raison for that is the "commit across fetch" or a commit inside the loop.

When you commit inside the loop you are telling oracle that he can overide the rollback segment he has created for your initial query. If during the execution of your loop, oracle will come to visit the data you have committed he will see that rollback data has gone and he is not able to give back a read consistency ====> ora-01555.

You don't need to have another session updating the data you are selecting. You are updating them via your commit.



set transaction use rollback segment XXL_RB

February 23, 2007 - 4:46 am UTC

Reviewer: Mohamed from France

One of my colleagues comes and told me that despite he has explicitly done this
set transaction use rollback segment xxl_rb;

He got ora-01555 snapshot too old: rollback segment number 7 with name "RB_SEG6" too small
I then redirect him to the Jonathan Lewis article about this
http://www.jlcomp.demon.co.uk/faq/settrans.html
Later he comes back and told me that he read carefully this article and he assure me that he run his job on a test environment during the night where nobody was doing updates on his tables. He was the only one doing DML on his tables.
Then, when I looked carefully to his plsql code I found that at a point of his code he did this:
Execute Immediate 'Truncate table t';

I think that following this hiding commit his transaction has finished and his DML will no longer be guaranteed to use the XXL_RB and this might explain the ora-01555 he encountered in RB_SEG6 instead of XXL_RB.
Do you agree with me?

Tom Kyte

Followup  

February 26, 2007 - 10:12 am UTC



1555 is raised when you process a "read" of data.

set tranaction use rollback segment specifies where your transaction WRITES data.


1555 - result of a READ.

use rollback segment - controls where you WRITE.


the colleague is running a query. This query reads data. This query needed some undo information from rb_seg6. Ignore for a moment the fact they are writing data, the 1555 is ONLY concerned with the read. The query might need undo from EVERY rollback segment.

the chances of a 1555 are proportional to the size of the small rollback segment.


And - you are never ever the "only one in the database", the database is always doing transactions as well - it uses the same undo and redo.


But yes, the truncate is a commit as well and would have terminated the "use rollback segment" - but it is not like the cause of the overall problem here - it could be - if he was the one that overwrote the undo information - but it is not definite that he was.

All Quiet on the Rollback Front..?

April 04, 2007 - 10:01 am UTC

Reviewer: Dave Martin from The Hague, Netherlands..

Well Tom, its all been quiet here on the rollback front for months, since we wnt to 9i and got into (aum)

Until today..

We picked up an ORA-1555 on a query with 'Query Duration=0 sec'. Any ideas what could be causing this? Ive read all the issues with delayed block cleanout and fetching across commits but Im no wiser. Normally I would expect query duration to come out around the same value as undo_retention, in which case the cause is fairly clear, but 0 sec? (undo_retention is 900 secs)
Tom Kyte

Followup  

April 04, 2007 - 11:02 am UTC

flashback query.


the query never got started. it failed really fast :)

So, was someone doing a flashback query that triggered this 1555?

All quiet on the rollback front..

April 05, 2007 - 4:13 am UTC

Reviewer: Dave Martin

No, at least not as far as I can tell. The FLASHBACK ANY TABLE isn't granted to any user other than the DBA. Nor is execute privilege on DBMS_FLASHBACK granted to anyone other than the DBA.
The query in question isn't using 'AS OF' either.

Weird!

All quiet on the rollback front

April 06, 2007 - 3:13 am UTC

Reviewer: Dave Martin from The Hague, Netherlands

A simple increase in undo_retention solved the problem.

The query duration of zero threw me, but is this any kind of special case?

The transaction count in dba_undo_extents was 10,000 over ten minutes, or 16/second. The normal time for the query was 32 msecs.


Tom Kyte

Followup  

April 06, 2007 - 11:41 am UTC

i believe you were using flashback query.

snapshot too olod on export

July 12, 2007 - 1:20 pm UTC

Reviewer: Jeff from Milwaukee WI USA

Tom,

Database is 9.2.0.8. Undo_management is auto, undo retention is set to 86400.

Performing nightly full export on the database (yes, I know, there's no point, I'm trying to convince them that this is a waste of resources, but right now I'm just trying to figure out why we're getting this snapshot too old error).

EVERY Night the full export reports a snapshot too old on the same table. It's a very small table with only 3200 rows in it. The table does have an NCLOB column. I've done tablescans, validated the indexes, exported just the table by itself (successfuly), etc, etc, etc, basically everything I can think of to eliminate delayed block cleanout or anything else that would cause the snapshot.

But it still happens. Every time.

Any ideas of what to look at? The export takes less than the 24 hours that we've got the undo retention set to.

Tom Kyte

Followup  

July 12, 2007 - 2:19 pm UTC

is this table heavily modified?

SNAPSHOT TOO OLD

July 12, 2007 - 2:45 pm UTC

Reviewer: Jeff

<is the table heavily modified?

Not that I can tell. There's no datestamp info on the table to record changes. I'm not at all familiar with the system. I'm a DBA consultant brought onsite a few days ago to look at some nagging issues, of which this is one. They were getting snapshots all over the export, upping the undo_retention got rid of all of them except this one. I did look at V$TRANSACTIONS to make sure there were no hanging processes out there.

I found a note on Metalink that implied there might be a corruption in the lob, but I ran the diagnostic script and it didn't return anything. (Metalink docid 423113.1)

the describe of the table is:

 desc portal_extra_info
 Name                 Null?    Type
 ------------------ -------- ----------------------------
 PK1                NOT NULL NUMBER(38)
 MODULE_PK1                  NUMBER(38)
 FOLDER_NAME                 NVARCHAR2(64)
 PORTAL_VIEWER_PK1           NUMBER(38)
 EXTRA_INFO                  NCLOB



Thanks for the quick response!
Tom Kyte

Followup  

July 12, 2007 - 5:03 pm UTC

try to find out if the table is heavily modified (query v$sql looking for sql that references this and look at the execute count before/after the export for example)

what is the precise error - could it be the lob segment (which doesn't use undo, but pctversion?)

Question on RBS error explanation

July 25, 2007 - 2:19 pm UTC

Reviewer: Srikanth S from Dairyland

Tom
In the figure below (Taken from the note at the beginning of this article), how does the datablock header maintain the uncommited transaction information, when more than one change occurs. In the example shown below, one record is updated and hence the header contains the RBS address as 5.3uncommited. How will multiple changes be represented?



Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx |5.3uncommitted|-+ | transaction entry 01 |ACTIVE |
+----+--------------+ | | transaction entry 02 |ACTIVE |
| row 1 | +-->| transaction entry 03 |ACTIVE |
| row 2 *changed* | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+------------------+ +--------------------------------+


Thanks and appreciate your time.
- Sri
Tom Kyte

Followup  

July 26, 2007 - 5:41 pm UTC

as multiple entries in the block header, there can be more than one entry there.

snapshot too old and lobs

August 01, 2007 - 12:40 pm UTC

Reviewer: Darcy from VA USA

We had been encountering ORA-01555 & 22924 errors on queries against our lobs. Increasing UNDO_RETENTION and modifying PCTVERSION hadn't helped. The lob corruption script in Metalink Docid 423113.1 ran successfully. We did discover file corruption using the dbverify tool against the datafiles for our lob tablespaces though (our lobs are stored out of line).
Tom Kyte

Followup  

August 05, 2007 - 10:07 am UTC

define what you mean by "we did discover file corruption" - if you use dbverify against live files, there will be many in flux blocks.


you say pctversion hasn't helped, but you don't say at all

a) what it was
b) what it is
c) how long your queries run for
d) how long you actually have stuff saved for

eg: you give nothing anyone would be able to comment on really....


While Exporting ...

August 22, 2007 - 12:28 pm UTC

Reviewer: Chaman

Tom,
While exporting, Im getting the following error :

EXP-00056: ORACLE error 1555 encountered
ORA-01555: snapshot too old: rollback segment number 13 with name "_SYSSMU13$" too small

Why should I be getting an error while exporting. First I even ran with CONSISTENT = Y. 

exp userid=userid/password@oradb1 file=c:\tab_ord buffer=500000000 compress=Y direct=Y 
consistent = Y feedback=20000 tables=tab_ord log=c:\tab_ord.log




Tom Kyte

Followup  

August 22, 2007 - 2:27 pm UTC

because you only get 1555's when you read data, that is what gets them.

export reads data.

In order for your export to succeed, you must ensure that undo_retention is set higher than your time to export (eg: if the export takes 2 hours, you need undo_retention to be set to AT LEAST two hours) and you need to ensure you have sufficient undo tablespace allocated (or the undo tablespace is allowed to grow dynamically)

It worked ...

August 23, 2007 - 5:25 am UTC

Reviewer: Chaman

Tom,
Thanks for the reply. Yes, the export of that table was taking 6 hours. I was able to do it successfully after office hours when nobody was using the DB and without changing any parameter.

Thanks

session in a loop with no open cursor - fetch across commit ??

September 10, 2007 - 3:35 pm UTC

Reviewer: ZT from WI USA

I am new to this issue and recently I kept getting the ORA-01555 error from the following query. A guy told me I had the fetch across commit problem but I doult because I do not have any open cursor. If it is not fetch across commit problem , what would be the likely cause of the ORA-01555? I kept hitting ORA-01555 when the loop is half way done. The LOOP is used to control the amount of data for each query/insert, the thought is to make each block of the query/insert short. I already significantly reduced the amount of the data in each query by significantly reduced the number of PATIENT_LINK_KEY associated with each blk_num in the loop control, but still keep getting the same error. It seems that reducing the size of the block does not help. Each loop takes about 10 min to half an hour, I have about 30 blocks to go. I am just wondering in such a PL/SQL loop, if all the query/insert treated as ONE session or each loop will have its own session. Many thanks in advance.

----- The PL/SQL code -------
create table lab_dump as
select PATIENT_LINK_KEY, TEST_CODE, TEST_RESULT from lab_results_vw where 1=2;

declare
min_blk_num NUMBER :=0;
max_blk_num NUMBER :=0;

begin

select min(BLOCK_NUM) into min_blk_num from P1_PROD_PATIENT;
select max(BLOCK_NUM) into max_blk_num from P1_PROD_PATIENT;


for blk_num in min_blk_num .. max_blk_num loop
insert into lab_dump
select distinct PATIENT_LINK_KEY,TEST_DATE,TEST_TIME,TEST_CODE,TEST_RESULT_NUMERIC TEST_RESULT_C
from lab_results_vw where TEST_RESULT_NUMERIC is not null
and TEST_DATE >= (sysdate-60)
and PATIENT_LINK_KEY in (SELECT distinct PATIENT_LINK_KEY FROM P1_PROD_patient where BLOCK_NUM=blk_num);
commit;


insert into lab_dump
select distinct PATIENT_LINK_KEY, TEST_DATE TEST_DATE, TEST_TIME,TEST_CODE,TEST_RESULT TEST_RESULT_C
from lab_results_vw where TEST_RESULT_NUMERIC is null
and (TEST_RESULT like '%>%' or TEST_RESULT like '%<%' or upper(TEST_RESULT) like '%DETECTED%')
and (TEST_LINE is null or TEST_LINE=1)
and TEST_DATE >= (sysdate-60)
and PATIENT_LINK_KEY in (SELECT distinct PATIENT_LINK_KEY FROM P1_PROD_patient where BLOCK_NUM=blk_num);
commit;

end loop;
end;
/

Tom Kyte

Followup  

September 15, 2007 - 3:10 pm UTC

.. the thought is to make each block of the
query/insert short.
...

that is a bad thought. Look at your code, when it fails - HOW DO YOU RESTART IT.


this should be a single insert into, no if and or buts about it - a SINGLE insert into as select.

period.


You have a query that takes a long time to run in there apparently (you don't tell us much here - other than "something is getting a 1555 sometime).


get rid of all procedural code, do it as a single insert as select.


Oracle DBA

September 27, 2007 - 6:12 am UTC

Reviewer: Hemchander from India

The article was very useful

1555 inspite of high undo retention

December 06, 2007 - 4:05 pm UTC

Reviewer: Sanji from Shelton, CT

Tom,
I'm encountering 01555 inspite of an undo_retention setting more than the one mentioned in the alert log detail.

Thu Dec 6 07:35:23 2007
ORA-01555 caused by SQL statement below (Query Duration=21749 sec, SCN: 0x0003.16258393):
Thu Dec 6 07:35:23 2007
select /*+ INDEX_ASC(AST ASTSET1)*/ TO_CHAR(AST.ASSET,'FM0999999999'), TO_CHAR(AST.DSP_ASSET,'FM0999999999'), AST.DESCRIPTION, AST

OPEN:SANJI:PFIN@LAWSON1>show parameter undo

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 46800
undo_suppress_errors boolean FALSE
undo_tablespace string UNDO

The query failed because the consistent image of the block it was looking for, wasn't retained for 21749 sec (as per alert log). The retention though is set at 46800 seconds.
The undo tablespace is 16G.
Env. is 9i, HP-UX.

Any reason why this (1555) happened in this case ?

Thanks
Sanji
Tom Kyte

Followup  

December 10, 2007 - 10:29 am UTC

query v$undostat - you'll almost certainly find that there were prematurely expired undo extents.


undo_retention is a target, just because you set it to 46,800 doesn't mean it'll save it for that long.

If you did not allocate sufficient undo space and you don't let the undo segments autoextend the underlying datafiles, we won't be able to keep it that long.

The previous review for 1555

December 06, 2007 - 4:28 pm UTC

Reviewer: Sanji from Shelton, CT

Tom,
I feel the undo tablespace reached it's limit, probably that's why the undo would have been reused.
I have a doubt here though.
In such a scenario, where the undo retention is 3 hrs and the undo tablespace is maxing out, which undo segment would be wrapped first ?
The next "inactive" one in the round robin structure, even when the retention for the undo in that segment has still not reached ?

Thanks
Sanji
Tom Kyte

Followup  

December 10, 2007 - 10:30 am UTC

it'll go for the oldest undo possible first.

snapshot too old error

May 11, 2008 - 6:47 am UTC

Reviewer: Anirban from India

Hi Tom,
It's a nice explanation. But one thing I am not clear about, is
1. say session 1, transaction 1 updates a row, at this point the header of the corresponding data block will be marked and entry will be made in rollback segment header and rollback segment.
Now suppose session 1 commits it, at this point the entries in the rollback segment and rollback segment header can be overwitten. But still the data block contains the old version of data.
Now suppose since a very long time this block is not get reffered by any query but rollback segment and rollback segment header is overwritten by some other entries.
Say now some query reffers to that data block(currently it contains old version of data, because since it has been commited no query has reffered this block). Then how the query will get the latest version of data (since rollback segment and rollback segment header both were overwritten)


Tom Kyte

Followup  

May 12, 2008 - 1:28 pm UTC

the data block does not contain the old version of the data.

That is an erroneous assumption - once you did the update, the old version of the data was overwritten on that block. We did a current mode get, modified the bits and bytes - generated redo and undo for that. But the old version is not on that block anymore.

read each and every block,

May 27, 2008 - 8:27 pm UTC

Reviewer: A reader

What is the good way to read and each and every block in the database to make sure there is no block corruption or any other issues?

One of the team recommended to take export.
Any other idea?

Thanks,
Tom Kyte

Followup  

May 28, 2008 - 8:31 am UTC

RMAN....

just backup the database. RMAN does block checking.


export will full scan all tables, export will not touch indexes.

export would likely be "good enough", but RMAN will do it and you'll get something useful out of it.

A backup

June 22, 2008 - 6:44 am UTC

Reviewer: A reader

1)Can a Select (no Commits involved) return a 1555 error, considerng I've got a long running Select from a client, so long a block it would visit in the future had been changed and the rollback segment overwritten.

2) If a block cleanout cannot find a rollback segment, will it perpetually leave the block header's transaction status ("tx") as "uncommitted pointed to the reused rollback segment" instead of changing the status to "None"

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx |5.3uncommitted|--+ | transaction entry 01 |ACTIVE |
+----+--------------+ | | transaction entry 02 |ACTIVE |
| row 1 | +--->| transaction entry 03 |COMMITTED|
| row 2 *changed* | | transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |
| row n | | transaction entry nn |COMMITTED|
+------------------+ +--------------------------------+
Tom Kyte

Followup  

June 22, 2008 - 8:55 am UTC

1) of course, selects are what GET 1555's

the read component of a select, insert, update, delete - they get the 1555

the modification bit - that'll never get a 1555

update t set x = 5 where y > 6;

the "select * from t where y > 6" bit - that can get a 1555 for the update.


Any query is subject to a 1555.


And the commits - they come from other sessions and are always happening ( the database never sleeps, there is ALWAYS activity, even if you think you are the only one logged on, you aren't, pmon, smon, etc - they are all there)

2) no, once we hit that condition, we'll know when we restart that any transaction that predates "M" is really old. and "M" will now cover the transaction that modified that block we hit.

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:811822300346967203

June 23, 2008 - 11:41 pm UTC

Reviewer: Aman.... from India

Hi sir,
Sir does undo segments go in redo logs?Or undo segments get updated so that their change vectors go in the redo stream?
Regards
Aman....
Tom Kyte

Followup  

June 24, 2008 - 4:51 am UTC

undo is logged in the same manner any segment is logged - so just like when you modify a table block, redo is generated - when you modify an undo block, redo is generated.

June 24, 2008 - 6:22 am UTC

Reviewer: Aman.... from India

hi sir,
Thanks alot for the answer.Can you please define that how one can update the undo blocks?I amnot sure that I understood this part that we can update the undo blocks as I understood that undo segments are system objects used and maintained by system itself.The only time they can be modified are when they are marked as candidate for being overwritten after undo retention is over.Than how we can say that they are modifiable and generate redo?I shall appreciate ifyou can show me with a small example or guide me how to set up one to see this?
Best regards
Aman....
Tom Kyte

Followup  

June 24, 2008 - 6:55 am UTC

update t set x = 5;
1 row updated.



that'll generate undo, that'll modify undo blocks.

You do not directly "update" undo yourself, you perform operations that generate undo.

Just like you don't update an index, you update a row in a table and that might update an index as a side effect.

June 24, 2008 - 7:52 am UTC

Reviewer: Aman.... from India

Sir,
Please bear with me.Iam so sorry I am taking too much time of yours.
Update x set a=2 where a=1;
[block=1 TX-Status=Active]
This will generate redo and in the effect of that ,corresponding undo will be there.Is it correct sir?
Now a block of Undo tablespace has value was 1.How we will update it?Yes we wont be doing it by ourselves as you mentioned.So this is what I understood,
[block=1 TX-Status=Active]
We committed,
[block=1 TX-Status=None]
Now some another transaction comes up,it will try to update teh same block with some another value eg 3.
[block=3 TX-Status=Active]
This updation you mentioned will update the undo block is it correct?
What I am not able tounderstand the line "this will generate redo".What will be the redo of this updation when this itself is an outcome of a redo-generating update statement?
I am sorry sir,I guess I guessI came back to same question that how ocme undo is supposed to generate redo?We committed a block,its undo block is free now.When we update it for some another value,this also will log itself in the redologs ?Why so?
Regards
Aman....
Tom Kyte

Followup  

June 24, 2008 - 10:32 am UTC

every update will generate

a) undo (so you can rollback)
b) redo (so we can redo/recover your update if the system fails)


when you issued the update statement, you generated UNDO

That is what "updates undo", you generate undo, we need to put that undo somewhere, so we put it into an undo block - and there you go, you just updated undo - you created undo - you generated undo - you modified an undo block.




No one will ever modify your undo, your undo is your undo. It will get overwritten at some point.


You modify/write/create/generate UNDO with every modification you make. That is all.


And undo, since it is stored in a segment, has redo generated for it.


Here is why.

You issue: update t set x = 5 where x = 6;

you want to update block 5, row 1. You want to set x=5 for that row.

you generate UNDO for that update, the undo says "put x back to 6 on block 5, row 1". This undo is stored in memory, in the buffer cache. We generate redo for that as well, the redo says "put into undo the fact that we need to put x back to 6 on block 5, row 1"

you modify block 5, row 1 and set x = 5, that generates redo.

lgwr decides to flush the redo log buffer, your redo is written out

SCENARIO: the system crashes right now. When we restart, we apply crash recovery. That means we read the online redo logs and apply what they say to apply. We recover the undo block (that says put x back to 6 on block 5, row 1). We recover block 6, row 1 and make x = 5 (we replay your change). We are done with the roll forward stage. Now we need to rollback any uncommitted transactions. Well - good thing we generated redo for undo, we have the undo we need now to roll back your change. If we didn't have the undo protected by redo - we would only be able to roll forward your changes - but not roll them back!

June 24, 2008 - 11:17 pm UTC

Reviewer: Aman.... from India

Excellent sir.
Just one last doubt.Its on hypothesis only but just came to mind so askig.
The redo is containing now x=5 where earlier it was x=6.Now in the time of crash,assumingly that Undo got overwritten and we don't have undo record x=6 stored in the Undo anymore,instance recovery is required.Now in the rollforward,Oracle willapply x=5(change replay).But from where the in the open stage for roll backward,Undo will come?What I always understand is that as redo contains both the old and new changes,itwill regenerate the undo for the redo.But I was told that there is only Undo statement stored in the redo vectors not the actual data.So how the old data ie x=6 will be generated and applied?
Thanks a ton forthe last reply!
Best regards
Tom Kyte

Followup  

June 25, 2008 - 8:29 am UTC

the undo could not be overwritten, we had not yet committed.


June 25, 2008 - 10:09 am UTC

Reviewer: Aman....

Hi sir,
So it means that Undo segments are kept intact till the transaction is not committed.So is the statement that redo contains the undo data is half correct or is completely correct?The reason for asking this is that if the Undo is kept all teh time till the transaction is not over than oracle can just keep the statement in teh redo vectors to undo the data but doesn't need to keep the data itself.Is it correct?
And one more thing that if the data is not being overwritten if the transaction is not committed,what happens if the tablespace is exhausted?I mean we filled the file completely ,none of the transaction is committed and there is more need to keep the undo data for incoming transactions?
Thanks a ton sir.
Aman....
Tom Kyte

Followup  

June 25, 2008 - 11:37 am UTC

it mean the undo generated by a transaction remains intact UNTIL the transaction commits (you said it backwards)


And all of the undo you generate generated redo. The online redo logs may or may not have the redo for all of the undo (and in fact may or may not have all of the redo for the database blocks modified). If we checkpoint a block (any block, undo, data, whatever) - we can overwrite that redo. But we cannot overwrite undo until the transaction commits.


Say you are going to update 1,000,000 rows.

This will generate 10mb of undo
This will generate 10mb of redo for the undo
This will generate 10mb of changed blocks
This will generate 10mb of redo for the changed blocks

Say you have two 1mb online redo logs.


Ok, we start the update.

The update generates undo and modifies blocks - it generates redo for both. The first redo log fills up. We advance from log 1 into log 2 - at the same time we initiate a checkpoint of all of the blocks currently protected by redo log 1 - any block that has redo in log 1 will be written to the datafiles. When we fill up redo log 2 - BEFORE we advance back into redo log 1, we make sure that checkpoint is complete (if not, we issue 'checkpoint not complete cannot allocate new log' and wait). When that checkpoint is complete - we know that every block protected by the redo in redo log 1 is safely on disk - we don't need the redo for crash recovery anymore - we advance into redo log 1 and overwrite that stuff.

The undo is still there
The modified blocks are still there
but some of the redo we generated is gone, overwritten (might be in archive logs, but that is not relevant we do not use archives for crash recovery - only media recovery)


If the tablespace fills up and we cannot allocate anymore undo, we fail. You get error messages back in the application and the application decides what it wants to do about that.


June 26, 2008 - 11:24 pm UTC

Reviewer: Aman,,,, from India

Hi Sir,
Thanks alot for the reply.Tons of things are getting cleared but its arising more doubts too.Please bear with me for this.
Q1)You said
"The online redo logs may or may not have the redo for all of the undo (and in fact may or may not have all of the redo for the database blocks modified)."
How it is possible that the online redo logs may not have the redo for all the undo?Before Oracle wont be ensuring that whatever block is changed is already not checkpointed,it wont let the online redo go away.Does your statement refers to the redo records which are checkpointed and are thus erased or some thing else?My doubt is that how it is possible that redo records will have gaps for the undo data?
Q2)Sir,yesterday while discussing this same topic and the same points,me and my friend got struck at one point.I request you to give insight for it.This is the background,
We have a block in the buffer cache containing 1.We fired update and changed it to 2.Now in the cache, there are two blocks which are there,one Data block containing 2 and one Undo block containing 1.The transaction information is recorded in the V$transaction as well which knows that to which Undo Segment the Undo block belongs.
Now the doubts,
Doubt1)Oracle had captured a block in the cache and has (assumingly) mapped to the Undo segment.IMO it wont happen that it will go to the physical datafile of the undo tablespace and will read teh block.It will be too much of work to do.So a block in the cache is mapped to Undo Segment.How this entry is told to the physical Undo Segment sir?When we do a transaction,oracle immediately enters the values inthe V$tranactionfor the Undo segment number and so on which means that they have immediately grabbed the Undo Segment.But how come it can happen when the block is in the cahce only still(if it is) and not pulled from the hard disk.If it is pulled from the hard disk,things fall at place but otherwise,we are big time confused.
Doubt2)This is related to first only.The Undo block is in the cache first of all.This block will be checkpointed just like any other data block by DBWR when the time will come.Assumingly that checkpoint will happen in 10 seconds and before it gets complete and Oracle is able to write that undo entry in the Undo block physically,we lost the instance.Now in this case,how will Transaction Recovery follow in the next startup?My question is that as we have lost undo data,how it will be generated?If it was already there on the disk that means oracle did a PIO which doesnt sound correct.
I am so sorry sir that I am taking too much time of yours for this probably a basic topic but pieces are not getting linked hence I am troubling you.
Thanks and regards
Amna....
Tom Kyte

Followup  

June 27, 2008 - 8:50 am UTC

q1) ... Does your statement refers to the redo records which are checkpointed ...

yes, that is precisely what it refers to. That is what this says:

... The update generates undo and modifies blocks - it generates redo for both. The first redo log fills up. We advance from log 1 into log 2 - at the same time we initiate a checkpoint of all of the blocks currently protected by redo log 1 - any block that has redo in log 1 will be written to the datafiles. When we fill up redo log 2 - BEFORE we advance back into redo log 1, we make sure that checkpoint is complete (if not, we issue 'checkpoint not complete cannot allocate new log' and wait). When that checkpoint is complete - we know that every block protected by the redo in redo log 1 is safely on disk - we don't need the redo for crash recovery anymore - we advance into redo log 1 and overwrite that stuff. ...


q2)


We have a block in the buffer cache containing 1.We fired update and changed it
to 2.Now in the cache, there are two blocks which are there,one Data block
containing 2 and one Undo block containing 1.



Not necessarily, there COULD be, there does not HAVE to be. Multi-versioning would permit there to be two images of the block as of different points in time, but an update will not ASSURE us this is true.

when we updated the block, we updated the block header, that tells us what undo would be needed to undo the changes. This undo MAY or MAY NOT be in the cache. For example, what if you

a) update block
b) alter system flush buffer_cache
c) another session queries that block - we need the undo, but the undo is on disk


undo is just data, data on a block. blocks are either found in the cache or not, if not in the cache, they are loaded from disk into the cache and then used. If you think of undo as "just data", you'll have an easier time of this. Pretend the transaction information is like an index, the index points to undo, the undo may or may not be in the cache (hopefully it is, but it is not necessary to be so)


as for the second part of this second question - please just re-read the answer to the first part.

June 28, 2008 - 7:16 am UTC

Reviewer: Aman Sharma

Hi sir,

q1)Thanks alot.Feels so good to understand something :-).
q2)"Not necessarily, there COULD be, there does not HAVE to be. Multi-versioning would permit there to be two images of the block as of different points in time, but an update will not ASSURE us this is true. "
Ok I agree sir.Actually the idea to put to blocks in the originalal question was to just make things a simple and easy to refer for me.Surely with the update only,the CR block wont be created.
Takng this point,things became more confusing for me sir.
This is what you mentioned,
"when we updated the block, we updated the block header, that tells us what undo would be needed to undo the changes. This undo MAY or MAY NOT be in the cache. For example, what if you

a) update block
b) alter system flush buffer_cache
c) another session queries that block - we need the undo, but the undo is on disk


undo is just data, data on a block. blocks are either found in the cache or not, if not in the cache, they are loaded from disk into the cache and then used. If you think of undo as "just data", you'll have an easier time of this. Pretend the transaction information is like an index, the index points to undo, the undo may or may not be in the cache (hopefully it is, but it is not necessary to be so) "
I agree that Undo block is just another block like the Data block that is in the datafile ike other datafiles.Now my confusion is that it would be a tedious thing for oracle to hit Undo datafile all the time.It just doesn't sound correct.I did an update and now I hit the HDD to store its undo.It sounds like tons of work.If oracle would Oracle acquire a block which is sort of "linked" to undo datafile where the information is maintained ,it looks like an more fast process.I agree that with the Update,the data block will be updated to mention the address of the Undo block which is storing the undo.This point only is my point of confusion.Oracle did store an entry in the block header that block no X in the undo segment Y is storing the Undo for this change.But when did Oracle update the actual block in the undo datafile(if it did)?At the time of the update only or later ?This is where the above PIO point strikes me that it shouldn't be a physical block access immediately.But Oracle immediately maintains an entry in the V$transaction that it has acquired so and so Undo segment to store the undo it means ,oracle knows that where it is stored (or will be stored) in the HDD and to get it back in the instance recovery.This is my doubt sir.
Thanks and regards
Aman....

Tom Kyte

Followup  

June 28, 2008 - 2:04 pm UTC

how do you know you hit the hard disk to store the undo? It was more likely the redo that made your disk blink, or the update the control files.

but we cache undo just like we cache your tables, your indexes, any block. undo blocks are just blocks, we cache them.

June 28, 2008 - 3:04 pm UTC

Reviewer: Aman Sharma

Hi sir,
how do you know you hit the hard disk to store the undo?
Absolutely no idea I have sir about this.Its just an hypothesis that I put in front of you.My best guess would be that its not a PIO(its shouldn't be) but than the next part contradicts it.
It was more likely the redo that made your disk blink, or the update the control files.
So you mean to say that just like the incremental checkpoint is written to control file header or redo logs,the same kind of thing is there for undo too?But this means constant write so as the suggestion is given that redo logs should be on fast devices,Undo tablespace should also be placed on such a device.But that still isa constant IO even if it would be on a fast device and DBWR needs to be continually busy to do this.:-S(confused).
but we cache undo just like we cache your tables, your indexes, any block. undo blocks are just blocks, we cache them.
Yes sir,I agree to this point and understand it.But if its jst like any other data,its flushing would be with the rest of the data only at checkpoint.There is this part that confuses me that if its not written in the file and instance crashed in the while(rarest situation may be but still 0.0001% chance that it may happen) than how will Undo be given for rollbacks?That's the original doubt sir.If its like redo where every 3 second writng is done,it makes perfect logic but does DBWR does the same kind of IO too over the datafiles for Undo blocks?
Thanks and regards
Aman....
Tom Kyte

Followup  

June 28, 2008 - 3:20 pm UTC

first bolding - undo is cached, like anything else, if we want to overwrite the redo that protects something in the cache, we have to flush it from cache - but not before then.


2nd bolding - control files are written to continously, constantly, as are redo logs - all of the time. Increment, full, whatever - doesn't matter, these files are constantly being written to.

But undo, undo is just data, like a table, like anything else. Pretend that undo is just an index on your table - that might help. An index that is maintained transparently as modifications are made - undo is just like that.


3rd bolding - before we overwrite ANYTHING in the online redo logs that needs protecting (because it is in cache, not one disk) we make sure that is one disk - data, index, foo, bar, undo, whatever - before we overwrite redo for stuff in cache, we make sure stuff on cache is on disk - that is all

June 29, 2008 - 1:18 am UTC

Reviewer: Aman Sharma

Hi sir,
Please correct me,redo contains all the changes even the changes happening to the Undo segment as its changes are also generating redo ie change vectors.Entire change is maitained in the redo log buffer.Once flushed to log file,the contents of the Undo can be overwritten knowing that redo is already protected and if needed,i can be used to roll forward the changes meaning the changes done to Undo also.So Oracle overwrites teh contents of the Undo once redo is protected.In case of crahs,using redo,everything is retained.
Is it correct sir?
Regards
Aman....
Tom Kyte

Followup  

July 01, 2008 - 6:53 am UTC

... Once flushed to log
file,the contents of the Undo can be overwritten knowing that redo is already
protected and if needed,i can be used to roll forward the changes meaning the
changes done to Undo also. ...

no, that is not correct.


redo contains all changes made to blocks in the database (including undo)

redo is buffered in redo log buffer and constantly flushed to disk (even before you commit)

data is buffered in the cache, but might be flushed to disk (checkpointed)
once you commit a transaction - the undo generated by that transaction can be overwritten (nothing to do with redo, it is the commit that frees undo)

redo can be overwritten once the blocks it protects are checkpointed to disk.


If Oracle crashed, all of the needed redo to reconstruct any undo we need to rollback would be present in the online redo logs - we might get previously checkpointed blocks plus undo blocks recovered from online redo logs to rollback, but every thing we need is in place.

June 30, 2008 - 6:47 am UTC

Reviewer: A reader

I'd suppose Oracle only 're-uses' Undo if the Undo is marked as Committed!
Tom Kyte

Followup  

July 06, 2008 - 6:40 pm UTC

yes

June 30, 2008 - 6:49 am UTC

Reviewer: A reader

And if asked why not just rely on redo log file flushed from redo log buffer, the main reason could be 'Why do Disk Read' when all the buffers are in fast cached memory?
Tom Kyte

Followup  

July 06, 2008 - 6:42 pm UTC

huh?

we do rely on it - but we use undo for rollback and redo for recovery - two structures, two different purposes.


and cached memory is not 'fast', don't think that way, just think "PROBABLY, but not necessarily, faster than a disk read but really expensive and to be avoided since it is after all a shared data structure that must be protected with lots of latches, locks and so on"

avoid full table scan during index creation on NULL value

July 01, 2008 - 8:54 pm UTC

Reviewer: Jianhui from PA

Tom,

We have an existing multi TB legacy table with a newly added column, currently this new column has no value(all NULL) before the project goes live, after project goes live, this column will have values.

We have requirement to create an index on this column, we tried full table scan 4 parallel threads, it hits 1555 error after 17 hours. Currently we are trying to use online index creation, hopefully that can avoid 1555 error, however it's going to take much longer.

Is there a way to let Oracle know there is no need to scan the whole table and populate the index structure directly(because of all NULL value) so that we can avoid 1555 error and make index creation much faster(basically nothing to do,just some metadata change because of all NULL value)?

Best Regards,

Tom Kyte

Followup  

July 06, 2008 - 7:24 pm UTC

there is not, we need to convince ourselves there is nothing to index

is there some partitioning going on here - or do you really have a multi-terabyte single segment structure?

thanks

July 08, 2008 - 4:33 pm UTC

Reviewer: Jianhui from PA

Thanks for the feedback. It has been there for years until it grows so big and catches attention. Sure it's suboptimal design, but we have to deal with the reality.
Tom Kyte

Followup  

July 08, 2008 - 5:34 pm UTC

can you increase the degree of parallelism - do you have the resources to do that (IO bandwidth, cpu, memory)

But what about when undo_management=auto?

July 28, 2008 - 2:48 pm UTC

Reviewer: Fred from New York, NY

I was recently the victim of an ORA-01555, and after studying this thread I am convinced that it was caused by a long update that committed too frequently. When a long-running mview refresh (whose base table is the same as one involved in the long update) tried to construct a consistent view of its data, it must have encountered a changed block that had already been committed and overwritten (or perhaps its transaction slot had been overwritten) and this raised the error, even though neither the update nor the mview refresh even came close to the value of UNDO_RETENTION and there was plenty of room in the UNDO tablespace.

So apart from coming down on the commit-happy developer who made the original update, what is to be done about this? Obviously increasing UNDO_RETENTION does not help in this situation, since Oracle is doing what it is supposed to do here: it is re-using undo segments that are marked as committed. It has no way of knowing that another query will need those segments in a few minutes.

The solution above (from 5-Jun-2000) recommends adding rollback segments, thus encouraging Oracle to distribute the undo amongst many segments and minimize the chances of overwriting the segments that might be needed later . But in 10g with UNDO_MANAGEMENT=AUTO, we have no control over how many rollback segments there are.

Well then, would I be right to suggest that setting UNDO GUARANTEE on the undo tablespace would be a way out of this problem? I understand the caveat about possibly harming other transactions if undo becomes unavailable-- fortunately, the size of my UNDO tablespace is not a cause for concern.

(Sorry if this is long-winded or seems like a no-brainer-- I'm just trying to work out the implications of what was presented above.)

Thanks,

Tom Kyte

Followup  

July 29, 2008 - 3:31 pm UTC

... Obviously increasing UNDO_RETENTION does not help in this situation, s ....

no, that is wrong. increasing undo_retention would absolutely help in this situation.

undo_retention tells Oracle "hey, even though you COULD reuse this undo, do not - it is still too fresh, we are asking you to keep all undo until it is at least N seconds old - this is not N seconds old, so do not overwrite it as long as you have space ANYWHERE else - extend the undo segment, steal an expired extent from another segment, grow the datafile to extend the undo tablespace if you can - but do not overwrite this until you have no other choice"


Thanks sir....

July 29, 2008 - 11:23 pm UTC

Reviewer: Aman....

Sorry for the late reply sir.I am travelling and not feeling well.So got late in the reply.Thanks a ton for the answer.As for now,most of the things are clear.
Thanks and regards
Aman....
PS: In the downloads (archives) only the actual question of that week is there.I don't think that complete thread is available.Did I miss it?And is it possible for you to provide download on yearly/monthly basis inlcuding all teh discussions?Please take into consideration of this request.Going weekly is too much and as only the original answer of yours is quoted,rest of the discussion is not coming up which is also very important.

RE: But what about when undo_management=auto?

July 31, 2008 - 9:39 am UTC

Reviewer: Fred from New York, NY

Tom,

Thanks for your ^^^ Followup on July 29, 2008 - 3pm US/Eastern:

>>[Fred:]... Obviously increasing UNDO_RETENTION does not help in
>>this situation ....

>[Tom:] no, that is wrong. increasing undo_retention would
>absolutely help in this situation.

Your explanation is the way I thought it should work. But my reality does not coincide. Please consider:

Someone in the firm launched a big update of user account settings on the OLTP before leaving work on Friday evening (sql id c7w8j21hmpkcp, see Exhibit A below). At this point, UNDO_RETENTION was set to 18,000. According to DBA_HIST_UNDOSTAT, the update ran for 15,361 seconds and finished around 23:00. This was the longest-running SQL during this time. There were no errors.

Around 23:10 the same update started again (this is not a regularly scheduled process, so someone must have launched it manually again). At 23:38 an ORA-01555 was spawned. The SQL that errored was a SELECT that is part of an mview refresh that launches from the data warehouse (that's a separate DB on another server). The mview's base table resides on the OLTP. The mview's base table is also one of the tables involved in the long-running update, the aforementioned c7w8j21hmpkcp.

At 2am my pager finally nagged me into responding to the ORA-01555; I increased UNDO_RETENTION from 18000 to 21600. 6 hours of undo ought to shut the blasted thing up, right? I went back to bed.

Well, around 10:00 Saturday morning we received a second ORA-01555: the same mview refresh had failed. The result was a whole morning's worth of failed data warehouse jobs that led to a "lost weekend" for the DBA team... of which sorry tale I shall speak no further.

Given your explanation of Oracle's treatment of undo segments -- extend the undo segment, steal an expired extent from another segment, grow the datafile to extend the undo tablespace if you can - but do not overwrite this until you have no other choice -- I do not understand the evidence presented by Exhibit A. At no point does any query even come close to UNDO_RETENTION. And even if it did, the UNDO tablespace has plenty of room to extend into (12 GB allocated, AUTOEXTEND is on, and max datafile size is 32 GB).

I've gone over the theory and practice of ORA-01555 on this site, in your book, on Metalink, and elsewhere over the past week and a half, and I just don't understand this phenomenon. Could it be delayed block cleanout? Fetching across commits? Could be, given the facts of the case, but shouldn't my large UNDO_RETENTION be taking care of that?

It seems like I could set UNDO_RETENTION to a kajillion and I'd still get the 1555. What am I missing?

Thanks,

Fred


Exhibit A:

select to_char(begin_time,'yyyy-mm-dd hh24:mi:ss') begin_time, 
 maxquerylen, tuned_undoretention, maxquerysqlid,
 ssolderrcnt
from dba_hist_undostat 
where begin_time between to_date('2008-07-18 18:00','yyyy-mm-dd hh24:mi')
  and to_date('2008-07-19 10:00','yyyy-mm-dd hh24:mi')
order by begin_time
/

BEGIN_TIME           MAXQUERYLEN TUNED_UNDORETENTION MAXQUERYSQLID SSOLDERRCNT
-------------------- ----------- ------------------- ------------- -----------
2008-07-18 18:07:05          227               18000 b7jawfcy1ubss           0
2008-07-18 18:17:05          417               18000 f78zzrwd92pt3           0
2008-07-18 18:27:05         1018               18000 f78zzrwd92pt3           0
2008-07-18 18:37:05         1619               18000 f78zzrwd92pt3           0
2008-07-18 18:47:05          941               18000 c7w8j21hmpkcp           0
2008-07-18 18:57:05         1542               18000 c7w8j21hmpkcp           0
2008-07-18 19:07:05         2143               18000 c7w8j21hmpkcp           0
<snip: more of the same until...>
2008-07-18 22:37:05        14760               18000 c7w8j21hmpkcp           0
2008-07-18 22:47:05        15361               18000 c7w8j21hmpkcp           0
2008-07-18 22:57:05           65               18000 3273x06wdcmr0           0
2008-07-18 23:07:05          645               18000 c7w8j21hmpkcp           0

2008-07-18 23:17:05         1246               18000 c7w8j21hmpkcp           0
2008-07-18 23:27:05         1847               18000 c7w8j21hmpkcp           0
2008-07-18 23:37:05         2448               18000 c7w8j21hmpkcp           1 <--
2008-07-18 23:47:05         3048               18000 c7w8j21hmpkcp           0
2008-07-18 23:57:05         3649               18000 c7w8j21hmpkcp           0
<snip: more of the same until...>
2008-07-19 01:57:05        10859               21600 c7w8j21hmpkcp           0
2008-07-19 02:07:05        11460               21600 c7w8j21hmpkcp           0
2008-07-19 02:17:05         2927               21600 a7c7s5401u2a4           0
2008-07-19 02:27:05         3528               21600 a7c7s5401u2a4           0
2008-07-19 02:37:05         4129               21600 a7c7s5401u2a4           0
2008-07-19 02:47:05         4730               21600 a7c7s5401u2a4           0
2008-07-19 02:57:05         5331               21600 a7c7s5401u2a4           0
2008-07-19 03:07:05         5932               21600 a7c7s5401u2a4           0
2008-07-19 03:17:05         6532               21600 a7c7s5401u2a4           0
2008-07-19 03:27:05         7134               21600 a7c7s5401u2a4           0
2008-07-19 03:37:05         3225               21600 f78zzrwd92pt3           0
2008-07-19 03:47:05         3825               21600 f78zzrwd92pt3           0
2008-07-19 03:57:05         4426               21600 f78zzrwd92pt3           0
2008-07-19 04:07:05         5027               21600 f78zzrwd92pt3           0
2008-07-19 04:17:05         5628               21600 f78zzrwd92pt3           0
2008-07-19 04:27:05         6229               21600 f78zzrwd92pt3           0
2008-07-19 04:37:05          364               21600 fb52xsmqzmxxm           0
2008-07-19 04:47:05          633               21600 213n0m32rkg8m           0
2008-07-19 04:57:05         1234               21600 213n0m32rkg8m           0
2008-07-19 05:07:05           57               21600 4hfp44bc0a32q           0
2008-07-19 05:17:05          659               21600 4hfp44bc0a32q           0
2008-07-19 05:27:05          115               21600 fb0190xm3vnsk           0
2008-07-19 05:37:05          115               21600 5fkqajjsz47sq           0
2008-07-19 05:47:05          716               21600 5fkqajjsz47sq           0
2008-07-19 05:57:05         1316               21600 5fkqajjsz47sq           0
2008-07-19 06:07:05         1917               21600 5fkqajjsz47sq           0
2008-07-19 06:17:05         2518               21600 5fkqajjsz47sq           0
2008-07-19 06:27:05         3118               21600 5fkqajjsz47sq           0
2008-07-19 06:37:05         3720               21600 5fkqajjsz47sq           0
2008-07-19 06:47:05         4321               21600 5fkqajjsz47sq           0
2008-07-19 06:57:05         4922               21600 5fkqajjsz47sq           0
2008-07-19 07:07:05          404               21600 cbhzrrzraj1tx           0
2008-07-19 07:17:05         1004               21600 cbhzrrzraj1tx           0
2008-07-19 07:27:05         1606               21600 cbhzrrzraj1tx           0
2008-07-19 07:37:05         2207               21600 cbhzrrzraj1tx           0
2008-07-19 07:47:05          278               21600 f8vspn0bcv2v9           0
2008-07-19 07:57:05          247               21600 ch5n1g04p2bqy           0
2008-07-19 08:07:05          848               21600 ch5n1g04p2bqy           0
2008-07-19 08:17:05         1448               21600 ch5n1g04p2bqy           0
2008-07-19 08:27:05         2050               21600 ch5n1g04p2bqy           0
2008-07-19 08:37:05         2649               21600 ch5n1g04p2bqy           0
2008-07-19 08:47:05         3251               21600 ch5n1g04p2bqy           0
2008-07-19 08:57:05         3852               21600 ch5n1g04p2bqy           0
2008-07-19 09:07:05         4453               21600 ch5n1g04p2bqy           0
2008-07-19 09:17:05          110               21600 1xffhfqr9q70t           0
2008-07-19 09:27:05          530               21600 3ghqu9m8u617j           0
2008-07-19 09:37:05         1131               21600 3ghqu9m8u617j           0
2008-07-19 09:47:05         1732               21600 3ghqu9m8u617j           1 <--
2008-07-19 09:57:05         2333               21600 3ghqu9m8u617j           0

Tom Kyte

Followup  

August 02, 2008 - 5:46 pm UTC

do you have a history of prematurely expired extents - 12gb sounds like a lot, but it very likely might not be sufficient to hold all of your undo - especially in light of massive updates and materialized view refreshes.

I suspect prematurely expired undo due to insufficient space.

ORA-01555 - Snapshot too old error.

September 23, 2008 - 3:03 am UTC

Reviewer: Ayyappan Kashivishwanath from India.

Hi Tom,

I went through the "Snapshot too old error" article. It was really good. But i am facing this problem again and again.. I am executing a query which consists of Select statements and Union all. While executing the query stops midway after a few mins and throws the snapshot too old error. I checked with the DBA's in my organization and they told me that the rollback segment sizes are fine.. and he suggested me to tune the query.
However my query contains only multiple select statements with Union all's. These tables are not updated by any other user.
Your suggestions will to avoid this error will be helpful..

Thank you.
Please revert back to me for any further clarifications.
Tom Kyte

Followup  

September 24, 2008 - 3:33 pm UTC

If you are using rollback segments and not an undo tablespace your DBA has done it wrong.

In any case - your rollback segments/undo tablespace is undoubtedly too small - if after a couple of MINUTES it fails with 1555 - they are too small.

Ask your dba to review v$undostat, I have a feeling they'll find they have prematurely expired extents (too small) OR their undo retention is set too small (smaller than a couple of minutes)

and you'll find these tables are at least in tablespaces that have lots of concurrent transactions - if in fact the base tables are not modified themselves (many times people do not realize what is really happening in their database - you might *think* no one touches these tables - but I'm not convinced)

ORA-01555: snapshot too old: rollback segment number string with name "" too small

October 08, 2008 - 6:57 am UTC

Reviewer: YS from San Jose, Califoria

ORA-01555: snapshot too old: rollback segment number string with name "" too small, This error occurs after 4-5 hours of running of my Query.
what could be the probable reasons for the same.



Tom Kyte

Followup  

October 08, 2008 - 10:04 pm UTC

read above.

thanks

October 28, 2008 - 3:42 am UTC

Reviewer: Atef Z. from Qatar

Thank you very much

snapshot too old

November 15, 2008 - 11:36 am UTC

Reviewer: Anupam Pandey from INDIA

Hi Tom,
  I am accessing two tables in remote database from my local DB with the help of db link.Here is the query 

SQL>
SQL>
SQL> create table profile_line_view
  2  as
  3  select /*+ driving_site(l) */l.LINEID, s.key ,s.operator,s.value,decode(l.delivery_flag, '*', 'I', 'Exclusive', 'I', 'Filler', 'III', 'Infinite', 'I', 'PreEmpBrtr', 'II', 'PreEmpPd', 'II', l.delivery_flag) as Routine_TYPE
  4  from line_all@remoteDB l, st_line_profile_v@remoteDB s
  5  where l.lineid = s.lineid
  6  and l.lineid in ( select lineid
  7                    from lines) ;
and l.lineid in ( select lineid
                         *
ERROR at line 6:
ORA-01555: snapshot too old: rollback segment number  with name "" too small
ORA-02063: preceding line from remoteDB


SQL>
SQL>
SQL> spool off;



In this query lines is the table whcih is in my local DB.
This query ran for more than 6 hours and then It gave the error.Lines table contains nearly 30,000 entries otherwise i would have put those in IN list and executed the query .

Plz help me .
Thanks And Regards,
Anupam Pandey 

Tom Kyte

Followup  

November 18, 2008 - 6:43 pm UTC

the answer is above... it is not any different because it is distributed, the rollback segments on the remote site or the ones locally are not large enough to hold six hours of undo and they *need* to be in order to run a six hour query.

need to understand scenario

December 15, 2008 - 9:04 am UTC

Reviewer: Chandan SB from Bangalore, India

I have one below scenario that i would like to understand

1. Suppose there is one query that starts at 10:00 and finishes at 10:30 and this query was suppose to read one block say b100 (say sal=10000) at time 10:15.
2. Now in another session, we run one update statement at 10:05 that will update rows (say sal is increased to 20000) in block b100 and will commit it at the same time and as per my understanding, at this time, commit cleanout will happen and this will remove the transaction details from the block header of b100.
3. Now, data blocks contain 20000 and its undo information i.e. 10000 is still there in undo tablespace.
4. At 10:15 when query from 1st session reaches block b100, for read consistency, query should read data as it exists at 10:00 i.e. 10000.

My understanding was that data blocks also contain SCN and this SCN will be matched with the SCN when the query starts at 10:00 and if this SCN is lower then SCN in data blocks, for read consistency , query will look into undo tablespace for previous image of the block but you said that data blocks does not contain SCN, then how oracle will come to know that transaction has happened as the transaction details has already been cleaned out with commit? Now, the query from 1st session should read 10000 and not 20000 as it was updated after the query starts.

is there anything i am missing out in above steps?
if you can clear this point, it will be really helpful.

thanks a lot

Tom Kyte

Followup  

December 29, 2008 - 10:24 am UTC

2) no, it will not remove the details - it will just clean out the block and let anyone that sees the block know that the transaction committed. That is what the cleanout does. It does NOT remove the transaction history


The blocks conceptually contain the SCN, they contain transaction information that can be used to ascertain this information. Always they contain this.

Updates to the original posts

March 17, 2009 - 10:24 am UTC

Reviewer: Mark Brady from Baltimore, MD

Can you go over what has changed in the latest editions 10gr2 and 11g that are different than the original information? The first post is so detailed, it's a very nice source document.
Tom Kyte

Followup  

March 17, 2009 - 11:02 am UTC

it is fundamentally the same, I would not make any major edits to it.

Other than "use automatic undo management and set your retention time sufficiently large to exceed the duration of time your longest running query takes to run from start to finish"

For a good time: DailyWTF today

April 02, 2009 - 12:45 pm UTC

Reviewer: David Weigel from Wayne, PA USA

Today's DailyWTF ( http://thedailywtf.com/Articles/A-Statistical-Anomaly.aspx ) is related to this thread and it has everything: "snapshot too old" error, a commit in a trigger, pointless procedures, poor commenting and curious advice from a Don Burleson page.

All it needs is a "WHEN OTHERS" and it'd be perfect for asktom.

July 20, 2009 - 6:56 am UTC

Reviewer: Ricardinho

Sir,
I am just reading "block cleanout" chapter in your book.
I have some questions in this issue:

1-)Suppose, we modify blocks which exceeds 10 percent of the buffer cache and we never select
that table or gather statistics.
In this case, space will be wasted in buffer cache and our new data will never be inserted into
datafiles, is that right?


2-)If we modify blocks which exceeds 10 percent of the buffer cache,
and rather than full scan,if we query "select * from t where ..."
Only the selected blocks will become dirty right?


3-)Is the redo generated by the "select" (which cleanouts the block) necessary for instance or
media recovery?

Tom Kyte

Followup  

July 24, 2009 - 11:33 am UTC

1) why would it be wasted? they get flushed out of the cache eventually (or you shutdown and the cache goes away). The data will ultimately get checkpointed.


2) only blocks that are modified (cleaned out) would be 'dirty'

3) yes and yes. both.

July 25, 2009 - 5:43 am UTC

Reviewer: A reader

From your book I understand that if the query start at t1, during the execution of the query, if oracle finds out the
the block is modified after t1, it reads the before image from undo.


Does oracle only check buffer cache to see if the block is modified ?

Whatif someone did the modifications and all dirty blocks are written to disk.
How does oracle understand if the block is modified after T1?

Tom Kyte

Followup  

July 26, 2009 - 7:46 am UTC

not sure what you mean? the block would not be in the cache if the dirty version was flushed to disk - else the block would be in the cache.

August 01, 2009 - 12:22 am UTC

Reviewer: A reader

Respected sir,

Assume my transaction starts at T1,
Someone else modified the blocks after T1.
From your book, I understand that Oracle will read the before image from undo.

How can oracle understands that blocks are modified after T1?
Does it check the buffer cache to understand if blocks are modified or does it also check blocks on disk?

Tom Kyte

Followup  

August 04, 2009 - 12:36 pm UTC

.... From your book, I understand that Oracle will reconstruct the before image of the block from undo....

... How can oracle understands that blocks are modified after T1? ...

the transaction header on the block tells us everything we need to know - or at least points to everything we need to know.

August 07, 2009 - 10:13 pm UTC

Reviewer: A reader

So it doesnt matter wheter block is in disk or in cache,
Oracle always checks the transaction header, is that right sir?
Tom Kyte

Followup  

August 11, 2009 - 12:39 pm UTC

if the block is not in the cache, we read into the cache from disk

and then - the normal processing of the block begins. And the information recorded on the block regarding the transaction history has everything we need to figure out if that block image needs to be rolled back or not.

August 08, 2009 - 8:02 pm UTC

Reviewer: Scofield

Respected Sir,
Assume I performed a transaction less than %10 percent of buffercache.
And dirty blocks being written to datafiles.
After I commit, Does oracle goes to disk to clean the ITL entries of these blocks?
or
Even these blocks are flushed from buffer cache, their itl entries still reside in buffer cache.
and oracle simply clears them in buffer cache?
Tom Kyte

Followup  

August 11, 2009 - 1:27 pm UTC

the commit cleanout will only hit cached blocks. If the block isn't there anymore, we won't go to disk clean it at that time.

August 15, 2009 - 9:01 pm UTC

Reviewer: A reader

Respected Sir;
From your book I understand that if the query start at t1, during the execution of the query, if oracle finds out that the row is modified after t1, it reads the before image from undo.

How does oracle understand if the row is modified after t1?The block may be either in cache or in disk
Tom Kyte

Followup  

August 24, 2009 - 8:04 am UTC

the block header has the recent transaction history, we can tell if a block was modified.

August 16, 2009 - 3:20 pm UTC

Reviewer: A reader

Hi Tom
I got nice informations from this article
I have folowing two confutions

QUESTION 1:
1. session 1 executes a select query (QENV50) that needs a long time to be completed say 2 hours.
2. session 2 DML (QENV51) updates one of the blocks that query QENV50 will travel later.
3. session 2 commits. block header contains the undo block pointer
4. session 3 access the same block and cleans out the block.
5. session 1 now the query QENV50 access the same block.At this moment how the query QENV50 comes to know that the informations in the block is modified and it's not the information when the query QENV50 started executing?

QUESTION 2
1. session 1 updates a block. in the block header ROLLBACK TRANSACTION SLOT is written
2. session 1 commit. now this ROLLBACK TRANSACTION SLOT can be overwritten.
3. since a long time no query access this block and the ROLLBACK TRANSACTION SLOT has got overwritten by some other transaction.
4. session 2 accesses the same block. but it does not find the ROLLBACK TRANSACTION SLOT
now what will happen? how the block will be cleaned out?

Thanks
Ani
Tom Kyte

Followup  

August 24, 2009 - 8:39 am UTC

q1

cleaning out a block does not erase everything, it simply marks the transaction header to show that the session 2 transaction committed.

Otherwise, everyone that visited that block would have to look into the undo segments to see if session 2's transaction was still happening.


The transaction history is still there, we just know now that session 2's transaction was committed - the block cleanout adds information.


q2

Unclear as to when session 2 began their query that accessed this block. And it doesn't matter if the undo was overwritten. If you access a block that has a transaction that was not known to have committed (according to the block header, we don't know if transaction X committed or not) and we travel to the undo to find out (if the undo isn't there anymore, we know it must have finished) and we find out "it is done", we'll clean out that block if possible (block could be in read only tablespace - might not be able to modify it for real)


September 05, 2009 - 2:16 am UTC

Reviewer: scofield

Respected Sir;

I am little bit confused.

In this scenerio:

1-)My transaction starts at T1
2-)it modifed the block B1 at T2
3-)then my session again revists B1 at T3

Does my session has to read the before image of block B1 from undo?



Tom Kyte

Followup  

September 06, 2009 - 9:10 am UTC

hah, the only answer is..........

it depends.


case 1)

for x in ( select rowid rid from t where x = 5 order by x, y )
loop
   update t set y = y+1 where rowid = x.rid;
end loop;


let us assume that there is an index on (x,y). let us assume that this index is used. Let us assume there are three rows

x=5, y=1 on block 1
x=5, y=2 on block 100
x=5, y=3 on block 1

at time t1, you will read block 1. at time t2 you modify block 1. At time t3 you read block 100, at time t4 you modify block 100.

At time t5, you need to re-read block 1. The QUERY will do a consistent read - it will (must, has to) undo your changes - it cannot see your changes (yet). The query reads block 1 as of time t1. The update that takes place at time t6 does a CURRENT read of the block to verify that the row was not modified (still exists with x and y as they were during the consistent read).

So, in this case, you would read the before image of the block at time t5.



case 2)

t1; select rowid, t.* from t where x= 5 and y = 1;
t2; update t set y = y+1 where rowid = that_rowid;
t3; select rowid, t.* from t where x= 5 and y = 1;
t4; select rowid, t.* from t where rowid = that_rowid;

at time t3 - you are permitted to see your uncommitted changes, hence, you will see the modification - and hence at time t3 - no data will be found(y=1 is not satisfied anymore)

at time t4 - you are permitted to see your uncommitted changes, hence, you will see the modification and this time you will see x=5, y=2 for that row by rowid.


So, it depends :)

September 08, 2009 - 12:48 am UTC

Reviewer: A reader

Respected Sir,

I have couple of question about who Oracle generate the past image of a block
(1) Before generating past image of block (say SCN 50) will oracle try to search if there is any block available in buffer cache
as of require SCN(i.e. SCN 50) OR will it always create the past image by copying the current image and applying undo?
(2) Consider that Oracle search for suitable past image for block in question and find a block with SCN (55) ,
will it copy this block and continue to apply the undo chain to get to the targeted block (i.e. SCN 50)

Thanks

Kind Regard's
Tom Kyte

Followup  

September 08, 2009 - 6:01 am UTC

if the image of the block it needs is in the cache, it'll use it from the cache and not reconstruct it, this is why the cache supports multiple versions - so that we can use it over and over if need be.

and yes, if it finds one not quite old enough, it can start with that.

September 08, 2009 - 10:11 am UTC

Reviewer: A reader

Excellent Many Thanks Sir

Very good article, thanks

September 30, 2009 - 3:47 am UTC

Reviewer: Nic from France

Dear Tom,
powerful article but still need to have your commitment on a very simple example.

We do huge insertions/updates in a group of tables T1, T2, T3, then commit the transaction.
There is no other DML activity on T1, T2, T3 for a long time, so only oracle system activity or only DML on another table T4.
We start a long select query using T1, T2, T3 tables.
Is there really some chance to get a snapshot too old because of this select?

I request you such question because you often refer to an example which demonstrates that, e.g.:

“there can be NO activity on this table and you can get an ora-1555. it just means whilst you were querying there were lots of other little transactions all committing and wiping out undo you needed to ensure the consistent read.
it can happen on a READ ONLY tablespace even.
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:895410916429“

But this example is doing some insert, commit, then do other updates on same table using a for loop commit.

My simple concern is quite different: do a big insert/update on some tables, commit, then start only a big select query on same tables (oracle 9i or 10g).

Thanks for your valuable help.
Nic

October 10, 2009 - 2:54 am UTC

Reviewer: A reader

Sir,
I have some doubts in this issue:

Say ,my transaction started at T1.

After T1,some transactions performed and therefore ITL entries are open.
Then a commit issued and ITL entries are cleaned.

During my transaction how does oracle understand if the row was modified after T1 and read the
before image from undo?
ITL entries are already cleaned, so it has to check something else..

December 05, 2009 - 7:36 pm UTC

Reviewer: Carew

Hi Tom;

From your book, I learned that oracle needs before image of the block, and if it cannot find it
snapshot error is returned. This can be prevented by undo retention, etc..

I want to ask my scenerio:

In oracle 9i, I use old fashioned rollback segments (not AUM).

When I try to analyze some small tables, It always return snapshot too old error.
Whats the reason for this? Is there a relation with before image of the block?

Tom Kyte

Followup  

December 07, 2009 - 2:17 pm UTC

if you say you get ora-1555 on small tables where the analyze would take seconds to execute - not that I don't believe you - but I don't believe you necessarily. I'd need to see a cut and paste - with timings, the error message and information about the size of the table.


the analyze command wants a read consistent image of the table - just pretend that analyze of that_table is nothing more then "select from that_table".

If the analyze command takes an hour to execute - but your only have configured sufficient undo segments to hold 30 minutes of rollback data in your database - the analyze will likely (but not definitely) receive ora-1555 snapshot too old

Which means you did not size your undo for the work you perform

Which means you should probably switch over to automatic undo management, rather than fixing the sizes of what you have.


Also, stop using analyze, it has been deprecated for nine releases now. DBMS_STATS is the way to gather statistics.

December 11, 2009 - 6:47 pm UTC

Reviewer: Carew

Thanks Sir;
If I am the only user in the database, and I have very small rollback segment, When I attempt to analyze a big table, is there a possibility that I hit ORA-01555?
Tom Kyte

Followup  

December 14, 2009 - 8:03 am UTC

yes. of course.

by the way, you are never the only user in the database never... Don't forget about the Oracle background processes - they are always there and are always doing work. They commit frequently after doing work. You are NEVER alone.


do this:

take your largest table.

create an index on some data that arrives randomly (like last name), not on a column that is populated via a sequence or date.

then

for x in ( select /*+ index( t that_index ) */ * from t where last_name is not null )
loop
update t set last_name = last_name || 'x' where primary_key = x.primary_key;
commit;
end loop;

that'll almost surely 1555 you if you have small rollback set up.

Since the data arrives randomly, the data in the table is scattered alphabetically (block one contains last_names that start with A, B, C, .. M, .. Q, R, S, ... Y, Z). We read the data via the index - so we read and process the A's then the B's then the C's and so on.


So, when you process the A's, you read and modify a lot of blocks in TABLE. When you hit the B's - the same... Probably when you get to the D's or E's - you've modified and committed changes to EVERY block in the table. You are reading and re-reading blocks over and over and over. When you get to the Z's you need to re-read the first block you read (and probably have read and re-read many times already) - but we need to roll that block back to the way it was before you did any updates - and the odds are it won't be in the rollback anymore - bam - ora1555 snapshot too old - you lose, failure, roll back, all work gone (and you were SO close)

December 19, 2009 - 5:39 pm UTC

Reviewer: A reader

Thanks sir.
Your explanation makes sense.

However, there is no for loop in simple analyze table command.
Why would I get ora-01555 in this case?



Tom Kyte

Followup  

December 20, 2009 - 9:05 am UTC

you have the source code for analyze? You know this to be true? Nice.


In any case, you do not need a loop, a full scan will do - anything that reads data from the database in any access pattern will do.


When a consistent read begins (say at 9am), all of the blocks touched by that consistent read will be as of that time (9am). An hour into the consistent read - we'll roll every block we touch back 1 hour. So an analyze that takes any amount of time would definitely be subject to an ora-1555.

ora-1555 can it be caused by updates on another table ?

February 09, 2010 - 1:43 pm UTC

Reviewer: A reader

Hi Tom,

1. I undestand in physical data files simple heap tables will have their unique data blocks allocated to them but If there are mutiple reads/writes happening in the database is the data from various tables might be sharing the same blocks in buffer cache or blocks in buffer cache always have data from unique tables ?

2.If above is true is it possible that long-running reads on a table X can end up with "snapshot too old" by the updates on table Y if they happen to share same blocks in buffer cache ?

thanks !
Tom Kyte

Followup  

February 15, 2010 - 2:56 pm UTC

first, I'll answer the subject line "can it be caused by updates on another table" - answer is YES

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:895410916429


#1) not relevant, ora-1555 happens when UNDO is overwritten, it would not matter if data were on same blocks are not (but yes, two tables can in fact share a block, many tables can share a single block - we call them "clustered tables")

#2) the outcome you mention is possible but NOT for the reason you suggest. It is because UNDO information is overwritten, not because of the table blocks themselves.

March 08, 2010 - 7:29 am UTC

Reviewer: Sonia from India

Hi Tom
Thanks for the nice explaination.
I have few doubts.Could you please clear those doubts?

Suppose

QENV50 tries to select data from block B1

Question 1.
if the block is already cleaned (Delay block cleanout is not required) then how QENV50 comes to know whether the last updation has been made (commited) by SCN 49 OR SCN 51?
Is SCN is stored as a Block attribute?

Question 2.
if the block is not cleaned, then suppose QENV50 finds that some SCN49 (again, at this position does QENV50 know whether the changes has been made by SCN49 or SCN50? ) has changed the block.
Now QENV50 tries to find out the information(whether the block has been commited or not) in rollback segment transaction table header.But QENV50 does not find the information there because it has already been overwritten.
Now at this situation, whether QENV50 understands that block has already been committed (since rollback segment header has been overwritten ) and will clean the block OR it will raise ORA-1555 error?

Tom Kyte

Followup  

March 09, 2010 - 11:29 am UTC

q1- the recent transaction history is in the block, we use that to read the undo segment information and that tells us if/when it was committed.

q2) the transaction history in the block header tells us that normally.

In this case, it would raise 1555. see

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:895410916429

Thanks

March 19, 2010 - 6:15 am UTC

Reviewer: Divya from INDIA

Tom, this information is very useful. Thanks a lot.

Good articale

March 30, 2010 - 9:04 am UTC

Reviewer: lingaraj from FL,US

Tom, Nice article. Thanks.

Oracle rocks!

snapshot too old error

May 25, 2010 - 2:18 pm UTC

Reviewer: Amrit from India

Hi Tom,

I am new to DBA field. I have some confusion about this Undo behaviour and snapshot too old error.

I read it in Oracle 9i Backup and Recovery guide that "Each redo record contains both the old and the new values.Oracle also records the old value to an undo block ...."

My question are below
1) If we have UNDO Tablespace why do Oracle also stores the old value in the redo record?

2) Then why cant Oracle get old record from REDO LOGs rather than displaying ORA-01555?

Apology if i have asked anything stupid. But i am very confused.

Thanks in advance for you kind consideration.
Tom Kyte

Followup  

May 26, 2010 - 7:00 am UTC

First - what is stored in redo is redo, true there is redo for the undo blocks to roll them forward - but it is not stored as undo - it is redo information. So, while in theory there is "undo" in there, it is hidden as redo that must be replayed against and undo block in order to get the undo we'd need. That is - it is not in a format very useful for rollback back - it is there for roll forward (redo).


Second - if the data isn't in undo anymore, it probably isn't in redo either - since redo in general is much smaller than undo. Say you have two 50mb redo files. Further, you want to update 500mb of data. You will generate 500mb of changes to the data, 500mb of undo (suppose that is true) for a total of 1gb of redo. That transaction will succeed (assuming you can get 500mb of undo) - and when you are nearly done - you'll have generated 1gb of redo but only 100mb of it can possibly be available (while all of the undo is).

Third - redo is optimized for writing, it is unbuffered - it is written to using sequential IO, we do not read it. It would be a real performance issue to read it - that is the reason we have separate redo and undo streams like that - so we don't have to read the redo.

June 09, 2010 - 7:57 am UTC

Reviewer: A reader

Thanks Tom for this wonderful reply.

June 21, 2010 - 12:33 pm UTC

Reviewer: aliyar from INDIA

Dear Tom ,

Thank you for your wonderful service

recently i found ora-1555 error found for the following flashback query

insert into xxx select a,b,c from yyy as of scn ...

Database : 10g
Undo retention : 900 sec

could you please guide me how to solve this issue

Thanks
Aliyar


Tom Kyte

Followup  

June 22, 2010 - 1:00 pm UTC

read the above original ansewr.


You would "solve" the issue by either

a) specifying an SCN close to right now
b) using a higher undo retention than you currently are, you can only flashback 15 minutes reliably the way you have it configured.

transaction slots...

July 19, 2010 - 1:36 pm UTC

Reviewer: Raj from Victoria, B.C.

Tom,
I'm trying to understand transaction slots as they relate to AUM.

In an earlier post, it was written that:

"we (you and I) have no control/true insight
into the number of slots - adding more rbs
would give you more slots."

(1)Using AUM, are my transaction slots controlled per segment, per undo datafile, or per undo tablespace?
(2)Would I add more segments by adding space to the one datafile of an undo tablespace, or by adding additional datafiles to an undo tablespace?

3)(sort of repetitive question)
If for UNDO_1: `select count(*) from dba_undo_extents where tablespace_name='UNDO_1'` = 1000
AND `select count(*) from dba_data_files and tablespace_name = 'UNDO_1'` = 1

and for UNDO_2: `select count(*) from dba_undo_extents where tablespace_name='UNDO_2'` = 1000
AND `select count(*) from dba_data_files and tablespace_name = 'UNDO_2'` = 2

Will UNDO_2 have more transaction slots than UNDO_1?

Am I considering this improperly altogether?

Sincerely,
Raj.
Tom Kyte

Followup  

July 19, 2010 - 2:19 pm UTC

1 they are segment level , your undo tablespace contains some number of undo segments

2 you do not add segments, we do - when we feel like it, in response to increased transaction load. we get rid of them - when we feel like it, in response to reduced transaction load.

3 it doesn't really matter to us, it is a bit way too low level, we are using AUM whereby the A = automatic. It takes care of itself.

Snapshot too old

July 21, 2010 - 11:27 am UTC

Reviewer: Gautham from INDIA

Hi tom,

One small query,

I am updating a row 1 which is in block 1 and it has been set has uncommited in header. That entire row or that table is not used by any transaction for 2 days,so obviously that slot in rollback segment will be overwritten by some other transaction.

My question is,

if i want to fetch that row 1 will i get snapshot too old error,because there are no transactions after that update to make the changes permanent in the datablock?

Thanks in advance


Tom Kyte

Followup  

July 23, 2010 - 8:59 am UTC

... I am updating a row 1 which is in block 1 and it has been set has uncommited in
header. That entire row or that table is not used by any transaction for 2
days,so obviously that slot in rollback segment will be overwritten by some
other transaction.
...


no it won't, you started by saying "it is uncommitted". as long as it is not committed - we'll have everything about that transaction we need.



Very Useful

July 22, 2010 - 2:00 pm UTC

Reviewer: Gandla Ram Reddy from USA

Very Useful

SNAPSHOT TOO OLD

July 24, 2010 - 2:56 am UTC

Reviewer: Gautham from INDIA

Hi Tom,

Thanks for your reply.

If my press commit after my update and no other transaction is made on that updated row for many days. Do i get snapshot too old error if i want to fetch that row after long time.
Tom Kyte

Followup  

July 26, 2010 - 8:17 am UTC

there is insufficient context here to answer - what is going to get a snapshot too old? You didn't give a timeline of any read queries. Even if you do, the answer will be "maybe", since we don't know how much undo you have configured - and so on.



You will get snapshot too old if you run a query that requires undo that no longer exists.

ora-1555

July 29, 2010 - 11:38 am UTC

Reviewer: Gautham from Gautham

Hi Tom,

Sorry for your inconvenience in above question.

Let me explain you with a example.

Consider a table T1

create table t1(id number(10);

insert into t1 values(1);

later i had updated my value to 2;

Consider this is the situation after my update

Data Block 500 Rollback Segment Header 5
+----+--------------+ +----------------------+---------+
| tx |5.3uncommitted|-+ | transaction entry 01 |ACTIVE |
+----+--------------+ | | transaction entry 02 |ACTIVE |
| row 1 *changed* | +-->| transaction entry 03 |ACTIVE |
| row 2 | |transaction entry 04 |COMMITTED|
| ... .. | | ... ... .. | ... |

My row1 is changed and i press commit, So the corresponding rollback segment will remain committed.

As you said,The row in the datablock will be changed until a new transaction is made on that row or block.


No transaction was made on that row for 2 days and my UNDO RETENTION is 900 seconds,So obviously that transaction entry 04 from rollback segment will be flushed out after the undo retention.

My Question is:

If i want to select that row from T1 table after two days,Do i get value 1 or snapshot too old error or value 2.

Thanks


Tom Kyte

Followup  

July 29, 2010 - 12:40 pm UTC

... No transaction was made on that row for 2 days and my UNDO RETENTION is 900
seconds,So obviously that transaction entry 04 from rollback segment will be
flushed out after the undo retention.
....

that is not obvious, it doesn't have to be true, but let us say it happened that way, entry 04 is gone.

entry 04 isn't needed for the block in question. it isn't relevant.


but lets say you really meant entry 03.

It is highly doubtful you would get a 1555 (it would not happen) because the transaction committed so long ago - we *know* it happened before your query began (we know it happened and finished way before your query began). We keep track of a "point in time at which any transaction that happened before this time is KNOWN to have committed".



1555

August 28, 2010 - 2:54 am UTC

Reviewer: aliyar from indai

Dear Tom ,

could you please clear my doubt ..

found one very simple sql causing 1555 error after running 18 hours from alert log file .. then i ran that sql and completed with in few seconds

how this simple query can run 18 hours and causing 1555 error .... also this same query multiple time reported in the alert log file after running almost same amount of time

sql is as follows
===================

select cii.col1 as object_id,3 as object_type
from table1 cii,table2 eaet
where cii.col2=eaet.col2 and
cii.col3 =eaet.col3 and
group_id=288205

table1 : 1.5 milloin rows
table2 : 4500 rows
Joining : Hash Joing

database : 10g

also found around 20 of similar sql statement running in the Db with different value for group_id column ..( not using bind variables )

Thanks
Aliyar
Tom Kyte

Followup  

September 07, 2010 - 7:58 am UTC

easy:

someone:

a) opened the cursor at 11am on tuesday
b) they fetched the first row at 11:10am on tuesday
c) they went to lunch........... and forgot what they were doing...
d) finally, at 5am on Wednesday they remembered what they were doing and went to fetch a row, they they hit the ora-1555 reading the table that was being full scanned second (the first table was already read from 11-11:10am on tuesday and hashed into memory/temp - then we would slowly full scan the second table as you fetch from the result set)



delayed block clean out on OLAP database

September 01, 2010 - 10:31 pm UTC

Reviewer: alan

Hello,Tom
I got a problem of ora-01555 error on our OLAP database:
Our business on the database is just data loading like :
insert into ....values( ...) and big query.

Every 10000 rows inserted,there is a commit;
every day about 10 million rows were loaded into database,the tables was created as partitioned tables with 1 partition per month.
there is no update and delete on these data,just insert and query.

My probem is :
When users issued a big query,such as it query data over 30 days ,aften failed with ora-1555 error.
Becouse there are no DML operations on these data,I think is was not cause by consistent read.
I not not sure if it was caused by delayed block clean out .
the rollback usage reached to 8G 。

Could you tell me what is the proberbly reason or the way to find the reason?

Hope you understand!

Thank you very much!

Alan

Tom Kyte

Followup  

September 09, 2010 - 6:50 pm UTC

If it is a long running query (a query of data 30 days in size - that is what I get from your comment) and this is happening during the load, it sounds like normal read consistency.

8gb doesn't tell us anything, we would need to know what your undo retention is and whether we are forced to expire undo prematurely because your undo is too small. 8gb might be 5 minutes of load work for you for all I know - it doesn't tell us how much of your load work (which is being done in the worst way possible of course) fits in there.

If the 1555 is happening AFTER the load is completed, it is undoubtedly delayed block cleanout related.

snapshot too old

September 29, 2010 - 3:03 pm UTC

Reviewer: A reader

Hi Tom,

I can understand that if while querying a corresponding undo block is overwritten within or over Undo retention period for space requirement we may face ORA-01555 error and the select query fails. Now if we suppose a "For loop" in which a query is running for every loop. The query faces ORA 01555 for loop1 and we then increase either the Undo_retention or undo tablespace size. Can the same query as its undo has already been overwritten will succeed in next few loops? Will the query hold new SCN at the starting of every loop?
Tom Kyte

Followup  

September 29, 2010 - 4:18 pm UTC

are you talking about a query like:

for x in (select * from big_table)  <<<<=== this query
loop
    ... do some procedural code ..
end loop;



or are you talking about a query like:


for i in 1 .. 1000
loop
    for y in (select something_small from small_table) <<<<==== this query



In the former, the 1555 would kill the query - the loop would have to restart.

In the latter, the for y queries are all independent of each other.


But in any case - once the query fails - it has a chance of succeeding on the next try regardless of what you do (you do not need to change the undo settings - it might work the next time around simply because the SCN "as of time" for the query would be set "as of" right now - not "as of then"

snapshot too old

September 29, 2010 - 9:37 pm UTC

Reviewer: A reader

Hi Tom,

Thanks. This should be fine if the query wants to maintain the Read consistency since for each loop the SCN will be incremented and it will give another try to find its compatible undo or data block. And we don't have to increase any value I think. But what will happen if it wants for a delayed block clean out? Will it be successful in making the cleaning of a block whose undo has got missing?

How can we know if the query has failed maintaining read consistency or delayed block clean out? Is there any way to determine it?
Tom Kyte

Followup  

September 30, 2010 - 7:12 am UTC

if you get an ora-1555 due to delayed block cleanout - the next execution of that will not fail for the data block since we'll now know that "all transactions since X have committed".

How can we know if the query has failed maintaining read consistency or delayed block clean out? Is there any way to determine it?


they look identical to you as they have the same exact underlying cause - insufficiently sized undo for what you are doing on your system.

October 21, 2010 - 5:33 am UTC

Reviewer: Sonia from India

Thanks Tom
Superb explanations.
I have few questions. Would you please Answer them?

Suppose there are two sessions

Session1:
1. An update statement is executed from this session (and this update statement is updating only one record), say the record is in data block 500.
data block header is updated with the rollback segment header and slot information (say rollback segment header 5 slot 2 )

2. No COMMIT or ROLLBACK is executed



Session2:
1.Another update statement is executed from another session say Session2 (and this update statement too is updating only one record), say this record also is in data block 500.

Again data block header is updated with the rollback segment header and slot information (say rollback segment header 5 slot 3 )

Question 1. Do the above recent entry in data block header (rollback segment header 5 slot 3) overwrites the previous one (that is rollback segment header 5 slot 2)?

2. A COMMIT is executed

Now at this moment the the rollback segment header 5 slot 3 is updated to 'COMMITED'.

Now suppose after sometimes a select satement visits the same data block 500, it sees 'rollback segment header 5 slot 3' at the data block header and then it goes to rollback segment header 5 slot 3 and finds that the block is 'COMMIT'ed. So it cleans the data block header (delayed block cleanout).
But actually Session1 hasn't yet executed 'COMMIT'.

So in this case although Session1 has updated a data block and hasn't executed COMMIT yet, eventhough the data block is cleaned. Is this correnct? If not, would you please explain this scenario?


Tom Kyte

Followup  

October 25, 2010 - 3:18 pm UTC

q1: no, it cannot overwrite it - it is still active. There is an ITL (interested transaction list) in the block header, an array of transactions - there can be more than one..


q2: the information for the first transaction might be "cleaned out" (if it needed to be - if you only update a single row - almost certainly the commit cleanout would cover it) - but the second one could not be. The block would still be "not cleaned out".

SnapShot too old - delayed clean out

January 11, 2011 - 5:58 am UTC

Reviewer: Sanju from Dilli, INDIA

Hi Tom,

Just need one clarification on delayed clean out.
Considering a table has multiple updates and commits and the corresponding rollback segment has been overridden (i.e. if I query the table I will get snap shot too old error).

Now, what will happen if I execute the following query:

Select count(*) from table

Will this query clean the table (delayed clean out)?? As far as I know this query will do the INDEX FAST FULL SCAN thus avoiding the reading of data blocks. Please clarify?

Thanks

Tom Kyte

Followup  

January 11, 2011 - 6:03 am UTC

"it depends"

it depends on what blocks it decides to process to answer that question. There are lots of possible paths.

It might hit any one of the indexes on that table that contain at least one NOT NULL column in them and do a fast full scan.

It might hit a 1555 on one of those indexes (or not, maybe the update updates column X but the index is only on column Y - if we scanned the table - the table would 1555 but if we scan the index on Y - it would not, but if we scanned another index on X - it may well 1555)

It might use a materialized view on TABLE

It depends. Obviously, if it does not read the table, it cannot clean out any of the table blocks.

01555 error with duration 0sec

January 31, 2011 - 12:25 am UTC

Reviewer: Aliyar from India

Dear Tom ,

Could you please Help me in the following issue

Database : 10.2.0.4.0
Undo Retention. : 3.5 Hrs

Found lots of frequent ora-01555 error in alert log file
most of them with duration 0 sec and few with 1 or 3 sec..

failing sql is using this Hint : OPAQUE_TRANSFORM

How one query can fail within 0 ,1 0r 3 sec ...
searched in google got some points as follows

1 ) due to delayed block clean out problem

2 ) probably Bug in DB

But still did not take any corrective action to avoid this error.


Could you please Help me ..

Thanks for your valuable help for DBA world

Thanks
Aliyar
Tom Kyte

Followup  

February 01, 2011 - 4:30 pm UTC

if you are getting an ora-1555 immediately - that is a bug. Please utilize support, something is not right.

01555 error with duration 0sec

January 31, 2011 - 12:29 am UTC

Reviewer: Aliyar from India

Hi Tom,

Sorry ,, I forgot to mention in my previous question

all are same SQL statement with " OPAQUE_TRANSFORM " hint

Thanks
Aliyar

How many blocks (potentially)?

March 11, 2011 - 3:27 pm UTC

Reviewer: Raj from Victoria, B.C.

Tom,
A long running select on a 10g instance kicks off at time=0 gets to a data block#99 at time=2 that was changed at time=1 (and committed).

Does the select: (1)read block#99 only to find from its header that it's too new based on the header's SCN being > than the SCN at time=0 and then (2)checking the same block header to find out the undo segment# & transaction# of its old version and then finally (3)reading the undo block of block#99 with time=0 data?

Assuming that is at least a little bit close to accurate... how many previous undo segment&transaction numbers does the data block header keep in the chance that the block was changed >1 time?
Tom Kyte

Followup  

March 11, 2011 - 3:50 pm UTC

conceptually you have it.

The 'scn' isn't really there - we just see there was a transaction - we might (or might not) know that it committed - but we'll find out what state it is in and when it happened by following the trail from the block header to the undo data - discover "it is too new" and rollback the change.

We have transaction information in the block header, from that we can determine "age"


how many previous
undo segment&transaction numbers does the data block header keep in the chance
that the block was changed >1 time?


however many we can fit up to 255 (maxtrans). However, remember that the act of undoing the change of one will put the block back the way it was - INCLUDING the header - so that 255 might be 256, 257, .... thousands... as we go back in time with the block

RE: How many blocks (potentially)?

March 11, 2011 - 4:01 pm UTC

Reviewer: Raj from Victoria, B.C.

Would that just be two blocks read in that case by the select (for block#99 in the example above)? Or more for reading the undo transaction table?

Either way, if you wanted to investigate if a query was slower due to these increased block reads of undo (slower than if the table wasn't being changed), what would be the tell-tail wait? Or am I going about it the wrong way?

I'm trying to determine if I can gather convincing evidence to prove that long running queries on my OLTP will be better for the user if run on the physical standby (open read only). Apparently I need to sell it to the user that it's in their interest, not just the OLTP primary's.
Tom Kyte

Followup  

March 12, 2011 - 8:47 am UTC

it depends. Is the other block version in the SGA - might be no extra IO. If we have to reproduce it from undo - it is probably one or two - but could be more (depends on how many changes we have to undo).

You wouldn't see any special 'wait event', they are just block gets. You would see more block gets.

Here is a technique you can use to see what statistics might tip you off to the possibility of a problem. We'll cause a lot of undo changes to need to be applied to a cursor (Y in this example) and compare that to a cursor without any changes that need to be applied (X). We'll capture our session statistics during this and compare their values:

ops$tkyte%ORA11GR2> create or replace package my_stats
  2  as
  3      type array is table of number index by varchar2(255);
  4  
  5      g_before array;
  6      g_after  array;
  7  end;
  8  /

Package created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> drop table t;

Table dropped.

ops$tkyte%ORA11GR2> create table t ( x int );

Table created.

ops$tkyte%ORA11GR2> insert into t values ( 1 );

1 row created.

ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> connect /
Connected.
ops$tkyte%ORA11GR2> set linesize 1000
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable x refcursor
ops$tkyte%ORA11GR2> variable y refcursor
ops$tkyte%ORA11GR2> begin
  2      open :x for select * from t X;
  3      open :y for select * from t Y;
  4  
  5      for x in ( select a.name, b.value
  6                   from v$statname a, v$mystat b
  7                  where a.statistic# = b.statistic# )
  8      loop
  9          my_stats.g_before(x.name) := x.value;
 10      end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> print x

         X
----------
         1

ops$tkyte%ORA11GR2> begin
  2      for x in ( select a.name, b.value
  3                   from v$statname a, v$mystat b
  4                  where a.statistic# = b.statistic# )
  5      loop
  6          my_stats.g_after(x.name) := x.value;
  7      end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set echo off
in another session execute:
begin
for i in 1 .. 10000
loop
update t set x = x+1;
commit;
end loop;
end;
/

ops$tkyte%ORA11GR2> print y

         X
----------
         1

ops$tkyte%ORA11GR2> begin
  2      for x in ( select a.name, b.value
  3                   from v$statname a, v$mystat b
  4                  where a.statistic# = b.statistic# )
  5      loop
  6          if ( (my_stats.g_after( x.name ) - my_stats.g_before( x.name ) ) > 0
  7               or
  8               (x.value - my_stats.g_after(x.name)) > 0 )
  9          then
 10          dbms_output.put_line
 11          ( rpad( substr( x.name, 1, 60 ), 60 ) ||
 12            to_char( my_stats.g_after( x.name ) - my_stats.g_before( x.name ), '999,999,999' ) ||
 13            to_char( x.value - my_stats.g_after(x.name), '999,999,999' ) );
 14          end if;
 15      end loop;
 16  end;
 17  /
opened cursors cumulative                                              6           6
user calls                                                             6           6
recursive calls                                                       13           8
recursive cpu usage                                                    0           1
session logical reads                                                  7      10,002
CPU used when call started                                             1           2
CPU used by this session                                               0           2
DB time                                                                1           2
non-idle wait count                                                    5           4
session uga memory                                                73,096     123,440
session uga memory max                                           262,060      57,928
session pga memory                                                65,536     131,072
session pga memory max                                           131,072      65,536
enqueue requests                                                       1           1
enqueue releases                                                       1           1
consistent gets                                                        7      10,002
consistent gets from cache                                             7      10,002
consistent gets from cache (fastpath)                                  7           7
consistent gets - examination                                          0       9,995
consistent changes                                                     0       9,995
calls to kcmgcs                                                        2           2
calls to get snapshot scn: kcmgss                                      1           1
data blocks consistent reads - undo records applied                    0       9,995
no work - consistent read gets                                         5           4
rollbacks only - consistent read gets                                  0           1
table scans (short tables)                                             1           1
table scan rows gotten                                                 1           1
table scan blocks gotten                                               5           5
session cursor cache hits                                              3           3
session cursor cache count                                             1           0
workarea memory allocated                                            -20          16
workarea executions - optimal                                          1           1
parse count (total)                                                    4           4
parse count (hard)                                                     1           1
execute count                                                          4           4
bytes sent via SQL*Net to client                                   1,043         800
bytes received via SQL*Net from client                             1,355       1,742
SQL*Net roundtrips to/from client                                      4           4

PL/SQL procedure successfully completed.


So, you might be looking for

data blocks consistent reads - undo records applied 0 9,995


a high value in that by session.


Not sure if I followed the instruction

March 12, 2011 - 2:13 pm UTC

Reviewer: Jim from palo alto

Hi, Tom,

I followed the instruction you gave at the top of this page but I didn't get ora-1555. Could you tell me why?



drop table bigemp;
create table bigemp (a number, b varchar2(30), done char(1));

rem * Populate demo table.
begin
for i in 1..200 loop
insert into bigemp values (mod(i,20), to_char(i), 'N');
if mod(i,100) = 0 then
commit;
end if;
end loop;
commit;
end;
/

drop table mydual;
create table mydual (a number);
insert into mydual values (1);
commit;

rem * Cleanout demo table.
rem select count(*) from bigemp;

declare

cursor c1 is select * from bigemp;

begin

-- The following update is required to illustrate the problem if block
-- cleanout has been done on 'bigemp'. If the cleanout (above) is commented
-- out then the update and commit statements can be commented and the
-- script will fail with ORA-1555 for the block cleanout variant.
-- update bigemp set b = 'aaaaa';
-- commit;

for c1rec in c1 loop
for i in 1..20 loop
update mydual set a=a;
commit;
end loop;
end loop;
end;
/

Tom Kyte

Followup  

March 12, 2011 - 5:11 pm UTC

depends on the size of your rollback segments, this was written way before automatic undo management - you'll probably have to configure manual undo to see this in a controlled demo.

something like this:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:895410916429

Why/When Oracle (re)uses the same block

April 05, 2011 - 5:23 pm UTC

Reviewer: James Smyth from Santa Barbara, CA, USA

Tom,

Can you explain why or when a transaction would use the same rollback segment? Is there something in the algorithm of rollback segment selection process that causes the same rollback segment to be overwritten for a given table's data block? Is there something more inherent that induces it commonly?

Or are these examples just very simplified to illustrate what happens when a subsequent transaction happens to use the same data block for a given table?

One reason I ask is that I get this problem on long-running, high-volume transactions using a commong table, when there appears to be plenty of available undo space.

Thanks,

James Smyth
Tom Kyte

Followup  

April 12, 2011 - 12:55 pm UTC

There are a finite (small) number of undo (rollback) segments.

There are many many transactions. Typically, you might see many concurrent transactions using the same undo segment - just because there are more transactions than undo segments.


Data in undo is managed in blocks, the blocks are managed in extents, an undo segment, much like a table, has many extents and each extent has many blocks.


The undo segment itself employs a circular buffer. We start at say "block 1" in extent 1, and fill up extent 1 and then go onto extent 2 (undo segments always have at least two extents). When we fill up extent 2 - we would normally try to advance into extent 1 again and overwrite it. We will do this (overwrite the data in extent 1) IF and ONLY IF:

a) there are no active transactions still using extent 1, if there are - we CANNOT advance into it.

b) the data in extent 1 is older than the undo retention period. That is, if you set undo retention to 1 hour - and the data in extent 1 was generated 30 minutes ago, we will NOT advance into extent 1.


If we do not advance into extent 1, we'll add a new extent 3 to this undo segment and advance into that - we won't overwrite anything.


A caveat: if the undo in extent 1 is say 30 minutes old and your undo retention is 60 minutes - we don't want to advance into that extent - but we MIGHT. If the undo tablespace is 100% full and we cannot grow it dynamically (no autoextending allowed) and we cannot 'steal' an old extent from some other undo segment that doens't need it anymore - we'll be forced to go into extent 1 and prematurely expire some undo (assuming no active transactions are using extent 1 - we simply cannot advance into extent 1 if that were true).





It sounds like, if you are getting ora-1555, you either:

a) have undo retention set too low, it needs to be set longer than your longest running query.

b) do not have sufficient undo allocated

(v$undo_stat can be used to determine if (a) and/or (b) are the case)

c) you might be running into delayed block cleanout - we'd need to understand your processing better to answer that.


I typically guess (a) and (b) as many people go "skimpy" on undo for some reason.

May 02, 2011 - 7:21 pm UTC

Reviewer: James Smyth

Tom,

In b) of your answer below, did you really mean to say that it will not re-use the extent if the data there is older than the undo retention period??

Thanks,

[Tom said]The undo segment itself employs a circular buffer. We start at say "block 1" in extent 1, and fill up extent 1 and then go onto extent 2 (undo segments always have at least two extents). When we fill up extent 2 - we would normally try to advance into extent 1 again and overwrite it. We will do this (overwrite the data in extent 1) IF and ONLY IF:

a) there are no active transactions still using extent 1, if there are - we CANNOT advance into it.

b) the data in extent 1 is older than the undo retention period. That is, if you set undo retention to 1 hour - and the data in extent 1 was generated 30 minutes ago, we will NOT advance into extent 1. [/Tom said]
Tom Kyte

Followup  

May 04, 2011 - 12:15 pm UTC

Yes, I meant to say that.

I was saying - we move into extent 1 again IF and ONLY IF these two things are true:

a) there are no active transactions
b) the data in extent 1 is older than the undo retention period.


If the data in extent 1 is "younger" than the undo retention period - we would like to NOT overwrite it yet.

In Logical Standby

May 31, 2011 - 8:08 am UTC

Reviewer: A reader

Hi Tom - Have been getting this error in my logical standby database : "ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old".
We also run several report generation queries on it, do you think because it is a logical standby results in the error.
Is this set of information sufficient for you to comment. if not , what set of info will you need .
What should be the correct set of steps to diagnose this error.

Thanks as always.
Tom Kyte

Followup  

May 31, 2011 - 12:58 pm UTC

it has nothing to do with being a logical standby.

It has everything to do with the DBA improperly sizing the undo_retention on the remote database (the missing undo segment name indicates this is happening on a dblink).

The database causing this is not saving undo for the length of the longest running query against that database. they need to set undo_retention higher (v$undo_stat or the undo advisor in statspack or AWR can be used to determine what that value should be) and ensure they have permitted the undo tablespace to grow to a size that can hold that much undo

Wonderful

May 31, 2011 - 11:00 pm UTC

Reviewer: A reader

Thank Tom. Greatest as ever!

October 20, 2011 - 9:56 pm UTC

Reviewer: bharathy from india

hi,
we are getting ORA-01555: snapshot too old: rollback segment number error when we are trying to
fetch a data from other database and try to insert in to ours.we are running this as a batch and
moreover we are not using any cursors in this

example

delete our_db_table;
begin
insert into our_db_table
(col1,col2,col3)
select col1,col2,col3 from other_db_table;

update db_erro_table;
commit;
end

we think instead of hitting other database everytime. if we use cursor and commit after every insert statement,whether it will help us in reducing the executing time and get rid of the above error.

kindly suggest




Tom Kyte

Followup  

October 20, 2011 - 10:47 pm UTC

How about you ask this question in one and only one place. Go to the other place you posted this same exact question on this site and read the answer I've already prepared.

ORA-01555

July 19, 2012 - 3:20 am UTC

Reviewer: vinod from india

Hello Tom,

Few times we also get snap shot too old error,i noted queries,first i sent query to developer for review,
but it seems its already optimized query there is not much room for improvement in it
also i noticed during this error its not always same query
which caused this error ,so it made me to think of looking in db setup to avoid this
i generated statistics which indicates undo retention is sufficient,but i feel i could increase undo size
will this help in fixing problem ,can it have any adverse effect on other side of performance ( space for increasing size will not be problem as i have free i could double its size)
Please suggest

thanks

Tom Kyte

Followup  

July 19, 2012 - 10:05 am UTC

i generated statistics which indicates undo retention is sufficient

you apparently generated the wrong statistics, what statistics could you have generated to show this is sufficient.

the mere appearance of a 1555 would indicate to me - that no, no you didn't set undo retention high enough.

look at v$undostat
http://docs.oracle.com/cd/E11882_01/server.112/e25513/dynviews_3118.htm#i1422075

also, check your alert log. You should see messages like:
ORA-01555 caused by SQL statement below (SQL ID: fp4pnpm934qkp, Query Duration=20 sec, SCN: 0x0000.05294eb0):

it'll show you the affected sql's and how long they had been running



it sure sounds like your undo retention needs to be longer.


caused by

July 19, 2012 - 6:45 pm UTC

Reviewer: yestoall from cybermen headquarters

I'm wondering if the "caused by" in the alert log message is misleading. Sometimes it would be true, as in fetch across commit or perhaps if some query were visiting a block more than once (as explained in case 1 section 6 of MOS https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=40689.1 ). But wouldn't the more general cause be some other DML pushing out the needed rollback info, or delayed block cleanout or whatever? Maybe it should say "thrown by."
Tom Kyte

Followup  

July 19, 2012 - 8:00 pm UTC

the ora-1555 is not technically "cause by" a sql statement, it an error received by the sql statement.

fetch across commit, the sql statment could receive an ora-1555 by its own doing.


The ora-1555 is "caused" by

a) undo that is needed not being around anymore
b) delayed block clean out - we cannot figure out if the block needs to be rolled back or not

(a) is the "cause" 99.999% of the time.

The alert log should say:


ORA-01555 experienced by SQL statement below (SQL ID: fp4pnpm934qkp, Query Duration=20 sec, SCN: 0x0000.05294eb0):


but our tech writers write the documentation and we let programmers write error messages.

snapshot too old error

July 20, 2012 - 4:43 am UTC

Reviewer: vinod from india

Thanks Tom

Offcourse i checked alert log,noted sql_id of query /transaction,and extracted full text of query
and checked if it could be optimized
as ora-01555 is caused query duration is apprantely more then undo retention set,i see 3 options in this scenerio
1)optimizing query-which checked by us and app team ,no
improvements seems to be possible
2)increase undo retention
3)increase undo size

my question is should i increase undo retention or size.
like if i increase retention same as query duration shown in alert log while ora-01555 ,some other query with more duration will raise same error ,so this may not be permanent fix may be specific to 1 query ,also i am noticing that not same query raise this error
Tom Kyte

Followup  

July 30, 2012 - 7:46 am UTC

2 and 3 sort of go hand in hand....


In general, you cannot increase the retention without also increasing the size (assuming you are not currently way oversized as it is)

do both.

snap shot too old error

July 25, 2012 - 9:23 am UTC

Reviewer: vinod from india

Thanks Tom

Offcourse i checked alert log,noted sql_id of query /transaction,and extracted full text of query
and checked if it could be optimized
as ora-01555 is caused query duration is apprantely more then undo retention set,i see 3 options in
this scenerio
1)optimizing query-which checked by us and app team ,no
improvements seems to be possible
2)increase undo retention
3)increase undo size

my question is should i increase undo retention or size.
like if i increase retention same as query duration shown in alert log while ora-01555 ,some other
query with more duration will raise same error ,so this may not be permanent fix may be specific to
1 query ,also i am noticing that not same query raise this error



Snapshot too old

July 26, 2012 - 4:27 am UTC

Reviewer: Rakesh from Canda

HI Tom,

Suppose, the record in the block is committed and block header is updated as committed, Suppose there are no transactions on the block for the next 1 week, so obviously undo retention will be expired. Then the transaction comes after 1 week and it searches for the relevant record in the rollback segment and it triggers ora-1555 error?

What could be solution in this scenario.
Tom Kyte

Followup  

July 30, 2012 - 9:38 am UTC

I don't see a question here, what are you trying to ask?

ORA-01555: snapshot too old: rollback segment number 30 with name "_SYSSMU30_3149236339$" too small

July 29, 2012 - 11:21 pm UTC

Reviewer: ramki from USA

Hi Tom
Your explanation on Rollback segment error is excellent.
Currently i am facing an error in a production environment

ORA-01555: snapshot too old: rollback segment number 30 with name "_SYSSMU30_3149236339$" too small

The job(ETL(Extraction Transformation Loading) process loads the oracle table through Informatica(ETL) tool.


The job selects data from Table A which has 7 million rows and loads into Table B.
We have set the commit interval in Informatica settings for every 1 million rows.
Both the tables are not used by other jobs( modify-update)during this load process.

After the job loads around 2 million rows it fails with the above error.
Should i need to check with DBA's if the Rollback segment is created with "Automatic"

Can you please suggest what should should i look in further of this issue.

select * from dba_segments
where segment_name='_SYSSMU30_3149236339$'


Tom Kyte

Followup  

July 30, 2012 - 12:01 pm UTC

you probably doing this to yourself, by committing.


you don't give enough information about the flow here - it sounds like you might be doing slow by slow processing (ugh), and having a big cursor on table A open for a long time (long time because of your slow by slow by really slow processing) and you commit while this cursor is open.


you need your undo retention to be set for longer than your longest running query (probably that query against table A) - which means probably at least 4 times longer than it is now (you get about 25% of the way through). And you need to make sure your undo tablespace is large enough to hold that much undo.


or you could just use a single query to move the data over and do it in a few seconds instead.


ORA-01555: snapshot too old: rollback segment number 30 with name

July 31, 2012 - 11:05 am UTC

Reviewer: Ramki from CA,USA

Thank you Tom for your response.
Table A( has 7.7 million rows) stats was gathered and ,loaded into Table B through Informatica.There was no cursor involved.It is an Oracle straight Select-Insert statement which is done through Informatica committing at every 1 million rows.
It took 2 hours to load 2 million rows and committed it and somehwere in the middle between 2-3 million rows it got failed with "ORA-01555: snapshot too old: rollback segment number 30 with name"
Table A is a kind of Staging table(Truncate and Load).
Table B is a standalone table which has
16 Non Unique Bit-Map Index
3 Composite Non-Unique Bit Map Index
Should i need to load the data and then create the Bit Map Index?Is the slowness due to Bit Map Index on the target table?
Should i try asking DBA's to increase the UNDO segment space?I will be happy to provide more information on this incase i did not provide.

Tom Kyte

Followup  

July 31, 2012 - 12:59 pm UTC

... There was no cursor involved ...

how do you think informatica got access to the data???????


... It is an Oracle straight
Select-Insert statement which is done through Informatica committing at every 1
million rows. ...

hah, please tell me how you got an insert as select to commit every 1,000,000 rows? explain that syntax to me (it does not exist, it is not happening the way you think, informatica is FAMOUS for being the slow by slow tool of the world)

so, please go back to the drawing board and figure out what information is *really* doing, because it isn't doing what you say - nothing is, nothing can!

ORA-01555: snapshot old error

August 01, 2012 - 12:56 am UTC

Reviewer: ramki from CA,USA

As per Informatica Session Log i see the following steps.
1) Establish the connection with Source DB(Table A) which has 7.7 million rows.
2) Execute the Oracle Select Query against the DB.
3) Query returns the first row from DB.
4) Starts writting into the Target Table(B).
5) Commit Interval is set to 1 million."A commit interval is the interval at which the Informatica Integration
Service commits data to targets during a session".
6) Commits the first 1 million rows after 1 hour.
7) Reads the next 1 million rows and commits(loaded in target table) and commits after another 1 hour.
8) Reads the data between 2-3 million and failed with "ORA-01555: snapshot too old: rollback segment number 30 with name"

I can think of the above steps which Informatica tries to read and write the data.It could be filling in the undo segment for every 1 million rows and We suspect the undo segment was not sufficient to handle this query runtime
Tom Kyte

Followup  

August 01, 2012 - 6:50 am UTC

2) see - no insert as select

3) see - slow by slow processing

4) see - no insert as select, obviously SLOW BY SLOW - the worst way to possibly think about doing this sort of stuff

5) there you go....


now why did you say "insert as select", this is obviously slow by slow and as soon as you said informatica, I knew it would be.


your undo needs to be retained for as long as the query in #1 is open. Meaning your undo retention better be set for HOURS and HOURS since it is going to take informatica hours and hours to do something that could take seconds or minutes if done correctly.



SnapShot old error

August 01, 2012 - 11:55 am UTC

Reviewer: ramki from CA,USA

Hi Tom
Thank you.I will rewrite this load in a separate select/insert statement and run through Procedure rather than informatica.
Also i will check with DBA's to increase the Undo retention accordingly.
The target table has around 77 million rows with
16 Non Unique Bit-Map Index
3 Composite Non-Unique Bit Map Index

And when this process is going to load another 7.7 million will the slowness will be due to that?
Should i create the index once all the data is loaded.
I joined into middle of the project and i am trying to correct certain things which is causing performance issues in production.Also it looks like Informatica/Oracle share the same server :(

Tom Kyte

Followup  

August 01, 2012 - 1:06 pm UTC

... And when this process is going to load another 7.7 million will the slowness
will be due to that? ...

it will have a measurable impact - yes.

anyway you can use partitioning so you could just direct path load 7.7 million records into a table, index it and then exchange that table into the partitioned table? that would be pretty darn fast, you could do that in seconds or minutes.

snapshot too old error ORA-01555

August 13, 2012 - 4:00 am UTC

Reviewer: vinod from india

Hello Tom,

Addition to my previous message and your recommendations
i was working on plan to change undo size /retention
noticed intresting ,what undo size shown in v$/UNDOSTAT views
is actually current size of undo tablespace ,also UNDO tablespace is set autoextensible and from meta data found max size still have about 30% more left ,so wondering if UNDO size was not sufficient for queries when it throws ORA-01555 then why UNDO tablespace not grown on its own to reach UNDO max size ,current size of UNDO tablespace shown is 15 GB but max size is 25 GB with autoextend on
please clear this doubt

also checked on OEM undo advisor it gives some indications
for increasing retention,undo size ,is this advisor very much relaiable to make changes or cross check with our own observations please suggest
Thanks
Tom Kyte

Followup  

August 17, 2012 - 1:54 pm UTC

because the undo tablespace size is governed by undo_retention unless you are using self tuning undo retention.

if you tell us to retain for one hour, we will and after an hour we'll be able to start overwriting stuff. If you have a 3 hour query - we don't care, you told us "one hour".

the advisor makes its recommendations based on what it has observed your system doing, it is pretty much dead on.

snapshot too old error ORA-01555

August 13, 2012 - 9:07 am UTC

Reviewer: vinod from india

missed to add
UNDO tablespace is configured as AUTO
and autoextend is ON ,have enough space to grow
than what is shown in v$undostat recommendations/statistics
so why it has not grown beyond what is configured
do i need to resize it further to level of size as per advisory
thanks
Tom Kyte

Followup  

August 17, 2012 - 1:54 pm UTC

because you sent the undo_retention, it won't grow if it doesn't have to.

Snapshop too old error ORA-01555

September 26, 2012 - 2:26 pm UTC

Reviewer: KLS

Tom,
I have been reading about "ORA-01555 Snapshot too old error" (Chapter 8, Page 287-288) from your book "Expert Oracle Database Architecture Oracle Database 9i, 10g and 11g Programming Techniques and Solutions - Second Edition". I was unable to understand the explanation for the below example in the book. To a certain extent I understand that when you "commit" across fetch there is a possiblity of the undo being overwritten if it is not big enough and it can result in a ORA-01555. But, I could not understand your comments about index range scan and it's impact in the below example and how it results in a ORA-01555 error. Can you please specifically elaborate/explain your comments in the below paragraph, from page 288 in the book? Thank you.

"I get the error. should point out that I added an index hint to the query and a WHERE clause to make sure I was readinf he table randomly (together they caused the cost-based optimizer to read the table "sorted" by the index key). When we process a table via an index, we tend to read a block from a single row..."
example


set echo on
drop table t;
create table t as select * from all_objects;
create index t_idx on t(object_name);
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
create undo tablespace undo_small
datafile '/home/ora11gr2/app/ora11gr2/oradata/orcl/undo_small.dbf'
size 10m reuse
autoextend off
/
alter system set undo_tablespace = undo_small;
begin
for x in ( select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
from t
where object_name > ' ' )
loop
update t
set object_name = lower(x.object_name)
where rowid = x.rid;
if ( mod(x.r,100) = 0 ) then
commit;
end if;
end loop;
commit;
end;
/
alter system set undo_tablespace = UNDOTBS1;
drop tablespace undo_small;

- KLS
Tom Kyte

Followup  

September 27, 2012 - 8:42 am UTC

close your eyes and envision how the data is being read and processed via the index and compare that to how it would be read and processed with say a full scan.

suppose we used a full scan, the query in the loop would read block 1, process all of the rows on block 1 via the update, and then commit. we would *never* come back to block 1, not in our query in the loop or the update. we do not need the undo for block 1 ever again. we would not hit an ora-1555 by accident since we never re-read a block in that case.


now, envision using the index on object_name. further, suppose block one of the table contains objects AAAAAAAAAAAAAAAAAAAAA and ZZZZZZZZZZZZZZZZZZZZZZZ.

now, using the index, we process AAAAAAAAAAAAAAAAAAAAA first, it is the lowest object_name. we then commit that change. the undo we generated for block 1 is free to be overwritten.

now we process all of the other rows, it is much later now. We get to ZZZZZZZZZZZZZZZZZZZZ. the index tells us to read block 1, we do - discover it has been changed, go to roll it back - but the undo we generated for AAAAAAAAAAAAAAAAA is gone! ora-1555.



October 02, 2012 - 3:04 pm UTC

Reviewer: KLS

Tom,
Doesn't the index on object_name store it in alphabetical order in each block? i.e., the index Block "X" would have entries for example rowids which have objects names starting with "A..", "B..", "C.." etc. Still not able to understand why an index block would have entries for example "A.." and "Z.." and object names starting with "B..","C.." etc would be in another index block. Can you please explain?

In a full tablescan how does it store the entries in the block? Are the entries in the block stored randomly by the indexed column or is it alphabetically ordered by the indexed column. Thank you.

- KLS


Tom Kyte

Followup  

October 09, 2012 - 11:40 am UTC

forget the index order, it is all about the TABLE here.


say the first table block contains two rows;

AAAAAAAAAAAAAAAAAA
ZZZZZZZZZZZZZZZZZZ



the index might contain the rows 'sorted', which means that if you use the index, you'll read block 1 to get AAAAAAAAAAA, you'll then read all of the other blocks in the order the index tells you to. And then to get the last row - you'll have to read block 1 again to get ZZZZZZZZZZZZZZZZZZZ.

you'll read and re-read block 1 at the begin and end of the query if you use the index.

October 02, 2012 - 3:07 pm UTC

Reviewer: KLS

A correction in my previuos posting.

In a full tablescan how does it store the entries in the block? Are the entries in the block stored
randomly or is it alphabetically ordered. Thank you.

Tom Kyte

Followup  

October 09, 2012 - 11:43 am UTC

randomly, we just put data where it fits, where there is space by default in a heap based table.


October 11, 2012 - 9:47 am UTC

Reviewer: KLS

1. In the below script, the "select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
from t
where object_name > ' '" uses the index range scan and all the necessary columns are in the index so it does not access the table blocks correct?

2. When the update (update t
set object_name = lower(x.object_name)
where rowid = x.rid;
)is done in the for loop the index also gets updated correct and the before image of the index is kept in the undo tablespace to get a read consitent view when it tries to get the next row to process in the begining of the "for loop" correct?

3. If the above two statements are correct can the read consistent image of the index that is in the undo tablespace at time t1 i.e., when the " for x in ( select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
from t
where object_name > ' ' )
" began be overwritten causing a ORA-01555 "Snapshot too old" error?


set echo on
drop table t;
create table t as select * from all_objects;
create index t_idx on t(object_name);
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
create undo tablespace undo_small
datafile '/home/ora11gr2/app/ora11gr2/oradata/orcl/undo_small.dbf'
size 10m reuse
autoextend off
/
alter system set undo_tablespace = undo_small;
begin
for x in ( select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
from t
where object_name > ' ' )
loop
update t
set object_name = lower(x.object_name)
where rowid = x.rid;
if ( mod(x.r,100) = 0 ) then
commit;
end if;
end loop;
commit;
end;
/
alter system set undo_tablespace = UNDOTBS1;
drop tablespace undo_small;

Thank you,
KLS
Tom Kyte

Followup  

October 11, 2012 - 9:57 am UTC

1) yes, the

2) yes, indexes generate undo just like everything else

3) yes, I must have been getting the 1555 on the index in this case. good point.

thank you

October 11, 2012 - 10:15 am UTC

Reviewer: KLS

Tom, thank you for clarifying. This site has always been the place where I go to if I have questions. It has been tremendously useful to me and I am sure many would feel the same. Thank you for sharing your knowledge and expertise.
KLS

October 11, 2012 - 1:51 pm UTC

Reviewer: A reader

Tom,
When I tested the script I did get the "ORA-01555 "snapshot too old " error before the update in the loop but after "select" in the forloop. I was just wondering if the snapshot error could happen when it does a "select" in the for loop. From your answer I interpret that can happen as well.
Thank you.
Kirthi
Tom Kyte

Followup  

October 11, 2012 - 3:44 pm UTC

what is there before the update but after the select?

it could get an ora-1555 on any DML statement (select, insert, update, delete, merge, etc) at any time, yes.

sanpshot too old

October 19, 2012 - 7:02 am UTC

Reviewer: Sameer from India

We have this procedure which comes out with error snapshot too old ocassionally means once in 10 days or so.
procedure employee is
type t_emp is table of emp_details_1%rowtype;
r_emp t_emp;
cursor c1
is select * from stg_emp_details where trunc(update_dt) = trunc(sysdate-1);
begin
execute immediate 'alter table emp_details_1 nologging';
execute immediate 'truncate table emp_details_1';
open c1;
loop
fetch c1 bulk collect into r_emp limit 50000;
forall i in 1..r_emp.count
insert /*+ append */ into emp_details_1 values r_emp(i);
commit;
exit when r_emp.count < 50000;
end loop;
close c1;
execute immediate 'create or replace view emp_details as select * from emp_details_1';
end;

stg_emp_details table has 20M records and cursor returns around 7-8M records. There are 7 indexes on this tables.
We have tried increasing undo tablespace from 6GB to 10GB and then to 14GB but still we get this error. DBA has declined the request of increasing it further and has suggested to tune the code. We have used append hint as well while inserting, but in vain.Will increasing the limit from 50000 to 100000 help? Could you please suggest what could be the possible reason of snapshot too old in this case and how to avoid it?
Thanks.
Tom Kyte

Followup  

October 23, 2012 - 11:20 am UTC

select * from stg_emp_details where trunc(update_dt) = trunc(sysdate-1);

would be better written as:

select * from stg_emp_details where update_dt >= trunc(sysdate-1) and update_dt < trunc(sysdate);


don't apply a function over and over and over to a column when you really don't need to.


why are you using procedural code?

why isn't this just

truncate table emp_details_1
insert /*+ append */ into emp_details_1 (....) select * from stg_emp_details where update_dt >= trunc(sysdate-1) and update_dt < trunc(sysdate);


why do you create the view over and over and set the table nologging over and over - the view would be created once, the table - upon creation - would be set t nologging?


stop doing the procedural code and your code will in fact run faster, use less resources, be more efficient.


It should only take a few minutes to load such a small bit of data - have you looked at the query that turns 20million records into 7million - how long does it take to execute?

October 30, 2012 - 12:42 pm UTC

Reviewer: A reader

I had put a "dbms_output.putline" statement after the select and before the update

set echo on
drop table t;
create table t as select * from all_objects;
create index t_idx on t(object_name);
exec dbms_stats.gather_table_stats( user, 'T', cascade=>true );
create undo tablespace undo_small
datafile '/home/ora11gr2/app/ora11gr2/oradata/orcl/undo_small.dbf'
size 10m reuse
autoextend off
/
alter system set undo_tablespace = undo_small;
begin
for x in ( select /*+ INDEX(t t_idx) */ rowid rid, object_name, rownum r
from t
where object_name > ' ' )
loop
dbms_output.put_line("here 1");
update t
set object_name = lower(x.object_name)
where rowid = x.rid;
if ( mod(x.r,100) = 0 ) then
commit;
end if;
end loop;
commit;
end;
/
alter system set undo_tablespace = UNDOTBS1;
drop tablespace undo_small;



Tom Kyte

Followup  

October 31, 2012 - 3:57 pm UTC

show the cut and paste from sqlplus of the code (with line numbers) and the full error stack.

the error (1555) is for the select (look in your alert log)

ORA-1555 while sorting

September 27, 2013 - 10:52 am UTC

Reviewer: AJ from UK

Let's say you have a single select statement that returns a large volume of data which requires sorting.
While the query is running a select on v$longops in another session shows first of all a full table scan, and then when this has completed v$longops shows a sort operation on the data, and the sort is estimated to take x number of minutes. My question is this: can ORA-1555 be triggerd during this sort phase? My assumption is that it will not be triggered because at that stage all the data has already been retrieved and is just being sorted...

Effect of indexes on 'snapshot too old'

October 17, 2013 - 1:14 pm UTC

Reviewer: Tony from U.K

Thanks for the explanation about how use of indexes can contribute to 'snapshot too old.

Re: Your book, pg 255, about correct sizing of the undo segments.
Thought it might interest you that we are actually getting this problem in production :>

Cheers

p.s scary that this question was first asked 13 years ago!

Good

January 21, 2014 - 7:39 am UTC

Reviewer: Baiju from INDIA

the information was so good and clear

April 04, 2014 - 5:28 am UTC

Reviewer: pavan from india

awesome answer.

Snapshot Too Old

April 30, 2014 - 4:08 pm UTC

Reviewer: Prithwish from Kolkata, India

Hi Tom,
This topic really explains the details with clarity.
One question

/* CASE 1 - ROLLBACK OVERWRITTEN

This breaks down into two cases: another session overwriting the rollback that
the current session requires or the case where the current session overwrites
the rollback information that it requires. The latter is discussed in this
article because this is usually the harder one to understand.

Steps:

1. Session 1 starts query at time T1 and QENV 50

2. Session 1 selects block B1 during this query

3. Session 1 updates the block at SCN 51

4. Session 1 does some other work that generates rollback information.

5. Session 1 commits the changes made in steps '3' and '4'.
(Now other transactions are free to overwrite this rollback information)

6. Session 1 revisits the same block B1 (perhaps for a different row). */

For Step 6 when the session 1 revisits the same block B1 isn't it at SCN 51.
The steps are done one after the other I believe and after step 3 the SCN has changed to 51.
I know I am missing something here. Can you please tell me what?

What happens if the rollback segment is overwritten

September 17, 2014 - 4:01 pm UTC

Reviewer: A reader from Bethpage, NY, USA

What happens if the rollback segment is overwritten before the second session to visit the committed block?

After the changed block is committed, the data block header still has status of changed and points to the rollback segment but the rollback segment is already overwritten because it's committed. How Oracle deals with this situation?

excellent

December 17, 2014 - 8:01 am UTC

Reviewer: Chandu from India

Really i don't know how can i explain? good i am feeling nurvous to use this small words telling about you

car

October 27, 2016 - 2:50 pm UTC

Reviewer: Debbie from England

Hi I am having trouble fixing the reverse lights for a renault Megan scienic 04 plate please could you help
Connor McDonald

Followup  

October 28, 2016 - 2:04 am UTC

Well.... you bought a Renault. I don't think there is any help for that

:-)

More to Explore

Administration

Need more information on Administration? Check out the Administrators guide for the Oracle Database