Question and Answer

Tom Kyte

Thanks for the question, parag jayant patankar.

Asked: July 23, 2005 - 1:33 am UTC

Last updated: September 28, 2009 - 8:25 am UTC

Version: 9.2

Viewed 10K+ times

You Asked

Hi Tom,

I recently faced a question from a respected person from Oracle.

He asked me which kinds of statements generate undo. I said insert/update/delete/merge, etc.

Then he asked whether a SELECT statement generates undo. I said no. He said yes. I asked him how. He told me to believe him that it generates undo, and that it was too long a process to explain in our short meeting. (For various reasons I have not been able to contact this person again yet.)

I tried a few tests, but I did not find results that satisfactorily supported his answer.

Do you agree with his answer? Either way, kindly explain.

regards & thanks
pjp

and Tom said...

I think you mean "redo", not UNDO in general.


A select for update will generate both redo and undo.

A read only select will USE undo (for read consistency)

A read only select may GENERATE redo due to block cleanouts.
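As a rough summary of the three cases above, here is a conceptual sketch (illustrative only; the table and the `may_generate_redo` helper are not an Oracle rule set):

```python
# Conceptual summary of the cases above: whether each statement type
# generates undo and redo.  Illustrative sketch, not an exhaustive rule set.
behavior = {
    # statement:                  (generates_undo, generates_redo)
    "insert/update/delete/merge": (True, True),
    "select ... for update":      (True, True),   # locks rows by modifying blocks
    "read-only select":           (False, None),  # USES undo for read consistency;
                                                  # may generate redo via block cleanout
}

def may_generate_redo(stmt):
    """None means 'it depends' (e.g. delayed block cleanout)."""
    return behavior[stmt][1]

print(may_generate_redo("read-only select"))  # None: depends on block cleanouts
```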

Here is an excerpt from my forthcoming book on this topic:

<quote>

Surprising to many people, the SELECT will have generated redo. Not only that, but it will also have 'dirtied' these modified blocks, causing DBWR to write them again. This is due to the block cleanout. Next, I'll run the SELECT once again and we'll see that no redo is generated. This is expected as the blocks are all 'clean' at this point.

ops$tkyte@ORA10G> create table t
2 ( x char(2000),
3 y char(2000),
4 z char(2000)
5 )
6 /
Table created.

ops$tkyte@ORA10G> set autotrace traceonly statistics
ops$tkyte@ORA10G> insert into t
2 select 'x', 'y', 'z'
3 from all_objects
4 where rownum <= 500;
500 rows created.

Statistics
----------------------------------------------------------
...
3297580 redo size
...
500 rows processed
ops$tkyte@ORA10G> commit;
Commit complete.

So, this is our table with one row per block (in my 8 KB block size database). Now we will measure the amount of redo generated during the read of the data:

ops$tkyte@ORA10G> select *
2 from t;
500 rows selected.

Statistics
----------------------------------------------------------
...
36484 redo size
...
500 rows processed

So, this SELECT generated about 35 KB of redo during its processing. This represents the block headers it modified during the full scan of T. DBWR will be writing these modified blocks back out to disk at some point in the future. Now, if we run the query again:

ops$tkyte@ORA10G> select *
2 from t;
500 rows selected.

Statistics
----------------------------------------------------------
...
0 redo size
...
500 rows processed

ops$tkyte@ORA10G> set autotrace off

we see that no redo is generated - the blocks are all clean.
If we were to re-run the above example with the buffer cache set to hold at least 5,000 blocks, we would find that we generate little to no redo on any of the SELECTs - we would not have to clean dirty blocks during either of our SELECT statements. This is because the 500 blocks we modified fit comfortably into 10 percent of our buffer cache, and we are the only user. No one else is mucking around with the data, and no one else is causing our data to be flushed to disk or accessing those blocks. In a live system, it will be normal for at least some of the blocks not to be cleaned out at times.
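The cache-size effect just described can be modeled as a toy calculation. The 10 percent commit-cleanout threshold comes from the text; the 72-byte per-block redo cost is an illustrative assumption (the run above generated roughly 36 KB for 500 blocks):

```python
def redo_on_first_select(blocks_modified, cache_blocks):
    """Toy model of commit-time vs delayed block cleanout.

    Per the text: at COMMIT, Oracle cleans out modified blocks still in
    the cache, up to roughly 10% of the buffer cache.  Blocks beyond that
    are left with a 'dirty header' and are cleaned out (generating a
    little redo each) by the first SELECT that reads them.
    """
    cleaned_at_commit = min(blocks_modified, cache_blocks // 10)
    needs_delayed_cleanout = blocks_modified - cleaned_at_commit
    REDO_PER_CLEANOUT = 72   # bytes; illustrative assumption only
    return needs_delayed_cleanout * REDO_PER_CLEANOUT

# Small cache: most of the 500 modified blocks need cleanout on first read.
print(redo_on_first_select(500, 512))    # 32328

# Cache of 5,000+ blocks: all 500 fit under the 10% threshold -> no redo.
print(redo_on_first_select(500, 5000))   # 0
```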

Where this behavior will most affect you is after a large INSERT (as demonstrated above), UPDATE, or DELETE - one that affects many blocks in the database (anything more than 10 percent of the size of the cache will definitely do it). You will notice that the first query to touch a block afterwards generates a little redo and dirties the block, possibly causing it to be rewritten if DBWR had already flushed it, or if the instance had been shut down, clearing out the buffer cache altogether. There is not much you can do about it; it is normal and to be expected. If Oracle did not do this deferred cleanout of a block, a COMMIT could take as long to process as the transaction itself. The COMMIT would have to revisit each and every block, possibly reading them in from disk again (they could have been flushed).

If you are not aware of block cleanouts and how they work, this will be one of those mysterious things that just seem to happen for no reason. For example, you UPDATE a lot of data and COMMIT. Now you run a query against that data to verify the results. The query appears to generate tons of write I/O and redo. That seems impossible if you are unaware of this - it did to me the first time I saw it. You go and get someone to observe it with you, but it is not reproducible, as the blocks are now 'clean' on the second query. You simply write it off as one of those 'mysteries'.

In an OLTP system, you will probably never see this happening, since those systems are characterized by small, short transactions that affect a few blocks. By design, all or most of the transactions are short and sweet: modify a couple of blocks and they all get cleaned out. In a warehouse where you make massive UPDATEs to the data after a load, however, block cleanouts may be a factor in your design. Some operations create data on 'clean' blocks - for example, CREATE TABLE AS SELECT, direct path loaded data, and direct path inserted data all create 'clean' blocks. An UPDATE, normal INSERT, or DELETE may create blocks that need to be cleaned out on the first read. This could really affect you if your processing consists of:

* Bulk loading lots of new data into the data warehouse.
* Running UPDATEs on all of the data you just loaded (producing blocks that need to be cleaned out).
* Letting people query the data.

You will have to realize that the first query to touch the data will incur some additional processing if the block needs to be cleaned. Realizing this, you should 'touch' the data yourself after the UPDATE. You just loaded or modified a ton of data; you need to analyze it at the very least. Perhaps you need to run some reports yourself to validate the load. This will clean the blocks out and spare the next query from doing so. Better yet, since you just bulk loaded the data, you now need to refresh the statistics anyway. Running the DBMS_STATS utility to gather statistics may well clean out all of the blocks, since it just uses SQL to query the information and would naturally clean the blocks out as it goes along.
</quote>


Comments

block cleanout also for direct-path inserts ?

Alberto Dell'Era, July 23, 2005 - 12:20 pm UTC

> direct path inserted data; they will create 'clean' blocks

Yeahbut (10.1.0.3) :

dellera@ORACLE10> create table t (x char(2000), y char (2000), z char (2000));

Table created.

dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85400156

dellera@ORACLE10>
dellera@ORACLE10> insert /*+ append */ into t (x,y,z) select rownum, rownum, rownum from dual connect by level <= 500;

500 rows created.

dellera@ORACLE10>
dellera@ORACLE10> -- to verify that the insert was a direct-path one
dellera@ORACLE10> -- (comment this out and the results are the same)
dellera@ORACLE10> select count(*) from t;
select count(*) from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85400214

dellera@ORACLE10>
dellera@ORACLE10> commit;

Commit complete.

dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85400216

dellera@ORACLE10>
dellera@ORACLE10> set autotrace on
dellera@ORACLE10> select min(ora_rowscn), max (ora_rowscn) from t;

MIN(ORA_ROWSCN) MAX(ORA_ROWSCN)
--------------- ---------------
85400215 85400215

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=116 Card=1)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=116 Card=543)

Statistics
----------------------------------------------------------
27 recursive calls
1 db block gets
580 consistent gets
500 physical reads
168 redo size


We have 500 blocks (since each row goes in its own block, db_block_size=8K ), and we get 500 physical reads; that confirms that the blocks were written directly to disk, bypassing the buffer cache.

SCN before the commit was 85400214, so the scn of blocks on disk has to be <= 85400214 (if any, perhaps the scn gets written as blank, but that's not important).

When we query, the block scn is 85400215 (commit-time scn - note how it went to 85400216 after the commit); hence, block cleanout was performed ...

Additional twist - the redo size is always 168 bytes, regardless of the number of blocks; I've tried inserting 1000 blocks instead of 500 and it stays at 168. Probably only some administrative information gets logged (the segment header?), but not the block cleanout.

So what does it mean - that block cleanout happens in memory, but it is not logged ?

I'm running in NOARCHIVELOG, that may be important to know.

Please find below the original script for your convenience if you want to experiment - in that case, TIA :)

drop table t;
set echo on

create table t (x char(2000), y char (2000), z char (2000));

select dbms_flashback.get_system_change_number from dual;

insert /*+ append */ into t (x,y,z) select rownum, rownum, rownum from dual connect by level <= 500;

-- to verify that it went in direct-path
-- (comment this out and the results are the same)
select count(*) from t;

select dbms_flashback.get_system_change_number from dual;

commit;

select dbms_flashback.get_system_change_number from dual;

set autotrace on
select min(ora_rowscn), max (ora_rowscn) from t;
set autotrace off

Tom Kyte
July 23, 2005 - 12:53 pm UTC

with the direct path there is no need to clean out the blocks; they do not hit the cache, they go straight to disk.

It looks like you are perhaps using ASSM and seeing some of the ASSM-managed blocks.

The direct path operation put the blocks away cleanly.

Alberto Dell'Era, July 23, 2005 - 1:38 pm UTC

> with the direct path, there is no need to clean out the
> block, they do not hit the cache, they go to disk.

And who updates the block scn to the commit-time scn (85400215) that we see in the select, if *deferred* block cleanout does not happen at select time ?

They can't go to disk knowing the *future* commit scn ... so something has to perform the cleanout.

Tom Kyte
July 23, 2005 - 2:30 pm UTC

you are assuming that in this case a fixed SCN is on the block, rather than just transaction information.

Block cleanout is specifically about "who's got that row locked"; it cleans out the transaction "row locking" information. This case is not about transaction row locking information - the row locking information would not be needed, since these blocks were written above the high water mark (no other transaction could have SEEN them, so there was no need to leave behind an "I've got this row locked, don't touch it").

delayed block cleanout occurs when a select reads a block and it would appear that the row(s) are locked. It looks and says "ah, you were only kidding, they are not; let me clean this out so that subsequent reads don't get fooled like I just did"
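That read-time cleanout can be sketched as a toy model; the `Block` and `committed` structures below are simplifications for illustration, not Oracle's actual block format or transaction table:

```python
class Block:
    def __init__(self, itl_txn=None):
        # itl_txn: id of the transaction recorded as "holding locks" here
        self.itl_txn = itl_txn

committed = set()   # transactions known to have committed

def read_block(block):
    """Reader-side delayed block cleanout, per the description above:
    the block *looks* locked, but the 'lock holder' turns out to be a
    long-committed transaction, so we clean the entry out (dirtying the
    block and generating a little redo) so later readers aren't fooled."""
    cleanout_redo = 0
    if block.itl_txn is not None and block.itl_txn in committed:
        block.itl_txn = None   # "ah, you were only kidding" - clean it out
        cleanout_redo = 72     # illustrative redo cost, bytes
    return cleanout_redo

blk = Block(itl_txn="txn42")
committed.add("txn42")
print(read_block(blk))   # first read cleans out: a little redo (72)
print(read_block(blk))   # second read: block is already clean (0)
```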



Alberto Dell'Era, July 23, 2005 - 2:54 pm UTC

Oh! But yes, I was assuming that ora_rowscn attempted to read the block scn and, noticing that there was a pending TX in the ITL, went to the undo segment header, got the commit scn from there, and updated the block scn (thus dirtying the buffer) so that other TXs would not have to re-get the undo segment header.

Either reading ora_rowscn does not update the block scn at all, or that update doesn't need to be logged.

Anyway it makes sense. Thanks again!

Tom Kyte
July 23, 2005 - 3:15 pm UTC

it would be interesting to see what happens with rowdependencies on :)

only because I know you don't really want to do anything this weekend, I figure if I'm trapped inside doing tech edits ;) misery loves company

Alberto Dell'Era, July 23, 2005 - 5:39 pm UTC

And they say that telepathy doesn't exist - I just got back home wondering about rowdependencies (after a nice Spaghetti alla Carbonara dinner), and I found your followup :)

It changes ALMOST nothing; what changes is that ora_rowscn returns the scn at insert time, not the scn at commit time.

I've perfected the test case in the meanwhile to show dirty blocks from v$bh:

dellera@ORACLE10> alter system flush buffer_cache;

System altered.

dellera@ORACLE10>
dellera@ORACLE10> create table t (x char(2000), y char (2000), z char (2000)) rowdependencies;

Table created.

dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85413293

dellera@ORACLE10>
dellera@ORACLE10> insert /*+ append */ into t (x,y,z) select rownum, rownum, rownum from dual connect by level <= 500;

500 rows created.

dellera@ORACLE10>
dellera@ORACLE10> -- to verify that it went in direct-path
dellera@ORACLE10> -- (comment this out and the results are the same)
dellera@ORACLE10> select count(*) from t;
select count(*) from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85413391

dellera@ORACLE10>
dellera@ORACLE10> commit;

Commit complete.

dellera@ORACLE10>
dellera@ORACLE10> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
85413393

dellera@ORACLE10>
dellera@ORACLE10> select v.dirty, count(*)
2 from v$bh v
3 where v.objd = (select object_id from user_objects where object_name = 'T')
4 group by v.dirty;

DIR COUNT(*)
--- ----------
Y 18

dellera@ORACLE10>
dellera@ORACLE10> set autotrace on
dellera@ORACLE10> select min(ora_rowscn), max (ora_rowscn), count(ora_rowscn) from t;

MIN(ORA_ROWSCN) MAX(ORA_ROWSCN) COUNT(ORA_ROWSCN)
--------------- --------------- -----------------
85413293 85413293 500


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=116 Card=1)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=116 Card=502)

Statistics
----------------------------------------------------------
27 recursive calls
1 db block gets
575 consistent gets
500 physical reads
168 redo size
544 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

dellera@ORACLE10> set autotrace off
dellera@ORACLE10>
dellera@ORACLE10> select v.dirty, count(*)
2 from v$bh v
3 where v.objd = (select object_id from user_objects where object_name = 'T')
4 group by v.dirty;

DIR COUNT(*)
--- ----------
N 499
Y 19

It makes sense: since the main user of ora_rowscn is optimistic locking, it is not important whether the "row birth date" is the commit time or earlier.

Better to freeze it at insert time, and so avoid cleaning up the block when it gets read into the buffer cache...
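That optimistic-locking use of ora_rowscn can be sketched generically; this is the plain version-check pattern with hypothetical names, not Oracle-specific code:

```python
class Row:
    """A row carrying an ora_rowscn-like change marker (toy model)."""
    def __init__(self, data, scn):
        self.data, self.scn = data, scn

def optimistic_update(row, expected_scn, new_data, current_scn):
    """Update only if the row's SCN is unchanged since we read it;
    otherwise someone modified the row under us and we must retry."""
    if row.scn != expected_scn:
        return False           # lost the race: re-read and retry
    row.data, row.scn = new_data, current_scn
    return True

r = Row("v1", scn=100)
seen = r.scn                           # reader notes the SCN at read time
print(optimistic_update(r, seen, "v2", current_scn=200))   # True
print(optimistic_update(r, seen, "v3", current_scn=300))   # False: stale SCN
```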

I'm at your disposal if you have other tests in mind - two hours to kill, and midnight Italy time seems to be perfect for new discoveries :)

Tom Kyte
July 23, 2005 - 6:42 pm UTC

you are using assm are you not..... (assm uses blocks to manage space within the table itself)

Alberto Dell'Era, July 23, 2005 - 6:49 pm UTC

Sure - the default in 10g:

dellera@ORACLE10> select segment_space_management from dba_tablespaces where tablespace_name = 'USERS';

SEGMENT_SPACE_MANA
------------------
AUTO

You mean that the 18 dirty blocks we find before the insert are used for ASSM ?

Tom Kyte
July 24, 2005 - 9:45 am UTC

it is NOT the default in 10g. USERS might have ASSM, but it is not the default in 10gR1 (in R2, yes, but in R1 manual is the default)

ops$tkyte@ORA10G> create tablespace  xxxx datafile size 1m;
 
Tablespace created.
 
 
ops$tkyte@ORA10G> select segment_space_management from dba_tablespaces where tablespace_name = 'XXXX';
 
SEGMEN
------
MANUAL
 



but:

ops$tkyte@ORA10G> exec show_space( 'T' );
Unformatted Blocks .....................               0
FS1 Blocks (0-25)  .....................               0
FS2 Blocks (25-50) .....................               0
FS3 Blocks (50-75) .....................               0
FS4 Blocks (75-100).....................               0
Full Blocks        .....................             500
Total Blocks............................             640
Total Bytes.............................       5,242,880
Total MBytes............................               5
Unused Blocks...........................             122
Unused Bytes............................         999,424
Last Used Ext FileId....................               4
Last Used Ext BlockId...................          47,113
Last Used Block.........................               6
 
PL/SQL procedure successfully completed.


500 blocks full
640 blocks allocated

140 blocks difference between used and allocated
122 blocks "unused"
 18 <<<<== blocks missing in action, used to manage the space....

 

Alberto Dell'Era, July 24, 2005 - 11:31 am UTC

Thanks for the correction, and for the confirmation of the identity of the 18 blocks, both very appreciated, as always..

I was reasoning about the test case with rowdependencies off (it shows the same result as far as v$bh is concerned - 499 non-dirty blocks after the select).

Do you think it may be a reasonable hypothesis that the block scn *is* updated in memory, but the update is not logged and the block is not labeled as dirty?

Reasoning: that way, accessing the same block again will not require getting the segment header again - and if the instance crashes it's not a problem, since the block scn is "redundant"; it can be recalculated from the ITL entries.

Tom Kyte
July 24, 2005 - 12:45 pm UTC

I would conceptually consider the "block scn" like a rowid or block address - something that can be inferred from other data. That's about as deep as I've gone with it personally at this point (chapter 11 today... indexes... have a benchmark to redo since I mucked up some assumptions the first time around :)

Alberto Dell'Era, July 24, 2005 - 3:38 pm UTC

There's for sure a cache that remembers the scn of a committed tx - otherwise we'd see 500 gets from the rollback segment header (one for every block) instead of 2 in the test case below.

Interesting ... I'm going out for a quick Garden Chat, so if something fun to perform in sqlplus pops up in your mind, let me know and I will do it ... at midnight, as usual ;)

dellera@ORACLE10> -- "disable" dynamic sampling
dellera@ORACLE10> alter session set optimizer_dynamic_sampling = 1;

Session altered.

dellera@ORACLE10>
dellera@ORACLE10> create table t (x char(2000), y char (2000), z char (2000));

Table created.

dellera@ORACLE10>
dellera@ORACLE10> insert /*+ append */ into t (x,y,z) select rownum, rownum, rownum from dual connect by level <= 500;

500 rows created.

dellera@ORACLE10>
dellera@ORACLE10> -- to verify that it went in direct-path
dellera@ORACLE10> -- (comment this out and the results are the same)
dellera@ORACLE10> select count(*) from t;
select count(*) from t
*
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


dellera@ORACLE10>
dellera@ORACLE10> -- save undo segment number of tx (the one that goes in the ITL)
dellera@ORACLE10> col XIDUSN new_value XIDUSN
dellera@ORACLE10> select XIDUSN from v$transaction where ses_addr = (
2 select saddr from v$session where sid = (
3 select sid from v$mystat where rownum=1
4 )
5 );

XIDUSN
----------
1

dellera@ORACLE10>
dellera@ORACLE10> commit;

Commit complete.

dellera@ORACLE10>
dellera@ORACLE10> col gets new_value gets
dellera@ORACLE10> select sum(gets) gets from v$rollstat where usn = &XIDUSN.;
old 1: select sum(gets) gets from v$rollstat where usn = &XIDUSN.
new 1: select sum(gets) gets from v$rollstat where usn = 1

GETS
----------
2536

dellera@ORACLE10>
dellera@ORACLE10> set autotrace on
dellera@ORACLE10> select min(ora_rowscn), max (ora_rowscn), count(ora_rowscn) from t;

MIN(ORA_ROWSCN) MAX(ORA_ROWSCN) COUNT(ORA_ROWSCN)
--------------- --------------- -----------------
85432143 85432143 500


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=2 Card=82)




Statistics
----------------------------------------------------------
1 recursive calls
1 db block gets
508 consistent gets
500 physical reads
168 redo size
544 bytes sent via SQL*Net to client
512 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

dellera@ORACLE10> set autotrace off
dellera@ORACLE10>
dellera@ORACLE10> select sum(gets) - &gets. gets_from_rollheader from v$rollstat where usn = &XIDUSN.;
old 1: select sum(gets) - &gets. gets_from_rollheader from v$rollstat where usn = &XIDUSN.
new 1: select sum(gets) - 2536 gets_from_rollheader from v$rollstat where usn = 1

GETS_FROM_ROLLHEADER
--------------------
2

UNDO & REDO

Vikram, July 25, 2005 - 1:15 am UTC

Tom,

You were talking about UNDO and REDO.
Can you please shed some light on what UNDO and REDO are,
and how they affect the performance of queries?

Thanks,
Vikram


Tom Kyte
July 25, 2005 - 7:32 am UTC

Do you have access to Expert One on One Oracle? I spent a chapter on undo and redo.

they are what their names imply, undo is used to undo changes (it is used to ROLLBACK)

redo is used to recover changes after a failure (redo log)


Redo is the information Oracle records in online (and archived) redo log files in order to "replay" your transaction in the event of a failure. Undo is the information Oracle records in the undo segments in order to reverse, or roll back, your transaction.


Undo is also used to process queries:
http://docs.oracle.com/docs/cd/B10501_01/server.920/a96524/c01_02intro.htm#44618
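Those two definitions can be illustrated with a toy model: redo as a forward log that replays changes during recovery, undo as before-images applied on rollback. This is a conceptual simulation, not Oracle's implementation:

```python
db = {"x": 1}
redo_log = []   # forward images: replayed during recovery
undo_seg = []   # before images: applied on ROLLBACK

def update(key, new_value):
    undo_seg.append((key, db[key]))    # record how to reverse the change
    redo_log.append((key, new_value))  # record how to replay the change
    db[key] = new_value

def rollback():
    """Undo: reverse the changes, newest first."""
    while undo_seg:
        key, old = undo_seg.pop()
        db[key] = old

def recover(from_backup):
    """Redo: replay the log against a restored copy of the database."""
    restored = dict(from_backup)
    for key, val in redo_log:
        restored[key] = val
    return restored

backup = dict(db)
update("x", 2)
update("x", 3)
print(recover(backup))   # {'x': 3}  - redo replays the transaction
rollback()
print(db)                # {'x': 1}  - undo reverses it
```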

Simply Incredible

Rob H, August 08, 2005 - 4:56 pm UTC

This article should be locked. It is an incredible example of many features/internals of oracle. It is very well documented and provides good examples.

Well done Al and Tom

Does Select generate Undo?

D.Sasi Kumar, August 09, 2005 - 8:03 am UTC

It is very nice. Yesterday evening I met a senior DBA, and during our discussion he raised a question about rollback segments:
"Of the DML commands insert and delete, which one will generate more rollback (undo)?"
I wasn't able to answer him. Can you please help me?

Tom Kyte
August 09, 2005 - 10:21 am UTC

The answer is "it depends" -- and you can reason through the cases.

a) insert, what must be placed into undo for an insert? DELETE+ROWID
b) delete, what must be placed into undo for a delete? INSERT+row_data

c) caveat: indexes. Compare an insert against an indexed table to a delete against an unindexed table....


In general, the insert generates the least UNDO, the delete generates the most UNDO.


And if you count a direct path insert - insert /*+ APPEND */ - it can SKIP undo generation for the table (though not for the indexes)
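That a/b reasoning can be put into rough numbers. The byte figures below are assumptions for illustration, not Oracle's actual undo record sizes:

```python
ROWID_BYTES = 10    # assumed size of a rowid reference (illustrative)
HEADER_BYTES = 40   # assumed fixed per-record undo overhead (illustrative)

def undo_for_insert(row_bytes):
    # Undo for an INSERT is essentially "DELETE + rowid":
    # no row data needs to be preserved, only where to delete.
    return HEADER_BYTES + ROWID_BYTES

def undo_for_delete(row_bytes):
    # Undo for a DELETE is essentially "INSERT + row data":
    # the full before-image of the row must be preserved.
    return HEADER_BYTES + row_bytes

row = 606   # e.g. a ~600-byte row
print(undo_for_insert(row), undo_for_delete(row))   # 50 646
```

With any realistic row size, the delete generates far more undo than the insert, matching the "insert least, delete most" rule above.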

A reader, August 12, 2005 - 2:27 pm UTC


A reader, August 22, 2005 - 11:56 am UTC

Tom,
Why does a SELECT FOR UPDATE generate undo?
I noticed that opening a cursor with a FOR UPDATE took a long time, as did cancelling the query. What does the SELECT do that generates this rollback (undo)?

Tom Kyte
August 23, 2005 - 8:58 am UTC

because select for update locks the rows by modifying the blocks; locks are attributes of the data, stored on the block itself.

the select for update will always lock all of the rows before returning the first row to you.

dbwr wrote data block twice?

John, October 14, 2005 - 3:20 pm UTC

i am a bit confused after reading the discussion about "delayed block cleanout" here and also Note 40689.1 on Metalink.

isn't delayed block cleanout purely about updating the data block header to reflect the fact that the update has already been committed?

you said that DBWR will write those data blocks to disk again - but doesn't DBWR only update the datafiles upon checkpoint? will DBWR write twice for each commit (updating the datafile a second time during delayed block cleanout)?

thanks for any response... wish you a pleasant trip in Montreal.


Tom Kyte
October 14, 2005 - 5:58 pm UTC

delayed block cleanout is about updating the block header to cleanout the committed transaction.


it typically happens when the block isn't to be found in the buffer cache anymore when we commit (DBWR already wrote it out). It can also happen when you update tons of rows (even if they are in the cache, we don't revisit them all at commit time).

So, the block gets written to disk with the "dirty header".

later you read the block in.

We clean it out (the block header) and dbwr will ultimately checkpoint the block back to disk.
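The write-twice sequence in this answer can be traced as a toy event timeline (a simplification for illustration, not Oracle's actual buffer-cache mechanics):

```python
disk_writes = 0
committed = set()

class Buf:
    def __init__(self):
        self.dirty = False
        self.pending_txn = None   # transaction info left in the block header

def dbwr_checkpoint(buf):
    """DBWR writes dirty buffers out at checkpoint time."""
    global disk_writes
    if buf.dirty:
        disk_writes += 1
        buf.dirty = False

buf = Buf()

# 1. A transaction modifies the block.
buf.pending_txn, buf.dirty = "t1", True

# 2. DBWR checkpoints before the commit: write #1, header still "dirty".
dbwr_checkpoint(buf)

# 3. The transaction commits; the block was already written out,
#    so the cleanout is deferred.
committed.add("t1")

# 4. Later, a read sees t1 in the header, finds it committed,
#    and cleans it out - dirtying the buffer again.
if buf.pending_txn in committed:
    buf.pending_txn, buf.dirty = None, True

# 5. Next checkpoint: write #2, the cleaned block goes back to disk.
dbwr_checkpoint(buf)
print(disk_writes)   # 2
```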



OK

Ramkumar, October 15, 2005 - 8:24 am UTC

Hi Tom,
Is it possible to find out
"what are the tables available in the buffer cache"??

Any Dictionary views available??

Tom Kyte
October 15, 2005 - 9:54 am UTC

v$bh shows blocks in the cache. objd is the data object id of the block - can be joined to various views to see the object the block belongs to.

Dirty header

Saibal Ghosh, October 29, 2005 - 12:50 pm UTC

<So, the block gets written to disk with the "dirty header".>

What exactly is the dirty header in this case? Does it consist ONLY of the ITL and row level locks not yet cleaned out?
It would be a great help if you could explain a bit about how the equivalent of delayed_logging_block_cleanouts works in Oracle 9i/10g.
Thanks in advance

Tom Kyte
October 29, 2005 - 1:36 pm UTC

I do - in my book in some detail (either of Expert One on One Oracle or Expert Oracle Database Architecture (the 9i/10g update))

It really hasn't changed - it just means the block was written to disk saying "transaction 42 is working on this" and on the next read, we must figure out that "transaction 42 completed weeks ago".

It is basically the ITL entries that say "rows so and so are locked by transaction this and that" and then finding out the transactions are long gone.

delayed_logging_block_cleanouts

Saibal Ghosh, October 29, 2005 - 6:30 pm UTC

Tom,
Thanks for your quick response. I have your book Expert One-on-One (but not the updated version) and have read it in some detail.
I am trying to upgrade my understanding of delayed block cleanouts. In prior versions (until 8i, I believe) delayed_logging_block_cleanouts defaulted to true, and the logging of the redo was delayed until another change was made in the current buffer. If the parameter was set to false, redo was logged immediately, the buffer was marked dirty,
and DBWn wrote the cleaned-out block back to disk.
I am working with a 9i database and was unable to find the
delayed_logging_block_cleanout parameter. Have things changed significantly regarding the delayed logging of block cleanouts in Oracle 9i/10g? Have you dealt with this in your updated version of Expert One-on-One? If so, I need to get hold of that book.
On another note, your itinerary on your home page never seems to mention India! Do you have any plans of coming here in the foreseeable future, or will we forever have to make do with the MS guys promoting SQL Server (and, no doubt, denigrating Oracle) to the government and industry?

Tom Kyte
October 30, 2005 - 3:49 am UTC

that parameter disappeared with 8i (not "till 8i", but rather "till 8.0")

It is as if it were set to true from 8i on.



My travels actually do not include *most* of the world - so don't feel that I'm singling out India. There are definitely more places I haven't been to than I've been to.

Block cleanouts

Saibal Ghosh, October 31, 2005 - 6:58 am UTC

Hi Tom,
Sorry to bother you again on the same topic, but I seem to be having some problems with it. In your book Expert One-on-One, on pages 178-179, you have shown that if the sum of the blocks we modify is less than 10% of the buffer cache size, and the modified blocks are still in the buffer, then the blocks are cleaned out by the commit and the subsequent select statement does not generate any redo. You proved it with an example. But when I tested it, I found that I am getting redo in both cases - whether the blocks modified are less than 10% of the buffer cache size or not. Am I doing something wrong in the following test?
SQL>sho parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_cache_size                        big integer 4194304
SQL>sho parameter db_block_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192
SQL>create table test_table(
  2  x char(200),
  3  y char(200),
  4  z char(200))
  5  pctfree 99
  6  pctused 1;

Table created.

SQL>insert into test_table select rownum,rownum,rownum from dba_objects
  2  where rownum<1001;

1000 rows created.

SQL>analyze table test_table compute statistics;

Table analyzed.

SQL>select blocks,empty_blocks, avg_row_len from dba_tables
  2  where table_name='TEST_TABLE';

    BLOCKS EMPTY_BLOCKS AVG_ROW_LEN
---------- ------------ -----------
      1000           23         606

SQL>select ((1023*606)/4194304)*100 from dual;

((1023*606)/4194304)*100
------------------------
              14.7804737--> percentage of buffers modified

SQL>commit;

Commit complete.

SQL>exec redo--a small procedure that measures total redo
1151348

PL/SQL procedure successfully completed.

SQL>set autotrace traceonly
SQL>select * from test_table;

1000 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=98 Card=1000 Bytes=6
          00000)

   1    0   TABLE ACCESS (FULL) OF 'TEST_TABLE' (Cost=98 Card=1000 Byt
          es=600000)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1000  rows processed

SQL>exec redo
1152888--as expected total redo has increased from the 
       --previous figure
PL/SQL procedure successfully completed.

SQL>set autotrace off
SQL>truncate table test_table;

Table truncated.

SQL>insert into test_table select rownum,rownum,rownum from dba_objects
  2  where rownum<301;

300 rows created.

SQL>analyze table test_table compute statistics;

Table analyzed.

SQL>select blocks,empty_blocks, avg_row_len from dba_tables
  2   where table_name='TEST_TABLE';

    BLOCKS EMPTY_BLOCKS AVG_ROW_LEN
---------- ------------ -----------
       300           83         606

SQL>select ((383*606)/4194304)*100 from dual;

((383*606)/4194304)*100
-----------------------
             5.53364754--percentage of buffers modified

SQL>commit;

Commit complete.

SQL>exec redo
1532444

PL/SQL procedure successfully completed.

SQL>set autotrace traceonly
SQL>select * from test_table;

300 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=30 Card=300 Bytes=18
          0000)

   1    0   TABLE ACCESS (FULL) OF 'TEST_TABLE' (Cost=30 Card=300 Byte
          s=180000)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        300  rows processed

SQL>exec redo
1533984--REDO HAS INCREASED.WHY??

PL/SQL procedure successfully completed.

SQL>select * from test_table;

300 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=30 Card=300 Bytes=18
          0000)

   1    0   TABLE ACCESS (FULL) OF 'TEST_TABLE' (Cost=30 Card=300 Byte
          s=180000)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        300  rows processed

SQL>exec redo
1535376--AGAIN THERE IS AN INCREASE IN REDO!

PL/SQL procedure successfully completed.
 

Tom Kyte
November 01, 2005 - 3:24 am UTC

you are running as SYS - stop it, just don't do that, don't ever do that, not to test, not to implement. use it to startup, shutdown, upgrade and that is about it.

did you notice all of the zeroes?????????


do NOT use sys.

also, you have only 512 blocks - 512 blocks! tiny, teeny. 10% of 512 is 51.2 - and 300 is much greater than 51.

10% of the buffer cache would be 51.2 blocks - the threshold is counted in blocks, so your math using the 4mb byte figure is incorrect.
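The arithmetic behind that correction can be sketched in SQL (assuming the poster's setup: a 4 MB buffer cache and an 8 KB block size):

```sql
-- commit cleanout covers roughly 10% of the buffer cache, counted in
-- blocks, not bytes: a 4 MB cache of 8 KB blocks is only 512 blocks
select (4*1024*1024) / (8*1024)        as cache_blocks,       -- 512
       0.10 * (4*1024*1024) / (8*1024) as cleanout_threshold  -- 51.2
  from dual;
```

Since 300 modified blocks is well over the ~51-block threshold, most of them are left for delayed cleanout by the next read.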

What about INDEX segment blocks?

Robert, November 01, 2005 - 12:06 pm UTC

Tom,

Is it possible for INDEX segment blocks to be a part of a delayed block cleanout situation?

Thanks,

Robert.

Tom Kyte
November 02, 2005 - 4:45 am UTC

sure, anything can be - the blocks just need to not be in the buffer cache by the time you commit (eg: they already got checkpointed), or you did a really big transaction.

Can INDEX block cause Delayed Block Cleanout ORA-1555?

Robert, November 02, 2005 - 10:09 am UTC

Tom,

The common wisdom for preventing Delayed Block Cleanout (DBC) after a large update is to do a full table scan on the table in order to touch all the blocks and thereby clean them out.

BUT...

since INDEX blocks can be part of DBC...
and since the full table scan may not touch the affected index blocks...

THEREFORE...

One may perform a full table scan after a large update and STILL get an ORA-1555 error due to Delayed Block Cleanout (!?)

Am I correct?

Thank you,

Robert.

Tom Kyte
November 03, 2005 - 5:13 am UTC

it could happen, but should be improbable unless you are forcing the use of indexes when they are not appropriate :)

it would take a long running query and my long running queries shouldn't be doing big nested loop operations.


The 1555 from the block cleanout is most often observed on a consistent export of the data after a load (i'd just skip the export myself) - that doesn't use indexes.


you can index fast full scan the indexes as well.
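As a sketch of that last point (the table and index names here are made up), an INDEX_FFS hint forces a fast full scan, which visits every block of the index and so performs any pending delayed block cleanout on them:

```sql
-- hypothetical table t with index t_idx; an index fast full scan reads
-- every index block, cleaning out any that still carry open ITL entries
select /*+ index_ffs(t t_idx) */ count(*)
  from t;
```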

Question on cleanout

Arul Ramachandran, February 01, 2006 - 8:11 pm UTC

Hi Tom,

<quote>
Better yet, since you just bulk loaded the data, you now need to refresh the statistics anyway. Running the DBMS_STATS utility to gather statistics may well clean out all of the blocks as
well as it just uses SQL to query the information and would naturally clean them
out as it goes along.
</quote>

After a huge update on a very large table, if I run dbms_stats for a 5% sample, will this clean all the required blocks of the table?

Thanks and regards,
Arul

Tom Kyte
February 02, 2006 - 11:48 am UTC

no, it would not hit every block.
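To actually touch every block of the table you would need a full scan or a 100% sample; a hedged sketch (BIG_TABLE is a hypothetical name):

```sql
-- a full table scan visits every table block below the high water mark,
-- cleaning out any blocks left dirty by the large update
select /*+ full(t) */ count(*) from big_table t;

-- gathering statistics with a 100% sample likewise reads every block
exec dbms_stats.gather_table_stats( user, 'BIG_TABLE', estimate_percent => 100 )
```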

A reader

HK, October 27, 2006 - 7:56 am UTC

Tom,

At the beginning of this discussion you stated that SELECTs (pure read only) also generate redo. Fine.

Suppose I have a long running report job which does only SELECTs (no updates). Since it runs for a long time (say more than 5 hours), I suppose it would have generated some redo. My question is: what happens to all this generated redo if I kill the long running job?



Tom Kyte
October 27, 2006 - 9:12 am UTC

it MAY generate redo, not will.

redo, once generated, is just "there". It goes into the online redo logs and eventually is turned into archived redo. Not sure what you mean by "what happens to it" - it was generated, it just exists.

why is the delayed block cleanout protected by redo?

Jan, November 28, 2006 - 6:58 pm UTC

Hi Tom,

when delayed block cleanout happens, the TX info is cleared and the block is marked dirty.

So far so clear. What I don't understand is why redo generation is necessary?

Let's SUPPOSE the redo entry for this change were not written, and that the instance crashed right after the block cleanout but before the db writer could write the dirty block to its datafile. The cleaned block would be "lost" in this case.

But the cleanout could be re-done anytime in the future when somebody reads this block again.

So why is this protected with redo?

The only reason I can think of is that Oracle will do this cleanout again when doing cache recovery right after the crash, so a future user session will not be bothered with cleaning again. It doesn't seem likely to me, as Oracle doesn't crash often.

Can you explain please?

Thank you
Jan

Tom Kyte
November 28, 2006 - 8:10 pm UTC

because the block is dirtied.

and that causes redo.


you need a nice sequence of replayable events upon a restart. oracle might not crash but power goes out.

Just another reason for redo in queries

Gints Plivna, February 18, 2007 - 1:47 pm UTC

This thread has evolved into a great thread about delayed block cleanout, but there is at least one more reason for redo generation on selects: when someone uses the subquery factoring clause and Oracle either decides to materialize the subquery into a temp table or is forced to do so via a hint, for example:
SQL> ed
Wrote file afiedt.buf

  1  with a as (select /*+ materialize */count(*) cnt from dual)
  2* select cnt from a
SQL> /
<first time execution skipped>
SQL> /

       CNT
----------
         1


Execution Plan
----------------------------------------------------------
Plan hash value: 2378808713

----------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |     1 |    13 |     4   (0)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   2 |   LOAD AS SELECT           |                           |       |       |            |          |
|   3 |    SORT AGGREGATE          |                           |     1 |       |            |          |
|   4 |     FAST DUAL              |                           |     1 |       |     2   (0)| 00:00:01 |
|   5 |   VIEW                     |                           |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6643_4DFADF |     1 |    13 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          2  recursive calls
          8  db block gets
          5  consistent gets
          1  physical reads
        648  redo size
        406  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> ed
Wrote file afiedt.buf

  1  with a as (select count(*) cnt from dual)
  2  select cnt from a
  3  union all
  4* select cnt from a
SQL> /

       CNT
----------
         1
         1


Execution Plan
----------------------------------------------------------
Plan hash value: 2435678466

----------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |     2 |    26 |     4  (50)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   2 |   LOAD AS SELECT           |                           |       |       |            |          |
|   3 |    SORT AGGREGATE          |                           |     1 |       |            |          |
|   4 |     FAST DUAL              |                           |     1 |       |     2   (0)| 00:00:01 |
|   5 |   UNION-ALL                |                           |       |       |            |          |
|   6 |    VIEW                    |                           |     1 |    13 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6644_4DFADF |     1 |    13 |     2   (0)| 00:00:01 |
|   8 |    VIEW                    |                           |     1 |    13 |     2   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6644_4DFADF |     1 |    13 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          2  recursive calls
          8  db block gets
          8  consistent gets
          1  physical reads
        648  redo size
        442  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

SQL>

How can UNDO per session be measured

Mathias, April 04, 2007 - 9:29 am UTC

Redo can be found per session, but where do I find UNDO stats? I can find it for a given transaction, but if I want to get the total for all transactions a session has processed, where would I look then?
Tom Kyte
April 04, 2007 - 11:02 am UTC

look at the session statistic "undo change vector size"

Can you explain the unit?

Mathias, April 04, 2007 - 11:54 am UTC

Tom,

Thanks for the hint. I noticed the existence of that statistic, but could not find any information about it.

Is my (newfound) assumption correct that this is the undo data generated by a session (across many transactions), measured in bytes? If not, what is the unit?
Tom Kyte
April 04, 2007 - 12:09 pm UTC

it is bytes, yes.
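A minimal query to see that statistic for your own session (this assumes you have select access on the V$ views):

```sql
-- cumulative bytes of undo generated by the current session
select n.name, m.value as undo_bytes
  from v$mystat   m,
       v$statname n
 where m.statistic# = n.statistic#
   and n.name = 'undo change vector size';
```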

Merin, October 01, 2007 - 8:41 am UTC

This is a select on an 18 million row table with lots of inserts & deletes. The explain plan is a full table scan with a cost of 405,000. Oracle is 9.2.0.6 on AIX

select count(*) from tracker_message_part
where jmspriority <> 4;

(running approx 30 minutes to read 4.3 million blocks of data).

It generates this autotrace:

db block gets: 0
Consistent gets: 4.5 million
Phys reads: 4.3 million
Redo: 12.3 million

Can you explain the (to me) very high redo size? What is going on here? Are we reading from the rollback segments? Or is that the difference of 200,000 (4.5 - 4.3)? Or?

best regards
Merin

Do you have a drawing of this "late cleanout of dirty"? Maybe it's easier to comprehend in a drawing :-)


Tom Kyte
October 03, 2007 - 2:48 pm UTC

but I explained it as verbosely as I can imagine in the original answer?

basically, the blocks were put to disk dirty by dbwr.

the next thing that reads them cleans them up.

that is the redo you see generated.

Statistics name - immediate (CR) block cleanout applications

Hee, January 14, 2009 - 8:47 am UTC

Dear Tom,
For Oracle 10.2, may I know: is this statistic the number of blocks cleaned out while creating read-consistent versions?

Thanks in advance.

number of times = number of blocks?

Winston, February 09, 2009 - 1:55 pm UTC

doc says " Number of times cleanout records are applied immediately during consistent-read requests"

Does it mean each time only one block is applied?
Tom Kyte
February 09, 2009 - 7:39 pm UTC

well, there is a chance that a single block could be cleaned out over and over - it would be rare, but it could happen.

but it could say "number of times we cleaned out a block immediately during .."

A reader, July 25, 2009 - 10:52 pm UTC

Sir
I have some issues regarding to grasp block cleanout issue:

1-)
In your book you mentioned that

"This action generates redo and causes the block become dirty if it wasnt already"

After, insert,update,delete, block become dirty immediately.How come it cannot be dirty?


2-)
Sir, you also mentioned:
"the select generated redo, and it will also have dirtied the modified blocks, causing DBWR to write them again."

Suppose we modified blocks exceeding 10 percent of the buffer cache,
and now these blocks are dirty.
How can the select dirty these modified blocks?
As far as I know, they immediately become dirty during the transaction.
And since there is no checkpoint, how can dbwr write them again after the select?

Tom Kyte
July 26, 2009 - 8:13 am UTC

1) we are talking about selecting the block, not modifying it.

If the block is cleaned out on commit, then when flushed to disk it is "clean" - the next time it is read, there is no work to be done.

As I wrote however, if you modify LOTS of blocks - when you commit, we do not clean them all out, we do not mark them as "transaction complete". A subsequent SELECT operation will do that.

2) You have the order of operation wrong here. The premise is:

a) you did a modification
b) you modified more than 10% of the cache size
c) therefore, many of the blocks you modified WILL NOT be cleaned out

and

d) therefore, the next read of those blocks will.


and the act of cleaning them out - dirties them just like a modification would and dbwr will in fact flush them to disk later.

A reader, August 08, 2009 - 4:08 am UTC

Hello Tom;
I have just read an article about commit cleanout.
Does commit generate redo?
"When you issue the commit, your session will update its transaction table slot (which is in an undo segment header block), generating a little redo to describe this action."

Tom Kyte
August 11, 2009 - 12:59 pm UTC

every change to a block would generate redo (exception: non-logged bulk operations - but they would be the exception)
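A small sketch of how you could observe this yourself (table T here is hypothetical; the query reads the session's cumulative 'redo size' statistic from V$MYSTAT):

```sql
-- 1) note the session's redo size
select m.value
  from v$mystat m, v$statname n
 where m.statistic# = n.statistic# and n.name = 'redo size';

-- 2) make a change, note the redo size again, then commit
update t set x = x where rownum = 1;
commit;

-- 3) re-query: the commit itself added a small amount of redo,
--    from updating the transaction slot in the undo segment header
select m.value
  from v$mystat m, v$statname n
 where m.statistic# = n.statistic# and n.name = 'redo size';
```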

A reader, September 05, 2009 - 2:32 am UTC

Respected Sir;

Assume pctfree is 20%.
Initrans is 1.
Maxtrans is 255.

When the block is 80% full, does it mean that there will be only
1 ITL entry, and therefore only one transaction at a time?
In other words, only one row can be modified at a time,
and all others have to wait...
Tom Kyte
September 06, 2009 - 9:21 am UTC

initrans in current versions is always 2 at least, always - for all segment types now.

when the block is 80% full - it is 20% empty, giving you plenty of room to add more ITL entries if need be.

A reader, September 11, 2009 - 7:53 pm UTC

Thanks Sir,
Does it mean that ITL entries are stored in the pctfree part of the block?
Tom Kyte
September 14, 2009 - 1:09 pm UTC

no, it is stored in the block header which can grow if there is free space.

pctfree is used by insert: prior to inserting a row, we verify that the insert will leave at least "pctfree" of the block free. Eventually a block is taken off of the freelist with "about" pctfree space left.

The block data can grow into this pctfree
The block header can grow into this pctfree

but it isn't really a "pctfree part"; pctfree is just a number that tells us when to take the block off of the freelist - it isn't a reserved area of the block.
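For reference, both attributes are set at segment creation time; a minimal sketch (the table name is made up):

```sql
create table t
( x number )
pctfree  20   -- stop inserting into a block once it is ~80% full
initrans 2;   -- pre-allocate two ITL slots in each block's header
```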

A reader, September 19, 2009 - 4:06 am UTC

Is the block header a fixed size?
If there is not enough space in the block header, can Oracle extend it?


Tom Kyte
September 28, 2009 - 8:25 am UTC

there is information in the block header (the ITL - interested transaction list) that can grow if there is room - so yes, the block header can grow over time.

Donatello Settembrino, October 09, 2009 - 9:55 am UTC

Hi Tom,
I have the following problem about performing a procedure PL/SQL like:

procedure X
as
  cursor c is
    select … from t1 where …;
begin
  for r in c
  loop
    select … into v1 from t2 where …;
    select … into v2 from t3 where …;
    select … into v3 from t4 where …;
    select … into v4 from t5 where …;
    select … into v5 from t6 where …;
    select … into v6 from t7 where …;

    insert /*+ append */ into t8 …;
  end loop;
end; /* procedure */


Until recently the procedure finished in about a day. For roughly the past 4 months it has required 5 days to produce the result. Note that the table referenced by the cursor has more than 1 billion rows (partitioned table, statistics up to date).
Doing some checking I noticed that the cursor (on table t1) reads data from undo, and that created some confusion for me. I expected this would not happen, for the simple fact that the table referenced by the cursor isn't updated by the procedure.
The table t1 is loaded the week before the execution of my procedure with an insert /*+ append */.
My question is this: in this situation, can a select generate undo?
I tried to do a block cleanout before executing the cursor, but this did not solve the problem.
Even doing a shutdown after loading table t1 and rerunning the procedure, the cursor's problem persists.
Please, can you help me resolve this doubt?


Thank you a lot

Donatello

A reader, October 10, 2009 - 2:51 am UTC

Respected Sir

1-)
I know that ITL entries are stored in the block header.
If I specify a big value for initrans and there is no available space in the block header,
does Oracle allocate more space to the block header, or return an error?

2-)
Is there anything stored in the block header other than the ITL?
