Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, ali.

Asked: March 02, 2001 - 11:36 pm UTC

Last updated: March 29, 2011 - 3:43 am UTC

Version: 8.0.5

Viewed 1000+ times

You Asked

Hi sir,

this is ali from hyderabad pakistan. i am little bit confused, how oracle server determine about the changed blocks (dirty blocks) after a checkpoint event by which the DBWR writes dirty buffers in data files and still user do not commit the transaction or rollback. i mean.

1. If user decided to rollback the transaction, what steps server process will take to roll back the transaction effect.

2. And if user decided to commit the transaction , what steps server process will take to make the change permanent, becuase the transaction data already written in data files because of any prvious checkpoint (suppose log file switch).

thanks

ali



and Tom said...

It is complex -- here is a short scenario.

You update 1,000 rows with a single statement. Oracle will:

o generate modified datablocks in the SGA
o generate rollback records in the SGA
o generate redo for BOTH datablocks and rollback in the SGA

We may flush some of the REDO generated to the logfiles during this process (or not). Remember also that before we checkpoint a block -- the REDO that protects that block is flushed first, before DBWR can flush it. There is never a block on disk that doesn't have its redo written first. Let's say the system fails right now in the middle of our update. We know the above 3 things have happened and either:

o some REDO had been written to the logs and NO data/rollback blocks have been. When the system restarts, this REDO will be applied. This will reconstruct the ROLLBACK blocks as well as the modified DATABLOCKS. After we roll forward in recovery, we will roll back using the rollback blocks we just reconstructed. It will be as if the transaction never happened.

o some REDO and SOME data/rollback blocks have been written. Same as above then -- we rollforward, applying the REDO to the blocks that were not checkpointed -- and then we rollback.

o NO redo had been written to the logs. When the system restarts -- it is simply as if the transaction never happened. we are done. no recovery needed.




Now, let's say the system does not fail during our update and it succeeds. We have

o generated redo and it is in the SGA (probably)
o generated modified datablocks and they are in the SGA
o generated rollback records and they are in the SGA.

we commit. LGWR flushes our redo to disk. Thats it -- we've committed at that point. If the system fails right after we commit, when we restart we apply the REDO to redo that transaction -- we have all of the information needed to replay it and we just do.

Hope that helps.


If we do flush some redo to disk and we crash

Rating

  (12 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

number of dirty buffers reaches the a threshold value

ali, March 11, 2001 - 1:12 am UTC

dear sir tom,

infact i was confused by the steps taking when a checkpoint occurs if it reaches its threshold value (how we determine this limit). next thing , is DBWR writes all dirty blocks
(data and rollback) to their respective segments, whether its transaction became committed or not, if is this happend then is DBWR writes again on changed data from rollback segment data, if user rollback the transaction.

thanks

ali

A reader, December 18, 2003 - 2:55 am UTC

greate help. thanks a lot

Threshold size for DB_BUFFER

Jayesh, March 07, 2004 - 7:30 am UTC

Hello Tom,
You did'nt give any reply for specifying the Threshold size. In oracle Book also its mentioned..But did'nt give any valid reasons or any way to specify it..?



Tom Kyte
March 07, 2004 - 9:28 am UTC

you use parameters like fast_start_mttr target, fast_start_io_target (superceded by prior parameter), log checkpoint interval and such to control how aggresive dbwr is at keeping the buffer cache "clean"

Role of rollback/undo during ROLLBACK

Suraj Sharma, December 18, 2006 - 12:40 am UTC

Thanks a lot Tom for this useful reply. I am a very big fan of your and try to find all depth knowledge about Oracle only from your articals. Even I am a bit confused in this case. I am confused about the role of UNDO/Rollback segments during rollback of the transaction. As I can understand from when a block is modified Oracle produces

o generate modified datablocks in the SGA
o generate rollback records in the SGA
o generate redo for BOTH datablocks and rollback in the SGA

I am confused on one thing:
1. As you are saying that "generate rollback records in the SGA", but as-far-as I know rollback records are in rollback segments. Do you want to say that like other dirty blocks these also first goes into memory then to the UNDO/Rollback segment?


Tom Kyte
December 18, 2006 - 8:18 am UTC

rollback is buffered in the cache just like data blocks are.

the rollback blocks are written into the buffer cache and if needed, dbwr will write them to disk - into the rollback segments on disk.

If things aren't on disk yet ..

A reader, December 21, 2006 - 11:03 pm UTC

You stated if we succeed
"o generated redo and it is in the SGA (probably)
o generated modified datablocks and they are in the SGA
o generated rollback records and they are in the SGA."

-- So everything is in the SGA.
"we commit. LGWR flushes our redo to disk."
What if system fails while LGWR is flushing redo to disk.
They you don't have all the redo to redo it, or roll it back, only some of it.

Tom Kyte
December 22, 2006 - 6:24 am UTC

if the system fails before lgwr succeeds then the commit did not happen.

it'll be rolled back upon restart/recovery.

A clarification

A reader, December 23, 2006 - 1:11 am UTC

The undo hasn't been written to disk yet though, it can't roll back. Is it more like - since the redo never made it to disk, when the database is started up, it will ignore this transaction?
Tom Kyte
December 24, 2006 - 9:08 am UTC

if the commit succeeded, the redo DID make it to disk.

If the commit did not succeed all of the way, then some of the redo may or may not have made it - but the commit record definitely did not - so sufficient UNDO to rollback the blocks (remember - they were not checkpointed so they do not necessarily NEED the undo yet) definitely exists.

Rollback

A reader, December 23, 2006 - 9:50 am UTC

That was a pretty good question from Suraj Sharma from India above.

I have some questions regarding that.

1) Is there a sub-cache for rollback/undo data in the buffer_cache ? and if so, what determine its size ?
2) What is the threshold to write buffered undo/data blocks to disk (to undo segment) ?

I understand that writing data to memory directly is more efficient than to disk, just wanted to know if there is more information about this out there.

Thanks, as always.
Tom Kyte
December 24, 2006 - 9:13 am UTC

1) no, they are not treated substantially different from other cached blocks.

2) no, see 1

More Clarification

Matt, December 29, 2006 - 9:14 pm UTC

Hi Tom,

In your initial reply one of the situations you mention is:

o some REDO and SOME data/rollback blocks have been written. Same as above then -- we rollforward, applying the REDO to the blocks that were not checkpointed -- and then we rollback.

I have a concern about what happens if that redo is no longer available such as an incomplete recovery after all members of a redo log group are lost.

Is it possible for some of the data blocks to be written to the data files, but the corresponding rollback blocks to not be written?

Is it possible to end up with a database that contains data from uncommitted transactions when doing an incomplete recovery?

Thanks,

Matt
Tom Kyte
December 30, 2006 - 9:12 am UTC

you would be in a media recovery NOT a crash recovery situation at that point

that is

restore backup
roll forward as far as you can
open database reset logs

Nevermind

Matt, December 29, 2006 - 11:40 pm UTC

Sorry Tom - I realized the question I just asked was stupid. Having an archive log to recover with => log switch => checkpoint.

Matt

Checkpoint queue

bakunian, January 05, 2007 - 11:13 pm UTC

Tom,

I am reading Oracle Press 10G certification all-in-one exam guide which states. "Many people think that log switch triggers a checkpoint. It doesn't. It used to, but that behavior changed with release 8i, when incremental checkpointing was introduced...."

OK I admit I am one of those people I was trying to find any consistent information regarding this in Oracle docs but could not. Could you help to clarify this?
Does it mean that CKPT sees that there are too many dirty buffers in checkpoint queue and signals DBWR to flush?
What is incremental checkpointing anyway?

Thanks in advance.

Tom Kyte
January 06, 2007 - 4:34 pm UTC

well, that is not quite right - it does trigger a checkpoint - incremental checkpointing didn't remove it.

incremental checkpointing is a more "continuous" thing - you use settings such as the fast_start_mttr_target to tell dbwr how clean to keep the buffer cache (to reduce time to recover)

How DBWR flushes multiple versions of the SAME block to disk?

MaxU, March 28, 2011 - 11:20 am UTC

Hi Tom, thanks a lot for your detailed explanations!!!

i do have one question in regards to dirty blocks:

if we have two rows (rowid1 & rowid2) belonging to the same data block in the table "A". And the following happens to those rows and to that block at approx. the same time (within a fractions of second):

1. Session1: update A set col1=1 where rowid=rowid1;
2. Session2: changes yet another row in the SAME block and performs "commit;" (i.e. the block SCN has been changed)
3. Session3: update A set col1=2 where rowid=rowid2;
4. Session1: commit;
5. Session3: commit;

Question: how these two/three different versions of the SAME block in the Buffer Cache will be written to the disk?
Does the DBWR checks the SCN in the ITL (Interested Transaction List) of a dirty block and generates yet another copy of a block (merging the rows with the latest SCN) that can be written to the disk?

Thank you!
Tom Kyte
March 29, 2011 - 3:43 am UTC

we only need to write ONE version to disk - the most 'current' version. the other versions don't have to ever be written.

when we actually modify the rows on a block - we get (always) the MOST CURRENT version and make the changes there. We do not update different versions of the block - the other versions are OLDER versions.

In reality only one session updates a block at any point in time, never two. Only one session gets the block in current mode and updates it at a time.

How DBWR flushes multiple versions of the SAME block to disk?

MaxU, March 28, 2011 - 12:19 pm UTC

in regards to my previous question -
i guess the "DELAYED BLOCK CLEANOUT" would do the job, but what happens to all those different versions of the same block in the buffer cache?

Thank you!