Thanks for the question, Sudhir.

Asked: October 28, 2001 - 8:41 pm UTC

Last updated: November 06, 2012 - 10:45 am UTC

Version: 8.1.7.2

Viewed 10K+ times! This question is

You Asked

If I see redo logs are causing log sync wait event on the application processes, what could be the cause?

Too often commits, too small log buffer (1M at present on solaris 2.6), some resource limit? How does one figure if 1M log buffer is sufficient (its filling up in less than 3sec).

what are some queries I can run to find what should be minimum log buffer size. Also, having data files, etc on the redo log drive - would this seriously impact performance of the LGWR and consequently application?

Thanks very much in advance.

and Tom said...

log sync waits are most commonly caused by frequent commits. When you commit, you must wait for LGWR to flush the contents of the redo log buffer to disk (else we could not assure you of recovery).

A small log buffer would manifest itself as a LOG_BUFFER_SPACE wait.

Having data files and other things on the same device as your online redo log can definitely impact log writers ability to do its job as fast as possible.

Things you can do:

o look at your transactions. If you are "over committing" -- commiting in the mistaken belief that you are saving resources and not on true transactional boundaries -- stop it. Commit only when your transaction is over and do not commit too frequently (if you have very small transactions, so be it -- you MUST commit but if you are committing just for the sake of committing, don't)

o speed up LGWR. Make it more efficient. Ensure you are not using something like RAID 5 for log devices. Reduce contention on that device (let LGWR own it). Ensure LGWR and ARCH are not contending with eachother (you want at lead 5 devices dedicated to logging -- you NEED to mirror redo members and you NEED to archive. So, if you have disks 1..5 you can:

have redo log groups 1, 3, 5, 7, .... on disks 1 and 3
redo log groups 2, 4, 6, 8, .... on disks 2 and 4
archive destination to disk 5

Now, when LGWR is writing to groups 1, 3, 5, 7, .... on disks 1/3, ARCH will be reading groups 2, 4, 6, 8, ... on disks 2/4 and writing to disk 5. When LGWR advances, so does ARCH and they'll switch disks. Never does ARCH and LGWR contend with eachother.

Rating

(40 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Reader

Reader, October 29, 2001 - 11:30 am UTC

Tom,

I have heard from someone that Oracle recommends that there should be at least 3 log members to a group. Is there any truth to it.

Thanks

October 29, 2001 - 12:53 pm UTC

Depends on your level of paranoia. If you believe 2 disks will fail simultaneously -- but three of them will not -- yes, there is truth to it.

2 is the minimum there should be. It goes up from there.

Other waits events

Andre Whittick Nasser, October 29, 2001 - 1:52 pm UTC

In fact, you should get rid of all (or most) factors affecting redo log contention. In addition, maybe:

- Making sure DBWR checkpoiting and ARCH copying are not blocking log switches (check out ALTER.LOG);

- You can also verify that the following statistics in V$SYSSTAT are either high or increasing:

"redo buffer allocation retries" - Total number of retries necessary to allocate space in the redo buffer. Retries are needed either because the redo writer has fallen behind or because an event such as a log switch is occurring.

"redo log space requests" - Number of times the active log file is full and Oracle must wait for disk space to be allocated for the redo log entries. Such space is created by performing a log switch.

In these cases, you can try some things, the main ones being:

. Make DBWR more infrequent (performance will inscrease but recoverbility could be compomised). Do that adjusting FAST_START_IO_TARGET (8i) or FAST_START_MTTR_TARGET (9i).

. Enlarge your redo files. This will make checkpoiting more infrequent on log switches.

. Spawn more DBRWs, ARCHs, or their IO Slaves to increase throughput.

Hope this helps !

Reader

Reader, October 29, 2001 - 7:28 pm UTC

Actually, this person is a consultant and he was
claiming that Oracle recommends a minimum of 3 log members.
He was explaining like very similar to "philosopher's
algorithm". If Oracle finds one member of redo log
is corrupt, it will compare the other two, to determine
which redo log member contains valid data.

I was not impressed, but I thought, you would definitely
know if Oracle used such an algorithm

Thanks

Following some information presented ...

A reader, October 29, 2001 - 8:50 pm UTC

Tom, and others who pitched in, Thanks

I understand following triggers redo log buffer flush
1 - commits
2 - redo log buffer 1/3 full
3 - more than a meg change in redo log buffer
4 - before flushing dirty data buffers to data files

So if I have 3MB of buffer than as soon as it fills 1MB it will dump it to redo files and clean the redo log memeory buffer? Then 2MB will never be used?

Or if I have very small update and a commit (and no one else is working), redo log buffer will get flushed. Say I have 10K worth change, does it mean all redo log buffer will get cleaned?

I understand you mention make sure LGWR & ARCH are working on different disks.

How can I find how often LGWR is flushing the buffer and what kind of requests its putting on hold since its not done writing its information? What to look for in iostat command?

Is there a way to avoide redo log flush in case of commit? I understand for recovery guarantee oracle needs it but can it can wait for another 3 seconds? (I know this is silly question but it seems like in multiuser environment there will be 10s of commits (We have 500 servers running), so there is very high chance next redo log flush will occur within 3seconds. I am just trying to understand what can be done to maximise redo log buffer memory use and take 3seconds risk(lose maximum 3sec worth of data, say 3MB)

Thanks for you help

October 29, 2001 - 9:28 pm UTC

It does not mean the other 2mb will never get used, it just means that when we get to 1mb, LGWR (who does not FILL the redo log buffer) will start emptying it (whilst we are busy filling the other 2mb that is free). It might just keep emptying it if we fill it as fast (or faster) then he can empty it.

Yes, if you have a small update and ocmmit -- your changes, all of them, are written to disk. Any outstanding stuff in the log buffer will go to disk.

You won't use IOSTAT, you'll monitor the stats/events Andre Whittick Nasser mentioned to see if there is an issue (v$tables..)

NO, there is not really a way to avoid the redo log flush in case of a commit. Once you commit, we've made a solemn promise to you that cannot be broken.

Following some information presented (cont'd)

Andre Whittick Nasser, October 29, 2001 - 9:32 pm UTC

Well..

1) That's right, it suffices that you fill up 1/3 of the log buffer so it to be flushed. The log buffer works in a circular fashion, so the (inevitably) old data beyond that will be overwritten.

2) Yes, all data will be flushed on a commit. See 4.

3) There is no way to monitor LGWR activity, but tools like UNIX's iostat on the respective disk. There is nothing like V$FILESTAT on redo files.

4) As you mentioned, the only way Oracle has to enforce that commited data will be rolled forward in case of recovery is flushing it immediately after a commit. The big "compromise" Oracle has towards you is that all commited data will be in redo files. The golden rule: "All commited data will be available in datafiles after a recovery". The data that is rolled back and/or never explicitly commited will not have a commit record in the redo file and will not be applied during recovery.

Hope this helps !

Following some information presented (cont'd)

Andre Whittick Nasser, October 29, 2001 - 9:41 pm UTC

Right,

As to my answer number 3) you can also check for contention using some info in V$SYSSTAT:

"redo buffer allocation retries" - Total number of retries necessary to allocate space in the redo buffer. Retries are needed either because the redo writer has fallen behind or because an event such as a log switch is occurring.

"redo log space requests" - Number of times the active log file is full and Oracle must wait for disk space to be allocated for the redo log entries. Such space is created by performing a log switch.

... although there is not really a "monitoring tool".

Thanks !

Thanks for all help,

A reader, October 29, 2001 - 9:47 pm UTC

Given that whilest 1MB buffer is getting flushed some other DML activity is filling up rest of the redo log buffers,

how much should be the redo log buffer size? Say database is very active with 250 active connections at any time. I understand 50MB is probably too big but would it be bad idea to have size of 5MB (is it too small?). SGA is close to 2.5GB. Current redo log buffer size is 5MB. Redo log file size is 1GB and it is Veritas quickio.

Any suggestions. Thanks again.

Following some information presented (cont'd)

Andre Whittick Nasser, October 29, 2001 - 9:51 pm UTC

Unix's IOSTAT (sar, etc) is the only "tool" to monitor redo file I/O activity on the respective disk the file is placed on. It terms of CONTENTION -- that is what you asked about -- the above V$SYSSTAT statistics apply.

Andre, thanks for you help!

A reader, October 29, 2001 - 9:54 pm UTC

I really appreciate great and very timely help from you. Tom, as usual you have provided great help and stage for people to learn great things.

Thanks

Thanks for all help (cnt'd)

Andre Whittick Nasser, October 29, 2001 - 9:59 pm UTC

The tuning guideline for you to size redo log buffer is checking for contention, again using those two above mentioned statistics:

"redo buffer allocation retries" and "redo log space requests" in V$SYSSTAT. Try and find a value for LOG_BUFFER to stabilize theses statistics. It means that the log buffer is too small and LGWR tries to flush so frequently that contention is generated.

Thanks !

A followup on redo log space request

A reader, November 01, 2001 - 6:16 am UTC

"
"redo log space requests" - Number of times the active log file is full and
Oracle must wait for disk space to be allocated for the redo log entries. Such
space is created by performing a log switch.
"

- Since redo log file is pre-allocated at OS level, what is above statement talking about? I assume its just writing in linear fashion, correct? then wherever the disk head is resting (in case only redo is on the disk), it will resume from there. Then how does log switch help?

Can you please explain/

Thanks

November 01, 2001 - 8:35 am UTC

you are waiting for a log switch -- that is all it is saying. switching of logs, like the changing of the guard, is an event. Things happen. Things can prevent it from being fast.

If you've ever seen:

"checkpoint not complete, cannot allocate new log"

or

"archival required, cannot allocate new log"

for example, you've seen a case where a log switch can indeed take a very long time. Even when its fast, its not instantaneous -- just fast.

clarification on one item please ...

A reader, November 07, 2001 - 9:52 am UTC

From the 8i reference guide:

LOG_BUFFER
Parameter type: Integer
Parameter class: Static
Default value: Operating system specific. Maximum: 500K or 128K * CPU_COUNT,
whichever is greater
Range of values: Operating system dependent
LOG_BUFFER specifies the amount of memory, in bytes, that Oracle uses when buffering
redo entries to a redo log file. Redo log entries contain a record of the changes that have been
made to the database block buffers. The LGWR process writes redo log entries from the log
buffer to a redo log file.
In general, larger values for LOG_BUFFER reduce redo log file I/O, particularly if
transactions are long or numerous. In a busy system, a value 65536 or higher is reasonable.
See Also:
n Oracle8i Designing and Tuning for Performance for more information on setting this
parameter.
n Your operating system specific Oracle documentation for the default value and range of
values.

If we have 40 CPU, wouldn't the log buffer size be 40*128k = 5MB? Is this correct?

thanks

November 07, 2001 - 6:24 pm UTC

According to the documentation -- yes, I don't have a 40 cpu machine to test with but yes.

Reader

Reader, November 08, 2001 - 10:59 pm UTC

Tom,

As referenced from the manual, "Maximum: 500K or 128K * CPU_COUNT":

The maximum size of redo_buffer increase as a function # CPU
Is the reasoning behind this:
while redo gets flushes to disk, redo continually gets created and writtn to remaining part of redo_buffer, the more # of CPU there is, the possibility of more concurrent processes which leads to higher volume of redo generated, while LGWR busy.

Is this a resonable conclusion.

Thanks
flushing the redo_buffer to disk.

November 09, 2001 - 10:16 am UTC

exactly -- dead on.

The more cpu you have, the more redo log you generate in parallel, the more there is to cache.

Followup on Reader from USA

Andre Whittick Nasser, November 09, 2001 - 6:56 am UTC

Yes, correct. LOG_BUFFER should be proportional to the number of CPUs to avoid "overcrowding" this area.

Quote from Oracle8i Designing and Tuning for Performance:

" The LOG_BUFFER parameter reserves space for the redo log buffer that is fixed in size. On machines with fast processors and relatively slow disks, the processors may be filling the rest of the buffer in the time it takes the redo log writer to move a portion of the buffer to disk. The log writer process (LGWR) always starts when the buffer begins to fill. For this reason, a larger buffer makes it less likely that new entries collide with the part of the buffer still being written. "

...

" When LGWR writes redo entries from the redo log buffer to a redo log file or disk, user processes can then copy new entries over the entries in memory that have been written to disk. LGWR normally writes fast enough to ensure that space is always available in the buffer for new entries, even when access to the redo log is heavy. "

.. still, just to refresh:

" The statistic REDO BUFFER ALLOCATION RETRIES reflects the number of times a user process waits for space in the redo log buffer. This statistic is available through the dynamic performance view V$SYSSTAT. "

Hope this helps !

Reader

Reader, November 09, 2001 - 10:31 am UTC

Tom,

Thanks very much

Andre, Tom and others - Thank you!

A reader, November 10, 2001 - 6:22 pm UTC

I wish one statment needs little clarification

When Tom, or someone says - you would never need more than 1MB of log buffer - specify on what #cpu(Mhz as well), how furiously application is working/commiting(very hard to know but needed any way). Otherwise as some of document listed above say - it seem to matter how many cpu you got

Thanks

November 10, 2001 - 6:52 pm UTC

I never say you never need more then 1mb of log buffer? Maybe I'm misunderstanding you here -- but I've never said you would never need more then 1mb, regardless of the number of CPU's

thanks Tom, see following links

A reader, November 11, 2001 - 9:33 am UTC

Following links suggests log buffer size without regard to cpu:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:621023586146,%7Bredo%7D%20and%20%7Blog%7D%20and%20%7Bbuffer%7D http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:55812348055 http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1724586308922 <code>

May be it is my misunderstanding but thats why I was little confused, since these links do not tell about 128k * #cpu equation for log_buffer size.

Thanks in advance

November 11, 2001 - 10:40 am UTC

Umm, I'm not getting the point here?

First -- I cannot tell what comments are yours or someone elses so the context is a little murky here, I'm assuming "this reader" is the same "reader" as the comment right above about 1m. If you actually use a name or provide some context, it would make more sense.

Second -- so, they didn't talk about 128k * #cpu which is simply the documented default value for this parameter as stated in the Oracle8i reference manual? They do talk about how the redo log buffer gets flushed, when you would think about increasing its size and so on. So, I'm not understanding the point here. I do assume people read the documentation and the computation for the default size really wasn't relevant in the various discussions.

These threads are all about the redo log buffer getting flushed, not its size -- not really even about sizing it. It would have been "off topic" to just stick in there "oh by the way, the default size is ....."

Tom, thanks for all your help

Sudhir M., November 11, 2001 - 4:35 pm UTC

I am sorry for expressing incorrectly my thoughts. Any way, I agree, main focus of the thread was log sync event and I should have stayed with it. Sorry for that.

Once again, I apologize for that.

I must say I am extremely happy with the service this site provides.

Thanks again

Sudhir

Reader

A reader, October 17, 2002 - 7:16 pm UTC

Tom,

log file sync wait time and
log file parallel write are both
waits,waiting for LGWR writing log buffer
redo log. How to determine the
difference in these waits.

Does log file sync happen only when
user commits or as posted by DBWR, and
log file parallel write takes place ,
when there is no commit,
and log buffer is 1/3 full or 3 seconds
interval.

Thanks

October 17, 2002 - 7:47 pm UTC

log file parallel write takes place inside of LGWR, it is LGWR waiting for this -- not a user.

If waits for "log file parallel write" are significant then this will show up as other user wait events (such as long "log file sync" wait times).

So, just cause LGWR waits for the parallel write -- that doesn't mean users are, so this is one that you have to look elsewhere to see if it is a problem.

Log files configuration with a few large disks

David Piazza, March 09, 2003 - 10:32 pm UTC

Environment is Oracle 9.2.1 and Windows 2K.
We were doing a performance test of an application using load runner, and I noticed log sync wait events using OEM. I also saw that "redo buffer allocation retries", and "redo log space requests" were increasing. I increased the log_buffer and increased the log file size and added more groups, and noticed that there weren't as many log sync wait events and "redo buffer allocation retries", and "redo log space requests" were increasing at a far less rate. We have a high speed RAID 10 SAN, and also two local disks. I ended up using disks on the SAN and the local disks for log files(local). I put all even numbered groups on one disk(SAN), all odd numbered groups on another(SAN), all second members on a third disk(local) and archives on another disk (local). We have a large amount of disk space with just a few disks. My Questions are: 1) Since we are running RAID 10 on the SAN and having data striped on different drives, does it matter if all the log files are on the same disks on the SAN. 2) I'm thinking of running three tests, all log files on the SAN, all log files local, and log files on both the SAN and local drives to see which has better performance. Do you have any thoughts on this?

Also, on some of our other applications, we only have four drives and can't implement odds, evens, odd members, even members and archives on different disks like you have suggested. We have a setup like the following:

Disk A: 1P, 3M, 4M
Disk B: 2P, 4P
Disk C: 1M, 2M, 3P
Disk D: Archives

M-member log file
P-primary log file

With this setup, the archiver is reading from a disk that is not currently having redo information written to the log file.

Also given four disks, do you have any comments on the following setup, with all members on one disk:

Disk A: 1P, 3P
Disk B: 2P, 4P
Disk C: 1M, 2M, 3M, 4M
Disk D: Archives

M-member log file
P-primary log file

Thanks.

March 10, 2003 - 7:17 am UTC

1) well, it sounds like a "disk" on the san isn't a disk really but a big old striped volume. I don't know if it matters or not -- the process you are going through will determine that however (excellent that you are benchmarking)

2) excellent idea, test all, test to destruction. You are doing the right things

#2 setup looks more "understandable"

When is the LOG BUFFER Flushed

A reader, March 15, 2003 - 6:32 pm UTC

Tom,

Is this is an accurate statement:

Log buffer is flushed when "any" of the following events occur. (as of both 8i and 9i)

(1) When 1/3rd full
(2) When 1 MB full
(3) Every three seconds
(4) When a commit occurs
(5) When a checkpoint occurs

I was told that point (1) was not correct as of 9.2; it was 2/3rd. Is is so?

In the light of the above, it seems there is no advantage to creating a log buffer of more than 3 MB, is there?

Thanks a lot in advance.

March 16, 2003 - 9:41 am UTC

I have not heard of any change in that regards.

There are definitely advantages to larger, much larger log buffers.

Think of it as a pipe. lgwr is responsible for emptying the pipe but may acrue a large backlog -- while lgwr is writing, others can be filling -- so lgwr may continously be active writing to disk whilst others are filling it. If you have lots of big transactions, a larger log buffer can help the "piping" of data flow smoothly. people don't have to wait for lgwr to empty the buffer to get space - there is always space for them.

Switch takes over 1 minute

yazid, March 04, 2004 - 10:58 am UTC

Hello,
I have a prod database that the switch takes an average of more than 1 Minute :

Beginning log switch checkpoint up to RBA [0x268f.2.10], SCN: 0x03e9.b28bba13
Thread 1 advanced to log sequence 9871
Current log# 5 seq# 9871 mem# 0: /disk3/pdco1/var/redolog/redo1.log
Mon Mar 1 13:15:07 2004
Completed checkpoint up to RBA [0x268f.2.10], SCN: 0x03e9.b28bba13
Mon Mar 1 13:15:53 2004
Beginning log switch checkpoint up to RBA [0x2690.2.10], SCN: 0x03e9.b28c3f97
Thread 1 advanced to log sequence 9872
Current log# 6 seq# 9872 mem# 0: /disk3/pdco1/var/redolog/redo2.log
Mon Mar 1 13:21:29 2004
Completed checkpoint up to RBA [0x2690.2.10], SCN: 0x03e9.b28c3f97
Mon Mar 1 13:22:20 2004
Beginning log switch checkpoint up to RBA [0x2691.2.10], SCN: 0x03e9.b28ca165
Thread 1 advanced to log sequence 9873
Current log# 2 seq# 9873 mem# 0: /disk3/pdco1/var/redolog/redo3.log
Mon Mar 1 13:30:07 2004
Completed checkpoint up to RBA [0x2691.2.10], SCN: 0x03e9.b28ca165
Mon Mar 1 13:30:56 2004
Beginning log switch checkpoint up to RBA [0x2692.2.10], SCN: 0x03e9.b28d2627
Thread 1 advanced to log sequence 9874
Current log# 5 seq# 9874 mem# 0: /disk3/pdco1/var/redolog/redo1.log
Mon Mar 1 14:01:56 2004
Completed checkpoint up to RBA [0x2692.2.10], SCN: 0x03e9.b28d2627
Mon Mar 1 14:04:24 2004

As we can see I have 3 redo groups, every redo file is 150Mo

My questions :
1)What is -the acceptable time for a switch ? I think one minute is huge.
2) What is the reason for this ? (not a lot redo groups, may be LGWR is blocked by DBWR)

No other alerte in the <SID>ALRT.log

thanks.

March 04, 2004 - 2:24 pm UTC

that is not the time for the switch. the switch was INSTANTANEOUS.

this is the time to write the blocks from the buffer cache onto disk, in the background, slowly so as to not swamp the machine.

1 minute seems "dandy"

there is nothing wrong here -- this is normal. You don't want DBWR (lgwr isn't really involved here) to totally squash you. You just want to make sure that dbwr is flushing FAST enough so as to have flushed the blocks before the redo log file that fired the checkpoint needs to be reused - and in your case, it is (there are no "checkpoint not completes" in your logs so it is happening more than fast enough)

Ravi, September 15, 2004 - 9:25 am UTC

A quick question though, still dont know why the redo buffer flush happens even for modest change, instead of doing it ONLY for the changed items and doing it for all the pending stuff in the 'pipe'

Is it because

a)I've started a costly disk write option, I might as well write all the pending stuff (1MB)?

or

b)Oracle simply doesnt want to complicate by working to detect the changes in the redo buffer and flushing them alone? In other words its quicker to flush the WHOLE stuff instead of the CHANGED stuff?

September 15, 2004 - 10:08 am UTC

everything in the redo buffer represents a change? what do you mean by "a modest change" -- all changes are created equal?

Ravi, September 15, 2004 - 1:03 pm UTC

What I meant was the 10K change in one of the posts. What

I am looking for is if Oracle can cleverly commit only the current transactions redo buffer (the clever part is locating the current transaction's changes only).

That would mean commit after every unit of work could be possible? As it is synchronous for the whole database now, a developer who has coded plenty of commits (Say a 100K loop of inserts and commit after each insert) may actually "hang" a database as everyone else waits till redo log buffer is flushed?

I am not sure if 10K times flushing of 10KB chunks of data could be quicker than 100 times flushing of 1MB of data?

September 15, 2004 - 1:29 pm UTC

it would not make sense to do that -- an IO is an IO, do them all.

the buffer is a buffer, people are NOT waiting to get in whilst lgwr is writing out.

this will not "hang" anything. you can be filling the redo log buffer as lgwr is emptying it (hopefully that is what is always happening).

Too high or not too high

RD, September 16, 2004 - 7:36 pm UTC

Hi Tom,
I,m going through this thread and also checking my database for "redo buffer allocation retries" and "redo log space requests". But how do I know if the results I get are good or bad???

NAME VALUE
---------------------------------------- ----------
redo buffer allocation retries 28084
redo log space requests 2281

Looks bad to me seriously but how can I be sure and what should I do about it.
Oracle8i is the DB.Application - peoplesoft.
Thanks,
Regards,
RahulD.

September 16, 2004 - 8:13 pm UTC

good question -- if I told you

"on my last car ride, i stopped at 500 red lights"

is that good or bad?

28,000 is a small number if you've been up and running for a while.
28,000 is a big number if you've been up for 5 minutes.

(500 red lights wouldn't be bad on a road trip from new york to san francisco, it would be horrible across town).........

you need to see how long your applications waited on things to see "whats relevant"

Sorry bout' earlier no startup time

RD, September 16, 2004 - 9:12 pm UTC

Hi Tom,

Silly me for not telling you when the database was last started up.

ps8sys > select instance_name,startup_time from v$instance;

INSTANCE_NAME STARTUP_T
---------------- ---------
ps8sys 11-SEP-04

So that's just about a week back.6.5 days to be exact and it's a 10hrs X 5days oracle 8i database generally.

Now
NAME VALUE
---------------------------------------- ----------
redo buffer allocation retries 28084
redo log space requests 2281

Looks bad to me seriously but how can I be sure and what should I do about it.
Oracle8i is the DB.Application - peoplesoft.

Thanks,
Regards,

RD

September 17, 2004 - 8:13 am UTC

ops$tkyte@ORA9IR2> select 50*60 / 28084 from dual;
 
50*60/28084
-----------
  .10682239



so, you get one of those waits about every 10 minutes using averages to hide information.  so, i say, given that number -- 28k and 50 hours or so of database time -- so what?  we are back to "500 stops at red lights".  


You need to look at what you applications wait on in order to 'tune'

log file sync waits

Robert, April 16, 2010 - 5:45 pm UTC

Hi Tom,

We have just switched from EMC to IBM XIV disk array and now our production database (heavy oltp and reporting) has begun to experience massive 'log file sync' waits, from time to time. However in this case, these waits are occurring at times when there is relatively low commit rate ( < 5/second)... but with relatively high redo generation ( > 3 or 4 M / second). Our SA's assure us that XIV can handle all the I/O we can throw at it, but then the new disk array is the only thing that has 'changed'.

I have been using the v$sysmetric_history data to show the I/O rates for overall datafile performance... because it seems that when the high redo size and corresponding log file sync waits occur, there is also corresponding long write times for the datafile disks.

Do you have any suggestions on stats to gather to prove to the SA's that it is because of the XIV disk array.... or else suggestions on how to fix the problem?

Thanks,

Robert.

April 16, 2010 - 6:01 pm UTC

... Our SA's assure us that
XIV can handle all the I/O we can throw at it ....

sure it can, when properly configured - who did that?

Just run a test without Oracle, do some directio to those devices, use dd or something.

Sandeep Arora, June 23, 2010 - 7:19 pm UTC

Hi Tom,
I am new baby into the field of ORACLE DBA,

My Que is that whenever 1/3 ONLINE REDO LOG is full LGWR will write the contents to REDO LOG FILES...but if at that time transcation is running and is not commited it means that transaction is also be written to REDOLOG FILES?.....is it really so please guide me need your expert advise.

Waiting for your reply....

June 24, 2010 - 7:45 am UTC

My Que?

Looks like a mixture of Spanish and English? Not sure what to make of that.

whenever the redo log buffer (in memory) is 1/3 full - not the online redo log file, the online redo log file is what lgwr writes to.

And in reality, it'll probably flush way before that because it does so every 3 seconds, when X megabytes has been put in the redo log buffer, every commit - as well - which ever comes first

We write uncommitted data to the online redo log files all of the time - it is natural and normal and totally by design

You can have a system whereby you have 100mb of online redo log configured and run a transaction that generates 100GB of redo - that would be OK.

If you are interested in the mechanics - how this all works - it takes a couple of pages. If you like the way I explain things, I did document this in my book Expert Oracle Database Architecture in the chapter on redo and undo. If you have access to that - give that a read.

Group# log switching order...

Raj, January 14, 2011 - 1:15 am UTC

Tom,
In the response to the original post of this thread back on Oct.28,2001... you mentioned:

"have redo log groups 1, 3, 5, 7, .... on disks 1 and 3
redo log groups 2, 4, 6, 8, .... on disks 2 and 4
archive destination to disk 5

Now, when LGWR is writing to groups 1, 3, 5, 7, .... on disks 1/3, ARCH will be reading
groups 2, 4, 6, 8, ... on disks 2/4 and writing to disk 5. When LGWR advances, so does
ARCH and they'll switch disks. Never does ARCH and LGWR contend with eachother."

Does that still apply today? I noticed today that on a 10.2.0.4 instance (AIX5.3) my log switches did not go through groups 1-5 sequentially. I dropped and recreated them all sequentially again just in case it was a fluke(sp?) but saw the same "disorder"... well different order, but still not sequentially from 1-5. So if group #2 switches to group #4 next, then it messes up the Oct.28,2001 plan. If this is known already... what determines the switch order?

Thank you!

January 14, 2011 - 12:01 pm UTC

let me rephrase

put your first, third, fifth, seventh, ... log group on disk 1&3
put your second, fourth, sixth, eighth, .... log group on disk 2&5

typically the physical log group 1 will be followed by 2 and so on. query v$log and look at the group#, sequence# to see what your order is.

but also realize that today with striped filesystems coming from a SAN - there might be no physical difference between disk 1, 2, 3, 4 and 5. They might well all be the same disks underneath - meaning - this isn't such a big deal these days unless you have total control over the physical layout.

What determines the switch order

Hemant K Chitale, January 17, 2011 - 1:14 am UTC

Raj,
Your question : "what determines the switch order"

If you add a Redo Log Group to a database, the next switch from the CURRENT Redo Log will go to the new Group.  Thus, if the CURRENT Redo Log was Group#2 in a set of 5 Groups but you add a Redo Log Group#6 now, the next switch will be from Group#2 to Group#6.  This pattern will continue thereafter -- every switch from Group#2 will go to Group#6.

See this demo :

[code]
SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  CURRENT     10.00
   3       1926 /redologs/redo03.dbf  INACTIVE    10.00

SQL> alter database add logfile group 4 '/redologs/red04.dbf' size 10M;

Database altered.

SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  CURRENT     10.00
   3       1926 /redologs/redo03.dbf  INACTIVE    10.00
   4          0 /redologs/red04.dbf   UNUSED      10.00

SQL> alter system switch logfile;

System altered.

SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  ACTIVE      10.00
   3       1926 /redologs/redo03.dbf  INACTIVE    10.00
   4       1929 /redologs/red04.dbf   CURRENT     10.00

SQL> alter system switch logfile;

System altered.

SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  ACTIVE      10.00
   3       1930 /redologs/redo03.dbf  CURRENT     10.00
   4       1929 /redologs/red04.dbf   ACTIVE      10.00

SQL> alter database add logfile group 5 '/redologs/red05.dbf' size 10M;

Database altered.

SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  ACTIVE      10.00
   3       1930 /redologs/redo03.dbf  CURRENT     10.00
   4       1929 /redologs/red04.dbf   ACTIVE      10.00
   5          0 /redologs/red05.dbf   UNUSED      10.00

SQL> alter system switch logfile;

System altered.

SQL> @log_files

 Grp  SEQUENCE# MEMBER                                        Status       Size
---- ---------- --------------------------------------------- --------- -------
   1       1927 /redologs/redo01.dbf  INACTIVE    10.00
   2       1928 /redologs/redo02.dbf  ACTIVE      10.00
   3       1930 /redologs/redo03.dbf  ACTIVE      10.00
   4       1929 /redologs/red04.dbf   ACTIVE      10.00
   5       1931 /redologs/red05.dbf   CURRENT     10.00

SQL>
SQL> alter system switch logfile;

System altered.

SQL> select group# from v$log where status = 'CURRENT';

 Grp
----
   1

SQL> alter system switch logfile;

System altered.

SQL> select group# from v$log where status = 'CURRENT';

 Grp
----
   2

SQL> alter system switch logfile;

System altered.

SQL> select group# from v$log where status = 'CURRENT';

 Grp
----
   4

SQL> alter system switch logfile;

System altered.

SQL> select group# from v$log where status = 'CURRENT';

 Grp
----
   3

SQL> alter system switch logfile;

System altered.

SQL>
SQL> select group# from v$log where status = 'CURRENT';

 Grp
----
   5

SQL> [/code]

Notice how it switched from Group#2 to Group#4 because #4 was the "new" (UNUSED) group when #2 was CURRENT.  After 4 it switched to #3. 
Thereafter

logfile sync wait - changing disk location of logfile

A reader, May 16, 2012 - 2:23 pm UTC

hi tom,

i realise a significant high volume of logfile sync, but there is nothing much we can do about it in terms on commit in application as there is a recent rise in amount of transactions , thus the rise in commit.

however, i realise that the current database is having its redolog all in the same disk/volume.

i am trying to move each of them out into different disk/volume.

I have 3 group and each group have 1 member only

q1) is the step below correct ->

T1) add a new member to each of the group.
the member of each group will reside in different disk

T2) drop the existing member of the group ( that reside in the same disk)

q2) what happen when we a new member into a group ?
will the existing log content's be sync over to the new log as well ?

q3) can we add a new member to loggroup with status ACTIVE? as i believe it is writing data into its member, and how do we ensure the new member will be sync properly ?

Regards,
Noob

May 17, 2012 - 2:54 am UTC

however, i realise that the current database is having its redolog all in the
same disk/volume.

if that disk/volume only has redo - there is not necessarily anything wrong with that.

you should have at least 2 members.

Just create new groups with two members - each member being on different devices then drop the old groups.

logfile sync wait - changing disk location of logfile

A reader, May 17, 2012 - 4:13 am UTC

hi tom,

thanks for the reply.
---

"if that disk/volume only has redo - there is not necessarily anything wrong with that. "

nope. they are together with the datafiles. thus i am moving them out.
--------------

Assuming now I have 3 group (each with a single member)

group1 member1 -> /u03/sid/data
group2 member1 -> /u03/sid/data
group3 member1 -> /u03/sid/data

what i need to do is
create
group4 member1 -> /u01/sid/redo/
group4 member2 -> /u02/sid/redo/

group5 member1 -> /u01/sid/redo/
group5 member2 -> /u02/sid/redo/

group6 member1 -> /u01/sid/redo/
group6 member2 -> /u02/sid/redo/

then drop group 1 and 2 and 3 ?
---------

q1) how do i make sure group1 or 2 or 3 are not the current group when i drop them ?
i have to manually switch them out right ?

q2) how do i make sure that members in group 1 or 2 or 3 are already archived ?

i am worried that if i dropped them, and they are not yet archived, i might have problem recovering ?

q3) will archivelog from missing/dropped redolog group have any problem during media recovery ?

q4) do i have to do any backup after dropping the redo log groups ?

Sorry if i worried too much.

Regards,
Noob

May 18, 2012 - 2:06 am UTC

then drop group 1 and 2 and 3 ?

yes.

q1) alter system switch logfile;

if they are current.

q2) we won't let you drop them if they are - but you can:

alter system archive log all;

q3) no

q4) you technically do not have to - no.

overwriting active redo log files

Gareth, May 23, 2012 - 9:52 am UTC

Hello,

I understand that Oracle won't let you drop an active redo log file, but can it be over written?
My understanding is no but I'm researching this at the moment, which is why I'm reading this thread and found the following in the manuals;

The following are quotes from:

Oracle® Database Administrator's Guide
11g Release 1 (11.1)
Part Number B28310-04

"Filled redo log files are available to LGWR for reuse depending on whether archiving is enabled.
•
If archiving is disabled (the database is in NOARCHIVELOG mode), a filled redo log file is available after the changes recorded in it have been written to the datafiles.

•
If archiving is enabled (the database is in ARCHIVELOG mode), a filled redo log file is available to LGWR after the changes recorded in it have been written to the datafiles and the file has been archived."

In the next paragraph it states:

"... Active (Current) and Inactive Redo Log Files

If you have enabled archiving (the database is in ARCHIVELOG mode), then the database cannot reuse or overwrite an active online log file until one of the archiver background processes (ARCn) has archived its contents. If archiving is disabled (the database is in NOARCHIVELOG mode), then when the last redo log file is full, LGWR continues by overwriting the first available active file."

I'm trying to work out whether it overwrites 'active' files in noarchivelog mode.
My understanding is that only inactive files will be overwritten in either mode and only when archived in archivelog mode. Is 'active' a typo at the end of the previous paragraph?

May 23, 2012 - 11:12 am UTC

an active redo log file (one that is needed for crash recovery) cannot be overwritten until the blocks it protects are safely on disk. This is independent of whether the database is in archivelog or noarchivelog mode.

It is all about crash recovery.

that is a typo.

I filed this against that bit of documentation:

In the above, we write:

...If archiving is disabled (the database is in NOARCHIVELOG mode), then when the last redo log file is full, LGWR continues by overwriting the first available active file....

that is incorrect. LGWR continues by overwriting the next log file in the sequence when it becomes INACTIVE.

We do not pick files willy nilly ("first available"), we pick files in a circle - in a predicable fashion.

We cannot overwrite an ACTIVE redo log file, it is necessary for crash (instance) recovery.

ARCH process

John Liu, June 18, 2012 - 4:50 pm UTC

I am having trouble finding good information on the ARCH process. Reading your answer on alternating redo log files on two drives (group 1,3,5,7 vs 2,4,6,8) to avoid ARCH contention, it gets me thinking:

1) When does archival of a redo log file happen? Does it happen during a log switch, when a redo log is ACTIVE, or when a redo log is INACTIVE?

2) Does ARCH copy from log file to archive file, or from log buffer?

My guess is, ARCH is a separate process that happens independent of LGWR, therefore a redo log can be inactivated before archived, or vice versa. The ARCH process does not block log switches unless the next log file in line has not been archived yet from the previous cycle. Am I correct?

Kind Regards,
John

June 18, 2012 - 4:54 pm UTC

1) typically as you switch out of an online redo log into the next. that triggers an archive.

while it is active (active means "we still need it")
but not while it is current
before it becomes inactive (that means we can reuse that log file, it is checkpointed and it is archived)

2) from online redo log to archive file. The data is in the log buffer for such a blink of an eye - it is destined to be on disk.

You don't have to guess - read the Server concepts guide!!

http://docs.oracle.com/cd/E11882_01/server.112/e25789/toc.htm

it tells you that arch is a separate process from lgwr.

But the status of a log file is in general:

active: we cannot overwrite it yet, we need to finish archiving it or we need to complete a checkpoint for the blocks it protects

inactive: we can overwrite it

current: this is the current one.

re: ARCH process

John Liu, June 20, 2012 - 1:34 pm UTC

Thank you for your clear answer. I have read the documentation, but I cannot find in any place (even in books) where the ACTIVE status of the redo log is clearly explained, and many contradict one another. For example, the V$LOG page on http://docs.oracle.com/cd/E11882_01/server.112/e25513/dynviews_2031.htm states:

ACTIVE - Log is active but is not the current log. It is needed for crash recovery. It may be in use for block recovery. It may or may not be archived.

INACTIVE - Log is no longer needed for instance recovery. It may be in use for media recovery. ***It may or may not be archived.***

This document is saying that it's possible for an unarchived redo log to be inactive. Is it wrong?

Kind Regards

June 21, 2012 - 7:59 am UTC

inactive - if the database is in noarchivelog mode, it doesn't need to be archived.

log file sync vs logfile parrallel write

A reader, July 12, 2012 - 1:34 pm UTC

hi tom,

logfile sync - time to sync the contents in the log buffer to disk

logfile parallel write - time to write the contents to all the log members in the a group

-------

what is the difference between the 2 ?
- a slow logfile write will trigger a slow logfile sync , isn't it ?

or the only difference is the logfile sync is trigger only when a commit or rollback issue, while logfile parallel write happen all the time ?

why is there a need to differentiate between the 2 ? is there any scenario whereby only 1 event wait will occur between the 2 ?

Regards,
Noob

July 12, 2012 - 6:36 pm UTC

log file sync is what you and your applications wait on. It includes all of the time you wait for the log file sync to happen. It includes time spent in the client generating the request, time spent in lgwr getting the request and getting the redo setup and ready to go. It includes time spent writing it. It includes time spent in lgwr figuring out if everything is good to go. It includes time spent by the client receiving the response and finishing up.

So, there is "a lot of CPU - physical IO - more CPU" involved in a log file sync.

A log file parallel write is the time spent by log writer writing to the log.

IF you have high log file sync waits, you have a problem to investigate.

IF you have high log file parallel write waits, you MIGHT NOT HAVE any problems whatsoever.

you care about log file sync waits - your program waits on them.
you MIGHT NOT care about the log file parallel writes - you very well might not be waiting on them.

for example, say you have a million rows to load. you do this statement:

insert into t select * from one_million_row_table;

Now, during that insert - lgwr is going to be writing and writing and writing - constantly. lgwr will be experiencing log file parallel write wait events.

But - you - your insert - it won't be. It is filling the redo log buffer. lgwr is emptying it. You are not waiting on log write at that point. lgwr might be doing IO - but so what - it is doing it in the background. You are not waiting for it.

now, let's say you decide to load the 1,000,000 rows like this:

loop over 1,000,000 row table
   insert into t the row
   commit
end the loop

Now - now you'll be waiting for lgwr to finish writing after each record. You'll have tons of log file sync waits (like 1,000,000 of them). And if you have high log file parallel write waits - you might be able to blame part of that on the writes.

I would blame the programmer, but you could blame the IO. the IO isn't really the cause, the algorithm is, but you see what I mean.

Make sense?

log file parallel write is something lgwr waits for.

log file sync waits might be you waiting on lgwr (might be starved for CPU, lgwr might not be the cause at all). If you are waiting on lgwr, you might want to look at what lgwr is waiting for. those could be log file parallel writes.

Increase size of online redol log files,

A reader, October 29, 2012 - 4:51 pm UTC

Hello,

I think there are two ways of looking at performance issues that are related to log file sync - 1)reduce log switches by increasing the size of redo log files or 2)fix the application that is causing log file sync (and retain the redo log file size as it is).

Would #1 help alleviate the performance problem (say by increasing the log file size from 512MB to 1GB). We are currently seeing around 20 log switches per hour and theoretically it is recommended to have less than 5 per hour.

In our OLTP system, the transactions are taking an average of 50ms. If we reduce the number of log switches, can there be a direct impact on the elapsed time of the transaction?

Thanks,

October 29, 2012 - 9:08 pm UTC

1) why would reduced log switches affect this? a log file sync?

2) you might not have anything to fix, but it would be something to look at.

where is it recommended to have less than 5/hour?

unless you are hitting "checkpoint not complete, cannot allocate new log", your log switches are likely just fine. the log switch is triggering dbwr to aggressively clean out the buffer cache. if you remove the log switches, dbwr will have a bigger job to do and you may well discover that you end up making things worse rather than better.

follow up,

A reader, October 30, 2012 - 10:35 am UTC

Thank you for your response. Reducing log switches is an attempt made to reduce waits on log file sync. About log switches less than 5 per hour, I haven't seen one but in my work place, this news is taking its own priority. So having 20-25 log switches would not be a problem?

On a similar line, if increasing the redo log files to 1GB, would it actually reduce the number of log switches? The reason for this question is, I have seen log files less than 50MB even though the redo log file is created as 512MB.

Thanks,

October 31, 2012 - 3:55 pm UTC

Reducing log switches is an attempt made to
reduce waits on log file sync.

please tell me how that would work - it doesn't reduce log file sync waits. log file sync waits are a wait on lgwr to complete a write to disk - log switches are a trigger to dbwr to flush buffers to disk. reducing log switches won't make lgwr write to disk any faster.

if increasing the redo log files to 1GB, would it actually
reduce the number of log switches?

of course it would, you would have about 1/4th the number you have now if you have 250mb files now. But that could be a good thing, a bad thing, or no impact at all.

where is it recommended to have less than 5/hour?

helmo, November 06, 2012 - 9:29 am UTC

where is it recommended to have less than 5/hour?

Might be it is taken from "Troubleshooting: log file sync' Waits [ID 1376916.1]" (a sub-section of "FAQ: How to Use AWR reports to Diagnose Database Performance Issues [ID 1359094.1]" :)

A quote from the above article:

Check to see if redo logs are large enough

A 'log file sync' operation is performed every time the redo logs switch to the next log to ensure that everything is written before the next log is started. Standard recommendations are that a log switch should occur at most once every 15 to 20 minutes. If switches occur more frequently than this, then more 'log file sync' operations will occur meaning more waiting for individual sessions.

Link: https://support.oracle.com/epmos/faces/ui/km/DocumentDisplay.jspx?id=1376916.1

November 06, 2012 - 10:45 am UTC

a log file sync is a wait - not an "operation".

that note is wrong.

I've emailed the author asking them to consider removing that section from that otherwise fine document.

what is redo log sync and what more can be done about it

A reader, February 19, 2014 - 5:58 pm UTC

At the end of this thread, you state that "log file syncs" are wait events not operations and that you would send a note to the author to update note 1376916.1 to indicate that redo log switches are a function of DBWR switching to a new redo log not LGWR flushing it's buffer to the redo log. And in addition, the general rule of 5 log switches per hour is now a myth that is still being perpetuated by this note and others such as Note ID 781999.1.

Since the change hasn't been made to note 1376916.1, I am wondering who is correct!

Perhaps the confusion is really due to the fact that in the alert log, all log file switches include in parentheses the name (LGWR switch) when in fact it is DBWR who is making the switch because of the trigger from LGWR. An example snippet as seen below:

Thu Jun 02 14:57:01 2011
Thread 1 advanced to log sequence 2501 (LGWR switch)
Current log# 5 seq# 2501 mem# 0: /opt/oracle/oradata/orcl/redo05a.log
Current log# 5 seq# 2501 mem# 1: /opt/oracle/logs/orcl/redo05b.log
Thu Nov 03 14:59:12 2011
Thread 1 advanced to log sequence 2502 (LGWR switch)
Current log# 6 seq# 2502 mem# 0: /opt/oracle/oradata/orcl/redo06a.log
Current log# 6 seq# 2502 mem# 1: /opt/oracle/logs/orcl/redo06b.log

In fact I do agree with you, the DBWR switching from one log to another (due to the redo log filling up, not due to the LGWR writing to the file) should have minimal impact to performance as it's just changing the location where it writes.

I am only adding this review because I still get comments from consultants and others that log file switches are bad and you shouldn't have more than 4 or 5 per hour thereby unnecessarily resizing redo logs when in reality the size of the redo logs has more to do with how quickly you want to recover your database than any impact on performance. And if a quick recovery time requires a smaller redo log size and hence more log switches per hour then so be it. Don't mess with redo log sizes unless you encounter checkpoint problems as you have previously stated.

Just my 2 cents. Keep up the great work.

Live SQL

Share and learn SQL and PL/SQL; free access to the latest version of Oracle Database!

Dev Gym

Classes, workouts and quizzes on Oracle Database technologies. Expertise through exercise!

what is redo log sync and what more can be done about it

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Reader

Other waits events

Reader

Following some information presented ...

Following some information presented (cont'd)

Following some information presented (cont'd)

Thanks for all help,

Following some information presented (cont'd)

Andre, thanks for you help!

Thanks for all help (cnt'd)

A followup on redo log space request

clarification on one item please ...

Reader

Followup on Reader from USA

Reader

Andre, Tom and others - Thank you!

thanks Tom, see following links

Tom, thanks for all your help

Reader

Log files configuration with a few large disks

When is the LOG BUFFER Flushed

Switch takes over 1 minute

Too high or not too high

Sorry bout' earlier no startup time

log file sync waits

Group# log switching order...

What determines the switch order

logfile sync wait - changing disk location of logfile

logfile sync wait - changing disk location of logfile

overwriting active redo log files

ARCH process

re: ARCH process

log file sync vs logfile parrallel write

Increase size of online redol log files,

follow up,

where is it recommended to have less than 5/hour?

what is redo log sync and what more can be done about it

More to Explore

Live SQL

Dev Gym