Question and Answer

Chris Saxon

Thanks for the question, lisa.

Asked: October 24, 2001 - 10:59 am UTC

Last updated: August 06, 2018 - 2:12 pm UTC

Version: 8.1.6

You Asked

The Oracle documentation says the log buffer is normally small compared with the total SGA size, and that its key ratio is the space request ratio: redo log space requests / redo entries.

Could you tell me what key ratio we should be looking at when adjusting the log buffer in the init<sid>.ora file, and how to adjust the log buffer size?

As always, thank you for your response.

and Tom said...

Well, I just look at wait events. If you see lots of log_buffer_space waits, consider making the log_buffer init.ora parameter bigger.
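As a minimal sketch (assuming timed statistics are enabled so the time column is populated), you could check the system-wide totals for that event:

-- Sketch: cumulative waits for "log buffer space" since instance startup.
-- TIME_WAITED is in centiseconds.
select event, total_waits, time_waited
  from v$system_event
 where event = 'log buffer space';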

It is true that the log_buffer is trivial in size compared to the rest of the sga. Your block buffer cache is generally many 10's to 100's of megs (or even in the gigs) in size. Your shared pool is somewhere between 25 and 100 meg or so. Your log_buffer is generally 5meg or less.

Since the log buffer is flushed:

o every 3 seconds
o every commit
o when 1/3 full
o when 1meg full

regardless -- having a HUGE one just wastes ram as one of the above 4 events will kick in and flush it.


Comments

LGWR flushing

Andre Whittick Nasser, October 24, 2001 - 1:25 pm UTC

In addition to the LGWR flush-triggering events you mentioned, Oracle says LGWR writes "before DBWR writes to disk". That is something that is not clear to many people. It means...

Quote from the Concepts Guide:

Before DBWn can write a modified buffer, all redo records associated with the changes to the buffer must be written to disk (the write-ahead protocol). If DBWn finds that some redo records have not been written, it signals LGWR to write the redo records to disk and waits for LGWR to complete writing the redo log buffer before it can write out the data buffers.

Thanks !


Tom Kyte
October 24, 2001 - 1:34 pm UTC

correct -- dbwr might nudge lgwr to make sure the redo for a block is flushed BEFORE the block is....

log buffer space

A reader, October 24, 2001 - 1:48 pm UTC

You said: "If you see lots of log_buffer_space waits, consider making the log_buffer init.ora parameter bigger."
How much do you consider 'lots'?
I checked v$system_event and found 115 in the morning and 180 in the afternoon. Is this considered a lot?


Tom Kyte
October 24, 2001 - 2:01 pm UTC

turn on timed statistics and view the time_waited in v$system_event. If that amount of time is more than you want to have -- fix it. If the time waited is negligible in the grand scheme of things, ignore it.

Looking at the wait count of 115, it's not really possible to say if that is good or bad. If you had but 1 transaction, it's probably pretty bad. If you had 1,000,000, it's probably pretty good. You need to look at that number in the context of your system to determine if it is "bad" or not.
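For example (a sketch; both views hold cumulative values since instance startup), you could put the wait count next to the commit count to get a feel for the rate:

select (select total_waits
          from v$system_event
         where event = 'log buffer space') log_buffer_space_waits,
       (select value
          from v$sysstat
         where name = 'user commits') user_commits
  from dual;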

A reader, July 19, 2002 - 1:58 pm UTC

Hi Tom,

Our DBA has set these values for the log buffer and buffer cache:

SQL> sho parameter buffer

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
buffer_pool_keep                     string
buffer_pool_recycle                  string
db_block_buffers                     integer 22000  <-------
log_buffer                           integer 3276800 <-----
use_indirect_data_buffers            boolean FALSE

As far as I know the log buffer should be small, around 5M.

This setting is in production.

Our environment is working without any problem.

Any comment regarding the above buffer values?

Thanks


 

Tom Kyte
July 19, 2002 - 3:50 pm UTC

If I told you "I stopped at 50 red lights on my last car trip" -- could you comment on that? Is that "good" or is that "bad"?

Regardless of what you say -- given just that amount of data -- I will successfully argue the counter. If you say it is bad, I'll prove it was very good and vice versa.

No, I cannot comment. If the log buffer should always be 5m - it wouldn't be a parameter ;)


Without knowing

o the transaction rates
o the sizes of the transactions
o the amount of ram on the machine
o the number of users
o the configuration (dedicated or shared)
o the block buffer size
o etc etc etc

one cannot really comment.

A reader, July 19, 2002 - 3:58 pm UTC

o the transaction rates
What is it? How can I get that?
o the sizes of the transactions
Siebel CRM application with 150 concurrent users.
o the amount of ram on the machine
5 GB (Solaris OS)
o the number of users
150 concurrent users
o the configuration (dedicated or shared)
Dedicated
o the block buffer size
8k

- ORACLE 8.1.6.1



Tom Kyte
July 20, 2002 - 10:08 am UTC

If you developed or are deploying a system and don't know what "transaction rates" are (defined) and don't know what they are going to be (eg: we will be doing 1,000 transactions per minute) -- you could be in trouble.

Not ever having done a Sighbel implementation, I cannot comment on the size of their transactions.

The numbers look OK to me, especially if "system is running fine".

A reader, May 07, 2003 - 9:22 am UTC

Hi Tom,

Asking the same question again in this thread. Now the system is NOT running fine (after a year). It is getting worse day by day (Siebel CRM implementation).

During the day the CRM database is doing only queries (90% of the time) and during the night we have batch processing from the legacy system which updates the CRM database.

Queries are taking a long time during the day.
Our settings are

SQL> sho parameter buffer

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
buffer_pool_keep                     string
buffer_pool_recycle                  string
db_block_buffers                     integer 22000  <-------
log_buffer                           integer 3276800 <-----
use_indirect_data_buffers            boolean FALSE

sort_area_retained_size              integer 253952
sort_area_size                       integer 1032192
hash_area_size                       integer 2064384


any comments on init.ora parameter or others setting.

Thanks

 

Tom Kyte
May 07, 2003 - 9:32 am UTC

I have no idea who you are, or what this might be a continuation of.

I can say that looking at init.ora parameters is sort of a useless exercise.  Especially incomplete ones.

For example,

ops$tkyte@ORA920> select 22000*2/1024, 22000*4/1024, 22000*8/1024, 22000*16/1024 from dual;

22000*2/1024 22000*4/1024 22000*8/1024 22000*16/1024
------------ ------------ ------------ -------------
    42.96875      85.9375      171.875        343.75



so, do you have a 43m, 86m, 172m or a 344meg buffer cache?

Even if you were to answer that -- so what?  without lots of other stuff -- they are just numbers, like 123 is a number.  by itself -- nothing.


Now, since you are running an off the shelf application, provided by a relatively large company with lots of customers -- I would suspect they would have 

a) helped you size your system
b) ensured you bought sufficient resources for your implementation
c) given you best practices for tuning Oracle under Siebel
d) consultants on site working with you to ensure your success


A 15 minute long statspack report performed during your peak performance times would be useful if you want any comments from us over here. 

A reader, May 07, 2003 - 10:24 am UTC

Hi Tom,

Thanks for the feedback.
I am one of your biggest fans.

Siebel uses the RULE based optimizer.
What about the log_buffer = 3276800 setting?

Any advice from Siebel regarding the database always stinks. Even if the problem is in their own software, most of the time they blame Oracle.

Their consultants always give a good impression of SQL Server and DB2 (but they accept that most of their users (60-70%) use Oracle as the database).




Tom Kyte
May 07, 2003 - 10:39 am UTC

log buffer does not affect SELECT query performance



Log_buffer

Nathan, August 21, 2003 - 10:17 am UTC

Tom,

How can I reduce the size of log_buffer? If the log_buffer is smaller than needed we can increase it by looking at log_buffer_space waits; what about the other way round?

The log_buffer setting, 150M at my new workplace, seems very, very high (for any kind of transactions). A truncate of a single-row table takes 5 minutes (local write wait!). There is indeed huge IO activity on the database (around 35 million per day) (nologging). I suspect this setting, but to bounce the db I need to be able to prove it and be certain about the value. Is there any way I can check log_buffer usage and its impact on the db? Also there are waits due to db file sequential read and db file scattered read; I intend to increase the multiblock read count to 32 (block size 8). Would this help, or increase the IO bottleneck?

The top command generally shows these values:
178 processes: 174 sleeping, 2 running, 2 on cpu
CPU states: 4.2% idle, 35.5% user, 5.4% kernel, 54.9% iowait, 0.0% swap
Memory: 10G real, 207M free, 8347M swap in use, 11G swap free

PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
25513 oracle 1 40 0 0K 0K cpu11 90:51 12.55% oracle
14210 oracle 1 40 0 0K 0K run 32:56 8.20% oracle
22698 oracle 1 60 0 0K 0K sleep 7:33 5.75% oracle
3980 oracle 1 60 2 0K 0K sleep 37:59 2.59% oracle
15230 oracle 1 60 0 0K 0K sleep 15:17 2.14% oracle
5227 oracle 1 60 0 0K 0K sleep 3:36 0.92% oracle
10377 oracle 1 60 0 0K 0K sleep 203:12 0.29% oracle
27037 oracle 1 58 0 0K 0K sleep 10:30 0.27% oracle
10339 oracle 1 60 0 0K 0K sleep 211:19 0.26% oracle
13441 oracle 1 58 0 0K 0K sleep 18.9H 0.24% oracle
23789 oracle 1 60 0 0K 0K sleep 0:20 0.23% oracle
23653 oracle 1 58 0 0K 0K sleep 0:20 0.22% oracle
12447 oracle 1 60 0 0K 0K sleep 0:04 0.21% oracle
10390 oracle 1 58 0 0K 0K sleep 265:04 0.18% oracle
10359 oracle 1 60 0 0K 0K sleep 201:36 0.18% oracle
Regards
Nathan


Tom Kyte
August 21, 2003 - 7:06 pm UTC

the log buffer will flush

every 3 seconds
when 1/3 full
upon commit
when some threshold of buffered redo is hit -- I forget if it is 1, 3 or 5 meg

your issues are not log buffer related.

a truncate taking really long means either:

o table has lots of dictionary managed extents -- solution = use LMTs (locally managed tablespaces)

o table has lots and lots of dirty blocks in the buffer cache and we are checkpointing them out of the system. solution = wait (make io faster)
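As a sketch for checking the first cause, the EXTENT_MANAGEMENT column of dba_tablespaces shows LOCAL or DICTIONARY:

select tablespace_name, extent_management
  from dba_tablespaces;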

log buffer size

Venkat, April 01, 2004 - 5:36 am UTC

Dear Tom,

Don Burleson says that the maximum size recommended for log buffer is 500k or 128k * CPU count. Also, he says that the log_buffer has been set too high (greater than 1 meg), causing performance problems because the writes will be performed synchronously because of the large size (log sync wait events are high).

Is the above right? Kindly let us know.

Thanks
Venkat


Tom Kyte
April 01, 2004 - 10:35 am UTC

funny, the max default size is larger than that on most ports.


No, if he said that, it is wrong. Log buffers in the 10's of megabytes -- probably not useful, but into the megabytes -- very useful.


log file sync wait events happen upon commit. Since lgwr is flushing whenever:

a) 1meg of data appears in the log buffer
b) every 3 seconds
c) when it is 1/3 full
d) upon commit


the data is constantly trickling out in the background. If your log buffer is too small -- you'll be waiting for lgwr to write the data not only when you commit, but as you generate redo entries as well. You need sufficient space to be ADDING to the redo buffer while lgwr is writing it out.



Confusing documentation

Mark J. Bobak, April 02, 2004 - 9:24 am UTC

Well, it's not the first time I've disagreed w/ something Don
said, but, this time, he may not be entirely to blame.

In the 8.1.7 Reference manual, with regard to log_buffer,
we find this:
Default value: Operating system specific. Maximum: 500K or 128K * CPU_COUNT, whichever is greater

Now, this is saying that the default is the OS specific, and
it's the greater of 500k or 128k*CPU_COUNT. But, the way
it's phrased, with that "Maximum:" in there, it's a little
confusing. It's easy to see how someone may read that and
think, oh, Max is 500K or 128k*CPU_COUNT. But, that's not
the case.

Note that in the 9i doc, it has:
Default value 512 KB or 128 KB * CPU_COUNT, whichever is greater

So, they dropped the "Maximum:", which makes it a bit
clearer and easier to understand.

I think Steve Adams mentions this documentation issue
somewhere on his website, but I did a quick search and I
was unable to find it.

As to a practical max value, it seems to me it's got to be
3MB. If the buffer flushes at 1/3 full or at 1MB full,
whichever comes first, well, once you're over 3MB, 1MB full
will always come first, and you'll never hit the 1/3 full,
(unless you mess with _log_io_size).

-Mark

Tom Kyte
April 02, 2004 - 10:24 am UTC

No, 3meg would not be the practical max.

remember

1/3
1m
commit
3 seconds

are TRIGGERS to lgwr to begin writing. they do not stop us from generating more.

lgwr gets triggered to write because it is 1m full -- ok fine, whilst it writes that 1m we generate 4m more. lgwr will then write that 4m, ok fine, whilst it writes that 4m, we generate 8m more and so on

they are triggers -- not "stop" points. we can keep writing to the log buffer while lgwr is flushing other parts of it.

huge log buffer space waits in tkprof

bob, April 14, 2004 - 8:16 am UTC

Tom,

I know CPU timing can be off in reports because the time is counted when the process ends; inaccuracy in waits, though, is related to rounding errors in subsecond timing, correct? Version 9.0.1.4

with my job_queue_processes maxed out, my snap job didn't run when it was supposed to, so for a 90 minute period instead of a 15 minute window, the top wait was "log buffer space", with 125,000 waits and 40,730 seconds of waiting time.

dbms_jobs were loading blob into the db during this time.

11+ hours of wait time in a 90 minute window?

async disk io is usually the first or second max wait event, especially during the recent slowness the db has been showing.

In 9.0.1.4, async disk io could come from any of the background writing processes, correct? Any hints on how to further break down these waits?

I suspect lgwr,arch,dbwr are all problems due to the RAID 5 we are relegated to using.

I want to be able to prove this is the problem. Are large waits for async disk io enough to say that the background processes can't write to disk quickly enough, and therefore the disks are the problem? sar, iostat, etc. don't give me much of a clue since they report many different disks and I have no way to map tablespaces/datafiles back to the physical disks they are actually on.

Tom Kyte
April 14, 2004 - 9:07 am UTC

the cpu timing can be off because we are measuring very small things many times. If we did it all at the end -- it would be different. Eg:

l_tot_cpu2  := 0;
l_start_cpu := getcpu();
for i in 1 .. 1000000
loop
    l_start_cpu2 := getcpu();
    do something...
    l_tot_cpu2 := l_tot_cpu2 + (getcpu() - l_start_cpu2);
end loop;
l_tot_cpu := getcpu() - l_start_cpu;


Now, l_tot_cpu2 is sort of what you see in a tkprof -- we measure lots of little things and add them up. Imagine if the little thing happens in less time than the CPU clock is ticking in? It could contribute 0 -- or 1. Over time, the law of averages usually kicks in, we get an even number of 0's and 1's, and it works out.

But the l_tot_cpu would be more accurate for that entire "process" as the over/under error does not creep in.

11+ hours of wait time -- sure.

If you have 100 people each waiting 1 minute -- what do you have? 100 minutes of system-wide wait time for log buffer space!


with blobs -- if they are "nocache", the foreground processes (the clients) are doing direct writes to the datafiles -- they are waiting for the IO; dbwr is not doing it for them, they do it themselves.


Ouch, raid 5 with massive blob loads -- not good.



are the blobs CACHE or NOCACHE?
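A sketch for checking that (user_lobs shows the cache and logging attributes of each LOB column you own):

select table_name, column_name, cache, logging
  from user_lobs;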

async disk io

bob, April 15, 2004 - 10:27 am UTC

Tom,

The blobs are nocache (the default). The docs for 9.0.1 say that nocache may mean they are written to the end of the LRU in the buffer cache or directly to the datafiles. How is this decision made?

Any ideas how I can rationalize/explain to management that, since the tkprof report during the troublesome period says "asynch disk io" was 11 hours, the I/O system is a problem? This wait event is always the #1 or #2 wait event.

Would one of the sysadmins have seen high wait times on the disks during that time as well? I assume that if Oracle is waiting on async disk io, the disks would show as very busy in iostat or sar, right?

For all I know, underneath several layers of disk management software, the archive, redo and blob tablespace datafiles are using the same couple of physical disks configured with RAID 5.



Tom Kyte
April 15, 2004 - 10:54 am UTC

I've not seen them get cached when writing - you always wait for the IO (but that would manifest itself as a direct path write (LOB) wait, not async io)

did you mean statspack and not tkprof?


I would trace AN APPLICATION (eg: have a job enable the 10046 trace event with waits) and see what the APPLICATIONs are waiting for -- they are whats important. Statspack is too high level to see that information.


The admins may or may not see high waits/activity on disk -- it could be that the disks are "slow" (low activity) causing lots of waits in a "fast" thing.


I would trace an APPLICATION instance.
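A minimal sketch of doing that for one session (10046 level 8 includes wait events in the trace):

alter session set timed_statistics = true;
alter session set events '10046 trace name context forever, level 8';
-- ... run the application workload to be measured ...
alter session set events '10046 trace name context off';
-- then run tkprof on the resulting trace file in user_dump_dest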

yes, statspack..

bob, April 15, 2004 - 2:13 pm UTC

yes, I meant statspack. Even in my most recent 15 minute snap, where there was almost no blob writing going on (which as you said wouldn't affect the async disk io), the top wait event was async disk io for 30 minutes on a relatively low transaction rate system (<1-2 tps)

The system is just horribly slow lately. A full scan on a half-million row table takes hours.

I will turn on that trace event for a couple sessions in sql*plus where the delay also manifests itself and see what kind of waits it is seeing.

Thanks for the helpful leads. Among the many disks that show up in iostat during one of these busy times I did manage to see 100% busy numbers.


async wait io bug.

bob, April 16, 2004 - 4:13 am UTC

Thought you might find this interesting.. bug#2452357 for 9.2, no fix data available on metalink.

PROBLEM:
--------
From the kernel cache buffer load (kcbl) layer, from where we issue direct IOs, we might cumulate the wait time in 2 different wait events:
. one of them is among "direct path read", "direct path write", "direct path read (LOB)", "direct path write (LOB)" and is registered in the kcbl layer
. the other one is "async disk io" and is registered in the kernel service direct file i/o (ksdf) layer

In other words, if we wait for a direct path read/write (lob) wait event, we might also wait for the async disk io wait event. There should not be more than one wait event accounted for at any point in time. Tools like tkprof, which analyzes 10046 trace files, or statspack, which uses v$system_event input, will give a wrong picture of the waits' relative weights because they don't consider direct path and async disk io as correlated wait events.

[...]

This is the kind of patterns you might see in a 10046 trace file:
.
WAIT #1: nam='async disk IO' ela= 0 p1=0 p2=0 p3=0
WAIT #1: nam='direct path write' ela= 0 p1=1022 p2=229471 p3=15
WAIT #1: nam='async disk IO' ela= 10000 p1=0 p2=0 p3=0
WAIT #1: nam='direct path write' ela= 10000 p1=1022 p2=229501 p3=15
WAIT #1: nam='async disk IO' ela= 10000 p1=0 p2=0 p3=0
WAIT #1: nam='direct path write' ela= 10000 p1=1022 p2=229531 p3=15
WAIT #1: nam='async disk IO' ela= 0 p1=0 p2=0 p3=0
WAIT #1: nam='direct path write' ela= 0 p1=1022 p2=229561 p3=15
WAIT #1: nam='async disk IO' ela= 10000 p1=0 p2=0 p3=0
WAIT #1: nam='direct path write' ela= 10000 p1=1022 p2=229591 p3=15
.
So we can see that the wait events go in pairs and have the same wait time.


Tom Kyte
April 16, 2004 - 7:38 am UTC

interesting, yes, that could definitely be happening here -- a tkprof would bear that out.

thanks!

async disk io

bob, April 25, 2004 - 10:07 pm UTC

tracing an application and summing the waits with the 9iR1 tkprof showed excessive waits for async disk io as well.

I didn't dig into the raw trace to see which cursors were causing it, but it truly is/was a system-wide slowness. I am reading Millsap's book now (since I already finished your second one), and understand that a system-wide problem will likely manifest itself in a specific application as well (as it does here).

I read a metalink note on the subject (the async disk io wait), but it seems I will have to dig into v$system_event to find which background processes are generating the most waits for this very generic wait event used by many background processes.

searches for this wait on the internet turn up almost nothing (no one else seeing similarly large waits for this event)

Tom Kyte
April 26, 2004 - 6:14 am UTC

did you check out that bug# above -- it could be "noise"

raw trace file

bob, April 26, 2004 - 12:12 pm UTC

yes, I was aware of what the bug said, but I didn't remember a similar 1:1 wait ratio in the overall wait summary.

relooking at the raw trace and summary for those, there were direct path write waits that were almost identical to the async io waits.

in another case in the raw trace there were 3 async io waits for every db file scattered read, with the total time of the 3 aio waits ~= the db file scattered read wait, so the ratio was 3:1 instead of the 1:1 the bug describes.

async io wait = noise (in my case). Thanks.


log buffer space waits for a single blob load

bob, May 24, 2004 - 6:42 pm UTC

the db is in archivelog mode.

I 10046-traced the procedure that does a single blob load into a NOCACHE table in a NOLOGGING tablespace.

The blob column is part of a table that is in a LOGGING tablespace, with the blob data storage clause pointing to a NOLOGGING blob tablespace.

for a 75MB BLOB load that took 2 minutes, there were 500 log buffer space waits totalling 1:56.

the log buffer size is the default 0.5 MB. For a nologging operation this seems like a lot of log buffer space request time.

I guess something has to be the bottleneck for pushing this data to disk, but what is getting put in the log buffer? lob index segment changes? (I seem to remember those being logged regardless of nologging mode, right?)


Tom Kyte
May 24, 2004 - 7:13 pm UTC

the lob index will be logged.

consider upping that log buffer; lgwr will be writing it out as you are filling it. 3, 4, 5m would not be unreasonable.
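For what it's worth, log_buffer is a static parameter, so changing it means editing the init.ora (or, if you use an spfile in 9i and up, something like the sketch below) and bouncing the instance. The 5m figure here is just the example size from above:

alter system set log_buffer = 5242880 scope=spfile;
-- restart the instance for the new size to take effect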

A reader, June 28, 2004 - 2:02 pm UTC

Thanks Tom, you are my hero. I resolved my doubt about the sizing of the LOG_BUFFER parameter.
Most DBAs say not to set it to more than 1MB, claiming that setting it higher can be harmful; you give a good reason why, and when that is not the case.

Some doubts

pjp, August 03, 2004 - 9:08 am UTC

Hi Tom,

I have not understood the following concept regarding Oracle 9i performance tuning.

The Oracle manual says that v$sysstat should be checked for redo buffer allocation retries and redo entries, and that v$session_wait should be checked for the log buffer space event.

My understanding is that, with respect to redo logs, both v$sysstat and v$session_wait are for checking the log buffer event; v$sysstat shows waiting time in seconds and the other shows the number of tries. Am I correct? If not, I request you to explain in detail.

thanks & regards


Tom Kyte
August 03, 2004 - 9:29 am UTC

sysstat is system statistics -- counts, the number of times an event occurred.

v$session_wait is -- well, wait events. It has the wait event and how long (time) it was waiting.

so, you have that backwards. v$sysstat is not wait time; session wait is.
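A sketch of each, to make the distinction concrete:

-- v$sysstat: cumulative counts since instance startup
select name, value
  from v$sysstat
 where name in ('redo buffer allocation retries', 'redo entries');

-- v$session_wait: what each session is waiting on right now
select sid, event, seconds_in_wait
  from v$session_wait
 where event = 'log buffer space';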

what is 'redo entries' in v$sysstat ?

Parag Jayant Patankar, March 12, 2005 - 6:17 am UTC

Hi Tom,

Regarding tuning of the log buffer I have come across 'redo entries' in v$sysstat. The Oracle Reference Manual says 'redo entries' means

"Number of times a redo entry is copied into the redo log buffer".

I have not understood the concept and the requirement: why does a redo entry have to be copied into the redo log buffer? Kindly explain.

regards & thanks
pjp

Tom Kyte
March 12, 2005 - 10:12 am UTC

as you generate redo, it has to go "somewhere": it gets copied into the redo log buffer and, as that fills, lgwr writes it out.
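You can watch your own session generate it; a sketch (v$mystat holds the current session's statistics):

select s.name, m.value
  from v$mystat m, v$statname s
 where m.statistic# = s.statistic#
   and s.name in ('redo entries', 'redo size');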

What if I size log buffer a high value

Mohan, July 01, 2005 - 2:30 pm UTC

If I size the log_buffer to 5M, the system will write to disk when the log buffer is 1m full, and again when it is 2m full, or when any commit happens, or after 3 secs. In this way lgwr will write the same data to disk many times. So keeping the log buffer size small, say 128K, is beneficial. Please comment.

Tom Kyte
July 01, 2005 - 3:20 pm UTC

anytime there is more than 1meg in the buffer, lgwr will be busy writing it out...
every three seconds, lgwr will be busy writing it out....
every commit it'll write it out...



keeping it really small could make you wait for log buffer space. lgwr works in the background (except when you commit; then you wait for lgwr)...

think about this: you are doing a long transaction (like a data load...) and you are generating lots of redo. If you had a small log buffer, you'd be waiting for lgwr to empty it. If you had a large one, you could be filling parts of it while lgwr is writing out other parts -- you can "work together".





What should be the size of log buffer

Mohan, July 05, 2005 - 12:37 pm UTC

This is only a suggestion Oracle can implement in its next release.

The log buffer should be divided into a number of segments instead of being a single log buffer. The number could be specified by the user in the init.ora file. The size of the online redo log file should be the total size of the log buffer.

Suppose there are 100 buffers of 128K each. Oracle keeps writing to log buffers 1 through 100 in a cyclic, round-robin fashion. The lgwr can leisurely write each segment to the file.



Tom Kyte
July 05, 2005 - 1:09 pm UTC

the redo log buffer is already a circular buffer in that fashion.

say you make it 10meg in size. it'll flush

o with 3.3meg full or 1meg full (1/3 or 1meg is a trigger)
o with every commit
o with every 3 seconds of idle time

whichever comes first.

let's say you are doing a really big transaction and you are "alone", so the 1meg or 1/3 trigger is kicking in. lgwr writes out whatever is in there, letting you fill up the rest; lgwr takes the 1meg of data and starts writing it out, letting you deal with filling the other 9meg. Probably, when lgwr is done writing the 1meg, there is another 1meg to write.




Sizing of log buffer

Mohan, July 05, 2005 - 3:40 pm UTC

For a system having small transactions and frequent commits, will it be disadvantageous to have a really large value for the log buffer? My doubt is: will it generate more disk IO?

Tom Kyte
July 05, 2005 - 3:52 pm UTC

other than the fact that you are allocating memory to a buffer that might never need it, no, it'll not cause any more IO.

A reader, July 06, 2005 - 1:43 am UTC

Surprising Tom did not point Mohan from Chicago to the concepts guide :-)

1 Mb - where does it come from?

Denis Avdonin, September 26, 2005 - 12:36 pm UTC

Hi Tom,

You are saying that LGWR will flush the log buffer when it is 1Mb full. Is this still true in 9i and 10g? I cannot find it mentioned in any documentation... However, I have seen a statement in a book saying that it's not simply 1Mb of any redo, but 1Mb of updated records, i.e. this 1Mb does not include deleted and/or inserted records. Could you comment on this statement as well please?

Thanks,
Denis

Tom Kyte
September 27, 2005 - 9:30 am UTC

redo is not row oriented - there are more than just rows and columns in the database (think of indexes, branch blocks, root blocks for example)... redo is not organized around "row changes".

So, not sure what was meant by deleted and/or inserted records - but not seeing the quote in context, I won't comment on it.

There are other times lgwr will flush as well - and the 1mb is still one of them.


Archive log mode

Mita, September 30, 2005 - 12:42 pm UTC

How do you size the filesystem for archived logs if your database is running in archivelog mode?

Tom Kyte
September 30, 2005 - 2:16 pm UTC

you watch how much you generate?

your goal: get the archives OFF OF THAT SYSTEM ASAP.

why? because "machines fail" and those disks with the archives are on that machine.

So, you are constantly getting them off of that system, and you can just clear out the oldest ones as you need space (leave them on as long as you can; it can make recovery of a single datafile much faster - but you have to get them OFF)
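As a rough sketch for the sizing question itself (assumes archivelog mode; v$archived_log keeps a row per archived log):

select trunc(completion_time) day,
       round(sum(blocks * block_size) / 1024 / 1024) mb_generated
  from v$archived_log
 group by trunc(completion_time)
 order by 1;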

but company policies..

Mita, September 30, 2005 - 5:42 pm UTC

I agree with that, but my company has a policy of moving archive files only once every 24 hrs, and as we are processing millions of records every day, the archive space gets full during the day and someone has to manually move the files to the backup drives. What else could be done?

Tom Kyte
October 01, 2005 - 8:46 pm UTC

change that bad company policy?

David Aldridge, October 01, 2005 - 12:51 am UTC

Mita,

Maybe not ethical, but how about you let the instance come to a dead halt with a full archiving location until you reach your company-mandated archiving window, or until the CEO issues an emergency exemption for your system -- whichever happens first ;)

Tom Kyte
October 01, 2005 - 9:01 pm UTC

nice... one way to get "company policy reconsidered"

(don't recommend it however, it could be a CLM - career limiting move)

log buffer size

A reader, July 23, 2006 - 4:47 pm UTC

I set my log_buffer=100MB, but I still see log buffer waits.
As we know, the log buffer is flushed frequently for many reasons; how is it possible that 100MB of log buffer gets filled?

log buffer waits are the time waited for allocation of space in the log buffer, right?



Tom Kyte
July 24, 2006 - 9:42 am UTC

http://docs.oracle.com/docs/cd/B19306_01/server.102/b14237/waitevents003.htm#sthref4463

how is it possible? simple: redo is being generated at a rate higher than lgwr can write it to disk, so lgwr falls behind (more redo is created than can be written in some period of time).

could be that you have insufficiently sized redo logs (do you see waits for log file switches as well?).

could be that you have a monster transaction that generates tons of redo in a big burst and must wait for lgwr to catch up.

could be that you have lots of concurrent medium/largish transactions that do the same.
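A sketch for checking the redo log switch side of that at the same time:

select event, total_waits, time_waited
  from v$system_event
 where event = 'log buffer space'
    or event like 'log file switch%';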

log_buffer

A reader, July 24, 2006 - 10:49 am UTC

Then why did you say, in your first reply in this thread:

"regardless -- having a HUGE one just wastes ram as one of the above 4 events will kick in and flush it."



Tom Kyte
July 24, 2006 - 11:04 am UTC

when someone posts a huge number, that will apply.

So, what is HUGE? 100mb is not huge if you are performing transactions that generate 10's/100's of megabytes.


50gig - that might be considered huge, 100mb, not so huge these days.


but it depends entirely on

a) SIZE of redo generated by TRANSACTION
b) NUMBER of CONCURRENT TRANSACTIONS generating the redo in a)



Log Buffer

A Reader, September 21, 2006 - 8:20 am UTC

Hi Tom

When I execute the following query in one of our databases, the value is consistently greater than 1500.

SQL> select value  from v$sysstat
  2  where name = 'redo log space requests';

     VALUE
----------
      1560

Am I right in saying that the log buffer value is too low because the value returned from v$sysstat is greater than 100 (always greater than 1500, in fact)?

Please let me know.

Thanks 

Tom Kyte
September 22, 2006 - 1:57 am UTC

that value has been growing since your database was started. When you shut down, it'll go back to zero.

I would say 1,560 is a tiny value in general. Doubtful this is indicating anything you need to look at.

log_buffer in 10gr2

A reader, September 25, 2006 - 10:17 pm UTC

Tom,
I would have normally waited for a new question to ask this, but this is so intriguing and undocumented that I have to ask it here. In a 10gR2 RAC database, no matter what I set the log_buffer parameter to in the spfile, Oracle is defaulting to log_buffer=14M on all databases on the cluster. I tested on a small home database (single instance RAC) and Oracle seems to default to 2.5M no matter what I set. I searched the documentation but there is no explanation for this. The databases are using the spfile, because other parameter changes are taking effect.

Thanks


Tom Kyte
September 26, 2006 - 2:31 am UTC

from Expert Oracle Database Architecture (basically, the minimum is OS specific):

The default size of the redo buffer, as controlled by the LOG_BUFFER parameter, is the greater of 512 KB and (128 * number of CPUs) KB. The minimum size of this area is operating system dependent. If you would like to find out what that is, just set your LOG_BUFFER to 1 byte and restart your database. For example, on my RedHat Linux instance I see:

sys@ORA10G> alter system set log_buffer=1 scope=spfile;
System altered.

sys@ORA10G> startup force
ORACLE instance started.

Total System Global Area 1593835520 bytes
Fixed Size 779316 bytes
Variable Size 401611724 bytes
Database Buffers 1191182336 bytes
Redo Buffers 262144 bytes
Database mounted.
Database opened.
sys@ORA10G> show parameter log_buffer

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_buffer integer 262144

The smallest log buffer I can really have, regardless of my settings, is going to be 256 KB on this system.


log_buffer in 10gr2

A reader, September 25, 2006 - 10:31 pm UTC

I found the answer in Metalink Note 373018.1. I cannot understand what they are saying, though. Either they are trying to defend a bug, or the entire 10gR2 documentation is incorrect about this parameter. This is definitely a change from 9i, so their claim that this is the expected behaviour for log_buffer seems a bit hollow.

Thanks

Tom Kyte
September 26, 2006 - 2:35 am UTC

the log buffer is OS dependent. As 14m is rather "tiny" in the grand scheme of things, I'm not sure this is worth getting "worked up" about?

and they sort of describe what is happening - with the fixed size, granules and all.

Log writer process running for a long time

Maulesh Jani, December 02, 2006 - 1:38 pm UTC

Hi TOM,
Again I would like to thank you for all this information. I have one question related to the LGWR process. During my daily monitoring I found several processes taking a long time due to the log-buffer-space wait event. After further analysis I found that the LGWR process on HP-UX is taking a long time to finish (approx 30 min to 1 hr). This has been happening for just the last 2 days; I checked and found that no Oracle parameters have changed and nothing has changed in the data load.
--- Can you please describe in which situations the LGWR process runs for a long time? (Does it relate to any I/O problem?)
--- Or should I start by increasing the buffer size?
I also have one more question which is not directly related to this, but I request your comments on it as well.
--- In v$session my session is active, yet the Unix top command shows that my PID is sleeping, although the CPU is idle enough.


Thanks


Tom Kyte
December 02, 2006 - 7:59 pm UTC

how do you measure that the "process of lgwr is taking 30min to 1hour"??? I'd love to know what that "means" exactly.


lgwr had better ALWAYS be running, forever - or your database isn't up.

Question with Correction

Maulesh jani, December 03, 2006 - 5:09 am UTC

Hi,
Yes, these processes are always running when the database is up.
What I meant:
The LGWR process shows up in the Unix top command consuming 80% to 90% of CPU (there are 5 CPUs). And I observe several user processes with the log-buffer-space wait event; once LGWR stops consuming so much CPU at the top, the other user processes come in to utilize the CPU and finish their tasks. The problem is that the same pattern keeps repeating: the user processes get very little CPU and LGWR very much.
-- overall CPU (total of 5) shows 50% idle, yet the user processes are not consuming it. Is that because the user processes (with the log-buffer-space wait event) are waiting for LGWR to finish dumping the buffer to the file?

How could this happen? Does it indicate a problem on the Unix OS side? We don't have any resource consumer groups set up at the database level.

Thanks

Tom Kyte
December 03, 2006 - 7:49 am UTC

that would imply that... well... during that period of time you have some process GENERATING lots of stuff. So, there is something else afoot here.

if you have 5 cpus, it is not only highly unlikely but pretty much impossible for the single process named "lgwr" to consume 80 to 90% of the cpu - it could at most consume 20% (1 out of 5 cpus)

have you considered that your log buffer might just be too small for the work you perform? You haven't said at all what you are doing in the other processes (they must be generating gobs of redo), and you haven't said how big your log buffer is.

Also, are your log devices "contention free"? Is lgwr able to write to them efficiently?

log_buffer being too large?

ignorant, April 25, 2007 - 7:44 pm UTC

Hi Tom,
Everything I see on this page agrees with what I saw in the concepts guide (no surprise there). But I came across the following comment that, unfortunately, I am not able to understand:
"Larger LOG_BUFFER values reduce log file I/O, but may increase the time OLTP users have to wait for write operations to complete."
(site: http://orafaq.com/faq/how_does_one_tune_the_redo_log_buffer )

This is a great site so I am sure that I am missing something.
A little background: I have a 9i database that slows down every day at a certain time. This is due to a large DELETE statement (it deletes 2 million out of 3 million rows). If I could have my way I would change the application logic to do the following:
1) Create second table with the 1m rows
2) Drop orig table
3) Rename second table to orig.
But unfortunately that is not an option. So I looked at the time the slowdown happens, and it is because of a large number of "log file sync" waits (from v$session_wait). The LGWR process is waiting on "log file parallel write" ( http://download-east.oracle.com/docs/cd/B10501_01/server.920/a96536/apa5.htm#972405 ).
I also looked at V$SYSSTAT -
STATISTIC# NAME CLASS VALUE
---------- ---------------------------------------------------------------- ---------- ----------
116 redo buffer allocation retries 2 384
123 redo log space wait time 2 359271

and after the delete -
STATISTIC# NAME CLASS VALUE
---------- ---------------------------------------------------------------- ---------- ----------
116 redo buffer allocation retries 2 1946
123 redo log space wait time 2 583203

So it appears that I should think about resizing my log_buffer parameter. The redo produced is about 1,282,855,420 bytes and the log_buffer is 100M. So naturally I thought of increasing it... till I came across that note.
The tuning goal is this: the performance of this one DELETE is not important (although improving it would be nice); what is important is that other users should not have to wait so severely.
So the question becomes, should I increase the log_buffer or decrease it? Is it harmful to have it being so big?
Thanks once again.
Tom Kyte
April 26, 2007 - 11:28 am UTC

I'd question the veracity of that paper. Why would a larger log buffer reduce the overall IO? Why is 10m something you would want to question yourself about?


You wait for log file sync when you commit.

If you are deleting lots of records - don't commit, you won't wait, problem solved....

don't think about a silver bullet parameter, think about changing your implementation.

RE: log buffer size being too large

ignorant, April 26, 2007 - 12:01 pm UTC

thanks Tom. That does help. Unfortunately, however, I can't change the application :(
The logic for the log buffer size seemed to be: a lot of redo info to be written goes into the log buffer. The log writer flushes this to disk. It has to wait for the disk even though it does this asynchronously. So if you have a surge, a large buffer would hold the additional redo to be written. In my case the redo is about 1G worth, so my poor log buffer of 100M does not suffice.
The logic follows that increasing this buffer would hold the additional redo (from the bad app) and so LGWR would have enough time to catch up and write it to disk.
You are right though, the app is busted. We should really be fixing that. But given that we are going to have that large DELETE and INSERT, how best to accommodate it?
Tom Kyte
April 27, 2007 - 10:19 am UTC

so, tell me how a larger redo log buffer REDUCES the amount of redo written. Article says:


.. Larger LOG_BUFFER values reduce log file I/O ...
                            ^^^^^^



maybe, in some extreme cases, larger log buffer values can reduce the NUMBER OF TIMES lgwr writes to the log files, but it won't reduce the AGGREGATE AMOUNT of log data written really.

but given you are waiting for log file sync (commits frequently), a larger log buffer would not do anything because lgwr is already being told (frequently) to flush.

And your slow by slow processing is assured to generate the MOST redo possible, you've made the problem doubly worse.

how best to accommodate a delete and insert? not sure what you mean - databases were born to process those statements.

10g & beyond log write thresholds

A reader, August 14, 2007 - 11:35 am UTC

Tom,

I RTM'ed manuals (& googled it, but google is not accurate):

We (Manuals) used to say (8i days) that the lgwr writes:

o every 3 seconds
o every commit
o when 1/3 full
o when 1meg full

It would seem that we (the manuals) have dropped the 1meg-full threshold for the log writer. Are these

o every 3 seconds
o every commit
o when 1/3 full

now the correct thresholds?

Thanks

Wayne
"Back in the saddle"
Tom Kyte
August 20, 2007 - 12:04 pm UTC

probably still trips a write at 1mb full - but redo logging is much more complex in 10g than it used to be (private redo strands, other things). I'd feel comfortable just saying

redo is buffered for a very short period of time and is constantly being streamed out to disk - this is why committing 100mb of redo takes about as much time as committing 10k of redo, in general - you are only waiting for the last tiny bit in both cases.

Slow writes

A reader, September 14, 2007 - 10:15 am UTC

I appreciate that the following statspack info is a bit incomplete, but I would be interested in your thoughts anyway. This is a 10g db on Solaris, with a single user connected and nothing else running on the server or db. The report covers a 20 minute period when a 10 million row table was being updated by a single sql statement. No other user sql was running at all. Apparently the disks are super quick (according to our unix admins).

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
Event                           Waits     Time (s)  Avg wait (ms)  %Total Call Time  Wait Class
------------------------------  --------  --------  -------------  ----------------  -------------
db file parallel write             7,642     2,562            335              60.5  System I/O
log file sync                      2,440     2,298            942              54.2  Commit
log file parallel write              178     1,155          6,486              27.2  System I/O
log buffer space                   1,140     1,073            941              25.3  Configuration
CPU time                                       296                              7.0


Looks like an awful lot of write waits to me, of one sort or another. The log buffer is around 30 MB, but the results were similar when it was half that size. Where would you focus your attention to improve the speed of the update?
Tom Kyte
September 15, 2007 - 9:50 pm UTC

log file sync - looks like you wrote a procedural process here and it is constantly committing.

the db file parallel write - dbwr (software) waits on that - no big deal.

the log file sync - you wait on that when you commit, that is a big (avoidable) deal.

Slow Writes (continued)

A reader, September 14, 2007 - 10:20 am UTC

what I meant to say wasn't that there are a lot of write waits, rather that the avg write times seem very slow...
Tom Kyte
September 15, 2007 - 9:53 pm UTC

they were probably batch writes (more than a block)

there is a huge spike in the log writes from time to time though - you might want to look at that.

log buffer space wait event on statspack

Arulkumar, October 19, 2011 - 11:35 pm UTC

Hi Tom,

I'm running replication (truncate & insert) on my database. It's taking more than 7 hrs to complete.
The log buffer space wait event seems to be very high during this replication. I'm planning to increase the redo logfile size and the log buffer size, as per the Oracle documentation's suggestion.

Current size of redo logfile: 150MB

Note: 3 log switches are happening every 4 minutes.

SQL> select GROUP#,BYTES from v$log;

GROUP# BYTES
---------- ----------
1 157286400
2 157286400
3 157286400
4 157286400
5 157286400

SQL> sho parameter log_buffer

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_buffer integer 2097152


DB size: 440G

Oracle version: 9.2.0.5.0

OS: Sun OS

Can you please suggest by how much the logfile and log buffer sizes should be increased?

Thanks,
Arul.




Tom Kyte
October 19, 2011 - 11:38 pm UTC

please answer all of these:

do you have indexes in place on this materialized view?

are you in archivelog mode or noarchivelogmode?

is the materialized view nologging (sounds like it could be since you could just refresh it instead of recover it)?

how much time is spent on log buffer space waits?

is anything else going on (are you SURE the log buffer space waits are from replication)?


log buffer space wait event on statspack

Arulkumar, October 20, 2011 - 12:41 am UTC

do you have indexes in place on this materialized view?

Ans --> This is not an MV refresh. Some of the tables are truncated and records are inserted from the production database.

are you in archivelog mode or noarchivelogmode?

Ans : -->Archivelog mode

is the materialized view nologging (sounds like it could be since you could just refresh it instead of recover it)?

Ans : --> It's not MV

how much time is spent on log buffer space waits?

Ans: --> Most of the time during the replication (more than 5 hrs)

is anything else going on (are you SURE the log buffer space waits are from replication)?

Ans: --> Insert operations are going on in only one particular schema.

Apart from that, nothing is going on.
Tom Kyte
October 20, 2011 - 2:28 am UTC

ok, ok, so let me rephrase (they should be materialized views; why aren't they?). One thinks you could have answered my questions anyway - even though they are not materialized views...


do you have any indexes on this segment?

is this segment nologging?

let me see the statspack report here.



more questions to be answered

lh, October 21, 2011 - 5:53 am UTC

Arulkumar: How many logfiles do you have? Are there any comments about waiting on redo log switches in the alert file? Does your archiving process keep up with the rate at which you generate redo logs? How many redo log groups do you have?



Tom Kyte
October 21, 2011 - 8:29 pm UTC

I would like to skip the redo generation altogether; that is where I'm going with this.

they do not need redo for this particular table; it would be more efficient to just regenerate it in the event of a media failure.

log buffer space for massive parallel update

deba, June 07, 2012 - 10:37 am UTC

Hi,

We are running a massive update in parallel on a single table (parallel degree is 20) and it will update around 500 million rows.

While the update is going on, I can see that most of the sessions are waiting on the log buffer space event. What is the best way to avoid this problem?

Thanks
Deba
Tom Kyte
June 07, 2012 - 11:30 am UTC

500 million out of how many? Will all of those blocks of the table ultimately get loaded into the buffer cache - and generate undo/redo - and have to be written back out?

why not a parallel create table as select (see the sketch after this list)?

no undo
no redo (if you want to skip it, you can)
direct path
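A sketch of that approach; the table and column names here are hypothetical, and you would still need to recreate indexes, grants and constraints afterwards, then drop the old table and rename the new one:

create table big_table_new
parallel 20
nologging
as
select /*+ parallel(t, 20) */
       t.id,
       t.payload,
       'PROCESSED' status   -- the "update" applied as an expression
  from big_table t;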


Log buffer

Sheryl Mae De La Pena, August 04, 2018 - 10:49 pm UTC

I only have a 64k or higher log buffer. Is there a possibility to get 32k in the developer options by doing something? If I can't, which one is the best log buffer to use for a Samsung S7?
Chris Saxon
August 06, 2018 - 2:12 pm UTC

which one is the best log buffer to use for Samsung S7?

Wait... you're running Oracle Database on a Samsung S7?! Really?