Redo Logs - "Thread 1 cannot allocate new log" in alert log

Question and Answer

Tom Kyte

Thanks for the question, Michael.

Asked: August 15, 2012 - 5:09 pm UTC

Last updated: August 17, 2012 - 3:13 pm UTC

Version: 11.2.0.1.0

Viewed 10K+ times

You Asked

Hello,
I'll start by saying I'm not a DBA but rather a developer, and I recently purchased your book "Expert Oracle Database Architecture". Chapter 3 mentions that we should never see the message "Thread 1 cannot allocate new log" in the alert log, because it indicates we are trying to switch to a new redo log before DBWn has finished flushing all the blocks protected by that log, and processing is suspended while everything gets flushed.

I took a peek at the alert log for our prod database and discovered that in one day that message appeared 33 times, all during peak processing; many times the messages are just a few minutes apart. At the busiest time it appeared 13 times in a span of 120 minutes.

The message "Private strand flush not complete" also comes in the next line followed by 3 lines that contain the name of the redo log.

--
Questions:
Will this message only appear in the alert log when we try to switch to a redo log that has not been flushed, or can there be other causes?

How much of an impact can this have, and can it cause noticeable slowdowns in our batch processing and also for users running reports in the database?
--

I plan on sending my analysis to our DBAs and suggesting that this be looked into. I started at a new company two months back, and of course the biggest issue my team is dealing with is poor performance in the database, so I'm trying to find anything that could be impacting performance. I already found that they did not have any of the large tables (300 million+ rows) partitioned in a way that benefited loading/reading data. They had the tables hash partitioned into 8 pieces, and I suggested partitioning by month since the data is loaded and reported on by date. So far in our test database we have seen jobs finish up to 5 times faster because of the new partitioning! A full-blown test with all big tables partitioned this way is being done now.
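A hedged sketch of the month-based scheme described above (table and column names are invented for illustration; 11g interval partitioning creates each month's partition automatically as data arrives):

```sql
-- Hypothetical fact table, range-partitioned on the load date with one
-- partition per month. Oracle adds new monthly partitions on demand.
CREATE TABLE sales_fact (
  sale_id    NUMBER,
  sale_date  DATE,
  amount     NUMBER
)
PARTITION BY RANGE (sale_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
( PARTITION p_initial VALUES LESS THAN (DATE '2012-01-01') );
```

With this layout, a monthly load touches one partition, and date-range reports can prune to the partitions they need.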

Thanks!!

and Tom said...

Will this message only appear in the alert log when we try to switch to a redo log that has not been flushed, or can there be other causes?

That is the cause of this message - yes. You are trying to reuse a redo log, but there are still some dirty blocks in the buffer cache protected by this log. Before we can reuse it - we need to checkpoint those blocks.
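One way to watch this happen (a sketch, using the standard V$LOG view) is to check log group status during a busy period - a group that stays ACTIVE cannot be reused until its checkpoint completes:

```sql
-- V$LOG status values:
--   CURRENT  = the group LGWR is writing to now
--   ACTIVE   = still needed for crash recovery; checkpoint not yet complete
--   INACTIVE = checkpoint complete; safe to reuse
-- If the group we need to advance into is still ACTIVE at switch time,
-- the "cannot allocate new log" message is written to the alert log.
SELECT group#, sequence#, status, archived
  FROM v$log
 ORDER BY sequence#;
```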


How much of an impact can this have, and can it cause noticeable slowdowns in our batch processing and also for users running reports in the database?

An AWR/ASH report would be good to give you numbers on that. It will register as a wait event - "log file switch (checkpoint incomplete)". You'll be able to see clearly how much time is spent waiting on this event by all sessions.
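To put a rough instance-wide number on it outside of AWR/ASH, a query like the following (a sketch against the standard V$SYSTEM_EVENT view) shows the cumulative time lost since instance startup:

```sql
-- Cumulative waits for the checkpoint-incomplete log switch event.
-- TIME_WAITED and AVERAGE_WAIT are in centiseconds.
SELECT event, total_waits, time_waited, average_wait
  FROM v$system_event
 WHERE event = 'log file switch (checkpoint incomplete)';
```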


Comments

What about Doc ID 435887.1?

Charlie B, August 17, 2012 - 9:39 am UTC

Tom -

Thanks for looking into this.

We've always forced log rolls every hour, as part of our disaster recovery planning. A cron script kicks off the SQL statement for all our production db's.
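For reference, the forced log roll such a cron job typically issues is one of these statements (which one depends on the DR setup - an assumption on my part):

```sql
-- Switch to the next online redo log group:
ALTER SYSTEM SWITCH LOGFILE;

-- Or, to switch and also wait for the current log to be archived:
ALTER SYSTEM ARCHIVE LOG CURRENT;
```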

Starting with 10.2, if my memory serves me right, we started getting these messages in most of our prod db's. We'd never seen them before the upgrade except during times of high db activity. Now they usually, but don't always, seem to occur in db's with low levels of activity.

ML doc id 435887.1 tells us that "this is expected behavior according to development." That's a shame, because we used to be able to ID db's with log performance issues in part due to the messages in the alert log. Now it's much harder: these "false positives" generate a lot of noise in the alert log.

Is there something we can do on the production side to prevent forced log rolls from generating these messages?
Tom Kyte
August 17, 2012 - 3:13 pm UTC

http://docs.oracle.com/cd/E11882_01/server.112/e25513/initparams009.htm#REFRN10003

would be a better way to do that. No cron junk.
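Assuming the link above points at the ARCHIVE_LAG_TARGET parameter (which bounds how long a log can go without being switched), the cron job collapses into a single setting:

```sql
-- Force a log switch at least once an hour (value is in seconds),
-- handled by the instance itself instead of an external cron job.
ALTER SYSTEM SET archive_lag_target = 3600 SCOPE = BOTH;
```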





This is expected behavior - and could have happened in ALL RELEASES. It happens simply because you are trying to advance into a redo log that is still protecting dirty blocks in the buffer cache.

you can use something like:

http://docs.oracle.com/cd/E11882_01/server.112/e25513/initparams088.htm#REFRN10058

or

http://docs.oracle.com/cd/E11882_01/server.112/e25513/initparams132.htm#REFRN10095

to make dbwr more aggressive, to keep those blocks out of the cache.
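Assuming those links point at checkpoint-tuning parameters such as FAST_START_MTTR_TARGET, a sketch of making DBWR write dirty blocks out sooner would be:

```sql
-- Target a crash-recovery time of 60 seconds. A lower value makes DBWR
-- checkpoint more aggressively, so fewer dirty blocks are left
-- protecting a redo log when it comes time to reuse it.
ALTER SYSTEM SET fast_start_mttr_target = 60 SCOPE = BOTH;
```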



But this is something that always happened - forever. It just means that when you tried to advance into the next redo log file - it needed to finish its checkpoint first.

I learned something new - thanks!

Charlie B, August 20, 2012 - 12:44 pm UTC


A reader, October 17, 2014 - 10:08 am UTC

The note 435887.1 explains everything, so I think Tom's answer is not accurate.

From my understanding, this has nothing to do with performance, and nothing to do with dirty data buffers not being flushed to disk. It's about part of the log buffer, or a private strand, not yet being flushed to the current redo log.


