Skip to Main Content


Question and Answer

Tom Kyte

Thanks for the question.

Asked: August 19, 2007 - 8:58 pm UTC

Last updated: March 25, 2013 - 9:40 am UTC


Viewed 10K+ times! This question is

You Asked

Hi ,Tom!
I read oracle doc ,and it said latch is a kind of thing which does wait for the lock.
But when I read statspack ,I found the latch free wait time on top 5,what deos it mean?

and Tom said...

you were waiting for a latch to become free - latch free.

Basically it means some session needed a latch (on the library cache for example to parse some SQL).

it tried to get the latch, but failed (because someone else had it). So, it goes to sleep (waits on a latch free) and wakes up and tries again. The time it was asleep - that is the wait time for "latch free"

Here is a snippet from Expert Oracle Database Architecture on this topic:

Latches are lightweight serialization devices used to coordinate multiuser access to shared data structures, objects, and files.
Latches are locks designed to be held for extremely short periods of time¿for example, the time it takes to modify an in-memory data structure. They are used to protect certain memory structures, such as the database block buffer cache or the library cache in the shared pool. Latches are typically requested internally in a 'willing to wait' mode. This means that if the latch is not available, the requesting session will sleep for a short period of time and retry the operation later. Other latches may be requested in an 'immediate' mode, which is similar in concept to a SELECT FOR UPDATE NOWAIT, meaning that the process will go do something else, such as try to grab an equivalent sibling latch that may be free, rather than sit and wait for this latch to become available. Since many requestors may be waiting for a latch at the same time, you may see some processes waiting longer than others. Latches are assigned rather randomly, based on the luck of the draw, if you will. Whichever session asks for a latch right after it was released will get it. There is no line of latch waiters¿just a mob of waiters constantly retrying.
Oracle uses atomic instructions like 'test and set' and 'compare and swap' for operating on latches. Since the instructions to set and free latches are atomic, the operating system itself guarantees that only one process gets to test and set the latch even though many processes may be going for it simultaneously. Since the instruction is only one instruction, it can be quite fast. Latches are held for short periods of time and provide a mechanism for cleanup in case a latch holder 'dies' abnormally while holding it. This cleanup process would be performed by PMON.
Enqueues, which we discussed earlier, are another, more sophisticated serialization device used when updating rows in a database table, for example. They differ from latches in that they allow the requestor to 'queue up' and wait for the resource. With a latch request, the requestor session is told right away whether or not it got the latch . With an enqueue lock, the requestor session will be blocked until it can actually attain it.
Note Using SELECT FOR UPDATE NOWAIT or WAIT [n], you can optionally decide not to wait for an enqueue lock if your session would be blocked , but if you do block and wait, you will wait in a queue.
As such, an enqueue is not as fast as a latch can be, but it does provided functionality over and above what a latch can offer. Enqueues may be obtained at various levels, so you can have many share locks and locks with various degrees of shareability.

Latch 'Spinning'

One thing I¿d like to drive home with regard to latches is this: latches are a type of lock, locks are serialization devices, and serialization devices inhibit scalability. If your goal is to construct an application that scales well in an Oracle environment, you must look for approaches and solutions that minimize the amount of latching you need to perform.
Even seemingly simple activities, such as parsing a SQL statement, acquire and release hundreds or thousands of latches on the library cache and related structures in the shared pool. If we have a latch, then someone else might be waiting for it. When we go to get a latch, we may well have to wait for it ourselves.
Waiting for a latch can be an expensive operation. If the latch is not available immediately and we are willing to wait for it, as we likely are most of the time, then on a multi-CPU machine our session will spin¿trying over and over, in a loop, to get the latch. The reasoning behind this is that context switching (i.e., getting 'kicked off' the CPU and having to get back on the CPU) is expensive. So, if the process cannot get a latch immediately, we¿ll stay on the CPU and try again immediately rather than just going to sleep, giving up the CPU, and trying later when we¿ll have to get scheduled back on the CPU. The hope is that the holder of the latch is busy processing on the other CPU (and since latches are designed to be held for very short periods of time, this is likely) and will give it up soon. If after spinning and constantly trying to get the latch, we still fail to obtain it, only then will our process sleep, or take itself off of the CPU, and let some other work take place. The pseudo-code for a latch get might look like this:
Attempt to get Latch
If Latch gotten
  return SUCCESS
  Misses on that Latch = Misses+1;
    Sleeps on Latch = Sleeps + 1
      For I in 1 .. 2000
        Attempt to get Latch
        If Latch gotten
          Return SUCCESS
        End if
      End loop
    Go to sleep for short period
  End loop
End if

The logic is to try to get the latch and, failing that, to increment the miss count¿a statistic we can see in a statspack report or by querying the V$LATCH view directly. Once the process misses, it will loop some number of times (an undocumented parameter controls the number of times and is typically set to 2,000), attempting to get the latch over and over. If one of these get attempts succeeds, then it returns and we continue processing. If they all fail, the process will go to sleep for a short duration of time, after incrementing the sleep count for that latch. Upon waking up, the process begins all over again. This implies that the cost of getting a latch is not just the 'test and set'-type operation that takes place, but can also be a considerable amount of CPU while we try to get the latch. Our system will appear to be very busy (with much CPU being consumed), but not much work is getting done.



  (19 ratings)

Is this answer out of date? If it is, please let us know via a Comment


Sleep count

Michel CADOT, September 07, 2007 - 2:06 am UTC

Is not the line
Sleeps on Latch = Sleeps + 1
after the spinning loop that just before or after the line
Go to sleep for short period
That is does it count the number of spins or the number of real sleeps (or maybe a spin is a "sleep")?


Tom Kyte
September 11, 2007 - 8:50 am UTC

you are correct, not sure how the sleep increment got before the loop


Attempt to get Latch
If Latch gotten
  return SUCCESS
  Misses on that Latch = Misses+1;
      For I in 1 .. 2000
        Attempt to get Latch
        If Latch gotten
          Return SUCCESS
        End if
      End loop
    Sleeps on Latch = Sleeps + 1
    Go to sleep for short period
  End loop
End if

Look out for....

Ian, September 07, 2007 - 4:32 am UTC

On 9i versions less than there is a nasty bug to look out for whereby the spin_count is multiplied by the number of CPUs. So on our 42 CPU system we were spinning 84,000 times instead of 2,000 times! Not good.

The workaround (once you have confirmed with Oracle that you are affected) is to set _spin_count to 2000/Number of CPUs.

See Note:4696143.8 on Metalink.

measuring cpu cycles consumed due to spin

Nilanjan Ray, March 20, 2008 - 7:10 pm UTC

Hi Tom,

I suspect that our system in affected by too much LIO on an inflated buffer cache and too much keep pool size. This results in too much spinning on cache buffer chain latch. The CPU utilization goes up with the run queue increasing.

Since most of the time the latch is obtained by spinning, nothing is visible in the wait interface but the high CPU utilization gives a wrong impression. The spin gets coincides well with the high CPU utilization and run queue being built up.

So far so good, I am convinced (may be wrongly). The question which I really want to ask is: Is it possible to prove with any oracle stats, that indeed the spinning is the cause of performance bottleneck. Any metrics or any derivation from metrics in Oracle 10g. This is crucial because a belief can be substantiated with proper data and proof.

Many thanks

clarification on sleep time

Sravan Tailam, March 25, 2009 - 10:15 am UTC

Tom, you mentioned :
it tried to get the latch, but failed (because someone else had it). So, it goes to sleep (waits on a latch free) and wakes up and tries again. The time it was asleep - that is the wait time for "latch free"

Can you please confirm if the time shown in statspack for a latch free wait is sleeping time or spinning time or both ?

In other words, if my Top 5 events is something like :

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 14,073 92.03
latch free 130,315 401 2.62
enqueue 5,085 275 1.80
db file scattered read 815,104 212 1.39
db file sequential read 1,168,259 73 .48


Is the 401 seconds the time spent in sleeping only?
If thats the case, how do I calculate the time spent in spinning? So, does the spinning time be rolled into the CPU time ?

Tom Kyte
March 29, 2009 - 8:20 pm UTC

that is sleeping only, goto the latch report in the same file and review the MISSES and SPINS.

You cannot calculate the time spent in spins (it is some part of the CPU time) only that you spun.

Krzysztof W¿asiuk, March 30, 2009 - 3:11 am UTC

In the above Statspack report header the number of latch waits is quite small in comparison to db block usage in my opinion. Are You sure there is nothing else witch consume CPU time? Some ugly query maybe?
Tom Kyte
March 30, 2009 - 4:17 pm UTC

the number of latch waits is the TIP OF THE ICEBERG.

In order to get to a latch wait, you spun in a hard loop 2,000 times trying to get the latch. Here is a simplified "latch algorithm", it looks much like this:

 for I in 1 .. 2000
  try to get latch
  if got latch, return
  if I = 1 then misses=misses+1
 end loop
End loop;

so, every miss = CPU time. If and only if you miss it 2000 times in a row will you wait. So, 103,000 times you waited - that means 103,000 times you spun 2000 times burning up cpu. The MISSES in the latch report will tell you how many times altogether (probably in the millions) you spun - but spun less than 2000 times.

Did I ever say "nothing else consumes cpu"?? I didn't say that at all.

I said "go to the latch report, review the number of MISSES and understand they all eat CPU"

of course other things are consuming CPU. That is a given.

the question was
If thats the case, how do I calculate the time spent in spinning? So, does the spinning time be
rolled into the CPU time ?

Latch Free on enqueue

Upendra Satluri, May 04, 2009 - 10:39 pm UTC


I dont know if this is a new question or relevant to the thread, but I see a lot of "latch free" waits in our system, and the P2 is enqueue. I kind of get stuck there. What do I check next in case it is an enqueue. I checked my statspack for the "sleepers", and this is what I find.

NoWait Waiter
Latch Name Where Misses Sleeps Sleeps
------------------------ -------------------------- ------- ---------- --------
enqueues ksqdel 0 2,267,369 ########
enqueues ksqgel: create enqueue 0 2,254,960 ########

I cant make anything out of this, but these 2 guys seem to be causing the bulk of the sleeps. Can you throw some light on this, or should this have gone to a different forum?

Tom Kyte
May 09, 2009 - 10:37 am UTC

enqueues are locks - row locks typically.

do you see large enqueue waits as well?

any unindexed foreign keys out there throwing full table locks "by accident"?

A number of sleeps is by itself "not useful", time periods (duration of report), information on activity - they are (needed in addition to a number)

Atomicity of "test and set"

Prajjwal Mallik, May 11, 2009 - 12:10 pm UTC

Hi Tom,
What a wonderful bit of information!

However, I have a silly question regarding this statement "Oracle uses atomic instructions like 'test and set' and 'compare and swap' for operating on latches"

As per my understanding, oracle when locks a set of rows (say) from one session, the lock information is stored in the data blocks(in the header) tagged with the transaction ID. Now suppose two sessions are simulateously requesting for a lock on an overlapping set of rows, the situation may be like this:

step1 ) session 1 >>> "test" row #5 (say) if it is locked by some other transaction. (At this moment, the row #5 is found not locked. Also, session 1 does not lock it now)
step2 ) session 2 >>> "test" row #5 (say) if it is locked by some other transaction. (But at this moment, the row #5 is not yet locked by session 1)
step3 ) session 1 >>> "set" row #5 as locked by the transaction in session 1
step4 ) session 2 >>> "set" row #5 as locked by the transaction in session 2 (It does not know session 1 has already locked the row #5. It only relies on the information from step 2)

My question is, with what construct does oracle ensure the serialization of the flow of steps to be
Step1 then Step 3 then Step 2 then Step 4
Step 2 then Step 4 then Step1 then Step 3 ?

I believe "test and set" as a whole has to be atomic. If in case this is the answer you give me, I have yet another question. :)

Digging down to the lowest level, test and set can never be executed in a single CPU instruction if I am not wrong. Then is the big question, how is "Test and set" atomic as a whole?

Tom Kyte
May 11, 2009 - 7:03 pm UTC

we do not use latches to lock rows. re-read the quote above about what latches are and what they are used for.

In order to access a row to modify it, you get the block the row is on in CURRENT MODE. Only one session can get a block in CURRENT MODE - it uses a latch to serialize access to the entire block, looks at the row it wants (it is the ONLY thing that can look at this row in CURRENT MODE), and if it is not already locked - it locks it (by updating the transaction information in the block header). If it is locked, it notes that. Either way, it immediately unlatches the block now (someone else can look at it in current mode). If it locked the row - it continues on. If it could not lock the row because someone else had it - it creates an ENQUEUE LOCK (heavy duty) on the session that has it locked - when that session commits - they get a chance to get at the row again.

... Digging down to the lowest level, test and set can never be executed in a
single CPU instruction if I am not wrong. ...

sure it can - that is what a test&set is - a CPU instruction, the CPU provides this for us (most do, HP's do not for example :( there we have to use heavier devices to serialize)

ther's a specific Oracle feature

A reader, May 11, 2009 - 12:16 pm UTC

try use
dbms_lock.sleep( 5 );

follow - up

A reader, May 11, 2009 - 12:33 pm UTC

The actual wait time for the "latch wait" is huge, and that was what got me interested. There seems to be a lot of conversions/releases and requests. The system may be just too busy. Tried looking for the individual Enqueue waits, but none were alarming. One of my collegues was thinking this may be because of a Remote transactions the process does, and how that effects the enqueues for the local transactions.
Here are the Top 5 from the awr:

Event Waits Time (s) (ms) Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file sequential read 42,005,780 232,838 6 73.8 User I/O
CPU time 124,672 39.5
latch free 11,592,952 76,746 7 24.3 Other
wait list latch free 107,553 2,088 19 0.7 Other
latch: library cache 1,135,976 1,862 2 0.6 Concurrenc

Here are the sleep numbers from the awr report:

Latch Name
Get Requests Misses Sleeps Spin Gets Sleep1 Sleep2 Sleep3
-------------- ----------- ----------- ---------- -------- -------- --------
164,661,022 67,359,830 11,487,083 ########## 0 0 0
library cache
390,477,596 14,128,161 1,136,176 ########## 0 0 0
library cache pin
260,126,896 6,223,864 831,838 5,398,122 0 0 0
cache buffers chains
1,262,613,116 2,602,885 429,898 2,175,706 0 0 0
transaction branch allocation
82,282,012 1,600,047 95,477 1,505,151 0 0 0

Would altering the spin_count help with the LATCH FREE waits? It would seem logical because now i will spin longer, and may not have to sleep, so will have lesser latch free waits.

Does it make sense to close the DB Link after doing a remote transaction, so the local transactions are treated differently, and not so manyh enqueues are needed?

Thanks for the reply, Tom.

Tom Kyte
May 11, 2009 - 7:08 pm UTC

... The actual wait time for the "latch wait" is huge ..

a) I had no way to see or verify that
b) no, it is not. not relative to IO waits anyway.

... Would altering the spin_count help with the LATCH FREE waits? ....

DO NOT EVEN CONSIDER THAT. Unless support tells you to - do not touch it, you can so completely destroy things with that.

Sure, you'll spin longer - but - what if your degree of concurrency is such that the latch holder is getting kicked off the cpu (they ran out of their time slice)... do you REALLY want the latch requesters to actually *spin longer* preventing the latch holder from getting onto the cpu again?!?!?!?!

... Does it make sense to close the DB Link after doing a remote transaction ...

no, if you don't access the link, it won't be a distributed transaction.

any unindexed foreign keys out there throwing full table locks "by accident"?

Follow - up - Latch Free.

Upendra Satluri, May 11, 2009 - 12:37 pm UTC

The previous post was from me. I guess i didnt fill in the details in a hurry.


Prajjwal Mallik, May 12, 2009 - 11:01 am UTC

Thanks a lot Tom for the clarifications. :)

Enqueue waits

Upendra Satluri, May 13, 2009 - 11:04 am UTC

Thanks Tom for making things totally clear.

... Does it make sense to close the DB Link after doing a remote transaction ...

no, if you don't access the link, it won't be a distributed transaction.

What I meant was, since the dblink remains open till the time the session remains open, does it make sense to close the DB Link with the alter session close database link command? Since there is a session open on the target, does oracle treat all statements coming from the master as transactions (including select statements, may be?). So, does it make sense to actually close the db link with the alter statement, so as to reduce the number of enqueue requests. Does it actually matter at all ?

And I woudnt touch the spin_count without getting the approval from oracle support anyways (I have a job to keep), just wanted an idea if I should even think about such a thing. I got my answer.

I guess I am diverting from the main topic of Latch Free, which was what this forum was for, so will not mind if you dont want to answer this, but will surely appreciate if you would.

Thanks again Tom. Very helpful, as always.
Tom Kyte
May 13, 2009 - 3:02 pm UTC

... does
oracle treat all statements coming from the master as transactions ...

no, it doesn't

you'd want to close the link if

a) you were not using shared database links
b) you wanted to reduce the resources consumed ON THE REMOTE site - each session would be taking up a bit of resource over there.

Thank you for the wonderful insight

Upendra Satluri, May 14, 2009 - 11:28 am UTC

Thanks Tom.
Got all my answers.

Also, may be because of the limitation you say with the HP Unix CPUs (where test and set is not a single CPU instruction), maybe, just maybe, enqueue latch is a solitary latch in HP (or is it a solitary latch everywhere, latch# 18). May be this is how oracle ensures serialization? OR am i totally confused?

Tom Kyte
May 14, 2009 - 12:09 pm UTC

no, it would not make us use a solitary latch (whether you have ONE or THOUSANDS - the logic of latching is "serialize the latch get" in general. Just having one latch would not 'fix' anything)

On hardware that don't have a test and set, we use something heavier like a semaphore - which the OS provides for serialization.

Thanks again - Tom

Upendra Satluri, May 14, 2009 - 2:33 pm UTC

Thanks Tom. A good learning experience.
Will keep checking the system to see if I find something, but most likely is just the way the application is designed. Management wants to run more than 200 application processes, and as soon as we increase the number to 150, I start seeing these "Latch Free" events and the total throughput starts reducing.
I was able to eliminate some of the "concurrency" waits by partitioning some indexes and increasing the pctfree for some of the index blocks, but cant seem to handle these "other" waits, as awr reports these.

But thanks Tom, very useful definitions and insights. At least now I know where NOT to look, which is important too.


Claus, May 15, 2009 - 7:39 am UTC

As discussed, the implementation of latches depends on the CPU architecture. HP-UX runs on two CPU architectures PA-RISC and Itanium. PA-RISC does not have an atomic test-and-set instruction, Itanium does.
Tom Kyte
May 15, 2009 - 1:47 pm UTC


More of latches on

Sidney Pryor, April 23, 2010 - 11:45 am UTC

Hi Tom,
Reading your answers is always a great way to know Oracle better.
I read this question from its begining and some other questions came up on my mind. We are starting to see some events of wait for latch on our databases. So here they go:
1)as latches are semaphores controling access to memory blocks, if we have waits for it, does it means that we have few memory blocks free, that we have high concurrency for some specific hot blocks, OR that we have slow DMLs operations?
2)are latches configuration an operational system (OS environment) issue or limitation?
3)what is the best way to find the source of latch waits?
4)are latches issues associated with application logic too and how can database help on these cases?
5)if it is possible, how can pct_free help to solve or decrease concurrency? I read something about this too but it was not very clear.
Thank you for your attention.
Tom Kyte
April 23, 2010 - 2:17 pm UTC

1) latches are locks - serialization devices - use to control concurrent access to in memory data structures that are shared by many thread.

It is not "memory blocks", it is shared data structures. Like the shared pool for example.

there are latches that protect MANY structures, not all are buffer cache related, in fact - more are not.

You'd have to see what the latch is for - say it was for the shared pool - well, tuning your buffer cache wouldn't do anything for you.

2) I don't know what you mean.

3) v$ views, statspack/awr reports, wait events in ASH or tkprof reports.

4) latches are used in response to the client asking us to do something, so yes, they are associated with how the application uses the database.

Say the application parses 100 queries per second
Say further, the application does use any binds

You'll hard parse like mad, you'll have shared pool and library cache latching like crazy and the only way to fix it would be to reduce the hard parsing - which is induced by the way the application was coded.

Say the application forces the use of indexes inappropriately - like this:

that'll latch on the cache buffer chain (buffer cache) latch like crazy due to the excessive logical IO's. It has to do with the way the application is coded.

5) pct free is used to avoid migrated rows - not concurrency so much. However, since pct free will limit the number of rows per block it can have an indirect affect on concurrency. But probably not latching issues - more like buffer busy wait events (too many people wanting to modify the same block at the same time - if you reduce the number of rows per block, it might be safe to assume that the number of concurrent accesses of that block would decrease as well)

Still about latches

Sidney Pryor, April 26, 2010 - 10:17 am UTC

First I want to thank you for answering all my questions above.
1) Understood, thank you. So i can assume that latches control (serialize) access to a group of memory blocks that together is a shared data structure? For example, when a session waits for "latch:cache buffer", this means that it is waiting for 1 serialization device specifically that controls access to some specific data, or is it waiting for 1 free serialization device to access the buffer cache area?
2) On this question, i was willing to know if there is some environment parameter that can be tuned on OS system in order to improve or decrease latch contention?
3) Ok, perfect, thank you;
4) Sorry, i am not sure about what you really meant with "the application does use any binds". Did you meant "Not use any binds"?
5) Perfect clear, thank you very much.
Best regards,
Tom Kyte
April 26, 2010 - 10:33 am UTC

1) shared data structures, do not think in "blocks", think arrays, linked lists, record structures - think pointers from C.

when waiting for example for the cache buffer chains (cbc) latch, it is waiting to access a list of database blocks that are in the cache - these blocks represent a subset of all the blocks in the cache.

2) not really - not an environment variable really. Latch contention happens because of too many concurrent requests for the same resource. The way to reduce that contention is to reduce the need for all of those concurrent requests.

EG: reduce hard parsing, that'll reduce the shared pool latch requests.
tune a query that is using indexes inappropriately, reduce the cbc latches requested.

4) yes, that does should have been doesn't

Latch Sleep Breakdown

Arvind Mishra, August 01, 2011 - 12:01 am UTC

Hi Tom,

1) In your book 'Expert Oracle Database Architecture', in chapter 6, Locking and Latching's page 225 while discussing latch sleep breakdown you wrote that perform two units of work we needed to use 3 units of CPU.. I am not sure how you reached on that conclusion? The data shown on the page is:

Latch Name Requests misses sleeps sleeps 1->3
shared pool 1,126,006 229,537 406 229135/398/4/0
library cache 1,108,039 43,582 7 45575/7/0/0

2) On page 226 of the same book you write " In our tests, because we have single user example, we conclude that about 22 CPU seconds was spent spinning on the latch" But elapsed time for single user is 0.52 and double user is 0.78. Therefore 78-52 = 24. Is this a misprint or I am comparing wrong data?

3) What is the meaning of data in column Sleeps 1->3+?

4) Where can I find full details about data contained in stats pack report?

5) What are the differences between statpack report and new AWR report?


Arvind Mishra
Tom Kyte
August 01, 2011 - 11:54 am UTC

1) Look at the CPU time used at the aggregate level. When two users ran, the total database cpu time was tripled. You cannot look at the latch report for that - look at the instance activity.

We used 26 CPU seconds for the single user, we used 74 CPU seconds for two users. Tripled cpu. Doubled work.

2) elapsed time is not meaningful when discussing the numbers we are discussing, we are looking at CPU TIME.

3) how many times someone slept ONCE for the latch, then twice, then three times, then three or more.

229135/398/4/0 => 229135 times we slept once, 398 times twice, 4 times thrice, and 0 times more than three time.

4) reference guide details many of the statistics. Most are obvious - the less obvious are documented in the reference guide available on and or the performance guide available on the same site.

5) statspack is sort of frozen in time, will not be enhanced necessarily, AWR and stats pack have a large amount of overlap - but will diverge over time as more and more is added to AWR.

Latch Issue in Oracl 10G

Santosh, March 15, 2013 - 5:31 am UTC

Hi Tom,
We are having latch issue in one of 10G live database,
there is a batch job which select details from v$latch ,but the select on v$latch getting hang and its consuming 100% CPU.
We checked the AWR report but latch event is not reported in the top 5 events.

Can you please help ??

Tom Kyte
March 25, 2013 - 9:40 am UTC

please utilize support for something like this.

and question why a batch job is selecting from v$latch - an in memory data structure which itself has to be protected by latches to be accessed. You do know that frequently accessing v$ tables is going to itself be a performance issue. I cannot imagine the reason a batch job would be selecting from this?