Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question.

Asked: July 13, 2006 - 9:57 am UTC

Last updated: July 14, 2006 - 1:01 pm UTC

Version: 9204

Viewed 1000+ times

You Asked

This is what "they" are telling me. If I go from a hit ratio on the buffer cache from 98% to 99%, I have inversely lowered by direct disk reads from 2% to 1%, thereby cutting them in half. 1 is 1/2 of 2. See the reasoning? Please help.

and Tom said...

cache hit ratio = 100 * (1- PIO/LIO)

And in order to do so, you either

a) decreased physical IO (PIO) - in which case this is good (Logical - LIO stayed the same, you decreased physical IO). Say you went from

pio = 20,000, lio = 1,000,000 -> hit ratio = 98%
pio = 10,000, lio = 1,000,000 -> hit ratio = 99%

I agree, good. But I'd be looking for the decrease in pio and the constant LIO reading - not a cache hit move

b) did not decrease physical IO - you INCREASED LIO, this is bad.

pio = 20,000, lio = 1,000,000 -> hit ratio = 98%
pio = 20,000, lio = 2,000,000 -> hit ratio = 99%

I'd be really upset at you right now.

c) increased both PIO and LIO

pio = 20,000, lio = 1,000,000 -> hit ratio = 98%
pio = 40,000, lio = 4,000,000 -> hit ratio = 99%

now I'd be really upset - more than before.

I don't see any way to say "by increasing the hit ratio, we decreased PIO". PIO could

a) go up
b) go down
c) stay the same!!!

Want a really good hit ratio, do what this developer might want to do (use rule hint to make everything INDEX HAPPY)
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894#6760861174154 <code>

we went from a 50% hit: pio = 23380 lio = 59350
to a 98% hit: pio = 1154555 lio = 121367981

for the same query - affecting their desired "high hit ratio", as the expense of overall performance.


What you want to do is NOT get the buffer cache hit ratio to "N", you want to do less work. When you are doing the least amount of work you can - you know now the hit ratio for your database - which will vary widely (warehouse, reporting, transactional - all different).


The hit ratio is like a finger print, it is a metric. When the metric moves - something has changed. It moving from 98 to 99% could mean one of three things:

a) things are better
b) things are much worse
c) things haven't really changed at all as far as anyone is concerned.

So, moving the metric - big deal.


Doing LESS WORK - priceless.

Do you see the flaw in the reasoning now?

It is not "a => b" therefore "b => a" situation.


it is true that if you lower disk reads from 2% to 1% of the hit ratio, that the hit ratio will go from 98% to 99%.

It is NOT true that if you increase the hit ratio from 98% to 99% that the physical IO will have had to of gone down. You might have introduced a wickedly BAD query into the system (see link) and achieved this "goal"

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Another question about PIO and LIO

John, July 14, 2006 - 11:58 am UTC

Tom,
As always, great info. I have a followup question. What should I be doing (if anything) to monitor the LIO and PIO values? At this moment ours is PIO=192178 and LIO=2796469. Should I just be looking for little or no change over time for both values? Our system response time seems to be adequate at the moment.

Tom Kyte
July 14, 2006 - 1:01 pm UTC

the values likely will vary over time by some amount - but you can use this to "trend" out what your system fingerprint looks like.

When you try to figure out two weeks from now - "what changed", you'll be so happy to have a history so you can look for wide variations.

Recently - working on a system that just "went to pieces". Totally to pieces. Became unworkable. Just was "stopped" basically.

First question: "what changed"
Answer is course: "nothing"

fortunately, we had history (had the answer "nothing" from them more than once in the past - DBA now had history for the last month on hand all of the time).

We started hypothesizing what could be wrong.

Maybe an increase in transactions per second was causing this massive contention... Nope: history proved we were doing same or less work when the problem started. RULED OUT.

Maybe we are doing more hard parsing than we used to, that could have put us over a tipping point. Nope: history proved we were doing same or less hard parsing. RULED OUT.

Maybe we were doing more parsing.... (you get the gist)


We ruled out many things - and fortunately as we were ruling them out one by one, the root cause (a small system change, some auditing parameters in the OS) was discovered. That was our "tipping point", put the system over the edge.

Without the history, we would have just been sitting going "well, it could be this, it could be that, it could be this other thing" and having to set up some tests to see. With the history, we were able to preclude things from consideration rapidly.

With the history - we might even have gotten lucky - one of the questions/hypothesis was "lots of extra IO or IO times have gone way up". history ruled that out.

Very good advice

John, July 17, 2006 - 5:17 pm UTC

thanks Tom.