Skip to Main Content

Breadcrumb

more

Connor and Chris don't just spend all day on AskTOM. You can also catch regular content via Connor's blog and Chris's blog. Or if video is more your thing, check out Connor's latest video and Chris's latest video from their Youtube channels. And of course, keep up to date with AskTOM via the official twitter account.

Question and Answer

Tom Kyte

Thanks for the question, Vijay.

Asked: May 30, 2000 - 3:31 am UTC

Answered by: Tom Kyte - Last updated: May 29, 2013 - 4:53 pm UTC

Category: Database - Version: 8i

Viewed 100K+ times! This question is

Whilst you are here, check out some content from the AskTom team: Table Functions, Part 4: Streaming table functions

You Asked

Hi,
Is it enough if we use
ALTER TABLE <table_name> CACHE; to push the table to cache ?.
Do I need to do anything else to have the table in the Memory ?

Thx, VJ

and we said...

That command simply makes a notation in the data dictionary that blocks from this table should be handled differently when they get into the cache. Usually, when we full scan an object, we put the blocks onto the least recently end of the list. These blocks are candidates for "aging" from the buffer cache. By altering the table to 'cache', we put the blocks onto the most recently used end -- making them less prone to being aged out of the buffer cache.

From the sql reference manual we see:

...
for data that is accessed frequently, specifies that the blocks retrieved for this table are placed at the most recently used end of the LRU list in the buffer cache when a full table
scan is performed. This attribute is useful for small lookup tables.
...

So, no -- issueing the alter table ... cache command does not put the table into memory, you must full scan the table for that to happen.

Consider looking into setting up multiple buffer pools and associating this table with its own buffer pool if you are dead serious about caching it. Bufrer pools are

<quote tuning manual>
Schema objects are referenced with varying usage patterns; therefore, their cache behavior may be quite different. Multiple buffer pools enable you to address these differences. You can use a KEEP buffer pool to maintain objects in the buffer cache
and a RECYCLE buffer pool to prevent objects from consuming unnecessary space in the cache. When an object is allocated to a cache, all blocks from that object are placed in that cache. Oracle maintains a DEFAULT buffer pool for objects that have
not been assigned to one of the buffer pools.
</quote>

Bear in mind tho... there is really no true way to a have a purely "in memory" table -- even cached tables are subject to aging from the buffer cache.

and you rated our response

  (80 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Finding which tables to cache

October 27, 2003 - 6:45 pm UTC

Reviewer: A reader

Hi Tom,

What is your approach to find which tables to cache in SGA?

In a properly tuned system, this caching should be unnecessary and won't give much benefit, is it correct?

Thanks,

Tom Kyte

Followup  

October 27, 2003 - 6:58 pm UTC


you normally should not ever need to use this, it only comes into play when you full scan a table that exceeds the short table threshold -- causing us to cache it differently.

it is rare to do that.
you would know you are doing it.

i've used it once in my experience.

How about alter table ... storage (buffer_pool keep)?

October 27, 2003 - 7:33 pm UTC

Reviewer: A reader

Thanks Tom.

Do you mean that I should use "alter table ... storage (buffer_pool keep)" instead of "alter table ... cache", for small frequently-accessed look-up tables?

Tom Kyte

Followup  

October 27, 2003 - 7:52 pm UTC

not really -- they were always "fully cachable" as is.


I ascribe to the 99/1 rule (variation on the 80/20 rule). the buffer cache is fairly efficient. If it was meant to be cached, and the memory exists, it will be cached. small, frequently accessed lookup tables would be in the cache by "popular demand" 99% of the time already.

October 27, 2003 - 11:36 pm UTC

Reviewer: Dave from Colorado

One of the situations for getting involved in setting the buffer pool attribute might be in a partitioned fact table -- you might choose to set the most recent few partitions to "keep", and the older ones to "recycle", thus helping to prevent a more rare scan of older data from spoiling the day of the majority of users who access recent data.

what about large tables

October 28, 2003 - 4:24 am UTC

Reviewer: Duncan from UK

From what was said above am I to take it that putting large tables in the buffer keep pool is going to be pointless because if they take up most of the pool then they'll get aged out as soon as a new object requires the space in memory to be cached?

Tom Kyte

Followup  

October 28, 2003 - 8:01 am UTC

no?

that is not what it means.


blocks are aged out of buffer pools based on how often they are used. So, if you put large tables in a keep pool AND you use their blocks lots (more then other blocks), they'll stay in there.

What about Performance

October 28, 2003 - 9:20 am UTC

Reviewer: A reader from Santo Domingo, Dominican Republic

Is there any sample about the performance impact of caching tables.

Tom Kyte

Followup  

October 28, 2003 - 10:21 am UTC

blocks are always cached.

the alter table cache command is somewhat of a misnomer. it simply changes the way a full scan of a LARGE table would be treated in the buffer cache. It doesn't "cache a table".

Performance is relative.

alter table cache can kill performance.
alter table cache can enhance positively performance.
alter table cache can have zero effect on performance.

it is a tool, that can sometimes - in rare cases (eg: example posted by another reader -- Dave -- above) be used.

short table threshold ?

February 16, 2004 - 1:53 pm UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

Tom,

You mention the "short table trheshold" what is it ? According to Oracle documentations it should be 5 database blocks.
However I have a table that uses 935 blocks, and is 37 000 rows, and when I full scan it the "table scans (short tables)" statistic is incremented.

What are your observations?


Thanks

February 16, 2004 - 2:22 pm UTC

Reviewer: Christo Kutrovsky from Ottawa, ON Canada

Thanks.



Excellent Exchange

April 26, 2004 - 10:52 am UTC

Reviewer: Kyle from VA, USA

Thanks for posting this!

Cache a table value to be viewed by anyone

May 17, 2004 - 5:12 pm UTC

Reviewer: A reader

Hi Tom, This is a question for you :)

I have a value that is in a table, but this value is not going to change in all database life.
I can keep the table in memory but even then, Oracle will have block gets to get that value.
I can use a package begiend, a better solution, but per session there will be a read to the table in memory.

My question is can I put a value in memory for public access. So I read the value at database startup and from then it is read from memory from all session?

Thanks Tom


Tom Kyte

Followup  

May 18, 2004 - 7:39 am UTC

that package variable is the solution you are looking for. You are over analyzing the situation. the "read to the table in memory" isn't anything to be concerned about.

Thanks Tom

May 18, 2004 - 9:40 am UTC

Reviewer: A reader


difference between alter table.....cache and buffer_pool_keep(...)

December 14, 2004 - 5:41 am UTC

Reviewer: Anurag from INDIA

Hi Tom,

I am not clear about the difference between putting table to cache with alter
table ....cache command and using execute ....keep('table')

Can you please explain the difference actually what happens during both?

best regards,

Anurag




Tom Kyte

Followup  

December 14, 2004 - 9:05 am UTC

what is "execute ....keep('table')" exactly?


if you mean altering the table to be in the keep pool, then read on.

alter table cache's behavior is described above, for the original answer.


putting a table into the keep pool just changes the buffer cache where the blocks are cached. Instead of the blocks going into the default buffer pool -- they go into the keep buffer pool. Otherwise, the behaviour is precisely the same as just a "normal table". All it does is move where the blocks are placed -- the keep pool instead of the default pool.

more about keep pool

December 15, 2004 - 2:12 am UTC

Reviewer: Anurag from INDIA

Thanks Tom,

Well, I may be cloudy at my concepts. How the block of same table behaves if it is kept in keep pool rather cached. Or is it the same behaviour wise. Can you please illustrate this.



Tom Kyte

Followup  

December 15, 2004 - 1:38 pm UTC

when you full scan a long table (bigger than say DUAL) blocks resulting from that full scan are eligible for aging out of the buffer cache right away.

This prevents a single big full scan from WIPING OUT your buffer cache. When the full scan needs more space in the buffer cache for the blocks it is reading, the ones that get pushed out are the ones the full scan just put in there -- instead of all of the other data.

If you alter table t CACHE, you simply change that behaviour. The blocks read from that long full table scan are not subject to that behaviour. they WILL not be the first ones "out". So, a big full scan (say by accident) would in effect flush your buffer cache if the table was "cache".


With a keep pool -- an orthogonal concept from CACHE -- the two have really not much to do with eachother at all -- you are saying "when you do put a block from this table into the buffer cache, please put it over here, in this keep pool -- NOT in the default cache"



So, if a table is set to point to a keep pool, it behaves NOT ANY DIFFERENTLY than a table not pointing to the keep pool, it is just that its blocks will be put in the keep buffer pool instead of the default.


Don't even try to compare these two things, they have not much to do with eachother at all.

CACHE -- where in the "LRU" (conceptually speaking) the block read in from disk goes in the buffer cache

KEEP/RECYCLE/DEFAULT -- three buffer pools you may optionally set up.


December 15, 2004 - 7:51 am UTC

Reviewer: Shimon Tourgeman


named caches?

January 11, 2005 - 8:53 am UTC

Reviewer: A reader

hi tom,

why doesn't oracle provide named caches? eg. if there are say 10 applications in the database wouldn't it be more efficient for each application to have also 10 seperate caches? the lio / pio behavior & impacts would be per application. they wouldn't influence each other.

regards,
max


Tom Kyte

Followup  

January 11, 2005 - 10:49 am UTC

No, in my experience what you would achieve is "gotta buy tons more ram in order to do this since each application refuses to share what they have with anyone else"


there are up to 7 'caches' now, more than two (actually, in most all cases more than one) seems overkill.

look up table in cache memory

January 30, 2005 - 6:28 pm UTC

Reviewer: prathima from USA


Cache VS. Keep Pool

February 06, 2005 - 2:34 pm UTC

Reviewer: Randall from CA USA

Tom;

Recently you recommended considering setting up a "keep" pool. It is my understanding that there are no keep or recycle pools for alternate buffer caches, only for the default block size cache. Therefore designating a table as "keep" will break if the table is moved to an alternate block size tablespace later for tuning purposes. However, creating it as a CACHE type table should work regardless of the block size. At least that is what I am telling my students these days. What do you think?


Tom Kyte

Followup  

February 07, 2005 - 4:01 am UTC

where did I recomend that?


but the keep/recycle pools are just "alternate" buffer caches -- like a different sized block one would be... eg: another way to have a "keep pool" would be to create a non-standard blocksize tablespace and move the object into that. You achieve basically the same effect.

February 08, 2005 - 3:39 pm UTC

Reviewer: A reader

hi Tom,

Table UBER_INTERVAL currently has only 146 blocks and since it is a small table, Oracle is doing a FTS for a query on this table, even though the same can be achieved in fewer 'consistent gets' by using a bitmap index.

FTS consistent gets : 152
bitmap index consistent gets : 4

My question : How do I artificially make this table NOT to belong to this 'small table' category ?

Even after setting the 'number of blocks' to 10000 using the below dbms_stats, there is no effect(it is still using a FTS, for the query) and the consistent read is still 151 in sql_trace, not sure why the dbms_Stats is not having any effect

begin
dbms_Stats.set_table_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',numblks=>10000);
end;

SELECT uber_interval
FROM uber_interval_c a
WHERE interval_type_code = :"SYS_B_0"

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.06 0.15 0 304 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.08 0.29 0 151 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.14 0.45 0 455 0 2

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 20

Rows Row Source Operation
------- ---------------------------------------------------
2 TABLE ACCESS FULL OBJ#(1837479) (cr=151 r=0 w=0 time=299577 us)

db version 9203
db_cache_size big integer 738197504

I guess there should be some undocumented parameter for setting this small table threshold ? Why is the dbms_Stats.set_table_stats not having any effect( as shown in sql_trace ?


thanks
Anto


Tom Kyte

Followup  

February 09, 2005 - 1:46 am UTC

do you care to share a test case (create table, insert into table select..., create index, dbms_stats.gather, etc)

just like I would.


and why are you using cursor_sharing -- seems like you have a bug in the code that needs fixing soon so you can turn that OFF. (this is probably more along the lines of "over binding is my problem" rather than stats. oracle will not full scan a table regardless of the size if the cardinality to be retrieved is small -- so, i'm thinking this is "cursor sharing" -- NOT "small table threshold"


set table stats, i don't see how you can say "as shown in sql_trace", i see nothing in there that would indicate anything?



February 08, 2005 - 3:42 pm UTC

Reviewer: A reader

corrrection :

the db version is 9204 not 9203

and I did a flush of shared_pool before the 2nd run(after setting dbms_Stats)

February 09, 2005 - 10:20 am UTC

Reviewer: A reader

Hi Tom,

You are right, after setting the following (cardinality for the column in where condition)

begin
dbms_Stats.set_column_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',colname=>'INTERVAL_TYPE_CODE',distcnt=>1000);
end;

the optimizer did use the bitmap index instead of going for a full table scan(FTS). Here are some other things I noted

a) Cursor_sharing(whether it was the default (exact) or Similar) - did not make any difference to the above ( I did flush the shared pool each time, otherwise the cursor_sharing was not having any effect)

b) I had to increase both the numblks(for table) as well as distcnt( for the column in the where condition), for the optimizer to choose the index instead of FTS. So I assume the 'Small table threshold' was actually coming into play here. Is my assumption right ?

For my previous post, it should have been tkprof not sql_trace. Also my question(why tkprof was still showing 151 for cr- consistent reads, even after setting table_stats) was stupid, since sql_trace or tkprof always shows the actual values irrespective of the table or column stats. Sorry about that

Thanks for your help

Anto

Tom Kyte

Followup  

February 09, 2005 - 2:43 pm UTC

so basically, the optimizer just didn't have "correct data"

give it the right stuff and it works. give it incomplete or stale stuff and it doesn't


small table threshold is not coming into play.

cardinalities, rows retrieved, amount of data to process - they are.

small table threshold has to do with how the blocks are placed into the cache.

February 09, 2005 - 10:32 am UTC

Reviewer: A reader

And raising question in your site is far far better than raising TARs with oracle - both from the point of view of response time as well as for getting the workarounds/usefulness.

Metalink.oracle.com is really useful,but not raising TARs with oracle, maybe it depends on the type of support our client is having with oracle.

thanks
Anto

February 09, 2005 - 3:19 pm UTC

Reviewer: A reader

I was under the impression that if the number of blocks in a table is below some threshold value, Oracle will always do a FTS, irrespective of statistics, even if index access might be cheaper.

Anto



Tom Kyte

Followup  

February 09, 2005 - 3:40 pm UTC

FALSE.. bzzt.

provably false:

ops$tkyte@ORA9IR2> create table t ( x int primary key, y int );
 
Table created.
  
ops$tkyte@ORA9IR2> insert into t values ( 1, 1 );
 
1 row created.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>TRUE );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> set linesize 121
ops$tkyte@ORA9IR2> select * from t where x = 1;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=6)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=6)
   2    1     INDEX (UNIQUE SCAN) OF 'SYS_C005654' (UNIQUE)
 
 


it has never been true... 

February 09, 2005 - 5:02 pm UTC

Reviewer: A reader

Thanks, Tom for the clear example.

Not sure how that wrong notion came into my head



Why index scan ?

October 01, 2005 - 4:34 am UTC

Reviewer: A Reader from India

Hi,

Why index scan is cheaper for 1 record from 1 block table.


Thanks

Tom Kyte

Followup  

October 01, 2005 - 9:03 pm UTC

cheaper than what?

Re:

October 02, 2005 - 3:34 am UTC

Reviewer: A Reader from INIDA

Cheaper than Full Table Scan.

Tom Kyte

Followup  

October 02, 2005 - 10:51 am UTC

test it and see.

tablespace ASSM is auto segment space managed
tablespace MSSM is manual segment space managed

Less LIO's with the index - we can read the single index block, the single table block. 

With ASSM we read all blocks below the high water mark which is always advanced "high" (part of the design) as well as blocks that represent the extent map.

With MSSM we read all of the blocks below the high water mark as well as blocks that represent the extent map.


ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace assm;
                                                                                                        
Table created.
                                                                                                        
ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );
                                                                                                        
1 row created.
                                                                                                        
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly statistics
ops$tkyte@ORA9IR2> select /*+ INDEX( t t_pk ) */ * from t where x = 1;
                                                                                                        
                                                                                                        
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
                                                                                                        
ops$tkyte@ORA9IR2> select /*+ FULL(t) */ * from t where x = 1;
                                                                                                        
                                                                                                        
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop table t;
 
Table dropped.
 
ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace mssm;
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( 1, 2 );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace traceonly statistics
ops$tkyte@ORA9IR2> select /*+ INDEX( t t_pk ) */ * from t where x = 1;
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> select /*+ FULL(t) */ * from t where x = 1;
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        424  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
 

Thanks

October 03, 2005 - 2:42 am UTC

Reviewer: A Reader from India


used buffer count

October 03, 2005 - 9:22 am UTC

Reviewer: Markus from Austria

dear tom,

you mentioned a few comments before:
"When the full scan needs more space in the buffer cache for the blocks it is reading, the ones that get pushed out are the ones the full scan just put in there --
instead of all of the other data."

this sounds like the fts reuses "his own" buffers. say our buffer cache is 10000 blocks in size. say we are starting a fts over a table 3000 blocks in size. what happens? how many of the original 10000 block buffers will be effectivly aged out and used by this fts?


Tom Kyte

Followup  

October 03, 2005 - 11:18 am UTC

depends on how full the cache is to begin with and what else is going on.

No fixed "fast rule"

ad "used buffer count"

October 04, 2005 - 5:40 am UTC

Reviewer: Markus from Austria

dear tom,

thanks for your answer. could you please explain how a fts decides to reuse "his own" buffers?

Tom Kyte

Followup  

October 04, 2005 - 3:51 pm UTC

it is based on the way they are added to the cache, they go out at the ends, not in the middle - so when a buffer needs to be reused - they look like the oldest buffers.

ad "used buffer count"

October 05, 2005 - 5:58 am UTC

Reviewer: Markus from Austria

tom,

i thought about that. it think the fts is done "db_multiblock_read_count"-wise:

1. the session searches for free buffers
2. the free buffers are pinned
3. one time "db_multiblock_read_count" datablocks are read into this buffers
4. some operations occur on this data
5. the buffers are marked free
6. the session searches for free buffers (finding the ones recently used by itself)
7. ...

it that the way it works? or am i totally wrong?

Tom Kyte

Followup  

October 05, 2005 - 7:43 am UTC

not really, think of it like this


when we full scan a large table (we have thresholds to define "large" as a percent of the buffer cache), we'll put the blocks on the "OLD" end of the list, instead of the "NEW". OLD blocks age out faster. Since these new, yet marked as OLD blocks are on the OLD end of the list, they will be the first to go when we need more space. They will not kick the "NEW" blocks off of the list....

ad "used buffer count"

October 05, 2005 - 10:45 am UTC

Reviewer: Markus from Austria

hi tom,

thanks for your answer but you misunderstood. my question was and is still related to your reply found above "...the ones that get pushed out are the ones the full scan just put in there...". let me try to refine.

i know that the fts places its blocks at the cold end of the lru list per definition. but how can it happen that the same fts which places its blocks into buffers at the cold end of the lru list can REUSE these block buffers recently used by ITSELF? the fts is aging out blocks read-in by itself!?

that's why i was thinking the data is read "db_multiblock_read_count" wise. because it has to be read in chunks. the fts would not be able to reuse buffers recently used by its own otherwise.

please, could you explain on this, especially this "chunk" algorithm?





Tom Kyte

Followup  

October 05, 2005 - 11:44 am UTC

you have a buffer cache....

you are full scanning....

In order to do so, you need a free block. So, where do you get it? From the "cold end" as you say. You use it, you put it on the "cold end" (large table full table scan....)

Later you need another block - where do you get it? from the cold end - you know, right where you put the last stuff....


The fts is aging out blocks read in by itself - precisely.

so what if it is read in in chunks - you read 8 blocks - what do you need? You need 8 free blocks - go bump out others you already read and processed.

still "used buffer count"

October 06, 2005 - 5:16 am UTC

Reviewer: Markus from Austria

yes, yes, yes, come on. could you please go into depth on that?

what i mean is that there must be an algorithm like "read in a chunk of blocks" - "process this chunk of blocks" - "read in another chunk of blocks" (LEADING TO A REUSE OF THE BLOCK BUFFERS RECENTLY USED BY ITSELF) - "process this chunk of blocks" - "read in ..."

a fts is done with multiblock i/o not single block i/o per definition. that's why i thought it might be done "db_multiblock_read_count" wise.

who decides how many blocks are read in? how many blocks are read in per chunk? is it "db_multiblock_read_count", is it a threshold, is it a formular? because only this "chunk wise thing" leads to the reuse of block buffers recently used by itself. could you please explain this "chunk wise thing" algorithm?

Tom Kyte

Followup  

October 06, 2005 - 7:51 am UTC

no no no no no, because - it frankly doesn't really matter to us.

All *we* need to understand is that when we full scan - we won't wipe out the buffer cache. that is it pretty much.

The internals - they change. We've used the term "lru" to conceptually describe the process, well, it isn't an LRU, hasn't been since 8i, it uses a touch count.


We can conceptually understand what happens, the actual mechanics, not relevant and no I won't go further into it.


db_file_multiblock_read_count is documented to be the deciding that that controls the maximum size of a read, yes. That is what it does.

there is no true "chunk wise thing", all things are done at the block level in the buffer cache - a multi-block IO leads to a wait for "db file SCATTERED read" meaning the blocks are SCATTERED in the buffer cache (we hash their DBA's data block addresses to figure out what LIST of cached blocks - of which there are many - to put them on. They are not treated as a "chunk", they do not travel together).


If you want to understand some of the mechanics of the cache, how Oracle uses memory and such - I did write about that in some depth in my latest book (and in expert one on one Oracle - but things change, that one only covers up to 8i, the new one up to 10g)

ad "used buffer count"

October 06, 2005 - 10:14 am UTC

Reviewer: Markus from Austria

hi tom,

ok, i will take a look at your book "expert one on one oracle", first edition, i have. should i simply search for "cache management" or is there a special page number i should look at?

by the way, it would be hot - i think - if we could redirect blocks read in by a fts to the recycle buffer "on the fly"(*) using a hint. eg. "select /*+ DB_RECYCLE_CACHE(table_name) FULL(table_name) */...". might be a nice option for hybrid systems.

(*) NO, i don't want to change the definition "on the fly" using dynamic sql. ;)

Tom Kyte

Followup  

October 06, 2005 - 11:58 am UTC

why? we'd never be able to FIND the darn things again if you told us dynamically what cache to put them in.

that, and it is not necessary, since, well, the algorithm already says "don't flush out other stuff - flush out the full scanned blocks first anyway"


Suggest reading the first 1/3 of Expert :) it is sort of my version of the concepts guide. Buffer cache stuff is sprinkled throughout.

Relation of KEEP pool & large table sizes

November 09, 2005 - 11:23 am UTC

Reviewer: A reader

Many performance manuals indicate that if a "large" table is scanned/used frequently, it may benefit to configure a KEEP buffer. I understand as you say "if you put large tables in a keep pool AND you use their blocks lots (more then other blocks), they'll stay in there." Assuming the size of a particular table is larger than the KEEP pool itself, what impact does this impose? Is there still a benefit to utilize the KEEP pool?

Tom Kyte

Followup  

November 11, 2005 - 10:12 am UTC

the caching of the blocks depends on how the blocks are retrieved. a large table read via single block IO (index reads) will be cached one way (the blocks will not age out really fast).


I would not suggest the arbitrary use of the KEEP pool unless you had identified a problem (IO wise) that could be fixed by using it.

Have you?

FTS of large table

November 12, 2005 - 8:24 pm UTC

Reviewer: A reader

I was just curious from a theoretical perspective. Assuming the blocks are retrieved via a full table scan and the table itself is much larger than the size of the KEEP pool. Would it really benefit to configure the KEEP pool for this table?

Tom Kyte

Followup  

November 13, 2005 - 10:06 am UTC

yes, no, maybe, it depends.


Do the blocks need to be kept cached? should they be cached? are you using parallel query (more common with large full table scans) - then the cache doesn't really matter (direct io).


some of the blocks would be found in the buffer cache after the full scan. so, some of the blocks may be available for other queries.

In general, I'd rather have a reason for using the non-default pools, rather than hypothesize all of the possible reasons you may or may not have for using it.

Here is something I've written on this recently in Expert Oracle Database Architecture:

<quote>
Block Buffer Cache

So far, we have looked at relatively small components of the SGA. Now we are going to look at one that is possibly huge in size. The block buffer cache is where Oracle stores database blocks before writing them to disk and after reading them in from disk. This is a crucial area of the SGA for us. Make it too small and our queries will take forever to run. Make it too big and we’ll starve other processes (e.g., we won’t leave enough room for a dedicated server to create its PGA, and we won’t even get started).

In earlier releases of Oracle, there was a single block buffer cache, and all blocks from any segment went into this single area. Starting with Oracle 8.0, we had three places to store cached blocks from individual segments in the SGA:

* Default pool: The location where all segment blocks are normally cached. This is the original—and previously only—buffer pool.

* Keep pool: An alternate buffer pool where by convention you would assign segments that were accessed fairly frequently, but still got aged out of the default buffer pool due to other segments needing space.

* Recycle pool: An alternate buffer pool where by convention you would assign large segments that you access very randomly, and which would therefore cause excessive buffer flushing but would offer no benefit because by the time you wanted the block again it would have been aged out of the cache. You would separate these segments out from the segments in the default and keep pools so that they would not cause those blocks to age out of the cache.


Note that in the keep and recycle pool descriptions I used the phrase “by convention.” There is nothing in place to ensure that you use neither the keep pool nor the recycle pool in the fashion described. In fact, the three pools manage blocks in a mostly identical fashion; they do not have radically different algorithms for aging or caching blocks. The goal here was to give the DBA the ability to segregate segments to hot, warm, and do not care to cache areas. The theory was that objects in the default pool would be hot enough (i.e., used enough) to warrant staying in the cache all by themselves. The cache would keep them in memory since they were very popular blocks. You might have had some segments that were fairly popular, but not really hot; these would be considered the warm blocks. These segments' blocks could get flushed from the cache to make room for some blocks you used infrequently (the “do not care to cache” blocks). To keep these warm segments blocks cached, you could do one of the following:

* Assign these segments to the keep pool, in an attempt to let the warm blocks stay in the buffer cache longer.

* Assign the “do not care to cache” segments to the recycle pool, keeping the recycle pool fairly small so as to let the blocks come into the cache and leave the cache rapidly (decrease the overhead of managing them all).

This increased the management work the DBA had to perform, as there were three caches to think about, size, and assign objects to. Remember also that there is no sharing between them, so if the keep pool has lots of unused space, it won’t give it to the overworked default or recycle pool. All in all, these pools were generally regarded a very fine, low-level tuning device, only to be used after most all other tuning alternatives had been looked at (if I could rewrite a query to do one-tenth the I/O rather then set up multiple buffer pools, that would be my choice!).

Starting in Oracle9i, the DBA had up to four more optional caches, the db_Nk_caches, to consider in addition to the default, keep, and recycle pools. These caches were added in support of multiple blocksizes in the database. Prior to Oracle9i, a database would have a single blocksize (typically 2KB, 4KB, 8KB, 16KB, or 32KB). Starting with Oracle9i, a database can have a default blocksize, which is the size of the blocks stored in the default, keep, or recycle pool, as well as up to four nondefault blocksizes, as explained in Chapter 3.
The blocks in these buffer caches are managed in the same way as the blocks in the original default pool—there are no special algorithm changes for them either. Let’s now move on to cover how the blocks are managed in these pools.
</quote>

Is this a compelling reason for using seperate caches?

December 16, 2005 - 6:16 am UTC

Reviewer: Jack Douglas from Maidenhead, UK

Hi Tom,

We have a large table (95% of the whole database) that is often queried via an index. The problem we have is that even though we access only a very small percentage of the table in a query, it is still a large number of blocks relative to the buffer size.

Because we are not doing a full scan am I right that potentially the buffer cache will be flushed by these queries?

Is it true that the small table threshold or 'cache' setting will not prevent this happening to any of the blocks currently in the buffer?

If so, does that make this an immediate candidate for a 'recycle' pool?

Thanks for your input,
Jack

Tom Kyte

Followup  

December 16, 2005 - 8:45 am UTC

small table thresholds/cache settings are relevant for full scans.


nothing makes anything an immediate candidate for the recycle pool. Have you diagnosed a physical IO problem on your system regarding other objects (beyond this one)

Have you considered an index organized table - to reduce the number of blocks required to satisfy an index range scan (instead of a possible

a) 3 or 4 blocks to read index
b) plus number of blocks = number of rows retrieved for table access by index rowid in worst case

you would have

a) 3 or 4 blocks to read index to find first row
b) plus as few blocks as it can take to store the data

?

December 16, 2005 - 9:19 am UTC

Reviewer: David Aldridge http://oraclesponge.blogspot.com from Colorado Springs

We were discussing here </code> http://tinyurl.com/cyyyu <code>the effect of read ahead caching on FTS of small tables. Do you think that buffer caching of small tables might be a "valid" method of avoiding i/o inefficiency due to read ahead kicking in towards the end of the table scan and inadvertantly reading past the HWM? Or is that crazy talk?

Tom Kyte

Followup  

December 16, 2005 - 12:15 pm UTC

this reminds me sort of "separate indexes from tables" in a way ;)

Might be more reasonable to place "large segments you plan on multi-block IO'ing together and separate them from segments you plan o single-block IO'ing" at the volume level in order to prevent read ahead from kicking in.

Reason I say this is - it was observed that separating indexes from data helped a system once. turned out it was the separation of single block IO'ed objects from multi-block IO'ed objects - and it was the suppression of "magic read ahead algorithms" on the single block IO'ed things that made the difference.

Is this a compelling reason for using seperate caches? - followup

December 16, 2005 - 11:36 am UTC

Reviewer: Jack Douglas from Maidenhead, UK

No, at the moment I only suspect a PIO problem on other objects. I will focus my attention on proving this one way or the other. If it proves to be the case would you then suggest considering splitting the cache?

Strangely enough it is IO that prevents us using an IOT. The queries at the end of the script are like a typical query on our system - the fact table is queried by dimension1, dimension2 and also by dimension1, dimension3:



SQL*Plus: Release 10.2.0.1.0 - Production on Fri Dec 16 16:28:33 2005

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle9i Release 9.2.0.6.0 - Production
JServer Release 9.2.0.6.0 - Production

16:28:33 TRACKER@oracle> create cluster t_cluster (id1 integer, partial_id2 integer, id3 integer);

Cluster created.

Elapsed: 00:00:00.01
16:28:37 TRACKER@oracle> create index k_t_cluster on cluster t_cluster;

Index created.

Elapsed: 00:00:00.01
16:28:37 TRACKER@oracle> create table t1 (id1 integer, partial_id2 integer, id2 integer, id3 integer, dummy char (100)) cluster t_cluster (id1, partial_id2, id3);

Table created.

Elapsed: 00:00:00.01
16:28:38 TRACKER@oracle> alter table t1 add constraint pk_t1 primary key (id1, id2, id3);

Table altered.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t1 on t1 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> create table t2 (id1 integer, partial_id2 integer, id2 integer, id3 integer, dummy char (100), constraint pk_t2 primary key (id1, partial_id2, id3, id2)) organization index;

Table created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create unique index u_t2 on t2 (id1, id2, id3);

Index created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t2 on t2 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> create table t3 (id1 integer, id2 integer, id3 integer, dummy char (100), constraint pk_t3 primary key (id1, id3, id2)) organization index;

Table created.

Elapsed: 00:00:00.04
16:28:38 TRACKER@oracle> create index nu_t3 on t3 (id1, id3);

Index created.

Elapsed: 00:00:00.03
16:28:38 TRACKER@oracle> --
16:28:38 TRACKER@oracle> begin
16:28:38 2 for j in 1..10 loop
16:28:38 3 for i in 1..50 loop
16:28:38 4 insert into t1 (id1, partial_id2, id2, id3, dummy)
16:28:38 5 select j, round (rownum / 50), rownum, i, 'A'
16:28:38 6 from dba_objects
16:28:38 7 where rownum <= 100;
16:28:38 8 --
16:28:38 9 insert into t2 (id1, partial_id2, id2, id3, dummy)
16:28:38 10 select j, round (rownum / 50), rownum, i, 'A'
16:28:38 11 from dba_objects
16:28:38 12 where rownum <= 100;
16:28:38 13 --
16:28:38 14 insert into t3 (id1, id2, id3, dummy)
16:28:38 15 select j, rownum, i, 'A'
16:28:38 16 from dba_objects
16:28:38 17 where rownum <= 100;
16:28:38 18 end loop;
16:28:38 19 end loop;
16:28:38 20 end;
16:28:38 21 /

PL/SQL procedure successfully completed.

Elapsed: 00:01:33.71
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> commit;

Commit complete.

Elapsed: 00:00:00.03
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> create table tx as select distinct id1, partial_id2 from t1;
create table tx as select distinct id1, partial_id2 from t1
*
ERROR at line 1:
ORA-00955: name is already used by an existing object


Elapsed: 00:00:00.01
16:30:12 TRACKER@oracle> --
16:30:12 TRACKER@oracle> analyze table t1 compute statistics;

Table analyzed.

Elapsed: 00:00:02.57
16:30:15 TRACKER@oracle> analyze table t2 compute statistics;

Table analyzed.

Elapsed: 00:00:01.53
16:30:16 TRACKER@oracle> analyze table t3 compute statistics;

Table analyzed.

Elapsed: 00:00:01.50
16:30:18 TRACKER@oracle> analyze table tx compute statistics;

Table analyzed.

Elapsed: 00:00:00.03
16:30:18 TRACKER@oracle> --
16:30:18 TRACKER@oracle> select sum (blocks) as blocks from (select blocks from dba_tables where table_name = 'T1' union all select leaf_blocks from dba_indexes where table_name = 'T1');
--

BLOCKS
----------
2591

Elapsed: 00:00:00.73
16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 'T2';
--

BLOCKS
----------
2764

Elapsed: 00:00:00.04
16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 'T3';
--

BLOCKS
----------
1864

Elapsed: 00:00:00.03
16:30:19 TRACKER@oracle> set autotrace trace explain statistics;
16:30:19 TRACKER@oracle> --
16:30:19 TRACKER@oracle> select min (dummy) from (select * from t1 where id1 = 5 and id2 = 50 union select * from t1 natural join tx where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=57 Card=1 Bytes=102)
1 0 SORT (AGGREGATE)
2 1 VIEW (Cost=57 Card=150 Bytes=15300)
3 2 SORT (UNIQUE) (Cost=57 Card=150 Bytes=16600)
4 3 UNION-ALL
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=14 Car
d=50 Bytes=5400)

6 5 INDEX (RANGE SCAN) OF 'PK_T1' (UNIQUE) (Cost=3 C
ard=50)

7 4 NESTED LOOPS (Cost=4 Card=100 Bytes=11200)
8 7 TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes
=12)

9 7 TABLE ACCESS (CLUSTER) OF 'T1' (Cost=1 Card=33 B
ytes=3564)

10 9 INDEX (UNIQUE SCAN) OF 'K_T_CLUSTER' (NON-UNIQ
UE)





Statistics
----------------------------------------------------------
29 recursive calls
0 db block gets
70 consistent gets
0 physical reads
0 redo size
479 bytes sent via SQL*Net to client
372 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

16:30:19 TRACKER@oracle> select min (dummy) from (select * from t2 where id1 = 5 and id2 = 50 union select * from t2 natural join tx where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.18

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=1 Bytes=102)
1 0 SORT (AGGREGATE)
2 1 VIEW (Cost=45 Card=150 Bytes=15300)
3 2 SORT (UNIQUE) (Cost=45 Card=150 Bytes=16600)
4 3 UNION-ALL
5 4 INDEX (UNIQUE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 Ca
rd=50 Bytes=5400)

6 5 INDEX (RANGE SCAN) OF 'U_T2' (UNIQUE) (Cost=1 Ca
rd=50)

7 4 NESTED LOOPS (Cost=4 Card=100 Bytes=11200)
8 7 TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes
=12)

9 7 INDEX (RANGE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 C
ard=33 Bytes=3564)





Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
168 consistent gets
0 physical reads
0 redo size
479 bytes sent via SQL*Net to client
372 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

16:30:19 TRACKER@oracle> select min (dummy) from (select * from t3 where id1 = 5 and id2 = 50 union select * from t3 where id1 = 5 and id3 = 25);

Elapsed: 00:00:00.25

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1 Bytes=102)
1 0 SORT (AGGREGATE)
2 1 VIEW (Cost=46 Card=150 Bytes=15300)
3 2 SORT (UNIQUE) (Cost=46 Card=150 Bytes=15900)
4 3 UNION-ALL
5 4 INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=19 C
ard=50 Bytes=5300)

6 5 INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost
=19 Card=5000)

7 4 INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=1 Ca
rd=100 Bytes=10600)

8 7 INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost
=1 Card=100)





Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
15342 consistent gets
0 physical reads
0 redo size
479 bytes sent via SQL*Net to client
372 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed


Tom Kyte

Followup  

December 16, 2005 - 1:03 pm UTC

... If it proves to be the case
would you then suggest considering splitting the cache? ...

no, not without more information. I would first be looking at decreasing the IO's needed on the other thing.


so you are using a cluster and have it clustered by dimension1?

oops

December 16, 2005 - 11:54 am UTC

Reviewer: Jack Douglas from Maidenhead, UK

Sorry Tom,

Table tx create failed in my script above because I ran it twice and forgot to drop it. If you run it, it should work though...

Perhaps I should also have mentioned that I agree with your logic about IOTs in general and that the PL/SQL part of the script simulates the way data is added to our particular database (1 large dataload per week, dimension3/id3 is a week number). I am not trying to generalise about IOTs versus heaps/clusters.

Tom Kyte

Followup  

December 16, 2005 - 1:05 pm UTC

no worries - the mentioning of the IOT was to achieve clustering - which maybe you are already doing but via a cluster right?

yes

December 16, 2005 - 3:24 pm UTC

Reviewer: Jack Douglas from Maidenhead, UK

Yes we use the cluster to the same effect as an IOT - just to keep certain rows together that we query together.

We cluster on a function of the dimensions rather than on any one of them (actually id1, round (id2 / 50), id3) because we query both id1, id3 and id1, id2 and this is a kind of halfway house between the two. Clustering on either one would be great for one side of the union in the query but disastrous for the other - hence the relatively huge IO on the third query.

But basically, yes, as you say, we do it for the same reason we would use an IOT. Unfortunately what you gain with an IOT is lost when you add secondary indexes in our case due to the primary key duplication.

December 17, 2005 - 11:33 am UTC

Reviewer: David Aldridge http://oraclesponge.blogspot.com from Colorado Springs

>> this reminds me sort of "separate indexes from tables" in a way ;)

Might be more reasonable to place "large segments you plan on multi-block IO'ing
together and separate them from segments you plan o single-block IO'ing" at the
volume level in order to prevent read ahead from kicking in.

Reason I say this is - it was observed that separating indexes from data helped
a system once. turned out it was the separation of single block IO'ed objects
from multi-block IO'ed objects - and it was the suppression of "magic read ahead
algorithms" on the single block IO'ed things that made the difference. <<

Ah, that's an interesting point on the index/table setup ... yes, if there is a way to segregate the segments according to whether we want read ahead applied or not then that would do the trick. Much depends on how read ahead works on a particular system though I'd think.

Defining Multiple Buffer Pools

May 03, 2006 - 4:59 pm UTC

Reviewer: Yoav

Hi Tom,

In 8.1.7.4 when defining multiple buffer pool its needed to specify two attributes for each buffer pool:
* The number of buffers in the buffer pool
* The number of lru latches allocated to the buffer pool
for example :

BUFFER_POOL_KEEP=(BUFFERS:10000, LRU_LATCHES:2)

Could you please explain how to calculate the size of "BUFFERS" and "LRE_LATCHES" ?

Regards.


Tom Kyte

Followup  

May 04, 2006 - 1:47 am UTC

are you really realy really sure you need to use this (I mean, if you don't know how big it is - that should be something you "know" since you are going to use this to tune a very specific problem with?)

Multiple Pools

May 05, 2006 - 12:55 am UTC

Reviewer: Yoav

Hi Tom,
Thanks for your input.
We have 3rd party application,and we are suffering from heavy logical i/o activity.
(e.g:db file sequential read show 80% in thae last statspack on a unix with 8 cpus).
For now we cant touch the application code .

So i read in expert-one-on-one page 80 :
"...a buffer pool ,large enough to our 'lookup' tables in memory. for example, when oracle read a blocks from this table,they always get cached in this special pool.
... A buffer that is set up to cache blocks like this is known as KEEP pool..."

And also read : </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6265095774206 <code>

and thought that increasing the buffer cache and using multiple pools ,m a y b e, help to reduce this the logical i/o.

1. So back to my last question, i think that "BUFFER:" represent the SUM(BYTE) from dba_tables + SUM(BYTE) from
dba_indexes . Is it right ?
2. I actualy dont know how much "LRE_LATCHES:" to assign
for the KEEP pool and for the RECYCLE pool.
I hope you can advice about that.

Regards.

Tom Kyte

Followup  

May 05, 2006 - 1:51 am UTC

....
and we are suffering from heavy logical i/o
activity.
(e.g:db file sequential read show 80% in thae last statspack on a unix with 8
cpus).
.........

that would not be heavy logical i/o perhaps - that would be PHYSICAL IO. and 80% - that is just a ratio, 80% of WHAT?


I'm not going to touch the buffer pool here - you haven't identified as far as I can tell what the problem is, what objects you might slide into these pools and how that might help you.

Multiple buffer pools

May 05, 2006 - 4:16 am UTC

Reviewer: Yoav

Hi Tom,
I based my estimation on the statspack. since this thread is not about the
statspack i didnt want to add any copy/paste proof from the statspack.

Hir some more information:
1. I wrote heavy logical i/o based on the logical reads per second (2054 per second)
which is about 7 time more then the physical reads (300 per second)
2. From the "Top 5 Wait Events" 80% of the total waits event was due to "db file sequential read"
(I know that you want like it, but for fairness I'll say that the statspack run for about 40 hours - but in this case its just help me to show that i have
a consistent problem with "db file sequential read" wait event).
3. The best thing to do, was to change the appliaction code.
I know that touching the buffer cache want change the fact that 1 of 4 transaction still will ended with rollback.
But ss i said Since its not possible ,at this point in time to change the appliaction code,
dont you think that increasing the buffer cache and using multiple pools can help ?

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 524288 log_buffer:262144
db_block_size: 4096 shared_pool_size:89128960

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 129,876.68 29,747.96
Logical reads: 2,054.39 470.55
Block changes: 949.55 217.49
Physical reads: 299.29 68.55
Physical writes: 72.85 16.69
User calls: 329.95 75.58
Parses: 37.97 8.70
Hard parses: 0.10 0.02
Sorts: 9,597.31 2,198.24
Logons: 0.07 0.02
Executes: 195.04 44.67
Transactions: 4.37

% Blocks changed per Read: 46.22 Recursive Call %: 16.25
Rollback per transaction %: 24.00 Rows per Sort: 0.03

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read 15,204,906 6,856,997 79.03
log file parallel write 1,085,325 1,070,571 12.34
-------------------------------------------------------------

Thanks.

Tom Kyte

Followup  

May 05, 2006 - 6:51 am UTC

so, how did that tell you what segments would benefit from their own buffer pool.

That is my point, how the heck do you know what you want to put into this buffer pool.

40 hours, bah - ignore this statspack, you cannot do anything with it.


multiple buffer pools ain't going to help - unless and until you know the segments that would truly benefit from it - and I don't think you do.


Touch count algorithm and "alter table t1 cache"

March 16, 2007 - 9:36 am UTC

Reviewer: Guy Lambregts from Belgium

Tom,

I think to know what alter table t1 cache did - in release 7.x and 8.x - this with repect to MRU and LRU if the blocks of table t1 where accessed in a scattered way (FTS).

Also I think to know from release 8.1.5 onwards Oracle has introduced the touch count algorithm. With my own words : A block is read into the buffer cache where it is inserted at a midpoint, somwhere at the "border" of the cold and the warm region. The more a block is touched the more its touch count increases (x$bh.tch) the lesser the likelyhood the block will be "thrown" out of the cache. I know there are a set of hidden parameters, which I never ... touch.

Now my question : Is the alter table t1 cache still relevant with respect to this touch count philosophy ? I mean is there still any difference between alter table t1 cache and alter table t1 nocache ?

Kind Regards
Guy
Tom Kyte

Followup  

March 17, 2007 - 3:53 pm UTC

you should use alter table T cache since the introduction of the keep and recycle pools really.

but the effect is the same, a table with the cache attribute is considered a short table during a full scan regardless of its real size.


keep pool

March 26, 2007 - 2:14 am UTC

Reviewer: A reader

I have a table a with size 250 MB and full table scan happens on this table.

[1] Does it worth to keep table a in keep pool.

[2] e.g for a table which is not in keep pool, any dml updates data buffer cache and then write the data on disk at checkpoint.What happens when we keep a table in keep pool. How does inserts, deletes and updates work on this table a.
Tom Kyte

Followup  

March 26, 2007 - 7:37 am UTC

1) yes, no, maybe.

2) just like they would if you did not use a keep pool - the keep/recycle pools are just like the buffer cache itself, they are buffer caches.

How big a table can be cached?

February 21, 2008 - 2:51 pm UTC

Reviewer: Sal from NJ

OLTP .. 10.2 version.

We have identified 4 tables that are the center of all activities in the database. Update, insert, delete, selects..

Some DBA's are suggesting to put them in keep pool. They tested it on a production copy and the bottlenck operations started working in seconds instead of 5+ minutes. But in their test they were the only one in the database..

The tables are 3 GB in size.. Our shared pool is 8 gb, they are saying lets increase it to 16 gb and put these tables worth 3 gb on keep pool.

Is there any limit to what can be cached? I have cached smaller lookup tables in past, but not sure about this here.

Any suggestions Sir?

thanks
Tom Kyte

Followup  

February 21, 2008 - 5:04 pm UTC

only the limits of machine addressability.

alter table cache candidate

February 28, 2008 - 11:59 am UTC

Reviewer: Gareth Adcock from England

Hello

I think that I may have a candidate for the 'alter table cache' command.

The database concerned (version 9.2.0.3.0) has a buffer cache of 2400 Mb.
I have a table which I think 'should' be cached most of the time but would appear to have been aged out most of the times that

the application has read from it, resulting in a large ammount of physical reads.

The table I am looking at the most is at the bottom of the following list in terms of size (F554215).

select segment_name,segment_type, ((blocks*8)/1024)/1024 GB
from dba_segments
where (owner,segment_name) in
(select owner,object_name from v$segment_statistics where statistic_name = 'physical reads' and value > 100000000)
order by blocks desc

SEGMENT_NAME SEGMENT_TYPE GB
-------------------------------------------------------------
F42199 TABLE 15.902557373046875
F42199 TABLE 13.0977630615234375
F4111 TABLE 10.4477691650390625
F0911 TABLE 10.242462158203125
F42119 TABLE 1.8017578125
F0902 TABLE .7207183837890625
F4611 TABLE .4254913330078125
F4108 TABLE .34760284423828125
F4211 TABLE .11937713623046875
F554215 TABLE .0668182373046875



It is a 'hot' table rather than a 'keep warm' table.
It gets read a lot; the potential problem is that it gets read a lot form disk.


select owner,object_name,value from v$segment_statistics where statistic_name = 'physical reads' and value > 100000000

(hundred million)
order by value desc

OWNER OBJECT_NAME VALUE
----------------------------------------------
PRODDTA F0911 2480884853
PRODDTA F4111 1733224651
PRODDTA F554215 1549742899
PRODDTA F42119 982973397
PRODDTA F42199 908307993
PRODDTA F0902 296390653
CRPDTA F42199 187409137
PRODDTA F4108 181621627
PRODDTA F4211 148565012
PRODDTA F4611 118151835

So a relatively small table comes third in total physical reads.

select owner,object_name,value
from v$segment_statistics
where statistic_name = 'logical reads'
and value > 1000000000 (thousand million)
order by value desc

OWNER OBJECT_NAME VALUE
----------------------------------------------------
PRODDTA F0911 4087505552
SYS I_OBJ1 2717363136
PRODDTA F4111 1909729328
PRODDTA F554215 1840557360
PRODDTA F4108_0 1626329632
PRODDTA F554108_7 1328939200
PRODDTA F41021_PK 1310094416
PRODDTA F59FS800 1146349296
PRODDTA CI_F0911_FIN_CUBE 1006994896


So about 84% of the time the data gets read from disk.
I looked at the sql statements pertaining to this table,which were currently stored in memory.
SQL Analyzer showed up the likely suspect and I queried v$sql

select fetches, executions, first_load_time,last_load_time
from v$sql
where sql_text =
'SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 ) '

FETCHES EXECUTIONS FIRST_LOAD_TIME LAST_LOAD_TIME
----------------------------------------------------------------------------------------------------------------
5587 5587 2008-01-14/06:14:02 2008-02-25/06:28:48


SELECT count(*) FROM PRODDTA.F554215
group by xhmcu

COUNT(*)
--------------
175002
2
1

I'm guessing it's always going to do a full table scan.
This table gets updated a lot and I understand that this makes it unsuitable for a bitmap index, besides it's going to want to

bring back almost all of the blocks almost all of the time.
A function based index may be an option for the rare values but these are almost never queried.

I ran the sql statement from my SQL Developer session.

Explian Plan:
OPERATION OPTIMIZER COST CARDINALITY BYTES
SELECT STATEMENT CHOOSE 1907 174200 61144200


TABLE ACCESS(FULL) PRODDTA.F554215 ANALYZED 1907 174200 61144200





Autotrace:

recursive calls 0
db block gets 0
consistent gets 8856
physical reads 8730
redo size 0
bytes sent via SQL*Net to client 1613
bytes received via SQL*Net from client 647
SQL*Net roundtrips to/from client 5
sorts (memory) 2
sorts (disk) 0


In my novice opinion it would seem that this table could benefit from 'alter table cache' to slow down the aging out process.
If this is not a practical or a good idea would setting up a seperate keep buffer cache be worth looking into?

My first instinct is to trust the algorithms that Oracle uses to allocate resources but this one is puzzling me.


Thanks

Gareth
Tom Kyte

Followup  

February 28, 2008 - 11:14 pm UTC

SELECT count(*) FROM PRODDTA.F554215
group by xhmcu


I would erase that query in the application, that would be my fix.

I'll bet it is some silly logic like:

select the count
if the count > 0
then
do something



I would just code:

do something


period, end, you never need to count the rows. Besides, look at that query, the output is USELESS

SELECT count(*) FROM PRODDTA.F554215
group by xhmcu


a count grouped by a column that is not selected, so you get a bunch of random numbers in some arbitrary order representing something.

what a waste.

My mistake

February 29, 2008 - 4:34 am UTC

Reviewer: Gareth from England

The actual query I am looking at is

SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 )

Sorry, it's tucked away in:

select fetches, executions, first_load_time,last_load_time
from v$sql
where sql_text =
'SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 ) '

I should have referenced it explicitly, my fault.

The count(*) was just to demonstrate the limited number of possible values in xhmcu and skewed nature of the data.

apologies Gareth
Tom Kyte

Followup  

March 01, 2008 - 10:29 am UTC

ok, now for a super silly question.

why do you need to full scan this over and over (I'm going to keep going back to the application, where 99.999% of all tuning needs be done in every case).

Why would you full scan this table *so often*, what is the logic there. Does it really make sense to do this (putting in ram isn't going to fix very much, if you are using regular OS files, it is already probably buffered in the OS file system cache).


don't forget - the v$ tables are since *forever* (since instance started, those IO's are cumulative)

Now, that said - looking at the dates on that - that is like two weeks worth of executes.

Why do you need to full scan this table 400 times a day?

Thanks

March 03, 2008 - 5:35 am UTC

Reviewer: Gareth from England

Thanks for looking at this.

The short answer is that I don't know.

This is third party application that bolts on to our main JDEdwards database.
I don't have any direct involvement in development other than to raise concerns after the event. I am currently highlighting issues concerning the use of literals rather than bind variables in the code resulting in hundreds of almost identical sql statements in the shared pool.

I will add this to my list of issues.

Thanks again for taking the time to look at this.

Gareth
Tom Kyte

Followup  

March 03, 2008 - 7:38 am UTC

you could try setting up a keep pool (a couple times larger than the size of the table, especially if it is modified/updated)...

alter the table to be in the keep pool
alter the table 'cache'

that should reduce the physical IO to minimal - and if that is the cause of a problem (the IO's were), that problem would be reduced...

Caching a table in Memory

September 05, 2008 - 6:15 pm UTC

Reviewer: Ramani from USA

Hi tom
Thanks for the information regarding keeping / deleting tables in buffer_pool.

alter table mytable storage (buffer_pool keep);
alter table mytable storage (buffer_pool default);

Is there a way to check what tables are kept in buffer_pool ? from which dictionary view.
I do not think I get the table names that are kept in buffer_pool from V$BUFFER_POOL.

Thanks in advance,
Ramani

ALTER TABLE CACHE

July 09, 2009 - 1:31 am UTC

Reviewer: A reader

Single user environment.

SQL> 
SQL> SELECT * FROM v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

Elapsed: 00:00:00.01
SQL> 
SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M
SQL> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 400M
SQL> 
SQL> CREATE TABLE m AS
  2  SELECT rownum mID,
  3       MOD(rownum, 100) cnt1,
  4       MOD(rownum, 200) cnt2,
  5       TRUNC(dbms_random.value(1, 10000)) cnt3,
  6       RPAD('x', 30, 'x') char1,
  7       RPAD('x', 30, 'y') char2,
  8       RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

Elapsed: 00:00:16.75
SQL> 
SQL> CREATE TABLE i AS
  2  SELECT rownum iID,
  3       mID,
  4       MOD(rownum, 2) cnt1,
  5       MOD(rownum, 4) cnt2,
  6       RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10       mID,
 11       MOD(rownum, 2) cnt1,
 12       MOD(rownum, 4) cnt2,
 13       RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

Elapsed: 00:00:17.15
SQL> 
SQL> SELECT segment_Name, bytes/1024/1024 MB
  2  FROM   user_segments
  3  WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME                           MB
------------------------------ ----------
I                                     119
M                                     439

Elapsed: 00:00:00.01
SQL> 
SQL> 
SQL> ALTER TABLE m ADD CONSTRAINT m_pk PRIMARY KEY (mID);

Table altered.

Elapsed: 00:00:08.54
SQL> 
SQL> ALTER TABLE i ADD CONSTRAINT i_fk1 FOREIGN KEY (mID)
  2   REFERENCES m(mID);

Table altered.

Elapsed: 00:00:04.82
SQL> 
SQL> CREATE INDEX i_fk1 ON i(mID);

Index created.

Elapsed: 00:00:02.54
SQL> 
SQL> 
SQL> BEGIN
  2    DBMS_STATS.GATHER_TABLE_STATS(
  3    ownname   => user,
  4    tabname   => 'M',
  5    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  6    method_opt   => 'FOR ALL COLUMNS SIZE SKEWONLY',
  7    cascade   => TRUE);
  8  
  9    DBMS_STATS.GATHER_TABLE_STATS(
 10    ownname   => user,
 11    tabname   => 'I',
 12    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
 13    method_opt   => 'FOR ALL COLUMNS SIZE SKEWONLY',
 14    cascade   => TRUE);
 15  END;
 16  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:21.51
SQL> 
SQL> 
SQL> set autotrace on
SQL> 
SQL> SELECT COUNT(*)
  2  FROM   m, i
  3  WHERE  m.mID   = i.mID AND
  4       i.cnt1  = 1 AND
  5       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.78

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      69907  consistent gets
      55557  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> ALTER TABLE m CACHE;

Table altered.

Elapsed: 00:00:00.00
SQL> ALTER TABLE i CACHE;

Table altered.

Elapsed: 00:00:00.01
SQL> 
SQL> SELECT COUNT(*)
  2  FROM   m, i
  3  WHERE  m.mID   = i.mID AND
  4       i.cnt1  = 1 AND
  5       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.40

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
        365  recursive calls
          0  db block gets
      69968  consistent gets
      55556  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         12  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> -- --------------------------------------------------------------
SQL> -- Run the SQL again.
SQL> -- --------------------------------------------------------------
SQL> /

  COUNT(*)
----------
   1000000

Elapsed: 00:00:05.29

Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 18686   (1)| 00:03:45 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |   995K|    41M|    18M| 18686   (1)| 00:03:45 |
|*  3 |    TABLE ACCESS FULL| I    |   995K|  7774K|       |  3200   (1)| 00:00:39 |
|*  4 |    TABLE ACCESS FULL| M    |   999K|    34M|       | 12258   (1)| 00:02:28 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("I"."MID"="M"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69907  consistent gets
      55556  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> 
SQL> spool off


Why are there so many physical IO's even after I marked the tables for caching in memory? If these were the only two tables in the database, is there anyway I can completely eliminate physical IO's giving that 1 GB is set aside for SGA and the 2 tables added up to be around 550 MB?

Tom Kyte

Followup  

July 14, 2009 - 1:57 pm UTC

well, you say the SGA is 1gb, but the SGA is made of many things. How big is your BUFFER Cache and remember - lots of other stuff has to fit in there.

and what was the physical IO to - tkprof with a row source operation would be infinitely better than this - this has no detail.

Result cache

July 14, 2009 - 3:52 pm UTC

Reviewer: A reader

Reader,

Since you're on 11g, you may want to look into using the result cache feature. Make sure the appropriate result cache init parameters are sized for the amount of data you want to cache. Rerun your example and add the /*+ result_cache */ hint on your selects. The first select will still do all the IOs, subsequent executions of the same select should be extremely fast because the result is cached.

Give it a try, it rocks!



ALTER TABLE CACHE Followup

July 14, 2009 - 4:58 pm UTC

Reviewer: A reader

Here's the tkprof with row source operation. It looks like the physical IO's are all from full scanning table m.

STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=69914 pr=55556 pw=55556 time=0 us)'
STAT #3 id=2 cnt=1000000 pid=1 pos=1 obj=0 op='HASH JOIN  (cr=69914 pr=55556 pw=55556 time=266516 us cost=18706 size=44818048 card=1018592)'
STAT #3 id=3 cnt=1000000 pid=2 pos=1 obj=22718 op='TABLE ACCESS FULL I (cr=14350 pr=0 pw=0 time=3888 us cost=3199 size=8148736 card=1018592)'
STAT #3 id=4 cnt=1000000 pid=2 pos=2 obj=22717 op='TABLE ACCESS FULL M (cr=55564 pr=55556 pw=55556 time=245890 us cost=12257 size=35996760 card=999910)'


SQL> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 712M

SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M

I tried setting db_cache_size to 0 so Oracle would automatically allocate memory.  However, the result was the same.


Tom Kyte

Followup  

July 15, 2009 - 11:33 am UTC

I see that pr=55556
I see also that pw (physical write) is...... 55556.


How about you do the trace with wait events being recorded - I'll bet the waits are "direct read temp" and "direct write temp"

Eg: table M is hashed but spills to disk.

To "A Reader" - July 14, 2009 - 3pm US/Eastern

July 14, 2009 - 5:01 pm UTC

Reviewer: A reader

Thanks for the tip. However, that's not the point of the question. I was trying to get a better understanding of the behavior of caching a table.
Tom Kyte

Followup  

July 15, 2009 - 11:35 am UTC

I think the table is probably cached, but you are spilling to disk on a hash operation. Given that pr = pw anyway.

ALTER TABLE CACHE Followup #2

July 15, 2009 - 12:22 pm UTC

Reviewer: A reader


All waits are direct path reads.

WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623184 block cnt=16 obj#=22717 tim=5157349182372
WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2623200 block cnt=16 obj#=22717 tim=5157349183354
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623216 block cnt=16 obj#=22717 tim=5157349184092
...
...
...
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562560 block cnt=16 obj#=22717 tim=5157351288487
WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2562576 block cnt=16 obj#=22717 tim=5157351289457
WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562592 block cnt=16 obj#=22717 tim=5157351290425

********************************************************************************

SELECT COUNT(*)
FROM m, i
WHERE m.mID = i.mID AND
i.cnt1 = 1 AND
m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 1.76 7.86 55556 69914 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 1.76 7.86 55556 69914 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 30

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=69914 pr=55556 pw=55556 time=0 us)
1000000 HASH JOIN (cr=69914 pr=55556 pw=55556 time=266516 us cost=18706 size=44818048 card=1018592)
1000000 TABLE ACCESS FULL I (cr=14350 pr=0 pw=0 time=3888 us cost=3199 size=8148736 card=1018592)
1000000 TABLE ACCESS FULL M (cr=55564 pr=55556 pw=55556 time=245890 us cost=12257 size=35996760 card=999910)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
reliable message 1 0.00 0.00
direct path read 3757 0.00 0.01
SQL*Net message from client 2 0.00 0.00

********************************************************************************


Tom Kyte

Followup  

July 15, 2009 - 12:41 pm UTC

oh, that explains it, it isn't using a conventional path read at all - it is bypassing the buffer cache. So the alter cache is not useful at all in this case.

In 11g - we read via direct path over conventional path (buffer cache) based on statistics and setup. Direct path reads make sure the latest version of the block is on disk and then just reads away.


strange that the direct path read incremented the pw, that isn't right.

ALTER TABLE CACHE Followup #3

July 15, 2009 - 1:17 pm UTC

Reviewer: A reader

Can you elaborate on this statement "In 11g - we read via direct path over conventional path (buffer cache) based on statistics and setup.", specifically the setup part.

Also why would it do direct path reads on one table but not the other?
Tom Kyte

Followup  

July 15, 2009 - 2:41 pm UTC

the table sizes are different. Based on your object and system statistics, the optimizer opted for a direct path read.

Normally, a full scan in SERIAL mode would use a conventional path read. It would use your db file multiblock read count (should be set automatically by us in 10g and above to do the MAX multiblock read available on your system) to figure out how many blocks to read at a time. Sounds all good - all efficient - however....

Say we decided to read 32 blocks at a time. Further assume block 1, 10, 15, 20, 22 are in the cache already. We cannot use the image on disk for these guys so the IO looks like:

logical IO block 1.
multiblock read 2-9 into the cache, logical IO them out.
logical IO block 10
multiblock read 11-14 into the cache, logical IO them out.
logical IO block 15
multiblock read 16-19 into the cache, logical IO them out.
logical IO block 20
multiblock (well, single block really) read 21 into the cache, logical IO it out
logical IO block 22
multiblock read 23-32 into the cache, logical IO them out.


Using a direct read it would be

a) before running query, checkpoint the segment ensuring current version of data is on disk.

b) read blocks 1-32 into PGA and process them.



The size of the table (your one is bigger than the other) the multiblock read count (actual observed on the system - not the parameter) will influence whether we direct path or conventional path read the segment.

ALTER TABLE CACHE Followup #4

July 15, 2009 - 2:17 pm UTC

Reviewer: A reader

I ran the exact same test case on 10.2.0.4. Now it looks like neither tables are being cached?!

SQL> show parameter sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 1104M
sga_target                           big integer 1104M
SQL> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_cache_size                        big integer 0

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

PARSING IN CURSOR #48 len=127 dep=0 uid=47 oct=3 lid=47 tim=3302881257 hv=1790720527 ad='4c634a74'
SELECT COUNT(*)
FROM   m, i
WHERE  m.mID   = i.mID AND
       i.cnt1  = 1 AND
       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
END OF STMT
PARSE #48:c=15625,e=18817,p=0,cr=56,cu=2,mis=1,r=0,dep=0,og=1,tim=3302881255
EXEC #48:c=0,e=20,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=3302881324
WAIT #48: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=8862 tim=3302881345
WAIT #48: nam='db file scattered read' ela= 10238 file#=16 block#=1826748 blocks=5 obj#=111458 tim=3302892276
WAIT #48: nam='db file scattered read' ela= 23861 file#=16 block#=1826753 blocks=8 obj#=111458 tim=3302917607
WAIT #48: nam='db file scattered read' ela= 427 file#=16 block#=1826762 blocks=7 obj#=111458 tim=3302918388
WAIT #48: nam='db file scattered read' ela= 519 file#=16 block#=1826769 blocks=8 obj#=111458 tim=3302919216
WAIT #48: nam='db file scattered read' ela= 439 file#=16 block#=1826778 blocks=7 obj#=111458 tim=3302919994
...
...
WAIT #48: nam='db file scattered read' ela= 1198 file#=16 block#=1873645 blocks=16 obj#=111457 tim=3311136789
WAIT #48: nam='db file scattered read' ela= 1305 file#=16 block#=1873661 blocks=16 obj#=111457 tim=3311138410
WAIT #48: nam='db file scattered read' ela= 2228 file#=16 block#=1873677 blocks=16 obj#=111457 tim=3311140953


********************************************************************************

SELECT COUNT(*)
FROM   m, i
WHERE  m.mID   = i.mID AND
       i.cnt1  = 1 AND
       m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          2           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      2.06       9.26      69891      69917          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      2.07       9.27      69891      69917          2           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 47  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=69917 pr=69891 pw=0 time=9264531 us)
1000000   HASH JOIN  (cr=69917 pr=69891 pw=0 time=8964016 us)
1000000    TABLE ACCESS FULL I (cr=14346 pr=14335 pw=0 time=2010330 us)
1000000    TABLE ACCESS FULL M (cr=55571 pr=55556 pw=0 time=4020519 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                       4415        0.04          7.49
  SQL*Net message from client                     2        0.00          0.00
********************************************************************************


Tom Kyte

Followup  

July 15, 2009 - 3:02 pm UTC

probably the cache is just insufficient. By the time we read the end of the table we are finding we have to overwrite the "head" of the table. So when you go for the second run, the head of the table isn't there anymore, we read that in and overwrite more of the table - which we have to read and so on.

Not sure why this is surprising? You have well over 1/2gb of data, your SGA is just 1gb. There are many other 'pools' in there.


and unless you are the sole user on this system - there is a ton of other stuff going on.


July 15, 2009 - 3:52 pm UTC

Reviewer: A reader

This is surprising because it's a single-user environment (my test database) and no tables are configured to specifically use a different pool. Shouldn't Oracle be able to stuff 1/2 GB of data into a 1 GB shared pool?

SQL> SELECT buffer_pool, COUNT(*)
  2  FROM   dba_tables
  3  GROUP  BY buffer_pool;

BUFFER_   COUNT(*)
------- ----------
               153
DEFAULT       1655

SQL> show parameter pool

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
buffer_pool_keep                     string
buffer_pool_recycle                  string
global_context_pool_size             string
java_pool_size                       big integer 0
large_pool_size                      big integer 0
olap_page_pool_size                  big integer 0
shared_pool_reserved_size            big integer 30198988
shared_pool_size                     big integer 0
streams_pool_size                    big integer 0


Tom Kyte

Followup  

July 15, 2009 - 4:38 pm UTC

SQL> show sga

let us see how much is currently allocated to the buffer pool.

July 16, 2009 - 1:56 am UTC

Reviewer: A reader

SQL> show sga

Total System Global Area 1157627904 bytes
Fixed Size                  1298016 bytes
Variable Size             218104224 bytes
Database Buffers          931135488 bytes
Redo Buffers                7090176 bytes

Tom Kyte

Followup  

July 16, 2009 - 11:26 am UTC

let me see the output of:

ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE';

NAME||'='||VALUE
-------------------------------------------------------------------------------
processes=150
sessions=300
sga_max_size=1157627904
sga_target=1157627904
control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/ora
data/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl

db_block_size=8192
compatible=10.2.0.1.0
db_create_file_dest=/home/ora10gr2/oradata/ora10gr2
db_recovery_file_dest=/home/ora10gr2/oradata/fbra
db_recovery_file_dest_size=10737418240
undo_management=AUTO
undo_tablespace=UNDOTBS
undo_retention=5000
db_domain=
dispatchers=(protocol=tcp)
job_queue_processes=10
db_name=ora10gr2
open_cursors=300
os_authent_prefix=OPS$
pga_aggregate_target=419430400

20 rows selected.


and I'll see if I can reproduce - as it is not, I cannot. It would seem you have sufficient buffer cache.

July 16, 2009 - 11:56 am UTC

Reviewer: A reader


SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE'
  2  order  by name
  3  /

NAME||'='||VALUE
-----------------------------------------------------------------------------------------------------------------------------------
aq_tm_processes=0
background_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\BDUMP
compatible=10.2.0.4.0
control_files=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL01.CTL, C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL02.CTL, C:\OR
ACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL03.CTL

core_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\CDUMP
db_block_size=8192
db_cache_size=209715200
db_create_file_dest=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB
db_domain=
db_file_multiblock_read_count=16
db_name=TEST_DB
disk_asynch_io=FALSE
fast_start_mttr_target=0
java_pool_size=0
job_queue_processes=200
large_pool_size=0
nls_length_semantics=BYTE
open_cursors=300
optimizer_index_caching=90
optimizer_index_cost_adj=25
pga_aggregate_target=419430400
processes=1000
query_rewrite_enabled=FALSE
remote_login_passwordfile=EXCLUSIVE
resource_manager_plan=
session_max_open_files=20
sessions=1500
sga_max_size=1157627904
sga_target=1157627904
shared_pool_size=0
sort_area_size=524288
star_transformation_enabled=FALSE
streams_pool_size=0
timed_statistics=TRUE
undo_management=AUTO
undo_retention=900
undo_tablespace=UNDOTBS1
user_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\UDUMP

38 rows selected.


Tom Kyte

Followup  

July 16, 2009 - 4:38 pm UTC

well, I disagree with the setting of most of your parameters - however on linux, I cannot reproduce your apparent findings. Now, that said, you differ from me in that the automatic SGA resizing has kicked in on your system - the buffer cache was made larger on your system.


That said, maybe someone else with a windows play system can test this out for us

ops$tkyte%ORA10GR2> set linesize 1000
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table m;

Table dropped.

ops$tkyte%ORA10GR2> drop table i;

Table dropped.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE m AS
  2  SELECT rownum mID,
  3          MOD(rownum, 100) cnt1,
  4          MOD(rownum, 200) cnt2,
  5          TRUNC(dbms_random.value(1, 10000)) cnt3,
  6          RPAD('x', 30, 'x') char1,
  7          RPAD('x', 30, 'y') char2,
  8          RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> CREATE TABLE i AS
  2  SELECT rownum iID,
  3          mID,
  4          MOD(rownum, 2) cnt1,
  5          MOD(rownum, 4) cnt2,
  6          RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10          mID,
 11          MOD(rownum, 2) cnt1,
 12          MOD(rownum, 4) cnt2,
 13          RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

ops$tkyte%ORA10GR2> SELECT segment_Name, bytes/1024/1024 MB
  2  FROM   user_segments
  3  WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME                           MB
------------------------------ ----------
I                                     120
M                                     438

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'M' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'I' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE' order  by name;

NAME||'='||VALUE
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
aq_tm_processes=0
compatible=10.2.0.4.0
control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/oradata/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl
db_block_size=8192
db_cache_size=218103808
db_create_file_dest=/home/ora10gr2/oradata/ora10gr2
db_domain=
db_file_multiblock_read_count=16
db_name=ora10gr2
db_recovery_file_dest=/home/ora10gr2/oradata/fbra
db_recovery_file_dest_size=10737418240
disk_asynch_io=FALSE
dispatchers=(protocol=tcp)
fast_start_mttr_target=0
java_pool_size=0
job_queue_processes=200
large_pool_size=0
nls_length_semantics=BYTE
open_cursors=300
optimizer_index_caching=90
optimizer_index_cost_adj=25
os_authent_prefix=OPS$
pga_aggregate_target=419430400
processes=1000
query_rewrite_enabled=FALSE
remote_login_passwordfile=EXCLUSIVE
session_max_open_files=20
sessions=1500
sga_max_size=1157627904
sga_target=1157627904
shared_pool_size=0
sort_area_size=524288
star_transformation_enabled=FALSE
streams_pool_size=0
timed_statistics=TRUE
undo_management=AUTO
undo_retention=5000
undo_tablespace=UNDOTBS

38 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> alter system flush buffer_cache;

System altered.

ops$tkyte%ORA10GR2> alter table m cache;

Table altered.

ops$tkyte%ORA10GR2> alter table i cache;

Table altered.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace on
ops$tkyte%ORA10GR2> SELECT COUNT(*)
  2    FROM   m, i
  3   WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       |  5617   (3)| 00:00:42 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M|  5617   (3)| 00:00:42 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  1048   (5)| 00:00:08 |
|*  4 |    TABLE ACCESS FULL| M    |   995K|    34M|       |  3896   (2)| 00:00:29 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
        314  recursive calls
          0  db block gets
      69955  consistent gets
      69899  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> SELECT COUNT(*)
  2    FROM   m, i
  3   WHERE  m.mID   = i.mID AND
  4          i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       |  5617   (3)| 00:00:42 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |          |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M|  5617   (3)| 00:00:42 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  1048   (5)| 00:00:08 |
|*  4 |    TABLE ACCESS FULL| M    |   995K|    34M|       |  3896   (2)| 00:00:29 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69917  consistent gets
          0  physical reads
          0  redo size
        411  bytes sent via SQL*Net to client
        400  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte%ORA10GR2> show sga

Total System Global Area 1157627904 bytes
Fixed Size                  1267116 bytes
Variable Size             301992532 bytes
Database Buffers          838860800 bytes
Redo Buffers               15507456 bytes
ops$tkyte%ORA10GR2>

July 16, 2009 - 6:58 pm UTC

Reviewer: A reader

"I disagree with the setting of most of your parameters"

Can you give me a few examples and the reason? Thanks.
Tom Kyte

Followup  

July 24, 2009 - 7:17 am UTC

because they are set - you shouldn't be setting very many, if any, non-default parameters beyond what MUST be set.

disabling features, setting pools to zero

Cache

July 17, 2009 - 2:51 am UTC

Reviewer: Dheeraj from India

Tom,

Referring to the cache discussion in ther very first question.

I understand that the alter table ...cache, only effects the ageing of the table.

My question is,once we do the alter table...cache does it apply to the entire life of the table or is it session specific and goes off once the session is exit

Thanks,
Dheeraj
Tom Kyte

Followup  

July 24, 2009 - 7:21 am UTC

it is a table attribute that is in place until you change it. It survives database restarts and all.

To: Dheera

July 17, 2009 - 4:47 am UTC

Reviewer: Oleksandr Alesinskyy from Germany

You alter TABLE, not session. this alteration (as was already pointed by Tom) is written into data dictionary - that means that it is permanent and not session-specific. Moreover, this option may be specified by table creation.

ALTER TABLE CACHE Followup #5

July 17, 2009 - 1:19 pm UTC

Reviewer: A reader

Tom,

The physical IO's have all disappered if there were no constraints and indexes on the tables. However, they come back as soon as constraints and indexes are introduced in the test.

These indexes/constraints were in the originally test case (ALTER TABLE CACHE July 9, 2009 - 1am US/Eastern), but you have removed them from your test case.


Tom Kyte

Followup  

July 24, 2009 - 8:38 am UTC

give me the example, please don't make me attempt to try to reconstruct the example you think I should run...

A step back in Oracle 11g ?

July 23, 2009 - 12:44 pm UTC

Reviewer: Srinath from Dayton, OH

Hi Tom,
 SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:42:11 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Enter user-name: scott/********

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> CREATE TABLE m AS
  2  SELECT rownum mID,
  3          MOD(rownum, 100) cnt1,
  4          MOD(rownum, 200) cnt2,
  5          TRUNC(dbms_random.value(1, 10000)) cnt3,
  6          RPAD('x', 30, 'x') char1,
  7          RPAD('x', 30, 'y') char2,
  8          RPAD('x', 300, 'z') char3
  9  FROM   dual
 10  CONNECT BY level <= 1000000;

Table created.

SQL>
SQL>  CREATE TABLE i AS
  2  SELECT rownum iID,
  3          mID,
  4          MOD(rownum, 2) cnt1,
  5          MOD(rownum, 4) cnt2,
  6          RPAD('x', 30, 'x') char1
  7  FROM   m
  8  UNION ALL
  9  SELECT rownum + 1000001 iID,
 10          mID,
 11          MOD(rownum, 2) cnt1,
 12          MOD(rownum, 4) cnt2,
 13          RPAD('x', 30, 'x') char1
 14  FROM   m;

Table created.

SQL> set linesize 1000
SQL> SELECT segment_Name, bytes/1024/1024 MB
  2    FROM   user_segments
  3    WHERE  segment_Name IN ('I', 'M');

SEGMENT_NAME          MB
--------------------------------------------------------------------------------- ----------
I                     120
M                     440

SQL> exec dbms_stats.gather_table_stats( user, 'M' );

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.gather_table_stats( user, 'I' );

PL/SQL procedure successfully completed.


SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:59:06 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Enter user-name: scott/tiger

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set line 1000
SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE' orde
r
  2  by name;

NAME||'='||VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
audit_file_dest=C:\APP\******\ADMIN\ORA11G\ADUMP
audit_trail=DB
compatible=11.1.0.0.0
control_files=C:\APP\******\ORADATA\ORA11G\CONTROL01.CTL, C:\APP\******\ORADATA\ORA11G
\CONTROL02.CTL, C:\APP\******\ORADATA\ORA11G\CONTROL03.CTL
db_block_size=8192
db_domain=
db_name=ora11g
db_recovery_file_dest=C:\app\******\flash_recovery_area
db_recovery_file_dest_size=2147483648
diagnostic_dest=C:\APP\******
dispatchers=(PROTOCOL=TCP) (SERVICE=ora11gXDB)

NAME||'='||VALUE
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
----------------------------------------
memory_target=1719664640
open_cursors=300
processes=150
remote_login_passwordfile=EXCLUSIVE
undo_tablespace=UNDOTBS1

16 rows selected.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter table m cache;

Table altered.

SQL>
SQL> alter table i cache;

Table altered.

SQL> set autotrace on
SQL> SELECT COUNT(*)
  2      FROM   m, i
  3     WHERE  m.mID   = i.mID AND
  4            i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time
   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 22353   (1)| 00:04:29 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |   |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M| 22353   (1)| 00:04:29 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  3954   (1)| 00:00:48 |
|*  4 |    TABLE ACCESS FULL| M    |  1000K|    34M|       | 15167   (1)| 00:03:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
        314  recursive calls
          0  db block gets
      69936  consistent gets
      69896  physical reads
          0  redo size
        418  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         10  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> SELECT COUNT(*)
  2      FROM   m, i
  3     WHERE  m.mID   = i.mID AND
  4            i.cnt1  = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

  COUNT(*)
----------
   1000000


Execution Plan
----------------------------------------------------------
Plan hash value: 4156681420

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time
   |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    44 |       | 22353   (1)| 00:04:29 |
|   1 |  SORT AGGREGATE     |      |     1 |    44 |       |            |   |
|*  2 |   HASH JOIN         |      |  1000K|    41M|    19M| 22353   (1)| 00:04:29 |
|*  3 |    TABLE ACCESS FULL| I    |  1000K|  7812K|       |  3954   (1)| 00:00:48 |
|*  4 |    TABLE ACCESS FULL| M    |  1000K|    34M|       | 15167   (1)| 00:03:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("M"."MID"="I"."MID")
   3 - filter("I"."CNT1"=1)
   4 - filter("M"."CHAR1"='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      69900  consistent gets
      69891  physical reads
          0  redo size
        418  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> show sga

Total System Global Area 1071333376 bytes
Fixed Size                  1334380 bytes
Variable Size             327156628 bytes
Database Buffers          738197504 bytes
Redo Buffers                4644864 bytes
SQL> 
Can you please shed some light on these stats in Oracle 11g please ? (autotrace output of the second Sql call)
I would not expect any physical reads at all, but I cant believe what I'm seeing.
Tom Kyte

Followup  

July 26, 2009 - 7:01 am UTC

tkprof, undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache.

why the change in Stats between 10g and 11g ?

July 26, 2009 - 7:05 pm UTC

Reviewer: Srinath from Dayton, OH

Tom,
Why is there a change in the stats between 11g and 10g versions ? Is it like a new feature in 11g database ?

Thanks.
Tom Kyte

Followup  

July 26, 2009 - 9:39 pm UTC


anyway - we went over this a lot above - I'm not sure why this is a surprise. We beat this to death actually. You just ran an example we provided, and that example was provided to show that what you are seeing would actually happen....

But yes, you should always expect version X to work differently in some respects from version Y - else version X would be version Y wouldn't it.

July 27, 2009 - 12:51 pm UTC

Reviewer: A reader

Maybe I can rephrase Srinath's question regarding the change of behavior between 10g and 11g because I also have a similar question. This direct path reads or whatever new data access path that 11g is employing to retrieve data - why should it be a step down in terms of performance? If you ran the SQL in 10g and assuming the tables are fully cached, it will always be faster than running the same SQL in 11g because 11g is doing direct path reads. In other words, if I upgrade my database to 11g, the application will suffer significant performance degradation. So the question is how do we get around this problem? Thanks.
Tom Kyte

Followup  

July 27, 2009 - 7:49 pm UTC

... t will always be faster than running
the same SQL in 11g because 11g is doing direct path reads. ...

bzzzt - false.

did you read where I said above:

... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. ....

parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices)


You are making an assumption (cache must always be faster than disk) that is not true.

prove to us that reading 1,000,000 blocks from the buffer cache on a normal system is "always going to be faster" than getting from disk.

Followup to Your "July 24, 2009 - 7am US/Eastern" Followup

July 27, 2009 - 12:57 pm UTC

Reviewer: A reader

Tom, earlier in this thread, you mentioned that we shouldn't be setting too many initialization parameters - e.g. setting pools to zero. I was under the impression that if automatic memory management is used, by setting pools to zero, Oracle would automatically figure out over time what's the best settings for each of the pools based on the usage of the application. Is this not correct understanding? Can you point me to a thread on how automatic memory management should be used properly?
Tom Kyte

Followup  

July 27, 2009 - 8:28 pm UTC

You need not set anything other then the parameter to control how much memory to use - SGA_TARGET (just automatic sga memory management) or MEMORY_TARGET (in 11g, auto tune the pga AND sga together).

You need not set the individual pools

Sometimes we have to set parameters ....

July 30, 2009 - 6:20 am UTC

Reviewer: Nadya from Russia

Tom, "You need not set anything other then the parameter..." I thought so when we upgrade to 10.2.0.4.But two days later after upgrade the instanse was terminated by RECO:ORA-04031. Trace file:
...
"KGH: NO ACCESS " 1046641408
...
We have to set the pools manual.

Caching Tables

September 23, 2010 - 12:44 pm UTC

Reviewer: fakdaddy from MO

Curious about the comment

"Bear in mind tho... there is really no true way to a have a purely "in memory table -- even cached tables are subject to aging from the buffer cache. "

So for exaggeration purposes - I have a 1GB KEEP cache, and I assign 1x 100MB table to it - the blocks would be aged out ??

I have never configured a keep cache before as I figure oracle is cleverer than myself to mange. However we have a critical batch app (more critical than its day use) .... so we are considering a KEPP pool to cach high read/low size tables so they don't get aged out during the 12HOUR OLTP/DAY window.


Tom Kyte

Followup  

September 27, 2010 - 9:49 am UTC

your keep cache is the same as your recycle cache is the same as the default cache as far as holding blocks in memory go. The keep cache is not a special "keep this in memory" cache, it manages blocks the same way as the other caches. It is just a 'naming convention', you put things you want to 'keep' in the nicely sized keep pool and things you don't in the smaller sized recycle pool.


The goal of the keep pool would be to make it have very few objects - just the things you would like to have in the cache as much as possible - so they are not competing with lots of other stuff for space.

touch count algorithm

October 19, 2011 - 8:58 am UTC

Reviewer: A reader

Dear Tom,

I was reading the internals of touch count algorithm. I have a question. In the document it was told that,

"Oracle only allows buffer’s touch count to be incremented, at most, once every 3 seconds.
When a touch count is incremented buffer pointer should move. But movement of buffer pointer is independent of touch count increment. Also for any activity in memory area oracle needs a latch for assuring cache consistency. But there is an exception here !! For updating touch count, Oracle does not use latch and buffer block can be modified while touch count is getting incremented. But more interesting is that, two processes may increment the touch count to same value, and when this happens Oracle assures the worst that could happen is the touch count is not actually incremented every time a buffer is touched and that no cache corruption will result.
"

Therefore,
1) Does it mean at every 3 seconds the touch count is increased only once for each buffer? If not why?
Also, I couldn't understand how the latch process increments the touch count? \
2) How it is independently working with every 3 seconds process of incrementation of touch count to solve the problems?
3) When was Touch count algorithm introduced? Does it overwrites all the properties of Modified LRU?
Tom Kyte

Followup  

October 19, 2011 - 6:19 pm UTC

1) first the latch processing. A latch is another name for a lock, a serialization device. In general, if I have a latch on something - you cannot get it (some latches can be shared, but generally for read only access to some memory segment).

Since we are not using a latch to protect this counter - you can and will have the condition whereby two (or more) processes attempt to increase the counter at the same time. Suppose you and I both read the same block at about the same time. We both want to update the touch count. Suppose the existing touch count was 42 - we would both see 42, and then set the count to 43. It really should have been "44" since we both read it - but it won't be. But this is OK - we haven't 'corrupted' anything and we still have a 'pretty good' count.

a touch count for a block is only incremented after about 3 seconds.

2) I don't know what you are trying to ask there.

3) it *is* a modified LRU type of algorithm. It was introduced back in the Oracle 8i days.


goood

October 20, 2011 - 12:11 pm UTC

Reviewer: A reader


Table Caching

February 21, 2012 - 12:33 am UTC

Reviewer: A reader

hi Tom,

I have five-six tables having row count between 1000 to 25000. Is it advisable to keep the tables of such sizes in memory through CACHE?

What is the ideal size for a table with 20 columns with average column size as 20 bytes. The SGA is 3GB

What are the drawbacks of the CACHE aaproach?

Please advise.
Tom Kyte

Followup  

February 21, 2012 - 7:21 pm UTC

no, do not use cache, if you use them - they'll be cached.


... What is the ideal size for a table with 20 columns with average column size as
20 bytes. ...


you are missing an important metric ;) the number of rows.


alter table t cache does not cause a table to be cached. It changes the way the blocks are managed in the cache when the blocks are read during a full scan only. If you are using index access, it will have no affect.

Buffer Cache & Parallel Query with Direct I/O

May 30, 2012 - 5:11 am UTC

Reviewer: Ian Wallace from Dublin

Hi Tom,

I've read a number of comments with interest in this thread, especially around the use of the buffer cache and parallel queries. In the thread about you state the following:

"Do the blocks need to be kept cached? should they be cached? are you using parallel query (more
common with large full table scans) - then the cache doesn't really matter (direct io).
"

"... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. ...."

And finally....

"parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices)"

This topic is very pertinent to us at the moment. We're designing a reporting solution whereby we have created one denormalized table which is composite partitioned RANGE - HASH, the theory being that with partition elimination we will always full scan a sub partition. We have a 4 node RAC cluster and we're hoping to take advantage of parallelism.

We're currently designing this solution with some external consultants who have suggested defining this table in the KEEP pool. The size of the KEEP pool would be circa 10GB whereas the size of our SGA is currently 30GB. Based on what you've said above would we actually make this of this pool, especially if parallel query is using direct I/O? Also could you provide a simple example showing how the use of parallel query negates the use of the buffer pool?

The database version is 10.2.0.2 on Red Hat Linux 4.

Many thanks,
Ian.

Tom Kyte

Followup  

May 30, 2012 - 7:37 am UTC

ask them "why", since full scans in general will flush the buffer cache and do direct IO from disk?

Just run a parallel query and tell me if you see logical IO's or not ;)

In 11g, there is a new in memory parallel query
http://docs.oracle.com/cd/E11882_01/server.112/e16638/memory.htm#i30761

but that doesn't apply to you.

and even then, I wouldn't be using the KEEP pool


May 30, 2012 - 11:23 am UTC

Reviewer: Alexander

"ask them "why", since full scans in general will flush the buffer cache and do direct IO from disk?"

Did you mean "will NOT flush the buffer cache..."?

Or what the CACHE table setting somewhere implied that I missed?
Tom Kyte

Followup  

May 30, 2012 - 11:57 pm UTC

No, I meant it would do a segment level checkpoint - to get the blocks for the segment you are scanning onto disk so it can bypass the inefficiencies of the buffer cache and just do direct IO from disk.

if they didn't flush those blocks to disk - they might read the wrong version (there could be a committed version in the cache that is not on disk).



I should have said "parallel full scans" for 10g, but in 11g - just saying "full scans" is sufficient

May 30, 2012 - 1:27 pm UTC

Reviewer: Ian Wallace from Dublin

Hi Tom,

Could you clarify what you mean when you say full scans in general will flush the buffer cache and do direct path I/O from disk. My understanding is that parallel query in 10g will just bypass the buffer cache and always do direct I/O?

Thanks,
Ian.
Tom Kyte

Followup  

May 31, 2012 - 12:00 am UTC

see right above, I clarified that.

'keep' pool

June 01, 2012 - 12:38 am UTC

Reviewer: A reader

Hi tom,

How is ORACLE managing 'keep' pool itself?

Let's see i use 'storage' clause to put many tables into 'keep' pool, whose total size can not be
accormordated by 'keep' pool size, which one will be 'out'?
Is it following the same logic that full table scan one will be the first candidate?
Tom Kyte

Followup  

June 01, 2012 - 6:48 am UTC

the keep and recycle pool are identical in nature of the "buffer pool"

by default segments are cached in the buffer pool, the default pool

You can alter/create a segment and specify to use the keep or recycle pools instead. Once they are marked for those pools they will be cached in those pools instead of the default pool - but the way they are cached are identical to the default pool.

I wish the keep pool was named "non-default pool1" and the recycle pool was named "non-default pool2".


In your question:

Let's see i use 'storage' clause to put many tables into 'keep' pool, whose
total size can not be
accormordated by 'keep' pool size, which one will be 'out'?
Is it following the same logic that full table scan one will be the first
candidate?


just replace keep with 'default' (or recycle even) and the answer is the same.

when the X pool fills up, dbwr is called upon to make space in it using LRU like algorithms.

Keeping table in memory

September 27, 2012 - 9:13 am UTC

Reviewer: Olaf

We are on Oracle 11.2.0.3.

We have a problem with some business critical maintenance jobs running too long (alter table move lob).

We need some workaround to improve the execution time. Oracle Support confirms that neither parallel DDL nor datapump or CTAS can not be used on tables with LOB.

What we are doing is putting tables into keep_pool by:

alter table X storage (buffer_pool keep) modify lob (data) (storage (buffer_pool keep));

than precache the table by starting set of 10-20 jobs or starting select /*+ parallel (t 16) */ * from X t;

In any case

SELECT sysdate, COUNT(*)*8*1024/1024/1024/1024 GB FROM V$BH WHERE OBJD = 5839675;

shows that exactly 2/3 of the table will be cached - in case of 45 GB table it is 30 GB or in case of 27 GB table it is 18 GB.

In this case following alter table move need to read 1/3 of the blocks (we traced both alter table move lob with and without pre-caching, in one case we need to read about 1000000 blocks from disk and the overall time reduced by 2/3 in other case (without pre-caching) we need to read 3000000 blocks and need factor 3 of time.

Can you explain why oracle stop to cache data in keep_pool after it reached about 2/3 of the table size. There is still plenty of (unused) memory in keep_pool (is 70 GB currently).

Should we change the table to cache in addition to changing storage to keep_pool. What would be a syntax for table with lobs. The tables consist of just two columns id and xmlrecord (xmldata), size of table is just 1% of size of LOB segment (out of row storage).

Your answer is really appreciate.

Many thanks in advance
Tom Kyte

Followup  

September 27, 2012 - 3:20 pm UTC

Oracle Support confirms
that neither parallel DDL nor datapump or CTAS can not be used on tables with
LOB.,


ummm, that is incorrect. are you sure you understood what they said - did they really say that?


if you have to wait for the IO to complete to "precache", what is the point??? I mean - you still have to wait for the IO???? what is the logic behind all of this?


what relevance is the alter table move stuff? that does direct IO and pretty much bypasses the cache anyway.

what are you really trying to accomplish here.

Caching table for creating multiple indexes

February 26, 2013 - 11:13 am UTC

Reviewer: dmills from Pennsylvania

I have only ever found one decent use for caching a table and it had nothing to do with query performance. When I migrated to 11.2, I had to import a few 100 million record tables that were around 45 GB in size. I used datapump to import my tables and create all my indexes, excluding indexes on these large tables.

What I did was create a script that "alter table cached" a table, set my workarea_size_policy to manual, set my sort_area_size to 2GB, and ran my index creation script with a parallel degree on that table. Afterwards I set it back to nocache. The first index would take 10 minutes, but each subsequent index creation took 1 minute or less, with each parallel processes using up to 2GB of pga memory. This method temporarily consumed a lot of memory, but it was a new system with plenty of RAM. It reduced my migration time by approximately 2 hours over using datapump alone.

There were other ways to do this of course. I could have transported tablespaces. I was going from solaris to linux, so the endian change conversion consumed a lot of time. Overall this ended up being the fastest way I could figure out to do the migration and stay in the time window allocated... and it was kind of fun to watch the index creations fly by.

Caching

May 22, 2013 - 5:39 pm UTC

Reviewer: A Reader from USA

Hi Tom

Two of the tables in our system are

a) Company (4 million rows, 812 MB size)
b) Company_Alias (1million rows, 178 MB size)

Being master data we expect sizing to stay fairly consistent over time.

Our system (using Oracle 11g) needs to support a Search feature using LIKE searches on Company Name (in Company Table) and Company Alias Name (in Company_Alias Table).

LIKE forces a Full Table Scan on these tables and the results can take more than a 3 minutes which is causing a lot of problems. The execution times for the query (as expected) are very dependent on data being in cache.

So we are considering keeping these two tables in Keep Pool and have some questions in this regard.

a) Since we are not using KEEP Pool currently, if we only assign these two tables to the Keep Pool and size it sufficiently to hold data from these two tables, it is fair to assume that these two tables will not be aged out and will always be available in the Keep Pool?

b) Since the two tables are about 1 GB, how big of a KEEP pool should we plan. I read that the KEEP pool should be atlease twice the table size - is that correct?

c) Since the LIKE work only on the Party Name and Alias Name, can we create indexes on those columns and put those indexes in the KEEP Pool (so that we dont have to allocate large amount of space)

for example if we have a query such as

select *
from company join address on ....
join identifiers on ...
where company_id in (select company_id from company where company_name like '%IBM%')
or company_id in (select company_id from company_alias where company_name like '%IBM%')

the two inner queries that have only the company_name and company_alias may be able to use the index from the keep pool

once the rows are identified then for the small number of selected rows it can fetch the data from the company tanle and any other tables using the normal method (physical io/default pool). does this make sense?

d) Other approach we are looking for is to create text indexes on these columns and use the oracle text based searches. Ifwe were to go that route which of the DR$ tables would be most useful to put in the keep pool

Thanks in advance
Tom Kyte

Followup  

May 22, 2013 - 7:11 pm UTC

a) not unless you 'cache' them - we do not do things differently in the keep/recycle pool really - they are just separate, non-automated buffer caches, they would do a full scan just like the default pool.

b) depends on how much you modify these tables. If you don't modify them - it'll take less since we won't have multiple versions to keep in the cache. If you do modify them - it'll be very dependent on how much you do modify them.

If you do not modify these often - you should almost certainly:

alter table t compress;
alter table t move;
(rebuild any indexes)

and consider doing the move after any "refresh" of this data. If you have enterprise edition, you have the ability to use this basic compression.

c) have you considered text indexes at all?


you could do the subqueries, just use rowid instead of the primary key (no need to go index to index to table...) You might have to play with it a bit - as the cardinality estimates will be way off with the %xxx% stuff. For example, it might be something like this:


ops$tkyte%ORA11GR2> /*
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> drop table t1;
ops$tkyte%ORA11GR2> drop table t2;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t1
ops$tkyte%ORA11GR2> as
ops$tkyte%ORA11GR2> select object_name company_name, object_id company_id, a.*
ops$tkyte%ORA11GR2>   from all_objects a;
ops$tkyte%ORA11GR2> create index t1_idx on t1(company_name);
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t2
ops$tkyte%ORA11GR2> as
ops$tkyte%ORA11GR2> select object_name company_name, object_id company_id, a.*
ops$tkyte%ORA11GR2>   from all_objects a;
ops$tkyte%ORA11GR2> create index t2_idx on t2(company_name);
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T1' );
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T2' );
ops$tkyte%ORA11GR2> */
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 4000000, numblks => 500000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 1000000, numblks => 100000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> with
  2  t1_rids as (select /*+ materialize cardinality( t1 100 ) */ rowid rid from t1 where company_name like '%Oracle%'),
  3  t2_rids as (select /*+ materialize cardinality( t2 100 ) */ rowid rid from t2 where company_name like '%Oracle%')
  4  select *
  5    from (select *
  6            from t1
  7           where rowid in (select * from t1_rids)) t1,
  8          (select *
  9             from t2
 10           where rowid in (select * from t2_rids)) t2
 11   where t1.company_id = t2.company_id
 12  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3932266967

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                            |     1 |   284 | 10519   (1)| 00:02:07 |
|   1 |  TEMP TABLE TRANSFORMATION     |                            |       |       |            |          |
|   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D6675_6FF4833 |       |       |            |          |
|*  3 |    INDEX FAST FULL SCAN        | T1_IDX                     |   100 |  3700 |   107   (1)| 00:00:02 |
|   4 |   LOAD AS SELECT               | SYS_TEMP_0FD9D6676_6FF4833 |       |       |            |          |
|*  5 |    INDEX FAST FULL SCAN        | T2_IDX                     |   100 |  3700 |   107   (1)| 00:00:02 |
|   6 |   NESTED LOOPS                 |                            |     1 |   284 | 10305   (1)| 00:02:04 |
|   7 |    MERGE JOIN CARTESIAN        |                            | 10000 |  1503K|   302   (0)| 00:00:04 |
|   8 |     NESTED LOOPS               |                            |   100 | 14200 |   102   (0)| 00:00:02 |
|   9 |      VIEW                      |                            |   100 |  1200 |     2   (0)| 00:00:01 |
|  10 |       TABLE ACCESS FULL        | SYS_TEMP_0FD9D6675_6FF4833 |   100 |  1200 |     2   (0)| 00:00:01 |
|  11 |      TABLE ACCESS BY USER ROWID| T1                         |     1 |   130 |     1   (0)| 00:00:01 |
|  12 |     BUFFER SORT                |                            |   100 |  1200 |   301   (0)| 00:00:04 |
|  13 |      VIEW                      |                            |   100 |  1200 |     2   (0)| 00:00:01 |
|  14 |       TABLE ACCESS FULL        | SYS_TEMP_0FD9D6676_6FF4833 |   100 |  1200 |     2   (0)| 00:00:01 |
|* 15 |    TABLE ACCESS BY USER ROWID  | T2                         |     1 |   130 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("COMPANY_NAME" LIKE '%Oracle%')
   5 - filter("COMPANY_NAME" LIKE '%Oracle%')
  15 - filter("T1"."COMPANY_ID"="T2"."COMPANY_ID")

ops$tkyte%ORA11GR2> set autotrace off



the materialize (or you can play with no_merge) will prevent the optimizer from recognizing that it could just to a single table query...


d) you wouldn't, you would use this sort of index:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:37336026927381#37360199308431

and just let it be.

May 24, 2013 - 3:29 pm UTC

Reviewer: A reader

hi tom,

why there is Merge join at step-7 and also nested loop again at step-6?

Thanks

Tom Kyte

Followup  

May 29, 2013 - 4:53 pm UTC

to pick up the table data - step 15.