Is IOT appropriate here?
andrew, July      08, 2002 - 12:38 pm UTC
 
 
Tom
Please could you comment on the appropriateness of an IOT for this problem. If appropriate - under what circumstances would the cluster solution be better? 
 
July      08, 2002 - 4:40 pm UTC 
 
 
probably not.
A cluster just ensures that data that shares a common key is stored together.  
An IOT goes a step further and imposes "order" on the data -- 1, 2, 3, 4, 5, ... type of order. 
It is more expensive to insert into an IOT than a cluster -- as we might have to physically MOVE data to make room for new data (index splits and such).
In a cluster, we just make sure that like data is stored together.  It doesn't matter that physically we store 1, 5, 2, 3, 4 -- just that all of the 1's are together, all of the 5's and so on.
Since the goal here was to cluster like values together (to reduce the physical IO's and make better, more efficient use of the buffer cache), a cluster would probably be more appropriate.
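As a sketch of what that single-table b*tree cluster could look like (the table and column names are made up for illustration; SIZE should reflect the 10-200 rows expected per key):

```sql
-- illustrative only: a b*tree cluster keyed on TXN_KEY, with SIZE chosen so
-- the expected rows per key value tend to share a block
create cluster txn_cluster
( txn_key number )
size 4096
/

-- unlike a hash cluster, a b*tree cluster needs its cluster index
-- created before any rows can be inserted
create index txn_cluster_idx on cluster txn_cluster
/

create table txns
( txn_key  number,
  txn_date date,
  amount   number
)
cluster txn_cluster ( txn_key )
/
```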
 
 
 
 
confused by 'single table' hash cluster in Chapter 6
Doug, July      08, 2002 - 8:29 pm UTC
 
 
Tom - in this question here, you are suggesting a single table might be worth clustering to co-locate the data.  In your book (pg. 238) you make reference to a special case of the single table hash cluster. You say this is designed for those occasions when you want to access a table by primary key and don't wish to cluster other tables with it.  "If you need fast access to an employee record by EMPNO - a single table hash cluster might be called for".
In the example in the book it looks like just a regular old table with a standard primary key.  In this example in the thread it looks like there are many records with the same value.   
So..
If I have a table that is not joined to other tables and I need fast access to it by its primary key, do I want a single table hash cluster?  If so, isn't it somewhat more common to just make use of the index on the primary key? If I can predict the size of this table, is it worth single table hash clustering? In this question thread, it seems we can store many values together but in the book example, I think you imply that empno is guaranteed to be completely unique between 1 and 50,000.
The example here makes sense but the book example is confusing me.  Tables are accessed by primary key quite frequently - I would assume I would see more single table hash clusters if they are worth doing in single table mode.  
 
July      09, 2002 - 7:05 am UTC 
 
 
I think that either
o I'm confused by your question
o you're confusing topics
In this question -- the user wants to physically colocate related data by some value.  "... each key generally having around 10-200 rows."  They would like all data with the same key value physically stored near all of the data with the same key.
The answer to that is CLUSTER (well, more explicitly -- B*Tree CLUSTER).  In their case, it'll be a single table b*tree cluster but that is not any different than a multi-table one in our eyes.
Now, you are talking about a table with a primary key (unique) AND that you want to access purely by primary key.  That is when a HASH cluster comes into play.  Normally, you would create a table with a primary key index and then when accessing the table -- we would:
o index unique scan the index (say that is 2-3 IO's to find the key)
o table access by rowid (that is 1 IO to get the data)
Using a HASH cluster however we can take that above work and instead:
o hash the primary key (CPU operation) into a physical location
o do the single IO needed to read that block
So, we can cut out the index access altogether.  "the data is the index"
Even if that table is JOINED to other tables -- this still makes sense, as long as the access path is generally via the primary key (and don't forget -- we can create other indexes on this object as well!)
So, in this thread, I was talking about b*tree clusters (see the book on page 224).  The example I used there was EMP+DEPT and a cluster key on DEPTNO, not EMPNO. The goal was to get the DEPT info physically colocated with the EMP info for that DEPTNO.
Later, I go on to HASH clusters.
Why are hash clusters not the norm, then?
a) you don't know the number of records ultimately
b) there is a performance penalty on insertions (see above -- i discuss that in the answer)
c) 99% of people aren't aware of them, most people never bother to learn about them
I've used them to great success with some very large tables. 
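The two access paths described above might be set up like this (the EMP-style names, hashkeys and size values are assumptions drawn from the book example mentioned in the question, not from this thread):

```sql
-- conventional heap: index unique scan (2-3 LIOs) + table access by rowid (1 LIO)
create table emp_heap
( empno number(5) primary key,
  ename varchar2(30)
)
/

-- single table hash cluster: hash(:n) computes the block address directly,
-- typically a single LIO with no index access at all
create cluster emp_hash_cluster
( empno number(5) )
hashkeys 50000      -- expected number of distinct keys
size 100            -- expected bytes of row data per key
single table
/

create table emp_hashed
( empno number(5),
  ename varchar2(30)
)
cluster emp_hash_cluster ( empno )
/
```

The same `where empno = :n` query against the two tables illustrates the difference: the heap table shows INDEX UNIQUE SCAN plus TABLE ACCESS BY INDEX ROWID in its plan, the clustered table shows TABLE ACCESS HASH.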
 
 
 
collision ....
Reader, November  01, 2003 - 10:31 pm UTC
 
 
Tom, in your new book on page 385, you mention collision in hash table. Could you please elaborate what it is and how exceeding value for hashkeys parameter affect the hash table? Thanks for your time. Your book is great. 
 
November  02, 2003 - 9:59 am UTC 
 
 
a collision occurs when 2 "hash key values" -- the values that are hashed -- result in the same "location", the same place, the same hash value.
You are taking a possibly infinite set of values (the keys you hash) and are hashing them into a finite set of values.  You will have unavoidable "collisions" that result in a little extra lookup time (we hash to where we expect to find the data and then have to look for the data -- it might not be "right there" since something else may well have hashed to that value already as well).
If you set hashkeys too low and many values hash to the same location, we'll have to link together lots of blocks in order to find your data.  Say you set hashkeys to 5 (for illustration purposes).  Oracle will conceptually set up a 5 block table:
+----+----+----+----+----+
|    |    |    |    |    |
|    |    |    |    |    |
+----+----+----+----+----+
Now, you create a table in this hash cluster based on a key (say "id").  It turns out that 50 rows will fit onto a single block with your data.  You start inserting information and get 'lucky', the data spreads out evenly so that you have 250 rows spread across 5 blocks.  
Now, any retrieval such as "select * from T where id = 55" will result in 1 logical IO since Oracle will hash 55 to a database block address and read that block, finding all of the data for id=55.  
Now, you insert row 251 -- with id=60.  It cannot fit onto the block it hashed to since that block is totally full.  Assume further that 55 and 60 "hash" to the same block address.  Oracle will link another block to it:
hash(55) = block 3 = hash(60)
            |
            V
+----+----+----+----+----+
|    |    |50  |    |    |
|    |    | 55 |    |    |
+----+----+----+----+----+
            |
            V
          +----+
          | 60 |
          |    |
          +----+
So, now 
"select * from t where id = 60" 
will take 2 lio's (as well where id = 55, where id = 50) since we had so many "collisions" that we filled up that block and had to link other blocks to it. 
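The deliberately undersized cluster from that illustration could be created with something like this (hashkeys 5 is purely illustrative, as in the text; the column names are made up):

```sql
create cluster tiny_cluster
( id number )
hashkeys 5          -- deliberately tiny, for illustration only
size 8192           -- roughly one block per hash key
single table
/

create table t
( id      number,
  padding varchar2(150)
)
cluster tiny_cluster ( id )
/
-- once a block fills, further rows whose keys hash to it spill onto
-- chained blocks, and lookups on any key that hashes there cost extra LIOs
```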
 
 
 
Great explanation - collision
Reader, November  02, 2003 - 10:53 am UTC
 
 
I understand that where id=60 will take two LIOs. Why for id=50 and id=55 also take 2 LIOs? Thanks again. 
 
November  02, 2003 - 12:27 pm UTC 
 
 
because the data for 50 and 55 could also be on that linked block.  if you were to insert yet another row with id=50, it would hash to that block; that block is full -- hence it goes onto the linked blocks.  once linked -- any of the data that could be on the head block might be on the linked blocks as well -- we have to check for it. 
 
 
 
Comparing Lookup Methods
A reader, November  09, 2003 - 1:20 pm UTC
 
 
In your new book, you compared the different methods of doing lookups.
For the heap table, you used an example of TABLE ACCESS BY INDEX ROWID, which is not as efficient as a SINGLE-TABLE HASH CLUSTER or IOT. 
I am wondering if we could achieve the same or better performance if we use INDEX FAST FULL SCAN on the heap table?
Thanks. 
 
November  09, 2003 - 2:07 pm UTC 
 
 
an index fast full scan ??? to do a keyed read????
nope, not a chance (think about it -- do you really want to full scan the entire index each time??????) 
 
 
 
Single-Table B-Tree Cluster
A reader, November  09, 2003 - 3:34 pm UTC
 
 
Thanks Tom.
According to Oracle DBA Guide:
"The SINGLE TABLE option is valid only for hash clusters. HASHKEYS must also be specified."
Is it right that I should not use a single-table b-tree index cluster?
 
 
November  09, 2003 - 5:09 pm UTC 
 
 
no, it is not right.
you can use a b*tree cluster very effectively with a single table in it.  Consider a single table in a b*tree cluster that has a key "deptno" on it. The single table is EMP.  Now -- all emps in the same deptno will be physically "colocated" together.  If that is advantageous to your application, it'll be a good thing. 
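A sketch of that single-table b*tree cluster on DEPTNO (sizes are illustrative):

```sql
create cluster emp_cluster
( deptno number(2) )
size 1024           -- expected bytes of EMP data per department
/

create index emp_cluster_idx on cluster emp_cluster
/

create table emp
( empno  number(4) primary key,
  ename  varchar2(10),
  deptno number(2)
)
cluster emp_cluster ( deptno )
/
```

Note there is no HASHKEYS and no SINGLE TABLE clause here -- those are hash cluster options; a b*tree cluster simply takes a cluster index.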
 
 
 
IOT versus clustered table
Dushan, November  10, 2003 - 2:32 am UTC
 
 
As it was said, insert into IOT table is more expensive than insert into clustered table. Is this the only benefit? 
As for me, the biggest cons of clustered tables in general are:
1) Not possible to partition
2) Not possible to direct path load (same for IOT, I think)
So, when we store a huge amount of data in a clustered table, deleting from this table turns into a nightmare. From this point of view the IOT wins (truncate/drop partition).
 
 
November  10, 2003 - 7:40 am UTC 
 
 
it was said?  where?
A cluster allows you to store rows from many tables together on a single block.  All of the data from "DEPT" and "EMP" can be physically stored on a single block.
IOT's do not have that.
They all are just TOOLS -- to be used at the appropriate time.  Look at the Oracle data dictionary -- when you create a table (with columns, indexes, constraints, etc) -- we store that in a cluster so that most of the information about a TABLE will be on the same block.
You need to be aware of how each works, what it can and cannot do for you and so on.
 
 
 
 
Clustered table experience
Dushan, November  10, 2003 - 10:52 am UTC
 
 
Understood -- the Oracle data dictionary is an excellent example: changes to the data dictionary are rare (usually), quick response is required, low IO.
Unfortunately, I have seen rather strange table designs on two billing systems regarding the use of clusters.
In the first case, weekly CDR tables (big tables) were clustered on the customer code column, and were joined in a partitioned view. So, no problem with getting rid of data for one week; nevertheless, inserts of new CDRs suffered. Everybody believed that the billing process was much faster using these clustered tables, until we tested (by accident) the whole process on regular heap tables with regular B-tree indexes. No impact on performance at all!
Second case - different client, different billing system, but the same CDR story. Even worse - a really huge clustered table and a problem with deleting. Just a nightmare! 
Of course, I agree with you that this all about TOOLS to be used in proper way.
(Just reading your Effective Oracle by Design, excellent book!)
 
 
 
IOT / hash / heap
Marcio, January   22, 2004 - 9:23 pm UTC
 
 
Just to clarify: 
Scenario:
DB A
t_source ( id_contract   varchar2(17), name_con varchar2(40), ... )
DB B
t_interface ( id_contract varchar2(17), dt_load date )
t_process (id_contract varchar2(17), dt_process date )
t_contract (id varchar2(17) pk, name ..., state check ('A','B','C','D','E'))
t_history (qty_rows_process number(7), qty_files ...)
There is no fk references between tables.
1) A job #1 wakes up --> collects all data from t_source between dt1 and dt2 and inserts into t_interface, setting dt_load to null. About 2,000,000 rows at a time -- interval 1h (8am - 5pm), so we're talking about 20,000,000 rows/day.
2) Another job #2 wakes up --> insert into t_process from t_source where dt_load is null, set dt_load.
3) Yet another job #3 wakes up --> select t_process, t_contract where id_contract = id and state in ('A','B') -- export some files with information based on state, insert stats into t_history and set dt_process.
The tables t_contract and t_source I can't touch. 
Which would be the right way to create the rest of them (t_interface, t_process and t_history)? IOT / b*tree cluster / hash cluster / heap / other? And please explain...
 
 
January   23, 2004 - 7:09 am UTC 
 
 
those structures you mention
o IOT's
o b*tree cluster
o hash cluster
are all about ACCESS -- they are used to optimize your frequent retrievals from the tables you load into.
Hence, a description of a load cannot tell us what sort of structures you should use to load.  You have to look at 
o how you use the data after loaded
o whether locality of data would be useful
if you want examples and discussions on this -- I go over it in both "Expert one on one Oracle" and "Effective Oracle by Design" in some detail. 
 
 
 
IOT / hash / heap - Continued
Marcio, January   23, 2004 - 7:38 am UTC
 
 
The system above is all about. As is!
I've found all the relevant information about iot, b*tree cluster and so on in your book, but I'd like to know your opinion about what type of creation would suit the tables in the scenario above -- if you don't mind.
Btw - fyi: you still have the warning on the followup page about your trip. "Bear in mind, I am out of the country from Jan 10th ..." 
 
January   23, 2004 - 8:12 am UTC 
 
 
you only load, you never query... hmmm
fastest way to do that is -- don't load.  you don't need it i guess.  if that is "the system", just don't load -- you never query ;)
there you go.
(there MUST be more to the system, else this system does nothing.  If you want to learn what structures would be best -- you need to understand how these structures work
a) server concepts
b) my books
you can read those to gain understanding!)
the scenario above would call for simple "heap" tables.
the missing part is "what really happens to the data later" -- that is where iot/hash clusters/b*tree clusters etc come into play
 
 
 
 
Load performance...
Scott Watson, March     18, 2004 - 4:50 pm UTC
 
 
Tom,
One of the apps I support has a query like this.
select many columns from part_iot_table
where ctry_cd = 'XX' and id = 999;
On average there are 9 rows per id, ctry_cd will always be the same.
I decided to create the table as a partition IOT to colocate the data together around the id column.  However, the team has decided that they also want to query the table like so.
select many columns from part_iot_table
where id = 999;
Now my partition elimination scheme is out the window.  I could create a global secondary index on the subj_id; however, I was thinking that maybe a single table hash cluster on id might solve both problems at once.
My problem is that loading the single table hash cluster is so slow that this option does not seem feasible.  Hopefully you can share your experience on increasing load performance for these types of objects.  So far it has been running for about 8 hours and I have no way to estimate when it will complete.
Since I am loading from the IOT and the data is already sorted by id, I would expect that most blocks would be cached after the first read and not accessed again.
I tried to test the single hash cluster by loading a single partition of 30M rows.
My cluster was created with the following syntax.
create cluster match_data_cluster
( subj_id number(13))
single table
hashkeys 30000000  -- should have used 30M/9 rows per key
size 94
tablespace SCOTT_DONT_USE
pctfree 0
parallel 4
Can you suggest anything I can do to increase the load performance.  I am not running in Archive log mode on my test server.
My next option, if I cannot get the load to speed up, would be to try and create a hash partitioned IOT on the id column and add a dummy column using rownum to ensure that I have a primary key.  A heap table was out of the question as we sometimes need to look up 5000 ids at a time in the 300M row table and there is too much bouncing around. 
I would appreciate your thoughts and comments.  I am running 9.2.0.3 on HP 11.
Thanks once again
Scott Watson 
 
March     18, 2004 - 5:17 pm UTC 
 
 
I'm confused -- if they query:
select many columns from part_iot_table
where ctry_cd = 'XX' and id = 999;
select many columns from part_iot_table
where id = 999;
the primary key of the IOT for clustering should be:
(id,ctry_cd,.....anything else you need...)
and the data would be partitioned by ID ranges/hashes/whatever...
so, why doesn't the partitioned IOT work for you? 
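A sketch of such a partitioned IOT (the SEQ column is an assumption, added only to make the key unique given the ~9 rows per id mentioned in the question; the payload column and partition bounds are made up):

```sql
create table part_iot_table
( id      number,
  ctry_cd varchar2(2),
  seq     number,           -- assumed filler column to make the key unique
  payload varchar2(40),
  constraint part_iot_pk primary key ( id, ctry_cd, seq )
)
organization index
partition by range ( id )   -- range partitioning on the leading PK column
( partition p1   values less than ( 100000000 ),
  partition pmax values less than ( maxvalue )
)
/
```

With ID leading the primary key, both `where ctry_cd = 'XX' and id = 999` and `where id = 999` can use the IOT structure and get partition elimination.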
 
 
 
More information
Scott Watson, March     19, 2004 - 7:21 am UTC
 
 
That was my plan if the hash cluster didn't work.  However, I didn't want to give up on the single table hash cluster just yet.  I was hoping you might know why the load would be taking so long.
 
 
March     19, 2004 - 9:47 am UTC 
 
 
funny -- thought the IOT was the plan but abandoned because you thought it would not work, when it would have to be built with the key in that order in the first place :)
you have size 94, is 94 bytes sufficient to hold 9 rows per hash key? 
 
 
 
More info cont..
Scott Watson, March     19, 2004 - 11:38 am UTC
 
 
The 94 bytes is the space used per row.  I guess I should be using 94*9 instead; an oversight on my part.
The IOT is a great option and it is what I had been recommending.  The problem was that they added another query that wasn't suited to how the table was originally built. 
So given the new set of requirements, I thought I would build one IOT version and one single table hash cluster   and compare the performance difference between the two.  ( I would have never thought of using the cluster until I read your new book)  Once I saw the terrible performance of the load into the cluster I decided I did something wrong and sought your advice.  I did not mean for you to think that I was not considering the IOT.
Thanks  
 
 
March     19, 2004 - 1:55 pm UTC 
 
 
well, part of the horrid performance would be caused by the massive collisions and chaining brought about by undersizing the cluster by an order of magnitude!
"size 1000"  
 
 
 
Follow up on index clusters
Christo Kutrovsky, April     27, 2004 - 4:36 pm UTC
 
 
When you have btree single table clusters, and you set the size low, i.e. several rows with different cluster keys share the same block, how are the blocks linked when 1 block is not enough to hold the data?
Is it simply "all those cluster keys share that linked list of data blocks" ? Or is it "the next block for cluster key <bla> is this <rowid>" ?
We have a big table (300 million rows, ~120 bytes per row) which has a list_id with ~150,000 distinct values, very skewed. For some list_id values we will have 1 row, for others 1 million rows. I was thinking of using a btree cluster index. I decided I will first try the cluster with 2 columns (list_id, domain_id). I am testing it now, but the problem is with those small list_ids: I would have 10 rows with the same list_id, but different domain_ids, thus spread over 10 data blocks. Initially I decided that I will have 1 cluster key per block, but now I may need to reconsider my decision, and depending on your answers I may just drop the idea.
I am also open to other ideas.
9.2
 
 
April     28, 2004 - 2:11 pm UTC 
 
 
well, it is not 100% about size -- it is related more to how the data is inserted.
Let's say you created a key on "X".
You insert 100 records with X=5
then
You insert 100 records with X=6
and assume 75 rows fit on a block.....
Well, the first 75 rows for x=5 would go onto a block (the first x=5 set the cluster key -> rowid ) -- and then the next 25 would go onto a chained block.
Then, the first row for x=6 would go onto its own block (discovering that the block it *wanted* to go onto was full) and the next 74 would follow it and then the next 25 would be chained.
Now, if you loaded x=5, x=6, and then 99 x=5's and 99 x=6's -- then x=5 and x=6 would share the same block and both would chain -- so you would have 200 rows on  two blocks.
IOT's might be something to look into for locality of data (getting the rows together) without having to "right size" it.  that and you can partition as well. 
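As a sketch of that IOT alternative for the table described (the payload column is illustrative, and this assumes (list_id, domain_id) is unique; if it is not, a third column would be needed in the key):

```sql
create table big_list_iot
( list_id   number,
  domain_id number,
  payload   varchar2(100),
  constraint big_list_iot_pk primary key ( list_id, domain_id )
)
organization index
/
-- rows land in b*tree order regardless of arrival order, so rows with the
-- same list_id end up physically together without sizing any cluster
```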
 
 
 
clustered materialized view on heap table
Harry, January   08, 2005 - 10:01 pm UTC
 
 
Is it possible to create clustered MV for query and use heap table for DML? 
 
January   09, 2005 - 11:32 am UTC 
 
 
well, as long as you understand that the DML applied to the heap table is ultimately also applied to the MV... sure, the MV can have very different structural qualities from the parent table.
ops$tkyte@ORA9IR2> create cluster single_table_hash_cluster
  2  ( hash_key number(4) )
  3  hashkeys 50000
  4  size 45
  5  single table
  6  /
 
Cluster created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
 
Table created.
 
ops$tkyte@ORA9IR2> alter table emp add constraint emp_pk primary key(empno);
 
Table altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create materialized view emp_mv
  2  cluster single_table_hash_cluster(empno)
  3  as
  4  select * from emp;
 
Materialized view created.
 
 
 
 
 
 
HASHKEYS
Jim, February  10, 2005 - 8:18 pm UTC
 
 
Is there a way to change the HASHKEYS parameter value after the cluster is created and populated? I was wondering that could help to avoid "collisions". Thanks. 
 
February  11, 2005 - 7:53 pm UTC 
 
 
hashkeys is used to size the cluster initially and set up the hash function to map the hash key to a DBA (data block address)
they take (hashkeys * size / block size) and allocate that much space.
so, changing hashkeys would require a rebuild. 
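A rebuild along those lines might look like this (the names and figures are illustrative):

```sql
-- hashkeys cannot be altered in place: create a correctly sized cluster,
-- copy the data across, then swap the tables over
create cluster new_cluster
( id number )
hashkeys 100000     -- the new, larger value
size 850
single table
/

create table t_new
( id  number,
  val varchar2(40)
)
cluster new_cluster ( id )
/

insert into t_new select * from t_old;  -- direct path load is not supported for clusters
drop table t_old;
rename t_new to t_old;
```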
 
 
 
Loading data
Peter Tran, February  16, 2005 - 2:48 pm UTC
 
 
Tom,
In your above example with X=5 and X=6, are you suggesting that we should try to load the data with the same clustered key at the same time to get the most benefit?  
If this is not possible, then would using a cluster still make sense?
Thanks,
-Peter 
 
February  16, 2005 - 3:04 pm UTC 
 
 
if you want the most perfect cluster ever -- yes, you would attempt to load "in order"
however, if they do arrive out of order -- but the cluster was sized for it (the keys and size were "right") -- the cluster would still be beneficial.
In the example above, with x=5/6, we had sized the cluster to allow for multiple x= values to be on a block -- and you load MORE than fits; that is when the problems come into play. (so it was a wrongly sized cluster that led to the deterioration because of the out of order arrival.  had the cluster been sized right -- it would not have deteriorated -- it might waste space though) 
 
 
 
Big tables or many reads?
Flavio Casetta, March     29, 2005 - 8:29 am UTC
 
 
Hi Tom,
I have got an application where there are some tables storing geographical data like cities, regions, states and so on. Primary keys are numbers generated by sequences and a typical "where" clause contains only equality operators against primary keys, like for instance "where region_id = :1 and location_id = :2 ". 
This kind of data doesn't change over time; indeed, it has been loaded once for good, therefore the correct size of the hash key could be easily calculated. These tables aren't big -- say the biggest will contain 20,000 rows, while the smallest is only 50 rows -- but they are likely to be queried at all times.
Are they a candidate for some sort of multi table cluster because, in the end, the fewer blocks you read, the better -- or is it a completely useless exercise because it would be much better to cache directly the smaller ones and let an ordinary unique index scan do the rest of the job?
Thanks and bye,
Flavio
PS: great books, I love them! 
 
March     29, 2005 - 10:24 am UTC 
 
 
are you joining these tables and do you query them by the same key?  Doesn't look that way.
for example, 
emp( ..., DEPTNO )
dept( DEPTNO, .... )
create them in a multi-table cluster by DEPTNO and 
from emp, dept where emp.deptno = dept.deptno and dept.deptno = :x;
might benefit. 
 
 
 
Yes, tables are joined by the same keys
A reader, March     29, 2005 - 10:43 am UTC
 
 
Yes Tom,
I forgot to include the joins...
region_id is on the "provinces" table and is a foreign key to the "regions" table. Province_id is on the "locations" table and is a foreign key to the "provinces" table. 
This is a snowflake sort of thing.
So, when I query the tables there are always the joins between the tables in the where clause.
Bye,
Flavio 
 
March     29, 2005 - 11:07 am UTC 
 
 
right but they don't use the same keys do they?
let's see a sample query (small) 
 
 
 
here is the actual query 
Flavio Casetta, March     29, 2005 - 11:35 am UTC
 
 
It's an HTMLDB based application query.
select i.item, i.ean, i.id, i.unit, i.scale, i.category_name, l.net_price - l.discount price, l.net_price, l.last_date, l.store_name, l.store_group_id, 
l.location_id, s.province_id, p.region_id
from items_list i, last_prices l, locations s, provinces p
where i.id = l.item_id
and l.location_id = s.id
and s.province_id = p.id
and (l.store_id = :P120_STORE_ID or :P120_STORE_ID = 0)
and (l.location_id = :P120_LOCATION_ID or :P120_LOCATION_ID = 0)
and (s.province_id = :P120_PROVINCE_ID or :P120_PROVINCE_ID = 0)
and (p.region_id = :P120_REGION_ID or :P120_REGION_ID = 0)
and (l.store_group_id = :P120_STORE_GROUP_ID or :P120_STORE_GROUP_ID = 0)
and (i.category_id  = :P120_CATEGORY_ID or :P120_CATEGORY_ID = 0)
 
 
March     29, 2005 - 12:27 pm UTC 
 
 
"I" and "L" go together by ID/ITEM_ID
but "L" and "S" by location_id/id
and "S" and "P" by province_id/id
I don't see how you would be able to cluster them all together since they join on different keys.
If you cluster "I" and "L" by ID/ITEM_ID, then "S" could not be clustered with "L", since "L" would have to be clustered by location_id...
and so on.
that "or :Pxxx_yyyy_zzz = 0" troubles me, what is that for? 
 
 
 
explanation
Flavio Casetta, March     29, 2005 - 4:45 pm UTC
 
 
Tom,
It means any region or any location and so on.
When I first wrote the clause, instead of "OR :Pxxx_yyy_zzz = 0" there was "OR :Pxxx_yyy_zzz is null", because the logic was that a null value should mean any value. 
Then I found out that HTMLDB is a bit naive in the way it represents null values in drop down lists as a null value is actually represented by the string '%null%' if you don't specify something else, but only after submitting the page.
So you end up with a query that needs to check if a value is null or if it is equal to '%null%'. That's why I preferred to use my own "null" value and I opted for 0. 
Back to clusters, what you say is also my understanding, but  do you see any advantage in clustering together just the two biggest tables (locations and provinces)?
The other two tables are not eligible I guess, one is a view containing OLAP functions.
Thanks!
Flavio 
 
March     29, 2005 - 7:46 pm UTC 
 
 
I would construct the query based on the inputs then.  use a "plsql function returning a query" -- build the query with the where clauses you need.
as for the clustering, do you have a physical IO issue now?  The cluster would possibly help decrease that.  
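A minimal sketch of the "plsql function returning a query" idea, shown for just two of the parameters (the names come from the query above; the dead "(1 = 1 or :bind is null)" predicate is one way to keep the bind count constant so every variant stays shareable -- an assumption about implementation style, not the only way to do it):

```sql
create or replace function get_prices_query
( p_store_id    in number,
  p_location_id in number
) return varchar2
as
    l_query varchar2(4000) := 'select * from last_prices l where 1 = 1';
begin
    -- add a real predicate only when the parameter is actually constrained
    if p_store_id != 0 then
        l_query := l_query || ' and l.store_id = :store_id';
    else
        l_query := l_query || ' and (1 = 1 or :store_id is null)';
    end if;

    if p_location_id != 0 then
        l_query := l_query || ' and l.location_id = :location_id';
    else
        l_query := l_query || ' and (1 = 1 or :location_id is null)';
    end if;

    return l_query;
end;
/
```

Each combination of inputs produces one of a small, finite set of statements, each with the tightest possible plan, instead of one statement whose plan must cover every case.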
 
 
 
and the 11th commandment ?
Flavio, March     30, 2005 - 3:38 am UTC
 
 
Tom,
if I understand correctly, you suggest fine-tuning the WHERE clause, omitting some ORs, instead of having a single sql statement using bind variables.
Isn't that a breach of the 11th commandment????
"Don't clutter up the shared pool with unrecyclable sql statements" (Kyte, 17, 11)
;-)
Now, seriously, what is the exact risk behind using that kind of boolean logic?
Currently there are no I/O issues, but, as you teach, the sooner you tackle possible bottlenecks, the better...
Bye,
Flavio 
 
March     30, 2005 - 7:20 am UTC 
 
 
they are reusable, you will have a very finite set of fully shareable cursors with the best plans possible.
 
 
 
 
tradeoffs
Flavio Casetta, March     30, 2005 - 8:05 am UTC
 
 
that's certainly true Tom, I made a comparison between the execution plans and what you say is perfectly reasonable.
However, this leads me to ask how, and whether, you normally weigh the tradeoffs between higher performance and code readability.
Are there situations where you privilege the latter or do you always favor performance, even if there is only a slight difference?
In other words, where is the turning point, if any?
Bye,
Flavio 
 
 
thanks!
Flavio Casetta, March     30, 2005 - 3:56 pm UTC
 
 
I'll modify the query type then and see the results.
Thanks for your time and patience!
Bye,
Flavio 
 
 
Pls. clarify
A reader, July      12, 2005 - 11:26 pm UTC
 
 
Tom,
At the start of the discussion you mention that a primary key lookup with a hash cluster will require only 1 I/O. However, subsequently you mention that 50 could be in chained blocks also. How can it be possible if that column is the primary key?
Please clarify 
 
July      13, 2005 - 10:59 am UTC 
 
 
"50"?
not sure what you mean 
 
 
 
RE : Hash Cluster
A reader, July      13, 2005 - 4:52 pm UTC
 
 
Tom,
Sorry for not being very clear. The "50" in my question refers to the value that we look up. You mention that we need to keep looking in the chained list (once the data block address of the starting block is determined using the hashing algorithm) because there is a possibility the value of 50 could also be in another block in the same list of blocks. But the initial thread says these are primary keys. How can 50 be in another block in the chain?
 
 
July      13, 2005 - 4:59 pm UTC 
 
 
Yes, if your hash cluster is sized improperly and too many rows hash to the same block - then Oracle will chain the blocks together.
However, you can have the same issue with migrated rows.
A properly sized hash cluster for a lookup table however will give you pretty much a SINGLE IO for a keyed lookup
as compared to an index which will give you 3 or so IO's in the index and 1 IO against the table. 
 
 
 
Howard J. Rogers, July      13, 2005 - 9:29 pm UTC
 
 
I might just add that in 10g we get a SORTED hash cluster, so that example of Tom's to which you refer (I think it was from 2003, so not surprising if Tom has forgotten about it!) would not have to take place, even if the hash cluster was not clustered on the primary key.
If there were a number of rows with values 50, 55 and 60, then we have to search all possible hash buckets in the chain, even for the 50 and 55 values, because they might be hiding anywhere in the chain. But if you sort the values, then all the 50s would be together, all the 55s would be together and so forth... at which point, as soon as you find your first 55, you know there cannot be any more 50s, and you can stop searching.
I like sorted hash clusters a lot!
The point from Tom's original example was that in an unsorted hash cluster, you know that 50, 55, 60, 65 and so on are all going to be found in the one hash bucket (and that 51, 56, 61, 66 would all be in another hash bucket). But you don't know where in the bucket they are, and so you search until you find the one you want. I can't recall if the optimiser understood that the hash key was a primary key, and therefore having found one value of 50 it could assume there would be no others, but I don't think that's the main point even so: in the absence of a sort within the cluster, you cannot know that 50 will be in the first part of the particular hash bucket, and therefore *in principle*, hash collisions give rise to extra I/O. 
 
 
July      14, 2005 - 10:23 am UTC 
 
 
sorted hash clusters are good for a fifo queue, but they break down seriously fast if the data arrives out of order.  Good therefore for stock quotes in hindsight (I'd set the SIZE of the data associated with a key to be a block, so you would get a block per key in that case) 
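For reference, a 10g sorted hash cluster along those lines might look like this (the names, key counts and sizes are illustrative; a numeric hash key is assumed so HASH IS can be used):

```sql
create cluster stock_ticks_cluster
( symbol_id number,
  tick_ts   timestamp sort   -- rows kept sorted by timestamp within each key
)
hashkeys 500                 -- expected number of symbols
hash is symbol_id
size 8192                    -- roughly a block of tick data per symbol
single table
/

create table stock_ticks
( symbol_id number,
  tick_ts   timestamp,
  price     number
)
cluster stock_ticks_cluster ( symbol_id, tick_ts )
/
```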
 
 
 
view on cluster segments
reader, July      27, 2005 - 1:26 pm UTC
 
 
(1) Can one create a view on cluster segments? any restrictions?
(2) can one implement FGA control on cluster tables?
Thanks. 
 
July      27, 2005 - 2:46 pm UTC 
 
 
one creates views of tables (and other views)
a table can be in a cluster.
a cluster is not queryable directly, you need to put a table in it.
So, it doesn't really make sense to create a view of a cluster segment (it is not a table), but of tables in clusters -- sure, your database's data dictionary is chock full of examples of that!
you can implement FGA (fine grained auditing) on tables, yes.  Even if you meant FGAC (fine grained access control), you can do that on tables and views. 
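For instance (a sketch; the DEPT table and policy name are illustrative, and DEPT is assumed to live in a cluster):

```sql
-- The view and the FGA policy are defined against the TABLE,
-- not against the cluster segment the table happens to live in.
create or replace view dept_v as
  select deptno, dname from dept;

begin
  dbms_fga.add_policy
  ( object_schema   => user,
    object_name     => 'DEPT',
    policy_name     => 'DEPT_AUDIT',
    audit_condition => 'deptno = 10' );
end;
/
```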
 
 
 
Performance 
Anil, October   17, 2005 - 6:45 pm UTC
 
 
Hi tom,
If we create a cluster having m tables (each with "key1" as primary key) on a cluster key, say "key1",
then selects on these will be faster than joining all these tables. In that case:
1. will insertion/update be slow?
2. might LIO's be high, depending on the query you fire?
3. if it's an OLTP env, will there be any contention for latches in memory while doing select/dml operations?
Thanks
Anil
 
 
October   18, 2005 - 8:25 am UTC 
 
 
the select might be faster - what you have done is make it so that the related data is stored together - so one physical IO might get all of the data you need, whereas if the data were stored all over the place - it might take many more physical IO's to get the data together.
1) modifications are penalized, to the possible benefit of data retrieval, yes.
2) no, you would expect LIO's to stay the same or, more likely, DECREASE.  The data is stored on a single block (or a small set of blocks).  We don't have to go to 50 blocks to get 50 rows - we go to one block to get the 50 rows.
3) the only answer that can be given is one of three.  Yes, No, Maybe - eg: it depends.
if a single session tends to work on "KEY1=X" by itself - you'll likely reduce contention (as all of the key1=x data is together - and key1=y isn't on the same block in general - you've segregated the data out)
but that is not assured. 
 
 
 
performance
anil, October   18, 2005 - 10:27 am UTC
 
 
Hi tom,
Thankx for the quick turnaround.
Regarding the LIO's - if the queries being fired are somewhat like this:
select * from a;
select * from b;
select * from c;
etc
and a, b and c are part of a cluster - then in this case won't the LIO's be higher, as the data is spread across multiple blocks?
Thankx,
ANIL
 
 
October   18, 2005 - 10:44 am UTC 
 
 
no, that is not true.
select * from a where key1 = 'x'
select * from b where key1 = 'x'
select * from c where key1 = 'x'
could incur less LIO/PIO if a, b, c are clustered on the key1 column, since all of the 'x' values would be near each other, clustered.
just "select * from a", however, will incur the same LIO/PIO regardless of clustering - you are getting ALL of the rows. 
 
 
 
 
 
performance
Anil, October   18, 2005 - 2:21 pm UTC
 
 
tom,
Please help me investigate why the LIO's and some wait events are on the higher side (like db file sequential read, log file sync and latch free).
STATSPACK report for
DB Name         DB Id    Instance     Inst Num Release     OPS Host
------------ ----------- ------------ -------- ----------- --- ------------
MRDP           354944511 MRDP                1 8.1.7.4.0   NO  mrdutpc1pr
                Snap Id     Snap Time      Sessions
                ------- ------------------ --------
 Begin Snap:         58 14-Oct-05 10:46:08      512
   End Snap:         59 14-Oct-05 11:49:20      512
    Elapsed:                  63.20 (mins)
Cache Sizes
~~~~~~~~~~~
           db_block_buffers:     180000          log_buffer:    5242880
              db_block_size:       8192    shared_pool_size:  350000000
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:             84,660.44              3,760.70
              Logical reads:              4,542.45                201.78
              Block changes:                476.59                 21.17
             Physical reads:                 69.20                  3.07
            Physical writes:                 15.38                  0.68
                 User calls:                543.61                 24.15
                     Parses:                218.49                  9.71
                Hard parses:                 16.06                  0.71
                      Sorts:                  4.40                  0.20
                     Logons:                  1.35                  0.06
                   Executes:                359.59                 15.97
               Transactions:                 22.51
  % Blocks changed per Read:   10.49    Recursive Call %:   50.75
 Rollback per transaction %:   11.52       Rows per Sort:   11.23
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.98       Redo NoWait %:  100.00
            Buffer  Hit   %:   98.48    In-memory Sort %:   99.99
            Library Hit   %:   97.84        Soft Parse %:   92.65
         Execute to Parse %:   39.24         Latch Hit %:   99.11
Parse CPU to Parse Elapsd %:   80.91     % Non-Parse CPU:   88.36
 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   39.64   83.23
    % SQL with executions>1:    7.69    5.97
  % Memory for SQL w/exec>1:   10.83    9.05
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                             Wait     % Total
Event                                               Waits  Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
db file sequential read                           193,827      165,114   67.98
log file sync                                      87,051       22,982    9.46
SQL*Net break/reset to client                     141,005       16,198    6.67
latch free                                        155,806       11,999    4.94
log file parallel write                            74,366        9,773    4.02
          -------------------------------------------------------------
Wait Events for DB: MRDP  Instance: MRDP  Snaps: 58 -59
-> cs - centisecond -  100th of a second
-> ms - millisecond - 1000th of a second
-> ordered by wait time desc, waits desc (idle events last)
                                                                    Avg
                                                     Total Wait    wait  Waits
Event                               Waits   Timeouts  Time (cs)    (ms)   /txn
---------------------------- ------------ ---------- ----------- ------ ------
db file sequential read           193,827          0     165,114      9    2.3
log file sync                      87,051          2      22,982      3    1.0
SQL*Net break/reset to clien      141,005          9      16,198      1    1.7
latch free                        155,806     86,079      11,999      1    1.8
log file parallel write            74,366          0       9,773      1    0.9
SQL*Net more data to client       260,636         23       7,958      0    3.1
db file scattered read              7,732          0       3,469      4    0.1
db file parallel write              6,948          0       2,839      4    0.1
log file sequential read            4,068          0         995      2    0.0
buffer busy waits                   3,014          0         626      2    0.0
enqueue                             2,047          0         356      2    0.0
control file parallel write         1,368          0         341      2    0.0
direct path read                    1,927          0          64      0    0.0
direct path write (lob)               440          0          59      1    0.0
file open                           3,636          0          50      0    0.0
direct path read (lob)                333          0          28      1    0.0
LGWR wait for redo copy               918          0           8      0    0.0
log file switch completion              3          0           7     23    0.0
buffer deadlock                        29         29           3      1    0.0
direct path write                     560          0           2      0    0.0
control file sequential read          349          0           2      0    0.0
file identify                          14          0           1      1    0.0
library cache pin                      12          0           1      1    0.0
refresh controlfile command            10          0           0      0    0.0
log file single write                   4          0           0      0    0.0
library cache load lock                 2          0           0      0    0.0
virtual circuit status          1,272,864         83   6,374,114     50   14.9
SQL*Net message from client     1,594,984          0      71,694      0   18.7
SQL*Net more data from clien       16,944          0      15,379      9    0.2
SQL*Net message to client       1,594,984          0       1,067      0   18.7
  
thanks a lot,
Anil
 
 
 
Single-table hash clusters
A reader, January   12, 2006 - 5:39 am UTC
 
 
Hi Tom,
  I've got a question regarding single-table hash clusters.  If your table primary key is a varchar2 type column, Oracle doesn't allow you to specify it in the 'HASH IS' part, since that only takes a NUMBER type.  Is there any negative impact in this case of using the internal hashing function?  What is the reasoning behind only allowing you to specify a NUMBER type for the Hash function?
Thanks 
 
January   12, 2006 - 11:01 am UTC 
 
 
ops$tkyte@ORA10GR2> create cluster Hash_Cluster
  2  ( id varchar2(200) )
  3  SINGLE TABLE
  4  hashkeys 75000   -- number of distinct lookups we expect over time
  5  size 125         -- size of data associated with key
  6  /
Cluster created.
not sure what you are trying to do exactly - have an example? 
 
 
 
 
Single table hash cluster - more info
A reader, January   16, 2006 - 4:15 am UTC
 
 
Hi Tom,
  This is what I'm trying to do:
SQL> create cluster hash_cluster
  2  (id varchar2(2000)) 
  3  single table
  4  hashkeys 200
  5  hash is id 
  6  size 125;
create cluster hash_cluster
*
ERROR at line 1:
ORA-02456: The HASH IS column specification must be NUMBER(*,0)
  I was trying to use the cluster key as the hash function.  But Oracle doesn't like it since it is a varchar2 column.   
 
 
January   16, 2006 - 9:37 am UTC 
 
 
Right - you need to apply some function to your string resulting in some number, eg:
ops$tkyte@ORA10GR2> create cluster hash_cluster
  2  (id varchar2(2000))
  3  single table
  4  hashkeys 200
  5  hash is TO_NUMBER(id)
  6  size 125;
Cluster created.
It need not be TO_NUMBER, just some function that takes your inputs and hashes them - returning a number.
but in general, you don't even need the "hash is", we'll hash your string into a number for you. 
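A sketch of that, reusing the cluster definition from the question with the HASH IS clause simply dropped (the table name is invented):

```sql
-- Oracle's internal hash function handles the varchar2 key for us.
create cluster hash_cluster
( id varchar2(2000) )
single table
hashkeys 200
size 125
/

create table lookup
( id  varchar2(2000) primary key,
  val varchar2(100)
)
cluster hash_cluster ( id )
/
```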
 
 
 
 
Thanks for the info.
A reader, January   16, 2006 - 11:00 am UTC
 
 
Hi Tom,
  Thanks for the quick response.  I do realise that if you don't specify the 'HASH IS' clause, Oracle uses its own hashing function. 
  My question was how different will this be in terms of performance, if any?  The manual says 
'If no HASH IS clause is specified, the database uses an internal hash function. If the cluster key is already a unique identifier that is uniformly distributed over its range, you can bypass the internal hash function and specify the cluster key as the hash value, as is the case in the preceding example. You can also use the HASH IS clause to specify a user-defined hash function.'
  I was just curious if there would be any performance difference in using the different hash functions (internal, user-defined and just using the cluster key)
Regards 
 
January   16, 2006 - 12:49 pm UTC 
 
 
depends - can you efficiently hash your string, more so than we can?  If so, go ahead and write your own hash - but just remember, it needs to return a NUMBER.
You can "hash is" your string, with the rule that your hash function returns a NUMBER. 
 
 
 
How can I set the right size for a Cluster (hash/btree)???
Jairo Ojeda, February  10, 2006 - 1:01 pm UTC
 
 
This is the first time I will use cluster tables and I don't know how to set the right size,
whether using a single table hash cluster, a multi table hash cluster or a btree cluster.
Can you tell me how to calculate that size?
Thanks,
 
 
February  10, 2006 - 1:27 pm UTC 
 
 
do you have access to my books where I cover this in great detail?
the size is simply the sum of the bytes associated with each key.
If you hash by "id", what amount of data is associated with each id?  That is the size. 
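A rough way to estimate SIZE from an existing heap copy of the data (a sketch; the table and column names are hypothetical, and statistics are assumed gathered):

```sql
-- average row length * average rows per key value = bytes per key;
-- pad the result upward for growth and per-row overhead.
select round( t.avg_row_len * t.num_rows / c.num_distinct ) suggested_size
  from user_tables t, user_tab_columns c
 where t.table_name  = 'MY_HEAP_COPY'
   and c.table_name  = t.table_name
   and c.column_name = 'ID';
```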
 
 
 
Cluster collision
Jairo Ojeda, February  14, 2006 - 10:10 am UTC
 
 
Thanks for your comments about cluster size! But now I want to know: how do I check whether I am getting cluster collisions? 
 
February  14, 2006 - 11:12 am UTC 
 
 
read out each key value (assuming the hash key is unique).
I would expect one LIO per retrieval.  For example, this is a quick demo I've used:
create cluster Hash_Cluster
( id number )
SINGLE TABLE
hashkeys 75000   -- number of distinct lookups we expect over time
size 125         -- size of data associated with key
/
pause
clear screen
create table heap_lookup
( OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID,
  DATA_OBJECT_ID, OBJECT_TYPE, CREATED, LAST_DDL_TIME,
  TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY,
  primary key(object_id)
)
as
select * from all_objects;
create table hash_lookup
( OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID,
  DATA_OBJECT_ID, OBJECT_TYPE, CREATED, LAST_DDL_TIME,
  TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY,
  primary key(object_id)
)
cluster hash_cluster(object_id)
as
select * from heap_lookup;
pause
clear screen
begin
    dbms_stats.gather_table_stats
    ( user, 'HEAP_LOOKUP', cascade=>true );
    dbms_stats.gather_table_stats
    ( user, 'HASH_LOOKUP', cascade=>true );
end;
/
pause
clear screen
declare
    type array is table of number;
    l_object_ids array;
    l_rec        heap_lookup%rowtype;
begin
    select object_id bulk collect into l_object_ids
      from heap_lookup;
    execute immediate q'|
        alter session set events '10046 trace name context forever, level 12'
    |';
    for k in 1 .. l_object_ids.count
    loop
        select * into l_rec
          from heap_lookup
         where object_id = l_object_ids(k);
        select * into l_rec
          from hash_lookup
         where object_id = l_object_ids(k);
    end loop;
end;
/
In the tkprof, I expect something like this:
SELECT * FROM HEAP_LOOKUP WHERE OBJECT_ID = :B1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    99115      7.96       7.96          0     148671          0       49557
Rows     Row Source Operation
-------  ---------------------------------------------------
  49557  TABLE ACCESS BY INDEX ROWID HEAP_LOOKUP (cr=148671 pr=0 pw=0 time=1798682 us)
  49557   INDEX UNIQUE SCAN SYS_C006949 (cr=99114 pr=0 pw=0 time=959911 us)(object id 57938)
********************************************************************************
SELECT * FROM HASH_LOOKUP WHERE OBJECT_ID = :B1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    99115      7.88       7.93          0      49559          0       49557
Rows     Row Source Operation
-------  ---------------------------------------------------
  49557  TABLE ACCESS HASH HASH_LOOKUP (cr=49557 pr=0 pw=0 time=1352434 us)
the heap table did the IO's via the INDEX and then into the table - the number of IO's against the table = the number of rows retrieved.
the hash clustered table did the IO's directly - number of IO's against the table exactly and precisely equal the number of rows retrieved.  "best case".
You'll always get "collisions" (they are expected, we get more than one row per block), you want to know if it is taking more IO than necessary to get your data.
In this case, the answer is "no", but if we lowered the cluster size - eventually the answer would be "yes". 
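One cheap way to spot that chaining in the demo above (a sketch; it assumes the HASH_LOOKUP table just created):

```sql
-- If a key's rows have spilled across blocks, the count of distinct
-- blocks per key exceeds 1; many such keys suggest SIZE or HASHKEYS
-- was set too low for the data.
select object_id,
       count(distinct dbms_rowid.rowid_block_number(rowid)) blocks_per_key
  from hash_lookup
 group by object_id
having count(distinct dbms_rowid.rowid_block_number(rowid)) > 1;
```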
 
 
 
Cluster Table
Arindam Mukherjee, August    18, 2006 - 9:08 am UTC
 
 
Respected Mr. Tom,
I have read the following lines from Oracle 9i Database Concepts - Chapter 10 ten times and, sorry, I could not understand the following two (2) sentences. If possible, could you kindly explain them to me with an example?
Sentence -1
--------------
Storage parameters cannot be specified when a clustered table is created or altered. 
Sentence -2
--------------
The storage parameters set for the cluster always control the storage of all tables in the cluster.
It's true; I am not very familiar with cluster tables.
 
 
August    18, 2006 - 10:25 am UTC 
 
 
when you create a CLUSTER - you are creating a segment, the segment (the storage) is the CLUSTER.
when you create a table in a cluster, you do not create a segment, you are just creating a table.  the segment (the storage) is the cluster segment, not the table.
So, when you create a table in a cluster (also known as a clustered table), you cannot specify storage parameters like initial, next and so on because the TABLE doesn't control the storage, the cluster itself does.  And the table just resides in the cluster segment.
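A quick illustration of sentence 1 (a sketch; it assumes a cluster like the emp_dept_cluster created in the writeup that follows already exists):

```sql
-- Specifying segment attributes on a clustered table fails, because
-- the cluster segment, not the table, owns the storage.
create table dept
( deptno number(2),
  dname  varchar2(14)
)
cluster emp_dept_cluster ( deptno )
pctfree 20    -- raises an "illegal option" error for a clustered table
/
```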
Here is a short writeup from Expert Oracle database Architecture on this:
<quote>
So, let's look at how we might go about creating a clustered object. Creating a cluster of tables in the object is straightforward. The definition of the storage of the object (PCTFREE, PCTUSED, INITIAL, and so on) is associated with the CLUSTER, not the tables. This makes sense since there will be many tables in the cluster, and they will be on the same block. Having different PCTFREEs would not make sense. Therefore, a CREATE CLUSTER looks a lot like a CREATE TABLE with a small number of columns (just the cluster key columns):
ops$tkyte@ORA10GR1> create cluster emp_dept_cluster
  2  ( deptno number(2) )
  3  size 1024
  4  /
Cluster created.
Here, we have created an index cluster (the other type being a hash cluster, which we'll look at in the next section). The clustering column for this cluster will be the DEPTNO column. The columns in the tables do not have to be called DEPTNO, but they must be NUMBER(2), to match this definition. We have, on the cluster definition, a SIZE 1024 option. This is used to tell Oracle that we expect about 1,024 bytes of data to be associated with each cluster key value. Oracle will use that to compute the maximum number of cluster keys that could fit per block. Given that we have an 8KB block size, Oracle will fit up to seven cluster keys (but maybe less if the data is larger than expected) per database block. That is, the data for departments 10, 20, 30, 40, 50, 60, and 70 would tend to go onto one block, and as soon as we insert department 80, a new block will be used. That does not mean that the data is stored in a sorted manner; it just means that if we inserted the departments in that order, they would naturally tend to be put together. If we inserted the departments in the order 10, 80, 20, 30, 40, 50, 60, and then 70, the final department, 70, would tend to be on the newly added block. As we'll see below, both the size of the data and the order in which the data is inserted will affect the number of keys we can store per block.
The SIZE parameter therefore controls the maximum number of cluster keys per block. It is the single largest influence on the space utilization of our cluster. Set the size too high, and we'll get very few keys per block and we'll use more space than we need. Set the size too low, and we'll get excessive chaining of data, which offsets the purpose of the cluster to store all of the data together on a single block. It is the most important parameter for a cluster.
Now for the cluster index on our cluster. We need to index the cluster before we can put data in it. We could create tables in the cluster right now, but we're going to create and populate the tables simultaneously, and we need a cluster index before we can have any data. The cluster index's job is to take a cluster key value and return the block address of the block that contains that key. It is a primary key in effect, where each cluster key value points to a single block in the cluster itself. So, when we ask for the data in department 10, Oracle will read the cluster key, determine the block address for that, and then read the data. The cluster key index is created as follows:
ops$tkyte@ORA10GR1> create index emp_dept_cluster_idx
  2  on cluster emp_dept_cluster
  3  /
Index created.
It can have all of the normal storage parameters of an index and can be stored in another tablespace. It is just a regular index, so it can be on multiple columns; it just happens to index into a cluster and can also include an entry for a completely null value (see Chapter 11 for the reason why this is interesting to note). Note that we do not specify a list of columns in this CREATE INDEX statement—that is derived from the CLUSTER definition itself. Now we are ready to create our tables in the cluster:
ops$tkyte@ORA10GR1> create table dept
  2  ( deptno number(2) primary key,
  3    dname  varchar2(14),
  4    loc    varchar2(13)
  5  )
  6  cluster emp_dept_cluster(deptno)
  7  /
Table created.
 
ops$tkyte@ORA10GR1> create table emp
  2  ( empno    number primary key,
  3    ename    varchar2(10),
  4    job      varchar2(9),
  5    mgr      number,
  6    hiredate date,
  7    sal      number,
  8    comm     number,
  9    deptno number(2) references dept(deptno)
 10  )
 11  cluster emp_dept_cluster(deptno)
 12  /
Table created.
Here, the only difference from a "normal" table is that we used the CLUSTER keyword and told Oracle which column of the base table will map to the cluster key in the cluster itself. Remember, the cluster is the segment here, therefore this table will never have segment attributes such as TABLESPACE, PCTFREE, and so on - they are attributes of the cluster segment, not the table we just created. 
</quote> 
 
 
 
 
IOT with sys_guid() values
A reader, November  24, 2006 - 7:00 am UTC
 
 
Hi Tom - 
Can you explain how an IOT with a primary key populated by sys_guid() would work?
Just not sure how the IOT would determine what order to put RAW values like 6FB56BAF0F5140E184C5E9FDA101389F, B3457FA7049F4C41A48A6E250982CEF6, 8DC7D39212FD4BA390A53A0925D47D7D, . . . 
How does it determine if 6FB56BAF0F5140E184C5E9FDA101389F should be stored before or after the value B3457FA7049F4C41A48A6E250982CEF6 ???
Thanks for your advice 
 
November  24, 2006 - 6:41 pm UTC 
 
 
the same as an IOT populated by anything.
the rows are stored in sorted order by primary key.  
So, is 6F > B3 or is 6F < B3?  It just compares bytes (that is the rule for RAW). 
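A minimal sketch of such a table (names invented):

```sql
-- The IOT keeps rows ordered by the byte values of the RAW(16) key;
-- sys_guid() values will land essentially at random positions,
-- which means frequent index block splits on insert.
create table guid_iot
( id   raw(16) default sys_guid() primary key,
  data varchar2(30)
)
organization index
/
```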
 
 
 
Too many consistent gets in cluster access
Lino Bort, March     21, 2007 - 12:12 pm UTC
 
 
Hi Tom,
I am considering the use of a cluster to store together data from 5 tables than are joined frequently.
I have set up the following test:
create cluster simple_cluster (id number) size 128;
create index idx_simple_cluster on cluster simple_cluster;
create table t1 (id number primary key, a varchar2(40)) cluster simple_cluster (id);
create table t2 (id number primary key, b date) cluster simple_cluster (id);
create table ht1 (id number primary key, a varchar2(40));
create table ht2 (id number primary key, b date);
insert
when (1=1) then
  into t1 values (id,a)
when (1=1) then
  into t2 values (id,b)
select object_id id, object_name a, created b from all_objects where rownum <= 10000;
commit;
insert into ht1 select * from t1;
insert into ht2 select * from t2;
commit;
analyze cluster simple_cluster compute statistics;
analyze index idx_simple_cluster compute statistics;
analyze table t1 compute statistics for table for all indexes for all indexed columns;
analyze table t2  compute statistics for table for all indexes for all indexed columns;
analyze table ht1 compute statistics for table for all indexes for all indexed columns;
analyze table ht2 compute statistics for table for all indexes for all indexed columns;
analyze index idx_simple_cluster validate structure;
select lf_blks, br_blks, height, distinct_keys from index_stats;
   LF_BLKS    BR_BLKS     HEIGHT DISTINCT_KEYS
---------- ---------- ---------- -------------
        27          1          2         10000
select dbms_rowid.rowid_block_number(t1.rowid) t1_block,
       dbms_rowid.rowid_block_number(t2.rowid) t2_block
from t1, t2
where t1.id = t2.id
  and t1.id = 258;
  T1_BLOCK   T2_BLOCK
---------- ----------
     22085      22085
set autotrace traceonly
select *
from t1, t2
where t1.id = t2.id
  and t1.id = 258;
...
--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     1 |    38 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |             |     1 |    38 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T2          |     1 |    11 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SYS_C006814 |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T1          |     1 |    27 |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN         | SYS_C006813 |     1 |       |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
...
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
...
select *
from ht1, ht2
where ht1.id = ht2.id
  and ht1.id = 258;
...
--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     1 |    39 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |             |     1 |    39 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| HT2         |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | SYS_C006816 |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID| HT1         |     1 |    26 |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN         | SYS_C006815 |     1 |       |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
...
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
...
alter index idx_simple_cluster rebuild;
analyze index idx_simple_cluster validate structure;
set autotrace off
select lf_blks, br_blks, height, distinct_keys from index_stats;
   LF_BLKS    BR_BLKS     HEIGHT DISTINCT_KEYS
---------- ---------- ---------- -------------
        24          1          2         10000
set autotrace traceonly
select *
from t1, t2
where t1.id = t2.id
  and t1.id = 258;
/
...
skipped the first plan
...
--------------------------------------------------------------------------------------------
| Id  | Operation             | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                    |     1 |    38 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS         |                    |     1 |    38 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS CLUSTER| T2                 |     1 |    11 |     2   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN  | IDX_SIMPLE_CLUSTER |     1 |       |     1   (0)| 00:00:01 |
|   4 |   TABLE ACCESS CLUSTER| T1                 |     1 |    27 |     1   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN  | IDX_SIMPLE_CLUSTER |     1 |       |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
...
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
...
The cluster index has a height of 2. Therefore, with 3 blocks of I/O, in theory, we can get the data from both tables we are looking for. How is it possible that it takes 8 consistent gets?
Thanks in advance, 
 
March     21, 2007 - 12:25 pm UTC 
 
 
tkprof it.  with extended statistics, you'll see the IO's performed step by step. 
 
 
Statistics for cluster access path
Lino Bort, March     21, 2007 - 1:16 pm UTC
 
 
Hi Tom,
I have obtained the following statistics from a trace file:
STAT #1 id=1 cnt=1 pid=0 pos=1 obj=0 op='NESTED LOOPS  (cr=8 pr=0 pw=0 time=1440 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=50728 op='TABLE ACCESS CLUSTER T2 (cr=4 pr=0 pw=0 time=1362 us)'
STAT #1 id=3 cnt=1 pid=2 pos=1 obj=50725 op='INDEX UNIQUE SCAN IDX_SIMPLE_CLUSTER (cr=2 pr=0 pw=0 time=635 us)'
STAT #1 id=4 cnt=1 pid=1 pos=2 obj=50726 op='TABLE ACCESS CLUSTER T1 (cr=4 pr=0 pw=0 time=51 us)'
STAT #1 id=5 cnt=1 pid=4 pos=1 obj=50725 op='INDEX UNIQUE SCAN IDX_SIMPLE_CLUSTER (cr=2 pr=0 pw=0 time=26 us)'
Meaning that I performed 4 consistent gets for the cluster index and another 4 for the cluster access. But, actually, all the data for that cluster key is in one block?
Thanks 
 
March     21, 2007 - 7:19 pm UTC 
 
 
sure it is all in one block - but we got the block once to access T1 and again to access T2.
this is normal.
It put all of the rows for T1 onto as few blocks as possible.  Same for T2.  And it tries to put T1 and T2 together (to reduce physical IOs)
but we still have to do a "get" for T1 data and a "get" for T2 data.
Clustering made it so that physical IO is reduced (1 IO, get the data) and logical IO is reduced as much as possible. 
 
 
More tests in my cluster scenario
Lino Bort, March     22, 2007 - 7:02 am UTC
 
 
Hi Tom,
I researched a bit more into my test case and found something interesting.
I looked at the data distribution:
select table_name, blocks, empty_blocks,  avg_space
from user_tables
where table_name in ('T1','T2','HT1','HT2')
order by table_name;
TABLE_NAME                         BLOCKS EMPTY_BLOCKS  AVG_SPACE
------------------------------ ---------- ------------ ----------
HT1                                   370           14       1694
HT2                                   370           14       1694
T1                                    748           20       1016
T2                                    748           20       1016
select i.table_name, i.index_name, s.blocks
from user_segments s, user_indexes i
where (i.table_name in ('T1','T2','HT1','HT2') or i.index_name = 'IDX_SIMPLE_CLUSTER')
  and s.segment_name = i.index_name
order by 1,2;
TABLE_NAME                     INDEX_NAME                         BLOCKS
------------------------------ ------------------------------ ----------
HT1                            SYS_C006875                           104
HT2                            SYS_C006876                           104
SIMPLE_CLUSTER                 IDX_SIMPLE_CLUSTER                    104
T1                             SYS_C006873                            96
T2                             SYS_C006874                            96
So, the cluster is more or less the same size as the two heap tables together. Indeed, we have two indexes for the heap tables and three for the cluster (one primary key index for each table plus the cluster index itself).
Depending on the plan, a different index can be used. Therefore, we are using more space in the cluster approach.
I set up another test, based on chapter 7, p391 of your book "Effective Oracle by Design".
********************************************************************************
SELECT T1.ID, T1.A, T2.B
FROM
 T1, T2 WHERE T1.ID = T2.ID AND T1.ID = :B1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 387590     28.20      22.90          0          0          0           0
Fetch   387590     45.36      31.64        151    1275180          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   775181     73.57      54.55        151    1275180          0      100000
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 58     (recursive depth: 1)
Rows     Row Source Operation
-------  ---------------------------------------------------
 100000  NESTED LOOPS  (cr=1275180 pr=151 pw=0 time=40365241 us)
 100000   TABLE ACCESS CLUSTER T2 (cr=975180 pr=151 pw=0 time=21235320 us)
 100000    INDEX UNIQUE SCAN IDX_SIMPLE_CLUSTER (cr=775180 pr=22 pw=0 time=9269972 us)(object id 50725)
 100000   TABLE ACCESS CLUSTER T1 (cr=300000 pr=0 pw=0 time=7004082 us)
 100000    INDEX UNIQUE SCAN IDX_SIMPLE_CLUSTER (cr=200000 pr=0 pw=0 time=2838852 us)(object id 50725)
********************************************************************************
SELECT HT1.ID, HT1.A, HT2.B
FROM
 HT1, HT2 WHERE HT1.ID = HT2.ID AND HT1.ID = :B1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute 387590     29.00      23.69          0          0          0           0
Fetch   387590     46.24      31.98        130    1175180          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   775181     75.25      55.68        130    1175180          0      100000
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 58     (recursive depth: 1)
Rows     Row Source Operation
-------  ---------------------------------------------------
 100000  NESTED LOOPS  (cr=1175180 pr=130 pw=0 time=40937034 us)
 100000   TABLE ACCESS BY INDEX ROWID HT2 (cr=875180 pr=54 pw=0 time=21063360 us)
 100000    INDEX UNIQUE SCAN SYS_C006816 (cr=775180 pr=30 pw=0 time=9443417 us)(object id 50733)
 100000   TABLE ACCESS BY INDEX ROWID HT1 (cr=300000 pr=76 pw=0 time=7366452 us)
 100000    INDEX UNIQUE SCAN SYS_C006815 (cr=200000 pr=31 pw=0 time=3148368 us)(object id 50731)
As you can see, with my cluster approach I am doing more physical and logical I/O. With these results, I am discouraged from using clusters in our design. 
Thanks, 
 
 
Altering the cluster key of a record, what happens?
Adam, April     12, 2007 - 9:42 pm UTC
 
 
Hello Tom,
If I set up the following example:
create cluster scott.emp_dept_cluster
( deptno number(2) )
tablespace users
size 1024
/
Cluster created.
create index scott.emp_dept_cluster_idx
on cluster scott.emp_dept_cluster
tablespace users
/
Index created.
create table scott.emp
( empno    number primary key,
ename    varchar2(10),
job      varchar2(9),
mgr      number,
hiredate date,
sal      number,
comm     number,
deptno number(2) references scott.dept(deptno)
)
cluster scott.emp_dept_cluster(deptno)
/
Table created.
insert into scott.emp (empno,ename,job,mgr,hiredate,sal,comm     ,deptno) values (7788,'SCOTT','ANALYST',7566,'19-APR-87',3000,20);
1 row inserted.
commit;
update scott.emp set DEPTNO = 30 where ENAME = 'SCOTT';
1 row updated.
commit;
Can one conclude the record for employee Scott has been moved from the block with the cluster key id = 20 to the block with the cluster key id = 30?  
 
April     13, 2007 - 1:13 pm UTC 
 
 
it is more complex than that - it is similar to a migrated row.
the rowid will not change.
ops$tkyte%ORA9IR2> create cluster emp_dept_cluster
  2  ( deptno number(2) )
  3  tablespace users<b>
  4  size 8k -- to make a deptno have a block to itself in my 8k block db</b>
  5  /
Cluster created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create index emp_dept_cluster_idx
  2  on cluster emp_dept_cluster
  3  tablespace users
  4  /
Index created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create table emp
  2  ( empno    number primary key,
  3  ename    varchar2(10),
  4  job      varchar2(9),
  5  mgr      number,
  6  hiredate date,
  7  sal      number,
  8  comm     number,
  9  deptno number(2)
 10  )
 11  cluster emp_dept_cluster(deptno)
 12  /
Table created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into emp select * from scott.emp;
14 rows created.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2> select deptno, ename, rowid, dbms_rowid.rowid_block_number(rowid) from emp order by deptno;
    DEPTNO ENAME      ROWID              DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ---------- ------------------ ------------------------------------
        10 CLARK      AAAH68AAJAAAA+3AAA                                 4023
        10 KING       AAAH68AAJAAAA+3AAB                                 4023
        10 MILLER     AAAH68AAJAAAA+3AAC                                 4023
        20 SMITH      AAAH68AAJAAAA+4AAA                                 4024
        20 ADAMS      AAAH68AAJAAAA+4AAD                                 4024
        20 JONES      AAAH68AAJAAAA+4AAB                                 4024
        20 SCOTT      AAAH68AAJAAAA+4AAC                                 4024
        20 FORD       AAAH68AAJAAAA+4AAE                                 4024
        30 ALLEN      AAAH68AAJAAAA+0AAA                                 4020
        30 WARD       AAAH68AAJAAAA+0AAB                                 4020
        30 MARTIN     AAAH68AAJAAAA+0AAC                                 4020
        30 BLAKE      AAAH68AAJAAAA+0AAD                                 4020
        30 TURNER     AAAH68AAJAAAA+0AAE                                 4020
        30 JAMES      AAAH68AAJAAAA+0AAF                                 4020
14 rows selected.
ops$tkyte%ORA9IR2> update emp set DEPTNO = 30 where ENAME = 'SCOTT';
1 row updated.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2> select deptno, ename, rowid, dbms_rowid.rowid_block_number(rowid) from emp order by deptno;
    DEPTNO ENAME      ROWID              DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ---------- ------------------ ------------------------------------
        10 CLARK      AAAH68AAJAAAA+3AAA                                 4023
        10 KING       AAAH68AAJAAAA+3AAB                                 4023
        10 MILLER     AAAH68AAJAAAA+3AAC                                 4023
        20 SMITH      AAAH68AAJAAAA+4AAA                                 4024
        20 FORD       AAAH68AAJAAAA+4AAE                                 4024
        20 ADAMS      AAAH68AAJAAAA+4AAD                                 4024
        20 JONES      AAAH68AAJAAAA+4AAB                                 4024
        30 ALLEN      AAAH68AAJAAAA+0AAA                                 4020
        30 WARD       AAAH68AAJAAAA+0AAB                                 4020
        30 MARTIN     AAAH68AAJAAAA+0AAC                                 4020
        30 BLAKE      AAAH68AAJAAAA+0AAD                                 4020
        30 TURNER     AAAH68AAJAAAA+0AAE                                 4020
        30 JAMES      AAAH68AAJAAAA+0AAF                                 4020
        30 SCOTT      AAAH68AAJAAAA+4AAC                                 4024
14 rows selected.
So, SCOTT is still on block 4024 (sort of) - the row has to keep its rowid, since an index on ENAME, for example, would contain that rowid.
We can see the row migrated - and left the forwarding address behind...
you can analyze the table to see that the chained row count goes up - or, if we add an index on ENAME and query:
select * from emp where ename= 'SCOTT';
before and after and measure 
a) logical IO's
b) the table fetch continued row statistic
we'd see the IO goes up by one as does the continued row statistic:
ops$tkyte%ORA9IR2> create index ename_idx on emp(ename);
Index created.
ops$tkyte%ORA9IR2> insert into emp select * from scott.emp;
14 rows created.
ops$tkyte%ORA9IR2> select * from emp where ENAME='SCOTT';
     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7788 SCOTT      ANALYST         7566 09-DEC-82       3000                    20
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> @mystat continued
ops$tkyte%ORA9IR2> set echo off
NAME                        VALUE
---------------------- ----------
table fetch continued           5
row
ops$tkyte%ORA9IR2> set autotrace traceonly statistics
ops$tkyte%ORA9IR2> select * from emp where ENAME='SCOTT';
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        805  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
ops$tkyte%ORA9IR2> set autotrace off
ops$tkyte%ORA9IR2> @mystat2
ops$tkyte%ORA9IR2> set echo off
NAME                        VALUE DIFF
---------------------- ---------- ------------------
table fetch continued           5                0
row
ops$tkyte%ORA9IR2> update emp set DEPTNO = 30 where ENAME = 'SCOTT';
1 row updated.
ops$tkyte%ORA9IR2> commit;
Commit complete.
ops$tkyte%ORA9IR2> @mystat continued
ops$tkyte%ORA9IR2> set echo off
NAME                        VALUE
---------------------- ----------
table fetch continued           5
row
ops$tkyte%ORA9IR2> set autotrace traceonly statistics
ops$tkyte%ORA9IR2> select * from emp where ENAME='SCOTT';
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        805  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
ops$tkyte%ORA9IR2> set autotrace off
ops$tkyte%ORA9IR2> @mystat2
ops$tkyte%ORA9IR2> set echo off
NAME                        VALUE DIFF
---------------------- ---------- ------------------
table fetch continued           6                1
row
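The @mystat and @mystat2 scripts used above are not reproduced in this thread. A minimal sketch of what they do - assuming they query v$mystat/v$statname as the output suggests - might look like:

```sql
-- mystat.sql: snapshot a session statistic by (partial) name
-- usage: @mystat continued
set verify off
column value new_val V
define S="&1"
select a.name, b.value
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
   and lower(a.name) like '%' || lower('&S') || '%';

-- mystat2.sql: show the same statistic plus the change since @mystat ran,
-- using the value captured above in the substitution variable V
select a.name, b.value, to_char( b.value - &V, '999,999,999,999' ) diff
  from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
   and lower(a.name) like '%' || lower('&S') || '%';
```

Run @mystat before the operation of interest and @mystat2 after; the DIFF column shows how much the statistic moved.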
 
 
 
 
Thank you
A reader, April     13, 2007 - 1:23 pm UTC
 
 
Tom,
     Thank you for the exceptionally detailed walk through.  Professional and to the point as always. 
 
1753?
Stephan, July      26, 2007 - 10:51 pm UTC
 
 
Hi Tom,
Trying the hash cluster examples from your new book (pg 385). I've changed the table and cluster names, but I think I've got everything straight. I can't seem to get past this error - any quick thoughts?
ops$us\suzzell@OPERA> create cluster single_tab_hash_clust
  2  ( hash_key number(10) )
  3  hashkeys 75000
  4  size 150
  5  single table
  6  hash is HASH_KEY
  7  /
Cluster created.
Elapsed: 00:00:01.00
ops$us\suzzell@OPERA> create table t10_9_hashed
  2  cluster single_tab_hash_clust(object_id)
  3  as
  4  select owner, object_name, subobject_name,
  5         cast( object_id as number(10) ) object_id,
  6         data_object_id, object_type, created,
  7         last_ddl_time, timestamp, status, temporary,
  8         generated, secondary
  9    from dba_objects
 10  /
cluster single_tab_hash_clust(object_id)
                              *
ERROR at line 2:
ORA-01753: column definition incompatible with clustered column definition
Should probably mention I'm using 9i...
Thanks, 
 
July      27, 2007 - 2:00 pm UTC 
 
 
I cannot reproduce in 10.2.0.3 or 9.2.0.8 
 
 
I guess I concur
Stephan, July      27, 2007 - 5:23 pm UTC
 
 
I tested in 10.2.0.2 and it worked.
At least I can rest knowing I didn't miss anything stupid.
Thanks, Tom 
 
single table hash cluster still uses much more CPU than index access
Qihua, August    23, 2008 - 4:55 am UTC
 
 
Hash access consumes much more CPU than index access, even though hash access uses 1 LIO and index access uses at least 2 LIOs.
In Expert One-on-One Oracle, page 303:
"The hash cluster query took significantly more CPU. This too could be anticipated. The act of performing a hash is very CPU-intensive"
But for the test with a table using "hash is" (page 304), Oracle doesn't need to compute a hash, so it shouldn't take much CPU. Yet in your test the CPU is still much higher than with index access.
I ran a test on a huge single table hash cluster: an index access with 5 LIOs used nearly the same CPU as a hash access with 1 LIO. I don't know where the extra CPU goes. Based on that test, there seems to be no advantage to using a hash cluster - in a normal table the data is tightly packed, but in a cluster table the data is loosely packed, which means a much lower cache hit ratio. 
August    26, 2008 - 7:55 pm UTC 
 
 
Post full test case please - in most cases, the hash access takes a little less cpu (it'll vary by machine of course)
and remember the main goal - reduce contention, reduce IO.
when you use an index range scan + table access by index rowid - you have what is called a "hot block": the root index block. Everyone needs it. The more people that go after it, the more contention there will be for the cache buffers chains (cbc) latch. And the more people contending for the same latch, the MORE cpu is used (we use spin locks in almost all cases - we spin, in a loop, waiting for a latch, so the more people trying to get a latch, the more cpu each session expends getting it)
did you think to take your test to 10 or more concurrent users? 
 
 
hash uses 1 LIO and index uses 5 LIOs, CPU times are nearly the same
Qihua, August    30, 2008 - 2:50 am UTC
 
 
Thanks Tom, 
I post my test result below and I also have several questions/guess here, please confirm:
1: What's the overhead of hashing? Can the hashing overhead be ignored in comparison with 1 LIO?
My test suggests hashing is not very expensive: using "hash is" instead of the default hash function yields only a very small improvement in elapsed time.
2: It seems one LIO against an index block is much cheaper than one LIO against a block accessed by hash. When accessing a block by hash, Oracle has to scan the whole block, but in an index block Oracle can quickly locate the row, which saves a lot of CPU. Only this way can I explain why 1 LIO by hash is almost as costly as 5 LIOs by index access (4 LIOs for index blocks and 1 LIO for the data block fetched by rowid).
3: The big table (30GB) I want to turn into a cluster is partitioned, so the contention is not that high. And to avoid hash collisions, I'd like to allocate double the space of the original table, so the cache hit ratio could drop a lot.
If 1 LIO by hash access costs nearly the same as 5 LIOs by index access, then there seems to be no need to make the change - am I right?
SQL> exec runstats_pkg.rs_start;
PL/SQL procedure successfully completed.
SQL> declare
  2   v_record LOOKUP%rowtype;
  3  begin
  4     for i in 1..10000
  5     loop
  6        select * into v_record from LOOKUP where user_id=74560223;
  7     end loop;
  8  end;
  9  /
PL/SQL procedure successfully completed.
SQL>
SQL> exec runstats_pkg.rs_middle
PL/SQL procedure successfully completed.
SQL> declare
  2   v_record LOOKUP%rowtype;
  3  begin
  4     for i in 1..10000
  5     loop
  6        select /*+ INDEX(LOOKUP,LOOKUP_PK) */* into v_record from LOOKUP where user_id=74560223;
  7     end loop;
  8  end;
  9  /
PL/SQL procedure successfully completed.
SQL> exec runstats_pkg.rs_stop(5)
Run1 ran in 149 hsecs
Run2 ran in 160 hsecs
run 1 ran in 93.13% of the time
      
Name                                  Run1        Run2        Diff
STAT...db block gets                    18          24           6
STAT...db block gets from cach          18          24           6
LATCH.channel operations paren          56          50          -6
LATCH.active service list               14          21           7
STAT...recursive cpu usage           1,089       1,069         -20
STAT...CPU used by this sessio       1,110       1,090         -20
STAT...DB time                       1,122       1,090         -32
STAT...CPU used when call star       1,123       1,090         -33
STAT...Elapsed Time                  1,080       1,134          54
STAT...bytes received via SQL*       1,577       1,635          58
STAT...redo size                     3,044       3,104          60
LATCH.checkpoint queue latch         2,560       2,496         -64
LATCH.undo global data                   5          74          69
LATCH.enqueues                         132         206          74
LATCH.enqueue hash chains              138         213          75
STAT...session uga memory           -2,336       3,944       6,280
STAT...cluster key scans            10,000           0     -10,000
STAT...table fetch by rowid              2      10,002      10,000
STAT...buffer is not pinned co      10,004      20,004      10,000
STAT...cluster key scan block       10,000           0     -10,000
STAT...rows fetched via callba           2      10,002      10,000
STAT...no work - consistent re      10,000           0     -10,000
STAT...index fetch by key                2      10,002      10,000
LATCH.cache buffers chains          20,158      50,169      30,011
STAT...consistent gets              10,015      50,017      40,002
STAT...consistent gets from ca      10,015      50,017      40,002
STAT...session logical reads        10,033      50,041      40,008 
           ===> hash access: 1 LIO, index access: 5 LIO 
STAT...consistent gets - exami          10      50,012      50,002
      
Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
23,590      53,763      30,173     43.88%
PL/SQL procedure successfully completed. 
 
August    30, 2008 - 10:09 am UTC 
 
 
in a SINGLE USER TEST the cpu needed to do a SINGLE LOGICAL IO will be the same for everyone.
So, you have a query that takes 1 LIO; it takes X cpu to accomplish that. Of that X, some percentage is for the hash and some percentage is for the LIO. Let's call the portion for the LIO "x" (lowercase). All we know is that "x" < X, and X-"x" is the cpu used for the hashing (let us call that "y").
"y" is constant regardless of the number of concurrent users.  "y" is fixed.
"x" on the other hand is a function of the number of concurrent users, "x" is variable, "x" GOES UP as the user load goes up (you need to latch in order to do "x", latching is a serialization process, we use spin locks in general, the more people going after a latch at the same time, the more cpu each needs to do so).
So, as we add more users, the amount of cpu needed to perform that 1 LIO query is:
"y" + "x" * some_multiplier
You have that 5 LIO query; it is all LIO work (no hashing). In single user mode it also used X cpu time. Here we know that X = 5 * "x" (5 LIOs). As the concurrency goes up you have:
5 * ("x" * some_multiplier)
The rate of cpu increase is much higher for this second query - it is potentially even much worse than that (the multiplier for the second query could be much higher than for the first). Why? Because one of the blocks needed by this 5 LIO query is the root index block - everyone needs it, it'll be a hot block, and hot blocks lead to cache buffers chains (a latch name) contention.
hash clustered access takes about the same amount of cpu to retrieve a row in single user mode as the index access would.
it is when you scale up that you'll see a marked difference in the ability to scale; you'll have removed that hot block and reduced contention (every LIO you do is a point of contention). 
 
 
Hagi, August    01, 2009 - 9:31 pm UTC
 
 
Sir
Regarding index cluster tables:
Is it necessary to load the data in cluster key order?
Assume I have two tables that are clustered.
First, I insert a big amount of data into table1.
All the blocks in the cluster get filled.
Now I insert into table2.
How can Oracle put the rows with the same cluster key on the same block as table1's rows?
The clustered blocks are already full.
August    04, 2009 - 1:23 pm UTC 
 
 
I think you mean to say "is it necessary to load ALL of the rows with the same cluster key at the same time, in order"
The answer is:
It would be nice, but it is certainly not necessary, nor a requirement.
When you create a b*tree cluster - you set a SIZE parameter.  The SIZE parameter is the estimate of the size in bytes of the data you expect to store with a given cluster key.
So, say you have a cluster key "ID".  If you estimate that table1 will have 500 bytes of data associated with a given ID on average (overestimate a bit), and table2 will have 1500 bytes of data on average, you would set SIZE for the cluster to be about 2000.
Now, say you have an 8k block.  8k/2000 is about 4.  When you load table1, we'll put 4 cluster keys on a block and then move on to the next block.  We won't use pctfree, we'll use the size parameter.
see the difference in rows/block when we use different cluster SIZE settings?
ops$tkyte%ORA10GR2> drop table t1;
Table dropped.
ops$tkyte%ORA10GR2> drop cluster my_cluster;
Cluster dropped.
ops$tkyte%ORA10GR2> create cluster my_cluster ( id number ) size 2000;
Cluster created.
ops$tkyte%ORA10GR2> create index my_cluster_idx on cluster my_cluster;
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1 ( id number, data char(450) ) cluster my_cluster( id );
Table created.
ops$tkyte%ORA10GR2> insert into t1 select rownum, 'x' from dual connect by level <= 100;
100 rows created.
ops$tkyte%ORA10GR2> select min(cnt), max(cnt), avg(cnt) , count(*)
  2    from (
  3  select count(*) cnt
  4    from t1
  5   group by dbms_rowid.rowid_block_number( rowid )
  6         );
  MIN(CNT)   MAX(CNT)   AVG(CNT)   COUNT(*)
---------- ---------- ---------- ----------
         4          4          4         25
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table t1;
Table dropped.
ops$tkyte%ORA10GR2> drop cluster my_cluster;
Cluster dropped.
ops$tkyte%ORA10GR2> create cluster my_cluster ( id number ) size 1000;
Cluster created.
ops$tkyte%ORA10GR2> create index my_cluster_idx on cluster my_cluster;
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1 ( id number, data char(450) ) cluster my_cluster( id );
Table created.
ops$tkyte%ORA10GR2> insert into t1 select rownum, 'x' from dual connect by level <= 100;
100 rows created.
ops$tkyte%ORA10GR2> select min(cnt), max(cnt), avg(cnt) , count(*)
  2    from (
  3  select count(*) cnt
  4    from t1
  5   group by dbms_rowid.rowid_block_number( rowid )
  6         );
  MIN(CNT)   MAX(CNT)   AVG(CNT)   COUNT(*)
---------- ---------- ---------- ----------
         4          8 7.69230769         13
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> drop table t1;
Table dropped.
ops$tkyte%ORA10GR2> drop cluster my_cluster;
Cluster dropped.
ops$tkyte%ORA10GR2> create cluster my_cluster ( id number ) size 8000;
Cluster created.
ops$tkyte%ORA10GR2> create index my_cluster_idx on cluster my_cluster;
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t1 ( id number, data char(450) ) cluster my_cluster( id );
Table created.
ops$tkyte%ORA10GR2> insert into t1 select rownum, 'x' from dual connect by level <= 100;
100 rows created.
ops$tkyte%ORA10GR2> select min(cnt), max(cnt), avg(cnt) , count(*)
  2    from (
  3  select count(*) cnt
  4    from t1
  5   group by dbms_rowid.rowid_block_number( rowid )
  6         );
  MIN(CNT)   MAX(CNT)   AVG(CNT)   COUNT(*)
---------- ---------- ---------- ----------
         1          1          1        100
 
 
 
 
A reader, August    07, 2009 - 10:03 pm UTC
 
 
Sir;
As far as I know, the purpose of the cluster is for two or more columns to share the same block.
In this scenario:
If I insert into table1, we'll put 4 cluster keys on block1 and then move on to block2.
Then when I insert into table2, we'll again put 4 cluster keys on a different block3 and then move on to block4.
The thing is, the clustered rows of table1 and table2 will not reside in the same block.
They are all using different blocks.
 
August    11, 2009 - 12:38 pm UTC 
 
 
... As far as I know, the purpose of the cluster is for two or more columns to share the same block. ...
No, that isn't right.
the goal of the cluster is to:
store on a single block, or as few blocks as possible, all of the rows from all of the tables that share a common key.
You are missing the concept of the b*tree cluster entirely.  When we insert the 4 rows for table1 - the 4 keys we created were put onto block X.  Later, when those four keys are again inserted for any row in any table in that cluster - we'll want to put them onto block X again.
the cluster "clusters related data together".  it forces "location" on the data.  Before we insert into a table in the cluster, we figure out where the key values in the newly inserted rows say the row goes.
Say emp and dept are clustered together on DEPTNO.
Say you use a SIZE of 2k - so we get 4 keys/block.
Say you start with everything empty.
You insert DEPT=100, we look up 100 in the b*tree cluster index, we don't find it (empty tables in the cluster).  So, we find a block with sufficient free space AND that does not already have 4 cluster keys on it (call this block A) and insert DEPT=100 on that block and update the b*tree cluster index to show "dept=100 goes to block A"
You insert DEPT=10, we look up 10 in the b*tree cluster index, we don't find it.  We find a block with sufficient free space AND that does not have 4 cluster keys on it (this will be block A again probably).  We put dept=10 on that block and update the b*tree cluster to show "dept=10 goes to block A"
You insert DEPT=200, the same thing that happened to DEPT=10 happens again.
You insert DEPT=1, then same thing that happened to DEPT=10 happens again.
You insert DEPT=50, this time, when we go to find a free block, we cannot use block A again because block A already has 4 keys on it, so we put DEPT=50 (and the next three DEPT=101,33,42 values) onto block B.
So, you have two blocks
block A has 100, 10, 200, 1
block B has 50, 101, 33, 42
when you insert an employee with a deptno = 42, we look up in the b*tree cluster and discover "42 goes onto block B", so we put that emp row onto block B.
when you insert an employee with a deptno = 100, we look up in the b*tree cluster index and discover "100 goes onto block A"
and so on....
that is the way the b*tree cluster works. 
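The walk-through above can be verified directly by clustering DEPT and EMP and comparing block numbers. A sketch, assuming copies of the standard SCOTT demo tables are available (the SIZE value is illustrative):

```sql
create cluster emp_dept_cluster ( deptno number(2) ) size 2048;
create index emp_dept_cluster_idx on cluster emp_dept_cluster;

create table dept ( deptno number(2) primary key, dname varchar2(14) )
cluster emp_dept_cluster( deptno );

create table emp ( empno number primary key, ename varchar2(10),
                   deptno number(2) references dept )
cluster emp_dept_cluster( deptno );

insert into dept select deptno, dname from scott.dept;
insert into emp  select empno, ename, deptno from scott.emp;

-- rows from BOTH tables with the same deptno should report
-- the same block number
select 'DEPT' tab, deptno, dbms_rowid.rowid_block_number(rowid) blk
  from dept
union all
select 'EMP', deptno, dbms_rowid.rowid_block_number(rowid)
  from emp
order by 2, 1;
```

The department row and all of its employee rows land on the block the cluster index assigned to that DEPTNO value, regardless of insert order.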
 
 
A reader, August    14, 2009 - 9:43 pm UTC
 
 
Sir.
Thanks a lot for your enlightening explanation. It is now clear to me.
I appreciate your time.
Sorry for any inconvenience. 
 
converting to a single table hash cluster
A reader, September 23, 2009 - 11:11 pm UTC
 
 
Tom,
Having read this article, we've done some benchmarks and have decided that we will implement single table hash clusters (STHC).
We now need to implement this in a production environment by converting an existing HEAP table (referenced by foreign keys) to a STHC. It's not a huge table, approx 4 million rows.
Is there a "quick" command to do this? 
September 28, 2009 - 2:37 pm UTC 
 
 
if you want, online operations are permitted:
ops$tkyte%ORA10GR2> create table t1
  2  as
  3  select * from all_objects
  4   order by dbms_random.random;
Table created.
ops$tkyte%ORA10GR2> alter table t1 add constraint t1_pk primary key(object_id);
Table altered.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create cluster Hash_Cluster
  2  ( id number )
  3  SINGLE TABLE
  4  hashkeys 80000 -- number of distinct lookups we expect over time
  5  size 125         -- size of data associated with key
  6  /
Cluster created.
ops$tkyte%ORA10GR2> create table t2 (
  2  OWNER                  VARCHAR2(30),
  3  OBJECT_NAME            VARCHAR2(30),
  4  SUBOBJECT_NAME         VARCHAR2(30),
  5  OBJECT_ID              NUMBER,
  6  DATA_OBJECT_ID         NUMBER,
  7  OBJECT_TYPE            VARCHAR2(19),
  8  CREATED                DATE,
  9  LAST_DDL_TIME          DATE,
 10  TIMESTAMP              VARCHAR2(19),
 11  STATUS                 VARCHAR2(7),
 12  TEMPORARY              VARCHAR2(1),
 13  GENERATED              VARCHAR2(1),
 14  SECONDARY              VARCHAR2(1),
 15  NAMESPACE              NUMBER,
 16  EDITION_NAME           VARCHAR2(30)
 17  )
 18  cluster hash_cluster( object_id );
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_redefinition.can_redef_table( user, 'T1' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_redefinition.start_redef_table( user, 'T1', 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> variable nerrors number
ops$tkyte%ORA10GR2> begin
  2          dbms_redefinition.copy_table_dependents
  3          ( user, 'T1', 'T2',
  4            copy_indexes => dbms_redefinition.cons_orig_params,
  5            num_errors => :nerrors );
  6  end;
  7  /
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> exec dbms_redefinition.finish_redef_table( user, 'T1', 'T2' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> drop table t2;
Table dropped.
ops$tkyte%ORA10GR2> select dbms_metadata.get_ddl( 'TABLE', 'T1' ) from dual;
DBMS_METADATA.GET_DDL('TABLE','T1')
-------------------------------------------------------------------------------
  CREATE TABLE "OPS$TKYTE"."T1"
   (    "OWNER" VARCHAR2(30) CONSTRAINT "SYS_C0039325" NOT NULL ENABLE NOVALIDATE,
        "OBJECT_NAME" VARCHAR2(30) CONSTRAINT "SYS_C0039326" NOT NULL ENABLE NOVALIDAT
E,
        "SUBOBJECT_NAME" VARCHAR2(30),
        "OBJECT_ID" NUMBER CONSTRAINT "SYS_C0039327" NOT NULL ENABLE NOVALIDATE,
        "DATA_OBJECT_ID" NUMBER,
        "OBJECT_TYPE" VARCHAR2(19),
        "CREATED" DATE CONSTRAINT "SYS_C0039328" NOT NULL ENABLE NOVALIDATE,
        "LAST_DDL_TIME" DATE CONSTRAINT "SYS_C0039329" NOT NULL ENABLE NOVALIDATE,
        "TIMESTAMP" VARCHAR2(19),
        "STATUS" VARCHAR2(7),
        "TEMPORARY" VARCHAR2(1),
        "GENERATED" VARCHAR2(1),
        "SECONDARY" VARCHAR2(1),
        "NAMESPACE" NUMBER,
        "EDITION_NAME" VARCHAR2(30),
         CONSTRAINT "T1_PK" PRIMARY KEY ("OBJECT_ID")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "USERS"  ENABLE NOVALIDATE
   ) CLUSTER "OPS$TKYTE"."HASH_CLUSTER" ("OBJECT_ID")
else take an outage and:
a) create table new as select (put new table into hash cluster)
b) add constraints, grants, etc to it
c) drop old table
d) rename new 
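Steps a) through d) of the outage approach might look like the following sketch (using the T1/HASH_CLUSTER names from the example above; the constraint, grant, and role names are illustrative):

```sql
-- a) create the replacement table inside the hash cluster
create table t1_new
cluster hash_cluster( object_id )
as
select * from t1;

-- b) add constraints, grants, etc. to the new table
alter table t1_new add constraint t1_pk_new primary key ( object_id );
grant select on t1_new to some_role;

-- c) drop the old table (any foreign keys pointing at it must be
--    re-created against the new table afterwards)
drop table t1;

-- d) rename the new table into place
rename t1_new to t1;
alter table t1 rename constraint t1_pk_new to t1_pk;
```

Unlike dbms_redefinition, this requires downtime, but it is a single pass over the data and is simple to verify.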
 
 
 
dbms_redefinition
A reader, September 29, 2009 - 3:50 pm UTC
 
 
Thank you. dbms_redefinition - most useful to know! 
 
Single-table Hash Cluster vs. IOT
Stacey Jones, October   13, 2009 - 3:29 pm UTC
 
 
Tom,
In one of our data warehouses we have a currency conversion table that gets populated with conversion rates on a daily
basis.  The data is kept historically so that reports can compute the conversion rate for any day in the past.  The math
to do the rate conversion is very simple but the problem is that it can be done up to a 100 million times per day and
this consumes quite a bit of CPU when the reports are running.  The conversion rate table has about 500,000 rows and I
thought I would pre-compute the conversions with a materialized view so that there was no math involved - only a select.
The resulting MV will contain about 170 million rows.  Initially I built this with a MV with an IOT table, but was
considering a single-table hash cluster instead, since this seems to be your suggestion in "Effective Oracle by Design".
I built both versions and was a little surprised that the hash cluster solution is slower (elapsed time) and also
consumes more CPU when I check the consumption in v$sqlarea.  Could you check out my solution and let me know if
    A) I've missed something with my hash cluster solution and this could be improved.
    B) I'm better off with the IOT solution
    C) I'm better off with something else entirely.
    
I'm running Oracle 9.2.0.7 on Sun Solaris.  Many thanks in advance, your book does rock!
Thanks
Stacey
-- Create the IOT table
--DROP TABLE STG.MV_WSG_IOT;
CREATE TABLE STG.MV_WSG_IOT
(
  CONV_DATE           DATE                      NOT NULL,
  FROM_CURRENCY_CODE  VARCHAR2(6 BYTE)          NOT NULL,
  TO_CURRENCY_CODE    VARCHAR2(6 BYTE)          NOT NULL,
  CONV_RATE           NUMBER                    NOT NULL,
  A_ROWID             ROWID                     NOT NULL,
  B_ROWID             ROWID                     NOT NULL, 
  CONSTRAINT MV_WSG_PK_IOT
 PRIMARY KEY
 (CONV_DATE, FROM_CURRENCY_CODE, TO_CURRENCY_CODE)
)
ORGANIZATION INDEX
NOLOGGING
TABLESPACE AUDIT_SHIPMENT_DATA
STORAGE    (
            INITIAL          1M
            NEXT             1M
           )
COMPRESS 2
NOPARALLEL;
-- Load the IOT with 500000 rows of data
begin
 insert /*+ APPEND */ into STG.MV_WSG_IOT select * from stg.mv_wsg where rownum < 500001;
 commit;
end;
-- Create the cluster
--DROP TABLE STG.MV_WSG_HASH_CLUSTER;
--DROP CLUSTER STG.CLU_CURR_CONV_LKP;
CREATE CLUSTER STG.CLU_CURR_CONV_LKP
(
  CONV_DATE                       DATE,
  FROM_CURRENCY_CODE              VARCHAR2(6 BYTE),
  TO_CURRENCY_CODE                VARCHAR2(6 BYTE)
)
TABLESPACE AUDIT_SHIPMENT_DATA
SIZE       61
STORAGE    (
            INITIAL          1M
            NEXT             1M
           )
SINGLE TABLE
HASHKEYS 500000
NOROWDEPENDENCIES
NOCACHE
NOPARALLEL;
-- Create the hash clustered table
--DROP TABLE STG.MV_WSG_HASH_CLUSTER
CREATE TABLE STG.MV_WSG_HASH_CLUSTER
(
  CONV_DATE           DATE                      NOT NULL,
  FROM_CURRENCY_CODE  VARCHAR2(6 BYTE)          NOT NULL,
  TO_CURRENCY_CODE    VARCHAR2(6 BYTE)          NOT NULL,
  CONV_RATE           NUMBER                    NOT NULL,
  A_ROWID             ROWID                     NOT NULL,
  B_ROWID             ROWID                     NOT NULL 
)
CLUSTER STG.CLU_CURR_CONV_LKP(CONV_DATE,FROM_CURRENCY_CODE,TO_CURRENCY_CODE)
;
-- Load the Hash clustered table with 500000 rows of data
begin
 insert /*+ APPEND */ into STG.MV_WSG_HASH_CLUSTER select * from stg.mv_wsg where rownum < 500001;
 commit;
end;
-- Analyze both tables
BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
      OwnName        => 'STG'
     ,TabName        => 'MV_WSG_HASH_CLUSTER'
    ,Estimate_Percent  => NULL
    ,Method_Opt        => 'FOR ALL INDEXED COLUMNS SIZE AUTO '
    ,Degree            => 4
    ,Cascade           => TRUE
    ,No_Invalidate     => FALSE);
  commit;
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
      OwnName        => 'STG'
     ,TabName        => 'MV_WSG_IOT'
    ,Estimate_Percent  => NULL
    ,Method_Opt        => 'FOR ALL INDEXED COLUMNS SIZE AUTO '
    ,Degree            => 4
    ,Cascade           => TRUE
    ,No_Invalidate     => FALSE);
  commit;
END;
--Run an explain plan for each of the queries:
SELECT * FROM STG.MV_WSG_IOT t
 WHERE 1=1
   AND t.conv_date = '05-JUL-04'
   AND t.from_currency_code = 'JPY'
   AND t.to_currency_code = 'CNY'
Plan
SELECT STATEMENT  CHOOSE Cost: 2  Bytes: 43  Cardinality: 1  IO Cost: 2      
    1 INDEX UNIQUE SCAN UNIQUE STG.MV_WSG_PK_IOT Search Columns: 3  Access Predicates: "T"."CONV_DATE"='05-JUL-04' AND "T"."FROM_CURRENCY_CODE"='JPY' AND "T"."TO_CURRENCY_CODE"='CNY'  Cost: 2  Bytes: 43  Cardinality: 1  IO Cost: 2  
SELECT * FROM STG.MV_WSG_HASH_CLUSTER t
 WHERE 1=1
   AND t.conv_date = '05-JUL-04'
   AND t.from_currency_code = 'JPY'
   AND t.to_currency_code = 'CNY'
Plan
SELECT STATEMENT  CHOOSE Bytes: 46  Cardinality: 1      
    1 TABLE ACCESS HASH STG.MV_WSG_HASH_CLUSTER Object Instance: 1  Search Columns: 3  Access Predicates: "T"."CONV_DATE"='05-JUL-04' AND "T"."FROM_CURRENCY_CODE"='JPY' AND "T"."TO_CURRENCY_CODE"='CNY'  Bytes: 46  Cardinality: 1  
-- Create a jumbled list to use in querying both tables.
 
create table stg.test_params
tablespace AUDIT_SHIPMENT_DATA
compress
as
select conv_date,from_currency_code,to_currency_code from (
select * from stg.mv_wsg x where rownum < 500001) x2
order by to_char(conv_date,'dd'),to_char(conv_date,'yy') desc, to_char(conv_date,'mm'), substr(from_currency_code,2,1),substr(to_currency_code,3,1) desc; 
-- Run the following test block to time random conversion rate lookups...
DECLARE
   CURSOR c_master IS SELECT * FROM stg.test_params WHERE rownum < 100001;
   CURSOR c_baseline_query(b1 IN stg.mv_wsg_iot.conv_date%TYPE
     ,b2 IN stg.mv_wsg_iot.from_currency_code%TYPE
     ,b3 IN stg.mv_wsg_iot.to_currency_code%TYPE
   ) IS   
   SELECT t.conv_rate FROM STG.MV_WSG_IOT t WHERE t.conv_date = b1 AND t.from_currency_code = b2 AND t.to_currency_code = b3;
   CURSOR c_replacement_query(
      b1 IN stg.mv_wsg_hash_cluster.conv_date%TYPE
     ,b2 IN stg.mv_wsg_hash_cluster.from_currency_code%TYPE
     ,b3 IN stg.mv_wsg_hash_cluster.to_currency_code%TYPE
   ) IS   
   SELECT t.conv_rate FROM STG.MV_WSG_HASH_CLUSTER t WHERE t.conv_date = b1 AND t.from_currency_code = b2 AND t.to_currency_code = b3;
    -- Constants and Variables
    num_seconds_in_a_day NUMBER := 24*60*60;
    start_time DATE;
    end_time DATE;
    baseline_elapsed_time NUMBER := 0;-- In seconds
    replacement_elapsed_time NUMBER := 0;-- In seconds
    percent_change NUMBER := 0;
BEGIN
   DBMS_OUTPUT.ENABLE(1000000);
   COMMIT;   
   -- Run and time the baseline query
   start_time := SYSDATE;
   DBMS_OUTPUT.put_line('Baseline run began at: ' || TO_CHAR(start_time, 'mm/dd/yyyy hh24:mi:ss'));
   COMMIT;   
   FOR m IN c_master LOOP
       FOR r IN c_baseline_query(m.conv_date,m.from_currency_code,m.to_currency_code) LOOP
        NULL;
       END LOOP;-- c_test       
   END LOOP;--c_master
   end_time := sysdate;
   DBMS_OUTPUT.put_line('Baseline run ended at: ' || TO_CHAR(end_time, 'mm/dd/yyyy hh24:mi:ss'));
   COMMIT;
   baseline_elapsed_time := (end_time - start_time) * num_seconds_in_a_day;   
   -- Run and time the replacement query   
   start_time := SYSDATE;
   DBMS_OUTPUT.put_line('Replacement query run began at: ' || TO_CHAR(start_time, 'mm/dd/yyyy hh24:mi:ss'));
   COMMIT;
   FOR m IN c_master LOOP
       FOR r IN c_replacement_query(m.conv_date,m.from_currency_code,m.to_currency_code) LOOP
        NULL;
       END LOOP;-- c_test       
   END LOOP;--c_master
   end_time := sysdate;
   DBMS_OUTPUT.put_line('Replacement query run ended at: ' || TO_CHAR(end_time, 'mm/dd/yyyy hh24:mi:ss'));
   COMMIT;
   replacement_elapsed_time := (end_time - start_time) * num_seconds_in_a_day;
   -- Summarize the results
   DBMS_OUTPUT.put_line('******************************* SUMMARY *******************************');
   DBMS_OUTPUT.put_line('Baseline elapsed time in seconds  -> '||to_char(baseline_elapsed_time));
   DBMS_OUTPUT.put_line('New query elapsed time in seconds -> '||to_char(replacement_elapsed_time));
   percent_change := ROUND(ABS(((baseline_elapsed_time-replacement_elapsed_time)/baseline_elapsed_time)*100));
   IF (replacement_elapsed_time < baseline_elapsed_time) THEN
    dbms_output.put_line('New query is '||to_char(percent_change)||'% faster than the original.');
   ELSIF (replacement_elapsed_time > baseline_elapsed_time) THEN
    dbms_output.put_line('New query is '||to_char(percent_change)||'% slower than the original.');
   ELSE
    dbms_output.put_line('No significant performance difference detected, you might try a larger test case');   
   END IF;
   COMMIT;
END;
Gives these results:
Baseline run began at: 10/13/2009 20:04:43
Baseline run ended at: 10/13/2009 20:05:10
Replacement query run began at: 10/13/2009 20:05:10
Replacement query run ended at: 10/13/2009 20:05:41
******************************* SUMMARY *******************************
Baseline elapsed time in seconds  -> 27
New query elapsed time in seconds -> 30.99999999999999999999999999999999999997
New query is 15% slower than the original.
 
 
Computing index cluster size
Johann, April     12, 2011 - 2:48 am UTC
 
 
Hi Tom,
As usual, I love the learnings I get from this site.  However, while browsing I still cannot find a good way to calculate what I should put in as size given the ff:
1. Have two tables, A and B.  
2. For Table A, AVG_ROW_LEN is 172.
3.  For table B, it's just 26, but I know this is because there's one column there that people haven't started populating.  The possible values for that column have an average length of 226 so I'm thinking I should for now use 26+226=252 as average row length for Table B (I know that's a bit simplistic but until I really have a good idea on the values for that column, it will do for now)
4.  And on the average, there are 3.77 entries on table B for each entry on table A.
Given the above, what I have so far as candidate cluster size is 172 + (252 * 3.77) = 1122.  That still doesn't take into consideration indexes I want to put in for the columns aside from the clustering column and I currently have no idea how to factor them in.  Would appreciate pointers.  Thanks. 
April     13, 2011 - 9:14 am UTC 
 
 
"ff" ??? not even a tiny idea what that is.
 http://acronyms.thefreedictionary.com/FF lots of possibilities.
If you are trying to size the cluster, it is just the amount of data associated with the key - indexes don't matter, they don't count in the sizing.  Just the sum of the row sizes associated with a given cluster key value. 
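The sizing rule above (sum of the row sizes associated with one cluster key value) applied to the poster's numbers can be sketched as follows; the figures come from the question and the function name is made up:

```python
def cluster_key_size(parent_row_len, child_row_len, avg_children_per_parent):
    """Estimate bytes needed per cluster key value: one parent row plus
    all of its child rows (indexes do not count, per the answer above)."""
    return parent_row_len + child_row_len * avg_children_per_parent

# the poster's figures: table A row = 172 bytes, table B row ~ 252 bytes,
# about 3.77 table B rows per table A row
size_estimate = cluster_key_size(172, 252, 3.77)   # ~1122 bytes
```

In practice you would round that up to leave room for row overhead and growth before using it as SIZE in CREATE CLUSTER.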
 
 
 
Thanks!
Johann, April     14, 2011 - 3:09 am UTC
 
 
Sorry about the "ff" - I've always used it as "following" and I thought that was the norm even before days of IM and SMS.
Thanks for the clarification.  I guess I can proceed with my computation. 
April     14, 2011 - 9:50 am UTC 
 
 
I've honestly never heard of using ff as "following" 
 
 
Clusters vs Partitioning
Nathan Marston, April     20, 2011 - 10:30 pm UTC
 
 
Hi Tom,
Similar question to the ones before - what are the performance implications of choosing between a clustered table and a partitioned table (list partitioned by what would have been the cluster key) - both for selects based on the cluster key, and DML?
(I guess I'm struggling to see how the organisation of the table data would be different in any meaningful way - and hence, if there would be any performance difference) 
April     25, 2011 - 7:54 am UTC 
 
 
they are about as different as night and day.
they are the same in that both force "organization" on the data.
A cluster ensures that rows with a common key are stored very very very near each other - typically in the same block.  All of the rows in the emp table with DEPTNO=10 would be stored on the same block for example.
Partitioning ensures that rows are stored in a partition (which has many many many blocks) based on the value of their partition key.  In a HASH partitioned table using the hash key of DEPTNO, all records with deptno=10 would be stored in the same partition, but it is highly unlikely they would be on the same block.
Clustering is typically about getting data on the same block, partitioning is not 
 
 
Re: Clusters vs Partitioning
Nathan Marston, April     26, 2011 - 11:02 pm UTC
 
 
Thanks for the quick reply.
FYI, this isn't a work-related question so there's no urgency for an answer on my end. I think we agree that clustering is specifically aimed at improving performance while partitioning is more general-purpose in nature. The motivation for my question is just academic curiosity :).
Getting back to your response, sure - I understand that in the general case, partitioning will not get rows with equal keys organised together. That's because the relationship between partition key values and partitions is typically many keys to one partition, not one-to-one, and Oracle's under no obligation to put the same key values together.
So in general, a query on the partitioned field still has to scan the partition and that partition contains data that doesn't match the key you're looking for, so relative to clustering you're performing more LIOs. Even if you had (say) a local index on that partition key - and Oracle actually used it (which is doubtful) you're still worse off than if you'd clustered in the first place.
That wasn't my question though - I was asking about a more specific case. The case I'm interested in is where you are list partitioning the table by what would have been the cluster key, and you have set up the partitioning so there is one partition per key value.
For example, say you had this table containing a code field and the table is frequently accessed by that field:
create table blah ( ..., code varchar2(50) not null, ... );
Say code has 10 distinct values. In one scenario you cluster it by "code", in the other you list partition it by "code" with one partition per distinct "code" value (so there are 10 partitions).
I think we both agree that clustering still makes more sense even in this more specific case, if our goal is to make the "code" values sit together in the table and hence reduce LIO. What clinches it for me is that clustering just says what I mean. What I'm trying to see is what the performance implications of choosing partitioning instead would be.
Having explained the difference between my question and what I think you thought my question was... I think I now know how Oracle would treat them differently.
I think the issue is that with my partitioning scheme, "it just so happens" that partition keys and partitions map one to one - this kind of setup is a special corner case. My guess would be that Oracle would (under some circumstances) be unable to take advantage of that fact when optimising queries against the partitioned table. 
April     27, 2011 - 7:51 am UTC 
 
 
I think we both agree that clustering still makes more sense even in this more specific case, if our goal is to make the "code" values sit together in the table and hence reduce LIO. 
No, we wouldn't agree, in that narrow very specific case - partitioning and clustering would achieve effectively the same goal.
If you list partition by CODE and each CODE has its own partition - it would be very much like clustering - even a bit better in fact.  Both cases would have blocks that are filled with just a single CODE value - but the partitioning example would also have these blocks adjacent to each other (better for the full scan) AND it would have it so that we'd know all of the blocks that need to be full scanned for a partition - whereas the cluster would not, you'd have to full scan the entire cluster if you were to full scan.
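A hedged sketch of the one-partition-per-CODE layout under discussion (table, column, and partition names are illustrative):

```sql
-- list partition by CODE, one partition per distinct value, so each
-- partition's blocks hold rows for exactly one CODE
create table blah (
  code    varchar2(50) not null,
  payload varchar2(200)
)
partition by list (code) (
  partition p_a values ('A'),
  partition p_b values ('B'),
  partition p_c values ('C')
  -- ... one partition per distinct CODE value
);

-- a query with "where code = :x" prunes to exactly one partition,
-- so even a full scan touches only that partition's blocks
```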
 
 
 
Clusters vs Partitions
Nathan Marston, April     28, 2011 - 9:07 pm UTC
 
 
If you list partition by CODE and each CODE has its own partition - it would be very much like clustering - even a bit better in fact. Both cases would have blocks that are filled with just a single CODE value - but the partitioning example would also have these blocks adjacent to each other (better for the full scan) AND it would have it so that we'd know all of the blocks that need to be full scanned for a partition - whereas the cluster would not, you'd have to full scan the entire cluster if you were to full scan.
Ah... I see where my confusion has come from. I didn't read the manual properly in the first place ;).
What I was *thinking* clustering did was physically order rows by the cluster key (a kind of generalised IOT, except the key fields didn't need to be the primary key). I couldn't see a difference because I thought that advantage (row ordering) applied in both cases.
If I read the docs correctly, what clustering actually does is organise the row data so that for any cluster key value the rows corresponding to that value occupy the minimum possible number of blocks. If there's multiple blocks involved, they could potentially be anywhere.
Running with the example further, I can't think of anything I could do (short of adding a parallel clause - and that's cheating) to make that clustered table outperform the list partitioned table for queries that have "where code = :x" in the SQL.  
The data I'm after is already in the minimum possible number of blocks, Oracle knows exactly where those blocks are, and those blocks are all together. In terms of data organisation, I think that's as good as it gets. 
Indexing the cluster key on my clustered table might improve its performance (such an index might not get used), but even if it did I don't think it would outperform the list partitioned table for the same query. 
April     29, 2011 - 8:12 am UTC 
 
 
If I read the docs correctly, what clustering actually does is organise the row data so that for any cluster key value the rows corresponding to that value occupy the minimum possible number of blocks. If there's multiple blocks involved, they could potentially be anywhere. 
exactly.
Running with the example further, I can't think of anything I could do (short of adding a parallel clause - and that's cheating) to make that clustered table outperform the list partitioned table for queries that have "where code = :x" in the SQL.
Neither can I.
The data I'm after is already in the minimum possible number of blocks, Oracle knows exactly where those blocks are, and those blocks are all together. In terms of data organisation, I think that's as good as it gets.
well, there are always opportunities to perhaps compress the data onto fewer blocks... 
 
 
Clusters vs Partitions
Nathan Marston, April     29, 2011 - 1:31 pm UTC
 
 
Thanks Tom - that makes complete sense to me now. Funny how that happens when I go back and re-check my assumptions, isn't it? ;)
Two more questions - I mentioned earlier how I *thought* clusters worked (i.e. that they imposed a partial ordering on the data based on some combination of fields, so data with the same "key" sits together on disk like it does with an IOT).
First - I'm wondering whether Oracle currently supports this kind of table? It seems to me having "a table where the rows are ordered by columns x,y,z" where those columns aren't necessarily unique could be quite useful.
Second - any idea when you're likely to start taking new questions? 
I do have another one regarding a situation where I think the MV refresh engine is doing "the wrong thing", but I won't take up your time here since it isn't really related to clustering at all. 
April     30, 2011 - 10:03 am UTC 
 
 
To accomplish storing all data with the same key together would require either preallocating a ton of space for each key and hoping we didn't run out of room, or shifting an entire table down when you insert a key at the top, or moving all of the rows with the same key to somewhere else when they don't fit any more.  In other words - hugely expensive.  I don't foresee that happening.
I take new questions all of the time...  took at least 36 last week... 
 
 
Clusters vs Partitioning
Nathan Marston, May       01, 2011 - 8:38 pm UTC
 
 
To accomplish storing all data with the same key together would require either preallocating a ton of space for each key and hoping we didn't run out of room or shifting an entire table down when you insert a key at the top or moving all of the rows with the same key to somewhere else when they don't fit any more.
Hm... fair enough, but I think I'm still missing something then. Why isn't this an issue with IOTs?
I take new questions all of the time... took at least 36 last week... 
Perhaps it has something to do with the time of day that I'm looking to submit questions? Seems almost every time I look at the main page it says you're unable to take new questions. 
May       04, 2011 - 11:45 am UTC 
 
 
IOTs do not store the data for the entire table sorted on disk, they store a bunch of blocks on disk that if you read them with a full scan - would return the data unsorted.
In an IOT, the rows on a given block might be sorted by the primary key - but the rows on disk, in whole, are NOT.
An IOT stores data just like an index, an index that is fast full scanned using multiblock IO doesn't return data sorted, it returns it as it sees it on disk. 
 
 
Cluster or partition
Jayadevan, September 17, 2012 - 2:39 am UTC
 
 
Hello Tom,
I have 6-7 tables involved in a number of left outer joins. There is an activity table which tracks activities by members of a loyalty program. This table has many millions of records with unique activity numbers (millions of customers, each with many activities). Depending on the 'type' of activity- redemption, accrual, tier change etc, the details of the activities will be in the child tables (the other 6 tables). They are joined by the activitynum.
We have one query which lists the activities along with the details (from child tables) for any one member, for the past year. To tune the query, we tried partitioning (i.e. the activity table has a curr_year flag - set to Y or N). We have indexes on the child tables and the query is doing mostly PK-index unique scans. Still the query takes a few seconds while we are trying to make it retrieve data in under a second. Will clustering all these tables on the activity number help? 
September 20, 2012 - 1:54 am UTC 
 
 
 To tune the query, we tried 
partitioning
unless you were full scanning, partitioning isn't really going to be useful to "tune a query" with.  Unless you can employ partition elimination to remove some data from being scanned, partitioning isn't going to make you go faster (in fact, it could make you go much slower if you are not careful!!!)
I cannot answer the last question without understanding the nature of the data as it exists right now and how much of your time is spent doing physical IO's.
Here would be a quick way to answer this, to see if it would help - what is the response time of the query when it does ZERO physical IO's (eg: the second time you run it with the same inputs).
If that is not less than a second, then no, clustering (which would be useful to reduce physical IO's) by itself would not accomplish your goal.
How about showing us the tkprof report of the query (with the row source operation section - that is vital!!!) and explaining the schema (indexes and the like) 
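One way to run that zero-physical-IO check is to look at the query's cursor in V$SQL after running it twice with the same inputs (a sketch; the /* activity_list */ tag is an invented marker to make the cursor easy to find):

```sql
-- tag the query with a comment, run it twice with the same binds,
-- then look at the cumulative IO and time for its cursor
select executions, buffer_gets, disk_reads,
       round(elapsed_time / 1e6, 3) as elapsed_secs
  from v$sql
 where sql_text like 'select /* activity_list */%';

-- if disk_reads stops growing on the second run but elapsed time is
-- still over a second, clustering (which mainly cuts physical IO)
-- will not get you under a second by itself
```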
 
 
Partitioning and clustering
Jayadevan, September 20, 2012 - 11:40 pm UTC
 
 
Hi Tom,
I am trying to answer all your questions - though not in the same order.
We did not have full scans. But I thought that if the active - i.e. current year's - data is in a different partition, and we ensured that the data is filtered on the partition key (which we are), it would help. We made the indexes local.
The second time the query is executed, it does come back very fast - i.e. physical IO does seem to contribute to the delay.
TKPROF - 
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID DOMPARAMS (cr=3 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      1   INDEX UNIQUE SCAN DOMPARAMS_PK (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245981)
      1  TABLE ACCESS BY INDEX ROWID DOMPARAMS (cr=3 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      1   INDEX UNIQUE SCAN DOMPARAMS_PK (cr=2 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245981)
      0  TABLE ACCESS BY INDEX ROWID DOMPARAMS (cr=0 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      0   INDEX UNIQUE SCAN DOMPARAMS_PK (cr=0 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245981)
      0  TABLE ACCESS BY INDEX ROWID DOMPARAMS (cr=0 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      0   INDEX UNIQUE SCAN DOMPARAMS_PK (cr=0 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245981)
      0  TABLE ACCESS BY INDEX ROWID MEMBERACTIVITY (cr=0 pr=0 pw=0 time=0 us cost=3 size=32 card=1)
      0   INDEX UNIQUE SCAN MEMBERACTIVITY_PK (cr=0 pr=0 pw=0 time=0 us cost=2 size=0 card=1)(object id 246166)
      0  TABLE ACCESS BY INDEX ROWID DOMPARAMS (cr=0 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      0   INDEX UNIQUE SCAN DOMPARAMS_PK (cr=0 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245981)
   1688  SORT AGGREGATE (cr=3 pr=0 pw=0 time=0 us)
      0   NESTED LOOPS  (cr=3 pr=0 pw=0 time=0 us cost=2 size=61 card=1)
      0    TABLE ACCESS BY INDEX ROWID MEMBERMASTER (cr=3 pr=0 pw=0 time=0 us cost=2 size=30 card=1)
      0     INDEX RANGE SCAN MEMBERMASTER_PK (cr=3 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 246200)
      0    INDEX UNIQUE SCAN PRGMEMSTMACTDTL_PK (cr=0 pr=0 pw=0 time=0 us cost=0 size=31 card=1)(object id 246197)
  13504  SORT ORDER BY (cr=15651 pr=0 pw=0 time=24305 us cost=50 size=10801 card=7)
  13504   NESTED LOOPS OUTER (cr=15651 pr=0 pw=0 time=642539 us cost=49 size=10801 card=7)
  13504    NESTED LOOPS OUTER (cr=15643 pr=0 pw=0 time=602157 us cost=41 size=10458 card=7)
  13504     NESTED LOOPS OUTER (cr=15643 pr=0 pw=0 time=580501 us cost=40 size=10269 card=7)
  13504      VIEW  (cr=11553 pr=0 pw=0 time=515406 us cost=20 size=9961 card=7)
  13504       NESTED LOOPS OUTER (cr=11553 pr=0 pw=0 time=508655 us cost=20 size=10052 card=7)
   1688        VIEW  (cr=8174 pr=0 pw=0 time=58322 us cost=16 size=2802 card=2)
   1688         COUNT  (cr=8165 pr=0 pw=0 time=43982 us)
   1688          FILTER  (cr=8165 pr=0 pw=0 time=40617 us)
   1688           NESTED LOOPS OUTER (cr=8165 pr=0 pw=0 time=41572 us cost=16 size=2362 card=2)
   1688            VIEW  (cr=4918 pr=0 pw=0 time=30968 us cost=12 size=2328 card=2)
   1688             NESTED LOOPS OUTER (cr=4918 pr=0 pw=0 time=30245 us cost=12 size=464 card=2)
   1688              NESTED LOOPS OUTER (cr=4270 pr=0 pw=0 time=24702 us cost=9 size=416 card=2)
   1688               NESTED LOOPS OUTER (cr=668 pr=0 pw=0 time=10242 us cost=6 size=366 card=2)
   1688                TABLE ACCESS BY INDEX ROWID MEMBERACTIVITY (cr=439 pr=0 pw=0 time=4820 us cost=5 size=266 card=2)
   1688                 INDEX RANGE SCAN TCC_INDEX (cr=14 pr=0 pw=0 time=1566 us cost=3 size=0 card=2)(object id 299024)
      0                TABLE ACCESS BY INDEX ROWID PRGMANACCUPDACT (cr=229 pr=0 pw=0 time=0 us cost=1 size=50 card=1)
      0                 INDEX UNIQUE SCAN PRGMANACCUPDACT_PK (cr=229 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 246144)
   1688               TABLE ACCESS BY INDEX ROWID MEMACRACT (cr=3602 pr=0 pw=0 time=0 us cost=2 size=25 card=1)
   1688                INDEX UNIQUE SCAN MEMACRACT_PK (cr=1914 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 246164)
      0              TABLE ACCESS BY INDEX ROWID PRGTIRCHGACT (cr=648 pr=0 pw=0 time=0 us cost=2 size=24 card=1)
      0               INDEX UNIQUE SCAN PRGTIRCHGACT_PK (cr=648 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 246225)
   1688            TABLE ACCESS BY INDEX ROWID ACTIVITYPRG (cr=3247 pr=0 pw=0 time=0 us cost=2 size=17 card=1)
   1688             INDEX UNIQUE SCAN ACRACTPGM_PK (cr=1559 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 245844)
  13504        TABLE ACCESS BY INDEX ROWID PNTTYPEMASTER (cr=3379 pr=0 pw=0 time=0 us cost=3 size=140 card=4)
  13504         INDEX RANGE SCAN PNTTYPEMASTER_PK (cr=3 pr=0 pw=0 time=3920 us cost=1 size=0 card=8)(object id 246216)
   1200      TABLE ACCESS BY INDEX ROWID PRGMEMPNTTXN (cr=4090 pr=0 pw=0 time=0 us cost=3 size=44 card=1)
   1200       INDEX RANGE SCAN TCC_INDEX1 (cr=3702 pr=0 pw=0 time=0 us cost=2 size=0 card=1)(object id 298527)
      0     TABLE ACCESS BY INDEX ROWID COMBNSMST (cr=0 pr=0 pw=0 time=0 us cost=1 size=27 card=1)
      0      INDEX UNIQUE SCAN COMBNSMST_PK (cr=0 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 245953)
      0    TABLE ACCESS BY INDEX ROWID MEMBERACTIVITYOTHLNGDES (cr=8 pr=0 pw=0 time=0 us cost=2 size=49 card=1)
      0     INDEX RANGE SCAN MEMACTMUTLINGDES_PK (cr=8 pr=0 pw=0 time=0 us cost=1 size=0 card=1)(object id 246171)
********************************************************************************
Since the tables are linked on the activity/activity reference number, my thought was to cluster the tables on that column. But the tables involved are big - each one a few hundred MB. So I have concerns about the insert/update performance. 
 
September 26, 2012 - 12:11 pm UTC 
 
 
that says none of your time is spent doing physical IO, I don't see any pr's or pw's.
there are no full scans (so partitioning != (fast=true) for you)
and where is the rest of the tkprof??? the query? the report of parse/fetch/execute?  the wait events????
true I said the row source was important, but i didn't "forget everything else"
usually people just post the other bits... 
 
 
Single table hash cluster
A reader, August    14, 2013 - 8:45 am UTC
 
 
I'm following an example in your book where I implement a Single Table Hash Cluster when the HASHKEY is a surrogate primary key generated from a sequence, ie. it will never have a hash collision.
As the column is the primary key - I want to declare a constraint to state as such - however, that means it will also create a UNIQUE INDEX to enforce the constraint?
Can that cause the optimizer to use the UNIQUE index when accessing the table? Kind of defeats the purpose of having the cluster to reduce I/O ? 
August    14, 2013 - 3:20 pm UTC 
 
 
ie. it will never have a hash collision.
it absolutely would have hash collisions, there is always the chance of a hash collision (by design) in a hash cluster.  You are mapping a potentially infinite set of values into a finite number of buckets.
Even if you size your hash cluster at a billion entries and only put in two rows with different hash keys - you have a chance of a hash collision!
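That "two rows in a billion buckets can still collide" point is just the birthday problem; a quick sketch (the function name is ours, and it assumes a perfectly uniform hash):

```python
import math

def collision_probability(n_keys, n_buckets):
    """Chance that at least two of n_keys distinct key values land in
    the same of n_buckets hash buckets (classic birthday problem)."""
    if n_keys > n_buckets:
        return 1.0  # pigeonhole: a collision is certain
    # P(no collision) = product over i of (1 - i/n_buckets); sum logs
    # to avoid underflow for large n_keys
    log_p_no = sum(math.log1p(-i / n_buckets) for i in range(n_keys))
    return 1.0 - math.exp(log_p_no)

# two rows, a billion buckets: tiny but decidedly non-zero
p = collision_probability(2, 10**9)
```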
however, that means it will also create a UNIQUE INDEX to enforce the 
constraint?
absolutely 
Can that cause the optimizer to use the UNIQUE index when accessing the table? 
Kind of defeats the purpose of having the cluster to reduce I/O ?
no, the optimizer sees the three access paths:
a) full scan
b) index unique scan
c) hash
the optimizer will cost out each one and pick the "cheapest" path which would almost certainly be hash access.  that is what the optimizer does. 
 
 
 Single table hash cluster  
A reader, August    16, 2013 - 11:27 am UTC
 
 
Thank you.
One more question;
If I had a lookup table where I always go in by primary key (generated from a sequence) - what's the difference between using a Single Table Hash cluster and an IOT ? What's the reasons I would choose one over the other ? 
August    28, 2013 - 5:01 pm UTC 
 
 
single table hash cluster = hash(key) to get file.block and retrieve record in a single IO.
IOT = index range scan, read root, branch (maybe more than one branch) and then leaf block to get row.  probably 3 or 4 single block IO's.
so, hash cluster = cpu(to hash key) + one single block IO
IOT = three/four single block IO's, where everyone is trying to read the same root block (contention for cache buffers chains latch possible).
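For reference, a single table hash cluster keyed by a sequence value can be sketched like this (all names, and the HASHKEYS and SIZE values, are illustrative; HASH IS lets the integer key itself drive the bucket choice, skipping the hashing step):

```sql
create cluster lookup_cluster ( id number(10) )
  single table
  hashkeys 1000000   -- expected row count, known up front
  hash is id         -- use the sequence value itself as the hash
  size 64;           -- bytes reserved per key

-- the primary key still creates a unique index, but as discussed
-- above the optimizer will normally prefer the hash access path
create table lookup_table (
  id  number(10) primary key,
  val varchar2(100)
) cluster lookup_cluster (id);
```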
 
 
 
Single table hash cluster   Vs IOT
A reader, August    30, 2013 - 12:21 pm UTC
 
 
Thank you.
I was perhaps thinking you would use an IOT for a lookup table where you don't know with certainty how many rows there are going to be? Looking at your book again, that would appear logical, as you state that hash clusters are suitable when "you know with a good degree of accuracy how many rows the table will have..."
Following on from that, another reason would be when you need to perform range scanning of the cluster key? 
Think I kind of answered my own question based on the "opposites" theories for using hash clusters in your book! 
 
View Sizing Post Loading
Martin Rose, August    18, 2015 - 6:56 pm UTC
 
 
Tom, is there a way of seeing how well you've sized a hash cluster after you've loaded it up?
I have a single table hash cluster with a fixed, never-changing number of rows.  I thought I had correctly sized it, but the hash cluster version of the data takes up 10G, whilst the IOT version just 300K!
Clearly I've sized it wrongly, but how can I see where I've gone wrong?