Helena Markova, November 16, 2001 - 2:20 am UTC
why not use cluster always?
Thiyag, October 13, 2004 - 8:06 am UTC
Hi Tom,
Why not use cluster always? In other if the cluster does not qualify the below criteria cluster is not to be used;
1. Have lot of DML activity
2. Have non-uniform data.
Can you let us know why? Does cluster backfire anyway? Any idea if the usage of cluster in MS-Sql server is different from Oracle?
October 13, 2004 - 8:38 am UTC
do you have Expert One on One Oracle?
In there I spend many pages explaining what they are, how to use them and in fact how they are very different from the clustered index in sybase (later bought by MS)
Which of your books should I read first.
Stewart W. Bryson, February 25, 2005 - 3:22 pm UTC
I'm a data warehouse DBA (9.2 and 10) working mostly on tuning. Which one would benefit me the most. You seem to reference Expert One on One more in the threads I read, but I wanted to make sure.
Thanks.
February 25, 2005 - 6:44 pm UTC
design, best practices: Effective Oracle by Design (9ir2 and before)
examples, factual, text book like: Expert one on one Oracle (8ir3 and before)
I won't have all of the new 9i/10g stuff in Expert until end of this year.
Advice
Stewart W. Bryson, February 28, 2005 - 10:48 am UTC
If you didn't have to take our questions all day long, maybe you get get the new version out quicker.
Thanks as usual.
February 28, 2005 - 10:54 am UTC
where do you think I get material for the books from :)
hash vs index cluster
Kumar, May 12, 2005 - 10:03 am UTC
Does oracle preallocate space for hash cluster and not for index cluster? If so, one has to estimate really well the space for hash cluster and that may not be necessary for index cluster? Is this a fair statement? Thanks.
May 12, 2005 - 1:15 pm UTC
hash cluster --> preallocated based on number of keys and key size.
index cluster --> not preallocated
in both cases, you want to get the key size (amount of data associated with a given key) as good as possible, but with hash clusters understanding the uppper bound on the ultimate number of keys is very relevant, yes.
migrate sql server clustered index
A reader, December 22, 2005 - 3:13 am UTC
Hi
I am migrating a SQL Server 2000 Application to Oracle 10gR2. There are tons of clustered index in this application I was wondering how to transform that to Oracle type of clustered index?
Is it an IOT? A index cluster, a hash cluster, a sort hash cluster? I think it´s more like an IOT however I am not sure if its a good idea change over 50 tables to IOT....
December 22, 2005 - 10:44 am UTC
answer this very very important question first:
why did you use the clustered index in sqlserver. Was the primary technical reason behind using that structure to "permit more than one person at a time to insert into the table"?
migrate sql server clustered index
A reader, December 22, 2005 - 12:40 pm UTC
Hi
The reason is this is a GIS Application and making the index clustered the query runs much faster since the index contains the data the query is looking for.
I looked IOT and it wont work neither since IOT only supports primary keys! And these sql server clustered indexes are not used on PK :-(
December 22, 2005 - 4:47 pm UTC
well, you can create "just indexes" in Oracle to do that - create index i on t(a,b,c)
And same in sqlserver - the clustered index would not have changed that? It would have stored the data "sorted" on disk.
Is it that you want all of the rows with the same "key" values stored physically together?
migrate sql server clustered index
A reader, December 24, 2005 - 4:28 pm UTC
Hi
I talked with the developers and they say that clustered index store index and the data, same as IOT. They created the index for Spatial queries which accoding to them run much faster. They say the data is clustered in the disk (groupped) so the access time is much better
I think this has to be an IOT but IOT only works with PKs so I think at end of the day Clustered index as such in Oracle does not exist...
I would have to see if create an index with the data would be the solution, but if that is the only Oracle solution then why SQL Server invented clustered index? I am sure they can create composite indexes as well?
December 24, 2005 - 5:24 pm UTC
clustered indexes in sqlserver can have a 110% overhead - but anyway.
You are likely looking at a B*TREE CLUSTER, you create an index on the cluster key (say DEPTNO from EMP). There will be an entry in this cluster key once for each distinct key value - it'll point to the block where all of the data that has the same deptno is stored together.
Have you look through the Oracle Concepts Guide? (or if you have access to any of my books - I've written about these physical structures - b*tree clusters, hash clusters, sorted hash clusters, heap tables, external tables, index organized tables ..... in every one)
what do you mean 110$ overhead
A reader, December 25, 2005 - 4:32 pm UTC
Hi
What is the 110% overhead you mentioned in SQL Server clustered index?
December 25, 2005 - 4:45 pm UTC
extra storage in subsidiary data structures - such as other indexes. Sort of like secondary indexes on IOT's in Oracle has. Secondary indexes on these structures (iots/clustered indexes in sqlserver) cannot point to "where the data is", they have to store the "index key" in the secondary index and in effect do two index range scans to find the data)
And I believe the index key is stored "twice", once with the index entry in the intermediate index nodes - and again with the row data - so if the key is a sizeable component of the row - and is stored twice...
If the posters goal was to CLUSTER LIKE DATA together by some key, what they are looking for would be the b*tree cluster in Oracle.
Many times - clustered indexes in SQLServer are used to simply spread the data out - to avoid contention. If that was the reason the poster wanted to mimick their behaviour in Oracle - the answer would be "don't - you just need a normal table"
b*tree clusters
A reader, December 25, 2005 - 5:56 pm UTC
Hi
Where is b*tree clusters in the doc explained? Cant find anything :-?
December 26, 2005 - 10:08 am UTC
Hash vs. B*Tree cluster
Alen, June 07, 2007 - 12:34 pm UTC
Hi Tom,
in your book "Efficient Oracle by design" i don't understand something - maybe you could clarify. On the pages 393-394 you compared hash clusters vs. b*tree clusters and you wrote:
"If a B*Tree index can increase the buffer cache utilization by simply storing similar data together, a hash cluster can go one step further by eliminating the index blocks altogether."
However, the tkproof report doesn't show that - the numbers in the "query" column are almost the same (49234 vs. 49730), why is that? Since we only access the table blocks, no index blocks, why is the ammount of consistent gets almost the same?
The second thing i don't understand is the number of "fetch" and "rows" - meanwhile in the first and second test they are the same (47620 fetches, 47170 rows) - the third time they are different, but they shouldn't be (47610 fetches, 47160 rows) - 10 fetches less, 10 rows less.
June 07, 2007 - 3:19 pm UTC
almost the same is NOT 'the same', it reduced the logical IO's against the data stored in the hash cluster
i had 10 less rows when I created the tables the second time around apparently.
Hash vs. B*Tree cluster
Alen, June 07, 2007 - 5:02 pm UTC
Well I was expecting a large drop in LIO, since the use of HEAP table produced 95930 LIO and the use of B*Tree cluster produced 49730 LIO, thats only 51% of the previous one. Having in mind that "
If a B*Tree index can increase the buffer cache utilization by simply storing similar data together, a hash cluster can go one step further by eliminating the index blocks altogether." the result is poor - only 1% LIO less. That is really not a "one step further".
What amazes me more is my test. I wanted to try reproduce your tests, here are my results:
heap table = 38.183 consistent gets
btree cluster = 51.467 consistent gets
hash cluster = 91.845 consistent gets
When I compared yours and mine TKPROF report, i see the difference.
Your's output:
MERGE JOIN CARTESIAN
TABLE ACCESS HASH USERS_OBJECTS
BUFFER SORT
TABLE ACCESS HASH USER_INFO
Mine output:
MERGE JOIN CARTESIAN
TABLE ACCESS HASH USER_INFO
BUFFER SORT
TABLE ACCESS HASH USERS_OBJECTS
See, the tables "switched" the position. I guess that's because of the different data in the tables -> producing different statistics -> different "optimal" plan.
But if you think about it for a second: isn't it ALWAYS better to first access the USER_INFO table, since there is at most one row for a given username (we are using user_info.username = :value), and then join the result to USERS_OBJECTS?
I'm confused...