

Question and Answer

Tom Kyte

Thanks for the question, Arindam.

Asked: December 25, 2006 - 11:53 pm UTC

Last updated: July 23, 2010 - 6:56 am UTC

Version: 9i

Viewed 1000+ times

You Asked

Respected Mr. Tom,

My question is based especially on hash clusters. I have already read your two books, "effective design" and the latest. My question is about record length. In your example, the "emp" and "dept" tables are clustered on the "dept_no" column. On the basis of my understanding:

Question 1: At table-creation time I can load department and employee data together. But later, when a new employee joins, how do I load that employee for a specific department so that all related data still ends up in the same block?

Question 2: How do I estimate the future number of employees for a particular department, i.e. the "SIZE" parameter? Take Oracle Corporation as an example. Say the department is "Developer". Since the industry is now going through a booming phase, the number of developers is on the increase. So under the same department ID (say 7 for the Developer department), the data for that cluster key is growing faster than anyone could have imagined, since the department ID is the same for all developers.

If my understanding is wrong, please help me get a clear picture of the "SIZE" parameter of a cluster. Please consider both index clusters and hash clusters, although "SIZE" is irrelevant to an index cluster.

Regards,

and Tom said...

question 1: you just insert that employee. Oracle is responsible for remembering where this employee record needs to be stored. If the block with the other employees has sufficient free space, that new employee record will be stored there; else we'll chain in a new block to hold it.


question 2: only you, only you can answer that. You need to understand your data and how you anticipate it being used over the long term.
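The mechanics Tom describes for question 1 can be sketched in DDL. This is a hypothetical example modeled on the emp/dept index cluster discussed in the thread; all object names and the SIZE value are illustrative assumptions, not from the original:

```sql
-- hypothetical sketch; names and the SIZE value are illustrative assumptions
create cluster emp_dept_cluster
( deptno number(3) )
size 1024;   -- expected bytes of data per distinct deptno value

-- an index cluster needs its cluster index before rows can be inserted
create index emp_dept_cluster_idx on cluster emp_dept_cluster;

create table dept
( deptno number(3) primary key,
  dname  varchar2(20)
) cluster emp_dept_cluster (deptno);

create table emp
( empno  number primary key,
  ename  varchar2(20),
  deptno number(3) references dept
) cluster emp_dept_cluster (deptno);

-- question 1: a later hire is just an ordinary insert; Oracle looks up
-- the cluster key and co-locates the row with the existing deptno 7 data,
-- or chains in a new block if that block is full
insert into emp (empno, ename, deptno) values (9999, 'NEWHIRE', 7);
```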


Comments

Cluster - Record length

Arindam Mukherjee, December 26, 2006 - 9:50 am UTC

Sir,

First off, thank you so much for responding to my questions. I still have one doubt about what the record length is in a cluster. Please help me.

Say, in the "dept" table, the record length is at most 23, since dept_id is number(3) and dept_name is varchar2(20). For simplicity, I assume its length is 23 all the time.

In the "employee" table, the record length is 100, also assumed constant.

There is one (1) record in the employee table for dept_id = 1.
So in the cluster, for that dept_id, the record length = (23 + (100 - 3)) = 120, the '3' being for dept_id. Is that ok?

There are three (3) records in employee table for dept_id = 2.
So in a cluster, for that dept_id, the record length = (23+ (97*3)) = 314. Is it ok?

I know it is a very silly question, but I am not confident about this record-length calculation in a cluster. Please do not get annoyed.

Tom Kyte
December 26, 2006 - 9:05 pm UTC

you need to figure out

how many records approximately belong to the same cluster key and what their average length is.

you have (suppose) 1 dept record at 20 (cluster key doesn't count) and 3 emp records at 97...

so, 1*20 + 3*97 = 311

if that is the "average" or "best guess"
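Tom's arithmetic maps directly onto the SIZE clause. A sketch using the illustrative numbers from this exchange (the cluster definition itself is hypothetical):

```sql
-- sizing arithmetic from the discussion above (illustrative numbers):
--   1 dept row at ~20 bytes (the cluster key itself is not counted)
--   3 emp rows at ~97 bytes each
--   expected data per cluster key = 1*20 + 3*97 = 311 bytes
create cluster emp_dept_cluster
( deptno number(3) )
size 311;   -- in practice you would round up to leave room for growth
```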

System generated Cluster - SIZE

Arindam Mukherjee, December 27, 2006 - 1:22 am UTC

Respected Mr. Tom,

Thank you so much for your unflagging responses to my chain of questions on the same issue. In your book, you spelled out that Oracle itself uses clusters for data dictionary tables. When I install an Oracle database, the database does not know how large it will become in the future - it might be gigabytes or terabytes. So how does Oracle calculate the SIZE parameter for its own clusters? The only plausible answers are that either Oracle never uses a hash cluster for data dictionary tables, or it follows some rule. It is obvious that an index cluster also considers the average record length to fit all records linked with the same cluster key into a single block.

Could you kindly shed some light on the average record length or SIZE for the system-generated clusters? Sorry!! Please bear with my poor English to get what I am trying to ask you. Thank you once again for your caring responses.

Tom Kyte
December 28, 2006 - 8:34 am UTC

it is based on the average row size times the average expected number of rows.

the ultimate size of your database - not relevant

about how many columns we expect per table - relevant (for sizing the cluster that contains col$, for example)
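Oracle's own choices are visible in the dictionary. Assuming a suitably privileged account, you can list the per-key SIZE (the KEY_SIZE column) that Oracle picked for its dictionary clusters:

```sql
-- list the SYS-owned clusters and the per-key size Oracle chose for them;
-- C_OBJ# is the cluster that holds col$, among other tables
select cluster_name, cluster_type, key_size
  from dba_clusters
 where owner = 'SYS'
 order by cluster_name;
```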

Is this a good candidate for hash cluster

Ilan, July 20, 2010 - 3:10 am UTC

Hello Tom,
Following your recent talk at iloug, I decided to try to find a good candidate for a single-table hash cluster in my DB.
The thing is, it's a SYS table, so I wanted to ask before touching it.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.8.0 - Production

sql> desc sys.JAVASNM$
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SHORT                                     NOT NULL VARCHAR2(30)
 LONGNAME                                  NOT NULL LONG RAW
 LONGDBCS                                           VARCHAR2(4000)

sql> analyze table sys.JAVASNM$ compute statistics;

Table analyzed.

sql> select avg_row_len, num_rows from dba_tables where table_name = 'JAVASNM$' and owner = 'SYS';

AVG_ROW_LEN   NUM_ROWS
----------- ----------
        118      11636

sql> analyze table sys.JAVASNM$ delete statistics;

Table analyzed.
The only command against that table we find in v$sql is:
SELECT longname FROM JAVASNM$ WHERE short = :b1
which does an index unique scan followed by a table access by rowid.
That command has by far the most executions in the DB (a busy OLTP system), which is why I want to improve it in some way.
Also I want to thank you for your help for many years now.

Tom Kyte
July 23, 2010 - 6:56 am UTC

you cannot touch the dictionary, don't even think about touching SYS owned objects - never.

This is useful for you and your tables - never the dictionary.


but to answer the question - this table would not, in general, be a good candidate for a hash cluster. You cannot size it - you do not know the number of distinct keys in general. Right now you have 11,636 keys; what happens if someone installs a new bit of code that adds 20,000 more? Hash clusters are good for lookups and for things you can size rather well.
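For completeness, here is what a single-table hash cluster would look like applied to one of your own tables with a bounded key count. Everything below is a hypothetical sketch (table and cluster names invented), and the HASHKEYS and SIZE values are exactly the up-front guesses Tom warns you cannot make reliably for JAVASNM$:

```sql
-- hypothetical sketch: a single-table hash cluster for one of YOUR tables
-- (never a SYS-owned object).  It only pays off when you can bound the
-- number of distinct keys and the bytes per key up front.
create cluster lookup_cluster
( short_name varchar2(30) )
size 150               -- expected bytes per key (avg_row_len plus slack)
single table
hashkeys 20000;        -- upper bound on distinct keys; hard to grow later

create table lookup
( short_name varchar2(30) primary key,
  longname   raw(2000)
) cluster lookup_cluster (short_name);

-- the frequent lookup then becomes a single-block hash access, no index:
-- select longname from lookup where short_name = :b1;
```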

A reader, July 25, 2010 - 9:57 am UTC

Thank you for the answer. I will look for another candidate which is not SYS.