Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, wallacel.

Asked: November 13, 2000 - 2:29 pm UTC

Last updated: May 10, 2013 - 12:40 pm UTC

Version: 8.1.6

Viewed 10K+ times! This question is

You Asked

Tom:

the clustering_facotr column in the user_indexes view
is a measure of how organized the
data is compared to the indexed column, is there any way i can
imporve clustering factor of a index. or how to improve it?

Regards



and Tom said...

This defines how ordered the rows are in the index. If CLUSTERING_FACTOR approaches the number of blocks in the table, the rows are ordered. If it approaches the number of rows in the table, the rows are randomly ordered. In such a case (clustering factor near the number of rows), it is unlikely that index entries in the same leaf block will point to rows in the same data blocks.


Note that typically only 1 index per table will be heavily clustered (if any). It would be extremely unlikely for 2 indexes to be very clustered.

If you want an index to be very clustered -- consider using index organized tables. They force the rows into a specific physical location based on their index entry.

Otherwise, a rebuild of the table is the only way to get it clustered (but you really don't want to get into that habit for what will typically be of marginal overall improvement).

Followup to the comment below

I though that was somewhat self explainatory based on the notes associated with this column (found in the Oracle8i Reference Manual):

The clustering factor is:
<quote>
Indicates the amount of order of the rows in the table based on the values of the index.

n If the value is near the number of blocks, then the table
is very well ordered. In this case, the index entries in a
single leaf block tend to point to rows in the same data
blocks.

n If the value is near the number of rows, then the table
is very randomly ordered. In this case, it is unlikely
that index entries in the same leaf block point to rows
in the same data blocks.
</quote>


In general, if all of the index entries in a given leaf block point to the same block -- the table is well ordered with regards to this index.

If all of the index entries in a given leaf block point to different blocks in the tabe -- the table is not well ordered with respect to this index.

Thats all this is.


Rating

  (67 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

clustering factor

A reader, April 23, 2001 - 12:54 pm UTC

the article will be more useful if it talks about how the clustering factor is determined

My understanding of how clustering factor is determined

Doug Cowles, October 20, 2001 - 3:36 pm UTC

Correct me if I'm wrong -
I thought that during an analyze. A scan of the index was done in order, starting with a clustering factor of zero. Every time we end up switching blocks during the fast scan of the index, we increase the clustering factor by one. Therefore, if we had to switch blocks for every row, we ended up with a clustering factor close to the number of rows, and if we didn't have to switch we ended up with a clustering factor near the number of blocks. No? Now, what "order" we are strarting with I would assume means staring with the root block etc., I also go the impression that this "block switch" was really a datablock switch, not jumping from leaf to leaf of the index for example.

Tom Kyte
October 20, 2001 - 7:09 pm UTC

the leaf blocks are stored in a doubly link list, once we get to the "first" entry -- scanning becomes easy, we don't have to walk a tree, we go from leaf node to leaf node.

As we scan the elements in the leaf nodes (and we really only read the leaf nodes pretty much during this process as we get to the first row in 1 or 2 blocks), we can simply look at the rowid in the index entry to see if the block is the same as or different from the others rows in the index right around it.

So, the block switch is a data block switch and its based on the rowid found in the index entry on the leaf node.

number of blocks

PINGU, May 09, 2003 - 4:02 am UTC

Hi

when the reference guide says number of blocks, what blocks? High watermark? Used blocks under hig waterark? Or the total blocks?

I guesssss it´s BLOCKS in user_tables, i.e HWM... but I am not sure

Tom Kyte
May 09, 2003 - 1:25 pm UTC

the reference guide probably says that term in hundreds of places.

Out of context like that -- like saying "the ref guide says 'it is blue'" -- i cannot comment on what blocks they were refering to.

sorry to get back late!

PINGU, May 15, 2003 - 5:59 am UTC

Hi

If we dont know what blocks is the documentation referring how do we compare how good or bad is the data ordered? How do we take the clustering_factor and compare it with what values...?

Tom Kyte
May 15, 2003 - 9:39 am UTC

if the documentation was talking about cluster factors and blocks, it would be the blocks in the table that could contain data -- the blocks under the HWM.

Clustering factor

Bhavini, February 14, 2005 - 7:10 am UTC

It was really good.Infact all question that the site provides helps me in some way nor the other.
Just want to know how can i raise my question to this site

Tom Kyte
February 14, 2005 - 8:36 am UTC

when I'm taking them, there is a button on the home page that lets you submit them

BunditJ, March 06, 2005 - 11:37 pm UTC

Dear Tom,

From your book "expert one-on-one", the example of clustering_factor (colocated_pk and disorganized_pk) you illustrated, is splendid. (On page 283)
I would like to know further, should they show us more clustering_factor value than either num_rows or blocks, how will oracle manage them using RBO ?

Tom Kyte
March 07, 2005 - 7:38 am UTC

clustering factor is more "informational", it can be used to tell you

a) that your index will or will not be used for large range scan's
b) how organized your table is with respect to some index key values
c) that you might need to use a structure like an IOT or cluster if having the data organized by that key is important.


So, it is just more information -- sort of like num rows and blocks, they are just numbers.

I don't know what you mean by "show us more clustering factor" -- it is shown as easily as any other statistics.


the RBO is blind to ALL statistics - num rows, block, clustering factor -- you name it.

Clustering Factor

Bob, October 29, 2006 - 5:50 am UTC

So, following on from this discussion. We have a table with
34 million records. The clustering factor (after gathering stats and cascading to indexes) is 33 million (for an unique index which is used extensively in our ETL process). i.e very disordered. The performance is pretty abysmal!

So the options are IOT (but this table has hundreds of columns) - so out of the question. Clustered table - a possibility (but then DML - inserts to the tables will slow down). Rebuild the table and recreate B-tree index (and order by columns in index)

Anyone else have an better suggestions?



Tom Kyte
October 29, 2006 - 7:08 am UTC

tell us how you actually access this data, if by primary key value - the clustering factor is relatively MOOT. Do you range scan this primary key value and what IS this primary key value?

clustering would not result in a table organized by primary key at all.

and most importantly - MOST IMPORTANTLY - what did you do in order to ascertain it is this "index" that is the root cause of the "abysmal" performance?

Clustering Factor

Bob, October 29, 2006 - 2:09 pm UTC

Hi Tom,

Sure. First a little background to start off with.

This is a piece of PL/SQL code that is run in our ETL overnight load. My objective is to tune this code (or rewrite the code) to get optimal performance.

For brevity, I'll will show you a cutdown version (this piece of PL/SQL is huge). It uses a "bulk collection" followed by a bulk bind

declare

cursor c1 is
select col1...col11
from x
for update nowait;

TYPE integerarray is table of integer
index by binary_integer;

TYPE varchararray is table of varchar2(10)
index by binary_integer;

TYPE x_rec is RECORD
(col1 integerarray ,
col2 integerarray ,
col3 integerarray ,
col4 integerarray ,
col5 integerarray ,
col6 varchararray ,
col7 integerarray ,
col8 integerarray ,
col9 varchararray ,
col10 varchararray ,
col11 varchararray ,
l_calc1 dbms_sql.number_table,
.
.
.
l_calc50 dbms_sql.number_table);

l_record x_rec;
l_array_size pls_integer := 5000;
l_current_year number := TO_NUMBER(TO_CHAR(sysdate,'YYYY'));

l_volume number(11,2);

begin
open c1;

fetch c1 bulk collect into
l_record.col1,
l_record.col2,
l_record.col3,
l_record.col4,
l_record.col5,
l_record.col6,
l_record.col7,
l_record.col8,
l_record.col9,
l_record.col10,
l_record.col11
LIMIT l_array_size;

for l_index 1..l_record.col1.count loop


select sum(y.volume_amt) into l_record.l_calc1(l_index)
from p,
y
where y.col1 = l_record.col1(l_index)
and y.col2 = l_record.col2(l_index)
....
and y.col11 = l_record.col11(l_index)
and y.period = p.period
and p.year = l_current_year
and p.datatype = 'ANN_AMT';

... about 40-50 SQL statements like this with different
datatypes and restrictions on table p



end loop;

forall l_index 1..l_record.col1.count
update x
set calc1 = l_record.l_calc1(l_index),
calc2 = l_record.l_calc2(l_index)
.
.
calc50 = l_record.l_calc50(l_index)
where col1 = l_record.col1(l_index)
.
.
and col11 = l_record.col11(l_index);

commit;

exception
when others then dbms_output.put_line(sqlerrm);

end;

I personally think this this could be written as one piece
of SQL as an update and subqueries for the sum calculations.
This should perform better than PL/SQL - since there won't be as much context switching.

The above PL/SQL takes a few hours to run:

Table x has 30000 records.
Table p has 88 records.
Table y has 34 million records.

After running dbms_stats.gather_table_stats:

dbms_stats.gather_table_stats
(ownname => user,
tabname => 'Y',
method_opt => 'FOR ALL INDEXED COLUMNS',
granularity => 'ALL',
cascade => TRUE,
degree => DBMS_STATS.DEFAULT_DEGREE);

I looked at the clustering factor in dba_indexes for y_pk
this had the following:

blevel = 4
clustering factor = 33162719
distinct keys = 34156634

I changed the PL/SQL to an update SQL statement and ran for
10,000 records for a single calculation: This took around 9 minutes.

update x
set calc1 = (select sum(volume_amt)
from y,
p
where x.col1 = y.col1
and x.col2 = y.col2
...
and x.col11 = y.col11
and p.year = :l_current_year
and y.period = p.period
and y.datatype = :l_datatype)
where rownum < 10000

Note the index of y_pk has columns matched on the predicates of the above query i.e y_pk has col1 through to col11 as a composite unique index.

After tracing I got this:

update x
set calc1 =
(SELECT SUM(volume_amt)
FROM p,
y
WHERE x.col1 = y.col1
and x.col2 = y.col2
....
and x.col11 = y.col11
and y.period = p.period
AND p.year = :l_current_year
AND y.data_type = :l_data_type)
where rownum < 10000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.16 38 712 0 0
Execute 1 97.22 551.77 738468 1454857 30831 9999
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 97.24 551.94 738506 1455569 30831 9999
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE
9999 COUNT STOPKEY
9999 TABLE ACCESS FULL X
9999 SORT AGGREGATE
19784 NESTED LOOPS
290539 TABLE ACCESS BY INDEX ROWID Y
290539 INDEX RANGE SCAN Y_PK (object id 40318)
19784 TABLE ACCESS BY INDEX ROWID P
290539 INDEX UNIQUE SCAN P_PK (object id 40381)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 738444 0.12 470.78
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00
********************************************************************************

The unique composite index y_pk is being range scanned. I know this index is being used and has a high clustering factor. I am attempting to lower it or rebuild table Y so
that this calculation will run faster.

Am I going about this the correct way? And what does MOOT mean?

Cheers




Tom Kyte
October 29, 2006 - 3:09 pm UTC

yes, but, why do you believe the clustering factor will "fix" anything?

Most of the time seems to be spent on physical IO.

If you set statistics level to all, we'd be able to see in the row source operation WHAT is experiencing the high physical IO.

You are not using the entire primary key (else it would be unique scan, not range), so it is not that it is "out of order by primary key", but rather some subset - and you still haven't said "what" the primary key is?

</code> http://www.webster.com/cgi-bin/dictionary?sourceid=Mozilla-search&va=moot <code>

Clustering Factor

Bob, October 29, 2006 - 2:17 pm UTC

Apolgies, please note in the bulk collection code. There
should be a

begin
open c1;

loop
fetch c1 bulk collect
.
.
.
process
forall... etc update

exit when c1%notfound;
end loop;

exception
whatever handlers
end;

Clustering Factor

Bob, October 29, 2006 - 5:06 pm UTC

Hi Tom,

Thanks for the tip on alter system set statistics_level=all.
I didn't know about that option! As for MOOT, I had never heard that either. Maybe us BRITS can learn something about our language after all! ;-)

I thought that a lowered "clustering factor" would mean you would visit fewer blocks for I/O if the columns were ordered on the unique key.

I also thought that a lowered binary height, would imply less I/O - since you would have few nodes to tranverse from the root branch to leaf blocks. Please correct me if I am wrong!!!

Here are the results:

update hiir p
set ytd_anr = (SELECT SUM(h.value_or_amt)
FROM period_table pe,
mfg_dist_dlr_prod_history h
WHERE h.mfg_no = p.mfg_no
AND h.mfg_loc = p.mfg_loc_no
AND h.dist_no = p.dist_no
AND h.dist_loc = p.dist_loc_no
AND h.dlr_no = p.dlr_no
AND h.prod_code = p.prod_code
AND h.program_no = p.program_no
AND h.recv_branch_no = p.recv_branch_no
AND h.rt_mfg_flag = p.rt_mfg_flag
AND h.rt_dist_flag = p.rt_dist_flag
AND h.rt_dlr_flag = p.rt_dlr_flag
AND h.period = pe.period
AND pe.year = :l_current_year
AND h.data_type = :l_data_type)
where rownum < 10000

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 4 0 0
Execute 1 98.06 541.14 735758 1454850 30986 9999
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 98.06 541.14 735758 1454854 30986 9999


Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 165

Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE (cr=1454772 r=735739 w=0 time=541076424 us)
9999 COUNT STOPKEY (cr=19766 r=3755 w=0 time=968199 us)
9999 TABLE ACCESS FULL HIIR (cr=19766 r=3755 w=0 time=935182 us)
9999 SORT AGGREGATE (cr=1434947 r=731923 w=0 time=537588304 us)
19784 NESTED LOOPS (cr=1434947 r=731923 w=0 time=537385558 us)
290539 TABLE ACCESS BY INDEX ROWID MFG_DIST_DLR_PROD_HISTORY (cr=1144406 r=731922 w=0 time=532734557 us)
290539 INDEX RANGE SCAN MDDP_PK (cr=872544 r=622899 w=0 time=183584801 us)(object id 40318)
19784 TABLE ACCESS BY INDEX ROWID PERIOD_TABLE (cr=290541 r=1 w=0 time=3097982 us)
290539 INDEX UNIQUE SCAN PERIOD_TABLE_PK (cr=2 r=0 w=0 time=1030705 us)(object id 40381)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 735709 0.12 458.79
latch free 1 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00
********************************************************************************

So tables

hiir - 30,000 records
mfg_dist_dlr_prod_history 34 million records
period_table - 88 records

Tom I know the PK is misleading - we use unique keys. The unique key MDDP_PK is:

CREATE UNIQUE INDEX MDDP_PK ON MFG_DIST_DLR_PROD_HISTORY
(MFG_NO, MFG_LOC, DIST_NO, DIST_LOC, DLR_NO,
PERIOD, DATA_TYPE, RECV_BRANCH_NO, PROD_CODE, PROGRAM_NO,
RT_MFG_FLAG, RT_DIST_FLAG, RT_DLR_FLAG)
TABLESPACE IMAX_INDEX3
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 2048M
NEXT 10M
MINEXTENTS 1
MAXEXTENTS 2147483645
PCTINCREASE 0
FREELISTS 1
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
);





Tom Kyte
October 29, 2006 - 5:17 pm UTC

largest physical IO producer:

290539 INDEX RANGE SCAN MDDP_PK (cr=872544 r=622899 w=0 time=183584801
us)(object id 40318)

the index is already sorted by key. so, how would referencing fewer table blocks help in this case?

Clustering Factor

Bob, October 29, 2006 - 5:26 pm UTC

Please can you explain how you have arrived at that conclusion.

i) Which step has the largest I/O Producer
ii)How you can tell MDDP_PK index is already sorted by key

Thanks Tom

Tom Kyte
October 29, 2006 - 5:34 pm UTC

i) cut and paste, see the r= value? they rollup to the next level, that one step - the index - did most of it.

This step (and all lower level steps of it, of which there are none):

290539 INDEX RANGE SCAN MDDP_PK (cr=872544 r=622899 w=0 time=183584801
us)(object id 40318)

did
o 872544 logical IOs
o 622899 physicaly IOs
o 0 writes (to temp for example...)

ii) indexes are stored sorted by key.

Clustering Factor

Bob, October 29, 2006 - 5:48 pm UTC

OK - that makes much more sense.

So what about the high clustering factor? Will this make any difference to the performance if I reduce it or the binary height? Should I consider storing the table differently? At the moment 9 mins for 9999 rows is unacceptable.

Can you suggest any methods to improve the response time?

Tom Kyte
October 29, 2006 - 6:30 pm UTC

you might look at the ORDERING of the columns in your primary key - so that the rows you access via the index are closer together in the index - to hopefully reduce the physical IO - that is where the mass of the time is currently, physical IO in the INDEX.



Clustering Factor

Bob, October 30, 2006 - 5:58 am UTC

Hi Tom,

Thanks for your comments. Is there any chance you could provide me with a concise, reproducible example that can prove that this theory would work.

Thanks very much for our kind help!

Tom Kyte
October 30, 2006 - 9:13 am UTC

you cannot?  You actually understand how you use your data, you know how it is sorted in the primary key.

If your key is (a,b,c,d,e) (see - I don't even really know what your KEY IS) and you

where a = :a and b = :b and e = :e

we can range scan the index for all A and B values - looking for E's, if you had a key on (a,b,e,c,d) - we would go right to a,b,e and stop immediately - no big scan.

ops$tkyte%ORA9IR2> create table t ( a int, b int, c char(20), d char(20), e int );

Table created.

ops$tkyte%ORA9IR2> insert into t
  2  select 1, 1, 1, 1, rownum
  3    from all_objects;

30300 rows created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create unique index t_idx on t(a,b,c,d,e);

Index created.

ops$tkyte%ORA9IR2> set autotrace on
ops$tkyte%ORA9IR2> set termout off
ops$tkyte%ORA9IR2> select * from t where a = 1 and b = 1 and e = 40000;

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (UNIQUE)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        269  consistent gets
          0  physical reads
          0  redo size
        440  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte%ORA9IR2> set autotrace off
ops$tkyte%ORA9IR2> drop index t_idx;

Index dropped.

ops$tkyte%ORA9IR2> create unique index t_idx on t(a,b,e,c,d);

Index created.

ops$tkyte%ORA9IR2> set autotrace on
ops$tkyte%ORA9IR2> set termout off
ops$tkyte%ORA9IR2> select * from t where a = 1 and b = 1 and e = 40000;

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (UNIQUE)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        440  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte%ORA9IR2> set autotrace off
ops$tkyte%ORA9IR2>
 

Clustering Factor

Bob, October 30, 2006 - 9:25 am UTC

Thanks Tom. That's crystal clear now! I really appreciate your help.

Clustering Factor

Bob, October 30, 2006 - 3:47 pm UTC

For those of you interested, there's a great explanation of the clustering factor at:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1296165726968 <code>

I've learnt so much from this site!

Tom Kyte
October 30, 2006 - 3:49 pm UTC

Not as much as I have learned here :)

question on selectivity and clustering factor,

A reader, December 06, 2006 - 4:42 am UTC

Say I have a emp table and ename is indexed (non-unique).

The table has 100 rows and out of that 80 are distinct.

The selectivity of a value in the name column = 1/80.
What if the ename='SMITH' appears 20 times and all other enames are unique? Will the selectivity remains 1/80 for the ename='SMITH'?

Next about clustering factor: I understand the value of clustering factor on the Index on ENAME is dependent on how many blocks the data is stored in the table blocks.
If the value CLARK is stored in multiple blocks (values scattered across number of blocks assuming CLARK has more than one row in the table), then oracle may favor full table scan. If ename=JAMES appears same number of times as CLARK in the table and if all the rows for JAMES are in same block (single block in the table), how does the clustering factor decides for the optimizer to choose an index or a FTS for these two values?

How does optimizer considers SELECTIVITY and CLUSTERFING FACTOR together to play a role in coming up with a suitable plan?

Thanks


Tom Kyte
December 07, 2006 - 8:19 am UTC

depends on whether you have histograms or not - since you have less then 255 distinct values here - the optimizer would KNOW that smith is 20% of the records.


the clustering factor is not really dependent on the how many blocks of data there are - rather it is dependent on HOW SORTED the rows in the table are with respect to the indexed columns.


a clustering factor is a measure of how many LIO's would be done if all of the rows are read from the table via the index.

ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> select rownum id1, object_id id2, rpad('*',100,'*') data
ops$tkyte%ORA10GR2>   from (select object_id from all_objects order by dbms_random.random)
ops$tkyte%ORA10GR2> /
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index t_idx1 on t(id1);
ops$tkyte%ORA10GR2> create index t_idx2 on t(id2);
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select index_name, clustering_factor
  2    from user_indexes
  3   where index_name like 'T_IDX_'
  4   order by 1;

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
T_IDX1                                       806
T_IDX2                                     50032

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set arraysize 5000
ops$tkyte%ORA10GR2> set autotrace traceonly statistics
ops$tkyte%ORA10GR2> select /*+ index( t t_idx1 ) */ * from t where id1 is not null;

50094 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        940  consistent gets
          0  physical reads
          0  redo size
     733241  bytes sent via SQL*Net to client
        495  bytes received via SQL*Net from client
         12  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      50094  rows processed

ops$tkyte%ORA10GR2> select /*+ index( t t_idx2 ) */ * from t where id2 is not null;

50094 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      50155  consistent gets
          0  physical reads
          0  redo size
     733241  bytes sent via SQL*Net to client
        495  bytes received via SQL*Net from client
         12  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      50094  rows processed

ops$tkyte%ORA10GR2> set autotrace off



and a clustering factor is for the INDEX, it would not have any information about the fact that "clark is not clustered together but james is" - the optimizer would simply see the clustering factor of the index and that is all. 

a reader

raman, August 15, 2007 - 8:00 am UTC

SQL> DESC ins_parttaker
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------------------
 PARTTAKER_CODE                            NOT NULL VARCHAR2(8)
 CATEGORY_PARTTAKER_CODE                            VARCHAR2(3)
 CATEGORY_CODE                                      VARCHAR2(3)
 PARENT_CODE                                        VARCHAR2(8)
 ABBREVIATION                                       VARCHAR2(10)
 ADDRESS                                            VARCHAR2(300)
 CONTACT_PERSON                                     VARCHAR2(30)
 PHONE_NO                                           VARCHAR2(100)
 START_DATE                                         DATE
 FAX_NO                                             VARCHAR2(60)
 REG_NO                                             VARCHAR2(30)
 NTN_NO                                             VARCHAR2(30)
 EMAIL_ADDRESS                                      VARCHAR2(70)
 WEB_SITE                                           VARCHAR2(60)
 ZIP_CODE                                           NUMBER(10)
 P_O_BOX                                            NUMBER(10)
 REMARKS                                            VARCHAR2(300)
 ACTIVE                                             CHAR(1)
 PARTTAKER_NAME                                     VARCHAR2(100)
 NIC                                                VARCHAR2(20)
 SECTOR_CODE                                        VARCHAR2(2)
 TAX_DED                                            CHAR(1)
 ENT_BY                                             VARCHAR2(20)
 ENT_DATE                                           DATE
 BUSINESS_CODE                                      VARCHAR2(2)
 DISTRICT_CODE                                      VARCHAR2(4)
 SUP_BY                                             VARCHAR2(20)
 SUP_DATE                                           DATE
 CANCEL_BY                                          VARCHAR2(20)
 CANCEL_DATE                                        DATE
 CANCEL_REMARKS                                     VARCHAR2(100)
 CATEGORY_TYPE                                      VARCHAR2(2)
 EDUCATION                                          VARCHAR2(100)
 TAX_EXEMPT_DOCNO                                   VARCHAR2(100)
 DATE_OF_BIRTH                                      DATE
 VOLUME_DISCOUNT                                    NUMBER(5,2)

SQL> select a.index_name,b.num_rows,b.blocks,a.clustering_factor
  2    from user_indexes a,user_tables b
  3   where a.table_name=b.table_name
  4     and lower(a.table_name)='ins_parttaker'
  5  /

INDEX_NAME                       NUM_ROWS     BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ---------- -----------------
XPKINS_PARTTAKER                     1340         28               691


XPKINS_PARTTAKER is primary key at parttaker code ,what can i evaluate that my table is organized or not because clustering facotr value is not close to number of rows and not close to number of blocks?

why clustering factor value is close to number of rows as parttaker code is primary key it should be close to number of blocks?

if my table is not organized how can i organize it?

raman
Tom Kyte
August 20, 2007 - 1:35 pm UTC

it is a tiny, uninteresting table...


you can say your table is only sort of sorted by the key - if you were to retrieve all of the data via the index, sometimes two rows would be on the same block sequentially and sometimes not.


my question for you - a very important one - why do you want to sort this table by primary key, what benefit would that have for you?

a reader

raman, August 28, 2007 - 5:18 am UTC

tom this primary key will be referenced from others transaction table for that reason i created primary key..

tom as you said

"sometimes two rows would be on the same block sequentially and sometimes not."

i doubt the reason is that others transaction simultaneoulsy going on when inserting going on at table ins_parttaker
thats why table ins_parttaker is randomly ordered otherwise it is not.

please correct me from my above statment.

raman
Tom Kyte
September 04, 2007 - 12:41 pm UTC

you did not address MY question, why do you believe having this table sorted by primary key would do anything for you?

so what if it isn't, what downside do you see resulting from that?

a reader

raman, September 05, 2007 - 3:32 am UTC

tom this table is getting grow and this table is being used
in most of the areas as a joining table in reports.If it is
disorganized then it may degrade performance by increasing
LIOs.

clustering factor not clearing me that either its organized
or disorganized,if its organized then how will i judge from
my above clustering factor figure in my first post in this
thread?if its disorganized then why it is?

how will i compare clustering_factor to num of rows or
number of blocks in order to know is it organized or not??

if it is disorganized then should i export only this table
data and truncate this table and then only import data in
this table to turn it into organized?

thanks tom its my nightmare i am learning a lot from you and reading yours thread.



raman
Tom Kyte
September 05, 2007 - 4:07 pm UTC

so, how do you join it


and, be technical here, why do you think it would affect this - do the math for us, walk us THROUGH your supposed join operation.


if you have access to my books Expert One on One Oracle or Expert Oracle Database Architecture - I cover this with examples and lots of text...


if the clustering factor near number of rows, table data is not stored sorted by the index key you are looking at.

if near number of blocks, it is.

consider:

ops$tkyte%ORA10GR2> create table t
  2  as
  3  select rownum id1,
  4         dbms_random.random id2,
  5             rpad( '*', 200, '*' ) data
  6    from all_objects
  7  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index organized on t(id1);

Index created.

ops$tkyte%ORA10GR2> create index disorganized on t(id2);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select index_name, clustering_factor from user_indexes where table_name = 'T';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
ORGANIZED                                   1506
DISORGANIZED                               49661

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select num_rows, blocks from user_tables where table_name = 'T';

  NUM_ROWS     BLOCKS
---------- ----------
     49686       1540



as that table is loaded, rownum TENDS to be sequential, the rows are stored id1=1,2,3,4,.... in that order.

id2 however, being random, are stored however, not in any particular order...


so, the clustering factor on the index on id1 - it is near the number of blocks, the table is sorted by that key.

the clustering factor on id2 - it is not.

if we "fix" the disorganized index:

ops$tkyte%ORA10GR2> create table t2 as select * from t order by id2;

Table created.

ops$tkyte%ORA10GR2> create index id1_idx on t2(id1);

Index created.

ops$tkyte%ORA10GR2> create index id2_idx on t2(id2);

Index created.

ops$tkyte%ORA10GR2> select index_name, clustering_factor from user_indexes where table_name = 'T2';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
ID1_IDX                                    49650
ID2_IDX                                     1506


we "break" the other one (cause a table can be sorted only one way really....)

clustering factor and selectivity

AMIR RIAZ, September 05, 2007 - 12:54 pm UTC

Hi Tom

1, if i have a index on columns a,b,c (table sort by a,b,c) then cluster factor will be high. Now if i create another table with same data sorted by a,b,c and create a index on c,b,a. Now the clustering factor is low and if i queried as

select * from t2 where c=:x and b=:y and a=:z

will it perform a full scan or use index range scan or optimizer will choose any one of them based on cost.

2, if c most discriminating among a,b,c than b than a and the clustering factor on c,b,a is very low and clustering factor on a,b,c is very good. does performce of both the indexes will be same as describe in your book on page 487. I am having hard time to believe that cr, cpu, and pr will be same in this case

regards
Amir Riaz
Tom Kyte
September 05, 2007 - 5:39 pm UTC

1) no, cluster factor will be low (near number of blocks, not number of rows)

but your example is bad because "where c=:x and b=:y and a=:z" would presumably return a small set of rows.

if it returns a LARGE set, then a full scan might be chosen as a more efficient path yes, but so what?

2) you need my book - selectivity is not something you consider when determining the order of attributes in an index.


if you use your where clause, I don't care if you index c,b,a - a,b,c - c,a,b - whatever - they would do the same amount of work (think about this... test it out...)

here is my test case

AMIR RIAZ, September 06, 2007 - 10:41 am UTC

hi tom thanks for quick response here is my test case.

create table test as select * from all_objects ORDER BY owner,object_type,object_name ;

create index test_ord1 on test(object_type,owner,object_name)  ;
CREATE index test_ord2 on test(object_name,owner,object_type)  ;
CREATE index test_ord3 on test(owner,object_name,object_type)  ;
CREATE index test_ord4 on test(owner,object_type,object_name)  ;   -- create index according to sort
CREATE index test_ord5 on test(object_name,object_type,owner)  ;   -- just reverses all the columns of index TEST_ORD4


EXEC dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST',cascade => TRUE);

select index_name,clustering_factor from user_indexes where index_name like 'TEST%';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
TEST_ORD1                                    923
TEST_ORD2                                  41184
TEST_ORD3                                   7068
TEST_ORD4                                    683
TEST_ORD5                                  42508   

by reversing the order of the columns of index TEST_ORD4 we made Test_ord5 with worst clustering factor proving that if we have a table sorted by a,b,c then index a,b,c will be the best while c,b,a can be the worst so column order does matter because by putting columns in any order we can make that index bad because of bad clustering.

but here is thing i did not manage to understand in my test case. I use below query so that the index TEST_ORD4 is
used but oracle surprised me to select index TEST_ORD1 why? 

SQL> set autotrace traceonly  explain statistics
SQL> SELECT * from test
  2  WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
  3  /

64 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=2 Bytes=186)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=2 Bytes=186)

   2    1     INDEX (RANGE SCAN) OF 'TEST_ORD1' (INDEX) (Cost=3 Card=2)
Statistics
----------------------------------------------------------
        731  recursive calls
          0  db block gets
        208  consistent gets
         43  physical reads
          0  redo size
       7810  bytes sent via SQL*Net to client
       1028  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         64  rows processed

 
--i flushed both the pools for comparison
alter system flush shared_pool;
alter system flush buffer_cache;

-- i have to use hint for the index

SQL> SELECT /*+ index(test test_ord4) */  * from test
  2  WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
  3  /

64 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=2 Bytes=186)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=2 Bytes=186)

   2    1     INDEX (RANGE SCAN) OF 'TEST_ORD4' (INDEX) (Cost=3 Card=2)

Statistics
----------------------------------------------------------
        731  recursive calls
          0  db block gets
        191  consistent gets
         28  physical reads
          0  redo size
       7810  bytes sent via SQL*Net to client
       1028  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         64  rows processed

Tom with good clustering and the leading column owner is present in query TEST_ORD4 should be the best choice also the other two columns are after owner columns so still TEST_ORD4 should be the best choice for range scan why i have TEST_ord1.

regards
Amir Riaz

Tom Kyte
September 11, 2007 - 8:15 am UTC

do you see the high number of recursive sql's? you need to run your queries a couple of times.

you likely have no histograms either, so the database does not understand how many rows will actually be returned.

do you see the very low estimated card=values? the database things "small number of rows", it matters not therefore which index would be used.

Clustering factor

Vikas Atrey, September 10, 2007 - 3:51 am UTC

I doubt Clustering factor will matter here as combination of (owner , object_type, object_name) is almost unique here.
Oracle will come to know this by user_indexes.distinct_keys.

AMIR RIAZ -Clustering Factor

sriram vrinda, September 11, 2007 - 6:08 am UTC

Hi,

I think the optimizer is not given the actual details, look at the cardinality in your plan it is 2 , but you retrieve 64 rows , so missing histograms could be a reason.

Please can you collect histograms using

EXEC dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST',cascade => TRUE,method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO');


Now check whether the Correct Index is used or not?

Do correct us TOM , if we are leading no where.

Thanks

Use histograms why?

AMIR RIAZ, September 11, 2007 - 1:54 pm UTC

SELECT index_name,distinct_keys ,num_rows,distinct_keys/num_rows,
leaf_blocks,clustering_factor,blevel,avg_leaf_blocks_per_key
FROM dba_indexes WHERE owner = USER AND table_name = 'TEST'
ORDER BY index_name;


INDEX_NAME DISTINCT_KEYS NUM_ROWS DISTINCT_KEYS/NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------- ------------- ---------- ---------------------- ----------- -----------------
TEST_ORD1 49755 49802 0.999056262800691 343 923
TEST_ORD2 49755 49802 0.999056262800691 343 41184
TEST_ORD3 49755 49802 0.999056262800691 343 7068
TEST_ORD4 49755 49802 0.999056262800691 343 683
TEST_ORD5 49755 49802 0.999056262800691 343 42508


SELECT * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
/

64 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=119 Bytes=11067)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=119 Bytes=11067)

2 1 INDEX (RANGE SCAN) OF 'TEST_ORD4' (INDEX) (Cost=3 Card=119)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
26 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

SELECT /*+ index(test test_ord5) */ * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
/
64 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=11 Card=119 Bytes=11067)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=11 Card=119 Bytes=11067)

2 1 INDEX (RANGE SCAN) OF 'TEST_ORD5' (INDEX) (Cost=4 Card=275)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
54 consistent gets
0 physical reads
0 redo size
8285 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

1 SELECT /*+ index(test test_ord1) */ * from test
2* WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
SQL> /
64 rows selected.

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=119 Bytes=11067)

1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=119 Bytes=11067)

2 1 INDEX (RANGE SCAN) OF 'TEST_ORD1' (INDEX) (Cost=3 Card=119)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
43 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

Yes i got the required result. The selectivity of all the indexes are same understood(i hope i am using the right formula). But why oracle not manage to select the right index without histogram. I know what an histogram is but that means that oracle can select a wrong index when using bind variables. Why it used 'TEST_ORD1' index without histograms?

regards
Amir Riaz

why use histograms

AMIR RIAZ, September 19, 2007 - 11:45 am UTC

Hi Tom.

In above example you suggested to use histograms But Why it used 'TEST_ORD1' index without histograms. keep in view that

1. Selectivity of all the indexes are same (give out same amount of rows)

2. Clustering factor of 'TEST_ORD4' makes it more efficient.

so

Even without histograms 'TEST_ORD4' index should be used. but 'TEST_ORD1' is being used. please explain it

regards
Amir Riaz

Tom Kyte
September 19, 2007 - 1:17 pm UTC

give me a small concise, SMALL concise, yet 100% complete example to work with - I do not page up down and all around trying to reconstruct things...

with such a small bit of data being accessed - the clustering factor is NOT going to have an impact, the clustering factor would impact a large range scan - you have tiny results - it is, as the old saying goes, "six one way, half dozen the other"


it does appear to me that the index you desired to be used was being used. You said test_ord4 should be and right above here - IT IS.


Wrong Index ?

Jonathan Lewis, September 20, 2007 - 12:07 am UTC

It looks like

a) The optimizer_index_cost_adj has been set to a fairly small value.
b) The cost of the indexes test_ord1 and test_ord4 work out to be exactly the same (with rounding).

So, since the cost is the same and the number of distinct keys in the two indexes is the same, the optimizer has picked the index with the name that sorts lower alphabetically.

If you want to check this hypothesis, swap the names of the two indexes - the index called test_ord1 will be used, even though it's now the "other" index.

See also: http://www.jlcomp.demon.co.uk/18_oica_i.html

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com

Here is the test case i will try to be precise

AMIR RIAZ, September 20, 2007 - 12:49 pm UTC

SQL> show parameter index
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_index_caching integer 90
optimizer_index_cost_adj integer 10
skip_unusable_indexes boolean TRUE

create table test as select * from all_objects ORDER BY owner,object_type,object_name ;
create index test_ord1 on test(object_type,owner,object_name) ;
CREATE index test_ord2 on test(object_name,owner,object_type) ;
CREATE index test_ord3 on test(owner,object_name,object_type) ;
CREATE index test_ord4 on test(owner,object_type,object_name) ;
CREATE index test_ord5 on test(object_name,object_type,owner) ;

EXEC dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST',cascade => TRUE);

SQL> SELECT index_name,leaf_blocks,blevel,clustering_factor
2 FROM dba_indexes WHERE owner = USER AND table_name = 'TEST'
3 ORDER BY index_name
4 /
INDEX_NAME LEAF_BLOCKS BLEVEL CLUSTERING_FACTOR
-------------- ----------- ---------- -----------------
TEST_ORD1 344 2 924
TEST_ORD2 343 2 41243
TEST_ORD3 344 2 7146
TEST_ORD4 344 2 683
TEST_ORD5 343 2 42561

now let test the Jonathan Lewis hypothesis.

SELECT * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%';

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 1400830044
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 279 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 3 | 279 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_ORD1 | 3 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
2 - access("OBJECT_TYPE" LIKE 'T%' AND "OWNER"='SYS' AND "OBJECT_NAME" LIKE 'D%')
filter("OWNER"='SYS' AND "OBJECT_NAME" LIKE 'D%' AND "OBJECT_TYPE" LIKE'T%')

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=3 Bytes=279)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=3 Bytes=279)
2 1 INDEX (RANGE SCAN) OF 'TEST_ORD1' (INDEX) (Cost=3 Card=3)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
42 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

SELECT /*+ index(test test_ord4) */ * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%';


PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 2023861017
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 279 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 3 | 279 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_ORD4 | 3 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
2 - access("OWNER"='SYS' AND "OBJECT_TYPE" LIKE 'T%' AND "OBJECT_NAME" LIKE 'D%')
filter("OBJECT_NAME" LIKE 'D%' AND "OBJECT_TYPE" LIKE 'T%')

Execution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=3 Bytes=279)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=3 Bytes=279)
2 1 INDEX (RANGE SCAN) OF 'TEST_ORD4' (INDEX) (Cost=3 Card=3)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

here the cost of both the plan are same but consistent gets are different and surprisingly oracle did not take
into account that index with small consistent get is the optimal index. why?
now lets change the name of the indexes rename index test_ord4 to test_ord1 and test_ord1 to test_ord4


DROP INDEX test_ord4; --previously(owner,object_type,object_name)
CREATE INDEX test_ord4 on test (object_type,owner,object_name) compute statistics ;

DROP INDEX test_ord1; --previously (object_type,owner,object_name)
CREATE INDEX test_ord1 on test(owner,object_type,object_name) compute statistics ;

SELECT * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=2 Bytes=186)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=2 Bytes=186)
2 1 INDEX (RANGE SCAN) OF 'TEST_ORD1' (INDEX) (Cost=3 Card=2)
STATISTICS
----------------------------------------------------------
0 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

it seem Jonathan Lewis is right So hypothesis, when the cost is the same and the number of distinct keys in the two indexes is the same, then optimizer will pick the index with the name that sorts lower alphabetically, is right.

BUT when i tried the same hypothesis for histograms here are the results.


drop table test;
create table test as select * from all_objects ORDER BY owner,object_type,object_name ;
CREATE INDEX test_ord1 on test(object_type,owner,object_name) ;
CREATE INDEX test_ord2 on test(object_name,owner,object_type) ;
CREATE INDEX test_ord3 on test(owner,object_name,object_type) ;
CREATE INDEX test_ord4 on test(owner,object_type,object_name) ;
CREATE INDEX test_ord5 on test(object_name,object_type,owner) ;
EXEC dbms_stats.gather_table_stats(ownname => USER,tabname => 'TEST',cascade =>TRUE,method_opt=>'FOR ALL INDEXED COLUMNS SIZE AUTO');

SELECT * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
/

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2023861017
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TIME |
--------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 126 | 11718 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 126 | 11718 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_ORD4 | 126 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OWNER"='SYS' AND "OBJECT_TYPE" LIKE 'T%' AND "OBJECT_NAME" LIKE 'D%')
filter("OBJECT_NAME" LIKE 'D%' AND "OBJECT_TYPE" LIKE 'T%')

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=126 Bytes=11718)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=126 Bytes=11718)
2 1 INDEX (RANGE SCAN) OF 'TEST_ORD4' (INDEX) (Cost=3 Card=126)
STATISTICS
----------------------------------------------------------
0 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

SELECT /*+ index(test test_ord1) */ * from test
WHERE owner='SYS' and object_type LIKE 'T%' AND object_name LIKE 'D%'
/

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1400830044
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TIME |
--------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 126 | 11718 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 126 | 11718 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_ORD1 | 126 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_TYPE" LIKE 'T%' AND "OWNER"='SYS' AND "OBJECT_NAME" LIKE 'D%')
filter("OWNER"='SYS' AND "OBJECT_NAME" LIKE 'D%' AND "OBJECT_TYPE" LIKE 'T%')

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=126 Bytes=11718)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE) (Cost=1 Card=126 Bytes=11718)
2 1 INDEX (RANGE SCAN) OF 'TEST_ORD1' (INDEX) (Cost=3 Card=126)
STATISTICS
----------------------------------------------------------
0 recursive calls
0 db block gets
42 consistent gets
0 physical reads
0 redo size
7810 bytes sent via SQL*Net to client
1028 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
64 rows processed

the cost of index test_ord1 and test_ord4 are same again using histograms but oracle manage to select the exact index the hypothesis which we constructed above that since the cost is the same and the number of distinct keys in the two indexes is the same, the optimizer has picked the index with the name that sorts lower alphabetically fails(index test_ord4 is selected while according to hypothesis index test_ord1 should be selected). here the cost is same, cards are same but oracle selected the right index. But it seems that in case of histograms oracle take into consideration the consistent gets too which it ignored when the cost is equal in case of without histograms why.

1. if i use index test_ord5(worst clustering) i get card=293. but how can one index(using test_ord4 as a base) return row =126
while the other retruns row =293 to return the same number of rows. I think cards is number of row visited in index not
the number of rows return because the number of row return by index should always be same for the same query.
2 why in case of without histograms consistent gets are neglected and wrong index is being selected based on its alphabetic
order as described by Jonathan Lewis in above post.
3. why in case with histograms when the cost and cards are same, oracle take into account the consistent get too. which may
have refuted the Jonathan Lewis's hypothesis


regards
Amir Riaz

clustering_factor of an index

A reader, May 31, 2010 - 8:38 am UTC

my table's dba_index value is ---> All the indexes are global index and the table has partiotn and has 16 partitions.

INDEX_NAME |BLEVEL |LEAF_BLOCKS |DISTINCT_KEYS AVG_LEAF_BLOCKS_PER_KEY |AVG_DATA_BLOCKS_PER_KEY |CLUSTERING_FACTOR |NUM_ROWS

I1 |3 |3884800 |7875 |1206 |54534 |9412882 |391915541
I2 |3 |2252800 |173 |31285 |239305 |811119 |392609183
I3 |3 |3008000 |3012 |2128 |9091 |446201 |391393133
I4 |3 |5804800 |150090 |330 |11533 |19492602 |390806293
I5 |3 |3334400 |16344 |670 |37913 |12107411 |393296779

and the tables's dba_table's value is ----->

NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE
520303375 9344000 236569 859

There is job which deletes data daily form this table.

I'm giving you the sample delete statement --->

DELETE FROM tab_name
WHERE post_date < (TO_DATE(20100601, 'YYYYMMDD') -210)
AND tran_date < (TO_DATE(20100601, 'YYYYMMDD') -210);

Now this table is hash partitoned by some another column which you can see we don't use in delete statement.

I'm planning to create a range partition and key would be post_data which is present in delett statement.
And as you can see every day data are deleted so corresponding index entry will also get deleted so table's segment as well as index segment will day by day is getting un-organized and this deletion is running form 2-3 years.

So can you please suggest me
what i need to do as to re-organize the table's data and index segment?
Do i need to replace the deletion in pl/sql doing some bulk collect operation??

Tom Kyte
May 31, 2010 - 8:52 am UTC

why do you think (stress *thing*) that the table and index are getting un-organized.

As I see it, an index on post date will have all of the data below some point in time removed, so any index on post_date will have blocks on the left hand size of the index becoming completely 100% empty - and we'll remove those blocks from the index and put them on the right hand side as the new data is inserted. Same with tran_date.

So, you'd have a hard time telling me "it is unorganized", this is a good situation you have so far.


And as I see it, you remove data from the table - which will cause blocks to have free space, which permits them to be used by subsequent inserts. Space will be efficiently reused.


Now, it is true that over time, the data in the table will not be clustered nicely by either of post_date nor tran_date. And that can only be fixed by using a range partition on the table by post_date and tran_date and using truncate on the partitions instead of delete.

But you have no need of a reorg here - you'd have to tell me why you believe you do. Saying "we've been doing this for 2-3 years" is not a sufficient answer to that question.

Some doubts on index

A reader, June 03, 2010 - 11:39 pm UTC

Hi Tom,

Many thanks four your clarification.I have some dounbts more.As you said below

"As I see it, an index on post date will have all of the data below some point in time removed, so any index on post_date will have blocks on the left hand size of the index becoming completely 100% empty - and we'll remove those blocks from the index and put them on the right hand side as the new data is inserted. Same with tran_date"

My doubt is
1. when some data of post_date gets removed why blocks on the left hand size of the index becoming completely 100% empty not right side?
2. As you said how we'll remove those blocks from the index and put them on the right hand side as the new data is inserted?

As you said "Now, it is true that over time, the data in the table will not be clustered nicely by either of post_date nor tran_date. And that can only be fixed by using a range partition on the table by post_date and tran_date and using truncate on the partitions instead of delete"

My doubt is

1. If i drop individual data range partiotns will it be good rather than truncate?




Tom Kyte
June 08, 2010 - 12:32 pm UTC

1) because the lower the value, the more far 'left' the value in the index.

picture an index in your mind.

At the top - a root block which points to branch (navigation blocks). 'small' things sort left, 'big' things sort right.

To find a 'small' value you go left

All 'small' values are on the left.

If you delete a lot of them (all of them below a certain point) you are deleting contiguous values on the LEFT.


2) how? We just do, it is the way we are programmed to work.





as for the second question set, the answer is the same as for all questions like this:

a) it might be better
b) it might not make a difference
c) it might be horrible

Benchmark and test it, dropping may not be an option in a high dml environment but truncate might work. the empty partitions can be cleaned up later - whenever.

Clutering factor and table data

Bhaskar, May 27, 2011 - 5:18 am UTC

Hi Tom,

Firstly many thanks for your explanation in this tpoics.

I have some question on the basis of the below data.

TABLE_NAME INDEX_NAME CLUSTERING_FACTOR NUM_ROWS_as_dba_tables NUM_ROWS_as_dba_index BLOCKS DISTINCT_KEYS
table_name index_I1 18308825 718381647 733147222 10240000 11254
table_name index_I2 1585053 718381647 725183712 10240000 186
table_name index_I3 833817 718381647 719294474 10240000 3206
table_name index_I4 41095172 718381647 737193813 10240000 165937
table_name index_I5 23424055 718381647 710835058 10240000 21553


Now this is a history table and every day massive data gets loaded/updated in this table.I could find clustering factor is high as well as low for some indexes and all the indexes are nonunique.

It would be great if you can suggest me what i need to do to improve the selectivity of the table.
Tom Kyte
May 27, 2011 - 10:50 am UTC

Well, this question is a bit like:

I have some blue sky and green grass. I live by a big lake. How do I make this tiger into a duck?



Selectivity is not an attribute of a table, selectivity is generally accepted to be a measure of how "selective" a set of attributes for a given table would be. For example, if a table has 1,000,000 rows and column C1 has 5 distinct values - it would be said that "C1 is not very selective". Meaning, if you use "where c1 = ?" - your'll retrieve about 20% of the table - not very selective at all. If on the other hand, column C2 has 500,000 distinct values - then we would say that C2 is very selective since it would return on average about 2 out of 1,000,000 rows.

The clustering factor (a measure of how "sorted" a table is with respect to a set of index keys) is the clustering factor regardless of 'selectivity'. I can show you selective columns with any range of clustering factors - from "good" to "bad". I can do the same with non-selective columns.


In order to change the selectivity of a column - we'd need to change a tiger into a duck. That is, we'd need to change the VALUES of the column. The clustering factor and that this is a history table and that every day you load data into this it as relevant as blue skies and green grass and living by a lake is to changing a tiger into duck :)


So, what are you *really* looking for.


A reader, August 22, 2011 - 8:44 am UTC

 select i.index_name, I.NUM_ROWS indexrows,t.blocks,t.num_rows tablerows,CLUSTERING_FACTOR cf,
 clustering_factor /t.blocks fav_index, clustering_factor/t.num_rows fav_table 
 from dba_INDEXES I,
DBA_TABLES T
 where t.table_name ='TEST'
 and i.table_name=t.table_name
 order by fav_index desc

Per my understanding of CF from posts above,the following query provides me the most favourable indexes towards the end of the query output.The first row showed me 6 against fav_index and 0.71 against fav_table.Does it imply that FT would probably be preferred by optimizer. Is my query correct (as CF should approach tables' num_blocks instead of its rows) ?

Tom Kyte
August 30, 2011 - 2:18 pm UTC

I don't get what you are trying to measure here.


It isn't that an index is favorable or not - it is whether the index is considered useful for large range scans or not.

If you are getting a single row out of a non-trivial table, all indexes are favorable.

When you start getting hundreds or thousands - or more - that some indexes will not be as EFFICIENT as other indexes - some indexes will cause more table IO's to occur than others - for retrieving the same number of rows.

The clustering factor is used to estimate the number of table IO's that will be performed. The optimizer uses that to determine whether a full scan is preferable to an index range scan+table access by index rowid to retrieve that large number of rows.

Favourable for single row or small percent

A reader, August 31, 2011 - 1:21 am UTC

Thanks for your reply.So indexes can still be favourable if more than one row ? (if it is small percentage of table rows .. <5-10%).


Tom Kyte
August 31, 2011 - 9:15 am UTC

it is not a percentage of the table so much as "how much IO would we have to peform".


for example:

ops$tkyte%ORA11GR2> create table organized
  2  as
  3  select x.*
  4    from (select * from stage order by object_name) x
  5  /

Table created.

ops$tkyte%ORA11GR2> create table disorganized
  2  as
  3  select x.*
  4    from (select * from stage order by dbms_random.random) x
  5  /

Table created.


arguably the same data (stage is a copy of all objects), just in different row order on disk..

ops$tkyte%ORA11GR2> create index organized_idx on organized(object_name);

Index created.

ops$tkyte%ORA11GR2> create index disorganized_idx on disorganized(object_name);

Index created.


ops$tkyte%ORA11GR2> begin
  2  dbms_stats.gather_table_stats
  3  ( user, 'ORGANIZED',
  4    estimate_percent => 100,
  5    method_opt=>'for all indexed columns size 254'
  6  );
  7  dbms_stats.gather_table_stats
  8  ( user, 'DISORGANIZED',
  9    estimate_percent => 100,
 10    method_opt=>'for all indexed columns size 254'
 11  );
 12  end;
 13  /

PL/SQL procedure successfully completed.




same indexing strategy, same statistics gathered. but when we look at the details:

ops$tkyte%ORA11GR2> select table_name, blocks, num_rows, 0.05*num_rows, 0.10*num_rows from user_tables
  2  where table_name like '%ORGANIZED' order by 1;

TABLE_NAME                         BLOCKS   NUM_ROWS 0.05*NUM_ROWS 0.10*NUM_ROWS
------------------------------ ---------- ---------- ------------- -------------
DISORGANIZED                         1053      72109       3605.45        7210.9
ORGANIZED                            1053      72109       3605.45        7210.9

ops$tkyte%ORA11GR2> select table_name, index_name, clustering_factor from user_indexes
  2  where table_name like '%ORGANIZED' order by 1;

TABLE_NAME                     INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
DISORGANIZED                   DISORGANIZED_IDX                           72032
ORGANIZED                      ORGANIZED_IDX                               1027




we see they are very different. Note that 5/10% of the table is 3605/7210 rows. Remember that number.


ops$tkyte%ORA11GR2> select * from organized where object_name like 'F%';

Execution Plan
----------------------------------------------------------
Plan hash value: 1925627673

-------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CP
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |   147 | 14259 |     6   (
|   1 |  TABLE ACCESS BY INDEX ROWID| ORGANIZED     |   147 | 14259 |     6   (
|*  2 |   INDEX RANGE SCAN          | ORGANIZED_IDX |   147 |       |     3   (
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME" LIKE 'F%')
       filter("OBJECT_NAME" LIKE 'F%')

ops$tkyte%ORA11GR2> select * from disorganized where object_name like 'F%';

Execution Plan
----------------------------------------------------------
Plan hash value: 3767053355

-------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |   147 | 14259 |   151
|   1 |  TABLE ACCESS BY INDEX ROWID| DISORGANIZED     |   147 | 14259 |   151
|*  2 |   INDEX RANGE SCAN          | DISORGANIZED_IDX |   147 |       |     3
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME" LIKE 'F%')
       filter("OBJECT_NAME" LIKE 'F%')




Now, if we attempt to retreive 147 rows from the table - both will use an index but notice the COST of each. The cost is higher for the second table because of the number of table IO's it thinks will will have to do. In the first case - it thinks 3 IO's for the index and 3 more IO's for the table - since all of the F's are located *right next to each other*.

When we spread them out randomly as we did in the second table - it thinks I'll have to do 3 IO's against the index and 147 IO's against the table since no block in the table contains more than one row with the letter F on it - they are spread out all over the place

Furthermore:

ops$tkyte%ORA11GR2> select * from organized where object_name like 'A%';

Execution Plan
----------------------------------------------------------
Plan hash value: 1925627673

-------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CP
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |  1806 |   171K|    37   (
|   1 |  TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  1806 |   171K|    37   (
|*  2 |   INDEX RANGE SCAN          | ORGANIZED_IDX |  1806 |       |    11   (
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_NAME" LIKE 'A%')
       filter("OBJECT_NAME" LIKE 'A%')

ops$tkyte%ORA11GR2> select * from disorganized where object_name like 'A%';

Execution Plan
----------------------------------------------------------
Plan hash value: 2727546897

-------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |  1806 |   171K|   288   (1)| 00:00:0
|*  1 |  TABLE ACCESS FULL| DISORGANIZED |  1806 |   171K|   288   (1)| 00:00:0
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_NAME" LIKE 'A%')



when we go for about 2 - 2.5% of the rows in the table, we see the first table will use an index - but the second goes for a full scan.

The reason - the second table would have to do 11 IO's against the index plus 1,806 IO's against the table- the cost would be greater than 1,817. The full scan cost is only 288.

In fact - we can compute when the optimizer would stop using the index on this table - it would be whenever we select more than about 280-285 records!

That is about 0.4% of the rows in the table...





why is it that with bad clustering, performance is much better?!

A reader, December 09, 2011 - 12:48 pm UTC

hi tom,

1) i got a index that the opitmiser won't used until i force it.

2) everytime i issue the statement, it would be a FTS and it took very long.

3) if i force hint to use an index, the results returned would be very fast.

4) from the explain plan, the COST for using the index is much way higher then using the FTS, but the result return by using the index is way faster.

----------------------------------------

Below are the clustering facts of the index and the table

select segment_name, blocks from user_segments where segment_name = 'CDR';

CDR 2016288 -- i got about 2 million blocks

select index_name, clustering_factor From user_indexes where table_name = 'CDR';

CDR_PK 25144546 -- i got 20 million clustering (10 times
more) (of course a FTS would make more sense then).

select count(*) from cdr;

46316321 -- i got 40 million rows.

-------------------------------------------------

thus i am wondering the below ->

if the clustering factor is indeed so high and the rows are scattered around

q1) why do i get results returned so fast if i force to use the INDEX ?!

q2) how am i suppose to fix this issue, i cannot be reorganizing the table everytime i hit this problem~

Regards,
Noob
Tom Kyte
December 11, 2011 - 2:53 pm UTC

are the estimated cardinalities from the plan close to the real cardinalities when you run the query.

use this technique to find out:

http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/

report back the plan with e-rows and a-rows for the unhinted query.

A reader, December 12, 2011 - 2:06 pm UTC

hi tom,

do i need to wait for the query to execute finish ?
as everytime when the SQL is executed finish, i have problem finding the sql_id in v$sql.

below is the finding when i cancel the executing sql and run


--------------------------------------

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));


SQL_ID fg4sbvdt9rfpr, child number 0
-------------------------------------
select /*+ gather_plan_statistics ordered */ * From cdr where
cdr_id >= 100570865 and cdr_id <= 103631203

Plan hash value: 1543093305

--------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
--------------------------------------------------------------------------
|* 1 | TABLE ACCESS FULL| CDR | 1 | 3059K| 0 |00:00:00.01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(("CDR_ID">=100570865 AND "CDR_ID"<=103631203))

------------
Regards,
Noob
Tom Kyte
December 13, 2011 - 7:12 am UTC

you need to finish it, and it would be in the shared pool right after it was done.

A reader, December 12, 2011 - 2:22 pm UTC

hi tom,

please ignore my previous post.

below are my test results
---------------
alter session set STATISTICS_LEVEL = ALL;

select /*+ gather_plan_statistics
*/ * From cdr where cdr_id >= 103629760 and cdr_id <= 106497084;

select * from table(dbms_xplan.display_cursor('1snvct04xg3ds',0,'ALLSTATS'));


SQL_ID 1snvct04xg3ds, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * From cdr where cdr_id >= 103629760 and
cdr_id <= 106497084

Plan hash value: 1543093305

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
|* 1 | TABLE ACCESS FULL| CDR | 1 | 2866K| 50 |00:02:07.90 | 1837K| 1816K|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(("CDR_ID">=103629760 AND "CDR_ID"<=106497084))

A reader, December 12, 2011 - 2:33 pm UTC

hi tom,

sorry just to add on
- any idea why the A-ROW is 50 ? the actual aint 50, it is at least a million rows.

is it because my fetch array is set at 50 at a time ?

Regards,
Noob
Tom Kyte
December 13, 2011 - 7:14 am UTC

did you let the query run to completion?

are you telling me you got a MILLION rows via an index and thought it was fast - or did you just get the first couple and say "hey, that was fast".

When you are comparing these queries - are you running them to COMPLETION?

A reader, December 13, 2011 - 12:12 pm UTC

hi tom,

to answer your questions as below :

1) i am using SQL developer to run these queries.
2) i am not exactly sure of what you mean by running to completion (completion as in the complete retrieval of all the rows ? to be display on screen ?)

3) when i run the query without an index, it is taking forever for the rows to be appearing on my output and every further scroll down to retrieve the subsequent rows seems slow.

however when i run the query with an index, it took me just a few seconds to get rows appearing on my output and further scroll down to retrieve subsequents rows is as fast as it is.

Regards,
Noob


Tom Kyte
December 13, 2011 - 12:55 pm UTC

1) don't - use sqlplus and run them to completion. sql developer gets the first n-rows and stops.

your index query seems fast - because it is the fastest way to get the first 25 or so rows.

your index query will be SLOWER than you can imagine to get 1,000,000 rows.

the full scan is the fastest way to the last row.
the index range scan is the fastest way to the first row.


So, do this;

set autotrace traceonly statistics;
run the two queries
set autotrace off

then tell us which one was faster.



Question for you - are you going to display these rows 25 at a time in a user interface? or are you going to put this query in a batch like program?


if the former, feel free to tell us to optimize for first_rows(25) (use that hint, not an index - that hint gives us context, more information). you can even set that at the session level.

if the latter, full scan away

A reader, December 13, 2011 - 12:36 pm UTC

hi tom,

below are the autotrace stats gathered that the SQL is run without the index.

---------------------------
select * From cdr where cdr_id >= 103629760 and cdr_id <= 106497084;

FTS - COST 440212

CPU used by this session 3505
CPU used when call started 3504
DB time 37514
OS Involuntary context switches 8112
OS Page faults 2
OS Page reclaims 217456
OS System time used 1824
OS User time used 1680
OS Voluntary context switches 113202
SQL*Net roundtrips to/from client 9
buffer is not pinned count 52
bytes received via SQL*Net from client 419
bytes sent via SQL*Net to client 26323
calls to get snapshot scn: kcmgss 88
cluster key scan block gets 8
cluster key scans 8
consistent gets 1838008
consistent gets - examination 67
consistent gets from cache 1838008
cursor authentications 16
dirty buffers inspected 6
enqueue releases 85
enqueue requests 85
execute count 43
free buffer inspected 1832565
free buffer requested 1823568
hot buffers moved to head of LRU 180
index fetch by key 23
index scans kdiixs1 20
no work - consistent read gets 1837876
opened cursors cumulative 43
parse count (hard) 35
parse count (total) 36
parse time cpu 17
parse time elapsed 30
physical read IO requests 120870
physical read bytes 14938669056
physical read total IO requests 120870
physical read total bytes 14938669056
physical read total multi block requests 119794
physical reads 1823568
physical reads cache 1823568
physical reads cache prefetch 1702698
pinned buffers inspected 258
recursive calls 450
recursive cpu usage 13
rows fetched via callback 9
session cursor cache hits 26
session logical reads 1838008
session uga memory 58080
shared hash latch upgrades - no wait 21
sorts (memory) 18
sorts (rows) 746
table fetch by rowid 21
table scan blocks gotten 1837854
table scan rows gotten 42532798
table scans (long tables) 1
user I/O wait time 34881
user calls 11
workarea executions - optimal 15
workarea memory allocated 33

-----------------------------
WITH index hint

select /*+INDEX(C, CDR_PK)*/* From cdr c where cdr_id >= 103629760 and cdr_id <= 106497084;

TABLE ACCESS BY INDEX ROWID COST - 1475547
INDEX RANGE SCAN COST - 11306

CPU used by this session 1
CPU used when call started 11
DB time 123
SQL*Net roundtrips to/from client 9
buffer is not pinned count 8
buffer is pinned count 92
bytes received via SQL*Net from client 442
bytes sent via SQL*Net to client 25817
calls to get snapshot scn: kcmgss 2
consistent gets 10
consistent gets - examination 2
consistent gets from cache 10
cursor authentications 1
enqueue releases 1
enqueue requests 1
execute count 2
free buffer requested 7
index scans kdiixs1 1
no work - consistent read gets 7
opened cursors cumulative 2
parse count (hard) 1
parse count (total) 2
parse time cpu 1
parse time elapsed 1
physical read IO requests 7
physical read bytes 57344
physical read total IO requests 7
physical read total bytes 57344
physical reads 7
physical reads cache 7
recursive calls 1
recursive cpu usage 1
session logical reads 10
session pga memory -327680
shared hash latch upgrades - no wait 2
sorts (memory) 2
sorts (rows) 726
table fetch by rowid 50
user I/O wait time 91
user calls 11
workarea executions - optimal 5
----------------------------------------

You can see that the one with the index is taking very CPU/DB time

Regards,
Noob

A reader, December 15, 2011 - 2:44 pm UTC

hi tom,

please see my results as below
--------------------------------

SQL> conn starhub/eqshgrsdb106@192.168.1.175:1521/starhub
Connected.
SQL> set timing on;
SQL> set autotrace on;
SQL> set autotrace traceonly statistics; 
SQL> select * from cdr where cdr_id >= 100570865 and cdr_id <=103631203;

3059485 rows selected.

Elapsed: 00:46:15.87

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    2234510  consistent gets
    1927440  physical reads
          0  redo size
  613614249  bytes sent via SQL*Net to client
    2243964  bytes received via SQL*Net from client
     203967  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    3059485  rows processed

SQL> select /*+INDEX(C, CDR_PK)*/* from cdr C where cdr_id > 100570865 and cdr_id < 103631203;

3059483 rows selected.

Elapsed: 00:50:33.48

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    1517470  consistent gets
     145368  physical reads
          0  redo size
 1014759213  bytes sent via SQL*Net to client
    2243964  bytes received via SQL*Net from client
     203967  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    3059483  rows processed

SQL> 
---------------------------------------

1) you are right, the path with the index doesn't mean it is the fastest (it is slower by 3 mintes actually).

q1) but what i dont understand is,
with the index path, there is less physical reads, but there are less consistent gets.

so is it because of the lesser consistent gets that the index hinted query take a longer time ?

q2) any idea why are there lesser consistent gets ? because the index information is not in the db_buffer_cache ?

q3) when running statistics like this, do i need to clear the db_buffer_cache following each statement executed ?

will the 1st query retrieve all the require blocks into the buffer cache thus increasing the likelihood of consistent gets for the 2nd similar query ?

Regards,
Noob

Tom Kyte
December 16, 2011 - 5:19 pm UTC

1) the reads in case one are big multiblock reads. the reads in case two are all random seeks.


multiblock reads are in general a more efficient way to retrieve millions of rows.

2) nothing to do with what is in the cache, if we had to read from disk, we'd put it into the cache and then read it from there (a physical read is typically followed by a logical read)

If you want to "tune" this, I'd suggest setting your arraysize to say 500

SQL> set arraysize 500

you'll see the consistent gets go down for the full scan

3) NO - that would be more artificial than anything, your cache is never truly empty in real life.


will the 1st query retrieve all the require blocks into the buffer cache thus
increasing the likelihood of consistent gets for the 2nd similar query ?


no, but only because the number of consistent gets is not affected by the presence or lack thereof of a block in the cache.


consider how the number of consistent gets is not affected by the buffer cache, it doesn't impact it:

ops$tkyte%ORA11GR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx on t(object_id);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set termout off
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly
ops$tkyte%ORA11GR2> select /*+ index(t,t_idx) */ * from t;

72182 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 80054 |    12M|  1808   (1)| 00:00:22 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 80054 |    12M|  1808   (1)| 00:00:22 |
|   2 |   INDEX FULL SCAN           | T_IDX | 80054 |       |   162   (1)| 00:00:02 |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      11323  consistent gets
          5  physical reads
          0  redo size
    8008983  bytes sent via SQL*Net to client
      53352  bytes received via SQL*Net from client
       4814  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      72182  rows processed

ops$tkyte%ORA11GR2> select /*+ index(t,t_idx) */ * from t;

72182 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 80054 |    12M|  1808   (1)| 00:00:22 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 80054 |    12M|  1808   (1)| 00:00:22 |
|   2 |   INDEX FULL SCAN           | T_IDX | 80054 |       |   162   (1)| 00:00:02 |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      11323  consistent gets
          1  physical reads
          0  redo size
    8008983  bytes sent via SQL*Net to client
      53352  bytes received via SQL*Net from client
       4814  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      72182  rows processed

ops$tkyte%ORA11GR2> alter system flush buffer_cache;

System altered.

ops$tkyte%ORA11GR2> select /*+ index(t,t_idx) */ * from t;

72182 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 80054 |    12M|  1808   (1)| 00:00:22 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 80054 |    12M|  1808   (1)| 00:00:22 |
|   2 |   INDEX FULL SCAN           | T_IDX | 80054 |       |   162   (1)| 00:00:02 |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      11323  consistent gets
       1189  physical reads
          0  redo size
    8008983  bytes sent via SQL*Net to client
      53352  bytes received via SQL*Net from client
       4814  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      72182  rows processed

ops$tkyte%ORA11GR2> select /*+ index(t,t_idx) */ * from t;

72182 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 80054 |    12M|  1808   (1)| 00:00:22 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 80054 |    12M|  1808   (1)| 00:00:22 |
|   2 |   INDEX FULL SCAN           | T_IDX | 80054 |       |   162   (1)| 00:00:02 |
-------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      11323  consistent gets
          0  physical reads
          0  redo size
    8008983  bytes sent via SQL*Net to client
      53352  bytes received via SQL*Net from client
       4814  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      72182  rows processed

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 


A reader, December 16, 2011 - 11:25 am UTC

dear tom,

are you still here?
Tom Kyte
December 16, 2011 - 6:01 pm UTC

haven't gone anywhere, nope.

but this week I've been in Quebec, NYC, Philly, and finally Montreal - just got home a while ago.


Where have you been?

A reader, December 19, 2011 - 12:10 pm UTC

hey tom,

sorry for the late response. as i am trying to read through your previous thread on consistent reads with array size (
rows/arraysize + block) = consistent get.

---------------------------------------

back to the topic

1) since i am usingthe
- same array size, and
- the amount of rows in the table to be selected remains the same and
- there are no update done to these rows

why does the consistent gets for index path seems to be lesser then the one with FTS ?

in my opinion, it should be more,
since i will be accessing the index block in additional to the data blocks for the same amount of rows.

is it because for FTS, i will be accessing/reading through blocks which might not contain the rows i need ? (like what you say multiblock read)

2) is there anything in the autotrace that suggest that using the index will take longer time ?

since the amount of consistent gets is not really related between the FTS and index access method, and the amount of physical read is infact lesser in the index access method.

so why does the index path take a longer time then ?
( because of the clustering factor ?)

3) if it is because of the clustering factor (meaning that rows are scattered through different blocks in the table, wouldn't this in fact increased my chance of consistent gets since ->

e.g.
get row 1 -> block 1
get row 2 -> block 8
get row 3 -> block 7
get row 4 -> block 4
get row 5 -> block 1

4) lastly -> yes the end-users are accessing the rows through a webpage which fetch about 50 rows per page.

is it normal for an application to use "index hint" in the sql query for daily operations ?

or "hint" is only used in performaning tuning process for analyze only.

Regards,
Noob
Tom Kyte
December 19, 2011 - 5:03 pm UTC

why does the consistent gets for index path seems to be lesser then the one
with FTS ?


it doesn't seem to be less, it was in fact less. It just was. It took less consistent gets using the index. That was just a fact. It took less because it had to process less blocks overall.

in my opinion, it should be more,
since i will be accessing the index block in additional to the data blocks for
the same amount of rows


are you saying that if I am going to get 10 rows out of a table that has 1,000 rows on 100 block (10 rows per block), you think that the full scan would do less consistent gets (it has to read 100 blocks to find the 10) than an index (which would read maybe 3 or 4 index blocks and then at most 10 table blocks - 20 if the rows were all migrated).

The full scan - 100 blocks.
The index range scan - 14 blocks.

both find 10 rows.


is it because for FTS, i will be accessing/reading through blocks which might
not contain the rows i need ?


yes, a full table scan is a full table scan, it full scans every block in the table looking for the rows. It does not know beforehand that a given block doesn't contain any useful data.




2) I honestly don't think those queries really take different amounts of time. They are so very close - less than 10% difference. They are about the same.


... and the amount of physical read is infact lesser in the
index access method. ...


but the IO types are different. one uses multiblock reads and the other uses single block reads.

Assuming an 8k block size and a 1mb IO size - you'd be reading up to 128 blocks at a time using the full scan. The full scan might have issued only 15,000 IO calls to the OS - that is another way to look at it.



3) yes, it would, but what was the relevance?

4) you should use the first_rows(50) hint, not index. tell the optimizer to optimize for initial response time.

The default mode of the optimizer is to optimize for getting the last row of the query as fast as possible. You want to optimize to get the first rows as fast as possible.





A reader, December 20, 2011 - 2:12 pm UTC

hi tom,

1st) thank you very much from the bottom of my heart really. i have learned so much reading from your threads and your answers. you have solved my complication thoughts with really simple and concise answers.

2nd) how do i really know how much multiblock read per I/O can my OS support ?

3rd) moving on forward, is there any hard and fast rule stating that when a index is not used , it is probably due to clustering factor if selectivity is high.

as i am having a simple statement

select * From cdr where corporate = 'x';
corporate column is non-unique indexed and selectivity of value x is high.

rows with value 'x' - 3272
total rows in table - 47471599
rows with value in corporate column = 17378158
clustering factor in corporate_idx = 2021670
blocks in cdr table = 17095557

but again, the index is not used.
now both the clusterign factor is neither near to the amount of rows entries in the index, nor it is near to the amount of blocks in the table.

before i run some statistics on both query with and without index and even if it shows that the amount of I/O physical reads is lesser (with multiblock read) for FTS, what could be the underlying issue that cause the amount of physical I/O to be more for a index access ?

the previous issue was that, as the index clustering was way higher then the amount of blocks, oracle would have to do more physical io with the single block reads (from both index and data) in order to get the amount of rows required, thus FTS is preferred.

but what could be the issue now that, the index clustering was low, selectivity was low too but yet FTS is choosen.

Regards,
Noob



Tom Kyte
December 20, 2011 - 2:29 pm UTC

2nd) how do i really know how much multiblock read per I/O can my OS support ?


One way to see would to full scan a largish table with wait events enabled after flushing the buffer cache on a locally managed tablespace with large uniform extents (we cannot multi block IO over extents).

My largest observed read was 512 blocks - 4mb

select count(*) from t
END OF STMT
PARSE #81319016:c=9998,e=10573,p=403,cr=83,cu=1,mis=1,r=0,dep=0,og=1,plh=2966233522,tim=1324412492477285
EXEC #81319016:c=0,e=53,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2966233522,tim=1324412492477389
WAIT #81319016: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1324412492477455
WAIT #81319016: nam='reliable message' ela= 466 channel context=673266856 channel handle=673237868 broadcast message=674287996 obj#=0 tim=1324412492478102
WAIT #81319016: nam='enq: KO - fast object checkpoint' ela= 835 name|mode=1263468550 2=65563 0=1 obj#=0 tim=1324412492478984
WAIT #81319016: nam='direct path read' ela= 2603 file number=3 first dba=140 block cnt=372 obj#=103099 tim=1324412492481898
WAIT #81319016: nam='direct path read' ela= 2865 file number=3 first dba=512 block cnt=256 obj#=103099 tim=1324412492485054
WAIT #81319016: nam='direct path read' ela= 1084 file number=3 first dba=778 block cnt=246 obj#=103099 tim=1324412492491084
WAIT #81319016: nam='direct path read' ela= 1936 file number=3 first dba=1024 block cnt=384 obj#=103099 tim=1324412492495498
WAIT #81319016: nam='direct path read' ela= 781 file number=3 first dba=1418 block cnt=118 obj#=103099 tim=1324412492498838
WAIT #81319016: nam='direct path read' ela= 2237 file number=3 first dba=1536 block cnt=512 obj#=103099 tim=1324412492505251
WAIT #81319016: nam='direct path read' ela= 2278 file number=3 first dba=2058 block cnt=502 obj#=103099 tim=1324412492508519
WAIT #81319016: nam='direct path read' ela= 539 file number=3 first dba=2560 block cnt=128 obj#=103099 tim=1324412492513173
WAIT #81319016: nam='direct path read' ela= 1423 file number=3 first dba=2698 block cnt=374 obj#=103099 tim=1324412492518475
WAIT #81319016: nam='direct path read' ela= 897 file number=3 first dba=3072 block cnt=256 obj#=103099 tim=1324412492520426
WAIT #81319016: nam='direct path read' ela= 889 file number=3 first dba=3338 block cnt=246 obj#=103099 tim=1324412492524095
WAIT #81319016: nam='direct path read' ela= 1542 file number=3 first dba=3584 block cnt=384 obj#=103099 tim=1324412492527742
WAIT #81319016: nam='direct path read' ela= 387 file number=3 first dba=3978 block cnt=118 obj#=103099 tim=1324412492530200
WAIT #81319016: nam='direct path read' ela= 1938 file number=3 first dba=4096 block cnt=512 obj#=103099 tim=1324412492535121
WAIT #81319016: nam='direct path read' ela= 2122 file number=3 first dba=4618 block cnt=502 obj#=103099 tim=1324412492538292
....


I used this to set it up:


create tablespace test_ts
datafile '/tmp/test_ts.dbf'
size 20m
autoextend on next 5m
extent management local uniform size 5m;

create table t tablespace test_ts as select * from all_objects;
insert /*+ append */ into t select * from t;
commit;
insert /*+ append */ into t select * from t;
commit;
insert /*+ append */ into t select * from t;
commit;
insert /*+ append */ into t select * from t;
commit;

alter session set db_file_multiblock_read_count=1000;
alter system flush buffer_cache;
@trace
select count(*) from t;


before i run some statistics on both query with and without index and even if
it shows that the amount of I/O physical reads is lesser (with multiblock read)
for FTS, what could be the underlying issue that cause the amount of physical
I/O to be more for a index access ?



It is easy for that to happen, consider this example:

https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

it can happen when you have to visit and re-visit blocks in the cache - but they aren't in the cache anymore so you end up reading and re-reading them.


but what could be the issue now that, the index clustering was low, selectivity
was low too but yet FTS is choosen.


post example using this technique:

http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/

so we can see the a-rows and e-rows columns.


A reader, December 20, 2011 - 2:15 pm UTC

amendment on the rows statistics

rows with value 'x' - 3272
total rows in table - 47471599
rows with value in corporate column = 17378158
clustering factor in corporate_idx = 2021670
blocks in cdr table = 2065440

so with the clustering factor near to the amount of blocks in the cdr table + high selectivity, why would a FTS still be choosen instead ?

Regards,
Noob
Tom Kyte
December 20, 2011 - 2:30 pm UTC

post the example I asked for

A reader, December 21, 2011 - 1:33 pm UTC

hi tom,

i followed your example and this is what is see in my trace file.


PARSING IN CURSOR #6 len=22 dep=0 uid=0 oct=3 lid=0 tim=1293450501092569 hv=2763161912 ad='c93a4588'
select count(*) from t
END OF STMT
PARSE #6:c=51992,e=60104,p=1825,cr=67,cu=1,mis=1,r=0,dep=0,og=1,tim=1293450501092563
EXEC #6:c=0,e=48,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1293450501092673
WAIT #6: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=123620 tim=1293450501092712
WAIT #6: nam='db file scattered read' ela= 272 file#=14 block#=21 blocks=23 obj#=123620 tim=1293450501093191
WAIT #6: nam='db file scattered read' ela= 371 file#=14 block#=300 blocks=32 obj#=123620 tim=1293450501096246
WAIT #6: nam='db file scattered read' ela= 509 file#=14 block#=460 blocks=46 obj#=123620 tim=1293450501098374
WAIT #6: nam='db file scattered read' ela= 159 file#=14 block#=634 blocks=15 obj#=123620 tim=1293450501100167
WAIT #6: nam='db file scattered read' ela= 312 file#=14 block#=659 blocks=29 obj#=123620 tim=1293450501100754
WAIT #6: nam='db file scattered read' ela= 1147 file#=14 block#=816 blocks=99 obj#=123620 tim=1293450501103619
WAIT #6: nam='db file scattered read' ela= 615 file#=14 block#=1043 blocks=55 obj#=123620 tim=1293450501106557
WAIT #6: nam='db file scattered read' ela= 115 file#=14 block#=1226 blocks=10 obj#=123620 tim=1293450501108417
WAIT #6: nam='db file scattered read' ela= 391 file#=14 block#=1299 blocks=36 obj#=123620 tim=1293450501109554
WAIT #6: nam='db file scattered read' ela= 1523 file#=14 block#=1463 blocks=128 obj#=123620 tim=1293450501113035
WAIT #6: nam='db file scattered read' ela= 214 file#=14 block#=1591 blocks=19 obj#=123620 tim=1293450501114602
WAIT #6: nam='db file scattered read' ela= 428 file#=14 block#=1738 blocks=38 obj#=123620 tim=1293450501116543
WAIT #6: nam='db file scattered read' ela= 277 file#=14 block#=1904 blocks=25 obj#=123620 tim=1293450501118451
WAIT #6: nam='db file scattered read' ela= 84 file#=14 block#=1939 blocks=8 obj#=123620 tim=1293450501118837
WAIT #6: nam='db file scattered read' ela= 1051 file#=14 block#=2075 blocks=91 obj#=123620 tim=1293450501121473
WAIT #6: nam='db file scattered read' ela= 1498 file#=14 block#=2294 blocks=128 obj#=123620 tim=1293450501125433
WAIT #6: nam='db file scattered read' ela= 38 file#=14 block#=2422 blocks=2 obj#=123620 tim=1293450501126764
WAIT #6: nam='db file scattered read' ela= 208 file#=14 block#=2552 blocks=17 obj#=123620 tim=1293450501128276
WAIT #6: nam='db file scattered read' ela= 106 file#=14 block#=2579 blocks=9 obj#=123620 tim=1293450501128623
FETCH #6:c=37995,e=36977,p=810,cr=2635,cu=0,mis=0,r=1,dep=0,og=1,tim=1293450501129724
WAIT #6: nam='SQL*Net message from client' ela= 359 driver id=1650815232 #bytes=1 p3=0 obj#=123620 tim=1293450501130199
FETCH #6:c=0,e=2,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,tim=1293450501130238
WAIT #6: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=123620 tim=1293450501130254
WAIT #6: nam='SQL*Net message from client' ela= 179 driver id=1650815232 #bytes=1 p3=0 obj#=123620 tim=1293450501130448
STAT #6 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=2635 pr=810 pw=0 time=36969 us)'
STAT #6 id=2 cnt=206272 pid=1 pos=1 obj=123620 op='TABLE ACCESS FULL T (cr=2635 pr=810 pw=0 time=492 us)'


q1) not sure why is db scattered read but yours is direct path read. ( i do insert the table data by append though )

q2) so can i say that my max multiblock read is 128
WAIT #6: nam='db file scattered read' ela= 1523 file#=14 block#=1463 blocks=128 obj#=123620 tim=1293450501113035

q3) can i change my current parameter db_file_multiblock_read_count to 128 instead ?(now it is at 16)

---------------------------------------------------------

on the question on why is the corporate index not used and a FTS is used (when selectivity is high, and clustering factor near to number of blocks in table)

below is my result (FOR FTS)
------------------------------

SQL_ID 383mp1nvc5d8p, child number 0
-------------------------------------
SELECT /*+gather_plan_statistics */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE,
c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER,
c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID,
c.CORPORATE_ID FROM CDR c LEFT JOIN SERVICE s ON c.SERVICE_ID = s.SERVICE_ID WHERE c.CORPORATE_ID='test' AND
c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS') AND c.CONNECT_STARTTIME<=to_date('01
Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS') AND c.STATUS='0' order by USER_ID,CONNECT_STARTTIME desc

Plan hash value: 411674111

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------
| 1 | SORT ORDER BY | | 1 | 358K| 137 |00:01:58.35 | 2067K| 1960K| 24576 | 24576 | 1/0/0|
|* 2 | HASH JOIN RIGHT OUTER| | 1 | 358K| 137 |00:02:33.59 | 2067K| 1960K| 1023K| 1023K| 1/0/0|
| 3 | TABLE ACCESS FULL | SERVICE | 1 | 18 | 19 |00:00:00.01 | 7 | 0 | | | |
|* 4 | TABLE ACCESS FULL | CDR | 1 | 358K| 137 |00:02:33.59 | 2067K| 1960K| | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C"."SERVICE_ID"="S"."SERVICE_ID")
4 - filter(("C"."CONNECT_STARTTIME">=TO_DATE('2011-11-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND
"C"."CORPORATE_ID"='test' AND "C"."STATUS"='0' AND "C"."CONNECT_STARTTIME"<=TO_DATE('2011-12-01 23:59:59', 'yyyy-mm-dd
hh24:mi:ss')))
--------------------------------------------------------

q3) why is the E-ROWS 358k? how does oracle derive this estimated value from ?

--> i will be posting the autotrace result in the next thread.
Tom Kyte
December 21, 2011 - 3:06 pm UTC

q1) you are probably 10g, I was 11g, 11g does direct path reads more often than 10g did.

q2) looks like a 1mb IO size with 128x8k blocks.

q3) it shouldn't be set at all! You should let it default. I set it to 1000 just for this demo, and only in that session. Mine defaults.

If you let it default - we'll use your observed multi block read count to cost the query, but we'll always try to do the maximum IO size at runtime.

A reader, December 21, 2011 - 1:55 pm UTC

SQL> SELECT  c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID 
  2  FROM CDR c LEFT JOIN SERVICE s 
  3  ON c.SERVICE_ID = s.SERVICE_ID 
  4  WHERE  c.CORPORATE_ID='test'
  5  AND    c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS') 
  6  AND    c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS') 
  7  AND c.STATUS='0'
  8  order by USER_ID,CONNECT_STARTTIME desc;

137 rows selected.

Elapsed: 00:02:21.71

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    2067064  consistent gets
    1960713  physical reads
          0  redo size
       7772  bytes sent via SQL*Net to client
        448  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        137  rows processed


SQL> 
SQL> SELECT /*+index(c,CDR_CORPORATEIDX) */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
  2  FROM CDR c LEFT JOIN SERVICE s 
  3  ON c.SERVICE_ID = s.SERVICE_ID 
  4  WHERE  c.CORPORATE_ID='test'
  5  AND    c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS') 
  6  AND    c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS') 
  7  AND c.STATUS='0'
  8  order by USER_ID,CONNECT_STARTTIME desc;

137 rows selected.

Elapsed: 00:00:05.78

Statistics
----------------------------------------------------------
         22  recursive calls
          0  db block gets
       2730  consistent gets
       1568  physical reads
          0  redo size
       7772  bytes sent via SQL*Net to client
        448  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
        137  rows processed

-----------------------------------

obviously the index route is shorter/faster.
even i can think of multiblock-read for the FTS, 1960713  / 128 - that would still give me 15318 physical read compared to 1568 physical read for index.

q4) is it because the CBO estimated the wrong amount of rows to be fetch thus it chooses the FTS (as shown in the previous dbms_xplan ?


Regards,
Noob

A reader, December 23, 2011 - 11:27 am UTC

hi tom,

thanks for answering my multi-block side-question.

any idea with my previous posting on query (when selectivity is high and clustering factor near to number of blocks in table)

i have posted the dbms_xplan example earlier.

Regards,
Noob
Tom Kyte
December 23, 2011 - 11:34 am UTC

if we estimate the cardinality wrong, that is the biggest factor in getting the correct cost - then single biggest factor, we can many times get the wrong plan.

This article:
http://www.oracle.com/technetwork/issue-archive/2009/09-jan/o19asktom-086775.html

talks in more depth about that.

A reader, December 27, 2011 - 1:26 pm UTC

hi tom,

thanks for the help on dynamic sampling. i am trying to digest and apply in my application now.

however, there seems to be something abnormal on my hint recently, i am not sure if it is a bug or did i miss anything

->

my first_row hint is not working properly.
(in the oracle sql document, it says that if your query include an order by (and the order by column is not the index column that is being use), CBO will ignore the hint.

now let's see.

-------------------------------------
SELECT /*+ FIRST_ROWS(50) */ *
FROM cdr c
where cdr_id >= 100570865 and cdR_id <= 103631203
ORDER BY status;

(FTS choosen) -- proven correct , cbo ignore hint.


SELECT /*+ FIRST_ROWS(50) */ *
FROM cdr c
where cdr_id >= 100570865 and cdR_id <= 103631203;

(Index range choosen) -- yes hint used.


SELECT /*+ FIRST_ROWS(50) */ *
FROM cdr c
where cdr_id >= 100570865 and cdR_id <= 103631203
order by cdr_id;

(Index range choosen) -- yes hint used, shows that if order by clause is using the index column, cbo will still use the hint)

------------------

keeping the 3 above example in mind, now let see another query


SELECT /*+FIRST_ROWS(10) */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
FROM CDR c LEFT JOIN SERVICE s
ON c.SERVICE_ID = s.SERVICE_ID
WHERE c.CORPORATE_ID='test'
AND c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS')
AND c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS')
AND c.STATUS='0'
order by USER_ID,CONNECT_STARTTIME desc;

-- index column is CORPORATE_ID
-- FTS choosen, so i am guessing might be the ORDER BY clause choosing other column.



so i remove the order by clause

SELECT /*+FIRST_ROWS(10) */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
FROM CDR c LEFT JOIN SERVICE s
ON c.SERVICE_ID = s.SERVICE_ID
WHERE c.CORPORATE_ID='test'
AND c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS')
AND c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS')
AND c.STATUS='0';


-- still FTS choosen, so something might be wrong.... ok nvm let try this (removing the 10, from the FIRST_ROWS hint)


SELECT /*+FIRST_ROWS */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
FROM CDR c LEFT JOIN SERVICE s
ON c.SERVICE_ID = s.SERVICE_ID
WHERE c.CORPORATE_ID='test'
AND c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS')
AND c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS')
AND c.STATUS='0';

-- INDEX RANGE SCAN CHOOSEN -- why!?! dont believe, let me include order by.


SELECT /*+FIRST_ROWS */ c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
FROM CDR c LEFT JOIN SERVICE s
ON c.SERVICE_ID = s.SERVICE_ID
WHERE c.CORPORATE_ID='test'
AND c.CONNECT_STARTTIME>=to_date('01 Nov 2011 00:00:00','DD MON YYYY HH24:mi:SS')
AND c.CONNECT_STARTTIME<=to_date('01 Dec 2011 23:59:59','DD MON YYYY HH24:mi:SS')
AND c.STATUS='0';
order by USER_ID,CONNECT_STARTTIME desc;


-- index range scan choosen! (even when order by include non-related columns)~



----------------------


so why does FIRST_ROWS work and FIRST_ROWS(n) doesnt ?

Regards,
Noob
Tom Kyte
December 27, 2011 - 2:10 pm UTC

I wasn't saying necessarily "use dynamic sampling" but rather "this is what happens when we get the cardinality wrong"

some of the tools you have to fix that

o dynamic sampling
o sql profiles
o cardinality hint
o extended statistics
o histograms
o 'better' (more representative) statistics
o removal of things that make cardinality estimates really really hard - like

where trunc(date_col) = to_date( 'xxxxx', 'yyyy' )

do instead where date_col between to_date( ... ) and to_date( ....) +1 -1/24/60/60


my first_row hint is not working properly.
(in the oracle sql document, it says that if your query include an order by
(and the order by column is not the index column that is being use), CBO will
ignore the hint.


where does it say that???


http://docs.oracle.com/cd/E11882_01/server.112/e26088/sql_elements006.htm#SQLRF50302

A reader, December 27, 2011 - 3:12 pm UTC

hi tom,

i have simplied by SQL to the simplest form. 

--------

select index_name, table_name, clustering_factor, num_rows from dba_indexes where table_name = 'CDR' and index_name = 'CDR_CORPORATEIDX';

CDR_CORPORATEIDX CDR 2021670 17095557

select SEGMENT_NAME, SEGMENT_TYPE, BLOCKS from dba_segmentS where SEGMENT_NAME = 'CDR';
CDR TABLE 2098208


select COUNT(*) FROM CDR;
48146495

SELECT COUNT(*) fROM CDR WHERE CORPORATE_ID IS NOT NULL;
17632699

---------------------

SQL> SELECT  c.CONNECT_STARTTIME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID
  2  FROM CDR c 
  3  WHERE c.CORPORATE_ID='test' 
  4  and c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60
  5  AND c.STATUS='0';

136 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1344962781

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   346K|    32M|   454K  (1)| 01:30:54 |
|*  1 |  FILTER            |      |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| CDR  |   346K|    32M|   454K  (1)| 01:30:54 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE('01-NOV-11')<=TO_DATE('01-DEC-11')-.00001157407407
              407407407407407407407407407407)
   2 - filter("C"."CORPORATE_ID"='test' AND "C"."STATUS"='0' AND
              "C"."CONNECT_STARTTIME">=TO_DATE('01-NOV-11') AND
              "C"."CONNECT_STARTTIME"<=TO_DATE('01-DEC-11')-.0000115740740740740
740740

              7407407407407407407)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    2092671  consistent gets
    1974201  physical reads
          0  redo size
       8398  bytes sent via SQL*Net to client
        448  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        136  rows processed


---------------------

with dynamic sampling..( of 4), rows drop from 346k to 5730 but estimation is still far from the actual rows (136).

SQL> SELECT /*+ dynamic_sampling(c 4) */ c.CONNECT_STARTTIME, c.CHARGED_DURATION, c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_I
  2  FROM CDR c 
  3  WHERE c.CORPORATE_ID='test' 
  4  and c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60
  5  AND c.STATUS='0';

136 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1344962781

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  5730 |   548K|   454K  (1)| 01:30:54 |
|*  1 |  FILTER            |      |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| CDR  |  5730 |   548K|   454K  (1)| 01:30:54 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE('01-NOV-11')<=TO_DATE('01-DEC-11')-.00001157407407
              407407407407407407407407407407)
   2 - filter("C"."CORPORATE_ID"='test' AND "C"."STATUS"='0' AND
              "C"."CONNECT_STARTTIME">=TO_DATE('01-NOV-11') AND
              "C"."CONNECT_STARTTIME"<=TO_DATE('01-DEC-11')-.0000115740740740740
740740

              7407407407407407407)

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
    2092948  consistent gets
    1969960  physical reads
          0  redo size
       8398  bytes sent via SQL*Net to client
        448  bytes received via SQL*Net from client
         11  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        136  rows processed

---------------

however actual rows are still 136 - only.


---------------
with dbms_xplan

SQL_ID  5h5vayzd9jb6h, child number 0
-------------------------------------
SELECT  /*+gather_plan_statistics */  c.CONNECT_STARTTIME, c.CHARGED_DURATION, 
c.CHARGED_RATE, c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, 
c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, c.ACCESS_POINT_NUMBER, 
c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, c.ORIGINATE_NUMBER, 
c.USER_ID, c.CORPORATE_ID FROM CDR c WHERE c.CORPORATE_ID='test' and 
c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60 
AND c.STATUS='0'
 
Plan hash value: 1344962781
 
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|*  1 |  FILTER            |      |      1 |        |    136 |00:01:38.83 |    2092K|   1973K|
|*  2 |   TABLE ACCESS FULL| CDR  |      1 |    346K|    136 |00:01:38.83 |    2092K|   1973K|
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(TO_DATE('01-NOV-11')<=TO_DATE('01-DEC-11')-.0000115740740740740740740740
              7407407407407407)
   2 - filter(("C"."CORPORATE_ID"='test' AND "C"."STATUS"='0' AND 
              "C"."CONNECT_STARTTIME">=TO_DATE('01-NOV-11') AND 
              "C"."CONNECT_STARTTIME"<=TO_DATE('01-DEC-11')-.000011574074074074074074074074074074074
              07407))
 


----------------

now it can't be the clustering factor as it is amount the same of the number of blocks in the table.

dynamic sampling of level 4, does rectify the reduce the estimation from 346k to  , but still it doesn't reflect the actual figure (136 rows) and FTS is still used.

what could be the problem here then ? not enough statistics collected ? but this is already a much simplified statement and i have also manually run DBMS_STATS on both the table and the index already.

q1) Any advice tom ?  i am not a very strong oracle dba i must admit sorry about it.

-----------------------------------------------

On the FIRST_ROW hint (sorry i got it from a 9i document although  i am on 10g)

9i -> http://docs.oracle.com/cd/B10501_01/server.920/a96533/hintsref.htm#4942

10g -> The optimizer ignores this hint in DELETE and UPDATE statement blocks and in SELECT statement blocks that include any blocking operations, such as sorts or groupings. 

q2) however adhering to 10g standard, even with ORDER BY clauses
as long as i am using FIRST_ROWS and not FIRST_ROWS(n), the index will be used - any idea why ?

Lastly would really want to say thank you. I been asking around among colleagues, in metalink, but really to no avail.

Regards,
Noob

Tom Kyte
December 27, 2011 - 3:18 pm UTC

q2) however adhering to 10g standard, even with ORDER BY clauses
as long as i am using FIRST_ROWS and not FIRST_ROWS(n), the index will be used
- any idea why ?,


because it was deemed to be the *best plan*.


the estimated cardinality is WAY off from the actual.

what statistics do you have on the columns in the predicate (histograms, no histograms)

is the data skewed (would histograms make sense)

does dynamic sampling apply - it would take about 5 seconds to test it.

A reader, December 27, 2011 - 3:46 pm UTC

hi tom,

thanks for the fast response.

q2) however adhering to 10g standard, even with ORDER BY clauses
as long as i am using FIRST_ROWS and not FIRST_ROWS(n), the index will be used
- any idea why ?,

"because it was deemed to be the *best plan*"

--> what is the difference between FIRST_ROWS and FIRST_ROWS(n) ?

with FIRST_ROWS, index is always used, with FIRST_ROWS(n), it isn't (even if it is FIRST_ROWS(1))

---------------------------------

sorry to be asking this....

>>what statistics do you have on the columns in the >>predicate (histograms, no histograms)

(i dont think i ever collect any histograms on the column beside the basic

begin
dbms_Stats.gather_index_stats('schema','idxname');
dbms_stats.gather_table_stats('schema','tblname');
end;


>>is the data skewed (would histograms make sense)

do you mean ?

i) data on the column "corporate" ?
ii) or corporate (amount of rows) as a whole compare to the amount of rows in the CDR table ?

test001 3
gemalto 5
clickhere2 6
ergotravel 9
globalroam 29
voizip7 57
vshipscap 75
YLPLLC12 96
wayco111 338
azaas 439
ogerik14 602
sgalleria 637
cconcepts 1525
mpro 1732
silverback 1888
grundf15 2011
vship 2847
CANNON10 3168
test 3351
lgatel9 3638
starfu16 6295
bh 8164
voizip8 11456
grcorp02 19094
slp 21118
isscs17 22227
band 57663
borderless 77108
grcorp03 107292
grcorp01 647104
grcorp04 16632980

out of the 17095557 corporate records, only 3351 records belong to corporate "test" -> is this consider skewed ?

out of 48146495 records, 17095557 have corporate_id (35%)

>> does dynamic sampling apply - it would take about 5 seconds to test it.

I do not understand what you mean, tom. I have just tested using dynamic_sampling hint (level 4) earlier in my example.

or do you mean some other tests ?

Regards,
Noob
Tom Kyte
December 27, 2011 - 4:07 pm UTC

(i dont think i ever collect any histograms on the column beside the basic


don't be so sure, read:

http://www.oracle.com/technetwork/issue-archive/2006/06-may/o36asktom-099107.html

the section entitled "Why Does My Plan Change?"



>>is the data skewed (would histograms make sense)

do you mean ?

i) data on the column "corporate" ?
ii) or corporate (amount of rows) as a whole compare to the amount of rows in
the CDR table ?


I mean in either of the columns in your where clause. Any of these three:

WHERE c.CORPORATE_ID='test'
4 and c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60
5 AND c.STATUS='0';


and the answer is "yes". It is skewed on corporate_id because some values have 16,632,980 and some have 3. That is skew.



>> does dynamic sampling apply - it would take about 5 seconds to test it.

I do not understand what you mean, tom. I have just tested using
dynamic_sampling hint (level 4) earlier in my example.


your examples are too large to read in their entirety. I didn't really look. what did it do for you - was using dynamic sampling helpful in getting better cardinality estimates? If not, did you try at a slightly higher level. And have you considered sql profiles? A more aggressive form of sampling.


A reader, December 28, 2011 - 2:05 pm UTC

hi tom,

thanks for showing me more insights
below are my test results

------------------------------

SELECT num_distinct, 
      low_value,
      high_value,
      num_buckets, 
      histogram,
      last_analyzed 
FROM user_tab_col_statistics 
WHERE table_name = 'CDR' AND column_name = 'CORPORATE_ID';

28 43414E4E4F4E3130 776179636F313131 11 HEIGHT BALANCED 21-DEC-11 02:29:21

------------------

select column_name, count(*)
      from user_tab_histograms
     where table_name = 'CDR'
       and column_name = 'CORPORATE_ID'
     group by column_name;

CORPORATE_ID 3

------------------

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'CDR',METHOD_OPT=> 'FOR COLUMNS CORPORATE_ID SIZE SKEWONLY');

PL/SQL procedure successfully completed.

-------------------

SELECT num_distinct, 
      low_value,
      high_value,
      num_buckets, 
      histogram,
      last_analyzed 
FROM user_tab_col_statistics 
WHERE table_name = 'CDR' AND column_name = 'CORPORATE_ID';

12 43414E4E4F4E3130 776179636F313131 12 FREQUENCY 29-DEC-11 03:34:32

(HISTOGRAM from HeightBalance to Frequency) --> not sure what it means though .



select column_name, count(*)
      from user_tab_histograms
     where table_name = 'CDR'
       and column_name = 'CORPORATE_ID'
     group by column_name;

CORPORATE_ID 12

-- histogram from 3 become 12.
---------------------------------------

Let's see if we can do it !


SQL> SELECT  c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE, 
  2  c.CHARGED_AMOUNT, c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER, 
  3  c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT, 
  4  c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID 
  5  FROM CDR c LEFT JOIN SERVICE s 
  6  ON c.SERVICE_ID = s.SERVICE_ID 
  7  WHERE  c.CORPORATE_ID='test'
  8  AND c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60
  9  AND c.STATUS='0'
 10  order by USER_ID,CONNECT_STARTTIME desc;

Execution Plan
----------------------------------------------------------
Plan hash value: 1218877854

---------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                  |     1 |   118 |     7  (15)| 00:00:01 |
|   1 |  SORT ORDER BY                 |                  |     1 |   118 |     7  (15)| 00:00:01 |
|*  2 |   FILTER                       |                  |       |       |            |          |
|   3 |    NESTED LOOPS OUTER          |                  |     1 |   118 |     6   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| CDR              |     1 |    98 |     5   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | CDR_CORPORATEIDX |     1 |       |     4   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| SERVICE          |     1 |    20 |     1   (0)| 00:00:01 |
|*  7 |      INDEX UNIQUE SCAN         | SERVICE_PK       |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_DATE('01-NOV-11')<=TO_DATE('01-DEC-11')-.000011574074074074074074074074074
              07407407407)
   4 - filter("C"."STATUS"='0' AND "C"."CONNECT_STARTTIME">=TO_DATE('01-NOV-11') AND
              "C"."CONNECT_STARTTIME"<=TO_DATE('01-DEC-11')-.00001157407407407407407407407407407407407407
              )
   5 - access("C"."CORPORATE_ID"='test')
   7 - access("C"."SERVICE_ID"="S"."SERVICE_ID"(+))

--------------------------------------------------

SQL_ID  8qmsv77zgjg4d, child number 0
-------------------------------------
 SELECT  /*+gather_plan_statistics */  c.CONNECT_STARTTIME, s.SERVICE_NAME, c.CHARGED_DURATION, c.CHARGED_RATE,  c.CHARGED_AMOUNT, 
c.BALANCE_AFTER_TRANSACTION, c.CDR_ID, c.SERVICE_ID, NULL, c.DESTINATION_NUMBER,  c.ACCESS_POINT_NUMBER, c.L1_CHARGED_DURATION, 
c.L1_CHARGED_RATE, c.L1_CHARGED_AMOUNT,  c.ORIGINATE_NUMBER, c.USER_ID, c.CORPORATE_ID  FROM CDR c LEFT JOIN SERVICE s  ON c.SERVICE_ID 
= s.SERVICE_ID  WHERE  c.CORPORATE_ID='test'  AND c.CONNECT_STARTTIME between to_date('01-NOV-11') and to_date('01-DEC-11')-1/24/60/60  
AND c.STATUS='0'  order by USER_ID,CONNECT_STARTTIME desc
 
Plan hash value: 1218877854
 
-------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
-------------------------------------------------------------------------------------------------------------------------------------------------
|   1 |  SORT ORDER BY                 |                  |      1 |      1 |    136 |00:00:00.01 |    2922 |      1 | 24576 | 24576 |     1/0/0|
|*  2 |   FILTER                       |                  |      1 |        |    136 |00:00:00.01 |    2922 |      1 |       |       |          |
|   3 |    NESTED LOOPS OUTER          |                  |      1 |      1 |    136 |00:00:00.01 |    2922 |      1 |       |       |          |
|*  4 |     TABLE ACCESS BY INDEX ROWID| CDR              |      1 |      1 |    136 |00:00:00.01 |    2784 |      1 |       |       |          |
|*  5 |      INDEX RANGE SCAN          | CDR_CORPORATEIDX |      1 |      1 |   3353 |00:00:00.01 |      14 |      1 |       |       |          |
|   6 |     TABLE ACCESS BY INDEX ROWID| SERVICE          |    136 |      1 |    136 |00:00:00.01 |     138 |      0 |       |       |          |
|*  7 |      INDEX UNIQUE SCAN         | SERVICE_PK       |    136 |      1 |    136 |00:00:00.01 |       2 |      0 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(TO_DATE('01-NOV-11')<=TO_DATE('01-DEC-11')-.00001157407407407407407407407407407407407407)
   4 - filter(("C"."STATUS"='0' AND "C"."CONNECT_STARTTIME">=TO_DATE('01-NOV-11') AND 
              "C"."CONNECT_STARTTIME"<=TO_DATE('01-DEC-11')-.00001157407407407407407407407407407407407407))
   5 - access("C"."CORPORATE_ID"='test')
   7 - access("C"."SERVICE_ID"="S"."SERVICE_ID")

--------------------------------------

W00000 YES TOM I DID IT :)!!!!
but not sure why the E-ROWS are 1 ? 

Thanks am 10000th million!

Regards,
Noob



using FIRST_ROWS and NOT using but having same explain plan.

A reader, February 13, 2012 - 1:48 pm UTC

Hi Tom,

I used to think that with FIRST_ROWS hint, query will be force to use a index instead of a FTS in order to return the first n row faster.

But today, it seems that with or without FIRST_ROWS hint, CBO might used the same plan but just with different cost.

--------
WITH FIRST_ROWS(n) hint

SELECT /*+FIRST_ROWS(10) */ c.connect_starttime, s.service_name, c.charged_duration,c.charged_rate,
c.charged_amount, c.balance_after_transaction,c.cdr_id, c.service_id, NULL,
c.destination_number,
c.access_point_number, c.l1_charged_duration, c.l1_charged_rate,
c.l1_charged_amount,
c.originate_number, c.user_id, c.corporate_id
FROM cdr c LEFT JOIN service s ON c.service_id = s.service_id
WHERE c.cdr_id >= GET_RPT_ID ('MIN','CDR','MONTH','01-JAN-12')
AND c.connect_starttime >= TO_DATE ('01 JAN 2012 00:00:00', 'DD MON YYYY HH24:mi:SS')
AND c.connect_starttime <= TO_DATE ('01 FEB 2012 00:00:00', 'DD MON YYYY HH24:mi:SS')
AND c.status = '0'
AND c.corporate_id = 'borderless';



16754 rows selected.

Elapsed: 00:00:30.12

Execution Plan
----------------------------------------------------------
Plan hash value: 3947246258

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19200 | 4087K| 141 (1)| 00:00:02 |
|* 1 | HASH JOIN RIGHT OUTER | | 19200 | 4087K| 141 (1)| 00:00:02 |
| 2 | TABLE ACCESS FULL | SERVICE | 18 | 360 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID| CDR | 10 | 990 | 137 (0)| 00:00:02 |
|* 4 | INDEX RANGE SCAN | CDR_PK | 477K| | 4 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("C"."SERVICE_ID"="S"."SERVICE_ID"(+))
3 - filter("C"."CONNECT_STARTTIME">=TO_DATE('2012-01-01 00:00:00',
'yyyy-mm-dd hh24:mi:ss') AND "C"."CORPORATE_ID"='borderless' AND
"C"."STATUS"='0' AND "C"."CONNECT_STARTTIME"<=TO_DATE('2012-02-01 00:00:00',
'yyyy-mm-dd hh24:mi:ss'))
4 - access("C"."CDR_ID">="GET_RPT_ID"('MIN','CDR','MONTH','01-JAN-12'))


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2775993 consistent gets
174666 physical reads
0 redo size
1299893 bytes sent via SQL*Net to client
12626 bytes received via SQL*Net from client
1118 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16754 rows processed

----------------------------------
Without Hint

SELECT c.connect_starttime, s.service_name, c.charged_duration,c.charged_rate,
c.charged_amount, c.balance_after_transaction,c.cdr_id, c.service_id, NULL,
c.destination_number,
c.access_point_number, c.l1_charged_duration, c.l1_charged_rate,
c.l1_charged_amount,
c.originate_number, c.user_id, c.corporate_id
FROM cdr c LEFT JOIN service s ON c.service_id = s.service_id
WHERE c.cdr_id >= GET_RPT_ID ('MIN','CDR','MONTH','01-JAN-12')
AND c.connect_starttime >= TO_DATE ('01 JAN 2012 00:00:00', 'DD MON YYYY HH24:mi:SS')
AND c.connect_starttime <= TO_DATE ('01 FEB 2012 00:00:00', 'DD MON YYYY HH24:mi:SS')
AND c.status = '0'
AND c.corporate_id = 'borderless';

Elapsed: 00:00:23.25

Execution Plan
----------------------------------------------------------
Plan hash value: 3947246258

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19200 | 2231K| 256K (1)| 00:51:19 |
|* 1 | HASH JOIN RIGHT OUTER | | 19200 | 2231K| 256K (1)| 00:51:19 |
| 2 | TABLE ACCESS FULL | SERVICE | 18 | 360 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID| CDR | 19200 | 1856K| 256K (1)| 00:51:18 |
|* 4 | INDEX RANGE SCAN | CDR_PK | 477K| | 2041 (1)| 00:00:25 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("C"."SERVICE_ID"="S"."SERVICE_ID"(+))
3 - filter("C"."CONNECT_STARTTIME">=TO_DATE('2012-01-01 00:00:00',
'yyyy-mm-dd hh24:mi:ss') AND "C"."CORPORATE_ID"='borderless' AND
"C"."STATUS"='0' AND "C"."CONNECT_STARTTIME"<=TO_DATE('2012-02-01 00:00:00',
'yyyy-mm-dd hh24:mi:ss'))
4 - access("C"."CDR_ID">="GET_RPT_ID"('MIN','CDR','MONTH','01-JAN-12'))


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2776296 consistent gets
174666 physical reads
0 redo size
1299893 bytes sent via SQL*Net to client
12626 bytes received via SQL*Net from client
1118 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16754 rows processed

--------------------------------

You can see the execution plan are the same, but the cost is different

FIRST_ROW - cost = 141
No hint - cost = 256K

q1) so what is the difference between FIRST_ROW and ALL_ROW then in this scenario since both use the same plan ?

FIRST_ROW will return rows while the rows are being fetch ?
ALL_ROW will return rows only when all the rows are fetched finish ?

q2) why is it that the COST for FIRST_ROW is lower ? Is it the COST for fetching the 1st 10 rows or all the rows ?

q3) sorry to sidetrack abit, as to supplement q1) - i believe the steps in a sql cursor are
DECLARE - OPEN - FETCH - CLOSE

- when we open a cursor, the execution plan is being determined , am i right ?

- when we fetch, the rows are returned while the table is being scan through am i right ?

so does all_rows only return rows when the table is scan finished ? while first_rows return rows while the table is being scan ?

Thanks and hope to hear your advice.

Regards,
Noob

Tom Kyte
February 13, 2012 - 2:25 pm UTC

... query will be force to use a index
instead of a FTS ...

that was *never* true. no.


q1) it just means that the plans happened to be the same - that shouldn't necessarily be surprising. Using different optimization modes does NOT assure you get different plans, it only makes it possible.

q2) that was a side effect of the hint - see the cardinality of 10, you artificially lowered the cardinality. that is the number one thing that affects the cost.

q3) we parse, bind, execute, fetch, close a select statement. The plan is usually determined during the execute phase for a hard parse.

during the fetch phase, the rows might come from the table out of the buffer cache, or they might be coming from TEMP.


all_rows/first_rows does not affect this at all.


You should give serious consideration to adding corporate_id, status, connect_starttime (in that order) to the CDR_PK index or creating a new index with all four attributes in them. I would guess if you looked at a TKPROF, you get LOTS of rows flowing out of step 4, and around 17,000 rows flowing out of step 3. that means you go from the index to the table many more times than necessary.



using FIRST_ROWS and NOT using but having same explain plan.

A reader, February 14, 2012 - 4:01 pm UTC

hi tom,

thank you for your prompt response and very sorry for my late reply.

after reading what you wrote, i went to read through alot of stuffs from composite index, to clustering factor, to selectivity etc..

-------------------------------------
As what you mentioned ->

"you get LOTS of rows flowing out of step 4, and around 17,000 rows flowing out of step 3. that means you go from the index to the table many more times than necessary. "


q1) i saw from step 3 and 4
do you mean that on step 4, i do a index range scan and access 477K rows in the table and only do return 17,000 rows
after applying the filter condition on step 3. ?

q2) could you advise why is there a need to add in the CONNECT_STARTTIME column into the composite index if we already had CDR_ID inside the composite index

the CDR_ID returned via function GET_RPT_ID ('MIN','CDR','MONTH','01-JAN-12') will return the starting CDR_ID from 01-JAN-12 onwards.

q3) for the ordering of the composite index columns, is there any rules to try to put the higher selectivity column to the leading portion ?

as i am having a dilemna to put STATUS column, or CORPORATE column first

q4) read about selectivity and quoted from a website
< http://www.akadia.com/services/ora_index_selectivity.html >

"The ratio of the number of distinct values in the indexed column / columns to the number of records in the table represents the selectivity of an index. The ideal selectivity is 1. Such a selectivity can be reached only by unique indexes on NOT NULL columns."

so if i am having 1000 rows and and 22 distinct value on column A

22/1000 = 0.022 (it is not near to 1)
Would you say that this column have low selectivity ?

- on avg, 1000/22 = only 45 rows will be return out of 1000 rows for each value, to me, that is quite high selectivity isnt it ?

- somehow, i feel selectivity have to do with cardinality as well. I might have 22 distinct value ( 1 to 22 )

But if the column value "22" return only 1 row. I would have a very high selectivity (1 out of 1000 rows), isnt it ?

so if the formula wrong ? or am i misinterpreting ?

Hope to hear your opinion and advice.

Thank you.
Regards,
Noob







Tom Kyte
February 15, 2012 - 8:01 am UTC

q1) that is what the plan says, yes:

|*  3 |   TABLE ACCESS BY INDEX ROWID| CDR     | 19200 |  1856K|   256K  (1)| 00:51:18 |
|*  4 |    INDEX RANGE SCAN          | CDR_PK  |   477K|       |  2041   (1)| 00:00:25 |
----------------------------------------------------------------------------------------



it says - we think you'll find 477,000 rows in the index range scan, but only 19,200 of them will ultimately be returned.

That seems to be borne out by real life, since you retrieve about 17,000 rows in total.

q2) what composite index??? I don't know what indexes you have and what they are made up of. You didn't share that.

I'm looking at your predicate section:



   3 - filter("C"."CONNECT_STARTTIME">=TO_DATE('2012-01-01 00:00:00',
              'yyyy-mm-dd hh24:mi:ss') AND "C"."CORPORATE_ID"='borderless' AND
              "C"."STATUS"='0' AND "C"."CONNECT_STARTTIME"<=TO_DATE('2012-02-01 00:00:00',
              'yyyy-mm-dd hh24:mi:ss'))
   4 - access("C"."CDR_ID">="GET_RPT_ID"('MIN','CDR','MONTH','01-JAN-12'))




we would like to move the filter against the table into the access against the index - so that we only find 17,000 rows in the index instead of the current 1/2 million rows.

q3) no, there is not. Say you have a query like:

select * from table_with_millions_of_rows where column_with_five_values = :x and unique_column > :y;


I would definitely want the index to be on column_with_five_values,unique_column since we use equals on the one and a range on the other.

close your eyes and imagine how many blocks we would have to scan in the index in each case. If the low selectivity column was first - we'd go right to the section in the index where that column was that value and start scanning as unique_column > :y. Every single block we would read would be 100% full of data we were interested in. There would be no row on any leaf block that didn't match our predicate.

Now, if the unique_column was first - we'd have to range scan over every unique_column > :y - and that would include lots of rows in the index such that column_with_five_values was NOT equal to :x. We would in general scan 5 times as many rows in leaf blocks as necessary if the unique column were first.

You put things in your index based on how you use them in the queries. selectivity doesn't really have anything to do with the order of attributes in an index - it was/is a 'common sense MYTH' that a lot of people believe - but it is 100% wrong.

q4) selectivity only really has to do with cardinality estimates. (by the way: 15% is just wrong on that page. What they should say is:

you want to use an index to retrieve a few rows. You do not want to use an index to retrieve millions of rows.


If 15% is in the millions - you do not want an index. If 1% is in the millions, you do not want an index. It shouldn't be percentage based.)


Selectivity is used to determine cardinality, cardinality is used to determine the plan.

using FIRST_ROWS and NOT using but having same explain plan.

A reader, February 15, 2012 - 2:04 pm UTC

hi tom,

thank you so much for the explanation.

For question 3 on the index column ordering, i am drawn up an example using excel. i am not sure if this is what you meant.
SET A SET B
5_value unique unique 5_value
a 1 1 a
a 2 2 a
a 3 3 a
a 4 4 a
a 5 5 a
a 18 6 b
a 19 7 b
a 20 8 b
a 21 9 b
a 22 10 b
b 6 11 c
b 7 12 c
b 8 13 c
b 9 14 c
b 10 15 c
b 16 16 b
b 17 17 b
b 23 18 a
b 24 19 a
b 25 20 a
c 11 21 a
c 12 22 a
c 13 23 b
c 14 24 b
c 15 25 b
c 26 26 c
c 27 27 c
c 28 28 c
c 29 29 c
c 30 30 c
-----------------------------------------------------

So assuming my predicate is
where 5_value = 'b' and unique>=10

For set A, i will just go straight to B and start to range scan every row of B starting from b6 to b25

For set B, i will have to range scan from 1 to 30 and for each scan that is >10, i might not get the 5_value ='b'

q1) is that what you meant ?


======================================

q2) for range scan, does it go to every row to check the value ? e.g >10, does it start scanning from 1 onwards or does it start scanning from 10 onwards ?
as i am thinking if range scan goes directly to the value onwards, i might scan lesser blocks instead

e.g unique > 30 and 5_value = 'b'
for SET B, i will just go 30 and see if value is B, since 30 is last entry, thats it. no more.

for SET A, i will go to every of B and see if the value is more then 30. so i might scan more rows ?

could you elaborate abit further on how the search will be for both SET A and B


=================================


q3) how does a B-tree index for SET A and SET B look like ?
sounds to me like

for SET A, branch block are the 5_value column and leaf blocks are the unique column?

for SET B, branch block are the unique column, and leaf blocks are the 5_value column ?


Regards,
Noob
Tom Kyte
February 15, 2012 - 3:45 pm UTC

q1) in set A, you will go straight to 5_value = 'b' and unique >= 10 and just hit elements:

      b    10             
      b    16        
      b    17        
      b    23        
      b    24        
      b    25


If the index were created as set B, then you would find unique >= 10 and inspect each one for 5_value = 'b', you would have to process:

  10 b
  11 c
  12 c
  13 c
  14 c
  15 c
  16 b
  17 b
  18 a
  19 a
  20 a
  21 a
  22 a
  23 b
  24 b
  25 b
  26 c
  27 c
  28 c
  29 c
  30 c 




q2) it would start at 10 in the index (set b) and scan forward from there.

However, it would STILL make sense to put 5_value first in the index even with a predicate like "unique > 30 and 5_value = 'b'", it will go to B and unique > 30 and find "nothing" and stop.

It would take as many IO's and as much work to do "unique>30 and 5_value='b'" regardless of the ordering of the columns in the index. Neither would have an advantage in that case. However, in pretty much EVERY OTHER case - the ordering of set A above would result in *less work* being performed.

In other words, at its very rare best, set B is as good as set A, however in most all cases, set B is much worse than set A.


q3) the branch blocks would have both columns, the leaf blocks would have both columns. The entire key is present at both levels.

We'll navigate the branch blocks to find the first leaf block that

in the case of set A, has 5_value='b' and unique>=30

in the case of set B, has unique >= 30, we cannot do anything about 5_value at that point, we are forced to look at every row such that unique >= 30




using FIRST_ROWS and NOT using but having same explain plan.

A reader, February 16, 2012 - 12:46 pm UTC

hi tom,

thank you very much. i get the picture now.
my very last question on

"we parse, bind, execute, fetch, close a select statement. The plan is usually determined during the execute phase for a hard parse."

during execution, it will start scanning through the buffer / disk IO for the required data we need.

during fetching, it will be returning the data/rows to the user.

my question is ->

does execution and fetching occur concurrently or sequentially ?

(like ..scan the disk, bring to buffer, found some rows satisfying the condition, fetch and return)

or

(scan the disk, bring to buffer, find all the rows 1st, then return ?)

or both (if there is a order by, must order 1st before you can return)

Regards,
Noob

Tom Kyte
February 17, 2012 - 5:18 am UTC

does execution and fetching occur concurrently or sequentially ?


yes. both happen.

Say you have:

select * from big_table;

that is what I call a fast return query, it returns rows way before encountering the last row. it can return the first row IMMEDIATELY - without having to read everything.

In fact - for that query - the execute phase would probably do ZERO IO's to the table. the first fetch would do an IO, the second fetch would pick up where the first left off.


However, select x, count(*) from big_table group by x order by x;

would probably read the entire table on the first fetch - build the entire result set into temp (memory and/or disk) and then return the answer from there.

So, both can happen...

Not percentage based?

Matthew McPeak, February 16, 2012 - 2:08 pm UTC

Tom,

I am wondering about your comment:

>> If 1% is in the millions, you do not want an index. It shouldn't be percentage based

Time for school again for me, maybe. Can you please point out what's wrong with my reasoning below?

If 1% is in the millions, then a FTS is reading 100's of millions of rows.

I always viewed it like this:

1) Cost of reading a table via FTS is, more or less, # rows/rows per block/blocks per I/O (that is, MBRC).
2) Cost of reading a row via index unique scan is, more or less, depth of index + 1 (to get the table block).

So, if you have a 100 million row table, cost of FTS might be 100M/20 rows per block/8 blocks per read = 625,000.

Cost per row via index might be (3+1=)4. So, if you were reading less than 625K/4 or 156,250 rows, the index would be best.

That puts the break-even at 0.15% of the rows in the table, in this particular example.

I'm not really concerned about the exact calculations for the various costs (though I welcome all the knowledge you want to share!). But I'd really like to know whether I need to change my conceptual understanding of this issue.

That is, whether I need to change my believe that the break-even between index access and FTS is, in general, a percentage of the number of rows in the table, though it may vary from table to table and system to system.

Thanks!

Tom Kyte
February 17, 2012 - 5:24 am UTC

it depends on the row size, the distribution of the rows in the table and so on. There is no single percentage for "any" table.


Here is the answer I will give you:

if you are retrieving more than many hundreds of thousands of rows - and you are planning on using an index - you will be waiting for at least an hour in general. Either

a) live with that FACT
b) use a different approach - such as partitioning the data so that you can full scan a smaller partition to retrieve your data.


Every index range scan -> table access by index rowid has a cost (in time) associated with it. It might be small (5/1000th of a second if you have to do a physical IO), but if you do it often enough it adds up (fact of math: BIG number * small number = still a BIG number...)


It is a number of rows. that is why the percentage of rows accessed in a table has dropped over the years.

When I started doing database stuff, the rule of thumb (ROT) was "25%".
Later it dropped to 20%
then 15%
then 12%
then 10%
then lower and lower.

Was it because the percentage really changed? NO.

It was because tables kept getting bigger and bigger and bigger - the ROT adjusted the % in order to limit the number of rows being returned.


If you are going to get a million rows from a table
And you are planning on using an index to access the table
You are talking about an hour long process....

do you want to do that?

A reader, February 17, 2012 - 10:18 am UTC

Dear Sir,

Thanks for your valuable and great suppoert to Oracle communtiy!!

but say there are 1 billion records and I am selecting 1% out of it than
index will be used provided other factor are favouring it right?

don't it be like that smaller the % of the selection (selectivity) of data from total records of the
table index is more favorable?

Can you please through more light on this?



Tom Kyte
February 17, 2012 - 8:56 pm UTC

if you are getting 10,000,000 records out via an index, and assuming the data isn't cached for the most part (reasonable assumption) - then you might have to do 10,000,000 single block reads against this table - each of which takes about 5ms.

Now, we are assuming the index is entirely in memory and "free" to access - it takes 0.0 cpu seconds to access the index (which is not true, but let's assume that for a minute).

then you will wait



ops$tkyte%ORA11GR2> select 10000000*5/1000/60/60 from dual;

10000000*5/1000/60/60
---------------------
           13.8888889


about 14 hours.


what do you think of that - are you willing to wait?

Or, would you think about using compression (less data to scan), partitioning (to hopefully eliminate lots of data - and full scanning using nice big juicy multiblock IO reading 1mb at a time - in about the same amount of time you are reading say 8 or 16k?


I'd rather full scan using nice big multiblock IO in all likelihood - I'm not patient enough to wait 14 hours.


The only case where the index would be useful here would be if the data was natrually 'clustered' together - that is, the 10,000,000 rows you were interested in where all together on the same blocks - so that you are only hitting 1% of the blocks in the table - and the clustering factor of the index was very good (near the number of blocks in the table) so that once we read a block into the cache - we would subsequently and immediately read all of the other rows off of that block - so we would not be reading and re-reading blocks over and over again.

And if that was the case - then the argument for partitioning becomes EVEN STRONGER - meaning, I still wouldn't use the index, I'd be scrambling to fix the design of that table to utilize partitioning - it would be a natural fit.

Reorganize the index as clustering factor is high

Bhaskar, April 18, 2012 - 1:26 am UTC

Hi Tom,

Firstly thanks for your all valuable suggestions here.
I have the below data for a table(say tab1).
block-352
row=86540
CF-30675

Now primary key(say pk1) has three columns in the table in the below column sequence.

pk1-(col1,col2,col3)

Now col1 has 81309 distinct values
col2 has 31342 distinct values
col3 has 61 distinct values.

Now this pk1 i think needs to be reorganized.

In which way i will re sequence column order in pk1?


Thanks in advance.
Tom Kyte
April 18, 2012 - 3:15 am UTC

why do you think this needs to be reorganized?


first of all, a primary key index is TYPICALLY used to find a single row. when going for a single row (or a small number of rows) the clustering factor isn't really relevant. It is only when you start doing large range scans that the clustering factor would come into play.

secondly, the order of columns in an index is first and foremost based on how the attributes are used in the predicate. If you use "=" on all three, all of the time - then the order would not be relevant performance wise since we will get a single row (cluster factor is not relevant there - cost of using the index will be index height plus 1 regardless of the clustering factor). If you use "=" on col1 and col2 but <, >, etc on col3 - then the order of col1 and col2 can be debated - but col3 would be last. and so on. One would need to understand the general type of predicate used on these columns.


If you use = on all three most all of the time, I would put col3 first and use "compress 1" on the index perhaps to get a smaller index.


but, explain why you think it needs to be reorganized and we'll discuss the merits of that.

Reorganize the index as clustering factor is high

Bhaskar, April 18, 2012 - 1:28 am UTC

Sorry in the previous post col2 has 81342 distinct values.
Tom Kyte
April 18, 2012 - 3:16 am UTC

see above.

Bhaskar, April 18, 2012 - 4:22 am UTC

Thnaks Tom for the useful information.

Actually we have a module which runs every night.In that module we have a cursor statement which uses tab1.

Now as stated by you
"If you use = on all three most all of the time" in that cursor all columns in pk1 are used by = operator.

This module runs for nearly 4 hours and in the statspack report we ahve found the below usasge of the index pk1.

93% of all logical reads between were against the index pk1 below;

index logical reads %total

PK1 INDEX 644,288,752 93.26

Also query execution plan shows pk1 index has been used by unique scan.We have three other huge table in this cursor joining which returns huge amount of data.

Now as stated by you below
"I would put col3 first and use "compress 1" on the index perhaps to get a smaller index.
"
i was thinking we are not reaching to the index in smarter way and in the current index column sequence we are traversing much than needed to reach the data.

As mentioned by you above would you like column sequence to be as col3,col1,col2?
Also i didn't get the compress 1 stated by you.Can you please explain this.


Thnaks in advance

Tom Kyte
April 18, 2012 - 8:07 am UTC

think about it for a minute. If you are using an index to do a single row fetch - how many IOs (how much work) would be performed?

It would read the index to get a rowid to get a table block.

Now, if the index is on c1,c2,c3 - the height of the index would be N. The amount of work would be N+1 - N io's in the index, 1 io against the table.

Now, if the index is on c3,c2,c1 - what would the size of the index be? It would be the same as if the index were on c1,c2,c3. The height would be the same in general. The work performed would be the same. It takes the SAME EXACT AMOUNT OF WORK to search for c1,c2,c3 as it does c3,c2,c1 as it does c2,c1,c3, etc etc etc. The order doesn't matter, the height does.


You know, if you compress it - you would end up possibly with a smaller index - with a shorter height ;)


The order of columns in the index is NOT BASED ON SELECTIVITY. It is based on how they are used in the where clause (things with = go first, things with <, > and such go last). All things considered equal - if all things use =, then you might consider putting the least selective first in order to compress the index.



Think about how the index is traversed - it is NOT a binary tree, it is a b*tree - we only traverse N blocks where N= height of index!!! The order of the columns do not matter.

Reorganize the index as clustering factor is high

Bhaskar, April 18, 2012 - 11:31 pm UTC

Hi Tom,

Thnaks for the detailed planation.
I have some below doubts.it would be great if you can help me in understanding them.

"The order doesn't matter, the height does."

When designing a composite index how we will say we have created the index with minimal height so that to find a rowid in this index we will traverse minimal path.
I was thinking if we put less selective column (like c3 in my example) first and then c1 and then c2 the height of the index would be less.

"he order of columns in the index is NOT BASED ON SELECTIVITY. It is based on how they are used in the where clause (things with = go first, things with <, > and such go last). All things considered equal - if all things use =, then you might consider putting the least selective first in order to compress the index. "

Now as per the above statement by you are you saying the height of index doesn't matter in which sequence we put columns in composite index?And if we use a sql query where col3 is used with= col2 with > and col1 with < then height would matter in which way we have put columns in composite index?
Then my assumption as i have stated above is wrong because i read somewhere that we need to chhose the column sequence in a composite indesuch way less selective column first.

Also in my case the bugging sql query has used all three columns i.e. c1,c2,c3 are used with = operator.
So shall i go for compression ?

Also i have another doubt.
As in my case CF is near to the number of rows not block.So data in the index is not properly ordered as in in the table.So when a complex query will be using this table will it be a problem?
And as i read we would try to reduce CF so that it would be near to number of rows what preventive measure i need to take so that my existing index would be redisgne dso that its CF should be near the number of blocks in the table.

Please help me because i think my concept are wrong in some mattes here.

Thanks in advance







Tom Kyte
April 19, 2012 - 5:40 am UTC

Reorganize the index as clustering factor is high

You've got to stop saying that. It is so wrong. A clustering factor is a measurement of how sorted a table is with respect to an index key. It is a measure of THE TABLE, not the index. I don't care how many times you reorganize (rebuild) an index - it will not change the cluster factor.

Perhaps you mean "changing the order of attributes in a concatenated index", but I know you don't mean "reorganize" in the database sense.


When designing a composite index how we will say we have created the index with minimal height so that to find a rowid in this index we will traverse minimal path.
I was thinking if we put less selective column (like c3 in my example) first and then c1 and then c2 the height of the index would be less.


the order of the columns won't matter in general. The height of the index is mostly a function of the size of the key (order won't matter there) and the number of rows being index (order won't matter there)

remember, this is NOT a binary tree, this is a b*tree, completely different. I think you are hung up on a binary tree representation in your mind.

http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm#CBBECABB




Now as per the above statement by you are you saying the height of index doesn't matter in which sequence we put columns in composite index?And if we use a sql query where col3 is used with= col2 with > and col1 with < then height would matter in which way we have put columns in composite index?

Not, not even remotely.


You want to put things with = first, before <, > - so we can zero in on the rows we are interested in. Close your eyes and envision the work that would have to be performed.

Let's say you have a column A with 5 distinct values 1..5. Let's say you have a column B with 1,000,000 distinct values from 1..1,000,000. Let's say there are 5,000,000 rows in the table (each distinct value of B has the five values of A associated with it)

Suppose you have a where clause:

where A = 5 and B between 100 and 200;


Now, if you have an index on (B,A) you would have to search through:

(100,1)(100,2)(100,3)(100,4)(100,5)(101,1)(101,2)....(200,1)(200,2)(200,3)(200,4)(200,5)

since the index would be stored sorted by B,A. that would be 505 entries you would have to inspect.


Now, assume the index is on (A,B), you would have to search through:

(5,100)(5,101)(5,102).....(5,199)(5,200)

You would have 101 entries to inspect in the index.

both indexes would have about the same height (in general) so it would take the same amount of work to find the first leaf entry regardless of column order - but one of the indexes would make you look through 505 leaf entries and the other 101 - which one makes sense to use?


So shall i go for compression ?


it is an option, however it isn't going to do anything for you. You are reading the table by primary key - at most - this will reduce the height by maybe one - if you are lucky. that is it. I believe you are focusing on the wrong problem here.

If the query: select ... from t where primary key = some value - is your problem query, it isn't because the query performs slowly, it is because you execute the query a billion times. if this query is in your batch process - the query isn't the problem, your batch process is. it is doing things slow by slow. You'll need to look at the batch process and figure out how to make it use SQL - really use SQL - not silly little tiny SQL - but big bulking, efficient, does a lot of work, removes hundreds or thousands of lines of inefficient, bad, buggy procedural code.




stop getting hung up on the clustering fact - look at your code and reduce the amount of procedural code you have. use sql more fully.

Bhaskar, April 19, 2012 - 11:41 pm UTC

Hi Tom,

Many thnaks for your help which now clears my confusions.

"is s a measure of THE TABLE, not the index. I don't care...."

Thnaks for this clarification.
Now if the CF is high for a table and i know multiple dml operations are performed in that table what thing we need to do to lower down CF so that it comes near to the number of the blocks in the table.

If my table data is badly sorted and it is not sorted as data in the index certainly we will have performance problem when idex will search data in table.

"the order of the columns won't matter in general. The ..."

Thanks for this clarification.
Now how this function computes the height of the balanced tree?As i know branch blocks are decided in such way so that quickly we can reach the leaf blok.Is there any logical algorithm is there for this?

"Let's say you have a column ..."

many thanks for this clarification.
So as per the example stated by you there will be no difference to raech the leaf blocks as because height will be the same but while traversing leaf blocks index search will be better if we put a,b instead of b,a.
So in such scenario we will always try to arrange the index like (a,b) not thinking it will reduce the height of index(like me :) ) but thinking how searching in leaf blocks would be better.

Now as stated by you let's say a table tab1 has this (a,b) composite index.If this table is joined with other tables how the selectivity as described by you will work?


"it is an option, however it isn't ..."

Again many thanks for this clarification.
Actually i have checked that the bugging sql query is called my module in a particular time and only in single call.But as dml operations are high and data volume is high in the time when this module runs we have found very high 644,288,752 consistent gets for pk1(as in mentioned above).
Now no other modules run or access this pk1 that particular time.
We are in process of tuning this query also i was thinking if we can change the index column sequence so that it benefits index searching.

Thanks in advance.

Tom Kyte
April 20, 2012 - 4:10 pm UTC

Now if the CF is high for a table and i know multiple dml operations are
performed in that table what thing we need to do to lower down CF so that it
comes near to the number of the blocks in the table.


how did you arrive at that conclusion.

the amount of DML doesn't matter. It doesn't come into play.

the clustering factor is a metric used by the optimizer to figure out when it would be more efficient to full scan for a given predicate - over an index range scan. the clustering factor tells the optimizer how "efficient" a given index would be for a large range scan.

If the clustering factor is near the number of blocks, the index will be used to retrieve more rows from a range scan against that table than if the index has a clustering factor near the number of rows in the table. It is a measure of how many IO's it would take to read the table out via an index. the smaller the number, the more "efficient" that index is for a LARGE range scan.

It won't really affect a primary key lookup ever - since you are getting ONE row - not a large amount.


If my table data is badly sorted and it is not sorted as data in the index
certainly we will have performance problem when idex will search data in table.


that is NOT certainly true and in your case, NOT TRUE at all. You are using a primary key, you are getting a single row. the clustering factor could be infinity and it wouldn't matter. It only comes into play for a large scan.


And another thing wrong with your thinking is this - you typically have more than one index on a table - and you typically can have a table sorted by ONE THING, one set of attributes. For example, lets say you have an EMP table and an index on ENAME and an index on HIREDATE.

Since rows are added to the table in order of hiredate - the clustering factor on hire date will be low. since you DO NOT hire people in alphabetical order - the clustering factor on ENAME will probably be very high.

If you were to "fix" the ename index - what would you have done to the hiredate clustering factor? if you reorg the table to be sorted by ename - will it still be sorted by hiredate?


Now how this function computes the height of the balanced tree?As i know branch
blocks are decided in such way so that quickly we can reach the leaf blok.Is
there any logical algorithm is there for this?


google up b*tree indexes and read about them if the concepts guide did not have sufficient information.

Bhaskar, April 23, 2012 - 1:02 am UTC

Hi tom,

Many thanks for your all posts in this.It now clears all of my doubts and misconceptions.

Thanks for your patience and help.

clusterring_factor confusion

Biswaranja, June 02, 2012 - 6:31 am UTC

Hi Tom,

I created some table and then created index to see clustering factor ,got some doubt.

I created as below.


create table a(a number);
insert into a select to_char(level) from dual connect by level<10000000;

create index a1 on a(a);
analyze table a compute statistics;

SQL> select BLEVEL,leaf_blocks,CLUSTERING_FACTOR from user_indexes where table_name='AAAA';

    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         2        2226              1516


###################################################
create table aa(a varchar2(8));
insert into aa select level from dual connect by level<10000000;--- I also tried with to_char(level)

create index aa1 on aa(a);
analyze table aa compute statistics;

SQL> select BLEVEL,leaf_blocks,CLUSTERING_FACTOR from user_indexes where table_name='AA2';

    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         2       26316           1916062


#####################################################

create table aaa(a char(8));
insert into aaa select level from dual connect by level<10000000;--- I also tried with to_char(level)

create index aaa1 on aaa(a);
analyze table aa compute statistics;

SQL> select BLEVEL,leaf_blocks,CLUSTERING_FACTOR from user_indexes where table_name='AA3';

    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         2       27856           1917323

########################################################

Can you please tell why clusterring_factor is highly deviated from number of blocks  when I created the index on  non number datatype column in above cases.

Is it possible to reduce the clusterring_factor in this above case.



Thanks as always,
Biswaranjan

Tom Kyte
June 04, 2012 - 8:17 am UTC

Let me ask you a question, why on earth would you do this:

create table a(a number);
insert into a select to_char(level) from dual connect by level<10000000;



why would you to_char level - since it has to be inserted into a number field? that is just like:

create table a(a number);
insert into a select <b>TO_NUMBER(to_char(level))</b> from dual connect by level<10000000;


I know this is just a demo, but people code into their demos what they do in real life. I hope you do not do anything remotely similar to this in real life????




Also:

analyze table a compute statistics;


please do not do that, that command is deprecated. You gather statistics on a table by using dbms_stats.

Furthermore, you didn't even need to do that to see the clustering factor. Statistics are computed on index automagically in 10g and above during the create index, all you need was the create index statement for the information you displayed.


But - let's look at this in detail. Which table do you think is larger? The number implementation or the varchar2 implementation? We see that the number of index leaf blocks is more than in the varchar2 index than in the number index (storing a number in a varchar is ALWAYS a bad idea, for many reasons - space being one of them).

So, which do you think is larger - the number table or the varchar2 table?

Since the clustering factor is by definition:


the number of IO's that would be performed in order to read the entire table out via the index in a single range scan

this can optimally be near the number of blocks in the table (smallest number of IO's possible after all) - or, assuming more than one row exists per block, the number of rows in the table (which would tend to maximize IO's assuming more than one row per block)


The minimum value - *the minimum value* for the clustering factor is going to be the number of blocks in the table

it would logically follow, if one table is much larger than another table - the clustering factor on the larger table *has* to be larger than the clustering factor on the smaller table.

The clustering factor is at least the number of DATA blocks in the table (assuming a non-null value, which we have here!). If the table is larger, the clustering factor MUST be larger.


That accounts for part of the issue here - the table of varchar2 is obviously larger (how much so - we'll see below). But it doesn't account for all of it - we need to look at the data.


You *think* you loaded two tables with the same data - stored once (properly) as a number and stored again (totally wrong) as a string. But you didn't - you have two entirely different sets of data! Consider:

ops$tkyte%ORA11GR2> with data (r)
  2  as
  3  (select 1 r from dual union all select r+1 from data where r<= 1000 )
  4  select * from data order by to_char(r);

         R
----------
         1
        10
       100
      1000
      1001
       101
       102
       103
       104
       105
       106
       107
       108
       109
        11
       110



Look at that, 1, 10, 100, 1000, .... are first. The data in the index would have 1, 10, 100, 1000 FIRST. Well, in order to process that first leaf block - how many IO's against the table would you have to perform? Remember, the data in the table is:

1,2,3,4,5......

the table is ordered NUMERICALLY, the index is ordered using CHARACTER STRINGS. A number in a character string sorts entirely differently than a number *in a number*.

So, this accounts for the SECOND issue you have - your data in the table is not sorted in the same fashion as the data in the index!!!!!! By definition pretty much that will increase the clustering factor.

So, the clustering factor is higher in the varchar2 table because:

a) you used a lot more space, therefore the lowest possible clustering factor is MUCH HIGHER

b) you loaded the table in an unsorted fashion - we have to do reads and re-reads of blocks over and over and over because of that. With the numbers - the table data and index data were *sorted* identically


Here is a complete example showing the numbers you would really want to report on:

ops$tkyte%ORA11GR2> create table t_number(x number);

Table created.

ops$tkyte%ORA11GR2> insert into t_number
  2  with data(r)
  3  as
  4  (select 1 r from dual
  5    union all
  6   select r+1 from data where r < 10000000
  7  )
  8  select * from data;

10000000 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T_NUMBER' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create index t_number_idx on t_number(x);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t_varchar2(x varchar2(8) );

Table created.

ops$tkyte%ORA11GR2> insert into t_varchar2
  2  with data(r)
  3  as
  4  (select 1 r from dual
  5    union all
  6   select r+1 from data where r < 10000000
  7  )
  8  select * from data;

10000000 rows created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_varchar2_idx on t_varchar2(x);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T_VARCHAR2', cascade => true );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select t.table_name, i.index_name,
  2         i.blevel, i.leaf_blocks, i.clustering_factor cf,
  3         t.num_rows, t.blocks
  4    from user_indexes i, user_tables t
  5   where t.table_name in ( 'T_NUMBER', 'T_VARCHAR2' )
  6     and t.table_name = i.table_name
  7  /

TABLE_NAME INDEX_NAME     BLEVEL LEAF_BLOCKS       CF   NUM_ROWS     BLOCKS
---------- -------------- ------ ----------- -------- ---------- ----------
T_NUMBER   T_NUMBER_IDX        2       23553    15152   10000000      15962
T_VARCHAR2 T_VARCHAR2_IDX      2       27944  2055865   10000000      18257



ops$tkyte%ORA11GR2> create index t_varchar2_idx2 on t_varchar2(to_number(x));

Index created.


ops$tkyte%ORA11GR2> select t.table_name, i.index_name,
  2         i.blevel, i.leaf_blocks, i.clustering_factor cf,
  3         t.num_rows, t.blocks
  4    from user_indexes i, user_tables t
  5   where t.table_name in ( 'T_NUMBER', 'T_VARCHAR2' )
  6     and t.table_name = i.table_name
  7  /

TABLE_NAME INDEX_NAME      BLEVEL LEAF_BLOCKS       CF   NUM_ROWS     BLOCKS
---------- --------------- ------ ----------- -------- ---------- ----------
T_NUMBER   T_NUMBER_IDX         2       23553    15152   10000000      15962
T_VARCHAR2 T_VARCHAR2_IDX2      2       23553    17769   10000000      18257
T_VARCHAR2 T_VARCHAR2_IDX       2       27944  2055865   10000000      18257





Now, I cannot reproduce your number of leaf blocks in the number index - however you didn't show all of your work so we don't really know what you did. But we can see that the varchar2 implementation is LARGER overall - therefore the clustering factor is larger overall.

additionally - we know by the way strings sort versus numbers - the data in the varchar2 table is NOT SORTED by the index key.


So - how to fix all of this?

Simple: use the right type, use a number. You'll use less storage, you'll be able to retrieve your data in actual numeric sorted order, you'll know you have numbers, etc etc etc


do not use the function based index as I did - that would be silly (convert the varcahr2 column into a number!) I only did that to show that the clustering factor was affected predominantly by the fact your data is not sorted in the table - and even if it were, the clustering factor would be higher simply because the table is larger!!!!

got it :)

Biswaranja, June 02, 2012 - 8:37 am UTC

Hi Tom,

I researched myself on my question about clusterring_factor and understood the reason behind 
that high clusterring_factor :) :) :). We are inserting as number sequence but in index it is taking it as 
string.

A
--------
1
10
100
1000
10000
100000
1000000
1000001
1000002
1000003
1000004     ----------like this in index

but we are inserting into table as number sequence  ,so that caused the high clusterring factor.
I think I understtod correctly but just want to listen from you whether I understood the it was or not.



Then I did something to reduce clusterring factor and which worked fine.

create table aa_temp as select * from aa;
truncate table aa;

Then I inserted in to the "aa" table with below query.(without no_merge i was doing initially but got 
frustrated and saw one of your post using no_merge :)).

insert into aa
    select /*+ NO_MERGE(a) */ *
      from (select /*+ NO_MERGE */ *
              from aa_temp where a<10000000
             order by a) a;
SQL> select BLEVEL,leaf_blocks,CLUSTERING_FACTOR from user_indexes where table_name='AA'; 
leaf_blocks  clusterring_factor
26316         17762

I am very happy that I am developing myself about performance tuning.  thanks for your help..........

Thanks,
Biswaranjan.

Tom Kyte
June 04, 2012 - 8:30 am UTC

but also remember this - do not get hung up on the clustering factor. It is a reason an index might not be used, that is the primary reason for knowing of its existence - so you can explain things.


In general, you won't be reorganizing a table to "improve the clustering factor" (as was stated originally!). for you see - if you do that, you'll probably destroy the clustering factor of other indexes.

for example, think of the EMP table. It is normally inserted into and not deleted from. So, naturally - the rows in the table will be generally "in order of EMPNO (sequence) and HIREDATE (sysdate)". The rows will NOT be in order of DEPTNO, or of ENAME, or of any other column in general.


So, the clustering factor on the EMPNO column will be "good", the clustering factor on the HIREDATE column will be good.

The clustering factors on the rest would be "bad".


So a query like "select * from emp where ename like 'K%' " would be used to retrieve FEWER rows before resulting in a full scan than a query such as: "select * from emp where hiredate between :x and :y" would be due to the clustering factor.

Now, if you reorg the table by ENAME - the roles would be reversed, the ENAME index would be used to retrieve more rows from the table than the HIREDATE index - sounds all good for ename searches

Until you go to retrieve stuff by hiredate, then it is not so good.



wow...

Biswaranjan, June 04, 2012 - 8:49 am UTC

Hi Tom,

thank you very much for this nice explanation.
I dont use like i showed above to_char() (by mistake copied from sql developer).

But I will sure remember about "analyze table a compute statistics" not to use.

from My recent post I thought I got everything about clustering factor but when I saw your empno and hiredate example again my concept got clear(single line but contains much things).

I am jealous about your knowledge( hahaha joking).

I just want to say , many doubt to come and many to be cleared.

Thanks as always,
your fan.

Try to get a better understanding of clustering factor

A reader, November 20, 2012 - 2:04 pm UTC

In the example below, I create two tables with the same data, but one with good clustering factor, the other bad clustering factor. When I run the similar SQL on these tables, I expect the good table to consume far less I/O's than the bad table would. That doesn't seem to be the case. Can you please explain why? Thanks.


SQL> 
SQL> SELECT * FROM v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for 64-bit Windows: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> 
SQL> -- ------------------------------------------------------------------------
SQL> -- Create a table with bad clustering factor
SQL> -- ------------------------------------------------------------------------
SQL> DROP TABLE bad;

Table dropped.

SQL> 
SQL> CREATE TABLE bad (
  2   id INTEGER
  3  );

Table created.

SQL> 
SQL> BEGIN
  2   FOR i IN 1..1000 LOOP
  3      INSERT INTO bad
  4      SELECT level
  5      FROM   dual
  6      CONNECT BY level <= 10000;
  7   END LOOP;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> 
SQL> COMMIT;

Commit complete.

SQL> 
SQL> CREATE INDEX bad_idx ON bad(id);

Index created.

SQL> 
SQL> -- ------------------------------------------------------------------------
SQL> -- Create a table with good clustering factor
SQL> -- ------------------------------------------------------------------------
SQL> DROP TABLE good;

Table dropped.

SQL> 
SQL> CREATE TABLE good (
  2   id INTEGER
  3  );

Table created.

SQL> 
SQL> BEGIN
  2   FOR i IN 1..10000 LOOP
  3      FOR j IN 1..1000 LOOP
  4         INSERT INTO good VALUES (i);
  5      END LOOP;
  6   END LOOP;
  7  END;
  8  /

PL/SQL procedure successfully completed.

SQL> 
SQL> COMMIT;

Commit complete.

SQL> 
SQL> CREATE INDEX good_idx ON good(id);

Index created.

SQL> 
SQL> SELECT index_name, blevel, leaf_blocks, distinct_keys, num_rows, clustering_factor
  2  FROM   user_indexes
  3  WHERe  index_name IN ('BAD_IDX', 'GOOD_IDX');

INDEX_NAME                         BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- ----------- ------------- ---------- -----------------
BAD_IDX                                 2       20893         10000   10000000          10000000
GOOD_IDX                                2       20893         10000   10000000             17255

SQL> 
SQL> 
SQL> col Cnt for 999,999,999,999
SQL> SELECT COUNT(*) Cnt FROM bad;

             CNT
----------------
      10,000,000

SQL> SELECT COUNT(DISTINCT id) FROM bad;

COUNT(DISTINCTID)
-----------------
            10000

SQL> SELECT COUNT(*) Cnt FROM good;

             CNT
----------------
      10,000,000

SQL> SELECT COUNT(DISTINCT id) FROM good;

COUNT(DISTINCTID)
-----------------
            10000

SQL> 
SQL> set autotrace traceonly
SQL> 
SQL> SELECT * FROM bad WHERE id BETWEEN 1 AND 1000;

1000000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1170211543

----------------------------------------------------------------------------
| Id  | Operation        | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |         |  1085K|    13M|  2295   (1)| 00:00:28 |
|*  1 |  INDEX RANGE SCAN| BAD_IDX |  1085K|    13M|  2295   (1)| 00:00:28 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID">=1 AND "ID"<=1000)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          9  recursive calls
          1  db block gets
      68856  consistent gets
       2078  physical reads
          0  redo size
   17204330  bytes sent via SQL*Net to client
     733849  bytes received via SQL*Net from client
      66668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

SQL> SELECT * FROM good WHERE id BETWEEN 1 AND 1000;

1000000 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3406690345

-----------------------------------------------------------------------------
| Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |          |   796K|     9M|  1740   (1)| 00:00:21 |
|*  1 |  INDEX RANGE SCAN| GOOD_IDX |   796K|     9M|  1740   (1)| 00:00:21 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID">=1 AND "ID"<=1000)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          9  recursive calls
          1  db block gets
      68854  consistent gets
       2078  physical reads
          0  redo size
   17204330  bytes sent via SQL*Net to client
     733849  bytes received via SQL*Net from client
      66668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000000  rows processed

Tom Kyte
November 20, 2012 - 3:49 pm UTC

did you notice that neither of them went to the table???

the clustering factor is a measure of how sorted a TABLE is with respect to it's index key.

since you never hit the table, the clustering factor simply didn't matter.

of course both would do the same number of IO's - they both only read the index and the indexes in this case are *identical*


You have to take a query that goes from the index *to the table* to see any effect. for example:

ops$tkyte%ORA11GR2> create table organized
  2  as
  3  select *
  4    from all_objects
  5   order by object_id;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'ORGANIZED' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> alter table organized add constraint organized_pk primary key(object_id);

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table disorganized
  2  as
  3  select *
  4    from organized
  5   order by dbms_random.random;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'DISORGANIZED' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> alter table disorganized add constraint disorganized_pk primary key(object_id);

Table altered.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select i.index_name, i.clustering_factor, t.blocks, t.num_rows
  2    from user_indexes i, user_tables t
  3   where t.table_name in ('ORGANIZED','DISORGANIZED')
  4     and t.table_name = i.table_name
  5  /

INDEX_NAME                     CLUSTERING_FACTOR     BLOCKS   NUM_ROWS
------------------------------ ----------------- ---------- ----------
DISORGANIZED_PK                            72882       1065      72951
ORGANIZED_PK                                1039       1065      72951

<b>so we can see the data is constructed as we desire - the organized table has a 'good' clustering factor on object id - very near the number of blocks in the table, while the disorganized table has a 'bad' one - near the number of rows.

now for a query that goes from index to table for every row:</b>

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_monitor.session_trace_enable();

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select /*+ index( organized organized_pk ) */ count(subobject_name)
  2    from organized
  3   where object_id > 0;

COUNT(SUBOBJECT_NAME)
---------------------
                  667

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select /*+ index( disorganized disorganized_pk ) */ count(subobject_name)
  2    from disorganized
  3   where object_id > 0;

COUNT(SUBOBJECT_NAME)
---------------------
                  667


<b>and tkprof says:</b>


select /*+ index( organized organized_pk ) */ count(subobject_name)
  from organized
 where object_id > 0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.12       0.12        152       1192          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.12       0.12        152       1192          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 931
Number of plan statistics captured: 1

Row Source Operation
---------------------------------------------------
SORT AGGREGATE (cr=1192 pr=152 pw=0 time=124309 us)
 TABLE ACCESS BY INDEX ROWID ORGANIZED (cr=1192 pr=152 pw=0 time=98651 us cost=1194 size=510657 card=72951)
  INDEX RANGE SCAN ORGANIZED_PK (cr=153 pr=152 pw=0 time=33548 us cost=154 size=0 card=72951)(object id 131875)

<b>if we take 1192 and subtract the 153 IO's we did on the index - we get 1039 - the exact clustering factor for the index...</b>


select /*+ index( disorganized disorganized_pk ) */ count(subobject_name)
  from disorganized
 where object_id > 0

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.24       0.24        152      73035          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.24       0.24        152      73035          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 931
Number of plan statistics captured: 1

Row Source Operation
---------------------------------------------------
SORT AGGREGATE (cr=73035 pr=152 pw=0 time=242386 us)
 TABLE ACCESS BY INDEX ROWID DISORGANIZED (cr=73035 pr=152 pw=0 time=215021 us cost=73058 size=510657 card=72951)
  INDEX RANGE SCAN DISORGANIZED_PK (cr=153 pr=152 pw=0 time=38186 us cost=154 size=0 card=72951)(object id 131877)


<b>
and 73035-153 = 72882, the exact clustering factor of the disorganized index</b>



Clustering factor smaller than number of blocks

Horia Berca, May 10, 2013 - 10:11 am UTC

Hello Tom,

I wanted to ask what influences how the clustering factor is computed that sometimes results in a value lower than the number of blocks for the table.
I tried to alter the cpu_count parameter in my instance (as read in Jonathan Lewis book Cost-Based Oracle Fundamentals, chapter 5 - The Clustering Factor - the clustering_factor computation may vary a little whether you are running a
single or multiple CPU machine) - it did not affected the results in my case.

Here is my scenario and the results.

11.2.0.3> Create table clust (id number, text varchar2(30));

Table created.

11.2.0.3> create sequence clust_seq;

Sequence created.

11.2.0.3> CREATE OR REPLACE PROCEDURE clust_proc AS
  2  BEGIN
  3  FOR i IN 1..100000 LOOP
  4  INSERT INTO clust VALUES (clust_seq.NEXTVAL, 'CLUST_FACTOR');
  5  COMMIT;
  6  END LOOP;
  7  END;
  8   /

Procedure created.

11.2.0.3> create index clust_id_i on clust(id);

Index created.

11.2.0.3> exec clust_proc;

PL/SQL procedure successfully completed.

11.2.0.3> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'CLUST',pname=
>'TABLE_CACHED_BLOCKS', pvalue=>DBMS_STATS.AUTO_TABLE_CACHED_BLOCKS);

PL/SQL procedure successfully completed.

11.2.0.3> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'CLUST', es
timate_percent=> 100);

PL/SQL procedure successfully completed.

11.2.0.3> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'CLUST_ID_I
', estimate_percent=> null);

PL/SQL procedure successfully completed.

11.2.0.3> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_
factor
  2  FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.
index_name='CLUST_ID_I';

TABLE_NAME                     INDEX_NAME                         BLOCKS
------------------------------ ------------------------------ ----------
  NUM_ROWS CLUSTERING_FACTOR
---------- -----------------
CLUST                          CLUST_ID_I                            320
    100000               316


11.2.0.3> select index_name, blevel, leaf_blocks, clustering_factor from user_in
dexes WHERE index_name='CLUST_ID_I';

INDEX_NAME                         BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CLUST_ID_I                              1         199               316

11.2.0.3> 

We can see that clustering_factor is 316 when we actually have 320 blocks in the table.

Many thanks,
Horia

Tom Kyte
May 10, 2013 - 12:40 pm UTC

you are using automatic segment space management (ASSM). some of those blocks are used by us to manage space. they are the bitmap freelists.

....
ops$tkyte%ORA11GR2> select index_name, blevel, leaf_blocks, clustering_factor from user_indexes WHERE index_name='CLUST_ID_I';

INDEX_NAME                         BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ---------- ----------- -----------------
CLUST_ID_I                              1         199               316

ops$tkyte%ORA11GR2> select count(distinct dbms_rowid.rowid_block_number(rowid)) from clust;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                                316



see, there are only 316 blocks with data on them...