Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, gaurang.

Asked: August 30, 2001 - 9:55 am UTC

Last updated: May 21, 2012 - 7:54 am UTC

Version: 8.1.6

Viewed 10K+ times! This question is

You Asked

hi tom

i have question regarding composite index.
how it is stroed in b*tree structure?
if i have composite index on(empno,deptno) and i am only using
deptno in where clause ,
how that index will be used?

Thanks

and Tom said...

If you query:

select count(distinct deptno) from t

and either of EMPNO or DEPTNO is defined as "not null" -- we may very well use the INDEX via a FAST FULL INDEX SCAN over the table (the index being a "skinny version" of the table in this case.


If you query "select empno, deptno from t where deptno = :x" we MAY use the index again (as a skinny table).

If you query "select ename, deptno from t where deptno = :x" we will NOT use the index as we would have to full scan the entire index AND then access the table anyway. We would just full scan the table instead.

A composite index is stored sorted in key order (by empno and then deptno in your case)



Rating

  (35 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Response about "Composite index"

Michael Leung, August 31, 2001 - 1:55 am UTC

What is "FAST FULL INDEX SCAN"? Are there many types of index scanning?

Also, what is the difference between the first query and second query of "select empno, deptno from t where deptno = :x"?

Thank you for your kindly help to us!

Tom Kyte
August 31, 2001 - 7:09 am UTC

An INDEX_FFS (index fast full scan) is a process where by we read the index data as if it were a table. Normally, an index is processed a block at a time. goto the root block, use the branch blocks to navigate. It is a data structure.

In a FAST FULL SCAN, we just read the entire structure as it exists on disk. We do not treat it as an index, but more like a table. We read it out of order (data does not come back sorted from an index ffs) and we read it using multiblock IO.

This allows us to use an index like a "skinnier" version of the table.

We can:

range scan and index (ascending or descending)

combine indexes (what we do with bitmapped indexes to process complex AND/OR conditions)

join indexes (use two or more indexes joined together)

full scan indexes as above.


I don't understand what you mean by "what is the difference between"

select count(distinct deptno) from t;
select empno,deptno from t where deptno = :x;

one returns a count of distinct deptno's in T, the other returns the empno/deptno for a given department. Both might use the index with a FULL scan since the index can answer them (without having to hit the base table).

The last query "select ENAME, deptno from t where deptno = :x" most likely WON'T use the index with a full scan for the simple reason that ENAME is not in the index and we would have to full scan the index AND access the table to get ENAME (so we would probably just full scan the table instead)



A reader, November 13, 2002 - 10:20 am UTC

In your first response, you have given example of 2 queries which are same..and in one you say index will b used and in other us say index wont be used.That is teh confusion. Here it is:
"If you query "select empno, deptno from t where deptno = :x" we MAY use the
index again (as a skinny table).

If you query "select ename, deptno from t where deptno = :x" we will NOT use the
index as we would have to full scan the entire index AND then access the table
anyway. We would just full scan the table instead.
"

Tom Kyte
November 13, 2002 - 2:04 pm UTC

they are not the same

one selects EMPNO (in the index)
the other selects ENAME (not in the index)

Two composite indexes have overlapping columns

Sean, January 18, 2005 - 11:12 am UTC

Hi Tom,

I have a table with two indexes. Here is the structure:

Table t(c1, c2, c3, c4)

Index ind1(c1, c2, c3,)
Index ind2(c1, c2)

Is ind2 necessary since ind1 includes column c1 and c2?

It is the existing application and I don’t know how the queries access the table. I just want to drop the ind2 if I can be sure that dropping that index will not affect the application since those indexes use a lot of disk space.

Thanks so much for your help.





Tom Kyte
January 18, 2005 - 11:48 am UTC

test it.
test it.
test it.

and ...

test it.


is index ind2 unique? probably needed.
is index ind2 fast full scanned? ind1 could be but it is bigger

the answer is "it is probably perfectly OK to drop it"
^^^^^^^^

but test it.

List of rejected plans

abhay, January 18, 2005 - 3:27 pm UTC

Is there a way to get list of plans rejected by optimizer and reasons for rejection ?

for example :

Plan 1 : Cost = 100

SELECT ( sort )
table1 ( index )
table2 ( full scan )

Plan 2: Cost = 30
SELECT ( sort )
table1 ( index )
table2 ( index )

Tom Kyte
January 18, 2005 - 3:40 pm UTC

search for

10053

on this site, or see chapter 6 of "Effective Oracle by Design"

Composite index column order

APL, March 22, 2005 - 11:35 pm UTC

I got a column in my table whose cardinality is very small.It got only 6 distinct values. I am always using this column in my select statement. I tried to create a bitmap index on this column, but since it is an OLTP environment and created a BTree index. I created a composite index by combining the above mentioned column and primary key.My intention was to increase the cardinality. Is this a good approach?
But the problem is it is showing high clustering factor ie coming near to number of rows. What is the best approach for creating indexes on this scenario?


Tom Kyte
March 23, 2005 - 1:15 am UTC

ouch, so many things to say - too little room to say them.

so, you always use this in the where clause? so what? if you have the data evenly distributed and select 1/6th the data -- well, that isn't oltp anymore


more details would be needed.

bitmap indexes are not. (given oltp and modifications)

Composite index

APL, March 23, 2005 - 7:32 am UTC

Is my approcah of creating composite index by using theabove mentioned two columns good?

Tom Kyte
March 23, 2005 - 9:07 am UTC

depends, infinitely better than using a bitmap in oltp

followup for above

Raaghid, March 23, 2005 - 8:27 am UTC

hi my suggestion:

If you are working in 9i and above, you can go for composite index without any issue (in oltop)
since skip scan index is available in 9i if leading columns are not present in index also, the optimizer will use the index. But 8i and before, if where condition is not having the leading columns of the query in index, then optimizer will not use the index.

Go through effective oracle by desin book (by Tom),which explains nicely.

Tom, please correct me if anything wrong.


Tom Kyte
March 23, 2005 - 9:08 am UTC

nothing wrong there.

Clustering Factors for the Composite Indexes

Vin, March 21, 2006 - 5:41 pm UTC

Oracle Version 8.1.7.4.0
Question:Based on the example given below.
1>Please explian why given the huge difference in clustering factor, why the one with high value (PKC_CAI_2-13902) is a right composite formation then the one with low (PKC_CAI_2R-172).
2>Should one consider the clustering factor while desiging a composite index, and desgin that one with the lowest clustering factor even if the execution plan, cost and consistents gets are high in order.
3>Is a low clustering factor good for a composite index, when the DML's specific on the table is INSERT, SELECT, and UPDATE in that order of severity. 

Appreciate your effort and time.

No of Rows in table:14540
This is with composite index as 
constraint PKC_CAI_2 primary key (UNIVERSAL_DLR_ID,UNIVERSAL_APP_ID)

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
PKC_CAI_2                                  13902

SQL> select UNIVERSAL_DLR_ID,UNIVERSAL_APP_ID,SUBMIT_DATE_TIME from cai
  2  where UNIVERSAL_DLR_ID='128165' and SUBMIT_DATE_TIME > sysdate -90;

UNIVERSAL_DLR_ID     UNIVERSAL_APP_ID                         SUBMIT_D
-------------------- ---------------------------------------- --------
128165               6011562663                               01192006
128165               6011640726                               01192006


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=23)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'CAI' (Cost=5 Card=1 Bytes=23)
   2    1     INDEX (RANGE SCAN) OF 'PKC_CAI_2' (UNIQUE) (Cost=2 Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        505  bytes sent via SQL*Net to client
        358  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed

********************************************************************
This is with composite index as 
constraint PKC_CAI_2r primary key (UNIVERSAL_APP_ID,UNIVERSAL_DLR_ID)

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
PKC_CAI_2R                                   172

SQL> select UNIVERSAL_DLR_ID,UNIVERSAL_APP_ID,SUBMIT_DATE_TIME from cai
  2  where UNIVERSAL_DLR_ID='128165' and SUBMIT_DATE_TIME > sysdate -90;

UNIVERSAL_DLR_ID     UNIVERSAL_APP_ID                         SUBMIT_D
-------------------- ---------------------------------------- --------
128165               6011562663                               01192006
128165               6011640726                               01192006

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=27 Card=1 Bytes=23)
   1    0   TABLE ACCESS (FULL) OF 'CAI' (Cost=27 Card=1 Bytes=23)

Statistics
----------------------------------------------------------
          0  recursive calls
          4  db block gets
        176  consistent gets
          0  physical reads
          0  redo size
        505  bytes sent via SQL*Net to client
        358  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed 

Tom Kyte
March 22, 2006 - 3:34 pm UTC

a clustering factor represents how sorted a tables data on disk is (the order of the rows in the table in the blocks) in relation to an index.

You have a compound index on (A,B) and then on (B,A). The table T can only be sorted by either

a) A and then B
b) B and then A
c) something else entirely

In your case, it looks like the data is inserted in order of (UNIVERSAL_APP_ID,UNIVERSAL_DLR_ID) and not the other way. The data when inserted with a key of (UNIVERSAL_DLR_ID,UNIVERSAL_APP_ID) does not arrive in that order.


for example

insert into t (x,y)
select rownum, dbms_random.random from all_objects;

that would create a set of data sorted by X with a random number Y.

that would create a set of data out of order by Y with a value X associated to each Y.

an index on (x,y) would have a low clustering factor.
an index on (y,x) would have a high clustering factor (since the data is not sorted by y)

Composite Indexes

Vin, March 22, 2006 - 8:23 pm UTC

appreciate your valuable time in explaining the details.
always learn every day from this site.

so when desiging a composite index say for example on columns(x,y) where column x has the least distinct values over column y, one should
1>get all the sql's (select, insert, update, delete) that hit the object.

2>find out how the insert is setup to like
insert into t(x,y) OR insert into t(y,x)

3>the composite index that would be usefull for most selectivity should be designed as...
for a insert type t(x,y), and select type where predicate is "where x='' and y=''
as create index ic_xy on t(x,y)

AND
for a insert type t(y,x), and select type where predicate is "where x='' and y=''
as create index ic_xy on t(y,x)

the decision of positioning of columns (x,y), which is the right sort position, against (y,x) should be made based on the type of insert, which actually effect the clustering factor of a composite index.

Tom Kyte
March 23, 2006 - 9:59 am UTC

no, not at all - sorry - must not have been clear.

it does not matter the order of columns in the insert statement (insert into t(x,y) and insert into t(y,x) are IDENTICAL).

it matters the order of ARRIVAL (in a HEAP table, one that does not force location of data).  Lets say you have a primary key on CUSTOMER_ID, DATE_OF_ORDER (bad key, should not include a date/timestamp in a key, but makes the example easy to explain).

Say you index(customer_id,date_of_order), the data will arrive for a given day "out of order" by customer id - customer 100 orders something, then 1, then 100000, then 5400, then 234, then 634 then ... and so on.  The table data is NOT sorted by customer_id - in fact, it would be unusual for two records "next" to eachother physically to have the same customer-id or customer-id's numerically next to eachother. 

However, in the same table, the rows would be nicely sorted by DATE_OF_ORDER since that is a monotomically increasing value and the rows would tend to goto the "bottom" of the table.  

So an index on (customer_id,date_of_order) would have a "poor" clustering factor (but ignore the word poor for moment!!!!).  Because the data is not sorted on disk in the table by customer_id.

However an index on (date_of_order,customer_id) would have a "great" clustering factor (ignore great for a moment).  Because the data on disk IS sorted by date_of_order.


So, should the primary key therefore be (date_of_order,customer_id) or should it be (customer_id,date_of_order)?

The answer:

<b>it depends</b>

The only thing that dictates the order of columns in indexes is "how you retrieve the data".

Say you run queries such as:

select * from T where customer_id = ?;
select * from T where customer_id = ? and date_of_order >= sysdate-7;
select /*+ first_rows(10) */ * from T where customer_id = ? order by date_of_order;


Now, now and only now can I say that the 'best' key will be customer_id, date_of_order - the clustering factor, it will not be a determining factor.  How I need to RETRIEVE the data - that will be, that is.

An index on (date_of_order,customer_id) would LIKELY be useless (skip scans could change that, but probably not because date_of_order is likely to have HIGH distinct cardinality) in answering any of the above questions.  We need to have customer_id first and then date_of_order.

The clustering factor would be used by the optimizer along with OTHER considerations to determine if the index would be the right approach to retrieve the data (or if a full scan would be better).

We can use our knowledge of the data in order to use a data structure to improve the clustering factor of the data we need to retrieve.  Suppose the above three queries were "the three queries" we really want to optimize.

We might use

o an IOT with a key on (customer_id,date_of_order) - the data would be stored physically sorted by customer_id, date_of_order ensuring that all of the data for a given customer_id was right next to eachother

o a B*Tree or HASH cluster with a cluster key of customer_id - again, ensures all customer_id's with the same values are stored physically close to eachother - ensuring that the above three queries do the minimal amount of logical IO to retrieve the data.

o a sorted hash cluster with a cluster key of customer_id and a sort key of date_of_order - storing the data clumped by customer_id and sorted within customer_id by date_of_order.

Here is a short excerpt from my recent book that relates to this topic as well:

<quote>

Myth: Most Discriminating Elements Should Be First 

This seems like common sense. If you are going to create an index on the columns C1 and C2 in a table with 100,000 rows, and you find C1 has 100,000 distinct values and C2 has 25,000 distinct values, you would want to create the index on T(C1,C2). This means that C1 should be first, which is the “commonsense” approach. The fact is, when comparing vectors of data (consider C1, C2 to be a vector), it doesn’t matter which you put first. Consider the following example. We will create a table based on ALL_OBJECTS and an index on the OWNER, OBJECT_TYPE, and OBJECT_NAME columns (least discriminating to most discriminating) and also on OBJECT_NAME, OBJECT_TYPE, and OWNER:

ops$tkyte@ORA10GR1> create table t
  2  as
  3  select * from all_objects;
Table created.
 
ops$tkyte@ORA10GR1> create index t_idx_1 on t(owner,object_type,object_name);
Index created.

ops$tkyte@ORA10GR1> create index t_idx_2 on t(object_name,object_type,owner);
Index created.
 
ops$tkyte@ORA10GR1> select count(distinct owner), count(distinct object_type),
  2   count(distinct object_name ), count(*)
  3  from t;
 
DISTINCTOWNER DISTINCTOBJECT_TYPE DISTINCTOBJECT_NAME  COUNT(*)
------------- ------------------- ------------------- --------
           28                  36               28537    48243
Now, to show that neither is more efficient space-wise, we’ll measure their space utilization:
ops$tkyte@ORA10GR1> analyze index t_idx_1 validate structure;
Index analyzed.
 
ops$tkyte@ORA10GR1> select btree_space, pct_used, opt_cmpr_count, opt_cmpr_pctsave
  2 from index_stats;
 
BTREE_SPACE    PCT  OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------- ------  -------------- ----------------
    2702744   89.0               2               28
 
ops$tkyte@ORA10GR1> analyze index t_idx_2 validate structure;
Index analyzed.
 
ops$tkyte@ORA10GR1> select btree_space, pct_used, opt_cmpr_count, opt_cmpr_pctsave
  2 from index_stats;
                 
BTREE_SPACE    PCT  OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------- ------  -------------- ----------------
    2702744   89.0               1               13

They use exactly the same amount of space, down to the byte—there are no differences there. However, the first index is a lot more compressible if we use index key compression, as evidenced by the OPT_CMP_PCTSAVE value. There is an argument for arranging the columns in the index in order from the least discriminating to the most discriminating. Now let’s see how they perform, to determine if either index is generally more efficient than the other. To test this, we’ll use a PL/SQL block with hinted queries (so as to use one index or the other):

ops$tkyte@ORA10GR1> alter session set sql_trace=true;
Session altered.

ops$tkyte@ORA10GR1> declare
  2          cnt int;
  3  begin
  4    for x in ( select /*+FULL(t)*/ owner, object_type, object_name from t )
  5    loop
  6         select /*+ INDEX( t t_idx_1 ) */ count(*) into cnt
  7           from t
  8          where object_name = x.object_name
  9            and object_type = x.object_type
 10            and owner = x.owner;
 11
 12          select /*+ INDEX( t t_idx_2 ) */ count(*) into cnt
 13           from t
 14          where object_name = x.object_name
 15            and object_type = x.object_type
 16            and owner = x.owner;
 17    end loop;
 18  end;
 19  /
PL/SQL procedure successfully completed. 
These queries read every single row in the table by means of the index. The TKPROF report shows us the following:
SELECT /*+ INDEX( t t_idx_1 ) */ COUNT(*) FROM T
WHERE OBJECT_NAME = :B3 AND OBJECT_TYPE = :B2 AND OWNER = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  48243     10.63      10.78          0          0          0           0
Fetch    48243      1.90       1.77          0     145133          0       48243
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    96487     12.53      12.55          0     145133          0       48243

Rows     Row Source Operation
-------  ---------------------------------------------------
  48243  SORT AGGREGATE (cr=145133 pr=0 pw=0 time=2334197 us)
  57879   INDEX RANGE SCAN T_IDX_1 (cr=145133 pr=0 pw=0 time=1440672 us)(object…

********************************************************************************

SELECT /*+ INDEX( t t_idx_2 ) */ COUNT(*) FROM T
WHERE OBJECT_NAME = :B3 AND OBJECT_TYPE = :B2 AND OWNER = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  48243     11.00      10.78          0          0          0           0
Fetch    48243      1.87       2.10          0     145168          0       48243
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    96487     12.87      12.88          0     145168          0       48243

Rows     Row Source Operation
-------  ---------------------------------------------------
  48243  SORT AGGREGATE (cr=145168 pr=0 pw=0 time=2251857 us)
  57879   INDEX RANGE SCAN T_IDX_2 (cr=145168 pr=0 pw=0 time=1382547 us)(object…

They processed the same exact number of rows and very similar numbers of blocks (minor variations coming from accidental ordering of rows in the table and consequential optimizations made by Oracle), used equivalent amounts of CPU time, and ran in about the same elapsed time (run this same test again and the CPU and ELAPSED numbers will be a little different, but on average they will be the same). There are no inherent efficiencies to be gained by placing the columns in order of how discriminating they are, and as stated previously, with index key compression there is an argument for putting the least selective first. If you run the preceding example with COMPRESS 2 on the indexes, you’ll find that the first index will perform about two-thirds the I/O of the second, given the nature of the query in this case.

However, the fact is that the decision to put column C1 before C2 must be driven by how the index is used. If you have lots of queries like the following:

select * from t where c1 = :x and c2 = :y;
select * from t where c2 = :y;

it makes more sense to place the index on T(C2,C1). This single index could be used by either of the queries. Additionally, using index key compression (which we looked at with regard to IOTs and will examine further later), we can build a smaller index if C2 is first. This is because each value of C2 repeats itself on average four times in the index. If C1 and C2 are both, on average, 10 bytes in length, the index entries for this index would nominally be 2,000,000 bytes (100,000 ~TMS 20). Using index key compression on (C2, C1), we could shrink this index to 1,250,000 (100,000 ~TMS 12.5), since three out of four repetitions of C2 could be suppressed.

In Oracle 5 (yes, version 5!), there was an argument for placing the most selective columns first in an index. It had to do with the way version 5 implemented index compression (not the same as index key compression). This feature was removed in version 6 with the addition of row-level locking. Since then, it is not true that putting the most discriminating entries first in the index will make the index smaller or more efficient. It seems like it will, but it will not. With index key compression, there is a compelling argument to go the other way since it can make the index smaller. However, it should be driven by how you use the index, as previously stated.
</quote> 

Can't thank you enough. Excellent Explanation!

Vin, March 23, 2006 - 2:48 pm UTC


One composite index vs 7 indexes

SD, May 15, 2006 - 2:31 pm UTC

Hi Tom,

I have a paritioned table that has 7 non key (and non-partition key) columns that can be queried. Assuming all 7 columns have similar cardinality and chances of being queried. Should I make one global index with all 7 columns and use index skip scan hint in all the queries or make 7 global indexes on one column each. This is not OLTP system but needs available 24 by 7 for query. I am considering global indexes for better query performance. What type of index is more suitable?

Thanks

SD


Tom Kyte
May 15, 2006 - 3:32 pm UTC

index skip scan isn't going to be very useful - in order for that to work, you'd have to have leading columns of pretty low distinct cardinality - meaning they shouldn't be indexed in the first place!


could be BITMAP index candidates - 7 single column bitmaps that can be or'ed and and'ed together - but insufficient data to say for for here.

followup - bitmap vs btree

SD, May 15, 2006 - 4:03 pm UTC

This is a archiving/consolidation database that gets data from multiple OLTP systems through batch load. There is reqmt to have the system available 24 X 7 for query. I can not disable constraints/indexes during the load as the system is to support query during the load time too. Some columns may be high distinct value and some may be less distinct. Do you think it is better to create one-column bitmap indexes on less distinct value columns and btree on high distinct value column?

Thanks



Tom Kyte
May 15, 2006 - 4:38 pm UTC

if you batch load (bulk load) you need not disable indexes.

The goal of the bitmaps would be to allow the optimizer to AND/OR as many as it wanted together. In general a single b*tree would be used (yes, there are times when more than one could be - but in general). With bitmaps however, it'll use many of them.

It is something to consider and test.

A reader, June 16, 2006 - 3:59 pm UTC

I have following table

Table :T

Columns:
Id 1
Id 2
id 3
id 4

I am always going to have data for ID1 and ID2 and a value for either ID3 or ID4

My sample query is

Select * from T where id1 = ? and id2 =? And (id3 = ? or id4 =?);

I want to have a concatenate index on this table. What is the best index for my table?

I’m thinking about

1 index on ID1, ID2, ID3 and another index on ID1, ID2, ID4

Please let me know the correct design.


Tom Kyte
June 16, 2006 - 7:21 pm UTC

nope, just one on id1,id2,id3,id4 probably.

searching for succeeding rows

James Su, May 14, 2007 - 11:13 am UTC

hi Tom,
If I have a composite index on my_table (col1,col2,col3), given a current row, what is the best way to find succeeding rows?

select * from my_table where col1>currentrow.col1 OR (col1 = currentrow.col1 AND (col2>currentrow.col2 OR (col2 = currentrow.col2 AND col3>currentrow.col3 )));

If the columns are NULL allowed then this will be even longer. Can CBO use the INDEX?

Thank you.
Tom Kyte
May 14, 2007 - 4:53 pm UTC

are you trying to paginate through a result set?


paginating through a result set

James Su, May 14, 2007 - 5:13 pm UTC

Dear Tom,
You are right, all I know is the last row of the last page, and I am trying to returning the following page. I am going to do a SELECT /*+ FIRST_ROWS */
to retrieve the records.
Thank you.

paginating through a result set

James Su, May 14, 2007 - 5:34 pm UTC

Hello Tom,
If create an index as my_table(lpad(col1,10)||lpad(col2,10)||lpad(col3,10)), then I can select where lpad(col1,10)||lpad(col2,10)||lpad(col3,10) > lpad(currentrow.col1,10)||lpad(currentrow.col2,10)||lpad(currentrow.col3,10), this will make it simplified. Do you think it's a good approach?
Thank you.
Tom Kyte
May 14, 2007 - 9:15 pm UTC

if you do it the way I demonstrated in the link, you won't have to do anything funky.

and it'll be simplified.

people won't page past page 10 or 100 (goto google, search for oracle, go to page 101 and tell me what is at the top....)

the pagination query demonstrated is the way to go.

paginating through a result set

James Su, May 14, 2007 - 5:39 pm UTC

hi Tom,
Yes I read that article before, but now the requirement is to retrieve data after the current row, so I don't know the row number and page number.
Tom Kyte
May 14, 2007 - 9:16 pm UTC

sure you do - you are PAGING through a result set.

you are on page N, all you need to know is N (and how big a page is). You cannot tell me you don't know that.

paging question

James Su, May 14, 2007 - 9:48 pm UTC

hello Tom, you are so right, I will propose that we do the paging in your way and give up those funky things.

But I am still curious, is there a nice and simple way to replace this code:
where col1>currentrow.col1 OR (col1 = currentrow.col1 AND (col2>currentrow.col2 OR (col2 = currentrow.col2 AND col3>currentrow.col3 )));

Thank you.
Tom Kyte
May 15, 2007 - 3:53 pm UTC

yes, with the pagination query I've been talking about....

that is what you replace that convoluted stuff with.

composite index

David, October 10, 2007 - 7:42 pm UTC

how would you decide what columns and how many columns composite index should have to get an acceptable response time for a query. Thanks

shyam, June 13, 2008 - 6:48 am UTC

If i have two columns c1 and c2 . if c1 alone is used in some query and combination of c1 and c2 in other query . is it advisable to create two indexes for c1 and c2 or a composite index of (c1,c2).
Tom Kyte
June 13, 2008 - 8:28 am UTC

*probably* just an index on (c1,c2)

but sometimes both.


say you query frequently:

select c1 from t where c1 between :x and :y

and really want that to be as tuned as humanly possible. An index on just c1 would be the most efficient way to access that data.

if you

select c1, c2 from t where c1 between :x and :y

now the index on (c1,c2) makes huge sense again.


but probably JUST ONE index.

Composite Index

Amritpal Singh, December 19, 2008 - 6:56 pm UTC

Hi Tom,

We have a big table called 'Member'. The primary key is on the sequence generated mbr_sid column.

Is there any point having a composite Index on mbr_sid and some other columns. I think since mbr_sid itself is unique, we don't really need a composite Index, because the select column list always has other columns that are not part of the composite Index.

In the current system, we have a composite Index on mbr_sid, from_date, to_date, oprtnl_flag, data_source.

There are 10 columns in the table, and there is this Index, with 5 columns, making it almost half the size of the table.

Is it worth creating this.

Composite Index is preferred when for a given mbr_sid we get lots of records, and we use another field to reduce this number, which could be say (last_name or a date) etc. That is the best case for a composite INDEX, not here, where mbr_sid is unique.

what do you think.

regards
amrit
Tom Kyte
December 29, 2008 - 2:13 pm UTC

... Is it worth creating this....

I cannot answer that, not unless you tell me

what you do to this table, how you query it.


If you query:

select *
from table
where mbr_sid = ?;

only, then it probably does not make sense.

If you query:

select mbr_sid, oprtnl_flag, data_source
from table
where mbr_sid = ?


frequently, than if MIGHT WELL make really good sense (as the table access by index rowid can be avoided)

it completely depends on the queries involved and how they can (or cannot) use these additional columns.



Columns in the FROM Clause not SELECT

Amritpal singh, December 29, 2008 - 5:39 pm UTC

yes Tom, it makes sense, If we query:

select mbr_sid, oprtnl_flag, data_source
from table
where mbr_sid = ?

this will avoid the Table access, but in our case, the query is something like this

select *
from table
where mbr_sid = 12345 and from_date = some_date
and to_date = some_date and oprtnl_flag = 'A'
and data_source = 'A';

we have to go and see the table.

The same set of columns are there in all major tables, and most of these tables have unique values for mbr_sid, so for a give value of mbr_sid there will be 1 row returned.

For a couple of tables there are 4-5 rows returned for a given mbr_sid, but since it is a Select * , we can't avoid table access.

So, in this case, is it better to use this composite Index or the Index on just the mbr_sid column, which will return say 5 rows, with different values for the from_date,to_date columns.

is there any recommendation for the number of Indexes we should have on a table? In this case, the total size of all Indexes put together is much more than the table.



Tom Kyte
January 05, 2009 - 8:24 am UTC

You only go an "see the table" IF

...
where mbr_sid = 12345 and from_date = some_date
and to_date = some_date and oprtnl_flag = 'A'
and data_source = 'A';
.......


is true. If you have an index on just mbr_sid, you cannot tell if that predicate is satisfied. If you have an index on mbr_sid,from_date,to_date,oprtnl_flag,data_source - you can.

So, you can avoid going to the table still, even if mbr_sid is unique.


... So, in this case, is it better to use this composite Index or the Index on just
the mbr_sid column, which will return say 5 rows, with different values for the
from_date,to_date columns.
....

that doesn't compute. Not sure what you mean. I thought mbr_sid was unique?

Simple Index understanding

samy, January 13, 2010 - 4:53 am UTC

Dear Tom,

Suppose i have a table test with c1,c2,c3,c4,c5 Columns.

i Create an index on c1,c2,c3

i want to know does the Changes in order of columns Selected Or Changes in Order of columns in Where clause
affects usage of my index. as i was been told that order in where clause does matters, but i never knew that column in SELECT does matters in usage of index.

So here are the few cases want to know which query will use Index.

Simple Query
1) Select c1,c2,c3 from test where c1 = 1 and c2 = 2 and c3 = 3 ;

Order of Select column changed
2) Select c3,c2,c1 from test where c1 = 1 and c2 = 2 and c3 = 3 ;

Order of Where column changed
3) Select c1,c2,c3 from test where c3 = 3 and c2 = 2 and c1 = 1 ;

Order of Where and Select column changed
4) Select c3,c2,c1 from test where c1 = 1 and c2 = 2 and c3 = 3 ;

few column missing in where clause.
5) Select c1,c2,c3 from test where c2 = 2 and c3 = 3 ;

few column missing in Select .
6) Select c2,c3 from test where c1 = 1 and c2 = 2 and c3 = 3 ;

more column added in Where Clause.
7) Select c2,c3 from test where c1 = 1 and c4 = 2 and c3 = 3 ;

more column added in Select.
8) Select c4,c2,c3 from test where c2 = 2 and c3 = 3 ;

Tom Kyte
January 18, 2010 - 4:15 pm UTC

the order of the columns in the query do not matter, select component or where component. we don't care, it is the same to us.



One Composite or Two Single Indexes?

John Gilmore, February 11, 2010 - 10:02 am UTC

We have a customer table which can be queried on either or both of surname or first_name.

My intention was to create separate indexes on each column, but I've been told that Oracle will normally only ever choose one index per table for access so it's better to create a composite index on (surname, first_name) instead.

I would have thought that for queries which specify both columns, Oracle would use both of these indexes by merging the results from each before accessing the table data. However I was told that this operation ("index and") is quite expensive internally and therefore quite often not chosen by the optimiser.

Do you agree with the advice I received?
Tom Kyte
February 16, 2010 - 10:09 am UTC

if you have queries:

where first_name = ?
where surname = ?
where first_name = ? and surname = ?


then you would need an index that STARTS with first_name, and another index that starts with surname.

So, probably:

create index i1 on t(first_name,surname) /* covers 1st and 3rd query */
create index i2 on t(surname) /* covers 2nd query */


If you just created an index on (surname,first_name) the query on "where first_name=?" would not be able to use that index to find the rows... (surname would be too unique to consider an index skip scan)

John Gilmore, February 17, 2010 - 4:43 am UTC

OK, but say I had an index just on surname and another just on first_name.

If the query specifies both surname and first_name, would Oracle use both indexes and merge the rowids from each or would it just choose to use one index and filter the rows returned.

So, in general, is it the case that a table is accessed by only one index even if using two could reduce the rows returned?
Tom Kyte
February 17, 2010 - 9:20 am UTC

... If the query specifies both surname and first_name, would Oracle use both indexes and merge the rowids from each or would it just choose to use one index and filter the rows returned. ...

it would only do an index join (use both) if all of the data could be retrieved from the indexes - eg:

select first_name, surname from t where first_name = ? and last_name = ?;


would be able to do an index join. But

select first_name, surname, hiredate from t where first_name = ? and last_name = ?;


would not - unless hiredate was in one of the two indexes.




Doing an index join is an expensive operation in itself. It in general would not make sense to goto index 1 and find all of the rowids, then go to index two and find all of the rowids - join these two by rowid to get the set of rowids that satisfy your criteria - and then go to the table. It would make sense to pick the index (the single index) that was more selective and just go to the table to look up the first name.



Composite index and a regular index

Ash C, March 21, 2010 - 1:20 pm UTC

Hi Tom, my first post here and been a lurker for a long time and youve helped us a lot due to answering related issues we exactly have.

Anyhow here is what I thought a related question to this topic.

I have a table T1(c1,c2,c3,c4,c5) and index details follows:
PK(c1,c2,c3)
IND1(c2,c3)
IND2(c3)

Now, I tried using the script ("TFSFKCHLK.SQL") provided by Metalink to check if all the FK columns are have indexes to prevent a possible locking, etc.

Couple of findings were:
~~~~~~~
Changing data in table T99 will lock table T1
Create an index on table T1 with the following columns to remove lock problem
Column = C1 (1)
~~~~~~~
Changing data in table T50 will lock table T1
Create an index on table T1 with the following columns to remove lock problem
Column = C2 (1)
~~~~~~~~

Question:
1. Does TFSFKCHLK.SQL normally un-aware of the pre-existing composite indices?
2. Does TFSFKCHLK.SQL fully aware of the re-existing composite indices, and that its really asking for this additional index?

Note that this table is heavily used for an OLTP app. Avg num rows of T1=43711871

TIA,
Ash
Tom Kyte
March 22, 2010 - 9:06 am UTC

I don't use that script, I don't have that script, I've never used that script.

I use my own query, which I've given out for anyone to use. You can use that if you like my stuff. If you use the above script, Please utilize support to question it. I didn't write it.


http://asktom.oracle.com/Misc/httpasktomoraclecomtkyteunindex.html


I don't know what "t99" is as well.


Here is the rule:

if you have a foreign key on column(s)

c1, ..., cN

then you need an index that starts with

c1, ... cN

it (the index) can be on c1, ... cN, CA, CB, C....CZ

it just has to start with C1, ... CN in some order.


My script assumes the ordering is the same in the index as the foreign key. They don't have to be, but my script assumes they do.

indexing strategy

Ravi B, July 30, 2010 - 7:45 pm UTC

Hi Tom,

We have an application where we access a big table(about 3075708 rows) with about 30 columns.

There is a multi-user interface where operators query this table and perform DML operations.

The table could be queried on single or combination of columns. What indexing strategy would you recommend?

Thanks!
Tom Kyte
August 02, 2010 - 8:48 am UTC

it would depend on the predicates themselves - insufficient data here to answer.

Ravi B, August 02, 2010 - 11:03 am UTC

Tom,

Most of the time predicates are like operator on these columns. Please let me know if you need more information.

Thanks.
Tom Kyte
August 02, 2010 - 11:09 am UTC

like, <, = - doesn't matter (well, they will)

what matters first - the where clauses

do you do

where a = ? and b > ? and c like ?

or do you do

where a = ?
where b > ?
where c like ?

do you do this to all 30 columns

etc. You'd need to see the workload - perhaps what you want to do is run your workload and then use the index advisor in the tuning pack of enterprise manager to see what it recommends based on your queries.

A reader, August 02, 2010 - 11:31 am UTC

Tom,

All the columns are varchar2 columns. Operators would input partial values may be 2 to 3 columns (approx) at a time and do a search. So, the search is always %<value>%. These 2 to 3 columns could be any of the 30 columns based on what they are searching. There would be no =, >, < or any other type of searches.
Tom Kyte
August 02, 2010 - 11:43 am UTC

... %<value>% ...

no b-tree indexes then, maybe a text index - you'll want to read up on that. You'll want to maybe tag these elements and have a single index so you can section search. text indexes can help with leading % searches as well.

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:37336026927381#37360199308431

Ravi B, August 02, 2010 - 3:24 pm UTC

Tom,

I looked into the article and documentation on intermedia text. I guess i will run into a problem with sync'ing the index. Our system is more like an OLTP system with frequent inserts/updates hence frequent syncing might not be advisable for this huge table. I think these kind of adhoc searches on dynamic column predicates is quite frequent in data warehouse environment. I wonder how to deal with it other than text based search.

Thanks!
Tom Kyte
August 02, 2010 - 4:18 pm UTC

in a warehouse - you'd have the same problem - leading wildcard searches are never going to be the most performant thing on the planet here. Think about it - normal b*tree and bitmap indexes - not useful - the leading % means you have to inspect *every* row with them. A text index (a single text index) on all of the columns is going to be the way to go. What you need to decide is whether you can deal with a small lag between time of insert and the time you can search for that record.

Here on asktom, you cannot search for your review right away after putting it in - you have to wait for it to be sync'ed. That to me is acceptable. Maybe - once you make the business understand this 'requirement' of theirs and how expensive it will be - they will be more than willing to live with some lag time.

A reader, August 02, 2010 - 4:46 pm UTC

Hi Tom,

index on (a,b);

What are the situations in general - where the index can be used when we use ONLY 'b' in the predicates of SQLs ?

Regards

Tom Kyte
August 03, 2010 - 8:13 am UTC

well, we could use the index INSTEAD of the table (if you just select a or just select b or just select a,b - the index could be used instead of the table if either of A or B is NOT NULL).

We could do an index skip scan if the distinct cardinality of A is deemed low enough.

ops$tkyte%ORA11GR2> create table t
  2  as
  3  select mod(rownum,2) a, rownum b, a.*
  4    from all_objects a
  5  /

Table created.

ops$tkyte%ORA11GR2> alter table t modify (a not null);

Table altered.

ops$tkyte%ORA11GR2> create index t_idx on t(a,b);

Index created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select b from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 3163761342

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       | 72174 |   352K|    52   (0)| 00:00:01 |
|   1 |  INDEX FAST FULL SCAN| T_IDX | 72174 |   352K|    52   (0)| 00:00:01 |
------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> select * from t where b = 50;

Execution Plan
----------------------------------------------------------
Plan hash value: 2053318169

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   105 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |   105 |     4   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T_IDX |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("B"=50)
       filter("B"=50)

ops$tkyte%ORA11GR2> set autotrace off



there could be other special cases - like an INDEX JOIN (use another index in conjunction with this index), btree bitmap plans and so on - but the above two would probably be the most common.

index order in a composite index

David, April 23, 2011 - 6:30 am UTC

Hello Tom

The other day while testing some indexes in order to tune a couple of queries the results puzzled me a bit.

I was testing by putting different column as the leading index column because in the queries uses both columns and noticed that the column with fewer distinct keys actually gave better results. I dont understand why because I cannot visualize how is thsi index access happening under the hoods. Is the composite index structure a concatenation of all columns? For example imagine columns c1 (leass distict keys) and c2 with values 'A' and 10, 'B' and 100, in the index leaf how does this look like? Is it

'A'||10||rowid and 'B'||100||rowid ?


This is my test case, I am surprised by all 4 queries in the last part of the test. Can you please explain how does a query read a composite index?

Thank you




create table t1
(
c1 varchar2(3),
c2 number,
c3 varchar2(3)
);

create table t2
(
c1 varchar2(2)
);

insert into t2 values ('H');
insert into t2 values ('B');
insert into t2 values ('C');
insert into t2 values ('F');
insert into t2 values ('I');
insert into t2 values ('D');
insert into t2 values ('A');
insert into t2 values ('E');
insert into t2 values ('G');
insert into t2 values ('J');

begin
for i in (select * from t2)
loop
for x in 1..1000
loop
insert into t1 values (i.c1, x, 'HI');
end loop;
end loop;
end;
/

commit;

create index t1_i1 on t1(c1, c2);

create index t1_i2 on t1(c2, c1);

create index t1_i3 on t1(c1);

create index t1_i4 on t1(c2);

exec dbms_stats.gather_table_stats(USER, 'T1', estimate_percent => 100)

select a.table_name, a.column_name, a.num_distinct, b.num_rows
from user_tab_columns a, user_tables b
where a.table_name = b.table_name
and a.table_name = 'T1';

TABLE_NAME COLUMN_NAME NUM_DISTINCT NUM_ROWS
------------ -------------- ------------ ----------
T1 C1 10 10000
T1 C2 1000 10000
T1 C3 1 10000

select index_name, distinct_keys, clustering_factor from user_indexes where table_name = 'T1'order by 1

INDEX_NAME DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ------------- -----------------
T1_I1 10000 28
T1_I2 10000 10000
T1_I3 10 29
T1_I4 1000 10000

select /*+ index(t1 t1_i1) */ *
from t1
where c1 in ('A', 'E', 'G', 'J')
and c2 in (10, 20, 30, 100, 200, 300);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
32 consistent gets
0 physical reads
0 redo size
862 bytes sent via SQL*Net to client
407 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
24 rows processed

select /*+ index(t1 t1_i2) */ *
from t1
where c1 in ('A', 'E', 'G', 'J')
and c2 in (10, 20, 30, 100, 200, 300);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
38 consistent gets
0 physical reads
0 redo size
848 bytes sent via SQL*Net to client
407 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
24 rows processed

select /*+ index(t1 t1_i3) */ *
from t1
where c1 in ('A', 'E', 'G', 'J')
and c2 in (10, 20, 30, 100, 200, 300);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
33 consistent gets
0 physical reads
0 redo size
862 bytes sent via SQL*Net to client
407 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
24 rows processed

select /*+ index(t1 t1_i4) */ *
from t1
where c1 in ('A', 'E', 'G', 'J')
and c2 in (10, 20, 30, 100, 200, 300);

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
74 consistent gets
0 physical reads
0 redo size
848 bytes sent via SQL*Net to client
407 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
24 rows processed


Tom Kyte
April 25, 2011 - 11:42 am UTC

the keys are stored:

c1,c2,c3,...cn,ROWID

the individual columns are stored in sorted order in the index leaf blocks - and the rowids are "at the end".

In your case - it would be better to look at the explain plans:

ops$tkyte%ORA11GR2> select /*+ index(t1 t1_i1) */ *
  2    from t1
  3   where c1 in ('A', 'E', 'G', 'J')
  4     and c2 in (10, 20, 30, 100, 200, 300);

Execution Plan
----------------------------------------------------------
Plan hash value: 3456773112

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    24 |   216 |     9   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |    24 |   216 |     9   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |    24 |       |     8   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C1"='A' OR "C1"='E' OR "C1"='G' OR "C1"='J')
       filter("C2"=10 OR "C2"=20 OR "C2"=30 OR "C2"=100 OR "C2"=200 OR
              "C2"=300)



here we can see the index was used in an INLIST ITERATOR (so it looped over C1 values - iterated over the inlist values) and then applied the filter against C2 using the index.

When you see "access", that bit is using the index to locate rows, when you see filter, that bit is applying a where clause without using the index as a means of locating the rows (even though it did the filter ON the index, it didn't really use the index structure to process that bit).

So, in effect, in this case - it used the index to find all of the "C1 = 'A'" rows, and for each one it found, it applied the filter criteria to it to see if it wanted it, and then accessed the table if it did.



When you did it via the second index:

ops$tkyte%ORA11GR2> select /*+ index(t1 t1_i2) */ *
  2    from t1
  3   where c1 in ('A', 'E', 'G', 'J')
  4     and c2 in (10, 20, 30, 100, 200, 300);

Execution Plan
----------------------------------------------------------
Plan hash value: 2879882323

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    24 |   216 |    30   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |    24 |   216 |    30   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T1_I2 |    24 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C2"=10 OR "C2"=20 OR "C2"=30 OR "C2"=100 OR "C2"=200 OR
              "C2"=300)
       filter("C1"='A' OR "C1"='E' OR "C1"='G' OR "C1"='J')


It did the opposite, it used the index to find C2 = 10 and then C2- 20 and so on, and for each of those, it would apply the filter against C1. It would look at ALL of the c2=10 values and apply that secondary filter.


when you used the 3rd and 4th indexes, you moved the application of the filter from the INDEX to the TABLE. We found all of the C1 (or C2) rows for the keys and went to the table for each and every one of them to see if the other column matched the filter condition. With the 1st and 2nd indexes - we could do the filter in the index before hitting the table, not so here.



hope that helps make it a bit more clear in this case.

index order in a composite index

David, April 27, 2011 - 9:14 am UTC

Hi

In the third and fourth query how come in order to read same 24 rows with a much more selective index we need to read 78 blocks and with the less selective index only 33 reads os requiered?

Thanks a lot
Tom Kyte
April 27, 2011 - 11:04 am UTC

That'll have to do with things like the clustering factor of the index - the way the rows are sorted on disk.


You created the data in this fashion:
begin
    for i in (select * from t2)
    loop
       for x in 1..1000
       loop
           insert into t1 values (i.c1, x, 'HI');
       end loop;
    end loop;
end;
/

so you put all of the A's right next to eachother and all of the B's right next to eachother and so on. Since it takes about 3 blocks to hold 1000 rows of your data, if you ask for all of the A's, it would take 3 IO's:
ops$tkyte%ORA10GR2> select c1, dbms_rowid.rowid_block_number(rowid), count(*) from t1
  2  group by c1, dbms_rowid.rowid_block_number(rowid) order by 1, 2;

C1  DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)   COUNT(*)
--- ------------------------------------ ----------
A                                  54626        519
A                                  54631        276
A                                  54635        205
B                                  54613        522
B                                  54614        430
B                                  54616         48
C                                  54614         96
C                                  54615        519
C                                  54632        385
D                                  54629        231
D                                  54630        519
D                                  54631        250
E                                  54635        321
E                                  54636        519
E                                  54637        160
F                                  54625        519
F                                  54627        340
F                                  54632        141
G                                  54637        366
G                                  54638        519
G                                  54639        115
H                                  54612        526
H                                  54616        474
I                                  54627        186
I                                  54628        519
I                                  54629        295
J                                  54634         70
J                                  54639        411
J                                  54640        519

29 rows selected.


But if you want all of the number 10's - you'd have to go all over the place (the number 10's would be on 10 different blocks since you have 10 rows with 10 in it).

So, if you say "where c1 = 'A' and c2 = 10" and use an index on c1, you'd hit 3 blocks.
If you do the same where clause but use an index on 10, you'd hit.... 10 blocks.

using the inlist iterator - you either hit 4*3 table blocks (C1 has four values of interest) or you hit 6*10 table blocks. If you add in a few IO's for the index itself, the numbers all add up.

TKPROF shows this as well:


select /*+ index(t1 t1_i3) */ * from t1 where c1 in ('A', 'E', 'G', 'J')
   and c2 in (10, 20, 30, 100, 200, 300)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        3      0.00       0.00          0         31          0          24
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.00       0.00          0         31          0          24

Rows     Row Source Operation
-------  ---------------------------------------------------
     24  INLIST ITERATOR  (cr=31 pr=0 pw=0 time=82 us)
     24   TABLE ACCESS BY INDEX ROWID T1 (cr=31 pr=0 pw=0 time=247 us)
   4000    INDEX RANGE SCAN T1_I3 (cr=17 pr=0 pw=0 time=81 us)(object id 65155)
********************************************************************************
select /*+ index(t1 t1_i4) */ * from t1 where c1 in ('A', 'E', 'G', 'J')
   and c2 in (10, 20, 30, 100, 200, 300)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        3      0.00       0.00          0         74          0          24
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.00       0.00          0         74          0          24

Rows     Row Source Operation
-------  ---------------------------------------------------
     24  INLIST ITERATOR  (cr=74 pr=0 pw=0 time=75 us)
     24   TABLE ACCESS BY INDEX ROWID T1 (cr=74 pr=0 pw=0 time=268 us)
     60    INDEX RANGE SCAN T1_I4 (cr=14 pr=0 pw=0 time=53 us)(object id 65156)

Composite Index Structure

A reader, January 03, 2012 - 11:38 am UTC

On p. 428 of your Expert Oracle Database Architecture 2nd Edition, you provided a diagram of a single column B*Tree index structure. Can you please do one for a composite index structure? Thank you.
Tom Kyte
January 03, 2012 - 12:01 pm UTC

just add ",another_column" after any of the existing columns.

It would be no different.


It would just use a vector comparision to navigate the interior structure and get to the leaf, just like an order by would sort data.


for example, in the picture, I have 0..50 pointing left. If that index were on X,Y instead of just X, it might have:


(0,0) ... (50,25)


pointing left. Meaning any 'vector' between (0,0) ... (50,25) would be found down that way. Vectors such as (0,1), (15,2000), (50,24) would all be found "to the left".

One Composite or Two Single Indexes?

John Gilmore, May 18, 2012 - 4:43 am UTC

Hi Tom,

In your reply to my post above (February 17, 2010) regarding whether Oracle would use two indexes and merge the rowids, you said that "it would only do an index join (use both) if all of the data could be retrieved from the indexes".

However, I just found one of your older questions which seems to indicate that Oracle will use multiple indexes to satisfy a query even if the result is not available within the index itself.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1144177862559

Is this something which was only available with the rule-based optimiser?

Tom Kyte
May 21, 2012 - 7:54 am UTC

that is a bitmap index, the rules are very different for them.

bitmap indexes are designed to be used like that - where we use many of them and AND/OR/XOR them together at runtime. They are very different from b*tree indexes which is what we are discussing on this page.

In this page, it was written:


We can:

range scan and index (ascending or descending)

combine indexes (what we do with bitmapped indexes to process complex AND/OR conditions)


....

The goal of the bitmaps would be to allow the optimizer to AND/OR as many as it wanted together.
In general a single b*tree would be used (yes, there are times when more than one could be - but in
general). With bitmaps however, it'll use many of them.




join indexes (use two or more indexes joined together)

full scan indexes as above.



and bitmaps are only available with the CBO - the RBO would not be able to use one.


(there is also the special case of btree bitmap plans, whereby a b*tree index is converted on the fly to a bitmap index to be combined with other converted b*tree indexes...)