Skip to Main Content
  • Questions
  • Is it recommendable to have a single column index when it is also the first in a multicolumn index?

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Sushil.

Asked: February 10, 2021 - 8:59 am UTC

Last updated: February 11, 2021 - 2:22 pm UTC

Version: 12c

Viewed 10K+ times! This question is

You Asked

Hi
I have a table with around million rows and around 30 columns.
Table_name:- example_table
Columns :- A, B, C, D, E, ……..

There are multiple indexes on that table out of which two indexes are like below

Create index idx_1 on example_table(A) ;
Create index idx_2 on example_table(A,B,C,D);


Here you see, column A is used in Sigle column index and in multi column index.
Is it recommendable to have single column index on column A, where as it exists in multi column index as first column or is it just a overhead on that table?


and Chris said...

Maybe.

While both are candidates for searches on column A, you may find that the single-column index is used in some cases where the multicolumn index isn't.

For example, let's create a table and create a three-column index on it:
create table t ( 
  c1 constraint pk primary key, c2 not null, c3 not null, c4 not null
) as 
  select level c1, ceil ( level / 100 ) c2, 
         date'2021-01-01' + mod ( level, 37 ) c3,
         lpad ( 'stuff', 100, 'f' ) c4
  from   dual
  connect by level <= 10000;
  
create index i231 on t ( c2, c3, c1 );


A query on the index's leading column returns 1% of the rows, but the optimizer chooses a full table scan:

set serveroutput off

select * from t 
where  c2 = 1;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));

----------------------------------    
| Id  | Operation         | Name |    
----------------------------------    
|   0 | SELECT STATEMENT  |      |    
|   1 |  TABLE ACCESS FULL| T    |    
----------------------------------


But create an index on this column alone and the optimizer does use it:

create index i2 on t ( c2 );

select * from t
where  c2 = 1;

select * 
from   table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));

----------------------------------------------------    
| Id  | Operation                           | Name |    
----------------------------------------------------    
|   0 | SELECT STATEMENT                    |      |    
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |    
|   2 |   INDEX RANGE SCAN                  | I2   |    
---------------------------------------------------- 


So... why is this?

There are a couple of reasons.

First up, the single-column is smaller - there's less data in it. So it's faster to read it.

Secondly the multi-column index has a much higher clustering factor - over 20x higher:

select index_name, leaf_blocks, clustering_factor
from   user_indexes
where  table_name = 'T';

INDEX_NAME   LEAF_BLOCKS   CLUSTERING_FACTOR   
PK                    20                 170 
I231                  37                7250 
I2                    20                 170 


The clustering factor is a key determinant of how effective an index is and thus how likely the optimizer is to choose it (lower = better). The fewer columns there are, the lower the clustering factor is likely to be.

It's a good idea to create as few indexes as you can. But as this example shows there may be cases where creating "redundant" indexes leads to faster queries.

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

David D., February 10, 2021 - 5:59 pm UTC

It is interesting to create an index on one column (col1) to select datas on all the table and an index on N columns (columns 1 to 4 for exemple) for a select on these columns.

I think that Oracle will use the first index when you do a search on column 1 like
SELECT col 12, col 22 from T1 where col1 = 22;

I think Oracle wil use the second index if you do a SELECT on the columns which are in the index like
SELECT col 2, col 3, col 4 from T1 where col1 > 10 and col 1 <21;


So you have two indexes but they do not response to the same selects.

But, last thing, if you have your two indexes (colA for the first, colA, B, C, D for the second), maybe, for millions or billions rows, that the level or height of your indexes will not be the same because you need, for the second index, more block to put the datas. If the level is 4 for the first and 5 for the second, there is a real benefit to use the first one : Oracle has to read one 8Ko block less (it represents 25% or 20% less).

David D. from Paris
Chris Saxon
February 11, 2021 - 2:22 pm UTC

Yep, it's likely to work as you describe.

If the level is 4 for the first and 5 for the second, there is a real benefit to use the first one : Oracle has to read one 8Ko block less (it represents 25% or 20% less).

True, this is just a side-effect of the single-column index being smaller though. Assuming many rows can have the same value for A (otherwise there'd be a unique index on it right?), you'll have to read far more index leaf blocks with the multi-column index.

Depending on your data and the values you're searching for, the savings could be far greater!

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.