Hi
I have a table with around million rows and around 30 columns.
Table_name:- example_table
Columns :- A, B, C, D, E, ……..
There are multiple indexes on that table out of which two indexes are like below
Create index idx_1 on example_table(A) ;
Create index idx_2 on example_table(A,B,C,D);
Here you see, column A is used in Sigle column index and in multi column index.
Is it recommendable to have single column index on column A, where as it exists in multi column index as first column or is it just a overhead on that table?
Maybe.
While both are candidates for searches on column A, you may find that the single-column index is used in some cases where the multicolumn index isn't.
For example, let's create a table and create a three-column index on it:
create table t (
c1 constraint pk primary key, c2 not null, c3 not null, c4 not null
) as
select level c1, ceil ( level / 100 ) c2,
date'2021-01-01' + mod ( level, 37 ) c3,
lpad ( 'stuff', 100, 'f' ) c4
from dual
connect by level <= 10000;
create index i231 on t ( c2, c3, c1 );
A query on the index's leading column returns 1% of the rows, but the optimizer chooses a full table scan:
set serveroutput off
select * from t
where c2 = 1;
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));
----------------------------------
| Id | Operation | Name |
----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| T |
----------------------------------
But create an index on this column alone and the optimizer
does use it:
create index i2 on t ( c2 );
select * from t
where c2 = 1;
select *
from table(dbms_xplan.display_cursor(null, null, 'BASIC LAST'));
----------------------------------------------------
| Id | Operation | Name |
----------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
| 2 | INDEX RANGE SCAN | I2 |
----------------------------------------------------
So... why is this?
There are a couple of reasons.
First up, the single-column is smaller - there's less data in it. So it's faster to read it.
Secondly the multi-column index has a much higher clustering factor - over 20x higher:
select index_name, leaf_blocks, clustering_factor
from user_indexes
where table_name = 'T';
INDEX_NAME LEAF_BLOCKS CLUSTERING_FACTOR
PK 20 170
I231 37 7250
I2 20 170
The clustering factor is a key determinant of how effective an index is and thus how likely the optimizer is to choose it (lower = better). The fewer columns there are, the lower the clustering factor is likely to be.
It's a good idea to create as few indexes as you can. But as this example shows there may be cases where creating "redundant" indexes leads to faster queries.