Well, it depends on the skew of the data of course. Looking at the predicate we have:
name like 'K%' and user_id > 100;
An index on any of
name
user_id
name, user_id
user_id, name
would/could be used. However, if a large pct of the rows in this table satisfied "user_id > 100", the indexes with user_id on the leading edge are not very efficient. Same would hold true for "name like 'K%'".
Only if one of the two was selective would it be efficient, and then that column should be on the leading edge in this case (it is NOT true that the most selective items should be on the leading edge in general!!!)
ops$tkyte@ORA717DEV.US.ORACLE.COM> create table t1
2 as
3 select object_name name, object_id user_id, created from all_objects
4 /
Table created.
ops$tkyte@ORA717DEV.US.ORACLE.COM> create index t1_idx1 on t1(name,user_id);
Index created.
ops$tkyte@ORA717DEV.US.ORACLE.COM> create index t1_idx2 on t1(user_id,name);
Index created.
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> create table t2
2 as
3 select decode( mod(rownum,2), 0, 'K', 'A' ) || substr(name,2) name,
4 200-user_id user_id,
5 created
6 from t1
7 /
Table created.
ops$tkyte@ORA717DEV.US.ORACLE.COM> create index t2_idx1 on t2(name,user_id);
Index created.
ops$tkyte@ORA717DEV.US.ORACLE.COM> create index t2_idx2 on t2(user_id,name);
Index created.
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> select sum( case when name like 'K%' then 1 else 0 end ) K_cnt,
2 sum( case when user_id > 100 then 1 else 0 end ) hundred_cnt,
3 count(*) cnt
4 from t1
5 /
K_CNT HUNDRED_CNT CNT
---------- ----------- ----------
12 16871 16965
ops$tkyte@ORA717DEV.US.ORACLE.COM> select sum( case when name like 'K%' then 1 else 0 end ) K_cnt,
2 sum( case when user_id > 100 then 1 else 0 end ) hundred_cnt,
3 count(*) cnt
4 from t2
5 /
K_CNT HUNDRED_CNT CNT
---------- ----------- ----------
8482 93 16965
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> alter session set sql_trace=true;
Session altered.
ops$tkyte@ORA717DEV.US.ORACLE.COM> select /*+ index( t1 t1_idx1 ) */ count(created)
2 from t1
3 where name like 'K%' and user_id > 100;
COUNT(CREATED)
--------------
12
ops$tkyte@ORA717DEV.US.ORACLE.COM> select /*+ index( t1 t1_idx2 ) */ count(created)
2 from t1
3 where name like 'K%' and user_id > 100;
COUNT(CREATED)
--------------
12
ops$tkyte@ORA717DEV.US.ORACLE.COM>
ops$tkyte@ORA717DEV.US.ORACLE.COM> select /*+ index( t2 t2_idx1 ) */ count(created)
2 from t2
3 where name like 'K%' and user_id > 100;
COUNT(CREATED)
--------------
38
ops$tkyte@ORA717DEV.US.ORACLE.COM> select /*+ index( t2 t2_idx2 ) */ count(created)
2 from t2
3 where name like 'K%' and user_id > 100;
COUNT(CREATED)
--------------
38
Now, tkprof will show us:
select /*+ index( t1 t1_idx1 ) */ count(created)
from t1
where name like 'K%' and user_id > 100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.01 0 2 0 1
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
12 INDEX RANGE SCAN (object id 21913)
********************************************************************************
select /*+ index( t1 t1_idx2 ) */ count(created)
from t1
where name like 'K%' and user_id > 100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.06 0.09 95 96 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.07 0.09 95 96 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 52
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
12 INDEX RANGE SCAN (object id 21914)
<b>Here, name is very selective (12 out of 17,000 start with K). So, when we use the index on (NAME,USER_ID), we do very very little IO. It is very fast to identify the rows where name starts with K, look to see if the user_id is > 100 and then access the table using t1_idx1.
Now, when we use t1_idx2, we see it is somewhat (lots) less efficient. USER_ID is not very selective when it comes to the predicate "user_id > 100". Virtually EVERY row satisfies that -- hence we end up reading the entire index in this case. We get the same answer, we just had to do 45 times the work
Onto the next case...</b>
select /*+ index( t2 t2_idx1 ) */ count(created)
from t2
where name like 'K%' and user_id > 100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.03 0.03 0 51 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.04 0.03 0 51 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 52
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
38 INDEX RANGE SCAN (object id 21916)
********************************************************************************
select /*+ index( t2 t2_idx2 ) */ count(created)
from t2
where name like 'K%' and user_id > 100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 2 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 2 3 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 52
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
38 INDEX RANGE SCAN (object id 21917)
<b>Reversing the selectivity of the data (with regards to this particular predicate) reverses the situation. When name like 'K%' is not very selective (we made about 1/2 of the rows start with K), we find we do more work then the other query. Its simply the converse</b>
So, what did we learn from this? Actually not as much as you think. We learned that for this SPECIFIC PREDICATE we could over analyze the data and come up with the perfect index. However, this is not the only predicate we must be concerned with, there will be others. You really have to look at all of the questions you will be asking of this table to figure out the proper indexing scheme.