Hopefully this is an easy one. I would like to know why an index is ignored when the SELECT clause contains multiple columns. It was my understanding that the index points to the position of the row so that whether you select one column or ten should not matter (assuming the row is not chained).
In a nutshell, why does this statement force a full table scan:
select *
from my_table
where my_indexed_column = x;
While this uses the index:
select column1
from my_table
where my_indexed_column = x;
in a nutshell - it doesn't force a full scan.
probably - the second query only used the index - that is, the index was on (my_indexed_column,column1). The table was avoided.
So, I can give you examples whereby the index will be used, the index will not be used - "it depends".
Also, the index was not ignored, it was decided that "the index would not be the efficient way to process the data"
So, I'll guess (no test case :( :( :( no example to work from :( )
column1 was in the index
the other columns selected with * was not
the clustering factor of the index was rather 'poor' (near the number of rows in the table, not the number of blocks)
that would inhibit large range scans that need to hip hop from index to table - since each row we were to retrieve is presumed to be on a block different from any other block in the table we had already accessed (eg: we'd be reading a lot of the table, a block at a time)
For example:
ops$tkyte%ORA10GR2> create table t
2 as
3 select *
4 from all_objects
5 order by dbms_random.random;
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(owner, object_name);
Index created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => 100, method_opt => 'for columns owner size 254' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select table_name, blocks, num_rows,
2 (select clustering_factor from user_indexes where index_name = 'T_IDX') cf
3 from user_tables where table_name = 'T';
TABLE_NAME BLOCKS NUM_ROWS CF
------------------------------ ---------- ---------- ----------
T 704 49741 49651
<b> clustering factor is near the number of rows in the table, not blocks. Therefore, the optimizer knows that
if you do a large range scan, it will hit LOTS of blocks - the data we want to get is scattered all over the place.
That is why I ordered by random above, to get the data spread out, disorganized</b>
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select owner, object_name from t where owner = 'WMSYS';
Execution Plan
----------------------------------------------------------
Plan hash value: 2946670127
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 232 | 5336 | 3 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| T_IDX | 232 | 5336 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
ops$tkyte%ORA10GR2> select owner, object_name, object_type from t where owner = 'WMSYS';
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 232 | 7888 | 222 (3)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 232 | 7888 | 222 (3)| 00:00:02 |
--------------------------------------------------------------------------
It did the right thing here, the full scan is cheaper (just, the index plan would cost about 235/240 - because it would do some IO's thru the index and 232 against the table)..
So, when we need to hit the table, it becomes too expensive to use the index.
If the SAME EXACT DATA were stored just a little differently on disk, things might be very different, for example:
ops$tkyte%ORA10GR2> create table t
2 as
3 select *
4 from all_objects
5 order by owner, object_name;
Table created.
ops$tkyte%ORA10GR2> create index t_idx on t(owner, object_name);
Index created.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', estimate_percent => 100, method_opt => 'for columns owner size 254' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select table_name, blocks, num_rows,
2 (select clustering_factor from user_indexes where index_name = 'T_IDX') cf
3 from user_tables where table_name = 'T';
TABLE_NAME BLOCKS NUM_ROWS CF
------------------------------ ---------- ---------- ----------
T 704 49743 684
ops$tkyte%ORA10GR2> select owner, object_name from t where owner = 'WMSYS';
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 232 | 5336 | 3 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| T_IDX | 232 | 5336 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
ops$tkyte%ORA10GR2> select owner, object_name, object_type from t where owner = 'WMSYS';
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 232 | 7888 | 7 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 232 | 7888 | 7 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T_IDX | 232 | | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
since the data we wanted was easy to retrieve from the table (it was all right next to each other) it used the index - the cost was very low. It believes it'll hit about 4 blocks (since about 73 rows/block fit in my 8k block database - this is right) and the cost is very low.
So, you started with "hopefully this is an easy one". It is actually quite complex