ops$tkyte%ORA11GR2> create table organized
2 as
3 select x.*
4 from (select * from all_objects order by object_name) x
5 /
Table created.
ops$tkyte%ORA11GR2> create table disorganized
2 as
3 select x.*
4 from (select * from organized order by dbms_random.random) x
5 /
Table created.
<b>those tables are arguably "the same", but they are very very different - because the rows are in different orders on the disk</b>
ops$tkyte%ORA11GR2> create index organized_idx on organized(object_name);
Index created.
ops$tkyte%ORA11GR2> create index disorganized_idx on disorganized(object_name);
Index created.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> begin
2 dbms_stats.gather_table_stats
3 ( user, 'ORGANIZED',
4 estimate_percent => 100,
5 method_opt=>'for all indexed columns size 254'
6 );
7 dbms_stats.gather_table_stats
8 ( user, 'DISORGANIZED',
9 estimate_percent => 100,
10 method_opt=>'for all indexed columns size 254'
11 );
12 end;
13 /
PL/SQL procedure successfully completed.
<b>both are indexed similarly and have the same level of statistics</b>
ops$tkyte%ORA11GR2> select table_name, blocks, num_rows, 0.05*num_rows, 0.10*num_rows from user_tables
2 where table_name like '%ORGANIZED' order by 1;
TABLE_NAME BLOCKS NUM_ROWS 0.05*NUM_ROWS 0.10*NUM_ROWS
------------------------------ ---------- ---------- ------------- -------------
DISORGANIZED 1053 72082 3604.1 7208.2
ORGANIZED 1054 72082 3604.1 7208.2
ops$tkyte%ORA11GR2> select table_name, index_name, clustering_factor from user_indexes
2 where table_name like '%ORGANIZED' order by 1;
TABLE_NAME INDEX_NAME CLUSTERING_FACTOR
------------------------------ ------------------------------ -----------------
DISORGANIZED DISORGANIZED_IDX 72030
ORGANIZED ORGANIZED_IDX 1028
<b>but their stats are different - because the organization of the data is different. One has a clustering factor near the number of blocks in the table (good) and the other near the number of rows (bad). The index with the low clustering factor will be used for large range scans - a high percentage of the table can be retrieved efficiently with that one. The other - not so much
Notice that 5% of the table is about 3,600 rows...</b>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> column PLAN_TABLE_OUTPUT format a120 truncate
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select * from organized where object_name like 'X%';
Execution Plan
----------------------------------------------------------
Plan hash value: 1925627673
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 383 | 37151 | 10 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| ORGANIZED | 383 | 37151 | 10 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | ORGANIZED_IDX | 383 | | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME" LIKE 'X%')
filter("OBJECT_NAME" LIKE 'X%')
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select * from disorganized where object_name like 'X%';
Execution Plan
----------------------------------------------------------
Plan hash value: 2727546897
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 383 | 37151 | 288 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| DISORGANIZED | 383 | 37151 | 288 (1)| 00:00:04 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME" LIKE 'X%')
<b>Notice that one used an index and the other a full scan.
The costs are very different.
The second figured it was better to full scan rather than do 383 single block IO's (because we scattered the data all over the place) to retrieve the rows from the table.
383 is about 0.5% of the table, we stopped using it. In fact, we'd have to be getting less than about 0.3/0.4% of that table before we'd use an index:</b>
ops$tkyte%ORA11GR2> select * from disorganized where object_name like 'R%';
Execution Plan
----------------------------------------------------------
Plan hash value: 3767053355
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 24250 | 254 (0)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| DISORGANIZED | 250 | 24250 | 254 (0)| 00:00:04 |
|* 2 | INDEX RANGE SCAN | DISORGANIZED_IDX | 250 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME" LIKE 'R%')
filter("OBJECT_NAME" LIKE 'R%')