SQL> create table T1(id number , name varchar2 (20)) nologging tablespace largets;
Table created.
SQL>
SQL> insert /*+ append */ into t1
2 select decode(mod(rownum,1000),0,rownum), rownum
3 from ( select 1 from dual connect by level <= 1000 ),
4 ( select 1 from dual connect by level <= 10000 );
10000000 rows created.
SQL>
SQL> commit;
Commit complete.
So I have 10million rows and 10,000 of them contain a value for ID. Without any indexes, I'm scanning the whole set.
SQL>
SQL> set autotrace traceonly explain
SQL> select count(*) from t1 where id is not null and name like '200%';
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 5356 (2)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 9 | 5356 (2)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID" IS NOT NULL AND "NAME" LIKE '200%')
SQL> set autotrace off
Now I'll create an index on NAME and see how that goes
SQL>
SQL> create index ix1 on t1 ( name, id );
Index created.
SQL>
SQL>
SQL> set autotrace traceonly explain
SQL> select count(*) from t1 where id is not null and name like '200%';
Execution Plan
----------------------------------------------------------
Plan hash value: 3432355337
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | INDEX RANGE SCAN| IX1 | 1 | 9 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("NAME" LIKE '200%')
filter("ID" IS NOT NULL AND "NAME" LIKE '200%')
So I have gained some potential benefit here, but I'm not really taking advantage of the fact that so many ID values are null. So I will use an function-based index to only index those rows that have ID not null
SQL> set autotrace off
SQL> create index ix2 on t1
2 ( case when id is not null then name end );
Index created.
SQL>
SQL>
SQL> set autotrace traceonly explain
SQL> select count(*) from t1 where ( case when id is not null then name end ) like '200%';
Execution Plan
----------------------------------------------------------
Plan hash value: 1955444768
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 9 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | INDEX FAST FULL SCAN| IX2 | 500K| 4394K| 9 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(CASE WHEN "ID" IS NOT NULL THEN "NAME" END LIKE '200%')
You can see how efficient this new index is by looking at its size
SQL> select bytes from user_segments where segment_name in ('IX1','IX2');
BYTES
----------
234881024
262144
2 rows selected.
SQL>