Skip to Main Content
  • Questions
  • Indexing a Null column with a Date data type

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Jimmy.

Asked: September 30, 2010 - 1:22 pm UTC

Last updated: October 01, 2010 - 9:39 am UTC

Version: 10.2.0.4.0

Viewed 1000+ times

You Asked

Tom,

I have used your technique of appending a space to an index in order to get Oracle to index null columns. This works fine with Varchar2 and Number data types. But it doesn't seem to work with the Date data type. In theory, I would think this should work. I know I could use a function-based index using DECODE to do this, which I think you have posted before, but I was curious why appending a character to an index does not work for Dates.

jbrock@orcl10g-local> set serverout off
jbrock@orcl10g-local> alter session set statistics_level=all;

Session altered.

jbrock@orcl10g-local> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Prod
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for 32-bit Windows: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production

jbrock@orcl10g-local> drop table t purge;
drop table t purge
           *
ERROR at line 1:
ORA-00942: table or view does not exist


jbrock@orcl10g-local> create table t as select * from dba_tables;

Table created.

jbrock@orcl10g-local> select count(*) from t;

  COUNT(*)
----------
      1621

jbrock@orcl10g-local> select count(*) from t where pct_free is null;

  COUNT(*)
----------
        52

jbrock@orcl10g-local> select count(*) from t where last_analyzed is null;

  COUNT(*)
----------
       100

jbrock@orcl10g-local> select count(*) from t where logging is null;

  COUNT(*)
----------
       168

jbrock@orcl10g-local> create index t_pct_free_i on t(pct_free, ' ');

Index created.

jbrock@orcl10g-local> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

jbrock@orcl10g-local> select table_name from t where pct_free is null;

TABLE_NAME
-------------------------
LOGMNR_DICTSTATE$
LOGMNR_DICTIONARY$
LOGMNR_OBJ$
...
...

52 rows selected.

jbrock@orcl10g-local> select * from table(dbms_xplan.display_cursor(null,null,'last allstats'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  a6pqujpbcbn0b, child number 0
-------------------------------------
select table_name from t where pct_free is null

Plan hash value: 2209910213

------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| T            |      1 |     52 |     52 |00:00:00.01 |      18 |
|*  2 |   INDEX RANGE SCAN          | T_PCT_FREE_I |      1 |     52 |     52 |00:00:00.01 |       6 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PCT_FREE" IS NULL)


18 rows selected.

jbrock@orcl10g-local> create index t_logging_i on t(logging, ' ');

Index created.

jbrock@orcl10g-local> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

jbrock@orcl10g-local> select table_name from t where logging is null;

TABLE_NAME
-------------------------
RULE_SET_PR$
RULE_SET_IOT$
RULE_SET_ROP$
...
...

168 rows selected.

jbrock@orcl10g-local> select * from table(dbms_xplan.display_cursor(null,null,'last allstats'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  398k9s2hmr6yz, child number 0
-------------------------------------
select table_name from t where logging is null

Plan hash value: 1742471471

-----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| T           |      1 |    168 |    168 |00:00:00.01 |      50 |
|*  2 |   INDEX RANGE SCAN          | T_LOGGING_I |      1 |    168 |    168 |00:00:00.01 |      14 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LOGGING" IS NULL)


18 rows selected.

jbrock@orcl10g-local> create index t_last_analyzed_i on t(last_analyzed, ' ');

Index created.

jbrock@orcl10g-local> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

jbrock@orcl10g-local> select table_name from t where last_analyzed is null;

TABLE_NAME
-------------------------
SUMDELTA$
ATEMPTAB$
MAP_OBJECT
...
...

100 rows selected.

jbrock@orcl10g-local> select * from table(dbms_xplan.display_cursor(null,null,'last allstats'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  1hyzrytdz4214, child number 0
-------------------------------------
select table_name from t where last_analyzed is null

Plan hash value: 1601196873

------------------------------------------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| T    |      1 |    100 |    100 |00:00:00.01 |      59 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("LAST_ANALYZED" IS NULL)


17 rows selected.

REM this technique is not working with the DATE data type, so lets try a function-based index

jbrock@orcl10g-local> create index t_last_analyzed_fbi on t(DECODE(last_analyzed, null, 'NULL', last_analyzed));

Index created.

jbrock@orcl10g-local> exec dbms_stats.gather_table_stats(user,'T')

PL/SQL procedure successfully completed.

jbrock@orcl10g-local> select table_name from t where (DECODE(last_analyzed, null, 'NULL', last_analyzed)) = 'NULL';

TABLE_NAME
-------------------------
SUMDELTA$
ATEMPTAB$
MAP_OBJECT
...
...

100 rows selected.

jbrock@orcl10g-local> select * from table(dbms_xplan.display_cursor(null,null,'last allstats'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------
SQL_ID  9616r44uvbqbq, child number 0
-------------------------------------
select table_name from t where (DECODE(last_analyzed, null, 'NULL', last_analyzed)) = 'NULL'

Plan hash value: 441262039

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID| T                   |      1 |     54 |    100 |00:00:00.01 |      38 |
|*  2 |   INDEX RANGE SCAN          | T_LAST_ANALYZED_FBI |      1 |     54 |    100 |00:00:00.01 |       9 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T"."SYS_NC00051$"='NULL')

and Tom said...

I think this has to do with the clustering factor - not the datatype.

Consider as I create a table not ordered by anything and a table ordered by last_aanalyzed. I index them and query them - one uses the index the other does not.

All because of the clustering factor.

If you look at the clustering factor of your two indexes - you'll probably find them to be different - the function based one is probably being computed artificially low for some reason and looks more appealing:


ops$tkyte%ORA11GR2> create table t1 as select * from dba_tables;

Table created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats(user,'T1');

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create index t1_idx on t1(last_analyzed,0);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t2 as select * from dba_tables ORDER BY LAST_ANALYZED;

Table created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats(user,'T2');

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> create index t2_idx on t2(last_analyzed,0);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select count(*), count(last_analyzed), count(*)-count(last_analyzed) from t1;

  COUNT(*) COUNT(LAST_ANALYZED) COUNT(*)-COUNT(LAST_ANALYZED)
---------- -------------------- -----------------------------
      2813                 2669                           144

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select index_name, clustering_factor from user_indexes where index_name in ('T1_IDX','T2_IDX');

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
T1_IDX                                       958
T2_IDX                                        98

ops$tkyte%ORA11GR2> set autotrace traceonl explain
ops$tkyte%ORA11GR2> select * from t1 where last_analyzed is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   144 | 34848 |    31   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   144 | 34848 |    31   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("LAST_ANALYZED" IS NULL)

ops$tkyte%ORA11GR2> select * from t2 where last_analyzed is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 4244861920

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |   144 | 34848 |     8   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T2     |   144 | 34848 |     8   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T2_IDX |   144 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LAST_ANALYZED" IS NULL)

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> 


http://asktom.oracle.com/pls/asktom/asktom.search?p_string=clustering_factor

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, October 01, 2010 - 8:54 am UTC

Hi Tom,

Is the problem of clustering factor come into play when creating FBI on date?

Look like Oracle compute high CF when data in the table is not order on the underline date column

Also this behavior does not appear when the column is nunmber or varchar and not load in order of this column

Is it expected or bug?

Thanks



Tom Kyte
October 01, 2010 - 9:39 am UTC

... Also this behavior does not appear when the column is nunmber or varchar and
not load in order of this column ..

yes it does??? many examples on this site.

here is one - id1 and id2 both get about 1,000 rows from my table in the following (my all_objects has about 71-72,000 rows)

One query below uses and index - one does not. All because of the CLUSTERING FACTOR.

and they are numbers.

The clustering factor is what is causing what you are seeing - inspect them - and you'll find the clustering factor on last_analyzed is high while the clustering factor on the other attributes it likely much lower.

ops$tkyte%ORA11GR2> create table t
  2  as
  3  select mod(rownum,71) id1,
  4         trunc(rownum/1000) id2,
  5             a.*
  6    from all_objects a
  7  /

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx1 on t(id1);

Index created.

ops$tkyte%ORA11GR2> create index t_idx2 on t(id2);

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select index_name, clustering_factor from user_indexes where index_name in ( 'T_IDX1', 'T_IDX2' );

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
T_IDX1                                     71298
T_IDX2                                      1082

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t where id1 = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1010 |   181K|   303   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T    |  1010 |   181K|   303   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ID1"=42)

Note
-----
   - dynamic sampling used for this statement (level=2)

ops$tkyte%ORA11GR2> select * from t where id2 = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 3371054274

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |  1000 |   179K|    22   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T      |  1000 |   179K|    22   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX2 |  1000 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID2"=42)

Note
-----
   - dynamic sampling used for this statement (level=2)


More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library