Thanks for the question, Gabriel.

Asked: June 09, 2008 - 6:14 pm UTC

Last updated: April 05, 2022 - 5:21 am UTC

Version: 10.2.0.2

Viewed 10K+ times! This question is

You Asked

Hello Tom,

I have a large table (20 million rows) partitioned by date (2 million rows/part) that includes an hour column. The hour data is stored in the '99' format as we only have 24 values/day. However the client wants the data displayed as '9999' so we created a set of views that display the data as hour||'00'. Our base table unique index includes the hour column, but now all the queries are seen by the database as where hour ='1100' as oposed to where hour='11' and the index is not used any more. How can we tell the optimizer that the '1100' that the client is selecting through the web views is actually our '11' that is stored in the already existing index?

Thank you very much,

and Tom said...

Index the function....

ops$tkyte%ORA10GR2> create table t ( hour number(2) );

Table created.

ops$tkyte%ORA10GR2> create index t_idx on t(hour||'00');

Index created.

ops$tkyte%ORA10GR2> create view vw as select hour, hour||'00' new_hour  from t;

View created.

ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from vw where new_hour = '1100';

Execution Plan
----------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    13 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    13 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(TO_CHAR("HOUR")||'00'='1100')

Note
-----
   - dynamic sampling used for this statement

ops$tkyte%ORA10GR2> set autotrace off

Rating

(40 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Too much space required...

Gabriel Schor, June 10, 2008 - 1:51 pm UTC

Hello Tom,

Thank you for you help. I already have an index that includes the hour and it has 1.3Gb. I don't want to spend another 1.3Gb to index hour||'00'. Is there no other trick that I can use to tell oracle:
listen Oracle, when my client asks for 1100 he actually means 11?
No way to translate this to Oracle?

June 10, 2008 - 2:54 pm UTC

are you kidding me or what?

1.3gb - wait, I'll send you my pendrive, it'll fit on that.

Since your hour index is currently *useless*, drop it and re-index. No, there is no way to use the existing index, it isn't indexing ANYTHING you are whereing on.

Just an example

Gabriel Schor, June 10, 2008 - 4:27 pm UTC

Hello Tom,

I'm always serious when I talk about Oracle (sorry for the cheesy line, but I was forced to watch Pretty Woman last night so let me get at least this much use out of it).

That was just an example, we actualy have 60 reports that behave similarly. In that case this gets closer to 100Gb for additional indexes.

The current index is used for other queries, so we can't drop it.

I find it too bad that even when using a deterministic function like || we can't tell oracle to look for the parameter of the function as oposed to the result.

June 11, 2008 - 7:28 am UTC

well, why don't you - instead of changing what is selected - change what is INPUT.

*you* substr the input instead of concatenating to the output

select * from t where hour = to_number(substr(:x,1,2))

Possible alternative

Robert, June 11, 2008 - 10:58 am UTC

Gabriel

Could you select both the '99' and the '9999' format in your view and have the users filter on the 2 digit column and display the 4 digit one?

rpad

A reader, June 18, 2008 - 2:54 pm UTC

SQL> select rpad(to_char(a),4,'0') from (select 20 a from dual);

RPAD
----
2000

June 18, 2008 - 3:30 pm UTC

and
what
exactly
would that do for anything here.

to clarify for "a reader"

Stephan, June 18, 2008 - 5:20 pm UTC

> create table t ( hour number(2) );

Table created.

> insert into t (select level-1 from dual connect by level<24);

23 rows created.

> create index t_idx on t(hour);

Index created.

> set autotrace on explain
> select * 
  from ( select rpad( lpad( to_char( hour ), 2, '0' ), 4, '0') hours
           from ( select hour
                    from t
                )
       )
 where hours = '1100'
/


HOURS
------------
1100


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(RPAD(LPAD(TO_CHAR("HOUR"),2,'0'),4,'0')='1100')

Note
-----
   - dynamic sampling used for this statement

So - there's the output you're looking for, but as there is no index on lpad(hour) or rpad(hour), there is no index that can be used.

To make that suggestion work, you'd need an index that could be used:

> create index t_idx2 on t( rpad( lpad( to_char( hour ), 2, '0' ), 4, '0' ) );

Index created.

> select * from (select rpad( lpad( to_char( hour ), 2, '0' ), 4, '0' ) hours from (select hour from t)) where hours = '1100'
;

HOURS
------------
1100


Execution Plan
----------------------------------------------------------
Plan hash value: 3020041354

---------------------------------------------------------------------------
| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |    13 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| T_IDX2 |     1 |    13 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(RPAD(LPAD(TO_CHAR("HOUR"),2,'0'),4,'0')='1100')

Note
-----
   - dynamic sampling used for this statement

So, at best, you've given an alternative function that could be applied - but it is a lot more complicated than Tom's original suggestion of simply appending '00' to the hour...

Am I right, Tom?

June 18, 2008 - 5:35 pm UTC

basically, they don't want to create any more indexes, they want the index on the base column to be used.

but it cannot

that is why I asked "what would this do for us..."

Data warehousing

Gabriel Schor, June 19, 2008 - 10:00 am UTC

Hello Tom and Stephan,

Thank you for your interest.
I tried a couple of more things. I created an index organized table with 2 columns:

CREATE TABLE HOUR_TABLE
(
SHORT_HOUR VARCHAR2(2 BYTE),
LONG_HOUR VARCHAR2(4 BYTE),
CONSTRAINT PK_HOUR_TABLE
PRIMARY KEY
(SHORT_HOUR, LONG_HOUR)
)
ORGANIZATION INDEX;

CREATE UNIQUE INDEX REV_HOUR ON HOUR_TABLE
(LONG_HOUR, SHORT_HOUR);

CREATE UNIQUE INDEX UK_HOUR_TABLE ON HOUR_TABLE
(SHORT_HOUR);

ALTER TABLE HOUR_TABLE ADD (
CONSTRAINT UK_SHORT
UNIQUE (SHORT_HOUR));

I populated the short column with hour like 11 and the long hour column with hour like 1100. I select and where on the long_hour and I join with my other table on the short hour. Also I created on my other table a foreign key for hour to reference short hour in my new table. All this to no effect, it is still doing a full table scan.
I was basically trying to use the hour_table as a reference table. Am I doing something wrong?
Aren't there any data warehouse tools or objects that we can use? We are basically trying to tell oracle that we call 11 like 1100. It is like having an index on the state code 'QC' for Quebec or 'FL' for Florida but doing the query with where state='Florida'. There must be a way to tell oracle that Florida='FL' and that he should look for 'FL' and not Florida...

Thanks a lot again,

I also created a foreign key on the hour column

June 19, 2008 - 1:50 pm UTC

... Aren't there any data warehouse tools or objects that we can use? We are
basically trying to tell oracle that we call 11 like 1100. It is like having an
index on the state code 'QC' for Quebec or 'FL' for Florida but doing the query
with where state='Florida'. There must be a way to tell oracle that
Florida='FL' and that he should look for 'FL' and not Florida...
...

well, no. If you query "where state = 'Florida'" - the database will literally do that.

By duplicating the column (having the 'dimension' table if you will) you turn it into a lookup and you would join - eg: 'fact' table has 2 character state code, dimension has two character state to long state name. A join would be used to allow you to query on Florida and see either FL or Florida in the output.

you could do the same for this

But, I would just use the function based index, you are done. Disk space is cheap. If you add up the amount of $$$'s spent on you during this thought process - you probably could have bought 5 times as much disk as you need for this.

Or, have the client application know what to do in the first place - eg: user inputs 1100 - meaning 11 really.

Actually disk is more expensive

Gabriel Schor, June 19, 2008 - 3:13 pm UTC

Hello Tom,

Actually with all the bureaucracy around it will take many more times the $$$ to buy the disk space than to implement the reference table. I tried to implement the reference table but I was unable to make the query use the index. Can you please show us an example based on the tables that you started the example with?

Thank you very much,

June 19, 2008 - 3:48 pm UTC

it would all come back to INDEXING THE DATA THEY WHERE ON.

can you implement a dimension? sure

would it magically solve your problem? no

you will either

a) live with poor performance
b) have the client realize IT should use "where hour = substr(:bind,1,2)" instead of "where hour_string_formatted = :bind"
c) index the information you are using in the where clause.

the price of disk just dropped $100 in this little conversation....

This little formatting change is going to cost - one way or the other.

Hmmm - not sure I'm seeing what you are

Stephan, June 19, 2008 - 5:58 pm UTC

Hi Tom, Gabriel

I'm not sure I'm seeing what you're seeing. I have 20m records in t

> select hour, count(*) from t group by hour order by hour;

      HOUR             COUNT(*)
---------- --------------------
         0              875,008
         1              875,008
         2              875,008
         3              875,008
         4              875,008
         5              875,008
         6              875,008
         7              875,008
         8              875,008
         9              875,008
        10              875,008
        11              875,008
        12              875,008
        13              875,008
        14              875,008
        15              874,987
        16              874,987
        17              874,987
        18              874,987
        19              874,987
        20              874,987
        21              874,987
        22              874,987
        23              875,008

24 rows selected.

I then created the exact same IOT that Gabriel did.

> set autot traceonly explain statistics
> select h.* from t h, hour_table ht where ht.short_hour = h.hour and ht.long_hour = '1100';

875008 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3377880327

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |    10 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS     |          |     1 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| REV_HOUR |     1 |     7 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| T_IDX    |     1 |     3 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("HT"."LONG_HOUR"='1100')
   3 - access("H"."HOUR"=TO_NUMBER("HT"."SHORT_HOUR"))

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
        405  recursive calls
          0  db block gets
      64179  consistent gets
       2922  physical reads
     232044  redo size
   11783764  bytes sent via SQL*Net to client
     642063  bytes received via SQL*Net from client
      58335  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
     875008  rows processed

For me, the index is being used. Why the difference?

June 19, 2008 - 6:49 pm UTC

what did you index though.

he wants ONE INDEX and to where on the unindexed column

No index on (1100)
Index on 11

where on 1100

the entire goal of his is "no index on anything else other than hour which is two digits"

You're probably right, but...

Stephan, June 23, 2008 - 10:18 am UTC

I'm fairly sure you are right - I believe I had the index on t that you had created - on hour||'00'

So I dropped that and tried again:

> drop index h_idx;

Index dropped.

> create index h_idx on h(hour);

Index created.

So, I now have h, with only the hour column (hour stored as a 1-2 digit number). I also have the IOT that Gabriel created, exactly as he described. I am not getting the FTS that he described, however. So I'm wondering why that approach isn't feasible for him...

Wrote file afiedt.buf

  1* select h.* from h, hour_table ht where ht.short_hour = h.hour and ht.long_hour = '1100'
> /

875008 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2190710608

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |     1 |     9 |     4  (50)| 00:00:01 |
|   1 |  NESTED LOOPS     |          |     1 |     9 |     4  (50)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| REV_HOUR |     1 |     7 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| H_IDX    |   874K|  1707K|     3  (67)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("HT"."LONG_HOUR"='1100')
   3 - access("H"."HOUR"=TO_NUMBER("HT"."SHORT_HOUR"))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      59704  consistent gets
       1277  physical reads
          0  redo size
   11783764  bytes sent via SQL*Net to client
     642063  bytes received via SQL*Net from client
      58335  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     875008  rows processed

June 23, 2008 - 1:05 pm UTC

what space have you saved?

none

it would be easier to just index hour||'00'

right now they have

create view view as select ...., hour, hour||'00' as pretty_hour, .....

they want to be able to have "where pretty_hour = '1100'" use the index on hour.

So, you must accomplish your goal without adding any indexes (you have two apparently, one on 1100 and one on just 11), and that auxiliary table consumes storage too

full examples are appreciated - it is too hard to page up and down and try to figure out what you had in place.

but all you get is an index on hour. That dimension table that converts hour into pretty_hour - consumes storage, as would all of the indexes you create.

Not much storage

Gabriel Schor, June 23, 2008 - 3:03 pm UTC

Hello Tom,

The dimension table doesn't consume that much space, it would only have 24 entries, we're only storing 1 entry/hour and all entries are on the hour (00,01,02,03 etc) that doesn't even register as a factor. Even though, I wasn't able to modify my query to make it use the index. Can you please show us how would you do it?

Thank you very much,

June 23, 2008 - 3:46 pm UTC

just like any star schema would be...

you want the foreign key to ensure your "dimension" always has the right values (prefer an error during modification over wrong result during select!)

you create the dimension, index it (24 rows as you said, but needs a primary key index and an index on the dimension attribute, so you have the table plus two extra indexes)


ops$tkyte%ORA10GR2> create table t
  2  as
  3  select a.*, to_number(to_char( created,'hh24')) hour
  4    from all_objects a
  5  /

Table created.

ops$tkyte%ORA10GR2> create index t_idx on t(hour)
  2  /

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t_hour_dim
  2  as
  3  select level hour, level * 100 hour_pretty
  4    from dual
  5  connect by level <= 24
  6  /

Table created.

ops$tkyte%ORA10GR2> alter table t_hour_dim add constraint t_hour_dim_pk primary key(hour);

Table altered.

ops$tkyte%ORA10GR2> alter table t add constraint t_fk_t_hour_dim foreign key (hour) references t_hour_dim(hour);

Table altered.

ops$tkyte%ORA10GR2> create index t_hour_dim_idx on t_hour_dim(hour_pretty);

Index created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create or replace view vw
  2  as
  3  select t.*, t_hour_dim.hour_pretty
  4    from t, t_hour_dim
  5   where t.hour = t_hour_dim.hour
  6  /

View created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from vw where hour_pretty = 1100;

Execution Plan
----------------------------------------------------------
Plan hash value: 480274813

------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                |  8102 |  1321K|   110   (2)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | T              |  8102 |  1115K|   108   (2)| 00:00:01 |
|   2 |   NESTED LOOPS                |                |  8102 |  1321K|   110   (2)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T_HOUR_DIM     |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T_HOUR_DIM_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T_IDX          |  8102 |       |     6   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T_HOUR_DIM"."HOUR_PRETTY"=1100)
   5 - access("T"."HOUR"="T_HOUR_DIM"."HOUR")

The table is partitioned

Gabriel Schor, June 25, 2008 - 10:31 am UTC

Hello Tom,

Thank you very much for your help.
I mentioned in the beggining that my tables are partitioned. I modified your example to illustrate that and it isn't using the index again.

drop table t;

create table t
PARTITION BY RANGE (myday)
(
PARTITION WEEK_20080605 VALUES LESS THAN (TO_DATE(' 2008-06-05 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080606 VALUES LESS THAN (TO_DATE(' 2008-06-06 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080607 VALUES LESS THAN (TO_DATE(' 2008-06-07 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080608 VALUES LESS THAN (TO_DATE(' 2008-06-08 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080609 VALUES LESS THAN (TO_DATE(' 2008-06-09 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080610 VALUES LESS THAN (TO_DATE(' 2008-06-10 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080611 VALUES LESS THAN (TO_DATE(' 2008-06-11 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080612 VALUES LESS THAN (TO_DATE(' 2008-06-12 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080613 VALUES LESS THAN (TO_DATE(' 2008-06-13 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080614 VALUES LESS THAN (TO_DATE(' 2008-06-14 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080615 VALUES LESS THAN (TO_DATE(' 2008-06-15 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080616 VALUES LESS THAN (TO_DATE(' 2008-06-16 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080617 VALUES LESS THAN (TO_DATE(' 2008-06-17 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080618 VALUES LESS THAN (TO_DATE(' 2008-06-18 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080619 VALUES LESS THAN (TO_DATE(' 2008-06-19 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080620 VALUES LESS THAN (TO_DATE(' 2008-06-20 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080621 VALUES LESS THAN (TO_DATE(' 2008-06-21 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080622 VALUES LESS THAN (TO_DATE(' 2008-06-22 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080623 VALUES LESS THAN (TO_DATE(' 2008-06-23 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080624 VALUES LESS THAN (TO_DATE(' 2008-06-24 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_20080625 VALUES LESS THAN (TO_DATE(' 2008-06-25 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
PARTITION WEEK_MAX VALUES LESS THAN (MAXVALUE)
)as
select a.*, to_number(to_char( created,'hh24')) hour, trunc(sysdate - rownum/2880) myday
from all_objects a;

CREATE UNIQUE INDEX UK_T ON T
(OWNER, OBJECT_NAME, OBJECT_ID, HOUR, MYDAY)
TABLESPACE RPT_INDX_1WEEKLY
LOCAL (
PARTITION WEEK_20080605,
PARTITION WEEK_20080606,
PARTITION WEEK_20080607,
PARTITION WEEK_20080608,
PARTITION WEEK_20080609,
PARTITION WEEK_20080610,
PARTITION WEEK_20080611,
PARTITION WEEK_20080612,
PARTITION WEEK_20080613,
PARTITION WEEK_20080614,
PARTITION WEEK_20080615,
PARTITION WEEK_20080616,
PARTITION WEEK_20080617,
PARTITION WEEK_20080618,
PARTITION WEEK_20080619,
PARTITION WEEK_20080620,
PARTITION WEEK_20080621,
PARTITION WEEK_20080622,
PARTITION WEEK_20080623,
PARTITION WEEK_20080624,
PARTITION WEEK_20080325,
PARTITION WEEK_MAX
)
NOPARALLEL;

create table hour_dim1
as
select level hour, level * 100 hour_pretty
from dual
connect by level <= 24;

alter table hour_dim1 add constraint t_hour_dim1_pk primary key(hour);

alter table t add constraint fk_t_hour_dim foreign key (hour) references hour_dim1(hour);

create index ix_hour_dim1 on hour_dim1(hour_pretty);

create or replace view vw
as
select t.*, hour_dim1.hour_pretty
from t, hour_dim1
where t.hour = hour_dim1.hour;

set autotrace traceonly explain
select * from vw where hour_pretty = 1100;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=213 Card=2 K Bytes=340 K)
1 0 HASH JOIN (Cost=213 Card=2 K Bytes=340 K)
2 1 TABLE ACCESS BY INDEX ROWID OMDMS_RPT.HOUR_DIM1 (Cost=2 Card=1 Bytes=26)
3 2 INDEX RANGE SCAN OMDMS_RPT.IX_HOUR_DIM1 (Cost=1 Card=1)
4 1 PARTITION RANGE ALL (Cost=210 Card=56 K Bytes=5 M)
5 4 TABLE ACCESS FULL OMDMS_RPT.T (Cost=210 Card=56 K Bytes=5 M)

Thank you again,

June 25, 2008 - 11:41 am UTC

you have no index on T(hour)

you said you did. You do not.

Review my schema...

ops$tkyte%ORA10GR2> create index t_hour_idx on t(hour);

Index created.

ops$tkyte%ORA10GR2>  set autotrace traceonly explain
ops$tkyte%ORA10GR2>  select * from vw where hour_pretty = 1100;

Execution Plan
----------------------------------------------------------
Plan hash value: 2143162902

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |              | 10578 |  1818K|   185   (2)| 00:00:01 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| T            | 10578 |  1549K|   183   (2)| 00:00:01 | ROWID | ROWID |
|   2 |   NESTED LOOPS                     |              | 10578 |  1818K|   185   (2)| 00:00:01 |       |       |
|   3 |    TABLE ACCESS BY INDEX ROWID     | HOUR_DIM1    |     1 |    26 |     2   (0)| 00:00:01 |       |       |
|*  4 |     INDEX RANGE SCAN               | IX_HOUR_DIM1 |     1 |       |     1   (0)| 00:00:01 |       |       |
|*  5 |    INDEX RANGE SCAN                | T_HOUR_IDX   | 10578 |       |     8  (13)| 00:00:01 |       |       |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("HOUR_DIM1"."HOUR_PRETTY"=1100)
   5 - access("T"."HOUR"="HOUR_DIM1"."HOUR")

Note
-----
   - dynamic sampling used for this statement

Not exactly

Gabriel Schor, June 25, 2008 - 1:06 pm UTC

Hello Tom,

I said:

"Our base table unique index includes the hour column,"

:-)

Next time I'll be more specific and include the DDL from the start so we don't go on a wrong path.
Given the existing index, is there still anything that can be done to have it used in queries?

Thank you,

June 25, 2008 - 3:32 pm UTC

nope, hour is hidden way way in the middle. It would have to be on the leading edge in order to go from the "dimension" table to those rows with that hour, or you would have to use a single value

select * from t
where first_col_in_index = ?
  and second_col_in_index = ?
  and hour_which_is_third = (select hour from dim where pretty_hour = 1100 );

that could use that index.

Local and Global Index

Bhushan, July 08, 2008 - 4:43 pm UTC

Hello Thomas,
Is there any view which can differentiate between local and Global indexes on a table for a partitioned table.
I saw the all_part_indexes view, it contains the name of the indexes but i could no find how to distinguish a Local Index from a Global looking at it.

Regards,
Bhushan

July 08, 2008 - 6:16 pm UTC

ops$tkyte%ORA10GR2> CREATE TABLE t
  2  (
  3    dt  date,
  4    x   int,
  5    y   varchar2(30)
  6  )
  7  PARTITION BY RANGE (dt)
  8  (
  9    PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
 10    PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
 11    PARTITION junk VALUES LESS THAN (MAXVALUE)
 12  )
 13  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index g on t(x) global partition by hash(x) partitions 8;

Index created.

ops$tkyte%ORA10GR2> create index l on t(y) local;

Index created.

ops$tkyte%ORA10GR2> select index_name, table_name, partitioning_type, locality from user_part_indexes where index_name in ('G','L');

INDEX_NAME                     TABLE_NAME                     PARTITI LOCALI
------------------------------ ------------------------------ ------- ------
G                              T                              HASH    GLOBAL
L                              T                              RANGE   LOCAL

Cool!!!

Bhushan, July 09, 2008 - 12:15 am UTC

Thank You for the quick and accurate response.

I'm afraid i've missed something or decode would be benefit ?

Christian G, July 09, 2008 - 3:24 am UTC

Hello,

We have two usage of the "hour" column : 
   1. predicate
   2. display value

So, why don't you use the DECODE function in the select clause :
SQL> create table t (h number(2));

Table created.

SQL> insert into t (select level-1 from dual connect by level<=24);

24 rows created.

SQL> create index t_idx on t(h);

Index created.

SQL> set autotrace on explain
SQL>  select decode(h,-1, -1, h||'00') from t where h=11;

DECODE(H,-1,-1,H||'00')
-----------------------
                   1100


Execution Plan
----------------------------------------------------------
Plan hash value: 2946670127

-------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |     1 |    13 |     1   (0)| 00:00:01
|*  1 |  INDEX RANGE SCAN| T_IDX |     1 |    13 |     1   (0)| 00:00:01
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("H"=11)

Note
-----
   - dynamic sampling used for this statement

Let me know if I'm wrong, and what step I've missed...

Christian.

For Christian

Stephan, July 09, 2008 - 8:56 am UTC

Hi Christian,

You've missed the point entirely - it was not to get output in the format you've returned, but to query where hour = '1100'.

You're still querying where hour = 11

Index Range Scan

Nadir, April 13, 2010 - 4:39 am UTC

Hi Tom,
   I am following a meatlink document to understand the index range scan. My database version is 10.2.0.3.0. For this purpose i created a table EMPL by using the following query:
  SQL> creeate table empl as select * from scott.emp;

Next i create a non unique index by using the following query.
  SQL> create index empl_empno_nonunique on scott.empl(empno);

Then i set the sqlplus settings as follows:
  SQL> set autotrace on explain

After doing all this, when i query the empl table by using the followin query

SQL> select empno,ename from empl where empno > 7876;

it gives me the following Execution plan.

     EMPNO ENAME
---------- ----------
      7900 JAMES
      7902 FORD
      7934 MILLER

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 card=3 Bytes=60)
   1    0   TABLE ACCESS (FULL) OF 'EMPL' (TABLE) (Cost=2 Card=3 Bytes=60)

Now in this execution plan query is not using index and there is no index range scan. 

Please tell me what step i am missing, because of which this query is not doing index range scan.

April 13, 2010 - 9:20 am UTC

your table is so very very tiny, that using an index would take more work than not using it.

so we do not use it.

make your table bigger - indexes are good for range scanning in general when you get a small number of rows from a big table (all is relative, small is small - not x%, not y%- just small).

3 rows out of 14 - we figured "eh, just read the single block and be done with it"

Indexes

Aru, April 15, 2010 - 12:04 am UTC

Hi Tom,

We are running oracle 10gR2 and have a major problem with a query that takes a long time to execute and the difference is that LNNVL that is in the predicate information. The same indexes are being used in both the good and bad. The query has gone from executing in .09 secs to 111 secs overnight.

GOOD
----
SQL_TEXT
----------------------------------------------------------------
SELECT /*skew3*/ ROWID, MOBILE_NO_V, DECODE(SERVICE_TYPE_N, 22, CHAR_FLD5_V, '1003') TYPE_V, DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 2) CURRENT_STATE_V, DECODE(SERVICE_TYPE_N, 22, CHAR_FLD8_V, 0) PREVIOUS_STATE_V, DECODE(SERVICE_TYPE_N, 22, NUM_FLD1_
N, 0) CURRENTBALANCE_N, DECODE(SERVICE_TYPE_N, 22, NUM_FLD2_N,0) PREVIOUS_BALANCE_V, CALL_DATE_DT CHANGETIME, DECODE(SERVICE_TYPE_N, 22, CHAR_FLD6_V, 0) OLDVALUE, DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 0) NEWVALUE, FILE_ID_N, SERVICE_TYPE_N FROM
tabillity_prod.GSM_PREPAID_CDRS WHERE (SERVICE_TYPE_N = 1 AND CHAR_FLD3_V = '7' AND CHAR_FLD20_V = 'Q') OR SERVICE_TYPE_N = 22 AND CHAR_FLD11_V = 'Q'

-------------------------------------------------------------------+-----------------------------------+---------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop |
-------------------------------------------------------------------+-----------------------------------+---------------+
| 0 | SELECT STATEMENT | | | | 860 | | | |
| 1 | CONCATENATION | | | | | | | |
| 2 | PARTITION RANGE ALL | | 1 | 72 | 430 | 00:00:06 | 1 | 214 |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID | GSM_PREPAID_CDRS | 1 | 72 | 430 | 00:00:06 | 1 | 214 |
| 4 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_5| 1 | | 429 | 00:00:06 | 1 | 214 |
| 5 | PARTITION RANGE ALL | | 4 | 288 | 430 | 00:00:06 | 1 | 214 |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | GSM_PREPAID_CDRS | 4 | 288 | 430 | 00:00:06 | 1 | 214 |
| 7 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_4| 4 | | 429 | 00:00:06 | 1 | 214 |
-------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
4 - access("SERVICE_TYPE_N"=22 AND "CHAR_FLD11_V"='Q')
6 - filter((LNNVL("SERVICE_TYPE_N"=22) OR LNNVL("CHAR_FLD11_V"='Q')))
7 - access("SERVICE_TYPE_N"=1 AND "CHAR_FLD3_V"='7' AND "CHAR_FLD20_V"='Q')

Predicate Information (identified by operation id):
---------------------------------------------------

6 - access("SERVICE_TYPE_N"=1 AND "CHAR_FLD3_V"='7' AND "CHAR_FLD20_V"='Q')
8 - access("SERVICE_TYPE_N"=22 AND "CHAR_FLD11_V"='Q')

BAD
---
SQL_TEXT
----------------------------------------------------------------
SELECT ROWID,MOBILE_NO_V,DECODE(SERVICE_TYPE_N,22,CHAR_FLD5_V,'1003') TYPE_V, DECODE(SERVICE_TYPE_N,22,CHAR_FLD7_V,2) CURRENT_S
TATE_V, DECODE(SERVICE_TYPE_N,22,CHAR_FLD8_V,0)PREVIOUS_STATE_V,DECODE(SERVICE_TYPE_N,22,NUM_FLD1_N,0) CURRENTBALANCE_N, DECODE(SERVICE_TYPE_N,22,NUM_FLD2_N,0) REVIOUS_BALANCE_V,CALL_DATE_DT CHANGETIME, DECODE(SERVICE_TYPE_N,22,CHAR_FLD6_V,0) OLDVALUE,
DECODE(SERVICE_TYPE_N,22,CHAR_FLD7_V,0) NEWVALUE, ILE_ID_N, SERVICE_TYPE_N FROM GSM_PREPAID_CDRS WHERE SERVICE_TYPE_N=22 AND CHAR_FLD11_V='Q' OR (SERVICE_TYPE_N=1 AND CHAR_FLD3_V='7' AND CHAR_FLD20_V='Q')

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 14756 (100)| | |
| 1 | CONCATENATION | | | | | | |
| 2 | PARTITION RANGE ALL | | 41 | 5658 | 5470 (1)| 1 | 239 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 41 | 5658 | 5470 (1)| 1 | 239 |
|* 4 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_5 | 171K| | 844 (1)| 1 | 239 |
| 5 | PARTITION RANGE ALL | | 4114 | 554K| 9285 (1)| 1 | 239 |
|* 6 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 4114 | 554K| 9285 (1)| 1 | 239 |
|* 7 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_4 | 165K| | 881 (1)| 1 | 239 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(("CHAR_FLD3_V"='7' AND "CHAR_FLD20_V"='Q'))
4 - access("SERVICE_TYPE_N"=1)
6 - filter("CHAR_FLD11_V"='Q')
7 - access("SERVICE_TYPE_N"=22)
filter((LNNVL("SERVICE_TYPE_N"=1) OR LNNVL("CHAR_FLD3_V"='7') OR LNNVL("CHAR_FLD20_V"='Q')))

Really confused about this LNNVL function and think that may be the cause for this. The same indexes are being used in both cases.
Thanks for all help in advance.
Regards,
Aru.

April 15, 2010 - 8:21 am UTC

 
WHERE (SERVICE_TYPE_N = 1 AND CHAR_FLD3_V = '7' AND CHAR_FLD20_V = 'Q')
   OR SERVICE_TYPE_N = 22   
  AND CHAR_FLD11_V = 'Q'


WHERE SERVICE_TYPE_N=22 
  AND CHAR_FLD11_V='Q' 
   OR (SERVICE_TYPE_N=1 AND CHAR_FLD3_V='7' AND CHAR_FLD20_V='Q')

You do see that these are not remotely the same query right?

Index

Aru, April 15, 2010 - 5:35 pm UTC

Hi Tom,
Sorry! my bad.
We were just trying the query with a different order that particular time.

GOOD
----

SELECT ROWID,
MOBILE_NO_V,
DECODE(SERVICE_TYPE_N, 22,CHAR_FLD5_V,'1003') TYPE_V,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 2) CURRENT_STATE_V,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD8_V, 0) PREVIOUS_STATE_V,
DECODE(SERVICE_TYPE_N, 22, NUM_FLD1_N, 0) CURRENTBALANCE_N,
DECODE(SERVICE_TYPE_N, 22, NUM_FLD2_N, 0) PREVIOUS_BALANCE_V,
CALL_DATE_DT CHANGETIME,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD6_V, 0) OLDVALUE,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 0) NEWVALUE,
FILE_ID_N,
SERVICE_TYPE_N
FROM TABILLITY_PROD.GSM_PREPAID_CDRS
WHERE SERVICE_TYPE_N = 22
AND CHAR_FLD11_V = 'Q'
OR (SERVICE_TYPE_N = 1 AND CHAR_FLD3_V = '7' AND CHAR_FLD20_V = 'Q');

============
Plan Table
============
-------------------------------------------------------------------+-----------------------------------+---------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop |
-------------------------------------------------------------------+-----------------------------------+---------------+
| 0 | SELECT STATEMENT | | | | 860 | | | |
| 1 | CONCATENATION | | | | | | | |
| 2 | PARTITION RANGE ALL | | 1 | 72 | 430 | 00:00:06 | 1 | 214 |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID | GSM_PREPAID_CDRS | 1 | 72 | 430 | 00:00:06 | 1 | 214 |
| 4 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_5| 1 | | 429 | 00:00:06 | 1 | 214 |
| 5 | PARTITION RANGE ALL | | 4 | 288 | 430 | 00:00:06 | 1 | 214 |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | GSM_PREPAID_CDRS | 4 | 288 | 430 | 00:00:06 | 1 | 214 |
| 7 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_4| 4 | | 429 | 00:00:06 | 1 | 214 |
-------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
4 - access("SERVICE_TYPE_N"=22 AND "CHAR_FLD11_V"='Q')
6 - filter((LNNVL("SERVICE_TYPE_N"=22) OR LNNVL("CHAR_FLD11_V"='Q')))
7 - access("SERVICE_TYPE_N"=1 AND "CHAR_FLD3_V"='7' AND "CHAR_FLD20_V"='Q')

BAD
---

SQL_TEXT
----------------------------------------------------------------
SELECT ROWID,
MOBILE_NO_V,
DECODE(SERVICE_TYPE_N, 22,CHAR_FLD5_V,'1003') TYPE_V,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 2) CURRENT_STATE_V,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD8_V, 0) PREVIOUS_STATE_V,
DECODE(SERVICE_TYPE_N, 22, NUM_FLD1_N, 0) CURRENTBALANCE_N,
DECODE(SERVICE_TYPE_N, 22, NUM_FLD2_N, 0) PREVIOUS_BALANCE_V,
CALL_DATE_DT CHANGETIME,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD6_V, 0) OLDVALUE,
DECODE(SERVICE_TYPE_N, 22, CHAR_FLD7_V, 0) NEWVALUE,
FILE_ID_N,
SERVICE_TYPE_N
FROM TABILLITY_PROD.GSM_PREPAID_CDRS
WHERE SERVICE_TYPE_N = 22
AND CHAR_FLD11_V = 'Q'
OR (SERVICE_TYPE_N = 1 AND CHAR_FLD3_V = '7' AND CHAR_FLD20_V = 'Q');

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 38133 (100)| | |
| 1 | CONCATENATION | | | | | | |
| 2 | PARTITION RANGE ALL | | 123 | 16974 | 15124 (1)| 1 | 214 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 123 | 16974 | 15124 (1)| 1 | 214 |
|* 4 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_5 | 491K| | 1772 (1)| 1 | 214 |
| 5 | PARTITION RANGE ALL | | 12290 | 1656K| 23010 (1)| 1 | 214 |
|* 6 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 12290 | 1656K| 23010 (1)| 1 | 214 |
|* 7 | INDEX RANGE SCAN | I_GSM_PREPAID_CDRS_4 | 491K| | 1974 (1)| 1 | 214 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(("CHAR_FLD3_V"='7' AND "CHAR_FLD20_V"='Q'))
4 - access("SERVICE_TYPE_N"=1)
6 - filter("CHAR_FLD11_V"='Q')
7 - access("SERVICE_TYPE_N"=22)
filter((LNNVL("SERVICE_TYPE_N"=1) OR LNNVL("CHAR_FLD3_V"='7') OR LNNVL("CHAR_FLD20_V"='Q')))

Regards,
Aru.

April 16, 2010 - 9:27 am UTC

what happens in between good and bad to cause the plan to change?

are you using cursor_sharing at all?

Indexes

Aru, April 19, 2010 - 3:54 pm UTC

Hi Tom,
Nothing is happening on the database side. It's just business as usual.
CURSOR_SHARING=exact is being used in out database.
Regards,
Aru.

April 19, 2010 - 4:30 pm UTC

sorry, but something has to be changing if your plans are changing - you are not using binds, so it is not bind peeking.

statistics are gathered?
something is different between the time plan A and plan B are developed.

Indexes

Aru, April 19, 2010 - 7:45 pm UTC

Hi Tom,
You are RIGHT as usual. Talking to the client's in house DBA's, they said they decided to drop the histograms as the table has no skewed data( so they say). It was actually to fix a, much more run and important SQL, but this sql got affected as well.

But Tom,
The indexes are the same. So what is the lnnvl function doing that is so performance impacting?
I thought the histograms have influence on the CBO optimizer plan on first hard parse, but there seems to be much more than that evidently....

Regards,
Aru.

April 20, 2010 - 8:02 am UTC

it isn't the lnnvl function in as much as a change in access path - the plan changed.

The access/filter is the key - access means "used index to satisfy this part", filter means "did this once for every row we processed"

Indexes

Aru, April 20, 2010 - 2:36 pm UTC

Thanks for that Tom,

The part where it says in the predictate filter:
'filter((LNNVL("SERVICE_TYPE_N"=1) OR LNNVL("CHAR_FLD3_V"='7') OR LNNVL("CHAR_FLD20_V"='Q')))
'

Does this mean oracle is checking for null values on the service_type, char_fld3_v and char_fld20_v using the lnnvl function? 
Earlier the number of null values was provided to the CBO via the histograms, is it? In the test environment I can see that :-

  1* select bucket_cnt,row_cnt,null_cnt from hist_head$ where obj#=12184
SQL> /

BUCKET_CNT    ROW_CNT   NULL_CNT
---------- ---------- ----------
         1          0          0
       254        255          0
         1          0          0
      5709          4          0
      5709         10          0
      5709         73          0
         1          0          0
       254        255          0
      5709          3          0
      5180          1    4258879
      5658          6     410591
     60195          5   37481541
      5709         14          0
      4239          2   11834692
       254        233       7358
       254         48          0
       254        112   41127461
     32446         14   41127461
      3150          3   20602026
      5187          3    4202523
       254        118    7986404
       254         19   33615281


Is my assumption right?

Regards,
Aru

April 20, 2010 - 2:52 pm UTC

http://docs.oracle.com/docs/cd/B13789_01/server.101/b10759/functions068.htm

not sure what you mean by "earlier" because you said "nothing changed"

Indexes

Aru, April 20, 2010 - 8:55 pm UTC

Hi Tom,
One post earlier I had apologized for saying 'nothing has changed' as the client DBA's had not gotten back to us till then and we had changed nothing.

"You are RIGHT as usual. Talking to the client's in house DBA's, they said they decided to drop the
histograms as the table has no skewed data( so they say). It was actually to fix a, much more run
and important SQL, but this sql got affected as well.
".

Have read the link you have sent me. The lnnvl function is not specified in our query. So oracle is internally applying the lnnvl function it seems. Is this expected behaviour? Does that also mean that histograms will be needed for table with null values in columns, as they have the null_cnt information in them?
Am really confused.
As soon as we recreate the histograms the lnnvl function is no longer applied.
Regards,
Aru.

April 21, 2010 - 8:44 am UTC

well, that change impacted the estimated cardinality and changed the plan. It is not the lnnvl function inasmuch as the PLAN CHANGE - do you see the access versus the access/filter.

If the estimated card=values are wrong because of insufficient or incorrect statistics - we'll get the wrong plan. And that is precisely what happened here.

In this case, you do want those histograms in order to get the right estimated cardinalities to get the right plan.

Stats.

Aru, May 05, 2010 - 7:27 pm UTC

Hi Tom,
Thanks for all help. We have got the application code changed to use bind variables now for the above sql's. Very happy about that. But I have a couple of questions now which I am bit confused about.
The query and plan is as below.

Without column stats
--------------------

SQL_ID I# Hash Value EXECUTIONS INVALS LOADS FIRST_LOAD_TIME LAST_LOAD_TIME LAST_ACTIVE_TIME DISK_READS BUFFER_GETS ROWS_PROC CPU_SECS ELAPSED_SECS SECS_P_EXEC
------------- ---- ----------- ---------- ---------- ---------- ------------------- ------------------- ------------------- ---------- ------------- ---------- ---------- ------------ -----------
fbcu65nrxsh7r 1 802963703 12 0 1 2010-05-06/09:28:20 2010-05-06/09:28:20 2010-05-06/09:35:55 25903 530688 0 15.274169 284.33664 23.6947

SQLTXT
--------------------------------------------------------------------------------
SELECT TO_CHAR(CALL_DATE_DT,'DD/MM/YY
YY HH:MI:SS AM') AS CALLDATETIME,(NVL(CHARGE_AMT_N,0)/1000) AS AMO
UNT, CHAR_FLD7_V AS CALLED_CALLING_NUMBER, (NUM_FLD1_N/1000) AS B
ALANCE, DECODE(CHAR_FLD4_V, '0', 'INTERNATIONAL CALL','1', 'Other network call
', '2', '2degrees call', '3', 'CALLED ROAMING', '4', 'NO CHARGED','5', 'FORWARDI
NG CALLING NUMBER','6', 'FORWARDING CALLED NUMBER','7', 'CHANGE ORDER OF A FORWA
RDED VOICE MAILBOX','8', 'ONE OFF CHARGE','9', 'ONE OFF CHARGE','T', 'Intl. roam
ing call','V', 'Voicemail call','R', 'Roaming received call',CHAR_FLD4_V) AS CHA
RGE_TYPE, CHAR_FLD8_V AS CALL_DURATION FROM GSM_PREPAID_CDRS A W
HERE SERV_ACCOUNT_LINK_CODE_N =:1 AND CALL_DATE_DT >=:2 AND CALL_DATE_DT <=:3 A
ND SERVICE_TYPE_N = 1 ORDER BY CALL_DATE_DT DESC

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 895 (100)| | |
| 1 | FILTER | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 10 | 1120 | 895 (1)| KEY | KEY |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 10 | 1120 | 895 (1)| KEY | KEY |
| 4 | INDEX RANGE SCAN DESCENDING | I_GSM_PREPAID_CDRS_1 | 745 | | 876 (1)| KEY | KEY |
-----------------------------------------------------------------------------------------------------------------

BIND VALUE_STRING DATATYPE_STRING LAST_CAPTURED
---------------- ------------------------------------------------------------ --------------- ------------------
(child 0) B1: 1028 VARCHAR2(32) 06-May-10 09:28:20
(child 0) B2: 05-APR-10 VARCHAR2(32) 06-May-10 09:28:20
(child 0) B3: 05-MAY-10 VARCHAR2(32) 06-May-10 09:28:20

With column stats on SERV_ACCOUNT_LINK_CODE_N
---------------------------------------------

SQL_ID I# Hash Value EXECUTIONS INVALS LOADS FIRST_LOAD_TIME LAST_LOAD_TIME LAST_ACTIVE_TIME DISK_READS BUFFER_GETS ROWS_PROC CPU_SECS ELAPSED_SECS SECS_P_EXEC
------------- ---- ----------- ---------- ---------- ---------- ------------------- ------------------- ------------------- ---------- ------------- ---------- ---------- ------------ -----------
fbcu65nrxsh7r 1 802963703 50962 0 1 2010-05-06/09:28:20 2010-05-06/10:20:24 2010-05-06/10:47:04 3 4487261 1884720 51.627316 73.232044 0.0014

SQLTXT
--------------------------------------------------------------------------------
SELECT TO_CHAR(CALL_DATE_DT,'DD/MM/YY
YY HH:MI:SS AM') AS CALLDATETIME,(NVL(CHARGE_AMT_N,0)/1000) AS AMO
UNT, CHAR_FLD7_V AS CALLED_CALLING_NUMBER, (NUM_FLD1_N/1000) AS B
ALANCE, DECODE(CHAR_FLD4_V, '0', 'INTERNATIONAL CALL','1', 'Other network call
', '2', '2degrees call', '3', 'CALLED ROAMING', '4', 'NO CHARGED','5', 'FORWARDI
NG CALLING NUMBER','6', 'FORWARDING CALLED NUMBER','7', 'CHANGE ORDER OF A FORWA
RDED VOICE MAILBOX','8', 'ONE OFF CHARGE','9', 'ONE OFF CHARGE','T', 'Intl. roam
ing call','V', 'Voicemail call','R', 'Roaming received call',CHAR_FLD4_V) AS CHA
RGE_TYPE, CHAR_FLD8_V AS CALL_DURATION FROM GSM_PREPAID_CDRS A W
HERE SERV_ACCOUNT_LINK_CODE_N =:1 AND CALL_DATE_DT >=:2 AND CALL_DATE_DT <=:3 A
ND SERVICE_TYPE_N = 1 ORDER BY CALL_DATE_DT DESC

----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 525 (100)| | |
| 1 | FILTER | | | | | | |
| 2 | PARTITION RANGE ITERATOR | | 1 | 104 | 525 (1)| KEY | KEY |
| 3 | SORT ORDER BY | | 1 | 104 | 525 (1)| | |
| 4 | TABLE ACCESS BY LOCAL INDEX ROWID| GSM_PREPAID_CDRS | 1 | 104 | 524 (0)| KEY | KEY |
| 5 | INDEX RANGE SCAN | GSM_PREPAID_CDRS_6 | 1 | | 523 (0)| KEY | KEY |
----------------------------------------------------------------------------------------------------------------

BIND VALUE_STRING DATATYPE_STRING LAST_CAPTURED
---------------- ------------------------------------------------------------ --------------- ------------------
(child 3) B1: 540 VARCHAR2(32) 06-May-10 10:35:24
(child 3) B2: 01-SEP-09 VARCHAR2(32) 06-May-10 10:35:24
(child 3) B3: 25-SEP-09 VARCHAR2(32) 06-May-10 10:35:24

This is superfast as it uses the correct index.

1) Is oracle using global stats for this or partition stats?
2) If only global stats are used as we think they are, do we still need to get partition stats as well? Both take about 6 hours for dbms_stats.
3) New SERV_ACCOUNT_LINK_CODE_N are generated everyday. Do we need to generate stats for them everyday?
4) Any other options available to us as 6 hours is too long and this query is running hundreds of times 24/7.

GSM_PREPAID_CDRS is a partitioned table on call_date, partition added daily with about 800000 rows per partition.
We keep 6 months of partitions online.

Regards,
ARU

May 06, 2010 - 2:25 pm UTC

1) it will use local statistics IF and ONLY IF it can be sure that exactly one and only one partition will be accessed.

Otherwise it will use global statistics.

so, knowing that, and given you know your tables and we do not - you can answer that question :)

2) you never "need" to get local statistics if you have global, we'll infer them (vice verse is true as well).

3) I don't know what a serv_account_line_code_n is, how it is used, what it's values are and whether the amount of data added during the course of the day would have any material impact on query plans - do you?

4) why do you gather global stats? why not just local stats on partitions that have changed a lot? are you using gather stale just to gather when we think it has changed a lot? are you gather statistics such as histograms that take a long time to compute - and if so - did you really mean to?

why not just gather stats on newly added partitions and not on old ones (making assumption that old partitions would be rarely modified that might not be true - but I have no idea)

You see, we have not nearly anywhere near the type of information one would need to look at this...

test case

Chris, May 07, 2010 - 4:52 pm UTC

it will use local statistics IF and ONLY IF it can be sure that exactly one and only one partition will be accessed.

Some 10053 event tracing I've done seems to suggest the optimizer may use local statistics if more than one partition is accessed, and global stats are not present. This test was performed on 11.2.0.1. Consider:

drop table t purge; 

create table t (x int) partition by range(x) (
 partition part_0 values less than (2000),
 partition part_1 values less than (15000),
 partition part_2 values less than (30000)
)
/

insert into t select level from dual connect by level <= 1000;
insert into t select level + 2000 from dual connect by level <= 10000;
insert into t select level + 15000 from dual connect by level <= 10000;


exec dbms_stats.gather_table_stats(user, 'T', 'PART_1', granularity => 'partition', estimate_percent => null);

exec dbms_stats.gather_table_stats(user, 'T', 'PART_2', granularity => 'partition', estimate_percent => null);



alter session set events '10053 trace name context forever, level 1';


select count(*) from t where x > 2000;

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T  Partition [1]
    #Rows: 10000  # Blks:  20  AvgRowLen:  4.00
  Partition [2]
    #Rows: 10000  # Blks:  20  AvgRowLen:  5.00
    #Rows: 20000  # Blks:  40  AvgRowLen:  4.00
Access path analysis for T
***************************************

Looks like each partition's stats were analyzed separately (i.e. local stats used).

select count(*) from t where x < 2000;

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T  Partition [0]  (NOT ANALYZED)
    #Rows: 82  # Blks:  1  AvgRowLen:  100.00
  (NOT ANALYZED)
    #Rows: 82  # Blks:  1  AvgRowLen:  100.00
Access path analysis for T
***************************************

This is just to show that the 10053 trace output looks similar for a single partition access. Of course in this particular case, there are no stats for part_0. Now let's gather global and local stats:

exec dbms_stats.gather_table_stats(user, 'T', estimate_percent => null, no_invalidate => false);


select count(*) from t where x > 2000;

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table:  T  Alias:  T  (Using composite stats)
  (making adjustments for partition skews)
  ORIGINAL VALUES::    #Rows: 21000  # Blks:  45  AvgRowLen:  5.00
  PARTITIONS::
  PRUNED: 2
  ANALYZED: 2  UNANALYZED: 0
    #Rows: 21000  # Blks:  40  AvgRowLen:  5.00
Access path analysis for T
***************************************

The first and last queries above are the same, except that the first one has only local stats, whereas the last one has both local and global. Notice the lack of this line:

  Table:  T  Alias:  T  (Using composite stats)

from the first 10053 trace. Also, the rowcount estimate from the first trace is spot on: 20000. Whereas the last trace (using global stats) is off by 1000 (the exact number of rows in part_0).

If my analysis of the 10053 traces is correct, it seems that in the absence of global stats, local stats from multiple partitions may be used in some cases.

May 08, 2010 - 7:37 am UTC

Ok, true, let me rewrite:

In the presence of both global and local statistics, it will use local statistics IF and ONLY IF it can be sure that exactly one and only one partition will be accessed.

Otherwise, it will try to infer statistics from any existing statistics (we can guess as global from local in most/many cases) or use dynamic sampling to get them.

Confusion galore

aru, May 09, 2010 - 5:35 pm UTC

Hi Tom,
Thanks for all your help.
Just adding to the above question of mine.
1) Right! so using bind variables = using global stats.
Does bind peeking on hard parse come into play here?
2) 'you never "need" to get local statistics if you have global'. I don't really understand this. Aren't the partition stats and global stats have different information for the CBO?
3) serv_account_line_code_n is the column that I think is skewed. It links new mobile account numbers to the calls they make. We can have more than 1000 new numbers per day. There are 240000 distinct values and the occurrence range from 1 to 5000.
4) "Why do you gather global stats?" - I have never read about the benefits or pos's/con's of using global or partition stats.
I gather global stats because, I think the query will be using global stats because of the "key key" in the plan.
What would be the benefit of gathering local stats in this case? We gather histograms on the 'serv_account_line_code_n' column as somehow the CBO goes for the right index when we do so.

Thanks for all your patience Tom,
Please can you tell me the semantics between global stats and local/partitioned stats? how do they relate to each other and does gathering local/partitioned stats, make the global stats uptodate as well( if we collect it)?

Thanks again,
Regards,
Aru.

May 10, 2010 - 6:53 pm UTC

1) no, binds do not equate to global stats.

select * from t where partition_column = ?

assuming that T is partitioned on the column partition_column only - that can and will use bind peeking at hard parse time and would use the local stats on the partition being queried via that bind variable (assuming the local stats existed)

2) they have different information - yes, does that mean YOU HAVE TO? No, it doesn't.

Heck - you don't technically need *any* statistics.

Put the words equivalent to "always" or "never" into a sentence and that sentence will always be wrong, it will never be right (hahaha - parse THAT statement will you :) )

3) ok?

4) consider (remember dbms_xplan.display_cursor is new in 10g and shows the plan USED (not an explain plan!!) by the last query in the session as I'm using it)

ops$tkyte%ORA11GR2> CREATE TABLE t
  2  (
  3    x   int,
  4    y   varchar2(30)
  5  )
  6  PARTITION BY RANGE (x)
  7  (
  8    PARTITION part1 VALUES LESS THAN (1),
  9    PARTITION part2 VALUES LESS THAN (2)
 10  )
 11  /

Table created.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> insert into t
  2  select 0, object_name from all_objects
  3  union all
  4  select 1, username from all_users;

72302 rows created.

<b>lots of data in part1, little data in part2</b>

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> variable x number
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T', granularity => 'GLOBAL' );

PL/SQL procedure successfully completed.

<b>with only global stats let's see what the optimizer says:</b>

ops$tkyte%ORA11GR2> exec :x := 0;

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select count(*) from t x_is_zero_global where x = :x;

  COUNT(*)
----------
     72257

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor() );

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4a5d7x6vgpy23, child number 0
-------------------------------------
select count(*) from t x_is_zero_global where x = :x

Plan hash value: 3660200434

------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |    53 (100)|          |       |       |
|   1 |  SORT AGGREGATE         |      |     1 |     3 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|      | 36151 |   105K|    53   (0)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL    | T    | 36151 |   105K|    53   (0)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("X"=:X)


20 rows selected.

<b>it guessed about 50% of the data - it ASSUMED half of the data in each partition - because it didn't have any information about the partitions, just about the table itself globally - it was far off, under by 50%</b>

ops$tkyte%ORA11GR2> exec :x := 1;

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select count(*) from t x_is_one_global where x = :x;

  COUNT(*)
----------
        45

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor() );

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  ggh83kfxgynqu, child number 0
-------------------------------------
select count(*) from t x_is_one_global where x = :x

Plan hash value: 3660200434

------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |    53 (100)|          |       |       |
|   1 |  SORT AGGREGATE         |      |     1 |     3 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|      | 36151 |   105K|    53   (0)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL    | T    | 36151 |   105K|    53   (0)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("X"=:X)


20 rows selected.

<b>and there it way over estimated... because it didn't know this partition would be almost empty

NOTICE: key key!!! Continuing:</b>

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T', granularity => 'PARTITION' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> exec :x := 0;

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select count(*) from t x_is_zero_local where x = :x;

  COUNT(*)
----------
     72257

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor() );

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dtq89pdx1275y, child number 0
-------------------------------------
select count(*) from t x_is_zero_local where x = :x

Plan hash value: 3660200434

------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |   103 (100)|          |       |       |
|   1 |  SORT AGGREGATE         |      |     1 |     2 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|      | 72250 |   141K|   103   (1)| 00:00:02 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL    | T    | 72250 |   141K|   103   (1)| 00:00:02 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("X"=:X)


20 rows selected.

<b>dead on - with key key, because it used local stats</b>

ops$tkyte%ORA11GR2> exec :x := 1;

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select count(*) from t x_is_one_local where x = :x;

  COUNT(*)
----------
        45

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor() );

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2fymh5mbhsy3u, child number 0
-------------------------------------
select count(*) from t x_is_one_local where x = :x

Plan hash value: 3660200434

------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |       |       |     3 (100)|          |       |       |
|   1 |  SORT AGGREGATE         |      |     1 |     3 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|      |    45 |   135 |     3   (0)| 00:00:01 |   KEY |   KEY |
|*  3 |    TABLE ACCESS FULL    | T    |    45 |   135 |     3   (0)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("X"=:X)


20 rows selected.

<b>ditto</b>

so, does that mean I always want local and global - no, not necessarily - the above demonstrates a 'feature' and a 'flaw'.

The feature is "better estimated cardinality"

The flaw is "bind peeking - what if I parse a query with x=0, but a mostly run it with x=1" in the above? Think about that (although in 11g there is adaptive cursor sharing - search for that, examples on this site and many others - that helps offset the issue of bind peeking in that case.

Strange CBO choice

Manuel Vidigal, September 16, 2010 - 7:18 am UTC

Hi Tom,

I'm runing some tests to check weather processed_flag should have NULL for already processed values or not. I thought that having a smaller index on the processed_flag column should be better. Tests were done using my laptop with Oracle 11.2.0.1.0.

Check the following example:

CREATE TABLE processed_flag_not_null AS
SELECT LEVEL id,
       lpad('A', 10, 'A') VALUE,
       CASE
         WHEN MOD(LEVEL, 100) = 0 THEN
          0
         ELSE
          1
       END processed_flag
  FROM dual
CONNECT BY LEVEL <= 1000000;
CREATE INDEX IDX_PROCESSED_FLAG_NOT_NULL ON PROCESSED_FLAG_NOT_NULL (PROCESSED_FLAG);

CREATE TABLE processed_flag_null AS
SELECT LEVEL id,
       lpad('A', 10, 'A') VALUE,
       CASE
         WHEN MOD(LEVEL, 100) = 0 THEN
          0
       END processed_flag
  FROM dual
CONNECT BY LEVEL <= 1000000;
CREATE INDEX IDX_PROCESSED_FLAG_NULL ON PROCESSED_FLAG_NULL (PROCESSED_FLAG);

BEGIN
  dbms_stats.gather_table_stats(USER, 'PROCESSED_FLAG_NOT_NULL', cascade => TRUE);
  dbms_stats.gather_table_stats(USER, 'PROCESSED_FLAG_NULL', cascade => TRUE);
END;

The first strange thing I've found was in all_tab_histograms:

SELECT *
  FROM all_tab_histograms
 WHERE table_name IN ('PROCESSED_FLAG_NOT_NULL', 'PROCESSED_FLAG_NULL')
   AND column_name = 'PROCESSED_FLAG'
 ORDER BY 2,4;

TABLE_NAME         COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
PROCESSED_FLAG_NOT_NULL PROCESSED_FLAG 0 0
PROCESSED_FLAG_NOT_NULL PROCESSED_FLAG 1 1
PROCESSED_FLAG_NULL PROCESSED_FLAG 0 0
PROCESSED_FLAG_NULL PROCESSED_FLAG 1 0


Only after running queries against both tables with ("where processed_flag = 0") and gathering table stats again I get the following histograms:


TABLE_NAME                 COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
PROCESSED_FLAG_NOT_NULL PROCESSED_FLAG 59                 0
PROCESSED_FLAG_NOT_NULL PROCESSED_FLAG 5598                 1
PROCESSED_FLAG_NULL PROCESSED_FLAG 10000                 0

Can you explain to me this behaviour?

Now for the main question, I don't understand why the CBO chooses an Index Range Scan for the PROCESSED_FLAG_NOT_NULL table, and a Full Table Scan for the PROCESSED_FLAG_NULL:

SQL> set timing on
SQL> set autotrace traceonly
SQL> SELECT *
  2    FROM processed_flag_not_null
  3   WHERE processed_flag = 0;

10000 rows selected.

Elapsed: 00:00:00.20

Execution Plan
----------------------------------------------------------
Plan hash value: 3652560023

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                             | 10539 |   195K|    93   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PROCESSED_FLAG_NOT_NULL     | 10539 |   195K|    93   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | IDX_PROCESSED_FLAG_NOT_NULL | 10539 |  |    23   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PROCESSED_FLAG"=0)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       4444  consistent gets
          0  physical reads
          0  redo size
     306954  bytes sent via SQL*Net to client
       7745  bytes received via SQL*Net from client
        668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

SQL> SELECT *
  2    FROM processed_flag_null
  3   WHERE processed_flag = 0;

10000 rows selected.

Elapsed: 00:00:00.22

Execution Plan
----------------------------------------------------------
Plan hash value: 1150676937
-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     | 10000 |   166K|   802   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| PROCESSED_FLAG_NULL | 10000 |   166K|   802   (2)| 00:00:10 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROCESSED_FLAG"=0)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3571  consistent gets
          0  physical reads
          0  redo size
     174974  bytes sent via SQL*Net to client
       7745  bytes received via SQL*Net from client
        668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed


If I compare both aproaches using Runstats:

BEGIN
  runstats_pkg.rs_start;
  BEGIN
    FOR i IN 1 .. 1000
    LOOP
      FOR x IN (SELECT *
                  FROM processed_flag_not_null
                 WHERE processed_flag = 0)
      LOOP
        NULL;
      END LOOP;
    END LOOP;
  END;
  runstats_pkg.rs_middle;
  BEGIN
    FOR i IN 1 .. 1000
    LOOP
      FOR x IN (SELECT *
                  FROM processed_flag_null
                 WHERE processed_flag = 0)
      LOOP
        NULL;
      END LOOP;
    END LOOP;
  END;
  runstats_pkg.rs_stop;
END;
-- Output
Run1 ran in 4295 hsecs
Run2 ran in 5123 hsecs
run 1 ran in 83.84% of the time

If I compare the processed_flag_null without hints versus processed_flag_null with index hint this are the results:

BEGIN
  runstats_pkg.rs_start;
  BEGIN
    FOR i IN 1 .. 1000
    LOOP
      FOR x IN (SELECT *
                  FROM processed_flag_null
                 WHERE processed_flag = 0)
      LOOP
        NULL;
      END LOOP;
    END LOOP;
  END;
  runstats_pkg.rs_middle;
  BEGIN
    FOR i IN 1 .. 1000
    LOOP
      FOR x IN (SELECT /*+ index(processed_flag_null IDX_PROCESSED_FLAG_NULL) */
                       *
                  FROM processed_flag_null
                 WHERE processed_flag = 0)
      LOOP
        NULL;
      END LOOP;
    END LOOP;
  END;
  runstats_pkg.rs_stop;
END;

-- Output
Run1 ran in 5017 hsecs
Run2 ran in 2212 hsecs
run 1 ran in 226.81% of the time


As I expected using the hint is more than twices fast of not using the index, why doesn't the CBO choose the index aproach? Can I tune the stats to give the CBO more information?

Thanks in advance,
Manuel Vidigal

September 16, 2010 - 8:11 am UTC

as for your first strange thing - that histograms are not gathered until you query the data, see

http://www.oracle.com/technetwork/issue-archive/2006/06-may/o36asktom-099107.html
"why did my plans change"

it is not strange but rather is expected.

... Now for the main question, I don't understand why the CBO chooses an Index
Range Scan for the PROCESSED_FLAG_NOT_NULL table, and a Full Table Scan for the
PROCESSED_FLAG_NULL:
....

it'll be the clustering factor. In the index with the 0's and 1's, the clustering factor of the index will be very near the number of blocks (because so many 1's are near each other, the index believes the range scan using that index will be very efficient since it will be retrieving data that is right next to each other on the blocks). The index with just 0's will have a poor clustering factor - since no block in the table will have two rows with 0's on it due to the way you loaded the data.

see
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:8764517459743#737753700346729628
for an example

Which brings me to an interesting point on your example. Since in real life, "unprocessed" records would tend to be 'at the bottom of the table' since they are inserted and we tend to insert into a table at the bottom the 0's would be clustered together well (unless the table under goes a lot of deletes - but then the deletes would be of old, contiguous data so the newly inserted data would tend to be clumped together as well).

So, perhaps your test is not real life - you insert 99 1's and then 1 0, 99 1's and then 1 0 and so on. I think you should insert lots of 1's and then some 0's and the commit and test - that would mimic real life better.

ops$tkyte%ORA11GR2> /*
ops$tkyte%ORA11GR2> drop TABLE processed_flag_null;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> CREATE TABLE processed_flag_null AS
ops$tkyte%ORA11GR2> SELECT rownum id,
ops$tkyte%ORA11GR2>        lpad('A', 10, 'A') VALUE,
ops$tkyte%ORA11GR2>        CASE
ops$tkyte%ORA11GR2>          WHEN MOD(rownum, 100) = 0 THEN
ops$tkyte%ORA11GR2>           0
ops$tkyte%ORA11GR2>        END processed_flag_not_clustered,
ops$tkyte%ORA11GR2>        case when rownum >= 1000000 - 0.01 * 1000000 then
ops$tkyte%ORA11GR2>           0
ops$tkyte%ORA11GR2>        end processed_flag_clustered
ops$tkyte%ORA11GR2>   FROM big_table.big_table;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> CREATE INDEX IDX_PROCESSED_FLAG_nc_idx ON
ops$tkyte%ORA11GR2> PROCESSED_FLAG_NULL (PROCESSED_FLAG_not_clustered);
ops$tkyte%ORA11GR2> CREATE INDEX IDX_PROCESSED_FLAG_c_idx ON
ops$tkyte%ORA11GR2> PROCESSED_FLAG_NULL (PROCESSED_FLAG_clustered);
ops$tkyte%ORA11GR2> */
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'PROCESSED_FLAG_NULL' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select i.index_name, i.clustering_factor, t.num_rows, t.blocks
  2    from user_indexes i, user_tables t
  3   where i.index_name like 'IDX_PROCESSED_FLAG%'
  4     and t.table_name = 'PROCESSED_FLAG_NULL';

INDEX_NAME                     CLUSTERING_FACTOR   NUM_ROWS     BLOCKS
------------------------------ ----------------- ---------- ----------
IDX_PROCESSED_FLAG_C_IDX                      34    1000000       2966
IDX_PROCESSED_FLAG_NC_IDX                   2910    1000000       2966

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from processed_flag_null where processed_flag_not_clustered = 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 1150676937

-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     | 10000 |   175K|   813   (1)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| PROCESSED_FLAG_NULL | 10000 |   175K|   813   (1)| 00:00:10 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROCESSED_FLAG_NOT_CLUSTERED"=0)

ops$tkyte%ORA11GR2> select * from processed_flag_null where processed_flag_clustered = 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 1133246066

--------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                          | 10000 |   175K|    53   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PROCESSED_FLAG_NULL      | 10000 |   175K|    53   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_PROCESSED_FLAG_C_IDX | 10001 |       |    19   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PROCESSED_FLAG_CLUSTERED"=0)

ops$tkyte%ORA11GR2> set autotrace off

Manuel Vidigal, September 16, 2010 - 10:04 am UTC

Thanks for the excelent explanation. It was driving me crazy :).

If instead of the processed flag table, I have a contracts table with an active flag, which is active ("1") only for the last row of each contract. In this case we can have a poor clustered table. In this case would you recomend do populate all the inactive contracts with ("0")? What do you feel is the best aproach for his case scenario?

Kind Regards

September 16, 2010 - 10:18 am UTC

If you want the index to be used, first_rows(25) hint it, optimize for initial response time...

Thanks

Manuel Vidigal, September 16, 2010 - 11:40 am UTC

Not using Index ?

Bixapathi, September 22, 2010 - 9:49 am UTC

Hi Tom,

Why the Count(*) is not uing the index when it is availabe ???
It is on 11gR2 


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from temp;

   DEPT_NO DEPT_NAME
---------- ---------------
         1 IT
         2 Accounts
         3 Business
         4 Sales
         5 Pre-sales
         6 Customercare
         7
         8

---NO INDEX AT ALL - doing FULL TABLE SCAN - its OK 

SQL> select count(*) from temp;

  COUNT(*)
----------
         8

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3785724760

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEMP |     8 |     4   (0)| 00:00:01 |
-------------------------------------------------------------------


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        335  bytes sent via SQL*Net to client
        337  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed



WITH INDEX NOW :

SQL> create index dept_idx on temp(dept_no);

Index created.

Elapsed: 00:00:00.03
SQL> execute DBMS_STATS.gather_table_stats(ownname=>'APEXTEST',tabname=>'TEMP');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.10
SQL> select count(*) from temp;  ------------ Still DOING FULL TABLE SCAN ?????

  COUNT(*)
----------
         8

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3785724760

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEMP |     8 |     4   (0)| 00:00:01 |
-------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        335  bytes sent via SQL*Net to client
        337  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(dept_no) from temp;---------- HERE it is OK 

COUNT(DEPT_NO)
--------------
             8

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1347723045

-----------------------------------------------------------------------------
| Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |          |     1 |     3 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |          |     1 |     3 |            |          |
|   2 |   INDEX FULL SCAN| DEPT_IDX |     8 |    24 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        341  bytes sent via SQL*Net to client
        337  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

September 23, 2010 - 10:40 am UTC

dept_no must be NULLable. I am guessing of course - because you don't tell us :(

alter table temp modify dept_no not null;

try again. Entire null keys are not made in b*tree indexes.... If we know that at least one attribute of a b*tree key is NOT NULL - we know that every row in the table appears in the index. if every attribute in a b*tree key is nullable - we DON'T know that.

sorry -

Bixapathi, September 22, 2010 - 10:16 am UTC

Sorry Tom , I did not observe that it was created as 'non-unique' DEPT_IDX - NONUNIQUE
that might be the reason why Oracle did not look at this
I have made it 'unique and not null' , now Oracle is taking the index into consideration to count the no of rows .

September 23, 2010 - 10:41 am UTC

it was not the unique bit

it was the NOT NULL bit that makes the difference...

not using index

Duke Ganote, December 04, 2010 - 12:51 pm UTC

First query is fast enough:

select count(*) from CVG where period = date '2009-09-01' and LB = 'XX';

            COUNT(*)
--------------------
              39,199

Elapsed: 00:00:00.54

Execution Plan
----------------------------------------------------------
Plan hash value: 1403108430

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time  | Pstart   | Pstop |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |     1 |    11 |     2   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE               |            |     1 |    11 |            |          |       |       |
|   2 |   PARTITION LIST SINGLE       |            | 39199 |   421K|     2   (0)| 00:00:01 |   142 |   142 |
|   3 |    BITMAP CONVERSION COUNT    |            | 39199 |   421K|     2   (0)| 00:00:01 |       |       |
|   4 |     BITMAP AND                |            |       |       |            |          |       |       |
|*  5 |      BITMAP INDEX SINGLE VALUE| CVG_LB_BIT |       |       |            |          |   142 |   142 |
|*  6 |      BITMAP INDEX SINGLE VALUE| CVG_P_BIT  |       |       |            |          |   142 |   142 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("LB"='XX')
   6 - access("P"=TO_DATE(' 2009-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The second query is simpler and "self evident": how many distinct LB exist when it's specified in the predicate. But the optimizer chooses a full partition scan?

select count(distinct LB) from CVG where P = date '2009-09-01' and LB = 'XX';

COUNT(DISTINCTLB)
-----------------
                1

Elapsed: 00:00:03.23

Execution Plan
----------------------------------------------------------
Plan hash value: 2537488476

---------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |     1 |    11 |  1787   (5)| 00:00:26 |       |       |
|   1 |  SORT GROUP BY         |          |     1 |    11 |            |          |       |       |
|   2 |   PARTITION LIST SINGLE|          | 39199 |   421K|  1787   (5)| 00:00:26 |   142 |   142 |
|*  3 |    TABLE ACCESS FULL   | CVG | 39199 |   421K|  1787   (5)| 00:00:26 |   142 |   142 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("LB"='XX')

Index is not used for a large table

vishwanath, February 03, 2011 - 6:01 am UTC

Hi Tom,

i have a large table contains 2377758 rows and have indexes on 2 columns(total are four columns in the table).but when i have tried to see the usage then found indexes are not using at all that leads to slowness of the database.

In-order to see the usage i have fired the below statement:

1.)alter index <index name> monitoring usage
2.)select index_name,used from v$object_usage where index_name='<>'

--output--

INDEX_NAME USE
------------------------------ ---
D199_IDX1 NO

here is my query...

SELECT OBJECTID, T3STATUS, T3TYPE, V__199.st_SHAPE$, V__199.st_len$, V__199
.st_points,V__199.st_numpts,V__199.st_entity,V__199.st_minx,V__199.st_miny,V__19
9.st_maxx,V__199.st_maxy,V__199.st_area$,V__199.st_len$,V__199.st_rowid FROM (
SELECT b.OBJECTID,b.T3STATUS,b.T3TYPE,1 st_SHAPE$ ,b.SHAPE.points as st_points,b
.SHAPE.numpts as st_numpts,b.SHAPE.entity as st_entity,b.SHAPE.minx as st_minx,b
.SHAPE.miny as st_miny,b.SHAPE.maxx as st_maxx,b.SHAPE.maxy as st_maxy,b.SHAPE.a
rea as st_area$,b.SHAPE.len as st_len$,b.rowid as st_rowid FROM NWN.fLink_Featur
es b WHERE SDE.ST_EnvIntersects(b.SHAPE,:1,:2,:3,:4) = 1 AND b.OBJECTID NOT IN
(SELECT /*+ HASH_AJ */ SDE_DELETES_ROW_ID FROM NWN.D199 WHERE DELETED_AT IN (SEL
ECT l.lineage_id FROM SDE.state_lineages l WHERE l.lineage_name = :lineage_name
1 AND l.lineage_id <= :state_id1) AND SDE_STATE_ID = 0) UNION ALL SELECT a.OBJEC
TID,a.T3STATUS,a.T3TYPE,2 st_SHAPE$ ,a.SHAPE.points as st_points,a.SHAPE.numpts
as st_numpts,a.SHAPE.entity as st_entity

Can u see this please its really urgent..

Best Regards
Vishwanath

February 03, 2011 - 3:31 pm UTC

how many rows does the query that you think should use the index return?
what is the clustering factor of that index (query user_indexes)?
how many blocks are in the table (query user_tables)?

U doesn't work here by the way. I'll take a look at it but only if you ask me and NOT ask U in the future.

index

A reader, February 03, 2011 - 8:32 pm UTC

index not in use

vishwanath, February 03, 2011 - 10:28 pm UTC

Thanx very much for your immediate reply

Here is the input you require from my side...
Q:what is the clustering factor of that index (query user_indexes)?
Ans:clustering factor--->0
Q:how many blocks are in the table (query user_tables)?
Ans:Blocks-------->5410

February 04, 2011 - 9:11 am UTC

you did not answer all of my questions.

how many rows does the query that you think should use the index return?

if your clustering factor is zero, you are invalidate/stale statistics. You'll never get the right plan that way.

The clustering factor will be a number between the number of blocks in the table and the number of rows in the table in general, not zero.

Index usage

Rakesh, August 24, 2011 - 6:31 am UTC

SQL> EXPLAIN PLAN FOR SELECT    ST.CELL_NUMBER FROM ABCDPROD.ABCD_INVT_MST_CELLNO_CAT_STAG ST, 
ABCDPROD.ABCD_INVT_MST_ITEM_MSISDN_DTL MD, CP
ST_PARAM P WHERE ST.CELL_NUMBER = MD.MSISDN_NUMBER AND MD.MSISDN_STATUS_ID = P.PARAM_ID AND 
MD.CIRCLE_ID = 11 AND MD.IS_ACTIVE = 'Y' AND ST.
D ST.IS_INCLUDED_FLAG = 'X' AND ( UPPER(P.SHORT_DESC) = 'AVAILABLE' OR UPPER(P.SHORT_DESC) = 
'UNAVAILABLE' OR UPPER(P.SHORT_DESC) = 'IDLE' )

Explained.

SQL> @?/rdbms/admin/utlxpls.sql

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
----------------------------------------
----------------------------------------------------------------------------------------------------
----------------------------------------
Plan hash value: 3537718687

----------------------------------------------------------------------------------------------------
----------------------------------
| Id  | Operation                            | Name                          | Rows  | Bytes | Cost 
(%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
----------------------------------
|   0 | SELECT STATEMENT                     |                               |  1437 | 60354 |  
5077   (1)| 00:00:38 |       |       |
|*  1 |  HASH JOIN                           |                               |  1437 | 60354 |  
5077   (1)| 00:00:38 |       |       |
|*  2 |   TABLE ACCESS FULL                  | ABCD_CMN_MST_PARAM            |     4 |    56 |     
2   (0)| 00:00:01 |       |       |
|   3 |   NESTED LOOPS                       |                               |  4846 |   132K|  
5074   (1)| 00:00:38 |       |       |
|*  4 |    INDEX FAST FULL SCAN              | IDX_STG_CELL                  |  4846 | 58152 |    
96   (3)| 00:00:01 |       |       |
|*  5 |    TABLE ACCESS BY GLOBAL INDEX ROWID| ABCD_INVT_MST_ITEM_MSISDN_DTL |     1 |    16 |     
2   (0)| 00:00:01 |    11 |    11 |
|*  6 |     INDEX UNIQUE SCAN                | UK_C0012643_UK                |     1 |       |     
1   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------
----------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("MD"."MSISDN_STATUS_ID"="P"."PARAM_ID")
   2 - filter(UPPER("P"."SHORT_DESC")='AVAILABLE' OR UPPER("P"."SHORT_DESC")='IDLE' OR 
UPPER("P"."SHORT_DESC")='UNAVAILABLE')
   4 - filter("ST"."CIRCLE_ID"=11 AND "ST"."IS_INCLUDED_FLAG"='X')
   5 - filter("MD"."IS_ACTIVE"='Y' AND "MD"."CIRCLE_ID"=11)
   6 - access("ST"."CELL_NUMBER"="MD"."MSISDN_NUMBER")

22 rows selected.

SQL> select clustering_factor,last_analyzed from dba_indexes where index_name='IDX_STG_CELL';

CLUSTERING_FACTOR LAST_ANAL
----------------- ---------
            16569 05-AUG-11


SQL>  select table_name,last_analyzed,blocks  from dba_tables where 
table_name='ABCD_INVT_MST_CELLNO_CAT_STAG';

TABLE_NAME                     LAST_ANAL     BLOCKS
------------------------------ --------- ----------
ABCD_INVT_MST_CELLNO_CAT_STAG  05-AUG-11       4024


1 row selected.

SQL> select blocks from dba_Segments where segment_name='ABCD_INVT_MST_CELLNO_CAT_STAG';


    BLOCKS
----------
      4096


SQL> SELECT COLUMN_NAME,COLUMN_POSITION FROM DBA_IND_COLUMNS WHERE INDEX_NAME='IDX_STG_CELL';

COLUMN_NAME                    COLUMN_POSITION
------------------------------ ---------------
IS_INCLUDED_FLAG                             3
CIRCLE_ID                                    2
CELL_NUMBER                                  1



Tom,

If optimizer goes for an index scan,it should switch 16569 blocks, But if it choses fullscan it 
should scan just 4096 blocks. Why is the optimizer choosing index scan instead of full scan here.

August 30, 2011 - 3:47 pm UTC

I do not see the table ABCD_INVT_MST_CELLNO_CAT_STAG referenced in the plan.

I do not see a complete example, I am very confused by what I do see.

I don't know what table IDX_STG_CELL is defined on.

Query taking a different index

Parikshit Paul, March 24, 2014 - 4:26 pm UTC

Hi Tom,

Reently, few queries in one of our production databases got very slow (around 30 times slower). On running a trace on the queries and tkprofing it, I got this:

SQL ID: c3p02d11m83rx Plan Hash: 3031845093

SELECT *
FROM
 "HRP1001" WHERE "MANDT"=:A0 AND "OBJID"=:A1 AND "OTYPE"=:A2 AND "SCLAS"=:A3
  AND "RSIGN"=:A4 AND "RELAT"=:A5 AND "BEGDA"<=:A6 AND "ENDDA">=:A7


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute   1674      0.00       0.04          0          0          0           0
Fetch     1674     58.62     103.82          0   13907245          0        1141
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3348     58.62     103.86          0   13907245          0        1141

explain plan:
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |   130 |     1   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| HRP1001   |     1 |   130 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | HRP1001~7 |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJID"=:A1 AND "BEGDA"<=:A6 AND "RSIGN"=:A4 AND "MANDT"=:A0)
   2 - access("RELAT"=:A5 AND "SCLAS"=:A3 AND "OTYPE"=:A2 AND "ENDDA">=:A7)
       filter("ENDDA">=:A7)

However, the same set of queries are taking normal time in the QA databases which has around 4 months old data.Trace from the qa database show this:

SQL ID: 82z7tf7jnfmtj Plan Hash: 3534585829

SELECT *
FROM
 "HRP1001" WHERE "MANDT"=:A0 AND "PLVAR"=:A1 AND "OTJID"=:A2 AND "SUBTY"=:A3
  AND "ISTAT"=:A4 AND "BEGDA"<=:A5 AND "ENDDA">=:A6


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute   6540      0.00       0.10          0          0          0           0
Fetch     6539      0.00       0.20         45      25888          0        6582
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13079      0.00       0.30         45      25888          0        6582
explain plan:

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |   130 |     1   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| HRP1001     |     1 |   130 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | HRP1001~Z01 |     2 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SCLAS"=:A3 AND "OTYPE"=:A2 AND "BEGDA"<=:A6 AND "ENDDA">=:A7 AND
              "RSIGN"=:A4)
   2 - access("MANDT"=:A0 AND "RELAT"=:A5 AND "OBJID"=:A1)

My question is : Why does the production db using a different index and access path. How do I investigate this difference?

Correction

Parikshit Paul, March 25, 2014 - 5:57 am UTC

Sorry posted wrong stats for the qa database
The correct one is as follows:

SQL ID: c3p02d11m83rx Plan Hash: 2745651483

SELECT *
FROM
"HRP1001" WHERE "MANDT"=:A0 AND "OBJID"=:A1 AND "OTYPE"=:A2 AND "SCLAS"=:A3
AND "RSIGN"=:A4 AND "RELAT"=:A5 AND "BEGDA"<=:A6 AND "ENDDA">=:A7

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 0 0.00 0.00 0 0 0 0
Execute 8498 0.00 0.13 0 0 0 0
Fetch 8498 0.00 1.07 130 32569 0 5772
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16996 0.00 1.21 130 32569 0 5772

explain plan:

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 130 | 1 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| HRP1001 | 1 | 130 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | HRP1001~Z01 | 2 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("SCLAS"=:A3 AND "OTYPE"=:A2 AND "BEGDA"<=:A6 AND "ENDDA">=:A7 AND
"RSIGN"=:A4)
2 - access("MANDT"=:A0 AND "RELAT"=:A5 AND "OBJID"=:A1)

Optimization of query

Namrata, April 03, 2022 - 5:26 pm UTC

I have two tables one table A having transactional data (transactions) and other table B which is small having static data.

I am joining these two tables (inner join) to get the data. but for table A its showing FULL Table scan. Table - transactions gets loaded . As table A is growing the performance of query is getting decreased .

Query is taking time .

Can you please suggest how to make sure query returns data fast (in micro seconds even though TABLE A is growing)

April 04, 2022 - 12:37 pm UTC

Have you got an index on the join columns?

How many rows is the query returning?

Optimization of query

A reader, April 04, 2022 - 2:06 pm UTC

Yes, i have index on the join columns. Query is returning around 5 rows . Once transactions start flowing this will increase.

I have created view on this query and using view (have where clause) in Java code .

April 05, 2022 - 5:21 am UTC

We need to see a test case, or at least some data.

- Show us the DDL and sizes for each table
- and your query
- and the runtime plan

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.

Not using the index

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Too much space required...

Just an example

Possible alternative

rpad

to clarify for "a reader"

Data warehousing

Actually disk is more expensive

Hmmm - not sure I'm seeing what you are

You're probably right, but...

Not much storage

The table is partitioned

Not exactly

Local and Global Index

Cool!!!

I'm afraid i've missed something or decode would be benefit ?

For Christian

Index Range Scan

Indexes

Index

Indexes

Indexes

Indexes

Indexes

Stats.

test case

Confusion galore

Strange CBO choice

Thanks

Not using Index ?

sorry -

not using index

Index is not used for a large table

index

index not in use

Index usage

Query taking a different index

Correction

Optimization of query

Optimization of query

More to Explore

Performance