Thanks for the question, Irshad.

Asked: September 11, 2000 - 12:22 pm UTC

Last updated: November 21, 2012 - 9:28 am UTC

Version: 8.0.5

Viewed 50K+ times! This question is

You Asked

Hi,

I created an index name_idx on a table temp on its column name(varchar2(20)). This column name does not have a not NULL constraint on it.

When I issue a query like

select ... from temp
where name='JAMES';

I found out ( using autotrace )that this index is not being used. I made this index to work using a hint with this query.

However if I issue

select count(*) from temp
where name='JAMES';

this index gets used.

Am I missing something. Any pointers please.

Irshad.

*******************

Tom,

I created the temp table as scott.emp. It has 16 rows and count(*) returns 1.

and Tom said...

Ok, you are using the cost based optimizer (CBO) and it is doing absolutely the most correct thing possible.

You have a very small table. It has 1 block basically. The CBO understands it is very very small. It understands that to use an index to answer the select .... from temp where name = 'JAMES', it would have to read an index and then do a table access by rowid. It also understands that it could just read the entire table in 1 IO and get the data just as easy. Here, on such a small table -- the index increases overhead, not descreases it.

When you do the count(*), it understands it does not need to access the table hence the query can be entirely answered by reading the equally tiny index -- so it does.

Using the rule based optimizer, which does not consider the size of tables, we can get the less optimal plan of "index range scan, table access by rowid".

The bottom line here is that INDEXES != FASTER STUFF in all cases. The optimizer is doing the correct thing here given the stats on the table.

Here is the example for all to see the behavior:

ops$tkyte@ORA8I.WORLD> create table emp
2 as
3 select * from scott.emp;
Table created.

ops$tkyte@ORA8I.WORLD> create index emp_idx on emp(ename);
Index created.

ops$tkyte@ORA8I.WORLD> set autotrace traceonly explain

ops$tkyte@ORA8I.WORLD> select empno, ename
2 from emp
3 where ename = 'JAMES';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
2 1 INDEX (RANGE SCAN) OF 'EMP_IDX' (NON-UNIQUE)

So, using RBO which uses rules only to develop a query plan, we use the index. Using the index is actually NOT the optimal plan given the size of this table!

ops$tkyte@ORA8I.WORLD> select count(*)
2 from emp
3 where ename = 'JAMES';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'EMP_IDX' (NON-UNIQUE)

Here is uses the smaller even index -- since it knows it need not goto the table to pick up other columns.... This is optimal

ops$tkyte@ORA8I.WORLD> analyze table emp compute statistics;
Table analyzed.

ops$tkyte@ORA8I.WORLD> select empno, ename
2 from emp
3 where ename = 'JAMES';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1
1 0 TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=1 Bytes=27)

Now it picks the BETTER plan for this query given this small amount of data.

ops$tkyte@ORA8I.WORLD> select count(*)
2 from emp
3 where ename = 'JAMES';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'EMP_IDX' (NON-UNIQUE)

and the count(*) is unaffected...

Rating

(117 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Indexes should not be used in case of CBO- in the given example

Nag, November 08, 2001 - 7:02 pm UTC

SQL> CREATE UNIQUE INDEX  DEPTNO_IDX ON DEPT(DEPTNO);

Index created.


First case:

SQL> ANALYZE TABLE EMP COMPUTE STATISTICS
SQL> ANALYZE TABLE DEPT COMPUTE STATISTICS


SQL> set autotrace on explain
SQL> SELECT ENAME,EMP.DEPTNO
  2  from emp,dept
  3  where emp.deptno=dept.deptno ;

ENAME          DEPTNO
---------- ----------
SMITH              20
ALLEN              30
WARD               30
JONES              20
MARTIN             30
BLAKE              30
CLARK              10
SCOTT              20
KING               10
TURNER             30
ADAMS              20

ENAME          DEPTNO
---------- ----------
JAMES              30
FORD               20
MILLER             10

14 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=14 Bytes=126)
   1    0   NESTED LOOPS (Cost=1 Card=14 Bytes=126)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=98)
   3    1     INDEX (UNIQUE SCAN) OF 'DEPTNO_IDX' (UNIQUE)



Second Case:

SQL> analyze table emp delete statistics;

Table analyzed.

SQL> analyze table dept delete statistics;

Table analyzed.

we have deleted the statistics so we now should switch to RBO. ( Isnt it ?)

SQL> SELECT ENAME,EMP.DEPTNO
  2  from emp,dept
  3  where emp.deptno=dept.deptno ;

ENAME          DEPTNO
---------- ----------
SMITH              20
ALLEN              30
WARD               30
JONES              20
MARTIN             30
BLAKE              30
CLARK              10
SCOTT              20
KING               10
TURNER             30
ADAMS              20

ENAME          DEPTNO
---------- ----------
JAMES              30
FORD               20
MILLER             10

14 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   NESTED LOOPS
   2    1     TABLE ACCESS (FULL) OF 'EMP'
   3    1     INDEX (UNIQUE SCAN) OF 'DEPTNO_IDX' (UNIQUE)


In the first case the indexes should not be used in the light of the fact that the tables are very small, and a full table scan will be much more beneficial. But it isnt happening.

Could you give your thoughts on why the above is not happening according to  my expectations.

thank you

November 09, 2001 - 10:08 am UTC

In this case, since you are not accessing the DEPT table really -- this query is pretty much the same as:

select ename, deptno
  from emp 
  where exists ( select null from dept where dept.deptno = emp.deptno )

and thats how its being processed.  Its using the index instead of sorting the table and merging (its using the already sorted dept index).

If you add a column from dept -- making it so the index cannot by itself be used -- the plan can and will change:

ops$tkyte@ORA717DEV.US.ORACLE.COM> create table emp as select * From scott.emp;

Table created.

ops$tkyte@ORA717DEV.US.ORACLE.COM> create table dept as select * From scott.dept;

Table created.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> alter table dept add constraint dept_pk primary key(deptno);

Table altered.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table emp compute statistics;

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> analyze table dept compute statistics;

Table analyzed.

ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> set autotrace traceonly explain
ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> SELECT ENAME,EMP.DEPTNO
  2    from emp,dept
  3   where emp.deptno=dept.deptno ;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=14 Bytes=126)
   1    0   NESTED LOOPS (Cost=1 Card=14 Bytes=126)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=98)
   3    1     INDEX (UNIQUE SCAN) OF 'DEPT_PK' (UNIQUE)



ops$tkyte@ORA717DEV.US.ORACLE.COM> 
ops$tkyte@ORA717DEV.US.ORACLE.COM> SELECT ENAME,EMP.DEPTNO, dept.dname
  2    from emp,dept
  3   where emp.deptno=dept.deptno ;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=14 Bytes=252)
   1    0   NESTED LOOPS (Cost=5 Card=14 Bytes=252)
   2    1     TABLE ACCESS (FULL) OF 'DEPT' (Cost=1 Card=4 Bytes=44)
   3    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=1 Card=14 Bytes=98)

agreed...but need your suggestions

Senthil, May 03, 2002 - 12:10 pm UTC

Tom-

I agree with your reply for the above query. but i have got something where i am facing the same performance issue. I would like to have your suggestions to redesign the select statement in such a way to perform better. have changed the tablename and column name for question purpose. I am using 8.1.7 for Solaris.

SQL> show parameter optimizer_mode

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
optimizer_mode                       string  CHOOSE
SQL> show parameter compatible

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
compatible                           string  8.1.5.0.0

=============================================================================

Query 1:

select   sum.table_name 
from     summary sum, dv dvn
where    sum.reference = :b1 
and      sum.vol_num = dvn.vol_num 
and      dvn.expired is null 
and      dvn.accepted is not null 
and      ((sum.oldest_date >= :b2 and sum.oldest_date <= :b3 )
or        (sum.latest_date >= :b2 and sum.latest_date <= :b3 )
or        (sum.oldest_date <= :b2 and sum.latest_date >= :b3 ))
group by  sum.table_name
order by max(sum.latest_date) desc,
         max(sum.table_ref) desc 


Trace details:

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=11 Card=2 Bytes=76)
   1    0   SORT (ORDER BY) (Cost=11 Card=2 Bytes=76)
   2    1     SORT (GROUP BY) (Cost=11 Card=2 Bytes=76)
   3    2       HASH JOIN (Cost=3 Card=3 Bytes=114)
   4    3         INDEX (FULL SCAN) OF 'DV_IDX1' (NON-UNIQUE) (Cost=1 Card=17 Bytes=289)
   5    3         TABLE ACCESS (FULL) OF 'SUMMARY' (Cost=1 Card=3 Bytes=63)

Statistics
----------------------------------------------------------
          0  recursive calls
          4  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        224  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          0  rows processed


Index list:

SQL> select index_name, column_name, column_position from user_ind_columns where table_name = '&tab' order by 1,3;
Enter value for tab: SUMMARY

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
SUMMARY_IDX1                   REFERENCE                                    1
SUMMARY_IDX2                   OLDEST_DATE                                  1
SUMMARY_IDX3                   LATEST_DATE                                  1
PK_SUMMARY                     VOL_NUM                                      1

SQL> /           
Enter value for tab: DV

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
DV_IDX1                        VOL_NUM                                      1
DV_IDX1                        EXPIRED                                      2
DV_IDX1                        ACCEPTED                                     3
PK_DV                          VOL_NUM                                      1


SQL> select count(*) from summary;

  COUNT(*)
----------
        23

SQL> select count(*) from dv;

  COUNT(*)
----------
        23
        
>>The Query1 is not using the index even though its analyzed and used properly in the query.       

=============================================================================

Query 2:

select   ti.ji_id,
         ti.seq,
         ti.status,
         ti.state_value,
         ti.status_date_time 
from     ji,
         ti 
where    ji.job_id = :b1 
and      ji.id = ti.ji_id 
and      ti.jtt_id = :b2
order by ji.schedule_date_time desc,
         ti.seq desc 


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=6 Bytes=570)
   1    0   SORT (ORDER BY) (Cost=8 Card=6 Bytes=570)
   2    1     HASH JOIN (Cost=4 Card=6 Bytes=570)
   3    2       TABLE ACCESS (FULL) OF 'JI' (Cost=1 Card=2 Bytes=26)
   4    2       TABLE ACCESS (FULL) OF 'TI' (Cost=2 Card=75 Bytes=6150)

Statistics
----------------------------------------------------------
          0  recursive calls
          8  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
       1443  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed


SQL> select index_name, column_name, column_position from user_ind_columns where table_name = '&tab' order by 1,3;
Enter value for tab: JI

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
JI_IDX_1                       JOB_ID                                       1
PK_JI                          ID                                           1

SQL> /
Enter value for tab: TI

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
PK_TI                          JI_ID                                        1
PK_TI                          JTT_ID                                       2
PK_TI                          SEQ                                          3


SQL> select count(*) from ji;

  COUNT(*)
----------
        25

SQL> select count(*) from ti;

  COUNT(*)
----------
       264
       
>>Same as Query1, isn't using the index properly. Both the tables(ji & ti) could grow fastly.

=============================================================================

select   ji.id,
         ji.schedule_id,
         dname.name,
         j.name,
         ji.schedule_date_time,
         ji.status,
         ji.elapsed_seconds,
         ji.start_date_time,
         ji.end_date_time,
         t.name,
         jtt.task_order,
         ji.input_value
from     ji,
         jtt,
         task t,
         job j,
         dname
where    ji.status = :b1 
and      ji.schedule_date_time <= sysdate 
and      ji.job_id = j.id 
and      j.expired is null 
and      j.reference = dname.reference 
and      dname.expired is null 
and      ji.last_job_type_task_id = jtt.id (+) 
and      jtt.task_id = t.id (+) 
order by ji.schedule_date_time,
         ji.start_date_time,
         ji.id 


Trace details:

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=1 Bytes=745)
   1    0   SORT (ORDER BY) (Cost=9 Card=1 Bytes=745)
   2    1     NESTED LOOPS (OUTER) (Cost=5 Card=1 Bytes=745)
   3    2       NESTED LOOPS (OUTER) (Cost=4 Card=1 Bytes=714)
   4    3         NESTED LOOPS (Cost=3 Card=1 Bytes=675)
   5    4           NESTED LOOPS (Cost=2 Card=1 Bytes=642)
   6    5             TABLE ACCESS (FULL) OF 'JI' (Cost=1 Card=1 Bytes=589)
   7    5             TABLE ACCESS (BY INDEX ROWID) OF 'JOB' (Cost=1 Card=3 Bytes=159)
   8    7               INDEX (UNIQUE SCAN) OF 'PK_JOB' (UNIQUE)
   9    4           TABLE ACCESS (BY INDEX ROWID) OF 'DNAME' (Cost=1 Card=3 Bytes=99)
  10    9             INDEX (UNIQUE SCAN) OF 'PK_DNAME' (UNIQUE)
  11    3         TABLE ACCESS (BY INDEX ROWID) OF 'JTT' (Cost=1 Card=41 Bytes=1599)
  12   11           INDEX (UNIQUE SCAN) OF 'PK_JTT' (UNIQUE)
  13    2       TABLE ACCESS (BY INDEX ROWID) OF 'TASK' (Cost=1 Card=41 Bytes=1271)
  14   13         INDEX (UNIQUE SCAN) OF 'PK_TASK' (UNIQUE)


Statistics
----------------------------------------------------------
          0  recursive calls
          4  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        825  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed


SQL> select index_name, column_name, column_position from user_ind_columns where table_name = '&tab' order by 1,3
Enter value for tab: JI

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
JI_IDX_1                       JOB_ID                                       1
JI_IDX_2                       SCHEDULE_ID                                  1
JI_IDX_3                       LAST_JOB_TYPE_TASK_ID                        1
JI_IDX_4                       SCHEDULE_DATE_TIME                           1
JI_IDX_5                       START_DATE_TIME                              1
JI_IDX_6                       STATUS                                       1
PK_JI                          ID                                           1

7 rows selected.

SQL> /
Enter value for tab: JOB_TYPE_TASK

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
JTT_1                          TASK_ID                                      1
PK_JTT                         ID                                           1
UNQ_JTT_1                      JOB_TYPE_ID                                  1
UNQ_JTT_1                      TASK_ID                                      2
UNQ_JTT_1                      TASK_ORDER                                   3

SQL> /
Enter value for tab: TASK

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
PK_TASK                        ID                                           1
TASK_IDX_1                     TASK_TYPE_ID                                 1
UNQ_TASK_1                     NAME                                         1
UNQ_TASK_1                     DATE_EXPIRED                                 2

SQL> /
Enter value for tab: JOB

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
JOB_IDX_1                      REFERENCE                                    1
JOB_IDX_2                      JOB_TYPE_ID                                  1
PK_JOB                         ID                                           1
UNQ_JOB_1                      NAME                                         1
UNQ_JOB_1                      REFERENCE                                    2
UNQ_JOB_1                      DATE_EXPIRED                                 3

6 rows selected.

SQL> /
Enter value for tab: DNAME

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
PK_DNAME                       REFERENCE                                    1
UNQ_DNAME_1                    NAME                                         1
UNQ_DNAME_1                    DATE_EXPIRED                                 2


SQL> select count(*) from ji;  

  COUNT(*)
----------
        25

SQL> select count(*) from jtt;

  COUNT(*)
----------
        19

SQL> select count(*) from task;

  COUNT(*)
----------
        15

SQL> select count(*) from job;

  COUNT(*)
----------
         4

SQL> select count(*) from dname;

  COUNT(*)
----------
         3

>>Its different in this query, even though its using the index, due to the outer join, its taking more time to process the query. Is therey any workaround to avoid using outer join in this query or to improve further.

May 03, 2002 - 1:50 pm UTC

I see *no timings* here.

I see *very very small consistent gets*

it will not use an index unless an index would be measurably better. You seem to be running these queries against empty or almost empty tables.

remember these facts:

indexes are not always faster
indexes are sometimes slower
indexes are not always faster
indexes sometimes take longer then not using the index
indexes are not always faster

Cannot help you on this one as there is no apparent problem to be solved. On tables of this size, indexes are a waste.

This query not always use the PK index. Any ideas?

ER, July 14, 2003 - 4:32 pm UTC

Hi Tom,
I have this table ACCOUNTS which primary key is ID. The table is analyzed as well as all its indexes.
For some reason that I'm still figuring out, with different values of ID, oracle is using different plans (I'm using CBO). These are the min and max values for ID in this table:

select min(id), max(id) from accounts;
   MIN(ID)    MAX(ID)
---------- ----------
     10000    2700000

Now look at these execution plans:

SQL> Select * from (SELECT * from ACCOUNTS a where id > 10000 order by id) where rownum < 5;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=119118 Card=2413500
          Bytes=347544000)
   1    0   COUNT (STOPKEY)
   2    1     VIEW* (Cost=119118 Card=2413500 Bytes=347544000)         :Q119700
                                                                      1
   3    2       SORT* (ORDER BY STOPKEY) (Cost=119118 Card=2413500 Byt :Q119700
         es=222042000)                                                1
   4    3         PARTITION RANGE* (ALL)                               :Q119700
                                                                       1
   5    4           SORT* (ORDER BY STOPKEY)                           :Q119700
                                                                       0
   6    5             TABLE ACCESS* (FULL) OF 'ACCOUNTS' (Cost=12606 C :Q119700
          ard=2413500 Bytes=222042000)                                 0

   2 PARALLEL_TO_SERIAL            SELECT C0,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C1
                                   2,C13,C14,C15,C1 FROM (SELECT A1.C0
   3 PARALLEL_COMBINED_WITH_PARENT
   4 PARALLEL_COMBINED_WITH_PARENT
   5 PARALLEL_TO_PARALLEL          SELECT A1.C0 C0,A1.C1 C1,A1.C2 C2,A1.C3 C3,A
                                   1.C4 C4,A1.C5 C5,A1.C6 C6,A1.C7 C7,A
   6 PARALLEL_COMBINED_WITH_PARENT


But, if I query the maximun value for ID, I get this plan

SQL> c/10000/2700000
  1* Select * from (SELECT * from ACCOUNTS a where id > 2700000 order by id) where rownum < 5
SQL> /
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3223 Card=11789 Byte
          s=1697616)

   1    0   COUNT (STOPKEY)
   2    1     VIEW (Cost=3223 Card=11789 Bytes=1697616)
   3    2       TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'ACCOUNTS' (Co
          st=3223 Card=11789 Bytes=1084588)

   4    3         INDEX (RANGE SCAN) OF 'ACCOUNTS_PK' (UNIQUE) (Co
          st=642 Card=11789)


Also the timings are very different. The first query (full scan) took 6.23 seconds while the second (index range scan) took 0.06 second.
Additional information:
- The table is partitioned into 3 partitions by ID (the same column as the PK).
- The unique index of the PK is a global index.
- All the data resides in one partition (the values I'm querying exists in that partition).
- There are about 2 million rows in the table.

If I hint the first query to use the PK it works as fast as the second query. I really would like to avoid that hint here cause I guess there is something else affecting the plan.
Any advice would be very appreciated.
Thanks.

July 15, 2003 - 1:09 am UTC

you should hint the inner query to be FIRST_ROWS as that is what you really want.

FIRST_ROWS and ALL_ROWS are two "safe" hints, they tell the optimizer exactly what you want, what you mean.

this query:

SELECT * from ACCOUNTS a where id > 10000 order by id

should NOT use an index, and does not. the other one should and does

If you FIRST_ROWS hint it, both will.

Thanks for your answer but...

ER, July 15, 2003 - 11:30 am UTC

Tom, I'm sorry but I still don't understand "why" the first query should not use an index. Why CBO prefers a full scan instead of an index access in this case.
The cost for the query is greater when it is hinted, and that is what I don't understand.
What is the difference between 10000 and 27000000 from the CBO point of view?
I detected this behavior on all my partitioned tables, and I'd like to undesrtand it.
Thank you.

July 15, 2003 - 11:51 am UTC

in the first case -- it says "hmmmm, lots of rows are > 10000, full scan"

if you tell it "but wait -- I really truly interested in response time, not throughput" it will say "ok, so maybe this index is best for you.

Sorry, I got it !

ER, July 15, 2003 - 11:41 am UTC

Sorry Tom, I didn't think in depth your answer.
The first query is not using the index because CBO knows that it has to visit all the rows in the table since I'm looking for the IDs greater than the minimum ID in the table. The index is not useful in that case.
Is that rule that states "if you gonna select more that 25% of the table you should avoid the index" still valid?
Thanks again for your answer.

July 15, 2003 - 11:54 am UTC

it is not 25% -- it is much more complex then that. 25% is ROT (rule of thumb).

it is a function of the estimated cardinality, the clustering factor, the row widths, the db_file_multi_block_read_count, etc etc etc etc.

How to avoid <> condition in queries to make use of indexes?

Tony, August 12, 2003 - 2:56 am UTC

My customer table has millions of rows. I want to update the table as below:

UPDATE cust
SET column1 = 'some value'
WHERE cust_type <> '12';

This update is very slow as the index on cust_type column is not used due to <> condition. I'm forced to use <> condition as its not known what are the other values the column have and going to have in future. I come across similar scenario in many places in pro*c application as well as in PL/SQL. what should be done to make use of the index? Is there any alternative to speed up the update? I'm using 8.1.6. Please help me out.

August 12, 2003 - 8:42 am UTC

umm, tell me -- how many rows have cust_type <> 12?

full scanning a table with "millions" of rows takes just a couple of seconds/minutes.

An index would most likely make this run SLOWER....

Index FFS or Full table Scan ?

Haratsaris, August 12, 2003 - 10:55 am UTC

Tom,
I've got this situation. The optimizer mode is RULE with 8.1.7.4 

SQL> select count(*) from table_a;

 COUNT(*)
---------
     1058

SQL> set autotrace traceonly explain

SQL> select count(*) from table_a;

Execution Plan
-----------------------------------------------------
     SELECT STATEMENT Optimizer=RULE
   0   SORT (AGGREGATE)
   1     TABLE ACCESS (FULL) OF 'RA_INTERFACE_LINES_ALL'



SQL> alter session set optimizer_mode=CHOOSE;

Session altered.

SQL> select count(*) from RA_INTERFACE_LINES_ALL;

Execution Plan
-----------------------------------------------------
     SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=1)
   0   SORT (AGGREGATE)
   1     INDEX (FAST FULL SCAN) OF 'indtab_4' (NON-U
     NIQUE) (Cost=18 Card=1058)

That seems reasonable and acceptable on performance , 'cause my cardinality on index is the same as from the table. But I don´t understand two things :

- What says to optimizer that index FFS is better than Full table scan is this case ? 
- Note that there is no where clause on the statement...so  why indexing ? 

Thanks !

August 12, 2003 - 11:07 am UTC

The CBO, having in its tool box many more access methods then the RBO will ever have, saw that there was an index on the table -- and that the index was on a NOT NULL set of columns (hence for every row in the table, there must be an entry in the index itself). It also new the index consumed less blocks on disk then the table did.

Hence, it fast full scanned (read using multi-block io) the index and counted rows in the index rather then full scanning the larger table structure.

Index FFS or Full table Scan ?

Haratsaris, August 12, 2003 - 12:48 pm UTC

OK, that's alright. CBO made the job.
But, for example, my APP(Oracle E-Business 11.0.3) requires setting RULE optimizer as default. So then I replicate the database for testing it with Cost optimizer, previous analyzing everything and so on...and Explain plains becomes terrible...with unacceptable performance diagnostics.
So, is it true that even with 8i(even 9i) advanced features avaiable I'm a "prisioner" of the standard rules for old written SQL/PL-SQL code ( C "Black box" programs included )?
Thanks !

August 12, 2003 - 1:07 pm UTC

when using COTS applications that use the database, you must follow the rules laid out by the application vendor.

Set expire data to far future instead of NULL

A reader, October 06, 2003 - 9:37 pm UTC

Hi Tom,

In our system, we have many queries of the following types:

select ...
from ...
where effective_date <= p_date
and (expiry_date >= p_date or expiry_date is null)

May I ask what is the best way to boost the performance of this type of query.

Do you think it a good idea to make all NULL expiry date to a date in the far future, such as '99991231'.

Thanks a lot.

October 07, 2003 - 8:03 am UTC

ops$tkyte@ORA920> create table t1 ( effective_date date, expiry_date date, data char(40) );
 
Table created.
 
ops$tkyte@ORA920> create table t2 ( effective_date date, expiry_date date, data char(40) );
 
Table created.
 
ops$tkyte@ORA920> create table t3 as select distinct created dt from all_objects;
 
Table created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t1
  2  select created, decode( mod( rownum, 10 ), 0, to_date(null), last_ddl_time ), 'x'
  3    from all_objects;
 
30211 rows created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t2
  2  select effective_date, nvl(expiry_date,to_date('01-jan-4000')), data from t1;
 
30211 rows created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t1_idx on t1(effective_date,expiry_date);
 
Index created.
 
ops$tkyte@ORA920> create index t2_idx on t2(effective_date,expiry_date);
 
Index created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for all indexed columns',cascade=>true);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920> exec dbms_stats.gather_table_stats(user,'T2',method_opt=>'for all indexed columns',cascade=>true);
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> @trace
ops$tkyte@ORA920> alter session set events '10046 trace name context forever, level 12';
 
Session altered.


and then we run:
declare
    type array is table of char(40) index by binary_integer;
    l_data array;
begin
    for x in ( select * from t3 )
    loop
        select data bulk collect into l_data
          from t1
         where effective_date <= x.dt
           and (expiry_date >= x.dt or expiry_date is null);
        select data bulk collect into l_data
          from t2
         where effective_date <= x.dt
           and expiry_date >= x.dt;
    end loop;
end;


with sql trace and find:

SELECT data from t1
         where effective_date <= :b1
           and (expiry_date >= :b1 or expiry_date is null)
                                                                                                          
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   2123      1.01       1.20          0          0          0           0
Fetch     2123    272.33     339.27         50   21178876          0    13432142
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     4247    273.35     340.47         50   21178876          0    13432142
                                                                                                          
Rows     Row Source Operation
-------  ---------------------------------------------------
13432142  TABLE ACCESS BY INDEX ROWID T1 (cr=21178876 r=50 w=0 time=284842603 us)
36392296   INDEX RANGE SCAN T1_IDX (cr=136753 r=0 w=0 time=93162618 us)(object id 40855)
********************************************************************************
                                                                                                          
SELECT data from t2
         where effective_date <= :b1
           and expiry_date >= :b1
                                                                                                          
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   2123      0.97       1.10          0          0          0           0
Fetch     2123    154.79     188.79         44    9133953          0    13432142
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     4247    155.76     189.90         44    9133953          0    13432142
                                                                                                          
Rows     Row Source Operation
-------  ---------------------------------------------------
13432142  TABLE ACCESS BY INDEX ROWID T2 (cr=9133953 r=44 w=0 time=137071559 us)
13432142   INDEX RANGE SCAN T2_IDX (cr=140470 r=0 w=0 time=48335407 us)(object id 40856)


there is the potential for savings -- as we can scan less rows via the index with the simple "and" query -- lots less LIO's, lots less CPU.


remember -- your mileage may vary, it depends on your DATA.  You might consider copying your data into t1/t2 (your dates) as I did and then running a similar test with YOUR common inputs to see if such a change would be beneficial to you or not.

Set expire data to far future instead of NULL

A reader, October 06, 2003 - 9:41 pm UTC

P.S. - I am asking the second question because then one can write the SQL as:

where effective_date >= p_date
and expiry_date <= p_date

index behavior

Saminathan Seerangan, November 05, 2003 - 12:03 pm UTC

Dear Tom,

Could you please throw ligts on the following index behavior

(a)"name" is an unique index column
(b) table has been analyzed before run the query

1) select id from table1 where name like 'ABC%';
179 record returned
==>FULL Table scan

2) select id from table1 where name like 'AB%';
180 record returned
==>Index scan

Your help is always appreciated.

November 06, 2003 - 7:03 am UTC

lets see the autotrace traceonly explain and the EXACT command used to analyze the table

more info about the table

Saminathan Seerangan, November 06, 2003 - 8:29 am UTC

Dear Tom, as you requested...

Oracle8i Enterprise Edition Release 8.1.7.3.0 - Production
With the Partitioning and Parallel Server options
JServer Release 8.1.7.3.0 - Production

We use OPS(Oracle Parallel Server)

select count(*) from profileduser
  COUNT(*)
----------
     60820
1 row selected
 
set linesize 200
select index_name,index_type,uniqueness,BLEVEL,CLUSTERING_FACTOR,leaf_blocks 
from dba_indexes 
where owner='CRMF' 
and table_name='PROFILEDUSER'

INDEX_NAME              INDEX_TYPE    UNIQUENESS     BLEVEL CLUSTERING_FACTOR LEAF_BLOCKS
----------------------- ------------- ---------- ---------- ----------------- -----------
PROFILEDUSER_AK1        NORMAL        UNIQUE              2             59970        2204
PROFILEDUSER_AK2        NORMAL        UNIQUE              0                 0           0
PROFILEDUSER_I03        NORMAL        NONUNIQUE           1              1036          35
PROFILEDUSER_IX01       NORMAL        NONUNIQUE           1              1749          47
PROFILEDUSER_IX02       NORMAL        NONUNIQUE           1              1824          43
PROFILEDUSER_IX03       NORMAL        NONUNIQUE           1              1139          47
PROFILEDUSER_PK         NORMAL        UNIQUE              2             60807        2295
7 rows selected
 
 

 
set linesize 200
select dcc.constraint_name,constraint_type,column_name
from dba_cons_columns dcc,dba_constraints dc
where dcc.constraint_name=dc.constraint_name
and      dcc.owner=dc.owner
and dcc.owner='CRMF'
and dcc.table_name='PROFILEDUSER'

CONSTRAINT_NAME                CONSTRAINT_TYPE COLUMN_NAME                     
------------------------------ --------------- --------------------------------
PROFILEDUSER_FK01              R               ROLEPLAYERID                    
PROFILEDUSER_FK02              R               DEFAULTCONTACT                  
PROFILEDUSER_PK                P               PROFILEDUSERID                  
SYS_C007887                    C               PROFILEDUSERID                  
SYS_C007888                    C               USERID                          
SYS_C007889                    C               CREATEDATE                      
SYS_C007890                    C               UPDATEDATE                      
SYS_C007891                    C               MEMORABLE_COUNT                 
SYS_C007892                    C               LOGIN_FAIL_COUNT                
SYS_C007893                    C               TRANSIENT_LOGIN_FAIL_COUNT      
SYS_C008514                    C               PASSWORD_CONVERTED              
11 rows selected
 
 
select count(*) from profileduser
  COUNT(*)
----------
     60820
1 row selected




SQL> analyze table  crmf.profileduser compute statistics 
for table 
for all indexes 
for all indexed columns;

Table analyzed.

SQL> set autotrace on
SQL> select * from profileduser where userid like 'ABC%';
<<Actual data bas been removed for display purpose>>
1 row selected.                                                                               
                                                                                


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=681 Card=12607 Bytes          
          =1739766)                                                             
                                                                                
   1    0   TABLE ACCESS (FULL) OF 'PROFILEDUSER' (Cost=681 Card=12607          
           Bytes=1739766)                                                       
                                                                                

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          4  db block gets                                                      
       4484  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
       3141  bytes sent via SQL*Net to client                                   
        430  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> l
  1* select * from profileduser where userid like 'ABC%'
SQL> c/ABC/AB
  1* select * from profileduser where userid like 'AB%'
SQL> /
<<Actual data bas been removed for display purpose>>
1 row selected.                                                                                


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=138)           
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'PROFILEDUSER' (Cost=4 Ca          
          rd=1 Bytes=138)                                                       
                                                                                
   2    1     INDEX (RANGE SCAN) OF 'PROFILEDUSER_AK1' (UNIQUE) (Cost=          
          3 Card=1)                                                             
                                                                                

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
          5  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
       3141  bytes sent via SQL*Net to client                                   
        430  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> l
  1* select * from profileduser where userid like 'AB%'

SQL> c/AB/HCR
  1* select * from profileduser where userid like 'HCR%'
SQL> /

no rows selected


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=681 Card=12607 Bytes          
          =1739766)                                                             
                                                                                
   1    0   TABLE ACCESS (FULL) OF 'PROFILEDUSER' (Cost=681 Card=12607          
           Bytes=1739766)                                                       
                                                                                

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          4  db block gets                                                      
       4483  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
       2634  bytes sent via SQL*Net to client                                   
        319  bytes received via SQL*Net from client                             
          1  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
          0  rows processed                                                     

SQL> l
  1* select * from profileduser where userid like 'HCR%'
SQL> c/HCR/HC
  1* select * from profileduser where userid like 'HC%'
SQL> /

<<Actual data bas been removed for display purpose>>

220 rows selected.


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=2 Bytes=276)           
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'PROFILEDUSER' (Cost=5 Ca          
          rd=2 Bytes=276)                                                       
                                                                                
   2    1     INDEX (RANGE SCAN) OF 'PROFILEDUSER_AK1' (UNIQUE) (Cost=          
          3 Card=2)                                                             
                                                                                

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
        228  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
      59178  bytes sent via SQL*Net to client                                   
       1984  bytes received via SQL*Net from client                             
         16  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
        220  rows processed                                                     

SQL> l
  1* select * from profileduser where userid like 'HC%'
SQL> c/HC/HCT
  1* select * from profileduser where userid like 'HCT%'
SQL> /

<<Actual data bas been removed for display purpose>>

207 rows selected.


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=681 Card=12607 Bytes          
          =1739766)                                                             
                                                                                
   1    0   TABLE ACCESS (FULL) OF 'PROFILEDUSER' (Cost=681 Card=12607          
           Bytes=1739766)                                                       
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          4  db block gets                                                      
       4497  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
      55520  bytes sent via SQL*Net to client                                   
       1873  bytes received via SQL*Net from client                             
         15  SQL*Net roundtrips to/from client                                  
          1  sorts (memory)                                                     
          0  sorts (disk)                                                       
        207  rows processed                                                     

SQL> spool off

Sorry for the incomplete info in my previous posting

Saminathan Seerangan, November 06, 2003 - 9:33 am UTC

select di.index_name,substr(column_name,1,20),column_position,uniqueness,BLEVEL,CLUSTERING_FACTOR,leaf_blocks
from dba_indexes di,dba_ind_columns dic
where di.index_name=dic.INDEX_NAME
and di.owner='CRMF'
and di.table_name='PROFILEDUSER'

INDEX_NAME SUBSTR(COLUMN_NAME,1,20) COL_POSI UNIQUENESS BLEVEL CLUSTERING_FACTOR LEAF_BLOCKS
------------------------------ -------------------------- -------- ---------- ---------- ----------------- -----------
PROFILEDUSER_AK1 USERID 1 UNIQUE 2 59970 2204
PROFILEDUSER_AK2 LDAPCOMMONNAME 1 UNIQUE 0 0 0
PROFILEDUSER_PK PROFILEDUSERID 1 UNIQUE 2 60807 2295
PROFILEDUSER_I03 BUSINESS_COUNTRY_ID 1 NONUNIQUE 1 1036 35
PROFILEDUSER_IX01 SEARCH_FIRST_NAME 1 NONUNIQUE 1 1749 47
PROFILEDUSER_IX02 SEARCH_LAST_NAME 1 NONUNIQUE 1 1824 43
PROFILEDUSER_IX03 SEARCH_COMPANY_NAME 1 NONUNIQUE 1 1139 47
7 rows selected

Block Selectivity

A reader, December 17, 2003 - 6:44 pm UTC

Tom,

In the discussions above, in one example, you mention that 25% is a ROT and that the determination to use or not to use an index depends on lot of factors such as estimated cardinality, clustering factor, db_file_multiblock_read_count etc. My questions are :
1. What if the clustering factor is poor and the FIRST_ROWS is used. Will it still use the index?
2. Does the etc include the BLOCK SELECTIVITY and parameter settings for ..._INDEX_COST_ADJ?
3. What other factors are considered in determining that?
4. How do you identify the estimated cardinality of a query from the EXPLAIN PLAN? Which of the output info do you look at such as Card=, Bytes= etc. I am asking this because sometimes the EXPLAIN PLAN output does not reflect the correct values for these fields. e-g even if a query retrieives say 1000 rows, the Card= says something like 900. What info. do you use to get the ESTIMATED CARDINALITY for a query from the EXPLAIN PLAN/TRACE output

December 18, 2003 - 10:31 am UTC

1) sure.

2) sure.

3) lots -- sort area size, pga aggregate target, db file multi block read count, too numerous to enumerate

4) CARD *is* the estimated cardinality.

that IS the estimated number of rows. You get 1,000 -- the explain plan ESTIMATED "about 900". that is what an estimation, by definition, is. A best guess.

Unused Indexes in oracle 8.1.7

Wor, February 23, 2004 - 6:18 pm UTC

Hi Tom,
I am using oracle 8.1.7 , windows 2000
Over the time we have added alot of indexes and I guess that some of the indexes may not be getting used at all.

Is there a way in ver 8.1.7 to identify the unused indexes, so that I can get rid of them.

Wor

February 23, 2004 - 6:57 pm UTC

do you have my book "expert one on one Oracle". in there i suggest a method using query plan stability.

Index usage throgh db link

Lakshminarasimhan R, February 23, 2004 - 10:56 pm UTC

Tom

how to ensure whether index is used if a query contains a table from remote database.

I hv tried to traceout explain plan but not giving any clue

QL> SELECT TRAVEL_MONTH,PROCESSING_MONTH,AGENCY,CHECK_DIGIT,
2 ONLINE_ORG,ONLINE_DEST,ORIGIN,DEST,COMPARTMENT,ONLINE_CARRIER,PAX_COUNT,EK_PRESENT,
3 ONLINE_CLASS,POS,CRS_CODE,DIRECT_FLAG,sp.RUN_INDICATOR,sp.OLD_CUR_NEW,sp.data_type,
4 sp.host_al_code,sp.primary_seq_id,sp.secondary_seq_id,sp.fin_id,sp.info_cat,c.MIDT_DATE_REQ
5 FROM syn_mkt_share_aln_source src,
6 TDC_SESSION_PARAMETERS sp,
7 v_midt_req_period c
8 WHERE src.processing_month >= DECODE(sp.RUN_INDICATOR,'N',src.processing_month,sp.DATA_DATE_FROM)
9 AND src.processing_month <= DECODE(sp.RUN_INDICATOR,'N',src.processing_month,sp.DATA_DATE_TO)
10 AND sp.info_cat = 'MKT_SHARE_ALN';

xecution Plan
---------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'DUAL'
4 2 TABLE ACCESS (FULL) OF 'TDC_SESSION_PARAMETERS'
5 1 REMOTE* INST_INS
IGHT_ADH
_LINK

5 SERIAL_FROM_REMOTE SELECT "TRAVEL_MONTH","PROCESSING_MONTH","AG
ENCY","CHECK_DIGIT","ONLINE_ORG","ONLINE_DES
T","ORIGIN","DEST","COMPARTMENT","ONLINE_CAR
RIER","PAX_COUNT","EK_PRESENT","ONLINE_CLASS
","POS","CRS_CODE","DIRECT_FLAG" FROM "MIDT_
MAIN_YTD_TDC" "SRC" WHERE "PROCESSING_MONTH"
<=DECODE(:1,'N',"PROCESSING_MONTH",:2) AND "
PROCESSING_MONTH">=DECODE(:3,'N',"PROCESSING
_MONTH",:4)

Here, table Midt_main_ytd_tdc lying in Remote db and accessing thr db link (synonym).

can you throw some light on this.

February 24, 2004 - 6:38 am UTC

touch the database link (select * from dual@remote)

log directly into remote from another session and use dbms_system to set_sql_trace_in_session for that new session you just created.

run your query and exit.

then use tkprof.

(or cut and paste that query above into sqlplus on remote and use set autotrace traceonly explain).

Looking at the query -- I would say pretty close to 0% chance that an index is involved. Look at the predicate:

WHERE "PROCESSING_MONTH"
<=DECODE(:1,'N',"PROCESSING_MONTH",:2) AND "
PROCESSING_MONTH">=DECODE(:3,'N',"PROCESSING
_MONTH",:4)

In this case, I would advise (since that sp table looks just to be a parameter table) that you use a ref cursor and procedural logic to decide on "query a" or "query b" and avoid using the function on the database column as that pretty much "precludes indexes"

Indexes Not used

A reader, May 13, 2004 - 10:59 am UTC

Hi
I have a problem with Indexes not getting used in Query
The Number rows of tables are

ACC> SELECT COUNT(*) FROM tb_tir0ssd0;

COUNT(*)
----------
192808

ACC> SELECT COUNT(*) FROM tb_tir0pgn0;

COUNT(*)
----------
7798138

ACC> SELECT COUNT(*) FROM tb_tir0pos0;

COUNT(*)
----------
4383116
ACC> SELECT COUNT(*) FROM TB_TIR0SSD7;

COUNT(*)
----------
291113
ACC> SELECT COUNT(*) FROM TB_TIR0PGNN;

COUNT(*)
----------
8894102

The tables have the following indexes:

---PGN0---
CONSTRAINT PK_TB_TIR0PGN0 PRIMARY KEY (PK_ROOT) USING INDEX

INDEX IDX_NUQ_PGN_CTR_CD ON TB_TIR0PGN0(ORG_CTR_CD)

INDEX IDX_NUQ_DC_PGN0 ON TB_TIR0PGN0(DCM_TYPE_CD)

UNIQUE INDEX IDX_UNQ_TIR0PGN0 ON TB_TIR0PGN0(PTY_IDEN_APC_CD, PTY_IDEN_VL, BNS_DT, RCD_TYPE)

INDEX IDX_NUQ_PGN_BNS_DT ON TB_TIR0PGN0(BNS_DT)

INDEX IDX_NUQ_TIR0PGN0 ON TB_TIR0PGN0(PTY_IDEN_APC_CD, PTY_IDEN_VL)

---PGNN-----
INDEX IDX_PGN_QCR_001 ON
TB_TIR0PGNN(PTY_LINK_APC_CD, PTY_LINK_IDEN_VL)

INDEX IDX_TB_TIR0PGNN ON TB_TIR0PGNN(PK_ROOT)

--- SSD0 ----
CONSTRAINT PK_TB_TIR0SSD0 PRIMARY KEY (PK_ROOT) USING INDEX

INDEX IDX_NUQ_ISSID_SSD0 ON TB_TIR0SSD0(ISS_APC_CD, ISS_IDEN_VL)

INDEX IDX_NUQ_SSD_BNS_DT ON TB_TIR0SSD0(BNS_DT)

INDEX IDX_NUQ_TIR0SSD0 ON TB_TIR0SSD0(SCR_ID_APC_CD, SCR_ID_IDEN_VL)

UNIQUE INDEX IDX_UNQ_TIR0SSD0 ON TB_TIR0SSD0(SCR_ID_APC_CD, SCR_ID_IDEN_VL, BNS_DT, RCD_TYPE)

--- SSD7 ----
INDEX TIR.IDX_NUQ_SSD7_2 ON TB_TIR0SSD7(PK_ROOT)

--- POS0 ----
CONSTRAINT PK_TB_TIR0POS0 PRIMARY KEY (PK_ROOT) USING INDEX

INDEX IDX_NUQ_SCRACC_POS0 ON TB_TIR0POS0(SCR_ACC_ID_APC_CD, SCR_ACC_ID_IDEN_VL)

INDEX IDX_NUQ_SCR_POS0 ON TB_TIR0POS0(SCR_ID_APC_CD, SCR_ID_IDEN_VL)

INDEX IDX_UNQ_TIR0POS0 ON TB_TIR0POS0(SCR_PST_ID_APC_CD, SCR_PST_ID_IDEN_VL, BNS_DT, RCD_TYPE)

INDEX IDX_NUQ_POS_BNS_DT ON TB_TIR0POS0(BNS_DT)

INDEX IDX_NUQ_TIR0POS0 ON TB_TIR0POS0(SCR_PST_ID_APC_CD, SCR_PST_ID_IDEN_VL)

The indexes columns not in the query below are targeted for other queries in the application.

The Query is:

SELECT TB_TIR0POS0.scr_acc_id_apc_cd, TB_TIR0POS0.scr_acc_id_iden_vl,
TB_TIR0POS0.qte_mhd_cd, TB_TIR0POS0.crc_cd,
TB_TIR0SSD0.cls_cd, TB_TIR0SSD0.s_cls_cd,
TB_TIR0SSD0.iss_apc_cd, TB_TIR0SSD0.iss_iden_vl,
TB_TIR0PGN0.ads_cty_cd, TB_TIR0PGN0.pk_root,
TB_TIR0PGNN.pty_link_iden_vl, TB_TIR0PGN0.dcm_type_cd,
TB_TIR0SSD0.scr_id_apc_cd, TB_TIR0SSD0.scr_id_iden_vl,
TB_TIR0POS0.stl_pst_sgn_cd, TB_TIR0POS0.stl_pst_am,
TB_TIR0SSD7.exg_cd, TB_TIR0PGN0.pty_iden_vl
FROM TB_TIR0SSD0,
TB_TIR0PGN0,
TB_TIR0PGNN,
TB_TIR0POS0,
TB_TIR0SSD7
WHERE TB_TIR0POS0.scr_id_apc_cd = TB_TIR0SSD0.scr_id_apc_cd
AND TB_TIR0POS0.scr_id_iden_vl = TB_TIR0SSD0.scr_id_iden_vl
AND TB_TIR0SSD0.iss_apc_cd = TB_TIR0PGNN.pty_link_apc_cd
AND TB_TIR0SSD0.iss_iden_vl = TB_TIR0PGNN.pty_link_iden_vl
AND TB_TIR0PGN0.pk_root = TB_TIR0PGNN.pk_root(+)
AND TB_TIR0SSD0.pk_root = TB_TIR0SSD7.pk_root

and the Plan is

Operation Object Name Rows Bytes Cost TQ In/Out PStart PStop

SELECT STATEMENT Hint=CHOOSE 3 M 916205
MERGE JOIN 3 M 613 M 916205
SORT JOIN 295 K 38 M 808611
MERGE JOIN 295 K 38 M 782596
SORT JOIN 202 K 22 M 778304
MERGE JOIN 202 K 22 M 762705
SORT JOIN 5 M 307 M 755001
MERGE JOIN OUTER 5 M 307 M 460555
SORT JOIN 4 M 141 M 288135
TABLE ACCESS FULL TB_TIR0PGN0 4 M 141 M 137670
SORT JOIN 5 M 141 M 172420
TABLE ACCESS FULL TB_TIR0PGNN 5 M 141 M 17641
SORT JOIN 195 K 10 M 7704
TABLE ACCESS FULL TB_TIR0SSD0 195 K 10 M 1014
SORT JOIN 284 K 4 M 4293
TABLE ACCESS FULL TB_TIR0SSD7 284 K 4 M 279
SORT JOIN 2 M 107 M 107594
TABLE ACCESS FULL TB_TIR0POS0 2 M 107 M 14540

***
The DB uses CBO and it is a sort of a hybrid database: something of a data warehouse with huge volumes of daily transaction. The data is loaded using ETL tools. The data is extracted using SQL queries built into an application.
We are using Oracle Version 8.1.7. The database size at the moment is about 350GB and is expected to grow to 600GB. The daily transaction volume is about 0.2 million and a total of around 10 million bew records in the database/day

We have analyzed the all the tables using Analyze table <Tname> compute statistics ; But the plan remains the same both before and after analyzing the Tables!!! Due to the FTS (probably) the queries run for hours on end and the performance has gone for a toss!

Can u help??

May 13, 2004 - 11:36 am UTC

thank goodness no indexes were used. It would have been painfully slow otherwise!!!!!! Painfully slow.

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894#6760861174154 <code>

why do you think indexes would help this even a tiny bit? it would make it do tons more work over and over and over again.

indexes = good for getting a row
bulk operations = good for getting them all.

It does appear that someone un-intelligently has set the fantasic HASH_JOIN_ENABLED parameter to false. You'll find that once you

a) lose your fear of the full scan
b) re-enabled the features you've disabled
c) set a nice juicy hash_area_size

you'll be going as fast as it is going to get.

Great help

A reader, May 14, 2004 - 4:15 am UTC

Hi Tom,

Thanks for the quick response... It was a great relief to read about the FTS and a great help indeed. Yesterday we found the queries to be much fasrer even on the FTS. :-)

Will get the parameters corrected. By the way, how do we decide the optimal HASH_AREA_SIZE? What stats can help in deciding the size? Is there a formula that can give an idea of the probable size range?

May 14, 2004 - 10:17 am UTC

it is a function of

o number of concurrent users
o ram you have

in 9i, this is done automagically for you with pga_aggregate_target.

Two indexes in a table is worse than one idex in a table,isn't it?

guoli cai, June 02, 2004 - 4:34 am UTC

Tom:
In your book,you said: Generally speaking, to a good clustered gene,there is only one index in a table,and the data will be phisically orgnized only in one mode.Does it
mean that two indexes in a table is worse than one index?
Can you list any proofs?
Thanks,
guoli cai

June 02, 2004 - 8:33 am UTC

that is not what was meant at all.

I was stating the fact that

a) a table is sorted in one way at most
b) therefore, at most one index will have an awesome clustering factor
c) and the rest will most likely not
d) so don't reorg tables for each index, as each reorg destroys the last one

the point being -- data can be sorted (organized) at most one way, don't even try to get N indexes on a table to have "really good cluster factors", you cannot achieve that.

The cluster factor comes in for large index RANGE scans. Indexes are generally used to retrieve a FEW rows. Having many indexes on a table can be

a) really good
b) really bad
c) neither good nor bad

Indexes on Columns NULL/ NOT NULL

Wor, July 09, 2004 - 10:00 am UTC

Hi Tom,
I am sorry if I am asking a very basic question, But please answer me.
I want to know which performance is better and why.

A) one solution is to have just one table and 2 nullable FK's
e.g. I have a table t which has col1 and col2 ( col1 is primary key in Table t1 and col2 is primary key in table t2 )
For each row in table t either col1 or col2 will be null.

B) second soultion is to have 2 different table and each has unique not null FK.
e.g. I have table tt1 which has NOT NULL col1 as FK ( col1 is primary key in Table t1 ) and table tt2 which has NOT NULL col2 as FK ( col2 is primary key in Table t2 )

Question:
How is performance of Indexes on Nullable columns versus Not null columns. Which is better.
I am using Oracle 8.1.7 on Win 2k

Thanks as always
Wor

July 09, 2004 - 10:38 am UTC

seems your DATA MODEL and how you use the data would drive this decision.

indexes on null/not null columns perform exactly the same -- there is no difference at all.

there is just the existence of nulls (or lack of more appropriately i guess, well, we don't know if they exist or not -- that is the essence of null)

Am I missing something obvious?

Manjunath Kottur, July 14, 2004 - 2:02 am UTC

Hi Tom, 
I have gained a lot of insight into how Oracle works from your site. I have done a lot of tuning, based on the information available here, but this one beat me. Its very simple, but...
 I have this simple query which is doing an FTS. 
 On Production, the table is not very big, around 1500 records on any given day, but the query is executed more4 than 500,000 times, each day.

 The sessionid field (not null and of varchar2(40) datatype) has an Unique index and the table is regularly analyzed using dbms stats.

Except the INDEX or RULE hints, nothing works. I want to make sure that I am not missing something obvious. So far, I tried rebuilding the index, generating histograms, MOVE etc. As this appears in many of our procedures, adding a hint every where is not practical. Stored outlines too are not an option unless I can convince my management that there is no other way out.


I am using 8.1.7.4. 

SQL> SELECT enterprisekey FROM LYSESSION WHERE SESSIONID='9be29b9d1eb04b04a10e0a21aa0c94f0'; 

Execution Plan 
---------------------------------------------------------- 
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=36) 
   1    0   TABLE ACCESS (FULL) OF 'LYSESSION' (Cost=1 Card=1 Bytes=36 
          ) 

Statistics 
---------------------------------------------------------- 
         14  recursive calls 
         12  db block gets 
         10  consistent gets 
          0  physical reads 
          0  redo size 
        295  bytes sent via SQL*Net to client 
        366  bytes received via SQL*Net from client 
          3  SQL*Net roundtrips to/from client 
          1  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 

  SQL> SELECT /*+ RULE */ enterprisekey FROM LYSESSION WHERE SESSIONID='9be29b9d1eb04b04a10e0a21aa0c94f0'; 

Execution Plan 
---------------------------------------------------------- 
   0      SELECT STATEMENT Optimizer=HINT: RULE 
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'LYSESSION' 
   2    1     INDEX (UNIQUE SCAN) OF 'UKLYSESSION1' (UNIQUE) 

Statistics 
---------------------------------------------------------- 
          0  recursive calls 
          0  db block gets 
          3  consistent gets 
          0  physical reads 
          0  redo size 
        295  bytes sent via SQL*Net to client 
        378  bytes received via SQL*Net from client 
          3  SQL*Net roundtrips to/from client 
          0  sorts (memory) 
          0  sorts (disk) 
          1  rows processed 

SQL>

I can create a test case, but is there anything else that I can try?

Thanks
Manjunath

July 14, 2004 - 11:20 am UTC

the table is tiny? it is 6 blocks or less?

show me a tkprof of both queries -- after they have already been hard parsed (no recursive sql)

Here is the tkprof results

A reader, July 14, 2004 - 12:11 pm UTC

Hi Tom,
Yes, on the Dev server on which I am doing my tests, there are 202 records with 2 blocks.
On Production it is 1215 records in 25 blocks.

Here is the tkprof results: (from my Dev environment)
SELECT enterprisekey FROM LYSESSION WHERE
SESSIONID='9be29b9d1eb04b04a10e0a21aa0c94f0'

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 7 12 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 7 12 1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 83 (ETS_QAMAIN_4136)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL LYSESSION

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'LYSESSION'

SELECT /*+ RULE */ enterprisekey FROM LYSESSION WHERE
SESSIONID='9be29b9d1eb04b04a10e0a21aa0c94f0'

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 0 1

Misses in library cache during parse: 0
Optimizer goal: RULE
Parsing user id: 83 (ETS_QAMAIN_4136)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: HINT: RULE
0 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF 'LYSESSION'
0 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'UKLYSESSION1' (UNIQUE)

Thanks
Manjunath

July 14, 2004 - 12:15 pm UTC

so, what does it do on prod? in real life.

or copy prod to dev so we can actually analyze the problem.

Does Varchar2(40) has anything to do with it?

Manjunath, July 14, 2004 - 1:00 pm UTC

Hi Tom,
Due to too many internal processes, it will be difficult to get a table dump from the Production. :-(
While analyzing a stored procedure for performance issues, I noticed this behaviour on Dev. Knowing how frequently this statement is used, I tried changing the behaviour. Is there anything with indexes on varchar fields (with lengths like 40), that the optimizer decides to go in for FTS? The clustering factor is too high, but as the table has another 3 indexes, I cannot think or reorg as an alternate.

Thanks
Manjunath

July 14, 2004 - 10:04 pm UTC

your table is too small -- make it the same size as production.

A reader, October 14, 2004 - 3:42 pm UTC

I have the following query

SELECT sec_num, bal_date, bal_type, SUM (units), SUM (book_cost), SUM (tax_cost)
FROM sl_bal z
WHERE (sec_num, bal_type, bal_date) IN (SELECT sec_num, 'e', enter_date
FROM sl_trans_det t
WHERE load_sequence_number = 3103)
GROUP BY sec_num, bal_date, bal_type

There is a non-unique index on (sec_num, bal_type, bal_date), but CBO doesnt use it. It does a FTS on sl_bal which is a 60M row table (sl_trans_det is about 45 million)

The plan is

PLAN_TABLE_OUTPUT

----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 429 | 21021 | | 20392 (0)|
| 1 | SORT GROUP BY | | 429 | 21021 | | 20392 (0)|
|* 2 | HASH JOIN | | 429 | 21021 | | 20387 (0)|
| 3 | VIEW | VW_NSO_1 | 7320 | 128K| | |
| 4 | SORT UNIQUE | | 7320 | 135K| 472K| 49 (0)|
| 5 | TABLE ACCESS BY INDEX ROWID| SL_TRANS_DET | 7320 | 135K| | 16 (0)|
|* 6 | INDEX RANGE SCAN | SL_TRANS_DET#LOAD_SEQ_NO | 7320 | | | 18 (0)|
| 7 | TABLE ACCESS FULL | SL_BAL | 58M| 1737M| | 11977 (0)|
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("Z"."SEC_NUM"="VW_NSO_1"."$nso_col_1" AND "Z"."BAL_TYPE"="VW_NSO_1"."$nso_col_2" AND
"Z"."BAL_DATE"="VW_NSO_1"."$nso_col_3")
6 - access("T"."LOAD_SEQUENCE_NUMBER"=3103)

This query takes forever

Now, if I hint the query as follows

SELECT /*+ index(z xie2sl_bal) */ sec_num, bal_date, bal_type, SUM (units), SUM (book_cost), SUM (tax_cost)
FROM sl_bal z
WHERE (sec_num, bal_type, bal_date) IN (SELECT sec_num, 'e', enter_date
FROM sl_trans_det t
WHERE load_sequence_number = 3103)
GROUP BY sec_num, bal_date, bal_type

The plan changes to

PLAN_TABLE_OUTPUT

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 429 | 21021 | | 301K (0)|
| 1 | SORT GROUP BY | | 429 | 21021 | | 301K (0)|
| 2 | TABLE ACCESS BY INDEX ROWID | SL_BAL | 1 | 31 | | 42 (0)|
| 3 | NESTED LOOPS | | 429 | 21021 | | 301K (0)|
| 4 | VIEW | VW_NSO_1 | 7320 | 128K| | |
| 5 | SORT UNIQUE | | 7320 | 135K| 472K| 49 (0)|
| 6 | TABLE ACCESS BY INDEX ROWID| SL_TRANS_DET | 7320 | 135K| | 16 (0)|
|* 7 | INDEX RANGE SCAN | SL_TRANS_DET#LOAD_SEQ_NO | 7320 | | | 18 (0)|
|* 8 | INDEX RANGE SCAN | XIE2SL_BAL | 1 | | | |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

7 - access("T"."LOAD_SEQUENCE_NUMBER"=3103)
8 - access("Z"."SEC_NUM"="VW_NSO_1"."$nso_col_1" AND "Z"."BAL_TYPE"="VW_NSO_1"."$nso_col_2" AND
"Z"."BAL_DATE"="VW_NSO_1"."$nso_col_3")

This comes back in a few seconds.

Surprisingly enough, the total no of rows returned is 445 and both the above plans get this number almost perfectly.

So why is the first plan so much worser than the second?

[Yes, stats are up to date]

Help? Thanks

October 14, 2004 - 7:48 pm UTC

tell you what -- zip up the results of a 10053 trace (search for it here if you don't know what it is) and i'll take a look -- looks interesting enough. email it to me (tkyte@oracle.com) and make sure to reference this question.

i need BOTH queries 10053 traced.

to a Reader

Vadim Gotovsky, October 14, 2004 - 8:36 pm UTC

I have seen CBO uses FTS and HASH joins when GROUP BY or ORDER BY present (apparently thinks that it is usless to use NL as rows need to be sorted at the end before being returned):

I remember fixing it by rewriting query like this:

SELECT * FROM
(
SELECT sec_num, bal_date, bal_type, SUM (units), SUM (book_cost), SUM
(tax_cost)
FROM sl_bal z
WHERE (sec_num, bal_type, bal_date) IN (SELECT sec_num, 'e', enter_date
FROM sl_trans_det t
WHERE load_sequence_number = 3103)
)
GROUP BY sec_num, bal_date, bal_type

It often helped. Try this and see if it will work while Tom researching 10053.
Thanks

Vadim

Bugs

A reader, November 01, 2004 - 11:09 pm UTC

FYI

As I suspected, it was a CBO bug.

There are one-off patches for Bug 3499674 and the 9.2.0.4.0 performance Mini Patch BUG:3742913. The former bug seems to be what was causing the issue, but applying that patch did NOT produce a better plan. Applying the second patch did generate the better plan (same as with the RULE hint!)

The 9.2.0.6 patchset is supposed to fix all these issues.

Thanks

CBO not behaving properly for this simple query.

A reader, November 16, 2004 - 2:16 pm UTC

I am using CBO for my database version 9.2.0.4, and dbms_stats as the way to gather stats regularly. In most cases, with all the info provided over here in your website, I am able to figure out why the CBO behaves the way it behaves. But I here I have this one simple query, which does not use an index even after gathering stats on the table.

The table PS_PENDING_ITEM :

select count(*) from PS_PENDING_ITEM;

COUNT(*)
----------
114705

select num_rows from dba_tables where table_name = 'PS_PENDING_ITEM';

NUM_ROWS
----------
112163

The stats are gathered with 40%.

There are about 5 indexes on the table, but here are the relevant ones which potentially could have been used.

Index PSAPENDING_ITEM (NON-UNIQUE)
(POSTED_FLAG,BUSINESS_UNIT,CUST_ID,ITEM,ITEM_LINE)

Index PS_PENDING_ITEM (UNIQUE)(GROUP_BU,GROUP_ID,BUSINESS_UNIT,CUST_ID,ITEM,ITEM_LINE,
GROUP_SEQ_NUM)

Clustering factor for PSAPENDING_ITEM :
CLUSTERING_FACTOR BLEVEL
----------------- ----------
88327 2
Clustering factor for PS_PENDING_ITEM :
CLUSTERING_FACTOR BLEVEL
----------------- ----------
18329 2

Database parameters related to optimizer :

show parameter opt

NAME TYPE VALUE
------------------------------------ ----------- -------
filesystemio_options string asynch
object_cache_optimal_size integer 102400
optimizer_dynamic_sampling integer 1
optimizer_features_enable string 9.2.0
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_max_permutations integer 2000
optimizer_mode string CHOOSE

Below is the query :

set autotrace on statistics
select VAT_TRANS_AMT from PS_PENDING_ITEM where
BUSINESS_UNIT = 'LU001'
and ITEM = 'LUL0100000011'
and ITEM_LINE = 1;

VAT_TRANS_AMT
-------------
600

Elapsed: 00:00:05.61

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8147 consistent gets
7985 physical reads
512 redo size
496 bytes sent via SQL*Net to client
651 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

The plan for the above is

set autotrace traceonly explain
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1196 Card=1 Bytes=33
)
1 0 TABLE ACCESS (FULL) OF 'PS_PENDING_ITEM' (Cost=1196 Card=1
Bytes=33)

As soon as I provide the hint to use the non-unique index, it uses it.Hint with Unique index is totally ignored by the CBO. Here are the stats and plan.

set autotrace traceonly explain

select /*+ index_ffs(PS_PENDING_ITEM PSAPENDING_ITEM) */
VAT_TRANS_AMT from PS_PENDING_ITEM where
BUSINESS_UNIT = 'LU001'
and ITEM = 'LUL0100000011'
and ITEM_LINE = 1
/

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1089 Card=1 Bytes=33
)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'PS_PENDING_ITEM' (Cost=1
089 Card=1 Bytes=33)
2 1 INDEX (FULL SCAN) OF 'PSAPENDING_ITEM' (NON-UNIQUE) (Cos
t=1088 Card=1)

set autotrace on statistics

VAT_TRANS_AMT
-------------
600

Elapsed: 00:00:03.77

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1121 consistent gets
1088 physical reads
376 redo size
496 bytes sent via SQL*Net to client
651 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

I fail to find an explaination for the above. Index provides the answer faster, but CBO consistently ignores it.

I appreciate your input.
Thanks,

November 16, 2004 - 11:05 pm UTC

i'm curious about something else first....

ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> select /*+ index_ffs(t t_idx) */ * from t;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=82 Bytes=1066)
   1    0   INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=5 Card=82 Bytes=1066)


why are you not getting FFS? just index scan?

A reader, November 17, 2004 - 9:25 am UTC

Yep, I have the exact same concern. The optimizer decides to use index scan and not FFS.

November 17, 2004 - 1:55 pm UTC

i wondering how an index_ffs could lead to a index being used.

if you

a) exp the table with rows=n
b) use dbms_stats to export the statistics to another table
c) exp that stats table

compress those two files -- rename to something other than .zip (it bounce otherwise) and email, i'll see what I see.

CURSOR_SHARING ?

Gary, November 17, 2004 - 4:06 pm UTC

"i wondering how an index_ffs could lead to a index being used."

Haven't you always said, a hint is just a hint and the optimizer can ignore it ?

I wonder what CURSOR_SHARING is set to.
If the unhinted query was using a plan based on bind variable peeking (from other liternal values), then the hinted query would simply get a new plan based on its bind-variable peeking.

November 17, 2004 - 4:44 pm UTC

i've not seen an index_ffs make something go from full scan to index full scan, i would not anticipate that.

explain plan (autotrace) cannot do bind variable peeking. peeking does not happen with explain plan/autotrace explains.

A reader, November 18, 2004 - 9:32 am UTC

Tom,
I am sending you the compressed exp files that you have requested. Also,

show parameter cursor_sharing

NAME TYPE VALUE
------------------------------------ ----------- ------
cursor_sharing string EXACT

I am not using bind variables at all. The query is being run exactly the way it was posted. (With literal values)

Thanks,

November 18, 2004 - 12:03 pm UTC

that is interesting, i would have expected the FFS to have happened.

what command exactly do you gather stats with (eg: do you have histograms fully on the columns of the leading edge columns of of the indexes? especially posted flag.

A reader, November 18, 2004 - 2:05 pm UTC

The analyze is done using a PL/SQL script :

dbms_stats.gather_table_stats
(ownname=>'SYSADM',
tabname=>x.table_name,
method_opt=> 'for all indexed columns size 1',
estimate_percent=>40,
degree=>${DEGREE},
cascade => true);

It is possible that we did gather histograms in the past maybe once or twice, out of lack of understanding or to test something.

select count(*) from DBA_HISTOGRAMS where TABLE_NAME='PS_PENDING_ITEM';

COUNT(*)
----------
2863

The above is way more than 2*no_of_columns.

Also,
select * from DBA_HISTOGRAMS where TABLE_NAME='PS_PENDING_ITEM' and column_name='POSTED_FLAG'

Returns 2 records.

My e-mail should have reached you by now, since it did not bounce back to me.

Thanks,

November 18, 2004 - 2:20 pm UTC

<quote>
It is possible that we did gather histograms in the past maybe once or twice,
out of lack of understanding or to test something.
</quote>

i know you did -- some columns have many, most have just 2 buckets (high, low)

i have your test, needed to verify how you gathered stats.

suggestion: small table, copy it, gather histograms on the flag column (which I'm guessing has few values). we might see a nice skip scan if the optimizer knows.

A reader, November 22, 2004 - 9:56 am UTC

Here is what I did :

select distinct posted_flag from ps_pending_item;

POS
---
N
Y

SQL> select count(*) from ps_pending_item where posted_flag = 'Y';

  COUNT(*)
----------
    119082

SQL> select count(*) from ps_pending_item where posted_flag = 'N';

  COUNT(*)
----------
        54

Also, the original query returns a value of 'Y' for the posted flag. Since, most of the rows have the value of 'Y', the CBO, decides to do a full index scan maybe ? 

Could you please interpret the query results above and help me draw a more logical conclusion ?

Thanks so much for your valuable suggestions and time,

November 22, 2004 - 2:43 pm UTC

did you gather histograms on this column so the optimizer knows:

a) y = 119k rows
b) n = 54 rows

vs

a) the high value is Y, low value is N
b) there are 2 distinct values
c) there are 119k rows in the table, hence Y = 119k/2 and N = 119k/2

A reader, November 23, 2004 - 10:20 am UTC

I gathered histograms for that table on all indexed columns size skew only. But still I get the exact same results.

No clue as to why no index is picked up in the fist place. Also, after the hint for index_ffs, the optimizer does an index full scan.

I tried creating an index only on the columns that are involved in the query. That index does get picked up without any hints. I was trying to avoid that index overhead and rely on the index skip scan capability, since there are 2 indexes which already have those columns along with other columns.

Anything else that I could try ?

Thanks,

November 23, 2004 - 10:33 am UTC

can you just try gathering histograms EXPLICITLY on this column for me.

A reader, November 23, 2004 - 11:24 am UTC

Here is the command that I ran :

exec dbms_stats.gather_table_stats(ownname=>'SYSADM',tabname=>'PS_PENDING_ITEM',method_opt=>'for columns posted_flag size skewonly',cascade=> true,degree=>6);

Please let me know if this is what you wanted. The above gave no change in the execution plan.

Thanks,

November 23, 2004 - 12:40 pm UTC

size 254

A reader, November 23, 2004 - 1:40 pm UTC

Nope. No change there. I did size 254, but still got the same plan.

Thanks,

November 23, 2004 - 2:44 pm UTC

Index PSAPENDING_ITEM (NON-UNIQUE)
(POSTED_FLAG,BUSINESS_UNIT,CUST_ID,ITEM,ITEM_LINE)

it appears via testing that it is the cust_id stuck in the middle that is precluding any sort of skip scan -- if cust_id were moved last or removed, it skip scans naturally.

A reader, December 01, 2004 - 10:45 am UTC

Just wanted to know if by moving cust_id to the very end, could it potentially impact other queries that use this index column ?

In other words, (1,2,3,4,5) now becomes (1,2,4,5,3). How would queries using columns 1,2,3 behave ?

Thanks,

December 01, 2004 - 11:00 am UTC

the order of columns in the index is extremely important (as you have found here).

you would need to test those.

A reader, December 01, 2004 - 11:33 am UTC

Thank you for all the suggestions and help.

Index Not used

gia, December 03, 2004 - 12:54 am UTC

Hi Tom,

Please explain why index is not used in 2nd statement.



16:18:25 SQL> exec dbms_stats.gather_table_stats(ownname => 'CMCREC',tabname => 'DATA_3',method_opt 
=> ' FOR ALL INDEXES FOR ALL INDEXED COLUMNS') 

PL/SQL procedure successfully completed.

16:18:56 SQL> select count(*)
16:19:19   2  from data_3
16:19:19   3  where f1 in ('QTM','CMT');

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=3)
   1    0   SORT (AGGREGATE)
   2    1     INLIST ITERATOR
   3    2       BITMAP CONVERSION (COUNT)
   4    3         BITMAP INDEX (RANGE SCAN) OF 'DATA3_F1MTYPE_BIDX'

**** INDEX NOT USED HERE *****

16:19:20 SQL> ed
Wrote file afiedt.buf

  1  select count(*)
  2  from data_3
  3* where f1 in ('QTM','CMT') and match_type in ('DATA_3')
16:19:43 SQL> /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'DATA_3' (Cost=257 Card=9143 Byte
          s=100573)




16:19:45 SQL> ed
Wrote file afiedt.buf

  1  select count(*)
  2  from data_3
  3* where f1 in ('QTM','CMT') and match_type in ('DATA_3','DATA_4')
16:20:25 SQL> /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     CONCATENATION
   3    2       BITMAP CONVERSION (TO ROWIDS)
   4    3         BITMAP INDEX (RANGE SCAN) OF 'DATA3_F1MTYPE_BIDX'
   5    2       BITMAP CONVERSION (TO ROWIDS)
   6    5         BITMAP INDEX (RANGE SCAN) OF 'DATA3_F1MTYPE_BIDX'



16:20:26 SQL>

December 03, 2004 - 7:54 am UTC

without an example, cannot even guess. if you believe an index would be superior, perhaps you should hint the query to make it use the indexes to prove the point.

else the answer given this level of detail is "indexes are not 100% good, they are not fast=true, they are not the 'best way'"

no table structures
no index structures
no information on the data contained there-in

more info

gia, December 03, 2004 - 1:30 am UTC

index is used for match_type in ('DATA_4')


17:25:06 SQL> ed
Wrote file afiedt.buf

  1  select count(*)
  2    from data_3
  3  where f1 in ('QTM','CMT')
  4* and match_type in ('DATA_4')
17:25:16 SQL> /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
   1    0   SORT (AGGREGATE)
   2    1     CONCATENATION
   3    2       BITMAP CONVERSION (TO ROWIDS)
   4    3         BITMAP INDEX (SINGLE VALUE) OF 'DATA3_F1MTYPE_BIDX'
   5    2       BITMAP CONVERSION (TO ROWIDS)
   6    5         BITMAP INDEX (SINGLE VALUE) OF 'DATA3_F1MTYPE_BIDX'

FTS and Index usage

Yogesh, February 09, 2005 - 9:58 am UTC

I have a table size 12G, with following structure

SEQ_NO,
TRN_NO,
OUT_NO,
TXN_DATE,
AMOUNT

PK SEQ_NO,TRN_NO.
Local partition index on TXT_DATE.
Partitioned on TXN_DATE.
Each partition cover 3 months.
Statistics updated till date.
Table has data from 1999 - till date.

I'm using a query where this table is joined with couple of more tables. One of the predicates in the query is TXN_DATE > '01-SEP-2003'.

Following is the explain plan for the query.

----------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
----------------------------------------------------------------------------------
| SELECT STATEMENT | | 104K| 26M| 869663 | | |
| NESTED LOOPS | | 104K| 26M| 869663 | | |
| HASH JOIN | | 104K| 24M| 556646 | | |
| TABLE ACCESS FULL |AREA | 144 | 3K| 1 | | |
| HASH JOIN | | 104K| 21M| 556642 | | |
| TABLE ACCESS FULL |OUTLET | 1K| 41K| 5 | | |
| HASH JOIN | | 106K| 18M| 556620 | | |
| HASH JOIN | | 106K| 6M| 449501 | | |
| TABLE ACCESS FULL |TCONS | 161K| 3M| 89 | | |
| PARTITION CONCATENATED | | | | | KEY | 12 |
| TABLE ACCESS FULL |TRAN_MAS| 103M| 3G| 108367 | KEY | 12 |
| TABLE ACCESS FULL |TRAN_DET| 8M| 888M| 15793 | | |
| TABLE ACCESS BY INDEX ROWID |LOAD_TAB| 97M| 1G| 3 | | |
| INDEX RANGE SCAN |L_S1 | 97M| | 2 | | |
----------------------------------------------------------------------------------

When I used SQL trace I got following output for this query.

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.05 0.06 0 0 0 0
Execute 1 0.01 0.01 0 0 0 0
Fetch 1 641.84 1233.37 583591 630827 161 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 641.90 1233.44 583591 630827 161 0

Following are my concerns.

1. When I'm using a predicate TXN_DATE > '01-SEP-2003', why it is doing PARTITION CONCATENATED and FTS?

2. When I wrote this query, it use to pick up the PK index. But since last couple of weeks it has stopped using it. As a result, query is taking 15-20 mins. Earlier it use to take couple of mins.

What could be wrong? Please advice.

February 09, 2005 - 2:38 pm UTC

1) incomplete information -> no guesses by me.
2) sounds like stale or incomplete stats. but, thats guessing.

looks like a "new question"

How you find that

A reader, February 23, 2005 - 6:38 pm UTC

Tom,

In one of the responses to the discussion threads above, you mentioned the following :

thank goodness no indexes were used. It would have been painfully slow
otherwise!!!!!! Painfully slow.

It does appear that someone un-intelligently has set the fantasic
HASH_JOIN_ENABLED parameter to false. You'll find that once you

a) lose your fear of the full scan
b) re-enabled the features you've disabled
c) set a nice juicy hash_area_size

Questions

1. From the above query plan, how did you figure out that using indexes would make the query run longer?
2. How did you find out that HASH_JOIN_ENABLED has been set to FALSE.
3. What are the things that you typically look for in a query plan to arrive at this conclusion. Is that HASH JOIN is the best join method when accessing tables using Full table scans?

Thanks

February 24, 2005 - 5:30 am UTC

1) it was reading all or most of the rows. indexes are good for finding A ROW, not all of them.

2) i guessed, when I see lots of sort/merge joins, it generally means that.

3) if joining big table to big table, a hash join is brutally efficient.
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894#6760861174154 <code>

Pls. clarify

A reader, February 24, 2005 - 9:47 am UTC

Tom,

From the discussions that i have been reading about execution plans, I understand that you say that :
1. Analyze the tables/indexes correctly
2. Leave the decision to the optimizer and do not use hints
   that could instruct the optimizer to do things in a different way and only use hints that provide extra info to the optimizer

My questions are :

1. How do I determine that the optimizer has made the correct plan?
2. Assuming that the optimizer has done the correct plan, if my query takes longer to complete, what are the key things I need to look for.
3. In my case, I have a SELECT that completes very quickly and the logical I/Os are less. In this case, I did not put any hint to force the optimizer and it used its own plan. The entire select completes in about 1 minute. But when I use this select to do a join update, the update runs for more than 30 minutes. Please let me know what I need to look for in the autotrace.

Here is the SELECT statement 

SELECT target.cost_basis_method target_cost_basis,
        src.cost_basis_method source_cost_basis
   FROM pace_masterdbo.cost_basis@datahubd src,
        martmgr.portfolio_holdings target
 WHERE  src.account_number = target.account_number
   AND  src.cusip                  = target.cusip
   AND  src.effective_date = target.effective_date
   AND  src.source                 = target.source
   AND  src.source                 = 'NFS'
/

The table "SOURCE" has around 362404 rows and "TARGET" has around 719691 rows. The select returns 360741 rows.

Here is the Execution plan and autotrace output

SQL> @/tmp/x
 
360741 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6264 Card=2 Bytes=15
          2)
 
   1    0   HASH JOIN (Cost=6264 Card=2 Bytes=152)
   2    1     REMOTE* (Cost=32 Card=362404 Bytes=13771352)             DATAHUBD
                                                                       .IMG.BAN
                                                                       KONE.NET
 
   3    1     TABLE ACCESS (FULL) OF 'PORTFOLIO_HOLDINGS' (Cost=5341 C
          ard=809502 Bytes=30761076)
 
 
 
   2 SERIAL_FROM_REMOTE            SELECT "ACCOUNT_NUMBER","EFFECTIVE_DATE","SO
                                   URCE","CUSIP","COST_BASIS_METHOD" FR
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      87633  consistent gets
      82146  physical reads
          0  redo size
   10434762  bytes sent via SQL*Net to client
    1948332  bytes received via SQL*Net from client
      24051  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     360741  rows processed

Thanks

February 24, 2005 - 9:57 am UTC

1) judgement call on your part. does the amount of work seem reasonable (this is one of the unanswerable questions...)

2) what are you waiting for (what is your query waiting for)? search for 10046 and you'll see lots of examples.

3) well, all you show is the select, not the join update. and remember, an update of course runs slower than just the select component.

compare the plans, use explain plan and see if anything is different.

RE : Update Statement

A reader, February 24, 2005 - 10:59 am UTC

Tom,

Here is the join-update statement I used :

UPDATE
(
 SELECT target.cost_basis_method target_cost_basis,
        src.cost_basis_method source_cost_basis
   FROM pace_masterdbo.cost_basis@datahubd src,
        martmgr.portfolio_holdings target
 WHERE  src.account_number = target.account_number
   AND  src.cusip                  = target.cusip
   AND  src.effective_date = target.effective_date
   AND  src.source                 = target.source
   AND  src.source                 = 'NFS'
)
SET target_cost_basis = source_cost_basis
/

The table "SOURCE" has a primary key on account_number,cusip, effective_date and source (required for join update)

The following is the explain plan for UPDATE :

SQL> @/tmp/x
 
360741 rows updated.
 
 
Execution Plan
----------------------------------------------------------
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=6264 Card=2 Bytes=15
          2)
 
   1    0   UPDATE OF 'PORTFOLIO_HOLDINGS'
   2    1     HASH JOIN (Cost=6264 Card=2 Bytes=152)
   3    2       REMOTE* (Cost=32 Card=362404 Bytes=13771352)           DATAHUBD
                                                                       .IMG.BAN
                                                                       KONE.NET
 
   4    2       TABLE ACCESS (FULL) OF 'PORTFOLIO_HOLDINGS' (Cost=5341
           Card=809502 Bytes=30761076)
 
 
 
   3 SERIAL_FROM_REMOTE            SELECT "ACCOUNT_NUMBER","EFFECTIVE_DATE","SO
                                   URCE","CUSIP","COST_BASIS_METHOD" FR
 
Can you pls. explain what SERIAL_FROM_REMOTE plan is. Under what circumstance can we see that in an EXPLAIN PLAN

February 24, 2005 - 4:59 pm UTC

serial from remote -- ran a query, get the results - query is there.

so, what does the tkprof (using a 10046 level 12 trace) show us? waits for what? IO's? and so on

Which index to use?

Sujit, February 25, 2005 - 6:02 am UTC

Hi,

I have a table with around 1 million rows and there is a column in this table with around 2000 distinct values. There is a bitmap index created on this column and we are concerned whether it is correct to ok bitmap index in this case. The original desingers of the system are not with us now to answer the logic behind creating bitmap index on this column.

Could you please advise what could be the best solution in this case. Whether we can leave it as it as or should we go on changing it to b-tree index.

thanks,
Sujit

February 25, 2005 - 5:16 pm UTC

insufficient data to answer, but I suggest you read the three bitmap articles by Jonathan Lewis:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:32633204077527 <code>

A reader, February 25, 2005 - 2:17 pm UTC

It all depends on the type of query, you plan to run against this table. If yours is a datawarehouse and you want star transformation, you will probably need to keep the bitmap index. Even otherwise if you run queries and use 'AND' 'OR' in the where clause, bitmap indexes help.

In addition to the performance benefit, it is faster to create the bitmap index and it occupies less storage space when compared to btree index.

On the downside,DML operations will slow down if the table has bitmap indexes(rather than bree indexes).

FTS Ratio

Yogesh, March 02, 2005 - 12:10 pm UTC

I have two tables MASTER & DETAIL, with following structures

TABLE_NAME : MASTER
CLUSTER_NAME :
IOT_NAME :
PCT_FREE : 10
PCT_USED : 40
INI_TRANS : 1
MAX_TRANS : 255
INITIAL_EXTENT : 1048576
NEXT_EXTENT : 10485760
MIN_EXTENTS : 1
MAX_EXTENTS : 2147483645
PCT_INCREASE : 0
FREELISTS : 1
FREELIST_GROUPS : 1
LOGGING : YES
BACKED_UP : N
NUM_ROWS : 8219235
BLOCKS : 164208
EMPTY_BLOCKS : 2820
AVG_SPACE : 2061
CHAIN_CNT : 0
AVG_ROW_LEN : 118
AVG_SPACE_FREELIST_BLOCKS : 6331
NUM_FREELIST_BLOCKS : 8449
DEGREE : 1
INSTANCES : 1
CACHE : N
TABLE_LOCK : ENABLED
SAMPLE_SIZE : 0
LAST_ANALYZED : 02-mar-2005 11:02:43
PARTITIONED : NO
IOT_TYPE :
TEMPORARY : N
NESTED : NO
BUFFER_POOL : DEFAULT

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

TABLE_NAME : DETAIL
CLUSTER_NAME :
IOT_NAME :
PCT_FREE : 10
PCT_USED : 40
INI_TRANS : 1
MAX_TRANS : 255
INITIAL_EXTENT : 1048576
NEXT_EXTENT : 1048576
MIN_EXTENTS : 1
MAX_EXTENTS : 2147483645
PCT_INCREASE : 0
FREELISTS : 1
FREELIST_GROUPS : 1
LOGGING : YES
BACKED_UP : N
NUM_ROWS : 203040
BLOCKS : 1124
EMPTY_BLOCKS : 45
AVG_SPACE : 3058
CHAIN_CNT : 0
AVG_ROW_LEN : 23
AVG_SPACE_FREELIST_BLOCKS : 7342
NUM_FREELIST_BLOCKS : 331
DEGREE : 1
INSTANCES : 1
CACHE : N
TABLE_LOCK : ENABLED
SAMPLE_SIZE : 0
LAST_ANALYZED : 02-mar-2005 10:05:31
PARTITIONED : NO
IOT_TYPE :
TEMPORARY : N
NESTED : NO
BUFFER_POOL : DEFAULT

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Following are the index details on these two tables

INDEX_NAME : MASTER_PRIM
INDEX_TYPE : NORMAL
TABLE_NAME : MASTER
TABLE_TYPE : TABLE
UNIQUENESS : UNIQUE
INI_TRANS : 2
MAX_TRANS : 255
INITIAL_EXTENT : 1048576
NEXT_EXTENT : 41943040
MIN_EXTENTS : 1
MAX_EXTENTS : 2147483645
PCT_INCREASE : 0
PCT_THRESHOLD :
INCLUDE_COLUMN :
FREELISTS : 1
FREELIST_GROUPS : 1
PCT_FREE : 10
LOGGING : YES
BLEVEL : 2
LEAF_BLOCKS : 28350
DISTINCT_KEYS : 8219235
AVG_LEAF_BLOCKS_PER_KEY : 1
AVG_DATA_BLOCKS_PER_KEY : 1
CLUSTERING_FACTOR : 7277038
STATUS : VALID
NUM_ROWS : 8219235
SAMPLE_SIZE : 0
LAST_ANALYZED : 02-mar-2005 11:02:43
DEGREE : 1
INSTANCES : 1
PARTITIONED : NO
TEMPORARY : N
GENERATED : N
BUFFER_POOL : DEFAULT

INDEX_NAME : DETAIL_INDEX
INDEX_TYPE : NORMAL
TABLE_NAME : DETAIL
TABLE_TYPE : TABLE
UNIQUENESS : NONUNIQUE
TABLESPACE_NAME : USERS
INI_TRANS : 2
MAX_TRANS : 255
INITIAL_EXTENT : 1048576
NEXT_EXTENT : 1048576
MIN_EXTENTS : 1
MAX_EXTENTS : 2147483645
PCT_INCREASE : 0
PCT_THRESHOLD :
INCLUDE_COLUMN :
FREELISTS : 1
FREELIST_GROUPS : 1
PCT_FREE : 10
LOGGING : YES
BLEVEL :
LEAF_BLOCKS :
DISTINCT_KEYS :
AVG_LEAF_BLOCKS_PER_KEY :
AVG_DATA_BLOCKS_PER_KEY :
CLUSTERING_FACTOR :
STATUS : VALID
NUM_ROWS :
SAMPLE_SIZE :
LAST_ANALYZED :
DEGREE : 1
INSTANCES : 1
PARTITIONED : NO
TEMPORARY : N
GENERATED : N
BUFFER_POOL : DEFAULT

Following is the column definition

MASTER

COLUMN_NAME : CSEQNO
DATA_TYPE : NUMBER
DATA_PRECISION : 38
NULLABLE : N
NUM_DISTINCT : 7387429
-----------------
COLUMN_NAME : ANO
DATA_TYPE : NUMBER
DATA_PRECISION : 3
NULLABLE : N
NUM_DISTINCT : 76

DETAIL

COLUMN_NAME : CSEQNO
DATA_TYPE : NUMBER
DATA_PRECISION : 38
NULLABLE : Y
NUM_DISTINCT : 132052
-----------------
COLUMN_NAME : ANO
DATA_TYPE : NUMBER
DATA_PRECISION : 3
NULLABLE : Y
NUM_DISTINCT : 39

When I use following query

SELECT a.* FROM MASTER a, DETAIL b WHERE a.CSEQNO = b.CSEQNO AND a.ANO = b.ANO
/

It is doing FTS for MASTER table. Following is the explain plan

SELECT STATEMENT Optimizer Mode=CHOOSE 203 K 99302
HASH JOIN 203 K 27 M 99302
TABLE ACCESS FULL DETAIL 203 K 4 M 26
TABLE ACCESS FULL MASTER 8 M 924 M 15793

DETAIL table is 1/40 of MASTER table. On what basis optimizer decides to make FTS? If I use another DETAIL table with 5000 rows, it is using the MASTER index.

March 02, 2005 - 12:32 pm UTC

please - use this for a followup to the existing question, not for new stuff.

and thank goodness it did a hash join, anything else would be really bad.

think about it. do you have any predicates that could use an index efficiently to answer that query.

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894#6760861174154 <code>

Absence of statistics

Keval Shah, March 13, 2005 - 2:34 am UTC

If the object is not analyzed then will CBO always use Full Table Scan?

drop table t;
create table t ( n number primary key, c char ) ;

the mentioned below query did full table scan
select n from t;

Plan
0 (null) SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=13)
1 0 _TABLE ACCESS (FULL) Of T (TABLE) (Cost=2 Card=1 Bytes=13)

but after analyzing the table plan changed to fetch values from index

Plan
0 (null) SELECT STATEMENT Optimizer=ALL_ROWS (Cost=0 Card=1 Bytes=13)
1 0 _INDEX (FULL SCAN) Of SYS_C00325103 (INDEX (UNIQUE))

there is no data in the table but under no circumstance full table scan will be better then "cover query" for this example. Do CBO always need statistics to make any decision?

btw I tried this on online HTML DB running 10g. I am sorry if this question is already answered.

March 13, 2005 - 9:48 am UTC

no it will not "always".  


but in this case -- the table T would most likely be smaller than an index on N!

And don't forget, in 10g -- dynamic sampling will kick in.

ops$tkyte@ORA10G> create table t ( n number primary key, c char );
 
Table created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set autotrace traceonly explain
ops$tkyte@ORA10G> select n from t before_insert;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=13)
   1    0   TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=13)
 
 
 
ops$tkyte@ORA10G> set autotrace off
ops$tkyte@ORA10G> insert into t select rownum, 'x' from all_objects;
 
48248 rows created.
 
ops$tkyte@ORA10G> set autotrace traceonly explain
ops$tkyte@ORA10G> select n from t after_insert;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=21 Card=50232 Bytes=653016)
   1    0   TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=21 Card=50232 Bytes=653016)
 
 
 
ops$tkyte@ORA10G> set autotrace off
ops$tkyte@ORA10G>


<b>so, even when there are no stats - there are in fact stats! And when it makes sense (when the table is bigger than the resulting index) it'll use indexes as appropriate:</b>


ops$tkyte@ORA10G> create table t as select rownum n, a.* from all_objects a where 1=0;
 
Table created.
 
ops$tkyte@ORA10G> alter table t add constraint t_pk primary key(n);
 
Table altered.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> set autotrace traceonly explain
ops$tkyte@ORA10G> select n from t before_insert;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=13)
   1    0   TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=2 Card=1 Bytes=13)
 
 
 
ops$tkyte@ORA10G> set autotrace off
ops$tkyte@ORA10G> insert into t select rownum, a.* from all_objects a;
 
48252 rows created.
 
ops$tkyte@ORA10G> set autotrace traceonly explain
ops$tkyte@ORA10G> select n from t after_insert;
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=22 Card=49819 Bytes=647647)
   1    0   INDEX (FAST FULL SCAN) OF 'T_PK' (INDEX (UNIQUE)) (Cost=22 Card=49819 Bytes=647647)
 
 
 
ops$tkyte@ORA10G> set autotrace off

Eating back my words

Keval Shah, March 13, 2005 - 3:26 am UTC

Ok, I apologize, I am backing off from my words "under no circumstance full table scan will be better than cover query".

I was thinking of possibilities when FTS will be better than index scan for this example and found 1 possibility when there is heavy DML on the table. Very heavy DML may result in index block to split causing IO of atleast 3 data blocks [ 1 root and 2 leaves ] via index fetch but the table data is in just one block.

I am sure this can be proved. Say for eg. Data block of table can store upto 100 rows and say index can store upto 200 rows. And if we insert records with id from 1 to 100 and then suppose we update
id = id + 100 where id between 11 and 90;
Such multiple updates will take us to the desired result.

I am not proving it with actual numbers but can definitely be done by above mentioned way. Hope in such incident CBO will fetch values from table rather than from index after analyze.

March 13, 2005 - 9:56 am UTC

I would hope the index would "survive" such modifications rather unscathed actually. Index space is most certainly reused over time.

Vodoo magic :-)

hrishy, April 30, 2005 - 10:30 am UTC

Hi Tom

In the peoplesoft query above

select VAT_TRANS_AMT from PS_PENDING_ITEM where
BUSINESS_UNIT = 'LU001'
and ITEM = 'LUL0100000011'
and ITEM_LINE = 1;

you said that the index is not being used because of cust_id as the column sticking in there.How did you come to this conclusion.
Index PS_PENDING_ITEM
(UNIQUE)(GROUP_BU,GROUP_ID,BUSINESS_UNIT,CUST_ID,ITEM,ITEM_LINE,
GROUP_SEQ_NUM)

April 30, 2005 - 11:08 am UTC

...
it appears via testing that it is the cust_id stuck in the middle that is
precluding any sort of skip scan -- if cust_id were moved last or removed, it
skip scans naturally.
......

I set up tests, loaded data, simulated the environment.

RULE based is faster then COST based

fftbng@onsemi.com, May 05, 2005 - 11:58 am UTC

Hi Tom,

   I have a query which return more than 25% of the rows from table. From what I understand, FTS is a better choice compare to using Index. However, the query seems like faster to return the result when we use RULE hint as it use INDEX Full Scan. We take out the RULE hint, it use FULL TABLE SCAN and the respond is slower than RULE. Attached is the explain plan for your advice. I have gathered the schema before executing the Query. Do we need to tune the query or it is behave in this manner ? Thanks in advance.

  1  SELECT /*+ rule */ M.ONPN, M.PTI_CD, M.PART_DESC, M.PART_STATUS, M.PACKAGE
CD,
  2  (SELECT MAX(P.EFFECTIVE_DT) FROM GPS_owner.PRICE P where M.ONPN=P.ONPN) AS
EFFECTIVE_DT
  3  FROM GPS_OWNER.GPS_MATRIX M
  4  WHERE PART_STATUS = '4-MATURE 2'
  5* ORDER BY M.ONPN
SQL> /

119503 rows selected.

Elapsed: 00:00:18.26

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=HINT: RULE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'GPS_MATRIX'
   2    1     INDEX (FULL SCAN) OF 'GPS_MATRIX_PK' (UNIQUE)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     119503  rows processed

SQL> SELECT  M.ONPN, M.PTI_CD, M.PART_DESC, M.PART_STATUS, M.PACKAGE_CD,
(SELECT MAX(P.EFFECTIVE_DT) FROM GPS_owner.PRICE P where M.ONPN=P.ONPN) AS EFFE
TIVE_DT
FROM GPS_OWNER.GPS_MATRIX M
WHERE PART_STATUS = '4-MATURE 2'
ORDER BY M.ONPN
  2    3    4    5    6  /

119503 rows selected.

Elapsed: 00:00:22.73

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=641 Card=17109 Bytes
          =752796)

   1    0   SORT (ORDER BY) (Cost=641 Card=17109 Bytes=752796)
   2    1     TABLE ACCESS (FULL) OF 'GPS_MATRIX' (Cost=247 Card=17109
           Bytes=752796)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
          0  bytes sent via SQL*Net to client
          0  bytes received via SQL*Net from client
          0  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     119503  rows processed

SQL>

May 05, 2005 - 1:00 pm UTC

please do not run tests as SYS, you miss so much and SYS is special, like magical

we are missing all of the information.

OK

Raju, May 27, 2005 - 12:39 pm UTC

Hi Tom,
i)For an Index to be used,Is it mandatory to have a WHERE clause
in a Query??
ii)What are the various ways of suppressing the usage of an index??
Please do reply.
BYE!

May 27, 2005 - 12:49 pm UTC

i) nope....
ii) don't create it.

what is the goal in asking ii?

Alex, June 08, 2005 - 12:06 pm UTC

Tom,

I have a query that uses a temporary table. I have a feeling the optimizer is coming up with a sub-optimal plan because the temp table isn't filled in until run time. Using the ordered hint makes it run much faster. Do you have any suggestions for a problem like this? Is a hint ok in this case? Here are the two plans:

SELECT /*+ ordered */
jt.job
FROM (SELECT DISTINCT ID
FROM temp_table) tt, job_table jt, titles t, salary s
WHERE jt.ID = tt.ID
AND jt.OID = t.OID(+)
AND jt.NAME = s.NAME(+)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1198507 Card=14358
1106 Bytes=45084467284)

1 0 HASH JOIN (OUTER) (Cost=1198507 Card=143581106 Bytes=45084
467284)

2 1 HASH JOIN (OUTER) (Cost=557026 Card=143581106 Bytes=4048
9871892)

3 2 NESTED LOOPS (Cost=2501 Card=143581106 Bytes=346030465
46)

4 3 VIEW (Cost=51 Card=8168 Bytes=106184)
5 4 SORT (UNIQUE) (Cost=51 Card=8168 Bytes=106184)
6 5 TABLE ACCESS (FULL) OF 'TEMP_TABLE' (C
ost=16 Card=8168 Bytes=106184)

7 3 TABLE ACCESS (BY INDEX ROWID) OF 'JOB_TABLE
' (Cost=1 Card=17578 Bytes=4007784)

8 7 INDEX (UNIQUE SCAN) OF 'JOB_TABLE#PK' (UN
IQUE)

9 2 TABLE ACCESS (FULL) OF 'TITLES' (Cost=124 Card=56905 Byte
s=2333105)

10 1 TABLE ACCESS (FULL) OF 'SALARY' (Cost=9 Card
=3248 Bytes=103936)

SELECT jt.job
FROM (SELECT DISTINCT ID
FROM temp_table) tt, job_table jt, titles t, salary s
WHERE jt.ID = tt.ID
AND jt.OID = t.OID(+)
AND jt.NAME = s.NAME(+)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=26666 Card=1435748
17 Bytes=47092539976)

1 0 HASH JOIN (Cost=26666 Card=143574817 Bytes=47092539976)
2 1 VIEW (Cost=51 Card=8168 Bytes=106184)
3 2 SORT (UNIQUE) (Cost=51 Card=8168 Bytes=106184)
4 3 TABLE ACCESS (FULL) OF 'TEMP_TABLE' (Cost=
16 Card=8168 Bytes=106184)

5 1 HASH JOIN (OUTER) (Cost=24852 Card=1757772 Bytes=5536981
80)

6 5 HASH JOIN (OUTER) (Cost=17058 Card=1757772 Bytes=47987
1756)

7 6 TABLE ACCESS (FULL) OF 'JOB_TABLE' (Cost=10
316 Card=1757772 Bytes=421865280)

8 6 TABLE ACCESS (FULL) OF 'SALARY' (Cost=9
Card=3248 Bytes=107184)

9 5 TABLE ACCESS (FULL) OF 'TITLES' (Cost=124 Card=57325 Byte
s=2407650)

June 08, 2005 - 1:07 pm UTC

see
</code> https://www.oracle.com/technetwork/issue-archive/2014/14-sep/o54asktom-2279184.html <code>
query plans with temporary tables...

SELECT /*+ ordered */
jt.job
FROM (SELECT DISTINCT ID
FROM temp_table) tt, job_table jt, titles t, salary s
WHERE jt.ID = tt.ID
AND jt.OID = t.OID(+)
AND jt.NAME = s.NAME(+)

should be:

where jt.id in ( select id from temp_table)

and did you really need to join to T and S? You select nothing from them.

sachin

A reader, June 08, 2005 - 1:10 pm UTC

Hi Tom

i have not been able to understand this and all that is written related to this defferrable constraint in your book "beginning oracle programming"..could you please make this more clear

....."fortunately it is possible to take control of the constraints so that they are enforsed with non unique indexes that dont just vanish.the trick is to make use of the idea of deferrable constraints.

create table inventory(
partno number(4) contraint partno_pk primary key deferrable initally immediate
partdesc varchar(3) constraint partdesc_uq unique deferrable initialy immediate)

"if the table must permit violating records to reside within it,we cant have a unique index lurking in the backgroung,because its awareness of its own uniquiness being voilated would cause it to want to reject the records as they are inserted".............

Thanks
Sachin

June 08, 2005 - 1:16 pm UTC

which part wasn't clear?

if you want to have a deferrable constraint that happens to be a UNIQUE constraint that allows for duplicates temporarily, we have to use a NON-UNIQUE index to enforce the constraint, not the typical UNIQUE index we would normally.

sachin

A reader, June 08, 2005 - 1:25 pm UTC

Please explain this with a little example to make it more clear.. the whole deferral constraint part...it is either dificult or confusing..

June 08, 2005 - 1:48 pm UTC

are you asking "what is a deferred constraint"?

Alex, June 08, 2005 - 1:33 pm UTC

Excellent thank you. One other thing, is it safe to use the first_rows hint you have with dynamic_sampling? Suppose that temp table grows huge, you wouldn't want first_rows right?

June 08, 2005 - 1:50 pm UTC

you use first_rows when you want first rows first (fastest)

it is "information" to the optimizer (Hey, I have an end user here, please get first rows as fast as possible)

A reader, June 08, 2005 - 1:55 pm UTC

i understand what is deferrable constraints but somehow coudnt understand it in the praticle example given in the book and the WHOLE of this thing line by line please:::::

........"fortunately it is possible to take control of the constraints so that they
are enforsed with non unique indexes that dont just vanish.the trick is to make
use of the idea of deferrable constraints.

create table inventory(
partno number(4) contraint partno_pk primary key deferrable initally immediate
partdesc varchar(3) constraint partdesc_uq unique deferrable initialy immediate)

"if the table must permit violating records to reside within it,we cant have a
unique index lurking in the backgroung,because its awareness of its own
uniquiness being voilated would cause it to want to reject the records as they
are inserted".............

June 08, 2005 - 4:53 pm UTC

that is just saying "look, if we create a deferrable unique or primary key, we'll create a NON-UNIQUE index to go with it. That is instead of the UNIUQE index that would normally be created otherwise"

IF unique|primary key is DEFERRABLE
then
it'll use a NON-UNIQUE index
else
it'll use a UNIQUE index
end if

Alex, June 08, 2005 - 2:00 pm UTC

Oh ok so in this case,

*+ first_rows
2 dynamic_sampling(temp_create_awp_smrs 2) */

Is saying, "get us the table information the fastest" not the actual rows is that correct?

June 08, 2005 - 4:54 pm UTC

no, that is saying:

please use dynamic sampling if needed to understand temp_create_awp_smrs. After you do that, optimize this query to return the rows as soon as possible -- optimize for query initial response time, NOT total throughput.

sachin

A reader, June 08, 2005 - 4:57 pm UTC

thanks alot:-)

Alex, June 09, 2005 - 10:23 am UTC

Tom,

If my optimizer_feature_enable parameter is set to 8.1.7, and I try to use dynamic_sampling, will it work?

index not getting used when using union all

A reader, December 14, 2005 - 5:00 pm UTC

Hi Tom,

I have two similar tables from two different schema's A and B (> 100000 rows) and had a view C crated on these tables like:
create view C as
select col1,clo2,....from A
union all
select col1,col2,....from B;

If I see the explain plan it always goes for full table scan even though I have some indexes available on the base tables.

-----------------------------------------------------------
| Id | Operation |Name| Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 7949K| 1155M| 12048 |
| 1 | UNION-ALL | | | | |
| 2 | TABLE ACCESS FULL | A | 7566K| 1104M| 11444 |
| 3 | TABLE ACCESS FULL | B | 382K| 51M| 604 |
-----------------------------------------------------------

we want all the data so no where condition in view definition.All tables and indexes are analyzed and using choose optimizer

How can i make use of indexes other than using hints?

December 14, 2005 - 8:27 pm UTC

you'd need to provide "an example"

but, in the plan above, looks right to me - if A and B are close to 7.5million and 400thousand rows (or those are large percentages of their size)

A reader, March 29, 2006 - 11:18 am UTC

Hi Tom,

I have a procedure which generates a query like the following and when I run it, it takes more than one min. Can you shed some light on how this is to be handled ?

I did a

set autotrace traceonly explain

SELECT a.*, rownum FROM ( SELECT /*+ FIRST_ROWS */ a.id , fn , mn, ln, ssn, dob,
c.addr , c.city, c.zip, d.addr addr1, d.city city1, d.zip zip1
FROM t1 a, t2 c, t3 d
WHERE 1 = 1 and flag != 1
and a.id = c.id (+) and a.id = d.id (+) and c.flag1 = 'Y'
and col1 = 0 and col2 = 0 and col3 = 0 and col4 = 0
and rownum < 101 ) a
/
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=4128 Card=
100 Bytes=121800)

1 0 COUNT
2 1 VIEW (Cost=4128 Card=100 Bytes=121800)
3 2 COUNT (STOPKEY)
4 3 TABLE ACCESS (BY INDEX ROWID) OF 'T2' (TABLE)
(Cost=3 Card=1 Bytes=48)

5 4 NESTED LOOPS (Cost=4128 Card=151 Bytes=26878)
6 5 HASH JOIN (OUTER) (Cost=3788 Card=113 Bytes=1469
0)

7 6 TABLE ACCESS (FULL) OF 'T1' (TABLE) (Cost=31
46 Card=113 Bytes=10396)

8 6 TABLE ACCESS (FULL) OF 'T3' (TABLE) (Co
st=635 Card=435355 Bytes=16543490)

9 5 INDEX (RANGE SCAN) OF 'IDX_T2_ID' (INDEX
) (Cost=2 Card=2)

Please help.

Thanks.

March 29, 2006 - 11:38 am UTC

...
and a.id = c.id (+) and a.id = d.id (+) and c.flag1 = 'Y'
and col1 = 0 and col2 = 0 and col3 = 0 and col4 = 0
and rownum < 101 ) a
......

so, if you outer join to C, could c.flag1 be 'Y'???? think about that.

no tables.
no indexing scheme
nothing much more to say....

don't even know where col1, col2, ... come from.

Not using the right index.

Nitin Naik, June 13, 2006 - 4:16 pm UTC

I have a XYZ non-partitioned table with 70 million rows.

The PK XYZ_PK is on col A, B, EFF_DT

additional index XYZ_IX1 is on col B,C

We recently analyzed (as part of scheduled analyze) this table and now the following query uses the XYZ_IX1 instead of XYZ_PK and hence runs slower.

I am suspecting that it was using PK before the analyze.

SELECT COUNT(*)
into v_count
FROM XYZ
WHERE A = v_a
AND B = v_b
AND C = v_c
AND NVL(d, 'X') = NVL(v_d, 'X')
AND EXP_DT = v_exp_dt;

Excerpt from tkprof trace output

SELECT COUNT(*)
FROM
 XYZ WHERE A = :B5 AND B = :B4 AND C =
  :B3 AND NVL(D, 'X') = NVL(:B2 , 'X') AND EXP_DT = :B1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.09       0.08        444        450          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.09       0.09        444        450          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 18  (XXX)   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
      1   TABLE ACCESS BY INDEX ROWID XYZ
   1990    INDEX RANGE SCAN XYZ_IX1 (object id 120441)


I have added the HINT for PK and its faster 

SELECT /*+ INDEX(XYZ XYZ_PK) */ COUNT(*)
FROM
 XYZ WHERE A = :B5 AND B = :B4 AND C =
  :B3 AND NVL(D, 'X') = NVL(:B2 , 'X') AND EXP_DT = :B1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          3          6          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          3          6          0           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 18  (XXX)   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
      1   TABLE ACCESS BY INDEX ROWID XYZ
      2    INDEX RANGE SCAN XYZ_PK (object id 29940)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: CHOOSE
      1   SORT (AGGREGATE)
      1    TABLE ACCESS   GOAL: ANALYZED (BY INDEX ROWID) OF
               'XYZ'
      2     INDEX   GOAL: ANALYZED (RANGE SCAN) OF 'XYZ_PK'
                (UNIQUE)

Is it because of clustering factor?? It is picking index with lower clustering factor for RANGE SCAN

SQL> l
  1  select index_name, clustering_factor from dba_indexes
  2* where table_name = 'XYZ'
SQL> /

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
XYZ_PK                                  66858948
XYZ_IX1                                 11168612

I am using 

analyze table XYZ estimate statistics sample 40 percent;

How can I make the query use PK without the HINT?

Thanks in advance.

June 13, 2006 - 5:06 pm UTC

can we have autotrace traceonly explains of that?

(use to_number() and to_date() about the binds with autotrace)

RE: Not using the right index.

Nitin Naik, June 14, 2006 - 10:23 am UTC

Here's the autotrace you asked.

SQL> set autotrace traceonly
SQL> SELECT COUNT(*)
  2  FROM XYZ
  3  WHERE A = 2400417
  4  AND B = 70
  5  AND C = 70000
  6  AND NVL(D, 'X') = NVL(NULL, 'X')
  7  AND EXP_DT = to_date('12/31/9999','MM/DD/YYYY');


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=20)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'XYZ' (Co
          st=6 Card=1 Bytes=20)

   3    2       INDEX (RANGE SCAN) OF 'XYZ_IX1' (NON-UNI
          QUE) (Cost=4 Card=11)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        449  consistent gets
        449  physical reads
          0  redo size
        299  bytes sent via SQL*Net to client
        364  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> SELECT /*+ INDEX(XYZ XYZ_PK) */ COUNT(*)
  2  FROM XYZ
  3  WHERE A = 2400417
  4  AND B = 70
  5  AND C = 70000
  6  AND NVL(D, 'X') = NVL(NULL, 'X')
  7  AND EXP_DT = to_date('12/31/9999','MM/DD/YYYY');


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=20)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'XYZ' (Co
          st=6 Card=1 Bytes=20)

   3    2       INDEX (RANGE SCAN) OF 'XYZ_PK' (UNIQUE) (Co
          st=4 Card=1)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          6  physical reads
          0  redo size
        299  bytes sent via SQL*Net to client
        364  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

I also did 10053 trace and found the 2 costs (i.e with IX1 and PK) equal.

June 14, 2006 - 12:39 pm UTC

are you using 9999 in real life?

Not using the right index

Nitin Naik, June 14, 2006 - 2:23 pm UTC

Followup:
are you using 9999 in real life?

Yes this date is used for records that are not expired (i.e. current)

June 15, 2006 - 8:16 am UTC

ugh (you need Jonathan Lewis' book on the CBO - you done gone and broke it with that "technique")

so, see the card= value there, the low one on the index that really returns thousands of records? you did that with the 9999 "trick", the optimizer cannot figure out what the true set of values are. It is confused.

Not using the right index

Nitin Naik, June 15, 2006 - 3:19 pm UTC

I'll buy the book.

In the meantime..

3 2 INDEX (RANGE SCAN) OF 'XYZ_IX1' (NON-UNI
QUE) (Cost=4 Card=11) *** This one returns 1000s of records

3 2 INDEX (RANGE SCAN) OF 'XYZ_PK' (UNIQUE) (Co
st=4 Card=1)

So if the lower card is better why doesnt CBO pick '_PK' plan on its own without the hint?

The exp_dt=9999 has nothing to do with this. Whether I use that predicate in the query or not it still uses 'IX1' when '_PK' is better. Also Note that '_PK' is on A, B and EFF_DT (not EXP_DT).

June 16, 2006 - 6:35 pm UTC

because the costs "tied", it didn't matter.

it is your use of that fake number that really skewed the data and messes up the estimation.

exp_dt = 9999 HAS EVERY SINGLE THING UNDER THE SUN to do with this.  Everything.


Allow me to repeat, your use of this fake date - it is the root cause.  It messed up the cardinality estimate, entirely.  Such a bad idea.  Such a common idea (irrational FEAR of null).  Such a horrible bad idea.

consider:

ops$tkyte@ORA10GR2> create table t
  2  as
  3  select object_id,
  4         decode( mod(rownum,10), 0, to_date(null), created ) date1,
  5         decode( mod(rownum,10), 0, to_date('01-jan-9999'), created ) date2
  6    from all_objects;

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select count(*),
  2         count( case when date1 is null then 1 end ) cnt1,
  3         count( case when date2 = to_date('01-jan-9999') then 1 end ) cnt2
  4    from t;

  COUNT(*)       CNT1       CNT2
---------- ---------- ----------
     49993       4999       4999

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> select * from t where date1 is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  4899 | 97980 |    63   (7)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |  4899 | 97980 |    63   (7)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DATE1" IS NULL)

ops$tkyte@ORA10GR2> select * from t where date2 = to_date('01-jan-9999');

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    63 |  1260 |    63   (7)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |    63 |  1260 |    63   (7)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DATE2"=TO_DATE('9999-01-01 00:00:00', 'yyyy-mm-dd
              hh24:mi:ss'))

<b>whoops, 63 is far away from 5000</b>

query to get sessions which do full table scan of big tables.

Sean, June 15, 2006 - 5:51 pm UTC

Hi Tom,

We have a few tables with millions of rows and we would like to know when a session issue a query or dml which does full table scan on these tables. I would like to have a query which will tell me the session id when such things happen, such as

Select session_id from such_dictionary_view where table_name in (a few big tables);

Thanks.

June 16, 2006 - 6:56 pm UTC

query v$session_longops

if they are full scanning that big table right then and there, they'll appear there.



ops$tkyte@ORA10GR2> exec print_table('select * from v$session_longops where time_remaining > 0' );
.SID                          : 155
.SERIAL#                      : 14
.OPNAME                       : Table Scan
.TARGET                       : BIG_TABLE.BIG_TABLE
.TARGET_DESC                  :
.SOFAR                        : 18739
.TOTALWORK                    : 146312
.UNITS                        : Blocks
.START_TIME                   : 16-jun-2006 18:51:32
.LAST_UPDATE_TIME             : 16-jun-2006 18:51:59
.TIMESTAMP                    :
.TIME_REMAINING               : 184
.ELAPSED_SECONDS              : 27
.CONTEXT                      : 0
.MESSAGE                      : Table Scan:  BIG_TABLE.BIG_TABLE: 18739 out of
146312 Blocks done
.USERNAME                     : BIG_TABLE
.SQL_ADDRESS                  : 3FF651A8
.SQL_HASH_VALUE               : 283889156
.SQL_ID                       : 4a50fhh8frmh4
.QCSID                        : 0
-----------------

PL/SQL procedure successfully completed.

Not using the right index

Nitin Naik, June 20, 2006 - 1:44 pm UTC

RE Followup:  

exp_dt = 9999 HAS EVERY SINGLE THING UNDER THE SUN to do with this.  Everything.

 =====================

Thanks for that example. However I still didn't get your point. I shouldn't have included exp_dt predicate in that example in the first place. May be it made my problem more muddy for you.

I have made my problem simpler here..
XYZ has following indexes
1. XYZ_PK on A,B,EFF_DT
  note: EFF_DT ranges from 01-jan-00 to today (no 9999s here)
2. XYZ_IX1 on B,C

Forget about exp_dt=9999 ... Its not in the query. The column is not indexed either.

Why does it pick XYZ_IX1 when I don't give the hint? It used to pick XYZ_PK before. We recently analyzed the table as a part of regularly scheduled job. 


SQL> set autotrace traceonly explain
SQL>
SQL> SELECT COUNT(*)
  2  FROM XYZ
  3  WHERE A = 2400417
  4  AND B = 70
  5  AND C = 70000;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'XYZ' (Co
          st=6 Card=1 Bytes=12)

   3    2       INDEX (RANGE SCAN) OF 'XYZ_IX1' (NON-UNI
          QUE) (Cost=4 Card=12)




SQL>
SQL> SELECT /*+ INDEX(XYZ XYZ_PK) */ COUNT(*)
  2  FROM XYZ
  3  WHERE A = 2400417
  4  AND B = 70
  5  AND C = 70000;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=12)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'XYZ' (Co
          st=6 Card=1 Bytes=12)

   3    2       INDEX (RANGE SCAN) OF 'XYZ_PK' (UNIQUE) (Co
          st=4 Card=1)

June 21, 2006 - 9:36 am UTC

it screws up the estimated cardinality. That is the root cause, period. The guess of 12 is "wrong" right -- way way way wrong...

Root cause = wrong cardinality.

Confused.

Gerald Koerkenmeier, June 22, 2006 - 11:52 pm UTC

Tom,

I am having trouble understanding why the optomizer chooses the execution plan it does with the following two queries. The first pulls data from a combination of tables (many are joined in the PORTSVIEW view) and it is extremely fast (.04 seconds). The second is identical except instead of pulling the actual data, it just asks for a count(*), but it is very slow (over 6 minutes) and uses several full table scans. I have included both queries and their execution times and plans.

First query pulling real data:

1 SELECT ports.SITE_HUM_ID,ports.SHELF_HUM_ID,ports.CARD_NAME,ports.CARD_TYPE,ports.CARD_DESCRIPT
2 ports.PATH_DECOMMISSION_DATE,ports.PENDING_PATH_NAME,ports.SEGMENT_NAME,ports.CABLE_NAME,Substr
3 IP_ADDRESS,ports.CARD_IN_SVC_DATE,ports.CARD_SCHED_DATE,ports.DEVICE_ID,ports.SUB_CARD_NAME,por
4 ports.CARD_INST_ID,ports.PORT_HUM_ID,ports.BANDWIDTH,ports.DESCRIPTION,ports.STATUS,ports.CONNE
5 ports.EQUIP_INST_ID,ports.SITE_INST_ID,ports.CIRC_INST_ID,ports.CIRC_PATH_INST_ID,ports.CABLE_I
6 ports.USER_TG_MEM,ports.EFFECT_TG_MEM,ports.PATH_CHG_DATE,ports.NEXT_PATH_INST_ID,ports.A_WIRED
7 ports.Z_WIRED_PORT_INST_ID,ports.PARENT_PORT_INST_ID,ports.PARENT_PORT_CHAN,ports.VIRTUAL_PORT,
8 ports.PARENT_PORT_CHAN_NAME,ports.COST,ports.LAST_MOD_BY,ports.LAST_MOD_TS,ports.BANDWIDTH_INST
9 FROM PORTVIEW ports , EQUIP_INST shelf
10 where EXISTS (select null from equip_domain_map dm
11 where dm.EQUIP_INST_ID = shelf.EQUIP_INST_ID and
12 dm.DOMAIN_INST_ID in (1000,1020,1021,1022,1023,1040,1041,1080,1081,1100,1101,1102,1103,1104,110
13 and ports.SUB_CARD_NAME LIKE '%GI8/16%'
14 and shelf.EQUIP_INST_ID = ports.EQUIP_INST_ID
15* and shelf.DESCR LIKE '%DLLSTXEM06W%'
ets_test@OPWK10> /

SITE_HUM_ID SHELF_HUM_ID
----------------------------------------------------------------------------------------------------
DLLSTXEM DLLSTXEM06W
DLLSTXEM DLLSTXEM06W

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=5550737 Card=722195 Bytes=261434590)
1 0 NESTED LOOPS (Cost=5550737 Card=722195 Bytes=261434590)
2 1 NESTED LOOPS (Cost=4828542 Card=722195 Bytes=239768740)
3 2 FILTER
4 3 NESTED LOOPS (OUTER)
5 4 NESTED LOOPS (Cost=3384152 Card=722195 Bytes=211603135)
6 5 NESTED LOOPS (OUTER) (Cost=2661957 Card=722195 Bytes=186326310)
7 6 NESTED LOOPS (OUTER) (Cost=1939762 Card=722195 Bytes=166827045)
8 7 NESTED LOOPS (OUTER) (Cost=1217567 Card=722195 Bytes=144439000)
9 8 NESTED LOOPS (OUTER) (Cost=495372 Card=722195 Bytes=123495345)
10 9 NESTED LOOPS (Cost=495372 Card=722195 Bytes=77997060)
11 10 NESTED LOOPS (SEMI) (Cost=6447 Card=3075 Bytes=98400)
12 11 INDEX (FULL SCAN) OF 'EQUIP_INST_IDX2' (NON-UNIQUE) (Cost=297 Card=3075 Bytes=73800)
13 11 INLIST ITERATOR
14 13 INDEX (RANGE SCAN) OF 'EQUIP_DOMAIN_MAP_IDX2' (NON-UNIQUE) (Cost=2 Card=272257 Bytes=2178056)
15 10 TABLE ACCESS (BY INDEX ROWID) OF 'EPA' (Cost=159 Card=235 Bytes=17860)
16 15 INDEX (RANGE SCAN) OF 'EQUIP_PORT_IDX' (NON-UNIQUE) (Cost=2 Card=235)
17 9 TABLE ACCESS (BY INDEX ROWID) OF 'CIRC_INST'
18 17 INDEX (UNIQUE SCAN) OF 'PK_CIRC_INST' (UNIQUE)
19 8 TABLE ACCESS (BY INDEX ROWID) OF 'CIRC_PATH_INST' (Cost=1 Card=1 Bytes=29)
20 19 INDEX (UNIQUE SCAN) OF 'PK_CIRC_PATH_INST' (UNIQUE)
21 7 TABLE ACCESS (BY INDEX ROWID) OF 'CIRC_PATH_INST' (Cost=1 Card=1 Bytes=31)
22 21 INDEX (UNIQUE SCAN) OF 'PK_CIRC_PATH_INST' (UNIQUE)
23 6 TABLE ACCESS (BY INDEX ROWID) OF 'CABLE_INST' (Cost=1 Card=1 Bytes=27)
24 23 INDEX (UNIQUE SCAN) OF 'PK_CABLE_INST' (UNIQUE)
25 5 TABLE ACCESS (BY INDEX ROWID) OF 'CARD_INST' (Cost=1 Card=1 Bytes=35)
26 25 INDEX (UNIQUE SCAN) OF 'PK_CARD_INST' (UNIQUE)
27 4 TABLE ACCESS (BY INDEX ROWID) OF 'CARD_INST' (Cost=1 Card=1 Bytes=13)
28 27 INDEX (UNIQUE SCAN) OF 'PK_CARD_INST' (UNIQUE)
29 2 TABLE ACCESS (BY INDEX ROWID) OF 'EQUIP_INST' (Cost=1 Card=1 Bytes=26)
30 29 INDEX (UNIQUE SCAN) OF 'PK_EQUIP_ISNT' (UNIQUE)
31 1 TABLE ACCESS (BY INDEX ROWID) OF 'SITE_INST' (Cost=1 Card=1 Bytes=30)
32 31 INDEX (UNIQUE SCAN) OF 'PK_SITE_INST' (UNIQUE)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
12853 consistent gets
0 physical reads
0 redo size
1811 bytes sent via SQL*Net to client
275 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

Second query with count(*):

SELECT count(*)
FROM PORTVIEW ports , EQUIP_INST shelf
where EXISTS (select null from equip_domain_map dm
where dm.EQUIP_INST_ID = shelf.EQUIP_INST_ID and
dm.DOMAIN_INST_ID in (1000,1020,1021,1022,1023,1040,1041,1080,1081,1100,1101,1102,1103,1104,1105))
and ports.SUB_CARD_NAME LIKE '%GI8/16%'
and shelf.EQUIP_INST_ID = ports.EQUIP_INST_ID
and shelf.DESCR LIKE '%DLLSTXEM06W%';
COUNT(*)
----------
2

Elapsed: 00:06:05.02

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=28915 Card=1 Bytes=134)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=28915 Card=722195 Bytes=96774130)
3 2 NESTED LOOPS (Cost=28915 Card=722195 Bytes=93163155)
4 3 FILTER
5 4 HASH JOIN (OUTER)
6 5 HASH JOIN (Cost=27298 Card=722195 Bytes=80163645)
7 6 TABLE ACCESS (FULL) OF 'CARD_INST' (Cost=690 Card=305329 Bytes=5190593)
8 6 NESTED LOOPS (OUTER) (Cost=25310 Card=722195 Bytes=67886330)
9 8 NESTED LOOPS (OUTER) (Cost=25310 Card=722195 Bytes=58497795)
10 9 NESTED LOOPS (OUTER) (Cost=25310 Card=722195 Bytes=54886820)
11 10 NESTED LOOPS (OUTER) (Cost=25310 Card=722195 Bytes=51275845)
12 11 HASH JOIN (Cost=25310 Card=722195 Bytes=47664870)
13 12 HASH JOIN (SEMI) (Cost=146 Card=3075 Bytes=98400)
14 13 INDEX (FAST FULL SCAN) OF 'EQUIP_INST_IDX2' (NON-UNIQUE) (Cost=46 Card=3075 Bytes=73800)
15 13 TABLE ACCESS (FULL) OF 'EQUIP_DOMAIN_MAP' (Cost=89 Card=272257 Bytes=2178056)
16 12 TABLE ACCESS (FULL) OF 'EPA' (Cost=23961 Card=12533587 Bytes=426141958)
17 11 INDEX (UNIQUE SCAN) OF 'PK_CIRC_PATH_INST' (UNIQUE)
18 10 INDEX (UNIQUE SCAN) OF 'PK_CIRC_PATH_INST' (UNIQUE)
19 9 INDEX (UNIQUE SCAN) OF 'PK_CABLE_INST' (UNIQUE)
20 8 INDEX (UNIQUE SCAN) OF 'PK_CIRC_INST' (UNIQUE)
21 5 INDEX (FAST FULL SCAN) OF 'CARD_INST_IDX3' (NON-UNIQUE) (Cost=151 Card=305329 Bytes=396277)
22 3 INDEX (UNIQUE SCAN) OF 'PK_EQUIP_ISNT' (UNIQUE)
23 2 INDEX (UNIQUE SCAN) OF 'PK_SITE_INST' (UNIQUE)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
164381 consistent gets
156911 physical reads
0 redo size
211 bytes sent via SQL*Net to client
275 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Any ideas?

View Query

Gerald Koerkenmeier, June 23, 2006 - 2:43 pm UTC

I forgot to include the query for the PORTVIEW view and to mention that the tables have accurate stats.

select /*+ first_rows */
p.*,
s.SITE_HUM_ID as SITE_HUM_ID,
e.DESCR as SHELF_HUM_ID,
case
when parent_card.slot is null then c2.slot
else parent_card.slot
end as CARD_NAME,
c2.TYPE as CARD_TYPE,
c2.description AS CARD_DESCRIPTION,
t1.CIRC_PATH_HUM_ID as PATH_NAME,
t1.DECOMMISSION as PATH_DECOMMISSION_DATE,
t2.CIRC_PATH_HUM_ID as PENDING_PATH_NAME,
c.CIRC_HUM_ID as SEGMENT_NAME,
b.CABLE_NAME as CABLE_NAME,
get_port_ip_addr(p.port_inst_id) as IP_ADDRESS,
c.IN_SERVICE as CARD_IN_SVC_DATE,
c.SCHED_DATE as CARD_SCHED_DATE,
e.target_id as DEVICE_ID ,
case
when parent_card.slot is not null then c2.slot
else NULL
end as SUB_CARD_NAME
from epa p
inner join site_inst s on (p.site_inst_id = s.site_inst_id)
inner join equip_inst e on(p.equip_inst_id = e.equip_inst_id)
inner join card_inst c2 on(p.card_inst_id = c2.card_inst_id)
left outer join circ_inst c on (p.circ_inst_id = c.circ_inst_id)
left outer join cable_inst b on(p.cable_inst_id = b.cable_inst_id)
left outer join circ_path_inst t1 on(p.circ_path_inst_id = t1.circ_path_inst_id)
left outer join circ_path_inst t2 on (p.next_path_inst_id = t2.circ_path_inst_id)
left outer join card_inst parent_card on(c2.parent_card_inst_id = parent_card.card_inst_id)

June 24, 2006 - 11:15 am UTC

too big for "a review/followup" - but never never ever compare

select <columns>

to

select count(*) -- instead of columns

does not make sense, I would anticipate ENTIRELY different plans

Getting confused (once again)...

Narendra, August 01, 2006 - 1:24 am UTC

Tom,

It seems this is one thing I keep stumbling upon every time I try it myself. Here is what I have tried:

SQL> create table t as select * from all_objects ;

Table created.

SQL> select count(*) from t ;

  COUNT(*)
----------
     25123

1 row selected.

SQL> select count(distinct object_type) from t ;

COUNT(DISTINCTOBJECT_TYPE)
--------------------------
                        22

1 row selected.

SQL> select count(distinct owner) from t ;

COUNT(DISTINCTOWNER)
--------------------
                  45

1 row selected.

SQL> create index idx_t on t(object_type, owner) ;

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'T', method_opt => 'FOR ALL INDEXED COLUMNS', cascade
=> true) ;

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly
SQL> select * from t where owner = 'WMSYS' ;

63 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=63 Bytes=5922)

   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=12 Card=63 Bytes=5922)




Statistics
----------------------------------------------------------
         42  recursive calls
          0  db block gets
        184  consistent gets
          0  physical reads
          0  redo size
       4702  bytes sent via SQL*Net to client
        303  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         63  rows processed

SQL> select * from t where owner = 'WMSYS' ;

63 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=12 Card=63 Bytes=5922)

   1    0   TABLE ACCESS (FULL) OF 'T' (Cost=12 Card=63 Bytes=5922)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        180  consistent gets
          0  physical reads
          0  redo size
       4704  bytes sent via SQL*Net to client
        303  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         63  rows processed

SQL> select /*+ INDEX(T IDX_T) */ * from t where owner = 'WMSYS' ;

63 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=48 Card=63 Bytes=5922)

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=48 Card=63 Bytes=5922)

   2    1     INDEX (FULL SCAN) OF 'IDX_T' (NON-UNIQUE) (Cost=46 Card=63)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         57  consistent gets
          0  physical reads
          0  redo size
       4661  bytes sent via SQL*Net to client
        303  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         63  rows processed

SQL> set autotrace off
SQL> select clustering_factor from all_indexes where index_name = 'IDX_T' ;

CLUSTERING_FACTOR
-----------------
              408

1 row selected.

SQL> select blocks, num_rows from all_tables where table_name = 'T' ;

    BLOCKS   NUM_ROWS
---------- ----------
       179      25123

1 row selected.

SQL> show parameter optimizer_i

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     100

SQL> show parameter db_file_mult

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
db_file_multiblock_read_count        integer     32

SQL> select * from v$version ;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE    9.2.0.6.0       Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production

My question is why CBO decides to do a full-scan (which needs 180 logical IOs) instead of doing an index scan (which needs only 57 logical IOs) for the query
SQL> select * from t where owner = 'WMSYS' ;

I suppose CBO's decision is influenced by the cost. But why does CBO "thinks" that cost of full scan (12) is less than cost of index scan(48) even when index scan requires (quite) less logical IOs compared to full scan ?

August 01, 2006 - 7:07 am UTC

the index full scan is rather non-intuitive, you've hidden the column at the end. The costing of the full scan (single block IO's all - as opposed to a fast FULL scan) plus the 63 single block IO's needed to fetch the rows outweighed the cost of the simple full scan of a relatively small table.

Not quite understood...

Narendra, August 01, 2006 - 8:01 am UTC

Tom,

Thanks for the reply. But I did not understand the following:
The costing of the full scan (single block IO's all - as opposed to a fast FULL scan) plus the 63 single block IO's needed to fetch the rows outweighed the cost of the simple full scan of a relatively small table.

The index scan statistics do not reflect 63 single block IOs that you mentioned. It just shows 57 logical IOs. Doesn't a single block IO count as a logical IO?
Also, is this table having 25000+ rows still a small table for the optimizer to distinguish between full table scan and index scan ?

August 01, 2006 - 10:11 am UTC

depends on your version. There are multiblock IO's and single block IO's but an IO is an IO is an IO - they are all "costed" the same. Given the organization of your index (it is frankly "backwards" for the question), it would be highly unlikely to be used.

Using Indexes

Kiran, August 08, 2006 - 9:18 am UTC

Hi

below query uses idexes
select paty_id from ams_person_details_v where party_attribute24 ='36342627'

this query is not using index even by adding hint also

select /*+ INDEX(HZ_PARTIES XXMTC_HZ_PARTY_PH_INDEX_N2) */ party_id from ams_person_details_v
where party_attribute24in (select subno from crm_user_info where subno ='36342627'

please help me in changing the query to use index.

Suubu, August 16, 2006 - 9:00 pm UTC

Tom,

I have the same queries, one without index hint and another one with index hint.

Previously, the index did not exist on login2customer.lead_acct_num and i added index on this column but still it is not using index.

Just for the performance comparison, i forced the index hint, surprisingly optimized chose the index and did less logical io than the other.

I gathered stats on all tables (involved in this query) with dbms_stats.

exec dbms_stats.gather_table_stats(user, tabname => <tabname>, estimate_percent => dbms_stats.auto_sample_size, cascade => true) ;

1. Do you have any suggestions on tuning this query itself
2. I do not want to use index hint. How to optimize this query.

Thans.

Database instance optimizer settings :

optimizer_mode string CHOOSE
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100

********************************************************************************

select distinct LEAD_ACCT_NUM,CDG,SUB_ACCT_NUM
from login aa, login2customer bb, login2cust2svc cc, login2svc dd
where aa.username = 'june2006user'
and aa.login_id = dd.login_id
and dd.login2svc_id = cc.login2svc_id
and dd.service_type= 7
and bb.lead_acct_num is not null
and cc.login2customer_id = bb.login2customer_id

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.01 0 0 0 0
Fetch 1 0.81 1.23 5192 6364 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.82 1.26 5192 6364 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 141

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT UNIQUE
0 HASH JOIN
18283 TABLE ACCESS BY INDEX ROWID LOGIN2CUST2SVC
18285 NESTED LOOPS
1 NESTED LOOPS
1 TABLE ACCESS BY INDEX ROWID LOGIN
1 INDEX RANGE SCAN LOGIN_USERNAME_INDX2 (object id 66833)
1 TABLE ACCESS BY INDEX ROWID LOGIN2SVC
1 INDEX UNIQUE SCAN LOGIN2SVC_UK2 (object id 66842)
18283 INDEX RANGE SCAN LOGIN2SVC_ID_IDX1 (object id 67169)
17388 TABLE ACCESS FULL LOGIN2CUSTOMER

********************************************************************************

********************************************************************************

select /*+ INDEX(bb LOGIN2CUSTOMER_INDX3) */ distinct LEAD_ACCT_NUM,CDG,SUB_ACCT_NUM
from login aa, login2customer bb, login2cust2svc cc, login2svc dd
where aa.username = 'june2006user'
and aa.login_id = dd.login_id
and dd.login2svc_id = cc.login2svc_id
and dd.service_type= 7
and bb.lead_acct_num is not null
and cc.login2customer_id = bb.login2customer_id

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.17 0.16 0 672 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.18 0.17 0 672 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 141

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT UNIQUE
0 HASH JOIN
18283 TABLE ACCESS BY INDEX ROWID OBJ#(67166)
18285 NESTED LOOPS
1 NESTED LOOPS
1 TABLE ACCESS BY INDEX ROWID OBJ#(66830)
1 INDEX RANGE SCAN OBJ#(66833) (object id 66833)
1 TABLE ACCESS BY INDEX ROWID OBJ#(66841)
1 INDEX UNIQUE SCAN OBJ#(66842) (object id 66842)
18283 INDEX RANGE SCAN OBJ#(67169) (object id 67169)
17388 TABLE ACCESS BY INDEX ROWID OBJ#(66366)
17388 INDEX FULL SCAN OBJ#(76390) (object id 76390)

********************************************************************************

August 17, 2006 - 8:35 am UTC

as always - get the plan (autotrace traceonly explain), compare the ESTIMATED cardinality to the ACTUAL (the tkprof)

See where they are different and start asking "why"

but something is funny here - if you did 17,388 table access by index rowids... be interesting to see the tkprof with statistics_level set to all (to get the cr/pr/pw statistics associated with each line). Your data must be unusually "well clustered" in this case.

Column position in Index

Karteek, April 26, 2007 - 8:47 am UTC

Tom,

To my knowledge position of the column in index is very important. If I am correct, do you suggest to use FLAG fields in the first position of index while creating also is it a good to use FLAG field as first condition in WHERE clause.

like... WHERE ACTIVE_FLAG = 'Y' AND ACCOUNT_KEY = ...

My question is looking very dry... but I feel your reply can make me move further...

- Karteek

April 26, 2007 - 1:05 pm UTC

<quote src=expert oracle database architecture>
Myth: Most Discriminating Elements Should Be First
This seems like common sense. If you are going to create an index on the columns C1 and C2 in a table with 100,000 rows, and you find C1 has 100,000 distinct values and C2 has 25,000 distinct values, you would want to create the index on T(C1,C2). This means that C1 should be first, which is the ¿commonsense¿ approach. The fact is, when comparing vectors of data (consider C1, C2 to be a vector), it doesn¿t matter which you put first. Consider the following example. We will create a table based on ALL_OBJECTS and an index on the OWNER, OBJECT_TYPE, and OBJECT_NAME columns (least discriminating to most discriminating) and also on OBJECT_NAME, OBJECT_TYPE, and OWNER:

ops$tkyte@ORA10GR1> create table t
  2  as
  3  select * from all_objects;
Table created.
 
ops$tkyte@ORA10GR1> create index t_idx_1 on t(owner,object_type,object_name);
Index created.

ops$tkyte@ORA10GR1> create index t_idx_2 on t(object_name,object_type,owner);
Index created.
 
ops$tkyte@ORA10GR1> select count(distinct owner), count(distinct object_type),
  2   count(distinct object_name ), count(*)
  3  from t;
 
DISTINCTOWNER DISTINCTOBJECT_TYPE DISTINCTOBJECT_NAME  COUNT(*)
------------- ------------------- ------------------- --------
           28                  36               28537    48243

Now, to show that neither is more efficient space-wise, we¿ll measure their space utilization:

ops$tkyte@ORA10GR1> analyze index t_idx_1 validate structure;
Index analyzed.
 
ops$tkyte@ORA10GR1> select btree_space, pct_used, opt_cmpr_count, opt_cmpr_pctsave
  2 from index_stats;
 
BTREE_SPACE    PCT  OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------- ------  -------------- ----------------
    2702744   89.0               2               28
 
ops$tkyte@ORA10GR1> analyze index t_idx_2 validate structure;
Index analyzed.
 
ops$tkyte@ORA10GR1> select btree_space, pct_used, opt_cmpr_count, opt_cmpr_pctsave
  2 from index_stats;
                 
BTREE_SPACE    PCT  OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------- ------  -------------- ----------------
    2702744   89.0               1               13

They use exactly the same amount of space, down to the byte¿there are no differences there. However, the first index is a lot more compressible if we use index key compression, as evidenced by the OPT_CMP_PCTSAVE value. There is an argument for arranging the columns in the index in order from the least discriminating to the most discriminating. Now let¿s see how they perform, to determine if either index is generally more efficient than the other. To test this, we¿ll use a PL/SQL block with hinted queries (so as to use one index or the other):

ops$tkyte@ORA10GR1> alter session set sql_trace=true;
Session altered.

ops$tkyte@ORA10GR1> declare
  2          cnt int;
  3  begin
  4    for x in ( select /*+FULL(t)*/ owner, object_type, object_name from t )
  5    loop
  6         select /*+ INDEX( t t_idx_1 ) */ count(*) into cnt
  7           from t
  8          where object_name = x.object_name
  9            and object_type = x.object_type
 10            and owner = x.owner;
 11
 12          select /*+ INDEX( t t_idx_2 ) */ count(*) into cnt
 13           from t
 14          where object_name = x.object_name
 15            and object_type = x.object_type
 16            and owner = x.owner;
 17    end loop;
 18  end;
 19  /
PL/SQL procedure successfully completed.

These queries read every single row in the table by means of the index. The TKPROF report shows us the following:

SELECT /*+ INDEX( t t_idx_1 ) */ COUNT(*) FROM T
WHERE OBJECT_NAME = :B3 AND OBJECT_TYPE = :B2 AND OWNER = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  48243     10.63      10.78          0          0          0           0
Fetch    48243      1.90       1.77          0     145133          0       48243
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    96487     12.53      12.55          0     145133          0       48243

Rows     Row Source Operation
-------  ---------------------------------------------------
  48243  SORT AGGREGATE (cr=145133 pr=0 pw=0 time=2334197 us)
  57879   INDEX RANGE SCAN T_IDX_1 (cr=145133 pr=0 pw=0 time=1440672 us)(object¿

********************************************************************************

SELECT /*+ INDEX( t t_idx_2 ) */ COUNT(*) FROM T
WHERE OBJECT_NAME = :B3 AND OBJECT_TYPE = :B2 AND OWNER = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  48243     11.00      10.78          0          0          0           0
Fetch    48243      1.87       2.10          0     145168          0       48243
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    96487     12.87      12.88          0     145168          0       48243

Rows     Row Source Operation
-------  ---------------------------------------------------
  48243  SORT AGGREGATE (cr=145168 pr=0 pw=0 time=2251857 us)
  57879   INDEX RANGE SCAN T_IDX_2 (cr=145168 pr=0 pw=0 time=1382547 us)(object¿

They processed the same exact number of rows and very similar numbers of blocks (minor variations coming from accidental ordering of rows in the table and consequential optimizations made by Oracle), used equivalent amounts of CPU time, and ran in about the same elapsed time (run this same test again and the CPU and ELAPSED numbers will be a little different, but on average they will be the same). There are no inherent efficiencies to be gained by placing the columns in order of how discriminating they are, and as stated previously, with index key compression there is an argument for putting the least selective first. If you run the preceding example with COMPRESS 2 on the indexes, you¿ll find that the first index will perform about two-thirds the I/O of the second, given the nature of the query in this case.
However, the fact is that the decision to put column C1 before C2 must be driven by how the index is used. If you have lots of queries like the following:

select * from t where c1 = :x and c2 = :y;
select * from t where c2 = :y;

it makes more sense to place the index on T(C2,C1). This single index could be used by either of the queries. Additionally, using index key compression (which we looked at with regard to IOTs and will examine further later), we can build a smaller index if C2 is first. This is because each value of C2 repeats itself on average four times in the index. If C1 and C2 are both, on average, 10 bytes in length, the index entries for this index would nominally be 2,000,000 bytes (100,000 ~TMS 20). Using index key compression on (C2, C1), we could shrink this index to 1,250,000 (100,000 ~TMS 12.5), since three out of four repetitions of C2 could be suppressed.
In Oracle 5 (yes, version 5!), there was an argument for placing the most selective columns first in an index. It had to do with the way version 5 implemented index compression (not the same as index key compression). This feature was removed in version 6 with the addition of row-level locking. Since then, it is not true that putting the most discriminating entries first in the index will make the index smaller or more efficient. It seems like it will, but it will not. With index key compression, there is a compelling argument to go the other way since it can make the index smaller. However, it should be driven by how you use the index, as previously stated.
</quote>

Index not used

Karthick, June 13, 2008 - 3:19 am UTC

I did this in the following version...

Oracle9i Enterprise Edition Release 9.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production

I created this table...

SQL> create table hx_index_test_tbl(
  2  no integer not null,
  3  name varchar2(50) not null,
  4  old_value varchar2(100),
  5  new_value varchar2(100));

Table created.

I created these index...

SQL> create index hx_index_test_tbl_idx on hx_index_test_tbl(name,no);

Index created.

SQL> create index hx_index_test_tbl_idx_001 on hx_index_test_tbl(name);

Index created.

I am inserted following value...

SQL> insert into hx_index_test_tbl
  2  select row_number() over(partition by name order by 1) no, name, rpad('*',100,'*') old_value, rpad('*',100,'*') new_value
  3    from (select 'A'||mod(level,100) name from dual connect by level <=1000);

1000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns');

PL/SQL procedure successfully completed.

SQL> validate index hx_index_test_tbl_idx;

Index analyzed.

SQL> set serveroutput on

This is the stats of the index. Total blockes consumed is 1+3=4

SQL> exec print_table('select * from index_stats');

HEIGHT                        : 2
BLOCKS                        : 8
NAME                          : HX_INDEX_TEST_TBL_IDX
PARTITION_NAME                :
LF_ROWS                       : 1000
LF_BLKS                       : 3
LF_ROWS_LEN                   : 17900
LF_BLK_LEN                    : 7996
BR_ROWS                       : 2
BR_BLKS                       : 1
BR_ROWS_LEN                   : 27
BR_BLK_LEN                    : 8028
DEL_LF_ROWS                   : 0
DEL_LF_ROWS_LEN               : 0
DISTINCT_KEYS                 : 1000
MOST_REPEATED_KEY             : 1
BTREE_SPACE                   : 32016
USED_SPACE                    : 17927
PCT_USED                      : 56
ROWS_PER_KEY                  : 1
BLKS_GETS_PER_ACCESS          : 3
PRE_ROWS                      : 0
PRE_ROWS_LEN                  : 0
OPT_CMPR_COUNT                : 1
OPT_CMPR_PCTSAVE              : 16
-----------------

PL/SQL procedure successfully completed.

SQL> exec print_table('select * from user_tables where table_name = ''HX_INDEX_TEST_TBL''');

Total blocks in the table is 31

TABLE_NAME                    : HX_INDEX_TEST_TBL
TABLESPACE_NAME               : PSDEFAULT
CLUSTER_NAME                  :
IOT_NAME                      :
PCT_FREE                      : 10
PCT_USED                      : 40
INI_TRANS                     : 1
MAX_TRANS                     : 255
INITIAL_EXTENT                : 65536
NEXT_EXTENT                   :
MIN_EXTENTS                   : 1
MAX_EXTENTS                   : 2147483645
PCT_INCREASE                  :
FREELISTS                     : 1
FREELIST_GROUPS               : 1
LOGGING                       : YES
BACKED_UP                     : N
NUM_ROWS                      : 1000
BLOCKS                        : 31
EMPTY_BLOCKS                  : 0
AVG_SPACE                     : 0
CHAIN_CNT                     : 0
AVG_ROW_LEN                   : 100
AVG_SPACE_FREELIST_BLOCKS     : 0
NUM_FREELIST_BLOCKS           : 0
DEGREE                        :          1
INSTANCES                     :          1
CACHE                         :     N
TABLE_LOCK                    : ENABLED
SAMPLE_SIZE                   : 1000
LAST_ANALYZED                 : 13-jun-2008 08:32:40
PARTITIONED                   : NO
IOT_TYPE                      :
TEMPORARY                     : N
SECONDARY                     : N
NESTED                        : NO
BUFFER_POOL                   : DEFAULT
ROW_MOVEMENT                  : DISABLED
GLOBAL_STATS                  : YES
USER_STATS                    : NO
DURATION                      :
SKIP_CORRUPT                  : DISABLED
MONITORING                    : NO
CLUSTER_OWNER                 :
DEPENDENCIES                  : DISABLED
-----------------

PL/SQL procedure successfully completed.

The problem is when i have just the name condition alone in my select then no index is considered. 
It always gose for a full scan.

SQL> set autotrace traceonly explain
SQL> select * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10 Bytes=1000)
   1    0   TABLE ACCESS (FULL) OF 'HX_INDEX_TEST_TBL' (Cost=5 Card=10 Bytes=1000)


SQL> select * from hx_index_test_tbl where name = 'A99';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10 Bytes=1000)
   1    0   TABLE ACCESS (FULL) OF 'HX_INDEX_TEST_TBL' (Cost=5 Card=10 Bytes=1000)


but when i add the no condition it goes for the index...

SQL> select * from hx_index_test_tbl where name = 'A0' and no = 1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=100)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=2 Card=1 Bytes=100)
   2    1     INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX' (NON-UNIQUE) (Cost=1 Card=1)


now i forced it to pick the index. but it shows a high cost for that...

SQL> select /*+ INDEX(hx_index_test_tbl hx_index_test_tbl_idx_001) */ * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=10 Bytes=1000)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=9 Card=10 Bytes=1000)
   2    1     INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX_001' (NON-UNIQUE) (Cost=1 Card=10)


SQL> select /*+ INDEX(hx_index_test_tbl hx_index_test_tbl_idx) */ * from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=10 Bytes=1000)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'HX_INDEX_TEST_TBL' (Cost=10 Card=10 Bytes=1000)
   2    1     INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX' (NON-UNIQUE) (Cost=2 Card=10)


why my index is not used. why using index shows high cost. and why it is used when i include no column.

but when i select just name or name, no in my select list then the index is considered.

SQL> select name from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=10 Bytes=40)
   1    0   INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX_001' (NON-UNIQUE) (Cost=1 Card=10 Bytes=40)


SQL> select no,name from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=10 Bytes=70)
   1    0   INDEX (RANGE SCAN) OF 'HX_INDEX_TEST_TBL_IDX' (NON-UNIQUE) (Cost=2 Card=10 Bytes=70)


once i include the old_value in the select list then it gose for a full scan why is it so.


SQL> select no,name,old_value from hx_index_test_tbl where name = 'A0';

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10 Bytes=590)
   1    0   TABLE ACCESS (FULL) OF 'HX_INDEX_TEST_TBL' (Cost=5 Card=10 Bytes=590)
   
My table has 31 block of data. But the index has very less blocks but even then it goes for the full scan. Why is it so?

June 13, 2008 - 8:09 am UTC

you didn't analyze your index, lack of data to the optimizer results in bad plan...

exec dbms_stats.gather_table_stats(user,'hx_index_test_tbl',method_opt=>'for all indexed columns',cascade=>true);

Index creation in different schema

A reader, October 28, 2009 - 11:56 pm UTC

Hi Tom,

I have one question in the index which is not created in the same schema where table is available.

1. I have created the table in the Demo1 schema and inserted the some sample records.

sql> conn demo1/demo1

sql> create table table_in_demo1
(
col1 number,
col2 number
);

sql> insert into table_in_demo1 values (10,11);
sql> insert into table_in_demo1 values (20,21);
sql> insert into table_in_demo1 values (30,31);
sql> select * from table_in_demo1;

COL1 COL2
10 11
20 21
30 31

2. I have given grant to schema Demo2 on the table. now I am able to access the table from demo2.

sql> grant all on table_in_demo1 to demo2

sql> conn demo2/demo2

sql> select * from demo1.table_in_demo1
COL1 COL2

10 11
20 21
30 31

3. I have created the index in “table_in_demo1” table with in Demo2 schema. I have check the owner details in the data dictionary table. It is showing demo2 as a owner.

sql> conn demo2/demo2

sql> create index tab_idx on demo1.table_in_demo1(col2);

sql> select owner,index_name,table_owner from all_indexes
where table_name = 'TABLE_IN_DEMO1';

owner Index_name table_owner

DEMO2 TAB_IDX DEMO1

4. I have connected to the schema Demo1 once again and I have revoked the permission from the Demo2 schema on table_in_demo1.

sql> conn demo1/demo1
sql> revoke all on table_in_demo1 from demo2;

My Questions:

1. I have executed the explain plan of the below query. Still it is going for the index. (INDEX RANGE SCAN NON-UNIQUE DEMO2.TAB_IDX)

I have revoked the demo2 permission2. could you explain me how it access the index which is available in the demo2 schema.

sql> select * from table_in_demo1
where col2 = 21;

Plan
SELECT STATEMENT CHOOSE Cost: 2 Bytes: 4 Cardinality: 1
2 TABLE ACCESS BY INDEX ROWID DEMO1.TABLE_IN_DEMO1 Cost: 2 Bytes: 4 Cardinality: 1
1 INDEX RANGE SCAN NON-UNIQUE DEMO2.TAB_IDX Cost: 1 Cardinality: 1

2. I have tried to insert the data and it is inserting with out any error. Could you guide me how it is inserting the data in index entry which is available in the Demo2 schema and since we have removed the access from Demo1.

sql> conn demo1/demo1
sql> insert into table_in_demo1 values (40,41);
sql> select * from table_in_demo1;

COL1 COL2

10 11
20 21
30 31
40 41

Regards,
Karthi

October 29, 2009 - 8:34 am UTC

1) so, you revoked the permission going forward, but you cannot change the past, the past has already happened.

They no longer have permission to do something
But they used to
And when they had this permission - they did something

You cannot change that fact.

2) the index 'belongs' to the table, the index is part of the table, the index isn't a standalone object - it exists only in support of the table. The user demo2 cannot query that index - they do not have access to the data.

Index not used

Hari, November 24, 2009 - 6:09 am UTC

Hi Tom,

Good Day.

I have a table called PERSHING_SECURITY_CUSTODIAN, which has 4808461 records in it.

There are two columns PROCESSED (values Y and N) and IBDPARTYID (values 1, 111501, 120901) which are used in the select statement.

Now the column PROCESSED is B*Tree indexed and IBDPARTYID is a part of composite B*Tree index of the form DATEOFDATAHDR and IBDPARTYID (i.e.) IBDPARTYID is 2nd column in the index.

The following query returns 756 rows, but statistics is huge.

SELECT * FROM PERSHING_SECURITY_CUSTODIAN WHERE PROCESSED = 'N' AND IBDPARTYID = 1;
756 rows selected.
Execution Plan
----------------------------------------------------------
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost|
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 785K| 197M| 39353|
| 1 | TABLE ACCESS FULL| PERSHING_SECURITY_CUSTODIAN | 785K| 197M| 39353|
---------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
Statistics
----------------------------------------------------------
1544 recursive calls
0 db block gets
182298 consistent gets
70075 physical reads
0 redo size
112972 bytes sent via SQL*Net to client
927 bytes received via SQL*Net from client
52 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
756 rows processed

Pardon me for improper formatting.

I am not much bothered about full table scan, as it is not an evil and CBO would have selected the best plan. However, the CONSISTENT GETS and PHYSICAL READS are very high.

I tried the same query with various combination of indexes which are provided below:

Scenario 1: Creating a separate index on IBDPARTYID column. The statistics I am getting is
182304 consistent gets
181959 physical reads

Scenario 2: Creating a separate bitmap index on IBDPARTYID column. The statistics I am getting is
182231 consistent gets
173371 physical reads

Scenario 3: Creating index on IBDPARTYID and DATEOFDATAHDR columns. The statistics I am getting is
181963 consistent gets
147960 physical reads

Scenario 4: Creating index on PROCESSED and IBDPARTYID columns. The statistics I am getting is
181963 consistent gets
137731 physical reads

Scenario 5: Creating bitmap index on PROCESSED and IBDPARTYID columns. The statistics I am getting is
181963 consistent gets
169302 physical reads

Please note that the statistics are updated. I think, in Oracle, the create index statement updates the index statistics.

Now, my questions are:

a) Is this the behaviour of the query?
b) Why is that indexes are not used? Is it because the number of records are high?
c) Is there any better way to tune this query in order to reduce the consistent gets and physical reads?

Thanks

Hari

November 24, 2009 - 11:48 am UTC

Look at the plan

the optimizer is guessing 785,000 records.
reality is 756 records

when the estimated cardinality is so so so SO far off from reality - the optimizer will never get the right plan.

Are the statistics up to date
How do you gather column statistics - if you have skewed data - as you must - you might need to get histograms in place.

Scofield, December 20, 2009 - 1:56 am UTC

Hi sir
Regarding the index range scan,
suppose "where x betwen 20 and 30"
From your book I understand that:

"Oracle finds the first index block contains 20 and then just walks horizantally thorugh the linked list of leaf nodes"

What I want to ask is;

Does oracle perform like this:
1-)go to index block that contain 20, get the rowid
2-)go to block, read the data
3-)go to index block that contain 21, get the rowid
4-)go to block, read the data
5-)go to index block that contain 22, get the rowid
..
..

December 20, 2009 - 9:10 am UTC

It might.

It depends on the rest of the plan.

But usually, yes, an index range scan would be followed by a table access by index rowid - back and forth, back and forth between the two.

index

A reader, December 21, 2009 - 11:56 am UTC

To Hari...

A reader, December 22, 2009 - 7:38 pm UTC

Hi Hari,

Did you try creating an index on : IBDPARTYID, PROCESSED?
Also, It seems the data on IBDPARTYID (particularly) is highly skewed. As Tom has suggested you might want to create histograms.

Cheers.

A reader, January 11, 2010 - 2:05 am UTC

Sir, in above you confirmed that below is generally true:

"where x betwen 20 and 30"

1-)go to index block that contain 20, get the rowid
2-)go to block, read the data
3-)go to index block that contain 21, get the rowid
4-)go to block, read the data
5-)go to index block that contain 22, get the rowid

Suppose it will take 3 phyical or logical IO to reach the index block that contain 20.

1-)go to index block that contain 20, get the rowid
2-)go to block, read the data

When I try to go to index block that contain 21, does it again cost 3 phyical or logical IO?

January 18, 2010 - 6:16 am UTC

once we get to the leaf block, we never need to traverse up the tree again - at most, we'll need to do a logical IO on the leaf block.

A reader, January 23, 2010 - 1:16 am UTC

Respected Sir;

...once we get to the leaf block, we never need to traverse up the tree again.

How does this happen?
Does oracle store some information about this so that traversing up the tree again is not required ?

January 25, 2010 - 5:57 am UTC

every leaf block has a pointer right and a pointer left - to it neighboring block. There is a doubly linked list of leaf blocks.

Different plans for the same query

Kulkarni, September 13, 2011 - 11:16 pm UTC

Dear Tom,
I am facing a peculiar problem. We recently partitioned a big table on a date column. Created local index on the partitioned column and global partitioned index on another column which will be the leading column in the where clause of most of the queries. Now we are facing performance issue in production. I gathered the query in question and tested it development database and QA database. Though both databases are at the same version and OS there is difference in the query plan.

Here is the query and plan.


SELECT TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DATASTREAM.DS_PRICES
 WHERE update_stamp IN (TO_DATE('05/12/1998', 'MM/DD/YYYY'),
                        TO_DATE('05/13/1998', 'MM/DD/YYYY'),
                        TO_DATE('05/14/1998', 'MM/DD/YYYY'),
                        TO_DATE('05/20/1998', 'MM/DD/YYYY'),
                        TO_DATE('05/25/1998', 'MM/DD/YYYY'),
                        TO_DATE('06/10/1998', 'MM/DD/YYYY'),
                        TO_DATE('07/13/1998', 'MM/DD/YYYY'),
                        TO_DATE('08/11/1998', 'MM/DD/YYYY'),
   TO_DATE('11/09/1998', 'MM/DD/YYYY'))
               AND DS_ID = '27348D' ORDER BY UPDATE_STAMP;


Query plan in development database :

SQL> SELECT TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DATASTREAM.DS_PRICES
  2   WHERE update_stamp IN (TO_DATE('10/19/2001', 'MM/DD/YYYY'),
  3                          TO_DATE('10/22/2001', 'MM/DD/YYYY'),
  4                          TO_DATE('10/23/2001', 'MM/DD/YYYY'),
  5                          TO_DATE('10/29/2001', 'MM/DD/YYYY'),
  6                          TO_DATE('11/01/2001', 'MM/DD/YYYY'),
  7                          TO_DATE('11/19/2001', 'MM/DD/YYYY'),
  8                          TO_DATE('12/21/2001', 'MM/DD/YYYY'),
  9                          TO_DATE('01/21/2002', 'MM/DD/YYYY'),
 10                          TO_DATE('04/22/2002', 'MM/DD/YYYY'))
 11                 AND DS_ID = '315238' ORDER BY UPDATE_STAMP;

Elapsed: 00:00:00.07

Execution Plan
----------------------------------------------------------
Plan hash value: 1014417044

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |     4 |    92 |    11   (0)| 00:00:01 |       |       |
|   1 |  PARTITION HASH SINGLE              |                   |     4 |    92 |    11   (0)| 00:00:01 |     6 |     6 |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES         |     4 |    92 |    11   (0)| 00:00:01 | ROWID | ROWID |
|*  3 |    INDEX RANGE SCAN                 | DS_ID_UPDT_GLOBAL |     4 |       |     9   (0)| 00:00:01 |     6 |     6 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DS_ID"='315238')
       filter("UPDATE_STAMP"=TO_DATE(' 2001-10-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2001-10-22 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2001-10-23
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2001-10-29 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              OR "UPDATE_STAMP"=TO_DATE(' 2001-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE('
              2001-11-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2001-12-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2002-01-21 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2002-04-22 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
        836  recursive calls
          0  db block gets
        183  consistent gets
         12  physical reads
          0  redo size
        687  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         10  sorts (memory)
          0  sorts (disk)
          1  rows processed

Query plan in QA server(which is RAC):
Execution Plan
----------------------------------------------------------
Plan hash value: 3819973762

----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |     4 |    92 |     3  (34)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE INLIST|           |     4 |    92 |     3  (34)| 00:00:01 |KEY(I) |KEY(I) |
|   2 |   SORT ORDER BY        |           |     4 |    92 |     3  (34)| 00:00:01 |       |       |
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |     4 |    92 |     2   (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("DS_ID"='27348D' AND ("UPDATE_STAMP"=TO_DATE(' 1974-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 1998-05-12 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 1998-05-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-14 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-06-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-07-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-08-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-11-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          2  physical reads
          0  redo size
        480  bytes sent via SQL*Net to client
        481  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          0  rows processed

I see the same plan when this query is run on production.

What is more puzzling is the change in the plan when the value of DS_ID changed to 315238. We have latest stats gathered on the table.

I am at a loss to know why index is not used in QA but it is used in development database. In fact there are more rows in development database than in QA.

Please help me.
With regards,
Bheemasen Kulkarni

Index not used

Kulkarni, September 13, 2011 - 11:18 pm UTC

Just to add. I am using 10.2.0.5. Development is non-RAC database where as QA and production are two node RAC's

September 14, 2011 - 7:07 pm UTC

those tables are tiny - the cost of scanning is "2".

Why do you care what plan is being used at that point? We could use just about *ANY* plan, it doesn't matter when the data is this small.

do you have an actual problem to solve?

Index not used

Kulkarni, September 15, 2011 - 1:41 am UTC

Tom,
Table is not small. It is 13gb in size and has 240 million records. We partitioned this table and tested the queries and jobs on DEV database. Hence we moved on to production and partitioned this table. Here are the sql's used to create partitioned table and indexes on it.


CREATE TABLE DATASTREAM.DS_PRICES_PART
   (    DS_ID VARCHAR2(9) NOT NULL ENABLE NOVALIDATE,
        UPDATE_STAMP DATE NOT NULL ENABLE NOVALIDATE,
        MVALUE NUMBER(14,3),
        NOSH NUMBER(15,3),
        PRICE NUMBER(15,3),
        PHIGH NUMBER(15,3),
        PLOW NUMBER(15,3),
        POPEN NUMBER(15,3),
        VOLUME NUMBER(13,0),
        PRICE_US NUMBER(15,3),
        PHIGH_US NUMBER(15,3),
        PLOW_US NUMBER(15,3),
        POPEN_US NUMBER(15,3)
   ) PCTFREE 5
PARTITION BY RANGE (UPDATE_STAMP)
(PARTITION DS_PRICES_p1 VALUES LESS THAN (TO_DATE('31-MAR-1975 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p2 VALUES LESS THAN (TO_DATE('31-MAR-1980 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p3 VALUES LESS THAN (TO_DATE('31-MAR-1985 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p4 VALUES LESS THAN (TO_DATE('31-MAR-1990 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p5 VALUES LESS THAN (TO_DATE('31-MAR-2000 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p6 VALUES LESS THAN (TO_DATE('31-MAR-2005 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p7 VALUES LESS THAN (TO_DATE('31-MAR-2010 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p8 VALUES LESS THAN (TO_DATE('31-MAR-2015 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE);


CREATE INDEX DATASTREAM.DS_ID_UPDT_GLOBAL ON DATASTREAM.DS_PRICES_PART(DS_ID,UPDATE_STAMP) GLOBAL
partition by hash(ds_id) 
partitions 8
TABLESPACE INDEX5;

CREATE INDEX DATASTREAM.DSPRICES_UPDTSTAMP_IDX ON DATASTREAM.DS_PRICES (UPDATE_STAMP) LOCAL TABLESPACE INDEX5;

After this most of the jobs and SQL's are running fine and even there is improvement in the performance. However one job is having problem. Application team complains that it almost gets hung. When I checked in the database I found the  following query running with "db file scattered reads".


SELECT TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us               FROM DS_PRICES
  WHERE update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'),                         TO_DATE(:p2, 'MM/DD/YYY
Y'),                         TO_DATE(:p3, 'MM/DD/YYYY'),                         TO_DATE(:p4, 'MM/DD
/YYYY'),                         TO_DATE(:p5, 'MM/DD/YYYY'),                         TO_DATE(:p6, 'M
M/DD/YYYY'),                         TO_DATE(:p7, 'MM/DD/YYYY'),                         TO_DATE(:p8
, 'MM/DD/YYYY'),                         TO_DATE(:p9, 'MM/DD/YYYY'))                AND DS_ID = :p10
ORDER BY UPDATE_STAMP


When I ran this query with the specific values I found the plan as below.

SQL> SELECT TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DATASTREAM.DS_PRICES
  2   WHERE update_stamp IN (TO_DATE('04/12/2011', 'MM/DD/YYYY'),
  3                          TO_DATE('04/13/2011', 'MM/DD/YYYY'),
  4                          TO_DATE('04/14/2011', 'MM/DD/YYYY'),
  5                          TO_DATE('04/20/2011', 'MM/DD/YYYY'),
  6                          TO_DATE('04/25/2011', 'MM/DD/YYYY'),
  7                          TO_DATE('05/11/2011', 'MM/DD/YYYY'),
  8                          TO_DATE('06/13/2011', 'MM/DD/YYYY'),
  9                          TO_DATE('07/12/2011', 'MM/DD/YYYY'),
                        TO_DATE('10/10/2011', 'MM/DD/YYYY'))
 10   11                 AND DS_ID = '50853E' ORDER BY UPDATE_STAMP;

8 rows selected.

Elapsed: 00:00:00.55

Execution Plan
----------------------------------------------------------
Plan hash value: 1014417044

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |     5 |   115 |     9   (0)| 00:00:01 |       |       |
|   1 |  PARTITION HASH SINGLE              |                   |     5 |   115 |     9   (0)| 00:00:01 |     4 |     4 |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES         |     5 |   115 |     9   (0)| 00:00:01 | ROWID | ROWID |
|*  3 |    INDEX RANGE SCAN                 | DS_ID_UPDT_GLOBAL |     5 |       |     8   (0)| 00:00:01 |     4 |     4 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DS_ID"='50853E')
       filter("UPDATE_STAMP"=TO_DATE(' 2011-04-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2011-04-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-04-14
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-04-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              OR "UPDATE_STAMP"=TO_DATE(' 2011-04-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE('
              2011-05-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-06-13 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-07-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2011-10-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
       3840  recursive calls
          0  db block gets
        842  consistent gets
         26  physical reads
          0  redo size
        906  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         83  sorts (memory)
          0  sorts (disk)
          8  rows processed

     

What is strange is that some times there will be waits on "db file sequential reads". In such scenario I found that the job was finished within the window. 

I am not able to make out why this query some times goes for scattered reads( fill scans, in this case jobs runs for a very long time) and some times sequential reads( in this case job completes within the expected window).

We are facing this issue only after partitioning the table.

September 15, 2011 - 7:45 am UTC

... Table is not small. ...

sorry, plan says OTHERWISE, quite clearly it does.

so either you've not told us the full story, or you are running on test data (which is the same as not telling us the full story)

Look, in the plan:

...

|* 3 | TABLE ACCESS FULL | DS_PRICES | 4 | 92 | 2 (0)| 00:00:01 |KEY(I) |KEY(I)

...

a cost of two, an estimated time to scan - 1 second (it is actually less than that, we just don't do zeroes)

Also:

          6  consistent gets
          2  physical reads

that table is not large.

and you told me:

We have
latest stats gathered on the table.

so, before I even begin to look at this, explain that discrepancy.

if you are trying to tune a query that runs against big data in a test environment without data - just stop. It is useless to do that - utterly and completely useless.

Index not used

Kulkarni, September 15, 2011 - 8:27 am UTC

Tom,
As I said this is a big table and we partitioned it.

SQL> select count(*) from DATASTREAM.DS_PRICES;

  COUNT(*)
----------
 236759177

Statistics are also collected on this partitioned table.

select COLUMN_NAME,NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,LAST_ANALYZED,SAMPLE_SIZE,GLOBAL_STATS,HISTOGRAM from DBA_TAB_COL_STATISTICS where TABLE_NAME='DS_PRICES';

COLUMN_NAME          NUM_DISTINCT LOW_VALUE            HIGH_VALUE              DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANALYZED      SAMPLE_SIZE GLO
-------------------- ------------ -------------------- -------------------- ---------- ---------- ----------- ------------------ ----------- ---
HISTOGRAM
---------------
DS_ID                      118590 313330303036         533533313734         .000015045          0          75 14-SEP-11                57541 YES
HEIGHT BALANCED

UPDATE_STAMP                 2078 78650C0B010101       786E0119010101       .000642674          0          75 14-SEP-11                 5731 YES
HEIGHT BALANCED


SQL> select index_name,CLUSTERING_FACTOR,NUM_ROWS from dba_indexes where table_name='DS_PRICES';

INDEX_NAME                     CLUSTERING_FACTOR   NUM_ROWS
------------------------------ ----------------- ----------
DSPRICES_UPDTSTAMP_IDX                 219731311  232832372
DS_ID_UPDT_GLOBAL                       27945709  234387393


SQL> select PARTITION_NAME,INDEX_NAME,CLUSTERING_FACTOR,NUM_ROWS,SAMPLE_SIZE,GLOBAL_STATS,USER_STATS from DBA_IND_STATISTICS where TABLE_NAME='DS_PRICES';

PARTITION_NAME                 INDEX_NAME                     CLUSTERING_FACTOR   NUM_ROWS SAMPLE_SIZE GLO USE
------------------------------ ------------------------------ ----------------- ---------- ----------- --- ---
                               DSPRICES_UPDTSTAMP_IDX                 219731311  232832372      422832 YES NO
                               DS_ID_UPDT_GLOBAL                       27945709  234387393      270304 YES NO
DS_PRICES_P1                   DSPRICES_UPDTSTAMP_IDX                         0          0             YES NO
DS_PRICES_P2                   DSPRICES_UPDTSTAMP_IDX                         0          0             YES NO
DS_PRICES_P3                   DSPRICES_UPDTSTAMP_IDX                         0          0             YES NO
DS_PRICES_P4                   DSPRICES_UPDTSTAMP_IDX                         0          0             YES NO
DS_PRICES_P5                   DSPRICES_UPDTSTAMP_IDX                         0          0             YES NO
DS_PRICES_P6                   DSPRICES_UPDTSTAMP_IDX                  62338124   66568959      429403 YES NO
DS_PRICES_P7                   DSPRICES_UPDTSTAMP_IDX                 124384641  130492568      428272 YES NO
DS_PRICES_P8                   DSPRICES_UPDTSTAMP_IDX                  34193479   37867321      416476 YES NO
SYS_P21                        DS_ID_UPDT_GLOBAL                        3418770   28208162      259790 YES NO
SYS_P22                        DS_ID_UPDT_GLOBAL                        3559860   29716441      274387 YES NO
SYS_P23                        DS_ID_UPDT_GLOBAL                        3633923   30627329      284198 YES NO
SYS_P24                        DS_ID_UPDT_GLOBAL                        3620383   29851792      274088 YES NO
SYS_P25                        DS_ID_UPDT_GLOBAL                        3358688   28104853      261427 YES NO
SYS_P26                        DS_ID_UPDT_GLOBAL                        3645123   30837152      283743 YES NO
SYS_P27                        DS_ID_UPDT_GLOBAL                        3423924   28932400      266042 YES NO
SYS_P28                        DS_ID_UPDT_GLOBAL                        3284304   28087072      258567 YES NO


SQL> select TABLE_NAME,PARTITION_NAME,NUM_ROWS,SAMPLE_SIZE,GLOBAL_STATS,USER_STATS from dba_tab_statistics where table_name='DS_PRICES';

TABLE_NAME                     PARTITION_NAME                   NUM_ROWS SAMPLE_SIZE GLO USE
------------------------------ ------------------------------ ---------- ----------- --- ---
DS_PRICES                                                      237029034      524956 YES NO
DS_PRICES                      DS_PRICES_P1                            0             YES NO
DS_PRICES                      DS_PRICES_P2                            0             YES NO
DS_PRICES                      DS_PRICES_P3                            0             YES NO
DS_PRICES                      DS_PRICES_P4                            0             YES NO
DS_PRICES                      DS_PRICES_P5                            0             YES NO
DS_PRICES                      DS_PRICES_P6                     66396541       52265 YES NO
DS_PRICES                      DS_PRICES_P7                    130976925       64967 YES NO
DS_PRICES                      DS_PRICES_P8                     38879540       58991 YES NO

Hope I have given enough details. Please let me know if some more information needed.

September 16, 2011 - 1:11 pm UTC

dude, see above.

How could this big table do six IO's only?

are you testing things on a tiny system that you plan on doing against a big system?

Query plan in QA server(which is RAC):
Execution Plan
----------------------------------------------------------
Plan hash value: 3819973762

----------------------------------------------------------------------------------------------------

| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop 
|
----------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |           |     4 |    92 |     3  (34)| 00:00:01 |       |       
|
|   1 |  PARTITION RANGE INLIST|           |     4 |    92 |     3  (34)| 00:00:01 |KEY(I) |KEY(I) 
|
|   2 |   SORT ORDER BY        |           |     4 |    92 |     3  (34)| 00:00:01 |       |       
|
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |     4 |    92 |     2   (0)| 00:00:01 |KEY(I) |KEY(I) 
|
----------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("DS_ID"='27348D' AND ("UPDATE_STAMP"=TO_DATE(' 1974-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 1998-05-12 00:00:00', 
'syyyy-mm-dd
              hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 1998-05-13 00:00:00', 'syyyy-mm-dd 
hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-14 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-05-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-06-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-07-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-08-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 1998-11-09 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          2  physical reads
          0  redo size
        480  bytes sent via SQL*Net to client
        481  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          0  rows processed

that table is NOT big.

I want information exclusively straight from the machine having the issue, this QA machine is *useless*.

You need to get me in a readable format - the estimated number of rows for each step in the plan and the actual number of rows.

Best if you can just run the query using /*+ gather_plan_statistics */ as a hint after the word select and the output from dbms_xplan.display_cursor.

use this technique:

http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/

IN PRODUCTION, not in QA

Index not used

Kulkarni, September 15, 2011 - 10:06 am UTC

Tom,
Let me put the problem in a nutshell.
1.Before partition this job which runs the query mentioned above used to take less than 2 hours.
2.After partition this job does not complete even after 6 hours. The query mentioned above is one the queries run by the job. We find that the job is struck at this query. During this time I see  lot of db scattered reads happening on this table DS_PRICES.
3. We are using bind variables in this query.
4. Yesterday I captured a set of values passed into these bind variables while this job was running on production and ran the query manually. Here is the plan which clearly indicates that index is used.

SQL> SELECT TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DATASTREAM.DS_PRICES
  2   WHERE update_stamp IN (TO_DATE('04/12/2011', 'MM/DD/YYYY'),
  3                          TO_DATE('04/13/2011', 'MM/DD/YYYY'),
  4                          TO_DATE('04/14/2011', 'MM/DD/YYYY'),
  5                          TO_DATE('04/20/2011', 'MM/DD/YYYY'),
  6                          TO_DATE('04/25/2011', 'MM/DD/YYYY'),
  7                          TO_DATE('05/11/2011', 'MM/DD/YYYY'),
  8                          TO_DATE('06/13/2011', 'MM/DD/YYYY'),
  9                          TO_DATE('07/12/2011', 'MM/DD/YYYY'),
                        TO_DATE('10/10/2011', 'MM/DD/YYYY'))
 10   11                 AND DS_ID = '50853E' ORDER BY UPDATE_STAMP;

8 rows selected.

Elapsed: 00:00:00.55

Execution Plan
----------------------------------------------------------
Plan hash value: 1014417044

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |     5 |   115 |     9   (0)| 00:00:01 |       |       |
|   1 |  PARTITION HASH SINGLE              |                   |     5 |   115 |     9   (0)| 00:00:01 |     4 |     4 |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES         |     5 |   115 |     9   (0)| 00:00:01 | ROWID | ROWID |
|*  3 |    INDEX RANGE SCAN                 | DS_ID_UPDT_GLOBAL |     5 |       |     8   (0)| 00:00:01 |     4 |     4 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DS_ID"='50853E')
       filter("UPDATE_STAMP"=TO_DATE(' 2011-04-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2011-04-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-04-14
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-04-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              OR "UPDATE_STAMP"=TO_DATE(' 2011-04-25 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE('
              2011-05-11 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-06-13 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') OR "UPDATE_STAMP"=TO_DATE(' 2011-07-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') OR
              "UPDATE_STAMP"=TO_DATE(' 2011-10-10 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
       3840  recursive calls
          0  db block gets
        842  consistent gets
         26  physical reads
          0  redo size
        906  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         83  sorts (memory)
          0  sorts (disk)
          8  rows processed


5. This means query is using index when bind variables are not used and is doing full table scan when bind variables are used.

Today when we forced this query to use the index the job completed within 20 mins. This time I could see db file sequential reads.

Now the question is why Oracle is not picking the index on its own. I don't want to use any hint in the query as that will be only a work around.

Hope I am clear in presenting the facts.

September 16, 2011 - 1:17 pm UTC

the index is clearly being used for that query - that doesn't say anything about the plan actually being used by the query in the shared pool already.

use dbms_xplan to display the query plan for the actual query straight from the shared pool please.

I want that gather_plan_statistics output too - we need to compare the estimated rows to the actual rows.

Index not used

Kulkarni, September 17, 2011 - 9:28 am UTC

Hi Tom,
Here is the plan it used during execution of the job.

SQL> select * from TABLE(dbms_xplan.display_cursor('&SQL_ID', &CHILD,'ALL IOSTATS LAST'));
Enter value for sql_id: 0yuuh59u9k9b1
Enter value for child: 1
old   1: select * from TABLE(dbms_xplan.display_cursor('&SQL_ID', &CHILD,'ALL IOSTATS LAST'))
new   1: select * from TABLE(dbms_xplan.display_cursor('0yuuh59u9k9b1', 1,'ALL IOSTATS LAST'))

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0yuuh59u9k9b1, child number 1
-------------------------------------
SELECT /* + INDEX(DS_PRICES DS_ID_UPDT_GLOBAL) */   TO_CHAR(update_stamp, 'MM/DD/YYYY'),
price, price_us               FROM DS_PRICES              WHERE update_stamp IN
(TO_DATE(:p1, 'MM/DD/YYYY'),                         TO_DATE(:p2, 'MM/DD/YYYY'),
            TO_DATE(:p3, 'MM/DD/YYYY'),                         TO_DATE(:p4, 'MM/DD/YYYY'),
                       TO_DATE(:p5, 'MM/DD/YYYY'),                         TO_DATE(:p6,
'MM/DD/YYYY'),                         TO_DATE(:p7, 'MM/DD/YYYY'),
TO_DATE(:p8, 'MM/DD/YYYY'),                         TO_DATE(:p9, 'MM/DD/YYYY'))
  AND DS_ID = :p10 ORDER BY UPDATE_STAMP


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3819973762

-----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |        |       |     3 (100)|          |       |       |
|   1 |  PARTITION RANGE INLIST|           |      1 |    23 |     3  (34)| 00:00:01 | KEY(I | KEY(I |
|   2 |   SORT ORDER BY        |           |      1 |    23 |     3  (34)| 00:00:01 |       |       |
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |      1 |    23 |     2   (0)| 00:00:01 | KEY(I | KEY(I |
-----------------------------------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   3 - SEL$1 / DS_PRICES@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22],
       "PRICE_US"[NUMBER,22]
   2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYY
       Y')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]

Note
-----

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level



What is confusing is that this plan changes during the execution of that job. After some time we found different plan which was using index. I wonder how plan can change during execution of the job when there is chance for aging of the sql from the shared_pool.

Index not used

Kulkarni, September 17, 2011 - 9:33 am UTC

Hi Tom,
I need to add some more clarification. Last day we put the hint in the query to force use of Index as a work around for the time being. You can see even after putting the hit it is not using the index always though it uses it occasionally.

I am at my wits end to understand this behavior. How it can use different plans during execution of the same query multiple times with different bind variables? Is it right that bind peeking happens only once during hard parse?

Index not used

Kulkarni, September 17, 2011 - 9:43 am UTC

Tom,
All the information I have provided above is straight from the production machine which is having issue. This table is having nearly 240 million records.

September 17, 2011 - 11:27 am UTC

I'm sorry - but that table is seen as being very very very very small. do you see the cost of 2? If the table were large, the cost of scanning would be large regardless of the number of rows the scan returned.

look at your statistics - they are not correct, they do not represent what the data looks like. Also, look at how and when you gather statistics, it sounds like someone might be firing them off at various points in the day (plan changes all of sudden - query didn't change).

something is fundamentally wrong with your statistics.

and you didn't give me what I asked for- I need you to run the query with the gather plan statistics hint and then use dbms_xplan. I need to see e-rows and A-ROWS - estimated versus actual rows.

Index not used

Kulkarni, September 19, 2011 - 7:05 am UTC

Hi Tom,
Here is the output of dbms_xplan after putting the hint in the query. Surprisingly now it is using index though we have not changed anything. However we removed the hint to force index.

SQL> select * from TABLE(dbms_xplan.display_cursor('&SQL_ID', &CHILD,'ALL IOSTATS LAST'));
Enter value for sql_id: 0t2ups4rmfqzv
Enter value for child: 1
old   1: select * from TABLE(dbms_xplan.display_cursor('&SQL_ID', &CHILD,'ALL IOSTATS LAST'))
new   1: select * from TABLE(dbms_xplan.display_cursor('0t2ups4rmfqzv', 1,'ALL IOSTATS LAST'))

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0t2ups4rmfqzv, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */   TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us               FROM DS_PRICES              WHERE update_stamp IN
(TO_DATE(:p1, 'MM/DD/YYYY'),                         TO_DATE(:p2, 'MM/DD/YYYY'),                         TO_DATE(:p3, 'MM/DD/YYYY'),
TO_DATE(:p4, 'MM/DD/YYYY'),                         TO_DATE(:p5, 'MM/DD/YYYY'),                         TO_DATE(:p6, 'MM/DD/YYYY'),
TO_DATE(:p7, 'MM/DD/YYYY'),                         TO_DATE(:p8, 'MM/DD/YYYY'),                         TO_DATE(:p9, 'MM/DD/YYYY'))                AND DS_ID = :
p10

ORDER BY UPDATE_STAMP

Plan hash value: 1208038665

----------------------------------------------------------------------------------------------------------------------------------------------------------------
------------

| Id  | Operation                           | Name              | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffer
s | Reads  |

----------------------------------------------------------------------------------------------------------------------------------------------------------------
------------

|   0 | SELECT STATEMENT                    |                   |      1 |        |       |     9 (100)|          |       |       |      5 |00:00:00.03 |      1
7 |     15 |

|   1 |  PARTITION HASH SINGLE              |                   |      1 |      5 |   115 |     9   (0)| 00:00:01 |   KEY |   KEY |      5 |00:00:00.03 |      1
7 |     15 |

|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES         |      1 |      5 |   115 |     9   (0)| 00:00:01 | ROW L | ROW L |      5 |00:00:00.03 |      1
7 |     15 |

|*  3 |    INDEX RANGE SCAN                 | DS_ID_UPDT_GLOBAL |      1 |      5 |       |     8   (0)| 00:00:01 |   KEY |   KEY |      5 |00:00:00.02 |      1
4 |     12 |

----------------------------------------------------------------------------------------------------------------------------------------------------------------
------------


Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / DS_PRICES@SEL$1
   3 - SEL$1 / DS_PRICES@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DS_ID"=:P10)
       filter(("UPDATE_STAMP"=TO_DATE(:P1,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P2,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P3,'MM/DD/YYYY') OR
              "UPDATE_STAMP"=TO_DATE(:P4,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P5,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P6,'MM/DD/YYYY') OR
              "UPDATE_STAMP"=TO_DATE(:P7,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P8,'MM/DD/YYYY') OR "UPDATE_STAMP"=TO_DATE(:P9,'MM/DD/YYYY')))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   2 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   3 - "DS_PRICES".ROWID[ROWID,10], "UPDATE_STAMP"[DATE,7]


41 rows selected.

Index not used

Kulkarni, September 20, 2011 - 5:42 am UTC

Hi Tom,
Today again I captured plan while the job is running. Surprisingly now it is doing full table scans and I can see scattered reads.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0t2ups4rmfqzv, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */ TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DS_PRICES WHERE
update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'), TO_DATE(:p2, 'MM/DD/YYYY'), TO_DATE(:p3,
'MM/DD/YYYY'), TO_DATE(:p4, 'MM/DD/YYYY'), TO_DATE(:p5, 'MM/DD/YYYY'),
TO_DATE(:p6, 'MM/DD/YYYY'), TO_DATE(:p7, 'MM/DD/YYYY'), TO_DATE(:p8, 'MM/DD/YYYY'),
TO_DATE(:p9, 'MM/DD/YYYY')) AND DS_ID = :p10 ORDER BY UPDATE_STAMP

Plan hash value: 3819973762

------------------------------------------------------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
-

| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop | A-Rows | A-Time | Buffers | Reads
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

| 0 | SELECT STATEMENT | | 1 | | | 3 (100)| | | | 9 |00:02:18.11 | 798K| 792K
|

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | PARTITION RANGE INLIST| | 1 | 1 | 23 | 3 (34)| 00:00:01 | KEY(I | KEY(I | 9 |00:02:18.11 | 798K| 792K
|

| 2 | SORT ORDER BY | | 1 | 1 | 23 | 3 (34)| 00:00:01 | | | 9 |00:02:18.11 | 798K| 792K
|

|* 3 | TABLE ACCESS FULL | DS_PRICES | 1 | 1 | 23 | 2 (0)| 00:00:01 | KEY(I | KEY(I | 9 |00:01:03.58 | 798K| 792K
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
3 - SEL$1 / DS_PRICES@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]

37 rows selected.

Here are the bind values :

NAME POSITION DATATYPE_STRING WAS VALUE_STRI ANYDATA.ACCESSTIMESTAMP(VALUE_ANYDATA)
---------- ---------- -------------------- --- ---------- ----------------------------------------
:P1 1 VARCHAR2(128) YES 02/21/2007
:P2 2 VARCHAR2(128) YES 02/22/2007
:P3 3 VARCHAR2(128) YES 02/23/2007
:P4 4 VARCHAR2(128) YES 03/01/2007
:P5 5 VARCHAR2(128) YES 03/05/2007
:P6 6 VARCHAR2(128) YES 03/22/2007
:P7 7 VARCHAR2(128) YES 04/23/2007
:P8 8 VARCHAR2(128) YES 05/23/2007
:P9 9 VARCHAR2(128) YES 08/21/2007
:P10 10 VARCHAR2(32) YES 280123

September 20, 2011 - 6:40 pm UTC

can you please verify your statistics. A full scan, even if it returns a single row, would not have a cost of "3" for a "big" table.



ops$tkyte%ORA11GR2> create table t as select * from all_objects where rownum =1;

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t t_first where object_id = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    73 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    73 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=42)

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> insert into t select * from all_objects;

72250 rows created.

ops$tkyte%ORA11GR2> commit;

Commit complete.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t t_second where object_id = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    73 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    73 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=42)

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> select * from t t_third where object_id = 42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    97 |   309   (1)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    97 |   309   (1)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=42)

ops$tkyte%ORA11GR2> set autotrace off

your statistics are *not right* here at all.

Index not used

Kulkarni, September 20, 2011 - 6:28 am UTC

We also observed that this is taking more time to process records for some value of ds_id column where as it is very fast processing some values.

Index not used

Kulkarni, September 21, 2011 - 6:23 am UTC

Tom,
Suspecting statistics are playing the foul game I gathered statistics again today but with SKEWONLY option as I saw that this column is highly skewed(earlier I did not use this hence it created histograms with 74 buckets)

SQL> select min(cnt), max(cnt), avg(cnt), stddev(cnt)
  from ( select ds_id, count(*) cnt from datastream.ds_prices group by ds_id );  2

  MIN(CNT)   MAX(CNT)   AVG(CNT) STDDEV(CNT)
---------- ---------- ---------- -----------
         1       2584 1074.16035  960.363436

EXEC DBMS_STATS.gather_table_stats (ownname=>'DATASTREAM',tabname=>'DS_PRICES',estimate_percent=>dbms_stats.auto_sample_size, method_opt=> 'FOR ALL INDEXES 

FOR ALL INDEXED COLUMNS SIZE SKEWONLY', GRANULARITY => 'APPROX_GLOBAL AND PARTITION' , cascade=> true);

And I see that histogram is also created on the ds_id column with 254 buckets.


COLUMN_NAME                    NUM_DISTINCT LOW_VALUE
------------------------------ ------------ ----------------------------------------------------------------
HIGH_VALUE                                                          DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANALYZED      SAMPLE_SIZE GLO HISTOGRAM
---------------------------------------------------------------- ---------- ---------- ----------- ------------------ ----------- --- ---------------
DS_ID                                191450 313330303035
533533313734                                                     6.7676E-06          0         254 21-SEP-11               500731 YES HEIGHT BALANCED

UPDATE_STAMP                           2544 78650C0A010101
786F0910010101                                                   .000393082          0           1 21-SEP-11                 5048 YES NONE

Even after gathering statistics again today again it is doing full table scan. Here is the plan I captured today.

SQL_ID  0t2ups4rmfqzv, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */   TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us               FROM DS_PRICES              WHERE
update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'),                         TO_DATE(:p2, 'MM/DD/YYYY'),                         TO_DATE(:p3,
'MM/DD/YYYY'),                         TO_DATE(:p4, 'MM/DD/YYYY'),                         TO_DATE(:p5, 'MM/DD/YYYY'),
TO_DATE(:p6, 'MM/DD/YYYY'),                         TO_DATE(:p7, 'MM/DD/YYYY'),                         TO_DATE(:p8, 'MM/DD/YYYY'),
             TO_DATE(:p9, 'MM/DD/YYYY'))                AND DS_ID = :p10 ORDER BY UPDATE_STAMP

Plan hash value: 3819973762

-------------------------------------------------------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |      1 |        |       |     3 (100)|          |       |       |      9 |00:01:41.89 |     798K|    744K|
|   1 |  PARTITION RANGE INLIST|           |      1 |      1 |    23 |     3  (34)| 00:00:01 | KEY(I | KEY(I |      9 |00:01:41.89 |     798K|    744K|
|   2 |   SORT ORDER BY        |           |      1 |      1 |    23 |     3  (34)| 00:00:01 |       |       |      9 |00:01:41.89 |     798K|    744K|
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |      1 |      1 |    23 |     2   (0)| 00:00:01 | KEY(I | KEY(I |      9 |00:00:13.38 |     798K|    744K|
-------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
   1 - SEL$1
   3 - SEL$1 / DS_PRICES@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))

Column Projection Information (identified by operation id):
-----------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
   1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]


37 rows selected.


I am at my wits end now. This is working fine in QA after gathering statistics which created histogram with 254 buckets. Everything is identical between QA and production. Then why there is issue in production? Same job completes in QA in 3 hours but it runs for more than 16 hours in production.

September 21, 2011 - 9:42 am UTC

are you telling me this table is HUGE?

|* 3 | TABLE ACCESS FULL | DS_PRICES | 1 | 1 | 23 | 2
(0)| 00:00:01 | KEY(I | KEY(I | 9 |00:00:13.38 | 798K| 744K|

that cost is 2 to full scan it.

do this - just

set autotrace traceonly explain
select * from ds_prices;
set autotrace off

show the output.

Index not used

Kulkarni, September 21, 2011 - 7:58 am UTC

Tom,
I used ADVANCED option in dbms_xplan and got the following output.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0t2ups4rmfqzv, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */ TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DS_PRICES WHERE
update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'), TO_DATE(:p2, 'MM/DD/YYYY'), TO_DATE(:p3,
'MM/DD/YYYY'), TO_DATE(:p4, 'MM/DD/YYYY'), TO_DATE(:p5, 'MM/DD/YYYY'),
TO_DATE(:p6, 'MM/DD/YYYY'), TO_DATE(:p7, 'MM/DD/YYYY'), TO_DATE(:p8, 'MM/DD/YYYY'),
TO_DATE(:p9, 'MM/DD/YYYY')) AND DS_ID = :p10 ORDER BY UPDATE_STAMP

Plan hash value: 3819973762

------------------------------------------------------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
-

| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop | A-Rows | A-Time | Buffers | Reads
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

| 0 | SELECT STATEMENT | | 1 | | | 3 (100)| | | | 8 |00:00:27.03 | 241K| 223K
|

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | PARTITION RANGE INLIST| | 1 | 1 | 23 | 3 (34)| 00:00:01 | KEY(I | KEY(I | 8 |00:00:27.03 | 241K| 223K
|

| 2 | SORT ORDER BY | | 1 | 1 | 23 | 3 (34)| 00:00:01 | | | 8 |00:00:27.03 | 241K| 223K
|

|* 3 | TABLE ACCESS FULL | DS_PRICES | 1 | 1 | 23 | 2 (0)| 00:00:01 | KEY(I | KEY(I | 8 |00:00:04.51 | 241K| 223K
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
3 - SEL$1 / DS_PRICES@SEL$1

Outline Data
-------------

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "DS_PRICES"@"SEL$1")
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------

1 - (VARCHAR2(30), CSID=873): '07/21/1989'
2 - (VARCHAR2(30), CSID=873): '07/24/1989'
3 - (VARCHAR2(30), CSID=873): '07/25/1989'
4 - (VARCHAR2(30), CSID=873): '07/31/1989'
5 - (VARCHAR2(30), CSID=873): '08/03/1989'
6 - (VARCHAR2(30), CSID=873): '08/21/1989'
7 - (VARCHAR2(30), CSID=873): '09/22/1989'
8 - (VARCHAR2(30), CSID=873): '10/23/1989'
9 - (VARCHAR2(30), CSID=873): '01/22/1990'

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
10 - (VARCHAR2(30), CSID=873): '932121'

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]

Index not used

Kulkarni, September 21, 2011 - 8:02 am UTC

This is more readable format.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0t2ups4rmfqzv, child number 1
-------------------------------------
SELECT /*+ gather_plan_statistics */ TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DS_PRICES WHERE
update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'), TO_DATE(:p2, 'MM/DD/YYYY'), TO_DATE(:p3,
'MM/DD/YYYY'), TO_DATE(:p4, 'MM/DD/YYYY'), TO_DATE(:p5, 'MM/DD/YYYY'),
TO_DATE(:p6, 'MM/DD/YYYY'), TO_DATE(:p7, 'MM/DD/YYYY'), TO_DATE(:p8, 'MM/DD/YYYY'),
TO_DATE(:p9, 'MM/DD/YYYY')) AND DS_ID = :p10 ORDER BY UPDATE_STAMP

Plan hash value: 3819973762

------------------------------------------------------------------------------------------------------------------------------------------------------
-

| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop | A-Rows | A-Time | Buffers | Reads
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

| 0 | SELECT STATEMENT | | 1 | | | 3 (100)| | | | 8 |00:00:15.30 | 241K| 239K
|

| 1 | PARTITION RANGE INLIST| | 1 | 1 | 23 | 3 (34)| 00:00:01 | KEY(I | KEY(I | 8 |00:00:15.30 | 241K| 239K
|

| 2 | SORT ORDER BY | | 1 | 1 | 23 | 3 (34)| 00:00:01 | | | 8 |00:00:15.30 | 241K| 239K
|

|* 3 | TABLE ACCESS FULL | DS_PRICES | 1 | 1 | 23 | 2 (0)| 00:00:01 | KEY(I | KEY(I | 8 |00:00:14.84 | 241K| 239K
|

------------------------------------------------------------------------------------------------------------------------------------------------------
-

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
3 - SEL$1 / DS_PRICES@SEL$1

Outline Data
-------------

/*+
BEGIN_OUTLINE_DATA
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "DS_PRICES"@"SEL$1")
END_OUTLINE_DATA
*/

Peeked Binds (identified by position):
--------------------------------------

1 - (VARCHAR2(30), CSID=873): '07/21/1989'
2 - (VARCHAR2(30), CSID=873): '07/24/1989'
3 - (VARCHAR2(30), CSID=873): '07/25/1989'
4 - (VARCHAR2(30), CSID=873): '07/31/1989'
5 - (VARCHAR2(30), CSID=873): '08/03/1989'
6 - (VARCHAR2(30), CSID=873): '08/21/1989'
7 - (VARCHAR2(30), CSID=873): '09/22/1989'
8 - (VARCHAR2(30), CSID=873): '10/23/1989'
9 - (VARCHAR2(30), CSID=873): '01/22/1990'
10 - (VARCHAR2(30), CSID=873): '932121'

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]

64 rows selected.

Index not used

Kulkarni, September 21, 2011 - 11:48 pm UTC

Tom,
I am running the query select * from ds_prices with autotrace on. It will take lot of time. Mean time I captured the plan it is using through dbms_xplan.

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dgv47xqp01d93, child number 0
-------------------------------------
select count(*) from datastream.ds_prices

Plan hash value: 2050148873

-------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | Cost (%CPU)| E-Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | 339K(100)| | | |
| 1 | SORT AGGREGATE | | 1 | | | | |
| 2 | PARTITION RANGE ALL| | 237M| 339K (1)| 01:07:51 | 1 | 8 |
| 3 | TABLE ACCESS FULL | DS_PRICES | 237M| 339K (1)| 01:07:51 | 1 | 8 |
-------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
3 - SEL$1 / DS_PRICES@SEL$1

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) COUNT(*)[22]

Here you can how big is the table.

September 22, 2011 - 5:32 pm UTC

I said to do this:

do this - just

set autotrace traceonly explain
select * from ds_prices;
set autotrace off

which would take a millisecond.

Notice the cost of full scanning is 339k. Why do your full scan costs above show up as tiny - SOMETHING IS VERY FISHY HERE - I've asked before if you are posting information from the real database and I seriously doubt you are. Everything is suspect here.

The number of rows coming back will not influence the full scan cost, consider:

big_table%ORA11GR2> set autotrace traceonly explain
big_table%ORA11GR2> select * from big_table;

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |  1000K|    94M|  3950   (1)| 00:00:48 |
|   1 |  TABLE ACCESS FULL| BIG_TABLE |  1000K|    94M|  3950   (1)| 00:00:48 |
-------------------------------------------------------------------------------

big_table%ORA11GR2> select * from big_table where owner = 'x';

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    99 |  3942   (1)| 00:00:48 |
|*  1 |  TABLE ACCESS FULL| BIG_TABLE |     1 |    99 |  3942   (1)| 00:00:48 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OWNER"='x')

the cost hardly goes down. Now until you can explain how the cost of your full scan above is 339k, but the other full scan costs are like "2" - I'm not even going to look at this anymore.

You are not using the 'real' system - all bets are off, it is not worth my time to even look at this stuff - I've said that before.

Index not used

Kulkarni, September 22, 2011 - 12:05 am UTC

Tom,
Here is the output of the query.


SQL> set timing on
SQL> set autotrace trace exp stat
SQL> select count(*) from datastream.ds_prices;

Elapsed: 00:20:54.63

Execution Plan
----------------------------------------------------------
Plan hash value: 2050148873

--------------------------------------------------------------------------------
----------

| Id  | Operation            | Name      | Rows  | Cost (%CPU)| Time     | Pstar
t| Pstop |

--------------------------------------------------------------------------------
----------

|   0 | SELECT STATEMENT     |           |     1 |   339K  (1)| 01:07:51 |
 |       |

|   1 |  SORT AGGREGATE      |           |     1 |            |          |
 |       |

|   2 |   PARTITION RANGE ALL|           |   237M|   339K  (1)| 01:07:51 |     1
 |     8 |

|   3 |    TABLE ACCESS FULL | DS_PRICES |   237M|   339K  (1)| 01:07:51 |     1
 |     8 |

--------------------------------------------------------------------------------
----------



Statistics
----------------------------------------------------------
       3701  recursive calls
          0  db block gets
    1447237  consistent gets
    1441841  physical reads
          0  redo size
        519  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         75  sorts (memory)
          0  sorts (disk)
          1  rows processed

Index not used

Kulkarni, September 22, 2011 - 3:23 am UTC

Here the issue is when this query goes for a full table scan this job runs for ever. Day before yesterday it ran for more than 16 hours but even did not finish. We killed it.

Yesterday this job was running for long and when checked this query was doing full table scan. We flushed the shared pool after waiting for more than 3 hours. Then the plan suddenly changed to index reads and got finished within 15 minutes.

Index not used

Kulkarni, September 22, 2011 - 11:09 pm UTC

Tom,
I value your time. But at the same time I am not pasting here anything that is garbage. This is production system. I strongly suspect that this is due to bind peeking. As I have already provided the information above regarding the skewness of the data on column ds_id.

Do you suggest any other way to stop bind peeking than putting that undocumented parameter? I was not successful even after removing histograms on this column, bind peeking happened even after removing histograms.

I stress again, all the outputs that I paste here are from production system though they appear to be fishy.

September 23, 2011 - 9:32 am UTC

look - the cost of a full scan on a large table is not affected by the where clause too much. The cost is just wrong here - there is no way a full scan of that table would have a cost of "2".

do this for me, in a single - continuous, unedited cut and paste:


ops$tkyte%ORA11GR2> select num_rows, blocks, avg_row_len, sample_size, last_analyzed
  2    from user_tab_partitions where table_name = 'your table name';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANAL
---------- ---------- ----------- ----------- ---------
     24044        174          44       24044 23-SEP-11
     24043        174          44       24043 23-SEP-11
     24044        174          44       24044 23-SEP-11

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select num_rows, blocks, avg_row_len, sample_size, last_analyzed
  2    from user_tables where table_name = 'your table name';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANAL
---------- ---------- ----------- ----------- ---------
     72131        522          44       72131 23-SEP-11

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain

ops$tkyte%ORA11GR2> select * from your table name;
ops$tkyte%ORA11GR2> your_other_query

ops$tkyte%ORA11GR2> set autotrace off

Index not used

Kulkarni, September 23, 2011 - 11:43 am UTC

Tom,
Here is the output as you have asked to get.


SQL> select num_rows, blocks, avg_row_len, sample_size, last_analyzed from dba_tab_partitions where table_name = 'DS_PRICES';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANALYZED
---------- ---------- ----------- ----------- ------------------
         0          0           0             22-SEP-11
         0          0           0             22-SEP-11
         0          0           0             22-SEP-11
         0          0           0             22-SEP-11
         0          0           0             22-SEP-11
  66833049     432638          46    66833049 22-SEP-11
 131426042     864382          45   131426042 22-SEP-11
  39774332     241406          45    39774332 22-SEP-11

8 rows selected.

Elapsed: 00:00:00.15
SQL> select num_rows, blocks, avg_row_len, sample_size, last_analyzed from dba_tables where table_name = 'DS_PRICES';

  NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANALYZED
---------- ---------- ----------- ----------- ------------------
 238033423    1538426          46   238033423 22-SEP-11

Elapsed: 00:00:00.06
SQL> set autotrace traceonly explain
SQL> select * from datastream.ds_prices;
Elapsed: 00:00:00.15

Execution Plan
----------------------------------------------------------
Plan hash value: 419681141

-------------------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |   238M|    10G|   343K  (2)| 01:08:37 |       |       |
|   1 |  PARTITION RANGE ALL|           |   238M|    10G|   343K  (2)| 01:08:37 |     1 |     8 |
|   2 |   TABLE ACCESS FULL | DS_PRICES |   238M|    10G|   343K  (2)| 01:08:37 |     1 |     8 |
-------------------------------------------------------------------------------------------------

SQL> SELECT /*+ gather_plan_statistics */   TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us FROM DATASTREAM.DS_PRICES WHERE
update_stamp IN (TO_DATE(:p1, '07/24/1989'),
                 TO_DATE(:p2, '07/25/1989'),
  2    3    4                   TO_DATE(:p3, '07/31/1989'),
  5                   TO_DATE(:p4, '08/03/1989'),
  6                   TO_DATE(:p5, '09/22/1989'),
  7                   TO_DATE(:p6, '10/23/1989'),
  8                   TO_DATE(:p7, '07/21/1989'),
  9                   TO_DATE(:p8, '01/22/1990'),
 10                   TO_DATE(:p9, '08/21/1989')) AND DS_ID = 932121 ORDER BY UPDATE_STAMP;
Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 3819973762

----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |     4 |    96 |   342K  (2)| 01:08:25 |       |       |
|   1 |  PARTITION RANGE INLIST|           |     4 |    96 |   342K  (2)| 01:08:25 |KEY(I) |KEY(I) |
|   2 |   SORT ORDER BY        |           |     4 |    96 |   342K  (2)| 01:08:25 |       |       |
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |     4 |    96 |   342K  (2)| 01:08:25 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(TO_NUMBER("DS_ID")=932121 AND ("UPDATE_STAMP"=TO_DATE(:P1,'07/24/1989') OR
              "UPDATE_STAMP"=TO_DATE(:P2,'07/25/1989') OR "UPDATE_STAMP"=TO_DATE(:P3,'07/31/1989') OR
              "UPDATE_STAMP"=TO_DATE(:P4,'08/03/1989') OR "UPDATE_STAMP"=TO_DATE(:P5,'09/22/1989') OR
              "UPDATE_STAMP"=TO_DATE(:P6,'10/23/1989') OR "UPDATE_STAMP"=TO_DATE(:P7,'07/21/1989') OR
              "UPDATE_STAMP"=TO_DATE(:P8,'01/22/1990') OR "UPDATE_STAMP"=TO_DATE(:P9,'08/21/1989')))

SQL> set autotrace off

September 23, 2011 - 12:51 pm UTC

This is a bind peeking issue probably.

If you hard parse the query with binds that point to a single partition - then we will optimize the query using LOCAL partition statistics.

So, if you initially parse the query with binds that let the optimizer know "I'm going to hit partition 1 only - and I see partition 1 is empty - zero rows - I'll use a full scan".

If you initially parse the query with binds that let the optimizer know "I'm going to hit the partition with 864382 blocks and lots of rows - I'll use an index" - you get the index plan.

So, it is due to the massive skew of data here - you have some empty partitions and some very full ones.

You would be best off by either

o not using binds against the partition key, so that you get different plans for different partitions. This is only going to be the case if you do not execute these queries hundreds of times per day (that is, it is infrequently executed)

o not having local partition statistics, just have global statistics. Then only one plan will be generated.

Index not used

Kulkarni, September 25, 2011 - 5:17 pm UTC

Tom,
As this query is used n-number of times we can not avoid using binds. Hence I followed your second suggestion and collected Global statistics as below,

EXEC DBMS_STATS.gather_table_stats (ownname=>'DATASTREAM',tabname=>'DS_PRICES',estimate_percent=>dbms_stats.auto_sample_size, method_opt=> 'FOR ALL INDEXES FOR ALL INDEXED COLUMNS SIZE SKEWONLY' , GRANULARITY => 'GLOBAL' , cascade=> true);

But this did not help as again this job went for full table scan. Again after flushing the shared_pool immediately it changed to index read.

However I verified that in both the cases when it did peek at the bind values no rows were returned. I mean in both the cases partition with 0 rows was hit. Yet in the first case it went for full scan and after flushing it went on to read indexes.

Is there a way to come of this situation?

September 25, 2011 - 5:39 pm UTC

did you remove the local statistics first? remove the local partition statistics if you want the global to be used - else you have changed nothing. The bind peeking will have isolated the query to a single partition and when that happens the optimizer will use local partition statistics - which is bad for you since you have such a wild skew of data.

either that, or just get rid of those empty partitions - they are messing up the plans big time.

Index not used

Kulkarni, September 25, 2011 - 5:22 pm UTC

Tom,
Just to add further,
In our QA environment data distribution is also same as in production.
SQL>  select TABLE_NAME,PARTITION_NAME,NUM_ROWS,SAMPLE_SIZE,GLOBAL_STATS,USER_STATS from dba_tab_statistics where table_name='DS_PRICES';

TABLE_NAME                     PARTITION_NAME                   NUM_ROWS SAMPLE_SIZE GLO USE
------------------------------ ------------------------------ ---------- ----------- --- ---
DS_PRICES                                                      193880960       55705 YES NO
DS_PRICES                      DS_PRICES_P1                            0             YES NO
DS_PRICES                      DS_PRICES_P2                            0             YES NO
DS_PRICES                      DS_PRICES_P3                            0             YES NO
DS_PRICES                      DS_PRICES_P4                            0             YES NO
DS_PRICES                      DS_PRICES_P5                            0             YES NO
DS_PRICES                      DS_PRICES_P6                     67332112       50189 YES NO
DS_PRICES                      DS_PRICES_P7                    126663852       56891 YES NO
DS_PRICES                      DS_PRICES_P8                         1236        1236 YES NO

But here we do not have this issue. We tested this job many times before moving into production. I fail to understand why such weird behavior only in production.

September 25, 2011 - 5:40 pm UTC

then in QA submit a query that hits partition DS_PRICES_P3 based on its bind inputs (make sure it hard parses) - and then submit the same query with binds that hit DB_PRICES_P7 and you'll see the same issue.

Index not used

Kulkarni, September 26, 2011 - 1:34 am UTC

Tom,
I removed partition statistics and ran the query. It ran fine for couple of times and then again went into full scan and almost hung.


SQL_ID  fhwr7g17mas9b, child number 1
-------------------------------------
SELECT    TO_CHAR(update_stamp, 'MM/DD/YYYY'), price, price_us               FROM DS_PRICES
            WHERE update_stamp IN (TO_DATE(:p1, 'MM/DD/YYYY'),
TO_DATE(:p2, 'MM/DD/YYYY'),                         TO_DATE(:p3, 'MM/DD/YYYY'),
           TO_DATE(:p4, 'MM/DD/YYYY'),                         TO_DATE(:p5, 'MM/DD/YYYY'),
                      TO_DATE(:p6, 'MM/DD/YYYY'),                         TO_DATE(:p7,
'MM/DD/YYYY'),                         TO_DATE(:p8, 'MM/DD/YYYY'),
TO_DATE(:p9, 'MM/DD/YYYY'))                AND DS_ID = :p10 ORDER BY UPDATE_STAMP

Plan hash value: 3819973762

-----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |        |       |     3 (100)|          |       |       |
|   1 |  PARTITION RANGE INLIST|           |      1 |    41 |     3  (34)| 00:00:01 | KEY(I | KEY(I |
|   2 |   SORT ORDER BY        |           |      1 |    41 |     3  (34)| 00:00:01 |       |       |
|*  3 |    TABLE ACCESS FULL   | DS_PRICES |      1 |    41 |     2   (0)| 00:00:01 | KEY(I | KEY(I |
-----------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   3 - SEL$1 / DS_PRICES@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.5')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "DS_PRICES"@"SEL$1")
      END_OUTLINE_DATA
  */

Peeked Binds (identified by position):
--------------------------------------

   1 - (VARCHAR2(30), CSID=873): '09/07/1989'
   2 - (VARCHAR2(30), CSID=873): '09/08/1989'
   3 - (VARCHAR2(30), CSID=873): '09/11/1989'
   4 - (VARCHAR2(30), CSID=873): '09/15/1989'
   5 - (VARCHAR2(30), CSID=873): '09/18/1989'
   6 - (VARCHAR2(30), CSID=873): '10/06/1989'
   7 - (VARCHAR2(30), CSID=873): '11/07/1989'
   8 - (VARCHAR2(30), CSID=873): '12/07/1989'
   9 - (VARCHAR2(30), CSID=873): '03/07/1990'
  10 - (VARCHAR2(30), CSID=873): '944295'

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("DS_ID"=:P10 AND INTERNAL_FUNCTION("UPDATE_STAMP")))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYYY')[10], "PRICE"[NUMBER,22],
       "PRICE_US"[NUMBER,22]
   2 - (#keys=1) "UPDATE_STAMP"[DATE,7], TO_CHAR(INTERNAL_FUNCTION("UPDATE_STAMP"),'MM/DD/YYY
       Y')[10], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]
   3 - "UPDATE_STAMP"[DATE,7], "PRICE"[NUMBER,22], "PRICE_US"[NUMBER,22]


SQL> select NUM_ROWS,PARTITION_NAME,AVG_ROW_LEN,SAMPLE_SIZE,GLOBAL_STATS from dba_tab_statistics where TABLE_NAME='DS_PRICES';

  NUM_ROWS PARTITION_NAME                 AVG_ROW_LEN SAMPLE_SIZE GLO
---------- ------------------------------ ----------- ----------- ---
                                                                  NO
           DS_PRICES_P1                                           NO
           DS_PRICES_P2                                           NO
           DS_PRICES_P3                                           NO
           DS_PRICES_P4                                           NO
           DS_PRICES_P5                                           NO
           DS_PRICES_P6                                           NO
           DS_PRICES_P7                                           NO
           DS_PRICES_P8                                           NO

September 26, 2011 - 7:30 am UTC

you have a stored outline in there for this query, it says to do a full scan. It is right there staring you in the face????

Index not used

Kulkarni, September 27, 2011 - 12:50 am UTC

Tom,
I never created this outline. I am also not able to find the details using dba_outlines.

SQL>  select NAME,OWNER,USED,TIMESTAMP,SQL_TEXT,ENABLED,FORMAT from dba_outlines;

no rows selected

Index not used

Kulkarni, September 27, 2011 - 12:56 am UTC

SQL> select * from dba_outlines;

no rows selected

SQL> select * from dba_outline_hints
  2  ;

no rows selected

SQL>

Index not used

Kulkarni, October 03, 2011 - 3:52 am UTC

Tom,
Any clue how this stored outline came into picture though I was not able to find one?

As of now I have resolved the issue by removing partition indexes and putting index hint in the query. It is going fine now and much faster than before partition.

Index not used whenever '>' is used for date column

Deepak, October 18, 2011 - 5:19 am UTC

Dear Tom,

I am facing issue that index is not getting used whenever for the following type of query.

select acid from lht where sol_id='101' and (lim_exp_date between TO_DATE('26-05-2011', 'dd-mm-yy') AND TO_DATE('31-03-2099', 'dd-mm-yy'));

OR

select acid from lht where sol_id='101' and lim_exp_date>TO_DATE ('26-05-2011', 'dd-mm-yy');

Please suggest.

Thanks & Regards,
DP

October 18, 2011 - 4:47 pm UTC

suggest you look at the estimated cardinality in the plan. Suggest that perhaps a full scan is what should be happening due to a high estimated cardinality.

Are you sure the index is the right access path.

Do this for us:

http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/

run the query that way - with gather_plan_statistics - and post the dbms_xplan output. We are interested in the a-row and e-rows columns.

Also, if you can, post a tkprof of the query (without any hints) and a tkprof of the query with the index hint to make it use the index.

B Tree indexes Vs Bitmap indexes.

Amir Riaz, October 19, 2011 - 2:28 am UTC

Tom,

Accept my analogy for asking this question, sound basic one but I have not able to find its answer and proof from some times, so please help.

I know B tree indexes fetches one block at time of table on which index reside. but in case of range scan, does Oracle fetches multiblock. that is, it uses multi block read count (one IO call) or it still fetch the range of blocks from table in multiple calls.

Another point of discussion in our warehouse is, when we use bitmap indexes. for each value in index there are huge number of blocks to be fetch ( do to low cardinality). Here did Oracle fetch by multiple read counts (one io call) or by one block (multiple io calls)

The objective of this research is to clear out thoughts that indexes always do one block IO, they dont do multiple block io.

October 19, 2011 - 4:47 pm UTC

It cannot use multi-block IO in an index range scan. That is because

a) the index leaf blocks are not stored next to each other - they are stored block by block and leaf blocks that are logically next to each other in the data structure might be physically far away from each other (they could even be on separate disks).

b) the table blocks as retrieved via an index are not necessarily next to each other either. the first index entry might point to block 50, the second to block 100, the third to block 1 and so on.

bitmap indexes - same deal, same exact deal. single block IO's

surprised.

A reader, October 20, 2011 - 2:20 am UTC

b) the table blocks as retrieved via an index are not necessarily next to each other either. the first index entry might point to block 50, the second to block 100, the third to block 1 and so on.

Lets leave the B*Tree index, do you think Bit map index can access data with multiblock reads.

reason is, in data warehouses, the data is loaded in a continuous manner. The data is normalized and some times sorted (suppose).

now if i want to sum the sales of data by "Oracle" and "IBM". thats a continuous data, may be in three or four extent. If i create a bit map indexes, I should be able get those extents in 3-4 ios.

A block by block ios is too costly for SAN environments. if you have to read just 4 extents out of 1000 extents. here I am considering 1 extent = 8*128. so for reading 4 extents Oracle needs 4*128 io calls across the FC SAN. sound like place for optimization in Oracle.

I think there are something we are missing or Oracle lack. I am of the view point that oracle with bitmap index can do single block read and multiple block reads from the table. index should be read block by block does not bother me much.

October 20, 2011 - 2:37 am UTC

bitmap, b*tree, text, spatial - I don't care WHAT KIND of index it is. They all access data via a key which is sorted in some fashion. The data in the table is not likely to be sorted by that key - is it?

create bitmap index I on emp(job);

Is everyone with the JOB 'salesman' located next to each other in the table? Probably not, and even if they were by some accident of nature - we could not RELY on that fact.

Index range scanning uses single block IO.

The only thing that does multiblock IO is a full scan - either a full table scan or an index fast full scan (which reads the entire index, in an unsorted manner).

Think about this a bit longer - the indexes must do single block reads. just because your data is accidentally sorted (or so you THINK IT IS) - we don't know that nor can we rely on that.

Did you know that data might not go in in the order you think it does? Automatic segment space management for example purposely spreads inserts out over many blocks (it "de-orders" the data). Parallel operations - what about them? Concurrent operations - how about that? Space reusage - what about that?

The data might be *almost* sorted - but almost only counts in hand grenades and horse shoes.

Besides even if all of the Oracle data was 'together', what dictates where it STARTS and where it STOPS being all together?

clustering factor

A reader, October 20, 2011 - 8:12 am UTC

Tom,

This surprised me because its a non optimal behavior by Oracle the amount of single block IOs you have, it will increase the cost of query. Oracle has much to learn in data warehousing sometime i really feel like this. suppose I load data with following statement, in a uniformed extent tablespace. I load data

insert /*+append*/ t_load
select * from source
order by name;

commit;

create bitmap index text_index on t(name);

with ASM strip size 1MB and MBRC 1MB and Extent size 1MB the whole extent will be written and suppose the name is duplicate name "Tom" to two extents and remain other name are also of low cardinality.

Now in index key. Tom will have a low row-id and high row-id and a bitmap.

suppose this low row-id and high row-id is spans multiple extents or even one extent

does cluster factor does not looked up by Oracle. Oracle can evaluate that weather to do single block read from table or multiple block reads.

That is one big reason I raised this question and its a big fault with Oracle bitmap index and they need optimization.

October 20, 2011 - 5:49 pm UTC

This surprised me because its a non optimal behavior by Oracle the amount of

You really need to close your eyes for a moment and envision how indexes - any index - works. You have keys in the indexes, the keys point to rowids (whether is it a b*tree or bitmap). the rowid points to A ROW, not a range of rows, but a single row.

An index is used to find a rowid(s). These rowids are pointers, pointers are dereferenced one by one. There is no way to know that these rowids collectively point to a logically contiguous set of blocks. There is no way to know if there might exist some other rows in that range that DO NOT meet your criteria.

insert /*+append*/ t_load
select * from source
order by name;

commit;
</code>

and that does not guarantee the data will strictly be ordered by name. As I said, ASSM (automatic segment space management) can kick in and spread data out. Parallel query could kick in and spread data out.

And even if it doesn't. The database doesn't know that someone didn't subsequently issue:

delete from t_load where name = 'Frank';
insert into t_load (name) values ('Alice' );

guess where Alice might go? Right where Frank used to be.

You are making a naive assumption that your data in a HEAP table (look up heap as it relates to computer science...) will be ordered.

the cluster factor is a measure - a relative measure, not an exact measure over time - of how sorted the table data is with respect to the key values. It does not mean that the data is perfectly sorted, it is just a measure used by b*tree indexes to figure out how efficient a given index would be at retrieving data block by block from a table.

you are wrong in your final analysis, you need to get a better understanding of how data is stored in tables in general. There is NO GUARANTEE that your rows are in any given order, there are thousands of side effects that will cause that to be "not true".

what about partition

A reader, October 20, 2011 - 8:26 am UTC

keeping in view the above example, if i partition the table by date and on that data only "Tom" has made transaction

Then "Tom" will be in one partition, contain extents, here i have access data with index key "Tom" and whole partition will be access by block. Suppose "Tom" never made an transaction on another date. If Oracle read the extent by MBRC only one io is involved and access will be fast.

October 20, 2011 - 5:53 pm UTC

But, there will be no information that the optimizer can rely on that tells it that Tom is ONLY IN that single partition.

How does it know that Tom wasn't loaded six months ago? Statistics are imperfect and go stale as data is mucked around with over time. Just because six months ago the statistics gathering process said "hey, I didn't see Tom in this partition" - does NOT mean Tom did not exist in that partition.

Again, you are being somewhat too simplistic in your view of how things work *in general*. While you might know that tom only exists in that one partition (and you might be wrong after all - your knowledge might be imperfect as well) - the optimizer CANNOT make that assumption - if it did - it would get what is known as "the wrong answer"

hmm... that is strange.

Amir Riaz, October 20, 2011 - 8:53 am UTC

Tom,

A bitmap index with Key "Tom" on name column and as we know by the structure of bitmap indexes that bitmap indexes has low rowid and high rowid. Lets say these low and high rowid were spanned over 10 blocks in extent. Oracle will read all those 10 block in one Io.

Your analogy that all indexes will fetch one block kills the purpose of bitmap index. if the sole purpose of bitmap indexes was to decrease the size of bitmap indexes we can do this with index partition. The objective of bitmap indexes is to compact data and keep the data clustering map. which they keep in the form of low rowid and high rowid.

and i agree if they dont that bitmap indexes are useless for data warehouse purposes.

I think this topic need a bit of your attention.

October 20, 2011 - 6:12 pm UTC

<Your analogy that all indexes will fetch one block kills the purpose of bitmap
index.

how so, b*tree and bitmap indexes are useful in the following cases (from a presentation I just gave yesterday on effective indexing:)

So, the ‘rules are’
As the means to access rows in a table: You will read the index to get to a row in the table. Here you want to access a very small number of the rows in the table.

As the means to answer a query: The index contains enough information to answer the entire query—we will not have to go to the table at all. The index will be used as a “thinner” version of the table.

As the means to optimize for initial response time: You want to retrieve all of the rows in a table, including columns that are not in the index itself – in some sorted order – the index will possibly allow for immediate response (but slower overall throughput).

If you are using an index to retrieve millions of rows - you are DOING IT WRONG (unless you are using the index INSTEAD OF the table, or you need to optimize for initial response time).

Therefore, you use indexes to retriever a SMALL NUMBER of rows from a table.

And the FACT (not an analogy) that indexes retrieve blocks using single block IO does NOT 'kill' the purpose of a bitmap index.

A bitmap index is used to a) retrieve a small number of rows, b) avoid having to use the table altogether or c) to optimize for initial response time.

If you have a situation whereby you are getting so many rows that using multiblock IO would actually be beneficial (we are talking about the need to process hundreds, thousands, MILLIONS of blocks - not a handful) - then full scanning - NOT USING A SILLY INDEX - is the approach to take. Indexes are useful to get a SMALL NUMBER. If you have a small number of rows to get - it doesn't matter really if you use single or multi block IO.

If you think this through - you'll find that when using a bitmap to retrieve data from a table - you are in fact getting a small number of rows from the table - and you typically do this by combining more than one bitmap together to reduce the number of rows retrieved. You are not using the bitmap to retrieve a million rows (if you are - you are doing it wrong). Therefore if you did what you seem to be saying you want to do - you would end up reading a ton of blocks that CONTAIN NOTHING OF INTEREST to you.

Say you took a bitmap index on colA and combined it with a bitmap on colB and another on colC (the where clause was "where (cola between ? and ? and colb > ?) or col3 = ?". You now have a series of bit map ranges in this new combined index. Say this bitmap range covers the entire table now ( the lowest lo-rowid is the start of the table and the highest rowid is the end of the table and all bits of the table in between are represented as well). This newly combined bitmap index points to 0.01% of the table perhaps (it is getting 1,000 rows out of 10,000,000).

Now, would you really want us to use multiblock IO to read the entire table?
Or, would you perhaps want us to read just the blocks that contain data?

which is it? should we full scan the entire table after finding 1,000 rowids - or should we just read the 1,000 rows?

and i agree if they dont that bitmap indexes are useless for data warehouse
purposes.

that is a rather naive statement - given the fact that you are empirically proven wrong every single day since version 7.3.3 was released with bitmap indexes. Think about that for a minute - people use bitmap indexes heavily in data warehousing applications every single day, every single second of every single day. if they were useless - do you think that would be the case? Your statement is somewhat absurd on the face of it when presented with the reality of what happens every single day.

I think you guys need to think this through a bit better and get a better understanding of WHEN and WHERE indexes would be used.

If you are retrieving so many rows from a table that multiblock IO would even make a measurable different - YOU PROBABLY SHOULD NOT BE USING AN INDEX in the first place.

People try to 'tune' via indexes in warehouses way to often, it is a wrong headed approach if you are dealing with retrieving and processing millions of rows - full scanning (without indexes) is the way to go.

Index

Abhishek Das, November 19, 2012 - 4:36 am UTC

Hi

I have a query with a very bad execution plan. It's bad because index on table fp_data is not being used. I tried index hint but still its not using the index. Then while trying different hints to get a better execution plan I applied parallel hint and suddenly the plan changed and it started using the index though its not using parallel execution. Could you please elaborate why in the first case it was not using index but in second case it is using.

Reason why I was trying to force it to use index is that in my development environment it was using it but not in production.

Query
--------
SELECT fp.*
FROM CT_CASES a,
FP_DATA fp
WHERE fp.A_NUM IN
(SELECT al.AL_PROF_NO
FROM ALERTS al ,
CT_CASE_ALERTS ct
WHERE al.AL_NUMBER = ct.AL_NUMBER
AND ct.CA_NUMBER = a.CA_NUMBER
)
AND a.CA_STATUS_DT BETWEEN sysdate-1 AND sysdate
AND a.CS_NAME ='Closed'
AND a.CD_NAME IN
(SELECT b.CD_NAME FROM CT_DISPOSITIONS b WHERE b.CD_FRAUD_IND=1
);

Plan hash value: 4276194151

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 267M (1)|890:32:04 | | |
|* 1 | FILTER | | | | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | MERGE JOIN CARTESIAN| | 133M| 12G| 172K (1)| 00:34:31 | | |
| 4 | REMOTE | | 6 | 96 | 2 (0)| 00:00:01 | FMS01 | R->S |
| 5 | BUFFER SORT | | 22M| 512M| 172K (1)| 00:34:29 | | |
| 6 | TABLE ACCESS FULL | FP_DATA | 22M| 512M| 28723 (1)| 00:05:45 | | |
| 7 | REMOTE | | 2 | 60 | 1 (0)| 00:00:01 | FMS01 | R->S |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter( EXISTS (SELECT 0 FROM "A2", "A1" WHERE "CT"."CA_NUMBER"=:B1 AND
"AL"."AL_PROF_NO"=:B2 AND "AL"."AL_NUMBER"="CT"."AL_NUMBER"))
2 - filter(SYSDATE@!-1<=SYSDATE@!)

Remote SQL Information (identified by operation id):
----------------------------------------------------

4 - SELECT "A1"."CD_NAME","A1"."CD_FRAUD_IND","A2"."CA_NUMBER","A2"."CD_NAME","A2"."CS_
NAME","A2"."CA_STATUS_DT" FROM "CT_DISPOSITIONS" "A1","CT_CASES" "A2" WHERE
"A2"."CS_NAME"='Closed' AND "A2"."CA_STATUS_DT">=SYSDATE@!-1 AND
"A2"."CA_STATUS_DT"<=SYSDATE@! AND "A2"."CD_NAME"="A1"."CD_NAME" AND
"A1"."CD_FRAUD_IND"=1 (accessing 'FMS01.BC' )

7 - SELECT "A1"."AL_NUMBER","A1"."AL_PROF_NO","A2"."CA_NUMBER","A2"."AL_NUMBER" FROM
"ALERTS" "A1","CT_CASE_ALERTS" "A2" WHERE "A1"."AL_NUMBER"="A2"."AL_NUMBER" AND
"A2"."CA_NUMBER"=:1 AND "A1"."AL_PROF_NO"=:2 (accessing 'FMS01.BC' )

Plan hash value: 2728964220

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Inst |IN-OUT|
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 97 | 12416 | 2356 (1)| 00:00:05 | | |
| 1 | NESTED LOOPS | | 97 | 12416 | 2356 (1)| 00:00:05 | | |
| 2 | NESTED LOOPS | | 106 | 11872 | 2355 (1)| 00:00:05 | | |
|* 3 | HASH JOIN | | 34 | 2992 | 2354 (1)| 00:00:05 | | |
| 4 | REMOTE | CT_CASES | 6 | 348 | 161 (0)| 00:00:01 | FMS01 | R->S |
| 5 | VIEW | VW_SQ_1 | 123K| 3604K| 2193 (1)| 00:00:05 | | |
| 6 | REMOTE | | | | | | FMS01 | R->S |
| 7 | TABLE ACCESS BY INDEX ROWID| FP_DATA | 3 | 72 | 0 (0)| 00:00:01 | | |
|* 8 | INDEX RANGE SCAN | FP_DATA_IDX1 | 3 | | 0 (0)| 00:00:01 | | |
| 9 | REMOTE | CT_DISPOSITIONS | 1 | 16 | 0 (0)| 00:00:01 | FMS01 | R->S |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("ITEM_0"="A"."CA_NUMBER")
8 - access("FP"."A_NUM"="AL_PROF_NO")

Remote SQL Information (identified by operation id):
----------------------------------------------------

4 - SELECT /*+ SHARED (32767) */ "CA_NUMBER","CD_NAME","CS_NAME","CA_STATUS_DT" FROM "CT_CASES" "A"
WHERE "CS_NAME"='Closed' AND "CA_STATUS_DT">=:1-1 AND "CA_STATUS_DT"<=:2 (accessing 'FMS01.BC' )

6 - EXPLAIN PLAN INTO PLAN_TABLE@! FOR SELECT DISTINCT "A2"."AL_PROF_NO","A1"."CA_NUMBER" FROM
"ALERTS" "A2","CT_CASE_ALERTS" "A1" WHERE "A2"."AL_NUMBER"="A1"."AL_NUMBER" AND SYSDATE@!-1<=SYSDATE@!
(accessing 'FMS01.BC' )

9 - SELECT /*+ SHARED (32767) */ "CD_NAME","CD_FRAUD_IND" FROM "CT_DISPOSITIONS" "B" WHERE
"CD_FRAUD_IND"=1 AND :1="CD_NAME" (accessing 'FMS01.BC' )

Note
-----
- automatic DOP: skipped because of IO calibrate statistics are missing

November 19, 2012 - 10:29 am UTC

are the estimated cardinalities anywhere near to the actuals for this plan? that is the first step in triaging a bad plan.

A reader, November 19, 2012 - 12:48 pm UTC

I think its hard for optimizer to see any join condition btw CT_CASES a and FP_DATA fp as you are referencing CT_CASES in the sub-query (i.e. IN query) and that is the reason its going for CARTESIAN JOIN in your first plan.
Also as suggested by Tom make make sure stats are representing actual cardinality.

Also how about re-writing the query as below which is more readable and easy to understand.

select * from FP_DATA fp where EXISTS ( SELECT NULL FROM ALERTS al , CT_CASE_ALERTS ct, CT_CASES a WHERE al.AL_NUMBER = ct.AL_NUMBER AND ct.CA_NUMBER = a.CA_NUMBER and a.CA_STATUS_DT BETWEEN sysdate-1 AND sysdate AND a.CS_NAME ='Closed' and a.CD_NAME IN (SELECT b.CD_NAME FROM CT_DISPOSITIONS b WHERE b.CD_FRAUD_IND=1) and fp.A_NUM = al.AL_PROF_NO )

Index

Abhishek Das, November 20, 2012 - 10:03 pm UTC

Tom

Yes the cardinality is correct. I gathered stats just before generating the plan. My concern is that why it didn't used index even when I forced an index hint and started using index when I forced a parallel hint, though the query was still the same.

November 21, 2012 - 9:28 am UTC

you didn't give us enough of test case to answer any of this. perhaps you hinted wrong, perhaps it was using the index and you read it wrong, I don't know.

you'd have to give the entire set up, we never saw it "not using the index", you may well have to use a variety of hints (that is the problem with them) to get it to use the index - like leading, use_nl, ordered, etc. Once you start "programming the plan", you have to go all in - be verbose.

and just because you gathered stats right before running something, that doesn't mean the estimates are going to be accurate. I'll have to assume that the estimates are accurate for now because you said they were (did you verify that??? you ran a tkprof right and compared the actual to the estimate right?

Index being used only after Hint is being applied

Sunny, April 29, 2014 - 6:54 pm UTC

Hi Tom,

I have a situation where only when I use the Index hint, does index is being used.

Use Case:-

1. SELECT query on a table having around 75k data.
2. Have index on the column used in WHERE clause of the SELECT query.

CREATE TABLE him_entity (id NUMBER, him_val CHAR(1)); INSERT INTO him_entity SELECT rownum,'1' FROM dual CONNECT BY level<=1000000; CREATE INDEX idx_him_val ON him_entity(him_val);

Without using Index Hint, Explain plan:-

SELECT ID FROM him_entity WHERE him_val='1'; ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 4882K| 498 (5)| 00:00:06 | |* 1 | TABLE ACCESS FULL| HIM_ENTITY | 1000K| 4882K| 498 (5)| 00:00:06 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("HIM_VAL"='1')

Using Index Hint, Explain plan:-

SELECT /*+ index(him_entity idx_him_val)*/ ID FROM him_entity WHERE him_val='1'; --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 4882K| 3485 (1)| 00:00:42 | | 1 | TABLE ACCESS BY INDEX ROWID| HIM_ENTITY | 1000K| 4882K| 3485 (1)| 00:00:42 | |* 2 | INDEX RANGE SCAN | IDX_HIM_VAL | 1000K| | 1824 (1)| 00:00:22 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("HIM_VAL"='1')

1) Could you explain why Index was not being picked up without Using Index hint ?
2) I had gathered statistics for the table prior to query execution.
3) Is it fine/safe to use FIRST_ROWS hint instead of the Index hint?

Thanks,
Sunny

Connor and Chris don't just spend all day on AskTOM. You can also catch regular content via Connor's blog and Chris's blog. Or if video is more your thing, check out Connor's channel and Chris's channel from their Youtube channels. And of course, keep up to date with AskTOM via the official twitter account.

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.

Index not used.

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Indexes should not be used in case of CBO- in the given example

agreed...but need your suggestions

This query *not always* use the PK index. Any ideas?

Thanks for your answer but...

Sorry, I got it !

How to avoid <> condition in queries to make use of indexes?

Index FFS or Full table Scan ?

Index FFS or Full table Scan ?

Set expire data to far future instead of NULL

Set expire data to far future instead of NULL

index behavior

more info about the table

Sorry for the incomplete info in my previous posting

Block Selectivity

Unused Indexes in oracle 8.1.7

Index usage throgh db link

Indexes Not used

Great help

Two indexes in a table is worse than one idex in a table,isn't it?

Indexes on Columns NULL/ NOT NULL

Am I missing something obvious?

Here is the tkprof results

Does Varchar2(40) has anything to do with it?

to a Reader

Bugs

CBO not behaving properly for this simple query.

CURSOR_SHARING ?

Index Not used

more info

FTS and Index usage

How you find that

Pls. clarify

RE : Update Statement

Which index to use?

FTS Ratio

Absence of statistics

Eating back my words

Vodoo magic :-)

RULE based is faster then COST based

OK

sachin

sachin

sachin

index not getting used when using union all

Not using the right index.

RE: Not using the right index.

Not using the right index

Not using the right index

query to get sessions which do full table scan of big tables.

Not using the right index

Confused.

View Query

Getting confused (once again)...

Not quite understood...

Using Indexes

Column position in Index

Index not used

Index creation in different schema

Index not used

index

To Hari...

Different plans for the same query

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

Index not used

This query not always use the PK index. Any ideas?