Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, khalid.

Asked: September 28, 2002 - 11:27 am UTC

Last updated: August 12, 2013 - 4:20 pm UTC

Version: 8.1.7

Viewed 50K+ times! This question is

You Asked

Hi Tom.
If I have a table called Test contains 5 columns a, b, c, d and e.
1. What is the deference between index_1, and index_2 if they are defined as:-
Index_1.
Create index_1 on test (a, b);
Index_2.
Create index_2 on test (b, a).
Is the order in columns making difference? What is the deference and how it effect.

2. Is it reasonable to have index defined on columns a, b and c and another one
defined on column c and d. What I mean is column c exist in tow index.
Should I redefine the tow indexes to one index defined in a, b and c and another one defined on column d only or should I create one index defined on columns a, b, c and d.

Thank.

and Tom said...



Do you have my book "Expert one on one Oracle"? If so, the chapter on indexes goes into some detail on this topic.


The order of the columns can make a difference. index_1 would be most useful if you ask queries such as:

select * from t where a = :a and b = :b;
select * from t where a = :a;


Index_2 would be most useful if you ask queries such as:

select * from t where a = :a and b = :b;
select * from t where b = :b;

(in 9i, there is a new "index skip scan" -- search for that there to read about that. It makes the index (a,b) OR (b,a) useful in both of the above cases sometimes!)

So, the order of columns in your index depends on HOW YOUR QUERIES are written. You want to be able to use the index for as many queries as you can (so as to cut down on the over all number of indexes you have) -- that will drive the order of the columns. Nothing else (selectivity of a or b does not count at all)


2) yes, it is not only reasonable but very very common to have indexes on

(a,b,c)
(c,d)

It would not suffice to add D to the end of A,B,C -- if what you needed was an index on C,D to answer one set of questions and A,B,C for another.



Rating

  (87 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Thanks

khalid., September 28, 2002 - 1:25 pm UTC

Thank you very much.

A reader, November 18, 2002 - 3:13 pm UTC

Hi Tom,

Is the ordering of index cloumns makes any differnce in performance?

Like I have index_1(a,b,c)
index_2(b,a,c) and index_3 (c,a,b), What is the
difference between these?

Tom Kyte
November 18, 2002 - 8:50 pm UTC

If you have my book "Expert one on one Oracle" -- i dive into all of the details on this and describe how you would set about deciding on how to order the columns -- it will affect

o the reusability of that index, some people might create more indexes then they actually need -- if they had reorded some index.

o the compressability of that index

for a query that references all of A, B, C the above three indexes will perform more or less the same -- regardless of the selectivity of A, B, or C. Benchmark in the book ;)

One More Doubt

Tony, November 19, 2002 - 12:21 am UTC

I have created an index_1(a,b). And I have the following queries.

1. select * from x
where a = 1
and b = 2;

2. select * from x
where b = 2
and a = 1;

3. select * from x
where a = 1
and c = 3
and b =2;

4. select * from x
where b = 2
and c = 3
and a = 1;

5. select * from x
where b = 2;


Is the index_1 used by all the above quries?
Is there any difference in index usage by these quries?
If yes/no, How?

Kindly explain in details. Your answer will be very helpful for the whole developers community. Thanks a lot in advance.






Tom Kyte
November 19, 2002 - 7:17 am UTC

in addition to what Connor said (if you have my book "Expert one on one Oracle" i actually go into this in lots of detail in the chapter on indexes) - the short answer to your query is

ALL of the queries could use that index in multiple ways.

1) probably index range scan + table access by rowid (if X has more columns then A and B)

2) same as #1
3) same as #1
4) same as #1
5) depends on version. version 9i and up it might index skip scan:
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1156161921465 <code>

For Tony...

Connor McDonald, November 19, 2002 - 4:09 am UTC

There is a lot more to it than just the index column order and the query presented to the optimizer. Things that influence the (cost) optimizer will include:

- the number of distinct values in each column
- the number of distinct values in the indexes
- the clustering factor of the index

hth
connor

Selectivity

Alexander, November 19, 2002 - 9:48 am UTC

Hello Tom,

Can you explain this better:

>for a query that references all of A, B, C the above three indexes will perform more or less the same -- regardless of the selectivity of A, B, or C. Benchmark in the book ;)

Does it mean than queries:

a) select /*+ index (x index_abc) */ * from x
where a = 1
and c between 1 and 7
and b > 2

b) select /*+ index (x index_cba) */ * from x
where a = 1
and c between 1 and 7
and b > 2

will have the same performance regardless of the selictivity of A, B and C fields?

Best regards.



Tom Kyte
November 19, 2002 - 5:19 pm UTC

yes and no. it'll depend -- but in general, for most cases, they would be the same.

In your case -- if the following was true:


there is
one row for b such that b > 2 (and hence only one index entry)
and c was equal to 2 for that row
and a = 1

and at the same time there is
1,000,000 rows such that a=1
and c was equal to 2 for all of those rows
and one of those rows had b > 2


an index on B,C,A or B,A,C would probably be better then one on A,C,B as the index on B,C,A would be able to stop "faster"


(but did that REALLY have to do with the selectivity of B over C??? NO cause this is also true:

there are 5 million other rows with b = 1
each of those 5 million rows have unique values for A
each of those 5 million rows have unique values for C (and all of them could be between 1 and 7 still


So here -- B is not selective at all but A and C are.


maximum key length exceeded

A reader, November 21, 2002 - 1:30 am UTC

I am getting the following error while trying to create an index on column varchar2(1000)
On Oracle 8.1.5 and Windows/2000
----
ERROR at line 1:
ORA-01450: maximum key length (758) exceeded
----
Oracle documentation saying the maximum value is operating system-specific.
Could you please explain me what is meant and how to rectify it.

This column is using in WHERE condition with LIKE operator.
Is this feasible providing the index on large column like this if not can you explain it ?

Thanks



Tom Kyte
November 21, 2002 - 1:00 pm UTC

It is based on the block size -- you get about 40% of the block size for the max size of an index key:

ops$tkyte@ORA920.US.ORACLE.COM> select 2048*.4 from dual;

   2048*.4
----------
     819.2


you have a 2k block size -- hence, the largest index key entry will in fact be 758bytes -- you have a column that could blow that out easily.


For something this large -- I doubt your like clauses are all in the form:

where column like 'something something something %'

if the like can start with %, then the index is worthless anyway ... so -- i recommand interMedia text and a text index on a field this big, regardless of the block size.

See
http://docs.oracle.com/docs/cd/A87860_01/doc/inter.817/index.htm
for details.
 

Balaji, November 21, 2002 - 9:04 am UTC

Hi Tom
Do u then mean to say that if there are 3 columns a b and c then if all the 3 columns are used in the query , then the ordering of columns in the index does not matter?
Ie selectivity does not come in to picture??

Then when does selectivity come in to picture??

Also , when two columns say a and b are used in one set of queries , and say b and c are used in other set of queries , then should i create 2 indexes on (a,b) and (b , c) or will one ondex give better performance for all the queires.
I heard that in 9i , the concept of leading column is eliminated.Please elaborate on this.


Tom Kyte
November 21, 2002 - 1:13 pm UTC

sigh,

it is not the first thing, it is not the most important thing. sure, it is something to consider but it is relatively far down there in the grand scheme of things.

if i have a,b,c and I find I query lots of times like this:


where a = :a and b = :b and c = :c
where c = :c
where b = :b and c = :c

I would prefer a single index (c,b,a) over any other combination -- it can easily be used for all of the above queries.

In certain strange, very perculiar and abnormal cases (like the above with really utterly skewed data), the selectivity could easily matter HOWEVER, they are

a) pretty rare
b) truly dependent on the values used at runtime, as all skewed queries are

so in general, look at the questions you have, try to minimize the indexes you need based on that. Look at the predicates in general, factor out what you where on and use that as a starting point.

Would this be the case

Vipin, May 30, 2003 - 4:00 pm UTC

Hi tom,

would you go with creating index over (c,b,a) even if the distinct values in columns C is just 2 say Y and N. In this case I think C must always be considered(if needed to be considered) in an Index at last position.

Please explain.

Tom Kyte
May 31, 2003 - 11:36 am UTC

you would be mistaken.

Suppose you have

c char(1) = 'Y', 'N'
b number = 1,2,3,4,5
a date = 1 million distinct values


and you query

where c = 'Y' and b = 5 and a > sysdate

will, I would presonally want that it be C,B,A or B,C,A -- cause we scan on A. It would be most efficient to goto the place in the index where C=Y, B=5, A > sysdate and simply scan from there (we are interested in all of those values until either C or B changes)

If A were first, we would read lots of C=N values or B!= 5 values while scanning A>sysdate.

The number of distinct values in a column in a concatenated index is not relevant when considering the position in the index.

HOW YOU USE the index -- that is what is relevant when deciding.




why the limit is imposed on indexes?

Tony, June 10, 2003 - 8:22 am UTC

Tom,
Why this restriction "ora-01450 maximum key length (758) exceeded" is imposed on indexes?. Can't index data be stored in multiple blocks? Really, I don't get need for this restriction.

Tom Kyte
June 10, 2003 - 9:13 am UTC

no, index data cannot be stored on multiple blocks (starting to defeat the purpose of an index isn't it? indexes = small, fast way to get to data by key)

The index key length is a function of your blocksize -- about 40% of the size of the block. you have a 2k block, small key length.

The word "index skip scan" is misleading

Tony, June 10, 2003 - 8:25 am UTC

Tom, The word "index skip scan" is misleading in oracle9i.
Is the index skipped, i mean unused, if this feature is used? Can you explain please?

Tom Kyte
June 10, 2003 - 9:15 am UTC

No, I don't aggree.

It is a SCAN that SKIPS around in the index structure. It is not an "index SKIPPING scan", it is an index skip scan.


the index is not skipped.

the index structure is skipped around in. see the link above in one of the other reviews, it has an example.

Is it possible to change index size limit (40% )?

Tony, June 12, 2003 - 3:10 am UTC

"Index key length is a function of your blocksize -- about 40% of the size of the block."

Why 40%?. Why can't be 60% or 70%? Is it possible to change the 40% limit?



Tom Kyte
June 12, 2003 - 8:51 am UTC

because we wanted at least 2 keys per block.

Look, if you are getting 1 key per leaf block, you don't deserve an index here, indexes need MANY rows per leaf block to be efficient.



Full Scan.

Kashif, June 12, 2003 - 8:04 pm UTC

Hi Tom,

The definition of a full scan of an index in the Performance Guide is as follows:

<quote>
A full scan is available if a predicate references one of the columns in the index. The predicate does not need to be an index driver.
</quote>

What I understand of that is that if a column referenced in the predicate of a query is part of an index, then irrespective of what its position in the index is, it can potentially use that index, in a full index scan. Does that sound about right, and if so, then does the order of columns matter as much?

Secondly, I tried verifying this but for some reason, unless the leading column of the index is used, the index is not used and instead an FTS is performed. What exactly does this full scan do?

TIA.

Kashif

Tom Kyte
June 12, 2003 - 8:46 pm UTC

yes, it could use the index in a full scan or even better a FAST full scan (read out of order using multi-block io)

regardless of the position in the index.

it will generally use this when the INDEX itself can be used INSTEAD of the table:

ops$tkyte@ORA920> create table t as select * from all_objects;

Table created.

ops$tkyte@ORA920> create index t_idx on t(object_name,object_id);

Index created.

ops$tkyte@ORA920> analyze table t
  2  compute statistics
  3  for table for all indexes for all indexed columns;

Table analyzed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> set autotrace traceonly explain
ops$tkyte@ORA920> select object_name from t where object_id = 55;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=1 Bytes=27)
   1    0   INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE) (Cost=18 Card=1 Bytes=27)



 

Would you go for an additional index...

Kamal Kishore, June 12, 2003 - 9:32 pm UTC

Hi Tom,
Would you go for an additional index on OBJECT_ID and avoid the FAST FULL SCAN?
Will it be more efficient to have a separate index on OBJECT_ID rather than the FFS?

When would you choose one over the other?
Thanks,


Tom Kyte
June 13, 2003 - 7:48 am UTC

it depends on my needs. In the example above, my only goal was to demonstrate that yes, the index would be, could be used.

If that was a one off query -- I would be quite pleased that the optimizer did that for me.

If that was a query I executed 500 times a second, I would consider alternative indexing techniques.

Index FFS vs Range Scan

Jerry, June 13, 2003 - 7:37 am UTC

I have found a case where column order on an index does seem to make a difference in LIO and performance, given the same query.  The same query generates much more LIO using an FFS versus a range scan.  The only thing that has changed is the index column order.

Paycheck_status and pay_end_dt are not very selective.

SQL> create index psrpay_check on ps_pay_check
  2    ( paycheck_status,
  3      pay_end_dt,
  4      emplid,
  5      company
  6    ) tablespace psindex storage (initial 10485760 next 10485760 maxextents unlimited pctincrease 0) nologging pctfree 10;

Index created.

SQL>
SQL> select
  2   a.pay_end_dt
  3  from ps_pay_check a
  4  where a.emplid='xxxyyy'
  5   and a.company='CB1'
  6   and a.pay_end_dt<='30-apr-2003'
  7   and a.paycheck_status='V';

no rows selected

SQL>
SQL> set autotrace on stat exp
SQL> set timing on
SQL> /

no rows selected

Elapsed: 00:00:01.05

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=34 Bytes=714)
   1    0   INDEX (FAST FULL SCAN) OF 'PSRPAY_CHECK' (NON-UNIQUE) (Cost=2 Card=34 Bytes=714)




Statistics
----------------------------------------------------------
          0  recursive calls
          4  db block gets
       4316  consistent gets
          0  physical reads
          0  redo size
        139  bytes sent via SQL*Net to client
         18  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed


SQL>
SQL> create index psrpay_check on ps_pay_check
  2    (
  3      emplid,
  4      company,
  5      pay_end_dt,
  6      paycheck_status
  7    ) tablespace psindex storage (initial 10485760 next 10485760 maxextents unlimited pctincrease 0) nologging pctfree 10;

Index created.

SQL>
SQL> select
  2   a.pay_end_dt
  3  from ps_pay_check a
  4  where a.emplid='xxxyyy'
  5   and a.company='CB1'
  6   and a.pay_end_dt<='30-apr-2003'
  7   and a.paycheck_status='V';

no rows selected

SQL>
SQL> set autotrace on stat exp
SQL> set timing on
SQL> /

no rows selected

Elapsed: 00:00:00.07

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=34 Bytes=714)
   1    0   INDEX (RANGE SCAN) OF 'PSRPAY_CHECK' (NON-UNIQUE) (Cost=2 Card=34 Bytes=714)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        139  bytes sent via SQL*Net to client
         18  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
 

peter, June 14, 2003 - 2:46 am UTC

I have two questions:

1)If I have index_1(col_a, col_b, col_c), will the colunmn order affect the performance?

e.g. select *
from table
where col_a = 1
and col_b = 2
and col_c = 3
and
select *
from table
where col_b = 2
and col_c = 3
and col_a = 1
and
select *
from table
where col_c = 3
and col_b = 2
and col_a = 1

Are there any difference?

2)If I have two tables. One has millions of rows and other one has thousands of rows. Does the order of tables affect the performance?

table1: big table, table: small table

select *
from table1,
table2
where table1.col_a = table2.col_a
and table1.col_b = table2.col_b

select *
from table2,
table1
where table1.col_a = table2.col_a
and table1.col_b = table2.col_b

Thanks!

Tom Kyte
June 14, 2003 - 8:44 am UTC

in the exact equalities, no.

in things with ranges (between, <, >) it can, yes.

if you query "where a = 5 and b = 6 and c > 0"

and you have twenty rows with a=5, b=6 (5 of which are c>0) the index on

(a,b,c)

will process 5 rows in the index leafs (it'll find a=5,b=6, c=0 and scan from there)


on the other hand, if you have 1,000,000 rows where c>0 (same set of data) and lots of rows where a=5, you'll process lots more leaf rows looking for the 5 that have b=6. It'll scan more data. It has to look at entries like:

0,5,....
1,5,....
2,5,.....
and so on.

instead of just being able to zero in on

5,6,....

in the index.


2) not unless you are using the archaic RBO.

Golden rule for ordering columns on indexes

juancarlosreyesp@yahoo.com, September 03, 2003 - 12:56 pm UTC

Tom I had been investigating about ordering columns in indexes, and this is my opinion, if you can please get a time (and patiente) to check them.

I think the best rule is,
1. "Create indexes with all the combinations of columns, and test which is better FOR EVERY SPECIFIC QUERY"
( don't forget to calculate statistics for that indexes before testing )
2. About factor clustering, in the practice the real test is going to be the query speed and gets.
To know things like this (from documentation)
"
clustering factor.
Indicates the amount of order of the rows in the table based on the
values of the index.
n If the value is near the number of blocks, then the table is very well
ordered. In this case, the index entries in a single leaf block tend to
point to rows in the same data blocks.
n If the value is near the number of rows, then the table is very
randomly ordered. In this case, it is unlikely that index entries in
the same leaf block point to rows in the same data blocks.
The cost of fetching rows using rowids depends on the
index clustering factor.
Oracle does I/O by blocks. Therefore, the optimizer?s decision to use full table scans
is influenced by the percentage of blocks accessed, not rows. This is called the index
clustering factor. If blocks contain single rows, then rows accessed and blocks
accessed are the same.
However, most tables have multiple rows in each block. Consequently, the desired
number of rows could be clustered together in a few blocks, or they could be spread
out over a larger number of blocks.
Consider a case where each block has 100 rows, but on an average only 5 rows for
each block meet the query condition. A query accessing 2% of the rows might need
40% of the blocks.
"
Is interesting but in definitive the clustering is not what determines if an index is going to be useful or not.
3. This is a question.
If I export and import a table, and the physical order of the rows change completely, the indexes used to be the best will still be the best, even when in some situation a full scan will be better than using the index??

Thanks

Tom Kyte
September 04, 2003 - 8:27 am UTC

#1 seems wrong, you should be able to very much reduce the set of necessary candidate indexes without trying each and every permutation. you know how you query, you should be able to eliminate many

#2 you have wrong, the cluster factor is very much what determines if an index will be useful or not -- the CBO takes it into serious consideration.

#3 not sure what you are asking there -- but the cbo will look at the statistics and decide whether an index is correct or not.

:) Thank you Tom

A reader, September 04, 2003 - 9:31 am UTC


:) Fixed goldedn rules ot determine column ordre in bt index

juancarlosreyesp@yahoo.com, September 04, 2003 - 11:05 am UTC

Hi Tom I fixed the rules, so claiming to your patiente I ask you again, after some test.

Rules for defining column order B-TREE indexes
1. Normally, you must create indexes on the columns you query or sort.
2. In an equality, The columns with less unique values must be to the left, the columns with more unique values to the right
3. In a like comparison, the columns that has the like comparison should be at the left first.
3. The clustering factor is what determines if and index is or not used by the CBO,
question1:this means it chooses between the index with the column needed and a full scan?, based on the test I did it seems to be so. Or there is a fixed criteria.

If I recreate the table and reinsert the record in a specific order
for example create table a as select * from b order by col1,col2;
the cluster in the indexes changed complete from 20148 to 1041, but the same index was
used
QUESTION:This means that if I query a table in a specific order, specially a table that
is almost static, is advisable to recreate it in the order I query?? or maybe only if I do range scans for example between 3 and 10

Thank you

Tom Kyte
September 04, 2003 - 11:31 am UTC

1 -- i would agree, although unique/primary keys and fkeys are another reason

2 -- no, the order of columns in an index is based on how you use them, not on their cardinality.

3 -- (first #3). no, see #2. that generalization is not always true. in fact, if you

where x = 5 and y like 'abc%'

i'd PROBABLY rather the index be on (x,y), not on (y,x)

3 -- (second #3) it chooses between USING the index and NOT USING the index. It might full scan, it might do something entirely different. there are lots of access plans available.


remember -- a table can be sorted how many ways??? one. so in general, one index on a table MAY have a really good cluster factor, the rest won't.

I would use a data structure such as an IOT or a b*tree cluster or a hash cluster if locale of data (having related data stored together) was my utmost concern. I spend lots of time on these topics in both "expert one on one Oracle" as well as the newer "effective oracle by design"

Thanks Tom, and...

juancarlosreyesp@yahoo.com, September 04, 2003 - 12:38 pm UTC

Thanks tom, a long multi part question as the one you don't like.
 
(In your book Expert one on one I didn't found a page talking about column ordering specifically if you did ;)please tellme the page)
( In your new book are you talking specifically about column ordering in indexes? )
(about IOT I tried to use them, but there is a problem with developer 6i)

I took this from Oracle internlas November 2002 Michael R. Ault.
      HINT TO OTHER DBAS: IF YOU SUBSCRIBE TO A TRIAL TO ORACLE 
      INTERNALS YOU CAN READ PREVIOUS EDITIONS, VERY VERY NICE AND
      FREE :)

Tom, could you please explainme why this happens?
Tom What I see is that the best index is the one that has at the left the column with more distinct values, I supposed that the best always will be the one with the column with less distinct values to the left.
What I'm trying to understand is some simple rules to define oracle columns order, without trying all the permutations (as you said).
Here is the example I´ll do as it is in that very nice article:

(don't be angry, in the example he uses analyze, that's why I use analyze but I run too EXEC DBMS_UTILITY.ANALYZE_SCHEMA('ADM','COMPUTE') ;)

SQL> DROP TABLE TEST;
Tabla borrada.

SQL> create table test as select * from dba_objects ORDER BY OWNER,OBJECT_TYPE ;

Tabla creada.

SQL> create index test_ord1 on test(object_type,owner,object_name);

Indice creado.

SQL> analyze index test_ord1 compute statistics;

Indice analizado.

SQL> create index test_ord2 on test(object_name,owner,object_type);

Indice creado.

SQL> analyze index test_ord2 compute statistics;

Indice analizado.

SQL> create index test_ord3 on test(owner,object_name,object_type);
Indice creado.
SQL> analyze index test_ord3 compute statistics;
Indice analizado.
SQL> create index test_ord4 on test(owner,object_type,object_name);
Indice creado.
SQL> analyze index test_ord4 compute statistics;
Indice analizado.
SQL> analyze table test compute statistics;
Tabla analizada.

SQL> SELECT INDEX_NAME,CLUSTERING_FACTOR FROM USER_INDEXES WHERE INDEX_NAME LIKE 'TEST%';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
TEST_ORD1                                  18773
TEST_ORD2                                  26932
TEST_ORD3                                  20250
TEST_ORD4                                  18653

SQL>

SQL>  select count(distinct(substr(dbms_rowid.rowid_to_restricted(rowid,1),1,8))) from test;

COUNT(DISTINCT(SUBSTR(DBMS_ROW
------------------------------
                           411

SQL>  select count(*) from test;

 COUNT(*)
---------
    30227

SQL>


SQL> select count(*) from test where owner = 'ADM' AND OBJECT_TYPE='TABLE';

 COUNT(*)
---------
      120


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'TEST_ORD1' (NON-UNIQUE) (Cost=2 C
          ard=27 Bytes=351)

**** he is using index on type,order, this is the column with MORE distinct values to the left.

SQL> select count(*) from test where owner = 'ADM' AND OBJECT_NAME = 'CUENTAS_ME';

 COUNT(*)
---------
        1


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=28)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'TEST_ORD2' (NON-UNIQUE) (Cost=2 C
          ard=1 Bytes=28)

**** Again he is using index on type,order, this is the column with MORE distinct values to the left.


SQL> select count(*) from test where object_type = 'TABLE' AND OBJECT_NAME = 'CUENTAS_ME';

 COUNT(*)
---------
        1


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=31)
   1    0   SORT (AGGREGATE)
   2    1     INDEX (RANGE SCAN) OF 'TEST_ORD1' (NON-UNIQUE) (Cost=9 C
          ard=1 Bytes=31)

But in this las situation optimicer breakes the previous rule and choose the index with the column with fewer distinct values to the left.

Question:
So, I don't understand which other options I have other than create all permutations of table and test.
;) Or you know some rules, please share them. I don't say THE RULES, I say some rules that works most of the time.
Because I'm where I started I don't know how to choose the best order or columns. 

Thanks Tom 

Tom Kyte
September 05, 2003 - 1:33 pm UTC

chapter on indexes discusses the order of columns in an index. page 311

as for the rest -- someday, when I'm taking questions - feel free.

what I'm finding is I cannot take any new questions since I get 50-100 of these sorts of question in a question here. .....





Thank you :)

A reader, September 05, 2003 - 4:57 pm UTC


Use most selective index

A reader, October 18, 2003 - 4:13 am UTC

Hi Tom,

We are using 9i in OLTP.

I have a query like the following:

select * from t
where column_with_10_distinct_values = :1
and column_with_10000_distinct_values = :2;

Both columns are indexed (non-uniqe b-tree).

But I noticed recently that the query is sometimes choosing the less selective index on column_with_10_distinct_values.

Q1: why did this happen?
Q2: how to avoid this WITHOUT changing the actually code (i.e. somehow to affect the CBO to choose the more selective index).

Tom Kyte
October 19, 2003 - 6:03 pm UTC

question for you

why then do you have an index on a column with 10 distinct values?

but beyond that -- please give more details, what does the table look like, what are the distinct 10 values (and their distribution -- eg select column_with_10_values, count(*) from t group by column_with_10_values)

do you analyze?
how do you analyze (EXACT COMMAND - not "we compute stats", but "we use analyze table T compute statistics for table for all indexes" or "dbms_stats.gather....." etc...


how big is the table?

what is your version?

Thanks

A reader, October 21, 2003 - 8:26 pm UTC

Maybe it's my fault that I thought index stats are gathered by default if I gather table stats. Seems I have to specify specifically whether using "analyze table" or using "dbms_stats".



Tom Kyte
October 21, 2003 - 9:53 pm UTC

er? don't follow? can you answer my questions?

why this does not use index skip scan?

pinguman, February 24, 2004 - 1:23 pm UTC

I have question about this feature, using 9.2.0.4 on HP-UX

Basically I have query which fits perfectly in this feature but Oracle is not using it, when I force skip scan hint it does index full scan instead of range... The SETID columns have only a single value.

When I add setid condition uery runs fast and use proper plan.

Can you throw some lights?



SELECT
a.RBTACCTNAME,c.inst_prod_id,c.product_id,d.descr,e.PA_EXTERNAL_ID,e.PA_COMP_INST_ID,e.PA_FAC_STATUS_VAL,
f.attribute_id,f.attribute_value,c.ORDER_DATE,c.INSTALLED_DATE
FROM sysadm.ps_rbt_account a,sysadm.ps_rftinst_prod b,sysadm.ps_rf_inst_prod c,sysadm.PS_PROD_ITEM d,
sysadm.ps_pa_arbor_elemen e,sysadm.ps_rf_attr_insp f
WHERE a.RBTACCTID = b.RBTACCTID
AND f.INST_PROD_ID (+)= c.INST_PROD_ID
AND e.INST_PROD_ID = c.INST_PROD_ID
AND c.inst_prod_id = b.inst_prod_id
AND d.product_id =c.product_id
AND a.RBTACCTNAME='N28281879K0002'
ORDER BY a.RBTACCTNAME

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.16 0.19 2 3 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 23.70 106.40 97324 97462 0 7
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 23.86 106.60 97326 97465 0 7

Rows Row Source Operation
------- ---------------------------------------------------
7 HASH JOIN OUTER (cr=97462 r=97324 w=0 time=106407272 us)
6 NESTED LOOPS (cr=61414 r=61293 w=0 time=94604699 us)
6 NESTED LOOPS (cr=46995 r=46910 w=0 time=63306469 us)
6 NESTED LOOPS (cr=46981 r=46910 w=0 time=63305379 us)
6 NESTED LOOPS (cr=7 r=0 w=0 time=294 us)
1 INDEX RANGE SCAN OBJ#(46628) (cr=3 r=0 w=0 time=82 us)(object id 46628)
6 INDEX RANGE SCAN OBJ#(55878) (cr=4 r=0 w=0 time=174 us)(object id 55878)
6 PARTITION HASH ITERATOR PARTITION: KEY KEY (cr=46974 r=46910 w=0 time=63304776 us)
6 TABLE ACCESS FULL OBJ#(47711) PARTITION: KEY KEY (cr=46974 r=46910 w=0 time=63304587 us)
6 TABLE ACCESS BY INDEX ROWID OBJ#(46023) (cr=14 r=0 w=0 time=809 us)
6 INDEX RANGE SCAN OBJ#(46170) (cr=8 r=0 w=0 time=529 us)(object id 46170)
6 PARTITION HASH ITERATOR PARTITION: KEY KEY (cr=14419 r=14383 w=0 time=31297927 us)
6 TABLE ACCESS FULL OBJ#(48103) PARTITION: KEY KEY (cr=14419 r=14383 w=0 time=31297773 us)
2703952 TABLE ACCESS FULL OBJ#(33064) (cr=36048 r=36031 w=0 time=5457645 us)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=18548 Card=15 Bytes=3015)
1 0 HASH JOIN (OUTER) (Cost=18548 Card=15 Bytes=3015)
2 1 NESTED LOOPS (Cost=14779 Card=15 Bytes=2250)
3 2 NESTED LOOPS (Cost=11320 Card=15 Bytes=1890)
4 3 NESTED LOOPS (Cost=11290 Card=15 Bytes=1200)
5 4 NESTED LOOPS (Cost=5 Card=15 Bytes=630)
6 5 INDEX (RANGE SCAN) OF 'PS5RBT_ACCOUNT' (NON-UNIQUE) (Cost=3 Card=1 Bytes=23)
7 5 INDEX (RANGE SCAN) OF 'PS0RFTINST_PROD' (NON-UNIQUE) (Cost=3 Card=15 Bytes=285)
8 4 PARTITION HASH (ITERATOR)
9 8 TABLE ACCESS (FULL) OF 'PS_RF_INST_PROD' (Cost=753 Card=1 Bytes=38)
10 3 TABLE ACCESS (BY INDEX ROWID) OF 'PS_PROD_ITEM' (Cost=2 Card=1 Bytes=46)
11 10 INDEX (RANGE SCAN) OF 'PSBPROD_ITEM' (NON-UNIQUE) (Cost=1 Card=1)
12 2 PARTITION HASH (ITERATOR)
13 12 TABLE ACCESS (FULL) OF 'PS_PA_ARBOR_ELEMEN' (Cost=231 Card=1 Bytes=24)
14 1 TABLE ACCESS (FULL) OF 'PS_RF_ATTR_INSP' (Cost=3467 Card=2845310 Bytes=145110810)


============================================================================


SELECT /*+ index_ss(c PS_RF_INST_PROD) */
a.RBTACCTNAME,c.inst_prod_id,c.product_id,d.descr,e.PA_EXTERNAL_ID,e.PA_COMP_INST_ID,e.PA_FAC_STATUS_VAL,
f.attribute_id,f.attribute_value,c.ORDER_DATE,c.INSTALLED_DATE
FROM sysadm.ps_rbt_account a,sysadm.ps_rftinst_prod b,sysadm.ps_rf_inst_prod c,sysadm.PS_PROD_ITEM d,
sysadm.ps_pa_arbor_elemen e,sysadm.ps_rf_attr_insp f
WHERE a.RBTACCTID = b.RBTACCTID
AND f.INST_PROD_ID (+)= c.INST_PROD_ID
AND e.INST_PROD_ID = c.INST_PROD_ID
AND c.inst_prod_id = b.inst_prod_id
AND d.product_id =c.product_id
AND a.RBTACCTNAME='N28281879K0002'
ORDER BY a.RBTACCTNAME

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.15 0.14 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 188.57 347.84 524458 524598 0 7
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 188.72 347.98 524458 524598 0 7

Rows Row Source Operation
------- ---------------------------------------------------
7 HASH JOIN OUTER (cr=524598 r=524458 w=0 time=347839584 us)
6 NESTED LOOPS (cr=488550 r=488424 w=0 time=315275165 us)
6 NESTED LOOPS (cr=474131 r=474042 w=0 time=293947883 us)
6 NESTED LOOPS (cr=474117 r=474042 w=0 time=293945089 us)
6 NESTED LOOPS (cr=7 r=0 w=0 time=307 us)
1 INDEX RANGE SCAN OBJ#(46628) (cr=3 r=0 w=0 time=83 us)(object id 46628)
6 INDEX RANGE SCAN OBJ#(55878) (cr=4 r=0 w=0 time=173 us)(object id 55878)
6 TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(47711) PARTITION: ROW LOCATION ROW LOCATION (cr=474110 r=474042 w=0 time=293944
414 us)
6 INDEX FULL SCAN OBJ#(55877) (cr=474104 r=474042 w=0 time=293943840 us)(object id 55877)
6 TABLE ACCESS BY INDEX ROWID OBJ#(46023) (cr=14 r=0 w=0 time=2476 us)
6 INDEX RANGE SCAN OBJ#(46170) (cr=8 r=0 w=0 time=2247 us)(object id 46170)
6 PARTITION HASH ITERATOR PARTITION: KEY KEY (cr=14419 r=14382 w=0 time=21326288 us)
6 TABLE ACCESS FULL OBJ#(48103) PARTITION: KEY KEY (cr=14419 r=14382 w=0 time=21326143 us)
2703952 TABLE ACCESS FULL OBJ#(33064) (cr=36048 r=36034 w=0 time=24444187 us)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1238733 Card=15 Bytes=3015)
1 0 HASH JOIN (OUTER) (Cost=1238733 Card=15 Bytes=3015)
2 1 NESTED LOOPS (Cost=1234964 Card=15 Bytes=2250)
3 2 NESTED LOOPS (Cost=1231505 Card=15 Bytes=1890)
4 3 NESTED LOOPS (Cost=1231475 Card=15 Bytes=1200)
5 4 NESTED LOOPS (Cost=5 Card=15 Bytes=630)
6 5 INDEX (RANGE SCAN) OF 'PS5RBT_ACCOUNT' (NON-UNIQUE) (Cost=3 Card=1 Bytes=23)
7 5 INDEX (RANGE SCAN) OF 'PS0RFTINST_PROD' (NON-UNIQUE) (Cost=3 Card=15 Bytes=285)
8 4 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_RF_INST_PROD' (Cost=82098 Card=1 Bytes=38)
9 8 INDEX (FULL SCAN) OF 'PS_RF_INST_PROD' (UNIQUE) (Cost=82096 Card=1)
10 3 TABLE ACCESS (BY INDEX ROWID) OF 'PS_PROD_ITEM' (Cost=2 Card=1 Bytes=46)
11 10 INDEX (RANGE SCAN) OF 'PSBPROD_ITEM' (NON-UNIQUE) (Cost=1 Card=1)
12 2 PARTITION HASH (ITERATOR)
13 12 TABLE ACCESS (FULL) OF 'PS_PA_ARBOR_ELEMEN' (Cost=231 Card=1 Bytes=24)
14 1 TABLE ACCESS (FULL) OF 'PS_RF_ATTR_INSP' (Cost=3467 Card=2845310 Bytes=145110810)


============================================================================


SELECT
a.RBTACCTNAME,c.inst_prod_id,c.product_id,d.descr,e.PA_EXTERNAL_ID,e.PA_COMP_INST_ID,e.PA_FAC_STATUS_VAL,
f.attribute_id,f.attribute_value,c.ORDER_DATE,c.INSTALLED_DATE
FROM sysadm.ps_rbt_account a,sysadm.ps_rftinst_prod b,sysadm.ps_rf_inst_prod c,sysadm.PS_PROD_ITEM d,
sysadm.ps_pa_arbor_elemen e,sysadm.ps_rf_attr_insp f
WHERE a.RBTACCTID = b.RBTACCTID
AND f.INST_PROD_ID (+)= c.INST_PROD_ID
AND e.INST_PROD_ID = c.INST_PROD_ID
AND c.inst_prod_id = b.inst_prod_id
AND d.product_id =c.product_id
AND a.RBTACCTNAME='N28281879K0002'
and c.setid = 'SHARE'
and e.setid = 'SHARE'
ORDER BY a.RBTACCTNAME

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.15 0.15 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 12.29 12.20 36032 36109 0 7
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 12.44 12.35 36032 36109 0 7

Rows Row Source Operation
------- ---------------------------------------------------
7 HASH JOIN OUTER (cr=36109 r=36032 w=0 time=12201778 us)
6 NESTED LOOPS (cr=61 r=0 w=0 time=1312 us)
6 NESTED LOOPS (cr=41 r=0 w=0 time=968 us)
6 NESTED LOOPS (cr=27 r=0 w=0 time=650 us)
6 NESTED LOOPS (cr=7 r=0 w=0 time=214 us)
1 INDEX RANGE SCAN OBJ#(46628) (cr=3 r=0 w=0 time=90 us)(object id 46628)
6 INDEX RANGE SCAN OBJ#(55878) (cr=4 r=0 w=0 time=101 us)(object id 55878)
6 TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(47711) PARTITION: ROW LOCATION ROW LOCATION (cr=20 r=0 w=0 time=367 us)
6 INDEX UNIQUE SCAN OBJ#(55877) (cr=14 r=0 w=0 time=222 us)(object id 55877)
6 TABLE ACCESS BY INDEX ROWID OBJ#(46023) (cr=14 r=0 w=0 time=259 us)
6 INDEX RANGE SCAN OBJ#(46170) (cr=8 r=0 w=0 time=155 us)(object id 46170)
6 TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(48103) PARTITION: ROW LOCATION ROW LOCATION (cr=20 r=0 w=0 time=294 us)
6 INDEX UNIQUE SCAN OBJ#(55791) (cr=14 r=0 w=0 time=200 us)(object id 55791)
2703952 TABLE ACCESS FULL OBJ#(33064) (cr=36048 r=36032 w=0 time=5854299 us)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3864 Card=15 Bytes=3195)
1 0 HASH JOIN (OUTER) (Cost=3864 Card=15 Bytes=3195)
2 1 NESTED LOOPS (Cost=95 Card=15 Bytes=2430)
3 2 NESTED LOOPS (Cost=65 Card=15 Bytes=1980)
4 3 NESTED LOOPS (Cost=35 Card=15 Bytes=1290)
5 4 NESTED LOOPS (Cost=5 Card=15 Bytes=630)
6 5 INDEX (RANGE SCAN) OF 'PS5RBT_ACCOUNT' (NON-UNIQUE) (Cost=3 Card=1 Bytes=23)
7 5 INDEX (RANGE SCAN) OF 'PS0RFTINST_PROD' (NON-UNIQUE) (Cost=3 Card=15 Bytes=285)
8 4 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_RF_INST_PROD' (Cost=2 Card=1 Bytes=44)
9 8 INDEX (UNIQUE SCAN) OF 'PS_RF_INST_PROD' (UNIQUE) (Cost=1 Card=1)
10 3 TABLE ACCESS (BY INDEX ROWID) OF 'PS_PROD_ITEM' (Cost=2 Card=1 Bytes=46)
11 10 INDEX (RANGE SCAN) OF 'PSBPROD_ITEM' (NON-UNIQUE) (Cost=1 Card=1)
12 2 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_PA_ARBOR_ELEMEN' (Cost=2 Card=1 Bytes=30)
13 12 INDEX (UNIQUE SCAN) OF 'PS_PA_ARBOR_ELEMEN' (UNIQUE) (Cost=1 Card=1)
14 1 TABLE ACCESS (FULL) OF 'PS_RF_ATTR_INSP' (Cost=3467 Card=2845310 Bytes=145110810)







Tom Kyte
February 24, 2004 - 2:00 pm UTC

hows about telling us what columns constitute the index.

Lots of data but none of it helps one out this time.

Need to know

a) what the columns in order are.
b) what the distinct cardinality of the FIRST column in the index is.

more info

pinguman, February 24, 2004 - 3:03 pm UTC

Hi

a) what the columns in order are.

'PS_RF_INST_PROD' is made up of SETID and INST_PROD_ID
'PS_PA_ARBOR_ELEMEN' is made up of SETID and INST_PROD_ID

b) what the distinct cardinality of the FIRST column in the index is.

SETID is the first/leading column and has value 'SHARE'




Tom Kyte
February 25, 2004 - 1:02 pm UTC

tell you what - if you export the schema definition for those tables:

exp userid=u/p 'tables=(list of tables in the query)' rows=n


and export the stats using dbms_stats into another dmp file


tar and compress them

email them to me.... I'll take a peek. need to understand the partitioning scheme.

Index Skip Scan in 8174 Database

Vivek Sharma, March 29, 2004 - 11:01 am UTC

Dear Tom,

I am facing a Unique Problem in my 8174 Version of Database. I have not seen this behaviour before and hence I feel this as a Unique Behaviour.

I have a table of 65 Lakhs record. Table Name is PDC_BANKING_TBL. I have created an Index PDC_BANKING_TBL_VS on columns(PBN_CHQ_NO, PBN_OUTBATCH_NO, PBN_CUST_CD, PBN_AGREEMENT_NO). The Ordering are same as mentioned. BUt my below mentioned query is doing a Index Full Scan on this Index and the queries takes lots of time.

select distinct(pbn_cust_cd) from pdchome.pdc_banking_tbl
where pbn_agreement_no = 'LBCHE000564718' ;

Query Plan
-----------------------------------------------------------
SELECT STATEMENT [CHOOSE] Cost = 9
SORT UNIQUE
INDEX FAST FULL SCAN PDC_BANKING_TBL_VS

Whereas when I drop this Index, it uses a Range Scan on another Index with column(PBN_AGREEMENT_NO, PBN_XYZ_COL). i.e. this Index has only one column in the query. When Using this second Index, the query runs in seconds.

The Schemas are analyzed and the statistics are latest. Then why the optimizer is using this Index Skip Scan feature? As per my knowledge, the index should only be used when the leading columns are used in the query.

Please help me out as this is the query which is executed most of the time in our database and takes heavy resources.

Regards
Vivek




Tom Kyte
March 29, 2004 - 11:49 am UTC

that is not a skip scan, its a full scan, very different.

show use the tkprofs with and without the indexes and use AUTOTRACE TRACEONLY explain as well please.

Difference between Full Scan and Index Skip Scan

Vivek Sharma, March 29, 2004 - 12:09 pm UTC

Dear Tom,

Thanks the update. But I am not satisfied the answer given by you. I basically know the difference between the Full Scan and Index Skip Scan and hence had asked this question that why the query is using an Index with the trailing columns. Basically it is the concept of Index Skip Scan introduced in 9i. But mine is an Oracle8i 8174 Database and my query is doing a Index Full Scan on Index having a trailing columns. Why ?

I cannot drop and recreate the Index as it is my production database and creating an Index takes about an hour but have tried this once. I created the Index, noted this behaviour and the response time was poor. I dropped the Index and then checked the plan, it used an index with the column involved in the where clause as a leading column. The response time was very good (in seconds). In the evening again I created this index again the query started using the new Index and the respone time poor.

Why is this behaviour ? I know this is not Index Skip Scan but the behavious is almost same. My question is not the concept. My question is why the query is using an Index with a trailing column ?

Regards
Vivek

Tom Kyte
March 29, 2004 - 12:18 pm UTC

satisfied or not, it is NOT skip scanning. If it were -- it would say "index skip scan".

It says clearly INDEX FAST FULL SCAN PDC_BANKING_TBL_VS

It is using the index as a SKINNY version of the table. It is full scanning the index using multi-block IO. It is not even a little related to skip scans.

The optimizer has decided that this index is small enough that the advantage of multi-block IO to read a little more outweighs the single block IO used in the range scan.


when you created this new index, did it have any statistics?
what is the count returned by said query




Thanks Tom

Vivek Sharma, March 29, 2004 - 12:35 pm UTC

Thanks Tom for your patience and quick response. The output of the query is only one row. Based your answer, I can come to this conclusion that even in oracle versions prior to 9i, Optimizer can make use of Index with a trailing columns. Then How do I avoid using this Index Full Scan. I cannot change the code and provide HINT as it is a third party application and I cannot drop the Index as it has improved the performance of other query.

Please provide me some solution.

Regards
Vivek


Tom Kyte
March 29, 2004 - 12:48 pm UTC

i know it is only ONE row -- my question was "how many records did we have to scan" -- what was the OUTPUT?

oracle has for a very very long time been able to use the index in this fashion, yes.


would you like to provide me with what I asked for?

did you compute stats on the index itself?
what was the output of the query?

Thanks a lot Tom

Vivek Sharma, March 30, 2004 - 1:17 am UTC

Dear Tom,

Thanks a lot for your valuable suggestions. After Analyzing the Index, it got resolved. I appreciate your patience and know that when I come to this site with some queries it will be resolved definetely. This is why I feel that I have learnt a lot from this site of yours.

Thanks again.

Regards
Vivek

What happens?

A reader, May 06, 2004 - 11:53 am UTC

Hi Tom,
Just a simple question for you. I have a table which has 13 columns. There is one primary key on that table and 3 foreign key columns. I recently did a query and discovered a full table scan. The table is big enough, greater than 4 Million records. The query i made extracted a few reocords and used one of the foreign key columns and another normal date column in the where clause.
The query was taking a lot of time than usual and after i queried the session using OEM i discovered that it was doing a full table scan.
The foreign key column is indexed and the other date column is indexed as well. But both of them are indexed in separate indexes and do not share the same index.
I did resolve the issue and created an index on both of them and man was it fast with few consistent gets, problem resolved.
But one question comes to my mind. Why didn't CBO use either or both the indexes when executing the query? Why is it necesary to have an index with both the columns indexed in the same index?
I would be obliged if u, sorry, you will shed some light on it.
;) I know u hate these SMS words :D Sorry, just kiddin'.

Tom Kyte
May 06, 2004 - 3:03 pm UTC

you have a date column.....
you have another scalar column....

tell us, what would be the result of:

select max(count(*)) from t group by dt_column;
select max(count(*)) from t group by scalar;
select max(count(*)) from t group by dt_column,scalar;


(and then see if you cannot come up with some reasons.... if not, just give us the counts)

Any Advise!!

A reader, May 14, 2004 - 9:48 am UTC

Hi there Tom,
Yet another question. I know about the Skip Scan Feature Of Oracle 9i. Well my question is a bit related to this feature. I have 2 indexes which index the same fields but the order of the columns are different. The first index has the Foreign key column on the leading edge. So i was thinking of deleteing the second one.
For e.g.
Indx1:
cols: a(FK),b,c,d,e,f
Indx2:
cols: b,c,a(FK),f,e,d
Should i go ahead with it?
Thanks a lot, as always.

Tom Kyte
May 15, 2004 - 10:28 am UTC

umm, in a word

no,
yes,
maybe


is A(FK) extremely repetitive (like it has maybe single digit numbers of distinct values). Yes/Maybe comes into play. In order for skip scanning to work, there must be a small number of distinct values on the leading edge of the index itself. If not, it's not going to fly.

As always - a test in your test environment would be called for.

skip scan index

David, May 15, 2004 - 1:13 pm UTC

My colleague was saying that in 9i, it does not matter whether the column referred in the where clause is in leading edge of the index or not because of the auto feature of "Skip Scan Index"? Is this true? If so I can create one composite index on T(col1,col2,col3) and my where clause can refer any of col1, col2, col3 and this index would be used by the optimizer? Does this feature "Skip Scan Index" reduce number of indexes that need to be created to satisfy all of the combinations that one could use in the where clause?

Tom Kyte
May 15, 2004 - 2:00 pm UTC

not true in general.

true in SPECIFIC CERTAIN CASES when the leading edge column value is of very low cardinality.


please -- search for

"skip scan"

on this site, read about it, learn about it. useful in a small set of cases.

why is leading column so important?

Srikanth Adiga, May 17, 2004 - 4:19 am UTC

Why is that the index is not used if the leading column is not present in the where column? I read somewhere that some statistics is stored along with the index. Is it true and what statistics is it?

~Sri

Tom Kyte
May 17, 2004 - 7:43 am UTC

say you:

create table t as select * from all_objects;

create index t_idx on t(object_id,object_name,owner);


the data in the index is sorted left to right by object_id, object_name, owner.

Now you query:

where owner = 'SCOTT';


you'd have to look at EACH AND EVERY index entry one by one to find "scott" as it could be anywhere.

Same with

where object_name = 'EMP' and owner = 'SCOTT' -- you would have to look at EACH and EVERY element...


An index skip scan in 9i can change that however, again, search for skip scan on this site if you want to read about that.

Column Ordering in an Index

Vivek Sharma, May 21, 2004 - 9:40 am UTC

Dear Tom,

I would like to know whether is there any issue in creating a composite index on columns with either leading or trailing column having only 2-3 distinct values out of a million rows.

For eg.
Assuming empno is not a primary key but a column with high distinct values.

Index A on (sex, empno)
Index B on (empno)
Index C on (empno, sex)

select * from emp
where empno=:b1
and sex='M';

Will there be any difference in the response time and number of logical reads when using above three indexes ?
The table has around 1 Million Records and the output of the query is around 10 rows.

I have tested this scenario and found out that there is no difference and the cost assigned is the same when using any of the Indexes.

Your opinion please.

Regards
Vivek

Tom Kyte
May 21, 2004 - 11:21 am UTC

depends -- if SEX can be used to "not go to the table", it could have a profound effect.  Say there are 50 records for an EMPNO.  Say that EMPNO is a female:

ops$tkyte@ORA9IR2> create table t ( empno int, sex varchar2(1) );
 
Table created.
 
ops$tkyte@ORA9IR2> insert into t values ( 1, 'F' );
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t values ( 2, 'F' );
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> alter table t minimize records_per_block;
 
Table altered.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t
  2  select mod(rownum,2)+1, 'F'
  3    from all_objects
  4   where rownum <= 100;
 
100 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create index t_idx on t(empno);
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select * from t where empno = 1 and sex = 'M';
 
no rows selected
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'T'
   2    1     INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         52  consistent gets
          0  physical reads
          0  redo size
        272  bytes sent via SQL*Net to client
        368  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> drop index t_idx;
 
Index dropped.
 
ops$tkyte@ORA9IR2> create index t_idx on t(empno,sex);
 
Index created.
 
ops$tkyte@ORA9IR2> set termout off
ops$tkyte@ORA9IR2> set autotrace traceonly
ops$tkyte@ORA9IR2> select * from t where empno = 1 and sex = 'M';
 
no rows selected
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        272  bytes sent via SQL*Net to client
        368  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
 
ops$tkyte@ORA9IR2> set autotrace off


If the columns in the index can help us "not have to goto the table", it can be very beneficial to have them in there.

Don't get hung up on costs -- try to think about "what would we have to do if that column was not in there" -- in this case, we might be able to avoid alot of IO in some (but not all, if empno=1 were a male, we'd have 51 io's regardless) cases. 

usage of skip scan even though leading column is in where clause

Vlado Barun, June 08, 2004 - 6:38 pm UTC

Tom,

any ideas under which conditions Oracle chooses an execution plan with a skip scan even though the leading columns in the index are part of the where clause?

SELECT SUM(NVL(implicit_congestion_charge,0) + NVL(explicit_congestion_charge,0))
FROM cototals.hourly_congestion
WHERE fk_settlement_type_id = 1 AND fk_org_id = 3 AND hour_date >= TO_DATE('01-FEB-04')
AND hour_date < TO_DATE('01-MAR-04')

call count cpu elapsed disk query current rows misses
------- ----- ------ --------- ----- ------- ------------ ------------ ---------
Parse 1 0.00 0.00 0 2 0 0 1
Execute 1 0.00 0.00 0 0 0 0 0
------- ----- ------ --------- ----- ------- ------------ ------------ ---------
total 2 0.00 0.00 0 2 0 0 1

|Rows Row Source Operation
|----- ---------------------------------------------------
| 0 SORT AGGREGATE
| 0 .FILTER
| 0 ..PARTITION RANGE ITERATOR PARTITION: KEY KEY
| 0 ...TABLE ACCESS BY LOCAL INDEX ROWID HOURLY_CONGESTION_T PARTITION: KEY KEY (object id 4271 )
| 0 ....INDEX SKIP SCAN NDX_HOURLY_CONGESTION_TMW PARTITION: KEY KEY (object id 43345 )


CREATE INDEX cototals.ndx_hourly_congestion_tmw ON cototals.hourly_congestion_t
(
hour_date ASC,
fk_settlement_type_id ASC,
fk_org_id ASC,
sched9_ftr_mws ASC
)
PCTFREE 10
INITRANS 2
MAXTRANS 255
LOCAL (
PARTITION P_2000_MAYJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255,
PARTITION P_2000__JULSEP
PCTFREE 10
INITRANS 2
MAXTRANS 255,
...
PARTITION P_2004_APRJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255
)
/


Tom Kyte
June 08, 2004 - 7:11 pm UTC

can i see the create table as well?

can you use explain plan and select * from table(dbms_xplan.display)? it would be interesting to see the filters.

re: usage of skip scan even though leading column is in where clause

Vlado Barun, June 09, 2004 - 9:29 am UTC

Below is the info you requested. The from clause refers to a view so I included it's script too, though I cut out the majority of the columns as it's a wide view and table.

barunv@MSETTST> EXPLain plan for
2 SELECT SUM(NVL(implicit_congestion_charge,0) +
3 NVL(explicit_congestion_charge,0))
4 FROM cototals.hourly_congestion
5 WHERE fk_settlement_type_id = 1 AND fk_org_id = 3 AND hour_date >=
6 TO_DATE('01-FEB-04')
7 AND hour_date < TO_DATE('01-MAR-04')
8 ;

Explained.

barunv@MSETTST> SELECT * FROM table(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
|Id|Operation | Name |Rows |Bytes|Cost|Pstart|Pstop|
--------------------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | | 1| 16| 22| | |
| 1| SORT AGGREGATE | | 1| 16| | | |
|*2| FILTER | | | | | | |
| 3| PARTITION RANGE ITERATOR | | | | | KEY | KEY|
| 4| TABLE ACCESS BY LOCAL INDEX ROWID|HOURLY_CONGESTION_T | 225| 3600| 22| KEY | |
|*5| INDEX SKIP SCAN |NDX_HOURLY_CONGESTION_TMW| 6| | 20| KEY | KEY|
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(TO_DATE('01-MAR-04')>TO_DATE('01-FEB-04') AND TO_DATE('01-FEB-04')<TO_DATE('01-MAR-04'
5 - access("HOURLY_CONGESTION_T"."HOUR_DATE">=TO_DATE('01-FEB-04') AND
"HOURLY_CONGESTION_T"."FK_SETTLEMENT_TYPE_ID"=1 AND "HOURLY_CONGESTION_T"."FK_ORG_ID"=3 AND
"HOURLY_CONGESTION_T"."HOUR_DATE"<TO_DATE('01-MAR-04'))
filter("HOURLY_CONGESTION_T"."FK_SETTLEMENT_TYPE_ID"=1 AND "HOURLY_CONGESTION_T"."FK_ORG_ID"=

Note: cpu costing is off

22 rows selected.

barunv@MSETTST>


CREATE OR REPLACE VIEW cototals.hourly_congestion (
hourly_congestion_id,
hour_date,
fk_org_id,
fk_settlement_type_id,
dst_flag,...
)
AS
select "HOURLY_CONGESTION_ID","HOUR_DATE","FK_ORG_ID","FK_SETTLEMENT_TYPE_ID", "DST_FLAG", ... from cototals.hourly_congestion_t

CREATE TABLE cototals.hourly_congestion_t
(hourly_congestion_id NUMBER(9,0) NOT NULL,
hour_date DATE NOT NULL,
fk_org_id NUMBER(9,0) NOT NULL,
fk_settlement_type_id NUMBER(9,0) NOT NULL,
dst_flag NUMBER(1,0) NOT NULL,
...)
PCTFREE 10
PCTUSED 40
INITRANS 1
MAXTRANS 255
PARTITION BY RANGE (HOUR_DATE)
(
PARTITION p_2000_mayjun VALUES LESS THAN (TO_DATE(' 2000-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10
PCTUSED 40
INITRANS 1
MAXTRANS 255,
...
PARTITION p_2004_aprjun VALUES LESS THAN (TO_DATE(' 2004-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
PCTFREE 10
PCTUSED 40
INITRANS 1
MAXTRANS 255
)

CREATE UNIQUE INDEX cototals.ndx_hourly_congestion_t ON cototals.hourly_congestion_t
(
hour_date ASC,
fk_settlement_type_id ASC,
dst_flag ASC,
fk_org_id ASC
)
PCTFREE 10
INITRANS 2
MAXTRANS 255
LOCAL (
PARTITION P_2000_MAYJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255,
...
PARTITION P_2004_APRJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255
)
/

CREATE INDEX cototals.ndx_hourly_congestion_fkorgid ON cototals.hourly_congestion_t
(
fk_org_id ASC,
fk_settlement_type_id ASC
)
PCTFREE 10
INITRANS 2
MAXTRANS 255
/

ALTER TABLE cototals.hourly_congestion_t
ADD CONSTRAINT pk_hourly_congestion_t PRIMARY KEY (hourly_congestion_id)
USING INDEX
PCTFREE 10
INITRANS 2
MAXTRANS 255
/


Tom Kyte
June 09, 2004 - 10:10 am UTC

Ok, i see now.

Predicate is on:

hour_date >= to_date(...) and hour_date < to_date( ... ) <<= index column 1
fk_settlement_type_id = 1 <<= index column 2
fk_org_id = 3 <<= index column 4

it is skip scanning on dst_flag. that is, for each hour_date/fk_settlement_type_id's -- it is saying "there are just a few dst_flags so we'll skip scan them to find the fk_org_ids=3 INSTEAD of having to range scan all key values for index column 1 and 2"

say for hour_date = "X" and fk_settlement_type_id = 1 there were 1,000 keys (1,000 dst_flag=some_value and fk_org_id=some_other_value).

We could either

a) read them all and process them
b) realize that there are 3 values for dst_flag (for example) and pretend you really asked:

where hour_date = "X" and fk_settlement_type_id = 1 and dst_flag in ( 1,2,3) and fk_org_id = 3

and do the three exact probes into this index -- reading just the records of interest instead of scanning all of them.


Neat, every day learn something new -- I did not realize we could/would skip scan "inside" the index like that.

re: usage of skip scan even though leading column is in where clause

A reader, June 09, 2004 - 4:33 pm UTC

Hm, but that can't be the case here since it's doing a skip scan on the ndx_hourly_congestion_tmw index which consists of 4 columns, of which the first three are part of the where clause. I think you were looking at the ndx_hourly_congestion_t index, which has the dst_flag.
The script for the ndx_hourly_congestion_tmw index is in the original post, not in my reply, sorry, I should have repeated it when I included the other scripts. I included it below.
Could it be that it's skip scanning on fk_settlement_type_id (it has only 3 distinct values out of 1.2 million rows in that partition) even though it's part of the where clause?

CREATE INDEX cototals.ndx_hourly_congestion_tmw ON cototals.hourly_congestion_t
(
hour_date ASC,
fk_settlement_type_id ASC,
fk_org_id ASC,
sched9_ftr_mws ASC
)
PCTFREE 10
INITRANS 2
MAXTRANS 255
LOCAL (
PARTITION P_2000_MAYJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255,
PARTITION P_2000__JULSEP
PCTFREE 10
INITRANS 2
MAXTRANS 255,
...
PARTITION P_2004_APRJUN
PCTFREE 10
INITRANS 2
MAXTRANS 255
)
/


Tom Kyte
June 09, 2004 - 4:50 pm UTC

how many distinct hour_date values would that range cover?

re: usage of skip scan even though leading column is in where clause

Vlado Barun, June 09, 2004 - 7:28 pm UTC

barunv@MSETPRD> select count(distinct hour_date)
2 from cototals.hourly_congestion
3 WHERE hour_date >= TO_DATE('01-FEB-04') AND hour_date < TO_DATE('01-MAR-04')
4 /

COUNT(DISTINCTHOUR_DATE)
------------------------
696


Based on index

Saar, June 10, 2004 - 8:11 am UTC

Tom,

I am bit confused with a sql statement which is working at SQL promot but when tried to use it in a procedure, it's throwing with a njon-numeric or value error. I have tried to hard code it but it's of no use. Pls. try to sought it out at the earliest.

Query :

I have duplicate rows in my table. For example...

ref_id serial_number from_date to_date
213 1 01/03/2004 31/03/2004
213 2 01/04/2004 15/07/2004
213 3 01/03/2004 31/03/2004

I want to delete duplicate columns and make the serial number in order.

ex:-
ref_id serial_number from_date to_date
213 1 01/03/2004 31/03/2004
213 2 01/04/2004 15/07/2004

i have the query but it's not working when used it in a procedure.

INSERT INTO <TABLE>
(REF_ID,SERIAL_NUMBER,FROM_DATE,TO_DATE)
SELECT REF_ID,ROWNUM,FROM_DATE,TO_DATE
FROM ( SELECT REF_ID,SERIAL_NUMBER,
MIN(FROM_DATE),MAX(TO_DATE)
FROM <TABLE>
WHERE REF_ID = 213
GROUP BY REF_ID,SERIAL_NUMBER
);

Your feedback will be highly appreciated.

Rgrd
Saar

Tom Kyte
June 10, 2004 - 9:40 am UTC

if you give me a reproducible test case I too can just cut and paste and pretty much run (with create tables, insert intos, the statement as it works, the plsql routine with the statement not working) in the manner I give examples to you -- I'll be glad to take a look!

sanjay, July 17, 2004 - 6:47 am UTC

Hi Tom,


 create index emp_ena_sal on emp(ename,sal);

Query # I

SQL> select * from emp where rownum < 1;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   COUNT (STOPKEY)
   2    1     TABLE ACCESS (FULL) OF 'EMP'


query # II


  1  select * from emp where rownum < 1
  2  order by ename,sal
SQL> /

Execution Plan
----------------------------------------------------------
  0      SELECT STATEMENT Optimizer=CHOOSE
  1    0   COUNT (STOPKEY)
  2    1    TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
  3    2     INDEX (FULL SCAN) OF 'EMP_ENA_SAL'(NON-UNIQUE)


I was just wondering why the index being used for 'order by' clause. 



regards
sanjay  

Tom Kyte
July 17, 2004 - 2:46 pm UTC

because it wanted to?

but without full examples -- it is really hard to say. on my system, using demobld.sql -- we'd never get that result -- since ename and sal are NULLABLE in the demos and the index could not be used.

Dont understand

A reader, July 17, 2004 - 8:23 pm UTC

In one of your followups above, where you demonstrate how having more columns in the index helps...

"depends -- if SEX can be used to "not go to the table", it could have a profound effect. Say there are 50 records for an EMPNO. Say that EMPNO is a female:"
...

ops$tkyte@ORA9IR2> alter table t minimize records_per_block;

1. What is the reason for specifying this?

ops$tkyte@ORA9IR2> select * from t where empno = 1 and sex = 'M';

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1 consistent gets
"if empno=1 were a male, we'd have 51 io's regardless"

I dont understand this.

2. Why is there only one LIO in this case?

Oracle fetches the first leaf from the index, sees that emp_no=1, sex=F, that doesnt satisfy the query. There might be many more leafs with emp_no=1 since emp_no is not an unique index. So why does Oracle stop after one LIO? How does it know that no other rows satisfy the quuery?

Thanks

Tom Kyte
July 18, 2004 - 11:54 am UTC

1) wanted to make an example that would have the data spread over many blocks, simulating a larger rowsize. I could have just added a char(2000) or something as well.


the 50 io's regardless was because i had 100 records -- 50 with empno = 1 and 50 with empno = 2. IF empno = 1 were a "M" record -- we would have had 50 table access by index rowids. Assuming a "normal" program not performing array fetches -- we would have had 50 IO's to the table.


2) because we read the single small index block, inspected all empno=1's to see if any were male (they were not) and returned.


In this case, the index was teeny tiny. A single block -- for 100 measely rows.

Dont understand

A reader, July 18, 2004 - 12:41 pm UTC

Ok so what you are saying is that the index uses only 1 block. So, it is almost like a IOT where the 1 index block contains all the table data.

Regarding (1), are you saying that the table data is spread across 50 blocks? How?

I still dont get the 50 IOs part. If emp_no=1 were a "M" record, since the index has all the columns needed by the query, that single "emp_no=1" index leaf block would satisfy the query. Why would it need to do the 50 additional IOs to the table?

Thanks

Tom Kyte
July 18, 2004 - 2:44 pm UTC

That index, on 100 rows -- indexing the EMPNO, SEX column would be on a single block.

An IOT is a table in an index and in general would have lots more than one block.


The table data was forced over 50 different blocks by the alter table minimize command, yes. And the way I loaded the data (empno=1, empno=2, empno=1, empno=2....) made it so that every record for an empno=X value would be on a different block. I forced that.

I was assuming there would be more columns "in real life", so the select * would force a table access by index rowid to pick up the others.


sanjay, July 19, 2004 - 12:47 am UTC

sorry Tom I have not mention one important thing I have alter the emp table 
SQL> desc emp
 Name         Null?    Type
 ----------- -------- ----------------------------
 EMPNO        NOT NULL  NUMBER(4)
 ENAME        NOT NULL  VARCHAR2(10)
 JOB                    VARCHAR2(9)
 MGR                    NUMBER(4)
 HIREDATE               DATE
 SAL                    NUMBER(7,2)
 COMM                   NUMBER(7,2)
 DEPTNO                 NUMBER(2)



 

Tom Kyte
July 19, 2004 - 7:33 am UTC

same answer applies, because it wanted to. the rbo uses indexes whenever possible. also, with the stop count -- the rownum clause -- it makes sense to get the first rows as fast as possible.

Range Scan - IFFS

Shivaswamy, January 05, 2005 - 9:57 am UTC

Tom,

I could not understand, why in one case range scan and in the other Index Fast FUll Scan. Can you tell me why, please?

CREATE TABLE test
(
REF# VARCHAR2(12 BYTE) NOT NULL,
CLAIM# VARCHAR2(12 BYTE) NOT NULL,
SOURCEID NUMBER(4) DEFAULT '2' NOT NULL,
SEQNUM NUMBER(6) NOT NULL,
CODE VARCHAR2(1 BYTE) NOT NULL,
TRANSACTIONTS DATE NOT NULL,
INSERTTS DATE NOT NULL,
LASTUPDATETS DATE NOT NULL
)
noLOGGING
NOCACHE
NOPARALLEL
MONITORING;

CREATE UNIQUE INDEX test_PK ON test
(REF#, CLAIM#)
noLOGGING NOPARALLEL;

ALTER TABLE test ADD (
CONSTRAINT test_PK PRIMARY KEY (REF#, CLAIM#));

I have inserted 8,800,474 rows in to it.

I have "analyze table test for compute statistics for table for all indexes for all indexed columns;"

And here is the snippet from the trace file

********************************************************************************
Select ref#
From TEST
Where claim# = :"SYS_B_0"

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 4.10 4.01 0 27750 2 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 4.10 4.01 0 27750 2 1

Rows Row Source Operation
------- ---------------------------------------------------
1 INDEX FAST FULL SCAN TEST_PK (object id 154936)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 INDEX GOAL: ANALYZED (FAST FULL SCAN) OF 'TEST_PK' (UNIQUE)

Select claim#
From TEST
Where ref# = :"SYS_B_0"

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 4 0 1

Rows Row Source Operation
------- ---------------------------------------------------
1 INDEX RANGE SCAN TEST_PK (object id 154936)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'TEST_PK' (UNIQUE)

********************************************************************************



Tom Kyte
January 05, 2005 - 11:02 am UTC

try to conceptualize the index.

it is data sorted by key, your key is "ref#, claim#"

if you say "where claim# = :x", it must inspect EACH AND EVERY index entry (well, there is a skip scan, but we won't go there right now). Why? because the index is sorted by REF# and within each REF# by CLAIM#. So, say you have ref#,claim# = 1, 1 and 1000000000, 1

the first 1,1 is on the "left hand side of the index tree"
the second 1000000000, 1 record is on the "right hand side of the index tree"

and there are millions of records in between -- we have to look at all of them because claim# = 1 could be in none, any or ALL of them....


that you are using cursor_sharing=force is a bad thing, you need to use bind variables!

Thanks

Shivaswamy, January 05, 2005 - 12:16 pm UTC

Tom,

Thank you very much for explaining this one. It is clear to me. On the bind variables: This is more of a warehouse.

Shivaswamy

Tom Kyte
January 05, 2005 - 12:41 pm UTC

then cursor_sharing=force is evil still.

it was evil then, it is evil now. if you don't want binds, cursor_sharing=force is putting them in. If you want bind variables, you shouldn't need to set it, you would have coded it in the first place.

Shivaswamy, January 08, 2005 - 8:52 pm UTC

Tom
Actually I have cursor_sharing=similar on this database. Curious to know what made you to think that, it was FORCE?
Shivaswamy

Tom Kyte
January 09, 2005 - 11:24 am UTC

similar/force -- same basic effect.

Select ref#
From TEST
Where claim# = :"SYS_B_0"

shows me is it not the setting it should be which is EXACT.




Thanks, Tom.

A reader, January 09, 2005 - 9:57 pm UTC


Skip scan on PK

Peter Tran, January 20, 2005 - 11:43 pm UTC

Hi Tom,

This is related to something you answered earlier. I have the following IOT
table with only PK index. It's ranged partitioned by DPTDATE.

The values in ( ) are the # of distinct values for that column.

> desc od_fltfrc;
Name Null? Type
----------------------------------------------------- -------- --------------
CRRCODE ( 1) NOT NULL VARCHAR2(3)
FLTNUM ( 2021) NOT NULL VARCHAR2(5)
DPTDATE ( 40) NOT NULL DATE
ORGN ( 175) NOT NULL VARCHAR2(5)
DSTN ( 174) NOT NULL VARCHAR2(5)
CLSCODE ( 20) NOT NULL VARCHAR2(2)
POS ( 75) NOT NULL VARCHAR2(9)
PAXTYPE ( 2) NOT NULL VARCHAR2(1)
DCP ( 22) NOT NULL NUMBER(2)
FRCDATE ( 1) NOT NULL DATE
BKGMEAN NUMBER
BKGMEANINFLUENCED NUMBER
BKGVARIANCE NUMBER
XXLMEAN NUMBER
XXLMEANINFLUENCED NUMBER
XXLVARIANCE NUMBER

> select /*+ parallel(od_fltfrc 4) */ count(*) from od_fltfrc;

COUNT(*)
----------
9358356

Key columns are in order:

DPTDATE
FLTNUM
CRRCODE
ORGN
DSTN
CLSCODE
POS
PAXTYPE
DCP
FRCDATE

Question: Why is my update SQL using skip scan when I give it all but the last
column in the PK in the predicate?

UPDATE OD_FLTFRC SET FRCDATE = :1, XXLMEAN = :2, BKGVARIANCE = :3,
BKGMEANINFLUENCED = :4, XXLMEANINFLUENCED = :5, BKGMEAN = :6,
XXLVARIANCE = :7
WHERE
DSTN = :8 AND ORGN = :9 AND FLTNUM = :10 AND PAXTYPE = :11 AND DPTDATE = :12
AND POS = :13 AND DCP = :14 AND CRRCODE = :15 AND CLSCODE = :16


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 37 0.01 0.00 0 0 0 0
Execute 109348 1807.89 1803.49 719 34922283 631234 109348
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 109385 1807.90 1803.49 719 34922283 631234 109348

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 36

Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE
2882 PARTITION RANGE ALL PARTITION: 1 25
2882 INDEX SKIP SCAN OD_FLTFRC_PK PARTITION: 1 25 (object id 67148)


Thanks,
-Peter


Tom Kyte
January 21, 2005 - 7:49 am UTC

need to see the example, the partitioning scheme.

(I can say it should be obvious that a bit of ARRAY PROCESSING on your part would be of immense help to your performance here..... Updating 10% of the data slow by slow....)

We are array processing

Peter Tran, January 21, 2005 - 9:16 am UTC

Hi Tom,

I'll put together the example and DDL for you.

Regarding your comment about using "ARRAY PROCESSING". I'm very sure we are. This is a JAVA server-side program that uses java.sql.PrepareStatement.

PrepareStatement pstmt = ... ;
for ( 1 ... # data to add )
{
// bind all data
pstmt.addBatch();
}
pstmt.executeBatch();

Isn't this the most optimal way to do it?

Thanks,
-Peter

Tom Kyte
January 21, 2005 - 12:10 pm UTC

Execute 109348 1807.89 1803.49 719 34922283 631234 109348

executed 109,348 times
for 109,348 rows


you are BATCHING, not array processing



conn.setAutoCommit(false);
((OracleConnection)conn).setDefaultExecuteBatch (50);
PreparedStatement ps =
conn.prepareStatement ("insert into dept (deptno,dname,loc) values (?, ?, ?)");


would do an array insert of 50 rows at a time (executes the statement once with 50 sets of inputs)

Gotcha!

Peter Tran, January 21, 2005 - 12:21 pm UTC

So if I was array processing with 50 then my stats would be:

109,348 / 50 + 1 = 2187

executed 2,187 times
for 109,348 rows

correct?

Tom Kyte
January 21, 2005 - 6:40 pm UTC

yes.... the difference would look like this:

ops$tkyte@ORA9IR2> declare
  2          type array is table of number index by binary_integer;
  3          l_data array;
  4  begin
  5          for i in 1 .. 100
  6          loop
  7                  insert into t slow_by_slow values ( i );
  8                  l_data(l_data.count+1) := i;
  9          end loop;
 10
 11          forall i in 1 .. l_data.count
 12                  insert into t array values ( l_data(i) );
 13  end;
 14  /
 
PL/SQL procedure successfully completed.
 
INSERT into t slow_by_slow values ( :b1 )
                                                                                                                                   
                                                                                                                                   
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute    100      0.02       0.02          0          1        108         100
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      101      0.02       0.02          0          1        108         100
                                                                                                                                   
********************************************************************************
INSERT into t array values ( :b1 )
                                                                                                                                   
                                                                                                                                   
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          1         100
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          1         100


 

Skip Scan on PK

Peter Tran, January 21, 2005 - 2:59 pm UTC

Hi Tom,

Here's the create table as you requested:

CREATE TABLE OD_FLTFRC
(CRRCODE VARCHAR2(3) NOT NULL ENABLE,
FLTNUM VARCHAR2(5) NOT NULL ENABLE,
DPTDATE DATE NOT NULL ENABLE,
ORGN VARCHAR2(5) NOT NULL ENABLE,
DSTN VARCHAR2(5) NOT NULL ENABLE,
CLSCODE VARCHAR2(2) NOT NULL ENABLE,
POS VARCHAR2(9) NOT NULL ENABLE,
PAXTYPE VARCHAR2(1) NOT NULL ENABLE,
DCP NUMBER(2,0) NOT NULL ENABLE,
FRCDATE DATE NOT NULL ENABLE,
BKGMEAN NUMBER,
BKGMEANINFLUENCED NUMBER,
BKGVARIANCE NUMBER,
XXLMEAN NUMBER,
XXLMEANINFLUENCED NUMBER,
XXLVARIANCE NUMBER,
CONSTRAINT OD_FLTFRC_PK PRIMARY KEY (
DPTDATE, FLTNUM, CRRCODE, ORGN,
DSTN, CLSCODE, POS, PAXTYPE, DCP, FRCDATE)
) ORGANIZATION INDEX
PARTITION BY RANGE (DPTDATE)
(
PARTITION FLTFRC_200401 VALUES LESS THAN (TO_DATE('20040201', 'YYYYMMDD')),
PARTITION FLTFRC_200402 VALUES LESS THAN (TO_DATE('20040301', 'YYYYMMDD')),
PARTITION FLTFRC_200403 VALUES LESS THAN (TO_DATE('20040401', 'YYYYMMDD')),
PARTITION FLTFRC_200404 VALUES LESS THAN (TO_DATE('20040501', 'YYYYMMDD')),
PARTITION FLTFRC_200405 VALUES LESS THAN (TO_DATE('20040601', 'YYYYMMDD')),
PARTITION FLTFRC_200406 VALUES LESS THAN (TO_DATE('20040701', 'YYYYMMDD')),
PARTITION FLTFRC_200407 VALUES LESS THAN (TO_DATE('20040801', 'YYYYMMDD')),
PARTITION FLTFRC_200408 VALUES LESS THAN (TO_DATE('20040901', 'YYYYMMDD')),
PARTITION FLTFRC_200409 VALUES LESS THAN (TO_DATE('20041001', 'YYYYMMDD')),
PARTITION FLTFRC_200410 VALUES LESS THAN (TO_DATE('20041101', 'YYYYMMDD')),
PARTITION FLTFRC_200411 VALUES LESS THAN (TO_DATE('20041201', 'YYYYMMDD')),
PARTITION FLTFRC_200412 VALUES LESS THAN (TO_DATE('20050101', 'YYYYMMDD')),
PARTITION FLTFRC_200501 VALUES LESS THAN (TO_DATE('20050201', 'YYYYMMDD')),
PARTITION FLTFRC_200502 VALUES LESS THAN (TO_DATE('20050301', 'YYYYMMDD')),
PARTITION FLTFRC_200503 VALUES LESS THAN (TO_DATE('20050401', 'YYYYMMDD')),
PARTITION FLTFRC_200504 VALUES LESS THAN (TO_DATE('20050501', 'YYYYMMDD')),
PARTITION FLTFRC_200505 VALUES LESS THAN (TO_DATE('20050601', 'YYYYMMDD')),
PARTITION FLTFRC_200506 VALUES LESS THAN (TO_DATE('20050701', 'YYYYMMDD')),
PARTITION FLTFRC_200507 VALUES LESS THAN (TO_DATE('20050801', 'YYYYMMDD')),
PARTITION FLTFRC_200508 VALUES LESS THAN (TO_DATE('20050901', 'YYYYMMDD')),
PARTITION FLTFRC_200509 VALUES LESS THAN (TO_DATE('20051001', 'YYYYMMDD')),
PARTITION FLTFRC_200510 VALUES LESS THAN (TO_DATE('20051101', 'YYYYMMDD')),
PARTITION FLTFRC_200511 VALUES LESS THAN (TO_DATE('20051201', 'YYYYMMDD')),
PARTITION FLTFRC_200512 VALUES LESS THAN (TO_DATE('20060101', 'YYYYMMDD')),
PARTITION FLTFRC_200601 VALUES LESS THAN (TO_DATE('20060201', 'YYYYMMDD'))
);

Let me know if you need anything else.

BTW, if you deem this to be a new question, then let me know and I'll wait until your queue is open for questioning.

Thanks!

Tom Kyte
January 21, 2005 - 7:57 pm UTC

I'm thinking this is going to be an implicit conversion on a bind variable.. meaning you are comparing database column of type X to bind variable of type Y and there is an implicit function being applied to the database column.


so, we need to understand the binding that is happening here.  Can you get:

ops$tkyte@ORA9IR2> variable b1 varchar2(5)
ops$tkyte@ORA9IR2> variable b2 varchar2(5)
ops$tkyte@ORA9IR2> variable b3 varchar2(5)
ops$tkyte@ORA9IR2> variable b4 varchar2(5)
ops$tkyte@ORA9IR2> variable b5 varchar2(5)
ops$tkyte@ORA9IR2> variable b6 varchar2(5)
ops$tkyte@ORA9IR2> variable b7 varchar2(5)
ops$tkyte@ORA9IR2> variable b8 varchar2(5)
ops$tkyte@ORA9IR2> variable b9 varchar2(5)
ops$tkyte@ORA9IR2> variable b10 varchar2(5)
ops$tkyte@ORA9IR2> variable b11 varchar2(5)
ops$tkyte@ORA9IR2> variable b12 varchar2(5)
ops$tkyte@ORA9IR2> variable b13 varchar2(5)
ops$tkyte@ORA9IR2> variable b14 varchar2(5)
ops$tkyte@ORA9IR2> variable b15 varchar2(5)
ops$tkyte@ORA9IR2> variable b16 varchar2(5)
ops$tkyte@ORA9IR2> variable b17 varchar2(5)
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> delete from plan_table;
 
4 rows deleted.
 
ops$tkyte@ORA9IR2> explain plan for
  2  UPDATE OD_FLTFRC SET FRCDATE = :b1, XXLMEAN = :b2, BKGVARIANCE = :b3,
  3    BKGMEANINFLUENCED = :b4, XXLMEANINFLUENCED = :b5, BKGMEAN = :b6,
  4      XXLVARIANCE = :b7
  5          WHERE
  6           DSTN = :b8 AND ORGN = :b9 AND FLTNUM = :b10 AND PAXTYPE = :b11 AND DPTDATE = :b12
  7             AND POS = :b13 AND DCP = :b14 AND CRRCODE = :b15 AND CLSCODE = :b16
  8  /
 
Explained.
 
ops$tkyte@ORA9IR2> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------
| Id  | Operation               |  Name         | Rows  | Bytes | Cost  | Pstart| Pstop |
-----------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT        |               |     1 |   135 |     2 |       |       |
|   1 |  UPDATE                 | OD_FLTFRC     |       |       |       |       |       |
|   2 |   PARTITION RANGE SINGLE|               |       |       |       |   KEY |   KEY |
|*  3 |    INDEX RANGE SCAN     | OD_FLTFRC_PK  |     1 |   135 |     2 |   KEY |   KEY |
-----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("OD_FLTFRC"."DPTDATE"=:Z AND "OD_FLTFRC"."FLTNUM"=:Z AND
              "OD_FLTFRC"."CRRCODE"=:Z AND "OD_FLTFRC"."ORGN"=:Z AND "OD_FLTFRC"."DSTN"=:Z AND
              "OD_FLTFRC"."CLSCODE"=:Z AND "OD_FLTFRC"."POS"=:Z AND "OD_FLTFRC"."PAXTYPE"=:Z AND
              "OD_FLTFRC"."DCP"=TO_NUMBER(:Z))
 
Note: cpu costing is off
 
19 rows selected.


to reproduce your query plan by wrapping TO_DATE, TO_NUMBER around your bind variables as they exist in your code (eg: if :1 was bound as a string, leave it be.  if :2 was bound as a number type in your code -- put to_number around it and so on) 

Implicit conversion even when calling the correct setXXX()?

Peter Tran, January 21, 2005 - 9:06 pm UTC

Hi Tom,

I think I understand what you're saying, but would Oracle still do an implicit conversion even when I call the appropriate java.sql.PreparedStatement.setXXX() method for the column type?

I'll modify the code to do what you suggest to see if it helps.

BTW, I took at the raw trace file and it looks like Oracle is using the correct "dty" value.

PARSING IN CURSOR #5 len=293 dep=0 uid=36 oct=6 lid=36 tim=1080348806918157 hv=1079826328 ad='7c5dd27c'
UPDATE OD_FLTFRC SET FRCDATE = :1, XXLMEAN = :2, BKGVARIANCE = :3, BKGMEANINFLUENCED = :4, XXLMEANINFLUENCED = :5, BKGMEAN = :6, XXLVARIANCE = :7 WHERE DSTN = :8 AND ORGN = :9 AND FLTNUM = :10 AND PAXTYPE = :11 AND DPTDATE = :12 AND POS = :13 AND DCP = :14 AND CRRCODE = :15 AND CLSCODE = :16
END OF STMT
PARSE #5:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=1080348806918147
BINDS #5:
bind 0: dty=180 mxl=11(11) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=416 offset=0
bfp=40793610 bln=11 avl=07 flg=05
value=
Dump of memory from 0x40793610 to 0x40793617
40793610 03016978 00010101 [xi......]
bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=12
bfp=4079361c bln=22 avl=00 flg=01
bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=36
bfp=40793634 bln=22 avl=00 flg=01
bind 3: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=60
bfp=4079364c bln=22 avl=01 flg=01
value=0
bind 4: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=84
bfp=40793664 bln=22 avl=03 flg=01
value=.0588
bind 5: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=108
bfp=4079367c bln=22 avl=00 flg=01
bind 6: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=132
bfp=40793694 bln=22 avl=00 flg=01
bind 7: dty=1 mxl=32(03) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=156
bfp=407936ac bln=32 avl=03 flg=01
value="MUC"
bind 8: dty=1 mxl=32(03) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=188
bfp=407936cc bln=32 avl=03 flg=01
value="TRN"
bind 9: dty=1 mxl=32(05) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=220
bfp=407936ec bln=32 avl=05 flg=01
value="04023"
bind 10: dty=1 mxl=32(01) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=252
bfp=4079370c bln=32 avl=01 flg=01
value="G"
bind 11: dty=180 mxl=11(11) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=284
bfp=4079372c bln=11 avl=07 flg=01
value=
Dump of memory from 0x4079372C to 0x40793733
40793720 1C036978 [xi..]
40793730 00010101 [....]
bind 12: dty=1 mxl=32(02) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=296
bfp=40793738 bln=32 avl=02 flg=01
value="**"
bind 13: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=0 offset=328
bfp=40793758 bln=22 avl=02 flg=01
value=20
bind 14: dty=1 mxl=32(02) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=352
bfp=40793770 bln=32 avl=02 flg=01
value="LH"
bind 15: dty=1 mxl=32(01) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=0 offset=384
bfp=40793790 bln=32 avl=01 flg=01
value="V"


In your book, you wrote "The dty (data type) number may be decoded using information from the USER_TABL_COLUMNS view." Can you show me how? When I did what you suggest I get.

> select text from all_views where view_name = 'USER_VIEWS';

TEXT
-------------------------------------------------------------------------

select o.name, v.textlength, v.text, t.typetextlength, t.typetext,
t.oidtextlength, t.oidtext, t.typeowner, t.typename,
decode(bitand(v.property, 134217728), 134217728,
(select sv.name from superobj$ h, obj$ sv
where h.subobj# = o.obj# and h.superobj# = sv.obj#), null)
from sys.obj$ o, sys.view$ v, sys.typed_view$ t
where o.obj# = v.obj#
and o.obj# = t.obj#(+)
and o.owner# = userenv('SCHEMAID')

Thanks,
-Peter

Tom Kyte
January 22, 2005 - 9:47 am UTC

thats it.

frcdate is a DATE, dptdate is a DATE, but

 bind 0: dty=180

is a timestamp.


ops$tkyte@ORA9IR2> select text from all_views where view_name = 'USER_TAB_COLS'
  2  /
 
TEXT
--------------------------------------------------------------------------------
select o.name,
       c.name,
       decode(c.type#, 1, decode(c.charsetform, 2, 'NVARCHAR2', 'VARCHAR2'),
                       2, decode(c.scale, null,
                                 decode(c.precision#, null, 'NUMBER', 'FLOAT'),
                                 'NUMBER'),
                       8, 'LONG',
                       9, decode(c.charsetform, 2, 'NCHAR VARYING', 'VARCHAR'),
                       12, 'DATE', 23, 'RAW', 24, 'LONG RAW',
                       58, nvl2(ac.synobj#, (select o.name from obj$ o
                                where o.obj#=ac.synobj#), ot.name),
                       69, 'ROWID',
                       96, decode(c.charsetform, 2, 'NCHAR', 'CHAR'),
                       105, 'MLSLABEL',
                       106, 'MLSLABEL',
                       111, nvl2(ac.synobj#, (select o.name from obj$ o
                                 where o.obj#=ac.synobj#), ot.name),
                       112, decode(c.charsetform, 2, 'NCLOB', 'CLOB'),
                       113, 'BLOB', 114, 'BFILE', 115, 'CFILE',
                       121, nvl2(ac.synobj#, (select o.name from obj$ o
                                 where o.obj#=ac.synobj#), ot.name),
                       122, nvl2(ac.synobj#, (select o.name from obj$ o
                                 where o.obj#=ac.synobj#), ot.name),
                       123, nvl2(ac.synobj#, (select o.name from obj$ o
                                 where o.obj#=ac.synobj#), ot.name),
                       178, 'TIME(' ||c.scale|| ')',
                       179, 'TIME(' ||c.scale|| ')' || ' WITH TIME ZONE',
                       180, 'TIMESTAMP(' ||c.scale|| ')',
                       181, 'TIMESTAMP(' ||c.scale|| ')' || ' WITH TIME ZONE',
                       231, 'TIMESTAMP(' ||c.scale|| ')' || ' WITH LOCAL TIME ZO
NE',
                       182, 'INTERVAL YEAR(' ||c.precision#||') TO MONTH',
                       183, 'INTERVAL DAY(' ||c.precision#||') TO SECOND(' ||
                             c.scale || ')',
                       208, 'UROWID',
                       'UNDEFINED'),



If you:


ops$tkyte@ORA9IR2> set autotrace traceonly explain
ops$tkyte@ORA9IR2> UPDATE OD_FLTFRC SET FRCDATE = :b1, XXLMEAN = :b2, BKGVARIANCE = :b3,
  2    BKGMEANINFLUENCED = :b4, XXLMEANINFLUENCED = :b5, BKGMEAN = :b6,
  3      XXLVARIANCE = :b7
  4          WHERE
  5           DSTN = :b8 AND ORGN = :b9 AND FLTNUM = :b10 AND PAXTYPE = :b11 AND DPTDATE = :b12
  6             AND POS = :b13 AND DCP = :b14 AND CRRCODE = :b15 AND CLSCODE = :b16
  7  /
 
0 rows updated.
 
 
Execution Plan
----------------------------------------------------------
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=135)
   1    0   UPDATE OF 'OD_FLTFRC'
   2    1     PARTITION RANGE (SINGLE)
   3    2       INDEX (RANGE SCAN) OF 'OD_FLTFRC_PK' (UNIQUE) (Cost=2 Card=1 Bytes=135)
 
 
 
ops$tkyte@ORA9IR2> UPDATE OD_FLTFRC SET FRCDATE = to_timestamp(:b1), XXLMEAN = :b2, BKGVARIANCE = :b3,
  2    BKGMEANINFLUENCED = :b4, XXLMEANINFLUENCED = :b5, BKGMEAN = :b6,
  3      XXLVARIANCE = :b7
  4          WHERE
  5           DSTN = :b8 AND ORGN = :b9 AND FLTNUM = :b10 AND PAXTYPE = :b11 AND DPTDATE = to_timestamp(:b12)
  6             AND POS = :b13 AND DCP = :b14 AND CRRCODE = :b15 AND CLSCODE = :b16
  7  /
 
0 rows updated.
 
 
Execution Plan
----------------------------------------------------------
   0      UPDATE STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=135)
   1    0   UPDATE OF 'OD_FLTFRC'
   2    1     PARTITION RANGE (ALL)
   3    2       INDEX (FAST FULL SCAN) OF 'OD_FLTFRC_PK' (UNIQUE) (Cost=5 Card=1 Bytes=135)
 
 
 
ops$tkyte@ORA9IR2> set autotrace off


See how it flips from range to fast full scanning?  In your case, it knew that DPTDATE was low cardinality so it index skip scanned.



 

Sweet!!!

Peter Tran, January 22, 2005 - 11:42 am UTC

Grand Master Tom comes through again. :)

Have a great weekend.

Dump of memory from

A reader, May 18, 2005 - 7:46 pm UTC

Here is the 10046 level 12 trace file from Oracle 8.1.7.3 and it was a recursive call with USER_ID:SYS

=====================
PARSING IN CURSOR #25 len=85 dep=1 uid=0 oct=6 lid=0 tim=119192781 hv=1004498818 ad='11937a88'
UPDATE DBMS_LOCK_ALLOCATED SET EXPIRATION=SYSDATE + (:b1 / 86400 ) WHERE ROWID = :b2
END OF STMT
PARSE #25:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=119192781
BINDS #25:
bind 0: dty=2 mxl=22(21) mal=00 scl=00 pre=00 oacflg=03 oacfl2=1 size=24 offset=0
bfp=18b577e8 bln=22 avl=03 flg=05
value=864000
bind 1: dty=208 mxl=4000(4000) mal=00 scl=00 pre=00 oacflg=00 oacfl2=1 size=4000 offset=0
bfp=18d49138 bln=4000 avl=13 flg=05
value=
Dump of memory from 0x18D49138 to 0x18D49145
18D49130 0100028A AD000100 Ý....Ý...¨
18D49140 01935E00 2028B1C0 Ý.l;....{¨
EXEC #25:c=0,e=0,p=0,cr=1,cu=1,mis=0,r=1,dep=1,og=4,tim=119192781
XCTEND rlbk=0, rd_only=0
XCTEND rlbk=0, rd_only=1

FETCH #23:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=119192782
WAIT #26: nam='library cache pin' ela= 1 p1=267653288 p2=243543904 p3=20
WAIT #26: nam='library cache pin' ela= 0 p1=254221800 p2=243543760 p3=20
=====================

Within 10 minutes interval, #25 was called 216 times, every time it emitted 'Dump of memory'.

It didn't cost much in cpu and elapsed time, but is it normal? if not, what caused it? and what can we do avoid it?

Thanks in advance!

Tom Kyte
May 18, 2005 - 8:07 pm UTC

that is just the bind output display, nothing scary there.
that is the rowid being printed out.

what is your CODE doing -- looks related to dbms_lock.

why two XCTEND entries?

A reader, May 19, 2005 - 1:32 am UTC

Thanks Tom!

Now I only have 10046 trace file, and I scanned the entire file and found the following statements are related to dbms_lock

DELETE FROM DBMS_LOCK_ALLOCATED WHERE EXPIRATION < SYSDATE

PARSING IN CURSOR #24 len=69 dep=1 uid=0 oct=3 lid=0 tim=119192781 hv=2283848797 ad='11938320'
SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME = :b1 FOR UPDATE
END OF STMT

PARSING IN CURSOR #25 len=85 dep=1 uid=0 oct=6 lid=0 tim=119192781 hv=1004498818 ad='11937a88'
UPDATE DBMS_LOCK_ALLOCATED SET EXPIRATION=SYSDATE + (:b1 / 86400 ) WHERE ROWID = :b2
END OF STMT

PARSING IN CURSOR #271 len=39 dep=1 uid=0 oct=3 lid=0 tim=119196910 hv=2555653940 ad='119381e8'
SELECT DBMS_LOCK_ID.NEXTVAL FROM DUAL
END OF STMT
PARSING IN CURSOR #272 len=77 dep=1 uid=0 oct=2 lid=0 tim=119196910 hv=3299803993 ad='11937c80'
INSERT INTO DBMS_LOCK_ALLOCATED VALUES ( :b1,:b2,SYSDATE + (:b3 / 86400 ) )
END OF STMT


The biggest time-consumer is:

CURSOR_ID:23 LENGTH:69 ADDRESS:11938320 HASH_VALUE:2283848797 OPTIMIZER_GOAL:CHOOSE USER_ID:SYS
SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME = :b1 FOR UPDATE
call count cpu elapsed disk query current rows misses
------- --------- --------- --------- ------------ ------------ ------------ ------------ ---------
Parse 1 0.00 0.00 0 0 0 0 0
Execute 243 0.10 36.07 48 1231 650 0 0
Fetch 243 0.00 0.00 0 0 0 216 0
------- --------- --------- --------- ------------ ------------ ------------ ------------ ---------
total 487 0.10 36.07 48 1231 650 216 0


All the calls are at dep=1 and userid=SYS, I couldn't find any calls at dep=0 that used dbms_lock

Tom Kyte
May 19, 2005 - 7:48 am UTC

you were using dbms_lock and you wanted to get a lock and you blocked on it.

that is all -- looks perfectly normal to me for an application that is using dbms_lock.

have you looked at the code itself.

I suspected it was materialized view refresh that triggered DBMS_LOCK_ALLOCATED

A reader, May 19, 2005 - 1:36 pm UTC

I just turned on the trace for materialized view and I got:

localhost:/u01/app/oracle/admin/stra/udump >grep -in DBMS_LOCK stra_ora_14192.trc
3135:SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME = :b1 FOR UPDATE
3146:SELECT DBMS_LOCK_ID.NEXTVAL FROM DUAL
3205:INSERT INTO DBMS_LOCK_ALLOCATED VALUES ( :b1,:b2,SYSDATE + (:b3 / 86400 ) )
8559:STAT #32 id=2 cnt=0 pid=1 pos=1 obj=3434 op='TABLE ACCESS BY INDEX ROWID DBMS_LOCK_ALLOCATED (cr=1 r=0 w=0 time=45 us)'


Do you happen to know how we can determine if it's really caused by MV refresh, for example will the bind value for name in line 3135 (e.g. value="ORA$DEF$EXE$PurgeCommonLock", value="BE$INDV$000001451846000001451846" or BE$CREL$000004964074) give us any hint?

Thanks!

P.S. As you suggested, I've asked someone to look into all source code (four years old) for dbms_lock.

Tom Kyte
May 19, 2005 - 2:00 pm UTC

turn on trace
run mv refresh
exit

if tracefile has it, it was part of the mv refresh.

Range Scan on Composite Index

Brandon Allen, June 13, 2005 - 5:24 pm UTC

Please confirm if my understanding below is correct regarding a range scan on a composite index:

Given the following index:

create index testidx on testtab (a,b,c,d,e);

And the following SELECT statement:

select * from testtab
where a = :v
and b = :w
and c between :x1 and :x2
and d = :y
and e = :z;

Does the "start key" consist of only the first 3 indexed columns (a=:v, b=:w & c >= :x1), ignoring d & e?

In other words, does it just navigate through the branches to the leaf with the smallest value >= (:v || :w || :x1) as the starting point and then scan subsequent leaves from there, until c is > :x2?

If the above is true, would it follow that it might (not knowing anything about selectivity, distribution, other query predicates, etc.) be better to recreate this index as (a,b,d,e,c), thus moving the range scanned column to the end and possibly reducing the number of rows to be scanned before it finds c > :x2?

If you can point to any good documentation that covers this (is it in your book?) I'd appreciate it. I found a good explanation of how an index range scan works in the standard 9i documentation (</code> http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96524/c11schem.htm#26777 <code>, but it doesn't get into specifics of composite indexes.

Thanks!


Tom Kyte
June 13, 2005 - 9:04 pm UTC

the start key is a,b,c -- and it'll range ALL of them checking out d and e (not ignoring). when d and e match, it'll table access by index rowid and output the row (assuming there is a column f)

If this were the only query to tune, it would work best with C *last* yes.

Did you read the chapter on indexes in Expert one on one?

(this is something that wouldn't be documented I don't think, I mean -- it is the only thing it could possibly do. In Expert one on one Oracle I talk about how the order of columns in the index is 100% based on the questions you ask)

Range Scan on Composite Index

Brandon Allen, June 14, 2005 - 12:05 pm UTC

Okay, Thanks. I understand that it won't completely ignore D & E (poor wording on my part) - it must check them to see if the row should be returned or not; but, for navigating through the tree, it can only use A, B & C to find the start key and then for checking when to stop the range scan, it only has to check C, so as far as navigation goes, it doesn't (can't) use D or E, right?

I haven't read your book yet, but definitely will soon. On Amazon.com they show a new 9i/10g version to be released so I was going to wait for that one to come out - any idea on an ETA?

Tom Kyte
June 14, 2005 - 4:05 pm UTC

correct, it cannot use D or E to abort the range scan -- since the D and E's are "sprinkled" throughout lots of the C's in the range scan.


The 2nd edition will be out at the end of the year.

ETA for the new book

Brandon Allen, June 14, 2005 - 3:52 pm UTC

Nevermind my last question - I found the answer at:

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:23060676216774 <code>

Where you say Q3 2005.






unique key column order

daxu, August 24, 2005 - 11:11 am UTC

Tom,
Does the column order in the unique key matter? How does the index range scan work if my select statement where clause skips a field (not the first one in the key) that is in the unqiue key? For example, if I have a table like this,
create table tt
(col_a number not null,
col_b number not null,
col_c number not null,
col_d number not null);

I have an unique key tt_uk (col_a, col_b, col_c) on this table, it is order by col_a, col_b and col_c.

If I have seletct statment like this,
select * from tt
where col_a = :v_a
and col_b = :v_a;

The index range scan will look for the col_a value, and col_b value in the index table, return rowid and fetch the data from tt table, as what I understand.

How about I do,
select * from tt
where col_a = :v_a
and col_c = :v_c

Will the index range scan only look for col_a value in the index only and will not look for col_c value in the index because here we skipped col_b which is the second field in the unique key? Or the index range scan will still look both col_a and col_c value in the index? Which one is correct?

Assuming we use RBO, and please forgive me for not using CBO because we still have applications using Oralce 7 version.

Thanks,














Tom Kyte
August 24, 2005 - 6:06 pm UTC

<i>Will the index range scan only look for col_a value in the index only and will</i>


ops$tkyte@ORA10GR1> create table t
  2  ( a int, b int, c int,
  3    d varchar2(4000) default rpad( 'x', 4000, 'x' ),
  4    e varchar2(4000) default rpad( 'x', 4000, 'x' ) ,
  5    constraint t_pk primary key(a,b,c)
  6  );

Table created.

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> insert into t(a,b,c)
  2  select 1, rownum, rownum
  3    from all_users;

67 rows created.

ops$tkyte@ORA10GR1> commit;

Commit complete.

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> @trace
ops$tkyte@ORA10GR1> alter session set events '10046 trace name context forever, level 12';

Session altered.

ops$tkyte@ORA10GR1> select /*+ rule */ a, b, c, substr(d,1,1), substr(e,1,1)
  2    from t
  3   where a = 1
  4     and c = 5;

         A          B          C S S
---------- ---------- ---------- - -
         1          5          5 x x



that table has 1 row per block (8k block size).  All 67 rows (i have 67 rows in all users) have a=1.  tkprof shows:

select /*+ rule */ a, b, c, substr(d,1,1), substr(e,1,1)
  from t
 where a = 1
   and c = 5

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.03          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.03          0          4          0           1

Misses in library cache during parse: 1
Optimizer mode: RULE
Parsing user id: 59

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID OBJ#(52386) (cr=4 pr=0 pw=0 time=107 us)
      1   INDEX RANGE SCAN OBJ#(52387) (cr=2 pr=0 pw=0 time=80 us)(object id 52387)


that is could not have gone to the table each time to check C, it must have checked C in the index *and then* went to the table.

Not sure about 7.x, probably the same, but you know how to "see" for yourself now. 

Columns For Select

Bhushan, March 03, 2006 - 6:48 am UTC

Hi Tom,
I have a serious doubt.I have a query which uses the Index i want to only if i select the columns on which i have the index.The moment i do a SELECT * or select some other column other than the column on whch the index is built it goes on FULL TABLE SCAN.Any hints on why this may be happening.
Waiting for your reply.

Thanks

Tom Kyte
March 03, 2006 - 8:19 am UTC

any hints on what is happening?

The optimizer is doing it's job. It has determined that based on the amount of data you are retrieving from the table makes the use of the index a bad idea.

I might even presume to assume (which I sort of have to since we really don't have any other information to go on) that the plan is switching from an index fast full scan (using the index as if it were a table) to a table full scan. Meaning it was simply using the index as a skinnier version of the table - and when you retrieve columns not in the index - it cannot do that anymore.

Columns For Select

Bhushan, March 04, 2006 - 12:39 am UTC

Hi Tom,
You really surprised me by that quick response.I did read your book after logging the query and got to know the reason....however can you suggest how could be improve the performance other than having a B*Tree index on all the columns that we select(Dont Know it seems like a Crude way of tuning to me May be i am wrong...).

Tom Kyte
March 04, 2006 - 7:09 am UTC

Not sure what you mean - the optimizer will use any available index when it makes sense to do so. It will avoid using any available index when it makes sense to avoid it.

No, I don't know how to get blood from a stone in this case. Using the existing index you have doesn't make sense. Adding more columns to the index so that the full scan of the index (which I presume is taking place, you never really "said") could make sense again may or may not be the correct approach. Insufficient data.

Index Concatenated question

Martín Ortíz, March 17, 2006 - 7:17 pm UTC

Hi Tom,

If we have a query on table "a". The WHERE clause is like:

"where a.x1||a.x2||a.x3 = y"

...how should I define the index for table a?

I ask because we defined an index with this cols:
x1,
x2,
x3

and also:
x1||x2||x3

but didn't work.

so, what else should we do?

Thanks for your help


Tom Kyte
March 18, 2006 - 11:57 am UTC

that seems like a "bad" search criteria, doesn't it.

You don't say where Y comes from. If Y is yet another column in that table - no index will help that predicate (we'd have to rewrite the predicate).

If Y is a bind variable - then x1||x2||x3 would in fact work, if you met the necessary setup for using a function based index (eg: definitely using the CBO for example)

But, no example - no further comment.

Advantage of Index Skip Scan

Arindam Mukherjee, July 30, 2006 - 1:51 am UTC

Respected Mr. Tom,

Right now I am aware of “Index Skip Scans “ on reading your book “Effective Oracle by Design” (Chapter-8 : Effective SQL) and Oracle 9i Database Performance Tuning Guide and Reference (Page: 1-37).

Now I doubt on the veracity of the following example taken from Oracle 9i SQL Reference (Page: 7-35). Please see the name of Index (promo_ix1) remains same for two unique constraints. Do we really need this permutation of columns in Unique Constraints and in addition to reuse same Index for both when we can make efficient use of Index Skip Scan in Oracle 9i?

Oh!! May be I am wrong in getting the essence of the following example. You please shed light on my misgivings. At the same time I can assure you, mathematically I can’t prove this permutation of columns is legitimate.

******************************************
Taken from Oracle 9i SQL Reference (7-35).
******************************************
-------------------------------------

CREATE TABLE promotions_var3
( promo_id NUMBER(6)
, promo_name VARCHAR2(20)
, promo_category VARCHAR2(15)
, promo_cost NUMBER(10,2)
, promo_begin_date DATE
, promo_end_date DATE
, CONSTRAINT promo_id_u UNIQUE (promo_id, promo_cost)
USING INDEX (CREATE UNIQUE INDEX promo_ix1
ON promotions_var3 (promo_id, promo_cost))
, CONSTRAINT promo_id_u2 UNIQUE (promo_cost, promo_id)
USING INDEX promo_ix1);
-------------------------------------


Regards
Arindam Mukherjee


Tom Kyte
July 30, 2006 - 8:39 am UTC

you would need to understand the number of distinct values for both promo_id and promo_cost first.

Given that promo_id is a six digit number and promo_cost is a 10 digit number one must expect their distinct cardinalities to be potentially high.

An index skip scan only makes sense when the leading edge of the index is of low distinct cardinality. For example:

create index t_idx on t( GENDER, DATE_OF_BIRTH );

here GENDER could be assumed to have two (or few) distinct values. It would be OK as a skip scanned index. However, the other way around (DATE_OF_BIRTH,GENDER) would not be because date_of_birth must be assumed to have lots of distinct values and thus not really a candidate to be skip scanned.

Thanks but triggers another doubt

Arindam Mukherjee, July 30, 2006 - 9:32 am UTC

Respected Mr. Tom,

Well, I concur with you. So we do need two Indexes for the combination of (GENDER, DATE_OF_BIRTH) and (DATE_OF_BIRTH, GENDER) as we can’t take the advantage of Index skip scan. But in my previous create table example, they create two different unique keys just based on permutation of two columns but use the same Index.

What benefit we would derive from that example using same Index on different unique key based on different combination of two columns.

In a nutshell, I could not understand that create table example using same index. Hope you are able to get my difficulty.

Regards,
Arindam Mukherjee



Tom Kyte
July 30, 2006 - 1:29 pm UTC

No, I was saying "if you have gender with low distinct cardinality and date of birth with high distinct cardinality, we may be able to do a skip scan solution"

I also said "with the example in the book, given the datatypes, it is reasonable to expect that BOTH columns are of high distinct cardinality and hence a skip scan on the example would not likely be reasonable"

It is common to create an index "going both ways" in a table commonly known as "an association object". In that case, promo_id would be a foreign key to some table and promo_cost a foreign key to some other table.

better example perhaps:

student( student_id primary key, name );
professor( prof_id primary key, name );

student_to_professor( student_id, prof_id );

We might want to go:

from student, student_to_professor, professor
where student.name = 'Kyte'
and student.student_id = student_to_professor.student_id
and student_to_professor.prof_id = professor.prof_id;

(that is traverse FROM student TO professor)

and other times we'd want to go:


from student, student_to_professor, professor
where professor.name = 'Smith'
and student.student_id = student_to_professor.student_id
and student_to_professor.prof_id = professor.prof_id;

that is traverse FROM professor TO student


In the former case, we'd like to use an index on student(name) to find the student_id and go to student_to_professor by student_id to find the prof_id's and then join to professor (so the index should be on student_id,prof_id in that case)

In the latter, we'd start at professor, find the prof_id and look up the student_id's - we'd want the index on student_to_professor to be on prof_id,student_id




Quite Taken aback

Arindam Mukherjee, July 31, 2006 - 12:35 am UTC

Respected Mr. Tom,

Commendable explanation and I have nothing to say on it, just bowled over. Sir, I am truly obliged to your patience (typing long), sterling efforts (thinking of different topics) with the set of excellent examples in your own inimitable way explaining us the matter.

Regards,
Arindam Mukherjee


A reader, August 13, 2007 - 1:12 pm UTC

Tom,

I have a query like follows:

select * from t where
(a >= X1 and a <= X2)
or (b >= X1 and b <= X2)
or (a < X1 and b> X2);

The table itself is huge - it consist of about 60 columns and 7 Mio rows - but it has a compact index on columns (a,b).
Why Oracle refuse to use this index for the query above?


Tom Kyte
August 15, 2007 - 11:09 am UTC

I see how the first and third could possibly use that index - but not the second

why do you think this index would be useful - we'd have to look at EVERY ENTRY in the index - no range scanning - in order to process it.

A reader, August 17, 2007 - 5:57 am UTC

Tom,

I just wanted to know if I have query like:

select ... where startnum > X1 and endnum < Y1 where startnum and endnum are the columns of the same table what index I can create on this table to support this query (so that Oracle can do only one index scan)
Tom Kyte
August 20, 2007 - 10:30 pm UTC

only answer is

"it depends"

oracle will do only one index scan no matter what...


let's say that X1 = 0 and all startnums are greater than zero. Further, Y1 is zero. and all endnums are greater than zero.

So, an index on (startnum,endnum) would be very bad.
An index on (endnum,startnum) would be very good.

let's say that X1=0 and all startnums are LESS than zero. Further, Y1 is zero. and all endnums are greater than zero.

So an index on (startnum,endnum) would be very good.
An index on (endnum,startnum) would be very good.

let's say that X1=0 and all startnums are less than zero. Further Y1 is zero. and all endnums are LESS THAN zero.

So an index on (startnum,endnum) would be very good.
An index on (endnum,startnum) would be VERY BAD.




think about what an index is (sorted keys) and how many keys would be inspected...

CONCATENATION and VIEW operations

A reader, August 23, 2007 - 11:39 am UTC

Tom,

For the query above (with two ORs):

select ... from t where
c=C1
and d=D1
and e >= E1
and e <=E2
and (
(a >= X1 and a <= X2)
or (b >= X1 and b <= X2)
or (a < X1 and b> X2)
);

I have the following entries in the execution plan:

VIEW
SORT GROUP BY
CONCATENATION
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2

Could you please explain what CONCATENATION and VIEW operations means. I could not find any desciption for this operation.
Does Oracle indeed does the same thing three times (three nested loops in the plan).

I1 here is index on c,d,a and I2 is index on c,d,e.

The performance of this query is terrible - it tooks 6 Mio LIOs to return 1000 rows
Tom Kyte
August 24, 2007 - 1:49 pm UTC

you have a group by, it could be that you grouped 3,000,000 rows into 1,000 and 6,000,000 LIOs was great.

tkprof it, see how many rows flow out of each step - you might have to readjust your concept of "horrible", it might be GREAT (it might not be, but it certainly COULD be)

but - concatenate, just what it sounds like "take this result and concatenate the next result and concatenate the next (eg: run the three equivalent queries in serial and just add their results together)

the view is just a "placeholder" sort of thing, your query was against a view (apparently, you query t, you access t1).



CONCATENATION and VIEW operations

A reader, August 23, 2007 - 1:03 pm UTC

Tom,

I corrected the query above:

For the query above (with two ORs):

select ... from t , t1 where
t1.c = t.c
t1.d = t.d
and t.e = E3
and t1.c=C1
and t1.d=D1
and t1.e >= E1
and t1.e <=E2
and (
(t1.a >= X1 and t1.a <= X2)
or (t1.b >= X1 and t1.b <= X2)
or (t1.a < X1 and t1.b> X2)
);

I have the following entries in the execution plan:

VIEW
SORT GROUP BY
CONCATENATION
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID T1
INDEX RANGE SCAN I1
INDEX FULL SCAN I2

Could you please explain what CONCATENATION and VIEW operations means. I could not find any
desciption for this operation.
Does Oracle indeed does the same thing three times (three nested loops in the plan).

I1 here is index on T1(c,d,a) and I2 is index on T(c,d,e)

The performance of this query is terrible - it tooks 6 Mio LIOs to return 1000 rows


Tom Kyte
August 24, 2007 - 1:53 pm UTC

tkprof it, you might have to readjust your thoughts of what "terrible are", sort group by - you have an aggregate there.

the view is your rewritten query against t1 into three queries to do the OR.

funny to not see T referenced in the plan.

A reader, August 24, 2007 - 2:28 pm UTC

Tom,

This was a part of actual query which is about 600 lines (and about 450 lines of execution plan) with a lot of private information in it.
I thought I am not supposed to place so bit query here (but it is the second week I try to tune the query...)

A reader, August 30, 2007 - 11:00 am UTC

Tom

I found that if my query above use the indexes I12 and I22 the performance is very good, but if it use index I4 the performance is very bad.
Is it time to rebuild indexes I12 and I22 because of very bad clustering factor (near num of rows) or it is okay to set statistics to cause optimizer to use this index.

SUBSTR(INDEX_NAME,1,3) BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS
------------------------------ ---------- ----------- ------------- ----------------- ----------
I1_ 1 317 42472 91057 117510
I12 1 403 42176 90070 117510
I2_ 2 1109 67995 91791 117510
I22 1 318 42553 91129 117510
I3_ 2 598 102 35095 117510
I4_ 2 529 19 21127 117510
I5_ 1 251 4 6956 117510
I6_ 2 534 42980 90151 117510
I7_ 1 402 42185 89930 117510
I8_ 1 247 3 4029 117510
I9_ 1 285 701 9543 117510
Tom Kyte
September 04, 2007 - 5:31 pm UTC

you can rebuild that index all you want - the clustering fact (a measurement of how sorted the table is with respect to the indexed columns) will NEVER CHANGE.


clustering factor is based on the TABLE and how the rows are physically stored in it.

doing something to the index will NOT affect the clustering factor, it cannot - you need to move rows in the TABLE to affect the clustering factor.

and ask yourself - how many ways can a table be stored sorted? typically one :)

A reader, August 30, 2007 - 11:23 am UTC

Tom,

I want to add some info to my question above.

What could be a reason that ALL SELECTIVE indexes above have so bad clustering factor? The whole database was imported (with exp compress=N) about a month ago. The block number is:

BLOCKS NUM_ROWS
---------- ----------
3898 117096


Tom Kyte
September 04, 2007 - 5:35 pm UTC

see above, or

The Oracle Reference Manual tells us this column has the following meaning:

Begin Quote
Indicates the amount of order of the rows in the table based on the values of the index:
* If the value is near the number of blocks, then the table is very well ordered. In this case, the index entries in a single leaf block tend to point to rows in the same data blocks.
* If the value is near the number of rows, then the table is very randomly ordered. In this case, it is unlikely that index entries in the same leaf block point to rows in the same data blocks.
End Quote


it is unusual to have more than one index with a "good" clustering factor (and that index is generally the index on a monotonically increasing value like a date, timestamp or sequence populated field)

Korall, September 25, 2007 - 10:36 am UTC

Hi Tom

Up to now I was relying on your answers regarding "ORDER OF COLUMNS in INDEX":

>> Nothing else (selectivity of a or b does not count at all)

or

>> .. WHERE c=:x and b=:y and a=:z
index (c,b,a) = (a,b,c) = (c,a,b) - whatever - they would do the same amount of work (think about this... test it out...)

I'm at the moment at "Oracle Database 10g: SQL Tuning 3 - 11" which says the following confusing me a lot:

-----------------------
"
* Ordering Columns in an Index

Order columns that have the most selectivity first. This method is the most commonly used because it provides the fastest access with minimal I/O to the actual row IDs required. This technique is used mainly for primary keys and for very selective range scans
"
----------------------------------

That statement in my eyes is the opposite what you say.
please correct me or the documentation..

Thanks a lot for your valuable forum!
Tom Kyte
September 26, 2007 - 9:23 pm UTC

what is:

Oracle Database 10g: SQL Tuning 3 - 11

is that oracle doc? (i'll have that fixed)
or is that book - if so, please point to book

I asked a similar question

AMIR RIAZ, September 27, 2007 - 2:46 am UTC

hi tom

i have a similar problem see this thread where i tried to prove that column order does matter. but failed to get any reply. also i have further questions

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1032431852141#457022500346979566

regards
Amir Riaz
Tom Kyte
September 27, 2007 - 7:02 am UTC

and that is way too big for me to look at online, quickly and concisely.

He gave some reasoning (jonathan)
And I keep saying "your example is way way way too small.

And in fact the example is not similar - we are talking about clustering factor over there - NOT on what is most selective.

Here the question is on "should I order columns by what is most selective" - answer is NO

Your question is about clustering factor - but given the tiny number of rows you are selecting in your example - it really didn't matter which was used, it was a tie.

How to index ?

chaman, October 01, 2007 - 11:06 am UTC

Hello Tom,

SELECT COL1, COL2
FROM TABLE1
WHERE COL10 = var1
AND COL11 != var2
AND COL11 != var3
AND COL11 != var4

Here though COL11 is indexed it goes for a full table scan. How to index this column ?

Thanks


Tom Kyte
October 03, 2007 - 3:17 pm UTC

create index on table1(col10,col11)

would likely be the one that might work.

it would be able to process the "col10 = var1" bit, and then filter out on col11 != var2, var3, var4

Column Ordering and CPU Usage

VLS, November 19, 2007 - 2:32 pm UTC

Hi Tom,

Does column ordering matter in an index matter in the way the index entries are scanned and thus cpu utilization for these scans ?

For example, if I have a Unique Key index on
(T_date, T_ID, T_SL) where the Number of Distinct values (NDV) of these are as under :

T_Date stores truncated date and thus NDV is 1
T_ID NDV is 41443 and
T_SL NDV is 1426

The total rows are 463220.

A simple query, like

select * from this_table
where t_date=:b1
and t_id=:b2
and t_sl=:b3

has to scan all the 463220 rows from an index to get the matching t_id and t_sl. This will be CPU intensive and can also contribute to Buffer Busy Waits.

Whereas, if I create the same index with t_id as first, it has to scan only 11 rows (463220/41443) to get the matching t_sl and t_date. This scan will be faster and less resource intensive.

Am I correct ?

Regards
VLS
Tom Kyte
November 21, 2007 - 1:12 pm UTC

...
has to scan all the 463220 rows from an index to get the matching t_id and
t_sl. This will be CPU intensive and can also contribute to Buffer Busy Waits. ....

hmm, did you even consider, remotely, that you could test this.


where c1 = ? and c2 = ? and c3 = ?

is sort of like

where c1||'/'||c2||'/'||c3 = ?

they are all sort of compared as a whole - NOT as separate bits.

It would NOT scan 463,220 rows if you use

select * from this_table
where t_date=:b1
and t_id=:b2
and t_sl=:b3

and t_date,t_id,t_sl were unique, it would hit ONE.


Suggestion: just test things out - we give you all of the tools, I demonstrate them here all of the time....


http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1409668892797#3991545699434

How to see within indexed table

Dulal, December 27, 2007 - 3:59 am UTC

Hi Tom,
Thanks in advance.
I want to see/query records from indexed table
how records are organized/ordered within a indexed table.
Because after creating index on a table there created a indexed table and take storage in tablespace with a name like *.idx. I want to query/see that indexed table.
Is it possible? If so, how?
Tom Kyte
December 27, 2007 - 9:43 am UTC

not sure what you mean.

the records in a table that has indexes defined on it are organized "no differently" than a table without indexes.

Unless you mean "an index organized table" in which case, the records are stored in the index itself, there is no table really, just the index - and they are stored in there in the same fashion as "index data" is stored.

and to query an index organized table, you just use select like any other table.

Michal Mor, September 24, 2008 - 6:32 am UTC

Hi Tom,

I found senario where the selectivity of leading columns within index is important. I can not understand why.

Table is partitioned by range of column SOURCE_COMP_ID.Each partition stores single value of SOURCE_COMP_ID.

Local index defined on table is TRB_PUB_LOG_PK (SOURCE_COMP_ID, BUFFER_ID, PUB_TRX_ID). Index leading column is partition key 

SQL below search in single partition for first 100 records using index TRB_PUB_LOG_PK (purpose is to get first 100 records with lower value for PUB_TRX_ID). Performance are very poor.

New index was defined: TRB_PUB_LOG_1IX (BUFFER_ID, PUB_TRX_ID, SOURCE_COMP_ID). In new index the partition key is moved to be last column in index.

Using index with leading partition key

SQL> UPDATE (SELECT /*+ index ( TRB1_PUB_LOG TRB1_PUB_LOG_PK) +*/
               *
          FROM trb1_pub_log
         WHERE (buffer_id = -1 AND source_comp_id = 2003))
   SET buffer_id = 99
 WHERE ROWNUM <= 100  

Elapsed: 00:07:19.26


 UPDATE STATEMENT        
  UPDATE                  TRB1_PUB_LOG    
   COUNT STOPKEY          
    PARTITION RANGE SINGLE
     INDEX RANGE SCAN     TRB1_PUB_LOG_PK
 

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(ROWNUM<=100)
   4 - access("SOURCE_COMP_ID"=2003 AND "BUFFER_ID"=(-1))


 
Statistics
----------------------------------------------------------
          1  recursive calls
        939  db block gets
      60697  consistent gets
      23674  physical reads
      67816  redo size
        829  bytes sent via SQL*Net to client
        896  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         48  rows processed
 
Using index with partition key last 

SQL> UPDATE (SELECT /*+ index (TRB1_PUB_LOG TRB1_PUB_LOG_1IX) +*/
               *
          FROM trb1_pub_log
         WHERE (buffer_id = -1 AND source_comp_id = 2003))
SET buffer_id = 99
WHERE ROWNUM <= 100
/

Elapsed: 00:00:00.20

UPDATE STATEMENT        
  UPDATE                 TRB1_PUB_LOG     
   COUNT STOPKEY          
    PARTITION RANGE SINGLE
         INDEX RANGE SCAN    TRB1_PUB_LOG_1IX 
 

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(ROWNUM<=100)
   4 - access("BUFFER_ID"=(-1) AND "SOURCE_COMP_ID"=2003)
       filter("SOURCE_COMP_ID"=2003)

Statistics
----------------------------------------------------------
        106  recursive calls
        214  db block gets
       4387  consistent gets
          1  physical reads
      19960  redo size
        822  bytes sent via SQL*Net to client
        897  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
         15  rows processed


Tom Kyte
September 24, 2008 - 7:59 pm UTC

umm, did you notice they updated different sets of rows entirely? what is up with that - you are not using the same data????? why are you comparing these - they are different data sets...

Index Skip scan is still confusing!!

Jay, December 03, 2008 - 12:25 pm UTC

Hi,
I have not seen any website as useful as this one. While going through this post, I could recollect one of the issue I encountered with Index Skip Scan. One of my table has 99M records and we have a global index on Settle_Date and Account_Id. I use both these columns in the where clause, it goes for a Index Skip Scan. See the details below
---------------------------------------------------
SQL> edit
Wrote file afiedt.buf

  1  explain plan for select * from f_transaction where settle_date between
  2  '28 Nov 2008' and '01 Dec 2008'
  3* and account_id = :Account_Id
SQL> /

Explained.

SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------
| Id  | Operation                          |  Name             | Rows  | Bytes | Cost  | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                   |    27 |  5670 |    27 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| F_TRANSACTION     |    27 |  5670 |    27 | ROWID | ROW
|   2 |   INDEX SKIP SCAN                  | IX7F_TRANSACTION  |    27 |       |    21 |       |       |
----------------------------------------------------------------------------------------------------

Note: cpu costing is off, PLAN_TABLE' is old version

10 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   COLLECTION ITERATOR (PICKLER FETCH) OF 'DISPLAY'




Statistics
----------------------------------------------------------
         12  recursive calls
          0  db block gets
        183  consistent gets
          0  physical reads
          0  redo size
       1081  bytes sent via SQL*Net to client
        276  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed
-----------------------------------------------------------
You can see it goes for Index Skip Scan.
Now the interesting part, I change the hardcoded dates to bind variables and check the plan, you can see it goes for Index Range Scan
-----------------------------------------------------------
SQL> edit
Wrote file afiedt.buf

  1   explain plan for select * from f_transaction where settle_date between
  2   :Date1 and :Date2
  3*  and account_id = :Account_Id
SQL> /

Explained.

SQL>  SELECT * FROM table(DBMS_XPLAN.DISPLAY);

----------------------------------------------------------------------------------------------------
| Id  | Operation                           |  Name            | Rows  | Bytes | Cost  | Pstart| Pstop |  TQ    |IN-OUT| PQ Dis
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                  |    35 |  7350 |  3375 |       |       |        |      |            |
|   1 |  FILTER                             |                  |       |       |       |       |       | 37,00  | PCWC |            |
|   2 |   PARTITION RANGE ALL               |                  |       |       |       |     1 |    22 | 37,00  | PCWP 
|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID| F_TRANSACTION    |    35 |  7350 |  3375 |     1 |    
|   4 |     INDEX RANGE SCAN                | IX4_TRANSACTION  | 13850 |       |    88 |     1 |    22 | 37,00
----------------------------------------------------------------------------------------------------

Note: cpu costing is off, PLAN_TABLE' is old version

12 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   COLLECTION ITERATOR (PICKLER FETCH) OF 'DISPLAY'




Statistics
----------------------------------------------------------
         14  recursive calls
          0  db block gets
        183  consistent gets
         47  physical reads
          0  redo size
       1562  bytes sent via SQL*Net to client
        276  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         12  rows processed
-----------------------------------------------------------
Also note in the first query (where I hardcoded the dates), if I run for single dates it uses the Index Range Scan.
Can you advise when it would be Index Range Scan??? Please let me know if you need anymore details on the Table/Index
Thanks
Jay

Tom Kyte
December 09, 2008 - 10:52 am UTC

... where settle_date between
2 '28 Nov 2008' and '01 Dec 2008'
....

ouch, that hurts so very very much. Ok, I'll have to assume, presume, guess that settle_date is really a date and you are *relying* on implicit conversions.

One statement for you:

STOP DOING THAT. Use to_date and always use a format with it.


Ok, when you have an index on two or more columns - you could imagine (close eyes and visualize) that you really have N indexes - where N = number of distinct values of the first column in the index.

For example, you have an index on (settle_date, account_id )

if select count(distinct settle_date) was say 1,000 - you could pretend you had 1,000 indexes - each one on a different date, for the account_ids in that date.


So, when you do a query like:

select ...
from table
where settle_date
between TO_DATE( literal, 'fmt' ) and TO_DATE( literal, 'fmt')
and account_id = ?


the optimizer (which has access to the actual literals in the first case) can ascertain "ok, there are 4 days between 28-nov and 01-dec, we'll use the index on settle_date, account_id as if it were a bunch of little indexes and do 4 index range scans on it - in short, the query will be as if it were:


select from table
where settle_date = to_date('28-nov-2008','dd-mon-yyyy') and account_id = ?
UNION ALL
select from table
where settle_date = to_date('29-nov-2008','dd-mon-yyyy') and account_id = ?
UNION ALL
select from table
where settle_date = to_date('30-nov-2008','dd-mon-yyyy') and account_id = ?
UNION ALL
select from table
where settle_date = to_date('01-dec-2008','dd-mon-yyyy') and account_id = ?


we will "skip around" in the index and do 4 mini range scans by the dates.


Now when you do not use binds, explain plan doesn't have any clue as to the actual values that will be used and so it "guesses" and in this case, it guesses "too many dates to do a skip scan - so we'll just index range scan instead"


bear in mind the plan you see with explain plan is generated WITHOUT having access to the bind values, therefore, it is likely NOT THE PLAN that will be used in your case.


consider:

ops$tkyte%ORA10GR2> create table t
  2  as
  3  select to_date( '01-nov-2008', 'dd-mon-yyyy' )+mod(rownum,40) settle_date,
  4         object_id account_id,
  5             all_objects.*
  6    from all_objects
  7  /

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create index t_idx on t(settle_date,account_id);

Index created.

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> REM , method_opt => 'for columns settle_date, account_id size 254' );
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> variable x varchar2(30)
ops$tkyte%ORA10GR2> variable sdate varchar2(30)
ops$tkyte%ORA10GR2> variable edate varchar2(30)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :x := 42; :sdate := '28-nov-2008'; :edate := '01-dec-2008'

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select *
  2    from t
  3   where settle_date between to_date('28-nov-2008','dd-mon-yyyy') and to_date( '01-dec-2008','dd-mon-yyyy')
  4     and account_id = to_number(:x);

Execution Plan
----------------------------------------------------------
Plan hash value: 2053318169

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   106 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |   106 |     4   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN           | T_IDX |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SETTLE_DATE">=TO_DATE(' 2008-11-28 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "ACCOUNT_ID"=TO_NUMBER(:X) AND "SETTLE_DATE"<=TO_DATE('
              2008-12-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("ACCOUNT_ID"=TO_NUMBER(:X))

ops$tkyte%ORA10GR2> select *
  2    from t
  3   where settle_date between to_date(:sdate,'dd-mon-yyyy') and to_date(:edate,'dd-mon-yyyy')
  4     And account_id = to_number(:x);

Execution Plan
----------------------------------------------------------
Plan hash value: 1056976431

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |     1 |   106 |     3   (0)| 00:00:01 |
|*  1 |  FILTER                      |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T     |     1 |   106 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | T_IDX |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE(:SDATE,'dd-mon-yyyy')<=TO_DATE(:EDATE,'dd-mon-yyyy'))
   3 - access("SETTLE_DATE">=TO_DATE(:SDATE,'dd-mon-yyyy') AND
              "ACCOUNT_ID"=TO_NUMBER(:X) AND "SETTLE_DATE"<=TO_DATE(:EDATE,'dd-mon-yyyy'))
       filter("ACCOUNT_ID"=TO_NUMBER(:X))

<b>sort of looks like what you see - but, explain plan is *wrong*.  It does not bind peek, it doesn't know what sdate and edate probably are.  So - using display_cursor from dbms_xplan to show what *really* happened (new in 10g, won't work in 9i)</b>


ops$tkyte%ORA10GR2> set autotrace off
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set serveroutput off
ops$tkyte%ORA10GR2> select *
  2    from t
  3   where settle_date between to_date(:sdate,'dd-mon-yyyy') and to_date(:edate,'dd-mon-yyyy')
  4     and account_id = to_number(:x);

no rows selected

ops$tkyte%ORA10GR2> select * from table( dbms_xplan.display_cursor(null,null,'typical +peeked_binds') );

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  51dggrhnbxntr, child number 0
-------------------------------------
select *   from t  where settle_date between to_date(:sdate,'dd-mon-yyyy')
and to_date(:edate,'dd-mon-yyyy')    and account_id = to_number(:x)

Plan hash value: 2764490456

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |       |       |     4 (100)|          |
|*  1 |  FILTER                      |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T     |     1 |   106 |     4   (0)| 00:00:01 |
|*  3 |    INDEX SKIP SCAN           | T_IDX |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :SDATE (VARCHAR2(30), CSID=31): '28-nov-2008'
   2 - :EDATE (VARCHAR2(30), CSID=31): '01-dec-2008'
   3 - :X (VARCHAR2(30), CSID=31): '42'

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE(:SDATE,'dd-mon-yyyy')<=TO_DATE(:EDATE,'dd-mon-yyyy'))
   3 - access("SETTLE_DATE">=TO_DATE(:SDATE,'dd-mon-yyyy') AND
              "ACCOUNT_ID"=TO_NUMBER(:X) AND "SETTLE_DATE"<=TO_DATE(:EDATE,'dd-mon-yyyy'))
       filter("ACCOUNT_ID"=TO_NUMBER(:X))


31 rows selected.



We can see the binds were peeked at, optimized with and the skip scan was really actually used

Index Skip scan is still confusing!!

Jay, December 03, 2008 - 12:39 pm UTC

Small correction in the above statement. In the first query it does a Index Skip Scan on IX7F_Transaction index which has Settle_date and Account_id. In the second query where I use the bind variables, it uses Index Range Scan on IX4_Transaction Index(local index to a partition) which has Account_Id and DB_Ref_Id columns. My question is, why it is not using IX7F_Transaction Index though we are using both the columns in the index as part of the predicate.
Thanks
Jay

Thanks

Jay, February 09, 2009 - 9:35 am UTC

Excellent. thanks for the information.
Thanks
Jay

Why index not used

Narendra, March 11, 2009 - 12:14 pm UTC

Tom,

I came across following entry
http://dioncho.wordpress.com/2009/02/11/why-full-table-scan-even-with-lower-index-scan-cost/#comments

As stated, why does CBO decide to use TABLE FULL SCAN even when INDEX FULL SCAN costs less ?
Can you please throw some light on this behaviour?

p.s. I tried it myself and I got the same results as the blog post. It is just that I am not able to post my results as of now.

Implict Conversion happened Why Index Used

Rajeswari, September 17, 2009 - 1:33 am UTC

http://74.125.153.132/search?q=cache:7nZzDgw6KM4J:www.oratraining.com/blog/2009/06/oracle-sql-tuning-tips/+ORACLE+IMPLICIT+CONVERSION+RIGHT+SIDE+TO_NUMBER&cd=9&hl=en&ct=clnk&gl=in

Part of information from above URL:
Oracle automatically performs simple column type conversions(or casting) when it compares columns of different types. Depending on the type of conversion, indexes may not be used. Make sure you declare your program variables as the same type as your Oracle columns, if the type is supported in the programming language you are using.
Use:
SELECT emp_no, emp_name, sal FROM emp WHERE emp_no = '123';
HERE if emp_no indexed numeric, then after implicit conversion query will be:
SELECT emp_no, emp_name, sal FROM emp WHERE emp_no = TO_NUMBER('123');
Thus, index is used in this case.
Don't use:
SELECT emp_no, emp_name, sal FROM emp WHERE emp_type = 123;
HERE if emp_type is indexed varchar2, then after implicit conversion query will be:
SELECT emp_no, emp_name, sal FROM emp WHERE TO_NUMBER(emp_type) = 123;
Thus, index will not be used in this case.

My question:
Is above statement true? Till now my understanding is whatever datatype if implicit conversion happens index will not be used. In my UAT environment I ran below query and waiting for tkprof output to know the actuals. Before that I want to hear from you too.

Table Name: CDMST
Column: CARD_NUM
Data Type: NUMBER(16)

Indexes on Columns:

CDMST_NIDX2 CDMST CUST_NO 1
CDMST_NIDX1 CDMST CARD_NUM 1
CDMST_NIDX4 CDMST CARD_CUST_NO 1
CDMST_NIDX3 CDMST PRIMARY_PARENT_CARD_NUM 1

SELECT CARD_NUM, PRIMARY_PARENT_CARD_NUM,CARD_TYPE,SUBSTR(LB_DATE,1,4),LB_DATE,
BILLED_AMOUNT FROM CDMST
WHERE CARD_NUM in (to_char(4564070096994002),to_char(5546379650972001))


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=2 Bytes=54)
   1    0   INLIST ITERATOR
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'CDMST' (Cost=3 Card=2
          Bytes=54)

   3    2       INDEX (RANGE SCAN) OF 'CDMST_NIDX1' (UNIQUE) (Cost=2 C
          ard=2)



Background:
Our application is developed in Pro*C and most of the variables are declared as VARCHAR2. Recently we faced a issue on performance where we have table having a index on column which is not used. As where criteria have few other columns they planned to create few more indexes. But my point is due to implicit conversion even if index get created it will not be used.
Still we created index and program completed very fast.



Tom Kyte
September 17, 2009 - 9:19 am UTC

... Is above statement true? ...

why not test it and verify??? It is what I would do if I didn't believe something, it would be easy for you to verify that "hey, even when there is an implicit conversion - the index is used"

Actually, I see you did - you wrote:

"... Till now my understanding is whatever datatype if implicit conversion happens index will not be used. ...."

and then presented a test case that shows that understanding to be false, the index was readily used.


The rule is (forget function based indexes for a minute) in general: if you apply a function to a database column, an index on that column cannot be use.

In the first case:

where database_column_that_is_number = '123'

the resulting predicate is:
where database_column_that_is_number = to_number('123')

and the implicit conversion (which is BAD, should not be done) does not prevent the index on database_column_that_is_number from being used, there was no function applied to that column

Thank You

Rajeswari, September 18, 2009 - 2:28 am UTC

Thank you very much Tom

What phisics/math behind that?

Marat Tolgambayev, November 17, 2010 - 6:56 pm UTC

Dear Tom,

I do understand that columns order is not important for index performance. But Why the selectivity of the columns doesn't influence on that? Is it a special Oracle optimizing algorithm, or just a nature of B-Tree indexes?
Thank you!
Tom Kyte
November 18, 2010 - 3:33 am UTC

Column ordering IS important for index performance.

If you query "where x > 5 and y = 10", you want the index on (y,x) - NOT on (x,y) (even if X is unique and y has 42 values and the table has a million records - you want the index on Y,X)


If you query "where x = 5 and y = 10" - then it matters not if you put x or y first (actually, there is something to be said for putting y first - that index would compress very nicely in Y,X - but not compress at all on X,Y)


close your eyes and think about it. What would the height of the two indexes be in the following:


index(x,y)
index(y,x)

would they be

a) probably the same
b) probably different


I pick (A). the height of an index tells you how many IO's from root to leaf for any given key. Now, if the heights of the indexes are the same - then..... their retrieval performance to get the first rowid out of them would be .... the same.


No magical optimization - just a fact.


(if we compressed the indexes - then the heights would be more likely to differ with the height of (x,y) being greater than or equal to the height of (y,x) - making Y,X perhaps a more sensible choice - which runs counter to "conventional wisdom" of course)


It is the nature of searching a sorted list - not a b*tree, nothing magic, just a sorted list.


One should consider "non equal" queries

Marat Tolgambayev, November 18, 2010 - 10:03 pm UTC

Thank you Tom!

column order of index

A reader, August 12, 2013 - 3:04 pm UTC

Hi Tom,

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1409668892797#3991545699434

In the above link, you created two indexes:

create index t_idx_1 on t(owner,object_type,object_name)

create index t_idx_2 on t(object_name,object_type,owner)

For index t_idx_1, the columns are ordered from least discriminating to most discriminating, and index t_idx_2 is the opposite.

I have some questions:

1. If vlaue in the least discriminating column (in this case, column owner) is updated very often, will that cause index t_idx_1 to have more index reorganization than t_idx_2?

2. If we delete data by the most discriminating column (in this case, column object_name), will it cause index t_idx_1 to become a more spare index than t_idx_2?

Thanks!




Tom Kyte
August 12, 2013 - 4:20 pm UTC

1) no not really. think about it. Any modification of any value is going to cause a move.


Say you have the above two indexes. owner is non-selective (few values) and object_name is highly selective.

remember also that space cannot be reused in a non-unique index within a single transaction.
http://richardfoote.wordpress.com/2009/03/25/differences-between-unique-and-non-unique-indexes-part-iv-take-it-back/


So, if you update the first index and change owner from X to Y - you'll likely have to move that index entry to an entirely different leaf block - because all of the rows in front and behind the X row would be X's as well - the Y value "doesn't fit there". So we can agree that row would move in most all cases (sure, there could be a case where it doesn't have to move to a different leaf block - but it would certain *move* within the index structure.


Now, if you update the object_name in that first index - it again would *have to move*. Even if you just updated object_name from A to AA and there were no other object names that started with A at all - it would have to move. Because we cannot reuse space within that single transaction.


Same for the other index - you have a new key - that key cannot use the existing space - it has to claim new space - it has to move.


2) as for deletions, if you delete by object_name, you are deleteing but one or two rows - maybe three or four. Not many at all.

While it would be true that if the 1-4 rows were "together", and you inserted a set of rows with object names "near" to that deleted value in another transaction - they might be able to reuse that space. But then again, if you inserted a few new object names that were NOT near that space - you'd not reuse that space.

On the other hand, if you do a few deletes here and there - you have space spread out all over the place and the possibility that any random insert will find space already without having to block split goes up. On the other hand - if you insert a bunch of similar object names that all go together - the odds you find a big enough chunk to put them all together without splitting goes down.


So as you can see - it is rather a story of six one way, half a dozen the other...