Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Arjun.

Asked: April 25, 2003 - 9:39 am UTC

Last updated: March 25, 2013 - 2:49 pm UTC

Version: 8.1.7

Viewed 100K+ times! This question is

You Asked

Hi Tom,
I have the situation as defined below:

I have created two tables
create table I1(n number primary key, v varchar2(10));
create table I2(n number primary key, v varchar2(10));

and a map table
create table MAP(n number primary key, i1 number referencing I1(n),
i2 number referencing I2(n));

I have an index created on MAP
create unique index IDX_MAP on MAP(i1, i2)

Now when I take the explain plan for the query :
select *
from
i1,
map,
i2
where i1.n = map.i1
and i2.n = map.i2
and i1.v = 'x'
and i2.v = 'y';

I see the plan as

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'MAP'
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'I2'
5 4 INDEX (UNIQUE SCAN) OF 'SYS_C00683648' (UNIQUE)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I1'
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C00683647' (UNIQUE)

Is there any way to avoid the full table scan on MAP table?

What ever I try, one table is always going for a full scan. Like if I change the query like
select *
from
map,
i1,
i2
where i1.n = map.i1
and i2.n = map.i2
and i1.v = 'x'
and i2.v = 'y';
I get the plan
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'I1'
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'MAP'
5 4 INDEX (RANGE SCAN) OF 'IDX_MAP' (UNIQUE)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2'
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C00688807' (UNIQUE)


What should I do to avoid a full scan in such a case?

Thanks in advance
Arjun


and Tom said...

first of all -- just keep saying to yourself over and over


"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"

over and over and over...

Then, look at your query -- tell me, how can a full scan be avoided?

tell me -- using the EXISTING data structures you have -- what plan could YOU yourself as a human come up with that does not involve a full table or index scan? Given your existing structures, the INDEXES are actually deadly here -- the CBO would be smarter and stop using them.

ops$tkyte@ORA920> set autotrace traceonly explain
ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'MAP'
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'I2'
5 4 INDEX (UNIQUE SCAN) OF 'SYS_C003735' (UNIQUE)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I1'
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003734' (UNIQUE)



ops$tkyte@ORA920>
ops$tkyte@ORA920> exec dbms_stats.set_table_stats( user, 'I1', numrows => 10000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA920> exec dbms_stats.set_table_stats( user, 'I2', numrows => 10000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA920> exec dbms_stats.set_table_stats( user, 'MAP', numrows => 100000, numblks => 10000 );

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2891 Card=10 Bytes=790)
1 0 HASH JOIN (Cost=2891 Card=10 Bytes=790)
2 1 TABLE ACCESS (FULL) OF 'I2' (Cost=963 Card=100 Bytes=2000)
3 1 HASH JOIN (Cost=1927 Card=1000 Bytes=59000)
4 3 TABLE ACCESS (FULL) OF 'I1' (Cost=963 Card=100 Bytes=2000)
5 3 TABLE ACCESS (FULL) OF 'MAP' (Cost=963 Card=100000 Bytes=3900000)


as designed, those indexes would KILL YOU -- literally KILL you. Watch:

ops$tkyte@ORA920> insert into i1 select rownum, rpad('*',10,'*') from all_objects;

30020 rows created.

ops$tkyte@ORA920> insert into i2 select rownum, rpad('*',10,'*') from all_objects;

30020 rows created.

ops$tkyte@ORA920> insert into map select rownum, rownum, rownum from all_objects;

30020 rows created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> set autotrace traceonly
ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'MAP'
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'I2'
5 4 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I1'
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003754' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
60127 consistent gets
0 physical reads
60 redo size
513 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

ops$tkyte@ORA920>
ops$tkyte@ORA920> analyze table i1 compute statistics;

Table analyzed.

ops$tkyte@ORA920> analyze table i2 compute statistics;

Table analyzed.

ops$tkyte@ORA920> analyze table map compute statistics;

Table analyzed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=21 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=20 Card=1 Bytes=26)
3 2 TABLE ACCESS (FULL) OF 'I1' (Cost=10 Card=1 Bytes=14)
4 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
5 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
6 5 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
92 consistent gets
0 physical reads
0 redo size
513 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

ops$tkyte@ORA920> set autotrace off

tell me -- which looks better to you?? to me -- that second one is looking pretty good



creating an index on i1.v or i2.v would be helpful there maybe?


ops$tkyte@ORA920> create index i1_idx on i1(v);

Index created.

ops$tkyte@ORA920> create index i2_idx on i2(v);

Index created.

ops$tkyte@ORA920> analyze table i1 compute statistics;

Table analyzed.

ops$tkyte@ORA920> analyze table i2 compute statistics;

Table analyzed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> set autotrace traceonly
ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=13 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=12 Card=1 Bytes=26)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'I1' (Cost=2 Card=1 Bytes=14)
4 3 INDEX (RANGE SCAN) OF 'I1_IDX' (NON-UNIQUE) (Cost=1 Card=1)
5 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
513 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

ops$tkyte@ORA920>
ops$tkyte@ORA920> set autotrace off


Also -- you really need to ANALYZE your tables and use the CBO to its fullest.


But remember, keep saying

"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"
"full scans are not evil, indexes are not good"

until you believe it.






Rating

  (118 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

I would rephrase that

A reader, April 25, 2003 - 4:47 pm UTC

"full scans are NOT ALWAYS evil, indexes are NOT ALWAYS good"
"full scans are NOT ALWAYS evil, indexes are NOT ALWAYS good"
"full scans are NOT ALWAYS evil, indexes are NOT ALWAYS good"
"full scans are NOT ALWAYS evil, indexes are NOT ALWAYS good"


Tom Kyte
April 25, 2003 - 7:37 pm UTC

good point...

nothing is black or white, it is all shades of grey

JUST COOL

kiro, April 26, 2003 - 9:44 am UTC


Indexes are (usually) good?

Dragan Novicic, October 27, 2003 - 1:02 pm UTC

I have seen the previous example in a sample Chapter 2 (Page 30) of your new book and spent a day analyzing it. Something confused me.
You were definite in your attitude that having a predefined database structure and a query to run against it, it is not possible to avoid a full table scan.
For me, the answer is simply creating a missing index on table I1 on column v. Actually, you were proposing such an index in your answer, but the real cause for confusion was in choosing a constant value for v column (‘**********’).
If we change your example slightly, by filling v column with
substr(object_name,1,10), we will come to totally different
conclusions.

dnovicic@NOKI> select * from i1, map, i2
2 where i1.n=map.i1
3 and i2.n=map.i2
4 and i1.v='x'
5 and i2.v='y';

no rows selected

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=15 Card=1 Bytes=40)
2 1 NESTED LOOPS (Cost=12 Card=3 Bytes=78)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'I1' (Cost=3 Card=3 Bytes=42)
4 3 INDEX (RANGE SCAN) OF 'I1_IDX' (NON-UNIQUE) (Cost=1 Card=3)
5 2 TABLE ACCESS (BY INDEX ROWID) OF 'MAP' (Cost=3 Card=21570 Bytes=258840)
6 5 INDEX (RANGE SCAN) OF 'IDX_MAP' (UNIQUE) (Cost=2 Card=21570)
7 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=3 Bytes=42)
8 7 INDEX (UNIQUE SCAN) OF 'SYS_C002343' (UNIQUE)

As we can see, CBO has produced the execution plan, without using full table scan. What you were demonstrating was more like an exception (tables filled with constant values), than a rule, for a given table structure and a given query against them.



Tom Kyte
October 27, 2003 - 1:49 pm UTC

i was adamant that given the structures the person asking the question was using -- a full scan was the only way. I said "could add an index to avoid it" but saw a better way.

Some other aspects

Dragan Novicic, October 27, 2003 - 1:09 pm UTC

One remark also stands here: measuring efficiency of query plan by statistics of consistent gets in a case when result contains no rows at all, is misleading. Because query plan contains steps that were not executed simply because access to the first of three joined tables produced no rows, we can not have a picture of what will happen if these tables were accessed during query execution.
I will recommend you to use excellent example of three tables and a given query against them, from original question, to discuss influence of cardinality on mapping table. When accessing MAP table from the side of I1 table, as in the query plan above, it would be interesting to see what will happen when the mapping is actually 1:N and not 1:1 as in your example. In addition, the influence of CLUSTERING_FACTOR of index on map(i1,i2) could be demonstrated here. Maybe even the use of over-indexing on map table by producing index on map(i1,i2,n) to avoid access to a map table in joining operation at all.
Demonstration of index inefficiency on low cardinality columns should be demonstrated on a single table and not using such a complex, three tables scenario.


Scan Waits?

Mike, November 05, 2003 - 2:57 pm UTC

Hi Tom,

In our *large* warehouse, we do a lot of full table scans. Our statspack reports usually show something like:

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file scattered read 28,040 1,771 54.79
free buffer waits 1,464 1,252 38.73
CPU time 109 3.38
db file sequential read 12,462 87 2.69
log file parallel write 172 4 .12
-------------------------------------------------------------
Wait Events for DB:

Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
db file scattered read 28,040 0 1,771 63 9,346.7
free buffer waits 1,464 1,129 1,252 855 488.0
db file sequential read 12,462 0 87 7 4,154.0

Are these waits due to the full table scans?

Do you have any suggestions for increasing the performance of full table scans? Block buffers are maxed out and disk system is already tuned.

Thank you.

Tom Kyte
November 06, 2003 - 7:13 am UTC


have you looked at MV's to help with the full scan issue (full scan 10meg instead of 10gig)

have you efficiently partitioned in order to help reduce the io?

are you using bitmap indexes in order to help reduce the io?


we are not going to look at "how to full scan faster", we are going to look at "how to NOT have to full scan"

the best way to speed up something is to not do it in the first place.

Scan Waits

A reader, November 06, 2003 - 9:02 am UTC

We have implemented all of your suggestions. However, we get a number of one time query requests that often result in doing a full table scan on a billion plus row table (12 partitions).

Am I correct to say that the waits are due to Oracle trying to find blocks in the buffer cache?


how to use Full Table Scan

A reader, November 28, 2003 - 7:46 am UTC

Hi

My question is the other way round.
I want to use FTS but Oracle does not do it!

Here are the informations:

TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS AVG_SPACE AVG_ROW_LEN
-------------- ---------- ---------- ------------ ---------- -----------
PSCLASSDEFN 300 3 4 1804 61
PSROLECLASS 2216 12 3 1651 33
PSROLEUSER 9160 39 0 1154 28

TABLE_NAME INDEX_NAME COLUMN_NAME
-------------- ---------------- ----------------
PSCLASSDEFN PS_PSCLASSDEFN CLASSID
PSCLASSDEFN PSAPSCLASSDEFN CLASSID
PSCLASSDEFN PSAPSCLASSDEFN VERSION
PSCLASSDEFN PSBPSCLASSDEFN VERSION
PSROLECLASS PS_PSROLECLASS ROLENAME
PSROLECLASS PS_PSROLECLASS CLASSID
PSROLEUSER PS_PSROLEUSER ROLEUSER
PSROLEUSER PS_PSROLEUSER ROLENAME
PSROLEUSER PSAPSROLEUSER ROLENAME


dbms_space information

Owner Type Name Blocks Next Ext Unused Free
------------ --------------- ---------------------------- ------- -------- ------ ----
SYSADM TABLE PSROLEUSER 40 0 4
SYSADM TABLE PSROLECLASS 16 3 2
SYSADM TABLE PSCLASSDEFN 8 4 1
SYSADM TABLE PSROLEUSER_RMT 8 7 0
SYSADM INDEX PSAPSROLEUSER 64 3 44
SYSADM INDEX PS_PSROLEUSER 64 3 43
SYSADM INDEX PS_PSROLECLASS 24 3 10
SYSADM INDEX PSAPSCLASSDEFN 8 6 0
SYSADM INDEX PSBPSCLASSDEFN 8 6 0
SYSADM INDEX PS_PSCLASSDEFN 8 6 0
SYSADM INDEX PS_PSROLEUSER_RMT 8 6 0



variable a varchar2(30)

exec :a := 'ABILEY'

SELECT DISTINCT (b.classid)
FROM SYSADM.PSROLEUSER A, SYSADM.PSROLECLASS B
WHERE a.rolename = b.rolename
AND a.roleuser = :A
AND NOT EXISTS (SELECT c.classid
FROM SYSADM.PSCLASSDEFN C
WHERE c.classid = b.classid)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=48 Card=43 Bytes=2365)
1 0 SORT (UNIQUE) (Cost=5 Card=43 Bytes=2365)
2 1 FILTER
3 2 NESTED LOOPS (Cost=3 Card=43 Bytes=2365)
4 3 TABLE ACCESS (FULL) OF 'PSROLECLASS' (Cost=3 Card=111 Bytes=3330)
5 3 INDEX (UNIQUE SCAN) OF 'PS_PSROLEUSER' (UNIQUE)
6 2 INDEX (UNIQUE SCAN) OF 'PS_PSCLASSDEFN' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2233 consistent gets
0 physical reads
0 redo size
275 bytes sent via SQL*Net to client
460 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed

HERE if I use FTS


SELECT /*+ FULL(A) */ DISTINCT (b.classid)
FROM SYSADM.PSROLEUSER A, SYSADM.PSROLECLASS B
WHERE a.rolename = b.rolename
AND a.roleuser = :A
AND NOT EXISTS (SELECT c.classid
FROM SYSADM.PSCLASSDEFN C
WHERE c.classid = b.classid)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=57 Card=43 Bytes=2365)
1 0 SORT (UNIQUE) (Cost=14 Card=43 Bytes=2365)
2 1 FILTER
3 2 HASH JOIN (Cost=12 Card=43 Bytes=2365)
4 3 TABLE ACCESS (FULL) OF 'PSROLEUSER' (Cost=8 Card=22 Bytes=550)
5 3 TABLE ACCESS (FULL) OF 'PSROLECLASS' (Cost=3 Card=111 Bytes=3330)
6 2 INDEX (UNIQUE SCAN) OF 'PS_PSCLASSDEFN' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
42 consistent gets
0 physical reads
0 redo size
275 bytes sent via SQL*Net to client
460 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed


42 consistent gets vs 2233!!!!!

Oracle 9.2.0.2 64 bits on HP-UX 64 bits

Why is the optimizer so stupid :-(

Tom Kyte
November 28, 2003 - 10:51 am UTC

seems to me that your statistics must be way out of whack. You see -- the optimizer thinks "111 rows from psroleclass"

so, lets see the tkprof.


and remember the old saying "garbage in, garbage out", things are only as smart as the information provided them.

analyze or dbms_stats.gather_table_stats

Robin H, November 28, 2003 - 2:18 pm UTC

I was told in 9i to stop using the feature "analyze table" and replace it with dbms_stats.gather_table_stats().

Does it matter, are they the same?

Tom Kyte
November 28, 2003 - 2:44 pm UTC

the recommendation going forward is "dbms_stats"

it is more flexible (you'll love gather stale and alter table monitoring -- you only gather stats on that which you need). can run in parallel. is more automated/automatable.

as with ANY big change -- testing is important.

in general,

dbms_stats.gather_table_stats( user, 'T', method_opt => 'for all indexed columns', cascade => true )

is more or less the same as

analyze table t compute statistics for table for all indexes for all indexed columns;


method opt gets histograms
cascade gets indexes
gather_table_stats does the table

42 consistent gets vs 2233!!!!!

A reader, December 01, 2003 - 4:33 am UTC

hi again

I re-analyzed the table with this syntax

exec dbms_stats.gather_table_stats(OWNNAME => 'SYSADM', -
TABNAME => 'PSROLEUSER', -
ESTIMATE_PERCENT => 99, -
METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 75', -
DEGREE => 2, -
CASCADE => TRUE);

for all three involved tables


and still got this plan

1 SELECT DISTINCT (b.classid)
2 FROM SYSADM.PSROLEUSER A, SYSADM.PSROLECLASS B
3 WHERE a.rolename = b.rolename
4 AND a.roleuser = :A
5 AND NOT EXISTS (SELECT c.classid
6 FROM SYSADM.PSCLASSDEFN C
7* WHERE c.classid = b.classid)

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=11 Card=6 Bytes=324)
1 0 SORT (UNIQUE) (Cost=5 Card=6 Bytes=324)
2 1 FILTER
3 2 NESTED LOOPS (Cost=3 Card=6 Bytes=324)
4 3 TABLE ACCESS (FULL) OF 'PSROLECLASS' (Cost=3 Card=111 Bytes=3330)
5 3 INDEX (UNIQUE SCAN) OF 'PS_PSROLEUSER' (UNIQUE)
6 2 INDEX (UNIQUE SCAN) OF 'PS_PSCLASSDEFN' (UNIQUE)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2233 consistent gets
0 physical reads
0 redo size
275 bytes sent via SQL*Net to client
460 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed

As you said it´s taking 111 as cardinality, not sure why because if I do


select count(*) from SYSADM.PSROLECLASS;


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'PSROLECLASS' (Cost=3 Card=2216)

as you can see it does know 2216 as the *real* cardinality

Now, let´s see the tkprof


SELECT DISTINCT (b.classid)
FROM SYSADM.PSROLEUSER A, SYSADM.PSROLECLASS B
WHERE a.rolename = b.rolename
AND a.roleuser = :A
AND NOT EXISTS (SELECT c.classid
FROM SYSADM.PSCLASSDEFN C
WHERE c.classid = b.classid)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.02 0.02 0 2233 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.02 0.02 0 2233 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 56 (SYSTEM)

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT UNIQUE (cr=2233 r=0 w=0 time=24878 us)
0 FILTER (cr=2233 r=0 w=0 time=24848 us)
0 NESTED LOOPS (cr=2233 r=0 w=0 time=24844 us)
2216 TABLE ACCESS FULL OBJ#(40626) (cr=15 r=0 w=0 time=4385 us)
0 INDEX UNIQUE SCAN OBJ#(40650) (cr=2218 r=0 w=0 time=11728 us)(object id 40650)
0 INDEX UNIQUE SCAN OBJ#(39824) (cr=0 r=0 w=0 time=0 us)(object id 39824)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 SORT (UNIQUE)
0 FILTER
0 NESTED LOOPS
2216 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'PSROLECLASS'
0 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'PS_PSROLEUSER'
(UNIQUE)
0 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'PS_PSCLASSDEFN'
(UNIQUE)


I am puzzled

Tom Kyte
December 01, 2003 - 7:15 am UTC

what is the output of

SQL> show parameter optimizer 

show parameter

A reader, December 01, 2003 - 7:54 am UTC

show parameter optimizer

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling integer 1
optimizer_features_enable string 9.2.0
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_max_permutations integer 2000
optimizer_mode string CHOOSE


all default!


try to rebuild

A reader, December 02, 2003 - 3:59 pm UTC

hi

have you reorganized your database? tables and indexes?

please do so and see your database fly

Tom Kyte
December 02, 2003 - 9:16 pm UTC

or not, or maybe see your system grind to a halt.

point is MEASURE and make sure you DO MORE GOOD THEN HARM.

(like a doctor)

Setting Table Statistics

A reader, December 03, 2003 - 5:28 pm UTC

Tom,

1. In the above queries, what was your observation that made you to conclude that an index will be a poor performer. It would be of help if you could explain your observation on that and how you decided that using an index will be a performance killer. What statistics we need to consider?
2. My understanding is that set_table_stats can be used to simulate a production volume on a development environment to test queries. Is that correct? Also, what does numblks option mean?
3. If 2 is true, then is there a way to simulate things such as co-located/disorganized. ie I want to simulate a condition where a table value is spread across multiple blocks (ie. want to simulate a condition where data is disorganized and data is colocated). Are there any procedures like set_table_stats available to simulate the situation. A good example will be of great help

Tom Kyte
December 03, 2003 - 7:12 pm UTC

1) no statistics, just looked at the structures:

create table I1(n number primary key, v varchar2(10));
create table I2(n number primary key, v varchar2(10));

create table MAP(n number primary key, i1 number referencing I1(n),
i2 number referencing I2(n));

create unique index IDX_MAP on MAP(i1, i2)

select * from i1, map, i2 where i1.n = map.i1 and i2.n = map.i2
and i1.v = 'x' and i2.v = 'y';


now, there are no indexes on v. So, we have to join from i1 to map to i2 (if we want to use any indexes -- the index on map leads with i1)

So, we could

full scan i1 -> filter on v='x' -> index scan to map -> index scan to i2 -> filter on v='y'


say there are 100,000 records in i1.

most or many are v='x'.

so we'll do tons of index range scans on map's index (for each row in i1 such that v='x') and then do tons of index ranges on i2 to do a table access by index rowid to determine if v='y' in that table.

hip hop, skip, jump, io io io io io (after io after io)

indexes would not be "a good thing"

2) it can not be used for that AT ALL. not AT ALL. you can use it to see what the plan would be if the table in dev was that size (the size production was).

but -- so what???? what would that do for you?

scenario:

table as 1,000,000 rows in production. you export stats and import then into test (where the table has 1,000 rows) or you just set them.

you run a query in test. it runs GREAT. superb. fantastic.

what can you say about its performance in prod?

nothing, less then nothing actually (because you are falsely influenced by "great performance in test with prod stats")

you run another query in test. it runs terrible, horrible, really bad.


what can you say about its performance in prod?

nothing, less then nothing actually (because you are falsely influenced by "great performance in test with prod stats")

I find exporting of stats from prod into test good for:

a) reproducing an optimizer "bad plan" for support
b) testing different hints, to see which "stick"


thats about it.

3) I call this "restore the backup of prod to test". that buys you two things


a) you discover for the first time whether you actually can (restore :)
b) you can do real load testing, real performance testing, real QA testing.


you cannot test on an empty database.
you cannot test on a database with 10% of production

and get many "meaningful data points" from it. You can test functionality, but not performance -- not scale.

what is the percentage of rows

A reader, December 04, 2003 - 4:29 am UTC

Hi

If we have 1000000 rows, at how much percent Oracle would decide to do a full table scan? I read somewhere (cant find it now) that if Oracle has to return more than 10 percent of rows it would go FTS even there are indexes

Is it true?

Tom Kyte
December 04, 2003 - 8:20 am UTC

it is not a %, it is not true.

for the RBO, it would use an index if it can - at all costs it would (stupidly) use an index.

for the CBO, it is a matter of "how many LIO's will i have to do in order to process the query".  It is a function (among many things) of the clustering factor. 

Consider these two tables:


ops$tkyte@ORA920PC> create table orderly
  2  as
  3  select *
  4    from all_objects
  5   order by object_id;
 
Table created.
 
ops$tkyte@ORA920PC> create index orderly_idx on orderly(object_id);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> analyze table orderly compute statistics
  2  for table
  3  for all indexes
  4  for all indexed columns;
 
Table analyzed.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> create table disorganized
  2  as
  3  select *
  4    from orderly
  5   order by reverse(object_id);
 
Table created.
 
ops$tkyte@ORA920PC> create index disorganized_idx on disorganized(object_id);
 
Index created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> analyze table disorganized compute statistics
  2  for table
  3  for all indexes
  4  for all indexed columns;
 
Table analyzed.
 
<b>same exact amount of data -- down to the last bit.  Should behave the same right?  

wrong.  the first table just happens to be "sorted" by object_id (so a range scan on the index will hit the minimum number of blocks).  the second is randomly organized (so a range scan on the index will hit many many blocks)

the clustering factor computed for each is:</b>


ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select a.table_name, b.index_name, a.blocks, b.CLUSTERING_FACTOR
  2    from user_tables a, user_indexes b
  3   where a.table_name in ( 'ORDERLY', 'DISORGANIZED' )
  4     and a.table_name = b.table_name
  5  /
 
TABLE_NAME        INDEX_NAME                         BLOCKS CLUSTERING_FACTOR
----------------- ------------------------------ ---------- -----------------
DISORGANIZED      DISORGANIZED_IDX                      429             30240
ORDERLY           ORDERLY_IDX                           429               413
 

<b>orderly has a cluster factor near the NUMBER OF BLOCKS (that means an index range scan will hit as few table blocks as possible -- the data in the table just happens to be sorted by the index key)

disorganized has a cluster factor near the number of rows in the table (that means an index range scan will hit as many blocks as possible -- each row in the index points to a block different from the row in front and behind it)

Optimizer says:</b>

ops$tkyte@ORA920PC> set autotrace traceonly
ops$tkyte@ORA920PC> set pagesize 900
ops$tkyte@ORA920PC> set pause on
ops$tkyte@ORA920PC> select *
  2    from orderly
  3   where object_id between 10000 and 12500
  4  /
 
2501 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=2495 Bytes=239520)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'ORDERLY' (Cost=42 Card=2495 Bytes=239520)
   2    1     INDEX (RANGE SCAN) OF 'ORDERLY_IDX' (NON-UNIQUE) (Cost=7 Card=2495)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        375  consistent gets
         36  physical reads
          0  redo size
     172161  bytes sent via SQL*Net to client
       2325  bytes received via SQL*Net from client
        168  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2501  rows processed
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select *
  2    from disorganized
  3   where object_id between 10000 and 12500
  4  /
 
2501 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=43 Card=2495 Bytes=239520)
   1    0   TABLE ACCESS (FULL) OF 'DISORGANIZED' (Cost=43 Card=2495 Bytes=239520)
 
 
 
 
Statistics
----------------------------------------------------------
        242  recursive calls
          0  db block gets
        605  consistent gets
         44  physical reads
          0  redo size
     209302  bytes sent via SQL*Net to client
       2325  bytes received via SQL*Net from client
        168  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2501  rows processed
 
<b> range scan for one and full scan for the other -- note the index scan seems more efficient -- used less consistent gets, but...</b>

ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select /*+ index( disorganized disorganized_idx ) */ *
  2    from disorganized
  3   where object_id between 10000 and 12500
  4  /
 
2501 rows selected.
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2503 Card=2495 Bytes=239520)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'DISORGANIZED' (Cost=2503 Card=2495 Bytes=239520)
   2    1     INDEX (RANGE SCAN) OF 'DISORGANIZED_IDX' (NON-UNIQUE) (Cost=7 Card=2495)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2675  consistent gets
         33  physical reads
          0  redo size
     172161  bytes sent via SQL*Net to client
       2325  bytes received via SQL*Net from client
        168  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2501  rows processed
 
<b>we can see how much less efficient in this case the index is... for THIS table</b>
 

Different Plan

A reader, December 04, 2003 - 12:59 pm UTC

Tom,

I created the sample tables I1,I2,MAP on my development environment using the same scripts given in the discussion. Also, I populated I1 with around 25000 rows and no rows in I2 and MAP. But the execution plan was different.

The following was the execution plan :

Nested Loops
Nested Loops
Table Access FULL MAP
Table Access by rowid I2
Index Unique scan
Table Access by rowid I1
Index unique scan

It did not do a Full table scan on I1 and did not start with I1 as you explained. Is this because of the data? If the tables are ANALYZED will it get the execution path you mentioned.
2. Is the CBO execution plan dependent on data volumes and data distribution. If so, how do we write a query that can run efficiently in both production and development environments. What should we do to run efficiently. ie. What to look for in a query so that if I run it on development environment (low data volume), so that the same query will run efficiently with large data volumes. development, it will run efficiently


Tom Kyte
December 04, 2003 - 1:09 pm UTC

you got the same plan as the original questioner did (and didn't want).

I did not explain that it would start with I1, i was "hypothesizing that if it did in fact start with I1 this is what would happen and it would be *bad*"



Clustering factor not considered. Why?

A reader, December 05, 2003 - 6:10 pm UTC

Tom,

I have a table TRADE that has a VARCHAR2(15) column TICKET_NUMBER which has a nonunique index TRADE_IDX1. Here are the statistics from the table :

Total Rows : 2179860
Clus factor for TRADE_IDX1 : 1845610
Total Table Blocks : 215187
Optimizer Feature : 9.2

When I do a query something like SELECT * FROM TRADE
WHERE TICKET_NUMBER LIKE '2003%', it has about 1702348 rows. I expected the EXPLAIN PLAN to show TABLE ACCESS FULL because you are retreiving a large percentage of rows and that the clustering factor is almost equal to the number of rows. But the EXPLAIN PLAN showed a INDEX RANGE SCAN TRADE_IDX1 followed by a TABLE ACCESS BY ROWID which will not be very efficient in this case. Why does the optimizer generate this plan? Also, per your books, you suggested that it would be better to do a full scan in such circumstances since the index approach will involve more logical I/Os compared to full scan which can make use of db_file_multiblock_read_count parameter to read as many blocks in one I/O. Also, when I force the OPTIMIZER to do a FULL scan, the query runs very slow. Has something changed in 9i mode. I saw this happening in some other tables in my database. Pls. help.


Tom Kyte
December 06, 2003 - 9:01 am UTC

hows about showing us the ESTIMATED CARDINALITY that would be in the explain plan as well -- that'll tell us why.

but here I'm confused

a) optimizer uses indexes
b) force to use full scan and find much slower

??? umm, maybe it is doing the right thing for you already??

Why this happens

A reader, December 06, 2003 - 5:48 pm UTC

Tom,

Per your book, the CBO will automatically perform a FTS when the tables are properly analyzed and the clustering factor is equal to number of rows and you select a significant proportion of rows (in this case almost 80% of rows are being selected). Why in the above case the CBO performs an index-range scan followed by a table-access by ROWID instead of full-table scan. I have read similar examples in your book where you show difference between colocated and disorganized and in that discussion you discuss about looking at the CLUSTERING_FACTOR. Why is it different in the above case?
2. Suppose you create a non-unique index on a unique column and do a equality check. Will Oracle still perform an index range scan?

Tom Kyte
December 07, 2003 - 8:43 am UTC

1) hows about you provide the requested information (autotrace traceonly explain). Then, then and only then, can we comment.


2) yes. (easy enough to test no?)

ESTIMATED CARDINALITY

A reader, December 08, 2003 - 1:43 pm UTC

Tom,

In the above response, you had requested for ESTIMATED CARDINALITY in the EXPLAIN PLAN. Where should I look for that requested info. Is it Card=....



Tom Kyte
December 09, 2003 - 5:37 am UTC

just cut and paste the output of an autotrace traceonly explain

yes, it is the CARD= part

Hash joins

kit, December 09, 2003 - 11:23 am UTC

I did place a question on hash join and you asked for more examples. As i was browsing this area, you covered some of the points I wanted to ask

please see below


select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';



Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=13 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=12 Card=1 Bytes=26)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'I1' (Cost=2 Card=1 Bytes=14)
4 3 INDEX (RANGE SCAN) OF 'I1_IDX' (NON-UNIQUE) (Cost=1 Card=1)
5 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)


re the above plan






my understanding is

Hash table Map on column - i2 and create temp sort area temp_a

Index scan on table I1 -


As each row is found, the column n from table i1 is hashed and looks up temp_a to get selected columns from Map table

The rows that match

for each row

scan unique index on table I2





0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=21 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=20 Card=1 Bytes=26)
3 2 TABLE ACCESS (FULL) OF 'I1' (Cost=10 Card=1 Bytes=14)
4 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
5 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
6 5 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)



For the plan B above
Perform full scan on I1 and Map

I believe oracle then splits this into many partitions as possible. Is this based on the join columns
i2.n = map.i2. How does it partition, set number of rows etc ?. What memory is it checking for to see how many partitions
it can make

How does it choose which one to hash


could you please give an example to explain the below two statements from oracle

2. Oracle builds a hash table from one of the partitions (if possible, Oracle
selects a partition that fits into available memory). Oracle then uses the
corresponding partition in the other table to probe the hash table. All
partition pairs that do not fit into memory are placed onto disk.

3. For each pair of partitions (one from each table), Oracle uses the smaller
one to build a hash table and the larger one to probe the hash table.

thanks
kit

Tom Kyte
December 10, 2003 - 3:27 pm UTC

1) if you want to understand "which one to hash" i go over the ins and outs of that in my book 'effective oracle by design'

it trys to hash the smaller of the two.

2) not a question
3) not a question


avoiding full table scans

kit, December 10, 2003 - 7:33 am UTC

Hi Tom,
On question review it indicates my latest question has been
answered but I cannot see it.I would be obliged if you could confirm or correct my understanding

Tom Kyte
December 10, 2003 - 3:25 pm UTC

where do you see that? anyway. I don't see each and ever review every time. i browse what I can but not all of them.


how to avoid full table scans

kit, December 11, 2003 - 10:18 am UTC

Thanks tom but can you please confirm my understanding is correct


Hash table Map on column - i2 and create temp sort area temp_a

Index scan on table I1 -


As each row is found, the column n from table i1 is hashed and looks up
temp_a to get selected columns from Map table

The rows that match

for each row

scan unique index on table I2

on the first plan
and



For the plan B above
Perform full scan on I1 and Map

I believe oracle then splits this into many partitions as possible. Is this
based on the join columns
i2.n = map.i2. How does it partition, set number of rows etc ?.

Questions

(1)What memory is it checking for to see how many partitions it can make

(2) How does it choose which one to hash
(3)can you please explain what is meant by the below two statments from oracle




A. Oracle builds a hash table from one of the partitions (if possible,
Oracle
selects a partition that fits into available memory). Oracle then uses the
corresponding partition in the other table to probe the hash table. All
partition pairs that do not fit into memory are placed onto disk.

B. For each pair of partitions (one from each table), Oracle uses the
smaller
one to build a hash table and the larger one to probe the hash table.



thanks
Kit

Tom Kyte
December 11, 2003 - 10:35 am UTC

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=13 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=12 Card=1 Bytes=26)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'I1' (Cost=2 Card=1 Bytes=14)
4 3 INDEX (RANGE SCAN) OF 'I1_IDX' (NON-UNIQUE) (Cost=1 Card=1)
5 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)


psuedo code for above:

find rows in I1 using index
create hash table on I1

for x in ( select * from map )
loop
hash column in map row, look up in hash_table_I1
if found
then
look up row in I2
if found
then
OUTPUT ROW
end if
end if
end loop


hash joins want to take the smallest table (result) and hash it. It'll reorder them from the plan and hash the smaller of the two.



0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=21 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=20 Card=1 Bytes=26)
3 2 TABLE ACCESS (FULL) OF 'I1' (Cost=10 Card=1 Bytes=14)
4 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
5 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
6 5 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)

psuedo code for that

find rows in I1 using full scan
create hash table on I1

for x in ( select * from map )
loop
hash column in map row, look up in hash_table_I1
if found
then
look up row in I2
if found
then
OUTPUT ROW
end if
end if
end loop



oracle uses hash_area_size in 8i -- what it gets allocated from pga_aggregate_target in 9i -- to determine the partition size.

it wants to hash which ever segment it thinks will be SMALLER

the hash join algorithm is complex -- I've documented it fully in my book if you are interested (with lots of examples). It is too complex to fit here. Here is a short excerpt from the book on the "when it's too big to fit in memory" issue:

....
The processing is a little different if the hash table being developed does not fit entirely in memory. In that case, Oracle will use TEMP space to hold portions (partitions) of the hash table and perhaps of the table that will be used to probe this hash table. When the hash area fills up and Oracle needs to use TEMP, it will find the largest partition within the hash table and write that out to TEMP. (Oracle actually uses a two-part hash table: there is a partition, or section, of the hash table a row belongs to and then a position in that partition.) Any new row retrieved that belongs to that on-disk partition will modify the on-disk image to contain it.

This process continues until the entire hash table is built, with part of it in memory and part of it on disk. Then Oracle begins reading from the other source. As before, the join key will be hashed in order to look up the other value. If the hashed value points to a row in memory, the join is completed and the row is returned from the join. If the row points to a hash partition on disk, Oracle will store this row in TEMP, using the same partitioning scheme it did for the original source data. So, when the first pass is done-all of the rows that could be joined to the in-memory hash table have been joined and some number of partitions remain in TEMP-each partition will be joined only to rows in the other corresponding partition. Oracle will then process each on-disk temporary partition one by one, reading them into memory and joining them, resulting in the final answer.

......


Avoid full table scans

kit, December 15, 2003 - 10:13 am UTC

thanks, I've read pg 468 - 470 many times now and some of its helping but still do not understand

section talks about when when does not fit into memory

-read it few times, is there an example of explaining the text with examples


- what criteria to work out the partitions of the hash table

-under what circumstance will it put the table used to probe the hash table into temp

- What is meant by "using the same partitioning scheme it did for the
original source data" ?

- how can we tell best not use a hash, is it when more 70 % still in temp. how can we tell when the hash table fits into memory


can you get scenarios as
index scan I, index scan table B

find the smaller result set and hash.

For each bigger result set , hash its key and do a lookup


would this be shown as

hash join
index Table I
Index Table B

Or would performance be bad and with a hash join of the table must be full table scan





eg



Table 1 driving table - 215 M rows of which 170,000 rows selected - not long
Table 2 26 M Joined via unique index, stil 170 k rows selected



Table 3 564 rows

still 170k


Woud this be best via hash or NL. the query is in MV and issue about returning 1st row back soon as possible is not an issue

Thanks for your patience


Tom Kyte
December 15, 2003 - 10:39 am UTC

sorry -- but i guess I lose on this one. I cannot explain it in less words then I did in the book.

Basically -- there is a two part key to the hash (that is the partitioning scheme, that is what I mean by "the same scheme"). So, when we hash something -- it is put into a hash table in a particular partition.

If that hash table doesn't fit in ram -- we write out N-1 of the partitions (keeping one in memory).

We finish hashing now. as we add data to the hash table -- if the partition is in memory -- it goes there. All other stuff goes onto disk.

So, now we have this big hash table -- part in memory, part on disk. We start reading the other result set. We look at the hash key and determine what partition it goes with. If it goes with the partition we have in memory -- we output that row, else the row from the other result set we just started reading goes to disk in an "analogous" partition -- meaning, all of the data to hit the first partition we swapped out will be put into the same temp partition, all of the data to hit the second partition of the hash table will be in the same temp partition and so on.

When we finish with this first hash table partition -- we get rid of it and load up the first swapped out one. Now we read that first "temp partition" we just created above and we know every row in there will be hashed to a row in the partition we just loaded up.

and so on.


the "criteria" is just a hashing algorithm. internal, cannot be influenced by you. a function of the join key.


Anytime part of the table used to probe the hash table tries to prode a partition that is NOT IN RAM, that probing row goes to temp. Waits till we load the hash partition it wants into ram.


"We" do not want to make the judgement when best not to use hash joins. "It" (the CBO) has that job.






full table scan

kit, December 16, 2003 - 12:08 pm UTC

hi, thanks again

slowly going in
however, please see below my assumptions

assumptions
============
partitioning is done internally, we cannot control how it partitions

if 10 partitions and does not fit into ram, then writes 9 paritions onto disk


Need more clarification
========================
Did not understand, could this be expanded

"We finish hashing now. as we add data to the hash table -- if the partition is
in memory -- it goes there. All other stuff goes onto disk."



- what data are we adding to the adding to the hash table once finished hasing



"all of the data to hit the
first partition we swapped out will be put into the same temp partition, all of
the data to hit the second partition of the hash table will be in the same temp
partition and so on."

- swapped out, is the ones that could not fit in memory initially, i.e n -1



what stats can we look at to see how much in ram and on disk


Previous questions
===================

did you get chance to look at

can you get scenarios as
index scan I, index scan table B

find the smaller result set and hash.

For each bigger result set , hash its key and do a lookup

would this be shown as

hash join
index Table I
Index Table B

Or would performance be bad and with a hash join of the table must be full
table scan
eg
Table 1 driving table - 215 M rows of which 170,000 rows selected
- not long
Table 2 26 M Joined via unique index, stil 170 k rows
selected

Table 3 564 rows

still 170k


Woud this be best via hash or NL. the query is in MV and issue about
returning 1st row back soon as possible is not an issue



Thanks for your time again

Tom Kyte
December 16, 2003 - 2:11 pm UTC

when i say "we finish hashing now" what I mean is:

we were hashing into ram. we ran out of room. we paused. we wrote on n-1 partitions to disk. NOW we finish hashing -- reading the source to hash, hashing and putting the hashed results into the right place.

In 9iR2, you can see the "swapping" in a tkprof, on the query plan lines (you'll see a w=NNNNN showing the number of writes involved), also available in v$ tables (but as an aggregate for the query)



as for the last question -- i'll just keep reverting to "analyze, and ask the CBO". that is the answer.

merry xmas and happy new year to you

kit, December 23, 2003 - 6:18 pm UTC

tom,

Thanks for your help, still trying to understand the hashing but for time have an excellent xmas and new year

reader, March 18, 2004 - 7:53 pm UTC

HI TOM,

I am the beginner for sql turning.
I have a couple os of questions.

1.QUESTION 1
---------------
Two tables

T1(K1,K2,F1,F2...)

T2(K1,K2,F1,F2...)

K1 AND K2 are the KEYS OF TABLE T1 and T2
I have a query like this

SELECT *
FROM
T1,
T2
WHERE
T1.K1=T2.K1(+)
AND
T1.K2=T2.K2(+)
AND
T2.F1 >4
;
The question is "Is that necessary to put index on T2.F1 to increase the perfmance ?"

a table T3(F1,F2,F3...)

SELECT
SUBSTR(TO_CHAR(T3.F1,'000'),2,3),
COUNT(F2),
SUM(F3)
FROM T3
GROUP BY
SUBSTR(TO_CHAR(T3.F1,'000'),2,3);

QUESTION2
-----------------
Is that necessary like this in the group by
"
GROUP BY
SUBSTR(TO_CHAR(T3.F1,'000'),2,3)
"
instead of

"
GROUP BY
T3.F1
"
WHAT IS different btween them ?

QUESTION3 IS
----------------
What is the relationship btween 'index' and 'group by' ?

For the above case put the index on f1 can increase performance ? if so INDEX ON FUNCTION SUBSTR(TO_CHAR(T3.F1,'000'),2,3) can be better?


Thank you very much in advance!

Tom Kyte
March 18, 2004 - 8:09 pm UTC

1) the question should be "why did I outer join to T2 in the first place"

think about it. why did you outer join to t2? tell me if it makes sense.

2) you select to_char( .... ), that is what you must group by. t3.f1 isn't what you selected, it isn't what you can group by.

the difference is one query will actually execute - and the other will not (cannot, doesn't make sense)

3) an index is a data structure. group by is an operation. there is not any "relationship" between them whatsoever.


indexes can increase performance
indexes can (massively) decrease performance
indexes can have no impact whatsoever.

I can give examples of data whereby the answer to all of your index questions is "yes", or "no" or "maybe".

reader, March 19, 2004 - 3:14 pm UTC

Hi TOM,

Thank you for the perfect answers.

For the question 1

Is that you means the outjoin is useless

SELECT *
FROM
T1,
T2
WHERE
T1.K1=T2.K1(+)
AND
T1.K2=T2.K2(+)
AND
T2.F1 >4
;

my further question is how the following can impact the performance ?

SELECT *
FROM
T1,
T2
WHERE
T1.K1=T2.K1(+)
AND
T1.K2=T2.K2(+)
AND
T2.F1 >4
;

Is that outjoin on the k1 amd k2 will make the full scan for the t2 (the parimary key will be useless here ?) ?

Thank you !

Tom Kyte
March 19, 2004 - 3:18 pm UTC

it is not just that the primary key is or is not useless (indexes are not even part of the equation here yet -- who knows if that should full scan or not. its all about the size and spread of the data), it is that who ever wrote the query didn't truly understand what they were asking for if they used an outer join.

The outer join can preclude some access paths from consideration (the RBO was massively impacted by this, the CBO much less so).

Joining 2 big tables , query performance

Sujit Mondal, April 15, 2004 - 9:14 am UTC

I have two fact tables
T1(C1,C2,C3,F1,F2)
and T2(C1,C2,F3,F4)

T2 has 90000 records and growing by 48 per day.

T1 has 60 times of records than T2.


I like to make a performance tuning of the following query.

SELECT T1.C1,T1.C2,T1.C3,T1.F1,T1.F2,T2.F3,T2.F4
FROM T1,T2
WHERE T1.C1=55
AND T2.C1=55
AND T1.C1=T2.C1
AND T1.C2=T2.C2;

I have thought of 3 possible solution , can you please tell me which one is more appropriate or anything else other than the 3 below?

1) Partitioning 0f the table T1 and T2 based on C1 will help us to get fast response? if so what type of partitioning you recommend?

2) Adding a Primary Key Column in T2 and using it as a foreign key column in T1 and joining based on that column.

3) Merge two table into One and repeat T2 data for each row of T1 (That is 60 times) and avoid any join.

Regards
-Sujit

Tom Kyte
April 15, 2004 - 9:36 am UTC

depends totally on the frequency of "t1.c1=55" and "t2.c1=55".

if c1 is fairly selective (selectivity is based on the size of the objects), then a simple b*tree index is all you need on (c1,c2) (in both tables)

there is insufficient data here to make any judgement calls.

Joining 2 big tables , query performance

Sujit Mondal, April 15, 2004 - 9:59 am UTC

Thanks for your reply.

T2.C1=55 has 50 records and T1.C1=55 has 3000 records that is 60 times of T2.

"55" is variable based on user input but the amount of data will be almost same for all var value of C1, and so the ratio of record in T1,T2.

I have bitmap index on C1,C2 in T1.

And T2 has the primary key on C1 , C2.

Do you think I need to make a btree index on C1,C2 in both tables and avoid bitmap , Primary Key?

And also if the size of the table increases , then there won't be any performance impact in future? As T1 will grow by 3000 records per day and T2 by 50 records per day.

Regards
-Sujit

Tom Kyte
April 15, 2004 - 10:31 am UTC

if you are trickle feeding this table -- bitmaps will utimately be "a bad idea".

Besides, it doesn't sound like c1,c2 are low cardinality with respect to the tables if t1.c1=55 has 50 occurrences out of 90,000 (c1 sounds high cardinality, highly selective)

I would b*tree both tables on c1,c2. You are only pull 3000 records from each. Perhaps one or the other (or both) should be IOTs to cluster the c1/c2 values together.

full table scans

ramki, July 18, 2004 - 10:24 am UTC

Hi Tom,

we are having a batch process for creating monthly payroll
we have two categories of personnel(cat1-7000 employees cat2-50000 employees) for whom we run the same process separately due to different routines for the data entry transactions . For this purpose we have a set of five tables in which the data is populated(while applying all the business rules during calculation) from all the transaction tables. At the end of the batch process, the processed data from the 5 tables is then shifted to the actual tables where they have to get stored.

i have seen that whenever the process was executed for cat1 it took around one and half hours and but for cat2 it was taking around 20 hours . But when cat2 was broken into smaller sets of 5000 each , it used to complete in around 30-40 mins for each set i.e total of approx 7 hours.

I have seen that most of the sqls were undergoing full table scans(queries on these 5 tables.These 5 tables are under nologging mode also).

Now the question is that , if i want the CBO to fully use the statistics ,these five tables can be put under monitoring (as i cannot run dbms_stat prior to the batch process as there is no data in them) .Now will i gain any performance benefit in MONITORING, as the overhead might increase because with every dml/population of the data, the stats are also updated.

thanx in advance

Tom Kyte
July 18, 2004 - 12:30 pm UTC

monitoring just flags that a table needs to have stats gathered, it does not actually gather them.

full table scans

ramki, July 19, 2004 - 3:11 pm UTC

Hi Tom,
Whenever the tables are monitored , you mean to say that the tables are only flagged but not gathered .(is this used for gathering stale statistics????) then when do i gather statistics in the situation posted in hte previous ??????,then should it be a manual execution separately.



Tom Kyte
July 19, 2004 - 4:40 pm UTC

monitoring just flags a table as stale or not.

dbms_stats.gather_schema|database_stats must be run with the GATHER STALE option in order to get the stats for these stale objects.

Full table scan in this example is not desired :-(

Ninoslav, July 24, 2004 - 3:12 pm UTC

Hi Tom,
I have an example with full table scan, which is not desirable in this case (I have written a dozen times that full table scan is not evil, but since that mantra didn't help I have to raise it to upper level :-)).
So,
where ( A.col1 = p_rec_goe.kol
or p_rec_goe.kol is null
)
is the part of my query that causes not using index on col1 in table A. p_rec_goe.kol is a parameter that comes from outside (oracle forms) and it can be null. How to avoid full table scan in this situation?
Thanks


Tom Kyte
July 24, 2004 - 9:16 pm UTC

is that the entire predicate?

are you using the CBO???

More inforation

Ninoslav, July 25, 2004 - 5:19 am UTC

Hi Tom,
Yes I'm using CBO. I have even tried with very simple table A which has number col1 and varchar2 col2. Then, I have inserted, for test, 1 000 000 rows. On col1 I have put an index and have repeated the test, but the result is the same: full table scan and cost of 764.


Tom Kyte
July 25, 2004 - 11:52 am UTC

well, if you look at it, this might be expected.

you are saying:

sometimes I want all rows (full scan)
othertimes i might just want some rows.

and if that column is NULLABLE, the index in fact *cannot* be used in the case of "get me all rows"


but in any case, consider this example (and then consider rewriting to use NVL())


drop table t;

create table t as select * from all_objects;

create index t_idx on t(object_id);

analyze table t compute statistics for table for all indexed columns;



variable foo number
set termout off

exec :foo := 55;
select * from t foo_is_55 where object_id = :foo or :foo is null;
select * from t foo_is_55 where object_id = nvl(:foo,object_id);

exec :foo := null;
select * from t foo_is_null where object_id = :foo or :foo is null;
select * from t foo_is_null where object_id = nvl(:foo,object_id);

@trace
exec :foo := 55;
select * from t foo_is_55 where object_id = :foo or :foo is null;
select * from t foo_is_55 where object_id = nvl(:foo,object_id);

exec :foo := null;
select * from t foo_is_null where object_id = :foo or :foo is null;
select * from t foo_is_null where object_id = nvl(:foo,object_id);

set termout on

the tkprof shows us that there are some optimizations we can make with the NVL that create "two plans" -- one when the bind is null and one when the bind is not null

********************************************************************************
select * from t foo_is_55 where object_id = :foo or :foo is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 425 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 0.00 0 425 0 1

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL T (cr=425 r=0 w=0 time=8183 us)
********************************************************************************
select * from t foo_is_55 where object_id = nvl(:foo,object_id)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 4 0 1

Rows Row Source Operation
------- ---------------------------------------------------
1 CONCATENATION (cr=4 r=0 w=0 time=103 us)
0 FILTER (cr=0 r=0 w=0 time=1 us)
0 TABLE ACCESS FULL T (cr=0 r=0 w=0 time=0 us)
1 FILTER (cr=4 r=0 w=0 time=86 us)
1 TABLE ACCESS BY INDEX ROWID T (cr=4 r=0 w=0 time=80 us)
1 INDEX RANGE SCAN T_IDX (cr=3 r=0 w=0 time=62 us)(object id 38854)
********************************************************************************

see there, only the index range scan part was done -- the full scan was skipped and further:


select * from t foo_is_null where object_id = :foo or :foo is null

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2049 0.18 0.17 0 2452 0 30713
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2051 0.18 0.17 0 2452 0 30713

Rows Row Source Operation
------- ---------------------------------------------------
30713 TABLE ACCESS FULL T (cr=2452 r=0 w=0 time=53363 us)
********************************************************************************
select * from t foo_is_null where object_id = nvl(:foo,object_id)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2049 0.26 0.27 0 2452 0 30713
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2051 0.26 0.27 0 2452 0 30713

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 165

Rows Row Source Operation
------- ---------------------------------------------------
30713 CONCATENATION (cr=2452 r=0 w=0 time=145894 us)
30713 FILTER (cr=2452 r=0 w=0 time=99203 us)
30713 TABLE ACCESS FULL OBJ#(38853) (cr=2452 r=0 w=0 time=50747 us)
0 FILTER (cr=0 r=0 w=0 time=1 us)
0 TABLE ACCESS BY INDEX ROWID OBJ#(38853) (cr=0 r=0 w=0 time=0 us)
0 INDEX RANGE SCAN OBJ#(38854) (cr=0 r=0 w=0 time=0 us)(object id 38854)

there the index range scan was skipped and the full scan performed...


Nagaraju, July 25, 2004 - 11:10 pm UTC

Tom,

From your answer to the original question
<quote>
ops$tkyte@ORA920>
ops$tkyte@ORA920> set autotrace traceonly
ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'MAP'
4 2 TABLE ACCESS (BY INDEX ROWID) OF 'I2'
5 4 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)
6 1 TABLE ACCESS (BY INDEX ROWID) OF 'I1'
7 6 INDEX (UNIQUE SCAN) OF 'SYS_C003754' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
60127 consistent gets
0 physical reads
60 redo size
513 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

ops$tkyte@ORA920>
ops$tkyte@ORA920> analyze table i1 compute statistics;

Table analyzed.

ops$tkyte@ORA920> analyze table i2 compute statistics;

Table analyzed.

ops$tkyte@ORA920> analyze table map compute statistics;

Table analyzed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select *
2 from
3 i1,
4 map,
5 i2
6 where i1.n = map.i1
7 and i2.n = map.i2
8 and i1.v = 'x'
9 and i2.v = 'y';

no rows selected


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=1 Bytes=40)
1 0 NESTED LOOPS (Cost=21 Card=1 Bytes=40)
2 1 HASH JOIN (Cost=20 Card=1 Bytes=26)
3 2 TABLE ACCESS (FULL) OF 'I1' (Cost=10 Card=1 Bytes=14)
4 2 TABLE ACCESS (FULL) OF 'MAP' (Cost=9 Card=30020 Bytes=360240)
5 1 TABLE ACCESS (BY INDEX ROWID) OF 'I2' (Cost=1 Card=1 Bytes=14)
6 5 INDEX (UNIQUE SCAN) OF 'SYS_C003755' (UNIQUE)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
92 consistent gets
0 physical reads
0 redo size
513 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

ops$tkyte@ORA920> set autotrace off

tell me -- which looks better to you?? to me -- that second one is looking
pretty good
</quote>

I don't understand how the indexes are not used in the second one. Because you didn't drop any indexes. May be is that because you used "analyze table i1 compute statistics;
"?. if yes, by executing "analyze table i1 compute statistics;", are we removing the indexe statistics?

2.when i generate statistics like below, will the index statistics be gathered?
exec dbms_stats.gather_table_stats(user,'t1',cascade => TRUE,method_opt => ' FOR ALL INDEXES FOR ALL INDEXED COLUMNS ')?


Tom Kyte
July 26, 2004 - 7:25 am UTC

1) the indexes are not used because using them would be "not as good" and the cbo understands that.

2) cascade => true means "analyze the indexes also". method opt should just be "for all indexed columns" - not for all indexes, that is what cascade does

full table scan

A reader, August 04, 2004 - 9:40 am UTC

Hi Tom,

i have a table with about 25 million rows and i am trying to run a query like

select nno from tab where col = 100;

col is indexed and when i see the explain plan, it is reported as a table scan and the query takes more than 10 mins to complete execution. i want to know how i can make it faster and if possible to avoid full table scan. pls help.

thanks.

Tom Kyte
August 04, 2004 - 10:49 am UTC

sooo, how many rows are returned "where col = 100"

followup

A reader, August 04, 2004 - 11:25 am UTC

that select stmt returns about 2 million rows.

thanks.

Tom Kyte
August 04, 2004 - 1:34 pm UTC

so, why do you believe that an index access would be superior?

anyway -- tell you what, using tkprof, run the query with the full scan and the query with an INDEX hint (or FIRST_ROWS -- that would probably trigger the index) and see.

I'd "guess" a nice juicy full scan will best a index retrieval unless the data just happened to be clustered together by accident.


just some math for fun, you can plug your own numbers in.

25mill rows, assume say 100 rows per block. so about 250,000 blocks to read. Using nice juicy big multi-block reads.

Now, assume you access 2,000,000 rows via the index, you have to read the index structure (small, couple hundred/thousand blocks) and for each index entry you have to do a table access by index rowid -- 2,000,000 of them. could be 2,000,000 logical IO's there to do that.



consider... I created a 5,000,000 row table using all_objects (over and over and over)....

we:

big_table@ORA9IR2> update big_table set data_object_id = mod(rownum,12);
5000000 rows updated.

big_table@ORA9IR2> create index data_object_id_idx on big_table(data_object_id) compute statistics;
Index created.

so, where data_object_id = 0 will pull back about 1/12th of the data -- yours is pulling back 1/12.5th of the data - more or less the same...


big_table@ORA9IR2> set autotrace traceonly explain
big_table@ORA9IR2> select data_object_id, object_id
2 from big_table
3 where data_object_id = 0;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10944 Card=416667 Bytes=5833338)
1 0 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=10944 Card=416667 Bytes=5833338)


oh dear, a horrid full table scan -- as monty python used to say "run away, run away"....


big_table@ORA9IR2> select /*+ first_rows */ data_object_id, object_id
2 from big_table
3 where data_object_id = 0;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=72908 Card=416667 Bytes=5833338)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=72908 Card=416667 Bytes=5833338)
2 1 INDEX (RANGE SCAN) OF 'DATA_OBJECT_ID_IDX' (NON-UNIQUE) (Cost=811 Card=416667)


ah, there we go, isn't that nice. But wait -- what about reality:

big_table@ORA9IR2> set autotrace traceonly
big_table@ORA9IR2> select data_object_id, object_id
2 from big_table where data_object_id = 0;

416666 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10944 Card=416667 Bytes=5833338)
1 0 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=10944 Card=416667 Bytes=5833338)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
99483 consistent gets
70612 physical reads

0 redo size
7384630 bytes sent via SQL*Net to client
306046 bytes received via SQL*Net from client
27779 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
416666 rows processed

big_table@ORA9IR2> select /*+ first_rows */ data_object_id, object_id from big_table
2 where data_object_id = 0;

416666 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=72908 Card=416667 Bytes=5833338)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=72908 Card=416667 Bytes=5833338)
2 1 INDEX (RANGE SCAN) OF 'DATA_OBJECT_ID_IDX' (NON-UNIQUE) (Cost=811 Card=416667)




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
123534 consistent gets
69786 physical reads

0 redo size
7384630 bytes sent via SQL*Net to client
306046 bytes received via SQL*Net from client
27779 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
416666 rows processed

Hmmm, be careful of what you wish for!!

Now, for your simple query -- perhaps:


big_table@ORA9IR2> drop index data_object_id_idx;

Index dropped.

big_table@ORA9IR2> create index better on big_table(data_object_id,object_id) compute statistics;

Index created.

big_table@ORA9IR2> select data_object_id, object_id from big_table
2 where data_object_id = 0;

416666 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1083 Card=416667 Bytes=5833338)
1 0 INDEX (RANGE SCAN) OF 'BETTER' (NON-UNIQUE) (Cost=1083 Card=416667 Bytes=5833338)




Statistics
----------------------------------------------------------
79 recursive calls
0 db block gets
28775 consistent gets
1027 physical reads
0 redo size
5582920 bytes sent via SQL*Net to client
306046 bytes received via SQL*Net from client
27779 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
416666 rows processed




could be you want to look at your indexing strategy before you call full scans evil (cause they just ain't)


How many blocks?

Mark J. Bobak, August 04, 2004 - 12:21 pm UTC

Ok, so 2 million out of 25 million rows are returned by
your SQL statement. But, how many distinct table blocks
are those rows in? How many table blocks are all 25 million
rows in? What's db_file_multiblock_read_count set to?

What the optimizer is doing here is, it's looking at how
many rows are returned based on the 'where col1=100' predicate,
and it's looking at the index's clustering factor to determine
how many blocks it's going to have to access to be get the
row level data from the table, and then it looks at
db_file_multiblock_read_count to estimate how many multi-
block reads it will have to do to scan the entire table.

Your other option is to consider adding an index on tab
on (col,nno). This index would then have the data required
in the index, and avoid the TABLE ACCESS BY ROWID lookup
on the table. This would almost certainly outperform a
full table scan. This new index could be in addition to
the existing index on col, but consider also that it could
replace the index on col.

Something to think about,

-Mark

Quantity of Indexes!!!!!!!!!!!!

A reader, August 11, 2004 - 1:03 pm UTC

Hi Tom,
We recently had an OCP come and analyze the database. After the analyzing was complete he said that a table must not have more than 5 Indexes. I asked him why. But he could not give me a convincing answer.
But his "advice" kind of put me in a worry. So is there a problem if there are more than 5 indexes. As one of our highly accessed tables already has like 6 indexes and I might have to add 2 more to improve performance. So in all there will be like 8-9 indexes. This table is big with around 6,000,000 records. And is an important table.
Also the developers did some new queries which are going to be effected by these new indexes. I know about the Skip Scan Feature of 9i (learnt about it form your site :) ). But the new indexes are very different from the earlier ones.
Any advice.
Oracle Version 9iR2 on Windows 2000 Advanced Server.
Thanks a lot for your advice.
Waiting for it impatiently :)

Tom Kyte
August 11, 2004 - 1:59 pm UTC

ROTFL

I think the answer is 42. Now, if i could only remember the question.....


the correct answer is somewhere between zero and lots more than zero.

Developers want no FTS on this large table

simmi, May 15, 2005 - 11:49 pm UTC

Hi Tom,

I read your advise: FTS not always evil. But if it is FTS on Very large tables, then developers demand it to be avoided: I have the following query which takes around 16 seconds if the 'mrs' table has above 25000 records.MRS has indexes on mr_seqno, mr_number, mr_template.
===============
SELECT DISTINCT mr_seqno, mrv_revno, mr_number, mr_desc, rv_c.vtyv_desc revcond, mr_date,
TO_CHAR (mr_date, 'dd/Mon/yyyy') mrdate,
EXP.busr_lastname
|| DECODE (EXP.busr_lastname, NULL, '', ', ')
|| EXP.busr_firstname
|| DECODE (EXP.busr_middlename, NULL, '', ' ')
|| EXP.busr_middlename exped,
DECODE (fn_attachmentexists (mr_seqno), 0, '', '') attch
FROM bps_users EXP,valid_type_values rv_c, mr_personnel_assigns expeditor,mr_revisions,mrs
WHERE rv_c.vtyv_vtyp_type(+) = 'MR REVISION CONDITION'
AND rv_c.vtyv_value(+) = mrv_complete_cancelflag
AND mrs.mr_seqno = mr_revisions.mrv_mr_seqno
AND expeditor.mrpa_mr_seqno(+) = mr_seqno
AND mr_revisions.mrv_revno = mrs.mr_lastrevno
AND EXP.busr_id(+) = expeditor.mrpa_busr_id ;

Execution Plan
----------------------------------------------------------
0 ? SELECT STATEMENT Optimizer=CHOOSE ?
1 0 SORT (UNIQUE) ?
2 1 NESTED LOOPS (OUTER) ?
3 2 NESTED LOOPS (OUTER) ?
4 3 NESTED LOOPS (OUTER) ?
5 4 NESTED LOOPS ?
6 5 TABLE ACCESS (FULL) OF 'MRS' ?
7 5 TABLE ACCESS (BY INDEX ROWID) OF 'MR_REVISIONS' ?
8 7 AND-EQUAL ?
9 8 INDEX (RANGE SCAN) OF 'MRV_1' (NON-UNIQUE) ?
10 8 INDEX (RANGE SCAN) OF 'MRV_3' (NON-UNIQUE) ?
11 4 TABLE ACCESS (BY INDEX ROWID) OF 'VALID_TYPE_VALUES'
12 11 INDEX (UNIQUE SCAN) OF 'VTYV_PRIME' (UNIQUE) ?
13 3 INDEX (RANGE SCAN) OF 'MRPA_PRIME1' (UNIQUE) ?
14 2 TABLE ACCESS (BY INDEX ROWID) OF 'BPS_USERS' ?
15 14 INDEX (UNIQUE SCAN) OF 'BUSR_PRIME' (UNIQUE) ?
===============
I need to avoid FTS of MRS and MR_REVISIONS tables. Even if I change the order of tables, it then does FTS on MR_REVISIONS, an equally big table
====================
TO_CHAR (mr_date, 'dd/Mon/yyyy') mrdate,
EXP.busr_lastname
|| DECODE (EXP.busr_lastname, NULL, '', ', ')
|| EXP.busr_firstname
|| DECODE (EXP.busr_middlename, NULL, '', ' ')
|| EXP.busr_middlename exped,
DECODE (fn_attachmentexists (mr_seqno), 0, '', '') attch
FROM bps_users EXP, mr_personnel_assigns expeditor, mrs,mr_revisions, valid_type_values rv_c
WHERE rv_c.vtyv_vtyp_type(+) = 'MR REVISION CONDITION'
AND rv_c.vtyv_value(+) = mrv_complete_cancelflag
AND mrs.mr_seqno = mr_revisions.mrv_mr_seqno
AND mrs.mr_lastrevno=mr_revisions.mrv_revno
AND expeditor.mrpa_mr_seqno(+) = mr_seqno
AND EXP.busr_id(+) = expeditor.mrpa_busr_id

Execution Plan
----------------------------------------------------------
0 ? SELECT STATEMENT Optimizer=CHOOSE ?
1 0 SORT (UNIQUE) ?
2 1 NESTED LOOPS (OUTER) ?
3 2 NESTED LOOPS (OUTER) ?
4 3 NESTED LOOPS ?
5 4 NESTED LOOPS (OUTER) ?
6 5 TABLE ACCESS (FULL) OF 'MR_REVISIONS' ?
7 5 TABLE ACCESS (BY INDEX ROWID) OF 'VALID_TYPE_VALUES'
8 7 INDEX (UNIQUE SCAN) OF 'VTYV_PRIME' (UNIQUE) ?
9 4 TABLE ACCESS (BY INDEX ROWID) OF 'MRS' ?
10 9 INDEX (UNIQUE SCAN) OF 'MR_PRIME' (UNIQUE) ?
11 3 INDEX (RANGE SCAN) OF 'MRPA_PRIME1' (UNIQUE) ?
12 2 TABLE ACCESS (BY INDEX ROWID) OF 'BPS_USERS' ?
13 12 INDEX (UNIQUE SCAN) OF 'BUSR_PRIME' (UNIQUE) ?


Tom Kyte
May 16, 2005 - 7:46 am UTC

but you are querying all rows -- why the heck would you even consider any other plan???


INDEXES are used to retrieve a couple of rows.

FULL SCANS are for all rows.


Look at the query, there are a bunch of joins -- no predicates, no filters

YOU ARE GETTING ALL OF THE ROWS

In fact, this looks "rbo'ish" to me, i would hope to see not a single index used for best performance.

Have them stare at this example for a while:
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6749454952894#6760861174154 <code>


indexes -- use them to get SOME rows from a BIG set
full scans -- use them to get ALL rows from ANY set


In my experience, the above query should use ZERO indexes. that it is using some is a problem, not the other way around.



case 21312531 why ROT (rule of thumb) is dangerous without the knowledge behind the ROT. The developers have an unhealthy fear of 'full table scans' and if they are successful in their goal of full table scan erradication will really cripple the system.

Tuning FTS

dennis, May 16, 2005 - 3:13 am UTC


Hi Tom,

We have an application where in all the queries that are fired do FTS on large
tables .From the statspack report and with inputs from application team it was found
that these queries are executed just once or twice in a day.

There are columns in the where clause which have high selectivity and hence candidates
for indexing .

But is creating a new index worth for queries executed just once or twice in a day(Storage is not a problem)?.

How can we use the recycle pool feature to our benefit ?.

Is there some other methos to tune these Full Table Scans
in our case?.

We are using Oracle 9i.

thanks,
Dennis

Tom Kyte
May 16, 2005 - 8:00 am UTC

before I answer, show me the where clause -- in light of the above question especially. There is a big where clause right above -- but not a predicate in sight, no indexes should be used.......


It is doubtful the recycle pool comes into play --for most of the day these tables are probably used in a normal fashion, no?

Developers want no FTS on this large table

simmi, May 16, 2005 - 9:34 am UTC

Hi Tom,

Sorry I didn't post complete query for I wanted it to look brief but omitted important information by mistake. Here is the complete query:
SELECT DISTINCT mr_seqno, mrv_revno, mr_number, mr_desc,
rv_c.vtyv_desc revcond, mr_date,
TO_CHAR (mr_date, 'dd/Mon/yyyy') mrdate,
EXP.busr_lastname
|| DECODE (EXP.busr_lastname, NULL, '', ', ')
|| EXP.busr_firstname
|| DECODE (EXP.busr_middlename, NULL, '', ' ')
|| EXP.busr_middlename exped,
DECODE (fn_attachmentexists (mr_seqno), 0, '', '') attch
FROM bps_users EXP,valid_type_values rv_c,mr_personnel_assigns expeditor,mr_revisions, mrs
WHERE rv_c.vtyv_vtyp_type(+) = 'MR REVISION CONDITION'
AND rv_c.vtyv_value(+) = mrv_complete_cancelflag
AND mrs.mr_seqno = mr_revisions.mrv_mr_seqno
AND expeditor.mrpa_mr_seqno(+) = mr_seqno
AND mr_revisions.mrv_revno = mrs.mr_lastrevno
AND EXP.busr_id(+) = expeditor.mrpa_busr_id
AND expeditor.mrpa_relationship(+) = 'REQUESTOR'
AND NVL (mr_template, 0) = 0
AND NVL (mrs.mr_complete_cancelflag, '@') IN('C', 'AWD', 'A', '@')
AND NVL (mr_revisions.mrv_complete_cancelflag, '@') IN
('PRD','L', 'R','IFA','TP', 'AIP', 'OBT' )
AND fn_isuserassociatedwithmr (mr_seqno, 'SXMARTIN', '') = 1
ORDER BY mr_number DESC;
Execution Plan
----------------------------------------------------------
0 ? SELECT STATEMENT Optimizer=CHOOSE ?
1 0 SORT (ORDER BY) ?
2 1 SORT (UNIQUE) ?
3 2 NESTED LOOPS (OUTER) ?
4 3 NESTED LOOPS (OUTER) ?
5 4 NESTED LOOPS (OUTER) ?
6 5 NESTED LOOPS ?
7 6 TABLE ACCESS (FULL) OF 'MRS' ?
8 6 TABLE ACCESS (BY INDEX ROWID) OF 'MR_REVISIONS'
9 8 AND-EQUAL ?
10 9 INDEX (RANGE SCAN) OF 'MRV_1' (NON-UNIQUE) ?
11 9 INDEX (RANGE SCAN) OF 'MRV_3' (NON-UNIQUE) ?
12 5 TABLE ACCESS (BY INDEX ROWID) OF 'VALID_TYPE_VALUES'
13 12 INDEX (UNIQUE SCAN) OF 'VTYV_PRIME' (UNIQUE) ?
14 4 INDEX (RANGE SCAN) OF 'MRPA_PRIME1' (UNIQUE) ?
15 3 TABLE ACCESS (BY INDEX ROWID) OF 'BPS_USERS' ?
16 15 INDEX (UNIQUE SCAN) OF 'BUSR_PRIME' (UNIQUE) ?
=================
When I change the order, it then does a FTS on MR_REVISIONS (as big a table as MRS). MRS has unique indexes on MR_SEQNO, MR_NUMBER and non-unique index on MR_TEMPLATE. MR_REVISIONS has unique index on MRV_SEQNO, non-unique indexes on MRV_REVNO, MRV_BIDP_SEQNO, MRV_MR_SEQNO.
==============
Execution Plan
----------------------------------------------------------
0 ? SELECT STATEMENT Optimizer=CHOOSE ?
1 0 SORT (ORDER BY) ?
2 1 SORT (UNIQUE) ?
3 2 NESTED LOOPS (OUTER) ?
4 3 NESTED LOOPS (OUTER) ?
5 4 NESTED LOOPS ?
6 5 NESTED LOOPS (OUTER) ?
7 6 TABLE ACCESS (FULL) OF 'MR_REVISIONS' ?
8 6 TABLE ACCESS (BY INDEX ROWID) OF 'VALID_TYPE_VALUES'
9 8 INDEX (UNIQUE SCAN) OF 'VTYV_PRIME' (UNIQUE) ?
10 5 TABLE ACCESS (BY INDEX ROWID) OF 'MRS' ?
11 10 INDEX (UNIQUE SCAN) OF 'MR_PRIME' (UNIQUE) ?
12 4 INDEX (RANGE SCAN) OF 'MRPA_PRIME1' (UNIQUE) ?
13 3 TABLE ACCESS (BY INDEX ROWID) OF 'BPS_USERS' ?
14 13 INDEX (UNIQUE SCAN) OF 'BUSR_PRIME' (UNIQUE) ?
=======================
But don't you think the second query is better because it joins MR_REVISIONS(big table) with VALID_TYPE_VALUES (small table). Instead of joining MRS with MR_REVISIONS(both big tables). Nevertheless, both queries take approximately same time.
If only I can change it to Not do FTS or FTS on a small table like VALID_TYPE_VALUES, BPS_USERS. MR_PERSONNAL_ASSIGNS is also big table.

Tom Kyte
May 16, 2005 - 1:07 pm UTC

AND NVL (mr_template, 0) = 0
AND NVL (mrs.mr_complete_cancelflag, '@') IN('C', 'AWD', 'A', '@')
AND NVL (mr_revisions.mrv_complete_cancelflag, '@') IN
('PRD','L', 'R','IFA','TP', 'AIP', 'OBT' )
AND fn_isuserassociatedwithmr (mr_seqno, 'SXMARTIN', '') = 1


you are applying functions to all of those columns


no indexes there -- unless you go function based indexes. Also -- try to lose that fn_is..... -- see if you cannot replace that with pure SQL.

Changing of table order

A reader, May 16, 2005 - 10:30 am UTC

Changing of Table order after From clause should not make a difference unless you are using the old RBO.

Any specific reason why developers are against FTS on that table ?

Regi

A reader, May 16, 2005 - 10:34 am UTC

For tuning FTS, increase the parameter db_file_multiblock_read_count to maximum i/o bandwidth supported by your OS. By increasing this parameter, chances of other queries using FTS are increased.

Also consider running your query in parallel, provided you have multiple CPUs and your current CPU utilization is low

Developers want no FTS on this large table

simmi, May 16, 2005 - 4:59 pm UTC

We're using RBO. Thats why I had mentioned about the order of tables in FROM clause. Also, the db_file_multiblock_readcount is already set to 32 and values higher than 32 are not recommemded(because it'll make other queries do FTS as well). Developers are against FTS because MRS, MR_REVISIONS are very big tables- query takes around 16 seconds if the mrs have above 25000 records.
Please advise!
Thanks a lot.

Tom Kyte
May 16, 2005 - 5:16 pm UTC

so, if you are using RBO function based indexes are not happening

you have what you have and it isn't going to change. the functions on the columns prohibit indexes on those columns, and indexing the functions mandates the user of the CBO.

rock, hard place.

want no FTS on this large table

simmi, May 17, 2005 - 10:47 am UTC

If I change 'AND NVL (mr_template, 0) = 0' to 'AND mr_template = 0', it starts using the index on mr_template & avoids FTS. But why does the query not run faster?

But since I do want the functionality of NVL, so I change it to 'AND (mr_template = 0 or mr_template is null)'. It then does not use the index on mr_template and does FTS of MRS table.Even if I give 'AND mr_template is null', it does FTS. I thought changing NVL this way should use the index. In fact we were planning to change all queries in the application which were using NVL to this way...but I see that this meathod dosen't work. Why?

Also, I had the understanding the using functions instead of plain SQL in queries, makes the queries faster? Thats why we used 'fn_is'. Is this not correct? Even if I remove 'AND fn_isuserassociatedwith(mrseqno, 'SXMARTIN', '') = 1' completely, there is no change to execution plan.

So what is the workaround please?

Tom Kyte
May 17, 2005 - 1:46 pm UTC

<quote>
But why does the query not run faster?
</quote>

because indexes are *not fast=true*

<quote>
AND (mr_template = 0 or mr_template is null)
</quote>

if the index is on MR_TEMPLATE and mr_template alone, then mr_template is null *cannot* use the index, entirely null keys are not placed into a b*tree index.

<quote>
Also, I had the understanding the using functions instead of plain SQL in
queries, makes the queries faster?
</quote>

that I don't understand.

But basically the use of the functions (nvl and such) PRECLUDE THE USE OF ANY INDEXES WITH THE RBO. Period.


The workaround? I think it goes way back to the data model here (the model that forces functions to be placed on database columns in order to retrieve the data needed).

Also, it is really extremely hard for someone, anyone to look at a multi-table query where correlation names are not used 100% of the time. No clues as to what columns come from what tables.

FTS

Dennis, June 11, 2005 - 2:19 am UTC

Tom,

I have a doubt regarding FTS.

When a query is doing FTS based on some filter
condition, is the entire table loaded into memory
or are the blocks that meet the filtering
condition loaded into memory .

In the later case the sorting for the data blocks
meeting the select criteria will be done on the disk
and in the former the sorting will be in memory.

Please advice.

thanks,
Dennis

Tom Kyte
June 11, 2005 - 9:35 am UTC

depends, if you say:


select * from one_hundred_billion_row_table where x > 5;

and that results in a full scan, the table will be read bit by bit by bit as you fetch. You'll have a "block" at a time really (it'll multi-block IO it, put say 16 blocks in the cache, you'll ask for block 1 process it, later block 2 and so on)

blocks are cached, never rows. blocks.


we use temp space -- use so much ram and then when that is filled swap out to disk and fill ram again...

FTS

Dennis, June 13, 2005 - 3:00 am UTC

Tom,


what I interpret from your explanation is that
blocks for the entire table are cached in the RAM
using multiblock I/O then the desired blocks (depending
on the filtering criteria) are processed in RAM.
Now if we have insufficient RAM then swapping occurs.

Correct me if I have got it wrong.

Tom Kyte
June 13, 2005 - 10:51 am UTC

well, sort of.

A query might use TEMP, TEMP might get written to disk.

A full scan "select * from t" will not likely use any temp, we'll read a bit from the cache (or disk into cache) and then give to client and so on.

sal, June 22, 2005 - 11:18 am UTC

where can i see the table statistics?
I am running daily on query some time this query takes too much time to finish. How to resolve this situation

Tom Kyte
June 22, 2005 - 4:47 pm UTC

what table statistics? not sure what you are looking for.

maybe you mean sql_trace and tkprof.

wait event

abc, June 22, 2005 - 6:11 pm UTC

Tom,

Recently I did one tuning assignment and chnaged init parameters in 9.2.0.5
One thing is bothering me i,e
WAIT #1: nam='PX Deq Credit: send blkd' ela= 136 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 256 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 396 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 761 p1=53 p2=164309 p3=8
WAIT #1: nam='PX Deq Credit: send blkd' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 140 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 167 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 619 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 221 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 480 p1=268500995 p2=2 p3=0
WAIT #1: nam='db file scattered read' ela= 711 p1=53 p2=164317 p3=8
WAIT #1: nam='PX Deq Credit: need buffer' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: need buffer' ela= 296 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 2 p1=268500995 p2=1 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 566 p1=268500995 p2=2 p3=0
WAIT #1: nam='PX Deq Credit: send blkd' ela= 2 p1=268500995 p2=1 p3=0


SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
log_parallelism                      integer     1
parallel_adaptive_multi_user         boolean     FALSE
parallel_automatic_tuning            boolean     FALSE
parallel_execution_message_size      integer     2148
parallel_instance_group              string
parallel_max_servers                 integer     5
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

SQL> show parameter pga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 1073741824
SQL>

Please suggest how to avoid wait events 

Tom Kyte
June 23, 2005 - 1:54 pm UTC

<quote>
Please suggest how to avoid wait events
</quote>

turn off computer, that is the only way....


you gotta be waiting for something or you have just invented the quantum computer.


sorry, this snippet is way to small and too far out of context to make any comments on. YOu are looking at a single trace from a parallel query (of which there would be many traces) without a single bit of "context"

It was funny :)

abc, June 23, 2005 - 5:53 pm UTC


Explain plan

A reader, September 12, 2005 - 3:41 pm UTC

Hi Tom,

I have a table t1 as

create table t1
(
t1id number,
id number,
img blob
);

I am trying to compare 2 queries against this table.

1. select * from t1 where t1id between 1 and 10000;

When I do an explain plan for this query, it is doing an index scan of t1id and the results are coming back quickly.

2. select * from t1 where id between 1 and 10000;

When I do an explain plan for this query, it is doing a table scan of t1 and the results are taking a long time.

t1 has an unique constraint on t1id and unique index on id and it is partitioned by range id.

Can you please help me with narrowing down why query 2 is taking a while?

Thanks.




Tom Kyte
September 12, 2005 - 5:36 pm UTC

do you know about clustering factors?

it sounds like t1id is what this table is "sorted" by on disk and id is randomly distributed.

I'm am totally guessing however, since the setup behind your test is missing here.

</code> http://asktom.oracle.com/pls/ask/search?p_string=colocated+%22create+table+disorganized%22+clustering_factor <code>



explain plan

A reader, October 13, 2005 - 10:02 am UTC

The explain plan shows FULL TABLE SCAN when I supply the exact partition keys to a list
partitioned table, whereas when I use any invalid keys or bind values the plan shows
using the partition by list. Why is this behaviour?? Should not the plan show partition list opeation
all the time.

Please see below:

 create table tab1 (col1 number, col2 number)
 partition by list (col2)
 (partition p_1 values (1), 
  partition p_2 values (2)
 )
/
insert into tab1 values (1,1)
/
1 row created

insert into tab1 values (1,1)
/
1 row created

insert into tab1 values (1,1)
/
1 row created

insert into tab1 values (1,2)
/
1 row created

insert into tab1 values (1,2)
/
1 row created

insert into tab1 values (1,2)
/
1 row created

commit ;
Commit complete.

analyze table tab1 compute statistics 
/
Table analyzed.

SQL> explain plan for 
  2  select * from tab1 where col2 =1
  3  /

Explained.

SQL> select * from table(dbms_xplan.display())
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    26 |     2 |       |       |
|   1 |  TABLE ACCESS FULL   | TAB1        |     1 |    26 |     2 |     1 |     1 |
------------------------------------------------------------------------------------

Note: cpu costing is off

9 rows selected.

SQL>  explain plan for 
  2   select * from tab1 where col2 =2
  3  /

Explained.

SQL> select * from table(dbms_xplan.display())
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    26 |     2 |       |       |
|   1 |  TABLE ACCESS FULL   | TAB1        |     1 |    26 |     2 |     2 |     2 |
------------------------------------------------------------------------------------

Note: cpu costing is off

9 rows selected.

IN BOTH CASES ABOVE WHEN AN EXISTING PARTITION KEY IS SUPPLIED IT SHOWS FULL TABLE SCAN.
HOWEVER WHEN I USE AN INVALID PARTITION KEY OR USE BIND VARIABLE THE PLAN USES PARTITION LIST AS
SHOWN BELOW.

SQL> explain plan for 
  2  select * from tab1 where col2 =3
  3  /

Explained.

SQL>  select * from table(dbms_xplan.display())
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    26 |     2 |       |       |
|   1 |  PARTITION LIST EMPTY|             |       |       |       |INVALID|INVALID|
|*  2 |   TABLE ACCESS FULL  | TAB1        |     1 |    26 |     2 |INVALID|INVALID|
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

   2 - filter("TAB1"."COL2"=3)

Note: cpu costing is off

15 rows selected.

SQL> explain plan for 
  2  select * from tab1 where col2 = :1
  3  /

Explained.

SQL>  select * from table(dbms_xplan.display())
  2  /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------
| Id  | Operation             |  Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |     1 |    26 |     2 |       |       |
|   1 |  PARTITION LIST SINGLE|             |       |       |       |   KEY |   KEY |
|*  2 |   TABLE ACCESS FULL   | TAB1        |     1 |    26 |     2 |   KEY |   KEY |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

   2 - filter("TAB1"."COL2"=TO_NUMBER(:Z))

Note: cpu costing is off

15 rows selected.

Even on a table with 10M or 50M rows the plan shows similar behaviour. 

Tom Kyte
October 13, 2005 - 10:59 am UTC

do you see the pstart and pstop columns?

they list what partitions will be hit - so..... pstart=1, pstop=1 means only partition 1 will be full scanned.


when you see key/key like that - it means as runtime, when the values are supplied, we'll see what partitions to hit.

Explain plan

mike, October 13, 2005 - 11:36 am UTC

THANKS A LOT. YOU ARE THE BEST.

Need Help

A read, November 28, 2005 - 2:19 pm UTC

Tom,

Its a wierd situation here. I have 40m record table on a remote node(production), trying to execute below query it does FTS, same query if i execute against a UAT box, it picks indexes. Both the tables are similar, with similar indexes and same amount of data.

DELETE
FROM a@dblink x
WHERE EXISTS (
SELECT '*'
FROM a y
WHERE col1 BETWEEN :b3
AND :b2
AND y.col2 = :b1
AND x.col2 = y.col2
AND x.col3 = y.col3
AND x.col4 = y.col4
AND x.col5 = y.col5)

same query works when i do count on it.

select count(*)
FROM a@dblink x
WHERE EXISTS (
SELECT '*'
FROM a y
WHERE col1 BETWEEN :b3
AND :b2
AND y.col2 = :b1
AND x.col2 = y.col2
AND x.col3 = y.col3
AND x.col4 = y.col4
AND x.col5 = y.col5)

Any insight would be highly appreciated.

Thanks

Tom Kyte
November 28, 2005 - 3:33 pm UTC

index on what - more detail please.

need help

A reader, November 29, 2005 - 10:57 am UTC

indexes are in place for all the columns in sub query. we are replicating the data from local node to the remote. we delete the records first which meet the condition and than insert.


Tom Kyte
November 30, 2005 - 10:47 am UTC

give example. i want table creates and I want to see your plans. Help me help you (but no promises)

Help

A reader, November 30, 2005 - 9:33 am UTC

Tom,

i deseperatly need your help for the above problem.

Tom Kyte
November 30, 2005 - 2:51 pm UTC

I travel, quite a bit. I don't monitor this 24x7. In fact, today, I'll spend most of the day on a plane.

A reader, November 30, 2005 - 2:27 pm UTC

here is the explain plan for the delete statement.

explain plan for DELETE 
FROM a@dblink x 
WHERE EXISTS ( 
SELECT '*' 
FROM a y 
WHERE col1 BETWEEN :b3 
AND :b2 
AND y.col2 = :b1 
AND x.col2 = y.col2 
AND x.col3 = y.col3
AND x.col4 = y.col4 
AND x.col5 = y.col5)

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------
| Id  | Operation              |  Name            | Rows  | Bytes | Cost  | TQ/Ins |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT REMOTE|                  |  2065K|    39M|  8394K|        |      |            |
|   1 |  DELETE                | a            |       |       |       | OP~ |      |            |
|   2 |   FILTER               |                  |       |       |       |        |      |            |
|   3 |    TABLE ACCESS FULL   | a            |  2065K|    39M|   134K| 70,00  | P->S | QC (RAND)  |
|   4 |    FILTER              |                  |       |       |       |        |      |            |
|   5 |     REMOTE             |                  |     1 |    60 |     4 | !      | R->S |            |
--------------------------------------------------------------------------------------------------------


here is the explain plan for the select statement.

explain plan for select count(*)
FROM a@dblink x 
WHERE EXISTS ( 
SELECT '*' 
FROM a y 
WHERE col1 BETWEEN :b3 
AND :b2 
AND y.col2 = :b1 
AND x.col2 = y.col2 
AND x.col3 = y.col3
AND x.col4 = y.col4 
AND x.col5 = y.col5)



Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       |  Name                         | Rows  | Bytes | Cost  | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                               |     1 |    75 |   441 |        |      |
|   1 |  SORT AGGREGATE                 |                               |     1 |    75 |       |        |      |
|   2 |   FILTER                        |                               |       |       |       |        |      |
|   3 |    NESTED LOOPS                 |                               |     1 |    75 |   441 |        |      |
|   4 |     SORT UNIQUE                 |                               |       |       |       |        |      |
|   5 |      TABLE ACCESS BY INDEX ROWID| a                        |   168 |  4032 |   103 |        |      |
|   6 |       INDEX RANGE SCAN          | IX_a                    |   303 |       |     5 |        |      |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   7 |     REMOTE                      |                               |     1 |    51 |     2 | OP~ | R->S |
-----------------------------------------------------------------------------------------------------------------

Note: cpu costing is off, PLAN_TABLE' is old version
 

Tom Kyte
November 30, 2005 - 8:51 pm UTC

what happens when you hint it, have you tried that as a stop gap measure.



A reader, December 01, 2005 - 5:02 am UTC

i have tried all the hints that i am aware of. please let me know which hint should i try.



Tom Kyte
December 01, 2005 - 12:38 pm UTC

can you try "in" instead of exists, rewrite the query - play with it. I'm currently traveling and have spotty connections. If I cannot resolve something almost instantly (after getting about 100 of these a day) I cannot spend a lot of time setting it up. I asked for tables, create indexes - set up for the stuff, but never got them.

Thanks a Lot

A reader, December 02, 2005 - 8:31 pm UTC

Tom,

That worked. I removed "EXISTS" and modify the query with "IN", now its much more faster than before.

I have no words to thank you for your precious advise.

Thanks again.

How to change the CBO's mind without a hint

Rob van Wijk, February 03, 2006 - 9:33 am UTC

Tom,

We have a production issue that I have simulated with the script below:

var N number
exec :N := 1000000;

create table t1
( id number(10) not null
, lss_code varchar2(3) not null
, naam varchar2(100) not null
);

create table w
( t1_id number(10) not null
, t2_id number(10) not null
);

alter table t1 nologging;
alter table w nologging;

insert into /*+ APPEND */ t1
select l
, case
when l<:N*.8+1 then 'DEL' -- 80% 'DEL'
when l<:N*.8+1+:N*.001 then 'VOL' -- .1% 'VOL'
else 'LS' || TO_CHAR(mod(l,9)) -- 9 other values
end
, 'Dit is een lange omschrijving met bijna, om en nabij, maar net geen 100 karakters ' || to_char(l)
from (select level l from dual connect by level <= :N);

insert into /*+ APPEND */ w select l,1+trunc(500*l/:N) from (select level l from dual connect by level <= :N);

alter table t1 logging;
alter table w logging;

alter table w add constraint w_pk primary key (t2_id,t1_id);
alter table t1 add constraint t1_pk primary key (id);
alter table w add constraint w_t1_fk foreign key (t1_id) references t1(id);
create index w_t1_fk_i on w(t1_id);
create index t1_lss_code_i on t1(lss_code);

exec dbms_stats.gather_table_stats(user,'W',method_opt=>'FOR ALL INDEXED COLUMNS',cascade=>TRUE)
exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'FOR ALL INDEXED COLUMNS',cascade=>TRUE)


explain plan
for
SELECT COUNT (*)
FROM w
, t1
WHERE t1.ID = w.t1_id
AND t1.lss_code = 'DEL'
AND w.t2_id = 234
/

select * from table(dbms_xplan.display);

alter session set sql_trace true;

SELECT COUNT (*)
FROM w
, t1
WHERE t1.ID = w.t1_id
AND t1.lss_code = 'DEL'
AND w.t2_id = 234
/

alter session set sql_trace false;

pause

explain plan
for
SELECT /*+ ORDERED */ COUNT (*)
FROM w
, t1
WHERE t1.ID = w.t1_id
AND t1.lss_code = 'DEL'
AND w.t2_id = 234
/

select * from table(dbms_xplan.display);

alter session set sql_trace true;

SELECT /*+ ORDERED */ COUNT (*)
FROM w
, t1
WHERE t1.ID = w.t1_id
AND t1.lss_code = 'DEL'
AND w.t2_id = 234
/

alter session set sql_trace false;

drop table w;
drop table t1;

--
Plan query 1 (without the hint):
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 1393 |
| 1 | SORT AGGREGATE | | 1 | 17 | |
|* 2 | HASH JOIN | | 2000 | 34000 | 1393 |
|* 3 | INDEX RANGE SCAN | W_PK | 2000 | 16000 | 8 |
|* 4 | TABLE ACCESS FULL | T1 | 800K| 7031K| 1374 |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."ID"="W"."T1_ID")
3 - access("W"."T2_ID"=234)
4 - filter("T1"."LSS_CODE"='DEL')


Plan query 2 (with ordered hint):
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 2014 |
| 1 | SORT AGGREGATE | | 1 | 17 | |
| 2 | NESTED LOOPS | | 2000 | 34000 | 2014 |
|* 3 | INDEX RANGE SCAN | W_PK | 2000 | 16000 | 8 |
|* 4 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 9 | 2 |
|* 5 | INDEX UNIQUE SCAN | T1_PK | 1 | | 1 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("W"."T2_ID"=234)
4 - filter("T1"."LSS_CODE"='DEL')
5 - access("T1"."ID"="W"."T1_ID")


As you see the query without the hint does a full table scan while the hinted query uses the index. Although a full table scan is generally not evil, here the response time and the number of LIO's are much higher. With the hint, the query uses a better plan.

Any advice on how to make the CBO think that the second query is better, without using a hint ?

Kind regards,
Rob.


... or should i wait for the opportunity to ask a new question ?

Rob van Wijk, February 06, 2006 - 3:50 am UTC


Tom Kyte
February 07, 2006 - 12:36 am UTC

I get so many of these that generally when I have to hit page down 3 or 4 times to read it all, I have to skip it. I've been doing that more and more recently. If I cannot get the gist in a couple of seconds, I do consider it a new question.

only problem is, the more of these - the less of those (new questions...). vicious circle.

Cartesian Product. Help !!

wajid, May 25, 2006 - 1:11 pm UTC

Attached explain plan is from a developer.

I can see MERGE JOIN CARTESIAN on top, but how do I know which tables are participating in the MERGE JOIN ?

------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 112 | 745 |
| 1 | NESTED LOOPS | | 1 | 112 | 745 |
| 2 | NESTED LOOPS | | 1 | 96 | 744 |
| 3 | MERGE JOIN CARTESIAN | | 1 | 70 | 743 |
| 4 | NESTED LOOPS | | 1 | 62 | 742 |
| 5 | INDEX UNIQUE SCAN | CMN_SEC_USERS_PK | 1 | 6 | 1 |
| 6 | VIEW | CMN_SEC_ASSGND_OBJ_PERM_V0 | 155 | 8680 | 741 |
| 7 | SORT UNIQUE | | 155 | 19233 | 741 |
| 8 | UNION-ALL | | | | |
| 9 | HASH JOIN | | 91 | 8554 | 243 |
| 10 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 11 | NESTED LOOPS | | 2001 | 132K| 241 |
| 12 | HASH JOIN | | 42 | 1764 | 115 |
| 13 | HASH JOIN | | 63 | 2079 | 66 |
| 14 | NESTED LOOPS | | 69 | 1656 | 17 |
| 15 | INDEX FAST FULL SCAN | CMN_SEC_GROUP_FLAT_HIERS_U1 | 33557 | 393K| 17 |
| 16 | INDEX UNIQUE SCAN | CMN_SEC_USER_GROUPS_U1 | 25 | 300 | |
| 17 | TABLE ACCESS FULL | CMN_SEC_GROUPS | 11040 | 99360 | 48 |
| 18 | TABLE ACCESS FULL | CMN_SEC_GROUPS | 11040 | 99360 | 48 |
| 19 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 990K| 24M| 3 |
| 20 | HASH JOIN | | 3 | 156 | 5 |
| 21 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 48 | 1248 | 3 |
| 22 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 23 | NESTED LOOPS | | 1 | 155 | 44 |
| 24 | HASH JOIN | | 88 | 10472 | 44 |
| 25 | HASH JOIN | | 8 | 744 | 24 |
| 26 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 27 | NESTED LOOPS | | 178 | 11926 | 22 |
| 28 | NESTED LOOPS | | 4 | 188 | 10 |
| 29 | INDEX RANGE SCAN | CMN_SEC_ASSGND_RIGHT_U1 | 4 | 156 | 2 |
| 30 | TABLE ACCESS BY INDEX ROWID | CMN_SEC_GROUPS | 21976 | 171K| 2 |
| 31 | INDEX RANGE SCAN | CMN_SEC_GROUPS_N1 | 21976 | | 1 |
| 32 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 924K| 17M| 3 |
| 33 | INDEX FAST FULL SCAN | PRJ_OBS_ASSOCIATIONS_N1 | 15704 | 398K| 19 |
| 34 | INDEX UNIQUE SCAN | OBS_UNITS_FLAT_BY_MODE_N1 | 7784 | 273K| |
| 35 | HASH JOIN | | 16 | 2672 | 99 |
| 36 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 37 | NESTED LOOPS | | 356 | 50196 | 97 |
| 38 | NESTED LOOPS | | 8 | 968 | 73 |
| 39 | HASH JOIN | | 908 | 77180 | 73 |
| 40 | HASH JOIN | | 83 | 4897 | 53 |
| 41 | NESTED LOOPS | | 83 | 4233 | 4 |
| 42 | INDEX FAST FULL SCAN | CMN_SEC_ASSGND_RIGHT_U1 | 965 | 37635 | 4 |
| 43 | INDEX UNIQUE SCAN | CMN_SEC_USER_GROUPS_U1 | 25 | 300 | |
| 44 | TABLE ACCESS FULL | CMN_SEC_GROUPS | 21976 | 171K| 48 |
| 45 | INDEX FAST FULL SCAN | PRJ_OBS_ASSOCIATIONS_N1 | 15704 | 398K| 19 |
| 46 | INDEX UNIQUE SCAN | OBS_UNITS_FLAT_BY_MODE_N1 | 7784 | 273K| |
| 47 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 924K| 17M| 3 |
| 48 | HASH JOIN | | 34 | 5406 | 76 |
| 49 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 50 | NESTED LOOPS | | 756 | 98K| 74 |
| 51 | HASH JOIN | | 17 | 1921 | 23 |
| 52 | NESTED LOOPS | | 40 | 2960 | 18 |
| 53 | NESTED LOOPS | | 2 | 76 | 6 |
| 54 | TABLE ACCESS BY INDEX ROWID | SRM_RESOURCES | 2 | 24 | 2 |
| 55 | INDEX RANGE SCAN | SRM_RESOURCES_N1 | 2 | | 1 |
| 56 | TABLE ACCESS BY INDEX ROWID | PRJ_OBS_ASSOCIATIONS | 3926 | 99K| 2 |
| 57 | INDEX RANGE SCAN | PRJ_OBS_ASSOCIATIONS_N2 | 3926 | | 1 |
| 58 | TABLE ACCESS BY INDEX ROWID | OBS_UNITS_FLAT_BY_MODE | 7784 | 273K| 6 |
| 59 | INDEX RANGE SCAN | OBS_UNITS_FLAT_BY_MODE_N2 | 7784 | | 1 |
| 60 | INDEX FAST FULL SCAN | CMN_SEC_ASSGND_RIGHT_U1 | 644 | 25116 | 4 |
| 61 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 924K| 17M| 3 |
| 62 | NESTED LOOPS | | 10 | 2290 | 141 |
| 63 | NESTED LOOPS | | 10 | 2090 | 111 |
| 64 | HASH JOIN | | 1094 | 184K| 111 |
| 65 | HASH JOIN | | 100 | 14700 | 91 |
| 66 | HASH JOIN | | 100 | 13900 | 42 |
| 67 | NESTED LOOPS | | 80 | 8000 | 37 |
| 68 | NESTED LOOPS | | 4 | 256 | 13 |
| 69 | MERGE JOIN CARTESIAN | | 4 | 152 | 5 |
| 70 | TABLE ACCESS FULL | CMN_SEC_PERM_ELEMENTS | 2 | 52 | 1 |
| 71 | SORT JOIN | | 2 | 24 | 4 |
| 72 | TABLE ACCESS BY INDEX ROWID| SRM_RESOURCES | 2 | 24 | 2 |
| 73 | INDEX RANGE SCAN | SRM_RESOURCES_N1 | 2 | | 1 |
| 74 | TABLE ACCESS BY INDEX ROWID | PRJ_OBS_ASSOCIATIONS | 3926 | 99K| 2 |
| 75 | INDEX RANGE SCAN | PRJ_OBS_ASSOCIATIONS_N2 | 3926 | | 1 |
| 76 | TABLE ACCESS BY INDEX ROWID | OBS_UNITS_FLAT_BY_MODE | 7784 | 273K| 6 |
| 77 | INDEX RANGE SCAN | OBS_UNITS_FLAT_BY_MODE_N2 | 7784 | | 1 |
| 78 | INDEX FAST FULL SCAN | CMN_SEC_ASSGND_RIGHT_U1 | 1930 | 75270 | 4 |
| 79 | TABLE ACCESS FULL | CMN_SEC_GROUPS | 21976 | 171K| 48 |
| 80 | INDEX FAST FULL SCAN | PRJ_OBS_ASSOCIATIONS_N1 | 15704 | 398K| 19 |
| 81 | INDEX UNIQUE SCAN | OBS_UNITS_FLAT_BY_MODE_N1 | 7784 | 273K| |
| 82 | INDEX RANGE SCAN | CMN_SEC_ASSGND_OBJ_PERM_N1 | 924K| 17M| 3 |
| 83 | SORT JOIN | | 1 | 8 | 2 |
| 84 | TABLE ACCESS FULL | CMN_COMPONENTS | 1 | 8 | 1 |
| 85 | TABLE ACCESS BY INDEX ROWID | CMN_SEC_OBJECTS | 2 | 52 | 1 |
| 86 | INDEX RANGE SCAN | CMN_SEC_OBJECTS_N1 | 2 | | |
| 87 | TABLE ACCESS BY INDEX ROWID | CMN_LOOKUPS | 2 | 32 | 1 |
| 88 | INDEX UNIQUE SCAN | CMN_LOOKUPS_PK | 2 | | |
------------------------------------------------------------------------------------------------------------


Tom Kyte
May 25, 2006 - 1:51 pm UTC

well, I'm not going to un-munge the tree that wrapped - but it would be the output of the two steps indented right under it


...
MERGE JOIN CARTESION
SOME STEP 1
SOME STEP ......
.....
.....
SOME STEP 2
......


some step 1 and 2 will be joined at that level.

Same EXPLAIN PLAN different Performace

wajid, May 26, 2006 - 9:36 am UTC

In a query, when the value for one of the parameter is changed, the query takes 60 minutes instead of seconds.

The query plan in both cases is exactly the same.

What could be the reason ?

Tom Kyte
May 27, 2006 - 9:17 pm UTC

look at the tkprofs, that'll give you a pretty good idea.

(think about it, different INPUTS - different OUTPUTS, different processing to get to the data)...

scalar sub queries

Deepak, December 29, 2006 - 12:23 pm UTC

H Tom,

Am stuck in a tuning problem and need your expertise to get rid of the issue.

I have 3 huge tables(~20M rows each) and many small look up tables joined in a query as follows...

Assume:
Big tables : BT1, BT2, BT3
Small tables: ST1, ST2, ST3, ST4, ST5, ST6

select * from BT1, BT2, BT3, ST1, ST2, ST3, ST4, ST5, ST6
where
BT1.id=BT2.id(+) and
BT1.id=BT3.id(+) and
BT1.id=ST1.id(+) and
BT1.id=ST2.id(+) and
BT1.id=ST3.id(+) and
BT1.id=ST4.id(+) and
BT1.id=ST5.id(+) and
BT1.id=ST6.id(+);

The CBO is using hash joins and the cost is too high (400K) and we are having a lots of "direct path write waits".

I tried to remove the outer joins for the small lookup tables by using sclar queries. The cost reduced drastically (10K) but the overall execution time got increased.

Badly need your expertise to get rid of this issue.

Also would like to know even if the CBO cost is less in the second case, why the overall execution time is more. Is n't the cost inversly proportional to the time taken to execute the query?
Tom Kyte
December 30, 2006 - 8:56 am UTC

it is perfection - sheer utter perfection - that the cbo is using full scans and hash joins. I am so so glad it is doing that.

Because anything else would be wrong to get all of the rows as fast as possible from that query.

make sure you have set your pga_aggregate_target OR hash_area_size appropriately

HASH_AREA_SIZE

Deepak, December 30, 2006 - 9:52 am UTC

Hi Tom,

Thanks for your explanation. In the above case I had set PGA_AGGREGATE_TARGET=4GB. But when I observed the pga statistics, I found that the max. pga used by a session is 300M. That was the only user session running.

Later I used manual PGA and defined the hash_area_size=2GB explicitly. Then oracle started using the whole hash_area and the query ran faster.

Would like to know why oracle did not allocated teh desired hash_area_size when I used auto PGA?

Are there any situations where we should use manual pga?


Anything I missed?
Tom Kyte
December 30, 2006 - 11:16 am UTC

pga aggregate target is designed to "use all memory, in a fair fashion"

It will never give it all to a single session, it is designed to shared memory fairly across all sessions.

if you have a single large batch job (you own the machine), then manual memory management (which you can enable a session at a time using alter session) makes complete sense - you don't want to be FAIR at that point, you just want to use it all.

automatic memory management assumes other users will want to use memory too

great tip

Deepak, December 30, 2006 - 11:47 am UTC

Hi Tom,

Thanks a lot for the great tip.

Avoiding full table scan

A reader, June 08, 2007 - 11:10 am UTC

One of application hit our database with the following sql - each 30 seconds - it is on the top of statspack both for buffer gets and physical reads. The database is 8.1.7

Physical Reads Executions Reads per Exec % Total
--------------- ------------ -------------- ------- ------------
2,101,670 26 80,833.5 76.8

SELECT Count (*) from "BESTELLUNG" WHERE ( "MSTATUS" = 0 OR
"MSTATUS" IS NULL )

Buffer Gets Executions Gets per Exec % Total
--------------- ------------ -------------- ------- ------------
4,053,843 26 155,917.0 83.2

SELECT Count (*) from "BESTELLUNG" WHERE ( "MSTATUS" = 0 OR
MSTATUS" IS NULL )

The table has about 1000000 rows and there are only 4 distinct values:

SQL> select mstatus, count(*) from bestellung group by mstatus;
MSTATUS COUNT(*)
------------- ----------
0 1
1 9801565
2 1
9 6456

How can I avoid table scan - I think the problem is searching for null - table has an index on mstatus but is is not used.

I gathered table statistics as follow:


SQL> exec dbms_stats.gather_table_stats('KA', 'BESTELLUNG', cascade=>true, method_opt=>'FOR COLUMNS MSTATUS SIZE 4');

As it is third party application we cannot change something in the code..

It is brilliant

A reader, June 11, 2007 - 7:59 am UTC

Tom, your answer is brilliant - by creating just one index I was able to reduce physical reads /logical IOs per second from:

Physical reads: 3,180.94
Logical reads: 5,666.53

to:

Physical reads: 494.97
Logical reads: 696.98

It is very strange, but buffer hit is actually degraded from:

Buffer Hit %: 43.86

to

Buffer Hit %: 28.98 after creating the index.

How I have two top sqls in the statspack

Physical Reads Executions Reads per Exec % Total
--------------- ------------ -------------- -------
177,317 37 4,792.4 33.9

SELECT rowid, a, b, c, d, FROM "TRANSFER" WHERE a = 0 AND b = 1 AND c = 1 ORDER BY c ASC

79,093 7 11,299.0 15.1
SELECT MIN ("BESTELLUNGSDATUM") , MAX ("BESTELLUNGSDATUM") from "BESTELLUNG"

My questions are:

1. How can I avoid full table scan of BESTELLUNG table. I have Oracle 8.1.7 Standard Edition and as far as I know SE can not use function based indexes. So the full table scan of my bestellung table "blow away" all entries from my buffer_pool. I think that is why I have so bad buffer hit ratio.

2. My another table TRANSFER is actually very small - only 417 rows with max 9+5+5+5+255 bytes row length. So full table scan of this table is actually not so bad. What I don't understand is why the table allocate 4868 blocks and why oracle scans all this 4868 blocks by full table scan. What is wrong with this table - should I somehow rebuild the table?

SQL> desc transfer

ID NOT NULL NUMBER(9)
a NOT NULL NUMBER(5)
b NOT NULL NUMBER(5)
c NOT NULL NUMBER(5)
d NOT NULL VARCHAR2(255)

select num_rows, blocks, empty_blocks from dba_tables where table_name like 'TRANSFER';

NUM_ROWS BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
417 4864 0
Tom Kyte
June 11, 2007 - 10:43 am UTC

tuning will often reduce the cache hit ratio :)

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

the cache hit is:

  1  SELECT phy.value,
  2         cur.value,
  3         con.value,
  4         1-((phy.value)/((cur.value)+(con.value))) "Cache hit ratio"
  5  FROM   v$sysstat cur, v$sysstat con, v$sysstat phy
  6  WHERE  cur.name='db block gets'
  7  AND    con.name='consistent gets'
  8* AND    phy.name='physical reads'



if you decrease logical IO (curr+con) faster than you decrease physical IO (phy), then

phy/(cur+con)

increases.... that is what you did.


1) no it is not blowing away all of your entries. blocks read via full scans are cached 'differently'. they will NOT blow out the cache by design. see
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:313416745628#309906000346303334
for a recent example.

2) high water marks. table WAS much larger at some point.


ops$tkyte%ORA9IR2> create table t as select * from all_objects;

Table created.

ops$tkyte%ORA9IR2> delete from t;

30761 rows deleted.

ops$tkyte%ORA9IR2> commit;

Commit complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> set autotrace on
ops$tkyte%ORA9IR2> select * from t;

no rows selected


Execution Plan
----------------------------------------------------------
   0
SELECT STATEMENT Optimizer=CHOOSE


   1    0
  TABLE ACCESS (FULL) OF 'T'






Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        425  consistent gets
          0  physical reads
          0  redo size
        945  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte%ORA9IR2> select * from t;

no rows selected


Execution Plan
----------------------------------------------------------
   0
SELECT STATEMENT Optimizer=CHOOSE


   1    0
  TABLE ACCESS (FULL) OF 'T'






Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        425  consistent gets
          0  physical reads
          0  redo size
        945  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte%ORA9IR2> alter table t move;

Table altered.

ops$tkyte%ORA9IR2> select * from t;

no rows selected


Execution Plan
----------------------------------------------------------
   0
SELECT STATEMENT Optimizer=CHOOSE


   1    0
  TABLE ACCESS (FULL) OF 'T'






Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        945  bytes sent via SQL*Net to client
        372  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte%ORA9IR2> set autotrace off

Very good answer

A reader, June 11, 2007 - 11:03 am UTC

Thanks for you quick response, Tom.

But what can I do to AVOID full table scan when selecting min(date), max(date) from a ten-million-row table in 8.1.7 Standard Edition. I have an index on the date coulumn but I suppose it is not used because of min/max functions.

My second question is "alter table move" SAVE? Can this statement for e.g. invalidate some procedures based on this table?
Or in other words can I as a DBA simple execute alter table move without telling something to our developers :-)
Tom Kyte
June 11, 2007 - 11:32 am UTC

you can do the alter table T move (and alter index I rebuild !!!! all indexes will go invalid) without affecting code.

This is an OFFLINE operation of course.


the full scan of the table is likely due to either

a) use of RBO
b) nullality of the date column

if DATE is nullable - then the index will not be used to find the min and the max.


are we able to rewrite this query?

avoiding FTS

A reader, June 12, 2007 - 11:08 am UTC

Yes, the date column is nullable. We can try to rewrite the code here.

Many thanks again
Tom Kyte
June 12, 2007 - 1:50 pm UTC

select min(x), max(y)
from (select min(dt), to_date(null) from t
union all
select to_date(null), max(dt) from t)
/

should do it for you

How to avoid FTS?

Karthik, June 22, 2007 - 11:29 am UTC

Hello Tom,
We have huge table with million rows. We have some couple queries of this nature:
SELECT <colum list>>
FROM T1 
WHERE C10 LIKE 'Site running%'
AND C5 != 'AWAITING CLOSURE'
AND C5 != 'CANCELLED'


The C10 column is VARCHAR2(255).

Is there any way I can create function based index on C10 using INSTR and changing the query appropriately. Or is there any other way?

Or is there anyway I can index on C5 ?
Please help.




Tom Kyte
June 22, 2007 - 5:11 pm UTC

just index c10?

why would you want a function based index - a "normal" one would more than suffice. Add c5 to the index as well - so we can find and filter via the index and just access the table when we KNOW we have a row of interest.

Function based index and consistent gets

Jayanth, June 25, 2007 - 3:02 pm UTC

Tom,
One of my queries is taking a long time to complete. I ran the autotrace for the query and here are the results.
But before the results, here is a background of the schema/data.

I have a parent table of 3 million rows.
I have a child table of about 100,000 rows I need to update the parent table with (not necessarily a merge).
The match criteria requires that I have function based indexes on the parent table.
The following is the query and the associated autotrace results.

Here is the query and the autotrace:

Query:
 select email_address
           from email_address
       where last_name = 'some last_name'
            and substr(address, instr(address, ' ')-1) = 'some_address'
        and substr(zip, 1, 5) = 'some_zip'
        and rownum = 1

This query is actually part of a pl/sql block.
The variables some last_name, some_address, some_zip come from a cursor constructed from the child table
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8558 card=1 Bytes=42)
   1    0   COUNT (STOPKEY)
   2    1     TABLE ACCESS (FULL) OF 'EMAIL_ADDRESS' (Cost=8558 Card=1 Bytes=42)



Statistics
----------------------------------------------------------
         49  recursive calls
          0  db block gets
      89450  consistent gets
      72765  physical reads
          0  redo size
        140  bytes sent via SQL*Net to client
        234  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

The following are the relevant indexes on the table:

create index ea_add_substr_idx on email_address substr(address, instr(address, ' ')-1);
create index ea_add_substr_char_idx on email_address substr(address, 1, 1);
create index ea_zip_5_idx on email_address substr(zip, 1, 5);

and I have analyzed the indexes:

analyze index ea_add_substr_idx compute statistics;
analyze index ea_add_substr_char_idx compute statistics;
analyze index ea_zip_5_idx compute statistics;

My question is, why I am having such a high consistent gets, and physical reads?

Also, why is the query not utilizing the indexes, and instead making a full table scan?

It would really help if you could give some insight into it.

Thanks!
Tom Kyte
July 02, 2007 - 9:03 am UTC

no version information, so I cannot tell you if you have done everything you need to have done in order to have function based indexes used - or not.

seems like a simple index on last name would be virtually all that was needed, or last_name, address, zip (no functions even) as last name is pretty selective and we could filter on the index entries for address and zip, only hitting the table on a "real hit"

Function based index and consistent gets

Jayanth, July 03, 2007 - 12:19 pm UTC

Tom,

Here is the version info.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production


I made it to work a lot quicker (in the order of few mins, compared to days) but might have taken a rather extreme measure.

I created a combination index on all the 4 columns (last_name, substr(address..), substr(zip...), email_address).

Here is the execution plan and its stats after the new index
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=42)
   1    0   COUNT (STOPKEY)
   2    1     INDEX (RANGE SCAN) OF 'SFL_COMPOSITE1_INDEX' (NON-UNIQUE
          ) (Cost=3 Card=1 Bytes=42)


Statistics
----------------------------------------------------------
         56  recursive calls
          0  db block gets
         16  consistent gets
          0  physical reads
          0  redo size
        140  bytes sent via SQL*Net to client
        234  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

I will probably test it when I get a chance with an index just on the last name, though I thought there was an index on last_name already specified.

Thanks a lot for your input!

FTS question

Karthik, October 15, 2007 - 2:56 pm UTC

Tom,
We have an OLTP environment where a particular sql in one database running with index(checked with explain plan) runs fast(10 sec) when compared to another database that goes for FTS in the same box with same environment runs for 15 minutes.Both databases have same table/index structure and analyzed latest. The no. of records are also same.Queries are using bind variables. We are using Oracle 9.2.0.8 and this is the only one table used in SQL stmt.

Appriciate your suggestions on how to proceed

Jason Martin, October 16, 2007 - 11:40 am UTC

Are the initialization parameters the same? Do both instances have the same System statistics gathered? Any hints in the SQL? Stored outlines? Do the tables occupy equal space (ie is the HWM in the same place)?

Two identical systems should act the same, so _something_ is different.

"full scans are not evil, indexes are not good"

Karthick Pattabiraman, January 17, 2008 - 2:45 am UTC

I use to believe strongly in this point made by you in most of the places.

"full scans are not evil, indexes are not good"

But today this example tells me some thing else...

i got a mail which had some DB tips. Of which one was 

where rno <> 1000 will not take index 

so write like this

where rno > 1000 or rno < 1000

When i say i want all row that are not equal to 1000 that means iam getting a large (or most) portion of the table. So in that case a full table scan is advisable. And thats why the optimizer gose for a full table scan. and making it to use the index must be wrong. 

so i tested it..

SQL> CREATE TABLE T1 AS SELECT LEVEL RNO FROM DUAL CONNECT BY LEVEL <=100000;

Table created.

SQL> CREATE INDEX T1_IDX ON T1(RNO);

Index created.


SQL> ANALYZE TABLE t COMPUTE STATISTICS FOR TABLE FOR ALL INDEXES FOR ALL INDEXED COLUMNS;

Table analyzed.

and following is the tkprof result for the two query that i run

SELECT * FROM T WHERE RNO <> 10000


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch    13336      0.68       0.86          0      13648          0      199998
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13340      0.68       0.87          0      13648          0      199998

Rows     Row Source Operation
-------  ---------------------------------------------------
  99999  TABLE ACCESS FULL T (cr=6824 pr=0 pw=0 time=200195 us)


SELECT * FROM T WHERE RNO > 10000 OR RNO < 10000


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.01       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch    13336      0.54       0.73          0      13648          0      199998
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13340      0.56       0.74          0      13648          0      199998

Rows     Row Source Operation
-------  ---------------------------------------------------
  99999  TABLE ACCESS FULL T (cr=6824 pr=0 pw=0 time=200160 us)

This tkprof shows index gose faster than full scan. can you tell me why and how.


Tom Kyte
January 17, 2008 - 11:11 am UTC

a) stop using analyze, just stop, dbms_stats please

b) I see no index use here.

and please bear in mind, you have a single column - add some data please to make this real world (so that the index by itself cannot be used, the index AND table would have to be used)

Wrong script

Karthick Pattabiraman, January 17, 2008 - 4:01 am UTC

I gave the wrong script. The correct one is 

SQL> CREATE TABLE T AS SELECT LEVEL RNO FROM DUAL CONNECT BY LEVEL <=100000;

Table created.

SQL> CREATE INDEX T_IDX ON T(RNO);

Index created.


SQL> ANALYZE TABLE T COMPUTE STATISTICS FOR TABLE FOR ALL INDEXES FOR ALL INDEXED COLUMNS;

Table analyzed.

Tom Kyte
January 17, 2008 - 11:12 am UTC

see above, use

select level rno, rpad('*', 80, '*') data from dual.....

and select * on THAT

why this index is not being used

Reene, June 04, 2008 - 7:56 am UTC

Hi Tom,

I have a very puzzling problem. 

on the item_mast table - there is an index on time_stamp column. but the query below does not use it ,i have verified it in tkprof as well.

when we run the actual query  only :b2 and :b1 are NOT 

SQL>explain plan for
  2  SELECT  MSGID, ITEM_TYPE, ITEM_KEY 
  3  FROM
  4   ITEM_MAST WHERE 
  5    (TRANSACTION_TYPE = :B6  or :B6 IS NULL)
  6     AND (   TRANSACTION_SUBTYPE = :B5 or :B5 IS NULL ) 
  7        AND (PARTYID = :B4 or :B4 IS NULL) 
  8       AND ( PARTY_SITE_ID = :B3  or :b3 is null) 
  9       AND
 10   ( TIME_STAMP >=:B2   or :B2 is null  ) AND 
 11    ( TIME_STAMP <= :B1   or :B1 IS NULL  )     ;

Explained.

SQL>
SQL>select * from TABLE(dbms_xplan.display) ;

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------
| Id  | Operation            |  Name        | Rows  | Bytes | Cost  |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |     8 |   568 |  9781 |
|*  1 |  TABLE ACCESS FULL   | ITEM_MAST  |     8 |   568 |  9781 |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ITEM_MAST"."TRANSACTION_TYPE"=:Z OR :Z IS NULL)
              AND ("ITEM_MAST"."TRANSACTION_SUBTYPE"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."PARTYID"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."PARTY_SITE_ID"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."TIME_STAMP">=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."TIME_STAMP"<=:Z OR :Z IS NULL))

Note: cpu costing is off

19 rows selected. 


But If I change the above query like below and then do a explain plan - please consider the explain output as below.

please note the modifications in line 10 and 11.

SQL>explain plan for
  2  SELECT  MSGID, ITEM_TYPE, ITEM_KEY 
  3  FROM
  4   ITEM_MAST WHERE 
  5    (TRANSACTION_TYPE = :B6  or :B6 IS NULL)
  6     AND (   TRANSACTION_SUBTYPE = :B5 or :B5 IS NULL ) 
  7        AND (PARTYID = :B4 or :B4 IS NULL) 
  8       AND ( PARTY_SITE_ID = :B3  or :b3 is null) 
  9       AND
 10   ( TIME_STAMP >=:B2  /* or :B2 is null */ ) AND 
 11    ( TIME_STAMP <= :B1  /* or :B1 IS NULL */ )     ;

Explained.

SQL>
SQL>select * from TABLE(dbms_xplan.display) ;

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------
| Id  | Operation                    |  Name           | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     2 |   142 |  1142 |
|*  1 |  FILTER                      |                 |       |       |       |
|*  2 |   TABLE ACCESS BY INDEX ROWID| ITEM_MAST     |     2 |   142 |  1142 |
|*  3 |    INDEX RANGE SCAN          | ITEM_MAST_N3  |  1987 |       |     8 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE(:Z)<=TO_DATE(:Z))
   2 - filter(("ITEM_MAST"."TRANSACTION_TYPE"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."TRANSACTION_SUBTYPE"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."PARTYID"=:Z OR :Z IS NULL) AND
              ("ITEM_MAST"."PARTY_SITE_ID"=:Z OR :Z IS NULL))
   3 - access("ITEM_MAST"."TIME_STAMP">=:Z AND "ITEM_MAST"."TIME_STAMP"<=:Z)

Note: cpu costing is off

21 rows selected. 

question is Why it is so ? what could be a solutin.

as per requirement the it seems that that changes I did is not right as users can pass null values for :b1 and :b2 both .

thanks

Tom Kyte
June 04, 2008 - 10:18 am UTC

ok, tell me, if :b1` and :b2 are null:

10 ( TIME_STAMP >=:B2 or :B2 is null ) AND
11 ( TIME_STAMP <= :B1 or :B1 IS NULL ) ;


what happens to that predicate - does it make sense to use an index for it, in fact, if time_stamp is a NULLABLE column and the index is only on timestamp - it would be IMPOSSIBLE to use the index!!!!


that is like:

where time_stamp between :b2 and :b1
OR (:b1 is null and :b2 is null)


now, an index on a single nullable column timestamp could be used for the "timestamp between" but it CAN NOT be used for the (:b1 is null and :b2 is null) since entirely null keys are not entered into the b*tree.



I hate queries like this:

 5    (TRANSACTION_TYPE = :B6  or :B6 IS NULL)
  6     AND (   TRANSACTION_SUBTYPE = :B5 or :B5 IS NULL ) 
  7        AND (PARTYID = :B4 or :B4 IS NULL) 
  8       AND ( PARTY_SITE_ID = :B3  or :b3 is null) 
  9       AND
 10   ( TIME_STAMP >=:B2   or :B2 is null  ) AND 
 11    ( TIME_STAMP <= :B1   or :B1 IS NULL  )     ;



they are quite simply "not really possible to optimize"

what if transaction_type is very very selective and you supply :b6, it should use an index on transaction_type

but

what if partyid is also very very very selective and you supply :b3, it should use an index on partyid


BUT

we will only choose one or the other at hard parse time - we cannot use both.

Meaning, you will never get the optimal plan.


http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1288401763279

use that technique instead.

thanks

Reene, June 05, 2008 - 2:43 am UTC

Tom,

Thanks so much for your answer. really really helped me understand this issue .

as you already said - yes time_stamp is nullbale field so the index in not being used at all in the original query.

also time_stamp is the only column in the index.

i have gone through the alternate solution you have recommended and I am changing the code that way.

Thanks again

is it related to my question

Reene, June 05, 2008 - 10:38 am UTC

Hi Tom,

it your latest follow -up on my question above is related with my issue , as I am not getting any error- it is just the index is not being used and you told the reason.

Regards

Tom Kyte
June 05, 2008 - 10:59 am UTC

sorry, that was a cut and paste error! wasn't meant for your followup, I moved it

FTS

Bob, February 11, 2009 - 9:34 am UTC

Hi Tom,

I have a question for you regarding Full Table Scans. I have table "x" that contains 16.8 millions rows (approx)
I am populating lookup table by selecting from the table as follows:

select DISTINCT a,b,c,d from x where b = 2.

Now originally b had the following data distribution:

SQL> select b, count(*)
2 from x
3 group by b;

B COUNT(*)
- ----------
1 10025591
6216
2 2967656

So I created a Function Based Index on B like this:

create index x_FBI1 on x
(case when B = '2' then '2'
else NULL end)
tablespace y_tbs
compute statistics

And when I ran this piece of code to populate the lookup table it ran in 1.34 secs and used the FBI.

INSERT INTO LOOKUP
SELECT DISTINCT
a,
b,
c,
d
FROM x
where (case
when B = '2' then '2'
else NULL
end) = '2'

OK that was fine ... Now on repopulation of the main table, the data distribution has changed to this:


B COUNT(*)
- ----------
1 10523893
6194
2 6327634

Now it doesn't use the FBI and does a FTS.


--------------------------------------------------------------------------
| Id  | Operation          | Name                | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                     |  1976K|    41M|   151K|
|   1 |  HASH UNIQUE       |                     |  1976K|    41M|   151K|
|   2 |   TABLE ACCESS FULL| X|    16M|   353M|   101K|
--------------------------------------------------------------------------


How can I avoid this FTS as now it takes 10 mins. Thanks
Tom Kyte
February 11, 2009 - 12:30 pm UTC

why do you think the index range scan of 6.3 million rows would be faster now?

you can use the index hint to compare performance

FTS

Bob, February 11, 2009 - 7:27 pm UTC

Hi Tom, I tried using a hint but the optimizer ignored it. Is there a way to scan only 60% rows of the rows instead of scanning 100% rows. I want to get the time down from 10 minutes. I tried using a bitmap index, but the optimizer still opted for a FTS.

The parameter db_file_multiblock_read_count is set to 8.
Tom Kyte
February 12, 2009 - 12:01 pm UTC

show us how you hinted it. If the index access path *was possible* and the hints are well formed, then it would use it.

so, show us a CUT AND PASTE from sqlplus of your exact query and the autotrace.

Index access and performance as table/index grows

Sunny, March 02, 2009 - 2:28 pm UTC

Tom,

I have a question related to Index access and performance as the table grows.

I have a pretty intensive PL/SQL process (no I can't rewrite it in SQL unfortunately) that does fuzzy matching in a loop. All the queries in the loop are index access or table access via index and my performance is about 400 to 500 rows/sec (which is pretty good considering all the matching in the queries). The performance is acceptable right now but the table is expected to grow 10 fold over time. My question is if the access is strictly based on index/table access via index will the performance stay approx the same as the table grows. Every one of the queries will return less than a handful of rows at a time (less than 10) regardless of the table size.

The index size will grow but if the time for index access stays the same and the number of rows returned by each query stays the same I see no reason why the performance would go down from the 500 rows/sec right now - do you?

Thanks.
Tom Kyte
March 03, 2009 - 4:08 pm UTC

there are two cases:

case 1: when the table is 1,000 rows, you get 10 rows out of the index. When the table is 1,000,000 rows - you STILL get 10 rows out of the index.

In case 1, as the table grows, you would expect your application to run at about the same speed - it's performance would not be impacted by the size of the table.


case 2: when the table is 1,000 rows, you get 10 rows out of the index. When the table is 1,000,000 rows you get 10,000 rows. Since you get 1,000 times more data, you might take 1,000 times as long to execute (don't know, depends on your program a lot)




You *sound* like case 1 - but I'm not 100% sure. You say "the number of rows returned by each query" - but we'd need to know if the number of rows flowing out of each step of the query PLAN were expected to stay the same - if so, you should expect it to take about the same amount of time.

Sunny, March 03, 2009 - 6:44 pm UTC

Thanks for the answer Tom.

"You *sound* like case 1 - but I'm not 100% sure"

Yes my situation is case 1 where even if the table grows exponentially the number of rows returned by each query remains the same. My table now has 10 mil rows and even when it grows to 100 mil I will still be returning less than 10 (most often just 1 row) per each query in the loop. I figured as much but now that you have confirmed it I will rest a bit easier.

Thanks again.

Full tablescans in prod not in Dev

Raviprakash, April 22, 2009 - 7:02 am UTC

Hi Tom,

I have a problem with full table scans here. Qrery is doing full table scans in prdouction But in developement it is working fine and taking 9 secs. In production it is hanging forever. When i checked explain plan, it is doing full table scans. I have rebuild indices and again analyzed them. But still same problem. Could you please help me with this ?
Tom Kyte
April 23, 2009 - 12:39 pm UTC

why the heck would you rebuild indexes?

compare the plans for us, share the plans with us

and explain in painstaking detail how test is constructed versus production. Unless test is a restore of prodcution, you are comparing apples to flying toaster ovens.

Long running table scan in a procedure while not in a query

ZRayee, February 09, 2010 - 7:57 am UTC

Hi Tom,

I have two full table scan operations occuring in an insert and a merge query. Its a puzzle to me as of now that when I run the insert and merge as sql scripts the two successive table scans take 100 secs and 10 secs respectively. But when I have the two queries as part of two separate procedures and I execute the procedures, the first scan takes 100 secs while the second table scan shows 117000 secs (v$session_longops statistics).

Full table scans are required here as the table contains a days data which will be aggregated and loaded into another table.

Any suggestions/thoughts on why this should be happening?
Tom Kyte
February 15, 2010 - 2:09 pm UTC

You would need to provide an example, I have no idea how to compare "the insert and merge as sql scripts" with "I have the two queries as part of two separate procedures and I execute the procedures"

show me how to set up your case - what you are doing - step by step.

Long running table scan in a procedure while not in a query

ZRayee, February 17, 2010 - 2:15 am UTC

#Procedure structure

Create or replace package pkg1
ProcessLoad;
End;

Create or replace package body
/* package entry point procedure which inturn calls two other procedures*/
Procedure ProcessLoad
ProcInsertSQl(date);
ProcMergeSQl(date);
Commit;
Exeception
When Error
Rollback;
RAISE_APPLICATION_ERROR();
End;
ProcInsertSQL(date)
#SQL Insert
Exception
Raise Error;
End;
ProcMergeSQl(date)
#SQL Merge
SQL Merge
Exception Raise Error;
End;

execute pkg1.ProcessLoad;

v$session_longops stats when executing pkg1.processload
procedure ProcInsertSQL; table scan stg_sub total_time = 115 secs
procedure ProcMergeSQL; table scan stg_sub total_time = 117000 secs

v$session_longops stats when executing SQL Insert, SQL Merge and commit sequentially
InsertSQL table scan stg_sub total_time = 115 secs
MergeSQL table scan stg_sub total_time = 15 secs

stg_sub table holds data temporarily while the data load process is in progress. It does not have any indexes
Number of records in stg_sub is approx. 1.5million.


**********code snippets to give an idea. these are not the complete sqls*****************
#SQL Insert
INSERT FIRST WHEN
(PARENTORDERKEY IS NULL) THEN INTO stg_errdata
(errorid, SessionDate, targettablename, rowidentifiercols, rowidentifier, errormsg, SOURCESYSTEMNAME, createdate)
VALUES
(SEQ_ERRDATA_ERRORID.nextval,
TO_DATE('20100216','YYYYMMDD'),
'FACT_SX_VENUEEXEC',
'sourcesystemname-PARENTORDERID-sessiondate-ROUTENAME',
sourcesystemname || '-' || PARENTORDERID || '-' || sessiondate || '-' || ROUTENAME,
'parentorderid or sessiondate or venue or trader is null',
sourcesystemname,
sysdate)
ELSE into fact_sx_venueexec
(parentorderkey,
execqty,
createdate
)
values
(parentorderkey,
fillqty,
sysdate)
select
dimp.parentorderkey parentorderkey,
SUM(CASE WHEN sh.MSGTYPE='8' THEN sh.LASTQTY ELSE 0 END) fillqty
from stg_SUB sh,
dim_p dimp,
dim_v dimv
where
sh.sessiondate = TO_DATE('20100216','YYYYMMDD')
and sh.sourcesystemname = dimp.sourcesystemname
and sh.rootorderid = dimp.parentorderid
and sh.rootclorderid = dimp.clorderid
and sh.sessiondate = dimp.sessiondate
and sh.routename = dimv.routename
group by dimp.parentorderkey
;

************************
#SQL Merge

MERGE INTO FACT_SX_VENUEEXEC FSV
USING (
SELECT LAST_PO.PARENTORDERKEY,
SUM(SUB.LASTQTY)EXECQTY
FROM STG_RO TRO,
STG_SUB SUB,
(select parentorderkey,
parentorderid,
sessiondate,
sourcesystemname
from (select dimp.parentorderkey, parentorderid,
sessiondate,
sourcesystemname
row_number() over(partition by dimp.sessiondate, dimp.parentorderid order by dimp.orderversion desc) rn
from dim_p dimp)
where rn = 1) LAST_PO,
-- last version of a parent order
DIM_V DIMV
WHERE SUB.SESSIONDATE = TO_DATE('20100216','YYYYMMDD')
AND TRO.SESSIONDATE = SUB.SESSIONDATE
AND TRO.ORDER_ID = SUB.ROOTORDERID
AND TRO.CLORDERID = SUB.ROOTCLORDERID
AND TRO.SOURCESYSTEMNAME = SUB.SOURCESYSTEMNAME
AND TRO.SESSIONDATE = LAST_PO.SESSIONDATE
AND TRO.ORDER_ID = LAST_PO.parentorderid
AND TRO.SOURCESYSTEMNAME = LAST_PO.SOURCESYSTEMNAME
AND SUB.ROUTENAME = DIMV.ROUTENAME
GROUP BY LAST_PO.PARENTORDERKEY
HAVING SUM(SUB.LASTQTY) <> 0
) CANC_FILLQTY
ON (
FSV.SESSIONDATE = CANC_FILLQTY.SESSIONDATE
AND FSV.PARENTORDERKEY = CANC_FILLQTY.PARENTORDERKEY
)
WHEN MATCHED THEN
UPDATE
SET FSV.EXECQTY = FSV.EXECQTY + CANC_FILLQTY.EXECQTY
*********
COMMIT; --- used while executing the sqls
Tom Kyte
February 17, 2010 - 9:00 am UTC

tell you what, since we cannot have a complete example, you'll have to do the work here.


trace it from sqlPlus.
trace it from plsql.

use tkprof to generate reports from both.

compare the plans and ensure they are the same. I'll bet - they are not - and the LIKELY culprit will be:

in sqlplus you executed a query that is totally different from your plsql query, in plsql you used bind variables and you did not in sqlplus.


But let us start there, trace them, compare them, tell us what is different about the plans (or not)

Long running table scan in a procedure while not in a query

ZRayee, March 10, 2010 - 6:15 am UTC

Hi Tom,

Sorry for the delay in replying.

While I was searching for SQL_TRACE on ask tom, I came across your post on Bind Value Peeking and it turned out that I was facing the same issue. I evaluated the feasibility of your tip "Do not bind" on my packageand ... voila.

Thanks much for your advise.

Index is not used when getting min value

Kulkarni, October 19, 2011 - 7:44 am UTC

Hi Tom,
I have a problem with a query. Here is the query and the plan.

SQL_ID  gkhy763hk60uk, child number 0
-------------------------------------
SELECT DS_ID, UPDATE_STAMP FROM DS_PRICES A WHERE DS_ID IN (SELECT DS_ID FROM BASE_DATE_DSID) AND UPDATE_STAMP >=(SELECT
MIN(UPDATE_STAMP) FROM DS_DAILY_PRICES_COMPLETE WHERE A.DS_ID=DS_ID) AND
UPDATE_STAMP<=TO_DATE('12-AUG-2011','DD-MON-YYYY') AND PRICE IS NULL ORDER BY DS_ID,UPDATE_STAMP

Plan hash value: 715071183

----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                     | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                          |        |       | 30925 (100)|          |       |       |
|   1 |  SORT ORDER BY                       |                          |    698 | 29316 | 30925   (4)| 00:06:12 |       |       |
|*  2 |   HASH JOIN SEMI                     |                          |    698 | 29316 | 30924   (4)| 00:06:12 |       |       |
|*  3 |    TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES                |      9 |   171 |    29   (0)| 00:00:01 | ROW L | ROW L |
|   4 |     NESTED LOOPS                     |                          |   5266 |   179K| 30913   (4)| 00:06:11 |       |       |
|   5 |      VIEW                            | VW_SQ_1                  |    614 |  9824 | 13986   (8)| 00:02:48 |       |       |
|*  6 |       FILTER                         |                          |        |       |            |          |       |       |
|   7 |        HASH GROUP BY                 |                          |    614 |  9210 | 13986   (8)| 00:02:48 |       |       |
|   8 |         TABLE ACCESS FULL            | DS_DAILY_PRICES_COMPLETE |     13M|   185M| 13115   (2)| 00:02:38 |       |       |
|   9 |      PARTITION HASH ITERATOR         |                          |    113 |       |     3   (0)| 00:00:01 |   KEY |   KEY |
|* 10 |       INDEX RANGE SCAN               | DS_ID_UPDT_GLOBAL        |    113 |       |     3   (0)| 00:00:01 |   KEY |   KEY |
|  11 |    INDEX FAST FULL SCAN              | INDX_DS_ID               |  15483 |   105K|    10   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$DBA06A95
   3 - SEL$DBA06A95 / A@SEL$1
   5 - SEL$291F8F59 / VW_SQ_1@SEL$BD9E0841
   6 - SEL$291F8F59
   8 - SEL$291F8F59 / DS_DAILY_PRICES_COMPLETE@SEL$3
  10 - SEL$DBA06A95 / A@SEL$1
  11 - SEL$DBA06A95 / BASE_DATE_DSID@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DS_ID"="DS_ID")
   3 - filter("PRICE" IS NULL)
   6 - filter(MIN("UPDATE_STAMP")<=TO_DATE(' 2011-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
  10 - access("A"."DS_ID"="ITEM_1" AND "UPDATE_STAMP">="VW_COL_1" AND "UPDATE_STAMP"<=TO_DATE(' 2011-08-12 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=2) "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
   2 - (#keys=1) "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
   3 - "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
   4 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
   5 - "VW_COL_1"[DATE,7], "ITEM_1"[VARCHAR2,9]
   6 - "DS_ID"[VARCHAR2,9], MIN("UPDATE_STAMP")[7]
   7 - "DS_ID"[VARCHAR2,9], MIN("UPDATE_STAMP")[7]
   8 - "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
   9 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
  10 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
  11 - "DS_ID"[VARCHAR2,10]


Though there are indexes on the table DS_DAILY_PRICES_COMPLETE these are not used. However this column update_stamp is nullable.


SQL> desc datastream.DS_DAILY_PRICES_COMPLETE;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DS_ID                                              VARCHAR2(9)
 UPDATE_STAMP                                       DATE
 MVALUE                                             NUMBER(14,3)
 NOSH                                               NUMBER(15,3)
 PRICE                                              NUMBER(13,3)
 PHIGH                                              NUMBER(13,3)
 PLOW                                               NUMBER(13,3)
 POPEN                                              NUMBER(13,3)
 VOLUME                                             NUMBER(13)


Please let me know if there is any way to rewrite the query to avoid full scan?

Thanks a lot for your help.

Index is not used when getting min value

Kulkarni, October 19, 2011 - 8:56 am UTC

Tom,
I tried to run that sub-query which select min(update_stamp) like below. But interestingly it uses the index and returns the value quickly.


SQL>  set autotrace on exp stat
SQL> set timing on
SQL> set lines 150
SQL> set pages 500
SQL> select min(UPDATE_STAMP) from datastream.DS_DAILY_PRICES_COMPLETE;

MIN(UPDATE_STAMP)
------------------
30-DEC-64

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 200437311

------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                         |     1 |     8 | 13115   (2)| 00:02:38 |
|   1 |  SORT AGGREGATE            |                         |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| INDX_DS_TEMP_UPDT_STAMP |    13M|    99M|            |          |
------------------------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
        421  recursive calls
          0  db block gets
         98  consistent gets
          8  physical reads
          0  redo size
        529  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         18  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select max(UPDATE_STAMP) from datastream.DS_DAILY_PRICES_COMPLETE;

MAX(UPDATE_STAMP)
------------------
12-AUG-11

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 200437311

------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                         |     1 |     8 | 13115   (2)| 00:02:38 |
|   1 |  SORT AGGREGATE            |                         |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| INDX_DS_TEMP_UPDT_STAMP |    13M|    99M|            |          |
------------------------------------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          2  physical reads
          0  redo size
        529  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select min(UPDATE_STAMP) from datastream.DS_DAILY_PRICES_COMPLETE where ds_id='905100';

MIN(UPDATE_STAMP)
------------------
20-DEC-67

Elapsed: 00:00:18.49

Execution Plan
----------------------------------------------------------
Plan hash value: 3406073904

---------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                          |     1 |    15 |   889   (0)| 00:00:11 |
|   1 |  SORT AGGREGATE              |                          |     1 |    15 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| DS_DAILY_PRICES_COMPLETE |  1060 | 15900 |   889   (0)| 00:00:11 |
|*  3 |    INDEX RANGE SCAN          | INDX_DS_TEMP_DS_ID       |  1128 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("DS_ID"='905100')


Statistics
----------------------------------------------------------
         24  recursive calls
          0  db block gets
       5626  consistent gets
       5127  physical reads
          0  redo size
        529  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

I am not sure why it is not using index when used in the main query. Because of this query is performing badly.

Tom Kyte
October 19, 2011 - 6:13 pm UTC

no schema, no comment.

Your index is likely not suitable for an index min/max scan when used with an additional predicate.

You would want an index on ds_id, update_stamp to get the min/max range scan.



Index is not used when getting min value

A reader, October 20, 2011 - 2:25 am UTC

Tom,
As you can see from my posting when I ran the query with ds_id in where clause then also it worked fine and used index. But when it was part of the bigger query it is going for full scan.


Tom Kyte
October 20, 2011 - 2:45 am UTC

I told you once, I'll remind you again:


no schema, no comment.


That is not very ambiguous is it? I don't know what your indexes look a like.


You are comparing apples to flying toaster ovens. so what if for a given ds_id, it used the index. How about in general - would it use the index for EVERY ds_id, or just a few. I don't know squat about your data, how many value of ds_id exist? Looks like maybe 614. what would the AVERAGE number of rows returned be for a given ds_id, looks like maybe 22,000. How many times would it have to use this index in the bigger query - if it has to hit the index over and over and over and over (not just once but say 15,000 plus times) would the work done by a single full scan be FASTER than 15,000 plus big, huge index range scans?

go ahead, hint it to use the index for a test, just for grins. See how much more work it has to do to use the index.


It is likely doing the full scan for your benefit.

Indexes are evil
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

(*sometimes*)

Index is not used when getting min value

Kulkarni, October 20, 2011 - 7:20 am UTC

Tom,
I think you wanted these details( that is what I thought you mean by "No schema no comments")


SQL> select TABLE_NAME,PARTITION_NAME,NUM_ROWS,SAMPLE_SIZE,GLOBAL_STATS,USER_STATS from dba_tab_statistics where table_name in('DS_PRICES','DS_DAILY_PRICES_COMPLETE','BASE_DATE_DSID');

TABLE_NAME                     PARTITION_NAME                   NUM_ROWS SAMPLE_SIZE GLO USE
------------------------------ ------------------------------ ---------- ----------- --- ---
DS_DAILY_PRICES_COMPLETE                                        13001281       52312 YES NO
BASE_DATE_DSID                                                     15483       15483 YES NO
DS_PRICES                                                      263868501       53009 YES NO
DS_PRICES                      DS_PRICES_P1                       452800        5268 YES NO
DS_PRICES                      DS_PRICES_P2                       294334        5510 YES NO
DS_PRICES                      DS_PRICES_P3                       309189        4836 YES NO
DS_PRICES                      DS_PRICES_P4                       484104       67995 YES NO
DS_PRICES                      DS_PRICES_P5                      4434801       59561 YES NO
DS_PRICES                      DS_PRICES_P6                     73098997       52402 YES NO
DS_PRICES                      DS_PRICES_P7                    140306068       56041 YES NO
DS_PRICES                      DS_PRICES_P8                     45251383       58610 YES NO

SQL> select table_name,index_name,column_name,column_position from dba_ind_columns where table_name in('DS_PRICES','DS_DAILY_PRICES_COMPLETE','BASE_DATE_DSID');


TABLE_NAME                     INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ------------------------------ ---------------
DS_PRICES                      DSPRICES_UPDTSTAMP_IDX         UPDATE_STAMP                                 1
                               DS_ID_UPDT_GLOBAL              DS_ID                                        1
                               DS_ID_UPDT_GLOBAL              UPDATE_STAMP                                 2
BASE_DATE_DSID                 INDX_DS_ID                     DS_ID                                        1
DS_DAILY_PRICES_COMPLETE       INDX_DS_TEMP_UPDT_STAMP        UPDATE_STAMP                                 1
                               INDX_DS_TEMP_DS_ID             DS_ID                                        1
                               IDX_UPDT_DS_ID                 DS_ID                                        1
                               IDX_UPDT_DS_ID                 UPDATE_STAMP                                 2


SQL>  select table_name,PARTITION_NAME,INDEX_NAME,CLUSTERING_FACTOR,NUM_ROWS,SAMPLE_SIZE,GLOBAL_STATS,USER_STATS from DBA_IND_STATISTICS where TABLE_NAME in('DS_PRICES','DS_DAILY_PRICES_COMPLETE','BASE_DATE_DSID');

TABLE_NAME                     PARTITION_NAME                 INDEX_NAME                     CLUSTERING_FACTOR   NUM_ROWS SAMPLE_SIZE GLO USE
------------------------------ ------------------------------ ------------------------------ ----------------- ---------- ----------- --- ---
DS_PRICES                                                     DSPRICES_UPDTSTAMP_IDX                 236767006  256498350      365339 YES NO
                                                              DS_ID_UPDT_GLOBAL                       58486294  260084913      234825 YES NO
BASE_DATE_DSID                                                INDX_DS_ID                                 14778      15483       15483 NO  NO
DS_DAILY_PRICES_COMPLETE                                      INDX_DS_TEMP_UPDT_STAMP                 12263404   12686798    14196491 YES NO
                                                              INDX_DS_TEMP_DS_ID                      10820735   13826747    14196491 YES NO
                                                              IDX_UPDT_DS_ID                          13015840   13018847    13018847 NO  NO
DS_PRICES                      DS_PRICES_P1                   DSPRICES_UPDTSTAMP_IDX                    438788     453238      453238 YES NO
                               DS_PRICES_P2                   DSPRICES_UPDTSTAMP_IDX                    277204     293800      293800 YES NO
                               DS_PRICES_P3                   DSPRICES_UPDTSTAMP_IDX                    298118     316203      316203 YES NO
                               DS_PRICES_P4                   DSPRICES_UPDTSTAMP_IDX                    458658     482402      482402 YES NO
                               DS_PRICES_P5                   DSPRICES_UPDTSTAMP_IDX                   4214285    4398736      263207 YES NO
                               DS_PRICES_P6                   DSPRICES_UPDTSTAMP_IDX                  65239445   69615820      368490 YES NO
                               DS_PRICES_P7                   DSPRICES_UPDTSTAMP_IDX                 128632131  134078767      355639 YES NO
                               DS_PRICES_P8                   DSPRICES_UPDTSTAMP_IDX                  38089666   45226372      396376 YES NO
                               SYS_P24                        DS_ID_UPDT_GLOBAL                        7640767   34007974      243097 YES NO
                               SYS_P25                        DS_ID_UPDT_GLOBAL                        6946923   31041463      226216 YES NO
                               SYS_P26                        DS_ID_UPDT_GLOBAL                        7883709   34271672      246553 YES NO
                               SYS_P27                        DS_ID_UPDT_GLOBAL                        7802809   33891637      243819 YES NO
                               SYS_P28                        DS_ID_UPDT_GLOBAL                        7477125   33313244      239658 YES NO
                               SYS_P21                        DS_ID_UPDT_GLOBAL                        6997161   32259892      233578 YES NO
                               SYS_P22                        DS_ID_UPDT_GLOBAL                        7283605   34110039      246974 YES NO
                               SYS_P23                        DS_ID_UPDT_GLOBAL                        7458892   33089753      241143 YES NO

I hope I have provided the information you required.

Here is the issue is not index not being used. This query almost hangs. I don't mind full table scan if the query responds within reasonable time.

Tom Kyte
October 20, 2011 - 8:02 am UTC

I just want create index statements.


Here is the issue is not index not being used. This query almost hangs. I don't


i don't care that the index is not being used. did you read the link above? did you read what I wrote?


hint it, tell us what happens if you USE the index.

and supply the schema, a bunch of queries that I would have to read and figure out what the sql is - not time efficient.

give me the create table (no tablespace, no storage clauses - just the relevant columns)
give me the create indexes

and show me that using an index would "make this faster"

Index is not used when getting min value

Kulkarni, October 20, 2011 - 7:22 am UTC

Here are the details of table structure.


SQL> desc datastream.DS_PRICES
 Name                                                                                Null?    Type
 ----------------------------------------------------------------------------------- -------- --------------------------------------------------------
 DS_ID                                                                                        VARCHAR2(9)
 UPDATE_STAMP                                                                                 DATE
 MVALUE                                                                                       NUMBER(14,3)
 NOSH                                                                                         NUMBER(15,3)
 PRICE                                                                                        NUMBER(15,3)
 PHIGH                                                                                        NUMBER(15,3)
 PLOW                                                                                         NUMBER(15,3)
 POPEN                                                                                        NUMBER(15,3)
 VOLUME                                                                                       NUMBER(13)
 PRICE_US                                                                                     NUMBER(15,3)
 PHIGH_US                                                                                     NUMBER(15,3)
 PLOW_US                                                                                      NUMBER(15,3)
 POPEN_US                                                                                     NUMBER(15,3)

SQL> desc datastream.BASE_DATE_DSID
 Name                                                                                Null?    Type
 ----------------------------------------------------------------------------------- -------- --------------------------------------------------------
 DS_ID                                                                                        VARCHAR2(10)
 BASE_DATE                                                                                    DATE

SQL> desc datastream.DS_DAILY_PRICES_COMPLETE
 Name                                                                                Null?    Type
 ----------------------------------------------------------------------------------- -------- --------------------------------------------------------
 DS_ID                                                                                        VARCHAR2(9)
 UPDATE_STAMP                                                                                 DATE
 MVALUE                                                                                       NUMBER(14,3)
 NOSH                                                                                         NUMBER(15,3)
 PRICE                                                                                        NUMBER(13,3)
 PHIGH                                                                                        NUMBER(13,3)
 PLOW                                                                                         NUMBER(13,3)
 POPEN                                                                                        NUMBER(13,3)
 VOLUME                                                                                       NUMBER(13)

Tom Kyte
October 20, 2011 - 8:05 am UTC

no describes, no queries, just create statements, something easily readable

Index is not used when getting min value

Kulkarni, October 20, 2011 - 12:10 pm UTC

Tom,
Here are the create statements for tables and indexes.


CREATE TABLE DATASTREAM.DS_PRICES
( DS_ID VARCHAR2(9) NOT NULL ENABLE NOVALIDATE,
UPDATE_STAMP DATE NOT NULL ENABLE NOVALIDATE,
MVALUE NUMBER(14,3),
NOSH NUMBER(15,3),
PRICE NUMBER(15,3),
PHIGH NUMBER(15,3),
PLOW NUMBER(15,3),
POPEN NUMBER(15,3),
VOLUME NUMBER(13,0),
PRICE_US NUMBER(15,3),
PHIGH_US NUMBER(15,3),
PLOW_US NUMBER(15,3),
POPEN_US NUMBER(15,3)
) PCTFREE 5
PARTITION BY RANGE (UPDATE_STAMP)
(PARTITION DS_PRICES_p1 VALUES LESS THAN (TO_DATE('31-MAR-1975 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p2 VALUES LESS THAN (TO_DATE('31-MAR-1980 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p3 VALUES LESS THAN (TO_DATE('31-MAR-1985 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p4 VALUES LESS THAN (TO_DATE('31-MAR-1990 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p5 VALUES LESS THAN (TO_DATE('31-MAR-2000 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p6 VALUES LESS THAN (TO_DATE('31-MAR-2005 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p7 VALUES LESS THAN (TO_DATE('31-MAR-2010 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE,
PARTITION DS_PRICES_p8 VALUES LESS THAN (TO_DATE('31-MAR-2015 23:59:59', 'DD-MON-YYYY HH24:MI:SS')) COMPRESS tablespace DATASTREAM_LARGE);


CREATE TABLE BASE_DATE_DSID
( DS_ID VARCHAR2(10),
BASE_DATE DATE
)



CREATE TABLE DS_DAILY_PRICES_COMPLETE
( "DS_ID" VARCHAR2(9),
"UPDATE_STAMP" DATE,
"MVALUE" NUMBER(14,3),
"NOSH" NUMBER(15,3),
"PRICE" NUMBER(13,3),
"PHIGH" NUMBER(13,3),
"PLOW" NUMBER(13,3),
"POPEN" NUMBER(13,3),
"VOLUME" NUMBER(13,0)
)



CREATE INDEX DSPRICES_UPDTSTAMP_IDX ON DS_PRICES (UPDATE_STAMP) LOCAL

CREATE INDEX DS_ID_UPDT_GLOBAL ON DS_PRICES (DS_ID, UPDATE_STAMP) GLOBAL PARTITION BY HASH (DS_ID) partitions 8

CREATE INDEX INDX_DS_ID ON BASE_DATE_DSID(DS_ID)

CREATE INDEX INDX_DS_TEMP_UPDT_STAMP ON DS_DAILY_PRICES_COMPLETE (UPDATE_STAMP)

CREATE INDEX INDX_DS_TEMP_DS_ID ON DS_DAILY_PRICES_COMPLETE (DS_ID)

CREATE INDEX IDX_UPDT_DS_ID ON DS_DAILY_PRICES_COMPLETE (DS_ID,UPDATE_STAMP)


Index is not used when getting min value

Kulkarni, October 20, 2011 - 12:33 pm UTC

Tom,
Even with index hint query plan did not change.

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID bw8f7ph9tmb26, child number 0
-------------------------------------
SELECT /*+ iNDEX( DS_DAILY_PRICES_COMPLETE IDX_UPDT_DS_ID ) */ DS_ID, UPDATE_STAMP FROM DS_PRICES A WHERE DS_ID IN
(SELECT DS_ID FROM BASE_DATE_DSID) AND UPDATE_STAMP >=(SELECT MIN(UPDATE_STAMP) FROM DS_DAILY_PRICES_COMPLETE WHERE
A.DS_ID=DS_ID) AND UPDATE_STAMP<=TO_DATE('12-AUG-2011','DD-MON-YYYY') AND PRICE IS NULL ORDER BY DS_ID,UPDATE_STAMP

Plan hash value: 715071183

----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 30925 (100)| | | |
| 1 | SORT ORDER BY | | 698 | 29316 | 30925 (4)| 00:06:12 | | |
|* 2 | HASH JOIN SEMI | | 698 | 29316 | 30924 (4)| 00:06:12 | | |
|* 3 | TABLE ACCESS BY GLOBAL INDEX ROWID| DS_PRICES | 9 | 171 | 29 (0)| 00:00:01 | ROW L | ROW L |
| 4 | NESTED LOOPS | | 5266 | 179K| 30913 (4)| 00:06:11 | | |
| 5 | VIEW | VW_SQ_1 | 614 | 9824 | 13986 (8)| 00:02:48 | | |
|* 6 | FILTER | | | | | | | |
| 7 | HASH GROUP BY | | 614 | 9210 | 13986 (8)| 00:02:48 | | |
| 8 | TABLE ACCESS FULL | DS_DAILY_PRICES_COMPLETE | 13M| 185M| 13115 (2)| 00:02:38 | | |
| 9 | PARTITION HASH ITERATOR | | 113 | | 3 (0)| 00:00:01 | KEY | KEY |
|* 10 | INDEX RANGE SCAN | DS_ID_UPDT_GLOBAL | 113 | | 3 (0)| 00:00:01 | KEY | KEY |
| 11 | INDEX FAST FULL SCAN | INDX_DS_ID | 15483 | 105K| 10 (0)| 00:00:01 | | |
----------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$DBA06A95
3 - SEL$DBA06A95 / A@SEL$1
5 - SEL$291F8F59 / VW_SQ_1@SEL$BD9E0841
6 - SEL$291F8F59
8 - SEL$291F8F59 / DS_DAILY_PRICES_COMPLETE@SEL$3
10 - SEL$DBA06A95 / A@SEL$1
11 - SEL$DBA06A95 / BASE_DATE_DSID@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DS_ID"="DS_ID")
3 - filter("PRICE" IS NULL)
6 - filter(MIN("UPDATE_STAMP")<=TO_DATE(' 2011-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
10 - access("A"."DS_ID"="ITEM_1" AND "UPDATE_STAMP">="VW_COL_1" AND "UPDATE_STAMP"<=TO_DATE(' 2011-08-12 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=2) "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
2 - (#keys=1) "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
3 - "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
4 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
5 - "VW_COL_1"[DATE,7], "ITEM_1"[VARCHAR2,9]
6 - "DS_ID"[VARCHAR2,9], MIN("UPDATE_STAMP")[7]
7 - "DS_ID"[VARCHAR2,9], MIN("UPDATE_STAMP")[7]
8 - "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
9 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
10 - "A".ROWID[ROWID,10], "DS_ID"[VARCHAR2,9], "UPDATE_STAMP"[DATE,7]
11 - "DS_ID"[VARCHAR2,10]


However the query is still hanging without giving any out put but I could get the plan using dbms_xplan.

This session is showing up in the list of waiting sessions with the wait event "db file sequential read"
Tom Kyte
October 20, 2011 - 6:22 pm UTC

try putting the hint in the right place - you know, on the subquery you are trying to affect. The query block you put it in doesn't have that table in its scope.


This session is showing up in the list of waiting sessions with the wait event
"db file sequential read"


Oh, that is so rich.... The irony of it all.

Guess what causes db file sequential read waits?

INDEX READS <<<===============


the fact your query is waiting on db file sequential reads means the very very very efficient full scan has already taken place and you are reading the indexes now - your full scan you are so afraid of (but I'm not, I'm glad to see it) has already happened and you are waiting now for the very inefficient indexes to be used to process the rest of the query.


the irony of it all - here you are working as hard as you can to make the query slower!

my suggestion: try to get the plan to use LESS indexes.

Tell me - do those estimated cardinalities look reasonable to you in the plan, do the numbers 698, 9, 5266, 614, 113, 15483 in the plan look right to you?

I'll be guessing that they are way underestimating the rows that would be returned through each step of the plan.

Am I right?


Index is not used when getting min value

Kulkarni, October 21, 2011 - 12:00 am UTC

Tom,
Sorry for putting the hint in the wrong place. Anyway I am not keen on forcing the index reads. But here the issue is why the query almost hangs even when no hints are used. You said,
"my suggestion: try to get the plan to use LESS indexes".

But how avoid or use less indexes? Do I need to use hint again for not using indexes? Please clarify.
Tom Kyte
October 21, 2011 - 1:29 am UTC

Tell me - do those estimated cardinalities look reasonable to you in the plan, do the numbers 698, 9, 5266, 614, 113, 15483 in the plan look right to you?

I'll be guessing that they are way underestimating the rows that would be returned through each step of the plan.

Am I right?


address that for me please, what I suspect is that the estimated cardinalities are way off - we are getting the wrong plan - and we need to fix that.

Also, what version are you in? Can you use the sql monitor (11g new feature) to monitor the progress of this query - it will also show you the estimated row counts versus the actual row counts as the query is running.

The query is NOT hanging, the query is taking longer to run than you would like, but it is still running, please do not call that a hang, a hang means "it is not running at all"

Index is not used when getting min value

Kulkarni, October 21, 2011 - 4:43 am UTC

Tom,
We are using 10.2.0.5. The cardinality estimated are correct.
Yes the query is not hanging but running for very long time. I can see that the block number and file number read by the query continuously changing.

Do you feel that CBO is not choosing correct plan because for statistics?
Tom Kyte
October 21, 2011 - 8:27 pm UTC

then the cardinality estimates are not correct, if they where - this would run fast.

I'll ask again - do you think you'll get 600 some odd records out of this? Do you think that the one step will get about 15,000 records, the other about 5,000 records?

Because if they do - it would not take very long to execute this query. So, I have to guess that those estimates are *very low*. Use your knowledge of the data to confirm this please.

Use FullScan instead of FBI

Yoni, October 30, 2011 - 10:14 am UTC

Hi tom,
I have an Function Base Index (local index) on a table
and when i query:
select /*+ index(table FBI)*/ func(column)
from table;

The optimizer use Full Scan instead of index fast full scan
why?

forgive me for not copy&paste my code, i have a private network.

thanks.
Tom Kyte
October 31, 2011 - 11:14 am UTC

Your function must be able to return nulls and the optimizer KNOWS it cannot use your index since an entirely null key is never entered into a b*tree index.

the opimizer it saying "no" for the same reason it would say no to:

create table t ( x int, y char(2000) );
-- x allows nulls

create index t_idx on t(x);

select /*+ index(t t_idx) */ x from t;


it has nothing to do with the function - everything to do with nulls and b*tree indices.


You need to either

a) add an attribute to your index that is known to be NOT NULL - I will demonstrate that.

b) add a predicate to your query that precludes f(x) from being null - such as "where f(x) is not null" or "where f(x) > 0" and so on.




ops$tkyte%ORA11GR2> create table t ( x int, y char(2000) );

Table created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T', numrows => 1000000, numblks => 100000 );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create or replace function f( x in number ) return number
  2  deterministic
  3  as
  4  begin
  5          return x;
  6  end;
  7  /

Function created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx on t(f(x));

Index created.

ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select /*+ index( t t_idx ) */ x from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000K|    12M| 27121   (1)| 00:05:26 |
|   1 |  TABLE ACCESS FULL| T    |  1000K|    12M| 27121   (1)| 00:05:26 |
--------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> drop index t_idx;

Index dropped.

ops$tkyte%ORA11GR2> create index t_idx on t(f(x), 0);

Index created.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select /*+ index( t t_idx ) */ x from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |  1000K|    12M|     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |  1000K|    12M|     0   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN           | T_IDX |  1000K|       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

ops$tkyte%ORA11GR2> set autotrace off



that second index indexes the attribute 0, 0 is known to be NOT NULL, since at least one of the attributes in the index is known to be NOT NULL - the optimizer knows every row in the table is in the index - therefore it can use the index to retrieve every row....


Exadata and Full Table Scan

Parthiban Nagarajan, February 17, 2012 - 3:41 am UTC

Hi Tom

To how far the Exadata giant has changed the meaning (or semantics) of the phrase "Full Table Scan"?

I mean, by what magic the Exadata achieves the huge performance difference from other systems? Is it true that "Full Table Scan" in Exadata differs from "Full Table Scan" in other machines?

Thanks and regards

Tom Kyte
February 17, 2012 - 5:36 am UTC

it depends.

if you issue "select * from t" on exadata, the full scan will be a "normal full scan", all of the blocks, all of the data will be retrieved from disk and sent to the database instance. If table t was a terabyte in size, there will be a terabyte sent from disk to the instance.

on the other hand, if you issue:

select sum(sales) from t where product=55 and sales_date > sysdate-30;

then, the full scan you might typically see will be replaced by a 'smart scan'. the smart scan will push the where clause down to the storage to be processed in a massively parallel (MPP), shared nothing fashion. The storage cells will each scan their data sets (using storage indexes to preclude data from having to be scanned) and will process the where clause - finding ONLY the relevant records - and sending only the columns of interest for the rows that are relevant back to the server instance.


So, there is a full scan - which is a full scan. There is also a smart scan - which implies some/much of the query processing has been offloaded to the storage array.

Alexander, February 17, 2012 - 8:25 am UTC

Not to mention super fast SSD flashcache storage, monster network bandwidth, and what, 96 cpus depending on the unit? That doesn't hurt.
Tom Kyte
February 17, 2012 - 8:47 pm UTC

just flashcache stored, not SSD flashcache.

You missed the 168 cores in the storage array...

http://www.oracle.com/technetwork/server-storage/engineered-systems/exadata/dbmachine-x2-8-datasheet-173705.pdf

Jess, February 20, 2012 - 3:29 pm UTC

Hi Tom,

Even though FTSs may not be evil, this one is killing us:

select /*+ parallel (p, 5) */ <30 columns>
from t_abc where flag1 = 'Y' and flag2 = 'N';

We expect the query to return somewhere between 50 and 75% of rows in the table.

The hint is already there at this time.
The table has 10M rows and is not partitioned. It's got 15 Gb of data in it.
There is a bitmap index on flag1 with parallel degree of 5 (and 2 b-tree indexes on other columns).
Stats are gathered on the table/index. Data in tab_col_statistics shows correct number of distinct values (3), with num_buckets = 3 and histogram = FREQUENCY.

Selecting that much data, we'd expect a full table scan, which CBO does, but it takes forever to run. This SQL hasn't been seen to complete in ages.

Run as is, it produces the following plan:

--------------------------------------------------------------------------------
Id|Operation |Name |Rows |Bytes|TmpSp|Cost(%CPU)|Time | TQ |INOUT
--------------------------------------------------------------------------------
0|SELECT STATEMENT | |4810K|2532M| | 221K (1)|00:44:20| |
1| PX COORDINATOR | | | | | | | |
2| PX SEND QC (ORDER)|:TQ101|4810K|2532M| | 221K (1)|00:44:20|Q1,01|P->S
3| SORT ORDER BY | |4810K|2532M|2890M| 221K (1)|00:44:20|Q1,01|PCWP
4| PX RECEIVE | |4810K|2532M| |96766 (1)|00:19:22|Q1,01|PCWP
5| PX SEND RANGE |:TQ100|4810K|2532M| |96766 (1)|00:19:22|Q1,00|P->P
6| PX BLOCK ITRTR| |4810K|2532M| |96766 (1)|00:19:22|Q1,00|PCWC
*7| TBL ACC FULL |T_ABC |4810K|2532M| |96766 (1)|00:19:22|Q1,00|PCWP
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
7 - filter("FLAG1"='Y' AND "FLAG2"='N')


The index is unused.

Trying to force the index to be used with a hint, makes it worse:

------------------------------------------------------------------------------------
Id|Operation |Name |Rows |Bytes |TmpSp|Cost (%CPU)|Time
------------------------------------------------------------------------------------
0|SELECT STATEMENT | |4828K| 2542M| | 4537K (1)|15:07:32
1| SORT ORDER BY | |4828K| 2542M|2902M| 4537K (1)|15:07:32
*2| TABLE ACCESS BY INDEX ROWID |T_ABC |4828K| 2542M| | 3973K (1)|13:14:44
3| BITMAP CONVERSION TO ROWIDS| | | | | |
*4| BITMAP INDEX SINGLE VALUE |IND_ABC_F1| | | | |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("FLAG2"='N')
4 - access("FLAG1"='Y')


So we said perhaps we should index Flag2 to see what happens.
Create index on Flag2 as regular b-tree. The CBO still favours the full table scan.
Forcing the index looks somewhat to the plan above, just the indexes are swapped around (and differences due to one being bitmap):

----------------------------------------------------------------------------------
Id|Operation |Name |Rows |Bytes|TmpSp| Cost(%CPU)|Time
----------------------------------------------------------------------------------
0|SELECT STATEMENT | |4810K|2532M| | 3164K (1)|10:32:58
1| SORT ORDER BY | |4810K|2532M|2890M| 3164K (1)|10:32:58
*2| TABLE ACCESS BY INDEX ROWID|T_ABC |4810K|2532M| | 2602K (1)|08:40:36
*3| INDEX RANGE SCAN |IND_ABC_F2|4871K| | | 9393 (1)|00:01:53
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("FLAG1"='Y')
3 - access("FLAG2"='N')



Ok, just out of curiosity, drop the index on Flag2 and recreate it as a bitmap index. Drop index, create as bitmap, parallel, etc, flush buffer cache, gather table/column stats, histograms, etc.
Re-running the query, CBO still wants a full table scan. Trying to see what the plan is like with an index hint causes both indexes to be used (regardless of which one you hint) with a pretty sad result:

----------------------------------------------------------------------------------
Id|Operation |Name |Rows |Bytes|TmpSp|Cost(%CPU)|Time
----------------------------------------------------------------------------------
0|SELECT STATEMENT | |4857K|2557M| | 23M (1)|77:55:51
1| SORT ORDER BY | |4857K|2557M|2919M| 23M (1)|77:55:51
2| TABLE ACCESS BY INDEX ROWID |T_ABC |4857K|2557M| | 22M (1)|76:02:23
3| BITMAP CONVERSION TO ROWIDS| | | | | |
4| BITMAP AND | | | | | |
*5| BITMAP INDEX SINGLE VALUE|IND_ABC_F2| | | | |
*6| BITMAP INDEX SINGLE VALUE|IND_ABC_F1| | | | |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("FLAG2"='N')
6 - access("FLAG1"='Y')


Somewhat confused by that. Wouldn't it be faster to do an index scan by 2 indexes?

As long as we're here, what if we had one index on both Flag1 and Flag2?
Drop, create a b-tree on combination on flag1/flag2. Flush buffer cache, gather stats. Try the query again. Full table scan. Trying to force the index, no better:

------------------------------------------------------------------------------
|Id|Operation |Name |Rows |Bytes|TmpSp| Cost(%CPU)|Time |
------------------------------------------------------------------------------
| 0|SELECT STATEMENT | |4857K|2557M| | 3181K (1)|10:36:15 |
| 1| SORT ORDER BY | |4857K|2557M|2919M| 3181K (1)|10:36:15 |
| 2| TBL ACC BY IDX ROWID|T_ABC |4857K|2557M| | 2613K (1)|08:42:47 |
|*3| INDEX RANGE SCAN |IND_ABC_F2|4857K| | | 10724 (1)|00:02:09 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("FLAG1"='Y' AND "FLAG2"='N')


So thoroughly confused now.... What can we do to get a better performance out of this query?
Thank you as always--completely out of ideas here.....


Tom Kyte
February 21, 2012 - 7:08 pm UTC

You'll need slightly better IO capabilities - there is no magic here, only physics at work.

You *know* that the index is not the way to go - that is obvious. To get more than a million rows - an index would take almost 1.5 hours on a machine that can do a random IO in an average of 5ms (pretty good speed).

You are trying for 5-7 MILLION rows - that would be *many* hours


Please

a) define "forever", what does "forever" mean with regards to the full scan

b) take some of your datafiles and dd them (assuming unix/linux here) to /dev/null and tell me how many MB/sec you can achieve. That will dictate how long this takes overall.



Also, the plan shows a sort order by, your query doesn't, How did that happen? You are not showing me everything here.

What is your PGA memory settings, are you spending a long time on direct path reads/writes to/from temp? That'll add to your IO woes.

Jess, February 22, 2012 - 9:06 am UTC

Hi Tom,

Thank you for the advice, as always.

Better i/o capabilities are not in the cards I'm afraid.
I was toying with the idea of partitioning the table perhaps? That way partitions would be very skewed, which wouldn't be worse than now on insert, but would hopefully help with select, as you'd be selecting everything. (Unable to test that at the moment, but will do once environment is back with me.) Is that something that can potentially help?

To answer your questions,

a). 'forever' is undefined. As it runs for over 2 hours in production, it gets killed without being able to finish. I was able to run it in the performance enviroment for 6 hours at one point before it was killed. It's close to impossible to run something this long-running (rules, alerts, etc.--meant to help, but mostly hinders).

b). I don't have access to the Oracle boxes in either Prod or Perf environment, and request to Unix folks to do so was denied. I was told, however, that average for the box is 1,000 iops, peaking to 3,000, transferring 30 MB/s to 100 MB/s, respectively.

c). The orderby was a copy/paste mistake. There is an orderby in the query. We've already made a recommendation to change the underlying java code to pull the data back unsorted and handle ordering there (records need to be ordered by processing), but it's a large change that needs to wait for a release, whereas we're hoping to be able to boost performance at least a little at this time.

d). PGA settings:
aggregate PGA target parameter = 3072 GB
aggregate PGA auto target = 1334 GB
total PGA inuse = 1591 GB
total PGA allocated = 1829 GB
over allocation count = 1
extra bytes read/written = 5,186,823,154,688
cache hit percentage = 67.85%

Direct path reads are fussy. Most of the loads/processing seems to cope OK, but occasionally they get stuck for good. Overall the DBAs haven't flagged anything there.

Does this help? Thanks again
Tom Kyte
February 23, 2012 - 7:25 pm UTC

regarding partitioning, I don't know if that would help - based on the predicate you supplied. AT MOST, it would reduce the reads against the table by 25/50%. However, I have a strong suspicion is the sorting.

a) i really feel this is sorting related, not reading... to prove it, just run the query in sqlplus with set arraysize 100, set termout off and time it - WITHOUT an order by.

b) so, the IO is really slow. You have enough IO to keep 1, maybe 2 cpus BUSY at peak.

c) DO NOT ORDER IN THE JAVA CODE, it will be slower, IT WILL BE MUCH MUCH SLOWER.


can I see the real query? can I know the real size of the output data (not in rows, in MBs)

Jess, February 24, 2012 - 12:34 pm UTC

Hi Tom,

You've been a lifesaver on this one, as always.

It *IS* sorting, just as you said....

a). Running in sqlplus without order by started returning rows within 5 seconds. It took the query 1h 47m to dump out 7,480,317 rows.
(That said--even though I should be jumping up and down seeing how it returned in < 2h--I'd imagine this is still a little long to be returning a dataset to a java process that then operates on it in batches of 500 records....)

The unsorted run completed with the following plan:

----------------------------------------------------------------------------------------
Id|Operation |Name |Rows |Bytes|Cost (%CPU)|Time | TQ |IN-O|PQ Distrib
----------------------------------------------------------------------------------------
0 |SELECT STATEMENT | | | | 151K(100)| | | |
1 | PX COORDINATOR | | | | | | | |
2 | PX SEND QC (RANDOM)|:TQ10000|7242K|3813M| 151K (1)|00:45:16|Q1,00|P->S|QC (RAND)
3 | PX BLOCK ITERATOR | |7242K|3813M| 151K (1)|00:45:16|Q1,00|PCWC|
4 | TABLE ACCESS FULL|TBL_TXNS|7242K|3813M| 151K (1)|00:45:16|Q1,00|PCWP|
----------------------------------------------------------------------------------------

c). The rows have to be ordered for processing. If you're recommending against Java, then the only solution I see at the moment (to process the backlog, not to redesign) is to do multiple passes over data on combination of fields (so that essentially you always pick up the first transaction for every corp_id on the first pass, get those processed, then go back and do it again picking up every next one).

d). The actual query I've inherited is this:

select /*+ parallel(t, 5)*/ corp_id, corp_type, log_date, unique_id, redirect_info,
origin, curr_status, reported_date, processed_date, filename, agent_code, agent_type,
rcpt_code, rcpt_type, corp_processed, agent_processed, rcpt_processed, origin_id,
original_rc, sec_type, intl_no, processed_status, order_no, recent_flag, sec_other, quantity
from tbl_txns t
where processed_status = 'N' and original_rc = 'Y'
order by corp_id desc, unique_id desc, log_date desc, order_no desc;

Indexes on the table:
NAME UNIQUENESS TYPE COLUMNS
idx1 nonunique normal log_date, unique_id, corp_id
idx2 nonunique normal order_no
idx3 nonunique bitmap final_status


As to the real size of the output data, looking at the plan,'bytes' would imply that it's 3.7 gigs?

But the table has 11.7M rows. avg_row_len * num_rows = 14 gigs.
Since we returned 7.5M rows (but not all the columns though most of them), didn't it theoretically return shy of 9 gigs?

That said, the DBAs tell me that looking at datafile sizes, this table is 19.7 gigs
(so, as a side note, wondering why such discrepancy between the two--indexes are in a different tablespace).

==
Btw, nothing is ever being deleted from this non-partitioned table. It was put it to hold transactions that require additional processing and, once processed (set processed_status = 'Y'), to remain forever in the table for audit purposes *by design*. I'm working on changing it to have the job move the processed records to an audit table (so insert/delete instead of updating and keeping). However, current backlog needs to be processed now, before the architecture change can be applied :(

Thanks again for being out there, educating and helping out.

Tom Kyte
February 25, 2012 - 5:32 am UTC

looking at the plan,'bytes' would imply
that it's 3.7 gigs?


not relevant, I want facts, not guesses - the plan is a guess.


If we use an index to read this - I estimate it would take over 8 hours to read this all out by index.

but if you have a java process (a single process) reading this and processing it row by row - that is probably the least of your worries.

I'd be looking for a way to either

a) get rid of java, just process in bulk sql right in the database

b) if you have to do java, figure out a way to parameterize MANY java threads/processes and have each process a small chunk of data, at the same time

Jess, February 27, 2012 - 8:48 am UTC

Thanks Tom.

Getting it out of the java isn't really an option due to the overall design.
There are multiple threads set up to work on this, but it's limiting at 20 in the code. Will have to crank it up a bit more and see what can be done with more to make a dent in this.

Thanks again for all the help.

Is there a reason to treat avoiding full scans as a best practice?

Ivan Grishulenko, March 01, 2012 - 12:39 am UTC

Hi Tom-

From time to time a hear a word from experienced developers and dba that full table scans are evil and should be avoided by any means necessary. Stated rationale is that several full scans of a table T will be done in a queue, delaying each other and all those minor reads and updates of T from another sessions.
But I can't understand how N sessions doing a sequential read of entire table T could be faster than N sessions doing a scattered read of the same table. Could you advice me on that matter? Every time I force use of (useless) index in a query I feel like guilty...

P.S. We use Oracle in dedicated mode, with hundreds of users online.
Tom Kyte
March 01, 2012 - 7:52 am UTC

Stated rationale is that several full scans of a table T will be done in a queue, delaying each other and all those minor reads and updates of T from another sessions.

there is nothing true in that statement.

ask your experienced developers and DBAs to comment on this:

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

ask them which would be better - and explain to them that their rational has no basis in reality.

Re: Is there a reason to treat avoiding full scans as a best practice?

Ivan Grishulenko, March 16, 2012 - 8:38 am UTC

Tom,

Thank you for your advice. Although, collegues didn't believe me.

And then I thought: what if they have a reason to do so? So I've analysed our schema to find out things that could lead to such Oracle behaviour (described in prior message).
And it seems that we have CACHE attribute on large (even 21-GiB) tables. So performing a full table scan could really slow down everything else, by consuming both buffer space and cpu power. And performing several FTSs could effectively hang up the system.
What do you think?
Tom Kyte
March 16, 2012 - 8:52 am UTC

I think that if you need to retrieve hundreds of thousands of rows from a table, you better full scan - or you better have hours to wait.

Further more, in 11g, we'll tend to direct path read the table - bypassing the inefficiencies of the buffer cache entirely.

Further more, if you decided to change the defaults on a table and cache it - that is your decision and you live with the results of it. We don't take the cache/nocache attribute into effect.

If your colleagues didn't want full scanning to take place - why would they have even touched the cache attribute? It seems strange doesn't it?


If you think a full scan consumes cpu power, you should see what index access can do!!

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154


Jess, May 24, 2012 - 5:03 am UTC

Hi Tom,

We're back to selecting and sorting... There is another process that selects ~5M rows from a 10-15M partition based on 8-10 FK joins to other table that allow for data to be eliminated. It now transpires that the 5M records need to be returned in sorted order.

(At that, we have an option. We can either sort at the time we select from the large partition before writing to the target table OR we can write to the target table unsorted, and then sort the 5M upon passing it to the next process--the one that requires data to be ordered. The second one seemed a better option, but it still does not return in 15+ minutes.)

Based on your pointers to my previous question, I am struggling to understand how we could sort it efficiently. I would think dump it to a file and let perl have a go, but that doesn't seem to be great either. If you *have* to sort, what is the recommended approach for sorting datasets as large as these?

Thank you as always.
Tom Kyte
May 24, 2012 - 9:12 am UTC

... We can either sort at the time we select from the
large partition before writing to the target table OR we can write to the
target table unsorted, and then sort the 5M upon passing it to the next
process--the one that requires data to be ordered. ...

even if you sorted and then loaded - there is no assurance that the data will be retrieved sorted. You would STILL absolutely need an order by on the retrieval of this data - and it would be sorted again.

you would sort twice if you sort -> load. because you have to sort -> load -> SORT.


the database was born to join
the database was born to sort

let it sort, it is what databases do best.


just make sure your PGA memory settings are appropriate.


and, if you can avoid that writing of the data to a table - do so. maybe you can do a pipelined table function? Your first process would pipe rows to the second process to write to the ultimate final table.

Jess, May 24, 2012 - 9:56 am UTC

Hi Tom,

Thanks for the reply. Now I am thoroughly confused though... In the comments above, when we had a sorting problem a little while back, it seems you were suggesting that sorting that many rows was something that would take hours to do because it's just, well, too many rows. In this comment, however, you seem to be suggesting we just sort away. What is it that makes up the difference?

With respect to pipelined table function, I don't want to highjack this thread (I read one you have on these), but I'm struggling to see how we'd use it.

Say our data is such that each row has original amount and a currency code.

The buseness process is this:
- select data from partition based on joins to other table to eliminate all but ~5M rows
- ensure all data is converted to currency XYZ (by looking up daily exchange rates for that currency)
- error out the rows that could not be converted and keep those
- write that data to a file

With this requirement, our thinking was that we create a target table and an error log table into which unconvertable rows will go. We'd pre-select 3 most-popular currencies into variables. Then we'd select the 5M from the base partition and insert it into the target table with a case statement in it (when currency = top1, new_amount = rate*amount). Then we look at the target table for the rows with a null new_amount, get the straggler currencies, update those rows, and then use utl_file to write out the file (while possibly ordering at the same time).

How would we be able to use a pipelined table function to do all of that (with the conversion and what not) without a table? And how would a pipelined table function help with ordering in thsi case?

Thanks again ever so much.


Tom Kyte
May 24, 2012 - 10:27 am UTC

big page here, can you point me to the exact reference- all of the reviews/followups have a "bookmark" link - that'll point us right to it.


first, 5,000,000 rows in 2012 is pretty small (well, it is hard to say since 5,000,000 rows could be 5mb or 500tb - but in general...)

It is pretty darn small. Here is my laptop doing it running in a virtual machine (with a whopping 1gb of memory allocated to it) while listening to pandora, typing to you, running other virtual machines and so on:

big_table%ORA11GR2> select count(*) from big_table;

  COUNT(*)
----------
   5000000

big_table%ORA11GR2> desc big_table;
 Name                                     Null?    Type
 ---------------------------------------- -------- ----------------------------
 ID                                       NOT NULL NUMBER
 OWNER                                    NOT NULL VARCHAR2(30)
 OBJECT_NAME                              NOT NULL VARCHAR2(30)
 SUBOBJECT_NAME                                    VARCHAR2(30)
 OBJECT_ID                                NOT NULL NUMBER
 DATA_OBJECT_ID                                    NUMBER
 OBJECT_TYPE                                       VARCHAR2(19)
 CREATED                                  NOT NULL DATE
 LAST_DDL_TIME                            NOT NULL DATE
 TIMESTAMP                                         VARCHAR2(19)
 STATUS                                            VARCHAR2(7)
 TEMPORARY                                         VARCHAR2(1)
 GENERATED                                         VARCHAR2(1)
 SECONDARY                                         VARCHAR2(1)

big_table%ORA11GR2> set autotrace traceonly statistics
big_table%ORA11GR2> set autotrace traceonly 
big_table%ORA11GR2> set timing o
SP2-0265: timing must be set ON or OFF
big_table%ORA11GR2> set timing on
big_table%ORA11GR2> select * from big_table order by 1, 2, 3, 4;

5000000 rows selected.

Elapsed: 00:01:29.16

Execution Plan
----------------------------------------------------------
Plan hash value: 1472477105

-------------------------------------------------------------------------------
---------

| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| T
ime     |

-------------------------------------------------------------------------------
---------

|   0 | SELECT STATEMENT   |           |  5000K|   476M|       |   133K  (1)| 0
0:26:43 |

|   1 |  SORT ORDER BY     |           |  5000K|   476M|   640M|   133K  (1)| 0
0:26:43 |

|   2 |   TABLE ACCESS FULL| BIG_TABLE |  5000K|   476M|       | 20026   (1)| 0
0:04:01 |

-------------------------------------------------------------------------------
---------



Statistics
----------------------------------------------------------
        579  recursive calls
        809  db block gets
      73163  consistent gets
     147022  physical reads
          0  redo size
  277136037  bytes sent via SQL*Net to client
    3667083  bytes received via SQL*Net from client
     333335  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
    5000000  rows processed


big_table is a copy of all_objects over and over and over again..


The buseness process is this:
- select data from partition based on joins to other table to eliminate all but
~5M rows
- ensure all data is converted to currency XYZ (by looking up daily exchange
rates for that currency)
- error out the rows that could not be converted and keep those
- write that data to a file


I honestly do not see why that is more than a single insert statement??? I don't see why there is procedural code at all. With dml_error logging - you'd be able to do all of that in one pass??

The pipelined function, if it is even necessary - this sounds rather easy actually - would allow you have have a cursor feed into a process feed into another process and so on. You would STILL need to sort.

You have a query that gets 5,000,000 rows call it Q

You would create a function that took a cursor that processed the rows - doing the currency lookup if it cannot be done in sql.

That function would pipe the rows out.

So, you would be able to:

insert into target_table
select * from table( your_function( cursor( select * from (Q) order by ....) );

You could even parallel enable that and parallelize you processing if need be with parallel query.

Jess, May 24, 2012 - 11:43 am UTC

Hi Tom,

Apologies, the original question/answer are here, plus 2 more exchanges immediately below that:
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:9422487749968#4431537600346026474

That one was around trying to pull 7M rows from a table and failing miserably (with and without an index).

I just ran a skeleton version of the new query for comparison: select * from partition where <simple joins>. Expected return was about 3M rows. Without ordering, it starts spitting out rows about 15 seconds into it. With an order by it took over a minute to get it going.

As for why it would be more than 1 insert... Trying to be a little too clever perhaps and having it come out wrong? We figured that if we have the most popular currencies stashed, we could just multiply by a scalar in-flight in a very slim-down case statement (only 3 possibilities). Didn't want to end up with having to look up currencies per row (wasteful). Though about pulling all available currencies with today's rates into a cursor but ended up deciding against it. Is that the option you're essentially suggesting then?

As for the pipeline function (should one be required indeed), if all i have is a function and a cursor and no table, if I do the conversion as part of Q, where would the error rows go?

Thanks again Tom.
(looking forward to your talk in Scotland next month :) )


Tom Kyte
May 24, 2012 - 12:29 pm UTC

read that a little closer - I was telling him that trying to pull that many rows VIA AN INDEX would take hours. Using a full scan and sorting would be the right approach.

I was trying to convince them to NOT use an index, to NOT do it in java, to let the database do the sort.


Expected return was about 3M rows. Without
ordering, it starts spitting out rows about 15 seconds into it. With an order
by it took over a minute to get it going.


so what? a program doesn't care if it takes a bit of time to get started, it doesn't get bored. it wants to get to the last row as fast as possible.

Consider this example:

http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

the nested loops plan gets rows IMMEDIATELY, but takes forever to finish. The full scan plan takes a little while to get the first row - but returns the last row very very fast.

You don't care how long it takes for a program to get the first row, you care about how long it takes to get all of the rows.

If this were an interactive application and a user was waiting - we might have a different story.


Trying to be a little too clever
perhaps and having it come out wrong?,


I didn't understand that comment.


Is that the option you're essentially suggesting then?

I'm into packing as much into SQL as possible, yes.


if I do the conversion as part of Q,
where would the error rows go?


you could either use dml_error_logging (return a row that would fail the ultimate insert, we'll log it)

or your procedure could use an autonomous transaction to log it.




Jess, May 24, 2012 - 11:47 pm UTC

Hi Tom,

Thanks for clarifying.

With respect to the code trying to be a bit clever, all I was saying that we opted
for the approach of selecting top currencies, converting dynamically in the select,
and then updating the remaining currencies after the insert seemed like a more logical
thing to do than try to update all 50+ currencies at the same time on select. If
that's what you're suggesting we do, what would be the best approach? Simply adding
another join into the select and pulling from currencies or would we want to get the
data out of the currencies table first (cursor? array?) and do it that way?

For the pipelined table function, I am afraid I must've gone thick because I think I'm
still missing something. To use DML error logging, we'd need, well, DML... But it
seems that here we would--in the function--select from the source table in one giant
statement that selects and enriches the data and then pass that to utl_file to be
written to the file. Where does the error logging fit in this case if there's no insert or update?

Thanks again, this has been most informative.
Tom Kyte
May 25, 2012 - 8:57 am UTC

just join, databases were born to join, they were born to sort..



and now that I see you are just writing this to a file - not to a table - then yo u just need a single small stored procedure.

it would write "good" records to the file
it would log "bad" records where ever you wanted.

You are right - you wouldn't even need a pipelined function - you probably just need a single query to do the lookups and then write to a file.

but utl_file might not be the most efficient way to write 5,000,000 rows to a file - you might consider using a 3gl to run the query and write the file

Jess, May 25, 2012 - 10:21 am UTC

Thanks Tom.

Not sure about 3GL--we don't have much writing that's done outside of utl_file.

So sounds like you're staying we still don't need a table for the good records (i.e., no need for a 2-step process whereby we select all the records into a table first, and then write them out to a file). My only concern with that would be that if the select has taken a while to run, and then something happens to that session, you have to start from scratch on re-run, whereas if the records were in a table, you'd be able to just call the file writer..

But I guess that's a risk we have to evaluate.

Thanks again!

Regarding join tables and predicate order

Ankit, June 07, 2012 - 12:22 pm UTC

Hi Tom
Thanks a lot for all valuable stuff that you share with us.

I have following query:

select col1,col2,col3 from tab1 ods, tab2@dblink_1 stg
where to_number(ods.rowstamp) < to_number(stg.rowstamp)
and to_number(ods.rowstamp) > 987567856
and to_number(stg.rowstamp) < 998764582

Both tables have millions of records, ods probably has more than stg. stg is accessed via dblink. rowstamp columns are varchar2(40), that's why i placed to_number() function with them. rowstamp columns don't have indexes.

Currently above query takes very long to complete. My aim is to reduce time as much as possible.

1) How important are the order of where clause predicates ? If i shuffle them, will it help and in what order they should be for optimal performance ?
2) How important is order of tables in FROM clause ? Can I use some hint ?
3) I know that "to_number()" causes overhead but i need to place it as rowstamps are varchar and "to_number(ods.rowstamp) < to_number(stg.rowstamp)" did give me unwanted result when "to_number()" was not there. What do you say about this ?

Thanks a lot.
Tom Kyte
June 07, 2012 - 6:58 pm UTC

select col1,col2,col3 from tab1 ods, tab2@dblink_1 stg 
where to_number(ods.rowstamp) < to_number(stg.rowstamp) 
and to_number(ods.rowstamp) > 987567856
and to_number(stg.rowstamp) < 998764582


that was painful to read :( why to_number??? if you have a number that would be

to_number( to_char( number) )

if you don't have a number - WHY THE HECK NOT????

man oh man, that just hurts, and hurts and hurts.....


You know how much cpu you are burning with all of those to_numbers (and maybe to_chars)???? You are contributing to global warming in a big way.


1) not at all

2) not at all

3) overhead - horrifically so. I already said what I had to say about that before I even read this far.

Re:Regarding join tables and predicate order

Ankit, June 08, 2012 - 12:19 am UTC

Hi Tom

I know that to_number() hurts, but really rowstamp is varchar in both the tables. But values are NUMERIC.
Actual issue is that I have 420 such tables where rowstamps have to be compared like this with corresponding table via dblink.
I know they all have numeric values, but still the data-type is varchar and its an existing application having so many tables with so many records.

Please suggest if I remove to_number() from :
and to_number(ods.rowstamp) > 987567856
and to_number(stg.rowstamp) < 998764582
thereby allowing implicit conversion at runtime, will it do any good and give correct results ??

Thanks.
Tom Kyte
June 08, 2012 - 7:14 am UTC

But values are NUMERIC.

I laughed at this, no THEY ARE NOT. They are strings, they are in a varchar2, they are *strings*.

If your logic says "but they have to be compared, sorted, used as numbers all of the time", then common sense dictates "you guys have messed up big time".

implicit conversions are evil - do not ever permit them in your code.

Continue as you are, realizing that you yourselves have negatively impacted performance, data integrity, everything due to this "design decision"

Weired Full Table Scan

A reader, July 05, 2012 - 3:12 am UTC

Hi Tom,
we are running 10.2.0.4 version DB.
I have few questions regarding full table scan.
Is it possible for ORACLE to go for FTS even if all the conditions for INDEX scan are meeting?
Like,
1) Index exists in valid state.
2) Stats are as recent as today [
3) No of rows selecting due to "where idx_col<condn>" is less than total TABLE count. ( in our case its 6M versus 149M )

Also, what dows "ROW" column give information? No of rows returned from that operation?
"ROW" column in explain plan (for one using FTS) gives 2 whereas other (using INDEX scan) gives 96M.
Is this causing ORACLE to go for FTS?

Thanks

Tom Kyte
July 05, 2012 - 8:47 am UTC

6,000,000 rows is WAY too many to get via an index.

Way way way too many.

Think about this - you probably will hit millions of table blocks to read 6,000,000 rows in a table out via an index. Lets say you have to hit 3,000,000 blocks (you'll be reading the table using single block IO, you might have to read the same block DOZENS of times at different points in time during the processing of the query).

So, you are going to do MILLIONS of single block physical IO's. Let's say you get really really really luck and you only need to do 2,000,000 single block physical IO's.

A very good response time for an IO like that is about 5ms (5/1000th of a second). If you do 2,000,000 of them, that is 10,000 seconds or almost 3 hours. And that is if you are a) lucky and b) have really really good hardware that c) no one else is using at all.



If you are getting millions of rows via an index, you are doing it wrong.

go ahead, force it to use the index - and see what happens, go ahead and hint it and fetch the entire result set and compare that to the full scan performance (use sql trace and tkprof).

you'll find something like this I believe:

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6749454952894#6760861174154

You should be grateful it goes for the full scan.

Weired Full Table Scan

A reader, July 05, 2012 - 5:08 am UTC


And here are the stats for 2 operations [ with and w/o index ]

###################################################

With INDEX hint
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5270214 consistent gets
103989 physical reads
188 redo size
814 bytes sent via SQL*Net to client
1396 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
0 rows processed

########################################################

Without giving any kind of hint ( Original query ) doing FTS


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2692316 consistent gets
1625790 physical reads
0 redo size
823 bytes sent via SQL*Net to client
1363 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed

Tom Kyte
July 05, 2012 - 9:05 am UTC

So, which one actually runs faster.


and why are there zero rows processed??????????

Weired Full Table Scan

A reader, July 05, 2012 - 10:23 pm UTC

Thanks for your quick reply !!!
We ran the original query using index hint and it actually doing less physical I/Os as I provided earlier.
Using FTS its not performing well compare to INDEX hint. But ORACLE is still going for FTS.
With INDEX its retuning in some secs and in FTS its taking several mins :)

Total NUM_ROWS in TABLE
----------
149371666 >>>> 149M

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Total count of rows on indexed_col

select count(1) from table where indexed_col > sysdate - 30;
count(1)
----------
6408888

Weired FULL Table scan

A reader, July 05, 2012 - 10:25 pm UTC


>> and why are there zero rows processed??????????

The query updates 0 rows.

Weired Full Table Scan

A reader, July 05, 2012 - 10:46 pm UTC

We also did comparisons for explain plans and performance of query across PROD,QA,TEST.

It is foing for FTS in PROD and TEST but going for INDEX scan for QA.
And QA is being refreshed from PROD itself and stats are gathered using same method as in PROD.

exec dbms_stats.gather_table_stats(OWNNAME=>owner,TABNAME=>tab,ESTIMATE_PERCENT=>100,DEGREE=>5,CASCADE=>TRUE);

Our query is normal update, something like below and runs same query in ALL env. Everywhere it updates 0 rows only.

update table set col1=......where indexed_col > sysdate-30

I have also given below row comparisons in each env.

PROD :-

Total NUM_ROWS in TABLE
----------
149371666

Total count of rows on indexed_col
----------
6408888

++++++++++++++++++++++++++++++++++++++++++
QA :-

Total NUM_ROWS in TABLE
----------
121252694


Total count of rows on indexed_col
----------
2284

++++++++++++++++++++++++++++++++++++++++++
TEST :-

Total NUM_ROWS in TABLE
----------
138734274


Total count of rows on indexed_col
----------
866


Tom Kyte
July 11, 2012 - 8:16 am UTC

ummm, in QA there are very few rows there - explain how that is a copy of production please????

why do you think you would get even remotely the same plan given the HUGE different in data values?

QA is as much like production as a funhouse mirror is an accurate representation of you.

Sammy, July 23, 2012 - 6:39 am UTC

Hi Tom,
We are facing one issue while executing one query.
Please find the details below :

In our table one column default value is 'Y',when we are using it in our where condition like is_active = 'Y' then full table scan is coming .we have created index also on that column then also same prob.
If the value is N its using index but when is_Active = 'Y'
its fill table scan.

Could you please provide s solution how to avoid full table scan .

Tom Kyte
July 30, 2012 - 8:56 am UTC

why - why do you want to make your query go slower than slow?


I presume that most of the rows are "Y"

Few of the rows are "N"

since an index is only good for getting a SMALL NUMBER of rows from a table, we use the index for "N"

but smartly avoid it for "Y"



so, tell us - why do you want to use this index? You might have a reason such as "we need to get just the first row" or something - but tell us.

Merge Clause is using Index when Table is empty

amit, March 15, 2013 - 7:27 am UTC

HI,

I could see that when my table is empty, the Merge clause is going for Index.But When Table in Non-Empty index is not getting used.


Came You Please Let the know the reason behind this.

Empty Table.

SQL> select count(*) from AMIT;

  COUNT(*)
----------
         0

 explain plan for
        MERGE INTO AMIT X
        USING (  SELECT 
                              DISTINCT        
                              'RETAIL'                             APP_NAME,
                              a.store_id                           store_id,
                                TRUNC(SYSDATE+(SHIFT_INTERVAL_IN_MIN/1440)) REPORTING_DATE,
                              SHIFT_INTERVAL_IN_MIN                gmt_offset_in_min
                        FROM
                             BB a,
                             AA b
                        WHERE  b.STORE_ID=a.STORE_ID
                              AND SOURCE_TIME_ZONE='GMT'
                              AND TARGET_TIME_ZONE=STORE_TIMEZONE
                              AND SYSDATE >= start_ts
                              AND SYSDATE < END_TS                         
                     ) Y
              ON (X.APP_NAME = Y.APP_NAME AND X.STORE_ID = Y.STORE_ID )
              WHEN matched THEN
                    UPDATE SET
                                X.REPORTING_DATE  =      Y.REPORTING_DATE,
                                X.GMT_OFFSET_IN_MIN =    Y.GMT_OFFSET_IN_MIN
               WHEN NOT matched THEN
                   INSERT (
                                 X.APP_NAME,
                                 X.STORE_ID,
                                 X.REPORTING_DATE,
                                 X.GMT_OFFSET_IN_MIN
                                 )
                   VALUES (
                                 Y.APP_NAME,
                                 Y.STORE_ID,
                                 Y.REPORTING_DATE,
                                 Y.GMT_OFFSET_IN_MIN
                         );


Plan hash value: 3354129795

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                |                            |   183 | 12993 |    23   (9)| 00:00:01 |
|   1 |  MERGE                         | AMIT    |       |       |            |          |
|   2 |   VIEW                         |                            |       |       |            |          |
|   3 |    NESTED LOOPS OUTER          |                            |   183 | 15189 |    23   (9)| 00:00:01 |
|   4 |     VIEW                       |                            |   183 |  6039 |    21  (10)| 00:00:01 |
|   5 |      SORT UNIQUE               |                            |   183 | 12261 |    21  (10)| 00:00:01 |
|*  6 |       HASH JOIN                |                            |   248 | 16616 |    20   (5)| 00:00:01 |
|*  7 |        MAT_VIEW ACCESS FULL    | AA            |   124 |  7688 |    17   (0)| 00:00:01 |
|   8 |        INDEX FAST FULL SCAN    | BB_IDX      |   516 |  2580 |     2   (0)| 00:00:01 |
|   9 |     TABLE ACCESS BY INDEX ROWID| AMIT    |     1 |    50 |     1   (0)| 00:00:01 |
|* 10 |      INDEX UNIQUE SCAN         | PK_AMIT |     1 |       |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("B"."STORE_ID"="A"."STORE_ID")
   7 - filter("TARGET_TIME_ZONE"="STORE_TIMEZONE" AND "SOURCE_TIME_ZONE"='GMT' AND
              "END_TS">SYSDATE@! AND "START_TS"<=SYSDATE@!)
  10 - access("X"."APP_NAME"(+)="Y"."APP_NAME" AND "X"."STORE_ID"(+)="Y"."STORE_ID")

Note
-----
   - dynamic sampling used for this statement (level=2)

29 rows selected.



When Table is non Empty 

SQL> select count(*) from AMIT;

  COUNT(*)
----------
       374
       
       
        explain plan for
        MERGE INTO AMIT X
        USING (  SELECT 
                              DISTINCT        
                              'RETAIL'                             APP_NAME,
                              a.store_id                           store_id,
                                TRUNC(SYSDATE+(SHIFT_INTERVAL_IN_MIN/1440)) REPORTING_DATE,
                              SHIFT_INTERVAL_IN_MIN                gmt_offset_in_min
                        FROM
                             BB a,
                             AA b
                        WHERE  b.STORE_ID=a.STORE_ID
                              AND SOURCE_TIME_ZONE='GMT'
                              AND TARGET_TIME_ZONE=STORE_TIMEZONE
                              AND SYSDATE >= start_ts
                              AND SYSDATE < END_TS                         
                     ) Y
              ON (X.APP_NAME = Y.APP_NAME AND X.STORE_ID = Y.STORE_ID )
              WHEN matched THEN
                    UPDATE SET
                                X.REPORTING_DATE  =      Y.REPORTING_DATE,
                                X.GMT_OFFSET_IN_MIN =    Y.GMT_OFFSET_IN_MIN
               WHEN NOT matched THEN
                   INSERT (
                                 X.APP_NAME,
                                 X.STORE_ID,
                                 X.REPORTING_DATE,
                                 X.GMT_OFFSET_IN_MIN
                                 )
                   VALUES (
                                 Y.APP_NAME,
                                 Y.STORE_ID,
                                 Y.REPORTING_DATE,
                                 Y.GMT_OFFSET_IN_MIN
                         );


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1111949030

------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT            |                         |   183 | 12993 |    24   (9)| 00:00:01 |
|   1 |  MERGE                     | AMIT |       |       |            |          |
|   2 |   VIEW                     |                         |       |       |            |          |
|*  3 |    HASH JOIN OUTER         |                         |   183 | 15189 |    24   (9)| 00:00:01 |
|   4 |     VIEW                   |                         |   183 |  6039 |    21  (10)| 00:00:01 |
|   5 |      SORT UNIQUE           |                         |   183 | 12261 |    21  (10)| 00:00:01 |
|*  6 |       HASH JOIN            |                         |   248 | 16616 |    20   (5)| 00:00:01 |
|*  7 |        MAT_VIEW ACCESS FULL| AA         |   124 |  7688 |    17   (0)| 00:00:01 |
|   8 |        INDEX FAST FULL SCAN| BB_IDX   |   516 |  2580 |     2   (0)| 00:00:01 |
|   9 |     TABLE ACCESS FULL      | AMIT |   374 | 18700 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"."STORE_ID"(+)="Y"."STORE_ID" AND "X"."APP_NAME"(+)="Y"."APP_NAME")
   6 - access("B"."STORE_ID"="A"."STORE_ID")
   7 - filter("TARGET_TIME_ZONE"="STORE_TIMEZONE" AND "SOURCE_TIME_ZONE"='GMT' AND
              "END_TS">SYSDATE@! AND "START_TS"<=SYSDATE@!)

Tom Kyte
March 25, 2013 - 2:49 pm UTC

when the table is empty, the access path *doesn't really matter*.
when it is not empty, it matters a lot and we use the most efficient path.


when the table is empty, joining to another table is perhaps best accomplished via nested loops.

when the table is not empty, joining is perhaps best accomplished via hash join.


it is all about the numbers, the statistics.



i'll say that in general. And in greater length

no creates
no inserts
no look

if I don't have your schema, if I cannot reproduce, I cannot really comment. I'm not going to reverse engineer your situation - if you want people to help, you'll need to provide a clear, concise, yet 100% complete test case.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library