Question and Answer

Chris Saxon

Thanks for the question, Irfan.

Asked: August 20, 2002 - 8:24 am UTC

Last updated: October 26, 2023 - 12:47 pm UTC

Version: 9i

Viewed 10K+ times

You Asked

Hello Tom,

I have a live table (Investor) with customer records and another table (Invdatew) that has the same records plus one extra date column. At the end of each day I use dbms_job to insert the data from the live table into the other one, along with sysdate. But it is extremely slow: it takes approximately three hours to insert 36 rows (the row count depends on the day's new entries). Can you please help tune this query?

SQL> insert into invdatew
select inv, name, nama, type, clas, stat, act, rbrk, rest, reme,
adde, adda, cnum, dload, sysdate from investor
where inv not in
(select inv from invdatew);

36 rows created.


Execution Plan
----------------------------------------------------------
0      INSERT STATEMENT Optimizer=CHOOSE (Cost=170 Card=3672 Bytes=168912)
1    0   FILTER
2    1     TABLE ACCESS (FULL) OF 'INVESTOR' (Cost=170 Card=3672 Bytes=168912)
3    1     TABLE ACCESS (FULL) OF 'INVDATEW' (Cost=196 Card=2614 Bytes=33982)


Statistics
----------------------------------------------------------
0 recursive calls
284958 db block gets
47270538 consistent gets
42091341 physical reads
4528 redo size
886 bytes sent via SQL*Net to client
808 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
36 rows processed

Thanks


and Tom said...

Well, not having the table definitions makes this a bit of a guess, but I'll guess that inv is defined as nullable, which prevents us from doing some nice optimizations. Here is an example where I compare the results of three inserts and their performance. This should get you going. I suggest the hash anti-join method as the probable "best" -- it really depends on the size of the tables. Also, I assume an index on invdatew for the NOT EXISTS query; if there is no such index, don't use that one!

ops$tkyte@ORA817DEV.US.ORACLE.COM> drop table invdatew;

Table dropped.

ops$tkyte@ORA817DEV.US.ORACLE.COM> drop table investor;

Table dropped.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table invdatew
2 as
3 select 1 inv, a.*, sysdate dt from all_objects a where 1=0;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table investor
2 as
3 select 1 inv, a.* from all_objects a where 1=0;

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> create index invdatew_idx on invdatew(inv);

Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> exec gen_data( 'INVDATEW', 5000);

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> exec gen_data( 'INVESTOR', 5000);

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table invdatew compute statistics for table for all indexes for all indexed columns;

Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table investor compute statistics for table for all indexes for all indexed columns;

Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set sql_trace=true;

Session altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into invdatew
2 select a.*, sysdate from investor a
3 where inv not in (select inv from invdatew);

5000 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> rollback;

Rollback complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into invdatew
2 select a.*, sysdate from investor a
3 where inv is not null
4 and inv not in (select /*+ HASH_AJ */ inv from invdatew where inv is not null );

5000 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> rollback;

Rollback complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into invdatew
2 select a.*, sysdate from investor a
3 where not exists ( select *
4 from invdatew
5 where invdatew.inv = a.inv );

5000 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> rollback;

Rollback complete.

ops$tkyte@ORA817DEV.US.ORACLE.COM>



Now, the three queries are *different* in that if INV is null in invdatew -- the first one returns a different answer (no rows) whereas the second two might return some rows -- see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:442029737684
for the reasoning behind that...

So, I think the second two queries are OK for you (with a null inv in invdatew the first query would always return ZERO rows -- probably not what you wanted, right?)
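To make the NULL difference concrete, here is a small self-contained sketch using Python's sqlite3 (SQLite, not Oracle, but NOT IN's NULL semantics are standard SQL); the table and column names just mirror the question:

```python
import sqlite3

# A sketch of why one NULL in the subquery makes NOT IN return zero rows,
# while NOT EXISTS -- and NOT IN with an explicit IS NOT NULL filter --
# still find the new rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE investor (inv INTEGER);
    CREATE TABLE invdatew (inv INTEGER);
    INSERT INTO investor VALUES (1), (2), (3);
    INSERT INTO invdatew VALUES (1), (NULL);  -- one NULL poisons NOT IN
""")

not_in = conn.execute(
    "SELECT inv FROM investor WHERE inv NOT IN (SELECT inv FROM invdatew)"
).fetchall()

not_in_filtered = conn.execute(
    "SELECT inv FROM investor WHERE inv NOT IN "
    "(SELECT inv FROM invdatew WHERE inv IS NOT NULL)"
).fetchall()

not_exists = conn.execute(
    "SELECT inv FROM investor a WHERE NOT EXISTS "
    "(SELECT * FROM invdatew b WHERE b.inv = a.inv)"
).fetchall()

print(not_in)           # [] -- every NOT IN test is UNKNOWN, not true
print(not_in_filtered)  # [(2,), (3,)]
print(not_exists)       # [(2,), (3,)]
```

The first query silently returns nothing once a NULL sneaks into invdatew.inv, which is why declaring the column NOT NULL (or adding the IS NOT NULL predicates) matters.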

So, looking at the tkprof for these inserts we see hugely different performance characteristics:

insert into invdatew
select a.*, sysdate from investor a
where inv not in (select inv from invdatew)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.02 0 0 0 0
Execute 1 85.64 86.08 0 1974190 31754 5000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 85.66 86.10 0 1974190 31754 5000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 511

Rows Row Source Operation
------- ---------------------------------------------------
5001 FILTER
5001 TABLE ACCESS FULL INVESTOR
5000 TABLE ACCESS FULL INVDATEW


This query was processed like this:


for each row in investor
loop
FULL SCAN INVDATEW making sure that INV not in there
end loop

that is -- 5000 full scans of INVDATEW (5000 = number of rows in investor!)


********************************************************************************


insert into invdatew
select a.*, sysdate from investor a
where inv is not null
and inv not in (select /*+ HASH_AJ */ inv from invdatew where inv is not null )

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.67 3.84 0 550 10631 5000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.68 3.85 0 550 10631 5000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 511

Rows Row Source Operation
------- ---------------------------------------------------
5001 HASH JOIN ANTI
5000 TABLE ACCESS FULL INVESTOR
5000 VIEW VW_NSO_1
5000 INDEX FAST FULL SCAN (object id 44573)

Wow -- that is different -- from over 85 CPU seconds down to under one, and from over 86 seconds of runtime to under 4 seconds.

This was processed sort of like this:

for every row in outer join investor to invdatew
if invdatew.inv is null then insert this record

Much more efficient...
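The two processing shapes above can be sketched in plain Python -- a stand-in for illustration, not Oracle's actual row-source code. The FILTER plan rescans the inner table once per outer row; the hash anti-join builds a hash table once and probes it:

```python
# Contrast: per-row rescan (the FILTER plan) vs build-once-probe-many
# (the hash anti-join). Both produce the same answer.
investor = list(range(5000))
invdatew = list(range(2500))          # half the keys are already loaded

def filter_plan(outer, inner):
    scans = 0
    out = []
    for inv in outer:                 # for each row in investor ...
        scans += 1                    # ... a full scan of invdatew
        if inv not in inner:          # list membership walks the inner table
            out.append(inv)
    return out, scans

def hash_anti_join(outer, inner):
    probe = set(inner)                # build the hash table once
    return [inv for inv in outer if inv not in probe], 1

slow, slow_scans = filter_plan(investor, invdatew)
fast, fast_scans = hash_anti_join(investor, invdatew)
assert slow == fast                   # same rows inserted either way
print(slow_scans, fast_scans)         # 5000 inner scans vs 1 build pass
```

Same answer, but roughly O(n*m) comparisons versus O(n+m) -- which is the 86 seconds vs 4 seconds you see in the tkprof output.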

********************************************************************************

insert into invdatew
select a.*, sysdate from investor a
where not exists ( select *
from invdatew
where invdatew.inv = a.inv )

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.72 4.32 0 10672 10623 5000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.72 4.32 0 10672 10623 5000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 511

Rows Row Source Operation
------- ---------------------------------------------------
5001 FILTER
5001 TABLE ACCESS FULL INVESTOR
5000 INDEX RANGE SCAN (object id 44573)

Now, not as good as the hash_aj but.... very close, very close. The logical IOs would make me avoid this approach however -- rather do 550 LIOs than 10,672. This was processed like this:


for every record in investor
run a subquery that uses an index to find a row in invdatew
if not found
then insert
end if;
end loop



Hopefully, you can use that hash_aj and your query will run in minutes or less.




Comments

Great !!!

Irfan, August 21, 2002 - 7:47 am UTC

Hello Tom,
Thanks very much for the solution. It worked fine -- it took less than 40 seconds (compared to 3 hours, wow..). You are so humble and nice.


Tom Kyte
August 21, 2002 - 8:23 am UTC

Dig it -- did it 270 times faster ;) sweet...

which one did you end up using?

As you suggested -- Second One

Irfan, August 22, 2002 - 3:19 am UTC

<quote>
Hopefully, you can use that hash_aj and your query will run in minutes or less.
</quote>

Hi Tom,
I used the second one and it just fires. Amazed... Never know that hash_aj is so powerful - and your ideas are so brilliant.

Thanks.


what is hash_aj

rajiv, September 27, 2002 - 5:11 pm UTC

tom,

brilliant

what is this hash_aj? how did it work? is it a new feature in 9i?

thanks
rajiv

Tom Kyte
September 27, 2002 - 6:33 pm UTC

It is a CBO thing, around for a while -- the example was in 817 here.

http://docs.oracle.com/docs/cd/A87860_01/doc/server.817/a76992/hints.htm#21060


Terrific!

Urs, March 28, 2003 - 9:39 am UTC

Tom,

the "where inv is not null" thing saved me more than 99% execution time. How can I rate 6 stars? (I am not worthy...)

However, shouldn't the CBO be clever enough to know that NOT NULL fields will always be not null?

Tom Kyte
March 28, 2003 - 9:51 am UTC

inv was not "not null"

if the column is NOT NULL in the data dictionary -- yes, it is "smart enough". If you do not tell it -- if the field permits NULLs -- then it is smart enough to know that and goes for the correct answer instead of a fast wrong one.

Excellent!!!

Mirjana, March 28, 2003 - 10:55 am UTC


Why not auto?

Christo Kutrovsky, March 28, 2003 - 9:38 pm UTC

Why did not the CBO choose to use HASH_AJ in this case?

For example, if we are to manually define the statistics for those tables, what should they be in order for the CBO to choose to use HASH_AJ without giving a hint to the query?



Tom Kyte
March 29, 2003 - 9:28 am UTC

and can and will -- especially in 9i (incremental improvements).

also -- just discovered the test database I was using had always_anti_join = nested_loops ;(

turning that off -- we get:

insert into invdatew
select a.*, sysdate from investor a
where inv is not null
and inv not in (select inv from invdatew where inv is not null )

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.01 0.02 0 0 0 0
Execute 2 0.72 4.86 350 556 11122 5000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.73 4.88 350 556 11122 5000

Misses in library cache during parse: 2
Optimizer goal: CHOOSE
Parsing user id: 687

Rows Row Source Operation
------- ---------------------------------------------------
1 HASH JOIN ANTI
0 TABLE ACCESS FULL INVESTOR
0 VIEW VW_NSO_1
0 INDEX FULL SCAN (object id 55526)


right all -- hash aj.

NOT NULL column on remote table

Urs, March 31, 2003 - 2:20 am UTC

Thanks for your reply, Tom.

In my case, I guess the CBO had no chance to know that the column is declared NOT NULL as it is selected from a remote database across a db link.

Is there a way to help the CBO in using the remote db's data dictionary?

Urs

Tom Kyte
March 31, 2003 - 8:24 am UTC

distributed stuff changes all of the rules -- not every access plan is available.

ALWAYS_ANTI_JOIN

Kashif, April 01, 2003 - 4:26 pm UTC

Hi Tom,

I know this parameter is obsolete in 9i, but I was wondering what the value NESTED_LOOPS means. Thanks.

Kashif

Tom Kyte
April 01, 2003 - 7:36 pm UTC

means to use a nested loops anti join and is the reason my 8i example originally had a HASH_AJ hint in it.



Awesome!

Max, April 02, 2003 - 7:03 am UTC

subj.

How about using Append Hint

A reader, August 11, 2003 - 11:40 pm UTC

Does it run more efficiently if we use the append hint along with the hash_AJ hint ?

Tom Kyte
August 12, 2003 - 8:27 am UTC

maybe, maybe not.

it won't reuse any existing free space in the table if you do.

your dba must (has to, imperative) schedule a backup around your processing if you do this in nologging mode.

it won't have a tremendous effect if there are lots of indexes on the table (as they are always logged)

and given that the original question had:

...
But it takes so long that it takes approximately three
hours for inserting 36 rows
.....

for 36 rows -- append isn't going to save much of anything!

Superb Tom!

Denise, August 13, 2003 - 1:38 pm UTC

how do you assign always_anti_join=nested_loops in 8.1.7.

I had never known this existed until reading this thread.

We have queries that run excruciatingly slow and now I will
go back and take a look at them. Will print out your examples(I have a "script" book that I keep with all your examples from this website) and add to the others.

Tom Kyte
August 13, 2003 - 2:15 pm UTC

ops$tkyte@ORA817DEV> show parameter always

NAME                                 TYPE    VALUE
------------------------------------ ------- ------------------------------
always_anti_join                     string  NESTED_LOOPS
always_semi_join                     string  standard
ops$tkyte@ORA817DEV>

they are just init.ora parameters.... 

Any Difference?

Kamal Kishore, August 13, 2003 - 9:10 pm UTC

Hi Tom,
Is there a difference between the two?

SQL> alter table emp add constraint ck_emp_deptno check(deptno is not null) ;

and

SQL> alter table emp modify (deptno not null) ;

The reason I ask is because if I use the first ALTER TABLE and then do the following query, I get this plan:

SQL> select * from emp where deptno not in (select /*+ HASH_AJ */ deptno from dept) ;

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=40)
   1    0   FILTER
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=1 Bytes=40)
   3    1     INDEX (FULL SCAN) OF 'PK_DEPT' (UNIQUE) (Cost=1 Card=1 Bytes=2)



And, When I use the second ALTER TABLE command and then run the query again, I get this plan:

SQL> select * from emp where deptno not in (select /*+ HASH_AJ */ deptno from dept) ;

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=42)
   1    0   HASH JOIN (ANTI) (Cost=5 Card=1 Bytes=42)
   2    1     TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=560)
   3    1     INDEX (FULL SCAN) OF 'PK_DEPT' (UNIQUE) (Cost=1 Card=4 Bytes=8)


Do the two alter table statements behave differently?

Thanks,
 

Tom Kyte
August 13, 2003 - 9:42 pm UTC

you just found the difference then.

the check constraint is just that -- a check constraint. it could be "check the sky is blue" for all we know. It is not a NOT NULL constraint.

in a not in, the subquery must be known to NOT return nulls for the anti-join

you could add the predicate "where column is not null" as well

I Agree.

Kamal Kishore, August 13, 2003 - 10:24 pm UTC

Hi Tom,
One reason of my question was that at the client site they always insist that all integrity constraints (including NOT NULL) must be enforced using ALTER TABLE ADD CONSTRAINT and never use the ALTER TABLE MODIFY.

I was looking for a example where it would make a difference and found one. Thanks.

To sum it up, would you then say that a NOT NULL check must always be enforced using ALTER TABLE MODIFY?
Thanks,


Tom Kyte
August 14, 2003 - 7:49 am UTC

by definition a not null can only be added via modify -- not add constraint.  the wire diagrams in the sql reference tell us so.

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96540/statements_32a.htm#2104002

shows we can add an "out of line" constraint with ADD

http://docs.oracle.com/docs/cd/B10501_01/server.920/a96540/clauses3a.htm#1001938

shows us NOT NULL is an "in line" constraint only.


the syntax they are looking for for NOT NULL is:

ops$tkyte@ORA920LAP> alter table t modify ( x constraint x_not_null not null );

Table altered.




it is imperative they use NOT NULL and not a check constraint that checks if X is not null if they would like the optimizer to be able to fully do its job. 

??? could you please explain in detail ???

A reader, August 14, 2003 - 12:53 am UTC

I have to admit, I didn't get your answer above:

<quote>
you just found the difference then.

the check constraint is just that -- a check constraint. it could be "check the sky is blue" for all we know. It is not a NOT NULL constraint.

in a not in, the subquery must be known to NOT return nulls for the anti-join

you could add the predicate "where column is not null" as well
</quote>

could you please provide some further explanation?

Tom Kyte
August 14, 2003 - 7:56 am UTC

umm, one of them tells the optimizer

a) nothing (that is the check constraint. to the optimizer that is just that -- some ARBITRARY check constraint)

the other tells the optimizer

b) the column does not allow null values, you never need to concern yourself that the column would be NULL -- it always has a value.




not able to tune the query

mohsin, August 14, 2003 - 4:04 am UTC

Hi
Need help in tuning the data warehouse query
select
borrowername,
DESDEALER,
DESSOURCE,
dimrepmonth,
locdealer,
lan,
dimdisbmonth,
disbdate,
dimstartmonth,
DIMCURROSMONTHS,
dimcurrdpd,
subitem,
desdealer,
BRDITEM,
desitem,
a.dimitem,
TOTALNETDISBAMT,
TOTALGROSSDISBAMT,
TOTALEMIOUTSTANDING,
TOTALPRINOUTSTANDING,
TOTALINTOUTSTANDING,
TOTALOUTSTANDING,
descbucket
from mining.crash_monthlyfact a,
MINING.CRASH_DIMDEALER C,
mining.crash_dimitem d,
mining.crash_dimbucket e,
mining.crash_dimemi h,
mining.crash_dimsource f
where a.DIMitem =d.dimitem
AND A.DIMDEALER=C.DIMDEALER
and a.dimcurrbkt=e.dimbucket
and a.dimemi=h.dimemi
and a.dimsource=f.dimsource
and a.dimcurrbkt>=2
and c.locdealer='HYDERABAD'
AND a.DISBDATE >=to_date('01-JAN-02')

The execution path of the query is:

0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10150 Card=3896 Bytes=907768)
1    0   HASH JOIN (Cost=10150 Card=3896 Bytes=907768)
2    1     HASH JOIN (Cost=10126 Card=3896 Bytes=720760)
3    2       TABLE ACCESS (FULL) OF 'CRASH_DIMSOURCE' (Cost=6 Card=19880 Bytes=397600)
4    2       NESTED LOOPS (Cost=10115 Card=4081 Bytes=673365)
5    4         HASH JOIN (Cost=10115 Card=4081 Bytes=665203)
6    5           TABLE ACCESS (FULL) OF 'CRASH_DIMBUCKET' (Cost=1 Card=14 Bytes=126)
7    5           HASH JOIN (Cost=10113 Card=4081 Bytes=628474)
8    7             TABLE ACCESS (FULL) OF 'CRASH_DIMDEALER' (Cost=23 Card=248 Bytes=9672)
9    7             TABLE ACCESS (FULL) OF 'CRASH_MONTHLYFACT' (Cost=10086 Card=344633 Bytes=39632795)
10   4         INDEX (RANGE SCAN) OF 'CRASHDIMEMI' (NON-UNIQUE)
11   1     TABLE ACCESS (FULL) OF 'CRASH_DIMITEM' (Cost=17 Card=21365 Bytes=1025520)

The execution plan shows a full table scan of CRASH_MONTHLYFACT, although the DISBDATE column of CRASH_MONTHLYFACT is indexed with a B-tree index.
Please help us...


Tom Kyte
August 14, 2003 - 8:02 am UTC

so,

a) how many rows does crash_monthly fact actually have?

b) are the statistics totally up to date? show me the output of

set autotrace traceonly explain
select * from crash_monthlyfact

c) how many rows in that table SATISFY the predicate on the date?

d) how many blocks in that table and what is the clustering factor of the index thereon?

e) you do know that indexes do not in any way, shape, or form imply "I will be fast" -- they just as often imply "I'll be as slow as molasses in February in Alaska"

TUNING OF SQL QUERY

Mohsin, August 20, 2003 - 8:33 am UTC

a) how many rows does crash_monthly fact actually have?
Ans:2081311

b) are the statistics totally up to date? show me the output of

set autotrace traceonly explain
select * from crash_monthlyfact

Ans:
Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10086 Card=2081311 Bytes=1371583949)
1    0   TABLE ACCESS (FULL) OF 'CRASH_MONTHLYFACT' (Cost=10086 Card=2081311 Bytes=1371583949)

c) how many rows in that table SATISFY the predicate on the date?
Ans: 739770
d) how many blocks in that table and what is the clustering factor of the index
thereon?
Ans:
BLOCKS
---------
112000

CLUSTERING_FACTOR
-----------------
918023
Please guide..

Tom Kyte
August 21, 2003 - 7:56 am UTC

so, why do you believe that using this index would be useful?

everything I see indicates "it would not"


When the cluster factor is nearer the number of rows in the table and far from the number of blocks in the table -- that means the data in the table is not sorted by the index itself. Meaning as we index range scan thru that index, we'll be reading lots of blocks all over the place -- not just a few blocks.


facts:

o index and table are not sorted the same

o reading N rows via index will result in N different blocks being accessed in the table. (it is nice when reading N rows via an index results in 1 or 2 blocks being accessed -- your cluster factor however indicates N rows => N blocks)

o reading 112,000 table blocks using single-block IO (and reading and re-reading many of those blocks 7, 8, 9, 10 times, bear in mind, as we skip around) is INFINITELY less efficient than full scanning.



Do this for us, HINT the query, get the index plan you think would be better and then post us the benchmark results between the two!
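A back-of-envelope sketch with the numbers posted above helps show why. It uses the common rule of thumb that the number of table-block visits for an index range scan is roughly rows_fetched * clustering_factor / total_rows -- an approximation for illustration, not the optimizer's exact costing:

```python
# Numbers from the thread: 2,081,311 rows, 739,770 of them matching the
# date predicate, clustering factor 918,023, 112,000 table blocks.
total_rows        = 2_081_311
matching_rows     = 739_770
clustering_factor = 918_023
table_blocks      = 112_000

# With the clustering factor near the row count, each indexed row costs
# roughly one table-block visit, so:
index_block_visits = matching_rows * clustering_factor / total_rows

print(round(index_block_visits))   # roughly 326,000 single-block visits
print(table_blocks)                # vs 112,000 blocks read with multiblock IO

assert index_block_visits > table_blocks  # the full scan touches fewer blocks
```

Even before accounting for multiblock reads on the full scan, the index path would visit about three times as many blocks as the table even has -- exactly Tom's point.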

clarification on hash join and nulls

amit, December 01, 2004 - 1:56 pm UTC

From this it is clear that for a hash anti-join the join column should be defined as not null. Would it be correct to say that hash joins can only be performed when the join key is defined as not null on both data sets?



Tom Kyte
December 01, 2004 - 3:12 pm UTC

no, not at all.


where x = y

can be hash joined, even if x and y allow nulls.

x = null is never true
x != null is never true

nor are they false

but that doesn't prevent them from being hash joined.
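The three-valued logic behind this can be seen in any SQL engine; here is a tiny sketch using Python's sqlite3 (not Oracle, but the NULL semantics are the same):

```python
import sqlite3

# A comparison against NULL is neither true nor false -- it is NULL
# (unknown) -- and a WHERE clause only keeps rows where the predicate
# evaluates to true.
conn = sqlite3.connect(":memory:")
eq, ne = conn.execute("SELECT 1 = NULL, 1 <> NULL").fetchone()
print(eq, ne)    # None None -- both comparisons are unknown

kept = conn.execute(
    "SELECT COUNT(*) FROM (SELECT 1 AS x) t "
    "WHERE x = NULL OR x <> NULL"
).fetchone()[0]
print(kept)      # 0 -- unknown is not true, so the row is filtered out
```

A hash join simply never matches the NULL keys, which is fine for `where x = y`; it is only the anti-join ("prove no row matches") where an unknown poisons the whole result.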

more clarification

amit poddar, December 01, 2004 - 4:31 pm UTC

two questions:

1. Will all the null values map to the same hash bucket?
I am asking this since, if most of the column values are null, then a hash join would not be useful since most of them would go to the same bucket.

2. If a hash join can be done even with null join keys, what prevents Oracle from doing a hash anti-join when the join key can be null in the case of a NOT IN (..) correlated subquery?



Tom Kyte
December 01, 2004 - 7:51 pm UTC

1) they don't have to hash anywhere --

where x = y

will NEVER be true (or false) if either X or Y is null!!! nulls are not relevant here at all, they do not come into play!

2) where x NOT IN ( select null from dual )

is not true
is not false
is always unknown.

it is the semantics of NULL's that prevent it. Nulls are never equal, nor not equal -- they are Unknown.

query tuning

Mani, October 22, 2005 - 9:17 am UTC


Please advise me to tune the below query

select count(*) cnt from hire_money_fnmvt a
where txn_cacn <> '1000'
and a.doc_type like 'CR%'
and a.doc_date >= get_finyr(sysdate)
and a.branch_code < 300
and exists (select 'x' from receipt_source_document b
where a.branch_code = b.branch_code
and a.doc_type = b.doc_type
and a.doc_no = b.doc_no
and src_doc_type <> 7)
and nvl(system_doc,'N') = 'N'
having count(*) >0

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 2076.35 2716.19 162242 3556958 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 2076.36 2716.20 162242 3556958 0 0

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 509 (HPUSR) (recursive depth: 1)

Rows Row Source Operation
------- ---------------------------------------------------
0 FILTER
1 SORT AGGREGATE
0 FILTER
1968 TABLE ACCESS BY INDEX ROWID HIRE_MONEY_FNMVT
5777997 INDEX RANGE SCAN HFNMVT_BRDNOTY (object id 190019)
0 TABLE ACCESS BY INDEX ROWID RECEIPT_SOURCE_DOCUMENT
1806 INDEX UNIQUE SCAN CU_RSD_BRCD_DCTY_DCNO_001 (object id 192276)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
0 FILTER
1 SORT (AGGREGATE)
0 FILTER
1968 TABLE ACCESS (BY INDEX ROWID) OF 'HIRE_MONEY_FNMVT'
5777997 INDEX (RANGE SCAN) OF 'HFNMVT_BRDNOTY' (NON-UNIQUE)
0 TABLE ACCESS (BY INDEX ROWID) OF
'RECEIPT_SOURCE_DOCUMENT'
1806 INDEX (UNIQUE SCAN) OF 'CU_RSD_BRCD_DCTY_DCNO_001'

Tom Kyte
October 22, 2005 - 10:43 am UTC

rbo or cbo (with good stats)

SQL Query Tuning

A reader, October 24, 2005 - 3:20 am UTC

Oracle Version 9i (RBO)

Tom Kyte
October 24, 2005 - 6:21 am UTC

share with us the output from an autotrace traceonly explain of the query.

SQL Query Tuning

A reader, October 24, 2005 - 7:28 am UTC

Trace output :

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE
1    0   FILTER
2    1     SORT (AGGREGATE)
3    2       FILTER
4    3         TABLE ACCESS (BY INDEX ROWID) OF 'HIRE_MONEY_FNMVT'
5    4           INDEX (RANGE SCAN) OF 'HFNMVT_BRDNOTY' (NON-UNIQUE)
6    3         TABLE ACCESS (BY INDEX ROWID) OF 'RECEIPT_SOURCE_DOCUMENT'
7    6           INDEX (UNIQUE SCAN) OF 'CU_RSD_BRCD_DCTY_DCNO_001' (UNIQUE)



Tom Kyte
October 24, 2005 - 11:35 am UTC

oh - i see, I read too fast, I was hoping to hear "cbo"

use CBO - rbo will be "index happy" and that'll kill you. the in/exists processing is best with CBO as well (it'll consider in versus exists type of processing)



SQL Query Tuning

A reader, October 27, 2005 - 2:32 am UTC

I still want to use RBO

Tom Kyte
October 27, 2005 - 6:47 am UTC

"sorry" jumps to mind then. The rbo is very limited and primitive.


select count(*) cnt from hire_money_fnmvt a
where txn_cacn <> '1000'
and a.doc_type like 'CR%'
and a.doc_date >= get_finyr(sysdate)
and a.branch_code < 300
and exists (select 'x' from receipt_source_document b
where a.branch_code = b.branch_code
and a.doc_type = b.doc_type
and a.doc_no = b.doc_no
and src_doc_type <> 7)
and nvl(system_doc,'N') = 'N'
having count(*) >0

could be written as
select count(*) cnt
from hire_money_fnmvt a,
(select distinct branch_code, doc_type, doc_no
from receipt_source_document
where src_doc_type <> 7
and branch_code < 300
and doc_type like 'CR%') b
where a.txn_cacn <> '1000'
and a.doc_type like 'CR%'
and a.doc_date >= (select get_finyr(sysdate) from dual)
and a.branch_code < 300
and a.branch_code = b.branch_code
and a.doc_type = b.doc_type
and a.doc_no = b.doc_no
and nvl(a.system_doc,'N') = 'N'
having count(*) >0


Tuning status query... and question about pruning in views

Vladimir Sadilovskiy, November 08, 2005 - 11:45 am UTC

Hello Tom,

We have queries that return set of natural values based on whether a particular id value is found in a set of corresponding tables. Original query looks like:

SELECT 0 FROM DUAL 
 WHERE EXISTS 
       (SELECT null
          FROM TABLE_1
         WHERE id IN (select id from LookupTable))
UNION 
SELECT 1 FROM DUAL 
 WHERE EXISTS 
       (SELECT null
          FROM TABLE_2
         WHERE id IN (select id from LookupTable))
...

If an individual id value is found in one of the tables TABLE_..., it will not be found in the rest. Do you think this logic could be optimized?

I've tried following form, but it seems that I'm missing something and optimizer doesn't push "flag" predicate into union-all partitioned view:

explain plan for
select  flag
  from (select 0 flag from dual
        union all
        select 1 flag from dual
        union all
        select 2 flag from dual
        union all
        select 3 flag from dual
        union all
        select 4 flag from dual
        union all
        select 5 flag from dual
        union all
        select 6 flag from dual
        union all
        select 7 flag from dual
       ) a
where exists
      (select null
         from 
              (
               select 0 flag, ID from TABLE_1
               union all
               select 1 flag, ID from TABLE_2
               union all
               select 2 flag, ID from TABLE_3
               union all
               select 3 flag, ID from TABLE_4
               union all
               select 4 flag, ID from TABLE_5
               union all
               select 5 flag, ID from TABLE_6
               union all
               select 6 flag, ID from TABLE_7
               union all
               select 7 flag, ID from TABLE_8
              ) b
        where a.flag = b.flag
          and b.ID in
              (SELECT id from lookuptable)
      );

------------------------------------------------------------------------------------------
| Id  | Operation               |  Name                     | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                           |    13 |    78 |   789  (34)|
|*  1 |  HASH JOIN SEMI         |                           |    13 |    78 |   789  (34)|
|   2 |   VIEW                  |                           |  1312 |  3936 |            |
|   3 |    UNION-ALL            |                           |       |       |            |
|   4 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|   5 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|   6 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|   7 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|   8 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|   9 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|  10 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|  11 |     TABLE ACCESS FULL   | DUAL                      |   164 |       |     5  (20)|
|  12 |   VIEW                  | VW_SQ_1                   |  2363K|  6923K|            |
|  13 |    NESTED LOOPS         |                           |  2363K|    49M|   501   (1)|
|  14 |     TABLE ACCESS FULL   | LOOKUPTABLE               |   164 |  2132 |     5  (20)|
|  15 |     VIEW                |                           | 14411 |   126K|            |
|  16 |      UNION-ALL PARTITION|                           |       |       |            |
|* 17 |       INDEX RANGE SCAN  | TABLE_1$ID_INDEX          |     1 |     6 |     3  (34)|
|* 18 |       INDEX RANGE SCAN  | TABLE_2$ID_INDEX          |     1 |    13 |            |
|* 19 |       INDEX RANGE SCAN  | TABLE_3$ID_INDEX          |     1 |    13 |            |
|* 20 |       INDEX RANGE SCAN  | TABLE_4$ID_INDEX          |     1 |    13 |            |
|* 21 |       INDEX RANGE SCAN  | TABLE_5$ID_INDEX          |     1 |    13 |            |
|* 22 |       INDEX RANGE SCAN  | TABLE_6$ID_INDEX          |     1 |    13 |            |
|* 23 |       INDEX RANGE SCAN  | TABLE_7$ID_INDEX          |     2 |    12 |     4  (25)|
|* 24 |       INDEX RANGE SCAN  | TABLE_8$ID_INDEX          |     4 |    24 |     4  (25)|
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."FLAG"="VW_SQ_1"."FLAG")
  11 - access("TABLE_1"."ID"="LOOKUPTABLE"."ID")
  12 - access("TABLE_2"."ID"="LOOKUPTABLE"."ID")
  13 - access("TABLE_3"."ID"="LOOKUPTABLE"."ID")
  14 - access("TABLE_4"."ID"="LOOKUPTABLE"."ID")
  15 - access("TABLE_5"."ID"="LOOKUPTABLE"."ID")
  16 - access("TABLE_6"."ID"="LOOKUPTABLE"."ID")
  17 - access("TABLE_7"."ID"="LOOKUPTABLE"."ID")
  18 - access("TABLE_8"."ID"="LOOKUPTABLE"."ID")
SQL> 

Is there anything I can do to tell optimizer that if pair (0,id) is found in TABLE_1 don't look for this pair in the rest of the tables?

Thanks,

- Vladimir 

Tom Kyte
November 08, 2005 - 10:24 pm UTC

use UNION ALL and

select *
from (current query with union all instead of union)
where rownum = 1;


how does that work - it should stop as soon as it generates a hit and the union all doesn't need to do the sort/distinct you are currently doing.

I'm confused... I need all possible flags.

Vladimir Sadilovskiy, November 09, 2005 - 1:18 am UTC

Tom,

Perhaps I was unclear. I needed all distinct flags from the query, not just one. I was asking whether it is possible to skip searching the table with even numbers (table_1) if the lookup table doesn't contain them, as per the following test. Some kind of partition pruning, only with separate tables.

Thank you for your thoughts.

Here is a test case.

set autotrace off

create table table_1 (id number);
create index table_1$index_id on table_1(id);
create table table_2 (id number);
create index table_2$index_id on table_2(id);

delete from table_1;
delete from table_2;

begin
    for i in 1..40000 loop
        insert into table_1 values (mod(i,100)+mod(i,2)); -- even
        insert into table_2 values (mod(i,100)+mod(i+1,2)); -- odd
    end loop;
end;
/
commit;

exec dbms_stats.gather_table_stats(null,'table_1',method_opt=>'for all columns size 254',cascade=> true);
exec dbms_stats.gather_table_stats(null,'table_2',method_opt=>'for all columns size 254',cascade=> true);

create table lookuptable (id number);

prompt case 1.1 (both tables involved)

delete from lookuptable;
insert into lookuptable values (1);
insert into lookuptable values (50);

exec dbms_stats.gather_table_stats(null,'lookuptable',method_opt=>'for all columns size 254',cascade=> true);

set autotrace on
select 1 from dual
where exists
      (select null from table_1
        where id in (select * from lookuptable))
union all
select 2 from dual
where exists
      (select null from table_2
        where id in (select * from lookuptable));

set autotrace off

prompt case 1.2 (both tables involved)

set autotrace on
select flag 
  from (select 1 flag from dual
        union all
        select 2 flag from dual) a
where exists
      (select null 
         from 
           (select 1 flag, id from table_1
            union all
            select 2 flag, id from table_2
           ) b
        where b.flag = a.flag
          and id in (select * from lookuptable));
set autotrace off

prompt case 2.1 (one table with odd values involved)

delete from lookuptable;
insert into lookuptable values (1);
insert into lookuptable values (51);

exec dbms_stats.gather_table_stats(null,'lookuptable',method_opt=>'for all columns size 254',cascade=> true);

set autotrace on

select 1 from dual
where exists
      (select null from table_1
        where id in (select * from lookuptable))
union all
select 2 from dual
where exists
      (select null from table_2
        where id in (select * from lookuptable));

set autotrace off

prompt case 2.2 (one table with odd values involved)

set autotrace on
select flag 
  from (select 1 flag from dual
        union all
        select 2 flag from dual) a
where exists
      (select null 
         from 
           (select 1 flag, id from table_1
            union all
            select 2 flag, id from table_2
           ) b
        where b.flag = a.flag
          and id in (select * from lookuptable));
set autotrace off

Here are the results:

SQL> prompt case 1.1 (both tables involved)

         1
----------
         1
         2


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=328)
   1    0   UNION-ALL
   2    1     FILTER
   3    2       TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   4    2       NESTED LOOPS (Cost=7 Card=400 Bytes=2000)
   5    4         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

   6    4         INDEX (RANGE SCAN) OF 'TABLE_1$INDEX_ID' (NON-UNIQUE
          ) (Cost=2 Card=200 Bytes=400)

   7    1     FILTER
   8    7       TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   9    7       NESTED LOOPS (Cost=7 Card=8998 Bytes=143968)
  10    9         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

  11    9         INDEX (RANGE SCAN) OF 'TABLE_2$INDEX_ID' (NON-UNIQUE
          ) (Cost=2 Card=4499 Bytes=58487)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         20  consistent gets
          0  physical reads
          0  redo size
        407  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

SQL> SQL> SQL> SQL> case 1.2 (both tables involved)

      FLAG
----------
         2
         1


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=152 Card=3 Bytes=18)
   1    0   HASH JOIN (SEMI) (Cost=152 Card=3 Bytes=18)
   2    1     VIEW (Cost=10 Card=328 Bytes=984)
   3    2       UNION-ALL
   4    3         TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   5    3         TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   6    1     VIEW OF 'VW_SQ_1' (Cost=138 Card=48998 Bytes=146994)
   7    6       NESTED LOOPS (Cost=138 Card=48998 Bytes=440982)
   8    7         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

   9    7         VIEW (Cost=67 Card=24499 Bytes=146994)
  10    9           UNION-ALL (PARTITION)
  11   10             INDEX (RANGE SCAN) OF 'TABLE_1$INDEX_ID' (NON-UN
          IQUE) (Cost=3 Card=784 Bytes=1568)

  12   10             INDEX (RANGE SCAN) OF 'TABLE_2$INDEX_ID' (NON-UN
          IQUE) (Cost=2 Card=90 Bytes=1170)





Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
         32  consistent gets
          0  physical reads
        128  redo size
        410  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          2  rows processed

SQL> SQL> prompt case 2.1 (one table with odd values involved)

         1
----------
         2


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=328)
   1    0   UNION-ALL
   2    1     FILTER
   3    2       TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   4    2       NESTED LOOPS (Cost=7 Card=1569 Bytes=7845)
   5    4         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

   6    4         INDEX (RANGE SCAN) OF 'TABLE_1$INDEX_ID' (NON-UNIQUE
          ) (Cost=2 Card=784 Bytes=1568)

   7    1     FILTER
   8    7       TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   9    7       NESTED LOOPS (Cost=7 Card=8998 Bytes=143968)
  10    9         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

  11    9         INDEX (RANGE SCAN) OF 'TABLE_2$INDEX_ID' (NON-UNIQUE
          ) (Cost=2 Card=4499 Bytes=58487)





Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
        372  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> SQL> SQL> SQL> case 2.2 (one table with odd values involved)
      FLAG
----------
         2


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=152 Card=3 Bytes=18)
   1    0   HASH JOIN (SEMI) (Cost=152 Card=3 Bytes=18)
   2    1     VIEW (Cost=10 Card=328 Bytes=984)
   3    2       UNION-ALL
   4    3         TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   5    3         TABLE ACCESS (FULL) OF 'DUAL' (Cost=5 Card=164)
   6    1     VIEW OF 'VW_SQ_1' (Cost=138 Card=48998 Bytes=146994)
   7    6       NESTED LOOPS (Cost=138 Card=48998 Bytes=440982)
   8    7         TABLE ACCESS (FULL) OF 'LOOKUPTABLE' (Cost=5 Card=2
          Bytes=6)

   9    7         VIEW (Cost=67 Card=24499 Bytes=146994)
  10    9           UNION-ALL (PARTITION)
  11   10             INDEX (RANGE SCAN) OF 'TABLE_1$INDEX_ID' (NON-UN
          IQUE) (Cost=3 Card=784 Bytes=1568)

  12   10             INDEX (RANGE SCAN) OF 'TABLE_2$INDEX_ID' (NON-UN
          IQUE) (Cost=2 Card=90 Bytes=1170)





Statistics
----------------------------------------------------------
         10  recursive calls
          0  db block gets
         30  consistent gets
          0  physical reads
          0  redo size
        375  bytes sent via SQL*Net to client
        503  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> SQL> 
 

Tom Kyte
November 09, 2005 - 9:30 am UTC

I'm confused - how about you don't post SQL and plans but rather just text describing the problem more clearly.

For example, in the first post you said:

...If an individual id value is found in one of the tables TABLE_..., it will not
be found in the rest. Do you think this logic could be optimized?....


that sounds like "search the tables and when you get a hit, stop"


statements like:

...if it is possible to skip searching table with even numbers
(table_1) ....

are confusing - since table_1 seems to be an odd number?

Clarification on the algorithm...

Vladimir Sadilovskiy, November 09, 2005 - 11:03 am UTC

Tom,

Not my best thinking, hence lots of confusion. Sorry for that. Please drop my earlier explanation. I mixed up the desire to prune tables by a flag value when connecting to dual with pruning by an ID value.

So, fresh start. Here is the algorithm that I wanted to implement.

Input:
1. I have a set of tables that contain different types of data:
strings, numbers, images, URLs, etc. But otherwise they have the same structure.

2. Each table is identified by a unique flag.
strings - 1, numbers - 2, urls - 3 ...

Goal:
The goal of the query is to retrieve the set of flags that identify the presence or absence of rows satisfying a certain condition in all these tables. The input condition is a set of ID values. This input array might get quite big, so I wanted to load it into a lookup table.

Algorithm:
start loop k in IDs
    get ID(k);
    start loop n in Tables (1)
        if n has already been returned,
            continue to next Table;
        find one row in Table(n) that matches ID(k);
        if found, return n;
    end loop through Tables;
    if all combinations of n have been returned,
        exit;
end loop through IDs;

algorithm ends here.

(1) - Ns are the flags assigned to each table.

Thanks again for looking into this.

- Vladimir

Tom Kyte
November 11, 2005 - 10:05 am UTC

select 1
from t1
where id in ( set-of-ids )
and rownum = 1
UNION ALL
select 2
from t2
where id in ( set-of-ids )
and rownum = 1
UNION ALL
......


why does that not work? no procedural code, just sql.

Any other savings?

Vladimir Sadilovskiy, November 12, 2005 - 12:49 pm UTC

Tom,

Yes, it does work. I thought I could send less data to the server when binding the lookup table (which is actually an ARRAY). Do you think I should use a factored subquery, or populate a temp table?

Thanks,

- Vladimir


Tom Kyte
November 12, 2005 - 1:04 pm UTC

with subquery would be my first choice. You'd bind it once.

global temporary table would be 2nd choice.
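
Tom's first choice can be sketched as follows. This is only an illustration: the collection type id_tab and the bind name :ids are hypothetical, and it assumes the ID array is bound once from the client as a SQL collection type.

```sql
-- Sketch only: assumes a SQL collection type created beforehand, e.g.
--   create or replace type id_tab as table of number;
-- and that :ids is bound once from the client as an id_tab instance.
with ids as (
  select column_value as id
    from table(cast(:ids as id_tab))
)
select 1 as flag from table_1 where id in (select id from ids) and rownum = 1
union all
select 2 as flag from table_2 where id in (select id from ids) and rownum = 1;
```

The WITH subquery lets you bind the array once and reference it in every branch; each branch still stops at the first hit because of rownum = 1.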

HASH_AJ in Oracle10g

Alexander, November 25, 2005 - 11:23 am UTC

Dear Tom,

The HASH_AJ hint was very useful in Oracle9i and we used it frequently. But the Oracle 10g documentation states that the hint is deprecated.
Our small tests showed that it still works, but we would rather not use deprecated features.
Can you please advise a hint or tip we can use in place of HASH_AJ?

Tom Kyte
November 25, 2005 - 1:42 pm UTC

no hint at all!

Deprecated hash_aj hint

Jonathan Lewis, November 26, 2005 - 9:24 am UTC

Failing the 'no hint needed' approach, you could try the 10g version of the leading() hint combined with a simple hash hint, viz:

/*+ leading(tabX tabY) use_hash(tabY) */

The leading hint can list all the tables in the query, so if you copy the table order that you get from the current hash_aj hint and simply ask for a hash then
a) Oracle ought to use the order and
b) will have to do an antijoin, which has been hinted to use a hash join.

I've not tried this - so I won't guarantee that Oracle won't find an alternative transformation that puts the hints out of context.

I suppose it's always possible you may have to put the tables the "wrong" way round with a swap_join_inputs hint to make it work.
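
Spelled out against a typical NOT IN anti-join, the suggestion might look like the sketch below. The table and column names are made up for illustration, and, per the caveats above, there is no guarantee the optimizer honors the hints once it transforms the query (hint scoping may also require query-block names in practice).

```sql
-- Hypothetical sketch: leading() fixes the join order, use_hash() asks
-- for a hash join on the anti-joined row source, replacing hash_aj.
select /*+ leading(b s) use_hash(s) */ b.*
  from big_t b
 where b.id not in (select s.id from small_t s
                     where s.id is not null);
```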


Why is HASH_AJ hint not used?

Rambabu, September 08, 2006 - 4:39 am UTC

Hi,
Why is the HASH_AJ hint not being used here?

select count(1)
  from (select *
          from basket_detail
         where the_date between '01-aug-2003' and sysdate - 500
       ) h
 where h.hh_id not in
       (select /*+ HASH_AJ */ hh_id
          from basket_detail h1,
               prod_item pi
         where h.hh_id = h1.hh_id
           and h1.the_date between '01-aug-2003' and sysdate - 500
           and pi.merchandise_type = 'TRU'
           and pi.department_cd = 0
           and pi.class_cd = 1
           and pi.sub_class_cd = 95
           and h1.hh_id <> -1
           and h1.division_cd = 'TRU'
       )
   and division_cd = 'TRU'
   and hh_id <> -1
 group by h.hh_id




Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=706291 Card=1 Bytes=20)

1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 FILTER
4 3 PARTITION RANGE (ITERATOR)
5 4 TABLE ACCESS (FULL) OF 'BASKET_DETAIL' (
Cost=706234 Card=2390860 Bytes=47817200)

6 2 FILTER
7 6 MERGE JOIN* (CARTESIAN) (Cost=57 Card=4738 Bytes=118 :Q531577
450) 001

8 7 PARTITION RANGE* (ALL) :Q531577
001

9 8 TABLE ACCESS* (BY LOCAL INDEX ROWID) OF ':Q531577
BASKET_DETAIL' (Cost=50 Card=14 Bytes=168) 001

10 9 INDEX* (RANGE SCAN) OF 'FK2_HOU_MA_ID_HBD' (NO :Q531577
N-UNIQUE) (Cost=77 Card=21) 001

11 7 BUFFER* (SORT) (Cost=7 Card=327 Bytes=4251) :Q531577
001

12 11 INDEX* (RANGE SCAN) OF 'BT1_MT_DC_CC_SCC_PI' (NO :Q531577
N-UNIQUE) (Cost=1 Card=327 Bytes=4251) 000



7 PARALLEL_TO_SERIAL SELECT /*+ ORDERED NO_EXPAND USE_MERGE(A2) *
/ A1.C0,A1.C1,A1.C2,A2.C0,A2.C1,A2.C

8 PARALLEL_COMBINED_WITH_PARENT
9 PARALLEL_COMBINED_WITH_PARENT
10 PARALLEL_COMBINED_WITH_PARENT
11 PARALLEL_COMBINED_WITH_PARENT
12 PARALLEL_FROM_SERIAL






hash anti join seems to be unavailable with functions

andy b, November 21, 2007 - 9:12 am UTC

Tom,

I have made good use of the information on your site regarding the hash anti-join. However, in some cases I was unable to get the optimizer to use this plan. On investigation it appears to come down to whether the join is on a column or on an expression that uses a user-defined function, as in this example:


SQL> 
SQL> set echo on
SQL> set autotrace off
SQL> 
SQL> 
SQL> -- oracle 9r2
SQL> 
SQL> create table t1(t varchar2(20));

Table created.

SQL> insert into t1
  2  select substr(object_name,1,20) from all_objects;

26559 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create table t2(t varchar2(20));

Table created.

SQL> insert into t2
  2  select substr(object_name,1,20) from all_objects;

26560 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> analyze table t1 compute statistics;

Table analyzed.

SQL> analyze table t2 compute statistics;

Table analyzed.

SQL> 
SQL> 
SQL> set autotrace on
SQL> 
SQL> 
SQL> 
SQL> select count(*)
  2  from t2
  3  where not exists
  4  (select null from t1
  5  where t2.t = t1.t);

  COUNT(*)                                                                      
----------                                                                      
         1                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=63 Card=1 Bytes=38)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     HASH JOIN (ANTI) (Cost=63 Card=2 Bytes=76)                        
   3    2       TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=26560 Bytes=          
          504640)                                                               
                                                                                
   4    2       TABLE ACCESS (FULL) OF 'T1' (Cost=11 Card=26559 Bytes=          
          504621)                                                               
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
        198  consistent gets                                                    
         35  physical reads                                                     
          0  redo size                                                          
        379  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> 
SQL> 
SQL> select count(*)
  2  from t2
  3  where not exists
  4  (select null from t1
  5  where length(t2.t) = length(t1.t));

  COUNT(*)                                                                      
----------                                                                      
         0                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=63 Card=1 Bytes=38)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     HASH JOIN (ANTI) (Cost=63 Card=26294 Bytes=999172)                
   3    2       TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=26560 Bytes=          
          504640)                                                               
                                                                                
   4    2       TABLE ACCESS (FULL) OF 'T1' (Cost=11 Card=26559 Bytes=          
          504621)                                                               
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
        198  consistent gets                                                    
        252  physical reads                                                     
          0  redo size                                                          
        378  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> 
SQL> 
SQL> 
SQL> 
SQL> create or replace function myfn
  2  (t1 in varchar2)
  3  return varchar2
  4  deterministic
  5  as
  6  t2 number;
  7  begin
  8  t2 := length(t1);
  9  return t2;
 10  end;
 11  /

Function created.

SQL> 
SQL> select count(*)
  2  from t2
  3  where length(t) < 20;

  COUNT(*)                                                                      
----------                                                                      
      5964                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=11 Card=1 Bytes=19)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=1328 Bytes=252          
          32)                                                                   
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
         99  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
        380  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> select count(*)
  2  from t2
  3  where myfn(t) < 20;

  COUNT(*)                                                                      
----------                                                                      
      5964                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=11 Card=1 Bytes=19)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=1328 Bytes=252          
          32)                                                                   
                                                                                




Statistics
----------------------------------------------------------                      
         22  recursive calls                                                    
          0  db block gets                                                      
        110  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
        380  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          5  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> 
SQL> 
SQL> 
SQL> select count(*)
  2  from t2
  3  where not exists
  4  (select null from t1
  5  where length(t2.t) = myfn(t1.t));

  COUNT(*)                                                                      
----------                                                                      
         0                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=19)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     FILTER                                                            
   3    2       TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=1328 Bytes=2          
          5232)                                                                 
                                                                                
   4    2       TABLE ACCESS (FULL) OF 'T1' (Cost=11 Card=266 Bytes=50          
          54)                                                                   
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
     135974  consistent gets                                                    
          0  physical reads                                                     
          0  redo size                                                          
        378  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> 
SQL> 
SQL> 
SQL> create table t3
  2  as
  3  select myfn(t) as t
  4  from t1;

Table created.

SQL> 
SQL> analyze table t3 compute statistics;

Table analyzed.

SQL> 
SQL> 
SQL> 
SQL> select count(*)
  2  from t2
  3  where not exists
  4  (select null from t3
  5  where length(t2.t) = t3.t);

  COUNT(*)                                                                      
----------                                                                      
         0                                                                      


Execution Plan
----------------------------------------------------------                      
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=47 Card=1 Bytes=21)           
   1    0   SORT (AGGREGATE)                                                    
   2    1     HASH JOIN (ANTI) (Cost=47 Card=25162 Bytes=528402)                
   3    2       TABLE ACCESS (FULL) OF 'T2' (Cost=11 Card=26560 Bytes=          
          504640)                                                               
                                                                                
   4    2       TABLE ACCESS (FULL) OF 'T3' (Cost=6 Card=26559 Bytes=5          
          3118)                                                                 
                                                                                




Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
        143  consistent gets                                                    
        210  physical reads                                                     
          0  redo size                                                          
        378  bytes sent via SQL*Net to client                                   
        499  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

SQL> 
SQL> 
SQL> 
SQL> spool off




Is there a way of steering the optimizer towards the hash anti join plan?
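
One avenue consistent with the T3 experiment above is to precompute the function result inside an inline view, so that the anti-join predicate compares plain columns. This is only a sketch: depending on version and view merging it may still collapse back to the FILTER plan, in which case materializing the values into a real table (as with T3) remains the reliable fallback.

```sql
-- Sketch: evaluate myfn(t1.t) in a non-merged inline view so the
-- NOT EXISTS predicate becomes column = column, as in the T3 test.
select count(*)
  from t2
 where not exists
       (select null
          from (select /*+ no_merge */ myfn(t) as fn_t from t1) v
         where length(t2.t) = v.fn_t);
```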

SQL query tuning suggestions

Megala, March 15, 2008 - 9:38 am UTC

Tom:

The following SQL query is taking several hours.
Any suggestions on rewriting the query to make it efficient?

Database version: 9.2.0.6

SQL> select BB.CTLACCT, BB.INV_NUM, BB.INV_DT, BB.IVC_LI_TX, BB.IVC_LI_GRS_AM, BB.IVC_LI_DCT_AM,
  2  BB.ADJUSTMENT_TAXES, BB.ADJUSTMENT_AMOUNT, AA.TOTAL_ADJUSTMENT_AMOUNT
  3  from
  4  (select
  5  A.INV_NUM,sum(A.ADJUSTMENT_AMOUNT) AS TOTAL_ADJUSTMENT_AMOUNT
  6  from ( SELECT BDB_BILCYC.INV_NUM INV_NUM,
  7  (Q_LDOL.IVC_LI_GRS_AM + Q_LDOL.IVC_LI_DCT_AM + Q_LDOL.IVC_LI_CNY_TAX_AM + Q_LDOL.IVC_LI_CTY_TAX_AM+Q_LDOL.IVC_LI_ST_TAX_AM) ADJUSTMENT_AMOUNT
  8  FROM BDB_BILCYC, Q_LSEQ, Q_LDOL
  9  where exists (select null from EJ_BILL_INV WHERE EJ_BILL_INV.CTLACCT=BDB_BILCYC.CTLACCT and EJ_BILL_INV.STARTDATE=BDB_BILCYC.CYCLE_START_DT )
 10  and BDB_BILCYC.INV_NUM= Q_LSEQ.IVC_NO
 11  and BDB_BILCYC.INV_NUM= Q_LDOL.IVC_NO
 12  and BDB_BILCYC.INV_TYP_CD='2' ) A
 13  group by A.INV_NUM) AA,
 14  (SELECT BDB_BILCYC.CTLACCT CTLACCT, BDB_BILCYC.INV_NUM INV_NUM, TO_CHAR(BDB_BILCYC.INV_DT,'YYYY-MM-DD HH24:MI:SS') INV_DT, Q_LSEQ.IVC_LI_TX IVC_LI_TX, Q_LDOL.IVC_LI_GRS_AM IVC_LI_GRS_AM, Q_LDOL.IVC_LI_DCT_AM, (Q_LDOL.IVC_LI_CNY_TAX_AM + Q_LDOL.IVC_LI_CTY_TAX_AM + Q_LDOL.IVC_LI_ST_TAX_AM) ADJUSTMENT_TAXES,
 15  (Q_LDOL.IVC_LI_GRS_AM + Q_LDOL.IVC_LI_DCT_AM + Q_LDOL.IVC_LI_CNY_TAX_AM + Q_LDOL.IVC_LI_CTY_TAX_AM+Q_LDOL.IVC_LI_ST_TAX_AM) ADJUSTMENT_AMOUNT
 16  FROM (select CTLACCT, CYCLE_START_DT, INV_TYP_CD, INV_NUM, INV_DT from BDB_BILCYC where INV_TYP_CD='2' )
 17  BDB_BILCYC, Q_LSEQ, Q_LDOL
 18  where exists (select null from EJ_BILL_INV WHERE EJ_BILL_INV.CTLACCT=BDB_BILCYC.CTLACCT and EJ_BILL_INV.STARTDATE=BDB_BILCYC.CYCLE_START_DT )
 19  and BDB_BILCYC.INV_NUM= Q_LSEQ.IVC_NO and BDB_BILCYC.INV_NUM= Q_LDOL.IVC_NO ) BB
 20  where AA.INV_NUM=BB.INV_NUM
 21  /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4490 Card=1 Bytes=23
          0)

   1    0   SORT (GROUP BY) (Cost=4490 Card=1 Bytes=230)
   2    1     NESTED LOOPS (Cost=4482 Card=1 Bytes=230)
   3    2       NESTED LOOPS (Cost=4480 Card=1 Bytes=222)
   4    3         NESTED LOOPS (Cost=4477 Card=1 Bytes=146)
   5    4           HASH JOIN (SEMI) (Cost=4474 Card=1 Bytes=121)
   6    5             TABLE ACCESS (BY INDEX ROWID) OF 'BDB_BILCYC' (C
          ost=3 Card=1 Bytes=25)

   7    6               NESTED LOOPS (Cost=4440 Card=16 Bytes=1712)
   8    7                 NESTED LOOPS (Cost=4407 Card=11 Bytes=902)
   9    8                   HASH JOIN (SEMI) (Cost=4401 Card=2 Bytes=1
          00)

  10    9                     TABLE ACCESS (FULL) OF 'BDB_BILCYC' (Cos
          t=3262 Card=1776380 Bytes=63949680)

  11    9                     INDEX (FAST FULL SCAN) OF 'EJ_BILL_INV_E
          TL_IDX' (NON-UNIQUE) (Cost=33 Card=192002 Bytes=2688028)

  12    8                   TABLE ACCESS (BY INDEX ROWID) OF 'Q_LDOL'
          (Cost=3 Card=5 Bytes=160)

  13   12                     INDEX (RANGE SCAN) OF 'Q_LDOL_PK' (UNIQU
          E) (Cost=2 Card=5)

  14    7                 INDEX (RANGE SCAN) OF 'BDB_BILCYC_IDX5' (NON
          -UNIQUE) (Cost=2 Card=2)

  15    5             INDEX (FAST FULL SCAN) OF 'EJ_BILL_INV_ETL_IDX'
          (NON-UNIQUE) (Cost=33 Card=192002 Bytes=2688028)

  16    4           TABLE ACCESS (BY INDEX ROWID) OF 'Q_LDOL' (Cost=3
          Card=5 Bytes=125)

  17   16             INDEX (RANGE SCAN) OF 'Q_LDOL_PK' (UNIQUE) (Cost
          =2 Card=5)

  18    3         TABLE ACCESS (BY INDEX ROWID) OF 'Q_LSEQ' (Cost=3 Ca
          rd=7 Bytes=532)

  19   18           INDEX (RANGE SCAN) OF 'Q_LSEQ_PK' (UNIQUE) (Cost=2
           Card=7)

  20    2       INDEX (RANGE SCAN) OF 'Q_LSEQ_PK' (UNIQUE) (Cost=2 Car
          d=7 Bytes=56)



=====
  1* select table_name, num_rows from user_tables where table_name in ('BDB_BILCYC','Q_LSEQ','Q_LDOL','EJ_BILL_INV')
SQL> /

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
BDB_BILCYC                        3552760
EJ_BILL_INV                        192002
Q_LDOL                            1326401
Q_LSEQ                            1907054


Rows from BDB_BILCYC with INV_TYP_CD='2'  :
============================================
SQL> select count(*) from  BDB_BILCYC where INV_TYP_CD='2'  ;

  COUNT(*)
----------
     31108

SQL> @qry_ind
SQL> column column_name format a30
SQL> select table_name, index_name, column_name from user_ind_columns where table_name in
  2  ('BDB_BILCYC','Q_LSEQ','Q_LDOL','EJ_BILL_INV')
  3  order by table_name, index_name, column_position
  4  /

TABLE_NAME                     INDEX_NAME                     COLUMN_NAME
------------------------------ ------------------------------ ------------------------------
BDB_BILCYC                     BDB_BILCYC_ETL_IDX             CTLACCT
BDB_BILCYC                     BDB_BILCYC_ETL_IDX             CYCLE_START_DT
BDB_BILCYC                     BDB_BILCYC_ETL_IDX             INV_NUM
BDB_BILCYC                     BDB_BILCYC_IDX1                CYCLE_START_DT
BDB_BILCYC                     BDB_BILCYC_IDX2                CTLACCT
BDB_BILCYC                     BDB_BILCYC_IDX4                INV_TYP_CD
BDB_BILCYC                     BDB_BILCYC_IDX5                INV_NUM
EJ_BILL_INV                    EEJ_BILL_INV_IDX1              STARTDATE
EJ_BILL_INV                    EEJ_BILL_INV_IDX1              ENDDATE
EJ_BILL_INV                    EEJ_BILL_INV_IDX3              INV_NUM
EJ_BILL_INV                    EEJ_BILL_INV_IDX4              SITE_ID
EJ_BILL_INV                    EEJ_BILL_INV_IDX4              BEID
EJ_BILL_INV                    EJ_BILL_INV_ETL_IDX            CTLACCT
EJ_BILL_INV                    EJ_BILL_INV_ETL_IDX            STARTDATE
EJ_BILL_INV                    EJ_BILL_INV_FBIDX              INV_NUM
EJ_BILL_INV                    EJ_BILL_INV_FBIDX              SYS_NC00059$
EJ_BILL_INV                    EJ_BILL_INV_IDX5               CTLACCT
EJ_BILL_INV                    EJ_BILL_INV_IDX6               SYS_NC00058$
Q_LDOL                         Q_LDOL_PK                      IVC_NO
Q_LDOL                         Q_LDOL_PK                      IVC_LI_SEQ_NO
Q_LSEQ                         Q_LSEQ_PK                      IVC_NO
Q_LSEQ                         Q_LSEQ_PK                      IVC_LI_SEQ_NO

22 rows selected.



Tom Kyte
March 15, 2008 - 9:58 am UTC

Are the card= values in the plan close to reality? I'm guessing "no"; otherwise this would be going "pretty fast".

Query tuning

Rich, September 09, 2020 - 2:51 am UTC

Can you please help with tuning this query to make it more efficient?

It returns only a few rows.

PRICING_DATE on TEST.FFINCPRC is the partition key column. This is on Oracle 12.2 on Exadata.

I also noticed the following:

1) The same query runs slowly with varying inputs, even though stats are up to date.

2) On first execution the query always takes longer. Is there any way to get a consistent response time even on the first execution?


SQL> WITH LAST_TICKS_BY_DAY AS (
SELECT MAX(LAST_PRICE_TIME) AS LAST_PRICE_TIME, MAX(PRICING_DATE) AS PRICING_DATE
FROM TEST.FFINCPRC A
WHERE A.INSTRUMENT_ID = 8553958
AND A.PRICING_DATE BETWEEN TO_DATE('2020-08-24', 'YYYY-MM-DD') AND TO_DATE('2020-08-28', 'YYYY-MM-DD')
AND LAST_PRICE_TIME >= TO_TIMESTAMP('24-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF PM')
AND LAST_PRICE_TIME <= TO_TIMESTAMP('27-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF PM')
AND A.PRICE_TYPE_ID in(5 , 6 )
GROUP BY TRUNC(LAST_PRICE_TIME)
)
SELECT p.INSTRUMENT_ID, p.SOURCE, p.BID_PRICE, p.LAST_PRICE_TIME, p.ASK_PRICE,
p.MID_PRICE, p.BID_SPREAD, p.ASK_SPREAD, p.MID_SPREAD, p.BID_YIELD, p.ASK_YIELD, p.MID_YIELD,
p.ASK_SPREAD_CURVE, p.T_SPREAD, p.MID_T_SPREAD, p.ASK_T_SPREAD, p.TEST_CONDITION_CODE,
p.CURVE_YIELD, p.PRICE_TYPE, p.BID_SPREAD_CURVE, p.MID_SPREAD_CURVE, p.PRICE_TYPE_ID,
p.BID_DISCOUNT_RATE, p.ASK_DISCOUNT_RATE, p.EOD_LATE_BID_PRICE_DELTA, p.SPREAD,
a.PRICING_SPREAD_TYPE, a.BENCHMARK_NAME, a.TBA_BACK_BENCH, a.DIMINIMIS_ELIGIBLE_FLAG,
a.HALF_DEMINIMIS_ELIGIBLE_FLAG, a.BENCHMARK_TBA, a.PAYUP
FROM TEST.FFINCPRC p
LEFT JOIN TEST.FPRCASSUMP a
ON p.ASSUMP_ID = a.ASSUMP_ID
JOIN LAST_TICKS_BY_DAY B
ON P.LAST_PRICE_TIME = B.LAST_PRICE_TIME
WHERE p.INSTRUMENT_ID = 8553958
AND p.PRICING_DATE = B.PRICING_DATE
AND p.PRICE_TYPE_ID in(5, 6)
ORDER BY A.LAST_PRICE_TIME ASC
/



Elapsed: 00:00:00.24

Execution Plan
----------------------------------------------------------
Plan hash value: 86494900

--------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 207 | 46 (5)| 00:00:01 | | |
| 1 | SORT ORDER BY | | 1 | 207 | 46 (5)| 00:00:01 | | |
| 2 | NESTED LOOPS OUTER | | 1 | 207 | 45 (3)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 1 | 168 | 41 (3)| 00:00:01 | | |
| 4 | VIEW | | 1 | 22 | 36 (3)| 00:00:01 | | |
| 5 | HASH GROUP BY | | 1 | 29 | 36 (3)| 00:00:01 | | |
| 6 | PARTITION RANGE ITERATOR | | 1 | 29 | 35 (0)| 00:00:01 | 2428 | 2432 |
| 7 | INLIST ITERATOR | | | | | | | |
|* 8 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| FFINCPRC | 1 | 29 | 35 (0)| 00:00:01 | 2428 | 2432 |
| 9 | SORT CLUSTER BY ROWID | | 1 | | 17 (0)| 00:00:01 | | |
|* 10 | INDEX RANGE SCAN | FF_INC_PRICE_IDX1 | 1 | | 17 (0)| 00:00:01 | 2428 | 2432 |
| 11 | PARTITION RANGE ITERATOR | | 1 | 146 | 5 (0)| 00:00:01 | KEY | KEY |
| 12 | INLIST ITERATOR | | | | | | | |
|* 13 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | FFINCPRC | 1 | 146 | 5 (0)| 00:00:01 | KEY | KEY |
|* 14 | INDEX RANGE SCAN | FF_INC_PRICE_IDX1 | 1 | | 4 (0)| 00:00:01 | KEY | KEY |
| 15 | TABLE ACCESS BY GLOBAL INDEX ROWID | FPRCASSUMP | 1 | 39 | 4 (0)| 00:00:01 | ROWID | ROWID |
|* 16 | INDEX UNIQUE SCAN | FPRCASSUMP_PK | 1 | | 3 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("A"."PRICING_DATE"<=TO_DATE(' 2020-08-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
10 - access(("A"."PRICE_TYPE_ID"=5 OR "A"."PRICE_TYPE_ID"=6) AND "A"."INSTRUMENT_ID"=8553958 AND
"LAST_PRICE_TIME">=TO_TIMESTAMP('24-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF AM') AND
"LAST_PRICE_TIME"<=TO_TIMESTAMP('27-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF AM'))
13 - filter("P"."PRICING_DATE"="B"."PRICING_DATE")
14 - access(("P"."PRICE_TYPE_ID"=5 OR "P"."PRICE_TYPE_ID"=6) AND "P"."INSTRUMENT_ID"=8553958 AND
"P"."LAST_PRICE_TIME"="B"."LAST_PRICE_TIME")
16 - access("P"."ASSUMP_ID"="A"."ASSUMP_ID"(+))


Statistics
----------------------------------------------------------
10 recursive calls
0 db block gets
52412 consistent gets
0 physical reads
0 redo size
3308 bytes sent via SQL*Net to client
608 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
35 sorts (memory)
0 sorts (disk)
3 rows processed

SQL>


2nd time execution with different inputs to the same query:
==========================================================


WITH LAST_TICKS_BY_DAY AS (
SELECT MAX(LAST_PRICE_TIME) AS LAST_PRICE_TIME, MAX(PRICING_DATE) AS PRICING_DATE
FROM TEST.FFINCPRC A
WHERE A.INSTRUMENT_ID = 36316463
AND A.PRICING_DATE BETWEEN TO_DATE('2020-08-24', 'YYYY-MM-DD') AND TO_DATE('2020-08-28', 'YYYY-MM-DD')
AND LAST_PRICE_TIME >= TO_TIMESTAMP('24-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF PM')
AND LAST_PRICE_TIME <= TO_TIMESTAMP('27-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF PM')
AND A.PRICE_TYPE_ID in(5 , 6 )
GROUP BY TRUNC(LAST_PRICE_TIME)
)
SELECT p.INSTRUMENT_ID, p.SOURCE, p.BID_PRICE, p.LAST_PRICE_TIME, p.ASK_PRICE,
p.MID_PRICE, p.BID_SPREAD, p.ASK_SPREAD, p.MID_SPREAD, p.BID_YIELD, p.ASK_YIELD, p.MID_YIELD,
p.ASK_SPREAD_CURVE, p.T_SPREAD, p.MID_T_SPREAD, p.ASK_T_SPREAD, p.TEST_CONDITION_CODE,
p.CURVE_YIELD, p.PRICE_TYPE, p.BID_SPREAD_CURVE, p.MID_SPREAD_CURVE, p.PRICE_TYPE_ID,
p.BID_DISCOUNT_RATE, p.ASK_DISCOUNT_RATE, p.EOD_LATE_BID_PRICE_DELTA, p.SPREAD,
a.PRICING_SPREAD_TYPE, a.BENCHMARK_NAME, a.TBA_BACK_BENCH, a.DIMINIMIS_ELIGIBLE_FLAG,
a.HALF_DEMINIMIS_ELIGIBLE_FLAG, a.BENCHMARK_TBA, a.PAYUP
FROM TEST.FFINCPRC p
LEFT JOIN TEST.FPRCASSUMP a
ON p.ASSUMP_ID = a.ASSUMP_ID
JOIN LAST_TICKS_BY_DAY B
ON P.LAST_PRICE_TIME = B.LAST_PRICE_TIME
WHERE p.INSTRUMENT_ID = 36316463
AND p.PRICING_DATE = B.PRICING_DATE
AND p.PRICE_TYPE_ID in(5, 6)
ORDER BY A.LAST_PRICE_TIME ASC
/


Elapsed: 00:00:13.17

Execution Plan
----------------------------------------------------------
Plan hash value: 86494900

--------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 207 | 46 (5)| 00:00:01 | | |
| 1 | SORT ORDER BY | | 1 | 207 | 46 (5)| 00:00:01 | | |
| 2 | NESTED LOOPS OUTER | | 1 | 207 | 45 (3)| 00:00:01 | | |
| 3 | NESTED LOOPS | | 1 | 168 | 41 (3)| 00:00:01 | | |
| 4 | VIEW | | 1 | 22 | 36 (3)| 00:00:01 | | |
| 5 | HASH GROUP BY | | 1 | 29 | 36 (3)| 00:00:01 | | |
| 6 | PARTITION RANGE ITERATOR | | 1 | 29 | 35 (0)| 00:00:01 | 2428 | 2432 |
| 7 | INLIST ITERATOR | | | | | | | |
|* 8 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| FFINCPRC | 1 | 29 | 35 (0)| 00:00:01 | 2428 | 2432 |
| 9 | SORT CLUSTER BY ROWID | | 1 | | 17 (0)| 00:00:01 | | |
|* 10 | INDEX RANGE SCAN | FF_INC_PRICE_IDX1 | 1 | | 17 (0)| 00:00:01 | 2428 | 2432 |
| 11 | PARTITION RANGE ITERATOR | | 1 | 146 | 5 (0)| 00:00:01 | KEY | KEY |
| 12 | INLIST ITERATOR | | | | | | | |
|* 13 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | FFINCPRC | 1 | 146 | 5 (0)| 00:00:01 | KEY | KEY |
|* 14 | INDEX RANGE SCAN | FF_INC_PRICE_IDX1 | 1 | | 4 (0)| 00:00:01 | KEY | KEY |
| 15 | TABLE ACCESS BY GLOBAL INDEX ROWID | FPRCASSUMP | 1 | 39 | 4 (0)| 00:00:01 | ROWID | ROWID |
|* 16 | INDEX UNIQUE SCAN | FPRCASSUMP_PK | 1 | | 3 (0)| 00:00:01 | | |
--------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("A"."PRICING_DATE"<=TO_DATE(' 2020-08-28 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
10 - access(("A"."PRICE_TYPE_ID"=5 OR "A"."PRICE_TYPE_ID"=6) AND "A"."INSTRUMENT_ID"=36316463 AND
"LAST_PRICE_TIME">=TO_TIMESTAMP('24-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF AM') AND
"LAST_PRICE_TIME"<=TO_TIMESTAMP('27-AUG-20 11.59.59.000000000 PM','DD-MON-YY HH.MI.SS.FF AM'))
13 - filter("P"."PRICING_DATE"="B"."PRICING_DATE")
14 - access(("P"."PRICE_TYPE_ID"=5 OR "P"."PRICE_TYPE_ID"=6) AND "P"."INSTRUMENT_ID"=36316463 AND
"P"."LAST_PRICE_TIME"="B"."LAST_PRICE_TIME")
16 - access("P"."ASSUMP_ID"="A"."ASSUMP_ID"(+))


Statistics
----------------------------------------------------------
10 recursive calls
0 db block gets
56697 consistent gets
229467 physical reads
0 redo size
3287 bytes sent via SQL*Net to client
608 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
10724 sorts (memory)
0 sorts (disk)
3 rows processed

Connor McDonald
September 10, 2020 - 6:51 am UTC

please ask a fresh question, and include output from a gather_plan_statistics hint. (Examples of that on this site)
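
For reference, a minimal sketch of what that looks like; the hint makes Oracle record actual row counts and buffer gets per plan step (the query shown is a cut-down stand-in for the one above):

```sql
-- Run the statement with the hint so runtime statistics are captured
SELECT /*+ gather_plan_statistics */ COUNT(*)
FROM   test.ffincprc
WHERE  instrument_id = 8553958
AND    price_type_id IN (5, 6);

-- Then, in the same session, compare estimated (E-Rows) with actual (A-Rows)
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Large gaps between estimated and actual rows on a plan line usually point at the step that needs attention.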

Continuation..

Rich, September 09, 2020 - 3:10 am UTC

Sorry, I forgot to mention: the same query that ran fine in the session just before takes forever when rerun right after in the same session, and the plan also changes. There was no change to the data in the partition during the run. The plan changes to HASH join and PARTITION RANGE ALL. Any idea what could be causing this?


Any suggestions to Improve query performance

Mike, April 11, 2023 - 4:47 am UTC

Thanks for your insights on this issue.

I have an application requirement (SLA) to keep query response times consistently within 200 milliseconds.
The query pulls data for a date range (weekly, 1 month, 6 months, 10 years, etc.) based on user choice.
The tables behind this view have about 30-40 billion rows. The user query has to range scan the local index partitions
(EFF_DT_START in the view is the partition key, and all indexes on these tables are LOCAL).

I traced the query with a 10046 trace and observed that Oracle spends a lot of time reading from the LOCAL index partitions (it has to scan almost all of them).

Questions :-

1) Is there any better way to optimize/rewrite the view to speed up the index partition scans?

The start date is always passed as 18991231 (yes, 12/31/1899) from the TablePlus application (an analytics tool) to read all the history up to the end date.

2) Since this database is on Exadata, would keeping the table and its LOCAL indexes in the Exadata flash cache help provide consistent performance to meet the SLA?




-- View code

  CREATE OR REPLACE FORCE EDITIONABLE VIEW "curs"."cur_primary_vw" ("cur_ENTITY_ID", "EXTERNAL_NAME", "EXTERNAL_ID", "cur_DATE", "EFF_DT_START", "TENOR_DATE", "RATE_SIDE", "PAR_YIELD", "BENCHMARK_YIELD", "ZERO_YIELD", "FORWARD_RATE", "DISCOUNT_FACTOR", "TENOR_ID", "TENOR_NAME", "DISCOUNT_MARGIN", "Z_SPREAD", "ASSET_SWAP_SPREAD", "IS_CDS_cur", "PAR_SPREAD_BID", "PAR_SPREAD_MID", "PAR_SPREAD_OFFER") DEFAULT COLLATION "USING_NLS_COMP"  AS
  SELECT T.cur_ENTITY_ID, C.EXTERNAL_NAME, C.EXTERNAL_ID, TRUNC(T.EFF_DT_START), T.EFF_DT_START, T.TENOR_DATE, T.RATE_SIDE, T.PAR_YIELD, T.BENCHMARK_YIELD,
  T.ZERO_YIELD, T.FORWARD_RATE, T.DISCOUNT_FACTOR, T.TENOR_ID, T.TENOR_NAME, T.DISCOUNT_MARGIN, T.Z_SPREAD, T.ASSET_SWAP_SPREAD, 'N' AS IS_CDS_cur,
       NULL AS PAR_SPREAD_BID,
       NULL AS PAR_SPREAD_MID,
       NULL AS PAR_SPREAD_OFFER
FROM curs.F_cur_YIELD_BI T
INNER JOIN curs.cur_BI C ON (
    C.cur_ENTITY_ID = T.cur_ENTITY_ID and
    C.EFF_DT_START = T.EFF_DT_START and
    SYS_EXTRACT_UTC(SYSTIMESTAMP) < C.RECORDED_DATE_END)
INNER JOIN curs.MILESTONES M ON (
    M.MILESTONE_NAME = C.PRIMARY_MILESTONE and
    M.MILESTONE_START = C.EFF_DT_START and
    SYS_EXTRACT_UTC(SYSTIMESTAMP) < M.RECORDED_DATE_END)
WHERE T.RECORDED_DATE_END > SYS_EXTRACT_UTC(SYSTIMESTAMP)
UNION
SELECT cdc.cma_entity_id AS cur_ENTITY_ID,
       cdc.idd_cur_ticker AS EXTERNAL_NAME,
       NULL AS EXTERNAL_ID,
       TRUNC(cp.EFF_DT_START) AS cur_DATE,
       cp.EFF_DT_START,
       ct.maturity_date AS TENOR_DATE,
       NULL AS RATE_SIDE,
       NULL AS PAR_YIELD,
       NULL AS BENCHMARK_YIELD,
       NULL AS ZERO_YIELD,
       NULL AS FORWARD_RATE,
       NULL AS DISCOUNT_FACTOR,
       ct.Tenor AS TENOR_ID,
       CASE WHEN ct.tenor < 1 THEN
            (ct.tenor/0.25)*3 || 'M'
            ELSE ct.tenor || 'Y'
       END AS TENOR_NAME,
       NULL AS DISCOUNT_MARGIN,
       NULL Z_SPREAD,
       NULL AS ASSET_SWAP_SPREAD,
       'Y' AS IS_CDS_cur,
       cp.par_spread_bid AS PAR_SPREAD_BID,
       cp.par_spread_mid AS PAR_SPREAD_MID,
       cp.par_spread_offer AS PAR_SPREAD_OFFER
     FROM curs.cds_cur cdc
    INNER JOIN curs.cds_tenor ct
      ON ct.idd_cur_ticker = cdc.idd_cur_ticker
     AND cdc.recorded_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
     AND cdc.effective_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
     AND ct.tenor != 0
     AND ct.recorded_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
     AND ct.effective_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
     AND cdc.display_in_cur_viewer = 'Y'
    INNER JOIN curs.cds_prc cp
      ON cp.cma_entity_id = cdc.cma_entity_id
     AND cp.seniority = cdc.seniority
     AND cp.restructuring_type = cdc.restructuring_type
     AND cp.currency = cdc.currency
     AND cp.tenor = ct.tenor
     AND cp.recorded_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
     AND cp.effective_date_end > SYS_EXTRACT_UTC(SYSTIMESTAMP)
;

--

-- application query
var entityId NUMBER
var externalName  VARCHAR2(50)
var tenorPattern NUMBER
var dateFrom VARCHAR2(8)
var dateTo VARCHAR2(8)


exec :entityId := 368486731
exec :tenorPattern := 30
exec :dateFrom := '18991231'
exec :dateTo := '20220403'

        SELECT
                TO_CHAR("_".cur_DATE, 'YYYYMMDD') AS "dateTime",
                TO_CHAR("_".TENOR_DATE, 'YYYYMMDD') AS "tenorDate",
                "_".TENOR_NAME AS "tenor",
                DECODE("_".Z_SPREAD, 0, NULL, TO_CHAR("_".Z_SPREAD, 'FM9999990D99999999999999999')) AS "zSpread",
                DECODE("_".ASSET_SWAP_SPREAD, 0, NULL, TO_CHAR("_".ASSET_SWAP_SPREAD, 'FM9999990D99999999999999999')) AS "assetSwapSpread",
                DECODE("_".ZERO_YIELD, 0, NULL, TO_CHAR("_".ZERO_YIELD, 'FM9999990D99999999999999999')) AS "zeroRate",
                DECODE("_".PAR_YIELD, 0, NULL, TO_CHAR("_".PAR_YIELD, 'FM9999990D99999999999999999')) AS "parYield",
                DECODE("_".DISCOUNT_FACTOR, 0, NULL, TO_CHAR("_".DISCOUNT_FACTOR, 'FM9999990D99999999999999999')) AS "discountFactor",
                DECODE("_".FORWARD_RATE, 0, NULL, TO_CHAR("_".FORWARD_RATE, 'FM9999990D99999999999999999')) AS "forwardRate"
        FROM
                "curS"."cur_primary_vw" "_"
        WHERE
                "_"."cur_ENTITY_ID" = :entityId
                AND "_"."TENOR_ID" = :tenorPattern
                AND "_"."EFF_DT_START" BETWEEN TO_TIMESTAMP(:dateFrom, 'YYYYMMDD')
                AND TO_TIMESTAMP(:dateTo, 'YYYYMMDD') + 1
        ORDER BY
                "_"."EFF_DT_START" ASC,
                "_"."TENOR_DATE" ASC
;


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.02       0.03          0          0          0           0
Fetch       20      0.80       2.39      12738      15824          0         272
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       22      0.83       2.43      12738      15824          0         272

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 119
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
       272        272        272  SORT ORDER BY (cr=15824 pr=12738 pw=0 time=2396887 us starts=1 cost=13749 size=559234 card=3562)
       272        272        272   FILTER  (cr=15824 pr=12738 pw=0 time=2396022 us starts=1)
       272        272        272    VIEW  CUR_PRIMARY_VW (cr=15824 pr=12738 pw=0 time=2395992 us starts=1 cost=13748 size=559234 card=3562)
       272        272        272     SORT UNIQUE (cr=15824 pr=12738 pw=0 time=2395972 us starts=1 cost=13748 size=676792 card=3562)
       272        272        272      UNION-ALL  (cr=15824 pr=12738 pw=0 time=2395455 us starts=1)
       272        272        272       FILTER  (cr=6829 pr=3752 pw=0 time=2115244 us starts=1)
       272        272        272        HASH JOIN  (cr=6829 pr=3752 pw=0 time=2115211 us starts=1 cost=10176 size=676590 card=3561)
       544        544        544         PARTITION RANGE ITERATOR PARTITION: KEY KEY (cr=3811 pr=3637 pw=0 time=1845227 us starts=1 cost=5989 size=322608 card=3432)
       544        544        544          TABLE ACCESS BY LOCAL INDEX ROWID BATCHED F_CUR_YIELD_BI PARTITION: KEY KEY (cr=3811 pr=3637 pw=0 time=1844355 us starts=798 cost=5989 size=322608 card=3432)
       544        544        544           INDEX RANGE SCAN F_CUR_YIELD_BI_PK PARTITION: KEY KEY (cr=3267 pr=3093 pw=0 time=1762722 us starts=798 cost=2451 size=0 card=3564)(object id 30515935)
       272        272        272         HASH JOIN SEMI (cr=3018 pr=115 pw=0 time=268448 us starts=1 cost=4188 size=340992 card=3552)
       544        544        544          PARTITION RANGE ITERATOR PARTITION: KEY KEY (cr=2503 pr=115 pw=0 time=238530 us starts=1 cost=4053 size=237984 card=3552)
       544        544        544           TABLE ACCESS BY LOCAL INDEX ROWID BATCHED CUR_BI PARTITION: KEY KEY (cr=2503 pr=115 pw=0 time=237957 us starts=772 cost=4053 size=237984 card=3552)
       544        544        544            INDEX RANGE SCAN CUR_BI_PK PARTITION: KEY KEY (cr=2317 pr=2 pw=0 time=186779 us starts=772 cost=1547 size=0 card=3561)(object id 30523111)
     89418      89418      89418          INDEX STORAGE FAST FULL SCAN MILESTONES_IDX2 (cr=515 pr=0 pw=0 time=13980 us starts=1 cost=135 size=2593180 card=89420)(object id 47888672)
         0          0          0       FILTER  (cr=8995 pr=8986 pw=0 time=280028 us starts=1)
         0          0          0        NESTED LOOPS  (cr=8995 pr=8986 pw=0 time=280018 us starts=1 cost=3569 size=202 card=1)
         0          0          0         NESTED LOOPS  (cr=8995 pr=8986 pw=0 time=280016 us starts=1 cost=3569 size=202 card=1)
         0          0          0          NESTED LOOPS  (cr=8995 pr=8986 pw=0 time=280014 us starts=1 cost=2587 size=114 card=1)
         0          0          0           TABLE ACCESS STORAGE FULL CDS_CUR (cr=8995 pr=8986 pw=0 time=280011 us starts=1 cost=2496 size=63 card=1)
         0          0          0           TABLE ACCESS BY INDEX ROWID BATCHED CDS_TENOR (cr=0 pr=0 pw=0 time=0 us starts=0 cost=91 size=2091 card=41)
         0          0          0            INDEX RANGE SCAN CDS_TENOR_IDX2 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=4 size=0 card=1024)(object id 47886579)
         0          0          0          PARTITION RANGE ITERATOR PARTITION: KEY KEY (cr=0 pr=0 pw=0 time=0 us starts=0 cost=981 size=0 card=1)
         0          0          0           INDEX RANGE SCAN cds_prc_PK PARTITION: KEY KEY (cr=0 pr=0 pw=0 time=0 us starts=0 cost=981 size=0 card=1)(object id 30942578)
         0          0          0         TABLE ACCESS BY LOCAL INDEX ROWID cds_prc PARTITION: 1 1 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=982 size=88 card=1)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                      184        0.00          0.00
  PGA memory operation                           61        0.00          0.00
  library cache lock                             22        0.00          0.00
  IPC group service call                         12        0.00          0.00
  library cache pin                              30        0.00          0.00
  SQL*Net message to client                      20        0.00          0.00
  gc cr grant 2-way                            2144        0.00          0.26
  cell single block physical read              3142        0.00          1.21
  gc current block 3-way                        516        0.00          0.10
  gc current block 2-way                        499        0.00          0.04
  gc current grant 2-way                          2        0.00          0.00
  gc cr multi block grant                       243        0.00          0.03
  cell list of blocks physical read              75        0.00          0.07
  cell multiblock physical read                 102        0.00          0.14
  SQL*Net message from client                    20      106.48        106.49
********************************************************************************

Chris Saxon
April 13, 2023 - 1:34 pm UTC

With such large date ranges, you get little/no benefit from partition pruning.

The scan of F_CUR_YIELD_BI_PK PARTITION reads 798 partitions (starts=798) which uses 3267 gets (cr=3267). This works out to (3267/798) ~4 gets (I/Os) per partition. That's already low; it'll be hard to do much less work as-is.

Have you tried making this a global index instead of local? This could significantly reduce the work this operation does.

The same could be said for CUR_BI_PK.

Another point - why UNION in the view instead of UNION ALL? It looks to me like both subqueries read different tables - can they really return the same data?
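
To make both suggestions concrete, a sketch (the index name is made up, and the exact column list is an assumption):

```sql
-- Suggestion 1: a global index is probed once instead of once per partition
-- (index name and column list are illustrative assumptions)
CREATE UNIQUE INDEX curs.f_cur_yield_bi_gidx
  ON curs.f_cur_yield_bi (cur_entity_id, eff_dt_start, tenor_id, recorded_date_start);

-- Suggestion 2: the two branches of the view tag rows with 'N' and 'Y'
-- respectively for IS_CDS_CUR, so they can never produce duplicate rows and
-- the UNION (which forces a SORT UNIQUE over every row) can safely become:
--   SELECT ..., 'N' AS IS_CDS_CUR, ... FROM curs.f_cur_yield_bi ...
--   UNION ALL
--   SELECT ..., 'Y' AS IS_CDS_CUR, ... FROM curs.cds_cur ...
```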


Mike, April 13, 2023 - 3:34 pm UTC

Hi Chris,

Thanks.

>> Have you tried making this a global index instead of local?

I have not tried it. The indexes were kept LOCAL because of their size (the F_CUR_YIELD_BI_PK index partitions total almost 1.4 TB) and for ease of maintenance. I understand that reading one global index should do less work than reading hundreds of local index partitions.

I agree that the UNION can be changed to UNION ALL for this query, as only one branch of the view will return rows.

If you have any other suggestions in terms of query optimization, that would be helpful.

Thanks!





Chris Saxon
April 14, 2023 - 1:04 pm UTC

Global index maintenance has improved greatly over the releases.

I'm guessing a bit here, but the name (F_CUR_YIELD_BI_PK) suggests this is the primary key. To create a unique local index you have to include the partition key columns, which is EFF_DT_START.

Suggesting this is a compound constraint over ( ???, EFF_DT_START ). If so, have you looked into index compression to reduce the size of the global index?

Mike, April 14, 2023 - 3:01 pm UTC

Hi Chris,

Correct, the indexes in the plan are all local primary key indexes. EFF_DT_START is the partition key column and is already part of the primary key index; it sits in the middle of the PK's indexed columns.

Here are the Primary key indexed columns in the order as is:

F_CUR_YIELD_BI_PK on (CUR_ENTITY_ID,EFF_DT_START,TENOR_ID,RECORDED_DATE_START) columns

CUR_BI_PK on (CUR_ENTITY_ID,EFF_DT_START,RECORDED_DATE_START) columns

CDS_PRC_PK on (CMA_ENTITY_ID,SENIORITY,RESTRUCTURING_TYPE, CURRENCY, TENOR,RECORDED_DATE_START,EFF_DT_START) columns


Question - for performance reasons, is it always good practice to place the partition key column as the trailing (last) column in the index column order?

>> If so, Have you looked into index compression to reduce the size of the global index?

I believe you are referring to advanced index compression. We can try it with global indexes, but I am not sure how data load performance would be with a compressed global index in place.

Just FYI, this is Exadata environment and database is on 19.11.0.0.

Thank you.

Chris Saxon
April 17, 2023 - 2:03 pm UTC

You could use advanced index compression or plain key compression:

CREATE INDEX ... COMPRESS n

where n is the number of leading columns to compress. If (on average) many rows share the same values for these n columns, you can get decent space savings.
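
One way to pick n is to let Oracle recommend it; a sketch (note that ANALYZE ... VALIDATE STRUCTURE locks the table against DML while it runs, so use a quiet period or a test copy):

```sql
-- Populate INDEX_STATS for this index (blocks DML on the table while running)
ANALYZE INDEX curs.f_cur_yield_bi_pk VALIDATE STRUCTURE;

-- Oracle's recommendation: optimal prefix length and the % space it would save
SELECT opt_cmpr_count, opt_cmpr_pctsave FROM index_stats;

-- Apply the recommendation, e.g. if opt_cmpr_count came back as 2
ALTER INDEX curs.f_cur_yield_bi_pk REBUILD COMPRESS 2;
```

INDEX_STATS holds only the last analyzed index and is session-private, so query it immediately after the ANALYZE.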

Mike, April 18, 2023 - 5:21 am UTC

>> You could use advanced index compression or plain key compression:
CREATE INDEX ... COMPRESS n
where n is the number of leading columns to compress. If (on average) many rows share the same values for these n columns, you can get decent space savings.


Question -
I created a global index with columns in the order (cur_entity_id, tenor_id, eff_dt_start). Based on the DBA_TAB_COL_STATISTICS output shown below for those columns:

1) What compression count would you recommend?
Is 2 a good value to start with?

2) The global index comes to 1 TB. Besides index compression, would partitioning the global index (global hash partitioning with 4 partitions, for example) help keep the size under control?

OWNER  TABLE_NAME                COLUMN_NAME             NUM_DISTINCT       NUM_NULLS     NUM_BUCKETS     SAMPLE_SIZE GLO     AVG_COL_LEN HISTOGRAM
------ ------------------------- -------------------- --------------- --------------- --------------- --------------- --- --------------- ---------------
CURS   F_CUR_POINT_YIELD_BI      CUR_ENTITY_ID               623279               0             255     29499182836 YES               7 HYBRID
CURS   F_CUR_POINT_YIELD_BI      TENOR_ID                        80               0              80     29499182836 YES               4 FREQUENCY
CURS   F_CUR_POINT_YIELD_BI      EFF_DT_START                 47010               0             254     29499182836 YES              11 HYBRID



Thank you.
Chris Saxon
April 18, 2023 - 12:50 pm UTC

1 - it's how many leading columns contain duplicate values that matters. If the combination of (CUR_ENTITY_ID,EFF_DT_START) is (close to) unique, compressing these two columns will actually increase the size of the index!

Richard Foote explains more & how to find the optimum number to compress
https://richardfoote.wordpress.com/2017/08/02/basic-index-compression-made-simple-it-aint-easy/

2 - Well, the overall size will still be ~1 TB. Hash partitioning will split it into more manageable chunks though - that's part of the point of partitioning! Experiment with different numbers of partitions to see what works best for you.
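
A sketch of the hash-partitioned global index from point 2 (table and column names are taken from the statistics output above; the index name and COMPRESS 2 are only example values):

```sql
-- Global hash-partitioned index: total size is unchanged, but each of the
-- four partitions can be rebuilt or moved independently. The hash key must
-- be a left prefix of the index columns, so cur_entity_id is used here.
CREATE INDEX curs.f_cur_point_yield_bi_gx
  ON curs.f_cur_point_yield_bi (cur_entity_id, tenor_id, eff_dt_start)
  COMPRESS 2
  GLOBAL PARTITION BY HASH (cur_entity_id) PARTITIONS 4;
```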

Mike, April 21, 2023 - 10:56 pm UTC

Thanks Chris for the details!

TO_TIMESTAMP function on the input bind value

A reader, May 01, 2023 - 3:14 am UTC

I have the query below, and in the predicate information section I see the following.

Any reason why Oracle applies TO_CHAR to the TO_DATE result before converting the input bind value with TO_TIMESTAMP?

Or is there a better way to rewrite the TO_TIMESTAMP conversion of the received input value?

"
"EFF_DT_START"=TO_TIMESTAMP(TO_CHAR(TO_DATE(:EFF_DT_START,'YYYY-MM-DD HH24:MI')))
"
---


VAR C_ENTITY_ID NUMBER
exec :C_ENTITY_ID := -1170776;
VAR EFF_DT_START VARCHAR2(50)
exec :EFF_DT_START := '2023-04-24 20:00';
VAR EFFECTIVEDATEEND VARCHAR2(50)
exec :EFFECTIVEDATEEND := '31-DEC-99 12.00.00.000000000 AM'




SQL> select count(*) from CURS.F_CPY_BI WHERE c_entity_id = :C_ENTITY_ID and EFF_DT_START = to_timestamp(to_date(:EFF_DT_START,'YYYY-MM-DD HH24:MI')) and recorded_date_end = TO_TIMESTAMP(:effectiveDateEnd)
2 /

COUNT(*)
----------
0

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 4037726422

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                                |     1 |    29 |     4   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE         |                                |     1 |    29 |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|                                |     1 |    29 |     4   (0)| 00:00:01 |   KEY |   KEY |
|*  3 |    INDEX RANGE SCAN     | F_CPY_BI_IDX1                  |     1 |    29 |     4   (0)| 00:00:01 |   KEY |   KEY |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C_ENTITY_ID"=TO_NUMBER(:C_ENTITY_ID) AND
              "EFF_DT_START"=TO_TIMESTAMP(TO_CHAR(TO_DATE(:EFF_DT_START,'YYYY-MM-DD HH24:MI'))) AND
              "RECORDED_DATE_END"=TO_TIMESTAMP(:EFFECTIVEDATEEND))
       filter("RECORDED_DATE_END"=TO_TIMESTAMP(:EFFECTIVEDATEEND))


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          2  physical reads
          0  redo size
        549  bytes sent via SQL*Net to client
        810  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>



Thanks!
Connor McDonald
May 01, 2023 - 3:27 pm UTC

TO_TIMESTAMP takes a *character* input, so when you send it a DATE, Oracle says "I can't do that, so I'll convert it to a string for you" - hence the TO_CHAR.

You can just do TO_TIMESTAMP directly, i.e.

to_timestamp(:EFF_DT_START,'YYYY-MM-DD HH24:MI')
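To see the difference, here is a quick sanity check (a sketch only, reusing the same bind as the test case above; runs in SQL*Plus or SQLcl):

```sql
VAR EFF_DT_START VARCHAR2(50)
exec :EFF_DT_START := '2023-04-24 20:00';

-- Double conversion: TO_DATE produces a DATE, which TO_TIMESTAMP can only
-- accept after an implicit TO_CHAR back to a string. That implicit TO_CHAR
-- uses the session's NLS_DATE_FORMAT, so the result can vary by session
-- and may even silently drop the time-of-day component.
SELECT TO_TIMESTAMP(TO_DATE(:EFF_DT_START, 'YYYY-MM-DD HH24:MI')) AS round_trip
FROM   dual;

-- Direct conversion: one explicit function call, no implicit steps.
SELECT TO_TIMESTAMP(:EFF_DT_START, 'YYYY-MM-DD HH24:MI') AS direct
FROM   dual;
```

The direct form also keeps the predicate a clean bind comparison in the execution plan, with no nested conversion functions.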

Mike, May 02, 2023 - 3:57 pm UTC

Hi Connor,

Thank you.
Connor McDonald
May 10, 2023 - 4:36 am UTC

glad to help

query optimization

Mike, October 26, 2023 - 3:59 am UTC

Hello,

Just wondering if there is a better way to rewrite this query to improve performance.
Both tables involved are partitioned tables (evaluation_date is the partition key, with monthly range partitions).

I tried different parallel DOPs as well as no parallelism, and it still takes a long time (~20 min). Thanks for your insight on whether the query could be rewritten in a more optimal way. Thanks

var b1 VARCHAR2(32)
var b2 VARCHAR2(32)
var b3 VARCHAR2(32)
exec :b1 := 'MEL7PM';
exec :b2 := '09/25/2023 00:00:00';
exec :b3 := '10/25/2023 00:00:00';

WITH prices AS
(
SELECT
/*+ PARALLEL(mddp,8) */
mdd.entity_id,
CASE
WHEN mddp.transaction_type = 'BID'
OR mddp.transaction_type = 'OFFER'
THEN mddp.price
END AS quote_price,
CASE
WHEN mddp.transaction_type = 'BID'
THEN mddp.price
END quote_bid_price,
CASE
WHEN mddp.transaction_type = 'OFFER'
THEN mddp.price
END quote_ask_price,
CASE
WHEN mddp.transaction_type = 'TRADE'
THEN mddp.price
END trade_price,
mddp.price AS market_color_price
FROM imd.mkt_d_daily_price mddp,
TABLE( CAST( MULTISET
(
SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= mddp.count
) AS SYS.ODCINUMBERLIST ) ) l
JOIN imd.mkt_d_daily mdd
ON mddp.mdd_id = mdd.id
AND mddp.update_time = mdd.update_time
WHERE mdd.eval_cycle = :B1
AND mdd.evaluation_date >= TO_DATE(:B2,'MM/DD/YYYY HH24:MI:SS')
AND mdd.evaluation_date <= TO_DATE(:B3,'MM/DD/YYYY HH24:MI:SS')
)
SELECT p.entity_id ,
COUNT(p.quote_price) quote_count ,
MIN(p.quote_price) AS quote_low ,
MAX(p.quote_price) AS quote_high ,
stddev_pop(p.quote_price) quote_std_dev ,
COUNT(p.quote_bid_price) quote_bid_count ,
MIN(p.quote_bid_price) AS quote_bid_low ,
MAX(p.quote_bid_price) AS quote_bid_high ,
stddev_pop(p.quote_bid_price) quote_bid_std_dev,
COUNT(p.quote_ask_price) quote_ask_count ,
MIN(p.quote_ask_price) AS quote_ask_low ,
MAX(p.quote_ask_price) AS quote_ask_high ,
stddev_pop(p.quote_ask_price) quote_ask_std_dev,
COUNT(p.trade_price) trade_count ,
MIN(p.trade_price) AS trade_low ,
MAX(p.trade_price) AS trade_high ,
stddev_pop(p.trade_price) trade_std_dev
FROM prices p
GROUP BY p.entity_id
/
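As an aside, one rewrite worth timing while preparing that plan (a sketch only; it assumes Oracle 12c or later, where LATERAL is available): the TABLE( CAST( MULTISET ... ) ) expression above, which repeats each price row "count" times, can be written as a LATERAL row generator, often giving the optimizer a simpler join shape to cost. A minimal, self-contained demonstration with stand-in data:

```sql
-- The WITH data stands in for mkt_d_daily_price: each row should be
-- repeated cnt times, as the MULTISET/CONNECT BY trick does above.
WITH mddp (mdd_id, price, cnt) AS (
  SELECT 1, 10.5, 3 FROM dual UNION ALL
  SELECT 2, 11.0, 2 FROM dual
)
SELECT mddp.mdd_id, mddp.price, l.n
FROM   mddp
       CROSS JOIN LATERAL (
         SELECT LEVEL AS n
         FROM   dual
         CONNECT BY LEVEL <= mddp.cnt   -- correlated to the outer row
       ) l;
-- Returns 5 rows: mdd_id 1 three times, mdd_id 2 twice.
```

Whether this helps here depends on the actual plan, which is why the execution plan requested below is the right next step.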

Chris Saxon
October 26, 2023 - 12:47 pm UTC

For us to help you, please submit a new question with this query and its execution plan.

You can get this by running:

alter session set statistics_level = all;
set serveroutput off

select ...

select * 
from   table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));


Ensure the plan includes the E-Rows, A-Rows, Starts, A-Time, and Buffers columns.

More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library