Thanks for the question regarding "IN & EXISTS", version 8.1.6
March 31, 2001 - 7pm Central time zone
Reviewer: Seetharaman Srinivasan from Chennai , TamilNadu, India
Excellent
Thanks for the question regarding "IN & EXISTS", version 8.1.6
April 11, 2001 - 3pm Central time zone
Reviewer: Mani Srinivasan from CA USA
This is good stuff....

May 18, 2001 - 2am Central time zone
Reviewer: Helena Markova from Bratislava, Slovakia
Could not Call Report from Form (Menu)
May 28, 2001 - 1am Central time zone
Reviewer: Mohd. Manik Miah from Dhaka, Bangladesh
Dear Sir
I learned oracle a few months ago.
But I could not call report from Form (Menu).
Would you please help me in this regard?
Thank & best regards
Manik Miah
manikmiah@yahoo.com
Instead of T, why cant you use EMP AND DEPT TABLES
September 28, 2001 - 4pm Central time zone
Reviewer: A reader
Tom
Can you illustrate your answer by using the emp dept table.
and by
1.increasing and decreasing the number of rows in both the tables.
2.by dropping and adding indexes on both the tables
That way your explanation will be vindicated.
Can you do that please.
Followup September 29, 2001 - 2pm Central time zone:
Well -- I'm not going to use EMP and DEPT as I would have to generate tons of EMP and DEPT data to
illustrate my point (feel free to do that if you want to ;)
I'll use BIG and SMALL to make the point. I ran:
rem create table big as select * from all_objects;
rem insert /*+ append */ into big select * from big;
rem commit;
rem insert /*+ append */ into big select * from big;
rem commit;
rem insert /*+ append */ into big select * from big;
rem create index big_idx on big(object_id);
rem
rem
rem create table small as select * from all_objects where rownum < 100;
rem create index small_idx on small(object_id);
rem
rem analyze table big compute statistics
rem for table
rem for all indexes
rem for all indexed columns
rem /
rem analyze table small compute statistics
rem for table
rem for all indexes
rem for all indexed columns
rem /
so, small has 99 rows, big has 133,000+
select count(subobject_name)
from big
where object_id in ( select object_id from small )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.02 0.02 0 993 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.03 0.03 0 993 0 1
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
792 MERGE JOIN
100 SORT (JOIN)
100 VIEW OF 'VW_NSO_1'
99 SORT (UNIQUE)
792 INDEX GOAL: ANALYZED (FULL SCAN) OF 'SMALL_IDX'
(NON-UNIQUE)
891 SORT (JOIN)
0 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'BIG'
versus:
select count(subobject_name)
from big
where exists ( select null from small where small.object_id = big.object_id )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 4.12 4.12 0 135356 15 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 4.12 4.12 0 135356 15 1
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
792 FILTER
135297 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'BIG'
133504 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'SMALL_IDX'
(NON-UNIQUE)
That shows if the outer query is "big" and the inner query is "small", in is generally more
efficient then NOT EXISTS.
Now:
select count(subobject_name)
from small
where object_id in ( select object_id from big )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2 0.51 0.82 50 298 22 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5 0.52 0.83 50 298 22 1
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
99 MERGE JOIN
16913 SORT (JOIN)
16912 VIEW OF 'VW_NSO_1'
16912 SORT (UNIQUE)
135296 INDEX GOAL: ANALYZED (FAST FULL SCAN) OF 'BIG_IDX'
(NON-UNIQUE)
99 SORT (JOIN)
99 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'SMALL'
versus:
select count(subobject_name)
from small
where exists ( select null from big where small.object_id = big.object_id )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.01 0.01 0 204 12 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 0.01 0 204 12 1
EGATE)
99 FILTER
100 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'SMALL'
99 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'BIG_IDX' (NON-UNIQUE)
shows that is the outer query is "small" and the inner query is "big" -- a WHERE EXISTS can be
quite efficient.
What difference does it make
September 30, 2001 - 12am Central time zone
Reviewer: A reader
when you remove those indexes on the small and big tables.
Followup September 30, 2001 - 8am Central time zone:
Run it and see. All of the necessary code is there -- it is very easy to do these sorts of tests.
The outcome is easy to hypothesize however.
The "select * from big where object_id in ( select object_id from small )" will sort BIG once and
SMALL once and join them (sort merge join) in all likelyhood.
The "select * from big where exists ( select null from small where small.object_id = big.object_id
)" will tend to FULL SCAN big once and FOR EVERY ROW in big -- FULL SCAN small.
(after running it -- i verified it and the "rule of thumb" holds true. BIG outer query and SMALL
inner query = IN. SMALL outer query and BIG inner query = WHERE EXISTS. Remember -- thats is a
RULE OF THUMB and rules of thumb always have infinitely many exceptions to the rule.
Tom, this is very clever and not fair
September 30, 2001 - 9am Central time zone
Reviewer: TJ from USA
You have explained the whole of the in and exists concepts so well( no one has ever done in like
this, you must be one happy person, you believe in sharing)
and now you leave us with an open ended answer..
"Remember -- thats is a RULE OF THUMB and rules of thumb always have infinitely many exceptions to
the rule."
Kindly discuss what can be some of those possible exceptions, tell us what factors can cause those
exceptions.
Followup September 30, 2001 - 10am Central time zone:
Well, there are infinitely many -- IO can affect the outcome of this. The goal of the query can
effect this (eg: the WHERE EXISTS will find the first row faster in general then the IN will -- the
IN will get the LAST row (all rows) faster then the where exists). If your goal is the FIRST row
-- exists might totally blow away IN. If you are a batch process (and hence getting to the LAST
row is vital) ....
and so on. The point is: be aware of their differences, try them both when tuning, understand
conceptually what they do and you'll be able to use them to maximum effect.
does this explanation hold good even for NOT IN and NOT EXISTS
September 30, 2001 - 7pm Central time zone
Reviewer: Chenna
Summary
October 1, 2001 - 10am Central time zone
Reviewer: Chenna
This is discussion of the top most class --
O.k. let me try to sum up what you are saying --
in and exists can be substitutes for each other, and we have to use one in the place of other
depending upon the rowns returned by the outer and inner query,
where as
'not in' and 'not exists' are not substitutes for each other , and they are not related in any way.
There use depends on the specific need.
o.k. to sum it up finally , can you give us examples of when the use of 'not in' is appropriate and
when 'not exists' is appropriate.
Followup October 1, 2001 - 3pm Central time zone:
you got the first part right (in/exists)
the second part is too strong. not in and not exists are not perfect substitutes. NOT IN is
different then NOT exists but NOT EXISTS and NOT IN are the same when the subquery you use in the
NOT IN does not contain NULLS.
Both of not in and not exists can be very efficient when there are no nulls (and not in WITH THE
CBO is pretty good -- using an "anti join" -- see the design/tuning for performance guide for
details on that). With NULLS -- a not in can be very in-efficient and many people substitute a
not exists for it (not realzing the ANSWER just changed!!!)
anti join
October 1, 2001 - 3pm Central time zone
Reviewer: A reader
(and not in WITH THE CBO is pretty good -- using an "anti join" -- see the design/tuning for
performance guide for details on that)
Please give the link to the concerned page in the manual, there is no search feature in the
document to search for 'anti join'
the mystery with non in and not exists
October 1, 2001 - 9pm Central time zone
Reviewer: A reader
"Both of not in and not exists can be very efficient when there are no nulls (and not in WITH THE
CBO is pretty good -- using an "anti join" -- see the design/tuning for performance guide for
details on that). With NULLS -- a not in can be very in-efficient and many people substitute a
not exists for it (not realzing the ANSWER just changed!!!) "
1.Not in and not exists return the same performance result when there arer no nulls.
2.Not in is especially very efficient when used with cbo and no nulls.
3.And when there are nulls what is preferred is not exists, and not not in.
I think Iam right in what I understood above.
Now
1.Why is not in in efficient when there are nulls. Can you demonstrate this.
2.And how is not exists better over not in when there are nulls.
3.What do you mean by saying
"and many people substitute a not exists for it (not realzing the ANSWER just changed!!!) "
do you mean not exists should be substituted for not in when there are nulls ???????
Followup October 2, 2001 - 7pm Central time zone:
1a) not in can be faster then not exists (or slower). The are different, do different things. A
not exists is alot like an EXISTS. A not in when the subquery cannot contain NULLS can be alot
like an IN (processed as an ANTI-JOIN instead of a JOIN). So, just like IN is sometimes better
then EXISTS and vice versa, NOT IT is sometimes (and sometimes not) better then NOT EXISTS
2a) thats when it works best, yes
3a) typically...
1b) Ok, time for the demo of NOT IN and NOT EXISTS (i guess I knew it was coming eventually!)
Here we go:
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table big as select * from all_objects;
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table small as select * from all_objects where rownum <
10;
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter table big modify object_id null;
Table altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter table small modify object_id null;
Table altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> update small set object_id = null where rownum = 1;
1 row updated.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create unique index big_idx on big(object_id);
Index created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create unique index small_idx on small(object_id);
Index created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table big compute statistics
2 for table
3 for all indexes
4 for all indexed columns;
Table analyzed.
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table small compute statistics
2 for table
3 for all indexes
4 for all indexed columns;
Table analyzed.
So, there is the big setup -- we have our tables. SMALL will be used in the subquery. As it
stands now -- lets see what happens with NOT IN and NOT EXISTS:
ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace traceonly
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from big
2 where NOT EXISTS (select null
3 from small
4 where small.object_id = big.object_id)
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=845 Bytes=81965)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=845 Bytes=81965)
3 1 INDEX (UNIQUE SCAN) OF 'SMALL_IDX' (UNIQUE)
Statistics
----------------------------------------------------------
305 recursive calls
15 db block gets
18278 consistent gets
0 physical reads
0 redo size
1961046 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
16886 rows processed
So, for each row in BIG it did an INDEX PROBE into SMALL to see if that row existed or not. So
far, so good... 18k consistent reads (high) but not as high as:
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from big
2 where object_id NOT IN ( select object_id
3 from small )
4 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=845 Bytes=81965)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=845 Bytes=81965)
3 1 TABLE ACCESS (FULL) OF 'SMALL' (Cost=1 Card=1 Bytes=3)
Statistics
----------------------------------------------------------
0 recursive calls
202743 db block gets
84706 consistent gets
0 physical reads
0 redo size
862 bytes sent via SQL*Net to client
319 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
0 rows processed
Wow -- what happened here? Well, for each row in BIG -- we FULL SCANNED small -- yes, for each
row, we full scanned small -- we full scanned small some 16,000 times (imagine if small weren't so
small). Also take care to note that this query processed 0 rows - no data found. The prior one
got over 16k -- that is the effect of a NULL in the subquery of a NOT IN.
So, lets get rid of that nasty (we don't have to get rid of it, we could use a predicate to
preclude it as well and we'll do that on BIG as a demonstration)
ops$tkyte@ORA817DEV.US.ORACLE.COM> update small set object_id = -1 where object_id is null;
1 row updated.
ops$tkyte@ORA817DEV.US.ORACLE.COM> commit;
Commit complete.
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter table small modify object_id NOT NULL;
Table altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table small compute statistics
2 for table
3 for all indexes
4 for all indexed columns;
Table analyzed.
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big
2 where NOT EXISTS (select null
3 from small
4 where small.object_id = big.object_id)
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=845 Bytes=81965)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=845 Bytes=81965)
3 1 INDEX (UNIQUE SCAN) OF 'SMALL_IDX' (UNIQUE)
Statistics
----------------------------------------------------------
71 recursive calls
15 db block gets
18270 consistent gets
0 physical reads
0 redo size
1961046 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
16886 rows processed
So, that is no different... but:
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big b0
2 where object_id IS NOT NULL
3 and object_id NOT IN ( select object_id
4 from small )
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=845 Bytes=81965)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=845 Bytes=81965)
3 1 INDEX (UNIQUE SCAN) OF 'SMALL_IDX' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
18243 consistent gets
0 physical reads
0 redo size
1961046 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
16886 rows processed
Hey -- thats pretty good already. Its just like the NOT EXISTS. But wait -- there is more. Lets
start anti-joining...
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set always_anti_join = MERGE
2 /
Session altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big b1
2 where object_id is not null
3 and object_id NOT IN ( select object_id
4 from small )
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=269 Card=16885 Bytes=1857350)
1 0 MERGE JOIN (ANTI) (Cost=269 Card=16885 Bytes=1857350)
2 1 SORT (JOIN) (Cost=267 Card=16894 Bytes=1638718)
3 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=16894 Bytes=1638718)
4 1 SORT (UNIQUE) (Cost=3 Card=9 Bytes=117)
5 4 VIEW OF 'VW_NSO_1' (Cost=1 Card=9 Bytes=117)
6 5 INDEX (FULL SCAN) OF 'SMALL_IDX' (UNIQUE) (Cost=1 Card=9 Bytes=36)
Statistics
----------------------------------------------------------
0 recursive calls
19 db block gets
237 consistent gets
236 physical reads
0 redo size
1863304 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
4 sorts (memory)
1 sorts (disk)
16886 rows processed
Now that is much much less work -- from 18k to 237 consistent gets -- not bad. The hash anti join
is similar:
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set always_anti_join = HASH
2 /
Session altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big b2
2 where object_id is not null
3 and object_id NOT IN ( select object_id
4 from small )
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=145 Card=16885 Bytes=1857350)
1 0 HASH JOIN (ANTI) (Cost=145 Card=16885 Bytes=1857350)
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=16894 Bytes=1638718)
3 1 VIEW OF 'VW_NSO_1' (Cost=1 Card=9 Bytes=117)
4 3 INDEX (FULL SCAN) OF 'SMALL_IDX' (UNIQUE) (Cost=1 Card=9 Bytes=36)
Statistics
----------------------------------------------------------
0 recursive calls
15 db block gets
237 consistent gets
31 physical reads
0 redo size
2350779 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
16886 rows processed
2b) see 1b)
3b) see 1b) where the NOT IN returns 0 rows, but NOT EXISTS returns 16k -- they are NOT the same
when NULLs are involved.
ANTI JOIN and 'ANY', 'SOME','ALL'
October 2, 2001 - 4pm Central time zone
Reviewer: Chenna
Optimizing Anti-Joins and Semi-Joins
An anti-join returns rows from the left side of the predicate for which there is no corresponding
row on the right side of the predicate. That is, it returns rows that fail to match (NOT IN) the
subquery on the right side. For example, an anti-join can select a list of employees who are not in
a particular set of departments:
SELECT * FROM emp
WHERE deptno NOT IN
(SELECT deptno FROM dept
WHERE loc = 'HEADQUARTERS');
The optimizer uses a nested loops algorithm for NOT IN subqueries by default, unless the
initialization parameter ALWAYS_ANTI_JOIN is set to MERGE or HASH and various required conditions
are met that allow the transformation of the NOT IN subquery into a sort-merge or hash anti-join.
You can place a MERGE_AJ or HASH_AJ hint in the NOT IN subquery to specify which algorithm the
optimizer should use.
A semi-join returns rows that match an EXISTS subquery, without duplicating rows from the left side
of the predicate when multiple rows on the right side satisfy the criteria of the subquery. For
example:
SELECT * FROM dept
WHERE EXISTS
(SELECT * FROM emp
WHERE dept.ename = emp.ename
AND emp.bonus > 5000);
In this query, only one row needs to be returned from dept even though many rows in emp might match
the subquery. If there is no index on the bonus column in emp, then a semi-join can be used to
improve query performance.
The optimizer uses a nested loops algorithm for EXISTS subqueries by default, unless the
initialization parameter ALWAYS_SEMI_JOIN is set to MERGE or HASH and various required conditions
are met. You can place a MERGE_SJ or HASH_SJ hint in the EXISTS subquery to specify which algorithm
the optimizer should use.
SELECT * FROM dept
WHERE EXISTS
(SELECT * FROM emp
WHERE dept.ename = emp.ename
AND emp.bonus > 5000);
What difference does it make when there is an index on the bonus column.
A semi join is already being executed in the above query, right ?
How will the use of MERGE_SJ or HASH_SJ hint in the EXISTS subquery improve the performance.
what do these hints tell the optimizer.
And , finally, can you give examples of using 'ANY' ,'ALL' AND 'SOME' clauses, as to where and when
to use these clauses. We have never used these clauses.
Followup October 2, 2001 - 7pm Central time zone:
ANY/SOME are alot like an in.
where x in ( select y from t )
is the same as:
where x = ANY(select y from t)
SOME is an aliase for ANY, ANY is the same as SOME.
any/some can be useful with inequalities:
scott@ORA817DEV.US.ORACLE.COM> select ename, sal, comm from emp where sal < some( select comm from
emp );
ENAME SAL COMM
---------- ---------- ----------
A 800
WARD 1250 500
MARTIN 1250 1400
ADAMS 1100
JAMES 950
MILLER 1300
6 rows selected.
that finds emps such that there exists SOME commision greater then their salary....
ALL is a cool one. I use it to find the most frequently ocurring thing, for example:
scott@ORA817DEV.US.ORACLE.COM> select to_char(hiredate,'mm')
2 from emp
3 group by to_char(hiredate,'mm')
4 having count(*) >= ALL ( select count(*) from emp group by to_char(hiredate,'mm') );
TO
--
12
that shows me the month with the MOST hiredates.... it says give me the month whose count is
greater than or equal to ALL other counts by month...
Tom, will the result be the same if ..
October 3, 2001 - 2am Central time zone
Reviewer: Chenna
ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set always_anti_join = MERGE
2 /
Session altered.
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big b1
2 where object_id is not null
3 and object_id NOT IN ( select object_id
4 from small )
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=269 Card=16885 Bytes=1857350)
1 0 MERGE JOIN (ANTI) (Cost=269 Card=16885 Bytes=1857350)
2 1 SORT (JOIN) (Cost=267 Card=16894 Bytes=1638718)
3 2 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=16894 Bytes=1638718)
4 1 SORT (UNIQUE) (Cost=3 Card=9 Bytes=117)
5 4 VIEW OF 'VW_NSO_1' (Cost=1 Card=9 Bytes=117)
6 5 INDEX (FULL SCAN) OF 'SMALL_IDX' (UNIQUE) (Cost=1 Card=9
Bytes=36)
Statistics
----------------------------------------------------------
0 recursive calls
19 db block gets
237 consistent gets
236 physical reads
0 redo size
1863304 bytes sent via SQL*Net to client
125305 bytes received via SQL*Net from client
1127 SQL*Net roundtrips to/from client
4 sorts (memory)
1 sorts (disk)
16886 rows processed
1.Tom, will the result be the same if there is no index on the object_id of the small table.
2.You are using CBO , all the way, what will be the result of all these tests for 'in and exists'
and 'not in and not exists' if we are using RBO.
3.After we analyze the tables, how should we switch to RBO if we want to.
4.Will the push_sq (push sub query ) come into picture in any way inthe above examples. Actually
when is this hint relevant.
5.out of context: Can you give us the link to the oracle error messages manual.
6.Suppose if the query's where clause has 5 sub queries, for the third sub query in the where
clause , what will be the outer query, I mean how will it resolve.
For the fifth sub query in the where clause, I guess the whole of the restof the query will be the
outer query ?
Followup October 3, 2001 - 8am Central time zone:
1) yes more or less -- the ANSWER will be the same (16,886 rows processed). The plan will be
slightly different -- a FULL SCAN of small instead of an index full scan. (you have ALL of the
code needed to replicate my tests -- try it out!)
2) I don't recommend the RBO with 8i -- too many features not available. But -- hey -- you have
the code, give it a whirl, try it out (but make sure you have lots of time -- the RBO is not as
smart as the CBO when it comes to this -- reread this entire article, I stress the need for the CBO
to make this efficient!)
3) use an alter session set optimizer_goal or /*+ RULE */ hint in the query.
4) push_subq affects when subqueries are evaluated (last by default, earlier if you ask for it).
So yes, it could affect the plan here.
5) all documentation is available from http://otn.oracle.com/, the Oracle error messages manual varies from release to release. Suggest you goto otn, follow the
documentation link and goto the version you are interested in.
6) this does not make syntactic sense to me.
Train us to use parallel query
October 3, 2001 - 6pm Central time zone
Reviewer: A reader
From the oracle magazine article
What Can Execute in Parallel?
By using parallel options, data-sensitive SQL state-
ments, database recovery, and data loads can be exe-
cuted by multiple processes simultaneously. With the
release of Oracle 8i, all the following operations can
be executed in parallel:
Table scan
PL/SQL functions called from SQL
Sort merge join
"Not in"
Nested loop join
Select distinct
Hash join
Aggregation
Group by
Cube
Union and union all
Create table as select
Order by
Rebuild index partition
Rollup
Move partition
Create index
Update
Rebuild index
Insert . . . select
Split partition
Enable constraint (the table
Delete scan is parallelized)
Star transformation
If the want to execute the above not in sub queries in parallel, actually when should I decide to
do it in paralle.
Will it suffice if I give the following.
Select * from big b1
2 where object_id is not null
3 and object_id NOT IN ( select/*+ PARALLEL(a1,8)*/ object_id from small a1)
usually I see that it is the Sort merge joins which take the most amount of time in a query, how
can I parallelize a sort merge join.
Thank You
Followup October 4, 2001 - 6am Central time zone:
If you have to parallelize the subquery -- you are in a heap of trouble. that means it is really
big and since you would be full scanning IT once per row in big -- its going to be slow. Look at
the other options below in order to AVOID having to full scan it once per row.
My advice here -- try it yourself. Take your ideas, set up the test case, test it out -- see what
works, what doesn't. Its the best way to really learn and understand how this works.
I will do it only my own but -- --
October 4, 2001 - 11am Central time zone
Reviewer: Chenna
Select * from big b1
2 where object_id is not null
3 and object_id NOT IN ( select/*+ PARALLEL(a1,8)*/ object_id from
small a1)
O.k. , let me try parallel, for myself, and report the conclusions to you for discussion.
But before that , you need to weed out these small doubts for me ..
1.select/*+ PARALLEL(a1,8)*/ object_id from
small a1 -- before using the parallel hint, do I need to change any init.ora parameters,
2.I understand that 8 refers to the number of processors, how can I know the number of processors
available for my server.
3.Where you use parallel query, how does my query get divided between multiple processors. Where
are the results from each divided query kept , and then combined , before they are displayed to the
user.
4.Give us a brief insignt into the internals of how parallel query works.
And by the way, I cant see the asktom or any other images on this page. Some thing you might want
to know about.

October 4, 2001 - 10pm Central time zone
Reviewer: Chenna
SQL> connect system/manager
Connected.
SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ------- ------------------------------
cpu_count integer 1
SQL> show parameter parallel
NAME TYPE VALUE
------------------------------------ ------- ------------------------------
fast_start_parallel_rollback string LOW
optimizer_percent_parallel integer 0
parallel_adaptive_multi_user boolean FALSE
parallel_automatic_tuning boolean FALSE
parallel_broadcast_enabled boolean FALSE
parallel_execution_message_size integer 2148
parallel_instance_group string
parallel_max_servers integer 5
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
NAME TYPE VALUE
------------------------------------ ------- ------------------------------
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0
SQL>
The cpu count shows only one , but look at this
parallel_max_servers integer 5.
1.how can the parallel_max_servers be 5 when the cpu count is 1.
2.Does this mean that I can give a parallel hint to use 5 parallel servers.
3.When the cpu count is 1 I expect the parallel_max_servers
to be 1.
Perplexing? Need to be clarified.
Followup October 5, 2001 - 11am Central time zone:
did you read the documentation i pointed you to?
you can do parallel query on a single cpu box. You can do a parallel 100 query on a 20 cpu box.
You need not have parity between processes and cpu's (if you did, unix would be a single user OS).
You can have parallel_max_servers at whatever you want, regardless of the number of cpus on the
system. Since parallel queries typically involve tons of IO, and IO is something you wait for --
there is plenty of opportunity for using parallel query on a single CPU system (when one process is
waiting for IO, blocked, the other process can be munching through some data it read).
Read the docs, get a good understanding of how it really works -- then it won't be confusing.
sub query
October 6, 2001 - 5pm Central time zone
Reviewer: Jay
Tom
Lets assume that the where clause has 10 conditions, and each condition gets its value from a sub
query. IN what order do these sub queries get resolved.
select a from b,c
were b.x=c.x
and b.t=(select t from some ttablle where something)
and c.l=(select l from some xtable where something)
Now when resolving and c.l=(select l from some xtable where something)
will select a from b,c
were b.x=c.x
and b.t=(select t from some ttablle where something)
be considered as outer query, or only
select a from b,c
were b.x=c.x
be considered as outer query
Followup October 7, 2001 - 4pm Central time zone:
when looking at the IN, the outer query is the result set that will be joined to the IN result.
if b.t = (select t ... ) is processed FIRST (before c.l = ( select l...)), the that would be the
"outer query" with regards to c.1 = ( select l...).
However -- you cannot count on the order of operation of predicates!!! we can and will change the
order at will.
Good topic
May 17, 2002 - 12pm Central time zone
Reviewer: A reader
Tom and team, Many thanks. Love your site.
I'm running Oracle 8.1.7.3
I need help with a "NOT IN" query.
Here is what the tables look like:
Table 1:
CREATE TABLE SKU
(ITEM_ID NUMBER NOT NULL,
SKU_CODE VARCHAR2(10) NOT NULL,
SKU_DESC VARCHAR2(255) NOT NULL
,
CONSTRAINT PK_SKU
PRIMARY KEY (ITEM_ID, SKU_CODE)
USING INDEX)
/
CREATE TABLE SKU_SOLD_OUT
(ITEM_ID NUMBER NOT NULL,
SKU_CODE VARCHAR2(10) NOT NULL,
SOLD_OUT_DATE DATE NOT NULL,
DELETED CHAR(1) NOT NULL,
SOLD_OUT_TEXT VARCHAR2(1000) NOT NULL
,
CONSTRAINT PK_SKU_SOLD_OUT
PRIMARY KEY (ITEM_ID, SKU_CODE, SOLD_OUT_DATE)
USING INDEX)
/
With item_id, sku_code combinations having zero to many rows in sku_sold_out
but only one will be active (deleted = 'N') at a time.
I need to get a list of items that are not completely sold out. i.e.
If an Item_id has four sku_codes and three are currently sold out, return the item_id.
If all four sku_codes are sold out, do not return the item_id.
Here is the query I am useing now.
select distinct item_id
from sku
where item_id||'-'||sku_code not in (
select item_id||'-'||sku_code
from sku_sold_out
where deleted = 'N'
and sold_out_date <= sysdate)
sku has about 10,000 rows
sku sold out has about 120,000 rows.
Is there a better way of doing this? The schema is set and is production so I cant
make any database changes other than views, procs, ect.
Thanks for the help.
Thanks
Followup May 17, 2002 - 1pm Central time zone:
Well, other ways would be
select distinct item_id
from sku
where (item_id,sku_code) not in ( select item_id, sku_code
from sku_sold_out
where deleted = 'N'
and sold_out_date <= sysdate)
/
select distinct sku.item_id
from sku, sku_sold_out
where sku.item_id = sku_sold_out.item_id (+)
and sku.sku_code = sku_sold_out.sku_code (+)
and nvl(sku_sold_out.deleted,'Y') = 'Y'
and nvl(sku_sold_out.sold_out_date,sysdate) <= sysdate
/
Index result of Subquery and FILTER it
May 18, 2002 - 12am Central time zone
Reviewer: Jaromir D.B. Nemec from Vienna, Austria
Hi Tom,
in the original explanation you told us:
The subquery is evaluated, distinct'ed, >>>indexed<<<< (or hashed or sorted) and then ....
1) Can you explain how can be the result of a subquery indexed.
temporary index is created or something like that? What you see in explain plan in that case? -
never heard of this feature.
2) Some of the explain plans you demonstrated consist of:
FILTER
row set 1
row set 2
I've learned that FILTER is a kind of "filter" to suppress rows that doesn't match some condition.
This is evidently NOT (in) the case.
Can you explain (or link) what is the functionality of this two-row-ser-filter?
My experience is that this kind of FILTER operate similar to nested loops and is not performable
in parallel - is it right?
Thanks
Jaromir
www.db-nemec.com
Followup May 18, 2002 - 10am Central time zone:
1) it is stored sorted or hashed. The structure itself is an implicit "indexing" of sorts. Lets
say you were doing a sort merge join -- you would see two sorts (out to temp) and then a merge. A
hash join -- one of the sets gets hashed.
for #2, I assume you mean something like:
ops$tkyte@ORA817DEV.US.ORACLE.COM> Select * from big b0
2 where object_id IS NOT NULL
3 and object_id NOT IN ( select object_id
4 from small )
5 /
16886 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=845 Bytes=81965)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'BIG' (Cost=7 Card=845 Bytes=81965)
3 1 INDEX (UNIQUE SCAN) OF 'SMALL_IDX' (UNIQUE)
The FILTER in this case is sort of like a nested loops join yes. The table access full is done --
and the "is not null" part is evaluated. Then each row must be filtered for the NOT IN and that is
done via the index.
This could be done in parallel -- meaning the full scan can be parallelized.
Useful info about IN & EXISTS
July 23, 2002 - 3pm Central time zone
Reviewer: Vish Inamdar from USA
Yes, this article gives me clear idea how IN & EXISTS are different from each other and where to
apply them.
Good and simple example made it more clear.
I will be asking more question in future.
Vish
Exists, IN and consistent gets
August 22, 2002 - 8am Central time zone
Reviewer: Vikas Sharma from Delhi India
Hi tom,
Can i directly relate the consistent gets of plan with the time taken by the query executed by
using IN and another same query executed by using exists. Actually I have query which I execute in
two different ways eg.
Query with exists :
select person_id from persons where exists
(select con_person_id from contacts where con_person_id = person_id ) and contact_details = 'Y'
93 rows selected ..
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=951 Card=1 Bytes=9)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'PERSONS' (Cost=951 Card=1 Bytes=
9)
3 1 INDEX (UNIQUE SCAN) OF 'CON_PK' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
8 db block gets
6365 consistent gets
0 physical reads
0 redo size
2967 bytes sent via SQL*Net to client
1091 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
93 rows processed
Query with IN :
select person_id from persons where person_id in (select con_person_id from contacts ) and
contact_details = 'Y';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=187 Card=1 Bytes=17)
1 0 NESTED LOOPS (Cost=187 Card=1 Bytes=17)
2 1 INDEX (FULL SCAN) OF 'CON_PK' (UNIQUE) (Cost=1 Card=93 B
ytes=744)
3 1 TABLE ACCESS (BY INDEX ROWID) OF 'PERSONS' (Cost=2 Card=
1 Bytes=9)
4 3 INDEX (UNIQUE SCAN) OF 'PER_PK' (UNIQUE) (Cost=1 Card=
1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
543 consistent gets
0 physical reads
0 redo size
2967 bytes sent via SQL*Net to client
1091 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
93 rows processed
The persons table has 31000+ records and the contact table has 120+ record in this case as you
proved the IN is better I am too getting the same results as you can see the IN is fast.
I would like to know that is it always true for two same queries (executed in the different way )
if the consistent gets for a query are more then that query is not optimized and will always
result a poor show, we should always go for the query which results less consistent get.
I read in you book also the consistent gets depend upon the order of the data of the index columns.
It will be more when the data for the index column is disordered. Does that will always effect the
time taken by the query.
Followup August 22, 2002 - 9am Central time zone:
Less consistent gets is a good goal. In generate, the less consistent gets
o the faster it'll be
o the more SCALABLE it'll be (as you add users, you can add MORE users with less consistent gets)
Always is a very very strong term. Almost Always i would say.
As for the last one -- that is due to a large index range scan that'll be getting LOTS of data from
the table -- not really applicable in this example. Does it affect the time taken by the query --
probably, the more consistent gets, the more latching (thats a lock, a serialization thing), the
longer it takes to get the data.

August 22, 2002 - 10am Central time zone
Reviewer: Ashok, Bombay ,India
your style of explaining things is just amazing..!!you must be one helluva teacher !!!
Thanks a Lot
August 22, 2002 - 12pm Central time zone
Reviewer: Vikas Sharma from Delhi India
Hi Tom,
Thanks lot
The way you explain the things it is great..
Regards,
Vikas Sharma
Very helpful
August 22, 2002 - 8pm Central time zone
Reviewer: Bijay from Fremont, CA
Tom
I wonder how you can find so much time in replying to all questions. Its amazing a simple question
can create such a big chain of Q & A.
You have been a great source of knowledge for all of us. Thank you for being you.
Bj
why is my query plan using NL? where as your use SORT-MERGE
September 5, 2002 - 3pm Central time zone
Reviewer: A reader
Tom,
I am trying to understand exists and not exists
analyze table big compute statistics
2 for table
3 for all indexes
4 for all indexed columns
5 /
Table analyzed.
SQL> analyze table small compute statistics
2 for table
3 for all indexes
4 for all indexed columns
5 /
Table analyzed.
SQL> truncate table plan_table;
Table truncated.
SQL> explain plan for
2 select count(subobject_name)
from big
where object_id in ( select object_id from small )
3 4 5 /
Explained.
SQL> @$ORACLE_HOME/rdbms/admin/utlxpls.sql
| Operation | Name | Rows | Bytes| Cost | Pstart| Pstop |
--------------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | 19 | 598 | | |
| SORT AGGREGATE | | 1 | 19 | | | |
| NESTED LOOPS | | 396 | 7K| 598 | | |
| VIEW |VW_NSO_1 | 99 | 1K| 4 | | |
| SORT UNIQUE | | 99 | 396 | 4 | | |
| INDEX FULL SCAN |SMALL_IDX | 99 | 396 | 1 | | |
| TABLE ACCESS BY INDEX R|BIG | 113K| 665K| 6 | | |
| INDEX RANGE SCAN |BIG_IDX | 113K| | 2 | | |
--------------------------------------------------------------------------------
11 rows selected.
Thans and Regards,
Followup September 5, 2002 - 10pm Central time zone:
things like sort_area_size, db_block_buffers, hash area size, database version, etc will all affect
the plans.
Update Process Options
September 9, 2002 - 10am Central time zone
Reviewer: SK from NJ
Hi Tom,
I have to update my production table daily with new/update data. My update process is like this.
Table big_table (col1 integer, col2 integer, col3 integer)
Table Small_table (c1 integer, c2 integer, c3 integer)
Small table is a external Table. and Big Table as index on Col1.
My update process is like this:
Delete from big_table where col1 in (select c1 from small_table);
insert into big_table select c1,c2,c3 from small_table.
My Big table has 32 million rows and small table has 400K rows. This process is taking 30 minutes
to do the above two processes. The requirement is to reduce this time. How can we do it. I believe
using Exists will do both full table scans so that will not be efficient. Where as using in may use
indexes (but I doubt that it is using indexes).
1. Should I go for updating using procedure where I delete records one by one and I will be sure
that processes will use indexes. But that is not my choice and it will defy your first rule of
using single statement instead of using procedure.
2. In one such update my big_table has more than 500 million records. and small Table has 7 million
records. I have 24GB of temp tablespace. My delete statement is failing due to unable to allocate
extents on temp tablespace. I am surprised on this. Is Temp tablespace of 24GB too large? In the
above delete statement will it use only col1 and c1 for sort/merge. How do I estimate the temp
tablespace size for this transaction. I am running Oracle 9iR2 RAC.
Your response will be highly appreciated.
Thanks
Follow up
September 9, 2002 - 11am Central time zone
Reviewer: SK from NJ
Hi Tom,
I am using merge for some tables where I have one to one relationship between the two table. But
for many big tables I have multiple records for same key (tables with non unique key). So will
merge will take care of following situation where for big table has 5 records for col1. and small
table has 3 records for the same key. Will merge replace 5 rows in big table with 3 rows from small
table for the same key (means deleteion of two records from big table).
I will certainly prefer merge if this is possible.
Regards
Followup September 9, 2002 - 12pm Central time zone:
Then that is not a merge.
A merge would leave the same number of rows.
You'll need to do the delete+insert. You'll want to use tkprof to see what it is doing now and see
if it can be speeded up. Deleting 400k rows is going to take a while if you have indexes on the
table.
Oracle Optimizer and In & Exists
September 11, 2002 - 12am Central time zone
Reviewer: Subhrajyoti Paul from Mumbai, India
Tom,
From the discussion above, I get the impression that In and exists clause can be used
interchangibly, in case there is a subquery inside the IN clause. But, one would be better than the
other depending on the amount data retrieved in the inner and the outer query.
If they can be used interchangibly, then it sounds logical that the optimizer can substitute an
in clause with an exists and vice versa, given it is provided with proper statistics... Does this
happen in practice... Or am I making some mistakes in my understanding of the things...
One other question,
"select *
from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;
The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then
joined to the original table -- typically. "
Here you mentioned that it is distinct'ed, which means it is sorted, now once sorted, will oracle
favor the use of merge sort join, because the sort operation is already complete...or may it choose
to Indexing or hash join. How much importance is given to the fact that one step (sort, in this
case) can be eliminated from the join operation, or is this relevant at all...
Followup September 11, 2002 - 7am Central time zone:
And in 8i, ... 9i, .... 10i, and on up -- that is true. You'll see interesting query rewrites with
both (given that you use the cost based optimizer which I can assure you you will be in 10i as the
RBO is gone in that release)
distinct <> sort. SORT is SORT, distinct is distinct. DISTINCT does not imply or infer or mean
SORTed. Never has, never will.
But how otherwise can u perform a distinct if u do not sort...
September 11, 2002 - 8am Central time zone
Reviewer: Subhrajyoti Paul from Mumbai, India
Tom,
I am relieved that the optimizer does the substitution...And ofcourse, I use cost based
optimizer...
But the distinct thing.. If distincted data does not mean sorted, how otherwise can u have
distinct data if u did not sort it... Can you give me an example...
Subhro
Followup September 11, 2002 - 8am Central time zone:
Lets say you have a partitioned table.
You used hash partitioning on the ID column.
We know that across partitions -- ID is "unique" (eg: if ID=5 is in partition 1, ID=5 will NOT be
in partitions 2, 3, 4, .... and so on).
We use parallel query.
Each partitioned is distincted and then glued together. ID=5 from partition 1 could come after
ID=1000 from partition 2 if the PQ slave processing partition 2 was "faster" then the one doing
partition 1.
For example. There are other cases as well. distinct, group by, etc -- do not imply "sorted"
Thanks a lot... I got the point...
September 11, 2002 - 8am Central time zone
Reviewer: Subhrajyoti Paul from Mumbai, India
Tom,
Most of what u said in the last followup is a bouncer to me(in cricket, a bouncer is a ball that
goes above the batsman's head ;-) )...
But, I got the point that distinct does not mean sorted...
I just could not think of a case myself ..so the example was needed...
Thanks a lot... everytime i think i got u...u prove me wrong... cool....
not in & not exists
November 5, 2002 - 5am Central time zone
Reviewer: Brigitte from Germany
Hallo Tom,
Some times ago I changed a query to "not exists", because "not in" takes some hours.
Now I tried your interesting followup from October 01, 2001 with the test database, but alter
session
set always_anti_join = HASH/MERGE hasn't an effect. Why?
notesid is not null in all tables, it is the primary key in testidxdokument,testidxvorgang and
testidxmappe
Sql> set autotrace trace
Sql> set timing on
Sql> r
1 select distinct mandant,notesid from testleserfeldgruppe l
2 where not exists
3 (select 'x' from testidxdokument
4 where l.notesid = notesid)
5 and not exists
6 (select 'x' from testidxvorgang
7 where l.notesid = notesid)
8 and not exists
9 (select 'x' from testidxmappe
10 where l.notesid = notesid)
11* order by mandant, notesid
no rows selected
Elapsed: 00:00:01.02
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=35)
1 0 SORT (ORDER BY) (Cost=6 Card=1 Bytes=35)
2 1 SORT (UNIQUE) (Cost=5 Card=1 Bytes=35)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'TESTLESERFELDGRUPPE' (Cost=3 Card=1 Bytes=35)
5 3 INDEX (UNIQUE SCAN) OF 'UK_TESTIDXDOKUMENT' (UNIQUE) (Cost=1 Card=1 Bytes=18)
6 3 INDEX (UNIQUE SCAN) OF 'UK_TESTIDXVORGANG' (UNIQUE) (Cost=1 Card=1 Bytes=18)
7 3 INDEX (UNIQUE SCAN) OF 'UK_TESTIDXMAPPE' (UNIQUE) (Cost=1 Card=1 Bytes=18)
Statistics
----------------------------------------------------------
0 recursive calls
4 db block gets
1507 consistent gets
0 physical reads
0 redo size
160 bytes sent via SQL*Net to client
251 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
0 rows processed
Sql> r
1 select distinct mandant,notesid from testleserfeldgruppe where notesid not in
2 (select notesid from testidxdokument)
3 and notesid not in
4 (select notesid from testidxvorgang)
5 and notesid not in
6 (select notesid from testidxmappe)
7* order by mandant, notesid
no rows selected
Elapsed: 00:00:02.44
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=35)
1 0 SORT (ORDER BY) (Cost=6 Card=1 Bytes=35)
2 1 SORT (UNIQUE) (Cost=5 Card=1 Bytes=35)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'TESTLESERFELDGRUPPE' (Cost=3Card=1 Bytes=35)
5 3 TABLE ACCESS (FULL) OF 'TESTIDXDOKUMENT' (Cost=12 Card=303 Bytes=5454)
6 3 TABLE ACCESS (FULL) OF 'TESTIDXVORGANG' (Cost=1 Card=9 Bytes=162)
7 3 TABLE ACCESS (FULL) OF 'TESTIDXMAPPE' (Cost=1 Card=9 Bytes=162)
Statistics
----------------------------------------------------------
0 recursive calls
3304 db block gets
32597 consistent gets
0 physical reads
0 redo size
167 bytes sent via SQL*Net to client
249 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Sql> alter session set always_anti_join = MERGE;
Session altered.
Sql> select distinct mandant,notesid from testleserfeldgruppe where notesid not in
2 (select notesid from testidxdokument)
3 and notesid not in
4 (select notesid from testidxvorgang)
5 and notesid not in
6 (select notesid from testidxmappe)
7 order by mandant, notesid;
no rows selected
Elapsed: 00:00:01.23
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=35)
1 0 SORT (ORDER BY) (Cost=6 Card=1 Bytes=35)
2 1 SORT (UNIQUE) (Cost=5 Card=1 Bytes=35)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'TESTLESERFELDGRUPPE' (Cost=3 Card=1 Bytes=35)
5 3 TABLE ACCESS (FULL) OF 'TESTIDXDOKUMENT' (Cost=12 Card=303 Bytes=5454)
6 3 TABLE ACCESS (FULL) OF 'TESTIDXVORGANG' (Cost=1 Card=9 Bytes=162)
7 3 TABLE ACCESS (FULL) OF 'TESTIDXMAPPE' (Cost=1 Card=9 Bytes=162)
Statistics
----------------------------------------------------------
0 recursive calls
3304 db block gets
32597 consistent gets
0 physical reads
0 redo size
167 bytes sent via SQL*Net to client
249 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Sql> alter session set always_anti_join = HASH;
Session altered.
Sql> select distinct mandant,notesid from testleserfeldgruppe where notesid not in
2 (select notesid from testidxdokument)
3 and notesid not in
4 (select notesid from testidxvorgang)
5 and notesid not in
6 (select notesid from testidxmappe)
7 order by mandant, notesid;
no rows selected
Elapsed: 00:00:01.13
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=35)
1 0 SORT (ORDER BY) (Cost=6 Card=1 Bytes=35)
2 1 SORT (UNIQUE) (Cost=5 Card=1 Bytes=35)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'TESTLESERFELDGRUPPE' (Cost=3 Card=1 Bytes=35)
5 3 TABLE ACCESS (FULL) OF 'TESTIDXDOKUMENT' (Cost=12 Card=303 Bytes=5454)
6 3 TABLE ACCESS (FULL) OF 'TESTIDXVORGANG' (Cost=1 Card=9 Bytes=162)
7 3 TABLE ACCESS (FULL) OF 'TESTIDXMAPPE' (Cost=1 Card=9 Bytes=162)
Statistics
----------------------------------------------------------
0 recursive calls
3304 db block gets
32597 consistent gets
0 physical reads
0 redo size
167 bytes sent via SQL*Net to client
249 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Regards
Followup November 5, 2002 - 9am Central time zone:
are those statistics close to right? 9 rows and such -- on such a trivial amount of data -- all
bets are off
either get real stats on there or load the tables with real data.
not in or not exists
November 6, 2002 - 10am Central time zone
Reviewer: Brigitte from Germany
Hallo Tom,
here are more data.
1. Why has the alter session no effect. What's wrong? Some parameters?
2. what means: "elapsed 02:133:8032.30", I know hours:minutes:seconds
Does it mean 2 hours + 13 minutes +...
or 2 hours + 2 hours + 13 minutes +...
06.11.02 07:53 Anzahl der Tabellenzeilen Seite: 1
dms
DMSIDXDOKUMENT : 46220
DMSIDXMAPPE : 13
DMSIDXVORGANG : 464
DMSLESERFELDGRUPPE : 137369
Na so was> r
1 select distinct mandant,notesid from dmsleserfeldgruppe l
2 where not exists
3 (select 'x' from dmsidxdokument
4 where l.notesid = notesid)
5 and not exists
6 (select 'x' from dmsidxvorgang
7 where l.notesid = notesid)
8 and not exists
9 (select 'x' from dmsidxmappe
10 where l.notesid = notesid)
11* order by mandant, notesid
no rows selected
Elapsed: 00:00:19.18
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=71 Card=25 Bytes=950)
1 0 SORT (ORDER BY) (Cost=71 Card=25 Bytes=950)
2 1 SORT (UNIQUE) (Cost=68 Card=25 Bytes=950)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'DMSLESERFELDGRUPPE' (Cost=65 Card=25 Bytes=950)
5 3 INDEX (UNIQUE SCAN) OF 'UK_DMSIDXDOKUMENT' (UNIQUE) (Cost=1 Card=1 Bytes=32)
6 3 INDEX (UNIQUE SCAN) OF 'UK_DMSIDXVORGANG' (UNIQUE) (Cost=1 Card=1 Bytes=31)
7 3 INDEX (UNIQUE SCAN) OF 'UK_DMSIDXMAPPE' (UNIQUE)
Statistics
----------------------------------------------------------
245 recursive calls
7 db block gets
149823 consistent gets
1439 physical reads
0 redo size
161 bytes sent via SQL*Net to client
246 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
18 sorts (memory)
0 sorts (disk)
0 rows processed
Na so was> r
1 select distinct mandant,notesid from dmsleserfeldgruppe where notesid not in
2 (select notesid from dmsidxdokument)
3 and notesid not in
4 (select notesid from dmsidxvorgang)
5 and notesid not in
6 (select notesid from dmsidxmappe)
7* order by mandant, notesid
no rows selected
Elapsed: 02:133:8032.30
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=71 Card=25 Bytes=950)
1 0 SORT (ORDER BY) (Cost=71 Card=25 Bytes=950)
2 1 SORT (UNIQUE) (Cost=68 Card=25 Bytes=950)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'DMSLESERFELDGRUPPE' (Cost=65 Card=25 Bytes=950)
5 3 TABLE ACCESS (FULL) OF 'DMSIDXDOKUMENT' (Cost=118 Card=2101 Bytes=67232)
6 3 TABLE ACCESS (FULL) OF 'DMSIDXVORGANG' (Cost=1 Card=21 Bytes=651)
7 3 TABLE ACCESS (FULL) OF 'DMSIDXMAPPE' (Cost=1 Card=2 Bytes=50)
Statistics
----------------------------------------------------------
0 recursive calls
464062 db block gets
101189223 consistent gets
12940155 physical reads
0 redo size
167 bytes sent via SQL*Net to client
244 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Na so was> alter session set always_anti_join = MERGE;
Session altered.
Na so was> r
1 select distinct mandant,notesid from dmsleserfeldgruppe where notesid not in
2 (select notesid from dmsidxdokument)
3 and notesid not in
4 (select notesid from dmsidxvorgang)
5 and notesid not in
6 (select notesid from dmsidxmappe)
7* order by mandant, notesid
no rows selected
Elapsed: 02:133:8036.06
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=71 Card=25 Bytes=950)
1 0 SORT (ORDER BY) (Cost=71 Card=25 Bytes=950)
2 1 SORT (UNIQUE) (Cost=68 Card=25 Bytes=950)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'DMSLESERFELDGRUPPE' (Cost=65 Card=25 Bytes=950)
5 3 TABLE ACCESS (FULL) OF 'DMSIDXDOKUMENT' (Cost=118 Card=2101 Bytes=67232)
6 3 TABLE ACCESS (FULL) OF 'DMSIDXVORGANG' (Cost=1 Card=21 Bytes=651)
7 3 TABLE ACCESS (FULL) OF 'DMSIDXMAPPE' (Cost=1 Card=2 Bytes=50)
Statistics
----------------------------------------------------------
0 recursive calls
464112 db block gets
101219920 consistent gets
13238978 physical reads
0 redo size
167 bytes sent via SQL*Net to client
244 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Na so was> alter session set always_anti_join = HASH;
Session altered.
Na so was> select distinct mandant,notesid from dmsleserfeldgruppe where notesid not in
2 (select notesid from dmsidxdokument)
3 and notesid not in
4 (select notesid from dmsidxvorgang)
5 and notesid not in
6 (select notesid from dmsidxmappe)
7 order by mandant, notesid;
no rows selected
Elapsed: 02:135:8144.32
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=71 Card=25 Bytes=950)
1 0 SORT (ORDER BY) (Cost=71 Card=25 Bytes=950)
2 1 SORT (UNIQUE) (Cost=68 Card=25 Bytes=950)
3 2 FILTER
4 3 TABLE ACCESS (FULL) OF 'DMSLESERFELDGRUPPE' (Cost=65 Card=25 Bytes=950)
5 3 TABLE ACCESS (FULL) OF 'DMSIDXDOKUMENT' (Cost=118 Card=2101 Bytes=67232)
6 3 TABLE ACCESS (FULL) OF 'DMSIDXVORGANG' (Cost=1 Card=21 Bytes=651)
7 3 TABLE ACCESS (FULL) OF 'DMSIDXMAPPE' (Cost=1 Card=2 Bytes=50)
Statistics
----------------------------------------------------------
0 recursive calls
464327 db block gets
101357656 consistent gets
14016238 physical reads
0 redo size
167 bytes sent via SQL*Net to client
244 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed
Regards
not in & not exists
November 8, 2002 - 3am Central time zone
Reviewer: Brigitte from Germany
Hallo Tom,
I hope the data which I send on November 06 are enough.
I think a difference from 19 minutes to some hours is significant.
I'm very interested in your reply.
Regards
Followup November 8, 2002 - 7am Central time zone:
Brigitte -- for something so big, you'll need to do it as a question - this is for followups, quick
questions. This is an entire question in a question.
Is the logic backwards for NOT IN and NOT EXISTS
November 15, 2002 - 4pm Central time zone
Reviewer: Doug from CT, USA
Tom - this may be an oversimplification - however if I was to distill whether IN or EXISTS was
better I would say this after reading your reply. An IN will generally full table scan the
subquery, and an EXISTS will generally full table scan the outer query. So, if my outer loop is
small and my inner code is large then an exists might be appropriate but if my outer loop is large
and my inner code is small (the match is quick).. then an IN might be appropriate. Is that
correct? If so.. is the situation completely reversed if using NOT in and NOT exists? I would
guess is you were using not in.. you would have to full table scan the OUTER query rather than
sorting,hashing, the inner one. You did cover that NOT IN will return zero rows if it is NOT IN
(null).. but I wasn't sure if the psuedo-code for processing is backwards from the original
examples if one is using NOT IN and NOT EXISTS.
Followup November 15, 2002 - 8pm Central time zone:
I encapsulate it with this:
...
So, when is where exists appropriate and in appropriate?
Lets say the result of the subquery
( select y from T2 )
is "huge" and takes a long time. But the table T1 is relatively small and
executing ( select null from t2 where y = x.x ) is very very fast (nice index on
t2(y)). Then the exists will be faster as the time to full scan T1 and do the
index probe into T2 could be less then the time to simply full scan T2 to build
the subquery we need to distinct on.
Lets say the result of the subquery is small -- then IN is typicaly more
appropriate.
If both the subquery and the outer table are huge -- either might work as well
as the other -- depends on the indexes and other factors.
.....
full scans -- no, they enter not into the picture at all IN GENERAL. If the subquery can use
indexes, it'll use an index with IN or EXIST.
It is if the subquery is LARGE and TAKES A LONG TIME TO BUILD.... (regardless of full scan)
Not in and not exists is even harder to characterize (pull my AOTC slides from the home page for an
interesting example -- they will be there for a while around November of 2002 -- if you are reading
this later then that, they might not be there anymore)
How to make NOT EXISTS handle comparison of NULLs in same way as MINUS
March 13, 2003 - 10am Central time zone
Reviewer: Jim Dickson (DBA/developer) from London, UK
As ever, learn a lot from this site.
But one thing not addressed is how to make an antijoin deal with NULLs. The set operators (UNION,
MINUS) do this automatically.
But can it be done efficiently with NOT EXISTS or NOT IN.
e.g.
I need to compare 2 tables such that two rows are duplicates if corresponding columns are either
equal, or both contain NULL.
I cannot use MINUS since that columns being compared are a subset of the row, but i need to return
the whole row.
My understanding is that I cannot use NOT IN because it does not handle "null comparison".
NOT EXISTS works but seems to use NESTED LOOPS => v slow
My understanding is that HASH antijoin will not work because HASH join only works for equality
tests (which fail for NULLs).
I can post example code if this helps.
Followup March 14, 2003 - 5pm Central time zone:
both not exists and ESPECIALLLY not in do not do nulls like that.
not in and not exists don't even treat nulls the same in fact. do this:
select * from emp where empno not in ( select mgr from emp );
and compare that to the "equivalent" not exists:
select * from emp where not exists ( select null from emp e2 where e2.mgr = emp.empno );
you'll get a different answer.
You would have to tediously compare column by column by column
where (x = y or (x is null and y is null))
otherwise.
How to make NOT EXISTS handle comparison of NULLs in same way as MINUS
March 17, 2003 - 12pm Central time zone
Reviewer: Jim Dickson from London, UK
I absolutely agree but the question more about access paths than semantics.
I believe i understand the semantics surrounding nulls - which is why MINUS/INTERSECT are such
great operators.
I am prepared to write "tedious" column comparison.
Can I get Oracle to perform comparisons via a single-pass full table scan access method (hash,
sort-merge).
I have only seen Cartesian product/Nested Loops approach which obviously does not scale with number
of rows.
Is there any chance of Oracle implementing ANSI set operator extension "CORRESPONDING BY
{column-list}" ?
Followup March 17, 2003 - 1pm Central time zone:
curious -- why cannot you simply use MERGE/INTERSECT?
How to make NOT EXISTS handle comparison of NULLs in same way as MINUS
March 18, 2003 - 4am Central time zone
Reviewer: Jim Dickson from London, UK
Because I am not merging into target table directly.
The sequence of processing is
1/ Read an external table
2/ Identify differences based on a subset of columns
3/ Write out differences to OS for further processing (fuzzy logic on Name & Address data)
4/ Read 2nd external table containing processed data and then "merge" into target table.
Where I have a primary key, I can use MINUS for step 2.
But where there is no PK, then I am trying to get NOT EXISTS to scale.
Hope that makes sense.
BTW Is MERGE statement ANSI?
Followup March 18, 2003 - 7am Central time zone:
well, not following you by what you mean by "getting not exists to scale" -- but give me an EXAMPLE
of the syntax you would like to use and describe what would be returned and I'll see if there isn't
a way to code that...
Don't know if merge is "ansi". Thing is NIST stop testing for ansi compliance years ago, no one
does it, the standard is so huge now that full compliance is pretty much "not going to happen" and
database indenpendence, well, I don't even go there -- the only results I see from that over the
last 15 years has been applications that perform marginally well on a single database and blow
chunks on the rest.
How to make NOT EXISTS handle comparison of NULLs in same way as
March 19, 2003 - 4am Central time zone
Reviewer: Jim Dickson from London, UK
Code posted below.
The NOT EXISTS does not include the SECONDARY column.
Ran with 100 rows, 1000 rows and 2389 rows.
On my machine
MINUS 100 rows 00.44 secs, 8 db block gets
1000 rows 00.27 secs, 7 db block gets
2389 rows 00.51 secs, 4 db block gets
NOT EXISTS 100 rows 00.43 secs, 208 db block gets
1000 rows 02.90 secs, 2007 db block gets
2389 rows 11.75 secs, 4780 db block gets
I need this to run with 10m rows.
spool not_exists.2389.out
set timing on
set echo on
drop table t1;
drop table t2;
create table t1 as select * from all_objects where rownum <= 2500;
create table t2 as select * from all_objects where rownum <= 2500;
analyze table t1 compute statistics;
analyze table t2 compute statistics;
--desc t1
--desc t2
insert into t1 values ('EXTRA ROW', null, null, null, null, null, null, null, null, null, null,
null, null);
select count(*) from t1;
select count(*) from t2;
set autotrace on
select count(*) from
(
select * from t1
minus
select * from t2
);
select count(*) from t1 where NOT EXISTS
( select null from t2 where
( t1.OWNER = t2.OWNER or ( t1.OWNER is null and t2.OWNER is null ) )
and
( t1.OBJECT_NAME = t2.OBJECT_NAME or ( t1.OBJECT_NAME is null and t2.OBJECT_NAME is null ) )
and
( t1.SUBOBJECT_NAME = t2.SUBOBJECT_NAME or ( t1.SUBOBJECT_NAME is null and t2.SUBOBJECT_NAME is
null ) )
and
( t1.OBJECT_ID = t2.OBJECT_ID or ( t1.OBJECT_ID is null and t2.OBJECT_ID is null ) )
and
( t1.DATA_OBJECT_ID = t2.DATA_OBJECT_ID or ( t1.DATA_OBJECT_ID is null and t2.DATA_OBJECT_ID is
null ) )
and
( t1.OBJECT_TYPE = t2.OBJECT_TYPE or ( t1.OBJECT_TYPE is null and t2.OBJECT_TYPE is null ) )
and
( t1.CREATED = t2.CREATED or ( t1.CREATED is null and t2.CREATED is null ) )
and
( t1.LAST_DDL_TIME = t2.LAST_DDL_TIME or ( t1.LAST_DDL_TIME is null and t2.LAST_DDL_TIME is null
) )
and
( t1.TIMESTAMP = t2.TIMESTAMP or ( t1.TIMESTAMP is null and t2.TIMESTAMP is null ) )
and
( t1.STATUS = t2.STATUS or ( t1.STATUS is null and t2.STATUS is null ) )
and
( t1.TEMPORARY = t2.TEMPORARY or ( t1.TEMPORARY is null and t2.TEMPORARY is null ) )
and
( t1.GENERATED = t2.GENERATED or ( t1.GENERATED is null and t2.GENERATED is null ) )
)
;
spool off
exit
As ever, thanks.
Think the whole Oracle community is disappointed that the new book will be delayed.
Followup March 20, 2003 - 1pm Central time zone:
what I meant by:
but
give me an EXAMPLE of the syntax you would like to use and describe what would
be returned and I'll see if there isn't a way to code that...
was -- give me a scenario where you want to use not exists instead of minus (eg: in the above, i
certainly would NOT use not exists) and I'll see what I can do with that.
That is -- give me a real case to work with whereby minus cannot be used.

March 21, 2003 - 1pm Central time zone
Reviewer: green
Hi Tom,
Can you please explain a litte bit 'NOT IN' here? Where can it be used properly?
Thanks!
Followup March 21, 2003 - 2pm Central time zone:
I don't understand?
not in can be used anywhere you need "not in" -- where "record" not in "set".
How to use MINUS on subset of columns
March 27, 2003 - 4am Central time zone
Reviewer: Jim Dickson from London, UK
I use MINUS extensively but in certain situations I believe it is unsuitable.
e.g.
When trying to find the difference between 2 tables, where at least 1 of tables lack a primary key,
using a subset of columns for comparison and some of those columns can be null.
The real life example is comparing a new list of suppressions against an existing table of
suppressions.
(Suppressions used in marketing to avoid target customers who do not wish to receive offers).
The ddl is pretty much address1-10, suppression_flags1-10.
The comparison is done on address lines only but return set must be all columns.
Hope that makes sense.
The ANSI syntax i would love to use is
select * from suppressions_new
except corresponding by (address1, address2, ...., address10)
select * from suppressions_old
Suppressions_new will have no primary key.
How can i achieve this in Oracle using MINUS instead of EXCEPT.
If it cannot be done using MINUS, then how to achieve reasonable performance using NOT EXISTS.
Followup March 27, 2003 - 8am Central time zone:
Ok, I have a big_table (4million rows) and a sample of that (400k rows). Using NVL() on columns
that "might be nullable" and supplying an unreasonable value (eg: chr(0) for strings,
99999999999999999999999999 for numbers, to_date(01010001,'ddmmyyyyy') for dates) you can anti join
with NOT IN like this:
select *
from big_table
where (object_id, nvl(subobject_name,chr(0)) )
NOT IN (select object_id, nvl(subobject_name,chr(0))
from ten_percent
where nvl(subobject_name,chr(0)) IS NOT NULL )
and nvl(subobject_name,chr(0)) is not null
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 114.65 248.86 118052 60214 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 114.65 248.86 118052 60214 0 0
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 60
Rows Row Source Operation
------- ---------------------------------------------------
0 HASH JOIN ANTI (cr=60214 r=118052 w=58940 time=77317035 us)
3817191 TABLE ACCESS FULL BIG_TABLE (cr=54700 r=54649 w=0 time=14975098 us)
381274 TABLE ACCESS FULL TEN_PERCENT (cr=5514 r=4463 w=0 time=1360329 us)
Now, that "scales" just as well as:
select *
from big_table
where (object_id, object_name )
NOT IN (select object_id, object_name
from ten_percent )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 114.19 224.33 117115 60213 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 114.19 224.33 117115 60213 0 0
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 60
Rows Row Source Operation
------- ---------------------------------------------------
0 HASH JOIN ANTI (cr=60213 r=117115 w=57995 time=224338032 us)
3817191 TABLE ACCESS FULL BIG_TABLE (cr=54699 r=54657 w=0 time=16561038 us)
381274 TABLE ACCESS FULL TEN_PERCENT (cr=5514 r=4463 w=0 time=1308385 us)
without the NVL's.
Use of "nvl(subobject_name,chr(0)) IS NOT NULL "
March 27, 2003 - 1pm Central time zone
Reviewer: Logan Palanisamy from Santa Clara, CA USA
Tom,
In your example,
select *
from big_table
where (object_id, nvl(subobject_name,chr(0)) )
NOT IN (select object_id, nvl(subobject_name,chr(0))
from ten_percent
where nvl(subobject_name,chr(0)) IS NOT NULL )
and nvl(subobject_name,chr(0)) is not null
1. What does chr(0) translate to? When I run "select chr(0) from dual", it returns me an invisble
character. My db character set is WE8ISO8859P15
2. What does "nvl(subobject_name,chr(0)) IS NOT NULL" do? Doesn't it evaluate to TRUE always, since
chr(0) is a not null value?
3. Also, the number of rows returned as shown in TKPROF output is zero. How come? Shouldn't return
about 3.6 millon rows?
4. I am familiar with TKPROF. But what does the name really stand for? Is it an acronym for
something?
Grateful as ever.
Followup March 27, 2003 - 2pm Central time zone:
1) exactly -- it is just "\0" in C, a null string. It is an improbable character, something
subobject_name cannot be...
2) exactly -- I'm telling the optimizer "this query returns only data that is NOT NULL". That
permits the use of the anti-join which is what our goal was here.
3) no, there were no rows "not in" the other set.
4) tool kit profiler is the rumorous name I've heard it refered to as.
LIKE.... OR LIKE.... OR
May 12, 2003 - 11am Central time zone
Reviewer: Robert from PA
Tom,
Is there a better way (with a sub-query) to do
LIKE .... OR LIKE ... OR LIKE ... ?
e.g.:
select * from user_objects where
OBJECT_TYPE LIKE 'PROC%' OR OBJECT_TYPE LIKE 'FUNC%' OR OBJECT_TYPE LIKE 'VIEW%" ?
Thanks
Followup May 12, 2003 - 1pm Central time zone:
that would be the most "optimal" way normally -- it could permit a normal index on object_type to
be used.
You could shorten it to
where substr( object_type, 1, 4 ) in ( 'PROC', 'FUNC', 'VIEW' )
in your case -- and if you had a function based index on substr(object_type,1,4) it could use an
index.
EXISTS and FTS
May 14, 2003 - 6pm Central time zone
Reviewer: Nitin from Atlanta
I tried the following.
SQL> create table outer (a number);
Table created.
SQL> create index outer_ind on outer (a);
Index created.
SQL> create table inner (b number);
Table created.
SQL> create index inner_ind on inner (b);
Index created.
SQL> select 'x' from outer where exists (select 'x' from inner where outer.a = inner.b);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'OUTER'
3 1 INDEX (RANGE SCAN) OF 'INNER_IND' (NON-UNIQUE)
SQL> analyze table outer compute statistics;
Table analyzed.
SQL> analyze table inner compute statistics;
Table analyzed.
SQL> select 'x' from outer where exists (select 'x' from inner where outer.a = inner.b);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=13)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'OUTER' (Cost=1 Card=1 Bytes=13)
3 1 INDEX (RANGE SCAN) OF 'INNER_IND' (NON-UNIQUE) (Cost=1 C
ard=1 Bytes=13)
SQL> insert into outer select OBJECT_ID from user_objects;
40899 rows created.
SQL> commit;
Commit complete.
SQL> analyze table outer compute statistics;
Table analyzed.
SQL> select 'x' from outer where exists (select 'x' from inner where outer.a = inner.b);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=2045 Bytes=81
80)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'OUTER' (Cost=4 Card=2045 Bytes=8
180)
3 1 INDEX (RANGE SCAN) OF 'INNER_IND' (NON-UNIQUE) (Cost=1 C
ard=1 Bytes=13)
Should it not perform an index scan on the OUTER table?
Regards
Nitin
Followup May 14, 2003 - 11pm Central time zone:
why would it?
every row must be inspected.
the rows are very narrow
the table is SMALLER then the index (the index has 2 columns -- the column and the rowid, the table
is just a column)
use "real" examples.
ops$tkyte@ORA920LAP> create table t1 as select * from all_objects;
Table created.
ops$tkyte@ORA920LAP> create index t1_idx on t1(object_id);
Index created.
ops$tkyte@ORA920LAP> analyze table t1 compute statistics
2 for table for all indexes for all indexed columns;
Table analyzed.
ops$tkyte@ORA920LAP> create table t2 as select * from all_objects where 1=0;
Table created.
ops$tkyte@ORA920LAP> create index t2_idx on t2(object_id);
Index created.
ops$tkyte@ORA920LAP> analyze table t1 compute statistics
2 for table for all indexes for all indexed columns;
Table analyzed.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> set autotrace traceonly explain
ops$tkyte@ORA920LAP> select * from t1
2 where exists ( select null from t2 where t2.object_id = t1.object_id );
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=168 Card=82 Bytes=8938)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=1 Bytes=96)
2 1 NESTED LOOPS (Cost=168 Card=82 Bytes=8938)
3 2 SORT (UNIQUE)
4 3 TABLE ACCESS (FULL) OF 'T2' (Cost=2 Card=82 Bytes=1066)
5 2 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=1 Card=1)
ops$tkyte@ORA920LAP> set autotrace on
Still FTS
May 15, 2003 - 12pm Central time zone
Reviewer: Nitin
I replicated all the steps performed by you. And I am getting the following execution plan.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=55 Card=3515 Bytes=2
70655)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'T1' (Cost=55 Card=3515 Bytes=270
655)
3 1 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE) (Cost=1 Card
=1 Bytes=13)
The db is 8.1.7.4. Is is possible that there is some init.ora parameter which is dictating the
above FTS execution plan?
Thanks
Followup May 15, 2003 - 5pm Central time zone:
that is fundementally the same plan I had -- so?
it needs to full scan, mine full scanned.
tell me -- what would the "straw man plan" you think would be better be exactly?
Still FTS
May 15, 2003 - 7pm Central time zone
Reviewer: Nitin
The difference is that the optimizer is choosing to perform FTS on T1 (which is huge) and use index
scan on T2 which is very small.
Regards
Followup May 15, 2003 - 7pm Central time zone:
ops$tkyte@ORA817DEV> set autotrace traceonly explain
ops$tkyte@ORA817DEV> select * from t1, ( select distinct object_id from t2 ) t2
2 where t1.object_id = t2.object_id;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=109)
1 0 NESTED LOOPS (Cost=5 Card=1 Bytes=109)
2 1 VIEW (Cost=3 Card=1 Bytes=13)
3 2 SORT (UNIQUE) (Cost=3 Card=1 Bytes=13)
4 3 INDEX (FULL SCAN) OF 'T2_IDX' (NON-UNIQUE)
5 1 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=26793 Bytes=2572128)
6 5 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=1 Card=26793)
in 8i seems to be the way to go -- things change, the optimizer gets smarter
What about with updates?
May 21, 2003 - 2am Central time zone
Reviewer: Brett from Australia
Tom,
How does IN with a subquery affect performance of updates - is the subquery executed once for each
row (which could be really inefficient), or once for the query. eg:
update table1
set column_1 = 'Y'
where row_id in (select row_id
from table1 t1,
table2 t2
where t1.f_key = t2.row_id
and t2.class_cd = 'Address'
and t2.created > sysdate - 30);
Is this subquery executed only once for the entire update? Or would it be better to use the
subquery to insert the data into a temporary table with a unique constraint/primary key and then
use an updatable view based on table1 and the temporary table?
Followup May 21, 2003 - 8am Central time zone:
the subquery, since it is not a correlated subquery, would tend to be done "once"
Now, if t2.row_id is UNIQUE, and there is a unique/primary key on it I would code:
ops$tkyte@ORA920> update ( select column_1
2 from t1, t2
3 where t1.f_key = t2.row_id
4 and t2.class_cd = 'Address'
5 and t2.created > sysdate-30 )
6 set column_1 = 'Y'
7 /
Are these same?
May 22, 2003 - 12pm Central time zone
Reviewer: Nitin
Are the following SQL's similar? Will they retrieve same set of data under all circumstances?
1.
select
APRCN_ID_BI, APPRAISAL_DT, RANK_ID
from APPRECIATN_TBL WHERE
(APRCN_ID, APRCN_ID, ACTIVITY_ID, NOTES_ID) in ( SELECT
APRCN_ID, APRCN_ID, ACTIVITY_ID, NOTES_ID
FROM APPRECIATN_TBL_TMP TMP WHERE
APPRECIATN_TBL.APRCN_ID = TMP.APRCN_ID AND
APPRECIATN_TBL.APRCN_ID = TMP.APRCN_ID AND
APPRECIATN_TBL.ACTIVITY_ID = TMP.ACTIVITY_ID AND
APPRECIATN_TBL.NOTES_ID = TMP.NOTES_ID AND TMP.PROCESS_INSTANCE =
363480 AND TMP.APRCN_ID_BI = '52000' AND TMP.APPRAISAL_DT =
TO_DATE('2003-01-19','YYYY-MM-DD') AND TMP.RANK_ID = '55053')
2.
select
APRCN_ID_BI, APPRAISAL_DT, RANK_ID
from APPRECIATN_TBL WHERE
EXISTS ( SELECT
APRCN_ID, APRCN_ID, ACTIVITY_ID, NOTES_ID
FROM APPRECIATN_TBL_TMP TMP WHERE
APPRECIATN_TBL.APRCN_ID = TMP.APRCN_ID AND
APPRECIATN_TBL.APRCN_ID = TMP.APRCN_ID AND
APPRECIATN_TBL.ACTIVITY_ID = TMP.ACTIVITY_ID AND
APPRECIATN_TBL.NOTES_ID = TMP.NOTES_ID AND TMP.PROCESS_INSTANCE =
363480 AND TMP.APRCN_ID_BI = '52000' AND TMP.APPRAISAL_DT =
TO_DATE('2003-01-19','YYYY-MM-DD') AND TMP.RANK_ID = '55053')
Regards
Followup May 23, 2003 - 8am Central time zone:
they are the same, but the first is less efficient by far. a correlated "IN", wouldn't do that.
query
May 22, 2003 - 7pm Central time zone
Reviewer: mo
Tom:
Do you know why the first query does not produce any results while the other does:
select request_id from shipm_request where request_id not in (Select distinct request_id from
shipment);
select request_id from shipm_request a where request_id not exists (select request_id from shipment
b where a.request_id = b.request_id);
Table shipm_request(request_id) has one to many relationship with table shipment(shipment_id).
Thanks
Followup May 23, 2003 - 8am Central time zone:
lookup upwards for this text:
No, partially because NOT IN and NOT EXISTS are not substitutes for eachother!
and read that. it is all about NULL
Proof IN / EXISTS
August 29, 2003 - 1pm Central time zone
Reviewer: Marcio from br
Tom, I did test to prove to my colleague the theory of IN/EXISTS exposed here (He just ignore it
and ask to developers to use IN always -- to improve performance).
But when I executed -- to me they are same!
take a look
ops$t_mp00@MRP9I1> drop table small;
Table dropped.
ops$t_mp00@MRP9I1> drop table big;
Table dropped.
ops$t_mp00@MRP9I1> ed teste_in_exist
ops$t_mp00@MRP9I1> create table big as select * from all_objects;
Table created.
ops$t_mp00@MRP9I1> insert /*+ append */ into big select * from big;
31317 rows created.
ops$t_mp00@MRP9I1> commit;
Commit complete.
ops$t_mp00@MRP9I1> insert /*+ append */ into big select * from big;
62634 rows created.
ops$t_mp00@MRP9I1> commit;
Commit complete.
ops$t_mp00@MRP9I1> insert /*+ append */ into big select * from big;
125268 rows created.
ops$t_mp00@MRP9I1> commit;
Commit complete.
ops$t_mp00@MRP9I1> create index big_idx on big(object_id);
Index created.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> create table small as select * from all_objects where rownum < 100;
Table created.
ops$t_mp00@MRP9I1> create index small_idx on small(object_id);
Index created.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> analyze table big compute statistics
2 for table
3 for all indexes
4 for all indexed columns
5 /
Table analyzed.
ops$t_mp00@MRP9I1> analyze table small compute statistics
2 for table
3 for all indexes
4 for all indexed columns
5 /
Table analyzed.
ops$t_mp00@MRP9I1> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.2.1 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.2.0 - Production
c:\migracao\loads>plus /
SQL*Plus: Release 9.2.0.2.0 - Production on Fri Aug 29 14:07:08 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.2.1 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.2.0 - Production
ops$t_mp00@MRP9I1> @teste_in_exist
ops$t_mp00@MRP9I1> rem create table big as select * from all_objects;
ops$t_mp00@MRP9I1> rem insert /*+ append */ into big select * from big;
ops$t_mp00@MRP9I1> rem commit;
ops$t_mp00@MRP9I1> rem insert /*+ append */ into big select * from big;
ops$t_mp00@MRP9I1> rem commit;
ops$t_mp00@MRP9I1> rem insert /*+ append */ into big select * from big;
ops$t_mp00@MRP9I1> rem commit;
ops$t_mp00@MRP9I1> rem create index big_idx on big(object_id);
ops$t_mp00@MRP9I1> rem
ops$t_mp00@MRP9I1> rem
ops$t_mp00@MRP9I1> rem create table small as select * from all_objects where rownum < 100;
ops$t_mp00@MRP9I1> rem create index small_idx on small(object_id);
ops$t_mp00@MRP9I1> rem
ops$t_mp00@MRP9I1> rem analyze table big compute statistics
ops$t_mp00@MRP9I1> rem for table
ops$t_mp00@MRP9I1> rem for all indexes
ops$t_mp00@MRP9I1> rem for all indexed columns
ops$t_mp00@MRP9I1> rem /
ops$t_mp00@MRP9I1> rem analyze table small compute statistics
ops$t_mp00@MRP9I1> rem for table
ops$t_mp00@MRP9I1> rem for all indexes
ops$t_mp00@MRP9I1> rem for all indexed columns
ops$t_mp00@MRP9I1> rem /
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> alter session set sql_trace = true;
Session altered.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> select count(subobject_name)
2 from big
3 where object_id in ( select object_id from small )
4 /
COUNT(SUBOBJECT_NAME)
---------------------
0
1 row selected.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> select count(subobject_name)
2 from big
3 where exists ( select null from small where small.object_id = big.object_id )
4 /
COUNT(SUBOBJECT_NAME)
---------------------
0
1 row selected.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> select count(subobject_name)
2 from small
3 where object_id in ( select object_id from big )
4 /
COUNT(SUBOBJECT_NAME)
---------------------
0
1 row selected.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> select count(subobject_name)
2 from small
3 where exists ( select null from big where small.object_id = big.object_id )
4 /
COUNT(SUBOBJECT_NAME)
---------------------
0
1 row selected.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> alter session set sql_trace = false;
Session altered.
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1>
ops$t_mp00@MRP9I1> set echo off
ops$t_mp00@MRP9I1> host tkprof e:\oracle\admin\mrp9i1\udump\mrp9i1_ora_1876.trc sys=n
output=tkprof.txt explain=/
TKPROF: Release 9.2.0.2.1 - Production on Fri Aug 29 14:07:57 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
ops$t_mp00@MRP9I1>
tkprof.txt
==========
select count(subobject_name)
from big
where object_id in ( select object_id from small )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.02 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 7.11 7.45 0 3424 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 7.13 7.47 0 3424 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 102 (OPS$T_MP00)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=3424 r=0 w=0 time=7457761 us)
792 NESTED LOOPS SEMI (cr=3424 r=0 w=0 time=7454167 us)
250536 TABLE ACCESS FULL BIG (cr=3422 r=0 w=0 time=1579621 us)
792 INDEX RANGE SCAN SMALL_IDX (cr=2 r=0 w=0 time=2667651 us)(object id 40890)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
792 NESTED LOOPS (SEMI)
250536 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'BIG'
792 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'SMALL_IDX'
(NON-UNIQUE)
********************************************************************************
select count(subobject_name)
from big
where exists ( select null from small where small.object_id = big.object_id )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 6.99 6.99 0 3421 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 7.00 7.00 0 3421 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 102 (OPS$T_MP00)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=3421 r=0 w=0 time=6995659 us)
792 NESTED LOOPS SEMI (cr=3421 r=0 w=0 time=6992099 us)
250536 TABLE ACCESS FULL BIG (cr=3419 r=0 w=0 time=1473069 us)
792 INDEX RANGE SCAN SMALL_IDX (cr=2 r=0 w=0 time=2557279 us)(object id 40890)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
792 NESTED LOOPS (SEMI)
250536 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'BIG'
792 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'SMALL_IDX'
(NON-UNIQUE)
********************************************************************************
select count(subobject_name)
from small
where object_id in ( select object_id from big )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.01 0.00 0 105 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.02 0.01 0 105 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 102 (OPS$T_MP00)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=105 r=0 w=0 time=7280 us)
99 NESTED LOOPS SEMI (cr=105 r=0 w=0 time=6659 us)
99 TABLE ACCESS FULL SMALL (cr=4 r=0 w=0 time=886 us)
99 INDEX RANGE SCAN BIG_IDX (cr=101 r=0 w=0 time=3305 us)(object id 40888)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
99 NESTED LOOPS (SEMI)
99 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'SMALL'
99 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'BIG_IDX' (NON-UNIQUE)
********************************************************************************
select count(subobject_name)
from small
where exists ( select null from big where small.object_id = big.object_id )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.01 0.00 0 105 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 0.01 0 105 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 102 (OPS$T_MP00)
Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE (cr=105 r=0 w=0 time=7255 us)
99 NESTED LOOPS SEMI (cr=105 r=0 w=0 time=6643 us)
99 TABLE ACCESS FULL SMALL (cr=4 r=0 w=0 time=863 us)
99 INDEX RANGE SCAN BIG_IDX (cr=101 r=0 w=0 time=3243 us)(object id 40888)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
99 NESTED LOOPS (SEMI)
99 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'SMALL'
99 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'BIG_IDX' (NON-UNIQUE)
To me they are same!
Any thought?
Followup August 29, 2003 - 1pm Central time zone:
yes, the cbo is much much smarter then the rbo ever was, i should have caveated that. using the
rbo -- very different. using the cbo -- all fast, you need not be as concerned about small things
like this as much
Query Result
September 3, 2003 - 11pm Central time zone
Reviewer: AVS from India
Will the two queries metioned below fetch me the same result ?
Query : 1 SELECT /*+ index(ord_l2) */
MAX(ord.data_date) data_date,
us.person_id,
MAX(us.level_1) ouc,
MAX(ord.taker_user_id) css_user_id,
MAX(ord.week_number) week_no,
SUM(DECODE(ord.create_type,'M',
DECODE(mv1.measure,'OSM00091',0,'OSM00096',0,ord_l2.quantity),
ord_l2.quantity)) total_quantity,
SUM(ord_l2.SPECIAL_PAYMENT_CHARGE) * 100 Money_value,
mv1.measure measure,
us.mgr_ouc
FROM poc1 ord,
fcih us,
polc1 ord_l2,
fmv mv1
WHERE ord.taker_user_id = us.css_user_id
AND ord.create_type IN ('P', 'G', 'M')
AND ord.data_date = report_date
AND ord.sub_zone LIKE report_sub_zone
AND ord.order_status != 'OL'
AND ord_l2.input_cont_type IN ('O', 'R', 'S1')
AND ord_l2.order_number = ord.order_number
AND ord_l2.data_date = ord.data_date
AND ord_l2.sub_zone = ord.sub_zone
AND ord_l2.product_id = mv1.column_value
AND ord_l2.action_id NOT IN ('C', 'R', 'S')
AND ord_l2.status != 'X'
AND mv1.source_system = 'FCH'
AND (mv1.measure = 'OSM00100' OR mv1.measure LIKE ('OSM0009%'))
AND mv1.column_name = 'PRODUCT_ID'
AND EXISTS (SELECT 'x'
FROM polc1 ord_l,
fmv mv2
WHERE ord_l.input_cont_type IN ('O', 'R', 'S1')
AND ord_l.order_number = ord.order_number
AND ord_l.data_date = ord.data_date
AND ord_l.sub_zone = ord.sub_zone
AND ord_l.product_id = mv2.column_value
AND ord_l.action_id NOT IN ('C', 'R', 'S')
AND ord_l.status != 'X'
AND mv2.source_system = 'FCH'
AND (mv2.measure ='OSM00100' OR mv2.measure LIKE ('OSM0009%'))
AND mv2.column_name = 'PRODUCT_ID'
AND EXISTS (SELECT 'x'
FROM fmv mv3
WHERE ord_l.sales_channel = mv3.column_value
AND mv3.source_system = 'FCH'
AND mv3.measure = mv2.measure
AND mv3.column_name = 'SALES_CHANNEL'))
GROUP BY person_id, measure, mgr_ouc;
Query 2 : SELECT /*+ index(ord_l2) */
MAX(ord.data_date) data_date,
us.person_id,
MAX(us.level_1) ouc,
MAX(ord.taker_user_id) css_user_id,
MAX(ord.week_number) week_no,
SUM(DECODE(ord.create_type,'M',
DECODE(mv1.measure,'OSM00091',0,'OSM00096',0,ord_l2.quantity),
ord_l2.quantity)) total_quantity,
SUM(ord_l2.SPECIAL_PAYMENT_CHARGE) * 100 Money_value,
mv1.measure measure,
us.mgr_ouc
FROM poc1 ord,
fcih us,
polc1 ord_l2,
fmv mv1
WHERE ord.taker_user_id = us.css_user_id
AND ord.create_type IN ('P', 'G', 'M')
AND ord.data_date = report_date
AND ord.sub_zone LIKE report_sub_zone
AND ord.order_status != 'OL'
AND ord_l2.input_cont_type IN ('O', 'R', 'S1')
AND ord_l2.order_number = ord.order_number
AND ord_l2.data_date = ord.data_date
AND ord_l2.sub_zone = ord.sub_zone
AND ord_l2.product_id = mv1.column_value
AND ord_l2.action_id NOT IN ('C', 'R', 'S')
AND ord_l2.status != 'X'
AND mv1.source_system = 'FCH'
AND (mv1.measure = 'OSM00100' OR mv1.measure LIKE ('OSM0009%'))
AND mv1.column_name = 'PRODUCT_ID'
AND EXISTS (SELECT 'x'
FROM fmv mv3
WHERE ord_l.sales_channel = mv3.column_value
AND mv3.source_system = 'FCH'
AND mv3.measure = mv1.measure
AND mv3.column_name = 'SALES_CHANNEL')
GROUP BY person_id, measure, mgr_ouc;
Followup September 4, 2003 - 8am Central time zone:
i don't know, hows about you tell us what is different between them.
they are pretty big - point out the salient changes and why you made them.
Query Tuning (Rewriting).
September 5, 2003 - 9am Central time zone
Reviewer: AVS from India
The Query 1 looks like
SELECT XXXXXX FROM A,B,C,D /* PART 1 of QUERY 1 */
WHERE CONDITION1
AND CONDITION2
AND CONDITION3
AND CONDITION4
AND CONDITION5
AND CONDITION6
AND CONDITION7
AND CONDITION8
AND CONDITION9
AND CONDITION10
AND CONDITION11
AND CONDITION12
AND CONDITION13
AND CONDITION14
AND CONDITION15
AND EXISTS
(SELECT x FROM C,D /* PART 2 of QUERY 1 */
WHERE CONDITION1
AND CONDITION2
AND CONDITION3
AND CONDITION4
AND CONDITION5
AND CONDITION6
AND CONDITION7
AND CONDITION8
AND CONDITION9
AND EXIST
(SELECT 'x' FROM C,D /* PART 3 of QUERY 1 */
WHERE CONDITION1
AND CONDITION2
AND CONDITION3
AND CONDITION4 )
I found that the PART 2's (FIRST EXISTS) WHERE conditions is repeatation of PART1's condtions.
Hence I want to eliminate it, which I've done in QUERY 2.
The PART1 of Query1 without EXISTS fetch me reults in 5 minutes but with SINGLE EXISTS (QUERY 2) it
runs for 4 hours .
So I wrote another yet query by creating a new table, NEWTAB
/* CREATE TABLE NEWTAB AS
SELECT * FROM fch
WHERE SOURCE_SYSTEM = 'FCH'
and COLUMN_NAME = 'SALES_COLUMN' */
Query 3 :
SELECT /*+ index(ord_l2) */
MAX(ord.data_date) data_date,
us.person_id,
MAX(us.level_1) ouc,
MAX(ord.taker_user_id) css_user_id,
MAX(ord.week_number) week_no,
SUM(DECODE(ord.create_type,'M',
DECODE(mv1.measure,'OSM00091',0,'OSM00096',0,ord_l2.quantity),
ord_l2.quantity)) total_quantity,
SUM(ord_l2.SPECIAL_PAYMENT_CHARGE) * 100 Money_value,
mv1.measure measure,
us.mgr_ouc
FROM PSTN_ORDER_C_1@qdwp ord,
fch_user_id_hier@qdwp us,
PSTN_ORDER_LINE_C_1@qdwp ord_l2,
fch_measure_values@qdwp mv1, NEWTAB mv3
WHERE ord.taker_user_id = us.css_user_id
AND ord.create_type IN ('P', 'G', 'M')
AND ord.data_date = report_date
AND ord.sub_zone LIKE report_sub_zone
AND ord.order_status != 'OL'
AND ord_l2.input_cont_type IN ('O', 'R', 'S1')
AND ord_l2.order_number = ord.order_number
AND ord_l2.data_date = ord.data_date
AND ord_l2.sub_zone = ord.sub_zone
AND ord_l2.product_id = mv1.column_value
AND ord_l2.action_id NOT IN ('C', 'R', 'S')
AND ord_l2.status != 'X'
AND mv1.source_system = 'FCH'
AND (mv1.measure = 'OSM00100' OR mv1.measure LIKE ('OSM0009%'))
AND mv1.column_name = 'PRODUCT_ID'
AND ord_l2.sales_channel = mv3.column_value
AND mv1.measure = mv3.measure
GROUP BY person_id, mv1.measure, mgr_ouc;
This query (QUERY 3) also takes 4 hours.
Hope you are now clear, what I want to achieve !
Please Help.
Followup September 5, 2003 - 4pm Central time zone:
what do you want to achieve?
help with what?
if this is "please tune my query" -- not here, this is for followups on the original question.
Query Rewriting
September 8, 2003 - 8am Central time zone
Reviewer: AVS from India
Yes you are right I was trying to tune the query, for which I required your help. Since it
consisted of removing of EXIST clause from one of the query I thought its continuation of the
original subject(IN & EXISTS). If it was annoying I'm sorry.
Query Execution takes forever using IN operator
November 7, 2003 - 2pm Central time zone
Reviewer: TS from USA
Tom,
I 've a query like this:-
select id from table1
where ( contains( name, ' $?AB and $?CD and $?EF ' ) > 0)
and id in ('1234','4567','97343','0934',4576','6376''5017', '3930', '5099', '5089', '5061', '4753',
'3646', '3357', '3047', '8059','4124', '17591','14700', '60973', '34987', '37586', '58985',
'34874', '49161', '11489');
the name column which has an intermedia index is being used during query execution but the index
on ID is not being used.
The query takes a long time to execute. I tried using
RBO, but still was slow. Eventually I tried deleting the
statistics using "analyze table1 delete statistics" and the
query was very fast. So,do I need to delete the statistics
to make it run faster(my only concern is this query might run faster after stats delete but some
other query hitting the same table might become slower),or is there a way to rewrite the query to
make it run faster?
Thanks for your help.
TS
Followup November 7, 2003 - 5pm Central time zone:
well, a query with contains is going to use CBO no matter what (so, you never used RBO, cannot be
done!)
Hows about you show us
a) tkprof AND autotrace traceonly explain of the query with stats
b) tkprof AND autorace traceonly explain of the query without stats
Followup: Query Execution takes forever using IN operator
November 12, 2003 - 11am Central time zone
Reviewer: TS from USA
Tom,
Here 's the tkprof output running the query,
Without Stats:-
==============
select id from table1
where ( contains( name, ' $?AB and $?CD and $?EF ' ) > 0)
and id in ('1234','4567','97343','0934',4576','6376''5017', '3930', '5099',
'5089', '5061', '4753', '3646', '3357', '3047', '8059','4124', '17591','14700',
'60973', '34987', '37586', '58985', '34874', '49161', '11489');
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=88)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TABLE1' (Cost=1 Card=1 B
ytes=88)
2 1 DOMAIN INDEX OF 'NAME_TXT_IND' (Cost=1 Card=1)
from tkprof output:-
===================
EXPLAIN PLAN SET STATEMENT_ID='PLUS1583908' FOR select id from table1
where ( contains( name, ' $?AB and $?CD and $?EF ' ) > 0)
and id in ('1234','4567','97343','0934',4576','6376''5017', '3930', '5099',
'5089', '5061', '4753', '3646', '3357', '3047', '8059','4124', '17591','14700',
'60973', '34987', '37586', '58985', '34874', '49161', '11489');
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 92 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.00 0.00 0 92 0 0
With Stats:-
==========
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=88)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TABLE1' (Cost=1 Card=1 B
ytes=88)
2 1 DOMAIN INDEX OF 'NAME_TXT_IND' (Cost=1 Card=1)
Statistics
----------------------------------------------------------
1313 recursive calls
1 db block gets
1715 consistent gets
205 physical reads
9568 redo size
6729 bytes sent via SQL*Net to client
4967 bytes received via SQL*Net from client
67 SQL*Net roundtrips to/from client
16 sorts (memory)
0 sorts (disk)
0 rows processed
tkprof output:-
=============
select id from table1
where ( contains( name, ' $?AB and $?CD and $?EF ' ) > 0)
and id in ('1234','4567','97343','0934',4576','6376''5017', '3930', '5099',
'5089', '5061', '4753', '3646', '3357', '3047', '8059','4124', '17591','14700',
'60973', '34987', '37586', '58985', '34874', '49161', '11489');
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 33 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 0 33 0 0
Thanks,
TS
Followup November 12, 2003 - 4pm Central time zone:
i want the RAW tkprof with timings -- no explain plans (they can lie). show me the real tkprof,
making sure of course that timed statistics was ON
Thanks for the insight
December 30, 2003 - 4pm Central time zone
Reviewer: dharma from CA USA
HI Tom,
I have a table like this(A fictitious though)
company_id deptno empno status
1 10 1 A
1 10 2 A
1 10 3 P
1 20 4 A
1 20 5 A
1 30 6 A
1 30 7 P
2 10 8 A
2 10 9 A
2 20 10 P
2 30 11 A
Primary Key on company_id,deptno,empno
To get the depts that dont have employees with status=P.
Is there anyother way other than using not exists to bring the required rows
SELECT DISTINCT deptno FROM COMP_EMP ce
WHERE NOT EXISTS (SELECT 1 FROM COMP_EMP ces
WHERE ce.company_id=ces.company_id
AND ce.deptno=ces.deptno
AND ces.status='P')
since it does a sub-query on the table comp_emp and comp_emp has a million rows, will "not exists"
withstand the query.
BTW
Database is Oracle9i Enterprise Edition Release 9.2.0.4.0
Followup December 30, 2003 - 4pm Central time zone:
select deptno, count( decode(status,'P',1) )
from comp_emp
group by deptno
having count( decode( status,'P',1 ) ) = 0;
will do it as well. there are dozens of ways....
select deptno from comp_emp
minus
select deptno from comp_emp where status = 'P';
is another....
ORA-01415 too many distinct aggregate functions..
January 2, 2004 - 6am Central time zone
Reviewer: Naveen from India
Hi Tom,
I have a select statement which has a IN clause like below:
Select <column list>
from <tablename>
where column1 in(1,2,3,..)
When i am testing it by giving more and more values in the IN clause it gave the above error. Why
is there a limit for values that can be set in a IN clause. Is the above approach recommended. Is
there a way to overcome this. Kindly throw some light on the issue.
Thank you
Nav.
Followup January 2, 2004 - 9am Central time zone:
there is a documented limited on the number of OR's and inlist items.
use a subquery (you probably aren't using binds either are you!).
create global temporary table t ( x int ) on commit delete rows;
and then INSERT the values into t and query simply
select * from table where c in ( select * from t );
Thanks Tom
January 2, 2004 - 11am Central time zone
Reviewer: Naveen from India
Hi Tom,
Yes you are correct. I am not using bind variables either. I am checking the limit of IN clause as
i have to decide which approach i need to take. I'll keep this in mind and thanks again for
immediate reply.
Thank you
Nav.
Followup January 2, 2004 - 1pm Central time zone:
not using bind variables will be the largest mistake of your entire career! rethink that.
interesting ?
February 3, 2004 - 6am Central time zone
Reviewer: Fenng from CNOUG China
SQL> desc a
Name Null? Type
----------------------------------------- -------- ----------------------------
X NUMBER(38)
Y VARCHAR2(1)
SQL> desc b
Name Null? Type
----------------------------------------- -------- ----------------------------
X1 NUMBER(38)
Y1 VARCHAR2(1)
SQL> select x from a
2 where x not in (select x from b);
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'A'
3 1 FILTER
4 3 TABLE ACCESS (FULL) OF 'B'
SQL>
It's OK.But "x" is NOT table B's field.Would you like explain it for us ? Thanks a lot !
Followup February 3, 2004 - 7am Central time zone:
it is expected, done by design, part of the standard.
it is simply called a correlated subquery.
consider:
select * from dept
where exists ( select null from emp where emp.deptno = dept.deptno )
^^^^^^^^^^^
you expect that to work right? well:
SQL> select x from a
2 where x not in (select x from b);
is not any different than:
SQL> select x from a
2 where x not in (select A.X from b);
Thank you!
February 3, 2004 - 9pm Central time zone
Reviewer: Fenng
Thank you very much~!
How to optimize this query?
April 14, 2004 - 9am Central time zone
Reviewer: Tony from India, Chennai
Tom,
I want the rows from table cust_trans that are not in both cust_mast and customer_master@db_link1
tables. The query is
SELECT a.cust_id, a.cust_name, a.trans_date
FROM cust_trans SUBPARTITION(sp001) a
WHERE( a.cust_id
NOT IN(SELECT b.cust_id
FROM cust_mast SUBPARTITION(sp001) b )
AND a.cust_id
NOT IN(SELECT c.cust_id
FROM customer_master@db_link1 c )
);
The table customer_master is accessed through db link and it has millions of records. Other tables
also have millions of records. The subpartiton name is dynamically passed to the stored procedure.
Thus I'm going to use dynamic cursor or Execute immediate. Is there any way to optimize this query?
Followup April 14, 2004 - 10am Central time zone:
got a query plan for us to look at? (set autotrace traceonly explain would be best)
exists is NOT the same as join right?
April 28, 2004 - 4am Central time zone
Reviewer: A reader
Hi
I always thought that an EXISTS can be converted to a JOIN until I find this query
select count(*)
FROM OM_WELCOME_PACK t
WHERE EXISTS (SELECT 1
FROM OM_EMF e, LISTA_ORDENES l
WHERE t.emf_ext_id=e.external_id
AND e.order_id=l.order_id
AND l.order_type_id=3)
COUNT(*)
----------
99
If I convert this to a join
select count(*)
FROM OM_WELCOME_PACK t, OM_EMF e, LISTA_ORDENES l
WHERE t.emf_ext_id=e.external_id
AND e.order_id=l.order_id
COUNT(*)
----------
163
I get duplicate rows, is this correct behaviour...?
Jr. Oracle Developer
May 24, 2004 - 5pm Central time zone
Reviewer: Dave from California, USA
Hello Tom,
I need to combine three select statement in script1/select1.
Two columns (PCS, Speciality) of script1 should be base on the condition define in script 2 &
script 3.
Columns alias are "PCS" and "Specialty"
PCS should be base on script two and Speciality should be base on script 3.
Please have a look.
Script 1: (Main Script)
select a.ref_patient_fname|| ' , ' || a.ref_patient_lname "Patient Name",
to_char( a.ref_patient_dob, 'mm/dd/yy' ) DOB,
a.ref_patient_id1 "OVMC ID",
d.site_name || ' , '|| b. clact_desc "PCS",
d.site_name || ' , '|| b. clact_desc "Specialty",
to_char(c.appt_date , 'HH:MI AM' ) Appointments
from dhs_referrals a,
dhs_clinical_activities b,
dhs_appointments c,
dhs_sites d
where a.ref_clact_id_requested = b.clact_id and
a.ref_clact_id_requesting = b.clact_id and
c.appt_ref_id = a.ref_id and
d.site_id = b.clact_site_id
Script 2:
select c.site_name || ' , '|| b. clact_desc "PCS" , a.ref_clact_id_requesting
from dhs_sites c, dhs_clinical_activities b, dhs_referrals a
where a.ref_clact_id_requesting = b.clact_id
Script 3
select c.site_name || ' , '|| b. clact_desc "Speciality" , a.ref_clact_id_requested
from dhs_sites c, dhs_clinical_activities b, dhs_referrals a
where a.ref_clact_id_requested = b.clact_id
Any help will be appreciated.
Dave
Followup May 24, 2004 - 7pm Central time zone:
don't understand the question. I see three queries but I've not any idea what you are trying to do
with them really?
Jr. Oracle Developer
May 24, 2004 - 7pm Central time zone
Reviewer: Dave from CA, USA
Sorry for inconvinince.
Actually, On my script1.
I have two columns "PCS" and "Speciality".
I need to call values for "PCS" and "Speciality" based on script 2 & script 3 respectively.
e.g:
select c.site_name || ' , '|| b. clact_desc "PCS"
from dhs_sites c, dhs_clinical_activities b, dhs_referrals a
where a.ref_clact_id_requesting = b.clact_id
will give me the values for "PCS" in script1.
and
select c.site_name || ' , '|| b. clact_desc "Speciality"
from dhs_sites c, dhs_clinical_activities b, dhs_referrals a
where a.ref_clact_id_requested = b.clact_id
will give me values for "Speciality" in Script1.
Please note: I used "PCS" and "Speciality" as a column aliases in all scripts.
cheers,
Dave
Followup May 24, 2004 - 11pm Central time zone:
and what does:
I need to call values for "PCS" and "Speciality" based
mean exactly? I mean you have two spool files maybe, one full of column "X" and one full of column
"Y". Now what? what is the logic to use that?
Jr. Oracle Developer
May 25, 2004 - 12pm Central time zone
Reviewer: Dave from CA, USA
No I do not have any spool file etc.
Here what I wanted to do.
e.g:
I have main script which is consists of three columns.
select a.DOB,
b.clact_desc||' , '||c.site_name "PCS",
b.clact_desc||' , '||c.site_name "Speciality"
from dhs_referrals a, dhs_clinical b, dhs_site c
where .....
In dhs_referrals table I also have two more columns called clact_requested_id &
clact_requesting_id.
Here is the logic I wanted to put for column 2 & 3 respectively.
where b.clact_id = a.clact_requesting_id
gives me the value for PCS. (Column 2 in the main script above)
or
where b.clact_id = a.clact_requested_id gives me the value for Speacility. (Column 3 in the main
script above)
I need your help to add above two logic in the main select script.
Followup May 25, 2004 - 12pm Central time zone:
sorry -- i don't think we are going to get anywhere here. Your terminology is just not anything
I'm understanding. I do not see the tie in between columns in a select and separate predicates.
Join or sub-query
August 20, 2004 - 5am Central time zone
Reviewer: atul from India
Hi,
What is fast Join or subquery??
Under what conditions??
Thanks,
Atul
Followup August 20, 2004 - 11am Central time zone:
er? does not compute.
How to update a row if the row exist if not insert?
September 23, 2004 - 10pm Central time zone
Reviewer: Vinod from Banglore, India
Tom - I want to update a row if the row exist if not insert that row. Is there a simple way to do
this? I was doing if exist then update if not insert but I would like to try it in one single DML
if possible.
Vinod
Followup September 24, 2004 - 10am Central time zone:
in theory you could use merge
merge into t
into ( select :x x, :y y, :z z from dual ) d
on ( t.x = d.x )
when matched then update set ...
when not matched then insert....
but
update t set y = :y, z = :z where x = :x;
if (sql%rowcount = 0)
then
insert into t (...) values (....);
end if;
it about the same in this case.
What about "=" and "LIKE" comparison operators
October 11, 2004 - 3pm Central time zone
Reviewer: denni50 from na
Hi Tom....Welcome back!....(we MISSED you!)
I know the discussion here involves NOT,EXISTS,NOT IN and
NOT EXISTS operators....however I recently stumbled upon a
situation where using the "=" comparison caused the
system to come to a grinding halt. When I changed the
operator to LIKE 'A%' instead of = 'A' the system performed
normally.
We recently converted test system to CBO and encountered
this situation for the first time when testing one of
our third party software modules where users are allowed
to insert additional sql statements. With RBO users would
enter: where col_name='A' and everything would work fine.
This was discovered when investigating why this function performed normally during the first phase,
then came to a grinding slowdown during the second phase and saw where some users entered col_name
LIKE 'A%' and others had entered col_name ='A'(this is when the system would virtually stop). When
I changed all the ='A' to LIKE 'A%'everything ran smoothly.
Have you ever encountered performance issues between = and
LIKE?
glad you made it "home" safe and sound.
Followup October 11, 2004 - 4pm Central time zone:
how do you gather statistics -- exact command.
when you do an autotrace traceonly explain on the two -- what do you see as differences? is the
CARD=xxxx part of the query even close to reality.
OEM SQL ANALYZE
October 11, 2004 - 5pm Central time zone
Reviewer: denni50 from na
I ran a sql analyze through OEM on both statements
the cost was around 10450-14512 for both sql statements.
Can't post results due to proprietary constraints.
There's alot of nested loops and bitmap conversions
from rowids with the largest table(approx 8+ million
records) doing full table scans.
Followup October 11, 2004 - 7pm Central time zone:
cost is not meaningful.
the CARD= parts -- are they even *close* to reality.
given you stats method of collection (no histograms -- just one bucket with method_opt at its
default) the optimizer is using "distributions and best guesses".
My guess is, 'A' has lots and lots of values.
where column = 'A' is guessing "too low" (no histograms)
where column like 'A%' is guessing "higher, closer to reality"
Gather statistics
October 11, 2004 - 5pm Central time zone
Reviewer: denni50 from na
I use the dbms_stats package:
gather_schema_stats cascade>=true
thanks Tom
October 12, 2004 - 9am Central time zone
Reviewer: denni50 from na
I understand what you're saying about the histograms.
While you were away I experimented and ran statistics(dbms_stats) on the schema and used the
method_opt SIZE AUTO on Indexed Columns parameter to gather histogram stats. Upon completion
everything I tried to run, simple sql statements and queries took forever to run. I would log into
the software application and everything seemed to just hang. I deleted all the histograms, re-ran
table and index stats only and performance went back to normal processing
except for this latest incident with the = and LIKE operators.
I'm still trying to learn and understand how histograms
affect and work within CBO. The other frustrating part
is that I'm dealing with code that is not mine so tuning
it is out of the question. All I can do is run my own
analyze and trace reports...send it to the vendor and
HOPE someone figures out the problem.
I am going to re-run the trace reports through sqlplus
instead of OEM since the OEM does not show the card=
(at least I didn't see it unless it is represented under
a different header).
I know "cost" is not what is important. Cost is just an
arbitrary number assigned to a particular query at a
particular time in a given environment that the
optimizer chooses as the lowest cost.
I've learned quite a bit in 2+ years but still have a
heck of alot more to learn and understand as far as CBO
is concerned.
Explain Plan
October 15, 2004 - 3pm Central time zone
Reviewer: denni50 from na
Tom
attached is the explain plan for the scenario I've been describing in the earlier threads here
about the '=' and
'LIKE' operators.
(had to modify the names of tables,columns and indexes
since this is not my code.)
Table1 has approx 3.2 million records
Table2 is a temporary table(just for this function)
Table3 has 8+ million records.
Just by looking at the explain plan can you tell what
is going on. This is a new procedure that's been added
to a newer version of software that is causing the functionality to take 3 times longer to
complete.
thanks
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=15825 Card=468 Bytes
=30888)
1 0 NESTED LOOPS (Cost=15825 Card=468 Bytes=30888)
2 1 HASH JOIN (Cost=15824 Card=468 Bytes=28080)
3 2 NESTED LOOPS (Cost=2714 Card=15279 Bytes=320859)
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'TABLE1' (Cost=
2649 Card=15279 Bytes=229185)
6 5 INDEX (RANGE SCAN) OF 'TABLE1_COL2_INDEX'
(NON-UNIQUE) (Cost=439 Card=241176)
7 3 INDEX (UNIQUE SCAN) OF 'TABLE2_ID_IDX' (UNIQUE)
8 2 VIEW (Cost=13084 Card=11085 Bytes=432315)
9 8 FILTER
10 9 SORT (GROUP BY) (Cost=13084 Card=11085 Bytes=31038
0)
11 10 NESTED LOOPS (Cost=10452 Card=230371 Bytes=64503
88)
12 11 TABLE ACCESS (FULL) OF 'TABLE3' (Cost=10450 C
ard=230371 Bytes=5068162)
13 11 INDEX (UNIQUE SCAN) OF 'TABLE2_ID_IDX' (UNIQ
UE)
14 1 INDEX (UNIQUE SCAN) OF 'PK_TABLE1_ID' (UNIQUE)
Crazy behaviour!!
December 9, 2004 - 7am Central time zone
Reviewer: Ganesh from Bangalore, India
create table emp (empno number(3), dno number(3))
/
create table dept (dno number(3))
/
select * from emp
where empno in
(select empno from dept)
/
I'm wondering how the above query works when there is no empno field in dept table?
Followup December 9, 2004 - 2pm Central time zone:
that is not crazy, it is expected and called a correlated subquery.
select *
from emp
where emp.empno in ( select emp.empno
from dept );
makes more sense with the correlation names there.....
just like:
select * from dept
where exists ( select null from emp where emp.deptno = dept.deptno );
the columns of the outer query are available in the inner queries.
why is it so?
December 10, 2004 - 1am Central time zone
Reviewer: Ganesh from Bangalore, India
Thanks Tom. I guessed that it would be correlated subquery.
By why does oracle apply its intelligence, when I dont mention explicitly emp.empno?
Followup December 10, 2004 - 10am Central time zone:
confusion -- it did apply "its intelligence"
your query was syntactically correct
your query was valid SQL
your query must be run as is
It did exactly what is it documented, expected and supposed to do. It is not a "bad query", it is
not a "mistake". the only mistake here was that you didn't mean to run THAT query.
This is like asking us to raise an error because you query "where 1=0" or something.
delete performance issue using 'IN'
December 13, 2004 - 12pm Central time zone
Reviewer: A reader
Hi Tom,
I have the following SQL:
DELETE
FROM nitsmedia
WHERE nitstory_srl not in
(
SELECT srl
FROM nitstory
WHERE lower(category) = 'stjohns'
AND to_date('MAY-01-2003' || ' 00:00:00', 'MM-DD-YYYY HH24:MI:SS') <= last_modified_time
AND last_modified_time <= to_date('MAY-01-2003' || '23:59:59', 'MM-DD-YYYY HH24:MI:SS')
AND story_origin = 'Regional'
);
It takes forever!
Number of rows for inner query: 10 rows
Number of rows for outer query: 15289
However, the following are fast:
(1) The inner query:
SELECT srl
FROM nitstory
WHERE lower(category) = 'stjohns'
AND to_date('MAY-01-2003' || ' 00:00:00', 'MM-DD-YYYY HH24:MI:SS') <= last_modified_time
AND last_modified_time <= to_date('MAY-01-2003' || '23:59:59', 'MM-DD-YYYY HH24:MI:SS')
AND story_origin = 'Regional';
SRL
----------
70640
70670
70671
70673
70674
70681
70682
70688
70689
70692
10 rows selected.
(2)
DELETE FROM nitsmedia WHERE nitstory_srl IN
(70640,70670,70671,70673,70674,70681,70682,70688,70689,70692)
(3) DELETE FROM nitsmedia;
I remember this rule:
If the result of the subquery is small -- then IN is typicaly more appropriate. However, this is
not working this time!
Can you shed any light on why it is like this and what I should do?
Here are the info:
Oracle 9.2 on HP-UX
All tabled have been analyzed
Thanks,
- Arash
Followup December 13, 2004 - 2pm Central time zone:
are you using the CBO
and is sr1 defined as NOT NULL
(and forget that rule of thumb, not really valid in CBO days... this dates way back to the time of
the RBO... thinks change.)

December 13, 2004 - 2pm Central time zone
Reviewer: A reader
Yes Tom. I use the default which is 'choose' as optimizer mode. Srl is not null and there is
foreign key between srl and nitstory_srl
and here is the explain plan:
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=232 Card=15288 Bytes
=1223040)
1 0 DELETE OF 'NITSMEDIA'
2 1 FILTER
3 2 NESTED LOOPS (ANTI) (Cost=232 Card=15288 Bytes=1223040
)
4 3 TABLE ACCESS (FULL) OF 'NITSMEDIA' (Cost=25 Card=152
89 Bytes=397514)
5 3 TABLE ACCESS (BY INDEX ROWID) OF 'NITSTORY' (Cost=23
2 Card=1 Bytes=54)
6 5 BITMAP CONVERSION (TO ROWIDS)
7 6 BITMAP AND
8 7 BITMAP CONVERSION (FROM ROWIDS)
9 8 INDEX (RANGE SCAN) OF 'PK_NITSTORY' (UNIQUE)
10 7 BITMAP CONVERSION (FROM ROWIDS)
11 10 INDEX (RANGE SCAN) OF 'I2_NITSTORY_STORY_ORI
GIN' (NON-UNIQUE)
SQL> desc nitsmedia
Name Null? Type
----------------------------------------- -------- ----------------------------
SRL NOT NULL NUMBER(12)
NITSTORY_SRL NOT NULL NUMBER(12)
MEDIAFILE_NME VARCHAR2(100)
MEDIA_TYP NOT NULL VARCHAR2(20)
THUMBNAIL VARCHAR2(100)
CAPTION VARCHAR2(1000)
POSITION NUMBER
LAYOUT VARCHAR2(10)
SQL> desc nitstory
Name Null? Type
----------------------------------------- -------- ----------------------------
SRL NOT NULL NUMBER(12)
STORYFILE_NME NOT NULL VARCHAR2(30)
AUTHOR VARCHAR2(42)
STORY_ORIGIN VARCHAR2(10)
CATEGORY VARCHAR2(30)
HEADLINE VARCHAR2(200)
SUBHEADLINE VARCHAR2(200)
KEYWORDS VARCHAR2(200)
HISTORY VARCHAR2(200)
DATELINE VARCHAR2(100)
HIGHLIGHT VARCHAR2(1200)
CREATION_TIME VARCHAR2(30)
LAST_MODIFIED_TIME VARCHAR2(30)
TIMEZONE VARCHAR2(25)
FOOTNOTES VARCHAR2(1000)
BODY CLOB
I am not sure why there lots of "BITMAP CONVERSION" in plan?
Thank you Tom,
- Arash'
Followup December 13, 2004 - 2pm Central time zone:
can we see a tkprof of the "really slow one"

December 14, 2004 - 11am Central time zone
Reviewer: A reader
Tom,
Here is the tkprof for the query. I canceled the connection in the middle as I told you it takes
forever. I also noticed that, it locks both tables 'nitsmedia' and 'nitstory'.
Thanks again for your help and outstanding website.
- Arash
DELETE
FROM nitsmedia
WHERE nitstory_srl not in
(
SELECT srl
FROM nitstory
WHERE lower(category) = 'stjohns'
AND to_date('MAY-01-2003' || ' 00:00:00', 'MM-DD-YYYY HH24:MI:SS') <= last_mo
dified_time
AND last_modified_time <= to_date('MAY-01-2003' || '23:59:59', 'MM-DD-YYYY H
H24:MI:SS')
AND story_origin = 'Regional'
)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.11 0.11 0 323 0 0
Execute 1 51.59 52.22 0 2336957 2122 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 51.70 52.34 0 2337280 2122 0
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 101 (NI_DDL)
Rows Execution Plan
------- ---------------------------------------------------
0 DELETE STATEMENT GOAL: CHOOSE
0 DELETE OF 'NITSMEDIA'
0 FILTER
0 TABLE ACCESS (FULL) OF 'NITSMEDIA'
0 TABLE ACCESS (BY INDEX ROWID) OF 'NITSTORY'
0 INDEX (RANGE SCAN) OF 'I2_NITSTORY_STORY_ORIGIN'
(NON-UNIQUE)
********************************************************************************
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.11 0.11 0 323 0 0
Execute 1 51.59 52.22 0 2336957 2122 0
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 51.70 52.34 0 2337280 2122 0
Misses in library cache during parse: 1
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 29 0.06 0.04 0 147 0 0
Execute 76 0.02 0.01 0 0 0 0
Fetch 287 0.00 0.01 0 323 0 223
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 392 0.08 0.07 0 470 0 223
Misses in library cache during parse: 20
1 user SQL statements in session.
29 internal SQL statements in session.
30 SQL statements in session.
1 statement EXPLAINed in this session.
********************************************************************************
Trace file: /db/9.2/dbo01/log/udump/dbo01_ora_11063.trc
Trace file compatibility: 9.00.01
Sort options: default
1 session in tracefile.
1 user SQL statements in trace file.
29 internal SQL statements in trace file.
30 SQL statements in trace file.
21 unique SQL statements in trace file.
1 SQL statements EXPLAINed using schema:
SYSTEM.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
604 lines in trace file.
Followup December 15, 2004 - 12pm Central time zone:
sorry -- but a query won't lock any tables.
but something is fishy here.
if you run this script:
drop table emp;
create table emp as select * from scott.emp;
alter session set sql_trace=true;
select count(*) from emp;
then tkprof will say:
select count(*)
from
emp
....
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
14 TABLE ACCESS (FULL) OF 'EMP'
but if you run this one:
drop table emp;
create table emp as select * from scott.emp;
exec dbms_stats.gather_table_stats( user, 'EMP' );
alter session set sql_trace=true;
select count(*) from emp;
tkprof says
select count(*)
from
emp
....
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
14 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMP'
But before you showed us:
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=232 Card=15288 Bytes
=1223040)
1 0 DELETE OF 'NITSMEDIA'
2 1 FILTER
3 2 NESTED LOOPS (ANTI) (Cost=232 Card=15288 Bytes=1223040
)
4 3 TABLE ACCESS (FULL) OF 'NITSMEDIA' (Cost=25 Card=152
89 Bytes=397514)
5 3 TABLE ACCESS (BY INDEX ROWID) OF 'NITSTORY' (Cost=23
2 Card=1 Bytes=54)
6 5 BITMAP CONVERSION (TO ROWIDS)
7 6 BITMAP AND
8 7 BITMAP CONVERSION (FROM ROWIDS)
9 8 INDEX (RANGE SCAN) OF 'PK_NITSTORY' (UNIQUE)
10 7 BITMAP CONVERSION (FROM ROWIDS)
11 10 INDEX (RANGE SCAN) OF 'I2_NITSTORY_STORY_ORI
GIN' (NON-UNIQUE)
which shows the use of the CBO (the cost= card= is there ONLY when the CBO is used)
so, I'm thinking you have a big mixture of ANALYZED and UNANALYZED tables here the and the
optimizer doesn't stand a ghost's of a chance of getting the right answer.
EG: table you are deleting from is analyzed....
These tables you are querying are not (and must be -- if you want NOT IN to run "in your
lifetime"

December 15, 2004 - 2pm Central time zone
Reviewer: A reader
Tom,
All Tables are analyzed, however the problem still there.
ANALYZE TABLE NITSMEDIA COMPUTE STATISTICS FOR TABLE FOR ALL INDEXES ;
ANALYZE TABLE NITSTORY COMPUTE STATISTICS FOR TABLE FOR ALL INDEXES ;
I noticed something that may help.
I created two new tables like:
create table NITSTORY1 as select * from NITSTORY;
create table NITSMEDIA as select * from NITMEDIA;
and then ran the query with new table names:
DELETE
FROM nitsmedia1
WHERE nitstory_srl not in
(
SELECT srl
FROM nitstory1
WHERE lower(category) = 'stjohns'
AND to_date('MAY-01-2003' || ' 00:00:00', 'MM-DD-YYYY HH24:MI:SS') <= last_modified_time
AND last_modified_time <= to_date('MAY-01-2003' || '23:59:59', 'MM-DD-YYYY HH24:MI:SS')
AND story_origin = 'Regional'
)
and it was very FAST. The only difference that I saw between the old tables and new ones was a
foreign key between the old tables that new tables didn't have that. So I dropped the foreign key
from old tables to make it in sync with the new tables. But still the delete is very slow on old
tables.
Please let me know your idea.
Thanks,
- Arash
Followup December 15, 2004 - 6pm Central time zone:
don't use analyze.
use dbms_stats.gather_table_stats( user, 'table_name', cascade=>true );
so, show us the tkprofs between the two. (i've a feeling that not null column isn't really not
null -- eg: maybe it says not null but someone put it on there with "novalidate" or something)

December 16, 2004 - 9am Central time zone
Reviewer: A reader
Hi Tom,
can you tell me why, when I recreate tables (create table NITSTORY1 as select * from NITSTORY....),
it works well?
I also test combination of new tables and old tables in the delete statement, and noticed , the
problem is the inner table "NITSTORY' which makes the query slow. And when I replace this table
with the new table 'NITSTORY1' , it works fine.
Here are the tables:
create table NITSTORY (
SRL NUMBER(12) not null,
STORYFILE_NME VARCHAR2(30) not null,
AUTHOR VARCHAR2(42),
STORY_ORIGIN VARCHAR2(10),
CATEGORY VARCHAR2(30),
HEADLINE VARCHAR2(200),
SUBHEADLINE VARCHAR2(200),
KEYWORDS VARCHAR2(200),
HISTORY VARCHAR2(200),
DATELINE VARCHAR2(100),
HIGHLIGHT VARCHAR2(1200),
CREATION_TIME VARCHAR2(30),
LAST_MODIFIED_TIME VARCHAR2(30),
TIMEZONE VARCHAR2(25),
FOOTNOTES VARCHAR2(1000),
BODY CLOB,
constraint PK_NITSTORY primary key (SRL)
);
create table NITSMEDIA (
SRL NUMBER(12) not null,
NITSTORY_SRL NUMBER(12) not null,
MEDIAFILE_NME VARCHAR2(100),
MEDIA_TYP VARCHAR2(20) not null,
THUMBNAIL VARCHAR2(100),
CAPTION VARCHAR2(1000),
POSITION NUMBER,
LAYOUT VARCHAR2(10),
constraint PK_NITSMEDIA primary key (SRL)
);
alter table NITSMEDIA
add constraint FK1_NITSTORY_NITSMEDIA foreign key (NITSTORY_SRL)
references NITSTORY (SRL);
As always, thank you for your time,
- Arash
Followup December 16, 2004 - 10am Central time zone:
not without the tkprofs I asked for, no.

December 16, 2004 - 11am Central time zone
Reviewer: A reader
Tom.
I used:
dbms_stats.gather_table_stats( user, 'table_name', cascade=>true );
and it fixed the problem. Why is that? Can you elaborate it a little more?
I found that Analyze method from your site. But it seems it doesnÂ’t work well for 9.2!
Thank you again,
- Arash
Followup December 16, 2004 - 11am Central time zone:
you gathered table and index stats but no column stats -- that gathers table, column and index
stats.
analyze table t compute statistics;
would do roughly the same thing, but dbms_stats is more proper.
IN or EXISTS - back on topic !
December 16, 2004 - 3pm Central time zone
Reviewer: DaPi from Geneva - CH
Hi Tom,
Many thanks for the site. I much appreciate your scientific/analytic approach.
This thread keeps popping to the top even after 4 years. All you say is correct, but there is a
problem: I believe the question is not the right one to ask.
Real-life:
I have something to code. I do so to the best of my ability, keeping in mind clarity &
maintainability. I test to make sure I have the right result and, if the performance is
appropriate, I give thanks and go on to the next task. If it's slow (for the purpose), I quickly
try some tricks I have up my sleeve and, if still slow, begin an analysis of what is going "wrong".
I can't afford the time to analyse - not ANALYZE :) - every query.
I think the right question to ask is:
Q: I can code my query with IN or with EXISTS; which should I try first?
In 5 years of Rdb and 5 years of Oracle, I developed the habit of coding EXISTS. In those 5 Oracle
years I have recoded EXISTS as IN only twice (I'm pretty sure of this because it's been so rare).
A colleague with the IN habit, would give me a slow query to tune every week or two; I'd just
recode the IN as EXISTS.
OK, I have no statistics, but in my real-life experience the answer is:
A: Try EXISTS first. You may have to recode, but not very often.
Waiting to be blasted - DaPi
Followup December 16, 2004 - 3pm Central time zone:
A: use the cbo and it'll try out both.
No blasting, I use exists so infrequently and in almost all of the time myself.
IN & EXISTS - CBO tries both?
December 16, 2004 - 4pm Central time zone
Reviewer: DaPi from Geneva - CH
You said: "A: use the cbo and it'll try out both."
I've been running CBO (7.3.4 then 8.0.6 then 8.1.7) with up-to-date stats all those 5 years - it
must miss a trick from time to time, because recoding HAS made a difference in some cases.
I'd agree that in many cases there is NO performance difference between IN and EXISTS (when the CBO
gets it right?). But if there is going to be a difference, I'd put my money on EXISTS.
Thanks - DaPi
Followup December 16, 2004 - 5pm Central time zone:
select *
from big_table
where exists ( select null from small_table where small_table.x = big_table.x)
vs
select * from big_table
where x in ( select x from small_table )
give me the IN any time (in rbo days)
IN vs EXISTS
December 17, 2004 - 3am Central time zone
Reviewer: DaPi from Geneva - CH
OK - I'll buy that (confirmed by experiment under CBO).
I guess it's that my real-life experience tends not to deal with small tables . . . . and with big
tables it hurts even more when you guess wrong!
Followup December 17, 2004 - 7am Central time zone:
doesn't even have to be small tables, big tables that return smallish results
select * from big_table
where x in ( select x from really_huge_table where <some predicate>)
if some predicate returned just 100 rows from really huge table, IN would have been preferred :)

December 26, 2004 - 2am Central time zone
Reviewer: Logan
for x in ( select * from t1 )
loop
if ( exists ( select null from t2 where y = x.x )
then
OUTPUT THE RECORD
end if
end loop
-----------------------------------------------------------
The explanation you gave for IN & EXISTS being better and in which situation is convincing, but the
example loop you gave...
1.Well when you use exists the SQL always returns only one row, is that true i.e. when you used
exists to find whether a conditions is true or false. For examples
select decode( count(*), 1, 'y', 'n')
from dual where exists ( select null from t1 , t2 where
where t1.x = t2.x)
I guess you cannot do such a sql using IN?
2. My question is doesnt EXISTS as the name of the clause itself suggests checks for the existense
of a condition. Lets say if 10 rows meet that condition, does the exists clause read all the 10
rows, or it returns the result after meeting the condition for the first time.
Followup December 26, 2004 - 12pm Central time zone:
I was giving psuedo code here to explain the concept.
exists and in can both short circut since the CBO will turn in into exists and exists into in as
needed these days. so, it is safe to say that either can stop after the first one -- or get them
all and semijoin.

December 26, 2004 - 9pm Central time zone
Reviewer: Logan
--------------
it is safe to say that either can stop
after the first one -- or get them all and semijoin
-----------
1.Tom , can you please illustrate the same. Can you give an example where an EXISTS stops after the
first row, and where it gets all the rows.
2.What do you mean by a semi join?
Followup December 27, 2004 - 9am Central time zone:
it is that:
select * from t1 where EXISTS ( select NULL from t2 where t2.id = t1.id )
can be written as:
select * from t1 where id in ( select id from t2 );
and the optimizer is free to (re)write it either way.
consider:
ops$tkyte@ORA9IR2> create table t1( id int, y char(2000));
Table created.
ops$tkyte@ORA9IR2> create table t2( id int );
Table created.
ops$tkyte@ORA9IR2> delete from plan_table;
4 rows deleted.
ops$tkyte@ORA9IR2> explain plan for
2 select * from t1 where exists ( select null from t2 where t2.id = t1.id );
Explained.
ops$tkyte@ORA9IR2> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
--------------------
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | FILTER | | | | |
| 2 | TABLE ACCESS FULL | T1 | | | |
|* 3 | TABLE ACCESS FULL | T2 | | | |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "T2" "T2" WHERE "T2"."ID"=:B1))
3 - filter("T2"."ID"=:B1)
Note: rule based optimization
17 rows selected.
that shows Oracle doing the "where exists" as a filter -- for every row in T1 it'll run that
filter (which results in a full scan of T2 -- but stops after it hits the first "t2.id = :b1" -- so
it'll not totally full scan the table each time PERHAPS, just scans till it gets a hit)
But, lets use the more intelligent CBO and tell it about our tables (tell it "t1" is medium size,
"t2" is big)
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 5000, numblks=>500);
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 50000, numblks=>5000);
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> delete from plan_table;
4 rows deleted.
ops$tkyte@ORA9IR2> explain plan for
2 select * from t1 where exists ( select null from t2 where t2.id = t1.id );
Explained.
ops$tkyte@ORA9IR2> select * from table( dbms_xplan.display );
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 551K| 854 |
|* 1 | HASH JOIN SEMI | | 5000 | 551K| 854 |
| 2 | TABLE ACCESS FULL | T1 | 5000 | 488K| 78 |
| 3 | TABLE ACCESS FULL | T2 | 50000 | 634K| 760 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."ID"="T1"."ID")
Note: cpu costing is off
16 rows selected.
ops$tkyte@ORA9IR2>
Now, it is doing a single full scan of T1 and of T2 (instead of a full scan of T2 for EVERY ROW
IN T1) followed by that "semi join" thingy (which I wrote about in Effective Oracle by Design in
detail)
The semi join is a join that joins T1 with T2 but only ONCE (if id=55 is in T2 5,000 times -- we
don't join the row in T1 with id=55 to each on -- only to the first). A semi join is a "where
exists" style join, just join to the first occurrence
But - we built the entire set. We read all of T2, got all of the rows and are joining (semi
joining) to them
how to use index in exists statement?
March 25, 2005 - 12am Central time zone
Reviewer: A reader
Tom,
I have the delete on the large table pa_budget_lines.
I want to use the index on resource_assignment_id on the table pa_budget_lines. It seams
impossible?
thanks a lot!
DELETE FROM pa_budget_lines bl
where EXISTS (SELECT 'X'
FROM pa_resource_assignments ra,
pa_budget_versions bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bl.resource_assignment_id = ra.resource_assignment_id
AND bv.budget_version_id = ra.budget_version_id)
/
Followup March 25, 2005 - 7am Central time zone:
are you using the cbo.

March 25, 2005 - 9pm Central time zone
Reviewer: A reader
"
are you using the cbo.
"
Yes, SIr. it the cost based opt on 8174.
Followup March 26, 2005 - 9am Central time zone:
then lets see the autotrace traceonly explain output. perhaps the index would not be useful.

March 26, 2005 - 11am Central time zone
Reviewer: A reader
the autotrace as the follow.
there 2 millions rows in pa_budget_lines , 164k in pa_resource_assignments and 13k in
pa_budget_versions .
APPS@S0P2DBT3> DELETE FROM pa_budget_lines bl
2 where EXISTS (SELECT 'X'
3 FROM pa_resource_assignments ra,
4 pa_budget_versions bv
5 WHERE bv.current_flag <> 'Y'
6 AND bv.budget_status_code = 'B'
7 AND bv.baselined_date IS NOT NULL
8 AND bl.resource_assignment_id = ra.resource_assignment_id
9 AND bv.budget_version_id = ra.budget_version_id)
10 /
432 rows deleted.
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=11837 Card=102454 Bytes=2868712)
1 0 DELETE OF 'PA_BUDGET_LINES'
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'PA_BUDGET_LINES' (Cost=11837 Card=102454 Bytes=2868712)
4 2 NESTED LOOPS (Cost=4 Card=1 Bytes=24)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'PA_RESOURCE_ASSIGNMENTS' (Cost=3 Card=1
Bytes=10)
6 5 INDEX (UNIQUE SCAN) OF 'PA_RESOURCE_ASSIGNMENTS_U1' (UNIQUE) (Cost=2 Card=2)
7 4 TABLE ACCESS (BY INDEX ROWID) OF 'PA_BUDGET_VERSIONS' (Cost=1 Card=1085
Bytes=15190)
8 7 INDEX (UNIQUE SCAN) OF 'PA_BUDGET_VERSIONS_U1' (UNIQUE)
Statistics
----------------------------------------------------------
1329 recursive calls
8377 db block gets
1314941 consistent gets
84134 physical reads
586996 redo size
515 bytes sent via SQL*Net to client
939 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
432 rows processed
APPS@S0P2DBT3>
APPS@S0P2DBT3> rollback
2 /
Rollback complete.
Followup March 26, 2005 - 12pm Central time zone:
just explain this query (you don't need run it, just explain plan it)
ops$tkyte@ORA9IR2> DELETE FROM t1 bl
2 where resource_assignment_id in
3 (SELECT ra.resource_assignment_id
4 FROM t2 ra,
5 t3 bv
6 WHERE bv.current_flag <> 'Y'
7 AND bv.budget_status_code = 'B'
8 AND bv.baselined_date IS NOT NULL
9 AND bv.budget_version_id = ra.budget_version_id)
10 /

March 26, 2005 - 1pm Central time zone
Reviewer: A reader
apps@s0p2dbt3> explain plan for
2 DELETE FROM pa_budget_lines bl
3 where resource_assignment_id in
4 (SELECT ra.resource_assignment_id
5 FROM pa_resource_assignments ra,
6 pa_budget_versions bv
7 WHERE bv.current_flag <> 'Y'
8 AND bv.budget_status_code = 'B'
9 AND bv.baselined_date IS NOT NULL
10 AND bv.budget_version_id = ra.budget_version_id)
11 /
Explained.
apps@s0p2dbt3>
apps@s0p2dbt3> @plan
QUERY_PLAN
------------------------------------------------------------
DELETE PA_BUDGET_LINES
HASH JOIN
VIEW VW_NSO_1
SORT UNIQUE
HASH JOIN
TABLE ACCESS FULL PA_BUDGET_VERSIONS
TABLE ACCESS FULL PA_RESOURCE_ASSIGNMENTS
TABLE ACCESS FULL PA_BUDGET_LINES
Followup March 26, 2005 - 1pm Central time zone:
are you sure the indices are in place? what happens with the following scripts?
(safe if you do not have t1, t2, t3)
drop table t1;
drop table t2;
drop table t3;
create table t1 ( resource_assignment_id int, data char(80) );
create table t2 ( resource_assignment_id int, budget_version_id int, data char(80) );
create table t3 ( current_flag varchar2(1), budget_status_code varchar2(10),
baselined_date date, budget_version_id int, data char(80) );
exec dbms_stats.set_table_stats( user, 'T1', numrows=> 2000000, numblks => 12000 );
exec dbms_stats.set_table_stats( user, 'T2', numrows=> 164000, numblks => 1000 );
exec dbms_stats.set_table_stats( user, 'T3', numrows=> 13000, numblks => 500 );
create index t1_idx on t1(resource_assignment_id);
create UNIQUE index t2_idx on t2(resource_assignment_id);
create UNIQUE index t3_idx on t3(budget_version_id);
exec dbms_stats.set_index_stats( user, 'T1_IDX', numrows=> 2000000, numlblks => 1200 );
exec dbms_stats.set_index_stats( user, 'T2_IDX', numrows=> 164000, numlblks => 100 );
exec dbms_stats.set_index_stats( user, 'T3_IDX', numrows=> 13000, numlblks => 50 );
set autotrace on explain
DELETE FROM t1 bl
where EXISTS (SELECT 'X'
FROM t2 ra,
t3 bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bl.resource_assignment_id = ra.resource_assignment_id
AND bv.budget_version_id = ra.budget_version_id)
/
DELETE FROM t1 bl
where resource_assignment_id in
(SELECT ra.resource_assignment_id
FROM t2 ra,
t3 bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bv.budget_version_id = ra.budget_version_id)
/
set autotrace off

March 26, 2005 - 3pm Central time zone
Reviewer: A reader
apps@s0p2dbt3> create table pa.t2 as select * from pa.pa_resource_assignments;
Table created.
apps@s0p2dbt3> create table pa.t3 as select * from pa.pa_budget_versions ;
Table created.
apps@s0p2dbt3> create table pa.t1 as select * from pa.pa_budget_lines;
Table created.
apps@s0p2dbt3> exec dbms_stats.set_table_stats( 'PA', 'T1', numrows=> 2000000, numblks => 1
2000 );
PL/SQL procedure successfully completed.
apps@s0p2dbt3> exec dbms_stats.set_table_stats( 'PA', 'T3', numrows=> 13000, numblks => 500
);
PL/SQL procedure successfully completed.
apps@s0p2dbt3> exec dbms_stats.set_table_stats( 'PA', 'T2', numrows=> 164000, numblks =>
1000 );
PL/SQL procedure successfully completed.
apps@s0p2dbt3>create index pa.t1_idx on pa.t1(resource_assignment_id) tablespace pax;
Index created.
SQL> create UNIQUE index pa.t2_idx on pa.t2(resource_assignment_id) tablespace pax;
create UNIQUE index pa.t3_idx on pa.t3(budget_version_id) tablespace pax;
Index created.
SQL>
Index created.
apps@s0p2dbt3> exec dbms_stats.set_index_stats( 'PA', 'T1_IDX', numrows=> 2000000, numlblks
=> 1200 );
PL/SQL procedure successfully completed.
apps@s0p2dbt3> exec dbms_stats.set_index_stats( 'PA', 'T2_IDX', numrows=> 164000, numlblks
=> 100 );
PL/SQL procedure successfully completed.
apps@s0p2dbt3> exec dbms_stats.set_index_stats( 'PA', 'T3_IDX', numrows=> 13000, numlblks =
> 50 );
PL/SQL procedure successfully completed.
apps@s0p2dbt3> set autotrace on explain
apps@s0p2dbt3> DELETE FROM pa.t1 bl
2 where EXISTS (SELECT 'X'
3 FROM pa.t2 ra,
4 pa.t3 bv
5 WHERE bv.current_flag <> 'Y'
6 AND bv.budget_status_code = 'B'
7 AND bv.baselined_date IS NOT NULL
8 AND bl.resource_assignment_id = ra.resource_assignment_id
9 AND bv.budget_version_id = ra.budget_version_id)
10 /
432 rows deleted.
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=1201 Card=100000 Byt
es=1300000)
1 0 DELETE OF 'T1'
2 1 INDEX (FULL SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=1201 Ca
rd=100000 Bytes=1300000)
3 2 NESTED LOOPS (Cost=3 Card=1 Bytes=67)
4 3 TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=2 Card=1
Bytes=26)
5 4 INDEX (UNIQUE SCAN) OF 'T2_IDX' (UNIQUE) (Cost=1 C
ard=1640)
6 3 TABLE ACCESS (BY INDEX ROWID) OF 'T3' (Cost=1 Card=1
Bytes=41)
7 6 INDEX (UNIQUE SCAN) OF 'T3_IDX' (UNIQUE)
apps@s0p2dbt3>
apps@s0p2dbt3> rollback
2 /
Rollback complete.
apps@s0p2dbt3>
apps@s0p2dbt3> DELETE FROM pa.t1 bl
2 where resource_assignment_id in
3 (SELECT ra.resource_assignment_id
4 FROM pa.t2 ra,
5 pa.t3 bv
6 WHERE bv.current_flag <> 'Y'
7 AND bv.budget_status_code = 'B'
8 AND bv.baselined_date IS NOT NULL
9 AND bv.budget_version_id = ra.budget_version_id)
10 /
432 rows deleted.
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=1474 Card=32800000 B
ytes=852800000)
1 0 DELETE OF 'T1'
2 1 MERGE JOIN (Cost=1474 Card=32800000 Bytes=852800000)
3 2 INDEX (FULL SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=1201
Card=2000000 Bytes=26000000)
4 2 SORT (JOIN) (Cost=273 Card=1640 Bytes=21320)
5 4 VIEW OF 'VW_NSO_1' (Cost=266 Card=1640 Bytes=21320)
6 5 SORT (UNIQUE) (Cost=266 Card=1640 Bytes=109880)
7 6 HASH JOIN (Cost=246 Card=1640 Bytes=109880)
8 7 TABLE ACCESS (FULL) OF 'T3' (Cost=78 Card=1 By
tes=41)
9 7 TABLE ACCESS (FULL) OF 'T2' (Cost=153 Card=164
000 Bytes=4264000)
apps@s0p2dbt3>
apps@s0p2dbt3> rollback
2 /
Rollback complete.
Followup March 26, 2005 - 4pm Central time zone:
looks like it could be a limitation of the very old software. I need to go way back to 8i to
reproduce this behavior.
suggest you might temporarily try two stepping this, eg:
tkyte@ORA8IW> create global temporary table gtt as
2 (SELECT ra.resource_assignment_id
3 FROM t2 ra
4 WHERE 1=0)
5 /
Table created.
tkyte@ORA8IW> exec dbms_stats.set_table_stats( user, 'GTT', numrows => 500 );
PL/SQL procedure successfully completed.
the above two things are done ONCE, then:
tkyte@ORA8IW> insert into gtt
2 (SELECT ra.resource_assignment_id
3 FROM t2 ra,
4 t3 bv
5 WHERE bv.current_flag <> 'Y'
6 AND bv.budget_status_code = 'B'
7 AND bv.baselined_date IS NOT NULL
8 AND bv.budget_version_id = ra.budget_version_id);
tkyte@ORA8IW> delete from t1 where resource_assignment_id in ( select * from gtt );
Execution Plan
----------------------------------------------------------
0 DELETE STATEMENT Optimizer=CHOOSE (Cost=519 Card=10000000 Bytes=260000000)
1 0 DELETE OF 'T1'
2 1 NESTED LOOPS (Cost=519 Card=10000000 Bytes=260000000)
3 2 VIEW OF 'VW_NSO_1' (Cost=19 Card=500 Bytes=6500)
4 3 SORT (UNIQUE) (Cost=19 Card=500 Bytes=6500)
5 4 TABLE ACCESS (FULL) OF 'GTT' (Cost=16 Card=500 Bytes=6500)
6 2 INDEX (RANGE SCAN) OF 'T1_IDX' (NON-UNIQUE) (Cost=1 Card=2000000 Bytes=26000000)
tkyte@ORA8IW> set autotrace off
Purge huge table
April 2, 2005 - 7am Central time zone
Reviewer: A reader
Tom,
"
looks like it could be a limitation of the very old software. I need to go way
back to 8i to reproduce this behavior.
suggest you might temporarily try two stepping this, eg:
"
I should give you the what I am trying to accomplish here.
We want to purge a table which has 70 million rows about 10 GB. on critira to partition the table.
the goal is to remove 60 million rows from it based the critiras.
After reading your thread:
http://asktom.oracle.com/pls/ask/f?p=4950:8:344944802785198156::NO::F4950_P8_DISPLAYID,F4950_P8_CRIT
ERIA:6407993912330
It is no pratical to delete the table by the conventional method.
So i am working on based on
create tmp table with nologging;
insert table with append, and parallel;
drop original table;
rename tmp to original;
recreate indexes and grants;
done.
the insert 60 millions took 20 hours.
I would like your comment on the temp table, if the temp suitable to the massive inserts. what are
needed to be change on the table:
CREATE TABLE PURGE_TMP_S
(
RESOURCE_ASSIGNMENT_ID NUMBER(15) NOT NULL,
START_DATE DATE NOT NULL,
LAST_UPDATE_DATE DATE NOT NULL,
LAST_UPDATED_BY NUMBER NOT NULL,
CREATION_DATE DATE NOT NULL,
CREATED_BY NUMBER NOT NULL,
LAST_UPDATE_LOGIN NUMBER NOT NULL,
END_DATE DATE NOT NULL,
PERIOD_NAME VARCHAR2(30),
QUANTITY NUMBER,
RAW_COST NUMBER,
BURDENED_COST NUMBER,
REVENUE NUMBER,
CHANGE_REASON_CODE VARCHAR2(30),
DESCRIPTION VARCHAR2(255),
ATTRIBUTE_CATEGORY VARCHAR2(30),
ATTRIBUTE1 VARCHAR2(150),
ATTRIBUTE2 VARCHAR2(150),
ATTRIBUTE3 VARCHAR2(150),
ATTRIBUTE4 VARCHAR2(150),
ATTRIBUTE5 VARCHAR2(150),
ATTRIBUTE6 VARCHAR2(150),
ATTRIBUTE14 VARCHAR2(150),
ATTRIBUTE15 VARCHAR2(150),
RAW_COST_SOURCE VARCHAR2(2),
BURDENED_COST_SOURCE VARCHAR2(2),
QUANTITY_SOURCE VARCHAR2(2),
REVENUE_SOURCE VARCHAR2(2),
PM_PRODUCT_CODE VARCHAR2(30),
PM_BUDGET_LINE_REFERENCE VARCHAR2(25),
COST_REJECTION_CODE VARCHAR2(30),
REVENUE_REJECTION_CODE VARCHAR2(30),
BURDEN_REJECTION_CODE VARCHAR2(30),
OTHER_REJECTION_CODE VARCHAR2(30),
CODE_COMBINATION_ID NUMBER(15),
CCID_GEN_STATUS_CODE VARCHAR2(1),
CCID_GEN_REJ_MESSAGE VARCHAR2(2000),
REQUEST_ID NUMBER
)
TABLESPACE PAD
PCTUSED 40
PCTFREE 10
INITRANS 1
MAXTRANS 255
STORAGE (
INITIAL 40K
NEXT 40K
MINEXTENTS 1
MAXEXTENTS 10000000
PCTINCREASE 0
FREELISTS 1
FREELIST GROUPS 1
BUFFER_POOL DEFAULT
)
NOLOGGING
NOCACHE
NOPARALLEL;
Thanks a lot Tom.
Followup April 2, 2005 - 9am Central time zone:
the insert of 60million records took 20hours?
what kind of hardware/disk are we talking here. That is abnormally slow.
did you do a subset of it with trace to see what is taking so long?

April 2, 2005 - 7am Central time zone
Reviewer: A reader
"We want to purge a table which has 70 million rows about 10 GB. on critira to
partition the table. "
i meant there is no critira for implement partitions on the table.
sorry for the typo.

April 2, 2005 - 11am Central time zone
Reviewer: A reader
"
what kind of hardware/disk are we talking here. That is abnormally slow.
did you do a subset of it with trace to see what is taking so long?
"
it is Sun solaris 480 with 4 CPU. disk is raid 5. and there are some other applications running on
the machine also.
I had it with the parallel 2 and did not set the sqltrace on it.
here is from statspack:
Buffer Gets Executions Gets per Exec % Total Hash Value
--------------- ------------ -------------- ------- ------------
29,014,953 1 29,014,953.0 25120040
insert
/*+ append */ into pa.pa_budget_lines_tmp_s pbls select
* from pa.pa_budget_lines pbl where pbl.resource_assignment_id i
n (SELECT ra.resource_assignment_id FROM pa.pa_resourc
e_assignments ra, pa.pa_budget_versions bv
WHERE bv.current_flag <> 'Y' AND bv.budget_sta
Followup April 2, 2005 - 11am Central time zone:
that is Oracle applications, do you have support working with you if you are planning on doing
stuff to those tables with SQL?
is is the QUERY that takes 20 hours or the insert?
If you explain plan the insert and the select, do they use the same plan.
raid 5 = not fast stuff for direct path writing. but not 20 hours slow.

April 2, 2005 - 12pm Central time zone
Reviewer: A reader
"
that is Oracle applications, do you have support working with you if you are planning on doing
stuff to those tables with SQL?
"
Yes, it's oracle apps.
We want to present oracle support what we can do for the purge of PA tables, since the support has
not been able to provide the method. we test the purge on our Test environment.
"
is is the QUERY that takes 20 hours or the insert?
"
Yes.
"
If you explain plan the insert and the select, do they use the same plan.
"
apps@s0p2dbt1> explain plan for
2 insert /*+ append parallel(pbls,4)*/
3 into pa.pa_budget_lines_tmp_s pbls
4 select /*+ parallel(pbl, 4) */
5 * from pa.pa_budget_lines pbl
6 where pbl.resource_assignment_id in
7 (SELECT ra.resource_assignment_id
8 FROM pa.pa_resource_assignments ra,
9 pa.pa_budget_versions bv
10 WHERE bv.current_flag <> 'Y'
11 AND bv.budget_status_code = 'B'
12 AND bv.baselined_date IS NOT NULL
13 AND bv.budget_version_id = ra.budget_version_id)
14 /
Explained.
apps@s0p2dbt1> @plan
Sat Apr 02
QUERY_PLAN
------------------------------------------------------------
LOAD AS SELECT
NESTED LOOPS
VIEW VW_NSO_1
SORT UNIQUE
HASH JOIN
TABLE ACCESS FULL PA_BUDGET_VERSIONS
TABLE ACCESS FULL PA_RESOURCE_ASSIGNMENTS
TABLE ACCESS BY INDEX ROWID PA_BUDGET_LINES
INDEX RANGE SCAN PA_BUDGET_LINES_N2
9 rows selected.
10 rows deleted.
apps@s0p2dbt1> explain plan for
2 select /*+ parallel(pbl, 4) */
3 * from pa.pa_budget_lines pbl
4 where pbl.resource_assignment_id in
5 (SELECT ra.resource_assignment_id
6 FROM pa.pa_resource_assignments ra,
7 pa.pa_budget_versions bv
8 WHERE bv.current_flag <> 'Y'
9 AND bv.budget_status_code = 'B'
10 AND bv.baselined_date IS NOT NULL
11 AND bv.budget_version_id = ra.budget_version_id)
12 /
Explained.
apps@s0p2dbt1> @plan
Sat Apr 02
QUERY_PLAN
------------------------------------------------------------
NESTED LOOPS
VIEW VW_NSO_1
SORT UNIQUE
HASH JOIN
TABLE ACCESS FULL PA_BUDGET_VERSIONS
TABLE ACCESS FULL PA_RESOURCE_ASSIGNMENTS
TABLE ACCESS BY INDEX ROWID PA_BUDGET_LINES
INDEX RANGE SCAN PA_BUDGET_LINES_N2
8 rows selected.
9 rows deleted.
apps@s0p2dbt1>
Followup April 2, 2005 - 1pm Central time zone:
"
is is the QUERY that takes 20 hours or the insert?
"
Yes.
Hmmm
Q: Is it raining or sunny
A: Yes
Must be a sunshower. So, still don't know which it is.
use autotrace traceonly explain, you'll get the cost/card values which are vital.

April 2, 2005 - 1pm Central time zone
Reviewer: A reader
Q: Is it raining or sunny
A: Yes
sorry i did not read your question carefully.
Yes, I mean 20 hrs for inserts. the query run 30 minutes.
I found the parallelism does not work here:
insert /*+ append parallel(pbls,4)*/
into pa.pa_budget_lines_tmp_s pbls
select /*+ parallel(pbl, 4) */
* from pa.pa_budget_lines pbl
where pbl.resource_assignment_id in
(SELECT ra.resource_assignment_id
FROM pa.pa_resource_assignments ra,
pa.pa_budget_versions bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bv.budget_version_id = ra.budget_version_id)
/
And The parallelism does not work for
insert /*+ append parallel(pbls,4)*/
into pa.pa_budget_lines_tmp_s pbls
select
* from pa.pa_budget_lines pbl
where pbl.resource_assignment_id in
(SELECT ra.resource_assignment_id
FROM pa.pa_resource_assignments ra,
pa.pa_budget_versions bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bv.budget_version_id = ra.budget_version_id)
/
The parallelism does not work for
select /*+ parallel(pbl, 4) */
* from pa.pa_budget_lines pbl
where pbl.resource_assignment_id in
(SELECT ra.resource_assignment_id
FROM pa.pa_resource_assignments ra,
pa.pa_budget_versions bv
WHERE bv.current_flag <> 'Y'
AND bv.budget_status_code = 'B'
AND bv.baselined_date IS NOT NULL
AND bv.budget_version_id = ra.budget_version_id)
/
The parallelism only work for:
select /*+ parallel(pbl, 4) */
* from pa.pa_budget_lines pbl
init.ora
NAME TYPE VALUE
------------------------------------ ------- ------------------------------
fast_start_parallel_rollback string LOW
optimizer_percent_parallel integer 0
parallel_adaptive_multi_user boolean FALSE
parallel_automatic_tuning boolean FALSE
parallel_broadcast_enabled boolean FALSE
parallel_execution_message_size integer 2148
parallel_instance_group string
parallel_max_servers integer 25
Sat Apr 02
page 2
NAME TYPE VALUE
------------------------------------ ------- ------------------------------
parallel_min_percent integer 0
parallel_min_servers integer 0
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_threads_per_cpu integer 2
recovery_parallelism integer 0
SQL>
it's 8174 db.
Followup April 2, 2005 - 1pm Central time zone:
so, given that it would not normally take 19.5 hours to direct path write that data.... I'd suggest
sql tracing on a sizable subset of the data and see where the bottleneck is. Just do it serially,
easier to trace and understand.

April 2, 2005 - 2pm Central time zone
Reviewer: A reader
Tom,
i will trace it. before doing that i would like the parallelism does not for the inserts and the
query as i refered. it is only working for
select /*+ parallel(t,4) */ * from pa.pa_budget_lines t;
Followup April 2, 2005 - 2pm Central time zone:
not everything is parallelizable. and given the plans -- hash join used in a NL, I would guess
that parallelizing the access to the outer table (which is now being picked up by index) wouldn't
make sense.
maybe extent management ...
April 2, 2005 - 3pm Central time zone
Reviewer: Alberto Dell'Era from Milan, Italy
from the review "Purge huge table" by "A Reader":
CREATE TABLE PURGE_TMP_S ...
STORAGE (
INITIAL 40K
NEXT 40K
...
)
if the tablespace is dictionary managed, parallel or not, most of the time will be spent changing
the data dictionary and logging the changes ;)
Especially since the rows appear to be quite long, say 1k, so 1 extent for every 40 rows,
60,000,000 rows -> 1,500,000 extents looks a tad too much.
Followup April 2, 2005 - 3pm Central time zone:
laughing out loud, didn't even see that ;)
yes, that would be considered a tad small.

April 2, 2005 - 7pm Central time zone
Reviewer: A reader
from the review "Purge huge table" by "A Reader":
CREATE TABLE PURGE_TMP_S ...
STORAGE (
INITIAL 40K
NEXT 40K
...
)
"
if the tablespace is dictionary managed, parallel or not, most of the time will
be spent changing the data dictionary and logging the changes ;)
Especially since the rows appear to be quite long, say 1k, so 1 extent for every
40 rows, 60,000,000 rows -> 1,500,000 extents looks a tad too much.
"
Forgot to metioned that. It is local managed tablespace in Oracle Applications 11.5.7.
and the temp table was created as
create table temptable as select * from pa.table;
The PA table is the orginal apps table.
Tom, that was the reason i asked your comment on the temp table.
to "A Reader"
April 3, 2005 - 11am Central time zone
Reviewer: Alberto Dell'Era from Milan, Italy
>and the temp table was created as
>create table temptable as select * from pa.table;
(i guess you meant "create table temptable as select * from pa.table WHERE 1=0", right ?)
By "temptable", you mean pa.pa_budget_lines_tmp_s, correct ?
Btw, what's the output of
select TABLESPACE_NAME, EXTENT_MANAGEMENT, allocation_type, initial_extent, next_extent from
dba_tablespaces where TABLESPACE_NAME= (select tablespace_name from dba_tables where owner = 'PA'
and table_name = upper ('pa_budget_lines_tmp_s'));
select tablespace_name, initial_extent, next_extent, pct_increase from dba_tables where owner =
'PA' and table_name = upper ('pa_budget_lines_tmp_s');
And i would be curious to know what is the CREATE PURGE_TMP_S statement you gave above, if you
created the table as a create-table-as-select ...

April 3, 2005 - 7pm Central time zone
Reviewer: A reader
"
(i guess you meant "create table temptable as select * from pa.table WHERE 1=0",
right ?)
"
that's correct.
create table pa.pa_budget_lines_tmp_s as select * from pa.pa_budget_lines where 1=2
/
the temp table was created in the PA data tablespace PAD.
we are working on Apps 11i. we do not want to the original table definitions/storge, because the
apps may become no supportable by Oracle.
APPS@S0P2DBT1> select TABLESPACE_NAME, EXTENT_MANAGEMENT, allocation_type, initial_extent,
2 next_extent from dba_tablespaces where TABLESPACE_NAME= (select tablespace_name
3 from dba_tables where owner = 'PA' and table_name = upper
4 ('pa_budget_lines_tmp_s'));
TABLESPACE_NAME
------------------------------------------------------------------------------------------
EXTENT_MANAGEMENT ALLOCATION_TYPE INITIAL_EXTENT NEXT_EXTENT
------------------------------ --------------------------- -------------- -----------
PAD
LOCAL USER 40960 40960
APPS@S0P2DBT1> select tablespace_name, initial_extent, next_extent, pct_increase from
2 dba_tables where owner = 'PA' and table_name = upper ('pa_budget_lines_tmp_s');
TABLESPACE_NAME
------------------------------------------------------------------------------------------
INITIAL_EXTENT NEXT_EXTENT PCT_INCREASE
-------------- ----------- ------------
PAD
40960 40960 0
APPS@S0P2DBT1>
APPS@S0P2DBT1>
Followup April 3, 2005 - 7pm Central time zone:
40k is a tad small for a really big table, don't you think?
I'll really have to recommend you work with support on this task. They can advice you if the
extent sizes are relevant. You need to work within their advice for apps.

April 3, 2005 - 7pm Central time zone
Reviewer: A reader
"we do not want to the original table ..."
should be "we do not want to alter the original table ..."
sorry for the typo.
IN and Inner Join
April 4, 2005 - 2pm Central time zone
Reviewer: Praveen from Bangalore
Hi Tom,
I did the following experiment:
Oracle9iR2
CREATE TABLE ALL_OBJECTS1 AS SELECT * FROM sys.ALL_OBJECTS;
CREATE TABLE ALL_OBJECTS2 AS SELECT * FROM sys.ALL_OBJECTS;
CREATE INDEX idx_allobj1_objid ON ALL_OBJECTS1(object_id);
CREATE INDEX idx_allobj2_objid ON ALL_OBJECTS2(object_id);
ANALYZE TABLE ALL_OBJECTS1 COMPUTE STATISTICS;
ANALYZE TABLE ALL_OBJECTS2 COMPUTE STATISTICS;
SELECT COUNT(*) FROM ALL_OBJECTS;
COUNT(*)
--------
23533
Query 1
-------
SELECT * FROM ALL_OBJECTS1 a1
WHERE EXISTS (SELECT NULL FROM ALL_OBJECTS2 WHERE OBJECT_ID = a1.OBJECT_ID);
Explain Plan Output
-------------------
SELECT STATEMENT Optimizer Mode=CHOOSE
HASH JOIN
SORT UNIQUE
INDEX FAST FULL SCAN ORCL.IDX_ALLOBJ2_OBJID
TABLE ACCESS FULL ORCL.ALL_OBJECTS1
Query2
------
SELECT * FROM ALL_OBJECTS1 a1
WHERE EXISTS (SELECT NULL FROM ALL_OBJECTS2 WHERE OBJECT_ID = a1.OBJECT_ID);
Explain Plan Output
-------------------
SELECT STATEMENT Optimizer Mode=CHOOSE
HASH JOIN
SORT UNIQUE
INDEX FAST FULL SCAN ORCL.IDX_ALLOBJ2_OBJID
TABLE ACCESS FULL ORCL.ALL_OBJECTS1
Query3
------
SELECT * FROM ALL_OBJECTS1 a1, ALL_OBJECTS2 a2 WHERE a1.object_id = a2.object_id;
Explain Plan Output
-------------------
SELECT STATEMENT Optimizer Mode=CHOOSE
HASH JOIN
TABLE ACCESS FULL ORCL.ALL_OBJECTS1
TABLE ACCESS FULL ORCL.ALL_OBJECTS2
Que 1: The explain plan output of first two queries contradicts with what you have answered
to the question of this thread. ie: IN and EXISTS are processed very, very
differently. But the plan here shows same exact access paths and costs and performance.
What went wrong?
Que 2: Ref: Query3. Why did a direct inner join between the two tables gave and entirely
different plan and poor performance than IN (or EXISTS). You said that IN is
infact a join done by first table on a 'distincted' and 'hashed', inner query.
Query 3 is not even using the indexes available!
Que 3: How should I choose between inner join and IN?
Thanks
Praveen
Followup April 4, 2005 - 5pm Central time zone:
good point
things change, good thing I did a test case above that you can test with to observe that and point
it out to the world (I'm not being "snarky", I'm really overjoyed at the timing actually)....
Use the RBO and see what you see. way back when I wrote this, that was the "more popular" of the
two perhaps
today in 2005, what I said years ago using the RBO does not apply to the CBO. the cbo is smart
enough to recognize these two things are effectively the same.
test recoded
April 4, 2005 - 6pm Central time zone
Reviewer: Alberto Dell'Era from Milan, Italy
But query3 isn't semantically equivalent to query1/2... you should put a unique constraint on
ALL_OBJECTS2_uq (object_id) to make them "the same", and recode the statement as below (or not?)
It goes even better anyway - the sort can be avoided alltogether (and it's confirmed that query1
and query2 plans are the same) - i have coded query2 the way i suppose it should be (query1 text is
equal to query2 in the review):
CREATE TABLE ALL_OBJECTS1 AS SELECT * FROM sys.ALL_OBJECTS;
CREATE TABLE ALL_OBJECTS2 AS SELECT * FROM sys.ALL_OBJECTS;
CREATE INDEX idx_allobj1_objid ON ALL_OBJECTS1(object_id);
CREATE INDEX idx_allobj2_objid ON ALL_OBJECTS2(object_id);
exec dbms_stats.gather_table_stats (user, 'ALL_OBJECTS1', cascade=>true);
exec dbms_stats.gather_table_stats (user, 'ALL_OBJECTS2', cascade=>true);
Query 1:
SELECT /* check */ * FROM ALL_OBJECTS1 a1
WHERE EXISTS (SELECT NULL FROM ALL_OBJECTS2 WHERE OBJECT_ID = a1.OBJECT_ID);
Query2:
SELECT /* check */ * FROM ALL_OBJECTS1 a1
WHERE OBJECT_ID in (SELECT OBJECT_ID FROM ALL_OBJECTS2);
alter table ALL_OBJECTS2 add constraint ALL_OBJECTS2_uq unique (object_id);
Query3:
SELECT /* check unique */ a1.* FROM ALL_OBJECTS1 a1, ALL_OBJECTS2 a2
WHERE a1.object_id = a2.object_id;
======================================================================
SELECT /* check */ * FROM ALL_OBJECTS1 a1 WHERE EXISTS (SELECT NULL
FROM ALL_OBJECTS2 WHERE OBJECT_ID = a1.OBJECT_ID)
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 94 |
|* 1 | HASH JOIN SEMI | | 35281 | 3376K| 94 |
| 2 | TABLE ACCESS FULL | ALL_OBJECTS1 | 35281 | 3204K| 50 |
| 3 | INDEX FAST FULL SCAN| IDX_ALLOBJ2_OBJID | 35282 | 172K| 9 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("OBJECT_ID"="OBJECT_ID")
Note: cpu costing is off
======================================================================
SELECT /* check */ * FROM ALL_OBJECTS1 a1 WHERE OBJECT_ID in (SELECT OBJECT_ID
FROM ALL_OBJECTS2)
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 94 |
|* 1 | HASH JOIN SEMI | | 35281 | 3376K| 94 |
| 2 | TABLE ACCESS FULL | ALL_OBJECTS1 | 35281 | 3204K| 50 |
| 3 | INDEX FAST FULL SCAN| IDX_ALLOBJ2_OBJID | 35282 | 172K| 9 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("OBJECT_ID"="OBJECT_ID")
Note: cpu costing is off
======================================================================
SELECT /* check unique */ a1.* FROM ALL_OBJECTS1 a1, ALL_OBJECTS2 a2 WHERE a1.object_id =
a2.object_id
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 50 |
| 1 | NESTED LOOPS | | 35281 | 3376K| 50 |
| 2 | TABLE ACCESS FULL | ALL_OBJECTS1 | 35281 | 3204K| 50 |
|* 3 | INDEX RANGE SCAN | IDX_ALLOBJ2_OBJID | 1 | 5 | |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("OBJECT_ID"="OBJECT_ID")
Note: cpu costing is off
Note: plans fetched from v$sql_plan, 9.2.0.6.
In and Out Parameters and IN/Exists operator
April 5, 2005 - 3am Central time zone
Reviewer: Keshav Madhusoodan from India
Excellent stuff, I got here.
Inner Join and IN
April 5, 2005 - 9am Central time zone
Reviewer: Praveen from Bangalore
Well Tom, by that you mean, with newer versions of CBO playing around, there won't by any
difference at all between IN and EXISTS - whatever be the nature of data sources are - means
whether an index exists or not, unique or not, relative sizes of each tables etc, did you? That's
what my execution plan shows me - with a small difference. With unique key defined, 'NESTED LOOP
SEMI' - for EXISTS, 'NESTED LOOP' - for IN; But otherwise the costs are same.
I had one more doubt that whether IN is always going to perform better that than a direct inner
join? Testing I've done is telling this. Or is there any situation where inner join is better than
IN?
Thanks, Alberto, for pointing out something important (and also correcting the error,...:).
Unfortunately, the execution plan I get here, after putting an unique constraint on object_id, is
very much different from the one that you have given. 'NESTED LOOP SEMI' - for EXISTS, 'NESTED
LOOP' - for IN and surprisingly, HASH JOIN - for inner join! Well, I gathered statistics using
ANALYZE.
Thanks
Praveen
Followup April 5, 2005 - 12pm Central time zone:
in and exists are equivalent, so the advice of the past (with rbo) does not apply.
use either or.
Question regarding "IN & EXISTS"
May 16, 2005 - 4pm Central time zone
Reviewer: Leo James from Glen Burnie, MD, USA
Very well explained TOM.
in, inner join and unnesting
August 10, 2005 - 9pm Central time zone
Reviewer: James from Australia
Dear Tom,
You said (right at the start of this thread) concerning the unnesting/query rewriting of in
subqueries:
>is typically processed as:
>
>select *
> from t1, ( select distinct y from t2 ) t2
> where t1.x = t2.y;
>
>The subquery is evaluated, distinct'ed, indexed (or >hashed or sorted) and then
>joined to the original table -- typically.
I'm worried about the typically :-)
I have 3 queries that return the same result:
Query 1 - in query
select /*+ NORMAL */ count(*) from carsales_view
where carsales_view.dealer_id in (select dealer_id from dealer_regions where region_id=325)
Query 2 - inner join
select /*+ INNER */ count(*) from carsales_view,dealer_regions
where carsales_view.dealer_Id=dealer_regions.dealer_id and region_id = 325
Query 3 - "Typical" unnested in subquery
select /*+ UNNESTED */
count(*) from carsales_view t1,( select distinct dealer_id from dealer_regions where region_id=325)
t2
where t1.dealer_Id=t2.dealer_id
The dealer_regions table has had stats created and enough buckets specified for the CBO to choose
between two indexes.
The problem is that the 1st query - the in subquery - is not being executed as well as it should,
the wrong index is being selected. What is even more curious is that the other 2 queries perform as
expected - even q3 which is the "typical" unnested version of the 1st query.
I could just use an index hint but I would rather the CBO got it right as the selectivity of
"region_id" varies (and sometimes the index I specified in the hint wont be the best one.)
Any advice to "encourage" the CBO to see things my way?
BTW I am using 9.1.5 so that maybe where my problem lies.
--trace output--
select /*+ NORMAL */ count(*) from carsales_view
where carsales_view.dealer_id in (select dealer_id from dealer_regions where region_id=325)
call count cpu elapsed disk query current rows
total 4 0.58 0.61 55 94638 0 1
Rows Row Source Operation
1 SORT AGGREGATE (cr=94620 r=55 w=0 time=595018 us)
39 NESTED LOOPS SEMI (cr=94620 r=55 w=0 time=594979 us)
45815 NESTED LOOPS (cr=48803 r=55 w=0 time=349295 us)
48690 INDEX FAST FULL SCAN CSVMV_DEID_IND (cr=111 r=55 w=0 time=65768 us)(object id 270087)
45815 INDEX UNIQUE SCAN CSDM_PK (cr=48692 r=0 w=0 time=199251 us)(object id 270093)
39 INDEX UNIQUE SCAN DERE_PK (cr=45817 r=0 w=0 time=167174 us)(object id 281197)
Rows Execution Plan
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
39 NESTED LOOPS (SEMI)
45815 NESTED LOOPS
48690 INDEX GOAL: ANALYZED (FAST FULL SCAN) OF
'CSVMV_DEID_IND' (NON-UNIQUE)
45815 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'CSDM_PK' (UNIQUE)
39 INDEX (UNIQUE SCAN) OF 'DERE_PK' (UNIQUE)
********************************************************************************
select /*+ INNERJOIN */ count(*) from carsales_view,dealer_regions
where carsales_view.dealer_Id=dealer_regions.dealer_id and region_id = 325
call count cpu elapsed disk query current rows
total 4 0.00 0.00 3 56 0 1
Rows Row Source Operation
1 SORT AGGREGATE (cr=54 r=3 w=0 time=965 us)
39 NESTED LOOPS (cr=54 r=3 w=0 time=941 us)
39 NESTED LOOPS (cr=13 r=3 w=0 time=690 us)
5 TABLE ACCESS BY INDEX ROWID DEALER_REGIONS (cr=6 r=3 w=0 time=514 us)
5 INDEX RANGE SCAN DERE_REID_IND (cr=2 r=1 w=0 time=229 us)(object id 281196)
39 INDEX RANGE SCAN CSVMV_DEID_IND (cr=7 r=0 w=0 time=143 us)(object id 270087)
39 INDEX UNIQUE SCAN CSDM_PK (cr=41 r=0 w=0 time=180 us)(object id 270093)
Rows Execution Plan
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
39 NESTED LOOPS
39 NESTED LOOPS
5 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'DEALER_REGIONS'
5 INDEX (RANGE SCAN) OF 'DERE_REID_IND' (NON-UNIQUE)
39 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'CSVMV_DEID_IND'
(NON-UNIQUE)
39 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'CSDM_PK' (UNIQUE)
********************************************************************************
select /*+ UNNESTED */
count(*) from carsales_view t1,( select distinct dealer_id from dealer_regions where region_id=325)
t2
where t1.dealer_Id=t2.dealer_id
call count cpu elapsed disk query current rows
total 4 0.00 0.00 0 56 0 1
Rows Row Source Operation
1 SORT AGGREGATE (cr=54 r=0 w=0 time=596 us)
39 NESTED LOOPS (cr=54 r=0 w=0 time=577 us)
39 NESTED LOOPS (cr=13 r=0 w=0 time=337 us)
5 VIEW (cr=6 r=0 w=0 time=196 us)
5 SORT UNIQUE (cr=6 r=0 w=0 time=189 us)
5 TABLE ACCESS BY INDEX ROWID OBJ#(281059) (cr=6 r=0 w=0 time=99 us)
5 INDEX RANGE SCAN OBJ#(281196) (cr=2 r=0 w=0 time=48 us)(object id 281196)
39 INDEX RANGE SCAN OBJ#(270087) (cr=7 r=0 w=0 time=111 us)(object id 270087)
39 INDEX UNIQUE SCAN OBJ#(270093) (cr=41 r=0 w=0 time=161 us)(object id 270093)
Rows Execution Plan
0 SELECT STATEMENT GOAL: CHOOSE
1 SORT (AGGREGATE)
39 NESTED LOOPS
39 NESTED LOOPS
5 VIEW
5 SORT (UNIQUE)
5 TABLE ACCESS GOAL: ANALYZED (BY INDEX ROWID) OF
'DEALER_REGIONS'
5 INDEX (RANGE SCAN) OF 'DERE_REID_IND' (NON-UNIQUE)
39 INDEX GOAL: ANALYZED (RANGE SCAN) OF 'CSVMV_DEID_IND'
(NON-UNIQUE)
39 INDEX GOAL: ANALYZED (UNIQUE SCAN) OF 'CSDM_PK' (UNIQUE)
Followup August 11, 2005 - 9am Central time zone:
do you really do this without bind variables? in the application?
what is 9.1.5
A soloution to my last review
August 11, 2005 - 1am Central time zone
Reviewer: James from Australia
Sorry but I have found a work around.
The 1st query can be written as:
select
count(*) from carsales_view
where dealer_id in (
select dealer_id from dealer_regions where region_id=325 and rownum>0)
/
Which achieves the same performance of a inner join for selective and non-selective values of
region_id (although with a different plan):
region_id=325 - low selectivity:
56 consistant gets compared with 94,000 (qry1)
This compares to the manually unested query (q3) of 56 consistant gets.
region_id=81 - high selectivity:
15,572 consistant gets compared to 94,000 (using qry1)
This compares to the manually unested query (q3) of 15572 consistant gets.
So I have managed to get the CBO to execute the query based on the selectivity of the region_id
field in the dealer_regions table.
I beleive this is probably a bug in the 9.1.5 CBO?
region_id=325 - high selectivity with rownum>0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=19 Card=1 Bytes=22)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=19 Card=241 Bytes=5302)
3 2 NESTED LOOPS (Cost=19 Card=241 Bytes=4338)
4 3 VIEW OF 'VW_NSO_1' (Cost=2 Card=5 Bytes=65)
5 4 SORT (UNIQUE)
6 5 COUNT
7 6 FILTER
8 7 TABLE ACCESS (BY INDEX ROWID) OF 'DEALER_REG
IONS' (Cost=2 Card=5 Bytes=40)
9 8 INDEX (RANGE SCAN) OF 'DERE_REID_IND' (NON
-UNIQUE) (Cost=1 Card=5)
10 3 INDEX (RANGE SCAN) OF 'CSVMV_DEID_IND' (NON-UNIQUE)
(Cost=1 Card=48 Bytes=240)
11 2 INDEX (UNIQUE SCAN) OF 'CSDM_PK' (UNIQUE)
Statistics
----------------------------------------------------------
7 recursive calls
0 db block gets
56 consistent gets
0 physical reads
0 redo size
214 bytes sent via SQL*Net to client
275 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
region_id=81 - high selectivity with rownum>0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=22)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=22 Card=33150 Bytes=729300)
3 2 HASH JOIN (SEMI) (Cost=22 Card=33150 Bytes=596700)
4 3 INDEX (FAST FULL SCAN) OF 'CSVMV_DEID_IND' (NON-UNIQ
UE) (Cost=12 Card=48691 Bytes=243455)
5 3 VIEW OF 'VW_NSO_1' (Cost=3 Card=689 Bytes=8957)
6 5 COUNT
7 6 FILTER
8 7 TABLE ACCESS (FULL) OF 'DEALER_REGIONS' (Cost=
3 Card=689 Bytes=5512)
9 2 INDEX (UNIQUE SCAN) OF 'CSDM_PK' (UNIQUE)
Statistics
----------------------------------------------------------
7 recursive calls
0 db block gets
15572 consistent gets
0 physical reads
0 redo size
216 bytes sent via SQL*Net to client
275 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
reference for last answer
August 11, 2005 - 1am Central time zone
Reviewer: James from Australia
Found on google groups, keywords: oracle unnest subquery
http://groups.google.com/group/comp.databases.oracle.server/browse_frm/thread/939ef882371d3ce8/5066c
973efa68919?tvc=1&q=oracle+unnest+subquery#5066c973efa68919
regards
IN, EXISTS and NULL
November 21, 2005 - 4am Central time zone
Reviewer: Sanjaya Balasuriya from Colombo, Sri Lanka
Hi Tom,
How IN and EXISTS are effected by NULL values ?
Is it possible that IN and EXISTS return different sets of data depending on NULL values ?
Followup November 21, 2005 - 8am Central time zone:
in and exists are comparable in the presence of NULLS
not in and not exists behave differently.
Oracle10g Doc has a different opinion.
December 20, 2005 - 5am Central time zone
Reviewer: Praveen from Dubai
Hi Tom,
Oracle10g documentation says IN and EXISTS executes and performs differently depending on the
situation.
"In general, if the selective predicate is in the subquery, then use IN. If the
selective predicate is in the parent query, then use EXISTS."
http://download-west.oracle.com/docs/cd/B14117_01/server.101/b10752/sql_1016.htm#28145
This is contradicting with your opinion that-
"today in 2005, what I said years ago using the RBO does not apply to the CBO.
the cbo is smart enough to recognize these two things are effectively the same." above.
Please explain.
Regards
Followup December 20, 2005 - 10am Central time zone:
ops$tkyte@ORA10GR2> create table employees as select * from hr.employees where 1=0;
Table created.
ops$tkyte@ORA10GR2> create table orders as select * from oe.orders where 1=0;
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into employees
2 select object_id,
3 substr(object_name,1,20),
4 substr(object_name,1,25),
5 substr(object_type,1,25),
6 substr(last_ddl_time,1,20),
7 created,
8 substr(object_name,1,10),
9 object_id,
10 0.10,
11 object_id+1,
12 mod(rownum,1000)
13 from all_objects
14 where rownum <= 27000;
27000 rows created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into orders
2 select rownum,
3 created,
4 substr(object_type,1,8),
5 mod(rownum,150),
6 1,
7 object_id,
8 object_id,
9 1
10 from all_objects
11 where rownum <= 10000;
10000 rows created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> alter table employees add constraint emp_pk primary key(employee_id);
Table altered.
ops$tkyte@ORA10GR2> create index orders_customer_id on orders(customer_id);
Index created.
ops$tkyte@ORA10GR2> create index employees_department_id on employees(department_id);
Index created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'EMPLOYEES', method_opt=>'for all
indexed columns' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> exec dbms_stats.gather_table_stats( user, 'ORDERS', method_opt=>'for all
indexed columns' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA10GR2> set autotrace traceonly explain
ops$tkyte@ORA10GR2> SELECT /* EXISTS example */
2 e.employee_id, e.first_name, e.last_name, e.salary
3 FROM employees e
4 WHERE EXISTS (SELECT 1 FROM orders o /* Note 1 */
5 WHERE e.employee_id = o.sales_rep_id /* Note 2 */
6 AND o.customer_id = 144);
Execution Plan
----------------------------------------------------------
Plan hash value: 551415261
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 67 | 4087 | 49 (3)|
| 1 | NESTED LOOPS | | 67 | 4087 | 49 (3)|
| 2 | SORT UNIQUE | | 67 | 1139 | 14 (0)|
|* 3 | TABLE ACCESS FULL | ORDERS | 67 | 1139 | 14 (0)|
| 4 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 44 | 1 (0)|
|* 5 | INDEX UNIQUE SCAN | EMP_PK | 1 | | 0 (0)|
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("O"."CUSTOMER_ID"=144)
5 - access("E"."EMPLOYEE_ID"="O"."SALES_REP_ID")
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> SELECT /* IN example */
2 e.employee_id, e.first_name, e.last_name, e.salary
3 FROM employees e
4 WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */
5 FROM orders o
6 WHERE o.customer_id = 144);
Execution Plan
----------------------------------------------------------
Plan hash value: 551415261
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 67 | 4087 | 49 (3)|
| 1 | NESTED LOOPS | | 67 | 4087 | 49 (3)|
| 2 | SORT UNIQUE | | 67 | 1139 | 14 (0)|
|* 3 | TABLE ACCESS FULL | ORDERS | 67 | 1139 | 14 (0)|
| 4 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 44 | 1 (0)|
|* 5 | INDEX UNIQUE SCAN | EMP_PK | 1 | | 0 (0)|
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("O"."CUSTOMER_ID"=144)
5 - access("E"."EMPLOYEE_ID"="O"."SALES_REP_ID")
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> set autotrace off
They give insufficient details in their example to reproduce. when you tested this out - what did
you yourself see?
what's the maximum number of values in the IN expression
January 3, 2006 - 8pm Central time zone
Reviewer: Jianhui from CA
Hi Tom,
IN (value1, value2, ... valueN)
What's the maximum limit of N here allowed by Oracle? Do you have URL to the document for this
limit?
Thanks
Question about not exists
January 19, 2006 - 8am Central time zone
Reviewer: Oleg
Hi,Tom,
I have this query:
SELECT b.branch_no, b.salesman_no,
ROUND
( SUM (CASE
WHEN b.returning_no !=2
THEN order_qty
END)
/ 22
) avg_rtrn_orders_qty,
ROUND ( SUM (CASE
WHEN b.bsnss_no != 6
THEN order_qty
END)
/ 22
) avg_no_gas_orders_qty,
ROUND ( SUM (CASE
WHEN b.bsnss_no = 6
THEN order_qty
END)
/ 22
) avg_gas_orders_qty
FROM km_orders a, km_customers b, km_nwpr_editions c,km_closed_customers d
WHERE a.cstmr_no = b.cstmr_no
and b.cstmr_no=d.cstmr_no(+)
and c.nwpr_date BETWEEN d.date_from(+) AND d.date_to(+)
AND b.bsnsslike_sw = 'ë'
AND a.prod_no = c.prod_no
AND c.nwpr_date BETWEEN a.date_from AND a.date_to
AND a.prod_no = 1
AND b.branch_no != 9
AND c.nwpr_date BETWEEN TO_DATE ('01/11/2005', 'DD/MM/YYYY')
AND TO_DATE ('30/11/2005', 'DD/MM/YYYY')
AND a.day_in_week = c.nwpr_day
AND NOT EXISTS (
SELECT 1
FROM km_closed_customers
WHERE cstmr_no = a.cstmr_no
AND c.nwpr_date BETWEEN date_from AND date_to)
GROUP BY b.branch_no, b.salesman_no
I see that without
" AND NOT EXISTS (
SELECT 1
FROM km_closed_customers
WHERE cstmr_no = a.cstmr_no
AND c.nwpr_date BETWEEN date_from AND date_to)"
the query is running faster,but i need to check this table,
what can i do instead this not exists statement?
Thank you,
Oleg,
best practice?
February 17, 2006 - 1pm Central time zone
Reviewer: A reader
Tom,
is it a good practice to have 2 exists in the form?
SELECT *
FROM TABLE a
WHERE NOT EXISTS (SELECT 1
FROM TABLE B i
WHERE i.CLASS = a.CLASS_NAME
AND NVL(i.CLASS_NAME,0) = NVL(a.CLASS_DEPT,0)
)
AND NOT EXISTS (SELECT 1
FROM work_orders w
WHERE w.class_id = a.s_name
AND NVL(w.CLASS_ID,0) = nvl(a.CLASS_DEPT,0)
)
Followup February 17, 2006 - 4pm Central time zone:
it seems to be less a matter of "good practice" and more a matter of getting the right answer
though isn't it?
It is a bad practice to do that nvl'ing bit in general. sure you really need to do that.
can you clarify with example please
February 22, 2006 - 2pm Central time zone
Reviewer: A reader
Followup February 22, 2006 - 2pm Central time zone:
an example of what exactly
of how can I avoid
February 22, 2006 - 3pm Central time zone
Reviewer: A reader
Tom,
How can I avoid "the nvl'ing bit" in this case.
Followup February 22, 2006 - 3pm Central time zone:
fixing the model? apparently you didn't mean to use nulls? Not if you basically join on them like
that.
Else, we'd have to function base index them in this case.
Are you sure nulls are an issue here. Do you really need to join by them or is that "just in case"
THANKS!!!!!
February 22, 2006 - 4pm Central time zone
Reviewer: A reader
updating using EXISTS
March 9, 2006 - 3pm Central time zone
Reviewer: anirudh from usa
hi tom
in our project we often have a situaton where some of the fields in a table (master_table in the
example below) needs to be updated with data in another table (UPDATES_FEED in the example below)
MASTER_TABLE
key_fld
fld_1
fld_2
fld_3
fld_4
fld_5
fld_6
UPDATES_FEED
key_fld
fld_1
fld_2
i typically use SQL such as :
update master_table a
set (fld_1,fld_2) = (select fld_1,fld_2
from updates_feed b
where a.key_fld = b.key_fld)
where exists (select 1
from updates_feed c
where a.key_fld = c.key_fld)
I have to add the WHERE condition otherwise if no match is found in updates_feed table it would
update the fields to NULL)
even though it works - it looks some what less than elegant - and has performance issues as well -
as i think the lookup on the updates_feed is happening two times
Is there a better way of achieving the result ?
Followup March 9, 2006 - 4pm Central time zone:
update a join
update (select m.fld_1, m.fld_2, u.fld_1 n1, u.fld_2 n2
from master_table m, updates_feed u
where m.key_fld = u.key_fld )
set fld_1 = n1, fld_2 = n2;
wonderful
March 10, 2006 - 4am Central time zone
Reviewer: A reader
Morning Tom,
i have always wondered the best way of doing that kind of update. in your example is it therefore
possible to update either or indeed both of the tables used in the select statement?
Regards
Dave
Followup March 10, 2006 - 12pm Central time zone:
only one, you are updating a table from another table
Strange behaviour
March 30, 2006 - 6am Central time zone
Reviewer: Abhijit Mallick from India
Hi Tom,
Today I have found a strange behaviuor of SQL. The problem is as follows:
Please see the details:
SQL> create table tab2
2 (col21 number);
Table created.
SQL> insert into tab2 select rownum from user_objects;
2429 rows created.
SQL> create table tab1
2 (col1 varchar2(1000),
3 col2 number);
Table created.
SQL> insert into tab1 select table_name,rownum from user_tables;
293 rows created.
SQL> commit;
Commit complete.
SQL> ed
Wrote file afiedt.buf
1 select count(*) from tab1
2 where col1 is not null
3* and col2 not in (select col2 from tab2)
SQL> /
COUNT(*)
----------
0
---LOOK AT THE INNER QUERY
THE TABLE TAB2 DOES NOT HAVE THE COL2 STILL THE QUERY EXECUTES WITHOUT ANY ERROR. SEE THE BELOW
PART , IT GIVES THE ERROR.
SQL> select col2 from tab2;
select col2 from tab2
*
ERROR at line 1:
ORA-00904: "COL2": invalid identifier
SQL>
Can you please discuss the behaviour of the above sub-query.
Thanks
--Abhijit Mallick
Bangalore
Followup March 31, 2006 - 11am Central time zone:
it is called a correlated subquery.
select count(*)
from tab1
where col1 is not null
and col2 not in (select TAB1.COL2 from tab2)
that is the same query as the one you executed.
Just like you could:
and not exists ( select * from tab2 where tab2.col21 = tab1.col2 )
would work - as would
and not exists ( select * from tab2 where col21 = col2 )
we would just figure out using standard scoping rules whence each reference column was from.
IN & EXISTS
April 19, 2006 - 8am Central time zone
Reviewer: jash from India
How to tune the below query to get result very quickly
select count(*) from polineitemtab;
COUNT(*)
----------
1581181
select count(*) from requisitiontab;
COUNT(*)
----------
284956
select rootid from NEW_TEMP_polineitemtab
where poli_requisition not in
(select rootid from NEW_TEMP_requisitiontab)
Followup April 19, 2006 - 9am Central time zone:
make sure you are using the CBO
make sure rootid is NOT NULL in NEW_TEMP_requisitiontab OR add "where rootid is not null" to the
subquery.
or
select /*+ ALL_ROWS */ a.rootid
from NEW_TEMP_polineitemtab a, NEW_TEMP_requisitiontab b
where a.rootid = b.rootid(+)
and b.rootid is null;
(a do it yourself hash anti-join)
IN & EXISTS
April 19, 2006 - 9am Central time zone
Reviewer: jash from India
What is b.rootid(+) significance in this query
Followup April 19, 2006 - 9am Central time zone:
outer join - you wanted to find rows in A, not in B - so we outer join A to B and only keep those
where there was no match (b.rootid is null)
IN & EXISTS
April 19, 2006 - 9am Central time zone
Reviewer: jash from India
There is no null values in b.rootid(column) NEW_TEMP_requisitiontab(table) what is the reason using
is null
Followup April 19, 2006 - 10am Central time zone:
when you outer join T1 to T2, SQL will "make up" a NULL row in T2 to join to when there isn't a row
to be joined to.
So, the following two queries are semantically equivalent (assuming t2.x is NOT NULL)
ops$tkyte@ORA9IR2> create table t1 ( x int );
Table created.
ops$tkyte@ORA9IR2> create table t2 ( x int NOT NULL );
Table created.
ops$tkyte@ORA9IR2> insert into t1 values ( 1 );
1 row created.
ops$tkyte@ORA9IR2> insert into t1 values ( 2 );
1 row created.
ops$tkyte@ORA9IR2> insert into t2 values ( 1 );
1 row created.
ops$tkyte@ORA9IR2> select * from t1 where x not in (select x from t2);
X
----------
2
ops$tkyte@ORA9IR2> select t1.*
2 from t1, t2
3 where t1.x = t2.x(+)
4 and t2.x is null;
X
----------
2
EXISTS vs "where rownum <=1"
May 4, 2006 - 7pm Central time zone
Reviewer: K. Cruz from WashingtonDC
Thanks Tom for your valuable support.
I'm reviewing potentially bad queries and ran across a query that looks for the existence of a
value in a table. This query looked suspicious to me and wanted to get your feedback.
SELECT DECODE(COUNT(NF.SUFFIX), 0, 'N', 'Y')
FROM NF_FILES NF,
NF_RPC_LOCATION NRL
WHERE NF.SUFFIX = :B3
AND NF.OFFICE_ID = NRL.OFFICE_ID
AND NF.SECTION_CODE = NRL.SECTION_CODE
AND NF.RESP_CODE = NRL.RESP_CODE
AND NF.OFFICE_ID = :B2
AND NF.LAST_TRANSACTION_DATE >= to_date(:B1, 'MMDDYYYY')
AND ROWNUM <= 1
After rewriting the query so that it uses an EXIST clause, the explain plan for both seemed to look
very similar.
17:48:53 nfuser@NFDEV> l
1 SELECT DECODE(COUNT(NF.SUFFIX), 0, 'N', 'Y')
2 FROM NF_FILES NF,
3 NF_RPC_LOCATION NRL
4 WHERE NF.SUFFIX = :B3
5 AND NF.OFFICE_ID = NRL.OFFICE_ID
6 AND NF.SECTION_CODE = NRL.SECTION_CODE
7 AND NF.RESP_CODE = NRL.RESP_CODE
8 AND NF.OFFICE_ID = :B2
9 AND NF.LAST_TRANSACTION_DATE >= to_date(:B1, 'MMDDYYYY')
10* AND ROWNUM <= 1
17:48:54 nfuser@NFDEV> /
D
-
N
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=43)
1 0 SORT (AGGREGATE)
2 1 COUNT (STOPKEY)
3 2 NESTED LOOPS (Cost=7 Card=1 Bytes=43)
4 3 PARTITION RANGE (SINGLE)
5 4 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'NF_FILES' (Cost=5 Card=1 Bytes=31)
6 5 INDEX (RANGE SCAN) OF 'NF_OFFID_P1_IDX' (NON-UNIQUE) (Cost=4 Card=1)
7 3 INDEX (RANGE SCAN) OF 'RPC_LOCATION_RESP_CNTL_FK_I'(NON-UNIQUE) (Cost=2 Card=1
Bytes=12)
17:50:15 nfuser@NFDEV> l
1 SELECT 'Y'
2 FROM DUAL
3 WHERE EXISTS ( SELECT 1
4 FROM NF_FILES NF,
5 NF_RPC_LOCATION NRL
6 WHERE NF.SUFFIX = :B3
7 AND NF.OFFICE_ID = NRL.OFFICE_ID
8 AND NF.SECTION_CODE = NRL.SECTION_CODE
9 AND NF.RESP_CODE = NRL.RESP_CODE
10 AND NF.OFFICE_ID = :B2
11* AND NF.LAST_TRANSACTION_DATE >= to_date(:B1, 'MMDDYYYY') )
17:50:16 nfuser@NFDEV> /
no rows selected
Elapsed: 00:00:00.06
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=8168)
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 1 NESTED LOOPS (Cost=7 Card=1 Bytes=43)
4 3 PARTITION RANGE (SINGLE)
5 4 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'NF_FILES'(Cost=5 Card=1 Bytes=31)
6 5 INDEX (RANGE SCAN) OF 'NF_OFFID_P1_IDX' (NON-UNIQUE) (Cost=4 Card=1)
7 3 INDEX (RANGE SCAN) OF 'RPC_LOCATION_RESP_CNTL_FK_I' (NON-UNIQUE) (Cost=2 Card=1
Bytes=12)
NF_FILES is a 300 million row table so based on your previous advice, my outer query was the DUAL
table and my inner query was the large NF_FILES table. I'm not able to determine which is the most
efficient and the original query has been flagged and resource intensive. Could you provide some
feedback on which approach makes more sense?
Thank you!!
Followup May 5, 2006 - 1am Central time zone:
the original doesn't have to read dual, yours does the work of the original PLUS the reading of
dual.
hopefully the index nf_offid_p1_idx has suffix and last_transaction_date in it as well
(concatenated index)
WHERE NF.SUFFIX = :B3
7 AND NF.OFFICE_ID = NRL.OFFICE_ID
8 AND NF.SECTION_CODE = NRL.SECTION_CODE
9 AND NF.RESP_CODE = NRL.RESP_CODE
10 AND NF.OFFICE_ID = :B2
11* AND NF.LAST_TRANSACTION_DATE >= to_date(:B1, 'MMDDYYYY')
I'd want an index on (office_id,suffix,last_transaction_date) probably.

May 10, 2006 - 9am Central time zone
Reviewer: Reader
I have a table called person. I need to update the last name and birth date of person_no 2 with
last name and birthdate of person_no 1. "id" is the primary key of the person table.
Can you please let me know how to do this update?
create sequence id_seq start with 1;
create table person
(id number primary key
, person_no number
,first_nm varchar2(30)
,last_nm varchar2(30)
,birth_dt varchar2(14)
,CONSTRAINT pk_id PRIMARY KEY(id) )
insert into person
values
(id_seq.nextval,1,'ABC','AAA','19991120');
insert into person
values
(id_seq.nextval,2,'DEF','','00000000');
insert into person
values
(id_seq.nextval,1,'GHI','KLM','19921021');
insert into person
values
(id_seq.nextval,2,'NPQ','','00000000');
insert into person
values
(id_seq.nextval,1,'RST','UVW','19721215');
insert into person
values
(id_seq.nextval,2,'XYZ','','00000000');
insert into person
values
(id_Seq.nextval,1,'TTT','GGG','19750324');
select * from person;
ID PERSON_NO FIRST_NM LAST_NM BIRTH_DT
--- -------- -------- ------- --------
1 1 ABC AAA 19991120
2 2 DEF 00000000
3 1 GHI KLM 19921021
4 2 NPQ 00000000
5 1 RST UVW 19721215
6 2 XYZ 00000000
7 1 TTT GGG 19750324
Thanks
Followup May 10, 2006 - 10am Central time zone:
you seem to have person_no = 1 and 2 in there more than once.
It is not at all clear how you would pick which person_no=1 to use to update all of the person_no =
2 records.

May 10, 2006 - 11am Central time zone
Reviewer: Reader
Tom,
I am sorry for not being clear on the above question. Please find below the details of the tables.
I need to update the last name and birth date of
person_no 2 with last name and birthdate of person_no 1.
of the person table.
Can you please let me know how to do this update?
create sequence id_pseq start with 200;
create table person
(id_person number primary key
, person_no number
,first_nm varchar2(30)
,last_nm varchar2(30)
,birth_dt varchar2(14)
)
insert into person
values
(id_pseq.nextval,1,'ABC','AAA','19991120');
insert into person
values
(id_pseq.nextval,2,'DEF','','00000000');
insert into person
values
(id_pseq.nextval,1,'GHI','KLM','19921021');
insert into person
values
(id_pseq.nextval,2,'NPQ','','00000000');
insert into person
values
(id_pseq.nextval,1,'RST','UVW','19721215');
insert into person
values
(id_pseq.nextval,2,'XYZ','','00000000');
insert into person
values
(id_pseq.nextval,1,'TTT','GGG','19750324');
create sequence id_pbseq start with 100;
create table person_i_bnft
(id_ibnft number primary key
,id_person number references person(id_person)
,person_no number
,id_bnft number references person_bnft(id_bnft)
,age number
);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,200,1,3000,20);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,201,2,3000,25);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,202,1,3001,30);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,203,2,3001,35);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,204,1,3002,40);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,205,2,3002,45);
insert into person_i_bnft
values
(id_pbseq.NEXTVAL,206,1,3003,50);
create sequence id_bseq start with 3000;
create table person_bnft
(id_bnft number primary key
,amt varchar2(6)
);
insert into person_bnft
values
(id_bseq.NEXTVAL,'$1000');
insert into person_bnft
values
(id_bseq.NEXTVAL,'$2000');
insert into person_bnft
values
(id_bseq.NEXTVAL,'$3000');
insert into person_bnft
values
(id_bseq.NEXTVAL,'$4000');
Data from tables:
PErson table:
ID_PERSON PERSON_NO FIRST_NM LAST_NM BIRTH_DT
200 1 ABC AAA 19991120
201 2 DEF 00000000
202 1 GHI KLM 19921021
203 2 NPQ 00000000
204 1 RST UVW 19721215
205 2 XYZ 00000000
206 1 TTT GGG 19750324
PErson_i_bnft
ID_IBNFT ID_PERSON PERSON_NO ID_BNFT AGE
100 200 1 3000 20
101 201 2 3000 25
102 202 1 3001 30
103 203 2 3001 35
104 204 1 3002 40
105 205 2 3002 45
106 206 1 3003 50
Person_bnft
ID_BNFT AMT
3000 $1000
3001 $2000
3002 $3000
3003 $4000
Thanks
Followup May 11, 2006 - 7am Central time zone:
no, this is still as clear as a puddle of mud.
sigh, do you see that you have "FOUR PERSON_NO=1" records. You have "THREE PERSON_NO=2" records.
You have only said:
...
I need to update the last name and birth date of
person_no 2 with last name and birthdate of person_no 1.
of the person table.
.......
No one can answer your question - there is no answer to your question. Your question doesn't make
sense yet. Which of the four should be applied to the three, and why.

May 11, 2006 - 8am Central time zone
Reviewer: Reader
ID_PERSON PERSON_NO FIRST_NM LAST_NM BIRTH_DT
200 1 ABC AAA 19991120
201 2 DEF 00000000
202 1 GHI KLM 19921021
203 2 NPQ 00000000
204 1 RST UVW 19721215
205 2 XYZ 00000000
206 1 TTT GGG 19750324
I need to update (last_nm,birth_dt) of ID_PERSON=201(person_no=2) with the value of
(last_nm,birth_dt) in ID_PERSON=200, if the birth_dt of 201 is '00000000' and last_nm of 201 is
null
I need to update (last_nm,birth_dt) of ID_PERSON=203(person_no=2) with the value of
(last_nm,birth_dt) in ID_PERSON=202, if the birth_dt of 202 is '00000000' and last_nm of 202 is
null
I need to update (last_nm,birth_dt) of ID_PERSON=205(person_no=2) with the value of
(last_nm,birth_dt) in ID_PERSON=204, if the birth_dt of 204 is '00000000' and last_nm of 204 is
null.
Thanks
Followup May 11, 2006 - 7pm Central time zone:
and how pray tell did you "know that"
logic - tell use the LOGIC here.
why does 200 go with 201, are they always sequential, is 200 a fixed constant or will that change.
Yes, it is now "semi-obvious" what you seem to be saying, but I want you to say it, out loud here.
What if the row with id_person = 202 is missing for example. CONSIDER ALL OF THE CASES, be
detailed, be explicit, state requirements - go 100% of the way.

May 11, 2006 - 11am Central time zone
Reviewer: Reader
I am trying to do the following update, but I get the error
"ORA-01779: cannot modify a column which maps to a non key-preserved table "
Can you please let me know what is wrong here?
update
(
select t1.fm1,t1.lm1,t2.fm2,t2.lm2
from
(select p11.ID_PERSON id1
,p11.FIRST_NM fm1
,p11.LAST_NM lm1
,p11.birth_dt dob1
,p11.PERSON_NO per_no1
,p13.ID_BNFT T1_ID_BNFT
from person p11
,person_i_bnft p12
,person_bnft p13
where p11.ID_PERSON = p12.ID_PERSON
and p12.ID_BNFT = p13.ID_BNFT
and p11.PERSON_NO = 1
and p11.BIRTH_DT <> '00000000'
) T1,
(select p21.ID_PERSON id2
,p21.FIRST_NM fm2
,p21.LAST_NM lm2
,p21.birth_dt dob2
,p21.PERSON_NO per_no2
,p23.ID_BNFT T2_ID_BNFT
from person p21
,person_i_bnft p22
,person_bnft p23
where p21.ID_PERSON = p22.ID_PERSON
and p22.ID_BNFT = p23.ID_BNFT
and p21.PERSON_NO = 2
and p21.BIRTH_DT = '00000000'
) T2
where T1_ID_BNFT = T2_ID_BNFT)
set lm2=lm1
,dob2=dob1;
Thanks
Followup May 11, 2006 - 7pm Central time zone:
did you search for "key preserved"
I only ask because those words have not yet appeared here at all and they seem like a logical thing
to search for?
are these two queries the same...?
June 2, 2006 - 5am Central time zone
Reviewer: A reader
Hi
I have a query which uses EXISTS I modified it to a join (reduced 50% logical I/O) however for the
test data we have it seems correct but I am not 100% sure, may you have a look and see if they are
the same?
SELECT *
FROM BCOMMON c
WHERE c.type = 'B'
AND c.i_period = 1
AND EXISTS (
SELECT 1
FROM BCOMMON c2
WHERE c2.type = c.type
AND c2.id_bonif_periodica = c.id_bonif
AND TO_CHAR (c2.f_imputacion, 'YYYYMMDD') = '20060601')
SELECT c.*
FROM BCOMMON c, BCOMMON c2
WHERE c.type='B'
AND c.i_period=1
AND c2.ID_BONIF_PERIODICA=c.ID_BONIF
AND to_char(c2.F_IMPUTACION, 'YYYYMMDD')='20060601'
Followup June 2, 2006 - 10am Central time zone:
No.
you don't have a predicate on the second query to make sure that c2.type is 'B'
if id_bonif_periodica is not unique for a given type/f_imputacion, you cannot just join (it'll
increase the number of returned records)
and please, never do this:
AND TO_CHAR (c2.f_imputacion, 'YYYYMMDD') = '20060601'
do this:
and f_imputacion = to_date( '20060601', 'yyyymmdd' )
or, if the date column has a time component:
and f_imputacion >= to_date( '20060601', 'yyyymmdd' )
and f_imputacion < to_date( '20060601', 'yyyymmdd' )+1
thanks
June 2, 2006 - 6pm Central time zone
Reviewer: A reader
Hi Tom
Thanks for your valuable advices.
join returns more records
June 5, 2006 - 3am Central time zone
Reviewer: A reader
Hi Tom
I added a predicate to make sure that c2.type is 'B' so the query becomes
SELECT c.*
FROM BCOMMON c, BCOMMON c2
WHERE c.type='B'
AND c2.type='B'
AND c.i_period=1
AND c2.ID_BONIF_PERIODICA=c.ID_BONIF
and c2.f_imputacion >= to_date( '20060601', 'yyyymmdd' )
and c2.f_imputacion < to_date( '20060601', 'yyyymmdd' )+1
Since c2.ID_BONIF_PERIODICA is not unique I see more rows (with EXISTS only a rown is returned,
with join two)
I guess join doesnt work in this case?
Sorry for my stupidness :-(
Followup June 5, 2006 - 7am Central time zone:
but that is just what I said above:
...
if id_bonif_periodica is not unique for a given type/f_imputacion, you cannot
just join (it'll increase the number of returned records)
.......
you can use IN, you can use EXISTS, you can join to an inline view that distincts - but if that
column is not unique for the given attributes - you cannot "just join" really - it'll increase the
number of records returned
Merge exists and not exists into one SQL statement
October 10, 2006 - 11am Central time zone
Reviewer: Kandy Train from NY
Hi,
Hope this is an appropriate question for thread.
Here is an example,
drop table picking_tab;
drop table locations_tab;
create table picking_tab (pick_ticket_no number, location_id varchar2(10));
create table locations_tab (location_id varchar2(10), warehouse varchar2(10));
insert into picking_tab values(1001, 'Bos');
insert into picking_tab values(1002, 'Bos');
insert into picking_tab values(1002, 'Chi');
insert into picking_tab values(1003, 'Chi');
insert into locations_tab values('Bos', 'NorthEast');
insert into locations_tab values('Chi', 'MidWest');
commit;
What I would like to have is just one SQL statement so that I can pass one literal "NorthEast" or
"MidWest" to a view with that SQL statement.
When I pass NorthEast, I want NorthEast specific Pick Tickets and any other pick ticket that picks
from more locations, but at least one of them must be from NorthEast.
The results should look like,
PICK_TICKET_NO LOCATION_ID WAREHOUSE
-------------- ----------- ----------
1002 Bos NorthEast
1002 Chi MidWest
1003 Chi MidWest
When I pass MidWest, I only need MidWest Pick Tickets that are not shared with any other region.
The results should look like,
PICK_TICKET_NO LOCATION_ID WAREHOUSE
-------------- ----------- ----------
1003 Chi MidWest
The example may look silly, as I have to come up with something simple for you to help me.
I appreciate your help as always.
Followup October 10, 2006 - 7pm Central time zone:
why are there not two northeast records in the first result?
and you know, you do not "pass" things to a view, you apply predicates to a view, what do you mean
by "pass"???
Merge exists and not exists into one SQL statement
October 10, 2006 - 11am Central time zone
Reviewer: Kandy Train from NY
In the above posting, I didn't mention what I have tried so far. I am putting it here in case you
wanted to know it.
--Results
PICK_TICKET_NO LOCATION_ID WAREHOUSE
-------------- ----------- ----------
1003 Chi MidWest
To get the above results, I used
select p.pick_ticket_no, p.location_id, l.warehouse
from picking_tab p, locations_tab l
where p.location_id = l.location_id
and not exists (select lt.warehouse
from picking_tab pt, locations_tab lt
where pt.pick_ticket_no = p.pick_ticket_no
and pt.location_id = lt.location_id
and lt.warehouse = 'NorthEast');
--Results
PICK_TICKET_NO LOCATION_ID WAREHOUSE
-------------- ----------- ----------
1002 Bos NorthEast
1002 Chi MidWest
1003 Chi MidWest
To get the above results, I used
select p.pick_ticket_no, p.location_id, l.warehouse
from picking_tab p, locations_tab l
where p.location_id = l.location_id
and exists (select lt.warehouse
from picking_tab pt, locations_tab lt
where pt.pick_ticket_no = p.pick_ticket_no
and pt.location_id = lt.location_id
and lt.warehouse = 'MidWest');
I want to have just one SQL where I could pass either "MidWest" or "NorthEast".
Thanks,
Followup October 10, 2006 - 8pm Central time zone:
see above, why just one northeast, doesn't make sense.
Merge exists and not exists into one SQL statement
October 10, 2006 - 8pm Central time zone
Reviewer: Kandy train from NY
Hi Tom,
You are right and my first result (in first posting) should look like,
PICK_TICKET_NO LOCATION_ID WAREHOUSE
-------------- ----------- ----------
1001 Bos NorthEast
1002 Bos NorthEast
1002 Chi MidWest
And this is the SQL statement I have used to get this results.
select p.pick_ticket_no, p.location_id, l.warehouse
from picking_tab p, locations_tab l
where p.location_id = l.location_id
and exists (select lt.warehouse
from picking_tab pt, locations_tab lt
where pt.pick_ticket_no = p.pick_ticket_no
and pt.location_id = lt.location_id
and lt.warehouse = 'NorthEast');
As you can see, It is the same SQL statement, and the only difference is in EXISTS and NOT EXISTS
clauses.
I have a ref cursor that has a return type. I tried to use native dynamic SQL by creating two views
with these two sql statements and did hit a error, which is a limitation in 9i.
So, now I have to create a one view that can produce result sets in one SQL statement.
May be I am making it harder for you to understand.
If that's the case, I will post whole procedure togather with the ref cursor tomorrow.
Thanks,
Followup October 10, 2006 - 8pm Central time zone:
ok, why just one midwest then, why not both.
Merge exists and not exists into one SQL statement
October 11, 2006 - 6am Central time zone
Reviewer: Kandy Train from NY
Pick_TIcket_Id 1002 has two records (Picking from two locations).
When the SQL is run from MidWest, They should only get the Pick Tickets that are purely from
MidWest. If a Pick Ticket is filled from NorthEast and MidWest, it is not handled from MidWest, but
from NorthEast.
When the SQL is run from NorthEast, They should get all the
Pick tickets that at least have a record in pickcing_tab for the same pick_ticket_no.
Does that make sense??
Thanks,
Followup October 11, 2006 - 8am Central time zone:
nope, sorry it does not, this is sounding alot like "procedural logic" to me here, the rules seem
arbitrary.
Merge exists and not exists into one SQL statement
October 11, 2006 - 11am Central time zone
Reviewer: Kandy Train from NY
Thanks and I will use Procedural code to solve this issue.
Thanks again and appreciate your help
Performance Issue when using Exists...
November 2, 2006 - 12pm Central time zone
Reviewer: VLS from Bombay, India
Hi Tom,
In recent Oracle Magazine, in one of the example, you mentioned that exists will perform better in
cases where the columns of only one table is used in SELECT statement. For example,
select x.a, x.b. x.c
from x, y
where x.a = y.a
and x.b = :b1
can be coded as
select x.a, x.b. x.c
from x
where x.b = :b1
and exists
(select 'x' from y
where y.a = x.a)
But, in my cases, I have observed that the query performance degrades dratiscally when I change
these type of queries to make use of EXISTS. For example,
SELECT dn.dn_num msisdn, ca.customer_id, ca.custcode, co.co_id,
t.ttl_des title, cc.ccfname fname, cc.cclname lname, ca.birthdate,
cc.ccemail email, ca.billcycle, ca.csclimit csc_limit, ca.prgcode,
prg.prgname, co.co_activated, ch.ch_status, co.tmcode, tm.shdes shdes,
sm.SM_PUK , cc.CCZIP , cc.ccline3 , ccline4 , ccline5 , cccity
FROM (select customer_id, custcode, birthdate, billcycle, csclimit,
prgcode, rownum srno from customer_all
where prgcode in
(select prgcode from hss_cust_segment where description='POST')) ca,
contract_all co,
curr_co_status ch,
ccontact_all cc,
directory_number dn,
pricegroup_all prg,
mputmtab tm,
title t,
storage_medium sm
WHERE ca.customer_id = co.customer_id
AND cc.customer_id = ca.customer_id
AND cc.ccbill = 'X'
AND cc.CCTITLE = t.TTL_ID(+)
AND co.co_id = ch.co_id
AND ch.ch_status IN ('a', 's')
AND ca.prgcode = prg.prgcode
AND co.tmcode = tm.tmcode
AND tm.vscode = 1
and exists
(select 'x' from contr_services_cap cs
where cs.co_id = co.co_id
AND cs.sncode = 1
AND cs.dn_id = dn.dn_id)
and exists
(select 'x' from contr_devices cd
where cd.co_id = co.co_id
AND cd.cd_deactiv_date IS NULL
AND cd.port_id = sm.sm_id);
Above query did not fetch result in 6 hours whereas below query executed in 4 hours.
SELECT dn.dn_num msisdn, ca.customer_id, ca.custcode, co.co_id,
t.ttl_des title, cc.ccfname fname, cc.cclname lname, ca.birthdate,
cc.ccemail email, ca.billcycle, ca.csclimit csc_limit, ca.prgcode,
prg.prgname, co.co_activated, ch.ch_status, co.tmcode, tm.shdes shdes,
sm.SM_PUK , cc.CCZIP , cc.ccline3 , ccline4 , ccline5 , cccity
FROM (select customer_id, custcode, birthdate, billcycle, csclimit,
prgcode, rownum srno from customer_all
where prgcode in
(select prgcode from hss_cust_segment where description='POST')) ca,
contract_all co,
curr_co_status ch,
ccontact_all cc,
directory_number dn,
pricegroup_all prg,
mputmtab tm,
title t,
storage_medium sm,
contr_devices cd,
contr_services_cap cs
WHERE ca.customer_id = co.customer_id
AND cc.customer_id = ca.customer_id
AND cc.ccbill = 'X'
AND cc.CCTITLE = t.TTL_ID(+)
AND co.co_id = ch.co_id
AND ch.ch_status IN ('a', 's')
and cs.co_id = co.co_id
AND cs.sncode = 1
AND cs.seqno = (SELECT MAX (seqno)
FROM contr_services_cap x
WHERE cs.co_id = x.co_id AND x.sncode = 1)
AND ca.prgcode = prg.prgcode
AND co.tmcode = tm.tmcode
AND tm.vscode = 1
AND cs.dn_id = dn.dn_id
and cd.co_id = co.co_id
AND cd.cd_deactiv_date IS NULL
AND cd.port_id = sm.sm_id;
Similarly, there were other incidence when exists has caused performance issues. Also, the tables
used inside the exists condition are big and have proper indexes even though these fail.
What could be the problem ? Please help...
Regards
VLS
Followup November 2, 2006 - 12pm Central time zone:
I did not say "would work better"
I said "semantically and syntactically it would be MORE MEANINGFUL"
are you using the cbo.
Query Performance with Exists
November 2, 2006 - 10pm Central time zone
Reviewer: VLS from Bombay, India
Yes Tom, I am using CBO. The inner query also uses a proper indexes.
Regards
VLS
Followup November 3, 2006 - 9am Central time zone:
those two queries do not seem to be even remotely the same here?
how can you even compare them? they do not return the same results.
Difference between the 2 Queries
November 3, 2006 - 9am Central time zone
Reviewer: VLS from Bombay, India
Hi Tom,
Would appreciate if you can let me know the difference between the two queries. Second query is the
original query and first one is modified by me. In the Original query, contr_services_cap is joined
back to contr_services_cap to get max(seqno) so that only one co_id is returned. I tried to avoid
this aggregation by using EXISTS and thought that aggregation should resource intensive.
Regards
VLS
Followup November 3, 2006 - 10am Central time zone:
that is precisely the bit that seems strange - how can they be equivalent?
one says "keep only the row with the max seqno"
one says "keep any row that has a mate over here"
first query - no predicate on seqno
second query - predicate on seqno
don't get how they are the same.
In response to my previous post
November 3, 2006 - 12pm Central time zone
Reviewer: VLS from Bombay, India
Hi Tom,
"one says "keep only the row with the max seqno"" v/s one says "keep any row that has a mate over
here". How does it matter ? For example,
Table A
Column x Column y
------------------------
100 128
100 290
100 300
100 180
200 768
200 190
300 650
select x from a o
where y = (select max(y) from a i where i.x=o.x.x)
will be
100
200
300
No matter 300 has only one value but this one value will be a max for this value of x. So, anyway
each of x value will have a max. I used this logic and hence instead of taking max used EXISTS.
Now, for the predicates on SEQNO, this predicate is only used to get a single row for each co_id.
Again using max. This was not required when I used EXISTS.
Regards
VLS
Followup November 3, 2006 - 2pm Central time zone:
you cannot compare the two queries you have.
where exists (something)
where something = (select max(somethingelse)
are not comparable. Show me the WHERE exists that is comparable to you:
select x from a o
where y = (select max(y) from a i where i.x=o.x.x)
please!!
I don't have (nor do I want) your schema
I don't know your relationships
I did not decompose your query
I do however have a strong feeling "they just ain't the same"
exits perform better?
January 18, 2007 - 9am Central time zone
Reviewer: A reader
Hi Tom
I had a query like this : which was doing over 2250 LIO/execute -
select DISTINCT
BO128_Accounts_Mod.Entity_cre_flag,
BO128_Accounts_Mod.MainTableID, BO128_Accounts_Mod.BOCreatedBy,
BO128_Accounts_Mod.Cust_Last_Name, BO128_Accounts_Mod.Cust_First_Name,
BO128_Accounts_Mod.OrgKey,
fn_CategoryLookup(BO128_Accounts_Mod.Cust_Type, 'ACCOUNT_TYPE','en_US'),
BO128_Accounts_Mod.Cust_Type,
BO128_Accounts_Mod.MakerID, BO128_Accounts_Mod.MakerID,
BO128_Accounts_Mod.ownedUserID, BO128_Accounts_Mod.AssignedTo,
BO128_Accounts_Mod.AssignedTo, BO128_Accounts_Mod.ownedUserID,
BO128_Accounts_Mod.AssignedTo, BO128_Accounts_Mod.AssignedToGroup,
fn_CategoryLookup(BO128_Accounts_Mod.StageName,'PROCESSSTEP_STAGE','en_US'),
BO128_Accounts_Mod.StageName,
fn_CategoryLookup(BO128_Accounts_Mod.RecordStatus,'RECORD_STATUS','en_US'),
BO128_Accounts_Mod.RecordStatus,
fn_CategoryLookup(BO128_Accounts_Mod.LastOperPerformed,'LAST_OPERATION', 'en_US'),
BO128_Accounts_Mod.LastOperPerformed,
BO128_Accounts_Mod.ProcessID, decode(BO128_Accounts_Mod.AccountID,NULL,NULL, 'srmBOObj:AccountModBO/AccountModBO.accountID:' || BO128_Accounts_Mod.AccountID),
BO128_Accounts_Mod.GroupID,
BO128_Accounts_Mod.createdUserID, BO128_Accounts_Mod.AccountID,
BO128_Accounts_Mod.AccessOwnerGroup, BO128_Accounts_Mod.LastEditedPage,
BO128_Accounts_Mod.Cust_DOB, BO128_Accounts_Mod.TFPartyFlag,
BO128_Accounts_Mod.Negated, BO128_Accounts_Mod.BlackListed,
BO128_Accounts_Mod.Suspended, BO128_Accounts_Mod.IsMCEdited,
BO128_Accounts_Mod.AssignedByUserID, BO128_Accounts_Mod.IsTampered,
BO128_Accounts_Mod.OwnerGroup
From
ao_acctmod BO128_Accounts_Mod, ao_demomod BO129_Demographic_Mod
where BO128_Accounts_Mod.AccountID = BO129_Demographic_Mod.AccountID AND
( ( ( ( ( ( ( ( ( BO128_Accounts_Mod.MakerID = 1 ) OR (
BO128_Accounts_Mod.ownedUserID = 400013 ) ) ) OR (
BO128_Accounts_Mod.AssignedTo = 400013 ) ) ) AND ( ( ( ( (
BO128_Accounts_Mod.RecordStatus = 'F' ) OR (
BO128_Accounts_Mod.RecordStatus = 'F' ) ) ) OR (
BO128_Accounts_Mod.RecordStatus = 'F' ) ) ) ) ) AND (rownum <= 201) ) )
order by BO128_Accounts_Mod.Cust_Last_Name Asc
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 15 0.10 0.08 0 443 0 201
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 0.11 0.09 0 443 0 201
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 38 (CRMUSER)
Rows Row Source Operation
------- ---------------------------------------------------
201 SORT UNIQUE (cr=2252 pr=0 pw=0 time=951615 us)
201 COUNT STOPKEY (cr=443 pr=0 pw=0 time=4815 us)
201 NESTED LOOPS (cr=443 pr=0 pw=0 time=4812 us)
206 TABLE ACCESS FULL AO_ACCTMOD (cr=235 pr=0 pw=0 time=3567 us)
201 INDEX RANGE SCAN IX_AO_DEMO_ACCID (cr=208 pr=0 pw=0 time=1943 us)(object id 18490)
when I changed it to like below (used exists instead of distinct) then it is doing alot less LIO's and peforms better too (in terms of response time) .
select
BO128_Accounts_Mod.Entity_cre_flag,
BO128_Accounts_Mod.MainTableID, BO128_Accounts_Mod.BOCreatedBy,
BO128_Accounts_Mod.Cust_Last_Name, BO128_Accounts_Mod.Cust_First_Name,
BO128_Accounts_Mod.OrgKey,
fn_CategoryLookup(BO128_Accounts_Mod.Cust_Type, 'ACCOUNT_TYPE','en_US'),
BO128_Accounts_Mod.Cust_Type,
BO128_Accounts_Mod.MakerID, BO128_Accounts_Mod.MakerID,
BO128_Accounts_Mod.ownedUserID, BO128_Accounts_Mod.AssignedTo,
BO128_Accounts_Mod.AssignedTo, BO128_Accounts_Mod.ownedUserID,
BO128_Accounts_Mod.AssignedTo, BO128_Accounts_Mod.AssignedToGroup,
fn_CategoryLookup(BO128_Accounts_Mod.StageName,'PROCESSSTEP_STAGE','en_US'),
BO128_Accounts_Mod.StageName,
fn_CategoryLookup(BO128_Accounts_Mod.RecordStatus,'RECORD_STATUS','en_US'),
BO128_Accounts_Mod.RecordStatus,
fn_CategoryLookup(BO128_Accounts_Mod.LastOperPerformed,'LAST_OPERATION', 'en_US'),
BO128_Accounts_Mod.LastOperPerformed,
BO128_Accounts_Mod.ProcessID, decode(BO128_Accounts_Mod.AccountID,NULL,NULL, 'srmBOObj:AccountModBO/AccountModBO.accountID:' || BO128_Accounts_Mod.AccountID),
BO128_Accounts_Mod.GroupID,
BO128_Accounts_Mod.createdUserID, BO128_Accounts_Mod.AccountID,
BO128_Accounts_Mod.AccessOwnerGroup, BO128_Accounts_Mod.LastEditedPage,
BO128_Accounts_Mod.Cust_DOB, BO128_Accounts_Mod.TFPartyFlag,
BO128_Accounts_Mod.Negated, BO128_Accounts_Mod.BlackListed,
BO128_Accounts_Mod.Suspended, BO128_Accounts_Mod.IsMCEdited,
BO128_Accounts_Mod.AssignedByUserID, BO128_Accounts_Mod.IsTampered,
BO128_Accounts_Mod.OwnerGroup
From
ao_acctmod BO128_Accounts_Mod
where exists ( select null from
ao_demomod BO129_Demographic_Mod
where
BO128_Accounts_Mod.AccountID = BO129_Demographic_Mod.AccountID AND
( ( BO128_Accounts_Mod.MakerID = 1 ) OR ( BO128_Accounts_Mod.ownedUserID = 400020 ) OR ( BO128_Accounts_Mod.AssignedTo = 400020 ) ) AND ( BO128_Accounts_Mod.RecordStatus = 'F' OR BO128_Accounts_Mod.RecordStatus = 'F' OR BO128_Accounts_Mod.RecordStatus = 'F' ) )
AND rownum <= 201
order by BO128_Accounts_Mod.Cust_Last_Name Asc
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 15 0.07 0.17 0 735 0 201
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 0.07 0.17 0 735 0 201
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 38 (CRMUSER)
Rows Row Source Operation
------- ---------------------------------------------------
201 COUNT STOPKEY (cr=735 pr=0 pw=0 time=5822 us)
201 FILTER (cr=735 pr=0 pw=0 time=5618 us)
206 TABLE ACCESS BY INDEX ROWID AO_ACCTMOD (cr=323 pr=0 pw=0 time=2705 us)
206 INDEX FULL SCAN IX_AO1_CUST_LAST_NAME (cr=17 pr=0 pw=0 time=225 us)(object id 18495)
201 FILTER (cr=412 pr=0 pw=0 time=2461 us)
201 INDEX RANGE SCAN IX_AO_DEMO_ACCID (cr=412 pr=0 pw=0 time=2071 us)(object id 18490)
my question - what is the reason - how distinct peforms poor than exists..
Ajeet
change this query
March 26, 2007 - 7am Central time zone
Reviewer: A reader
select PGM_VEH_CONFIG.CNTRY_CD
, PGM_VEH_CONFIG.LANG_CD
, PGM_VEH_CONFIG.BUSNS_FCN_USE_CD
, PGM_VEH_CONFIG.MODL_YR_NBR
, PGM_VEH_CONFIG.PVC_ID
, RTRIM(PGM_VEH_CONFIG.PVC_DESC)
, PGM_VEH_CONFIG.PVC_MSRP
, TO_CHAR(PGM_VEH_CONFIG.PVC_EFCTV_DT,'yyyy-mm-dd')
, TO_CHAR(PGM_VEH_CONFIG.PVC_TERMNT_DT,'yyyy-mm-dd')
, PGM_VEH_CONFIG.SELLNG_SRC_CD
, PGM_VEH_CONFIG.NAMPLT_CD
, PGM_VEH_CONFIG.MDSNG_MODL_DESGTR
, PGM_VEH_CONFIG.USG_SEQ_NBR
, PGM_VEH_CONFIG.MKTG_BODY_STYLE_CD
, PGM_VEH_CONFIG.SUB_BRAND_CD
, PGM_VEH_CONFIG.PACKG_OPTN_CD
, PGM_VEH_CONFIG.SEGMENT_CD
, PGM_VEH_CONFIG.COMPRS_MODL_CD
, PGM_VEH_CONFIG.CONFIG_MODL_CD
, PGM_VEH_CONFIG.OPTN_USG_STMT
from gmceycea.PGM_VEH_CONFIG
, gmceycea.PRICE_RANGE
where PGM_VEH_CONFIG.BUSNS_FCN_USE_CD ='038'
and PGM_VEH_CONFIG.CNTRY_CD = PRICE_RANGE.CNTRY_CD
and PGM_VEH_CONFIG.LANG_CD = PRICE_RANGE.LANG_CD
and PGM_VEH_CONFIG.BUSNS_FCN_USE_CD = PRICE_RANGE.BUSNS_FCN_USE_CD
and PGM_VEH_CONFIG.PVC_MSRP between PRICE_RANGE.START_PRICE_RANGE and PRICE_RANGE.END_PRICE_RANGE
AND EXISTS (SELECT * from gmceycea.PVC_COLR
where pvc_colr.cntry_cd = pgm_veh_config.cntry_cd
and pvc_colr.lang_cd = pgm_veh_config.lang_cd
and pvc_colr.busns_fcn_use_cd = pgm_veh_config.busns_fcn_use_cd
and pvc_colr.modl_yr_nbr = pgm_veh_config.modl_yr_nbr
and pvc_colr.pvc_id = pgm_veh_config.pvc_id
and pvc_colr.busns_fcn_use_cd = '038'
and PGM_VEH_CONFIG.BUSNS_FCN_USE_CD= '038');
if PGM_VEH_CONFIG.BUSNS_FCN_USE_CD= '038'
column having the value 038 then EXISTS part should work otherwise exists part should not work.
kindly rephrase it.
Followup March 26, 2007 - 11am Central time zone:
huh?
Question
March 27, 2007 - 9am Central time zone
Reviewer: Reader
Hello,
How do we make Oracle to use index if the query has NOT IN?
SELECT *
FROM ORDERS
WHERE STATUS NOT IN ('NEW','OLD')
I created a Function Based Index.
CREATE INDEX IDX_FBI ON ORDERS (DECODE(STATUS,'NEW','1','OLD','1',0)) = '0'
SELECT *
FROM ORDERS
WHERE DECODE(STATUS,'NEW','1','OLD','1',0)) = '0'
It does a Full table scan but when I do a COUNT(*) it does a Fast Full Index scan. How to use index while using NOT IN?
Thanks
Followup March 27, 2007 - 10am Central time zone:
no really, what index did you create - given the one you demonstrate does not work.
and what version are you using.
ops$tkyte%ORA10GR2> /*
ops$tkyte%ORA10GR2> drop table t;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> create table t
ops$tkyte%ORA10GR2> as
ops$tkyte%ORA10GR2> select decode( mod(rownum,3),0,'OLD',1,'NEW' ) ostatus, a.*
ops$tkyte%ORA10GR2> from all_objects a;
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte%ORA10GR2> */
ops$tkyte%ORA10GR2> drop index t_idx;
Index dropped.
ops$tkyte%ORA10GR2> create index t_idx on t( decode( ostatus, 'NEW', null, 'OLD', null, 'YES' ) );
Index created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> @at
ops$tkyte%ORA10GR2> column PLAN_TABLE_OUTPUT format a72 truncate
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select * from t where decode( ostatus, 'NEW', null, 'OLD', null, 'YES' ) =
'YES';
Execution Plan
----------------------------------------------------------
Plan hash value: 470836197
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 48000 | 44 (0
| 1 | TABLE ACCESS BY INDEX ROWID| T | 500 | 48000 | 44 (0
|* 2 | INDEX RANGE SCAN | T_IDX | 200 | | 35 (0
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(DECODE("OSTATUS",'NEW',NULL,'OLD',NULL,'YES')='YES')
ops$tkyte%ORA10GR2> set autotrace off

March 28, 2007 - 10am Central time zone
Reviewer: A reader
Tom,
I have a problem with a select I am not able to built
consider that
select * from t
where decode (:x, null,'999', t.a) in (select t2.a from t2 where t2.a = :x)
This is not working when :x is null
When :x is null I will normally have this
select * from t
where '999'in (select t2.a from t2 where t2.a = '')
What I would like to have in this case is a sort of
select * from t
where '999' in ('999')
Could you please help me
Thanks a lot for your help
Followup March 28, 2007 - 12pm Central time zone:
... This is not working when :x is null ...
sure it is, it works perfectly. It just probably doesn't do what you want - that is something else altogether.
select *
from t
where :x is null or a in (select t2.a from t2 where t2.a = :x)
seems to say exactly what you meant.
NOT IN too slow
May 4, 2007 - 2am Central time zone
Reviewer: Sagar from Mumbai India
Hi Tom,
I have one small NOT IN query on 2 tables which when run on 9i database takes more than 2 hours, I had to kill the session without ouput.But IN works in 2 seconds!.
I copied same tables and created same indexes in 10gR2 , now the query takes around 5-6 minutes in 10gR2.
For Details see below.I am interested in knowing what causes consistent gets to be so large. There is no one connected to the user other than me.
SQL> select count(1) from camps_upload ;
COUNT(1)
----------
43656
Elapsed: 00:00:02.65
Execution Plan
----------------------------------------------------------
Plan hash value: 889239046
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 184 (2)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| CAMPS_UPLOAD | 43656 | 184 (2)| 00:00:03 |
---------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
805 consistent gets
781 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select count(1) from camps_master_upload ;
COUNT(1)
----------
38266
Elapsed: 00:00:07.71
Execution Plan
----------------------------------------------------------
Plan hash value: 1748044587
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 182 (2)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| CAMPS_MASTER_UPLOAD | 38266 | 182 (2)| 00:00:03 |
----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
338 recursive calls
0 db block gets
852 consistent gets
792 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select bytes/(1024*1024) usage from user_segments where segment_name IN
('CAMPS_UPLOAD','CAMPS
_MASTER_UPLOAD') ;
USAGE
----------
7
7
Elapsed: 00:00:04.75
Execution Plan
----------------------------------------------------------
ERROR:
ORA-01039: insufficient privileges on underlying objects of the view
SP2-0612: Error generating AUTOTRACE EXPLAIN report
Statistics
----------------------------------------------------------
2017 recursive calls
0 db block gets
4111 consistent gets
309 physical reads
0 redo size
444 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
56 sorts (memory)
0 sorts (disk)
2 rows processed
SQL> select count(1) from camps_upload where
2 (folio_no,comp_code) in ( select folio_no,comp_code from camps_master_upload) ;
COUNT(1)
----------
38121
Elapsed: 00:00:02.50
Execution Plan
----------------------------------------------------------
Plan hash value: 123087718
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 66 (7)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | HASH JOIN RIGHT SEMI | | 1 | 26 | 66 (7)| 00:00:01 |
| 3 | INDEX FAST FULL SCAN| IND_CMP_MAST_UPD | 38266 | 485K| 30 (4)| 00:00:01 |
| 4 | INDEX FAST FULL SCAN| IND_CMP_UPD | 43656 | 554K| 34 (3)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FOLIO_NO"="FOLIO_NO" AND "COMP_CODE"="COMP_CODE")
Statistics
----------------------------------------------------------
551 recursive calls
0 db block gets
380 consistent gets
273 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> ed
Wrote file afiedt.buf
1 select count(1) from camps_upload where
2* (folio_no,comp_code) not in ( select folio_no,comp_code from camps_master_upload)
SQL> /
COUNT(1)
----------
5535
Elapsed: 00:05:57.61
Execution Plan
----------------------------------------------------------
Plan hash value: 3220892056
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 7228K (3)| 24:05:45 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL| CAMPS_UPLOAD | 43656 | 554K| 188 (4)| 00:00:03 |
|* 4 | TABLE ACCESS FULL| CAMPS_MASTER_UPLOAD | 1 | 13 | 185 (3)| 00:00:03 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "CAMPS_MASTER_UPLOAD"
"CAMPS_MASTER_UPLOAD" WHERE LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2)))
4 - filter(LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
19545709 consistent gets
20 physical reads
0 redo size
412 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Even though index is not used why consistent gets are so abnormally high?.
Followup May 4, 2007 - 1pm Central time zone:
the camps_master_upload table was full scanned 43,656 times.
NOT IN too slow
May 5, 2007 - 1am Central time zone
Reviewer: Sagar from Mumbai , India
Thanks Tom,
I have used "NOT EXIST" instead of "NOT IN" and the query executes within 3-4 seconds.Both tables do not have NULLs in the columns used in the query.
SQL> ED
Wrote file afiedt.buf
1 select COUNT(1) FROM camps_upload a
2 where not exists ( select 1 from camps_master_upload b where a.folio_no =b.folio_no
3* and a.comp_code = b.comp_code )
SQL> /
COUNT(1)
----------
5535
Elapsed: 00:00:03.46
Execution Plan
----------------------------------------------------------
Plan hash value: 1970687259
--------------------------------------------------------------------------------
-----------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
--------------------------------------------------------------------------------
-----------
| 0 | SELECT STATEMENT | | 1 | 26 | 220 (5)|
00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 26 | |
|
|* 2 | HASH JOIN RIGHT ANTI | | 43655 | 1108K| 220 (5)|
00:00:03 |
| 3 | INDEX FAST FULL SCAN| IND_CMP_MAST_UPD | 38266 | 485K| 30 (4)|
00:00:01 |
| 4 | TABLE ACCESS FULL | CAMPS_UPLOAD | 43656 | 554K| 188 (4)|
00:00:03 |
--------------------------------------------------------------------------------
-----------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"."FOLIO_NO"="B"."FOLIO_NO" AND "A"."COMP_CODE"="B"."COMP_CODE")
Statistics
----------------------------------------------------------
17 recursive calls
0 db block gets
939 consistent gets
239 physical reads
0 redo size
412 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Can you please explain how "NOT IN" works as "IN" executes within 3-4 seconds even if we use no indexes.This will be helpful for all.
Thanks in advance.
Wrote file afiedt.buf
1 select COUNT(1) FROM camps_upload
2 where (folio_no,comp_code) in
3* ( select folio_no,comp_code from camps_master_upload)
SQL> /
COUNT(1)
----------
38121
Elapsed: 00:00:12.18
Execution Plan
----------------------------------------------------------
Plan hash value: 474859051
--------------------------------------------------------------------------------
-------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
)| Time |
--------------------------------------------------------------------------------
-------------
| 0 | SELECT STATEMENT | | 1 | 26 | 375 (4
)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 26 |
| |
|* 2 | HASH JOIN RIGHT SEMI| | 1 | 26 | 375 (4
)| 00:00:05 |
| 3 | TABLE ACCESS FULL | CAMPS_MASTER_UPLOAD | 38266 | 485K| 185 (3
)| 00:00:03 |
| 4 | TABLE ACCESS FULL | CAMPS_UPLOAD | 43656 | 554K| 188 (4
)| 00:00:03 |
--------------------------------------------------------------------------------
-------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FOLIO_NO"="FOLIO_NO" AND "COMP_CODE"="COMP_CODE")
Statistics
----------------------------------------------------------
398 recursive calls
0 db block gets
1662 consistent gets
559 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
10 sorts (memory)
0 sorts (disk)
1 rows processed
Followup May 8, 2007 - 10am Central time zone:
well, I don't know about you - but I see a "not exists" and an "in"
that is
I see entirely different queries here that do entirely different things and return entirely different answers.
NOT IN too slow
May 9, 2007 - 1am Central time zone
Reviewer: Sagar from Mumbai India.
Thanks for the reply Tom,
Actually my question was continuation of the previously asked quetion just above.
Anyway here is the problem I am facing.I have a simple "IN" query which executes in less than 5 seconds.If I put "NOT IN" in place of "IN" the query takes more than 5 minutes.Both the tables have around 40,000 records.You had replied that one of the table is full scanned 40,000 times in case of "NOT IN".Then why it is not full scanned it case of "IN"?.
Thanks in advance.
SQL> ed
Wrote file afiedt.buf
1 select count(1) from camps_upload
2 where
3 (folio_no,comp_code) in
4* ( select folio_no,comp_code from camps_master_upload )
SQL> /
COUNT(1)
----------
38121
Elapsed: 00:00:04.28
Execution Plan
----------------------------------------------------------
Plan hash value: 474859051
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 375 (4)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | HASH JOIN RIGHT SEMI| | 1 | 26 | 375 (4)| 00:00:05 |
| 3 | TABLE ACCESS FULL | CAMPS_MASTER_UPLOAD | 38266 | 485K| 185 (3)| 00:00:03 |
| 4 | TABLE ACCESS FULL | CAMPS_UPLOAD | 43656 | 554K| 188 (4)| 00:00:03 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FOLIO_NO"="FOLIO_NO" AND "COMP_CODE"="COMP_CODE")
Statistics
----------------------------------------------------------
654 recursive calls
0 db block gets
1704 consistent gets
1596 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> ed
Wrote file afiedt.buf
1 select count(1) from camps_upload
2 where
3 (folio_no,comp_code) not in
4* ( select folio_no,comp_code from camps_master_upload )
SQL> /
COUNT(1)
----------
5535
Elapsed: 00:05:28.67
Execution Plan
----------------------------------------------------------
Plan hash value: 3220892056
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 7228K (3)| 24:05:45 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL| CAMPS_UPLOAD | 43656 | 554K| 188 (4)| 00:00:03 |
|* 4 | TABLE ACCESS FULL| CAMPS_MASTER_UPLOAD | 1 | 13 | 185 (3)| 00:00:03 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "CAMPS_MASTER_UPLOAD"
"CAMPS_MASTER_UPLOAD" WHERE LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2)))
4 - filter(LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
19545709 consistent gets
1591 physical reads
0 redo size
412 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Followup May 11, 2007 - 9am Central time zone:
because in and not in are ENTIRELY DIFFERENT
I don't know why you would even consider comparing them.
NOT IN too slow
May 12, 2007 - 4am Central time zone
Reviewer: Sagar from Mumbai , India
Thanks for the reply Tom,
Yes IN and NOT IN are different, but from point of view of algorithm for finding elements from the set A also in the set B (the IN process ) and finding elements in set A that are not in set B ( the NOT IN process) they are very much similar.
Followup May 14, 2007 - 12pm Central time zone:
huh, you are kidding right?
they are very different.
think about this - you are asked to find "where x = 5", that is very very very different from "where x <> 5"
think about what you just said. Other than the fact both are written in SQL, there is nothing similar about them at all.
NOT IN too slow
May 16, 2007 - 5am Central time zone
Reviewer: Sagar from Mumbai , India
Thanks for the reply Tom,
I am still not convinced.
Consider this manual process.
There is person having 2 boxes containing some objects.
Box1 has 2 objects.
Box2 has 3 objects.
1. He is told to find out objects from box1 that are also in box2.
Suppose he takes 1 minute to finish.This is the IN process.
2. Now we ask him to find objects from box1 that are not there in box2. This is NOT IN process.
For this he takes 100 minutes!.
(the eg. I showed you for IN executes within 4 seconds while for NOT IN with same data it takes 400 seconds)
In this case can we believe him when he takes 100 minutes for this ?.
IN and NOT IN have similar algorithm only the difference is that search is carried out till last element in the case of NOT IN but I don't think this can justify the delay.
Followup May 16, 2007 - 10am Central time zone:
do you understand why
where x = 5
where x <> 5
would pretty much be processed very differently???
eg: where x = 5, obviously, a candidate for an index - find 5, get data
where x <> 5, obviously - NOT a candidate for an index, indexes are using to "find" stuff, not "not find" stuff.
in and not in are like NIGHT and DAY. eg: DIFFERENT.
NOT IN and filter
May 16, 2007 - 7pm Central time zone
Reviewer: Jonathan Lewis from NY NY
Sagar, in one of your posts you say: "Both tables do not have NULLs in the columns used in the query." However, the predicate section of the execution plan shows that you have not declared these columns to be NOT NULL:
2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "CAMPS_MASTER_UPLOAD"
"CAMPS_MASTER_UPLOAD" WHERE LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2)))
4 - filter(LNNVL("FOLIO_NO"<>:B1) AND LNNVL("COMP_CODE"<>:B2))
If you do not declare the columns as NOT NULL, the optimizer has to assume that they may contain nulls, hence the use of the lnnvl() function calls that disable the use of indexes, and leave Oracle doing a tablescan of the camps_master_upload table for (potientially) every row in the camps_upload table.
Declare the columns not null, or add '{column} is not null' predicates for all for column referenced, and you would probably get the same anti-join as you did with the 'not exists' version of the query. For related comments see: http://jonathanlewis.wordpress.com/2007/02/25/not-in/
NOT IN too slow
May 17, 2007 - 6am Central time zone
Reviewer: Sagar from Mumbai , India.
Thanks Jonathan!!!
Jonathan you are absolutely right!!!.Thank you very much.
I just added not null constraint to the referenced columns and its DONE ;).See the output below.Its good example for oracle fans.
Also thanks Tom for taking his very precious time for this query.
SQL> alter table camps_master_upload modify (folio_no not null , comp_code not null ) ;
Table altered.
SQL> alter table camps_upload modify (folio_no not null , comp_code not null ) ;
Table altered.
SQL> ed
Wrote file afiedt.buf
1 select count(1) from camps_upload
2 where
3* (folio_no,comp_code) in ( select folio_no,comp_code from camps_master_upload )
SQL> /
COUNT(1)
----------
38121
Elapsed: 00:00:00.54
Execution Plan
----------------------------------------------------------
Plan hash value: 474859051
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 375 (4)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | HASH JOIN RIGHT SEMI| | 1 | 26 | 375 (4)| 00:00:05 |
| 3 | TABLE ACCESS FULL | CAMPS_MASTER_UPLOAD | 38266 | 485K| 185 (3)| 00:00:03 |
| 4 | TABLE ACCESS FULL | CAMPS_UPLOAD | 43656 | 554K| 188 (4)| 00:00:03 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FOLIO_NO"="FOLIO_NO" AND "COMP_CODE"="COMP_CODE")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1601 consistent gets
0 physical reads
0 redo size
413 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> ed
Wrote file afiedt.buf
1 select count(1) from camps_upload
2 where
3* (folio_no,comp_code) not in ( select folio_no,comp_code from camps_master_upload )
SQL> /
COUNT(1)
----------
5535
Elapsed: 00:00:00.16
Execution Plan
----------------------------------------------------------
Plan hash value: 1887649868
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 375 (4)| 00:00:05 |
| 1 | SORT AGGREGATE | | 1 | 26 | | |
|* 2 | HASH JOIN RIGHT ANTI| | 43655 | 1108K| 375 (4)| 00:00:05 |
| 3 | TABLE ACCESS FULL | CAMPS_MASTER_UPLOAD | 38266 | 485K| 185 (3)| 00:00:03 |
| 4 | TABLE ACCESS FULL | CAMPS_UPLOAD | 43656 | 554K| 188 (4)| 00:00:03 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("FOLIO_NO"="FOLIO_NO" AND "COMP_CODE"="COMP_CODE")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1601 consistent gets
0 physical reads
0 redo size
412 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
NOT IN too slow...
May 17, 2007 - 8am Central time zone
Reviewer: Sagar from Mumbai , India.
Note that there are no indexes on both the tables.
Followup May 17, 2007 - 11am Central time zone:
all I've been trying to get you to realize is:
not in and in are as fundamentally different as night and day are.
not in, especially with regards to nullality, is hugely different.
Thanks Tom
May 17, 2007 - 12pm Central time zone
Reviewer: Sagar from Mumbai , India.
Thanks once again Tom,
Yes I got it now. Each day I learn something new here.Keep up the great work.I am a Tom Kyte fan ;).
I need this
October 23, 2007 - 7am Central time zone
Reviewer: rebeccah tukisi from south africa,midrand
good afternoon Tom
please help me to register on your site
so that i can email question too, i kind of new to the site and the oracle world
do you also help with the discoverer queries/reports
from rebeccah
Followup October 23, 2007 - 11am Central time zone:
nobody registers here.
when i have time, I permit new questions to be asked - it is first come, first serve and catch as catch can.

March 10, 2008 - 7am Central time zone
Reviewer: Sachin from India
Hi Tom ,
Look at below sql query ,
insert into rp_alert_cdr
select
/*+ INDEX(c, CDR_IDX2)*/
al_number_seq.currval,
c.*
from
cdr c
where
c.prof_num = lv_high_usage_data(i).plg_prof_num and
c.insert_dt >= lv_start_time and
c.insert_dt <= lv_end_time and
c.prof_num is not null and
c.toll_value > 0 and
c.b_intl_ind = 2 and
c.b_off_hook_ind = 1 and
c.orig_type <> 1 and
upper(c.DIALLED_DIGITS) Not Like '101%' and
c.term_type <> 0 and
nvl(c.cic,999999) not in ('*V','*U'); -- added new clause
My process is ,just inserting records from CDR table to RP_ALERT_CDR only those record which
condition is qulifiy the where clause. this was running as per need,, now i have added new
condition.
"and nvl(c.cic,999999) not in ('*V','*U')" -- added new clause
cdr table has CIC column varchar2(5) datatype. it can contain null record also .
in current time , we have only *V,*U and null record in CIC column, after adding condition with out
using NVL
it gives 0 record
and when we use query with nvl function it give desire
result.
Followup March 10, 2008 - 11am Central time zone:
"and nvl(c.cic,999999) not in ('*V','*U')" -- added new clause
cdr table has CIC column varchar2(5) datatype.
one utterly and completely wonders why you would nvl to a number then doesn't one? If you have a varchar2(5) - what could be the possible logic behind nvl'ing to 999,999???
but anyway - you have this column cic
it has in it:
a) *V
b) *U
c) a null value
Now, you want to keep all records where cic is not in *V, *U
Well, null is neither NOT IN, nor IN ('*v','*u', NULL is "not known"
so, where null not in (a,b) - is NOT TRUE (nor is it false)
where NOT(null not in (a,b)) - is NOT TRUE (nor is it false)
they are unknown.
so, currently, seems what you really meant to say was:
where cic is null
plain and simple

March 12, 2008 - 1am Central time zone
Reviewer: sachin from india
Thanks Tom ,
i was missing this.
are these 2 SQLs are same
March 14, 2008 - 6am Central time zone
Reviewer: Reene
Hi Tom
I have a SQL developed by some developer.
select ooh.order_number,count(ool.line_id)
from ont.oe_order_headers_all ooh,
ont.oe_order_lines_all ool,
wsh.wsh_delivery_Details wdd
where
exists
( select */ 'X'
from ont.oe_order_headers_all oh,
ont.oe_order_lines_all ol,
wsh.wsh_delivery_Details wdd,
ont.OE_SYSTEM_PARAMETERS_ALL osp
where oh.header_id=ol.header_id and
ol.line_id=wdd.source_line_id (+) and
( oh.ORDER_TYPE_ID = osp.ATTRIBUTE1 and osp.attribute1 is not null ) and
oh.org_id=3171 and
oh.order_source_id=1203 and
nvl(wdd.released_status,'x') <> 'C' and
nvl(wdd.released_status,'x') <> 'D' and
ol.shippable_flag='Y' and
ooh.order_number = oh.order_number
) and
ooh.order_source_id=1203 and
ooh.header_id=ool.header_id and
nvl(wdd.released_status,'x') <> 'D' and
ool.line_id=wdd.source_line_id (+) and
ool.shippable_flag='Y'
group by ooh.order_number
/
I tried to simplify it -as its tkprof shows a poor execution plan - i think that the above SQL can
be re-wrtitten as :
select oh.order_number,count(ol.line_id)
from ont.oe_order_headers_all oh,
ont.oe_order_lines_all ol,
wsh.wsh_delivery_Details wdd,
ont.OE_SYSTEM_PARAMETERS_ALL osp
where
oh.order_source_id=1203 and
oh.header_id=ol.header_id and
oh.ORDER_TYPE_ID = osp.ATTRIBUTE1 and osp.attribute1 is not null ) and
oh.org_id=3171 and
nvl(wdd.released_status,'x') <> 'D' and
nvl(wdd.released_status,'x') <> 'C' and
ol.line_id=wdd.source_line_id (+) and
ol.shippable_flag='Y'
group by oh.order_number
/
is it correct .
Thanks
Followup March 15, 2008 - 9am Central time zone:
run it - and observe the two totally different answers you will likely get.
no, definitely not the same.
typo in above post
March 14, 2008 - 6am Central time zone
Reviewer: Reene
select ooh.order_number,count(ool.line_id)
from ont.oe_order_headers_all ooh,
ont.oe_order_lines_all ool,
wsh.wsh_delivery_Details wdd
where
exists
( select 'X'
from ont.oe_order_headers_all oh,
ont.oe_order_lines_all ol,
wsh.wsh_delivery_Details wdd,
ont.OE_SYSTEM_PARAMETERS_ALL osp
where oh.header_id=ol.header_id and
ol.line_id=wdd.source_line_id (+) and
( oh.ORDER_TYPE_ID = osp.ATTRIBUTE1 and osp.attribute1 is not null ) and
oh.org_id=3171 and
oh.order_source_id=1203 and
nvl(wdd.released_status,'x') <> 'C' and
nvl(wdd.released_status,'x') <> 'D' and
ol.shippable_flag='Y' and
ooh.order_number = oh.order_number
) and
ooh.order_source_id=1203 and
ooh.header_id=ool.header_id and
nvl(wdd.released_status,'x') <> 'D' and
ool.line_id=wdd.source_line_id (+) and
ool.shippable_flag='Y'
group by ooh.order_number
/
I tried to simplify it -as its tkprof shows a poor execution plan - i think that the above SQL can
be re-wrtitten as :
select oh.order_number,count(ol.line_id)
from ont.oe_order_headers_all oh,
ont.oe_order_lines_all ol,
wsh.wsh_delivery_Details wdd,
ont.OE_SYSTEM_PARAMETERS_ALL osp
where
oh.order_source_id=1203 and
oh.header_id=ol.header_id and
oh.ORDER_TYPE_ID = osp.ATTRIBUTE1 and osp.attribute1 is not null ) and
oh.org_id=3171 and
nvl(wdd.released_status,'x') <> 'D' and
nvl(wdd.released_status,'x') <> 'C' and
ol.line_id=wdd.source_line_id (+) and
ol.shippable_flag='Y'
group by oh.order_number
/
is it correct .
Thanks
correct
April 7, 2008 - 4am Central time zone
Reviewer: Reene
Thanks Tom,
even though the number of records fethced were same,but the data was different. that is
order_numbers were same but the counts were different.thanks for poiniting it out.
understanding them a bit better now.
thanks

July 23, 2008 - 12pm Central time zone
Reviewer: A reader from NYC, USA
Tom,
After reading this thread, i am running the IN & EXIST query for emp and dept table. Both the query shows the same execution plan. But it should be different. For IN query,optimizer should use index scan on emp table. For exists query, optimizer should use FTS on emp table. Please correct me if i am wrong..
SQL> select count(*) from emp;
COUNT(*)
----------
458752
SQL> select count(*) from dept;
COUNT(*)
----------
2
SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SQLADMIN',TABNAME => 'EMP',ESTIMAT
E_PERCENT => 10, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', CASCADE => TRUE);
PL/SQL procedure successfully completed.
SQL>
SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SQLADMIN',TABNAME => 'DEPT',ESTIMA
TE_PERCENT => 10, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', CASCADE => TRUE);
PL/SQL procedure successfully completed.
SQL>
SQL> select count(ename) from emp where deptno in(select deptno from dept);
COUNT(ENAME)
------------
100
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=335 Card=1 Bytes=13)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=335 Card=228355 Bytes=2968615)
3 2 TABLE ACCESS (FULL) OF 'EMP' (Cost=335 Card=456710 Byt
es=4567100)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0010206' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5485 consistent gets
0 physical reads
0 redo size
307 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> SELECT COUNT(ENAME) from emp where exists (select null from dept where dept.deptno =
emp.deptno);
COUNT(ENAME)
------------
100
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=335 Card=1 Bytes=13)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=335 Card=228355 Bytes=2968615)
3 2 TABLE ACCESS (FULL) OF 'EMP' (Cost=335 Card=456710 Byt
es=4567100)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0010206' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5485 consistent gets
0 physical reads
0 redo size
307 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Query Tuning
November 10, 2008 - 3am Central time zone
Reviewer: Thakur Manoj from India
Dear Sir
This query is taking 3 seconds to execute it
select distinct
atcf.base_entity_id
, be.entity_name_full As Cutomer_Name
, mbf.bank_facility_name AS Account_No
, atcf.proposed_limit AS Amount_Sanction
, atcf.os_balance AS Amount_Outstand
, rfdt.filter_date
, bd.mst_borrower_state_id
from aud_trn_c3_fac atcf
, rvw_cas_getchkfacdt rfdt
, base_entity be
, rvw_cas_stmt_borr_dtls bd
, mst_bank_facility mbf
Where atcf.trans_datetime = rfdt.facility_dt and
atcf.base_entity_id = rfdt.base_entity_id and
atcf.base_entity_id = be.id and
atcf.id = rfdt.trn_c3_fac_id and
rfdt.base_entity_id = bd.base_entity_id and
rfdt.sub_process_id = bd.sub_process_id and
rfdt.filter_date = bd.filter_date and
rfdt.base_entity_id = be.id and
atcf.mst_bank_facility_id = mbf.id and
mbf.mst_facilitytype_id = 1 and
(atcf.base_entity_id, atcf.id) not in
(
select distinct tss.base_entity_id,
tss.trn_prop_facility_id
from
rvw_cas_stmt_sanc_sec_dtls tss)
Please check the explain Plan
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=292 Card=1 Bytes=153
)
1 0 SORT (UNIQUE) (Cost=228 Card=1 Bytes=153)
2 1 FILTER
3 2 MERGE JOIN (Cost=226 Card=1 Bytes=153)
4 3 SORT (JOIN) (Cost=64 Card=1 Bytes=105)
5 4 MERGE JOIN (Cost=63 Card=1 Bytes=105)
6 5 SORT (JOIN) (Cost=14 Card=48 Bytes=3696)
7 6 HASH JOIN (Cost=12 Card=48 Bytes=3696)
8 7 HASH JOIN (Cost=7 Card=48 Bytes=2352)
9 8 TABLE ACCESS (BY INDEX ROWID) OF 'MST_BANK
_FACILITY' (Cost=2 Card=29 Bytes=754)
10 9 INDEX (RANGE SCAN) OF 'IDX_MST_BANK_FACI
LITY' (NON-UNIQUE) (Cost=1 Card=29)
11 8 TABLE ACCESS (FULL) OF 'AUD_TRN_C3_FAC' (C
ost=4 Card=48 Bytes=1104)
12 7 TABLE ACCESS (FULL) OF 'BASE_ENTITY' (Cost=4
Card=1567 Bytes=43876)
13 5 SORT (JOIN) (Cost=49 Card=1 Bytes=28)
14 13 VIEW OF 'RVW_CAS_GETCHKFACDT' (Cost=48 Card=1
Bytes=28)
15 14 SORT (GROUP BY) (Cost=48 Card=1 Bytes=63)
16 15 MERGE JOIN (Cost=46 Card=1 Bytes=63)
17 16 SORT (JOIN) (Cost=10 Card=1 Bytes=26)
18 17 HASH JOIN (Cost=8 Card=1 Bytes=26)
19 18 TABLE ACCESS (FULL) OF 'TRN_C3_FAC'
(Cost=3 Card=328 Bytes=2952)
20 18 TABLE ACCESS (FULL) OF 'AUD_TRN_C3_F
AC' (Cost=4 Card=951 Bytes=16167)
21 16 FILTER
22 21 SORT (JOIN)
23 22 VIEW (Cost=44 Card=97 Bytes=3589)
24 23 UNION-ALL
25 24 SORT (UNIQUE) (Cost=30 Card=89 B
ytes=2492)
26 25 FILTER
27 26 TABLE ACCESS (FULL) OF 'AUD_
TRN_PROC_STATE' (Cost=28 Card=89 Bytes=2492)
28 27 SORT (AGGREGATE)
29 28 TABLE ACCESS (FULL) OF '
MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
30 26 TABLE ACCESS (FULL) OF 'MSTW
F_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
31 24 SORT (UNIQUE) (Cost=6 Card=8 Byt
es=176)
32 31 FILTER
33 32 TABLE ACCESS (FULL) OF 'AUD_
TRN_DATA_STATE' (Cost=4 Card=8 Bytes=176)
34 33 SORT (AGGREGATE)
35 34 TABLE ACCESS (FULL) OF '
MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
36 32 TABLE ACCESS (FULL) OF 'MSTW
F_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
37 3 SORT (JOIN) (Cost=162 Card=3 Bytes=144)
38 37 VIEW OF 'RVW_CAS_STMT_BORR_DTLS' (Cost=162 Card=3
Bytes=144)
39 38 UNION-ALL
40 39 SORT (UNIQUE) (Cost=56 Card=1 Bytes=51)
41 40 MERGE JOIN (Cost=54 Card=1 Bytes=51)
42 41 SORT (JOIN) (Cost=49 Card=1 Bytes=24)
43 42 VIEW (Cost=47 Card=1 Bytes=24)
44 43 SORT (GROUP BY) (Cost=47 Card=1 Bytes=
45)
45 44 MERGE JOIN (Cost=44 Card=1 Bytes=45)
46 45 SORT (JOIN) (Cost=8 Card=4 Bytes=7
6)
47 46 HASH JOIN (Cost=6 Card=4 Bytes=7
6)
48 47 TABLE ACCESS (FULL) OF 'TRN_C3
_CORPORATE' (Cost=2 Card=110 Bytes=660)
49 47 TABLE ACCESS (FULL) OF 'AUD_TR
N_C3_CORPORATE' (Cost=3 Card=358 Bytes=4654)
50 45 FILTER
51 50 SORT (JOIN)
52 51 VIEW (Cost=44 Card=97 Bytes=25
22)
53 52 UNION-ALL
54 53 SORT (UNIQUE) (Cost=30 Car
d=89 Bytes=2492)
55 54 FILTER
56 55 TABLE ACCESS (FULL) OF
'AUD_TRN_PROC_STATE' (Cost=28 Card=89 Bytes=2492)
57 56 SORT (AGGREGATE)
58 57 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
59 55 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
60 53 SORT (UNIQUE) (Cost=6 Card
=8 Bytes=176)
61 60 FILTER
62 61 TABLE ACCESS (FULL) OF
'AUD_TRN_DATA_STATE' (Cost=4 Card=8 Bytes=176)
63 62 SORT (AGGREGATE)
64 63 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
65 61 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
66 41 SORT (JOIN) (Cost=6 Card=358 Bytes=9666)
67 66 TABLE ACCESS (FULL) OF 'AUD_TRN_C3_CORPO
RATE' (Cost=3 Card=358 Bytes=9666)
68 39 SORT (UNIQUE) (Cost=53 Card=1 Bytes=51)
69 68 MERGE JOIN (Cost=51 Card=1 Bytes=51)
70 69 SORT (JOIN) (Cost=48 Card=3 Bytes=72)
71 70 VIEW (Cost=46 Card=3 Bytes=72)
72 71 SORT (GROUP BY) (Cost=46 Card=3 Bytes=
135)
73 72 MERGE JOIN (Cost=43 Card=49 Bytes=22
05)
74 73 SORT (JOIN) (Cost=7 Card=30 Bytes=
570)
75 74 HASH JOIN (Cost=5 Card=30 Bytes=
570)
76 75 TABLE ACCESS (FULL) OF 'TRN_C3
_BANK' (Cost=2 Card=4 Bytes=24)
77 75 TABLE ACCESS (FULL) OF 'AUD_TR
N_C3_BANK' (Cost=2 Card=68 Bytes=884)
78 73 FILTER
79 78 SORT (JOIN)
80 79 VIEW (Cost=44 Card=97 Bytes=25
22)
81 80 UNION-ALL
82 81 SORT (UNIQUE) (Cost=30 Car
d=89 Bytes=2492)
83 82 FILTER
84 83 TABLE ACCESS (FULL) OF
'AUD_TRN_PROC_STATE' (Cost=28 Card=89 Bytes=2492)
85 84 SORT (AGGREGATE)
86 85 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
87 83 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
88 81 SORT (UNIQUE) (Cost=6 Card
=8 Bytes=176)
89 88 FILTER
90 89 TABLE ACCESS (FULL) OF
'AUD_TRN_DATA_STATE' (Cost=4 Card=8 Bytes=176)
91 90 SORT (AGGREGATE)
92 91 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
93 89 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
94 69 SORT (JOIN) (Cost=4 Card=68 Bytes=1836)
95 94 TABLE ACCESS (FULL) OF 'AUD_TRN_C3_BANK'
(Cost=2 Card=68 Bytes=1836)
96 39 SORT (UNIQUE) (Cost=53 Card=1 Bytes=45)
97 96 MERGE JOIN (Cost=51 Card=1 Bytes=45)
98 97 SORT (JOIN) (Cost=48 Card=3 Bytes=72)
99 98 VIEW (Cost=46 Card=3 Bytes=72)
100 99 SORT (GROUP BY) (Cost=46 Card=3 Bytes=
135)
101 100 MERGE JOIN (Cost=43 Card=29 Bytes=13
05)
102 101 SORT (JOIN) (Cost=7 Card=18 Bytes=
342)
103 102 HASH JOIN (Cost=5 Card=18 Bytes=
342)
104 103 TABLE ACCESS (FULL) OF 'TRN_C3
_SOVEREIGN' (Cost=2 Card=3 Bytes=18)
105 103 TABLE ACCESS (FULL) OF 'AUD_TR
N_C3_SOVEREIGN' (Cost=2 Card=53 Bytes=689)
106 101 FILTER
107 106 SORT (JOIN)
108 107 VIEW (Cost=44 Card=97 Bytes=25
22)
109 108 UNION-ALL
110 109 SORT (UNIQUE) (Cost=30 Car
d=89 Bytes=2492)
111 110 FILTER
112 111 TABLE ACCESS (FULL) OF
'AUD_TRN_PROC_STATE' (Cost=28 Card=89 Bytes=2492)
113 112 SORT (AGGREGATE)
114 113 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
115 111 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
116 109 SORT (UNIQUE) (Cost=6 Card
=8 Bytes=176)
117 116 FILTER
118 117 TABLE ACCESS (FULL) OF
'AUD_TRN_DATA_STATE' (Cost=4 Card=8 Bytes=176)
119 118 SORT (AGGREGATE)
120 119 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
121 117 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
122 97 SORT (JOIN) (Cost=4 Card=53 Bytes=1113)
123 122 TABLE ACCESS (FULL) OF 'AUD_TRN_C3_SOVER
EIGN' (Cost=2 Card=53 Bytes=1113)
124 2 MERGE JOIN (Cost=64 Card=1 Bytes=69)
125 124 SORT (JOIN) (Cost=60 Card=1 Bytes=50)
126 125 VIEW (Cost=59 Card=1 Bytes=50)
127 126 SORT (GROUP BY) (Cost=59 Card=1 Bytes=47)
128 127 MERGE JOIN (Cost=56 Card=1 Bytes=47)
129 128 SORT (JOIN) (Cost=7 Card=1 Bytes=28)
130 129 HASH JOIN (Cost=5 Card=1 Bytes=28)
131 130 TABLE ACCESS (FULL) OF 'TRN_SANC_SECURIT
Y' (Cost=2 Card=270 Bytes=2430)
132 130 TABLE ACCESS (FULL) OF 'AUD_TRN_SANC_SEC
URITY' (Cost=2 Card=428 Bytes=8132)
133 128 FILTER
134 133 SORT (JOIN)
135 134 VIEW OF 'RVW_CAS_GETCHKFACDT' (Cost=48 C
ard=1 Bytes=19)
136 135 SORT (GROUP BY) (Cost=48 Card=1 Bytes=
63)
137 136 MERGE JOIN (Cost=46 Card=1 Bytes=63)
138 137 SORT (JOIN) (Cost=10 Card=1 Bytes=
26)
139 138 HASH JOIN (Cost=8 Card=1 Bytes=2
6)
140 139 TABLE ACCESS (FULL) OF 'TRN_C3
_FAC' (Cost=3 Card=328 Bytes=2952)
141 139 TABLE ACCESS (FULL) OF 'AUD_TR
N_C3_FAC' (Cost=4 Card=951 Bytes=16167)
142 137 FILTER
143 142 SORT (JOIN)
144 143 VIEW (Cost=44 Card=97 Bytes=35
89)
145 144 UNION-ALL
146 145 SORT (UNIQUE) (Cost=30 Car
d=89 Bytes=2492)
147 146 FILTER
148 147 TABLE ACCESS (FULL) OF
'AUD_TRN_PROC_STATE' (Cost=28 Card=89 Bytes=2492)
149 148 SORT (AGGREGATE)
150 149 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
151 147 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
152 145 SORT (UNIQUE) (Cost=6 Card
=8 Bytes=176)
153 152 FILTER
154 153 TABLE ACCESS (FULL) OF
'AUD_TRN_DATA_STATE' (Cost=4 Card=8 Bytes=176)
155 154 SORT (AGGREGATE)
156 155 TABLE ACCESS (FULL
) OF 'MSTWF_PROCESS_ITEM' (Cost=2 Card=1 Bytes=16)
157 153 TABLE ACCESS (FULL) OF
'MSTWF_MKR_CHKR_STATE' (Cost=2 Card=1 Bytes=18)
158 124 SORT (JOIN) (Cost=4 Card=1 Bytes=19)
159 158 TABLE ACCESS (FULL) OF 'AUD_TRN_SANC_SECURITY' (Co
st=2 Card=1 Bytes=19)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
35867 consistent gets
22459 physical reads
0 redo size
3309 bytes sent via SQL*Net to client
543 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
964 sorts (memory)
0 sorts (disk)
68 rows processed
Please help to tune the below query?
Regards
|