Excellent Response!!!
Rafique Awan, August 28, 2004 - 6:22 pm UTC
You are great! Thanks a lot for your excellent response. I liked the approach of using scalar type instead of handling string and also it is also easy to maintain in the long run.
I have just some questions regarding scalar subqueries that I want to clarify.
1) Does Oracle execute scalary subquery for every row? can you please through some lights on the internal implementation of this feature?
2) In case if we want to search based on derived column from scalar subqueries, is ther any way to optimize that?
BTW, I have both of your books EXPERT ONE ON ONE and EFFECTIVE ORACLE BY DESIGN, I love to read both books and I have learnt a lot from these and I also recomnded these books to many other friends.
August 28, 2004 - 7:04 pm UTC
Index on scalar type columns
A reader, August 28, 2004 - 11:00 pm UTC
<quote>
create or replace view v as
ops$tkyte@ORA9IR2> select object_id,
2 x.data.c1 c1,
3 x.data.c2 c2,
4 x.data.c3 c3,
5 x.data.c4 c4
6 from (
7 select object_id, (select myScalarType(c1,c2,c3,c4)
8 from t_c1
9 where object_id = t.object_id ) data
10 from t
11 ) x
12 /
<quote>
I think what he means is if you create a view on your query like above and do a query like
select * from v where c1 = ...
it will not use the index on t_c1.c1 because of the way it is being wrapped in the scalar type and scalar subquery and stuff, no?
Thanks
August 29, 2004 - 11:17 am UTC
Ok, then you would not be needing a scalar subquery or outer join *AT ALL*
if the concept is
a) outer join from T to T1
b) put a predicate on T1.column
you've made a grievous error..
You would "just join" at that point.
select ...
from t, t1
where join...
and t1.c = ....
because if you have:
where t.pk = t1.k (+)
and t1.c <op> <value>
and <op> is anything other than "is null" -- what was the point of the outer join other than to waste CPU cycles and reduce the number of plans available?
for you see, if the row in t1 is "made up" due to the outer join -- then all of the columns of t1 for that row will be null and null <op> <value> is never TRUE nor FALSE but "not known" -- it would be removed from the result set.
So, I would never use this technique if I wanted to predicate on the columns from T1, it would quite simply "not make any sense"
Thanks Reader!
Rafique Awan, August 28, 2004 - 11:41 pm UTC
Reader you are right. That is what I mean to ask.
August 29, 2004 - 11:42 am UTC
see above, if you predicate on it, you would never have outer joined to it in the first place
Thanks!!!
Rafique Awan, August 29, 2004 - 11:44 am UTC
I agree with you that it doesn't make any sense to use outer join/scalar subqueries if query is based on the column from child table but I was thinking to create just one generic view which will satisfy all the requirements that I have. But now I realized that I should not use this view if I need to query based on the column from child table,I should use seprate set of queries in this specific requirement with simple join from parent to child table as you said.
Thanks again.
August 29, 2004 - 12:03 pm UTC
"generic" = "poor performance"
this generic view cannot satisfy all requirements. proof is simple. Here is the list of requirements you have:
a) ...
b) ...
c) performance
d) ....
C) cannot be satisfied by anything generic ;)
I hate "generic", everytime I see "generic" I see "performance problem", I see "maintainance issue", I see "confusion in the future".
A reader, August 29, 2004 - 10:24 pm UTC
"for you see, if the row in t1 is "made up" due to the outer join -- then all of the columns of t1 for that row will be null and null <op> <value> is never TRUE
nor FALSE but "not known" -- it would be removed from the result set"
Well, that is true if the row is indeed made up. But the definition of outer join is a optional join. I want to predicate/filter on t1.col when the row is found, not made up.
Whats wrong with that? I cant make it a exact join because then I would lose some rows from "t"!
<quote>
I hate "generic", everytime I see "generic" I see "performance problem", I see "maintainance issue", I see "confusion in the future"
<quote>
I would mostly agree, but then you come to a point when 10 views on the same underlying tables are 99.9% similar! Each app, user, adhoc query needs a teeny tiny twist on the same kind of query.
In this case, creating generic view makes sense. Make the join "generic", expose different columns for different users and let each app use what they need.
You dont agree?
August 30, 2004 - 8:13 am UTC
yes you can. the premise is:
an outer join from T to T1
with a predicate on T1.c
the outer join is from T to T1, if it were the other way -- no problem
from t, t1 where t1.k = t.k(+) and t1.c <op> <value> makes sense
from t, t1 where t1.k(+) = t.k and t1.c <op> <value> does NOT make sense.
No, I obviously do not agree, I stated as much above?
For "a reader"
Muhammad Riaz Shahid, August 30, 2004 - 4:17 am UTC
Consider the query:
select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1(+)
So here you are doing some outer join. You simply told optimizer "Hay ! I am not sure that i will get a row in t1 based on col1". If a row is not found in t1 w.r.t then all the columns of table t1 (in select) will become null....right ?
Now:
select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1(+)
and col2<=100 --say you add this condition
Now since for each row of t2, if the same was not found in t1, then all the columns of t1 should be null (TURE ???). Then what is the purpose of this condition:
and col2<=100
that simply implies that your are expecting col2 to be not null (implies that you are telling here the optimizer "Hay ! I will definitely get a row from t1 for condition t2.pk=t1.col1". So you misguided the optimizer....it should have simply been:
select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1
and col2<=100
Remeber the OUTER JOINS are always expensive....we should use them only when they make sense...
Here is the proof:
SQL> create table tab1(a number, b number);
Table created.
Elapsed: 00:00:00.25
SQL> insert into tab1 select rownum,rownum+.25 from user_objects
10 rows created.
Elapsed: 00:00:00.87
SQL> commit;
Commit complete.
Elapsed: 00:00:00.10
SQL> ed
Wrote file afiedt.buf
1* create table tab2 as select decode(mod(rownum,2),0,rownum) d, rownum
SQL> /
Table created.
Elapsed: 00:00:00.18
SQL> select * FRom tab1;
A B
========== ==========
1 1.25
2 2.25
3 3.25
4 4.25
5 5.25
6 6.25
7 7.25
8 8.25
9 9.25
10 10.25
10 rows selected.
Elapsed: 00:00:00.94
SQL> select * From tab2;
D E
========== ==========
1001
2 1002
1003
4 1004
1005
6 1006
1007
8 1008
1009
10 1010
10 rows selected.
Elapsed: 00:00:00.93
SQL> select a,b,d,e from tab1, tab2
2 where a=d(+)
3
SQL> /
A B D E
========== ========== ========== ==========
1 1.25
2 2.25 2 1002
3 3.25
4 4.25 4 1004
5 5.25
6 6.25 6 1006
7 7.25
8 8.25 8 1008
9 9.25
10 10.25 10 1010
10 rows selected.
Elapsed: 00:00:00.94
SQL> ed
Wrote file afiedt.buf
1 select a,b,d,e from tab1, tab2
2 where a=d(+)
3* and e>=1006
SQL> /
A B D E
========== ========== ========== ==========
6 6.25 6 1006
8 8.25 8 1008
10 10.25 10 1010
Remaining part
Muhammad Riaz Shahid, August 30, 2004 - 4:19 am UTC
Here is the last query i forgot to paste.
SQL> ed
Wrote file afiedt.buf
1 select a,b,d,e from tab1, tab2
2 where a=d
3* and e>=1006
SQL> /
A B D E
========== ========== ========== ==========
6 6.25 6 1006
8 8.25 8 1008
10 10.25 10 1010
and t1.c <op> <value>
Pratap, August 30, 2004 - 5:09 am UTC
About your comment on outer join with condition - and t1.c <op> <value>
Consider a requirement -
Tables t1 and t2
1. t1 has rows and show all of them
2. t2 may or may not have any rows
3. Show data from t2 only if value of a column is 'something'
create table t1 ( a number );
create table t2 ( a number, b varchar2(100) )
insert into t1 values ( 1 );
insert into t2 values ( 1, 'something' );
insert into t2 values ( 1, 'something else' );
select *
from t1, t2
where t1.a = t2.a (+)
and t2.b (+) = 'something';
Pratap
August 30, 2004 - 8:47 am UTC
that is an outer join period. not at all what we were discussing which is:
t1.a = t2.a(+)
and
t2.b = 'something'
A reader, August 30, 2004 - 8:33 am UTC
"Now since for each row of t2, if the same was not found in t1, then all the columns of t1 should be null (TURE ???). Then what is the purpose of this condition:
and col2<=100"
But this is where I dont understand.
Both you and the reader Shahid keep pointing out the case where the row is NOT found in the outerjoined table (in bold above).
I am interested in the case where the row IS found in the table and I want to filter in that case.
In that case
from t,t1 where t.pk=t1.pk(+) and t1.col='something' would make perfect sense, wouldnt it?
Also, how would you respsond to the latest followup from "Pratap", where I am interested in filtering on the some other column in t1, not the one joined with t
from t,t1
where t.pk=t1.pk(+)
and t1.other(+)='something'
This is a fairly common business requirement, one that you just cannot trivialize saying 'bah, bad design'.
August 30, 2004 - 9:00 am UTC
<quote>
from t,t1 where t.pk=t1.pk(+) and t1.col='something' would make perfect sense,
wouldnt it?
</quote>
No, not at all.
Fact: if t.pk = t1.pk(+) "makes up a t1 row" (eg: the outer join did something), the predicate t1.col = 'something' WOULD ALWAYS REMOVE IT, meaning the (+) just does *nothing*. Well, not nothing, it removes from consideration certain opportunities
when you see:
where t.pk = t1.pk(+) and t1.c = 'something'
you know the person that coded it didn't understand what was going on. the only way that should be coded is:
where t.pk = t1.pk and t1.c = 'something'
I didn't say that last one was "bah bad design", I said "nah, that is not what we were taking about"
that is an outer join from t to t1. pretend that 'something' comes from t (was a column in T). that query is totally valid and reasonable. That query just is not what we were discussing.
A reader, August 30, 2004 - 8:56 am UTC
t1.a = t2.a(+)
and
t2.b(+) = 'something'
Sorry for the confusion, but the above is what I thought we were discussing, an "outer join, period"
Getting back to the question about comparing scalar subqueries (with or without scalar types), in the above query, the CBO will use the index on t2.b, if it makes sense.
But in the myScalarType example above, which is also effectively an outer join, the CBO will not use the index on t_c1.c1, right? Is there any way to have the best of both worlds?
Thanks
August 30, 2004 - 9:17 am UTC
only if the predicate itself it pushed down into the scalar subquery (making it behave just like "t2.b(+) = 'something'"
if you wanted to just have t2.b = 'something', we want to go back to a JOIN.
Pratap, August 30, 2004 - 9:27 am UTC
Tom is right in saying that
where t.pk = t1.pk(+) and t1.c = 'something'
is as good as
where t.pk = t1.pk and t1.c = 'something'
But probably those who code like this do have a requirement on filtering t1.c = 'something', BUT they don't know that -
1. t1.c = 'something' is going to return false because the value of t1.c may be null AND
2. One can also put an outer join like t1.c (+) = 'something', when comparing a literal or variable.
From sql reference -
If the WHERE clause contains a condition that compares a column from table B with a constant, then the (+) operator must be applied to the column so that Oracle returns the rows from table A for which it has generated nulls for this column. Otherwise Oracle returns only the results of a simple join.
August 30, 2004 - 9:47 am UTC
Or,
select *
from t, ( select * from t1 where c = 'something' ) t1
where t.k = t1.k(+)
as well....
it says
"give me every row in T"
and if there is a mate in T2 such that c = 'something', show me that as well.
A reader, August 30, 2004 - 9:59 am UTC
"select *
from t, ( select * from t1 where c = 'something' ) t1
where t.k = t1.k(+)"
That will work only if 'something' is well-known and part of the requirements for this view.
It wont work if I need to 'expose' 'c' in the view and let the app filter on what ever value they want for 'c'
Thanks
August 30, 2004 - 10:34 am UTC
I'm not even thinking "view" at that point.
I'm thinking "query"
View vs query
A reader, August 30, 2004 - 5:10 pm UTC
"I'm not even thinking "view" at that point.
I'm thinking "query" "
Well, thats the easiest thing then. Taking one specific query, analysing it, reducing its LIOs, is very easy in the grand scheme of things, once you get to know Oracle well.
Its when you expose a commonly-uses query/business question as a VIEW that things start to get hairy. Users/apps start to put predicates on columns simply not designed to do so!
Thanks
To "A reader "
Muhammad Riaz Shahid, September 21, 2004 - 8:00 am UTC
Please search this site for "parameterized views" and you will know that the query provided by TOM can also be used in a view....where you can also use different values for "something" on thr fly...
outer join vs scalar subquery
Arya, April 19, 2005 - 10:52 am UTC
Hi Tom,
I have a query using outer join users are complaining which is slow so I am working to make it faster and in that process I have rewritten this query as a scalar subquery which you have explained in your book. To my surprise scalar subquery is doing more logical reads and taking more time than outerjoin.
I request your help in pointing me where am I doing mistake?
Here is the original query which is doing less logical reads.
SQL> SELECT /*+ FIRST_ROWS */ cp.cp_id
2 ,Initcap(nvl(trim(both ' ' from full_name), cp.last_name || ', ' || cp.first_name)) FULL_NAME
3 ,cp.last_name
4 ,cp.first_name
5 ,cp.middle_name
6 ,org.NAME ORG_NAME
7 ,org.org_desc
8 ,orl.NAME ORG_ROLE
9 ,orl.role_Desc
10 ,NULL role_name
11 ,cv.Address_Line_1 Address_Line_1
12 ,cv.address_line_2 Address_Line_2
13 ,(case when (trim(both ' ' from address_line_1) is null
14 and trim(both ' ' from address_line_2) is not null)
15 then trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
16 when (trim(both ' ' from address_line_2) is null
17 and trim(both ' ' from address_line_1) is not null)
18 then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual))
19 when (trim(both ' ' from address_line_1) is not null
20 and trim(both ' ' from address_line_2) is not null)
21 then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual)) || ', ' || trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
22 else null end) FULL_ADDRESS
23 ,Initcap(cv.city) city
24 ,cv.state
25 ,cv.zip_cd
26 ,(case when email like '%@%' then email else null END) email
27 ,cv.phone_nbr
28 ,cv.fax_nbr
29 ,cv.cv_comment
30 ,org.org_id
31 ,cv.cv_id
32 ,cp.oid_user_name
33 ,NULL brl_id
34 ,NULL cpr_id
35 FROM contact_persons cp
36 ,contact_vehicles cv
37 ,organizations org
38 ,organization_roles orl
39 ,(select distinct cp_id, org_id from contact_person_roles) cpr
40 WHERE cp.cp_id = cv.cp_id(+)
41 AND cp.cp_id = cpr.cp_id(+)
42 AND cpr.org_id = org.org_id(+)
43 AND org.orl_id = orl.orl_id(+)
44 AND ROWNUM <20
45 /
19 rows selected.
Elapsed: 00:00:01.06
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=879812 Card=19 Bytes=7201)
1 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
2 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
4 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
5 0 COUNT (STOPKEY)
6 5 NESTED LOOPS (OUTER) (Cost=879812 Card=219630 Bytes=83239770)
7 6 NESTED LOOPS (OUTER) (Cost=440552 Card=219630 Bytes=64351590)
8 7 NESTED LOOPS (OUTER) (Cost=220922 Card=219630 Bytes=20425590)
9 8 HASH JOIN (OUTER) (Cost=1292 Card=219630 Bytes=12079650)
10 9 TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=219630 Bytes=6369270)
11 9 VIEW (Cost=750 Card=152051 Bytes=3953326)
12 11 SORT (UNIQUE) (Cost=750 Card=152051 Bytes=1216408)
13 12 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218110 Bytes=1744880)
14 8 TABLE ACCESS (BY INDEX ROWID) OF 'ORGANIZATIONS' (Cost=1 Card=1 Bytes=38)
15 14 INDEX (UNIQUE SCAN) OF 'ORG_PK' (UNIQUE)
16 7 TABLE ACCESS (BY INDEX ROWID) OF 'ORGANIZATION_ROLES' (Cost=1 Card=1 Bytes=200)
17 16 INDEX (UNIQUE SCAN) OF 'ORL_PK' (UNIQUE)
18 6 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_VEHICLES' (Cost=2 Card=1 Bytes=86)
19 18 INDEX (RANGE SCAN) OF 'CV_CP_FK_I' (NON-UNIQUE) (Cost=1 Card=1)
Statistics
----------------------------------------------------------
327 recursive calls
0 db block gets
4923 consistent gets
2029 physical reads
0 redo size
5675 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
19 rows processed
I have rewritten above query as scalar subquery and here is the autotrace for the same which is doing more logical reads compared to outer join.
SELECT cp_id,
nvl(full_name, last_name || ', ' || first_name) FULL_NAME,
last_name,
first_name,
middle_name,
oid_user_name,
(CASE WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,101,100)) FROM DUAL))
WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,101,100)) IS NULL AND trim(both ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL)
THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL))
WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL)) || ', ' || trim(both ' ' FROM (SELECT elease_util.initcap2(substr(cv_data,101,100)) FROM DUAL))
ELSE NULL END) FULL_ADDRESS
,TRIM(SUBSTR(cv_data,1,100)) address_line_1
,TRIM(SUBSTR(cv_data,101,100)) address_line_2
,TRIM(SUBSTR(cv_data,201,100)) city
,TRIM(SUBSTR(cv_data,301,2)) state
,TRIM(SUBSTR(cv_data,303,12)) zip_cd
,CASE WHEN TRIM(SUBSTR(cv_data,315,100)) LIKE '%@%' THEN trim(SUBSTR(cv_data,315,100)) ELSE NULL END EMAIL
,TRIM(SUBSTR(cv_data,415,32)) phone_nbr
,TRIM(SUBSTR(cv_data,447,20)) fax_nbr
--,trim(substr(cv_data,367,200)) comment
,TO_NUMBER( TRIM(SUBSTR(cv_data,467,10))) cv_id
,NULL brl_id
,NULL cpr_id
,NULL role_name
FROM (
SELECT cp.cp_id
,cp.full_name
,cp.last_name
,cp.first_name
,cp.middle_name
,cp.oid_user_name
,(SELECT rpad(nvl(cv.Address_Line_1,' '),100,' ') ||
rpad(nvl(cv.address_line_2,' '),100,' ') ||
rpad(nvl(cv.city,' '),100,' ') ||
rpad(nvl(cv.state,' '),2,' ') ||
rpad(nvl(cv.zip_cd,' '),12,' ') ||
rpad(nvl(cv.email,' '),100,' ') ||
rpad(nvl(cv.phone_nbr,' '),32,' ') ||
rpad(nvl(cv.fax_nbr,' '),20,' ') ||
--rpad(nvl(cv.cv_comment,' '),200,' ') ||
to_char(cv.cv_id,'s0000000000')
FROM contact_vehicles cv
WHERE cv.cp_id = cp.cp_id) cv_data
,( SELECT org.NAME ||
org.org_desc ||
org.org_id ||
orl.name ||
orl.role_desc
FROM organizations org
,organization_roles orl
WHERE org.org_id = cpr.org_id
AND org.org_id = orl.orl_id) ORG_DATA
FROM contact_persons cp
,(SELECT DISTINCT cp_id, org_id
FROM contact_person_roles) cpr
WHERE ROWNUM < 20
)
19 rows selected.
Elapsed: 00:00:01.07
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=69113167 Card=19 Bytes=9310)
1 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
2 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
4 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
5 0 VIEW (Cost=69113167 Card=19 Bytes=9310)
6 5 COUNT (STOPKEY)
7 6 MERGE JOIN (CARTESIAN) (Cost=69113167 Card=47680974000 Bytes=2002600908000)
8 7 VIEW (Cost=827 Card=218020 Bytes=2834260)
9 8 SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
10 9 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218020 Bytes=1744160)
11 7 BUFFER (SORT) (Cost=69113167 Card=218700 Bytes=6342300)
12 11 TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=218700 Bytes=6342300)
Statistics
----------------------------------------------------------
2530 recursive calls
2 db block gets
11516 consistent gets
1829 physical reads
0 redo size
4481 bytes sent via SQL*Net to client
1391 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
5 sorts (memory)
1 sorts (disk)
19 rows processed
SQL>
Thanking you for all your help so far.
Arya
April 19, 2005 - 12:09 pm UTC
it (scalar subqueries) are a technique, not a rule of thumb (because ROT is ROT and ROT is bad).
never said they would always be better, they apply to a certain class of problems.
seems funny to want just the first 20 rows...
seems you MISSED a join?
FROM contact_persons cp
,(SELECT DISTINCT cp_id, org_id
FROM contact_person_roles) cpr
WHERE ROWNUM < 20
what happened to:
AND cp.cp_id = cpr.cp_id(+)
perhaps that is part of the problem
outer join vs scalar subquery
Arya, April 19, 2005 - 12:52 pm UTC
Hi Tom,
Thanks for pointing out missing join. I have added the missing join and executed but still no great improvement. Here is the autotrace output.
SQL> ed
Wrote file afiedt.buf
1 SELECT /*+ PARALLEL (CP,2) */cp_id,
2 nvl(full_name, last_name || ', ' || first_name) FULL_NAME,
3 last_name,
4 first_name,
5 middle_name,
6 oid_user_name,
7 (CASE WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
8 THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,101,100)) FROM DUAL))
9 WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,101,100)) IS NULL AND trim(both ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL)
10 THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL))
11 WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
12 THEN TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL)) || ', ' || trim(both ' ' FROM (SELECT elease_util.initcap2(substr(cv_data,101,100)) FROM DUAL))
13 ELSE NULL END) FULL_ADDRESS
14 ,TRIM(SUBSTR(cv_data,1,100)) address_line_1
15 ,TRIM(SUBSTR(cv_data,101,100)) address_line_2
16 ,TRIM(SUBSTR(cv_data,201,100)) city
17 ,TRIM(SUBSTR(cv_data,301,2)) state
18 ,TRIM(SUBSTR(cv_data,303,12)) zip_cd
19 ,CASE WHEN TRIM(SUBSTR(cv_data,315,100)) LIKE '%@%' THEN trim(SUBSTR(cv_data,315,100)) ELSE NULL END EMAIL
20 ,TRIM(SUBSTR(cv_data,415,32)) phone_nbr
21 ,TRIM(SUBSTR(cv_data,447,20)) fax_nbr
22 --,trim(substr(cv_data,367,200)) comment
23 ,TO_NUMBER( TRIM(SUBSTR(cv_data,467,10))) cv_id
24 ,NULL brl_id
25 ,NULL cpr_id
26 ,NULL role_name
27 FROM (
28 SELECT cp.cp_id
29 ,cp.full_name
30 ,cp.last_name
31 ,cp.first_name
32 ,cp.middle_name
33 ,cp.oid_user_name
34 ,(SELECT rpad(nvl(cv.Address_Line_1,' '),100,' ') ||
35 rpad(nvl(cv.address_line_2,' '),100,' ') ||
36 rpad(nvl(cv.city,' '),100,' ') ||
37 rpad(nvl(cv.state,' '),2,' ') ||
38 rpad(nvl(cv.zip_cd,' '),12,' ') ||
39 rpad(nvl(cv.email,' '),100,' ') ||
40 rpad(nvl(cv.phone_nbr,' '),32,' ') ||
41 rpad(nvl(cv.fax_nbr,' '),20,' ') ||
42 --rpad(nvl(cv.cv_comment,' '),200,' ') ||
43 to_char(cv.cv_id,'s0000000000')
44 FROM contact_vehicles cv
45 WHERE cv.cp_id = cp.cp_id) cv_data
46 ,( SELECT org.NAME ||
47 org.org_desc ||
48 org.org_id ||
49 orl.name ||
50 orl.role_desc
51 FROM organizations org
52 ,organization_roles orl
53 WHERE org.org_id = org_id
54 AND org.org_id = orl.orl_id) ORG_DATA
55 FROM contact_persons cp,
56 (SELECT DISTINCT cp_id, org_id
57 FROM contact_person_roles) cpr
58 where cp.cp_id = cpr.cp_id(+)
59 and ROWNUM < 20
60* )
SQL> /
19 rows selected.
Elapsed: 00:00:01.06
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1364 Card=19 Bytes=9310)
1 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
2 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
4 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
5 0 VIEW (Cost=1364 Card=19 Bytes=9310)
6 5 COUNT (STOPKEY)
7 6 HASH JOIN (OUTER) (Cost=1364 Card=218700 Bytes=9185400)
8 7 TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=218700 Bytes=6342300)
9 7 VIEW (Cost=827 Card=218020 Bytes=2834260)
10 9 SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
11 10 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218020 Bytes=1744160)
Statistics
----------------------------------------------------------
2190 recursive calls
0 db block gets
10509 consistent gets
2170 physical reads
0 redo size
4712 bytes sent via SQL*Net to client
1445 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
19 rows processed
SQL>
This is java application so to do pagination they have coded rownum < 20.
I appreciate if you can help me how to make this query run faster. I have tried creating materialized view with fast refresh but it has a limitation to have unique constraint on column of inner table which is in outer join so I could not do that. Please bear with my ignorance can I use parallel option to make it faster? I don't know how to use parallel in multiple join like this I did not find any example in oracle documentation also.
Thank you for your help as always.
Regards
Arya
April 19, 2005 - 1:13 pm UTC
parallel is NOT a way to get results subsecond!
there is no order by so the entire rownum < 20 doesn't make sense, you are getting 20 random rows and no way to get the NEXT 20 random ones.
get rid of that parallel.
Are you sure you need that "distinct" in there, having to do this:
10 9 SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
11 10 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271
Card=218020 Bytes=1744160)
before you can get row #1 is going to be *something expensive*
You are great
Arya, April 19, 2005 - 3:29 pm UTC
Hi Tom,
I have discussed with the my lead for removing distinct and we are testing it in the mean while I have tested with out distinct it is performing very well.
Thanks for pointing out.
Regards
Arya
inline query
Arya, April 19, 2005 - 4:50 pm UTC
Hi Tom,
I have rewritten the query as inline and it is much faster now.
Here is the autotrace out put.
SELECT /*+ FIRST_ROWS */
cp.cp_id
,Initcap(nvl(trim(both ' ' from full_name), cp.last_name || ', ' || cp.first_name)) FULL_NAME
,cp.last_name
,cp.first_name
,cp.middle_name
,cv.Address_Line_1 Address_Line_1
,cv.address_line_2 Address_Line_2
,(case when (trim(both ' ' from address_line_1) is null
and trim(both ' ' from address_line_2) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
when (trim(both ' ' from address_line_2) is null
and trim(both ' ' from address_line_1) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual))
when (trim(both ' ' from address_line_1) is not null
and trim(both ' ' from address_line_2) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual)) || ', ' || trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
else null end) FULL_ADDRESS
,Initcap(cv.city) city
,cv.state
,cv.zip_cd
,(case when email like '%@%' then email else null END) email
,cv.phone_nbr
,cv.fax_nbr
,cv.cv_comment
--,org.org_id
,cv.cv_id
,cp.oid_user_name
,(SELECT DISTINCT cpr.org_id
FROM contact_person_roles cpr
WHERE cpr.cp_id = cp.cp_id
AND rownum = 1) org_id
,(SELECT DISTINCT org.name
FROM organizations org
WHERE org_id = org.org_id
AND ROWNUM = 1) org_name
FROM contact_persons cp
,contact_vehicles cv
WHERE cp.cp_id = cv.cp_id(+)
AND ROWNUM < 20
/
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=442347 Car
d=19 Bytes=2185)
1 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
2 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
4 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
5 0 COUNT (STOPKEY)
6 5 TABLE ACCESS (FULL) OF 'ORGANIZATIONS' (Cost=2 Card=5 By
tes=145)
7 0 COUNT (STOPKEY)
8 7 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_PERSON_ROLES'
(Cost=3 Card=1 Bytes=8)
9 8 INDEX (RANGE SCAN) OF 'CPR_UK' (UNIQUE) (Cost=2 Card=1
)
10 0 COUNT (STOPKEY)
11 10 NESTED LOOPS (OUTER) (Cost=442347 Card=218700 Bytes=2515
0500)
12 11 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_PERSONS' (Co
st=4947 Card=218700 Bytes=6342300)
13 12 INDEX (FULL SCAN) OF 'CP_PK' (UNIQUE) (Cost=487 Card
=218700)
14 11 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_VEHICLES' (C
ost=2 Card=1 Bytes=86)
15 14 INDEX (RANGE SCAN) OF 'CV_CP_FK_I' (NON-UNIQUE) (Cos
t=1 Card=1)
Statistics
----------------------------------------------------------
394 recursive calls
0 db block gets
1335 consistent gets
4 physical reads
0 redo size
4885 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
19 rows processed
Thanks for all your help.
Regards
Sarma
Outer Join Vs Regular join
Mita, April 19, 2005 - 5:37 pm UTC
create table issuer(iss_id number not null, iss_name varchar2(100)
primary key pk_iss iss_id)
create table sec (secid nubmer not null,
iss_id nubmer not null,
sec_name varchar2(100)
primary key sec_id
foreign key iss_id references pk_iss);
given the above table structure,
how will be performance diff in following cases
1. select b.iss_nm, a.sec_nm
from sec, issuer
where sec.iss_id = issuer.iss_id(+)
2. select b.iss_nm, a.sec_nm
from sec, issuer
where sec.iss_id = issuer.iss_id
April 19, 2005 - 7:58 pm UTC
why does it matter?
If YOU NEED an outer join
then
you need it
else
you don't
end if
for such simple queries they would perform more or less the same, but again, the answer is "which one do you need"
Outer Join
Mita, April 20, 2005 - 11:40 am UTC
this is not the complete query. but what I want to know is, if you are sure that for each iss_id in table sec you will find record in iss table (it has foreign key reference to iss table), changing the outer join to regular join will change the performance or not.
as such we don't need outer join, but this is existing code and we are suppose to tune it for better performance. but if it is not going to make much different we can do something else to tune up the performance.
April 20, 2005 - 9:04 pm UTC
you need a fkey AND not null constraint to be sure
if you have both, then no outer join
else you can only answer the question "do i need the outer join, is the question I'm asking needing it"
it is really hard to tune sql withing KNOWING the question you are trying to answer ;)
Von, October 06, 2005 - 6:03 pm UTC
Tom,
Sorry for posting this question here...i am in dire need to get this thing to work..
i have the following table
table ord_fct
item desc fymth fywk units_ord units_shp
---------------------------------------------------------
calc calculator 199912 199950 5 2
calc calculator 199912 199949 6 3
calc calculator 199911 199947 4 4
calc calculator 199910 199943 6 5
mod modem 199912 199950 5 6
mod modem 199910 199942 10 6
the following is the sample data from date table
date
fymth fywk
--------------
199910 199940
199910 199941
199910 199942
199910 199943
199911 199944
199911 199945
199911 199946
199911 199947
199911 199948
199912 199949
199912 199950
199912 199951
199912 199952
I need to do an outer join to get the following result..i tried union all not much luck
desired output
item desc fymth fywk units_ord units_shp
--------------------------------------------------------
calc calculator 199910 199940 0 0
calc calculator 199910 199941 0 0
calc calculator 199910 199942 0 0
calc calculator 199910 199943 6 5
calc calculator 199911 199944 0 0
calc calculator 199911 199945 0 0
calc calculator 199911 199946 0 0
calc calculator 199911 199947 4 4
calc calculator 199911 199948 0 0
calc calculator 199912 199949 6 3
calc calculator 199912 199950 5 2
calc calculator 199912 199951 0 0
calc calculator 199912 199952 0 0
mod modem 199910 199940 0 0
mod modem 199910 199941 0 0
mod modem 199910 199942 10 6
mod modem 199910 199943 0 0
mod modem 199911 199944 0 0
mod modem 199911 199945 0 0
mod modem 199911 199946 0 0
mod modem 199911 199947 0 0
mod modem 199911 199948 0 0
mod modem 199912 199949 0 0
mod modem 199912 199950 5 6
mod modem 199912 199951 0 0
mod modem 199912 199952 0 0
Any help would be greatly appreciated
October 06, 2005 - 6:57 pm UTC
create tables... insert intos.... won't even look at it without them.
That - and explain in text what the output logic it - it is really hard for me to figure out what the "algorithm" is.
A reader, October 06, 2005 - 7:15 pm UTC
sorry about that tom..here they are
create table ord_fct
(item varchar2(10),
desc_re varchar2(20),
fymth number,
fywk number,
units_ord number,
units_shp number);
create table date_tab
(
fymth number,
fywk number
);
insert into ord_fct values ('calc', 'calculator', 199912, 199950, 5, 2);
insert into ord_fct values ('calc', 'calculator', 199912, 199949, 6, 3);
insert into ord_fct values ('calc', 'calculator', 199911, 199947, 4, 4);
insert into ord_fct values ('calc', 'calculator', 199910, 199943, 6, 5);
insert into ord_fct values ('mod', 'modem', 199912, 199950, 5, 6);
insert into ord_fct values ('mod', 'modem', 199910, 199942, 10, 6);
commit;
insert into date_tab values( 199910, 199940 );
insert into date_tab values( 199910, 199941 );
insert into date_tab values( 199910, 199942 );
insert into date_tab values( 199910, 199943 );
insert into date_tab values( 199911, 199944 );
insert into date_tab values( 199911, 199945 );
insert into date_tab values( 199911, 199946 );
insert into date_tab values( 199911, 199947 );
insert into date_tab values( 199911, 199948 );
insert into date_tab values( 199912, 199949 );
insert into date_tab values( 199912, 199950 );
insert into date_tab values( 199912, 199951 );
insert into date_tab values( 199912, 199952 );
commit;
The o/p i need units_ord and units_shp for items across all time periods no matter they have values for that time period or not( if they dont have then 0)..
October 06, 2005 - 7:52 pm UTC
I don't understand what "199940" is as a time period?
how big is your real date_tab
Von, October 06, 2005 - 8:05 pm UTC
the date_tab table has around 5000 rows,,,
fymth - month(fiscal)
fywk - week(fiscal)
thanks
October 06, 2005 - 8:58 pm UTC
last question
version.....
this works in 9i:
ops$tkyte@ORA10G> with dims
2 as
3 (select distinct date_tab.*, ord_fct.item, ord_fct.desc_re
4 from date_tab, ord_fct )
5 select dims.*, nvl(ord_fct.units_ord,0) uord, nvl(ord_fct.units_shp,0) ushp
6 from dims left outer join ord_fct on (dims.fymth = ord_fct.fymth and dims.fywk = ord_fct.fywk
7 and dims.item = ord_fct.item and dims.desc_re = ord_fct.desc_re )
8 order by dims.item, dims.desc_re, dims.fymth, dims.fywk
9 /
FYMTH FYWK ITEM DESC_RE UORD USHP
---------- ---------- ---------- -------------------- ---------- ----------
199910 199940 calc calculator 0 0
199910 199941 calc calculator 0 0
199910 199942 calc calculator 0 0
199910 199943 calc calculator 6 5
199911 199944 calc calculator 0 0
199911 199945 calc calculator 0 0
199911 199946 calc calculator 0 0
199911 199947 calc calculator 4 4
199911 199948 calc calculator 0 0
199912 199949 calc calculator 6 3
199912 199950 calc calculator 5 2
199912 199951 calc calculator 0 0
199912 199952 calc calculator 0 0
199910 199940 mod modem 0 0
199910 199941 mod modem 0 0
199910 199942 mod modem 10 6
199910 199943 mod modem 0 0
199911 199944 mod modem 0 0
199911 199945 mod modem 0 0
199911 199946 mod modem 0 0
199911 199947 mod modem 0 0
199911 199948 mod modem 0 0
199912 199949 mod modem 0 0
199912 199950 mod modem 5 6
199912 199951 mod modem 0 0
199912 199952 mod modem 0 0
26 rows selected.
but not in 8i and in 10g there is a better answer using partitioned outer joins.
A reader, October 06, 2005 - 10:22 pm UTC
excellent solution tom...
thanks a lot....is the following possible??
i tried group by rollup..but i got more rows than i actually needed..
Item desc_re fymth fywk UORD USHP
calc calculator 199910 6 5 ----> total
199940 0 0
199941 0 0
199942 0 0
199943 6 5
calc calculator 199911 4 4
199944 0 0
199945 0 0
199946 0 0
199947 4 4
199948 0 0
calc calculator 199912 11 5
199949 6 3
199950 5 2
199951 0 0
199952 0 0
mod modem 199910 10 6
199940 0 0
199941 0 0
199942 10 6
199943 0 0
mod modem 199911 0 0
199944 0 0
199945 0 0
199946 0 0
199947 0 0
199948 0 0
mod modem 199912 5 6
199949 0 0
199950 5 6
199951 0 0
199952 0 0
October 07, 2005 - 8:14 am UTC
try again, i cannot read this
Von, October 07, 2005 - 8:50 am UTC
Sorry..here it is
Item desc_re fymth fywk UORD USHP
calc calculator 199910 6 5 ----> total
199940 0 0
199941 0 0
199942 0 0
199943 6 5
calc calculator 199911 4 4 ----> total
199944 0 0
199945 0 0
199946 0 0
199947 4 4
199948 0 0
calc calculator 199912 11 5 ----> total
199949 6 3
199950 5 2
199951 0 0
199952 0 0
mod modem 199910 10 6 ----> total
199940 0 0
199941 0 0
199942 10 6
199943 0 0
mod modem 199911 0 0 ----> total
199944 0 0
199945 0 0
199946 0 0
199947 0 0
199948 0 0
mod modem 199912 5 6 ----> total
199949 0 0
199950 5 6
199951 0 0
199952 0 0
October 07, 2005 - 9:25 am UTC
ops$tkyte@ORA10G> with dims
2 as
3 (select distinct date_tab.*, ord_fct.item, ord_fct.desc_re
4 from date_tab, ord_fct )
5 select decode( grouping_id(dims.fywk), 1, dims.item ) item,
6 decode( grouping_id(dims.fywk), 1, dims.desc_re ) desc_re,
7 decode( grouping_id(dims.fywk), 1, dims.fymth ) fymth,
8 dims.fywk,
9 sum( nvl(ord_fct.units_ord,0) ) uord,
10 sum( nvl(ord_fct.units_shp,0) ) ushp
11 from dims left outer join ord_fct on
12 (dims.fymth = ord_fct.fymth
13 and dims.fywk = ord_fct.fywk
14 and dims.item = ord_fct.item
15 and dims.desc_re = ord_fct.desc_re )
16 group by grouping sets( (dims.item, dims.desc_re, dims.fymth, dims.fywk ),
17 (dims.item, dims.desc_re, dims.fymth ) )
18 order by dims.item, dims.fymth, grouping_id(dims.fywk) DESC, dims.fywk
19 /
ITEM DESC_RE FYMTH FYWK UORD USHP
---------- -------------------- ---------- ---------- ---------- ----------
calc calculator 199910 6 5
199940 0 0
199941 0 0
199942 0 0
199943 6 5
calc calculator 199911 4 4
199944 0 0
199945 0 0
199946 0 0
199947 4 4
199948 0 0
calc calculator 199912 11 5
199949 6 3
199950 5 2
199951 0 0
199952 0 0
mod modem 199910 10 6
199940 0 0
199941 0 0
199942 10 6
199943 0 0
mod modem 199911 0 0
199944 0 0
199945 0 0
199946 0 0
199947 0 0
199948 0 0
mod modem 199912 5 6
199949 0 0
199950 5 6
199951 0 0
199952 0 0
32 rows selected.
Von
A reader, October 07, 2005 - 1:46 pm UTC
execelletn as usual
thanks tom
Bratt, May 07, 2006 - 12:40 pm UTC
Say the response time by using scalar subqueries and outer join is the same..which method would you prefer?
scalar subquery or outer join?
May 07, 2006 - 1:09 pm UTC
outer join - syntax is easier.
You would use the scalar subquery "trick" to enhance the performance of an outer join that must return its first rows "as fast as possible" - usually this inludes aggregation or something as well.
Bratt, May 07, 2006 - 5:38 pm UTC
If the Select SQL is part of an insert block, then outer join would be better than Scalar Subquery right?
May 08, 2006 - 7:47 am UTC
in all probability, sure - using the CBO.
Bratt
A reader, May 09, 2006 - 7:54 am UTC
Thanks Tom
A reader, October 30, 2006 - 8:56 am UTC
Why is DIM1 invisible to the inline view of the scalar subquery as below:
SELECT ( SELECT dummy dim2
FROM ( SELECT dummy
FROM DUAL
WHERE dummy = dim1 ))
FROM ( SELECT dummy dim1
FROM DUAL )
October 30, 2006 - 9:38 am UTC
because they only go one level down in scalar subqueries. it just "is that way". I don't have a reason for it, it just "is"
Outer Join in Scalar Subqueries
Sokrates, December 17, 2009 - 5:23 am UTC
with
x as
(select 1 a from dual),
y as
(select 1 b, 'X' c from dual),
z as
(select 4 e, 'Y' f from dual)
select
x.a,
(select y.c
from y, z
where y.b=x.a
and z.e(+)=x.a) g
from x;
A G
---------- -
1
oops, I expected
A G
---------- -
1 X
why does the scalar subquery doesn't return a row ?
December 17, 2009 - 10:19 am UTC
select ..., (select y.c
from y, z
where y.b=x.a
and z.e(+)=x.a)
from x;
is just like:
select y.c
from y, z
where y.b=:b1
and z.e(+)=:b2
there is no join between Y and Z, hence there is nothing "outer" about it. "outer" is between two tables.
just like
select * from dual where dummy(+) = 'Y';
returns zero rows. The outer join to Z returns zero rows and then you do an inner join to Y of that result.
thanks
Sokrates, December 18, 2009 - 1:09 am UTC
A reader, September 06, 2010 - 5:40 am UTC
Hi Tom,
In your book 'Effective Oracle by design' - you mentioned that you use 'scalar subqueries' to remove outer joins when we want the small set of rows quickly .
Can't we gain the same performance by using the Nested loop join method(hint) in the same outer join ? Does the Optimizer never use 'Nested loops' for Outer joins?
Many thanks
September 09, 2010 - 7:56 pm UTC
it can use nested loops yes. however, it is usually still not practical in many cases. when you through in things like "order by" - it typically will feel compelled to get to the last row before the first row can be returned.
Take this schema:
drop table t1;
drop table t2;
create table t1 as select * from all_users;
alter table t1 add constraint t1_pk primary key(username);
create table t2 as select * from all_objects;
create index t2_idx on t2(owner);
exec dbms_stats.gather_table_stats( user, 'T1' );
exec dbms_stats.gather_table_stats( user, 'T2' );
Now, report on the first two users sorted by username with their object count, you could:
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from (
2 select /*+ first_rows(2) */ t1.username, count(t2.object_name)
3 from t1 left outer join t2
4 on (t1.username = t2.owner)
5 group by t1.username
6 order by t1.username
7 )
8 where rownum <= 2
9 /
Execution Plan
----------------------------------------------------------
Plan hash value: 1469972403
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 3 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 2 | 60 | 3 (0)| 00:00:01 |
| 3 | SORT GROUP BY NOSORT | | 2 | 80 | 3 (0)| 00:00:01 |
| 4 | NESTED LOOPS OUTER | | 2 | 80 | 3 (0)| 00:00:01 |
| 5 | INDEX FULL SCAN | T1_PK | 1 | 9 | 1 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| T2 | 2 | 62 | 2 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | T2_IDX | 3 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=2)
7 - access("T1"."USERNAME"="T2"."OWNER"(+))
ops$tkyte%ORA11GR2> select /*+ first_rows(2) */ t1.username, (select count(*) from t2 where t2.owner = t1.username)
2 from (select username
3 from t1
4 order by username ) t1
5 where rownum <= 2
6 /
Execution Plan
----------------------------------------------------------
Plan hash value: 679100844
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 34 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 6 | | |
|* 2 | INDEX RANGE SCAN| T2_IDX | 2051 | 12306 | 5 (0)| 00:00:01 |
|* 3 | COUNT STOPKEY | | | | | |
| 4 | VIEW | | 2 | 34 | 1 (0)| 00:00:01 |
| 5 | INDEX FULL SCAN| T1_PK | 2 | 18 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T2"."OWNER"=:B1)
3 - filter(ROWNUM<=2)
if you look at the first query, even though it is nested loops, you have to wait for the last row to get the first row - where as with the 2nd query, we get the first row almost immediately.. (since t1 is small, the top-n query is very fast and the retrieval of the two count(*)'s is relatively fast....)
full outer join with Analytics.
A reader, September 27, 2010 - 11:00 am UTC
Hi Tom,
We are using Oracle 10.2.0.2.0. Please have a look at the below query and the plan.
SELECT NVL (b.service_id, a.service_id) AS service_id, NVL (b.ab, a.ab) AS ab
FROM x_tab b FULL OUTER JOIN
(SELECT service_id, ab from
(SELECT service_id, ab,
row_number () OVER (PARTITION BY service_id ORDER BY service_id DESC) rr
FROM x_tab_x)where rr =1 ) a
ON a.service_id = b.service_id
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 240 | 14 (15)| 00:00:01 |
| 1 | VIEW | | 4 | 240 | 14 (15)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | HASH JOIN OUTER | | 3 | 219 | 8 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL | X_TAB | 2 | 60 | 3 (0)| 00:00:01 |
|* 5 | VIEW | | 3 | 129 | 4 (25)| 00:00:01 |
|* 6 | WINDOW SORT PUSHED RANK| | 3 | 90 | 4 (25)| 00:00:01 |
| 7 | TABLE ACCESS FULL | X_TAB_X | 3 | 90 | 3 (0)| 00:00:01 |
|* 8 | HASH JOIN ANTI | | 1 | 43 | 7 (15)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | X_TAB_X | 3 | 90 | 3 (0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | X_TAB | 2 | 26 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("SERVICE_ID"(+)="B"."SERVICE_ID")
5 - filter("RR"(+)=1)
6 - filter(RANK() OVER ( PARTITION BY "SERVICE_ID" ORDER BY "SERVICE_ID"
DESC )<=1)
8 - access("SERVICE_ID"="B"."SERVICE_ID")
9 - filter(RANK() OVER ( PARTITION BY "SERVICE_ID" ORDER BY "SERVICE_ID"
DESC )=1)
however we want the result of the subquery where RR = 1 and then have the full outer join. Here the condition RR = 1 seems to be added after the join in outer join mode and hence some records are being supressed (RN is null after outer join ). Please suggest a way to overcome this. Also please let us know why the condition RR = 1 is outer joined.
September 27, 2010 - 12:27 pm UTC
put the code tags around that plan - you lose all indenting without it.
It seems right to me - if RN (think you mean RR), that just means there was no row in the (A) query to join to - there was no matching A.SERVICE_ID for some B.SERVICE_ID and we "made up" a row - that is what outer joins do??
Outer Join and Analytics.
A reader, September 27, 2010 - 12:37 pm UTC
Hi Tom,
Why is the RR= 1 not taken first while evaluating the inline view rather its taken as RR (+) = 1. we do not want the outer join on RR.
What should we do so that the inner query (inline view) is evaluated first with RR = 1 (and not RR(+) = 1) and then the full outer join to happen.
September 27, 2010 - 12:43 pm UTC
it has to be!!! you are asking for an outer join!!!!!
we merged the view here - you are OUTER JOINING to a set defined by "where rr=1", hence we take the set (with RR in it) and then outer join to it - if rr=1 does NOT EXIST for some service_id - we make up a row for it (that is the entire sole goal of the outer join you asked for!!!!)
It is doing the right thing.
Suppose it did what you ask - it evaluated the rr=1 first. What would happen if a serivce_id in (a) did not exist????
answer: it would make up a row, assign rr=null to it and join to that - because you are doing an outer join.
In other words - the answer would be EXACTLY the same.
Outer Join and Analytics
A reader, September 27, 2010 - 1:36 pm UTC
Thanks a lot Tom , for explaining.
But how do we materialize the subquery so that RR = 1 doesn't get view merged. So that the subquery behaves as a table and we don't get the RR (+) = 1 . The problem is not with RR column getting null values rather its suppressing the record with null RR. Outer join would not supress records.
Strangely we would get the required values when we write RR=1 or RR is null instead of only RR=1 in the subquery.
September 27, 2010 - 1:47 pm UTC
I don't understand, the answer should be the same regardless of whether the (A) view was materialized or merged.
Do this
create the tables for us (give us the creates)
put in a row or two of data (that would lead to a "wrong" answer in your opinion)
and then explain what the "right" answer should be.
A reader, September 27, 2010 - 3:43 pm UTC
Hi Tom,
why optimizer choose rank() over row_number() in the above plan?
Is it because column in "partition by" clause and "order by" clause is same?
Thanks
September 27, 2010 - 5:40 pm UTC
that is interesting, I did not look at the plan really, was concentrating on the question.
to the original poster - can we have the full schema for this - I'd like to reproduce that