Thanks for the question, Rafique.

Asked: August 28, 2004 - 11:45 am UTC

Last updated: September 27, 2010 - 5:40 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Tom,

I have a specific requirement where I have about 10 to 12 tables to query the data from. In this case, I have one parent table let us say called t and all other child tables let us called t_c1,t_c2,t_c3..t_c12. As parent to child tables mapping is not one to one so I don't have any choice but outer join and there are always more than columns to select from child tables. As outer join in this case is killing the performance of the query, so I explore your website to find any better solution than outer join and one approach I come up with is scalar subqueries with string concatenation as described below.

create table t (object_id number,object_name varchar2(150))
/
insert into t as select object_id,object_name from all_objects
/
create table t_c1(object_id number,c1 varchar2(100),c2 varchar2(100))
/
alter table t add primary key (object_id)
/
declare
num number :=300;--just sample number
count1 number :=1;
begin
for i in(select object_id from t where rownum<500)
loop
if mod(count1,2)=0
then
insert into t_c1 values(i.object_id,num,num+1);
end if;
count1:=count1+1;
end loop;
end;
/
create index tc1_idx on t_C1(object_id)
/

Here is outer join approach:

select t.object_id,t1.c1,t1.c2
from t,t_C1 t1
where t.object_id=t1.object_id(+)
/

scalar subqueries with string concatenation:

select object_id,to_number(substr(c1_data||'|',1,instr(c1_data||'|','|')-1)) c1,
to_number(substr(c1_data||'||',instr(c1_data||'||', '|') +1,
instr( c1_data||'||', '|', 1, 2 )-instr(c1_data||'||','|')-1 )) c1
from (select object_id,
(select c1||'|'||c2
from t_C1
where object_id=t.object_id) c1_data
from t)
/

Although performance is much better of scalar subqueries than outer join but still I am not satisfied with this approach as converting the string back to original types of the columns may result some invalid number/invalid dates run time errors.

Can you think of any better approach than this?

Thanks in advance for your help.

and Tom said...

If you have my book "Effective Oracle by Design", I describe this technique and offer another way to implement it.

However, I fail to see how "converting the string back to original types of the column may result some invalid number/invalida dates". That would only be possible if the conversion of the number/date -> string produced something that was not a number/date!

I prefer to use fixed width fields personally. So, the above for me would be (i added a number and a date and added a dash of NULLS to demonstrate how to deal with them)

ops$tkyte@ORA9IR2> drop table t;

Table dropped.

ops$tkyte@ORA9IR2> drop table t_c1;

Table dropped.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create table t (object_id number,object_name varchar2(150))
2 /

Table created.

ops$tkyte@ORA9IR2> insert into t select object_id,object_name from all_objects
2 where rownum <= 20
3 /

20 rows created.

ops$tkyte@ORA9IR2> create table t_c1
2 (object_id number,
3 c1 varchar2(100),
4 c2 varchar2(100),
5 c3 date,
6 c4 number(8) )
7 /

Table created.

ops$tkyte@ORA9IR2> alter table t add primary key (object_id)
2 /

Table altered.

ops$tkyte@ORA9IR2> declare
2 num number :=300;--just sample number
3 count1 number :=1;
4 begin
5 for i in(select object_id from t where rownum<=10)
6 loop
7 if ( mod(count1,4) = 0 )
8 then
9 insert into t_c1 values(i.object_id,NULL,num+1,sysdate+count1,num+42);
10 elsif ( mod(count1,4) = 1 )
11 then
12 insert into t_c1 values(i.object_id,num,NULL,sysdate+count1,num+42);
13 elsif ( mod(count1,4) = 2 )
14 then
15 insert into t_c1 values(i.object_id,num,num+1,NULL,num+42);
16 else
17 insert into t_c1 values(i.object_id,num,num+1,sysdate+count1,NULL);
18 end if;
19 count1:=count1+1;
20 end loop;
21 end;
22 /

PL/SQL procedure successfully completed.

ops$tkyte@ORA9IR2> create index tc1_idx on t_C1(object_id)
2 /

Index created.

In the following i used fixed width fields (easier to "parse") and I encoded dates as yyyymmddhh24miss and the number format you use will be DEPENDENT on the numbers you have -- your number format should match the valid values for that number type.

Note that all of the columns in the scalar subquery have special "null handling" except for the last one (doesn't need it).

ops$tkyte@ORA9IR2> select object_id,
2 trim( substr( data, 1, 100 ) ) c1,
3 trim( substr( data, 101, 100)) c2,
4 to_date( trim(substr( data, 201, 14 )), 'yyyymmddhh24miss' ) c3,
5 to_number( trim(substr( data, 215 )) ) c4
6 from (
7 select object_id, (select rpad( nvl(c1,' '), 100, ' ' ) ||
8 rpad( nvl(c2,' '), 100, ' ' ) ||
9 nvl(to_char(c3,'yyyymmddhh24miss'), rpad( ' ', 14, ' ' )) ||
10 to_char(c4,'s000000000')
11 from t_c1
12 where object_id = t.object_id ) data
13 from t
14 )
15 /

OBJECT_ID C1 C2 C3 C4
---------- ---------- ---------- --------- ----------
17287 300 29-AUG-04 342
7559 300 301 342
22543 300 301 31-AUG-04
13615 301 01-SEP-04 342
22346 300 02-SEP-04 342
10173 300 301 342
22748 300 301 04-SEP-04
22798 301 05-SEP-04 342
9849 300 06-SEP-04 342
11261 300 301 342
16887
7745
18771
9249
6371
18773
22842
14287
20118
18289

20 rows selected.

Now, suppose you didn't want to "parse" and nvl and all of that stuff, you can do this instead:

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace type myScalarType as object
2 ( c1 varchar2(100),
3 c2 varchar2(100),
4 c3 date,
5 c4 number
6 )
7 /

Type created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select object_id,
2 x.data.c1 c1,
3 x.data.c2 c2,
4 x.data.c3 c3,
5 x.data.c4 c4
6 from (
7 select object_id, (select myScalarType(c1,c2,c3,c4)
8 from t_c1
9 where object_id = t.object_id ) data
10 from t
11 ) x
12 /

OBJECT_ID C1 C2 C3 C4
---------- ---------- ---------- --------- ----------
17287 300 29-AUG-04 342
7559 300 301 342
22543 300 301 31-AUG-04
13615 301 01-SEP-04 342
22346 300 02-SEP-04 342
10173 300 301 342
22748 300 301 04-SEP-04
22798 301 05-SEP-04 342
9849 300 06-SEP-04 342
11261 300 301 342
16887
7745
18771
9249
6371
18773
22842
14287
20118
18289

20 rows selected.

Rating

(37 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Excellent Response!!!

Rafique Awan, August 28, 2004 - 6:22 pm UTC

You are great! Thanks a lot for your excellent response. I liked the approach of using scalar type instead of handling string and also it is also easy to maintain in the long run.

I have just some questions regarding scalar subqueries that I want to clarify.

1) Does Oracle execute scalary subquery for every row? can you please through some lights on the internal implementation of this feature?
2) In case if we want to search based on derived column from scalar subqueries, is ther any way to optimize that?

BTW, I have both of your books EXPERT ONE ON ONE and EFFECTIVE ORACLE BY DESIGN, I love to read both books and I have learnt a lot from these and I also recomnded these books to many other friends.

August 28, 2004 - 7:04 pm UTC

1) </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:279415836583#19447054234438 <code>

2) don't understand this question?

Index on scalar type columns

A reader, August 28, 2004 - 11:00 pm UTC

<quote>
create or replace view v as
ops$tkyte@ORA9IR2> select object_id,
2 x.data.c1 c1,
3 x.data.c2 c2,
4 x.data.c3 c3,
5 x.data.c4 c4
6 from (
7 select object_id, (select myScalarType(c1,c2,c3,c4)
8 from t_c1
9 where object_id = t.object_id ) data
10 from t
11 ) x
12 /
<quote>

I think what he means is if you create a view on your query like above and do a query like

select * from v where c1 = ...

it will not use the index on t_c1.c1 because of the way it is being wrapped in the scalar type and scalar subquery and stuff, no?

Thanks

August 29, 2004 - 11:17 am UTC

Ok, then you would not be needing a scalar subquery or outer join *AT ALL*

if the concept is

a) outer join from T to T1
b) put a predicate on T1.column

you've made a grievous error..

You would "just join" at that point.

select ...
from t, t1
where join...
and t1.c = ....

because if you have:

where t.pk = t1.k (+)
and t1.c <op> <value>

and <op> is anything other than "is null" -- what was the point of the outer join other than to waste CPU cycles and reduce the number of plans available?

for you see, if the row in t1 is "made up" due to the outer join -- then all of the columns of t1 for that row will be null and null <op> <value> is never TRUE nor FALSE but "not known" -- it would be removed from the result set.

So, I would never use this technique if I wanted to predicate on the columns from T1, it would quite simply "not make any sense"

Thanks Reader!

Rafique Awan, August 28, 2004 - 11:41 pm UTC

Reader you are right. That is what I mean to ask.

August 29, 2004 - 11:42 am UTC

see above, if you predicate on it, you would never have outer joined to it in the first place

Thanks!!!

Rafique Awan, August 29, 2004 - 11:44 am UTC

I agree with you that it doesn't make any sense to use outer join/scalar subqueries if query is based on the column from child table but I was thinking to create just one generic view which will satisfy all the requirements that I have. But now I realized that I should not use this view if I need to query based on the column from child table,I should use seprate set of queries in this specific requirement with simple join from parent to child table as you said.

Thanks again.

August 29, 2004 - 12:03 pm UTC

"generic" = "poor performance"

this generic view cannot satisfy all requirements. proof is simple. Here is the list of requirements you have:

a) ...
b) ...
c) performance
d) ....

C) cannot be satisfied by anything generic ;)

I hate "generic", everytime I see "generic" I see "performance problem", I see "maintainance issue", I see "confusion in the future".

A reader, August 29, 2004 - 10:24 pm UTC

"for you see, if the row in t1 is "made up" due to the outer join -- then all of the columns of t1 for that row will be null and null <op> <value> is never TRUE
nor FALSE but "not known" -- it would be removed from the result set"

Well, that is true if the row is indeed made up. But the definition of outer join is a optional join. I want to predicate/filter on t1.col when the row is found, not made up.

Whats wrong with that? I cant make it a exact join because then I would lose some rows from "t"!

<quote>
I hate "generic", everytime I see "generic" I see "performance problem", I see "maintainance issue", I see "confusion in the future"
<quote>

I would mostly agree, but then you come to a point when 10 views on the same underlying tables are 99.9% similar! Each app, user, adhoc query needs a teeny tiny twist on the same kind of query.

In this case, creating generic view makes sense. Make the join "generic", expose different columns for different users and let each app use what they need.

You dont agree?

August 30, 2004 - 8:13 am UTC

yes you can. the premise is:

an outer join from T to T1
with a predicate on T1.c

the outer join is from T to T1, if it were the other way -- no problem

from t, t1 where t1.k = t.k(+) and t1.c <op> <value> makes sense
from t, t1 where t1.k(+) = t.k and t1.c <op> <value> does NOT make sense.

No, I obviously do not agree, I stated as much above?

For "a reader"

Muhammad Riaz Shahid, August 30, 2004 - 4:17 am UTC

Consider the query:

select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1(+)

So here you are doing some outer join. You simply told optimizer "Hay ! I am not sure that i will get a row in t1 based on col1". If a row is not found in t1 w.r.t then all the columns of table t1 (in select) will become null....right ?

Now:


select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1(+)
and col2<=100 --say you add this condition

Now since for each row of t2, if the same was not found in t1, then all the columns of t1 should be null (TURE ???). Then what is the purpose of this condition:

and col2<=100

that simply implies that your are expecting col2 to be not null (implies that you are telling here the optimizer "Hay ! I will definitely get a row from t1 for condition t2.pk=t1.col1". So you misguided the optimizer....it should have simply been:

select t1.col1,t1.col2,t2.col1,t2.col3
from t2, t1
where t2.pk=t1.col1
and col2<=100

Remeber the OUTER JOINS are always expensive....we should use them only when they make sense...

Here is the proof:

SQL> create table tab1(a number, b number);

Table created.

Elapsed: 00:00:00.25
SQL> insert into tab1 select rownum,rownum+.25 from user_objects 

10 rows created.

Elapsed: 00:00:00.87
SQL> commit;

Commit complete.

Elapsed: 00:00:00.10
SQL> ed
Wrote file afiedt.buf

  1* create table tab2  as select decode(mod(rownum,2),0,rownum) d, rownum
SQL> /

Table created.

Elapsed: 00:00:00.18
SQL> select * FRom tab1;

         A          B
========== ==========
         1       1.25
         2       2.25
         3       3.25
         4       4.25
         5       5.25
         6       6.25
         7       7.25
         8       8.25
         9       9.25
        10      10.25

10 rows selected.

Elapsed: 00:00:00.94
SQL> select * From tab2;

         D          E
========== ==========
                 1001
         2       1002
                 1003
         4       1004
                 1005
         6       1006
                 1007
         8       1008
                 1009
        10       1010

10 rows selected.

Elapsed: 00:00:00.93
SQL> select a,b,d,e from tab1, tab2 
  2  where a=d(+)
  3  
SQL> /

         A          B          D          E
========== ========== ========== ==========
         1       1.25
         2       2.25          2       1002
         3       3.25
         4       4.25          4       1004
         5       5.25
         6       6.25          6       1006
         7       7.25
         8       8.25          8       1008
         9       9.25
        10      10.25         10       1010

10 rows selected.

Elapsed: 00:00:00.94
SQL> ed
Wrote file afiedt.buf

  1  select a,b,d,e from tab1, tab2
  2  where a=d(+)
  3* and e>=1006
SQL> /

         A          B          D          E
========== ========== ========== ==========
         6       6.25          6       1006
         8       8.25          8       1008
        10      10.25         10       1010

Remaining part

Muhammad Riaz Shahid, August 30, 2004 - 4:19 am UTC

Here is the last query i forgot to paste.

SQL> ed
Wrote file afiedt.buf

  1  select a,b,d,e from tab1, tab2
  2  where a=d
  3* and e>=1006
SQL> /

         A          B          D          E
========== ========== ========== ==========
         6       6.25          6       1006
         8       8.25          8       1008
        10      10.25         10       1010

and t1.c <op> <value>

Pratap, August 30, 2004 - 5:09 am UTC

About your comment on outer join with condition - and t1.c <op> <value>

Consider a requirement -

Tables t1 and t2

1. t1 has rows and show all of them
2. t2 may or may not have any rows
3. Show data from t2 only if value of a column is 'something'

create table t1 ( a number );

create table t2 ( a number, b varchar2(100) )

insert into t1 values ( 1 );

insert into t2 values ( 1, 'something' );

insert into t2 values ( 1, 'something else' );

select *
from t1, t2
where t1.a = t2.a (+)
and t2.b (+) = 'something';

Pratap

August 30, 2004 - 8:47 am UTC

that is an outer join period. not at all what we were discussing which is:

t1.a = t2.a(+)
and
t2.b = 'something'

A reader, August 30, 2004 - 8:33 am UTC

"Now since for each row of t2, if the same was not found in t1, then all the columns of t1 should be null (TURE ???). Then what is the purpose of this condition:
and col2<=100"

But this is where I dont understand.

Both you and the reader Shahid keep pointing out the case where the row is NOT found in the outerjoined table (in bold above).

I am interested in the case where the row IS found in the table and I want to filter in that case.

In that case

from t,t1 where t.pk=t1.pk(+) and t1.col='something' would make perfect sense, wouldnt it?

Also, how would you respsond to the latest followup from "Pratap", where I am interested in filtering on the some other column in t1, not the one joined with t

from t,t1
where t.pk=t1.pk(+)
and t1.other(+)='something'

This is a fairly common business requirement, one that you just cannot trivialize saying 'bah, bad design'.

August 30, 2004 - 9:00 am UTC

<quote>
from t,t1 where t.pk=t1.pk(+) and t1.col='something' would make perfect sense,
wouldnt it?
</quote>

No, not at all.

Fact: if t.pk = t1.pk(+) "makes up a t1 row" (eg: the outer join did something), the predicate t1.col = 'something' WOULD ALWAYS REMOVE IT, meaning the (+) just does *nothing*. Well, not nothing, it removes from consideration certain opportunities

when you see:

where t.pk = t1.pk(+) and t1.c = 'something'

you know the person that coded it didn't understand what was going on. the only way that should be coded is:

where t.pk = t1.pk and t1.c = 'something'

I didn't say that last one was "bah bad design", I said "nah, that is not what we were taking about"

that is an outer join from t to t1. pretend that 'something' comes from t (was a column in T). that query is totally valid and reasonable. That query just is not what we were discussing.

A reader, August 30, 2004 - 8:56 am UTC

t1.a = t2.a(+)
and
t2.b(+) = 'something'

Sorry for the confusion, but the above is what I thought we were discussing, an "outer join, period"

Getting back to the question about comparing scalar subqueries (with or without scalar types), in the above query, the CBO will use the index on t2.b, if it makes sense.

But in the myScalarType example above, which is also effectively an outer join, the CBO will not use the index on t_c1.c1, right? Is there any way to have the best of both worlds?

Thanks

August 30, 2004 - 9:17 am UTC

only if the predicate itself it pushed down into the scalar subquery (making it behave just like "t2.b(+) = 'something'"

if you wanted to just have t2.b = 'something', we want to go back to a JOIN.

Pratap, August 30, 2004 - 9:27 am UTC

Tom is right in saying that

where t.pk = t1.pk(+) and t1.c = 'something'
is as good as
where t.pk = t1.pk and t1.c = 'something'

But probably those who code like this do have a requirement on filtering t1.c = 'something', BUT they don't know that -
1. t1.c = 'something' is going to return false because the value of t1.c may be null AND

2. One can also put an outer join like t1.c (+) = 'something', when comparing a literal or variable.

From sql reference -

If the WHERE clause contains a condition that compares a column from table B with a constant, then the (+) operator must be applied to the column so that Oracle returns the rows from table A for which it has generated nulls for this column. Otherwise Oracle returns only the results of a simple join.

August 30, 2004 - 9:47 am UTC

Or,

select *
from t, ( select * from t1 where c = 'something' ) t1
where t.k = t1.k(+)

as well....

it says

"give me every row in T"
and if there is a mate in T2 such that c = 'something', show me that as well.

A reader, August 30, 2004 - 9:59 am UTC

"select *
from t, ( select * from t1 where c = 'something' ) t1
where t.k = t1.k(+)"

That will work only if 'something' is well-known and part of the requirements for this view.

It wont work if I need to 'expose' 'c' in the view and let the app filter on what ever value they want for 'c'

Thanks

August 30, 2004 - 10:34 am UTC

I'm not even thinking "view" at that point.

I'm thinking "query"

View vs query

A reader, August 30, 2004 - 5:10 pm UTC

"I'm not even thinking "view" at that point.
I'm thinking "query" "

Well, thats the easiest thing then. Taking one specific query, analysing it, reducing its LIOs, is very easy in the grand scheme of things, once you get to know Oracle well.

Its when you expose a commonly-uses query/business question as a VIEW that things start to get hairy. Users/apps start to put predicates on columns simply not designed to do so!

Thanks

To "A reader "

Muhammad Riaz Shahid, September 21, 2004 - 8:00 am UTC

Please search this site for "parameterized views" and you will know that the query provided by TOM can also be used in a view....where you can also use different values for "something" on thr fly...

outer join vs scalar subquery

Arya, April 19, 2005 - 10:52 am UTC

Hi Tom,

I have a query using outer join users are complaining which is slow so I am working to make it faster and in that process I have rewritten this query as a scalar subquery which you have explained in your book. To my surprise scalar subquery is doing more logical reads and taking more time than outerjoin. 

I request your help in pointing me where am I doing mistake?

Here is the original query which is doing less logical reads.

SQL> SELECT /*+ FIRST_ROWS */ cp.cp_id
  2     ,Initcap(nvl(trim(both ' ' from full_name), cp.last_name || ', ' || cp.first_name)) FULL_NAME
  3     ,cp.last_name
  4     ,cp.first_name
  5     ,cp.middle_name
  6     ,org.NAME ORG_NAME
  7     ,org.org_desc
  8     ,orl.NAME ORG_ROLE
  9     ,orl.role_Desc
 10     ,NULL role_name
 11     ,cv.Address_Line_1 Address_Line_1
 12     ,cv.address_line_2 Address_Line_2
 13     ,(case when (trim(both ' ' from address_line_1) is null
 14       and trim(both ' ' from address_line_2) is not null)
 15      then trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
 16            when (trim(both ' ' from address_line_2) is null
 17             and trim(both ' ' from address_line_1) is not null)
 18      then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual))
 19            when (trim(both ' ' from address_line_1) is not null
 20      and trim(both ' ' from address_line_2) is not null)
 21      then  trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual)) || ', '  || trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
 22        else null end)  FULL_ADDRESS
 23     ,Initcap(cv.city) city
 24     ,cv.state
 25     ,cv.zip_cd
 26     ,(case when email like '%@%' then email else null END) email
 27     ,cv.phone_nbr
 28     ,cv.fax_nbr
 29     ,cv.cv_comment
 30     ,org.org_id
 31     ,cv.cv_id
 32     ,cp.oid_user_name
 33     ,NULL brl_id
 34     ,NULL cpr_id
 35  FROM contact_persons cp
 36      ,contact_vehicles cv
 37      ,organizations org
 38      ,organization_roles orl
 39      ,(select distinct cp_id, org_id from contact_person_roles) cpr
 40  WHERE cp.cp_id = cv.cp_id(+)
 41    AND cp.cp_id = cpr.cp_id(+)
 42    AND cpr.org_id = org.org_id(+)
 43    AND org.orl_id = orl.orl_id(+)
 44    AND ROWNUM <20
 45  /

19 rows selected.

Elapsed: 00:00:01.06

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=879812 Card=19 Bytes=7201)

   1    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   2    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   3    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   4    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   5    0   COUNT (STOPKEY)
   6    5     NESTED LOOPS (OUTER) (Cost=879812 Card=219630 Bytes=83239770)

   7    6       NESTED LOOPS (OUTER) (Cost=440552 Card=219630 Bytes=64351590)

   8    7         NESTED LOOPS (OUTER) (Cost=220922 Card=219630 Bytes=20425590)

   9    8           HASH JOIN (OUTER) (Cost=1292 Card=219630 Bytes=12079650)

  10    9             TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=219630 Bytes=6369270)

  11    9             VIEW (Cost=750 Card=152051 Bytes=3953326)
  12   11               SORT (UNIQUE) (Cost=750 Card=152051 Bytes=1216408)

  13   12                 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218110 Bytes=1744880)

  14    8           TABLE ACCESS (BY INDEX ROWID) OF 'ORGANIZATIONS' (Cost=1 Card=1 Bytes=38)

  15   14             INDEX (UNIQUE SCAN) OF 'ORG_PK' (UNIQUE)
  16    7         TABLE ACCESS (BY INDEX ROWID) OF 'ORGANIZATION_ROLES' (Cost=1 Card=1 Bytes=200)

  17   16           INDEX (UNIQUE SCAN) OF 'ORL_PK' (UNIQUE)
  18    6       TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_VEHICLES' (Cost=2 Card=1 Bytes=86)

  19   18         INDEX (RANGE SCAN) OF 'CV_CP_FK_I' (NON-UNIQUE) (Cost=1 Card=1)


Statistics
----------------------------------------------------------
        327  recursive calls
          0  db block gets
       4923  consistent gets
       2029  physical reads
          0  redo size
       5675  bytes sent via SQL*Net to client
        514  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         19  rows processed

I have rewritten above query as scalar subquery and here is the autotrace for the same which is doing more logical reads compared to outer join.

SELECT cp_id,
       nvl(full_name, last_name || ', ' || first_name) FULL_NAME,
       last_name, 
       first_name, 
       middle_name, 
       oid_user_name,
       (CASE WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NULL  AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
                 THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,101,100)) FROM DUAL))
                 WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,101,100)) IS NULL  AND trim(both ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL)
                 THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL))
                 WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
                 THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL)) || ', '  || trim(both ' ' FROM (SELECT elease_util.initcap2(substr(cv_data,101,100)) FROM DUAL))
        ELSE NULL END) FULL_ADDRESS
       ,TRIM(SUBSTR(cv_data,1,100)) address_line_1
       ,TRIM(SUBSTR(cv_data,101,100)) address_line_2
       ,TRIM(SUBSTR(cv_data,201,100)) city
       ,TRIM(SUBSTR(cv_data,301,2)) state
       ,TRIM(SUBSTR(cv_data,303,12)) zip_cd
       ,CASE WHEN TRIM(SUBSTR(cv_data,315,100)) LIKE '%@%' THEN trim(SUBSTR(cv_data,315,100)) ELSE NULL END EMAIL
       ,TRIM(SUBSTR(cv_data,415,32)) phone_nbr
       ,TRIM(SUBSTR(cv_data,447,20)) fax_nbr
       --,trim(substr(cv_data,367,200)) comment
       ,TO_NUMBER( TRIM(SUBSTR(cv_data,467,10))) cv_id
        ,NULL brl_id
       ,NULL cpr_id
       ,NULL role_name
  FROM (
SELECT cp.cp_id
       ,cp.full_name
       ,cp.last_name
       ,cp.first_name
       ,cp.middle_name
       ,cp.oid_user_name
       ,(SELECT rpad(nvl(cv.Address_Line_1,' '),100,' ') ||
           rpad(nvl(cv.address_line_2,' '),100,' ') ||
           rpad(nvl(cv.city,' '),100,' ') ||
           rpad(nvl(cv.state,' '),2,' ') ||
           rpad(nvl(cv.zip_cd,' '),12,' ') ||
        rpad(nvl(cv.email,' '),100,' ') ||
           rpad(nvl(cv.phone_nbr,' '),32,' ') ||
           rpad(nvl(cv.fax_nbr,' '),20,' ') ||
           --rpad(nvl(cv.cv_comment,' '),200,' ') ||
           to_char(cv.cv_id,'s0000000000') 
           FROM contact_vehicles cv
          WHERE cv.cp_id = cp.cp_id) cv_data
      ,( SELECT org.NAME ||
                   org.org_desc ||
                    org.org_id ||
            orl.name ||
            orl.role_desc
               FROM organizations org
            ,organization_roles orl
              WHERE org.org_id = cpr.org_id
            AND org.org_id = orl.orl_id) ORG_DATA
 FROM contact_persons cp
      ,(SELECT DISTINCT cp_id, org_id 
          FROM contact_person_roles) cpr
WHERE ROWNUM < 20
)
19 rows selected.

Elapsed: 00:00:01.07

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=69113167 Card=19 Bytes=9310)
   1    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   2    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   3    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   4    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   5    0   VIEW (Cost=69113167 Card=19 Bytes=9310)
   6    5     COUNT (STOPKEY)
   7    6       MERGE JOIN (CARTESIAN) (Cost=69113167 Card=47680974000 Bytes=2002600908000)
   8    7         VIEW (Cost=827 Card=218020 Bytes=2834260)
   9    8           SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
  10    9             TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218020 Bytes=1744160)
  11    7         BUFFER (SORT) (Cost=69113167 Card=218700 Bytes=6342300)
  12   11           TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=218700 Bytes=6342300)

Statistics
----------------------------------------------------------
       2530  recursive calls
          2  db block gets
      11516  consistent gets
       1829  physical reads
          0  redo size
       4481  bytes sent via SQL*Net to client
       1391  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          5  sorts (memory)
          1  sorts (disk)
         19  rows processed

SQL>

Thanking you for all your help so far.

Arya

April 19, 2005 - 12:09 pm UTC

it (scalar subqueries) are a technique, not a rule of thumb (because ROT is ROT and ROT is bad).

never said they would always be better, they apply to a certain class of problems.

seems funny to want just the first 20 rows...

seems you MISSED a join?

FROM contact_persons cp
,(SELECT DISTINCT cp_id, org_id
FROM contact_person_roles) cpr
WHERE ROWNUM < 20

what happened to:

AND cp.cp_id = cpr.cp_id(+)

perhaps that is part of the problem

outer join vs scalar subquery

Arya, April 19, 2005 - 12:52 pm UTC

Hi Tom,

Thanks for pointing out missing join. I have added the missing join and executed but still no great improvement. Here is the autotrace output.

SQL> ed
Wrote file afiedt.buf

  1  SELECT /*+ PARALLEL (CP,2) */cp_id,
  2         nvl(full_name, last_name || ', ' || first_name) FULL_NAME,
  3         last_name,
  4         first_name,
  5         middle_name,
  6         oid_user_name,
  7        (CASE WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NULL  AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
  8                   THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,101,100)) FROM DUAL))
  9                   WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,101,100)) IS NULL  AND trim(both ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL)
 10                   THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL))
 11                   WHEN (TRIM(BOTH ' ' FROM SUBSTR(cv_data,1,100)) IS NOT NULL AND trim(both ' ' FROM SUBSTR(cv_data,101,100)) IS NOT NULL)
 12                   THEN  TRIM(BOTH ' ' FROM (SELECT elease_util.initcap2(SUBSTR(cv_data,1,100)) FROM DUAL)) || ', '  || trim(both ' ' FROM (SELECT elease_util.initcap2(substr(cv_data,101,100)) FROM DUAL))
 13          ELSE NULL END) FULL_ADDRESS
 14         ,TRIM(SUBSTR(cv_data,1,100)) address_line_1
 15         ,TRIM(SUBSTR(cv_data,101,100)) address_line_2
 16         ,TRIM(SUBSTR(cv_data,201,100)) city
 17         ,TRIM(SUBSTR(cv_data,301,2)) state
 18         ,TRIM(SUBSTR(cv_data,303,12)) zip_cd
 19         ,CASE WHEN TRIM(SUBSTR(cv_data,315,100)) LIKE '%@%' THEN trim(SUBSTR(cv_data,315,100)) ELSE NULL END EMAIL
 20         ,TRIM(SUBSTR(cv_data,415,32)) phone_nbr
 21         ,TRIM(SUBSTR(cv_data,447,20)) fax_nbr
 22        --,trim(substr(cv_data,367,200)) comment
 23         ,TO_NUMBER( TRIM(SUBSTR(cv_data,467,10))) cv_id
 24         ,NULL brl_id
 25         ,NULL cpr_id
 26         ,NULL role_name
 27    FROM (
 28  SELECT cp.cp_id
 29         ,cp.full_name
 30         ,cp.last_name
 31         ,cp.first_name
 32         ,cp.middle_name
 33         ,cp.oid_user_name
 34         ,(SELECT rpad(nvl(cv.Address_Line_1,' '),100,' ') ||
 35                     rpad(nvl(cv.address_line_2,' '),100,' ') ||
 36                     rpad(nvl(cv.city,' '),100,' ') ||
 37                     rpad(nvl(cv.state,' '),2,' ') ||
 38                     rpad(nvl(cv.zip_cd,' '),12,' ') ||
 39             rpad(nvl(cv.email,' '),100,' ') ||
 40                     rpad(nvl(cv.phone_nbr,' '),32,' ') ||
 41                     rpad(nvl(cv.fax_nbr,' '),20,' ') ||
 42                     --rpad(nvl(cv.cv_comment,' '),200,' ') ||
 43                     to_char(cv.cv_id,'s0000000000')
 44             FROM contact_vehicles cv
 45            WHERE cv.cp_id = cp.cp_id) cv_data
 46       ,( SELECT org.NAME ||
 47                         org.org_desc ||
 48                      org.org_id ||
 49                 orl.name ||
 50                 orl.role_desc
 51                 FROM organizations org
 52                 ,organization_roles orl
 53                WHERE org.org_id = org_id
 54             AND org.org_id = orl.orl_id) ORG_DATA
 55   FROM contact_persons cp,
 56        (SELECT DISTINCT cp_id, org_id
 57               FROM contact_person_roles) cpr
 58  where cp.cp_id = cpr.cp_id(+)
 59    and ROWNUM < 20
 60* )
SQL> /

19 rows selected.

Elapsed: 00:00:01.06

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1364 Card=19 Bytes=9310)
   1    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   2    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   3    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   4    0   TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
   5    0   VIEW (Cost=1364 Card=19 Bytes=9310)
   6    5     COUNT (STOPKEY)
   7    6       HASH JOIN (OUTER) (Cost=1364 Card=218700 Bytes=9185400)
   8    7         TABLE ACCESS (FULL) OF 'CONTACT_PERSONS' (Cost=317 Card=218700 Bytes=6342300)
   9    7         VIEW (Cost=827 Card=218020 Bytes=2834260)
  10    9           SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
  11   10             TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271 Card=218020 Bytes=1744160)

Statistics
----------------------------------------------------------
       2190  recursive calls
          0  db block gets
      10509  consistent gets
       2170  physical reads
          0  redo size
       4712  bytes sent via SQL*Net to client
       1445  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
         19  rows processed

SQL>

This is java application so to do pagination they have coded rownum < 20.

I appreciate if you can help me how to make this query run faster. I have tried creating materialized view with fast refresh but it has a limitation to have unique constraint on column of inner table which is in outer join so I could not do that. Please bear with my ignorance can I use parallel option to make it faster? I don't know how to use parallel in multiple join like this I did not find any example in oracle documentation also.

Thank you for your help as always.

Regards
Arya

April 19, 2005 - 1:13 pm UTC

parallel is NOT a way to get results subsecond!

there is no order by so the entire rownum < 20 doesn't make sense, you are getting 20 random rows and no way to get the NEXT 20 random ones.

get rid of that parallel.

Are you sure you need that "distinct" in there, having to do this:

10 9 SORT (UNIQUE) (Cost=827 Card=218020 Bytes=1744160)
11 10 TABLE ACCESS (FULL) OF 'CONTACT_PERSON_ROLES' (Cost=271
Card=218020 Bytes=1744160)

before you can get row #1 is going to be *something expensive*

You are great

Arya, April 19, 2005 - 3:29 pm UTC

Hi Tom,

I have discussed with the my lead for removing distinct and we are testing it in the mean while I have tested with out distinct it is performing very well.

Thanks for pointing out.

Regards
Arya

inline query

Arya, April 19, 2005 - 4:50 pm UTC

Hi Tom,

I have rewritten the query as inline and it is much faster now.

Here is the autotrace out put.

SELECT /*+ FIRST_ROWS */
cp.cp_id
,Initcap(nvl(trim(both ' ' from full_name), cp.last_name || ', ' || cp.first_name)) FULL_NAME
,cp.last_name
,cp.first_name
,cp.middle_name
,cv.Address_Line_1 Address_Line_1
,cv.address_line_2 Address_Line_2
,(case when (trim(both ' ' from address_line_1) is null
and trim(both ' ' from address_line_2) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
when (trim(both ' ' from address_line_2) is null
and trim(both ' ' from address_line_1) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual))
when (trim(both ' ' from address_line_1) is not null
and trim(both ' ' from address_line_2) is not null)
then trim(both ' ' from (select elease_util.initcap2(address_line_1) from dual)) || ', ' || trim(both ' ' from (select elease_util.initcap2(address_line_2) from dual))
else null end) FULL_ADDRESS
,Initcap(cv.city) city
,cv.state
,cv.zip_cd
,(case when email like '%@%' then email else null END) email
,cv.phone_nbr
,cv.fax_nbr
,cv.cv_comment
--,org.org_id
,cv.cv_id
,cp.oid_user_name
,(SELECT DISTINCT cpr.org_id
FROM contact_person_roles cpr
WHERE cpr.cp_id = cp.cp_id
AND rownum = 1) org_id
,(SELECT DISTINCT org.name
FROM organizations org
WHERE org_id = org.org_id
AND ROWNUM = 1) org_name
FROM contact_persons cp
,contact_vehicles cv
WHERE cp.cp_id = cv.cp_id(+)
AND ROWNUM < 20
/
Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=442347 Car
d=19 Bytes=2185)

1 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
2 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
3 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
4 0 TABLE ACCESS (FULL) OF 'DUAL' (Cost=17 Card=8168)
5 0 COUNT (STOPKEY)
6 5 TABLE ACCESS (FULL) OF 'ORGANIZATIONS' (Cost=2 Card=5 By
tes=145)

7 0 COUNT (STOPKEY)
8 7 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_PERSON_ROLES'
(Cost=3 Card=1 Bytes=8)

9 8 INDEX (RANGE SCAN) OF 'CPR_UK' (UNIQUE) (Cost=2 Card=1
)

10 0 COUNT (STOPKEY)
11 10 NESTED LOOPS (OUTER) (Cost=442347 Card=218700 Bytes=2515
0500)

12 11 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_PERSONS' (Co
st=4947 Card=218700 Bytes=6342300)

13 12 INDEX (FULL SCAN) OF 'CP_PK' (UNIQUE) (Cost=487 Card
=218700)

14 11 TABLE ACCESS (BY INDEX ROWID) OF 'CONTACT_VEHICLES' (C
ost=2 Card=1 Bytes=86)

15 14 INDEX (RANGE SCAN) OF 'CV_CP_FK_I' (NON-UNIQUE) (Cos
t=1 Card=1)

Statistics
----------------------------------------------------------
394 recursive calls
0 db block gets
1335 consistent gets
4 physical reads
0 redo size
4885 bytes sent via SQL*Net to client
514 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
19 rows processed

Thanks for all your help.

Regards
Sarma

Outer Join Vs Regular join

Mita, April 19, 2005 - 5:37 pm UTC

create table issuer(iss_id number not null, iss_name varchar2(100)
primary key pk_iss iss_id)

create table sec (secid nubmer not null,
iss_id nubmer not null,
sec_name varchar2(100)
primary key sec_id
foreign key iss_id references pk_iss);

given the above table structure,
how will be performance diff in following cases

1. select b.iss_nm, a.sec_nm
from sec, issuer
where sec.iss_id = issuer.iss_id(+)

2. select b.iss_nm, a.sec_nm
from sec, issuer
where sec.iss_id = issuer.iss_id

April 19, 2005 - 7:58 pm UTC

why does it matter?

If YOU NEED an outer join
then
you need it
else
you don't
end if

for such simple queries they would perform more or less the same, but again, the answer is "which one do you need"

Outer Join

Mita, April 20, 2005 - 11:40 am UTC

this is not the complete query. but what I want to know is, if you are sure that for each iss_id in table sec you will find record in iss table (it has foreign key reference to iss table), changing the outer join to regular join will change the performance or not.

as such we don't need outer join, but this is existing code and we are suppose to tune it for better performance. but if it is not going to make much different we can do something else to tune up the performance.

April 20, 2005 - 9:04 pm UTC

you need a fkey AND not null constraint to be sure

if you have both, then no outer join

else you can only answer the question "do i need the outer join, is the question I'm asking needing it"

it is really hard to tune sql withing KNOWING the question you are trying to answer ;)

Von, October 06, 2005 - 6:03 pm UTC

Tom,
Sorry for posting this question here...i am in dire need to get this thing to work..

i have the following table

table ord_fct

item desc fymth fywk units_ord units_shp
---------------------------------------------------------
calc calculator 199912 199950 5 2
calc calculator 199912 199949 6 3
calc calculator 199911 199947 4 4
calc calculator 199910 199943 6 5
mod modem 199912 199950 5 6
mod modem 199910 199942 10 6

the following is the sample data from date table
date

fymth fywk
--------------
199910 199940
199910 199941
199910 199942
199910 199943
199911 199944
199911 199945
199911 199946
199911 199947
199911 199948
199912 199949
199912 199950
199912 199951
199912 199952

I need to do an outer join to get the following result..i tried union all not much luck
desired output

item desc fymth fywk units_ord units_shp
--------------------------------------------------------

calc calculator 199910 199940 0 0
calc calculator 199910 199941 0 0
calc calculator 199910 199942 0 0
calc calculator 199910 199943 6 5
calc calculator 199911 199944 0 0
calc calculator 199911 199945 0 0
calc calculator 199911 199946 0 0
calc calculator 199911 199947 4 4
calc calculator 199911 199948 0 0
calc calculator 199912 199949 6 3
calc calculator 199912 199950 5 2
calc calculator 199912 199951 0 0
calc calculator 199912 199952 0 0
mod modem 199910 199940 0 0
mod modem 199910 199941 0 0
mod modem 199910 199942 10 6
mod modem 199910 199943 0 0
mod modem 199911 199944 0 0
mod modem 199911 199945 0 0
mod modem 199911 199946 0 0
mod modem 199911 199947 0 0
mod modem 199911 199948 0 0
mod modem 199912 199949 0 0
mod modem 199912 199950 5 6
mod modem 199912 199951 0 0
mod modem 199912 199952 0 0

Any help would be greatly appreciated

October 06, 2005 - 6:57 pm UTC

create tables... insert intos.... won't even look at it without them.

That - and explain in text what the output logic it - it is really hard for me to figure out what the "algorithm" is.

A reader, October 06, 2005 - 7:15 pm UTC

sorry about that tom..here they are

create table ord_fct
(item varchar2(10),
desc_re varchar2(20),
fymth number,
fywk number,
units_ord number,
units_shp number);

create table date_tab
(
fymth number,
fywk number
);

insert into ord_fct values ('calc', 'calculator', 199912, 199950, 5, 2);
insert into ord_fct values ('calc', 'calculator', 199912, 199949, 6, 3);
insert into ord_fct values ('calc', 'calculator', 199911, 199947, 4, 4);
insert into ord_fct values ('calc', 'calculator', 199910, 199943, 6, 5);
insert into ord_fct values ('mod', 'modem', 199912, 199950, 5, 6);
insert into ord_fct values ('mod', 'modem', 199910, 199942, 10, 6);
commit;

insert into date_tab values( 199910, 199940 );
insert into date_tab values( 199910, 199941 );
insert into date_tab values( 199910, 199942 );
insert into date_tab values( 199910, 199943 );
insert into date_tab values( 199911, 199944 );
insert into date_tab values( 199911, 199945 );
insert into date_tab values( 199911, 199946 );
insert into date_tab values( 199911, 199947 );
insert into date_tab values( 199911, 199948 );
insert into date_tab values( 199912, 199949 );
insert into date_tab values( 199912, 199950 );
insert into date_tab values( 199912, 199951 );
insert into date_tab values( 199912, 199952 );
commit;

The o/p i need units_ord and units_shp for items across all time periods no matter they have values for that time period or not( if they dont have then 0)..

October 06, 2005 - 7:52 pm UTC

I don't understand what "199940" is as a time period?

how big is your real date_tab

Von, October 06, 2005 - 8:05 pm UTC

the date_tab table has around 5000 rows,,,
fymth - month(fiscal)
fywk - week(fiscal)
thanks

October 06, 2005 - 8:58 pm UTC

last question


version.....

this works in 9i:

ops$tkyte@ORA10G> with dims
  2  as
  3  (select distinct date_tab.*, ord_fct.item, ord_fct.desc_re
  4     from date_tab, ord_fct )
  5  select dims.*, nvl(ord_fct.units_ord,0) uord, nvl(ord_fct.units_shp,0) ushp
  6    from dims left outer join ord_fct on (dims.fymth = ord_fct.fymth and dims.fywk = ord_fct.fywk
  7                                          and dims.item = ord_fct.item and dims.desc_re = ord_fct.desc_re )
  8   order by dims.item, dims.desc_re, dims.fymth, dims.fywk
  9  /
 
     FYMTH       FYWK ITEM       DESC_RE                    UORD       USHP
---------- ---------- ---------- -------------------- ---------- ----------
    199910     199940 calc       calculator                    0          0
    199910     199941 calc       calculator                    0          0
    199910     199942 calc       calculator                    0          0
    199910     199943 calc       calculator                    6          5
    199911     199944 calc       calculator                    0          0
    199911     199945 calc       calculator                    0          0
    199911     199946 calc       calculator                    0          0
    199911     199947 calc       calculator                    4          4
    199911     199948 calc       calculator                    0          0
    199912     199949 calc       calculator                    6          3
    199912     199950 calc       calculator                    5          2
    199912     199951 calc       calculator                    0          0
    199912     199952 calc       calculator                    0          0
    199910     199940 mod        modem                         0          0
    199910     199941 mod        modem                         0          0
    199910     199942 mod        modem                        10          6
    199910     199943 mod        modem                         0          0
    199911     199944 mod        modem                         0          0
    199911     199945 mod        modem                         0          0
    199911     199946 mod        modem                         0          0
    199911     199947 mod        modem                         0          0
    199911     199948 mod        modem                         0          0
    199912     199949 mod        modem                         0          0
    199912     199950 mod        modem                         5          6
    199912     199951 mod        modem                         0          0
    199912     199952 mod        modem                         0          0
 
26 rows selected.


but not in 8i and in 10g there is a better answer using partitioned outer joins.

A reader, October 06, 2005 - 10:22 pm UTC

excellent solution tom...
thanks a lot....is the following possible??
i tried group by rollup..but i got more rows than i actually needed..

Item desc_re fymth fywk UORD USHP
calc calculator 199910 6 5 ----> total
199940 0 0
199941 0 0
199942 0 0
199943 6 5
calc calculator 199911 4 4
199944 0 0
199945 0 0
199946 0 0
199947 4 4
199948 0 0
calc calculator 199912 11 5
199949 6 3
199950 5 2
199951 0 0
199952 0 0
mod modem 199910 10 6
199940 0 0
199941 0 0
199942 10 6
199943 0 0
mod modem 199911 0 0
199944 0 0
199945 0 0
199946 0 0
199947 0 0
199948 0 0
mod modem 199912 5 6
199949 0 0
199950 5 6
199951 0 0
199952 0 0

October 07, 2005 - 8:14 am UTC

try again, i cannot read this

Von, October 07, 2005 - 8:50 am UTC

Sorry..here it is

Item desc_re fymth fywk UORD USHP
calc calculator 199910 6 5 ----> total
199940 0 0
199941 0 0
199942 0 0
199943 6 5
calc calculator 199911 4 4 ----> total
199944 0 0
199945 0 0
199946 0 0
199947 4 4
199948 0 0
calc calculator 199912 11 5 ----> total
199949 6 3
199950 5 2
199951 0 0
199952 0 0
mod modem 199910 10 6 ----> total
199940 0 0
199941 0 0
199942 10 6
199943 0 0
mod modem 199911 0 0 ----> total
199944 0 0
199945 0 0
199946 0 0
199947 0 0
199948 0 0
mod modem 199912 5 6 ----> total
199949 0 0
199950 5 6
199951 0 0
199952 0 0

October 07, 2005 - 9:25 am UTC

ops$tkyte@ORA10G> with dims
  2  as
  3  (select distinct date_tab.*, ord_fct.item, ord_fct.desc_re
  4     from date_tab, ord_fct )
  5  select decode( grouping_id(dims.fywk), 1, dims.item ) item,
  6         decode( grouping_id(dims.fywk), 1, dims.desc_re ) desc_re,
  7             decode( grouping_id(dims.fywk), 1, dims.fymth ) fymth,
  8             dims.fywk,
  9         sum( nvl(ord_fct.units_ord,0) ) uord,
 10             sum( nvl(ord_fct.units_shp,0) ) ushp
 11    from dims left outer join ord_fct on
 12    (dims.fymth = ord_fct.fymth
 13     and dims.fywk = ord_fct.fywk
 14     and dims.item = ord_fct.item
 15     and dims.desc_re = ord_fct.desc_re )
 16   group by grouping sets( (dims.item, dims.desc_re, dims.fymth, dims.fywk ),
 17                           (dims.item, dims.desc_re, dims.fymth ) )
 18   order by dims.item, dims.fymth, grouping_id(dims.fywk) DESC, dims.fywk
 19  /
 
ITEM       DESC_RE                   FYMTH       FYWK       UORD       USHP
---------- -------------------- ---------- ---------- ---------- ----------
calc       calculator               199910                     6          5
                                               199940          0          0
                                               199941          0          0
                                               199942          0          0
                                               199943          6          5
calc       calculator               199911                     4          4
                                               199944          0          0
                                               199945          0          0
                                               199946          0          0
                                               199947          4          4
                                               199948          0          0
calc       calculator               199912                    11          5
                                               199949          6          3
                                               199950          5          2
                                               199951          0          0
                                               199952          0          0
mod        modem                    199910                    10          6
                                               199940          0          0
                                               199941          0          0
                                               199942         10          6
                                               199943          0          0
mod        modem                    199911                     0          0
                                               199944          0          0
                                               199945          0          0
                                               199946          0          0
                                               199947          0          0
                                               199948          0          0
mod        modem                    199912                     5          6
                                               199949          0          0
                                               199950          5          6
                                               199951          0          0
                                               199952          0          0
 
32 rows selected.

Von

A reader, October 07, 2005 - 1:46 pm UTC

execelletn as usual
thanks tom

Bratt, May 07, 2006 - 12:40 pm UTC

Say the response time by using scalar subqueries and outer join is the same..which method would you prefer?
scalar subquery or outer join?

May 07, 2006 - 1:09 pm UTC

outer join - syntax is easier.

You would use the scalar subquery "trick" to enhance the performance of an outer join that must return its first rows "as fast as possible" - usually this inludes aggregation or something as well.

Bratt, May 07, 2006 - 5:38 pm UTC

If the Select SQL is part of an insert block, then outer join would be better than Scalar Subquery right?

May 08, 2006 - 7:47 am UTC

in all probability, sure - using the CBO.

Bratt

A reader, May 09, 2006 - 7:54 am UTC

Thanks Tom

A reader, October 30, 2006 - 8:56 am UTC

Why is DIM1 invisible to the inline view of the scalar subquery as below:

SELECT ( SELECT dummy dim2
FROM ( SELECT dummy
FROM DUAL
WHERE dummy = dim1 ))
FROM ( SELECT dummy dim1
FROM DUAL )

October 30, 2006 - 9:38 am UTC

because they only go one level down in scalar subqueries. it just "is that way". I don't have a reason for it, it just "is"

Outer Join in Scalar Subqueries

Sokrates, December 17, 2009 - 5:23 am UTC

with
x as
(select 1 a from dual),
y as
(select 1 b, 'X' c from dual),
z as
(select 4 e, 'Y' f from dual)
select
  x.a,
  (select y.c
   from y, z
   where y.b=x.a
   and z.e(+)=x.a) g
from x;

         A G
---------- -
         1

oops, I expected

A G
---------- -
1 X

why does the scalar subquery doesn't return a row ?

December 17, 2009 - 10:19 am UTC

select ..., (select y.c
from y, z
where y.b=x.a
and z.e(+)=x.a)
from x;

is just like:

select y.c
from y, z
where y.b=:b1
and z.e(+)=:b2

there is no join between Y and Z, hence there is nothing "outer" about it. "outer" is between two tables.

just like

select * from dual where dummy(+) = 'Y';

returns zero rows. The outer join to Z returns zero rows and then you do an inner join to Y of that result.

thanks

Sokrates, December 18, 2009 - 1:09 am UTC

A reader, September 06, 2010 - 5:40 am UTC

Hi Tom,
In your book 'Effective Oracle by design' - you mentioned that you use 'scalar subqueries' to remove outer joins when we want the small set of rows quickly .

Can't we gain the same performance by using the Nested loop join method(hint) in the same outer join ? Does the Optimizer never use 'Nested loops' for Outer joins?

Many thanks

September 09, 2010 - 7:56 pm UTC

it can use nested loops yes. however, it is usually still not practical in many cases. when you through in things like "order by" - it typically will feel compelled to get to the last row before the first row can be returned.

Take this schema:

drop table t1;
drop table t2;

create table t1 as select * from all_users;
alter table t1 add constraint t1_pk primary key(username);
create table t2 as select * from all_objects;
create index t2_idx on t2(owner);

exec dbms_stats.gather_table_stats( user, 'T1' );
exec dbms_stats.gather_table_stats( user, 'T2' );

Now, report on the first two users sorted by username with their object count, you could:

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from (
  2  select /*+ first_rows(2) */ t1.username, count(t2.object_name)
  3    from t1 left outer join t2
  4      on (t1.username = t2.owner)
  5   group by t1.username
  6   order by t1.username
  7  )
  8  where rownum <= 2
  9  /

Execution Plan
----------------------------------------------------------
Plan hash value: 1469972403

------------------------------------------------------------------------------------------
| Id  | Operation                       | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |        |     2 |    60 |     3   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                  |        |       |       |            |          |
|   2 |   VIEW                          |        |     2 |    60 |     3   (0)| 00:00:01 |
|   3 |    SORT GROUP BY NOSORT         |        |     2 |    80 |     3   (0)| 00:00:01 |
|   4 |     NESTED LOOPS OUTER          |        |     2 |    80 |     3   (0)| 00:00:01 |
|   5 |      INDEX FULL SCAN            | T1_PK  |     1 |     9 |     1   (0)| 00:00:01 |
|   6 |      TABLE ACCESS BY INDEX ROWID| T2     |     2 |    62 |     2   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN          | T2_IDX |     3 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=2)
   7 - access("T1"."USERNAME"="T2"."OWNER"(+))

ops$tkyte%ORA11GR2> select /*+ first_rows(2) */ t1.username, (select count(*) from t2 where t2.owner = t1.username)
  2    from (select username
  3            from t1
  4                   order by username ) t1
  5   where rownum <= 2
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 679100844

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |     2 |    34 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |        |     1 |     6 |            |          |
|*  2 |   INDEX RANGE SCAN| T2_IDX |  2051 | 12306 |     5   (0)| 00:00:01 |
|*  3 |  COUNT STOPKEY    |        |       |       |            |          |
|   4 |   VIEW            |        |     2 |    34 |     1   (0)| 00:00:01 |
|   5 |    INDEX FULL SCAN| T1_PK  |     2 |    18 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T2"."OWNER"=:B1)
   3 - filter(ROWNUM<=2)

if you look at the first query, even though it is nested loops, you have to wait for the last row to get the first row - where as with the 2nd query, we get the first row almost immediately.. (since t1 is small, the top-n query is very fast and the retrieval of the two count(*)'s is relatively fast....)

full outer join with Analytics.

A reader, September 27, 2010 - 11:00 am UTC

Hi Tom,

We are using Oracle 10.2.0.2.0. Please have a look at the below query and the plan.

SELECT NVL (b.service_id, a.service_id) AS service_id, NVL (b.ab, a.ab) AS ab
FROM x_tab b FULL OUTER JOIN
(SELECT service_id, ab from
(SELECT service_id, ab,
row_number () OVER (PARTITION BY service_id ORDER BY service_id DESC) rr
FROM x_tab_x)where rr =1 ) a
ON a.service_id = b.service_id

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 240 | 14 (15)| 00:00:01 |
| 1 | VIEW | | 4 | 240 | 14 (15)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | HASH JOIN OUTER | | 3 | 219 | 8 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL | X_TAB | 2 | 60 | 3 (0)| 00:00:01 |
|* 5 | VIEW | | 3 | 129 | 4 (25)| 00:00:01 |
|* 6 | WINDOW SORT PUSHED RANK| | 3 | 90 | 4 (25)| 00:00:01 |
| 7 | TABLE ACCESS FULL | X_TAB_X | 3 | 90 | 3 (0)| 00:00:01 |
|* 8 | HASH JOIN ANTI | | 1 | 43 | 7 (15)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | X_TAB_X | 3 | 90 | 3 (0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | X_TAB | 2 | 26 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("SERVICE_ID"(+)="B"."SERVICE_ID")
5 - filter("RR"(+)=1)
6 - filter(RANK() OVER ( PARTITION BY "SERVICE_ID" ORDER BY "SERVICE_ID"
DESC )<=1)
8 - access("SERVICE_ID"="B"."SERVICE_ID")
9 - filter(RANK() OVER ( PARTITION BY "SERVICE_ID" ORDER BY "SERVICE_ID"
DESC )=1)

however we want the result of the subquery where RR = 1 and then have the full outer join. Here the condition RR = 1 seems to be added after the join in outer join mode and hence some records are being supressed (RN is null after outer join ). Please suggest a way to overcome this. Also please let us know why the condition RR = 1 is outer joined.

September 27, 2010 - 12:27 pm UTC

put the code tags around that plan - you lose all indenting without it.

It seems right to me - if RN (think you mean RR), that just means there was no row in the (A) query to join to - there was no matching A.SERVICE_ID for some B.SERVICE_ID and we "made up" a row - that is what outer joins do??

Outer Join and Analytics.

A reader, September 27, 2010 - 12:37 pm UTC

Hi Tom,

Why is the RR= 1 not taken first while evaluating the inline view rather its taken as RR (+) = 1. we do not want the outer join on RR.

What should we do so that the inner query (inline view) is evaluated first with RR = 1 (and not RR(+) = 1) and then the full outer join to happen.

September 27, 2010 - 12:43 pm UTC

it has to be!!! you are asking for an outer join!!!!!

we merged the view here - you are OUTER JOINING to a set defined by "where rr=1", hence we take the set (with RR in it) and then outer join to it - if rr=1 does NOT EXIST for some service_id - we make up a row for it (that is the entire sole goal of the outer join you asked for!!!!)

It is doing the right thing.

Suppose it did what you ask - it evaluated the rr=1 first. What would happen if a serivce_id in (a) did not exist????

answer: it would make up a row, assign rr=null to it and join to that - because you are doing an outer join.

In other words - the answer would be EXACTLY the same.

Outer Join and Analytics

A reader, September 27, 2010 - 1:36 pm UTC

Thanks a lot Tom , for explaining.
But how do we materialize the subquery so that RR = 1 doesn't get view merged. So that the subquery behaves as a table and we don't get the RR (+) = 1 . The problem is not with RR column getting null values rather its suppressing the record with null RR. Outer join would not supress records.

Strangely we would get the required values when we write RR=1 or RR is null instead of only RR=1 in the subquery.

September 27, 2010 - 1:47 pm UTC

I don't understand, the answer should be the same regardless of whether the (A) view was materialized or merged.

Do this

create the tables for us (give us the creates)

put in a row or two of data (that would lead to a "wrong" answer in your opinion)

and then explain what the "right" answer should be.

A reader, September 27, 2010 - 3:43 pm UTC

Hi Tom,

why optimizer choose rank() over row_number() in the above plan?

Is it because column in "partition by" clause and "order by" clause is same?

Thanks

September 27, 2010 - 5:40 pm UTC

that is interesting, I did not look at the plan really, was concentrating on the question.

to the original poster - can we have the full schema for this - I'd like to reproduce that

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Excellent Response!!!

Index on scalar type columns

Thanks Reader!

Thanks!!!

For "a reader"

Remaining part

and t1.c <op> <value>

View vs query

To "A reader "

outer join vs scalar subquery

outer join vs scalar subquery

You are great

inline query

Outer Join Vs Regular join

Outer Join

Von

Bratt

Outer Join in Scalar Subqueries

thanks

full outer join with Analytics.

Outer Join and Analytics.

Outer Join and Analytics

More to Explore

PL/SQL demos

PL/SQL docs

Outer Join vs Scalar Subqueries

Breadcrumb

Question and Answer

You Asked

and Tom said...

Rating

Comments

Excellent Response!!!

Index on scalar type columns

Thanks Reader!

Thanks!!!

For "a reader"

Remaining part

and t1.c <op> <value>

View vs query

To "A reader "

outer join vs scalar subquery

outer join vs scalar subquery

You are great

inline query

Outer Join Vs Regular join

Outer Join

Von

Bratt

Outer Join in Scalar Subqueries

thanks

full outer join with Analytics.

Outer Join and Analytics.

Outer Join and Analytics

More to Explore

PL/SQL demos

PL/SQL docs