Home>Question Details



Nag -- Thanks for the question regarding "Outer Join why is it so expensive?", version 8.1.7

Submitted on 31-Aug-2002 20:55 Central time zone
Last updated 9-Dec-2008 12:06

You Asked

select columns

from 

tab1    (1000 rows)
tab2      (800 rows)
tab3      (299 rows)

where tab1.primarykey = tab2.primarykey and 
tab23.primarykey =tab2.primarykey(+);

The above is one of the scenarios I'm facing

Tom I have observed that when I removed theouter join the rows returned are 297 and the 
rows returned with the outer join is 299.

But when I remove the outer join , the query finishes execution in less than 2 seconds, 
and with the outer join it takes 70 seconds.


Why is it that outer join always proves so expensive? what makes it so expensive.

( I used the selecting from a select in my columns, to avoid the outer join, but in the 
first place why is the outer join so expensive, as I asked above what makes it so 
expensive)

Please illustrate your answer.


Thanks
 

and we said...

It is an anti-join of sorts.

It forces a specific order of processing tables.  Since your example makes no sense to me 
(you join t1 to t2 normally, then you join t23 what is that! to t2 using an outer 
join.  it NEVER makes sense to outer join and natural join to the same table)

We'll do this:

t1
t2
t3

select * 
  from t1,t2,t3
 where t1.x = t2.x and t2.y = t3.y(+);


Now, T3 cannot be used as the driving table in this query by definition.  You are 
asking for all rows from T1 join T2 and if they happen to have a match in T3, give that 
at well.  So, T3 cannot be used to "drive" the query -- even if T3 would be the best 
choice (this is true because we might have to "make up rows" in T3 for the outer join and 
we'd never know we needed to make them up if we used T3 as the driving table)


Now, turn that into:


select * 
  from t1,t2,t3
 where t1.x = t2.x and t2.y = t3.y;


and we've added in new paths for the optimizer to take -- paths that were unavailable 
with the outer join.  In your case -- a patch that is available with the natural join 
that wasn't available with the outer join is significantly faster.

It is not that outer joins are ALWAYS slower -- it is that outer joins preclude the 
consideration of many access plans and therefore stand a chance of being significantly 
slower. 

Reviews    
5 stars oops   September 1, 2002 - 10am Central time zone
Reviewer: Nag 
The below is the correct query

select columns

from 

tab1    (1000 rows)
tab2      (800 rows)
tab3      (299 rows)
tab4      (100 rows)

where tab1.primarykey = tab2.primarykey and 
tab2.primarykey(+) =tab3.primarykey  and
tab4.category='SWE';

If you carefully observe above I dont have a join condition for the tab4. And yes it does a 
cartesian product. But still even with this handicap when I remove the outer join the query 
executes very fast.

In the example you have given I understand that the table with the outer join cannot be the driving 
table, but that is not the case with me, I dont have it as the driving table.


Before we go any further, can you explain the itnernals of outer join, how do you make up the rows 
for the table which is lacking rows in a outer join. Please explain.

 


Followup   September 1, 2002 - 2pm Central time zone:

Well, having

where t1.pk = t2.pk 
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'

is just bad code on your part.

Think about it -- if:

  t2.pk(+) = t3.pk 

is *needed*, then:

  t1.pk = t2.pk 

will just remove it!!! You have just totally wasted cycles, reduced the available plans.


You could never OUTER join to T2 and have that row in the actual result set due to the NATURAL join 
to t1!!!!

Here, you just wrote bad code.  

The outer join here was just a plain old BAD idea -- and the above argument about driving tables 
and reduced opportunities for plans applies 1000%


As for the last part -- not sure what you are getting at.  If you don't understand the basic 
semantics of an outer join, you shouldn't even be using it.

outer join 101:

you have two tables -- emp and dept.  dept has 4 rows (deptno = 10, 20, 30, 40).  emp has 14 rows 
but only 3 distinct values for deptno (10,20,30).

You need to write a report that shows the DEPTNO and count of employees for ALL departments.  This 
requires an OUTER JOIN since the natural join:

   select d.deptno, count(empno)
     from emp e, dept d
    where e.deptno = d.deptno
    group by d.deptno;

would "lose" deptno = 40.  So, we:

   select d.deptno, count(empno)
     from emp e, dept d
    where e.deptno(+) = d.deptno
    group by d.deptno;

and that simply means "use DEPT as the driving table, for each row in DEPT, find and count all of 
the EMPNOS we find.  IF there isn't a match in EMP for a given DEPTNO -- then "make up" a record in 
EMP with all NULL values and join that record -- so we don't drop the dept record"




 

3 stars Probably   September 1, 2002 - 5pm Central time zone
Reviewer: A reader from Bangalore
Probably,
where tab1.primarykey = tab3.primarykey and 
tab2.primarykey(+) =tab3.primarykey  and
tab4.category='SWE';
could give a more meaningful result. 


Followup   September 2, 2002 - 9am Central time zone:

how can you say that?

It gives a wholly DIFFERENT result, but to say "more meaningful?"  No, it would give a wrong result 
if the original predicate was the one needed to get the answer. 

5 stars How   September 1, 2002 - 5pm Central time zone
Reviewer: Nag 
Tom , I'm confused

Please expand on the following , and tell me how will waste cycles occur ...

Think about it -- if:

  t2.pk(+) = t3.pk 

is *needed*, then:

  t1.pk = t2.pk 

will just remove it!!! You have just totally wasted cycles, reduced the 
available plans.

And how should in your opinion this should be rewritten..



Thanks 


Followup   September 2, 2002 - 10am Central time zone:

If we needed to use  t3.pk = t2.pk(+) -- then t2.pk will be NULL and therefore t1.pk = t2.pk will 
not be satisified.  HENCE, when we actually outer join t2 to t3 and "make up a row in t2", we also 
immediately turn around and throw it out.

THEREFORE, the results of the queries:

where t1.pk = t2.pk 
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'


and

where t1.pk = t2.pk 
  and t2.pk = t3.pk
  and tb.c = 'SWE'


are identitical.  By adding the (+) to the first one, all you did was remove many different 
possible execution plans.  And given all of your other comments about the performance of the query 
with and without the (+) you removed the PLAN THAT ACTUALLY WORKS BEST from even being considered.


Anytime -- anytime -- you see:


  where t1.x = t2.x(+)
    and t2.any_column = any_value


you know that you can (should, must, be silly not to) remove the (+) from the query.  Because if we 
"make up a NULL row for t2" then we KNOW that t2.any_column cannnot be equal to any_value (it is 
NULL after all!!) 

5 stars outer join   September 1, 2002 - 5pm Central time zone
Reviewer: Nag 
I asked about outer join because sometimes it is fruitful to go back to basics. It suddenly gives 
you a new perspective on a problem. 


4 stars outer join   September 2, 2002 - 1am Central time zone
Reviewer: parvez from india
select columns
from 
tab1    (1000 rows)
tab2      (800 rows)
tab3      (299 rows)
tab4      (100 rows)

where tab1.primarykey = tab2.primarykey and 
tab2.primarykey(+) =tab3.primarykey  and
tab4.category='SWE';

If u have one record in tab1 which is not there in tab2 the first join will not select the record 
so outer joining tab2 with tab3 is not needed. 


4 stars A Reader   September 2, 2002 - 4am Central time zone
Reviewer: Rahul from India
Hi Tom

As I think statement below

where t1.pk = t2.pk 
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'

should be like 

where t1.pk = t2.pk(+)
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'

Pls. tell me am i roght or not??

Regards
Rahul 


Followup   September 2, 2002 - 10am Central time zone:

I cannot -- only the original poster of the question can answer that.

Those two predicates answer two entirely different questions.  One would need to know the question 
before saying what the answer was.

I can only say that

where t1.pk = t2.pk 
  and t2.pk = t3.pk
  and tb.c = 'SWE'

is the same as 

where t1.pk = t2.pk 
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'

So if they got the answer right with their predicate, just dropping the outer join gets the SAME 
answer -- faster. 

5 stars rewritten   September 2, 2002 - 12pm Central time zone
Reviewer: Nag 
select columns

from 

tab1    (1000 rows)
tab2      (800 rows)
tab3      (299 rows)
tab4      (100 rows)

where tab1.primarykey = tab2.primarykey and 
tab2.primarykey(+) =tab3.primarykey  and
tab4.category='SWE';


I have rewrote the above as 

select columns

from 

tab1    (1000 rows)     ...lookup table 1
tab2      (800 rows)    ....lookup table 2
tab3      (299 rows)     ...... loan table
tab4      (100 rows)     ...... category table

where tab3.primarykey = tab1.primarykey and 
tab3.primarykey =tab2.primarykey(+)  and
tab4.category='SWE';

The objective of using the outer join in the above rewrittedn query is to return rows from tab3 
even if a matching row is lacking in tab2.

Without the outer join it returns 297 rows, and with the outer join in returns 299 rows.

Does this rewritten query make any sense.


 


Followup   September 2, 2002 - 1pm Central time zone:

If that is the answer to your question - then yes.  But it is a wholly different query then you 
started with!!!  You are joining totally different tables together.

All of the queries "made sense".  Only you can tell us if you have the relations proper now. 

5 stars   September 2, 2002 - 1pm Central time zone
Reviewer: Nag 
select columns

from 

tab1    (1000 rows)     ...lookup table 1
tab2      (800 rows)    ....lookup table 2
tab3      (299 rows)     ...... loan table
tab4      (100 rows)     ...... category table

where tab3.primarykey = tab1.primarykey and 
tab3.primarykey =tab2.primarykey(+)  and
tab4.category='SWE'


In the above case , when the first join succeeds , and then the second join fails , then outer join 
comes into force and creates a null record and still returns the record even if the tab2 does not 
contain a row to satisfy the the second join. That is how I understand it from the above 
discussion.

Now when the difference between having the outer join and not having the outer join is just 2 rows. 
But as I have been saying the taken is lot more when we have a outer join.


I have actually totally avoided the outer join by using.

select columns,
tab3.desc_key,
(select tab2.description from tab2 where where tab2.desc_key=tab3.desc_key ),
columns

from 

tab1    (1000 rows)     ...lookup table 1
--tab2      (800 rows)    ....lookup table 2
tab3      (299 rows)     ...... loan table
tab4      (100 rows)     ...... category table

where tab3.primarykey = tab1.primarykey and 
--tab3.primarykey =tab2.primarykey(+)  and
tab4.category='SWE'

I have used this trick after learning it from your website.


But still I dont understand why it should take so much time to create two null records when using 
the outer join. I know you have spent enough time on this thread, but still if you can come up with 
an illustration which can more explicitly explain the scenario .. please kindly do so.
 


Followup   September 2, 2002 - 1pm Central time zone:

ARG -- 

compare the plans with and without the outer join.

As I've said (many times) -- the driving table is getting changed here BECAUES of the outer join 
and its requirements.

Look at the plan, it should be obvious. 

5 stars Very Very Very Very Useful   September 25, 2002 - 10am Central time zone
Reviewer: Kiran from India
Hi Tom,

I now really understand way outer join actually works and how it affects the query's execution 
path.

I thank you for providing us with the insight.

U r the "TRUE" Oracle Guru :-)

Thanks 
Kiran


 


5 stars does this make sense?   January 29, 2004 - 9pm Central time zone
Reviewer: A reader 
hi Tom,

does this outer join make sense?

select ...
from t1,t2,t3
where t1.pk = t2.pk1
  and t2.pk1 = t3.pk1(+)
  and t2.pk2 = t3.pk2(+)
  and t2.x = 'any value'

thanks in advance!

steve

 


4 stars Outer join   June 1, 2004 - 5am Central time zone
Reviewer: shgoh from Malaysia
Dear Tom,

   Just to clarify some doubts on the usage of outer join from your replied. 

From ur replied.
where t1.pk = t2.pk 
  and t2.pk = t3.pk
  and tb.c = 'SWE'

is the same as 

where t1.pk = t2.pk 
  and t2.pk (+) = t3.pk
  and tb.c = 'SWE'

How about the following statment?
pol table consists of 1,2,3,4 value for po_line_id
pll table consists of 1,2,3 value for po_line_id

select pol.po_line_id
from pol,pll
WHERE ( pol.po_line_id (+) = pll.po_line_id ) 
AND ( :pol_po_line_id = pll.po_line_id) 

is the same as 

select pol.po_line_id
from pol,pll
WHERE ( pol.po_line_id = pll.po_line_id ) 
AND ( :pol_po_line_id = pll.po_line_id) 

Am I right to say that ? Thanks.

Rgds
Goh 


Followup   June 1, 2004 - 8am Central time zone:

nope, thats wrong.



the general statement is:

   from t1, t2
  where t1.key = t2.key(+)
    and <predicate on t2 using anything other than NULL checks>

is the same as

   from t1, t2
  where t1.key = t2.key
    and .......


that is, 

  from t1, t2 where t1.key = t2.key(+) and t2.key > 5;


the outer join to t2.key is meaningless.  If we outer join to t2 -- t2.key is NULL.  NULL is never 
greater than 5.  Hence the (+) is not needed and implies the coder didn't understand fully what 
they were doing.



In your case, you hav:

  from pol, pll
  where pll.key = pol.key(+)
    and pll.key = CONSTANT


you need the outer join here if POL may or may not have the key value and you want a row returned 
if one exists in PLL.

if it were

  and pol.key = CONSTANT

then the outer join can be dropped. 

5 stars Thank you for clarification   June 1, 2004 - 10pm Central time zone
Reviewer: SHGoh from Malaysia
Dear Tom,

   Thanks for clarification. I have a better understanding now.

Rgds
SHGoh 


4 stars Why the outer join cause so much gets   July 30, 2004 - 4am Central time zone
Reviewer: shgoh from malaysia
Dear Tom,

    Why the query increase tremendously when I add the outer join in the select statement. 

select p.name from p,m   -- p has 200000 rows, m has 300rows
where p.id < 10        -- p.id is index
**the above sql is pretty fast. but after add the following SQL. the sql runs very slow even though 
the m table is very small.
    AND   p.XX = m.YY (+)

    Can you explain why this happen and how to avoid this>? Thanks.

Goh
 


Followup   July 30, 2004 - 8am Central time zone:


the first query doesn't seem to make any sense?  a cartesian product of 9 random rows from p with 
all 300 rows in m?????


give me a real "for example" -- in the fashion I give you -- create tables, real queries -- steps 
to reproduce. 

4 stars Remove my previous message   July 30, 2004 - 5am Central time zone
Reviewer: shgoh from malaysia
Dear Tom,

    
   Please ignore the previous message. Thanks.

Goh
 


4 stars Outer Join   January 17, 2005 - 6am Central time zone
Reviewer: Yogesh from Pune, India
Tom,  

Need your help in optimizing following query. 

select /*+ first_rows  */
         T2.DIV,
         T2.TERR,
         T2.REGION,
         T1.ONO,
         T2.OLET,
         T2.RN,
         T1.TDATE,
         T3.SUPP,
         T3.SUB,
         T3.GRP,
         T4.LDESC,
         T4.DESTREG,
         T5.COUNTRY,
         T5.ZONE,
         T5.AREA,
         decode(T5.OVER,null,'Not Known',T5.OVER),
         T1.PROD,
        T1.DISC,
         decode(T1.STAT,'N',1,'C',-1,0),
         T1.REV,
         decode(T7.CLODE,'XYZ','11','ABC','118'),
         to_char(T1.TDATE,'Day')
from     E1.MODEL T1,
         PY.COMPS T2,
         PY.SUPPS T3,
         PY.LOCAS T4,
         PY.COUNS T5,
         PY.DISCC T6,
         E1.CON T7
where
         T1.TDATE between to_date('01-nov-2003','DD-MON-YYYY') 
and      to_date('31-dec-2004','DD-MON-YYYY')
and      T1.ONO = T2.ONO
and      T1.SCODE = T3.SCODE (+)
and      T1.DECODE = T4.LCODE (+)
and      T1.CNCODE = T5.CCD (+)
and      T1.CNO = T6.CNO (+)
and      T1.ANO = T6.ANO (+)
and      T1.CNO = T7.CNO
and      (T6.DICODE is not null or T7.CLCODE in ('XYZ','ABC','LLL','MHNC','TZXV'))

Table details 

E1.MODEL T1,         311 MB   Partitioned on TDATE, Index (CNO,ANO,TDATE)
PY.COMPS T2,         1 MB 
PY.SUPPS T3,         1 MB 
PY.LOCAS T4,         1 MB 
PY.COUNS T5,         1 MB 
PY.DISCC T6,         10 MB 
E1.CON T7         4210 MB  partitioned on CNO, Index (CNO) PK

Statistics of all the tables is up to date. Right now this query is taking hours to execute.  Where 
I'm going wrong? Please advice.

 


Followup   January 17, 2005 - 8am Central time zone:

well, why first rows. 

3 stars First_rows   January 17, 2005 - 9am Central time zone
Reviewer: Yogesh from pune, India
I was checking impact of hints... if I take out first_rows it does FTS for almost all tables. 
Please ignore that.   


Followup   January 17, 2005 - 10am Central time zone:

please ignore what?  FTS looks like an *excellent* idea to me. 

3 stars First_rows   January 17, 2005 - 10am Central time zone
Reviewer: Yogesh from Pune, India
I was checking impact of FIRST_ROWS hint. While copy paste I forgot to remove first_rows. So please 
ignore that hint. 

 


Followup   January 17, 2005 - 11am Central time zone:

so, tkprof it, look at the volumes of data, see what it is physically doing (and you might even 
appreciate the speed at which it runs all of a sudden) 

4 stars Outer Join   January 18, 2005 - 10am Central time zone
Reviewer: Yogesh from Pune, India
Tom, 

I saw following clauses in one query, can you please explain what these clauses mean? I never seen 
outer join used like this. 

    f.code (+) NOT IN ('MSC','VO','WTV')
    f.type (+) = 'C'
 


Followup   January 18, 2005 - 11am Central time zone:

they are outer joins.

pretend 'C' was EMPNO

now does it make sense? 

3 stars   January 19, 2005 - 1pm Central time zone
Reviewer: Yogesh from Pune, India
Can you please give me an example where (+) not in can be used .....  


Followup   January 20, 2005 - 9am Central time zone:

you said you saw them?!?  therefore you already have the example.


    f.code (+) NOT IN ('MSC','VO','WTV')
    f.type (+) = 'C'
    f.KEY (+) = t.KEY

is just like:


select * 
  from T left join 
    (select * from f where code not in ( 'x','y','z' ) and type = 'C' ) f
   on t.key = f.key;


it says "take all of the rows from F such that code not in and type =" and then "outer join to the 
remaining rows"
      

4 stars Perhaps an overstatement?   April 17, 2007 - 12pm Central time zone
Reviewer: Reader 
Tom

Thanks for this great site! Sorry to wake up this thread again after two years. I'm thinking whether your statement on outerjoin is slightly mistaken.

<quote>
the general statement is:

  from t1, t2
where t1.key = t2.key(+)
  and <predicate on t2 using anything other than NULL checks>

is the same as

  from t1, t2
where t1.key = t2.key
  and .......
</quote>

Please check this example:

select * from emp;


  DEPTNO    EMPNO    SAL ACTIVE
---------- ---------- ---------- -
    10    1111    2000 N
    20    4444    5000 Y
    40    5555    5000 Y
    10    2222    3000 N
    30    3333    4000 Y

select * from dept;

  DEPTNO DNAME
---------- ---------------
    10 Accounts
    20 IT
    30 Sales
    40 MKTG

select * from emp_address ;
  EMPNO ADDRESS  DOB
--------- ---------- ---------
  1111 New York  01-JAN-78
  2222 Boston  23-SEP-99
  5555 Dallas  04-FEB-88

According to your statement, the following queries must return the same results. But they don't.

SQL>  select * from emp a, dept b, emp_address c where
  2   b.deptno = a.deptno(+) and
  3   a.empno = c.empno(+);

DEPTNO  EMPNO   SAL A DEPTNO DNAME      EMPNO ADDRESS   DOB
------ ------ ----- - ------ --------- ------ --------- ---------
    10   1111  2000 N     10 Accounts    1111 New York  01-JAN-78
    10   2222  3000 N     10 Accounts    2222 Boston    23-SEP-99
    40   5555  5000 Y     40 MKTG        5555 Dallas    04-FEB-88
    20   4444  5000 Y     20 IT
    30   3333  4000 Y     30 Sales

Elapsed: 00:00:00.48
SQL> select * from emp a, dept b, emp_address c where
  2  b.deptno = a.deptno(+) and
  3  a.empno = c.empno ;

DEPTNO  EMPNO   SAL A DEPTNO DNAME      EMPNO ADDRESS   DOB
------ ------ ----- - ------ --------- ------ --------- ---------
    10   2222  3000 N     10 Accounts    2222 Boston    23-SEP-99
    10   1111  2000 N     10 Accounts    1111 New York  01-JAN-78
    40   5555  5000 Y     40 MKTG        5555 Dallas    04-FEB-88


Am I right?



Followup   April 18, 2007 - 10am Central time zone:

no, not really, that is just a continuation of the outer join - there are (+)'s all across the board.


4 stars Thanks for looking into this   April 19, 2007 - 6am Central time zone
Reviewer: Confused 
Tom

First of all, thanks for looking into it. But I am not sure I understand your remarks.

Am I correct in saying that you meant earlier in the post that joining any other column with any other table on an already outer-joined column is just meaningless?

Didn't you say that

select * from t1, t2, t3
where t1.pk = t2.pk<b>(+) and
     t2.any_other_columns = t3.any_other_column</b>


is same as

select * from t1, t2, t3
where t1.pk = t2.<b>pk and
     t2.any_other_columns = t3.any_other_column</b>


? Or am I mistaken?


Followup   April 19, 2007 - 7am Central time zone:

you have a chain of outer joins


from t1, t2, t3, t4
 where t1.key = t2.key(+) -- make up row in t2 if needed...
   and t2.key = t3.key(+) -- make up row in t3 if we made up row in t2 or 
                          -- t2 does not have a match....
   and t3.key = t4.key(+) -- and if t3 was made up or t4 doesn't have a match
                          -- for t4, make one up



if you drop any of the plus(+) after the first, you don't need an outer join - as long as you keep outer joining, it is "sensible", when you stop - it is not.
4 stars TAG issue   April 19, 2007 - 6am Central time zone
Reviewer: Same confused person 
Tom

Sorry for "B" tags within the code. I used your Bold tag to highlight the problematic part. So are the tags not allowed to embed within another tag here?

2 stars Re: TAG issue   April 19, 2007 - 8am Central time zone
Reviewer: Michel CADOT from France

Tags can be embedded but not between code tags that precisely mean "take this as I give it to you, don't interpret it".

Regards
Michel

5 stars Outer Joins are no more a nightmare   April 20, 2007 - 10am Central time zone
Reviewer: A reader 
Tom

I thank you very much for your brief but elegant reply to my queries. Certainly, it makes lot of sense to me. Finally, just one question, if you don't mind.

Is it possible to get rid of outer joins from all application code either by normalisation (or even by denormalisation) or by adding some default (constant) values to missing data in such columns ? Or is there any good method for completely getting rid of these messy outer joins?

Thanks and have a great weekend

Followup   April 20, 2007 - 1pm Central time zone:

they are not messy when necessary.

they are beautiful things - when needed.

don't use dummy values

and data normalization isn't anything that would remove the need for an OPTIONAL relationship.


Now, if you have lots of option 1:1 relationships - those can be corrected by putting the attributes into a single table as they should have been from the beginning...
4 stars Join Fundamental   April 21, 2007 - 6am Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom ,
  Please tell me ....
1. How driving table decided for a join .
2. Whether driving table play  role in all type of joins or only in Outer join.
3. I am new to your portal as well as to Tunning .
  Plese provide me some link where you already discussed about join fundamental.
4 . I there any limit to use number of tables in a join ?
5. I have an assignment( to tune ) an Query  joining 80+ tables , on a transaction db , each table having atleast 50K+ records .
  The query consuming all resources and at peak hour , other users unable to use application .
  This is a sales processing system and having 20k + user . Please advise me how to proceed . 
   

Followup   April 21, 2007 - 9am Central time zone:

1) based on tons of things - access paths to other tables, access paths to the table being considered, estimated cardinalities, ..... lots of stuff.

2) all types - there is a driving table in natural joins, semi joins, all joins.

3) Effective Oracle by Design - a book - I covered them in detail. Also:
http://www.oracle.com/pls/db102/portal.portal_db?selected=3#index-PER


4) only practical ones, dozens of table may perform well at a tie.

5) 80 tables - in a single query - wow. My suggestion - ignore the existing query entirely. Phrase the question being asked in the form of a formal requirement and start fresh.
3 stars Multiple Instance of Same table   April 22, 2007 - 12pm Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom ,
  Thank you for your response .
Now I am going through your books .
I am explaining a similar example to my assignment.

create table employee (
              empId number(9),
              EmpName Varchar2(50),
              OfficeAddress_Id number(9),
              ResidentAddress_Id number(9),
              PermentAddress_Id number(9) ) ;
create table AddressBook (
                AddressBook_ID number(9) ,
                addressLine  varchar2(50) ,
                stateId    number(9)  ) ;
create table State (
            state_id number(9) ,
            state_name varchar2(50) ) ;

select *
    from employye emp ,
        addressBook offAdrs ,
        addressBook ResAdrs ,
        addressBook PermAdrs ,
        state    OffState ,
        state    ResState ,
        state    PermState
where  emp.officeAddressId = offAdrs.addressbook_Id
  and emp.residentAddressId = ResAddrs.addressbook_Id
  and emp.permentAddressId = PermAdrs.addressbook_id 
  and offAdrs.state_Id = OffState.state_id
  and ResAddrs.state_Id = Resstate.state_id
  and PermState.state_Id = PermState.state_id ;


1. here Multiple Instance of addressBook and state used 
2. I my assignment like this I use some table even 12 time
  there is total 35 unique table but whe we instacesiate 
  multiple time total table count become 80+ 
  Is there any better way to rewrite the Query To get same output .
3. My approach to use PL/SQL functions for tables like state where only one column retived and to remove that table from join.
  but form one of your article I came to know function is not a good choice as compare to Join because function will be invoked for each row recursively and the table will scan multiple time . which is a time consuming process allso there a context switching time for function .

please guide me .



Followup   April 23, 2007 - 4pm Central time zone:



given the way you've modeled this, you are likely stuck with a query like this. If you would have modeled it differently (better, using naturally surrogate keys - I just made that up, you would have had other opportunities).

eg: if the empid was in the child tables, instead of them having their own surrogage keys....
3 stars   April 25, 2007 - 5am Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom ,
  Thank you for your help .
Would you a bit more clear .. I unable to catch all ... plaese explain with a example ...

Followup   April 25, 2007 - 10am Central time zone:

if you did it more like:

create table emp ( empid primary key, ... );
create table addresses ( empid, addr_type, ...., primary key(empid,addr_type) );

and addr_type would be one of "office", "perm", etc - we'd have different SQL approaches

but you made addresses have it's own surrogate key, making it hard to find all of the addresses for a given employee in a single join.
4 stars Outerjoins are no more a nightmare, or are they?   April 25, 2007 - 12pm Central time zone
Reviewer: The same Reader 
Tom

Thanks for your response to my queries and for your time. Can I just ask your expert advice finally once on the same subject for the last time?

If a query looks like this, would you still say that the outer joins are not necessary here?

select a.col1, b.col2, c.col3 d.col4, e.col5 from
t1 a, t2 b, t3 c, t4 d, t5 e, t6 f
where a.item_code(+) = b.icode and
b.sales_id (+) = c.sales_id and
d.sales_id (+) = c.sales_id and
e.sales_id (+) = c.sales_id and
f.invent_code = a.invent_code;


In the above query, because T1 is outer joined to T2, is it meaningless to outer join T2.sales_id to T3?

From your earlier repsonse, I assume that because T1 is outer joined to T2 on icode, the last join, namely T6 outer joining to T1 is wrong, am I correct? I guarantee that I won't disturb you again on the same subject.

Thanks


Followup   April 25, 2007 - 1pm Central time zone:

i would ask you to tell us if the outer joins are necessary

given it is your model
you know the relations
and you understand the question being asked

but yes, if a row is made up in A, the a.invent_code is NULL and f.invent_code will never equal A.invent_code when that is true, so those rows disappear...

so the outer join to A is not necessary.
5 stars Great   April 26, 2007 - 4am Central time zone
Reviewer: Same Reader 
Thanks Tom.

Great stuff.

3 stars   April 26, 2007 - 11am Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom ,
  hank you for your suggestion and co-operation .

  I change my table design as per you suggestation ,
but still i have ti use multiple instance of address table , is there any alternate way .
I want to retrive name ,Resident , Office , Perment Address then as per my understanding I have to put the addres table 3 times in from clause

select emp.* , ResAdd.* , OffAdd.* , PermAdd.*
  from employee emp ,
    address ResAdd ,
    address OffAdd ,
    address  PermAdd
  where emp.address_id = ResAdd.address_id
    and ResAdd.addr_type = 'res'(+)
    and emp.address_id = OffAdd.address_id
    and OffAdd.addr_type = 'off'(+)
    and emp.address_id = PermAdd.address_id
    and PermAdd.addr_type = 'perm'(+) ;

Is there any better way to rewrite the query to get same out put .

Followup   April 27, 2007 - 10am Central time zone:

select emp.empno, emp.whatever, 
       max( decode( addr.addr_type, 'res', addr.address ) ),
       max( decode( addr.addr_type, 'off', addr.address ) ),
       max( decode( addr.addr_type, 'perm', addr.address ) )
  from emp, (select * 
               from addr 
              where addr.addr_type in ('res', 'off', 'perm')) addr
 where emp.addr_id = addr.addr_id
 group by emp.empno, emp.whatever -- make sure to pull something UNIQUE from emp to group by;



pivot. outer join only if needed.
5 stars   May 3, 2007 - 9am Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom,
  It is a great help for me . I anlyse our table structure , and your Idea will help us , now our table count reduced from 85 to 42 .

  But I am afraid of the error "ORA-01467: sort key too long " .
Becuse we are selecting 300+ columns and our client will not agree to reduce number of columns .
and out of 300 columns around 90+ will be in group by clause and around 200 column will be use Max(decode(****)) in select clause .
As per your suggestion am rewritting the query agin .

Please help me to over come "ORA-01467: sort key too long "

Followup   May 3, 2007 - 10pm Central time zone:

use a smaller sort key...

blocksize is?

insufficient data to help you out here, we don't even have a query to peek at.
3 stars   May 5, 2007 - 10am Central time zone
Reviewer: Manoj Pradhan from Chennai , India
Tom ,

1 . The term "sort key " is new to me .
I dont have any idea on it . I came to know about this from the post "How to Calculate memory occupied by Aggregate function " . ( I posted one review there and ask some question , to avoid duplication )

  how to select a smaller sort key .

2 . db_block_size  is 4096

  Oracle 8i Enterprise Edition Release 8.1.7.0.1

3 . I am reproducing your Query with some addision

  select emp.empno , emp.whatever , dept.whatever ,
    max( decode( addr.addr_type , 'res' ,addr.address)),
    max(decode ( addr.addr_type , 'off' , addr.address ) ,
  max (decode (addr.addr_type , 'perm' , addr.addres ) )
from emp , ( select *
        from addr
        where addr.addr_type in ('res','off','perm'))
          addr
where emp.addr_id = addr.addr_id
    and emp.dept_id = dept.dept_id(+)
group by emp.empno , emp.whatever , dept.whatever

Like dept table there are 30 more tables .

Around 70+ columns will be in group by caluse . whether it will affect the performance .
 
  Is there any alterante way ?

This is an Online application .


4 . As you mentioned earlier ... How to Pivot Outer join ?   

Please help...     

4 stars   June 12, 2007 - 10am Central time zone
Reviewer: Lax from Ind
Hi,

The ccard_list (0 rows) and the bmf (23 Million rows).
We have analyzed the ccard_list table but still the bmf is being used as driving table.
Appreciate your help on this.

SELECT COUNT (*)
FROM ccard_list,bmf,
    bmf_trans_descr,   
    customer_id_acct_map,
    external_id_type_ref
WHERE bmf.account_no = customer_id_acct_map.account_no
  AND bmf.bmf_trans_type = bmf_trans_descr.bmf_trans_type
  AND bmf.ccard_id = ccard_list.ccard_id (+)
  AND bmf.ccard_id_serv = ccard_list.ccard_id_serv(+)
  AND customer_id_acct_map.is_current = 1
  AND customer_id_acct_map.external_id_type = external_id_type_ref.external_id_type
  AND external_id_type_ref.is_default = 1
  AND bmf.bill_ref_no = 0
  AND bmf.no_bill = 0
  AND bmf.orig_tracking_id_serv = 3
  AND bmf.action_code = 'REV'
  AND bmf.orig_tracking_id

Thanks

Followup   June 12, 2007 - 10am Central time zone:

do you see the outer join there?

do you know that means - well - how can you drive from ccard_list??

if you outer join to ccard_list, it matters not if there are zero rows - ever row is going to be returned from the OTHER tables.

so, my help is, I don't see anything wrong, unless of course you do not NEED the outer join, in which case, this query will run really fast...
5 stars   June 12, 2007 - 12pm Central time zone
Reviewer: Lax from Ind
Thanks a lot for a quick response.
Without outer join the query response is very fast.
The query is coming from an application and we are expecting data in CCARD_LIST in near future.
Subsequently, I sent the query to developer with your suggestion. However, is there any way to quickly fix this query.

Followup   June 12, 2007 - 1pm Central time zone:

the query IS NOT BROKEN.

questions for you:

Do you understand what the outer join means?

And why that would sort of PRECLUDE this outer joined to table from being the driving table?



If you outer join to this table - every row from the OTHER TABLES ARE RETURNED.
select * 
  from one_billion_row_table x, zero_row_table y
 where x.key = y.key (+);


Assuming that first table has 1,000,000,000 rows in it and the second one has zero - that query returns one billion rows.

Now, how would you drive that query from Y?
5 stars Thanks !!!   June 13, 2007 - 1am Central time zone
Reviewer: Lax from Ind
I got the answer.
Thank for your time !!!

3 stars joe from singapore   June 20, 2008 - 6am Central time zone
Reviewer: Joe from singapore
Tom,

How can I better write this query. ?
R has 11 mil recs
A has 7 mil recs
C has 5000 recs.  

The query takes around 15 mins to execute 

SELECT 
         R.VALUATION_DATE, 
     R.CLIENT_CODE, 
     R.TYPE_POS_BASIS, 
     A.ACCRUAL_VAL2,
     A.ACCRUAL_VAL3,
     (R.MKT_VAL2+NVL(A.ACCRUAL_VAL2, 0)) AS AGG_MKT_VAL2, 
     (R.MKT_VAL3+NVL(A.ACCRUAL_VAL3, 0)) AS AGG_MKT_VAL3 
     FROM SEC_RESULTS R, ACCRUAL_RESULTS A, CONTROL C
     WHERE SEC_TYPE NOT IN ( '20', '25', '28', '34', '37', '40', '43', '47', '83')
     AND R.JOB_ID = A.JOB_ID (+)
     AND R.ID = A.ID (+)
     AND R.CLIENT = A.CLIENT (+)
     AND R.TYPE_POS_BASIS = A.TYPE_POS_BASIS (+)
     AND R.DEP_CODE = A.DEP_CODE (+)
     AND C.JOB_ID = R.JOB_ID     
     AND C.RUN_TYPE = 'batch'
     AND C.STATUS = 'done'     

Thanks
Joe


Followup   June 20, 2008 - 10am Central time zone:

looks perfect to me.

given what I know.

which is almost nothing...........


we don't know what sec_type is - what table it is from.

We don't know how selective sec_type might be, I presume it comes from R, if sec_type NOT IN (.....) returns 10,000,000 rows - well, not sure what you expect...

we don't know how many rows run_type = 'batch' and status = 'done' returns. Only that it is somewhere between 0 and 5,000 rows...

we don't know what indexes you might (or might not) have.

We don't know what is unique - or not.

We don't know the relationships between R, A, and C (one to one, one to many, etc...)

We don't know more things than we know here.
4 stars is outer joins required in this query   September 30, 2008 - 6am Central time zone
Reviewer: Reene 
Hi Tom

i have  a query as below given below :


select  a.*
from
(SELECT  msi.item_type,
                           msi.segment1,
                           msi.primary_uom_code,
                           msi.description,
                           msi.PLANNING_MAKE_BUY_CODE,
                           msi.inventory_item_id,
                           msi.organization_id,
                           pgtpt.trp_cur_tran_price,
                           pgtpt.trp_cur_last_updt_dt,
                           pgtpt.TRP_CUR_MMICV,
                           pgit.itm_item_num_pk,
                           pgit.itm_id_type,
                           pgit.itm_prim_source_fk,
                           pgit.itm_material_class,
                           pgit.itm_software_flg,
                           CIC.LAST_UPDATE_DATE
                      FROM APPS.MTL_SYSTEM_ITEMS_B   MSI,
                           APPS.CST_ITEM_COSTS CIC,
                           PREPRO.GEMS_ITEM_TB       PGIT,
                           PREPRO.GEMS_TRAN_PRICE_TB PGTPT
                     WHERE MSI.ITEM_TYPE = 'FG' AND
                           MSI.INVENTORY_ASSET_FLAG = 'Y' AND
                           MSI.COSTING_ENABLED_FLAG = 'Y' AND
                           MSI.ORGANIZATION_ID = 2572 AND
                           PGIT.ITM_START_DT_PK = TRUNC(SYSDATE, 'y') AND
                           PGIT.ITM_ITEM_NUM_PK(+) = MSI.SEGMENT1 AND           
                           PGTPT.TRP_SOURCE_PF(+) = PGIT.ITM_PRIM_SOURCE_FK AND 
                           PGTPT.TRP_ITEM_NUM_PF(+) = PGIT.ITM_ITEM_NUM_PK  
                           and PGTPT.TRP_DEST_PF(+) = '000000'  
                           and  msi.INVENTORY_ITEM_ID = CIC.INVENTORY_ITEM_ID AND
                           msi.ORGANIZATION_ID   = CIC.ORGANIZATION_ID
                           and CIC.COST_TYPE_ID = 1
 ) a
 where ( a.last_update_date >= sysdate-15  or a.trp_cur_last_updt_dt >= sysdate-15  ) 

I think the following outerjoins are not required and they should be equi joins , could you please 
tell if I am correct or not.

PGIT.ITM_ITEM_NUM_PK(+) = MSI.SEGMENT1 

as i have a filter like this 

PGIT.ITM_START_DT_PK = TRUNC(SYSDATE, 'y') 

then these 2 outer joins should also be equi joins

PGTPT.TRP_SOURCE_PF(+) = PGIT.ITM_PRIM_SOURCE_FK AND 
                           PGTPT.TRP_ITEM_NUM_PF(+) = PGIT.ITM_ITEM_NUM_PK  

as we have another join like this

and PGTPT.TRP_DEST_PF(+) = '000000' 

please let me know if I am wrong.

Thanks


Followup   September 30, 2008 - 9am Central time zone:

PGIT.ITM_START_DT_PK = TRUNC(SYSDATE, 'y') AND
                           PGIT.ITM_ITEM_NUM_PK(+) = MSI.SEGMENT1 AND  


no outer join needed to pgit - that should be obvious - if a pgit row were made up, itm_start_dt_pk would be NULL and hence rejected from consideration.

pgtpt - we cannot tell if we need to or not from the query, but.....

YOU CAN - you know the tables, you know of the relationship is optional or not. Use your intimate knowledge of the schema and (most importantly) the QUESTION BEING ASKED OF IT.
3 stars Outer join trouble   October 9, 2008 - 8am Central time zone
Reviewer: Maverick 
Tom,
I am struggling with a simple outer join today. basically, All I need is rows from dept and emp tables.
An employee in emp table is in one department. So, I need to show it like this:

DNAME          DEPTNO    EMPNO
Administration  10    
Finance         20    
Human resources 30    
Technology      40    
Helpdesk        50       9934
testdept        60    
TEST            70    
Finance         90    


I need this for all the employees[ I mean for each employee I want to repeat the list of departments and show which department this emp belongs to]

How do I do that? I got it for one employee record using this:


select d.dname,d.deptno,e.empno
  from dept d, (select empno,deptno from emp 
                  where empno=9934
                  ) e
 where d.deptno=e.deptno(+)
 order by 2;
If I remove where condition for Emp table, I am loosing the department list..

Thanks in advance

Followup   October 9, 2008 - 11am Central time zone:

partitioned outer join

ops$tkyte%ORA11GR1> select dept.*,
  2         case when emp.rowid is null then null else emp.empno end emp_no
  3    from emp partition by (empno) right outer join dept on ( emp.deptno = dept.deptno )
  4   order by emp.empno, dept.deptno
  5  /

    DEPTNO DNAME          LOC               EMP_NO
---------- -------------- ------------- ----------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7369
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7499
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7521
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7566
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7654
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7698
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK            7782
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7788
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK            7839
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7844
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7876
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7900
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7902
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK            7934
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

56 rows selected.

2 stars Not in ANSI Standard   October 9, 2008 - 2pm Central time zone
Reviewer: Maverick 
Tom, for some reason I could never understand ANSI Standards for Joins. And I definietly do not 
want to start now[in fact I do not like it ..Sorry!!]. 
So, please can you give the same in regular joins. I tried by adding partition by (empno) for emp 
table and it failed.


by regular joins I mean:
 select dept.* , emp.empno
  from emp  ,    [partition by (empno)]????
       dept
  ...

Thank you


Followup   October 9, 2008 - 2pm Central time zone:

umm, in order to do a partitioned outer join you need to use that syntax.

learn it. It does what you wanted.

else you are in the realm of cartesian joins.

scott%ORA11GR1> select dept.deptno, dept.dname, dept.loc, decode( dept.deptno, emp.deptno, 
emp.empno ) empno
  2    from emp, dept
  3   order by emp.empno, dept.deptno
  4  /

    DEPTNO DNAME          LOC                EMPNO
---------- -------------- ------------- ----------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7369
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7499
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO             7521
        40 OPERATIONS     BOSTON
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS              7566
....



5 stars NULL Equality in Outer Joins?   December 5, 2008 - 6am Central time zone
Reviewer: Ramchandra Joshi from India
"
you have a chain of outer joins 



from t1, t2, t3, t4
 where t1.key = t2.key(+) -- make up row in t2 if needed...
   and t2.key = t3.key(+) -- make up row in t3 if we made up row in t2 or 
                          -- t2 does not have a match....
   and t3.key = t4.key(+) -- and if t3 was made up or t4 doesn't have a match
                          -- for t4, make one up

"

When a row in t3 is MADE UP as we made up a row in t2 ,How is it that it will match?
Isnt here then we are trying to equate NULLS ( both of which are MADE UP as a result of an outer 
join ? )

Thanks in advanve
Ram


Followup   December 9, 2008 - 12pm Central time zone:

when we "make up" a row, it is a row entirely of NULL values.

Write a Review
 


All information and materials provided here are provided "as-is"; Oracle disclaims all express and implied warranties, including, the implied warranties of merchantability or fitness for a particular use. Oracle shall not be liable for any damages, including, direct, indirect, incidental, special or consequential damages for loss of profits, revenue, data or data use, incurred by you or any third party in connection with the use of this information or these materials.

About Oracle | Legal Notices and Terms of Use | Privacy Statement