Skip to Main Content
  • Questions
  • how to do outer joins in multiples tables.

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Miguel.

Asked: January 09, 2002 - 9:23 am UTC

Last updated: July 31, 2012 - 1:01 pm UTC

Version: 8i

Viewed 10K+ times! This question is

You Asked

I'm trying to covert an Access db to an Oracle db. I have a view with this query:

In access a inner join returns the rows that are equal in both tables.
A rigth join returns all the rows of the rigth table and the coincidences in the table of the left side, is to say, all the rows of the rigth table and the values in the left table of the fields that should coincide. I expect to have expressed correctly. I think this is an outer join in Oracle isn't it?

SELECT DISTINCTROW MODULOS_DATOS_ECONOMICOS.CURSO, MODULOS_DATOS_ECONOMICOS.MODULO, MODULOS_DATOS_ECONOMICOS.PRESUPUESTO, MODULOS_DATOS_ECONOMICOS.GASTO, MODULOS_DATOS_ECONOMICOS.DESVIACION, MODULOS_DATOS_ECONOMICOS.DESVIACIONPCT, CURSOS_DATOS_ECONOMICOS.OBSERVACIONES, GASTOS.CARPETAS, GASTOS.ALQUILERES, GASTOS.LIBRERIA, conGastosAgenciaModulo.SumaDeIMP_VIAJE, conGastosAgenciaModulo.SumaDeIMP_HOTEL, conGastosAgenciaModulo.SumaDeIMP_TRASLADOS, conGastosAgenciaModulo.SumaDeIMP_MANUTENCION, conGastosFotocopiasModulo.SumaDeIMPORTE, conGastosRetribucionesModulo.SumaDeRETRIB, conGastosOtrosGastosModulo.SumaDePTAS
FROM (conGastosOtrosGastosModulo RIGHT JOIN (conGastosRetribucionesModulo RIGHT JOIN (conGastosFotocopiasModulo RIGHT JOIN (conGastosAgenciaModulo RIGHT JOIN (GASTOS RIGHT JOIN MODULOS_DATOS_ECONOMICOS ON
(GASTOS.MODULO = MODULOS_DATOS_ECONOMICOS.MODULO) AND
(GASTOS.CURSO = MODULOS_DATOS_ECONOMICOS.CURSO)) ON (conGastosAgenciaModulo.MODULO = MODULOS_DATOS_ECONOMICOS.MODULO) AND (conGastosAgenciaModulo.CURSO = MODULOS_DATOS_ECONOMICOS.CURSO)) ON (conGastosFotocopiasModulo.MODULO = MODULOS_DATOS_ECONOMICOS.MODULO) AND (conGastosFotocopiasModulo.CURSO = MODULOS_DATOS_ECONOMICOS.CURSO)) ON (conGastosRetribucionesModulo.MODULO = MODULOS_DATOS_ECONOMICOS.MODULO)
AND (conGastosRetribucionesModulo.CURSO = MODULOS_DATOS_ECONOMICOS.CURSO)) ON (conGastosOtrosGastosModulo.MODULO = MODULOS_DATOS_ECONOMICOS.MODULO) AND (conGastosOtrosGastosModulo.CURSO = MODULOS_DATOS_ECONOMICOS.CURSO)) INNER JOIN CURSOS_DATOS_ECONOMICOS ON MODULOS_DATOS_ECONOMICOS.CURSO = CURSOS_DATOS_ECONOMICOS.CURSO;

in Oracle i do it like this:

select GCDAT.CURSOS_DATOS_ECONOMICOS.OBSERVACIONES, GCDAT.GASTOS.
ALQUILERES, GCDAT.GASTOS.CARPETAS, GCDAT.GASTOS.LIBRERIA, GCDAT.
MODULOS_DATOS_ECONOMICOS.CURSO, GCDAT.MODULOS_DATOS_ECONOMICOS.
DESVIACION, GCDAT.MODULOS_DATOS_ECONOMICOS.DESVIACIONPCT, GCDAT.
MODULOS_DATOS_ECONOMICOS.GASTO, GCDAT.MODULOS_DATOS_ECONOMICOS.
MODULO, GCDAT.MODULOS_DATOS_ECONOMICOS.PRESUPUESTO,
GCDAT.CONGASTOSAGENCIAMODULO.SumaDeIMP_VIAJE, GCDAT.CONGASTOSAGENCIAMODULO.SumaDeIMP_HOTEL,
GCDAT.CONGASTOSAGENCIAMODULO.SumaDeIMP_TRASLADOS, GCDAT.CONGASTOSAGENCIAMODULO.SumaDeIMP_MANUTENCION,
GCDAT.CONGASTOSFOTOCOPIASMODULO.SumaDeIMPORTE, GCDAT.CONGASTOSRETRIBUCIONESMODULO.SumaDeRETRIB,
GCDAT.CONGASTOSOTROSGASTOSMODULO.SumaDePTAS
from GCDAT.CURSOS_DATOS_ECONOMICOS, GCDAT.GASTOS, GCDAT.
MODULOS_DATOS_ECONOMICOS,GCDAT.CONGASTOSOTROSGASTOSMODULO,GCDAT.CONGASTOSRETRIBUCIONESMODULO,GCDAT.CONGASTOSFOTOCOPIASMODULO,GCDAT.CONGASTOSAGENCIAMODULO
where ((GCDAT.GASTOS.MODULO = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO(+)) AND (GCDAT.GASTOS.CURSO = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO(+)) AND
(GCDAT.CONGASTOSAGENCIAMODULO.MODULO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO) AND (GCDAT.CONGASTOSAGENCIAMODULO.CURSO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO) AND
(GCDAT.CONGASTOSFOTOCOPIASMODULO.MODULO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO) AND (GCDAT.CONGASTOSFOTOCOPIASMODULO.CURSO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO) AND
(GCDAT.CONGASTOSRETRIBUCIONESMODULO.MODULO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO) AND (GCDAT.CONGASTOSRETRIBUCIONESMODULO.CURSO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO) AND
(GCDAT.CONGASTOSOTROSGASTOSMODULO.MODULO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO) AND (GCDAT.CONGASTOSOTROSGASTOSMODULO.CURSO(+) = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO) AND
(GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO(+) = GCDAT.CURSOS_DATOS_ECONOMICOS.CURSO))

this returns:
ERROR at line 13:
ORA-01417: a table may be outer joined to at most one other table

I try it in other forms but do not gives the output i want it only gives me 1 row.

Cuold you help me? please.

Thanks.


and Tom said...

You have the (+) on the wrong side. You pue that after the table you want to outer join TO.

MODULOS_DATOS_ECONOMICOS is your "driving table". everything else is outer joined to "it" (or just joined to it). So the (+) goes after the other table, never after MODULOS_DATOS_ECONOMICOS.

This part of your query:
GCDAT.GASTOS.MODULO = GCDAT.MODULOS_DATOS_ECONOMICOS.MODULO(+)) AND
(GCDAT.GASTOS.CURSO = GCDAT.MODULOS_DATOS_ECONOMICOS.CURSO(+)) AND


is what messed it all up. You were saying (most of the time) outer join to MODULOS_DATOS_ECONOMICOS, but that time you said the opposite (which doesn't make sense)

Your query in 8i would be:




ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
2 from CURSOS_DATOS_ECONOMICOS,
3 GASTOS,
4 MODULOS_DATOS_ECONOMICOS,
5 CONGASTOSOTROSGASTOSMODULO,
6 CONGASTOSRETRIBUCIONESMODULO,
7 CONGASTOSFOTOCOPIASMODULO,
8 CONGASTOSAGENCIAMODULO
9 where GASTOS.MODULO(+) = MODULOS_DATOS_ECONOMICOS.MODULO
10 AND GASTOS.CURSO(+) = MODULOS_DATOS_ECONOMICOS.CURSO
11 AND CONGASTOSAGENCIAMODULO.MODULO(+) = MODULOS_DATOS_ECONOMICOS.MODULO
12 AND CONGASTOSAGENCIAMODULO.CURSO(+) = MODULOS_DATOS_ECONOMICOS.CURSO
13 AND CONGASTOSFOTOCOPIASMODULO.MODULO(+) = MODULOS_DATOS_ECONOMICOS.MODULO
14 AND CONGASTOSFOTOCOPIASMODULO.CURSO(+) = MODULOS_DATOS_ECONOMICOS.CURSO
15 AND CONGASTOSRETRIBUCIONESMODULO.MODULO(+) = MODULOS_DATOS_ECONOMICOS.MODULO
16 AND CONGASTOSRETRIBUCIONESMODULO.CURSO(+) = MODULOS_DATOS_ECONOMICOS.CURSO
17 AND CONGASTOSOTROSGASTOSMODULO.MODULO(+) = MODULOS_DATOS_ECONOMICOS.MODULO
18 AND CONGASTOSOTROSGASTOSMODULO.CURSO(+) = MODULOS_DATOS_ECONOMICOS.CURSO
19 AND MODULOS_DATOS_ECONOMICOS.CURSO = CURSOS_DATOS_ECONOMICOS.CURSO
20 /

no rows selected


In 9i you can use the right join/inner join syntax if you like.


Rating

  (54 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Thanks

Miguel Garcia, January 10, 2002 - 9:48 am UTC

You have solve my problem, Thank you very much.

Follow-UP : join to the same Table more than Once 8.1.6

Pascal, January 11, 2002 - 4:02 am UTC

Hi Tom ,

Just a Follow-up on OUTER JOINs :

I have this QUERY:

select op.name OP , nvl(tz.name,'No TZ') TZ , nvl(pg.DESCRIPTION,null) PG
,( case
when ptz.price is null then pop.price
when pop.price is null then ptz.price
else 0 end ) price
from CLIENTS op, TZ_OPE t , PRICE_OPE pop , CLIENT_ZONES tz ,PRICE_TZ ptz , PRICE_LISTS pg
where op.id = t.ope_id and
op.id = pop.ope_id and
pg.id = pop.PG_ID and
pg.id = ptz.PG_ID and
tz.id = t.TZ_ID and
tz.id = ptz.TZ_ID


In the above Query:

CLIENTS <1 to many> PRICE_OPE <many to 1> PRICE_LISTS
CLIENTS <1 to many> TZ_OPE <many to 1> CLIENT_ZONES <1 to many> PRICE_TZ <many to 1> PRICE_LISTS

All id 's are assumed to be primary keys and ope_id, pg_id , tz_id are foreign keys.


Hint: Take <many to 1> or <1 to many> as Foreign key Relationships between the tables


Now, i would like to return all op.name from CLIENTS ,
all pg.DESCRIPTION from PRICE_LISTS ,
all tz.name from CLIENT_ZONES ,
all price from either PRICE_OPE or PRICE_TZ :


There could be some records in PRICE_LISTS/CLIENTS which don't have matches in PRICE_OPE
but i would like to return all of them through OUTER JOINs

There could be records in CLIENT_ZONES which don't have matches in TZ_OPE / PRICE_TZ
but i would also like to return all of them.


I have tried some OUTER JOINs myself but everytime i find myself OUTER JOINING to the same table
more than ONCE - especially for tables : CLIENTS op and PRICE_LISTS pg .


Your help would be greatly appreciated.


Regards,


Pasko

Outer Join returning different value when using NVL

B, October 09, 2002 - 9:56 am UTC

Hi Tom,

There are three tables in a query

The header table(ar_payment_schedules) has one entry for an invoice. If there are any payments recd against that invoice then it is available in the lines table (i.e. a view) ar_app_adj_v. The third
table is used to join the the above two.

The requirement : when the query is run for a period
it should pick up all the invoices available in the
header table that have the gl_date less than the passed period parameter
and also the lines, if any, from the lines table that have the gl_date
less than the passed period parameter.

This required an outer join between the header and the lines table as there
may not be an entry in the lines table.

There is a customer_trx_id (43397) available in both the header and line
table. But the GL_DATE for the header table is 31-Jul-02 and GL_DATE for
the lines table is 15-Aug-02. And we had used the following condition in
the lines table.

AND NVL(app.gl_date,TO_DATE(:P_PERIOD)) <= TO_DATE(:P_PERIOD)


So when the :P_PERIOD passed to the query was 31-Jul-02, we expected a row for
43397 with only the information related to header table and not line table.
But the query did not return any rows. Later the where condition
was changed to the following condition and it produced the desired result.

AND app.gl_date(+) <= TO_DATE(:P_PERIOD)

Can you explain the logic behind this.


SELECT
ps.due_date,
ps.gl_date head_date,
ps.AMOUNT_DUE_ORIGINAL,
ps.tax_original,
trx.customer_trx_id,
trx.INTERFACE_HEADER_ATTRIBUTE1,
app.AMOUNT,
app.TAX_AMOUNT,
app.gl_date
FROM ar_payment_schedules ps
,ra_customer_trx trx
,ar_app_adj_v app
WHERE PS.GL_DATE <= TO_DATE(:P_PERIOD)
AND trx.customer_trx_id = ps.customer_trx_id
AND trx.INTERFACE_HEADER_ATTRIBUTE1 = '13017'
AND trx.trx_number = '120124'
AND ps.CUSTOMER_TRX_ID = app.CUSTOMER_TRX_ID(+)
-- AND NVL(app.gl_date,TO_DATE(:P_PERIOD)) <= TO_DATE(:P_PERIOD)
AND app.gl_date(+) <= TO_DATE(:P_PERIOD)

Tom Kyte
October 09, 2002 - 4:41 pm UTC

Unless and until you used the outer join (the second predicate) there is no reason we would "make up" a row in in APP. So, app.gl_date would never be "made up" for you.

If there was NO row in app that matched the criteria, there is nothing to "join to" and in the absence of an outer join -- there would be no data returned.



Joins

Juan Guascarancas Pena, November 07, 2002 - 1:09 pm UTC

Hi Tom,

Where can I learn more about joins (left join, right join, inner join)?
What is this "inner join" syntax Oracle now uses?
How do you translate a left join syntax to an Oracle (<9i) syntax?

Thanks!!

Thanks

Juan Guascarancas Pena, November 08, 2002 - 8:30 am UTC

Thanks Tom!

Just one more question: Is it possible to do a Full Outer Join on Oracle 8.0.4, or is it time for an upgrade?

Mr J.

Tom Kyte
November 08, 2002 - 9:14 am UTC

The syntax "full outer join" was added in 9i and up.  Basically you can do it in any release and it would look like this:

ops$tkyte@ORA920.US.ORACLE.COM> select nvl( dept.dname, 'MADE UP') dname,
  2         nvl( emp.ename, 'MADE UP' ) ename
  3    from emp, dept
  4   where emp.deptno = dept.deptno(+)
  5   union all
  6  select nvl( dept.dname, 'MADE UP') dname,
  7         nvl( emp.ename, 'MADE UP' ) ename
  8    from emp, dept
  9   where emp.deptno(+) = dept.deptno
 10     and emp.deptno is null
 11  /

DNAME          ENAME
-------------- ----------
ACCOUNTING     FORD
ACCOUNTING     ADAMS
ACCOUNTING     SCOTT
ACCOUNTING     JONES
ACCOUNTING     SMITH
RESEARCH       JAMES
RESEARCH       TURNER
RESEARCH       BLAKE
RESEARCH       MARTIN
RESEARCH       WARD
RESEARCH       ALLEN
MADE UP        MILLER
MADE UP        KING
MADE UP        CLARK
SALES          MADE UP
x              MADE UP
OPERATIONS     MADE UP

17 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=28 Bytes=448)
   1    0   UNION-ALL
   2    1     HASH JOIN (OUTER) (Cost=5 Card=14 Bytes=224)
   3    2       TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=98)
   4    2       TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=5 Bytes=45)
   5    1     FILTER
   6    5       HASH JOIN (OUTER)
   7    6         TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=5 Bytes=45)
   8    6         TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=98)



ops$tkyte@ORA920.US.ORACLE.COM>
ops$tkyte@ORA920.US.ORACLE.COM> select nvl( dept.dname, 'MADE UP') dname,
  2         nvl( emp.ename, 'MADE UP' ) ename
  3    from emp full outer join dept on ( emp.deptno = dept.deptno )
  4  /

DNAME          ENAME
-------------- ----------
ACCOUNTING     FORD
ACCOUNTING     ADAMS
ACCOUNTING     SCOTT
ACCOUNTING     JONES
ACCOUNTING     SMITH
RESEARCH       JAMES
RESEARCH       TURNER
RESEARCH       BLAKE
RESEARCH       MARTIN
RESEARCH       WARD
RESEARCH       ALLEN
MADE UP        MILLER
MADE UP        KING
MADE UP        CLARK
SALES          MADE UP
OPERATIONS     MADE UP
x              MADE UP

17 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=16 Bytes=256)
   1    0   VIEW (Cost=7 Card=16 Bytes=256)
   2    1     UNION-ALL
   3    2       HASH JOIN (OUTER) (Cost=5 Card=14 Bytes=224)
   4    3         TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=98)
   5    3         TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=5 Bytes=45)
   6    2       NESTED LOOPS (ANTI) (Cost=2 Card=2 Bytes=22)
   7    6         TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=5 Bytes=45)
   8    6         INDEX (RANGE SCAN) OF 'EMP_DEPT_IDX' (NON-UNIQUE)



As you can see -- the plans are roughly equivalent -- a full outer join could be equated to setting "slow=true" if you are not careful!  It is an expensive operation. 

Thanks again!

Juan Guascarancas Pena, November 08, 2002 - 9:59 am UTC

Thank you Tom!

Please solve this doubt!

Abhijit, November 14, 2002 - 7:22 am UTC

Hi Tom,
I have got a query in SQL Server as follows.
SELECT A.CHECKLIST_NAME,A.CHECKED,A.FUNCTIONNAME,A.MAKER_NAME,A.MAKER_DATE,B.DESCRIPTION,C.MANDATORY,A.CHECK_SERIAL_NO,A.LEVEL_NO
FROM TABLE_CHECK A LEFT OUTER JOIN
LEFT OUTER JOIN TABLE_RECORD B ON
(A.CHECKLIST_NAME=B.VALUE AND B.RECORD_TYPE='CHK')
LEFT OUTER JOIN TB_IW_CHECKLIST C ON
(A.CHECKLIST_NAME=C.CHECKLIST_NAME
AND A.FUNCTIONNAME=C.FUNCTIONNAME)
AND A.WORKITEM_ID = '02318ESWM047119'


This is what I did to the query in Oracle 8i. Now I am not able to execute it.

SELECT A.CHECKLIST_NAME,A.CHECKED,A.FUNCTIONNAME,A.MAKER_NAME,A.MAKER_DATE,B.DESCRIPTION,C.MANDATORY,A.CHECK_SERIAL_NO,A.LEVEL_NO
FROM TABLE_CHECK A, TABLE_RECORD B, TABLE_LIST C
WHERE(A.CHECKLIST_NAME=B.VALUE(+)
AND B.RECORD_TYPE='CHK')
AND (A.CHECKLIST_NAME=C.CHECKLIST_NAME(+)
AND A.FUNCTIONNAME=C.FUNCTIONNAME(+)
AND A.WORKITEM_ID = '02318ESWM047119'



What did I do wrong?

Doubtfully,
Abhijit

Tom Kyte
November 14, 2002 - 7:09 pm UTC

Hmm, lets see -- I'm not actually a SQL compiler (contrary to popular belief) and I don't just happen to have your tables (no create tables to help me out) so....

I'll have to PASS - no error messages, no easy way to reproduce, no info to go on.

Why would you port from SQL Server to 8i in 2002??? there have been 2 production releases of Oracle since that version. If you are going through the heartburn of migration -- why would you do it to an already outdated database version??

Outer joining in the case of a join table (8.1.7)

&quot;Ted Billings&quot;, January 10, 2003 - 4:20 pm UTC

/*
Im curious how one can deal with situations where multiple outer joins are required on a join table? Is the only solution to work with a cartesian product? Or is this the point where switching to plsql or asking for schema modification should take place.

I need a query similar to this:
*/

select db_developer.developer_id as developer_id,
nvl(schema_developer.has_written_good_query, 0)
as has_written_good_query
from db_schema,
db_developer,
schema_developer
where schema_developer.schema_id (+) = db_schema.schema_id
and schema_developer.developer_id (+) = db_developer.developer_id;

/*
The general case that I need is which developers
have never written a bad query. If a developer has not written a query against schema, they are considered as having written a bad query against that schema.
Based on these results, I can in turn run a query similar to this:
*/

select db_schema.schema_name as schema_name,
db_developer.developer_name as developer_name,
rank () over (partition by schema_developer.schema_id
order by schema_developer.power_ranking desc)
as power_ranking
from db_schema,
db_developer,
schema_developer
where db_schema.schema_id = schema_developer.schema_id
and db_developer.developer_id = schema_developer.schema_id
and db_developer.developer_id = (
select distinct developer_id
from ( select db_developer.developer_id as developer_id,
nvl(schema_developer.has_written_good_query, 0)
as has_written_good_query
from db_schema,
db_developer,
schema_developer
where schema_developer.schema_id (+) = db_schema.schema_id
and schema_developer.developer_id (+) = db_developer.developer_id)
where has_writen_good_query = 1)
order by schema_name,
developer_name;

/*
Now the only way I was able to solve this (so far) was to do a cartesian product (ick!!) on schema and developer. I would hope there is a better way.
*/
select distinct developer_id
from (
select db_schema.schema_id,
db_developer.developer_id,
0 as has_written_good_query
from db_schema,
db_developer
minus
select schema_developer.schema_id,
schema_developer.developer_id,
0 as has_written_good_query
from schema_developer
union
select schema_developer.schema_id,
schema_developer.developer_id,
schema_developer.has_written_good_query
from schema_developer
where has_written_good_query = 0);

/*
which in turn is used in this context:
*/
select db_schema.schema_name as schema_name,
db_developer.developer_name as developer_name,
rank () over (partition by schema_developer.schema_id
order by schema_developer.power_ranking desc)
as rank
from db_schema,
db_developer,
schema_developer
where db_schema.schema_id = schema_developer.schema_id
and db_developer.developer_id = schema_developer.developer_id
and db_developer.developer_id not in (
select distinct developer_id
from (
select db_schema.schema_id,
db_developer.developer_id,
0 as has_written_good_query
from db_schema,
db_developer
minus
select schema_developer.schema_id,
schema_developer.developer_id,
0 as has_written_good_query
from schema_developer
union
select schema_developer.schema_id,
schema_developer.developer_id,
schema_developer.has_written_good_query
from schema_developer
where has_written_good_query = 0))
order by schema_name,
rank;

/*
Here's the sample ddl and data:
*/
-- Drop the example tables, if they exist
drop table schema_developer;
drop table db_schema;
drop table db_developer;

-- Create our schema table
create table db_schema (
schema_id int primary key,
schema_name varchar2(255) not null
);

-- Create our developer table
create table db_developer (
developer_id int primary key,
developer_name varchar2(255) not null
);

-- Create our schema developer join table
create table schema_developer (
schema_id int,
developer_id int,
has_written_good_query number(1),
power_ranking number(2)
);

alter table schema_developer
add ( primary key (schema_id, developer_id) );

-- Populate our schema table
insert into db_schema values (1, 'prod');
insert into db_schema values (2, 'stage');
insert into db_schema values (3, 'integration');
insert into db_schema values (4, 'dev');

-- Populate our developer table
insert into db_developer values (1, 'Tom Kyte');
insert into db_developer values (2, 'Sue Smith');
insert into db_developer values (3, 'Ted Billings');
insert into db_developer values (4, 'Pete Hanger');

-- Populate our schema developer join table

-- For the first two schemas everyone has
-- written good queries
insert into schema_developer values (1, 1, 1,99);
insert into schema_developer values (1, 2, 1,50);
insert into schema_developer values (1, 3, 1,25);
insert into schema_developer values (1, 4, 1,10);
insert into schema_developer values (2, 1, 1,99);
insert into schema_developer values (2, 2, 1,40);
insert into schema_developer values (2, 3, 1,53);
insert into schema_developer values (2, 4, 1,77);

-- For the third schema, Ted blew it and wrote
-- a query that performed a cartesian product
-- on two huge tables
insert into schema_developer values (3, 1, 1,99);
insert into schema_developer values (3, 2, 1,32);
insert into schema_developer values (3, 3, 0,17);
insert into schema_developer values (3, 4, 1,84);

-- For the fourth schema, Pete hasn't actually
-- written a query against that schema yet.
insert into schema_developer values (4, 1, 1,99);
insert into schema_developer values (4, 2, 1,66);
insert into schema_developer values (4, 3, 1,33);

-- Commit our changes
commit;



Tom Kyte
January 10, 2003 - 6:54 pm UTC

not 100% following you and a little large for a "comment" -- but here is an idea.

will

select ..., (SELECT .... <subquery> )
from ....


be useful here. like:

select deptno, (select count(*) from emp where emp.deptno = dept.deptno)
from dept

is -- for removing outer joins.

Double Outer Join

Reader, May 07, 2003 - 5:16 pm UTC

Tom,
Have learnt a lot from this site. I have a problem:
I need all the customers from the CUSTOMER table.
There are 8 distinct PFMS_CATEGORY in the
PRODUCT_HIERARCHY table and only 7 in the
FUND_BALANCES table. What I need is the ALLOCATED_AMOUNT
for all the Customers (field =CORP_CUSTOMER_DESCRIPTION)
for all the PFMS_CATEGORY in the PRODUCT_HIERARCHY table even
if they don't have any entry for the FUND_BALANCES. For Example
CORP_CUSTOMER PFMS_CATEGORY sum
_DESCRIPTION _DESCRIPTION
__________________________________________
THE LOCAL STORES Analgesics 4321
THE LOCAL STORES Nutritional 1234
THE LOCAL STORES Cough/Cold
THE LOCAL STORES Portfol
----
--
8 Records ( 1 for each Category)

But instead I am getting only two records

THE LOCAL STORES Analgesics 4321
THE LOCAL STORES Nutritional 1234
----
Here is my query:
select
c.CORP_CUSTOMER_DESCRIPTION,
PH.PFMS_CATEGORY_DESCRIPTION,
sum(FB.ALLOCATED_AMOUNT)
from
SPOT.PRODUCT_HIERARCHY PH,
SPOT.FUND_BALANCES FB,
SPOT.CUSTOMER C
where
FB.PFMS_CATEGORY =PH.PFMS_CATEGORY(+)
and FB.PPL_NUMBER (+) = c.PPL_NUMBER
group by c.CORP_CUSTOMER_DESCRIPTION,
PH.PFMS_CATEGORY_DESCRIPTION

Tom Kyte
May 08, 2003 - 9:31 am UTC

tell you what

when I am taking questions (from the home page) feel free to supply me a question with

o table create statements
o inserts to populate
o sample desired output
o what you actually get


and I'll take a look see.

\


Adding Table Descriptions

A Reader, May 08, 2003 - 8:57 am UTC

I missed the table Descriptions in my last post
DESC SPOT.CUSTOMER

NAME Null? Type
------------------------------- --------- -----
PPL_NUMBER NOT NULL VARCHAR2(10)
PPL_DESCRIPTION VARCHAR2(35)
CORP_CUSTOMER VARCHAR2(10)
CORP_CUSTOMER_DESCRIPTION VARCHAR2(35)

DESC SPOT.PRODUCT_HIERARCHY

NAME Null? Type
------------------------------- --------- -----
HC_PC NOT NULL VARCHAR2(2)
PFMS_CATEGORY NOT NULL VARCHAR2(3)
BRAND NOT NULL VARCHAR2(4)
BRAND_FORM NOT NULL VARCHAR2(4)
BRAND_SIZE NOT NULL VARCHAR2(4)
HC_PC_SHORT_DESCRIPTION VARCHAR2(2)
PFMS_CATEGORY_DESCRIPTION VARCHAR2(40)
BRAND_DESCRIPTION VARCHAR2(40)
BRAND_FORM_DESCRIPTION VARCHAR2(40)
BRAND_SIZE_DESCRIPTION VARCHAR2(40)
DIVISION_CODE VARCHAR2(2)
DIVISION_DESCRIPTION VARCHAR2(40)
PES_CATEGORY VARCHAR2(4)
PES_CATEGORY_DESCRIPTION VARCHAR2(40)

DESC SPOT.FUND_BALANCES

NAME Null? Type
------------------------------- --------- -----
PPL_NUMBER NOT NULL VARCHAR2(10)
PFMS_CATEGORY NOT NULL VARCHAR2(3)
FUND_YEAR NOT NULL NUMBER(4)
FUND_ID NOT NULL VARCHAR2(10)
DISTRIBUTION_CHANNEL VARCHAR2(2)
ALLOCATED_AMOUNT NUMBER(13,2)
COMMITTED_AMOUNT NUMBER(13,2)
PAID_AMOUNT NUMBER(13,2)
TRANSFERRED_AMOUNT NUMBER(13,2)

Tom Kyte
May 08, 2003 - 9:58 am UTC

see above, table CREATES, insert intos -- always appreciated. i've no clue what was in your test table.

insert intos are really appreciated. the output of a select * from means I have to edit them to turn them into inserts in the end!

Query for this double outer join

SSA, May 26, 2003 - 4:59 am UTC

Can you pls give the query for this double outer join??

Tom Kyte
May 26, 2003 - 9:41 am UTC

what double outer join? if you mean a full outer join, look up in this page. there are examples of 8i and before (with a union all) as well as full outer joins in this page already.

Query for outer join

SQL user, May 27, 2003 - 2:05 am UTC

There was a query posted in this thread regarding outer join. I wanted to know the excat query to achieve the required output. It would be helpful if you could give the query.

Following are the details posted earlier.
I need all the customers from the CUSTOMER table.
There are 8 distinct PFMS_CATEGORY in the
PRODUCT_HIERARCHY table and only 7 in the
FUND_BALANCES table. What I need is the ALLOCATED_AMOUNT
for all the Customers (field =CORP_CUSTOMER_DESCRIPTION)
for all the PFMS_CATEGORY in the PRODUCT_HIERARCHY table even
if they don't have any entry for the FUND_BALANCES. For Example
CORP_CUSTOMER PFMS_CATEGORY sum
_DESCRIPTION _DESCRIPTION
__________________________________________
THE LOCAL STORES Analgesics 4321
THE LOCAL STORES Nutritional 1234
THE LOCAL STORES Cough/Cold
THE LOCAL STORES Portfol
----
--
8 Records ( 1 for each Category)

But instead I am getting only two records

THE LOCAL STORES Analgesics 4321
THE LOCAL STORES Nutritional 1234
----
Here is my query:
select
c.CORP_CUSTOMER_DESCRIPTION,
PH.PFMS_CATEGORY_DESCRIPTION,
sum(FB.ALLOCATED_AMOUNT)
from
SPOT.PRODUCT_HIERARCHY PH,
SPOT.FUND_BALANCES FB,
SPOT.CUSTOMER C
where
FB.PFMS_CATEGORY =PH.PFMS_CATEGORY(+)
and FB.PPL_NUMBER (+) = c.PPL_NUMBER
group by c.CORP_CUSTOMER_DESCRIPTION,
PH.PFMS_CATEGORY_DESCRIPTION




Tom Kyte
May 27, 2003 - 7:43 am UTC

You have one customer record in C apparently.

It is joined to BOTH records you appear to have in FB.

These two in turn are joined to their mates in PH


I fail to see any logical relationship between C and PH here without an associated FB record.

This is not an outer join problem at all (looks like a data model problem perhaps).

I don't have an example, the data doesn't make sense. No inputs, no outputs. Simple create tables + inserts are generally **VERY** useful. I've no clue what data is where.


Tell you what -- when I'm taking questions, feel free to put this there.

Proper Use of Outer Joins

Su Baba, October 16, 2003 - 5:50 pm UTC

Hi Tom,

I'd like to know if the following is proper use of outer joins
and also if the PL/SQL code at the bottom is at all necessary.

I have 3 tables as shown below. The objective is that for every
matching record (by product) between referrals and orders tables,
there should be one or more matching records (defined by order + 
line + referral) in the accruals table. In addition, if the 
order line is cancelled (orders.stauts = 'Cancelled'), we can
just ignore that line. If the above condition is met, then the
order is considered complete.

In the following example, "Output 1" shows that the current order
is not complete since there is an order line that does not have
a matching accrual.

If we create another record in accruals table for order line L3,
the order would be complete.

Is the following SQL the right approach to this problem?

thanks


SQL> 
SQL> DROP TABLE referrals;

Table dropped.

SQL> DROP TABLE orders;

Table dropped.

SQL> DROP TABLE accruals;

Table dropped.

SQL> 
SQL> CREATE TABLE referrals (
  2  referral     VARCHAR2(10),
  3  product      VARCHAR2(10)
  4  );

Table created.

SQL> 
SQL> CREATE TABLE orders (
  2  order_id     VARCHAR2(10),
  3  line_id      VARCHAR2(10),
  4  product      VARCHAR2(10),
  5  status       VARCHAR2(10)
  6  );

Table created.

SQL> 
SQL> CREATE TABLE accruals (
  2  accrual      VARCHAR2(10),
  3  order_id     VARCHAR2(10),
  4  line_id      VARCHAR2(10),
  5  referral     VARCHAR2(10)
  6  );

Table created.

SQL> 
SQL> 
SQL> INSERT INTO referrals VALUES ('R1', 'P1');

1 row created.

SQL> INSERT INTO referrals VALUES ('R1', 'P2');

1 row created.

SQL> INSERT INTO referrals VALUES ('R1', 'P3');

1 row created.

SQL> 
SQL> INSERT INTO orders VALUES ('O1', 'L1', 'P1', 'Normal');

1 row created.

SQL> INSERT INTO orders VALUES ('O1', 'L2', 'P1', 'Cancelled');

1 row created.

SQL> INSERT INTO orders VALUES ('O1', 'L3', 'P2', 'Normal');

1 row created.

SQL> 
SQL> INSERT INTO accruals VALUES ('A1', 'O1', 'L1', 'R1');

1 row created.

SQL> INSERT INTO accruals VALUES ('A2', 'O1', 'L2', 'R1');

1 row created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> select * from referrals;

REFERRAL   PRODUCT                                                              
---------- ----------                                                           
R1         P1                                                                   
R1         P2                                                                   
R1         P3                                                                   

SQL> select * from orders;

ORDER_ID   LINE_ID    PRODUCT    STATUS                                         
---------- ---------- ---------- ----------                                     
O1         L1         P1         Normal                                         
O1         L2         P1         Cancelled                                      
O1         L3         P2         Normal                                         

SQL> select * from accruals;

ACCRUAL    ORDER_ID   LINE_ID    REFERRAL                                       
---------- ---------- ---------- ----------                                     
A1         O1         L1         R1                                             
A2         O1         L2         R1                                             



---------------------------OUTPUT 1-----------------------------------
SQL> 
SQL> 
SQL> SELECT x.referral, x.order_id, x.line_id, x.product, x.status, ACR.accrual
  2  FROM   accruals ACR,
  3    (
  4     SELECT REF.referral, ORD.order_id, ORD.line_id, ORD.product,
  5            ORD.status
  6     FROM   referrals REF, ORDERS ORD
  7     WHERE  REF.product  = ORD.product AND
  8            REF.referral = 'R1' AND
  9            ORD.status   <> 'Cancelled'
 10     ) x
 11  WHERE  x.order_id       = ACR.order_id (+) AND
 12         x.line_id        = ACR.line_id  (+) AND
 13         x.referral       = ACR.referral (+);

REFERRAL   ORDER_ID   LINE_ID    PRODUCT    STATUS     ACCRUAL                  
---------- ---------- ---------- ---------- ---------- ----------               
R1         O1         L1         P1         Normal     A1                       
R1         O1         L3         P2         Normal                              

SQL> 
SQL> 
SQL> 
SQL> DECLARE
  2     CURSOR c IS
  3        SELECT x.referral, x.order_id, x.line_id,
                  x.product, x.status, ACR.accrual
  4        FROM   accruals ACR,
  5          (
  6           SELECT REF.referral, ORD.order_id, ORD.line_id, ORD.product,
  7                  ORD.status
  8           FROM   referrals REF, ORDERS ORD
  9           WHERE  REF.product  = ORD.product AND
 10                  REF.referral = 'R1' AND
 11                  ORD.status   <> 'Cancelled'
 12           ) x
 13        WHERE  x.order_id       = ACR.order_id (+) AND
 14               x.line_id        = ACR.line_id  (+) AND
 15               x.referral       = ACR.referral (+);
 16  
 17     i NUMBER := 0;
 18     l_order_incomplete BOOLEAN := FALSE;
 19  
 20  BEGIN
 21     FOR x IN c LOOP
 22        i := i + 1;
 23        IF (x.accrual IS NULL) THEN
 24           l_order_incomplete := TRUE;
 25        END IF;
 26     END LOOP;
 27  
 28     IF ((NOT l_order_incomplete) AND i > 0) THEN
 29        dbms_output.put_line('ORDER COMPLETE');
 30     ELSE
 31        dbms_output.put_line('ORDER NOT COMPLETE');
 32     END IF;
 33  END;
 34  /
ORDER NOT COMPLETE                                                              

PL/SQL procedure successfully completed.

SQL> 
SQL> spool off
 

Tom Kyte
October 16, 2003 - 7:21 pm UTC

Looks ok, but you should consider that accruals just seems to be additional attributes of ORDERS. unless something more then I see is afoot, you might consider just makeing the accruals attributes part of the orders entity (not being a finance guy, that might not make sense -- i just don't know, looking at the example, it would make sense)

Proper Use of Outer Joins - Followup

Su Baba, October 16, 2003 - 9:07 pm UTC

Thanks, Tom.

Orders and accruals are two separate tables because they don't even belong to the same application. Orders belongs to Order Management while accruals belongs to Trade Management. Also an order line can have multiple accruals.

If the accruals table is a huge table, would the performance of the query suffer because of the outer join?


Tom Kyte
October 16, 2003 - 9:14 pm UTC


so? they "belong" to different apps? so what, order mgmt touches "its columns" and "trade mgmt" touches its. you don't design tables for applications, you design tables for holding data effectively, efficiently - for all. applications come, applications go -- the data is key. if these are your own applications -- time to rethink your implementation.

applications are eye candy.

data is data -- data is real. data lives on after applications come and go.

outer joins that are not necessary can be a performance drain - however, in your case it is "sort of moot". unless you go with the single table - you don't have much of a choice

Joining 50 Tables together

Hector, December 02, 2003 - 12:09 pm UTC

Tom

I have been asked to build a table that contains a series of Y/N flags,

e.g

create table all_flags (
ref_no number(10),
flag_1 varchar2(1),
flag_2 varchar2(1),
flag_3 varchar2(1),
.
.
flag_50 varchar2(1));

I have 50 tables of ref_no's that refer to 'Y' for that particular flag/ref_no, and a table of all ref_no's.
The number of records in each flag table could vary from 1 rec to 5 million, with all_ref_no having 5 million

Would you recommend a huge outer join or a series of smaller outer joins followed by a normal join.
Or can you suggest an alternate approach.

Method 1

create table all_flags as (
select t0.ref_no,
nvl(F1,'N') FLAG_01,
nvl(F2,'N') FLAG_02,
nvl(F3,'N') FLAG_03,
.
.
nvl(F50,'N') FLAG_50
from all_ref_nos t0,
(select ref_no,'Y' F1 from flag1_tab) t1,
(select ref_no,'Y' F2 from flag1_tab) t2,
(select ref_no,'Y' F3 from flag1_tab) t3,
.
.
(select ref_no,'Y' F50 from flag1_tab) t50,
where t0.ref_no = t1.ref_no(+)
and t0.ref_no = t2.ref_no(+)
and t0.ref_no = t3.ref_no(+)
and .
and .
and t0.ref_no = t50.ref_no(+));

Method 2

As above but only joining some of the tables in each query followed by


create table all_flags as (
select flag1_10.ref_no, f1, f2, f3, .. f50
from flag1_10,
flag11_20,
flag21_30,
flag31_40,
flag41_50
where flag1_10.ref_no = flag11_20.ref_no
and flag1_10.ref_no = flag21_30.ref_no
and flag1_10.ref_no = flag31_40.ref_no
and flag1_10.ref_no = flag41_50.ref_no);




Tom Kyte
December 02, 2003 - 12:32 pm UTC

tell me what this here table will be used for -- will you look at the flags for "a" refno at a time or what?

(cause I'm leaning towards "let's have a view so we don't have to rebuild this redundant table all of the time")

Joining 50 Tables together

Hector, December 02, 2003 - 2:04 pm UTC

Hi Tom

these 50 or so flags (derived from various tables within our warehouse) will eventually be joined to another table to form an 'Account' table within a mart.
These flags are basically summary flags such as 'Had more than 5 transactions in the past month'

Our application will then select accounts based upon these flags.

I hope this helps

Tom Kyte
December 02, 2003 - 3:01 pm UTC

the massive outer join

consider big enough pga_aggregate_target/hash_area_size and make sure to use the CBO for the hash joins.

Joining 50 Tables together

Hector, December 10, 2003 - 8:56 am UTC

Thanks Tom

Would indexed organised tables and/or hash partitioned tables improve performance here.

Or will it be test it and see

Tom Kyte
December 10, 2003 - 3:34 pm UTC

if this is your most important thing -- you should consider

NOT HAVING 50 TABLES IN THE VERY VERY VERY FIRST PLACE

if you are looking at this this hard, perhaps you need to look at the design in the first place. Nothing is going to make a 50 table OUTER JOIN efficient

multiple table join

Jon McRea, December 11, 2003 - 6:07 am UTC

Tom

You often say in your follow ups 'WHY DO YOU SHOUT'

As far as I can see, Hector is just asking a simple question.

I have been following this thread with interest as I am in a very similar position of having to create a table that contains some summary flags, and was considering an outer join. I note from Hectors original post that he asked "Or can you suggest an alternate approach."

You seemed to steer him towards the outer join

Whenever anybody writes code, performance is one of the major considerations, so I would have thought that his follow on question was a good one to ask.

Tom Kyte
December 11, 2003 - 7:18 am UTC

i do not "often" say that. and when I do it is because the entire question is phrased in upper case.

i used upper case here for EMPHASIS. to stress something. to say it emphatically.

I did not do so to offend.

"WHY do you have 50 tables" -- they are trying to speed up something, appear to be working very hard to do so -- when the underlying design is just something that will never lend itself towards that.

I would recommend an entire redesign of the entire underlying schema. there is nothing that will make a 50 table outer join "better" short of removing 50 tables.

Outer join query is confusing me.

Dave Blake, March 05, 2004 - 6:10 am UTC

Firstly sorry about saying the above question was not useful... If you read it again you will see why...

On with the question...


Tom, I have a problematic outer join query that I wondered if you could give some pointers on.

Firstly I have a number of tables as shown below...


___________________
| OFFICES |
|__________________|
| ID |
| NAME |
| |
|__________________|
|
|
_________/\________
| DEPARTMENTS |
|__________________|
| ID |
| NAME |
| OFFICE_ID |
|__________________|
|
|
|
_________/\_________ __________________
| EMPLOYEE_NUMBERS | | PAYBANDS |
|___________________| |_________________|
| PAYBAND_NAME | | |
| DEPT_ID |>----| NAME |
| VALUE | | |
|___________________| |_________________|


As you can see from above (hopefully the format will be OK when I submit this page) there is a hierarchy between OFFICES -> DEPARTMENTS -> EMPLOYEE_NUMBERS and PAYBANDS is a lookup table.


The query I am trying to produce should select the total employees for each payband and produce a null value for each payband that does not have an employee.

Below is a query that works at department level.

SELECT pb.NAME, NVL(SUM (en.VALUE),0)
FROM EMPLOYEE_NUMBERS en, PAYBANDS pb
WHERE en.PAYBAND_NAME(+) = pb.NAME
AND en.DEPT_ID(+) = 'HR'
GROUP BY (pb.NAME)

Now, the place that I get stuck is when I try to do the same at OFFICE level. EG. Selecting data for all departments in the office. The closest I can get is below.


SELECT pb.NAME, NVL(SUM (en.VALUE),0)
FROM EMPLOYEE_NUMBERS en, PAYBANDS pb,
DEPARTMENTS d, OFFICES o
WHERE en.PAYBAND_NAME(+) = pb.NAME
AND en.DEPT_ID = d.ID
AND d.OFFICE_ID = o.ID
AND o.NAME = 'London Branch'
GROUP BY (pb.NAME)


I understand that this does not work because of the following lines.
WHERE en.PAYBAND_NAME(+) = pb.NAME
AND en.DEPT_ID = d.ID

The second join cancels out the outer join on the first line. I wondered if you could help me as I seem to have it a dead end with this query.


Thanks a lot

Dave Blake

Tom Kyte
March 05, 2004 - 8:24 am UTC

what above question? I see no prior question/input from you?  anyway.....

starting with this data:


ops$tkyte@ORA9IR2> select * from offices;
 
        ID NAME
---------- ------------------------------
         1 Here
         2 There
 
ops$tkyte@ORA9IR2> select * from departments;
 
        ID NAME                            OFFICE_ID
---------- ------------------------------ ----------
        10 dept10                                  1
        20 dept20                                  1
        30 dept30                                  1
       100 dept100                                 2
       200 dept200                                 2
       300 dept300                                 2
 
6 rows selected.
 
ops$tkyte@ORA9IR2> select * from employee_numbers;
 
PAYBAND_NAME       DEPT_ID      VALUE
--------------- ---------- ----------
underpaid               10       1000
just right              20       1001
just right              30       1002
underpaid              100       2000
just right             200       2001
 
ops$tkyte@ORA9IR2> select * from paybands;
 
NAME
------------------------------
underpaid
just right
overpaid
 
<b>we can see there are a couple of ways to write your query.  one is:</b>

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select pb.name, sum(en.value)
  2    from paybands pb, employee_numbers en
  3   where pb.name = en.payband_name(+)
  4   group by pb.name
  5  /
 
NAME                           SUM(EN.VALUE)
------------------------------ -------------
just right                              4004
overpaid
underpaid                               3000

<b>and another is:</b>
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select name,
  2        (select sum(value)
  3           from employee_numbers en
  4          where en.payband_name = pb.name ) val
  5    from paybands pb
  6  /
 
NAME                                  VAL
------------------------------ ----------
underpaid                            3000
just right                           4004
overpaid
 
<b>Now, to get your report at the "office" level.  what we need is

a) the cartesian product between paybands and offices to get ever office/payband combo

b) the join between employee numbers and departments (so we can roll emps upto the office level)

if we had those two tables -- it would be straightforward -- just outer join A to B. We don't but inline views let us pretend we do:</b>


ops$tkyte@ORA9IR2> select pb_name, o_name, sum(value)
  2    from (
  3  select pb.name pb_name, o.name o_name, o.id o_id
  4    from paybands pb, offices o
  5         ) a,
  6         (
  7  select value, office_id, payband_name
  8    from employee_numbers, departments
  9   where dept_id = id
 10         ) b
 11   where o_id = office_id(+)
 12     and pb_name = payband_name(+)
 13   group by pb_name, o_name
 14  /
 
PB_NAME         O_NAME          SUM(VALUE)
--------------- --------------- ----------
overpaid        Here
overpaid        There
underpaid       Here                  1000
underpaid       There                 2000
just right      Here                  2003
just right      There                 2001
 
6 rows selected.

<b>we can use the same scalar subquery trick as well to remove the outer join and do it more "row by row"</b>

 
ops$tkyte@ORA9IR2> select pb.name pb_name, o.name o_name, o.id o_id,
  2        (select sum(value)
  3           from employee_numbers en, departments d
  4          where en.payband_name = pb.name
  5            and en.dept_id = d.id
  6            and d.office_id = o.id) val
  7    from paybands pb, offices o
  8  /
 
PB_NAME         O_NAME                O_ID        VAL
--------------- --------------- ---------- ----------
underpaid       Here                     1       1000
just right      Here                     1       2003
overpaid        Here                     1
underpaid       There                    2       2000
just right      There                    2       2001
overpaid        There                    2
 
6 rows selected.


<b>which one works best depends on the application.  You might well find that the scalar subquery with a FIRST_ROWS hint is best for "get me the first rows fast" where as the outer join is best for "get me all of the rows fast"</b>
 

Thanks a lot.

A reader, March 05, 2004 - 9:12 am UTC

Thanks a lot for the above answer Tom.


Just one question - you say about using the FIRST_ROWS hint. Am I correct in thinking that Oracle is moving towards just using cost based analysis rather than rule based? If this is correct then surely it is better to start trying to avoid using hints now.

If I am wrongly informed then sorry for wasting your time!

Dave

Tom Kyte
March 05, 2004 - 10:24 am UTC

in 10g, the RBO is officially "not supported", only the CBO is.

There are good hints, hints that give the optimizer INFORMATION. first_rows tells the optimizer useful stuff "hey, i am interested in initial response time here, I might be asking for 10000 rows but only fetch the first could dozen/hundred"

all_rows tells it something useful.
append tells it something useful
dynamic_sampling
cardinality
parallel
driving_site
choose
cursor_sharing_exact
rewrite

they all give the optimizer more information -- they are "good". the others that tell it how to do something, they are the ones I don't like to see used "as a matter of fact"

a table may not be outer joined to anly one table

A reader, May 26, 2004 - 4:14 pm UTC

Tom,
Here is my query

select pm.pog_id, pog.site_id, pog_date, pm.po_id, po.sales_order_id, pm.batch_id, pm.lot_id, pm.product_part_id,
pol.oligo_sequence, pol.oligo_type, po.qc_failure_count, pp.plate_id, pm.pad_plate_position, pm.material_type,
pm.material_id, to_char(pqr.yield), pqr.quant_status, to_char(pmsr.dalton_diff),
to_char(pmsr.percent_dalton_diff), ms_fail_mode, pmsr.ms_status, pog.product_type, pm.failure_code, null, null, null, null, null,null,null,null,null,null,
null, null, null, null, null, null
from ot_pog_material pm,
ot_production_orders po,
ot_pog_plates pp,
ot_pog_oligos pol,
ot_pog_quant_results pqr,
ot_pog_quant_result_sets pqrs,
ot_pog_mass_spec_results pmsr,
ot_production_order_groups pog
where pm.pog_id = pog.id
and pm.pgp_id = pp.id
and pm.id = pqr.mother_pgm_id
and pqr.pog_id = pog.id
and pqr.qrs_id = pqrs.id
and pqrs.pog_id = pog.id
and pm.id = pmsr.mother_pgm_id
and pmsr.pog_id = pog.id
and po.po_id = pm.po_id
and po.pog_id = pog.id
and pm.pog_id = pqr.pog_id
and pm.pog_id = pqrs.pog_id
and pqrs.accepted_flag = 'Y'
and pm.pog_id = pmsr.pog_id
and pm.pog_id = po.pog_id
and pqr.mother_pgm_id = pmsr.mother_pgm_id
and pqr.pog_id = pqrs.pog_id
and pqr.pog_id = pmsr.pog_id
and pqr.pog_id = po.pog_id
and pqrs.pog_id = pmsr.pog_id
and pqrs.pog_id = po.pog_id
and pmsr.pog_id = po.pog_id
and pqrs.pgp_id = pp.id
and pqrs.pgp_id = pm.pgp_id
and pog.id = pol.pog_id
and po.po_id = pol.po_id
and pm.oligo_type = pol.oligo_type
and pm.pog_id = pol.pog_id
and pm.po_id = pol.po_id

I have some records that exists in all the tables. There are some that exist in po, pog, pm, pol,pp , pqr and pqrs and not in pmsr.

And then I have some that exist in po, pog, pm, pol,pp ,pmsr and not in pqr and pqrs.

How would I modify my query to fetch me all the records. I am using Oracle 8i. So I tried to do union instead of a full outer join. But I could not make the outer join work with lots of tables. I get a error - A table may be outer joined to only one table.
Thanks

Tom Kyte
May 26, 2004 - 4:46 pm UTC

don't know -- too big, too long and too hard to figure out who joins to what and when.

that and I have no idea of the relationships (1;1, 1:m whatever) which would allow for or disallow for the use of scalar subqueries.

one needs entity relations, not a query :)

outer join

A reader, May 26, 2004 - 6:23 pm UTC

ok, my point was , I have a query with multiple tables. And I want to use an outer join because the data is missing in one of the tables. In the where clause this table is joined to many tables. If I try to join this tables with an outer join with more than one table, I get an error - a table can be outer joined with only one table. And if I move the outer join to the other said as you said previously in this thread of replies, I do not get all the rows.
Can you help me with this?

Tom Kyte
May 27, 2004 - 8:38 am UTC

no, because I do not have a simple example that is digestable easily in order to demonstrate with.

your description is not clear "missing in one of the tables" -- does not go with above example where there are many. "this table is joined to many tables" -- which table is "this table".

Use tables A, B, C, D, E for example
List how they relate (please just use single column keys) using your real world problem.
List out what you really want from them.

A simple example

Tak Tang, May 27, 2004 - 8:23 am UTC

I found outer joins very difficult when I started out, so I'll offer my thoughts . . .

-- Begin Sample schema --

create table t1
(
value1 varchar2(10)
)
/

create table t2
(
value2 varchar2(10)
)
/

create table r
(
value1 varchar2(10),
value2 varchar2(10),
value3 varchar2(20)
)
/

insert into t1 values ( 'apple' );
insert into t1 values ( 'oranges' );

insert into t2 values ( 'fresh' );
insert into t2 values ( 'candied' );

insert into r values ( 'apple', 'fresh', 'lots of vitamins' );
insert into r values ( 'cherries', 'candied', 'yum');

-- End Sample schema --


A table (r) outer joined to another table (t1) :-

TANGT@DEVC> select t1.*, r.*
2 from t1, r
3 where t1.value1 = r.value1 (+)
4 /

VALUE1 VALUE1 VALUE2 VALUE3
---------- ---------- ---------- --------------------
apple apple fresh lots of vitamins
oranges

2 rows selected.


A table (r) outer joined to two tables (t1 and t2) :-

TANGT@DEVC> select t1.*, t2.*, r.*
2 from t1, t2, r
3 where t1.value1 = r.value1 (+)
4 and t2.value2 = r.value2 (+)
5 /
where t1.value1 = r.value1 (+)
*
ERROR at line 3:
ORA-01417: a table may be outer joined to at most one other table


My solution . . . use a view to bring t1 and t2 together, and outer join to that :-

TANGT@DEVC> create or replace
2 view v
3 as
4 select t1.*, t2.*
5 from t1, t2
6 /

View created.

TANGT@DEVC> select v.*, r.*
2 from v, r
3 where v.value1 = r.value1 (+)
4 and v.value2 = r.value2 (+)
5 /

VALUE1 VALUE2 VALUE1 VALUE2 VALUE3
---------- ---------- ---------- ---------- --------------------
apple candied
apple fresh apple fresh lots of vitamins
apple peel
oranges candied
oranges fresh
oranges peel

6 rows selected.


Alternatively, use an 'inline view' :-

TANGT@DEVC> select v.*, r.*
2 from
3 ( -- bring t1 and t2 together as an inline view
4 select t1.*, t2.*
5 from t1, t2
6 ) v,
7 r
8 where v.value1 = r.value1 (+)
9 and v.value2 = r.value2 (+)
10 /

VALUE1 VALUE2 VALUE1 VALUE2 VALUE3
---------- ---------- ---------- ---------- --------------------
apple candied
apple fresh apple fresh lots of vitamins
apple peel
oranges candied
oranges fresh
oranges peel

6 rows selected.


Takmeister


Multiple FKs

A reader, November 19, 2004 - 11:01 am UTC

Our database design is such that all non-PK columns are either numbers, free-form text, a 1-byte numeric flag (1/0), or a varchar2(1) indicator (A-Z) or a (optional) FK to a big lookup table containing some a short code, long code and some other information about that field's domain.

The latter is mainly so that each of those columns can be validated against the list of valid values allowed for it. Yes, check constraint would work here as well, but the lookup table contains other information (short code, long code) that users need.

That brings me to the topic being discussed on this page...multiple outer joins.

If this table has 10 of these FK columns, the view I create on this table would have 10 outer-joins to this lookup table!

select
t.pk,
lkp1.short_val,lkp1.long_val,lkp1....,
from
t,lkp lkp1,lkp lkp2,lkp lkp3,...
where
t.lkp1_id=lkp1.id(+)
and t.lkp2_id=lkp2.id(+)
and t.lkp3_id=lkp3.id(+)
...

The performance of this view is not so good. Especially when predicates are applied on the columns from the lkp tables!

How would you approach this problem from a (re)design perspective or can the existing design be salvaged?

Thanks

Tom Kyte
November 19, 2004 - 12:04 pm UTC

are all of the "relations" optional here?


what kind of system is this that ALL columns are pretty much containing just list of values values? seems really "strange"

A reader, November 19, 2004 - 1:28 pm UTC

Yes, all the relations are optional.

No, not all the columns are like this. Like I said, there are regular varchar2 columns (free form text), numbers (amounts), dates, 1/0 flags. On average, 10% of columns are dates, 25% are numbers, 10% are 1/0 flags or indicators, 25% are varchar2, rest are these "list of value" values. Whats strange about it?

Tom Kyte
November 19, 2004 - 2:22 pm UTC

to have 3 out of every ten columns be a "lookup" on average is "high"

6 optional joins for a 20 column table. Very high.


I don't see a "single view" in your life.

The outer join will be right "sometimes".
A natural join is called for when you "where" on the attributes.
Scalar subqueries will be right "other times".

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:22868017492361 <code>

for a discussion on the two approaches.

Dummy FKs

A reader, November 19, 2004 - 2:36 pm UTC

OK what do you think about creating a "dummy" row in my lookup table with values "N/A" or "Unknown" or something like that and now I can NOT NULL all my FKs columns and update all the nulls to point to this dummy row.

This way, all my outer joins can become exact joins? Yes, my "view" would still have 6 joins with this lookup table, but at least they are all exact joins.

Oracle is designed to handle joins, right, so there is no reason to be wary of a view/query involving dozens of tables and dozens of joins?

Thanks

Tom Kyte
November 19, 2004 - 7:29 pm UTC

technically, speaking as a purist, i'd say no.

speaking as a pragmatic person, I'd say -- if you can do that, go for it, yes.

but.... you'll still want two views if you ask me :)

create view v as
select ...
from t1, t2, t3, t4, t5, t6
where <joins>

will always hit t1 ... t6

if you only query a column from t1, it'll have to hit t1..t6

if you used scalar subqueries, it'll only hit what it needs to hit as it needs to hit them.

Scalar subqueries

A reader, November 19, 2004 - 10:50 pm UTC

"if you used scalar subqueries, it'll only hit what it needs to hit as it needs to hit them."

Yes, I agree except (as I discuss in the thread you pointed me to above), what about performance when a predicate is applied on the scalar-subqueried column?

create view v as
select ...,
(select value from lookup where code=t.code) value
...
from t;

How would select .. from v where value='foo' perform? Even if there is a index on lookup.code, it wouldnt use it?

Thanks

Tom Kyte
November 20, 2004 - 8:13 am UTC

this is precisely why I say "you need two"

"one size never fits all"



A reader, November 20, 2004 - 9:00 am UTC

"this is precisely why I say "you need two""

Hm, but how would I explain this to my users?

Release Notes: There are 2 identical views, v_exactjoin and v_scalar_subquery.

They have the same columns but if you want to put a predicate on columns x,y,z, use the former view. If you want to pick and choose the columns and NOT apply predicates, use the latter view?!

That sounds a little unrealistic, doesnt it?

Tom Kyte
November 20, 2004 - 9:41 am UTC

why are end users ad-hocing operational data?

if these are in support of reporting, perhaps your structure is really inappropriate.

otherwise, for the developers -- yes (heck, i wouldn't even have views)

your choice otherwise is:

always join N tables, always -- for everything.


decision is up to you at the end of the day.

A reader, November 20, 2004 - 9:46 am UTC

<quote>
create view v as
select ...
from t1, t2, t3, t4, t5, t6
where <joins>

will always hit t1 ... t6

if you only query a column from t1, it'll have to hit t1..t6
<quote>

1. Yes it would have to hit t1..t6, but it would be just a PK hit, not too expensive, right?

2. More importantly, if I go down this path, some of my tables have 200 columns and 50 of them are this lookup type. So the view based on this table would have a 50-table join. Some queries would join this view to other views of this type, and I can easily see the final query having a 100-table join (with a handful <6 of "main" tables and dozens of aliases for the "lookup" table).

Are there any practical limitations on Oracle being able to do 100-table joins? Any other considerations?

Thanks

Tom Kyte
November 20, 2004 - 10:06 am UTC

1) depends on how many rows we are talking about here -- one row, your end user won't notice it, scanning thousands of rows, it will be "felt"

2) 50 table join. ok, now you are starting to scare me. a single row -- one row -- from that 50 table join is going to incurr something on the order of:

3 LIO's in the index on t1 to find the row (assuming normal index with height of 3
1 LIO to table access by index rowid the row...
3*50 LIO's = 150 LIO's to index probe all of the other tables....

153 LIO's for a *single row*.

that is what we might call "alot"

practical limits -- resource consumption. it can do it, the question is -- do you want to do it.

A reader, November 20, 2004 - 10:12 am UTC

"why are end users ad-hocing operational data?"

You are right, this is a operational data store. The tables are normalized, contain lots of internal surrogate key ids and cryptic column names so the intent behind the views is just to do all the legwork and present a business-friendly picture of the data. That would be for the adhoc business-type users. They typically access the data using MS Access (ugh!, we are working on them!). So, they mostly browse the data and if they see anything of interest, they put additional filters/predicates on few columns.

The other class of "users" are the BI-type report developers. They need to produce reports based on what the business wants and they join these views all, put predicates, etc.

"if these are in support of reporting, perhaps your structure is really inappropriate"

Given my 2 classes of users above, what structures (both physical and logical) would you suggest?

Thanks

Tom Kyte
November 20, 2004 - 10:16 am UTC

perhaps an operational data store and materialized views refreshed as the business needs (which does not imply "continously" really) that 'denormalize' the data for reporting.


Join tables Dynamically

A reader, August 04, 2005 - 12:27 pm UTC

Tom,
Is there a way to dynamically join tables in the same Query ?
I mean
if i have 3 tables emp,dept_us,dept_other,
then i want dept_us to be joined to emp table if cond1[some user input-integer] is met
else i want dept_other to be joined to emp table

I am currently taking this approach..

if cond1 = 1
then
select e.*,d.*
from emp e,dept_us d
where e.deptno=d.deptno
elsif cond1=2
then
select e.*,d.*
from emp e,dept_other d
where e.deptno=d.deptno
else -- use both tables
select e.*,d1.*,d2.*
from emp e,dept_us d1,dept_other d2
where e.deptno=d1.deptno
and e.deptno=d2.deptno
end if;

But , if i am joining 10 tables, i have to use 10
if-elseif statements

Is it possible[Am I making sense at all :-( ]?

Thanks,


Tom Kyte
August 04, 2005 - 1:44 pm UTC


well, you have AT LEAST two different select lists there -- how does your code deal with that?


but, yes, plsql supports dynamic SQL. You could build the query in a string.

thanks for the inspiration

K, November 18, 2005 - 1:24 pm UTC

Your example with
"-- bring t1 and t2 together as an inline view"
inspired me and now I have a cool sql that gets the results I needed!
THANKS!

Query that never seems to return

Suzanne, January 17, 2006 - 8:28 pm UTC

I've been wandering through your site and I haven't found a clue to my issue so I'm going to ask my question. I've got a small, modest query that has recently stopped working. My users claim it has to do with my patching the DB to 9.2.0.6, I'm not sure I buy it. Here is the query -

SELECT count(*)
FROM ODS.CURRENT_DAT_DAILIES DD
join ODS.ARCHIVE_CITIES AC on (DD.FROM_DEST=AC.CITY_INDEX)
join NET.STATES NS on (AC.STATE=NS.STATE)
WHERE NS.COUNTRY = 'USA'
AND AC.STATE NOT IN ('GU', 'PR', 'VI', 'AS', 'HI', 'DC', 'AK')
AND DD.EQUIPMENT IN ('F', 'FA', 'F2', 'FM', 'MX', 'FD', 'FV', 'FR', 'FS', 'FT', 'TT', 'VF', 'SD')
AND (DD.START_DATE between next_day((sysdate - 14),'SUNDAY') and next_day((sysdate - 7),'SUNDAY'))

I've let this query run for hours and it has never returned. I haven't been able to figure out why it runs so long.

Here are the table layouts
TABLE ODS.ARCHIVE_CITIES
(CITY_INDEX NUMBER NOT NULL,
CITY VARCHAR2(14 byte),
COUNTY VARCHAR2(14 byte),
PREFERENCE CHAR(1 byte),
STATE VARCHAR2(2 byte),
LATITUDE NUMBER,
LONGITUDE NUMBER,
METRO_CODE NUMBER,
ZIP_CODE VARCHAR2(9 byte),
CREATE_DATE DATE,
ODS_BATCH_ID NUMBER,
ODS_BEGIN_EFFECTIVE_DATE NOT NULL,
ODS_END_EFFECTIVE_DATE DATE,
CONSTRAINT ARCHIVE_CITIES_PK PRIMARY KEY(CITY_INDEX)

TABLE ODS.CURRENT_DAT_DAILIES
(DAILY_ID NUMBER NOT NULL,
ACCOUNT_ID NUMBER,
SUBACCOUNT_ID CHAR(2 byte),
DIVISION VARCHAR2(12 byte),
COMP_ID NUMBER,
USER_ID VARCHAR2(16 byte),
PRODUCT_ID CHAR(2 byte),
PRICE_CLASS CHAR(2 byte),
START_DATE DATE NOT NULL,
AVAILABLE_DATE DATE,
FROM_DEST NUMBER,
TO_DEST NUMBER,
POST_COUNT NUMBER,
CSR_USERNAME VARCHAR2(16 byte),
SALES_USERNAME VARCHAR2(16 byte),
EQUIPMENT CHAR(2 byte),
LENGTH NUMBER,
WEIGHT NUMBER,
SIEBEL_ACCOUNT_ID VARCHAR2(15 byte),
PORTAL_ACCOUNT_ID VARCHAR2(60 byte),
EVENT_TYPE VARCHAR2(60 byte),
ODS_BATCH_ID NUMBER NOT NULL,
ODS_BEGIN_EFFECTIVE_DATE DATE NOT NULL,
ODS_END_EFFECTIVE_DATE DATE,
ENHANCEMENTS VARCHAR2(8),
ORDER_PRICE NUMBER,
CONSTRAINT CURRENT_DAT_DAILIES_PK PRIMARY KEY(DAILY_ID,
ODS_BATCH_ID, ODS_BEGIN_EFFECTIVE_DATE, START_DATE)

TABLE NET.STATES (STATE VARCHAR2(2 byte) NOT NULL PRIMARY KEY,
DESCRIPTION" VARCHAR2(50 byte),
COUNTRY VARCHAR2(3 byte),
ROW_ID NUMBER,
CREATE_DATE DATE,
CREATE_TIME DATE,
CREATE_EMP VARCHAR2(16 byte),
LAST_MDFY_DATE DATE,
LAST_MDFY_TIME DATE,
LAST_MDFY_EMP VARCHAR2(16 byte),



Tom Kyte
January 18, 2006 - 8:13 am UTC

did you look at the plan to see what the plan was. and do you have a copy of the old system to see what the plan used to be.

Query that never returns

Suzanne, January 18, 2006 - 1:16 pm UTC

I've got the current explain plan from looking at the session in OEM -

Total Cost:

735

Execution Steps:

Step # Step Name
11 SELECT STATEMENT
10 SORT [AGGREGATE]
9 FILTER
8 ODS.CURRENT_DAT_DAILIES TABLE ACCESS [BY INDEX ROWID]
7 NESTED LOOPS
5 MERGE JOIN
2 NET.STATES TABLE ACCESS [BY INDEX ROWID]
1 NET.STATES_PK INDEX [FULL SCAN]
4 SORT [JOIN]
3 ODS.ARCHIVE_CITIES_CITY_ST_IDX INDEX [FAST FULL SCAN]
6 ODS.CUR_DD_START_DATE INDEX [RANGE SCAN]

I don't have the old plan because I didn't know this report existed until it had failed in production three times (after I had applied the patch for 9.2.0.6).

Tom Kyte
January 19, 2006 - 8:05 am UTC

so, how big are these tables. you can see what it does - "get stuff from table, sort it, merge it".

if "stuff" is big, and database machine is small, this query would take a while. I see no estimated cardinalities or anything (you are not using the RBO are you??)

Query that never returns

Suzanne, January 19, 2006 - 2:01 pm UTC

One of the tables is large but the other two are fairly small.

Net.States = 100 rows
ODS.Archive_Cities = 553,645 rows
ODS.Current_Dat_Dailies = 5,275,252 rows

These two schemas live in our Data Repository system, this is one of the biggest servers we've got. Since I was aware that the Dat_Dailies was large I've let this query run all night and it still hasn't finished. I would swear this query was acting like I've asked for a Cartesian result set.

As for the row counts and costing figures, since I was using OEM to get the information and OEM uses XML to reformat and display the Explain Plan, the costing figures are included in the explanation of the execution steps. Here is the rather ugly extraction -

Step # Description Est. Cost Est. Rows Returned Est. KBytes Returned
1 This plan step retrieves all ROWIDs from the B*-tree index STATES_PK by walking the index starting with its smallest key. 1 93 --
2 This plan step retrieves rows from table STATES through ROWID(s) returned by an index. 2 51 0.349
3 This plan step retrieves all of the ROWIDs of B*-tree index ARCHIVE_CITIES_CITY_ST_IDX by sequentially scanning the leaf nodes. 30 11 0.075
4 This plan step accepts a row set (its only child) and sorts it in preparation for a merge-join operation. 32 11 0.075
5 This plan step accepts two sets of rows sorted on the join key. By walking both sets of rows in the order of the join key, every distinct pair of rows satisfying the join condition in the WHERE clause is found through a single pass of the row sets. 34 1 0.014
6 This plan step retrieves one or more ROWIDs in ascending order by scanning the B*-tree index CUR_DD_START_DATE. 44 23,191 --
7 This plan step joins two sets of rows by iterating over the driving, or outer, row set (the first child of the join) and, for each row, carrying out the steps of the inner row set (the second child). Corresponding pairs of rows are tested against the join condition specified in the query's WHERE clause. 735 1 0.029
8 This plan step retrieves rows from table CURRENT_DAT_DAILIES through ROWID(s) returned by an index. 701 1 0.016
9 This plan step accepts a set of rows from its child node, eliminates some of them, and returns the rest.
10 This plan step accepts a row set (its only child) and returns a single row by applying an aggregation function. -- 1 0.029
11 This plan step designates this statement as a SELECT statement. 735 1 0.029


Tom Kyte
January 19, 2006 - 4:39 pm UTC

how about just a plain old fashioned

SQL> set autotrace traceonly explain
SQL> <query goes here>

cut and paste from sqlplus.


 

Query that never returns

Suzanne, January 20, 2006 - 5:34 pm UTC

Here is the explain plan from SQL Plus.

SQL> SELECT count(*)
  2  FROM ODS.CURRENT_DAT_DAILIES DD
  3  join ODS.ARCHIVE_CITIES AC on (DD.FROM_DEST=AC.CITY_INDEX)
  4  join NET.STATES NS on (AC.STATE=NS.STATE)
  5  WHERE NS.COUNTRY = 'USA'
  6    AND AC.STATE  NOT IN  ('GU', 'PR', 'VI', 'AS', 'HI', 'DC', 'AK')
  7    AND DD.EQUIPMENT  IN  ('F', 'FA', 'F2', 'FM', 'MX', 'FD', 'FV', 'FR', 'FS', 
  8  'FT', 'TT', 'VF', 'SD')
  9    AND (DD.START_DATE between next_day((sysdate - 14),'SUNDAY') and 
 10  next_day((sysdate - 7),'SUNDAY'))
 11  /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=735 Card=1 Bytes=3
          0)

   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'CURRENT_DAT_DAILIES'
           (Cost=701 Card=1 Bytes=16)

   4    3         NESTED LOOPS (Cost=735 Card=1 Bytes=30)
   5    4           MERGE JOIN (Cost=34 Card=1 Bytes=14)
   6    5             TABLE ACCESS (BY INDEX ROWID) OF 'STATES' (Cost=
          2 Card=51 Bytes=357)

   7    6               INDEX (FULL SCAN) OF 'STATES_PK' (UNIQUE) (Cos
          t=1 Card=93)

   8    5             SORT (JOIN) (Cost=32 Card=11 Bytes=77)
   9    8               INDEX (FAST FULL SCAN) OF 'ARCHIVE_CITIES_CITY
          _ST_IDX' (NON-UNIQUE) (Cost=30 Card=11 Bytes=77)

  10    4           INDEX (RANGE SCAN) OF 'CUR_DD_START_DATE' (NON-UNI
          QUE) (Cost=44 Card=23191) 

Tom Kyte
January 20, 2006 - 6:30 pm UTC

Ok, do you see all of the card= values in there, they are very very very small.

Are they "anywhere close to reality", using your knowledge of the data - are they even close to reality?

Query that never returns

Suzanne, January 24, 2006 - 12:52 pm UTC

No the card= values aren't realistic at all. I ran a Select between the City and State table and the join returned a count of 108592. I then ran a Select between the Dat_Dailies and the City. That join returned a count of 102112. So given what I know about the data my result set should be in the neighborhood of 100,000. Why are the counts so far off?

Tom Kyte
January 24, 2006 - 8:56 pm UTC

are the stats even a little bit up to date?

Query that never returns

A reader, January 24, 2006 - 3:58 pm UTC

After further investigation it looks like the archive_cities table is the biggest offender although the cardinality for the start_date index is somewhat low also. The cardinality for the state table is correct. The archive_cities should return more then 100,000 rows. I tried building an index on the state and city_index field. The optimizer did try and use it but the actual run of the query still didn't return after an hour and a half. I'm still puzzled.

Query that never returns

Suzanne, January 25, 2006 - 1:06 pm UTC

The statistics are very up to date. This is running off of our production database and we run statistics once a week on Sundays. The statictic date on these tables is 01/22/2006. The records count on all of the tables is very close to accurate (as in off by a hundred records or less).

Tom Kyte
January 25, 2006 - 4:24 pm UTC

SQL> SELECT count(*)
  2  FROM ODS.CURRENT_DAT_DAILIES DD
  3  join ODS.ARCHIVE_CITIES AC on (DD.FROM_DEST=AC.CITY_INDEX)
  4  join NET.STATES NS on (AC.STATE=NS.STATE)
  5  WHERE NS.COUNTRY = 'USA'
  6    AND AC.STATE  NOT IN  ('GU', 'PR', 'VI', 'AS', 'HI', 'DC', 'AK')
  7    AND DD.EQUIPMENT  IN  ('F', 'FA', 'F2', 'FM', 'MX', 'FD', 'FV', 'FR', 
'FS', 
  8  'FT', 'TT', 'VF', 'SD')
  9    AND (DD.START_DATE between next_day((sysdate - 14),'SUNDAY') and 
 10  next_day((sysdate - 7),'SUNDAY'))
 11  /

ok, is the autotrace for

select * from net.states ns where ns.country='USA';
select * from ods.archive_cities where AC.STATE  NOT IN  ('GU', 'PR', 'VI', 'AS', 'HI', 'DC', 'AK');
select * from dd.equipment IN  ('F', 'FA', 'F2', 'FM', 'MX', 'FD', 'FV', 'FR', 
'FS', 'FT', 'TT', 'VF', 'SD')
     AND (DD.START_DATE between next_day((sysdate - 14),'SUNDAY') and 
   next_day((sysdate - 7),'SUNDAY'));


close - which one is way off (i'm going to guess "the last one" as the functions on the dates will have the effect of hiding some bits from the optimizer) 

Query that never returns

Suzanne, January 25, 2006 - 8:40 pm UTC

select * from net.states ns where ns.country='USA'; - this is exactly correct in the autotrace. I think the card said card=51 which is right

select count(*) from ods.archive_cities where STATE NOT IN ('GU', 'PR', 'VI',
'AS', 'HI', 'DC', 'AK')

is WAY off. This table is a list of cities and this select run by itself (as above) returns a count of 117525

select count(*) from ods.current_dat_dailies where equipment IN ('F', 'FA', 'F2', 'FM',
'FS', 'FT', 'TT', 'VF', 'SD')
AND (START_DATE between next_day((sysdate - 14),'SUNDAY') and
next_day((sysdate - 7),'SUNDAY'))

is also off by not as badly. I believe the autotrace returned a card=23000 (somewhere in that area). My run of the simple select above returns a count of 102112.

Does this help?

Tom Kyte
January 26, 2006 - 10:13 am UTC

ok, the query begins by

a) get all from archive_cities (it thinks 11, it gets 117525)
b) get all states (it thinks 1, gets 1)

it believes from this, it'll get 1 row after sorting and joining, it is likely getting thousands and thousands of records instead - in a nested loops join - hitting another index .

ok, can you tell us the exact command used to gather stats on archive_cities and what is the output of

select count(*) from archive_cities;
select state, count(*) from archive_cities; <<<=== IF this is a relatively small set, if you have over say 200 rows - please don't post that...




Query that never returns

Suzanne, January 26, 2006 - 5:59 pm UTC

Thank you, Thank you, Thank you. It is all about the data! Based on your last follow up questions I began to actually dig into the data (I've only been at this company for 8 months). It turns out that more then 90% of the data in the archive_cities table isn't city/state information at all. Years ago somebody decided to store search requests in this table. DBAs, don't EVER let somebody use a table for more then one stated purpose. As Oracle's CBO has gotten more and more data-centric this table and it's indexes became more and more problematic until the CBO couldn't figure out what the heck to do with most of this table. I think my ultimate solution will be to use a materialized view to filter the actual, good data. Again, thank you for your help.

Fancy Join

Claude, May 24, 2006 - 2:06 pm UTC

Hi Tom,
I have two tables and I'm trying to do a special kind of join:
If a row in table A matches a row in table B then join on that column, like a regular join. If a row in table A does not match a row in table B then join A's column to all of the NULL rows in column B.
Here are the details:

CREATE TABLE a (acol NUMBER(1));

CREATE TABLE b (bcol NUMBER(1));

INSERT INTO a SELECT ROWNUM FROM dual CONNECT BY LEVEL <=7;

INSERT INTO b SELECT DECODE(ROWNUM, 1, 1, 2, 3, 3, 5)
FROM dual CONNECT BY LEVEL <= 3;

select * from a;

acol
----------
1
2
3
4
5
6
7

select * from b;

bcol
---------
1
3
5

Here is how I'm trying to get the results to look. Notice that 2, 4, 6, and 7 are repeated four times, once for each row in B that does not exist in A.

results

acol bcol
-------------------
1 1
2
2
2
2
3 3
4
4
4
4
5 5
6
6
6
6
7
7
7
7

How can I get these results? I've tried all kinds of fancy queries with no success.
Thanks


Tom Kyte
May 25, 2006 - 1:09 pm UTC

ops$tkyte@ORA10GR2> select *
  2    from a, b
  3   where a.acol = b.bcol
  4  union all
  5  select a.acol, null
  6    from a, b
  7   where acol not in (select bcol from b where bcol is not null)
  8   order by 1
  9  /

      ACOL       BCOL
---------- ----------
         1          1
         2
         2
         2
         3          3
         4
         4
         4
         5          5
         6
         6
         6
         7
         7
         7

15 rows selected.


You said 4 times - once for each row in B --- but b only has three?? 

Fancy Join

Claude, May 25, 2006 - 3:28 pm UTC

You're right. This does the trick and there should be three null rows. u_da_man;

going a step beyond this...

Phil Miesle, July 10, 2006 - 9:35 am UTC

I'm trying to figure out how to do a double-outer-join (das ist verbotten im Oracle). An example will help:

create table dim1(d number, constraint pk_dim1 primary key(d)) organization index;
create table f1(d number, v number, constraint pk_f1 primary key(d)) organization index;
create table f2(d number, v number, constraint pk_f2 primary key(d)) organization index;

insert into dim1 select rownum from all_objects where rownum < 31;
insert into f1 select d, v from (select rownum d, rownum*10 v from all_objects) where d between 11 and 20;
insert into f2 select d, v from (select rownum d, rownum*20 v from all_objects) where d between 16 and 25;
commit;


This is the result I want to achieve:

select dim1.d, nvl(f1.v,0), nvl(f2.v,0)
from dim1, f1, f2
where dim1.d = f1.d(+)
and dim1.d = f2.d(+)
and (f1.v is not null or f2.v is not null);

I want to avoid having to involve dim1 in this query at all. Conceptually, I should be able to walk down both fact tables in key order; if I skip key values on one side of the join I need to advance on the other side until I equal or exceed the first side key value.

It is a little bit of a goofy case, admittedly, but not particularly uncommon when I'm trying to join very sparse multidimensional data in a relational world. Expand my dim1 out into three dimensions with tens of billions of possible intersections, whilst my fact tables (in the same three dimensions) only order in the tens to hundreds of millions of populated values.

The above technique will yield me a correct answer after a very long time executing. At the moment, my only other option is to open two cursors and do the join in application code...and that means a dirty big sort.

I'd rather the database do the work since it is closest to the data!

Tom Kyte
July 10, 2006 - 9:50 am UTC

do you mean a full outer join?


ops$tkyte@ORA10GR2> select coalesce(f1.d,f2.d), nvl(f1.v,0), nvl(f2.v,0)
  2    from f1 full outer join f2 on (f1.d = f2.d)
  3   order by 1
  4  /
 
COALESCE(F1.D,F2.D) NVL(F1.V,0) NVL(F2.V,0)
------------------- ----------- -----------
                 11         110           0
                 12         120           0
                 13         130           0
                 14         140           0
                 15         150           0
                 16         160         320
                 17         170         340
                 18         180         360
                 19         190         380
                 20         200         400
                 21           0         420
                 22           0         440
                 23           0         460
                 24           0         480
                 25           0         500
 
15 rows selected.
 
<b>this is the same - if D is a primary key in both</b>

ops$tkyte@ORA10GR2> select d, sum(f1v), sum(f2v)
  2    from (select d, v f1v, 0 f2v from f1
  3          union all
  4          select d, 0 f1v, v f2v from f2)
  5   group by d
  6   order by 1
  7  /
 
         D   SUM(F1V)   SUM(F2V)
---------- ---------- ----------
        11        110          0
        12        120          0
        13        130          0
        14        140          0
        15        150          0
        16        160        320
        17        170        340
        18        180        360
        19        190        380
        20        200        400
        21          0        420
        22          0        440
        23          0        460
        24          0        480
        25          0        500
 
15 rows selected.

<b>and this is the "do it yourself full outer join"</b>
 
ops$tkyte@ORA10GR2> select f1.d, f1.v, nvl(f2.v,0)
  2    from f1, f2
  3   where f1.d = f2.d(+)
  4   union all
  5  select f2.d, 0, f2.v
  6    from f1, f2
  7   where f1.d(+) = f2.d
  8     and f1.d is null
  9   order by 1
 10  /
 
         D          V NVL(F2.V,0)
---------- ---------- -----------
        11        110           0
        12        120           0
        13        130           0
        14        140           0
        15        150           0
        16        160         320
        17        170         340
        18        180         360
        19        190         380
        20        200         400
        21          0         420
        22          0         440
        23          0         460
        24          0         480
        25          0         500
 
15 rows selected.
 

indeed

Phil Miesle, July 10, 2006 - 11:37 am UTC

Ta for that...strike one up for learning something today! I do in fact mean a 'full outer join', didn't know that was what I meant until I read the definition. Now to try this out with massive data sets...

Outer Join on Large dimensions

Nitin Aggarwal, January 30, 2008 - 12:25 am UTC

While performing ETL run, we select the fact table from source DB (10G) and outer join them to the dimensions. In some of cases these dimensions run up to 31, outer joined to the fact table.

We are trying to leverage "Hash Join Right Outer" way of execution plan which is a new feature in Oracle 10g.

If all the dimensions involved in outer join are small (< 2.5 million records), we see very good performance. Primarily because fact table (200 million records) is not scanned and stored in temporary segment, rather records are fetched directly after probing the in memory hash table.

But, the problem arises when we use bigger dimensions (> 2.5 million records). In this case, after having all dimensions scanned, it starts scanning (scattered reads)the fact table also and writes into temporary segments. The query takes forever to fetch results.

We have 4 GB ram on system with PGA_AGGREGATE_TARGET set to 1500.

I am not sure if this problem is because the hash table on large dimensions could not be contained entirely in memory. But how this leads to scattered reads over fact table, i am unable to understand.

Thanks.
Tom Kyte
January 30, 2008 - 9:51 am UTC

it would always full scan the fact table would it not? db file scattered reads are full scan style reads.

Not sure why you would not ALWAYS see that?

Outer Join on Large dimensions

Nitin Aggarwal, January 30, 2008 - 10:07 am UTC

We generally would not see it in Oracle 10G because of the newer way it executes the outer join. i.e HASH JOIN RIGHT OUTER which does not scan the larger table first. It creates hash tables on the dimensions and then looks up the hash values for the records of the larger table and gets us the data.

So, what we have seen is that for smaller dimensions if does not show any scattered reads over the large fact table, rather it straight fetches the records from it.

But if we incorporate some bigger dimensions in outer join, then after scanning all the dimensions it starts doing scattered read over the fact table with writes to temp segments.Also, it does not fetch any records for a long long time (in days).


Tom Kyte
January 30, 2008 - 10:39 am UTC

are we talking the same "scattered reads"

what WAIT EVENT from oracle do you see:

a) db file scattered read
b) db file sequence read

the fact table would be full scanned regardless of the join order - I would expect to see a) regardless.


but yes, eventually, as the size of your dims EXCEED what will fit in the allocated pga (a percentage of the pga aggregate target) you would expect to see temp be used - it would have to be used - and you would see performance "not be the same" once that starts happening as it has to make multiple passes on the hashed information.

Outer Join on Large dimensions

Nitin Aggarwal, January 30, 2008 - 11:09 am UTC

Yes, There are no "Scattered Reads" or "Sequential Reads" on the fact table (95 Million). without any scattered reads over the table, it gets me the results from the table and i see "SQL* Net Message to client" wait events.

For dimensions i do see scattered reads as well as write temp waits. but once the hash tables are created over them, it directly fetches the records from the fact table as i described above.

But only when dimension is large enough(9 Million records) amongst all other small dimension (31 in number) which are outer joined to fact, it starts with scattered reads and write temp waits over the fact table and goes on for ever.

I feel like it has some thing to do with the hash tables which go out of memory due to larger dimensions. But, i fail to understand how it makes fact table undergo wait events on scattered read and write temp.

Thanks in advance for your help.

Tom Kyte
January 30, 2008 - 1:40 pm UTC

give me an abbreviated version of your query - I fail to see how it could return without READING the fact table in the first case. Do you know how it could? All IO will incur a wait, so if you never see any IO on the fact table in the first case - do you know how it could be even *working*?

Outer Join on Large dimensions

Nitin Aggarwal, January 31, 2008 - 1:00 am UTC

SQL> create table tab1 as select all_objects.* from all_objects,user_objects;

Table created.

SQL> create table tab1_outer as select all_objects.* from all_objects,(select * from user_objects where rownum<100);

Table created.

SQL> exec dbms_stats.gather_table_stats(user,'TAB1',estimate_percent =>30,method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

SQL>  exec dbms_stats.gather_table_stats(user,'TAB1_OUTER',estimate_percent =>30,method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.


SQL> select table_name,num_rows from user_tables where table_name in ('TAB1','TAB1_OUTER');
TAB1                       78988577
TAB1_OUTER                 5190100

SQL> select object_name,object_id from user_objects where object_name in('TAB1','TAB1_OUTER')
TAB1              69331
TAB1_OUTER        69406


SELECT A.*,
       B.OBJECT_ID
FROM   TAB1 A
       LEFT OUTER JOIN TAB1_OUTER B
         ON (A.OBJECT_NAME = B.OBJECT_NAME
             AND A.OBJECT_TYPE = B.OBJECT_TYPE
             AND A.CREATED >= B.CREATED)


SQL> select * from table(dbms_xplan.display('PLAN_TABLE'));
Plan hash value: 2800983551

---------------------------------------------------------------------------------------------
| Id  | Operation             | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |             |    78M|    10G|       |   530K  (1)| 02:03:47 |
|*  1 |  HASH JOIN RIGHT OUTER|             |    78M|    10G|   306M|   530K  (1)| 02:03:47 |
|   2 |   TABLE ACCESS FULL   | TAB1_OUTER  |  5190K|   247M|       | 11031   (1)| 00:02:35 |
|   3 |   TABLE ACCESS FULL   | TAB1       |    78M|  7005M|       |   167K  (2)| 00:39:08 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."OBJECT_TYPE"="B"."OBJECT_TYPE"(+) AND
              "A"."OBJECT_NAME"="B"."OBJECT_NAME"(+))
       filter("A"."CREATED">="B"."CREATED"(+))

17 rows selected.



PARSING IN CURSOR #26 len=218 dep=0 uid=69 oct=3 lid=69 tim=4851248727 hv=4196608148 ad='952220fc'
SELECT A.*,
       B.OBJECT_ID
FROM   TAB1 A
       LEFT OUTER JOIN TAB1_OUTER B
         ON (A.OBJECT_NAME = B.OBJECT_NAME
             AND A.OBJECT_TYPE = B.OBJECT_TYPE
             AND A.CREATED >= B.CREATED)
END OF STMT
PARSE #26:c=0,e=1972,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=4851248721
BINDS #26:
EXEC #26:c=0,e=445,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=4851253922
WAIT #26: nam='SQL*Net message to client' ela= 3 driver id=1952673792 #bytes=1 p3=0 obj#=69406 tim=4851254348
WAIT #26: nam='db file scattered read' ela= 109634 file#=33 block#=1357000 blocks=32 obj#=69406 tim=4851366655
WAIT #26: nam='db file scattered read' ela= 13726 file#=33 block#=1357032 blocks=29 obj#=69406 tim=4851396829
WAIT #26: nam='db file scattered read' ela= 72515 file#=49 block#=565894 blocks=32 obj#=69406 tim=4853457794
WAIT #26: nam='db file scattered read' ela= 35281 file#=49 block#=565926 blocks=31 obj#=69406 tim=4853497122
WAIT #26: nam='db file scattered read' ela= 32554 file#=50 block#=660486 blocks=32 obj#=69406 tim=4853533474
WAIT #26: nam='db file scattered read' ela= 39474 file#=50 block#=660518 blocks=31 obj#=69406 tim=4853578306
WAIT #26: nam='db file scattered read' ela= 47165 file#=51 block#=654150 blocks=32 obj#=69406 tim=4853630325
WAIT #26: nam='db file scattered read' ela= 39733 file#=51 block#=654182 blocks=31 obj#=69406 tim=4853673866
WAIT #26: nam='db file scattered read' ela= 43304 file#=52 block#=579014 blocks=32 obj#=69406 tim=4853720904
WAIT #26: nam='db file scattered read' ela= 37839 file#=52 block#=579046 blocks=31 obj#=69406 tim=4853763968
WAIT #26: nam='direct path write temp' ela= 17 file number=519 first dba=107860 block cnt=15 obj#=69406 tim=4862697598
WAIT #26: nam='db file scattered read' ela= 80999 file#=27 block#=636037 blocks=32 obj#=69406 tim=4862781024
.....
.....
WAIT #26: nam='direct path read temp' ela= 2351 file number=519 first dba=18083 block cnt=15 obj#=69406 tim=4925451243
WAIT #26: nam='db file scattered read' ela= 81783 file#=33 block#=1343496 blocks=32 obj#=69331 tim=4925533884
FETCH #26:c=4671875,e=74280929,p=37002,cr=35368,cu=0,mis=0,r=1,dep=0,og=1,tim=4925535719
WAIT #26: nam='SQL*Net message from client' ela= 1172 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4925536993
WAIT #26: nam='SQL*Net message to client' ela= 2 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4925537051
FETCH #26:c=0,e=52,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,tim=4925537091
WAIT #26: nam='SQL*Net message from client' ela= 2050 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4925539171
WAIT #26: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4925539209
....
....
FETCH #26:c=0,e=47,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,tim=4978531636
WAIT #26: nam='SQL*Net message from client' ela= 2830 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4978534501
WAIT #26: nam='SQL*Net message to client' ela= 2 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4978534581
FETCH #26:c=0,e=55,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,tim=4978534616
WAIT #26: nam='SQL*Net message from client' ela= 2487 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4978537148
WAIT #26: nam='SQL*Net message to client' ela= 2 driver id=1952673792 #bytes=1 p3=0 obj#=69331 tim=4978537186
...
...

As we can notice that there only 1 scattered read performed over table TAB1 (78 Million Records) and after that we only see SQL*NET waits. There are no more scattered read further down.

Had the outer joined table big enough, we would have seen lot of unending scattered reads and write temp on TAB1.

Outer Join on Large dimensions

Nitin Aggarwal, February 14, 2008 - 2:01 am UTC

Hi Tom, i provided a test case above, can you please provide me with some guidance.
Tom Kyte
February 14, 2008 - 10:01 am UTC

show a tkprof of both with waits enabled. I'm not sure what your test case was showing here - or attempting to show.

You said before:

....Yes, There are no "Scattered Reads" or "Sequential Reads" on the fact table (95 Million). without
any scattered reads over the table, it gets me the results from the table and i see "SQL* Net
Message to client" wait events....

I'm sorry - but you have to understand, I see no way for there not to be a full scan of both in all cases.

I can see how when the data gets LARGE, we cease doing everything in MEMORY and one takes longer, but you will see IO on both tables (unless one of them is in the cache, highly unlikely - unless it was really small).


for a hash join, we expect to see (when done in memory)

a) full scan of one table
b) lots of cpu as it is hashed into memory
c) the start of a full scan of another table - and now data comes back to client


once it spills to disk however (if you have access to effective oracle by design, i describe the paging and partitioning that goes on), this changes entirely. I think you are just seeing the natural effect of "it went to disk, things change, processed differently"

Outer Join on large Dimensions

Nitin Aggarwal, February 15, 2008 - 12:05 pm UTC

Presenting two cases. Case1 has one outer join lesser than the Case2. Both the cases behave very differently.

Tables are same as i described before.

SQL> select table_name,num_rows from user_tables where table_name in ('TAB1','TAB1_OUTER');
TAB1                       78988577
TAB1_OUTER                 5190100

SQL> select object_name,object_id from user_objects where object_name in('TAB1','TAB1_OUTER')
TAB1              69331
TAB1_OUTER        69406

System memory : 4GB
PGA_AGGREGATE_TARGET : 1.1 GB

Case 1:
SELECT A.*,
       B.OBJECT_ID,
       C.OBJECT_ID
FROM   TAB1 A
       LEFT OUTER JOIN TAB1_OUTER B
         ON (A.OBJECT_NAME = B.OBJECT_NAME
             AND A.OBJECT_TYPE = B.OBJECT_TYPE
             AND A.CREATED >= B.CREATED)
       LEFT OUTER JOIN TAB1_OUTER C
         ON (A.OBJECT_NAME = C.OBJECT_NAME
               AND A.OBJECT_TYPE = C.OBJECT_TYPE
               AND A.CREATED >= C.CREATED)
       
       
       
       
Case 2:
SELECT A.*,
       B.OBJECT_ID,
       C.OBJECT_ID,
       D.OBJECT_ID
FROM   TAB1 A
       LEFT OUTER JOIN TAB1_OUTER B
         ON (A.OBJECT_NAME = B.OBJECT_NAME
             AND A.OBJECT_TYPE = B.OBJECT_TYPE
             AND A.CREATED >= B.CREATED)
       LEFT OUTER JOIN TAB1_OUTER C
         ON (A.OBJECT_NAME = C.OBJECT_NAME
               AND A.OBJECT_TYPE = C.OBJECT_TYPE
               AND A.CREATED >= C.CREATED)
 LEFT OUTER JOIN TAB1_OUTER D
          ON (A.OBJECT_NAME = D.OBJECT_NAME
              AND A.OBJECT_TYPE = D.OBJECT_TYPE
             AND A.CREATED >= D.CREATED)
       

       
TKprof output:

Case 1:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    32281     12.81   12060.36      77323      70774          0    78988577
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    32283     12.82   12060.39      77323      70774          0    78988577

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 69

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   32281        0.00          0.07
  db file sequential read                         1        0.04          0.04
  db file scattered read                       2220        0.38        183.65
  direct path write temp                       2340        0.21         25.29
  direct path read temp                         444        0.21         28.99
  latch free                                      1        0.00          0.00
  SQL*Net message from client                 32280        2.46        600.55




Case 2:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.15       0.18          0        876          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     9629    119.53   10408.64     649763     643810          0    78988577
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     9631    119.68   10408.83     649763     644686          0    78988577

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 69

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                   2281        0.00          0.01
  db file scattered read                      20133        0.60       1253.87
  direct path write temp                      39957        0.19         28.08
  latch: shared pool                              3        0.00          0.00
  direct path read temp                         464        0.16         19.74
  db file sequential read                         7        0.03          0.14
  latch: object queue header operation            1        0.00          0.00
  SQL*Net message from client                  9629        3.48        225.37


for Case 2:
CPU time jumped ~10 times.
Scattered reads waits jumped ~10 times.
Direct path write temp wait events jumped more than 10 times.

Believe me, there are hardly any scattered reads done for the larger table (TAB1) for Case 1.

C:\oracle\product\10.2.0\admin\gidw10g\udump>grep -i 'scattered.*69331' gidw10g_ora_5480_case1.trc | wc -l
     24

C:\oracle\product\10.2.0\admin\gidw10g\udump>grep -i 'scattered.*69331' gidw10g_ora_5196_case2.trc | wc -l
  16804

Tom Kyte
February 17, 2008 - 7:04 am UTC

can you explain why you believe two entirely different queries that join different tables - would, should or could be expected to perform even remotely the same?????????

I am totally missing something here - I'm very confused as to why you think they "should be the same, similar, close, in the same ballpark"

and given that one returned 1/2 hour faster than the other - I would not really care that it used "10 times the cpu", give that the amount of cpu is still pretty trivial given the RUNTIMES of these.

Jess, July 31, 2012 - 10:09 am UTC

Hi Tom,

I've run into an interesting situation with outer joins.
We have 2 tables: products and orders. Each has a concept of currency--native currency of the product (i.e., what manufacturer sells it in) and currency in which an order was placed. We'd like to run a report where we show all amounts in USD.

Provided not all exchange rates exist every day, we'd like to pull what's available.

We tried going with

with exch_rate as (select currency, exch_rate from exchange_rates where dt = <date>)
select product_name,
product_cost*exch_rate as product_cost_usd,
order_that_included_this_product,
order_total_amt*exch_rate as order_total_usd,
[more columns]
from products, orders, exch_rates, [other tables]
where products.prod_currency = exch_rates.currency(+)
and orders.order_total_amt_currency = exch_rates.currency(+)
[and.....]

(obviously we'd put in a bit of NVL logic into select to account for nulls when there is no exchange rate)

But we get an error for not being able to do multiple outer joins to the same table--it suggests first joining the 2 'source' tables.

SQL Error: ORA-01417: a table may be outer joined to at most one other table
*Cause: a.b (+) = b.b and a.c (+) = c.c is not allowed
*Action: Check that this is really what you want, then join b and c first in a view.

How does one get around this error? It's possible that some currencies specified in orders don't have an exchange rate for a given day, and ditto for products, but orders and products don't have to be defined in the same currency (and so joining products to orders first does not make sense).

Thank you.

Tom Kyte
July 31, 2012 - 1:00 pm UTC

no creates
no inserts
no look


wait, I just looked at this - and you are doing it wrong.

if products and orders have different currencies, you need to join twice!

Jess, July 31, 2012 - 12:15 pm UTC

Sorry, Tom, forgot to mention--
the obvious way of getting around the problem would be to alias exch_rates to er1 and er2 in the main select and do it that way, but the example I gave above is simplified, as we actually have 10-odd tables that need to be joined to currencies. Aliasing the same thing 10 times just to do an outer join seemed a be excessive, so I was wondering if there was a better way.
Tom Kyte
July 31, 2012 - 1:01 pm UTC

it is not only the 'obvious' way, it is the correct way.

products needs to be joined to currency to get *its* conversion
orders needs to be joined to currency to get *its* conversion

you need to join twice, products might join to a different row than orders does.

Jess, July 31, 2012 - 1:27 pm UTC

Thanks Tom. Since there is only 1 rate per currency, I would've thought that there is a slicker way of doing it. Since there isn't, off we go to do 14 joins :) Thank you as always.