Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Sunder.

Asked: June 15, 2000 - 2:06 pm UTC

Last updated: November 03, 2011 - 12:58 am UTC

Version: 8.1.6.0.0

Viewed 10K+ times! This question is

You Asked

I'am working on enhancing the performance of some long running
queries in our application. The existing code uses the NOT EXISTS
clause extensively and I wondered if it would be a good idea to
use OUTER JOINS instead, wherever possible. For example, all else
being equal, which of the following would execute faster ?

Insert into Title(......)
select .....
from rollup R
where not exists ( select 'Found' from title T
where R.source_id = T.Title_ID);

Insert into Title(.......)
select ......
from title T, rollup R
where R.source_id = T.Title_id(+)
and T.Title_id is null;

Note that Rollup has about 5 million rows and Title about
1 million rows and both are indexed on the join fields.
In trials, I did notice that the latter was about 30% faster
but does that have a theoritical basis ?

Would appreciate an early reply.

Thanks,

Sunder

and Tom said...

There are at least 3 ways to do this above (minus might be a 4'th depending on your circumstances).

The way with NOT exists works well for small OUTER tables (or outer queries that have other predicates the result in "smallish" result sets before we have to do the correlated subquery). A NOT exists works something like this:

for x in ( select * from rollup ) loop
if ( not exists ( that query ) ) then
OUTPUT
end if;
end

it in effect runs the subquery over and over and over....

The outer join, method number 2, allows us to do more large scale, bulk type operations. Anything we do in bulk, we can probably do faster. So, by not running that subquery over and over, we gained some efficiency. It'll depend on your indexes and such as well how well this goes.

The bulk operations we do -- thats your "theroretical basis".

A 3'rd alternative is a NOT IN query. As long as you understand that NOT IN and NOT EXISTS are semantically different, this might run even faster yet. NOT IN will not work correctly if the subquery returns a NULL (the result set will be empty if it does). So, we have to add a filter "where title_id is NOT NULL" to the subquery to make this be the same.

Here is a very very small test showing the three methods. Existence (or lack of) indexes may effect this test and so on. You might consider the NOT IN with the HASH_AJ hint (although using the CBO should have this happen automatically -- if the NOT IN runs really really slow, its not using the HASH_AJ -- you'd have to hint it in that case).

this test is not a benchmark, it is just illustrative of the three methods and contains some comments on their plans.


DOC>create table t as select * from all_objects;
DOC>create index t_idx on t(object_id);

DOC>create table t2 as select * from all_objects where mod(object_id,50) = 0;
DOC>create index t2_idx on t2(object_id);

DOC>*/

ops$tkyte@DEV8I.WORLD> set autotrace on explain
ops$tkyte@DEV8I.WORLD> select count(object_name)
2 from t
3 where NOT EXISTS( select null
4 from t2
5 where t2.object_id = t.object_id )
6 /

COUNT(OBJECT_NAME)
------------------
22251

Elapsed: 00:00:00.85

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'T'
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)

Here, it full scans T and does that subquery for each row -- thats the index probe we see in there, the subquery running

ops$tkyte@DEV8I.WORLD>
ops$tkyte@DEV8I.WORLD> select /*+ HASH_AJ */ count(object_name)
2 from t
3 where object_id NOT IN ( select object_id
4 from t2
5 where object_id IS NOT NULL )
6 /

COUNT(OBJECT_NAME)
------------------
22251

Elapsed: 00:00:00.67

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (FAST FULL SCAN) OF 'T_IDX' (NON-UNIQUE)
4 2 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)

There is the "anti join" method with may or may not run a little better in your environment. Its worth trying though

ops$tkyte@DEV8I.WORLD> select count(t.object_name)
2 from t, t2
3 where t.object_id = t2.object_id(+)
4 and t2.object_id is NULL
5 /

COUNT(T.OBJECT_NAME)
--------------------
22251

Elapsed: 00:00:00.75

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 NESTED LOOPS (OUTER)
4 3 TABLE ACCESS (FULL) OF 'T'
5 3 INDEX (RANGE SCAN) OF 'T2_IDX' (NON-UNIQUE)

And then the outer join method -- in this PARTICULAR example it fell between the not exists and the NOT IN. Your results may vary ;)

ops$tkyte@DEV8I.WORLD> set autotrace off

Rating

  (51 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, March 30, 2001 - 2:34 pm UTC


Outer join in a corelated query - does not work

Nag, November 16, 2001 - 11:02 am UTC

SQL> ed
Wrote file afiedt.buf

  1  select a.ename, a.empno
  2  from emp a
  3  where exists(select null from  dept b
  4*               where a.deptno(+)=b.deptno)
SQL> /
              where a.deptno(+)=b.deptno)
                      *
ERROR at line 4:
ORA-01705: an outer join cannot be specified on a correlation column

Tom, in my output , I want to see deptno 40 though it doesnot have any employees, but I ran into ora-1705, why is this so.

Further, can you give examples of correlation column's apart from the above.

 

Tom Kyte
November 16, 2001 - 1:37 pm UTC

well, your query is funky to begin with.

If it worked, it would be asking:

"i want all rows in EMP a such there there exists a row in B"

You are trying to outer join from B to A in your example!

You simply want to query:

select a.empno, a.ename, b.deptno
from emp a, dept b
where b.deptno = a.deptno(+)
/



HASH_AJ example is wrong

MDz, May 08, 2002 - 6:33 pm UTC

The hash_aj-hint example above is not correct, and the explain plan shows it. If the hash join worked, the plan would show something like this for the join operation:

HASH JOIN ANTI
INDEX ...

and not FILTER as showed above. To make the HASH ANTI-JOIN work, you need to follow these guidelines:

1. Put the /*+ hash_aj */ hint in the subquery, not in the main query. Oracle "Tuning" documentation explains this.

2. Make sure that OBJECT_ID is not null, both in main query and in subquery. If there is no NOT NULL constraint in t and t2, then add "OBJECT_ID IS NOT NULL" conditions.

Disclaimer: I am using Oracle 8.1.7

A nudge in the right direction, please?

Matt, July 22, 2003 - 10:45 pm UTC

I'm having similar difficulties tuning an outer join, do you have any suggestions?

SQL> alter session set sort_area_size=180000000;

Session altered.

SQL> set autotrace traceonly explain statistics
SQL> set timing on
SQL> variable days number
SQL> exec :days := 15

PL/SQL procedure successfully completed.

SQL> alter session set optimizer_mode=choose;

Session altered.

Elapsed: 00:00:00.70
  1      SELECT ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
  2        DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
  3      FROM
  4       TMP_REPORT_FILE_SET
  5      WHERE 1=1
  6      AND scheduled_start_datetime > sysdate - :days
  7*     order by REPORT_TYPE,JOB_SID
SQL> /

3584 rows selected.

Elapsed: 00:00:45.91

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=24607 Card=1179778 Bytes=156910474)
   1    0   SORT (ORDER BY) (Cost=24607 Card=1179778 Bytes=156910474)
   2    1     HASH JOIN (Cost=1216 Card=1179778 Bytes=156910474)
   3    2       HASH JOIN (OUTER) (Cost=533 Card=118714 Bytes=7597696)
   4    3         INDEX (FAST FULL SCAN) OF 'J_TMP2' (NON-UNIQUE) (Cost=507 Card=118714 Bytes=1899424)
   5    3         TABLE ACCESS (FULL) OF 'SYSTEM_EVENT' (Cost=1 Card=1162 Bytes=55776)
   6    2       TABLE ACCESS (FULL) OF 'REPORT_FILE' (Cost=562 Card=1178112 Bytes=81289728)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      17601  consistent gets
      17552  physical reads
          0  redo size
     391472  bytes sent via SQL*Net to client
      47065  bytes received via SQL*Net from client
        240  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       3584  rows processed
@ dbutil_show_cbo_parameter.sql
HOST_NAME                      VERSION         INSTANCE   STARTUP_TIME         CURRENT_TIME
------------------------------ --------------- ---------- -------------------- --------------------
xxxxxx.xx.xxxxxx.xxx           9.0.1.4.0       XXXX      23/07/2003 05:03.44  23/07/2003 11:57.37

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cursor_sharing                       string      EXACT
db_block_size                        integer     8192
db_file_multiblock_read_count        integer     32
hash_area_size                       integer     131072
hash_join_enabled                    boolean     TRUE
optimizer_features_enable            string      9.0.1
optimizer_index_caching              integer     0
optimizer_index_cost_adj             integer     100
optimizer_max_permutations           integer     2000
optimizer_mode                       string      RULE
parallel_broadcast_enabled           boolean     FALSE
partition_view_enabled               boolean     TRUE
query_rewrite_enabled                boolean     FALSE
sort_area_size                       integer     65536
star_transformation_enabled          string      FALSE

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 0
workarea_size_policy                 string      MANUAL

The view is defined as:

CREATE OR REPLACE FORCE VIEW "TMP_REPORT_FILE_SET" ("REPORT_RUN_SID",
"JOB_SID", "REPORT_TYPE", "FILE_STATUS", "FILE_NAME", "DESTINATION_ALIAS",
"PARTICIPANT_SID", "OS_EXIT_CODE", "EVENT_DESCRIPTION",
"SCHEDULED_START_DATETIME") AS
SELECT RF.REPORT_RUN_SID, RF.JOB_SID,
RF.REPORT_TYPE, RF.FILE_STATUS, RF.FILE_NAME,
RF.DESTINATION_ALIAS, RF.PARTICIPANT_SID,
J.OS_EXIT_CODE,
SE.EVENT_DESCRIPTION,
J.SCHEDULED_START_DATETIME
FROM REPORT_FILE RF, SYSTEM_EVENT SE, JOB J
WHERE J.JOB_SID = RF.JOB_SID
   AND SE.EVENT_SID (+) = J.OS_EXIT_CODE;

The index being used is created as:

CREATE INDEX "J_TMP2" ON "JOB" ("OS_EXIT_CODE",
"SCHEDULED_START_DATETIME","JOB_SID")
COMPRESS 1
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS  TABLESPACE "USERS" ;

The optimizer statistics are current.

The query is used by a client application, so should really be optimized for FIRST_ROWS(10). However, the sort of the resultset negates any advantage of this approach. Optimizing for throughput gives better overall times.

I am a little stuck - I started to look at parallelizing the FAST FULL SCAN (J_TMP2 index) and FULL table scan (report_file) and got this query down to 27 seconds (the original query was taking nearly 400 seconds using RBO!). However, some of the system these queries came from currently uses the RBO and I would prefer not to go down this path and impact these parts of the system if possible.

Given the information I have proivided, in particular reducing the impact of the RBO part of the system, can you please suggest some alternatives to tuning the statement above? 

Thanks in advance, 

Tom Kyte
July 23, 2003 - 7:48 am UTC

how does:

CREATE OR REPLACE FORCE VIEW "TMP_REPORT_FILE_SET"
("REPORT_RUN_SID","JOB_SID", "REPORT_TYPE", "FILE_STATUS", "FILE_NAME",
"DESTINATION_ALIAS", "PARTICIPANT_SID", "OS_EXIT_CODE", "EVENT_DESCRIPTION",
"SCHEDULED_START_DATETIME")
AS
SELECT RF.REPORT_RUN_SID,
RF.JOB_SID,
RF.REPORT_TYPE,
RF.FILE_STATUS,
RF.FILE_NAME,
RF.DESTINATION_ALIAS,
RF.PARTICIPANT_SID,
J.OS_EXIT_CODE,
(select event_description
from system_event
where event_sid = j.os_exit_code ) EVENT_DESCRIPTION,
J.SCHEDULED_START_DATETIME
FROM REPORT_FILE RF,
JOB J
WHERE J.JOB_SID = RF.JOB_SID
/

work?

Followup on your request.

Matt, July 23, 2003 - 8:00 pm UTC

With the outer join removed, it seems to be performing the same. See below for the same test case, with the buffer cache warmed up. I will include a tkprof report for the same test as well.

CREATE OR REPLACE FORCE VIEW "TK_REPORT_FILE_SET" 
("REPORT_RUN_SID","JOB_SID", "REPORT_TYPE", "FILE_STATUS", "FILE_NAME", 
 "DESTINATION_ALIAS", "PARTICIPANT_SID", "OS_EXIT_CODE", "EVENT_DESCRIPTION",
 "SCHEDULED_START_DATETIME") 
AS
SELECT RF.REPORT_RUN_SID, 
       RF.JOB_SID,
       RF.REPORT_TYPE, 
       RF.FILE_STATUS, 
       RF.FILE_NAME,
       RF.DESTINATION_ALIAS, 
       RF.PARTICIPANT_SID,
       J.OS_EXIT_CODE,
       (select event_description 
          from system_event 
         where event_sid = j.os_exit_code ) EVENT_DESCRIPTION,
       J.SCHEDULED_START_DATETIME
  FROM REPORT_FILE RF, 
       JOB J
 WHERE J.JOB_SID = RF.JOB_SID
/

SQL> l
  1  SELECT ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
  2  DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
  3  FROM TK_REPORT_FILE_SET
  4  where 1=1
  5  AND scheduled_start_datetime > sysdate - :days
  6* order by REPORT_TYPE,JOB_SID
SQL> /

3584 rows selected.

Elapsed: 00:00:43.71

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=16598 Card=1178108 Bytes=100139180)
   1    0   SORT (ORDER BY) (Cost=16598 Card=1178108 Bytes=100139180)
   2    1     HASH JOIN (Cost=1114 Card=1178108 Bytes=100139180)
   3    2       INDEX (FAST FULL SCAN) OF 'J_TMP2' (NON-UNIQUE) (Cost=507 Card=118714 Bytes=1899424)
   4    2       TABLE ACCESS (FULL) OF 'REPORT_FILE' (Cost=562 Card=1178112 Bytes=81289728)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      17592  consistent gets
      17541  physical reads
          0  redo size
     391472  bytes sent via SQL*Net to client
      47065  bytes received via SQL*Net from client
        240  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       3584  rows processed

I also tried removing the sort, but the results were similar.

Here is the tkprof:

********************************************************************************

 SELECT ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
 DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
 FROM TK_REPORT_FILE_SET
 where 1=1
 AND scheduled_start_datetime > sysdate - :days
order by REPORT_TYPE,JOB_SID

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.04       2.41          0          0          0           0
Execute      1      0.00       0.02          0          0          0           0
Fetch       25     21.46      50.46      17541      17592          0        3584
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       27     21.51      52.90      17541      17592          0        3584

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 100

Rows     Row Source Operation
-------  ---------------------------------------------------
   3584  SORT ORDER BY
   3584   HASH JOIN
  18816    INDEX FAST FULL SCAN (object id 43141)
1178112    TABLE ACCESS FULL REPORT_FILE


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      25        0.00          0.00
  db file scattered read                        558        0.09         29.95
  db file sequential read                         4        0.03          0.09
  latch free                                      1        0.02          0.02
  SQL*Net message from client                    25        1.07          1.64
  SQL*Net more data to client                   167        0.01          0.01
********************************************************************************

From the wait events I am concluding that the full scan of report_file is the problem (surprisingly (to me) the INDEX FAST FULL SCAN appears to be relatively inexpensive).

So, for now I think I am looking for a way to index report_file ( before I put the outer join back in). 

I just re-tested with the following index:

create index mgb_tmp on report_file(report_type, job_sid) compress 1;  

I expected this to allow the optimizer to use the index for the join, before going to the report file table for the rest of the column data. I also wondered if it might effect the sort. 

However, report_file has the following indexes which already include job_sid:

INDEXES FOR TABLE: REPORT_FILE
================================================

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
MGB_TMP                        REPORT_TYPE                                  1
                               JOB_SID                                      2
REPORT_FILE_INDX2              REPORT_RUN_SID                               1
                               JOB_SID                                      2
                               FILE_NAME                                    3
REPORT_FILE_INDX3              JOB_SID                                      1
                               FILE_NAME                                    2
REPORT_FILE_INDX4              FILE_STATUS                                  1
                               DESTINATION_ALIAS                            2
REPORT_FILE_INDX_PK            FILE_NAME                                    1
                               JOB_SID                                      2

In conclusion, I am still seeing the same behaviour as described above in the autotrace and tkprof.

Do you agree that the FTS is the problem here?

Do you have any suggestions?
 

Tom Kyte
July 23, 2003 - 9:30 pm UTC

don't have time to fully investigate right now -- try back next week (on vacation - -just sort of "glancing" at things)

Coolio.

Matt, July 23, 2003 - 10:30 pm UTC

No worries Tom, I shall update this again next week.

Surely, a holiday means an absence of work? I think you need a real holiday!

M.

Tuning query (reference previous few posts).

Matt Butler, July 28, 2003 - 3:20 am UTC

To summarise the above few posts.

I am trying to tune a query that uses an outer join. I have tried several approaches and am unable to get the required query performance (original query took 350+ seconds, currently the query takes 45 seconds). After your previous suggestion to change the query I have concluded that the bottleneck is the FTS access to the report_file table. I do not seem to be able to get the CBO to use any index on the table.

I have managed to get the query to run in under 10 seconds by using parallel query slaves. However, since the remainder of the system uses the RBO I do not want to use this approach in production.

Can you please have a look at the previous posts and suggest an alternative approach? I do not understand why the CBO refuses to use any of the indexes, although I am sure that there are good reasons. Any ideas?

Thanks in advance.

Tom Kyte
July 28, 2003 - 7:08 am UTC

and exactly what index would you LIKE it to use?

Look at the query, it seems to be pulling all rows from each table and joining -- why would an index help here (it would hurt)


If the "remainder" of the system is using RBO and that is a reason not to use PQ -- why are you using the cbo here? in for a penny in for a pound they say -- you are wet upto your knees already, whats another inch?

Are you sure that there are no other options?

Matt, July 28, 2003 - 8:22 pm UTC

>> and exactly what index would you LIKE it to use?

Well to be honest I was hoping to get the optimizer to pick one that contained job_sid. I expected that the predicate on the job table (by scheduled_start_datetime) would restrict the result from job significantly (to be << 1%) and then the join to report_file by job_sid would use such an index (since I expect there to be few rows that satisfy the join). I have tried a single index on job_sid for this purpose....didn't work.

Having said all this, I now don't think that the CBO has any information about how job and report_file are related - how could it?!? So, I think that all it can do is full scan.

>> Look at the query,
>>  it seems to be pulling all rows from each table and >> joining 
>> -- why would an index help here (it would hurt)

I understand what you are saying, but this is the query on the view that we are talking about:

  1      SELECT ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
  2        
DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
  3      FROM
  4       TMP_REPORT_FILE_SET
  5      WHERE 1=1
  6      AND scheduled_start_datetime > sysdate - :days
  7*     order by REPORT_TYPE,JOB_SID
SQL> /

It pulls back 3584 rows - NOT all the data. Have a look at the tkprof, do you still stand by your comments? Is there really no way to improve this query?

>>If the "remainder" of the system is using RBO and that is
>>a reason not to use PQ 
>>-- why are you using the cbo here?  in for a penny in for
>>a pound they say -- 
>>you are wet upto your knees already, whats another inch? 

We are still in the process of migrating from the RBO to the CBO after an upgrade from 7.3 (we are moving a module at a time using an 'alter session' approach) . Any tuning that I do, I try and move more of the system over to the CBO. The reason I don't want to parallize the query is that I want to control the scope of change and the required testing - I do not want parts of the system to suddenly start altering their behaviour (queries that access tables with a parallel setting use the CBO) - I have seen RBO queries that run well turn into CBO queries that run not so well. This is not necessarily the fault of the CBO, but relates to parts of the system joining to DD views (which can't have stats), and finding the correct values for the optimizer_index_caching, optimizer_index_cost_adj parameters.

This was the last execution plan I posted:

Rows     Row Source Operation
-------  ---------------------------------------------------
   3584  SORT ORDER BY
   3584   HASH JOIN
  18816    INDEX FAST FULL SCAN (object id 43141)
1178112    TABLE ACCESS FULL REPORT_FILE

I just read the Performance doco on hash joins and can see that there is nothing that can be done to speed this join up, unless it is possible to restrict the report_file data by adding something to the predicate.

Are you sure that there are no other options?

Thanks again,
 

Tom Kyte
July 29, 2003 - 6:46 am UTC

and if you use first_rows?

Tuning

Santhanam, July 29, 2003 - 4:43 am UTC

Good example


More Information and results...

Matt, July 29, 2003 - 10:07 pm UTC

OK, from the top (search down for "Start of Test" to see the results)...

Datatypes for Table JOB

Data Data
Column Name Type Length Nullable
------------------------------ -------------------- ----------- --------
JOB_SID NUMBER (10,0) not null
JOB_STATUS VARCHAR2 1 null
JOB_QUEUE_SID NUMBER (10,0) null
JOB_TYPE VARCHAR2 10 null
USER_SID NUMBER (10,0) null
PARENT_JOB_SID NUMBER (10,0) null
SCHEDULED_START_DATETIME DATE 7 not null
ACTUAL_START_DATETIME DATE 7 null
ACTUAL_COMPLETION_DATETIME DATE 7 null
MAX_RUN_TIME NUMBER (6,0) null
LAST_UPDATE_DATETIME DATE 7 null
OS_PID VARCHAR2 9 null
PARENT_OS_PID VARCHAR2 9 null
OS_PHYSICAL_IO NUMBER (10,0) null
OS_CPU NUMBER (10,0) null
OS_MEMORY NUMBER (10,0) null
OS_EXIT_CODE NUMBER (5,0) null
JOB_ABBREVIATED_PARAMS VARCHAR2 2000 null


Indexes on JOB

Index Is
Name Unique COLUMNS
------------------------------ ------ --------------------------------
JOB_INDX2 Yes PARENT_JOB_SID, JOB_SID
JOB_INDX3 No JOB_TYPE
JOB_INDX4 Yes JOB_SID, JOB_STATUS
JOB_INDX5 Yes JOB_STATUS, JOB_SID
JOB_INDX6 Yes OS_PID, JOB_SID
JOB_INDX7 No JOB_QUEUE_SID, JOB_STATUS,
SCHEDULED_START_DATETIME

JOB_INDX8 No OS_EXIT_CODE, JOB_SID
JOB_INDX9 No OS_EXIT_CODE,
SCHEDULED_START_DATETIME,
JOB_SID

JOB_INDX_PK Yes JOB_SID

Datatypes for Table REPORT_FILE

Data Data
Column Name Type Length Nullable
------------------------------ -------------------- ----------- --------
FILE_NAME VARCHAR2 255 not null
JOB_SID NUMBER (10,0) not null
REPORT_RUN_SID NUMBER (10,0) null
PARTICIPANT_SID NUMBER (10,0) null
REPORT_TYPE VARCHAR2 10 null
DESTINATION_ALIAS VARCHAR2 10 null
FILE_STATUS VARCHAR2 1 null


Indexes on REPORT_FILE

Index Is
Name Unique COLUMNS
------------------------------ ------ --------------------------------
REPORT_FILE_INDX2 No REPORT_RUN_SID, JOB_SID,
FILE_NAME

REPORT_FILE_INDX3 Yes JOB_SID, FILE_NAME
REPORT_FILE_INDX4 No FILE_STATUS, DESTINATION_ALIAS
REPORT_FILE_INDX5 No REPORT_TYPE, JOB_SID
REPORT_FILE_INDX_PK Yes FILE_NAME, JOB_SID

Datatypes for Table SYSTEM_EVENT

Data Data
Column Name Type Length Nullable
------------------------------ -------------------- ----------- --------
EVENT_SID NUMBER (10,0) not null
EVENT_SYMBOL VARCHAR2 20 not null
EVENT_DESCRIPTION VARCHAR2 80 null
ARGUMENT1_DESCRIPTION VARCHAR2 40 null
ARGUMENT2_DESCRIPTION VARCHAR2 40 null
ARGUMENT3_DESCRIPTION VARCHAR2 40 null
CONTEXT_DESCRIPTION VARCHAR2 40 null
EXIT_STATUS_DESCRIPTION VARCHAR2 80 null


Indexes on SYSTEM_EVENT

Index Is
Name Unique COLUMNS
------------------------------ ------ --------------------------------
SYSTEM_EVENT_INDX2 Yes EVENT_SYMBOL
SYSTEM_EVENT_INDX3 Yes EVENT_SYMBOL, EVENT_SID
SYSTEM_EVENT_INDX_PK Yes EVENT_SID

TABLE STATISTICS FOR : JOB
===================================================


NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
---------- ---------- ----------- -------------------
2374272 58370 100 07/22/2003 16:23:44


COLUMN STATISTICS FOR : JOB
====================================================


COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY LAST_ANALYZED
------------------------------ ------------ ---------- ----------- ---------------- -------------------
JOB_QUEUE_SID 5 0 4 .0000002106 07/21/2003 17:26:33
JOB_SID 2374272 0 75 .0000004212 07/21/2003 17:26:33
JOB_STATUS 3 0 2 .0000002106 07/21/2003 17:26:33
JOB_TYPE 61 0 28 .0104306613 07/21/2003 17:26:33
OS_EXIT_CODE 24 0 1 .0416666667 07/22/2003 16:23:44
OS_PID 16704 97792 75 .0000636187 07/21/2003 17:26:33
PARENT_JOB_SID 3805 1332096 57 .0018233225 07/21/2003 17:26:33
SCHEDULED_START_DATETIME 12612 0 75 .0004318452 07/21/2003 17:26:33

8 rows selected.


INDEX STATISTICS FOR : JOB
===================================================


NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEY LAST_ANALYZED
------------------------------ ---------- ---------- ----------- ---------- ---------- ---------- -------------------
JOB_INDX2 2374272 2374272 6370 551703 2 1 07/21/2003 17:26:56
JOB_INDX3 2374272 61 6784 331214 2 111 07/21/2003 17:27:17
JOB_INDX4 2374272 2374272 5966 58926 2 1 07/21/2003 17:27:38
JOB_INDX5 2374272 2374272 5966 76198 2 1 07/21/2003 17:27:58
JOB_INDX6 2374272 2374272 7299 2280191 2 1 07/21/2003 17:28:17
JOB_INDX7 2374272 13095 7968 1706109 2 1 07/21/2003 17:28:41
JOB_INDX8 2374272 2374272 6850 179906 2 1 07/21/2003 17:29:07
JOB_INDX9 2374272 2374272 8304 1668315 2 1 07/22/2003 16:28:05
JOB_INDX_PK 2374272 2374272 4763 58926 2 1 07/21/2003 17:26:35

9 rows selected.

TABLE STATISTICS FOR : REPORT_FILE
===================================================


NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
---------- ---------- ----------- -------------------
1178112 9207 100 07/21/2003 17:41:20


COLUMN STATISTICS FOR : REPORT_FILE
====================================================


COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY LAST_ANALYZED
------------------------------ ------------ ---------- ----------- ---------------- -------------------
DESTINATION_ALIAS 4 0 3 .0000004244 07/21/2003 17:41:20
FILE_NAME 1177487 0 75 .0000008611 07/21/2003 17:41:20
FILE_STATUS 3 1156864 2 .0000235316 07/21/2003 17:41:20
JOB_SID 4576 0 75 .0002221131 07/21/2003 17:41:20
REPORT_RUN_SID 1152 28672 46 .0037663495 07/21/2003 17:41:20


INDEX STATISTICS FOR : REPORT_FILE
===================================================


NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEY LAST_ANALYZED
------------------------------ ---------- ---------- ----------- ---------- ---------- ---------- -------------------
REPORT_FILE_INDX2 1178112 1178112 6928 1171692 2 1 07/21/2003 17:42:34
REPORT_FILE_INDX3 1178112 1178112 5950 1171669 2 1 07/21/2003 17:43:02
REPORT_FILE_INDX4 1178112 9 3282 20663 2 364 07/21/2003 17:43:20
REPORT_FILE_INDX5 1178112 4577 2804 1171782 2 1 07/24/2003 09:51:06
REPORT_FILE_INDX_PK 1178112 1178112 10148 232016 2 1 07/21/2003 17:41:20

TABLE STATISTICS FOR : SYSTEM_EVENT
===================================================


NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
---------- ---------- ----------- -------------------
1162 15 100 07/21/2003 17:43:32


COLUMN STATISTICS FOR : SYSTEM_EVENT
====================================================


COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY LAST_ANALYZED
------------------------------ ------------ ---------- ----------- ---------------- -------------------
EVENT_SID 1162 0 75 .0008605852 07/21/2003 17:43:32
EVENT_SYMBOL 1162 0 75 .0008605852 07/21/2003 17:43:32


INDEX STATISTICS FOR : SYSTEM_EVENT
===================================================


NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEY LAST_ANALYZED
------------------------------ ---------- ---------- ----------- ---------- ---------- ---------- -------------------
SYSTEM_EVENT_INDX2 1162 1162 4 627 1 1 07/21/2003 17:43:32
SYSTEM_EVENT_INDX3 1162 1162 5 627 1 1 07/21/2003 17:43:32
SYSTEM_EVENT_INDX_PK 1162 1162 5 49 1 1 07/21/2003 17:43:32


CREATE OR REPLACE FORCE VIEW "TK_REPORT_FILE_SET"
("REPORT_RUN_SID","JOB_SID", "REPORT_TYPE", "FILE_STATUS", "FILE_NAME",
"DESTINATION_ALIAS", "PARTICIPANT_SID", "OS_EXIT_CODE", "EVENT_DESCRIPTION",
"SCHEDULED_START_DATETIME")
AS
SELECT RF.REPORT_RUN_SID,
RF.JOB_SID,
RF.REPORT_TYPE,
RF.FILE_STATUS,
RF.FILE_NAME,
RF.DESTINATION_ALIAS,
RF.PARTICIPANT_SID,
J.OS_EXIT_CODE,
(select event_description
from system_event
where event_sid (+) = j.os_exit_code ) EVENT_DESCRIPTION,
J.SCHEDULED_START_DATETIME
FROM REPORT_FILE RF,
JOB J
WHERE J.JOB_SID = RF.JOB_SID

*************
Start of Test
*************

alter session set sort_area_size = 180000000;

Session altered.

********************************************************************************

SELECT /*+ ALL_ROWS */ ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
FROM
TK_REPORT_FILE_SET
WHERE 1=1
AND scheduled_start_datetime > sysdate - :days
AND report_type = :report_type
order by REPORT_TYPE,JOB_SID

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 10 19.79 41.90 8334 21457 0 1280
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 19.79 41.90 8334 21457 0 1280

Misses in library cache during parse: 0
Optimizer goal: ALL_ROWS
Parsing user id: 100

Rows Row Source Operation
------- ---------------------------------------------------
1280 SORT ORDER BY
1280 HASH JOIN
13056 TABLE ACCESS BY INDEX ROWID REPORT_FILE
13056 INDEX RANGE SCAN (object id 43184)
251648 INDEX FAST FULL SCAN (object id 43141)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 10 0.00 0.00
db file sequential read 4 0.00 0.00
db file scattered read 265 1.88 15.45
latch free 1 0.01 0.01
SQL*Net message from client 10 4.36 5.88
SQL*Net more data to client 59 0.00 0.00
********************************************************************************
********************************************************************************

SELECT /*+ FIRST_ROWS */ ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
FROM
TK_REPORT_FILE_SET
WHERE 1=1
AND scheduled_start_datetime > sysdate - :days
AND report_type = :report_type
order by REPORT_TYPE,JOB_SID

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.03 0 0 0 0
Execute 1 0.00 0.14 0 0 0 0
Fetch 10 2.53 3.81 31 39263 0 1280
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 2.53 4.00 31 39263 0 1280

Misses in library cache during parse: 0
Optimizer goal: FIRST_ROWS
Parsing user id: 100

Rows Row Source Operation
------- ---------------------------------------------------
1280 TABLE ACCESS BY INDEX ROWID JOB
26113 NESTED LOOPS
13056 TABLE ACCESS BY INDEX ROWID REPORT_FILE
13056 INDEX RANGE SCAN (object id 43184)
13056 INDEX RANGE SCAN (object id 43060)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 10 0.00 0.00
db file sequential read 31 0.07 0.19
SQL*Net message from client 10 0.09 0.16
SQL*Net more data to client 59 0.01 0.02
********************************************************************************
********************************************************************************

SELECT /*+ FIRST_ROWS(10) */ ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
FROM
TK_REPORT_FILE_SET
WHERE 1=1
AND scheduled_start_datetime > sysdate - :days
AND report_type = :report_type
order by REPORT_TYPE,JOB_SID

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.04 1.77 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 10 23.43 68.58 16073 17618 0 1280
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 23.48 70.36 16073 17618 0 1280

Misses in library cache during parse: 1
Optimizer goal: FIRST_ROWS
Parsing user id: 100

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 10 0.00 0.00
db file sequential read 130 0.07 0.97
db file scattered read 1362 0.12 29.67
SQL*Net message from client 9 0.02 0.11
SQL*Net more data to client 59 0.00 0.00
********************************************************************************

repdlvr@NDEVT.WORLD> l
1 select object_id, object_name from user_objects
2 where object_id in (43184,43141,
3* 43184,43060)
repdlvr@NDEVT.WORLD> /

OBJECT_ID OBJECT_NAME
---------- ------------------------------
43060 JOB_INDX_PK
43141 JOB_INDX9
43184 REPORT_FILE_INDX5

*********************
Summary and Questions
*********************

FIRST_ROWS hinted query runs in 4 seconds.

Now I am simply confused. This now behaves nothing like I had expected. Previously, (prior to the view re-write and new indexes) I had concluded that I needed to optimize with ALL_ROWS, since get the rows to the screen I needed to do the sort quickly and therefore optimize for throughput (ie: get all the resultset for the sort asap). However, I think that the new indexes are now being used - and so the sort is done by reading the indexes in order.

Questions:
==========

1) How does the outer join in the view work - you suggested placing this in the select section rather than the join section in an earlier post.
2) Why don't I see an execution plan in the tkprof for the second FIRST_ROWS(10) optimization?
3) Why does the FIRST_ROWS(10) optimization run significantly slower than FIRST_ROWS?
4) I read somewhere (not sure where, but NOT in the Oracle Preformance doco) that FIRST_ROWS uses heuristics and rules to generate the plan. The Oracle doco suggests that this (FIRST_ROWS) is the same as FIRST_ROWS(1). So I re-tested...Can you explain this (it suggests that FIRST_ROWS != FIRST_ROWS(1) ) :

********************************************************************************

SELECT /*+ FIRST_ROWS(1) */ ROWID,JOB_SID,FILE_STATUS,PARTICIPANT_SID,REPORT_TYPE,
DESTINATION_ALIAS,REPORT_RUN_SID,OS_EXIT_CODE,EVENT_DESCRIPTION,FILE_NAME
FROM
TK_REPORT_FILE_SET
WHERE 1=1
AND scheduled_start_datetime > sysdate - :days
AND report_type = :report_type
order by REPORT_TYPE,JOB_SID

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.52 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 10 21.08 46.43 16072 17618 0 1280
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 21.10 46.96 16072 17618 0 1280

Misses in library cache during parse: 1
Optimizer goal: FIRST_ROWS
Parsing user id: 100

Rows Row Source Operation
------- ---------------------------------------------------
1280 SORT ORDER BY
1280 HASH JOIN
13056 TABLE ACCESS FULL REPORT_FILE
251648 INDEX FAST FULL SCAN (object id 43141)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 10 0.00 0.00
db file scattered read 1362 0.06 26.13
db file sequential read 129 0.02 0.70
SQL*Net message from client 10 104.12 104.20
SQL*Net more data to client 59 0.00 0.00
********************************************************************************

Thanks in advance

Tom Kyte
July 30, 2003 - 7:12 am UTC

1) i didn't suggest the outer join that way -- mine was:


J.OS_EXIT_CODE,
(select event_description
from system_event
where event_sid = j.os_exit_code ) EVENT_DESCRIPTION,
J.SCHEDULED_START_DATETIME
FROM REPORT_FILE RF,


it is just that instead of:

select emp.ename, dept.dname
from emp, dept
where emp.deptno = dept.deptno(+)


you can query:


select emp.ename, (select dname from dept where dept.deptno = emp.deptno) dname
from emp
/


and that sometimes can open up query plan alterantives that were not possible before -- it is a simple query rewrite that I try frequently.


2) you should, if you exited the sqlplus session BEFORE you ran tkprof. we write the STAT records out when the cursors are closed.

does it reproduce that you do not get a plan?

3) we'd have to see the plan

4) that does look funny - i would never expect to see a hash join in a first_rows(1). but based on the consistent gets and such -- it looks like first_rows(1) and first_rows(10) are actually generating the same plan.


what you can do is search this site for 10053, and play with that to see if anything pops out -- but, please don't post a trace file like that here (they get largish)

Thanks yet again...

Matt, July 30, 2003 - 9:10 am UTC

In response to your response to my question 1) :o)

I couldn't find any documentation on this ability to include a select in a the select list. Can you reference any? I obviously have a bug in my query that I must fix.

In response to your response to my question 2)

I always make sure that I exit the session for this very reason. I have encountered this before during this tuning exercise, especially when consecutive queries use the FIRST_ROWS and FIRST_ROWS(n) hints. I will have to verify whether this is consistently reproducible. I assume that this MAY be a bug? If so I will contact support witha test case.

In response to your response to my question 4)

I will investigate this a little further with the 10053 trace. I won't generate any other long posts like this one - I just figured that the stats might be relevant. I certainly wouldn't post a multi-MB file here.

In your experience should FIRST_ROWS and FIRST_ROWS(1) generate the same plan?

Thanks yet again,

Tom Kyte
July 30, 2003 - 9:57 am UTC

it is called a scalar subquery.

see
</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96540/expressions.htm#1037302 <code>

It would be not unbelievable that first_rows and first_rows(1) generate different plans -- they are generally "equivalent" but obviously different.

It is interesting to note that:

....<cut and paste source = 9ir2 perf guide>...

The optimizer ignores this hint in DELETE and UPDATE statement blocks and in SELECT statement blocks that contain any of the following syntax:

Set operators (UNION, INTERSECT, MINUS, UNION ALL)
GROUP BY clause
FOR UPDATE clause
Aggregate functions
DISTINCT operator
ORDER BY clauses, when there is no index on the ordering columns

........


Now that last goody is a new one in 9ir2, not there before. Given that the plan for first_rows(1) and first_rows(10) are apparently the SAME and look to me to NOT be first_rows type plans -- I'll BETCHA that is what it is here.

Hmm -- that would be interesting. first_rows is "legacy", first_rows(n) is new and obeys this new rule -- meaning the optimizer sees the order by and says "you know what, I cannot get the first row fast here, hence we'll do ALL_ROWS instead and sort the result"

and in fact, that seems to be true -- in the following, big_table is just a copy of all_objects with a series of insert into big_table select * from big_table; over and over until there were about 4,000,000 plus rows. I created an index on STATUS:

big_table@ORA920> select status, count(*) from big_table group by status order by 2;

STATUS COUNT(*)
------- ----------
INVALID 38512
VALID 3461488

2 rows selected.

big_table@ORA920> create index bt_status_idx on big_table(status) compute statistics;

Index created.


Now look:

big_table@ORA920> set autotrace traceonly explain
big_table@ORA920>
big_table@ORA920> select /*+ FIRST_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=58664 Card=1750000 Bytes=175000000)
1 0 SORT (ORDER BY) (Cost=58664 Card=1750000 Bytes=175000000)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=31483 Card=1750000 Bytes=175000000)
3 2 INDEX (RANGE SCAN) OF 'BT_STATUS_IDX' (NON-UNIQUE) (Cost=4155 Card=1750000)



big_table@ORA920> select /*+ FIRST_ROWS(1) */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=32042 Card=1750000 Bytes=175000000)
1 0 SORT (ORDER BY) (Cost=32042 Card=1750000 Bytes=175000000)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=4861 Card=1750000 Bytes=175000000)



big_table@ORA920> select /*+ ALL_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: ALL_ROWS (Cost=32042 Card=1750000 Bytes=175000000)
1 0 SORT (ORDER BY) (Cost=32042 Card=1750000 Bytes=175000000)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=4861 Card=1750000 Bytes=175000000)



big_table@ORA920> select * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=32042 Card=1750000 Bytes=175000000)
1 0 SORT (ORDER BY) (Cost=32042 Card=1750000 Bytes=175000000)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=4861 Card=1750000 Bytes=175000000)



big_table@ORA920>
big_table@ORA920> set autotrace off


first_rows(1) and first_rows(10) are really ALL_ROWS due to the order by!

first_rows is first_rows still -- so they DO behave differently
..



But what about in 9.0.1?

Matt, July 30, 2003 - 9:57 pm UTC

A couple of things:

Firstly I am using 9.0.1 (it was included in an earlier post on this thread, but not emphasised in the last post)

For those that are interested, and consistencies sake -scalar subqueries were new in 9i (apparently). This is the same reference for 9.0.1

</code> http://download-west.oracle.com/docs/cd/A91202_01/901_doc/server.901/a90125/expressions11.htm#1033552 <code>

This is the same snip from the 9.0.1 Perf doco (Optimizer Hints 5-7)

[SNIP...]
The optimizer ignores this hint in DELETE and UPDATE statement blocks and in
SELECT statement blocks that contain any of the following syntax:

o Set operators (UNION, INTERSECT, MINUS, UNION ALL)
o GROUP BY clause
o FOR UPDATE clause
o Aggregate functions

o DISTINCT operator
These statements cannot be optimized for best response time, because Oracle must
retrieve all rows accessed by the statement before returning the first row. If you
specify this hint in any of these statements, then the optimizer uses the cost-based
approach and optimizes for best throughput.
[...SNIP]

Which does not describe the same restriction. Also, the 9i R2 comment does not hold in this case since I have an index on the columns in the order by:

[SNIP...]
Indexes on REPORT_FILE

Index Is
Name Unique COLUMNS
------------------------------ ------ --------------------------------
REPORT_FILE_INDX5 No REPORT_TYPE, JOB_SID
[...SNIP]

I've re-run your test to see if the same behaviour holds in 9.0.1:

repdlvr@NDEVT.WORLD> l
1 select status, count(*) from big_table group by status order
2* by 2
repdlvr@NDEVT.WORLD> select status, count(*) from big_table group by status order by 2 ;



STATUS COUNT(*)
------- ----------
INVALID 11776
VALID 10014720

repdlvr@NDEVT.WORLD> create index bt_status_idx on big_table(status) compute statistics ;

Index created.

repdlvr@NDEVT.WORLD> repdlvr@NDEVT.WORLD> select /*+ FIRST_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=161075 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=161075 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=85133 Card=5013248 Bytes=501324800)
3 2 INDEX (RANGE SCAN) OF 'BT_STATUS_IDX' (NON-UNIQUE) (Cost=11912 Card=5013248)

select /*+ FIRST_ROWS(1) */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=84388 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=84388 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=8446 Card=5013248 Bytes=501324800)

repdlvr@NDEVT.WORLD> select /*+ ALL_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: ALL_ROWS (Cost=84388 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=84388 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=8446 Card=5013248 Bytes=501324800)


So, this all looks like the same behaviour you describe on 9.2 (although undocumented for 9.0.1). I tried to reproduce the same situation as in my query, where the index allows the sort to take place...


create index bt_status_created_idx on big_table(status,created) compute statistics;

Index created.

repdlvr@NDEVT.WORLD> select /*+ FIRST_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=161075 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=161075 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BIG_TABLE' (Cost=85133 Card=5013248 Bytes=501324800)
3 2 INDEX (RANGE SCAN) OF 'BT_STATUS_IDX' (NON-UNIQUE) (Cost=11912 Card=5013248)


repdlvr@NDEVT.WORLD> select /*+ ALL_ROWS */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: ALL_ROWS (Cost=84388 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=84388 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=8446 Card=5013248 Bytes=501324800)


repdlvr@NDEVT.WORLD> select /*+ FIRST_ROWS(1) */ * from big_table where status = 'INVALID' order by created;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: FIRST_ROWS (Cost=84388 Card=5013248 Bytes=501324800)
1 0 SORT (ORDER BY) (Cost=84388 Card=5013248 Bytes=501324800)
2 1 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=8446 Card=5013248 Bytes=501324800)

However, I am seeing the same behaviour. Maybe this is because the select requires all the rows in the table.


Any comments?

Thanks again.

Tom Kyte
July 31, 2003 - 6:59 am UTC

first_rows(n) and first_rows are different -- it seems you have what you need, use "first_rows"?

first_rows(n)

John, August 06, 2003 - 5:19 am UTC

Tom,

is first_rows(n) hint simular to writing the query with rownum < n.

i.e. the below queries are same performance wise:

select * from emp where rownum < 100;

select /*+ first_rows(n) */ * from emp;

Tom Kyte
August 06, 2003 - 8:16 am UTC

No, where rownum < n is a predicate. first_rows(n) specifies an optimization technique.


for such trivial queries, yes, I would expect the same exact plans to be generated.

Documentation Bug?

Steve, August 06, 2003 - 8:18 am UTC

Matt from Australia Et Al,
Scalar subqueries were added in 8i.

SELECT
(SELECT version FROM v$instance) db_version,
(SELECT COUNT(*) FROM emp) emp_count
FROM
dual
DB_VERSION EMP_COUNT
----------------- ---------
8.1.7.4.0 14


This link
</code> http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96540/expressions11a.htm <code>
Says that scalar subqueries are not valid expressions in the following places:
-In WHEN conditions of CASE expressions
-In GROUP BY and HAVING clauses

This is incorrect.

SELECT
empno,
deptno,
COUNT(*) OVER(PARTITION BY deptno) emp_count,
CASE
WHEN (SELECT COUNT(*) FROM emp WHERE deptno = e.deptno) > 3
THEN 'More Than 3 Emps In Deptno ' || e.deptno
ELSE '3 or Fewer Emps In Deptno ' || e.deptno
END dept_status
FROM
emp e
EMPNO DEPTNO EMP_COUNT DEPT_STATUS
-------- -------- --------- --------------------------------
7782 10 3 3 or Fewer Emps In Deptno 10
7839 10 3 3 or Fewer Emps In Deptno 10
7934 10 3 3 or Fewer Emps In Deptno 10
7369 20 5 More Than 3 Emps In Deptno 20
7876 20 5 More Than 3 Emps In Deptno 20
7902 20 5 More Than 3 Emps In Deptno 20
7788 20 5 More Than 3 Emps In Deptno 20
7566 20 5 More Than 3 Emps In Deptno 20
7499 30 6 More Than 3 Emps In Deptno 30
7698 30 6 More Than 3 Emps In Deptno 30
7654 30 6 More Than 3 Emps In Deptno 30
7900 30 6 More Than 3 Emps In Deptno 30
7844 30 6 More Than 3 Emps In Deptno 30
7521 30 6 More Than 3 Emps In Deptno 30

SELECT
deptno,
COUNT(*) emp_count
FROM
emp e
GROUP BY
deptno
HAVING
(SELECT COUNT(*) FROM emp WHERE deptno = e.deptno) > 4

DEPTNO EMP_COUNT
-------- ---------
20 5
30 6


page m of n

John, August 07, 2003 - 8:24 am UTC

to display the records of a query pagewise,

we normally use this technique:

select * from
(
select a.empno,a.ename,rownum row_number from emp a
WHERE rownum <= i_page_number * i_rows_per_page
)
WHERE row_number > (i_page_number - 1) * i_rows_per_page;

in the above query, can we make use of first_rows(n) hint?

if yes, the value of n be a variable?

Tom Kyte
August 09, 2003 - 4:45 pm UTC

you would have to use dynamic sql as the N cannot be bound (they would all have the same plan). So, you would have as many cursors in the pool as you had distinct values for N

Outer join only for certain values of join key ?

Steve, August 15, 2003 - 10:33 am UTC


Hi,

I am wondering if it is possible to perform outer join only for certain values of the join key. Let me explain.

If the DEPTNO column in the EMP table is SCOTT schema could have NULL values, I would like a query to generate NULLs for the corresponding DEPT values. So, the outer join should be in effect only when the EMP.DEPTNO value is NULL. If there is mismatch in the join keys, it should not produce the a row with NULL DEPT values.

The following query will generate NULL values for DEPT table
if EMP.DEPTNO is NULL or if there there no row in DEPT table that matched the EMP.DEPTNO value.

SELECT emp.*, dept.* FROM emp, dept
WHERE emp.deptno = dept.deptno(+);

Although the following syntax is not allowed, this would have produced the desired results.

SELECT emp.empno, dept.dname FROM emp, dept
WHERE
(emp.deptno is null and emp.deptno = dept.deptno(+)) emp.deptno is not null and emp.deptno = dept.deptno;

Since this is an over-simplification of my actual problem,
I cannot use UNION in my case. Are there any other alternatives ?

Thanks in advance,
-Steve

Tom Kyte
August 15, 2003 - 11:19 am UTC

confusion, you say:

So, the
outer join should be in effect only when the EMP.DEPTNO value is NULL. If there
is mismatch in the join keys, it should not produce the a row with NULL DEPT
values.


that is exactly what outer joins do. If and only if the row is "mismatched" will it be "created" by the outer join with null entries.


Please give me a concrete example with sample data -- eg: i have this, I want that.


Outer join only for certain values of join key ?

Steve, August 15, 2003 - 11:58 am UTC

Here are the SQL commands using SCOTT schema

update emp set deptno = null where empno > 7850;
update dept set deptno = 22 where deptno = 10;

I would like to rewrite the following query such that the  
row with empno = 7782 will not be part of the result set. 
This is because there is a valid deptno for this row. 
The rows with empno > 7840 are fine because their depno
is NULL. 

SQL> SELECT emp.empno, dept.dname FROM emp, dept
  2  WHERE emp.deptno = dept.deptno(+);

     EMPNO DNAME
---------- --------------
      7782              
      7369 RESEARCH
      7788 RESEARCH
      7566 RESEARCH
      7499 SALES
      7698 SALES
      7654 SALES
      7521 SALES
      7839
      7844
      7876

     EMPNO DNAME
---------- --------------
      7900
      7902
      7934

14 rows selected.

SQL> select deptno from emp where empno = 7782;

    DEPTNO
----------
        10

SQL> select deptno from emp where empno > 7850;

    DEPTNO
----------





SQL> 

In other words, I want to generate dummy (NULL) values for DEPT table if and only if the EMP.DEPNO is NULL. In all other cases, I would like to perform the join as it is. 

Thanks, 
-Steve.
 

Tom Kyte
August 15, 2003 - 12:52 pm UTC

ops$tkyte@ORA920> SELECT emp.empno, dept.dname, emp.deptno, dept.deptno
  2  FROM emp, dept
  3    WHERE emp.deptno = dept.deptno(+)
  4    and ((emp.deptno is not null and dept.deptno is not null) OR (emp.deptno is null and dept.deptno is null));

     EMPNO DNAME              DEPTNO     DEPTNO
---------- -------------- ---------- ----------
      7369 RESEARCH               20         20
      7788 RESEARCH               20         20
      7566 RESEARCH               20         20
      7499 SALES                  30         30
      7698 SALES                  30         30
      7654 SALES                  30         30
      7844 SALES                  30         30
      7521 SALES                  30         30
      7876
      7900
      7902
      7934

12 rows selected.
 

Blanket statement - outer joins to be avoided

Doug, November 19, 2003 - 12:43 pm UTC

Tom - what would you say to the perception that outer joins should be avoided generally because of problems in optimizing them. And, therefore design decisions to use default values rather the null for "unknown" should be made? I guess my own impression is that outer joins are not inherently evil, but I'm wondering if you have a thought or two.

Tom Kyte
November 21, 2003 - 11:42 am UTC

i don't see how default values would avoid outer joins?

outer joins should be used in an optional 1:m relationship (have to be used).

outer joins should be used when they MUST BE.

outer joins should not be used "just in case they could be"

question on outer join syntax

Jerry, January 08, 2004 - 3:47 pm UTC

I saw people using outer joins like the one below:

SELECT e.ename FROM emp e, dept d
WHERE e.deptno (+) = d.deptno
AND e.ename (+) = 'abcd';

What does the query do? Why outer join on the e.ename? I can't find any documentation about this usage. Thanks!



Tom Kyte
January 08, 2004 - 8:10 pm UTC

no need to document something like that -- it is just a "natural side effect".

that would be like saying, you must document that you can:

where deptno = 10;


(well, we do document that and we do document that ename(+) = EXPRESSION works and 'abcd' is just an expression)


That is just like saying:


select e.ename
from (select * from emp where ename = 'abcd') e, dept d
where e.deptno(+) = d.deptno;

or, maybe this "makes more sense"

select e.ename
from emp e, (select deptno, 'abcd' ename from dept ) d
where e.deptno (+) = d.deptno
and e.ename (+) = d.ename;


basically, it is saying:

output EVERY row in DEPT and if it has a match in EMP where ename = 'abcd', lets see that data too.


if they queried:


select e.ename
from emp e, (select deptno, 'abcd' ename from dept ) d
where e.deptno (+) = d.deptno
and e.ename = 'abcd';

that would be like saying:

take EVERY row in DEPT and if it has a match in DEPT, join it -- but output EVERY row in dept regardless.

After you output every row in DEPT only really return the ones where ename = 'abcd'


very helpful

Jerry, January 09, 2004 - 10:12 pm UTC

Thanks for the quick and clear explanation This site is a great help!

Internal Behaviour of Outer Joins

Kapil, March 20, 2004 - 4:57 am UTC

Hi Tom,

The above was helpful indeed.

Would be greatful if you could help me understand the internal behaviour of outer joins.

If we take the following example:

select tabA.colAn, tabB.colBn
from tabA, tabB
where tabB.colB1 = tabA.colA1(+)
and tabB.colB2 = tabA.colA2(+)
and tabB.colB3 = tabA.colA3(+)
and tabB.colB4 = 'ABC'

Here we have kept tabB as the driving table and we are searching for rows from tabB in tabA.
There is also a composite index idxA1 on tabA(colA1, colA2, colA3)

Considering the above can I say that:
1.For every row in tabB satisfying the criteria [tabB.colB4 = 'ABC'] a search would be made in idxA1 for a match.
2.If there is a match found then colAn would return the respective value else a NULL would be returned,
3.The amount of computation involved in this, whille searching idxA1 or for that matter in the execution of this whole query, is the same as compared to a natural join.
4.Hence the amount of time taken to execute this query should be more or less the same compared to a natural join in the same query.

The reason I request clarification is that in case my understanding is not right and the outer join would take much more time than a natural join then I would have to go for an alternate approach to achieve the desired results.

Kapil


Tom Kyte
March 20, 2004 - 10:06 am UTC

1, 2) no, yes, maybe. conceptually yes. the above query says:

return to me EVERY row in tabB such that colb4 = 'ABC'. IF there exist any rows in tabA that join to this, please return them as well.


3, 4) no, what if tabA had zero rows and tabB has 1,000,000,000 rows where colb4 = 'ABC'. the outer join returns one BILLION rows -- the natural join zero. big big difference, you cannot even compare them.





Internal Behaviour of Outer Joins

Kapil, March 20, 2004 - 10:57 am UTC

1,2) But can you elaborate a little on why you say no and maybe for this. If an index access is available with a high selectivity then wouldnt it be used?
3,4) Yes. I am in agreement that if the rows returned by a natural join are far less than that returned by an outer join then there would be a big difference in the time taken by the query due to increased fetch phase times. But can I say that the time in the execution phase would be the same?

Also
5.If there exists a match 90% of the times then can I say that for those 90% of the cases the total time taken by a natural join query would be the same as an outer join query? The practical scenario is that I expect this query to return a match most of the times.

Kapil


Tom Kyte
March 21, 2004 - 9:39 am UTC

1,2) because an index access MIGHT NOT be there, or not a good idea. that is why it is yes, no, maybe -- it depends (as you are pointing out!)

No, you cannot say the time would be anywhere near the same! if the outer join returns 1,000,000,000 records and the natural join returns 1 -- it is likely (in my scenario above) that the natural join would execute intantaneously whereas the outer join would take a while.


No, you cannot -- they are different queries 100%. I do not see the relevance of trying to even compare them for

a) it is mandated that you use an outer join when that returns the correct result (as the natural join would not)

b) you would use the natural join in all other cases

so it is not like you have "a choice" except in B and that is not much of a choice (not a hard decision to make -- you would use the outer join ONLY when a) was true)

A reader, May 31, 2005 - 9:45 am UTC

Hi Tom,

SELECT DISTINCT COL3,
CASE WHEN T2.COL1 IS NULL THEN 'N'
ELSE 'Y'
END
FROM T1,T2
WHERE T1.COL1=T2.COL1(+)

Here is the requirement:

If a matching record is found in table T2; the flag should be 'Y', else 'N'.
To accomodate this, I am using an outer join and to avoid duplicates; I am
using 'distinct'. I know all these will kill the performance.

Is there an alternative for this?

Thanks.











Tom Kyte
May 31, 2005 - 10:13 am UTC

why would any of them "kill performance"?

big brutually efficient hash outer join, followed by a distinct. If anything, the distinct will be "an issue" but that is rather unavoidable if you need distinct output.

OK

A reader, June 05, 2005 - 10:35 am UTC

Hi Tom,
Can an Outer join be performed with a single table??
Any simple example you have??


Tom Kyte
June 05, 2005 - 11:04 am UTC

No, you need a "set" to outer join to.

OK

Raju, June 05, 2005 - 1:10 pm UTC

I think you can have inline views to simulate a set
for which you can have an outer join and that inline view
can be a subquery which refers the same table as the main
query.
Am I right??


Tom Kyte
June 05, 2005 - 2:20 pm UTC

but, then you would have two "tables" for an inline view (any result set really) is a table.

you need two tables, two sets, two things -- to outer join.

Outer Join Question

Berton Coutinho, June 23, 2005 - 1:06 pm UTC

Tom,

I have a query that joins 2 tables, I want to select all rows from table a and only those rows from table b where the part numbers are the same and the order status in table b is either ‘P’ending or ‘S’hipped. I use the query listed below to achive this. I was wondering if there is a better way to do this ?

CREATE TABLE APP_CATALOG (APP_PART_NUM VARCHAR2(10), APP_DESC VARCHAR2(30));

CREATE TABLE APP_ORDER (ORD_PART_NUM VARCHAR2(10), ORD_QTY NUMBER(5,0), ORD_STATUS VARCHAR2(1));

INSERT INTO APP_CATALOG VALUES('1-X-01', 'ITEM 01');
INSERT INTO APP_CATALOG VALUES('2-X-01', 'ITEM 02');
INSERT INTO APP_CATALOG VALUES('3-X-01', 'ITEM 02');

INSERT INTO APP_ORDER VALUES('1-X-01', 500, 'S');
INSERT INTO APP_ORDER VALUES('1-X-01', 250, 'P');
INSERT INTO APP_ORDER VALUES('2-X-01', 250, 'S');
INSERT INTO APP_ORDER VALUES('2-X-01', 125, 'P');
INSERT INTO APP_ORDER VALUES('3-X-01', 500, 'D');
INSERT INTO APP_ORDER VALUES('3-X-01', 250, 'D');

SELECT A.APP_PART_NUM,
A.APP_DESC,
NVL(SUM(B.ORD_QTY), 0) TOT_ORD_QTY
FROM APP_CATALOG A,
( SELECT *
FROM APP_ORDER
WHERE ORD_STATUS IN('P', 'S') ) B
WHERE A.APP_PART_NUM = B.ORD_PART_NUM(+)
GROUP BY A.APP_PART_NUM,
A.APP_DESC;

SPOOL TEXT

APP_PART_NUM APP_DESC TOT_ORD_QTY
------------ ------------------------------ -----------
1-X-01 ITEM 01 750
2-X-01 ITEM 02 375
3-X-01 ITEM 02 0





Tom Kyte
June 23, 2005 - 6:56 pm UTC

that is how I would have written it.

Thanks a million

Berton Coutinho, June 24, 2005 - 9:49 am UTC

Tom,

Thank you very much for responding so quickly to my questions. You are awesome.


outer join and NVL function

Anoop Gupta, July 20, 2005 - 9:40 am UTC

Tom,
When we are quering with two tables using following queries then found some discrepency . Queries are following -

Q1-

SELECT *
FROM ATTD_OPTIONAL_HOLIDAY_DETAIL OHD,
ATTD_EMPWISE_OPT_HOLIDAY EOH
WHERE TO_CHAR(OHD.HOLIDAY_DATE,'YYYY')='2005'
AND OHD.GROUP_ID=EOH.GROUP_ID(+)
AND OHD.HOLIDAY_DATE=EOH.CHOOSEN_HOLI_DATE(+)
AND NVL(EOH.EMBT_EMP_ID,'2471')='2471' -- diff here
ORDER BY OHD.GROUP_ID,OHD.HOLIDAY_DATE;

Q2-

SELECT *
FROM ATTD_OPTIONAL_HOLIDAY_DETAIL OHD,
ATTD_EMPWISE_OPT_HOLIDAY EOH
WHERE TO_CHAR(OHD.HOLIDAY_DATE,'YYYY')='2005'
AND OHD.GROUP_ID=EOH.GROUP_ID(+)
AND OHD.HOLIDAY_DATE=EOH.CHOOSEN_HOLI_DATE(+)
AND EOH.EMBT_EMP_ID(+)='2471' -- diff here
ORDER BY OHD.GROUP_ID,OHD.HOLIDAY_DATE;

the Q2 is giving one more row then Q1.

Waiting for your reply......



Tom Kyte
July 20, 2005 - 12:51 pm UTC

you either outer join all of the columns in the table or you might as well not outer join any of them.


the two queries are not even remotely similar, short of querying the same tables.


remove the (+) from q1 and it is the same as you have typed.

A reader, August 18, 2005 - 10:41 am UTC


Tom,

When there is a join involving say 5 tables with 4 primary keys. Here there could be missing information anywhere and I am thinking of using full outer joins on all join conditions. Or lets suppose its a requirement to use full outer joins in all join conditions. Can you help me with this query?

Can you suggest how I can use full outer join here on all join conditions


select PRO.EID empnum, MAS.LY LY, MAS.PP PP, ACTY.DNO DNO
ACTY.CODE CODE, TRN.TC_ID TNO, AT.day day
from MAS, PRO, AY, ACTY, TRN, AT
WHERE MAS.TA_ID = PRO.TA_ID
AND MAS.TA_ID = TRN.TA_ID
AND TRN.ACCT_ID = AY.ACCT_ID
AND TRN.ACCT_ID = AT.ACCT_ID
AND AY.ACTY_ID = ACTY.ACTY_ID
AND MAS.PP = 22
AND MAS.LY = 2000

I can write outer join conditions using two tables or multiple tables that involve join codition based on a pk from one table. But here, its more like a combinatorial. Please let me kow if this information is not enough for you.

Tom Kyte
August 18, 2005 - 4:25 pm UTC

you mean you want to do a 4 way FULL OUTER JOIN.

Read about FULL OUTER JOIN

but think about "fixing the model", sounds like something might be wrong with it if at least the top level parent table doesn't have all of the rows.


</code> http://asktom.oracle.com/pls/ask/search?p_string=%22full+outer+join%22 <code>

LIOs and rewrites

Duke Ganote, August 24, 2005 - 10:03 am UTC

Tom-- Would I be correct in inferring that you wrote the original response prior to understanding that LIOs should be minimized?   You wrote your test "is not a benchmark", but I don't think I'd see you doing that nowdays.

I decided to revisit the results using 10g, and count LIOs.  In a nutshell, 

QUERY FORM   LIO
-----------  ----
NOT EXISTS       667
NOT IN       190,076
LEFT OUTER       678
MINUS            678

I'm also curious:  why wouldn't these functionally identical queries be re-written by the optimizer (as you've commented, "select count(*)" and "select count(1)" are rewritten, see 
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1156159920245
)?  With the optimizer doing so much, it just seems odd that we can directly influence the access path by the format of the SQL.

Regards!

*****************
First, I checked that the stats are current:

  1  WITH STATS AS (
  2  select table_name, num_rows from user_tables where table_name in ( 'T','T2' ) )
  3  ,    COUNTS AS (
  4  select 'T' as table_name, count(*) num_rows from t UNION
  5  select 'T2'             , count(*)          from t2 )
  6  select S.table_name, S.num_rows "stats", c.num_rows "counts"
  7    from STATS S inner join COUNTS C
  8*     on S.table_name = C.table_name
SQL> /

TABLE_NAME                          stats     counts
------------------------------ ---------- ----------
T                                   47707      47707
T2                                    958        958

Then I tried the queries and counted LIOs:

  1  select count(object_name)
  2    from t
  3   where NOT EXISTS(
  4         select null
  5           from t2
  6*         where t2.object_id = t.object_id )
SQL> /

COUNT(OBJECT_NAME)
------------------
             46749

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=154 Card=1 Bytes=35)
   1    0   SORT (AGGREGATE)
   2    1     HASH JOIN (RIGHT ANTI) (Cost=154 Card=46749 Bytes=1636215)
   3    2       INDEX (FAST FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=2 Card=958 Bytes=4790)
   4    2       TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=151 Card=47707 Bytes=1431210)

Statistics
----------------------------------------------------------
          0  db block gets
        667  consistent gets
         20  physical reads

  1  select /*+ HASH_AJ */ count(object_name)
  2    from t
  3   where object_id NOT IN (
  4      select object_id
  5       from t2
  6*     where object_id IS NOT NULL )
SQL> /

COUNT(OBJECT_NAME)
------------------
             46749

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=147418 Card=1 Bytes=30)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=151 Card=47707 Bytes=1431210)
   4    2       INDEX (FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=4 Card=48 Bytes=240)

Statistics
----------------------------------------------------------
          0  db block gets
     190076  consistent gets
         44  physical reads

SQL> ed
Wrote file afiedt.buf

  1  select count(t.object_name)
  2    from t LEFT OUTER JOIN t2
  3      on t.object_id = t2.object_id
  4*  where t2.object_id is NULL
SQL> /

COUNT(T.OBJECT_NAME)
--------------------
               46749

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=154 Card=1 Bytes=35)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       HASH JOIN (RIGHT OUTER) (Cost=154 Card=47707 Bytes=1669745)
   4    3         INDEX (FAST FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=2 Card=958 Bytes=4790)
   5    3         TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=151 Card=47707 Bytes=1431210)

Statistics
----------------------------------------------------------
          0  db block gets
        667  consistent gets
         37  physical reads

  1  select count(*) from
  2   ( select object_id from t
  3     MINUS
  4*    select object_id from t2 )
SQL> /

            COUNT(*)
--------------------
              46,749

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=308 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=308 Card=47707)
   3    2       MINUS
   4    3         SORT (UNIQUE) (Cost=301 Card=47707 Bytes=238535)
   5    4           TABLE ACCESS (FULL) OF 'T' (TABLE) (Cost=151 Card=47707 Bytes=238535)
   6    3         SORT (UNIQUE) (Cost=7 Card=958 Bytes=4790)
   7    6           TABLE ACCESS (FULL) OF 'T2' (TABLE) (Cost=6 Card=958 Bytes=4790)

Statistics
----------------------------------------------------------
          0  db block gets
        678  consistent gets
         47  physical reads

 

Tom Kyte
August 24, 2005 - 5:35 pm UTC

10.1.0.4, out of the box, standard parameters and I don't see what you see:

ops$tkyte@ORA10GR1> set echo on
ops$tkyte@ORA10GR1> set linesize 121
ops$tkyte@ORA10GR1> /*
ops$tkyte@ORA10GR1> drop table t;
ops$tkyte@ORA10GR1> drop table t2;
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> create table t as select * from all_objects;
ops$tkyte@ORA10GR1> create index t_idx on t(object_id);
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> create table t2 as select * from all_objects where mod(object_id,50) = 0;
ops$tkyte@ORA10GR1> create index t2_idx on t2(object_id);
ops$tkyte@ORA10GR1> */
ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> set autotrace on
ops$tkyte@ORA10GR1> select count(object_name)
  2    from t
  3   where NOT EXISTS( select null
  4                       from t2
  5                      where t2.object_id = t.object_id )
  6  /

COUNT(OBJECT_NAME)
------------------
             46851


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32 Card=1 Bytes=26)
   1    0   SORT (AGGREGATE)
   2    1     HASH JOIN (RIGHT ANTI) (Cost=32 Card=33755 Bytes=877630)
   3    2       INDEX (FAST FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=3 Card=952 Bytes=12376)
   4    2       INDEX (FAST FULL SCAN) OF 'T_IDX' (INDEX) (Cost=27 Card=34707 Bytes=451191)



Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        119  consistent gets
          0  physical reads
          0  redo size
        432  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR1>
ops$tkyte@ORA10GR1> select /*+ HASH_AJ */ count(object_name)
  2    from t
  3   where object_id NOT IN ( select object_id
  4                              from t2
  5                             where object_id IS NOT NULL )
  6  /

COUNT(OBJECT_NAME)
------------------
             46851


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32 Card=1 Bytes=26)
   1    0   SORT (AGGREGATE)
   2    1     HASH JOIN (RIGHT ANTI) (Cost=32 Card=33755 Bytes=877630)
   3    2       INDEX (FAST FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=3 Card=952 Bytes=12376)
   4    2       INDEX (FAST FULL SCAN) OF 'T_IDX' (INDEX) (Cost=27 Card=34707 Bytes=451191)



Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        119  consistent gets
          0  physical reads
          0  redo size
        432  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR1> select count(t.object_name)
  2    from t, t2
  3   where t.object_id = t2.object_id(+)
  4     and t2.object_id is NULL
  5  /

COUNT(T.OBJECT_NAME)
--------------------
               46851


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32 Card=1 Bytes=26)
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       HASH JOIN (RIGHT OUTER) (Cost=32 Card=34707 Bytes=902382)
   4    3         INDEX (FAST FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=3 Card=952 Bytes=12376)
   5    3         INDEX (FAST FULL SCAN) OF 'T_IDX' (INDEX) (Cost=27 Card=34707 Bytes=451191)




Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        119  consistent gets
          0  physical reads
          0  redo size
        434  bytes sent via SQL*Net to client
        512  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

ops$tkyte@ORA10GR1> set autotrace off
 

hi gets

Duke Ganote, August 27, 2005 - 7:43 am UTC

Interesting. I seem to "get" a lot more "gets". I think I just did a standard install on my PC. (I get results more like yours when I tried this query on 9i)

XPLT9173:GANOTEDP\dganote> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.2.0 - Prod
PL/SQL Release 10.1.0.2.0 - Production
CORE 10.1.0.2.0 Production
TNS for 32-bit Windows: Version 10.1.0.2.0 - Production
NLSRTL Version 10.1.0.2.0 - Production


XPLT9173:GANOTEDP\dganote> ed
Wrote file afiedt.buf

1 select /*+ HASH_AJ */ count(object_name)
2 from t
3 where object_id NOT IN ( select object_id
4 from t2
5* where object_id IS NOT NULL )
XPLT9173:GANOTEDP\dganote> /

COUNT(OBJECT_NAME)
------------------
46797

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=9834 Card=1 Bytes=30)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'T' (TABLE) Cost=151 Card=2419 Bytes=72570)
4 2 INDEX (FULL SCAN) OF 'T2_IDX' (INDEX) (Cost=8 Card=48 Bytes=624)





Statistics
----------------------------------------------------------
11 recursive calls
0 db block gets
190349 consistent gets
0 physical reads
0 redo size
405 bytes sent via SQL*Net to client
508 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Problem with Full Outer Join

Yuan, December 30, 2005 - 11:57 am UTC

I'm having a problem with a query that tries to full outer join 2 aggregate subqueries. I have included the query, though I realize it might not be of much use to you without giving you the table structures that are used, but I'm hoping you can spot something.

SELECT NVL(Stat_Cnt.DNIS, Dlvr_Cnt.DNIS) DNIS,
Stat_Cnt.Created,
Stat_Cnt.Approved,
Dlvr_Cnt.Delivered
FROM (SELECT D1.DNIS,
COUNT(A.Created) Created,
COUNT(A.Approved) Approved
FROM (SELECT DISTINCT Sale_Id,
Call_Id,
MAX(CASE WHEN Prior_Status IS NULL AND Lead_Ind = 1
THEN 1 END) OVER (PARTITION BY Sale_Id) Created,
MAX(CASE WHEN Prior_Status = 'SLCN' AND Status IN ('VFQ', 'PFQ')
THEN 1 END) OVER (PARTITION BY Sale_Id) Approved
FROM (SELECT Action_Tm,
Sale_Id,
Call_Id,
Status,
LAG(Status) OVER (PARTITION BY Sale_Id ORDER BY Action_Tm) Prior_Status,
MAX(DECODE(Status, 'CBK', 1)) OVER (PARTITION BY Sale_Id) Lead_Ind
FROM UDB.Audit_Sale
WHERE Product_Type_Id = 400)
WHERE Action_Tm >= '15-DEC-05'
AND Action_Tm < '19-DEC-05'
AND (Status != Prior_Status OR Prior_Status IS NULL)) A,
UDB.Call C1,
UDB.DNIS D1
WHERE COALESCE(Created, Approved) = 1
AND C1.Call_Id = A.Call_Id
AND D1.DNIS_Id = C1.DNIS_Id
GROUP BY D1.DNIS) Stat_Cnt FULL OUTER JOIN
(SELECT D2.DNIS,
COUNT(DISTINCT S.Sale_Id) Delivered
FROM UDB.Shipment S, UDB.Sale, UDB.Call C2, UDB.DNIS D2
WHERE S.Delivery_Dt >= '15-DEC-05'
AND S.Delivery_Dt < '19-DEC-05'
AND Sale.Sale_Id = S.Sale_Id
AND C2.Call_Id = Sale.Call_Id
AND D2.DNIS_Id = C2.DNIS_Id
GROUP BY D2.DNIS) Dlvr_Cnt ON Dlvr_Cnt.DNIS = Stat_Cnt.DNIS

This is the error that I get:
SELECT NVL(Stat_Cnt.DNIS, Dlvr_Cnt.DNIS) DNIS,
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression

If I change the full outer to inner, left or right outer join, it works fine.

This is the version of Oracle that I'm running on:
Oracle9i Enterprise Edition Release 9.2.0.5.0

Tom Kyte
December 31, 2005 - 10:44 am UTC

you give me no way to reproduce it :(

you should make it as tiny as possible - chop out all predicates/whatever to make the query tiny and reproduce the issue - supplying the create tables of course as well so anyone on the planet could run it and see the issue.

Also, why would you compare what appear to be dates (s.delivery_dt) to strings??? '15-dec-05'???

What is '15-dec-05'? I think it is 1905, what do you think? (eg: we should have learned some 5 years ago to the day that using two characters to represent something that nominally takes 4 is "a bad idea" - and compare dates to dates, numbers to numbers, and strings to strings ONLY - never compare a date to a string, a string to number, a number to .... whatever)

Problem with Full Outer Join (continued)

Yuan, January 02, 2006 - 11:04 am UTC

I determined that it was the first subquery that caused the problem. I was able to recreate the error using just one table:

CREATE TABLE Sale_History (Dt DATE,
Sale_Id INTEGER,
Status VARCHAR2(1),
Other VARCHAR2(1))
/
INSERT INTO Sale_History VALUES ('1-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('2-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('3-DEC-05', 1, 'B', '2')
/
INSERT INTO Sale_History VALUES ('4-DEC-05', 1, 'A', '2')
/
INSERT INTO Sale_History VALUES ('5-DEC-05', 1, 'A', '1')
/
INSERT INTO Sale_History VALUES ('02-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('03-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('04-DEC-05', 2, 'B', '2')
/
INSERT INTO Sale_History VALUES ('05-DEC-05', 2, 'A', '2')
/
INSERT INTO Sale_History VALUES ('06-DEC-05', 2, 'A', '1')
/
INSERT INTO Sale_History VALUES ('07-DEC-05', 2, 'D', '1')
/
INSERT INTO Sale_History VALUES ('08-DEC-05', 2, 'D', '0')
/
COMMIT
/

If it rings a bell, I used this same example table for another thread recently. Anyway, here's the revised query that causes the problem:

SELECT NVL(Stat_Cnt.Sale_Id, Dlvr_Cnt.Sale_Id) Sale_Id,
Stat_Cnt.Created,
Stat_Cnt.Approved,
Dlvr_Cnt.Created
FROM (SELECT Sale_Id,
COUNT(A.Created) Created,
COUNT(A.Approved) Approved
FROM (SELECT DISTINCT Sale_Id,
MAX(CASE WHEN Prior_Status IS NULL AND
Lead_Ind = 1
THEN 1 END) OVER (PARTITION BY
Sale_Id) Created,
MAX(CASE WHEN Prior_Status = 'A' AND
Status = 'B'
THEN 1 END) OVER (PARTITION BY
Sale_Id) Approved
FROM (SELECT Dt,
Sale_Id,
Status,
LAG(Status) OVER (PARTITION BY Sale_Id ORDER
BY Dt) Prior_Status,
MAX(DECODE(Status, 'A', 1)) OVER (PARTITION
BY Sale_Id) Lead_Ind
FROM Sale_History)) A
WHERE COALESCE(Created, Approved) = 1
GROUP BY Sale_Id) Stat_Cnt FULL OUTER JOIN
(SELECT Sale_Id,
COUNT(A.Created) Created,
COUNT(A.Approved) Approved
FROM (SELECT DISTINCT Sale_Id,
MAX(CASE WHEN Prior_Status IS NULL AND
Lead_Ind = 1
THEN 1 END) OVER (PARTITION BY
Sale_Id) Created,
MAX(CASE WHEN Prior_Status = 'A' AND
Status = 'B'
THEN 1 END) OVER (PARTITION BY
Sale_Id) Approved
FROM (SELECT Dt,
Sale_Id,
Status,
LAG(Status) OVER (PARTITION BY Sale_Id ORDER
BY Dt) Prior_Status,
MAX(DECODE(Status, 'A', 1)) OVER (PARTITION
BY Sale_Id) Lead_Ind
FROM Sale_History)) A
WHERE COALESCE(Created, Approved) = 1
GROUP BY Sale_Id) Dlvr_Cnt ON Dlvr_Cnt.Sale_Id = Stat_Cnt.Sale_Id

This first and 2nd subqueries are now identical.

Also, I used '05' instead of '2005' because Action_Tm in the real table is a timestamp, and I posted a problem with timestamps in a different thread: </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:5011677391274 <code>

Tom Kyte
January 02, 2006 - 11:20 am UTC

the problem in the other thread was mostly your choice of format - you should be using YYYY to avoid ambiguity - then there is no issue. It is only when you use RR that the issue arises.


Looks like a 10gr1 issue, did not reproduce in 10gr2 - please utilize support for this one.

Problem with Full Outer Join (continued)

Yuan, January 02, 2006 - 12:03 pm UTC

OK, thanks. This problem occurred on Oracle9i Enterprise Edition Release 9.2.0.5.0. Not 10g at all yet, so glad to know it will be resolved by moving to 10gr2 if not r1. In any event, I'm doing the old fashion union all for now.

As for the RR choice of format, it's not mine. It is the default for timestamp. I did open a TAR for that issue as you suggested.

Tom Kyte
January 02, 2006 - 1:12 pm UTC

but you can ALWAYS BE EXPLICIT (and I believe you should be - you should not rely on defaults, defaults change....)

Cartesian / Outer Join

Chris, February 08, 2006 - 10:23 am UTC

Tom,
   I'm looking for a suggestion that might make use of an outer join, just a union, or something else. The problem I'm trying to solve is that I have many independent queries that all return a single column that I want to cartesian together.  Yes I have a business reason that I actually want a cartesian product.  The issue is that if any of the queries returns no rows, the whole query brings nothing back.  Hence, I need a way to force each query to return at least one row.  Here is an example.

SQL> create table t1 (c1 varchar2(10));

Table created.

SQL> insert into t1 values ('A');

1 row created.

SQL> insert into t1 values ('B');

1 row created.

SQL> select a.c1, b.c1 from
  2     (select c1 from t1) a,
  3     (select c1 from t1) b;

C1         C1
---------- ----------
A          A
B          A
A          B
B          B

SQL> select a.c1, b.c1 from
  2     (select c1 from t1) a,
  3     (select c1 from t1 where c1 > 'B') b;

no rows selected

So I just did a cartesian on my two row table with itself and got the four rows I was looking for.  However if I modify one of the queries so it gets no rows, the whole result comes back empty.  What I really want to see is this:

C1         C1
---------- ----------
A          
B          

I'm looking for the best approach to this theory as the individual queries are much more complicated.  I thought of using a "union select null from dual" with each query but then I get that extra row all the time and I want it only when the individual query returns nothing.  Hence I could use the same philosophy with a "where not exists" and repeat the first query but that gets ugly quick.  I'm hoping you can think of a good way (less code, more maintainable, still performs decent, etc.) to accomplish this. 

Tom Kyte
February 08, 2006 - 10:55 am UTC

ops$tkyte@ORA10GR2> select a.c1, b.c1
  2    from (select c1 from t1) a full outer join  (select c1 from t1) b on (1=1);

C1         C1
---------- ----------
A          A
A          B
B          A
B          B

ops$tkyte@ORA10GR2> select a.c1, b.c1
  2    from (select c1 from t1) a full outer join  (select c1 from t1 where c1 > 'B') b on (1=1);

C1         C1
---------- ----------
A
B

ops$tkyte@ORA10GR2> select a.c1, b.c1
  2    from (select c1 from t1 where c1 > 'B' ) a full outer join  (select c1 from t1 ) b on (1=1);

C1         C1
---------- ----------
           A
           B

hope A, B and so on are "not big" :) 

Full outer Join

Chris, February 08, 2006 - 11:21 am UTC

Tom, I'd never used this type of syntax before (looks more like the kind of stuff I see MS Access generate). I'll plug this into my real example and see what happens, thanks. I do have one question about extending to queries C, D, etc. You gave syntax like the following when I had just A and B.

select a.c1, b.c1
from (select c1 from t1) a full outer join
(select c1 from t1 where c1 > 'B') b on (1=1);

What does it look like when I add C? Something like this?

select a.c1, b.c1
from (select c1 from t1) a full outer join
(select c1 from t1 where c1 > 'B') b on (1=1) full outer join (select c1 from t1) c on (1=1);

More of a syntax question than anything else, I've just used what I would call typical join syntax, not syntax like this.




Tom Kyte
February 09, 2006 - 4:01 am UTC

yup, the ANSI join syntax, you just keep on stringing it along.

babloo, May 25, 2006 - 4:02 pm UTC

Hi Tom ,
I am refering to the very first response you gave on this page
"There are at least 3 ways to do this above (minus might be a 4'th depending on
your circumstances)."
I got from the response that there is no yes/no answer of which may be faster .
So does it mean , anytime we code , we should code all 3 possible ways to check which is faster .
It would not be very effitient development I think.
How do you code stuff like this ? For every sql , should we code all posssible ways and then see which is faster . I have been asked this question many times in the code review " why did you use this approach and not the other "

2. for not exists , you explained how it works
for x in ( select * from rollup ) loop
if ( not exists ( that query ) ) then
OUTPUT
end if;
end
Please you explain other approcahes too ?

3. are there any thumbs of rules for these 4 approches which tell us which of these may be faster

Thanks ......


Tom Kyte
May 25, 2006 - 5:59 pm UTC

answer is changed to:

USE THE CBO, it'll figure it out.

noted above in the evolution of this Q&A

Not clear

babloo, May 25, 2006 - 10:57 pm UTC

Hi Tom ,
Sometimes you amaze me how promptly you give answers . Thank you so much ......
But
I am confused . are you sayng that even if I coded using exists, CBO will be able to figure out when it is not effitient and decide that (say) it is best to execute as If I wrote using outer join and will execute like outer join ?

Tom Kyte
May 26, 2006 - 8:33 am UTC

yes, the CBO will rewrite the query using various techniques - whereas the RBO would just follow a very strict set of rules.

Is this a new CBO feature

Matthew, June 16, 2006 - 5:55 am UTC

You said:
yes, the CBO will rewrite the query using various techniques - whereas the RBO
would just follow a very strict set of rules.

Is this a new feature of 9i/10g? I have experienced significant performance changes on our 8.1.7.4.0 database when rewriting queries that use IN to instead use EXISTS.

Tom Kyte
June 16, 2006 - 7:00 pm UTC

the CBO is a constantly evolving beast, hugely different in 9i over 8i and 10g over 9i.


and are you sure you were using the CBO with 8i...

(and do you have an example still?)

EXISTS v IN in 8.1.7.4.0

Matthew, June 21, 2006 - 9:46 am UTC

Yes I still have the tkprof files from the change I made from IN to EXISTS.

Using IN:

************************************************************

SELECT 'Y'
FROM
PARAMETER_OPTIONS WHERE POX_ID IN (SELECT POX_POX_ID FROM MUL_QUOTE_OPT
WHERE TMQE_TMQE_ID IN (SELECT TMQE_SEQ FROM MUL_QUOTE_ELE WHERE QUO_ID =
:b1 AND LEL_ID = :b2 ) AND FPR_FPR_ID IN (SELECT FPR_FORMULA_PARAMETER
FROM QUOTE_ELEMENT_PARAMETERS WHERE QEL_ID IN (SELECT QEL_ID FROM
QUOTATION_ELEMENTS WHERE QMD_QMD_ID = :b3 AND LEL_LEL_ID = :b2 )) AND
POX_POX_ID IS NOT NULL ) AND NOTE IS NOT NULL AND ROWNUM = 1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.07 7 79 0 0
Execute 8 0.03 0.02 0 0 0 0
Fetch 8 373.36 982.56 562161 416194 1324 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 373.42 982.65 562168 416273 1324 0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer goal: CHOOSE
Parsing user id: 12

Rows Row Source Operation
------- ---------------------------------------------------
0 COUNT STOPKEY
0 MERGE JOIN
9 VIEW VW_NSO_1
9 SORT UNIQUE
1 MERGE JOIN
9 SORT JOIN
1 HASH JOIN
16 MERGE JOIN CARTESIAN
16 TABLE ACCESS BY INDEX ROWID MUL_QUOTE_ELE
400 INDEX RANGE SCAN (object id 17730)
16 SORT JOIN
16 TABLE ACCESS BY INDEX ROWID QUOTATION_ELEMENTS
184 INDEX RANGE SCAN (object id 18173)
585488 TABLE ACCESS FULL MUL_QUOTE_OPT
1 SORT JOIN
25675345 TABLE ACCESS FULL QUOTE_ELEMENT_PARAMETERS
0 SORT JOIN
0 TABLE ACCESS FULL PARAMETER_OPTIONS

************************************************************

Using EXISTS:

************************************************************

SELECT 'Y'
FROM
PARAMETER_OPTIONS WHERE EXISTS (SELECT 1 FROM MUL_QUOTE_OPT,
MUL_QUOTE_ELE WHERE POX_ID = POX_POX_ID AND TMQE_TMQE_ID = TMQE_SEQ AND
QUO_ID = :b1 AND LEL_ID = :b2 AND FPR_FPR_ID IN (SELECT
FPR_FORMULA_PARAMETER FROM QUOTE_ELEMENT_PARAMETERS WHERE QEL_ID IN
(SELECT QEL_ID FROM QUOTATION_ELEMENTS WHERE QMD_QMD_ID = :b3 AND
LEL_LEL_ID = :b2 )) AND POX_POX_ID IS NOT NULL ) AND NOTE IS NOT NULL
AND ROWNUM = 1


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 8 0.01 0.01 0 0 0 0
Fetch 8 0.01 0.01 1 8 32 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 0.02 0.03 1 8 32 0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer goal: CHOOSE
Parsing user id: 12

Rows Row Source Operation
------- ---------------------------------------------------
0 COUNT STOPKEY
0 FILTER
8 TABLE ACCESS FULL PARAMETER_OPTIONS
0 NESTED LOOPS
0 MERGE JOIN CARTESIAN
0 NESTED LOOPS
0 TABLE ACCESS BY INDEX ROWID MUL_QUOTE_ELE
0 INDEX RANGE SCAN (object id 17730)
0 TABLE ACCESS BY INDEX ROWID MUL_QUOTE_OPT
0 INDEX RANGE SCAN (object id 17936)
0 SORT JOIN
0 TABLE ACCESS BY INDEX ROWID QUOTATION_ELEMENTS
0 INDEX RANGE SCAN (object id 18173)
0 INDEX UNIQUE SCAN (object id 49556)

************************************************************

Both queries are of the form:

SELECT 'Y'
FROM PARAMETER_OPTIONS
WHERE <condition>
AND NOTE IS NOT NULL
AND ROWNUM = 1

The data is such that there are no rows where NOTE is not null so the CBO would ideally be able to use this fact to avoid ever having to evaluate <condition>. (the statistics on this column has NUM_NULLS equal to the number of rows in the table so incorrect stats doesn't appear to be a problem) However it appears that the CBO can make this optimisation when using EXISTS, but not when using IN.

Tom Kyte
June 22, 2006 - 11:23 am UTC

where the estimated cardinalities even close to correct on the plans?



Outer Join question

Bob, August 23, 2006 - 12:00 pm UTC

Hi Gurus,

I have a query of the form:

select a.cust_no,
sum(nvl(value1, 0)) value1,
sum(nvl(value2, 0)) value2,
from table1 a
where exists
(select 'x'
from table2 b,
table3 c
where b.col1 = c.col1 (+)
and b.col2 = c.col2 (+)
and (b.col3 = a.col3
or b.col3 = a.col4)
and b.value1 = 1
and b.value2 = 2
and b.value3 is null)

table1 - has 5 million records
table2 - has 130,000 records
table3 - has 1.6 million records

I know that (obviously) the configuration of the indexes will have an impact. But can anyone think of an alternative way of writing this - without the outer joins!

Cheers

Tom Kyte
August 27, 2006 - 5:38 pm UTC

why is the outer join to C relevant?


if there is a record in B, you either

a) join it to a mate in C or...
b) if there is no mate in C you "make one up"

either way it comes down to "if there is a record in B"?



Outer Join

Bob, August 30, 2006 - 11:03 am UTC

Sorry, there is a syntax error in my SQL. It should read c.value3 is null

Tom Kyte
August 30, 2006 - 5:25 pm UTC

what is the current plan and what does the plan become if you use IN (select from B where not in (select from C where c_cols are not null) ) instead of exists with the nasty outer join?

Outer join Vs not exists

Rj, January 19, 2007 - 1:29 pm UTC

i would like to display those records from the detail table where tr_amount is greater than amount mainitaied in the amount_cutoff table.

i would like to ask query using not exists is better or an outer join . can we use outer join against numeric values.



Best regards

Rj

amount_cutoff

REC_ID CURRENCY AMOUNT
R1 USD 10000.00
R2 EUR 110000.00

TRANSACTION_DETAILS

REFNO CURRENCY TR_AMOUNT
10001 USD 50000.00
10002 USD 100000.00
10003 USD 3000.00
10004 USD 2000.00
10005 USD 50000.00
10006 USD 100000.00
10008 eur 2000.00
10009 eur 50000.00
10010 eur 100000.00




LEFT JOIN vs RIGHT JOIN

James Su, December 16, 2008 - 2:09 pm UTC

Dear Tom,

Can you take a look at this example and tell me why the two joins give different result? Thanks!

CREATE TABLE T (ID NUMBER);
INSERT INTO T VALUES (1);
INSERT INTO T VALUES (2);
INSERT INTO T VALUES (3);
INSERT INTO T VALUES (4);

CREATE TABLE T2 (ID NUMBER, NAME VARCHAR2(10));
INSERT INTO T2 VALUES (1,'A');
INSERT INTO T2 VALUES (2,'A');
INSERT INTO T2 VALUES (3,'A');
INSERT INTO T2 VALUES (4,'A');

INSERT INTO T2 VALUES (1,'B');
INSERT INTO T2 VALUES (2,'B');
--- 3,'B' is missing here
INSERT INTO T2 VALUES (4,'B');


SELECT T.ID,T2.ID, T2.NAME
FROM T
LEFT JOIN T2
PARTITION BY (NAME)
ON T.ID = T2.ID;

ID ID NAME
---------- ---------- --------------
1 1 A
2 2 A
3 3 A
4 4 A
1 1 B
2 2 B
3 B ------- this is expected
4 4 B


8 rows selected.

SELECT T.ID,T2.ID, T2.NAME
FROM T2
RIGHT JOIN T
PARTITION BY (NAME)
ON T2.ID = T.ID;


ID ID NAME
---------- ---------- --------
1 1 A
2 2 A
3 3 A
4 4 A
1 1 B
2 2 B ----- 3,'B' is missing here
4 4 B

7 rows selected.

Tom Kyte
December 29, 2008 - 11:31 am UTC

ops$tkyte%ORA10GR2> SELECT T.ID,T2.ID, T2.NAME
  2      FROM T2
  3           PARTITION BY (NAME)
  4           RIGHT JOIN T
  5           ON T2.ID = T.ID;

        ID         ID NAME
---------- ---------- ------------------------------
         1          1 A
         2          2 A
         3          3 A
         4          4 A
         1          1 B
         2          2 B
         3            B
         4          4 B

8 rows selected.


it isn't right versus left, it is the placement of the partition clause

Outer join and Null values

Rakesh Bansal, January 06, 2009 - 8:55 am UTC

Tom,

Just in continuation of the above disucssion, why oracle behaves differently in two Database (same version)

SQL> select a.id, a.grp from
  2  (select 1 id, null grp from dual) a,
  3  (select 1 id, null grp from dual) b
  4  where a.id = b.id (+)
  5  and a.grp = b.grp (+)
  6  ;

no rows selected

SQL> conn admin_batch@rxros11
Enter password: ********
Connected.
SQL> select a.id, a.grp from
  2  (select 1 id, null grp from dual) a,
  3  (select 1 id, null grp from dual) b
  4  where a.id = b.id (+)
  5  and a.grp = b.grp (+);

        ID G
---------- -
         1

Any environment setting issue?

Tom Kyte
January 06, 2009 - 10:27 am UTC

let's see the plans for those

it should return a row, this will be "a bug"

show us plans - versions.

outer join

A reader, March 30, 2009 - 11:40 am UTC

I have these two tables. The two tables are joined on (bkno,media) with one
eception. The two media (CA,CR) should be treated as having same author.
I want to find the book authors whether they have one or not.


Book
-----
bkno number(10), PK
media varchar2(5) PK

author
------
bkno number(10)
media varchar2(5),
fname varchar2(20)
lname varchar2(20)

I did this it works.

select distinct B.media,B.bkno,A.fname,A.lname
from book b
left join author a on b.bkno = a.bkno
and case when b.media in ('CA','CR') and a.media in ('CA','CR') then 'CA' else b.media end =
case when b.media in ('CA','CR') and a.media in ('CA','CR') then 'CA' else a.media end
order by B.bkno


But when i do it using the (+) non-ANSI way

select distinct B.media,B.bkno,A.fname,A.lname
from book b, author a where b.bkno = a.bkno(+)
and case when b.media in ('CA','CR') and a.media in ('CA','CR') then 'CA' else b.media end =
case when b.media in ('CA','CR') and a.media in ('CA','CR') then 'CA' else a.media(+) end
order by B.bkno


I always get an error

*
ERROR at line 4:
ORA-01417: a table may be outer joined to at most one other table

Any ideas why and how to do same using (+)

ANSI equivalent Vs Using (+) in loops

A reader, May 31, 2009 - 4:01 am UTC

Hi Tom,
Thanks for your time.

SQL> create table a ( x number, y varchar2(30));

Table created.

SQL> create table b ( x number, y varchar2(30));

Table created.

SQL> insert into a values (1,'A1');

1 row created.

SQL> insert into a values (2,'A2');

1 row created.

SQL> insert into a values (3,'A3');

1 row created.

SQL> insert into a values (4,'A4');

1 row created.

SQL> insert into a values (null,'A99');

1 row created.

SQL> insert into b values (3,'B3');

1 row created.

SQL> insert into b values (4,'B4');

1 row created.

SQL> insert into b values (5,'B5');

1 row created.

SQL> insert into b values (6,'B6');

1 row created.


SQL> select * from a
  2  /

         X Y
---------- ------------------------------
         1 A1
         2 A2
         3 A3
         4 A4
           A99

SQL> select * from b;

         X Y
---------- ------------------------------
         3 B3
         4 B4
         5 B5
         6 B6

Query1:

SQL> select a.x,count(b.x) from a , b where a.x = b.x(+) and a.x is not null group by a.x order by a.x;

         X COUNT(B.X)
---------- ----------
         1          0
         2          0
         3          1
         4          1

Query2:
 
SQL> select a.x,count(b.x) from a LEFT OUTER JOIN b ON a.x = b.x and a.x is not null group by a.x order by a.x;

         X COUNT(B.X)
---------- ----------
         1          0
         2          0
         3          1
         4          1
                    0

Questions : 

a) Since keyword (+)  can be avoided if we use the ANSI equivalent keyword LEFT OUTER JOIN then why there is a difference in the output 
   using  using (+) ( query1) and using  LEFT OUTER JOIN (query2) ?

b) What would be the best way to re-write this query, if i have the following requirements ?
   - Calcalute count from table b ( column x) for   all not null rows of table a ( column x).


 


Regards
   
 

Tom Kyte
June 01, 2009 - 7:53 pm UTC

your last query is not "phraseable" using (+) joins - it is a new capability of the ansi joins

but no worries, you use the ON for your JOIN from A to B, you use WHERE t filter:

ops$tkyte%ORA10GR2> select a.x,count(b.x) from a LEFT OUTER JOIN b ON a.x = b.x WHERE a.x is not null group by a.x order by a.x;

         X COUNT(B.X)
---------- ----------
         1          0
         2          0
         3          1
         4          1

Right, Left outer join and outer join with + sign

A reader, August 14, 2009 - 10:08 am UTC

Hi Tom,


I am confused over right, left outer join and outer join with + sign. My questions are:

1. What is the difference between left and right outer join.
2. Why should I use new syntax of left, right syntax when I can use + sign for the same.


Thanks,

Arvind







Tom Kyte
August 24, 2009 - 8:00 am UTC

well, if you are talking "right and left", you are not using (+)


select * from t1, t2 where t1.x = t2.x(+);

that says "return all rows in T1, and if there is a match in T2, get it - but don't let the fact that T2 doesn't have a match prevent a row in T1 from being returned"

ops$tkyte%ORA10GR2> create table t1 ( x number );

Table created.

ops$tkyte%ORA10GR2> create table t2 ( y number );

Table created.

ops$tkyte%ORA10GR2> insert into t1 select rownum from dual connect by level <= 2;

2 rows created.

ops$tkyte%ORA10GR2> insert into t2 select rownum+1 from dual connect by level <= 2;

2 rows created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select x,y from t1, t2 where t1.x = t2.y(+);

         X          Y
---------- ----------
         2          2
         1

ops$tkyte%ORA10GR2> select x,y from t1, t2 where t2.y(+) = t1.x;

         X          Y
---------- ----------
         2          2
         1

ops$tkyte%ORA10GR2> select x,y from t1 left join t2 on (t1.x=t2.y);

         X          Y
---------- ----------
         2          2
         1

ops$tkyte%ORA10GR2> select x,y from t2 right join t1 on (t1.x=t2.y);

         X          Y
---------- ----------
         2          2
         1




the 'left and right' stuff only comes into play when you use the ANSI join syntax and

t1 left join t2 just means "t1.x = t2.x(+)" (keep all rows in the LEFT table - t1 - and if the other table t2 has a match - that is great, return it too

t2 right join t1 is the same as t1 left join t2.

It just specifies which table will be fully 'preserved' in the join.

use_hash hint in ansi outer join not working

David, May 10, 2011 - 11:15 am UTC

Hi

I have a query which requieres a USE_HASH hint to perform ok, however while trying to rewrite it using ANSI OUTER JOIN syntax I noticed the hint stops working.

The plan are as following:

select /*+ use_hash(ts cv) */
ts.guid, cv.characteristic, cv.value
from tech_service ts,
(select characteristic, value, id_tech_service
from charact_values cxv
where cxv.active = 1
and cxv.id_characteristic in (67, 511)) cv
where cv.id_tech_service(+) = ts.id_tech_service
and ts.id_is_tech_service in (39014, 39023, 39092, 39094, 39095)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=19370 Card=24 Bytes=4464)
1 0 HASH JOIN (OUTER) (Cost=19370 Card=24 Bytes=4464)
2 1 INLIST ITERATOR
3 2 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'TECH_SERVICE' (Cost=6 Card=5 Bytes=155)
4 3 INDEX (RANGE SCAN) OF 'TECH_SERVICE_PK' (UNIQUE) (Cost=3 Card=5)
5 1 VIEW (Cost=19041 Card=4734814 Bytes=733896170)
6 5 TABLE ACCESS (FULL) OF 'CHARACT_VALUES' (Cost=19041 Card=4734814 Bytes=85226652)


select /*+ use_hash(ts cv) */
ts.guid, cv.characteristic, cv.value
from tech_service ts left outer join charact_values cv
on cv.id_tech_service = ts.id_tech_service
and cv.active = 1
and cv.characteristic in (67, 511)
where ts.id_tech_service in (39014, 39023, 39092, 39094, 39095)

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=51 Card=25 Bytes=4325)
1 0 NESTED LOOPS (OUTER) (Cost=51 Card=25 Bytes=4325)
2 1 INLIST ITERATOR
3 2 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'TECH_SERVICE' (Cost=6 Card=5 Bytes=155)
4 3 INDEX (RANGE SCAN) OF 'TECH_SERVICE_PK' (UNIQUE) (Cost=3 Card=5)
5 1 VIEW (Cost=9 Card=5 Bytes=710)
6 5 TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'CHARACT_VALUES' (Cost=9 Card=5 Bytes=90)
7 6 INDEX (RANGE SCAN) OF 'CHARACTER_FK' (NON-UNIQUE) (Cost=4 Card=42)



I wonder if there is some problems with HINTs in ANSI JON SYNTAX or simply hints just dont work in this case?

Thanks a lot
Tom Kyte
May 10, 2011 - 11:58 am UTC

Let's fix the real problem, the supposed need for the hint.

Which of the above estimated cardinalities are "way off". One of them must be, let's focus on fixing the root cause, not the band-aid that is a hint.


Show us the original plan unhinted and tell us what the real cardinalities should be. If you don't know, use TKPROF to get that information.

First_rows with order by clause

Ned, July 14, 2011 - 10:09 pm UTC

Hi Tom,

We have an out of the box application that we can NOT modify. The application needs only first 10 rows from the query but due to the "order by” it reads 99% more than it needs. Below is a simple test case. The first SQL is original structure, second one is re-written version with better execution plan optimized for first_rows. The original query sorts at the end while the modified one sorts at the beginning. Can we fix the first query using sqlprofile/hints to force it to behave like the second one without rewriting it? We are on 11.2.0.2 database.

Thanks in advance



create table t1 as select * from dba_objects;
create index t1_i1 on t1(object_name);

create table t2 as select * from dba_objects;
create index t2_i1 on t2(object_id);

create table t3 as select * from dba_objects;
create index t3_i1 on t3(object_type);

exec dbms_stats.gather_table_stats(null,'T1');
exec dbms_stats.gather_table_stats(null,'T2');
exec dbms_stats.gather_table_stats(null,'T3');
var b1 varchar2(32);
exec :b1:='DBA_OBJEC%';

set serverout off;
select /*+gather_plan_statistics */ * from (
SELECT t2.object_id t2_object_id,
t1.object_name t1_object_name,
t2.created t2_created,
t2.object_type t2_object_type,
t3.*
FROM T1, T2, T3
WHERE T1.object_id = T2.object_id
AND T1.OBJECT_NAME LIKE :B1
AND t2.object_type = t3.object_type(+)
ORDER BY T2.CREATED DESC)
where rownum <10;
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 |00:00:00.88 | 3852 | 853 | | | |
|* 1 | COUNT STOPKEY | | 1 | | 9 |00:00:00.88 | 3852 | 853 | | | |
| 2 | VIEW | | 1 | 2785 | 9 |00:00:00.88 | 3852 | 853 | | | |
|* 3 | SORT ORDER BY STOPKEY | | 1 | 2785 | 9 |00:00:00.88 | 3852 | 853 | 4096 | 4096 | 4096 (0)|
| 4 | NESTED LOOPS OUTER | | 1 | 2785 | 112K|00:00:00.92 | 3852 | 853 | | | |
| 5 | NESTED LOOPS | | 1 | 1 | 8 |00:00:00.05 | 26 | 19 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 8 |00:00:00.03 | 13 | 13 | | | |
|* 7 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 8 |00:00:00.03 | 10 | 10 | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| T2 | 8 | 1 | 8 |00:00:00.03 | 13 | 6 | | | |
|* 9 | INDEX RANGE SCAN | T2_I1 | 8 | 1 | 8 |00:00:00.01 | 10 | 3 | | | |
| 10 | TABLE ACCESS BY INDEX ROWID | T3 | 8 | 1923 | 112K|00:00:00.84 | 3826 | 834 | | | |
|* 11 | INDEX RANGE SCAN | T3_I1 | 8 | 1923 | 112K|00:00:00.12 | 298 | 75 | | | |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<10)
3 - filter(ROWNUM<10)
7 - access("T1"."OBJECT_NAME" LIKE :B1)
filter("T1"."OBJECT_NAME" LIKE :B1)
9 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
11 - access("T2"."OBJECT_TYPE"="T3"."OBJECT_TYPE")


SELECT /*+ gather_plan_statistics outline_leaf(@"SEL$2")
no_access(@"SEL$1" "from$_subquery$_002"@"SEL$1")
index_rs (t3 t3_i1)*/
*
FROM T3,
( SELECT t2.object_id t2_object_id,
t1.object_name t1_object_name,
t2.created t2_created,
t2.object_type t2_object_type
FROM T1, T2
WHERE T1.object_id = T2.object_id AND T1.OBJECT_NAME LIKE :B1
ORDER BY T2.CREATED DESC) t
WHERE t2_object_type = t3.object_type(+) and rownum <10;
select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
-----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 |00:00:00.01 | 31 | 22 | | | |
|* 1 | COUNT STOPKEY | | 1 | | 9 |00:00:00.01 | 31 | 22 | | | |
| 2 | NESTED LOOPS OUTER | | 1 | 9 | 9 |00:00:00.01 | 31 | 22 | | | |
| 3 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 26 | 19 | | | |
| 4 | SORT ORDER BY | | 1 | 1 | 1 |00:00:00.01 | 26 | 19 | 2048 | 2048 | 2048 (0)|
| 5 | NESTED LOOPS | | 1 | | 8 |00:00:00.01 | 26 | 19 | | | |
| 6 | NESTED LOOPS | | 1 | 1 | 8 |00:00:00.01 | 23 | 16 | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 8 |00:00:00.01 | 13 | 13 | | | |
|* 8 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 8 |00:00:00.01 | 10 | 10 | | | |
|* 9 | INDEX RANGE SCAN | T2_I1 | 8 | 1 | 8 |00:00:00.01 | 10 | 3 | | | |
| 10 | TABLE ACCESS BY INDEX ROWID | T2 | 8 | 1 | 8 |00:00:00.01 | 3 | 3 | | | |
| 11 | TABLE ACCESS BY INDEX ROWID | T3 | 1 | 9 | 9 |00:00:00.01 | 5 | 3 | | | |
|* 12 | INDEX RANGE SCAN | T3_I1 | 1 | 9 | 9 |00:00:00.01 | 3 | 2 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<10)
8 - access("T1"."OBJECT_NAME" LIKE :B1)
filter("T1"."OBJECT_NAME" LIKE :B1)
9 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
12 - access("T2_OBJECT_TYPE"="T3"."OBJECT_TYPE")


Tom Kyte
July 18, 2011 - 8:51 am UTC

the second query is not the same as the first query, it doesn't have to return the rows in the same order at all.

the second one joins t1/t2, orders by created desc, then joins to t3 ( which can re-sort your T inline view at will, i DO NOT CARE IF YOU HAVE AN EXAMPLE WHEREBY IT DID NOT ORDER THE DATA - the fact is IT CAN and SOMEDAY IT WOULD) and keeps the first 10 rows it happens to have found.

These queries are not even remotely similar.


and without knowing what the relations (primary/foreign keys, whether 1:1, 1:m, etc etc etc) are - I cannot even rewrite this query (all I know is - they are NOT the same)


Heck, I just ran them - they do not even give the same answer - I don't know why you think you can compare them????



I would try to a SQL profile - create that in production - we look at the historical workload and use that to influence the plan.


But in general, if your relation to T3 is 1:M, you cannot short circuit the join like that. Getting the first 10 sorted records from t1/t2 and then joining to t3 ISN'T THE SAME AS joining t1,t2,t3 and sorting and getting the first ten records.

native full outer join

Rajeshwaran, Jeyabal, November 02, 2011 - 4:42 pm UTC

Tom:

I was reading this article
http://blogs.oracle.com/optimizer/entry/outerjoins_in_oracle
In the Q/A section, its mentioned as

Q3: Has native full outer join been made available in
versions prior to 11gR1?

A3: Yes. It is available in 10.2.0.3 and 10.2.0.4, but not by
default.


Questions
1) Can you tell us what one should do to enable native full outer join in 10.2.0.3 and 10.2.0.4 ?

2) I was trying this example in 10.2.0.5 and i dont see Native full outer join is available. Is that not available in 10.2.0.5? But i see that in 11G


rajesh@ORA10GR2> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

Elapsed: 00:00:00.04
rajesh@ORA10GR2>
rajesh@ORA10GR2> set autotrace traceonly explain;
rajesh@ORA10GR2>
rajesh@ORA10GR2> select t1.x, t2.y
  2  from t1 full outer join t2
  3  on (t1.x = t2.x)
  4  /
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 2841162349

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     4 |   104 |    13   (8)| 00:00:01 |
|   1 |  VIEW                |      |     4 |   104 |    13   (8)| 00:00:01 |
|   2 |   UNION-ALL          |      |       |       |            |          |
|*  3 |    HASH JOIN OUTER   |      |     3 |   117 |     7  (15)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T1   |     3 |    39 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| T2   |     3 |    78 |     3   (0)| 00:00:01 |
|*  6 |    HASH JOIN ANTI    |      |     1 |    39 |     7  (15)| 00:00:01 |
|   7 |     TABLE ACCESS FULL| T2   |     3 |    78 |     3   (0)| 00:00:01 |
|   8 |     TABLE ACCESS FULL| T1   |     3 |    39 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T1"."X"="T2"."X"(+))
   6 - access("T1"."X"="T2"."X")

Note
-----
   - dynamic sampling used for this statement

rajesh@ORA10GR2>

rajesh@ORA11GR2> select t1.x, t2.y
  2  from t1 full outer join t2
  3  on (t1.x = t2.x)
  4  /
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 53297166

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |     3 |    78 |     7  (15)| 00:00:01 |
|   1 |  VIEW                 | VW_FOJ_0 |     3 |    78 |     7  (15)| 00:00:01 |
|*  2 |   HASH JOIN FULL OUTER|          |     3 |   117 |     7  (15)| 00:00:01 |
|   3 |    TABLE ACCESS FULL  | T1       |     3 |    39 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T2       |     3 |    78 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."X"="T2"."X")

Note
-----
   - dynamic sampling used for this statement (level=2)

rajesh@ORA11GR2>

Tom Kyte
November 03, 2011 - 12:58 am UTC

sorry, I don't go into the un-documented, just not really worth it. too many other possible side effects.

why wouldn't you ask on the original blog by the way?

More to Explore

Performance

Get all the information about database performance in the Database Performance guide.