Helena Markova, April     03, 2001 - 4:05 am UTC
 
 
  
 
phrase it
A reader, September 26, 2001 - 2:20 pm UTC
 
 
Kindly phrase the following query for us
select count(*) from emp where empno not in ( select mgr from emp );
 
 
 
Reader
A reader, September 30, 2002 - 10:01 pm UTC
 
 
Tom,
Instead of
SQL> select count(*) from emp T1
  2  where not exists ( select null from emp T2 where t2.mgr = t1.empno );
you could have used 
SQL>  select count(*) from emp T1
  2  where not exists ( select mgr  from emp T2 where t2.mgr = t1.empno );
Could you tell what circumstances do we use "select null"
instead of "select <value>". Are there any advantages
 
 
 
October   01, 2002 - 9:54 am UTC 
 
why select mgr?
I find select null to be semantically more meaningful.  You are NOT selecting anything really -- so say that. 
 
 
 
Why problem if NULL is there 
Tony, October   02, 2002 - 2:02 am UTC
 
 
Hi Tom,
       Your answer is superb. Can you tell us why there is no record selected for NOT IN when there is NULL?
 
 
October   02, 2002 - 10:25 am UTC 
 
Because NULL means -- gee, I don't know.  (litterally, null means Unknown)
So, the predicate
where x not in ( NULL )
evaluates to neither TRUE, nor FALSE
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from dual where dummy not in ( NULL );
no rows selected
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from dual where NOT( dummy not in (NULL) );
no rows selected
(you would think one of the two queries would return a row -- but there is a third state for a boolean expression in sql -- "I don't know what the answer is") 
 
 
 
 
select rows using exists
Madhuri Bondre, March     31, 2003 - 6:44 am UTC
 
 
Your solution is very useful
but i have one query
i have two tables
first table with item_cd, sub_item_cd (both the columns are in primary key)
second table with sub_item_cd (no primary key)
i want to filter only those records from first table where all sub_item_cd exists in second table
ex. 
rows of first table 
item_cd     sub_item_cd 
   
1010        A
1010            B
1010        C
1011        A
1011        D
1012        A
rows of second table  
   
sub_item_cd
A
A
A
B
B
C
B
C
i need output as
item_cd 
1010
1012
 
how can I get it in one select ?
can exists will solve it?
thanks in advance 
 
 
March     31, 2003 - 9:02 am UTC 
 
ps$tkyte@ORA920> select t1.item_cd, t1.sub_item_cd, decode( t2.sub_item_cd, NULL, 1, 0 )
  2    from t1, (select distinct sub_item_cd from t2 ) t2
  3   where t1.sub_item_cd = t2.sub_item_cd(+)
  4  /
   ITEM_CD S DECODE(T2.SUB_ITEM_CD,NULL,1,0)
---------- - -------------------------------
      1010 a                               0
      1011 a                               0
      1012 a                               0
      1010 b                               0
      1010 c                               0
      1011 d                               1
6 rows selected.
ops$tkyte@ORA920> select t1.item_cd
  2    from t1, (select distinct sub_item_cd from t2 ) t2
  3   where t1.sub_item_cd = t2.sub_item_cd(+)
  4   group by t1.item_cd
  5  having max(decode( t2.sub_item_cd, NULL, 1, 0 )) = 0
  6  /
   ITEM_CD
----------
      1010
      1012
ops$tkyte@ORA920> 
 
 
 
 
Thanks
Madhuri bondre, April     01, 2003 - 12:31 am UTC
 
 
you are great !!!! 
 
 
Mirjana, May       05, 2003 - 2:51 am UTC
 
 
  
 
NOT IN or NOT EXISTS or
Gautam, May       24, 2003 - 11:27 pm UTC
 
 
Hi Tom, Your points are very helping, however, in case of having 20 Mill rows in outer table and Mill in inner table none is working, both are too slow. Is there a way to avoid the subquery being executed for retrieval of every row from the outer query. 
 
May       25, 2003 - 9:59 am UTC 
 
give me a "for example" and define "too slow".
on my pc, to full scan a 1,000,000 row table (a copy of all_objects -- the WIDTH * HEIGHT of a table is relevant -- not just the height.  I have 1,000,000 tables that consume lots less space then a 10,000 row one) takes about 7 seconds.  a 20million -- 20 times as long.
As long as you are using the CBO, and the tables are analyzed, it will use a hash anti join and be very fast. 
 
 
 
Gautam, May       27, 2003 - 4:54 pm UTC
 
 
Hi Tom, Thanks for your reply. 
I have following observations :
Query:
SELECT ind_id, lst_ord_ss, dec_sc_is, lst_ord_is, 
       ss_group, dec_sc_ss, list_pri, cont_lvl, 
       site_upd_dt, source_cd, sd_custtyp, site_id, 
       ss_subgrp
 FROM multi_mv
WHERE country_cd = :sys_b_00
  AND dsf_index IN (:sys_b_01, :sys_b_02)
  AND mail_cont IN (:sys_b_03, :sys_b_04)
  AND mail_site = :sys_b_05
  AND ss_group IN     
     (:sys_b_06, :sys_b_07, :sys_b_08, :sys_b_09, :sys_b_10)
  AND NOT ind_id IN (SELECT DISTINCT ind_id
                             FROM ssw.promo
                             WHERE drop_cd = :sys_b_11)
1. The plan of above query. NL's
SELECT STATEMENT   Cost = 193604
  FILTER
    TABLE ACCESS FULL MULTI_MV
    TABLE ACCESS BY INDEX ROWID PROMO
      INDEX RANGE SCAN PROM_DROP_CD
2. Adding "ORDER BY int_id"into above Query changes 
   the plan to seems like MERGE_AJ 
SELECT STATEMENT   Cost = 118609
  TABLE ACCESS BY INDEX ROWID MULTI_MV
    INDEX FULL SCAN MV_MULTI_IND_ID
      TABLE ACCESS BY INDEX ROWID PROMO
        INDEX RANGE SCAN PROM_DROP_CD
3. Changing "NOT IN" to "NOT EXISTS"
SELECT STATEMENT   Cost = 198944
  SORT ORDER BY
    HASH JOIN ANTI
      TABLE ACCESS FULL MULTI_MV
      TABLE ACCESS FULL PROMO
Why is the CBO choosing Anti join only in case of NOT EXISTS ?
Internally the Anti Hash Join retrives the full row or only the join column ?
The approx time for the query to finish in first 2 cases is 5 days.  
  
I would appreciate any comments on this ?. 
 
May       27, 2003 - 5:51 pm UTC 
 
obviously NO BIND variables on your system eh.  
I'll guess (in light of no creates, no knowledge of indexes, no dbms_stats.set_table_stats to let me know how big/how many rows things are, etc etc etc)
Is ind_id in ssw.promo NULLABLE
and if so, what happens when you say:
  AND NOT ind_id IN (SELECT DISTINCT ind_id
                             FROM ssw.promo
                             WHERE drop_cd = :sys_b_11
                               and IND_ID IS NOT NULL )
instead (you do know of course that NOT EXISTS and NOT IN are *not* interchangeable, they'll give different results!)
 
 
 
 
Not In Vs Not Exists !
ramki, May       27, 2003 - 7:41 pm UTC
 
 
Tom,
I have 2 tables. Both tables have around 8000 records.
Table 1:
ID
parentID
col1
col2
col3
Table1: Primary key: ID
Table2
Module
ID
processDate
ParentID
dataCol1
dataCol2
source_column
source_fg
Table2 Primary key: Module, Id, processDate
Table 1 is a superset of all Ids that Table 2 has.
ParentIds for Ids in Table1 and Table2 would be same for same Id.
Now, I need to pick those Ids from Table1 that are 
 -  not in Table2 
 - have same ParentID from Table2.
 - for records that have source_column in table2 is not null
 - for records in Table2 that have source_fg as null
Table1:
ID     ParentId    Col1    col2    col3
101    111        10    20    30
102    111        20    40    60
103    112        10    10    10
104    113        10    20    30
105    113        30    40    50
106    114        40    40    50
107    113        50    50    50
Table2:
Module    ID    ParentId    ProcessDate    datacol1    datacol2    source_column source_fg
1    101    111        27-MAY-2003    1000        1000        2000        
1    103    112        27-MAY-2003    2000        2000        2000        Y
1    104    113        27-MAY-2003    3000        3000        3000        
1    106    114        27-MAY-2003    2300        2300        3500
Now, I need the out to be:
102
105
107
Select t1.Id
from Table1 t1, Table2 t2
where not Exists (select null from Table2 where table2.module=1 and 
table2.id = t1.id and table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
Actually, I need to insert these Ids in Table2 and replicate data from their parentIDs in Table2.
so, New records would be
Module    ID    ParentId    ProcessDate    datacol1    datacol2    source_column source_fg
1    102    111        27-MAY-2003    1000        1000        2000        
1    105    113        27-MAY-2003    3000        3000        3000    
1    107    113        27-MAY-2003    3000        3000        3000    
I am trying to use, Insert into table2
select ......
Which one would be better: (using RBO)
Select t1.Id, t2.datacol1, t2.datacol2..........
from Table1 t1, Table2 t2
where not Exists (select null from Table2 where table2.module=1 and 
table2.id = t1.id and table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
or 
Select t1.Id, t2.datacol1, t2.datacol2..........
from Table1 t1, Table2 t2
where t1.Id not in (select Id from Table2 where table2.module=1 and 
            table2.processDate = sysdate)
and t1.parentId = t2.parentId
and t2.Module = 1
and t2.processDate = sysdate
and t2.source_column is not null
and t2.source_fg is null
Please advice. Thanks
 
 
May       27, 2003 - 8:01 pm UTC 
 
neither, rbo is the wrong technology.
In anycase, it seems you have the test data, benchmark it?  it is what I would have to do in order to show you.
rbo would be wrong.  rbo would use the least efficient plans for these types of queries.  cbo would do it most efficiently. 
 
 
 
Problem with my NOT EXISTS statement
Godwin, May       30, 2003 - 8:03 am UTC
 
 
Hi Tom,
Its difficult getting you meanwhile this question is similar what is being discussed and I will be grateful if you will mail the reply to godwinofoe@yahoo.com. Please don't ignore reading this.I need an urgent reply to be able to get this report to management.
The problem is in generating a report in Report6i from a University HR database Oracle7.3.2 Server.
The problem is like this:
In a University, staff who retired are taken on contract to work for some time. Also staff who resigned from the university and want to come back are re-engaged.
Now management requested for a report on all active staff (i.e to exclude retirees and include staff re-engaged).
When I wrote a query to produce such report I was not getting the correct result. The query failed to exclude the retired staff. The logic i was using is that the subquery should select all staff who retired and resigned from the university. At that instance of creating their service record, their start date is the same as their end date so the subquery would select by using the condition where the start date equals the end date. When such records of staff are found by the subquery,the outer query would compare its result with the inner query and then exclude any matching records found in the inner query result. I hope i made myself clear.Here is the detail of my query:
SELECT  distinct id,name
,INITCAP(LOWER(dname)) 
,disc,descrip
,startdate,endate
from staff a
,dept b
,jobhist c
,dictator d
,postab e
WHERE a.id= c.stid
 AND   a.deptno= b.deptno
 AND   c.indic = d.code
 AND   c.jobtitle = e.postid
AND NOT EXISTS 
             (SELECT 1
              FROM jobhist j
              WHERE j.stid=a.id
              AND j.startdate=(SELECT MAX(NVL   k.endate,k.startdate)) 
                             FROM jobhist k,jobhist n
                             WHERE k.stid = a.id
                             AND k.stid=n.stid
                             AND c.indic IN('01','02','04')
                                            )
                 )
ORDER BY   name asc 
This is the result got from the above query:
DeptFile# NameService Indicator    PostStart Date    End Date
Operations    10003    Mana    NEW APPOINTMENT    LECTURER    1-Dec-87    20-Jan-99
Operations    10003    Mana    re-enganged    SENIOR LECTURER    1-Jun-01     
Research    10001    Ofoe    NEW APPOINTMENT    LECTURER    15-Aug-02    15-May-03
Research    10001    Ofoe    PROMOTION    SENIOR LECTURER    15-May-03     
Sales    10002    John    NEW APPOINTMENT    Junior Admin Asst    14-Jun-75    15-May-88
Sales    10002    John    PROMOTION    LECTURER    16-May-90    11-Jun-00
Sales    10002    John    PROMOTION    SENIOR ADMIN    15-May-88    30-May-90
Sales    10002    John    PROMOTION    SENIOR LECTURER    13-Jun-00    30-Jul-02
The correct output should have excluded staff called John since he is retired.
Here are the records stored in the various tables:
Jobhist Table
File# Ser.Indicator Postcode StartDate End Date
10001    A1    5    15-Aug-02    15-May-03
10001    A2    6    15-May-03     
10002    A3    7    14-Jun-75    15-May-88
10002    A3    6    15-May-88    30-May-90
10002    A1    6    16-May-90    11-Jun-00
10002    A2    6    13-Jun-00    30-Jul-02
10002    A2    1    30-May-03    30-May-03
10003    A1    5    1-Dec-87    20-Jan-99
10003    A2    4    1-Jul-00    1-Jul-00
10003    A2    3    1-Jun-01     
Staff table:
File#  Name    Hiredate     StaffType Deptno    Birthdate
10001    Ofoe    12-Aug-02    3    20      
10004    Kofi    1-Jun-75    3    10    15-Jun-40
10002    John    12-May-14    3    30    12-Jun-70
10003    Mana    12-Sep-87    2    40    17-Jul-56
Service Indicator Table(dictator):
Code    Description
1    retired
2    contract
3    re-enganged
4    resigned
5    NEW APPOINTMENT
6    PROMOTION
Is there a mistake I made, is there a different way of achieving the result or is SQL incapable of achieving the result? I need your help please. Thanks for your help.
 
 
May       30, 2003 - 8:40 am UTC 
 
well, first - i don't read every review, most but not all.  Also, I read them as I have time -- not on any sort of set schedule
second, I'm not emailing people on individual followups.  I'd spend the other half of my life doing that (leaving no half for me)
third, the example you put forth is really really really hard to follow.  Your sample tables for example haven't many columns at all that match the columns in your query.  I don't understand your datamodel (as you should).  So, I'll make a simple guess
sounds like you want STAFF records such that their last (most recent) record in JOBHIST does not have an indic code in ( 1, 2, 4 )
so, I think we can ignore enddate all togther, just find the max startdate for each person.  we can start with:
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic )
  from jobhist
 group by file#
Now, that will give us a set of file# (which I assume is the key into STAFF) and their max startdate value ALONG WITH THE INDIC code for that max date.
We can then:
select file#
  from (
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic ) data
  from jobhist
 group by file#
       )
 where substr( data, 15 ) not in ( '01', '02', '04' )
/
That'll be the set of ACTIVE STAFF file#'s.
Now you can just
select * 
  from staff
 where file# in ( <that query> );
 
 
 
 
You got what i mean but not getting the actual result I want
Godwin, June      02, 2003 - 8:59 am UTC
 
 
Hi Tom,
Thanks for the response.
With reference to your understanding "sounds like you want STAFF records such that their last (most recent) record in 
JOBHIST does not have an indic code in ( 1, 2, 4 )"
What i mean is that staff who resign(4) or retired(1)
shouldn't have their names or previous service records in the JOBHIST table appeared in the report at all.
Take it as that the STAFF TABLE is the MASTER Table with the primary key fileNo and the JOBHIST table is the DETAIL TABLE with FileNo as the foreign Key.
Your solutions you gave produced the names and previous service records of staff excluding their most recent records that has indic code in (1,2,4). 
We don't the records of those whose most recent records that has indic code in (1,2,4) appeared.
Can't there be away of achieving that?
Thanks for your help. 
 
 
June      02, 2003 - 9:16 am UTC 
 
you say 
"we don't the records ... with 1,2,4 appeared"
If I read that to mean "why don't" -- then it is because I filtered them:
select file#
  from (
select file#, max( to_char(startdate,'yyyymmddhh24miss') || indic ) data
  from jobhist
 group by file#
       )
 where substr( data, 15 ) not in ( '01', '02', '04' )
/
Look at the concept there.  (you should be able to fix this, it is after all your set of tables???)
That should -- if you've explained things accurately -- return the set of primary keys from jobhist for joining to the other table.  The set of records this returns is the MAX startdate record for everyone in jobhist (their LAST ENTRY) if and only if their indic code isn't in 1, 2, 4
Take the technique, verify the results -- apply the technique to your problem.
Since I 
a) don't know your tables
b) don't know your keys
c) am having a hard time following your explaination
this is the best I can do. 
 
 
 
It was wonderful
Godwin, June      02, 2003 - 10:53 am UTC
 
 
Hi Tom,
It is working perfectly.
Thanks for your patience although my questions were not explanatory enough,your guess were correct.I made some mistakes in implementing the solution.
Just as you said,I will now fine tune the query so i can get the records from more than one tables. I will do my best to get accurate result but please bear with me if i should get back to you for additional help.
God bless!!
 
 
 
avoid correlated subquery
umesh, August    10, 2003 - 8:55 am UTC
 
 
SELECT 
  count(*)
FROM 
  AER A,
  AER_PRODUCT B ,
  AER_PRODUCT_APPROVAL_NO P ,
  AER_REACT E
WHERE 
  A.AER_DELETED = '0'  AND
  B.RECORD_DELETED != '1' AND
  E.RECORD_DELETED != '1'  AND
  (B.NOT_RELEVANT IS NULL) AND
  (E.NOT_RELEVANT IS NULL)   AND
  B.TRADE_NAME LIKE :1 AND
  (  
     ( B.PRODUCT_TYPE_MULTI = '0' 
       AND
      (B.PRODUCT_TYPE = '1' OR B.PRODUCT_TYPE = '2' OR B.PRODUCT_TYPE = '4'  )  
     )  
     OR  
     ( B.PRODUCT_TYPE_MULTI = '1'  
     AND
      ( ( P.PRODUCT_TYPE = '1' OR P.PRODUCT_TYPE = '2' OR P.PRODUCT_TYPE = '4' )  
       AND P.COUNTRY_CODE = NVL(A.AER_COUNTRY, '078') 
         AND B.SEQ_PRODUCT = P.SEQ_PRODUCT  AND B.AER_ID = P.AER_ID   
     )  
     OR  
    (NOT EXISTS 
        ( SELECT Y.APPROVAL_NO FROM AER_PRODUCT_APPROVAL_NO Y 
          WHERE Y.COUNTRY_CODE  = NVL(A.AER_COUNTRY, '078') 
      AND B.SEQ_PRODUCT = Y.SEQ_PRODUCT  AND  B.AER_ID = Y.AER_ID
    )
    AND
    B.PRODUCT_TYPE != '3' 
 ))  
  ) 
  AND
  (B.PRODUCT_FLAG = '1'  OR B.PRODUCT_FLAG = '4')  AND
  B.AER_ID = A.AER_ID  AND
  E.AER_ID = A.AER_ID  AND
  E.EVENT_TYPE  != '2'
ORDER BY A.AER_NO, A.COMPOSITE_VERSION_NO DESC , A.AER_ID, E.AUTO_RANK, B.AUTO_RANK 
Tables                       P.KEY
AER                          AER_ID
AER_PRODUCT                  AER_ID,SEQ_PRODUCT
AER_APPROVAL_NO             
                      AER_ID,SEQ_PRODUCT,SEQ_APPROVAL_NUMBER
i WANT TO ELIMINATE THE CORRELATED SUBQRY 
PLEASE HELP 
 
 
A reader, September 04, 2003 - 6:39 pm UTC
 
 
Tom,
Can you please take a look and advice on this query
select     map, 
    todmdunit, 
    todmdgroup, 
    todfuloc, 
    tomodel,         
        fromdmdunit, 
    fromdmdgroup, 
    fromdfuloc, 
    frommodel, 
    eff 
from    stsc.dfumap a 
where      scen = 0 
and        not exists 
    (    select * 
        from   stsc.dfu b                    
        where  scen = 0                    
        and    a.fromdmdunit = b.dmdunit                    
        and    a.fromdmdgroup = b.dmdgroup
                and    a.fromdfuloc = b.loc
                and    a.frommodel = b.model
    )
/
no rows selected
Elapsed: 00:00:30.29
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=59587 Bytes=3634807)
   1    0   FILTER
   2    1     INDEX (FAST FULL SCAN) OF 'DFUMAP1' (NON-UNIQUE) (Cost=2 Card=59587 Bytes=36
          34807)
   3    1     INDEX (UNIQUE SCAN) OF 'DFU1' (UNIQUE) (Cost=2 Card=1 Bytes=23)
Statistics
----------------------------------------------------------
          0  recursive calls
         50  db block gets
    3590173  consistent gets
          0  physical reads
          0  redo size
        686  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
SQL> desc dfu
 Name                                                                    Null?    Type
 ----------------------------------------------------------------------- -------- ----------------------------
 DMDUNIT                                                                 NOT NULL VARCHAR2(30)
 DMDGROUP                                                                NOT NULL VARCHAR2(10)
 LOC                                                                     NOT NULL VARCHAR2(10)
 HISTSTART                                                               NOT NULL DATE
 EFF                                                                     NOT NULL DATE
 DISC                                                                    NOT NULL DATE
 FCSTHOR                                                                 NOT NULL NUMBER(38)
 DMDCAL                                                                  NOT NULL VARCHAR2(18)
 DMDPOSTDATE                                                             NOT NULL DATE
 MODELDATE                                                               NOT NULL DATE
 STATMSE                                                                 NOT NULL FLOAT(126)
 MAXHIST                                                                 NOT NULL NUMBER(38)
 TOTFCSTLOCK                                                             NOT NULL NUMBER(38)
 LOCKDUR                                                                 NOT NULL NUMBER(38)
 REFITDATE                                                               NOT NULL DATE
 MASK                                                                    NOT NULL VARCHAR2(18)
 SCEN                                                                    NOT NULL NUMBER(38)
 SCENBITS                                                                NOT NULL NUMBER(38)
 MAPUSED                                                                 NOT NULL VARCHAR2(18)
 NETFCSTMSE                                                              NOT NULL FLOAT(126)
 NETFCSTMSESMCONST                                                       NOT NULL FLOAT(126)
 NETFCSTERROR                                                            NOT NULL FLOAT(126)
 NEGFCSTSW                                                               NOT NULL NUMBER(38)
 AUTOADJFACTOR                                                           NOT NULL FLOAT(126)
 MODEL                                                                   NOT NULL VARCHAR2(18)
 AUTOADJMIN                                                              NOT NULL FLOAT(126)
 AUTOADJMAX                                                              NOT NULL FLOAT(126)
 PUBLISHDATE                                                             NOT NULL DATE
 UDC_DFU_HIERARCHY                                                                VARCHAR2(50)
 UDC_MODELNUM                                                            NOT NULL NUMBER(38)
 UDC_ABCINDICATOR                                                                 VARCHAR2(50)
 UDC_DELETERECSW                                                         NOT NULL NUMBER(38)
 UDC_IMPORTSW                                                                     NUMBER(38)
 UDC_NEWAUTOGENSW                                                        NOT NULL NUMBER(38)
SQL>desc dfumap
 Name                                                                    Null?    Type
 ----------------------------------------------------------------------- -------- ----------------------------
 MAP                                                                     NOT NULL VARCHAR2(18)
 TODMDUNIT                                                               NOT NULL VARCHAR2(30)
 TODMDGROUP                                                              NOT NULL VARCHAR2(10)
 TODFULOC                                                                NOT NULL VARCHAR2(10)
 FROMDMDUNIT                                                             NOT NULL VARCHAR2(30)
 FROMDMDGROUP                                                            NOT NULL VARCHAR2(10)
 FROMDFULOC                                                              NOT NULL VARCHAR2(10)
 EFF                                                                     NOT NULL DATE
 DISC                                                                    NOT NULL DATE
 FACTOR                                                                  NOT NULL FLOAT(126)
 LEVELNUM                                                                NOT NULL NUMBER(38)
 FCSTTYPE                                                                NOT NULL NUMBER(38)
 HISTTYPE                                                                NOT NULL NUMBER(38)
 SCEN                                                                    NOT NULL NUMBER(38)
 SCENBITS                                                                NOT NULL NUMBER(38)
 CONVFACTOR                                                              NOT NULL FLOAT(126)
 LOCKSTART                                                               NOT NULL DATE
 LOCKEND                                                                 NOT NULL DATE
 TOMODEL                                                                 NOT NULL VARCHAR2(18)
 FROMMODEL                                                               NOT NULL VARCHAR2(18)
 UDC_NEWAUTOGENSW                                                        NOT NULL NUMBER(38)
SQL>select count(*) from dfu;
 COUNT(*)
----------
    770946
SQL>select count(*) from dfumap;
  COUNT(*)
----------
   1191740
parameters for 
optimizer_index_caching              integer 80
optimizer_index_cost_adj             integer 25 
 
 
September 05, 2003 - 3:33 pm UTC 
 
try using NOT IN instead of EXISTS
from    stsc.dfumap a 
where      scen = 0 
and    (fromdmdunit, ... ) not in 
    (    select b.dmdunit, b.dmdgroup, b.loc, b.model
        from   stsc.dfu b                    
        where  scen = 0                    
    ) 
 
 
 
Sometimes not exists does not work!!!
Aldo bravo, October   06, 2003 - 4:31 pm UTC
 
 
Hi Tom,
See this example:
SQL> drop table t1; 
drop table t1
           *
ERROR at line 1:
ORA-00942: table or view does not exist
SQL> 
SQL> create table t1 (f1 varchar(10), 
  2  f2 varchar(10), 
  3  f3 varchar(10)); 
Table created.
SQL> 
SQL> insert into t1 values ('1','5','3'); 
1 row created.
SQL> insert into t1 values ('4','4','6'); 
1 row created.
SQL> commit; 
Commit complete.
SQL> SELECT f1,f2,f3 
  2  FROM t1 a 
  3  WHERE a.f2 <= '5' 
  4  AND NOT EXISTS (SELECT '1' 
  5  FROM t1 b 
  6  WHERE b.f2 <= '5' 
  7  AND b.f2 > a.f2); 
F1         F2         F3
---------- ---------- ----------
1          5          3
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> SELECT f1,f2,f3 
  2  FROM t1 a 
  3  WHERE a.f2 <= '5' 
  4  AND NOT EXISTS (SELECT '1' 
  5  FROM t1 b 
  6  WHERE b.f2 <= '5' 
  7  AND b.f2 > a.f2); 
no rows selected
What can you say about it?
If I delete the stats it works:
SQL> exec dbms_stats.delete_table_stats(tabname=> 'T1',ownname=>'ESCMANAGER');
PL/SQL procedure successfully completed.
SQL> /
F1         F2         F3
---------- ---------- ----------
1          5          3
It is a 9.2.0.4 database on Windows 2003.
Thanks, 
 
 
October   06, 2003 - 6:22 pm UTC 
 
first -- thank you thank you thank you for such a nice simple testcase.  bravo
Second, I filed Bug No: 3178622 on this -- obviously, a problem. 
 
 
 
implicit cast?
freek, October   07, 2003 - 8:04 am UTC
 
 
Tom,
Could the problem above be explained by an implicit cast?
Where the optimizer is after the analyze aware of the fact the the datatype of the column is varchar2 and does not convert the literal to a number?
freek@DEV01> select (case when to_char(x) > to_char(y) then 'X bigger' else 'Y bigger' end)
  2  from ( select 2 x, 10 y
  3         from dual
  4       );
(CASEWHE
--------
X bigger
Elapsed: 00:00:00.00
if you modify the example to use different values, you get following result
freek@DEV01> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE    9.2.0.1.0    Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
Elapsed: 00:00:00.00
freek@DEV01> drop table t;
Table dropped.
Elapsed: 00:00:00.00
freek@DEV01> create table t1
  2  ( f1 varchar(10),
  3    f2 varchar(10),
  4    f3 varchar(10)
  5  );
Table created.
Elapsed: 00:00:00.00
freek@DEV01> insert into t1 values ('1','2','3');
1 row created.
Elapsed: 00:00:00.00
freek@DEV01> insert into t1 values ('4','10','6');
1 row created.
Elapsed: 00:00:00.00
freek@DEV01> commit;
Commit complete.
Elapsed: 00:00:00.00
freek@DEV01> select f1,f2,f3
  2  from t1 a
  3  where a.f2 <= '20'
  4  and not exists ( select '1'
  5                   from t1 b
  6                   where b.f2 <= '20'
  7                         and b.f2 > a.f2
  8                 );
F1         F2         F3
---------- ---------- ----------
1          2          3
Elapsed: 00:00:00.00
freek@DEV01> analyze table t1 compute statistics;
Table analyzed.
Elapsed: 00:00:00.00
freek@DEV01> select f1,f2,f3
  2  from t1 a
  3  where a.f2 <= '20'
  4        and not exists ( select '1'
  5                         from t1 b
  6                         where b.f2 <= '20'
  7                         and b.f2 > a.f2
  8                       );
F1         F2         F3
---------- ---------- ----------
1          2          3
Elapsed: 00:00:00.00
now if you change the query to use an to_number
freek@DEV01> select f1,f2,f3
  2  from t1 a
  3  where to_number(a.f2) <= 20
  4  and not exists ( select 1
  5                   from t1 b
  6                   where to_number(b.f2) <= 20
  7                         and to_number(b.f2) > to_number(a.f2)
  8                 );
F1         F2         F3
---------- ---------- ----------
4          10         6
 
 
October   07, 2003 - 8:36 am UTC 
 
nope, it is a bug - if the SAME query against the SAME data returns DIFFERENT answers based solely on the plan developed -- that is a bug. 
 
 
 
RE: sometimes not exists does not work...
Mark A. Williams, October   07, 2003 - 5:35 pm UTC
 
 
Tom,
Is there some environmental/config setting that could make a difference?  I have 9.2.0.4.0 on a WinXP laptop and can not duplicate...  Is this behavior specific to Windows Server 2003?  I could not duplicate on 9.2.0.4.0 on Solaris, AIX, or Linux as well...  (I am going to install Windows Server 2003 and test on that tomorrow)
SQL> select * from v$version;
BANNER
----------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for 32-bit Windows: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
5 rows selected.
SQL> create table t1
  2  (
  3    f1 varchar(10),
  4    f2 varchar(10),
  5    f3 varchar(10)
  6  );
Table created.
SQL> insert into t1 values ('1','5','3');
1 row created.
SQL> insert into t1 values ('4','4','6');
1 row created.
SQL> commit;
Commit complete.
SQL> SELECT   f1,
  2           f2,
  3           f3
  4  FROM     t1 a
  5  WHERE    a.f2 <= '5'
  6  AND NOT EXISTS (SELECT   '1'
  7                  FROM     t1 b
  8                  WHERE    b.f2 <= '5'
  9                  AND      b.f2 > a.f2);
F1         F2         F3
---------- ---------- ----------
1          5          3
1 row selected.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> SELECT   f1,
  2           f2,
  3           f3
  4  FROM     t1 a
  5  WHERE    a.f2 <= '5'
  6  AND NOT EXISTS (SELECT   '1'
  7                  FROM     t1 b
  8                  WHERE    b.f2 <= '5'
  9                  AND      b.f2 > a.f2);
F1         F2         F3
---------- ---------- ----------
1          5          3
1 row selected.
SQL>
Thanks,
Mark 
 
 
October   07, 2003 - 6:18 pm UTC 
 
absolutely -- any non-default optimizer related parameters will affect this.  It's all about the plan that was generated and many things will affect that. 
 
 
 
I can replicate on Windows 2000 as well
Martin, October   08, 2003 - 2:52 am UTC
 
 
  
 
Workaround
Aldo Bravo, October   08, 2003 - 4:16 pm UTC
 
 
If you use /*+ nounnest +/ hint or if you set the parameter _unnest_subquery=false in the spfile the problem is solved.
There are many bugs related to this issue (2739068-2700474-3085399-2829723) and in one of them I found the workaround.
Thanks!! 
 
 
rewriting "not in"
A reader, February  19, 2004 - 9:17 am UTC
 
 
I have seen a "not in" query written as:
select a.key
from t1 a,
     t2 b
where a.key = b.key(+) and
      b.key is null 
/
What do you think of this?
 
 
February  19, 2004 - 11:32 am UTC 
 
it is the "poor mans hash anti-join", a good idea in the olden days of the RBO, not so good in the days of a smarter CBO.
start with:
ops$tkyte@ORA920PC> create table t1 as select * from all_objects where rownum <= 5000;
 
Table created.
 
ops$tkyte@ORA920PC> create table t2 as select * from all_objects where rownum <= 4950;
 
Table created.
 
ops$tkyte@ORA920PC> create index t2_idx on t2(object_id);
 
Index created.
select count(*)
  from t1 rbo
 where object_id not in ( select object_id from t2 )
                                                                                                                  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      8.63       8.53     129066     197295          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      8.64       8.53     129066     197295          0           1
select count(*)
  from t1 rbo
 where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id )
                                                                                                                  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.08       0.13         83      10075          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.09       0.13         83      10075          0           1
select count(*)
  from t1, t2 rbo
 where t1.object_id = rbo.object_id(+)
   and rbo.object_id IS NULL
                                                                                                                  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.08       0.11         72       5087          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.09       0.11         72       5087          0           1
<b>so, it looks "good" right?  better than the alternatives in RBO, yes.  but, analyze and</b>
select count(*)
  from t1 cbo
 where object_id not in ( select object_id from t2 )
                                                                                                                  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.04          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.06       0.08          0         91          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.06       0.12          0         91          0           1
<b>hmmm, rbo not so good -- "user written hash anti join under rbo" not so good...
the cbo does OK though with the user written one:</b>
select count(*)
  from t1, t2 cbo
 where t1.object_id = cbo.object_id(+)
   and cbo.object_id IS NULL
                                                                                                                  
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.08       0.10          0         91          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.08       0.11          0         91          0           1
<b>but I find the not in to be infinitely easier to code and alot more "informational" (clear,easy to understand)</b>
 
 
 
 
 
Union in NOT EXISTS
Ishaque Hussain, February  19, 2004 - 10:46 pm UTC
 
 
Hi Tom, I don't have any statistics to provide you but could you look at query A and query B and advise which would be more efficient to write.
Query A)
select *
from contr_prod_price cpp,
where cpp.contr in (
    select contr
    from contr
    where co = 'E')
and cpp.prod_cd = 'PX'
and not exists
    (select 'x'
     from contr_prod_prc_low_lvl_ind cpli
     where cpp.contr       = cpli.contr
     and   cpp.contr_frmla = cpli.contr_frmla
     and   cpp.contr_lvl   = cpli.contr_lvl 
     union
     select 'x'
     from contr_prod_prc_low_lvl_dir cpld
     where cpp.contr       = cpld.contr
     and   cpp.contr_frmla = cpld.contr_frmla
     and   cpp.contr_lvl   = cpld.contr_lvl )
Query B)
select *
from contr_prod_price cpp,
where cpp.contr in (
    select contr
    from contr
    where co = 'E')
and cpp.prod_cd = 'PX'
and not exists
    (select 'x'
     from contr_prod_prc_low_lvl_ind cpli
     where cpp.contr       = cpli.contr
     and   cpp.contr_frmla = cpli.contr_frmla
     and   cpp.contr_lvl   = cpli.contr_lvl ) 
union
select *
from contr_prod_price cpp
where cpp.contr in (
     select contr
     from contr
     where co = 'E')
and  cpp.prod_cd = 'PX'
and  not exists
       (select 'x'
     from contr_prod_prc_low_lvl_dir cpld
     where cpp.contr       = cpld.contr
     and   cpp.contr_frmla = cpld.contr_frmla
     and   cpp.contr_lvl   = cpld.contr_lvl )
Thanks 
 
February  20, 2004 - 7:22 am UTC 
 
well, given that they do not return the same results "in general" (the 2cnd will do a distinct on the result set)....
and that I might really consider a NOT IN....
I would probably go with the first, but I would benchmark them (since I had both) 
 
 
 
INTERSECT , MINUS
Arangaperumal, April     28, 2004 - 7:57 am UTC
 
 
Hi Tom,
1.EXISTS plays same role as INTERSECT
2.NOT EXISTS plays same role as MINUS.
am i right? 
So, the same rule can be applied?
 
 
 
NOT IN Query
Reader, May       25, 2004 - 1:19 am UTC
 
 
Hi Tom,
The following are 2 tables,(external_documents and external_document_links).They are linked by external_documents.NAME and external_document_links.EXTERNAL_DOCUMENT_NAME columns.
SQL> select * from external_documents --(Primary Key: NAME)
NAME                                    
----------------------
corp-bro2.jpg                
cvs_structure.doc            
corp-bro.jpg                 
Raid.pdf                     
AutomaticUndoInternals.pdf   
5 rows selected
-------------------------------------------------------------
SQL> select * from external_document_links; --(All the 4 columns are part of the primary key)
FORM       DOCUMENT_NAME  DOCUMENT_VERSION EXTERNAL_DOCUMENT_NAME
---------- ----------------------- -----------------------------
form_1     document_1          1         corp-bro2.jpg
form_1     document_1          1         Raid.pdf
form_2     document_2          1  AutomaticUndoInternals.pdf
form_2     document_1          1         Raid.pdf
PO         po_4000542_2_69     1         Raid.pdf
PO         po_4000542_2_69     1  AutomaticUndoInternals.pdf
form_3     document_1          1         Raid.pdf
form_1     document_2          1         corp-bro.jpg
PO         po_4000542_2_69     1         corp-bro.jpg
PO         po_4000542_2_69     1         corp-bro2.jpg
PO         po_4000542_2_69     1         cvs_structure.doc
11 rows selected.
-------------------------------------------------------------
The input to the query will be external_document_links.form,external_document_links.document_name and external_document_links.document_version. Based on this criteria I need to retrive documents linked to the above 3 columns with a flag as 'Y' and rest of the documents from external_documents table with the flag as 'N'
I have written a query which goes like this :-
SQL> select a.name NAME,'Y' PRESENT from external_documents a,external_document_links b 
where a.name=b.EXTERNAL_DOCUMENT_NAME AND b.form ='form_1' and b.DOCUMENT_VERSION =1 and b.DOCUMENT_NAME='document_1'  
union 
select name ,'N' from external_documents where name
not in (select a.name NAME from external_documents a,external_document_links b 
where a.name=b.EXTERNAL_DOCUMENT_NAME AND b.form ='form_1'and b.DOCUMENT_VERSION =1 and b.DOCUMENT_NAME='document_1')
So I get the right output as follows :- 
--------------------------------------------------------------------------------
NAME                           P
------------------------------ -
AutomaticUndoInternals.pdf     N
Raid.pdf                       Y
corp-bro.jpg                   N
corp-bro2.jpg                  Y
cvs_structure.doc              N
5 rows selected
 
----------------------------------------------------------------
 Can you please tell me if there is any other way in which this query can be re-written, i.e without using the 'NOT IN' clause.
 
 
 
May       25, 2004 - 7:12 am UTC 
 
sigh, no table creates.  no insert intos.  no easy way to play with the data.
seems you want every row from external_documents, and if that document has a mate in external_document_links -- show Y else show N
select name, nvl( (select 'Y'
                     from external_document_links A
                    where a.name = b.external_document_name
                      and b.form = :bind1
                      and b.document_version = :bind2
                      and b.document_name = :bind3 ), 'N' ) present
  from external_documents;
probably works, but I couldn't really run it or anything. 
 
 
 
NOT IN Query  
Reader, May       26, 2004 - 1:00 am UTC
 
 
Hi Tom
Thanks a TON, Sorry I did not provide the necesssary scripts for table and data creation.
You are the greatest. 
 
 
NOT IN for REMOTE site
Erik, May       27, 2004 - 5:27 am UTC
 
 
Hello Tom,
 I would like to ask you for one our bad sql which uses "not in" for remote site. For some sites we have slow network and the package takes about 10min. (it transfers about 100rows, runs once per hour). Do you have any suggestion for us how can we improve performance of the bad sql?
Thank you.
SET AUTOTRACE TRACEONLY
-- insert into monitor.promon@tesa
  select
    PROINUM1, 
    PROINUM2, 
    PROAPPL, 
    --lc_host,
    PROSTAT, 
    upper(PROPROG), 
    PROMESS, 
    PRODATE, 
    upper(PROUSER), 
    PROTYPE 
  from 
   r4.promon
  where 
   prodate > sysdate-1/24 and --ld_prodate and
   prodate < sysdate and --ld_currentDate and
   UPPER(proprog) not in (select UPPER(batch) from monitor.promon_badbatch@tesa);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2069 Card=1 Bytes=100)
   1    0   FILTER
   2    1     TABLE ACCESS (FULL) OF 'PROMON' (Cost=2069 Card=1 Bytes=100)
   3    1     REMOTE* (Cost=1 Card=1 Bytes=17) TESA.WORLD
   3 SERIAL_FROM_REMOTE   SELECT "BATCH" FROM "MONITOR"."PROMON_BADBATCH" "PROMON_BADBATCH" 
                          WHERE LNNVL(UPPER(:1)<>UPPER("BATCH"))
 
 
May       27, 2004 - 9:27 am UTC 
 
might try
from promon a, (select distinct upper(batch) batch 
                  from monitor.promon_badbatch@tesa ) b
where upper(a.proprog) = b.batch(+)
  and b.batch is null
  and .....
(and consider removing the need to upper things like that all of the time by standardizing on upper or lower case for those fields....) 
 
 
 
NOT IN for REMOTE site  
Erik, May       31, 2004 - 12:40 pm UTC
 
 
Tom, thank you very much. 
But, could you explain me please, why does my "NOT IN" is processed as correlated subquery? I thought that it should not be correlated.
 
 
May       31, 2004 - 2:11 pm UTC 
 
why did you think that?  it is all just syntax, the optimizer can and does do whatever it wants under the covers. 
 
 
 
Intreasting
Riaz Shahid, June      01, 2004 - 3:10 am UTC
 
 
Thats a pretty intresting thread. Tom would you please through some light on query:
select count(*)
from t1, t2 cbo
where t1.object_id = cbo.object_id(+)
and cbo.object_id IS NULL
We are (outer) joing the two tables on the basis of object_id and on the other hand we are saying "cbo.object_id IS NULL".
I can't understand how it will be executed. Can you please tell us in psuedo code ??? 
 
June      01, 2004 - 8:34 am UTC 
 
if you understand an outer join -- basically every row in T1 will be in the result set, regardless of whether there was a 'mate' in t2 to be joined to.
So, t1 will be retrieved from, and if there is a row(s) in t2 for that object_id -- the join will happen (and t2.object_id will be NOT NULL of course since we did the join).  Else, the column values for t2 will be set to NULL and the row from t2 will be output.  consider:
ops$tkyte@ORA9IR2> create table t1 ( x int );
 
Table created.
 
ops$tkyte@ORA9IR2> create table t2 ( x int );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t1 values(1);
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t1 values(2);
 
1 row created.
 
ops$tkyte@ORA9IR2> insert into t2 values(1);
 
1 row created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from t1,t2 where t1.x = t2.x(+);
 
         X          X
---------- ----------
         1          1
         2
<b>the outer join says "all rows in t1 will be output, no matter what".  So, all we need to do is keep the rows where t2.x (cbo.object_id) is NULL -- those are the rows in t1 such that there was no row in T2 to join to -- just like a not in</b>
 
 
 
 
 
Excellent Explanation
Riaz Shahid, June      01, 2004 - 9:37 am UTC
 
 
I got it.
Love the way you described it...
 
 
 
NOT IN for REMOTE site
Erik, June      01, 2004 - 11:20 am UTC
 
 
I thought that this should not be processed as correlated:
... UPPER(proprog) not in (select UPPER(batch) from monitor.promon_badbatch@tesa);
But the excecution plan showed (I think that :1 = promon.proprog):
SELECT "BATCH" FROM "MONITOR"."PROMON_BADBATCH" "PROMON_BADBATCH" WHERE LNNVL(UPPER(:1)<>UPPER("BATCH"))
I did little test:
create table t1 (id number); create table t2 (id number);
insert into t1 values (1); insert into t1 values (2);
insert into t2 values (2);
analyze table t1 compute statistics;
analyze table t2 compute statistics;
I)
select * from t1 where id not in (select id from t2);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6)
   1    0   FILTER
   2    1     TABLE ACCESS (FULL) OF 'T1'
   3    1     TABLE ACCESS (FULL) OF 'T2'
II)
select * from t1 where id not in (select id from t2@ee);
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6)
   1    0   FILTER
   2    1     TABLE ACCESS (FULL) OF 'T1' (Cost=3)
   3    1     REMOTE* (Cost=3) EE.WORLD
   3 SERIAL_FROM_REMOTE SELECT "ID" FROM "T2" "T2" WHERE LNNVL("ID"<>:1)
1) Is the first SQL processed same as the second - but only explain plan did not show it?
2) If not - why the 2nd is processed as correlated (I think that :1 = t1.id)?
Thank you.
 
 
June      01, 2004 - 2:59 pm UTC 
 
 no it is not.  you are missing a NOT NULL on there (on t2.id) (would make it better).
different queries, different plans. 
 
 
 
Into the Wayback Machine (wayback up the page)
Duke Ganote, September 04, 2004 - 12:02 pm UTC
 
 
Your two basic queries for comparison were:
SQL> select count(*) from emp where empno not in ( select mgr from emp );
        COUNT(*)
----------------
               0
SQL> select count(*) from emp T1
  2   where not exists ( select null from emp T2 where t2.mgr = t1.empno );
        COUNT(*)
----------------
               8
The first fails because of 3-value logic; it's easy to forget that NULL != NULL.  The query might as well be:
SQL> select count(*) from emp where empno not in ( select null from dual );
        COUNT(*)
----------------
               0
The first query could be rectified to account for 3-value logic; for example:
SQL> select count(*) from emp where to_char(empno)
  2  NOT IN ( select NVL(to_char(mgr),'null') from emp );
        COUNT(*)
----------------
               8
Is that correct?  And the reason anti-joins can be efficient if there are no NULLs is because a standard index doesn't include NULL values? 
 
 
September 04, 2004 - 1:04 pm UTC 
 
it would be more correct to code:
not in ( select mgr from emp where mgr is not null )
converting a number to a string would not be a good thing and it is more "self explanatory" what is happening with the "is not null"
It is not a direct cause/effect. create index on emp(mgr,empno) would have an entry for each and every row since EMPNO is not null -- so it is not correct to say that indexes do not include null values -- starndard b*tree indexes do not include ENTIRELY null key entrys but in general they have nulls. 
 
 
 
Into the Halfwayback Machine...
Duke Ganote, September 05, 2004 - 11:37 pm UTC
 
 
Looking at the "poor mans hash anti-join", there were 3 alternatives evaluated:
select count(*) from t1 rbo
   where object_id not in ( select object_id from t2 );
select count(*) from t1 rbo
   where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id );
select count(*) from t1, t2 rbo
   where t1.object_id = rbo.object_id(+) and rbo.object_id IS NULL;
Yet none seemed "set oriented".  I tried another query that, to me, seems clearer:
select count(*) from t1 rbo,
   ( select object_id from t1 MINUS select object_id from t2 ) SET_OP
   where SET_OP.object_id = RBO.object_id
/
Under RBO, it won with only 225 LIOs (compared with 197295, 10075, and 5087 LIOs):
select count(*) from t1 rbo,
( select object_id from t1 MINUS select object_id from t2 ) SET_OP
where SET_OP.object_id = RBO.object_id
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.06          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.04       0.04          0        225          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.04       0.10          0        225          0           1
However, under the CBO, it came in 3rd with 166 LIOs (more than 91 for 2 others).
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.03       0.03          0        166          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.03       0.03          0        166          0           1
I just find it interesting that logically-equivalent queries have such different implementations.
 
 
September 06, 2004 - 10:05 am UTC 
 
select count(*) from t1 rbo
   where object_id not in ( select object_id from t2 );
isn't set oriented?  actually -- they are all set oriented, no procedural code, no ordered array semantics (as you get with analytics)
you have "give me everything in this set that isn't in this other set", vs "get me everything in this set such that is doesn't exist in the other set", vs "outer join set A to set B and keep rows such that .... is true"
that last one you added really seems "cleaner" than
select count(*) from t1 rbo
   where object_id not in ( select object_id from t2 );
does?!?!  really?
 
 
 
 
in & exists
A reader, September 06, 2004 - 11:41 am UTC
 
 
hi
in one of your first replies you mentioned this
+++++++++++++++++++++++++
So, the predicate
where x not in ( NULL )
evaluates to neither TRUE, nor FALSE
How about 
where x in ( NULL )
that evaluates to neither true or false right?
So isnt IN and EXISTS different as well (like NOT IN and NOT EXISTS)?
 
 
September 06, 2004 - 11:59 am UTC 
 
where in and where exists work the same.
where not in and where not exists work differently.
NULL does not affect the behaviour of "in ( set with nulls )" whereas NULL does affect the behaviour of "not in ( set with nulls )"
 
 
 
 
Clean is in the eyes of the beholder...
Duke Ganote, September 06, 2004 - 11:34 pm UTC
 
 
Truly all examples were non-procedural, but I think of set operations in terms of Venn diagrams and operations like MINUS, UNION, and INTERSECT.  To my taste, the cleanest query for a simple count is:
select count(*) from
( select object_id from t1 MINUS select object_id from t2 ) RBO_SET
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.05       0.18        143        150          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.06       0.20        143        150          0           1
which has a scant 150 LIOs under the RBO, and an equal-to-the-best 91 LIOs under CBO:
select count(*) from
( select object_id from t1 MINUS select object_id from t2 ) CBO_SET
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.02       0.02          0         91          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.02       0.04          0         91          0           1
The other 3 queries could easily obtain information from T1 by replacing the "count(*)", so I augmented the simple MINUS with a join back to T1 in my previous query.
BTW, I appreciate your new "LINKS I LIKE" tab!  Thank you. 
 
 
Strange behaviour of IN with OR
Mike, November  10, 2004 - 8:48 am UTC
 
 
I have the most bizarre CBO'ness (for what of a better word!) that I can not fathom.  Below are the tear down and set up needed.
select * from v$banner
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for HPUX: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
--
-- Clear down
--
drop table HAS_PERMISSION_UNIT_TEST
/
drop table PLT_NODE_PERMISSIONS
/
drop table PLT_SECURITY_NODES
/
drop table PLT_USER_DELEGATES
/
drop table PLT_XDI_PERMISSIONS
/
CREATE TABLE PLT_SECURITY_NODES
( NODE_ID                 VARCHAR2(16)     NOT NULL,
  PARENT_NODE_ID          VARCHAR2(16),
  IS_DELETED              NUMBER                NOT NULL,
  IS_SYSTEM               NUMBER                NOT NULL,
  LAST_MODIFIED_DATETIME  DATE,
  LAST_MODIFIED_USER_ID   VARCHAR2(32)
)
/
CREATE INDEX PSN_PSN_FK_I ON PLT_SECURITY_NODES
(PARENT_NODE_ID)
/
CREATE UNIQUE INDEX PSN_PK ON PLT_SECURITY_NODES
(NODE_ID)
/
CREATE INDEX NODE_AND_PARENT_NODE ON PLT_SECURITY_NODES
(NODE_ID, PARENT_NODE_ID)
/
ALTER TABLE PLT_SECURITY_NODES ADD (
  CONSTRAINT PSN_PK PRIMARY KEY (NODE_ID))
/
ALTER TABLE PLT_SECURITY_NODES ADD (
  CONSTRAINT PSN_PSN_FK FOREIGN KEY (PARENT_NODE_ID) 
    REFERENCES PLT_SECURITY_NODES (NODE_ID))
/
------------------------------------------------------------------------------------------------
CREATE TABLE PLT_NODE_PERMISSIONS
( NODE_ID                 VARCHAR2(32)     NOT NULL,
  USER_ID                 VARCHAR2(32)     NOT NULL,
  ROLE_ID                 VARCHAR2(32)     NOT NULL,
  IS_DELETED              NUMBER           NOT NULL,
  IS_SYSTEM               NUMBER           NOT NULL,
  LAST_MODIFIED_DATETIME  DATE,
  LAST_MODIFIED_USER_ID   VARCHAR2(32)
)
/
CREATE UNIQUE INDEX PNP_PK ON PLT_NODE_PERMISSIONS
(NODE_ID, USER_ID, ROLE_ID)
/
CREATE INDEX NODE_AND_USER ON PLT_NODE_PERMISSIONS
(NODE_ID, USER_ID)
/
CREATE INDEX PNP_PSN_FK_I ON PLT_NODE_PERMISSIONS
(NODE_ID)
/
CREATE INDEX PNP_PPR_FK_I ON PLT_NODE_PERMISSIONS
(ROLE_ID, USER_ID)
/
ALTER TABLE PLT_NODE_PERMISSIONS ADD (
  CONSTRAINT PNP_PK PRIMARY KEY (NODE_ID, USER_ID, ROLE_ID))
/
ALTER TABLE PLT_NODE_PERMISSIONS ADD (
  CONSTRAINT PNP_PSN_FK FOREIGN KEY (NODE_ID) 
    REFERENCES PLT_SECURITY_NODES (NODE_ID))
/
----------------------------------------------------------------------------
CREATE TABLE PLT_USER_DELEGATES
( USER_ID                 VARCHAR2(32)     NOT NULL,
  DELEGATE_USER_ID        VARCHAR2(32)     NOT NULL,
  EFFECTIVE_FROM          DATE             NOT NULL,
  EFFECTIVE_TO            DATE,
  IS_DELETED              NUMBER           NOT NULL,
  IS_SYSTEM               NUMBER           NOT NULL,
  LAST_MODIFIED_DATETIME  DATE,
  LAST_MODIFIED_USER_ID   VARCHAR2(32)
)
/
CREATE INDEX PUD_PP_FK_I ON PLT_USER_DELEGATES
(USER_ID)
/
CREATE INDEX PUD_PP_DELEGATE_FK_I ON PLT_USER_DELEGATES
(DELEGATE_USER_ID)
/
CREATE UNIQUE INDEX PUD_PK ON PLT_USER_DELEGATES
(USER_ID, DELEGATE_USER_ID, EFFECTIVE_FROM)
/
ALTER TABLE PLT_USER_DELEGATES ADD (
  CONSTRAINT PUD_PK PRIMARY KEY (USER_ID, DELEGATE_USER_ID, EFFECTIVE_FROM))
/
---------------------------------------------------------------------------------------
CREATE TABLE PLT_XDI_PERMISSIONS
(
  XDI                     VARCHAR2(255)    NOT NULL,
  USER_ID                 VARCHAR2(32)     NOT NULL,
  ROLE_ID                 VARCHAR2(16)     NOT NULL,
  IS_DELETED              NUMBER           NOT NULL,
  IS_SYSTEM               NUMBER           NOT NULL,
  LAST_MODIFIED_DATETIME  DATE,
  LAST_MODIFIED_USER_ID   VARCHAR2(32)
)
/
CREATE INDEX PXP_PPR_FK_I ON PLT_XDI_PERMISSIONS
(ROLE_ID, USER_ID)
/
CREATE UNIQUE INDEX PXP_PK ON PLT_XDI_PERMISSIONS
(XDI, USER_ID, ROLE_ID)
/
ALTER TABLE PLT_XDI_PERMISSIONS ADD (
  CONSTRAINT PXP_PK PRIMARY KEY (XDI, USER_ID, ROLE_ID))
/
begin
  --
  -- Set up 4 delegates for this user to have access via
  --
  for idx in 1..4
  loop
    insert into plt_user_delegates
    ( user_id, delegate_user_id, effective_from, effective_to, is_deleted, is_system, last_modified_datetime, last_modified_user_id)
    values
    ('MJONES', 'MJONES_DELEGATE_'||idx, sysdate -100, sysdate + 100, 0, 0, sysdate, 'MJONES');
    --
  end loop;
  --
  --  Loop 500 times to create some realistic volume of data that this user has
  --
  for idx in 1..500
  loop
    insert into plt_security_nodes
    ( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id ) 
    values 
    ('NODE_'||idx||'_TOP_LVL', null, 0, 1,  sysdate, 'MJONES');
    -- 
    insert into plt_security_nodes 
    ( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id ) 
    values 
    ('NODE_'||idx||'_LEVEL_2', 'NODE_'||idx||'_TOP_LVL', 0, 1,  sysdate, 'MJONES');
    -- 
    insert into plt_security_nodes 
    ( node_id, parent_node_id, is_deleted, is_system, last_modified_datetime, last_modified_user_id ) 
    values 
    ('NODE_'||idx||'_LVL_3', 'NODE_'||idx||'_LEVEL_2', 0, 1,  sysdate, 'MJONES'); 
    --
    --  Create permission into Node
    --
    insert into plt_node_permissions ( node_id, user_id, role_id, is_deleted, is_system ) values ( 
    'NODE_'||idx||'_LVL_3', 'MJONES', 'UNIT_TEST', 0, 0); 
  end loop;
  --
  -- Now wedge up some XDI Permissions.  Have a total of 100 XDI's
  --
  for idx in 1..100
  loop
    insert into plt_xdi_permissions 
    ( xdi, user_id, role_id, is_deleted, is_system) 
    values 
    ( idx||'_JONES_THE_XDI', 'MJONES', 'UNIT_TEST', 0, 0); 
  end loop;
end;
/
analyze table plt_node_permissions compute statistics for table for all indexes for all indexed columns
/
analyze table plt_security_nodes compute statistics for table for all indexes for all indexed columns
/
analyze table plt_user_delegates compute statistics for table for all indexes for all indexed columns
/
analyze table plt_xdi_permissions compute statistics for table for all indexes for all indexed columns
/
--
-- Now build up a table of 100,000 rows to base all this malarkey on...  Set the access up to start with so that we have access too all 100,000 records in bunches of 200 (500 access records we have access to)
--
create table has_permission_unit_test as
select
  to_char(rownum) primary_key,
  owner, object_name, subobject_name, object_id, data_object_id, object_type, created, last_ddl_time, timestamp, status, temporary, generated, secondary, 
'NODE_'||(mod(rownum,200)+1)||'_LVL_3' security_node_id,
--decode(mod(rownum,1000),0, mod(rownum,1000)||'JONES_THE_XDI',rownum||'_XDI_'||object_name) xdi,
rownum||'_XDI_'||object_name xdi, 
0 is_deleted, 
sysdate last_modified_datetime, 
'BOB' last_modified_user_id
from
( select * from all_objects
  union all
  select * from all_objects
  union all
  select * from all_objects
  union all
  select * from all_objects
  union all
  select * from all_objects
)
where rownum <= 100000-- 100K
/
--
alter table has_permission_unit_test modify primary_key varchar2(32) 
/
--
alter table has_permission_unit_test add primary key (primary_key) 
/
--
analyze table has_permission_unit_test compute statistics for table for all indexes for all indexed columns
/
--
-- END OF SET UP
--
The query that behaves strangley is...
    select count(*)
    from has_permission_unit_test main
    where main.primary_key <= :i_volume
    and 
    ( 
      main.security_node_id in
      ( select /*use_hash(sn) index_ffs( sn node_and_parent_node ) use_hash(np) index_ffs(np node_and_user)*/ np.node_id
        from plt_security_nodes sn, plt_node_permissions np
        where sn.node_id = np.node_id
        and   np.user_id in
        ( select user_id
          from   plt_user_delegates ud_first
          where ud_first.delegate_user_id = 'MJONES'
          and   sysdate between ud_first.effective_from and ud_first.effective_to
          and   ud_first.is_deleted = 0
          union all
          select 'MJONES' user_id
          from dual dual_first
        )
        connect by prior sn.parent_node_id =  sn.node_id
      )
      -- COMMENT OUT FROM HERE
      or
      ( main.xdi in   
        ( select xi.xdi
          from plt_xdi_permissions xi
          where xi.user_id in
          ( select user_id
            from   plt_user_delegates ud_second
            where ud_second.delegate_user_id = 'MJONES'
            and   sysdate between ud_second.effective_from and ud_second.effective_to
            and   ud_second.is_deleted = 0
            union all
            select 'MJONES' user_id
            from dual dual_second
          )
        )
      )
      -- DOWN TO HERE
    )
Running as is with :i_volume = 1000 takes around 4 seconds, and the matches are being done with the security nodes (I beleive), but if you comment out the bottom bit the plan radically changes to start FFS with a couple of indexes I can not see why the "or" changes this.  Even if I change this to "or 1 in (select null from dual where 1=0) it still won't FFS.  I tried hinting (the + is removed so it won't pick it up to no avail.  Why has the performance so dropped off when I stick any OR in?
Thanks in advance, and hope this is enough info.
Mike. 
 
November  10, 2004 - 12:05 pm UTC 
 
sorry, this is just "not in scope" here and way too "big" for a review -- I have to answer these things relatively "quick" -- they are in a fifo queue, i either answer it or skip it.  I don't go back. 
this is a "question" for when I'm taking them. 
 
 
 
Fair Enough!
A reader, November  10, 2004 - 12:59 pm UTC
 
 
I'll await a question time... 
 
 
Reducing time
Yogesh, December  04, 2004 - 1:00 pm UTC
 
 
I've 2 tables cust(5G) and archive(1G), I want to create third table which will have cust - archive data so I used following qry 
create table newcust 
nologging 
as 
select * from cust where c1 not in 
(select c1 from archive)
parallel(Degree 8)
/
Cust table have index on c1 column ... archive table do not have any indexes. 
This query was running for  7 hours .. I killed this qry as it was taking almost all resources. 
Can you please suggest some alternative to speed up this query?  
 
December  04, 2004 - 1:46 pm UTC 
 
make sure to use cbo, make sure c1 is NOT NULLable or you say "where c1 is not null" in the subquery.
not sure that parallel would be necessary - big juicy hash anti-join is what I'd be looking for. 
 
 
 
Null
Yogesh, December  06, 2004 - 4:58 am UTC
 
 
Can you please explain part? 
make sure c1 is null or you say "where c1 is not null" in 
the subquery.
should it be
"make sure c1 is not null or you say "where c1 is not null" in the subquery."?
One more thing to add. C1 is not null in both the tables.  In cust table it is PK and archive table is extract of cust. 
 
 
December  06, 2004 - 11:43 am UTC 
 
thanks, i corrected that -- it should be "make sure c1 is NOT NULL'able" 
 
 
 
NULL
Yogesh, December  06, 2004 - 6:57 am UTC
 
 
I guess in the above answer you was relating NULL and NOT NULL to /*+ HASH_AJ */. 
There are 2 options we can use 
/*+ HASH_AJ */
/*+ MERGE_AJ */
which one we should use? 
 
 
December  06, 2004 - 12:09 pm UTC 
 
hash_aj probably.  hash one table and then join to it -- versus sort both and merge. 
 
 
 
REPORT 6i
Elena, March     18, 2005 - 3:59 am UTC
 
 
I live in Ukraine and have a question to you:
"Do you answer on questions about REPORT 6i?"
Or you can give us mail when I would ask my questions.
Please,answer me.
With best regards
Elena 
 
March     18, 2005 - 7:04 am UTC 
 
nope -- i've never used reports in my life.
otn.oracle.com -> discussion forums.  there is a forum for reports. 
 
 
 
HASH_AJ
Yogesh, April     04, 2005 - 6:58 am UTC
 
 
I'm using following query 
CREATE TABLE ARCHIVE
AS SELECT * FROM SMALL WHERE CNO NOT IN
(SELECT /*+ HASH_AJ */ CNO FROM LARGE WHERE CNO IS NOT NULL)
Following is the query plan
CREATE TABLE STATEMENT Optimizer Mode=CHOOSE
LOAD AS SELECT                                     
FILTER                                     
TABLE ACCESS FULL    SMALL    14 K    564 K    236
INDEX FULL SCAN            PK_LARGE  867 K    21 M    48303       
Size of small table is 20M, and large table is      3225M. Here I was expecting HASH ANTI JOIN. 
But if I use  following query ... 
CREATE TABLE ARCHIVE
AS SELECT * FROM SMALL WHERE CNO IS NOT NULL  AND CNO NOT IN
SELECT /*+ HASH_AJ */ CNO FROM LARGE)
It is using ANTI HASH JOIN. 
My questions are, 
1. Do we have to place NOT NULL in inner query or it can be a part of outer query?
2. Does table size matter? 
 
                
 
 
April     04, 2005 - 7:43 am UTC 
 
are your stats up to date, versions are always useful. 
 
 
 
HASH_AJ
Yogesh, April     04, 2005 - 9:23 am UTC
 
 
DB version 8.0.4. 
Stats for small table is up to date. Large table was analyzed 7 days back. Should I analyze it again ? 
 
April     04, 2005 - 9:27 am UTC 
 
do this, 
select table_name, num_rows from user_tables where table_name in ( 'A, 'B' )
using your table names of course.  Anywhere close to reality?  post it.
svrmgrl> show parameter hash
as well
 
 
 
 
HASH_AJ
Yogesh, April     04, 2005 - 10:31 am UTC
 
 
TABLE_NAME                    : LARGE
TABLESPACE_NAME               :
CLUSTER_NAME                  :
IOT_NAME                      :
PCT_FREE                      :
PCT_USED                      :
INI_TRANS                     :
MAX_TRANS                     :
INITIAL_EXTENT                :
NEXT_EXTENT                   :
MIN_EXTENTS                   :
MAX_EXTENTS                   :
PCT_INCREASE                  :
FREELISTS                     :
FREELIST_GROUPS               :
LOGGING                       :
BACKED_UP                     : N
NUM_ROWS                      : 17351339
BLOCKS                        : 399153
EMPTY_BLOCKS                  : 12287
AVG_SPACE                     : 1305
CHAIN_CNT                     : 0
AVG_ROW_LEN                   : 153
AVG_SPACE_FREELIST_BLOCKS     : 0
NUM_FREELIST_BLOCKS           : 0
DEGREE                        :          1
INSTANCES                     :          1
CACHE                         :     N
TABLE_LOCK                    : ENABLED
SAMPLE_SIZE                   : 0
LAST_ANALYZED                 : 28-mar-2005 20:31:03
PARTITIONED                   : YES
IOT_TYPE                      :
TEMPORARY                     : N
NESTED                        : NO
BUFFER_POOL                   :
###############################################################
TABLE_NAME                    : SMALL
TABLESPACE_NAME               : DB1
CLUSTER_NAME                  :
IOT_NAME                      :
PCT_FREE                      : 10
PCT_USED                      : 40
INI_TRANS                     : 1
MAX_TRANS                     : 255
INITIAL_EXTENT                : 5242880
NEXT_EXTENT                   : 5242880
MIN_EXTENTS                   : 1
MAX_EXTENTS                   : 2147483645
PCT_INCREASE                  : 0
FREELISTS                     : 1
FREELIST_GROUPS               : 1
LOGGING                       : YES
BACKED_UP                     : N
NUM_ROWS                      : 282211
BLOCKS                        : 2449
EMPTY_BLOCKS                  : 110
AVG_SPACE                     : 2792
CHAIN_CNT                     : 0
AVG_ROW_LEN                   : 41
AVG_SPACE_FREELIST_BLOCKS     : 4293
NUM_FREELIST_BLOCKS           : 9
DEGREE                        :          1
INSTANCES                     :          1
CACHE                         :     N
TABLE_LOCK                    : ENABLED
SAMPLE_SIZE                   : 0
LAST_ANALYZED                 : 03-apr-2005 18:24:31
PARTITIONED                   : NO
IOT_TYPE                      :
TEMPORARY                     : N
NESTED                        : NO
BUFFER_POOL                   : DEFAULT
SVRMGR> show parameter hash
NAME                                TYPE    VALUE
----------------------------------- ------- ------------------------------
hash_area_size                      integer 0
hash_join_enabled                   boolean TRUE
hash_multiblock_io_count            integer 1
sequence_cache_hash_buckets         integer 10
 
 
April     04, 2005 - 12:02 pm UTC 
 
just wanted num rows....
you said:
Size of small table is 20M, and large table is      3225M. Here I was expecting 
HASH ANTI JOIN. 
stats say 282,211  and 17,351,339
they are "off" a tad.  something isn't adding up.
 
 
 
 
HASH_AJ
Yogesh, April     04, 2005 - 12:39 pm UTC
 
 
M -> MB. Sorry about that ...  
 
April     04, 2005 - 1:44 pm UTC 
 
so you didn't have row counts before, they were sizes?
what is your sort area size?
 
 
 
 
HASH_AJ
Yogesh, April     04, 2005 - 2:00 pm UTC
 
 
Yes, 20MB is the size of small table, and large table size is 3225MB. 
sort_area_size = 10000000 
 
April     04, 2005 - 5:19 pm UTC 
 
guess you can do the "do it yourself" way if you believe it'll be faster:
select small.*
 from large, small
where large.key(+) = small.key
and large.key is null;
 
 
 
 
HASH_AJ
Yogesh, April     05, 2005 - 5:15 am UTC
 
 
I was trying to understand why it is not able to use HASH_AJ hint in this specific case. Because it is working with other queries. If I'm using two tables small - 1GB+ and Large - 5GB+, it works fine ..... 
"poor mans hash anti-join" ? Good idea ..  
 
April     05, 2005 - 7:32 am UTC 
 
8.0.4 could have some definite impact on this.  really old stuff. 
 
 
 
HASH_AJ
Yogesh, April     05, 2005 - 8:43 am UTC
 
 
I Agree ..  
 
 
Displaying the column of inner query into outer query
Baskar, May       31, 2005 - 11:26 am UTC
 
 
Hi Tom,
Please look into this query.
SELECT 
               statement_date,
               revenue_centre,
               bill_create_date,
               jnl_end_date,
               company_code,
               revenue
          FROM ods_merc_bill_rev stg
              WHERE NOT EXISTS (SELECT 1
                               FROM tmp_rtmr_accounts tra
                              WHERE stg.gl_account_code||' 000000' = tra.account_code_n106)
                       OR NOT EXISTS (SELECT account_code_n106,product_code
                               FROM tmp_rtmr_accounts tra
                              WHERE stg.gl_account_code||' 000000' =  tra.account_code_n106
                               and EXISTS (SELECT 1
                                                      FROM tmp_rtmr_products trp
                      WHERE
                    tra.product_code = trp.product_code ))
Now I want the product code which is used in inner most query  to be displayed in main query..
How to bring this into main query.  Actually It should
return only three rows. If I insert the product code into main query, It is replicated. How to do!..Please advice me..
 
Table structure for ods_merc_bill_rev:
 
PROCESS_KEY      NUMBER (12) 
COMPANY_CODE     VARCHAR (6) 
REVENUE_CENTRE   VARCHAR (6) 
STATEMENT_DATE   VARCHAR (8) 
BILL_CREATE_DATE VARCHAR (8) 
JNL_END_DATE     VARCHAR (8) 
BILL_PERIOD      VARCHAR (3) 
GL_ACCOUNT_CODE  VARCHAR (10) 
REVENUE          NUMBER (20,2) 
 
Table structure for tmp_rtmr_accounts:
 
ACCOUNT_CODE_N106 VARCHAR (17) 
PRODUCT_CODE      VARCHAR (20) 
 
Table structure for tmp_rtmr_products:
PRODUCT_CODE      VARCHAR (20) 
 
Please have a look at this and give me a solution as soon as possible..
Thanks in Advance!
Regards,
baskar
 
 
 
May       31, 2005 - 6:34 pm UTC 
 
you'll be joining - but it doesn't seem to make sense to me since you say "or not exists ( product code )"
how can you show something that doesn't exist.
(i'm not here to write SQL for you, I'll be glad to answer questions -- but please don't think I'm going to write your SQL as soon as possible, that is just not reasonable) 
 
 
 
hash_aj in 10G
Yogesh, August    05, 2005 - 11:53 am UTC
 
 
What is the replacement of hash_aj in 10g? or do we don't have to use hash_aj anymore? Sorry I don't have installed version of 10g.  I read this in one document.   
 
August    05, 2005 - 2:14 pm UTC 
 
the "hint" seems to be not documented, but the anti join lives on
ops$tkyte@ORA10G> create table t1 ( x int not null, y int );
 
Table created.
 
ops$tkyte@ORA10G> create table t2 ( x int not null, y int );
 
Table created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'T1', numrows=>1000000, numblks=> 10000 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> exec dbms_stats.set_table_stats( user, 'T2', numrows=>1000000, numblks=> 10000 );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> @plan "select * from t1 where x not in (select x from t2 )";
ops$tkyte@ORA10G> delete from plan_table;
 
4 rows deleted.
 
ops$tkyte@ORA10G> explain plan for &1;
old   1: explain plan for &1
new   1: explain plan for select * from t1 where x not in (select x from t2 )
 
Explained.
 
ops$tkyte@ORA10G> select * from table(dbms_xplan.display);
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------
Plan hash value: 629543484
 
--------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes |TempSpc| Cost (%CPU)|
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    39 |       |  8525   (2)|
|*  1 |  HASH JOIN RIGHT ANTI|      |     1 |    39 |    23M|  8525   (2)|
|   2 |   TABLE ACCESS FULL  | T2   |  1000K|    12M|       |  2742   (2)|
|   3 |   TABLE ACCESS FULL  | T1   |  1000K|    24M|       |  2742   (2)|
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("X"="X")
 
15 rows selected.
 
 
 
 
 
 
anti join
Yogesh, August    09, 2005 - 5:14 am UTC
 
 
so optimizer is smart enough to make this decision without specifying any hint?  
 
August    09, 2005 - 10:02 am UTC 
 
i did not use a hint 
 
 
 
Strange outcome of NOT IN
SAF, August    11, 2005 - 4:08 am UTC
 
 
Hi Tom,
I'm using NOT IN to clear up a table and turning up with some strange results. Hope this is not too stupid... anyway here goes. 
The basic idea is to clean up an old large table, based on a smaller new one. Both tables have no null elements. 
--create a "small" table first. 
CREATE TABLE m1_msisdns_int
(msisdn         varchar2(18)    NOT NULL);
--it's filled with some data...
--create table of unmatched keys
CREATE TABLE unmatched_msisdns
(msisdn         varchar2(18)    NOT NULL);
--fill it up 
INSERT INTO unmatched_msisdns
(SELECT mi.msisdn FROM medapp.msisdn_imei mi WHERE mi.msisdn NOT IN
(SELECT msisdn FROM m1_msisdns_int));
OK, now records in unmatched_msisdns *should* be in msisdn_imei and *should not* be in m1_msisdns_int. But this is not the case. For example, picking some random records:
SQL> select * from unmatched_msisdns where rownum < 5;
MSISDN
------------------
6581000016
6581000017
6581000032
6581000033
SQL> select msisdn from medapp.msisdn_imei where msisdn = '6581000016';
MSISDN
------------------
6581000016
SQL> select msisdn from m1_msisdns_int where msisdn = '6581000016';
MSISDN
------------------
6581000016
Why is the record in the m1_msisdns_int table too? What am I doing wrong?
Thanks.  
 
 
August    11, 2005 - 9:55 am UTC 
 
show us from start to finish -- eg: table unmatched starts EMPTY.
do you have a test case I can reproduce with? 
 
 
 
Strange outcome of NOT IN: details
SAF, August    12, 2005 - 4:37 am UTC
 
 
Hi Tom,
Thanks for response.
Here are more details of the situation. I run this script first: 
========================
DROP TABLE m1_msisdns_ext;
DROP TABLE m1_msisdns_int;
DROP TABLE unmatched_msisdns;
CREATE OR REPLACE DIRECTORY data_dir AS '/home/oracle';
CREATE TABLE m1_msisdns_ext
(msisdn         varchar2(18))
ORGANIZATION EXTERNAL
(TYPE oracle_loader
        DEFAULT DIRECTORY data_dir
        ACCESS PARAMETERS
        (RECORDS DELIMITED BY NEWLINE
        fields terminated by ',')
        location ('msisdns_20050811.dat')
)
PARALLEL 5
REJECT LIMIT UNLIMITED;
CREATE TABLE m1_msisdns_int
(msisdn         varchar2(18)    NOT NULL);
INSERT INTO m1_msisdns_int (SELECT msisdn FROM m1_msisdns_ext);
DROP TABLE m1_msisdns_ext;
commit;
CREATE TABLE unmatched_msisdns
(msisdn         varchar2(18)    NOT NULL);
INSERT INTO unmatched_msisdns (SELECT msisdn FROM medapp.msisdn_imei WHERE msisdn NOT IN (SELECT msisdn FROM m1_msisdns_int));
commit;
========================
Now,
1. The data file to create m1_msisdns_ext (msisdns_20050811.dat) is just a single column file with strings e.g.:
6596886272
6597600746
6593686340
6596875161
6594745290
6597902848
6598493940
6590933396
6597440556
6593672611
You can generate it yourself, or I can pass one to you.
2. The table medapp.msisdn_imei also contains a list of strings like m1_msisdns_int, but a much larger set. 
3. We're using Oracle 10g, on Lintel. All results shown here using SQLPlus.
Just to be sure, I do these checks:
SQL> select count(*) from medapp.msisdn_imei;
  COUNT(*)
----------
   1652979
SQL>  select count(*) from m1_msisdns_int;
  COUNT(*)
----------
   1204505
SQL> select count(*) from unmatched_msisdns;
  COUNT(*)
----------
    449069
SQL> select * from m1_msisdns_int where msisdn is null;
no rows selected
SQL> select * from medapp.msisdn_imei where  msisdn is null;
no rows selected
However, when I run the NOT IN based insertion, I still see some (some, not all) records in the unmatched_msisdns where some records belong to both tables (m1_msisdns_int and medapp.msisdn_imei). 
Finally, as an aside, when I select with quote a query does not work, but works without quotes!! Weird, isn't it??:
SQL> select * from unmatched_msisdns where msisdn = 6581000016;
MSISDN
------------------
6581000016
SQL> select * from unmatched_msisdns where msisdn = '6581000016';
no rows selected
Sorry for the long post, but hope it makes things clear. 
Thanks for your time and effort.  
 
 
August    12, 2005 - 8:45 am UTC 
 
test case, something I can cut and paste and just @runit 
 
 
 
Check for trailing spaces
Jono, August    12, 2005 - 9:12 am UTC
 
 
I would suggest checking if the unmatched_msisdns table contains any trailing spaces after the digits. These would be ignored when an implicit conversion to a number took place and will not be noticeable in the normal SQL*plus query output. 
 
 
NOT IN: details
SAF, August    14, 2005 - 10:02 pm UTC
 
 
For the test case, I'd need to pass you the data file. How do I do that?
Jono:
Thanks for your input. However there are no trailing spaces in the string. Is there a way to turn this behaviour off?  
 
August    15, 2005 - 7:00 am UTC 
 
develop a method to generate, to reproduce the ISSUE.
However, I just noticed you said:
SQL> select * from unmatched_msisdns where msisdn = 6581000016;
MSISDN
------------------
6581000016
SQL> select * from unmatched_msisdns where msisdn = '6581000016';
no rows selected
I disagree with your analysis of the trailing whitespace.  When you compare the string to a number, there is an implicit to_number applied to the string:
ops$xp10gr1\tkyte@ORA10GR1> create table t ( x varchar2(18) );
Table created.
ops$xp10gr1\tkyte@ORA10GR1> @plan "select * from t where x = 1"
0 rows deleted.
old   1: explain plan for &1
new   1: explain plan for select * from t where x = 1
Explained.
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    11 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    11 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
<b>
   1 - filter(TO_NUMBER("X")=1)
</b>
Note
-----
   - dynamic sampling used for this statement
17 rows selected.
<b>when you compare a string to a string, you do NOT have that happening:</b>
ops$xp10gr1\tkyte@ORA10GR1> @plan "select * from t where x = '1'"
2 rows deleted.
old   1: explain plan for &1
new   1: explain plan for select * from t where x = '1'
Explained.
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    11 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    11 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
<b>
   1 - filter("X"='1')
</b>
Note
-----
   - dynamic sampling used for this statement
17 rows selected.
So, I believe you DO have trailing whitespace on some of those.  Look again.
select * from t 
where x like '% '
  or x like '%'||chr(9)
  or x like '%' || chr(10) 
  or x like '%' || chr(13)
/
for example. 
 
 
 
 
spaces
SAF, August    16, 2005 - 6:08 am UTC
 
 
The trailing spaces were the culprit indeed. 
Thanks a lot.  
 
 
A reader, August    24, 2005 - 3:54 pm UTC
 
 
Tom,
Sorry for dragging this thread back to the basics a bit, but going back to your first reply
select count(*) from emp where empno not in (Select nvl(mgr,0) from emp) 
appears to bring the correct result.
What I am trying to understand is when the "NOT EXISTS" works surely it should also say 'Hey, I see a Null value and I don't know if it maybe or may not be so I can ignore'. I mean why is  NOT IN alone returning count(*)=0, comparitively. 
Thanks 
 
August    25, 2005 - 3:25 am UTC 
 
the where exists DOESN'T see the null.  That is the point :)
we said "not exists ( select null from emp T2 where t2.mgr = t1.empno );"
and sure enough for that t1.empno, there were no rows.
 
 
 
 
Great  Stuff
yogi, September 09, 2005 - 2:57 pm UTC
 
 
Thanks this was very useful .I used this to bring down the response time of a query from 40 minutes to some few seconds .Keep the good work going.
 
 
 
Not Exists
V, October   27, 2005 - 4:09 pm UTC
 
 
Tom I have the following which takes forever to run:
..
..
AND NOT EXISTS    (SELECT null FROM   tm tp WHERE  tp.id = cd.id)
AND NOT EXISTS    (SELECT null FROM   t1 tp WHERE  tp.id = cd.id)
can this be changed to:
AND NOT EXISTS (SELECT null FROM   tm tp,t1 tp1        WHERE tp.id = cd.id OR tp1.id = cd.id )
Would this result in the same result? 
 
October   28, 2005 - 1:42 am UTC 
 
why do you believe the not exists is the problem?  (you are using the CBO right?)
No, it cannot be changed to what you have, that query is not the same as the prior two.
Using the cbo, you might:
and cd.id not in ( select id from tm where id is not null 
                    union all 
                   select id from t1 where id is not null )
 
 
 
 
Is this way of existence check valid?
Ford Desperado, November  29, 2005 - 4:17 pm UTC
 
 
Tom,
I've come across an application where the client checks if there are any rows meeting some criteria. Here is how it is done:
select count(*) from some_table where (some complex criteria) and rownum=1 
/* also in some places rownum<2 is used */
Is is a safe way for existence checks?
Isn't it better to check existence like this:
select case when exists(select 1 from some_table where (some complex criteria)) then 1 else 0 end a from dual
Thank you
 
 
November  30, 2005 - 11:07 am UTC 
 
here is the best way to do this:
THIS SPACE INTENTIONALLY LEFT BLANK
I hate that check for existence, hardware vendors love them since it increases sales, but in software it is typically "not useful"
I take the approach of "rather than see if there is work to be done, I'll try to do work and at that point discover one of two things:
a) work is there to be done, lets do it
b) work is not there to be done, skip it
the rownum check works however. 
 
 
 
great, thanks!
Ford Desperado, November  30, 2005 - 1:33 pm UTC
 
 
Thanks Tom 
 
 
A reader, January   04, 2006 - 9:52 am UTC
 
 
Very helpful, thanks 
 
 
How can I get a hash mechanism working for this query
Naresh, March     30, 2006 - 12:01 pm UTC
 
 
Hello Tom,
I have the query below - service_agreement_tas_d and cm1_agreement_param_tas_d have about 2 million rows each, csm_offer_Param has about 1000 rows. Tables are not analyzed as we are doing a data conversion where the tables do not have indexes. The join in the sub query is on key fields (meaning agreement_no, param_name, offer_instance_id define a unique row in cm1_agreement_param):
  1  select
  2              mod(agreement_no, 100) agreement_key,
  3              agreement_no, soc, 30000000 + rownum param_seq_no,
  4              sa.effective_date, sa.expiration_date, sa.source_agr_no, sa.soc_seq_no,
  5              param_instance_level, param_name, param_values,
  6              row_number() over
  7                     (partition by agreement_no, param_name order by soc_seq_no) rn
  8    from service_agreement_tas_d sa,
  9    (select distinct soc_cd, param_name, populate_level, param_values,
 10     mandatory_ind, param_instance_level from csm_offer_param ) op
 11    where sa.soc = op.soc_cd
 12      and not exists (select /*+ HASH_AJ */ NULL from cm1_agreement_param_tas_d p
 13                       where p.agreement_no = sa.agreement_no
 14                             and p.param_name = op.param_name
 15                             and ( (op.param_instance_level = 'O'
 16                                     and p.offer_instance_id = sa.soc_seq_no)
 17                                     or (op.param_instance_level = 'S'
 18                                             and p.offer_instance_id is NULL)
 19                                     )
 20*                      )
SQL> /
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=124560227 Card=19271
          2 Bytes=425700808)
   1    0   WINDOW (SORT) (Cost=68275 Card=192712 Bytes=425700808)
   2    1     COUNT
   3    2       FILTER
   4    3         HASH JOIN (Cost=9984 Card=192712 Bytes=425700808)
   5    4           VIEW (Cost=4340 Card=14703 Bytes=31508529)
   6    5             SORT (UNIQUE) (Cost=4340 Card=14703 Bytes=315967
          47)
   7    6               TABLE ACCESS (FULL) OF 'CSM_OFFER_PARAM' (Cost
          =6 Card=14703 Bytes=31596747)
   8    4           PARTITION LIST (ALL)
   9    8             TABLE ACCESS (FULL) OF 'SERVICE_AGREEMENT_TAS_D'
           (Cost=1156 Card=3854235 Bytes=254379510)
  10    3         PARTITION LIST (ALL)
  11   10           TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D'
           (Cost=646 Card=1 Bytes=155)
Is it possible for the last full table access in the sub-query to use a hash mechanism?
I tried using HASH_AJ in inner query
 and not exists (select /*+ HASH_AJ */ NULL from cm1_agreement_param_tas_d p
                          where p.agreement_no = sa.agreement_no
But that did not work.
Hash related settings:
  1* select name, value from v$parameter where name like '%hash%'
SQL> /
NAME                           VALUE
------------------------------ ------------------------------
hash_join_enabled              TRUE
hash_area_size                 50000000
Thanks,
Naresh 
 
 
March     31, 2006 - 11:28 am UTC 
 
code it as an outer join instead of a subquery and add 
and p.agreement_no  is null
to the query. 
 
 
 
HASH_AJ for not in 
Naresh, March     31, 2006 - 12:56 am UTC
 
 
Hello Tom,
Version used is 9.2.0.6
I tried a variation for above query with a not in - still does not use hash_aj. I put nvl for each column in the inner select and also set always_anti_join=hash (I do not know if this parameter is relevant in the version we use - 9.2.0.6). What can be done to get a hash join working for this?
   1  select
  2              agreement_no, soc, 30000000 + rownum param_seq_no,
  3              sa.effective_date, sa.expiration_date, sa.soc_seq_no,
  4              param_instance_level, param_name, param_values,
  5              row_number() over
  6                     (partition by agreement_no, param_name order by soc_seq_no) rn
  7    from service_agreement_tas_d sa,
  8    (select distinct soc_cd, param_name, populate_level, param_values,
  9     mandatory_ind, param_instance_level from csm_offer_param ) op
 10    where sa.soc = op.soc_cd
 11      and (nvl(agreement_no,0), nvl(param_name,' '), nvl(soc_seq_no, -1)) not in
 12     (select /*+ HASH_AJ */ nvl(agreement_no,0), nvl(param_name,' '),
 13             nvl(offer_instance_id, -1)
 14*            from cm1_agreement_param_tas_d)
SQL> /
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=226118412 Card=19271
          2 Bytes=423195552)
   1    0   WINDOW (SORT) (Cost=67236 Card=192712 Bytes=423195552)
   2    1     COUNT
   3    2       FILTER
   4    3         HASH JOIN (Cost=9276 Card=192712 Bytes=423195552)
   5    4           VIEW (Cost=4340 Card=14703 Bytes=31508529)
   6    5             SORT (UNIQUE) (Cost=4340 Card=14703 Bytes=315967
          47)
   7    6               TABLE ACCESS (FULL) OF 'CSM_OFFER_PARAM' (Cost
          =6 Card=14703 Bytes=31596747)
   8    4           PARTITION LIST (ALL)
   9    8             TABLE ACCESS (FULL) OF 'SERVICE_AGREEMENT_TAS_D'
           (Cost=1156 Card=3854235 Bytes=204274455)
  10    3         PARTITION LIST (ALL)
  11   10           TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D'
           (Cost=1173 Card=489 Bytes=75795)
The Card=489 for 'CM1_AGREEMENT_PARAM_TAS_D is different from the one below:
SQL> select count(*) from CM1_AGREEMENT_PARAM_TAS_D;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE (Cost=1173 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION LIST (ALL)
   3    2       TABLE ACCESS (FULL) OF 'CM1_AGREEMENT_PARAM_TAS_D' (Co
          st=1173 Card=3910431)
And this is what the user_tables shows
SQL> SQL> select lasT_analyzed, num_rows, USER_STATS from user_tables where table_name in
  2  ('SERVICE_AGREEMENT_TAS_D', 'CM1_AGREEMENT_PARAM_TAS_D');
LAST_ANAL   NUM_ROWS USE
--------- ---------- ---
                     NO
                     NO
Thanks,
Naresh
 
 
 
 
Adding != NULL in sub-query did it
naresh, March     31, 2006 - 5:40 am UTC
 
 
Writing the sub-query thus, gave the desired result
(select /*+ HASH_AJ */ nvl(agreement_no,0), nvl(param_name,' '),
nvl(offer_instance_id, -1)
from cm1_agreement_param_tas_d
where agreement_no is not NULL and param_name is not null and nvl(offer_instance_id,-1) is not null)
Thanks,
Naresh. 
 
 
Query Optimization related to EXIST
Soni, May       20, 2006 - 12:55 am UTC
 
 
Please look into the part of query below. I need to optimize the query. This looks very big and not optimized. Please advice.
 
AND  EXISTS ((SELECT 1
                    FROM   springs.si_jde_schedrcpts_work sr
                    WHERE  t.order_id = sr.t.order_id
                           AND t.item = sr.item
                           AND t.fromloc = sr.fromloc
                           AND sr.order_status < 90
                           AND sr.supplysource = 1)
       OR  EXISTS (SELECT 1
                    FROM   stsc.inventory inventory
                    WHERE  t.item = inventory.item
                           AND t.fromloc = inventory.loc
                           AND inventory.qty => t.stqty
                           AND sr.order_staus > 90
                           AND sr.supplysource = 1)
       OR  EXISTS (SELECT 1
                    FROM   stsc.inventory inventory
                    WHERE  t.item = inventory.item
                           AND t.fromloc = inventory.loc
                           AND inventory.qty => t.stqty
                           AND sr.order_staus < 90
                           AND sr.supplysource <> 1)
         ) 
 
May       20, 2006 - 4:52 pm UTC 
 
how do you know it is not optimized?
I know it is not syntactically "meaningful"
ops$tkyte@ORA10GR2> select *
  2    from dual
  3   where 1=1
  4     and exists ( (select 1 from dual)
  5                  or exists (select 1 from dual)
  6                  or exists (select 1 from dual)
  7                )
  8  /
                or exists (select 1 from dual)
                *
ERROR at line 5:
ORA-00907: missing right parenthesis
 
 
 
 
 
Optimization
Soni, May       21, 2006 - 11:01 am UTC
 
 
Thanks TOM, I know it might be wrong. I am new to Oracle and have to get this done using EXIST. I understand it might be wrong. I am learning and working too. Please help me modify this. I was not getting a way to solve this query having OR conditions with EXIST. Please advice. 
 
May       21, 2006 - 8:12 pm UTC 
 
these two:
       OR  EXISTS (SELECT 1
                    FROM   stsc.inventory inventory
                    WHERE  t.item = inventory.item
                           AND t.fromloc = inventory.loc
                           AND inventory.qty => t.stqty
                           AND sr.order_staus > 90
                           AND sr.supplysource = 1)
       OR  EXISTS (SELECT 1
                    FROM   stsc.inventory inventory
                    WHERE  t.item = inventory.item
                           AND t.fromloc = inventory.loc
                           AND inventory.qty => t.stqty
                           AND sr.order_staus < 90
                           AND sr.supplysource <> 1)
can obviously be rolled into one - but it would require an OR which wouldn't be much different from what you have if an index on qty,status,supplysource were being used.
basically "not sufficient data to really comment"
exists is not inheritly "bad" 
 
 
 
Cursor Modification
Soni, May       22, 2006 - 1:18 am UTC
 
 
Hi Tom, I am here writting the whole code for the above query I have to make the changes in a Cursor.
The change description is :
In the c_recship cursor, add an edit to first of all check to see if the t.order_id, t.item and t.fromloc exist on the springs.si_jde_schedrcpts_work sr table, joining t.order_id to  sr.order_id, t.item to sr.item and t.fromloc to sr.loc, where the sr. order_status < 90 and the sr.supplysource = 1. If found, create the recship. If it exists and the order_status > 90 or the t.order_id, t.item and t.fromloc does not exist at all as a supplysource = 1 row on the si_jde_schedrcpts_work table, then check the stsc.inventory table joining t.item to inventory.item, t.fromloc to inventory.loc and create recship if inventory.qty = > t.stqty.
Please suggest for the changes done by me.
CURSOR c_recship IS 
SELECT t.item item, t.toloc dest, t.fromloc source, s.transmode 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needarrivedate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedarrivedate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') availtoshipdate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedshipdate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needshipdate 
, 4 type, 0 actionallowedsw, 1 firmsw , ROUND(t.stqty,2) qty, 0 orderid 
, t.item primaryitem, 1 sys_info, t.order_id 
FROM 
    (SELECT a.item, a.duedate 
    , b.fromloc, a.toloc, stqty, a.order_number order_id 
    FROM (SELECT DISTINCT item, TRUNC(duedate) duedate 
    , toloc, order_number 
    FROM springs.si_jde_intransit_work 
    WHERE TRIM(UPPER(order_type)) = 'OT' 
    ) a 
    , (SELECT item, TRUNC(duedate) duedate, fromloc 
    , rel_ot_ord_nbr, SUM(NVL(net_st_qty,0)) stqty 
    FROM springs.si_jde_intransit_work 
    WHERE TRIM(UPPER(order_type)) = 'ST' 
    GROUP BY item, TRUNC(duedate), fromloc, rel_ot_ord_nbr 
    HAVING SUM(NVL(net_st_qty,0)) <> 0 
    ) b 
    WHERE a.order_number = b.rel_ot_ord_nbr 
    AND a.item = b.item 
) t, stsc.sourcing s 
WHERE NOT EXISTS (SELECT 1 FROM stsc.loc 
WHERE scen = 0 AND loc = t.toloc 
AND NVL(TRIM(si_loc_type),' ') = 'DC' 
) 
--Change by Soni
AND EXISTS((SELECT 1 FROM springs.si_jde_schedrcpts_work sr 
WHERE t.order_id = sr.t.order_id AND t.item = sr.item 
AND t.fromloc = sr.fromloc AND sr.order_status < 90 
AND sr.supplysource  =1 )
OR EXISTS(SELECT 1 FROM stsc.inventory inventory 
WHERE t.item = inventory.item 
AND t.fromloc = inventory.loc AND inventory.qty => t.stqty AND sr.order_staus > 90)
OR EXISTS(SELECT 1 FROM stsc.inventory inventory 
WHERE t.item = inventory.item 
AND t.fromloc = inventory.loc AND inventory.qty => t.stqty AND sr.supplysource=1))
--Change by Soni till here
AND s.transmode=springs.si_transmode_lookup(t.fromloc 
,t.toloc 
,' ',t.item 
) 
AND s.item = t.item 
AND s.source = t.fromloc 
AND s.dest = t.toloc 
AND s.scen = 0;  
 
May       22, 2006 - 7:43 am UTC 
 
suggest you ask "why the heck would I have to do this"
 WHERE TRIM(UPPER(order_type)) = 'OT' 
what would possibly cause you to have to use trim and upper. ugh. 
 
 
 
Optimization
Soni, May       22, 2006 - 8:26 am UTC
 
 
The other part of query is old and I don't have to change it. I have got the change to edit only a part of it. Please help as this is too urgent and I am new to Oracle.
Thanks for your patient. 
 
May       22, 2006 - 3:50 pm UTC 
 
is it working, that is the question.  
it'll do what you programmed it to - is it working for you. 
 
 
 
Checking for optimization
Soni, May       23, 2006 - 3:05 am UTC
 
 
CURSOR c_recship IS 
SELECT t.item item, t.toloc dest, t.fromloc source, s.transmode 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needarrivedate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedarrivedate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') availtoshipdate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') schedshipdate 
, TO_CHAR(NVL(t.duedate,dt_ohpost),'YYMMDD') needshipdate 
, 4 type, 0 actionallowedsw, 1 firmsw , ROUND(t.stqty,2) qty, 0 orderid , t.item primaryitem, 1 sys_info, t.order_id 
FROM 
    (SELECT a.item, a.duedate 
    , b.fromloc, a.toloc, stqty, a.order_number order_id 
    FROM (SELECT DISTINCT item, TRUNC(duedate) duedate 
    , toloc, order_number 
    FROM springs.si_jde_intransit_work 
    WHERE TRIM(UPPER(order_type)) = 'OT' 
    ) a 
    , (SELECT item, TRUNC(duedate) duedate, fromloc 
    , rel_ot_ord_nbr, SUM(NVL(net_st_qty,0)) stqty 
    FROM springs.si_jde_intransit_work 
    WHERE TRIM(UPPER(order_type)) = 'ST' 
    GROUP BY item, TRUNC(duedate), fromloc, rel_ot_ord_nbr 
    HAVING SUM(NVL(net_st_qty,0)) <> 0 
    ) b 
    WHERE a.order_number = b.rel_ot_ord_nbr 
    AND a.item = b.item 
) t, stsc.sourcing s 
WHERE NOT EXISTS (SELECT 1 FROM stsc.loc 
WHERE scen = 0 AND loc = t.toloc 
AND NVL(TRIM(si_loc_type),' ') = 'DC' 
) 
--Changes by Soni from here
AND (EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work  
              sr WHERE t.order_id = sr.t.order_id AND t.item = sr.item AND t.fromloc = sr.fromloc AND sr.order_status < 90 
              AND sr.supplysource  =1 )
     OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
              WHERE t.item = inventory.item 
              AND t.fromloc = inventory.loc AND  
              inventory.qty => t.stqty
              AND sr.order_staus > 90
             )
    OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
               WHERE t.item = inventory.item 
               AND t.fromloc = inventory.loc AND 
               inventory.qty => t.stqty 
               AND sr.supplysource=1
              )
      )
--Changes by Soni till here
AND s.transmode=springs.si_transmode_lookup(t.fromloc 
,t.toloc ,' ',t.item 
) 
AND s.item = t.item AND s.source = t.fromloc 
AND s.dest = t.toloc AND s.scen = 0;  
**********
The changes done by me are between commnets. Please tell me if this correct. I have tested this. Its not giving any compilation error. Please check the part modified by me.
The description of the change is in the question asked before this. I want to put them not in three select statement but atleast in only two.
Thanks.
 
 
May       23, 2006 - 7:33 am UTC 
 
looks ok to me if it is getting the *right answer*.   
 
 
 
any thoughts on this query
A reader, June      15, 2006 - 11:03 am UTC
 
 
Tom,
 
Please keep in mind that main_report_tbl contains 4261053 records
select  yard_area , area_name , chassis_lt_description , visit , gate_id , reference_id ,
ssl_user_code , container , lane_out_by , cont_ck , container_lht_description ,
gross_weight , seals , temp_mins , temp_maxs , location , loc_type , booking ,
trucker_description , driver , chassis , genset , hazardous , guard_out_date , voided_date ,
 visit_voided_date , haz , vessel_voy , tir , ckr , sec , action , eir_id ,
 foreign , sealink_code , lane_in_date , ssl_user_description , port_of_destination_desc ,
 port_of_discharge_description , reefer , net_elapsed_time , net_time , rpm_time , clerical_time ,
 clerical_services_time , remarks , damage_description , action ,
 chassis , loc_type
 from main_report_tbl
 where guard_out_date is not null
 and voided_date is null
 and visit_voided_date is null
 and substr ( action , 2 , 1 ) not in ( 'C' , 'G' )
  and  lane_in_date  between to_date( '13-JUN-2005 00:00:00', 'dd-mon-yyyy hh24:mi:ss' )
  and to_date( '13-JUN-2005 23:59:59', 'dd-mon-yyyy hh24:mi:ss' ) 
 
June      16, 2006 - 5:48 pm UTC 
 
looks great!
as long as you want that data that is...
I mean, it doesn't get "much simplier than this"
cannot suggest anything unless you were to say about how many records you might generally anticipate, to see if an index makes sense or not. 
 
 
 
Brad, July      09, 2006 - 10:33 pm UTC
 
 
Tom..
i have some trouble with a sql...
insert into table T
select /*+ parallel(x,4) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (2,3)
group by a
the above sql runs fine..in about 10 mins...
But i have a limitation that i have to write the above sql
like this
insert into table T
select /*+ parallel(x,4) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y)
group by a..
table Y has 2 rows (2 and 3)
Now this SQL takes 40 mins to complete...any ideas improve this performance...please let me know if i am being vague...Thanks 
 
July      10, 2006 - 7:44 am UTC 
 
is b in Y "NOT NULL" or is it "NULLABLE"
and I assume that statistics are "there" and accurate? 
 
 
 
Brad, July      10, 2006 - 4:19 pm UTC
 
 
Both the tables are analyzed and the plans are as shown below ...table X has around 7.4 million rows and table y has 2 rows.
The column b in table y does not have null values but its nullable....
the output of the select statement will be arond 170,000 rows....
SQL1:
-----
The below sql takes 25 mins to comeback
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y)
group by a..
Operation    Object Name    Rows    Bytes    Cost    Object Node    In/Out    PStart    PStop
SELECT STATEMENT Optimizer Mode=CHOOSE        73 K         162987                                       
  SORT GROUP BY        73 K    44 M    16217                                       
    FILTER                                                         
      TABLE ACCESS FULL    ODS.X    369 K    224 M    10175      :Q1638707000     P->S      QC (RANDOM)      
      TABLE ACCESS FULL    ODS.Y    1      4      2                                       
SQL2:
----
The below SQL takes 6 mins to comeback
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (2,3)
group by a..
Operation    Object Name    Rows    Bytes    Cost    Object Node    In/Out    PStart    PStop
SELECT STATEMENT Optimizer Mode=CHOOSE        73 K         96849                                       
  SORT GROUP BY        73 K    44 M    96849      :Q1638710001     P->S      QC (RANDOM)      
    SORT GROUP BY        73 K    44 M    96849      :Q1638710000     P->P      HASH             
      TABLE ACCESS FULL    ODS.X    7 M    4G    10175      :Q1638710000     PCWP                       
 
 
July      11, 2006 - 7:34 pm UTC 
 
add "and y is not null" to the subquery.
 
 
 
 
Brad, July      11, 2006 - 10:30 pm UTC
 
 
Tom..i tried what you said..still no change...
select /*+ parallel(x,2) */
a,
sum(c),
....
30 columns all sum()
from table X
where b not in (select b from table Y where b is not null)
group by a..
Operation    Object Name    Rows    Bytes    Cost    Object Node    In/Out    PStart    PStop
SELECT STATEMENT Optimizer Mode=CHOOSE        73 K         162987                                       
SORT GROUP BY        73 K    44 M    16217                                       
FILTER                                                         
TABLE ACCESS FULL    ODS.X    369 K    224 M    10175      :Q1640640000     P->S      QC (RANDOM)      
TABLE ACCESS FULL    ODS.Y    1      4      2                                       
 
 
July      12, 2006 - 3:34 pm UTC 
 
make it so the plan fits on a line (so we humans can read it easily)
are the ESTIMATED CARDINALITIES correct
and how long does it take to just full scan this table on your instance?  I mean, does it take 6 minutes to do that full scan? 
 
 
 
not in vs exists
jas, July      11, 2006 - 11:16 pm UTC
 
 
I have a query like
select request_id from  request 
where request_id not in ( select ref from refer_tab)
and reuest_id not in ( select sed from sed_tab)
in whose explian plan ( select ref from refer_tab) query was not using index.
when i write it like .....
select request_id from  request 
where exists( select ref from refer_tab)
and reuest_id not in ( select sed from sed_tab)
it gives me the same output and also costs is some 10000 times low and query executes fast.
I can't understand the reason that exists and not in gives same result.
 
 
July      12, 2006 - 3:37 pm UTC 
 
they don't in general give the same results (eg: they are NOT semantically equivalent)
ops$tkyte@ORA10GR2> create table t1 ( x int );
Table created.
ops$tkyte@ORA10GR2> create table t2 ( x int );
Table created.
ops$tkyte@ORA10GR2> create table t3 ( x int );
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> insert into t1 values ( 1 );
1 row created.
ops$tkyte@ORA10GR2> insert into t2 values ( NULL );
1 row created.
ops$tkyte@ORA10GR2> insert into t3 values ( 3 );
1 row created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
  2    from t1
  3   where x not in (select x from t2)
  4     and x not in (select x from t3)
  5  /
no rows selected
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
  2    from t1
  3   where NOT EXISTS (select null from t2 where t2.x = t1.x)
  4     and x not in (select x from t3)
  5  /
         X
----------
         1
is ref in refer_tab NULLABLE.
if it is, add the predicate "where ref is not null" to the not in so that NOT IN and NOT EXISTS (I'm assuming your second example, which is totally "syntax free" was really a not exists and not exists...) all of a sudden become semantically "the same" 
 
 
 
 
CBO not choosing hash anti join
Morten, August    01, 2006 - 6:18 pm UTC
 
 
Having worked with Oracle since v.6 I have never really gotten comfortable with the CBO. 
I find that often (always?) I would like my queries to behave predictably and consistently rather than potentially (rarely) faster.
Having said that, I realize that I am being backwards about this, and thus I have set out to explore your brilliant site in search of the one thing that will win me over.
This thread is at the core of my issue with the CBO: I have a query that is clearly well suited for a hash anti join, and it runs in about 2 mins using the "poor man's hash anti join". 
Both tables involved are indexed and analyzed, so I set out to rewrite my query to use NOT IN, hoping that the CBO would do the rest for me. The result: Indexes used, decrease of perfomance by a factor 3. NOT EXISTS has the same problem. No NULL columns involved.
Normally I would just shake my head and use my hard-coded anti join, but I have a little bit of time on my hands to experiment, so ...
Could you possibly list a few reasons why the CBO would not pick the hash anti join for my query (looking for new records from mta_tab_2 to add to mta_tab_1)?
mta_tab_1 has 2819134 records (acct_id,start_dt unique)
mta_tab_2 has 1844852 records
  SELECT x.acct_id
        ,x.start_dt
  FROM   mta_tab_2 x
        ,(SELECT a.acct_id
                ,a.start_dt
          FROM   (SELECT acct_id
                        ,start_dt
                        ,end_dt
                  FROM   mta_tab_1
                 ) a
                ,(SELECT acct_id
                        ,MAX(start_dt) start_dt
                  FROM   mta_tab_1
                  GROUP  BY acct_id
                 ) b
          WHERE  b.acct_id = a.acct_id
          AND    :dt BETWEEN a.start_dt AND a.end_dt
          AND    a.start_dt = b.start_dt
         ) y
  WHERE  :dt BETWEEN x.start_dt AND x.end_dt
  AND    y.start_dt IS NULL
  AND    y.acct_id(+) = x.acct_id;
 
 
August    02, 2006 - 10:56 am UTC 
 
get used to the cbo - it is a fact, in 10g, it is just a plain and simple fact of life.
I always start with comparing the "guess" with "reality"
compare an autotrace traceonly explain
with
a tkprof
see where the optimizer gets estimated cardinalities wrong and ask yourself "is there a logical reason why they are wrong".  From that we can usually apply corrective actions.
Looks like you really want to learn about analytics!!!!!
(SELECT a.acct_id
                ,a.start_dt
          FROM   (SELECT acct_id
                        ,start_dt
                        ,end_dt
                  FROM   mta_tab_1
                 ) a
                ,(SELECT acct_id
                        ,MAX(start_dt) start_dt
                  FROM   mta_tab_1
                  GROUP  BY acct_id
                 ) b
          WHERE  b.acct_id = a.acct_id
          AND    :dt BETWEEN a.start_dt AND a.end_dt
          AND    a.start_dt = b.start_dt
         ) y
could be written in a single pass as:
select acct_id, start_dt
  from (
select acct_id, start_dt, end_dt,
       row_number() over (partition by acct_id order by start_dt DESC) rn
  from mta_tab_1
       )
 where rn = 1
   and start_date <= :dt
   and end_dt >= :dt
You just want to get the "max row by acct_id, where the max row is designated by start_dt"
And use that in a NOT IN
Seems this is the query:
select * 
  from mta_tab_2
 where start_dt <= :dt
   and end_dt >= :dt
   and (acct_id) NOT IN 
(
select acct_id
  from (
select acct_id, start_dt, end_dt,
       row_number() over (partition by acct_id order by start_dt DESC) rn
  from mta_tab_1
       )
 where rn = 1
   and start_date <= :dt
   and end_dt >= :dt
   and acct_id IS NOT NULL
) 
 
 
 
more info
Morten, August    01, 2006 - 7:10 pm UTC
 
 
btw the query returns ~10K records 
 
 
Wow - fast response
Morten, August    02, 2006 - 2:14 pm UTC
 
 
Thanks for the incredible response time!
I am onto the analytics. I tried the approach out with similar queries without any significant performance improvement, and in the end decided to modify the code as little as possible (I inherited the code).
So you are saying that by comparing the autotrace and the tkprof I should be able to pinpoint where the CBO goes wrong? (The tricky part is getting access to the trace file) 
 
August    02, 2006 - 4:01 pm UTC 
 
you compare the "row source operation" (the truth) with REAL output row counts
to the explain plan
which is the guess - to see if they are "way off" 
 
 
 
How about finding all non-primary key indexes ?
A reader, August    25, 2006 - 12:02 pm UTC
 
 
SELECT   uc.constraint_name,
         uc.constraint_type,
         uic.*
    FROM user_indexes uix,
         user_ind_columns uic,
         (SELECT index_name,
                 uc.constraint_type,
                 uc.table_name,
                 uc.constraint_name
            FROM user_constraints uc
           WHERE uc.constraint_type = 'P') uc
   WHERE uix.index_name = uic.index_name
     AND uix.table_name = uic.table_name
     AND uix.index_name = uc.index_name(+)
     AND uc.constraint_type IS NULL
order by uic.column_position
any better idea ? can JDBC do it ?
 note that the query must run from a user schema with only connect,resource prives. 
 
August    27, 2006 - 8:46 pm UTC 
 
select select index_name from user_indexes where index_name not in (select index_name from user_constraints where index_name is not null );
jdbc can run valid sql, so I would guess "yes" jdbc would be capable of running a query. 
 
 
 
Scalability
yuvraj, September 05, 2006 - 3:16 am UTC
 
 
Hi Tom,
What about the scalability of each of these ? I have a situation where in the data in the inner query resultset will grow almost exponentially over time whereas the outer query data will almost remain constant (comparitively speaking). Currently, the plans do not look very different from each other (I am using NOT IN & NOT EXISTS).
Thanks a lot! 
 
September 05, 2006 - 4:58 pm UTC 
 
using the CBO, it'll change plans as the volumes of data change over time - unlike the RBO 
 
 
 
Scalability
yuvraj, September 05, 2006 - 3:20 am UTC
 
 
I am using Oracle 10g Release 10.1.0.4.0. 
 
 
scalability
A reader, September 06, 2006 - 4:32 am UTC
 
 
Tom,
Thank you for the quick response.
So, can we predict (or do we know) which one out of the two will scale better in the situation given? Or is there no way to say ? Given my situation where the plans look alike, which condition should I use keeping in mind the growth of volume of data?  
 
September 06, 2006 - 7:51 am UTC 
 
use the cbo and we don't really *care*, that is the point.   
 
 
 
scalability-thanks
A reader, September 07, 2006 - 1:06 am UTC
 
 
Thank you Tom. I understand now.. 
 
 
OK
Kumar, February  02, 2007 - 1:50 am UTC
 
 
Hi Tom,
Any better way to rewrite the query given below?
SELECT  ol.line_id
         ,ol.created_by
         ,ol.last_updated_by
         ,ad.document_id
         ,ad.pk1_value
         ,ad.seq_num
    FROM  oe_order_lines_all  ol
         ,fnd_attached_documents ad
         ,(SELECT *
            FROM  so_lines_all
           ) sl
   WHERE ad.entity_name        = 'OE_ORDER_LINES'
     AND ol.split_from_line_id = ad.pk1_value
     AND sl.header_id          = ol.Header_id
     AND sl.line_id            = ol.split_from_line_id
     AND  NOT EXISTS
              ( SELECT 1
                  FROM fnd_attached_documents ad2
                 WHERE ol.line_id          = ad2.pk1_value
                   AND ad2.entity_name     = 'OE_ORDER_LINES'
Please do reply. 
February  02, 2007 - 10:45 am UTC 
 
looks okey dokey to me.  as long as it gets the right answer of course. 
 
 
Does null always mean "UNKNOWN"?
Robert Simpson, February  07, 2007 - 12:36 pm UTC
 
 
A null value was added to a column, and suddenly no data was returned from queries with a NOT IN clause.  We figured out why, but then figured that other queries using things like join conditions and IN to compare the same values were probably returning different results, but that ended up not being the case.
If NULL really means "gee, I don't know.  (litterally, null means Unknown)" per October 02, 2002 above, and "US" matches NULL in a NOT IN clause, why don't those two values also match in a join condition, an IN clause, a NOT EXISTS or a MINUS?
It almost seems like NOT IN is very unique in its treatment of NULLs.  What are we missing?  Thanks.
 
February  07, 2007 - 6:51 pm UTC 
 
because the sql standard says so - it is the way it was defined to work.
but it makes sense.
where x not in ( Q )
and Q contains nulls - is X not in there?  We don't know, there is no way to compare X to nulls...
where x in ( Q )
well, regardless of what Q has as far as nulls go - if X=5 and Q has 5, X is in Q. 
 
 
CBO: Result not consistent with Dukes..
Jens, February  08, 2007 - 4:18 am UTC
 
 
Back to the 3 easy examples:
select count(*) from t1 rbo
   where object_id not in ( select object_id from t2 );
select count(*) from t1 rbo
   where NOT EXISTS (select null from t2 where t2.object_id = rbo.object_id );
select count(*) from t1, t2 rbo
   where t1.object_id = rbo.object_id(+) and rbo.object_id IS NULL;
On my system (10.2.0.1), all these run with the same plan, and thus the same amount of LIO when using the CBO:
---------------------------------------------------------------------------------
| Id  | Operation              | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |        |     1 |     8 |    25  (24)| 00:00:01 |
|   1 |  SORT AGGREGATE        |        |     1 |     8 |            |          |
|*  2 |   HASH JOIN ANTI       |        |    50 |   400 |    25  (24)| 00:00:01 |
|   3 |    TABLE ACCESS FULL   | T1     |  5000 | 20000 |    17  (12)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| T2_IDX |  4950 | 19800 |     5  (20)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"="OBJECT_ID")
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         80  consistent gets
          0  physical reads
          0  redo size
        227  bytes sent via SQL*Net to client
        234  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processedThen Duke Ganote claimed his approach using MINUS to be better, even using CBO. On my system its not:
------------------------------------------------------------------------------------
| Id  | Operation                 | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |        |     1 |    17 |    52  (35)| 00:00:01 |
|   1 |  SORT AGGREGATE           |        |     1 |    17 |            |          |
|*  2 |   HASH JOIN               |        |  5000 | 85000 |    52  (35)| 00:00:01 |
|   3 |    TABLE ACCESS FULL      | T1     |  5000 | 20000 |    17  (12)| 00:00:01 |
|   4 |    VIEW                   |        |  5000 | 65000 |    32  (41)| 00:00:01 |
|   5 |     MINUS                 |        |       |       |            |          |
|   6 |      SORT UNIQUE          |        |  5000 | 20000 |            |          |
|   7 |       TABLE ACCESS FULL   | T1     |  5000 | 20000 |    17  (12)| 00:00:01 |
|   8 |      SORT UNIQUE          |        |  4950 | 19800 |            |          |
|   9 |       INDEX FAST FULL SCAN| T2_IDX |  4950 | 19800 |     5  (20)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SET_OP"."OBJECT_ID"="RBO"."OBJECT_ID")
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        144  consistent gets
          0  physical reads
          0  redo size
        226  bytes sent via SQL*Net to client
        234  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processedThe execution plan is much more complex, T1 is FTS'ed twice and his query needs to do 2 sorts to get to the result. 
I guess this could be for a number of reasons, like:
* DBMS version
* init parameters
* difference in the way statistics where gathered
Any comments on this?
Jens 
 
 Need Help only seeing a forest behind the trees..
Jan Solca, April     27, 2007 - 11:28 am UTC
 
 
Hi Tom 
I tried to solve a simple problem with the IN /EXISTS functions but I was not successfull yet.
Let's assume 2 tables T1 and T2: 
T1 has attributes workerUnit, valid_from, valid_to 
T2 has attributes workerUnit, year (not date but string) 
T1 identifies service providers for a certain period and T2 identifies if the service providers had a budget allocated for a specific year. 
What I'm trying to do is to generate a view 
(parametrized or not) which would allow me to perform the following query "Give me the list of valid workerUnits for a given date and also if a budget was allocated to it for that given year. 
Sketch for a result set input date 24.04.2007
workerUnit  |  valid_from  |   valid_to  | hasBudget 
  945          01.01.1988     31.12.9999        1
  888          03.04.2007     31.12.9999        1
  874          08.05.2007     31.03.2010        0 
where 0 and 1 in the 'hasBudget' column identify a workerUnit which got budget for the year. Let's exercise it for the first row for instance 
T1 
workerUnit  |  valid_from  |  valid_to 
  945          01.01.1988     31.12.9999
  222          01.01.1972     24.03.2006
T2 
workerUnit  |    Year
  945            2004
  945            2005
  945            2007  
  
Thus if the input date was 22.03.2006(1) the record would have a 0 in the hasBudget. As another example this input date(1) would also retrieve the workerUnit 222 which has a valid_to of 24.03.2006 which is older than (1) but no budget has been allocated (record missing in T2) thus hasBudget 0
the record 222 looks like (still (1) has input date)
workerUnit  |  valid_from  |   valid_to  | hasBudget 
  ...
  222          01.01.1972     24.03.2006       0
  ... 
Thanks your help ..   
 
Don't Use IN/Exists
RobH, May       16, 2007 - 2:26 pm UTC
 
 
Maybe try using a join with a case
Select workerUnit, valid_from , valid_to, 
  case when to_date('YYYY',year) between valid_from and valid_to then 1 else 0 end
from t1 outer join t2 on t1.workerUnit = t1.workerUnit
 
 
or....(since you have many t2 rows)
RobH, May       16, 2007 - 2:31 pm UTC
 
 
Select workerUnit, valid_from , valid_to, 
  max(case when to_date('YYYY',year) between valid_from and valid_to then 1 else 0 end) 
from t1 
   left outer join t2 on t1.workerUnit = t1.workerUnit 
 
Not Exists vs Not IN
A reader, January   09, 2008 - 10:14 am UTC
 
 
Tom,
I want to select all the rows from a table where one column values are not in another table.
CREATE TABLE T1
(
  A1            INTEGER,
  ANOTHER_COL   NUMBER                          DEFAULT 1,
  ANOTHER_COL1  INTEGER,
  ANOTHER_COL2  NUMBER(5)
)
/
Insert into T1
   (A1, ANOTHER_COL, ANOTHER_COL1, ANOTHER_COL2)
 Values
   (1, 1, 1, 1);
Insert into T1
   (A1, ANOTHER_COL, ANOTHER_COL1, ANOTHER_COL2)
 Values
   (23, 2, 3, 34);
COMMIT;
create table t3 as 
select * from t1;
insert into t1 values (12,3,4,12);
commit;
I want to select all the rows in t1 where another_col1 values are not in t3 table.
select * from t1 where another_col1 not in (select another_col1 from t3) ;
How do we frame the query using Not exists?
Also, If I insert another set of values in T3:
insert into t3 values (13,4,'',23);
Why can't I get any rows when I have some values in T3 with null values ? 
I can understand comparing with nulls is a problem. But it should return all rows whose values are not null, avoiding that null row, shouldn't it?
It's always been a confusion for me..can you clarify?
Thanks,
 
January   09, 2008 - 10:33 am UTC 
 
where column in (set that contains non-null and some nulls)
where, we can say that column is either in that set or not (we only need look at the non-nulls to see if that value is in that set)
where column NOT IN (same set)
well, the presence of nulls - as defined by the SQL language itself, a rule if you will - makes it impossible to see if that value is "not in" that set - we cannot tell if column not in (null) is true or false - it evaluates UNKNOWN.
so,  you would in general where column not in (select * from (that set) where that_column is not null )
if that is what you want....
As for the not exists, that is pretty trivial - give that bit of work to yourself as an exercise.... 
 
 
not able to tune it...
Reene, April     04, 2008 - 9:30 am UTC
 
 
Hi Tom
I have this query,it retuns 12 rows,runs for 90 minutes.
trying to tune it , but could not,spent 5 days but to no avail.
Parsing user id: 65  (APPS)
********************************************************************************
SELECT /*+   ordered  use_hash(mtp) push_subq */ ooha.org_id "OPERATING UNIT ID"
FROM        apps.oe_transaction_types_all ott,
            apps.oe_transaction_types_tl ottt_l,
            apps.oe_order_lines_all oola,
            apps.mtl_parameters mtp,
            apps.oe_order_headers_all ooha,
            apps.hz_cust_accounts hca
WHERE  ooha.header_id = oola.header_id AND
       ooha.org_id = oola.org_id AND
       ooha.order_source_id = 1263 AND
       ott.transaction_type_id = ottt_l.transaction_type_id AND
       ott.org_id = ooha.org_id AND
       oola.line_type_id = ottt_l.transaction_type_id AND
       oola.line_type_id = ott.transaction_type_id AND
       mtp.organization_id = oola.ship_from_org_id AND
       hca.cust_account_id = ooha.sold_to_org_id AND
       (ottt_l.NAME LIKE 'GPO%FE_CONSIGNMENT_LINE' OR
       ottt_l.NAME LIKE 'GPO%FE_SHIPMENT_LINE') AND
       oola.flow_status_code IN ('SHIPPED', 'CLOSED') AND
       hca.attribute9 IS NOT NULL
       and NOT EXISTS
       ( SELECT /*+ NO_UNNEST */ 1
       FROM   apps.mtl_material_transactions mmt ,
              apps.mtl_transaction_lot_numbers mtnl ,
              apps.mtl_parameters mp ,
              apps.mtl_item_locations mil
       WHERE  mmt.inventory_item_id = oola.inventory_item_id AND
              mmt.transaction_type_id IN (420, 40, 41, 42) AND
              mmt.transaction_quantity > 0 AND
              mtnl.lot_number = ooha.order_number AND
              mtnl.transaction_id = mmt.transaction_id AND
              mmt.transaction_date > (TRUNC(SYSDATE) -90) AND
              mmt.organization_id = mp.organization_id AND
              mp.attribute7 = 'PARTS' AND
              mp.attribute11 ='SERVICE' AND
              mil.inventory_location_id = hca.attribute9 AND
              mil.organization_id = mp.organization_id
       ) AND
       NVL(oola.actual_shipment_date, oola.schedule_ship_date) > SYSDATE - 90
/
its tkporf output is 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2    690.28    4899.16    1524771   23494672          0          12
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4    690.28    4899.16    1524771   23494672          0          12
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 65  (APPS)
Rows     Row Source Operation
-------  ---------------------------------------------------
     12  NESTED LOOPS  (cr=23494672 r=1524771 w=0 time=4640213970 us)
 157696   NESTED LOOPS  (cr=5730245 r=1402270 w=0 time=3699650845 us)
 157697    HASH JOIN  (cr=5257151 r=1379549 w=0 time=3576187058 us)
 157697     TABLE ACCESS BY INDEX ROWID OE_ORDER_LINES_ALL (cr=5257149 r=1379549 w=0 time=3575115599 us)
2789657      NESTED LOOPS  (cr=13232 r=12982 w=0 time=90323177 us)
     76       HASH JOIN  (cr=277 r=82 w=0 time=175347 us)
   4597        TABLE ACCESS FULL OE_TRANSACTION_TYPES_ALL (cr=157 r=0 w=0 time=6683 us)
     76        TABLE ACCESS FULL OE_TRANSACTION_TYPES_TL (cr=120 r=82 w=0 time=157808 us)
2789580       INDEX RANGE SCAN GEMS_ONT_ORDER_LINES_N99 (cr=12955 r=12900 w=0 time=86301122 us)(object id 1863725)
    377     INDEX FULL SCAN MTL_PARAMETERS_U1 (cr=2 r=0 w=0 time=367 us)(object id 9847)
 157696    TABLE ACCESS BY INDEX ROWID OE_ORDER_HEADERS_ALL (cr=473094 r=22721 w=0 time=123022055 us)
 157697     INDEX UNIQUE SCAN OE_ORDER_HEADERS_U1 (cr=315397 r=1363 w=0 time=11565697 us)(object id 688729)
     12   TABLE ACCESS BY INDEX ROWID HZ_CUST_ACCOUNTS (cr=17764427 r=122501 w=0 time=1114177945 us)
 157696    INDEX UNIQUE SCAN HZ_CUST_ACCOUNTS_U1 (cr=315395 r=274 w=0 time=3770326 us)(object id 715003)
 157621    TABLE ACCESS BY INDEX ROWID MTL_TRANSACTION_LOT_NUMBERS (cr=17291336 r=121205 w=0 time=1096619120 us)
6897874     NESTED LOOPS  (cr=14562731 r=114663 w=0 time=1018837983 us)
3448931      NESTED LOOPS  (cr=7488435 r=112057 w=0 time=936487677 us)
 157633       NESTED LOOPS  (cr=788165 r=581 w=0 time=10022817 us)
 157633        INDEX RANGE SCAN MTL_ITEM_LOCATIONS_U1 (cr=472899 r=581 w=0 time=7984520 us)(object id 9761)
 157633        TABLE ACCESS BY INDEX ROWID MTL_PARAMETERS (cr=315266 r=0 w=0 time=1667395 us)
 157633         INDEX UNIQUE SCAN MTL_PARAMETERS_U1 (cr=157633 r=0 w=0 time=770942 us)(object id 9847)
3448931       TABLE ACCESS BY INDEX ROWID MTL_MATERIAL_TRANSACTIONS (cr=6700270 r=111476 w=0 time=924106849 us)
8273225        INDEX RANGE SCAN MTL_MATERIAL_TRANSACTIONS_N1 (cr=695175 r=32781 w=0 time=303092226 us)(object id 6205511)
3448931      INDEX RANGE SCAN MTL_TRANSACTION_LOT_NUMBERS_N1 (cr=7074296 r=2606 w=0 time=74416944 us)(object id 9976)
is there anything which can be done to improve its response time.
is push_subq is the correct choice here.
how does not exists work - please explain using the above query and the tkprof output..
what is wrong here.
thanks
 
 
An explanation I use for the IN x NOT IN
Marcus Rangel, May       19, 2008 - 8:54 pm UTC
 
 
Tom, there is an explanation I usually give for developers who ask me why "1 in (1,2,null)" returns TRUE but "3 not in (1,2,null)" returns FALSE. Please tell me what you think:
The IN operation is a series of *equality* comparisons that is ultimately resolved by an OR:
1 = 1 ? TRUE
1 = 2 ? FALSE
1 = NULL ? FALSE
TRUE OR FALSE OR FALSE = TRUE
The NOT IN operation is a series of *inequality* comparisons that is ultimately resolved by an AND:
3 != 1 ? TRUE
3 != 2 ? TRUE
3 != NULL ? FALSE
TRUE AND TRUE AND FALSE = FALSE
 
May       20, 2008 - 10:50 am UTC 
 
with the minor exception that 3!=NULL is not false, it is not true, it is UNKNOWN
ops$tkyte%ORA10GR2> select * from dual where 3 <> NULL;
no rows selected
ops$tkyte%ORA10GR2> select * from dual where NOT(3 <> NULL);
no rows selected
 
 
 
how about in subquery contains null
jian huang zheng, October   02, 2008 - 10:32 am UTC
 
 
HI TOM
 Thanks for this wonderful thread. i read:
<Quote>
Because NULL means -- gee, I don't know.  (litterally, null means Unknown)
So, the predicate
where x not in ( NULL )
evaluates to neither TRUE, nor FALSE
</qoute>
but when using in ( subquery), if subquery contains null, it will select rows ,why is that?
SQL> desc cc
 Name
 A
SQL> desc dd
 Name
 A
SQL> insert into dd values(null);
1 row created.
SQL> select *  from cc where a in ( select *  from dd);
         A
----------
         3
1 row selected.
SQL>
 
 
October   02, 2008 - 1:12 pm UTC 
 
because the rules say "where 5 in ( 5, NULL )" is true because 5 is in (5).
But "where 5 NOT IN (1, null)" is unknown - it is unknown if "5 is not in (null)"
ops$tkyte%ORA11GR1> select * from dual where 5 in (5,null);
D
-
X
ops$tkyte%ORA11GR1> select * from dual where 5 NOT in (1,null);
no rows selected
 
 
 
Could you help with this query, how to evalue it?
jian huang zheng, October   05, 2008 - 8:40 am UTC
 
 
Hello Tom
Thanks for clearing my doubts about the not in. I have following query , could you please help how conceptually evaluate this query?
  SQL> create table t1 ( x int);
Table created.
SQL> create table t2 ( x int);
Table created.
SQL> select *  from t1;
         X
----------
         1
         2
3 rows selected.
SQL> select *  from t2;
         X
----------
         1
         3
3 rows selected.
SQL> select *  from t1,t2 where exists ( select null from t2 where t1.x=t2.x);
         X          X
---------- ----------
         1          1
         1          3
         1
3 rows selected.
SQL> select *  from t1,t2 where exists ( select null from t1 where t1.x=t2.x);
         X          X
---------- ----------
         1          1
         2          1
                    1
3 rows selected.
I am not sure how those two query are executed, Most appreciated if you can give an explaination.
Thanks~
 
 
October   06, 2008 - 2:49 pm UTC 
 
... select *  from t1,t2 ...
join every row in t1 with every row in t2.
you have
1
2
and 
1
3
that results in
t1.x,t2.x
1,1
1,3
2,1
2,3
then - apply the filter:
where exists ( select null from t2 where t1.x=t2.x);
to each row.
You can always envision:
select * 
  from <table list>
where <predicate>
as a cartesian join of all tables in the table list (every row in table t1 joined with every row in t2, joined with every row in t3 and so on - so if you had 10 rows in each table - you'd end up with 10x10x10 rows in the result set) AND THEN run the predicate, the filter against them. 
 
 
very clear explaination, and one more question about 'in' execution
jian huang zheng, October   09, 2008 - 10:22 am UTC
 
 
Hi Tom
   Thanks for your explaination about the sql execution. one more question i am curious to know is:
   suppose i issue sql : select * from table1 where a in (select a from table2),suppose hash semi-join is used , does oracle have to finish the hash join for all rows before feeding back the resultset to the client or during the execution?
   Or is there a universal rule for oracle to feed back the result set, feed back during the sql execution or after the execution? 
October   09, 2008 - 11:21 am UTC 
 
with a hash (semi or not) join, the plan is typically like this:
hash join
   full scan t1
   full scan t2
One of the tables will be full scanned and hashed (hopefully into memory, but not necessarily)
Once that step has taken place, we can start full scanning the other table and probing the hash table for the matches
and start returning them.
So, you will full scan and hash one table BEFORE you get the first row back.
So, does Oracle have to finish the hash join for all rows? - No 
 
 
exists
A reader, October   09, 2008 - 10:58 pm UTC
 
 
 
 
Tuning
vikram, November  12, 2008 - 10:56 am UTC
 
 
Total Records deleteing 20,000(how to tune the query)
delete from H6054 where  entryid IN
(Select  C1 from T6054 Where C536870913=LoginName);   ---- consumes 8 secs
delete from T6054 where  C536870913=LoginName;         ---- consumes 6 secs
--COMMIT; 
November  13, 2008 - 4:12 pm UTC 
 
sounds good to me - but then I have no information of any use, so I get to make it up...
there are lots of indexes, and the delete hits them all.
there is no index on entryid and you are removing 20,000 out of 2,000,000 records
there is no index on C536870913 and you are finding 1 record out of 100,000
things like that - I'll just assume they are true - so therefore, 8 and 6 seconds is really fast.
no tables
no indexing scheme
no plans
no tkprofs
in short - nothing to really look at. 
 
 
IN & Exists
Jyothsna, May       03, 2010 - 1:44 am UTC
 
 
In Rooms table ive 6 rows (2,4,8,16,32,64 )
if i fire 
select roomsize from rooms
where 30>roomsize  
Then o/p is 4 rows (2,4,8,16)
But the following gives
select roomsize  from rooms 
where exists  (select 1 from rooms
              where 30 > roomsize)
It gives all the 6 rows 
Why it happens???
 
May       06, 2010 - 11:35 am UTC 
 
what the heck is a rooms table?
anyway, if there are any rows in rooms such that "30 > roomsize" then the subquery "(select 1 from rooms
              where 30 > roomsize)" obviously returns a record - doesn't it (we started with the supposition that there are some rows in rooms such that 30>roomsize).
Now, if that row exists - then "select x from t where exists (that subquery we know returns AT LEAST one row)" - would return every row from T - since the subquery by our definition here returns at least one row.
 
 
 
explain plan for in and exists
Chirayu, June      30, 2010 - 10:47 pm UTC
 
 
Hi Tom,
Oracle Database Version 11G
We had a bug in our application where in the original query was like this..
select bank_branch_name, bank_name, bank_branch_id
   from xx_bank_branches
WHERE BANK_BRANCH_ID IN
(SELECT  BANK_BRANCH_ID
from    xx_bank_accounts a
WHERE   a.ACCOUNT_TYPE <> 'EXTERNAL' )
order by bank_branch_name,  bank_name ;
I thought of re-writing the query as
 select bank_branch_name, bank_name, bank_branch_id
   from xx_bank_branches b
   where EXISTS ( SELECT 'x' FROM
xx_BANK_ACCOUNTS A
WHERE A.ACCOUNT_TYPE <> 'EXTERNAL'
and a.bank_branch_id = b.bank_branch_id )
order by bank_branch_name,  bank_name;
However when I compared the explain plan.. the plan generated by both the queries was exactly the same..
My question is did CBO treat both the queries as similar ? 
July      06, 2010 - 2:13 pm UTC 
 
answered this exact question elsewhere.  
 
 
Yes, here
Sutaria Chirayu, July      07, 2010 - 10:02 pm UTC
 
 
 
Query Tuning
Vicky, March     15, 2011 - 1:43 am UTC
 
 
/* Formatted on 3/15/2011 10:37:11 AM (QP5 v5.126.903.23003) */
SELECT                                                            --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
      gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
      ,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
      ,                                                                                                      --gcc.segment4,--RICO
       SUBSTR(ap_main.description, 1, 40) journal_description
      ,(CASE WHEN gcc.segment3 IN ('1001', '1002') THEN SUBSTR(ap_main.doc_description, 1, 16) ELSE NULL END) document_1
      ,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
      ,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
      ,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
      ,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
      ,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
  FROM apps.gl_je_headers gjh
      ,apps.gl_je_lines gjl
      ,apps.gl_code_combinations gcc
      ,--gl.gl_period_statuses   gps ,
       (  SELECT ap.JE_HEADER_ID, ap.je_line_num, MAX(ap.description) description, MAX(ap.doc_description) doc_description
            FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, NULL doc_description
                    FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
                        ,apps.gl_je_headers gjh, apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
                   WHERE     aeh.ae_header_id = ael.ae_header_id
                         AND AEL.SOURCE_TABLE = 'AP_INVOICE_DISTRIBUTIONS'
                         AND AEL.SOURCE_ID = AID.INVOICE_DISTRIBUTION_ID
                         AND ael.gl_sl_link_id = imp.gl_sl_link_id
                         AND imp.Gl_Sl_Link_Table = 'APECL'
                         AND gjh.je_header_id = imp.je_header_id
                         AND gjh.je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gjl.je_line_num = imp.je_line_num
                         AND aeh.gl_transfer_flag = 'Y'
                         AND aeh.set_of_books_id = 1001
                         AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                  UNION
                  --select GJH.JE_HEADER_ID,gjl.je_line_num ,gjh.period_name ,aid.description
                  SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, b.description doc_description
                    FROM apps.ap_ae_headers_all aeh
                        ,apps.ap_ae_lines_all ael
                        ,apps.gl_import_references imp
                        ,apps.ap_invoice_payments_all pay
                        ,apps.ap_invoices_all inv
                        ,apps.gl_je_headers gjh
                        ,apps.gl_je_lines gjl
                        ,apps.Ap_Invoice_Distributions_All aid
                        ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                        ,(SELECT fvv.FLEX_VALUE, fvv.description
                            FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                           WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
                         b
                   WHERE     aeh.ae_header_id = ael.ae_header_id
                         AND ael.source_table = 'AP_INVOICE_PAYMENTS'
                         AND ael.source_id = pay.invoice_payment_id
                         AND pay.invoice_id = inv.invoice_id
                         AND ael.gl_sl_link_id = imp.gl_sl_link_id
                         AND imp.Gl_Sl_Link_Table = 'APECL'
                         AND gjh.je_header_id = imp.je_header_id
                         AND gjh.je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gjl.je_line_num = imp.je_line_num
                         AND aeh.set_of_books_id = 1001
                         AND aeh.gl_transfer_flag = 'Y'
                         AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         AND pay.invoice_id = jcca.TRX_id(+)
                         AND jcca.SOURCE(+) = 'AP'
                         AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
                         AND inv.invoice_id = aid.invoice_id
                  UNION
                  SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, aid.description, NULL doc_description
                    FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
                        ,apps.ap_invoices_all inv, apps.gl_je_headers gjh, apps.gl_je_lines gjl
                        ,apps.Ap_Invoice_Distributions_All aid
                   WHERE     aeh.ae_header_id = ael.ae_header_id
                         AND ael.source_table = 'AP_INVOICES'
                         AND ael.source_id = inv.invoice_id
                         AND ael.gl_sl_link_id = imp.gl_sl_link_id
                         AND imp.Gl_Sl_Link_Table = 'APECL'
                         AND gjh.je_header_id = imp.je_header_id
                         AND gjh.je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gjl.je_line_num = imp.je_line_num
                         AND aeh.set_of_books_id = 1001
                         AND aeh.gl_transfer_flag = 'Y'
                         AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         AND inv.invoice_id = aid.invoice_id
                  UNION
                  SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name, gjl.description, b.description doc_description
                    FROM apps.ap_ae_headers_all aeh
                        ,apps.ap_ae_lines_all ael
                        ,apps.gl_import_references imp
                        ,apps.AP_CHECKS_ALL AC
                        ,apps.gl_je_headers gjh
                        ,apps.gl_je_lines gjl
                        ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                        ,(SELECT fvv.FLEX_VALUE, fvv.description
                            FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                           WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
                         b
                        ,apps.ap_invoice_payments_all pay
                   WHERE     aeh.ae_header_id = ael.ae_header_id
                         AND ael.source_table = 'AP_CHECKS'
                         AND ael.source_id = AC.CHECK_ID
                         AND ael.gl_sl_link_id = imp.gl_sl_link_id
                         AND imp.Gl_Sl_Link_Table = 'APECL'
                         AND gjh.je_header_id = imp.je_header_id
                         AND gjh.je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gjl.je_line_num = imp.je_line_num
                         AND aeh.set_of_books_id = 1001
                         AND aeh.gl_transfer_flag = 'Y'
                         AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         AND ac.check_id = pay.check_id
                         AND pay.invoice_id = jcca.TRX_id(+)
                         AND jcca.SOURCE(+) = 'AP'
                         AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM) ap
        GROUP BY ap.JE_HEADER_ID, ap.je_line_num) ap_main
 WHERE     gjh.set_of_books_id = 1001
       AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
       --and   gjh.period_name  =gps.period_name
       AND gjh.je_header_id = gjl.je_header_id
       AND gjl.code_combination_id = gcc.code_combination_id
       AND ap_main.je_header_id = gjh.je_header_id
       AND ap_main.je_line_num = gjl.je_line_num
UNION ALL
--ar
SELECT                                                            --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
      gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
      ,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 ccount_code
      ,                                                                                                      --gcc.segment4,--RICO
       SUBSTR(gjl.description, 1, 40) journal_description
      ,(CASE WHEN gcc.segment3 IN ('1001', '1002') THEN SUBSTR(ar_main.doc_description, 1, 16) ELSE NULL END) document_1
      ,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
      ,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
      ,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
      ,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
      ,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
  FROM apps.gl_je_headers gjh
      ,apps.gl_je_lines gjl
      ,apps.gl_code_combinations gcc
      ,(  SELECT ar.JE_HEADER_ID, ar.je_line_num, MAX(ar.description) doc_description
            FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, b.description
                    FROM apps.gl_je_headers gjh
                        ,apps.gl_je_lines gjl
                        ,apps.gl_import_references gir
                        ,-- ar_cash_receipts_all      acr,
                         apps.AR_CASH_RECEIPTS_ALL ACR
                        ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                        ,(SELECT fvv.FLEX_VALUE, fvv.description
                            FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                           WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
                         b
                   WHERE     gjh.je_source = 'Receivables'
                         AND gjh.je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gir.je_header_id = gjh.je_header_id
                         AND gir.je_line_num = gjl.je_line_num
                         -- NTP 4-Mar-11
                         --AND SUBSTR (gir.reference_2, 1, INSTR (gir.reference_2, 'C') - 1) =
                         --          TO_CHAR (acr.CASH_RECEIPT_ID)
                         AND TO_NUMBER(SUBSTR(gir.reference_2, 1, INSTR(gir.reference_2, 'C') - 1)) = acr.CASH_RECEIPT_ID
                         AND ACR.CASH_RECEIPT_ID = jcca.TRX_id(+)
                         AND jcca.SOURCE(+) = 'AR'
                         AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
                  UNION
                  SELECT GJH.JE_HEADER_ID, gjl.je_line_num, b.description
                    FROM apps.gl_je_headers gjh
                        ,apps.gl_je_lines gjl
                        ,apps.gl_import_references gir
                        ,apps.ar_receivable_applications_all ara
                        ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                        ,(SELECT fvv.FLEX_VALUE, fvv.description
                            FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                           WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID)
                         b
                   WHERE                                                                            --gjh.je_source= 'Receivables'
                        gjh  .je_header_id = gjl.je_header_id
                         AND gjh.set_of_books_id = 1001
                         AND gir.je_header_id = gjh.je_header_id
                         AND gir.je_line_num = gjl.je_line_num
                         AND NVL(gir.reference_2, 0) = TO_CHAR(ara.applied_customer_trx_id)
                         AND ara.applied_customer_trx_id = jcca.TRX_id(+)
                         AND jcca.SOURCE(+) = 'AR'
                         AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM) ar
        GROUP BY ar.JE_HEADER_ID, ar.je_line_num) ar_main
 WHERE     gjh.set_of_books_id = 1001
       AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
       --and   gjh.period_name  =gps.period_name
       AND gjh.je_header_id = gjl.je_header_id
       AND gjl.code_combination_id = gcc.code_combination_id
       AND ar_main.je_header_id = gjh.je_header_id
       AND ar_main.je_line_num = gjl.je_line_num
UNION ALL
--gl
SELECT                                                            --gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
      gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
      ,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
      ,                                                                                                     ---gcc.segment4,--RICO
       SUBSTR(gjl.description, 1, 40) journal_description
      ,DECODE(gcc.segment3, '1001', SUBSTR(b.description, 1, 16), '1002', SUBSTR(b.description, 1, 16), NULL) document_1
      ,NULL document_2, SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
      ,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
      ,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
      ,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
      ,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
  FROM apps.gl_je_headers gjh
      ,apps.gl_je_lines gjl
      ,apps.gl_code_combinations gcc
      ,(SELECT fvv.FLEX_VALUE, fvv.description
          FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
         WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS' AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
 WHERE     gjh.je_header_id = gjl.je_header_id
       AND gjh.set_of_books_id = 1001
       AND gjl.global_attribute1 = B.FLEX_VALUE
       AND gjl.code_combination_id = gcc.code_combination_id
       AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
UNION ALL
SELECT                                                           ---gjh.period_name,gjh.je_Source,gjh.Name,gjh.Description,---rico
      gjh.je_header_id journal_ctrl_num, gjl.je_line_num sequence_id
      ,gcc.segment1 || gcc.segment10 || gcc.segment2 || gcc.segment3 || gcc.segment4 || gcc.segment5 account_code
      ,                                                                                                     ---gcc.segment4,--RICO
       SUBSTR(gjl.description, 1, 40) journal_description, NULL document_1, NULL document_2
      ,SUBSTR(gjl.attribute1, 1, 32) reference_code, NVL(gjl.accounted_dr, -gjl.accounted_cr) balance
      ,NVL(gjl.entered_dr, -gjl.entered_cr) nat_balance, SUBSTR(gjh.currency_code, 1, 8) nat_cur_code
      ,gjh.currency_conversion_rate rate, gjh.currency_conversion_type rate_type_home
      ,DECODE(gjh.status, 'P', 1, 'U', 0, -1) posted_flag, TO_CHAR(GJH.POSTED_DATE, 'yyyy-mm-dd hh24:mi:ss') date_posted
      ,TO_CHAR(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') date_transfer
  FROM apps.gl_je_headers gjh, apps.gl_je_lines gjl, apps.gl_code_combinations gcc
 --      gl.gl_period_statuses       gps
 WHERE     gjh.set_of_books_id = 1001
       AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
       --and  gjh.period_name         =gps.period_name
       AND gjh.je_header_id = gjl.je_header_id
       AND gjh.set_of_books_id = 1001
       AND gjl.code_combination_id = gcc.code_combination_id
       AND(GJH.JE_HEADER_ID, GJL.JE_LINE_NUM) NOT IN
                (SELECT A.JE_HEADER_ID, A.je_line_num
                   FROM (SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael
                               ,apps.gl_import_references imp, apps.gl_je_headers gjh
                               ,apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
                          WHERE     aeh.ae_header_id = ael.ae_header_id
                                AND AEL.SOURCE_TABLE = 'AP_INVOICE_DISTRIBUTIONS'
                                AND AEL.SOURCE_ID = AID.INVOICE_DISTRIBUTION_ID
                                AND ael.gl_sl_link_id = imp.gl_sl_link_id
                                AND imp.Gl_Sl_Link_Table = 'APECL'
                                AND gjh.je_header_id = imp.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gjh.je_header_id = gjl.je_header_id
                                AND gjl.je_line_num = imp.je_line_num
                                AND aeh.set_of_books_id = 1001
                                AND aeh.gl_transfer_flag = 'Y'
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
                               ,apps.ap_invoice_payments_all pay, apps.ap_invoices_all inv, apps.gl_je_headers gjh
                               ,apps.gl_je_lines gjl, apps.Ap_Invoice_Distributions_All aid
                          WHERE     aeh.ae_header_id = ael.ae_header_id
                                AND ael.source_table = 'AP_INVOICE_PAYMENTS'
                                AND ael.source_id = pay.invoice_payment_id
                                AND pay.invoice_id = inv.invoice_id
                                AND ael.gl_sl_link_id = imp.gl_sl_link_id
                                AND imp.Gl_Sl_Link_Table = 'APECL'
                                AND gjh.je_header_id = imp.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gjh.je_header_id = gjl.je_header_id
                                AND gjl.je_line_num = imp.je_line_num
                                AND aeh.set_of_books_id = 1001
                                AND aeh.gl_transfer_flag = 'Y'
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                                AND inv.invoice_id = aid.invoice_id
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
                               ,apps.ap_invoices_all inv, apps.gl_je_headers gjh, apps.gl_je_lines gjl
                               ,apps.Ap_Invoice_Distributions_All aid
                          WHERE     aeh.ae_header_id = ael.ae_header_id
                                AND ael.source_table = 'AP_INVOICES'
                                AND ael.source_id = inv.invoice_id
                                AND ael.gl_sl_link_id = imp.gl_sl_link_id
                                AND imp.Gl_Sl_Link_Table = 'APECL'
                                AND gjh.je_header_id = imp.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gjh.je_header_id = gjl.je_header_id
                                AND gjl.je_line_num = imp.je_line_num
                                AND aeh.set_of_books_id = 1001
                                AND aeh.gl_transfer_flag = 'Y'
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                                AND inv.invoice_id = aid.invoice_id
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.ap_ae_headers_all aeh, apps.ap_ae_lines_all ael, apps.gl_import_references imp
                               ,apps.AP_CHECKS_ALL AC, apps.gl_je_headers gjh, apps.gl_je_lines gjl
                          WHERE     aeh.ae_header_id = ael.ae_header_id
                                AND ael.source_table = 'AP_CHECKS'
                                AND ael.source_id = AC.CHECK_ID
                                AND ael.gl_sl_link_id = imp.gl_sl_link_id
                                AND imp.Gl_Sl_Link_Table = 'APECL'
                                AND gjh.je_header_id = imp.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gjh.je_header_id = gjl.je_header_id
                                AND gjl.je_line_num = imp.je_line_num
                                AND aeh.set_of_books_id = 1001
                                AND aeh.gl_transfer_flag = 'Y'
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         --ar
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.gl_je_headers gjh
                               ,apps.gl_je_lines gjl
                               ,apps.gl_import_references gir
                               ,-- ar_cash_receipts_all          acr,
                                apps.AR_CASH_RECEIPTS_ALL ACR
                               ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                               ,(SELECT fvv.FLEX_VALUE, fvv.description
                                   FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                                  WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
                                        AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
                          WHERE     gjh.je_source = 'Receivables'
                                AND gjh.je_header_id = gjl.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gir.je_header_id = gjh.je_header_id
                                AND gir.je_line_num = gjl.je_line_num
                                AND TO_NUMBER(SUBSTR(gir.reference_2, 1, INSTR(gir.reference_2, 'C') - 1)) = acr.CASH_RECEIPT_ID
                                AND ACR.CASH_RECEIPT_ID = jcca.TRX_id(+)
                                AND jcca.SOURCE(+) = 'AR'
                                AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM apps.gl_je_headers gjh
                               ,apps.gl_je_lines gjl
                               ,apps.gl_import_references gir
                               ,apps.ar_receivable_applications_all ara
                               ,apps.JA_CN_CFS_ACTIVITIES_ALL JCCA
                               ,(SELECT fvv.FLEX_VALUE, fvv.description
                                   FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                                  WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
                                        AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
                          WHERE                                                                     --gjh.je_source= 'Receivables'
                               gjh  .je_header_id = gjl.je_header_id
                                AND gir.je_header_id = gjh.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gir.je_line_num = gjl.je_line_num
                                AND NVL(gir.reference_2, 0) = TO_CHAR(ara.applied_customer_trx_id)
                                AND ara.applied_customer_trx_id = jcca.TRX_id(+)
                                AND jcca.SOURCE(+) = 'AR'
                                AND b.FLEX_VALUE(+) = JCCA.DETAILED_CFS_ITEM
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)
                         --gl
                         UNION
                         SELECT GJH.JE_HEADER_ID, gjl.je_line_num, gjh.period_name
                           FROM gl.gl_je_headers gjh
                               ,gl.gl_je_lines gjl
                               ,(SELECT fvv.FLEX_VALUE, fvv.description
                                   FROM apps.FND_FLEX_VALUES_VL FVV, apps.FND_FLEX_VALUE_SETS FVS
                                  WHERE FVS.FLEX_VALUE_SET_NAME = 'JA_CN_CASHFLOW_ITEMS'
                                        AND FVV.FLEX_VALUE_SET_ID = FVS.FLEX_VALUE_SET_ID) b
                          WHERE     gjh.je_header_id = gjl.je_header_id
                                AND gjh.set_of_books_id = 1001
                                AND gjl.global_attribute1 = B.FLEX_VALUE
                                AND(GJH.PERIOD_NAME = :cp_period_name OR GJH.PERIOD_NAME = :cp_prior_period_name)) A
                  WHERE (A.PERIOD_NAME = :cp_period_name OR A.PERIOD_NAME = :cp_prior_period_name)); 
 
Query tuning
Vicky, March     15, 2011 - 1:44 am UTC
 
 
hi Tom,
Could you please help me in tuning this query.
Thanks,
Vicky 
March     15, 2011 - 8:23 am UTC 
 
in a word - no
in a sentence - it is huge, it is against tables I don't know, containing data I don't have a clue about, answering a question I don't understand.  The odds that it is 100% "correct" (that every outer join is really necessary, that the approach taken is sound, that is even returns the right answer) is near 0% in my experience - therefore reverse engineering it - which itself would take a really really long time - isn't fruitful because the query is probably a bad idea in the first place.
In order to "tune a query", one needs:
o intimate knowledge of the schema - all of the constraints, everything.
o intimate knowledge of the question that needs to be answered
What one typically doesn't want (surprisingly) is the
o query you are trying to tune
Not that I'm asking you for the schema, etc - this is not "ask tom to tune your query" 
 
 
Tuning on Exist
Prince, March     18, 2011 - 12:44 am UTC
 
 
Hi Tom,
Please explain me what is wrong in performance in this query and how to tune this. Thanks in advance.. 
UPDATE  customer_table aaa
    SET     (acc_name, acc_flg, act_date) = 
                                          (
                                            SELECT 
                                              DISTINCT name,
                                              'Y',
                                              DECODE(to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'), NULL, aaa.status_date, to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'))
                                            FROM 
                                              account acc
                                            WHERE
                                              acc.acc_num = aaa.customer_id
                                              AND aaa.success_flg = 'Y'
                                              AND aaa.error_flg = 'N'
                                          )
    WHERE EXISTS
            (
                SELECT 
                  DISTINCT name,
                  'Y',
                  DECODE(to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'), NULL, aaa.status_date, to_char(REFERENCE_START_DT, 'DD/MM/YYYY HH24.MI.SS'))
                FROM 
                  account acc
                WHERE
                  acc.acc_num = aaa.customer_id
                  AND aaa.success_flg = 'Y'
                  AND aaa.error_flg = 'N'
            ); 
March     18, 2011 - 7:51 am UTC 
 
If the optimizer is unable to merge the correlated subquery into the update - it might have to execute it at least once per row in the update.
I like MERGE for this
merge into customer_table aaa
using account acc
ON (aaa.customer_id = acc.customer_id 
    and aaa.success_flg='Y' 
    and aaa.error_flg='N')
when matched then update set acc_name = acc.acc_name, acc_flg = 'Y', act_date = nvl(acc.act_date,status_date);
Instead of decoding and converting to strings (I assume that status_date is a date? ) just nvl it.  
If the distinct is necessary - there is something really wrong with  your data model, in a big way.
There is no way the predicate:
                                            WHERE
                                              acc.acc_num = aaa.customer_id
                                              AND aaa.success_flg = 'Y'
                                              AND aaa.error_flg = 'N'
                                          )
on aaa.success_flg and aaa.error_flg belonged in the subquery - that should have been set in the where clause of the update itself - where exists (...) AND success_flg = 'Y' and error_flg = 'N'. 
 
 
A reader, March     21, 2011 - 12:48 am UTC
 
 
In this case the account acc is the remote table which is having more than 30millions of rec.
earlier I was using the hint driving_site(acc) for the select statement to improve the performance.
So Now how I will use this hint or is there any alternate to improve the performance ?
Will appreciate an earliest reply.
Thanks
Prince 
March     21, 2011 - 9:44 am UTC 
 
The driving site hint is probably your best bet. 
 
 
Doubt in NOT IN
Sridhar, May       27, 2011 - 1:19 am UTC
 
 
Dear Tom,
We have been experimenting in the usage of IN and NOT IN and we came across a small scenario:
When we executed the following SQL:
select SYSDATE from dual where '2' not in ('1','3')
We got sysdate as an Output, but when we used the following SQL:
select SYSDATE from dual where '2' not in ('1','3', NULL)
0 Records was selected. 
What could be the reason for this? 
May       27, 2011 - 10:35 am UTC 
 
it works that way by definition.
When you compare "NON-NULL-VALUE" to "NULL-VALUE" the answer is "I don't know", it is not true, it is not false.
Therefore, it is "unknown" whether '2' is not in ( NULL ).
With in - we can "know"
where '2' in ( '1', '2', null )
we Know 2 is in that set.  But we don't know if '3' is NOT IN that set...
It is by definition the way it works.  Part of SQL 
 
 
Very Good
pradeep sharma, August    08, 2011 - 6:32 am UTC
 
 
Amazing Question & Answers 
 
why EXISTS ?
A reader, October   24, 2011 - 11:04 am UTC
 
 
Hi Tom,
when 'IN' and 'EXISTS' are interchangeable and we can do everything with IN whatever we can do with EXITS –
Why was the operator EXISTS introduced in ORACLE ?
Are there any special cases where we must use EXISTS only and can’t use ‘IN’ ?
Many thanks
 
October   24, 2011 - 11:12 am UTC 
 
Exists was introduced in SQL - the language specification.  It is not an Oracle thing - it is a sql thing.
It is primarily syntactic convenience, semantic pleasantness.  Sometimes "IN" is just "better sounding", sometimes exists is.
In and Exists can be used interchangeable.
Not In and Not Exists cannot - they are different (regarding treatment of nulls).
Just like you can say:
select * from t;
select ALL * from t;
interchangeably, or
select distinct * from t;
select unique * from t;
interchangeably...
In and exists both exist for the same reason "for" and "while" loops exist (even with goto which can be used instead of for or while as well)...
 
 
 
subquery
A reader, October   25, 2011 - 4:41 am UTC
 
 
Hi Tom,
1.Scalar subquery is a SPECIAL case of Single row return subquery which returns Single COLOUMN from single ROW.
We can use Scalar Subqueries wherever we can use single row subquery with single column and there are NO any differences between Scalar subquery and Single row retun subquery with single coloumn.
Could you please clarify if the above correct ?
2.I already asked the below question in new thread but did not get answer -SORRY for that.. Can you please confirm if this is correct that 
a)when the subquery is non-correlated and
b) when subquery is having aggregate functions count(*) with group by and 
c) when it returns multiple rows - 
First we will execute the subquery once and STORE the results in PGA -then we will execute main query with this as input ??
ex: select t.x,t.y from T
    where t.x > (select max(a) from s group by b)
Thanks Tom for your help all the way !!
 
October   25, 2011 - 7:10 am UTC 
 
1) what is a single row subquery?
You can use scalar subqueries anywhere you can use an expression.
select * from t where x = 10;
select * from t where x = (select 10 from dual);
There, a scalar subquery was used to replace the expression "10".
Now, if you have:
select * from t where x in (select 10 from dual);
the select 10 from dual is not a scalar subquery in that context, it is just a "subquery".  
2) No, that is not how it will happen in all cases.  We can and will do view merging when we find it to be the appropriate approach.  
Never assume some order of operation outside of expressions with ()'s.  That is
(5+3)/2 
has a definite order of operation - but 
where x > (5+3)/2  and (y < 55 and z > 4)
You cannot assume that y&z are evaulated and then x is (or that x is, and then y and z, or etc etc etc). 
All you know is (5+3)/2 - IF evaluated - will be done 'correctly' and everything else happens in some order.
Subqueries, inline views, etc etc etc - all can be moved around, rewritten and so on.
Your query in general does not work - select max(a) from s group by b would IN GENERAL return 0 to N records where N can be any number greater than zero.  To see "t.x > (select max(a) from s group by b)" work - either
o s would have to be empty, that query returns 0 rows.
o b is unique in S, that query returns 1 rows.
If b has more than one value, that query returns more than 1 rows and will fail in this context.  You need a scalar subquery here!
Now, if you wrote instead;
where t.x > ANY(select max(a) from s group by b)
or
where t.x > ALL(select max(a) from s group by b)
then your query would become a subquery and could work even if b was not unique in s.
 
 
 
A reader, October   25, 2011 - 9:30 am UTC
 
 
Hi Tom,
thanks for the reply , It was my misatke in the 2nd point - missing ANY..
ok, you said -
select * from t where x = (select 10 from dual);
There, a scalar subquery was used to replace the expression "10".
Now, if you have:
select * from t where x in (select 10 from dual);
the select 10 from dual is not a scalar subquery in that context, it is just a "subquery". 
Why - is it because of the operator 'IN' ?
 
October   25, 2011 - 11:53 am UTC 
 
yes, IN requires a set
= requires an expression.
a scalar subquery is an expression of sorts.
 
 
 
The algorithm by which EXIST, NOT EXISTS, In & NOT IN operates
NB, November  15, 2011 - 3:45 pm UTC
 
 
Can u explain the diff between the Output of these queries?? i.e.  The algorithm by which EXIST, NOT EXISTS, In & NOT IN operates.
SELECT *  FROM ALL_WKSCRATCHPAD_DB.TB1;?? 
SELECT * FROM  ALL_WKSCRATCHPAD_DB.TB2;
 Number          Name
1 brad         30                     
2 tom          20                     
3 John         10     
 Number           Name
1 brad         30                     
2 tom          20                     
3 anil         40                     
                
LOCKING ROW FOR ACCESS
SELECT *
FROM   ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE    NOT EXISTS (SELECT ''
                   FROM    ALL_WKSCRATCHPAD_DB.TB2 T2
                   WHERE  T1.Name = T2.name)
                ;
                
Ans:
 Number          Name
1 John         10                     
---------------------------
LOCKING ROW FOR ACCESS
SELECT *
FROM   ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE   T1.Name NOT IN (SELECT T2.name
                   FROM    ALL_WKSCRATCHPAD_DB.TB2 T2
                   WHERE  T1.Name = T2.name)
                ;
 Number           Name
1 John         10                     
----------------------------------                
                
LOCKING ROW FOR ACCESS
SELECT *
FROM   ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE     EXISTS (SELECT ''
                   FROM    ALL_WKSCRATCHPAD_DB.TB2 T2
                   WHERE  T1.Name <> T2.name)
                ;
 Number           Name
1 brad         30                     
2 tom          20                     
3 John         10                     
-------------------------------
LOCKING ROW FOR ACCESS
SELECT *
FROM   ALL_WKSCRATCHPAD_DB.TB1 T1
WHERE   T1.Name  IN (SELECT T2.name
                   FROM    ALL_WKSCRATCHPAD_DB.TB2 T2
                   WHERE  T1.Name <> T2.name)
                ;       
 Number       Name
                    
   
November  16, 2011 - 9:33 am UTC 
 
IN = Exists, they the same.
NOT IN is a *little* different from NOT EXISTS, but usually the same.
select * from t1 where not exists (select null from t2 where t2.x = t1.x);
select * from t1 where t1.x NOT IN (select t2.x from t2);
IF t2.x is NOT NULL - those two queries are the same.
IF t2.x is NULLABLE and there is at least one value of t2.x that is null in T2 - then the second query returns ZERO rows - whereas the not exists might return some rows.
but if you have:
select * from t1 where t1.x NOT IN (select t2.x from t2 where t2.x is not null)
then they would be the same again.
 
 
 
A reader, November  16, 2011 - 1:30 pm UTC
 
 
<quote>
Your query in general does not work - select max(a) from s group by b would IN GENERAL return 0 to N records where N can be any number greater than zero. To see "t.x > (select max(a) from s group by b)" work - either 
o s would have to be empty, that query returns 0 rows. 
o b is unique in S, that query returns 1 rows. 
<quote>
little correction....
for second point it should not be unique but all record should have same value for b in order to return single row 
 
November  16, 2011 - 1:33 pm UTC 
 
for second point it should not be unique but all record should have same value 
for b in order to return single row 
no, b must be unique in S.
If today - b has only one value - but tomorrow it might have more than one - it would not work.
Therefore, for this query to work, B must be unique in S.
 
 
 
A reader, November  16, 2011 - 2:02 pm UTC
 
 
"Therefore, for this query to work, B must be unique in S."
in that case "(select max(a) from s group by b)" will return multiple rows and give multiple rows return error does not execute at all... 
November  17, 2011 - 6:43 pm UTC 
 
ah, I see the point now, yes, B must have a single unique value. 
 
 
not in vs not exists - primary key
Surya, May       01, 2012 - 6:09 pm UTC
 
 
Hi Tom,
First of all, thank you very much for spending your valuable time in answering all our questions.
I have a question regarding not in vs not exists.
db version: 11gr2
Suppose I have a query like this,
select * from tab1 t1 where
(col1,col2,col3) not in (select col1,col2,col3 from tab2 t2);
If (col1,col2,col3) in tab2 is a primary key and so doesn't return any NULL values, can I rewrite the above query as below?
select * from tab1 t1 where
not exists(select null from tab2 t2 where t2.col1=t1.col1 and t2.col2=t1.col2 and t2.col3=t1.col3);
Appreciate all your help. Thank you. 
May       02, 2012 - 1:07 pm UTC 
 
yes, they would be semantically equivalent as long as c1,c2,c3 are not null 
 
 
Not Exists computationally expensive
Nagakiran, July      24, 2012 - 8:51 pm UTC
 
 
Hi Tom,
I have been trying to fetch rows from table A if it doesn't exist multiple times in table B..
Eg: Table A 
ID    Value 
1       ABC
1       ABC
1       ABC
2       DEF
2       DEF
Table B 
1       ABC
1       ABC
2       DEF
When I compare,these two tables,I need to show that there are two rows in the output as 
as ID   Value
    1    ABC
    2    DEF
I tried using the following query
select A.ID ,A.value, 
from A
where not exists (select * from B where A.id= B.id and A.value= B.value).
However,I don't get the required output.
I tried using LEFT OUTER JOIN too,
SELECT A.ID, A.value LEFT OUTER JOIN B 
    ON A.ID = B.ID AND A.value = B.value 
Even this doesn't work..
Kindly help me on this.
  
July      30, 2012 - 9:18 am UTC 
 
no creates
no inserts
no look 
 
 
not exists works strange
pfunk, August    10, 2012 - 2:14 am UTC
 
 
Hi Tom, 
i have faced with the strange query execution behavior.
In brief, I has two tables.
First - rem3_credit with "LINK" column as primary key.
Second - credit_doc with "LINK" column as primary key and nullable column "LINK_UP" referenced on rem3_credit.link.
Data was loaded via SQL*Loader with setting DIRECT=Y, so reference constraint had not checked during load.
I know about special settings in SQL*Loader which force reference constraints been checked after load but interesting issue was found when I decide to find rows in rem3_credit that are not in credit_doc.
My query was:
SQL> select * from rem3_credit rc where not exists (select * from credit_doc cd where cd.link=rc.link_up);
no rows selected
But actually there was one!
SQL> select rc.link,rc.link_up fk_link, cd.link pk_link from rem3_credit rc, credit_doc cd where cd.link(+)=rc.link_up and cd.link is null;
      LINK    FK_LINK    PK_LINK
---------- ---------- ----------
      2202      24031
There is the same result if i`d query such way:
SQL> select link,link_up from rem3_credit rc where not exists (select * from credit_doc cd where cd.link=rc.link_up and cd.link is not null);
      LINK    LINK_UP
---------- ----------
      2202      24031
But why I need to write additional predicate if I know (and Oracle knows) that credit_doc.link is primary key and can`t be null?
Let`s take a look at execution plan:
SQL> select * from TABLE(dbms_xplan.display_cursor('','','ALLSTATS LAST'));
SQL_ID  1cbvtut47qfv1, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ * from rem3_credit rc where not
exists (select * from credit_doc cd where cd.link=rc.link_up)
Plan hash value: 1161649700
-------------------------------------------------------------------------------------------
| Id  | Operation         | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |      1 |        |      0 |00:00:00.01 |      15 |
|*  1 |  TABLE ACCESS FULL| REM3_CREDIT |      1 |      1 |      0 |00:00:00.01 |      15 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RC"."LINK_UP" IS NULL)Execution plan of "correct" query is:
SQL> select * from TABLE(dbms_xplan.display_cursor('','','ALLSTATS LAST'));
SQL_ID  av9w7ftq1bu06, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS */ rc.link,rc.link_up fk_link,
cd.link pk_link from rem3_credit rc, credit_doc cd where
cd.link(+)=rc.link_up and cd.link is null
Plan hash value: 2146757845
-----------------------------------------------------------------------------------------------
| Id  | Operation           | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |               |      1 |        |      1 |00:00:00.01 |     452 |
|*  1 |  FILTER             |               |      1 |        |      1 |00:00:00.01 |     452 |
|   2 |   NESTED LOOPS OUTER|               |      1 |      8 |    797 |00:00:00.01 |     452 |
|   3 |    TABLE ACCESS FULL| REM3_CREDIT   |      1 |    798 |    797 |00:00:00.01 |      16 |
|*  4 |    INDEX UNIQUE SCAN| PK_CREDIT_DOC |    797 |  12208 |    796 |00:00:00.01 |     436 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("CD"."LINK" IS NULL)
   4 - access("CD"."LINK"="RC"."LINK_UP")I suppose there is some kind of query rewrite or transformation. Or simply a bug. By the way:
SQL> select * from v$version;
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
Can you give some clarification about what happening here? Thank you in advance. 
August    17, 2012 - 1:39 pm UTC 
 
please utilize support for something like this. 
 
 
not exists works strange
pfunk, August    10, 2012 - 2:44 am UTC
 
 
Sorry, i`ve made a mistake in table description.
Correct description is:
First - rem3_credit with "LINK" column as primary key and nullable column "LINK_UP" referenced on credit_doc.link. 
Second - credit_doc with "LINK" column as primary key.  
 
Balaji, August    14, 2012 - 9:04 am UTC
 
 
When you use IN / NOT IN, the query retrieves the records from subquery and then checks the existance of the master record in the subquery. It will not consider NULL Records.
Incase of Exists/Not Exists, the where condition will return a boolean (true/false) and display the desired output. It is faster in performance than IN and NOT IN.
Another better method is using Left/Right outer join. 
August    17, 2012 - 2:21 pm UTC 
 
there is so much wrong with what you just said.  so much.  I don't know where to start.
did you know we can and will rewrite your IN to be Exists, your Exists to be In, your not in as not exists, your not exists as not in???
they are the same to us - we use them as we see fit. 
http://www.oracle.com/technetwork/issue-archive/2006/06-may/o36asktom-099107.html When you use IN / NOT IN, the query retrieves the records from subquery and 
then checks the existance of the master record in the subquery. It will not 
consider NULL Records.I don't know what you mean by "it will not consider NULL records" - but I have a feeling that whatever you would say it means is probably wrong.  Not in and VERY null sensitive.  
ops$tkyte%ORA11GR2> create table t1 ( x int );
Table created.
ops$tkyte%ORA11GR2> create table t2 ( x int );
Table created.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t1 values(1);
1 row created.
ops$tkyte%ORA11GR2> insert into t1 values(2);
1 row created.
ops$tkyte%ORA11GR2> insert into t2 values(1);
1 row created.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t1 where x not in (select x from t2);
         X
----------
         2
ops$tkyte%ORA11GR2> insert into t2 values(null);
1 row created.
ops$tkyte%ORA11GR2> select * from t1 where x not in (select x from t2);
no rows selected
ops$tkyte%ORA11GR2> 
and you know what, ask yourself, if this were true:
It is faster in performance than 
IN and NOT IN.then why would anyone 'invent' IN and NOT IN???? 
 
 
Jess, August    21, 2012 - 7:57 am UTC
 
 
Hi Tom,
What about 'in' vs. '<>'?
I have a query that looks like this:
select txn.* from txn_tbl partition (20100101) txn, attribute_tbl attr
where txn.attrib_fk = attr.attrib_pk and attr.status <> 'A';
Select from big transaction table based on one attribute.  The 'status' column in the latter has 3 values and a bitmap index on it.  I need to return all transaction with attributes that have any of 2 values.
Run as it is (attr.status <> 'A'), I get the following plan:
--------------------------------------------------------------------------------------------------
Id |Operation                       |Name           |Rows|Bytes|Cost(CPU|Time    |Pstr|Pstp|IN-OUT
--------------------------------------------------------------------------------------------------
  0|SELECT STATEMENT                |               | 18M|2912M|6168 (2)|00:01:52|    |    |      
  1| PX COORDINATOR                 |               |    |     |        |        |    |    |      
  2|  PX SEND QC (RANDOM)           |:TQ10001       | 18M|2912M|6168 (2)|00:01:52|    |    | P->S 
* 3|   HASH JOIN                    |               | 18M|2912M|6168 (2)|00:01:52|    |    | PCWP 
  4|    BUFFER SORT                 |               |    |     |        |        |    |    | PCWC 
  5|     PX RECEIVE                 |               |128 | 512 |   3(34)|00:00:01|    |    | PCWP 
  6|      PX SEND BROADCAST         |:TQ10000       |128 | 512 |   3(34)|00:00:01|    |    | S->P 
  7|       VIEW                     |index$_join$_02|128 | 512 |   3(34)|00:00:01|    |    |      
* 8|        HASH JOIN               |               |    |     |        |        |    |    |      
  9|         BITMAP CONV'N TO ROWIDS|               |128 | 512 |   1 (0)|00:00:01|    |    |      
*10|          BITMAP INDEX FULL SCAN|ATTRIB_TBL_BMI1|    |     |        |        |    |    |      
 11|         INDEX FAST FULL SCAN   |ATTRIB_PK_IDX  |128 | 512 |   1 (0)|00:00:01|    |    |      
 12|    PX BLOCK ITERATOR           |               | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWC 
 13|     TABLE ACCESS FULL          |TXN_TBL        | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWP 
--------------------------------------------------------------------------------------------------
 
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("TXN"."ATTRIB_FK"="ATTR"."ATTRIB_PK")
   8 - access(ROWID=ROWID)
  10 - filter("ATTR"."STATUS"<>'A')
In one of the releases, the query was changed to the following "to make it easier to understand which values we're bringing back":
select txn.* from txn_tbl partition (20100101) txn, attribute_tbl attr
where txn.attrib_fk = attr.attrib_pk and attr.status in ('B','C');
The plan for this version is this:
--------------------------------------------------------------------------------------------------
Id |Operation                        |Name           |Rows|Bytes|Cost(CPU|Time    |Pstr|Pstp|INOUT
--------------------------------------------------------------------------------------------------
  0|SELECT STATEMENT                 |               | 18M|2912M|6169 (2)|00:01:52|    |    |      
  1| PX COORDINATOR                  |               |    |     |        |        |    |    |      
  2|  PX SEND QC (RANDOM)            |:TQ10001       | 18M|2912M|6169 (2)|00:01:52|    |    | P->S 
* 3|   HASH JOIN                     |               | 18M|2912M|6169 (2)|00:01:52|    |    | PCWP 
  4|    BUFFER SORT                  |               |    |     |        |        |    |    | PCWC 
  5|     PX RECEIVE                  |               |128 | 512 |   4(25)|00:00:01|    |    | PCWP 
  6|      PX SEND BROADCAST          |:TQ10000       |128 | 512 |   4(25)|00:00:01|    |    | S->P 
* 7|       VIEW                      |index$_join$_02|128 | 512 |   4(25)|00:00:01|    |    |      
* 8|        HASH JOIN                |               |    |     |        |        |    |    |      
  9|         INLIST ITERATOR         |               |    |     |        |        |    |    |      
 10|          BITMAP CONV'N TO ROWIDS|               |128 | 512 |   2 (0)|00:00:01|    |    |      
*11|           BITMAP IDX SINGL VALUE|ATTRIB_TBL_BMI1|    |     |        |        |    |    |      
 12|         INDEX FAST FULL SCAN    |ATTRIB_PK_IDX  |128 | 512 |   1 (0)|00:00:01|    |    |      
 13|    PX BLOCK ITERATOR            |               | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWC 
 14|     TABLE ACCESS FULL           |TXN_TBL        | 18M|2840M|6159 (1)|00:01:51|1317|1317| PCWP 
--------------------------------------------------------------------------------------------------
 
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("TXN"."ATTRIB_FK"="ATTR"."ATTRIB_PK")
   7 - filter("ATTR"."STATUS"='B' OR "ATTR"."STATUS"='C')
   8 - access(ROWID=ROWID)
  11 - access("ATTR"."STATUS"='B' OR "ATTR"."STATUS"='C')
With an additional access predicate and 'in list' operation, I would've thought that the second version would produce a different plan.  But the plans appear virtually identical in terms of cost.  I tried it with a larger number of values in the 'in' list against another table, but that didn't seem to affect the plan either.  Does that mean that doing <> is no better than doing in ()?  I would've thought that saying 'not A' would yield better performance than listing 'B, C, D, E' as possible 'in' values...
Thanks as always. 
August    28, 2012 - 1:16 pm UTC 
 
But the plans appear 
virtually identical in terms of cost.  
that should be expected, not unexpected!!  The cost if retrieving the same amount of data should be....
the same.
I'm disappointed when it *doesn't* work out that way!!!
Does that mean that doing <> is no better than doing in ()?  
I would've thought that saying 'not A' would yield better performance than 
listing 'B, C, D, E' as possible 'in' values...
no, it doesn't "mean" that.  It just means that to get 18,000,000 rows from a table - most of the work, the time, the energy is going to be spent doing IO. 
 
 
Jess, August    29, 2012 - 2:41 pm UTC
 
 
Thanks Tom.
I suppose I was expecting the plans to be different (erroneously so) in a sense that one would be better than the other one, which was not the case.
So then with the io being spent either way to get the rows, does it make it a purely sematic debate as to whether <> or () should be used, or is there a meaningful difference between the two?
Thanks as always 
August    29, 2012 - 2:53 pm UTC 
 
In general, != will preclude an index range access path whereas "in (...)" would allow it.
in your case, no index range scanning was to be used so it didn't matter.
so, where x <> 42 would probably never use an index range scan to find the rows.
but where x in ( 1, 2, 3 ) would or could. 
 
 
Jess, August    29, 2012 - 5:41 pm UTC
 
 
Thanks Tom.
That's a really neat piece of information.  I didn't know this. 
 
EXISTS with multiple OR vs UNION ALL
Prakash Rai, December  19, 2012 - 1:46 pm UTC
 
 
Hi Tom - 
Someone in the thread asked to tune the query with multiple EXISTS clause with OR. I have the similar case with multiple EXISTS that is producing the wong result. I replaced subsequents EXISTS with UNION ALL and changed to one EXISTS clause and get the correct result.
Though I am not sure why that would make the difference. Do you see any reason?
--This clause misses some records
WHERE 
     EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work sr 
  WHERE t.order_id = sr.t.order_id 
  AND t.item = sr.item 
  AND t.fromloc = sr.fromloc 
  AND sr.order_status < 90 
               AND sr.supplysource  =1 )
     OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
              WHERE t.item = inventory.item 
              AND t.fromloc = inventory.loc AND  
              inventory.qty => t.stqty
              AND sr.order_staus > 90
             )
     OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
               WHERE t.item = inventory.item 
               AND t.fromloc = inventory.loc AND 
               inventory.qty => t.stqty 
               AND sr.supplysource=1
              )
      
VS
--this produces the expected result
WHERE 
     EXISTS ( SELECT 1 FROM springs.si_jde_schedrcpts_work sr 
  WHERE t.order_id = sr.t.order_id 
  AND t.item = sr.item 
  AND t.fromloc = sr.fromloc 
  AND sr.order_status < 90 
               AND sr.supplysource  =1 
     UNION ALL
             SELECT 1 FROM stsc.inventory inventory 
               WHERE t.item = inventory.item 
               AND t.fromloc = inventory.loc AND  
               inventory.qty => t.stqty
               AND sr.order_staus > 90
            UNION ALL
        SELECT 1 FROM stsc.inventory inventory 
               WHERE t.item = inventory.item 
               AND t.fromloc = inventory.loc AND 
               inventory.qty => t.stqty 
               AND sr.supplysource=1
              )
Thanks an always!
Prakash 
December  20, 2012 - 4:57 pm UTC 
 
please contact support with this one...
(and use two subqueries, not three - just or the last condition on the last two)
If what you say is true, it looks like a bad plan (bug) is being generated for the first query - but I cannot confirm that as I don't have the schema or the entire query and cannot verify that. 
 
 
EXISTS with multiple OR vs UNION ALL   
Prakash Rai, December  20, 2012 - 9:47 am UTC
 
 
Follow up to my previous post;
Somewhere I read, use of multiple EXISTS with OR clause like below will not execute subsequent clause and that may vary if a column used in the predicate is indexed. e.g. SR.SUPPLYSOURCE. That's exactly the case I tested, but WHY? Why the standalone index makes the query behave differently?
  
I see INDEX or UNION ALL are fix or work around but curious what's the reason to Oracle behaves differently?
....
WHERE 
 EXISTS( SELECT 1 FROM stsc.inventory inventory 
              WHERE t.item = inventory.item 
              AND t.fromloc = inventory.loc 
              AND inventory.qty => t.stqty
              AND sr.supplysource  = 1 )
 OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
              WHERE t.item = inventory.item 
              AND t.fromloc = inventory.loc 
              AND inventory.qty => t.stqty
              AND sr.supplysource  = 2 )
 OR EXISTS( SELECT 1 FROM stsc.inventory inventory 
              WHERE t.item = inventory.item 
              AND t.fromloc = inventory.loc 
              AND inventory.qty => t.stqty
              AND sr.supplysource  =3 )
Thanks as always!
Prakash 
December  20, 2012 - 5:23 pm UTC 
 
please see above. 
 
 
EXISTS with multiple OR vs UNION ALL
Prakash Rai, December  20, 2012 - 10:21 pm UTC
 
 
Tom - Thanks for your response. In real scenario I have 5 OR EXISTS and data breaks after 3rd clause. I had my peers and DBA looked into it and came to the similar conclusion that it is a bug. While we open SR with Oracle, I am applying workaround with UNION ALL, that preserves the performance and produce the correct result.
Thanks as always.
Prakash 
January   02, 2013 - 8:19 am UTC 
 
please only ever post the REAL QUERY.
You just wasted your time and my time by having me look at a fake example that does not represent what you really are faced with.
Now we have no idea if you are facing a bug or not - because we have never seen the actual query you are running.
The union all might not be giving you the right answer by the way.  If more than one branch of the OR is satisfied - you are going to get the same row more than once with a union all. 
 
 
Rahul, December  29, 2012 - 11:48 am UTC
 
 
Hi Tom ,
Your link provided for IN and exists is awesome . Though for NOT IN AND NOT EXISTS you gave us the example of the case which was mishandled by NOT IN .So if  i ignore such cases then the working of NOT IN and NOT EXISTS should be similar to IN AND EXISTS. I mean , in case of NOT IN - The sub-query is transformed to a view with distinct values and a join . but an equi join  wouldn't be needed here .So how does it work ?
Also for NOT EXISTS ,suppose the inner table is big enough and  indexed on the column .Since we arent matching any values, i suppose the index wont be used. so how does NOT EXISTS work ?  
January   04, 2013 - 11:23 am UTC 
 
NOT IN and NOT EXISTS are not the same in the way that IN and EXISTS are.
they (not in/not exists) deal with NULLS very differently.
with not in/exists - the subuquery doesn't have to be distincted - we can do a thing called a semi-join - so we don't necessarily need to distinct them.  We'd use an outer join and keep only the rows where the joined to table was "null" (not there at all).
and we can use an index easily for both not in and not exists if it makes sense.
ops$tkyte%ORA11GR2> create table t1 ( x int, y int, z int );
Table created.
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T1', numrows => 100 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> create table t2 ( x int not null, y int, z int );
Table created.
ops$tkyte%ORA11GR2> create index t2_idx on t2(x);
Index created.
ops$tkyte%ORA11GR2> exec dbms_stats.set_table_stats( user, 'T2', numrows => 1000000, numblks => 100000 );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select *
  2    from t1
  3   where NOT EXISTS (select null from t2 where t2.x = t1.x)
  4  /
Execution Plan
----------------------------------------------------------
Plan hash value: 2403176152
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |   100 |  5200 |    29   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI |        |   100 |  5200 |    29   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1     |   100 |  3900 |    29   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | T2_IDX |     1 |    13 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T2"."X"="T1"."X")
ops$tkyte%ORA11GR2> set autotrace off
 
 
 
EXISTS with multiple OR vs UNION ALL   
Prakahs Rai, January   02, 2013 - 12:59 pm UTC
 
 
Tom -
I apologize for not posting the actual query.
Here is the query that I replaced "OR" with "UNION ALL".
In the third EXIST clause, if I explicitly convert ics.cac_segment_type (which is number field) to TO_NUMBRE (as "TO_NUMBER(ics.cac_segment_type) = 1.3", the query fully executes the next evaluation "ics.cac_segment_type= 1.4" and returns the correct result, or else only partiatially evaluates the next statmenet and skips about 20% of  "ics.cac_segment_type= 1.4" matchings. The use of TO_NUMBER() changed the plan to "INDEX STORAGE FAST FULL SCAN" from "INDEX RANGE SCAN". This solution was found as hit and trial that I don't see the logical reasoning and also impacted the performance unacceptably.
UNION ALL produces the correct result and there is no counter impact. Though I don't see the point as you said "The union all might not be giving you the right answer by the way....". Can you please elaborate why would top query returns multiple records if the EXIST satisfies with multiple clauses. EXIST will only validate EXIST = Yes or No. What am I missing?
Thanks,
Prakash
SELECT
    ...
    FROM 
             table1 x                                              
  WHERE EXISTS
                                         (SELECT 'X'
                                            FROM local_gtt ics
                                           WHERE   ics.cac_segment_type = 1.1 
                                                 AND ics.buyer_bus_org_fk =  x.buyerorg
                                                 AND ics.ui_buyer_org_exists =1
                                                 AND ics.segment_value = x.segment1
                                       UNION ALL
                                            SELECT 'X'
                                               FROM local_gtt ics
                                              WHERE   ics.cac_segment_type =1.2
                                                    AND ics.buyer_bus_org_fk = x.buyerorg
                                                    AND ics.ui_buyer_org_exists = 1
                                                    AND ics.segment_value = x.segment2
                                      UNION ALL
                                            SELECT 'X'
                                               FROM local_gtt ics
                                              WHERE   ics.cac_segment_type = 1.3
                                                    AND ics.buyer_bus_org_fk = x.buyerorg
                                                    AND ics.ui_buyer_org_exists = 1
                                                    AND ics.segment_value = x.segment3
                                      UNION ALL
                                            SELECT 'X'
                                               FROM local_gtt ics
                                              WHERE   ics.cac_segment_type = 1.4
                                                    AND ics.buyer_bus_org_fk = x.buyerorg
                                                    AND ics.ui_buyer_org_exists = 1
                                                    AND ics.segment_value = x.segment4
                                      UNION ALL
                                            SELECT 'X'
                                               FROM local_gtt ics
                                              WHERE  ics.cac_segment_type = 1.5
                                                    AND ics.buyer_bus_org_fk = x.buyerorg
                                                    AND ics.ui_buyer_org_exists = 1
                                                    AND ics.segment_value = x.segment5
                                                    )
 
 
Difference in execution plan and execution time
Amit, February  01, 2013 - 3:34 pm UTC
 
 
Hi Tom,
I have following queries with execution plans. Both return exact same data. First query uses normal joins, it shows execution plan with low cost and executes faster. Second query uses EXISTS, it's cost is much higher and execution is 10 times slower compared to first query.
I am trying to understand reason for the difference between the execution plans of these 2 queries. Is there any way so that I can put HINT to second query so that its plan becomes better?
Thanks!
Table details - 
Table_name  Num_rows
--------------------------------
TBL_CALENDAR  4217
TBL_ACCOUNT  41355
TBL_DATA_SRC  12
TBL_POSN 377259
---------------------------------------------------------------------------------------------
First Query - 
SELECT S.ROWID
FROM  TBL_POSN S,
      TBL_ACCOUNT A,
      TBL_DATA_SRC D,
      (
          SELECT  TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY,
                  MAX(CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6) THEN TBL_CALENDAR_DATE ELSE NULL END) LAST_WORKING_DAY,
                  MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
          FROM    TBL_CALENDAR
          GROUP BY TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY')
      ) LAST_DAY
WHERE 1=1 
AND REPORT_DATE < '31-OCT-12'
AND REPORT_DCDE = 'I'
AND     A.AC_SK = S.AC_SK
AND     A.DATA_SRC_ID = D.DATA_SRC_ID
AND     S.REPORT_DATE >= D.BACKLOAD_CUTOFF_DATE
and     TO_CHAR(s.REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY
and     (
              a.DATA_SRC_ID in(1, 2)
          OR  (
                    a.DATA_SRC_ID in(3, 4, 5, 6, 10)
                AND s.REPORT_DATE <> LAST_DAY.LAST_WORKING_DAY
              )
          or  (
                    A.DATA_SRC_ID = 8
                and s.REPORT_DATE <> LAST_DAY.LAST_CALENDAR_DAY
              )
        );
Plan hash value: 1112228868
 
---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                | 11926 |   815K|   529  (15)|       |       |
|*  1 |  HASH JOIN                  |                | 11926 |   815K|   529  (15)|       |       |
|*  2 |   HASH JOIN                 |                |   857 | 40279 |   519  (15)|       |       |
|   3 |    NESTED LOOPS             |                |  8574 |   326K|   362  (19)|       |       |
|*  4 |     TABLE ACCESS FULL       | TBL_DATA_SRC   |     1 |    12 |     2   (0)|       |       |
|   5 |     PARTITION RANGE ITERATOR|                | 14291 |   376K|   360  (19)|   KEY |   KEY |
|*  6 |      INDEX FAST FULL SCAN   | XU_TBL_POSN_PK | 14291 |   376K|   360  (19)|   KEY |   KEY |
|   7 |    TABLE ACCESS FULL        | TBL_ACCOUNT    | 41355 |   323K|   154   (4)|       |       |
|   8 |   VIEW                      |                |  4200 | 96600 |     8  (25)|       |       |
|   9 |    HASH GROUP BY            |                |  4200 | 33600 |     8  (25)|       |       |
|  10 |     INDEX FULL SCAN         | TBL_CALENDAR_PK|  4217 | 33736 |     6   (0)|       |       |
---------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("LAST_DAY"."MMYYYY"=TO_CHAR(INTERNAL_FUNCTION("S"."REPORT_DATE"),'MMYYYY'))
       filter("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2 OR ("A"."DATA_SRC_ID"=3 OR "A"."DATA_SRC_ID"=4 OR 
              "A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND 
              "S"."REPORT_DATE"<>"LAST_DAY"."LAST_WORKING_DAY" OR "A"."DATA_SRC_ID"=8 AND 
              "S"."REPORT_DATE"<>"LAST_DAY"."LAST_CALENDAR_DAY")
   2 - access("A"."AC_SK"="S"."AC_SK" AND "A"."DATA_SRC_ID"="D"."DATA_SRC_ID")
   4 - filter("D"."BACKLOAD_CUTOFF_DATE"<'31-OCT-12')
   6 - filter("REPORT_DCDE"='I' AND "REPORT_DATE"<'31-OCT-12' AND 
              "S"."REPORT_DATE">="D"."BACKLOAD_CUTOFF_DATE")
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
Second query -
SELECT ROWID
FROM TBL_POSN s
WHERE 1=1 
AND REPORT_DATE < '31-OCT-12'
AND REPORT_DCDE = 'I'
and EXISTS
    (
        SELECT  1
        FROM    TBL_ACCOUNT A,
                TBL_DATA_SRC D,
                (
                    SELECT  TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY,
                            MAX(CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6) THEN TBL_CALENDAR_DATE ELSE NULL END) LAST_WORKING_DAY,
                            MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
                    FROM    TBL_CALENDAR
                    GROUP BY TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY')
                ) LAST_DAY
        WHERE   AC_SK = s.AC_SK
        AND     A.DATA_SRC_ID = D.DATA_SRC_ID
        AND     REPORT_DATE >= D.BACKLOAD_CUTOFF_DATE
        and     TO_CHAR(REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY
        and     (
                      a.DATA_SRC_ID in(1, 2)
                  OR  (
                            A.DATA_SRC_ID IN(3, 4, 5, 6, 10)
                        AND REPORT_DATE <> LAST_DAY.LAST_WORKING_DAY
                      )
                  or  (
                            A.DATA_SRC_ID = 8
                        and REPORT_DATE <> LAST_DAY.LAST_CALENDAR_DAY
                      )
                )
    );
Plan hash value: 723147282
 
------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name           | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                |     1 |    27 |   456K  (8)|       |       |
|*  1 |  FILTER                        |                |       |       |            |       |       |
|   2 |   PARTITION RANGE ITERATOR     |                |   285K|  7536K|   353  (17)|     1 |   KEY |
|*  3 |    INDEX FAST FULL SCAN        | XU_TBL_POSN_PK |   285K|  7536K|   353  (17)|     1 |   KEY |
|   4 |   NESTED LOOPS                 |                |     2 |    86 |     4   (0)|       |       |
|   5 |    NESTED LOOPS                |                |     1 |    20 |     2   (0)|       |       |
|   6 |     TABLE ACCESS BY INDEX ROWID| TBL_ACCOUNT    |     1 |     8 |     1   (0)|       |       |
|*  7 |      INDEX UNIQUE SCAN         | PK_TBL_ACCOUNT |     1 |       |     1   (0)|       |       |
|*  8 |     TABLE ACCESS BY INDEX ROWID| TBL_DATA_SRC   |     1 |    12 |     1   (0)|       |       |
|*  9 |      INDEX UNIQUE SCAN         | PK_TBL_DATA_SRC|     1 |       |     1   (0)|       |       |
|* 10 |    VIEW                        |                |     2 |    46 |     2   (0)|       |       |
|  11 |     SORT GROUP BY              |                |    42 |   336 |     8  (25)|       |       |
|* 12 |      INDEX FULL SCAN           | TBL_CALENDAR_PK|    42 |   336 |     7  (15)|       |       |
------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter( EXISTS (SELECT /*+ */ 0 FROM  (SELECT /*+ */ 
              TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY') "MMYYYY",MAX(CASE 
              TO_NUMBER(TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'d')) WHEN 2 THEN "TBL_CALENDAR_DATE" WHEN 3 THEN 
              "TBL_CALENDAR_DATE" WHEN 4 THEN "TBL_CALENDAR_DATE" WHEN 5 THEN "TBL_CALENDAR_DATE" WHEN 6 THEN "TBL_CALENDAR_DATE" 
              ELSE NULL END ) "LAST_WORKING_DAY",MAX(LAST_DAY(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"))) "LAST_CALENDAR_DAY" FROM 
              "TBLDBO"."TBL_CALENDAR" "TBL_CALENDAR" WHERE TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')=TO_CHAR(:B1,'M
              MYYYY') GROUP BY TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')) "LAST_DAY","TBLDBO"."TBL_DATA_SRC" 
              "D","TBLDBO"."TBL_ACCOUNT" "A" WHERE "AC_SK"=:B2 AND "A"."DATA_SRC_ID"="D"."DATA_SRC_ID" AND 
              "D"."BACKLOAD_CUTOFF_DATE"<=:B3 AND (("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2) OR ("A"."DATA_SRC_ID"=3 OR 
              "A"."DATA_SRC_ID"=4 OR "A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND 
              "LAST_DAY"."LAST_WORKING_DAY"<>:B4 OR "A"."DATA_SRC_ID"=8 AND "LAST_DAY"."LAST_CALENDAR_DAY"<>:B5)))
   3 - filter("REPORT_DCDE"='I' AND "REPORT_DATE"<'31-OCT-12')
   7 - access("AC_SK"=:B1)
   8 - filter("D"."BACKLOAD_CUTOFF_DATE"<=:B1)
   9 - access("A"."DATA_SRC_ID"="D"."DATA_SRC_ID")
  10 - filter("A"."DATA_SRC_ID"=1 OR "A"."DATA_SRC_ID"=2 OR ("A"."DATA_SRC_ID"=3 OR "A"."DATA_SRC_ID"=4 OR 
              "A"."DATA_SRC_ID"=5 OR "A"."DATA_SRC_ID"=6 OR "A"."DATA_SRC_ID"=10) AND "LAST_DAY"."LAST_WORKING_DAY"<>:B1 OR 
              "A"."DATA_SRC_ID"=8 AND "LAST_DAY"."LAST_CALENDAR_DAY"<>:B2)
  12 - filter(TO_CHAR(INTERNAL_FUNCTION("TBL_CALENDAR_DATE"),'MMYYYY')=TO_CHAR(:B1,'MMYYYY'))
 
February  04, 2013 - 9:17 am UTC 
 
couple of comments on a few really *bad* practices:
a) TO_CHAR(TBL_CALENDAR_DATE,'MMYYYY') MMYYYY
use trunc( dt, 'mm' ).  keep the date a date, trunc to the month level.  don't conver to a string (hugely expensive).  especially when you need it to compare to something later:  TO_CHAR(REPORT_DATE, 'MMYYYY') = LAST_DAY.MMYYYY.  that should be "trunc(report_date,'mm') = last_day.mmyyyy.  
trunc simply resets bytes at the end of a 7 byte field.  to_char has to run through probably hundreds of thousands of lines of code!
b) CASE WHEN TO_CHAR(TBL_CALENDAR_DATE, 'D') IN (2, 3, 4, 5, 6)
compare numbers to numbers, dates to dates, strings to string.  here you are comparing a string to a number - meaning there is an implicit conversion in there - meaning this code is *bad* (implicit conversions are *evil* - avoid them)
b) TO_CHAR(TBL_CALENDAR_DATE, 'd' )
beware of date functions that are dependent on where you are located.  
ops$tkyte%ORA11GR2> alter session set nls_territory = america;
Session altered.
ops$tkyte%ORA11GR2> select to_char(sysdate,'d'), trunc(sysdate)-trunc(sysdate,'iw')+1
  2  from dual;
T TRUNC(SYSDATE)-TRUNC(SYSDATE,'IW')+1
- ------------------------------------
2                                    1
ops$tkyte%ORA11GR2> alter session set nls_territory = germany;
Session altered.
ops$tkyte%ORA11GR2> select to_char(sysdate,'d'), trunc(sysdate)-trunc(sysdate,'iw')+1
  2  from dual;
T TRUNC(SYSDATE)-TRUNC(SYSDATE,'IW')+1
- ------------------------------------
1                                    1
ops$tkyte%ORA11GR2> 
your code will return different answers in different countries... you might consider using the IW approach - it is based on ISO standards (worldwide) instead of country by country settings.
c)                   MAX(LAST_DAY(TBL_CALENDAR_DATE)) LAST_CALENDAR_DAY
don't compute things you don't need.  You only reference last_calendar_day in the outer query in the predicate and could easily have used last_day(mmyyyy)
d) AND REPORT_DATE < '31-OCT-12'
never compare dates to strings, strings to numbers, etc.
always compare dates to dates, strings to strings, numbers to numbers.
never rely on implicit conversions, consider:
ops$tkyte%ORA11GR2> create table t ( x date );
Table created.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> insert into t values ( to_date( '01-jan-2012', 'dd-mon-yyyy' ) );
1 row created.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from t where x < '31-oct-12';
no rows selected
ops$tkyte%ORA11GR2> alter session set nls_date_format = 'dd-mon-yyyy';
Session altered.
ops$tkyte%ORA11GR2> select * from t where x < '31-oct-12';
no rows selected
ops$tkyte%ORA11GR2> 
whoops, wrong answer all of a sudden.....
e) AND REPORT_DATE < '31-OCT-12'
you must have started programming in about 2001 or 2002 or later.... Anyone that programmed in the 1990's knows - it takes four characters for years.  Always and forever.
I'm not a fan of hints at all - so no, I'm not going to give a magical hint - you might try using dynamic sampling set to level 3 or above - the exists query is getting what looks like an incorrect cardinality estimate.  the first query gets 11-12,000 rows as a guess, the second gets 1 row as a guess.  that is the root cause of the bad plan (incorrect cardinality estimates).  I cannot tell really where it is incorrect (because you know your data, I do not - you know what indexes go with what tables - i do not, makes it really hard to read a plan...)
so, looking at these plans - where are the cardinality estimates off by a large amount.  we'll start there - figure out how we might get the right cardinalities and then the right plan will just "fall out" from that.
as far as I know, these two queries are not equivalent.  the join query could return many more rows than the exists.  I don't know your constraints (primary keys, foreign keys, not null, etc) and hence cannot even tell you if these two queries are interchangeable! 
 
 
Difference in execution plan and execution time   
Manoj Kaparwan, February  02, 2013 - 5:33 pm UTC
 
 
Dear  tom
Here tried  the following test on Oracle 10.2.0.4
 
emp table - from scott.emp
big_table -- CTAS from all_objects
   
man@ora10g:rac1> select a.rowid
   from emp  a  where a.hiredate >to_date('1987-04-19','yyyy-mm-dd')
     and exists
      ( select null from big_table b , big_table c where b.object_id  = c.object_id );  2    3    4
ROWID
------------------
AAANjIAAEAACMNcAAK
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 4226424926
-----------------------------------------------------------------------------------------
| Id  | Operation              | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                |  7692K|   146M|   223K  (3)| 00:44:45 |
|*  1 |  FILTER                |                |       |       |            |          |
|*  2 |   TABLE ACCESS FULL    | EMP            |  7692K|   146M|   223K  (3)| 00:44:45 |
|   3 |   NESTED LOOPS         |                |     1 |    10 |     4   (0)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| BIG_TBL_ID_IDX | 50503 |   246K|     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN    | BIG_TBL_ID_IDX |     1 |     5 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ */ 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B"
              WHERE "B"."OBJECT_ID"="C"."OBJECT_ID"))
   2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   5 - access("B"."OBJECT_ID"="C"."OBJECT_ID")
   
   
Then, I took tkprof
   
********************************************************************************
 select a.rowid
   from emp  a  where a.hiredate >to_date('1987-04-19','yyyy-mm-dd')
     and exists
      ( select null from big_table b , big_table c where b.object_id  = c.object_id )
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.02          0         18          0           0
Fetch        2      0.00       0.00          0          4          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.01       0.03          0         22          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  FILTER  (cr=22 pr=0 pw=0 time=1125 us)
      1   TABLE ACCESS FULL EMP (cr=4 pr=0 pw=0 time=273 us)
      1   NESTED LOOPS  (cr=18 pr=0 pw=0 time=776 us)
      1    INDEX FAST FULL SCAN BIG_TBL_ID_IDX (cr=15 pr=0 pw=0 time=638 us)(object id 55243)
      1    INDEX RANGE SCAN BIG_TBL_ID_IDX (cr=3 pr=0 pw=0 time=92 us)(object id 55243)
********************************************************************************
So Oracle is doing right thing here - as usual.
as soon as the 1st row is returned from the NL.. job is done .. no longer required joining each row from big_table a, to big_table c
I can see estimation of row in explain plan is way off the reality... is this the reason we should not rely much on explain plan. so we would not go with the cost as shown in the case in above post - might be tkprof would reveal more? 
 
February  04, 2013 - 9:40 am UTC 
 
how did it come up with over 7,000 rows in EMP??????
ops$tkyte%ORA10GR2> create table emp as select * from scott.emp;
Table created.
ops$tkyte%ORA10GR2> create table big_table as select * from all_objects;
Table created.
ops$tkyte%ORA10GR2> alter table big_table add constraint bt_pk primary key(object_id);
Table altered.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'EMP' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'BIG_TABLE' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace on explain
ops$tkyte%ORA10GR2> select a.rowid
  2     from emp  a  where a.hiredate >to_date('1987-04-19','yyyy-mm-dd')
  3          and exists
  4                        ( select null from big_table b , big_table c where b.object_id  =
  5                            c.object_id );
ROWID
------------------
AAARLfAAEAAASvUAAK
Execution Plan
----------------------------------------------------------
Plan hash value: 2260623314
--------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     1 |    20 |     5   (0)| 00:00:01 |
|*  1 |  FILTER                |       |       |       |            |          |
|*  2 |   TABLE ACCESS FULL    | EMP   |     1 |    20 |     3   (0)| 00:00:01 |
|   3 |   NESTED LOOPS         |       |     2 |    20 |     2   (0)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| BT_PK | 48879 |   238K|     2   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN   | BT_PK |     1 |     5 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B"
              WHERE "B"."OBJECT_ID"="C"."OBJECT_ID"))
   2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))
   5 - access("B"."OBJECT_ID"="C"."OBJECT_ID")
ops$tkyte%ORA10GR2> set autotrace off
post your step by steps to reproduce please. 
 
 
..Difference in execution plan and execution time
Manoj Kaparwan, February  04, 2013 - 7:32 pm UTC
 
 
Dear Tom,
My bad !
emp table in MAN schema had stats way off.
man@ora10g:rac1>  select owner, table_name, num_rows from dba_tables where table_name ='EMP';
OWNER                          TABLE_NAME                       NUM_ROWS
------------------------------ ------------------------------ ----------
MAN                            EMP                             100000000
later I gather stats.
man@ora10g:rac1> exec dbms_stats.gather_table_stats(user, 'EMP', cascade=>true);
PL/SQL procedure successfully completed.
man@ora10g:rac1>  select owner, table_name, num_rows, last_analyzed from dba_tables where table_name ='EMP';
OWNER                          TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ------------------------------ ---------- -------------------
MAN                            EMP                                    14 2013-02-05 12:25:31
man@ora10g:rac1> explain plan for
  2  select a.rowid
   from emp  a  where a.hiredate >to_date('1987-04-19','yyyy-mm-dd')
     and exists
      ( select null from big_table b , big_table c where b.object_id  = c.object_id );    3    4    5
Explained.
man@ora10g:rac1> @xplan
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4226424926
-----------------------------------------------------------------------------------------
| Id  | Operation              | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                |     1 |    20 |     7   (0)| 00:00:01 |
|*  1 |  FILTER                |                |       |       |            |          |
|*  2 |   TABLE ACCESS FULL    | EMP            |     1 |    20 |     3   (0)| 00:00:01 |
|   3 |   NESTED LOOPS         |                |     1 |    10 |     4   (0)| 00:00:01 |
|   4 |    INDEX FAST FULL SCAN| BIG_TBL_ID_IDX | 50503 |   246K|     2   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN    | BIG_TBL_ID_IDX |     1 |     5 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( EXISTS (SELECT /*+ */ 0 FROM "BIG_TABLE" "C","BIG_TABLE" "B"
              WHERE "B"."OBJECT_ID"="C"."OBJECT_ID"))
   2 - filter("A"."HIREDATE">TO_DATE(' 1987-04-19 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   5 - access("B"."OBJECT_ID"="C"."OBJECT_ID")
21 rows selected.
 
 
Not in and Not exists
ajit, February  20, 2013 - 6:54 pm UTC
 
 
Hi tom,
I read most of the discussions above.
I just want to confirm one thing.
If we are dealing with columns which do not have null values..can not in/not exists be interchanged in that case?
Thanks in advance
Big fan :)
  
February  25, 2013 - 10:09 am UTC 
 
yes, if the column involved in the NOT IN select list (the projected column(s)) are defined as NOT NULL - then not in and not exists would be semantically equivalent. 
 
 
why "not in" unable to compare null.
ranjan, August    29, 2013 - 8:38 pm UTC
 
 
Hi Tom,
Hope you are fine :)
#############
Create Table A As Select Level A From Dual Connect By Level<=5;
create table b as select level+2 a from dual connect by level<=5 union select null from dual;
Select * From A;
1
2
3
4
5
Select * From B;
3
4
5
6
7
Select * From A Where A not In (Select A From B);
no rows selected
select * from a where a in (select a from b);
3
4
5
##############
You told at the top:
--------
Beware the NULL value 
and NOT IN!!  (also the reason why NOT IN is sometimes avoided).
---------
That is true but why oracle doesn't handle as it handles in case of "in".
I mean to say how oracle skip the values "1" and "2" in case of "in" which is not correct (as we know null cant be compared with anything (null is unknown),  how it calculates that "1" in "null"  and "2" in "null" are false).
Many people say that, that is side effect of "not in" , but could you please say something :).
 
September 04, 2013 - 6:22 pm UTC 
 
because we are beholden to the ANSI standard to tell us how to process the language and the standard says this is the way it is to be done.
it is in the definition of the language, a definition we subscribe to but are not in charge of. 
 
 
hmm, thank you for the reply
Ranjan, September 08, 2013 - 10:10 pm UTC
 
 
 
 
Exact use of exists
Boob, June      17, 2014 - 2:43 pm UTC
 
 
Hi could you please sujjest me how exactly exit will be used.  
 
NOT IN and NOT EXISTS
Niranjan, September 17, 2014 - 9:15 am UTC
 
 
Hi Tom,
You said, NOT IN and NOT EXISTS are not the same!!! logically I understand that but I did not quite get, as the result I get is the same (query below). Are you referring to saying (NOT THE SAME) that, EXISTS evaluates to true immediately on the first occurrence of the value during a scan on the table/index?
Used smaple HR schema.
select department_id from departments d where not exists (select null from employees e where e.department_id = d.department_id);
select department_id from departments d where department_id not in (select department_id from employees e where e.department_id = d.department_id); 
 
getting null values from record 
lavanya, July      27, 2016 - 11:46 am UTC
 
 
Suppose customer is my table in that I want to find all the c ustomers where we don't have their phone  numbers .how can I get it?  
July      27, 2016 - 1:23 pm UTC 
 
select * 
from customers
where phone_number is null 
 
 
Not seeing difference in null handling between NOT IN and NOT EXISTS
Stacey Baker, March     27, 2017 - 10:27 pm UTC
 
 
Hi Tom,
I'm still struggling with the difference in null handling between NOT IN and NOT EXISTS. The example you gave that worked, only worked because the correlated subquery inner join eliminated the null in the table (emp 7839 has no manager):
select count(*) from emp t1 where not exists ( select null from emp t2 where mgr = t1.empno);
If I add "or m.mgr is null", I get 0 records, the same answer I get with NOT IN.
select count(*) from emp t1 where not exists ( select null from emp t2 where mgr = t1.empno or mgr is null);
Also, if I use NOT IN in your NOT EXISTS query, it returns the correct count of employees who aren't managers:
select count(*) from emp t1 where empno not in (select mgr from emp t2 where mgr = t1.empno);
How can I see the difference? 
March     28, 2017 - 1:06 am UTC 
 
You can generally transpose between the two if you  *change the sql*, because it is basically about sets and set membership.  Tom's point was that simply switching "not in" to "not exists" is not a guarantee it will be the same result.   
 
 
Smrdíš
A reader, September 26, 2017 - 12:01 pm UTC
 
 
 
 
Not sure I understand...
Randy Strauss, October   13, 2017 - 8:14 pm UTC
 
 
Other answer said/implied that the set in:
 select count(*) from emp where empno not in ( select mgr from emp ); 
evaluated to ( NULL ) - why?  It seems like it should evaluate to a set that includes managers and a null (does a set automatically discard redundant values?) . Is that correct?  And are you saying this wouldn't work- that NULL would make the set poorly defined?
Would this work better/well?:
 select count(*) from emp where empno not in ( select distinct(mgr) from emp where mgr is not null); 
 
October   16, 2017 - 12:27 pm UTC 
 
And are you saying this wouldn't work- that NULL would make the set poorly defined? 
Effectively, yes.
Would this work?: 
 select count(*) from emp where empno not in ( select distinct(mgr) from emp where mgr is not null); 
Yes.