Batching rowids
Sam To, January   03, 2003 - 12:43 pm UTC
 
 
Tom,
You listed this step in your response:
"loop over the rowids in the ORDER_HEADERS_N3 index where agent_id = :b1
   join them all and output to SUBRESULT-1
end loop"
So is it true that in this case the CBO decides that the savings by batching the rowid fetches on ORDER_HEADERS more than offset the cost of the cartesian product?   I always thought cartesian product are quite bad, except in the case when one of the row sources has very few rows. 
 
January   03, 2003 - 12:54 pm UTC 
 
You were already doing a cartesian product with a filter.  
All of the rows in the orders table that matched that index were joined to every row in the lookup that matched that other index -- AND THEN the predicate would be resolved.
Like full scans -- just keep repeating
"cartesian joins are not evil", "cartesian joins are not evil"......
if they were, we would never do them.  This is not a true cartesian product in the traditional sense but in fact every row in lookup that is PO TYPE is matched to every row in orders where agent_id = :b1 regardless of version!  The "when the predicate is evaluated" is done in a different place, that is all. 
 
 
 
It's clear now
Sam To, January   03, 2003 - 1:25 pm UTC
 
 
Tom,
Thanks for the followup answer.  It is clear to me now.
If I add an extra join to the query, the table access on ORDER_HEADERS is now pushed back into the nested loop, which is what I expect:
select /*+ USE_NL(p po s) leading(p) */ * from lookup_codes p, order_headers po, suppliers s
where p.lookup_type = 'PO TYPE' and
po.type_lookup_code = p.lookup_code and
po.agent_id = :b1 and
s.supplier_id = po.supplier_id
1:SELECT STATEMENT  Cost =28,6
 2:NESTED LOOPS   COST=28,6
  3:NESTED LOOPS   COST=19,9
   4:TABLE ACCESS BY INDEX ROWID LOOKUP_CODES COST=4,1
    5:INDEX RANGE SCAN LOOKUP_CODES_N1 COST=3,1
   4:TABLE ACCESS BY INDEX ROWID ORDER_HEADERS COST=15,1
    5:INDEX RANGE SCAN ORDER_HEADERS_N3 COST=1,145
  3:TABLE ACCESS BY INDEX ROWID SUPPLIERS COST=1,248
   4:INDEX UNIQUE SCAN SUPPIERS_U1 COST=,1
Originally I was concerned about the cartesian product creating a large row source for the subsequent join steps.  But the plan above indicates this will not happen. 
 
January   03, 2003 - 2:28 pm UTC 
 
don't be concerned -- it is the job of the optimizer to be concerned.  Let is do its job.  This  (the moving of the table access by rowid ) is an OPTIMIZATION, not a decrease in performance option! 
 
 
 
clarification
Stu Charlton, January   03, 2003 - 2:16 pm UTC
 
 
Perhaps I'm misreading the EXPLAIN PLAN or misunderstand the algorithm, but I'd like a clarification on this new algorithm.
In this article:
</code>  
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:6643159615303  <code>
Connor refers to this new nested loops algorithm in a comment, and says "Then it sorts the rowids so that it only has to visit each table block once."
I also notice that in the EXPLAIN PLAN above, the TABLE ACCESS BY INDEX ROWID ORDER_HEADERS occurs outside of the "NESTED LOOPS" operation (which seems to resonate with what Connor says above).
If this is the case, wouldn't the pseudo-code be more like:
loop over the rowids in the LOOKUP_CODES_N1 index where lookup_codes = 'PO TYPE'
      goto the LOOKUP_CODES table and get that row
      loop over the rowids in the ORDER_HEADERS_N3 index where agent_id = :b1
           join them all and output to SUBRESULT-1
      end loop
end loop
(^^^ the nested loops end here)
Now take SUBRESULT-1's set of rowids
 - sort to remove duplicate rowids
 - access the ORDER_HEADERS table
 - process the predicate (po.type_lookup_code=p.lookup_codes).  
 - If matched then OUTPUT ROW
If SUBRESULT-1 is predicate-matched within the nested loop as you've written above, then the whole point of this optimization doesn't seem to click.  
My understanding of this optimized algorithm is:  The "cartesian" product with the index is cheaper than accessing table blocks repeatedly inside a nested loop (because index blocks are fewer, denser, etc)... so defer the table access until after you've cartesian joined potential rowid's together through the indices.  Am I wrong?
Cheers
Stu  
January   03, 2003 - 2:47 pm UTC 
 
Not the way NL joins are processed -- NL joins stream data back ASAP.  
If it where a sort/merge or a hash join, you would be correct.  with NL, nope.
It always did a "cartesian join", in both 8i and 9i.  It found a row in LOOKUP, it found all rows in ORDERS -- it joined every row it found in orders with that lookup row, evaluated that predicate and then either kicked a row out or rejected it.
It they went on to the next row in lookup and did the same thing all over again.
In 9i, it just postpones the one step -- it gets all of the ORDERS rows for agent_id = :b1, joins them to that single lookup row -- sorts the rowids and accesses the table as efficiently as possible -- does the predicate -- outputs.
A nested loop feeds the output part of the plan ASAP.  Else you would have to wait till the last row was finished to get the first row (and you don't) 
 
 
 
got it
Stu Charlton, January   03, 2003 - 3:27 pm UTC
 
 
Thanks for your time Tom, I understand now.
 
 
 
Nested Loop Verses Merge join
reader, February  21, 2003 - 4:00 pm UTC
 
 
In what scenario one is better than the other. Can you please give some conceptual details about nested loops and merge joins or point to a resource.
Thanks. 
 
February  21, 2003 - 6:35 pm UTC 
 
You know -- if you use the CBO, it'll tell you when one is preferred over the other! that is exactly the job of an optimizer.
Think of nested loops like this:
for x in ( select * from T1 )
loop
     for y in ( selectd * from t2 where t2.joins = t1.joins )
     loop
            output records
     end loop
end loop
whereas a sort merge is like:
sort t1 by joins into temp1
sort t2 by joins into temp2
read a rec from temp1;
read a rec from temp2;
loop
   while (temp1_rec < temp2_rec) read temp1
   while (temp1_rec > temp2_rec) read temp2
   if ( temp1_rec = temp2_rec ) output record
   exit when end of file on temp1 or temp2
end;
the nested loops joins can be very poor performers for tons of rows -- whereas the sort merge excels at them (lots less index lookups)
if you want the first rows fastest however -- nested loops is the way to go.
 
 
 
 
excellent explanation of prefetch
Mikito Harakiri, February  24, 2003 - 7:03 pm UTC
 
 
  
 
Pauline, September 08, 2003 - 2:21 pm UTC
 
 
Tom,
We just upgrade one oracle8i database to oracle9i database by using dbua tool.When we run the query to check if database has locks, it takes very long tile. 
But this query returns very soon( 3-5 seconds ) in all our other oracle8i database. To test it, we also run it on one 9i database which is created by dbca tool. It takes more than 1 minutes. 
The query statement is:
SELECT /*+ CHOOSE */ s.username,  
s.osuser,   
s.serial# "Serial#",
lk.block, -- block = 0 non blocking, block = 1 blocking others
S.PROGRAM "Program",  
s.sql_address "address", 
s.sql_hash_value "Sql hash",   
--lk.sid, 
DECODE(lk.TYPE,   'MR', 'Media Recovery',   'RT', 'Redo Thread',   'UN', 'User Name',   'TX', 'Transaction',   
'TM', 'DML',   'UL', 'PL/SQL User Lock',   'DX', 'Distributed Xaction',   'CF', 'Control File',   
'IS', 'Instance State',   'FS', 'File Set',   'IR', 'Instance Recovery',   'ST', 'Disk Space Transaction',   
'TS', 'Temp Segment',   'IV', 'Library Cache Invalidation',   'LS', 'Log Start or Switch',   'RW', 'Row Wait',   
'SQ', 'Sequence Number',   'TE', 'Extend Table',  'TT', 'Temp Table',   lk.TYPE) lock_type,   
DECODE(lk.lmode,   0, 'None',    1, 'Null',    2, 'Row-S (SS)',   3, 'Row-X (SX)',   4, 'Share',     5, 'S/Row-X (SSX)',   
6, 'Exclusive',   TO_CHAR(lk.lmode)) mode_held,       
DECODE(request,   0, 'None',   1, 'Null',   2, 'Row-S (SS)',    3, 'Row-X (SX)',    4, 'Share',     5, 'S/Row-X (SSX)',   
6, 'Exclusive',   TO_CHAR(lk.request)) mode_requested,   
TO_CHAR(lk.id1) lock_id1,   
TO_CHAR(lk.id2) lock_id2,  -- will show the job number corresponding in dba_jobs
s.USERNAME  "DB User", 
s.sid,     
OWNER||'.'||OBJECT_NAME "Object" 
FROM v$lock lk,  
     v$session s,    
     DBA_OBJECTS ao 
WHERE lk.lmode  > 1 
     AND s.username is not null 
     AND lk.sid    = s.sid 
     AND ao.OBJECT_ID(+) = lk.id1 
     ORDER BY 1, "Object";
We compared the execution plan for Oracle9i database and Oracle8i database, they are very different for this query:
Oracle9i database execution plan:
Query Plan
--------------------------------------------------------------------------------
1.106 SELECT STATEMENT    
  2.1 SORT ORDER BY
    3.1 HASH JOIN OUTER
      4.1 NESTED LOOPS
        5.1 MERGE JOIN CARTESIAN
          6.1 NESTED LOOPS
            7.1 FIXED TABLE FULL X$KSUSE
            7.2 FIXED TABLE FIXED INDEX X$KSUSE (ind:1)
          6.2 BUFFER SORT
            7.1 FIXED TABLE FULL X$KSQRS
        5.2 VIEW  GV$_LOCK
Query Plan
--------------------------------------------------------------------------------
          6.1 UNION-ALL
            7.1 VIEW  GV$_LOCK1
              8.1 UNION-ALL
                9.1 FIXED TABLE FULL X$KDNSSF
                9.2 FIXED TABLE FULL X$KSQEQ
            7.2 FIXED TABLE FULL X$KTADM
            7.3 FIXED TABLE FULL X$KTCXB
      4.2 VIEW  DBA_OBJECTS
        5.1 UNION-ALL
          6.1 FILTER
            7.1 NESTED LOOPS
Query Plan
--------------------------------------------------------------------------------
              8.1 TABLE ACCESS FULL OBJ$
              8.2 TABLE ACCESS CLUSTER USER$
                9.1 INDEX UNIQUE SCAN I_USER# NON-UNIQUE
            7.2 TABLE ACCESS BY INDEX ROWID IND$
              8.1 INDEX UNIQUE SCAN I_IND1 UNIQUE
          6.2 HASH JOIN
            7.1 TABLE ACCESS FULL LINK$
            7.2 TABLE ACCESS FULL USER$
30 rows selected.
Oracle8i database:
Query Plan
--------------------------------------------------------------------------------
1.86 SELECT STATEMENT    
  2.1 SORT ORDER BY
    3.1 HASH JOIN
      4.1 FIXED TABLE FULL X$KSUSE
      4.2 HASH JOIN OUTER
        5.1 VIEW  V$LOCK
          6.1 NESTED LOOPS
            7.1 HASH JOIN
              8.1 VIEW  GV$_LOCK
                9.1 UNION-ALL
                  10.1 VIEW  GV$_LOCK1
Query Plan
--------------------------------------------------------------------------------
                    11.1 UNION-ALL
                      12.1 FIXED TABLE FULL X$KDNSSF
                      12.2 FIXED TABLE FULL X$KSQEQ
                  10.2 FIXED TABLE FULL X$KTADM
                  10.3 FIXED TABLE FULL X$KTCXB
              8.2 FIXED TABLE FULL X$KSUSE
            7.2 FIXED TABLE FIXED INDEX #1 X$KSQRS
        5.2 VIEW  DBA_OBJECTS
          6.1 UNION-ALL
            7.1 FILTER
              8.1 NESTED LOOPS
Query Plan
--------------------------------------------------------------------------------
                9.1 TABLE ACCESS FULL OBJ$
                9.2 TABLE ACCESS CLUSTER USER$
                  10.1 INDEX UNIQUE SCAN I_USER# NON-UNIQUE
              8.2 TABLE ACCESS BY INDEX ROWID IND$
                9.1 INDEX UNIQUE SCAN I_IND1 UNIQUE
            7.2 NESTED LOOPS
              8.1 TABLE ACCESS FULL LINK$
              8.2 TABLE ACCESS CLUSTER USER$
                9.1 INDEX UNIQUE SCAN I_USER# NON-UNIQUE
31 rows selected.
The time for returing the result from the query in Oracle8i and 9i database 
also very different:
Oracle8i databases only need 3 - 5 seconds to return the result.
Oracle9i database (created by dbca tool) returns the result by 49 seconds.
Oracle9i database (upgraded by dbua tool) returns the result by 37 minutes.
I see MERGE JOIN CATTESIAN in 9i database execution plan. Is it the reason
taking so long to return the result? How can we fix/avoid such problem so that once we upgrade production database to 9i, we won't have similar problem.
Thanks for your help.
 
 
September 08, 2003 - 5:32 pm UTC 
 
try
SELECT /*+ CHOOSE */ s.username,  
s.osuser,   
s.serial# "Serial#",
lk.block, -- block = 0 non blocking, block = 1 blocking others
S.PROGRAM "Program",  
s.sql_address "address", 
s.sql_hash_value "Sql hash",   
--lk.sid, 
DECODE(lk.TYPE,   'MR', 'Media Recovery',   'RT', 'Redo Thread',   'UN', 'User 
Name',   'TX', 'Transaction',   
'TM', 'DML',   'UL', 'PL/SQL User Lock',   'DX', 'Distributed Xaction',   'CF', 
'Control File',   
'IS', 'Instance State',   'FS', 'File Set',   'IR', 'Instance Recovery',   'ST', 
'Disk Space Transaction',   
'TS', 'Temp Segment',   'IV', 'Library Cache Invalidation',   'LS', 'Log Start 
or Switch',   'RW', 'Row Wait',   
'SQ', 'Sequence Number',   'TE', 'Extend Table',  'TT', 'Temp Table',   lk.TYPE) 
lock_type,   
DECODE(lk.lmode,   0, 'None',    1, 'Null',    2, 'Row-S (SS)',   3, 'Row-X 
(SX)',   4, 'Share',     5, 'S/Row-X (SSX)',   
6, 'Exclusive',   TO_CHAR(lk.lmode)) mode_held,       
DECODE(request,   0, 'None',   1, 'Null',   2, 'Row-S (SS)',    3, 'Row-X (SX)', 
   4, 'Share',     5, 'S/Row-X (SSX)',   
6, 'Exclusive',   TO_CHAR(lk.request)) mode_requested,   
TO_CHAR(lk.id1) lock_id1,   
TO_CHAR(lk.id2) lock_id2,  -- will show the job number corresponding in dba_jobs
s.USERNAME  "DB User", 
s.sid,     
(select OWNER||'.'||OBJECT_NAME from dba_objects where object_id = lk.id1) "Object" 
FROM v$lock lk,  
     v$session s
WHERE lk.lmode  > 1 
     AND s.username is not null 
     AND lk.sid    = s.sid 
     ORDER BY 1, "Object"; 
 
 
 
Pauline, September 10, 2003 - 5:58 pm UTC
 
 
Tom,
Your query running faster on our 8i database. But in our oracle 9i database it takes 57 minutes. I think this 9i database might have something wrong with data dictionaries.
When I login using Connect "/ as sysdba" against this 9i database, I get incorrect results. For example:
<dbstg1:stg1:/appl/oracle/SQLTXPLAIN> sqlplus /nolog
SQL*Plus: Release 9.2.0.1.0 - Production on Wed Sep 10 17:24:08 2003
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> SELECT USERENV('SESSIONID') FROM DUAL;
USERENV('SESSIONID')
--------------------
                   0
SQL> SELECT s.audsid, p.spid, p.addr
  2    FROM v$parameter u,
       v$session   s,
  3    4         v$process   p
 WHERE u.name   = 'user_dump_dest'
  5    6     AND s.audsid = USERENV('SESSIONID')
   AND p.addr   = s.paddr;  7  
    AUDSID SPID         ADDR
---------- ------------ ----------------
         0 25188        000000039537E638
         0 4868         000000039538A568
         0 4985         0000000395389148
         0 25228        00000003953827A0
         0 25226        0000000395382298
         0 25224        0000000395381D90
         0 25222        0000000395381888
         0 25220        0000000395380970
         0 25200        0000000395380468
         0 25198        000000039537FF60
         0 25196        000000039537FA58
    AUDSID SPID         ADDR
---------- ------------ ----------------
         0 25190        000000039537EB40
         0 25194        000000039537F550
         0 25192        000000039537F048
14 rows selected.
The second query returns multiple rows and it will cause
other query depending on this query to get error of
ORA-01422: exact fetch returns more than requested number of rows
But when I login as user with dba role:
SQL> conn /
Connected.
SQL> SQL> SELECT USERENV('SESSIONID') FROM DUAL;
USERENV('SESSIONID')
--------------------
             3978577
SQL> SELECT s.audsid, p.spid, p.addr
  2    FROM v$parameter u,
       v$session   s,
  3    4         v$process   p
 WHERE u.name   = 'user_dump_dest'
  5    6     AND s.audsid = USERENV('SESSIONID')
  7     AND p.addr   = s.paddr;
    AUDSID SPID         ADDR
---------- ------------ ----------------
   3978577 9273         0000000395389650
Can you tell me which result is correct? Suppose should return one row or multiple rows? 
 
 
September 10, 2003 - 8:22 pm UTC 
 
backgrounds have a sessionid of 0 -- both are correct.   
 
 
 
Pauline, September 11, 2003 - 10:17 am UTC
 
 
Thanks for the answer to give the key point. 
 
 
A reader, September 11, 2003 - 12:42 pm UTC
 
 
Tom,
I got some information that I was told this is the BUG:2243239 Connect "/ as sysdba" causes sessionid to be 0.  It is fixed on oracle10g. I tried to search bug list
from metalink but could not find. Can you tell me is it
true or not.
Tnanks.
Pauline 
 
September 11, 2003 - 7:23 pm UTC 
 
use
(select sid from v$mystat where rownum=1)
don't use sessionid, i never do, it is zero for internal/sysdba for a while.
select * from v$session where sid = (select sid from v$mystats where rownum=1)
 
 
 
 
Pauline, September 12, 2003 - 11:23 am UTC
 
 
Thanks for good advice. I will do as you suggested. 
 
 
list fully scaned tables
Gabriel, September 17, 2003 - 1:45 pm UTC
 
 
Your answers are very interesting, as usual. We recently migrated from 8i to 9i. We experience many more full table scans. Can you please tell me if there is a query that will show us all the tables that are fully scanned?
Thank you very much, 
 
September 17, 2003 - 5:29 pm UTC 
 
they should pop to the top of your top sql reports if they are problems (statspack)
you can use v$segment_statistics to get a count of logical io's against the segments as well -- looking for "big" ones. 
 
 
 
Article on nested loop joins.
Sai., December  01, 2003 - 1:46 pm UTC
 
 
Hi Tom,
I remember you talking about "how nested loop joins can sometimes kill performance with lots of LIO". I couldn't find that link. Could you please help me find that link.
Thanks,
 Sai. 
 
December  02, 2003 - 7:59 am UTC 
 
 
 
Your query running faster on our 8i database. But ...
Swetlana, January   03, 2004 - 6:38 am UTC
 
 
Hi Tom.
We have like problem. Explain please what happen and what do? Query is:
 select s.username,s.sid,l.type,l.id1,l.id2,l.lmode,
        l.request,p.spid pid,s.serial#
 from v$lock l,v$session s,v$process p
 where s.sid=l.sid and p.addr=s.paddr 
       and s.username is not null
 order by id1,s.sid,request;
Thanks.
 
 
January   03, 2004 - 9:22 am UTC 
 
huh?
what "like" problem? 
 
 
 
We have like problem
Swetlana, January   06, 2004 - 1:11 am UTC
 
 
We just upgrade one oracle8i database to oracle9i database.
When we run the query to check if database has locks, it takes very long tile. Query is :
 select s.username,s.sid,l.type,l.id1,l.id2,l.lmode,
        l.request,p.spid pid,s.serial#
 from v$lock l,v$session s,v$process p
 where s.sid=l.sid and p.addr=s.paddr 
       and s.username is not null
 order by id1,s.sid,request;
What happen ?
Thanks. 
 
January   06, 2004 - 8:36 am UTC 
 
select x.username, x.sid, l.type, l.id1, l.id2, l.lmode, l.request, x.pid, x.serial#
  from ( select s.username, s.sid, p.spid pid, s.serial#
           from v$session s, v$process p
          where s.paddr = p.addr(+)
            and rownum > 0
            and s.username is not null ) x,
       v$lock l
 where x.sid = l.sid
 order by l.id1, x.sid, l.request
/
 
 
 
 
Swetlana, January   08, 2004 - 1:04 am UTC
 
 
Thanks for the answer.  
 
 
Is there any Problem ???
Riaz Shahid, January   27, 2004 - 3:55 pm UTC
 
 
  1* select * from v$session where sid = (select sid from v$mystats where rownum=1)
SQL> /
select * from v$session where sid = (select sid from v$mystats where rownum=1)
                                                     *
ERROR at line 1:
ORA-00942: table or view does not exist 
 
 
January   28, 2004 - 8:10 am UTC 
 
sorry, typo on my part.  got it right on the line above.
singular -- v$mystat
 
 
 
 
why this query runs so fast
A reader, February  05, 2004 - 9:31 am UTC
 
 
Hi
I have a query which I expected it to run very slow but to my surprise it runs instantaneously
SELECT b.setid, b.inst_prod_id
  FROM sysadm.ps_rf_inst_prod a, sysadm.ps_pa_arbor_elemen b
 WHERE a.parent_inst_prodid = ' '
   AND a.product_id = :b1
   AND b.pa_comp_inst_id = 0
   AND a.inst_prod_id = b.inst_prod_id
   AND a.setid = b.setid
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           |  Name               | Rows  | Bytes | Cost  | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |    67 |   642 |       |       |
|   1 |  NESTED LOOPS                       |                     |     1 |    67 |   642 |       |       |
|*  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_RF_INST_PROD     |     1 |    45 |   640 | ROWID | ROW L |
|*  3 |    INDEX RANGE SCAN                 | PSBRF_INST_PROD     |   880 |       |     5 |       |       |
|*  4 |   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_PA_ARBOR_ELEMEN  |     1 |    22 |     2 | ROWID | ROW L |
|*  5 |    INDEX UNIQUE SCAN                | PS_PA_ARBOR_ELEMEN  |     2 |       |     1 |       |       |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("A"."PRODUCT_ID"=:Z)
   3 - access("A"."PARENT_INST_PRODID"=:Z)
   4 - filter("B"."PA_COMP_INST_ID"=0)
   5 - access("A"."SETID"="B"."SETID" AND "A"."INST_PROD_ID"="B"."INST_PROD_ID")
select count(*)
  2  from sysadm.ps_rf_inst_prod 
  3  where parent_inst_prodid = ' ';
  COUNT(*)
----------
   5684073
I thought this scan
INDEX UNIQUE SCAN                | PS_PA_ARBOR_ELEMEN
has to happen 5684073 times and would be slow... but it´s fast do you see how?! 
 
February  05, 2004 - 9:48 am UTC 
 
i cannot read the wrapped stuff there. and i'm not sure that query is the one explained (the filters in the explain do not map to it anyway)
but basically it looks like this:
nested loop
   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_RF_INST_PROD 
      INDEX RANGE SCAN                 | PSBRF_INST_PROD  
   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_PA_ARBOR_ELEMEN 
      INDEX UNIQUE SCAN                | PS_PA_ARBOR_ELEMEN
and with the filters it is doing:
a) read the index psbrf_inst_prod to find parent_inst_prodid = :z in table A.  We expect to get about 880 hits for this.
b) then goto the table ps_rf_inst_prod and apply the predicate product_id = :z.  We expect 1 row to meet that criteria.
c) then do the index range scan to pick up the single row from the other table and join.
So, it should be very fast if the assumptions about the cardinality are in fact correct.
 
 
 
 
cardinality is wrong!
A reader, February  05, 2004 - 10:22 am UTC
 
 
sql>var b0 varchar2(30)
sql>var b1 varchar2(15)
sql>exec :b0:=' '
PL/SQL procedure successfully completed.
sql>exec :b1:='20'
PL/SQL procedure successfully completed.
sql>set autotrace trace exp
sql>r
  1  SELECT b.setid, b.inst_prod_id
  2    FROM sysadm.ps_rf_inst_prod a, sysadm.ps_pa_arbor_elemen b
  3   WHERE a.parent_inst_prodid = :b0
  4     AND a.product_id = :b1
  5     AND b.pa_comp_inst_id = 0
  6     AND a.inst_prod_id = b.inst_prod_id
  7*    AND a.setid = b.setid
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=642 Card=1 Bytes=67)
   1    0   NESTED LOOPS (Cost=642 Card=1 Bytes=67)
   2    1     TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_RF_INST_PROD' (Cost=640 Card=1 Bytes=45)
   3    2       INDEX (RANGE SCAN) OF 'PSBRF_INST_PROD' (NON-UNIQUE) (Cost=5 Card=880)
   4    1     TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'PS_PA_ARBOR_ELEMEN' (Cost=2 Card=1 Bytes=22)
   5    4       INDEX (UNIQUE SCAN) OF 'PS_PA_ARBOR_ELEMEN' (UNIQUE) (Cost=1 Card=2)
explain plan for
SELECT b.setid, b.inst_prod_id
  FROM sysadm.ps_rf_inst_prod a, sysadm.ps_pa_arbor_elemen b
 WHERE a.parent_inst_prodid = :b0
   AND a.product_id = :b1
   AND b.pa_comp_inst_id = 0
   AND a.inst_prod_id = b.inst_prod_id
   AND a.setid = b.setid;
-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           |  Name               | Rows  | Bytes | Cost  | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     1 |    67 |   642 |       |       |
|   1 |  NESTED LOOPS                       |                     |     1 |    67 |   642 |       |       |
|*  2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_RF_INST_PROD     |     1 |    45 |   640 | ROWID | ROW L |
|*  3 |    INDEX RANGE SCAN                 | PSBRF_INST_PROD     |   880 |       |     5 |       |       |
|*  4 |   TABLE ACCESS BY GLOBAL INDEX ROWID| PS_PA_ARBOR_ELEMEN  |     1 |    22 |     2 | ROWID | ROW L |
|*  5 |    INDEX UNIQUE SCAN                | PS_PA_ARBOR_ELEMEN  |     2 |       |     1 |       |       |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("A"."PRODUCT_ID"=:Z)
   3 - access("A"."PARENT_INST_PRODID"=:Z)
   4 - filter("B"."PA_COMP_INST_ID"=0)
   5 - access("A"."SETID"="B"."SETID" AND "A"."INST_PROD_ID"="B"."INST_PROD_ID")
that 880 is wrong, it should be over 5000000!!!
The column has histograms by the way, otherwise the query is damn slow 
 
February  05, 2004 - 7:24 pm UTC 
 
if it were over 5,000,000 for the predicate, the query would not return fast.
show us the TKPROF with the actual row counts, then we can comment.d 
 
 
 
what rownum>0 means?
Andrew, February  05, 2004 - 7:09 pm UTC
 
 
Tom,
in the answer to "we have like problem" (she meant "we have similar problem") you use inline view to speed up the output. In this subquery you use rownum>0. Could you please explain what's the point of "and rownum>0"
Isn't that any query if it returns row will always have 0<rownum for every row?
So it's an obsolete line, IMHO. Please correct me if I am mistaken.
Thank you 
 
February  06, 2004 - 8:35 am UTC 
 
well, it obviously isn't "obsolete" :)   it works after all...
it causes the inline view to be "evaluated once and stored in temp" to avoid view merging.
if you have my book "Effective Oracle By Design" -- i cover this in some more detail/depth. 
 
 
 
rownum > 0
A reader, February  06, 2004 - 2:21 am UTC
 
 
is to avoid view merging 
 
 
TKPROF
A reader, February  06, 2004 - 2:30 am UTC
 
 
Hi here is the result
SELECT b.setid, b.inst_prod_id
  FROM sysadm.ps_rf_inst_prod a, sysadm.ps_pa_arbor_elemen b
 WHERE a.parent_inst_prodid = :b0
   AND a.product_id = :b1
   AND b.pa_comp_inst_id = 0
   AND a.inst_prod_id = b.inst_prod_id
   AND a.setid = b.setid
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          4          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.01       0.00          0          4          0           0
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 123
Rows     Row Source Operation
-------  ---------------------------------------------------
      0  TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(48103) PARTITION: ROW LOCATION ROW LOCATION (cr=4 r=0 w=0 time=238 us)
      1   NESTED LOOPS  (cr=4 r=0 w=0 time=228 us)
      0    TABLE ACCESS BY GLOBAL INDEX ROWID OBJ#(47711) PARTITION: ROW LOCATION ROW LOCATION (cr=4 r=0 w=0 time=219 us)
      0     INDEX RANGE SCAN OBJ#(55911) (cr=4 r=0 w=0 time=215 us)(object id 55911)
      0    INDEX RANGE SCAN OBJ#(55791) (cr=0 r=0 w=0 time=0 us)(object id 55791)
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1       36.17         36.17
select object_id, object_name from dba_objects where object_id in (55911, 55791)
 OBJECT_ID OBJECT_NAME
---------- --------------------
     55791 PS_PA_ARBOR_ELEMEN
     55911 PS2RF_INST_PROD
select index_name, column_name
from dba_ind_columns
where index_name in ('PSBRF_INST_PROD', 'PS_PA_ARBOR_ELEMEN', 'PS2RF_INST_PROD')
INDEX_NAME                     COLUMN_NAME
------------------------------ --------------------
PS_PA_ARBOR_ELEMEN             SETID
PS_PA_ARBOR_ELEMEN             INST_PROD_ID
PS2RF_INST_PROD                PRODUCT_ID
PS2RF_INST_PROD                SETID
PS2RF_INST_PROD                INST_PROD_ID
PSBRF_INST_PROD                PARENT_INST_PRODID
I can see from TKPROF that the plan is totally different from that shown in autotrace or explain plan for... (and may be this is why it´s so fast). Why is this, does this mean we should stop using set autotrace and explain plan from now on?
It´s Oracle 9.2.0.4 
 
February  07, 2004 - 12:14 pm UTC 
 
explain plan for
SELECT b.setid, b.inst_prod_id
  FROM sysadm.ps_rf_inst_prod a, sysadm.ps_pa_arbor_elemen b
 WHERE a.parent_inst_prodid = :b0
   AND a.product_id = :b1
   AND b.pa_comp_inst_id = 0
   AND a.inst_prod_id = b.inst_prod_id
   AND a.setid = b.setid;
if :b0 / :b1 are to be numbers or dates -- use to_number/to_date -- else they are assumed to be strings and that can through off the explain plan.  that is my guess.  be careful with binds in explain plans -- use EXPLICIT to_numbers/to_dates on them!
autotrace/explain plan show you want "might happen".
only tkprof shows you what "really did happen" 
 
 
 
Cartesian join
VA, April     17, 2006 - 12:35 pm UTC
 
 
I have a 1000 row table and a 100,000 row table. Each table has 4 "keys" that are generated according to a complicated algorithm.
I need to match the 2 tables on these keys to determine matching rows. Something like
select * from (
select
decode(t1.key1,t2.key1,'Y','N') key1_match,
decode(t1.key2,t2.key2,'Y','N') key2_match,
decode(t1.key3,t2.key3,'Y','N') key3_match,
decode(t1.key4,t2.key4,'Y','N') key4_match,
...
from t1,t2
) where key1_match||key2_match||key3_match||key4_match like '%Y%'
In other words, do a cartesian join of the 2 tables (generating 100,000,000 rows!) and do a nested loop matching each row against all the others.
I thought that doing this in plain SQL would be slow so I loaded up both the tables in PL/SQL tables using BULK COLLECT and did nested FOR loops in PL/SQL. Basically, I wrote the above SQL in procedural code.
This took 2 hours to process.
When I just ran the SQL above, it took just 25 minutes! That is pretty darn impressive to plough through a billion rows.
Questions
1. Why is the in-memory operation so slow?
2. How is the SQL so fast?
3. Any way to speed this up even more? :-)
Thanks 
 
April     17, 2006 - 8:12 pm UTC 
 
1) why do you think SQL is not "just as in memory as you were".  Databases were
a) BORN TO JOIN
b) BORN TO BE WRITTEN TO
I am suddenly reminded of tigger - joining is what databases do best.
2) because it is SQL, it is designed to munch data for us, that is its purpose in life.
3) where 'Y' in (key1_match, key2_match, key3_match, key4_match)
perhaps - why concatenate to like when you could just say "a='y' or b='y' or...." which is what the in will do.
or likely a union of four joins - by key1 to key1, key2 to key2 and so on. 
 
 
 
11gR2 Nested loop implementation using vector I/O?
phoenixbai, October   24, 2010 - 11:10 pm UTC
 
 
Hi Tom,
11gR2 doc said new implementation is introduced for nested loop to make use of vector I/O. Could you help explain what is vector I/O? and how does it differ from normal physical I/O?
Here is the quote from 11gR2 doc: 
11.3.3.1.2 New Implementation for Nested Loop Joins
Oracle Database 11g introduces a new implementation for nested loop joins to reduce overall latency for physical I/O. When an index or a table block is not in the buffer cache and is needed to process the join, a physical I/O is required. Oracle Database 11g can batch multiple physical I/O requests and process them using a vector I/O instead of processing them one at a time.
Thanks 
October   25, 2010 - 5:53 pm UTC 
 
it is an array request to the OS with an "array of blocks to read" rather than a single block at a time.  
Think of reading blocks from disk like picking up paper on the floor.  You spend a a lot time bending over and getting back up to pick up the paper (IO).  If  you were asked to get papers 1, 3, 5, and 7 one at a time - you'd bend over and get up 4 times.  If we asked you to get 1,3,5 and 7 all at once - you'd have to bend over and get up only once.  You'd still have the seek times and physical IO times - but a big chunk of time would be removed.
vector = array in this context. 
 
 
thank you very much for the explanation
phoenixbai, October   26, 2010 - 8:32 pm UTC
 
 
Reading asktom has been one of the most interesting things in my life. We(on behalf of all the people who read asktom) can`t thank you enough for putting all those time and energy to answer all of our questions. 
You are just AWESOME! 
 
Nested Loop working
Amardeep Sidhu, November  10, 2010 - 1:08 am UTC
 
 
Hi Tom,
Greetings...
My question is regarding your answer to the very first question in this thread. You have written:
In 8i what it is doing it something like this (in psuedo code)
loop over the rowids in the LOOKUP_CODES_N1 index where lookup_codes = 'PO TYPE'
      goto the LOOKUP_CODES table and get that row
      loop over the rowids in the ORDER_HEADERS_N3 index where agent_id = :b1
           join the rows
           goto the ORDER_HEADERS table and get that row
           apply predicate po.type_lookup_code = p.lookup_codes 
               if matched then OUTPUT ROW
      end loop
end loopBut according to what you have mentioned in Effective Oracle by Design, page no 466 ( 
http://bit.ly/bhIB0V ) shouldn't it be something like:
for x in (select * from LOOKUP_CODES (using index) where lookup_codes = 'PO TYPE')
loop
 index loopup the rowid for x.type_lookup_code & agent_id = :b1
 select * from order_headers where rowid=rowid coming from above index scan 
 output joined record
end loop
Could you please comment on this.
Regards,
Amardeep Sidhu 
November  11, 2010 - 1:38 pm UTC 
 
what do you see different about them?
for x in (select * from lookup_codes (using index) where lookup codes = 'PO TYPE')
is identical to 
loop over the rowids in the LOOKUP_CODES_N1 index where lookup_codes = 'PO 
TYPE'
      goto the LOOKUP_CODES table and get that row
and so on - one uses "SQL" as psuedo code - and the other uses psuedo code for psuedo code.
But - what did you see that was "different" between them otherwise?