Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, gaurang.

Asked: April 11, 2001 - 1:05 pm UTC

Last updated: October 06, 2022 - 4:53 am UTC

Version: 8.1.6

Viewed 100K+ times! This question is

You Asked

hi tom
can u explain me in detail about "start with connect by" sql statement (tree structure).i know there is documentatin but it is very confusing.

and Connor said...

It builds a hierarchical query.

There are 2 components to is:

"start with" -- this identifies all LEVEL=1 nodes in the tree

"connect by" -- describes how to walk from the parent nodes above to their children and their childrens children.

Easiest to use an example on emp. If we start with "where mgr is NULL", we generate the set of employees that have no mgr (they are the top of the tree). If we

CONNECT BY PRIOR EMPNO = /* current */ MGR

that will take all of the PRIOR records (the start with at first) and find all records such that the MGR column equals their EMPNO (find all the records of people managed by the people we started with).

Using EMP, the start with SET is:

scott@ORA8I.WORLD> select ename, empno, mgr from emp
2 where mgr is null;

ENAME EMPNO MGR
---------- ---------- ----------
KING 7839


Now, if we do the "connect by manually" we would find:

scott@ORA8I.WORLD> select ename, empno, mgr
2 from emp where mgr = 7839;

ENAME EMPNO MGR
---------- ---------- ----------
JONES 7566 7839
BLAKE 7698 7839
CLARK 7782 7839

scott@ORA8I.WORLD>


KINGS empno is the prior empno. If we build the entire hierarch -- we have:



scott@ORA8I.WORLD> select lpad(' ',level*2,' ')||ename ename, empno, mgr
2 from emp
3 START WITH MGR IS NULL
4 CONNECT BY PRIOR EMPNO = MGR
5 /

ENAME EMPNO MGR
--------------- ---------- ----------
KING 7839
JONES 7566 7839
SCOTT 7788 7566
ADAMS 7876 7788
FORD 7902 7566
SMITH 7369 7902
BLAKE 7698 7839
ALLEN 7499 7698
WARD 7521 7698
MARTIN 7654 7698
TURNER 7844 7698
JAMES 7900 7698
CLARK 7782 7839
MILLER 7934 7782

14 rows selected.

So, KING is the start with set then JONES BLAKE and CLARK fall under him. Each of them becomes the PRIOR record in turn and their trees are expanded.

Rating

  (160 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

gaurang, April 12, 2001 - 8:05 am UTC


useful summary of connect by

RParr, September 25, 2002 - 2:56 pm UTC

in a fraction of the space you provided a much better overview of connect by and basic heirarchical query.

Nice overview

Anirudh, January 22, 2003 - 1:44 am UTC

Hi Tom
Your explaination about the connect by clause was very helpfull. However, I have a question
we have got a function
FUNCTION Get_Parent_Entity_Id
(
in_test_pgm_id number,
in_test_admin_id number,
ic_child_entity_type_code varchar2,
in_child_entity_id number,
ic_parent_entity_type_code varchar2
)
RETURN number IS
ln_count NUMBER;
BEGIN
ln_count := 0;
BEGIN
select n_parent_entity_id
into ln_count
from
(
select
n_test_pgm_id,
n_test_admin_id,
c_parent_entity_type_code,
n_parent_entity_id
from
rpt_entity_struc res
start with
res.n_test_pgm_id = in_test_pgm_id and
res.n_test_admin_id = in_test_admin_id and
res.n_entity_struc_id = 0 and
res.c_child_entity_type_code = ic_child_entity_type_code and
res.n_child_entity_id = in_child_entity_id
connect by
res.n_test_pgm_id = prior n_test_pgm_id and
res.n_test_admin_id = prior n_test_admin_id and
res.n_entity_struc_id = prior n_entity_struc_id and
res.c_child_entity_type_code = prior c_parent_entity_type_code and
res.n_child_entity_id = prior n_parent_entity_id
)
where c_parent_entity_type_code = ic_parent_entity_type_code;
EXCEPTION
WHEN NO_DATA_FOUND THEN
ln_count := 0;
WHEN OTHERS THEN
ln_err_num := SQLCODE;
lc_err_msg := 'Error in Get_Parent_Entity_Id - ' || substr(sqlerrm, 1, 200);
raise_application_error(-20000, lc_err_msg);
END;
RETURN ln_count;
--
Algo:n_parent_entity_id is retrieved based on the input parameters passed besides the ic_parent_entity_type_code which is one of the important input columns to be considered. The Hierarchical query (STARTWITH and CONNECT BY) clauses are used to find the root parent id.
Question
The connect by prior clause here refers to the columns and one of those (n_entity_struc_id ) is not in the select list. My question is what is the use of that PRIORing the column when it has not been selected.?



Tom Kyte
January 22, 2003 - 8:11 am UTC

the same as when you use it in a predicate, no different.


select ename from emp where empno = :x


what is the use of empno in the predicate if it has not been selected? well, it is used to identify what record(s) you want. same with columns in the start with, connect by prior and so on...

Urgent Requirement

Shrikant Gavas, May 22, 2003 - 10:35 am UTC

Please go throught the script :

CREATE TABLE MY_LEVEL1 (
ORIG_RECP VARCHAR2 (10),
ORIG_AMT NUMBER,
REF_RECP VARCHAR2 (10),
REF_AMT NUMBER,
PARTY VARCHAR2 (6) ) ;

INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'100', 10000, NULL, 500, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'110', 300, '100', 100, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'120', 200, '110', 50, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'130', 100, '120', 30, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'300', 10000, NULL, 500, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'310', 300, '300', 100, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'320', 200, '310', 50, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'330', 100, '320', 30, 'A0001');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'100', 10000, NULL, 500, 'A0003');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'110', 300, '100', 100, 'A0003');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'120', 200, '110', 50, 'A0003');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'130', 100, '120', 30, 'A0003');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'100', 10000, NULL, 500, 'A0004');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'110', 300, '100', 100, 'A0004');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'120', 200, '110', 50, 'A0004');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'130', 100, '120', 30, 'A0004');
INSERT INTO MY_LEVEL1 ( ORIG_RECP, ORIG_AMT, REF_RECP, REF_AMT,
PARTY ) VALUES (
'130', 100, '120', 30, 'A0004');

select level x,
party, orig_recp, ref_recp, orig_amt, ref_amt
from my_level1
where party between 'A0001' AND 'A0003'
connect by
prior orig_recp = ref_recp and
prior party = party
start with ref_recp is null

1 A0001 100 10000 500
2 A0001 110 100 300 100
3 A0001 120 110 200 50
4 A0001 130 120 100 30
1 A0001 300 10000 500
2 A0001 310 300 300 100
3 A0001 320 310 200 50
4 A0001 330 320 100 30
1 A0003 100 10000 500
2 A0003 110 100 300 100
3 A0003 120 110 200 50
4 A0003 130 120 100 30

Clients desired output :

1 A0001 100 10000 500
4 A0001 130 120 100 30
1 A0001 300 10000 500
4 A0001 330 320 100 30
1 A0003 100 10000 500
4 A0003 130 120 100 30

i.e. clients requirement is that they want first and last row per level. We
have tried a lot but not got any appropriate solution for this.

Please provide some solution for above query asap.



Interpret

Nitin, July 13, 2003 - 2:26 pm UTC

How do I interpret this SQL and the output? Is the output correct?

SELECT
TO_CHAR(A.EFFDT,'YYYY-MM-DD'), A.TREE_NODE_NUM, A.TREE_NODE,
A.TREE_NODE_NUM_END, A.TREE_LEVEL_NUM, A.TREE_NODE_TYPE, A.PARENT_NODE_NUM, A.OLD_TREE_NODE_NUM
From PSTREENODE A
Where A.TREE_NAME = 'OVER_EXP'
And A.SETID = 'LOCKE'
And A.EFFDT = (Select Max(Z.EFFDT)
From PSTREENODE Z
Where Z.TREE_NAME = A.TREE_NAME
And Z.TREE_NODE = A.TREE_NODE
And Z.SETID = A.SETID)
Start With TREE_NODE = 'WORK'
Connect By PARENT_NODE_NUM = Prior TREE_NODE_NUM


TO_CHAR(A. TREE_NODE_NUM TREE_NODE TREE_NODE_NUM_END TREE_LEVEL_NUM T PARENT_NODE_NUM O
---------- ------------- -------------------- ----------------- -------------- - --------------- -
2002-01-01 222222223 WORK 333333333 0 G 1 N
2002-01-01 1222222222 CASE 1333333332 0 G 1111111111 N
2002-01-01 1166666666 DEPT 1222222221 0 G 1111111111 N
2002-01-01 1222222222 CASE 1333333332 0 G 1111111111 N
2002-01-01 1166666666 DEPT 1222222221 0 G 1111111111 N

Tom Kyte
July 14, 2003 - 12:22 am UTC

the output is correct -- given the query.

However, without any clue as to the "question" - no one can really tell you if the output correctly answers your question!



Clarification

Nitin, July 13, 2003 - 11:54 pm UTC

The SQL provided in my previous feedback finally made sense. I have the following question.

The output is not what is desired. We need the condition
Where A.TREE_NAME = 'OVER_EXP'
And A.SETID = 'LOCKE'
And A.EFFDT = (Select Max(Z.EFFDT)
From PSTREENODE Z
Where Z.TREE_NAME = A.TREE_NAME
And Z.TREE_NODE = A.TREE_NODE
And Z.SETID = A.SETID)

to be applied prior to executing the CONNECT BY - PRIOR part of the SQL.

Now, we observe that the SQL is first performing the CONNECT BY - PRIOR and then the WHERE portion. Please let us know how it can be achieved.

Thanks

Tom Kyte
July 14, 2003 - 12:30 am UTC

then that condition should be in the connect by itself as well. you don't have to only do prior's in there.

start with /connect by is done

AND THEN

the where clause is applied


If you want to stop building the tree when you hit some condition -- put that condition in the connect by

chained connect by

farweeda, July 26, 2003 - 3:26 am UTC

ihave docs tab containing two fields (doc_id , doc_rel_id)
with e.g. following values {(5,2)(9,5)(11,2)(3,1)(4,2)(8,3)(6,4)} the doc_rel_id has some relation with doc_id and vise versa so when i'm asking for the related docs to specific doc_id e.g. doc_id =2 the resault should be : ( 5,11,4,9 because of 5 ,6 because of 4 ) and if i change doc_id to any of the resault's value suppose 4 the same resault should be there

Tom Kyte
July 26, 2003 - 12:45 pm UTC

ops$tkyte@ORA920LAP> select *
  2    from t
  3   start with doc_rel_id = 2
  4  connect by prior doc_id = doc_rel_id;

    DOC_ID DOC_REL_ID
---------- ----------
         5          2
         9          5
        11          2
         4          2
         6          4

 

To Srikant for his urgent query

A reader, July 27, 2003 - 1:50 pm UTC

I tried out your scenario - I got a query to get the
same set of records that you wanted but not in the same
order - you can probably tweak it.
Tom may have an even better solution of course!

Menon:)

Here goes:
select x, party, orig_recp, ref_recp, orig_amt, ref_amt
from
(
select a.*, first_value( x) over (partition by party order by party) first_x,
last_value( x) over (partition by party order by party) last_x
from
(
select level x, party, orig_recp, ref_recp, orig_amt, ref_amt
from my_level1
where party between 'A0001' AND 'A0003'
connect by
prior orig_recp = ref_recp and
prior party = party
start with ref_recp is null
) a
) b
where x = first_x or
x = last_x;


Skipping Gaps?

Doug, November 23, 2003 - 12:23 pm UTC

Tom - is it possible to make a connect by skip gaps in the sequence? Like 

SQL> desc a;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------
 ID                                                 NUMBER
 PRINUMBER                                          NUMBER

SQL> select * from a;

        ID  PRINUMBER
---------- ----------
         1          2
         2          3
         3          4
         9         10

SQL> select id, prinumber from
  2   a start with id=1
  3  connect by prior prinumber=id;

        ID  PRINUMBER
---------- ----------
         1          2
         2          3
         3          4

Is there any way to move on to NINE and continue? 

Tom Kyte
November 23, 2003 - 2:52 pm UTC

what would be the possible use of this?

how would this be any different from "select * from t"

fair point

Doug, November 23, 2003 - 3:31 pm UTC

Fair enough - maybe I am asking the wrong question. 

What I'm trying to do is take a time history, potentially with workspace manager that might look like this:

SQL> select name, salary, starttime, stoptime, salary from doug; 

NAME           SALARY STARTTIME STOPTIME      SALARY
---------- ---------- --------- --------- ----------
Bob              6000 01-JAN-95 01-JUN-95       6000
Bob              7000 01-JUN-95 01-OCT-95       7000
Bob              7000 01-OCT-95 01-FEB-96       7000
Bob              7000 01-FEB-96 01-JAN-97       7000
Bob              6000 01-JAN-97 01-MAR-98       6000
Bob              5000 01-APR-03 01-JUN-03       5000

The startimes and stoptimes connect very nicely except there is a gap between March 1998 and April 2003. 

If I want to order these dates and include a special salary "gap".   We "don't know".. what Bob was doing between March,98 and April 2003. 

This is what happens at the root of the query I am looking at - 

SQL> l
  1  select name, salary, starttime, stoptime,decode( lag(salary) over (order by
 starttime), salary,
  2* to_number(null), row_number() over (order by starttime) ) rn  from doug
SQL> /

NAME           SALARY STARTTIME STOPTIME          RN
---------- ---------- --------- --------- ----------
Bob              6000 01-JAN-95 01-JUN-95          1
Bob              7000 01-JUN-95 01-OCT-95          2
Bob              7000 01-OCT-95 01-FEB-96
Bob              7000 01-FEB-96 01-JAN-97
Bob              6000 01-JAN-97 01-MAR-98          5
Bob              5000 01-APR-03 01-JUN-03          6

6 rows selected.

This is WHAT I WANT more or less - the null values when salarys are the same so they can be coalesed.  What I REALLY want is THIS:


NAME           SALARY STARTTIME STOPTIME          RN
---------- ---------- --------- --------- ----------
Bob              6000 01-JAN-95 01-JUN-95          1
Bob              7000 01-JUN-95 01-OCT-95          2
Bob              7000 01-OCT-95 01-FEB-96
Bob              7000 01-FEB-96 01-JAN-97
Bob              6000 01-JAN-97 01-MAR-98          5
Bob    <unknown> NULL 01-MAR-98 01-APR-03          6
Bob              5000 01-APR-03 01-JUN-03          7

Now if I just order by time with a "select * from T".. how could I "fill in" the gaps in the sequence?

Thanks,
D. 

Tom Kyte
November 23, 2003 - 5:24 pm UTC

it was the connnect by that really confused me -- couldn't understand where that is coming in.

Now that I know the question, an answer can be forthcoming :)

Here is one technique:

ops$tkyte@ORA920> select name,
  2         decode( r, 1, null, '<unknown>' ) msg,
  3         decode( r, 1, salary, null ) salary,
  4         decode( r, 1, starttime, last_stop ) starttime,
  5         decode( r, 1, stoptime, starttime ) stoptime
  6    from (
  7  select name, salary,
  8         starttime, stoptime,
  9         lag(stoptime) over (partition by name order by starttime) last_stop,
 10         decode( lag(stoptime) over (partition by name order by starttime), starttime, null, null, null, 1 ) dup_me
 11    from t
 12         ),
 13         (select 1 r from dual union all select 2 r from dual)
 14   where r = 1 or dup_me = 1
 15   order by 4
 16  /
 
NAME                           MSG           SALARY STARTTIME STOPTIME
------------------------------ --------- ---------- --------- ---------
Bob                                            6000 01-JAN-95 01-JUN-95
Bob                                            7000 01-JUN-95 01-OCT-95
Bob                                            7000 01-OCT-95 01-FEB-96
Bob                                            7000 01-FEB-96 01-JAN-97
Bob                                            6000 01-JAN-97 01-MAR-98
Bob                            <unknown>            01-MAR-98 01-APR-03
Bob                                            5000 01-APR-03 01-JUN-03
 
7 rows selected.
 

Stunning

Doug, November 23, 2003 - 7:42 pm UTC

Tom - that is very crafty. I learn a lot about SQL from you. Is there a bit of inefficiency built into that however? The join with the 2 rows in the dual table, doesn't that increase the work for the db? Don't get me wrong I'm not complaining. :-) Very nice solution. I wasn't sure it could be done. Analytic functions seem to have all sorts of interesting uses.

Tom Kyte
November 24, 2003 - 7:50 am UTC

well -- in order to "make up" data we need to join -- to synthesize that row, it was somewhat unavoidable (joining to a two row table). pipelined functions and procedural code might have worked as well, but this is easier.



connect by but parent in different table

umesh, December 02, 2003 - 2:33 am UTC

Tom
I have a situation

There are 2 tables srch_criteria, srch_hierarchy

srch_criteria is the master
------------------------------------------------------
id name
07 Aerospace Tier 1
08 Functional Tier 2
09 Finance Tier 1
10 x Tier 1
11 y Tier 1
12 z Tier 3

.....

srch_hierarchy has the hierarchy maintained in it
sc_id_child sc_id_parent
-------------------------------------
08 07
09 07

081 08
082 08

091 09
092 09

search hierarchy will have a hierachy
07 is linked with 08,09
08 has children in 081 , 082
09 has children in 091, 092
( I agree bad design)

I take only Tier 1 from the top and traverse thru whole of children and grand children in the hierarchy table
I need a query for it Is it possible in SQL or should i write procedure or function




Tom Kyte
December 02, 2003 - 8:35 am UTC

I don't see "a bad design" here? looks pretty typical.

I do not know what "tier 1" from the "top" is though. hows about you show us the desired output and explain how it was arrived at.

using an order by

Doug, December 02, 2003 - 11:08 am UTC

Tom - as you pointed out to me earlier, in some situations gaps or duplicates with a connect by are redundant because they are the same as an order by clause.

In a situation where I want to ensure the order but not select the column I am ordering by - will this work?
select T.x, T.y from (select x,y,z from R order by z ) T
It the order ensured?


Tom Kyte
December 02, 2003 - 12:25 pm UTC

just

select x, y
from R
order by Z;





umesh, December 09, 2003 - 2:28 am UTC

SQL> l
  1  SELECT
  2   DECODE ( ll ,1 , crit1.name ) one ,
  3   DECODE ( ll ,2 , crit1.name) two ,
  4   DECODE ( ll ,3 , crit1.name) three ,
  5   DECODE ( ll ,4 , crit1.name) four
  6   FROM
  7  (SELECT sc_id_child , sc_id_parent   , LEVEL ll  FROM CNT_SEARCH_HIERARCHY  src
  8  START WITH sc_id_parent='crite000000000000002'
  9  CONNECT BY PRIOR sc_id_child=sc_id_parent
 10   ) , CNT_SEARCH_CRITERIA crit ,  CNT_SEARCH_CRITERIA crit1
 11  WHERE crit.id = sc_id_parent
 12* AND crit1.id=sc_id_child
SQL> /

ONE                                      TWO                          THREE                   FOUR                            
-------------------------- ---------------------------    ------------------------   -------------------                      
Building Automation                                                                                                                                       ACSELON                                                                                                                       Authorized Trainer Programs                                                                                                   Delivery & Installation                                                                                                                                       ADEPT                                                                                                                        Project Management                                                                                                                                         Human Resources 

Love it, but ...

Peter Tran, March 04, 2004 - 3:47 pm UTC

Hi Tom,

What happens if you have a dataset where the top mgr refers to himself? Using your example, if the MGR = 7839 for KING and I run the same query you have Oracle gives me a:

ORA-01436: CONNECT BY loop in user data

A connect by record cannot reference itself otherwise we have an infinite loop situation.

Is there anyway I can filter this out? In other words, I still want the answer you got earlier, but now I have the situation where KING refers to himself in the MGR column.

Thanks,
-Peter

Tom Kyte
March 04, 2004 - 3:56 pm UTC

that case is easy -- since we have the "loop" in a single row, we can simply filter it out in the "connect by" clause:

ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
 
Table created.
 
ops$tkyte@ORA9IR2> update emp set mgr = empno where mgr is null;
 
1 row updated.
 
ops$tkyte@ORA9IR2> commit;
 
Commit complete.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select lpad(' ',level*2,' ')||ename ename, empno, mgr
  2    from emp
  3   START WITH ename = 'KING'
  4  CONNECT BY PRIOR EMPNO = MGR
  5  /
ERROR:
ORA-01436: CONNECT BY loop in user data
 
 
 
no rows selected
 
ops$tkyte@ORA9IR2> select lpad(' ',level*2,' ')||ename ename, empno, mgr
  2    from emp
  3   START WITH ename = 'KING'
  4  CONNECT BY PRIOR EMPNO = MGR <b>AND empno <> mgr</b>
  5  /
 
ENAME                EMPNO        MGR
--------------- ---------- ----------
  KING                7839       7839
    JONES             7566       7839
      SCOTT           7788       7566
        ADAMS         7876       7788
      FORD            7902       7566
        SMITH         7369       7902
    BLAKE             7698       7839
      ALLEN           7499       7698
      WARD            7521       7698
      MARTIN          7654       7698
      TURNER          7844       7698
      JAMES           7900       7698
    CLARK             7782       7839
      MILLER          7934       7782
 
14 rows selected.



And in 10g, you have NOCYCLE to avoid the loops anywhere:

ops$tkyte@ORA10G> select lpad(' ',level*2,' ')||ename ename, empno, mgr
  2    from emp
  3   START WITH ename = 'KING'
  4  CONNECT BY <b>NOCYCLE</b> PRIOR EMPNO = MGR
  5  /
 
ENAME                EMPNO        MGR
--------------- ---------- ----------
  KING                7839       7839
    JONES             7566       7839
      FORD            7902       7566
        SMITH         7369       7902
      SCOTT           7788       7566
        ADAMS         7876       7788
    BLAKE             7698       7839
      ALLEN           7499       7698
      WARD            7521       7698
      MARTIN          7654       7698
      TURNER          7844       7698
      JAMES           7900       7698
    CLARK             7782       7839
      MILLER          7934       7782
 
14 rows selected.
 

Awesome...

Peter Tran, March 04, 2004 - 4:03 pm UTC


Thanks for the quick turn-around.

The combination of Tom Kyte and Oracle really rocks.

Awesome...awesome...awesome.

-Peter

Can I get the parent position w.r.t. to the rownum?

Peter Tran, March 05, 2004 - 4:25 pm UTC

Hi Tom,

Is it possible for me to generate the ParentPosition index w.r.t. to the assigned ROWNUM? Of course, ROWNUM starts at 1, but my example below is using base 0. Either way, you can see where I'm getting at with the example below.

Currently, I'm doing this mapping in code. It would be much better if I can do this within SQL.

Thanks,
-Peter


ROWNUM ENAME EMPNO MGR ParentPos
------ --------------- ---------- ---------- -------------
0 KING 7839 7839 0
1 JONES 7566 7839 0
2 FORD 7902 7566 1
3 SMITH 7369 7902 2
4 SCOTT 7788 7566 1
5 ADAMS 7876 7788 4
6 BLAKE 7698 7839 0
7 ALLEN 7499 7698 6
8 WARD 7521 7698 6
9 MARTIN 7654 7698 6
10 TURNER 7844 7698 6
11 JAMES 7900 7698 6
12 CLARK 7782 7839 0
13 MILLER 7934 7782 12


Tom Kyte
March 05, 2004 - 7:00 pm UTC

what is "parentPos"

ParentPos

Peter Tran, March 06, 2004 - 5:57 pm UTC

ParentPos is the RowNum value of the parent.

King is the parent of Jones, Blake, and Clark which is why ParentPos for them is 0.

Blake is the parent of Alan, Ward, Martin, Turner, and James, so you see their ParentPos is 6 because Blake's RowNum is 6.

Thanks,
-Peter

Tom Kyte
March 06, 2004 - 8:42 pm UTC

scott@ORA9IR2> select rnum, ename, empno, mgr,
2 substr( scbp2, 1, instr(scbp2,',')-1 ) parentpos
3 from (
4 select a.*,
5 substr( scbp, instr(scbp, ',', -1, 2 )+1 ) scbp2
6 from (
7 select rownum-1 rnum, rpad( ' ', 2*level, ' ' ) || ename ename, empno, mgr,
8 sys_connect_by_path( rownum-1, ',' ) scbp
9 from emp
10 start with mgr is null
11 connect by prior empno = mgr
12 ) a
13 )
14 /

RNUM ENAME EMPNO MGR PARENT
---------- -------------------- ---------- ---------- ------
0 KING 7839
1 JONES 7566 7839 0
2 SCOTT 7788 7566 1
3 ADAMS 7876 7788 2
4 FORD 7902 7566 1
5 SMITH 7369 7902 4
6 BLAKE 7698 7839 0
7 ALLEN 7499 7698 6
8 WARD 7521 7698 6
9 MARTIN 7654 7698 6
10 TURNER 7844 7698 6
11 JAMES 7900 7698 6
12 CLARK 7782 7839 0
13 MILLER 7934 7782 12

14 rows selected.



Just amazing...

Peter Tran, March 06, 2004 - 9:45 pm UTC

Excellent!

Thanks!
-Peter

CONNECT BY

ANOOP GUPTA, March 09, 2004 - 12:08 am UTC

it would very useful for me to understand connect by
specially to get first and last in tree.

one solution i also have for geting first and last
in tree.
select *
from
(


select level x, party, orig_recp, ref_recp, orig_amt, ref_amt
from my_level1
where party between 'A0001' AND 'A0003'
connect by
prior orig_recp = ref_recp and
prior party = party
start with ref_recp is null
) a
where a.ref_recp is null
or a.orig_recp not in (select nvl(ref_recp,'NULL')
from my_level1
where party between 'A0001' AND 'A0003'
connect by
prior orig_recp = ref_recp and
prior party = party
start with ref_recp is null
)

its output is similar as your client want


1 A0001 100 10000 500
4 A0001 130 120 100 30
1 A0001 300 10000 500
4 A0001 330 320 100 30
1 A0003 100 10000 500
4 A0003 130 120 100 30


Find Employee's top most manager

Mita, August 19, 2004 - 6:17 pm UTC

For the following data,

Emp_id Mgr_id
1 2
2 3
3 4
4
5 6
6 7
7

I need the following result
EmpId TopManager
1 4
2 4
3 4
4 4
5 7
6 7
7 7

How can I achieve this ??


Tom Kyte
August 19, 2004 - 8:14 pm UTC

if you gave me a create table and insert into's, i'd show you how to use a scalar subquery with connect by using max to do this.

maybe that is enough to get you going....

Sample Data

Mita, August 24, 2004 - 12:17 pm UTC

create table emp (emp_Id number, mgr_Id number);

insert into emp values(1,2);
insert into emp values(2,3);
insert into emp values(3,4);
insert into emp values(4,Null);
insert into emp values(5,6);
insert into emp values(6,7);
insert into emp values(7,Null);


Tom Kyte
August 24, 2004 - 3:22 pm UTC

 
ops$tkyte@ORA9IR2> select emp_id,
  2         to_number( substr(
  3             (select max( to_char(level,'fm000009') || ' ' || emp_id )
  4                    from emp e2
  5                   start with e2.emp_id = e1.emp_id
  6            connect by prior mgr_id is not null and prior mgr_id = emp_id )
  7             , 8 ) ) top_mgr
  8    from emp e1
  9  /
 
    EMP_ID    TOP_MGR
---------- ----------
         1          4
         2          4
         3          4
         4          4
         5          7
         6          7
         7          7
 
7 rows selected.
 

More Connect by

Vinnie, October 14, 2004 - 5:29 pm UTC

Tom,

I have the following:

create table parent( rowid number(2) primary key, id varchar2(14) );

create table child( rowid_parent number(2), id varchar2(14) );

insert into parent(1, '12345');
insert into parent(2, '12346');
insert into parent(3, '12347');
insert into parent(4, '12348');
insert into child (1,'12346');
insert into child (2,'12347');
insert into child (2,'12348');

I would like to pass in a id (i.e. 12345)
and find all his child (1,'12346')
Then for each child find his children (1,'12347') and so on down the tree.

Like the following:
12345
12346
12347
12348


Tom Kyte
October 14, 2004 - 7:55 pm UTC

what have you tried so far.......

wacky structure to store parent/child in don't you think? probably could find a less efficient structure for a hierarchy....

Vinnie, October 15, 2004 - 8:44 am UTC

create table parent( rowid number(2) primary key, id varchar2(14) );

create table child( rowid_parent number(2), id varchar2(14) );

insert into parent(1, '12345');
insert into parent(2, '12346');
insert into parent(3, '12347');
insert into parent(4, '12348');
insert into child (1,'12346');
insert into child (2,'12347');
insert into child (2,'12348');

I have tried the following:

select a.id from parent a, child b
start with a.id='12345'
connect by a.rowid = b.rowid_parent;


Was hoping this could be accomplised using this type of approach....somehow!!


Tom Kyte
October 15, 2004 - 11:53 am UTC

do you have to live with this "structure"? ugh. it hurts my head to look at it. the names don't even make sense.

Ugh

Vinnie, October 15, 2004 - 2:02 pm UTC

Ugh is right, I have to live with the structure!

Perhaps I can explain this better.

I have table EMP with a list of emplyees, and table SUB with each employees subordinates. What I need is a list of all the subordinates for a given EMP rolled up to include all subordinate emplyees. So if I select employee SMITH I get:

SMITH
ADAMS
JONES
ARMSTRONG
FRANK
JAMES
BROWN

But The structure is still the same as described before.

CREATE TABLE EMP (row_id number, ename varchar2(30));
CREATE TABLE SUB (row_id_parent number, ename varchar2(30));
Just assume the names are unique for this test case.

INSERT INTO EMP (1,'SMITH');
INSERT INTO EMP (2,'ADAMS');
INSERT INTO EMP (3,'JONES');
INSERT INTO EMP (4,'ARMSTRONG');
INSERT INTO EMP (5,'FRANK');
INSERT INTO EMP (6,'JAMES');
INSERT INTO EMP (7,'BROWN');
INSERT INTO SUB (1,'ADAMS');
INSERT INTO SUB (1,'BROWN');
INSERT INTO SUB (2,'JONES');
INSERT INTO SUB (2,'ARMSTONG');
INSERT INTO SUB (2,'JAMES');
INSERT INTO SUB (4,'FRANK');

Hope this helps with your head ache:)


Tom Kyte
October 15, 2004 - 5:46 pm UTC

ops$tkyte@ORA9IR2> select rpad('*',2*level,'*') || ename  name
  2    from (select row_id_parent, ename, (select row_id from emp where ename = sub.ename) row_id from sub)
  3   start with row_id_parent = ( select row_id from emp where ename = 'SMITH' )
  4   connect by prior row_id = row_id_parent
  5  /
 
NAME
------------------------------
**ADAMS
****JONES
****ARMSTRONG
******FRANK
****JAMES
**BROWN
 
6 rows selected.


I figure "smith" can be "implied", you could union him in, but you already sort of know "smith" 

GREAT

Vinnie, October 18, 2004 - 1:41 pm UTC

This works great!!

Can you explain this all in plain text?

Tom Kyte
October 18, 2004 - 2:03 pm UTC

select row_id_parent, ename,
(select row_id from emp where ename = sub.ename) row_id
from sub



gives us the single table with the parent/child info we need. we needed the row_id_parent and row_id<child> together, then connect by is trivial.



Able to show only certain hierarchies?

Jon, October 21, 2004 - 12:37 pm UTC

Is it possible to only show certain hierarchies that match a WHERE condition? For example in the data below, FRENCH reports to both DAVIS AND BLAKE. I would like to show ONLY those hierarchies.

drop table t1;
create table t1 (EMP varchar2(30), MGR varchar2(30));
insert into t1 values ('ADAMS',null);
insert into t1 values ('BLAKE','ADAMS');
insert into t1 values ('CHARLES','ADAMS');
insert into t1 values ('DAVIS','ADAMS');
insert into t1 values ('EDWARDS','BLAKE');
insert into t1 values ('FRENCH','BLAKE');
insert into t1 values ('GAVIN','BLAKE');
insert into t1 values ('HOWARD','CHARLES');
insert into t1 values ('INGRAHAM','CHARLES');
insert into t1 values ('JONES','CHARLES');
insert into t1 values ('KING','DAVIS');
insert into t1 values ('LEWIS','DAVIS');
-- FRENCH REPORTS TO BOTH DAVIS AND BLAKE
insert into t1 values ('FRENCH','DAVIS');
insert into t1 values ('MATTHEWS','FRENCH');
insert into t1 values ('NEWMAN','FRENCH');
COMMIT;
SELECT substr(LPAD(' ',2*(LEVEL - 1))||EMP,1,40) Employee
from t1
connect by prior
emp = mgr
start with mgr is null;

This results in
EMPLOYEE
----------------------------------------
ADAMS
BLAKE
EDWARDS
FRENCH
MATTHEWS
NEWMAN
GAVIN
CHARLES
HOWARD
INGRAHAM
JONES
DAVIS
KING
LEWIS
FRENCH
MATTHEWS
NEWMAN

I would like it to only show:

EMPLOYEE
----------------------------------------
ADAMS
BLAKE
EDWARDS
FRENCH
MATTHEWS
NEWMAN
GAVIN
DAVIS
KING
LEWIS
FRENCH
MATTHEWS
NEWMAN

Is that possible

Tom Kyte
October 21, 2004 - 3:11 pm UTC

is it possible for there to be multiple "roots" in this? or will french always roll up to a single root node?

Thanks for responding

Jon, October 21, 2004 - 3:39 pm UTC

There could be multiple roots. How would the answer differ if the answer was a single root? I ask because it may be possible to create a view that makes the multiple roots all point to a single (new) root, if that would make this more doable.

Tom Kyte
October 22, 2004 - 3:10 pm UTC

if there were one root, we could "start with" using this:

ops$tkyte@ORA9IR2> select level, sys_connect_by_path( emp, '/' ) scbp
  2    from t1
  3   start with emp = 'FRENCH'
  4  connect by prior mgr = emp
  5  /
 
     LEVEL SCBP
---------- -------------------------
         1 /FRENCH
         2 /FRENCH/BLAKE
         3 /FRENCH/BLAKE/ADAMS
         1 /FRENCH
         2 /FRENCH/DAVIS
         3 /FRENCH/DAVIS/ADAMS
 
6 rows selected.
 
ops$tkyte@ORA9IR2> select max( to_char(level,'fm00009') || ' ' || sys_connect_by_path( emp, '/' ) ) scbp
  2    from t1
  3   start with emp = 'FRENCH'
  4  connect by prior mgr = emp
  5  /
 
SCBP
-------------------------
00003 /FRENCH/DAVIS/ADAMS


<b>see how we could get ADAMS... but if there are multiple roots that each take a different number of levels to get to -- that would be a problem</b>

with one root, you would START WITH EMP = ( SELECT that root )


but in hindsight -- i see that would not work either.  We'd have to actually run a connect by query per row -- just to see if french was in the hierarchy up or down the tree.  it'd be very expensive.

I'd probably rather run two queries and may union all them together -- one that runs "up" the tree from french, another that runs "down the tree" from french.

 

Further explanation

Jon, October 21, 2004 - 3:49 pm UTC

The following demostrates what I meant by creating a view to point multiple roots to a new root:

drop table t1;
create table t1 (EMP varchar2(30), MGR varchar2(30));
insert into t1 values ('ADAMS',null);
insert into t1 values ('BLAKE','ADAMS');
insert into t1 values ('CHARLES','ADAMS');
insert into t1 values ('DAVIS','ADAMS');
insert into t1 values ('EDWARDS','BLAKE');
insert into t1 values ('FRENCH','BLAKE');
insert into t1 values ('GAVIN','BLAKE');
insert into t1 values ('HOWARD','CHARLES');
insert into t1 values ('INGRAHAM','CHARLES');
insert into t1 values ('JONES','CHARLES');
insert into t1 values ('KING','DAVIS');
insert into t1 values ('LEWIS','DAVIS');
-- FRENCH REPORTS TO BOTH DAVIS AND BLAKE
insert into t1 values ('FRENCH','DAVIS');
insert into t1 values ('MATTHEWS','FRENCH');
insert into t1 values ('NEWMAN','FRENCH');
insert into t1 values('OLIVER',NULL);
-- FRENCH ALSO REPORTS TO OLIVER
insert into t1 values('FRENCH','OLIVER');
drop view view_t1;
create view view_t1 as
select '.' EMP,null MGR from dual
union
select emp, nvl(mgr, '.') MGR from t1;
SELECT substr(LPAD(' ',2*(LEVEL - 1))||EMP,1,40) Employee
from t1
connect by prior
emp = mgr
start with mgr is null;
SELECT substr(LPAD(' ',2*(LEVEL - 1))||EMP,1,40) Employee
from view_t1
connect by prior
emp = mgr
start with mgr is null;



To Jon ...

Gabe, October 22, 2004 - 4:42 pm UTC

So ... how should your report look if FRENCH who now reports to DAVIS gets to lead a new project having DAVIS as a resource (it frequently happens in real life)?

Maybe hierarchical queries are not quite applicable to your _model_. They work on hierarchies ... don't work very well on graphs.

Tom Kyte
October 23, 2004 - 9:32 am UTC

(in 10g with NOCYCLE and isleaf and other new functions -- they will work much much better with graphs)

Not quite there yet

Jon, October 25, 2004 - 9:43 am UTC

It may be intuitively obvious to you, but I'm struggling with a query that will give me the results I need using your "two query - one up, one down - union all" suggestion. I need the query to return the following results:

Level Emp
1 ADAMS
2 BLAKE
3 FRENCH
4 MATTHEWS
4 NEWMAN
2 DAVIS
3 FRENCH
4 MATTHEWS
4 NEWMAN
1 OLIVER
2 FRENCH
3 MATTHEWS
3 NEWMAN

Also, although we're not at 10g yet, if that has a more straightforward solution, I'd be interested in seeing the solution.

Tom Kyte
October 25, 2004 - 11:06 am UTC

you'll get the output of two queries -- unioned together.  It will not be exactly like that above -- it'll be the data you need however -- 

ops$tkyte@ORA9IR2> select level, rpad('*',2*level,'*') || emp ename
  2    from t1
  3   start with emp = 'FRENCH'
  4  connect by prior mgr = emp
  5  /
 
     LEVEL ENAME
---------- ------------------------------
         1 **FRENCH
         2 ****BLAKE
         3 ******ADAMS
         1 **FRENCH
         2 ****DAVIS
         3 ******ADAMS
         1 **FRENCH
         2 ****OLIVER
 
8 rows selected.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select level, rpad('*',2*level,'*') || emp ename
  2    from t1
  3   start with mgr = 'FRENCH'
  4  connect by prior emp = mgr
  5  /
 
     LEVEL ENAME
---------- ------------------------------
         1 **MATTHEWS
         1 **NEWMAN


that is what you'll have to work with if you want just "french" 

A reader, November 22, 2004 - 4:23 am UTC


How to skip the self reference

Ara, November 29, 2004 - 5:54 am UTC

ORACLE : 8i
Problem: Connect-by because self-reference

Hi Tom,
Is there any way to skip the self-reference.
I need all the records before self-reference. Suppose my table has 1000000 records, there is a self-referece at 50000th records. why should my query fail?
is there any way to skip the this records and move further.
Thanks


Tom Kyte
November 29, 2004 - 8:25 am UTC

in 10g there is a "NOCYCLE" operation. to detect a connect by loop prior to that was not part of the functionality of connect by.

connect by in view

A reader, December 01, 2004 - 10:13 am UTC

i am refering toms comment:

"see how we could get ADAMS... but if there are multiple roots that each take a
different number of levels to get to -- that would be a problem"

we are exactky in that situation, we wannt a connect by, and have a root-node column returned

select :x, ...
[...]
start with :x

would do the job, but we need the functionality in a view, so we can't use the start with clause.

one solution would be to

select substr(sys_connect_by_path, 1, 14) RNODE, ...

(the column is char(14))

the problem is there is a bug with sys_connect_by_path when used in a query.

select RNODE friom viewxyz; works but using RNODE in a where clause raises an ORA-600

the problem is, we currently can't update.


maybe someone has a different solution?

Regarding Mita's request (08/24/2004)

Jet-Lagged Jim, February 11, 2005 - 4:10 am UTC

Hi Tom.

So if we wanted a third column in the result set that corresponds to the hierarchical level of each manager's employees, how would that be done? Tried several things, but can't quite seem to "get it".

example output:

emp_id top_mgr hier_level
------- -------- -----------
1 4 4
2 4 3
3 4 2
4 4 1
5 7 3
6 7 2
7 7 1

Thanks so much!

From Jet-Lagged Jim

Jim, February 11, 2005 - 1:08 pm UTC

Regading my above inquiry - please disregard, I figured it out. T'was a muddled-brain posting at 1:00am from a handful of timezones. Feel free to remove. Thanks again.

Tom Kyte
February 12, 2005 - 8:16 am UTC

don't sweat it, i just got back last night 6 hours off myself :)

2 Tables in using start with , connect by

Lamya, May 09, 2005 - 3:35 pm UTC

I have 2 tables

CREATE TABLE HYBRIDOMA
(
HYBRIDOMA_ID NUMBER(10),
HYBRIDOMA_NAME VARCHAR2(20 BYTE),
)


CREATE TABLE CLONE
(
CLONE_ID NUMBER(10),
CLONE_NAME VARCHAR2(60 BYTE),
PARENT_TYPE VARCHAR2(10 BYTE),
PARENT_ID NUMBER(22)
)

INSERT INTO CLONE ( CLONE_ID, CLONE_NAME, PARENT_TYPE, PARENT_ID ) VALUES (
1, 'clone_from_h1', 'HYBRID', 1);
INSERT INTO CLONE ( CLONE_ID, CLONE_NAME, PARENT_TYPE, PARENT_ID ) VALUES (
2, 'clone_from_clone1', 'CLONE', 1);
INSERT INTO CLONE ( CLONE_ID, CLONE_NAME, PARENT_TYPE, PARENT_ID ) VALUES (
3, 'clone_from_h1', 'HYBRID', 1);
commit;

INSERT INTO HYBRIDOMA ( HYBRIDOMA_ID, HYBRIDOMA_NAME ) VALUES (
0, 'fff');
INSERT INTO HYBRIDOMA ( HYBRIDOMA_ID, HYBRIDOMA_NAME ) VALUES (
1, 'rrrrr');
INSERT INTO HYBRIDOMA ( HYBRIDOMA_ID, HYBRIDOMA_NAME ) VALUES (
2, 'ddddddd');
commit;

Now the clone table has parent type = clone or hybrid. thus the clone can have parents in the hybridoma table.

I would like to create a select statement which would start on the top and select all children , from both hybridoma and clone .

I tried this but its not helping me .

select level , lpad('*',level*2,'*') || decode( parent_type ,'CLONE' , clone_name , 'HYBRID' , hybridoma_name)
from clone c , hybridoma h
where c.parent_id = h.HYBRIDOMA_ID
start with c.parent_id = 1
connect by prior c.clone_id = c.parent_id and clone_ID <> parent_id


Very Informative - but can you get only a portion of the hierarchy?

Marshall B Thompson, January 30, 2006 - 7:22 pm UTC

late to the party, I know, but I find this thread very informative. Back to the very original example you used at the top of the thread, you produced the results:

ENAME EMPNO MGR
--------------- ---------- ----------
KING 7839
JONES 7566 7839
SCOTT 7788 7566
ADAMS 7876 7788
FORD 7902 7566
SMITH 7369 7902
BLAKE 7698 7839
ALLEN 7499 7698
WARD 7521 7698
MARTIN 7654 7698
TURNER 7844 7698
JAMES 7900 7698
CLARK 7782 7839
MILLER 7934 7782

What if you had the employee numbers for allen, ward, martin, turner, james, and miller, and wanted to get the results below. (The hierarchy above just those employees.) How would that be done?

ENAME EMPNO MGR
--------------- ---------- ----------
KING 7839
BLAKE 7698 7839
ALLEN 7499 7698
WARD 7521 7698
MARTIN 7654 7698
TURNER 7844 7698
JAMES 7900 7698
CLARK 7782 7839
MILLER 7934 7782



Tom Kyte
January 31, 2006 - 1:57 am UTC

and what if martin was not in the list?

I must be missing something

Marshall B Thompson, January 31, 2006 - 8:09 am UTC

From your response, this must be obvious, but is not to me at the moment. (But, I would expecte Martin to not be in the output, but all else the same.)

Against my HR sample schema, running the following query:

select lpad(' ',level*2,' ')||last_name ename, employee_id, manager_id
from hr.employees
start with manager_id is null
connect by prior employee_id = manager_id

I get:
King
100

Kochhar
101 100

Greenberg
108 101

Faviet
109 108

Chen
110 108

Sciarra
111 108

Urman
112 108

Popp
113 108

Whalen
200 101

Mavris
203 101
................etc.


Given employee id's 111 (Sciarra) and 112 (Urman), I'd like to get the relevant hierarchy from them up. Desired results would be:
King
100

Kochhar
101 100

Greenberg
108 101

Sciarra
111 108

Urman
112 108

But, when I do this:
select lpad(' ',level*2,' ')||last_name ename, employee_id, manager_id
from hr.employees
where employee_id in (111, 112)
start with manager_id is null
connect by prior employee_id = manager_id

I wind up with:
Sciarra
111 108

Urman
112 108

How can I accomplish the desired results?

Ahhhhhh

Marshall B Thompson, January 31, 2006 - 10:57 am UTC

found my inspiration here: </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:13129005201417 <code>

select lpad(' ',level*2,' ')||last_name ename, employee_id, manager_id
from hr.employees
start with employee_id in (111, 112)
connect by prior manager_id = employee_id

Sciarra
111 108

Greenberg
108 101

Kochhar
101 100

King
100

Urman
112 108

Greenberg
108 101

Kochhar
101 100

King
100

Tom Kyte
January 31, 2006 - 3:32 pm UTC

I did not mean to make you think this was "obvious", just that the answer would have to be very different if "martin" (or indeed any of the leaves) were not to be in the output!

Connect by prior

Sanjeev Vibuthi, February 28, 2006 - 9:36 am UTC


Hi Tom,

This is my table... I have set all reports_to correctly.. But the rows are repeating when I executed the following query...

desc test_unit
Name Null? Type
----------------------------------------------------------------- -------- ------------------------
D_CD NOT NULL NUMBER(3)
U_CD NOT NULL NUMBER(3)
U_NAME VARCHAR2(50)
U_TYPE VARCHAR2(10)
REPORTS_TO NUMBER(3)

SELECT SUBSTR(lpad(' ',Level*2,' '),1,30)||U_name Name,
U_Cd, Repors_To
From test_unit
where d_cd=32
start with d_cd=32 and u_cd=500
connect by prior u_cd = repors_to
--d_cd=32 where 32 is Top Level
NAME U_CD REPORS_TO
-------------------------------------------------------------------------------- ---------- ------------
Central - NY 500 0
EZ 301* 500
EZ-SDivision 202 301*
Sdivision2 13 202
Sdivision3 70 202
EZ-KDivision 203 301*
KDivision2 56 203
KDivision3 59 203
EZ-MDivision 223 301*
MDivision3 68 223
MDivision2 44 223
-- It is repeating (already displayed in "ES_SDivision")
Sdivision2 13 202?
Sdivision3 70 202?
-- The following UNITS should come under "SZ" - "SZ-CR Division"
CR Division-4 37 221***
CR Division-5 88 221
CR Division-6 92 221

NZ 302 500
NZ-GP-Division 211 302
GP-Division-3 40 211
GP-Division-4 77 211
NZ-MK-Division 212 302
MK-Division-2 46 212
MK-Division-3 52 212
MK-Division-4 89 212
NZ-BP-Division 213 302
BP-Division-4 84 213
BP-Division-5 91 213
-----The followings UNITS already displayed under "EZ" - "EZ-KDivision "
KDivision2 56 203
KDivision3 59 203
-----The followings UNITS already displayed under "SZ" - "SZ-CK Division "
CK Division-3 71 224
CK Division-4 72 224

-- The following Two Zones are Coming correctly
SZ 303 500
SZ-CR Division 221*** 303
CR Division-4 37 221
CR Division-5 88 221
CR Division-6 92 221
SZ-MR Division 222 303
MR Division-4 86 222
MR Division-3 62 222
SZ-CK Division 224 303
CK Division-3 71 224
CK Division-4 72 224
SZ-SR Division 237 303
SR Division-4 85 237
SR Division-3 64 237
WZ 304 500
WZ - AN Division 231 304
AN Division-4 41 231
AN Division-5 74 231
WZ - GL Division 232 304
GL Division-3 45 232
GL Division-4 69 232
WZ - PG Division 234 304
PG Division-1 57 234
PG Division-2 67 234
WZ - BH Division 236 304
BH Division-2 87 236
BH Division-1 11 236

Where is the error in my question...

Thanks in Adv..

Sanjeev Vibuthi


Tom Kyte
February 28, 2006 - 9:51 am UTC

I don't know.

Why don't I know?

because I don't have anything to reproduce with, no create table, no insert intos, we have no clue what you original source data looks like at all.

So, who knows.

(can you tell this gets frustrating when day after day no one reads the thing that says

.... If your followup requires a response that might include a query, you had better supply very very simple create tables and insert statements. I cannot create a table and populate it for each and every question. The SMALLEST create table possible (no tablespaces, no schema names, just like I do in my examples for you) ...)



Alex, February 28, 2006 - 10:19 am UTC

Especially since you have to check a box saying you agreed. You could probably change that to say anything and people would just click it. "I really like Michael Bolton he's fantastic...." ;)

Connect by

Sanjeev Vibuthi, March 01, 2006 - 3:13 am UTC

Hi Tom,

I will explain my problem with EMP Table ...

I added COMPANY_CD column to EMP table and renamed it to TEST_EMP (PK(COMPANY_CD, EMPNO))

Test_Emp table contains company wise Employee Details

SCOTT@ testdb 01-MAR-06>SELECT COMPANY_CD, EMPNO,ENAME,MGR FROM TEST_EMP
2* ORDER BY COMPANY_CD
SCOTT@ testdb 01-MAR-06>/

COMPANY_CD EMPNO ENAME MGR
---------- ---------- ---------- ----------
101 7369 SMITH 7902
7499 ALLEN 7698
7521 WARD 7698
7566 JONES 7839
7654 MARTIN 7698
7698 BLAKE 7839
7782 CLARK 7839
7788 SCOTT 7566
7839 KING
7844 TURNER 7698
7876 ADAMS 7788
7900 JAMES 7698
7902 FORD 7566
7934 MILLER 7782
201 7369 SMITH 7902
7499 ALLEN 7698
7521 WARD 7698
7566 JONES 7839
7654 MARTIN 7698
7698 BLAKE 7839
7782 CLARK 7839
7788 SCOTT 7566
7839 KING
7844 TURNER 7698
7876 ADAMS 7788
7900 JAMES 7698
7902 FORD 7566
7934 MILLER 7782

28 rows selected.

-- I have written connect by query to get Employee Hierarchy in a Given Company

SCOTT@ testdb 01-MAR-06>
1 Select substr(lpad('-',Level*2,'-')||Empno,1,15) Empno,
2 ename From test_emp
3 where company_cd=101
4 start with company_cd=101
5* connect by prior empno=mgr
SCOTT@ testdb 01-MAR-06>/

EMPNO ENAME
--------------- ----------
--7788 SCOTT
----7876 ADAMS
--7902 FORD
----7369 SMITH
--7499 ALLEN
--7521 WARD
--7654 MARTIN
--7844 TURNER
--7900 JAMES
--7934 MILLER
--7876 ADAMS
--7566 JONES
----7788 SCOTT
------7876 ADAMS
----7902 FORD
------7369 SMITH
------7876 ADAMS
------7369 SMITH
--7698 BLAKE
----7499 ALLEN
----7521 WARD
----7654 MARTIN
----7844 TURNER

.......
.......
.......

55 rows selected. -- But table is having only 24 records

SCOTT@ testdb 01-MAR-06>
1 Select substr(lpad('-',Level*2,'-')||Empno,1,15) Empno,
2 ename From (Select * from test_emp where company_cd=101) test_emp -- Inline View
3 start with company_cd=101
4* connect by prior empno=mgr
SCOTT@ testdb 01-MAR-06>/

EMPNO ENAME
--------------- ----------
--7788 SCOTT
----7876 ADAMS
--7902 FORD
----7369 SMITH
--7499 ALLEN
--7521 WARD
--7654 MARTIN
--7844 TURNER
--7900 JAMES
--7934 MILLER
--7876 ADAMS
--7566 JONES
----7788 SCOTT
------7876 ADAMS
----7902 FORD
------7369 SMITH
--7698 BLAKE
----7499 ALLEN
----7521 WARD
----7654 MARTIN
......
......

39 rows selected. -- Still I got more records

SCOTT@ testdb 01-MAR-06>
1 Select substr(lpad('-',Level*2,'-')||Empno,1,15) Empno,
2 ename From (Select * from test_emp where company_cd=101) test_emp
3 start with mgr is null -- Change condition
4* connect by prior empno=mgr
SCOTT@ testdb 01-MAR-06>/

EMPNO ENAME
--------------- ----------
--7839 KING
----7566 JONES
------7788 SCOTT
--------7876 ADAMS
------7902 FORD
--------7369 SMITH
----7698 BLAKE
------7499 ALLEN
------7521 WARD
------7654 MARTIN
------7844 TURNER
------7900 JAMES
----7782 CLARK
------7934 MILLER

14 rows selected.

I got what i want, but is there any other way to do this..

Even though my doublt is ... in From clause I have used inline view which contains only 14 records
but result shows 39 records... Why they are repeating.. If parent key is combination of company_cd, mgr then
how to write connect by clause


Thans in Adv.

Sanjeev Vibuthi

Tom Kyte
March 01, 2006 - 8:16 am UTC

well, if you don't have "start with", you start with EACH RECORDS (meaning - there are 14 emps, you'll have 14 trees)

so, what do you want to start with exactly


They are NOT repeating, you asked for them all - you have 14 "trees"

Tom Fox, March 01, 2006 - 9:44 am UTC

Tom, so what I'm getting out of the above query (repeating, not repeating, whatever) is that the START WITH clause _must_ reference the left hand side of the CONNECT BY clause. Is this correct?

--Tom

Tom Kyte
March 01, 2006 - 9:53 am UTC

I don't even know what the "left hand side" of a connect by would be, so no.


start with identifies the set of ROOT NODES.

connect by builds a hierarchy of child nodes under each root.

Tom Fox, March 01, 2006 - 9:45 am UTC

Yeah, I haven't finished my coffee yet. I meant the right hand side in my post above.

--Tom

Tom Kyte
March 01, 2006 - 9:53 am UTC

I don't know what the right hand side is either.

Tom Fox, March 01, 2006 - 5:37 pm UTC

Oh, you know, the right hand side.... Work with me here. :)

In this query:

scott@ORA8I.WORLD> select lpad(' ',level*2,' ')||ename ename, empno, mgr
2 from emp
3 START WITH MGR IS NULL
4 CONNECT BY PRIOR EMPNO = MGR
5 /

What I was trying to ask: Does the START WITH clause (meaning MGR on line 3) have to reference the right side of the CONNECT BY clause (line 4, in this case), in order for the tree to be formed correctly?

In other terms, does START WITH always have to reference one of the fields in the CONNECT BY clause?

Tom Kyte
March 02, 2006 - 9:11 am UTC

No, it does not have to.

your start with simply identifies the rows that start hierarchies.

You might want a tree for every MGR that has the JOB = 'X'

so the start with would be referencing the EMPNO column and the JOB column (to find empnos that are mgrs and have a job = 'X' )

Can we add salary to the query?

Mike, March 02, 2006 - 1:50 pm UTC

Hi Tom,

Could you show how to add Salary to this query using the connect by? for example I would like to add a column to the query that shows the total salary for each manager and his/her employees including all employees down to the last leaf.

In you example above your output shows

ENAME EMPNO MGR
--------------- ---------- ----------
KING 7839
JONES 7566 7839
SCOTT 7788 7566
ADAMS 7876 7788
FORD 7902 7566
SMITH 7369 7902

In other words the overall salary column for King would be the sum of KING+JONES+SCOTT+ADAMS+FORD+SMITH because they all fall under KING or an employee of KING, the overall salary column for SCOTT would be SCOTT's salary + ADAMS and FORD's would be FORD's salary + SMITH's

Of course SMITH's and ADAM's salaries would only include theirs because they don't manage anyone.

Any help would be greatly appreciated

Tom Kyte
March 02, 2006 - 2:25 pm UTC

ops$tkyte@ORA9IR2> select rpad('*',2*level,'*')||ename nm, empno, mgr,
  2        (select sum(sal) from emp e2 start with e2.empno = emp.empno connect by prior empno = mgr) sum_sal
  3    from emp
  4   start with mgr is null
  5   connect by prior empno = mgr;

NM                        EMPNO        MGR    SUM_SAL
-------------------- ---------- ---------- ----------
**KING                     7839                 29025
****JONES                  7566       7839      10875
******SCOTT                7788       7566       4100
********ADAMS              7876       7788       1100
******FORD                 7902       7566       3800
********SMITH              7369       7902        800
****BLAKE                  7698       7839       9400
******ALLEN                7499       7698       1600
******WARD                 7521       7698       1250
******MARTIN               7654       7698       1250
******TURNER               7844       7698       1500
******JAMES                7900       7698        950
****CLARK                  7782       7839       3750
******MILLER               7934       7782       1300

14 rows selected.
 

Using multiple table join and connect by

Mike, March 10, 2006 - 3:29 pm UTC

Tom,..

Thanks for the quick reply, I apologize for not getting back to you quickly.

Maybe I didn't explain well enough. a scalar subquery work very well but I am concerned about the table growing and causing the query to become very taxing to the system.

What we have is a table of entities and a table of entity tickets, they are

CREATE TABLE ENTITY (
ENTITY_UUID VARCHAR2(32),
NAME VARCHAR2(256),
PARENT_UUID VARCHAR2(32)
)
/

Table created.

CREATE TABLE ENTITY_TCKT (
ENTITY_UUID VARCHAR2(32),
CURRENT_LIFECYCLE_STATE NUMBER
)
/

To populate these tables and show some stats,..

insert into entity values ('13E7CAA5FDEB42518A798A77A19F70B0','Level1
Entity',NULL);
insert into entity values ('66A6A6EFFA9D46BE82EC8F5CFFAC91B9','Level4
Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('DCF6B6366D6449DB95A5AEA6B14F31F7','Level5
Entity','66A6A6EFFA9D46BE82EC8F5CFFAC91B9');
insert into entity values ('E2FD444948714528805EBFFA102511F5','Level5
Entity','CB4E1B74035947B9A5B9B0FE264DF4E7');
insert into entity values ('2E1E0646AC9F4BB9A6E4A747B20B2595','Level2
Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('54133391FDD54221B11382A20DFC38AA','Level2
Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('95A5F85D68184DB7A49F9DF7A236F9AF','Level5
Entity','99762DC75A5D42DCBEA6950D7011F130');
insert into entity values ('EAD30C5578BD491991B0D7049CD4F277','Level4
Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('536FCF7E4A5D457B8C3AECBED878FDBF','Level3
Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('883FD970DF264B7A9DC0DFBAF225012A','Level4
Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('C23E104B5AA044F795A6896B1C0B08E4','Level3
Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('64BA9F3295194A3B955FD446DBB2E7EC','Level4
Entity','C23E104B5AA044F795A6896B1C0B08E4');
insert into entity values ('0FE28D72C40C4D6FBB439A49B0BE6D3F','Level5
Entity','64BA9F3295194A3B955FD446DBB2E7EC');
----------list goes on------------------

Then to generate some random ticket data

DECLARE
l_tckt_ktr NUMBER;
l_entity_ktr NUMBER;
BEGIN
FOR i IN (select entity_uuid from entity)
LOOP
l_entity_ktr := round(dbms_random.value(1,6),0);
DBMS_OUTPUT.PUT_LINE('Processing entity -> '||i.entity_uuid||'
l_entity_ktr = '||l_entity_ktr);
IF l_entity_ktr = 1 THEN
l_tckt_ktr := round(dbms_random.value(1,10),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 3 THEN
l_tckt_ktr := round(dbms_random.value(50,100),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 5 THEN
l_tckt_ktr := round(dbms_random.value(500,1000),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
END IF;
END LOOP;
END;
/

PL/SQL procedure successfully completed.

Now that the tables have data, try two ways to get the query

First the Scalar Subquery

set lines 256
column lpad('',2*level)||e.name format a50
column lpad('',2*level)||e.entity_uuid format a50
select lpad(' ', 2 * level)||e.name, lpad(' ', 2 * level)||e.entity_uuid, level,
(select count(*) from entity_tckt where current_lifecycle_state = 0 and
entity_uuid in (
select a.entity_uuid from entity a start with a.entity_uuid =
e.entity_uuid connect by prior a.entity_uuid = a.parent_uuid)) as nbr
from entity e
start with e.entity_uuid = '13E7CAA5FDEB42518A798A77A19F70B0' connect by prior
e.entity_uuid = e.parent_uuid
/
LPAD('',2*LEVEL)||E.NAME
LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
--------------------------------------------------
-------------------------------------------------- ---------- ----------
Level1 Entity
13E7CAA5FDEB42518A798A77A19F70B0 1 15533
Level2 Entity
2E1E0646AC9F4BB9A6E4A747B20B2595 2 7
Level2 Entity
54133391FDD54221B11382A20DFC38AA 2 15526
Level3 Entity
536FCF7E4A5D457B8C3AECBED878FDBF 3 890
Level4 Entity
66A6A6EFFA9D46BE82EC8F5CFFAC91B9 4 51
Level5 Entity
DCF6B6366D6449DB95A5AEA6B14F31F7 5 0
Level4 Entity
EAD30C5578BD491991B0D7049CD4F277 4 76
Level4 Entity
883FD970DF264B7A9DC0DFBAF225012A 4 0
Level4 Entity
0F01178489CB421CA5A4C55AEC98300E 4 4
Level5 Entity
5B3B033B4A30443BB1B6F57C1BB05399 5 4
Level4 Entity
CB4E1B74035947B9A5B9B0FE264DF4E7 4 69

LPAD('',2*LEVEL)||E.NAME
LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
--------------------------------------------------
-------------------------------------------------- ---------- ----------
Level5 Entity
E2FD444948714528805EBFFA102511F5 5 69
Level4 Entity
99762DC75A5D42DCBEA6950D7011F130 4 607
Level5 Entity
95A5F85D68184DB7A49F9DF7A236F9AF 5 517
-------------------and on and on----------------

122 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
4 2 FILTER
5 4 CONNECT BY (WITH FILTERING)
6 5 NESTED LOOPS
7 6 TABLE ACCESS (FULL) OF 'ENTITY'
8 6 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
9 5 NESTED LOOPS
10 9 BUFFER (SORT)
11 10 CONNECT BY PUMP
12 9 TABLE ACCESS (FULL) OF 'ENTITY'
13 0 CONNECT BY (WITH FILTERING)
14 13 NESTED LOOPS
15 14 TABLE ACCESS (FULL) OF 'ENTITY'
16 14 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
17 13 NESTED LOOPS
18 17 BUFFER (SORT)
19 18 CONNECT BY PUMP
20 17 TABLE ACCESS (FULL) OF 'ENTITY'




Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
208383 consistent gets
0 physical reads
0 redo size
9007 bytes sent via SQL*Net to client
591 bytes received via SQL*Net from client
10 SQL*Net roundtrips to/from client
27913 sorts (memory)
0 sorts (disk)
122 rows processed

And now using 'WITH'

with alm as (select entity_uuid, count(*) as nbr
from entity_tckt
group by entity_uuid)
select e.entity_uuid, sum(nbr) from entity e, alm
where alm.entity_uuid in (select entity_uuid from entity start with entity_uuid
= e.entity_uuid connect by prior entity_uuid = parent_uuid)
group by e.entity_uuid
/

ENTITY_UUID SUM(NBR)
-------------------------------- ----------
026D4A6B547544E08CFEC5DDED3B7777 772
02AD7930E9B94A20A9F4E245B3F4B8C4 1747
04B3AD5F804B4A66AC6C91606EA7019B 74
070547E30C604D4680089959A6DB7684 677
08ABA73521F94F2DB19FF8C1C53ADF06 2
0D65B9BB906B442B9A6BDFEEF3D858AC 86
0F01178489CB421CA5A4C55AEC98300E 4
0FE28D72C40C4D6FBB439A49B0BE6D3F 716
10801AFBCBC44F2F9851E16BC1CCA442 703
13E7CAA5FDEB42518A798A77A19F70B0 15533
1677A02C0B89479BA27F95D7AA3DAC03 9516

ENTITY_UUID SUM(NBR)
-------------------------------- ----------
1AE6BCDA1FA74B0B8DEBEFF13F9A444A 96
1F9BD4CD72FA4879AC338397FB59FD39 74
21FAD03A5EB34D79908E96DA647FF24C 592
235F6C1600584CDD8B3FB1FFCD742E7E 839
23EA0B4618A94C1586BDF08437B97EB0 3067
25CFE1E64F36468DB291CBCF0867B314 59
26C635064E83447B91BA8D125FE74C2A 776
2ABC45D4A5D84684AC29A578C5CCBD3E 1759
2DF37BBB64254DB29081F03D38B5CE33 876
2E1E0646AC9F4BB9A6E4A747B20B2595 7
2FF85FEF82864C0BB75BA4513885ED0E 76

----------------more data ----------------
71 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 FILTER
3 2 NESTED LOOPS
4 3 VIEW
5 4 SORT (GROUP BY)
6 5 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
7 3 TABLE ACCESS (FULL) OF 'ENTITY'
8 2 FILTER
9 8 CONNECT BY (WITH FILTERING)
10 9 NESTED LOOPS
11 10 TABLE ACCESS (FULL) OF 'ENTITY'
12 10 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
13 9 NESTED LOOPS
14 13 BUFFER (SORT)
15 14 CONNECT BY PUMP
16 13 TABLE ACCESS (FULL) OF 'ENTITY'




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
197127 consistent gets
0 physical reads
0 redo size
3707 bytes sent via SQL*Net to client
547 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
27902 sorts (memory)
0 sorts (disk)
71 rows processed

The logical reads will kill us as the table grows - Is there another way to get the logical reads down. I expect the entity table to grow to several hundred thousand records and the tickets table to get very large as well.

Tom Kyte
March 10, 2006 - 8:35 pm UTC

why would you connect by the entire table????? You would never have a where clause?

single heirarchy with multiple employees

Rasin, March 11, 2006 - 4:38 am UTC

single heirarchy with multiple employees
Suppose querying for empno's 7934, 7369 (actually this will be
determined
by the input of another subquery)
and want to get a single heirarchy instead of multiple heirarchies.

column ename format a60

select e.empno, lpad(' ', level * 2, ' ') || e.ename ename
from emp e
connect by e.empno = prior e.mgr
start with empno in (7934, 7369)

EMPNO ENAME
---------- --------------------------------------------------
7369 SMITH
7902 FORD
7566 JONES
7839 KING
7934 MILLER
7782 CLARK
7839 KING

7 rows selected.

currently I am doing a distinct select to remove duplicate but I am
loosing
the heirarchy
by doing so and also the subquery which will return empno's can return
100's of employees,
will this be OK performance wise ?

select distinct *
from
(
select e.empno, e.ename ename
from emp e
connect by e.empno = prior e.mgr
start with empno in (7934, 7369)
order siblings by empno
)

EMPNO ENAME
---------- --------------------------------------------------
7369 SMITH
7566 JONES
7782 CLARK
7839 KING
7902 FORD
7934 MILLER


My motto is to get the employees suppose whose salary
is greater than 3000 along with their managers up to the root of the heirarchy.


Thanks

Tom Kyte
March 11, 2006 - 3:43 pm UTC

will it be OK performance wise? You gotta do what you gotta do and if in fact you want to "start with" hundreds of roots, expand them all of the way up and distinct them - that is what you gotta do.

single heirarchy

Rasin, March 12, 2006 - 3:17 pm UTC

It took time to get some hold on heirarchial queries.
My data model looks like following
DEPT 1:M EMP
EMP 1:1 PROJECTS (there can be employees which are not assigned to projects)
PROJECTS 1:M SCHEDULES

create table projects (projno number primary key, pname varchar2(20));

create table schedules (scheduleno number primary key, projno number
constraint proj_fk references projects (projno), schedule_name varchar2(50));

alter table emp add projno number constraint emp_proj_fk
references projects;

insert into projects values (1, 'Engineering');
insert into projects values (2, 'Maintenance');

insert into schedules values (1, 1, 'Schedule1');
insert into schedules values (2, 1, 'Schedule2');
insert into schedules values (3, 2, 'Schedule3');
insert into schedules values (4, 2, 'Schedule4');

--assign project 1 to MILLER
update emp set projno = 1
where empno = 7934;

--assign project 2 to SMITH
update emp set projno = 2
where empno = 7369;

commit;


/*
I want to get employee records along with their managers
even though the managers are not assigned any projects.
The filtering to be done on schedules.schedule_name with in operator
for eg., I want to get employees for schedule1 and schedule3 along
with their managers and the heirarchy above them
*/
--I tried the following query with no rows returned
select e.empno, lpad(' ', level * 2, ' ') || e.ename ename
, dname, s.schedule_name
from emp e, dept d, projects p, schedules s
where
e.deptno = d.deptno
and e.projno = p.projno(+)
and p.projno = s.projno
and s.schedule_name in ('Schedule1', 'Schedule2')
connect by prior e.empno = e.mgr
start with e.mgr is null

--after watching this and other threads and reading oracle documentation
-- I came up with the following query, I applied your logic of getting
--the salary based on heirarchy.

select lpad('*', level * 2, '*') || ename ename, dname, pname
from
(
select e.empno, e.ename ename, e.mgr
, dname, pname,
(select count(p2.projno) from emp e2
,projects p2
where
e2.projno = p2.projno(+)
and
p2.projno in
(select projno from schedules
where schedule_name in ('Schedule1'))
start with e2.empno = e.empno
connect by prior empno = mgr) proj_cnt
from emp e, dept d, projects p
where
e.deptno = d.deptno
and e.projno = p.projno(+)
and
(
p.projno in
(select projno from schedules
where schedule_name in ('Schedule1'))
or
e.projno is null
)
)
where proj_cnt > 0
start with mgr is null
connect by prior empno = mgr
/

ENAME DNAME PNAME
------------------------------ -------------- --------------------
**KING ACCOUNTING
****CLARK ACCOUNTING
******MILLER ACCOUNTING Engineering

select lpad('*', level * 2, '*') || ename ename, dname, pname
from
(
select e.empno, e.ename ename, e.mgr
, dname, pname,
(select count(p2.projno) from emp e2
,projects p2
where
e2.projno = p2.projno(+)
and
p2.projno in
(select projno from schedules
where schedule_name in ('Schedule3','Schedule1'))
start with e2.empno = e.empno
connect by prior empno = mgr) proj_cnt
from emp e, dept d, projects p
where
e.deptno = d.deptno
and e.projno = p.projno(+)
and
(
p.projno in
(select projno from schedules
where schedule_name in ('Schedule3','Schedule1'))
or
e.projno is null
)
)
where proj_cnt > 0
start with mgr is null
connect by prior empno = mgr
/

ENAME DNAME PNAME
------------------------------ -------------- --------------------
**KING ACCOUNTING
****CLARK ACCOUNTING
******MILLER ACCOUNTING Engineering
****JONES RESEARCH
******FORD RESEARCH
********SMITH RESEARCH Maintenance


--clean up
drop table schedules;
alter table emp drop column projno;
drop table projects;



Won't connect by entire table

Mike, March 13, 2006 - 5:10 pm UTC

Tom,..

I didn't intend to connect by the entire table, only in this exercise because the tickets table only has records were interested in, the ticket table does have a current_lifecycle_state column to it.

Also, not every entity has tickets. The first query show this where clause. Sorry I didn't have it in the second query.

The task I am trying to accomplish is not to have the query go completely down the entity table using connect by for every record resulting from the outer connect by. For Example,..

Level1-
|_
| Level2_
| |_Level3
| |_Level3
| |_Level3
|_Level2_
| |_Level3
| |_Level3_
| |_Level4
| |_Level4
|_Level2_


For each entity, there may/may not be opened tickets in the ticket table. One could create a view on tickets for each entity easily using the connect by and scalar subquery that returns the number of open tickets for that entity alone.

Would there be a better way to sum up these numbers from the bottom up and therefore get a faster return using less resources than having to use the connect by in the scalar subquery.

I hope I was a bit clearer.

Mike

Please Help

Isam, May 06, 2006 - 5:42 pm UTC

Hi Tom,
How can I query all Employees ( lets say whose sal >=3000 ) and their managers.
I used the follwing but the result is not what I want

select EMPNO,SUBSTR(rpad('*',2*level,'*' ) || ENAME,1,30), MGR,SAL FROM EMP
START WITH MGR IN (SELECT X.EMPNO FROM EMP X WHERE X.SAL>=3000)
CONNECT BY PRIOR EMPNO = MGR

The Result is :

EMPNO SUBSTR(RPAD('*',2*LEVEL,'*')|| MGR SAL
---------- ------------------------------ ---------- ----------
7876 **ADAMS 7788 1100
7566 **JONES 7839 2975
7788 ****SCOTT 7566 3000
7876 ******ADAMS 7788 1100
7902 ****FORD 7566 3000
7369 ******SMITH 7902 800
7698 **BLAKE 7839 2850
7499 ****ALLEN 7698 1600
7521 ****WARD 7698 1250
7654 ****MARTIN 7698 1250
7844 ****TURNER 7698 1500

EMPNO SUBSTR(RPAD('*',2*LEVEL,'*')|| MGR SAL
---------- ------------------------------ ---------- ----------
7900 ****JAMES 7698 950
7782 **CLARK 7839 2450
7934 ****MILLER 7782 1500
7369 **SMITH 7902 800

15 rows selected.


Thank you


Tom Kyte
May 07, 2006 - 11:28 am UTC

"query all Employees ( lets say whose sal >=3000 ) and their managers."

why would you use connect by?


ops$tkyte@ORA10GR2> select ename, mgr, sal from emp where sal >= 3000;

ENAME             MGR        SAL
---------- ---------- ----------
SCOTT            7566       3000
KING                        5000
FORD             7566       3000

or if you needed their manager name:

  1  select a.ename, b.ename mgr, a.sal
  2    from emp a, emp b
  3   where a.sal >= 3000
  4*    and a.mgr = b.empno(+)
ops$tkyte@ORA10GR2> /

ENAME      MGR               SAL
---------- ---------- ----------
FORD       JONES            3000
SCOTT      JONES            3000
KING                        5000



 

Isam, May 07, 2006 - 5:30 pm UTC

Thank you for reply, the idea is to get all employees who satisfies the condition (sal>=3000) and at the same time I want to know their managers of all levels regardless of manager's sal in a tree structure. the result I am looking for is like this:

EMPNO name MGR SAL
---------- ------------------------------ ---------- ----------
7839 **KING 5000
7566 ****JONES 7839 2975
7788 ******SCOTT 7566 3000
7902 ******FORD 7566 3000

as you can see , the condition returns King, Scott and Ford . but Jones in the list because he is the MGR of Scott and Ford.
That what I want. How can I do this.
Thanks

Tom Kyte
May 08, 2006 - 7:46 am UTC

that is a bit of a sticky problem.  we can do it upside down rather easily:


ops$tkyte@ORA10GR2> select EMPNO,SUBSTR(rpad('*',2*level,'*' ) || ENAME,1,30), MGR,SAL
  2    FROM EMP
  3   start with sal >= 3000
  4  CONNECT BY PRIOR mgr = empno
  5  /
 
     EMPNO SUBSTR(RPAD('*',2*LEVEL,'*')||        MGR        SAL
---------- ------------------------------ ---------- ----------
      7788 **SCOTT                              7566       3000
      7566 ****JONES                            7839       2975
      7839 ******KING                                      5000
      7839 **KING                                          5000
      7902 **FORD                               7566       3000
      7566 ****JONES                            7839       2975
      7839 ******KING                                      5000
 
7 rows selected.


but going from the top down is a tad harder:


ops$tkyte@ORA10GR2> select EMPNO,SUBSTR(rpad('*',2*level,'*' ) || ENAME,1,30), MGR,SAL
  2    FROM EMP
  3   where empno in ( select EMPNO
  4                      FROM EMP
  5                     start with sal >= 3000
  6                   CONNECT BY PRIOR mgr = empno )
  7   start with empno in ( select EMPNO
  8                           FROM EMP
  9                          where connect_by_isleaf = 1
 10                          start with sal >= 3000
 11                        connect by prior mgr = empno )
 12  CONNECT BY PRIOR empno = mgr
 13  /
 
     EMPNO SUBSTR(RPAD('*',2*LEVEL,'*')||        MGR        SAL
---------- ------------------------------ ---------- ----------
      7839 **KING                                          5000
      7566 ****JONES                            7839       2975
      7788 ******SCOTT                          7566       3000
      7902 ******FORD                           7566       3000

it would only make sense to do the "start with" subquery IF you have multiple roots and thought that the filtering done by the subquery would prune away large parts of the hierarchy - otherwise, just build the entire hierarchy and then prune it with the where clause. 

Isam, May 09, 2006 - 4:37 pm UTC

Thank you for help.
That is what I need. I will use this logic in different queries of my project.

Thanks again.


Connect by doing extra full Scan?

Greg, June 01, 2006 - 12:10 pm UTC

Hi Tom,

Got myself confused (again - *sigh*) and was hoping you could shed some light on something.

I managed to reproduce in this test case:

gregs-ORA10 > drop table junk;

Table dropped.

gregs-ORA10 > create table junk
2 ( col1 number,
3 col2 number )
4 /

Table created.

gregs-ORA10 > insert into junk values ( 1, null );

1 row created.

gregs-ORA10 > insert into junk values ( 2, null );

1 row created.

gregs-ORA10 > insert into junk values ( 3, 2 );

1 row created.

gregs-ORA10 > insert into junk values ( 4, 3 );

1 row created.

gregs-ORA10 > insert into junk values ( 5, null );

1 row created.

gregs-ORA10 > insert into junk values ( 6, 5 );

1 row created.

gregs-ORA10 > insert into junk values ( 7, 6 );

1 row created.

gregs-ORA10 > insert into junk values ( 8, null );

1 row created.

gregs-ORA10 > commit;

Commit complete.

gregs-ORA10 > set autotrace traceonly explain
gregs-ORA10 > select col1
2 from junk
3 start with col2 = 2
4 connect by prior col1 = col2
5 /

Execution Plan
----------------------------------------------------------
Plan hash value: 3214713417

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 208 | 2 (0)| 00:00:01 |
|* 1 | CONNECT BY WITHOUT FILTERING| | | | | |
|* 2 | TABLE ACCESS FULL | JUNK | 1 | 52 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | JUNK | 8 | 208 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("COL2"=PRIOR "COL1")
2 - filter("COL2"=2)

Note
-----
- dynamic sampling used for this statement

gregs-ORA10 > create index junk_ind1 on junk ( col2 )
2 /

Index created.

gregs-ORA10 > select col1
2 from junk
3 start with col2 = 2
4 connect by prior col1 = col2
5 /

Execution Plan
----------------------------------------------------------
Plan hash value: 3206560888

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 1 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | JUNK | 1 | 52 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | JUNK_IND1 | 1 | | 1 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | | | | |
| 5 | BUFFER SORT | | | | | |
| 6 | CONNECT BY PUMP | | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| JUNK | 1 | 26 | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | JUNK_IND1 | 1 | | 1 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | JUNK | 8 | 312 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("COL2"=PRIOR "COL1")
3 - access("COL2"=2)
8 - access("COL2"=PRIOR "COL1")

Note
-----
- dynamic sampling used for this statement

gregs-ORA10 > analyze table junk compute statistics;

Table analyzed.

gregs-ORA10 > select col1
2 from junk
3 start with col2 = 2
4 connect by prior col1 = col2
5 /

Execution Plan
----------------------------------------------------------
Plan hash value: 3206560888

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 6 | 1 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | JUNK | 1 | 6 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | JUNK_IND1 | 1 | | 1 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | | | | |
| 5 | BUFFER SORT | | | | | |
| 6 | CONNECT BY PUMP | | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| JUNK | 2 | 6 | 1 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | JUNK_IND1 | 1 | | 1 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | JUNK | 8 | 40 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("COL2"=PRIOR "COL1")
3 - access("COL2"=2)
8 - access("COL2"=PRIOR "COL1")

This is and Oracle 10g (10.2.0.2.0) database, and my question is, why is Oracle doing that final FULL SCAN on JUNK ?? I just don't understand the logic - I don't know where it's coming from. I think the "START WITH" results in one Index scan, and the connect by results in the 2nd ... but I can't figure out that 3rd scan.

The original query ran fine in 8i (doing 2 range scans, and no third scan at all!)



Sorry ... don't mean to harass .. ;)

Greg, June 05, 2006 - 10:00 am UTC

I know you're busy, and I know you don't always see every post ... however, we're really stuck on this, I've tried the Oracle Metalink Forums ... nothing yet ... and we've continued to try a few other things .. (I'm currently reading through Jonathan's book "Cost Based Oracle Fundamentals" in the hopes of sheding some light on it .. but I'm still stumped ...

If you can find some time in your busy schedule to take a peek at this one, I'd be internally grateful ... if you can't ... well .. I understand!! ;)


Tom Kyte
June 05, 2006 - 10:07 am UTC

this is a teeny tiny table, I don't see the point - the table is too small to use the index here.



True .. however ...

Greg, June 06, 2006 - 1:35 pm UTC

This is a re-created test case based on a real problem I have with a table that's 17 million rows ...
Same explain is showing up in that query as this one.

Statistics are all up to date ...

I can also re-produce the test case with a ~17 million row table .. but I was hoping an 8 row table might be easier to work with ??

Just to be clear .. it's not that I'm going to complain about the full table scan .. (yet) .. just wondering why Oracle has decided to make that third pass ... I can't seem to understand why it would want to make a third pass on that data ... all because we added an index ??
(Note the explain with no index does 2x full scans ... but after the index is added, we're making 3x passes ... 2x index scans, + 1x full ... )

I'm only trying to understand that third pass ..

Thanks!


Tom Kyte
June 06, 2006 - 2:03 pm UTC

it is a "dummy no-op" pass, it is an artifact of the explain plan output.


select col1
from junk
start with col2 = 2
connect by prior col1 = col2

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 5 0 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 5 0 2

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 140

Rows Row Source Operation
------- ---------------------------------------------------
2 CONNECT BY WITH FILTERING (cr=5 pr=0 pw=0 time=239 us)
1 TABLE ACCESS BY INDEX ROWID JUNK (cr=2 pr=0 pw=0 time=61 us)
1 INDEX RANGE SCAN JUNK_IND1 (cr=1 pr=0 pw=0 time=40 us)(object id
1 NESTED LOOPS (cr=3 pr=0 pw=0 time=113 us)
2 BUFFER SORT (cr=0 pr=0 pw=0 time=65 us)
2 CONNECT BY PUMP (cr=0 pr=0 pw=0 time=29 us)
1 TABLE ACCESS BY INDEX ROWID JUNK (cr=3 pr=0 pw=0 time=37 us)
1 INDEX RANGE SCAN JUNK_IND1 (cr=2 pr=0 pw=0 time=24 us)(object id
0 TABLE ACCESS FULL JUNK (cr=0 pr=0 pw=0 time=0 us)



see the zero rows, cr=0, no activity - do you have a tkprof where the row source operation shows "not zero"

yes ... but ...

Greg, June 06, 2006 - 8:35 pm UTC

Yeah .. but I'm not sure we're looking at the same thing! ;)

In my original post, the set autotrace traceonly explain command spit out an explain formatted a bit different than yours (which is why I'm thinking I'm not looking at the right thing .. hehe) ... and it shows 8 rows on this one:

| 9 | TABLE ACCESS FULL | JUNK | 8 | 40 | 2 (0)| 00:00:01 |

I can try to re-run with:
set autotrace traceonly statistics ??



Tom Kyte
June 06, 2006 - 9:45 pm UTC

that is an autotrace, it is not "what happened".

tkprof shows "what happened"

Here is more information:

...
The connect by row source uses a sort to store the rows it will be working on. If the filtering option is used, connect by needs the sort to detect duplicates as the rows are inserted into the sort. If the sort spills to disk, it can no longer detect duplicates at the time the rows are inserted (the duplicates will be detected later, when the sort runs are merged). Therefore if the sort spills to disk, the connect by will switch to "no filtering" mode. The extra line in the plan is the row source that will be used if the switch to "no filtering" happens.
.............

Tree Walking from the result of a join

Mike Jones, June 14, 2006 - 2:35 pm UTC

Help! I'm trying to write a Query where given a result set, one of the columns is an index into a hierarchy. I want to report the result set and then each of the corresponding parents back up the hierachy, sort of like reporting some information about a person and then each of their dad, and then that person and their dads dad etc. all the way back up.

I can't seem to get it working, I can't get the filtering to the correct tree-walk right. Hopefully the below illistrates this. I should only get 6 rows back, the 2 base rows from the ilv and these reported 3 times each for the 3 levels of the hierachy. In practice the ILV rows would report different IDX values and so the relative tree walks would differ. Can you help point out where I'm going mad?

Thanks,

Mike.

create table hierarchy_table as
select object_name, rownum idx, decode(rownum -1,0,null,rownum-1) parent_idx
from user_objects
where rownum < 4
/

create table base_x as
select 3 idx, object_name
from user_objects
where rownum < 3
/

create table base_y as
select *
from base_x
/

select ilv.object_name, ilv.idx ilv_idx, ht.object_name ht_name, level
from
( select x.object_name, x.idx
from base_x x, base_y y
where x.idx = y.idx
) ilv,
hierarchy_table ht
connect by prior ht.parent_idx = ht.idx
start with ilv.idx = ht.idx
order by 1
/

Tom Kyte
June 15, 2006 - 8:18 am UTC

I'm not sure why this isn't just a normal connect by - your text description is that of a connect by:

...
I want to report the result set and then each of
the corresponding parents back up the hierachy,
.....

(the result set = START WITH rows, corresponding parents = CONNNECT BY rows)

so, maybe if you show us what you get and what you think you should get I can understand better.

Connect by Join Bug?

Matt Turner, August 09, 2006 - 10:24 am UTC

Tom I have produced a sample TEST case, can you tell me if you think this is working as the manual or is a bug with the join and connect by:

We have a user_state table which stores a script number for a connected user(simplified):



CREATE TABLE user_state (username VARCHAR2(30), scriptid NUMBER );

Both Fred and Bill are running script 1.

INSERT INTO user_state VALUES('FRED',1);

INSERT INTO user_state VALUES('BILL',1);

We have a second table - script structure, which has the script id, and a list of questions (TREE).


INSERT INTO script_structure VALUES (1,1,NULL);

INSERT INTO script_structure VALUES (1,2,1);

INSERT INTO script_structure VALUES (1,3,2);

INSERT INTO script_structure VALUES (1,4,2);

So, for Script 1:

SELECT *
FROM script_structure
WHERE scriptid = 1
START WITH parent_que_id IS NULL
CONNECT BY PRIOR que_id = parent_que_id
AND PRIOR scriptid = scriptid;

1 1
1 2 1
1 3 2
1 4 2


Looks ok so far. Now I want to join to the user_state table.


SELECT *
FROM script_structure ss,
user_state us
WHERE username = 'FRED' /* Derived from a DB context */
AND us.scriptid = ss.scriptid
START WITH parent_que_id IS NULL
CONNECT BY PRIOR que_id = parent_que_id;

This is where it goes pear shaped with the result set:

1 1 FRED 1
1 2 1 FRED 1
1 3 2 FRED 1
1 4 2 FRED 1
1 3 2 FRED 1
1 4 2 FRED 1
1 2 1 FRED 1
1 3 2 FRED 1
1 4 2 FRED 1
1 3 2 FRED 1
1 4 2 FRED 1


I do have a work around, that involves using an inline view to restrict the dataset prior to the tree walk.


SELECT que_id, parent_que_id
FROM
(
SELECT *
FROM script_structure ss,
user_state us
WHERE username = 'FRED'
AND us.scriptid = ss.scriptid )
START WITH parent_que_id IS NULL
CONNECT BY PRIOR que_id = parent_que_id;

1
2 1
3 2
4 2



So it seems that the username filter predicate isn't being applied at the time of the join.

Is this a bona-fide bug?


Tom Kyte
August 09, 2006 - 11:14 am UTC

with the connect by and joins:

join is done
start with applied
connect by done
where clause applied to results

username = 'FRED' is a predicate, applied after the hierarchy is built.



Tom Can you please tell me the workaround for Connect By Loop Error

Nitin Joshi, August 14, 2006 - 8:33 am UTC

Hi Tom,

Can you please tell is there any workaround to avoid connect By loop like 'NOCYCLE' the one which we have in 10g

Regards
Nitin S. Jsohi

Tom Kyte
August 14, 2006 - 11:37 am UTC

only if you can encode it in the query (eg: describe to me what condition would always exist for you to have a connect by)

then we can possibly filter it in the connect by clause (eg: if there is some "rule")

for example, if you hit the same ID as you started with, you should "stop" (eg: all of your data is a "circle")

ops$tkyte%ORA9IR2> create table t ( id number, pid number );

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into t values ( 1, 5 );

1 row created.

ops$tkyte%ORA9IR2> insert into t values ( 2, 1 );

1 row created.

ops$tkyte%ORA9IR2> insert into t values ( 3, 2 );

1 row created.

ops$tkyte%ORA9IR2> insert into t values ( 4, 3 );

1 row created.

ops$tkyte%ORA9IR2> insert into t values ( 5, 4 );

1 row created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select *
  2    from t
  3   start with id = 1
  4  connect by prior id = pid;
ERROR:
ORA-01436: CONNECT BY loop in user data



no rows selected

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select *
  2    from t
  3   start with id = 1
  4  connect by prior id = pid and id <> 1;

        ID        PID
---------- ----------
         1          5
         2          1
         3          2
         4          3
         5          4



but short of having some "logic" or "rule" that can be used to describe when you would have a loop.... 

Connect By

Saeed, August 26, 2006 - 5:58 pm UTC

Hi Tom ,
I have the following table and its data;

CREATE TABLE MyTable
(MyPK NUMBER(3)NOT NULL,
COL VARCHAR2(20)NOT NULL,
STATUS NUMBER(1) DEFAULT 1 NOT NULL,
PARENT_ID NUMBER(3)
);
*** Status has one of two values (1=Enabled,0=Disabled)
ALTER TABLE Mytable ADD (
CONSTRAINT StatusCons CHECK (STATUS IN (0, 1)));

ALTER TABLE Mytable ADD (
CONSTRAINT MyTable_PK PRIMARY KEY (MyPK));

ALTER TABLE Mytable ADD (
CONSTRAINT MyTable_MyTable_FK FOREIGN KEY (PARENT_ID)
REFERENCES Mytable (MyPK));

INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
0, '0_', 1, NULL);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
1, '1_0', 1, 0);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
2, '2_0', 1, 0);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
3, '3_1', 1, 1);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
4, '4_1', 1, 1);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
5, '5_2', 1, 2);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
6, '6_2', 1, 2);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
7, '7_0', 1, 0);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
8, '8_0', 0, 0);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
9, '9_0', 0, 0);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
10, '10_7', 1, 7);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
11, '11_7', 1, 7);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
12, '12_7', 1, 7);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
13, '13_12', 0, 12);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
14, '14_12', 0, 12);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
15, '15_11', 0, 11);
INSERT INTO Mytable ( MyPK, col, STATUS, PARENT_ID ) VALUES (
16, '16_15', 1, 15);
COMMIT;

MY QUESTION IS :
How can I get rows that satisfy the following conditions :
All children with STATUS=1 AND their parents have STATUS=1.
For Parents with status=0 , don't get their children (regardless of Children's STATUS).
for Children with Status=0 , don't get them.

Thanks for help
Saeed



Tom Kyte
August 27, 2006 - 9:11 pm UTC

do you just want "root" parents - or all parents (eg: what is the start with here)

Saeed, August 29, 2006 - 6:10 am UTC

Hi Tom ,
What I want is (all Parents having STATUS=1 and their children having STATUS=1 ) in a tree structure.
i.e All Rows selected must have STATUS=1.
(DON'T INCLUDE ANY CHILD HIS PARENT'S STATUS=0.)

I know my English Language is not good enough , but I hope you got the idea.

Thanks again
Saeed





Tom Kyte
August 29, 2006 - 7:18 am UTC

just add "and status = 1" to the connect by clause then, if status <> 1, it will stop traversing the tree.

Saeed, August 29, 2006 - 4:52 pm UTC

Thanks TOM ,
It's worked well


Multiple records within a hierarchy.

A reader, October 10, 2006 - 11:43 am UTC

Is it possible to get multiple records of a particular employee within a hierarchy in specific cases?

I have the following table structure.


create table EMPINFO
( c_STATUS VARCHAR2(2),
c_GROUP VARCHAR2(10),
c_SYSID VARCHAR2(10),
c_SYSTEM VARCHAR2(10),
c_NAME VARCHAR2(40),
c_ID VARCHAR2(10),
c_MGRID VARCHAR2(10),
c_MGRNAME VARCHAR2(40)
);

Typical Data is
insert into EMPINFO VALUES ('A','HR','','','JASON BICKER','0000001','','');
insert into EMPINFO VALUES ('A','HR','','','NANCY WALTER','0400001','0000001','JASON BICKER');
insert into EMPINFO VALUES ('','MF','NAN3005','WIN','NANCY WALTER','0400001','','');
insert into EMPINFO VALUES ('','MF','NAN3005','UNIX','NANCY WALTER','0400001','','');
insert into EMPINFO VALUES ('A','HR','','','ADAM BRYAN','0400002','0000001','JASON BICKER');
insert into EMPINFO VALUES ('','MF','BRY0034','WIN','ADAM BRYAN','0400002','','');
insert into EMPINFO VALUES ('A','HR','','','STING CORY','0400003','0400002','ADAM BRYAN');
insert into EMPINFO VALUES ('','MF','STI4040','UNIX','STING CORY','0400003','','');


For example I want to display all records of NANCY WALTER. The c_MGRID is blank for 2nd and 3rd record of NANCY WALTER. Similar structure will follow for other records as well.

Regards,
Rao

Tom Kyte
October 10, 2006 - 8:07 pm UTC

if nancy walker doesn't have a manager, how is "she" in the hierarchy????

A reader, October 11, 2006 - 9:31 am UTC

This table is kind of a master table that includes employee and its manager and the systems(Unix/Windows) and userids the employee has on it.
Nancy Walker has a manager whose c_MGRID is populated for the 1st row. But the 2nd and 3rd row of Nancy Walker shows
the system and the userid's she has access to. c_MGRID of this 2nd and 3rd row is empty. Is it possible to include these rows in the hierarchy as well similar to "start with .... connect by prior"

Tom Kyte
October 11, 2006 - 3:52 pm UTC

need more details, can there be two (or zero, or more than two) records for nancy in there with c_mgrid filled in? would they have the same value?

A reader, October 11, 2006 - 4:19 pm UTC

For a user

a) only one record wherver c_mgrid is filled. This will always be there when he/she joins the company.

b) 0 or more records wherever c_mgrid is not filled. Rows will be there only if user has access to any Windows/Unix/Mainframe systems and has an userid on it.


Tom Kyte
October 11, 2006 - 8:13 pm UTC

use this as your "source" to connect by on:

ops$tkyte%ORA10GR2> select c_name, max( c_mgrid) over (partition by c_name) mgrid
  2    from empinfo;

C_NAME                                   MGRID
---------------------------------------- ----------
ADAM BRYAN                               0000001
ADAM BRYAN                               0000001
JASON BICKER
NANCY WALTER                             0000001
NANCY WALTER                             0000001
NANCY WALTER                             0000001
STING CORY                               0400002
STING CORY                               0400002

8 rows selected.
 

A reader, October 12, 2006 - 11:04 am UTC

I am sorry I did not get you. The order does not seem right.

select c_name, max( c_mgrid) over (partition by c_name) mgrid from empinfo
start with c_id='0000001'
connect by prior c_id=c_mgrid

The O/p is

C_NAME MGRID
------------------------------
ADAM BRYAN 0000001
JASON BICKER
NANCY WALTER 0000001
STING CORY 0400002


The required O/p is
C_NAME MGRID
------------------------------
JASON BICKER
NANCY WALTER '0000001'
NANCY WALTER '0000001' ->> Access to Windows System
NANCY WALTER '0000001' ->> Access to unix System
ADAM BRYAN '0000001'
ADAM BRYAN '0000001' ->> Access to Windows
STING CORY '0400002'
STING CORY '0400002' ->> Access to unix

Tom Kyte
October 12, 2006 - 11:46 am UTC

no, use my query as your "table

select ..
from (MY_QUERY)
start with
connect by

A reader, October 12, 2006 - 11:06 am UTC

The ' ' in the C_MGRID section of Required O/p is unintentional. It should be
The required O/p is
C_NAME MGRID
------------------------------
JASON BICKER
NANCY WALTER 0000001
NANCY WALTER 0000001 ->> Access to Windows System
NANCY WALTER 0000001 ->> Access to unix System
ADAM BRYAN 0000001
ADAM BRYAN 0000001 ->> Access to Windows
STING CORY 0400002
STING CORY 0400002 ->> Access to unix

A reader, October 12, 2006 - 3:19 pm UTC

Thanks a lot for the information.

CONNECT BY

lore, October 27, 2006 - 1:02 pm UTC

The ansewer is too clear and has an example that clarify the use of the connect by

Connect by

guest, February 15, 2007 - 5:01 pm UTC

Hi Tom,

I need output like given below

Key2 Key1 ENAME
---------- ------------------------------
1 1 **FRENCH
1 2 ****BLAKE
1 3 ******ADAMS
2 1 **FRENCH
2 2 ****DAVIS
2 3 ******ADAMS
3 1 **FRENCH
3 2 ****OLIVER
4 1 **BALMAR
4 2 ****GOWD

Could you please give me a query to get the above result. with key1 and key2 and tree should have full start to stop.

eg:

A->B->D
A->C->E
F->G->H
I->J
I->K

Database is 10g.
Thanks,
Tom Kyte
February 16, 2007 - 1:36 pm UTC

hmmm

eh?

geez.....

connect by

Guest, February 16, 2007 - 2:44 pm UTC

Hi Tom,

I have written the below code to achieve a hierarchy tree. It works fine but takes lot of time and memory. sometimes sorting fails.

Could you please tell me how can i reduce the run time for the below proc. None of the table has indexes. it has to process millions of records to give the tree with parent child relationship.

Example Input:
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (1, 'C','GBP','DW','A');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (2, 'C','GBP','DW','C');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (3, 'C','GBP','DW','G');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (4, 'C','GBP','DW','F');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (5, 'C','GBP','DW','H');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (6, 'C','GBP','DW','I');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (7, 'C','GBP','DW','J');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (8, 'C','GBP','DW','L');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (9, 'C','GBP','DW','K');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (10, 'C','GBP','DW','P');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (11, 'C','GBP','DW','D');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (12, 'C','GBP','DW','M');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (13, 'C','GBP','DW','O');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (14, 'C','GBP','DW','Q');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (15, 'C','GBP','DW','N');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (16, 'C','GBP','DW','E');
insert into po(po_NUM,mat_code,op_plant, sc, batch_num ) values (17, 'C','GBP','DW','B');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (1, 'GBP','DW', 'C','C', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (1, 'GBP','DW', 'C','D', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (1, 'GBP','DW', 'C','E', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (2, 'GBP','DW', 'C','F', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (2, 'GBP','DW', 'C','G', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (4, 'GBP','DW', 'C','H', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (5, 'GBP','DW', 'C','I', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (5, 'GBP','DW', 'C','J', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (7, 'GBP','DW', 'C','L', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (7, 'GBP','DW', 'C','K', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (7, 'GBP','DW', 'C','M', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (9, 'GBP','DW', 'C','P', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (11, 'GBP','DW', 'C','J', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (13, 'GBP','DW', 'C','M', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (14, 'GBP','DW', 'C','M', 'GBP');

insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (16, 'GBP','DW', 'C','O', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (16, 'GBP','DW', 'C','N', 'GBP');
insert into pod(po_NUM,op_plant, sc, i_mat_code, ib, ip_plant) values (17, 'GBP','DW', 'C','E', 'GBP');


expected output:
M--C--GBP 1 1 2 16/02/2007 19:25:18 16/02/2007 19:25:18
Q--C--GBP 1 1 1 16/02/2007 19:25:18 16/02/2007 19:25:18
N--C--GBP 2 1 3 16/02/2007 19:25:18 16/02/2007 19:25:18
E--C--GBP 2 1 2 16/02/2007 19:25:18 16/02/2007 19:25:18
B--C--GBP 2 1 1 16/02/2007 19:25:18 16/02/2007 19:25:18
M--C--GBP 2 2 4 16/02/2007 19:25:18 16/02/2007 19:25:18
O--C--GBP 2 2 3 16/02/2007 19:25:18 16/02/2007 19:25:18
E--C--GBP 2 2 2 16/02/2007 19:25:18 16/02/2007 19:25:18
B--C--GBP 2 2 1 16/02/2007 19:25:18 16/02/2007 19:25:18
I--C--GBP 3 1 5 16/02/2007 19:25:19 16/02/2007 19:25:19
H--C--GBP 3 1 4 16/02/2007 19:25:19 16/02/2007 19:25:19
F--C--GBP 3 1 3 16/02/2007 19:25:19 16/02/2007 19:25:19
C--C--GBP 3 1 2 16/02/2007 19:25:19 16/02/2007 19:25:19
A--C--GBP 3 1 1 16/02/2007 19:25:19 16/02/2007 19:25:19
P--C--GBP 3 2 7 16/02/2007 19:25:19 16/02/2007 19:25:19
K--C--GBP 3 2 6 16/02/2007 19:25:19 16/02/2007 19:25:19
J--C--GBP 3 2 5 16/02/2007 19:25:19 16/02/2007 19:25:19
H--C--GBP 3 2 4 16/02/2007 19:25:19 16/02/2007 19:25:19
F--C--GBP 3 2 3 16/02/2007 19:25:19 16/02/2007 19:25:19
C--C--GBP 3 2 2 16/02/2007 19:25:19 16/02/2007 19:25:19
A--C--GBP 3 2 1 16/02/2007 19:25:19 16/02/2007 19:25:19
L--C--GBP 3 3 6 16/02/2007 19:25:19 16/02/2007 19:25:19
J--C--GBP 3 3 5 16/02/2007 19:25:19 16/02/2007 19:25:19
H--C--GBP 3 3 4 16/02/2007 19:25:19 16/02/2007 19:25:19
F--C--GBP 3 3 3 16/02/2007 19:25:19 16/02/2007 19:25:19
C--C--GBP 3 3 2 16/02/2007 19:25:19 16/02/2007 19:25:19
A--C--GBP 3 3 1 16/02/2007 19:25:19 16/02/2007 19:25:19


56 rows.

total 56 rows expected for the above input.

The procedure is given below. Please help me to minimize the below query to execute fast and give the result.

CREATE OR REPLACE PROCEDURE hierarchy_Proc IS
tempvar varchar2(255);
batch_cnt number :=1;
sort_cnt number;
cnt number;
old_rec varchar2(255):='TRY';
BranchNoV number :=1;
Key1V number :=1;
Key2V number :=1;
InputV varchar2(255);
ibiv varchar2(255);
obiv varchar2(255);
bgsseq number;
minsort number;
bgs_chk number;
outbatch number;
row_chk number;
cursor otb_id is select obi from oo;
cursor ob_id is select obi from oo where obi in (select obi from oo GROUP BY obi having count(*) > 1);

BEGIN
-- P 1
execute immediate 'truncate table bio';
execute immediate 'truncate table oo';
execute immediate 'truncate table oi';

-- Load the bio table with data from the po table
insert into bio (select (po.batch_num||'--'||po.mat_code||'--'||po.op_plant) obi, (pod.input_batch||'--'||pod.i_mat_code||'--'||pod.ip_plant) ibi, po.load_date,po.update_date
from po, pod
where pod.order_num=po.order_num
and pod.op_plant=po.op_plant );
commit;

-- Remove Circular data from bio table
delete from bio where obi in(select a.obi from
bio a inner join bio b
on a.ibi=b.obi and a.obi=b.ibi );
commit;

--Remove duplicate data from bio table
delete from bio where rowid not in
(SELECT MIN(rowid)
FROM bio
GROUP BY obi, ibi);
commit;

-- For all records where the ibi does not exist as an obi
For rec in (select distinct(ibi) from (select a.ibi from bio a minus select b.obi from bio b))
loop
insert into bio values(rec.ibi,null,to_date(sysdate,'dd/mm/yyyy'),to_date(sysdate,'dd/mm/yyyy'));
end loop;
commit;

insert into oo(sort_num,obi,ibi,load_date,update_date) (select rownum,obi,ibi,load_date,update_date from bio);
commit;

-- load oi table

insert into oi(sort_num,obi,oi,load_date,update_date)(select sort_num,obi,sort_num,load_date,update_date from oo );
commit;

-- load oi column in the oi table for obi
For ob_rec in ob_id
loop
select min(sort_num) into minsort from oo where obi =ob_rec.obi;
update oi set oi=minsort where obi=ob_rec.obi;
end loop;
commit;

-- load o_count column in the oi table for obi

cnt:=1;
sort_cnt:=1;
For otb_rec in otb_id loop
if(old_rec = otb_rec.obi) then
cnt:=cnt+1;
update oi set o_count=cnt where obi =otb_rec.obi and sort_num=sort_cnt;
sort_cnt:=sort_cnt+1;
else
cnt:=1;
update oi set o_count=cnt where obi=otb_rec.obi and sort_num=sort_cnt;
sort_cnt:=sort_cnt+1;
end if;
old_rec:=otb_rec.obi;

end loop;
commit;

--Phase 2
execute immediate 'truncate table bgs';
execute immediate 'truncate table bgr';

<<label1>>
For eocbid in ( select distinct(obi) from bio where obi not in (select nvl(ibi,0) from bio)) loop
obiv := eocbid.obi;

<<Label2>>
select count(*) into bgs_chk from bgs where obi=obiv;
if (bgs_chk=0) then
goto label3;
else
goto label4;
end if;


<<label3>>
bgsseq :=1;
BranchNoV := bgsseq;
insert into bgs(obi,seq_num,load_date,update_date) values (obiv,bgsseq,sysdate,sysdate);
commit;
goto label5;


<<label4>>
select count(obi) into outbatch from bgs where obi=obiv;
if(outbatch=0) then
bgsseq :=1;
BranchNoV := bgsseq;
insert into bgs(obi,seq_num,load_date,update_date) values (obiv,bgsseq,sysdate,sysdate);
commit;
else
select bgs.seq_num+1 into BranchNoV from bgs bgs where bgs.obi=obiv;
end If;



<<label5>>
select count(*) into row_chk from oi oi where oi.obi=obiv and oi.o_count=BranchNoV;
If(row_chk<>0) then
select ibi into InputV from oo oo where oo.sort_num=(select distinct(oi.oi + BranchNoV - 1) from oi oi where oi.obi= obiv);
If InputRetV is null then
insert into bgr(batch_num,Key_1,Key_2,gen_sequence,load_date,update_date) (SELECT obi, Key1V,Key2V, (select count(*) from bgs)+1-LEVEL,sysdate,sysdate FROM bgs start with ibi is null CONNECT BY PRIOR obi= ibi );


<<label6>>
ibiv:=obiv;
commit;
delete from bgs bgs where bgs.obi= obiv;
select obi into obiv from bgs bgs where bgs.ibi =ibiv ;
Key2V := Key2V + 1;
commit;
goto label2;
else
update bgs bgs set bgs.ibi=InputV, bgs.seq_num = BranchNoV,update_date=sysdate where bgs.obi = obiv;
obiv := InputRetV;
commit;
goto label2;
end If;
else
ibiv := obiv;

delete from bgs bgs where bgs.obi=obiv;
commit;
select count(*) into bgs_chk from bgs bgs;

If(bgs_chk = 0) then
Key1V := Key1V + 1;
Key2V :=1;
else
select obi into obiv from bgs bgs where bgs.ibi=ibiv;
goto label2;
end if;
end if;
end loop;
END ;
/
table script
create table bio
(
obi varchar2(255),
ibi varchar2(255),
load_date date,
update_date date)


create table oo(
sort_num number,
obi varchar2(255),
ibi varchar2(255),
load_date date,
update_date date)


create table oi(
obi varchar2(255),
oi number,
output_count number,
load_date date,
update_date date)



create table bgs(
obi varchar2(255),
ibi varchar2(255),
seq_num number,
load_date date,
update_date date)

create table bgr(
batch_num varchar2(255),
key_1 number,
key_2 number,
gen_sequence number,
load_date date,
update_date date)

Thanks in advance.
Tom Kyte
February 18, 2007 - 8:48 am UTC

... sometimes sorting fails. ..


NO, it does not. You might have coded something incorrectly, but "sometimes sorting fails" is not an accurate statement to make.


but I would like you to read this posting you made, and - pretend you didn't know the problem attempting to be solved at all (sort of like us for example) and ask yourself "would anyone really be able to tell what I was trying to do given what I've posted - did I give them a clear definition of the problem in written words"

the answer to that is "absolutely not, there is no problem definition, there is however a lot of code that is known to not function correctly from which one could learn how not to solve the as yet to be stated problem"....


Great

Reader, February 16, 2007 - 8:27 pm UTC

Tom

You should take off from work and provide the answer :)

Different Result

Chi, March 13, 2007 - 4:03 pm UTC

Tom:

I ran the following SQL on my development & production servers and got different results:

select 1 from plateau.pa_student a 
where a.stud_id = 'adhh01' 
  and a.stud_id in 
  (select b.stud_id 
      from plateau.pa_student b start with b.stud_id = 'deaalder' connect by prior b.super = b.stud_id 
   union select c.stud_id 
      from plateau.pa_student c start with c.stud_id = 'deaalder' connect by c.super = prior c.stud_id )


Both servers are running 10.2.0.3 on Windows 2003 and all instances are created using DBCA. On the development server the result was as expected: "no rows selected". However on the production server it returned "1". I have compared all parameters on these 2 instances and found no mismatch (except those include the path). The problem is with the server as I tested on another instance on the production server and got same incorrest result. What would cause this kind of discrepancy? Thanks for the help.
Tom Kyte
March 13, 2007 - 8:52 pm UTC

perhaps different data...

but, you give us nothing to work with.

are the plans the same?

Display all dates in a year

Jay, March 19, 2007 - 3:56 pm UTC

Good day Tom!

I honestly am hoping that I am not posting this question in a wrong thread.
Tom, I am trying to just write a query to give me all the dates from a particular date to the system date.
For instance, if I enter 01/01/2007, I need a query to fetch me records like this...

01/01/2007
01/02/2007
01/03/2007
01/04/2007
...
..
03/19/2007

Is this something simple to do? Do we need to take into consideration the leap year issue as well or will oracle know the dates?

Thanks Tom. Have a good one!


Tom Kyte
March 19, 2007 - 7:31 pm UTC

ops$tkyte%ORA10GR2> with data
  2  as
  3  (select level-1 l
  4     from dual
  5  connect by level <= trunc(sysdate)-to_date(:x,'dd-mon-yyyy')+1)
  6  select to_date(:x,'dd-mon-yyyy')+l
  7    from data
  8  /

TO_DATE(:
---------
01-JAN-07
02-JAN-07
03-JAN-07
...
18-MAR-07
19-MAR-07

78 rows selected.

ops$tkyte%ORA10GR2>

A reader, March 20, 2007 - 5:27 am UTC

Tom,
Can you explain the logic behind
'connect by level < n' when the from clause has Dual or a Single Row rowsource.
Looks like the connect by keeps pumping rows because it cannot decide a relationship between parent and child rows.
Is it like if it can't find a relationship, it will assume each row to be a parent as well as a child for every level?

2) Why did you not choose to do it to a multi row table?

Thanks

Ravi
Tom Kyte
March 20, 2007 - 7:56 am UTC

with data
as
(select level from dual connect by level < :n )

is just a way to create :N rows - the connect by statement is satisfied as long as level is less than N - so it gets the first row from dual, sets level to 1 and says "1 is less than :n, great, we therefore create a new level and try again, now level is 2....."

2) because I needed N rows and dual is perfect for that? If I picked a 'real table', I'd have to pick on with at least :n rows - but we don't know what :n is

wow!

Jay, March 20, 2007 - 8:32 am UTC

Thanks Tom!

Jay

Kindly help SQL for Chain Marketing system -

Arindam Mukherjee, March 31, 2007 - 10:17 am UTC

Chain Marketing system, one has so many workers. That worker also has so many workers. In this way, we get so many workers down the line forming a chain. Our table structure looks like below. The total no. of records in this table is equal to the no. of agents. So Agent_id and Introducer_id are self_referential.

Agent_id | Commission | GAP-Commi| Introducer_id | Introducer_rank | team_strength

Data looks like

1st row >> 100 | 15% | 0 | 91 | 2 | 0
2nd row >> 91 | 5% | 0 | 81 | 3 | 7
3rd row >> 81 | 2% | 3% |61 | 5 | 9 --- here 4th rank is missing because of termination.

31st row >> 300 | 15% | 0 | 51 | 2 | 0
32nd row >> 51 | 5% | 0 | 47 | 3 | 18
33rd row >> 47 | 1% | 3% + 2% |33 | 6 | 56 --- here 4th and 5th rank is missing because of termination.

We need the following results.

1 > Each rank and its strength down the line, suppose for agent ID = 81, we need
For 81, 9 and for 91, 7 as 91 is related to 81.

2> When one agent brings one business say $100, commission column will be updated as follows.

1st row >> 15% of 100 = 15 (Either insert or update with existing one)
2nd row >> 5% of 100 = 5 (Either insert or update with existing one)

3rd row >> 2% of 100 = 2 (Either insert or update with existing one)
3rd row >> Gap Commission 3% of 100 = 3 (Either insert or update with existing one)
Since 4th Rank does not exist, 5th rank will get Gap commission.

You have every right to change the table structure but please help us write these complicated SQL as the table will have more than 40 thousands record. We are trying " connect by " clause but could not get success. Our database is Oracle 9i.

Kindly help us.

Arindam Mukherjee, April 01, 2007 - 11:52 pm UTC

Sir,
You please read the above facts and kindly guide me how to calculate commission when one business of $100 would be in place. So I need Insert / Update and Query SQL, Please help me.

Performance of connect by

Artur Popov, April 13, 2007 - 8:40 am UTC

Hi, Tom.

I have a big problem with two queries and I want to understand why do their execution time differs so much. Here they are:

SELECT ap.app_id, ap.created, u.fio_short from_user, ut.fio_short to_user, ty.TYPE_NAME, ap.app_num
FROM applications_events ae, apps ap, application_types_ref ty, isa_own.users u, isa_own.users ut
WHERE ae.application_id = ap.app_id
 AND ae.event_time = ap.recent_event_time
 AND ae.event_id = 3
 AND ae.user_id IN
  (SELECT id_user
   FROM isa_own.users
   WHERE pr_rabot = 1 START WITH id_user = 363 CONNECT BY PRIOR id_user = id_nach)
 AND u.id_user = ap.creator_id
 AND ut.id_user = ae.user_id
 AND ap.TYPE_ID = ty.TYPE_ID;


It takes over 5 seconds to execute. But when I replaced the subquery with connect by with it's result:

SELECT ap.app_id, ap.created, u.fio_short from_user, ut.fio_short to_user, ty.TYPE_NAME, ap.app_num
FROM applications_events ae, apps ap, application_types_ref ty, isa_own.users u, isa_own.users ut
WHERE ae.application_id = ap.app_id
 AND ae.event_time = ap.recent_event_time
 AND ae.event_id = 3
 AND ae.user_id IN
  (363,359,361,364,341,354,590,591,944,840)
 AND u.id_user = ap.creator_id
 AND ut.id_user = ae.user_id
 AND ap.TYPE_ID = ty.TYPE_ID


the execution time became just 0.9s.
Why does it happens and how can I speed up my first query?
Tom Kyte
April 13, 2007 - 2:09 pm UTC

did you look at the plans to see what is fundamentally different

and how long does the connect by take itself.

Performance of connect by

Popov Artur, April 16, 2007 - 12:38 am UTC

Hi Tom.

Here are the plans for my queries from the previous question.

First query:

Rows     Row Source Operation
-------  ---------------------------------------------------
    128  HASH JOIN  (cr=6551 r=5769 w=1529 time=5676858 us)
    128   HASH JOIN  (cr=6518 r=5769 w=1529 time=5670442 us)
    128    HASH JOIN  (cr=6487 r=5769 w=1529 time=5664359 us)
    128     MERGE JOIN  (cr=6480 r=5769 w=1529 time=5662939 us)
   4032      SORT JOIN (cr=6428 r=5769 w=1529 time=5615042 us)
   4032       MERGE JOIN SEMI (cr=6428 r=5769 w=1529 time=5586127 us)
 384128        SORT JOIN (cr=6363 r=5769 w=1529 time=4003999 us)
 384128         TABLE ACCESS FULL APPLICATIONS_EVENTS (cr=6363 r=4240 w=0 time=907248 us)
   4032        SORT UNIQUE (cr=65 r=0 w=0 time=609225 us)
     10         VIEW  (cr=65 r=0 w=0 time=13402 us)
     10          FILTER  (cr=65 r=0 w=0 time=13379 us)
     11           CONNECT BY WITH FILTERING (cr=65 r=0 w=0 time=13344 us)
      1            NESTED LOOPS  (cr=3 r=0 w=0 time=133 us)
      1             INDEX UNIQUE SCAN PK_ID_USER (cr=2 r=0 w=0 time=72 us)(object id 10236)
      1             TABLE ACCESS BY USER ROWID USERS (cr=1 r=0 w=0 time=49 us)
     10            HASH JOIN  (cr=62 r=0 w=0 time=12927 us)
     11             CONNECT BY PUMP  (cr=0 r=0 w=0 time=92 us)
   1920             TABLE ACCESS FULL USERS (cr=62 r=0 w=0 time=4132 us)
    128      SORT JOIN (cr=52 r=0 w=0 time=37592 us)
   6000       TABLE ACCESS FULL APPS (cr=52 r=0 w=0 time=9611 us)
      2     TABLE ACCESS FULL APPLICATION_TYPES_REF (cr=7 r=0 w=0 time=190 us)
    960    TABLE ACCESS FULL USERS (cr=31 r=0 w=0 time=1601 us)
    960   TABLE ACCESS FULL USERS (cr=33 r=0 w=0 time=1665 us)


Second query:

Rows     Row Source Operation
-------  ---------------------------------------------------
    128  NESTED LOOPS  (cr=2583 r=28 w=0 time=144063 us)
    128   NESTED LOOPS  (cr=2317 r=28 w=0 time=141186 us)
    128    HASH JOIN  (cr=2179 r=28 w=0 time=138167 us)
   4032     TABLE ACCESS BY INDEX ROWID APPLICATIONS_EVENTS (cr=2125 r=28 w=0 time=78225 us)
   4363      NESTED LOOPS  (cr=82 r=0 w=0 time=36619 us)
     10       INLIST ITERATOR  (cr=31 r=0 w=0 time=546 us)
     10        TABLE ACCESS BY INDEX ROWID USERS (cr=31 r=0 w=0 time=416 us)
     10         INDEX RANGE SCAN PK_ID_USER (cr=21 r=0 w=0 time=265 us)(object id 10236)
   4352       INDEX RANGE SCAN T_INDEX3 (cr=51 r=0 w=0 time=16801 us)(object id 29522)
   6000     TABLE ACCESS FULL APPS (cr=54 r=0 w=0 time=8504 us)
    128    TABLE ACCESS BY INDEX ROWID APPLICATION_TYPES_REF (cr=138 r=0 w=0 time=2066 us)
    128     INDEX UNIQUE SCAN PK_APPLICATIONS_TYPES_REF (cr=10 r=0 w=0 time=751 us)(object id 9935)
    128   TABLE ACCESS BY INDEX ROWID USERS (cr=266 r=0 w=0 time=1995 us)
    128    INDEX UNIQUE SCAN PK_ID_USER (cr=138 r=0 w=0 time=942 us)(object id 10236)


We can see, that they are completely different, but I didn't change the query, just replaced the subquery with it's result. Why did it cause such consequences?
Tom Kyte
April 16, 2007 - 1:15 pm UTC

so, in the bad plan - are the estimated card= values near to the actuals you posted here.

NO_FILTER Hint - What is it ?

BC, April 16, 2007 - 12:44 am UTC

Tom,

What is the NO_FILTER hint ? I have seen it used in several queries using CONNECT BY.

Thanks

BC

Performance of connect by

Popov Artur, April 17, 2007 - 12:43 am UTC

Statistics for all tables is gathered and actual.
For the first query I got this plan (using autotrace):

 0      SELECT STATEMENT Optimizer=CHOOSE (Cost=514 Card=16 Bytes=2656)
 1    0   HASH JOIN (Cost=514 Card=16 Bytes=2656)
 2    1     HASH JOIN (Cost=509 Card=16 Bytes=2368)
 3    2       HASH JOIN (Cost=504 Card=16 Bytes=2080)
 4    3         MERGE JOIN (Cost=501 Card=16 Bytes=1328)
 5    4           SORT (JOIN) (Cost=435 Card=192224 Bytes=6727840)
 6    5             MERGE JOIN (SEMI) (Cost=435 Card=192224 Bytes=6727840)
 7    6               SORT (JOIN) (Cost=427 Card=192224 Bytes=4228928)
 8    7                 TABLE ACCESS (FULL) OF 'T' (Cost=427 Card=192224 Bytes=4228928)
 9    6               SORT (UNIQUE) (Cost=8 Card=960 Bytes=12480)
10    9                 VIEW OF 'VW_NSO_1' (Cost=4 Card=960 Bytes=12480)
11   10                   FILTER
12   11                     CONNECT BY (WITH FILTERING)
13   12                       NESTED LOOPS
14   13                         INDEX (UNIQUE SCAN) OF 'PK_ID_USER' (UNIQUE) (Cost=1 Card=1 Bytes=4)
15   13                         TABLE ACCESS (BY USER ROWID) OF 'USERS'
16   12                       HASH JOIN
17   16                         CONNECT BY PUMP
18   16                         TABLE ACCESS (FULL) OF 'USERS' (Cost=4 Card=960 Bytes=10560)
19    4           SORT (JOIN) (Cost=66 Card=6000 Bytes=288000)
20   19             TABLE ACCESS (FULL) OF 'APPS' (Cost=7 Card=6000 Bytes=288000)
21    3         TABLE ACCESS (FULL) OF 'APPLICATION_TYPES_REF' (Cost=2 Card=2 Bytes=94)
22    2       TABLE ACCESS (FULL) OF 'USERS' (Cost=4 Card=960 Bytes=17280)
23    1     TABLE ACCESS (FULL) OF 'USERS' (Cost=4 Card=960 Bytes=17280)


Ok, to make everything clear here is my tables:

-- 6000 rows.
create table APPS
(
  APP_ID            NUMBER,
  TYPE_ID           NUMBER not null,
  CREATOR_ID        NUMBER(38) not null,
  CREATED           TIMESTAMP(6) not null,
  POSSIBLE_NAPR_ID  NUMBER not null,
  RECENT_EVENT_TIME TIMESTAMP(4),
  APP_NUM           VARCHAR2(14)
);
alter table APPS add constraint PK_APPS primary key (APP_ID);
alter table APPS add constraint FK_APPS_NAPR foreign key (POSSIBLE_NAPR_ID)
                              references ISA_OWN.NAPRAVL (ID_NAPR);
alter table APPS add constraint FK_APPS_TYPES foreign key (TYPE_ID)
                         references APPLICATION_TYPES_REF (TYPE_ID);
alter table APPS add constraint FK_APPS_USERS foreign key (CREATOR_ID)
                                 references ISA_OWN.USERS (ID_USER);

-- The following table is made just for testing.
-- It have 768383 rows.
create table APPLICATIONS_EVENTS
(
  EVENT_TIME     TIMESTAMP(4) not null,
  EVENT_ID       NUMBER not null,
  APPLICATION_ID NUMBER not null,
  USER_ID        NUMBER,
  REASON         VARCHAR2(50),
  NAPR_ID        NUMBER,
  ITEM_ID        NUMBER
)
create index T_INDEX1 on APPLICATIONS_EVENTS (EVENT_TIME);
create index T_INDEX2 on APPLICATIONS_EVENTS (APPLICATION_ID);
create index T_INDEX3 on APPLICATIONS_EVENTS (USER_ID);

-- 2 rows at this moment.
create table APPLICATION_TYPES_REF
(
  TYPE_ID   NUMBER not null,
  TYPE_NAME VARCHAR2(60) not null
);
alter table APPLICATION_TYPES_REF add constraint PK_APPLICATIONS_TYPES_REF primary key (TYPE_ID);

-- 920 rows.
create table ISA_OWN.USERS
(
  FNAME       VARCHAR2(50) not null,
  MNAME       VARCHAR2(50) not null,
  LNAME       VARCHAR2(50) not null,
  FIO_SHORT   VARCHAR2(60) not null,
  ID_USER     NUMBER(10) not null,
  FOTO        BLOB,
  KAB         VARCHAR2(50),
  TEL         VARCHAR2(50),
  EMAIL       VARCHAR2(50),
  ID_DOLJNOST NUMBER(10) not null,
  ID_NAPR     NUMBER(10) not null,
  TAB         VARCHAR2(5) not null,
  FOTO_NAME   VARCHAR2(100),
  CON_NAME    VARCHAR2(50) not null,
  PAS         VARCHAR2(50) not null,
  VID_DOG     VARCHAR2(20),
  DATE_NAIM   DATE,
  ID_NACH     NUMBER(10),
  PR_RABOT    NUMBER(1) default 1 not null,
  D_R         DATE,
  TEL_DOM     VARCHAR2(50),
  PROFSOUZ    NUMBER(1) default 1 not null,
  POL         VARCHAR2(1) not null,
  EMAIL_LOC   VARCHAR2(50),
  NOVELL_NAME VARCHAR2(20)
);
alter table ISA_OWN.USERS add constraint PK_ID_USER primary key (ID_USER);
alter table ISA_OWN.USERS add constraint CON_PAS_UNIQ unique (CON_NAME,PAS);
alter table ISA_OWN.USERS add constraint TAB_UNIQ unique (TAB);
create index ISA_OWN.IND_ID_NACH on ISA_OWN.USERS (ID_NACH);


My colleagues adviced me to store the results of the 'connect by' subquery in a temporary table and replace the 'connect by' subquery with simple select from this table, but I think this is not the best idea. What can you say?
Tom Kyte
April 17, 2007 - 9:59 am UTC

I simply asked:

so, in the bad plan - are the estimated card= values near to the actuals you posted here.

Performance of connect by

Popov Artur, April 17, 2007 - 11:33 am UTC

As you can see they are close, excluding APPLICATIONS_EVENTS. I just wanted to give you better understanding of the problem.
So, can you help?

Connect by prior

Bob, July 18, 2007 - 11:04 am UTC

Hi Tom,

Is there a way of listing parent and child relations in a query. For example, if Site 1: has child sites 2,3,4,5,6.

I want to able to select:

1,2
1,3
1,4
1,5
1,6
and so on..

Thanks
Tom Kyte
July 18, 2007 - 12:54 pm UTC

from what do you want to select this

connect by prior

Bob, July 18, 2007 - 11:19 am UTC

Good explanation at the top, answers my question!!!

Connect by mecanism

Car Elcaro, June 23, 2008 - 1:22 am UTC

I see the mecanism of connect by and join above in this url :
<code>
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:489772591421#69739221126046


join is done
start with applied
connect by done
where clause applied to results

I have confusion here :

create table t1
(
  key number,
  parent varchar2(2)
)

create table t2
(
  parent varchar2(2),
  child  varchar2(2)
)

insert into t1 values (1,'A1');
insert into t1 values (2,'B1');

select * from t1;

       KEY PA
---------- --
         1 A1
         2 B1

insert into t2 values ('A1','A2');
insert into t2 values ('A2','A3');
insert into t2 values ('A3','A4');

select * from t2;

PA CH
-- --
A1 A2
A2 A3
A3 A4


Join between t1 and t2 without any condition

select t1.key, t1.parent, t2.child from t1, t2;

       KEY PA CH
---------- -- --
         1 A1 A2
         2 B1 A2
         1 A1 A3
         2 B1 A3
         1 A1 A4
         2 B1 A4

6 rows selected.


Use connect by with 'start with' clause

column scbp format a20

select level, key,
       sys_connect_by_path(key || ',' || child ,'|') scbp 
from t1, t2
start with t2.parent = t1.parent
connect by prior t2.child = t2.parent
order by level

     LEVEL        KEY SCBP
---------- ---------- --------------------
         1          1 |1,A2
         2          2 |1,A2|2,A3
         2          1 |1,A2|1,A3
         3          1 |1,A2|2,A3|1,A4
         3          2 |1,A2|2,A3|2,A4
         3          1 |1,A2|1,A3|1,A4
         3          2 |1,A2|1,A3|2,A4

7 rows selected.


Here I only see one level 1, why ? My reason asking here is that you said join first and then 'start with' clause executed and from intermediate join result between t1 and t2 with parent = 'A1' is 3 record.

select * from
(
  select t1.key, t1.parent, t2.child from t1, t2
)
where parent = 'A1'       

       KEY PA CH
---------- -- --
         1 A1 A2
         1 A1 A3
         1 A1 A4


Questions :
1. Why Oracle not choose A3 as a level one (see SCBP result above) ? Does it just lucky ? Is Oracle implicitly add predicate key = 1 as parent = 'A1' is originated from key = 1 from table t1 ?
2. Why only 1 level one not three ?
3. Query to show output like below - of course without pl/sql :

--insert first at table t2  
insert into t2 values ('B1','B2');
insert into t2 values ('B2','B3');
insert into t2 values ('B3','B4');

       KEY ID
---------- ----------
         1 A2
         1 A3
         1 A4
         2 B2
         2 B3
         2 B4


Thanks. </code>
Tom Kyte
June 23, 2008 - 7:56 am UTC

You queried:

select t1.key, t1.parent, t2.child from t1, t2;

and then you

select level, key,
sys_connect_by_path(key || ',' || child ,'|') scbp
from t1, t2
start with t2.parent = t1.parent
connect by prior t2.child = t2.parent
order by level

used parent from both (but only queried parent from one) and used t2.parent again, but didn't query it. So, I fail to see why the cartesian join you presented is relevant - it isn't showing most of the data you use.


ops$tkyte%ORA10GR2> create table t3
  2  as
  3  select t1.key t1_key, t1.parent t1_parent, t2.child t2_child, t2.parent t2_parent from t1, t2;

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from t3;

    T1_KEY T1_PARENT T2_CHILD T2_PARENT
---------- --------- -------- ---------
         1 A1        A2       A1
         1 A1        A3       A2
         1 A1        A4       A3
         2 B1        A2       A1
         2 B1        A3       A2
         2 B1        A4       A3

6 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> column scbp format a20
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select level, key,
  2         sys_connect_by_path(key || ',' || child ,'|') scbp
  3  from t1, t2
  4  start with t2.parent = t1.parent
  5  connect by prior t2.child = t2.parent
  6  order by level;

     LEVEL        KEY SCBP
---------- ---------- --------------------
         1          1 |1,A2
         2          2 |1,A2|2,A3
         2          1 |1,A2|1,A3
         3          1 |1,A2|2,A3|1,A4
         3          2 |1,A2|2,A3|2,A4
         3          1 |1,A2|1,A3|1,A4
         3          2 |1,A2|1,A3|2,A4

7 rows selected.


so, start with set is:


ops$tkyte%ORA10GR2> select * from t3 where t2_parent = t1_parent;

    T1_KEY T1_PARENT T2_CHILD T2_PARENT
---------- --------- -------- ---------
         1 A1        A2       A1


a single row to start with... and then the first connect by level (you said there is only one level, but your example shows THREE!!!!) would be:


ops$tkyte%ORA10GR2> select * from t3 where t2_parent = 'A2';

    T1_KEY T1_PARENT T2_CHILD T2_PARENT
---------- --------- -------- ---------
         1 A1        A3       A2
         2 B1        A3       A2



and then level three would be:


ops$tkyte%ORA10GR2> select * from t3 where t2_parent = 'A3';

    T1_KEY T1_PARENT T2_CHILD T2_PARENT
---------- --------- -------- ---------
         1 A1        A4       A3
         2 B1        A4       A3




and level four:

ops$tkyte%ORA10GR2> select * from t3 where t2_parent = 'A4';

no rows selected




level 1 connected to 2 level 2's. Each of the level 2's connected to 2 level 3's. Level 3 was the end.

ops$tkyte%ORA10GR2> select rpad('*', 2*level, '*') indent, level, key,
  2         sys_connect_by_path(key || ',' || child ,'|') scbp
  3  from t1, t2
  4  start with t2.parent = t1.parent
  5  connect by prior t2.child = t2.parent
  6  /

INDENT          LEVEL        KEY SCBP
---------- ---------- ---------- --------------------
**                  1          1 |1,A2
****                2          1 |1,A2|1,A3
******              3          1 |1,A2|1,A3|1,A4
******              3          2 |1,A2|1,A3|2,A4
****                2          2 |1,A2|2,A3
******              3          1 |1,A2|2,A3|1,A4
******              3          2 |1,A2|2,A3|2,A4

7 rows selected.



Now, your questions don't make sense because they were based on a flawed example (it is easy to see why there is only one level one once you look at all of the columns, there isn't one level, there are THREE and your example shows that)

and the kicker question, #3, well, please. Tell us the logic there. I have no idea what your output represents and given that the example was not very useful - we are really lost in the weeds.

re: Connect by mecanism

Car Elcaro, June 25, 2008 - 1:25 am UTC

assume that t2 is a process table (t2.child) and and t1.key is a row material. both table are connected with t1.parent and t2.parent as a starting point and then we just want to see all process in t2.child that belongs to t1.id.

Here is the example

t1.key = 1 t1.parent = 'A1' we get from t2 as starting point is (A1,A2) and then connected to it self so resulting A2, A3, A4 as processes that key = 1 is pass.
Tom Kyte
June 25, 2008 - 8:34 am UTC

this is no example

assume we don't know what t1 looks like (therefore, need a create)
assume same for t2

assume we haven't any data in either of them and need some.

assume "row material" is a meaningless term - I've never heard it.

assume that t1.id and t2.child have some relationship we don't understand "then we just want to see all process in t2.child that belongs to t1.id." is not meaningful to us.


Build better example please - this one was not intuitive. Don't say "i already gave you tables", you did, but not for this example. t1.id is a new thing here.

ignore the existence of the prior example, make everything self contained and clear with true examples. Perhaps build it up the way I did to demonstrate how connect by works above - a bit a time, building layer upon layer showing you what I meant.

completing question on connect by

Car Elcaro, June 28, 2008 - 1:19 am UTC

</code>
Sorry for incomplete and inconsitent question above. Let me reintroduce the problem.
I have table t2 defined as follow :

create table t2
( c_inp varchar2(2),
  c_out varchar2(2),
  c_prc varchar2(2)
);

insert into t2 values ('R1','S1','P1');
insert into t2 values ('S1','T1','P2');
insert into t2 values ('T1','U1','P3');
insert into t2 values ('U1','V1','P4');

insert into t2 values ('R2','S2','P1');
insert into t2 values ('S2','T2','P2');
insert into t2 values ('T2','U2','P3');
insert into t2 values ('U2','V2','P4');
insert into t2 values ('V2','W2','P5');

commit;

column c_inp format a5;
column c_out format a5;
column c_prc format a5;

select * from t2;

C_INP C_OUT C_PRC
----- ----- -----
R1    S1    P1
S1    T1    P2
T1    U1    P3
U1    V1    P4
R2    S2    P1
S2    T2    P2
T2    U2    P3
U2    V2    P4
V2    W2    P5

9 rows selected.


c_inp is RAW MATERIAL or INTERMEDIATE MATERIAL to process specified in c_prc into c_out so each row of this table mean a process. For example R1 is a RAW MATERIAL processed by P1 into S1. Next S1 is an INTERMEDIATE MATERIAL to process into T1. Similar to say that S1,T1,U1,V1 of c_inp is INTERMEDIATE OUTPUT for R1,S1,T1,U1 of c_out respectively.

I have another table that contain who is the supplier of raw material like R1 and R2 on the table above. Here is the structure :

create table t1
(
  c_sup varchar2(4),
  c_raw varchar2(2)
);

insert into t1 values ('SUP1','R1');
insert into t1 values ('SUP2','R2');

commit;

column c_sup format a5;
column c_raw format a5;

select * from t1;

C_SUP C_RAW
----- -----
SUP1  R1
SUP2  R2


Actually question #3 : I want to list supplier with their INTERMEDIATE OUTPUT so for supplier SUP1 I have four intermediate output (S1,T1,U1,V1) and for SUP2 I have five (S2,T2,U2,V2,W2). Expected result :

C_SUP C_OUT
----- -----
SUP1  S1
SUP1  T1
SUP1  U1
SUP1  V1
SUP2  S2
SUP2  T2
SUP2  U2
SUP2  V2
SUP2  W2

9 rows selected.


Currently, my solution use a PL/SQL to solve it. And I search a better solution if it's exist e.g. without using PL/SQL. Here is my current solution :

create or replace function find_output(p_raw varchar2)
return varchar2
is
  l_output varchar2(20);
begin
  select max(sys_connect_by_path(c_out,',') || ',')
  into l_output
  from t2
  start with c_inp = p_raw
  connect by c_inp = prior c_out;

  return l_output;
end;
/

select c_sup, column_value c_out
from (select c_sup, find_output(c_raw) c_list from t1) a,
     table(
       cast(
         multiset(
           (select substr(c_list,instr(c_list,',',1,level)+1,
                                 instr(c_list,',',1,level+1)-
                                 instr(c_list,',',1,level)-1)
            from dual
            connect by level <= length(c_list)-
                                length(replace(c_list,',',''))-1)
         ) as sys.odcivarchar2list
       )
     )
order by c_sup, c_out;

C_SUP C_OUT
----- -----
SUP1  S1
SUP1  T1
SUP1  U1
SUP1  V1
SUP2  S2
SUP2  T2
SUP2  U2
SUP2  V2
SUP2  W2

9 rows selected.


SQL above is also related to my question in AskTom another thread :

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:210612357425#927222300346633385


Many thanks for helping me.
Tom Kyte
June 28, 2008 - 1:41 pm UTC

anytime you have a function:

create function f ( inputs ) return output
as
begin
  select ... into output
    from ....

  return output
end;


and a query

select .... f (inputs ) ....


you know you can

select ... ( select ... ) 


you never need that function, just use a scalar subquery and we'll do scalar subquery caching and other neat stuff to make it more efficient.

To : Car Elcaro

Raj, July 02, 2008 - 11:21 am UTC

Tom,

First and foremost I thank you for your wonderful support to the oracle world. Correct if I am wrong.

Why can't we do it this way. I assumed the OP is using oracle 10g or above.

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bi
PL/SQL Release 10.2.0.2.0 - Production
CORE    10.2.0.2.0      Production
TNS for HPUX: Version 10.2.0.2.0 - Production
NLSRTL Version 10.2.0.2.0 - Production


SQL> select * from t1;

C_SUP      C_RAW
---------- ----------
SUP1       R1
SUP2       R2


SQL> select * from t2;

C_INP      C_OUT      C_PRC
---------- ---------- ----------
R1         S1         P1
S1         T1         P2
T1         U1         P3
U1         V1         P4
R2         S2         P1
S2         T2         P2
T2         U2         P3
U2         V2         P4
V2         W2         P5

9 rows selected.

SQL> l
  1  select c_sup, c_out
  2  from
  3  (
  4  select connect_by_root c_inp root_val, c_out from t2
  5  start with c_inp in (select c_raw from t1)
  6  connect by prior c_out = c_inp
  7  )t, t1
  8* where t.root_val = t1.c_raw
SQL> /

C_SUP      C_OUT
---------- ----------
SUP1       V1
SUP1       U1
SUP1       T1
SUP1       S1
SUP2       V2
SUP2       U2
SUP2       T2
SUP2       W2
SUP2       S2

9 rows selected.

Explain plan for the same.

SQL> /

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 2408931310

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     9 |   126 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN                  |      |     9 |   126 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL         | T1   |     2 |    16 |     3   (0)| 00:00:01 |
|   3 |   VIEW                      |      |     9 |    54 |     3   (0)| 00:00:01 |
|*  4 |    CONNECT BY WITH FILTERING|      |       |       |            |          |
|*  5 |     FILTER                  |      |       |       |            |          |
|   6 |      TABLE ACCESS FULL      | T2   |     9 |    54 |     3   (0)| 00:00:01 |
|*  7 |      TABLE ACCESS FULL      | T1   |     1 |     3 |     3   (0)| 00:00:01 |
|*  8 |     HASH JOIN               |      |       |       |            |          |
|   9 |      CONNECT BY PUMP        |      |       |       |            |          |
|  10 |      TABLE ACCESS FULL      | T2   |     9 |    54 |     3   (0)| 00:00:01 |
|  11 |     TABLE ACCESS FULL       | T2   |     9 |    54 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T"."ROOT_VAL"="T1"."C_RAW")
   4 - access("C_INP"=PRIOR "C_OUT")
   5 - filter( EXISTS (SELECT 0 FROM "T1" "T1" WHERE "C_RAW"=:B1))
   7 - filter("C_RAW"=:B1)
   8 - access("C_INP"=PRIOR "C_OUT")

27 rows selected.


Regards

Raj
Tom Kyte
July 07, 2008 - 7:14 am UTC

big page

no idea who "OP" is - the original poster asked about "how does connect by work".

Remember - there are an infinite number of answers to pretty much every question. So the answer in general to "why can't we do it this way" is - well, of course you can - you can do something like this, that or the other way.

Not that I verified your approach - I didn't really piece together what you were trying to solve - just saying "there are thousands of ways to do anything"

Raj, July 07, 2008 - 9:57 am UTC

Tom,

I thought since I have mentioned (To : Car Elcaro) name in the subject of my post I assumed it would have correlated my reference of OP to be this reviewer, apparently it didn't. So to be more precise I posted one of the solutions which could be used for his question "Currently, my solution use a PL/SQL to solve it. And I search a better solution if it's exist e.g. without using PL/SQL."

Hope it makes sense this time.

Regards

Raj

using connect by to replace self join

A reader, August 05, 2008 - 10:13 pm UTC

I am referring to your post (reproduced below) on this page earlier where you suggested to use a self join:

(Dated May 7, 2006 - 11am US/Eastern:)

"query all Employees ( lets say whose sal >=3000 ) and their managers."

why would you use connect by?

ops$tkyte@ORA10GR2> select ename, mgr, sal from emp where sal >= 3000;

ENAME             MGR        SAL
---------- ---------- ----------
SCOTT            7566       3000
KING                        5000
FORD             7566       3000

or if you needed their manager name:

  1  select a.ename, b.ename mgr, a.sal
  2    from emp a, emp b
  3   where a.sal >= 3000
  4*    and a.mgr = b.empno(+)
ops$tkyte@ORA10GR2> /

ENAME      MGR               SAL
---------- ---------- ----------
FORD       JONES            3000
SCOTT      JONES            3000
KING                        5000

My Question:

(ORACLE VERSION: 10.2.0.2)
I have a view that does a self join to a table that is 25 million rows. Very similar to what you suggested above. The self join's only purpose is to get employee's manager name. But the self join results in nested loop outer, causing very high lio. Query on this view is executed hundreds of times in a sec, causing CPU related issues.

I just want to see if connect by will give me a better execution plan, but I do not know how to write a connect by which will give me this:

empno   ename   job           manager's job
7839    KING    PRESIDENT     NULL
7566    JONES   MANAGER       PRESIDENT
7788    SCOTT   ANALYST       MANAGER
7369    SMITH   CLERK         ANALYST
..

Can you please help me write one? I tried sys_connect_by_path, but could not go further.

Tom Kyte
August 06, 2008 - 8:38 am UTC

are you using the cbo or rbo and what are the estimated cardinalities in the plan.

the optimizer would not use nested loops unless you did something like hint it or used first rows optimization.

show us the plan and explain your optimizer environment.


but wait, if you have a query that returns 25 million rows and is executed "hundreds of times per second", then you must not be returning 25 million rows, you must be returning a very very very very small subset


so, you need to give us more information.

and lose the connect by idea, that doesn't apply, doesn't make sense.

A reader, February 12, 2009 - 11:07 pm UTC

Tom,
Need to take care of symbol changes in our database and calculate the average volume for last 10 days.

create table pricing
(ticker varchar2(10),
px_volume number,
date_loaded date);

create table ticker_changes
(old_ticker varchar2(10),
new_ticker varchar2(10),
start_date date);

insert into pricing
values
('aapl',100,to_date('08/18/2008','mm/dd/yyyy'));

insert into pricing
values
('aapl',200,to_date('08/19/2008','mm/dd/yyyy'));

insert into pricing
values
('aapl',300,to_date('08/20/2008','mm/dd/yyyy'));

insert into pricing
values
('aapl',400,to_date('08/21/2008','mm/dd/yyyy'));

commit;

on 08/22/2008, aapl changes to abc and this is inserted to ticker_changes table. From 08/22/2008 onwards, pricing table has abc instead of aapl

insert into ticker_changes
values
('aapl','abc',to_data('08/22/2008','mm/dd/yyyy'));

My output should be as shown below for 08/21/2008

dt ticker avg10d
08/21/2008 aapl 100

10 days before 08/21/2008 is 08/11/2008. I need to take the px_volume for all these days and divide by 10. But I only have data on 08/18,08/19,08/20,08/21. Regardless of if I have data on any of these 10 days, I have to take the sum of the px_volume and divide by 10. So, my average is (100+200+300+400)/20=100 for 08/21/2008

On 08/22/2008, there is a change to ticker = aapl. aapl has been changed to abc. In pricing table, I will insert one record on 08/22/2008

insert into pricing
values
('abc',500,to_date('08/22/2008','mm/dd/yyyy'));

dt ticker avg10d
08/22/2008 abc 150

10 days before 08/22/2008 is 08/12/2008. I need to take the px_volume for all these days and divide by 10. But I only have data on 08/18,08/19,08/20,08/21,08/22 Regardless of if I have data on any of these 10 days, I have to take the sum of the px_volume and divide by 10. So, my average is (100+200+300+400+500)/20=150 for 08/22/2008. Here for my calculations I need to consider aapl from 08/12/2008 to 08/21/2008 and abc on 08/22/2008. I am assuming this can be done with hierachical queries. Can you guid me?

On 08/25/2008 abc changes to pqr. I have a record in ticker_changes table

insert into ticker_changes
values
('abc','pqr',to_data('08/25/2008','yyyy/mm/dd'));

insert into pricing
values
('abc',600,to_date('08/23/2008','mm/dd/yyyy'));

insert into pricing
values
('abc',700,to_date('08/24/2008','mm/dd/yyyy'));

insert into pricing
values
('pqr',800,to_date('08/25/2008','mm/dd/yyyy'));


dt ticker avg10d
08/24/2008 abc 150
08/25/2008 pqr 260

10 days before 08/25/2008 is 08/15/2008. I need to take the px_volume for all these days and divide by 10.

(100+200+300+400+500+600+700+800)/20=260 for 08/25/2008.

Can you tell me how to achieve this?


Tom Kyte
February 16, 2009 - 10:09 am UTC

Reader, February 17, 2009 - 12:02 pm UTC

Tom,
I went through the other thread that you pointed out and wrote the query -

select nvl(the_orig_sym,ticker) the_sym,
date_loaded,
px_volume
from pricing p left outer join
(select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
connect by prior new_ticker = old_ticker and prior start_date < start_date)
) d
on (p.date_loaded between d.sdate and d.edate and (p.ticker = d.new_ticker) )
where (sdate is null and edate is null) or (date_loaded between sdate and edate)
order by /*ticker,*/ date_loaded;

I get extra record when the start with connect by is used for the ticker_chages table, which is correct. When I try to join this table d with pricing, I am getting extra record for ticker=pqr on 08/25/2008 as there are two records for pqr in the start with connect by query (table d in this case). When I run the query on 08/25, I need to get only one record. Can you tell me how to achieve this?

this is the query I used to get the total of 10 days.
select *
from(
select --nvl(the_orig_sym,ticker) the_sym,
ticker,
date_loaded,
px_volume,
sum(px_volume) over (partition by nvl(the_orig_sym,ticker) order by date_loaded desc rows between current row and 9 following) as sum_10d
from pricing p left outer join
(select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
connect by prior new_ticker = old_ticker and prior start_date < start_date)
) d
on (p.date_loaded between d.sdate and d.edate and (p.ticker = d.new_ticker) )
where (sdate is null and edate is null) or (date_loaded between sdate and edate)
order by date_loaded )
where date_loaded= to_date('08/25/2008','mm/dd/yyyy')

TICKER DATE_LOADED PX_VOLUME SUM_10D
---------- --------- ---------- ----------
pqr 25-AUG-08 800 800
pqr 25-AUG-08 800 3600

I should get
TICKER DATE_LOADED PX_VOLUME SUM_10D
---------- --------- ---------- ----------
pqr 25-AUG-08 800 3600

Reader, February 17, 2009 - 5:05 pm UTC

Tom,
I was wondering if you could answer the above question.
Tom Kyte
February 17, 2009 - 8:11 pm UTC

first - how hard have you tried? do you understand how this works (if not, please get there before going further)

second - make the test case really easy for me to follow - sort of a long narrative above - and you just say "I need one row, this one" right above - but don't really describe how you know "that is the row" (and perhaps when you do, what is needed will become obvious...)


so, put it all together concisely (the test case to load the data - not scattered in various reviews, it is really hard for me to read up and down over more than one and figure out what is relevant, what is not - especially when hours/days and many other questions go in between your entries). But truly describe what needs to happen to the data, psuedo code it (like we used to in the olden days, to algorithmically describe it to someone...) even

Reader, February 18, 2009 - 12:19 am UTC

Tom,
Thanks for your reply.

I get data in pricing table every day. pricing table has data about each ticker traded in the market. For each ticker that comes in the pricing table, 10d sum has to be calculated and has to be inserted into other table.

For example:


--08/18
insert into pricing values ('aapl',100,to_date('08/18/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/18/2008','mm/dd/yyyy'));

--08/19

insert into pricing values ('aapl',200,to_date('08/19/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/19/2008','mm/dd/yyyy'));


--08/20

insert into pricing values ('aapl',300,to_date('08/20/2008','mm/dd/yyyy'));

insert into pricing values ('kkk',300,to_date('08/20/2008','mm/dd/yyyy'));

--08/21

insert into pricing values ('aapl',400,to_date('08/21/2008','mm/dd/yyyy'));



insert into pricing values ('aapl',100,to_date('08/18/2008','mm/dd/yyyy'));

on 08/18, if I run the query, the result is

  select *
    from(
    select ticker,
           sum(px_volume) over (partition by ticker order by date_loaded desc rows between current row and 9 following) as sum_10d,
           date_loaded
    from pricing)
   where date_loaded = to_date('08/18/2008','mm/dd/yyyy')
SQL> /

TICKER        SUM_10D DATE_LOAD
---------- ---------- ---------
aapl              100 18-AUG-08
orcl              100 18-AUG-08

On 08/19/2008, I get only aapl,orcl

  select *
    from(
    select ticker,
           sum(px_volume) over (partition by ticker order by date_loaded desc rows between current row and 9 following) as sum_10d,
           date_loaded
    from pricing)
   where date_loaded = to_date('08/19/2008','mm/dd/yyyy')
SQL> /

TICKER        SUM_10D DATE_LOAD
---------- ---------- ---------
aapl              300 19-AUG-08
orcl    200 19-AUG-08


Similary if I run the query on 08/20

On 08/20/2008, if I run the query:

  select *
    from(
    select ticker,
           sum(px_volume) over (partition by ticker order by date_loaded desc rows between current row and 9 following) as sum_10d,
           date_loaded
    from pricing)
   where date_loaded = to_date('08/20/2008','mm/dd/yyyy')
SQL> /

TICKER        SUM_10D DATE_LOAD
---------- ---------- ---------
aapl              600 20-AUG-08
kkk               300 20-AUG-08

I have to run this for every day that I have data in pricing table

Apparently, the ticker aapl got changed to abc on 08/22. And this data is inserted to ticker_changes table

insert into ticker_changes values ('aapl','abc',to_data('08/22/2008','mm/dd/yyyy'));

And in the pricing table, I will start getting abc instead of aapl.

insert into pricing values ('abc',500,to_date('08/22/2008','mm/dd/yyyy'));

When I run the query on 08/22 to get the 10d sum, I should, consider aapl, prior to 08/22 and abc from 08/22 onwards.
The sum should be as shown below -

TICKER        SUM_10D DATE_LOAD
---------- ---------- ---------
abc              1500 20-AUG-08

1500 is got from the below rows inserted to pricing table -
aapl              100 18-AUG-08
aapl              200 19-AUG-08
aapl              300 20-AUG-08
aapl              400 21-AUG-08
abc               500 22-AUG-08

For this, I used the query that you pointed to in the other thread. 

    select  the_orig_sym, new_ticker, start_date sdate,
            nvl(lead(start_date) over (partition by the_orig_sym order by start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
    from (select connect_by_root old_ticker the_orig_sym,
                 new_ticker,
                start_date
          from ticker_changes
          where start_date <= to_date('08/22/2008','mm/dd/yyyy')
         connect by prior new_ticker = old_ticker and prior start_date < start_date)
SQL> /

THE_ORIG_S NEW_TICKER SDATE     EDATE
---------- ---------- --------- ---------
aapl       abc        22-AUG-08 31-DEC-99


On 08/22, when I get the sum for last 10 days, I have to add the following and I should get, 1500 and it should be displayed under the ticker abc

Ticker    volume date_loaded
aapl              100   18-AUG-08
aapl              200   19-AUG-08
aapl              300   20-AUG-08
aapl              400   21-AUG-08
abc               500   22-AUG-08




Result for abc along with other tickers loaded on 08/22
Ticker sum_10d date_loaded
abc 1500 22-AUG-08

(abc is actully the sum of aapl(prior to 08/22) and abc (on 08/22)

--08/23 load to pricing table
insert into pricing values ('abc',500,to_date('08/22/2008','mm/dd/yyyy'));

On 08/23, When I get the sum for last 10 days, I have to add the following and I should be getting, 

Ticker    volume date_loaded
aapl              100   18-AUG-08
aapl              200   19-AUG-08
aapl              300   20-AUG-08
aapl              400   21-AUG-08
abc               500   22-AUG-08
abc               500           23-AUG-08

Result for abc along with other tickers loaded on 08/23
Ticker sum_10d DATE_LOADED
abc     2000 08/23/2008

Please let me know if I am missing something. 

Tom Kyte
February 18, 2009 - 7:44 am UTC

please re-read my request above.

... so, put it all together concisely ... not scattered in various reviews, it is really hard for me to read up and down over more than one and figure out what is relevant.... But truly describe what needs to happen to the data, psuedo code it (like we used to in the olden days, to algorithmically describe it to someone...) ....

so, in looking at what you posted, would one have to go to more than one review section to get all of the necessary information?


but that aside, if you have:
    select  the_orig_sym, new_ticker, start_date sdate,
            nvl(lead(start_date) over (partition by the_orig_sym order by 
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
    from (select connect_by_root old_ticker the_orig_sym,
                 new_ticker,
                start_date
          from ticker_changes
          where start_date <= to_date('08/22/2008','mm/dd/yyyy')
         connect by prior new_ticker = old_ticker and prior start_date < 
start_date)
SQL> /

THE_ORIG_S NEW_TICKER SDATE     EDATE
---------- ---------- --------- ---------
aapl       abc        22-AUG-08 31-DEC-99




you basically have a mapping, the one you need, I see a single row. What is the issue, above you seemed to be saying "i get two"

Connect By on DUAL table

Matteo Mitrano, February 18, 2009 - 6:19 am UTC

Hi Tom,
with regard to this present topic, could you please explain to me why I found this behaviour in my environment? Is there's something wrong with it?

SQL> SELECT * FROM v$version;

Oracle9i Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production

Query 1:
SELECT LEVEL - 1 AS set_of_days
FROM DUAL
CONNECT BY LEVEL <= (ADD_MONTHS (TRUNC (SYSDATE, 'y'), 12) - TRUNC (SYSDATE, 'y'))

Output 1:
ROW# SET_OF_DAYS
1 0
2 1
3 2
4 3
... ...
250 249

(You see, 250 number of rows)


Query 2:
SELECT (TRUNC (SYSDATE, 'y') + set_of_days) AS dt
FROM (SELECT LEVEL - 1 AS set_of_days
FROM DUAL
CONNECT BY LEVEL <= (ADD_MONTHS (TRUNC (SYSDATE, 'y'), 12) - TRUNC (SYSDATE, 'y')))

Output 2:
ROW# DT
1 1-gen-2009
2 2-gen-2009
3 3-gen-2009
4 4-gen-2009
... ...
365 31-dic-2009

(365 number of rows)

Thanks in advance, Tom. I really appreciate your great site and your patience.
Matteo

Tom Kyte
February 18, 2009 - 3:36 pm UTC

it was a 9i really old issue.

you use an inline view as you did to avoid it.

Reader, February 18, 2009 - 11:31 pm UTC

Tom,
Regarding the previous question about the ticker changes:

select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/18/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/18/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded

TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
orcl orcl 18-AUG-08 100



select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/19/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/19/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded

TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
aapl aapl 19-AUG-08 200
orcl orcl 18-AUG-08 100


select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/20/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/20/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded


TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
aapl aapl 19-AUG-08 200
aapl aapl 20-AUG-08 300
orcl orcl 18-AUG-08 100

select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/20/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/20/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded

TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
aapl aapl 19-AUG-08 200
aapl aapl 20-AUG-08 300
aapl aapl 21-AUG-08 400
orcl orcl 18-AUG-08 100


Till 21st, there were no ticker changes. aapl is returned when the above query is ran .



select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/22/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/22/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded

On 22th, aapl, changed to abc; abc started coming in pricing table, the_orig_sym=aapl

TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
aapl aapl 19-AUG-08 200
aapl aapl 20-AUG-08 300
aapl aapl 21-AUG-08 400
abc aapl aapl 22-AUG-08 500
orcl orcl 18-AUG-08 100


select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/25/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by the_sym,date_loaded

on 25th, abc changed to pqr. Using the start with connect by query, we got an entry aapl, pqr (aapl is the original ticker for pqr). Since pqr is present
twice in this start with connect by relationship, When I try to join this to pricing table, I get two records for pqr.

TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME
---------- ---------- ---------- --------- ----------
aapl aapl 18-AUG-08 100
aapl aapl 19-AUG-08 200
aapl aapl 20-AUG-08 300
aapl aapl 21-AUG-08 400
abc aapl aapl 22-AUG-08 500
abc aapl aapl 23-AUG-08 600
abc aapl aapl 24-AUG-08 700
pqr aapl aapl 25-AUG-08 800
pqr abc abc 25-AUG-08 800
orcl orcl 18-AUG-08 100

I do the last 10 day sum using the below query; I get the correct sum for prq on 25th, which is 3600


select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
,sum(px_volume) over (partition by nvl(the_orig_sym,ticker) order by date_loaded desc rows between current row and 9 following) as sum_10d
from (select * from pricing where date_loaded <= to_Date('08/25/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by date_loaded


TICKER THE_ORIG_S THE_SYM DATE_LOAD PX_VOLUME SUM_10D
---------- ---------- ---------- --------- ---------- ----------
aapl aapl 18-AUG-08 100 100
orcl orcl 18-AUG-08 100 100
aapl aapl 19-AUG-08 200 300
aapl aapl 20-AUG-08 300 600
aapl aapl 21-AUG-08 400 1000
abc aapl aapl 22-AUG-08 500 1500
abc aapl aapl 23-AUG-08 600 2100
abc aapl aapl 24-AUG-08 700 2800
pqr aapl aapl 25-AUG-08 800 3600
pqr abc abc 25-AUG-08 800 800

10 rows selected.


but I want to display only the record with the total and not pqr with px_volume=800 on 25th.Since pqr is present in pricing table on 25th, I need to get the
sum for last 10 days, by considering ticker changes if any.

if I run the below query, I get two records, one with the 10day sum 3600 and also one with the px_volume = 800 which was loaded on 25th

select ticker,sum_10d,date_loaded
from ( select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
,sum(px_volume) over (partition by nvl(the_orig_sym,ticker) order by date_loaded desc rows between current row and 9 following) as sum_10d
from (select * from pricing where date_loaded <= to_Date('08/25/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by date_loaded)
where date_loaded = to_Date('08/25/2008','mm/dd/yyyy')


TICKER SUM_10D DATE_LOAD
---------- ---------- ---------
pqr 3600 25-AUG-08
pqr 800 25-AUG-08

I have to get the 10 day sum for each ticker loaded into pricing table on each day, considering the ticker changes if any.
I am sorry for troubling you again on this. Your start with connect by is an excellent solution. I am not able to derive what I want from the information you gave.
Please let me know if I am missing details again. Thanks a lot for your help.

Thanks!!!

Matteo Mitrano, February 19, 2009 - 4:53 am UTC


Reader, February 19, 2009 - 12:25 pm UTC

Tom,
With respect to the ticker changes question:

I used this SQl to get the sum_10d for the tickers in the pricing table for a particular day. I am using max to get the sum_10d.

Pricing table has around 200,000 records everyday and the ticker_changes now has 1500 records. Not sure how this works when I run on the actual table. I am not sure if this is the correct solution. Please let me know if you do not mind.

select ticker,date_loaded,max(sum_10d) sum_10d
from(
select mt.ticker
,d.the_orig_sym
,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded
,mt.px_volume
,sum(px_volume) over (partition by nvl(the_orig_sym,ticker) order by date_loaded desc rows between current row and 9 following) as sum_10d
from (select * from pricing where date_loaded <= to_Date('08/26/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by
start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by date_loaded)
where date_loaded = to_Date('08/25/2008','mm/dd/yyyy')
group by ticker,date_loaded

A reader, February 25, 2009 - 12:54 pm UTC

Tom,
I was wondering if the requirement that I sent on February 18, 2009 - 11pm US/Eastern was not enough. Not sure if you can suggest me anything in regards to what I sent. Please let me know, if something is not clear again.
Tom Kyte
February 25, 2009 - 6:13 pm UTC

it was too large, not concise, not self contained - think about how much work I'd have to do to put together everything needed to test with.

I skip those.

Another Connect by query

Maverick, February 25, 2009 - 10:21 pm UTC

Tom, I need to build a query to get all the parents or children for a given ssn number.
Let me explain:
In our system an employee can terminate from the services and re-enroll with a different SSN [that might have changed] and is linked by using parent_empno to his previous enrollments:

Here is a sample test scenario:
[I am using Oracle 10g]

create table test1 (ssn number(9),empno number(12),parent_empno number(12));

insert into test1 values(123456789,1234,null); --Enrolled into system
insert into test1 values(123456789,1235,1234); --Terminated and re-enrolled [linked by parent_empno]
insert into test1 values(123456789,1236,1235); --Terminated and re-enrolled [linked by parent_empno]
insert into test1 values(987654321,1237,1236); --Terminated and re-enrolled with diff. SSN [linked by parent_empno]
insert into test1 values(987654321,1238,1237); --Terminated and re-enrolled [linked by parent_empno]
insert into test1 values(987654321,1239,1238); --Terminated and re-enrolled [linked by parent_empno]

This link was created mainly for reporting purposes.


If I need to see all the related [linked] empno's for this employee by using an SSN, I need to get the following:

select ssn,empno from test1
start with parent_empno is null and ssn=987654321
connect by prior empno=parent_empno;

This will give me all related empno's who are it's children:

987654321,1237
987654321,1238
987654321,1239

But I also want to see his parent records:
123456789,1234
123456789,1235
123456789,1236

How to acheive this? Hope I am clear enough in explaining this. Can I do a union between two hirearchical queries, one going down and other going up? If so, how can I go upwards to bring parent records?

Thanks for all your help.
Tom Kyte
March 03, 2009 - 7:11 am UTC

... between two hirearchical queries, one going down and other going up? ...

sure, but only because there is no such thing as "up or down", there are just links

ops$tkyte%ORA10GR2> select rpad('*',2*level,'*') || ename nm, empno, mgr from scott.emp
  2  start with mgr is null
  3  connect by prior empno = mgr
  4  /

NM                   EMPNO        MGR
--------------- ---------- ----------
**KING                7839
****JONES             7566       7839
******SCOTT           7788       7566
********ADAMS         7876       7788
******FORD            7902       7566
********SMITH         7369       7902
****BLAKE             7698       7839
******ALLEN           7499       7698
******WARD            7521       7698
******MARTIN          7654       7698
******TURNER          7844       7698
******JAMES           7900       7698
****CLARK             7782       7839
******MILLER          7934       7782

14 rows selected.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select rpad('*',2*level,'*') || ename nm, empno, mgr , 'down'
  2  from scott.emp
  3  start with ename = 'SCOTT'
  4  connect by prior empno = mgr
  5  union all
  6  select rpad('*',2*level,'*') || ename nm, empno, mgr , 'up'
  7  from scott.emp
  8  start with ename = 'SCOTT'
  9  connect by prior mgr = empno
 10  /

NM                   EMPNO        MGR 'DOW
--------------- ---------- ---------- ----
**SCOTT               7788       7566 down
****ADAMS             7876       7788 down
**SCOTT               7788       7566 up
****JONES             7566       7839 up
******KING            7839            up


Reader, February 26, 2009 - 1:14 pm UTC

create table pricing
(ticker varchar2(10),
px_volume number,
date_loaded date);

create table ticker_changes
(old_ticker varchar2(10),
new_ticker varchar2(10),
start_date date);

--data for 08/18
insert into pricing values ('aapl',100,to_date('08/18/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/18/2008','mm/dd/yyyy'));

--data for 08/19
insert into pricing values ('aapl',200,to_date('08/19/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/19/2008','mm/dd/yyyy'));

--data for 08/20
insert into pricing values ('aapl',300,to_date('08/20/2008','mm/dd/yyyy'));

insert into pricing values ('kkk',300,to_date('08/20/2008','mm/dd/yyyy'));

--data for 08/21
insert into pricing values ('aapl',400,to_date('08/21/2008','mm/dd/yyyy'));

--data for 08/22
insert into pricing values ('abc',500,to_date('08/22/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/22/2008','mm/dd/yyyy'));


--data for 08/23
insert into pricing values ('abc',600,to_date('08/23/2008','mm/dd/yyyy'));

--data for 08/24
insert into pricing values ('abc',700,to_date('08/24/2008','mm/dd/yyyy'));

--data for 08/25
insert into pricing values ('pqr',800,to_date('08/25/2008','mm/dd/yyyy'));

insert into pricing values ('orcl',100,to_date('08/25/2008','mm/dd/yyyy'));

--ticker changes insert
insert into ticker_changes values ('aapl','abc',to_data('08/22/2008','mm/dd/yyyy'));

insert into ticker_changes values ('abc','pqr',to_data('08/25/2008','yyyy/mm/dd'));

commit;

select * from ticker_changes order by start_date;

OLD_TICKER NEW_TICKER START_DATE
---------- ---------- ----------
aapl abc 22-AUG-08
abc pqr 25-AUG-08
pqr aapl 26-AUG-08

select * from pricing order by date_loaded;

TICKER PX_VOLUME DATE_LOAD
---------- ---------- ---------
orcl 100 18-AUG-08
aapl 100 18-AUG-08
aapl 200 19-AUG-08
orcl 100 19-AUG-08
kkk 300 20-AUG-08
aapl 300 20-AUG-08
aapl 400 21-AUG-08
orcl 100 22-AUG-08
abc 500 22-AUG-08
abc 600 23-AUG-08
abc 700 24-AUG-08
orcl 100 25-AUG-08
pqr 800 25-AUG-08

As shown in the above data from pricing table, I get tickers everday with there volume. There are chances that ticker can be changed. This ticker changes information is in ticker_changes table as shown above.

Everyday, I need to use pricing table and caclulate last 10 day sum of volume (including that day in the 10 day) for each ticker in pricing table. I need to consider
the ticker changes in the calculation.

On 08/18/2008, I need to calculate sum for aapl, orcl. So the 10 day volume is from 08/08 to 08/18. Since I do not have data from 08/08 to 08/17, I use only the data from 08/18.

TICKER TOTAL_VOLUME calculation_date
aapl 100 18-AUG-08
orcl 100 18-AUG-08

On 08/19, I have two tickers in pricing table, aapl and orcl. So the 10 day volume is from 08/08 to 08/18. Since I have data from 08/18 and 08/19, I use that to calculate the volume.

TICKER TOTAL_VOLUME calculation_date
aapl 300 19-AUG-08
orcl 200 19-AUG-08

This continues everyday. On 08/22 as shown in the ticker_chages table "aapl" changed to "abc". For 10 day volume calculation on 08/22, I need to consider aapl prior to 08/22 and abc on 08/22

TICKER TOTAL_VOLUME calculation_date
abc 1500 22-AUG-08 (this is sum of aapl+abc)
orcl 300 22-AUG-08

On 08/25, as shown in the ticker_changes table "abc" changed to "pqr", For 10 day volume calculation on 08/25, I need to consider aapl prior to 08/22,
abc from 08/22 to 08/24 and pqr on 08/25

TICKER TOTAL_VOLUME calculation_date
abc 3600 22-AUG-08 (this is sum of aapl+abc)
orcl 400 22-AUG-08

I used the hierarchical query that you posted to get the ticker changes and the sdate and edate range.

alter session set nls_date_format='DD-MON-YYYY';

select the_orig_sym, new_sym, effective_dt sdate,
nvl(lead(effective_dt) over (partition by the_orig_sym order by effective_dt)-1,to_date('1-jan-3000','dd-mon-yyyy')) edate
from ( select connect_by_root old_ticker the_orig_sym, new_ticker as new_sym, start_date as effective_dt
from ticker_changes
where start_date<= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date < start_date);


THE_ORIG_S NEW_SYM SDATE EDATE
---------- ---------- ----------- -----------
aapl abc 22-AUG-2008 24-AUG-2008
aapl pqr 25-AUG-2008 01-JAN-3000
abc pqr 25-AUG-2008 01-JAN-3000

I get two records for pqr on 08/25.
1) 1st the record from the the ticker_changes table
2) 2nd record is got from the connect by

If I use this to join to the pricing table, I get two records on 08/25 since I am joining on ticker_changes.new_sym with pricing.ticker

I used the below query to get all the necessary tickers and sdate and edate.

When I run this query, I get two records for pqr, since I have two records from the connect by query. If I use these two records in my calculation, I get extra 800, which is not correct. Here is where I am not able to figure out how to resolve this. Can you help me?

select mt.ticker,d.the_orig_sym,nvl(d.the_orig_sym,mt.ticker) the_sym
,mt.date_loaded,mt.px_volume
from (select * from pricing where date_loaded <= to_Date('08/25/2008','mm/dd/yyyy')) mt
left join ( select the_orig_sym, new_ticker, start_date sdate,
nvl(lead(start_date) over (partition by the_orig_sym order by start_date)-1,to_date('31-DEC-9999','dd-mon-yyyy')) edate
from (select connect_by_root old_ticker the_orig_sym,
new_ticker,
start_date
from ticker_changes
where start_date <= to_date('08/25/2008','mm/dd/yyyy')
connect by prior new_ticker = old_ticker and prior start_date <
start_date)) d
on (MT.DATE_LOADED between d.sdate and d.edate and MT.TICKER = d.new_ticker)
order by date_loaded,ticker


I hope my explanation is not too much this time. I only have arund 20 lines of explanation and other than that all others are data to demostrate what I am trying to achieve. Please help.


Tom Kyte
March 03, 2009 - 7:59 am UTC

... If I use this to join to the pricing table, I get two records on 08/25 since I
am joining on ticker_changes.new_sym with pricing.ticker
.....


easiest fix - just keep one of them - distinct or group by the set before applying the aggregates.

Connect By issue

Chandra, April 03, 2009 - 3:49 am UTC

Hi Tom,
Please go through the below script.

CREATE TABLE MATCHTABLE(
MATCH1 VARCHAR2(10),
MATCH2 VARCHAR2(10)
)


INSERT INTO MATCHTABLE VALUES('A1','A2');
INSERT INTO MATCHTABLE VALUES('A2','A1');

INSERT INTO MATCHTABLE VALUES('A2','A3');
INSERT INTO MATCHTABLE VALUES('A3','A2');

INSERT INTO MATCHTABLE VALUES('A4','A5');
INSERT INTO MATCHTABLE VALUES('A5','A4');

INSERT INTO MATCHTABLE VALUES('A4','A6');
INSERT INTO MATCHTABLE VALUES('A6','A4');

INSERT INTO MATCHTABLE VALUES('A7','A8');
INSERT INTO MATCHTABLE VALUES('A8','A7');

SELECT * FROM MATCHTABLE;

MATCH1 MATCH2
----- -----
A1 A2
A2 A1
A2 A3
A3 A2
A4 A5
A5 A4
A4 A6
A6 A4
A7 A8
A8 A7

In this table, we will always have 2 rows for corresponding any 2 matchids. I need the output as below grouped matchids in groups. The problem I am facing is there is no starting point from where to start the hierarchy tree. And if I start matches for all distinct matchids, the groups tend to be the subset of some other superset and I need the supersets only.


GROUPID MATCHID
---------------------
1 A1
1 A2
1 A3

2 A4
2 A5
2 A6

3 A7
3 A8

Thanks.
Tom Kyte
April 03, 2009 - 7:36 am UTC

you need to explain this better. if you have nothing to "start with", no way to describe what to "start with" - how can I figure out "what to start with"

give me some logic here - FORGET CONNECT BY - it might not even be the proper approach. Describe the inputs and the DESIRED OUTPUTS using english, in the form of a program specification - pretend you are the end user (not a programmer) and you want a programmer to develop code for you - what would you as the end user say to the programmer to describe what you want.

Connect By Clause

Chandra, April 06, 2009 - 2:49 am UTC

Sorry...

From the user perspective, table MAtchTable contains cols Match1 and Match2 which are the match ids. Any row eg A1, A2 corresponds that A1 and A2 ids have got a match among them. So, I need to group the matchids.

We start picking the matches from column Match1- A1, A1 = A2 and then for A2, A2 = (A3 and A1) and then for A3, A3 = A2. So by avoiding the circular loop, A1, A2 and A3 falls in a single group. Note - Circular loops are to be avoided.

If we start from A2, A2=(A1 and A3) and then for A1, A1=A2 and for A3, A3=A2. So they are again grouped as A1, A2 and A3.

Similarly, A4,A5 and A6 share common matches, so are grouped together.

And so with A7 and A8.

Balaji, May 19, 2009 - 10:36 am UTC

Hi tom,
Can you explain how the below mentioned query execute, i am confused with
second connect by prior condition in query2 and by adding this the query get executed fast.
what is the difference between query1 and query2 in execution

query1:-
========
SELECT lm_parentlimit_lm, lm_credprogram_cp_k, lm_limit_k, LEVEL
FROM ca_lmcreditprogramlimits
WHERE lm_credprogram_cp_k = 1088 AND lm_parentlimit_lm IS NULL
START WITH lm_limit_k = 22
CONNECT BY PRIOR lm_parentlimit_lm = lm_limit_k

query2:-
========
SELECT lm_parentlimit_lm, lm_credprogram_cp_k, lm_limit_k, LEVEL
FROM ca_lmcreditprogramlimits
WHERE lm_credprogram_cp_k = 1088 AND lm_parentlimit_lm IS NULL
START WITH lm_limit_k = 22
CONNECT BY PRIOR lm_parentlimit_lm = lm_limit_k
AND PRIOR lm_credprogram_cp_k = lm_credprogram_cp_k
Tom Kyte
May 23, 2009 - 12:14 pm UTC

they are as different as night and day

you would actually need to understand the constructs you are using here.... do you understand the use of connect by, when and how the where clause is applied? that the two queries you present are ENTIRELY AND UTTERLY different - they do not return the same answers?

They each are valid.
But they each answer entirely different queries, different questions.

One tells you why the sky is blue, the other why grass is green (as way of analogy)



A connect by with a where clause is similar to :


a) build the ENTIRE HIERARCHY
b) then apply the where clause


A connect by - by itself is

a) build the ENTIRE HIERARCHY


now, your query two, it builds (stops faster) a MUCH MUCH SMALLER hierarchy.

The query one, builds a big hierarchy and then filters it.



you cannot compare the above two queries, they are as different as

select * from emp;
select * from dept;

are.




Strange behavior

Mihail Bratu, December 22, 2010 - 9:51 am UTC

Hi Tom,
I discovered some strange behavior around the connect by clause. I'll reproduce this issue into one small test case.

drop table emp
/
drop table enames
/
create table emp as select * from scott.emp
/
create table enames(ename varchar2(10))
/
insert into enames
select column_value from table(sys.odcivarchar2List('KING','JONES','CLARK'))
/
commit
/
Let's see the tree starting from mgr is null and including the employee names from the enames table:

column ename_s format a20
select rpad(' ' ,2*(level-1), ' ')|| e.ename ename_s, prior e.ename ename_prior
from emp e
connect by prior e.empno = e.mgr
and e.ename in (select ename from enames)
start with e.mgr is null
/
ENAME_S ENAME_PRIO
-------------------- ----------
KING
JONES KING
CLARK KING

Now we'll evolve the tree applying the condition on connect by with one level delay using the prior operator.
We expect to obtain the above tree with one level extended. Instead of that the whole tree is returned.

select rpad(' ' ,2*(level-1), ' ')|| e.ename ename_s, prior e.ename ename_prior
from emp e
connect by prior e.empno = e.mgr
and prior e.ename in (select ename from enames)
start with e.mgr is null
/
ENAME_S ENAME_PRIO
-------------------- ----------
KING
JONES KING
SCOTT JONES
ADAMS SCOTT
FORD JONES
SMITH FORD
BLAKE KING
ALLEN BLAKE
WARD BLAKE
MARTIN BLAKE
TURNER BLAKE
JAMES BLAKE
CLARK KING
MILLER CLARK

Now we'll use the EXISTS function for the same target:

select rpad(' ' ,2*(level-1), ' ')|| e.ename ename_s, prior e.ename ename_prior
from emp e
connect by prior e.empno = e.mgr
and prior case when exists(
select null from enames where ename = e.ename
) then 1 end = 1
start with e.mgr is null
/
ENAME_S ENAME_PRIO
-------------------- ----------
KING
JONES KING
SCOTT JONES
FORD JONES
BLAKE KING
CLARK KING
MILLER CLARK

The result matches the expectation!

The question is, is this a bug, or I'm missing something? I appreciate your comments.
Thank you

Tom Kyte
December 22, 2010 - 2:50 pm UTC

In 10gR2 - I reproduce your findings (10.2.0.4)

in 11gR2 - I do NOT reproduce your finds, it does the right thing (11.2.0.2)

please contact support for this one.

Thank you (11.1.0.6.0)

Mihail Bratu, December 23, 2010 - 2:34 am UTC


connect by prior with nulls??

A reader, January 21, 2011 - 5:19 am UTC

Hi Tom,

In our 2 main DBs we have the following code

select sysid,party_link_sysid
from parties
CONNECT BY PRIOR SYSID = PARTY_LINK_SYSID
start with sysid=:a1;

both with the same indexes. However the data is different

DB1
-------

select count(*) from parties;

COUNT(*)
----------
7377820

select count(party_link_sysid) from parties;

COUNT(PARTY_LINK_SYSID)
-----------------------
1533750



DB2
---------

select count(*) from parties;

COUNT(*)
----------
1259267

select count(party_link_sysid) from parties;

COUNT(PARTY_LINK_SYSID)
-----------------------
34


results from autot for DB1

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 72 | 1 (0)| 00:00:01 |
|* 1 | CONNECT BY WITH FILTERING | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | PARTIES | 1 | 24 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PARTY_PK | 1 | | 1 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | | | | |
| 5 | CONNECT BY PUMP | | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| PARTIES | 8 | 72 | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | PARTY_PARTY_FK_I | 2 | | 1 (0)| 00:00:01 |


results from autot for DB2


| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 63036 | 492K| 1 |
| 1 | CONNECT BY WITH FILTERING | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | PARTIES | 1 | 22 | 1 |
| 3 | INDEX UNIQUE SCAN | PARTY_PK | 1 | | 1 |
| 4 | NESTED LOOPS | | | | |
| 5 | CONNECT BY PUMP | | | | |
| 6 | TABLE ACCESS BY INDEX ROWID| PARTIES | 63036 | 492K| 1 |
| 7 | INDEX RANGE SCAN | PARTY_PARTY_FK_I | 1 | | 1 |
----------------------------------------------------------------------------------


So it seems like the DB with 33 values and the rest of the table is null for the party_link_sysid column, gets half the table scanned.

Is there a way to get it to scan a smaller amount - like in DB1?

Thanks

Tom Kyte
January 24, 2011 - 7:13 am UTC

those are explain plans, they are not showing you what actually happened when you ran the query, they are showing a guess of what might happen.

there are no scans here, only indexed reads.

It looks like your statistics are not up to date somewhere and when they get up to date - you'll see different estimates.


Use sql_trace+tkprof to see what ACTUALLY happens row wise.

connect by prior

A reader, January 24, 2011 - 8:07 am UTC

Thanks tom - should we ignore the Rows column in autot ?

What is it for?

I have analayzed all indexes and tables involved - but the explain plans come out the same:

DB1
------------------

SQL> select table_name,num_rows,last_analyzed from dba_tables
  2    where table_name ='PARTIES';

TABLE_NAME                       NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
PARTIES                           7378389 24-JAN-11

SQL> select index_name, num_rows,last_analyzed from dba_indexes
  2  where table_name ='PARTIES';

INDEX_NAME                       NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
PARTY_PK                          7378389 24-JAN-11
PARTY_OPT1                        7378389 24-JAN-11
PARTY_PARTY_FK_I                  1533762 24-JAN-11
PARTY_PARTY_FK_I2                  245157 24-JAN-11
PARTY_OPT2                        7372722 24-JAN-11

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |     8 |    72 |     1   (0)
|*  1 |  CONNECT BY WITH FILTERING    |                  |       |       |
|   2 |   TABLE ACCESS BY INDEX ROWID | PARTIES          |     1 |    24 |     1   (0)
|*  3 |    INDEX UNIQUE SCAN          | PARTY_PK         |     1 |       |     1   (0)
|   4 |   NESTED LOOPS                |                  |       |       |
|   5 |    CONNECT BY PUMP            |                  |       |       |
|   6 |    TABLE ACCESS BY INDEX ROWID| PARTIES          |     8 |    72 |     1   (0)
|*  7 |     INDEX RANGE SCAN          | PARTY_PARTY_FK_I |     2 |       |     1   (0)
--------------------------------------------------------------------------------------

DB2
------------------

SQL> select table_name,num_rows,last_analyzed from dba_tables
  2  where table_name ='PARTIES';

TABLE_NAME                       NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
PARTIES                           1208948 21-JAN-11


SQL> select index_name, num_rows,last_analyzed from dba_indexes
  2  where table_name ='PARTIES';

INDEX_NAME                       NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------
PARTY_PK                          1208948 21-JAN-11
PARTY_OPT1                        1208948 21-JAN-11
PARTY_PARTY_FK_I                       22 21-JAN-11
PARTY_PARTY_FK_I2                       6 21-JAN-11
PARTY_OPT2                        1203585 21-JAN-11

| Id  | Operation                     | Name             | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  | 60447 |   472K|     1   (0)|
|*  1 |  CONNECT BY WITH FILTERING    |                  |       |       |            |
|   2 |   TABLE ACCESS BY INDEX ROWID | PARTIES          |     1 |    22 |     1   (0)|
|*  3 |    INDEX UNIQUE SCAN          | PARTY_PK         |     1 |       |     1   (0)|
|   4 |   NESTED LOOPS                |                  |       |       |            |
|   5 |    CONNECT BY PUMP            |                  |       |       |            |
|   6 |    TABLE ACCESS BY INDEX ROWID| PARTIES          | 60447 |   472K|     1   (0)|
|*  7 |     INDEX RANGE SCAN          | PARTY_PARTY_FK_I |     1 |       |     1   (0)|
---------------------------------------------------------------------------------------

Does this difference not matter?  Both queries timewise are the same - but as soon as this query is joined to other tables - the DB2 performs 30 times slower.

Tom Kyte
February 01, 2011 - 10:20 am UTC

... should we ignore the Rows column in auto ...

NO, definitely not - that is the most important thing in an explain plan!!!

You want to compare it to ACTUAL row counts from a sql_trace when you suspect an inefficient plan is being generated. If the ACTUALS differ widely from the GUESS (in the plan) - then that is the likely cause of the wrong plan being generated and we need to figure out "why" that is happening in order to correct it.


Please - as stated above - sql_plus+tkprof if you want to see what they are ACTUALLY DOING.


As you can see - the data sets are completely and utterly different. One of them seems to return A LOT more data. That might be a clue as to why they perform differently when used as a source to join to some other table.

LEVEL, CONNECT BY

Ananth, February 03, 2011 - 3:06 pm UTC

Hi Tom,

we have a table "folders" in content management system.


Columns:
folderid number
levl number
name varchar2
pathids varchar2

primary key is on folderid.

Data:

folderid levl name pathids
0 0 Root :0:
1 1 Test1 :0:1:
2 1 Test2 :0:2:
3 2 Test3 :0:1:3:
4 3 Test4 :0:1:3:4:


i want to have result

folderid levl name path-name
-------------------------------
0 0 Root Root
1 1 Test1 Root,Test1
2 1 Test2 Root,Test2
3 2 Test3 Root,Test1,Test3
..
..

Also, when i tried to run a sample query i get different results

SELECT FOLDERID,LEVL,LEVEL,
ROW_NUMBER() OVER (PARTITION BY SNO ORDER BY SNO) RWN,
PATHIDS
FROM
(
SELECT FOLDERID, LEVL, NAME, PATHIDS
FROM PATH
WHERE FOLDERID IN (1)
)
CONNECT BY LEVEL <= LEVL + 1

i get the output as
FOLDERID LEVL LEVEL RWN
---------- ---------- ---------- ----------
1 1 2 1
1 1 1 2

why did the level column showed in decreasing order.

if i replace it with FOLDERID in (9) instead of (1)
i get the results as

FOLDERID LEVL LEVEL RWN
---------- ---------- ---------- ----------
9 4 1 1
9 4 2 2
9 4 5 3
9 4 4 4
9 4 3 5


am not able to figure it out.

Can you please help me out in the above 2 scenario's
Tom Kyte
February 03, 2011 - 4:10 pm UTC

no create table
no inserts
I did not even look at the question.

Level, Connect By

Ananth, February 03, 2011 - 5:45 pm UTC

Hi Tom,

Please find below the create table and insert scripts for the same.

CREATE TABLE FOLDER
(
FOLDERID NUMBER NOT NULL PRIMARY KEY,
LEVL NUMBER NOT NULL,
NAME VARCHAR2(200) NOT NULL,
PATHIDS VARCHAR2(200) NOT NULL
);

INSERT INTO FOLDER VALUES (0, 0, 'ROOT', ':0:');
INSERT INTO FOLDER VALUES (1, 1, 'TEST1', ':0:1:');
INSERT INTO FOLDER VALUES (2, 2, 'TEST2', ':0:1:2:');
INSERT INTO FOLDER VALUES (3, 2, 'TEST3', ':0:1:3:');
INSERT INTO FOLDER VALUES (4, 3, 'TEST4', ':0:1:3:4:');
INSERT INTO FOLDER VALUES (5, 2, 'TEST5', ':0:1:5:');
INSERT INTO FOLDER VALUES (6, 2, 'TEST6', ':0:1:6:');
INSERT INTO FOLDER VALUES (7, 3, 'TEST7', ':0:1:6:7:');
INSERT INTO FOLDER VALUES (8, 4, 'TEST8', ':0:1:6:7:8:');
INSERT INTO FOLDER VALUES (9, 4, 'TEST9', ':0:1:6:7:9:');

I need a query which shows output as

FOLDERID LEVL NAME PATH
---------- ---------- ---------- ----------
0 0 ROOT ROOT
1 1 TEST1 ROOT:TEST1
2 2 TEST2 ROOT:TEST2
..
..
..

also, for the below query, on why do the results (column LEVEL) are altered when i add analytic function. With out analytic function the level column shows as expected. (increasing order of LEVEL).

SELECT
FOLDERID,
LEVL,
LEVEL,
ROW_NUMBER() OVER (PARTITION BY FOLDERID ORDER BY FOLDERID) RWNUM,
PATHIDS
FROM
(
SELECT FOLDERID, LEVL, NAME, PATHIDS
FROM FOLDER
WHERE FOLDERID = 9
)
CONNECT BY LEVEL <= LEVL + 1


FOLDERID LEVL LEVEL RWNUM PATHIDS
---------- ---------- ---------- -------- --------
9 4 1 1 :0:1:6:7:9:
9 4 2 2 :0:1:6:7:9:
9 4 5 3 :0:1:6:7:9:
9 4 4 4 :0:1:6:7:9:
9 4 3 5 :0:1:6:7:9:


Thanks in Advance
Ananth


Tom Kyte
February 04, 2011 - 8:33 am UTC

now you sort of need to explain your output. Just saying:

I need a query which shows output as

  FOLDERID       LEVL      NAME        PATH
---------- ---------- ----------  ----------
         0          0       ROOT       ROOT
         1          1      TEST1       ROOT:TEST1
         2          2      TEST2       ROOT:TEST2


doesn't tell us how your data is to be connected. The connect by you posted in the second bit - with connect by level <= levl+1 - is a bit odd - I cannot figure out what you are trying to actually do.

So, before we go further - please tell us - in psuedo code, specification format, whatever - what you are trying to do. It is not clear to me how your data is connected together and the connect by you show is really confusing.


As for level being out of order - you have an order by in the analytic - you would have to resort the data if you wanted it ordered by something in particular.

Level, Connect By

Ananth, February 04, 2011 - 2:32 pm UTC

Hi Tom,

The descriptions are as follows.

FOLDERID: id of the folder

LEVL:It is equivalent to pseduocolumn LEVEL. the catch here is for a folder of LEVEL 2, the LEVL will be of value 1 and soon.

PATHIDS: The Hierarchy of all FOLDERID's starting with root folderid. it is separated by colon. (for eg: :0:1:2:4: ... etc).

instead i want the output to be in Hierarchy of all NAMES concatenated, starting with Root till the present folder.



For the second part of my query, i was trying to break the column PATHIDS into individual folderid's, then apply self join to get NAME's and again convert them to into single column. so to determine the number of folderid's in the column PATHID's i used LEVEL <= LEVL + 1.
Tom Kyte
February 06, 2011 - 11:47 am UTC

that is an impossibly bad way to store a hierarchy - putting it back to together is impossible (well, not impossible but so inefficient as to make it be realistically "not something to do")

We have to substr out the parent id (levl does *nothing* for us to put the data back together - nothing) and the connect with it. It isn't going to be "fast" on non-trivial data sets:

ops$tkyte%ORA11GR2> select id, rpad('*',2*level,'*')||name nm, pathids, pid,
  2         sys_connect_by_path(name,',') scbp
  3    from (
  4  select folderid id, name, pathids,
  5         substr( pathids, instr( pathids, ':', -1, 3 )+1, instr(pathids,':',-1,2)-instr(pathids,':',-1,3)-1 ) pid
  6    from folder
  7         )
  8   start with id = 0
  9  connect by prior id = pid
 10  /

        ID NM                   PATHIDS              PID                  SCBP
---------- -------------------- -------------------- -------------------- ------------------------------
         0 **ROOT               :0:                                       ,ROOT
         1 ****TEST1            :0:1:                0                    ,ROOT,TEST1
         2 ******TEST2          :0:1:2:              1                    ,ROOT,TEST1,TEST2
         3 ******TEST3          :0:1:3:              1                    ,ROOT,TEST1,TEST3
         4 ********TEST4        :0:1:3:4:            3                    ,ROOT,TEST1,TEST3,TEST4
         5 ******TEST5          :0:1:5:              1                    ,ROOT,TEST1,TEST5
         6 ******TEST6          :0:1:6:              1                    ,ROOT,TEST1,TEST6
         7 ********TEST7        :0:1:6:7:            6                    ,ROOT,TEST1,TEST6,TEST7
         8 **********TEST8      :0:1:6:7:8:          7                    ,ROOT,TEST1,TEST6,TEST7,TEST8
         9 **********TEST9      :0:1:6:7:9:          7                    ,ROOT,TEST1,TEST6,TEST7,TEST9

10 rows selected.



you should forget about storing levl and pathids, just store PID - the parent id, the others are all easily derived from the connect by hierarchy.

Connect BY Prior

Ananth, April 05, 2011 - 2:23 pm UTC

Hi Tom,

I have a query regarding the Hierarchial queries.

Is there any way to differentiate between each hierarchy. i.e i want to logically differentiate each hierarchy ie. (leaf to root), just as how we logically differentiate regions by using group by region.

can you give me some ideas or suggestions on how to achieve the same.

Regards
Ananth
Tom Kyte
April 12, 2011 - 12:38 pm UTC

have you looked at sys_connect_by_path - maybe that'll help you.

I'm not really sure what you mean otherwise, I don't know how to compare group by with a connect by? Do you have an example?

Ananth, April 12, 2011 - 5:07 pm UTC

Hi Tom,

I could think of CONNECT_BY_ROOT as one option which can differentiate a hierarchy within.

for eg:
in traditional employee table if you go from Bottom - Top approach, you have in a hierarchy Employee-his manager-manager's manager and so on till employee with manager_id null.

So if i say that Leaf to Root as one hierarchy. is it possible to do some aggregate kind of things on each hierarchy.

from Traditional employee table design, i see employee_id being unique, i can do the aggregate function on hierarchies using connect_by_root employee_id.

In some scenarions where we couldnt differentiate a hierarchy.. how can we do on those..?

Regards
Ananth
Tom Kyte
April 13, 2011 - 9:58 am UTC

give me an example please - use scott.emp. I don't know what you want to "aggregate" - you know how to get the root - so aggregating is easy, you have something to group by. But I suspect you mean something OTHER than aggregation don't you.

give an EXAMPLE of what you are looking for - be specific, explain everything.

SYS_CONNECT_BY_PATH

Ananth, April 14, 2011 - 12:24 am UTC

Hi Tom,

Lets say i have the path from Root is stored in a table.
For Ex: in Traditional EMPLOYEES table, i have only columns as
EMPLOYEE_ID, EMPLOYEE_PATH  (assuming i dont have hierarchy information).

<code>
EMPLOYEE_ID EPATH
----------- --------------
        101 :101
        108 :101:108
        109 :101:108:109
        111 :101:108:111
        112 :101:108:112
        113 :101:108:113
        110 :101:108:110
        204 :101:204
        205 :101:205
        206 :101:205:206
        203 :101:203
        200 :101:200
        102 :102
        103 :102:103
        104 :102:103:104
        107 :102:103:107
        106 :102:103:106
        105 :102:103:105
        114 :114
        115 :114:115

Question1:
-----------
how do i get the below output
EMPLOYEE_ID  PATH  SERIAL
101          101   1
108          101   1
108          108   2
109          101   1
109          108   2
109          109   3
..
..
..
</code>
Tom Kyte
April 14, 2011 - 9:45 am UTC

this can be done, but what you have to do first is provide

create table
inserts

then I can supply select

actually, you can supply the select, here is the technique:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2189860818012#2146231300346158104

Scripts

Ananth, April 14, 2011 - 10:07 am UTC

Hi Tom,

PFB the scripts

create table emp
(
id number,
path varchar2(100)
);

insert into emp values (101,':101:');
insert into emp values (108,':101:108:');
insert into emp values (109,':101:108:109');
insert into emp values (111,':101:108:111:');
insert into emp values (112,':101:108:112:');
insert into emp values (113,':101:108:113:');
insert into emp values (110,':101:108:110:');
insert into emp values (204,':101:204:');
insert into emp values (205,':101:205:');
insert into emp values (206,':101:203:');
insert into emp values (203,':101:200:');
insert into emp values (200,':101:200:');
insert into emp values (102,':102:');
insert into emp values (103,':102:103:');
insert into emp values (104,':102:103:104:');
insert into emp values (107,':102:103:107:');
insert into emp values (106,':102:103:106:');
insert into emp values (105,':102:103:105:');
insert into emp values (114,':114:');
insert into emp values (115,':114:115:');


Tom Kyte
April 14, 2011 - 10:25 am UTC

tell you what Ananth, give it what we call the "old college try" using the link I provided first. You should be able to - they are virtually identical in nature.

And if you cannot - come back and show your work and we'll take it from there.

Ananth, April 14, 2011 - 10:27 am UTC

Sure Tom, Am actually looking into it.
wil surely let you know my approach if i couldnt be able to make it.

ask why conect by didn't work at oracle 11g (11.2.0.1.0)

Liliek, July 22, 2011 - 5:51 am UTC

Hi Tom,
i want to ask you :
i have been migrate from oracle 10.2.0.1.0 ver to 11.2.0.1.0 and this start with.. connect by.. clause not work. could you help me to solve it, because i have a report run by this query for one and half hours!!

thx tom , i really appreciate for your help


Rgds


Liliek
Tom Kyte
July 22, 2011 - 2:39 pm UTC

umm, connect by and start with most certainly DO WORK with 11g

you'd need to be a bit more precise in describing your problem

PLESE SOLVE MY DOUBT

HANMATH PRADEEP, March 30, 2012 - 4:33 am UTC

Hello every1 ……………….. I have a small doubt regarding retrieving records from the base table….. So please clarify my doubt ……………………………..

The respective base table name is “PRADEEP”.

SQL>SELECT *FROM PRADEEP;

SNO   NAME      ADDR
------------------------------------
1 HANMATH  PRADEEP     HYD
2 HARI  PRASAD     CHENNAI
3 HARI  SHANKER       BANGLORE
4 HARINATH  NAIDU     PAKISTAN
5  HARI  PRASAD     GUNTUR
6 SURESH  PAIDY       VIZAG

EXPECTING OUTPUT SHOULD BE IN THE FOLLOWING MANNER:

SNO  NAME       ADDR

1 HANMATH  PRADEEP     HYD
2 HARI  PRASAD     CHENNAI
5 HARI  PRASAD     GUNTUR

MY REQUIREMENT IS AS FOLLOWS:

(1)……… HERE I WOULD LIKE TO RETRIEVE THE RECORDS BASED ON THE NAME COLUMN ONLY.

(2) I KNOW THIS QUERY I.E.,  (SELECT *FROM PRADEEP WHERE NAME LIKE ‘H_______P%’ OR NAME LIKE ‘H____P%’;) ……………… I DON’T LIKE TO USE SUCH LIKE STATEMENTS HERE? SO PLZ AVOID IT…

(3)……….. THE IMPORTANT CONDITION IS THAT …. THE RECORDS IN WHICH THE STARTING LETTER IS “”H”” (IN MIDDLE NAME) & THE STARTING  LETTER IS “”P”” (IN LAST NAME) SHOULD ONLY RETRIEVED FROM THE BASE TABLE.

Tom Kyte
March 30, 2012 - 7:17 am UTC

every1? plz?

what is this? elementary school?


sigh, i did read it - nothing to be read here. "I don't like to use such like statements". to which I say "sorry, but what would you like to use then"

or where you looking for "name like 'H% P%'" - starts with an H and contains a P preceded by a space. but that presumes, assumes that everyone just has a single word first name and single word last name and nothing else (a rather naive assumption) but you give us no details.

(did you know your capslock is apparently stuck?)

sym, March 31, 2012 - 11:22 am UTC

Is there any way to fold a hierarchy into a collection?

Suppose I have

create type x as object(val varchar2(100)) not final;

create type x_list as table of x;

create type y under x(children x_list);


create table z (
val varchar2(100),
id int,
parent_id int);


insert into z values('a',1, null);
insert into z values('b',2, 1);
insert into z values('c',3, 2);


Is there a way to extract this into x with the following structure
x('a',x_list(x('b',x_list(x('c',NULL)))));


Thanks

sym, March 31, 2012 - 11:25 am UTC

The previous question, it is a y object and not x that needs to be constructed. Sorry for the mistake.

very nice

A reader, July 29, 2012 - 11:21 am UTC


Find immediate children

A reader, August 26, 2012 - 12:30 pm UTC

Hi Tom,

We have a table which has 60 million records of web pages, and each page can be child of other page or belong to a property page.

I want a list of pages which are immediate children of Property page.

This is what i wrote but the performance of this query is taking more than 10 mins as the WHERE clause to filter the records with LEVEL = 2 is happening after the connect by performs (walking down the tree for all page ids)

How to stop walking down the tree once all Level = 2 pages have been accomplished.

Select Level Lvl, spaceid, parent_spaceid, page_desc, page_key
from page
WHERE
Level = 2
Start with spaceid =
(
Select spaceid from page where spaceid = (Select Property_spaceid from property where property_id = 125 )
)
connect by prior spaceid = parent_spaceid

W/o the WHERE CLAUSE it returns .25 million rows, however when filter is applied with LVL = 2 it has only 132 rows.

I need to stop when Ist Level of information is obtained.

Need your help in optimizing this query.

Thanks



Tom Kyte
August 29, 2012 - 1:22 pm UTC

why even connect by?


select * from page where parent_spaceid = (
Select spaceid from page where
spaceid = (Select Property_spaceid from property where property_id = 125 )
)


if you just want the children, just ask for them.


but you could have put the "and level <= 2" in the connect by clause to stop at level two.


but don't use connect by on this at all.

Immediate Chidren

A reader, August 29, 2012 - 10:57 pm UTC

Thanks Tom.

How to prevent a cycle

Richard, September 28, 2012 - 12:03 am UTC

Hi Tom,

I'd like to know if it's possible to prevent the cycle in the example code below?
So no matter the direction taken, a "pth" should only be used once in the hierarchy. For the example pth = 8 is the first to get used twice, 9 to 8 (lvl 4) then again 8 to 9 (lvl 5). t is a shorten example version of a much larger table and t2 is my attempt to allow the connect_by to consider both start and end locations.


WITH t AS (SELECT 1 AS start_loc, 1 AS pth, 2 AS end_loc FROM DUAL
UNION
SELECT 2 AS start_loc, 2 AS pth, 3 AS end_loc FROM DUAL
UNION
SELECT 1 AS start_loc, 3 AS pth, 4 AS end_loc FROM DUAL
UNION
SELECT 3 AS start_loc, 4 AS pth, 5 AS end_loc FROM DUAL
UNION
SELECT 3 AS start_loc, 5 AS pth, 6 AS end_loc FROM DUAL
UNION
SELECT 5 AS start_loc, 6 AS pth, 7 AS end_loc FROM DUAL
UNION
SELECT 6 AS start_loc, 7 AS pth, 8 AS end_loc FROM DUAL
UNION
SELECT 8 AS start_loc, 8 AS pth, 9 AS end_loc FROM DUAL
UNION
SELECT 9 AS start_loc, 9 AS pth, 3 AS end_loc FROM DUAL
UNION
SELECT 8 AS start_loc, 10 AS pth, 10 AS end_loc FROM DUAL),
t2 AS (SELECT start_loc, pth, end_loc FROM t
UNION
SELECT end_loc AS start_loc, pth, start_loc AS end_loc FROM t)
SELECT LEVEL AS lvl,
CONNECT_BY_ISCYCLE AS iscycle,
CONNECT_BY_ROOT (start_loc) AS abs_start_loc,
LTRIM (SYS_CONNECT_BY_PATH (pth, ' - '), ' - ') AS full_pth,
end_loc
FROM t2
START WITH start_loc = 7
CONNECT BY NOCYCLE start_loc = PRIOR end_loc
AND PRIOR start_loc != end_loc
AND start_loc != CONNECT_BY_ROOT (start_loc)
AND end_loc != CONNECT_BY_ROOT (end_loc)
ORDER BY LEVEL ASC
Tom Kyte
September 28, 2012 - 7:09 am UTC

I'm not sure what you are trying to do.

posting a bit of code that doesn't work and not posting a really clear detailed specification of what you want to do isn't going to work...


I've never seen this "start_loc", "path", "end_loc" construct in a hierarchy before - explain it.

explain the significance of your T2 object.

tell us what data you are trying to get out of it. maybe without any code - just really specific details.

More details

Richard, September 29, 2012 - 1:24 am UTC

Sorry for not being clear.

start_loc = a starting location (start and end locations can be the same, i.e. 3 is a common location)
end_loc = an end location,
pth = a link the joins the start and end location together (can only be 1 of these per start and end location)

Below is a diagram of the T object

9-(8)-8-(10)-10
| |
(9) (7)
| |
1-(1)-2-(2)-3-(5)-6
| |
(3) (4)
| |
4 5-(6)-7

If 3 is the start then:
Start location, path, end location
3, 2, 2
3, 5, 6
3, 9, 9
3, 2-1, 1
3, 5-7, 8
3, 9-8, 8
3, 2-1-3, 4
3, 5-7-10, 10
3, 5-7-8, 9
3, 9-8-10, 10
3, 9-8-7, 6

T2 flips the start and end locations on top of each other to allow the hierarchical query to trace all possible paths, not sure if this is the right approach.
I've tried alternate code that places all the locations into one column but haven’t been successful with getting more than one hop away from the start.

What I'd like to be able to do is specify any start location then retrieve the full path from that start location spanning outwards to all the end locations via the paths without looping back (which 7 does at the moment).

Current train of thought (still work in progress) is to run the hierarchical query but with analytics to find the duplicate use of a path then the exact same hierarchical query again but with the first duplicate as a filter in the connect_by clause.

Tom Kyte
September 29, 2012 - 8:14 am UTC

start_loc = a starting location
end_loc an end location


no kidding. but really - what does that *mean*. I don't understand your data at all.

I have no clue, none, how you got your picture from the above data.

 23  select * from t2
 24  where start_loc = 3;

 START_LOC        PTH    END_LOC
---------- ---------- ----------
         3          2          2
         3          4          5
         3          5          6
         3          9          9


I don't know how you are using PTH
I don't know where 3,4,5 disappeared to

Richard from Australia,

Sokrates, September 29, 2012 - 5:26 pm UTC

a.
interestingly, your query results in a
ORA-30007: CONNECT BY ROOT operator is not supported in the START WITH or in the CONNECT BY condition
on an 11.2 for me, but it gives a result on a 10 XE.
So I stayed on the XE, which version are you on ?

b.
If 3 is the start then:
Start location, path, end location
3, 2, 2
3, 5, 6
3, 9, 9
3, 2-1, 1
3, 5-7, 8
3, 9-8, 8
3, 2-1-3, 4
3, 5-7-10, 10
3, 5-7-8, 9
3, 9-8-10, 10
3, 9-8-7, 6


You miss
3, 4, 5
and
3, 4-6, 7
here, don't you ?

c.
specify any start location then retrieve the full path from that
start location spanning outwards to all the end locations via the paths without looping back (which
7 does at the moment).

What do you mean by that ?
sokrates@10XE > WITH t AS (SELECT   1 AS start_loc, 1 AS pth, 2 AS end_loc FROM DUAL
  2             UNION
  3             SELECT   2 AS start_loc, 2 AS pth, 3 AS end_loc FROM DUAL
  4             UNION
  5             SELECT   1 AS start_loc, 3 AS pth, 4 AS end_loc FROM DUAL
  6             UNION
  7             SELECT   3 AS start_loc, 4 AS pth, 5 AS end_loc FROM DUAL
  8             UNION
  9             SELECT   3 AS start_loc, 5 AS pth, 6 AS end_loc FROM DUAL
 10             UNION
 11             SELECT   5 AS start_loc, 6 AS pth, 7 AS end_loc FROM DUAL
 12             UNION
 13             SELECT   6 AS start_loc, 7 AS pth, 8 AS end_loc FROM DUAL
 14             UNION
 15             SELECT   8 AS start_loc, 8 AS pth, 9 AS end_loc FROM DUAL
 16             UNION
 17             SELECT   9 AS start_loc, 9 AS pth, 3 AS end_loc FROM DUAL
 18             UNION
 19             SELECT   8 AS start_loc, 10 AS pth, 10 AS end_loc FROM DUAL),
 20      t2 AS (SELECT   start_loc, pth, end_loc FROM t
 21             UNION
 22             SELECT   end_loc AS start_loc, pth, start_loc AS end_loc FROM t)
 23      SELECT   CONNECT_BY_ROOT (start_loc) AS abs_start_loc,
 24               LTRIM (SYS_CONNECT_BY_PATH (pth, ' - '), ' - ') AS full_pth,
 25               end_loc
 26        FROM   t2
 27  START WITH   start_loc = 3
 28  CONNECT BY   NOCYCLE     start_loc = PRIOR end_loc
 29                       AND PRIOR start_loc != end_loc
 30                       AND start_loc != CONNECT_BY_ROOT (start_loc)
 31                       AND end_loc != CONNECT_BY_ROOT (end_loc)
 32    ORDER BY   2 ASC;

ABS_START_LOC FULL_PTH         END_LOC
------------- ------------- ----------
            3 2                      2
            3 2 - 1                  1
            3 2 - 1 - 3              4
            3 4                      5
            3 4 - 6                  7
            3 5                      6
            3 5 - 7                  8
            3 5 - 7 - 10            10
            3 5 - 7 - 8              9
            3 5 - 7 - 8 - 9          3
            3 9                      9
            3 9 - 8                  8
            3 9 - 8 - 10            10
            3 9 - 8 - 7              6
            3 9 - 8 - 7 - 5          3

15 rows selected.


Here,
3 5 - 7 - 8 - 9 3
and
3 9 - 8 - 7 - 5 3
are looping back, so why not just adding
AND end_loc != CONNECT_BY_ROOT (start_loc)
?

sokrates@10XE > WITH t AS (SELECT   1 AS start_loc, 1 AS pth, 2 AS end_loc FROM DUAL
  2             UNION
  3             SELECT   2 AS start_loc, 2 AS pth, 3 AS end_loc FROM DUAL
  4             UNION
  5             SELECT   1 AS start_loc, 3 AS pth, 4 AS end_loc FROM DUAL
  6             UNION
  7             SELECT   3 AS start_loc, 4 AS pth, 5 AS end_loc FROM DUAL
  8             UNION
  9             SELECT   3 AS start_loc, 5 AS pth, 6 AS end_loc FROM DUAL
 10             UNION
 11             SELECT   5 AS start_loc, 6 AS pth, 7 AS end_loc FROM DUAL
 12             UNION
 13             SELECT   6 AS start_loc, 7 AS pth, 8 AS end_loc FROM DUAL
 14             UNION
 15             SELECT   8 AS start_loc, 8 AS pth, 9 AS end_loc FROM DUAL
 16             UNION
 17             SELECT   9 AS start_loc, 9 AS pth, 3 AS end_loc FROM DUAL
 18             UNION
 19             SELECT   8 AS start_loc, 10 AS pth, 10 AS end_loc FROM DUAL),
 20      t2 AS (SELECT   start_loc, pth, end_loc FROM t
 21             UNION
 22             SELECT   end_loc AS start_loc, pth, start_loc AS end_loc FROM t)
 23      SELECT   CONNECT_BY_ROOT (start_loc) AS abs_start_loc,
 24               LTRIM (SYS_CONNECT_BY_PATH (pth, ' - '), ' - ') AS full_pth,
 25               end_loc
 26        FROM   t2
 27  START WITH   start_loc = 3
 28  CONNECT BY   NOCYCLE     start_loc = PRIOR end_loc
 29                       AND PRIOR start_loc != end_loc
 30                       AND start_loc != CONNECT_BY_ROOT (start_loc)
 31                       AND end_loc != CONNECT_BY_ROOT (end_loc)
 32                       AND end_loc != CONNECT_BY_ROOT (start_loc)
 33    ORDER BY   2 ASC
 34  /

ABS_START_LOC FULL_PTH         END_LOC
------------- ------------- ----------
            3 2                      2
            3 2 - 1                  1
            3 2 - 1 - 3              4
            3 4                      5
            3 4 - 6                  7
            3 5                      6
            3 5 - 7                  8
            3 5 - 7 - 10            10
            3 5 - 7 - 8              9
            3 9                      9
            3 9 - 8                  8
            3 9 - 8 - 10            10
            3 9 - 8 - 7              6

13 rows selected.

?

More details

Richard, September 30, 2012 - 7:19 am UTC

My apologies for not being specific again. In the real data, the locations are locations on a map (towns, cities etc) and pth are the cables connecting them together.
I substituted smaller numbers as they're represented by much longer even more meaningless keys in the real much bigger table.

Not a great analogy but it'd be similar to the US national roads system but where route numbers are unique on the ingress/egress to/from a town in the case of pth.
So for start_loc and end_loc = 3 let's say its Denver and pth 9 = route 25, pth 5 = route 76.
http://www.mapsofworld.com/usa/usa-road-map.html
Denver, route 25, Cheyenne
Denver, route 25 - 80, Salt Lake City
Denver, routes 25 – 90 – 94, Bismarck
Not wanted:
Denver, route 70 - 15 - 80, Cheyenne

Hi Sokrates,

a) 10.2. That's a concern it didn't work in 11.2 as we're migrating to 11.2 in the next 3 months.
b) Yes, I did miss it, I ended up typing the result out as was replying at home where I don’t have Oracle :(
c) Thanks, will give that a try and come back Tues when back in office. My main concern was with "start with start_loc = 7" (line 27 in your code) as it returns 1 for "connect_by_iscycle"

Richard from Australia

Sokrates, October 01, 2012 - 2:44 am UTC

a) 10.2. That's a concern it didn't work in 11.2 as we're migrating to 11.2 in the next 3 months.

good news !
that's actually a bug in 10 that the usage of a CONNECT_BY_ROOT in a START WITH or CONNECT BY condition does not fail as documented, see note 1477439.1

in 11.2, you can take advantage of the recursive subquery factoring clause.
Looks to me that
WITH t AS (SELECT   1 AS start_loc, 1 AS pth, 2 AS end_loc FROM DUAL
           UNION
           SELECT   2 AS start_loc, 2 AS pth, 3 AS end_loc FROM DUAL
           UNION
           SELECT   1 AS start_loc, 3 AS pth, 4 AS end_loc FROM DUAL
           UNION
           SELECT   3 AS start_loc, 4 AS pth, 5 AS end_loc FROM DUAL
           UNION
           SELECT   3 AS start_loc, 5 AS pth, 6 AS end_loc FROM DUAL
           UNION
           SELECT   5 AS start_loc, 6 AS pth, 7 AS end_loc FROM DUAL
           UNION
           SELECT   6 AS start_loc, 7 AS pth, 8 AS end_loc FROM DUAL
           UNION
           SELECT   8 AS start_loc, 8 AS pth, 9 AS end_loc FROM DUAL
           UNION
           SELECT   9 AS start_loc, 9 AS pth, 3 AS end_loc FROM DUAL
           UNION
           SELECT   8 AS start_loc, 10 AS pth, 10 AS end_loc FROM DUAL),
    t2 AS (SELECT   start_loc, pth, end_loc FROM t
           UNION
           SELECT   end_loc AS start_loc, pth, start_loc AS end_loc FROM t),
    t3(lvl, abs_start_loc, full_pth, lst, end_loc) as
    (
       select 1, start_loc, to_char(pth), pth, end_loc 
       from t2
       union all
       select t3.lvl + 1, t3.abs_start_loc, t3.full_pth || ' - ' || to_char(t2.pth), t2.pth, t2.end_loc
       from t3, t2
       where t2.start_loc = t3.end_loc and t2.start_loc != t3.abs_start_loc and t2.pth != t3.lst
    )
    search depth first by abs_start_loc set a
    cycle end_loc set is_cycle to '1' default '0'
    select lvl, abs_start_loc, full_pth, end_loc
    from t3
    where is_cycle = '0'


does what you want

Thanks Sokrates

Richard, October 16, 2012 - 1:33 am UTC

Thanks for those details. The 11G solution on the test data works nice, now the wait for our 10G to 11G upgrade.

Using hierarchical query to find all child nodes of a parent table

Sagar, February 13, 2013 - 11:57 am UTC

Hi Tom,

I am not good in writing hierarchical queries and I am stuck with it now!

I want to delete all rows from a particular table (consider parent table) for a particular key value, but this key value is being referred by multiple child tables, hence it will give a referential integrity constraint if I issue a DELETE straight away.

For this, I began to write the SQL to create the list of child tables for the parent table using hierarchical query, but it is not giving me the output.

Below are the DDL's:-

create table t1 (name varchar2(20), id number);

alter table t1 add constraint t1_pk primary key (id);

create table t2 (name varchar2(20) , id number , t2_id number);

alter table t2 add constraint t2_pk primary key (t2_id);
alter table t2 add constraint t2_fk foreign key (id) references t1 (id);

create table t3 (name varchar2(20) , id number , t3_id number);

alter table t3 add constraint t3_pk primary key (t3_id);
alter table t3 add constraint t3_fk foreign key (id) references t1 (id);

create table t4 (name varchar2(20) , id number , t4_id number);

alter table t4 add constraint t4_pk primary key (t4_id);
alter table t4 add constraint t4_fk foreign key (id) references t2 (t2_id);

create table t5 (name varchar2(20) , id number , t5_id number);

alter table t5 add constraint t5_pk primary key (t5_id);
alter table t5 add constraint t5_fk foreign key (id) references t3 (t3_id);


So I have 5 tables where T2 and T3 references T1 ; T4 references T2 ; T5 references T3.

I want to list the hierarchy starting from T1 table at the top and all the child nodes below.

Below is the hierarchical query I created to get the list of all child tables under the parent table (T1):-

select OWNER,
     CONSTRAINT_NAME,
     constraint_type,
    TABLE_NAME ,
     R_OWNER,
     R_CONSTRAINT_NAME,level lvl from 
( SELECT OWNER,
    NULL CONSTRAINT_NAME,
    null constraint_type,
    TABLE_NAME ,
    NULL R_OWNER,
   CONSTRAINT_NAME  R_CONSTRAINT_NAME
  FROM ALL_CONSTRAINTS
  WHERE CONSTRAINT_TYPE IN ('P')
  AND TABLE_NAME IN (SELECT TABLE_NAME FROM ALL_CONSTRAINTS WHERE R_CONSTRAINT_NAME='T1_PK') 
  UNION
   SELECT OWNER,
    NULL CONSTRAINT_NAME,
    null constraint_type,
    TABLE_NAME ,
    NULL R_OWNER,
   CONSTRAINT_NAME  R_CONSTRAINT_NAME
  FROM ALL_CONSTRAINTS
  WHERE CONSTRAINT_TYPE IN ('P')
  AND TABLE_NAME ='T1'
  UNION
   SELECT OWNER,
     CONSTRAINT_NAME,
     constraint_type,
    TABLE_NAME ,
     R_OWNER,
     R_CONSTRAINT_NAME
  FROM ALL_CONSTRAINTS
  WHERE CONSTRAINT_TYPE IN ('R')
  and table_name = 'T1')
  CONNECT BY PRIOR(CONSTRAINT_NAME)=R_CONSTRAINT_NAME
  start with R_CONSTRAINT_NAME is null


Once I am able to get the above query up and running, I will proceed with Deleting from bottom to up.

Thanks in advance.
Tom Kyte
February 13, 2013 - 1:26 pm UTC

ops$tkyte%ORA11GR2> select rpad( '*', (level-1)*2, '*' ) || table_name table_name
  2    from (
  3         select a.table_name,
  4                a.constraint_name pkey_constraint,
  5                    b.constraint_name fkey_constraint,
  6                    b.r_constraint_name
  7           from user_constraints a, user_constraints b
  8          where a.table_name = b.table_name
  9            and a.constraint_type = 'P'
 10            and b.constraint_type = 'R'
 11             union all
 12             select table_name, constraint_name, null, null
 13               from user_constraints
 14                  where constraint_type = 'P'
 15         )
 16    start with fkey_constraint is null
 17  connect by prior pkey_constraint = r_constraint_name
 18  /

TABLE_NAME
------------------------------
T1
**T2
****T4
**T3
****T5
T2
**T4
T3
**T5
T4
T5

11 rows selected.

Using hierarchical query to find all child nodes of a parent table

Sagar, February 13, 2013 - 1:51 pm UTC

Thanks for the correction to the query Tom.

However when I run it on my SQL Developer running Oracle 11g XE, I get the below error:-

Using hierarchical query to find all child nodes of a parent table


Any assistance would be highly appreciated.

Many Thanks
Tom Kyte
February 13, 2013 - 2:32 pm UTC

???

Using hierarchical query to find all child nodes of a parent table

Sagar, February 13, 2013 - 1:52 pm UTC

Sorry, the error I got was:-


ORA-01436: CONNECT BY loop in user data
01436. 00000 - "CONNECT BY loop in user data"
*Cause:
*Action:

Tom Kyte
February 13, 2013 - 2:35 pm UTC

do you have self referencing integrity or two tables who are both parents and children of each other??


if you read about connect by, you'll find there are functions you can use, in particular "CONNECT_BY_ISCYCLE " to detect these loops and stop looping.


but - if you have a loop - you by definition do not have any order in which to truncate these tables - you'll have to disable their constraints

Using hierarchical query to find all child nodes of a parent table

A reader, February 14, 2013 - 5:57 am UTC

Thanks for the reply Tom.

Slightly modified your query and now I am able to get the result as given below.

Modified query:

SELECT TABLE_NAME,
  pkey_constraint,
  fkey_constraint,
  r_constraint_name,
  Level,
  CONNECT_BY_ISCYCLE
FROM
  (SELECT a.table_name,
    a.constraint_name pkey_constraint,
    b.constraint_name fkey_constraint,
    b.r_constraint_name
  FROM user_constraints a,
    user_constraints b
  WHERE a.table_name    = b.table_name
  AND a.constraint_type = 'P'
  and B.CONSTRAINT_TYPE = 'R'
  UNION ALL
  SELECT table_name,
    constraint_name,
    NULL,
    NULL
  FROM USER_CONSTRAINTS
  where CONSTRAINT_TYPE = 'P'
  AND table_name        ='T1'
    )
  START WITH R_CONSTRAINT_NAME IS NULL
  CONNECT BY nocycle prior pkey_constraint = r_constraint_name


Result:

TABLE_NAME  PKEY_CONSTRAINT  FKEY_CONSTRAINT  R_CONSTRAINT_NAME  LEVEL   CONNECT_BY_ISCYCLE
----------- ---------------- ---------------- -----------------  ------- ------------------
T1          T1_PK                                                    1                  0
T2          T2_PK            T2_FK            T1_PK                  2                  0 
T4          T4_PK            T4_FK            T2_PK                  3                  0 
T3          T3_PK            T3_FK            T1_PK                  2                  0 
T5          T5_PK            T5_FK            T3_PK                  3                  0



Now I am working on to Delete all the data from T1 , but due to referential integrity constraints , I need to start my delete right from the bottomest node here (Level 3).

I have created a temporary table to store all the data from the above SELECT query. I plan to use this temp table to do a lookup and then perform the Delete from bottom to up.

Your views on the above are highly appreciated Tom. If you have something more efficient then please let me know.

Many Thanks.

Grant privileges based on flag in Heirarchy

Umesh Kasturi, March 04, 2013 - 12:48 am UTC

Hello Tom,
I want to grant privileges in a heirarchical way.

( priv means privilege and is controlled by ADMIN_FLAG column in the table)

If a Manager has ADMIN_FLAG=Y and the employee has ADMIN_FLAG =Y then display Y
If a Manager has ADMIN_FLAG=N for all employees under him display N

Manager employee Admin_flag Needed output
A Y Y
A B Y Y
C D N N

B G N N because in heirarchy some where at top we have a N
B H Y N because in heirarchy some where at top we have a N

G K Y N because in heirarchy some where at top we have a N
H L Y N -- because in heirarchy some where at top we have a N

drop table emp;

create table emp
( empno number
,ename varchar2(20)
,mgr number
,admin_flag varchar2(1)
);

alter table emp add primary key ( empno);

insert into emp values ( 7839,'KING' ,null,'Y');
insert into emp values ( 7566,'JONES',7839.'Y');
insert into emp values ( 7788,'SCOTT', 7566,'N');
insert into emp values ( 7876,'ADAMS',7788,'Y');
insert into emp values ( 7698,'BLAKE', 7839,'N');
insert into emp values ( 7499,'ALLEN',7698,'Y');
insert into emp values ( 7782,'CLARK',7839,'Y');
insert into emp values ( 7934,'MILLER', 7782,'Y');

COMMIT;

column admin_flag format a10

SQL> select mgr,empno,admin_flag,level from emp
2 connect by prior empno=mgr
3 start with empno=7839;


MGR EMPNO ADMIN_FLAG LEVEL needed output column
---------- ---------- ---------- ----------
7839 Y 1 Y
7839 7698 N 2 N
7698 7499 Y 3 N -- because the manager of 7499 is 7698 and he has a N
7839 7782 Y 2 Y
7782 7934 Y 3 Y

Thanks

Tom Kyte
March 04, 2013 - 4:34 pm UTC

use the code tags (hit the code button to see what they look like, they'll be added to your input text when you are editing)

I cannot read this output.

Grant privileges based on flag in Heirarchy

Umesh Kasturi, March 04, 2013 - 10:25 pm UTC

Followup to above Question
Apologies for the Formatting errors.I tried it again and finding hard to format.
So I will try and explain 
Grant_flag is the column to be derived 
1) Employee can grant privileges if he has admin_flag=Y and all his Managers above in the hierarchy has admin_flag=Y
so Grant_Flag=Y 

2) If any one in the hierarchy has admin_flag=N then all his subordinates cannot grant privileges so their grant_flag should be displayed as N . So Grant_Flag=N


MGR EMPNO  ADMIN_FLAG Grant Flag Reason for Grant Flag column output
1234 7839     Y            Y  Top most in the tree 
7839 7698     N            N emp and his manager has priv
7698 7499     Y            N Manager of 7499 has No priv . 
7839 7782     Y            Y  emp and his manager has priv
7782 7934     Y            Y  emp and his manager has priv


<code>
</code>

on: Grant privileges based on flag in Heirarchy

Stew Ashton, March 05, 2013 - 11:13 am UTC


Here's one way. First, to get privileges for everybody:
select empno, ename, mgr,
decode (
  replace(sys_connect_by_path(admin_flag,'/'), '/Y', ''),
  null,
  'Y',
  'N'
) admin_flag
from emp e
start with mgr is null
connect by mgr = prior empno;

     EMPNO ENAME                       MGR ADMIN_FLAG
---------- -------------------- ---------- ----------
      7839 KING                            Y          
      7566 JONES                      7839 Y          
      7788 SCOTT                      7566 N          
      7876 ADAMS                      7788 N          
      7698 BLAKE                      7839 N          
      7499 ALLEN                      7698 N          
      7782 CLARK                      7839 Y          
      7934 MILLER                     7782 Y 
Second, to get privilege for a specific employee or employees:
select empno, ename, mgr,
decode (
  replace(max(admin_flags), '/Y', ''),
  null,
  'Y',
  'N'
) admin_flag
from (
  select connect_by_root(empno) empno,
  connect_by_root(ename) ename,
  connect_by_root(mgr) mgr,
  sys_connect_by_path(admin_flag,'/') admin_flags
  from emp e
  start with empno in( 7876, 7499)
  connect by empno = prior mgr
)
group by empno, ename, mgr;

     EMPNO ENAME                       MGR ADMIN_FLAG
---------- -------------------- ---------- ----------
      7499 ALLEN                      7698 N          
      7876 ADAMS                      7788 N

Tom Kyte
March 05, 2013 - 11:57 am UTC

thanks, I was just about to type up the sys connect by path and decided to page down first! thanks much

Grant privileges based on flag in Heirarchy ( Steve & Tom)

Umesh Kasturi, March 05, 2013 - 10:43 pm UTC

Hi Tom/Steve Ashton

Thanks so much,for saving me from writing a bad code using cursor.


Multiple hierarchies?

Sal, July 10, 2013 - 5:00 am UTC

I'm drawing a blank. Here's the code:

CREATE TABLE T1 (ID NUMBER, PARENT_ID NUMBER, LEFT_SIBLING_ID NUMBER);

INSERT INTO T1 VALUES (1,NULL,NULL);
INSERT INTO T1 VALUES (3,1,NULL);
INSERT INTO T1 VALUES (2,1,3);
INSERT INTO T1 VALUES (4,2,NULL);
INSERT INTO T1 VALUES (5,2,4);
INSERT INTO T1 VALUES (10,NULL,NULL);
INSERT INTO T1 VALUES (12,10,NULL);
INSERT INTO T1 VALUES (11,10,12);

What I would like the result to be is:

ID-Tree
1
3
2
4
5
10
12
11

The key here is that besides the usual PRIOR ID = PARENT_ID hierarchy, there is another hierarchy based on PRIOR ID = LEFT_SIBLING. The children are sorted in the order of PRIOR ID = LEFT_SIBLING. That is why the 3 is followed by 2, and the 12 by 11. That order is important.

I've drawn a blank on how to do this.

BTW, why do we enforce child order this way instead of just numbering them? Because we figured, incorrectly in hindsight I think, that numbering would make inserting a new record in the middle too slow.

I forgot...

Sal, July 10, 2013 - 5:04 am UTC

We're still on 11g R1.

connect by without filtering

A reader, September 24, 2013 - 4:11 pm UTC

SELECT min(ess_id) FROM (select level ess_id from dual connect by level <= (select max(ess_id) FROM CUST_TBL)
UNION SELECT max(ess_id)+1 from CUST_TBL MINUS SELECT ess_id from CUST_TBL)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 549.19 565.90 0 2 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 549.19 565.90 0 2 0 0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 63

Rows Row Source Operation
------- ---------------------------------------------------
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=60 us)
0 VIEW (cr=0 pr=0 pw=0 time=47 us)
0 MINUS (cr=0 pr=0 pw=0 time=43 us)
0 SORT UNIQUE (cr=0 pr=0 pw=0 time=27 us)
82104129 UNION-ALL (cr=2 pr=0 pw=0 time=410520687 us)
82104129 CONNECT BY WITHOUT FILTERING (cr=2 pr=0 pw=0 time=82104158 us)
1 FAST DUAL (cr=0 pr=0 pw=0 time=7 us)
1 SORT AGGREGATE (cr=2 pr=0 pw=0 time=42 us)
1 INDEX FULL SCAN (MIN/MAX) CUST_TBL_UIDX_1 (cr=2 pr=0 pw=0 time=21 us)(object id 54818)
0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=0 us)
0 INDEX FULL SCAN (MIN/MAX) CUST_TBL_UIDX_1 (cr=0 pr=0 pw=0 time=0 us)(object id 54818)
0 SORT UNIQUE (cr=0 pr=0 pw=0 time=12 us)
0 INDEX FAST FULL SCAN CUST_TBL_UIDX_1 (cr=0 pr=0 pw=0 time=5 us)(object id 54818)


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1 0.00 0.00
direct path write temp 256 0.16 20.82
SQL*Net break/reset to client 2 0.00 0.00
********************************************************************************

Great Explanation

reachsky, October 09, 2013 - 3:05 pm UTC

I went through many websites but couldn't grasp it at first look. I had to spend only min or so to get a hold of this concept(start with, connect by)

Ganesh, November 28, 2013 - 3:20 pm UTC

create table pd
(s varchar2(1 char),
t varchar2(1 char),
g varchar2(3 char))

insert into pd (s,t) values ('A','B');
insert into pd (s,t) values ('A','C');
insert into pd (s,t) values ('B','C');
insert into pd (s,t) values ('C','B');
insert into pd (s,t) values ('B','A');
insert into pd (s,t) values ('C','A');
insert into pd (s,t) values ('D','E');
insert into pd (s,t) values ('E','D');
insert into pd (s,t) values ('E','F');
insert into pd (s,t) values ('F','E');

select * from pd

The table pd contains potential duplicates identified in our application. A is duplicate of B so vice versa.
I need to group the results and give them a identification to investigate further. I would like to update g as follows

S T G
---------------
A B 1
A C 1
B C 1
C B 1 --> indirectly duplicates to A so belongs group 1
B A 1
C A 1

D E 2 -> not related to A or B or C so belongs to different group
E D 2
E F 2
F E 2


EXCELLENT

A reader, January 09, 2014 - 3:57 am UTC

EXCELLENT EXPLANATION

Hierarchy with parameters

Shimon Batashvili, February 12, 2014 - 9:07 am UTC

Hi Tom,
I work with Oracle9i Enterprise Edition Release 9.2.0.8.0 .
My task is to change an Hierarchical query results according to user input. Here is an example of the original query applied on Scott's EMP table, which has a relation to itself by empno and mgr, as we all know.
My query:
-----------
select level,lpad('*',(level-1)*2,'*')||ename ename from emp
start with mgr is null
connect by prior empno = mgr
and rownum=1 -- I'd like it to fetch only this specific record, which should appear in any case
union all
select level,lpad('*',(level-1)*2,'*')||ename ename
from emp e1
where (job = '&job' or '&job' is null)
and (e1.deptno = '&dept_no' or '&dept_no' is null)
and (e1.ename = '&name' or '&name' is null)
and mgr is not null
start with mgr is null
connect by prior empno = mgr;
Results:
----------
LEVEL ENAME
===== ==============
1 KING
2 **JONES
3 ****SCOTT
4 ******ADAMS
3 ****FORD
4 ******SMITH
2 **BLAKE
3 ****ALLEN
3 ****WARD
3 ****MARTIN
3 ****TURNER
3 ****JAMES
2 **CLARK
3 ****MILLER

This is all fine, in case the users have not passed parameters.
The problem begins when they do pass parameters.
What I expect from the query to show the both selected records and their successors' if they have any.
For example, if I run this query, and pass "BLAKE" as a value for the parametr named "NAME", currently my query returns KING as root and as the successor BLAKE, whereas I expect it to fetch both Blake and his subordinate employees as in the following result set :
LEVEL ENAME
====== ===========
1 KING
2 **BLAKE
3 ****ALLEN
3 ****WARD
3 ****MARTIN
3 ****TURNER

I'd like to add that I have tried to implement your answer to the original question here, i.e., including the inline view (the question with "emp" and "sub" tables), but the results remain the same .

Thanks in Advance!
Shimon B.



Hierarchy with parameters - A correction

Shimon Batashvili, February 12, 2014 - 9:10 am UTC

My city is Ashkelon, and not as it apears in the title of my original post.

Sorry for the mistake.

Shimon B.

SQL query

B Chaki, July 17, 2014 - 6:34 am UTC

In a table (T1 say), there are two columns C1 (varchar datatype) and C2 (number datatype)

C1 has values as under :

C1 C2
=========== =======
((2*150)+20)/4
sqrt(((2+5)-3)*4)
((2+4)*9)/2


How to calculate the values of column C1 and stored in column C2

Like here it will be

C2
=====
80
4
27

A reader, September 26, 2016 - 6:35 am UTC


Awesome Tom.Very helpful

A reader, February 24, 2017 - 12:07 pm UTC


Best explanation and very useful as always

Niranjan, April 09, 2020 - 8:56 pm UTC

The concept is very clearly explained. I went through the Oracle documentation and was very confusing. Now, this is very useful.
Connor McDonald
April 16, 2020 - 11:37 am UTC

Glad we could help

explanation made simple !

sohail, October 07, 2020 - 9:06 am UTC

Thanks. it was a comprehensive explanation.. simple to understand.

Mr

Newman, September 23, 2021 - 4:26 pm UTC

Hi, Tom,

I found the following query code on the web, which uses CONNECT BY to split a comma-delimited string. Have been searching hard for an answer to the question on my mind. Hope you can help.

select 
   level,
   regexp_substr('SMITH,ALLEN,WARD,JONES','[^,]+', 1, LEVEL) word
from dual
connect by regexp_substr('SMITH,ALLEN,WARD,JONES', '[^,]+', 1, LEVEL) is not null;


The result set:

     LEVEL WORD                  
---------- ----------------------
         1 SMITH                 
         2 ALLEN                 
         3 WARD                  
         4 JONES



CONNECT BY is for recursive parent/child relationships. Here in the CONNECT BY clause, rather than "PRIOR expr = expr", the condition is "is not null". And there isn't the START WITH clause either. Obviously, the values of LEVEL are crucial for the query to work out the solution. However, without discernible parent/child relationships among the names separated by commas, how does CONNECT BY determine the values of LEVEL? Could you describe how the code works, step by step, in building the result set?

Thank you very much.

Chris Saxon
September 24, 2021 - 4:21 pm UTC

And there isn't the START WITH clause either.

START WITH is always optional. If you omit this, it means build the hierarchy using every row as a root.

how does CONNECT BY determine the values of LEVEL?

It starts at one for every root row. This then increments for each row in the hierarchy it generates.

CONNECT BY keeps generating rows until the condition is not true (false or unknown).

So for the first (root) row, it extracts the first value (SMITH). Then increments the level to 2, and extracts the 2nd value (ALLEN), continuing until the expression is null.

Swapping the IS NOT NULL to check if the value returned is greater than some letter may help you understand:

select 
   level,
   regexp_substr('SMITH,WARD,JONES,ALLEN','[^,]+', 1, LEVEL) word
from dual
connect by regexp_substr('SMITH,WARD,JONES,ALLEN', '[^,]+', 1, LEVEL) > 'A';

LEVEL    WORD    
       1 SMITH    
       2 WARD     
       3 JONES    
       4 ALLEN 

select 
   level,
   regexp_substr('SMITH,WARD,JONES,ALLEN','[^,]+', 1, LEVEL) word
from dual
connect by regexp_substr('SMITH,WARD,JONES,ALLEN', '[^,]+', 1, LEVEL) > 'M';

LEVEL    WORD    
       1 SMITH    
       2 WARD  


I've recently published a blog post on the topic of CSV-to-rows splitting methods if you want to know more/alternatives:

https://blogs.oracle.com/sql/post/split-comma-separated-values-into-rows-in-oracle-database

Concise and clear answer to the question

Newman, September 25, 2021 - 4:47 pm UTC

Same as explaining the basics of CONNECT BY at the top of this page, the answer to my question is concise and clear. I especially appreciate the
> 'A'
and
> 'M'
examples. Now I understand, LEVEL functions as a counter when the CONNECT BY condition does not define a parent/child relationship, and increments only when there is a match satisfying the CONNECT BY condition.

Many thanks!
Connor McDonald
September 27, 2021 - 11:02 am UTC

Thanks for the feedback

A reader, October 19, 2021 - 11:21 am UTC

useless

Chris Saxon
October 20, 2021 - 9:45 am UTC

Constructive

CONNECT BY or RECURSIVE WITH

Asim, September 12, 2022 - 8:13 pm UTC


Now, as recursive queries using WITH clause are introduced since 11g, i think releasec2


1
What does oracle recommend CONNECT BY or Recursive query using WITH for hierarchical query.

2.
Is everything possible with CONNECT BY is possible with recursive WIT

3.
Is everything possible with recursive WITH is possible with CONNECT BY

4.
Are there anyother uses of recursive WITH, other than hierarchical queries or is it just for hierarchical queries only.

5.
Is CONNECT BY going to be obselete or deprecated or dissupported in future because of recursive WIT

Thanks
Connor McDonald
September 26, 2022 - 10:17 am UTC

1) Either is fine. Depends how much your project mandates ANSI standard SQL
2) Yes, but some times you need first principles (eg level)
3) depth/breadth first is easier with recursive because it is natively part of the syntax
4) generating lists of numbers, or genuine recursive functions
5) Not to my knowledge

Recursive with and connect by

Asim, October 03, 2022 - 7:06 pm UTC


Can you please explain difference between recursive function and genuine recursive function.

Also at the link mentioned below, is a good article about some very interesting differences between recursive with and connect by.
The article is of some independent author, therefore I want to confirm his claims from you as an Oracle's Advocate, what is Oracle's perspective about it.

Can you please comment on the article, specially about.

1.
Difference in cycle detection

....CONNECT_BY_ISCYCLE checks the children (which are yet to be returned), while CYCLE checks the current row (which is already returned).

CONNECT BY is row based, while recursive CTE's are set-based.

2.
Calculate using previously calculated values

....With the connect-by clause you can select PRIOR [column_name], but you cannot select PRIOR [calculated_value]. So what used to be impossible with the connect-by, is now possible with the recursive-with (for people who don't like the model clause....


The article is here.

http://rwijk.blogspot.com/2009/11/recursive-subquery-factoring.html?m=1

Chris Saxon
October 04, 2022 - 10:18 am UTC

recursive function and genuine recursive function.

I've no idea what you mean here! Can you give an example of genuine vs non-genuine recursive functions?

1. You've quoted a quote!

The original article for this quote is:

https://stackoverflow.com/questions/1731889/cycle-detection-with-recursive-subquery-factoring

Does the explanation there help?

2. The statement is accurate - I'm unclear what comment you're looking for here.

See my previous comments please

Asim, October 04, 2022 - 2:05 pm UTC


Please see above, my (Asim's) comment on 12 sep 2022, and Connor's reply to it on 26 sep 2022.

Connor's point no. 4, he said recursive WITH can be used for genuine recursive functions.
So, I asked Connor, what he meant by genuine recursive function.

Moreover, the original question was about differences between recursive with and connect by, to which Connor's reply didnt mentioned this very important difference which is mentioned in the article of Rob, and stackoverflow link you gave.

So, I just wanted to confirm from you Chris and Connor as Oracle's representatives, , that do you agree with the article? Specially , the cycle detection difference example, in which ,in one case the KING is repeated two times and in other case KING is only displayed ones. If you agree then this is a very big difference and people rarely know it and mybe unknowingly executing queries with unwanted results.

Does Oracle docs. mentions this difference, because docs does mention recursive with as an alternative to connect by, so docs must also at least warn about this difference. If it is not mentioned I request you to forward my this request docs team.

Also if you can say something about set based and row based difference?, because what I know till now, is that SQL is set based, then how is this possible.




Connor McDonald
October 06, 2022 - 4:53 am UTC

For genuine recursive functions I made things along these lines

https://www.geeksforgeeks.org/recursion-practice-problems-solutions/

Every recursive problem can be phrased as a non-recursive one, but sometimes the recursive expression is easier to write/understand etc.

Regarding "agree with the article", I already said that recursive with is NOT a drop in replacement for connect by. Expecting cycle detection to be "the same" is like saying you want a shower and bath to be the same. Yes they both get you clean, and yes, one can mostly replace the other .... but you can't then be upset that one you do standing up and one lying down.


Docs

Asim, October 06, 2022 - 4:10 pm UTC

Thanks.

However, I think it would be better if docs explicitly make the user aware of this cycle detection difference, as its very easy/intuitive for the reader to assume from similar definitions of keywords (CONNECT BY NOCYCLE and WITH....CYCLE..) of cycle detection, that both will produce same results.

Tim Hall has mentioned this here

https://oracle-base.com/articles/11g/recursive-subquery-factoring-11gr2#cyclic

The NOCYCLE and CONNECT_BY_ISCYCLE functionality is replicated using the CYCLE clause. By specifying this clause, the cycle is detected and the recursion stops, with the cycle column set to the specified value, to indicate the row where the cycle is detected. Unlike the CONNECT BY NOCYCLE method, which stops at the row before the cycle, this method stops at the row after the cycle.

Oracle docs should also mention this, preferably by showing an example of both CONNECT BY and recursive WITH using cyclic data, and it should be mentioned in both places in docs ie where coonect by is explained and where recursive with is explained, if you agree, please forward a request to docs team.


More to Explore

PL/SQL demos

Check out more PL/SQL tutorials on our LiveSQL tool.

PL/SQL docs

PL/SQL reference manual from the Oracle documentation library