Skip to Main Content
  • Questions
  • Hierarchial query with cumulative factor

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Kim Berg.

Asked: June 17, 2002 - 11:43 am UTC

Last updated: December 01, 2009 - 1:49 pm UTC

Version: 8.1.7

Viewed 10K+ times! This question is

You Asked

Hi, Tom


I have a slight problem with a hierarchial query that I hope you can help me with.

We have Oracle 8.1.7.4 and I've setup a testcase consisting of the following :


A table of items with an id, a name, and a type that can be either 'I'tem og 'S'et :

create table items (
itemid varchar2 (20),
itemtype char (1),
itemname varchar2 (40)
);


A table defining what items a Set consists of :

create table itemsets (
parentid varchar2 (20),
childid varchar2 (20),
factor number
);


A simple example might have the following data :


SQL> select * from items;

ITEMID I ITEMNAME
-------------------- - ----------------------
BIKE S Complete bicycle
FRAME I Bicycle frame
COMP_WHEEL S Complete wheel
WHEEL S Wheel w/out tyre
TYRE I Tyre for bicycle wheel
RING I Ring for wheel
SPOKE I Spoke for wheel


SQL> select * from itemsets;

PARENTID CHILDID FACTOR
-------------------- -------------------- ----------
BIKE FRAME 1
BIKE COMP_WHEEL 2
COMP_WHEEL WHEEL 1
COMP_WHEEL TYRE 1
WHEEL RING 1
WHEEL SPOKE 28


Meaning that a BIKE consists of one FRAME and two COMP_WHEEL's.
A COMP_WHEEL consists of one WHEEL and one TYRE.
A WHEEL consists of one RING and 28 SPOKE's.

So a hierarchial query on a BIKE would give the following result :

SQL> select lpad(' ',level*2)||parentid parentid, childid, factor
2 from itemsets
3 connect by prior childid = parentid
4 start with parentid = 'BIKE';

PARENTID CHILDID FACTOR
-------------------- -------------------- ----------
BIKE FRAME 1
BIKE COMP_WHEEL 2
COMP_WHEEL WHEEL 1
WHEEL RING 1
WHEEL SPOKE 28
COMP_WHEEL TYRE 1


Now I want to make a list that tells me what items and how many of each I need to put together a BIKE. I want this result :

FRAME 1
RING 2
SPOKE 56
TYRE 2

If I pick these items, I can assemble a BIKE.


Given my hierarchial query from above I need to do two things :
1) Select nodes in the tree that does not have children.
2) Multiply the factor of each child with the factor of the parent.

The first part I can do using that I have the ITEMTYPE column in the ITEMS table - no problem.


The second part is more tricky. I'd like to do this :

SQL> select items.itemid, items.itemname, tree.factor
2 from (
3 select childid, factor * nvl(prior factor, 1) factor
4 from itemsets
5 connect by prior childid = parentid
6 start with parentid = 'BIKE'
7 ) tree, items
8 where items.itemid = tree.childid
9 and items.itemtype = 'I';

ITEMID ITEMNAME FACTOR
-------------------- ---------------------------------------- ----------
FRAME Bicycle frame 1
RING Ring for wheel 1 -- WRONG
SPOKE Spoke for wheel 28 -- WRONG
TYRE Tyre for bicycle wheel 2


That works up to level two (the factor for TYRE is correctly 2), but not for lower levels in the tree (RING and SPOKE is wrong because the factor 2 for COMP_WHEEL is not propagated down to the lower levels.


I've thought of a solution where I make a recursive function that returns a TABLE that I can select from, but wouldn't it add quite an overhead ?
That overhead is not a big deal in a simple example like this, but I also have to "expand sets" like that in queries that should answer questions such as : "For all orders that should be shipped this week, check if we have enough items in stock". It would be perfect to be able to "expand" the items in the orders like the above example and then select sum(ordered_number * factor) group by childid (or something similar).


If this CAN'T be solved easily in 8i but CAN in 9i, please specify - we plan on upgrading some time fairly soon.


Thanks.


Kim Berg Hansen
Senior Systems Developer
kbh@thansen.dk
T.Hansen Gruppen A/S
Denmark


and Tom said...

This works in 8i:

ops$tkyte@ORA817DEV.US.ORACLE.COM> select parentid, childid,
2 decode( (select itemType from items where itemid = itemsets.childid),
3 'S', to_number(NULL),
4 (select exp(sum(ln(is2.factor))) from itemsets is2
5 start with is2.childid = itemsets.childid
6 connect by prior is2.parentid = is2.childid ) ) factor2
7 from itemsets
8 start with parentid = 'BIKE'
9 connect by prior childid = parentid
10 /

PARENTID CHILDID FACTOR2
--------------- --------------- ----------
BIKE FRAME 1
BIKE COMP_WHEEL
COMP_WHEEL WHEEL
WHEEL RING 2
WHEEL SPOKE 56
COMP_WHEEL TYRE 2

6 rows selected.


It is more difficult then you make out -- as you need the entire hierarchy of factors. What if a bike has two wheels that have 28 spokes that each have 4 gizmos -- you need to multiply by the entire lineage there.

That is what exp(sum(ln(is2.factor))) does for us -- a product. See
</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2200571416651 <code>
for background

Rating

  (79 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Perfect - just what I need

Kim Berg Hansen, June 17, 2002 - 2:29 pm UTC

Thanks, Tom

That was a very nice query, and I can definitely use it - but I can also see in the answer you linked to, that I think I will press on for an upgrade to 9i, as I like the idea of creating user-defined aggregates.


Slight complication ...

Kim Berg Hansen, June 19, 2002 - 3:07 am UTC

Hi, Tom


I still like your query, but I've found a slight complication when trying it out...

If I add another itemset - a ladies bike, which is identical to the BIKE except for the frame. So my data now looks like this :

SQL> select * from items;

ITEMID               I ITEMNAME
-------------------- - ---------------------------
BIKE                 S Complete bicycle
FRAME                I Bicycle frame
COMP_WHEEL           S Complete wheel
WHEEL                S Wheel w/out tyre
TYRE                 I Tyre for bicycle wheel
RING                 I Ring for wheel
SPOKE                I Spoke for wheel
LADYBIKE             S Complete bicycle for ladies
LADYFRAME            I Bicycle frame lady-design


SQL> select * from itemsets;

PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL                    2
COMP_WHEEL           WHEEL                         1
COMP_WHEEL           TYRE                          1
WHEEL                RING                          1
WHEEL                SPOKE                        28
LADYBIKE             LADYFRAME                     1
LADYBIKE             COMP_WHEEL                    2


The two bottom rows in each table are new.

The hierarchial query now fails :

SQL> select
  2     parentid, childid,
  3     decode( (select itemType from items where itemid = 
  4              itemsets.childid),
  5              'S', to_number(NULL),
  6              (select exp(sum(ln(is2.factor))) from itemsets is2
  7              start with is2.childid = itemsets.childid
  8              connect by prior is2.parentid = is2.childid ) ) factor2
  9  from itemsets
 10  start with parentid = 'BIKE'
 11  connect by prior childid = parentid;

PARENTID             CHILDID                 FACTOR2
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL
COMP_WHEEL           WHEEL
WHEEL                RING                          4
WHEEL                SPOKE                       112
COMP_WHEEL           TYRE                          4

Because the COMP_WHEEL is now a child of two different parents, FACTOR2 is now too high.

What I tried to do in my original query :

SQL> select items.itemid, items.itemname, tree.factor
  2  from (
  3  select childid, factor * nvl(prior factor, 1) factor
  4  from itemsets
  5  connect by prior childid = parentid
  6  start with parentid = 'BIKE'
  7  ) tree, items
  8  where items.itemid = tree.childid
  9  and items.itemtype = 'I';

was to hope, that "prior factor" would use the alias factor instead of the table column factor - but of course it didn't work. I knew it, but I just had :-) to try it...


So I've now tried out my PL/SQL function approach (if you can't make it SQL, try PL/SQL, right :-)

I've created these types :

create type item_expanded_type as object (
   childid     VARCHAR2(20),
   factor      NUMBER
);

create type item_expanded_table as table of item_expanded_type;

And I've created a function :

CREATE OR REPLACE function item_expanded (
   parent_id      VARCHAR2,
   number_items   NUMBER DEFAULT 1
) return item_expanded_table
is
   table_cnt      PLS_INTEGER := 0;
   return_table   item_expanded_table := item_expanded_table();
   child_table    item_expanded_table;
begin
   for child in (
      select itemsets.childid,
             itemsets.factor,
             items.itemtype
        from itemsets, items
       where itemsets.parentid = parent_id
         and items.itemid = itemsets.childid
   ) loop
      if child.itemtype = 'S' then
         child_table := item_expanded(child.childid, child.factor * number_items);
         return_table.extend(child_table.count);
         for child_cnt in child_table.first..child_table.last loop
            table_cnt := table_cnt + 1;
            return_table(table_cnt) := child_table(child_cnt);
         end loop;
      else
         return_table.extend(1);
         table_cnt := table_cnt + 1;
         return_table(table_cnt) := item_expanded_type(child.childid, child.factor * number_items);
      end if;
   end loop;
   return return_table;
end item_expanded;
/

Now I can query like this :


SQL> select 
  2     parent.itemid parentid, child.childid, child.factor
  3  from 
  4     items parent, 
  5     table(cast(item_expanded(parent.itemid) as item_expanded_table)) child
  6  where parent.itemid = 'BIKE';

PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 TYRE                          2
BIKE                 RING                          2
BIKE                 SPOKE                        56
BIKE                 FRAME                         1


That approach gives me the correct answer, but I'm concerned about performance.

Autotrace for the hierarchial approach :

SQL> select
  2     parentid, childid,
  3     decode( (select itemType from items where itemid = 
  4              itemsets.childid),
  5              'S', to_number(NULL),
  6              (select exp(sum(ln(is2.factor))) from itemsets is2
  7              start with is2.childid = itemsets.childid
  8              connect by prior is2.parentid = is2.childid ) ) factor2
  9  from itemsets
 10  start with parentid = 'BIKE'
 11  connect by prior childid = parentid;

PARENTID             CHILDID                 FACTOR2
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL
COMP_WHEEL           WHEEL
WHEEL                RING                          4
WHEEL                SPOKE                       112
COMP_WHEEL           TYRE                          4

6 rækker er valgt.


Statistics
----------------------------------------------------------
          0  recursive calls
        116  db block gets
         45  consistent gets
          0  physical reads
          0  redo size
        690  bytes sent via SQL*Net to client
        633  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed


Autotrace for the procedural approach :

SQL> select 
  2     parent.itemid parentid, child.childid, child.factor
  3  from 
  4     items parent, 
  5     table(cast(item_expanded(parent.itemid) as item_expanded_table)) child
  6  where parent.itemid = 'BIKE';

PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 TYRE                          2
BIKE                 RING                          2
BIKE                 SPOKE                        56
BIKE                 FRAME                         1


Statistics
----------------------------------------------------------
         13  recursive calls
         28  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        513  bytes sent via SQL*Net to client
        404  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          6  sorts (memory)
          0  sorts (disk)
          4  rows processed


So the question to you, Tom, is now this :

Are all those recursive calls going to kill performance or what ?

If it's not a performance killer, then I'm going to stick to the procedural approach, as it is easier to join with than the hierarchial query (at least while we're still using 8i :-)


Thanks again for your answers (all of your answers - great archive to search when problems arise ...)

 

Tom Kyte
June 19, 2002 - 7:22 am UTC

will it kill performance? not if you have small structures to "explode".

A slight different question

Reader, August 04, 2003 - 11:10 am UTC

Using the original table itemsets and sample data,
but not items table:
Database version 9.0.1

create table itemsets ( 
  parentid  varchar2 (20), 
  childid   varchar2 (20), 
  factor    number 
);
SQL> select * from itemsets;

PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL                    2
COMP_WHEEL           WHEEL                         1
COMP_WHEEL           TYRE                          1
WHEEL                RING                          1
WHEEL                SPOKE                        28

Is it possible to use one SQL Query to generate
the following results?
PARENTID        CHILDID            FACTOR2
--------------- --------------- ----------
BIKE            FRAME                    1
BIKE            TYRE                     2
BIKE            RING                     2
BIKE            SPOKE                   56
COMP_WHEEL      TYRE                     1
COMP_WHEEL      RING                     1
COMP_WHEEL      SPOKE                   28
WHEEL           RING                     1
WHEEL           SPOKE                   28

Thanks 

Tom Kyte
August 04, 2003 - 12:16 pm UTC

i don't understand the outputs from the inputs I guess.

bike => comp_wheel => wheel => spoke
=> tyre

so, where is bike comp_wheel in the output and such.



Re: A slight different question

Reader, August 04, 2003 - 11:16 am UTC

I guess I meant for all the
node-to-leaf relationship with cumulative factors.


Tom Kyte
August 04, 2003 - 12:19 pm UTC

nope, don't get it.

Re: A slight different question

Reader, August 04, 2003 - 12:33 pm UTC

Allow me to try to explain:
PARENTID CHILDID FACTOR
-------------------- -------------------- ----------
BIKE FRAME 1
BIKE COMP_WHEEL 2
COMP_WHEEL WHEEL 1
COMP_WHEEL TYRE 1
WHEEL RING 1
WHEEL SPOKE 28

frame, tyre, spoke, ring are leaf nodes because
they don't appear as parentid in itemsets.
bike, comp_wheel, wheel are non-leaf nodes
because they appear as parent id in itemsets.

I want to a report of
non leaf node (bike, comp_wheel, wheel)
to leaf node (frame, tyre, spoke, ring)
with cumulative factor

Tom Kyte
August 04, 2003 - 12:49 pm UTC

you can easily get enough to "get" it

ops$tkyte@ORA920> l
  1  select rpad('*',level,'*') || parentid parentid, childid, factor,
  2         sys_connect_by_path( parentid, '/' ) scbp1,
  3         sys_connect_by_path( factor, '/' ) scbp2
  4    from t
  5   where parentid in ( select parentid from t )
  6     and childid not in ( select parentid from t )
  7* connect by prior childid = parentid
ops$tkyte@ORA920> /

PARENTID             CHILDID    FACTOR SCBP1                  SCBP2
-------------------- ---------- ------ ---------------------- --------------
*BIKE                FRAME           1 /BIKE                  /1
***WHEEL             RING            1 /BIKE/COMP_WHEEL/WHEEL /2/1/1
***WHEEL             SPOKE          28 /BIKE/COMP_WHEEL/WHEEL /2/1/28
**COMP_WHEEL         TYRE            1 /BIKE/COMP_WHEEL       /2/1
**WHEEL              RING            1 /COMP_WHEEL/WHEEL      /1/1
**WHEEL              SPOKE          28 /COMP_WHEEL/WHEEL      /1/28
*COMP_WHEEL          TYRE            1 /COMP_WHEEL            /1
*WHEEL               RING            1 /WHEEL                 /1
*WHEEL               SPOKE          28 /WHEEL                 /28

9 rows selected.


in 9i using sys connect by path. 

Re: A slight different question

Reader, August 04, 2003 - 1:12 pm UTC

Sorry for my dumbness, but I don't understand how
I can get the cumulative factor, eg,
BIKE FRAME 1
BIKE TYRE 2 <--
BIKE RING 2 <--
BIKE SPOKE 56 <--
COMP_WHEEL TYRE 1
COMP_WHEEL RING 1
COMP_WHEEL SPOKE 28
WHEEL RING 1
WHEEL SPOKE 28

I guess you didn't mean that I should look at
the result of SCBP2 column,

***WHEEL SPOKE 28 /BIKE/COMP_WHEEL/WHEEL /2/1/28

find 2, 1, and 28 and do 2*1*28 in my head?

Tom Kyte
August 04, 2003 - 2:20 pm UTC

you could do it in your head or, you could do it in the client 

all I said was "i can get you sufficient data to get that"  


ops$tkyte@ORA920> create or replace function eval( p_string in varchar2 ) return number
  2  as
  3          l_str varchar2(4000) := substr( p_string, 2 ) || '/';
  4          l_val number := 1;
  5          l_n   number;
  6  begin
  7          while ( l_str is not null )
  8          loop
  9                  l_n := instr( l_str, '/' );
 10                  l_val := l_val * substr( l_str, 1, l_n-1 );
 11                  l_str := substr( l_str, l_n+1 );
 12          end loop;
 13          return l_val;
 14  end;
 15  /

Function created.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select rpad('*',level,'*') || parentid parentid, childid, factor,
  2         sys_connect_by_path( parentid, '/' ) scbp1,
  3         sys_connect_by_path( factor, '/' ) scbp2,
  4             eval( sys_connect_by_path( factor, '/' ) ) factor2
  5    from t
  6   where parentid in ( select parentid from t )
  7     and childid not in ( select parentid from t )
  8  connect by prior childid = parentid;

PARENTID             CHILDI FACTOR SCBP1                  SCBP2         FACTOR2
-------------------- ------ ------ ---------------------- ---------- ----------
*BIKE                FRAME       1 /BIKE                  /1                  1
***WHEEL             RING        1 /BIKE/COMP_WHEEL/WHEEL /2/1/1              2
***WHEEL             SPOKE      28 /BIKE/COMP_WHEEL/WHEEL /2/1/28            56
**COMP_WHEEL         TYRE        1 /BIKE/COMP_WHEEL       /2/1                2
**WHEEL              RING        1 /COMP_WHEEL/WHEEL      /1/1                1
**WHEEL              SPOKE      28 /COMP_WHEEL/WHEEL      /1/28              28
*COMP_WHEEL          TYRE        1 /COMP_WHEEL            /1                  1
*WHEEL               RING        1 /WHEEL                 /1                  1
*WHEEL               SPOKE      28 /WHEEL                 /28                28

9 rows selected.

 

A little different question

Sudhanshu, February 24, 2004 - 10:54 pm UTC

I have a slightly different question.
I have data in table items like below :-

TEM_MADE ITEM_REQD QTY
---------- ---------- ----------
Bicycle bell 1
Bicycle wheel 2
Bicycle frame 1
bell steel 20
wheel spoke 80
wheel tyre 2
spoke steel 10
frame screws 18
frame steel 1000
screws steel 45
steel Iron 70
steel Manganese 30

Now I have to know how much grams of steel I require to make a bicycle through a single query ???

The answer I am expecting to get is :-
(1*20)+(2*80*10)+(1*18*45)+(1*1000)
=3430

I tried solving this through below query but I couldn't,
SELECT item_made,
item_reqd,
DECODE (item_reqd,
'steel', (SELECT EXP (SUM (LN (is2.qty)))
FROM items is2
START WITH is2.item_reqd = abc.item_reqd
CONNECT BY PRIOR is2.item_made = is2.item_reqd),
TO_NUMBER (NULL)
) factor2
FROM items
START WITH item_made = 'Bicycle'
CONNECT BY PRIOR item_reqd = item_made;

Tom Kyte
February 25, 2004 - 9:02 am UTC

Here is a start, hurts my head thinking about these types of queries sometimes :)



ops$tkyte@ORA920PC> select a.*, sum(x) over () sumx
  2    from (
  3  select rpad( '*', level, '*' ) || item_made im, item_reqd, qty ,
  4         sys_connect_by_path( qty, '/' ) scbp,
  5             decode( item_reqd, 'steel',
  6                             (select exp(sum(ln(qty))) from t t2
  7                    start with t2.rowid = t.rowid
  8                  connect by prior t2.item_made = t2.item_reqd ), 0 ) x
  9    from t
 10   start with item_made = 'Bicycle'
 11  connect by prior item_reqd = item_made
 12         ) a
 13  /
 
IM              ITEM_REQD         QTY SCBP                     X       SUMX
--------------- --------------- ----- --------------- ---------- ----------
*Bicycle        bell                1 /1                       0       3430
**bell          steel              20 /1/20                   20       3430
***steel        Iron               70 /1/20/70                 0       3430
***steel        Manganese          30 /1/20/30                 0       3430
*Bicycle        wheel               2 /2                       0       3430
**wheel         spoke              80 /2/80                    0       3430
***spoke        steel              10 /2/80/10              1600       3430
****steel       Iron               70 /2/80/10/70              0       3430
****steel       Manganese          30 /2/80/10/30              0       3430
**wheel         tyre                2 /2/2                     0       3430
*Bicycle        frame               1 /1                       0       3430
**frame         screws             18 /1/18                    0       3430
***screws       steel              45 /1/18/45               810       3430
****steel       Iron               70 /1/18/45/70              0       3430
****steel       Manganese          30 /1/18/45/30              0       3430
**frame         steel            1000 /1/1000               1000       3430
***steel        Iron               70 /1/1000/70               0       3430
***steel        Manganese          30 /1/1000/30               0       3430
 
18 rows selected.
 
 

Slight Problem

Sudhanshu, February 26, 2004 - 7:30 am UTC

SQL> select * from abc;

ITEM_MADE ITEM_REQD        QTY
--------- ---------        ----------
A         B                1
A         C                2
A         D                3
C         E                3
B         C                2
D         B                3

Now from above table if I try to find out the qty of item C required to make item A using below query, it gives

 SELECT a.*,
        SUM (x) OVER () sumx
   FROM (SELECT RPAD ('*', LEVEL, '*') || item_made im,
                item_reqd,
                qty,
                SYS_CONNECT_BY_PATH (qty, '/') scbp,
                DECODE (item_reqd,
                        'C', (SELECT EXP(SUM(LN(qty)))
                                FROM abc t2
                          START WITH t2.ROWID = abc.ROWID
                    CONNECT BY PRIOR t2.item_made = t2.item_reqd), 0 ) x
FROM abc
START WITH item_made = 'A'
CONNECT BY PRIOR item_reqd = item_made) a;
-----------------------------------
*A    B 1 /1           38    
**B   C 2 /1/2         38   
***C  E 3 /1/2/3       38  
*A    C 2 /2           38    
**C   E 3 /2/3         38   
*A    D 3 /3           38    
**D   B 3 /3/3         38   
***B  C 2 /3/3/2       38  
****C E 3 /3/3/2/3     38

The expected answer should have been 22((1*2)+(2)+(3*3*2)).

Now in the output line 2 contributes 18,line 4 contributes 2 & line 8 again 18, which makes the total 38(18+2+18).
The flaw which I see is at line 2 of the output. When it saw that it has reached 'C', then it backtracks the path to reach 'A' but it takes the longer path at line 8 i.e. C->B->D->A which makes it 18; whereas it should have taken C->B->A path which would have contributed 2 thus making the total 22.

Can you suggest a way out from this ? 

Tom Kyte
February 26, 2004 - 10:40 am UTC

I don't see anything *wrong* with it. You have three hierarchies in your base table in the first place. Three "a's"

In order to make "A" you need:

a->b->c
a->c
a->d->b->c



More Information

Sudhanshu Kumar, February 26, 2004 - 11:13 am UTC

Last time I missed to provide column x values :-
x sum
---- - - ------ -- ----
*A B 1 /1 0 38
**B C 2 /1/2 18 38
***C E 3 /1/2/3 0 38
*A C 2 /2 2 38
**C E 3 /2/3 0 38
*A D 3 /3 0 38
**D B 3 /3/3 0 38
***B C 2 /3/3/2 18 38
****C E 3 /3/3/2/3 0 38

a->b->c This should give 2.In above result it is giving 18. Wrong
a->c This should give 2.In above result it is giving 2. Correct
a->d->b->c This should give 18.In above result it is giving 18. Correct

My point is for the hierarchy a->b->c in second row of above output, the control backtracks using c->b->d->a in the query in the select column list; whereas it should have backtracked in c->b->a thus giving us product value 2.

Any solutions ??

By the Way, sometime before when I used to post a question I used to receive a mail saying my question has been answered with a link to the answer. Have you removed this feature?

Tom Kyte
February 26, 2004 - 2:16 pm UTC

the easiest thing for us will be to write a small function I think to take the sys connect by path information and multiply.  something like:


ops$tkyte@ORA920PC> create or replace function multiply( p_str in varchar2, p_delim in varchar2 )
  2  return number
  3  as
  4          l_str long := substr( p_str, 2 ) || p_delim;
  5          l_res number;
  6          l_n   number;
  7  begin
  8          loop
  9          l_n := instr( l_str, p_delim );
 10          exit when (nvl(l_n,0) = 0);
 11                  l_res := nvl(l_res,1)*substr(l_str,1,l_n-1);
 12          l_str := substr( l_str, l_n+1 );
 13      end loop;
 14      return l_res;
 15  end;
 16  /
 
Function created.
 
ops$tkyte@ORA920PC>
ops$tkyte@ORA920PC> select multiply( '/1/2', '/' ) from dual;
 
MULTIPLY('/1/2','/')
--------------------
                   2
 
ops$tkyte@ORA920PC> select multiply( '/3/3/2', '/' ) from dual;
 
MULTIPLY('/3/3/2','/')
----------------------
                    18


might even be better than re-walking the tree in this case.


(when you ask a question via the "ask a question", you get an email.  these are "reviews", "followups", you are not asking a question here really ;) 

Tree

Marcio, April 02, 2004 - 12:18 pm UTC

Suppose you have a tree like this:

id, id_up
1, null
1,2
1,3
2,4
2,5
3,2
2,1

1
/ \
2 3
/ \ \
4 5 2
\
1

How can I get result like
1
1/2
1/3
1/2/4
1/2/5
1/3/2
1/3/2/1

in just one statement sql? I mess up with this!

Tom Kyte
April 02, 2004 - 1:43 pm UTC

until 10g, that'll give you a connect by loop -- 1->3->2->1->3->2->1->.....

sys_connect_by_path

Raj, April 22, 2004 - 9:30 pm UTC

In my stored procedure, i gave like

select
sys_connect_by_path(name,c_delimiter)
from emp
start with mgr_code = i_mgr_code
connect by prior emp_code = mgr_code;

where c_delimiter is a constant, if i replace that with a value lime '+' then it is fine else

then i get an error


PL/SQL: ORA-30003: illegal parameter in SYS_CONNECT_BY_PATH function
Line: 63

i am using the above query as inline view of another query.

version : 9.2


Tom Kyte
April 23, 2004 - 11:13 am UTC

Apparently, sys_connect_by_path does not support bind variables as the second argument for whatever reason:

ops$tkyte@ORA9IR2> select sys_connect_by_path(ename,:x) x
  2    from scott.emp
  3   start with mgr = 7839
  4  connect by prior empno = mgr;
select sys_connect_by_path(ename,:x) x
*
ERROR at line 1:
ORA-30003: illegal parameter in SYS_CONNECT_BY_PATH function
 
 
ops$tkyte@ORA9IR2> !oerr ora 30003
30003, 00000, "illegal parameter in SYS_CONNECT_BY_PATH function"
// *Cause:
// *Action: use a non-empty constant string as the second argument,
//          then retry the operation.


You would have to use dynamic sql currently to vary that input at runtime unfortunately. 

sum() over ( )

Marcio, May 12, 2004 - 4:04 pm UTC

Consider the scenario:

Description Rows
------------------- ----------
G = general 1
F = filial 10
L = Location 6000
E = Estation 13000
x
I = Indicator 180

sys_connect_by_path value
------------------- -----
/G/F1/L1/E1/I1 10
/G/F1/L1/E2/I2 8
/G/F1/L1/E3/I1 102
/G/F1/L2/E1/I4 24
/G/F1/L2/E2/I5 12
/G/F1/L2/E3/I8 229
/G/F2/L1/E1/I5 10
...

What would be your method to sum over g over g/f and over g/f/l like this:

select sum(value) over (partition by g, i),
sum(value) over (partition by g, f, i),
sum(value) over (partition by g, f, l, i)
from ...

I did this as is -- like demonstrated above, but take long time, because have a source with 200,000 rows.
This source will populate the structure associate(g/f/l/e/i)

thank you

Tom Kyte
May 13, 2004 - 9:32 am UTC

I don't know what you meant exactly by "i did this as is -- like demonstrated above" -- your situation seems to be quite different?


But given the inputs above - just some comments. If you look at your sum's -- they'll each require the ENTIRE input set to be sorted and sifted in different fashions. On my desktop, with 200,000 rows, this takes about 6/7 seconds, given the amount of raw sorting that must take place (each partition by is like an entire "new query" there), it seems reasonable:


select scbp,
value,
sum(value) over (partition by g,i) gi,
sum(value) over (partition by g,f,i) gfi,
sum(value) over (partition by g,f,l,i) gfli
from (
select substr( scbp, instr(scbp, '/', 1, 1 )+1, instr(scbp,'/',1,2)-instr(scbp,'/',1,1)-1 ) g,
substr( scbp, instr(scbp, '/', 1, 2 )+1, instr(scbp,'/',1,3)-instr(scbp,'/',1,2)-1 ) f,
substr( scbp, instr(scbp, '/', 1, 3 )+1, instr(scbp,'/',1,4)-instr(scbp,'/',1,3)-1 ) l,
substr( scbp, instr(scbp, '/', 1, 4 )+1, instr(scbp,'/',1,5)-instr(scbp,'/',1,4)-1 ) e,
substr( scbp, instr(scbp, '/', 1, 5 )+1, instr(scbp,'/',1,6)-instr(scbp,'/',1,5)-1 ) i,
scbp,
value
from (
select '/G1/F' || (mod(rownum,10)+1) ||
'/L' || (mod(rownum,6000)+1) ||
'/E' || (mod(rownum,13000)+1) ||
'/I' || (mod(rownum,180)+1)
|| '/' scbp,
object_id value
from big_table.big_table
where rownum <= 200000
)
)

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13335 6.84 6.80 2849 2878 0 200000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 13337 6.84 6.80 2849 2878 0 200000

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 133

Rows Row Source Operation
------- ---------------------------------------------------
200000 WINDOW SORT (cr=2878 r=2849 w=0 time=6119880 us)
200000 VIEW (cr=2878 r=2849 w=0 time=2640031 us)
200000 COUNT STOPKEY (cr=2878 r=2849 w=0 time=538830 us)
200000 TABLE ACCESS FULL BIG_TABLE (cr=2878 r=2849 w=0 time=262873 us)


What about subqueries

Marcio, May 13, 2004 - 1:15 pm UTC

Yes, I found my bottleneck -- I think!
It is subqueries to get pk from hierarchy -- ( select pk from hierarchy where name = x )

hierarchy_13k - 13,000 rows
t_200k - 200,000 rows

Look like this (resemble)

I must to insert into t (multi-insert)

so

insert /*+ append */ all
when ( rn_g = 1 ) then
insert into t values ( pk_g, sum_g )
when ( rn_f = 1 ) then
insert into t values ( pk_f, sum_f )
when ( rn_l = 1 ) then
insert into t values ( pk_l, sum_l )
when ( rn_e = 1 ) then
insert into t values ( pk_e, value )
select rownum() over ( partition by g, i order by g, i) rn_g,
rownum() over ( partition by g, f, i order by g, f, i) rn_f,
rownum() over ( partition by g, f, l, i order by g, f, l, i ) rn_l,
rownum() over ( partition by g, f, l, e, i order by g, f, l, e, i ) rn_e,
sum(value) over ( partition by g, i ) sum_g,
sum(value) over ( partition by g, f, i ) sum_f,
sum(value) over ( partition by g, f, l, i ) sum_l,
( select pk from hierarchy_13k where name = g ) pk_g,
( select pk from hierarchy_13k where name = f ) pk_f,
( select pk from hierarchy_13k where name = l ) pk_l
from t_200k, hierarchy_13k
where name = e
/

This is my performance problem -- what would be a better method to write this query to insert like demonstrated above.

Thank you


Tom Kyte
May 13, 2004 - 3:08 pm UTC

why not just join or outer join if need be in bulk?


seems there is a cartesian product in there? is that on purpose

Cartesian product ?

Marcio, May 14, 2004 - 2:46 pm UTC

Sorry Tom, I couldn't see cartesian product on there, allow me be more clear!

I don't want to spend more than 30-60-180 secs with this,
but don't get your last advice, may be if you're seeing a 100%
concise example you could suggest something to speed up this (if possible)

Volumetric information:

Table Rows
--------------------- -------
source_200k 200,000
element_13k 13,000
indicator 180
parameter_assoc TBD

DDL - Script

drop table parameter_assoc;
drop table source_200k;
drop table element_13k;
drop table indicator;
drop sequence s;

create table source_200k
( g varchar2(10),
f varchar2(10),
l varchar2(10),
e varchar2(10),
i varchar2(10),
value number );

insert into source_200k values ('G', 'F1', 'L1', 'E1', 'I1', 10);
insert into source_200k values ('G', 'F1', 'L1', 'E2', 'I2', 8);
insert into source_200k values ('G', 'F1', 'L1', 'E3', 'I1', 102);
insert into source_200k values ('G', 'F1', 'L2', 'E1', 'I4', 24);
insert into source_200k values ('G', 'F1', 'L2', 'E2', 'I5', 12);
insert into source_200k values ('G', 'F1', 'L2', 'E3', 'I8', 229);
insert into source_200k values ('G', 'F2', 'L1', 'E1', 'I5', 10);

create table element_13k ( id int primary key, name_ele varchar2(10) );
insert into element_13k values ( 1, 'G' );
insert into element_13k values ( 2, 'F1' );
insert into element_13k values ( 3, 'F2' );
insert into element_13k values ( 4, 'L1' );
insert into element_13k values ( 5, 'L2' );
insert into element_13k values ( 6, 'E1' );
insert into element_13k values ( 7, 'E2' );
insert into element_13k values ( 8, 'E3' );

create table indicator ( id int primary key, name_ind varchar2(10) );
insert into indicator values ( 1, 'I1' );
insert into indicator values ( 2, 'I2' );
insert into indicator values ( 3, 'I4' );
insert into indicator values ( 4, 'I5' );
insert into indicator values ( 5, 'I8' );

create table parameter_assoc (
par_id int primary key,
ele_id references element_13k,
ind_id references indicator,
value number )
/

create sequence s;

create or replace trigger parameter_trg
before insert on parameter_assoc for each row
begin
if ( :new.par_id is null )
then
select s.nextval into :new.par_id from dual;
end if;
end;
/
show error

column tree format a30

select rownum rn, '/'||g||'/'||f||'/'||l||'/'||e tree, value
from source_200k
/

/*
RN TREE VALUE
------------- ------------------------------ -------------
1 /G/F1/L1/E1 10
2 /G/F1/L1/E2 8
3 /G/F1/L1/E3 102
4 /G/F1/L2/E1 24
5 /G/F1/L2/E2 12
6 /G/F1/L2/E3 229
7 /G/F2/L1/E1 10
8 /G/F1/L1/E1 10
9 /G/F1/L1/E2 8
10 /G/F1/L1/E3 102
11 /G/F1/L2/E1 24
12 /G/F1/L2/E2 12
13 /G/F1/L2/E3 229
14 /G/F2/L1/E1 10

For each row on source_200k will generate entries on parameter_assoc.
Example:

rn=1
insert into parameter_assoc with /G and sum(value) group by g, i
insert into parameter_assoc with /G/F1 and sum(value) group by g, f, i
insert into parameter_assoc with /G/F1/L1 and sum(value) group by g, f, l, i
insert into parameter_assoc with /G/F1/L1/E1 and value

rn=2
insert into parameter_assoc with /G/F1/L1/E2 and value

rn=3
insert into parameter_assoc with /G/F1/L1/E2 and value

rn=4
insert into parameter_assoc with /G/F1/L2 and sum(value) group by g, f, l, i
insert into parameter_assoc with /G/F1/L2/E1 and value

and so on ...

Well, from this I thought this approach -- but it takes 30 minutes.
*/

insert /*+ append */ all
when ( rn_g = 1 ) then
into parameter_assoc ( ind_id, ele_id, value )
values ( ind_id, pk_g, sum_g )
when ( rn_f = 1 ) then
into parameter_assoc ( ind_id, ele_id, value )
values ( ind_id, pk_f, sum_f )
when ( rn_l = 1 ) then
into parameter_assoc ( ind_id, ele_id, value )
values ( ind_id, pk_l, sum_l )
when ( rn_e = 1 ) then
into parameter_assoc ( ind_id, ele_id, value )
values ( ind_id, pk_e, value )
select row_number() over ( partition by g, i order by g, i) rn_g,
row_number() over ( partition by g, f, i order by g, f, i) rn_f,
row_number() over ( partition by g, f, l, i order by g, f, l, i ) rn_l,
row_number() over ( partition by g, f, l, e, i order by g, f, l, e, i ) rn_e,
sum(value) over ( partition by g, i ) sum_g,
sum(value) over ( partition by g, f, i ) sum_f,
sum(value) over ( partition by g, f, l, i ) sum_l,
value,
( select id from element_13k where name_ele = g ) pk_g,
( select id from element_13k where name_ele = f ) pk_f,
( select id from element_13k where name_ele = l ) pk_l,
indicator.id ind_id, element_13k.id pk_e
from source_200k, element_13k, indicator
where name_ele = e
and name_ind = i
/

select par_id, name_ele, name_ind, value
from parameter_assoc pa,
indicator i , element_13k e
where i.id = pa.ind_id
and e.id = pa.ele_id
/


Tom Kyte
May 15, 2004 - 12:19 pm UTC

sorry, just assumed for some reason that:

from t_200k, hierarchy_13k
where name = e
/

was "name was a column", "e was an input"


instead of using scalar subqueries:

( select pk from hierarchy_13k where name = g ) pk_g,
( select pk from hierarchy_13k where name = f ) pk_f,
( select pk from hierarchy_13k where name = l ) pk_l


(outer) join to pick up g, f and l.

Child having several parents

Laurent Medioni, June 23, 2004 - 11:42 am UTC

I have a similar problem (Kim's second one) but bigger...

I have a simple "tree" table like:
CREATE TABLE computing_tree
(id_dad NUMBER(12) NOT NULL,
id_son NUMBER(12) NOT NULL)

Except that a child element can have several parent elements like:
INSERT INTO computing_tree (id_dad, id_son) VALUES (0, 1);
INSERT INTO computing_tree (id_dad, id_son) VALUES (0, 2);
INSERT INTO computing_tree (id_dad, id_son) VALUES (1, 4);
INSERT INTO computing_tree (id_dad, id_son) VALUES (1, 3);
INSERT INTO computing_tree (id_dad, id_son) VALUES (2, 3);
INSERT INTO computing_tree (id_dad, id_son) VALUES (3, 4);
So 3 has 2 parents, so do 4.

A query like:
SELECT id_dad, id_son
FROM computing_tree
START WITH id_dad = 0
CONNECT BY PRIOR id_son = id_dad
will logicaly return:
ID_DAD ID_SON
0 1
1 3
3 4
1 4
0 2
2 3
3 4

Next step is: I am not interested in having several time the same "link" between 2 elements and a useful info would be the "deepest" level of an element (the "tree" holds references between sub-parts of a computing, the 0 root beeing the final result, I want to know what sub-parts I have to execute and in what order...) using a query like:
SELECT id_son, max(level)
FROM computing_tree
START WITH id_dad = 0
CONNECT BY PRIOR id_son = id_dad
GROUP BY id_son
ORDER BY 2 DESC
Expected result is:
ID_SON MAX(LEVEL)
4 3
3 2
1 1
2 1
(meaning I have to compute 4, then 3, then 1 and 2, then 0 should be computed OK)

The fact that the CONNECT BY returns the whole combinations of path is negligeable on this example.
BUT on my production table holding 1110 (fuzzy!) references between 645 distinct elements, the CONNECT BY returns more than 2 500 000 lines (in less than 50ms, wouawww...). Then the GROUP BY take... 16 seconds to deal with the 2 millions lines and bring them down to my 645 elements...

Can you think of another and quicker way of doing this ?

Thanks a lot.
Laurent

Tom Kyte
June 23, 2004 - 12:53 pm UTC

I don't think i have enough of the details to formulate a complete answer here, but, if this goes "fast" as you say

ops$tkyte@ORA9IR2> SELECT id_dad, id_son
  2    FROM computing_tree
  3   START WITH id_dad = 0
  4  CONNECT BY PRIOR id_son = id_dad
  5  /
 
    ID_DAD     ID_SON
---------- ----------
         0          1
         1          4
         1          3
         3          4
         0          2
         2          3
         3          4
 
7 rows selected.


I would add this to it:

 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id_son,
  2         (select max(level)
  3                from computing_tree t2
  4                   start with t2.id_son = t1.id_son
  5             connect by prior t2.id_dad = t2.id_son ) max_level
  6    from (
  7  SELECT distinct id_son
  8    FROM computing_tree
  9   START WITH id_dad = 0
 10  CONNECT BY PRIOR id_son = id_dad
 11         ) t1
 12  /
 
    ID_SON  MAX_LEVEL
---------- ----------
         1          1
         2          1
         3          2
         4          3
 

Child having several parents (2)

Laurent, June 25, 2004 - 4:49 am UTC

Nice try but I do not have any "noticeable" performance improvement between the GROUP BY or the DISTINCT on the real production table (maybe 1s less with the DISTINCT but still more than 10s for the whole thing).
Think I will do that in a Java client function instead if there is no other option...

Do you know if the CONNECT BY is planned to be modified in order to handle "graphs" like mine and not plain vanilla "trees" (i.e. be able to specify that you do not want the CONNECT BY to crawl several times the same sub-path) ???

Thanks a lot for your prompt help.
Cheers,
Laurent

Tom Kyte
June 25, 2004 - 2:21 pm UTC

i'd be truly interested in hearing how java will do this better -- seems you still have to compute the max level regardless.

Interesting problem, Laurent...

Kim Berg Hansen, June 28, 2004 - 3:43 am UTC

Hi, Laurent.

I'd hazard a guess that the 50ms you say the CONNECT BY takes to return you 2.5 million rows is for the first row?
The problem seems to be that it does take time to trawl through all possible paths.

But as the actual amount of records in the table is comparatively smaller - maybe it will help to reverse the logic? Take Tom's solution, but do not waste time doing the "big" connect by first... Seems like practically all your sons are children of id_dad=0 at some point, so why not simply use Tom's solution to select the max level of ALL id_sons :

1 select * from
2 (
3 select id_son,
4 (select max(level)
5 from computing_tree t2
6 where t2.id_dad = 0
7 start with t2.id_son = t1.id_son
8 connect by prior t2.id_dad = t2.id_son ) max_level
9 from (
10 SELECT distinct id_son
11 FROM computing_tree
12 ) t1
13 )
14 where max_level is not null
15* order by max_level desc

ID_SON MAX_LEVEL
---------- ----------
4 3
3 2
1 1
2 1


Given that practically all id_son's are children of id_dad=0, then the overhead of calculating max_level for the few id_son's who are not children of id_dad=0 might be neglible?

It might not be faster, as the paths of the graph might be just as many when viewed "in reverse", but then again the paths might be considerably fewer that way?


...Just an idea to try out...


keep id

Marcio, July 08, 2004 - 1:24 pm UTC

I have a process to build a tree from file. This is a example.

file: tree_07JUL04.txt

name dad path
----- ----- -----
a null a
b a a/b
c a a/c

Tree:

id name dad_id Version
-- ---- ------ ---------
1 a NULL 07-JUL-04
2 b 1 07-JUL-04
3 c 1 07-JUL-04

a
/ \
b c

and you have to update this tree with another version

file: tree_08JUL04.txt

name dad path
----- ----- -----
a null a
b a a/b
c a a/c
d b a/b/d

Tree

id name dad_id Version
-- ---- ------ ---------
1 a NULL 07-JUL-04
2 b 1 07-JUL-04
3 c 1 07-JUL-04
1 a NULL 08-JUL-04
2 b 1 08-JUL-04
3 c 1 08-JUL-04
4 d 2 08-JUL-04

a
/ \
b c
|
d

Problem:
~~~~~~~~
I would like keep old ids and get a new to new lines on tree.

DDL

-- To simulate my environment, will put date inside table to pretend 2 version of file.

drop table file_export;

create table file_export (
version date,
name varchar2(10),
dad varchar2(10),
path varchar2(10)
);

insert into file_export values ( trunc(sysdate)-1, 'a', null, 'a' );
insert into file_export values ( trunc(sysdate)-1, 'b', 'a', 'a/b' );
insert into file_export values ( trunc(sysdate)-1, 'c', 'a', 'a/c' );

-- and second version

insert into file_export values ( trunc(sysdate), 'a', null, 'a' );
insert into file_export values ( trunc(sysdate), 'b', 'a', 'a/b' );
insert into file_export values ( trunc(sysdate), 'c', 'a', 'a/c' );
insert into file_export values ( trunc(sysdate), 'd', 'b', 'a/b/d' );

-- And tree table from 1st version.

drop table tree;
create table tree ( id number, name varchar2(5), dad_id number, version date, constraint tree_pk primary key ( id, version ) );
insert into tree values ( 1, 'a', null, trunc(sysdate)-1);
insert into tree values ( 2, 'b', 1, trunc(sysdate)-1);
insert into tree values ( 3, 'c', 1, trunc(sysdate)-1);




Tom Kyte
July 08, 2004 - 2:41 pm UTC

you'll need to add a column to file_export -- update it and then you can insert, eg:

ops$tkyte@ORA9IR2> alter table file_export add name_id number;
 
Table altered.

ops$tkyte@ORA9IR2> update file_export set name_id = nvl( (select id from tree where name = file_export.name), s.nextval );
 
3 rows updated.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tree( id, name, dad_id, version )
  2  select name_id, name, (select name_id from file_export where name = fe2.dad), version
  3    from file_export fe2
  4  /
 
3 rows created.
 
ops$tkyte@ORA9IR2> select * from tree;
 
        ID NAME                               DAD_ID VERSION
---------- ------------------------------ ---------- ---------
         1 a                                         07-JUL-04
         2 b                                       1 07-JUL-04
         3 c                                       1 07-JUL-04

that gets the first three in, then:
ops$tkyte@ORA9IR2> update file_export set name_id = nvl( (select id from tree where name = file_export.name), s.nextval );
 
4 rows updated.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tree( id, name, dad_id, version )
  2  select name_id, name, (select name_id from file_export where name = fe2.dad), version
  3    from file_export fe2
  4  /
 
4 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from tree;
 
        ID NAME                               DAD_ID VERSION
---------- ------------------------------ ---------- ---------
         1 a                                         07-JUL-04
         2 b                                       1 07-JUL-04
         3 c                                       1 07-JUL-04
         1 a                                         08-JUL-04
         2 b                                       1 08-JUL-04
         3 c                                       1 08-JUL-04
         7 d                                       2 08-JUL-04
 
7 rows selected.

 

performance problem with hierarchy query

mike, November 11, 2004 - 11:29 pm UTC

Tom,
i have a hierarchy query with performance problem:
SELECT TASK_ID
FROM APPS.PA_TASKS
WHERE PROJECT_ID = :b1
CONNECT BY PARENT_TASK_ID = PRIOR TASK_ID
START WITH TASK_NUMBER
LIKE :b2

the excution plan :
QUERY_PLAN
--------------------------------------------
FILTER
CONNECT BY
INDEX RANGE SCAN PA_TASKS_N11
TABLE ACCESS BY USER ROWID PA_TASKS
TABLE ACCESS BY INDEX ROWID PA_TASKS
INDEX RANGE SCAN PA_TASKS_N4


the index PA_TASKS_N11 is on TASK_NUMBER and the index PA_TASKS_N4 on PARENT_TASK_ID.
I would like to force to use index on PROJECT_ID, becuase it's an unique index.
Put the index hint and does not work. do not understand the reason.

SELECT /*+ index(apps.pa_tasks PA_TASKS_U2) */
TASK_ID
FROM APPS.PA_TASKS
WHERE PROJECT_ID = 2675
CONNECT BY PARENT_TASK_ID = PRIOR TASK_ID
START WITH TASK_NUMBER
LIKE 'A%'
/

QUERY_PLAN
------------------------------------------------------------
FILTER
CONNECT BY
INDEX RANGE SCAN PA_TASKS_N11
TABLE ACCESS BY USER ROWID PA_TASKS
TABLE ACCESS BY INDEX ROWID PA_TASKS
INDEX RANGE SCAN PA_TASKS_N4

6 rows selected.

Thanks a lot


Tom Kyte
November 12, 2004 - 7:09 am UTC

but that would change the answer.....


in a connect by -- we:

a) build the hierarchy
b) APPLY THE WHERE CLAUSE to that hierarchy


so, if you used the project id one first -- you would not have built the hierarchy correctly


so, you cannot use the "index" on project_id


ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> variable b1 number
ops$tkyte@ORA9IR2> variable b2 varchar2(5)
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec :b1 := 55; :b2 := 'x'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select * from t;
 
   TASK_ID PROJECT_ID PARENT_TASK_ID TASK_
---------- ---------- -------------- -----
         1         42                x
         2         55              1 y
         3         42              2 y
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT TASK_ID
  2  FROM t
  3  WHERE PROJECT_ID = :b1
  4  CONNECT BY PARENT_TASK_ID = PRIOR  TASK_ID
  5  START WITH TASK_NUMBER
  6  LIKE  :b2
  7  /
 
   TASK_ID
----------
         2
 
ops$tkyte@ORA9IR2> SELECT TASK_ID
  2  FROM (select * from t WHERE PROJECT_ID = :b1) t
  3  CONNECT BY PARENT_TASK_ID = PRIOR  TASK_ID
  4  START WITH TASK_NUMBER
  5  LIKE  :b2
  6  /
 
no rows selected
 

<b>see, "connect by with where" is very very different from "where and then connect by"

so, the filter on project id happens AFTER the hierarchy is built.

if that is not what you wanted -- the where on project id should be in the connect by itself

"connect by project_id = :b1 and parent_task_id = prior task_id"

or something like that -- but that would be an entirely DIFFERENT query.</b>
 

Rollup of Hierarchy

Vinnie, July 18, 2005 - 4:33 pm UTC

Tom,

I have the following:

ID NAME PARENT_ID TOT
-- ---- --------- ---
1 A 1
2 B 1 5
3 C 2 10
4 D 2 10

I would like to be able to create a hierarchy rollup report such that if I select ID=2 I get:
2 B 25

If I select ID=1 I get:
1 A 26

Thanks again.
Vinnie

Tom Kyte
July 18, 2005 - 5:07 pm UTC

looks like it'd be easy with a scalar subquery using a connect by and sum


select id, name, (select sum(tot)
from t t2
start with id = t1.id
connect by prior id = parent_id)
from t t1
where id = :x;


but having no create table/inserts makes it hard to be sure :) Everytime I don't actually test out the thought - the wrong answers pop out.

Again

Vinnie, July 19, 2005 - 10:18 am UTC

I have tested that & it works fine...but what if I wanted to sum more than 1 column?

Alex, August 17, 2005 - 3:37 pm UTC

Tom,

I was wondering if Oracle handles circular relationships between records? I was told start with and connect by only work until they hit the end of the chain, and won't work if the last record relates back to the first. So if I had something like:

a
/ \
b c
/ \ \
e f g
\
h

So If I had to make a change to a, and that effected b, and that in turn effected f, but f related back to a, does Oracle have any way of dealing with circular problems like this? I was told pl/sql can't accomplish business logic like this, and as a result we are doing all kinds of nasty, messing logic in Java. Thanks, looking forward to the new book.

Tom Kyte
August 17, 2005 - 5:13 pm UTC

in 10g yes, there is a NOCYLCE option on connect by -- and a flag psuedo column that you can use to see if the current row would have started an infinite loop.

Alex, August 18, 2005 - 10:10 am UTC

I'm using 9i. Can you have a stored procedure that calls it's self? Or have procedures that call each other in a circular fashion? e.g. proc a calls proc b that calls a again?

Tom Kyte
August 18, 2005 - 4:23 pm UTC

yes, it is called "recursion"

just make sure you have a way out of it :)

Eliminating certain nodes on tree

Rahul, September 27, 2005 - 1:12 pm UTC

Tom,

I am using 9iR2 and have a question regarding sys_connect_by_path. Here is the test case I set up:

create table emp ( emp_no number, mgr_no number, emp_typ varchar2(10) )
/

insert into emp values (1, NULL,'R1');
insert into emp values (2, 1,'D1');
insert into emp values (3, 2,'R2');
insert into emp values (4, 2,'R2');
insert into emp values (5, 1,'R1');
insert into emp values (6, 5,'R1');
insert into emp values (7, 6,'R1');
insert into emp values (8, 5,'D3');
insert into emp values (9, 5,'R1');

commit;

The tree now is:

1(R1)
/ \
/ \
/ \
2(D1) 5(R2)
/\ / | \
/ \ / | \
/ \ / | \
3(R2) 4 (R2) 6(R1) 8(D3) 9(R1)
|
|
|
7(R1)

SELECT lpad('+ ',level*3,'---') || emp_no xx
from emp
start with emp_no = 1
connect by mgr_no = prior emp_no
/

XX
---------------------------
-+ 1
----+ 2
-------+ 3
-------+ 4
----+ 5
-------+ 6
----------+ 7
-------+ 8
-------+ 9

I need to keep only specific emp_typ (say R1, R2) i.e. the tree should look like :

1(R1)
/ | \
/ | \
/ | \
3(R2) 4 (R2) 5(R2)
/ \
/ \
/ \
6(R1) 9(R1)
|
|
|
7(R1)

Here is the query I am using:

SELECT lpad('+ ',new_level*3,'---') || emp_no xx
from (SELECT LEVEL + (LEVEL - length(sys_connect_by_path(Decode(emp_typ,'R1',NULL, 'R2',NULL, '~' ),'/' )
)
) new_level,
emp.*
from emp
start with emp_no = 1
connect by mgr_no = prior emp_no) inner
where inner.emp_typ IN ('R1','R2')
/

XX
------------------------------------
-+ 1
----+ 3
----+ 4
----+ 5
-------+ 6
----------+ 7
-------+ 9

Is there a simpler way to do this ? One disadvantage I have with the above is that, every time somebody changes the query by
adding another emp_typ to the outer where clause (say R3), the same has to be added to the decode. Could be overlooked.
The R1,R2,R3 are for the test case only. They have no specific pattern.



Tom Kyte
September 27, 2005 - 1:54 pm UTC


comments are great for making sure things don't "get overlooked"

A stored procedure that returns a ref cursor would be too?



Thanks

A reader, September 27, 2005 - 2:10 pm UTC

Thanks for your response Tom :).

Was just trying to do it in SQL. Considering that our "Non-Oracle" developers will want to tinker with this, will go the stored procedure way. With lots of comments of course.

Hierarchical query - children repeated per cumulative factor

Adele d.b., May 30, 2006 - 9:25 am UTC

Tom,

Considering the original example data provided, and simplified as follows:
SQL> select * from itemsets;

PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL                    2
COMP_WHEEL           WHEEL                         1
WHEEL                SPOKE                         4

How can I create a sql query that will provide the output as follows:
PARENTID             CHILDID                
-------------------- -------------------- 
  BIKE               FRAME                 
  BIKE               COMP_WHEEL             
    COMP_WHEEL       WHEEL                   
      WHEEL          SPOKE                      
      WHEEL          SPOKE                   
      WHEEL          SPOKE                   
      WHEEL          SPOKE                                
    COMP_WHEEL       WHEEL                    
      WHEEL          SPOKE                      
      WHEEL          SPOKE                   
      WHEEL          SPOKE                   
      WHEEL          SPOKE   

What I need is the child to be repeated (factor) number of times and in the order shown above - is it possible?  I am on DB ver 9.2.0.5. 

Tom Kyte
May 30, 2006 - 9:41 am UTC

i don't get why comp_wheel/wheel is repeated twice, but bike/comp_wheel is there once - when it would seem to be the opposite. given that wheel/spoke is there 4 times.

Not saying that it can be done if you explain, just that "i don't get it" as it stands.

Followup

Adele d.b., May 31, 2006 - 2:12 am UTC

Tom, 
Sorry for wasting your time and not being clear enough.  From a table like this:
PARENTID             CHILDID                  FACTOR
-------------------- -------------------- ----------
BIKE                 FRAME                         1
BIKE                 COMP_WHEEL                    2
COMP_WHEEL           WHEEL                         1
WHEEL                SPOKE                         4

I need to achieve a 'tree-like' view of the above data as follows:

  BIKE               
      FRAME                 
      COMP_WHEEL             
          WHEEL                   
               SPOKE                      
               SPOKE                   
               SPOKE                   
               SPOKE                                
      COMP_WHEEL                    
           WHEEL          
               SPOKE                      
               SPOKE                   
               SPOKE                   
               SPOKE  
I need each child to be repeated 'factor'-number of times. 
What I have tried is this:
SQL> set serveroutput on
SQL> declare
  2  t_ch varchar2(100);
  3  t_f  number(2);
  4  
  5  cursor cr is select lpad('  ',3*(level-1))||childid, factor
  6     from temp
  7     connect by prior childid = parentid
  8     start with parentid='BIKE';
  9  
 10  begin
 11  
 12  open cr;
 13  loop
 14  fetch cr into t_ch, t_f;
 15   exit when cr%notfound;
 16    for i in 1..t_f loop
 17     dbms_output.put_line(t_ch);
 18    end loop;
 19  end loop;
 20  close cr;
 21  end;
 22  /
I get this:
FRAME
COMP_WHEEL
COMP_WHEEL
WHEEL
SPOKE
SPOKE
SPOKE
SPOKE
But I need it like this:
FRAME
COMP_WHEEL
WHEEL
SPOKE
SPOKE
SPOKE
SPOKE
COMP_WHEEL
WHEEL
SPOKE
SPOKE
SPOKE
SPOKE

Please forgive my asking such a 'basic' question, but I simply do not know how to do it, and I am not a sql-genius.  I am using this in a Forms procedure, and the data selected by the cursor is to be inserted into a temporary table (on which a report is to be based), from the temp table I need to do some further updates once the complete 'structure' of the tree is built. 

Tom Kyte
May 31, 2006 - 9:54 am UTC

I've never asked anyone to be a "super genius"

I've not tried to imply this is a 'basic question' (don't know where that is coming from)

Trust me, this is not "basic" :)

create table t ( p varchar2(10), c varchar2(10), factor number );

insert into t values ( 'BIKE','FRAME', 1 );
insert into t values ( 'BIKE','COMP_WHEEL', 2 );
insert into t values ( 'BIKE','SEAT', 1 );
insert into t values ( 'BIKE','HANDLEBARS', 1 );
insert into t values ( 'COMP_WHEEL','WHEEL', 1 );
insert into t values ( 'WHEEL','SPOKE', 4 );
insert into t values ( 'SPOKE', 'NUTS', 3 );
insert into t values ( 'NUTS', 'BOLTS', 1 );


select * from t;

with some_rows
as
(
select level l
  from dual connect by level <= 100
),
new_hierarchy
as
(
select p, ceil((l-0.1)/factor) grp, c, factor, tot, l
  from ( select factor, p, c, level lvl,
               (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
          from t T1
         start with p = 'BIKE'
       connect by prior c = p ) data,
       some_rows
 where some_rows.l <= data.tot
)
select 2 oc, rownum, rpad('*',2*level,'*') || c txt
  from new_hierarchy
 start with p = 'BIKE'
connect by prior c = p and prior grp = grp
union all
select 1 oc, 0, 'BIKE' from dual
order by 1, 2
/


I added some rows to test other conditions...


ops$tkyte@ORA10GR2> with some_rows
  2  as
  3  (
  4  select level l
  5    from dual connect by level <= 100
  6  ),
  7  new_hierarchy
  8  as
  9  (
 10  select p, ceil((l-0.1)/factor) grp, c, factor, tot, l
 11    from ( select factor, p, c, level lvl,
 12                 (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
 13            from t T1
 14           start with p = 'BIKE'
 15         connect by prior c = p ) data,
 16             some_rows
 17   where some_rows.l <= data.tot
 18  )
 19  select 2 oc, rownum, rpad('*',2*level,'*') || c txt
 20    from new_hierarchy
 21   start with p = 'BIKE'
 22  connect by prior c = p and prior grp = grp
 23  union all
 24  select 1 oc, 0, 'BIKE' from dual
 25  order by 1, 2
 26  /

        OC     ROWNUM TXT
---------- ---------- ------------------------------
         1          0 BIKE
         2          1 **FRAME
         2          2 **COMP_WHEEL
         2          3 ****WHEEL
         2          4 ******SPOKE
         2          5 ********NUTS
...
         2         29 **********BOLTS
         2         30 ********NUTS
         2         31 **********BOLTS
         2         32 **SEAT
         2         33 **HANDLEBARS
         2         34 **COMP_WHEEL
         2         35 ****WHEEL
         2         36 ******SPOKE
...
         2         58 ********NUTS
         2         59 **********BOLTS
         2         60 ********NUTS
         2         61 **********BOLTS
         2         62 ********NUTS
         2         63 **********BOLTS

64 rows selected.


Now that I put that one out there, we'll just sit back and watch how many "alternative solutions" get thrown out there :) 

Hierarchical query - children repeated per cumulative factor

Adele d.B, June 01, 2006 - 7:23 am UTC

WOW.
First, I just absolutely HAVE to tell you that I very much appreciate your sense of humour - it makes my day every time I consult AskTom.
Second, regarding the 'basic question' thing, what I meant was that I expected the problem to be solve-able in a short piece of code (in typical Tom-style), and not in the very-long very-complex (wrong) ways I normally tend to think. Guess I was wrong in this case.

I have tried your solution, but since I am still on db. ver 9.2.0.5, I suspect I've run into bug 2820371 (have logged a SR on Metalink):
ERROR at line 5:
ORA-00604: error occurred at recursive SQL level 1
ORA-00904: "T1"."P": invalid identifier
As a side issue, do you perhaps know of a workaround for this problem? The workaround stated in Metalink (Use /*+ INLINE */ hint to workaround the problem) does not seem to work.

To return to your solution - You have lost me on the first line already. Could you please explain the 'with some_rows as'-part? I have never seen this kind of statement before. What exactly does it do, and in what kind of context/situation is it intended to be used, and why does it not start with a 'select ...'?

Here's what I (think) I understand from your solution:
1. this part gets executed first:
select level l from dual connect by level <= 100
2. then this part:
new_hierarchy as ( select ....
3. then this:
select 2 oc, rownum, rpad('*',2*level,'*') || c txt from new_hierarchy
4. and lastly this to also include the root (the Bike):
union all select 1 oc, 0, 'BIKE' from dual
Right?

But now (you've probably guessed it) I am baffled by the main part:
new_hierarchy
as
(
select p, ceil((l-0.1)/factor) grp, c, factor, tot, l
from ( select factor, p, c, level lvl,
(select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid
= t1.rowid connect by prior p = c) tot
from t T1
start with p = 'BIKE'
connect by prior c = p ) data,
some_rows
where some_rows.l <= data.tot
)
Would you mind explaining it step-by-step so that I can understand it and your reasoning behind it?

Regards
Adéle

PS: Must say that I am also sitting back and waiting for the "alternative solutions" to appear (as well as your comments on it). :)
When do we get the chance to welcome you in South Africa?

Tom Kyte
June 01, 2006 - 10:55 am UTC

workaround:

ops$tkyte@ORA9IR2> with some_rows
  2  as
  3  (
  4  select level l
  5    from dual connect by level <= 100
  6  )
  7  select 2 oc, rownum, rpad('*',2*level,'*') || c txt
  8    from
  9    (
 10  select p, ceil((l-0.1)/factor) grp, c, factor, tot, l
 11    from ( select factor, p, c, level lvl,
 12                 (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
 13            from t T1
 14           start with p = 'BIKE'
 15         connect by prior c = p ) data,
 16         some_rows
 17   where some_rows.l <= data.tot
 18   ) new_hierarchy
 19   start with p = 'BIKE'
 20  connect by prior c = p and prior grp = grp
 21  union all
 22  select 1 oc, 0, 'BIKE' from dual
 23  order by 1, 2
 24  /


some_rows - need a set of numbered rows (from 1..N) where N is greater than or equal to the maximum "total number of elements we need to produce".  If you need two WHEELS, but the input set has only ONE wheel, we need to output that row twice.


This part of the query:

  from ( select factor, p, c, level lvl,
               (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
          from t T1
         start with p = 'BIKE'
       connect by prior c = p ) data,


simply builds the tree and using the scalar subquery determines the TOTAL number of outputs we need - if there is:

1 bike
with 2 wheels
and 4 spokes per wheel - we need 4*2*1 = 8 spokes all together.

the exp(sum(ln()) does the multiplication for us....  so, if you just run that query:


ops$tkyte@ORA9IR2> select factor, p, c, level lvl,
  2         (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
  3   from t T1
  4   start with p = 'BIKE'
  5  connect by prior c = p
  6  /

    FACTOR P          C                 LVL        TOT
---------- ---------- ---------- ---------- ----------
         1 BIKE       FRAME               1          1
         2 BIKE       COMP_WHEEL          1          2
         1 COMP_WHEEL WHEEL               2          2
         4 WHEEL      SPOKE               3          8
         3 SPOKE      NUTS                4         24
         1 NUTS       BOLTS               5         24
         1 BIKE       SEAT                1          1
         1 BIKE       HANDLEBARS          1          1

8 rows selected.


It just does the "math".


Now, we do a non-equi join with "some_rows" to output each row TOT times:

  (
select p, ceil((l-0.1)/factor) grp, c, factor, tot, l
  from ( select factor, p, c, level lvl,
               (select round(exp(sum(ln(factor)))) from t t2 start with t2.rowid = t1.rowid connect by prior p = c) tot
          from t T1
         start with p = 'BIKE'
       connect by prior c = p ) data,<b>
       some_rows
 where some_rows.l <= data.tot</b>
 ) new_hierarchy


Now, the "tricky part" is that while we now have 8 spokes - we really need TWO GROUPS of FOUR spokes (one for each wheel right...), that is the "grp" column.

I basically "replicated the hierarchies" FACTOR times and made the "key" of each hierarchy be "C,GRP" instead of just "C"

QED :)

 

Alternative

Padders, June 01, 2006 - 11:18 am UTC

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> CREATE OR REPLACE TYPE varchar2_table AS TABLE OF VARCHAR2 (4000);
  2  /

Type created.

SQL> CREATE OR REPLACE FUNCTION many (
  2     n IN NUMBER)
  3     RETURN varchar2_table
  4  IS
  5     m varchar2_table := 
  6        varchar2_table ();
  7  BEGIN
  8     m.EXTEND (n);
  9     RETURN m;
 10  END;
 11  /

Function created.

SQL> VARIABLE parentid VARCHAR2 (10);
SQL> EXEC :parentid := 'BIKE';

PL/SQL procedure successfully completed.

SQL> SELECT LPAD (' ', 2 * (LEVEL - 1)) || p p, c
  2  FROM   t, TABLE (many (t.factor)) 
  3  START WITH p = :parentid
  4  CONNECT BY PRIOR c = p;

P               C
--------------- ---------------
BIKE            FRAME
BIKE            COMP_WHEEL
  COMP_WHEEL    WHEEL
    WHEEL       SPOKE
    WHEEL       SPOKE
    WHEEL       SPOKE
    WHEEL       SPOKE
BIKE            COMP_WHEEL
  COMP_WHEEL    WHEEL
    WHEEL       SPOKE
    WHEEL       SPOKE
    WHEEL       SPOKE
    WHEEL       SPOKE

13 rows selected.

SQL> SELECT :parentid c
  2  FROM   dual
  3  UNION ALL
  4  SELECT LPAD (' ', 2 * (LEVEL)) || c
  5  FROM   t, TABLE (many (t.factor)) 
  6  START WITH p = :parentid
  7  CONNECT BY PRIOR c = p;

C
---------------
BIKE
  FRAME
  COMP_WHEEL
    WHEEL
      SPOKE
      SPOKE
      SPOKE
      SPOKE
  COMP_WHEEL
    WHEEL
      SPOKE
      SPOKE
      SPOKE
      SPOKE

14 rows selected.

SQL>  

Tom Kyte
June 01, 2006 - 11:37 am UTC

Ok, now that is cool :)

but you better put an order by on that last query please - you are relying on a UNION ALL to do something it doesn't have to do.




Agree

Padders, June 01, 2006 - 12:08 pm UTC

Absolutely - of course if the example contained a row for 'BIKE' itself then the need for UNION ALL would be moot.

SQL> INSERT INTO t VALUES (NULL, 'BIKE', 1);

1 row created.

SQL> VARIABLE child VARCHAR2 (10);
SQL> EXEC :child := 'BIKE';

PL/SQL procedure successfully completed.

SQL> SELECT LPAD (' ', 2 * (LEVEL - 1)) || c c
  2  FROM   t, TABLE (many (t.factor)) 
  3  START WITH c = :child
  4  CONNECT BY PRIOR c = p;

C
---------------
BIKE
  FRAME
  COMP_WHEEL
    WHEEL
      SPOKE
      SPOKE
      SPOKE
      SPOKE
  COMP_WHEEL
    WHEEL
      SPOKE
      SPOKE
      SPOKE
      SPOKE

14 rows selected.

SQL>  

Very Very grateful for your help

Adéle d.B., June 02, 2006 - 1:45 am UTC

Thanks very much, guys. These are some very cool solutions that I will definitely implement. Thanks also for your explanations, Tom, I understand it now.

Regards
Adéle

Sum on hierarchy

Chris, August 08, 2006 - 1:56 pm UTC

Tom,
What is the best way to sum across the results of a hierarchical query. For example, I have a table with the following:

XFrom XTo XCost
----- ----- -----
A B 1
B C 5
C Z 3
C D 5
C A 1 (used just to weed out cycles)
D Z 3

If I want to get from A to Z my two paths are:

A-B-C-Z with a total cost of 9
A-B-C-D-Z with a total cost of 14

I can get the path with the following:

create table mytab (xfrom varchar2(5), xto varchar2(5), xcost number);

insert into mytab values ('A','B',1);
insert into mytab values ('B','C',5);
insert into mytab values ('C','Z',3);
insert into mytab values ('C','D',5);
insert into mytab values ('C','A',1);
insert into mytab values ('D','Z',3);

select distinct path
from (select 'A' || sys_connect_by_path( XTo, '-' ) path,
XTo
from mytab
start with XFrom = 'A'
connect by nocycle prior XTo = XFrom
and XTo != 'A')
where XTo = 'Z';

My question is how I sum up the "XCost" column for each path. The first option that came to my mind was to use sys_connect_by_path with "+" as the delimeter and then evaluate the resulting expression, but that would require writing and calling a function that I fear will perform poorly. I must be overlooking a simple solution?

Tom Kyte
August 09, 2006 - 9:58 am UTC

is 'Z' somehow special here.

You say "i want to get from A to Z" - are A and Z inputs into a more generalize query?



Best way?

Chris, August 08, 2006 - 2:34 pm UTC

Okay, after trying many things I believe this will do it...

select distinct path, sum_cost
from (select 'A' || sys_connect_by_path( XTo, '-' ) path, Xto,
(select sum(t2.XCost)
from mytab t2
start with t2.xfrom = t1.xfrom and t2.xto = t1.xto
connect by nocycle prior t2.xfrom = t2.xto
and t2.xto != 'A') sum_cost
from mytab t1
start with t1.XFrom = 'A'
connect by nocycle prior t1.XTo = t1.XFrom
and t1.XTo != 'A')
where XTo = 'Z'

Is this the best / most efficient way to do this?

Sum on hierarchy

Chris, August 09, 2006 - 11:18 am UTC

Tom,
A and Z are basically "from" and "to" pairs. I simplified the true example a little, but it is representative of the general idea. I have a building with a bunch of points where paths meet. Given any starting and ending point I need to identify all paths that can be taken (excluding loops) and the associated cost. You could think of it as sidewalks that connect street corners. All of my data points are the corners. Given any two corners/intersections, I want to find all paths I can take to get there and the total distance.

Having said all that, the last solution I proposed above does not work when one of the points along the way can be arrived at from another point not on your path. So I'm back to the drawing board again. An example of why my previous solution doesn't work is shown below by adding the last insert.

create table mytab (xfrom varchar2(5), xto varchar2(5), xcost number);

insert into mytab values ('A','B',1);
insert into mytab values ('B','C',5);
insert into mytab values ('C','Z',3);
insert into mytab values ('C','D',5);
insert into mytab values ('C','A',1);
insert into mytab values ('D','Z',3);
insert into mytab values ('H','B',500);

select distinct path, sum_cost
from (select 'A' || sys_connect_by_path( XTo, '-' ) path,
Xto,
(select sum(t2.XCost)
from mytab t2
start with t2.xfrom = t1.xfrom and t2.xto = t1.xto
connect by nocycle prior t2.xfrom = t2.xto
and t2.xto != 'A') sum_cost
from mytab t1
start with t1.XFrom = 'A'
connect by nocycle prior t1.XTo = t1.XFrom
and t1.XTo != 'A')
where XTo = 'Z'

PATH SUM_COST
---------- --------
A-B-C-D-Z 514
A-B-C-Z 509

You can see that because I added a path from H to B that ended up in my inner query. I essentially want a way to do something like:

select distinct path, XCost
from (select 'A' || sys_connect_by_path( XTo, '-' ) path,
XTo,
substr(sys_connect_by_path( XCost, '+' ),2) XCost
from mytab t1
start with t1.XFrom = 'A'
connect by nocycle prior t1.XTo = t1.XFrom
and t1.XTo != 'A')
where XTo = 'Z'

PATH XCOST
---------- --------
A-B-C-D-Z 1+5+5+3
A-B-C-Z 1+5+3

and then I want to evaluate XCost as an expression. I'm just afraid that writing a function to evaluate the expression (using execute immediate) will be a performance hinderance. Based on the way the query works it finds all paths from a given starting point to all leafs. I then use the outer query to limit it to those ending with the appropriate point. With my actual data the inner query will generate tens of thousands of paths, but very quickly due to the speed of the start with connect by. The outer query that limits to the appropriate leaf node will reduce it to just a handful of rows. I do not want to execute the function on each of the rows from the inner query.

Actually I guess I could nest all of this in one more outer query to ensure that function was only executed on the small number of resulting rows, but it just feels like there ought to be a way to sum values from each node along the way. Any thoughts?

Tom Kyte
August 09, 2006 - 12:42 pm UTC

this might be considered a tad obscure - but if the paths are unique, it'll work (if the paths are not unique - eg: a->b->c->z could appear twice, distinct it first)

ops$tkyte%ORA10GR2> variable startp varchar2(5)
ops$tkyte%ORA10GR2> variable endp   varchar2(5)
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :startp := 'A'; :endp := 'Z';
 
PL/SQL procedure successfully completed.
 

ops$tkyte%ORA10GR2> with data1
  2  as
  3  (
  4  select :startp || sys_connect_by_path(xto,',') scbp1,
  5         sys_connect_by_path(xcost,',') || ',' scbp2,
  6             level l
  7    from mytab
  8   where xto = :endp
  9   start with xfrom = :startp
 10  connect by nocycle prior xto = xfrom and prior xto <> :endp
 11  ),
 12  data2
 13  as
 14  (
 15  select level r
 16    from dual
 17  connect by level <= ( select max(l) max_level from data1 )
 18  )
 19  select scbp1, sum(cost)
 20    from (
 21  select scbp1, scbp2, data2.r,
 22         to_number( substr( scbp2, instr(scbp2,',',1,data2.r)+1,
 23                instr(scbp2,',',1,data2.r+1)-instr(scbp2,',',1,data2.r)-1 ) ) cost
 24    from data1, data2
 25   where data2.r <= data1.l
 26         )
 27   group by scbp1
 28  /
 
SCBP1            SUM(COST)
--------------- ----------
A,B,C,Z                  9
A,B,C,D,Z               14
 

Suggestions...

padders, August 09, 2006 - 11:30 am UTC

Two approaches spring to mind, the first is a blatant (albeit smart) 'cheat' courtesy Laurent Schneider which uses a combination of SYS_CONNECT_BY_PATH and LENGTH to derive the cumulative sum, obviously this is restricted by maximum length of VARCHAR2 in SQL. The second approach is to just join back to the same table and GROUP.

Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> SELECT path, dist
  2  FROM  (SELECT 'A' || SYS_CONNECT_BY_PATH (xto, '-') PATH, xto,
  3                LENGTH (REPLACE (SYS_CONNECT_BY_PATH (
  4                   RPAD (' ', xcost), '-'),'-')) dist
  5         FROM  mytab
  6         START WITH xfrom = 'A'
  7         CONNECT BY NOCYCLE PRIOR xto = xfrom AND xto != 'A')
  8  WHERE  xto = 'Z';

PATH                       DIST
-------------------- ----------
A-B-C-D-Z                    14
A-B-C-Z                       9

SQL> SELECT a.path, SUM (b.xcost) dist
  2  FROM  (SELECT 'A' || SYS_CONNECT_BY_PATH (xto, '-') PATH, xto,
  3                SYS_CONNECT_BY_PATH ('{' || xfrom || '-' || xto || '}', ',') path2
  4         FROM  mytab
  5         START WITH xfrom = 'A'
  6         CONNECT BY NOCYCLE PRIOR xto = xfrom AND xto != 'A') a, mytab b
  7  WHERE  INSTR (a.path2, '{' || b.xfrom || '-' || b.xto || '}') > 0
  8  AND    a.xto = 'Z'
  9  GROUP BY path;

PATH                       DIST
-------------------- ----------
A-B-C-Z                       9
A-B-C-D-Z                    14

SQL>  

Tom Kyte
August 09, 2006 - 12:53 pm UTC

oh man.

that length(replace(sys_connect_by_path( is neat.

hurt my head for a second, but a very nice trick - for positive numbers

very cool

Chris, August 09, 2006 - 12:56 pm UTC

Now that is extremely cool.

I love the ingenuity from that first solution. Unfortunately, my numbers will be too large for that to work. Not only that, but I fear some extremely poor performance due to the string operations and the fact that they will be performed on every possible path from A, not just those ending at Z. Keep in mind that I will have tens of thousands of results from the inner query before the predicate on the outer query limits the result set to a handful of rows.

The second solution is also very cool. Now I'm wondering how to tell if that is better than writing a function to evaluate the result of the expression as I suggested. Not sure how to determine this. I can run autotrace on the statement you gave and get the following:

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
645 consistent gets
0 physical reads
0 redo size
289 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
64 sorts (memory)
0 sorts (disk)
2 rows processed

However, if I create a function that really just does computation in an "execute immediate" that does a "select 1+3+5' from dual" will that performance hit really show up in an autotrace, I'm guessing not.

Any ideas on the best way to tell which will perform better?

looks intersting but I'm now wondering whether it will perform better than the option of writing a function

Tom Kyte
August 09, 2006 - 4:15 pm UTC

benchmark it. tkprof them against each other on your data.

Using eval function...

Chris, August 09, 2006 - 1:26 pm UTC

To followup. Here would be a solution that used a function to evaluate the expression.

create or replace function myeval(p_exp IN VARCHAR2) return number IS
l_result number;
begin
execute immediate 'select ' || p_exp || ' from dual' into l_result;
return l_result;
end;

select distinct path, myeval(XCost) dist
from (select 'A' || sys_connect_by_path( XTo, '-' ) path,
XTo,
substr(sys_connect_by_path( XCost, '+' ),2) XCost
from mytab t1
start with t1.XFrom = 'A'
connect by nocycle prior t1.XTo = t1.XFrom
and t1.XTo != 'A')
where XTo = 'Z'

PATH DIST
-------------------- ----------
A-B-C-Z 9
A-B-C-D-Z 14

Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
90 consistent gets
0 physical reads
0 redo size
300 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
10 sorts (memory)
0 sorts (disk)
2 rows processed

A big difference on the number of gets because I eliminated the join. How do I account for the time that would be added for calling the function and performing the computation?

Also, will the execute immediate statement buried in my function be hard parsed every time? I'm assuming so. Can this function be written to use bind variables even though the expression will be different every time? Assuming it can, and you show me how, can you provide the method by which I can see the difference in the number of parses if it were executed many times.

Thanks.

Tom Kyte
August 09, 2006 - 4:47 pm UTC

tkprof it against your actual data!

Question about Laurent's solution

A reader, August 09, 2006 - 3:44 pm UTC

As there is no sum, I guess the length is doing aggregation. This works with positive integers only (no negatives, no floats)

Laurent's solution

Chris, August 09, 2006 - 4:18 pm UTC

It's not really aggregation, it is simply (but very cleverly) using the sys_connect_by_path function to create a string with X number of spaces for each path, where X is the value/cost read from the table with each path (set of spaces) delimited by a '-'. Then the replace function is used to remove all of the delimiters, and the length of the remaining string is your answer.

Hence, negative numbers will not work because you can't create a string with a negative length. Floating point numbers will not work either for the same reason.

creative solution

A reader, August 09, 2006 - 5:39 pm UTC

Thank you, Chris for confirming my doubts. Then, I don't really see Laurent's hack as a very creative solution. Sure we know how to count numbers in system base 1, but this is not really what the hierarchical weighted total query is about.

Here is where standard SQL solution shines:

with TCAssembly as (
select Part, SubPart, Quantity AS factoredQuantity
from AssemblyEdges
where Part = ‘Bicycle’
union all
select te.Part, e.SubPart, e.Quantity * te.factoredQuantity
from TCAssembly te, AssemblyEdges e
where te.SubPart = e.Part
) select SubPart, sum(Quantity) from TCAssembly
group by SubPart

It does 2 aggregations at once: multiplying weights along the paths, and adding paths.

Alas, database vendors nowadays are more interested pursuing XML fad, rather than improving SQL.

Tom Kyte
August 09, 2006 - 7:48 pm UTC

I disagree, Laurent's length(replace...) "hack" (hack in the original sense of the word - a good, positive thing) is pretty cool.

Definitely creative. Absolutely creative.



Example please

Chris, August 10, 2006 - 1:51 pm UTC

I'm not sure I follow your example. You are referencing TCAssembly from within the "with TCAssembly as" which is not permitted. I provided a script with the create table and insert statements, can you either:

- provide your example using my tables, or
- provide a create table, insert statement, and sql statement that works

I'd love to understand what you were doing there, and possibly benchmark it.

connect by level

Chris, August 10, 2006 - 2:28 pm UTC

Tom,
Your "tad obscure" solution is very impressive/cool. Made my head hurt a little. I understand exactly what you're doing but there is one piece I had a question about. It's the:

select level r
from dual
connect by level <= (select max(l) max_level from data1)

I'll just take 4 as the result of the subquery so we end up with:

select level r from dual connect by level <= 4

It looks like a substitute for the old "select rownum from all_objects where rownum <= x" to generate a resultset of numbers 1 through x.

However, it only works up to 25 in my database. I'm assuming there must be some limit to the levels of recursion the database will perform. Is this a hard-coded value or specified in the init.ora?


Vertical Hierarchy built up

Dev, October 03, 2006 - 5:57 am UTC

Tom,

I have a table that contains the data in the Flat hierarchy like tier0,tier1....
Can you provide the query to generate the Vertical Hiearchy .The data shld be in fashion of
Parent-> Current Value-> All possible childs.

ta


Tom Kyte
October 03, 2006 - 6:34 am UTC

hahaha, this is a good one.

here is one for you with as much detail as you give us:

my car will not start, why not? It is white by the way.

A reader, October 03, 2006 - 6:47 am UTC

Here is some example for you to make understand better.

Existing table t1 (tieri,tier2,tier3....)
values
('State','Country','City1'...)
('State','Country','City2'...)

To achive the data in table t2 (current,parent,child) shld have values from t1 as
NULL,'State','Country'
NULL,'State','City1'
NULL,'State','City2'
'State','Country','City1'
'State','Country','City2'
'Country','City1','City1'
'Country','City2','City2'

Can you provide the query to transform the data from t1 into t2.
The important thing here is that each value in t1 has to be treated as current value and its parent(definitely single) and all the possible childs have to be found.It will be more clear if you can assume(visualise) all the rows in t1 in form of a tree.

ta



Tom Kyte
October 03, 2006 - 7:05 am UTC

how about you provide, in the form of a specification, precisely what this all means exactly.

I don't get "null, 'state', 'country'" - or why that makes sense, or how that is a "hierarchy"

and what happens when state varies
and country varies

and what happens when the same city is in the same state but a different country, of what possible use is that data.

Sorry - but no, this does not make sense. Don't draw "a picture", explain in painstaking detail the "problem" - like you would to a programmer tasked with coding this.




Dev, October 03, 2006 - 7:22 am UTC

Tom, I know the req is little complex and hard to type in and on the other hand for you to understand,but let me try all over again.
The data in flat Hierarchy looks like,this can be treated as the source table over here.
Table flat_tab
tier1 tier2 tier3 tier4
Asia China Shanghai Sh
Asia China NanJing Nj
Asia India Delhi Dl
Europe UK London Ln

tier1 : Region
tier2 : Country
tier3 : State
tier4 : City

Req : Need to build the table vertical_tab,that contains the current value its parent and all possible child regardless of the level of the hierarchy.The data for the first row shld look like.( This needs to be repeated for all teh rows present in the flat_tab)
Table vertical_tab
Parent Current Child
NULL Asia China
NULL Asia Shanghai
NULL Asia Sh
Asia China Shanghai
Asia China Sh
China Sh Sh

PS : If I take the tier1(topmost level or root) as current value the parent has to be NULL.
Also when I take tier4(last level or leaf) as current value the child remains same as current value.

If it is clear pls provide if you can frame a query to achieve this

Tom Kyte
October 03, 2006 - 10:37 am UTC

I'll do it as a tree, but not the funky way with nulls in the leading columns, that to me "just does not compute", it is not a hierarchy at all.

IF that suffices, so that you have something like:

Asia
China
Shanghai Sh
NanJing Nj
India
Dehli
Europe
UK
London
....


post the create and inserts with meaningful data (your asia/europe).

BOM query

Mette, October 03, 2006 - 9:24 am UTC

Hi Tom and others here

Given this:

create table parts (name varchar2(10), description varchar2(100));

create table products (name varchar2(10), description varchar2(100));

create table parts_parts (name varchar2(10), number_of number, consists_of varchar2(10));


insert into products values ('car','A nice german car');

insert into products values ('bike','A danish bike');

insert into parts values ('wheel',null);

insert into parts values ('roof',null);

insert into parts values ('bolt',null);

insert into parts values ('spoke',null);

insert into parts values ('seatbelt',null);

insert into parts values ('tyre',null);

insert into parts values ('seat',null);

insert into parts values ('steer', null);

insert into parts_parts values ('car',4, 'wheel');

insert into parts_parts values ('car',1, 'roof');

insert into parts_parts values ('car',14, 'bolt');

insert into parts_parts values ('wheel',1, 'tyre');

insert into parts_parts values ('wheel',10, 'spoke');

insert into parts_parts values ('wheel',6, 'bolt');

insert into parts_parts values ('car',4, 'seat');

insert into parts_parts values ('seat',1, 'seatbelt');

insert into parts_parts values ('seat',6, 'bolt');

insert into parts_parts values ('seatbelt',2,'bolt');

insert into parts_parts values ('bike',2,'wheel');

insert into parts_parts values ('bike',10, 'bolt');

insert into parts_parts values ('bike',1, 'steer');

-----------------

Can you help me with a query to the stock mgr to go fetch the parts for 2 cars and one bike.

Pick from stock:

roof 2
wheel 10
bolt 162 (hope I counted correctly)
spoke 100
I woul like both just the total number of parts listed, and the BOM tree like:

Car\Roof
Car\Wheel\bolt
Car\Wheel\tyre
etc...

I will garantee that there are no loops (cyclic refs).

regards
Mette

PS I'm currently on 9.2 - but have a local 10.2







Tom Kyte
October 03, 2006 - 10:47 am UTC

did you read through and try to apply the ORIGINAL answer?? your question looks fairly similar?

Bike/Car/Car etc.

Padders, October 03, 2006 - 10:40 am UTC

Tom I hope you don't mind me continuing.

Mette it kind of depends on how you will provide the order contents, i.e. the fact that the order is for 2 cars and 1 bike. Here I represent it as a nested table VARCHAR2_TABLE ('bike', 'car', 'car').

Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> CREATE TABLE parts_parts (
  2     name VARCHAR2 (10), 
  3     number_of NUMBER, 
  4     consists_of VARCHAR2(10));

Table created.

SQL> INSERT INTO parts_parts VALUES ('car', 4, 'wheel');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('car', 1, 'roof');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('car', 14, 'bolt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('wheel', 1, 'tyre');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('wheel', 10, 'spoke');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('wheel', 6, 'bolt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('car', 4, 'seat');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('seat', 1, 'seatbelt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('seat', 6, 'bolt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('seatbelt', 2, 'bolt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('bike', 2, 'wheel');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('bike', 10, 'bolt');

1 row created.

SQL> INSERT INTO parts_parts VALUES ('bike', 1, 'steer');

1 row created.

SQL> CREATE OR REPLACE FUNCTION many (
  2     n IN NUMBER)
  3     RETURN VARCHAR2_TABLE
  4  IS
  5     m VARCHAR2_TABLE :=
  6        VARCHAR2_TABLE ();
  7  BEGIN
  8     m.EXTEND (n);
  9     RETURN m;
 10  END;
 11  /

Function created.

SQL> CREATE OR REPLACE TYPE varchar2_table AS TABLE OF VARCHAR2 (4000);
  2  /

Type created.

SQL> SET PAGESIZE 1000;
SQL> COLUMN name FORMAT A30;
SQL> COLUMN consists_of_path FORMAT A40;
SQL> SELECT a.consists_of, COUNT (*)
  2  FROM  (SELECT CONNECT_BY_ROOT (name) name, consists_of
  3         FROM   parts_parts pp, 
  4                TABLE (many (pp.number_of)) 
  5         START WITH NAME MEMBER OF 
  6                 VARCHAR2_TABLE ('bike', 'car', 'car')
  7         CONNECT BY PRIOR consists_of = name) a, 
  8               (SELECT ROWNUM order_item, column_value name
  9                FROM   TABLE (VARCHAR2_TABLE ('bike', 'car', 'car'))) b
 10         WHERE  b.name = a.name
 11  GROUP BY a.consists_of
 12  ORDER BY a.consists_of;

CONSISTS_O   COUNT(*)
---------- ----------
bolt              162
roof                2
seat                8
seatbelt            8
spoke             100
steer               1
tyre               10
wheel              10

8 rows selected.

SQL> COLUMN name FORMAT A30;
SQL> SELECT b.order_item, b.name, a.consists_of, COUNT (*)
  2  FROM  (SELECT CONNECT_BY_ROOT (name) name, consists_of
  3         FROM   parts_parts pp, 
  4                TABLE (many (pp.number_of)) 
  5         START WITH NAME MEMBER OF 
  6                 VARCHAR2_TABLE ('bike', 'car', 'car')
  7         CONNECT BY PRIOR consists_of = name) a, 
  8               (SELECT ROWNUM order_item, column_value name
  9                FROM   TABLE (VARCHAR2_TABLE ('bike', 'car', 'car'))) b
 10         WHERE  b.name = a.name
 11  GROUP BY b.order_item, b.name, a.consists_of
 12  ORDER BY b.order_item, b.name, a.consists_of;

ORDER_ITEM NAME                           CONSISTS_O   COUNT(*)
---------- ------------------------------ ---------- ----------
         1 bike                           bolt               22
         1 bike                           spoke              20
         1 bike                           steer               1
         1 bike                           tyre                2
         1 bike                           wheel               2
         2 car                            bolt               70
         2 car                            roof                1
         2 car                            seat                4
         2 car                            seatbelt            4
         2 car                            spoke              40
         2 car                            tyre                4
         2 car                            wheel               4
         3 car                            bolt               70
         3 car                            roof                1
         3 car                            seat                4
         3 car                            seatbelt            4
         3 car                            spoke              40
         3 car                            tyre                4
         3 car                            wheel               4

19 rows selected.

SQL> COLUMN consists_of_path FORMAT A40;
SQL> SELECT b.order_item, b.name || a.consists_of_path consists_of_path
  2  FROM  (SELECT CONNECT_BY_ROOT (name) name, consists_of,
  3                SYS_CONNECT_BY_PATH (consists_of, '/') consists_of_path
  4         FROM   parts_parts pp, 
  5                TABLE (many (pp.number_of)) 
  6         START WITH NAME MEMBER OF 
  7                 VARCHAR2_TABLE ('bike', 'car', 'car')
  8         CONNECT BY PRIOR consists_of = name) a, 
  9               (SELECT ROWNUM order_item, column_value name
 10                FROM   TABLE (VARCHAR2_TABLE ('bike', 'car', 'car'))) b
 11         WHERE  b.name = a.name
 12  ORDER BY b.order_item, consists_of_path;

ORDER_ITEM CONSISTS_OF_PATH
---------- ----------------------------------------
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/bolt
         1 bike/steer
         1 bike/wheel
         1 bike/wheel
         1 bike/wheel/bolt
         1 bike/wheel/bolt
         1 bike/wheel/bolt
         1 bike/wheel/bolt
         1 bike/wheel/bolt
         1 bike/wheel/bolt
         1 bike/wheel/bolt
<snip>
         3 car/wheel/spoke
         3 car/wheel/spoke
         3 car/wheel/spoke
         3 car/wheel/spoke
         3 car/wheel/tyre
         3 car/wheel/tyre
         3 car/wheel/tyre
         3 car/wheel/tyre

301 rows selected.

SQL>  

Thank's a lot

Mette, October 03, 2006 - 11:42 am UTC

Could you give me an example of how to do it with just 1 car (not using the table)?

And is this possible i V9 as well? (I tried but it complains about MEMBER OF) - so I tried to do it just for one item with no succes due to me not knowing all that hieracical suff

regards
Mette

See above

Padders, October 03, 2006 - 12:01 pm UTC

Solution for single vehicle is already above (as Tom hinted), see reply to question by Adele d.b.

Got it !

Mette, October 03, 2006 - 12:10 pm UTC

select name, count(*) antal, consists_of dimser from(
SELECT name, consists_of
FROM parts_parts pp,
TABLE (many (pp.number_of))
START WITH name = 'bike'
CONNECT BY PRIOR consists_of = name)
group by name, consists_of
order by name, consists_of;

I just had to split the query into bits .... then it's easier to understand :-)

I had problems with the connect by root thingy (read the docu - dit not get it anyway) - i'll give it a go later.

But thank you very much

regards
Mette

PS I did read some of the prev. answers :-) - but it seemed to me that they did not have the thing with one sub_part being part of many parts (ie. bolt being a part of both wheels and car). And it was this thing that bothered me ...

Display only the top most parent

J B, October 06, 2006 - 2:34 am UTC

Hi Tom,

We just need to display the top most parent for any child record using SQL statement. Is that possible?

A->B->C->D

If B is the child of A and C being the Child of B and D being the Child of C

then we should see the results as

Child Parent
A NULL
B A
C A
D A

Thanks

Tom Kyte
October 06, 2006 - 8:58 am UTC

laughing out loud... come on.

no table, no inserts, no sample query.

"the results" - not very "useful" in answering anything.



Yes it is possible

Michel Cadot, October 06, 2006 - 9:25 am UTC


An other SQL solution for "Sum on hierarchy"

Frank Zhou, December 07, 2006 - 5:09 pm UTC

     Here is an other SQL solution for Chris's question "Sum on hierarchy" posted on August 08, 2006. This solution doesn't need to join back to the same table to compute the sum of the cost.


SQL> SELECT  path ,
  2          sum(to_number( SUBSTR(x,
  3                         INSTR(x, '-', 1, LEVEL  ) + 1,
  4                         INSTR(x, '-', 1, LEVEL+1) -
  5           INSTR (x, '-', 1, LEVEL) -1 ) )) sum_cost
  6   FROM
  7  (
  8   SELECT XFrom, XTo, Xcost , 'A' ||'-'|| sys_connect_by_path(XTo, '-') path
  9          ,sys_connect_by_path(Xcost, '-')||'-' x
 10     FROM mytab
 11    WHERE XTo = 'Z'
 12   START WITH XFrom = 'A'
 13  CONNECT BY NOCYCLE  PRIOR XTo = XFrom
 14                      AND XTo != 'A'
 15                      AND Prior XTo != 'Z'
 16  )
 17  CONNECT BY PRIOR path = path
 18  AND INSTR (x, '-', 1, LEVEL+1) > 0
 19  AND PRIOR dbms_random.string ('p', 10) IS NOT NULL
 20  GROUP BY  path;

PATH           SUM_COST                                                                 
-----------    ----------                                                                    
A--B-C-D-Z       14                                                                 
A--B-C-Z         9                                                                      
                                                                                
                                                                        
                                                                         
SQL> spool off 

Can this be done with CONNECT BY for better performance?

jc, January 16, 2007 - 2:46 pm UTC

Tom,

create table calls (id varchar2(15), seq# number, scn# number);

insert into calls values ('One-a',0,1000);
insert into calls values ('Three-a',0,1111);
insert into calls values ('Three-a',1,null);
insert into calls values ('Three-a',2,null);
insert into calls values ('Two-a',0,1200);
insert into calls values ('Two-a',1,null);
insert into calls values ('One-b',0,1300);
insert into calls values ('Two-b',0,1400);
insert into calls values ('Two-b',1,null);
insert into calls values ('Three-b',0,1500);
insert into calls values ('Three-b',1,null);
insert into calls values ('Three-b',2,null);
commit;
select * from calls;
ID SEQ# SCN#
--------------- ---------- ----------
One-a 0 1000
Three-a 0 1111
Three-a 1
Three-a 2
Two-a 0 1200
Two-a 1
One-b 0 1300
Two-b 0 1400
Two-b 1
Three-b 0 1500
Three-b 1
Three-b 2

I have this table with multiple seq# values for each id. Only the first seq# (seq=0) has the CSN#. I would like to count the IDs with SCN# > 1200:

with cid as
( select id from calls where scn# > 1200)
select a.id , a.seq#, a.scn#
from calls a, cid
where a.id = cid.id;

ID SEQ# SCN#
--------------- ---------- ----------
One-b 0 1300
Two-b 0 1400
Two-b 1
Three-b 0 1500
Three-b 1
Three-b 2

with cid as
( select id from calls where scn# > 1200 )
select count(a.id)
from calls a, cid
where a.id = cid.id;

COUNT(A.ID)
-----------
6

Do you have better way to get the count, say with CONNECT BY (level), or whatever?

Thanks,

To: jc

Michel Cadot, January 18, 2007 - 9:24 am UTC


SQL> with 
  2    data as (
  3      select id, seq#,
  4             max(scn#) over (partition by id order by seq#) scn#
  5      from calls
  6    )
  7  select count(*)
  8  from data
  9  where scn# > 1200
 10  /
  COUNT(*)
----------
         6

The max function propagates the scn# value to all rows of the same id.

Regards
Michel

Thanks!! Michel

A reader, January 19, 2007 - 8:52 pm UTC


RE: Hierarchical query - children repeated per cumulative factor

Frank Zhou, February 15, 2007 - 5:50 pm UTC


Here is an alternative SQL solution for Adele d.b 's question
"Hierarchical query - children repeated per cumulative factor" posted on May 31, 2006

This solution is simply a "plug in" of the following SQL Pattern

http://www.jlcomp.demon.co.uk/faq/mult_row.html
Frank

SQL> select * from t2;

P C FACTOR
---------- -------------------- ----------
BIKE FRAME 1
BIKE COMP_WHEEL 2
COMP_WHEEL WHEEL 1
WHEEL SPOKE 4

SQL>
SQL> SELECT 2 oc, rownum rn, LPAD ('*', 2 * (LEVEL), '*') || c as c
2 FROM (
3 with in_line as (
4 SELECT p, c, factor
5 FROM t2
6 CONNECT BY PRIOR c = c AND PRIOR p=p AND PRIOR factor = factor
7 AND LEVEL < ABS(factor) +1
8 AND PRIOR dbms_random.string ('a', 10) IS NOT NULL )
9 SELECT p, c, factor
10 FROM in_line
11 )
12 START WITH p = 'BIKE'
13 CONNECT BY PRIOR c = p
14 union all
15 select 1 oc, 0, 'BIKE' c from dual
16 order by 1, 2;

OC RN C
---------- ---------- --------------------
1 0 BIKE
2 1 **COMP_WHEEL
2 2 ****WHEEL
2 3 ******SPOKE
2 4 ******SPOKE
2 5 ******SPOKE
2 6 ******SPOKE
2 7 **COMP_WHEEL
2 8 ****WHEEL
2 9 ******SPOKE
2 10 ******SPOKE
2 11 ******SPOKE
2 12 ******SPOKE
2 13 **FRAME

14 rows selected.

SQL>
SQL> spool off;

Doubt

sibghat, August 07, 2007 - 12:31 am UTC

SELECT  b.order_item,  b.name, a.consists_of, COUNT (*)
    FROM  (SELECT CONNECT_BY_ROOT (name) name, consists_of
           FROM   parts_parts pp, 
                  TABLE (many (pp.number_of)) 
           START WITH NAME MEMBER OF 
                   VARCHAR2_TABLE ('bike', 'car', 'car')
           CONNECT BY PRIOR consists_of = name) a, 
                 (SELECT ROWNUM order_item, column_value name
                  FROM   TABLE (VARCHAR2_TABLE ('bike', 'car', 'car'))) b
          WHERE  b.name = a.name
   GROUP BY b.order_item, b.name, a.consists_of
   ORDER BY b.order_item, b.name, a.consists_of;



In the above query why only 'bike', 'car', 'car' is selected, and what TABLE (many (pp.number_of) will do, can you please explain me..
Thanks a ton.
Tom Kyte
August 07, 2007 - 11:39 am UTC

why only bike car car?

because that is what is typed in???


just cut and paste in:

SELECT ROWNUM order_item, column_value name
FROM TABLE (VARCHAR2_TABLE ('bike', 'car', 'car'))


and run it, you'll see what it does (makes a set of

1 bike
2 car
3 car

)....

branch pruning

James Su, August 14, 2008 - 10:44 pm UTC

hi Tom,

I would like to ask about branch pruning when walking through a graph.

CREATE TABLE my_table (
id NUMBER primary key,
start_node VARCHAR2(10),
end_node VARCHAR2(10),
cost NUMBER(10,2)
);

question 1: how can I prune the branch when a node is already in the path (nocycle doesn't prune all of them because id may be different):

SELECT id, start_node, end_node, cost, SYS_CONNECT_BY_PATH(id,'/') as path_id
FROM my_table
CONNECT BY NOCYCLE PRIOR end_node = start_node
AND INSTR(SYS_CONNECT_BY_PATH(start_node,'/'),start_node)=0 --- doesn't work here
START WITH start_node = 'A';

question 2: how can I prune the branch when the total cost exceeds the budget (I can do it with subquery but that's not efficient)

SELECT id, start_node, end_node, cost, SYS_CONNECT_BY_PATH(id,'/') as path_id
FROM my_table
CONNECT BY NOCYCLE PRIOR end_node = start_node
AND SYS_SUM_BY_PATH(cost) <= :v_budget --- fake function made up by me
START WITH start_node = 'A';


question 3: is there a way to implement a user defined SYS_CONNECT_BY_PATH like function? I know we can use SYS_CONNECT_BY_PATH then parse it, but that doesn't look nice.

pseudo code:
SELECT id, start_node, end_node, cost, SYS_CONNECT_BY_PATH(id,'/') as path_id,
NVL(PRIOR total_cost,0) + cost as total_cost ---- doesn't work, made up by me
FROM my_table
CONNECT BY NOCYCLE PRIOR end_node = start_node
AND total_cost <= :v_budget --- fake function made up by me
START WITH start_node = 'A';

restriction on connect_by_root

James Su, August 20, 2008 - 11:52 am UTC

Dear Tom,

I have one more question about connect_by_root. The document says:

Restriction on CONNECT_BY_ROOT You cannot specify this operator in the START WITH condition or the CONNECT BY condition.

But according to my test case, CONNECT_BY_ROOT is allowed in CONNECT BY condition. Can you clarify that?

Thank you.
Tom Kyte
August 20, 2008 - 12:46 pm UTC

where do you have it in a connect by

connect_by_root in connect by

James Su, August 20, 2008 - 1:56 pm UTC

Dear Tom,

example:

CREATE TABLE my_table (
id NUMBER primary key,
start_node VARCHAR2(10),
end_node VARCHAR2(10),
cost NUMBER(10,2)
);

INSERT INTO my_table values (1,'A','B',1);
INSERT INTO my_table values (2,'B','C',1);
INSERT INTO my_table values (3,'A','C',1);
INSERT INTO my_table values (4,'C','Z',1);

SELECT *
FROM my_table
WHERE end_node = 'Z'
CONNECT BY NOCYCLE PRIOR end_node = start_node
AND CONNECT_BY_ROOT start_node <> end_node;

It works.

Can you please answer my questions above (on Aug 14)? Thank you!
Tom Kyte
August 21, 2008 - 8:20 am UTC

looks like a documentation bug.


#1 from above, did not understand the question.

#2 the subquery works - and might even be "more efficient" than this trick:

ops$tkyte%ORA10GR2> select nm, deptno, sal, scbp,
  2        (select sum(column_value)
  3           from TABLE( cast( multiset(
  4         ( select trim( substr (scbp, instr (scbp, ',', 1, level  ) + 1,
  5                  instr (scbp, ',', 1, level+1) - instr (scbp, ',', 1, level) -1 ) )
  6             from dual
  7           connect by level <= length(scbp)-length(replace(scbp,',',''))-1) ) as sys.odcinumberlist ) )
  8        ) sum_sal
  9  from (
 10  select lpad('*',2*level,'*') || ename nm, deptno, sal,
 11         sys_connect_by_path( sal, ',' ) ||',' scbp
 12    from scott.emp
 13  start with mgr is null
 14  connect by prior empno = mgr
 15       )
 16  /

NM                 DEPTNO        SAL SCBP                      SUM_SAL
-------------- ---------- ---------- ---------------------- ----------
**KING                 10       5000 ,5000,                       5000
****JONES              20       2975 ,5000,2975,                  7975
******SCOTT            20       3000 ,5000,2975,3000,            10975
********ADAMS          20       1100 ,5000,2975,3000,1100,       12075
******FORD             20       3000 ,5000,2975,3000,            10975
********SMITH          20        800 ,5000,2975,3000,800,        11775
****BLAKE              30       2850 ,5000,2850,                  7850
******ALLEN            30       1600 ,5000,2850,1600,             9450
******WARD             30       1250 ,5000,2850,1250,             9100
******MARTIN           30       1250 ,5000,2850,1250,             9100
******TURNER           30       1500 ,5000,2850,1500,             9350
******JAMES            30        950 ,5000,2850,950,              8800
****CLARK              10       2450 ,5000,2450,                  7450
******MILLER           10       1300 ,5000,2450,1300,             8750

14 rows selected.




#3 "but that doesn't look nice."

use a view, hide it if you don't like the way it looks... Keep it in SQL - pure SQL - if you want the most efficient path.

questions on connect by

James Su, August 21, 2008 - 11:53 am UTC

Dear Tom,

Let me explain more on question 1. This is trying to find a best route between two nodes. Because there are more than one records with the same start_node/end_node but different costs, NOCYCLE doesn't know a node is already in the path. For example, if we have different records between A and B, and I'm trying to go from A to Z, it may get me this answer:

A -> B -> C-> A -> B ...

I would like to avoid this (the second A node shouldn't be accessed). I tried adding this to the connect by:
INSTR(SYS_CONNECT_BY_PATH(start_node,'/'),start_node)=0
to stop it, but SYS_CONNECT_BY_PATH is not allowed in connect by.

My questions are all about connect by, I am trying to stop the traversing as early as possible and generate as little result as possible. If I put my predicate in the where clause, it works after all results are generated and thus not so efficient.

Thank you.
Tom Kyte
August 21, 2008 - 10:00 pm UTC

what do you mean by nocycle doesn't know a node is already there - if you hit what you just described, a-b-c-a-b.... that would be a connect by loop and nocycle would stop it.

not sure what you mean.

NOCYCLE works

James Su, August 22, 2008 - 9:55 am UTC

Tom, you are right, NOCYCLE works fine for that.
I found an issue with CONNECT_BY_ROOT when there's a "START WITH": (version 10.2.0.1.0)

SELECT *
FROM my_table
WHERE end_node = 'Z'
START WITH start_node IN ('A','B')
CONNECT BY NOCYCLE PRIOR end_node = start_node
AND CONNECT_BY_ROOT start_node <> end_node;

SELECT *
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel


ERROR:
ORA-03114: not connected to ORACLE

Tom Kyte
August 22, 2008 - 2:55 pm UTC

3113 = utilize support...

table function that you used

James Su, August 22, 2008 - 11:22 pm UTC

hi Tom,
I have a question about the table(cast(multiset
that you used in the SQL above. Why don't you just write:

(select sum(column_value)
from
( select trim( substr (scbp, instr (scbp, ',', 1, level ) + 1,
instr (scbp, ',', 1, level+1) - instr (scbp, ',', 1, level) -1 ) ) AS column_value
from dual
connect by level <= length(scbp)-length(replace(scbp,',',''))-1)
) sum_sal

Is it for better performance? Thank you.

Tom Kyte
August 26, 2008 - 7:52 pm UTC

it was, umm, to do what was needed? taking a string, add up the arbitrary number of numbers contained therein.

I had to turn the string into a set, so I could sum the set

implementation of sys_sum_by_path

James Su, January 06, 2009 - 2:01 pm UTC

Dear Tom,
I remember you said "never rely on the sequence of function call by sql" but somehow the return order of hierarchical queries is guaranteed by documentation. Can you please take a look at this implementation? Is it safe? Thank you!

create or replace package hierarchy is

type numtabletype1 is table of number index by binary_integer;
type numtabletype is table of numtabletype1 index by binary_integer;

numtable numtabletype;

function sys_sum_by_path(p_level in number, p_value in number,p_index IN NUMBER DEFAULT 1)
return number;

pragma restrict_references(sys_sum_by_path, wnds);

end;
/

create or replace package body hierarchy is

function sys_sum_by_path(p_level in number, p_value in number,p_index IN NUMBER DEFAULT 1)
return number
is
ln_ret number;
begin
numtable(p_index)(p_level) := p_value;
ln_ret := p_value;
for i in reverse 1 .. p_level - 1 loop
ln_ret := numtable(p_index)(i) + ln_ret;
end loop;
return ln_ret;
end;
end;
/


CREATE TABLE my_table (id NUMBER, parent_id NUMBER, amount NUMBER);

INSERT INTO my_table VALUES (1,NULL,1.1);
INSERT INTO my_table VALUES (2,1 ,2.1);
INSERT INTO my_table VALUES (3,1 ,3.1);
INSERT INTO my_table VALUES (4,2 ,4.1);

SELECT SYS_CONNECT_BY_PATH(id,'/')
,hierarchy.sys_sum_by_path(level,amount)
FROM my_table
START WITH id=1
CONNECT BY parent_id = prior id;


Tom Kyte
January 06, 2009 - 3:33 pm UTC

... but
somehow the return order of hierarchical queries is guaranteed by
documentation. ...

it is? where? unless you use order siblings by - it isn't "guaranteed"


Your code is a bug waiting to happen, it is not to be relied on. even with order siblings by - you cannot control when you are called - before, after, during the sort - or if you are only called once per row or whatever.

return rows in order

James Su, January 06, 2009 - 4:10 pm UTC

Dear Tom,

http://download.oracle.com/docs/cd/B10501_01/server.920/a96540/queries4a.htm

"6. Oracle returns the rows in the order shown in Figure 8-1."

It doesn't say functions in the select list will be called in this order (although my test cases did show that it's called in order).

It would be great if oracle can provide a way for users to create our own hierarchical functions, like the ones with aggregation (stragg).
Tom Kyte
January 07, 2009 - 9:09 am UTC

You took a snippet of that, fully it states:

...
Oracle returns the rows in the order shown in Figure 8-1. In the diagram, children appear below their parents. For an explanation of hierarchical trees, see Figure 2-1, "Hierarchical Tree".
...

what is the order of the children?


Let me ask you this in regards to your statement:

It doesn't say functions in the select list will be called in this order

I just now flipped a coin 50 times. Each of the 50 times it landed heads up. What can I conclude from this empirical observation? Can I conclude "this coin will land heads up every time I flip it"?

Empirical observation can be used to conclude "something is not always true" - if I can show you an example of something not behaving the way people say it "should" - then I have empirically proven it wrong. You cannot use an empirical test in this case to prove that it will always happen the way you currently observe.


For example - in older releases of Oracle, some people empirically observed "group by seems to sort, we don't need order by". It was accidentally true much of the time (it was a binary sort, not a character set sort, but in many cases the data did appear sorted). But it was *never true* that group by HAD TO sort (I've demonstrated group by not sorting in every release of Oracle since version 7.0). However, because they "observed this", they came to "rely on this" and when they upgraded to current releases - where group by almost never sorts - they were "hosed"

A simple change in plan - which could happen after statistics are gathered and the size/shape of the data changes - which could happen after any change in the database software - which could happen because of bind peeking - which could happen because it happens to be raining outside (not really but whatever) - will foil your well laid plan.


I don't know what a "hierarchical" function would be? Can you define what you mean by that?

hierarchical function

James Su, January 07, 2009 - 10:37 am UTC

Dear Tom,

"what is the order of the children? "
That is where "order siblings by" works. But in my case, I don't care about the order of the children.
What I want is the total cost of the path (from the root to the current node).
Total cost = cumulative cost up to the parent node + cost of current node

So, if my function is called in the order shown in Figure 8-1, then I am perfectly fine.

By "hierarchical function" I mean something like sys_connect_by_path. In this way we can define our own calculation, like sum_by_path, product_by_path, and even more complicated ones. Currently we have to parse sys_connect_by_path into a data set and run SQL on it.

Another great thing is we can use this "hierarchical function" in the connect by. For example, I want the SQL to stop traversing as early as possible, if the total cost already exceeds my budget.
Tom Kyte
January 07, 2009 - 2:56 pm UTC

... But in my case, I don't care about the
order of the children.
...

sure you do, how do you know we won't some day get all of the children of the parent AND THEN get the children of the first children and so on - plans change, optimizer methods improve, new features are added.

Anytime you say "I am dependent on the physical ordering of rows and Oracle invoking my function when I envision it should" - *STOP* - totally STOP

You have no control over it - none.


traversing order

James Su, January 07, 2009 - 3:57 pm UTC

"how do you know we won't some day get all of the children of the parent AND THEN ...."
If that happens, I would understand it as the change of internal processing of the sql engine. The output order is not likely to change. But I know the order of function call is not guaranteed, even the output order remains the same as Figure 8-1.

Unfortunately I have to give up my implementation and use the old trick to break the sys_connect_by_path into rows, and go back to procedural code when I need to get control of the levels while connect by doesn't give me what I want.

Thanks for your help.

Tree structure as columns

Sanji, September 23, 2009 - 3:35 pm UTC

Tom,

I'm struggling with a requirement wherein the hierarchy has to be displayed as columns and not rows.
We are on Oracle 9.2.0.7, HP-UX 11i.

create table hier1(sortord number, child varchar(13), parent varchar2(13))

insert into hier1 values(1,'BS1000',null);
insert into hier1 values(2,'100000','BS1000');
insert into hier1 values(3,'101009','100000');
insert into hier1 values(4,'102000','100000');
insert into hier1 values(5,'102100','102000');
insert into hier1 values(6,'102119','102100');
insert into hier1 values(7,'102129','102100');
insert into hier1 values(8,'102139','102100');
insert into hier1 values(9,'102149','102100');
insert into hier1 values(10,'102200','102000');
insert into hier1 values(11,'102218','102200');
insert into hier1 values(12,'102218000','102218');
insert into hier1 values(13,'102218001','102218');

The output format should be

SORTORD CHILD SORTORD_1 PARENT1 SORTORD_2 PARENT2 SORTORD_3 PARENT3 SORTORD_4 PARENT4 SORTORD_5 PARENT5
3 101009 2 100000 1 BS1000
6 102119 5 102100 4 102000 2 100000 1 BS1000
7 102129 5 102100 4 102000 2 100000 1 BS1000
8 102139 5 102100 4 102000 2 100000 1 BS1000
9 102149 5 102100 4 102000 2 100000 1 BS1000
13 102218001 11 102218 10 102200 4 102000 2 100000 1 BS1000
14 102218002 11 102218 10 102200 4 102000 2 100000 1 BS1000

Appreciate any insight...
Thanks
Sanji
Tom Kyte
September 28, 2009 - 2:30 pm UTC

you'll need to explain this output - I don't see how to take this hierarchy:

ops$tkyte%ORA9IR2> select lpad('*',2*level,'*') || child c
  2    from hier1
  3  start with parent is null
  4  connect by prior child = parent
  5  /

C
-------------------------------------------------------------------------------
**BS1000
****100000
******101009
******102000
********102100
**********102119
**********102129
**********102139
**********102149
********102200
**********102218
************102218000
************102218001

13 rows selected.


and flatten it, you'll need to describe the RULES OF PROCESSING THE DATA. Tell me what logic you used to figure out how many rows you got and what would be on each one.


You know, specifications.

Deriving list of hierarchies

Sanji, September 24, 2009 - 8:51 am UTC

Tom, i looked at a review from
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1154964870586
( Deriving a list of hierarchies June 9, 2004 - 3pm Central time zone Bookmark | Bottom | TopReviewer: Steven Healey from Saint Louis, Missouri USA)
and applied the same logic to my query (the previous review "Tree structure as columns)

OPEN:SANJI:PFIN@LAWSON1>select distinct
2 substr( path, instr( path, '/', 1, 4 )+1, instr( path, '/', 1, 5 )-instr(path, '/', 1, 4 )-1 ) parent_4,
3 substr( path, instr( path, '/', 1, 3 )+1, instr( path, '/', 1, 4 )-instr(path, '/', 1, 3 )-1 ) parent_3,
4 substr( path, instr( path, '/', 1, 2 )+1, instr( path, '/', 1, 3 )-instr(path, '/', 1, 2 )-1 ) parent_2,
5 substr( path, instr( path, '/', 1, 1 )+1, instr( path, '/', 1, 2 )-instr(path, '/', 1, 1 )-1 ) parent_1
6 from (
7 select child,
8 sys_connect_by_path( child, '/' ) || '/' path
9 from hier1
10 start with parent is null
11 connect by prior child = parent
12 )
13 /

PARENT_4 PARENT_3 PARENT_2 PARENT_1
---------- ---------- ---------- ----------
102100 102000 100000 BS1000
102200 102000 100000 BS1000
101009 100000 BS1000
102000 100000 BS1000
100000 BS1000
BS1000

Is there anyway we can do this without using sys_connect_by_path ?

Thanks
Sanji
Tom Kyte
September 29, 2009 - 7:56 am UTC

why? what is wrong with the implementation?

Hierarchial query with cumulative factor

Pedro, November 24, 2009 - 10:06 am UTC

Hi Tom,

I have a simplified cumulative factor question with an additional twist.
The requirements are:
1. Limit the hierarchies to certain leave's paths (KING being root, "subjects" being leaves)
2. Summarize cumulative salaries from "subjects" to KING (path_sum_sal)
3. ORDER the output from "subjects" to KING
The output should look just like the product of the provided query, the problem is I am pretty much sure it is not the optimal way to do it.
SELECT
SUBSTR(SYS_CONNECT_BY_PATH(empno, '/' ), 1, 32) scbp_empno,
SUBSTR(LPAD('*',2*LEVEL,'*') || ename, 1, 32) nm,
deptno,
empno,
mgr ,
sal,
(
SELECT
SUM(sal)
FROM scott.emp T2
WHERE empno IN
(
SELECT
empno
FROM scott.emp
START WITH empno IN (7499, 7521) /*level=1 => subjects; level last KING*/
CONNECT BY empno = PRIOR mgr /*level=1 => subjects; level last KING*/
)
START WITH T2.empno = emp.empno
CONNECT BY mgr = PRIOR empno
) path_sum_sal,
SUBSTR(SYS_CONNECT_BY_PATH( sal, ',' ) ||',', 1, 32) scbp
FROM scott.emp
WHERE empno IN
(
SELECT
empno
FROM scott.emp
START WITH empno IN (7499, 7521) /*level=1 => subjects; level last KING*/
CONNECT BY empno = PRIOR mgr /*level=1 => subjects; level last KING*/
)
START WITH mgr IS NULL /*level=1 => KING; level>1 subjects*/
CONNECT BY mgr = PRIOR empno /*level=1 => KING; level>1 subjects*/
ORDER BY mgr, empno
;
returns:
SCBP_EMPNO NM DEPTNO EMPNO MGR SAL PATH_SUM_SAL SCBP
--------------- ----------- ------ ---------- ---------- ---------- ------------ ----------------
/7839/7698/7499 ******ALLEN 30 7499 7698 1600 1600 ,5000,2850,1600,
/7839/7698/7521 ******WARD 30 7521 7698 1250 1250 ,5000,2850,1250,
/7839/7698 ****BLAKE 30 7698 7839 2850 5700 ,5000,2850,
/7839 **KING 10 7839 5000 10700 ,5000,

4 rows selected.


Thank you in advance,

Pedro

Tom Kyte
November 24, 2009 - 11:56 am UTC

given that the hierarchy looks like:

NM
------------------------------
**KING
****JONES
******SCOTT
********ADAMS
******FORD
********SMITH
****BLAKE
******ALLEN
******WARD
******MARTIN
******TURNER
******JAMES
****CLARK
******MILLER

14 rows selected.



please tell me why only ALLEN, WARD, BLAKE and KING are in the output, I don't see the logic there.

Pedro, November 24, 2009 - 12:47 pm UTC

Appreciate the lightning fast response, I am going ape here traversing these hierarchies.
This is what customer wants. Out of the whole tree they want only the branches which leaves are empno IN (7499, 7521) which would be ALLEN and WARD. The query:
(
SELECT
empno
FROM scott.emp
START WITH empno IN (7499, 7521) /*level=1 => subjects; level last KING*/
CONNECT BY empno = PRIOR mgr /*level=1 => subjects; level last KING*/
)
returns all of the parents of ALLEN and WARD (this adds BLAKE and KING to the list). As you can see to achieve that I traversed the tree from bottom to top (top being KING). The problem with the above query according to me is that I cannot traverse the hierarchy in the bottom to top fashion to obtain the correct path_sum_sal. So, the above query provides me with all the empno's I will need, then I traverse the hierarchy "START WITH mgr IS NULL" and get the correct path_sum_sal.
I hope that explains what I am after.
No matter what it feels great, that I am no more alone with these hierarchies from hell.
Thanks again,

Pedro
Tom Kyte
November 24, 2009 - 2:14 pm UTC

ops$tkyte%ORA9IR2> select ename, empno, deptno, mgr, sal,
  2         to_number( substr( data, 1, 14 ) ) sumsal,
  3         substr( data, 16 ) path
  4    from (
  5  select ename, empno, deptno, mgr, sal,
  6        (select to_char( sum(sal), '9999999999.99')|| max(sys_connect_by_path( empno, '/' ))
  7           from scott.emp e2
  8          start with e2.empno = e1.empno
  9        connect by prior mgr = empno) data
 10   from ( SELECT distinct *
 11                      FROM scott.emp
 12                     START WITH empno IN (7499, 7521)
 13                   CONNECT BY empno = PRIOR mgr  ) e1
 14        )
 15  /

ENAME           EMPNO     DEPTNO        MGR        SAL     SUMSAL PATH
---------- ---------- ---------- ---------- ---------- ---------- --------------------
ALLEN            7499         30       7698       1600       9450 7499/7698/7839
WARD             7521         30       7698       1250       9100 7521/7698/7839
BLAKE            7698         30       7839       2850       7850 7698/7839
KING             7839         10                  5000       5000 7839

4 rows selected.

Hierarchial query with cumulative factor

Pedro, November 24, 2009 - 2:55 pm UTC

We are getting close, I think.
The problem with your query is that it cumulates the salaries from KING to "subject":
ENAME EMPNO DEPTNO MGR SAL SUMSAL PATH
---------- ---------- ---------- ---------- ---------- ---------- ---------------
KING 7839 10 5000 5000 7839
BLAKE 7698 30 7839 2850 7850 7698/7839
WARD 7521 30 7698 1250 9100 7521/7698/7839
ALLEN 7499 30 7698 1600 9450 7499/7698/7839

4 rows selected.

What it should be doing is to cumulate the salaries from "subject" to KING like this:
NM DEPTNO EMPNO MGR SAL PATH_SUM_SAL SCBP_EMPNO
----------- ------ ----- ---- ---- ------------ ----------------
******ALLEN 30 7499 7698 1600 1600 /7839/7698/7499
******WARD 30 7521 7698 1250 1250 /7839/7698/7521
****BLAKE 30 7698 7839 2850 5700 /7839/7698
**KING 10 7839 5000 10700 /7839

4 rows selected.
Again, thank you very much for your help.

Pedro

Tom Kyte
November 24, 2009 - 3:21 pm UTC

maybe try this then...

walk "down" for the salaries, walk "up" for the path


ops$tkyte%ORA9IR2> with data
  2  as
  3  ( SELECT distinct *
  4      FROM scott.emp
  5     START WITH empno IN (7499, 7521)
  6   CONNECT BY empno = PRIOR mgr
  7  )
  8  select ename, deptno, empno, mgr, sal,
  9         (select sum(sal)
 10            from scott.emp e2
 11           where empno in (select empno from data)
 12           start with e2.empno = data.empno
 13          connect by prior empno = mgr) sumsal,
 14         (select max( sys_connect_by_path( empno, '/' ) )
 15            from scott.emp e3
 16           start with e3.empno = data.empno
 17          connect by prior mgr = empno) path
 18    from data
 19  /

ENAME          DEPTNO      EMPNO        MGR        SAL     SUMSAL PATH
---------- ---------- ---------- ---------- ---------- ---------- --------------------
ALLEN              30       7499       7698       1600       1600 /7499/7698/7839
WARD               30       7521       7698       1250       1250 /7521/7698/7839
BLAKE              30       7698       7839       2850       5700 /7698/7839
KING               10       7839                  5000      10700 /7839

4 rows selected.


Hierarchial query with cumulative factor

Pedro, November 24, 2009 - 4:01 pm UTC

This is exactly what I was looking for, thanks.
How do you come up with these answers in such a short time.
I am sitting here the third day, cannot find any white papers with any even slightly more complex examples of these hierarchies (except asktom.oracle.com) and my head is spinning.
Allow me to try to apply your approach to the real world, and I hope I won't get stuck again.
Thank you, what a service I am impressed,

Pedro

Tom Kyte
November 25, 2009 - 12:00 pm UTC

once you get a technique (walking up or down a hierarchy), you can apply that technique in a million ways.

Everything is about 'getting it', getting a technique, a pattern if you will.

If you look at the original answer to this question - that is the "technique", I just had to apply that technique twice - once up the hierarchy and once down - using the set of individuals you were interested in to filter with.

Once I got the requirement - the rest was easy. Remember this technique, remember that analytics exist and how they work, remember aggregation, joins, union/union all, minus, intersect, etc...

And then apply them.

It is the same way you code any algorithm, you understand what the language you are using has to offer and you apply those offerings to your problem.

Some things are easier in prolog
Some things in php
Some things in SQL
Some things in PL/SQL
etc...

but if you learn the techniques - approaches - offered by each - you can not only pick the right language for the job at hand, but also be able to actually do it

A reader, November 25, 2009 - 2:23 pm UTC

Hello Tom,

Can you please explain the difference between

"CONNECT BY empno = PRIOR mgr and connect by prior empno = mgr"

Thanks


Tom Kyte
November 28, 2009 - 10:40 am UTC

with hierarchical queries you have two important compoents

a) start with - find the initial set of rows

b) connect by


Now, in a query like:

ops$tkyte%ORA9IR2> select rpad('*',2*level,'*') || ename nm
  2    from scott.emp
  3  start with mgr is null -- start with king basically
  4  connect by PRIOR EMPNO -- for the first iteration, the PRIOR empno is the set of empno's in the start with
  5                         -- so in this case, PRIOR EMPNO will be just KINGS empno
  6   = mgr                 -- we want to find kings direct reports, they are the records
  7                         -- with a mgr equal to kings empno..
  8  /

NM
--------------------
**KING
****JONES
******SCOTT
********ADAMS
******FORD
********SMITH
****BLAKE
******ALLEN
******WARD
******MARTIN
******TURNER
******JAMES
****CLARK
******MILLER

14 rows selected.



so, the start with found KING and using KINGS empno (prior empno), we found all of the records such that the MGR = PRIOR EMPNO (kings empno)

that gave us JONES, BLAKE and CLARK

Now, their (jones/blake/clark) empnos become the 'prior' empno's. We find the people that work for them and add them to the hirearchy.

At which point in time - their empno's become the prior empno - that found scott, ford, allen, ward, martin, turner, james and miller.

Using their empnos as the PRIOR EMPNO - we found the people that work for them - that is just ADAMS and SMITH.

Now they (adams and smith) become the prior empnos - but no one works for them so the iteration stops..




If you do this one

ops$tkyte%ORA9IR2> select rpad('*',2*level,'*') || ename nm
  2    from scott.emp
  3  start with mgr is null -- start with king basically
  4  connect by PRIOR MGR = empno
  5  /

NM
--------------------
**KING




you get one record - why?

a) we started with MGR Is NULL that found king

b) then we find all records where prior mgr (which is NULL of course) = empno - nothing satisfies that - so we are done.


If we started with an employee that has a manager - then it would be different - let's use scott:


ops$tkyte%ORA9IR2> select rpad('*',2*level,'*') || ename nm
  2    from scott.emp
  3  start with ename = 'SCOTT'
  4  connect by PRIOR EMPNO = mgr
  5  /

NM
--------------------
**SCOTT
****ADAMS

ops$tkyte%ORA9IR2> select rpad('*',2*level,'*') || ename nm
  2    from scott.emp
  3  start with ename = 'SCOTT'
  4  connect by PRIOR MGR = empno
  5  /

NM
--------------------
**SCOTT
****JONES
******KING



the first one goes "down" the hierarchy - we go from scott at the top to everyone that works for scott and everyone that works for someone that works for scott and so on.

the second one goes "up" the hirearchy - we go from scott to the person that manages scott to the person that manages the person that manages scott and so on.




In short, you cannot really 'compare' these connect bys - they are as different as addition and long division - they are not really comparable, they are entirely different.

do not use connect by unless and until you can close your eyes and visualize what must be done in order to get the result set using your connect by. Until then - you haven't gotten it yet - make sure you understand this and can visualize it...


Hierarchial query with cumulative factor

A reader, November 30, 2009 - 1:57 pm UTC

Tom,

My real life code works like a dream based on your sample. I have an additional question, it is about:
(select sum(sal)
from scott.emp e2
where empno in (select empno from data)
start with e2.empno = data.empno
connect by prior empno = mgr) sumsal,
I have to compute multiple sums, they all traverse the hierarchy exactly the same way, so I just added another column like:
(select sum(sal+1)
from scott.emp e2
where empno in (select empno from data)
start with e2.empno = data.empno
connect by prior empno = mgr) sumsal_1
I was thinking would there be a way to compute all the sums with a single hierarchy traverse (I think this should be much more efficient) like:
(select sum(sal), sum(sal+1)
from scott.emp e2
where empno in (select empno from data)
start with e2.empno = data.empno
connect by prior empno = mgr) sumsal
but how does one return the multiple sums back to the initial query.
I find your previous post extremely interesting, this hierarchy business is a whole new world.
Thank you for all your help,

Pedro

Tom Kyte
December 01, 2009 - 3:20 am UTC

I don't see how you can traverse a single hierarchy to get multiple sums? Not sure what you mean

Hierarchial query with cumulative factor

Pedro, December 01, 2009 - 12:22 pm UTC

Tom,

This is where we left the query:
WITH data
AS
(
SELECT DISTINCT *
FROM scott.emp
START WITH empno IN (7499, 7521)
CONNECT BY empno = PRIOR mgr
)
SELECT
ename,
deptno,
empno,
mgr,
sal,
(
SELECT
SUM(sal)
FROM scott.emp e2
WHERE
empno IN (SELECT empno FROM data)
START WITH e2.empno = data.empno
CONNECT BY PRIOR empno = mgr
) sumsal,
(
SELECT MAX(sys_connect_by_path( empno, '/' ))
FROM scott.emp e3
START WITH e3.empno = data.empno
CONNECT BY PRIOR mgr = empno
) path
FROM data
/
What I have to do is get a cumulative sum of another column(s) for example comm, traversing the hierarchy exactly like for sumsal:
WITH data
AS
(
SELECT DISTINCT *
FROM scott.emp
START WITH empno IN (7499, 7521)
CONNECT BY empno = PRIOR mgr
)
SELECT
ename,
deptno,
empno,
mgr,
sal,
(
SELECT
SUM(sal)
FROM scott.emp e2
WHERE
empno IN (SELECT empno FROM data)
START WITH e2.empno = data.empno
CONNECT BY PRIOR empno = mgr
) sumsal,
(
SELECT
SUM(comm)
FROM scott.emp e2
WHERE
empno IN (SELECT empno FROM data)
START WITH e2.empno = data.empno
CONNECT BY PRIOR empno = mgr
) sumcomm,
(
SELECT MAX(sys_connect_by_path( empno, '/' ))
FROM scott.emp e3
START WITH e3.empno = data.empno
CONNECT BY PRIOR mgr = empno
) path
FROM data
/
I thought it would be more efficient to traverse the hierarchy only once compute both sumsal and sumcomm and put the results in a collection.
I hope this makes more sense.
After thinking for two days I figured out what is the MAX(sys_connect_by_path( empno, '/' )) all about, I am able to sort the results.
Thanks again,

Pedro
Tom Kyte
December 01, 2009 - 1:49 pm UTC

ah, got it.

you want to get more than one column out of a scalar subquery, two methods:

ops$tkyte%ORA10GR2> with data
  2  as
  3  ( SELECT distinct *
  4      FROM scott.emp
  5     START WITH empno IN (7499, 7521)
  6   CONNECT BY empno = PRIOR mgr
  7  )
  8  select ename, deptno, empno, mgr, sal,
  9         to_number( substr(string, 1, 13)) sumsal,
 10             to_number( substr(string,14    )) sumcomm,
 11             path
 12    from (
 13  select ename, deptno, empno, mgr, sal,
 14         (select to_char( sum(sal), '999999999.99' ) || to_char(sum(comm))
 15            from scott.emp e2
 16           where empno in (select empno from data)
 17           start with e2.empno = data.empno
 18          connect by prior empno = mgr) string,
 19         (select max( sys_connect_by_path( empno, '/' ) )
 20            from scott.emp e3
 21           start with e3.empno = data.empno
 22          connect by prior mgr = empno) path
 23    from data
 24         )
 25  /

ENAME          DEPTNO      EMPNO        MGR        SAL     SUMSAL    SUMCOMM PATH
---------- ---------- ---------- ---------- ---------- ---------- ---------- --------------------
BLAKE              30       7698       7839       2850       5700        800 /7698/7839
WARD               30       7521       7698       1250       1250        500 /7521/7698/7839
ALLEN              30       7499       7698       1600       1600        300 /7499/7698/7839
KING               10       7839                  5000      10700        800 /7839


ops$tkyte%ORA10GR2> create or replace type myScalarType as object
  2  ( sumsal number, sumcomm number )
  3  /

Type created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with data
  2  as
  3  ( SELECT distinct *
  4      FROM scott.emp
  5     START WITH empno IN (7499, 7521)
  6   CONNECT BY empno = PRIOR mgr
  7  )
  8  select ename, deptno, empno, mgr, sal, x.obj.sumsal, x.obj.sumcomm,
  9             path
 10    from (
 11  select ename, deptno, empno, mgr, sal,
 12         (select myScalarType( sum(sal), sum(comm) )
 13            from scott.emp e2
 14           where empno in (select empno from data)
 15           start with e2.empno = data.empno
 16          connect by prior empno = mgr) obj,
 17         (select max( sys_connect_by_path( empno, '/' ) )
 18            from scott.emp e3
 19           start with e3.empno = data.empno
 20          connect by prior mgr = empno) path
 21    from data
 22         ) x
 23  /

ENAME          DEPTNO      EMPNO        MGR        SAL OBJ.SUMSAL OBJ.SUMCOMM PATH
---------- ---------- ---------- ---------- ---------- ---------- ----------- --------------------
BLAKE              30       7698       7839       2850       5700         800 /7698/7839
WARD               30       7521       7698       1250       1250         500 /7521/7698/7839
ALLEN              30       7499       7698       1600       1600         300 /7499/7698/7839
KING               10       7839                  5000      10700         800 /7839

    

Hierarchial query with cumulative factor

Pedro, December 01, 2009 - 2:05 pm UTC

Tom,

The more than one column out of a scalar subquery just blew my socks off. Chapeau bas.
Thank you very much,

Pedro

Hierarchial stock consumption..

Johanes Budi, November 02, 2013 - 6:20 am UTC

Dear expert,

Following original poster problem, i need some magic to alocate stock,
here is my sample :
create table items ( 
  itemid    varchar2 (20), 
  itemtype  char (1), 
  itemname  varchar2 (40),
  stock number(8,0)
);

create table itemsets ( 
  parentid  varchar2 (20), 
  childid   varchar2 (20), 
  factor    number 
);
insert into items(itemid, itemtype,itemname,stock)values('BIKE','S','Complete bicycle',0);
insert into items(itemid, itemtype,itemname,stock)values('FRAME','I','Bicycle frame',2);
insert into items(itemid, itemtype,itemname,stock)values('COMP_WHEEL','S','Complete wheel',6);
insert into items(itemid, itemtype,itemname,stock)values('WHEEL','S','Wheel tyre',10);
insert into items(itemid, itemtype,itemname,stock)values('TYRE','I','Tyre for wheel',20);
insert into items(itemid, itemtype,itemname,stock)values('RING','I','Ring for wheel',20);
insert into items(itemid, itemtype,itemname,stock)values('SPOKE','I','Spoke for wheel',60);
insert into itemsets(parentid, childid,factor)values('BIKE','FRAME',1);
insert into itemsets(parentid, childid,factor)values('BIKE','COMP_WHEEL',2);
insert into itemsets(parentid, childid,factor)values('COMP_WHEEL','WHEEL',1);
insert into itemsets(parentid, childid,factor)values('COMP_WHEEL','TYRE',1);
insert into itemsets(parentid, childid,factor)values('WHEEL','RING',1);
insert into itemsets(parentid, childid,factor)values('WHEEL','SPOKE',28);


say, we have new order to assembly 10 bike
stock alocation= stock - required qty
required qty= order qty * factor

the problem is, when calc required qty for child item, we must consider parent stock alocation,
required qty for child=parent required qty * factor

expected output:

PARENTID   CHILDID LVL     FACTOR  TOT     PATH                         RQD    STOCK   STOCK_ALOC
BIKE       COMP_WHE       1       2       2/COMP_WHEEL                       20       6          6
COMP_WHEEL TYRE           2       1       2/COMP_WHEEL/TYRE                  20      20         14
COMP_WHEEL WHEEL          2       1       2/COMP_WHEEL/WHEEL                 20      10         14
WHEEL      RING           3       1       2/COMP_WHEEL/WHEEL/RING            20      20         14
WHEEL      SPOKE          3      28      56/COMP_WHEEL/WHEEL/SPOKE          560      60         60
BIKE       FRAME          1       1       1/FRAME                            10       2          2




Please, give me some light, to get this done in query,
I believe there is away to get this without crawling each row and updating stock in table,
I've tried this way and take so loooongg..

Thanks!
(Sorry for my bad english)