Skip to Main Content
  • Questions
  • Removing duplicates in a parent child connect by path

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Richard.

Asked: March 30, 2017 - 3:12 am UTC

Last updated: April 03, 2017 - 9:10 am UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

Hi Tom,

I have the code below which seems to do the job ok. I was wondering however if there's a more elegant way using analytics or other methods to come up with the same results in a more dynamic way that would factor in the level becoming 5 for example?

WITH TMP AS (SELECT   1 AS ID, NULL AS PARENT_ID FROM DUAL
             UNION
             SELECT   2 AS ID, 1 AS PARENT_ID FROM DUAL
             UNION
             SELECT   3 AS ID, 2 AS PARENT_ID FROM DUAL
             UNION
             SELECT   4 AS ID, 3 AS PARENT_ID FROM DUAL
             UNION
             SELECT   5 AS ID, 3 AS PARENT_ID FROM DUAL
             UNION
             SELECT   6 AS ID, 2 AS PARENT_ID FROM DUAL
             UNION
             SELECT   7 AS ID, 1 AS PARENT_ID FROM DUAL
             UNION
             SELECT   8 AS ID, NULL AS PARENT_ID FROM DUAL
             UNION
             SELECT   9 AS ID, 8 AS PARENT_ID FROM DUAL
             UNION
             SELECT   10 AS ID, 8 AS PARENT_ID FROM DUAL
             UNION
             SELECT   11 AS ID, 10 AS PARENT_ID FROM DUAL
             UNION
             SELECT   12 AS ID, NULL AS PARENT_ID FROM DUAL
             UNION
             SELECT   13 AS ID, 12 AS PARENT_ID FROM DUAL
             UNION
             SELECT   14 AS ID, NULL AS PARENT_ID FROM DUAL),
    LVL_BREAKDOWN
       AS (SELECT   LVL,
                    ID_PATH,
                    TO_NUMBER (REGEXP_SUBSTR (ID_PATH,
                                              '[^/]+',
                                              1,
                                              1))
                       LEVEL1,
                    TO_NUMBER (REGEXP_SUBSTR (ID_PATH,
                                              '[^/]+',
                                              1,
                                              2))
                       LEVEL2,
                    TO_NUMBER (REGEXP_SUBSTR (ID_PATH,
                                              '[^/]+',
                                              1,
                                              3))
                       LEVEL3,
                    TO_NUMBER (REGEXP_SUBSTR (ID_PATH,
                                              '[^/]+',
                                              1,
                                              4))
                       LEVEL4
             FROM   (    SELECT   LEVEL AS LVL,
                                  SYS_CONNECT_BY_PATH (ID, '/') ID_PATH
                           FROM   TMP
                     START WITH   PARENT_ID IS NULL
                     CONNECT BY   PRIOR ID = PARENT_ID))
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 4
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 3
         AND NOT EXISTS
               (SELECT   'x'
                  FROM   LVL_BREAKDOWN LB2
                 WHERE       LB1.LEVEL1 = LB2.LEVEL1
                         AND LB1.LEVEL2 = LB2.LEVEL2
                         AND LB1.LEVEL3 = LB2.LEVEL3
                         AND lvl = 4)
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 2
         AND NOT EXISTS
               (SELECT   'x'
                  FROM   LVL_BREAKDOWN LB2
                 WHERE       LB1.LEVEL1 = LB2.LEVEL1
                         AND LB1.LEVEL2 = LB2.LEVEL2
                         AND lvl = 3)
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 1
         AND NOT EXISTS (SELECT   'x'
                           FROM   LVL_BREAKDOWN LB2
                          WHERE   LB1.LEVEL1 = LB2.LEVEL1 AND lvl = 2)
ORDER BY   3, 4 ASC


with LiveSQL Test Case:

and Chris said...

So you want to show just the paths to the leaf nodes? e.g., if you have:

1
1/2
1/2/3


You want just 1/2/3, because 1 and 2 both have children? But throw in 4 below 2 and you want to show 1/2/4 too?

If so, you've just got to change tack slightly:

Walk up the tree instead of down!

Start with the leaves. Work your way back up to the root, building up the tree as you go along. Once you're at the root you have the complete path back to the leaf. So you just need to return the root nodes - i.e. those where parent_id is null!

Here's how to do it using recursive with:

with tmp as (
  select   1 as id, null as parent_id from dual 
  union 
  select   2 as id, 1 as parent_id from dual 
  union 
  select   3 as id, 2 as parent_id from dual 
  union 
  select   4 as id, 3 as parent_id from dual 
  union 
  select   5 as id, 3 as parent_id from dual 
  union 
  select   6 as id, 2 as parent_id from dual 
  union 
  select   7 as id, 1 as parent_id from dual 
  union 
  select   8 as id, null as parent_id from dual 
  union 
  select   9 as id, 8 as parent_id from dual 
  union 
  select   10 as id, 8 as parent_id from dual 
  union 
  select   11 as id, 10 as parent_id from dual 
  union 
  select   12 as id, null as parent_id from dual 
  union 
  select   13 as id, 12 as parent_id from dual 
  union 
  select   14 as id, null as parent_id from dual
), levs (id, parent_id, path, lev) as (
  select id, parent_id, to_char(id) path, 1 lev from tmp t1
  where  not exists (
    select * from tmp t2
    where  t2.parent_id = t1.id
  )
  union all 
  select t.id, t.parent_id, t.id || '/' || path, lev + 1 lev from levs
  join   tmp t
  on     t.id = levs.parent_id
)
 select lev, path from levs
 where  parent_id is null
 order  by 2;

LEV  PATH     
4    1/2/3/4  
4    1/2/3/5  
3    1/2/6    
2    1/7      
2    12/13    
1    14       
3    8/10/11  
2    8/9


This will work however deep your tree is. Though if you want to split the levels out into columns, you'll still need to know the most there are to specify them.

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Perfect, thank you

Richard Shaw, March 30, 2017 - 11:38 pm UTC

Thank you! I was too focused on removing the duplicates than thinking of starting at the leaf. Now on my to-do list is to learn a bit more about recursive with.
Connor McDonald
March 31, 2017 - 4:44 am UTC

glad we could help

validate my understanding

Rajeshwaran, April 03, 2017 - 1:39 am UTC

SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 4
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 3
         AND NOT EXISTS
               (SELECT   'x'
                  FROM   LVL_BREAKDOWN LB2
                 WHERE       LB1.LEVEL1 = LB2.LEVEL1
                         AND LB1.LEVEL2 = LB2.LEVEL2
                         AND LB1.LEVEL3 = LB2.LEVEL3
                         AND lvl = 4)
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 2
         AND NOT EXISTS
               (SELECT   'x'
                  FROM   LVL_BREAKDOWN LB2
                 WHERE       LB1.LEVEL1 = LB2.LEVEL1
                         AND LB1.LEVEL2 = LB2.LEVEL2
                         AND lvl = 3)
UNION
SELECT   LB1.*
  FROM   LVL_BREAKDOWN LB1
 WHERE   lvl = 1
         AND NOT EXISTS (SELECT   'x'
                           FROM   LVL_BREAKDOWN LB2
                          WHERE   LB1.LEVEL1 = LB2.LEVEL1 AND lvl = 2)

Team,

The above union-all ladder says 
 (i) get all level four.
 (ii) get all level threes that doesnot exists in level four
 (iii) get all level twos that doesnot exists in level three
 (iv) get all level ones that doesnot exists in level twos.
 
doesn't that be the equivalent to the other way of saying.
 (i) start with the parent.
 (ii) walk down all the way to child upto level equal to four and establish the path.
 (iii) and finally retain the leaf nodes and discard all the non-leaf portions.
 
if that is right, then the above code can be rewritten like this.

demo@ORA12C> WITH TMP AS (SELECT   1 AS ID, NULL AS PARENT_ID FROM DUAL
  2               UNION
  3               SELECT   2 AS ID, 1 AS PARENT_ID FROM DUAL
  4               UNION
  5               SELECT   3 AS ID, 2 AS PARENT_ID FROM DUAL
  6               UNION
  7               SELECT   4 AS ID, 3 AS PARENT_ID FROM DUAL
  8               UNION
  9               SELECT   5 AS ID, 3 AS PARENT_ID FROM DUAL
 10               UNION
 11               SELECT   6 AS ID, 2 AS PARENT_ID FROM DUAL
 12               UNION
 13               SELECT   7 AS ID, 1 AS PARENT_ID FROM DUAL
 14               UNION
 15               SELECT   8 AS ID, NULL AS PARENT_ID FROM DUAL
 16               UNION
 17               SELECT   9 AS ID, 8 AS PARENT_ID FROM DUAL
 18               UNION
 19               SELECT   10 AS ID, 8 AS PARENT_ID FROM DUAL
 20               UNION
 21               SELECT   11 AS ID, 10 AS PARENT_ID FROM DUAL
 22               UNION
 23               SELECT   12 AS ID, NULL AS PARENT_ID FROM DUAL
 24               UNION
 25               SELECT   13 AS ID, 12 AS PARENT_ID FROM DUAL
 26               union
 27               select   14 as id, null as parent_id from dual)
 28  select level lvl, sys_connect_by_path( id,'/') as id_path
 29  from tmp
 30  where connect_by_isleaf = 1
 31  start with parent_id is null
 32  connect by prior id = parent_id
 33  and level <= 4 ;

       LVL ID_PATH
---------- --------------------
         4 /1/2/3/4
         4 /1/2/3/5
         3 /1/2/6
         2 /1/7
         2 /8/9
         3 /8/10/11
         2 /12/13
         1 /14

8 rows selected.

demo@ORA12C>

kindly advice if i am getting it wrong.
Chris Saxon
April 03, 2017 - 9:10 am UTC

Yeah, you could do it that way too! I forgot about connect_by_isleaf ;)

Assuming the requirement is "show all complete paths from roots to leaves", then you just need to find a way to identify leaves and roots.