Skip to Main Content
  • Questions
  • How to remove the duplicates records which are in two different columns of a single table.

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Lithesh.

Asked: March 28, 2017 - 1:24 pm UTC

Last updated: April 18, 2017 - 1:54 pm UTC

Version: oracle 11g

Viewed 1000+ times

You Asked

I have table called "DUPLICTAE" which contains the below records. I need the below desired ouput...

COL1     |  COL2
:-----    | -----:
banglore  | chennai
kolkata   | pune
chennai   | mysore
pune   | banglore
mysore   | kolkata
mumbai   | delhi

=================================


EXPECTED OUTPUT:-

BANGALORE | CHENNAI
KOLKATTA  | PUNE
MUMBAI    | DELHI

and Chris said...

So you want to only show rows so each value appears at most once in the whole result set? So if "banglore" is in col1, you can't display any other rows where it's in col1 or col2?

If so, here's one way to do this:

- Pick a row in your table to start
- Recursively work through the rest of the rows in the table
- As you add rows to the result, build up a string of all the values for c1 and c2
- Check whether the current row's values are in this string. If they are NOT
- Add them to the end of the "used values" string
- Set a flag stating that the current row is not in this list
- Return only those rows where the "not in the list" flag is set

Putting this together gives something like this:

create table t (
  c1 varchar2(10),
  c2 varchar2(10)
);

insert into t values ('banglore','chennai');
insert into t values ('kolkata','pune');
insert into t values ('mysore', 'chennai');
insert into t values ('pune','banglore');
insert into t values ('mysore','kolkata');
insert into t values ('mumbai','delhi');
insert into t values ('delhi', 'pune');

with ranks as (
  select t.*,
         c1 || ':' || c2 str,
         row_number() over (order by c1, c2) rn
  from   t
), dedup (c1, c2, str, rn, strs, lev, ins) as (
  select c1, c2, str, rn, str, 1 lev, 'N' ins from ranks where rn = 1
  union all
  select r.c1, r.c2, r.str, r.rn, 
         case 
           when instr(d.strs, r.c1) = 0 and instr(d.strs, r.c2) = 0 then d.strs || ':' || r.str
           else d.strs 
         end strs,
         d.lev + 1 d,
         case 
           when instr(d.strs, r.c1) = 0 and instr(d.strs, r.c2) = 0 then 'N' 
           else 'Y'
         end ins
  from dedup d
  join ranks r
  on   d.lev + 1 = r.rn 
)
  select * from dedup
  where  ins = 'N';
  
C1        C2       STR               RN  STRS                                        LEV  INS  
banglore  chennai  banglore:chennai  1   banglore:chennai                            1    N    
delhi     pune     delhi:pune        2   banglore:chennai:delhi:pune                 2    N    
mysore    kolkata  mysore:kolkata    6   banglore:chennai:delhi:pune:mysore:kolkata  6    N


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.