Skip to Main Content
  • Questions
  • Replacement for WM_CONCAT function to remove duplicate entries

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Akshaya.

Asked: January 17, 2018 - 11:28 am UTC

Last updated: January 18, 2018 - 6:00 pm UTC

Version: Oracle 12c

Viewed 10K+ times! This question is

You Asked

We are upgrading oracle from 11G to 12c and in one of our code we are using WM_CONCAT function.
Below is the functionality:
Concatenate the field values and remove duplicate entries for the same. This requirement is on multiple fields.
We need the replacement for this.

Actual requirement:
ORDER_NUMBER LINE_CATEGORY SUB_DIVISION OWNING_DIVISION COPYRIGHT
36647200 SALES ELEM MATHEMATICS ELEMENTARY 2013-2015
36723600 SALES ELEM MATHEMATICS ELEMENTARY 2013-2015
36737500 SALES ELEM MATHEMATICS ELEMENTARY 2013-2015
36893400 SALES ELEM MATHEMATICS ELEMENTARY 2013
36893500 SALES ELEM MATHEMATICS ELEMENTARY 2015-2016


Actual Query:
select ORDER_NUMBER,LINE_CATEGORY,--MIN(ORDER_ENTRY_DATE),SUM(TOTAL_AMT), SUM(TOTAL_COGS), SUM(TOTAL_QUANTITY),
REPLACE(WM_CONCAT(DISTINCT CRM_TEMP_EDW_ORDLINES.SUB_DIVISION),',','-'),REPLACE(WM_CONCAT(DISTINCT CRM_TEMP_EDW_ORDLINES.OWNING_DIVISION),',','-') 
REPLACE(WM_CONCAT(DISTINCT CRM_TEMP_EDW_ORDLINES.COPYRIGHT),',','-')
from CRM_TEMP_EDW_ORDLINES
where CONFLICT_FLAG='N' GROUP BY ORDER_NUMBER,LINE_CATEGORY ;

and Chris said...

Sadly listagg doesn't have a distinct option. But it does exclude null values.

So what you can do is have a subquery which maps duplicate values to null. One way to do this is using row_number(), partitioned by the column. If the result is this is 1, return the column value. Otherwise null:

set null <null>

select case row_number() over (partition by job order by 1)
          when 1 then job
       end jb,
       case row_number() over (partition by deptno order by 1)
          when 1 then deptno
       end dept
from   scott.emp;

JB          DEPT     
ANALYST       <null> 
<null>        <null> 
CLERK         <null> 
<null>        <null> 
<null>        <null> 
<null>        <null> 
MANAGER           10 
<null>        <null> 
<null>            20 
PRESIDENT     <null> 
SALESMAN      <null> 
<null>            30 
<null>        <null> 
<null>        <null> 

select listagg(jb, ',') within group (order by jb) jobs,
       listagg(dept, ',') within group (order by dept) depts
from   (
select case row_number() over (partition by job order by 1)
          when 1 then job
       end jb,
       case row_number() over (partition by deptno order by 1)
          when 1 then deptno
       end dept
from   scott.emp
);

JOBS                                       DEPTS      
ANALYST,CLERK,MANAGER,PRESIDENT,SALESMAN   10,20,30  


HT to Laurent Schneider for this technique:

https://laurentschneider.com/wordpress/2014/05/distinct-listagg.html

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Another variant

Iudith Mentzel, January 18, 2018 - 2:13 pm UTC

Hello all,

Just for fun, here is another variant of doing it:

SELECT
( SELECT LISTAGG(COLUMN_VALUE,',') WITHIN GROUP (ORDER BY COLUMN_VALUE)
FROM TABLE(v.t_jobs)
) AS jobs,
( SELECT LISTAGG(COLUMN_VALUE,',') WITHIN GROUP (ORDER BY COLUMN_VALUE)
FROM TABLE(v.t_depts)
) AS depts
FROM (
SELECT COLLECT(DISTINCT job) t_jobs,
COLLECT(DISTINCT deptno) t_depts
FROM scott.emp
) V
/

JOBS DEPTS
--------------------------------------------------------
ANALYST,CLERK,MANAGER,PRESIDENT,SALESMAN 10,20,30


The important here is that we have to unnest each collection separately,
( each TABLE() function call in its own SELECT statement ),
to avoid a cartesian product between the two sets that already contain distinct values each.


Best Regards,
Iudith Mentzel

Chris Saxon
January 18, 2018 - 6:00 pm UTC

Nice, thanks for sharing.

A reader, August 28, 2019 - 4:23 am UTC

Thank you, it is working fine.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.