Skip to Main Content
  • Questions
  • Distinct Count of a Value across columns in the table.

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Raghavendra.

Asked: December 02, 2014 - 1:01 pm UTC

Last updated: December 03, 2014 - 3:37 pm UTC

Version: 11.2

Viewed 10K+ times! This question is

You Asked

CREATE TABLE country_test (c1 varchar2(20), c2 varchar2(20), c3 varchar2(20),c4 varchar2(20))
/
INSERT INTO country_test VALUES ('india','us','china','uk')
/
INSERT INTO country_test VALUES ('india','india','china','uk')
/

INSERT INTO country_test VALUES ('india','china','china','uk')
/

INSERT INTO country_test VALUES ('us','us','us','uk')
/
commit;

select * from country_test
/
c1 c2 c3 c4
--------------------------
india us china uk
india india china uk
india china china uk
us us us uk

I Need the distinct count of countries across columns of the table i.e distinct across c1,c2,c3,c4. So the output has to be

c1 c2 c3 c4 cnt
-----------------------------------
india us china uk 4
india india china uk 3
india china china uk 3
us us us uk 2

If possible, Can the query be made generic to support any number of combinations
of the columns and give the distinct count across all the columns?

Thanks a Ton for your support!.

Regards,
Boralli

and Tom said...

it really won't be "generic" in that if you have 4 columns, you'll need to reference 4 columns. If you have 5, you'll need a different query.

this query works by generating a set of four rows (data), then doing a cartesian join of that to the original table turning each single row into four duplicates. Using decode, we populate a single column in each of those rows - one of the four original columns. We keep the primary key (rowid in this case) so we can group by safely (in case two rows have the same c1..c4 values).

ops$tkyte%ORA11GR2> with data(r)
  2  as
  3  (select 1 r from dual
  4   union all
  5   select r+1 from data where r < 4
  6  )
  7  select c1, c2, c3, c4, count(distinct c) cnt
  8    from (
  9  select rowid rid,
 10         c1, c2, c3, c4,
 11         decode(r,1,c1,2,c2,3,c3,4,c4) c
 12    from data, country_test
 13         )
 14   group by rid, c1, c2, c3, c4
 15  /

C1         C2         C3         C4                CNT
---------- ---------- ---------- ---------- ----------
india      us         china      uk                  4
us         us         us         uk                  2
india      india      china      uk                  3
india      china      china      uk                  3

Rating

  (9 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Raghavendra Boralli, December 03, 2014 - 9:45 am UTC

Hi Tom,

Your Genius reflects here. Two new ideas 1. Using Rowid as primary key. 2. WITH Clause.

Your query is packed with Simplicity, Unexplored features...etc. It looks a simple query, yet it takes time to understand.

Kindly can you explain to us the With data(r), how a column can be parameterized in the With clause and the DATA table is self referenced within the WITH clause.

Thanks a ton.

Regards,
Boralli
Tom Kyte
December 03, 2014 - 12:38 pm UTC

Possible alternative

AndyP, December 03, 2014 - 10:29 am UTC

Just out of interest I tried approaching this using the 11G (un)pivot. It seems to work ok for this small amount of data and gives the same result (I'm not claiming it's any better!)
prompt Reintroduce the original columns (sort of)
col orig for a84
with data as (select rownum r,c1,c2,c3,c4 from ctest)
select listagg(rpad(val,21),'') within group (order by column_list) orig
      ,count(distinct val) countries
  from
(
select * from data
unpivot (val for column_list in (c1,c2,c3,c4))
)
 group by r
 order by r
/

prompt Compare Tom's method (slight variation)
with data as (select level r from dual connect by level<5)
select c1, c2, c3, c4, count(distinct c) countries
  from (
select ctest.rowid rid, c1, c2, c3, c4, decode(r,1,c1,2,c2,3,c3,4,c4) c
  from data, ctest
       )
 group by rid, c1, c2, c3, c4
 order by rid
/

Reintroduce the original columns (sort of)

ORIG                                                                                  COUNTRIES
------------------------------------------------------------------------------------ ----------
india                us                   china                uk                             4
india                india                china                uk                             3
india                china                china                uk                             3
us                   us                   us                   uk                             2
peru                 peru                 peru                 peru                           1

Compare Tom's method (slight variation)

C1                   C2                   C3                   C4                    COUNTRIES
-------------------- -------------------- -------------------- -------------------- ----------
india                us                   china                uk                            4
india                india                china                uk                            3
india                china                china                uk                            3
us                   us                   us                   uk                            2
peru                 peru                 peru                 peru                          1

Tom Kyte
December 03, 2014 - 12:40 pm UTC

doh, why didn't I think of that :) too many ways to do it!

Now we'll just wait for someone to post the MODEL clause approach ;)

thanks!

Not the MODEL clause, but...

Stew Ashton, December 03, 2014 - 1:48 pm UTC

select c1, c2, c3, c4, column_value cnt
from country_test,
table(cast(multiset(
  select count(distinct c1) from (
    select c1 from dual union all
    select c2 from dual union all
    select c3 from dual union all
    select c4 from dual
  )
) as sys.odcinumberlist));
And in 12c:
select * from country_test,
lateral(
  select count(distinct c) cnt from (
    select c1 c from dual union all
    select c2 from dual union all
    select c3 from dual union all
    select c4 from dual
  )
);

C1       C2       C3       C4              CNT
-------- -------- -------- -------- ----------
india    us       china    uk                4 
india    india    china    uk                3 
india    china    china    uk                3 
us       us       us       uk                2

Tom Kyte
December 03, 2014 - 3:37 pm UTC

absolutely :)

Forgot the UNPIVOT!

Stew Ashton, December 03, 2014 - 1:55 pm UTC

select * from country_test,
lateral(
select count(distinct val) cnt from (
select c1,c2,c3,c4 from dual
) unpivot(val for col in (c1,c2,c3,c4))
);

The MODEL Boy !

Rajeshwaran, Jeyabal, December 05, 2014 - 3:38 am UTC

rajesh@ORA11G> select f1,f2,f3,f4,count(distinct y) cnt
  2  from (
  3  select f1,f2,f3,f4,x,y
  4  from t
  5  model
  6    return updated rows
  7    partition by (c1 f1,c2 f2,c3 f3,c4 f4)
  8    dimension by (1 as x)
  9    measures(c1,c2,c3,c4, cast(null as varchar2(10)) y )
 10    rules upsert
 11    ( y[1] = c1[1] ,
 12      y[2] = c2[1] ,
 13      y[3] = c3[1] ,
 14      y[4] = c4[1] )
 15      )
 16  group by f1,f2,f3,f4
 17  /

F1         F2         F3         F4                CNT
---------- ---------- ---------- ---------- ----------
india      china      china      uk                  3
india      india      china      uk                  3
india      us         china      uk                  4
us         us         us         uk                  2

4 rows selected.

rajesh@ORA11G>

Frank Feng, December 08, 2014 - 10:30 pm UTC

The above two TABLE(CAST(MULTISET ...))) queries are not working in my 11.2 Oracle database. It seems the column of the country_test table cannot be used in one more level deep of the sub-query. Here is the test result:

SQL> select c1, c2, c3, c4, column_value cnt
  2  from country_test,
  3  table(cast(multiset(
  4    select count(distinct c1) from (
  5      select c1 from dual union all
  6      select c2 from dual union all
  7      select c3 from dual union all
  8      select c4 from dual
  9    )
 10  ) as sys.odcinumberlist));
ORA-00904: "C1": invalid identifier

If I change the query to the following, then it works:
SQL> SELECT t.*
  2        ,column_value cnt
  3    FROM country_test t
  4        ,TABLE(CAST(MULTISET (SELECT COUNT(DISTINCT DECODE(LEVEL, 1, c1, 2, c2, 3, c3, 4, c4))
  5                       FROM dual
  6                     CONNECT BY LEVEL <= 4) AS sys.odcinumberlist));
C1            C2            C3            C4                   CNT
------------- ------------- ------------- ------------- ----------
india         us            china         uk                     4
india         india         china         uk                     3
india         china         china         uk                     3
us            us            us            uk                     2

another 11.2 workaround

Barbara Boehmer, December 26, 2014 - 11:05 pm UTC

Here is another 11.2 workaround for the problem with nesting too deep in sub-queries:

SCOTT@orcl> select banner from v$version
  2  /

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

5 rows selected.

SCOTT@orcl> select c1, c2, c3, c4,
  2       (select count (distinct column_value)
  3        from   table (cast (multiset
  4          (select c1 from dual union all
  5           select c2 from dual union all
  6           select c3 from dual union all
  7           select c4 from dual)
  8          as sys.odcivarchar2list))) cnt
  9  from   country_test
 10  /

C1                   C2                   C3                   C4                          CNT
-------------------- -------------------- -------------------- -------------------- ----------
india                us                   china                uk                            4
india                india                china                uk                            3
india                china                china                uk                            3
us                   us                   us                   uk                            2

4 rows selected.

Just another way

CVR, January 08, 2015 - 7:10 am UTC

select country_test.*, cnt_tab.cnt
from country_test,
(
select count(1) cnt, rid
from (
select rowid rid, c1 c from country_test
union select rowid, c2 from country_test
union select rowid, c3 from country_test
union select rowid, c4 from country_test
)
group by rid
) cnt_tab
where country_test.rowid = cnt_tab.rid

Yet another (short) alternative

Oren Nakdimon @DBoriented, March 27, 2015 - 12:47 pm UTC

create type string_ntt as table of varchar2(4000)
/
select c.*,cardinality(set(string_ntt(c1,c2,c3,c4))) from country_test c;