Skip to Main Content
  • Questions
  • Stats_mode function - Deterministic or Non- Deterministic

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, Deepak.

Asked: January 28, 2021 - 10:15 am UTC

Last updated: September 07, 2022 - 3:48 am UTC

Version: 19c

Viewed 10K+ times! This question is

You Asked

Stats_mode function - Deterministic or Non- Deterministic?

What does Oracle return when there are multiple keys with same mode (highest) and how?

and Connor said...

In the *current* implementation, we are doing a sort

SQL> create table t as
  2  select *
  3  from (
  4  select 'a' c1, 1 c2 from dual union all
  5  select 'b' c1, 1 c2 from dual union all
  6  select 'c' c1, 1 c2 from dual union all
  7  select 'd' c1, 1 c2 from dual union all
  8  select 'e' c1, 1 c2 from dual union all
  9  select 'a' c1, 2 c2 from dual union all
 10  select 'b' c1, 2 c2 from dual union all
 11  select 'c' c1, 2 c2 from dual union all
 12  select 'd' c1, 2 c2 from dual union all
 13  select 'e' c1, 2 c2 from dual
 14  );

Table created.

SQL>
SQL> set autotrace on explain
SQL> select stats_mode(c2) from t;

STATS_MODE(C2)
--------------
             1


Execution Plan
----------------------------------------------------------
Plan hash value: 1476560607

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| T    |    10 |    30 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------


so from that sense it will be deterministic as picking the lowest value out of those with the highest count.

But as per the docs: "If more than one mode exists, then Oracle Database chooses one and returns only that one value."

so we are giving *no* guarantee that this will always be the case. We might change to a GROUP HASH BY at some stage in the future which could change things.

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Asim, August 31, 2022 - 5:05 pm UTC

Why there is no analytic version of stats_mode function even in 21c. This is a very normal business requirement.

E.g
One might need to display MODE salary of the dept in which emp works, or the MODE salary of the Job which emp performs, in front of each empno along with all rows of emp table, ie no GROUP BY.

I know that can be achieved by other ways, but still need to know, any special reason Oracle didnt provided its analytic version.

Thanks

Connor McDonald
September 01, 2022 - 4:14 am UTC

I agree - it seems an omission, because going back to first principles seems overly complex


SQL>
SQL> create table t ( x int, y int );

Table created.

SQL>
SQL> insert into t values (10,1);

1 row created.

SQL> insert into t values (10,1);

1 row created.

SQL>
SQL> insert into t values (10,2);

1 row created.

SQL> insert into t values (10,2);

1 row created.

SQL> insert into t values (10,2);

1 row created.

SQL> insert into t values (10,2);

1 row created.

SQL>
SQL> insert into t values (10,3);

1 row created.

SQL> insert into t values (10,3);

1 row created.

SQL> insert into t values (10,3);

1 row created.

SQL>
SQL> insert into t values (10,4);

1 row created.

SQL> insert into t values (10,4);

1 row created.

SQL> insert into t values (10,4);

1 row created.

SQL> insert into t values (10,4);

1 row created.

SQL> insert into t values (10,4);

1 row created.

SQL> insert into t values (10,4);

1 row created.

SQL>
SQL> insert into t values (10,5);

1 row created.

SQL> insert into t values (10,5);

1 row created.

SQL> insert into t values (10,5);

1 row created.

SQL>
SQL> insert into t
  2  select 20, 6-y from t;

18 rows created.

SQL> commit;

Commit complete.

SQL>
SQL>
SQL> select x, stats_mode(y)
  2  from t
  3  group by x;

         X STATS_MODE(Y)
---------- -------------
        10             4
        20             2

SQL>
SQL>
SQL> select x,y, max(case when r=1 then y end) over ( partition by x ) as themode
  2  from (
  3  select x,y, rank() over ( partition by x order by c desc ) r
  4  from
  5  (
  6  select x, y, count(*) over ( partition by x,y ) c
  7  from t
  8  )
  9  );

         X          Y    THEMODE
---------- ---------- ----------
        10          4          4
        10          4          4
        10          4          4
        10          4          4
        10          4          4
        10          4          4
        10          2          4
        10          2          4
        10          2          4
        10          2          4
        10          3          4
        10          3          4
        10          3          4
        10          5          4
        10          5          4
        10          5          4
        10          1          4
        10          1          4
        20          2          2
        20          2          2
        20          2          2
        20          2          2
        20          2          2
        20          2          2
        20          4          2
        20          4          2
        20          4          2
        20          4          2
        20          3          2
        20          3          2
        20          3          2
        20          1          2
        20          1          2
        20          1          2
        20          5          2
        20          5          2

36 rows selected.

SQL>


Cant get it

Asim, September 01, 2022 - 12:31 pm UTC

1.
Cant understand what you mean by going back to first principles....

2.
Will Oracle provide analytic version of stats_mode in future releases.?

3
Or is it delibrately ommited ? Means we should not expect analytic version of stats_mode in future releases?

Connor McDonald
September 05, 2022 - 5:41 am UTC

1) Counting, sorting and then ranking....
2) Unknown. I suggest raising an enhancement request
3) Unknown. See 2

Where to raise enhancement request

Asim, September 05, 2022 - 11:50 am UTC

Where can I raise enhancement request.

I suggest this syntax for stats_mode aggregate and analytic both.

Stats_mode(expr, ANY|MIN|MAX|Listagg)

ANY means current behavior ie return any value if multiple modes exist

MIN means minimum value of mode
MAX means maximum value of mode
Listagg means concatenated list of all modes

What are your thoughts ?


Your thoughts

Asim, September 06, 2022 - 6:46 am UTC

and what are your thoughts on my above suggestion.

Doesnt it makes stats_mode more deterministic ? Because only one value ie ANY of 2nd parameter to the function is undeterministic rest of the values will return deterministic value.

Connor McDonald
September 07, 2022 - 3:48 am UTC

Well the "classical" mode is not deterministic either, so I don't really see the need. But I do like the idea of analytic version as I said earlier.

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.