Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, krishna.

Asked: May 31, 2016 - 5:34 pm UTC

Last updated: August 15, 2022 - 5:18 am UTC

Version: 11g release 2

Viewed 50K+ times! This question is

You Asked

Hi,

There is a SELECT statement which returns columns (around no of coloums are 25).
I suppose to apply LISTAGG function on 16 columns (with delimiter as ',') whenever those columns has distinct values during group by one another column.
For example, I have below data.

col_1 col_2 col_3
100 25 35
100 30 42
100 29 45
200 11 9
200 11 9
300 1 2

After applied LISTAGG, getting following output.

col_1 col_2 col_3
100 25,30,29 35,42,45
200 11,11 9,9
300 1 2

However, expected output is like below

col_1 col_2 col_3
100 25,30,29 35,42,45
200 11 9
300 1 2

Please suggest here is there any direct way to skip LISTAGG in case of values are same?

Regards,
Krishna

and Connor said...

Not in this release, but the next release might have something to help :-)

In the interim, you would need to do a distinct first, and then LISTAGG

SQL> select deptno, listagg(job,',') within group ( order by job ) x
  2  from scott.emp
  3  group by deptno;

    DEPTNO X
---------- ------------------------------------------------------------
        10 CLERK,MANAGER,PRESIDENT
        20 ANALYST,ANALYST,CLERK,CLERK,MANAGER
        30 CLERK,MANAGER,SALESMAN,SALESMAN,SALESMAN,SALESMAN

SQL> select deptno, listagg(job,',') within group ( order by job ) x
  2  from ( select distinct deptno, job from scott.emp )
  3  group by deptno;

    DEPTNO X
---------- ------------------------------------------------------------
        10 CLERK,MANAGER,PRESIDENT
        20 ANALYST,CLERK,MANAGER
        30 CLERK,MANAGER,SALESMAN


Rating

  (26 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Distinct on multiple columns

Rajeshwaran, Jeyabal, June 01, 2016 - 8:18 am UTC

Distinct on multiple columns not possible with above approach.

Let say we have data like this.
col_1 col_2 col_3 
200 11 9 
200 11 10 


and the output needed is like this
col_1 col_2 col_3 
200 11 10,9 


Then one approach would be to stick with user defined aggregates.

demo@ORA11G> create or replace type vcarray is table of varchar2(30);
  2  /

Type created.

demo@ORA11G> create or replace type stragg_type is object
  2  (
  3     array vcarray,
  4
  5     static function odciaggregateinitialize(actx in out stragg_type)
  6             return number ,
  7
  8     member function odciaggregateiterate(
  9                                     self in out stragg_type,
 10                                     val varchar2)
 11             return number ,
 12
 13     member function odciaggregatemerge(
 14                        self in out stragg_type,
 15                        ctx2 in stragg_type)
 16             return number ,
 17
 18     member function odciaggregateterminate(
 19                        self in stragg_type,
 20                        returnvalue out varchar2,
 21                        flags in number)
 22             return number
 23  );
 24  /

Type created.

demo@ORA11G> create or replace type body stragg_type is
  2
  3     static function odciaggregateinitialize(actx in out stragg_type)
  4     return number is
  5     begin
  6             actx := stragg_type(vcarray());
  7             return odciconst.Success;
  8     end;
  9
 10     member function odciaggregateiterate(
 11                                     self in out stragg_type,
 12                                     val varchar2)
 13     return number is
 14     begin
 15             array.extend;
 16             array(array.count) := val;
 17             return odciconst.Success;
 18     end;
 19
 20     member function odciaggregatemerge(
 21                        self in out stragg_type,
 22                        ctx2 in stragg_type)
 23     return number is
 24     begin
 25             for i in 1..ctx2.array.count
 26             loop
 27                     array.extend;
 28                     array(array.count) := ctx2.array(i);
 29             end loop;
 30             return odciconst.Success;
 31     end;
 32
 33     member function odciaggregateterminate(
 34                        self in stragg_type,
 35                        returnvalue out varchar2,
 36                        flags in number)
 37     return number is
 38     begin
 39             for x in (select distinct column_value from table(array) order by 1)
 40             loop
 41                     returnvalue := returnvalue || ',' || x.column_value;
 42             end loop;
 43                     returnvalue := trim( ',' from returnvalue);
 44                     return odciconst.Success;
 45     end;
 46  end;
 47  /

Type body created.

demo@ORA11G> create or replace function stragg(x varchar2)
  2  return varchar2
  3  aggregate using stragg_type;
  4  /

Function created.

demo@ORA11G> select * from t;

      COL1       COL2       COL3
---------- ---------- ----------
       200         11          9
       200         11         10

2 rows selected.

demo@ORA11G> column col2 format a10
demo@ORA11G> column col3 format a10
demo@ORA11G> select col1,stragg(col2) col2,stragg(col3) col3
  2  from t
  3  group by col1 ;

      COL1 COL2       COL3
---------- ---------- ----------
       200 11         10,9

1 row selected.

demo@ORA11G>


In case of Multiset operation we can do like this (but that in-turn returns collections, rather than delimited strings )

demo@ORA11G> select col1,cast( collect( distinct col2) as num_array) col2,
  2          cast( collect(distinct col3) as num_array) col3
  3  from t
  4  group by col1 ;

      COL1 COL2                 COL3
---------- -------------------- --------------------
       200 NUM_ARRAY(11)        NUM_ARRAY(9, 10)

1 row selected.

demo@ORA11G>

listagg

Igor, June 01, 2016 - 6:36 pm UTC

listagg still can do it, but some help from analytic functions is needed ..

select * from t;

      COL1       COL2       COL3
---------- ---------- ----------
         1         10        100
         1         10        200
         2         10        100
         2         20        100

select col1,
  listagg(x_2,',') within group (order by x_2) as c_2,
  listagg(x_3,',') within group (order by x_3) as c_3
  from
  (select col1,
  case col2 when lag(col2) over (partition by col1 order by col2) then null else col2 end as x_2,
  case col3 when lag(col3) over (partition by col1 order by col3) then null else col3 end as x_3
  from t)
  group by col1;

      COL1 C_2        C_3
---------- ---------- ----------
         1 10         100,200
         2 10,20      100

Chris Saxon
June 02, 2016 - 1:50 am UTC

Nice work!

Missing

Ghassan, June 02, 2016 - 7:37 am UTC

Yes nice work but you must add an order by clause on col2 and col3 in the subquery

not enough order by's?

Igor, June 03, 2016 - 4:36 pm UTC

Hmm, sounds interesting. Ghassan, can you provide a simple test case where my query would return a wrong result? Thanks, -Igor

Listagg... WITHIN GROUP ...

Asim, July 24, 2022 - 7:33 pm UTC

What does the LISTAGG's WITHIN GROUP means when listagg is being used as reporting/analytical/window function. Because it isnt working on groups in any case i.e if main query has a GROUP BY clause or not.

In my opinion shouldnt it be WITHIN PARTITION or WITHIN WINDOW in this case
Chris Saxon
July 25, 2022 - 1:58 pm UTC

There are still groups when you use LISTAGG as an analytic! This function only supports the partitioning clause; you can't sort here or define a window.

For example, this uses the analytic form to group the employees by department, sorting their names by employee_id within each:

select employee_id, department_id, 
       listagg ( last_name, ',' ) 
         within group ( order by employee_id ) 
         over ( partition by department_id ) names
from   hr.employees
where  department_id in ( 20, 90 );

EMPLOYEE_ID DEPARTMENT_ID NAMES                                                       
----------- ------------- -----------------------
        201            20 Hartstein,Fay          
        202            20 Hartstein,Fay          
        100            90 King,Kochhar,De Haan   
        101            90 King,Kochhar,De Haan   
        102            90 King,Kochhar,De Haan   

Still thinking

Asim, July 28, 2022 - 7:25 pm UTC

I have read carefully the syntax diagram of LISTAGG and RANK functions (both diagrams of both functions ie aggregate diagram and analytic diagram).

They have seperate syntax for RANK aggregate and RANK analytic and removed the WITHIN GROUP keyword in RANK analytic and allowed the order by clause inside OVER. This seems very logical and the semantics are more easily understood/readable.

Why the same thing not done with LISTAGG analytic so that its more clear.

OR vice versa ie syntax of RANK analytic should be like LISTAGG analytic ie should use WITHIN GROUP

Actually, what i want to say is that throughout all analytic functions , the window is defined after OVER i.e (Partition Order Window), no syntax before OVER has anything to do in defining/framing that analytic window in any analytic function except LISTAGG 's WITHIN GROUP.

If GROUP BY's GROUPS are called GROUP in ORACLE then the groups of PARTITION BY Should not be given the same name although they are similar. OR if u insist that same name will be used and one should interpret its meaning from the context where the keyword is being placed in the syntax, then there was no need of OVER (PARTITION BY....) it could be simply OVER (GROUP BY....)


Chris Saxon
July 29, 2022 - 11:13 am UTC

RANK and LISTAGG work in fundamentally different ways. Perhaps this will help clear things up for you:

LISTAGG combines many rows into one, whether it's an analytic or aggregate. It's likely you want to specify how to order the values within each group it returns. This is separate to defining the group or window itself (which goes in OVER/GROUP BY). Hence WITHIN GROUP.

The WITHIN GROUP clause is optional for LISTAGG. Use this to specify how to sort which order values appear in the list. Without it the order is non-deterministic:

select department_id, 
       listagg ( last_name, ',' ) no_sort,
       listagg ( last_name, ',' ) 
         within group ( order by employee_id ) id_sort,
       listagg ( last_name, ',' ) 
         within group ( order by last_name ) name_sort
from   hr.employees
where  department_id in ( 20, 90 )
group  by department_id;

DEPARTMENT_ID    NO_SORT                 ID_SORT                 NAME_SORT               
              20 Hartstein,Fay           Hartstein,Fay           Fay,Hartstein           
              90 King,De Haan,Kochhar    King,Kochhar,De Haan    De Haan,King,Kochhar 


You can't specify a window with LISTAGG

select department_id, 
       listagg ( last_name, ',' ) 
         over ( order by department_id rows 1 preceding ) 
from   hr.employees
where  department_id in ( 20, 90 );

ORA-30487: ORDER BY not allowed here

select department_id, 
       listagg ( last_name, ',' ) 
         within group ( order by employee_id rows 1 preceding ) 
from   hr.employees
where  department_id in ( 20, 90 )
group  by department_id;

ORA-00907: missing right parenthesis


Combining LISTAGG ... OVER with GROUP BY is invalid syntax:

select department_id, 
       listagg ( last_name, ',' ) 
         within group ( order by employee_id ) 
         over ( partition by department_id ) 
from   hr.employees
where  department_id in ( 20, 90 )
group  by department_id;

ORA-00979: not a GROUP BY expression


there was no need of OVER (PARTITION BY....) it could be simply OVER (GROUP BY....)

This is how the SQL standard defines window functions. While I see your point, PARTITION BY is the accepted syntax now.

Moreover

Asim, July 28, 2022 - 9:14 pm UTC

Moreover, consider a query in which there is a GROUP BY clause in main query and the main query also use analytic LISTAGG...WITHIN GROUP order by......OVER (PARTITION BY....)

i.e SELECT .... LISTAG...WITHIN GROUP order by.... OVER(PARTITION BY....)
....
FROM table1
GROUP BY .....

I know the query will run perfectly and LISTAGG will concat on each partition...
But isnt the syntax of QUERY a little confusing to the reader????




Asim, July 28, 2022 - 9:26 pm UTC

Moreover, consider adding another column to above query with aggregate LISTAGG ie


SELECT .... ,
LISTAGG...WITHIN GROUP order by... ,
LISTAGG...WITHIN GROUP order by... OVER(PARTITION BY ....)

FROM table1
GROUP BY .....

First WITHIN GROUP works on rows grouped by GROUP BY the second WITHIN GROUP works on rows partitioned by OVER (PARTITION BY).

So isnt the syntax a little bit confusing/ambigous to the reader? Wouldnt it be better if we had syntax like analytic RANK which doesnt hv WITHIN GROUP.


Asim, July 29, 2022 - 5:28 pm UTC

You said..
Combining LISTAGG ... OVER with GROUP BY is invalid syntax:

select department_id,
listagg ( last_name, ',' )
within group ( order by employee_id )
over ( partition by department_id )
from hr.employees
where department_id in ( 20, 90 )
group by department_id;

ORA-00979: not a GROUP BY expression

Ok, as per your example its correct ,

But this is what I meant and will run

select a.deptno, ANY_VALUE(b.dname),
------
LISTAGG(a.ename) WITHIN GROUP (order by a.ename) as allemploysofadept,
-----
LISTAGG(ANY_VALUE(b.dname)) WITHIN GROUP (order by ANY_VALUE(b.dname)) OVER(PARTITION BY 1) AS alldeptnames
From scott.emp a, scott.dept b
Where a.deptno =b.deptno
GROUP BY a.deptno

The first LISTAGG is aggregate and the other is analytic and there is a GROUP BY clause as well, and have succesfully run it on liveoracle.

Now my question remains that the keywords WITHIN GROUP which is appearing 2 times in query, although have different semantics but will atleast confuse the reader and may seem ambigous at first look specially the WITHIN GROUP of analytic LISTAGG.

What you answered and proved me is... this is the way it works as per currently allowed syntax..... what I am asking is ....

1. Your comments on the ambiguity of current syntax...... and...
2. your comments on my suggestion of altering the syntax of analytic LISTAGG such that WITHIN GROUP order by is removed and order by is allowed inside OVER. so that its clear without any doubt of ambiguity at first look. Whats stopping ORACLE to do that.

While I was working on this issue a new question is raised....you see in my above query I used OVER(PARTITION BY 1).....this I was forced to do because the syntax says querypartitionclause is mandatory in analytic LISTAGG. See

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/LISTAGG.html#GUID-B6E50D8E-F467-425B-9436-F7F8BF38D466

It seemed to me illogical so I experimented with jUst OVER() and the query run successfully on liveoracle. So is the syntax of analytic LISTAGG is wrong in the document?

Thanks




Chris Saxon
August 01, 2022 - 3:00 pm UTC

I'll try again

- OVER () defines which rows are in the group
- WITHIN GROUP () defines how to sort rows in the group. It can't change which rows these are

In the working example, what's happening is:

ANY_VALUE is the aggregate
LISTAGG is an analytic on top of this

It's equivalent to this query:

with rws as (
  select department_id, 
         any_value( last_name ) last_name, 
         any_value( department_name ) department_name,
         listagg ( last_name ) within group (order by last_name ) as allemploysofadept
  from   hr.employees e
  join   hr.departments d
  using ( department_id )
  group by department_id
)
  select 
    department_id, last_name,
    allemploysofadept,
    listagg ( department_name ) 
      within group ( order by department_name ) 
      over () as alldeptnames 
  from   rws;


Notice that in RWS, LISTAGG is an aggregate and in the final query it's an analytic. At no stage is it both. And you can have an empty OVER () clause with LISTAGG.

You can nest aggregates within analytics for other functions too, for example:

select department_id, 
       avg ( sum ( salary ) ) over ()
from   hr.employees
group  by department_id


In general I think collapsing these into one is confusing.

altering the syntax of analytic LISTAGG ... Whats stopping ORACLE to do that.

As Stew said, this is all SQL standard behaviour which applies to all databases that choose to conform to it (MySQL, SQL Server, etc.). The chances of this syntax changing at this point are essentially zero.

To: Asim, July 29, 2022

Stew Ashton, July 31, 2022 - 6:51 am UTC

In the documentation you cite, I don't see where it says the "query partition clause" is required for LISTAGG to be an analytic function.

If you look at the introduction to analytic functions here: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Analytic-Functions.html

it says OVER ( <analytic clause> ) is required, but the analytic clause itself can be empty: all three components of that clause are above the line, which means they are optional.

In the final syntax you came up with, an analytic function is used on the result of an aggregation. The same page I cited above says "Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed."

So all this works as designed and documented.

You may not like the syntax, but please don't argue that it is not documented. Also, please understand that the SQL language has a standard and Oracle is not likely to replace existing standard syntax with new non-standard syntax.


Chris Saxon
August 01, 2022 - 3:00 pm UTC

Well said Stew.

To Stew Aushton

Asim, July 31, 2022 - 4:44 pm UTC

To Stew aushton

Your understanding of general syntax of analytic functions and relating/generalizing it to all analytic functions is incorrect.

See this, run it in liveoracle or elsewhere

select empno, sal, row_number() over() from scott.emp

it will give error

ORA-30485: missing ORDER BY expression in the window specification

Now see the specific analytic syntax of row_number here

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/ROW_NUMBER.html#GUID-D5A157F8-0F53-45BD-BF8C-AE79B1DB8C41

Now run this

select empno, sal, SUM(sal) over() from scott.emp

It will run without error.
Now see the specific analytic syntax of SUM function here.

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SUM.html#GUID-5610BE2C-CFE5-446F-A1F7-B924B5663220


You can see the orderbyclause is on main line of row_number syntax so you cannot go without specifying the mandatory order by clause.

So there are some analytic functions in which you must specify atleast 1 clause inside over() for eg ROW_NUMBER, RANK , DENSE_RANK etc. , and some analytic functions in which no clauses are mandatory inside Over() for eg. analytic SUM.

Therefore general analytic syntax is not enough , you have to see the function specific analytic syntax

Now coming back to LISTAGG analytic syntax

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/LISTAGG.html#GUID-B6E50D8E-F467-425B-9436-F7F8BF38D466

You can see either you can go without specifying OVER keyword (in this case it will be aggregate function )
Or if you specify OVER keyword, you cant go without specifying querypartitionclause as per syntax diagram.
But when you run query without specifying querypartitionclause, LISTAGG...OVER () it runs without error, thats why It seems to me the diagram is wrong. The diagram should show the lower and upper lines after OVER and querypartitionclause must be on upper line.






Chris Saxon
August 01, 2022 - 3:04 pm UTC

I would advise you to check your understanding very carefully before you start accusing others of being wrong.

Or if you specify OVER keyword, you cant go without specifying querypartitionclause as per syntax diagram.

select listagg ( department_id, ',' ) 
         within group ( order by department_id ) 
         over () depts
from   hr.departments
where  manager_id is not null;

DEPTS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
-----------------------------------------------------------------
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110
10,20,30,40,50,60,70,80,90,100,110

One more aspect

Asim, August 01, 2022 - 12:33 am UTC

One other thing I just got reminded of that order by inside over is to define the window and order of its data to work on, and the order by of WITHIN GROUP is to define the order in which the exp is to be concatenated and this order may be based on the expression being concatenated itself (normally) or may be based on some other expr.
So there can be scenarios when both order bys are needed in analytic listagg ie WITHIN GROUP ORDER BY and OVER(ORDER BY) both and yes that becomes too complicated, however even then also the WITHIN GROUP keyword becomes more ambigous.


Chris Saxon
August 01, 2022 - 3:07 pm UTC

Can you demonstrate a case where that's possible?

OVER ( ORDER BY ) is invalid for LISTAGG.

The OVER and WITHIN GROUP clauses are mutually exclusive for RANK:

select rank ( 10 ) 
         within group ( order by department_id ) 
         over ( order by deparment_id ) depts
from   hr.departments;

ORA-00923: FROM keyword not found where expected

Scenarios

Asim, August 02, 2022 - 10:55 am UTC

Ok here are the scenarios where both ORDER BY's may be needed

1
If we want cumilative concatination of ename based on ascending order of salaries.

We cant write the below SQL

select ename,
job,
sal,
listagg(ename,',') within group(order by sal) over(partition by job order by sal) enames,
count(*) over(partition by job order by sal) cnt
from emp
order by job,
sal

and therefore cant get the below result

ENAME JOB SAL ENAMES CNT
---------- --------- ---------- ------------------------------ ----------
FORD ANALYST 3000 FORD,SCOTT 2
SCOTT ANALYST 3000 FORD,SCOTT 2
SMITH CLERK 800 SMITH 1
JAMES CLERK 950 SMITH,JAMES 2
ADAMS CLERK 1100 SMITH,JAMES,ADAMS 3
MILLER CLERK 1300 SMITH,JAMES,ADAMS,MILLER 4
CLARK MANAGER 2450 CLARK 1
BLAKE MANAGER 2850 CLARK,BLAKE 2
JONES MANAGER 2975 CLARK,BLAKE,JONES 3
KING PRESIDENT 5000 KING 1
MARTIN SALESMAN 1250 MARTIN,WARD 2
WARD SALESMAN 1250 MARTIN,WARD 2
TURNER SALESMAN 1500 MARTIN,WARD,TURNER 3
ALLEN SALESMAN 1600 MARTIN,WARD,TURNER,ALLEN 4
Also, we can't get rolling window by sal and LISTAGG ordered by ename

2.
And we can't get rolling window by sal and LISTAGG ordered by ename ie we cant write below SQL

select ename,
job,
sal,
listagg(ename,',') within group(order by ename) over(partition by job order by sal) enames,
count(*) over(partition by job order by sal) cnt
from emp
order by job,
sal
/



Chris Saxon
August 02, 2022 - 3:44 pm UTC

Correct, you can't get the "rolling CSV" with LISTAGG. If you want to do that you'll need a different method.

Here's one using recursive with - I'm sure there are others

with rws as (
  select e.*, 
         row_number() over ( 
           partition by department_id 
           order by employee_id 
         ) rn
  from   hr.employees e
  where  department_id = 90
), running_names ( employee_id, department_id, rn, first_name, name_list ) as (
  select employee_id, department_id, rn, first_name, first_name 
  from   rws r
  where  rn = 1
  union all
  select r.employee_id, r.department_id, r.rn, r.first_name, 
         rn.name_list || ',' || r.first_name
  from   rws r
  join   running_names rn
  on     r.department_id = rn.department_id
  and    r.rn - 1 = rn.rn
) 
  select * from running_names;
  
EMPLOYEE_ID DEPARTMENT_ID         RN FIRST_NAME           NAME_LIST                     
----------- ------------- ---------- -------------------- ------------------------------
        100            90          1 Steven               Steven                        
        101            90          2 Neena                Steven,Neena                  
        102            90          3 Lex                  Steven,Neena,Lex  

Mandatory syntax of querypartionclause

Asim, August 02, 2022 - 11:03 am UTC


And about the documentation error in diagram of LISTAGG analytic.

The documentation diagram seems to say
LISTAGG.....OVER(querypartitionclause) whereas it should be
LISTAGG.....OVER([querypartitionclause])

Moreover the documentaion diagram also lacks the brackets () of OVER

Chris Saxon
August 02, 2022 - 3:44 pm UTC

Thanks for raising this - we're in the process of fixing it

The default window

Asim, August 02, 2022 - 1:07 pm UTC

OK, now I have understood that the WITHIN GROUP ORDER BY and OVER(ORDER BY) are different and serves different purpose.
One thing due to which one can misinterpret, as I was misinterpreting the OVER(PARTITION BY ORDER BY, is the default window RANGE BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW, due to which there is actually nothing like OVER(PARTION BY ORDER BY) or OVER( ORDER BY).

Every OVER (PARTITION BY ORDER BY) is equal to OVER (PARTITION BY ORDER BY RANGE BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW
.
Every OVER ( ORDER BY) is equal to OVER (ORDER BY RANGE BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW.


Moreover the word window is also ambigous as it refers to 2 things.

1. OVER(), OVER(PARTITION BY)

2. OVER(ORDER BY), OVER(PARTITION BY ORDER BY).OVER(PARTITION BY ORDER BY WINDOW), OVER ( ORDER BY WINDOW).

i.e the word window is refering to the whole analytic-clause and also refering to the 3rd component ie only WINDOW component of analytic-clause





Chris Saxon
August 02, 2022 - 3:51 pm UTC

Moreover the word window is also ambigous as it refers to 2 things.

OVER ( PARTITION BY ... ) is also defining a window - it just includes all the rows in the partition.

Conceptually it's the same as:

OVER ( PARTITION BY ... ORDER BY 1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )

Another scenario

Asim, August 02, 2022 - 3:12 pm UTC

Also we cannot write this to get cumilative aggregation of sum(sal)

select e.deptno,sum(sal) ssal,max(d.dname) dname
,sum(sum(sal)) over (order by max(dname)) osum,
LISTAGG(sum(sal),',') WITHIN GROUP (order by (e.deptno)) OVER(order by max(d.dname)) olist
from emp e,dept d where e.deptno=d.deptno
group by e.deptno



Chris Saxon
August 02, 2022 - 3:53 pm UTC

What exactly are you trying to do here?

As stated, you can't have LISTAGG ... OVER ( ORDER BY ... )

Correction

Asim, August 02, 2022 - 3:14 pm UTC

Please make correction in above


Replace " cumilative aggregation of sum(sal)" by " cumilative concatination of sum(sal)"

To: Asim

Stew Ashton, August 03, 2022 - 7:08 am UTC

You are making things complicated in at least two ways:

1) you are discussing LISTAGG() in two different forums, so it is hard for anyone to keep track of who said what where.

2) you are mixing up different subjects in the same discussion: the syntax of LISTAGG(), the documentation of LISTAGG() and the functionality provided by LISTAGG().

When you say that "WITHIN GROUP" is an unfortunate choice of words when there is no GROUP BY clause, that is a syntax issue.

When you say the documentation implies that the PARTITION BY clause is required, you are correct: that is a documentation issue.

When you say we cannot get a cumulative concatenation of summed salaries, you are correct. That is not a syntax issue, it is a restriction of the LISTAGG() function.

The Data Warehousing Guide has a chapter on analytic functions. It talks about "Windowing functions" that "can be used to compute cumulative, moving, and centered aggregates." See https://docs.oracle.com/en/database/oracle/oracle-database/21/dwhsg/sql-analysis-reporting-data-warehouses.html#GUID-2877E1A5-9F11-47F1-A5ED-D7D5C7DED90A

LISTAGG() is not a "windowing function". You may want it to become one. I do. That would be an enhancement request. It is neither a syntax issue nor a documentation issue.
Chris Saxon
August 03, 2022 - 12:46 pm UTC

Good summary Stew

Thanks

Asim, August 03, 2022 - 11:01 am UTC

To Stew

Thanks for summarizing in nice way.

Keyword

Asim, August 04, 2022 - 2:05 am UTC

Ok, but do you agree that for Analytic LISTAGG the WITHIN GROUP keyword be replaced by WITHIN ANALYTIC WINDOW or WITHIN WINDOW or WITHIN PARTITION or something like that for the purpose of clarity.

But you said LISTAGG is not a windowing function. So the above new keywords has the word WINDOW

But as explained above every analytic function conceptually has a window no matter windowingclause is specified or not.
As OVER (PARTITION BY ) is conceptually a window. it is same as if one was able to write OVER (PARTITION BY .. ORDER BY 1 ROWS BETWEEN UNBOUNDED PRECEEDING AND UNBOUNDED FOLLOWING)


Connor McDonald
August 04, 2022 - 11:55 am UTC

Oracle doesn't write the SQL standards.

"For the purpose of clarity"

Stew Ashton, August 04, 2022 - 7:34 am UTC

Asim, you ask "do you agree that for Analytic LISTAGG the WITHIN GROUP keyword be replaced by WITHIN ANALYTIC WINDOW or WITHIN WINDOW or WITHIN PARTITION or something like that for the purpose of clarity."

No, I don't agree. I never agree with suggestions that add nothing to functionality. Moreover, changing existing syntax will never make things "clearer" for people who already know the existing syntax.

If we were on the SQL standards committee creating a brand new standard syntax for LISTAGG(), I would listen to you. It is way too late now.

Please forgive me if I seem rude when I say that this is my last comment on the subject.
Connor McDonald
August 04, 2022 - 11:56 am UTC

+1

Comments

Asim, August 04, 2022 - 7:09 pm UTC


Stew
Thank you very much for your valuable input and time , I really learned alot

Connor what you mean by +1

Connor McDonald
August 04, 2022 - 11:43 pm UTC

"+1" is a relic of Google hangouts. Its like "I agree"

Synonyms

Asim, August 04, 2022 - 8:07 pm UTC



Ok, the below is for general people to discuss , Stew and connor, you can also comment if you like. I will appreciate.

Ok, backward compatibility is an issue when replacing WITHIN GROUP in analytic LISTAGG by something else agree.

But they can keep the WITHIN GROUP keyword as it is and add another keyword WITHIN WINDOW for clarity purpose and allow the user to use any ONE of them.

And this will not be the first time Oracle will be doing this, same (although for the purpose of conformance with SQL standards not for clarity purpose) thing has been done in past, where a synonym keyword was added without any added functionality. For eg.
new COUNT(UNIQUE) = COUNT(DISTINCT)

and new set operator
EXCEPT [ALL] = MINUS [ALL]


see

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Aggregate-Functions.html#GUID-62BE676B-AF18-4E63-BD14-25206FEA0848

DISTINCT and UNIQUE, which are synonymous, cause an aggregate function to consider only distinct values of the argument expression. The syntax diagrams for aggregate functions in this chapter use the keyword DISTINCT for simplicity

and this

https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Changes-in-This-Release-for-Oracle-Database-SQL-Language-Reference.html#GUID-3C11D3A9-8B14-4DCC-B212-B7FE57EE81E8
Enhanced SQL Set Operators

The SQL set operators now support all keywords as defined in ANSI SQL. The new operator EXCEPT [ALL] is functionally equivalent to MINUS [ALL]. The operators MINUS and INTERSECT now support the keyword ALL.








Agree with who

Asim, August 05, 2022 - 1:39 am UTC


Ok so +1 means I agree. So you agree with me or stew?
Connor McDonald
August 08, 2022 - 1:14 am UTC

Stew :-)

Standards

Asim, August 05, 2022 - 9:56 am UTC

Connor, Oracle doesnt write standards but oracle has good proprietry keywords than standards I think. I normally like Oracle better than standards.

like DISTINCT is better than UNIQUE
MINUS is better than EXCEPT


Chris Saxon
August 05, 2022 - 12:49 pm UTC

DISTINCT is the SQL standard syntax

Window functios

Asim, August 14, 2022 - 4:13 am UTC

Chris please refer to your post on 2nd august 2022.

Can we say that all analytical functions do define a window either conceptual or default/implicit, or explicit. The analytical functions which allow the user to explicitly define window are called/named/classified as windowing analytical functions.

Connor McDonald
August 15, 2022 - 5:18 am UTC

This is really just your nomenclature of preference.

When we first introduced them in 8i, we called them Analytic Functions. When other database vendors caught up, they often referred to them as Window Functions.

I use the terms interchangeably, but Window Functions seems to have become more common