Skip to Main Content
  • Questions
  • analytic function sum() - grouping duplicate values

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Peter.

Asked: February 27, 2006 - 11:32 am UTC

Last updated: February 27, 2006 - 2:16 pm UTC

Version: 10g rel 2

Viewed 1000+ times

You Asked

Tom,
Question on the sum() function - more of a curiosity than anything else. My understanding of the sum() function is that it can be used to give a 'running total' for each row in a table. However, I found that if there are duplicate values, sum() groups them together and displays that total against all the duplicate colums - is this expected behaviour ?

Small demo:

create table p (col1 number);

insert into p values (1);
insert into p values (1);
insert into p values (1);
insert into p values (2);
insert into p values (2);
insert into p values (3);
insert into p values (4);


SQL> select col1, sum(col1) over (order by col1) sum_col from p;

COL1 SUM_COL
--------- ---------
1 3
1 3
1 3
2 7
2 7
3 13
3 13
4 17

and Tom said...

absolutely expected, in my seminar on analytics, I point this out.

Tell me - which col1 comes "first" when they are duplicated? Neither, they arrive at the same time.

If you want a "true" running total, you must ensure something unique is added there in the ordering (eg: rowid would do it in this case)

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Default window

Michel Cadot, February 27, 2006 - 2:47 pm UTC

It's not a so strange result when you think that the default window is "range between unbounded preceding and current row" that is the window contains all rows with value less than (assuming ascendant order) or equal to the current row value and not "rows between unbounded preceding and current row" as you expected.

Here's the two outputs:

SQL> select col1, 
  2         sum(col1) 
  3         over (order by col1 range between unbounded preceding and current row)
  4         "Sum"
  5  from p;
      COL1        Sum
---------- ----------
         1          3
         1          3
         1          3
         2          7
         2          7
         3         10
         4         14

7 rows selected.

SQL> select col1, 
  2         sum(col1) 
  3         over (order by col1 rows between unbounded preceding and current row)
  4         "Sum"
  5  from p;
      COL1        Sum
---------- ----------
         1          1
         1          2
         1          3
         2          5
         2          7
         3         10
         4         14

7 rows selected.

Regards
Michel 

Great explanation and demo, Mike - Thank you verymuch.

A reader, March 31, 2006 - 10:19 am UTC


More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.