Show rows where the running total is zero and all preceding rows
Connor McDonald

Thanks for the question, Rakesh.

Asked: May 05, 2017 - 2:38 am UTC

Last updated: May 11, 2017 - 8:59 am UTC

Version: 11g

Viewed 1000+ times

You Asked

Hi Team,

Data Setup:
create table test_item as 
with t1 as 
(
select 1 category, 'abc' Item_name,100 item_volume,12.3 item_price,to_date('04-may-2017 12:22:33 AM','dd-mon-yyyy hh:mi:ss AM') time from dual
union 
select 1,'abc' ,-100 ,12.3 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 1,'abc' ,200 ,12.3 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union 
select 1,'abc' ,-100 ,12.3 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 2 category, 'abcd' Item_name,100 item_volume,12.32 item_price,to_date('04-may-2017 12:22:33 AM','dd-mon-yyyy hh:mi:ss AM') time from dual
union 
select 2,'abcd' ,-25 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 2,'abcd' ,-75 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union 
select 2,'abcd' ,100 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
)
select * from t1;

-------------------------------------------------- 
  CATEGORY ITEM ITEM_VOLUME ITEM_PRICE TIME
---------- ---- ----------- ---------- -----------------------
         1 abc          100       12.3 04-may-2017 12:22:33 am
         1 abc         -100       12.3 04-may-2017 12:22:34 am
         1 abc          200       12.3 04-may-2017 12:22:34 am
         1 abc         -100       12.3 04-may-2017 12:22:35 am
         2 abcd         100      12.32 04-may-2017 12:22:33 am
         2 abcd         -75      12.32 04-may-2017 12:22:34 am
         2 abcd         -25      12.32 04-may-2017 12:22:34 am
         2 abcd         100      12.32 04-may-2017 12:22:35 am



Question: I need a SQL query that returns only the rows below.

Requirement: Partition by item_name and item_price, order by time, and sum item_volume as a running total. Wherever that running total becomes ZERO, print those records and all the rows before them. In the expected output below, 100 and -100 sum to zero, and 100, -75 and -25 sum to zero, so only those records are needed.

         1 abc          100       12.3 04-MAY-17
         1 abc         -100       12.3 04-MAY-17
         2 abcd         100      12.32 04-MAY-17
         2 abcd         -75      12.32 04-MAY-17
         2 abcd         -25      12.32 04-MAY-17



and Chris said...

So you want to find the rows where the running total per category is zero. Then display that and all the rows preceding it? What happens if the running total can reach zero on two separate occasions?

Anyway, here's a basic algorithm:

1. Calculate the running total per category
2. Assign a row number by category
3. Find the highest row number among the rows with the smallest running total
4. Check that the smallest running total is zero
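To trace these steps outside the database, here is a small Python sketch using the category-1 rows (an illustration of the logic only, not Oracle code):

```python
from itertools import accumulate

# category 1 item_volume values, already ordered by (time, item_volume)
volumes = [100, -100, 200, -100]

# steps 1 & 2: running total and row number within the category
tots = list(accumulate(volumes))                              # [100, 0, 200, 100]

# step 4: the smallest running total in the partition
mn = min(tots)                                                # 0

# step 3: the highest row number among rows with that smallest total
mxrn = max(rn for rn, tot in enumerate(tots, 1) if tot == mn) # 2

# keep rows up to that point, provided the minimum total is zero
result = volumes[:mxrn] if mn == 0 else []
print(result)                                                 # [100, -100]
```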

Do the first two steps in the first pass:

select ti.*, 
       sum(item_volume) over (partition by category order by time, item_volume) tot,
       row_number() over (partition by category order by time, item_volume) rn
from   test_item ti;


You can find the value for step 3 with:

max(rn) keep (dense_rank first order by tot) over (partition by category)


This is saying:

Sort the rows by (running) tot. Then for those that have the lowest running total in each category, find the maximum row number.
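As a procedural analogy (Python, not the SQL itself): keep (dense_rank first order by tot) restricts the aggregate to the rows that tie for the lowest tot, and max(rn) then picks the largest row number among them.

```python
# (rn, tot) pairs for one category, as produced by the first-pass query
rows = [(1, 100), (2, 0), (3, 200), (4, 100)]

lowest = min(tot for _, tot in rows)                  # dense_rank first order by tot
mxrn = max(rn for rn, tot in rows if tot == lowest)   # max(rn) keep (...)
print(mxrn)                                           # 2
```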

All you need to do after this is ensure that the min running total is zero and find all the rows with a row number equal to or lower than that found in step 3:

create table test_item as 
with t1 as 
(
select 1 category, 'abc' Item_name,100 item_volume,12.3 item_price,to_date('04-may-2017 12:22:33 AM','dd-mon-yyyy hh:mi:ss AM') time from dual
union 
select 1,'abc' ,-100 ,12.3 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 1,'abc' ,200 ,12.3 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union 
select 1,'abc' ,-100 ,12.3 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 2 category, 'abcd' Item_name,100 item_volume,12.32 item_price,to_date('04-may-2017 12:22:33 AM','dd-mon-yyyy hh:mi:ss AM') time from dual
union 
select 2,'abcd' ,-25 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 2,'abcd' ,-75 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union 
select 2,'abcd' ,100 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 3,'def' ,100 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union 
select 3,'def' ,50 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
)
select * from t1;

with tots as (
select ti.*, 
       sum(item_volume) over (partition by category order by time, item_volume) tot,
       row_number() over (partition by category order by time, item_volume) rn
from   test_item ti
), rws as (
  select t.*, max(rn) keep (dense_rank first order by tot) over (partition by category) mxrn ,
         min(tot) over (partition by category) mn
  from   tots t
)
  select * from rws
  where  rn <= mxrn
  and    mn = 0
  order  by 1, 5;

CATEGORY  ITEM_NAME  ITEM_VOLUME  ITEM_PRICE  TIME                  TOT  RN  MXRN  MN  
1         abc        100          12.3        04-MAY-2017 00:22:33  100  1   2     0   
1         abc        -100         12.3        04-MAY-2017 00:22:34  0    2   2     0   
2         abcd       100          12.32       04-MAY-2017 00:22:33  100  1   3     0   
2         abcd       -75          12.32       04-MAY-2017 00:22:34  25   2   3     0   
2         abcd       -25          12.32       04-MAY-2017 00:22:34  0    3   3     0


Or if you want to do it 12c style, you can use match_recognize:

select * from test_item
match_recognize (
  partition by category order by time, item_volume
  measures classifier() c
  all rows per match 
  pattern (running* zero) 
  define 
    zero as sum(running.item_volume) = 0
)
where  c = 'RUNNING';

CATEGORY  TIME                  ITEM_VOLUME  C        ITEM_NAME  ITEM_PRICE  
1         04-MAY-2017 00:22:33  100          RUNNING  abc        12.3        
1         04-MAY-2017 00:22:34  -100         RUNNING  abc        12.3        
2         04-MAY-2017 00:22:33  100          RUNNING  abcd       12.32       
2         04-MAY-2017 00:22:34  -75          RUNNING  abcd       12.32       
2         04-MAY-2017 00:22:34  -25          RUNNING  abcd       12.32
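For readers without a 12c database, the pattern's semantics can be emulated in Python. This is an illustration only; it assumes Oracle's defaults of greedy matching and resuming past the last row of each match:

```python
def first_zero_runs(volumes):
    """Emulate pattern (running* zero) with
    define zero as sum(running.item_volume) = 0: the RUNNING rows before
    the final ZERO row must sum to zero. Greedy (longest attempt first),
    resuming past the last row of each match."""
    matches, i = [], 0
    while i < len(volumes):
        found = None
        for j in range(len(volumes) - 1, i, -1):   # j is the candidate ZERO row
            if sum(volumes[i:j]) == 0:             # RUNNING rows are i .. j-1
                found = j
                break
        if found is None:
            i += 1
        else:
            matches.append(volumes[i:found])       # keep only the RUNNING rows
            i = found + 1
    return matches

# category 2 ('abcd') rows ordered by time, item_volume
print(first_zero_runs([100, -75, -25, 100]))       # [[100, -75, -25]]
```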


This will only match the first time the running total hits zero, though. The analytic method will match multiple occurrences:

insert into test_item
 select 1,'abc' ,-100 ,12.3 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual;

select * from test_item
match_recognize (
  partition by category order by time, item_volume
  measures classifier() c
  all rows per match 
  pattern (running* zero) 
  define 
    zero as sum(running.item_volume) = 0
)
where  c = 'RUNNING';

CATEGORY  TIME                  ITEM_VOLUME  C        ITEM_NAME  ITEM_PRICE  
1         04-MAY-2017 00:22:33  100          RUNNING  abc        12.3        
1         04-MAY-2017 00:22:34  -100         RUNNING  abc        12.3        
2         04-MAY-2017 00:22:33  100          RUNNING  abcd       12.32       
2         04-MAY-2017 00:22:34  -75          RUNNING  abcd       12.32       
2         04-MAY-2017 00:22:34  -25          RUNNING  abcd       12.32  

with tots as (
select ti.*, 
       sum(item_volume) over (partition by category order by time, item_volume) tot,
       row_number() over (partition by category order by time, item_volume) rn
from   test_item ti
), rws as (
  select t.*, max(rn) keep (dense_rank first order by tot) over (partition by category) mxrn ,
         min(tot) over (partition by category) mn
  from   tots t
)
  select * from rws
  where  rn <= mxrn
  and    mn = 0
  order  by 1, 5;

CATEGORY  ITEM_NAME  ITEM_VOLUME  ITEM_PRICE  TIME                  TOT  RN  MXRN  MN  
1         abc        100          12.3        04-MAY-2017 00:22:33  100  1   5     0   
1         abc        -100         12.3        04-MAY-2017 00:22:34  0    2   5     0   
1         abc        200          12.3        04-MAY-2017 00:22:34  200  3   5     0   
1         abc        -100         12.3        04-MAY-2017 00:22:35  0    4   5     0   
1         abc        -100         12.3        04-MAY-2017 00:22:35  0    5   5     0   
2         abcd       100          12.32       04-MAY-2017 00:22:33  100  1   3     0   
2         abcd       -75          12.32       04-MAY-2017 00:22:34  25   2   3     0   
2         abcd       -25          12.32       04-MAY-2017 00:22:34  0    3   3     0


Comments

Always Awesome!

Rakesh Amujuri, May 05, 2017 - 5:31 pm UTC

Thanks Chris, and thanks again for the 12c solution. Yes, it is possible for the running total to reach zero on more than one occasion, and I missed a few data cases; please find them below.

==== Here are the other data cases ====

insert into test_item
select 3,'abcw' ,-25 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 3,'abcw' ,-75 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 3,'abcw' ,100 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 3,'abcw' ,-2500 ,12.32 ,to_date('04-may-2017 12:22:38 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 4,'abcj' ,-2500 ,12.32 ,to_date('04-may-2017 12:22:30 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 4,'abcj' ,-25 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 4,'abcj' ,-75 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 4,'abcj' ,100 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,-2500 ,12.32 ,to_date('04-may-2017 12:22:30 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,-25 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,-75 ,12.32 ,to_date('04-may-2017 12:22:34 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,100 ,12.32 ,to_date('04-may-2017 12:22:35 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,100 ,12.32 ,to_date('04-may-2017 12:22:36 AM','dd-mon-yyyy hh:mi:ss AM') from dual
union
select 5,'abcx' ,-100 ,12.32 ,to_date('04-may-2017 12:22:37 AM','dd-mon-yyyy hh:mi:ss AM') from dual;
Chris Saxon
May 08, 2017 - 8:50 am UTC

What are you expecting to see with these? These categories all have a running total below zero at some point. Which rows do you want to show?

Rakesh Amujuri, May 08, 2017 - 1:19 pm UTC


(The same insert statement as in the previous comment.)

Expected output: display the rows whose running total adds up to zero, sorted by time ascending.
3,'abcw' ,-25 ,12.32
3,'abcw' ,-75 ,12.32
3,'abcw' ,100 ,12.32
4,'abcj' ,-25 ,12.32
4,'abcj' ,-75 ,12.32
4,'abcj' ,100 ,12.32
5,'abcx' ,-25 ,12.32
5,'abcx' ,-75 ,12.32
5,'abcx' ,100 ,12.32
5,'abcx' ,100 ,12.32
5,'abcx' ,-100 ,12.32
Chris Saxon
May 08, 2017 - 5:35 pm UTC

So you're looking for a consecutive series of transactions, anywhere in the data set that sum to zero?

If so, I think you have to use match_recognize to achieve a general solution. You can do it with a slight tweak to the original: define zero as sum(item_volume) = 0 instead of sum(running.item_volume) = 0:

select * from test_item
match_recognize (
  partition by category order by time, item_volume
  measures classifier() c, sum(item_volume) as run_tot
  all rows per match 
  pattern (running* zero) 
  define 
    zero as sum(item_volume) = 0
);

CATEGORY  TIME                  ITEM_VOLUME  C        RUN_TOT  ITEM_NAME  ITEM_PRICE  
1         04-MAY-2017 00:22:33  100          RUNNING  100      abc        12.3        
1         04-MAY-2017 00:22:34  -100         ZERO     0        abc        12.3        
2         04-MAY-2017 00:22:33  100          RUNNING  100      abcd       12.32       
2         04-MAY-2017 00:22:34  -75          RUNNING  25       abcd       12.32       
2         04-MAY-2017 00:22:34  -25          ZERO     0        abcd       12.32       
3         04-MAY-2017 00:22:34  -75          RUNNING  -75      abcw       12.32       
3         04-MAY-2017 00:22:34  -25          RUNNING  -100     abcw       12.32       
3         04-MAY-2017 00:22:35  100          ZERO     0        abcw       12.32       
4         04-MAY-2017 00:22:34  -75          RUNNING  -75      abcj       12.32       
4         04-MAY-2017 00:22:34  -25          RUNNING  -100     abcj       12.32       
4         04-MAY-2017 00:22:35  100          ZERO     0        abcj       12.32       
5         04-MAY-2017 00:22:34  -75          RUNNING  -75      abcx       12.32       
5         04-MAY-2017 00:22:34  -25          RUNNING  -100     abcx       12.32       
5         04-MAY-2017 00:22:35  100          RUNNING  0        abcx       12.32       
5         04-MAY-2017 00:22:36  100          RUNNING  100      abcx       12.32       
5         04-MAY-2017 00:22:37  -100         ZERO     0        abcx       12.32
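To see procedurally what this tweaked define clause changes, here is a rough Python emulation (an illustration only, assuming greedy matching and skip-past-last-row defaults):

```python
def zero_sum_matches(volumes):
    """Emulate pattern (running* zero) with define zero as
    sum(item_volume) = 0: the whole match, including the final ZERO row,
    must sum to zero. Greedy (longest attempt first), resuming past the
    last row of each match."""
    matches, i = [], 0
    while i < len(volumes):
        match_end = None
        for j in range(len(volumes) - 1, i - 1, -1):  # longest candidate first
            if sum(volumes[i:j + 1]) == 0:
                match_end = j
                break
        if match_end is None:
            i += 1
        else:
            matches.append(volumes[i:match_end + 1])
            i = match_end + 1
    return matches

# category 5 ('abcx') rows ordered by time, item_volume
print(zero_sum_matches([-2500, -75, -25, 100, 100, -100]))
# [[-75, -25, 100, 100, -100]]
```

Note how the unmatched -2500 row is simply skipped, and the zero at the third row of the match does not end it, because the greedy running* prefers the longer match ending at the final -100.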

A reader, May 09, 2017 - 2:01 am UTC

Yes, this is my exact requirement "So you're looking for a consecutive series of transactions, anywhere in the data set that sum to zero?"

We have not upgraded to 12c yet, so I will try the other option, test it against all my data cases, and come back if required. Thank you.

Rakesh Amujuri, May 09, 2017 - 3:09 pm UTC

The match_recognize function best suits my requirement, but we are on 11g. Is it possible to write an 11g-compatible SQL query for the same requirement?


Chris Saxon
May 11, 2017 - 8:59 am UTC

The problem is that for every row in your table you need to look an unknown number of rows back (or forward, if you prefer). So with analytics, you would need N different windowing clauses for the N rows before the current row, because you don't know which (if any!) sum to zero.

It may be possible to write a recursive solution that does this, but I seriously doubt it will perform well...

A few details and a new approach

Stew Ashton, May 12, 2017 - 3:52 pm UTC

1) Rakesh says "partition by item_name, item_price" and Chris says "partition by category": which is it? I'm going to assume "partition by category, item_name, item_price" but that can easily be changed.

2) Within a partition, there are multiple rows with the same datetime value, so Chris orders by "time, item_volume" to get a deterministic result. Ordering differently could change the result.

3) Why just zero? If you do a cumulative sum and it goes to zero, then you want all the rows from the beginning to that point. In addition, if the same cumulative-sum value appears more than once, then you want the rows after the first occurrence up to and including the last occurrence.
with cumul_sums as (
    select category, item_name, item_price, item_volume,
        row_number() over(
            partition by category, item_name, item_price
            order by time, item_volume, rowid
        ) rn,
        sum(item_volume) over(
            partition by category, item_name, item_price
            order by time, item_volume, rowid
            rows unbounded preceding
        ) cumul_volume
    from test_item
)
, ranges as (
    select * from (
        select category, item_name, item_price,
            case when cumul_volume = 0 then 1
                else min(rn) + 1
            end first_rn,
            max(rn) last_rn
        from cumul_sums
        group by category, item_name, item_price, cumul_volume
    )
    where first_rn <= last_rn
)
select category, item_name, item_price, item_volume
from cumul_sums cs
where exists (
    select null from ranges r
    where (cs.category, cs.item_name, cs.item_price) = ((r.category, r.item_name, r.item_price))
    and cs.rn between r.first_rn and r.last_rn
)
order by category, item_name, item_price, rn;

  CATEGORY ITEM ITEM_PRICE ITEM_VOLUME
---------- ---- ---------- -----------
         3 abcw      12.32         -75
         3 abcw      12.32         -25
         3 abcw      12.32         100
         4 abcj      12.32         -75
         4 abcj      12.32         -25
         4 abcj      12.32         100
         5 abcx      12.32         -75
         5 abcx      12.32         -25
         5 abcx      12.32         100
         5 abcx      12.32         100
         5 abcx      12.32        -100
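The ranges subquery above rests on a classic prefix-sum fact: if the cumulative sum takes the same value at two positions, the rows between them sum to zero. A Python sketch of that idea (an illustration, not the SQL):

```python
from itertools import accumulate

def zero_sum_ranges(volumes):
    """If the cumulative sum takes the same value at 1-based positions
    i < j, then rows i+1 .. j sum to zero; a cumulative sum of exactly 0
    at position j means rows 1 .. j sum to zero. This mirrors the
    grouping on cumul_volume in the ranges subquery."""
    sums = list(accumulate(volumes))
    ranges = []
    for value in set(sums):
        positions = [i + 1 for i, s in enumerate(sums) if s == value]
        first = 1 if value == 0 else min(positions) + 1
        last = max(positions)
        if first <= last:                 # drop values seen only once
            ranges.append((first, last))
    return sorted(ranges)

# category 5 ('abcx'): cumulative sums are -2500, -2575, -2600, -2500, -2400, -2500
print(zero_sum_ranges([-2500, -75, -25, 100, 100, -100]))  # [(2, 6)]
```

Here -2500 appears at positions 1, 4 and 6, so rows 2 through 6 sum to zero, which is exactly the (first_rn, last_rn) pair the SQL produces for that category.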

Corrections...

Stew Ashton, May 12, 2017 - 3:59 pm UTC

- I forgot to remove ", rowid" from two lines of my code.

- I said "new approach" but rereading Chris' first answer I see that my approach is a variant of his.

Best regards, Stew
