create table test1 (col1 number,col2 varchar2(5),col3 date,col4 number,col5 number,col6 number);
--Positive scenario sample data
insert into test1 values(1,'A',trunc(sysdate),0,5.5,2);
insert into test1 values(1,'A',trunc(sysdate)+1,0,3.5,3);
insert into test1 values(1,'A',trunc(sysdate)+2,0,2.5,0);
insert into test1 values(1,'A',trunc(sysdate)+3,0,6.5,0);
insert into test1 values(1,'A',trunc(sysdate)+4,0,8,0);
--Negative scenario Sample data (not working)
insert into test1 values(2,'B',trunc(sysdate),0,5.5,2);
insert into test1 values(2,'B',trunc(sysdate),1,3.5,3);
insert into test1 values(2,'B',trunc(sysdate)+1,0,3.5,5);
insert into test1 values(2,'B',trunc(sysdate)+2,0,2.5,2);
insert into test1 values(2,'B',trunc(sysdate)+3,0,6.5,0);
insert into test1 values(2,'B',trunc(sysdate)+4,0,8,0);
insert into test1 values(2,'B',trunc(sysdate)+4,1,7.9,0);
insert into test1 values(2,'B',trunc(sysdate)+5,0,10.5,0);
insert into test1 values(2,'B',trunc(sysdate)+6,0,1.5,0);
/*Working scenario */
--Query fetches for positive scenario where always col4 is 0 for a corresponding combination of col1 and col2. there is no issue
select test1.*,sum(col5) over (partition by col1,col2 order by col3 desc,col4 desc rows col6 preceding ) derived_field from test1
where col1=1 order by col1,col2,col3,col4;
/* Negative Scenario */
--This query needs to be modified so that it should consider only records where col4 is 0. i.e if col6 is 5 and there is a record in preceding 5 rows where col4 is 1
--then that record needs to be ignored and it has to consider the next available 0th record
select test1.*,sum(col5) over (partition by col1,col2 order by col3 desc,col4 desc rows col6 preceding )derived_field from test1
where col1=2 order by col1,col2,col3,col4;
Hi TOM,
I have a typical scenario where we need to ignore n records in between and add n records when performing window functions
In the above example, there are two sets of data differentiated by values of Col1 (values being 1 and 2). For the first scenario where always col4 is 0 for every row where col1 is 1. The SQL window functions works as expected. Whereas the second scenario is the typical one. I have tried my best to illustrate the problem statement below. Please let me know if any more information is needed
COL1 COL2 COL3 COL4 COL5 COL6
2 B 12-Dec-18 0 5.5 2
2 B 12-Dec-18 1 3.5 3
2 B 13-Dec-18 0 3.5 5
2 B 14-Dec-18 0 2.5 2
2 B 15-Dec-18 0 6.5 0
2 B 16-Dec-18 0 8 0
2 B 16-Dec-18 1 7.9 0
2 B 17-Dec-18 0 10.5 0
2 B 18-Dec-18 0 1.5 0
From the above table I need to derive a new field. That field will be sum(col5) for the col6 number of rows ahead with some conditions. like the output shown below
COL1 COL2 COL3 COL4 COL5 COL6 Derived_field
2 B 12-DEC-18 0 5.5 2 11 -- (5.5+3.5(of date 13/dec)+2.5)
2 B 12-DEC-18 1 3.5 3 16 -- (3.5+3.5+2.5+6.5)
2 B 13-DEC-18 0 3.5 5 32.5
2 B 14-DEC-18 0 2.5 2 17
2 B 15-DEC-18 0 6.5 0 6.5
2 B 16-DEC-18 0 8 0 8
2 B 16-DEC-18 1 7.9 0 7.9
2 B 17-DEC-18 0 10.5 0 10.5
2 B 18-DEC-18 0 1.5 0 1.5
I have tried with the following query, but not sure how to ignore rows where col4=1 and dynamically add 1 more row if there is any ignored row in between. The critical part is there may be scenarios where col4 will have values like 2,3,4 where all of them needs to ignored and equivalent number of records where col4=0 needs to be added
select test1.*,sum(col5) over (partition by col1,col2 order by col3 desc,col4 desc rows col6 preceding )derived_field from test1
where col1=2 order by col1,col2,col3,col4;
OK, so you want the sum of:
- The current row
- COL6 following rows that have COL4 = 0?
If so, you can find how many of the next COL6 rows have the value COL4 = 1. Then add this to the number of rows you need to look ahead.
Calculate the sum for the following row + COL6 + the count above rows. Exclude the rows you want to skip by only returning COL5 if COL4 = 0.
Stick in some null handling and add this to the current COL5 value.
Giving something like:
with skips as (
select t.*,
col6 + count ( case when col4 <> 0 then 1 end ) over (
partition by col1, col2
order by col3, col4
rows between 1 following and col6 following
) rws
from test1 t
)
select s.*,
col5 +
nvl ( sum ( case when col4 = 0 then col5 end ) over (
partition by col1, col2
order by col3, col4
rows between 1 following and rws following
), 0 ) d2
from skips s
order by col1, col2, col3;
COL1 COL2 COL3 COL4 COL5 COL6 RWS D2
1 A 12-DEC-2018 00:00:00 0 5.5 2 2 11.5
1 A 13-DEC-2018 00:00:00 0 3.5 3 3 20.5
1 A 14-DEC-2018 00:00:00 0 2.5 0 0 2.5
1 A 15-DEC-2018 00:00:00 0 6.5 0 0 6.5
1 A 16-DEC-2018 00:00:00 0 8 0 0 8
2 B 12-DEC-2018 00:00:00 0 5.5 2 3 11.5
2 B 12-DEC-2018 00:00:00 1 3.5 3 3 16
2 B 13-DEC-2018 00:00:00 0 3.5 5 6 32.5
2 B 14-DEC-2018 00:00:00 0 2.5 2 2 17
2 B 15-DEC-2018 00:00:00 0 6.5 0 0 6.5
2 B 16-DEC-2018 00:00:00 0 8 0 0 8
2 B 16-DEC-2018 00:00:00 1 7.9 0 0 7.9
2 B 17-DEC-2018 00:00:00 0 10.5 0 0 10.5
2 B 18-DEC-2018 00:00:00 0 1.5 0 0 1.5
Note: this only works if there are no COL4 = 1 rows next to each other. Or at least, the number of consecutive COL4 = 1 rows is <= the number of rows you need to skip.
If you can have consecutive rows with COL4 = 1, you can have a situation where you look COL6 rows ahead to a row with COL4 = 1. But this has N more rows immediately after you need to skip. This solution doesn't consider these extra rows.