Hi
I am trying to write a SQL query that computes the values of a column(mmwl) as follows.
For the DDL and DML scripts please check the livesql link
create table t (y date, value_pvt int, mmwl int);
--sample data to generate a record
insert into t
select to_date('28/7/2018','dd/mm/yyyy')+level/24/60
,mod(level,3)
,null
from dual
connect by level<=5;
select to_char(y,'dd/mm/yyyy hh24:mi:ss') as y
,value_pvt
,mmwl
from t
order by 1;
Y VALUE_PVT MMWL
28/07/2018 00:01:00 1 -
28/07/2018 00:02:00 2 -
28/07/2018 00:03:00 0 -
28/07/2018 00:04:00 1 -
28/07/2018 00:05:00 2 -
For each of the 5 values of y, i need to compute the value of mmwl
The formulae to compute mmwl is as follows
For first value of y ie y('28/7/2018 00:01')
compare (Need to iterate till the upper_bound of y. ie till 5.)
value_pvt(1) - value_pvt(2) => 1 - 2 = -1
value_pvt(2) - value_pvt(3) => 2 - 0 = 2
value_pvt(3) - value_pvt(4) => 0 - 1 = -1
value_pvt(4) - value_pvt(5) => 1 - 2 = -1
get the max value of differences
max(-1,2,-1,-1) = 2
so the value for mmwl for y('28/7/2018 00:01') = 2
-----------------------------------------------------------
similarly for the next value of y
For y('28/7/2018 00:02')
compare
value_pvt(1) - value_pvt(3) = 1 - 0 = 1
value_pvt(2) - value_pvt(4) = 2 - 1 = 1
value_pvt(3) - value_pvt(5) = 0 - 2 = -2
get the max value of differences
max(1,1,-2) = 1
-------------------------------------------------------------
this needs to be continues till the penultimate record
The mmwl value for the last record ie y('28/7/2018 00:05') is to be set to 0)
To get the required output i created a query as follows
with data
as (select y /* generate a sequence ordered by the y*/
,value_pvt
,row_number() over(order by y) as rnk
,count(*) over(partition by 1) as tot_cnt
from t
)
,row_generator
as (select level as lvl /* generate a rows up and till 5*/
from dual
connect by level<=(select tot_cnt
from data
where rownum=1)
)
, boundary_values
as (select a.lvl /* generate records for levels and specify the max and min values of the boundaries. Eg:for lvl=4 low_val=1 and hi_val=5*/
,b.lvl as low_val
,a.lvl+b.lvl as hi_val
from row_generator a
join row_generator b
on 1=1
where b.lvl + a.lvl <=(select tot_cnt from data where rownum=1)
)
, computed_mmwl
as ( select lvl,max(diff) as diff
from (select m.lvl
,n.value_pvt - o.value_pvt as diff
from boundary_values m
join data n
on m.low_val=n.rnk
join data o
on m.hi_val=o.rnk
) group by lvl
union all
select max(rnk), 0 /* this union all adds the last record ie y('28/07/2018 00:05:00')|0 */
from data
)
select to_char(a1.y,'dd/mm/yyyy hh24:mi:ss') as y
,a1.value_pvt
,b1.diff as new_mmwl
from data a1
join computed_mmwl b1
on a1.rnk=b1.lvl
This works fine and gets me the output, however i was interested to know if there is a better way of doing this using model clause or using pattern matching or recursive cte or even a better straight sql solution.
(Note: There would be more than 500 rows in the actual table for t)
So, for each row at position N (sorted by y), you're:
- Finding the difference between the current val and the val N rows after it for every row
- Returning the largest of these differences?
If so, I believe you have to cross join the table with another that has the same number of rows.
But there is an easier way than recursive with to get the differences.
The second argument of lead states how many rows after the current to get the value of. So use this, partitioned by N, to generate all the combinations of differences.
Then use the nth_value function to get the Y and PVT values for this N.
Finally group by the nth_value outputs to find the maximum difference. Which gives:
with ranks as (
select t.*,
row_number () over ( order by y ) rn,
count (*) over () c
from t
), grps as (
select value_pvt - lead ( value_pvt, x ) over ( partition by x order by rn ) diff,
nth_value ( y, x ) over ( partition by x ) nth_y,
nth_value ( value_pvt, x ) over ( partition by x ) nth_pvt
from ranks r , lateral (
select level x from dual
connect by level <= c
) rws
)
select nth_y, nth_pvt, nvl ( max ( diff ), 0 )
from grps
group by nth_y, nth_pvt
order by nth_y;
NTH_Y NTH_PVT NVL(MAX(DIFF),0)
28-JUL-2018 00:01:00 1 2
28-JUL-2018 00:02:00 2 1
28-JUL-2018 00:03:00 0 0
28-JUL-2018 00:04:00 1 -1
28-JUL-2018 00:05:00 2 0