Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, George.

Asked: July 28, 2018 - 6:09 am UTC

Last updated: August 02, 2018 - 3:16 pm UTC

Version: 11g

Viewed 1000+ times

You Asked

Hi

I am trying to write a SQL query that computes the values of a column(mmwl) as follows.

For the DDL and DML scripts please check the livesql link

create table t (y date, value_pvt int, mmwl int);

--sample data to generate a record
insert into t
select to_date('28/7/2018','dd/mm/yyyy')+level/24/60
    ,mod(level,3)
    ,null
from dual
connect by level<=5;

select to_char(y,'dd/mm/yyyy hh24:mi:ss') as y
       ,value_pvt
       ,mmwl 
   from t 
 order by 1;

Y             VALUE_PVT MMWL
28/07/2018 00:01:00 1  - 
28/07/2018 00:02:00 2  - 
28/07/2018 00:03:00 0  - 
28/07/2018 00:04:00 1  - 
28/07/2018 00:05:00 2  -



For each of the 5 values of y, i need to compute the value of mmwl

The formulae to compute mmwl is as follows

For first value of y ie y('28/7/2018 00:01')

compare (Need to iterate till the upper_bound of y. ie till 5.)

value_pvt(1) - value_pvt(2) => 1 - 2 = -1
value_pvt(2) - value_pvt(3) => 2 - 0 = 2
value_pvt(3) - value_pvt(4) => 0 - 1 = -1
value_pvt(4) - value_pvt(5) => 1 - 2 = -1

get the max value of differences
max(-1,2,-1,-1) = 2

so the value for mmwl for y('28/7/2018 00:01') = 2
-----------------------------------------------------------
similarly for the next value of y
For y('28/7/2018 00:02')
compare
value_pvt(1) - value_pvt(3) = 1 - 0 = 1
value_pvt(2) - value_pvt(4) = 2 - 1 = 1
value_pvt(3) - value_pvt(5) = 0 - 2 = -2

get the max value of differences
max(1,1,-2) = 1
-------------------------------------------------------------
this needs to be continues till the penultimate record


The mmwl value for the last record ie y('28/7/2018 00:05') is to be set to 0)

To get the required output i created a query as follows


with data
  as (select y /* generate a sequence ordered by the y*/
            ,value_pvt
            ,row_number() over(order by y) as rnk
            ,count(*) over(partition by 1) as tot_cnt
        from t
      )
    ,row_generator
     as (select level as lvl /* generate a rows up and till 5*/
          from dual
         connect by level<=(select tot_cnt 
                              from data 
                             where rownum=1)
         )
    , boundary_values
     as (select a.lvl /* generate records for levels and specify the max and min values of the boundaries. Eg:for lvl=4 low_val=1 and hi_val=5*/
               ,b.lvl as low_val
               ,a.lvl+b.lvl as hi_val
          from row_generator a
          join row_generator b
            on 1=1
         where b.lvl + a.lvl <=(select tot_cnt from data where rownum=1)            
         )    
    , computed_mmwl
     as ( select lvl,max(diff) as diff
           from (select m.lvl
                        ,n.value_pvt - o.value_pvt as diff
                   from boundary_values m
                   join data n
                     on m.low_val=n.rnk
                   join data o
                     on m.hi_val=o.rnk   
                 ) group by lvl
           union all
           select max(rnk), 0  /* this union all adds the last record ie y('28/07/2018 00:05:00')|0   */
             from data
        )
select to_char(a1.y,'dd/mm/yyyy hh24:mi:ss') as y
       ,a1.value_pvt
       ,b1.diff as new_mmwl
 from data a1
 join computed_mmwl b1
   on a1.rnk=b1.lvl


This works fine and gets me the output, however i was interested to know if there is a better way of doing this using model clause or using pattern matching or recursive cte or even a better straight sql solution.

(Note: There would be more than 500 rows in the actual table for t)

with LiveSQL Test Case:

and Chris said...

So, for each row at position N (sorted by y), you're:

- Finding the difference between the current val and the val N rows after it for every row
- Returning the largest of these differences?

If so, I believe you have to cross join the table with another that has the same number of rows.

But there is an easier way than recursive with to get the differences.

The second argument of lead states how many rows after the current to get the value of. So use this, partitioned by N, to generate all the combinations of differences.

Then use the nth_value function to get the Y and PVT values for this N.

Finally group by the nth_value outputs to find the maximum difference. Which gives:

with ranks as (
  select t.*,
         row_number () over ( order by y ) rn,
         count (*) over () c
  from   t
), grps as (
  select value_pvt - lead ( value_pvt, x ) over ( partition by x order by rn ) diff,
         nth_value ( y, x ) over ( partition by x ) nth_y,
         nth_value ( value_pvt, x ) over ( partition by x ) nth_pvt
  from   ranks r , lateral (
    select level x from dual
    connect by level <= c
  ) rws
)
  select nth_y, nth_pvt, nvl ( max ( diff ), 0 ) 
  from   grps
  group  by nth_y, nth_pvt
  order  by nth_y;

NTH_Y                  NTH_PVT   NVL(MAX(DIFF),0)   
28-JUL-2018 00:01:00           1                  2 
28-JUL-2018 00:02:00           2                  1 
28-JUL-2018 00:03:00           0                  0 
28-JUL-2018 00:04:00           1                 -1 
28-JUL-2018 00:05:00           2                  0 

Rating

  (1 rating)

Is this answer out of date? If it is, please let us know via a Comment

Comments

RE

George Joseph, July 31, 2018 - 5:22 pm UTC

Thanks for the response Chris. It works very well with large number of records as well.

However i was not doing recursive CTE in my solution. But i realise my mistake that i was self joining the "data" twice to get the hi_val and low_val indices value_pvt, when i could have used lead to get the same effect.
Chris Saxon
August 02, 2018 - 3:16 pm UTC

Ah yes, you're right. Just plain old CTEs ;) Saw the union all within it and assumed...

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.