Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, shylaja.

Asked: February 09, 2017 - 8:08 am UTC

Last updated: February 09, 2017 - 7:27 pm UTC

Version: Oracle 12C

Viewed 1000+ times

You Asked

I am struggling to merge continuous time ranges to one.Here is my record set.

STAFF_NUMBER SHIFT_DATE TASK_START_TIME TASK_END_TIME
123 12/10/2016 12/10/2016 17:14 12/10/2016 20:10
123 12/10/2016 12/10/2016 20:08 12/10/2016 21:08
123 12/10/2016 12/10/2016 22:50 12/10/2016 23:28
123 12/10/2016 12/10/2016 21:53 12/11/2016 02:17
123 12/10/2016 12/11/2016 01:40 12/11/2016 04:21

Here the first and the 2nd record over lap as the first task completes after the second starts.So i need to merge the first 2 records to one like below

STAFF_NUMBER SHIFT_DATE TASK_START_TIME TASK_END_TIME
439972 12/10/2016 12/10/2016 17:14 12/10/2016 21:08

Similarly record 3, 4 and 5 gets overlapped and it should look like below.
STAFF_NUMBER SHIFT_DATE TASK_START_TIME TASK_END_TIME
439972 12/10/2016 12/10/2016 22:50 12/11/2016 04:21

So my final output should be only 2 rows after merging all the overlaps.I have been struggling to do this as i cannot order the data either by start date not the end date and hence cannot use analytical functions.

My final output should be like,

STAFF_NUMBER SHIFT_DATE TASK_START_TIME TASK_END_TIME
439972 12/10/2016 12/10/2016 17:14 12/10/2016 21:08
439972 12/10/2016 12/10/2016 22:50 12/11/2016 04:21


and Connor said...

Take a look here for an example using the numbers, but the concept is the same

https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:7867939004010

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

12c + time periods = MATCH_RECOGNIZE

Stew Ashton, February 10, 2017 - 9:20 am UTC

Since you have 12c, you should learn the MATCH_RECOGNIZE() clause, which is quite often the most efficient solution when time periods are involved.

Here I sort by end date, get the first row, then find all the rows whose start date is less than or equal to any preceding end date. That bunch of rows is a "match". Once I get that, I start again with the next row after the "match".
drop table t purge;
create table t (STAFF_NUMBER, SHIFT_DATE, TASK_START_TIME, TASK_END_TIME) as select 
123, to_date('12/10/2016','mm/dd/yyyy'), to_date('12/10/2016 17:14','mm/dd/yyyy hh24/mi'), to_date('12/10/2016 20:10','mm/dd/yyyy hh24/mi') from dual union all select
123, to_date('12/10/2016','mm/dd/yyyy'), to_date('12/10/2016 20:08','mm/dd/yyyy hh24/mi'), to_date('12/10/2016 21:08','mm/dd/yyyy hh24/mi') from dual union all select
123, to_date('12/10/2016','mm/dd/yyyy'), to_date('12/10/2016 22:50','mm/dd/yyyy hh24/mi'), to_date('12/10/2016 23:28','mm/dd/yyyy hh24/mi') from dual union all select
123, to_date('12/10/2016','mm/dd/yyyy'), to_date('12/10/2016 21:53','mm/dd/yyyy hh24/mi'), to_date('12/11/2016 02:17','mm/dd/yyyy hh24/mi') from dual union all select
123, to_date('12/10/2016','mm/dd/yyyy'), to_date('12/11/2016 01:40','mm/dd/yyyy hh24/mi'), to_date('12/11/2016 04:21','mm/dd/yyyy hh24/mi') from dual;

select * from t 
match_recognize( 
  partition by STAFF_NUMBER, SHIFT_DATE 
  order by TASK_END_TIME, TASK_START_TIME
  measures min(TASK_START_TIME) TASK_START_TIME,
    max(TASK_END_TIME) TASK_END_TIME 
  pattern(a b*) 
  define b as TASK_START_TIME <= last(TASK_END_TIME,1)
);

STAFF_NUMBER SHIFT_DATE  TASK_START_TIME  TASK_END_TIME  
------------ ----------- ---------------- ----------------
         123 2016-12-10  2016-12-10 17:14 2016-12-10 21:08
         123 2016-12-10  2016-12-10 21:53 2016-12-11 04:21

I disagree with your expected output. The start time of the second output row should be 21:53 because it is earlier than 22:50.

Solution with analytics

Stew Ashton, February 10, 2017 - 9:41 am UTC

Since you have been trying with analytics, here is an analytic solution that implements the same basic logic.

Some call this technique "start of group" because the idea is to identify the starting point, then assign the same "group identifier" to the starting point and to all the following rows until the next starting point. The "groups" formed in this way contain the same rows as the "matches" above. Once the group identifier is assigned to all the rows, we can group by it.
with grp_starts as (
  select STAFF_NUMBER, SHIFT_DATE, TASK_START_TIME, TASK_END_TIME,
  case
    when TASK_START_TIME > lag(TASK_END_TIME) over(
      partition by STAFF_NUMBER, SHIFT_DATE order by TASK_END_TIME, TASK_START_TIME
    )
    then 1 else 0
  end grp_start
  from t
)
, grps as (
  select STAFF_NUMBER, SHIFT_DATE, TASK_START_TIME , TASK_END_TIME ,
  sum(grp_start) over(
    partition by STAFF_NUMBER, SHIFT_DATE order by TASK_END_TIME, TASK_START_TIME
  ) grp
  from grp_starts
)
select STAFF_NUMBER, SHIFT_DATE,
  min(TASK_START_TIME) TASK_START_TIME ,
  max(TASK_END_TIME) TASK_END_TIME 
from grps
group by STAFF_NUMBER, SHIFT_DATE, grp
order by 1, 2, 3;