Skip to Main Content
  • Questions
  • Grouping punch clock data into 24 hour periods, starting with the time_in punches

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Bob.

Asked: October 18, 2017 - 2:42 pm UTC

Last updated: December 19, 2017 - 4:03 pm UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

We have punch data coming in from employees tasking into work orders. As these records are encountered, I need to know based on employee group if their calc_hrs (field in table) for a 24 period is > 13 hours.

I was able to get it started:

with grp_starts as (
  select id, time_in, time_out, calculated_hrs, pr_emp_id, source, source_id, note, trunc(time_in) as prod_date, 'Alert#1842' as user_id, sysdate as timestamp,
 case
  when time_in >= lag(time_in) over(order by time_in, time_out) and time_in <= lag(time_out) over(order by time_in, time_out)
  then 0 else 1
 end grp_start
 from ta_labor
 where nvl(processed,'N')='N'
 and source in ('WO')
 order by time_in
)
, grps as (
 select id, time_in, time_out, calculated_hrs,
 sum(grp_start) over(order by time_in, time_out) grp
 from grp_starts
)
select
 id,
 (time_in) time_in,
 (time_out) time_out,
 (calculated_hrs) calc_hrs,
 grp,
 (select sum(calculated_hrs) from grps where grp=dtl.grp) total
from grps dtl
--group by grp
--having sum(calculated_hrs) > 13  --hide or use this row to debug the data
order by time_in, time_out


But my working copy, trying to get overlapped time, I was not able to process the 24-hour period grouping. My grouping just shows me if the consecutive punch records are overlapped. I wanted to build on this, but cannot figure out how to define the appropriate periods and group accordingly.

Here is example data:
id,time_in,time_out,calc_hrs,grp,total
5341,7/21/17 09:00:00 AM,7/21/17 10:00:00 AM,1.00,1,1.00
5707,7/31/17 12:23:50 PM,7/31/17 12:33:42 PM,0.16,2,0.16
5711,7/31/17 12:47:57 PM,7/31/17 12:49:01 PM,0.02,3,0.02
5713,7/31/17 12:49:16 PM,7/31/17 12:49:40 PM,0.01,4,0.01
6860,8/29/17 10:50:26 AM,8/29/17 10:52:47 AM,0.04,5,0.11
6861,8/29/17 10:50:55 AM,8/29/17 10:53:13 AM,0.04,5,0.11
6862,8/29/17 10:51:08 AM,8/29/17 10:52:53 AM,0.03,5,0.11
6863,8/29/17 10:53:15 AM,9/8/17 01:14:05 PM,242.34,6,727.01
6864,8/29/17 10:53:31 AM,9/8/17 01:14:05 PM,242.34,6,727.01
6866,8/29/17 10:53:53 AM,9/8/17 01:14:05 PM,242.33,6,727.01


As you can see my "grp" only provides for overlap hours, not really what I ultimately wanted. I wanted the output to reflect the first time_in to define a 24-hour period and to group the within that period, then then next and so on:

Desired output:
id,time_in,time_out,calc_hrs,grp,total
5341,7/21/17 09:00:00 AM,7/21/17 10:00:00 AM,1.00,1,1.00

5707,7/31/17 12:23:50 PM,7/31/17 12:33:42 PM,0.16,2,0.19
5711,7/31/17 12:47:57 PM,7/31/17 12:49:01 PM,0.02,2,0.19
5713,7/31/17 12:49:16 PM,7/31/17 12:49:40 PM,0.01,2,0.19

6860,8/29/17 10:50:26 AM,8/29/17 10:52:47 AM,0.04,3,727.12
6861,8/29/17 10:50:55 AM,8/29/17 10:53:13 AM,0.04,3,727.12
6862,8/29/17 10:51:08 AM,8/29/17 10:52:53 AM,0.03,3,727.12
6863,8/29/17 10:53:15 AM,9/8/17 01:14:05 PM,242.34,3,727.12
6864,8/29/17 10:53:31 AM,9/8/17 01:14:05 PM,242.34,3,727.12
6866,8/29/17 10:53:53 AM,9/8/17 01:14:05 PM,242.33,3,727.12


Then I could flag the records that exceeded our threshold punches for a day, which is 13.00 and not have to worry about 3rd shift punches that went across midnight. I'm sure I have over complicated the issue, but I'm stuck and can't seem to get myself out of it.

Example date table:

create table ta_labor (id int, time_in date, time_out date, calculated_hours number(5,2), pr_emp_id int, source varchar2(10));

with this data in it:

insert into ta_labor values (5341, to_date('7/21/2017 09:00:00', 'mm/dd/yyyy hh24:mi:ss'), to_date('7/21/2017 10:00:00', 'mm/dd/yyyy hh24:mi:ss'), 1, 9081, 'WO');
insert into ta_labor values (5707, to_date('7/31/2017 12:23:50', 'mm/dd/yyyy hh24:mi:ss'), to_date('7/31/2017 12:33:42', 'mm/dd/yyyy hh24:mi:ss'), 0.16, 9081, 'WO');
insert into ta_labor values (5711, to_date('7/31/2017 12:47:57', 'mm/dd/yyyy hh24:mi:ss'), to_date('7/31/2017 12:49:01', 'mm/dd/yyyy hh24:mi:ss'), 0.02, 9081, 'WO');
insert into ta_labor values (5713, to_date('7/31/2017 12:49:16', 'mm/dd/yyyy hh24:mi:ss'), to_date('7/31/2017 12:49:40', 'mm/dd/yyyy hh24:mi:ss'), 0.01, 9081, 'WO');
insert into ta_labor values (6860, to_date('8/29/2017 10:50:26', 'mm/dd/yyyy hh24:mi:ss'), to_date('8/29/2017 10:52:47', 'mm/dd/yyyy hh24:mi:ss'), 0.04, 9081, 'WO');
insert into ta_labor values (6861, to_date('8/29/2017 10:50:55', 'mm/dd/yyyy hh24:mi:ss'), to_date('8/29/2017 10:53:13', 'mm/dd/yyyy hh24:mi:ss'), 0.04, 9081, 'WO');
insert into ta_labor values (6862, to_date('8/29/2017 10:51:08', 'mm/dd/yyyy hh24:mi:ss'), to_date('8/29/2017 10:52:53', 'mm/dd/yyyy hh24:mi:ss'), 0.03, 9081, 'WO');
insert into ta_labor values (6863, to_date('8/29/2017 10:53:15', 'mm/dd/yyyy hh24:mi:ss'), to_date('9/8/2017 13:14:05', 'mm/dd/yyyy hh24:mi:ss'), 242.34, 9081, 'WO');
insert into ta_labor values (6864, to_date('8/29/2017 10:53:31', 'mm/dd/yyyy hh24:mi:ss'), to_date('9/8/2017 13:14:05', 'mm/dd/yyyy hh24:mi:ss'), 242.34, 9081, 'WO');
insert into ta_labor values (6866, to_date('8/29/2017 10:53:53', 'mm/dd/yyyy hh24:mi:ss'), to_date('9/8/2017 13:14:05', 'mm/dd/yyyy hh24:mi:ss'), 242.33, 9081, 'WO');

and Chris said...

So, you want to group the rows into 24 hour periods. But if there's more than 24 hours between two groups, you want to start the next period from the exact time of the next time_in?

If you're on 12c, this is easy. Use match_recognize!

Define a "day" pattern variable. This matches all the rows that are within a day of an always-true start variable. The built-in match_number returns the group number:

select * from ta_labor
match_recognize (
  order by time_in
  measures 
    match_number() as grp
  all rows per match
  pattern ( st dy* ) 
  define 
    dy as dy.time_in <= st.time_in + 1
);

TIME_IN                GRP   ID     TIME_OUT               CALCULATED_HOURS   PR_EMP_ID   SOURCE   
21-JUL-2017 09:00:00       1   5341 21-JUL-2017 10:00:00                    1        9081 WO       
31-JUL-2017 12:23:50       2   5707 31-JUL-2017 12:33:42                 0.16        9081 WO       
31-JUL-2017 12:47:57       2   5711 31-JUL-2017 12:49:01                 0.02        9081 WO       
31-JUL-2017 12:49:16       2   5713 31-JUL-2017 12:49:40                 0.01        9081 WO       
29-AUG-2017 10:50:26       3   6860 29-AUG-2017 10:52:47                 0.04        9081 WO       
29-AUG-2017 10:50:55       3   6861 29-AUG-2017 10:53:13                 0.04        9081 WO       
29-AUG-2017 10:51:08       3   6862 29-AUG-2017 10:52:53                 0.03        9081 WO       
29-AUG-2017 10:53:15       3   6863 08-SEP-2017 13:14:05               242.34        9081 WO       
29-AUG-2017 10:53:31       3   6864 08-SEP-2017 13:14:05               242.34        9081 WO       
29-AUG-2017 10:53:53       3   6866 08-SEP-2017 13:14:05               242.33        9081 WO


If you want to know more about how this works, read:

https://asktom.oracle.com/pls/apex/f?p=100:11:::NO:RP:P11_QUESTION_ID:9533477800346658909

As you're stuck on 11.2, you can use a recursive solution instead. This starts with the row with the earliest time_in for your dataset. Then on each iteration find the row with the next time.

To do the grouping, add two calculated columns.

The start time of each group. Test whether the time_in (or time_out, depending on how you're defining your 24 periods) of the current row is within a day of the start time for the group. If it is, keep the current start time. Otherwise replace it with the current time_in.

Do a similar test for the group column. Increment it depending on whether it's within a day of the start time for the group.

with times as (
  select t.*,
         min(time_in) over () mn ,
         lead(time_in) over (order by time_in) next_rec
  from   ta_labor t
), grps (id, time_in, time_out, calc_hrs, grp, grp_start, next_rec) as (
  select id, time_in, time_out, calculated_hours, 
         1 as grp, time_in grp_start, next_rec
  from   times
  where  time_in = mn
  union all
  select ta.id, ta.time_in, ta.time_out, ta.calculated_hours, 
         case
           when ta.time_in <= g.grp_start + 1 then grp 
           else grp + 1
         end as grp, 
         case
           when ta.time_in <= g.grp_start + 1 then g.grp_start
           else ta.time_in
         end as grp_start, ta.next_rec
  from   times ta 
  join   grps g
  on     ta.time_in = g.next_rec
)
  select * from grps;

ID     TIME_IN                TIME_OUT               CALC_HRS   GRP   GRP_START              NEXT_REC               
  5341 21-JUL-2017 09:00:00   21-JUL-2017 10:00:00            1     1 21-JUL-2017 09:00:00   31-JUL-2017 12:23:50   
  5707 31-JUL-2017 12:23:50   31-JUL-2017 12:33:42         0.16     2 31-JUL-2017 12:23:50   31-JUL-2017 12:47:57   
  5711 31-JUL-2017 12:47:57   31-JUL-2017 12:49:01         0.02     2 31-JUL-2017 12:23:50   31-JUL-2017 12:49:16   
  5713 31-JUL-2017 12:49:16   31-JUL-2017 12:49:40         0.01     2 31-JUL-2017 12:23:50   29-AUG-2017 10:50:26   
  6860 29-AUG-2017 10:50:26   29-AUG-2017 10:52:47         0.04     3 29-AUG-2017 10:50:26   29-AUG-2017 10:50:55   
  6861 29-AUG-2017 10:50:55   29-AUG-2017 10:53:13         0.04     3 29-AUG-2017 10:50:26   29-AUG-2017 10:51:08   
  6862 29-AUG-2017 10:51:08   29-AUG-2017 10:52:53         0.03     3 29-AUG-2017 10:50:26   29-AUG-2017 10:53:15   
  6863 29-AUG-2017 10:53:15   08-SEP-2017 13:14:05       242.34     3 29-AUG-2017 10:50:26   29-AUG-2017 10:53:31   
  6864 29-AUG-2017 10:53:31   08-SEP-2017 13:14:05       242.34     3 29-AUG-2017 10:50:26   29-AUG-2017 10:53:53   
  6866 29-AUG-2017 10:53:53   08-SEP-2017 13:14:05       242.33     3 29-AUG-2017 10:50:26


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.