Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Paul.

Asked: July 10, 2020 - 8:39 am UTC

Last updated: July 15, 2020 - 10:15 am UTC

Version: 19c

Viewed 1000+ times

You Asked

I have 3 tables schedule_assignment, schedule_detail and access_history (test case below)

The schedule_detail contains a list of times and locations.

The schedule_assignment assigns a list to an employee.

The access_history table records information whenever an employee swipes their card.

What I need to do is match the employee, location and trunc DATE rows and then walk through the schedule records and find the corresponding match HHMMSS from access_history

Since the chances of finding an exact match on the dates are very slim, I need to look through time period of the rows.

This way I can determine if an employee was on time, early, late, out of time range or never swiped their card at that location.


A row is considered good if the access_history time is + or -2 minutes of the schedule time, considered early if the row is between 2 minutes 1 second or 4 minutes before the schedule time, late if the row is between 2 minutes 1 second or 4 minutes after the schedule time, out of range if the row is 4 minutes 1 second before or after the scheduled time.

I took a stab at it with this initial query to provide a visual on what I was trying to do. I know its wrong as the schedule date will rarely match the access_history date.

SELECT sa.employee_id,  sd.location_id,
sd.schedule_date,
ah.access_date

FROM
schedule_assignment sa,
schedule_detail sd, 
access_history ah
WHERE 
sa.employee_id =
ah.employee_id AND
sd.location_id =
ah.location_id AND
(ah.access_date IS NULL OR 
TRUNC(sd.schedule_date) = TRUNC (ah.access_date)  
)
Order by schedule_date;



Below is a detailed test case with the expected outcome and a detailed explanation for each row.

Any guidance on how to look for rows within a range would be greatly appreciated.

Note, in the case below there is a schedule record without a access_history record so there should be nothing to compare that too.


ALTER SESSION SET NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';

CREATE table schedule_hdr(
  schedule_id NUMBER(4),
  schedule_name VARCHAR2(30)
);

INSERT INTO schedule_hdr(
  schedule_id,
  schedule_name
)
VALUES 
(5, 'Thursday Day Shift');


CREATE table schedule_assignment( 
   schedule_id NUMBER(4),
   employee_id NUMBER(6)
);

INSERT INTO schedule_assignment(
  schedule_id,
  employee_id   
)
VALUES (5,1);


create table schedule_detail(
       schedule_id NUMBER(4), 
       location_id number(4),
       schedule_date date
  );

insert into schedule_detail(
  schedule_id, 
  location_id,
  schedule_date)
 values 
  (5, 100,
TO_DATE('2020/07/23 11:00:00', 'yyyy/mm/dd hh24:mi:ss'));

insert into schedule_detail(
schedule_id, 
location_id,  schedule_date)
 values 
  (5, 101,
TO_DATE('2020/07/23 11:04:00', 'yyyy/mm/dd hh24:mi:ss'));

insert into schedule_detail(
schedule_id, 
location_id,  schedule_date)
 values 
  (5,102,
TO_DATE('2020/07/23 11:07:00', 'yyyy/mm/dd hh24:mi:ss'));

insert into schedule_detail(
schedule_id, 
location_id,  schedule_date)
 values 
  (5,103,
TO_DATE('2020/07/23 11:10:00', 'yyyy/mm/dd hh24:mi:ss'));


insert into schedule_detail(
schedule_id, 
location_id,  schedule_date)
 values 
  (5,100,
TO_DATE('2020/07/23 11:11:00', 'yyyy/mm/dd hh24:mi:ss'));

    create table access_history(
       employee_id NUMBER(6), 
       location_id number(4),
       access_date date
    );

insert into access_history(
employee_id, 
location_id,  access_date)
VALUES 
  (1, 100,
TO_DATE('2020/07/23 11:00:20', 'yyyy/mm/dd hh24:mi:ss'));


insert into access_history(
employee_id, 
location_id,  access_date)
VALUES 
  (1, 101,
TO_DATE('2020/07/23 11:01:53', 'yyyy/mm/dd hh24:mi:ss'));

insert into access_history(
employee_id, 
location_id,  access_date)
 values 
  (1, 102,
TO_DATE('2020/07/23 11:10:13', 'yyyy/mm/dd hh24:mi:ss'));


insert into access_history(
employee_id, 
location_id,  access_date)
 values 
  (1,103,
TO_DATE('2020/07/23 11:15:12', 'yyyy/mm/dd hh24:mi:ss'));


-- results of query should populate this table.

CREATE table schedule_history(

      seq_num integer  GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
         employee_id NUMBER(6), 
       schedule_id NUMBER(4),
       location_id number(4),
       schedule_date date,
      access_date date,
      status VARCHAR2(1)
);


SELECT * from schedule_history

SEQ_NUM    EMPLOYEE_ID    
SCHEDULE_ID LOCATION_ID    SCHEDULE_DATE    ACCESS_DATE    STATUS
1    1    5 100    07232020 11:00:00    07232020 11:00:20    G

-- This row GOOD because only 20 second difference from schedule 

2    1    5 101    07232020 11:04:00    07232020 11:01:53    E

-- This row early because because 2 minutes 7 seconds before schedule time

3    1    5 102    07232020 11:07:00    07232020 11:10:13    L

-- This row late because because 3 minutes 13 seconds after schedule time

4    1    5 103    07232020 11:10:00    07232020 11:15:12    O

-- This is out of range because its 5 minutes and 12 seconds after the scheduled time. Note it could be out of range if its earlier then the scheduled time too.


5    1    5 100    07232020 11:11:00     NULL     N

-- This row is NO GOOD as there is no corresponding access_history record. Note the NULL value for access_date 



and Chris said...

Here's one way to approach this:

- Join the schedule tables
- Outer join the access history to these
- Only include rows that are within the schedule date +/- some number of minutes
- Use a case expression to check the difference between schedule and access dates to assign the status

The challenge is choosing a time range that is:

- Large enough to capture people extremely early or late
- Small enough that you don't start accidentally including accesses which is really for the next/previous schedule

I've gone for a 10 minute window, which works for these data:

select sa.employee_id,  
       sd.location_id,
       sd.schedule_date,
       ah.access_date,
       numtodsinterval ( ( schedule_date - access_date ), 'day' ) time_diff,
       case
         when abs ( schedule_date - access_date ) * 1440 <= 2 then 'GOOOOD'
         when ( schedule_date - access_date ) * 1440 between 2 and 4 then 'Early'
         when ( schedule_date - access_date ) * 1440 between -4 and -2 then 'Late'
         when access_date is not null then 'OUT OF RANGE'
         else 'NO GOOD!'
       end on_time
from   schedule_assignment sa
join   schedule_detail sd
on     sd.schedule_id = sa.schedule_id
left join access_history ah
on     sa.employee_id = ah.employee_id 
and    sd.location_id = ah.location_id
and    ah.access_date between sd.schedule_date - interval '10' minute
                      and sd.schedule_date + interval '10' minute
order by schedule_date;

EMPLOYEE_ID   LOCATION_ID SCHEDULE_DATE       ACCESS_DATE         TIME_DIFF            ON_TIME        
          1           100 07232020 11:00:00   07232020 11:00:20   -00 00:00:20.000000  GOOOOD          
          1           101 07232020 11:04:00   07232020 11:01:53   +00 00:02:07.000000  Early           
          1           102 07232020 11:07:00   07232020 11:10:13   -00 00:03:13.000000  Late            
          1           103 07232020 11:10:00   07232020 11:15:12   -00 00:05:12.000000  OUT OF RANGE    
          1           100 07232020 11:11:00   <null>              <null>               NO GOOD! 


But there are only 11 minutes between the scheduled times for location 100. If the employee was one minute later the first time they arrived at this location, they show up as "out of range" for the next access (instead of a no show):

update access_history
set    access_date = access_date + 1/1440
where  location_id = 100;

select sa.employee_id,  
       sd.location_id,
       sd.schedule_date,
       ah.access_date,
       numtodsinterval ( ( schedule_date - access_date ), 'day' ) time_diff,
       case
         when abs ( schedule_date - access_date ) * 1440 <= 2 then 'GOOOOD'
         when ( schedule_date - access_date ) * 1440 between 2 and 4 then 'Early'
         when ( schedule_date - access_date ) * 1440 between -4 and -2 then 'Late'
         when access_date is not null then 'OUT OF RANGE'
         else 'NO GOOD!'
       end on_time
from   schedule_assignment sa
join   schedule_detail sd
on     sd.schedule_id = sa.schedule_id
left join access_history ah
on     sa.employee_id = ah.employee_id 
and    sd.location_id = ah.location_id
and    ah.access_date between sd.schedule_date - interval '10' minute
                      and sd.schedule_date + interval '10' minute
order by schedule_date;

EMPLOYEE_ID   LOCATION_ID SCHEDULE_DATE       ACCESS_DATE         TIME_DIFF              ON_TIME        
          1           100 07232020 11:00:00   07232020 11:01:20   -00 00:01:20.000000    GOOOOD          
          1           101 07232020 11:04:00   07232020 11:01:53   +00 00:02:07.000000    Early           
          1           102 07232020 11:07:00   07232020 11:10:13   -00 00:03:13.000000    Late            
          1           103 07232020 11:10:00   07232020 11:15:12   -00 00:05:12.000000    OUT OF RANGE    
          1           100 07232020 11:11:00   07232020 11:01:20   +00 00:09:40.000000    OUT OF RANGE    


In general this is a hard problem to solve. How do you decide whether someone was late for one access vs early for the next?

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

SQL

The Oracle documentation contains a complete SQL reference.