Skip to Main Content
  • Questions
  • Using 1 CTE for 2 different DML statements

Breadcrumb

May 4th

Question and Answer

Chris Saxon

Thanks for the question, Paul.

Asked: June 18, 2020 - 6:51 am UTC

Last updated: June 29, 2020 - 12:47 pm UTC

Version: 11.2..0.4

Viewed 1000+ times

You Asked

I have a query that reads through access_history data, which looks for record_type='T' as I only want Time & attendance records, which haven't been paired previously (processed=0). I am also keeping the Systimestamp for each record so I can get the MIN() and MAX() values each time this process is run as records are being added rapidly and I dont want to mark a record processed=1 if it wasn't selected for a run.

In the same query that creates these paired records in the emp_attendance table I want to update the source table to processed=1 where records='T' and processed=0 and records>= MIN and <= MAX timestamp once the process is completed!!!


Below is my test case, which works fine. But I can't seem to add update logic too. Is it possible to have one CTE that does 2 DML INSERT and update. I want the update to be something like this

Update access_history 
Set process=1
Where min_ts >= ts and max_ts <= ts


If not possible, can you provide another solution. My goal is process different chunks of access_history history records for each run


    -- Drop table emp_info purge:
    -- Drop table locations purge;
    -- Drop table access_histoty purge;
    -- Drop table emp_attendance purge;

    ALTER SESSION SET NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';

    CREATE TABLE employees
    (
       employee_id NUMBER(6),
       first_name VARCHAR2(20),
       last_name VARCHAR2(25) NOT NULL,
       card_num varchar2(10)  NOT NULL
    );
       
    ALTER TABLE employees
         ADD ( CONSTRAINT employee_id_pk
       PRIMARY KEY (employee_id));

    Insert into employees values (1, 'Mike', 'Jones', 'AAA1');
    Insert into employees values (2, 'Jane', 'Doe', 'BBB2');
    Insert into employees values (3, 'Paul', 'Smith', 'CCC3');
    Insert into employees values (4, 'John', 'Henry', 'DDD4');

     Create table locations(
       location_id NUMBER(4),
       location_name varchar2(30),
       location_type char(1));
    
     -- A=access T=Time & Attendance 
     
    ALTER TABLE locations 
         ADD ( CONSTRAINT lication_id_pk
       PRIMARY KEY (location_id));

    Insert into locations values (101, 'South Front Door 1', 'T');
  
      Insert into locations values (102, 'South Front Door 2', 'T');
  
      Insert into locations values (103, 'East Back Door 1', 'T');
 
       Insert into locations values (104, 'East Back Door 2', 'T');
  
       Insert into locations values (105,'Computer Room', 'A');
 
       Insert into locations values (106,'1st Floor North',  'A');


    Create table access_history(
       employee_id NUMBER(6), 
       card_num varchar2(10),
       location_id number(4),
       Access_date date,
       ts timestamp default systimestamp,
       processed NUMBER(1) default 0
    );

    
    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (1, 'AAA1', 101,    TO_DATE('06212020 21:02:04',  'MMDDYYYY HH24:MI:SS'));

       -- TYpe T no previous data for this
       -- empid record INSERT empid, 
        -- start  time ONLY in table below 
         -- and  update last_start_date  
         -- with DATETIME.
   
    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (1, 'AAA1', 102,    TO_DATE('06212020 23:52:14',  'MMDDYYYY HH24:MI:SS'));


      -- Type T record empid, start_time
      --  set update end_time only in
      -- emp_attendance.



    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (2, 'BBB2', 103,    TO_DATE('06212020 08:32:35',  'MMDDYYYY HH24:MI:SS'));


     -- TYpe T  INSERT empid, start 
     -- time ONLY in emp_attendance.  
     -- update last_start_date with 
     --  DATETIME on emp_info table

     

    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (2, 'BBB2', 102,    TO_DATE('06212020 15:39:05',  'MMDDYYYY HH24:MI:SS'));


    -- Type T record empid, start_time 
    --  set, update end_time only in
    -- emp_attendance.

 
    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (3, 'CCC3', 103,    TO_DATE('06212020 15:39:05',  'MMDDYYYY HH24:MI:SS'));


     -- TYpe T  INSERT empid, start 
     -- time ONLY in emp_attendance.  
     -- update last_start_date with 
     --  DATETIME on emp_info table


    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (3, 'CCC3', 105,    TO_DATE('06212020 18:19:55',  'MMDDYYYY HH24:MI:SS'));


    -- Type A record don't do anything to 
    -- emp_attendance.

    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (4, 'DDD4', 102,    TO_DATE('06212020 08:49:35',  'MMDDYYYY HH24:MI:SS'));

    --  single record with no pair. Set
    -- end_date to start_date if end_date
    -- is NULL
    
    INSERT INTO access_history
    ( employee_id, card_num,
       location_id, Access_date )
     VALUES (3, 'CCC3', 104,    TO_DATE('06222020 04:04:35',  'MMDDYYYY HH24:MI:SS'));


    -- Type T record empid, start_time 
    --  set, update end_time only in
    -- emp_attendance.


     CREATE TABLE  emp_attendance    
      (seq_num NUMBER(10),
       employee_id NUMBER(6),
       start_date DATE,
       end_date DATE,
       create_date DATE DEFAULT SYSDATE
       );
   
       Create sequence emp_attendance_seq;

   insert into emp_attendance (seq_num, employee_id, start_date, end_Date)
       with
       prep (employee_id, start_date, rn, end_date) as (
         select employee_id, access_date
            , row_number()      over (partition by card_num order by access_date)
          , lead(access_date) over (partition by card_num order by access_date)
       from   access_history
     where  location_id in ( select location_id
                         from   locations
                          where  location_type = 'T'
                      ) AND processed = 0
  )
   select emp_attendance_seq.nextval,
          employee_id,
         start_date,
       nvl(end_date, start_date)
      from prep
     where  mod(rn, 2) = 1;

and Chris said...

No.

With MERGE you can do INSERT or UPDATE. But you can't have one statement that does INSERT AND UPDATE at the same time.

One way to tackle this is to:

- Bulk collect all the rows you want to insert into an array
- Use forall insert/update to write the changes

Becuase you've fetched everything you want to change into the array, you can be sure that both processed use the same rows.

Here's an example:

declare
  type hist_rec is record (
    employee_id int,
    start_date  timestamp,
    end_date    timestamp
  );
  type hist_rec_arr 
    is table of hist_rec 
    index by pls_integer;
  
  hist_recs hist_rec_arr;
begin
  
  with prep ( employee_id, start_date, rn, end_date) as (
    select employee_id, access_date
         , row_number()      over (partition by card_num order by access_date)
         , lead(access_date) over (partition by card_num order by access_date)
    from   access_history
    where  location_id in ( 
      select location_id
      from   locations
      where  location_type = 'T'
    ) 
    and processed = 0
  )
   select employee_id,
          start_date,
          nvl(end_date, start_date)
   bulk collect into hist_recs
   from   prep
   where  mod(rn, 2) = 1;
   
  forall i in 1 .. hist_recs.count
    insert into emp_attendance (seq_num, employee_id, start_date, end_Date)
    values ( emp_attendance_seq.nextval, hist_recs(i).employee_id, 
      hist_recs(i).start_date, hist_recs(i).end_date
    );
    
  forall i in 1 .. hist_recs.count
    update access_history
    set    processed = 1
    where  employee_id = hist_recs(i).employee_id
    and    access_date = hist_recs(i).start_date;

end;
/

select * from access_history;

EMPLOYEE_ID    CARD_NUM    LOCATION_ID    ACCESS_DATE          TS                                PROCESSED   
             1 AAA1                   101 06212020 21:02:04    18-JUN-2020 13.42.45.879008000               1 
             1 AAA1                   102 06212020 23:52:14    18-JUN-2020 13.42.46.012733000               0 
             2 BBB2                   103 06212020 08:32:35    18-JUN-2020 13.42.46.210370000               1 
             2 BBB2                   102 06212020 15:39:05    18-JUN-2020 13.42.46.344091000               0 
             3 CCC3                   103 06212020 15:39:05    18-JUN-2020 13.42.46.442417000               1 
             3 CCC3                   105 06212020 18:19:55    18-JUN-2020 13.42.46.674809000               0 
             4 DDD4                   102 06212020 08:49:35    18-JUN-2020 13.42.46.756963000               1 
             3 CCC3                   104 06222020 04:04:35    18-JUN-2020 13.42.46.851017000               0

select * from emp_attendance;

SEQ_NUM    EMPLOYEE_ID    START_DATE           END_DATE             CREATE_DATE         
        13              1 06212020 21:02:04    06212020 23:52:14    06182020 13:52:19    
        14              2 06212020 08:32:35    06212020 15:39:05    06182020 13:52:19    
        15              3 06212020 15:39:05    06222020 04:04:35    06182020 13:52:19    
        16              4 06212020 08:49:35    06212020 08:49:35    06182020 13:52:19 


You'll need to take care if two or more sessions can run this at the same time though. If one starts before the other commits, they can process the same rows twice.

Rating

  (4 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A reader, June 18, 2020 - 2:43 pm UTC

Almost there. Please explain why ALL the access_history records are not set processed=1. Please explain why and how to accomplish that.

When new records are added some old ones will be added again. Am I missing something based on the output you provided.


Chris Saxon
June 19, 2020 - 1:50 pm UTC

Please explain why ALL the access_history records are not set processed=1.

Because this where clause in the original INSERT-SELECT excludes half the rows:

where  mod(rn, 2) = 1


So you tell me: why is there?

When new records are added some old ones will be added again

I'm not sure what you mean here - please clarify with an example.

A reader, June 20, 2020 - 6:53 am UTC

When creating the CTE there is a condition "AND processed = 0"

So when new records are added via the application processed=0. The next time the pairing query is run it will pickup old records from the previous run plus all the new records. By adding this to the update statement fixes the issue

AND access_date = hist_recs(j). end_date
Chris Saxon
June 22, 2020 - 10:52 am UTC

I'm not sure what you're saying here - does adding that predicate fix the issue or are you still not processing some rows? Please clarify!

If it's not working how you want, please share exactly what code you're running.

A reader, June 22, 2020 - 11:07 am UTC

Adding the and end_date.... fixed the issue and solved the problem. Thanks for your eloquent solution and expertise

Adding more details

Paul, June 27, 2020 - 5:56 pm UTC

In my environment, there is a table access_history, where rows are being inserted 24x7 and at rapid pace.

From the access_history table I am successfully PAIRING rows where the column processed = 0 to create emp_attendance rows. This process works fine.

Once the rows are successfully PAIRED I need to update the source table (access_history) set of rows, which created the target (emp_attendance) rows to processed=1 so subsequent runs will not pick up rows, which were already used for PAIRING.

I posted the code below and it can be run.

I thought I could capture the rows into an array, do INSERTS and then update the source. I'm using the array method to capture a snapshot of the data as new rows are being added constantly.

After running this process ALL the rows are not set to processed=1. See output below the red rows are the problem!!

Can someone provide an alternative solution on how to PAIR the rows and update only those rows in the snapshot. I suspect the problem is with the modulus operation in the CTE.

This functionality hasn't been released yet so changes can be made without consequences. Thanks in advance to all who answer.

-- drop table employees purge;
-- drop table locations purge;
-- drop table access_history purge;
-- drop table emp_attendance purge;
-- drop sequence emp_attendance_seq;

ALTER SESSION SET NLS_DATE_FORMAT = 'MMDDYYYY HH24:MI:SS';



Create table employees(
employee_id NUMBER(6),
first_name VARCHAR2(20),
last_name VARCHAR2(20),
card_num VARCHAR2(10)
);


create table locations(
location_id NUMBER(4),
location_name VARCHAR2(20),
location_type VARCHAR2(1)
);

create table access_history(
employee_id NUMBER(6),
card_num varchar2(10),
location_id number(4),
access_date date,
processed NUMBER(1) default 0
);


CREATE TABLE emp_attendance
(seq_num NUMBER(10),
employee_id NUMBER(6),
start_date DATE,
end_date DATE,
create_date DATE DEFAULT SYSDATE
);

Create sequence emp_attendance_seq;

INSERT into employees(
employee_id, first_name,
last_name, card_num)
VALUES
(1, 'John', 'Doe', 'AAA1');

INSERT into employees(
employee_id, first_name,
last_name, card_num)
VALUES
(2, 'Jane', 'Smith', 'BBB2');


INSERT into locations
( location_id,
location_name,
location_type)
VALUES (1, 'Door 1', 'T');

INSERT into locations
( location_id,
location_name,
location_type)
VALUES (2, 'Door 2', 'T');


insert into access_history (employee_id, card_num,
location_id, access_date)
values
(1, 'AAA1', 1,
TO_DATE('2020/07/03 11:02:44', 'yyyy/mm/dd hh24:mi:ss'));

insert into access_history (employee_id, card_num,
location_id, access_date)
values
(1, 'AAA1', 2,
TO_DATE('2020/07/03 11:12:04', 'yyyy/mm/dd hh24:mi:ss'));

insert into access_history (employee_id, card_num,
location_id, access_date)
values
(2, 'BBB2', 1,
TO_DATE('2020/07/03 11:02:44', 'yyyy/mm/dd hh24:mi:ss'));

insert into access_history (employee_id, card_num,
location_id, access_date)
values
(2, 'BBB2', 2,
TO_DATE('2020/07/03 11:12:04', 'yyyy/mm/dd hh24:mi:ss'));

insert into access_history (employee_id, card_num,
location_id, access_date)
values
(2, 'BBB2', 2,
TO_DATE('2020/07/03 15:15:04', 'yyyy/mm/dd hh24:mi:ss'));


SELECT * from access_history

EMPLOYEE_ID CARD_NUM LOCATION_ID ACCESS_DATE PROCESSED
1 AAA1 1 07032020 11:02:44 1
1 AAA1 2 07032020 11:12:04 0
2 BBB2 1 07032020 11:02:44 1
2 BBB2 2 07032020 11:12:04 0
2 BBB2 2 07032020 15:15:04 1

-- problem code with update.

declare
type hist_rec is record (
employee_id int,
start_date timestamp,
end_date timestamp
);
type hist_rec_arr
is table of hist_rec
index by pls_integer;

hist_recs hist_rec_arr;
begin

with prep ( employee_id, start_date, rn, end_date) as (
select employee_id, access_date
, row_number() over (partition by card_num order by access_date)
, lead(access_date) over (partition by card_num order by access_date)
from access_history
where location_id in (
select location_id
from locations
where location_type = 'T'
)
and processed = 0
)
select employee_id,
start_date,
nvl(end_date, start_date)
bulk collect into hist_recs
from prep
where mod(rn, 2) = 1;

forall i in 1 .. hist_recs.count
insert into emp_attendance (seq_num, employee_id, start_date, end_Date)
values ( emp_attendance_seq.nextval, hist_recs(i).employee_id,
hist_recs(i).start_date, hist_recs(i).end_date
);

forall i in 1 .. hist_recs.count
update access_history
set processed = 1
where employee_id = hist_recs(i).employee_id
and access_date = hist_recs(i).start_date;

end;





Chris Saxon
June 29, 2020 - 12:47 pm UTC

As I said earlier, this

where mod(rn, 2) = 1;


excludes HALF the rows from the results. So you'll only UPDATE HALF the rows.

If you want to get and update ALL the rows, remove this where clause.

See output below the red rows are the problem!!

We don't have any font highlighting, so can't see which the red rows are!

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.