Skip to Main Content
  • Questions
  • oracle track the history on a table with timestamp columns

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, tom.

Asked: July 25, 2020 - 6:17 pm UTC

Last updated: July 28, 2020 - 2:57 pm UTC

Version: Oracle 12c

Viewed 1000+ times

You Asked

I have 2 tables in Oracle 12c database. Table A has incoming data from an application with modified date+time timestamps for a given product, each day we may get around 50,000 rows in table_A. The goal is to use table A's data and insert into the final target table B(usually has billions of rows) by using table_A's data as the driving data set.

A record needs to be inserted/updated in table_B only when there is a change in the incoming dataset for given product attributes.

The purpose is to track the history/journey of a given product with valid timestamps only when there are changes in its attributes such as state and zip_cd. There should be any overlapping dates or gaps between valid_from and valid_to date fields in the final target table B after inserting the data from A. Think of table_B as slowly changing dimension with Valid_date, valid_to date intervals to indicate the change of a given product


Looking for some guidance as to what would be the best approach to solve this problem. It's ok to create one or multiple temporary/intermediate tables to achieve this.

with LiveSQL Test Case:

and Chris said...

There are all sorts of problems with the sample data. Valid_to dates before valid_from, overlapping dates, ... This makes it tough to verify the solution does what you want!

Anyhow, you should be able use this as the basis for what you're doing:

- Merge table B with a query on A that:
- Generates valid from/to dates by using lead ( dt ) to find the next date for each product
- Match on product ids and valid from date = modified date

This gives:

create table table_a ( 
  product_id varchar2(30), state varchar2(30), zip_cd varchar2(30), modified_dt date
);
create table table_b ( 
  sequence_key number, product_id varchar2(30), state varchar2(30), zip_cd varchar2(30), valid_from date, valid_to date, latest_flag varchar2(1)
);

begin
  insert into table_a values ('abc',  'MN', '123', to_timestamp('3/5/2020 12:01:00 AM', 'mm/dd/yyyy hh:mi:ss AM'));
  insert into table_a values ('abc',  'MN', '123', to_timestamp('3/5/2020 6:01:13 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('abc',  'IL', '223', to_timestamp('3/5/2020 7:01:15 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('abc',  'OH', '333', to_timestamp('3/5/2020 6:01:16 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('abc',  'NY', '722', to_timestamp('3/5/2020 4:29:00 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('abc',  'KS', '444', to_timestamp('3/5/2020 4:31:41 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('bbc',  'MN', '123', to_timestamp('3/19/2020 2:47:08 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('bbc',  'IL', '223', to_timestamp('3/19/2020 2:50:37 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('ccb',  'MN', '123', to_timestamp('3/19/2020 2:56:24 PM', 'mm/dd/yyyy hh:mi:ss PM'));
  insert into table_a values ('dbd',  'KS', '444', to_timestamp('3/19/2020 12:00:00 AM', 'mm/dd/yyyy hh:mi:ss AM'));
  commit;
end;
/

create or replace procedure load_data as
begin

  merge into table_b b
  using ( 
    select a.*, 
           lead ( modified_dt, 1, date'2099-12-31' ) over ( 
             partition by product_id
             order by modified_dt
           ) next_date
    from   table_a a
  ) a
  on   ( b.product_id = a.product_id and b.valid_from = a.modified_dt )
  when not matched then 
    insert  ( b.product_id, b.state, b.zip_cd, b.valid_from, b.valid_to, b.latest_flag )
    values  ( a.product_id, a.state, a.zip_cd, a.modified_dt, a.next_date, case
      when next_date = date'2099-12-31' then 'Y' else 'N' end
    )
  when matched then
    update set b.valid_to = a.next_date, b.latest_flag = 'N'
    where  a.next_date <> date'2099-12-31';
    
end load_data;
/

exec load_data();
  
select * from table_b
order  by product_id, valid_from;

SEQUENCE_KEY    PRODUCT_ID    STATE    ZIP_CD    VALID_FROM              VALID_TO                LATEST_FLAG   
         <null> abc           MN       123       05-MAR-2020 00:01:00    05-MAR-2020 16:29:00    N              
         <null> abc           NY       722       05-MAR-2020 16:29:00    05-MAR-2020 16:31:41    N              
         <null> abc           KS       444       05-MAR-2020 16:31:41    05-MAR-2020 18:01:13    N              
         <null> abc           MN       123       05-MAR-2020 18:01:13    05-MAR-2020 18:01:16    N              
         <null> abc           OH       333       05-MAR-2020 18:01:16    05-MAR-2020 19:01:15    N              
         <null> abc           IL       223       05-MAR-2020 19:01:15    31-DEC-2099 00:00:00    Y              
         <null> bbc           MN       123       19-MAR-2020 14:47:08    19-MAR-2020 14:50:37    N              
         <null> bbc           IL       223       19-MAR-2020 14:50:37    31-DEC-2099 00:00:00    Y              
         <null> ccb           MN       123       19-MAR-2020 14:56:24    31-DEC-2099 00:00:00    Y              
         <null> dbd           KS       444       19-MAR-2020 00:00:00    31-DEC-2099 00:00:00    Y
  
insert into table_a values ( 'abc', 'TX', 654, sysdate );

exec load_data();

select * from table_b
order  by product_id, valid_from;

SEQUENCE_KEY    PRODUCT_ID    STATE    ZIP_CD    VALID_FROM              VALID_TO                LATEST_FLAG   
         <null> abc           MN       123       05-MAR-2020 00:01:00    05-MAR-2020 16:29:00    N              
         <null> abc           NY       722       05-MAR-2020 16:29:00    05-MAR-2020 16:31:41    N              
         <null> abc           KS       444       05-MAR-2020 16:31:41    05-MAR-2020 18:01:13    N              
         <null> abc           MN       123       05-MAR-2020 18:01:13    05-MAR-2020 18:01:16    N              
         <null> abc           OH       333       05-MAR-2020 18:01:16    05-MAR-2020 19:01:15    N              
         <null> abc           IL       223       05-MAR-2020 19:01:15    28-JUL-2020 14:54:52    N              
         <null> abc           TX       654       28-JUL-2020 14:54:52    31-DEC-2099 00:00:00    Y              
         <null> bbc           MN       123       19-MAR-2020 14:47:08    19-MAR-2020 14:50:37    N              
         <null> bbc           IL       223       19-MAR-2020 14:50:37    31-DEC-2099 00:00:00    Y              
         <null> ccb           MN       123       19-MAR-2020 14:56:24    31-DEC-2099 00:00:00    Y              
         <null> dbd           KS       444       19-MAR-2020 00:00:00    31-DEC-2099 00:00:00    Y


Note the case expression to set the latest_flag - this assumes that 31 Dec 2099 is "infinity", so identifies current values. The where clause in the when matched clause stops this overwriting latest to N if you run the merge twice with no changes in between.

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.