Skip to Main Content

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Avil.

Asked: June 30, 2021 - 5:54 pm UTC

Last updated: July 13, 2021 - 3:52 pm UTC

Version: Oracle 19c

Viewed 1000+ times

You Asked

Hi Tom,

I have a requirement to split the dates when one date range can contain other date range as shown below.

Need help writing sql query for this.

These are the different examples.


Input:

IK IK1 START_DATE END_DATE
417532654 887825619 5/1/2017 4/30/2018
417532654 901076156 5/1/2017 12/31/2017

Output:

IK IK1 START_DATE END_DATE
417532654 887825619 5/1/2017 12/31/2017
417532654 901076156 1/1/2018 4/30/2018
417532654 901076156 5/1/2017 12/31/2017

Input:
IK IK1 START_DATE END_DATE
417517804 919990044 10/1/2020 10/31/2020
417517804 915844176 10/1/2020 9/30/2021
417517804 918267561 11/1/2020 12/31/2020

Output:

IK IK1 START_DATE END_DATE
417517804 919990044 10/1/2020 10/31/2020
417517804 915844176 10/1/2020 10/31/2020
417517804 918267561 11/1/2020 12/31/2020
417517804 915844176 11/1/2020 12/31/2020
417517804 915844176 1/1/2021 9/30/2021

Input:

IK IK1 START_DATE END_DATE
417517805 919990044 8/1/2020 8/1/2020
417517805 915844176 8/1/2020 8/31/2021
417517805 918267561 8/1/2020 8/1/2020
417517805 915844177 8/1/2020 8/31/2021

Output:

IK IK1 START_DATE END_DATE
417517805 919990044 8/1/2020 8/1/2020
417517805 915844176 8/1/2020 8/1/2021
417517805 915844176 8/2/2020 8/31/2021
417517805 918267561 8/1/2020 8/1/2020
417517805 915844177 8/1/2020 8/1/2021
417517805 915844177 8/2/2020 8/31/2021

Here is the script:


create table test_dates
( ik number,
ik1 number,
start_date date,
end_date date);



REM INSERTING into TEST_DATES
SET DEFINE OFF;
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517804,919990044,to_date('10/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('10/31/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517804,915844176,to_date('10/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('09/30/2021 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517804,918267561,to_date('11/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('12/31/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517805,915844177,to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('08/31/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517805,919990044,to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517805,915844176,to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('08/31/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417517805,918267561,to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('08/01/2020 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417532654,887825619,to_date('05/01/2017 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('04/30/2018 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));
Insert into TEST_DATES (IK,IK1,START_DATE,END_DATE) values (417532654,901076156,to_date('05/01/2017 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'),to_date('12/31/2017 12.00.00 AM','MM/DD/YYYY HH.MI.SS AM'));


and Chris said...

If you're looking to find all the ranges which fall within another, here's one way to approach this:

- Join the table to itself, getting all those rows where the start and end dates are both between the start & dates in the self-join

dups.start_date between base.start_date and base.end_date
and    dups.end_date between base.start_date and base.end_date

- The start date is the previous end date + 1 or the current start date for the first row - you can do this with lag():

lag ( 
 dups.end_date + 1, 1, base.start_date 
) over (
 partition by base.ik, base.ik1
 order by dups.end_date
)


- The end date is the max end date so far - you can do this with max() over ...

max ( dups.end_date ) over (
 partition by base.ik, base.ik1
 order by dups.end_date
)


Altogether this is:

alter session set nls_date_format = '  DD-MON-YYYY  ';

with rws as (
  select base.ik, base.ik1, 
         lag ( 
           dups.end_date + 1, 1, base.start_date 
         ) over (
           partition by base.ik, base.ik1
           order by dups.end_date
         ) start_date,
         max ( dups.end_date ) over (
           partition by base.ik, base.ik1
           order by dups.end_date
         ) end_date
  from   test_dates base
  join   test_dates dups
  on     dups.start_date between base.start_date and base.end_date
  and    dups.end_date between base.start_date and base.end_date
)
select * from rws
where  start_date <= end_date
order  by start_date, end_date;

IK           IK1         START_DATE         END_DATE          
417532654    887825619   01-MAY-2017        31-DEC-2017      
417532654    901076156   01-MAY-2017        31-DEC-2017      
417532654    887825619   01-JAN-2018        30-APR-2018  

417517805    919990044   01-AUG-2020        01-AUG-2020      
417517805    918267561   01-AUG-2020        01-AUG-2020      
417517805    915844177   01-AUG-2020        01-AUG-2020      
417517805    915844176   01-AUG-2020        01-AUG-2020      
417517805    915844176   02-AUG-2020        31-AUG-2020      
417517805    915844177   02-AUG-2020        31-AUG-2020

417517804    915844176   01-OCT-2020        31-OCT-2020      
417517804    919990044   01-OCT-2020        31-OCT-2020      
417517804    918267561   01-NOV-2020        31-DEC-2020      
417517804    915844176   01-NOV-2020        31-DEC-2020      
417517804    915844176   01-JAN-2021        30-SEP-2021


This assumes that all sub-periods are contained within another with no overlaps.

e.g. there are no periods like this:

1 Jan 2021 - 1 May 2021
1 Feb 2021 - 1 Jul 2021

That you need to split into 1 Jan - 1 Feb - 1 May - 1 Jul. It also assumes no gaps in the subranges. If there are, you can't (always) take the previous end date.

If these assumptions are wrong, you'll need to update the solution. I recommend reading Stew Ashton's blog posts on merging/splitting overlapping date ranges

https://stewashton.wordpress.com/2014/03/11/sql-for-date-ranges-gaps-and-overlaps/

Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.