SQL Query to find gaps in date ranges

Question and Answer

Tom Kyte

Thanks for the question, Dave.

Asked: November 05, 2007 - 12:46 pm UTC

Last updated: March 25, 2012 - 9:53 am UTC

Version: 10.2.0

Viewed 50K+ times.

You Asked

Hi Tom,

I have a problem that I would like to solve using pure SQL and not PL/SQL (if possible). I suspect that there may be an Analytic solution to the problem, but have not been able to work it out for myself thus far.

I have a table with the following records in it:


create table t ( a int, b date, c date );

insert into t values(1, to_date( '01-jan-2007', 'dd-mon-yyyy'), to_date( '15-jan-2007', 'dd-mon-yyyy'));
insert into t values(2, to_date( '03-jan-2007', 'dd-mon-yyyy'), to_date( '10-jan-2007', 'dd-mon-yyyy'));
insert into t values(3, to_date( '12-jan-2007', 'dd-mon-yyyy'), to_date( '25-jan-2007', 'dd-mon-yyyy'));
insert into t values(4, to_date( '20-jan-2007', 'dd-mon-yyyy'), to_date( '01-feb-2007', 'dd-mon-yyyy'));
insert into t values(5, to_date( '05-feb-2007', 'dd-mon-yyyy'), to_date( '10-feb-2007', 'dd-mon-yyyy'));
insert into t values(6, to_date( '05-feb-2007', 'dd-mon-yyyy'), to_date( '28-feb-2007', 'dd-mon-yyyy'));
insert into t values(7, to_date( '10-feb-2007', 'dd-mon-yyyy'), to_date( '15-feb-2007', 'dd-mon-yyyy'));
insert into t values(8, to_date( '18-feb-2007', 'dd-mon-yyyy'), to_date( '23-feb-2007', 'dd-mon-yyyy'));
insert into t values(9, to_date( '22-feb-2007', 'dd-mon-yyyy'), to_date( '16-mar-2007', 'dd-mon-yyyy'));



Is there any way using a SELECT statement to determine the Gaps that exist within this table?

From the above example, the gap that exists is:
02-feb-2007 --> 04-feb-2007

This assumes that when date ranges overlap, the overlap eliminates a gap. For example, the apparent gap of 11-jan-2007 between the Effective End Date of record 2 (10-jan-2007) and the Effective Start Date of record 3 (12-jan-2007) does not exist, due to the overlap with record 1's Effective Start and End dates.

Hope this makes sense and I hope you can help.
thanks
Dave




and Tom said...

Here is one approach. I used "50" - you can replace 50 with a 'better' value, or you could use

with 
data as (select max(c-b) c_b from t),
r as (select level-1 l from data connect by level <= c_b)
select .....


if you do the "data" one, you do not need the case statement as protection...

ops$tkyte%ORA10GR2> variable x number
ops$tkyte%ORA10GR2> exec :x := 50

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> with
  2  r as
  3  (select level-1 l
  4     from dual
  5  connect by level <= :x
  6  )
  7  select last_dt+1, dt-1
  8    from (
  9  select dt, lag(dt) over (order by dt) last_dt
 10    from (
 11  select distinct t.b+r.l dt
 12    from t, r
 13   where r.l <= t.c-t.b and case when t.c-t.b > :x then 1/0 end is null
 14         )
 15             )
 16   where last_dt <> dt-1
 17  /

LAST_DT+1 DT-1
--------- ---------
02-FEB-07 04-FEB-07
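For readers who want to check the logic outside the database: the query expands every range into its individual days (the row generator r supplies the offsets), de-duplicates them, and then compares each day with the previous one via LAG. A rough Python sketch of the same idea (an illustration, not part of the original answer; `find_gaps` is a made-up name):

```python
from datetime import date, timedelta

def find_gaps(ranges):
    # Expand each (start, end) range into its individual days and
    # de-duplicate -- the "select distinct t.b+r.l dt" step.
    covered = sorted({b + timedelta(d)
                      for b, c in ranges
                      for d in range((c - b).days + 1)})
    # A gap exists wherever two consecutive covered days are more
    # than one day apart -- the LAG(dt) comparison.
    return [(prev + timedelta(1), cur - timedelta(1))
            for prev, cur in zip(covered, covered[1:])
            if (cur - prev).days > 1]

ranges = [
    (date(2007, 1, 1),  date(2007, 1, 15)),
    (date(2007, 1, 3),  date(2007, 1, 10)),
    (date(2007, 1, 12), date(2007, 1, 25)),
    (date(2007, 1, 20), date(2007, 2, 1)),
    (date(2007, 2, 5),  date(2007, 2, 10)),
    (date(2007, 2, 5),  date(2007, 2, 28)),
    (date(2007, 2, 10), date(2007, 2, 15)),
    (date(2007, 2, 18), date(2007, 2, 23)),
    (date(2007, 2, 22), date(2007, 3, 16)),
]
print(find_gaps(ranges))  # [(datetime.date(2007, 2, 2), datetime.date(2007, 2, 4))]
```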

Rating: (28 ratings)

Comments

SQL writing is an art....

Satya, November 07, 2007 - 9:31 am UTC

SQL writing is an art, and Tom does it well.
Tom Kyte
November 07, 2007 - 6:00 pm UTC

not any more than writing good java code is
or good cobol
or good assembler

<X> writing is gleaned through experience, time, effort and mentoring. Anyone can do it.

here is a completely other approach

Matthias Rogel, November 07, 2007 - 10:17 am UTC

SQL> with minmax as
  2  (
  3  select min(b) mi, max(c) ma
  4  from t
  5  )
  6  select mi+level-1 miss
  7  from minmax
  8  connect by mi+level-1<=ma
  9  minus
 10  select b+level-1
 11  from t
 12  connect by b+level-1<=c
 13  and prior a=a and prior dbms_random.value is not null
 14  /

MISS
-----------
02-feb-2007
03-feb-2007
04-feb-2007

having some time to spare ;o)

A writer, November 07, 2007 - 10:54 am UTC

Your solution obviously assumes the dates are whole days - though that seems to be the case here.
And the reliance on MAX(c-b) makes the solution not very scalable.

Another solution working on time ranges ... the output should be interpreted as open intervals.
"a" is assumed to be the primary key ... else, use rowid instead.

flip@FLOP> select c, next_b from
  2  (
  3  select a,b,c,lead(b) over (order by b) next_b from t t2
  4  where  not exists ( select null from t t1
  5                      where t1.b <= t2.b and t1.c >= t2.c and t1.a != t2.a)
  6  )
  7  where c < next_b
  8  order by b;

C         NEXT_B
--------- ---------
01-FEB-07 05-FEB-07

flip@FLOP> alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';

Session altered.

flip@FLOP> begin
  2  delete from t;
  3  insert into t values (1, to_date('01-jan-2007 13:00:00'),to_date('01-jan-2007 14:00:00'));
  4  insert into t values (1, to_date('01-jan-2007 14:00:01'),to_date('01-jan-2007 20:00:00'));
  5  insert into t values (1, to_date('01-jan-2007 18:00:00'),to_date('01-jan-2007 20:00:00'));
  6  insert into t values (1, to_date('01-jan-2007 21:00:00'),to_date('01-jan-2007 22:00:00'));
  7  end;
  8  /

PL/SQL procedure successfully completed.

flip@FLOP> select c, next_b from
  2  (
  3  select a,b,c,lead(b) over (order by b) next_b from t t2
  4  where  not exists ( select null from t t1
  5                      where t1.b <= t2.b and t1.c >= t2.c and t1.a != t2.a)
  6  )
  7  where c < next_b
  8  order by b;

C                    NEXT_B
-------------------- --------------------
01-jan-2007 14:00:00 01-jan-2007 14:00:01
01-jan-2007 20:00:00 01-jan-2007 21:00:00

flip@FLOP>

Tom Kyte
November 07, 2007 - 6:06 pm UTC

... And the reliance on MAX(c-b) makes the solution not very scalable. ... how so?

I can envision your not exists "not scaling" horribly well either - regardless of the indexing scheme you use...

1/0

A reader, November 07, 2007 - 3:20 pm UTC

could you please explain why
.. then 1/0 ...
is safe here and doesn't throw an exception ?
Tom Kyte
November 07, 2007 - 6:24 pm UTC

because 1/0 is only evaluated when the "where" bit succeeds - when we underguessed the number of rows we need to generate.

simpler solution...

Antony Boucher, November 07, 2007 - 7:30 pm UTC

What about this? Wouldn't that scale better?

SELECT * FROM
(SELECT MAX(C) OVER (ORDER BY B) D, LEAD(B) OVER (ORDER BY B) E FROM T)
WHERE D < E
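The query is effectively a single sweep over the rows ordered by start date: MAX(C) OVER carries the running maximum end date, and LEAD(B) supplies the next start, so a gap is flagged whenever the running end stops short of the next start. A Python sketch of that sweep (illustrative only; it reports inclusive gap days, i.e. with the +1/-1 adjustment that Stew applies later in the thread):

```python
from datetime import date, timedelta

def gaps_sweep(ranges):
    rows = sorted(ranges)                 # ORDER BY b
    gaps = []
    running_end = rows[0][1]              # MAX(c) OVER (ORDER BY b)
    for start, end in rows[1:]:           # each next start = LEAD(b)
        if (start - running_end).days > 1:
            gaps.append((running_end + timedelta(1), start - timedelta(1)))
        running_end = max(running_end, end)
    return gaps
```

Because the running maximum swallows ranges wholly contained in earlier ones, no self-join and no day-by-day expansion is needed - one pass over the sorted rows.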

harsh

A reader, November 07, 2007 - 7:53 pm UTC

And the reliance on MAX(c-b) makes the solution not very scalable. ... how so?

Keep adding, one by one ...

insert into t values(10, to_date( '22-feb-2008'), to_date( '16-mar-2018'));
insert into t values(11, to_date( '22-feb-2028'), to_date( '16-mar-2098'));

and there should be an incremental slowdown.

If these do not seem to be realistic DATEs, consider the similar problem with INTEGER intervals. If that's another problem requiring maybe another solution then ... well, that should be it then.

I can envision your not exists "not scaling" horribly well either - regardless of the indexing scheme you use

Indexing B seems to work (I replicated the original content up to 147K rows). I could be wrong though.

Sorry for sounding harsh.

And Anthony's excellent solution kind of makes all of these irrelevant.

Tom Kyte
November 08, 2007 - 9:52 am UTC

flip@FLOP> select c, next_b from
  2  (
  3  select a,b,c,lead(b) over (order by b) next_b from t t2
  4  where  not exists ( select null from t t1
  5                      where t1.b <= t2.b and t1.c >= t2.c and t1.a != t2.a)
  6  )
  7  where c < next_b
  8  order by b;



indexing B ....

it depends on the nature of the data entirely again (just like mine) - if there are tons of rows that satisfy "t1.b <= t2.b" such that NOT(t1.c >= t2.c and t1.a != t2.a) - it won't. Depends on the nature of the data.

But agreed, Antony's thinking outside the box does make it sort of a linear thing.

misspelling ...

A reader, November 07, 2007 - 7:57 pm UTC

Antony:

Sorry for not getting your name right.

Very nice solution.

great stuff Antony !

Matthias Rogel, November 08, 2007 - 9:18 am UTC

******
from me !

Nopparat V., November 08, 2007 - 11:51 pm UTC

Very nice solution from Antony. Thank you.

Want to Find Date Gaps with in two given dates

Suji, February 18, 2009 - 4:48 pm UTC

I would like to get the gaps between two different dates within my user-specified date range. For example, I have these dates in a table:
stt_dt end_dt
10/01/2008 09/30/2009
10/11/2009 12/11/2009
10/01/2010 11/01/2011

I want to find out the gaps between 09/30/2007 and 01/01/2011.
Is there any solution?
Tom Kyte
February 18, 2009 - 5:47 pm UTC

can the stt_dt/end_dt ranges overlap in your table?

stt_dt        end_dt
10/01/2008   09/30/2009
10/11/2008   12/11/2009  
11/01/2008   11/01/2011



is that a possible set of data? (details, details, details - it's all in the details...)

sure

Sokrates, February 19, 2009 - 7:48 am UTC

Without knowing Suji's requirements, I would say: of course, overlaps are possible.

For the simple reason that there is no way to forbid them via a constraint - or is there?
Tom Kyte
February 19, 2009 - 8:55 am UTC

there is a materialized view trick you could use (on commit refresh - would fail transaction at commit)

or a trigger (properly coded, correctly coded)

or a transactional API that does the necessary locking to serialize properly so you can check safely.

Overlapping is possible

Suji, February 19, 2009 - 2:33 pm UTC

Like my dates will be
stt_dt end_dt
10/1/1999 9/30/2001
10/1/2000 10/30/2000 (This is overlapping)
10/1/2003 9/30/2007
10/1/2009 9/30/2010

User Date Ranges are 10/01/1997 to 09/30/2011
My Answer (gaps) should be
10/01/1997 09/30/1999
10/01/2001 09/30/2003
10/01/2007 09/30/2009
10/01/2010 09/30/2011

Thanks in advance



Tom Kyte
February 19, 2009 - 4:53 pm UTC

take this:

ops$tkyte%ORA10GR2> with data
  2  as
  3  (select to_date('10/01/1997', 'mm/dd/yyyy')+level-1 dt
  4     from dual
  5  connect by level <= to_date('09/30/2011', 'mm/dd/yyyy')- to_date('10/01/1997', 'mm/dd/yyyy')
  6  ),
  7  dates
  8  as
  9  (select distinct dt
 10     from data left outer join t on ( data.dt between t.s and t.e )
 11    where t.s is null
 12  )
 13  select *
 14    from (
 15  select dt,
 16             nvl(lag(dt) over (order by dt), to_date('10/01/1997', 'mm/dd/yyyy') ) last_dt,
 17         nvl(lead(dt) over (order by dt), to_date('09/30/2011', 'mm/dd/yyyy')) next_dt
 18    from dates
 19         )
 20   where dt <> next_dt-1
 21      or dt <> last_dt+1
 22  /

DT         LAST_DT    NEXT_DT
---------- ---------- ----------
10/01/1997 10/01/1997 10/02/1997
09/30/1999 09/29/1999 10/01/2001
10/01/2001 09/30/1999 10/02/2001
09/30/2003 09/29/2003 10/01/2007
10/01/2007 09/30/2003 10/02/2007
09/30/2009 09/29/2009 10/01/2010
10/01/2010 09/30/2009 10/02/2010

7 rows selected.


and see what you can start to do with it...

re: finding date gaps

Stew Ashton, February 21, 2009 - 5:30 am UTC


Tom, in the first WITH clause called "data", you stopped short one day too early: add one at the end of the CONNECT BY clause.
with data
as (
  select to_date('10/01/1997', 'mm/dd/yyyy')+level-1 dt
  from dual
  connect by level <= to_date('09/30/2011', 'mm/dd/yyyy')- to_date('10/01/1997', 'mm/dd/yyyy') + 1
),
dates
as
(select distinct dt
   from data left outer join t on ( data.dt between t.s and t.e )
  where t.s is null
)
select *
  from (
select dt,
       nvl(lag(dt) over (order by dt), to_date('10/01/1997', 'mm/dd/yyyy') ) last_dt,
       nvl(lead(dt) over (order by dt), to_date('09/30/2011', 'mm/dd/yyyy')) next_dt
  from dates
       )
 where dt <> next_dt-1
    or dt <> last_dt+1

DT                        LAST_DT                   NEXT_DT                   
------------------------- ------------------------- ------------------------- 
10/01/1997                10/01/1997                10/02/1997                
09/30/1999                09/29/1999                10/01/2001                
10/01/2001                09/30/1999                10/02/2001                
09/30/2003                09/29/2003                10/01/2007                
10/01/2007                09/30/2003                10/02/2007                
09/30/2009                09/29/2009                10/01/2010                
10/01/2010                09/30/2009                10/02/2010                
09/30/2011                09/29/2011                09/30/2011                

8 rows selected
Now the answer is in the first column: odd-numbered lines show the gap start dates, even-numbered lines show the gap end dates.
Tom Kyte
February 21, 2009 - 9:14 pm UTC

thanks!

Come back, Antony Boucher !

Stew Ashton, February 21, 2009 - 12:02 pm UTC


Last week I spent hours searching for Antony Boucher's post above (November 2007) because I vaguely remembered it was a remarkably concise use of analytics. So today I looked over Tom's answer just above and posted a slight bug fix. Now I scroll up and behold! there is Antony's post in all its glory

and

it still works :)
with data as (
  select * from t
  union all
  select to_date('10/01/1997', 'mm/dd/yyyy') - 1, to_date('10/01/1997', 'mm/dd/yyyy') - 1 from dual
  union all
  select to_date('09/30/2011', 'mm/dd/yyyy') + 1, to_date('09/30/2011', 'mm/dd/yyyy') + 1 from dual
)
SELECT to_char(gap_start, 'mm/dd/yyyy') gap_start, to_char(gap_end, 'mm/dd/yyyy') gap_end FROM
  (SELECT MAX(e) OVER (ORDER BY s) + 1 gap_start, LEAD(s) OVER (ORDER BY s) - 1 gap_end FROM data)
WHERE gap_start < gap_end;

GAP_START  GAP_END    
---------- ---------- 
10/01/1997 09/30/1999 
10/01/2001 09/30/2003 
10/01/2007 09/30/2009 
10/01/2010 09/30/2011 

4 rows selected
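The sentinel trick generalizes: to bound the search to a reporting window, union in two one-day "ranges" sitting just outside the window, and the same running-max sweep then emits the gaps that touch the window edges. A Python sketch of that idea (an illustration only; `gaps_in_window` is a hypothetical helper name):

```python
from datetime import date, timedelta

def gaps_in_window(ranges, win_start, win_end):
    # Sentinel "ranges" one day outside the window, playing the role
    # of the two UNION ALL rows in the SQL version.
    lo, hi = win_start - timedelta(1), win_end + timedelta(1)
    rows = sorted(list(ranges) + [(lo, lo), (hi, hi)])
    gaps, running_end = [], rows[0][1]
    for start, end in rows[1:]:
        if (start - running_end).days > 1:   # gap_start <= gap_end
            gaps.append((running_end + timedelta(1), start - timedelta(1)))
        running_end = max(running_end, end)
    return gaps
```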

Awesome...

Parnesh, February 24, 2009 - 3:23 pm UTC

This query solves the problem. But how do we write the same query without using the WITH clause?
Tom Kyte
February 24, 2009 - 5:21 pm UTC

why?

Correction to post of February 21, 2009

Stew Ashton, March 14, 2009 - 4:33 pm UTC


In my SQL query just above, the last line should be
WHERE gap_start <= gap_end;
Sorry about that :(
Tom Kyte
March 16, 2009 - 9:24 am UTC

no worries, welcome to my world :)

Dates and Analytics

Matt, March 26, 2009 - 11:33 am UTC

Very useful stuff. I have a variation on a theme: for each record I need to show whether it has a following or prior linked record, based on date ranges.

I can do this simply with the analytic LAG, until I get a record which is wholly contained within another.

The only way I can solve the puzzle is with 3 hits on the same table! Any clever analytic solution appreciated. I have toyed with this for ages!!



CREATE TABLE delete_me
(
  membno      NUMBER,
  start_date  DATE,
  end_date    DATE
);

SET DEFINE OFF;
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (1, TO_DATE('04/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('05/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (1, TO_DATE('04/10/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/15/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (1, TO_DATE('04/19/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/22/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (1, TO_DATE('05/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/01/2009 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (2, TO_DATE('04/01/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/30/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (2, TO_DATE('05/02/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('05/17/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
INSERT INTO delete_me (membno, start_date, end_date)
  VALUES (2, TO_DATE('05/13/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('05/22/2008 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;

SELECT dm.*,
       ( SELECT MAX('Y')
         FROM delete_me dm2
         WHERE membno = dm.membno
         AND dm.start_date BETWEEN dm2.start_date AND dm2.end_date
         AND dm2.ROWID != dm.ROWID ) prev,
       ( SELECT MAX('Y')
         FROM delete_me dm2
         WHERE membno = dm.membno
         AND dm.end_date BETWEEN dm2.start_date AND dm2.end_date
         AND dm2.ROWID != dm.ROWID ) NEXT ,
         MAX( end_date ) OVER ( PARTITION BY membno ORDER BY  start_date ) sd
FROM delete_me dm
ORDER BY 1,2;

MEMBNO                                 START_DATE          END_DATE            PREV NEXT SD                 
1                                      01/04/2008          01/05/2008               Y    01/05/2008         
1                                      10/04/2008          15/04/2008          Y    Y    01/05/2008         
1                                      19/04/2008          22/04/2008          Y    Y    01/05/2008         
1                                      01/05/2008          01/01/2009          Y         01/01/2009         
2                                      01/04/2008          30/04/2008                    30/04/2008         
2                                      02/05/2008          17/05/2008               Y    17/05/2008         
2                                      13/05/2008          22/05/2008          Y         22/05/2008         

Tom Kyte
March 30, 2009 - 3:06 pm UTC

can you phrase this in the form of a 'question' rather than a sql query I'd have to parse and interpret and live with all of the assumptions you've made?

I'm not clear on "until I get a record which is wholely within another. ", what does it mean to you to have a record within another?

I don't understand the prev and next 'Y' values - what they mean?


you could write the two scalar subqueries as one

where dm2.membno = dm.membno
  and dm2.start_date <= dm.end_date
  and dm2.end_date >= dm.start_date


but that wouldn't tell you to "delete that record", but rather "these two could be merged" (and perhaps merged again with something else and again and again...)
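The combined predicate is the standard closed-interval overlap test: two ranges overlap exactly when each one starts no later than the other ends. A minimal sketch (the helper name is made up):

```python
from datetime import date

def overlaps(r1, r2):
    # (s1 <= e2) and (s2 <= e1): covers partial overlap, full
    # containment, and ranges that merely touch at an endpoint.
    return r1[0] <= r2[1] and r2[0] <= r1[1]

# Record 2 of Matt's data (04/10 - 04/15) sits wholly inside record 1
# (04/01 - 05/01), and the single predicate still catches it:
print(overlaps((date(2008, 4, 1), date(2008, 5, 1)),
               (date(2008, 4, 10), date(2008, 4, 15))))  # True
```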

Reader, April 22, 2009 - 2:45 pm UTC

Tom,
create table tst
(tst_id varchar2(10),
dt date,
qty number);
insert into tst values ('F1',to_date(20090402,'yyyymmdd'),100000);
insert into tst values ('F1',to_date(20090401,'yyyymmdd'),20000);
insert into tst values ('F1',to_date(20090312,'yyyymmdd'),100000);
insert into tst values ('F2',to_date(20090421,'yyyymmdd'),124000);
insert into tst values ('F2',to_date(20090420,'yyyymmdd'),24000);
insert into tst values ('F2',to_date(20090417,'yyyymmdd'),25000);
commit;

select * from tst
TST_ID DT QTY
---------- --------- ----------
F1 02-APR-09 100000
F1 01-APR-09 20000
F1 12-MAR-09 100000
F2 21-APR-09 124000
F2 20-APR-09 24000
F2 17-APR-09 25000

I need to calculate the sum of qty for the last 20 dates in the calendar. For example
If today's date = 2009-04-21

select (to_date(20090421,'yyyymmdd') - 20) from dual; returns 4/1/2009

For example, I need to calculate the sum(qty) from 20090401 to 20090421 for TST_ID=F1 and TST_ID=F2 in the table. I have many tst_id's like this in the table.
Can you advise how to achieve this with a query?

DT_calculated TST_ID TOTAL_QTY_for_last_20_dates
04-21-2009 F1 120000
04-21-2009 F2 173000

Data for 12-MAR-09 for F1 present in the table should not be considered since 12-MAR-09 does not fall in the last 20 dates.
Tom Kyte
April 24, 2009 - 4:30 pm UTC

don't take this wrong, but this is pretty simple SQL? You just want the rows such that dt is between sysdate-20 and sysdate - then aggregate...

ops$tkyte%ORA11GR1> select * from tst where dt between to_date('2009-04-21','yyyy-mm-dd')-20 and to_date('2009-04-21','yyyy-mm-dd');

TST_ID     DT               QTY
---------- --------- ----------
F1         02-APR-09     100000
F1         01-APR-09      20000
F2         21-APR-09     124000
F2         20-APR-09      24000
F2         17-APR-09      25000

ops$tkyte%ORA11GR1> select tst_id, sum(qty)
  2  from tst where dt between to_date('2009-04-21','yyyy-mm-dd')-20 and to_date('2009-04-21','yyyy-mm-dd')
  3  group by tst_id;

TST_ID       SUM(QTY)
---------- ----------
F1             120000
F2             173000

Nice Antony!

Erik, August 17, 2009 - 2:00 pm UTC

Just thought I'd add this in case anyone was dealing with a similar issue to mine.

I've been looking for an easy/scalable way to do this...and I was able to use the basics of your query and expand on it...

The difference was that I needed to find gaps in a table for EACH GIVEN ENTITY.

Say for instance I have a "complete" history of an employee, but for whatever reason, there are points in history that they had left employment and then come back (i.e. seasonal employment, temp agency, etc.). I needed to find those gaps and insert a row that noted as such.

Given the following data (for this example 1/1/1901 and 12/31/2499 are the "beginning" and "end" of time - very god-like, I know ;)):
emp_id start_date end_date
1 1/1/1901 10/31/1991
1 1/13/1992 6/27/1994
1 10/10/1994 5/6/1998
1 10/1/1999 11/30/1999
1 2/28/2000 12/31/2499
2 1/1/1901 10/3/1994
2 11/28/1994 2/5/1995
2 7/1/1995 12/29/1995
2 5/20/1996 5/21/1996
2 10/21/1996 10/24/1997
2 12/1/1997 7/31/1998
2 10/1/1998 3/31/1999
2 4/16/1999 12/31/2499

I needed it to separate the gaps by employee ID. Adding the "PARTITION BY" clause to the MAX function solved this nicely:
SELECT
emp_id,
gap_start,
gap_end
FROM
(SELECT
emp_id,
MAX(end_date) OVER (PARTITION BY emp_id ORDER BY start_date)+1 gap_start,
LEAD(start_date) OVER (ORDER BY emp_id, start_date)-1 gap_end
FROM employee_hist
)
WHERE gap_start <= gap_end
;

Returns the following:
emp_id gap_start gap_end
1 11/1/1991 1/12/1992
1 6/28/1994 10/9/1994
1 1/23/1995 4/19/1995
1 5/7/1998 9/30/1999
1 12/1/1999 2/27/2000
2 10/4/1994 11/27/1994
2 2/6/1995 6/30/1995
2 12/30/1995 5/19/1996
2 5/22/1996 10/20/1996
2 10/25/1997 11/30/1997
2 8/1/1998 9/30/1998
2 4/1/1999 4/15/1999

Thanks for the start Tom and Antony! Couldn't have figured this out without you!

Small refinement

Ted C, January 26, 2011 - 2:59 pm UTC

Erik's query was close, but I had some erroneous returns.

The gap_end column had to be specified the same way as the gap_start column.

SELECT
emp_id,
gap_start,
gap_end
FROM
(SELECT
emp_id,
MAX(end_date) OVER (PARTITION BY emp_id ORDER BY start_date)+1 gap_start,
LEAD(start_date) OVER (PARTITION BY emp_id ORDER BY start_date)-1 gap_end
FROM employee_hist
)
WHERE gap_start <= gap_end
;

This works great! Thanks to all.
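In sweep terms, PARTITION BY emp_id on both analytic functions simply restarts the running-maximum end date at each employee boundary. A Python sketch of the per-entity version (illustrative only; `gaps_per_entity` is a made-up name):

```python
from datetime import date, timedelta
from itertools import groupby
from operator import itemgetter

def gaps_per_entity(rows):
    # rows: (emp_id, start_date, end_date) tuples.
    out = []
    for emp, grp in groupby(sorted(rows), key=itemgetter(0)):
        grp = list(grp)
        running_end = grp[0][2]   # restart the running max per emp_id
        for _, start, end in grp[1:]:
            if (start - running_end).days > 1:
                out.append((emp, running_end + timedelta(1),
                            start - timedelta(1)))
            running_end = max(running_end, end)
    return out
```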

overlap sessions and counts

karma, March 18, 2011 - 4:04 pm UTC

Hi: In this post I don't see a way to identify all of the overlaps within a set, with a count for each overlapping session. I have a table of user web logs where users have multiple sessions running. I need to find each set of overlapping sessions and their respective counts.

Here is sample data :
select * from overlap_session where cust_id = '000000001'

CUST_ID SESS_START_TM SESS_END_TM LOAD_DT
000000001 01/04/2010 09:36:43 01/04/2010 19:42:37 01/04/2010
000000001 01/04/2010 09:42:00 01/04/2010 10:42:00 01/04/2010
000000001 01/04/2010 13:17:58 01/04/2010 15:05:39 01/04/2010
000000001 01/04/2010 15:48:40 01/04/2010 19:06:50 01/04/2010
000000001 01/04/2010 18:42:21 01/04/2010 22:10:35 01/04/2010
000000001 01/04/2010 21:15:27 01/04/2010 23:27:03 01/04/2010
000000001 01/04/2010 22:30:27 01/04/2010 23:15:03 01/04/2010

Expected Output

CUST_ID SESS_START_TM SESS_END_TM NUM_SESSIONS
000000001 01/04/2010 09:42:00 01/04/2010 10:42:00 2
000000001 01/04/2010 13:17:58 01/04/2010 15:05:39 2
000000001 01/04/2010 15:48:40 01/04/2010 18:42:20 2
000000001 01/04/2010 18:42:21 01/04/2010 19:06:50 3
000000001 01/04/2010 21:15:27 01/04/2010 22:10:35 2
000000001 01/04/2010 22:30:27 01/04/2010 23:15:03 2

scripts to load sample data:

CREATE TABLE OVERLAP_SESSION
(
CUST_WEB_ID VARCHAR2(36 BYTE),
CUST_ID VARCHAR2(9 BYTE),
SESSION_START_TMSTP DATE,
SESSION_END_TMSTP DATE,
LOAD_DT DATE,
LOAD_DT_KEY NUMBER(15)
);

SET DEFINE OFF;

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 09:36:43', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 19:42:37', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 09:42:00', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 10:42:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 13:17:58', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 15:05:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 15:48:40', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 19:06:50', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 18:42:21', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 22:10:35', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 21:15:27', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 23:27:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);

Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('00000000-0000-0000-0000-000000000000', '000000001', TO_DATE('01/04/2010 22:30:27', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('01/04/2010 23:15:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/04/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455201);


Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 17:30:48', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 17:55:45', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 17:35:32', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 17:35:54', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 17:37:40', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 17:56:17', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 17:56:59', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 18:40:05', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 18:23:36', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 18:30:53', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 18:32:59', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 18:39:16', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 18:55:59', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 19:59:28', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 19:32:24', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 20:46:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 19:33:42', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 19:33:46', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 19:56:57', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 19:58:14', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 19:59:37', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 20:04:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:22:31', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 20:24:31', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:27:02', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 20:32:13', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:34:08', 'MM/DD/YYYY HH24:MI:SS'),

TO_DATE('11/22/2010 20:42:47', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:42:53', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 20:44:41', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:45:21', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 20:45:37', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:47:36', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 20:51:53', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:48:18', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 21:02:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:54:39', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 20:57:35', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 20:58:42', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 21:00:57', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 21:01:35', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 21:01:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 21:02:33', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 21:27:41', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/22/2010 21:07:14', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/22/2010 21:09:22', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/22/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455523);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/25/2010 00:22:27', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/25/2010 00:28:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/25/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455526);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/25/2010 00:26:28', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/25/2010 00:28:38', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/25/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455526);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/25/2010 00:49:54', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/25/2010 00:55:53', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/25/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455526);
Insert into OVERLAP_SESSION
(CUST_WEB_ID, CUST_ID, SESSION_START_TMSTP, SESSION_END_TMSTP, LOAD_DT,
LOAD_DT_KEY)
Values
('11111111-2222-3333-4444-555555555555', '000000080', TO_DATE('11/25/2010 00:53:52', 'MM/DD/YYYY HH24:MI:SS'),
TO_DATE('11/25/2010 00:55:57', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/25/2010 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),
2455526);
COMMIT;

Finding records within 2 years

Shravani, June 23, 2011 - 11:26 am UTC

Hi Tom,

Thanks so much for your continuous technical help.

My project requirement is to find the records whose ages are within two years of each other. Below I have created a dummy table (the actual project tables are different).

EMP Table with below data

Empno   Age (in Years)
===========================
 123    34
 345    43
 454    39
 643    41
 767    32
 121    45
 423    27
 982    33
 566    41
==============================

Now for above data I should get all records having age difference as +2 or -2 which means I should get
=====
Empno
=====
123
345
643
767
121
982
566

Sincerely requesting you to help me on this.

Regards
Shravani
Tom Kyte
June 23, 2011 - 11:54 am UTC

this makes no sense.

I should get all records having age difference as +2 or -2

+/- 2 from WHAT exactly?

please remember - there is no such thing as a first/last/next row without an order by, and I see no way to order this data.

In fact, I see no create table nor insert statements to test with either - please include them when you write a complete, unambiguous specification for your issue.

Finding records within 2 years

Shravani, June 24, 2011 - 5:14 am UTC

Hi Tom,

I am extremely sorry for not providing the complete information in the earlier post, and also for reposting the question as a new one.

Below are the required scripts:

---Table Creation
CREATE TABLE EMP
(
EMPNO NUMBER,
AGE NUMBER
);

---Data Population
insert into emp (empno, age) values (10, 34);
insert into emp (empno, age) values (20, 43);
insert into emp (empno, age) values (30, 39);
insert into emp (empno, age) values (40, 41);
insert into emp (empno, age) values (50, 32);
insert into emp (empno, age) values (60, 45);
insert into emp (empno, age) values (70, 27);
insert into emp (empno, age) values (80, 33);
insert into emp (empno, age) values (90, 41);
commit;

The business requirement is to scan the employee table for employees whose age is within 2 years (+2 or -2) of another employee's age.

For example, in the EMP table the employees with "empno" 10, 50 and 80 have "age" values of 34, 32 and 33, so these employees should be listed: their age differences are within 2 years.

The employee with "empno" 70 and "age" 27 should not appear in the listing, because no other employee has an age between 25 and 29.

Regards,
Shravani

Tom Kyte
June 24, 2011 - 9:05 am UTC

ops$tkyte%ORA11GR2> select *
  2    from (
  3  select empno, age,
  4         count(*) over (order by age range between 2 preceding and 2 following ) cnt
  5    from emp
  6         )
  7   where cnt > 1;

     EMPNO        AGE        CNT
---------- ---------- ----------
        50         32          3
        80         33          3
        10         34          3
        30         39          3
        40         41          4
        90         41          4
        20         43          4
        60         45          2

8 rows selected.
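For readers who want to see the mechanics without a database, the effect of `COUNT(*) OVER (ORDER BY age RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING)` can be sketched in plain Python. This is an illustrative re-implementation of the value-based window, not how Oracle actually evaluates it:

```python
# Sketch of Tom's query: for each employee, count how many employees
# (including itself) have an age within +/-2 of its own age, then keep
# only rows where at least one OTHER employee qualifies (cnt > 1).
emp = [(10, 34), (20, 43), (30, 39), (40, 41), (50, 32),
       (60, 45), (70, 27), (80, 33), (90, 41)]

result = []
for empno, age in sorted(emp, key=lambda r: r[1]):  # ORDER BY age
    # RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING is a window over VALUES,
    # not over row positions: it spans ages in [age - 2, age + 2].
    cnt = sum(1 for _, a in emp if age - 2 <= a <= age + 2)
    if cnt > 1:
        result.append((empno, age, cnt))

for row in result:
    print(row)
```

Employee 70 (age 27) drops out because its window contains only itself, which matches the 8-row result above.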

Wow!!

Sandeep, June 24, 2011 - 9:40 am UTC

I was not aware of the preceding and following clauses. That's very cool!!

Too good... brilliant

Shravani, June 24, 2011 - 10:04 am UTC

Thanks, you are the MASTER in Oracle. I thought the solution would be very complex with many lines of SQL code, but as usual you have simplified it and given a very effective solution.

Can you please let us know where to read more about the analytical function you used?

Thanks again

Regards
Shravani
Tom Kyte
June 24, 2011 - 10:12 am UTC

Data warehousing guide (misplaced bit of documentation if you ask me :) )

http://docs.oracle.com/docs/cd/E11882_01/server.112/e16579/analysis.htm#i1007779

Thanks so much Tom

Shravani, June 28, 2011 - 5:14 am UTC

Thanks for the link Tom

Grouping based on Year

snaseer, March 23, 2012 - 11:38 am UTC

Hello. When I found this Q&A I thought I could now solve my case, but that is not true.
My data looks like this:
CID  RID   Start_Year  TITLE
 52  1000  2001        1
 52  1000  2002        1
 52  1000  2003        6
 52  1000  2004        6
 52  1000  2005        6
 52  1000  2007        1
 52  1001  2003        1
 52  1001  2004        1
 52  1001  2005        1
 52  1001  2006        3
 52  1001  2007        3

Output I want to produce is
CID  RID   START-END  TITLE
 52  1000  2001-2002  1
 52  1000  2003-2005  6
 52  1000  2007-2007  1
 52  1001  2003-2005  1
 52  1001  2006-2007  3

The query I came up with (the output is not the way I need it):
select cid, rid, min(year2), max(year1), title
  from (select tc.cid, tc.rid, tc.title,
               tc.start_year year1, tp.start_year year2,
               decode(tc.title, tp.title, 'N', 'Y') gap
          from t tc, t tp
         where tc.start_year - 1 = tp.start_year
           and tc.cid = tp.cid
           and tc.rid = tp.rid)
 group by cid, rid, title, gap
--------
Table Script
create table T (
cid NUMBER,
rid NUMBER,
start_year NUMBER,
title NUMBER
);

insert into t (cid, rid, start_year, title) values (52, 1000, 2001, 1);
insert into t (cid, rid, start_year, title) values (52, 1000, 2002, 1);
insert into t (cid, rid, start_year, title) values (52, 1001, 2003, 1);
insert into t (cid, rid, start_year, title) values (52, 1000, 2003, 6);
insert into t (cid, rid, start_year, title) values (52, 1001, 2004, 1);
insert into t (cid, rid, start_year, title) values (52, 1000, 2004, 6);
insert into t (cid, rid, start_year, title) values (52, 1001, 2005, 1);
insert into t (cid, rid, start_year, title) values (52, 1000, 2005, 6);
insert into t (cid, rid, start_year, title) values (52, 1001, 2006, 3);
insert into t (cid, rid, start_year, title) values (52, 1001, 2007, 3);
insert into t (cid, rid, start_year, title) values (52, 1000, 2007, 1);

commit;
Tom Kyte
March 24, 2012 - 10:15 am UTC

I wish

a) you would have used the code button so that things lined up :(

b) actually explained your output. Why should I have to reverse engineer your output to try to figure out what you MIGHT have meant given your input????


So, fix it.

A reader, March 24, 2012 - 6:45 pm UTC



with data as
(
select 52 cid, 1000 rid, 2001 start_year, 1 title from dual union all
select 52, 1000, 2002, 1 from dual union all
select 52, 1000, 2003, 6 from dual union all
select 52, 1000, 2004, 6 from dual union all
select 52, 1000, 2005, 6 from dual union all
select 52, 1000, 2007, 1 from dual union all
select 52, 1001, 2003, 1 from dual union all
select 52, 1001, 2004, 1 from dual union all
select 52, 1001, 2005, 1 from dual union all
select 52, 1001, 2006, 3 from dual union all
select 52, 1001, 2007, 3 from dual
)
select cid, rid, min(start_year) st, max(start_year) as end, title
from
(
select cid, rid, start_year, title,
       row_number() over (partition by cid, rid order by start_year) as rno1,
       row_number() over (partition by cid, rid, title order by start_year) as rno2,
       row_number() over (partition by cid, rid order by start_year) -
       row_number() over (partition by cid, rid, title order by start_year) as grp
from data
)x
group by cid, rid, title, grp
order by cid, rid, min(start_year)
/

  CID        RID         ST        END      TITLE
----- ---------- ---------- ---------- ----------
   52       1000       2001       2002          1
   52       1000       2003       2005          6
   52       1000       2007       2007          1
   52       1001       2003       2005          1
   52       1001       2006       2007          3



Tom Kyte
March 25, 2012 - 9:53 am UTC

could be, might not be. It depends on what the original poster's specifications (something they chose to NOT share with us) are, doesn't it.

I prefer not to waste my time on ill-specified things
