compare range of dates with range of dates
Connor McDonald

Thanks for the question, Nag.

Asked: August 07, 2002 - 9:12 am UTC

Last updated: February 19, 2018 - 3:26 am UTC

Version: 8.1.7

Viewed 50K+ times!

You Asked

Tom

I have a table t with columns c_beginning and c_ending, which are DATE columns. It also has a column called skey, which is a number.

Now I have two parameters passed into my program, p_beginning and p_ending, which are also dates.

For every record in t, I need to determine whether c_beginning and c_ending (they may not be equal to p_beginning and p_ending) fall within the range defined by p_beginning and p_ending.

i.e. c_beginning and c_ending might be Aug 1st 2002 and Dec 12th 2002; they fall within the range of p_beginning and p_ending if those are Jan 1st 2000 and Jan 1st 2003.

How should I do this comparison in SQL itself, without using any PL/SQL?

Thanks

and Tom said...

Do you need to be concerned with all overlaps? Eg:

in database you have


01-jan-2001 07-jan-2001


you are given as inputs:


15-dec-2000 15-jan-2001 - wider than the database range
15-dec-2000 04-jan-2001 - overlaps but not contained within
04-jan-2001 15-jan-2001 - overlaps but not contained within
02-jan-2001 04-jan-2001 - overlaps and contained within


should all four return a "hit"?

select *
from t
where ( c_beginning between p_beginning and p_ending -- rule 1
OR c_ending between p_beginning and p_ending -- rule 2
OR ( c_beginning < p_beginning and p_ending < c_ending ) ); -- rule 3

will do that. If you use the following "chart", it depicts all of the possibles and which "rule" in the predicate above will catch them:

.............................................
          CB--------------CE
   PB---------PE                          -- rule 1
   PB-------------------------PE          -- rule 1
                     PB----------PE       -- rule 2
             PB-----PE                    -- rule 3
PB-PE                                     -- no hit
                             PB-PE        -- no hit
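All of those cases can be sanity-checked outside the database. Here is a minimal Python sketch of the three-rule predicate, using the sample row and the four inputs from above (the names are illustrative; nothing here is Oracle-specific):

```python
from datetime import date

# The row stored in the database: (c_beginning, c_ending)
CB, CE = date(2001, 1, 1), date(2001, 1, 7)

def hits(pb, pe):
    """The three-rule overlap predicate from the answer."""
    return (pb <= CB <= pe             # rule 1: c_beginning inside [pb, pe]
            or pb <= CE <= pe          # rule 2: c_ending inside [pb, pe]
            or (CB < pb and pe < CE))  # rule 3: [pb, pe] contained in the row

cases = [
    (date(2000, 12, 15), date(2001, 1, 15)),  # wider than the stored range
    (date(2000, 12, 15), date(2001, 1, 4)),   # overlaps the low end
    (date(2001, 1, 4),   date(2001, 1, 15)),  # overlaps the high end
    (date(2001, 1, 2),   date(2001, 1, 4)),   # contained within
]
assert all(hits(pb, pe) for pb, pe in cases)      # all four are "hits"
assert not hits(date(2001, 2, 1), date(2001, 2, 5))  # disjoint: no hit
```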





Rating: (95 ratings)

Comments

is the below approach valid...

Nag, August 07, 2002 - 10:31 pm UTC

select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where x.a between from_date and to_date and
x.b between from_date and to_date
OR ( from_date < x.a and x.b < to_date )

and is this kind of comparison (x.b < to_date), i.e. value on the left and column on the right (as in your solution) .. is it healthy..

Thanks

Tom Kyte
August 08, 2002 - 9:06 am UTC

define "healthy". what the heck do you mean by that?


(and please please please -- compare dates to dates, strings to strings, numbers to numbers. wrap those dates in a TO_DATE() call and always use 4 digit years and use an explicit date format to convert the string to a date)

If you are afraid of

x.b < to_date

for some reason, code

to_date > x.b

instead.
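The same advice holds in any language: never rely on a default or session format when turning strings into dates, and never compare date-strings as strings. A small Python sketch of the point (strptime with an explicit mask plays the role of TO_DATE with an explicit format; the sample strings are from the thread):

```python
from datetime import datetime

def parse_dmy(s):
    # Analogue of TO_DATE(s, 'dd-mon-yyyy'): the format is named explicitly
    return datetime.strptime(s, "%d-%b-%Y").date()

# Comparing date-strings as strings gives the wrong answer:
assert "04-jan-2001" < "15-dec-2000"   # lexicographic: '0' sorts before '1'
# Comparing actual dates gives the right one:
assert parse_dmy("04-Jan-2001") > parse_dmy("15-Dec-2000")
```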


Correction or a typo

Logan Palanisamy, August 08, 2002 - 7:47 pm UTC

Tom,

.............................................
          CB--------------CE
   PB---------PE                          -- rule 1
   PB-------------------------PE          -- rule 1
                     PB----------PE       -- rule 2
             PB-----PE                    -- rule 3
PB-PE                                     -- no hit
                             PB-PE        -- no hit

In your chart above, the third row, where CE is between PB and PE, belongs to rule 2, not rule 1.

The correct one is,
.............................................
          CB--------------CE
   PB---------PE                          -- rule 1
   PB-------------------------PE          -- rule 2 <-- correction
                     PB----------PE       -- rule 2
             PB-----PE                    -- rule 3
PB-PE                                     -- no hit
                             PB-PE        -- no hit

I guess it was a typo on your part.

Tom Kyte
August 09, 2002 - 8:39 am UTC

rule 1 = where ( c_beginning between p_beginning and p_ending -- rule 1


On that row in question, c_beginning is in fact between PB and PE so -- rule 1 gets it in my mind since I was reading the list from TOP to BOTTOM. Since the order of evaluation of the predicate in SQL is not deterministic (eg: rule 3 then 2 then 1, or 2 then 3 then 1 or ..... and so on could be the order of evaluation) you might be right.

But, if you read the "rules" from top to bottom -- rule 1 is the one that catches row 3.

My sql with the trick is performing faster.. why, any idea?

Nag, August 10, 2002 - 2:07 pm UTC

Create table t (from_date date, to_date date);
insert into t values ('01-jan-02', '31-dec-02');




Solution 1.

select *
from t
where ( from_date between :p_beginning :and p_ending
OR to_date between :p_beginning and :p_ending
OR ( from_date < :p_beginning and :p_ending <to_date ) );


Solution 2.

select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where from_date between x.a and x.b and
to_date between x.a and x.b or
OR ( from_date < x.a and x.b < to_date )


Tom,

I incorporated solution 1 as advised by you.
It was working but taking an enormous amount of time.

So, I used the trick which I invented, as shown in solution 2.

I'm cleverly making the two parameters an inline view and using the values in my where clause.
Surprise-o-rama.. I was amazed to see that the performance of the query, and of the report as such, increased enormously.

What is the difference between what you suggested and what I wrote? What could be the reason that the second solution was faster?






Tom Kyte
August 10, 2002 - 4:47 pm UTC

Umm, your query is radically different from mine and returns a different (perhaps wrong, perhaps not wrong) answer.

In any case -- only one of the two queries above could be valid!  They answer totally different questions.

(i removed a duplicate OR OR in the second query and I moved the colon in ":and p_ending" to the right place.)


ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( from_date date, to_date date );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> variable pb varchar2(20)
ops$tkyte@ORA817DEV.US.ORACLE.COM> variable pe varchar2(20)
ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> exec :pb := '01-jun-2002'; :pe := '30-dec-2002';

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( to_date( '22-NOV-2002' ),  to_date( '25-JUN-2004' ) );

1 row created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
  2    from t
  3    where ( from_date between to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy')
  4         OR to_date between to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy')
  5         OR ( from_date < to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy') <to_date ) )
  6  /

FROM_DATE TO_DATE
--------- ---------
22-NOV-02 25-JUN-04

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> select from_date,to_date
  2    from t,(select to_date(:pb,'dd-mon-yyyy') a , to_date(:pe,'dd-mon-yyyy') b from dual) x
  3    where from_date between x.a and x.b and
  4          to_date between x.a and x.b
  5                  OR ( from_date < x.a and x.b < to_date )
  6  /

no rows selected

ops$tkyte@ORA817DEV.US.ORACLE.COM> 


My query returned it (overlaps).

Yours does not.  

Only you know which one is correct! 
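The difference between the two results comes down to the predicate itself, not the inline view. A sketch of the logic on the row Tom inserted, in plain Python booleans (the names are illustrative): with AND between the two BETWEENs, the query demands full containment; only the OR version matches any overlap.

```python
from datetime import date

fd, td = date(2002, 11, 22), date(2004, 6, 25)   # row in T: from_date, to_date
a, b = date(2002, 6, 1), date(2002, 12, 30)      # the :pb / :pe window

rule1 = a <= fd <= b        # from_date inside window: True
rule2 = a <= td <= b        # to_date inside window:   False
rule3 = fd < a and b < td   # window inside the row:   False

and_version = (rule1 and rule2) or rule3   # the inline-view query: no hit
or_version = rule1 or rule2 or rule3       # Tom's query: hit

assert not and_version
assert or_version
```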

OOPS... sorry my fault .. small typo...

Nag, August 10, 2002 - 6:26 pm UTC

select from_date,to_date
  from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
  where from_date between x.a and x.b and  --mistake here...
        to_date between x.a and x.b or 
        OR ( from_date < x.a and x.b < to_date ) 

above was meant to be...

select from_date,to_date
  from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
  where from_date between x.a and x.b OR ..
        to_date between x.a and x.b or 
        OR ( from_date < x.a and x.b < to_date ) 

Now.. for the above comment , both the queries return the same rows...

I tried and the results are as follows...

SQL> ED
Wrote file afiedt.buf

  1  select from_date,to_date
  2        from t,(select '01-JUN-02' a , '01-JUN-03' b from dual) x
  3     where from_date between x.a and x.b OR -- corrected
  4              to_date between x.a and x.b
  5*                     OR ( from_date < x.a and x.b < to_date )
SQL> /

FROM_DATE TO_DATE
--------- ---------
01-JAN-02 31-DEC-02
22-NOV-02 25-JUN-04

Now Tom, going back to the original comment i.e. the immediate above one..

Solution 2 is much faster than solution 1. Believe me, it runs very, very fast compared to solution 1.

Can you please give me an insight into why this is happening..
 

Tom Kyte
August 11, 2002 - 9:37 am UTC

Nag, you've been coming here long enough -- you should know the answer to this.

Set up the test case, show us the tkprof with timed statistics and all. Help us understand what indexes you have. Look at the differences in the query plans.

Then, all will be revealed.

Sort of bothersome that you STILL have typos in the queries. "or .." "or or" for example. Be nice to put up stuff that actually runs. Makes me question whether what I am looking at is really what you are using or not (eg: am i totally wasting my time or what)

You should know -- if you want answers, you need to supply information! Also, it should be easy for you to look at the traces and *see* what is different and do some of this analysis yourself.

Programmer

Jamil Shaibani, August 11, 2002 - 3:51 am UTC

Dear Tom 
When I try to run the example, I get this error:


SQL> insert into t values (to_date('22-NOV-2002'),
to_date('25-JUN-2004'))              *
                              
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected

Thank you very much for your help 

Tom Kyte
August 11, 2002 - 9:53 am UTC

Your default date format is not dd-mon-rr, must be something else

use to_date( '22-nov-2002', 'dd-mon-yyyy' )


Huge apology.. feeling happy at being admonished by Tom

Nag, August 11, 2002 - 10:41 am UTC

Tom, I consider it a privilege to be admonished by you..
But please set your doubts at rest about my integrity in asking this question. I'm genuinely asking this question.

Now for the trace contents..

select *
from t
where ( from_date between to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy')
OR to_date between to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy')
OR ( from_date < to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy') <to_date ) )

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.00       0.00          0          2          4          2
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.00       0.00          0          2          4          2

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 32

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  TABLE ACCESS FULL T

********************************************************************************

select from_date,to_date
from t,(select to_date(:pb,'dd-mon-yyyy') a , to_date(:pe,'dd-mon-yyyy')
b from dual) x
where from_date between x.a and x.b OR
to_date between x.a and x.b
OR ( from_date < x.a and x.b < to_date )

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.00       0.00          0          3          8          2
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.00       0.00          0          3          8          2

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 32

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  NESTED LOOPS
      2    TABLE ACCESS FULL DUAL
      2    TABLE ACCESS FULL T

********************************************************************************

ALTER SESSION SET SQL_TRACE=FALSE


In the solution you have provided, I see that we are simply doing a full table scan of T.

After I rewrote the query using the trick I thought would be useful, we are doing a nested loops join, i.e. we are full-table-scanning dual and comparing it with all the rows returned by T.

So even though the time statistics above show 0, when I apply the same logic to my real situation it makes a lot of difference.

But I have a question here:

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  NESTED LOOPS
      2    TABLE ACCESS FULL DUAL
      2    TABLE ACCESS FULL T

In the above, which table is being read first, dual or T (as both of them are at the same level)?

Can you also suggest the easiest way to insert a whole lot of data into T, so that I can get some real times in the timed statistics?




Tom Kyte
August 11, 2002 - 7:48 pm UTC

Looks like the times are the same to me -- 0.00 seconds?

Also the tables in question are trivial.

Show us the example where your query blows the one without dual away.......

You apparently MUST have that data somewhere? You made the claim already. Use the table you used to do your initial tests.




Simpler alternative

Tony, August 24, 2002 - 4:15 am UTC

I do date overlap checks a lot, and always use this method:

(c_beginning <= p_ending) AND (c_ending >= p_beginning)

which seems more elegant to me.
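Tony's two-term test and the three-rule predicate from the answer agree on every pair of well-formed ranges (with <= and >=, touching endpoints count as a hit in both). A brute-force check in Python, a sketch rather than Oracle code:

```python
from itertools import product

def three_rule(cb, ce, pb, pe):
    # The predicate from Tom's answer
    return pb <= cb <= pe or pb <= ce <= pe or (cb < pb and pe < ce)

def tony(cb, ce, pb, pe):
    # Tony's two-term version
    return cb <= pe and ce >= pb

# Exhaustively compare the two over all orderings of four small integers
for cb, ce, pb, pe in product(range(6), repeat=4):
    if cb <= ce and pb <= pe:   # only well-formed ranges
        assert three_rule(cb, ce, pb, pe) == tony(cb, ce, pb, pe)
```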

Calculate fraction of date range

A reader, January 31, 2003 - 9:00 pm UTC

Tom,

What if I have a prompt StartDate (PS) and a prompt EndDate (PE), as well as database fields StartDate (SD) and DueDate (DD), and want to calculate the fraction of the days of my prompt range over the total days between SD and DD? For example:
SD - 1/1/2003
DD - 2/15/2003
PS - 12/15/2002
PE - 1/31/2003

So, the calculated field that I am interested in is:

(PE - 1/1/2003) over (DD - SD)

I don't count the days from 12/15 to 1/1/2003 because they are outside of my SD-to-DD range (~31/46).



Tom Kyte
February 01, 2003 - 9:26 am UTC

just do it i guess? I mean

(pe-sd)/(dd-sd)

is perfectly valid.
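With the sample dates from the question, the arithmetic works out like this. A Python sketch: since PS falls before SD, the start is clamped to SD, which is where the asker's ~31/46 comes from (31/46 if you count both endpoints, 30/45 if you don't):

```python
from datetime import date

SD, DD = date(2003, 1, 1), date(2003, 2, 15)     # row: StartDate, DueDate
PS, PE = date(2002, 12, 15), date(2003, 1, 31)   # prompt range

# Clamp the prompt range to [SD, DD] before taking the ratio
overlap_days = (min(PE, DD) - max(PS, SD)).days
total_days = (DD - SD).days

assert (overlap_days, total_days) == (30, 45)
assert (overlap_days + 1, total_days + 1) == (31, 46)  # counting inclusively
```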

gap analysis !!

A reader, February 12, 2003 - 11:23 am UTC

hi tom,

I want to just do this, and I am open to a single SQL statement or a PL/SQL program unit.

You said..

15-dec-2000 15-jan-2001 - wider than the database range
15-dec-2000 04-jan-2001 - overlaps but not contained within
04-jan-2001 15-jan-2001 - overlaps but not contained within
02-jan-2001 04-jan-2001 - overlaps and contained within

I have records in the database and valid value for that dates.

for me starting date is fixed let's say 15-dec-2000
eg..

startdate   enddate     value  code
----------- ----------- -----  ------
15-dec-2000 15-jan-2001 X      ABC123  // flag this gap out
20-jan-2001 19-jan-2002 Y      ABC123
15-jan-2002 null        X      ABC123  // flag overlap
// so for code ABC123 I want to report the gap and the overlap

15-dec-2000 15-jan-2001 P      ABC456
15-jan-2001 19-jan-2002 Q      ABC456
19-jan-2002 17-jan-2002 Q      ABC456  // flag: there is no value from 17th jan to sysdate

20-jan-2001 19-jan-2002 M      ABC789
15-jan-2002 15-dec-2002 N      ABC789
// flag: no value is available from 15-dec-00 to 20-jan-01 for code ABC789

15-dec-2000 15-jan-2001 P      ABC012
16-jan-2001 19-jan-2002 Q      ABC012
20-jan-2002 null        Q      ABC012
// completely valid record...

/* Note: null implies sysdate */

I can use utl_file or a table to insert into for the comment/flag/error report.

can you please help..

there are 100,000 distinct records..
all together 242,000 records.

Thanks,


Tom Kyte
February 12, 2003 - 3:41 pm UTC

not sure if I have all of your cases -- but this should get you going.  Using lag and lead we can look forward and back a row and assign whatever flags you need:


ops$tkyte@ORA920> break on code skip 1
ops$tkyte@ORA920>
ops$tkyte@ORA920> select startdate,
  2         enddate,
  3             last_enddate,
  4             flag,
  5             code,
  6             case when last_enddate is null then null
  7                  when last_enddate < startdate-1 then 'Gap'
  8                          when last_enddate > startdate   then 'Overlap'
  9                          when enddate is not null and next_startdate is null then 'Missing next'
 10                          else null
 11                  end another_flag
 12    from (
 13  select t.*,
 14         lag(enddate) over (partition by code order by startdate) last_enddate,
 15             lead(startdate) over (partition by code order by startdate) next_startdate
 16    from t
 17   order by code, startdate
 18         )
 19  /

STARTDATE ENDDATE   LAST_ENDD F CODE   ANOTHER_FLAG
--------- --------- --------- - ------ ------------
15-DEC-00 15-JAN-01           P ABC012
16-JAN-01 19-JAN-02 15-JAN-01 Q
20-JAN-02           19-JAN-02 Q

15-DEC-00 15-JAN-01           X ABC123
20-JAN-01 19-JAN-02 15-JAN-01 Y        Gap
15-JAN-02           19-JAN-02 X        Overlap

15-DEC-00 15-JAN-01           P ABC456
15-JAN-01 19-JAN-02 15-JAN-01 Q
19-JAN-02 17-JAN-02 19-JAN-02 Q        Missing next

20-JAN-01 19-JAN-02           M ABC789
15-JAN-02 15-DEC-02 19-JAN-02 N        Overlap


11 rows selected.
 

Thanks,

A reader, February 12, 2003 - 5:30 pm UTC

There is no alternative for you at least on this planet i think.

now, can you tell how you "developed" this query step by step? I tried to create it by breaking it down piece by piece, but it was very difficult to put it together.

-- also in this query how can I drop those rows which
are 'VALID'? ( The last case in my question example )

Thanks,


Tom Kyte
February 12, 2003 - 6:11 pm UTC

Well, it was pretty easy actually.

Looking at a row for a given CODE, what did we need?

o the prior rows enddate -- to look for gaps and overlaps
o the next rows startdate -- to look for missing end records with NULL

that dictates "lag" and "lead".  Once you have those -- CASE is pretty easy to use to come up with the logic to figure out "gap", "overlap", etc...

to drop the "valid" rows -- just wrap:

ops$tkyte@ORA920> select startdate,
  2         enddate,
  3             last_enddate,
  4             flag,
  5             code,
  6             case when last_enddate is null then null
  7                  when last_enddate < startdate-1 then 'Gap'
  8                          when last_enddate > startdate   then 'Overlap'
  9                          when enddate is not null and next_startdate is null 
then 'Missing next'
 10                          else null
 11                  end another_flag
 12    from (
 13  select t.*,
 14         lag(enddate) over (partition by code order by startdate) 
last_enddate,
 15             lead(startdate) over (partition by code order by startdate) 
next_startdate
 16    from t
 17   order by code, startdate
 18         )
 19  /

as an inline view and select * from ( that query ) where another_flag is not null;
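The LAG/LEAD trick boils down to "compare each row with its neighbour in startdate order". A Python sketch of the same flagging on the ABC123 rows (simplified to a strict gap test, rather than the startdate-1 one-day tolerance the SQL uses; ISO date strings sort correctly, so plain string comparison suffices here):

```python
rows = [  # (startdate, enddate, code), already ordered by startdate
    ("2000-12-15", "2001-01-15", "ABC123"),
    ("2001-01-20", "2002-01-19", "ABC123"),
    ("2002-01-15", None,         "ABC123"),
]

flags = [None]  # the first row of a code has no prior row to compare with
for (ps, pe, _), (cs, ce, _) in zip(rows, rows[1:]):
    if pe < cs:
        flags.append("Gap")       # prior end strictly before next start
    elif pe > cs:
        flags.append("Overlap")   # prior end runs past the next start
    else:
        flags.append(None)        # rows meet exactly

assert flags == [None, "Gap", "Overlap"]
```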

 

cool

asktom fan, February 12, 2003 - 6:15 pm UTC

Thanks Tom,

Need Help with this

Kishan, July 03, 2003 - 11:14 am UTC

Dear Tom:
I need a solution to this query. Could this be done in a single SQL statement? I'm using 8.1.7.

SQL> DESC roombooking
 Name                            Null?    Type
 ------------------------------- -------- ------------
 ROOM_ID                         NOT NULL VARCHAR2(8)   -- Room ID
 BKING_ST_TIME                   NOT NULL DATE          -- Start time
 BKING_END_TIME                  NOT NULL DATE          -- End time
 BKING_STS_C                     NOT NULL VARCHAR2(1)   -- Booking Status


ROOM_ID BKING_ST_TIME BKING_END_TIME B
-------- -------------------- -------------------- -
(SR1) 07-JAN-2002 11:00:00 07-JAN-2002 13:00:00 C
(SR2) 07-JAN-2002 18:00:00 07-JAN-2002 21:00:00 C

(SR1) 08-JAN-2002 09:00:00 08-JAN-2002 12:00:00 C
(SR2) 08-JAN-2002 12:00:00 08-JAN-2002 15:00:00 C
(SR2) 08-JAN-2002 18:00:00 08-JAN-2002 21:00:00 C

(SR1) 09-JAN-2002 08:00:00 09-JAN-2002 11:00:00 C
(SR1) 09-JAN-2002 11:00:00 09-JAN-2002 13:00:00 C
(SR1) 09-JAN-2002 15:00:00 09-JAN-2002 21:00:00 C

(SR1) 10-JAN-2002 16:00:00 10-JAN-2002 18:00:00 C
(SR1) 10-JAN-2002 18:00:00 10-JAN-2002 21:00:00 C
(SR2) 10-JAN-2002 08:00:00 10-JAN-2002 10:00:00 C
(SR2) 10-JAN-2002 10:00:00 10-JAN-2002 13:00:00 C
(SR2) 10-JAN-2002 13:00:00 10-JAN-2002 15:00:00 C

(SR1) 11-JAN-2002 09:00:00 11-JAN-2002 12:00:00 C
(SR1) 11-JAN-2002 13:00:00 11-JAN-2002 15:00:00 C
(SR1) 11-JAN-2002 18:00:00 11-JAN-2002 21:00:00 C

(SR1) 14-JAN-2002 11:00:00 14-JAN-2002 13:00:00 C
(SR1) 14-JAN-2002 18:00:00 14-JAN-2002 21:00:00 C


The table roombooking stores all bookings made against a set of rooms on a daily basis. My users want to find out the periods when those rooms are free, so that they can book them for someone else.
For example, if they want the information for rooms available between 07-Jan-2002 and 15-Jan-2002, between 14:00 and 16:00, the following information should be given to them so that the rooms can be booked:

SR1 07-Jan-2002
SR2 07-Jan-2002

SR1 08-Jan-2002

SR2 09-Jan-2002

SR2 11-Jan-2002

SR1 12-Jan-2002
SR2 12-Jan-2002

SR2 14-Jan-2002


Tom Kyte
July 03, 2003 - 7:57 pm UTC

All we need for that is

A) a set of dates 07-jan-2002 .. 15-jan-2002, that is inline view A below
B) a set of distinct rooms that are possible, that is B below (best if you have a room table elsewhere -- the "parent" lookup table)

And then we simply cartesian product these together, generating an inclusive list of DATES and ROOMS -- looking for DATES+ROOMS that don't have an overlap:

ops$tkyte@ORA920> select * from
  2    (select to_date( '07-jan-2002') + rownum -1 dates
  3       from all_objects
  4      where rownum <= to_date('15-jan-2002')-to_date('07-jan-2002')+1) a,
  5    (select distinct room_id
  6       from t ) b
  7   where not exists ( select null
  8                        from t
  9                       where bking_st_time <= a.DATES + 16/24
 10                         and bking_end_time >= a.DATES +14/24
 11                         and room_id = b.room_id
 12                      )
 13   order by dates, room_id
 14  /

DATES                ROOM_ID
-------------------- --------
07-jan-2002 00:00:00 (SR1)
07-jan-2002 00:00:00 (SR2)
08-jan-2002 00:00:00 (SR1)
09-jan-2002 00:00:00 (SR2)
11-jan-2002 00:00:00 (SR2)
12-jan-2002 00:00:00 (SR1)
12-jan-2002 00:00:00 (SR2)
13-jan-2002 00:00:00 (SR1)
13-jan-2002 00:00:00 (SR2)
14-jan-2002 00:00:00 (SR1)
14-jan-2002 00:00:00 (SR2)
15-jan-2002 00:00:00 (SR1)
15-jan-2002 00:00:00 (SR2)

13 rows selected.
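The shape of the query -- generate every (day, room) pair, then keep the ones with no overlapping booking -- can be sketched in a few lines of Python. Here only 09-Jan is modelled, with hours in place of full DATEs (an assumption to keep the sketch small; the SQL does the same with DATE arithmetic):

```python
from datetime import date

bookings = {  # (room, day) -> list of (start_hour, end_hour)
    ("SR1", date(2002, 1, 9)): [(8, 11), (11, 13), (15, 21)],
    ("SR2", date(2002, 1, 9)): [],
}

days = [date(2002, 1, 9)]
rooms = ["SR1", "SR2"]

# Cartesian product of days x rooms, minus pairs with a booking that
# overlaps the 14:00-16:00 window (same <= / >= test as the NOT EXISTS)
free = [
    (day, room)
    for day in days
    for room in rooms
    if not any(s <= 16 and e >= 14
               for s, e in bookings.get((room, day), []))
]
assert free == [(date(2002, 1, 9), "SR2")]  # SR1's 15:00-21:00 booking blocks it
```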
 

That's cool... thank you for your valuable service to the Oracle community

Kishan, July 03, 2003 - 8:27 pm UTC


Loan amount and Loan installment months

A reader, September 01, 2003 - 1:42 am UTC

SQL> CREATE TABLE LOANS( LOAN_NUM NUMBER, LOAN_BEGIN_DATE DATE, LOAN_END_DATE DATE, INSTALLMENT_AMOUNT NUMBER);

Table created.

SQL> INSERT INTO LOANS VALUES( 1, SYSDATE, SYSDATE +365, 1000);

1 row created.

SQL> INSERT INTO LOANS VALUES( 2, SYSDATE + 60, SYSDATE +365, 2000);

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM LOANS;

 LOAN_NUM LOAN_BEGI LOAN_END_    INSTALLMENT_AMOUNT
--------- --------- --------- ---------
        1 01-SEP-03 31-AUG-04      1000
        2 31-OCT-03 31-AUG-04      2000

We need to display the sum of the installment amounts for each month for the next 3 years (starting from the current month).
With the given data: 1. we have to show the data for each month and the total of the installments for those months, plus 2. the rest of the months should be shown as zero.

We need to take care of the overlaps, all of the below..

 - wider than the database range
 - overlaps but not contained within
 - overlaps and contained within

The resulting grid should have summary for each row returned by the following...

select to_char(add_months(sysdate, (rownum-1)),'MM RRRR')  dt
from all_objects
where rownum <37

Please train us in the insights of how you verbaged the problem, and how you arrived at the solution, i.e. how you broke it into pieces.

Thanks
 

What helps

A reader, September 02, 2003 - 11:10 pm UTC

Tom, what would help is the reason why you did not deem it fit to answer the above request.

Tom Kyte
September 03, 2003 - 6:54 am UTC

didn't seem like sufficient information to give you an answer. i don't really know what "verbiage" you are looking for or really what problem you were trying to solve.

A reader, September 03, 2003 - 8:12 am UTC

select to_char(add_months(sysdate, (rownum-1)),'MM RRRR') dt from all_objects
where rownum <37

The above query returns 36 months forward from the current month , i.e. 09 2003.

For each of those months, I have to display , the total installments due for all the loans put together.

The final output, I foresee will be something like this

09 2003 10,000
10 2003 14,000
11 2003 12,000
12 2003 15,000
01 2004 17,000
.....
......
etc




DATABASE VALUES might be

loan_no  loan_begin_date  loan_end_date  installment
      1  01/01/2002       01/01/2004     1,500
      2  09/01/2003       01/01/2004     1,500
      3  10/01/2003       01/01/2004     1,500

I have to display a grid of month and the total installments
for the next 36 months, from the month of the current month.
The above data will show total installments up to 01/01/2004; the rest of the months should come up as zero.

Is this helpful.



A reader, September 03, 2003 - 10:51 am UTC

Do you need more information /explanation of the solution I'm looking for?

Tom Kyte
September 03, 2003 - 12:05 pm UTC

you know what -- they won't let me use my aircard on a plane. so, when I'm actually not able to be on the net -- I actually don't answer questions. HMMM...



if you have that table T with the months in it -- the one you made with all_objects, you can easily:


select month, (select sum(whatever) from another_table where start_date <= T.month and end_date >= T.month )
from t;

to achieve your report.
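With the poster's sample loans, that correlated-subquery report can be sketched in Python like this (treating a loan as active in a month when the first of the month falls in its date range -- an assumption about how a loan maps to a month; adjust to taste):

```python
from datetime import date

loans = [  # (begin, end, installment), from the poster's sample data
    (date(2002, 1, 1),  date(2004, 1, 1), 1500),
    (date(2003, 9, 1),  date(2004, 1, 1), 1500),
    (date(2003, 10, 1), date(2004, 1, 1), 1500),
]

def month_total(y, m):
    # Analogue of: select sum(amt) where start_date <= month and end_date >= month
    first = date(y, m, 1)
    return sum(amt for b, e, amt in loans if b <= first <= e)

assert month_total(2003, 9) == 3000    # loans 1 and 2 active
assert month_total(2003, 10) == 4500   # all three active
assert month_total(2004, 2) == 0       # past every loan's end date
```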

Range constraint

Tracy, October 16, 2003 - 10:16 am UTC

If you have two number columns in a table say c1 and c2, where c2 is always higher than c1 and the rows contain ranges eg

c1   c2
1    5
6    20
24   28
36   134
145  148

etc
and you want to add a constraint such that no ranges are contained within others or overlap in any way, how would you recommend doing it? Given the above data, the following inserts would fail:

c1   c2
1    2
18   23
28   30
39   58

but these would be allowed:
c1   c2
21   22
29   34
149  999

etc




Tom Kyte
October 16, 2003 - 11:09 am UTC

it's really hard, and relies on triggers that basically force serialization on modifications to the table and query the results in AFTER triggers to see if everything is OK.
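The overlap test itself is the easy half. A Python sketch of the check such a trigger would have to make against the sample data (the hard half, as noted, is serializing concurrent sessions: two sessions inserting, say, (21,22) and (21,23) at the same time would each see no conflict in their own snapshot):

```python
ranges = [(1, 5), (6, 20), (24, 28), (36, 134), (145, 148)]

def can_insert(c1, c2):
    # Reject any overlap or containment: the new range must sit
    # strictly before or strictly after every existing range
    return all(c2 < lo or c1 > hi for lo, hi in ranges)

# The inserts the question says must fail:
assert not can_insert(1, 2)      # contained in (1, 5)
assert not can_insert(18, 23)    # overlaps (6, 20)
assert not can_insert(28, 30)    # overlaps (24, 28)
assert not can_insert(39, 58)    # contained in (36, 134)

# ...and the ones that must succeed:
assert can_insert(21, 22)
assert can_insert(29, 34)
assert can_insert(149, 999)
```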




Searching a range of dates on a range of dates

Ray DeBruyn, Enterprise Information Systems, October 16, 2003 - 3:19 pm UTC

This is the type of query I use to search a range of dates:

select * from t
where c_beginning < p_ending
and c_ending > p_beginning;

It will return any records where the span of c_beginning and c_ending overlaps the span of p_beginning and p_ending. If you want it to return records where the spans of time merely touch (i.e. p_ending = c_beginning), change the conditions to <= and >=.

Tom Kyte
October 16, 2003 - 5:37 pm UTC


it is still "really hard" as it mandates

a) serialization of modifications to the table and
b) an AFTER trigger that has access to the modified data




I don't understand

John Gilmore, October 17, 2003 - 7:05 am UTC

You are telling Ray that his query "won't catch all overlaps -- only some kinds".

But you then use his same query "select * from t where cb < pe and ce > pb" to retrieve the row representing the ranges 5..10 and 2..11, which you say his query doesn't get.

Am I missing something here?

Tom Kyte
October 17, 2003 - 10:12 am UTC

you're right -- i was just reading over it too fast last night.  sorry about that.  it does look right -- especially with the proof below.

and an example of all 4 overlaps:

ops$tkyte@ORA920> create table t ( cb int, ce int, pb int, pe int, msg varchar2(20) );
 
Table created.
 
ops$tkyte@ORA920> insert into t values ( 5, 10, 1, 6,  'low end overlap' );
 
1 row created.
 
ops$tkyte@ORA920> insert into t values ( 5, 10, 6, 11, 'hi end overlap' );
 
1 row created.
 
ops$tkyte@ORA920> insert into t values ( 5, 10, 6, 7,  'interior overlap' );
 
1 row created.
 
ops$tkyte@ORA920> insert into t values ( 5, 10, 1, 11, 'exterior overlap' );
 
1 row created.
 
ops$tkyte@ORA920>
ops$tkyte@ORA920> select * from t where cb < pe and ce > pb;
 
        CB         CE         PB         PE MSG
---------- ---------- ---------- ---------- --------------------
         5         10          1          6 low end overlap
         5         10          6         11 hi end overlap
         5         10          6          7 interior overlap
         5         10          1         11 exterior overlap


shows that -- sorry about that. 

A Different Explanation

John Gilmore, October 17, 2003 - 7:33 am UTC

Perhaps the easiest way to think of this problem is to start with the opposite question: "when do two ranges NOT overlap?"

Well, they don't overlap in two, and only two, cases:

o range_1.end < range_2.begin
or
o range_1.begin > range_2.end

Phrased in SQL this becomes

(R1_end < R2_begin
or
R1_begin > R2_end)

We're interested in the inverse of this; that is

not (
R1_end < R2_begin
or
R1_begin > R2_end)

Using Boolean algebra we can transform this to an equivalent expression:

R1_end >= R2_begin
and
R1_begin <= R2_end

which is pretty much what Ray suggested.
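That Boolean-algebra step can be verified by brute force over small integer ranges -- a Python sketch:

```python
from itertools import product

def no_overlap(b1, e1, b2, e2):
    # The two (and only two) non-overlap cases
    return e1 < b2 or b1 > e2

def rays(b1, e1, b2, e2):
    # Ray's predicate: the De Morgan inverse of no_overlap
    return e1 >= b2 and b1 <= e2

# Check the equivalence on every combination of four small integers
for b1, e1, b2, e2 in product(range(5), repeat=4):
    assert rays(b1, e1, b2, e2) == (not no_overlap(b1, e1, b2, e2))
```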

Difference between two fields, used in the SQL statement condition

Jamil, December 15, 2003 - 8:50 am UTC

Dear Sir Tom
Can I make a calculation in the SQL statement and put a condition on the result of the calculation?
For example, I have this table containing:

J_D   J_B
----  ----
10    5
20    15
15    10
10    5
20    15
15    15
10    5
20    15
15    15

My SQL statement is as follows:

select J_D,J_B,R
FROM T,(SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) R
WHERE R > 0

So I want the result as follows:

J_D   J_B   R
----  ----  ----
10    5     5
20    15    5
15    10    5
10    5     5
20    15    5
11    5     6
20    15    5


How can I do that in a select statement? Please give a detailed example of how to solve it.

Best regards

Jamil




Tom Kyte
December 15, 2003 - 10:06 am UTC

well, (SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) returns a row for every row in T -- joining to it like that doesn't make sense....

not sure what you are trying to do here but

select j_d, j_b, nvl(j_d,0)-nvl(j_b,0)
from t

gets you what you want.


Using the result of the calculation in the where clause

Jamil, December 16, 2003 - 1:18 am UTC

Dear Sir

I want to use the result of the calculation nvl(j_d,0)-nvl(j_b,0) in the where clause.
So if I have, for example, table t containing data as follows:
J_D   J_B
----  ----
10    5
20    15
15    10
10    10
15    15
20    20

so I want my query to return the difference between J_D and J_B, and the result of the subtraction should be used in the where clause, like this:

select J_D,J_B,R
FROM T,(SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) R
WHERE R > 0


So the result of the above query should be as follows:

J_D   J_B   R
----  ----  ----
10    5     5
20    15    5
15    10    5

How can I use the result of the calculation in the where clause? Or how can I make the result of my query controlled by the result of the calculation? Please help with a detailed example, or with any other idea if it is not possible with the above approach.


Best regards

Jamil





Tom Kyte
December 16, 2003 - 7:23 am UTC

that is just

select *
from t
where ( j_d <> j_b
     or (j_d is null and j_b is not null)
     or (j_d is not null and j_b is null) )


you can use any computation in the where clause -- but in your case, using that nvl() function would NOT answer the query "show me the rows where they are different". My predicate will.
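The NULL handling is the whole point of the extra clauses. The same three-valued logic can be demonstrated in SQLite from Python (SQLite's NULL comparison semantics match Oracle's here; table and column names follow the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (j_d integer, j_b integer)")
con.executemany("insert into t values (?, ?)",
                [(10, 5), (10, 10), (None, 5), (10, None), (None, None)])

# Plain inequality: NULL <> 5 is UNKNOWN, so rows with NULLs are dropped
plain = con.execute("select count(*) from t where j_d <> j_b").fetchone()[0]

# Tom's null-aware predicate also catches one-sided NULLs
aware = con.execute("""
    select count(*) from t
     where j_d <> j_b
        or (j_d is null and j_b is not null)
        or (j_d is not null and j_b is null)
""").fetchone()[0]

assert plain == 1   # only (10, 5)
assert aware == 3   # (10, 5), (None, 5), (10, None)
```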

What about indexing date ranges ?

Christo Kutrovsky, February 18, 2004 - 11:33 pm UTC

Tom,

You have been discussing date range comparison at great length. What about indexing date ranges for performance?

For example, we have a table with a start date and an end date. We would like to be able to pinpoint the rows where a specific moment is in the date range:

select * from <table>
where begin_date <= :moment
and end_date > :moment

What kind of index should we create to do this efficiently?

Creating a combined index (begin_date, end_date) results in a huge range scan on the index, from begin_date down to the lowest date in the index. Is there a more efficient alternative?


Tom Kyte
February 19, 2004 - 10:26 am UTC

It is a lot like the "spatial" problem with lat/longs.   We had to invent a special datatype for that stuff and geocode the data to make it "efficient".

It is a pretty hard problem when you think about it.  Suppose you have data:


01-jan-2003       02-jan-2003
01-jan-2003       03-jan-2003
.....
01-jan-2003       31-dec-2003
02-jan-2003       03-jan-2003
02-jan-2003       04-jan-2003
....
02-jan-2003       31-dec-2003
(and so on for each day in 2003), eg:

ops$ora920@ORA920PC> create table t ( a date, b date );
 
Table created.
 
ops$ora920@ORA920PC>
ops$ora920@ORA920PC> insert into t
  2  select a,b
  3    from (select to_date('01-jan-2003')+rownum-1 a
  4            from all_objects
  5            where rownum <= 365 ),
  6         (select to_date('01-jan-2003')+rownum-1 b
  7            from all_objects
  8            where rownum <= 365 )
  9   where a < b;
 
66430 rows created.



Now, say we index like this (using high pctfree to really spread the data out)

ops$ora920@ORA920PC> create index t_idx on t(a,b) pctfree 95;
Index created.


Now, we query:

ops$ora920@ORA920PC> select *
  2    from t
  3   where a <= to_date( '01-jan-2004' )
  4     and b > to_date( '01-jan-2004' )
  5  /
 
no rows selected
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       8306  consistent gets
       8305  physical reads
          0  redo size
        266  bytes sent via SQL*Net to client
        368  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

<b>it had to read the entire index -- to find no data.  so it seems that perhaps:
</b>

ops$ora920@ORA920PC> drop index t_idx;
 
Index dropped.
 
ops$ora920@ORA920PC> create index t_idx on t(b,a) pctfree 95;
 
Index created.
 
ops$ora920@ORA920PC>
ops$ora920@ORA920PC> select *
  2    from t
  3   where a <= to_date( '01-jan-2004' )
  4     and b > to_date( '01-jan-2004' )
  5  /
 
no rows selected

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        266  bytes sent via SQL*Net to client
        368  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
 
<b>maybe we just had the index backwards?  but no, it does not hold true all of the time:</b>


ops$ora920@ORA920PC> select *
  2    from t
  3   where a <= to_date( '01-jan-2002' )
  4     and b > to_date( '01-jan-2002' )
  5  /
 
no rows selected
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
 
 
 
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       8306  consistent gets
       8304  physical reads
          0  redo size
        266  bytes sent via SQL*Net to client
        368  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed
 
<b>it depends on the inputs...  sometimes a,b is right -- sometimes b,a is correct (and sometimes FULL_SCAN=true is correct)</b>

Ok, so that does not help you too much yet (just explains why this is "hard").  If this is the "important query to tune", we can do a couple of things.  

I could use Oracle Text to index a "function"; this function would take two dates and return a string, something like:

input = 01-jan-2003, 14-mar-2003

012003 022003 032003 


then we could:

select * from t where contains( dummy, to_char( :moment, 'mmyyyy' ) ) > 0 and
begin_date <= :moment
and end_date > :moment


that would have us just look at "the months" of interest.

We could also do this with a trigger and an IOT -- sort of a DIY index, you would create an IOT with a date field and either a rowid or copy the primary key from the original table.

A trigger would put into this IOT the dates, rowid/pk -- after computing the relevant dates (eg: trunc start date to the month, insert that, add a month, insert that until you have a date that exceeds your end date).

then you can query:


select * from t where rowid in ( select rid from iot where dt = trunc(:moment,'mm')) and begin_date <= :moment
and end_date > :moment;


it'll do sort of the same thing.  Of course, you could trunc to the week and add 7 days instead of a month, trunc to the day and add a day - whatever -- to make the index "more selective"
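The bucketing step of that trigger (trunc the start date to the month, insert it, add a month, repeat until past the end date) can be sketched in plain Python; `month_buckets` is a hypothetical helper name, and the month keys it emits correspond to the IOT's date field:

```python
from datetime import date

def month_buckets(start: date, end: date):
    """Enumerate the first-of-month keys that a [start, end] range spans.
    Mirrors the trigger logic described above: trunc the start date to
    the month, record it, advance a month, repeat until past end."""
    buckets = []
    d = date(start.year, start.month, 1)
    while d <= end:
        buckets.append(d)
        # advance to the first day of the next month
        d = date(d.year + (d.month // 12), (d.month % 12) + 1, 1)
    return buckets

# The range 01-jan-2003 .. 14-mar-2003 spans three month buckets
print(month_buckets(date(2003, 1, 1), date(2003, 3, 14)))
```

Searching the IOT for `trunc(:moment, 'mm')` then only needs to touch rows whose bucket list contains that one month.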
                                                                                                               
                                                                                                               

 

Exactly

Christo Kutrovsky, February 19, 2004 - 11:21 am UTC

Those are the exact problems I am facing. I tried the index in reverse too. I finally had to take a piece of paper and write out a few entries to understand why a range scan from the start value to the end of the entire index was necessary.

You said:
<q>
It is a lot like the "spatial" problem with lat/longs. We had to invent a
special datatype for that stuff and geocode the data to make it "efficient".
</q>

I am not familiar with the solution in lat/long's. Before I dig into it, do you think a similar solution would be possible in this case?



Tom Kyte
February 19, 2004 - 11:41 am UTC

No, I don't think a similar solution would work -- lats/longs are in a very discrete, bounded, small set.

start_date, end_date is "infinite" in size -- not bounded.


unless you put artificial (or business) constraints on the start/end -- the problem is just that they could be any pair of values with no end points. so the stuff that works for lat/long -- won't work for start/end




A reader, March 19, 2004 - 8:51 am UTC


Slightly different

Sudha Bhagavatula, March 22, 2004 - 1:26 pm UTC

I have 2 tables, contracts and member_providers, each with member_id, eff_date and end_date fields. The table contracts has data like this:

member id := 100
eff_date := 01/01/2003
end_date := 12/31/2003

The table member_providers has data like this:

member_id := 100
eff_date := 01/01/2003
end_date := 06/30/2003

member_id := 100
eff_date := 08/01/2003
end_date := 12/31/2003

I want to identify the break in range in member_providers table compared to the contracts table in a SQL statement without using PL/SQL. Is it possible?

Tom Kyte
March 22, 2004 - 3:37 pm UTC

you can find breaks using LAG() and LEAD() easily....

do you know about them? (search for

rock roll


on this site -- lots of analytic examples)
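The LAG()/LEAD() comparison Tom points at (current eff_date versus the previous end_date + 1 day) can be sketched in plain Python to show what counts as a break; `find_breaks` is my name for the illustration, using the member 100 data from the question:

```python
from datetime import date, timedelta

def find_breaks(periods):
    """Given (eff_date, end_date) coverage rows, return the (gap_start,
    gap_end) holes between consecutive rows -- the same comparison LAG()
    makes in SQL: current eff_date versus previous end_date + 1 day."""
    gaps = []
    prev_end = None
    for eff, end in sorted(periods):
        if prev_end is not None and eff > prev_end + timedelta(days=1):
            gaps.append((prev_end + timedelta(days=1), eff - timedelta(days=1)))
        prev_end = end if prev_end is None else max(prev_end, end)
    return gaps

# member_providers rows for member 100: July 2003 is uncovered
rows = [(date(2003, 1, 1), date(2003, 6, 30)),
        (date(2003, 8, 1), date(2003, 12, 31))]
print(find_breaks(rows))  # one gap: 01-jul-2003 through 31-jul-2003
```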

very interesting issue

Muhammad Ibrahim, March 22, 2004 - 9:27 pm UTC

CB => c_beginning
PB => parameter start date
PE => parameter end date

CB1,CB2... different start dates in the table t
CE1,CE2... different end dates in the table t.

the situation looks like:

PB PE
|--------------------------|


|-----------|
CB1 CE1

|--------------------------|
CB2 CE2

|----------------|
CB3 CE3

|------------------|
CB4 CE4

|-----|
CB5 CE5

|--------|
CB6 CE6


From the below query

Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) ); -- rule 3

it will return the below date sets:

CB1-CE1
CB2-CE2
CB3-CE3
CB4-CE4
CB5-CE5 -- *

the first four are correct but I want to eliminate CB5-CE5 because neither its start nor its end date falls within PB-PE.
The same applies to CB6-CE6. How can I do this?

Thanks,
Ibrahim.


Tom Kyte
March 22, 2004 - 9:31 pm UTC

how does that return cb5-ce5 since

cb5 is not between pb and pe

ce5 is not between pb and pe

pe is NOT less than ce5

yes you are correct about CB5, but will it return the new CB7 scenario too?

A reader, March 22, 2004 - 11:09 pm UTC

Thanks for your reply. One more simple question: will the same query return CB7 too?

CB => c_beginning
PB => parameter start date
PE => parameter end date

CB1,CB2... different start dates in the table t
CE1,CE2... different end dates in the table t.

the situation looks like:

PB PE
|--------------------------|


|-----------|
CB1 CE1

|--------------------------|
CB2 CE2

|----------------|
CB3 CE3

|------------------|
CB4 CE4

|-----|
CB5 CE5

|--------|
CB6 CE6
|----------------------------------------------------------|
CB7 CE7
From the below query

Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) ); -- rule 3

it will return the below date sets:

CB1-CE1
CB2-CE2
CB3-CE3
CB4-CE4
CB5-CE5
CB7-CE7 -- ???

If I want to get CB7 as well, do I need to add another condition? The reason is that CB7 is active even though it starts before PB and ends after PE.

Thanks,
Ibrahim.



Tom Kyte
March 23, 2004 - 6:57 am UTC

for cb7

rule1 is false
rule2 is false
rule3 is false

(you know, you could load up some data or just evaluate the rules yourself! )


Yes, you would have to accommodate cb7-ce7 if you wanted it; the original problem statement did not want it. It would not be too hard to add.

Thanks for your reply

Muhammad Ibrahim, March 23, 2004 - 8:17 pm UTC

Dear Tom,

Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) -- rule 3
Or ( CB < PE And PE < CE ) ); -- rule 4

Rule 4 can cover CB7-CE7, but I am just wondering whether any other scenario can slip through PB and PE. Just a doubt :-)

Thanks for your kind help to the oracle world.

Regards,
Ibrahim.

Tom Kyte
March 24, 2004 - 8:27 am UTC

at this point, it is just "logic", a predicate, a where clause.

you define what you want and the db will return it.

if that is what you want, there you go.....
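For what it's worth, for closed intervals the OR'd rules collapse to a single two-comparison test (a well-known identity, not something stated in the thread): two ranges overlap exactly when each one starts no later than the other ends. A sketch:

```python
def overlaps(cb, ce, pb, pe):
    """True when [cb, ce] and [pb, pe] share at least one point.
    Equivalent to the union of rules 1-4: partial overlap on either
    side, and containment in either direction (CB2 and CB7 alike)."""
    return cb <= pe and ce >= pb

# PB..PE = 10..40; CB5 (1..5) misses, CB7 (5..50) spans the whole window
print(overlaps(1, 5, 10, 40), overlaps(5, 50, 10, 40))
```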

Optimizing joins on start-end ranges

Bobby O, March 26, 2004 - 9:23 pm UTC

Tom,
Can the above advice be extended to range queries for self-joins of large tables, on numeric rather than date values? For data-mining (i.e. not OLAP) queries?

As an example, below, I need to return ids for pairs of start-end records that overlap.  I run variants of the query below quite often, and the spatial overlap step is an enormous, recurring bottleneck.

In my case, these are data-mining/data-analysis joins on a system dedicated to these types of joins; pre-calculating materialized views, adding indexes, partitioning, writing pipelined table functions, custom index functions etc. would all be acceptable as a solution.

Thanks,
  Bobby O

DETAILS:
--- the query:
SQL> edit
Wrote file afiedt.buf

  1  select c1.id as id1, c2.id as id2
  2  from chrom_interval c1
  3      , chrom_interval c2
  4  where c1.cstart < c2.cend
  5    and c1.cend   > c2.cstart
  6    and c1.chrom = c2.chrom
  7*   and c1.id < c2.id -- id is primary key; report each pair once

 ---- The table:
SQL> describe chrom_interval;
 Name                      Null?    Type
 ------------------------ -------- ---------------
 ID                        NOT NULL NUMBER
 CHROM                     NOT NULL VARCHAR2(255)
 CSTART                    NOT NULL NUMBER(10)
 CEND                      NOT NULL NUMBER(10)
SQL> select count(*), count(distinct chrom ) from chrom_interval;

  COUNT(*) COUNT(DISTINCTCHROM)
---------- --------------------
  24644323                   41

The (cstart,cstop) range of each record is small compared to the max( cstart ) and min( cstart ) for each chrom. E.g. overlaps are infrequent, with each record overlapping ~0-10 other records.
Common query variants often include using the above join but with c1 & c2 from two different tables with the same columns as chrom_interval, or restricting c1 to a range of ids comprising ~0.1-1% of all chrom_interval rows.

Tom Kyte
March 27, 2004 - 10:33 am UTC

the techniques would apply for numbers as well as dates. 

just as you can expand dates out into months and text index them -- you can do so with numbers as well.

but, there our goal was to reduce the search space and your query:

where c1.chrom=c2.chrom
  and c1.id < c2.id

would seem to already do that?   

If you are looking for overlaps (record "1" overlaps record "2") -- analytics might be best (single pass)....

It will not detect multiple overlaps as yours does (eg: record 2 and record 3 overlap with record 1 -- you would show

1 2
1 3
2 3

I would show

1 2
2 3

not sure if that matters to you or not...)

ops$tkyte@ORA9IR2> select c1.id as id1, c2.id as id2
  2  from t c1
  3      , t c2
  4  where c1.cstart < c2.cend
  5    and c1.cend      > c2.cstart
  6    and c1.chrom = c2.chrom
  7    and c1.id < c2.id
  8  /
 
       ID1        ID2
---------- ----------
         3          4
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id, next_id
  2    from (
  3  select id,
  4         lead(id) over (partition by chrom order by id) next_id,
  5         cend,
  6         lead(cstart) over (partition by chrom order by id) next_cstart
  7    from t
  8         )
  9   where next_cstart <= cend
 10  /
 
        ID    NEXT_ID
---------- ----------
         3          4



 

getting all overlaps

Bobby O, March 29, 2004 - 3:42 pm UTC

Tom,
thanks. I'm still not sure how to optimize the query in my previous post; I do need all overlaps; e.g.
in the case you mentioned, I need the result:
1 2
1 3
2 3

; the other result, below, would not work.

1 2
2 3

What is the optimal strategy for greatly accelerating this type of query when comparing two very large tables? What index + what hint, or should I rather write custom PL/SQL, or...?
I'm willing to accept a less elegant solution if it is a lot faster. Currently, the likely alternative is dumping the data out of the database and doing the overlap test in perl. The query I noted runs too slowly to be useful/acceptable (many hours to days). But I would think there is a way where Oracle is the fastest route...?

Tom Kyte
March 29, 2004 - 4:22 pm UTC

then you need what you asked for - a non-equijoin

the only way to "optimize it" in this case would be to change the question :)

you need to almost cartesian join at the "id" level in order to get the number of rows you want.

You will not get me to believe that anything you write in perl would be faster than anything you could do in SQL or PLSQL. Sorry, but I would not "buy that". If you want to benchmark it - please post the perl and we'll have a look see at the assumptions you've made.


also, what's the output of an autotrace traceonly explain for that query?




Sql autotrace for interval overlap query

bobbyo, April 12, 2004 - 3:11 pm UTC

Tom,
   I believe that the sparseness of overlaps should help as a restrictive assumption in the join. To wit, the chrom_interval rows are start-stop intervals along a line, and very few interval-interval pairs result in an overlap.
  Hence, shouldn't a sort-merge-like join approach, perhaps with the columns pre-sorted in an index, be much more efficient than a cartesian join?

Thanks,
  Bobby O

SQL> edit
Wrote file afiedt.buf

  1  select c1.id as id1, c2.id as id2
  2  from chrom_interval c1
  3      , chrom_interval c2
  4  where c1.cstart < c2.cend
  5    and c1.cend   > c2.cstart
  6    and c1.chrom = c2.chrom
  7*   and c1.id < c2.id -- id is primary key; report each pair once
SQL> /

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=32751 Card=185165443
          9 Bytes=92582721950)

   1    0   HASH JOIN (Cost=32751 Card=1851654439 Bytes=92582721950)
   2    1     TABLE ACCESS (FULL) OF 'CHROM_INTERVAL' (Cost=2522 Card=
          24644323 Bytes=616108075)

   3    1     TABLE ACCESS (FULL) OF 'CHROM_INTERVAL' (Cost=2522 Card=
          24644323 Bytes=616108075)
 

Tom Kyte
April 12, 2004 - 7:08 pm UTC

hint it and see.

Hint it

bobbyo, April 14, 2004 - 3:49 pm UTC

What indexes & hints do you recommend ?

Tom Kyte
April 14, 2004 - 4:15 pm UTC

I don't have any -- I think indexes here would be disastrous to the performance.

index reading such big tables = bad
index probing from such a big table to another big table = really bad idea

Say you take this query:

1 select c1.id as id1, c2.id as id2
2 from chrom_interval c1
3 , chrom_interval c2
4 where c1.cstart < c2.cend
5 and c1.cend > c2.cstart
6 and c1.chrom = c2.chrom
7* and c1.id < c2.id


index chrom_interval(chrom,id,cstart,cend)

we'd have to full scan chrom_interval at least once and for *each row* if we used an index -- do a range scan -- you are talking billions of LIO's at that point.

can we use:

E.g. overlaps are infrequent, with each record
overlapping ~0-10 other records.


to your advantage? if you put a cap on the max for us, we can use analytics in a single pass.


Or, maybe this can work for you -- use analytics to get my small result and then a connect by to explode it out:

create global temporary table gtt (id int, next_id int );

insert into gtt
select id, next_id
from (
select id,
lead(id) over (partition by chrom order by id) next_id,
cend,
lead(cstart) over (partition by chrom order by id) next_cstart
from t
)
where next_cstart <= cend
/
select * from gtt connect by prior next_id = id
/



How to combine range of dates with the same amount

A reader, June 16, 2004 - 8:18 pm UTC

How to combine range of dates with the same amount:
(scripts are below )
I need to combine ranges of dates
which are
1) having the same amount and are
continuous ( start date of next record is end date of current record + 1 ),
so "only" show the records with min start date and max end date ( and not the individual ranges )
or
2) fall between the range of min start date and max end date for the same amount,
so "only" show the records with min start date and max end date ( and not the individual ranges ).

Plus show the remaining records which do not satisfy the criteria.



SELECT * FROM TEST order by 1,2;

ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70
01 7/11/2003 1/1/2004 70
02 5/6/2003 1/1/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78
9 rows selected

The final output must be

SELECT * FROM TEST order by 1,2
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78

5 rows needed

my attempts

SELECT *
FROM (SELECT a.*,
LEAD (start_date) OVER (PARTITION BY ID ORDER BY start_date)
lead_sd,
LEAD (amt) OVER (PARTITION BY ID ORDER BY start_date) lead_amt,
MIN (start_date) OVER (PARTITION BY ID ORDER BY amt) min_sd,
MAX (end_date) OVER (PARTITION BY ID ORDER BY amt) max_ed
FROM TEST a)
WHERE end_date = lead_sd - 1 AND amt = lead_amt

id start date end date amt lead sd lead amt min sd max ed
01 1/8/2001 1/2/2002 62 1/3/2002 62 1/11/2000 6/30/2002
01 1/2/2003 7/10/2003 70 7/11/2003 70 1/11/2000 1/1/2004

I couldn't go further; this gives me the min and max for continuous dates for the same price.
How can I combine this into one result set?




CREATE TABLE TEST
(
ID VARCHAR2(15 BYTE) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE,
AMT NUMBER(13)
)
NOLOGGING
NOCACHE
NOPARALLEL;

INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '07/11/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 70);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/03/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '06/30/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 62);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '07/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 64);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/02/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '07/10/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 70);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/08/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/02/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 62);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/11/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/07/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 57);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '01/02/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '12/12/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '05/06/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '05/06/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '12/12/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
COMMIT;

Thanx for all your help

Tom Kyte
June 17, 2004 - 8:01 am UTC

...
1) having the same amount and are
continuous ( start date of next record is end date of current record + 1 )
so "only" show the records with min start date and max end date ( and not
the individual ranges )
or
2) falls between the range of min start date and max end date for the same
amount
so "only" show the records with min start date and max end date ( and not the
individual ranges )
....

if you ask me, #1 and #2 seem to cover the entire set.

every record satisfies #2.

so, there must be something more to it?

How to combine range of dates with the same amount DB version

A reader, June 16, 2004 - 9:30 pm UTC

DB version is 9.2

How to combine : Some clarifications for point 2

A reader, June 16, 2004 - 10:12 pm UTC

wrt " 2) falls between the range of min start date and max end date for the same
amount
so "only" show the records with min start date and max end date ( and not the
individual ranges )"

I meant to say that the min start date and max end date above must be in the same record like in the example below
it is the second record.
Only then show that record.

ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------

02 5/6/2003 1/1/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78


Tom Kyte
June 17, 2004 - 9:25 am UTC

sorry, not getting it. or it just sounds like you want min(start), max(end) group by amt.

A reader, June 16, 2004 - 10:25 pm UTC

My corrected attempt:
SELECT *
FROM (SELECT a.*,
LEAD (start_date) OVER (PARTITION BY ID ORDER BY start_date)
lead_sd,
LEAD (amt) OVER (PARTITION BY ID ORDER BY start_date) lead_amt,
MIN (start_date) OVER (PARTITION BY ID,amt ORDER BY amt) min_sd,
MAX (end_date) OVER (PARTITION BY ID,amt ORDER BY amt) max_ed
FROM TEST a)
WHERE end_date = lead_sd - 1 AND amt = lead_amt

ID START_DATE END_DATE AMT LEAD_SD LEAD_AMT MIN_SD MAX_ED
--------------- --------------------- --------------------- -------------- --------------------- --------------------------------------- --------------------- ---------------------
01 1/8/2001 1/2/2002 62 1/3/2002 62 1/8/2001 6/30/2002
01 1/2/2003 7/10/2003 70 7/11/2003 70 1/2/2003 1/1/2004
2 rows selected

Clarifications

A reader, June 17, 2004 - 10:39 am UTC

SELECT * FROM TEST order by 1,2;

ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 1/2/2002 62 **
01 1/3/2002 6/30/2002 62 **
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70 ##
01 7/11/2003 1/1/2004 70 ##
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$
9 rows selected

The final output must be

ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 6/30/2002 62 **
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70 ##
02 5/6/2003 12/12/2004 78 $$


Explanation;

for the rows :
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
here the start date of the second row = end date of the first row + 1, and the amt is also the same.
----------------------------------------------------------------->
1/8/2001 1/2/2002
1/3/2002 6/30/2002



I need output as (as long as amts are same) :
for the above set min start and max end dates

01 1/8/2001 6/30/2002 62

for the rows :
01 1/2/2003 7/10/2003 70 ##
01 7/11/2003 1/1/2004 70 ##

the output needed is (as long as amts are same) :
01 1/2/2003 1/1/2004 70 ##


for the rows :
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$

here, if a single record has as its start date and end date the min start date and max end date for that amt,
show that record and neglect the rows with dates completely overlapping within that range

------------------------------------------------------------------>
5/6/2003 1/1/2004
1/2/2004 12/12/2004
5/6/2003 12/12/2004 $ show this row only
the output needed is (as long as amts are same)
02 5/6/2003 12/12/2004 78 $$


Please see the matching characters.

So the final output is above cases plus the remainder of the rows
(hence you get rows marked @@@@@ and %%%%%)

ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57 @@@@@
01 1/8/2001 6/30/2002 62 **
01 7/1/2002 1/1/2003 64 %%%%%
01 1/2/2003 1/1/2004 70 ##
02 5/6/2003 12/12/2004 78 $$


Tom Kyte
June 17, 2004 - 12:33 pm UTC

yah, as far as I can see,

select id, min(start_date), max(end_date), amt from t group by id, amt;

is it. At least for your example data, that does it.

You want the min start date by amt/id.
You want the max end date by amt/id.

all of this "prior" and "next" stuff doesn't seem to apply.

More:

A reader, June 17, 2004 - 10:51 am UTC

for the rows :
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$

Assume we also had a row like
02 11/5/2004 12/6/2004 78 $$

here, if a single record has as its start date and end date the min start date
and max end date for that amt,
show that record and neglect the rows with dates completely overlapping within that
range

------------------------------------------------------------------>
5/6/2003 1/1/2004
1/2/2004 12/12/2004
11/5/04 12/6/04
5/6/2003 12/12/2004 $ show this
row only
the output needed is (as long as amts are same)
02 5/6/2003 12/12/2004 78 $$



Tom Kyte
June 17, 2004 - 1:41 pm UTC

yah, so pump that data through my query -- we are still at "max min". I see nothing to indicate otherwise.

5/6/2003 is the min(start_date).
12/12/2004 is the max(end_date).

who cares about that other row?

May be :

A reader, June 17, 2004 - 2:31 pm UTC

I did not put a good test case.

1) I need the max and min dates only for those ranges that
are continuous (start_date = prev end_date + 1) and have the same amt,
so that I can unify those records into a single range with min sd and max ed values.
This min/max row will be a computed row representing those ranges.



Your query will give me wrong results for the data set:
select * from test order by 1,2
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/5/2001 62
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70
01 7/11/2003 1/1/2004 70
02 5/6/2003 1/4/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78
9 rows selected

It will give me :
select id, min(start_date), max(end_date), amt from test group by id, amt
select id, min(start_date), max(end_date), amt from test3 group by id, amt
ID MIN(START_DATE) MAX(END_DATE) AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 6/30/2002 62 -- does not consider breaks in range
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78
4 rows selected

What is required is :
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/5/2001 62 -- this comes because of the break in range.
01 1/8/2001 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78

therefore I used analytics to get dates which are continuous and have the same amt, and then get their min SD and max ED.
Basically, the requirement is: as long as ranges are continuous and have the same amount, get the min sd and max ed,
and do not show the individual ranges for them.

Also, even if a range is not continuous, but the rows fall within
the range of such a row which happens to have the
min SD and max ED for that amt, then show that and only that row ( see id = 02 in this case ).

So the min and max should either be in the same row for that amt, or,
if the range is "continuous", only then should min and max be computed across the various rows.


Any clues to this?
Thanx for your help

Tom Kyte
June 17, 2004 - 3:03 pm UTC

ops$tkyte@ORA9IR2> select id, min(start_date), max(end_date), amt
  2    from (
  3  select id, start_date, end_date, amt,
  4         max(grp) over (partition by id,amt order by end_date) grp
  5    from (
  6  select id, start_date, end_date, amt,
  7         case when lag(end_date) over (partition by id, amt order by end_date ) <> start_date-1
  8                   or
  9                   row_number() over (partition by id, amt order by end_date) = 1
 10              then row_number() over (partition by id, amt order by end_date)
 11          end grp
 12    from t
 13         )
 14         )
 15   group by id, amt, grp
 16  /
 
        ID MIN(START_ MAX(END_DA        AMT
---------- ---------- ---------- ----------
         1 01/11/2000 01/05/2001         62
         1 01/08/2001 06/30/2002         62
         1 07/01/2002 01/01/2003         64
         1 01/02/2003 01/01/2004         70
         2 05/06/2003 01/04/2004         78
         2 05/06/2003 12/12/2004         78
 
6 rows selected.
 


run the queries from the inside out to see what I did.  just marked the beginning of groups with row_number, then carried this grp down the rows, then we just group by it.
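Tom's islands technique (flag the start of each group, carry the flag down the partition, then group by it) can be mirrored in Python to check the expected output; `merge_continuous` is my name for the sketch:

```python
from collections import defaultdict
from datetime import date, timedelta

def merge_continuous(rows):
    """rows: (id, start_date, end_date, amt) tuples. Within each (id, amt)
    partition ordered by end_date, start a new group whenever the previous
    end_date + 1 day != the current start_date (Tom's lag() comparison),
    then emit min(start)/max(end) per group -- one row per island."""
    parts = defaultdict(list)
    for r in rows:
        parts[(r[0], r[3])].append(r)
    out = []
    for (rid, amt), rs in sorted(parts.items()):
        rs.sort(key=lambda r: r[2])                    # order by end_date
        groups = []
        for r in rs:
            if groups and groups[-1][-1][2] + timedelta(days=1) == r[1]:
                groups[-1].append(r)                   # continuous: extend island
            else:
                groups.append([r])                     # break: new island
        for g in groups:
            out.append((rid, min(x[1] for x in g), max(x[2] for x in g), amt))
    return sorted(out)

# the id '01' / amt 62 rows from the follow-up data set: one break in range
rows = [('01', date(2000, 1, 11), date(2001, 1, 5), 62),
        ('01', date(2001, 1, 8), date(2002, 1, 2), 62),
        ('01', date(2002, 1, 3), date(2002, 6, 30), 62)]
print(merge_continuous(rows))
```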

 

A Problem with

A reader, June 17, 2004 - 3:44 pm UTC

Great Thanx for your help sir.
But there is a problem with the query:
The results:
ID MIN(START_ MAX(END_DA AMT
---------- ---------- ---------- ----------
1 01/11/2000 01/05/2001 62
1 01/08/2001 06/30/2002 62
1 07/01/2002 01/01/2003 64
1 01/02/2003 01/01/2004 70
2 05/06/2003 01/04/2004 78 *****
2 05/06/2003 12/12/2004 78

The **** row should not show up, as it falls within the
range of 5/6/2003 to 12/12/2004 (which is a single row in the data set) and has the same amt.
How can I eliminate such rows?

The second requirement was "
Also, even if a range is not continuous, but the rows fall within
the range of such a row which happens to have the
min SD and max ED for that amt, then show that and only that "max min" row ( see id = 02 in
this case )."
-------------------------->
5/6/2003 1/4/2004
1/2/2004 12/12/2004
5/6/2003 12/12/2004



Tom Kyte
June 17, 2004 - 6:25 pm UTC

probably not going to happen in sql - you would have to compare that row to every other grouped row inside of that set....

and it gets really wicked if you have something like:

2 05/06/2003 01/04/2004 78
2 05/06/2003 01/04/2004 78

since pretty much most every piece of sql logic we could apply to that would remove both rows.

so I don't see it happening.

Any directions ?

A reader, June 19, 2004 - 9:00 pm UTC

Great thanks, sir, for your help.
Any directions or algorithm to implement this in PL/SQL?
I tried with PL/SQL itself; the logic gets very, very complicated.
I was thinking of opening 2 cursors:
c1 is select * from t order by id, start_date
c2(p_id number, p_amt number) is select * from t where
id = p_id and amt = p_amt.

Tom Kyte
June 20, 2004 - 10:18 am UTC

not really, you'll need to basically get all of the records for a given group and compare each record in that group to every other one -- looking for "overlaps". sort of like a cartesian join.

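Tom's pairwise "cartesian" comparison can be sketched outside of SQL. Here is a hypothetical Python version (my illustration, not from the thread; the tuples mirror the example data) that drops rows strictly contained in another row with the same id and amt, while keeping exact duplicates intact, which is the case Tom flags as wicked:

```python
from datetime import date

# Each row: (id, start_date, end_date, amt) -- mirrors the example data
rows = [
    (2, date(2003, 5, 6), date(2004, 1, 4), 78),
    (2, date(2003, 5, 6), date(2004, 12, 12), 78),
]

def drop_contained(rows):
    """Keep only rows NOT strictly contained in another row with the
    same id and amt (the pairwise 'cartesian' comparison).  Exact
    duplicates survive because neither strictly contains the other."""
    kept = []
    for i, (rid, s, e, amt) in enumerate(rows):
        contained = any(
            j != i and orid == rid and oamt == amt
            and os <= s and e <= oe and (os, oe) != (s, e)
            for j, (orid, os, oe, oamt) in enumerate(rows)
        )
        if not contained:
            kept.append(rows[i])
    return kept
```

Here the 05/06/2003-01/04/2004 row is dropped because it sits inside 05/06/2003-12/12/2004 for the same id and amt.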
date in the where clause

A reader, August 12, 2004 - 12:19 pm UTC

hi tom
in the where clause I have something like:

date_column >= sysdate - :no_of_days

where :no_of_days is the number of days (say, 10)...

Is there a better way of writing this? I mean I think I have seen some optimizations you suggest for clauses involving date between dat1 and date2...does any of that
apply here?

thanx!

Tom Kyte
August 12, 2004 - 12:37 pm UTC

that is the right way to do that. you want to avoid performing functions on database columns and do them on constants -- as you are.

Days in overlapping/non-overlapping dates ...

VKOUL, November 05, 2004 - 9:02 pm UTC

SQL> create table dt(from_date date, to_date date);

Table created.

SQL> 
SQL> insert into dt values('01-jan-2004', '05-jan-2004');

1 row created.

SQL> insert into dt values('04-jan-2004', '10-jan-2004');

1 row created.

SQL> insert into dt values('11-jan-2004', '20-jan-2004');

1 row created.

SQL> 
SQL> insert into dt values('19-jan-2004', '24-jan-2004');

1 row created.

SQL> insert into dt values('26-jan-2004', '31-jan-2004');

1 row created.

SQL> commit;

Commit complete.

SQL> 
SQL> select * from dt;

FROM_DATE TO_DATE
--------- ---------
01-JAN-04 05-JAN-04 
04-JAN-04 10-JAN-04
11-JAN-04 20-JAN-04
19-JAN-04 24-JAN-04
26-JAN-04 31-JAN-04

SQL> 

Q : Is it possible to get the total number of days in this set, counting any overlapping period once only?

i.e. Between 01-JAN-04 & 05-JAN-04, there are 5 days
     Between 04-JAN-04 & 10-JAN-04, there are 7 days
     
     But, 04-JAN & 05-JAN are common, so the total number of days will be effectively 10.

Thanks 

Tom Kyte
November 06, 2004 - 10:27 am UTC

will you have


01-jan-04 05-jan-04
03-jan-04 10-jan-04
04-jan-04 11-jan-04

as well -- is it only contiguous records that overlap, or could these overlaps span tons of records?

Days in overlapping/non-overlapping dates ...

VKOUL, November 07, 2004 - 1:01 pm UTC

Yes they will be, but these records are grouped by a key; on a per-key basis the records will be finite, say at most 20 overlapping/non-overlapping.


Tom Kyte
November 07, 2004 - 3:00 pm UTC

are you saying "by key", at most 20 records?

VKOUL, November 07, 2004 - 5:45 pm UTC

Yes, that is true

Tom Kyte
November 08, 2004 - 9:40 am UTC

ops$tkyte@ORA9IR2> select min(from_date) from_date,max(to_date) to_date, max(to_date)-min(from_date)+1 days, grp
  2    from (
  3  select from_date, to_date, max(grp) over (order by to_date) grp
  4    from (
  5  select from_date, to_date,
  6         lag(to_date) over (order by to_date) lag_to_date,
  7         case when nvl( lag(to_date) over (order by to_date),from_date-1)
  8                       < from_date then row_number() over (order by to_date) end grp
  9    from dt
 10         )
 11         )
 12   group by grp
 13  /
 
FROM_DATE TO_DATE         DAYS        GRP
--------- --------- ---------- ----------
01-JAN-04 10-JAN-04         10          1
11-JAN-04 24-JAN-04         14          3
26-JAN-04 31-JAN-04          6          5
 
ops$tkyte@ORA9IR2> select sum((to_date-from_date)+1)
  2    from (
  3  select min(from_date) from_date,max(to_date) to_date,grp
  4    from (
  5  select from_date, to_date, max(grp) over (order by to_date) grp
  6    from (
  7  select from_date, to_date,
  8         lag(to_date) over (order by to_date) lag_to_date,
  9         case when nvl( lag(to_date) over (order by to_date),from_date-1)
 10                < from_date then row_number() over (order by to_date) end grp
 11    from dt
 12         )
 13         )
 14   group by grp
 15         )
 16  /
 
SUM((TO_DATE-FROM_DATE)+1)
--------------------------
                        30
 
ops$tkyte@ORA9IR2>  

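The inline-view grouping above (mark a group start wherever the lagging to_date doesn't reach the current from_date, carry the group marker down, then group) can be sketched procedurally. A Python sketch under the same assumptions (inclusive date ranges, processed in to_date order, my illustration only):

```python
from datetime import date

# The dt table from above: (from_date, to_date), inclusive ranges
ranges = [
    (date(2004, 1, 1), date(2004, 1, 5)),
    (date(2004, 1, 4), date(2004, 1, 10)),
    (date(2004, 1, 11), date(2004, 1, 20)),
    (date(2004, 1, 19), date(2004, 1, 24)),
    (date(2004, 1, 26), date(2004, 1, 31)),
]

def merged_day_count(ranges):
    """Walk ranges in to_date order; a new group starts when the
    previous group's end falls before the current from_date (the
    lag(to_date) < from_date test).  Each day is counted once."""
    total = 0
    cur_from = cur_to = None
    for f, t in sorted(ranges, key=lambda r: r[1]):
        if cur_to is None or f > cur_to:      # start of a new group
            if cur_to is not None:
                total += (cur_to - cur_from).days + 1
            cur_from, cur_to = f, t
        else:                                 # overlaps: extend the group
            cur_to = max(cur_to, t)
    if cur_to is not None:
        total += (cur_to - cur_from).days + 1
    return total
```

On the sample data this gives 30, matching the SQL's 10 + 14 + 6.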
VKOUL, November 08, 2004 - 4:33 pm UTC

Thanks Tom,

it works great.

I am going to test it with couple more scenarios



SUM(interval) work?

Joseph, November 08, 2004 - 6:23 pm UTC

Tom

Thanks for your all great insights. If you can tackle this, I'd appreciate.

I have a table that contains two timestamp columns (START and STOP). Each record is basically a log of a batch job. I tried

SELECT SUM(END_TIME-START_TIME) total_time
FROM jobs
WHERE TRANS_PROC_ID > 12886;

I got ORA-00932: INCONSISTENT DATA TYPES: expected number got interval.

Can we aggregrate interval data? How?

Thanks
Joseph


last granular detail of time we can store

A reader, November 09, 2004 - 2:51 am UTC

Tom, what is the last granular detail of time we can store in oracle database, and what is its format..

Thanks

Tom Kyte
November 09, 2004 - 8:50 am UTC

ops$tkyte@ORA9IR2> select systimestamp from dual;
 
SYSTIMESTAMP
---------------------------------------------------------------------------
09-NOV-04 08.22.27.351007 AM -05:00



read about TIMESTAMP in the Concepts and SQL Reference Guide. 

Leading Question

Robert, November 29, 2004 - 6:36 pm UTC

Tom,

Is there a technique which you commonly use to fill in the last lead() with the next value in a table.

For example, I would like to return the value of P200403 for the last row in this query.

test9.2> l
1 select part_name,
2 high_values,
3 lead_part
4 from (select A.partition_na part_name,
5 A.partition_key_c high_values,
6 lead(partition_na) over (order by partition_na) lead_part
7 from my_table_partitions A
8 where A.table_owner_na = 'TEST'
9 and A.table_na = 'FACT_TABLE'
10 and exists (select NULL
11 from dimension_table B
12* where B.system_calendar_key_n = A.partition_key_c))
test9.2> /

PART_NAME HIGH_VALUES LEAD_PART
------------- --------------- ----------
P200350 200351 P200351
P200351 200352 P200352
P200352 200401 P200401
P200401 200402 P200402
P200402 200403

5 rows selected.


Thanks,
Dale

Tom Kyte
November 29, 2004 - 7:12 pm UTC

nvl( lead()...., partition_key_c )


you just want partition_key_c when lead is null right?

Thanks!!!

Robert D. Ware, December 01, 2004 - 10:17 am UTC

Tom,

Again you were right on the money.

NVL(lead(partition_na) over (order by partition_na),'P'||partition_key_c) part_name

This gave me exactly what I needed.

Thanks again!

Hour wise Max number of logged in Users

Vaishnavi, December 19, 2004 - 2:29 am UTC

Hi Tom,

Thank you very much for your valuable inputs.  I need to find out the hour-wise max number of logged-in users from a table.

SQL> create table t (login date, logout date);

Table created.

SQL> insert into t values(trunc(sysdate) + (8/24), trunc(sysdate) + (11/24) + (15/(24*60)));

1 row created.

SQL> insert into t values(trunc(sysdate) + (9/24), trunc(sysdate) + (11/24) + (15/(24*60)));

1 row created.

SQL> insert into t values(trunc(sysdate) + (10/24), trunc(sysdate) + (11/24) + (15/(24*60)));

1 row created.

SQL> insert into t values(trunc(sysdate) + (11/24), trunc(sysdate) + (11/24) + (15/(24*60)));

1 row created.

SQL> commit;

Commit complete.

SQL>

SQL> select to_char(login, 'DD-MON-RRRR HH24:MI:SS'), to_char(logout, 'DD-MON-RRRR HH24:MI:SS') from t;

TO_CHAR(login,'DD-MO TO_CHAR(logout,'DD-M
-------------------- --------------------
19-DEC-2004 08:00:00 19-DEC-2004 11:15:00
19-DEC-2004 09:00:00 19-DEC-2004 11:15:00
19-DEC-2004 10:00:00 19-DEC-2004 11:15:00
19-DEC-2004 11:00:00 19-DEC-2004 11:15:00

Now I want to find, grouped by each hour, the max number of users connected.

Output I am expecting is:

19-DEC-2004 08:00:00        1
19-DEC-2004 09:00:00        2
19-DEC-2004 10:00:00        3
19-DEC-2004 11:00:00        4

For a single day, half a million records will be added to this table, and I need to pull the above information at the end of each day.

Can I get this output in a single query?  Or do I need to write a function which will return the count, with HH24 as the input?

Please suggest.

Sincerely
Vaishnavi
 

Tom Kyte
December 19, 2004 - 11:24 am UTC

sort of remotely related to the original question, so...


we'll need a table of hours to join to -- we need to "multiply" a single row out into many rows:


ops$tkyte@ORA9IR2> create table hours
  2  as
  3  select rownum hh24
  4    from all_objects
  5   where rownum <= 24; 
Table created.
 
Elapsed: 00:00:00.07

ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := to_char(sysdate,'dd-mon-yyyy')
PL/SQL procedure successfully completed.
 
Elapsed: 00:00:00.00
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select b.hh24, count(*)
  2    from (
  3  select to_number(to_char(login,'hh24')) login,
  4         to_number(to_char(logout,'hh24')) logout
  5    from t
  6   where logout >= to_date( :x, 'dd-mon-yyyy' )
  7     and login  < to_date( :x, 'dd-mon-yyyy' )+1
  8         ) a,
  9             hours b
 10   where b.hh24 between login and logout
 11   group by b.hh24
 12  /
 
      HH24   COUNT(*)
---------- ----------
         8          1
         9          2
        10          3
        11          4
 

<b>that is the output we want -- now to test it "bigger"</b>

ops$tkyte@ORA9IR2> insert /*+ APPEND */ into t
  2  select trunc(sysdate)+mod(rownum,80000)/24/60/60,
  3         trunc(sysdate)+mod(rownum,80000)/24/60/60 + (mod(rownum,5)+1)*1/24
  4    from big_table.big_table
  5   where rownum <= 500000;
 
500000 rows created.
 
ops$tkyte@ORA9IR2> commit;
 
Commit complete.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'HOURS' );
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> set timing on
ops$tkyte@ORA9IR2> select b.hh24, count(*)
  2    from (
  3  select to_number(to_char(login,'hh24')) login,
  4         to_number(to_char(logout,'hh24')) logout
  5    from t
  6   where logout >= to_date( :x, 'dd-mon-yyyy' )
  7     and login  < to_date( :x, 'dd-mon-yyyy' )+1
  8         ) a,
  9             hours b
 10   where b.hh24 between login and logout
 11   group by b.hh24
 12   order by 2
 13  /
 
      HH24   COUNT(*)
---------- ----------
        23      18240
        22      39840
         1      50399
        21      60480
         2      70560
        20      73440
        19      82080
         3      85680
        18      86400
        17      86400
        16      86400
        15      86400
        14      86400
        13      86400
        12      86400
        11      86404
        10      86803
         9      87922
         8      89761
         7      92320
         6      95601
         4      95760
         5      99201
 
23 rows selected.
 
Elapsed: 00:00:10.97
ops$tkyte@ORA9IR2>

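The hours-table join above can be mimicked by expanding each session into the hours it spans and counting. An illustrative Python sketch (mine, not from the thread), assuming login and logout fall on the same day as in the example:

```python
from collections import Counter
from datetime import datetime

# (login, logout) pairs from Vaishnavi's sample data
sessions = [
    (datetime(2004, 12, 19, 8, 0), datetime(2004, 12, 19, 11, 15)),
    (datetime(2004, 12, 19, 9, 0), datetime(2004, 12, 19, 11, 15)),
    (datetime(2004, 12, 19, 10, 0), datetime(2004, 12, 19, 11, 15)),
    (datetime(2004, 12, 19, 11, 0), datetime(2004, 12, 19, 11, 15)),
]

def users_per_hour(sessions):
    """For each session, count it once in every hour from its login
    hour through its logout hour -- the same effect as joining to the
    HOURS table with 'b.hh24 between login and logout'."""
    counts = Counter()
    for login, logout in sessions:
        for hh in range(login.hour, logout.hour + 1):
            counts[hh] += 1
    return dict(counts)
```

This reproduces the 1, 2, 3, 4 counts for hours 8 through 11. As Gabe notes below, sessions spanning more than 24 hours or still open would need extra handling.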
 

Looking for better solution than this!

Vaishnavi, December 19, 2004 - 2:41 am UTC

Hi Tom,

Sorry!! After giving it some serious thought, I came up with this solution, but it will do a cartesian product.

select trunc(a.login, 'HH24'), sum(case when b.logout > a.login and b.login <= a.login then 1 else 0 end) f_max_number
from t a, t b
group by trunc(a.login, 'HH24');

But this query will do a cartesian product against half a million records. :( Please help.

Thanks
Vaishnavi


two things for Vaishnavi to investigate/adjust ...

Gabe, December 20, 2004 - 2:50 pm UTC

The solution to the "number of users per hour" question won't work very well for:

1. users with open connections for more than 24 hours
2. users having an open connection when the query is executed (logout is null)



Tom Kyte
December 20, 2004 - 3:41 pm UTC

1) true, it'll skip em by design

Thank you Gabe

Vaishnavi, December 21, 2004 - 9:58 am UTC

Thank you Gabe. I am handling that. Thank you for your Interest.

Sincerely
Vaishnavi

date comparison (8.1.7)

A reader, January 04, 2005 - 5:55 pm UTC

hi tom,

I want to compare two dates including the hours part of the time.

so basically let us say to_day '04-Jan-2005' (sysdate)
and yesterday '03-Jan-2005' (sysdate-1)

I want to count all the inserted records from

select * from t
where 1 = 1
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
>= to_date('03-Jan-2005 16:00:00','dd-Mon-yyyy hh24:mi:ss')
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
< to_date('04-Jan-2005 16:00:00','dd-Mon-yyyy hh24:mi:ss')

works,
but how to do this with sysdate ?


select * from t
where 1 = 1
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
>= to_date('(sysdate -1) 16:00:00','dd-Mon-yyyy hh24:mi:ss')
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
< to_date('sysdate 16:00:00','dd-Mon-yyyy hh24:mi:ss')


please explain


Tom Kyte
January 05, 2005 - 9:10 am UTC

why are you using:

to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')

please don't tell me that mydate is a string, that would be so wrong, so so so SOOOOO wrong. It cannot be true. It must be a date

hence, it would simply be:

select * from t
where mydate between sysdate-1 and sysdate;

I've no idea what you are attempting to do with:

to_date('sysdate 16:00:00','dd-Mon-yyyy hh24:mi:ss')

at all? it doesn't make sense

it is a date.

A reader, January 05, 2005 - 10:42 am UTC

I just want to count all record in table,
which are inserted/updated from

18:00:00 hrs of (sysdate -1)
to 18:00:00 hrs of sysdate

want to tweak a 24-hour window from 6 pm (yesterday)
to 6 pm (today)



Tom Kyte
January 05, 2005 - 11:06 am UTC

where date_column between trunc(sysdate)-1+18/24
and trunc(sysdate)+18/24

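For comparison, the same 6-pm-to-6-pm window arithmetic written with Python datetimes (illustrative only; trunc(sysdate) corresponds to the date at midnight, and 18/24 adds 18 hours):

```python
from datetime import datetime, time, timedelta

def window_6pm(now):
    """trunc(sysdate)-1+18/24 .. trunc(sysdate)+18/24, in Python terms."""
    today_6pm = datetime.combine(now.date(), time(18, 0))  # trunc + 18/24
    return today_6pm - timedelta(days=1), today_6pm
```

window_6pm(datetime(2005, 1, 5, 11, 6)) yields the window 04-Jan-2005 18:00 through 05-Jan-2005 18:00.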
cool, thanks

A reader, January 05, 2005 - 11:57 am UTC

Thanks, tom

number of days that overlap between two date ranges

robin, January 14, 2005 - 1:20 pm UTC

is it possible to modify the above code to count the number of days the two date ranges overlap?

Tom Kyte
January 14, 2005 - 8:36 pm UTC

ops$tkyte@ORA9IR2> create table t ( x int, pb date, cb date, ce date, pe date );
 
Table created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t
  2  select rownum, null, sysdate-5, sysdate+5, null
  3    from all_users
  4   where rownum <= 6
  5  /
 
6 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=ce+1 where x=1;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=ce-1 where x=2;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set pb = cb+1, pe=ce+1 where x=3;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set pb = cb+1, pe=ce-1 where x=4;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=cb-1/2 where x=5;
 
1 row updated.
 
ops$tkyte@ORA9IR2> update t set pb = ce+1/2, pe=ce+1 where x=6;
 
1 row updated.
 
ops$tkyte@ORA9IR2> select t.*,
  2         case when (cb <= pe) AND (ce >= pb)
  3              then '<='
  4          end,
  5         greatest( cb, pb ),
  6         least( ce, pe ),
  7         case when (cb <= pe) AND (ce >= pb)
  8              then least( ce, pe )-greatest( cb, pb )
  9          end days
 10    from t
 11  /
 
 X PB        CB        CE        PE        CA GREATEST( LEAST(CE, DAYS
-- --------- --------- --------- --------- -- --------- --------- ----
 1 08-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05 <= 09-JAN-05 19-JAN-05   10
 2 08-JAN-05 09-JAN-05 19-JAN-05 18-JAN-05 <= 09-JAN-05 18-JAN-05    9
 3 10-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05 <= 10-JAN-05 19-JAN-05    9
 4 10-JAN-05 09-JAN-05 19-JAN-05 18-JAN-05 <= 10-JAN-05 18-JAN-05    8
 5 08-JAN-05 09-JAN-05 19-JAN-05 09-JAN-05    09-JAN-05 09-JAN-05
 6 20-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05    20-JAN-05 19-JAN-05
 
6 rows selected.
 

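The least/greatest rule above translates directly to other languages. A hypothetical Python version of the same computation (my sketch, not from the thread):

```python
from datetime import date

def overlap_days(cb, ce, pb, pe):
    """Days the ranges [cb,ce] and [pb,pe] overlap, or None if disjoint.
    Mirrors: case when cb <= pe and ce >= pb
                  then least(ce, pe) - greatest(cb, pb) end"""
    if cb <= pe and ce >= pb:
        return (min(ce, pe) - max(cb, pb)).days
    return None
```

For the x=1 row above (pb 08-JAN, cb 09-JAN, ce 19-JAN, pe 20-JAN) this returns 10, matching the SQL output.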
Between date ranges

chandra, January 19, 2005 - 11:36 pm UTC

Tom,
How do I calculate how many times users logged in between the given dates (at least once)?

lets say I have a table with users and date
date users
---------- -----
10/28/2004 userA
10/28/2004 userB
10/29/2004 userA
11/01/2004 userc ----------- just once
12/20/2004 userA
12/28/2004 userA
12/28/2004 userB
10/29/2004 userA
01/01/2005 userd ----------- just once
01/10/2005 userA

The max(date) is 01/10/2005 and min(date) is 10/28/2004

How do I calculate how many times users logged between the max(date) and min(date).

The output required should be

count(users)             days logged
2 (i.e. userC and userD)   1    -- 2 users logged in at least once
                                   between the min(date) and max(date)
1 (i.e. userA)             6    -- 1 user logged into the system 6 days
1 (i.e. userB)             2    -- 1 user logged into the system 2 days
Logging into the system need not be continuous, i.e. the user can log in on Monday and the next login can be after a week, and so on.

Thankyou very much
chandra

Tom Kyte
January 20, 2005 - 10:19 am UTC

select users, count(*) from t where dt between :a and :b group by users;



Net Days In Service

VKOUL, February 03, 2005 - 8:49 pm UTC

SQL> create table dt_net(from_date date, to_date date, type varchar2(10));

Table created.

SQL> insert into dt_net values('01-jan-2004', '05-jan-2004', 'ACTIVE');

1 row created.

SQL> insert into dt_net values('04-jan-2004', '10-jan-2004', 'HOLD');

1 row created.

SQL> insert into dt_net values('11-jan-2004', '18-jan-2004', 'ACTIVE');

1 row created.

SQL> insert into dt_net values('17-jan-2004', '20-jan-2004', 'HOLD');

1 row created.

SQL> insert into dt_net values('19-jan-2004', '24-jan-2004', 'ACTIVE');

1 row created.

SQL> insert into dt_net values('26-jan-2004', '31-jan-2004', 'ACTIVE');

1 row created.

SQL> commit;

Commit complete.

SQL> 
SQL> select * from dt_net
  2  /

FROM_DATE TO_DATE   TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE     -- 
04-JAN-04 10-JAN-04 HOLD       -- Effectively 5 days of hold
11-JAN-04 18-JAN-04 ACTIVE     
17-JAN-04 20-JAN-04 HOLD       -- Effectively 0 days of hold
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE

6 rows selected.

SQL> DISC
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
SQL> 

Output I am expecting is : 25 days of activity.
i.e. (30 (Total active)- 5(Total hold))

Thanks 

Tom Kyte
February 04, 2005 - 2:00 am UTC

look into lag and lead.  lets you look back and forth a row or two or N....

ops$tkyte@ORA9IR2> select from_date, to_date, type,
  2         lag(to_date) over (order by from_date) last_to_date
  3    from dt_net;
 
FROM_DATE TO_DATE   TYPE       LAST_TO_D
--------- --------- ---------- ---------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD       05-JAN-04
11-JAN-04 18-JAN-04 ACTIVE     10-JAN-04
17-JAN-04 20-JAN-04 HOLD       18-JAN-04
19-JAN-04 24-JAN-04 ACTIVE     20-JAN-04
26-JAN-04 31-JAN-04 ACTIVE     24-JAN-04
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select from_date,
  2         case when type = 'ACTIVE'
  3                  then to_date
  4                      else greatest(to_date,last_to_date)
  5                  end new_to_date,
  6                  type
  7    from (
  8  select from_date, to_date, type,
  9         lag(to_date) over (order by from_date) last_to_date
 10    from dt_net
 11         )
 12  /
 
FROM_DATE NEW_TO_DA TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD
11-JAN-04 18-JAN-04 ACTIVE
17-JAN-04 20-JAN-04 HOLD
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE
ops$tkyte@ORA9IR2> edit test
 
ops$tkyte@ORA9IR2> @test
ops$tkyte@ORA9IR2> drop table dt_net;
ops$tkyte@ORA9IR2> create table dt_net(from_date date, to_date date, type varchar2(10));
ops$tkyte@ORA9IR2> insert into dt_net values('01-jan-2004', '05-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('04-jan-2004', '10-jan-2004', 'HOLD');
ops$tkyte@ORA9IR2> insert into dt_net values('11-jan-2004', '18-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('17-jan-2004', '20-jan-2004', 'HOLD');
ops$tkyte@ORA9IR2> insert into dt_net values('19-jan-2004', '24-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('26-jan-2004', '31-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select from_date, to_date, type,
  2         lag(to_date) over (order by from_date) last_to_date
  3    from dt_net;
 
FROM_DATE TO_DATE   TYPE       LAST_TO_D
--------- --------- ---------- ---------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD       05-JAN-04
11-JAN-04 18-JAN-04 ACTIVE     10-JAN-04
17-JAN-04 20-JAN-04 HOLD       18-JAN-04
19-JAN-04 24-JAN-04 ACTIVE     20-JAN-04
26-JAN-04 31-JAN-04 ACTIVE     24-JAN-04
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select case when type = 'ACTIVE'
  2                  then from_date
  3                      else greatest(from_date,last_to_date)
  4                  end new_to_date,
  5                  to_date,
  6                  type
  7    from (
  8  select from_date, to_date, type,
  9         lag(to_date) over (order by from_date) last_to_date
 10    from dt_net
 11         )
 12  /
 
NEW_TO_DA TO_DATE   TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE
05-JAN-04 10-JAN-04 HOLD
11-JAN-04 18-JAN-04 ACTIVE
18-JAN-04 20-JAN-04 HOLD
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE
 

VKOUL, February 06, 2005 - 12:28 pm UTC

Yes, I think I'll be able to incorporate this logic with some modification.

Thanks

Missing Dates

Tracy, February 11, 2005 - 12:31 pm UTC

Two of the columns in my table are FromDate and ToDate.
In a given row FromDate is always less than ToDate.
Rows may have date ranges that overlap date ranges in other rows (or contain them completely).
I want to get a list of all dates that aren't in any row between the FromDate and the ToDate (starting with the earliest FromDate and finishing with the latest ToDate).

Let's say the table contains these rows:
FromDate ToDate
01-JAN-03 07-JAN-03
08-JAN-03 21-JAN-03
15-JAN-03 26-JAN-03
29-JAN-03 23-MAR-03
24-MAR-03 28-MAR-03
23-FEB-03 15-APR-03
18-APR-03 25-APR-03

I want my sql to return
27-jan-03
28-jan-03
16-apr-03
17-apr-03
as these are the only dates between 01-jan-03 (earliest FromDate) and 25-apr-03 (latest ToDate) that aren't between the dates in any row, i.e. they are 'missing'.

Tom Kyte
February 12, 2005 - 8:16 am UTC

you need to generate a set of dates, something like:


select (select min(fromdate) from t)+rownum-1 dt
from all_objects
where rownum <= (select max(todate) from t)-(select min(fromdate) from t)+1

(assuming the number of dates is small enough -- all_objects has about 30-40,000 rows in it typically these days. ANY table with enough rows will do)


Then, use that

with Q as (that_query)
select *
from q
where not exists (select null from t where fromdate <= q.dt and q.dt <= todate)


for example..
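The generate-a-calendar-then-anti-join idea can be sketched procedurally. An illustrative Python version using Tracy's data (a date is "missing" when no row's range covers it):

```python
from datetime import date, timedelta

# Tracy's (FromDate, ToDate) rows
ranges = [
    (date(2003, 1, 1), date(2003, 1, 7)),
    (date(2003, 1, 8), date(2003, 1, 21)),
    (date(2003, 1, 15), date(2003, 1, 26)),
    (date(2003, 1, 29), date(2003, 3, 23)),
    (date(2003, 3, 24), date(2003, 3, 28)),
    (date(2003, 2, 23), date(2003, 4, 15)),
    (date(2003, 4, 18), date(2003, 4, 25)),
]

def missing_dates(ranges):
    """Generate every day from min(FromDate) to max(ToDate), then keep
    the days covered by no range -- the NOT EXISTS anti-join."""
    lo = min(f for f, _ in ranges)
    hi = max(t for _, t in ranges)
    day, out = lo, []
    while day <= hi:
        if not any(f <= day <= t for f, t in ranges):
            out.append(day)
        day += timedelta(days=1)
    return out
```

On the sample rows this yields 27-jan-03, 28-jan-03, 16-apr-03 and 17-apr-03, as requested.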

Traversing a GAP in data

Robert, March 23, 2005 - 1:41 pm UTC

Tom,

First I would like to thank you for all that you do for the Oracle community.

We have a requirement to report the average of 4 values however, the twist is that we need to detect gaps in the data.

To illustrate this I will use the demo table salgrade with the following modifications.

update SALGRADE set hisal = 5000 where grade = 5;
insert into salgrade values (7,6001,6500);
insert into salgrade values (8,6501,7000);
insert into salgrade values (9,7001,8000);
insert into salgrade values (10,8001,9999);
commit;

Here is what I am starting with:

select * from SALGRADE order by 1;

GRADE LOSAL HISAL
---------- ---------- ----------
1 700 1200
2 1201 1400
3 1401 2000
4 2001 3000
5 3001 5000

7 6001 6500
8 6501 7000
9 7001 8000
10 8001 9999

Since the users will control the starting point for this calculation, I will use 4 for this example.

variable b1 varchar2;

exec :b1 :=4;

select grade,
avg_losal
from (select grade,
avg(losal) over (order by grade
range between current row
and 4 following) avg_losal
from salgrade)
where grade = :b1
/


GRADE AVG_LOSAL
---------- ----------
4 4376

Since there is a gap in the data with grade 6 missing, we need to report the following.

select (2001 + 3001 + 6001) / 4 from dual;


(2001+3001+6001)/4
------------------
2750.75

How can I do this in SQL?

Thanks,
Robert D. Ware



Tom Kyte
March 23, 2005 - 6:19 pm UTC

really, by 4?


ops$tkyte@ORA9IR2> select grade, sum(losal) over (order by grade range between current row and 3 following)/4
  2    from salgrade;
 
     GRADE SUM(LOSAL)OVER(ORDERBYGRADERANGEBETWEENCURRENTROWAND3FOLLOWING)/4
---------- -----------------------------------------------------------------
         1                                                           1325.75
         2                                                              1901
         3                                                           1600.75
         4                                                           2750.75
         5                                                           3875.75
         7                                                              6876
         8                                                           5375.75
         9                                                            3750.5
        10                                                           2000.25
 
9 rows selected.


sum up the rows (you were over by 1 in your range I think) and divide by 4.
 
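The point about RANGE (rather than ROWS) is what makes the gap work: the window is defined by grade values, so a missing grade simply contributes nothing to the sum. A Python sketch of the same computation (my illustration, using the modified SALGRADE data):

```python
# (grade, losal) rows from the modified SALGRADE table; grade 6 is missing
salgrade = [(1, 700), (2, 1201), (3, 1401), (4, 2001), (5, 3001),
            (7, 6001), (8, 6501), (9, 7001), (10, 8001)]

def gap_aware_avg(rows, grade, width=4):
    """sum(losal) over a value-based window of `width` grades starting
    at `grade`, divided by `width` -- like RANGE BETWEEN CURRENT ROW
    AND width-1 FOLLOWING, then /width.  Missing grades add nothing."""
    total = sum(losal for g, losal in rows if grade <= g <= grade + width - 1)
    return total / width
```

For grade 4 the window covers grades 4-7, grade 6 is absent, and the result is (2001 + 3001 + 6001) / 4 = 2750.75, matching the SQL output.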

What about the SYS_OVER_ Functions

Bob Lyon, March 23, 2005 - 1:54 pm UTC

While looking for "new stuff" in (9iR2) package STANDARD I found

FUNCTION sys_over__dd(t1 date,t2 date,
t3 date,t4 date) RETURN BOOLEAN;

This is undocumented, as near as I can determine,
but a simple test case (below) seems to show the fundamentals.
Are these SYS_OVER_ functions "safe" to use,
or is their behavior subject to change in future releases?

SET LINESIZE 100

DECLARE
NOW DATE := TRUNC(SYSDATE);
TOMORROW DATE := NOW + 1;

PROCEDURE TEST (Start1 IN DATE, End1 IN DATE, Start2 IN DATE, End2 IN DATE)
IS
BEGIN

DBMS_OUTPUT.PUT( TO_CHAR(Start1, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(End1, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(Start2, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(End2, 'DD-MON-YY HH24:MI:SS ')
);

IF SYS_OVER__DD(Start1, End1, Start2, End2) THEN
DBMS_OUTPUT.PUT_LINE('Overlaps');
ELSE
DBMS_OUTPUT.PUT_LINE('Does Not Overlap');
END IF;

END TEST;

BEGIN

TEST(NOW, TOMORROW, NOW, TOMORROW);
TEST(NOW, TOMORROW, NOW+3, TOMORROW+3);
TEST(NOW, TOMORROW, TOMORROW, TOMORROW+3);
TEST(NOW, TOMORROW, TOMORROW-(1/86400), TOMORROW+3);
TEST(TOMORROW, NOW, NOW, TOMORROW); -- from/to dates don't care about order?

END;
/

23-MAR-05 00:00:00 24-MAR-05 00:00:00 23-MAR-05 00:00:00 24-MAR-05 00:00:00 Overlaps
23-MAR-05 00:00:00 24-MAR-05 00:00:00 26-MAR-05 00:00:00 27-MAR-05 00:00:00 Does Not Overlap
23-MAR-05 00:00:00 24-MAR-05 00:00:00 24-MAR-05 00:00:00 27-MAR-05 00:00:00 Does Not Overlap
23-MAR-05 00:00:00 24-MAR-05 00:00:00 23-MAR-05 23:59:59 27-MAR-05 00:00:00 Overlaps
24-MAR-05 00:00:00 23-MAR-05 00:00:00 23-MAR-05 00:00:00 24-MAR-05 00:00:00 Overlaps

Tom Kyte
March 23, 2005 - 6:21 pm UTC

caveat emptor

of course it can change. anything not documented is subject to change and if it doesn't work as expected in all cases - well, maybe it is not supposed to.


suggest you write your own.

Problem with compare hours

Ora_User, April 20, 2005 - 6:17 pm UTC

Hi Tom.

I have a problem to design the next situation and I need your help.

I must create a table like this:

create table test (
test_id number(10),
start_hour date,
stop_hour date
);

And then I insert data like this:

insert into test values (1,to_date('080000','hh24miss'), to_date('165959','hh24miss'));

insert into test values (1,to_date('170000','hh24miss'), to_date('005959','hh24miss'));

insert into test values (1,to_date('010000','hh24miss'), to_date('075959','hh24miss'));

Then I have 3 intervals of hours that I have to select from, depending on an hour that I receive in a stored procedure. Something like this:

create or replace procedure test_st( p_date in varchar2,
p_test_id out number )
as
begin
select test_id
into p_test_id
from test
where to_date(p_date,'hh24miss') between
start_hour and stop_hour;
return;
end;

This sp does not work because the fields start_hour and stop_hour hold a "date", not just a "time".

I can define start_hour and stop_hour as varchar2, but this causes problems:

1. I get wrong results if I compare directly (not possible!)

2. How can Oracle see that '010000' (am) is between '220000' (pm) and '080000' (am), if these are varchar2?

3. If I store the hours as varchar2, I can compare the fields if I transform start_hour and stop_hour to dates, but ... is that good for performance? I need excellent performance for this query.

Can you help me to find a solution for this problem?

Thank you so much.
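The thread doesn't answer this one, but a common approach for intervals that may wrap past midnight is a two-case comparison on the time-of-day alone. A hypothetical sketch (mine, not from the thread; fixed-width 'hhmmss' strings compare correctly as text, so no date conversion is needed for the containment test):

```python
def in_interval(t, start, stop):
    """t, start, stop are fixed-width 'hhmmss' strings.
    Handles intervals that wrap past midnight (e.g. 170000..005959)."""
    if start <= stop:
        return start <= t <= stop
    return t >= start or t <= stop   # interval wraps midnight
```

So '010000' (1 am) is correctly inside the wrapped interval '220000'..'080000', which is case 2 above.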

Modified time overlap question

steve, July 29, 2005 - 10:59 pm UTC

Regarding the question "Hour wise Max number of logged in Users" asked on December 19, 2004.

1) if we get rid of "hour wise" from the question. i.e.,
find out max number of logged in users from a login history table and the range which has max number of logged in users.

2) since removing "hour wise" and assuming login history table holds more than one year's data.

then a table called Seconds (more than 31 million rows) would have to be created? Is there any way to do it without a Seconds table, but still in a single SQL statement?

Thanks!


ABout days

A reader, August 30, 2005 - 2:58 pm UTC

Tom,

I have a few dates as follows, can you please tell me if there is a better way to write them. Thanks!

1 .....WHERE TRUNC(DATE_IN, 'DD') = TRUNC(date_IN, 'DD')

2. SELECT TRUNC(SYSDATE) - TRUNC(IN_Date) INTO NODays FROM dual

3... AND IN_date = trunc(SYSDATE)

4. RETURN GREATEST((trunc(v_date) - trunc(date_in)), 0);


Tom Kyte
August 30, 2005 - 3:23 pm UTC

1) umm,

where date_in is not null

2) dandy (fine)

3) good

4) fine.

thanks!!!!

A reader, August 30, 2005 - 3:25 pm UTC


as usual, priceless

A reader, September 07, 2005 - 4:38 pm UTC


Time lag between pairs of rows

Tracy, December 05, 2005 - 9:38 am UTC

I have a table where rows are inserted in pairs, with a variable time lag between the two rows of a pair. A column records the time of the insert. The first insert in the pair has the startyn column set to 1, the second has it set to 0. The primary key comes from an ascending sequence, and a customer number is the same for each pair of rows. How can I write a query that returns the IDs of each pair where the time difference between the two inserts is greater than a specified time, say 10 seconds?

SQL>  create table tb1 (id number, startyn number, createdat date, customer number);

Table created.

SQL> create sequence seq1;

Sequence created.

SQL> insert into tb1 values (seq1.nextval,1,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,3);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,3);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,4);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,4);

SQL> select id,startyn,to_char(createdat,'hh24miss'),customer from tb1 order by 1;

        ID    STARTYN TO_CHA   CUSTOMER
---------- ---------- ------ ----------
         1          1 150644          1
         2          0 150709          1
         3          1 150715          1
         4          0 150719          1
         5          1 150731          2
         6          0 150740          2
         7          1 150759          2
         8          0 150759          2
         9          1 150822          3
        10          0 150827          3
        11          1 150834          4
        12          0 150856          4

Given the above data I want a query that will return IDs 1,2,5,6,11,12 (if we are looking for a 10-second or greater difference).
This is on Oracle 8i. 

Tom Kyte
December 06, 2005 - 4:47 am UTC

I can do this however....

a) is the data perfect (always 1/0 or will some 0's be missing)

b) supply a test case setup that matches your example. sysdate is going to be sort of a "constant" thing here.

Test data

Tracy, December 10, 2005 - 12:20 pm UTC

Sorry for delay in replying.

a) We can assume that the data will always be perfect, i.e. no zero missing (although I would be interested to see a solution for where there were zeros missing).

b) Here is the test case that matches the above example: (I'm afraid the inserts wrap when pasted into your site)

SQL> drop table tb1;

Table dropped.

SQL>  create table tb1 (id number, startyn number, createdat date, customer number);

Table created.

SQL> create sequence seq1;

Sequence created.

SQL> set feed off
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:06:44','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:09','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:15','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:19','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:31','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:40','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:59','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:59','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:08:22','dd-mon-yyyy hh24:mi:ss'),3);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:08:27','dd-mon-yyyy hh24:mi:ss'),3);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:08:34','dd-mon-yyyy hh24:mi:ss'),4);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:08:56','dd-mon-yyyy hh24:mi:ss'),4);
SQL> select id,startyn,to_char(createdat,'hh24miss'),customer from tb1 order by 1;

        ID    STARTYN TO_CHA   CUSTOMER
---------- ---------- ------ ----------
         1          1 150644          1
         2          0 150709          1
         3          1 150715          1
         4          0 150719          1
         5          1 150731          2
         6          0 150740          2
         7          1 150759          2
         8          0 150759          2
         9          1 150822          3
        10          0 150827          3
        11          1 150834          4
        12          0 150856          4
SQL>  

Tom Kyte
December 11, 2005 - 2:51 am UTC

ops$tkyte@ORA9IR2> select id, next_id, customer,
  2         createdat, next_createdat,
  3         (next_createdat-createdat) * 24 * 60 * 60 sec_diff
  4    from (
  5  select id, startyn, createdat, customer,
  6         lead(id) over (partition by customer order by createdat) next_id,
  7         lead(createdat) over (partition by customer order by createdat) next_createdat,
  8         lead(startyn) over (partition by customer order by createdat) next_startyn
  9    from tb1
 10         )
 11   where startyn = 1
 12   order by customer, createdat
 13  /

        ID    NEXT_ID   CUSTOMER CREATEDA NEXT_CRE   SEC_DIFF
---------- ---------- ---------- -------- -------- ----------
         1          2          1 15:06:44 15:07:09         25
         3          4          1 15:07:15 15:07:19          4
         5          6          2 15:07:31 15:07:40          9
         7          8          2 15:07:59 15:07:59          0
         9         10          3 15:08:22 15:08:27          5
        11         12          4 15:08:34 15:08:56         22

6 rows selected.

ops$tkyte@ORA9IR2> select id, next_id, customer,
  2         createdat, next_createdat,
  3         (next_createdat-createdat) * 24 * 60 * 60 sec_diff
  4    from (
  5  select id, startyn, createdat, customer,
  6         lead(id) over (partition by customer order by createdat) next_id,
  7         lead(createdat) over (partition by customer order by createdat) next_createdat,
  8         lead(startyn) over (partition by customer order by createdat) next_startyn
  9    from tb1
 10         )
 11   where startyn = 1
 12     and (next_createdat-createdat) * 24 * 60 * 60 >= 10
 13   order by customer, createdat
 14  /

        ID    NEXT_ID   CUSTOMER CREATEDA NEXT_CRE   SEC_DIFF
---------- ---------- ---------- -------- -------- ----------
         1          2          1 15:06:44 15:07:09         25
        11         12          4 15:08:34 15:08:56         22


(5 and 6 are only 9 seconds apart...) 
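To see why the LEAD-based query pairs rows correctly, here is a small Python analogue (illustrative only, not Oracle code, using the same test data as above): sort within each customer, look at the row following each startyn = 1 row, and keep the pair when the gap meets the threshold.

```python
from datetime import datetime

# (id, startyn, createdat, customer) -- mirrors the tb1 test case above
rows = [
    (1, 1, datetime(2005, 12, 10, 15, 6, 44), 1),
    (2, 0, datetime(2005, 12, 10, 15, 7, 9), 1),
    (3, 1, datetime(2005, 12, 10, 15, 7, 15), 1),
    (4, 0, datetime(2005, 12, 10, 15, 7, 19), 1),
    (5, 1, datetime(2005, 12, 10, 15, 7, 31), 2),
    (6, 0, datetime(2005, 12, 10, 15, 7, 40), 2),
    (7, 1, datetime(2005, 12, 10, 15, 7, 59), 2),
    (8, 0, datetime(2005, 12, 10, 15, 7, 59), 2),
    (9, 1, datetime(2005, 12, 10, 15, 8, 22), 3),
    (10, 0, datetime(2005, 12, 10, 15, 8, 27), 3),
    (11, 1, datetime(2005, 12, 10, 15, 8, 34), 4),
    (12, 0, datetime(2005, 12, 10, 15, 8, 56), 4),
]

def pairs_over_gap(rows, min_seconds=10):
    """Emit (start_id, end_id) for pairs whose gap is >= min_seconds.

    Mimics LEAD(...) OVER (PARTITION BY customer ORDER BY createdat):
    sort within each customer and look at the next row after each
    startyn = 1 row.
    """
    by_cust = {}
    for r in sorted(rows, key=lambda r: (r[3], r[2], r[0])):
        by_cust.setdefault(r[3], []).append(r)
    out = []
    for cust_rows in by_cust.values():
        for cur, nxt in zip(cust_rows, cust_rows[1:]):
            if cur[1] == 1 and (nxt[2] - cur[2]).total_seconds() >= min_seconds:
                out.append((cur[0], nxt[0]))
    return out

print(pairs_over_gap(rows))   # [(1, 2), (11, 12)]
```

As in the SQL above, rows 5 and 6 drop out because they are only 9 seconds apart.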

Excellent

TT, December 12, 2005 - 7:38 am UTC

Perfect. Thank you very much.

Value between two columns

Tracy, April 15, 2006 - 10:00 am UTC

I have a problem similar to Christo Kutrovsky above.

We have a table of IP Address ranges. The table looks like:

IPADDRESSES(
IPFROM NUMBER,
IPTO NUMBER,
COUNTRYCODE VARCHAR2(2),
DESCRIPTION VARCHAR2(100))

To give you an idea of the data, here are a few rows:
SELECT * FROM IPADDRESSES WHERE ROWNUM < 6;

50331648 50331903 US UNITED STATES
50331904 50332671 US UNITED STATES
50332672 50332927 US UNITED STATES
50332928 50333695 US UNITED STATES
50333696 50333951 US UNITED STATES
50333952 50334719 US UNITED STATES

However, the table has 1.8 million records and we query it to find what country an IP address belongs to:

SELECT ip.description
FROM IPADDRESSES ip
WHERE :b1 between ip.ipfrom AND ip.ipto

I cannot find an efficient way of indexing this table.

I have tried indexing this table in various different ways but no one way seems suitable. The query response time varies from a few milliseconds to about 30 seconds (which is unacceptable) depending upon the value of the variable in the query. I would have thought that this kind of scenario is fairly common - surely there must be a simple way of indexing it such that it will produce similar performance regardless of the value of the variable in the query?

Tom Kyte
April 15, 2006 - 1:19 pm UTC

so, do these ranges in your table overlap or not?

if not, you want the first record where the ipto is >= :b1 don't you?

select *
from (select * from ipaddresses where ipto >= :b1 order by ipto)
where rownum = 1;

and you would need to verify that the ipfrom was valid (eg: gaps would be a problem)

Gaps

Tracy, April 19, 2006 - 10:48 am UTC

Well, the ranges do not overlap but often there are gaps between the ipto of one row and the ipfrom of the next, so your query as it stands may return the wrong answer.
Here is a trimmed down version of the table to illustrate:

tupmant@dev920> SET VERIFY OFF
tupmant@dev920> SELECT * FROM IPADDRESSES;

IPFROM IPTO COUNTRYCODE DESCRIPTION
---------- ---------- --------------------- ------------------
400052000 400052999 US UNITED STATES
400157000 400157999 US UNITED STATES
400158000 400158999 US UNITED STATES
400166000 400166999 US UNITED STATES
400167000 400167999 US UNITED STATES
400169000 400169158 US UNITED STATES
400169159 400169159 US UNITED STATES
400169160 400169999 US UNITED STATES
400179000 400179999 US UNITED STATES
400181000 400181999 US UNITED STATES
400183000 400183999 US UNITED STATES
400209000 400209999 US UNITED STATES
400212000 400212999 US UNITED STATES
400218000 400218999 US UNITED STATES
400226000 400226999 US UNITED STATES
400229000 400229999 US UNITED STATES
400231000 400231999 US UNITED STATES

17 rows selected.

tupmant@dev920> SELECT ip.description
2 FROM IPADDRESSES ip
3 WHERE &ip between ip.ipfrom AND ip.ipto;
Enter value for ip: 400053123

no rows selected

tupmant@dev920> select description
2 from (select * from ipaddresses where ipto >= &IP order by ipto)
3 where rownum = 1;
Enter value for ip: 400053123

DESCRIPTION
------------------------------
UNITED STATES

So instead maybe this would do it and still be pretty efficient if I have an index on ipto and another on ipfrom?

select description
from (select * from ipaddresses where ipto >= &IP order by ipto)
where rownum = 1
intersect
select description
from (select * from ipaddresses where ipfrom <= &IP order by ipfrom desc)
where rownum = 1;

Tom Kyte
April 19, 2006 - 11:07 am UTC

did you test it out to see the work performed?
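The access pattern suggested above (find the first row with ipto >= :b1, then verify that ipfrom still covers the value, to handle gaps) can be sketched outside the database. This Python sketch is purely illustrative, with hypothetical sample values; a sorted list plus bisect stands in for the index range scan.

```python
import bisect

# (ipfrom, ipto, description) ranges: non-overlapping, sorted by ipto.
# Hypothetical sample values in the spirit of the table above.
ranges = [
    (400052000, 400052999, 'UNITED STATES'),
    (400157000, 400157999, 'UNITED STATES'),
    (400158000, 400158999, 'UNITED STATES'),
]
iptos = [r[1] for r in ranges]

def lookup(ip):
    """First range whose ipto >= ip, then verify ipfrom <= ip (gap check).

    One probe on the ipto ordering plus one extra comparison,
    instead of evaluating BETWEEN against every row.
    """
    i = bisect.bisect_left(iptos, ip)
    if i < len(ranges) and ranges[i][0] <= ip:
        return ranges[i][2]
    return None   # ip falls into a gap, or past the last range

print(lookup(400052500))   # UNITED STATES
print(lookup(400053123))   # None -- in a gap, where BETWEEN finds no row
```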

There is an easier basic SQL method for comparing date ranges

Greg Smith, April 19, 2006 - 7:28 pm UTC

Hi Tom,

I wanted to share a simple technique I use for comparing date ranges, assuming you want the same all-inclusive overlap results as your example where clause proffered.

Quoted here is part of the SQL statement from your original followup at the beginning of this thread. The SQL statement was noted to include any full or partial overlaps, including subsets and supersets:

"
where ( c_beginning between p_beginning and p_ending -- rule 1
OR c_ending between p_beginning and p_ending -- rule 2
OR ( c_beginning < p_beginning and p_ending < c_ending ) ); -- rule 3
"

The technique I use is a simple 2-part AND-ed comparison:

where p_beginning <= c_ending AND c_beginning <= p_ending

This shorter set of conditions will accomplish the same results (assuming c_beginning and c_ending are both not null -- wrapping both with NVL functions could be used to ensure this in cases that called for it). I also find it more readable, although one of my associates prefers the readability of your method.

Because of its relative brevity, I find that this technique extends more gracefully to situations where you must join 3 or 4 date ranges together in a single SQL statement, which is often necessary with "date-tracked" apps like Oracle's own HR apps module.

Hopefully, this will prove helpful to someone.

Thanks for the great site/books!
-Greg
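Greg's claim can be checked mechanically. The following Python sketch (illustrative, not Oracle SQL) brute-forces every well-formed pair of small integer ranges and confirms the two-term form agrees with the three-rule predicate from the original answer:

```python
from itertools import product

def three_rule(cb, ce, pb, pe):
    # the original three-rule overlap predicate from the top of this thread
    return (pb <= cb <= pe) or (pb <= ce <= pe) or (cb < pb and pe < ce)

def two_term(cb, ce, pb, pe):
    # Greg's shorter form: ranges overlap iff each starts before the other ends
    return pb <= ce and cb <= pe

# brute-force every well-formed pair of ranges over a small integer domain
for cb, ce, pb, pe in product(range(6), repeat=4):
    if cb <= ce and pb <= pe:
        assert three_rule(cb, ce, pb, pe) == two_term(cb, ce, pb, pe)
print('predicates agree on all well-formed ranges')
```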

Question

A reader, November 16, 2006 - 6:47 pm UTC

Tom, is there any way (in SQL) you can assign a sequential number to a set of rows (dates) given a time bucket?

For example, say you have this data:

01-FEB-07
01-APR-07
01-JAN-08
01-MAR-08
01-JUN-08
01-FEB-09
01-JAN-10
01-MAR-10
01-APR-10
01-JAN-11

And I tell you: assign a sequential value on a 9-month basis.

Output should be

01-FEB-07 1
01-APR-07 1
01-JAN-08 2
01-MAR-08 2
01-JUN-08 2
01-FEB-09 3
01-JAN-10 4
01-MAR-10 4
01-APR-10 4
01-JAN-11 5

Every time that a bucket is filled (9 months), it should start counting again at the next month.

Is it possible?

Let me know, thanks

Tom Kyte
November 17, 2006 - 3:15 am UTC

i very much do not understand your numbering system here. You'll have to supply a bit more explanation.

RE: Question

A reader, November 17, 2006 - 9:24 am UTC

Hi again Tom, sorry for not being that clear.

In the output, you see numbers 1, 2 and so on. These numbers are linked during a 9-month period. For example, between the first row and the second row there are 2 months, but between the first row and the third row there are 11 months, which is over the 9-month period, so I must assign a new group number (2). Likewise, between the third row and the sixth row there are 11 months, so the 9-month period is over again, and so on. Basically, the idea is to group by a 9-month period, but after each group, reset this period at the beginning of the next group.

Hope I have been a little bit clearer this time.

Thank you again.

Tom Kyte
November 19, 2006 - 3:56 pm UTC

ahh, read
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html

"analytics to the rescue, again"

same problem (except in the article it was 3 seconds; you can adjust, I am sure)
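The carry-forward idea from that article can be sketched procedurally. This Python sketch (illustrative only, using the dates from the question above) starts a new group whenever a row falls more than 9 months after the first row of the current group:

```python
from datetime import date

dates = [date(2007, 2, 1), date(2007, 4, 1), date(2008, 1, 1), date(2008, 3, 1),
         date(2008, 6, 1), date(2009, 2, 1), date(2010, 1, 1), date(2010, 3, 1),
         date(2010, 4, 1), date(2011, 1, 1)]

def months_between(a, b):
    return (b.year - a.year) * 12 + (b.month - a.month)

def bucket(dates, width_months=9):
    """Assign group numbers that reset whenever a date falls more than
    width_months after the first date of the current group (the
    'carry the group start forward' trick from the article)."""
    groups, grp, start = [], 0, None
    for d in sorted(dates):
        if start is None or months_between(start, d) > width_months:
            grp += 1
            start = d   # a new bucket begins at this row
        groups.append((d, grp))
    return groups

for d, g in bucket(dates):
    print(d, g)
```

This reproduces the 1, 1, 2, 2, 2, 3, 4, 4, 4, 5 numbering the poster asked for.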

Very useful

bc, January 29, 2008 - 4:19 pm UTC

Tom,

Below is a condensed example. The author intended to extract a sum of quantities ordered for the month and year of a given date. pi_date is of type DATE and is passed as a parameter to the function where the query below resides.

Is there a better way to write the query below, keeping performance in mind?

select sum(order_qty)
from orders
where to_char(order_date, 'YYYYMM') = to_char(nvl(pi_date, sysdate), 'YYYYMM')

The orders table does not have any function based indexes on the order_date column and getting one is out of the question

The orders table is defined as
create table orders
(
order_id number,
order_date date,
order_qty number
);

Thanks


Tom Kyte
January 30, 2008 - 9:42 am UTC

 select      sum(order_qty)
  from        orders
  where       to_char(order_date, 'YYYYMM') = to_char(nvl(pi_date, sysdate), 
'YYYYMM')

ugh,

don't convert dates to strings....

In fact, if order_date is INDEXED and we should use an index (eg: you have 7 years of orders online, so you have 7x12 = 84 months of data and we get 1/84th of the table - an index might make sense), use:
where order_date >= nvl(trunc(pi_date,'mm'),trunc(sysdate,'mm'))
  and order_date < add_months( nvl(trunc(pi_date,'mm'),trunc(sysdate,'mm')), 1 )



consider these three ways of doing it (a couple hundred thousand dates were compared here)

select count(*) from t where to_char(created,'yyyymm')='200601'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.97       0.97          0       1443          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.99       0.99          0       1445          0           1
********************************************************************************
select count(*) from t where trunc(created,'mm')=to_date('200601','yyyymm')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.47       0.47          0       1442          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.48       0.48          0       1443          0           1
********************************************************************************
select count(*) from t where created >= to_date('200601','yyyymm')
and created < add_months(to_date('200601','yyyymm'),1)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.07       0.07          0       1442          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.08       0.08          0       1443          0           1


see the cpu go up when you twiddle with the date column? You don't need to touch it.
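The reason the range form wins is that it leaves order_date untouched, so an ordered structure (an index) can be probed directly instead of converting every row. A rough Python analogue (illustrative, with made-up dates) of the difference:

```python
from datetime import date, timedelta
import bisect

# hypothetical sorted "index" of order dates: every third day from 01-jan-2006
created = [date(2006, 1, 1) + timedelta(days=i) for i in range(0, 400, 3)]

def count_with_to_char(year, month):
    # analogue of to_char(order_date,'yyyymm') = :x -- must touch every row
    return sum(1 for d in created if (d.year, d.month) == (year, month))

def count_with_range(lo, hi):
    # analogue of order_date >= :lo and order_date < :hi -- two probes
    # into the sorted order, which is what an index range scan does
    return bisect.bisect_left(created, hi) - bisect.bisect_left(created, lo)

jan, feb = date(2006, 1, 1), date(2006, 2, 1)
print(count_with_to_char(2006, 1), count_with_range(jan, feb))   # same count
```

Both give the same answer; only the range form avoids touching the column on every row.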

Awesome as always

bc, January 30, 2008 - 3:19 pm UTC

Thank you very much for your detailed explanation.

Question on getting difference between dates

A reader, May 15, 2008 - 1:37 pm UTC

Hi:
I would like to do the following in sql. I think I have to use analytic functions
but I don't know how to go about it.

create table temp(id number not null, begin_date date not null, end_date date null);

insert into temp values (1, to_date('01/15/2008','mm/dd/yyyy'),null);

insert into temp values (2, to_date('03/20/2007','mm/dd/yyyy'), to_date('05/15/2008','mm/dd/yyyy'));



I want to know for a given range of dates how many days are there in each month for the entire year.

For example, if I pass the date range of 01/01/2008 and 12/31/2008 for row with id = 1, I want the output to be something like this

2008
Jan Feb Mar
15 29 31
Apr May Jun
30 31 30
Jul Aug Sep
31 31 30
Oct Nov Dec Total
31 30 31 351

if I pass the date range of 01/01/2007 and 12/31/2008 for row with id = 2, I want the output to be something like this

2008
Jan Feb Mar
31 29 31
Apr May Jun
30 15
Jul Aug Sep
Oct Nov Dec Total
135

2007
Jan Feb Mar
11
Apr May Jun
30 31 30
Jul Aug Sep
31 31 30
Oct Nov Dec Total
31 30 31 286

Thanks!
Tom Kyte
May 19, 2008 - 1:05 pm UTC

why does the first one have 15 days for JAN-2008

your start date was 15-jan-2008

ops$tkyte%ORA11GR1> select last_day(to_date('15-jan-2008'))-to_date('15-jan-2008') from dual;


TO_DATE('31-JAN-2008')-TO_DATE('15-JAN-2008')
---------------------------------------------
                                           16


so, you must not count the first day as being in the range... but


and the second one have 11 days for MAR-2007

ops$tkyte%ORA11GR1> select last_day(to_date('20-mar-2007'))-to_date('20-mar-2007') from dual;

LAST_DAY(TO_DATE('20-MAR-2007'))-TO_DATE('20-MAR-2007')
-------------------------------------------------------
                                                     11


now it is???



I'll presume it is included - actually, I don't agree with either of them - if start day is 15-jan and jan has 31 days then there are 17 days.

If start day is 20-mar and mar has 31 days then there are 12 days.

If that is wrong, you can fix it yourself :) It is just math


further I'll assume

a) your primary key is id
b) you will always include the primary key in the query (seems reasonable, you seem to indicate that)
c) the days are provide as begin of month for first date input and last day of month for end of input (eg: you only pass MON-YYYY really)


ops$tkyte%ORA11GR1> variable bdate varchar2(30)
ops$tkyte%ORA11GR1> variable edate varchar2(30)
ops$tkyte%ORA11GR1> variable id number
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec :bdate := '01-2008'; :edate := '12-2008'; :id := 1

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> with data
  2  as
  3  (select add_months( to_date(:bdate,'fxmm-yyyy'), level-1 ) dt,
  4          months_between( to_date(:edate,'fxmm-yyyy'), to_date(:bdate,'fxmm-yyyy') )+1 btwn
  5     from dual
  6  connect by level <= months_between( to_date(:edate,'fxmm-yyyy'), to_date(:bdate,'fxmm-yyyy'))+1
  7  )
  8  select to_char(dt,'yyyy'), dt, sum(days)
  9    from (
 10  select case when trunc(begin_date,'mm') > dt OR last_day(end_date) < last_day(dt) then 0
 11              when trunc(begin_date,'mm') = trunc(dt,'mm') then last_day(dt)-begin_date+1
 12              when trunc(end_date,'mm') = trunc(dt,'mm') then end_date - trunc(dt,'mm')+1
 13              else to_number(to_char(last_day(dt),'dd'))
 14          end days,
 15         data.*,
 16         x.*
 17    from data,
 18        (select begin_date begin_date, nvl(end_date,last_day(to_date(:edate,'fxmm-yyyy'))) end_date from temp where id = :id) x
 19        )
 20  group by rollup ( to_char(dt,'yyyy'), dt)
 21  /

TO_C DT         SUM(DAYS)
---- --------- ----------
2008 01-JAN-08         17
2008 01-FEB-08         29
2008 01-MAR-08         31
2008 01-APR-08         30
2008 01-MAY-08         31
2008 01-JUN-08         30
2008 01-JUL-08         31
2008 01-AUG-08         31
2008 01-SEP-08         30
2008 01-OCT-08         31
2008 01-NOV-08         30
2008 01-DEC-08         31
2008                  352
                      352

14 rows selected.

ops$tkyte%ORA11GR1> exec :bdate := '01-2007'; :edate := '12-2008'; :id := 2

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> /

TO_C DT         SUM(DAYS)
---- --------- ----------
2007 01-JAN-07          0
2007 01-FEB-07          0
2007 01-MAR-07         12
2007 01-APR-07         30
2007 01-MAY-07         31
2007 01-JUN-07         30
2007 01-JUL-07         31
2007 01-AUG-07         31
2007 01-SEP-07         30
2007 01-OCT-07         31
2007 01-NOV-07         30
2007 01-DEC-07         31
2007                  287
2008 01-JAN-08         31
2008 01-FEB-08         29
2008 01-MAR-08         31
2008 01-APR-08         30
2008 01-MAY-08         15
2008 01-JUN-08          0
2008 01-JUL-08          0
2008 01-AUG-08          0
2008 01-SEP-08          0
2008 01-OCT-08          0
2008 01-NOV-08          0
2008 01-DEC-08          0
2008                  136
                      423

27 rows selected.

ops$tkyte%ORA11GR1>
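The clipping logic in the CASE expression can be restated procedurally. This Python sketch (illustrative only) reproduces the same per-month day counts: full months count their length, while the first and last months are clipped at the begin and end dates.

```python
from datetime import date
import calendar

def days_per_month(begin, end):
    """Days of [begin, end] (inclusive) falling in each month between them."""
    out = {}
    y, m = begin.year, begin.month
    while (y, m) <= (end.year, end.month):
        month_len = calendar.monthrange(y, m)[1]
        first = begin.day if (y, m) == (begin.year, begin.month) else 1
        last = end.day if (y, m) == (end.year, end.month) else month_len
        out[(y, m)] = last - first + 1
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return out

d = days_per_month(date(2008, 1, 15), date(2008, 12, 31))
print(d[(2008, 1)], sum(d.values()))   # 17 352 -- matching the rollup above
```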


Why not use this syntax to do this?

ken.maricle, July 29, 2008 - 2:01 pm UTC

It is undocumented as far as I can tell, but seems to work.

with test_data as
(select date '2007-01-01' from_dt, date '2007-12-01' to_dt from dual union all
select date '2007-04-01' from_dt, date '2007-08-01' to_dt from dual union all
select date '2007-07-01' from_dt, date '2007-08-01' to_dt from dual union all
select date '2007-10-01' from_dt, date '2008-02-01' to_dt from dual union all
select date '2008-01-01' from_dt, date '2008-12-01' to_dt from dual)
select *
from test_data t1,
test_data t2
where (t1.from_dt,t1.to_dt) overlaps (t2.from_dt,t2.to_dt)
/

FROM_DT TO_DT FROM_DT TO_DT
--------- --------- --------- ---------
01-JAN-07 01-DEC-07 01-JAN-07 01-DEC-07
01-APR-07 01-AUG-07 01-JAN-07 01-DEC-07
01-JUL-07 01-AUG-07 01-JAN-07 01-DEC-07
01-OCT-07 01-FEB-08 01-JAN-07 01-DEC-07
01-JAN-07 01-DEC-07 01-APR-07 01-AUG-07
01-APR-07 01-AUG-07 01-APR-07 01-AUG-07
01-JUL-07 01-AUG-07 01-APR-07 01-AUG-07
01-JAN-07 01-DEC-07 01-JUL-07 01-AUG-07
01-APR-07 01-AUG-07 01-JUL-07 01-AUG-07
01-JUL-07 01-AUG-07 01-JUL-07 01-AUG-07
01-JAN-07 01-DEC-07 01-OCT-07 01-FEB-08
01-OCT-07 01-FEB-08 01-OCT-07 01-FEB-08
01-JAN-08 01-DEC-08 01-OCT-07 01-FEB-08
01-OCT-07 01-FEB-08 01-JAN-08 01-DEC-08
01-JAN-08 01-DEC-08 01-JAN-08 01-DEC-08

15 rows selected.
Tom Kyte
August 01, 2008 - 10:26 am UTC

do not use this, just don't.

we have no idea what it does - other than empirical observation - which is sort of like someone saying:

"you don't need order by if you group by, I group by a,b,c and it always comes back sorted by a,b,c"

only group by a,b,c doesn't have to sort, never did, never will HAVE to sort. It used to sort frequently in the past, it almost never does now. Using that syntax is a bug waiting to happen big time.

Continuous dates

Amrit, February 26, 2009 - 10:21 pm UTC

Hi Tom,

I wrote the following query to get continuous date ranges, but it doesn't give me the correct result. Can you please help me?

select * from mc_transaction ;

Mbr_sid From_Date To_date
1234 05/01/2008 05/22/2008
1234 05/23/2008 06/22/2008
1234 06/23/2008 07/22/2008
1234 07/23/2008 08/22/2008
1234 09/23/2008 10/22/2008
1234 12/23/2008 01/22/2009
1234 01/23/2009 02/22/2009
1234 02/23/2009 03/22/2009


Select mbr_sid, Min(from_date) Over(Partition By mbr_sid) As from_date, Max(To_date) Over(Partition By mbr_sid) As To_date
From (Select *
From (Select mbr_sid, Lead(from_date, 1, '') Over(Partition By mbr_sid Order By mbr_sid) - To_date date_diff1,
from_date - Lag(To_date, 1, '') Over(Partition By mbr_sid Order By mbr_sid) date_diff2
From (Select fnl.mbr_sid, fnl.from_date, fnl.To_date
From mc_transaction fnl)) l
Where NVL(l.date_diff1, 2) = 1
Or NVL(l.date_diff2, 2) = 1)

The query gives:
Mbr_sid From_Date To_date
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009

Desired output: these are continuous date ranges; when there is a gap, we start the next continuous range.

Mbr_sid From_Date To_date
1234 05/01/2008 08/22/2008
1234 09/23/2008 10/22/2008
1234 12/23/2008 03/22/2009



Tom Kyte
March 03, 2009 - 9:33 am UTC

no one ever gives the details on the data.


can you have multiple overlapping ranges
1234     1-jan-2009      30-jan-2009
1234    15-jan-2009      25-jan-2009
1234    16-jan-2009      20-jan-2009
1234    17-jan-2009      25-jan-2009



that makes it harder.


anyway, read this:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:529176000346581356#1517983800346182083

to the bottom of the page for some approaches.
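Assuming, as the poster stated, that the ranges never overlap, the merge itself is simple. A Python sketch (illustrative only, using the data from the question) collapses consecutive ranges whose from_date is exactly one day after the previous to_date:

```python
from datetime import date, timedelta

ranges = [
    (date(2008, 5, 1), date(2008, 5, 22)),
    (date(2008, 5, 23), date(2008, 6, 22)),
    (date(2008, 6, 23), date(2008, 7, 22)),
    (date(2008, 7, 23), date(2008, 8, 22)),
    (date(2008, 9, 23), date(2008, 10, 22)),
    (date(2008, 12, 23), date(2009, 1, 22)),
    (date(2009, 1, 23), date(2009, 2, 22)),
    (date(2009, 2, 23), date(2009, 3, 22)),
]

def merge_continuous(ranges):
    """Collapse ranges where each from_date is exactly one day after the
    previous to_date; any bigger gap starts a new 'island'.

    Assumes non-overlapping ranges, as in the question; overlapping
    ranges would need the harder treatment Tom alludes to above.
    """
    merged = []
    for frm, to in sorted(ranges):
        if merged and frm == merged[-1][1] + timedelta(days=1):
            merged[-1] = (merged[-1][0], to)   # extend the current island
        else:
            merged.append((frm, to))           # gap: start a new island
    return merged

for frm, to in merge_continuous(ranges):
    print(frm, to)
```

This yields the three islands in the desired output: 05/01/2008 to 08/22/2008, 09/23/2008 to 10/22/2008 and 12/23/2008 to 03/22/2009.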

date range with bind variables using index

Lal, July 10, 2009 - 9:46 am UTC

Tom,
I executed the following query with and without bind variables from SQL Developer.
without bind variable
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date('01-jan-2009','dd-mon-yyyy')
AND DT.START_DTM_UTC <
to_date('31-jan-2009','dd-mon-yyyy')

with bind variable
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date(:d1,'dd-mon-rrrr')
AND DT.START_DTM_UTC <
to_date(:d2,'dd-mon-rrrr')

The table has 1468738 records and
The table has four indexes
lal_idx_01 on (START_DTM_UTC) column
lal_idx_02 on (END_DTM_UTC) column
lal_idx_03 on (START_DTM_UTC,END_DTM_UTC) columnS
lal_idx_04 on (END_DTM_UTC,START_DTM_UTC) columnS
I wanted to check which index the optimiser would use (that is why there are four indexes).

I gave the same values '01-jan-2009' and '31-jan-2009' for the variables in the second query.

I noticed that the query with bind variables used the index and the query without bind variables didn't.

Before executing the queries i flushed the buffer_cache and shared_pool.

Why did it use the index with bind variables even if the same values were given, and why did it not use the index for the query without bind variables?

I am using 10g database.
I am attaching the tkprof outputs for both queries.
Please explain. Also, which one is advisable?

query without bind variable

TKPROF: Release 10.2.0.1.0 - Production on Fri Jul 10 18:05:53 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Trace file: tcc_ora_928.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date('01-jan-2009','dd-mon-yyyy')
AND DT.START_DTM_UTC <
to_date('31-jan-2009','dd-mon-yyyy')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.25       0.24          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     1601      0.62       8.71      26687      28265          0       80023
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1603      0.87       8.96      26687      28265          0       80023

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      0   TABLE ACCESS   MODE: ANALYZED (FULL) OF 'DATED_TRIPS' (TABLE)

********************************************************************************

SELECT 1
FROM
DUAL


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          0          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  FAST DUAL (cr=0 pr=0 pw=0 time=4 us)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      1   FAST DUAL




********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.25       0.24          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch     1602      0.62       8.71      26687      28265          0       80024
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1606      0.87       8.96      26687      28265          0       80024

Misses in library cache during parse: 2


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       43      0.03       0.01          0          0          0           0
Execute    285      0.03       0.07          0          0          0           0
Fetch      455      0.03       0.81        107       1064          0        1805
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      783      0.09       0.90        107       1064          0        1805

Misses in library cache during parse: 20
Misses in library cache during execute: 20

2 user SQL statements in session.
285 internal SQL statements in session.
287 SQL statements in session.
2 statements EXPLAINed in this session.
********************************************************************************
Trace file: tcc_ora_928.trc
Trace file compatibility: 10.01.00
Sort options: default

1 session in tracefile.
2 user SQL statements in trace file.
285 internal SQL statements in trace file.
287 SQL statements in trace file.
22 unique SQL statements in trace file.
2 SQL statements EXPLAINed using schema:
CTS_OWNER.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
4172 lines in trace file.
6604 elapsed seconds in trace file.

query with bind variable

TKPROF: Release 10.2.0.1.0 - Production on Fri Jul 10 18:12:55 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Trace file: tcc_ora_2440.trc
Sort options: default

********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************

SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date(:d1,'dd-mon-rrrr')
AND DT.START_DTM_UTC <
to_date(:d2,'dd-mon-rrrr')

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.23       0.23          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch     1601      0.96       7.88      26687      28265          0       80023
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1603      1.21       8.13      26687      28265          0       80023

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   MODE: ALL_ROWS
      0   TABLE ACCESS   MODE: ANALYZED (BY INDEX ROWID) OF 'DATED_TRIPS' (TABLE)
      0    INDEX   MODE: ANALYZED (RANGE SCAN) OF 'LAL_IDX_04' (INDEX)




********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.23       0.23          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch     1601      0.96       7.88      26687      28265          0       80023
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1603      1.21       8.13      26687      28265          0       80023

Misses in library cache during parse: 1


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       30      0.01       0.00          0          0          0           0
Execute    180      0.03       0.05          0          0          0           0
Fetch      318      0.01       0.54         67        700          0        1021
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      528      0.06       0.60         67        700          0        1021

Misses in library cache during parse: 17
Misses in library cache during execute: 17

1 user SQL statements in session.
180 internal SQL statements in session.
181 SQL statements in session.
1 statement EXPLAINed in this session.
********************************************************************************
Trace file: tcc_ora_2440.trc
Trace file compatibility: 10.01.00
Sort options: default

1 session in tracefile.
1 user SQL statements in trace file.
180 internal SQL statements in trace file.
181 SQL statements in trace file.
18 unique SQL statements in trace file.
1 SQL statements EXPLAINed using schema:
CTS_OWNER.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
3189 lines in trace file.
60 elapsed seconds in trace file.

Tom Kyte
July 14, 2009 - 5:19 pm UTC

what is your logic behind this:

... Before executing the queries i flushed the buffer_cache and shared_pool. ...
????

anyway


I don't think the plans were different

http://asktom.oracle.com/Misc/when-explanation-doesn-sound-quite.html

you used explain plan, it can "lie" to you.

these queries did exactly the same amount of work - they are the same:

SELECT * 
   FROM dated_TRIPS DT 
  WHERE DT.END_DTM_UTC                          >= 
to_date('01-jan-2009','dd-mon-yyyy') 
    AND DT.START_DTM_UTC                        < 
to_date('31-jan-2009','dd-mon-yyyy')

call     count       cpu    elapsed       disk      query    current  rows
------- ------  -------- ---------- ---------- ---------- ----------  ---------
Parse        1      0.25       0.24          0          0          0  0
Execute      1      0.00       0.00          0          0          0  0
Fetch     1601      0.62       8.71      26687      28265          0  80023
------- ------  -------- ---------- ---------- ---------- ----------  ---------
total     1603      0.87       8.96      26687      28265          0  80023


SELECT * 
   FROM dated_TRIPS DT 
  WHERE DT.END_DTM_UTC                          >= 
to_date(:d1,'dd-mon-rrrr') 
    AND DT.START_DTM_UTC                        < 
to_date(:d2,'dd-mon-rrrr')

call     count       cpu    elapsed       disk      query    current  rows
------- ------  -------- ---------- ---------- ---------- ----------  --------
Parse        1      0.23       0.23          0          0          0  0
Execute      1      0.01       0.01          0          0          0  0
Fetch     1601      0.96       7.88      26687      28265          0  80023
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1603      1.21       8.13      26687      28265          0  80023

Lal Cyril, July 21, 2009 - 1:11 pm UTC

Sorry for the late reply. Thank you very much for the clarification.

As you mentioned, the row source operation was the same
for both query executions and the explain plan was wrong.

"what is your logic behind this:"
The logic is to get the date ranges that overlap with the input date range.

"... Before executing the queries i flushed the buffer_cache and shared_pool. ..."
I did this to ensure that cached data, as well as the previous plan, was cleared before comparing the two queries.


Tom Kyte
July 26, 2009 - 6:19 am UTC

but why? That is as fake, if not more fake, than presuming the buffer cache would affect it.

There is a secondary sga (search for "secondary sga" on this site in quotes like that) in play. Flushing the buffer cache is only good for showing what the MAXIMUM physical IO's might be under the worst cases - it doesn't help you compare two queries like that.

lal cyril, November 05, 2009 - 5:59 am UTC

Thanks for the clarification. I searched for "secondary sga" and got the following link:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1641581200346367951

but the link
http://asktom.oracle.com/pls/ask/search?p_string=%22secondary+sga%22
is not working.

I have one more query in this regard.

One of our client's consultants suggested using a hint for the same query:

SELECT /*+ INDEX(DT INDEX1) */
       DT.*
  FROM dated_TRIPS DT
 WHERE DT.END_DTM_UTC >= '01-JAN-09'
   AND DT.START_DTM_UTC < '31-JAN-09'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.70       8.73       4937      65948          0       80403
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.73       8.75       4937      65948          0       80403

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 59 (CTS_9WC01)

Rows     Row Source Operation
-------  ---------------------------------------------------
  80403  TABLE ACCESS BY INDEX ROWID DATED_TRIPS (cr=65948 pr=4937 pw=0 time=4370428 us)
  80403   INDEX RANGE SCAN INDEX1 (cr=1992 pr=1992 pw=0 time=104177 us)(object id 222154)
-----------------------------------------------------------------------------------------
Query without Hint

SELECT DT.*
  FROM dated_TRIPS DT
 WHERE DT.END_DTM_UTC >= '01-JAN-09'
   AND DT.START_DTM_UTC < '31-JAN-09'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      1.53       8.50      34339      34344          0       80403
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      1.56       8.51      34339      34344          0       80403

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 59 (CTS_9WC01)

Rows     Row Source Operation
-------  ---------------------------------------------------
  80403  TABLE ACCESS FULL DATED_TRIPS (cr=34344 pr=34339 pw=0 time=6999958 us)

Based on the explain plan, the query with the index shows a much higher cost.

Explain plan for the query with the index:

---------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |  408K |   46M |  315K |
|   1 |  TABLE ACCESS BY INDEX ROWID| DATED_TRIPS |  408K |   46M |  315K |
|   2 |   INDEX RANGE SCAN          | INDEX1      |  408K |       |  2064 |
---------------------------------------------------------------------------

Explain plan for the query without the index:

-----------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost  |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |  408K |   46M |  7547 |
|   1 |  TABLE ACCESS FULL| DATED_TRIPS |  408K |   46M |  7547 |
-----------------------------------------------------------------

The following link regarding the usage of hints
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7038986332061

suggests that hints should be a path of last resort.
In this case, should I go with the hint approach?


Tom Kyte
November 11, 2009 - 9:45 am UTC

http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22secondary+sga%22



do you see the estimated rows there - 408,000. Versus the actual number of rows - 80,403.

There seems therefore to be something wrong with your statistics - they are stale, incomplete perhaps.

Why is it overestimating the row counts like that - that is an easy query, it shouldn't be off by a factor of 5.


If it got the right row counts, it would use the index naturally.


also, please STOP comparing strings to dates. compare dates to dates, use to_date and an explicit format mask to convert the string '01-jan-09' into a date - else, if I'm feeling nasty, I'll change your default date mask to yy-mon-dd some day.
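Tom's warning here can be illustrated outside the database. The following Python snippet (an illustration only, not part of the thread's SQL) shows that comparing 'DD-MON-YY' strings lexicographically can disagree with the true chronological order, which is exactly the trap implicit string-to-date comparison sets in SQL:

```python
from datetime import datetime

fmt = "%d-%b-%y"
s1, s2 = "01-FEB-09", "02-JAN-09"

# Lexicographic comparison looks at the day digits first: '01...' < '02...'
string_order = s1 < s2

# Proper comparison after explicit conversion with a known format mask:
# 01-Feb-2009 is NOT before 02-Jan-2009
date_order = datetime.strptime(s1, fmt) < datetime.strptime(s2, fmt)

# The two orderings disagree for this pair
assert string_order != date_order
```

This is why an explicit to_date with a format mask matters: the comparison then happens in the DATE domain, not the string domain.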

Date comparison between two tables

Brian Perkins, October 18, 2010 - 11:37 am UTC

Tom

Returning to the original question: instead of a start and end parameter and one table, I actually have two tables, each containing start/end dates between which I want to return data from all overlapping rows.

In this thread, John Gilmore later reduced the Boolean for this condition to:

R1_end >= R2_begin
and
R1_begin <= R2_end

The problem I have is that these tables are used for long-term SQL reporting, and as such can contain tens of millions of rows each. Using the above construct in the WHERE clause, Oracle always performs a full table scan on both tables.

Each table includes many other columns of course, some also used as further join conditions between the two tables, in addition to the dates.

So, I'm looking to see if I have overlooked some technique to improve the speed of this query, the performance of which gets slower over time.
Using FIRST_ROWS as a hint is of little value as the entire result set is needed by the reporting tool before anything is displayed to the client.
The only way I have made the query perform better is to artificially restrict the date comparison to a week or so, e.g.

R1_end between R2_begin and R2_begin + 7      -- stop looking a week beyond the start date/time
and
R1_begin between R2_end - 7 and R2_end        -- start looking a week before the end date/time

but I don't like that - and the value '7' probably needs to vary from installation to installation.

Some attributes of the data in the tables may help you formulate an idea:

Each table contains 10-100 million rows.
The query will normally be used to return a few thousand rows at most - sometimes fewer than 100.
Table 1 (end - start) can be measured in seconds
Table 2 (end - start) can be measured in days
The other joining columns without the dates are not very selective, and would join 10-20% of the rows.
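As a sanity check on John Gilmore's two-clause reduction quoted above, it can be verified against the three-rule predicate from the original answer. A quick sketch in Python (an illustration only, not part of the thread's SQL; function names are made up for the demo):

```python
from datetime import date

def overlaps(r1_begin, r1_end, r2_begin, r2_end):
    """John Gilmore's reduction: two closed intervals overlap iff
    each one begins no later than the other ends."""
    return r1_end >= r2_begin and r1_begin <= r2_end

def overlaps_three_rules(cb, ce, pb, pe):
    """Tom's original three-rule predicate, for comparison."""
    return (pb <= cb <= pe                # rule 1: c_beginning inside [pb, pe]
            or pb <= ce <= pe             # rule 2: c_ending inside [pb, pe]
            or (cb < pb and pe < ce))     # rule 3: [pb, pe] strictly inside

# The four "hit" cases from the original answer (db row 01-jan..07-jan)
row = (date(2001, 1, 1), date(2001, 1, 7))
for p in [(date(2000, 12, 15), date(2001, 1, 15)),   # wider than db range
          (date(2000, 12, 15), date(2001, 1, 4)),    # overlaps on the left
          (date(2001, 1, 4),   date(2001, 1, 15)),   # overlaps on the right
          (date(2001, 1, 2),   date(2001, 1, 4))]:   # contained within
    assert overlaps(*row, *p) and overlaps_three_rules(*row, *p)
```

Both forms agree for any pair of valid intervals (begin <= end), so the two-clause version is the simpler WHERE clause; it just does not, by itself, change the access path.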
Tom Kyte
October 25, 2010 - 9:54 am UTC

... Oracle always performs a full tablescan on both
tables. ...

PERFECTION - even for small tables - PERFECTION. Using an index would be incredibly slow and inefficient.


You sort of need every row from every table, don't you? How could you avoid reading every row from every table in general?


say you used an index for r1_end >= r2_begin - that would almost certainly compare all records of one table to most records of the other. Same with the other predicate.



Agree

Brian Perkins, October 25, 2010 - 11:07 am UTC

Tom

On the full tablescans - I am in total agreement that Oracle is doing the right thing - given my code and the question it's trying to answer.

I was just wondering if I was overlooking some obvious feature or technique that would help handle the ever-increasing execution time - but I guess not!

At least the customers will "only" want 10 years of data at a maximum..... so there is an endpoint in sight.
Tom Kyte
October 25, 2010 - 6:13 pm UTC

It is a really hard problem to solve efficiently in an RDBMS - yes.

A reader, October 26, 2010 - 7:52 am UTC

I wonder whether it is the full scan that is the problem for Brian, or whether it is the comparison of a few million rows with a few million other rows -- a lot of CPU usage.

Is that a sort-merge operation, with a join on one date comparison and a subsequent filter applied to eliminate rows based on the other date comparison? Have you looked at the wait events and CPU usage?

I wouldn't rule out an index-based access method if it eliminates say 10,000,000*100,000,000/2=1,000,...blahblah...,000 date comparisons.
Tom Kyte
October 26, 2010 - 8:02 pm UTC

... or whether
it is the comparison of a few million rows with a few million other rows ...

that is it

... I wouldn't rule out an index-based access method if it eliminates say ...

but think about it. I wrote:

say you used an index for r1_end >= r2_begin - that would almost certainly compare all records of one table to most records of the other. Same with the other predicate.

the index would be used for ONE or the OTHER of the predicates - but not both.

and the index would be used over and over and over and over and over and ........

re: Brian Perkins and date comparison between two table

Stew Ashton, October 27, 2010 - 6:38 am UTC


Just a thought, provided this doesn't slow down the commits too much:
drop table A_DATES
/
-- drop table A_DATES succeeded.
--657ms elapsed
drop table B_dates
/
-- drop table B_dates succeeded.
--78ms elapsed
create table A_DATES as
select sysdate+(level*99) A_DATE_BEGIN, sysdate+(level*99) + MOD(level,1000)+1 A_DATE_END
from DUAL connect by level <= 1000
/
--create table succeeded.
--125ms elapsed
create table B_DATES as
select sysdate + MOD(level,1000) + MOD(level, 100) B_DATE_BEGIN
, sysdate + MOD(level,1000) + MOD(level, 100) + 1 B_DATE_END
from DUAL connect by level <= 1000
/
--create table succeeded.
--47ms elapsed
create materialized view log on a_dates with rowid
/
--create materialized succeeded.
--125ms elapsed
create materialized view log on B_DATES with rowid
/
--create materialized succeeded.
--62ms elapsed
drop materialized view ab_dates
/
-- drop materialized view succeeded.
--250ms elapsed
create materialized view ab_dates refresh fast on commit as
select a.rowid ARID
, a.A_DATE_BEGIN
, a.A_DATE_END
, B.B_DATE_BEGIN
, B.B_DATE_END
, B.rowid BRID from A_DATES a, B_DATES B
where a.A_DATE_END >= B.B_DATE_BEGIN
and a.A_DATE_BEGIN <= B.B_DATE_END
/
--create materialized succeeded.
--7,641ms elapsed
create index idx_ab_dates_arid on ab_dates(arid)
/
--create index succeeded.
--15ms elapsed
create index IDX_AB_DATES_BRID on AB_DATES(BRID)
/
--create index succeeded.
--16ms elapsed
begin
DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'A_DATES',estimate_percent => 100);
DBMS_STATS.GATHER_TABLE_STATS (OWNNAME => user, TABNAME => 'B_DATES',ESTIMATE_PERCENT => 100);
DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'AB_DATES',estimate_percent => 100, cascade=>TRUE);
end;
/
--anonymous block completed
--609ms elapsed
select COUNT(*) from AB_DATES
/
--COUNT(*)               
------------------------ 
--84                     
/
--0ms elapsed
delete from A_DATES where rowid = (select ARID from AB_DATES where rownum = 1)
/
--1 rows deleted
--16ms elapsed
commit
/
--commited
--78ms elapsed
select COUNT(*) from AB_DATES
/
--COUNT(*)               
------------------------ 
--78                     
/
--0ms elapsed

Missed something in previous post

Stew Ashton, October 27, 2010 - 7:32 am UTC


In my previous post, I didn't test inserts and updates to the base tables. To avoid full scans during the MV refresh:
create index IDX_A_DATES_BEGIN on A_DATES(A_DATE_BEGIN,A_DATE_END);
create index IDX_B_DATES_END on B_DATES(B_DATE_END,b_DATE_BEGIN);
Every insert or update to a base table will cause an index range scan on the other base table during the MV refresh.

Worked like a charm

Jason, September 21, 2016 - 5:40 pm UTC

This worked like a charm. I was in the same situation. I coded it exactly as you had it, and it worked perfectly. Thank you so much. This saved me from hours of trying to compare dates. :) Cheers!

Re: my post of October 27, 2010 - 6:38 am UTC

Stew Ashton, February 17, 2018 - 10:28 am UTC

Starting with version 12c, inequality joins can be replaced by the MATCH_RECOGNIZE clause. With the test data I used above, MATCH_RECOGNIZE goes at least 60 times faster than JOIN.
select A_DATE_BEGIN, A_DATE_END, B_DATE_BEGIN, B_DATE_END
from (
  select 'A' tbl, a_date_begin date_begin, a_date_end date_end from a_dates
  union all
  select 'B', b_date_begin, b_date_end from b_dates
)
match_recognize(
  order by date_begin
  measures
    case first(tbl) when 'A' then first(date_begin) else ba.date_begin end a_date_begin,
    case first(tbl) when 'A' then first(date_end) else ba.date_end end a_date_end,
    case first(tbl) when 'B' then first(date_begin) else ba.date_begin end b_date_begin,
    case first(tbl) when 'B' then first(date_end) else ba.date_end end b_date_end
  all rows per match
  after match skip to next row
  pattern ( {-ab-} ({-ab-}|ba)* ba )
  define ab as tbl = first(tbl) and date_begin <= first(date_end),
    ba as tbl != first(tbl) and date_begin <= first(date_end)
);
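The reason MATCH_RECOGNIZE wins so decisively is that it replaces the N x M comparison of a join with a single pass over the rows of both tables, ordered by begin date. A rough Python sketch of that sweep idea (an illustration of the concept only; `overlap_join` is a hypothetical name, and the real SQL version above does all of this inside the database):

```python
def overlap_join(a_rows, b_rows):
    """Report all (a_index, b_index) pairs whose [begin, end]
    intervals overlap, in one pass over both row sets.
    a_rows / b_rows are lists of (begin, end) tuples."""
    # Merge both sides into one stream ordered by begin date
    events = [(b, e, 'A', i) for i, (b, e) in enumerate(a_rows)]
    events += [(b, e, 'B', i) for i, (b, e) in enumerate(b_rows)]
    events.sort()

    active = {'A': [], 'B': []}   # intervals that may still overlap later rows
    pairs = []
    for begin, end, side, idx in events:
        other = 'B' if side == 'A' else 'A'
        # prune intervals on the other side that ended before this one begins
        active[other] = [(e, i) for e, i in active[other] if e >= begin]
        # every surviving interval on the other side overlaps this one
        for _, i in active[other]:
            pairs.append((idx, i) if side == 'A' else (i, idx))
        active[side].append((end, idx))
    return pairs

# e.g. overlap_join([(1, 5), (10, 12)], [(4, 6), (11, 20), (30, 40)])
# pairs a[0] with b[0] and a[1] with b[1]
```

Each row is compared only against intervals still "active" at its begin date, which is what makes the ordered single pass so much cheaper than the join when overlaps are sparse.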

Connor McDonald
February 19, 2018 - 3:26 am UTC

Love your work.