A reader, August 07, 2002 - 3:49 pm UTC
is the below approach valid...
Nag, August 07, 2002 - 10:31 pm UTC
select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where x.a between from_date and to_date and
x.b between from_date and to_date
OR ( from_date < x.a and x.b < to_date )
and is this kind of comparison (x.b < to_date), i.e. value on the left and column on the right (in your solution) .. is it healthy..
Thanks
August 08, 2002 - 9:06 am UTC
define "healthy". what the heck do you mean by that?
(and please please please -- compare dates to dates, strings to strings, numbers to numbers. wrap those dates in a TO_DATE() call and always use 4 digit years and use an explicit date format to convert the string to a date)
If you are afraid of
x.b < to_date
for some reason, code
to_date > x.b
instead.
Correction or a typo
Logan Palanisamy, August 08, 2002 - 7:47 pm UTC
Tom,
.............................................
CB CE
PB PE -- rule 1
PB PE -- rule 1
PB PE -- rule 2
PB PE -- rule 3
PB PE -- no hit
PB PE -- no hit
In your chart above, the third row, where PE is between CB and CE, belongs to rule 2, not rule 1.
The correct one is,
.............................................
CB CE
PB PE -- rule 1
PB PE -- rule 2 <-- correction
PB PE -- rule 2
PB PE -- rule 3
PB PE -- no hit
PB PE -- no hit
I guess it was a typo on your part.
August 09, 2002 - 8:39 am UTC
rule 1 = where ( c_beginning between p_beginning and p_ending -- rule 1
On that row in question, c_beginning is in fact between PB and PE so -- rule 1 gets it in my mind since I was reading the list from TOP to BOTTOM. Since the order of evaluation of the predicate in SQL is not deterministic (eg: rule 3 then 2 then 1, or 2 then 3 then 1 or ..... and so on could be the order of evaluation) you might be right.
But, if you read the "rules" from top to bottom -- rule 1 is the one that catches row 3.
My sql with the trick is performing faster.. why any idea...
Nag, August 10, 2002 - 2:07 pm UTC
Create table t (from_date date, to_date date);
insert into t values ('01-jan-02', '31-dec-02');
Solution 1.
select *
from t
where ( from_date between :p_beginning :and p_ending
OR to_date between :p_beginning and :p_ending
OR ( from_date < :p_beginning and :p_ending <to_date ) );
Solution 2.
select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where from_date between x.a and x.b and
to_date between x.a and x.b or
OR ( from_date < x.a and x.b < to_date )
Tom,
I incorporated solution 1 as advertised by you.
It was working but taking an enormous amount of time.
So, I used the trick which I invented, as shown in solution 2.
I'm cleverly making the two parameters an inline view and using the values in my where clause.
Surprise-o-rama.. I was amazed to see that the performance of the query, and of the report as such, increased
enormously.
What is the difference between what you suggested and what I wrote? What could be the reason
that the second solution was faster?
August 10, 2002 - 4:47 pm UTC
Umm, your query is radically different from mine and returns a different (perhaps wrong, perhaps not wrong) answer.
In any case -- only one of the two queries above could be valid! They answer totally different questions.
(i removed a duplicate OR OR in the second query and I moved the colon in ":and p_ending" to the right place.
ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( from_date date, to_date date );
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> variable pb varchar2(20)
ops$tkyte@ORA817DEV.US.ORACLE.COM> variable pe varchar2(20)
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> exec :pb := '01-jun-2002'; :pe := '30-dec-2002';
PL/SQL procedure successfully completed.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> insert into t values ( to_date( '22-NOV-2002' ), to_date( '25-JUN-2004' ) );
1 row created.
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
2 from t
3 where ( from_date between to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy')
4 OR to_date between to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy')
5 OR ( from_date < to_date(:pb,'dd-mon-yyyy') and to_date(:pe,'dd-mon-yyyy') <to_date ) )
6 /
FROM_DATE TO_DATE
--------- ---------
22-NOV-02 25-JUN-04
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select from_date,to_date
2 from t,(select to_date(:pb,'dd-mon-yyyy') a , to_date(:pe,'dd-mon-yyyy') b from dual) x
3 where from_date between x.a and x.b and
4 to_date between x.a and x.b
5 OR ( from_date < x.a and x.b < to_date )
6 /
no rows selected
ops$tkyte@ORA817DEV.US.ORACLE.COM>
My query returned it (overlaps).
Yours does not.
Only you know which one is correct!
OOPS... sorry my fault .. small typo...
Nag, August 10, 2002 - 6:26 pm UTC
select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where from_date between x.a and x.b and --mistake here...
to_date between x.a and x.b or
OR ( from_date < x.a and x.b < to_date )
above was meant to be...
select from_date,to_date
from t,(select '01-JuN-02' a , '30-DEc-02' b from dual) x
where from_date between x.a and x.b OR ..
to_date between x.a and x.b or
OR ( from_date < x.a and x.b < to_date )
Now.. for the above comment , both the queries return the same rows...
I tried and the results are as follows...
SQL> ED
Wrote file afiedt.buf
1 select from_date,to_date
2 from t,(select '01-JUN-02' a , '01-JUN-03' b from dual) x
3 where from_date between x.a and x.b OR -- corrected
4 to_date between x.a and x.b
5* OR ( from_date < x.a and x.b < to_date )
SQL> /
FROM_DATE TO_DATE
--------- ---------
01-JAN-02 31-DEC-02
22-NOV-02 25-JUN-04
Now Tom, going back to the original comment i.e. the immediate above one..
Solution 2 is much faster than solution 1. Believe me, it runs very very fast compared to solution 1.
Can you please give me an insight into why this is happening..
August 11, 2002 - 9:37 am UTC
Nag, you've been coming here long enough -- you should know the answer to this.
Set up the test case, show us the tkprof with timed statistics and all. Help us understand what indexes you have. Look at the differences in the query plans.
Then, all will be revealed.
Sort of bothersome that you STILL have typos in the queries -- "or .." and "or or", for example. It would be nice to put up stuff that actually runs. Makes me question whether what I am looking at is really what you are using or not (eg: am i totally wasting my time or what)
You should know -- if you want answers, you need to supply information! Also, it should be easy for you to look at the traces and *see* what is different and do some of this analysis yourself.
Programmer
Jamil Shaibani, August 11, 2002 - 3:51 am UTC
Dear Tom
When I try to run the Example I got this error
SQL> insert into t values (to_date('22-NOV-2002'),
to_date('25-JUN-2004')) *
ERROR at line 1:
ORA-01858: a non-numeric character was found where a numeric was expected
Thank you very much for your help
August 11, 2002 - 9:53 am UTC
Your default date format is not dd-mon-rr; it must be something else.
use to_date( '22-nov-2002', 'dd-mon-yyyy' )
Huge apology.. feeling happy at being admonished by Tom
Nag, August 11, 2002 - 10:41 am UTC
Tom, I consider it a privilege to be admonished by you..
But please set your doubts at rest about my integrity in asking this question. I'm genuinely asking this question.
Now for the trace contents..
select *
from t
where ( from_date between to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy')
OR to_date between to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy')
OR ( from_date < to_date(:pb,'dd-mon-yyyy') and
to_date(:pe,'dd-mon-yyyy') <to_date ) )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 2 4 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 2 4 2
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 32
Rows Row Source Operation
------- ---------------------------------------------------
2 TABLE ACCESS FULL T
********************************************************************************
select from_date,to_date
from t,(select to_date(:pb,'dd-mon-yyyy') a , to_date(:pe,'dd-mon-yyyy')
b from dual) x
where from_date between x.a and x.b OR
to_date between x.a and x.b
OR ( from_date < x.a and x.b < to_date )
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 8 2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 8 2
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 32
Rows Row Source Operation
------- ---------------------------------------------------
2 NESTED LOOPS
2 TABLE ACCESS FULL DUAL
2 TABLE ACCESS FULL T
********************************************************************************
ALTER SESSION SET SQL_TRACE=FALSE
In the solution you have provided, I see that we are simply doing a full table scan of T.
After I rewrote the query using the trick I thought would be useful, we are doing a nested loops join, i.e. we are full table scanning dual and comparing it with all the rows returned by T.
So though the time statistics above show 0, when I apply the same logic to my real situation it makes a lot of difference.
But I have a question here ..
Rows Row Source Operation
------- ---------------------------------------------------
2 NESTED LOOPS
2 TABLE ACCESS FULL DUAL
2 TABLE ACCESS FULL T
above, which table is being read first, dual or T (as both of them are at the same level)?
Can you also suggest the easiest way to insert a whole lot of data into T, so that I can get some time on timed statistics.
August 11, 2002 - 7:48 pm UTC
Looks like the times are the same to me -- 0.00 seconds?
Also the tables in question are trivial.
Show us the example where your query blows the one without dual away.......
You apparently MUST have that data somewhere? You made the claim already. Use the table you used to do your initial tests.
Simpler alternative
Tony, August 24, 2002 - 4:15 am UTC
I do date overlap checks a lot, and always use this method:
(c_beginning <= p_ending) AND (c_ending >= p_beginning)
which seems more elegant to me.
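Tony's two-term test is in fact equivalent to the three-predicate version used earlier in this thread. As an illustration outside SQL (a Python sketch, not from the original thread), the two forms can be compared exhaustively over small integer ranges:

```python
# Compare Tony's two-term overlap test with the three-term
# BETWEEN-style predicate used earlier in the thread.
# Endpoints are inclusive, so ranges that merely touch count as overlapping.

def overlaps(cb, ce, pb, pe):
    # Tony's form: [cb, ce] and [pb, pe] overlap
    return cb <= pe and ce >= pb

def overlaps_three_term(cb, ce, pb, pe):
    # rule 1: cb inside [pb, pe]; rule 2: ce inside [pb, pe];
    # rule 3: [cb, ce] strictly surrounds [pb, pe]
    return (pb <= cb <= pe) or (pb <= ce <= pe) or (cb < pb and pe < ce)

# exhaustive check over all valid small ranges
for cb in range(8):
    for ce in range(cb, 8):
        for pb in range(8):
            for pe in range(pb, 8):
                assert overlaps(cb, ce, pb, pe) == overlaps_three_term(cb, ce, pb, pe)
print("two-term and three-term predicates agree")
```

The same equivalence is why the shorter predicate catches all four overlap cases (low end, high end, interior, exterior).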
Calculate fraction of date range
A reader, January 31, 2003 - 9:00 pm UTC
Tom,
What if I have Prompt StartDate (PS) and Prompt EndDate (PE) as well as the database fields StartDate and DueDate, and want to calculate the fraction of how many days of my prompt window fall within the total days between StartDate (SD) and DueDate (DD)? For example,
SD - 1/1/2003
DD - 2/15/2003
PS - 12/15/2002
PE - 1/31/2003
So, the calculated field that I am interested in:
PE-1/1/2003 over DD-SD,
I don't count days from 12/15 to 1/1/2003 because they are outside of my SD and DD range (~31/46)
February 01, 2003 - 9:26 am UTC
just do it i guess? I mean,
(pe-sd)/(dd-sd)
is perfectly valid.
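For illustration (a Python sketch with the reader's sample dates, not part of the original answer), date subtraction gives the day counts directly. The prompt start of 12/15 is effectively clipped to SD, which is why only (pe-sd) appears in the numerator; exact subtraction gives 30/45 rather than the reader's inclusive estimate of ~31/46:

```python
from datetime import date

# reader's sample dates
sd = date(2003, 1, 1)    # StartDate
dd = date(2003, 2, 15)   # DueDate
pe = date(2003, 1, 31)   # Prompt end (prompt start 12/15/2002 falls before SD)

# (pe - sd) / (dd - sd): fraction of the SD..DD span covered by the prompt
fraction = (pe - sd).days / (dd - sd).days
print(fraction)  # 30/45
```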
gap analysis !!
A reader, February 12, 2003 - 11:23 am UTC
hi tom,
I want to just do this, and I am open to a single sql or pl/sql program unit.
You said..
15-dec-2000 15-jan-2001 - wider than database
15-dec-2000 04-jan-2001 - overlaps but not contained within
04-jan-2001 15-jan-2001 - overlaps but not contained within
02-jan-2001 04-jan-2001 - overlaps and contained within
I have records in the database and valid value for that dates.
for me starting date is fixed let's say 15-dec-2000
eg..
startdate enddate value code
----------- --------- ----- ------
15-dec-2000 15-jan-2001 X ABC123-- flag this gap out
20-jan-2001 19-jan-2002 Y ABC123
15-jan-2002 null X ABC123-- flag overlap
//so for ABC123 code I want to report gap and overlap
15-dec-2000 15-jan-2001 P ABC456
15-jan-2001 19-jan-2002 Q ABC456
19-jan-2002 17-jan-2002 Q ABC456--- flag there is no
value from 17th jan to sysdate
20-jan-2001 19-jan-2002 M ABC789
15-jan-2002 15-dec-2002 N ABC789
// flag no value is available from 15-dec-00 to 20-jan-01
for ABC789 code
15-dec-2000 15-jan-2001 P ABC012
16-jan-2001 19-jan-2002 Q ABC012
20-jan-2002 null Q ABC012
// completely valid record...
/***********************************
Note : null implies sysdate
***/
I can use utl_file or a table to insert the
comment/flag/errored report into.
can you please help..
there are 100,000 distinct records..
all together 242,000 records.
Thanks,
February 12, 2003 - 3:41 pm UTC
not sure if I have all of your cases -- but this should get you going. Using lag and lead we can look forward and back a row and assign whatever flags you need:
ops$tkyte@ORA920> break on code skip 1
ops$tkyte@ORA920>
ops$tkyte@ORA920> select startdate,
2 enddate,
3 last_enddate,
4 flag,
5 code,
6 case when last_enddate is null then null
7 when last_enddate < startdate-1 then 'Gap'
8 when last_enddate > startdate then 'Overlap'
9 when enddate is not null and next_startdate is null then 'Missing next'
10 else null
11 end another_flag
12 from (
13 select t.*,
14 lag(enddate) over (partition by code order by startdate) last_enddate,
15 lead(startdate) over (partition by code order by startdate) next_startdate
16 from t
17 order by code, startdate
18 )
19 /
STARTDATE ENDDATE LAST_ENDD F CODE ANOTHER_FLAG
--------- --------- --------- - ------ ------------
15-DEC-00 15-JAN-01 P ABC012
16-JAN-01 19-JAN-02 15-JAN-01 Q
20-JAN-02 19-JAN-02 Q
15-DEC-00 15-JAN-01 X ABC123
20-JAN-01 19-JAN-02 15-JAN-01 Y Gap
15-JAN-02 19-JAN-02 X Overlap
15-DEC-00 15-JAN-01 P ABC456
15-JAN-01 19-JAN-02 15-JAN-01 Q
19-JAN-02 17-JAN-02 19-JAN-02 Q Missing next
20-JAN-01 19-JAN-02 M ABC789
15-JAN-02 15-DEC-02 19-JAN-02 N Overlap
11 rows selected.
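The lag/lead logic above can be expressed procedurally as well. A rough Python sketch (illustrative data, not the thread's table): sort each code's rows by start date, carry the previous row's end date the way LAG(enddate) does, and compare it with the current start date:

```python
from itertools import groupby
from operator import itemgetter

rows = [  # (code, start, end) -- made-up integer "dates" for brevity
    ("ABC123", 1, 15),
    ("ABC123", 20, 40),   # starts after previous end + 1 -> gap
    ("ABC123", 35, 50),   # starts before previous end -> overlap
]

rows.sort(key=itemgetter(0, 1))
flags = []
for code, grp in groupby(rows, key=itemgetter(0)):
    last_end = None                      # this is what LAG(enddate) supplies
    for _, start, end in grp:
        if last_end is None:
            flags.append(None)
        elif last_end < start - 1:
            flags.append("Gap")
        elif last_end > start:
            flags.append("Overlap")
        else:
            flags.append(None)
        last_end = end
print(flags)  # [None, 'Gap', 'Overlap']
```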
Thanks,
A reader, February 12, 2003 - 5:30 pm UTC
There is no alternative for you, at least on this planet, I think.
Now, can you tell how you "developed" this query step by step? I tried to create it by breaking it down piece by piece but it was very difficult to put it together
-- also in this query how can I drop those rows which
are 'VALID'? ( The last case in my question example )
Thanks,
February 12, 2003 - 6:11 pm UTC
Well, it was pretty easy actually.
Looking at a row for a given CODE, what did we need?
o the prior row's enddate -- to look for gaps and overlaps
o the next row's startdate -- to look for missing end records with NULL
that dictates "lag" and "lead". Once you have those -- CASE is pretty easy to use to come up with the logic to figure out "gap", "overlap", etc...
to drop the "valid" rows -- just wrap:
ops$tkyte@ORA920> select startdate,
2 enddate,
3 last_enddate,
4 flag,
5 code,
6 case when last_enddate is null then null
7 when last_enddate < startdate-1 then 'Gap'
8 when last_enddate > startdate then 'Overlap'
9 when enddate is not null and next_startdate is null
then 'Missing next'
10 else null
11 end another_flag
12 from (
13 select t.*,
14 lag(enddate) over (partition by code order by startdate)
last_enddate,
15 lead(startdate) over (partition by code order by startdate)
next_startdate
16 from t
17 order by code, startdate
18 )
19 /
as an inline view and select * from ( that query ) where another_flag is not null;
cool
asktom fan, February 12, 2003 - 6:15 pm UTC
Thanks Tom,
Need Help with this
Kishan, July 03, 2003 - 11:14 am UTC
Dear Tom:
I need a solution to this query. Could this be done in a single SQL statement. I'm using 8.1.7
sql> DESC roombooking
Name Null? Type
----------------------------------------- -------- ----------------------------
ROOM_ID NOT NULL VARCHAR2(8) -- Room ID
BKING_ST_TIME NOT NULL DATE -- Start time
BKING_END_TIME NOT NULL DATE -- End time
BKING_STS_C NOT NULL VARCHAR2(1) -- Booking Status
ROOM_ID BKING_ST_TIME BKING_END_TIME B
-------- -------------------- -------------------- -
(SR1) 07-JAN-2002 11:00:00 07-JAN-2002 13:00:00 C
(SR2) 07-JAN-2002 18:00:00 07-JAN-2002 21:00:00 C
(SR1) 08-JAN-2002 09:00:00 08-JAN-2002 12:00:00 C
(SR2) 08-JAN-2002 12:00:00 08-JAN-2002 15:00:00 C
(SR2) 08-JAN-2002 18:00:00 08-JAN-2002 21:00:00 C
(SR1) 09-JAN-2002 08:00:00 09-JAN-2002 11:00:00 C
(SR1) 09-JAN-2002 11:00:00 09-JAN-2002 13:00:00 C
(SR1) 09-JAN-2002 15:00:00 09-JAN-2002 21:00:00 C
(SR1) 10-JAN-2002 16:00:00 10-JAN-2002 18:00:00 C
(SR1) 10-JAN-2002 18:00:00 10-JAN-2002 21:00:00 C
(SR2) 10-JAN-2002 08:00:00 10-JAN-2002 10:00:00 C
(SR2) 10-JAN-2002 10:00:00 10-JAN-2002 13:00:00 C
(SR2) 10-JAN-2002 13:00:00 10-JAN-2002 15:00:00 C
(SR1) 11-JAN-2002 09:00:00 11-JAN-2002 12:00:00 C
(SR1) 11-JAN-2002 13:00:00 11-JAN-2002 15:00:00 C
(SR1) 11-JAN-2002 18:00:00 11-JAN-2002 21:00:00 C
(SR1) 14-JAN-2002 11:00:00 14-JAN-2002 13:00:00 C
(SR1) 14-JAN-2002 18:00:00 14-JAN-2002 21:00:00 C
The table roombooking stores all bookings made against a set of rooms on a daily basis. My users want to find out the period when those rooms are empty so that they can book for someone else..
For example, if they want the information for rooms available between 07-Jan-2002 and 15-Jan-2002, between 14:00-16:00, the following information should be given to them, so that the rooms can be booked.
SR1 07-Jan-2002
SR2 07-Jan-2002
SR1 08-Jan-2002
SR2 09-Jan-2002
SR2 11-Jan-2002
SR1 12-Jan-2002
SR2 12-Jan-2002
SR2 14-Jan-2002
July 03, 2003 - 7:57 pm UTC
All we need for that is
A) a set of dates 07-jan-2002 .. 15-jan-2002, that is inline view A below
B) a set of distinct rooms that are possible, that is B below (best if you have a room table elsewhere -- the "parent" lookup table)
And then we simply cartesian product these together, generating an inclusive list of DATES and ROOMS -- looking for DATES+ROOMS that don't have an overlap:
ops$tkyte@ORA920> select * from
2 (select to_date( '07-jan-2002') + rownum -1 dates
3 from all_objects
4 where rownum <= to_date('15-jan-2002')-to_date('07-jan-2002')+1) a,
5 (select distinct room_id
6 from t ) b
7 where not exists ( select null
8 from t
9 where bking_st_time <= a.DATES + 16/24
10 and bking_end_time >= a.DATES +14/24
11 and room_id = b.room_id
12 )
13 order by dates, room_id
14 /
DATES ROOM_ID
-------------------- --------
07-jan-2002 00:00:00 (SR1)
07-jan-2002 00:00:00 (SR2)
08-jan-2002 00:00:00 (SR1)
09-jan-2002 00:00:00 (SR2)
11-jan-2002 00:00:00 (SR2)
12-jan-2002 00:00:00 (SR1)
12-jan-2002 00:00:00 (SR2)
13-jan-2002 00:00:00 (SR1)
13-jan-2002 00:00:00 (SR2)
14-jan-2002 00:00:00 (SR1)
14-jan-2002 00:00:00 (SR2)
15-jan-2002 00:00:00 (SR1)
15-jan-2002 00:00:00 (SR2)
13 rows selected.
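The shape of that query -- a cartesian product of candidate days and rooms, filtered by NOT EXISTS -- can be sketched outside SQL too. A hypothetical Python illustration (toy bookings, two days, one room, with the 14:00-16:00 window hard-coded as in the example):

```python
from datetime import date, datetime, timedelta

bookings = [  # (room, start, end) -- made-up sample data
    ("SR1", datetime(2002, 1, 7, 11), datetime(2002, 1, 7, 13)),
    ("SR1", datetime(2002, 1, 8, 14), datetime(2002, 1, 8, 16)),
]
rooms = sorted({r for r, _, _ in bookings})
days = [date(2002, 1, 7) + timedelta(d) for d in range(2)]

free = [
    (d, room)
    for d in days
    for room in rooms
    if not any(  # NOT EXISTS: no booking overlapping 14:00-16:00 that day
        r == room
        and s <= datetime(d.year, d.month, d.day, 16)  # starts before window ends
        and e >= datetime(d.year, d.month, d.day, 14)  # ends after window begins
        for r, s, e in bookings
    )
]
print(free)  # SR1 is free on 07-Jan (its booking ends at 13:00) but not on 08-Jan
```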
That's cool...thank you for your valuable service to Oracle community
Kishan, July 03, 2003 - 8:27 pm UTC
Loan amount and Loan installment months
A reader, September 01, 2003 - 1:42 am UTC
SQL> CREATE TABLE LOANS( LOAN_NUM NUMBER, LOAN_BEGIN_DATE DATE, LOAN_END_DATE DATE, INSTALLMENT_AMOUNT NUMBER);
Table created.
SQL> INSERT INTO LOANS VALUES( 1, SYSDATE, SYSDATE +365, 1000);
1 row created.
SQL> INSERT INTO LOANS VALUES( 2, SYSDATE + 60, SYSDATE +365, 2000);
1 row created.
SQL> COMMIT;
Commit complete.
SQL> SELECT * FROM LOANS;
LOAN_NUM LOAN_BEGI LOAN_END_ INSTALLMENT_AMOUNT
--------- --------- --------- ---------
1 01-SEP-03 31-AUG-04 1000
2 31-OCT-03 31-AUG-04 2000
We need to display the sum of installment amount for each month for the next 3 years (starting from the current month).
With the given data: 1. we have to show the data for each month and the total of installments
for those months, plus 2. the rest of the months should be shown as zero.
We need to take care of the overlaps, all of the below..
- wider than database
- overlaps but not contained within
- overlaps and contained within
The resulting grid should have summary for each row returned by the following...
select to_char(add_months(sysdate, (rownum-1)),'MM RRRR') dt
from all_objects
where rownum <37
Please give insight into how you verbaged the problem, and how you arrived at the solution, i.e. how you broke it into pieces.
Thanks
What helps
A reader, September 02, 2003 - 11:10 pm UTC
Tom, what helps is the reason why you did not deem it fit to answer the above request.
September 03, 2003 - 6:54 am UTC
didn't seem like sufficient information to give you an answer. i don't really know what "verbiage" you are looking for or really what problem you were trying to solve.
A reader, September 03, 2003 - 8:12 am UTC
select to_char(add_months(sysdate, (rownum-1)),'MM RRRR') dt from all_objects
where rownum <37
The above query returns 36 months forward from the current month , i.e. 09 2003.
For each of those months, I have to display , the total installments due for all the loans put together.
The final output, I foresee will be something like this
09 2003 10,000
10 2003 14,000
11 2003 12,000
12 2003 15,000
01 2004 17,000
.....
......
etc
DATABASE VALUES might be
loan_no loan_begin_date loan_end_date installment
1 01/01/2002 01/01/2004 1,500
2 09/01/2003 01/01/2004 1,500
3 10/01/2003 01/01/2004 1,500
I have to display a grid of months and the total installments
for the next 36 months, starting from the current month.
The above data will show total installments up to 01/01/2004; the rest of the months should come up as zero.
Is this helpful?
A reader, September 03, 2003 - 10:51 am UTC
Do you need more information /explanation of the solution I'm looking for?
September 03, 2003 - 12:05 pm UTC
you know what -- they won't let me use my aircard on a plane. so, when I'm actually not able to be on the net -- I actually don't answer questions. HMMM...
if you have that table T with the months in it -- the one you made with all_objects, you can easily:
select month, (select sum(whatever) from another_table where start_date <= T.month and end_date >= T.month )
from t;
to achieve your report.
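To make the scalar-subquery idea concrete, here is a hypothetical sketch in Python (made-up loans and amounts, not the reader's data): a loan contributes its installment to every month its begin..end range covers.

```python
from datetime import date

loans = [  # (begin, end, installment) -- hypothetical values
    (date(2002, 1, 1), date(2004, 1, 1), 1500),
    (date(2003, 9, 1), date(2004, 1, 1), 1500),
]

def add_months(d, n):
    # first day of the month n months after d
    m = d.month - 1 + n
    return date(d.year + m // 12, m % 12 + 1, 1)

start = date(2003, 9, 1)
report = [
    (add_months(start, i),
     # the "scalar subquery": sum installments of loans covering this month
     sum(amt for b, e, amt in loans if b <= add_months(start, i) <= e))
    for i in range(6)
]
for month, total in report:
    print(month, total)
```

Months outside every loan's range simply sum to zero, which covers the "rest of the months should be shown as zero" requirement.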
Range constraint
Tracy, October 16, 2003 - 10:16 am UTC
If you have two number columns in a table say c1 and c2, where c2 is always higher than c1 and the rows contain ranges eg
c1 c2
1 5
6 20
24 28
36 134
145 148
etc
and you want to add a constraint such that no ranges are contained within others or overlap in anyway, how would you recommend doing it? Given the above data the following inserts would fail:
c1 c2
1 2
18 23
28 30
39 58
but these would be allowed:
c1 c2
21 22
29 34
149 999
etc
October 16, 2003 - 11:09 am UTC
it's really hard and relies on triggers that basically force serialization of modifications to the table and query the results in AFTER triggers to see if everything is OK.
Searching a range of dates on a range of dates
Ray DeBruyn Enterprise Information Systems, October 16, 2003 - 3:19 pm UTC
This is type of query I use to search a range of dates:
select * from t
where c_beginning < p_ending
and c_ending > p_beginning;
It will return any records where the span of c_beginning and c_ending overlaps the span of p_beginning and p_ending. If you want it to return records where the spans of time merely touch (i.e. p_ending = c_beginning), change the conditions to <= and >=.
October 16, 2003 - 5:37 pm UTC
it is still "really hard" as it mandates
a) serialization of modifications to the table and
b) an AFTER trigger that has access to the modified data
I don't understand
John Gilmore, October 17, 2003 - 7:05 am UTC
You are telling Ray that his query "won't catch all overlaps -- only some kinds".
But you then use his same query "select * from t where cb < pe and ce > pb" to retrieve the row representing the ranges 5..10 and 2..11, which you say his query doesn't get.
Am I missing something here?
October 17, 2003 - 10:12 am UTC
you're right -- i was just reading over it too fast last night. sorry about that. it does look right -- especially with the proof below.
and an example of all 4 overlaps:
ops$tkyte@ORA920> create table t ( cb int, ce int, pb int, pe int, msg varchar2(20) );
Table created.
ops$tkyte@ORA920> insert into t values ( 5, 10, 1, 6, 'low end overlap' );
1 row created.
ops$tkyte@ORA920> insert into t values ( 5, 10, 6, 11, 'hi end overlap' );
1 row created.
ops$tkyte@ORA920> insert into t values ( 5, 10, 6, 7, 'interior overlap' );
1 row created.
ops$tkyte@ORA920> insert into t values ( 5, 10, 1, 11, 'exterior overlap' );
1 row created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> select * from t where cb < pe and ce > pb;
CB CE PB PE MSG
---------- ---------- ---------- ---------- --------------------
5 10 1 6 low end overlap
5 10 6 11 hi end overlap
5 10 6 7 interior overlap
5 10 1 11 exterior overlap
shows that -- sorry about that.
A Different Explanation
John Gilmore, October 17, 2003 - 7:33 am UTC
Perhaps the easiest way to think of this problem is to start with the opposite question: "when do two ranges NOT overlap?"
Well, they don't overlap in two, and only two, cases:
o range_1.end < range_2.begin
or
o range_1.begin > range_2.end
Phrased in SQL this becomes
(R1_end < R2_begin
or
R1_begin > R2_end)
We're interested in the inverse of this; that is
not (
R1_end < R2_begin
or
R1_begin > R2_end)
Using Boolean algebra we can transform this to an equivalent expression:
R1_end >= R2_begin
and
R1_begin <= R2_end
which is pretty much what Ray suggested.
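The algebra step can be checked mechanically. An illustrative Python snippet (not from the original post) verifies the transformation exhaustively over small integer ranges:

```python
def no_overlap(b1, e1, b2, e2):
    # the two, and only two, non-overlap cases described above
    return e1 < b2 or b1 > e2

# NOT(no-overlap) must reduce to the two-term AND, by De Morgan
for b1 in range(6):
    for e1 in range(b1, 6):
        for b2 in range(6):
            for e2 in range(b2, 6):
                assert (not no_overlap(b1, e1, b2, e2)) == (e1 >= b2 and b1 <= e2)
print("equivalence holds for all tested ranges")
```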
Difference between two fields, used in the sql statement condition
Jamil, December 15, 2003 - 8:50 am UTC
Dear Sir Tom
Can I make a calculation in the SQL statement and put a condition on the result of the calculation?
For example I have this table contains
J_D J_B
--------------------
10 5
20 15
15 10
10 5
20 15
15 15
10 5
20 15
15 15
My SQL statements as follow :
select J_D,J_B,R
FROM T,(SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) R
WHERE R > 0
So I want the result as follow :
J_D J_B R
------------------------------
10 5 5
20 15 5
15 10 5
10 5 5
20 15 5
11 5 6
20 15 5
How can I do that in select statement? Please give an example details how to solve it .
Best regards
Jamil
December 15, 2003 - 10:06 am UTC
well, (SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) returns a row for every row in T, which doesn't make sense....
not sure what you are trying to do here but
select j_d, j_b, nvl(j_d,0)-nvl(j_b,0)
from t
gets you what you want.
Using the result of the calculation in the where clause
Jamil, December 16, 2003 - 1:18 am UTC
Dear Sir
I want to use the result of the calculation
of this nvl(j_d,0)-nvl(j_b,0) in the where clause
so if I have for example table t contains data as follow :
J_D J_B
-------------------
10 5
20 15
15 10
10 10
15 15
20 20
so I want my query to return the difference between J_D and J_B, and the result of the subtraction should be used in the where clause like this
select J_D,J_B,R
FROM T,(SELECT NVL(J_D,0) - NVL(J_B,0) FROM T) R
WHERE R > 0
So the result of the above query should be as follow :
J_D J_B R
---------------------
10 5 5
20 15 5
15 10 5
How can I use the result of the calculation in the where clause? Or how can I make the result of my query
controlled by the result of the calculation?
Please help with a detailed example, or with any other idea
if it is not possible with the above idea.
Best regards
Jamil
December 16, 2003 - 7:23 am UTC
that is just
select *
from t
where ( j_d <> j_b or
(j_d is null and j_b is not null) or (j_d is not null and j_b is null) )
you can use any computation in the where clause -- but in your case, using that nvl() function would NOT answer the query "show me the rows where they are different". My predicate will.
What about indexing date ranges ?
Christo Kutrovsky, February 18, 2004 - 11:33 pm UTC
Tom,
You have been discussing date range comparison at great length. What about indexing date ranges for performance?
For example, we have a table with a start date and an end date. We would like to be able to pinpoint the rows where a specific moment is in the date range:
select * from <table>
where begin_date <= :moment
and end_date > :moment
What kind of index should we create to do this efficiently?
Creating a combined index (begin_date, end_date) results in a huge range scan on the index from begin_date to the lowest date in the index. Is there any other, more efficient alternative?
February 19, 2004 - 10:26 am UTC
It is a lot like the "spatial" problem with lat/longs. We had to invent a special datatype for that stuff and geocode the data to make it "efficient".
It is a pretty hard problem when you think about it. Suppose you have data:
01-jan-2003 02-jan-2003
01-jan-2003 03-jan-2003
.....
01-jan-2003 31-dec-2003
02-jan-2003 03-jan-2003
02-jan-2003 04-jan-2003
....
02-jan-2003 31-dec-2003
(and so on for each day in 2003), e.g.:
ops$ora920@ORA920PC> create table t ( a date, b date );
Table created.
ops$ora920@ORA920PC>
ops$ora920@ORA920PC> insert into t
2 select a,b
3 from (select to_date('01-jan-2003')+rownum-1 a
4 from all_objects
5 where rownum <= 365 ),
6 (select to_date('01-jan-2003')+rownum-1 b
7 from all_objects
8 where rownum <= 365 )
9 where a < b;
66430 rows created.
Now, say we index like this (using high pctfree to really spread the data out)
ops$ora920@ORA920PC> create index t_idx on t(a,b) pctfree 95;
Index created.
Now, we query:
ops$ora920@ORA920PC> select *
2 from t
3 where a <= to_date( '01-jan-2004' )
4 and b > to_date( '01-jan-2004' )
5 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8306 consistent gets
8305 physical reads
0 redo size
266 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
<b>it had to read the entire index -- to find no data. so it seems that perhaps:
</b>
ops$ora920@ORA920PC> drop index t_idx;
Index dropped.
ops$ora920@ORA920PC> create index t_idx on t(b,a) pctfree 95;
Index created.
ops$ora920@ORA920PC>
ops$ora920@ORA920PC> select *
2 from t
3 where a <= to_date( '01-jan-2004' )
4 and b > to_date( '01-jan-2004' )
5 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
266 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
<b>maybe we just had the index backwards? but no, that is not true all of the time:</b>
ops$ora920@ORA920PC> select *
2 from t
3 where a <= to_date( '01-jan-2002' )
4 and b > to_date( '01-jan-2002' )
5 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
8306 consistent gets
8304 physical reads
0 redo size
266 bytes sent via SQL*Net to client
368 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
<b>it depends on the inputs... sometimes a,b is right -- sometimes b,a is correct (and sometimes FULL_SCAN=true is correct)</b>
Ok, so that does not help you too much yet (just explains why this is "hard"). If this is the "important query to tune", we can do a couple of things.
I could use Oracle Text to index a "function"; this function would take two dates and return a string, something like:
input = 01-jan-2003, 14-mar-2003
012003 022003 032003
then we could:
select * from t where contains( dummy, to_char( :moment, 'mmyyyy' ) ) > 0 and
begin_date <= :moment
and end_date > :moment
that would have us just look at "the months" of interest.
We could also do this with a trigger and an IOT -- sort of a DIY index. You would create an IOT with a date field and either a rowid or a copy of the primary key from the original table.
A trigger would put into this IOT the dates, rowid/pk -- after computing the relevant dates (eg: trunc start date to the month, insert that, add a month, insert that until you have a date that exceeds your end date).
then you can query:
select * from t where rowid in ( select rid from iot where dt = trunc(:moment,'mm')) and begin_date <= :moment
and end_date > :moment;
it'll do sort of the same thing. Of course, you could trunc to the week and add 7 days instead of a month, trunc to the day and add a day - whatever -- to make the index "more selective"
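The month-key expansion the trigger would perform can be sketched in Python (the function name and shapes below are illustrative, not from the thread):

```python
from datetime import date

def month_keys(start, end):
    """First-of-month dates covering [start, end] -- the keys the
    trigger would insert into the DIY index (IOT)."""
    keys = []
    d = date(start.year, start.month, 1)   # like trunc(start_date, 'mm')
    while d <= end:
        keys.append(d)
        # advance one month
        y, m = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
        d = date(y, m, 1)
    return keys

# the 01-jan-2003 .. 14-mar-2003 example yields the Jan, Feb and Mar 2003 keys
print(month_keys(date(2003, 1, 1), date(2003, 3, 14)))
```

Truncating to weeks or days instead of months is the same loop with a smaller step, trading index size for selectivity.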
Exactly
Christo Kutrovsky, February 19, 2004 - 11:21 am UTC
Those are the exact problems I am facing. I did try the index in reverse too. I finally had to take a piece of paper and write a few entries to understand why a range scan from the start value to the end of the entire index was necessary.
You said:
<q>
It is a lot like the "spatial" problem with lat/long's. We had to invent a
special datatype for that stuff and geocode the data to make it "efficient".
</q>
I am not familiar with the solution in lat/long's. Before I dig into it, do you think a similar solution would be possible in this case?
February 19, 2004 - 11:41 am UTC
No, i don't think a similar solution would work -- lats/longs are in a very discrete, bounded, small set.
start_date, end_date is "infinite" in size -- not bounded.
unless you put artificial (or business) constraints on the start/end -- the problem is just that they could be any pair of values with no end points. so the stuff that works for lat/long -- won't work for start/end
A reader, March 19, 2004 - 8:51 am UTC
Slightly different
Sudha Bhagavatula, March 22, 2004 - 1:26 pm UTC
I have 2 tables, contracts and member_providers, each with member_id, eff_date and end_date fields. The table contracts has data like this:
member id := 100
eff_date := 01/01/2003
end_date := 12/31/2003
The table member_providers has data like this:
member_id := 100
eff_date := 01/01/2003
end_date := 06/30/2003
member_id := 100
eff_date := 08/01/2003
end_date := 12/31/2003
I want to identify the break in range in member_providers table compared to the contracts table in a SQL statement without using PL/SQL. Is it possible?
March 22, 2004 - 3:37 pm UTC
you can find breaks using LAG() and LEAD() easily....
do you know about them? (search for
rock roll
on this site -- lots of analytic examples)
very interesting issue
Muhammad Ibrahim, March 22, 2004 - 9:27 pm UTC
CB => c_beginning
PB => parameter start date
PE => parameter end date
CB1,CB2... different start dates in the table t
CE1,CE2... different end dates in the table t.
the situation looks like:
PB PE
|--------------------------|
|-----------|
CB1 CE1
|--------------------------|
CB2 CE2
|----------------|
CB3 CE3
|------------------|
CB4 CE4
|-----|
CB5 CE5
|--------|
CB6 CE6
From the below query
Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) ); -- rule 3
it will return the below date sets:
CB1-CE1
CB2-CE2
CB3-CE3
CB4-CE4
CB5-CE5 -- *
The first four are correct, but I want to eliminate CB5-CE5 because neither the start nor the end date falls in PB-PE.
And also same with CB6-CE6. How can i do this?
Thanks,
Ibrahim.
March 22, 2004 - 9:31 pm UTC
how does that return cb5-ce5 since
cb5 is not between pb and pe
ce5 is not between pb and pe
pe is NOT less than ce5
yes you are correct about CB5, but will it return the new CB7 scenario too?
A reader, March 22, 2004 - 11:09 pm UTC
Thanks for your reply. One more simple question: will the same query return CB7 too?
CB => c_beginning
PB => parameter start date
PE => parameter end date
CB1,CB2... different start dates in the table t
CE1,CE2... different end dates in the table t.
the situation looks like:
PB PE
|--------------------------|
|-----------|
CB1 CE1
|--------------------------|
CB2 CE2
|----------------|
CB3 CE3
|------------------|
CB4 CE4
|-----|
CB5 CE5
|--------|
CB6 CE6
|----------------------------------------------------------|
CB7 CE7
From the below query
Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) ); -- rule 3
it will return the below date sets:
CB1-CE1
CB2-CE2
CB3-CE3
CB4-CE4
CB5-CE5
CB7-CE7 -- ???
If I want to get CB7 also, do I need to add some more conditions? The reason is CB7 is active even though it starts before the PB date and ends after the PE date.
Thanks,
Ibrahim.
March 23, 2004 - 6:57 am UTC
for cb7
rule1 is false
rule2 is false
rule3 is false
(you know, you could load up some data or just evaluate the rules yourself! )
Yes, you would have to accommodate cb7-ce7 if you wanted it; the original problem statement did not want it. It would not be too hard to add.
Thanks for your reply
Muhammad Ibrahim, March 23, 2004 - 8:17 pm UTC
Dear Tom,
Select *
From t
Where ( CB Between PB And PE -- rule 1
Or CE Between PB And PE -- rule 2
Or ( CB < PB And PE < CE ) -- rule 3
Or ( CB < PE And PE < CE ); -- rule 4
Rule 4 can cover the CB7-CE7 case, but I am just wondering whether any other scenario can pass through PB and PE. Just a doubt :-)
Thanks for your kind help to the oracle world.
Regards,
Ibrahim.
March 24, 2004 - 8:27 am UTC
at this point, it is just "logic", a predicate, a where clause.
you define what you want and the db will return it.
if that is what you want, there you go.....
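As an aside, for ranges where CB <= CE and PB <= PE, the original three rules OR'ed together are logically the same as the classic two-term overlap test (CB <= PE and CE >= PB). A small Python sketch, purely illustrative and not from the thread, brute-forces that equivalence over integer ranges:

```python
def rules(cb, ce, pb, pe):
    """The three rules from the thread, OR'ed together."""
    return (pb <= cb <= pe                # rule 1: CB between PB and PE
            or pb <= ce <= pe             # rule 2: CE between PB and PE
            or (cb < pb and pe < ce))     # rule 3: range spans PB..PE

def overlap(cb, ce, pb, pe):
    """Classic two-term overlap test."""
    return cb <= pe and ce >= pb

# brute-force check over all small integer ranges (cb <= ce, pb <= pe)
for cb in range(8):
    for ce in range(cb, 8):
        for pb in range(8):
            for pe in range(pb, 8):
                assert rules(cb, ce, pb, pe) == overlap(cb, ce, pb, pe)
print("rules 1-3 == two-term overlap test")
```

The two-term form is often easier to reason about than a stack of BETWEENs, though as noted above it is all just "logic, a predicate, a where clause" in the end.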
Optimizing joins on start-end ranges
Bobby O, March 26, 2004 - 9:23 pm UTC
Tom,
Can the above advice be extended to range queries for self-joins of large tables, on numeric rather than date values? For data-mining (i.e. not OLAP) queries?
As an example, below, I need to return ids for pairs of start-end records that overlap. I run variants of the query below quite often, and the spatial overlap step is an enormous, recurring bottleneck.
In my case, these are data-mining/data-analysis joins on a system dedicated to these types of joins; pre-calculating materialized views, adding indexes, partitioning, writing pipelined table functions, custom index functions etc. would all be ok as a solution
Thanks,
Bobby O
DETAILS:
--- the query:
SQL> edit
Wrote file afiedt.buf
1 select c1.id as id1, c2.id as id2
2 from chrom_interval c1
3 , chrom_interval c2
4 where c1.cstart < c2.cend
5 and c1.cend > c2.cstart
6 and c1.chrom = c2.chrom
7* and c1.id < c2.id -- id is primary key; report each pair once
---- The table:
SQL> describe chrom_interval;
Name Null? Type
------------------------ -------- ---------------
ID NOT NULL NUMBER
CHROM NOT NULL VARCHAR2(255)
CSTART NOT NULL NUMBER(10)
CEND NOT NULL NUMBER(10)
SQL> select count(*), count(distinct chrom ) from chrom_interval;
COUNT(*) COUNT(DISTINCTCHROM)
---------- --------------------
24644323 41
The (cstart,cend) range of each record is small compared to the max( cstart ) and min( cstart ) for each chrom. E.g. overlaps are infrequent, with each record overlapping ~0-10 other records.
Common query variants often include using the above join but with c1 & c2 from two different tables with the same columns as chrom_interval, or restricting c1 to a range of ids comprising ~0.1-1% of all chrom_interval rows.
March 27, 2004 - 10:33 am UTC
the techniques would apply for numbers as well as dates.
just as you can expand dates out into months and text index them -- you can do so with numbers as well.
but, there our goal was to reduce the search space and your query:
where c1.chrom=c2.chrom
and c1.id < c2.id
would seem to already do that?
If you are looking for overlaps (record "1" overlaps record "2") -- analytics might be best (single pass)....
It will not detect multiple overlaps as yours does (e.g. record 2 and record 3 overlap with record 1 -- you would show
1 2
1 3
2 3
I would show
1 2
2 3
not sure if that matters to you or not...)
ops$tkyte@ORA9IR2> select c1.id as id1, c2.id as id2
2 from t c1
3 , t c2
4 where c1.cstart < c2.cend
5 and c1.cend > c2.cstart
6 and c1.chrom = c2.chrom
7 and c1.id < c2.id
8 /
ID1 ID2
---------- ----------
3 4
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id, next_id
2 from (
3 select id,
4 lead(id) over (partition by chrom order by id) next_id,
5 cend,
6 lead(cstart) over (partition by chrom order by id) next_cstart
7 from t
8 )
9 where next_cstart <= cend
10 /
ID NEXT_ID
---------- ----------
3 4
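The single-pass LEAD() approach above amounts to: sort by id within each chrom, then flag a row whenever the next row starts at or before the current row ends. A Python sketch of the same idea (sample rows are illustrative):

```python
from itertools import groupby
from operator import itemgetter

# sample rows: (id, chrom, cstart, cend)
rows = [(1, 'c1', 0, 5), (2, 'c1', 10, 20), (3, 'c1', 30, 40), (4, 'c1', 35, 50)]

def adjacent_overlaps(rows):
    """Mimics lead() over (partition by chrom order by id):
    report (id, next_id) when the next row's cstart <= this row's cend."""
    out = []
    ordered = sorted(rows, key=itemgetter(1, 0))       # chrom, id
    for _, grp in groupby(ordered, key=itemgetter(1)):  # partition by chrom
        grp = list(grp)
        for cur, nxt in zip(grp, grp[1:]):              # lead(): next row
            if nxt[2] <= cur[3]:                        # next_cstart <= cend
                out.append((cur[0], nxt[0]))
    return out

print(adjacent_overlaps(rows))  # → [(3, 4)]
```

Like the SQL, this only reports each row against its immediate successor, not every overlapping pair.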
getting all overlaps
Bobby O, March 29, 2004 - 3:42 pm UTC
Tom,
thanks. I'm still not sure how to optimize the query in my previous post; I do need all overlaps; e.g.
in the case you mentioned, I need the result:
1 2
1 3
2 3
; the other result, below, would not work.
1 2
2 3
; What is the optimal strategy for greatly accelerating this type of query when comparing two very large tables? what index + what hint, or rather write custom pl/sql, or.... ?
I'm willing to accept a less elegant solution if it is a lot faster. Currently, the likely alternative is dumping the data out of the database & doing the overlap test in perl. The query I noted runs too slow to be useful/acceptable. ( many hours to days). But I would think there is a way where Oracle is the fastest route....?
March 29, 2004 - 4:22 pm UTC
then you need what you asked for - a non-equijoin
the only way to "optimize it" in this case would be to change the question :)
you need to almost cartesian join at the "id" level in order to get the number of rows you want.
You will not get me to believe that anything you write in perl would be faster than anything you could do in SQL or PLSQL. Sorry, but I would not "buy that". If you want to benchmark it - please post the perl and we'll have a look see at the assumptions you've made.
also, whats the output of an autotrace traceonly explain for that query?
Sql autotrace for interval overlap query
bobbyo, April 12, 2004 - 3:11 pm UTC
Tom,
I believe that the sparseness of overlaps should help as a restrictive assumption in the join. To wit, the chrom_interval rows are start-stop intervals along a line, and very few interval-interval pairs result in an overlap.
Hence, shouldn't a sort-merge-like join approach, perhaps with the columns pre-sorted in an index, be much more efficient than a cartesian join?
Thanks,
Bobby O
SQL> edit
Wrote file afiedt.buf
1 select c1.id as id1, c2.id as id2
2 from chrom_interval c1
3 , chrom_interval c2
4 where c1.cstart < c2.cend
5 and c1.cend > c2.cstart
6 and c1.chrom = c2.chrom
7* and c1.id < c2.id -- id is primary key; report each pair once
SQL> /
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=32751 Card=185165443
9 Bytes=92582721950)
1 0 HASH JOIN (Cost=32751 Card=1851654439 Bytes=92582721950)
2 1 TABLE ACCESS (FULL) OF 'CHROM_INTERVAL' (Cost=2522 Card=
24644323 Bytes=616108075)
3 1 TABLE ACCESS (FULL) OF 'CHROM_INTERVAL' (Cost=2522 Card=
24644323 Bytes=616108075)
April 12, 2004 - 7:08 pm UTC
hint it and see.
Hint it
bobbyo, April 14, 2004 - 3:49 pm UTC
What indexes & hints do you recommend ?
April 14, 2004 - 4:15 pm UTC
I don't have any -- I think indexes here would be disastrous to the performance.
index reading such big tables = bad
index probing from such a big table to another big table = really bad idea
Say you take this query:
1 select c1.id as id1, c2.id as id2
2 from chrom_interval c1
3 , chrom_interval c2
4 where c1.cstart < c2.cend
5 and c1.cend > c2.cstart
6 and c1.chrom = c2.chrom
7* and c1.id < c2.id
index chrom_interval(chrom,id,cstart,cend)
we'd have to full scan chrom_interval at least once and for *each row* if we used an index -- do a range scan -- you are talking billions of LIO's at that point.
can we use:
E.g. overlaps are infrequent, with each record
overlapping ~0-10 other records.
to your advantage? if you put a cap on the max for us, we can use analytics in a single pass.
Or, maybe this can work for you -- use analytics to get my small result and then a connect by to explode it out:
create global temporary table gtt (id int, next_id int );
insert into gtt
select id, next_id
from (
select id,
lead(id) over (partition by chrom order by id) next_id,
cend,
lead(cstart) over (partition by chrom order by id) next_cstart
from t
)
where next_cstart <= cend
/
select * from gtt connect by prior next_id = id
/
How to combine range of dates with the same amount
A reader, June 16, 2004 - 8:18 pm UTC
How to combine range of dates with the same amount:
(scripts are below )
I need to combine range of dates
which are
1) having the same amount and are
continuous (start date of next record is end date of current record + 1)
so "only" show the records with min start date and max end date (and not the individual ranges)
or
2) fall between the range of min start date and max end date for the same amount
so "only" show the records with min start date and max end date (and not the individual ranges)
Plus show the remaining records which do not satisfy the criteria.
SELECT * FROM TEST order by 1,2;
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70
01 7/11/2003 1/1/2004 70
02 5/6/2003 1/1/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78
9 rows selected
The final output must be
SELECT * FROM TEST order by 1,2
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78
5 rows needed
my attempts
SELECT *
FROM (SELECT a.*,
LEAD (start_date) OVER (PARTITION BY ID ORDER BY start_date)
lead_sd,
LEAD (amt) OVER (PARTITION BY ID ORDER BY start_date) lead_amt,
MIN (start_date) OVER (PARTITION BY ID ORDER BY amt) min_sd,
MAX (end_date) OVER (PARTITION BY ID ORDER BY amt) max_ed
FROM TEST a)
WHERE end_date = lead_sd - 1 AND amt = lead_amt
id start date end date amt lead sd lead amt min sd max ed
01 1/8/2001 1/2/2002 62 1/3/2002 62 1/11/2000 6/30/2002
01 1/2/2003 7/10/2003 70 7/11/2003 70 1/11/2000 1/1/2004
I couldn't go further; this gives me the min and max for continuous dates for the same price.
how to combine into one result set ?
CREATE TABLE TEST
(
ID VARCHAR2(15 BYTE) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE,
AMT NUMBER(13)
)
NOLOGGING
NOCACHE
NOPARALLEL;
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '07/11/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 70);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/03/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '06/30/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 62);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '07/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 64);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/02/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '07/10/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 70);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/08/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/02/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 62);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'01', TO_Date( '01/11/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/07/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 57);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '01/02/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '12/12/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '05/06/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
INSERT INTO TEST ( ID, START_DATE, END_DATE, AMT ) VALUES (
'02', TO_Date( '05/06/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '12/12/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 78);
COMMIT;
Thanx for all your help
June 17, 2004 - 8:01 am UTC
...
1) having the same amount and are
continuous (start date of next record is end date of current record + 1)
so "only" show the records with min start date and max end date (and not
the individual ranges)
or
2) fall between the range of min start date and max end date for the same
amount
so "only" show the records with min start date and max end date (and not the
individual ranges)
....
if you ask me, #1 and #2 seem to cover the entire set.
every record satisfies #2.
so, there must be something more to it?
How to combine range of dates with the same amount DB version
A reader, June 16, 2004 - 9:30 pm UTC
DB version is 9.2
How to combine : Some clarifications for point 2
A reader, June 16, 2004 - 10:12 pm UTC
wrt " 2) falls between the range of min start date and max end date for the same
amount
so "only" show the records with min start date and max end date ( and not the
indivdual ranges )"
I meant to say that the min start date and max end date above must be in the same record, as in the example below (it is the second record).
Only then show that record.
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
02 5/6/2003 1/1/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78
June 17, 2004 - 9:25 am UTC
sorry, not getting it. or it just sounds like you want min(start), max(end) group by amt.
A reader, June 16, 2004 - 10:25 pm UTC
My corrected attempt:
SELECT *
FROM (SELECT a.*,
LEAD (start_date) OVER (PARTITION BY ID ORDER BY start_date)
lead_sd,
LEAD (amt) OVER (PARTITION BY ID ORDER BY start_date) lead_amt,
MIN (start_date) OVER (PARTITION BY ID,amt ORDER BY amt) min_sd,
MAX (end_date) OVER (PARTITION BY ID,amt ORDER BY amt) max_ed
FROM TEST a)
WHERE end_date = lead_sd - 1 AND amt = lead_amt
ID START_DATE END_DATE AMT LEAD_SD LEAD_AMT MIN_SD MAX_ED
--------------- --------------------- --------------------- -------------- --------------------- --------------------------------------- --------------------- ---------------------
01 1/8/2001 1/2/2002 62 1/3/2002 62 1/8/2001 6/30/2002
01 1/2/2003 7/10/2003 70 7/11/2003 70 1/2/2003 1/1/2004
2 rows selected
Clarifications
A reader, June 17, 2004 - 10:39 am UTC
SELECT * FROM TEST order by 1,2;
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 1/2/2002 62 **
01 1/3/2002 6/30/2002 62 **
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70 ##
01 7/11/2003 1/1/2004 70 ##
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$
9 rows selected
The final output must be
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57
01 1/8/2001 6/30/2002 62 **
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70 ##
02 5/6/2003 12/12/2004 78 $$
Explanation;
for the rows :
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
here start date of second row = end date of first row + 1 and also the amt is the same.
----------------------------------------------------------------->
1/8/2001 1/2/2002
1/3/2002 6/30/2002
I need output as (as long as amts are same) :
for the above set min start and max end dates
01 1/8/2001 6/30/2002 62
for the rows :
01 1/2/2003 7/10/2003 70 ##
01 7/11/2003 1/1/2004 70 ##
the output needed is (as long as amts are same) :
01 1/2/2003 1/1/2004 70 ##
for the rows :
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$
here, if a single record has the start date and end date equal to the min start date and max end date for that amt,
show that record and neglect the rows with dates completely overlapping within that range
------------------------------------------------------------------>
5/6/2003 1/1/2004
1/2/2004 12/12/2004
5/6/2003 12/12/2004 $ show this row only
the output needed is (as long as amts are same)
02 5/6/2003 12/12/2004 78 $$
Please see the matching characters.
So the final output is above cases plus the remainder of the rows
(hence you get rows marked @@@@@ and %%%%%)
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/7/2001 57 @@@@@
01 1/8/2001 6/30/2002 62 **
01 7/1/2002 1/1/2003 64 %%%%%
01 1/2/2003 1/1/2004 70 ##
02 5/6/2003 12/12/2004 78 $$
June 17, 2004 - 12:33 pm UTC
yah, as far as I can see,
select id, min(start_date), max(end_date), amt from t group by id, amt;
is it. as least for your example data that does it.
You want the min start date by amt/id
You want the max start date by amt/id
all of this "prior" and "next" stuff doesn't seem to apply.
More:
A reader, June 17, 2004 - 10:51 am UTC
for the rows :
02 5/6/2003 1/1/2004 78 $$
02 5/6/2003 12/12/2004 78 $$
02 1/2/2004 12/12/2004 78 $$
Assume we also had a row like
02 11/5/2004 12/6/2004 78 $$
here, if a single record has the start date and end date equal to the min start date
and max end date for that amt,
show that record and neglect the rows with dates completely overlapping within that
range
------------------------------------------------------------------>
5/6/2003 1/1/2004
1/2/2004 12/12/2004
11/5/04 12/6/04
5/6/2003 12/12/2004 $ show this
row only
the output needed is (as long as amts are same)
02 5/6/2003 12/12/2004 78 $$
June 17, 2004 - 1:41 pm UTC
yah, so pump that data through my query -- we are still at "max min". I see nothing to indicate otherwise.
5/6/2003 is the min(start_date).
12/12/2004 is the max(end_date).
who cares about that other row?
May be :
A reader, June 17, 2004 - 2:31 pm UTC
I did not put a good test case.
1) I need the max and min dates only for those ranges that
are continous(start_date = prev end_date + 1) and have the same amt.
So that I can unify those records in a single range with min sd and max ed values.
This min max row will be a computed row representing those ranges.
Your query will give me wrong results for the data set:
select * from test order by 1,2
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/5/2001 62
01 1/8/2001 1/2/2002 62
01 1/3/2002 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 7/10/2003 70
01 7/11/2003 1/1/2004 70
02 5/6/2003 1/4/2004 78
02 5/6/2003 12/12/2004 78
02 1/2/2004 12/12/2004 78
9 rows selected
It will give me :
select id, min(start_date), max(end_date), amt from test group by id, amt
ID MIN(START_DATE) MAX(END_DATE) AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 6/30/2002 62 -- does not consider breaks in range
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78
4 rows selected
What is required is :
ID START_DATE END_DATE AMT
--------------- --------------------- --------------------- --------------
01 1/11/2000 1/5/2001 62 -- this comes because of break in range.
01 1/8/2001 6/30/2002 62
01 7/1/2002 1/1/2003 64
01 1/2/2003 1/1/2004 70
02 5/6/2003 12/12/2004 78
Therefore I used analytics to get dates which are continuous and have the same amt, and then get their min SD and max ED.
Basically the requirement is: as long as ranges are continuous and have the same amount, get the min SD and max ED,
and do not show the individual ranges for them.
Also, even if a range is not continuous but the rows fall within
the range of such a row which happens to have the
min SD and max ED for that amt, then show that and only that row (see id = 02 in this case).
So the min and max should either be in the same row for that amt, or,
only if the range is "continuous", are the min and max to be computed across the various rows.
Any clues to this?
Thanx for your help
June 17, 2004 - 3:03 pm UTC
ops$tkyte@ORA9IR2> select id, min(start_date), max(end_date), amt
2 from (
3 select id, start_date, end_date, amt,
4 max(grp) over (partition by id,amt order by end_date) grp
5 from (
6 select id, start_date, end_date, amt,
7 case when lag(end_date) over (partition by id, amt order by end_date ) <> start_date-1
8 or
9 row_number() over (partition by id, amt order by end_date) = 1
10 then row_number() over (partition by id, amt order by end_date)
11 end grp
12 from t
13 )
14 )
15 group by id, amt, grp
16 /
ID MIN(START_ MAX(END_DA AMT
---------- ---------- ---------- ----------
1 01/11/2000 01/05/2001 62
1 01/08/2001 06/30/2002 62
1 07/01/2002 01/01/2003 64
1 01/02/2003 01/01/2004 70
2 05/06/2003 01/04/2004 78
2 05/06/2003 12/12/2004 78
6 rows selected.
run the queries from the inside out to see what I did. just marked the beginning of groups with row_number, then carried this grp down the rows, then we just group by it.
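The same mark-and-carry technique reads naturally in Python: start a new group whenever the previous end date (within id, amt) is not exactly one day before the current start, then aggregate per group. A sketch with illustrative data:

```python
from datetime import date, timedelta
from itertools import groupby
from operator import itemgetter

# illustrative rows: (id, start_date, end_date, amt)
rows = [
    ('01', date(2001, 1, 8), date(2002, 1, 2),  62),
    ('01', date(2002, 1, 3), date(2002, 6, 30), 62),
    ('01', date(2002, 7, 1), date(2003, 1, 1),  64),
]

def merge_continuous(rows):
    """Collapse runs where start == previous end + 1 day within (id, amt),
    keeping each run's min start and max end -- the 'grp' trick in Python."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[3], r[2]))  # id, amt, end_date
    for (rid, amt), grp in groupby(ordered, key=itemgetter(0, 3)):
        run_start = run_end = None
        for _, s, e, _ in grp:
            if run_end is not None and s == run_end + timedelta(days=1):
                run_end = e                      # continuous: extend the run
            else:
                if run_start is not None:
                    out.append((rid, run_start, run_end, amt))
                run_start, run_end = s, e        # break: start a new group
        out.append((rid, run_start, run_end, amt))
    return sorted(out, key=lambda r: (r[0], r[1]))

for row in merge_continuous(rows):
    print(row)
```

Here the `run_start is not None` check plays the role of the `row_number() = 1` marker, and extending `run_end` is the carried-down `max(grp)`.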
A Problem with
A reader, June 17, 2004 - 3:44 pm UTC
Great Thanx for your help sir.
But there is a problem with the query:
The results:
ID MIN(START_ MAX(END_DA AMT
---------- ---------- ---------- ----------
1 01/11/2000 01/05/2001 62
1 01/08/2001 06/30/2002 62
1 07/01/2002 01/01/2003 64
1 01/02/2003 01/01/2004 70
2 05/06/2003 01/04/2004 78 *****
2 05/06/2003 12/12/2004 78
The **** one should not show up as it falls within the
range of 5/6/2003 and 12/12/2004 (which is a single row in the data set ) and has the same amt.
How to eliminate such rows ?
The second requirement was: "
Also, even if a range is not continuous but the rows fall within
the range of such a row which happens to have the
min SD and max ED for that amt, then show that and only that "max min" row (see id = 02 in
this case)."
-------------------------->
5/6/2003 1/4/2004
1/2/2004 12/12/2004
5/6/2003 12/12/2004
June 17, 2004 - 6:25 pm UTC
probably not going to happen in sql - you would have to compare that row to every other grouped row inside of that set....
and it gets really wicked if you have something like:
2 05/06/2003 01/04/2004 78
2 05/06/2003 01/04/2004 78
since pretty much every piece of sql logic we could apply to that would remove both rows.
so I don't see it happening.
Any directions ?
A reader, June 19, 2004 - 9:00 pm UTC
Great thanx sir for your help.
Any directions or algorithm to implement in PL/SQL?
I tried with PL/SQL itself; the logic gets very, very complicated.
Was thinking of opening 2 cursors.
c1 is select * from t order by id ,start_date
c2(p_id number,p_amt number) is select * from t where
id = p_id and amt = p_amt.
June 20, 2004 - 10:18 am UTC
not really, you'll need to basically get all of the records for a given group and compare each record in that group to every other one -- looking for "overlaps". sort of like a cartesian join.
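That per-group pairwise comparison can be sketched in Python; it is quadratic within each group, which is fine when each key has at most a handful of rows (names and data are illustrative):

```python
def pairwise_overlaps(group):
    """group: list of (start, end) ranges for one key.
    Returns index pairs (i, j), i < j, whose ranges overlap --
    the 'compare each record to every other one' step."""
    hits = []
    for i in range(len(group)):
        for j in range(i + 1, len(group)):
            s1, e1 = group[i]
            s2, e2 = group[j]
            if s1 <= e2 and e1 >= s2:   # two-term overlap test
                hits.append((i, j))
    return hits

print(pairwise_overlaps([(1, 5), (4, 10), (11, 20)]))  # → [(0, 1)]
```

This is exactly the small cartesian join per group that the reply describes, just written out procedurally.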
date in the where clause
A reader, August 12, 2004 - 12:19 pm UTC
hi tom
in the where clause I have something like:
date_column >= sysdate - :no_of_days
where 10 is the number of days...
Is there a better way of writing this? I mean I think I have seen some optimizations you suggest for clauses involving date between dat1 and date2...does any of that
apply here?
thanx!
August 12, 2004 - 12:37 pm UTC
that is the right way to do that. you want to avoid performing functions on database columns and do them on constants -- as you are.
Days in overlapping/non-overlapping dates ...
VKOUL, November 05, 2004 - 9:02 pm UTC
SQL> create table dt(from_date date, to_date date);
Table created.
SQL>
SQL> insert into dt values('01-jan-2004', '05-jan-2004');
1 row created.
SQL> insert into dt values('04-jan-2004', '10-jan-2004');
1 row created.
SQL> insert into dt values('11-jan-2004', '20-jan-2004');
1 row created.
SQL>
SQL> insert into dt values('19-jan-2004', '24-jan-2004');
1 row created.
SQL> insert into dt values('26-jan-2004', '31-jan-2004');
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> select * from dt;
FROM_DATE TO_DATE
--------- ---------
01-JAN-04 05-JAN-04
04-JAN-04 10-JAN-04
11-JAN-04 20-JAN-04
19-JAN-04 24-JAN-04
26-JAN-04 31-JAN-04
SQL>
Q : Is it possible to get total number of days in this set counting the overlapping period once only
i.e. Between 01-JAN-04 & 05-JAN-04, there are 5 days
Between 04-JAN-04 & 10-JAN-04, there are 7 days
But, 04-JAN & 05-JAN are common, so the total number of days will be effectively 10.
Thanks
November 06, 2004 - 10:27 am UTC
will you have
01-jan-04 05-jan-04
03-jan-04 10-jan-04
04-jan-04 11-jan-04
as well -- is it only contiguous records that overlap, or could these overlaps span tons of records?
Days in overlapping/non-overlapping dates ...
VKOUL, November 07, 2004 - 1:01 pm UTC
Yes, they will be, but these records are grouped by a key; on a per-key basis these records will be finite, say at most 20 overlapping/non-overlapping.
November 07, 2004 - 3:00 pm UTC
are you saying "by key", at most 20 records?
VKOUL, November 07, 2004 - 5:45 pm UTC
Yes, that is true
November 08, 2004 - 9:40 am UTC
ops$tkyte@ORA9IR2> select min(from_date) from_date,max(to_date) to_date, max(to_date)-min(from_date)+1 days, grp
2 from (
3 select from_date, to_date, max(grp) over (order by to_date) grp
4 from (
5 select from_date, to_date,
6 lag(to_date) over (order by to_date) lag_to_date,
7 case when nvl( lag(to_date) over (order by to_date),from_date-1)
8 < from_date then row_number() over (order by to_date) end grp
9 from dt
10 )
11 )
12 group by grp
13 /
FROM_DATE TO_DATE DAYS GRP
--------- --------- ---------- ----------
01-JAN-04 10-JAN-04 10 1
11-JAN-04 24-JAN-04 14 3
26-JAN-04 31-JAN-04 6 5
ops$tkyte@ORA9IR2> select sum((to_date-from_date)+1)
2 from (
3 select min(from_date) from_date,max(to_date) to_date,grp
4 from (
5 select from_date, to_date, max(grp) over (order by to_date) grp
6 from (
7 select from_date, to_date,
8 lag(to_date) over (order by to_date) lag_to_date,
9 case when nvl( lag(to_date) over (order by to_date),from_date-1)
10 < from_date then row_number() over (order by to_date) end grp
11 from dt
12 )
13 )
14 group by grp
15 )
16 /
SUM((TO_DATE-FROM_DATE)+1)
--------------------------
30
ops$tkyte@ORA9IR2>
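The same computation reads simply in Python: sort by start date, carry the running group end forward, start a new group when a row begins after it, then sum (end - start + 1) per merged group. Data mirrors the dt rows above:

```python
from datetime import date

ranges = [  # the dt rows above: (from_date, to_date)
    (date(2004, 1, 1),  date(2004, 1, 5)),
    (date(2004, 1, 4),  date(2004, 1, 10)),
    (date(2004, 1, 11), date(2004, 1, 20)),
    (date(2004, 1, 19), date(2004, 1, 24)),
    (date(2004, 1, 26), date(2004, 1, 31)),
]

def total_days(ranges):
    """Merge overlapping ranges, then sum inclusive day counts,
    so overlapping periods are counted once."""
    merged = []
    for s, e in sorted(ranges):
        if merged and s <= merged[-1][1]:            # overlaps current group
            merged[-1][1] = max(merged[-1][1], e)    # extend its end
        else:
            merged.append([s, e])                    # start a new group
    return sum((e - s).days + 1 for s, e in merged)

print(total_days(ranges))  # → 30
```

The `s <= merged[-1][1]` test plays the role of the lag(to_date) comparison in the SQL; either way the three merged groups (10 + 14 + 6 days) total 30.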
VKOUL, November 08, 2004 - 4:33 pm UTC
Thanks Tom,
it works great.
I am going to test it with a couple more scenarios.
SUM(interval) work?
Joseph, November 08, 2004 - 6:23 pm UTC
Tom
Thanks for your all great insights. If you can tackle this, I'd appreciate.
I have a table that contains two timestamp columns (START and STOP). Each record is basically a log of a batch job. I tried
SELECT SUM(END_TIME-START_TIME) total_time
FROM jobs
WHERE TRANS_PROC_ID > 12886;
I got ORA-00932: INCONSISTENT DATA TYPES: expected number got interval.
Can we aggregate interval data? How?
Thanks
Joseph
November 08, 2004 - 9:25 pm UTC
last granular detail of time we can store
A reader, November 09, 2004 - 2:51 am UTC
Tom, what is the most granular unit of time we can store in the Oracle database, and what is its format?
Thanks
November 09, 2004 - 8:50 am UTC
ops$tkyte@ORA9IR2> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------------------------
09-NOV-04 08.22.27.351007 AM -05:00
read about TIMESTAMP in the Concepts and SQL Reference Guide.
Leading Question
Robert, November 29, 2004 - 6:36 pm UTC
Tom,
Is there a technique which you commonly use to fill in the last lead() with the next value in a table?
For example, I would like to return the value of P200403 for the last row in this query.
test9.2> l
1 select part_name,
2 high_values,
3 lead_part
4 from (select A.partition_na part_name,
5 A.partition_key_c high_values,
6 lead(partition_na) over (order by partition_na) lead_part
7 from my_table_partitions A
8 where A.table_owner_na = 'TEST'
9 and A.table_na = 'FACT_TABLE'
10 and exists (select NULL
11 from dimension_table B
12* where B.system_calendar_key_n = A.partition_key_c))
test9.2> /
PART_NAME HIGH_VALUES LEAD_PART
------------- --------------- ----------
P200350 200351 P200351
P200351 200352 P200352
P200352 200401 P200401
P200401 200402 P200402
P200402 200403
5 rows selected.
Thanks,
Dale
November 29, 2004 - 7:12 pm UTC
nvl( lead()...., partition_key_c )
you just want partition_key_c when lead is null right?
Thanks!!!
Robert D. Ware, December 01, 2004 - 10:17 am UTC
Tom,
Again you were right on the money.
NVL(lead(partition_na) over (order by partition_na),'P'||partition_key_c) part_name
This gave me exactly what I needed.
Thanks again!
Hour wise Max number of logged in Users
Vaishnavi, December 19, 2004 - 2:29 am UTC
Hi Tom,
Thank you very much for your valuable inputs. I need to find out the hour-wise max number of logged-in users from a table.
SQL> create table t (login date, logout date);
Table created.
SQL> insert into t values(trunc(sysdate) + (8/24), trunc(sysdate) + (11/24) + (15/(24*60)));
1 row created.
SQL> insert into t values(trunc(sysdate) + (9/24), trunc(sysdate) + (11/24) + (15/(24*60)));
1 row created.
SQL> insert into t values(trunc(sysdate) + (10/24), trunc(sysdate) + (11/24) + (15/(24*60)));
1 row created.
SQL> insert into t values(trunc(sysdate) + (11/24), trunc(sysdate) + (11/24) + (15/(24*60)));
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> select to_char(login, 'DD-MON-RRRR HH24:MI:SS'), to_char(logout, 'DD-MON-RRRR HH24:MI:SS') from t;
TO_CHAR(login,'DD-MO TO_CHAR(logout,'DD-M
-------------------- --------------------
19-DEC-2004 08:00:00 19-DEC-2004 11:15:00
19-DEC-2004 09:00:00 19-DEC-2004 11:15:00
19-DEC-2004 10:00:00 19-DEC-2004 11:15:00
19-DEC-2004 11:00:00 19-DEC-2004 11:15:00
Now, grouped by each hour, I want to find the max number of users connected.
The output I am expecting is:
19-DEC-2004 08:00:00 1
19-DEC-2004 09:00:00 2
19-DEC-2004 10:00:00 3
19-DEC-2004 11:00:00 4
For a single day, half a million records will be added to this table, and I need to pull the above information at the end of each day.
Can I get this output in a single query, or do I need to write a function that returns the count, with HH24 as the input?
Please suggest.
Sincerely
Vaishnavi
December 19, 2004 - 11:24 am UTC
sort of remotely related to the original question, so...
we'll need a table of hours to join to -- we need to "multiply" a single row out into many rows:
ops$tkyte@ORA9IR2> create table hours
2 as
3 select rownum hh24
4 from all_objects
5 where rownum <= 24;
Table created.
Elapsed: 00:00:00.07
ops$tkyte@ORA9IR2> variable x varchar2(20)
ops$tkyte@ORA9IR2> exec :x := to_char(sysdate,'dd-mon-yyyy')
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select b.hh24, count(*)
2 from (
3 select to_number(to_char(login,'hh24')) login,
4 to_number(to_char(logout,'hh24')) logout
5 from t
6 where logout >= to_date( :x, 'dd-mon-yyyy' )
7 and login < to_date( :x, 'dd-mon-yyyy' )+1
8 ) a,
9 hours b
10 where b.hh24 between login and logout
11 group by b.hh24
12 /
HH24 COUNT(*)
---------- ----------
8 1
9 2
10 3
11 4
<b>that is the output we want -- now to test it "bigger"</b>
ops$tkyte@ORA9IR2> insert /*+ APPEND */ into t
2 select trunc(sysdate)+mod(rownum,80000)/24/60/60,
3 trunc(sysdate)+mod(rownum,80000)/24/60/60 + (mod(rownum,5)+1)*1/24
4 from big_table.big_table
5 where rownum <= 500000;
500000 rows created.
ops$tkyte@ORA9IR2> commit;
Commit complete.
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'HOURS' );
PL/SQL procedure successfully completed.
ops$tkyte@ORA9IR2> set timing on
ops$tkyte@ORA9IR2> select b.hh24, count(*)
2 from (
3 select to_number(to_char(login,'hh24')) login,
4 to_number(to_char(logout,'hh24')) logout
5 from t
6 where logout >= to_date( :x, 'dd-mon-yyyy' )
7 and login < to_date( :x, 'dd-mon-yyyy' )+1
8 ) a,
9 hours b
10 where b.hh24 between login and logout
11 group by b.hh24
12 order by 2
13 /
HH24 COUNT(*)
---------- ----------
23 18240
22 39840
1 50399
21 60480
2 70560
20 73440
19 82080
3 85680
18 86400
17 86400
16 86400
15 86400
14 86400
13 86400
12 86400
11 86404
10 86803
9 87922
8 89761
7 92320
6 95601
4 95760
5 99201
23 rows selected.
Elapsed: 00:00:10.97
ops$tkyte@ORA9IR2>
Looking for a better solution than this!
Vaishnavi, December 19, 2004 - 2:41 am UTC
Hi Tom,
Sorry!! After giving it some serious thought, I came up with this solution. But it will do a cartesian product.
select trunc(a.login, 'HH24'), sum(case when b.logout > a.login and b.login <= a.login then 1 else 0 end) f_max_number
from t a, t b
group by trunc(a.login, 'HH24');
But this query will do a cartesian product across half a million records. :( Please help.
Thanks
Vaishnavi
two things for Vaishnavi to investigate/adjust ...
Gabe, December 20, 2004 - 2:50 pm UTC
The solution to the "number of users per hour" question won't work very well for:
1. users with open connections for more than 24 hours
2. users having an open connection when the query is executed (logout is null)
December 20, 2004 - 3:41 pm UTC
1) true, it'll skip em by design
Thank you Gabe
Vaishnavi, December 21, 2004 - 9:58 am UTC
Thank you Gabe. I am handling that. Thank you for your interest.
Sincerely
Vaishnavi
date comparison (8.1.7)
A reader, January 04, 2005 - 5:55 pm UTC
hi tom,
I want to compare two dates including the hours part of the time.
so basically let us say to_day '04-Jan-2005' (sysdate)
and yesterday '03-Jan-2005' (sysdate-1)
I want to count all the inserted records from
select * from t
where 1 = 1
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
>= to_date('03-Jan-2005 16:00:00','dd-Mon-yyyy hh24:mi:ss')
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
< to_date('04-Jan-2005 16:00:00','dd-Mon-yyyy hh24:mi:ss')
works,
but how to do this with sysdate ?
select * from t
where 1 = 1
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
>= to_date('(sysdate -1) 16:00:00','dd-Mon-yyyy hh24:mi:ss')
and to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
< to_date('sysdate 16:00:00','dd-Mon-yyyy hh24:mi:ss')
please explain
January 05, 2005 - 9:10 am UTC
why are you using:
to_date(t.mydate,'dd-Mon-yyyy HH24:mi:ss')
please don't tell me that mydate is a string, that would be so wrong, so so so SOOOOO wrong. It cannot be true. It must be a date
hence, it would simply be:
select * from t
where mydate between sysdate-1 and sysdate;
I've no idea what you are attempting to do with:
to_date('sysdate 16:00:00','dd-Mon-yyyy hh24:mi:ss')
at all? it doesn't make sense
it is a date.
A reader, January 05, 2005 - 10:42 am UTC
I just want to count all records in the table which were inserted/updated from
18:00:00 hrs of (sysdate - 1)
to 18:00:00 hrs of sysdate
-- a 24-hour window from 6 pm (yesterday) to 6 pm (today)
January 05, 2005 - 11:06 am UTC
where date_column between trunc(sysdate)-1+18/24
and trunc(sysdate)+18/24
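The arithmetic behind that window (truncate to midnight, add 18 hours, back up one day for the start) can be sketched in Python with a hypothetical "now":

```python
from datetime import datetime, timedelta, time

# Sketch of the window arithmetic behind
#   between trunc(sysdate)-1+18/24 and trunc(sysdate)+18/24
# "now" is a hypothetical current timestamp.
now = datetime(2005, 1, 5, 11, 6)
midnight = datetime.combine(now.date(), time())   # trunc(sysdate)
window_end = midnight + timedelta(hours=18)       # today at 18:00
window_start = window_end - timedelta(days=1)     # yesterday at 18:00

print(window_start, window_end)
```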
cool, thanks
A reader, January 05, 2005 - 11:57 am UTC
Thanks, tom
number of days that overlap between two date ranges
robin, January 14, 2005 - 1:20 pm UTC
Is it possible to modify the above code to count the number of days the two date ranges overlap?
January 14, 2005 - 8:36 pm UTC
ops$tkyte@ORA9IR2> create table t ( x int, pb date, cb date, ce date, pe date );
Table created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into t
2 select rownum, null, sysdate-5, sysdate+5, null
3 from all_users
4 where rownum <= 6
5 /
6 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=ce+1 where x=1;
1 row updated.
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=ce-1 where x=2;
1 row updated.
ops$tkyte@ORA9IR2> update t set pb = cb+1, pe=ce+1 where x=3;
1 row updated.
ops$tkyte@ORA9IR2> update t set pb = cb+1, pe=ce-1 where x=4;
1 row updated.
ops$tkyte@ORA9IR2> update t set pb = cb-1, pe=cb-1/2 where x=5;
1 row updated.
ops$tkyte@ORA9IR2> update t set pb = ce+1/2, pe=ce+1 where x=6;
1 row updated.
ops$tkyte@ORA9IR2> select t.*,
2 case when (cb <= pe) AND (ce >= pb)
3 then '<='
4 end,
5 greatest( cb, pb ),
6 least( ce, pe ),
7 case when (cb <= pe) AND (ce >= pb)
8 then least( ce, pe )-greatest( cb, pb )
9 end days
10 from t
11 /
X PB CB CE PE CA GREATEST( LEAST(CE, DAYS
-- --------- --------- --------- --------- -- --------- --------- ----
1 08-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05 <= 09-JAN-05 19-JAN-05 10
2 08-JAN-05 09-JAN-05 19-JAN-05 18-JAN-05 <= 09-JAN-05 18-JAN-05 9
3 10-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05 <= 10-JAN-05 19-JAN-05 9
4 10-JAN-05 09-JAN-05 19-JAN-05 18-JAN-05 <= 10-JAN-05 18-JAN-05 8
5 08-JAN-05 09-JAN-05 19-JAN-05 09-JAN-05 09-JAN-05 09-JAN-05
6 20-JAN-05 09-JAN-05 19-JAN-05 20-JAN-05 20-JAN-05 19-JAN-05
6 rows selected.
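The rule in that query -- ranges overlap when cb <= pe and ce >= pb, and the overlap length is least(ce, pe) - greatest(cb, pb) -- can be modeled outside SQL as well. A Python sketch, checked against row x=1 above:

```python
from datetime import date

def overlap_days(cb, ce, pb, pe):
    """Days of overlap between [cb, ce] and [pb, pe]; None if disjoint.
    Mirrors: case when cb <= pe and ce >= pb
             then least(ce, pe) - greatest(cb, pb) end"""
    if cb <= pe and ce >= pb:
        return (min(ce, pe) - max(cb, pb)).days
    return None

# Row x=1: PB=08-JAN, CB=09-JAN, CE=19-JAN, PE=20-JAN -> 10 days
print(overlap_days(date(2005, 1, 9), date(2005, 1, 19),
                   date(2005, 1, 8), date(2005, 1, 20)))  # 10
```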
Between date ranges
chandra, January 19, 2005 - 11:36 pm UTC
Tom,
How do I calculate how many times users logged in between the given dates (at least once)?
lets say I have a table with users and date
date users
---------- -----
10/28/2004 userA
10/28/2004 userB
10/29/2004 userA
11/01/2004 userc ----------- just once
12/20/2004 userA
12/28/2004 userA
12/28/2004 userB
10/29/2004 userA
01/01/2005 userd ----------- just once
01/10/2005 userA
The max(date) is 01/10/2005 and min(date) is 10/28/2004
How do I calculate how many times users logged in between the min(date) and max(date)?
The output required should be:
count(users)               days logged
2 (i.e. userC and userD)   1    -- 2 users logged in at least once between the min(date) and max(date)
1 (i.e. userA)             6    -- 1 user logged into the system 6 days
1 (i.e. userB)             2    -- 1 user logged into the system 2 days
The logging into the system need not be continuous, i.e. the user can log in Monday and the next login can be after a week, and so on.
Thankyou very much
chandra
January 20, 2005 - 10:19 am UTC
select users, count(*) from t where dt between :a and :b group by users;
Net Days In Service
VKOUL, February 03, 2005 - 8:49 pm UTC
SQL> create table dt_net(from_date date, to_date date, type varchar2(10));
Table created.
SQL> insert into dt_net values('01-jan-2004', '05-jan-2004', 'ACTIVE');
1 row created.
SQL> insert into dt_net values('04-jan-2004', '10-jan-2004', 'HOLD');
1 row created.
SQL> insert into dt_net values('11-jan-2004', '18-jan-2004', 'ACTIVE');
1 row created.
SQL> insert into dt_net values('17-jan-2004', '20-jan-2004', 'HOLD');
1 row created.
SQL> insert into dt_net values('19-jan-2004', '24-jan-2004', 'ACTIVE');
1 row created.
SQL> insert into dt_net values('26-jan-2004', '31-jan-2004', 'ACTIVE');
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> select * from dt_net
2 /
Press Enter ...
FROM_DATE TO_DATE TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE --
04-JAN-04 10-JAN-04 HOLD -- Effectively 5 days of hold
11-JAN-04 18-JAN-04 ACTIVE
17-JAN-04 20-JAN-04 HOLD -- Effectively 0 days of hold
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE
6 rows selected.
SQL> DISC
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
SQL>
Output I am expecting is : 25 days of activity.
i.e. (30 (Total active)- 5(Total hold))
Thanks
February 04, 2005 - 2:00 am UTC
look into lag and lead. lets you look back and forth a row or two or N....
ops$tkyte@ORA9IR2> select from_date, to_date, type,
2 lag(to_date) over (order by from_date) last_to_date
3 from dt_net;
FROM_DATE TO_DATE TYPE LAST_TO_D
--------- --------- ---------- ---------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD 05-JAN-04
11-JAN-04 18-JAN-04 ACTIVE 10-JAN-04
17-JAN-04 20-JAN-04 HOLD 18-JAN-04
19-JAN-04 24-JAN-04 ACTIVE 20-JAN-04
26-JAN-04 31-JAN-04 ACTIVE 24-JAN-04
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select from_date,
2 case when type = 'ACTIVE'
3 then to_date
4 else greatest(to_date,last_to_date)
5 end new_to_date,
6 type
7 from (
8 select from_date, to_date, type,
9 lag(to_date) over (order by from_date) last_to_date
10 from dt_net
11 )
12 /
FROM_DATE NEW_TO_DA TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD
11-JAN-04 18-JAN-04 ACTIVE
17-JAN-04 20-JAN-04 HOLD
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE
ops$tkyte@ORA9IR2> edit test
ops$tkyte@ORA9IR2> @test
ops$tkyte@ORA9IR2> drop table dt_net;
ops$tkyte@ORA9IR2> create table dt_net(from_date date, to_date date, type varchar2(10));
ops$tkyte@ORA9IR2> insert into dt_net values('01-jan-2004', '05-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('04-jan-2004', '10-jan-2004', 'HOLD');
ops$tkyte@ORA9IR2> insert into dt_net values('11-jan-2004', '18-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('17-jan-2004', '20-jan-2004', 'HOLD');
ops$tkyte@ORA9IR2> insert into dt_net values('19-jan-2004', '24-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2> insert into dt_net values('26-jan-2004', '31-jan-2004', 'ACTIVE');
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select from_date, to_date, type,
2 lag(to_date) over (order by from_date) last_to_date
3 from dt_net;
FROM_DATE TO_DATE TYPE LAST_TO_D
--------- --------- ---------- ---------
01-JAN-04 05-JAN-04 ACTIVE
04-JAN-04 10-JAN-04 HOLD 05-JAN-04
11-JAN-04 18-JAN-04 ACTIVE 10-JAN-04
17-JAN-04 20-JAN-04 HOLD 18-JAN-04
19-JAN-04 24-JAN-04 ACTIVE 20-JAN-04
26-JAN-04 31-JAN-04 ACTIVE 24-JAN-04
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select case when type = 'ACTIVE'
2 then from_date
3 else greatest(from_date,last_to_date)
4 end new_to_date,
5 to_date,
6 type
7 from (
8 select from_date, to_date, type,
9 lag(to_date) over (order by from_date) last_to_date
10 from dt_net
11 )
12 /
NEW_TO_DA TO_DATE TYPE
--------- --------- ----------
01-JAN-04 05-JAN-04 ACTIVE
05-JAN-04 10-JAN-04 HOLD
11-JAN-04 18-JAN-04 ACTIVE
18-JAN-04 20-JAN-04 HOLD
19-JAN-04 24-JAN-04 ACTIVE
26-JAN-04 31-JAN-04 ACTIVE
VKOUL, February 06, 2005 - 12:28 pm UTC
Yes, I think I'll be able to incorporate this logic with some modification.
Thanks
Missing Dates
Tracy, February 11, 2005 - 12:31 pm UTC
Two of the columns in my table are FromDate and ToDate.
In a given row FromDate is always less than ToDate.
Rows may have date ranges that overlap date ranges in other rows (or contain them completely).
I want to get a list of all dates that aren't in any row between the FromDate and the ToDate (starting with the earliest FromDate and finishing with the latest ToDate).
Let's say the table contains these rows:
FromDate ToDate
01-JAN-03 07-JAN-03
08-JAN-03 21-JAN-03
15-JAN-03 26-JAN-03
29-JAN-03 23-MAR-03
24-MAR-03 28-MAR-03
23-FEB-03 15-APR-03
18-APR-03 25-APR-03
I want my sql to return
27-jan-03
28-jan-03
16-apr-03
17-apr-03
as these are the only dates between 01-jan-03 (earliest FromDate) and 25-apr-03 (latest ToDate) that aren't between the dates in any row, i.e. they are 'missing'.
February 12, 2005 - 8:16 am UTC
you need to generate a set of dates, something like:
select (select min(fromdate) from t)+rownum-1 dt
from all_objects
where rownum <= (select max(todate) from t)-(select min(fromdate) from t)+1
(assuming the number of dates is small enough -- all_objects has about 30-40,000 rows in it typically these days. ANY table with enough rows will do)
Then, use that
with Q as (that_query)
select *
from q
where not exists (select null from t where fromdate <= q.dt and todate >= q.dt)
for example..
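As a cross-check, the same generate-and-anti-join idea modeled in Python with the seven sample rows:

```python
from datetime import date, timedelta

# The seven FromDate/ToDate rows from the example above.
ranges = [
    (date(2003, 1, 1),  date(2003, 1, 7)),
    (date(2003, 1, 8),  date(2003, 1, 21)),
    (date(2003, 1, 15), date(2003, 1, 26)),
    (date(2003, 1, 29), date(2003, 3, 23)),
    (date(2003, 3, 24), date(2003, 3, 28)),
    (date(2003, 2, 23), date(2003, 4, 15)),
    (date(2003, 4, 18), date(2003, 4, 25)),
]

lo = min(f for f, t in ranges)   # earliest FromDate
hi = max(t for f, t in ranges)   # latest ToDate

# Generate every date in [lo, hi], keep those covered by no range --
# the same NOT EXISTS anti-join as the SQL.
missing = [lo + timedelta(days=n)
           for n in range((hi - lo).days + 1)
           if not any(f <= lo + timedelta(days=n) <= t for f, t in ranges)]
print(missing)
```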
Traversing a GAP in data
Robert, March 23, 2005 - 1:41 pm UTC
Tom,
First I would like to thank you for all that you do for the Oracle community.
We have a requirement to report the average of 4 values however, the twist is that we need to detect gaps in the data.
To illustrate this I will use the demo table salgrade with the following modifications.
update SALGRADE set hisal = 5000 where grade = 5;
insert into salgrade values (7,6001,6500);
insert into salgrade values (8,6501,7000);
insert into salgrade values (9,7001,8000);
insert into salgrade values (10,8001,9999);
commit;
Here is what I am starting with:
select * from SALGRADE order by 1;
GRADE LOSAL HISAL
---------- ---------- ----------
1 700 1200
2 1201 1400
3 1401 2000
4 2001 3000
5 3001 5000
7 6001 6500
8 6501 7000
9 7001 8000
10 8001 9999
Since the users will control the starting point for this calculation, I will use 4 for this example.
variable b1 number
exec :b1 :=4;
select grade,
avg_losal
from (select grade,
avg(losal) over (order by grade
range between current row
and 4 following) avg_losal
from salgrade)
where grade = :b1
/
GRADE AVG_LOSAL
---------- ----------
4 4376
Since there is a gap in the data with grade 6 missing, we need to report the following.
select (2001 + 3001 + 6001) / 4 from dual;
(2001+3001+6001)/4
------------------
2750.75
How can I do this in SQL?
Thanks,
Robert D. Ware
March 23, 2005 - 6:19 pm UTC
really, by 4?
ops$tkyte@ORA9IR2> select grade, sum(losal) over (order by grade range between current row and 3 following)/4
2 from salgrade;
GRADE SUM(LOSAL)OVER(ORDERBYGRADERANGEBETWEENCURRENTROWAND3FOLLOWING)/4
---------- -----------------------------------------------------------------
1 1325.75
2 1901
3 1600.75
4 2750.75
5 3875.75
7 6876
8 5375.75
9 3750.5
10 2000.25
9 rows selected.
sum up the rows (you were over by 1 in your range I think) and divide by 4.
What about the SYS_OVER_ Functions
Bob Lyon, March 23, 2005 - 1:54 pm UTC
While looking for "new stuff" in (9iR2) package STANDARD I found
FUNCTION sys_over__dd(t1 date,t2 date,
t3 date,t4 date) RETURN BOOLEAN;
This is undocumented, as near as I can determine,
but a simple test case (below) seems to show the fundamentals.
Are these SYS_OVER_ functions "safe" to use,
or is their behavior subject to change in future releases?
SET LINESIZE 100
DECLARE
NOW DATE := TRUNC(SYSDATE);
TOMORROW DATE := NOW + 1;
PROCEDURE TEST (Start1 IN DATE, End1 IN DATE, Start2 IN DATE, End2 IN DATE)
IS
BEGIN
DBMS_OUTPUT.PUT( TO_CHAR(Start1, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(End1, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(Start2, 'DD-MON-YY HH24:MI:SS ')
|| TO_CHAR(End2, 'DD-MON-YY HH24:MI:SS ')
);
IF SYS_OVER__DD(Start1, End1, Start2, End2) THEN
DBMS_OUTPUT.PUT_LINE('Overlaps');
ELSE
DBMS_OUTPUT.PUT_LINE('Does Not Overlap');
END IF;
END TEST;
BEGIN
TEST(NOW, TOMORROW, NOW, TOMORROW);
TEST(NOW, TOMORROW, NOW+3, TOMORROW+3);
TEST(NOW, TOMORROW, TOMORROW, TOMORROW+3);
TEST(NOW, TOMORROW, TOMORROW-(1/86400), TOMORROW+3);
TEST(TOMORROW, NOW, NOW, TOMORROW); -- from/to dates don't care about order?
END;
/
23-MAR-05 00:00:00 24-MAR-05 00:00:00 23-MAR-05 00:00:00 24-MAR-05 00:00:00 Overlaps
23-MAR-05 00:00:00 24-MAR-05 00:00:00 26-MAR-05 00:00:00 27-MAR-05 00:00:00 Does Not Overlap
23-MAR-05 00:00:00 24-MAR-05 00:00:00 24-MAR-05 00:00:00 27-MAR-05 00:00:00 Does Not Overlap
23-MAR-05 00:00:00 24-MAR-05 00:00:00 23-MAR-05 23:59:59 27-MAR-05 00:00:00 Overlaps
24-MAR-05 00:00:00 23-MAR-05 00:00:00 23-MAR-05 00:00:00 24-MAR-05 00:00:00 Overlaps
March 23, 2005 - 6:21 pm UTC
caveat emptor
of course it can change. anything not documented is subject to change and if it doesn't work as expected in all cases - well, maybe it is not supposed to.
suggest you write your own.
Problem with compare hours
Ora_User, April 20, 2005 - 6:17 pm UTC
Hi Tom.
I have a problem designing the following situation and I need your help.
I must create a table like this:
create table test (
test_id number(10),
start_hour date,
stop_hour date
);
And then I insert data like this:
insert into test values (1,to_date('080000','hh24miss'), to_date('165959','hh24miss'));
insert into test values (1,to_date('170000','hh24miss'), to_date('005959','hh24miss'));
insert into test values (1,to_date('010000','hh24miss'), to_date('075959','hh24miss'));
Then I have 3 intervals of hours from which I have to select, depending on an hour that I receive in a stored procedure. Something like this:
create or replace procedure test_st( p_date in varchar2,
p_test_id out number )
as
begin
select test_id
into p_test_id
from test
where to_date(p_date,'hh24miss') between
start_hour and stop_our;
return;
end;
This procedure does not work because the fields start_hour and stop_hour contain a full "date", not just a "time".
I could define start_hour and stop_hour as varchar2, but that causes problems:
1. I get wrong results if I compare directly (not possible!)
2. How can Oracle see that '010000' (am) is between '220000' (pm) and '080000' (am), if these are varchar2?
3. If I store the hours as varchar2, I can compare the fields if I transform start_hour and stop_hour to dates, but ... is that good for performance? I need excellent performance for this query.
Can you help me to find a solution for this problem?
Thank you so much.
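One way to reason about a time-of-day range that may wrap past midnight (a sketch only, using HH24MISS strings as in the inserts above): treat a range whose start is greater than its stop as a wrapping interval.

```python
def in_interval(start, stop, t):
    """True if time-of-day string t (HH24MISS) falls in [start, stop],
    treating start > stop as an interval that wraps past midnight."""
    if start <= stop:
        return start <= t <= stop
    return t >= start or t <= stop

print(in_interval('170000', '005959', '220000'))  # True (wraps midnight)
print(in_interval('080000', '165959', '010000'))  # False
```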
Modified time overlap question
steve, July 29, 2005 - 10:59 pm UTC
Regarding the question "Hour wise Max number of logged in Users" asked on December 19, 2004.
1) If we get rid of "hour wise" from the question, i.e., find the max number of logged-in users from a login history table, and the range which has that max number of logged-in users.
2) Since "hour wise" is removed, and assuming the login history table holds more than one year's data, would a table called Seconds (more than 31 million rows) have to be created? Is there any way without using a Seconds table but still in a single SQL statement?
Thanks!
About days
A reader, August 30, 2005 - 2:58 pm UTC
Tom,
I have a few dates as follows, can you please tell me if there is a better way to write them. Thanks!
1 .....WHERE TRUNC(DATE_IN, 'DD') = TRUNC(date_IN, 'DD')
2. SELECT TRUNC(SYSDATE) - TRUNC(IN_Date) INTO NODays FROM dual
3... AND IN_date = trunc(SYSDATE)
4. RETURN GREATEST((trunc(v_date) - trunc(date_in)), 0);
August 30, 2005 - 3:23 pm UTC
1) umm,
where date_in is not null
2) dandy (fine)
3) good
4) fine.
thanks!!!!
A reader, August 30, 2005 - 3:25 pm UTC
as usual, priceless
A reader, September 07, 2005 - 4:38 pm UTC
Time lag between pairs of rows
Tracy, December 05, 2005 - 9:38 am UTC
I have a table where rows are inserted in pairs, with a variable time lag between the two rows. A column records the time of the insert. The first insert in the pair has the startyn column set to 1; the second insert has it set to 0. The primary key comes from an ascending sequence. There is also a customer number, which is the same for each pair of rows. How can I write a query that returns the IDs of each pair where the time difference between the two inserts is greater than a specified time, say 10 seconds?
SQL> create table tb1 (id number, startyn number, createdat date, customer number);
Table created.
SQL> create sequence seq1;
Sequence created.
SQL> insert into tb1 values (seq1.nextval,1,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,1);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,2);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,3);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,3);
SQL> insert into tb1 values (seq1.nextval,1,sysdate,4);
SQL> insert into tb1 values (seq1.nextval,0,sysdate,4);
SQL> select id,startyn,to_char(createdat,'hh24miss'),customer from tb1 order by 1;
ID STARTYN TO_CHA CUSTOMER
---------- ---------- ------ ----------
1 1 150644 1
2 0 150709 1
3 1 150715 1
4 0 150719 1
5 1 150731 2
6 0 150740 2
7 1 150759 2
8 0 150759 2
9 1 150822 3
10 0 150827 3
11 1 150834 4
12 0 150856 4
Given the above data I want a query that will return IDs 1,2,5,6,11,12 (if we are looking for a 10-second or greater difference).
This is on Oracle 8i.
December 06, 2005 - 4:47 am UTC
I can do this however....
a) is the data perfect (always 1/0 or will some 0's be missing)
b) supply a test case setup that matches your example. sysdate is going to be sort of a "constant" thing here.
Test data
Tracy, December 10, 2005 - 12:20 pm UTC
Sorry for delay in replying.
a) We can assume that the data will always be perfect, i.e. no zero missing (although I would be interested to see a solution for where there were zeros missing).
b) Here is the test case that matches the above example: (I'm afraid the inserts wrap when pasted into your site)
SQL> drop table tb1;
Table dropped.
SQL> create table tb1 (id number, startyn number, createdat date, customer number);
Table created.
SQL> create sequence seq1;
Sequence created.
SQL> set feed off
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:06:44','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:09','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:15','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:19','dd-mon-yyyy hh24:mi:ss'),1);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:31','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:40','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:07:59','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:07:59','dd-mon-yyyy hh24:mi:ss'),2);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:08:22','dd-mon-yyyy hh24:mi:ss'),3);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:08:27','dd-mon-yyyy hh24:mi:ss'),3);
SQL> insert into tb1 values (seq1.nextval,1,to_date('10-dec-2005 15:08:34','dd-mon-yyyy hh24:mi:ss'),4);
SQL> insert into tb1 values (seq1.nextval,0,to_date('10-dec-2005 15:08:56','dd-mon-yyyy hh24:mi:ss'),4);
SQL> select id,startyn,to_char(createdat,'hh24miss'),customer from tb1 order by 1;
ID STARTYN TO_CHA CUSTOMER
---------- ---------- ------ ----------
1 1 150644 1
2 0 150709 1
3 1 150715 1
4 0 150719 1
5 1 150731 2
6 0 150740 2
7 1 150759 2
8 0 150759 2
9 1 150822 3
10 0 150827 3
11 1 150834 4
12 0 150856 4
SQL>
December 11, 2005 - 2:51 am UTC
ops$tkyte@ORA9IR2> select id, next_id, customer,
2 createdat, next_createdat,
3 (next_createdat-createdat) * 24 * 60 * 60 sec_diff
4 from (
5 select id, startyn, createdat, customer,
6 lead(id) over (partition by customer order by createdat) next_id,
7 lead(createdat) over (partition by customer order by createdat) next_createdat,
8 lead(startyn) over (partition by customer order by createdat) next_startyn
9 from tb1
10 )
11 where startyn = 1
12 order by customer, createdat
13 /
ID NEXT_ID CUSTOMER CREATEDA NEXT_CRE SEC_DIFF
---------- ---------- ---------- -------- -------- ----------
1 2 1 15:06:44 15:07:09 25
3 4 1 15:07:15 15:07:19 4
5 6 2 15:07:31 15:07:40 9
7 8 2 15:07:59 15:07:59 0
9 10 3 15:08:22 15:08:27 5
11 12 4 15:08:34 15:08:56 22
6 rows selected.
ops$tkyte@ORA9IR2> select id, next_id, customer,
2 createdat, next_createdat,
3 (next_createdat-createdat) * 24 * 60 * 60 sec_diff
4 from (
5 select id, startyn, createdat, customer,
6 lead(id) over (partition by customer order by createdat) next_id,
7 lead(createdat) over (partition by customer order by createdat) next_createdat,
8 lead(startyn) over (partition by customer order by createdat) next_startyn
9 from tb1
10 )
11 where startyn = 1
12 and (next_createdat-createdat) * 24 * 60 * 60 >= 10
13 order by customer, createdat
14 /
ID NEXT_ID CUSTOMER CREATEDA NEXT_CRE SEC_DIFF
---------- ---------- ---------- -------- -------- ----------
1 2 1 15:06:44 15:07:09 25
11 12 4 15:08:34 15:08:56 22
(5 and 6 are only 9 seconds apart...)
Excellent
TT, December 12, 2005 - 7:38 am UTC
Perfect.Thank you very much.
Value between two columns
Tracy, April 15, 2006 - 10:00 am UTC
I have a problem similar to Christo Kutrovsky above.
We have a table of IP Address ranges. The table looks like:
IPADDRESSES(
IPFROM NUMBER,
IPTO NUMBER,
COUNTRYCODE VARCHAR2(2),
DESCRIPTION VARCHAR2(100))
To give you an idea of the data, here are five rows:
SELECT * FROM IPADDRESSES WHERE ROWNUM < 6;
50331648 50331903 US UNITED STATES
50331904 50332671 US UNITED STATES
50332672 50332927 US UNITED STATES
50332928 50333695 US UNITED STATES
50333696 50333951 US UNITED STATES
50333952 50334719 US UNITED STATES
However, the table has 1.8 million records and we query it to find what country an IP address belong to:
SELECT ip.description
FROM IPADDRESSES ip
WHERE :b1 between ip.ipfrom AND ip.ipto
I cannot find an efficient way of indexing this table.
I have tried indexing this table in various different ways but no one way seems suitable. The query response time varies from a few milliseconds to about 30 seconds (which is unacceptable) depending upon the value of the variable in the query. I would have thought that this kind of scenario is fairly common - surely there must be a simple way of indexing it such that it will produce similar performance regardless of the value of the variable in the query?
April 15, 2006 - 1:19 pm UTC
so, do these ranges in your table overlap or not?
if not, you want the first record where the ipto is >= :b1 don't you?
select *
from (select * from ipaddresses where ipto >= :b1 order by ipto)
where rownum = 1;
and you would need to verify that the ipfrom was valid (eg: gaps would be a problem)
Gaps
Tracy, April 19, 2006 - 10:48 am UTC
Well, the ranges do not overlap but often there are gaps between the ipto of one row and the ipfrom of the next, so your query as it stands may return the wrong answer.
Here is a trimmed down version of the table to illustrate:
tupmant@dev920> SET VERIFY OFF
tupmant@dev920> SELECT * FROM IPADDRESSES;
IPFROM IPTO COUNTRYCODE DESCRIPTION
---------- ---------- --------------------- ------------------
400052000 400052999 US UNITED STATES
400157000 400157999 US UNITED STATES
400158000 400158999 US UNITED STATES
400166000 400166999 US UNITED STATES
400167000 400167999 US UNITED STATES
400169000 400169158 US UNITED STATES
400169159 400169159 US UNITED STATES
400169160 400169999 US UNITED STATES
400179000 400179999 US UNITED STATES
400181000 400181999 US UNITED STATES
400183000 400183999 US UNITED STATES
400209000 400209999 US UNITED STATES
400212000 400212999 US UNITED STATES
400218000 400218999 US UNITED STATES
400226000 400226999 US UNITED STATES
400229000 400229999 US UNITED STATES
400231000 400231999 US UNITED STATES
17 rows selected.
tupmant@dev920> SELECT ip.description
2 FROM IPADDRESSES ip
3 WHERE &ip between ip.ipfrom AND ip.ipto;
Enter value for ip: 400053123
no rows selected
tupmant@dev920> select description
2 from (select * from ipaddresses where ipto >= &IP order by ipto)
3 where rownum = 1;
Enter value for ip: 400053123
DESCRIPTION
------------------------------
UNITED STATES
So instead maybe this would do it and still be pretty efficient if I have an index on ipto and another on ipfrom?
select description
from (select * from ipaddresses where ipto >= &IP order by ipto)
where rownum = 1
intersect
select description
from (select * from ipaddresses where ipfrom <= &IP order by ipfrom desc)
where rownum = 1;
April 19, 2006 - 11:07 am UTC
did you test it out to see the work performed?
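For what it's worth, the combined check -- find the first row with ipto >= :b1, then verify its ipfrom actually covers the address -- can be modeled with a binary search. A Python sketch with a few hypothetical rows sorted by ipto:

```python
from bisect import bisect_left

# (ipfrom, ipto, description) rows, non-overlapping, sorted by ipto;
# sample values from the listing above.
rows = [
    (400052000, 400052999, 'UNITED STATES'),
    (400157000, 400157999, 'UNITED STATES'),
    (400158000, 400158999, 'UNITED STATES'),
]
iptos = [r[1] for r in rows]

def lookup(ip):
    # First row whose ipto >= ip (the "rownum = 1" fetch), then confirm
    # ipfrom <= ip so addresses falling in a gap return None.
    i = bisect_left(iptos, ip)
    if i < len(rows) and rows[i][0] <= ip:
        return rows[i][2]
    return None

print(lookup(400052500))   # 'UNITED STATES'
print(lookup(400053123))   # None -- falls in a gap
```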
There is an easier basic SQL method for comparing date ranges
Greg Smith, April 19, 2006 - 7:28 pm UTC
Hi Tom,
I wanted to share a simple technique I use for comparing date ranges, assuming you want the same all-inclusive overlap results as your example where clause proffered.
Quoted here is part of the SQL statement from your original followup at the beginning of this thread. The SQL statement was noted to include any full or partial overlaps, including subsets and supersets:
"
where ( c_beginning between p_beginning and p_ending -- rule 1
OR c_ending between p_beginning and p_ending -- rule 2
OR ( c_beginning < p_beginning and p_ending < c_ending ) ); -- rule 3
"
The technique I use is a simple 2-part AND-ed comparison:
where p_beginning <= c_ending AND c_beginning <= p_ending
This shorter set of conditions will accomplish the same results (assuming c_beginning and c_ending are both not null -- wrapping both with NVL functions could be used to ensure this in cases that called for it). I also find it more readable, although one of my associates prefers the readability of your method.
Because of its relative brevity, I find that this technique extends more gracefully to situations where you must join 3 or 4 date ranges together in a single SQL statement, which is often necessary with "date-tracked" apps like Oracle's own HR apps module.
Hopefully, this will prove helpful to someone.
Thanks for the great site/books!
-Greg
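Greg's two-part predicate can be modeled directly; a small Python sketch with numeric stand-ins for the dates:

```python
def overlaps(p_beg, p_end, c_beg, c_end):
    # Greg's two-condition test: ranges [p_beg, p_end] and [c_beg, c_end]
    # overlap iff each range starts no later than the other one ends.
    return p_beg <= c_end and c_beg <= p_end

print(overlaps(1, 5, 4, 9))   # True  (partial overlap)
print(overlaps(1, 5, 6, 9))   # False (disjoint)
print(overlaps(1, 9, 3, 4))   # True  (subset)
```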
Question
A reader, November 16, 2006 - 6:47 pm UTC
Tom, is there any way (in SQL) you can assign a sequential number to a set of rows (dates) given a time bucket?
For example, say you have this data:
01-FEB-07
01-APR-07
01-JAN-08
01-MAR-08
01-JUN-08
01-FEB-09
01-JAN-10
01-MAR-10
01-APR-10
01-JAN-11
And I tell you: assign a sequential value on a 9 month basis.
Output should be
01-FEB-07 1
01-APR-07 1
01-JAN-08 2
01-MAR-08 2
01-JUN-08 2
01-FEB-09 3
01-JAN-10 4
01-MAR-10 4
01-APR-10 4
01-JAN-11 5
Every time that a bucket is filled (9 months), it should start counting again at the next month.
Is it possible?
Let me know, thanks
November 17, 2006 - 3:15 am UTC
i very much do not understand your numbering system here. You'll have to supply a bit more explanation.
RE: Question
A reader, November 17, 2006 - 9:24 am UTC
Hi again Tom, sorry for not being that clear.
In the output, you see numbers 1, 2 and so on. These numbers are linked over a 9 month period. For example, between the first row and the second row there are 2 months, but between the first row and the third row there are 11 months, which is over the 9 month period, so I must assign a new group number (2). Likewise, between the third row and the sixth row the 9 month period is exceeded again, and so on. Basically, the idea is to group by a 9 month period, but after each group, reset this period at the beginning of the next group.
Hope I have been a little bit clearer this time.
Thank you again.
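The grouping rule described above can be sketched procedurally before attempting it in SQL. The Python below is an illustration only (not Oracle code), and the exact boundary rule -- strictly more than 9 months after the group's first date starts a new group -- is an assumption the sample data does not pin down:

```python
from datetime import date

def assign_buckets(dates, window_months=9):
    """Assign sequential group numbers, starting a new group once a date
    falls more than `window_months` after the group's first date."""
    groups = []
    group_no = 0
    start = None
    for d in sorted(dates):
        # month difference between this date and the current group's start
        if start is None or (d.year - start.year) * 12 + (d.month - start.month) > window_months:
            group_no += 1
            start = d          # reset the period at the start of the new group
        groups.append((d, group_no))
    return groups

rows = [date(2007, 2, 1), date(2007, 4, 1), date(2008, 1, 1), date(2008, 3, 1),
        date(2008, 6, 1), date(2009, 2, 1), date(2010, 1, 1), date(2010, 3, 1),
        date(2010, 4, 1), date(2011, 1, 1)]
# for the sample data this yields groups 1,1,2,2,2,3,4,4,4,5
```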
Very useful
bc, January 29, 2008 - 4:19 pm UTC
Tom,
Below is a condensed example. The author intended to extract a sum of quantities ordered for the month and year of a given date.
pi_date is of type DATE and is passed as a parameter to the function where the query below resides.
Is there a better way to write the query below, keeping performance in mind?
select sum(order_qty)
from orders
where to_char(order_date, 'YYYYMM') = to_char(nvl(pi_date, sysdate), 'YYYYMM')
The orders table does not have any function-based indexes on the order_date column, and getting one is out of the question.
The orders table is defined as
create table orders
(
order_id number,
order_date date,
order_qty number
);
Thanks
January 30, 2008 - 9:42 am UTC
select sum(order_qty)
from orders
where to_char(order_date, 'YYYYMM') = to_char(nvl(pi_date, sysdate),
'YYYYMM')
ugh,
don't convert dates to strings....
In fact, if order_date is INDEXED and we should use an index (eg: you have 7 years of orders online, so you have 7x12 = 84 months of data, we get 1/84th of the table - an index might make sense)
where order_date >= nvl(trunc(pi_date,'mm'),trunc(sysdate,'mm'))
and order_date < add_months( nvl(trunc(pi_date,'mm'),trunc(sysdate,'mm')), 1 )
consider these three ways of doing it (a couple hundred thousand dates were compared here)
select count(*) from t where to_char(created,'yyyymm')='200601'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.02 0 2 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.97 0.97 0 1443 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.99 0.99 0 1445 0 1
********************************************************************************
select count(*) from t where trunc(created,'mm')=to_date('200601','yyyymm')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.47 0.47 0 1442 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.48 0.48 0 1443 0 1
********************************************************************************
select count(*) from t where created >= to_date('200601','yyyymm')
and created < add_months(to_date('200601','yyyymm'),1)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.07 0.07 0 1442 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.08 0.08 0 1443 0 1
see the cpu go up when you twiddle with the date column? You don't need to touch it.
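The winning rewrite leaves the column alone and bounds it with a half-open month interval. As an illustration of the bound arithmetic only (Python, not SQL), this is what trunc(dt,'mm') and add_months(trunc(dt,'mm'),1) compute:

```python
from datetime import date

def month_bounds(d):
    """Return the half-open interval [first of month, first of next month)
    covering every point in d's month -- the same bounds produced by
    trunc(d,'mm') and add_months(trunc(d,'mm'), 1)."""
    lo = d.replace(day=1)
    hi = date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)
    return lo, hi

lo, hi = month_bounds(date(2006, 1, 17))
# any order_date satisfies lo <= order_date < hi exactly when it falls in Jan 2006
```

Because the predicate touches only the raw column values, an ordinary index on the column remains usable, which is the whole point of the rewrite.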
Awesome as always
bc, January 30, 2008 - 3:19 pm UTC
Thank you very much for your detailed explanation.
Question on getting difference between dates
A reader, May 15, 2008 - 1:37 pm UTC
Hi:
I would like to do the following in sql. I think I have to use analytic functions
but I don't know how to go about it.
create table temp(id number not null, begin_date date not null, end_date date null);
insert into temp values (1, to_date('01/15/2008','mm/dd/yyyy'),null);
insert into temp values (2, to_date('03/20/2007','mm/dd/yyyy'), to_date('05/15/2008','mm/dd/yyyy'));
I want to know for a given range of dates how many days are there in each month for the entire year.
For example, if I pass the date range of 01/01/2008 and 12/31/2008 for row with id = 1, I want the output to be something like this
2008
Jan Feb Mar
15 29 31
Apr May Jun
30 31 30
Jul Aug Sep
31 31 30
Oct Nov Dec Total
31 30 31 351
if I pass the date range of 01/01/2007 and 12/31/2008 for row with id = 2, I want the output to be something like this
2008
Jan Feb Mar
31 29 31
Apr May Jun
30 15
Jul Aug Sep
Oct Nov Dec Total
135
2007
Jan Feb Mar
11
Apr May Jun
30 31 30
Jul Aug Sep
31 31 30
Oct Nov Dec Total
31 30 31 286
Thanks!
May 19, 2008 - 1:05 pm UTC
why does the first one have 15 days for JAN-2008
your start date was 15-jan-2008
ops$tkyte%ORA11GR1> select last_day(to_date('15-jan-2008'))-to_date('15-jan-2008') from dual;
TO_DATE('31-JAN-2008')-TO_DATE('15-JAN-2008')
---------------------------------------------
16
so, you must not count the first day as being in the range... but
and the second one have 11 days for MAR-2007
ops$tkyte%ORA11GR1> select last_day(to_date('20-mar-2007'))-to_date('20-mar-2007') from dual;
LAST_DAY(TO_DATE('20-MAR-2007'))-TO_DATE('20-MAR-2007')
-------------------------------------------------------
11
now it is???
I'll presume it is included - actually, I don't agree with either of them - if start day is 15-jan and jan has 31 days then there are 17 days.
If start day is 20-mar and mar has 31 days then there are 12 days.
If that is wrong, you can fix it yourself :) It is just math
further I'll assume
a) your primary key is id
b) you will always include the primary key in the query (seems reasonable, you seem to indicate that)
c) the dates are provided as the beginning of the month for the first input date and the last day of the month for the second (eg: you only pass MON-YYYY really)
ops$tkyte%ORA11GR1> variable bdate varchar2(30)
ops$tkyte%ORA11GR1> variable edate varchar2(30)
ops$tkyte%ORA11GR1> variable id number
ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> exec :bdate := '01-2008'; :edate := '12-2008'; :id := 1
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> with data
2 as
3 (select add_months( to_date(:bdate,'fxmm-yyyy'), level-1 ) dt,
4 months_between( to_date(:edate,'fxmm-yyyy'), to_date(:bdate,'fxmm-yyyy') )+1 btwn
5 from dual
6 connect by level <= months_between( to_date(:edate,'fxmm-yyyy'), to_date(:bdate,'fxmm-yyyy'))+1
7 )
8 select to_char(dt,'yyyy'), dt, sum(days)
9 from (
10 select case when trunc(begin_date,'mm') > dt OR last_day(end_date) < last_day(dt) then 0
11 when trunc(begin_date,'mm') = trunc(dt,'mm') then last_day(dt)-begin_date+1
12 when trunc(end_date,'mm') = trunc(dt,'mm') then end_date - trunc(dt,'mm')+1
13 else to_number(to_char(last_day(dt),'dd'))
14 end days,
15 data.*,
16 x.*
17 from data,
18 (select begin_date begin_date, nvl(end_date,last_day(to_date(:edate,'fxmm-yyyy'))) end_date from temp where id = :id) x
19 )
20 group by rollup ( to_char(dt,'yyyy'), dt)
21 /
TO_C DT SUM(DAYS)
---- --------- ----------
2008 01-JAN-08 17
2008 01-FEB-08 29
2008 01-MAR-08 31
2008 01-APR-08 30
2008 01-MAY-08 31
2008 01-JUN-08 30
2008 01-JUL-08 31
2008 01-AUG-08 31
2008 01-SEP-08 30
2008 01-OCT-08 31
2008 01-NOV-08 30
2008 01-DEC-08 31
2008 352
352
14 rows selected.
ops$tkyte%ORA11GR1> exec :bdate := '01-2007'; :edate := '12-2008'; :id := 2
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR1> /
TO_C DT SUM(DAYS)
---- --------- ----------
2007 01-JAN-07 0
2007 01-FEB-07 0
2007 01-MAR-07 12
2007 01-APR-07 30
2007 01-MAY-07 31
2007 01-JUN-07 30
2007 01-JUL-07 31
2007 01-AUG-07 31
2007 01-SEP-07 30
2007 01-OCT-07 31
2007 01-NOV-07 30
2007 01-DEC-07 31
2007 287
2008 01-JAN-08 31
2008 01-FEB-08 29
2008 01-MAR-08 31
2008 01-APR-08 30
2008 01-MAY-08 15
2008 01-JUN-08 0
2008 01-JUL-08 0
2008 01-AUG-08 0
2008 01-SEP-08 0
2008 01-OCT-08 0
2008 01-NOV-08 0
2008 01-DEC-08 0
2008 136
423
27 rows selected.
ops$tkyte%ORA11GR1>
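As a cross-check of the day counts above, the per-month arithmetic can be stated compactly outside SQL: clip each month against the [begin_date, end_date] range, counting both endpoints. A Python sketch (illustration only, mirroring the CASE expression in the query):

```python
import calendar
from datetime import date

def days_in_month_for_range(begin, end, year, month):
    """Days of (year, month) covered by [begin, end], inclusive of
    both endpoints; 0 when the month lies outside the range."""
    first = date(year, month, 1)
    last = date(year, month, calendar.monthrange(year, month)[1])
    lo, hi = max(begin, first), min(end, last)
    return max((hi - lo).days + 1, 0)

# id = 1: 15-Jan-2008 through 31-Dec-2008 -- January contributes 17 days
# and the year totals 352, matching the query output above
b, e = date(2008, 1, 15), date(2008, 12, 31)
jan_days = days_in_month_for_range(b, e, 2008, 1)
total = sum(days_in_month_for_range(b, e, 2008, m) for m in range(1, 13))
```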
Why not use this syntax to do this?
ken.maricle, July 29, 2008 - 2:01 pm UTC
It is undocumented as far as I can tell, but seems to work.
with test_data as
(select date '2007-01-01' from_dt, date '2007-12-01' to_dt from dual union all
select date '2007-04-01' from_dt, date '2007-08-01' to_dt from dual union all
select date '2007-07-01' from_dt, date '2007-08-01' to_dt from dual union all
select date '2007-10-01' from_dt, date '2008-02-01' to_dt from dual union all
select date '2008-01-01' from_dt, date '2008-12-01' to_dt from dual)
select *
from test_data t1,
test_data t2
where (t1.from_dt,t1.to_dt) overlaps (t2.from_dt,t2.to_dt)
/
FROM_DT TO_DT FROM_DT TO_DT
--------- --------- --------- ---------
01-JAN-07 01-DEC-07 01-JAN-07 01-DEC-07
01-APR-07 01-AUG-07 01-JAN-07 01-DEC-07
01-JUL-07 01-AUG-07 01-JAN-07 01-DEC-07
01-OCT-07 01-FEB-08 01-JAN-07 01-DEC-07
01-JAN-07 01-DEC-07 01-APR-07 01-AUG-07
01-APR-07 01-AUG-07 01-APR-07 01-AUG-07
01-JUL-07 01-AUG-07 01-APR-07 01-AUG-07
01-JAN-07 01-DEC-07 01-JUL-07 01-AUG-07
01-APR-07 01-AUG-07 01-JUL-07 01-AUG-07
01-JUL-07 01-AUG-07 01-JUL-07 01-AUG-07
01-JAN-07 01-DEC-07 01-OCT-07 01-FEB-08
01-OCT-07 01-FEB-08 01-OCT-07 01-FEB-08
01-JAN-08 01-DEC-08 01-OCT-07 01-FEB-08
01-OCT-07 01-FEB-08 01-JAN-08 01-DEC-08
01-JAN-08 01-DEC-08 01-JAN-08 01-DEC-08
15 rows selected.
August 01, 2008 - 10:26 am UTC
do not use this, just don't.
we have no idea what it does - other than empirical observation - which is sort of like someone saying:
"you don't need order by if you group by, I group by a,b,c and it always comes back sorted by a,b,c"
only group by a,b,c doesn't have to sort, never did, never will HAVE to sort. It used to sort frequently in the past, it almost never does now. Using that syntax is a bug waiting to happen big time.
Continuous dates
Amrit, February 26, 2009 - 10:21 pm UTC
Hi Tom,
I wrote the following query to get continuous date ranges, but it doesn't give me the correct result. Can you please help me?
select * from mc_transaction ;
Mbr_sid From_Date To_date
1234 05/01/2008 05/22/2008
1234 05/23/2008 06/22/2008
1234 06/23/2008 07/22/2008
1234 07/23/2008 08/22/2008
1234 09/23/2008 10/22/2008
1234 12/23/2008 01/22/2009
1234 01/23/2009 02/22/2009
1234 02/23/2009 03/22/2009
Select mbr_sid, Min(from_date) Over(Partition By mbr_sid) As from_date, Max(To_date) Over(Partition By mbr_sid) As To_date
From (Select *
From (Select mbr_sid, Lead(from_date, 1, '') Over(Partition By mbr_sid Order By mbr_sid) - To_date date_diff1,
from_date - Lag(To_date, 1, '') Over(Partition By mbr_sid Order By mbr_sid) date_diff2
From (Select fnl.mbr_sid, fnl.from_date, fnl.To_date
From mc_transaction fnl)) l
Where NVL(l.date_diff1, 2) = 1
Or NVL(l.date_diff2, 2) = 1)
The query gives:
Mbr_sid From_Date To_date
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
1234 05/01/2008 03/22/2009
Desired output: these are continuous date ranges; when there is a gap, we start the next continuous range.
Mbr_sid From_Date To_date
1234 05/01/2008 08/22/2008
1234 09/23/2008 10/22/2008
1234 12/23/2008 03/22/2009
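The "islands" rule being asked for -- extend the current range while the next from_date is exactly one day after the previous to_date, otherwise start a new range -- can be prototyped outside SQL first. A Python sketch of that rule (illustration only, not a replacement for an analytic-function solution):

```python
from datetime import date, timedelta

def merge_contiguous(rows):
    """Collapse (mbr_sid, from_date, to_date) rows into islands: a row
    continues the current island when its from_date is exactly one day
    after the previous to_date for the same member."""
    merged = []
    for mbr, frm, to in sorted(rows):
        if merged and merged[-1][0] == mbr and frm == merged[-1][2] + timedelta(days=1):
            merged[-1][2] = to               # contiguous: extend current island
        else:
            merged.append([mbr, frm, to])    # gap (or new member): new island
    return merged

rows = [
    (1234, date(2008, 5, 1),   date(2008, 5, 22)),
    (1234, date(2008, 5, 23),  date(2008, 6, 22)),
    (1234, date(2008, 6, 23),  date(2008, 7, 22)),
    (1234, date(2008, 7, 23),  date(2008, 8, 22)),
    (1234, date(2008, 9, 23),  date(2008, 10, 22)),
    (1234, date(2008, 12, 23), date(2009, 1, 22)),
    (1234, date(2009, 1, 23),  date(2009, 2, 22)),
    (1234, date(2009, 2, 23),  date(2009, 3, 22)),
]
# yields the three desired islands:
#   05/01/2008-08/22/2008, 09/23/2008-10/22/2008, 12/23/2008-03/22/2009
```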
date range with bind variables using index
Lal, July 10, 2009 - 9:46 am UTC
Tom,
I executed the following query with and without
bind variables from SQL Developer.
without bind variable
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date('01-jan-2009','dd-mon-yyyy')
AND DT.START_DTM_UTC <
to_date('31-jan-2009','dd-mon-yyyy')
with bind variable
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date(:d1,'dd-mon-rrrr')
AND DT.START_DTM_UTC <
to_date(:d2,'dd-mon-rrrr')
The table has 1,468,738 records and four indexes:
lal_idx_01 on (START_DTM_UTC) column
lal_idx_02 on (END_DTM_UTC) column
lal_idx_03 on (START_DTM_UTC,END_DTM_UTC) columnS
lal_idx_04 on (END_DTM_UTC,START_DTM_UTC) columnS
I wanted to check which index the optimizer would use (that is why there are four indexes).
I gave the same values '01-jan-2009' and '31-jan-2009'
for the bind variables in the second query.
I noticed that the query with bind variables used the
index and the query without bind variables didn't.
Before executing the queries I flushed the buffer_cache and shared_pool.
Why did it use the index with bind variables even though the same values were given, and why did it not use the index for the query without bind variables?
I am using 10g database.
I am attaching the tkprof outputs for both queries.
Please explain. Also, which one is advisable?
query without bind variable
TKPROF: Release 10.2.0.1.0 - Production on Fri Jul 10 18:05:53 2009
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Trace file: tcc_ora_928.trc
Sort options: default
********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date('01-jan-2009','dd-mon-yyyy')
AND DT.START_DTM_UTC <
to_date('31-jan-2009','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.25 0.24 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1601 0.62 8.71 26687 28265 0 80023
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1603 0.87 8.96 26687 28265 0 80023
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
0 TABLE ACCESS MODE: ANALYZED (FULL) OF 'DATED_TRIPS' (TABLE)
********************************************************************************
SELECT 1
FROM
DUAL
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.00 0.00 0 0 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)
Rows Row Source Operation
------- ---------------------------------------------------
1 FAST DUAL (cr=0 pr=0 pw=0 time=4 us)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 FAST DUAL
********************************************************************************
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.25 0.24 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 1602 0.62 8.71 26687 28265 0 80024
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1606 0.87 8.96 26687 28265 0 80024
Misses in library cache during parse: 2
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 43 0.03 0.01 0 0 0 0
Execute 285 0.03 0.07 0 0 0 0
Fetch 455 0.03 0.81 107 1064 0 1805
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 783 0.09 0.90 107 1064 0 1805
Misses in library cache during parse: 20
Misses in library cache during execute: 20
2 user SQL statements in session.
285 internal SQL statements in session.
287 SQL statements in session.
2 statements EXPLAINed in this session.
********************************************************************************
Trace file: tcc_ora_928.trc
Trace file compatibility: 10.01.00
Sort options: default
1 session in tracefile.
2 user SQL statements in trace file.
285 internal SQL statements in trace file.
287 SQL statements in trace file.
22 unique SQL statements in trace file.
2 SQL statements EXPLAINed using schema:
CTS_OWNER.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
4172 lines in trace file.
6604 elapsed seconds in trace file.
query with bind variable
TKPROF: Release 10.2.0.1.0 - Production on Fri Jul 10 18:12:55 2009
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Trace file: tcc_ora_2440.trc
Sort options: default
********************************************************************************
count = number of times OCI procedure was executed
cpu = cpu time in seconds executing
elapsed = elapsed time in seconds executing
disk = number of physical reads of buffers from disk
query = number of buffers gotten for consistent read
current = number of buffers gotten in current mode (usually for update)
rows = number of rows processed by the fetch or execute call
********************************************************************************
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date(:d1,'dd-mon-rrrr')
AND DT.START_DTM_UTC <
to_date(:d2,'dd-mon-rrrr')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.23 0.23 0 0 0 0
Execute 1 0.01 0.01 0 0 0 0
Fetch 1601 0.96 7.88 26687 28265 0 80023
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1603 1.21 8.13 26687 28265 0 80023
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 60 (CTS_OWNER)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
0 TABLE ACCESS MODE: ANALYZED (BY INDEX ROWID) OF 'DATED_TRIPS'
(TABLE)
0 INDEX MODE: ANALYZED (RANGE SCAN) OF 'LAL_IDX_04' (INDEX)
********************************************************************************
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.23 0.23 0 0 0 0
Execute 1 0.01 0.01 0 0 0 0
Fetch 1601 0.96 7.88 26687 28265 0 80023
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1603 1.21 8.13 26687 28265 0 80023
Misses in library cache during parse: 1
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 30 0.01 0.00 0 0 0 0
Execute 180 0.03 0.05 0 0 0 0
Fetch 318 0.01 0.54 67 700 0 1021
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 528 0.06 0.60 67 700 0 1021
Misses in library cache during parse: 17
Misses in library cache during execute: 17
1 user SQL statements in session.
180 internal SQL statements in session.
181 SQL statements in session.
1 statement EXPLAINed in this session.
********************************************************************************
Trace file: tcc_ora_2440.trc
Trace file compatibility: 10.01.00
Sort options: default
1 session in tracefile.
1 user SQL statements in trace file.
180 internal SQL statements in trace file.
181 SQL statements in trace file.
18 unique SQL statements in trace file.
1 SQL statements EXPLAINed using schema:
CTS_OWNER.prof$plan_table
Default table was used.
Table was created.
Table was dropped.
3189 lines in trace file.
60 elapsed seconds in trace file.
July 14, 2009 - 5:19 pm UTC
what is your logic behind this:
... Before executing the queries i flushed the buffer_cache and shared_pool. ...
????
anyway
I don't think the plans were different
http://asktom.oracle.com/Misc/when-explanation-doesn-sound-quite.html you used explain plan, it can "lie" to you.
these queries did exactly the same amount of work - they are the same:
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date('01-jan-2009','dd-mon-yyyy')
AND DT.START_DTM_UTC <
to_date('31-jan-2009','dd-mon-yyyy')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ---------
Parse 1 0.25 0.24 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1601 0.62 8.71 26687 28265 0 80023
------- ------ -------- ---------- ---------- ---------- ---------- ---------
total 1603 0.87 8.96 26687 28265 0 80023
SELECT *
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >=
to_date(:d1,'dd-mon-rrrr')
AND DT.START_DTM_UTC <
to_date(:d2,'dd-mon-rrrr')
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- --------
Parse 1 0.23 0.23 0 0 0 0
Execute 1 0.01 0.01 0 0 0 0
Fetch 1601 0.96 7.88 26687 28265 0 80023
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 1603 1.21 8.13 26687 28265 0 80023
Lal Cyril, July 21, 2009 - 1:11 pm UTC
Sorry for the late reply. Thank you very much for the clarification.
As you mentioned the row source operation was same
for both query executions and the explain plan was wrong.
what is your logic behind this:
This logic is to get the date ranges overlapping
with the input date range.
... Before executing the queries i flushed the buffer_cache and shared_pool. ...
????
I did this to ensure that caching of data as well as previous plan is cleared for comparing the two queries.
July 26, 2009 - 6:19 am UTC
but why? that is as fake, if not more fake, than presuming the buffer cache would affect it.
There is a secondary sga (search for "secondary sga" on this site in quotes like that) in play. Flushing the buffer cache is only good for showing what the MAXIMUM physical IO's might be under the worst cases - it doesn't help you compare two queries like that.
lal cyril, November 05, 2009 - 5:59 am UTC
Thanks for the clarification.
i searched for "secondary sga"
and got the following link
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1641581200346367951 but the link
http://asktom.oracle.com/pls/ask/search?p_string=%22secondary+sga%22 is not working.
I have one more query in this regard
One of our client's consultant gave the suggestion
to use a hint for the same query
SELECT
/*+ INDEX(DT INDEX1) */
DT.*
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >='01-JAN-09'
AND DT.START_DTM_UTC <'31-JAN-09'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.70 8.73 4937 65948 0 80403
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 0.73 8.75 4937 65948 0 80403
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 59 (CTS_9WC01)
Rows Row Source Operation
------- ---------------------------------------------------
80403 TABLE ACCESS BY INDEX ROWID DATED_TRIPS (cr=65948 pr=4937 pw=0 time=4370428 us)
80403 INDEX RANGE SCAN INDEX1 (cr=1992 pr=1992 pw=0 time=104177 us)(object id 222154)
-----------------------------------------------------------------------------------------
Query without Hint
SELECT
DT.*
FROM dated_TRIPS DT
WHERE DT.END_DTM_UTC >='01-JAN-09'
AND DT.START_DTM_UTC <'31-JAN-09'
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1.53 8.50 34339 34344 0 80403
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1.56 8.51 34339 34344 0 80403
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 59 (CTS_9WC01)
Rows Row Source Operation
------- ---------------------------------------------------
80403 TABLE ACCESS FULL DATED_TRIPS (cr=34344 pr=34339 pw=0 time=6999958 us)
Based on the explain plan, the query with the index hint has a higher cost.
Explain plan for query with index
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 408K| 46M| 315K|
| 1 | TABLE ACCESS BY INDEX ROWID| DATED_TRIPS | 408K| 46M| 315K|
| 2 | INDEX RANGE SCAN | INDEX1 | 408K| | 2064 |
---------------------------------------------------------------------------
Explain plan for query without index
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 408K| 46M| 7547 |
| 1 | TABLE ACCESS FULL| DATED_TRIPS | 408K| 46M| 7547 |
-----------------------------------------------------------------
The following link regarding usage of hints
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7038986332061 suggests that using hints should be a path of last resort.
In this case, should I go with the hint approach?
November 11, 2009 - 9:45 am UTC
http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22secondary+sga%22 do you see the estimated rows there - 408,000. Versus the actual number of rows - 80,403.
There seems therefore to be something wrong with your statistics - they are stale, incomplete perhaps.
Why is it overestimating the row counts like that - that is an easy query, it shouldn't be off by a factor of 5.
If it got the right row counts, it would use the index naturally.
also, please STOP comparing strings to dates. compare dates to dates, use to_date and an explicit format mask to convert the string '01-jan-09' into a date - else, if I'm feeling nasty, I'll change your default date mask to yy-mon-dd some day.
Date comparison between two tables
Brian Perkins, October 18, 2010 - 11:37 am UTC
Tom
Returning to the original question: instead of a start and end parameter and one table, I actually have two tables, each containing start/end dates between which I want to return data from all overlapping rows.
In this thread, John Gilmore later reduced the Boolean for this condition to:
R1_end >= R2_begin
and
R1_begin <= R2_end
The problem I have is that these tables are used for long-term SQL reporting, and as such can contain tens of millions of rows each. Using the above construct in the where clause, Oracle always performs a full tablescan on both tables.
Each table includes many other columns of course, some also used as further join conditions between the two tables, in addition to the dates.
So, I'm looking to see if I have overlooked some technique to improve the speed of this query, the performance of which gets slower over time.
Using FIRST_ROWS as a hint is of little value as the entire result set is needed by the reporting tool before anything is displayed to the client.
The only way I have made the query perform better is to artificially restrict the date comparison to a week or so, e.g.
R1_end between (R2_begin and R2_begin + 7) // stop looking a week beyond the start date/time
and
R1_begin between (R2_end - 7 and R2_end) // start looking a week before the end date/time
but I don't like that - and the value '7' probably needs to vary from installation to installation.
Some attributes of the data in the tables may help you formulate an idea:
Each table contains 10-100 million rows.
The query will normally return a few thousand rows at most - sometimes fewer than 100.
Table 1 (end - start) can be measured in seconds
Table 2 (end - start) can be measured in days
The other joining columns without the dates are not very selective, and would join 10-20% of the rows.
October 25, 2010 - 9:54 am UTC
... Oracle always performs a full tablescan on both
tables. ...
PERFECTION - even for small tables - PERFECTION. Using an index would be incredibly slow and inefficient.
You sort of need every row from every table, don't you? How could you avoid reading every row from every table in general?
say you used an index for r1_end >= r2_begin - that would almost certainly compare all records of one table to most records of the other. Same with the other predicate.
Agree
Brian Perkins, October 25, 2010 - 11:07 am UTC
Tom
On the full tablescans - I am in total agreement that Oracle is doing the right thing - given my code and the question it's trying to answer.
I was just wondering if I was overlooking some obvious feature or topic that would help to handle the ever increasing execution time - but I guess not!
At least the customers will "only" want 10 years of data at a maximum..... so there is an endpoint in sight.
October 25, 2010 - 6:13 pm UTC
It is a really hard problem to solve efficiently in an RDBMS - yes.
A reader, October 26, 2010 - 7:52 am UTC
I wonder whether it is the full scan that is the problem for Brian, or whether it is the comparison of a few million rows with a few million other rows -- a lot of CPU usage.
Is that a sort-merge operation, with a join on one date comparison and a subsequent filter applied to eliminate rows based on the other date comparison? Have you looked at the wait events and CPU usage?
I wouldn't rule out an index-based access method if it eliminates say 10,000,000*100,000,000/2=1,000,...blahblah...,000 date comparisons.
October 26, 2010 - 8:02 pm UTC
... or whether
it is the comparison of a few million rows with a few million other rows ...
that is it
... I wouldn't rule out an index-based access method if it eliminates say ...
but think about it. I wrote:
say you used an index for r1_end >= r2_begin - that would almost certainly compare all records of one table to most records of the other. Same with the other predicate.
the index would be used for ONE or the OTHER of the predicates - but not both.
and the index would be used over and over and over and over and over and ........
re: Brian Perkins and date comparison between two table
Stew Ashton, October 27, 2010 - 6:38 am UTC
Just a thought, provided this doesn't slow down the commits too much:
drop table A_DATES
/
-- drop table A_DATES succeeded.
--657ms elapsed
drop table B_dates
/
-- drop table B_dates succeeded.
--78ms elapsed
create table A_DATES as
select sysdate+(level*99) A_DATE_BEGIN, sysdate+(level*99) + MOD(level,1000)+1 A_DATE_END
from DUAL connect by level <= 1000
/
--create table succeeded.
--125ms elapsed
create table B_DATES as
select sysdate + MOD(level,1000) + MOD(level, 100) B_DATE_BEGIN
, sysdate + MOD(level,1000) + MOD(level, 100) + 1 B_DATE_END
from DUAL connect by level <= 1000
/
--create table succeeded.
--47ms elapsed
create materialized view log on a_dates with rowid
/
--create materialized succeeded.
--125ms elapsed
create materialized view log on B_DATES with rowid
/
--create materialized succeeded.
--62ms elapsed
drop materialized view ab_dates
/
-- drop materialized view succeeded.
--250ms elapsed
create materialized view ab_dates refresh fast on commit as
select a.rowid ARID
, a.A_DATE_BEGIN
, a.A_DATE_END
, B.B_DATE_BEGIN
, B.B_DATE_END
, B.rowid BRID from A_DATES a, B_DATES B
where a.A_DATE_END >= B.B_DATE_BEGIN
and a.A_DATE_BEGIN <= B.B_DATE_END
/
--create materialized succeeded.
--7,641ms elapsed
create index idx_ab_dates_arid on ab_dates(arid)
/
--create index succeeded.
--15ms elapsed
create index IDX_AB_DATES_BRID on AB_DATES(BRID)
/
--create index succeeded.
--16ms elapsed
begin
DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'A_DATES',estimate_percent => 100);
DBMS_STATS.GATHER_TABLE_STATS (OWNNAME => user, TABNAME => 'B_DATES',ESTIMATE_PERCENT => 100);
DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'AB_DATES',estimate_percent => 100, cascade=>TRUE);
end;
/
--anonymous block completed
--609ms elapsed
select COUNT(*) from AB_DATES
/
--COUNT(*)
------------------------
--84
--0ms elapsed
delete from A_DATES where rowid = (select ARID from AB_DATES where rownum = 1)
/
--1 rows deleted
--16ms elapsed
commit
/
--commited
--78ms elapsed
select COUNT(*) from AB_DATES
/
--COUNT(*)
------------------------
--78
--0ms elapsed
Missed something in previous post
Stew Ashton, October 27, 2010 - 7:32 am UTC
In my previous post, I didn't test inserts and updates to the base tables. To avoid full scans during the MV refresh:
create index IDX_A_DATES_BEGIN on A_DATES(A_DATE_BEGIN,A_DATE_END);
create index IDX_B_DATES_END on B_DATES(B_DATE_END,b_DATE_BEGIN);
Every insert or update to a base table will cause an index range scan on the other base table during the MV refresh.
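A rough Python picture of what that range scan buys during the refresh (my sketch, with invented data, not the MV internals): if B's rows are kept ordered like IDX_B_DATES_END on (B_DATE_END, B_DATE_BEGIN), a newly inserted A row only needs to look at the tail of B whose end dates haven't already passed the new row's begin date:

```python
import bisect

# Hypothetical stand-in for IDX_B_DATES_END (B_DATE_END, B_DATE_BEGIN):
# B's intervals kept sorted by end date, stored as (b_end, b_begin).
b_rows = sorted([(5, 1), (8, 6), (20, 15)])

def matches_for_new_a(a_begin, a_end):
    # "Range scan": binary-search past every B row that ends
    # before the new A row begins, then filter the remainder
    # on the second predicate (b_begin <= a_end).
    start = bisect.bisect_left(b_rows, (a_begin,))
    return [(b_begin, b_end) for b_end, b_begin in b_rows[start:]
            if b_begin <= a_end]

print(matches_for_new_a(4, 7))
```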
Worked like a charm
Jason, September 21, 2016 - 5:40 pm UTC
This worked like a charm. I was in the same situation. I coded it like you had it exactly, and worked perfectly. Thank you so much. This saved me from hours of trying to compare dates. :) Cheers!
Re: my post of October 27, 2010 - 6:38 am UTC
Stew Ashton, February 17, 2018 - 10:28 am UTC
Starting with version 12c, inequality joins can be replaced by the MATCH_RECOGNIZE clause. With the test data I used above, MATCH_RECOGNIZE goes at least 60 times faster than JOIN.
select A_DATE_BEGIN, A_DATE_END, B_DATE_BEGIN, B_DATE_END
from (
select 'A' tbl, a_date_begin date_begin, a_date_end date_end from a_dates
union all
select 'B', b_date_begin, b_date_end from b_dates
)
match_recognize(
order by date_begin
measures
case first(tbl) when 'A' then first(date_begin) else ba.date_begin end a_date_begin,
case first(tbl) when 'A' then first(date_end) else ba.date_end end a_date_end,
case first(tbl) when 'B' then first(date_begin) else ba.date_begin end b_date_begin,
case first(tbl) when 'B' then first(date_end) else ba.date_end end b_date_end
all rows per match
after match skip to next row
pattern ( {-ab-} ({-ab-}|ba)* ba )
define ab as tbl = first(tbl) and date_begin <= first(date_end),
ba as tbl != first(tbl) and date_begin <= first(date_end)
);
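The intuition behind why this beats the join can be sketched outside SQL (my Python illustration with made-up data, not a translation of MATCH_RECOGNIZE itself): union both tables, sort once by begin date, and then each row only has to look ahead as far as rows that start before it ends, instead of being compared with every row on the other side:

```python
a_dates = [(1, 10), (20, 25)]
b_dates = [(5, 6), (22, 30), (40, 41)]

# Tag each interval with its source table and sort by begin,
# like the UNION ALL feeding MATCH_RECOGNIZE's ORDER BY.
rows = sorted([('A', b, e) for b, e in a_dates] +
              [('B', b, e) for b, e in b_dates],
              key=lambda r: r[1])

pairs = []
for i, (tbl, begin, end) in enumerate(rows):
    for tbl2, begin2, end2 in rows[i + 1:]:
        if begin2 > end:
            # Sorted by begin, so no later row can overlap
            # this one either: stop scanning early.
            break
        if tbl2 != tbl:
            # Emit the pair with the A interval first.
            pairs.append(((begin, end), (begin2, end2)) if tbl == 'A'
                         else ((begin2, end2), (begin, end)))
print(pairs)
```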
February 19, 2018 - 3:26 am UTC
Love your work.