Question and Answer

Tom Kyte

Thanks for the question, Frank.

Asked: December 20, 2001 - 10:30 am UTC

Last updated: October 18, 2011 - 5:06 pm UTC

Version: 8.1.7

Viewed 10K+ times

You Asked

8.1.7, Solaris 5.8

I have the following construct:

table:

id number
txn_date date
rel_id number
signal varchar2(1)
amount number
...

index on id
index on rel_id
index on txn_date
index on signal

75+ million records

all records have signal IS NULL, except for new records.
once a week i have to check all new records:

List each 7-day period containing the txn_date of a new record (signal = 'Y') where the total amount > x

so, if total over (txn_date - 6 days upto txn_date) > x
or total over (txn_date - 5 days upto txn_date + 1) > x
or ...
or total over(txn_date upto txn_date + 6 days) > x

then
list id
(+ start/end-date of 7 days period if possible)
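The window check can be prototyped outside the database first; a minimal Python sketch, assuming rows as (txn_date, amount) pairs (names are illustrative, not from the thread):

```python
from datetime import date, timedelta

def flagged_windows(rows, new_dates, x, width=7):
    """For each new record's txn_date, test every width-day window that
    contains it; return (txn_date, start, end) for windows whose total > x."""
    hits = []
    for d in new_dates:
        for offset in range(width):
            start = d - timedelta(days=width - 1 - offset)  # window = [start, start+width-1]
            end = start + timedelta(days=width - 1)
            total = sum(amt for td, amt in rows if start <= td <= end)
            if total > x:
                hits.append((d, start, end))
    return hits

# amounts 1..25 on consecutive days, one "new" record on Jan 20
rows = [(date(2002, 1, 1) + timedelta(days=i), i + 1) for i in range(25)]
print(flagged_windows(rows, [date(2002, 1, 20)], 100))
```

Each window containing Jan 20 sums the surrounding amounts, which is exactly what the analytic-function answer below computes in one pass.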

I started out with analytic functions but only came half-way ;-(

tia,
Frank

and Tom said...

something like this should do it -- we compute the various window sums you want and can then filter on them in an outer query. I did this for 5 days; you can expand to 7...



ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( txn_date date, x int );

Table created.

ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> begin
  2      for i in 1 .. 25
  3      loop
  4          insert into t values ( sysdate+i, i );
  5      end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> column x format 99
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_5_0 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_4_1 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_3_2 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_2_3 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_1_4 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM> column sum_0_5 format 999999
ops$tkyte@ORA817DEV.US.ORACLE.COM>
ops$tkyte@ORA817DEV.US.ORACLE.COM> select txn_date,
2 sum(x) over( order by txn_date desc range 5 preceding ) sum_5_0,
3 sum(x) over( order by txn_date desc range between 4 preceding and 1 following ) sum_4_1,
4 sum(x) over( order by txn_date desc range between 3 preceding and 2 following ) sum_3_2,
5 sum(x) over( order by txn_date desc range between 2 preceding and 3 following ) sum_2_3,
6 sum(x) over( order by txn_date desc range between 1 preceding and 4 following ) sum_1_4,
7 sum(x) over( order by txn_date desc range between 0 preceding and 5 following ) sum_0_5,
8 sum(x) over( order by txn_date desc range between 5 preceding and 5 following ) sum_5_5,
9 x
10 from t
11 /

TXN_DATE SUM_5_0 SUM_4_1 SUM_3_2 SUM_2_3 SUM_1_4 SUM_0_5 SUM_5_5 X
--------- ------- ------- ------- ------- ------- ------- ---------- ---
14-JAN-02 25 49 72 94 115 135 135 25
13-JAN-02 49 72 94 115 135 129 154 24
12-JAN-02 72 94 115 135 129 123 172 23
11-JAN-02 94 115 135 129 123 117 189 22
10-JAN-02 115 135 129 123 117 111 205 21
09-JAN-02 135 129 123 117 111 105 220 20
08-JAN-02 129 123 117 111 105 99 209 19
07-JAN-02 123 117 111 105 99 93 198 18
06-JAN-02 117 111 105 99 93 87 187 17
05-JAN-02 111 105 99 93 87 81 176 16
04-JAN-02 105 99 93 87 81 75 165 15
03-JAN-02 99 93 87 81 75 69 154 14
02-JAN-02 93 87 81 75 69 63 143 13
01-JAN-02 87 81 75 69 63 57 132 12
31-DEC-01 81 75 69 63 57 51 121 11
30-DEC-01 75 69 63 57 51 45 110 10
29-DEC-01 69 63 57 51 45 39 99 9
28-DEC-01 63 57 51 45 39 33 88 8
27-DEC-01 57 51 45 39 33 27 77 7
26-DEC-01 51 45 39 33 27 21 66 6
25-DEC-01 45 39 33 27 21 15 55 5
24-DEC-01 39 33 27 21 15 10 45 4
23-DEC-01 33 27 21 15 10 6 36 3
22-DEC-01 27 21 15 10 6 3 28 2
21-DEC-01 21 15 10 6 3 1 21 1

25 rows selected.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
2 from (
3 select txn_date,
4 sum(x) over( order by txn_date desc range 5 preceding ) sum_5_0,
5 sum(x) over( order by txn_date desc range between 4 preceding and 1 following ) sum_4_1,
6 sum(x) over( order by txn_date desc range between 3 preceding and 2 following ) sum_3_2,
7 sum(x) over( order by txn_date desc range between 2 preceding and 3 following ) sum_2_3,
8 sum(x) over( order by txn_date desc range between 1 preceding and 4 following ) sum_1_4,
9 sum(x) over( order by txn_date desc range between 0 preceding and 5 following ) sum_0_5,
10 sum(x) over( order by txn_date desc range between 5 preceding and 5 following ) sum_5_5,
11 x
12 from t
13 )
14 where sum_5_0 > 50
15 or sum_4_1 > 50
16 or sum_3_2 > 50
17 or sum_2_3 > 50
18 or sum_1_4 > 50
19 or sum_0_5 > 50
20 /

TXN_DATE SUM_5_0 SUM_4_1 SUM_3_2 SUM_2_3 SUM_1_4 SUM_0_5 SUM_5_5 X
--------- ------- ------- ------- ------- ------- ------- ---------- ---
14-JAN-02 25 49 72 94 115 135 135 25
13-JAN-02 49 72 94 115 135 129 154 24
12-JAN-02 72 94 115 135 129 123 172 23
11-JAN-02 94 115 135 129 123 117 189 22
10-JAN-02 115 135 129 123 117 111 205 21
09-JAN-02 135 129 123 117 111 105 220 20
08-JAN-02 129 123 117 111 105 99 209 19
07-JAN-02 123 117 111 105 99 93 198 18
06-JAN-02 117 111 105 99 93 87 187 17
05-JAN-02 111 105 99 93 87 81 176 16
04-JAN-02 105 99 93 87 81 75 165 15
03-JAN-02 99 93 87 81 75 69 154 14
02-JAN-02 93 87 81 75 69 63 143 13
01-JAN-02 87 81 75 69 63 57 132 12
31-DEC-01 81 75 69 63 57 51 121 11
30-DEC-01 75 69 63 57 51 45 110 10
29-DEC-01 69 63 57 51 45 39 99 9
28-DEC-01 63 57 51 45 39 33 88 8
27-DEC-01 57 51 45 39 33 27 77 7
26-DEC-01 51 45 39 33 27 21 66 6

20 rows selected.

ops$tkyte@ORA817DEV.US.ORACLE.COM>

(love those analytic functions....)


Comments

A reader, December 20, 2001 - 4:40 pm UTC

Excellent

Halfway indeed ;-)

Frank Tollenaar, December 21, 2001 - 1:16 am UTC

This is exactly how far i came !
The point i miss is the fact that i don't want to check the whole table every time, but only the 7 days periods around the _new_ records. Problem is, that i can't add this to the where clause, since the analytic functions also use this where clause...

Tom Kyte
December 21, 2001 - 7:53 am UTC

Halfway question indeed.  Your question really wasn't that clear -- I had to guess a lot.  Now it is more understandable.  I would use a function-based index on

create index t_idx on t(signal,decode(signal,'Y',txn_date,null)) compress 1;

Now, we have an index that only contains entries when signal is 'Y' (use whatever value you use there, of course).  Since entries whose key columns are entirely NULL are never stored in a B*Tree index, the old rows (signal NULL) haven't added any leaf rows to this -- we've just added the txn_date for the new records.  Now, we can easily generate the set of all rows that fall within 5 days (N days for you) of a new record.
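The reason the index stays small is that a B*Tree index stores no entry when every key column is NULL. A toy Python model of that behavior (purely illustrative, not Oracle internals):

```python
def build_index(rows, key):
    """Toy B*Tree model: skip entries whose entire key is None,
    mirroring how Oracle omits all-NULL keys from a B*Tree index."""
    idx = {}
    for rowid, row in enumerate(rows):
        k = key(row)
        if k is None:                    # all-NULL key -> no leaf entry
            continue
        idx.setdefault(k, []).append(rowid)
    return idx

# signal is NULL for old rows, 'Y' for new ones -- only the 'Y' rows are indexed
rows = [{"signal": None, "txn_date": d} for d in range(100)] + \
       [{"signal": "Y", "txn_date": d} for d in range(5)]
idx = build_index(rows, lambda r: r["txn_date"] if r["signal"] == "Y" else None)
print(len(idx))   # 5 entries, not 105
```

The decode(signal,'Y',txn_date,null) expression plays the role of the key function: it returns NULL for every old row, so those rows never make it into the index.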

Consider:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t ( txn_date date, x int, signal char(1), data char(2000) default 'x' );

Table created.

<b>data is just there to make the rows big, so the optimizer thinks "big table"
</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> begin
  2      for i in 1 .. 25
  3      loop
  4          insert into t (txn_date,x,signal) values ( sysdate+i, i, 'Y' );
  5      end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> insert /*+ append */ into t (txn_date,x,signal) select sysdate-50, rownum, null from all_objects;

17119 rows created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> commit;
Commit complete.

<b>added lots of "old data"</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> 
ops$tkyte@ORA817DEV.US.ORACLE.COM> create index t_idx on t(signal,decode(signal,'Y',txn_date,null)) compress 1;
Index created.

<b>that leading edge Y is a constant, just get rid of it, store just the txn_date (compress 1)</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> create index t_idx2 on t(txn_date);
Index created.

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze table t compute statistics;
Table analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> analyze index t_idx validate structure;
Index analyzed.

ops$tkyte@ORA817DEV.US.ORACLE.COM> select LF_ROWS from index_stats;

   LF_ROWS
----------
        25

<b>that shows only 25 rows in the index, not 18,000</b>

ops$tkyte@ORA817DEV.US.ORACLE.COM> alter session set query_rewrite_enabled=true;
Session altered.

ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace on
ops$tkyte@ORA817DEV.US.ORACLE.COM> select min(decode(signal,'Y',txn_date,null)) -5 from t where signal = 'Y'
  2  /

MIN(DECOD
---------
17-DEC-01

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=8)
   1    0   SORT (AGGREGATE)
   2    1     FIRST ROW (Cost=1 Card=25 Bytes=200)
   3    2       INDEX (RANGE SCAN (MIN/MAX)) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=25)

Statistics
----------------------------------------------------------
         12  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        403  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
          1  rows processed

<b>shows we can get the date we are interested in very fast...</b>


ops$tkyte@ORA817DEV.US.ORACLE.COM> select *
  2    from (
  3  select txn_date,
  4         sum(x) over( order by txn_date desc range 5 preceding ) sum_5_0,
  5         sum(x) over( order by txn_date desc range between 4 preceding and 1 following ) sum_4_1,
  6         sum(x) over( order by txn_date desc range between 3 preceding and 2 following ) sum_3_2,
  7         sum(x) over( order by txn_date desc range between 2 preceding and 3 following ) sum_2_3,
  8         sum(x) over( order by txn_date desc range between 1 preceding and 4 following ) sum_1_4,
  9         sum(x) over( order by txn_date desc range between 0 preceding and 5 following ) sum_0_5,
 10         sum(x) over( order by txn_date desc range between 5 preceding and 5 following ) sum_5_5,
 11         x
 12    from t<b>
 13   where txn_date >= ( select min(decode(signal,'Y',txn_date,null)) -5 from t where signal = 'Y' )</b>
 14   )
 15   where sum_5_0 > 50
 16      or sum_4_1 > 50
 17      or sum_3_2 > 50
 18      or sum_2_3 > 50
 19      or sum_1_4 > 50
 20      or sum_0_5 > 50
 21  /

<b>that where clause gets all of the rows we are interested in now...</b>

TXN_DATE     SUM_5_0    SUM_4_1    SUM_3_2    SUM_2_3    SUM_1_4    SUM_0_5    SUM_5_5          X
--------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
15-JAN-02         25         49         72         94        115        135        135         25
14-JAN-02         49         72         94        115        135        129        154         24
13-JAN-02         72         94        115        135        129        123        172         23
12-JAN-02         94        115        135        129        123        117        189         22
11-JAN-02        115        135        129        123        117        111        205         21
10-JAN-02        135        129        123        117        111        105        220         20
09-JAN-02        129        123        117        111        105         99        209         19
08-JAN-02        123        117        111        105         99         93        198         18
07-JAN-02        117        111        105         99         93         87        187         17
06-JAN-02        111        105         99         93         87         81        176         16
05-JAN-02        105         99         93         87         81         75        165         15
04-JAN-02         99         93         87         81         75         69        154         14
03-JAN-02         93         87         81         75         69         63        143         13
02-JAN-02         87         81         75         69         63         57        132         12
01-JAN-02         81         75         69         63         57         51        121         11
31-DEC-01         75         69         63         57         51         45        110         10
30-DEC-01         69         63         57         51         45         39         99          9
29-DEC-01         63         57         51         45         39         33         88          8
28-DEC-01         57         51         45         39         33         27         77          7
27-DEC-01         51         45         39         33         27         21         66          6

20 rows selected.


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=54 Card=858 Bytes=96954)
   1    0   VIEW (Cost=54 Card=858 Bytes=96954)
   2    1     WINDOW (SORT)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=54 Card=858 Bytes=9438)
   4    3         INDEX (RANGE SCAN) OF 'T_IDX2' (NON-UNIQUE) (Cost=2 Card=858)
   5    4           SORT (AGGREGATE)
   6    5             FIRST ROW (Cost=1 Card=25 Bytes=200)
   7    6               INDEX (RANGE SCAN (MIN/MAX)) OF 'T_IDX' (NON-UNIQUE) (Cost=1 Card=25)



 

how about months whose number of days is not predictable

A reader, October 13, 2002 - 4:51 am UTC

Hi

when we use RANGE ... PRECEDING we specify the number of days, as in your example. What if I want to calculate some values from last month and compare with today's? That can be unpredictable, since the number of days in a month can differ.

For example today, Sunday 13th of October I billed $10000, I want to calculate the % increase or decrease from average bill rate of ALL Sundays from September. There were 5 Sundays (billed $12000, $21000, $9800, $19000 and $13200) last September and I billed an average of $15000 so today's bill is -33.33% compared with September's average (Sundays). Last month we had 30 days but we could have had 28 or 31 days as well if we were in March or September, we also could have had 4 or 5 Sundays. How can we use analytic functions to calculate these averages?

Thank you

Tom Kyte
October 14, 2002 - 7:34 am UTC

You would need to use dynamic SQL and build the query on the fly as the NUMBER OF COLUMNS changes (hence it is a brand new, unique, never before seen query).
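For the record, the arithmetic in the example works out as stated; a quick sketch using the figures from the question:

```python
# the five September Sunday bills from the question
september_sundays = [12000, 21000, 9800, 19000, 13200]
avg_bill = sum(september_sundays) / len(september_sundays)   # 15000.0
today = 10000
pct_change = (today - avg_bill) / avg_bill * 100
print(f"{pct_change:.2f}%")   # -33.33%
```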




Additional question regarding range

Shetty Yermal, February 13, 2003 - 3:09 pm UTC

Tom,

How can I specify my range of values instead of relative range.

For example can I use the following statement as
"order by txn_date desc range between 5 preceding and 5
following"

"order by txn_date desc range between b.start_date preceding and b.start_date"

where b is a table containing start_date and end_date and can be equijoined with main table.

Tom Kyte
February 13, 2003 - 5:26 pm UTC

when you tried it -- what happened ;)


ops$tkyte@ORA920> create table t
  2  as
  3  select sysdate+rownum dt
  4    from all_users
  5   where rownum <= 10
  6  /

Table created.

ops$tkyte@ORA920> select dt,
  2         first_value(dt) over ( order by dt
  3                                range between 5 preceding
  4                                          and 5 following ) fixed,
  5         first_value(dt) over ( order by dt
  6                                range between minrange preceding
  7                                          and maxrange following ) variable
  8    from t, ( select 5 minrange, 5 maxrange
  9                from dual )
 10   order by dt
 11  /

DT        FIXED     VARIABLE
--------- --------- ---------
14-FEB-03 14-FEB-03 14-FEB-03
15-FEB-03 14-FEB-03 14-FEB-03
16-FEB-03 14-FEB-03 14-FEB-03
17-FEB-03 14-FEB-03 14-FEB-03
18-FEB-03 14-FEB-03 14-FEB-03
19-FEB-03 14-FEB-03 14-FEB-03
20-FEB-03 15-FEB-03 15-FEB-03
21-FEB-03 16-FEB-03 16-FEB-03
22-FEB-03 17-FEB-03 17-FEB-03
23-FEB-03 18-FEB-03 18-FEB-03

10 rows selected.
 

Multiple Windowing

Lakshminarasimhan R, April 13, 2003 - 8:48 am UTC

Hi tom

i have a query with respect to the above thread.

I am having a table like this


SALEDATE
OPRCARDSGCOD
OPRCARNUMCOD
CNTCOD
FRMCTYCOD
TOCTYCOD
COTCOD
RBDCOD
NETAMT
CPNCNT

Saledate through Rbdcod columns are Dimensions and
Netamt and Cpncnt are Measures.

My user wants to see a Moving Annual Total (MAT) for the last
11 months for any of the above combinations.

They may need the MAT for the combination of
all the dimensions, or
only for Oprcardsgcod, Cntcod, Frmctycod, Toctycod,
or only for Oprcardsgcod and Cntcod, and so on.

I initially thought

creating a MAT like

sum(netamt) over (partition by Oprcardsgcod, Oprcarnumcod, Cntcod, Frmctycod, Toctycod, Cotcod, Rbdcod order by Saledate range interval '11' month preceding) will create the MAT for the full dimension combination, and once it is stored permanently in a table we could just select netamt for different group-bys. But I found that approach is wrong.

How can I meet the above requirement?

One way could be creating different columns, each with a different PARTITION BY clause.

But it would be too heavy for the users to use so many columns in the report.

Can you suggest a better idea?



Tom Kyte
April 13, 2003 - 8:16 pm UTC

no idea what a "MAT" is -- is that like a moving average?

tell you what -- when I'm taking questions -- be more explicit and put this there, in a new question.

how do i do this ( oracle 8.1.7 standard edition on Windoze)

umesh, May 14, 2003 - 12:39 am UTC

I have a table which has the open_date and close_date of orders as follows


create table oc_date
( open_date date,
close_date date);



insert into oc_date values (to_date('01-jan-2000','dd-mon-yyyy'), to_date('05-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('02-jan-2000','dd-mon-yyyy'), to_date('15-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('03-jan-2000','dd-mon-yyyy'), to_date('16-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('05-jan-2000','dd-mon-yyyy'), to_date('16-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('12-jan-2000','dd-mon-yyyy'), to_date('26-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('13-jan-2000','dd-mon-yyyy'), to_date('26-jan-2000','dd-mon-yyyy'));
insert into oc_date values (to_date('15-jan-2000','dd-mon-yyyy'), to_date('26-jan-2000','dd-mon-yyyy'));
commit;

i need an output as follows

open_date  close_date  cnt (of orders still open)
01-JAN-00  05-JAN-00   1  -> 1 order is open
02-JAN-00  15-JAN-00   2  -> 2 orders open till 02-jan-00
03-JAN-00  16-JAN-00   3  -> 3 orders open till 03-jan-00
05-JAN-00  16-JAN-00   3  -> 1st order is closed on 05-jan-00
12-JAN-00  26-JAN-00   4
13-JAN-00  26-JAN-00   5
15-JAN-00  26-JAN-00   4  -> 2nd order closed on 15-jan-00
27-JAN-00  28-JAN-00   1  -> all orders closed except the present one

I need the number of orders which are still open as of each open_date. For example, consider open_date 05-jan-00: the number of open orders is 3, because the order opened on 01-jan-00 is closed on the date of interest, 05-jan-00. But the record with open_date 15-jan-00 shows 4 because 4 orders are open.
Thanks in advance
Thanks in advance


Tom Kyte
May 14, 2003 - 6:57 am UTC

I think your example has a "bug", 

15-JAN-00 26-JAN-00   4 -> 2nd order closed on 15-jan-00

should have a count of 5, not 4 (it would have 6 if the second order did not close, 5 since it did).

So, this will do it:

ops$tkyte@ORA920LAP> select open_date, close_date,
  2         (select count(*)
  3                from oc_date t2
  4                   where t2.open_date <= t1.open_date
  5                     and t2.close_date > t1.open_date ) x
  6    from oc_date t1;

OPEN_DATE CLOSE_DAT          X
--------- --------- ----------
01-JAN-00 05-JAN-00          1
02-JAN-00 15-JAN-00          2
03-JAN-00 16-JAN-00          3
05-JAN-00 16-JAN-00          3
12-JAN-00 26-JAN-00          4
13-JAN-00 26-JAN-00          5
15-JAN-00 26-JAN-00          5
27-JAN-00 28-JAN-00          1

8 rows selected.
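The correlated subquery maps directly onto a nested loop; a minimal Python sketch of the same count, using the sample orders:

```python
from datetime import date

orders = [
    (date(2000, 1, 1), date(2000, 1, 5)),
    (date(2000, 1, 2), date(2000, 1, 15)),
    (date(2000, 1, 3), date(2000, 1, 16)),
    (date(2000, 1, 5), date(2000, 1, 16)),
    (date(2000, 1, 12), date(2000, 1, 26)),
    (date(2000, 1, 13), date(2000, 1, 26)),
    (date(2000, 1, 15), date(2000, 1, 26)),
    (date(2000, 1, 27), date(2000, 1, 28)),
]

def open_counts(orders):
    # for each order, count orders opened on or before its open_date
    # and still open then (close_date strictly after it)
    return [sum(1 for (o2, c2) in orders if o2 <= o1 and c2 > o1)
            for (o1, _) in orders]

print(open_counts(orders))   # [1, 2, 3, 3, 4, 5, 5, 1]
```

Note the strict `c2 > o1`, matching `t2.close_date > t1.open_date` in the SQL, which is what makes 15-JAN count 5 rather than 4.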

 

thanks for the order dates in the question above

umesh, May 15, 2003 - 4:39 am UTC

it was really made simple by you Tom, Thanks
I thought this needed analytics and broke my head using analytics

current date with preceding (are analytics the answer?)

bob, October 31, 2003 - 11:48 am UTC

Tom,

Just like the original thread, what if the counts were all relative to SYSDATE instead of the most recent row..
---

If a user wanted to generate counts/sums/avgs/etc. over various windows, with the window of time being the last day, month, year, etc., analytics only fit when that window is relative to a value in the current row, right?

so that if I wanted counts over the last month,year, or second, I could only get them relative to the txn_date for a given row. If I was interested in many different intervals relative to SYSDATE, should I just do a bunch of subqueries each with a SYSDATE-X, or should I union a sysdate row into the tables the analytics are being used against, so that my counts can be relative to that row?

I wasn't sure if this was something that analytics shouldn't be asked to do, and if standard approaches using multiple inline views each of which looking at the same data over different time range were better for.

Tom Kyte
November 01, 2003 - 12:03 pm UTC



i'm not sure what you mean? if they were not relative to the txn_date for a given row -- what use would sysdate be? can you provide a small example with data?

clarification

bob, November 01, 2003 - 3:44 pm UTC

Tom,

I hope this makes more sense. SQL is probably more universally understandable than my logic, so I will try to use it to express my thoughts. I can easily copy what you did above over various time windows, relative to a particular row or to all rows.
If I wanted counts of transactions over the 5, 6, and 7 days preceding every transaction I could do something like:

select
  count(*) over (order by txn_date range numtodsinterval(5,'day') preceding) last_5,
  count(*) over (order by txn_date range numtodsinterval(6,'day') preceding) last_6,
  count(*) over (order by txn_date range numtodsinterval(7,'day') preceding) last_7
from transactions;

but if instead I wanted counts over the 5, 6, and 7 days preceding the current day (and there were not necessarily any transactions today), I don't think I could use these analytics anymore. Is that right?

without analytics, I use your standard pivot technique on a dummy summary column (see below) to pivot a unioned vertical result set into a horizontal one that looks something like:

the pivoted version looks like:

        last 5   last 6   last 7
cat A       10       15       20
cat B        3       24       46

the vertical looks like
SELECT category,
       count(*),
       'last 5' summary   -- pivot on this value
  FROM table
 WHERE date_col > sysdate - 5
 GROUP BY category
UNION
SELECT category,
       count(*),
       'last 6'
  FROM table
 WHERE date_col > sysdate - 6
 GROUP BY category
UNION
SELECT category,
       count(*),
       'last 7'
  FROM table
 WHERE date_col > sysdate - 7
 GROUP BY category

I thought I might be able to use the first approach using analytics instead if I did something like this:

select * from (
select
txn_date,
count(*) over (order by txn_date range 5....),
count(*) over (order by txn_date range 6....),
count(*) over (order by txn_date range 7....)
from (
select txn_date from table
union
select sysdate from dual
)
)
where txn_date=sysdate

The where clause might not be right idea, but the intention is to only show the counts of the window relative to the sysdate row only. (I assume sysdate will be the same date/time within the same query no matter how many times it is used, and that may be incorrect)

Is it likely that analytics with a unioned sysdate row would be a better approach than the pivot of the unioned queries?
I hope this makes more sense. I look forward to your insight. I have a pivoting version working, so I am sure I could experiment with the analytics version and post my comparison results, but I thought you might have some insight without actually solving the problem.

Using the union, I can easily see that depending on the time ranges being looked at, I will likely be querying up and counting the same data many times, where I think the analytics might do this more efficiently internally.

Tom Kyte
November 01, 2003 - 6:27 pm UTC

oh, for that, CASE is infinitely useful


select count( case when DT between sysdate-5 and sysdate then 1 else null end ),
       count( case when DT between sysdate-15 and sysdate then 1 else null end )
...


and can even be combined with analytics when necessary.
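The CASE trick counts several trailing windows relative to a fixed point in a single pass over the data; sketched in Python (the window sizes and data are made up for illustration):

```python
from datetime import datetime, timedelta

def trailing_counts(dates, now, windows=(5, 15)):
    """One pass over the data, one counter per trailing window --
    the same shape as COUNT(CASE WHEN dt BETWEEN sysdate-N AND sysdate ...)."""
    counts = {n: 0 for n in windows}
    for dt in dates:
        for n in windows:
            if now - timedelta(days=n) <= dt <= now:
                counts[n] += 1
    return counts

now = datetime(2004, 1, 1)
dates = [now - timedelta(days=d) for d in range(0, 30, 2)]   # every other day
print(trailing_counts(dates, now))   # {5: 3, 15: 8}
```

One scan of the table serves every window, which is exactly what the unioned queries above end up re-reading several times over.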

thanks - you're great!

bob, November 02, 2003 - 9:05 am UTC


Question on SYSDATE comparing Logic

Ashok Rathi, June 17, 2004 - 10:40 am UTC

Hi Tom ,
Any clue which is more efficient

1 . start_date <= sysdate and ( end_date is null or
end_date >= sysdate )

2. trunc(sysdate) between start_date and nvl(end_date,'31-dec-9999')


I would think the first one is more efficient, given that Oracle short-circuits OR conditions.


Thanks and Regards
Ashok

Tom Kyte
June 17, 2004 - 1:35 pm UTC

forget about the "or" -- if there were an index on (start_date, end_date), the first would be able to use it and the second would not.

also, using trunc and nvl would be "expensive"

it is easy enough to test:



ops$tkyte@ORA9IR2> create or replace procedure p1( start_date in date, end_date in date )
  2  is
  3          l_x number := 1;
  4  begin
  5          if ( start_date <= sysdate and ( end_date is null or end_date >= sysdate ) )
  6          then
  7                  l_x := 1;
  8          end if;
  9  end;
 10  /
 
Procedure created.
 
ops$tkyte@ORA9IR2> create or replace procedure p2( start_date in date, end_date in date )
  2  is
  3          l_x number := 1;
  4  begin
  5      if ( trunc(sysdate) between start_date and nvl(end_date,'31-dec-9999') )
  6          then
  7                  l_x := 1;
  8          end if;
  9  end;
 10  /
 
Procedure created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> exec runstats_pkg.rs_start
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> begin
  2  for j in 1 .. 500
  3  loop
  4          for i in 1 .. 50
  5          loop
  6                  p1( sysdate-25+i, sysdate-25+i );
  7                  p1( sysdate-25+i, null );
  8          end loop;
  9  end loop;
 10  end;
 11  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec runstats_pkg.rs_middle
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> begin
  2  for j in 1 .. 500
  3  loop
  4          for i in 1 .. 50
  5          loop
  6                  p2( sysdate-25+i, sysdate-25+i );
  7                  p2( sysdate-25+i, null );
  8          end loop;
  9  end loop;
 10  end;
 11  /
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA9IR2> exec runstats_pkg.rs_stop(1000)
Run1 ran in 46 hsecs
Run2 ran in 83 hsecs
run 1 ran in 55.42% of the time

 

CAN THIS BE DONE ???? IN ORACLE

Lou, July 27, 2004 - 6:01 pm UTC

a procedure where I can put something like this

01/01/2004 23:59:59

add 01/01/2004 00:00:01

and get 01/01/2004 24:00:01

and then 01/01/2004 00:00:00

and then get 01/01/2004 00:00:01

Thanks Tom

Tom Kyte
July 27, 2004 - 7:49 pm UTC

well, huh?

you add a date to a date?

what does it mean to add "jan 1st at 11:59:59pm" with "jan 1st at one second past midnight"???????

and what time is "24:00:01"?

no clue really what you are trying to accomplish -- do you want to add N seconds to a date?

if so, add N * (1/24/60/60)

to add minutes N*(1/24/60)

to add hours N*(1/24)

to add days N*(1)

or in 9i, use intervals...
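The day-fraction arithmetic maps onto plain interval math; for comparison, the same additions sketched with Python's datetime:

```python
from datetime import datetime, timedelta

d = datetime(2004, 1, 1, 23, 59, 59)

# Oracle: d + N*(1/24/60/60) adds N seconds; with datetime, use timedelta
print(d + timedelta(seconds=1))    # 2004-01-02 00:00:00
print(d + timedelta(minutes=5))    # N*(1/24/60) in Oracle
print(d + timedelta(hours=1))      # N*(1/24)
print(d + timedelta(days=1))       # N*(1)
```

Adding one second to 23:59:59 rolls over to midnight of the next day -- there is no "24:00:01".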

Time Interval Counts using Analytical Funcs

Mahesh, December 14, 2004 - 7:45 pm UTC

I have a true follow-on question to your first answer in this thread --
I'd like to know how many transactions were executed in 5-minute intervals over the last 7 days.

TXN_DATE          COUNT
14-JAN-02:09:10      20
14-JAN-02:09:15      30
14-JAN-02:09:20      60

I was wondering how we could use an analytical function...

Tom Kyte
December 15, 2004 - 1:19 pm UTC

they would not be appropriate.  A simple GROUP BY is what you need -- analytics would not be "useful" here.


ops$tkyte@ORA9IR2> select dt,
  2         trunc(dt,'hh') +
  3             trunc(to_char(dt,'mi')/5)*5*(1/24/60)
  4    from (select to_date('01-jan-2004 12:00:00', 'dd-mon-yyyy hh24:mi:ss') + rownum/24/60 dt from all_objects
  5    where rownum <= 60)
  6  /
 
DT                   TRUNC(DT,'HH')+TRUNC
-------------------- --------------------
01-jan-2004 12:01:00 01-jan-2004 12:00:00
01-jan-2004 12:02:00 01-jan-2004 12:00:00
01-jan-2004 12:03:00 01-jan-2004 12:00:00
01-jan-2004 12:04:00 01-jan-2004 12:00:00
01-jan-2004 12:05:00 01-jan-2004 12:05:00
01-jan-2004 12:06:00 01-jan-2004 12:05:00
01-jan-2004 12:07:00 01-jan-2004 12:05:00
01-jan-2004 12:08:00 01-jan-2004 12:05:00
01-jan-2004 12:09:00 01-jan-2004 12:05:00
01-jan-2004 12:10:00 01-jan-2004 12:10:00
01-jan-2004 12:11:00 01-jan-2004 12:10:00



     trunc(dt,'hh') +  trunc(to_char(dt,'mi')/5)*5*(1/24/60)

is one method of getting things rounded to 5 minute intervals -- that is what you want to group by. 
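The same bucketing -- truncate to the hour, then snap the minutes down to a multiple of 5 -- sketched in Python:

```python
from datetime import datetime, timedelta
from collections import Counter

def bucket_5min(dt):
    # equivalent of trunc(dt,'hh') + trunc(to_char(dt,'mi')/5)*5*(1/24/60)
    return dt.replace(minute=(dt.minute // 5) * 5, second=0, microsecond=0)

# transactions at 12:01 .. 12:11, then counted per 5-minute bucket
txns = [datetime(2004, 1, 1, 12, 0, 0) + timedelta(minutes=m) for m in range(1, 12)]
print(Counter(bucket_5min(t) for t in txns))
```

Grouping by that bucket value gives the per-interval counts the question asks for.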

Dave T, June 30, 2006 - 10:02 am UTC

Hi Tom,

I have the following table and data:

create table t_askdave (yrmth varchar2(6), dept varchar2 (10), qty number);

insert into t_askdave values ('200204', 'PLUMB01', 23);
insert into t_askdave values ('200205', 'PLUMB01', 18);
insert into t_askdave values ('200206', 'PLUMB01', 7);
insert into t_askdave values ('200207', 'PLUMB01', 9);
insert into t_askdave values ('200208', 'PLUMB01', 14);
insert into t_askdave values ('200209', 'PLUMB01', 18);
insert into t_askdave values ('200210', 'PLUMB01', 23);
insert into t_askdave values ('200211', 'PLUMB01', 17);
insert into t_askdave values ('200212', 'PLUMB01', 3);
insert into t_askdave values ('200301', 'PLUMB01', 24);
insert into t_askdave values ('200302', 'PLUMB01', 24);
insert into t_askdave values ('200303', 'PLUMB01', 22);
insert into t_askdave values ('200304', 'PLUMB01', 19);
insert into t_askdave values ('200305', 'PLUMB01', 1);
insert into t_askdave values ('200306', 'PLUMB01', 3);
insert into t_askdave values ('200307', 'PLUMB01', 19);
insert into t_askdave values ('200308', 'PLUMB01', 17);
insert into t_askdave values ('200309', 'PLUMB01', 2);
insert into t_askdave values ('200310', 'PLUMB01', 8);
insert into t_askdave values ('200311', 'PLUMB01', 9);
insert into t_askdave values ('200312', 'PLUMB01', 9);
insert into t_askdave values ('200401', 'PLUMB01', 9);
insert into t_askdave values ('200402', 'PLUMB01', 22);
insert into t_askdave values ('200403', 'PLUMB01', 20);
insert into t_askdave values ('200404', 'PLUMB01', 14);
insert into t_askdave values ('200405', 'PLUMB01', 3);
insert into t_askdave values ('200406', 'PLUMB01', 14);
insert into t_askdave values ('200407', 'PLUMB01', 7);
insert into t_askdave values ('200408', 'PLUMB01', 9);
insert into t_askdave values ('200409', 'PLUMB01', 18);
insert into t_askdave values ('200410', 'PLUMB01', 22);
insert into t_askdave values ('200411', 'PLUMB01', 18);
insert into t_askdave values ('200412', 'PLUMB01', 19);
insert into t_askdave values ('200501', 'PLUMB01', 2);
insert into t_askdave values ('200502', 'PLUMB01', 18);
insert into t_askdave values ('200503', 'PLUMB01', 0);
insert into t_askdave values ('200504', 'PLUMB01', 5);
insert into t_askdave values ('200505', 'PLUMB01', 5);
insert into t_askdave values ('200506', 'PLUMB01', 4);
insert into t_askdave values ('200507', 'PLUMB01', 22);
insert into t_askdave values ('200508', 'PLUMB01', 17);
insert into t_askdave values ('200509', 'PLUMB01', 18);
insert into t_askdave values ('200510', 'PLUMB01', 14);
insert into t_askdave values ('200511', 'PLUMB01', 4);
insert into t_askdave values ('200512', 'PLUMB01', 0);
insert into t_askdave values ('200601', 'PLUMB01', 2);
insert into t_askdave values ('200602', 'PLUMB01', 21);
insert into t_askdave values ('200603', 'PLUMB01', 21);

commit;

Initially we needed to create a running total on qty based on financial year. In the UK this is April to April so we did this:

SELECT YRMTH,
       DEPT,
       QTY,
       FIN_YR,
       SUM ( QTY ) OVER ( PARTITION BY FIN_YR
                          ORDER BY YRMTH, DEPT, FIN_YR ) RUNNING_TOTAL
  FROM (
        SELECT YRMTH,
               DEPT,
               QTY,
               MAX ( C_ ) OVER ( ORDER BY YRMTH ) FIN_YR
          FROM (
                SELECT t.*,
                       TO_NUMBER ( CASE WHEN TRIM ( TO_CHAR ( TO_DATE ( YRMTH, 'YYYYMM' ), 'MONTH' )) = 'APRIL'
                                        THEN SUBSTR ( YRMTH, 1, 4 )
                                        ELSE '0' END ) C_
                  FROM t_askdave t
                 ORDER BY TO_DATE ( yrmth, 'yyyymm' )
               )
       )

This gives:

YRMTH DEPT QTY FIN_YR RUNNING_TOTAL
------ ---------- ---------- ---------- -------------
200204 PLUMB01 23 2002 23
200205 PLUMB01 18 2002 41
200206 PLUMB01 7 2002 48
200207 PLUMB01 9 2002 57
200208 PLUMB01 14 2002 71
200209 PLUMB01 18 2002 89
200210 PLUMB01 23 2002 112
200211 PLUMB01 17 2002 129
200212 PLUMB01 3 2002 132
200301 PLUMB01 24 2002 156
200302 PLUMB01 24 2002 180

YRMTH DEPT QTY FIN_YR RUNNING_TOTAL
------ ---------- ---------- ---------- -------------
200303 PLUMB01 22 2002 202
200304 PLUMB01 19 2003 19
200305 PLUMB01 1 2003 20
200306 PLUMB01 3 2003 23
200307 PLUMB01 19 2003 42
200308 PLUMB01 17 2003 59
200309 PLUMB01 2 2003 61
200310 PLUMB01 8 2003 69
200311 PLUMB01 9 2003 78
200312 PLUMB01 9 2003 87
200401 PLUMB01 9 2003 96

YRMTH DEPT QTY FIN_YR RUNNING_TOTAL
------ ---------- ---------- ---------- -------------
200402 PLUMB01 22 2003 118
200403 PLUMB01 20 2003 138
200404 PLUMB01 14 2004 14
200405 PLUMB01 3 2004 17
200406 PLUMB01 14 2004 31
200407 PLUMB01 7 2004 38
200408 PLUMB01 9 2004 47
200409 PLUMB01 18 2004 65
200410 PLUMB01 22 2004 87
200411 PLUMB01 18 2004 105
200412 PLUMB01 19 2004 124

YRMTH DEPT QTY FIN_YR RUNNING_TOTAL
------ ---------- ---------- ---------- -------------
200501 PLUMB01 2 2004 126
200502 PLUMB01 18 2004 144
200503 PLUMB01 0 2004 144
200504 PLUMB01 5 2005 5
200505 PLUMB01 5 2005 10
200506 PLUMB01 4 2005 14
200507 PLUMB01 22 2005 36
200508 PLUMB01 17 2005 53
200509 PLUMB01 18 2005 71
200510 PLUMB01 14 2005 85
200511 PLUMB01 4 2005 89

YRMTH DEPT QTY FIN_YR RUNNING_TOTAL
------ ---------- ---------- ---------- -------------
200512 PLUMB01 0 2005 89
200601 PLUMB01 2 2005 91
200602 PLUMB01 21 2005 112
200603 PLUMB01 21 2005 133
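
As an aside (not something posted in the thread): the MAX(C_) OVER carry-forward trick can also be replaced by deriving the financial year arithmetically from YRMTH. A sketch, assuming YRMTH always holds a valid YYYYMM string; note it differs from the query above for months before the first April in the data, where the original yields 0:

```sql
select yrmth,
       dept,
       qty,
       -- months Jan-Mar belong to the financial year that started the
       -- previous April, so subtract 1 from the calendar year for them
       to_number(substr(yrmth, 1, 4))
         - case when to_number(substr(yrmth, 5, 2)) < 4 then 1 else 0 end fin_yr
  from t_askdave
```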

Now we want to add a column that shows the value for the corresponding month in the previous financial year.

This is how far we have got:

SELECT YRMTH,
       DEPT,
       RUNNING_TOTAL,
       FIRST_VALUE ( RUNNING_TOTAL ) OVER ( ORDER BY YRMTH ASC RANGE YRMTH - ADD_MONTHS ( YRMTH, -12 ) PRECEDING ) PRE
FROM (
      SELECT YRMTH,
             DEPT,
             QTY,
             FIN_YR,
             SUM ( QTY ) OVER ( PARTITION BY FIN_YR ORDER BY YRMTH, DEPT, FIN_YR ) RUNNING_TOTAL
      FROM (
            SELECT TO_DATE ( YRMTH, 'YYYYMM' ) YRMTH,
                   DEPT,
                   QTY,
                   MAX ( C_ ) OVER ( ORDER BY YRMTH ) FIN_YR
            FROM (
                  SELECT t.*,
                         TO_NUMBER ( CASE WHEN TRIM ( TO_CHAR ( TO_DATE ( YRMTH, 'YYYYMM' ), 'MONTH' )) = 'APRIL'
                                     THEN SUBSTR ( YRMTH, 1, 4 ) ELSE '0' END ) C_
                  FROM t_askdave t
                  ORDER BY TO_DATE ( yrmth, 'yyyymm' )
                 )
           ) I
     )

This gives:

YRMTH DEPT RUNNING_TOTAL PRE
--------- ---------- ------------- ----------
01-APR-02 PLUMB01 23 23
01-MAY-02 PLUMB01 41 23
01-JUN-02 PLUMB01 48 23
01-JUL-02 PLUMB01 57 23
01-AUG-02 PLUMB01 71 23
01-SEP-02 PLUMB01 89 23
01-OCT-02 PLUMB01 112 23
01-NOV-02 PLUMB01 129 23
01-DEC-02 PLUMB01 132 23
01-JAN-03 PLUMB01 156 23
01-FEB-03 PLUMB01 180 23

YRMTH DEPT RUNNING_TOTAL PRE
--------- ---------- ------------- ----------
01-MAR-03 PLUMB01 202 23
01-APR-03 PLUMB01 19 23
01-MAY-03 PLUMB01 20 41
01-JUN-03 PLUMB01 23 48
01-JUL-03 PLUMB01 42 57
01-AUG-03 PLUMB01 59 71
01-SEP-03 PLUMB01 61 89
01-OCT-03 PLUMB01 69 112
01-NOV-03 PLUMB01 78 129
01-DEC-03 PLUMB01 87 132
01-JAN-04 PLUMB01 96 156

YRMTH DEPT RUNNING_TOTAL PRE
--------- ---------- ------------- ----------
01-FEB-04 PLUMB01 118 180
01-MAR-04 PLUMB01 138 202
01-APR-04 PLUMB01 14 19
01-MAY-04 PLUMB01 17 20
01-JUN-04 PLUMB01 31 23
01-JUL-04 PLUMB01 38 42
01-AUG-04 PLUMB01 47 59
01-SEP-04 PLUMB01 65 61
01-OCT-04 PLUMB01 87 69
01-NOV-04 PLUMB01 105 78
01-DEC-04 PLUMB01 124 87

YRMTH DEPT RUNNING_TOTAL PRE
--------- ---------- ------------- ----------
01-JAN-05 PLUMB01 126 96
01-FEB-05 PLUMB01 144 118
01-MAR-05 PLUMB01 144 138
01-APR-05 PLUMB01 5 14
01-MAY-05 PLUMB01 10 17
01-JUN-05 PLUMB01 14 31
01-JUL-05 PLUMB01 36 38
01-AUG-05 PLUMB01 53 47
01-SEP-05 PLUMB01 71 65
01-OCT-05 PLUMB01 85 87
01-NOV-05 PLUMB01 89 105

YRMTH DEPT RUNNING_TOTAL PRE
--------- ---------- ------------- ----------
01-DEC-05 PLUMB01 89 124
01-JAN-06 PLUMB01 91 126
01-FEB-06 PLUMB01 112 144
01-MAR-06 PLUMB01 133 144

This is almost what we are after. The only values that I am not happy with are the months where there was no data for the previous financial year (anything before 01-apr-03). Can you explain how the query has populated these fields?

Thanks.

PS: sorry about the date stored as a char!

Tom Kyte
June 30, 2006 - 12:41 pm UTC

you used first_value - that is a relative window, not an "absolute offset". The first value in the window for the first row is the row itself; the first value for the second row is the row in front of it.

maybe you wanted to use lag(..,12) ?
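
For instance (a sketch, not part of the original answer), lag with an offset of 12 looks back exactly 12 rows, which for one-row-per-month data is the same month a year earlier, and it returns NULL when no such row exists - which is what the poster wanted for the first financial year:

```sql
select yrmth,
       dept,
       running_total,
       -- look back exactly 12 rows; NULL (not a carried-forward value)
       -- when there is no corresponding month a year earlier
       lag(running_total, 12) over (partition by dept order by yrmth) pre
  from ( /* the running-total query from above */ )
```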

Distinct items ++

Reader, November 21, 2006 - 6:38 pm UTC

I wish to find the number of distinct items sold every day across different stores, and it has to be a cumulative total.
So, if on the first day I have sold 2 items, A and B, then the number will be 2, and if on the second day we sold B and C then the count has to be 3 (A, B, C).

The table could be something like this --
create table temp (store_id varchar2(4),itm_cd varchar2(15) ,sold_dt date )

insert into temp values ('S01','A123','21-Nov-2006');
insert into temp values ('S01','A124','21-Nov-2006');
insert into temp values ('S02','A123','21-Nov-2006');
insert into temp values ('S04','A121','22-Nov-2006');
insert into temp values ('S01','A125','23-Nov-2006');
insert into temp values ('S02','A126','27-Nov-2006');
insert into temp values ('S01','A123','28-Nov-2006');
commit;

The expected output should be ---
S01 21-Nov-2006 2
S02 21-Nov-2006 1
S04 22-Nov-2006 1
S01 23-Nov-2006 3
S02 27-Nov-2006 2
S01 28-Nov-2006 3

I would appreciate it if you could show me how to do it using analytic functions.

Thanks

Tom Kyte
November 22, 2006 - 4:45 pm UTC

ops$tkyte%ORA10GR2> select store_id, sold_dt, sum(cnt) over (partition by store_id order by sold_dt) rt
  2    from (select store_id, sold_dt, count(*) cnt from temp group by store_id, sold_dt )
  3   order by sold_dt, store_id
  4  /

STOR SOLD_DT           RT
---- --------- ----------
S01  21-NOV-06          2
S02  21-NOV-06          1
S04  22-NOV-06          1
S01  23-NOV-06          3
S02  27-NOV-06          2
S01  28-NOV-06          4

6 rows selected.


doesn't match your output, but yours is wrong on the last line I think... 
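
(For reference, a cumulative DISTINCT count - which is what the expected output shows - needs each item counted only the first time it appears for a store. A sketch, not from the original answer:

```sql
select store_id, sold_dt,
       sum(cnt) over (partition by store_id order by sold_dt) rt
  from (
        -- cnt = number of items making their FIRST appearance
        -- for this store on this date
        select store_id, sold_dt,
               count(case when rn = 1 then 1 end) cnt
          from (select t.*,
                       row_number() over (partition by store_id, itm_cd
                                          order by sold_dt) rn
                  from temp t)
         group by store_id, sold_dt
       )
 order by sold_dt, store_id
```

Against the sample data this yields 3, not 4, for S01 on 28-Nov-2006, since A123 had already been counted.)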

Distinct sold items Sold ...

Reader, November 24, 2006 - 11:31 am UTC

Maybe I have not conveyed my case properly...

My output above is correct because, if you look at the data, there are only three distinct items sold up to 11/28/2006 (itm_cd A123, A124, A125), so the count should be 3 instead of 4.

My logic is something like this --
distinct items in the first week,
distinct items in the first 2 weeks ,
distinct items in the first 3 weeks ,
distinct items in the first 4 weeks ,
then distinct items in the whole month

Is it that we will need to use a windowing function?

Thanks ,

Tom Kyte
November 24, 2006 - 6:46 pm UTC

first week of WHAT?
2 weeks of WHAT?

clear as mud I'm afraid

Contd...

Reader, November 24, 2006 - 7:17 pm UTC

Distinct items sold in the First week of the month --
Distinct items sold in the first two weeks of the month --
Distinct items sold in the first three weeks of the month --
Distinct items sold in the first four weeks of the month --
Distinct items sold in the month --
I hope it is better this time --



Tom Kyte
November 24, 2006 - 7:24 pm UTC

so, can this be in a single row?

and define for us all "first week of month"

"Subtracting" date ranges from other date ranges?

Mobra, December 02, 2006 - 3:16 am UTC

Let's say we have a table like this:

create table period
(from_date date,
to_date date,
direction varchar2(3));

insert into period (from_date, to_date, direction) values (to_date('01.01.2006', 'dd.mm.yyyy'), to_date('31.06.2006', 'dd.mm.yyyy'), 'IN');
insert into period (from_date, to_date, direction) values (to_date('10.10.2006', 'dd.mm.yyyy'), to_date('31.12.2006', 'dd.mm.yyyy'), 'IN');

insert into period (from_date, to_date, direction) values (to_date('05.01.2006', 'dd.mm.yyyy'), to_date('10.01.2006', 'dd.mm.yyyy'), 'OUT');
insert into period (from_date, to_date, direction) values (to_date('03.04.2006', 'dd.mm.yyyy'), to_date('24.04.2006', 'dd.mm.yyyy'), 'OUT');


What we want is effectively to:

- keep all the "IN" periods
- if one or more of the "OUT" periods overlap with an "IN" period, remove that part from the "IN" period, possibly splitting the "IN" period into two separate periods if the "OUT" period is in the middle of the "IN" period


So that this:

IN |---------------------| |---------------|
OUT |----| |---|


Becomes like this:

IN |---| |------| |-| |---------------|



Using the above data, the result should be something like:

FROM_DATE TO_DATE
01.01.2006 05.01.2006
10.01.2006 03.04.2006
24.04.2006 31.06.2006
10.10.2006 31.12.2006

Which represents the "IN" periods "minus" the "OUT" periods.


Is it possible to do this easily using SQL? Perhaps with analytical functions?

Tom Kyte
December 02, 2006 - 1:03 pm UTC

can OUTs overlap?

can OUTs go past the end of an IN?

we need to know all of the rules.

In response to the above problem

Mobra, December 03, 2006 - 11:01 am UTC


Basically, the "in" periods represent leasing an item, and the "out" periods represent re-letting the item. In other words:

- OUTs cannot overlap
- OUTs do not go past the end of an IN



Tom Kyte
December 03, 2006 - 1:17 pm UTC

run your example script - feel free to fix it and repost it.

See if this works

Anwar, December 04, 2006 - 1:25 am UTC

I am not sure whether this is the most efficient way of doing it, but it seems to generate the desired output.
(There is no 31st of June, so I am assuming you meant 30.06.2006.)

SQL>with per as
  2     (select strt,ofdt,cnt,totcnt,ifdt,itdt,otdt
  3     from
  4             (
  5             select o.from_date ofdt,
  6                     lag(o.to_date,1,i.from_date)
  7                     over (partition by i.from_date order by o.from_date) strt,
  8                     count(i.from_date)
  9                     over (partition by i.from_date order by o.from_date) cnt,
 10                     count(i.from_date)
 11                     over (partition by i.from_date order by 1) totcnt,
 12                     i.from_date ifdt,
 13                     i.to_date itdt,
 14                     o.to_date otdt
 15             from period i,period o
 16             where i.direction='IN' and o.direction='OUT' and
 17                   o.to_date between i.from_date and i.to_date
 18             )
 19     )
 20  select strt "FROM_DATE",
 21     ofdt "TO_DATE"
 22  from per
 23  -------------------
 24  union all
 25  select otdt,
 26     itdt
 27  from per
 28  where cnt=totcnt
 29  --------------------
 30  union all
 31  select from_date,
 32     to_date
 33  from period x
 34  where x.direction='IN' and
 35     not exists (select 1 from period
 36             where direction='OUT' and
 37                     from_date between x.from_date and x.to_date)
 38  order by 1
 39  /

FROM_DATE TO_DATE
--------- ---------
01-JAN-06 05-JAN-06
10-JAN-06 03-APR-06
24-APR-06 30-JUN-06
10-OCT-06 31-DEC-06

4 rows selected.


regards
anwar 

Matthew, December 04, 2006 - 5:22 am UTC

Anwar, does your query rely on Mobra's specification that the OUTs not go past the end of an IN?

We had the same requirements earlier this year, except that the out times weren't a subset of the in times.

Thanks!

Mobra, December 04, 2006 - 6:27 am UTC

Anwar, your solution seems to work according to my requirements. Thanks a lot for your help!


Anwar, December 05, 2006 - 12:05 am UTC

Matthew, yes my query assumes that OUTs don't go past INs. However, if there's a likelihood of OUTs going past INs, then changing the WHERE clause in the second SELECT to "where cnt=totcnt and otdt<itdt" will probably serve your purpose.
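
Spelled out against Anwar's query above, that second branch of the union would then read:

```sql
select otdt,
       itdt
  from per
 where cnt = totcnt
   and otdt < itdt
```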


windowing query for financial year

Phil, July 04, 2007 - 11:46 am UTC

Hi Tom
This sort of relates to the above - I want to select the max(date), grouping by status, on a table, but group this for each UK financial year (1st April to 31st March). Now, I can do this by hard-coding the dates, but I kind of know you'll have an analytic way up your sleeve?...
Phil
Tom Kyte
July 05, 2007 - 12:53 pm UTC

just use a sql function to determine the FY.


case when to_number(to_char(dt,'mm')) between 4 and 12 then to_number(to_char(dt,'yyyy'))+1
     else to_number(to_char(dt,'yyyy'))
 end FISCAL_YEAR



now you have a column named fiscal year in your query and you can partition by that.
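
For instance (a sketch, not from the original answer - the table and column names my_table, status and dt are made up), the max(date)-by-status-per-financial-year request could look like:

```sql
select status, fiscal_year, max(dt) max_dt
  from (select t.*,
               -- UK financial year labelled by its starting calendar year:
               -- Jan-Mar belong to the year that began the previous April
               case when to_number(to_char(dt,'mm')) >= 4
                    then to_number(to_char(dt,'yyyy'))
                    else to_number(to_char(dt,'yyyy')) - 1
               end fiscal_year
          from my_table t)
 group by status, fiscal_year
```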

Brilliant

Phil, July 06, 2007 - 5:34 am UTC

Your solutions are sort of obvious but I never fail to be impressed you can roll them out so effortlessly.
I've made a small change, but this is superb, thanks (and just done this as a select for sysdate).
Phil

select case
         when to_number(to_char(sysdate,'mm')) between 4 and 12
         then to_number(to_char(sysdate,'yyyy'))
         else to_number(to_char(sysdate,'yyyy')) - 1
       end FISCAL_YEAR
  from dual


Tom Kyte
July 06, 2007 - 12:49 pm UTC

ahh, in the US our convention is the FY is the year at the END of the fiscal year.

eg: oracle's year is 1-jun through 31-may. We are currently in FY2008 having just finished FY2007

sliding cumulative aggregation

A reader, August 03, 2007 - 7:31 am UTC

Hi Tom
That approach has been extremely helpful. Thanks again.
I'm afraid a new question has come my way and the only way I can think of resolving it is procedurally which is always inferior to SQL alone. I have a hunch you might be able to help...

I need to count inventory for each financial year
However, each row has a start_date and end_date
Therefore, start_date (without end_date) gives me the financial year I should count from
BUT, I also need to count that row again in the next financial years if it does not have an end_date!

Does that make sense? I'm thinking pl/sql but if there is an option with SQL I'd be amazed and grateful.

Phil




Tom Kyte
August 05, 2007 - 1:32 pm UTC

example please.

Resolved...

A reader, August 04, 2007 - 8:50 am UTC

Tom
The strategy here was a with fin_years as (select to_char(sysdate,'yyyy')-level-21 y from dual connect by level<21) before the query, and then to join on fin_years, providing more than one row per row of source data.
You've provided the answer in other postings before.
Phil
Tom Kyte
August 05, 2007 - 2:15 pm UTC

example please - show us what sort of output you mean.

example

Phil, August 06, 2007 - 11:45 am UTC

Tom, here is the approach that sorted it - I'm not on the site today so this is what I remember...

with fin_years as (
  select to_char(sysdate,'yyyy') - level - 21 y
    from dual
 connect by level < 21
)
select count(*), to_char(t.start_date,'yyyy')
  from table_i_am_interested_in t
  join fin_years
    on t.start_date >= to_date('01-JAN-' || fin_years.y, 'dd-mon-yyyy')
   and nvl(t.end_date, to_date('31-DEC-4000','dd-mon-yyyy'))
        <= to_date('31-DEC-' || fin_years.y, 'dd-mon-yyyy')
 group by to_char(t.start_date,'yyyy')

I've done this from memory so the syntax might be a tad ugly, and I am sure you'll rip it into a more attractive solution.

sum over overlapping intervals

Matt, October 14, 2011 - 6:01 pm UTC

Hi Tom

I have several overlapping intervals and need the sum of values for each group for each interval WHERE THE SUM STAYS THE SAME

desc tabA

Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
RESOURCE_ID                                        VARCHAR2(32)
EFF_ST_DATE                               NOT NULL TIMESTAMP(6) WITH TIME ZONE
EFF_END_DATE                              NOT NULL TIMESTAMP(6) WITH TIME ZONE
RA_CAP                                    NOT NULL NUMBER(10,2)

alter session set NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI:SS';

select * from tabA;


resource_id eff_st_date eff_end_date RA_CAP
------------ -------------------- -------------------- ----------
RESOURCE_A 2011-11-01 00:00:00 2011-11-02 18:00:00 41.26
RESOURCE_A 2011-11-01 00:00:00 2011-11-11 00:00:00 64.76
RESOURCE_A 2011-11-03 00:00:00 2011-11-18 00:00:00 2.23
RESOURCE_B 2011-11-06 00:00:00 2011-11-24 00:00:00 14.81
RESOURCE_B 2011-11-01 00:00:00 2011-11-06 00:00:00 59.9
RESOURCE_A 2011-11-09 00:00:00 2011-11-27 00:00:00 20.14
RESOURCE_D 2011-11-01 00:00:00 2011-12-01 00:00:00 45
RESOURCE_C 2011-11-01 00:00:00 2011-12-01 00:00:00 23.02
RESOURCE_C 2011-11-01 00:00:00 2011-12-01 00:00:00 43.48
RESOURCE_C 2011-11-01 00:00:00 2011-12-01 00:00:00 76.16
RESOURCE_B 2011-11-24 00:00:00 2011-12-01 00:00:00 66.3

11 rows selected.

I need a report of the start and end of intervals windowed on the SUM of RA_CAPs during the interval

Output I need:

resource_id eff_st_date eff_end_date RA_CAP
------------ -------------------- -------------------- ----------
RESOURCE_A 2011-11-01 00:00:00 2011-11-02 18:00:00 106.02
RESOURCE_A 2011-11-01 18:00:00 2011-11-03 00:00:00 64.76
RESOURCE_A 2011-11-03 00:00:00 2011-11-09 00:00:00 66.99
RESOURCE_A 2011-11-09 00:00:00 2011-11-11 00:00:00 87.13
RESOURCE_A 2011-11-11 00:00:00 2011-11-18 00:00:00 22.37
RESOURCE_A 2011-11-18 00:00:00 2011-11-27 00:00:00 20.14
RESOURCE_B 2011-11-01 00:00:00 2011-11-06 00:00:00 59.9
...

oh. the version is Oracle 10.2.0.5

Thanks in advance
Tom Kyte
October 16, 2011 - 9:56 pm UTC

no create table
no inserts
not really going to look. I'd spend 50% of my life taking examples like this and attempting to recreate just the test data.


You'll have to explain this:


I need a report of the start and end of intervals windowed on the SUM of
RA_CAPs during the interval


a lot better. Just showing inputs and outputs doesn't do much for me. Are you supposed to take each row - and find all of the other rows (for that resource?) that overlap it and then sum up those values? That isn't going to be easy (or performant)

How many rows are we talking about that this query will have to process?

small typo in output required

Matt, October 14, 2011 - 6:03 pm UTC

Sorry small typo in output I need

RESOURCE_A 2011-11-01 00:00:00 2011-11-02 18:00:00 106.02
RESOURCE_A 2011-11-02 18:00:00 2011-11-03 00:00:00 64.76
RESOURCE_A 2011-11-03 00:00:00 2011-11-09 00:00:00 66.99
RESOURCE_A 2011-11-09 00:00:00 2011-11-11 00:00:00 87.13
RESOURCE_A 2011-11-11 00:00:00 2011-11-18 00:00:00 22.37
RESOURCE_A 2011-11-18 00:00:00 2011-11-27 00:00:00 20.14
RESOURCE_B 2011-11-01 00:00:00 2011-11-06 00:00:00 59.9
Tom Kyte
October 16, 2011 - 9:56 pm UTC

see above

More info

Matt, October 17, 2011 - 11:38 am UTC

> You'll have to explain this:
>
> > I need a report of the start and end of intervals
> > windowed on the SUM of RA_CAPs during the interval

If you look closely, there are multiple records with the same resource, each of which could have overlapping eff_st_date and eff_end_date. The report is to show the sum of the RA_CAP values for all the periods covered by the overlapping start and end dates.

I think I could get it to sum at regular intervals such as by day by using:

select resource_id,
       eff_st_date+numtodsinterval(b.column_value,'DAY') trade_day,
       sum(ra_cap)
  from tabA a,
       table(cast(multiset(select level-1 r
                             from dual
                          connect by level <= extract(day from (eff_end_date+numtodsinterval(1,'SECOND')-eff_st_date))
                          ) as sys.odciNumberList)
            ) b
 group by resource_id, eff_st_date+numtodsinterval(b.column_value,'DAY')
 order by 1, 2

and then reconstruct it back to the report output I need by wrapping that in analytic function first_value/last_value.

That would give the true sum at those intervals....

but the problem is that eff_st_date and eff_end_date can be input as any timestamp value, and summing at a regular interval might not show a higher/lower value if the start or end does not fall on that interval. I could go to smaller and smaller intervals - hours, minutes, seconds - but I too am worried it would not be performant. However, if there is no other way to do it, I don't think any data will be input for eff_st_date or eff_end_date at anything finer than a minute.

So here are the create table and inserts for my test case. The real table will hold between 10,000 and 50,000 rows for the foreseeable future.



DROP TABLE TABA CASCADE CONSTRAINTS;

CREATE TABLE TABA
(
RESOURCE_ID VARCHAR2(32 BYTE),
EFF_ST_DATE TIMESTAMP(6) WITH TIME ZONE NOT NULL,
EFF_END_DATE TIMESTAMP(6) WITH TIME ZONE NOT NULL,
RA_CAP NUMBER(10,2) NOT NULL
);

Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_A', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/2/2011 6:00:00.000000 PM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 41.26);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_A', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/11/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 64.76);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_A', TO_TIMESTAMP_TZ('11/3/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/18/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 2.23);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_B', TO_TIMESTAMP_TZ('11/6/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/24/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 14.81);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_B', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/6/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 59.9);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_A', TO_TIMESTAMP_TZ('11/9/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('11/27/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 20.14);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_D', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('12/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 45);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_C', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('12/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 23.02);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_C', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('12/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 43.48);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_C', TO_TIMESTAMP_TZ('11/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('12/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 76.16);
Insert into TABA
(RESOURCE_ID, EFF_ST_DATE, EFF_END_DATE, RA_CAP)
Values
('RESOURCE_B', TO_TIMESTAMP_TZ('11/24/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), TO_TIMESTAMP_TZ('12/1/2011 12:00:00.000000 AM -07:00','fmMMfm/fmDDfm/YYYY fmHH12fm:MI:SS.FF AM TZH:TZM'), 66.3);
COMMIT;


as always Tom, thanks for your time and effort keeping this website updated and informative.

ugh. another typo

Matt, October 17, 2011 - 12:09 pm UTC

In the above select to sum by day, change

connect by level <= extract(day from (eff_end_date+numtodsinterval(1,'SECOND')-eff_st_date))

to

connect by level <= extract(day from (eff_end_date-eff_st_date))

It's something left over from testing with the idiosyncrasies of the real table (eff_end_date is 1 second before the real end, usually 11:59:59 pm) that is not applicable to the discussion.
Tom Kyte
October 17, 2011 - 6:34 pm UTC

If you look closely, there are multiple records with the same resource, each of which
could have overlapping eff_st_date and eff_end_date.


we shouldn't have to look closely and try to reverse engineer YOUR OUTPUT into what you meant to specify as a specification. Not only is that error prone on our part (way too easy to "guess" wrong or make incorrect assumptions based on a simple example) - it is also backwards - the person asking for help should lay out the groundwork - do the legwork - provide the specification


I still don't get your output, if I get the sum of overlaps for each record, this is what I see:

ops$tkyte%ORA11GR2> select t1.resource_id,
  2         to_char( t1.eff_st_date, 'dd-mon-yy hh24:mi:ss'),
  3             to_char( t1.eff_end_date, 'dd-mon-yy hh24:mi:ss'),
  4             t1.ra_cap,
  5             (select sum(ra_cap)
  6                from t t2
  7                   where t1.resource_id = t2.resource_id
  8                     and t1.eff_st_date <= t2.eff_end_date
  9                     and t1.eff_end_date >= t2.eff_st_date) sum_ra_cap
 10    from t t1
 11  /

RESOURCE_ID     TO_CHAR(T1.EFF_ST_ TO_CHAR(T1.EFF_END     RA_CAP SUM_RA_CAP
--------------- ------------------ ------------------ ---------- ----------
RESOURCE_A      01-nov-11 00:00:00 02-nov-11 18:00:00      41.26     106.02
RESOURCE_A      01-nov-11 00:00:00 11-nov-11 00:00:00      64.76     128.39
RESOURCE_A      03-nov-11 00:00:00 18-nov-11 00:00:00       2.23      87.13
RESOURCE_B      06-nov-11 00:00:00 24-nov-11 00:00:00      14.81     141.01
RESOURCE_B      01-nov-11 00:00:00 06-nov-11 00:00:00       59.9      74.71
RESOURCE_A      09-nov-11 00:00:00 27-nov-11 00:00:00      20.14      87.13
RESOURCE_D      01-nov-11 00:00:00 01-dec-11 00:00:00         45         45
RESOURCE_C      01-nov-11 00:00:00 01-dec-11 00:00:00      23.02     142.66
RESOURCE_C      01-nov-11 00:00:00 01-dec-11 00:00:00      43.48     142.66
RESOURCE_C      01-nov-11 00:00:00 01-dec-11 00:00:00      76.16     142.66
RESOURCE_B      24-nov-11 00:00:00 01-dec-11 00:00:00       66.3      81.11

11 rows selected.



and I don't get how to turn that into your output.

got it

Matt, October 17, 2011 - 6:41 pm UTC

with
points as -- get all the points at which the RA_CAP value may change
(
  select resource_id, eff_st_date p from tabA
  union
  select resource_id, eff_end_date p from tabA
),
intervals as -- get the date and next date of the above "points"
(
  select resource_id,
         p intvl_start,
         lead(p) over (partition by resource_id order by p) intvl_end
    from points
)
-- now get the sum for the above intervals
select a.resource_id,
       b.intvl_start,
       b.intvl_end,
       sum(a.ra_cap)
  from tabA a, intervals b
 where a.resource_id = b.resource_id
   and b.intvl_start >= a.eff_st_date
   and b.intvl_end   <= a.eff_end_date
 group by a.resource_id, b.intvl_start, b.intvl_end
 order by 1, 2

Thanks for trying

Matt, October 18, 2011 - 2:54 pm UTC

Not sure where I went wrong in explaining. Perhaps you didn't notice that the start and end times don't overlap in my output. They are the true sums of RA_CAP over those intervals, based upon records whose eff_st_date and eff_end_date do overlap.

Anyway I figured it out with the help of another page on this site: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:474221407101#48772536405689

And it seems pretty performant so I thank you.
Tom Kyte
October 18, 2011 - 5:06 pm UTC

you didn't explain it, that is where you went wrong.

I told you - I'm not going to reverse engineer output to figure out what magic must probably take place. I want to be told how to get from point A to point B - that is what we do in our profession - we write specifications, we explain in gory detail what needs to be done - leaving nothing to imagination, leaving no ambiguity.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.