Skip to Main Content

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Scott.

Asked: March 30, 2003 - 8:17 pm UTC

Last updated: March 15, 2010 - 11:13 am UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

G'day

I'm trying to master the use of the lag function to get last years sales within the same query.
Our data is partitioned by two time attributes, fin_year and fin_month.
CREATE TABLE ths_sales_det (state_code VARCHAR2(3) NOT NULL ,fin_year NUMBER NOT NULL,fin_month NUMBER NOT NULL,acc_nbr NUMBER NOT NULL
,locn_code NUMBER NOT NULL,subd_code NUMBER NOT NULL,prod_type_id NUMBER NOT NULL ,sales NUMBER,costs NUMBER,qty_bought NUMBER);
And the typical report needs to be sales grouped by month.
Month Sales Last Year Sales
Jul02 100 50
Aug02 90 30
...
INSERT INTO ths_sales_det VALUES ('WA',2003,1,332,2010,123, 0,100, 70,1,1);
INSERT INTO ths_sales_det VALUES ('WA',2003,2,332,2010,123, 0,90, 70,1,1);
INSERT INTO ths_sales_det VALUES ('WA',2002,1,332,2010,123, 0,50, 70,1,1);
INSERT INTO ths_sales_det VALUES ('WA',2002,2,332,2010,123, 0,30, 70,1,1);
However I am unable to get the query to distinguish between previous row and previous year.
Keep in mind that some months may not necessarily have sales, so you can't offset by 12.

This is obviously incorrect...
select sum(sales), fin_year, fin_month,lag(sum(sales)) over ( order by fin_year,fin_month) last_year_sale
from ths_sales_det
group by fin_year,fin_month
SUM(SALES) FIN_YEAR FIN_MONTH LAST_YEAR_SALES
---------- ---------- ---------- ---------------
50 2002 1
30 2002 2 50
100 2003 1 30
90 2003 2 100

Would you please assist?
Thankyou in advance.

and Tom said...

I think a simple pivot is what you want -- just get the sum of sales by mon/year for the years in question and pivot:

ops$tkyte@ORA920> select fin_month, max(decode( fin_year, 2002, sales, null )) fy_2002,
2 max(decode( fin_year, 2003, sales, null )) fy_2003
3 from (
4 select sum(sales) sales, fin_year, fin_month
5 from ths_sales_det
6 group by fin_year, fin_month
7 )
8 group by fin_month
9 /

FIN_MONTH FY_2002 FY_2003
---------- ---------- ----------
1 50 100
2 30 90



Rating

  (21 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

ongoing application

Scott, March 30, 2003 - 8:53 pm UTC

G'day Tom

This will be an ongoing application, so hardcoding the financial years will be of no use.
The parameter set by the user will be financial year.
The data I supplied was only a sample to generate the test case.

I was trying to extend the example found in the documentation and in </code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:2403541386543 <code>

Thankyou for the quick reply.

Tom Kyte
March 30, 2003 - 9:13 pm UTC

can you spell "bind variable"

any literal in any query is "replaceable"

ops$tkyte@ORA920> variable x number
ops$tkyte@ORA920> variable y number
ops$tkyte@ORA920> exec :x := 2002

PL/SQL procedure successfully completed.

ops$tkyte@ORA920> exec :y := 2003

PL/SQL procedure successfully completed.

ops$tkyte@ORA920>
ops$tkyte@ORA920> select fin_month, max(decode( fin_year, :x, sales, null )) fy_last_year,
  2                    max(decode( fin_year, :y, sales, null )) fy_this_one
  3    from (
  4  select sum(sales) sales, fin_year, fin_month
  5    from ths_sales_det
  6   group by fin_year, fin_month
  7         )
  8   group by fin_month
  9  /

 FIN_MONTH FY_LAST_YEAR FY_THIS_ONE
---------- ------------ -----------
         1           50         100
         2           30          90


or, if you want to key off of sysdate:

ops$tkyte@ORA920> select fin_month, max(decode( fin_year, to_char(add_months(sysdate,-12),'yyyy'), sales, null )) fy_last_year,
  2                    max(decode( fin_year, to_char(sysdate,'yyyy'), sales, null )) fy_this_one
  3    from (
  4  select sum(sales) sales, fin_year, fin_month
  5    from ths_sales_det
  6   group by fin_year, fin_month
  7         )
  8   group by fin_month
  9  /

 FIN_MONTH FY_LAST_YEAR FY_THIS_ONE
---------- ------------ -----------
         1           50         100
         2           30          90


or whatever -- use whatever technique you need -- it is trivial...  You will be feeding those values into a predicate somewhere I would presume.

But, using lag and lead -- you need to order NOT BY YEAR, by partition by month and then order by year

ops$tkyte@ORA920> select *
  2    from (
  3  select fin_month, sales, lag(sales) over (partition by fin_month order by fin_year) last_year
  4    from (
  5  select sum(sales) sales, fin_year, fin_month
  6    from ths_sales_det
  7   group by fin_year, fin_month
  8         )
  9         )
 10   where last_year is not null
 11  /

 FIN_MONTH      SALES  LAST_YEAR
---------- ---------- ----------
         1        100         50
         2         90         30



just think about how lag/lead/etc work and partitions/windows and ranges...





 

Thanks.

Scott, March 30, 2003 - 9:57 pm UTC

It's been that sort of Monday morning...

B-Y-N-D V-A-R-E-E-A-B-E-L, no, no, that's not it...

Thanks Tom.

lag is not perfect than pivot

Steve, June 04, 2003 - 12:37 pm UTC

Hi Tom,

In the following answer you gave, what if remove one record ( e.g. fin_year is 2002, fin_month is 2)
then the result is

FIN_MONTH FY_2002 FY_2003
---------- ---------- ----------
1 50 100

instead of
FIN_MONTH FY_2002 FY_2003
---------- ---------- ----------
1 50 100
2 90

Thanks!
Steve



select *
from (
select fin_month, sales, lag(sales) over (partition by fin_month order by
fin_year) last_year
from (
select sum(sales) sales, fin_year, fin_month
from steve_test_sales
group by fin_year, fin_month
)
)
where last_year is not null



Tom Kyte
June 04, 2003 - 12:57 pm UTC

use where fin_year = 2003 (bind it) then instead to get rows for records with no "last year"

lag is not perfect than pivot

Steve, June 04, 2003 - 1:36 pm UTC

Hi Tom,

I am sorry I didn't follow you.
How do I put Zero in the column where there is no data?
e.g.

FIN_MONTH FY_2002 FY_2003
---------- ---------- ----------
1 50 100
2 0 90

Thanks!

Steve

Tom Kyte
June 04, 2003 - 1:37 pm UTC

select fin_month, sale, nvl(last_year,0)
from (
select fin_year, fin_month, sales, lag(sales) over (partition by fin_month order by
fin_year) last_year
from (
select sum(sales) sales, fin_year, fin_month
from steve_test_sales
group by fin_year, fin_month
)
)
where fin_year = 2003

future years

Scott, June 11, 2003 - 9:41 pm UTC

Using
select *
from (
select fin_year,fin_month, sales, lag(sales) over (partition by fin_month order by
fin_year) last_year
from (
select sum(sales) sales, fin_year, fin_month
from ths_sales_det
group by fin_year, fin_month
)
)
Is useful for selecting 2003/2002 data for a clause of fin_year=2003
FIN_YEAR FIN_MONTH SALES LAST_YEAR
---------- ---------- ---------- ----------
2003 1 100 50
2003 2 90 30

And good for selecting 2002/2001 data for clause of fin_year = 2002 (obviously, with our test data, 2001 column will be null.
FIN_YEAR FIN_MONTH SALES LAST_YEAR
---------- ---------- ---------- ----------
2002 1 50
2002 2 30

But if we have a clause of fin_year=2004, even though there is data for 2003 (for the last_year column), the query will return no rows.
Ideally, it would return
FIN_YEAR FIN_MONTH SALES LAST_YEAR
---------- ---------- ---------- ----------
2004 1 100
2004 2 90
How can we combat this?

Can I do this in a SQL

Ajeet, June 20, 2004 - 8:52 am UTC

Tom -- I was not getting a chance to ask a question so I am posting here --it is related with this thread but kind of new query :
I have a table and data Like below :

create table iet_test (fleet varchar2(20) , consumption_rate number(10) ,smooth_rate number(10)) ;
insert into iet_test values ('AAL',1,2) ;
insert into iet_test values ('AAL',10,null) ;
insert into iet_test values ('AAL',30,null) ;
insert into iet_test values ('DAL',6,5) ;
insert into iet_test values ('DAL',300,null) ;
insert into iet_test values ('DAL',40,null) ;
commit ;
SQL> select * from iet_test ;

FLEET                CONSUMPTION_RATE SMOOTH_RATE
-------------------- ---------------- -----------
AAL                                 1           2
AAL                                10
AAL                                30
DAL                                 6           5
DAL                               300
DAL                                40

and I want a SQL to get the output like below :

output which I want is

SQL> select * from iet_test ;

FLEET                CONSUMPTION_RATE SMOOTH_RATE
-------------------- ---------------- -----------
AAL                                 1           2
AAL                                10           2
AAL                                30           2
DAL                                 6           5
DAL                               300           5
DAL                                40           5

--so far I have tried this..does not work ofcourse..so thought to ask you.

select fleet,consumption_rate,
( Lag(smooth_rate) over ( partition by (fleet) order by fleet )) smooth_rate_next
from iet_test
/ 

Any help is always great.

Thanks
Ajeet 

Tom Kyte
June 20, 2004 - 10:26 am UTC

this is easy -- HOWEVER -- you are missing something, or we can make another assumption.


order by fleet would allow the rows to come out in any order inside of a fleet. Therefore, you must have SOMETHING else that orders the rows?

What is it?

Or is there an implied business rule that says "smooth_rate is NULL for all but one row per fleet"?

You see, what if the data were:

AAL 1 2
AAL 10
AAL 30 3


what should be assigned to "AAL 10" -- 2 or 3. without something to order the rows within a fleet, this is ambigous.

More information

Ajeet, June 21, 2004 - 2:15 am UTC

Tom,
Infact the first value in a fleet is calculated using a formaula and it uses first 3 values of consumption rate for that fleet -- (in real life it is first 10 values ) -- there is a field in table flight_datetime which we can use to order by.
Once I get the first value for smooth using above logic (which is 2 in the given example for the fleet AAL ) then I need to calculate second value for the same fleet which can be same as the first value or anothe value (which will again use the first value and the multiply it with a constant value say .5 ) to get the second values..same is true for 3rd value and all subsequent values for that fleet.

so I will give me you complete example --sorry about this.

I did not do it knowingly --this is the way I was trying...

create table iet_test (fleet varchar2(20) , flight_datetime date,consumption_rate number(10) ,smooth_rate number(10)) ;
insert into iet_test values ('AAL',trunc(sysdate) ,1,2) ;
insert into iet_test values ('AAL',trunc(sysdate+1),10,null) ;
insert into iet_test values ('AAL',trunc(sysdate+2),30,null) ;
insert into iet_test values ('DAL',trunc(sysdate),6,5) ;
insert into iet_test values ('DAL',trunc(sysdate+1),300,null) ;
insert into iet_test values ('DAL',trunc(sysdate+3),40,null) ;
commit ;
SQL> select * from iet_test ;

SQL> select * from iet_test ;

FLEET                FLIGHT_DA CONSUMPTION_RATE SMOOTH_RATE
-------------------- --------- ---------------- -----------
AAL                  21-JUN-04                1           2
AAL                  22-JUN-04               10
AAL                  23-JUN-04               30
DAL                  21-JUN-04                6           5
DAL                  22-JUN-04              300
DAL                  24-JUN-04               40

Now the output I want is :

SQL> select * from iet_test ;

FLEET                FLIGHT_DA CONSUMPTION_RATE SMOOTH_RATE
-------------------- --------- ---------------- -----------
AAL                  21-JUN-04                1           2
AAL                  22-JUN-04               10          2
AAL                  23-JUN-04               30           2
DAL                  21-JUN-04                6           5
DAL                  22-JUN-04              300           5
DAL                  24-JUN-04               40           5


Thanks so much for your time and help on this.
Ajeet 

Tom Kyte
June 21, 2004 - 8:20 am UTC

if there is but one value per fleet, a simple:

max(smooth_rate) over ( partition by fleet )

will do it.

What if there's no sales in a partical...

Martin, June 21, 2004 - 8:16 am UTC

..and I still want to show last_year_Sales for that fin_month.

I came up with  the following query which seems te work quit well (using the original test-set):

SQL> select fin_year, fin_month,sum(sales) SALES,lag(sum(sales)) over (partition by fin_month order 
by 
  2  fin_year) last_year_sale
  3  from (select * from ths_sales_det
  4        union all
  5     select STATE_CODE, FIN_YEAR+1, FIN_MONTH, ACC_NBR, LOCN_CODE, SUBD_CODE, PROD_TYPE_ID, null,
null,null from ths_sales_det)
  6  group by fin_year,fin_month;

 FIN_YEAR FIN_MONTH     SALES LAST_YEAR_SALE
--------- --------- --------- --------------
     2002         1        50
     2003         1       100             50
     2004         1                      100
     2002         2        30
     2003         2        90             30
     2004         2                       90

6 rows selected.

SQL> 

This way I make sure there's always  a record for the current year if we had sales in the previous year. However with larger amount of data this migth get a bit slow.

Is this the best solution or do you have a better one?  

Thanks for your time 

Tom Kyte
June 21, 2004 - 9:25 am UTC

I usually generate a set of dates (using all_objects or some other large table) and OUTER JOIN to it.



oops

Martin, June 21, 2004 - 8:21 am UTC

The title should be: What if there's no sales in a particular fin_month...


Can I use Lag --as I would really need to calculate next value

Ajeet, June 21, 2004 - 9:49 am UTC

Tom -

Thanks again for help.
I really want to use Lag somehow here as finally I will need to put more logic and need to calculate the next value in smooth_rate column based on previous value of this column..I was able to do it -- something like this..

SQL> select fleet ,consumption_rate ,flight_datetime,
lag(smooth_rate,rownum-1) over  (  partition by fleet order by flight
 next_smooth from
iet_test
/  2    3    4

FLEET                CONSUMPTION_RATE FLIGHT_DA NEXT_SMOOTH
-------------------- ---------------- --------- -----------
AAL                                 1 21-JUN-04           2
AAL                                10 22-JUN-04           2
AAL                                30 23-JUN-04           2
DAL                                 6 21-JUN-04
DAL                               300 22-JUN-04
DAL                                40 24-JUN-04

but it did not work for new Fleet - -like DAL in above example..I know it is not working because of rownum stuff and i need to put a better logic..so if you  can tell me how to do it using Lag that will be great help...
Max (smooth_rate) does solve my few problem as I have described to you above.Thanks. 

Tom Kyte
June 21, 2004 - 1:50 pm UTC

psuedo code the algorithm, or otherwise specify what it should "do".



Dimensional Data

Martin, June 21, 2004 - 10:49 am UTC

>I usually generate a set of dates (using all_objects or >some other large table) and OUTER JOIN to it.

Ok, I thougth about that but what if i want to group the data by several dimensions? My solution:

SQL> INSERT INTO ths_sales_det VALUES ('NY',2003,1,333,2011,124,0,150,70,1);

1 row created.

SQL> INSERT INTO ths_sales_det VALUES ('NY',2003,2,333,2011,124,0,110,70,1);

1 row created.

SQL> INSERT INTO ths_sales_det VALUES ('NY',2002,1,333,2011,124,0,40,70,1);

1 row created.

SQL> INSERT INTO ths_sales_det VALUES ('NY',2002,2,333,2011,124,0,50,70,1);

1 row created.

SQL>    
SQL> select state_code,fin_year, fin_month,sum(sales) SALES,lag(sum(sales)) over (partition by state
_code,fin_month order by fin_year) last_year_sale
  2  from (select * from ths_sales_det
  3        union all
  4     select STATE_CODE, FIN_YEAR+1, FIN_MONTH, ACC_NBR, LOCN_CODE, SUBD_CODE, PROD_TYPE_ID, null,
null,null from ths_sales_det)
  5  group by state_code,fin_year,fin_month
  6  ;

STATE_COD  FIN_YEAR FIN_MONTH     SALES LAST_YEAR_SALE
--------- --------- --------- --------- --------------
NY             2002         1        40
NY             2003         1       150             40
NY             2004         1                      150
NY             2002         2        50
NY             2003         2       110             50
NY             2004         2                      110
WA             2002         1        50
WA             2003         1       100             50
WA             2004         1                      100
WA             2002         2        30
WA             2003         2        90             30
WA             2004         2                       90

12 rows selected.

SQL> 

I wouldn't know how to achieve this the way you suggested. In my real life case there's about 10 more dimensions (sales channel, product group etc), not only state like in this example.  

Tom Kyte
June 21, 2004 - 1:56 pm UTC

if all you need to add is a single, known row -- your solution is awesome.

In general, you have to cartesian product your "dimensions" with the dates and outer joini to it. (for sparse data). in 10g, we have partitioned outer joins that remove the need for the cartesian product.

I am still trying

Ajeet, June 21, 2004 - 11:55 am UTC

Tom:
For my question above -- can I use Lag -- I am trying to use this SQL:
SQL> select fleet,consumption_rate,
( Lag(smooth_rate,decode(smooth_rate,null,1,0)) over ( partition by (fleet) orde
r by flight_datetime ))
smooth_rate_next
from iet_test
/   2    3    4    5

I get output like this ..what i am doing wrong --Please help..I have been trying since 2 days without any desired results..any help /guiudence will be a big help..I don't want to write a pl/sql code ..as I think anlytics has something which can solve it..I am just not able to get it.

FLEET                CONSUMPTION_RATE SMOOTH_RATE_NEXT
-------------------- ---------------- ----------------
AAL                                 1                2
AAL                                10                2
AAL                                30
DAL                                 6                5
DAL                               300                5
DAL                                40
 

Tom Kyte
June 21, 2004 - 7:52 pm UTC

why are you going with LAG? it isn't the right thing to use given your question.

Too poor

A reader, June 21, 2004 - 1:55 pm UTC

the reader asked lag function and you gave some other answer

Tom Kyte
June 21, 2004 - 8:14 pm UTC

the reader appears to have a screw and is asking for how to use a hammer on it.

Until the reader explains why lag is relevant, i'll keep advising "use a screwdriver for that screw please"



Try the following

Vishal Shah, June 21, 2004 - 4:15 pm UTC

Ajeet, try the following:

select fleet,consumption_rate,flight_datetime,
lag(smooth_rate,grouprow-1) over (partition by fleet order by flight_datetime) smooth_rate_next
from (select fleet ,consumption_rate ,flight_datetime,smooth_rate,
row_number() over ( partition by fleet order by flight_datetime)
grouprow from
iet_test);




A reader, June 21, 2004 - 5:22 pm UTC

Thanks Vishal,

That's the lag function. Not the work around that Tom tried to explain..

Tom Kyte
June 21, 2004 - 8:48 pm UTC

so???????

I'm not getting you (same person who has posted on other pages)....

You tell me (be honest now), which looks "more proper, more efficient, more correct"

select fleet,consumption_rate,flight_datetime,
lag(smooth_rate,grouprow-1) over (partition by fleet order by flight_datetime)
smooth_rate_next
from (select fleet ,consumption_rate ,flight_datetime,smooth_rate,
row_number() over ( partition by fleet order by flight_datetime)
grouprow from
iet_test);

or

max(smooth_rate) over ( partition by fleet )

be honest now, which is "right" given what we all know at this point in time?

Did not mean any offense

A reader, June 21, 2004 - 9:03 pm UTC

Sorry Tom, I did not mean to offend anyone. The other person seemed to be bent on using the lag function for obtaining his results. So that is why I posted the query. Obviously the max(smooth_rate) is a much better solution.

Thanks Vishal

Ajeet, June 22, 2004 - 1:21 am UTC

Vishal -- Thanks alot.

Thanks Tom

Martin, June 22, 2004 - 3:34 am UTC

I'm now pretty much convinced that I did the right thing.
Learned most of my good stuff from your site anyway :)

The people above should learn that it's not about "Look what cool things I do with analytics" but that it is about getting the best possible solution for the given problem. Thats the reason I come to your site every day.

LAG and LEAD can be used for this?

A reader, September 13, 2007 - 5:37 am UTC

Hi

I have a query which returns following results

SELECT
COD_EJER,
ID_COMP,
VAL,
VERSIONING
FROM COMP_CL
WHERE COD_EJER IN (200202, 200203)
AND ID_COMP = 1173
AND KEY = 56

COD_EJER ID_COMP VAL VERSIONING
------------ ---------- ---------- -----------
200202 1173 22000 1
200202 1173 23000 2
200203 1173 30000 1
200203 1173 30000 3

And teh requirement is:

VAL OF MAX(VERSIONING) of 200203 - VAL OF MAX(VERSIONING) of 200202

So it would return folloging

COD_EJER ID_COMP VAL VERSIONING DIFF
------------ ---------- ---------- ----------- -----
200203 1173 30000 3 7000

I know I can do first a MAX(VERSIONING) then filter the max version rows and use lag to subtract but I wonder if there are any better ways such as using windowing?

Iam basically treating quarters (200201, 200202, 200203 200204)

Cheers


Tom Kyte
September 15, 2007 - 7:20 pm UTC

no create
no inserts
no look

and a better explanation of what you mean would be required too - your 'requirement' is not parsable by my brain.

Issue in retrieving the data

Shiva E, March 15, 2010 - 5:35 am UTC

Tom,

Here we have a requirement in which we are able to generate the data @ runtime as shown below:

INTRVL_DT RATE_DT RATE_ID DT_TO DAYS_FROM DAYS_TO BID_RATE OFFER_RATE
3/10/2004 3/9/2004 197813 3/16/2004 1 7 9.36 11.94
3/11/2004
3/12/2004
3/13/2004
3/14/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/15/2004
3/16/2004
3/17/2004
3/18/2004
3/19/2004
3/20/2004

However we need to modify the above SQL result and produce the result as shown below:

INTRVL_DT RATE_DT RATE_ID DT_TO DAYS_FROM DAYS_TO BID_RATE OFFER_RATE
3/10/2004 3/9/2004 197813 3/16/2004 1 7 9.36 11.94
3/11/2004 3/9/2004 197813 3/16/2004 1 7 9.36 11.94
3/12/2004 3/9/2004 197813 3/16/2004 1 7 9.36 11.94
3/13/2004 3/9/2004 197813 3/16/2004 1 7 9.36 11.94
3/14/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/15/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/16/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/17/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/18/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/19/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94
3/20/2004 3/9/2004 197813 3/23/2004 7 14 9.36 11.94

In short null values should be replaced with the previous records non null values. I tried MAX and LOG analytical functions to address this issue but failed to get the required results. I would be able to retrive this data using self join and however I am looking for other alternatives to avoid this self join.

Kindly request you to advice on the same.

Thanks in advance...

Regards
Shiva
Tom Kyte
March 15, 2010 - 11:13 am UTC

ops$tkyte%ORA11GR2> select empno, comm, last_value(comm ignore nulls) over (order by empno) from scott.emp order by empno;

     EMPNO       COMM LAST_VALUE(COMMIGNORENULLS)OVER(ORDERBYEMPNO)
---------- ---------- ---------------------------------------------
      7369
      7499        300                                           300
      7521        500                                           500
      7566                                                      500
      7654       1400                                          1400
      7698                                                     1400
      7782                                                     1400
      7788                                                     1400
      7839                                                     1400
      7844          0                                             0
      7876                                                        0
      7900                                                        0
      7902                                                        0
      7934                                                        0

14 rows selected.



last value can do that for you

Issue in retrieving the data

Shiva, March 17, 2010 - 1:44 am UTC

Tons of thanks Tom.

Now my query response time was drastically reduced from 8 hrs to 3.5 hrs....

Its a tremondous improvement :)

Once again thanks Tom.

Best Regards
Shiva

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.