Toby, February 27, 2008 - 6:01 pm UTC
Thanks Tom.
B is unique C is not.
I am tinkering with lag and lead but can't quite get what I need
I do want to the whole result set on a single row please
February 27, 2008 - 7:25 pm UTC
define please what you expect when c is not unique.
eg: BE SPECIFIC AND VERY CLEAR for every case.
A reader, February 27, 2008 - 7:55 pm UTC
create table t (mykey int, the_date date, drink_sales int, food_sales int,
petrol_sales int,tyre_sales int);
insert into t values (1,to_date('20-feb-2008'), 100, 120,100,0);
insert into t values (2,to_date('21-feb-2008'), 150, 120,100,0);
insert into t values (3,to_date('22-feb-2008'), 90, 120,140,10);
insert into t values (4,to_date('23-feb-2008'), 200, 200,110,60);
insert into t values (5,to_date('24-feb-2008'), 200, 180,110,60);
insert into t values (6,to_date('25-feb-2008'), 120, 250,110,120);
insert into t values (7,to_date('26-feb-2008'), 100, 100,300,60);
column mykey is the primary key
column the_date is unique
remaining columns are not I am not concerned if these columns are unique
or not just there relationship to the values of the preceding day.
I would like to do something like this
select all the columns for a day where
the drink_sales(today) are less then the drink_sales(yesterday)
and the food_sales(today) are greater then food_sales(yesterday)
and the food_sales(yesterday) are less then food_sale(the day before that)
Did I get this clearer?
February 28, 2008 - 3:28 pm UTC
ops$tkyte%ORA10GR2> select *
2 from (
3 select t.*,
4 case when lag(the_date) over (order by the_date) = the_date-1 then lag(drink_sales) over (order by the_date) end yesterdays_drink,
5 case when lag(the_date) over (order by the_date) = the_date-1 then lag(food_sales) over (order by the_date) end yesterdays_food, 6 case when lag(the_date,2) over (order by the_date) = the_date-2 then lag(food_sales,2) over (order by the_date) end day_before_yesterdays_food
7 from t
8 )
9 where drink_sales < yesterdays_drink
10 and food_sales > yesterdays_food
11 and yesterdays_food < day_before_yesterdays_food
12 /
MYKEY THE_DATE DRINK_SALES FOOD_SALES PETROL_SALES TYRE_SALES YESTERDAYS_DRINK YESTERDAYS_FOOD DAY_BEFORE_YESTERDAYS_FOOD
---------- --------- ----------- ---------- ------------ ---------- ---------------- --------------- --------------------------
6 25-FEB-08 120 250 110 120 200 180 200
specific enough
Duke Ganote, February 27, 2008 - 10:25 pm UTC
I'm sure Tom's will be slicker:
create table t
( mykey int primary key
, the_date date unique
, drink_sales int, food_sales int
, petrol_sales int,tyre_sales int)
;
insert into t values (1,to_date('20-feb-2008','dd-mon-yyyy'), 100, 120,100,0);
insert into t values (2,to_date('21-feb-2008','dd-mon-yyyy'), 150, 120,100,0);
insert into t values (3,to_date('22-feb-2008','dd-mon-yyyy'), 90, 120,140,10);
insert into t values (4,to_date('23-feb-2008','dd-mon-yyyy'), 200, 200,110,60);
insert into t values (5,to_date('24-feb-2008','dd-mon-yyyy'), 200, 180,110,60);
insert into t values (6,to_date('25-feb-2008','dd-mon-yyyy'), 120, 250,110,120);
insert into t values (7,to_date('26-feb-2008','dd-mon-yyyy'), 100, 100,300,60);
SELECT today, "food(today)", "drink(today)"
, "drink(dayB4)"
, "food(dayB4)"
, "food(2dayB4)"
FROM
(
select the_date today
, food_sales "food(today)"
, drink_sales "drink(today)"
, first_value(drink_sales) OVER (order by the_date rows 1 preceding )
"drink(dayB4)"
, first_value(food_sales) OVER (order by the_date rows 1 preceding)
"food(dayB4)"
, first_value(food_sales) OVER (order by the_date rows 2 preceding)
"food(2dayB4)"
, row_number() OVER (order by the_date) r#
from t
order by the_date DESC
)
WHERE r# > 2
AND "drink(today)" < "drink(dayB4)"
AND "food(today)" > "food(dayB4)"
AND "food(dayB4)" < "food(2dayB4)"
/
TODAY food(today) drink(today) drink(dayB4) food(dayB4) food(2dayB4)
---------- ----------- ------------ ------------ ----------- ------------
2008-02-25 250 120 200 180 200
February 28, 2008 - 3:30 pm UTC
what about non-contiguous dates.....
lag(column,offset) with the case to check the dates...
Looks good
Toby, February 27, 2008 - 11:30 pm UTC
Hey thanks Duke!
Looks very good, I haven't used first_value before.
Can ask what the row number > 2 doing in the query?
I got lag and lead working.
I have used lag to get the the date, yesterday and the day before yesterdays date.
Thanks again for your answer.
Still interested in seeing your solution Tom.
February 28, 2008 - 3:31 pm UTC
see above
on FIRST_VALUE for 'Knight Move' and ROW_NUMBER
Duke Ganote, February 28, 2008 - 5:59 am UTC
FIRST_VALUE may be my favorite analytic function. Here it allows me to make a 'knight move': count 2 rows ordered by date, then get the value of food_sales. (Anthony Molinaro, in his book "SQL Cookbook", refers to 'knight values', but I prefer the Bob Seger allusion. :)
ROW_NUMBER covers the boundary condition. I explicitly excluded the first/earliest two rows because they couldn't possibly satisfy the condition "food_sales(yesterday) are less then food_sale(the day before that)". Row 1 has no known "food_sales(yesterday)" and row 2 has no known food_sale "food_sale(the day before that)".
February 28, 2008 - 3:46 pm UTC
what about non-contiguous dates....
re: FIRST_VALUE and non-contiguous dates.....
Duke Ganote, February 28, 2008 - 5:55 pm UTC
Now you made me review p.553-555 of the ultimate Oracle book (your EO1O1, of course) and learn something new...
You mean like this?
SELECT today -- REVISED QUERY AGAINST ORIGINAL DATA
, "drink(today)"
, "drink(dayB4)"
, "food(today)"
, "food(dayB4)"
, "food(2dayB4)"
FROM
(
select the_date today
, food_sales "food(today)"
, drink_sales "drink(today)"
, FIRST_VALUE(drink_sales) OVER
(order by the_date
range between 1 preceding and 1 preceding)
"drink(dayB4)"
, FIRST_VALUE(food_sales) OVER
(order by the_date
range between 1 preceding and 1 preceding)
"food(dayB4)"
, FIRST_VALUE(food_sales) OVER
(order by the_date
range between 2 preceding and 2 preceding)
"food(2dayB4)"
from t
order by the_date DESC
)
WHERE "drink(today)" < "drink(dayB4)"
AND "food(today)" > "food(dayB4)"
AND "food(dayB4)" < "food(2dayB4)"
/
TODAY drink(today) drink(dayB4) food(today) food(dayB4) food(2dayB4)
---------- ------------ ------------ ----------- ----------- ------------
2008-02-25 120 200 250 180 200
update t set the_date = the_date + mykey
/
select * from t -- NON-CONTIGUOUS DATES
/
MYKEY THE_DATE DRINK_SALES FOOD_SALES PETROL_SALES TYRE_SALES
---------- ---------- ----------- ---------- ------------ ----------
1 2008-02-21 100 120 100 0
2 2008-02-23 150 120 100 0
3 2008-02-25 90 120 140 10
4 2008-02-27 200 200 110 60
5 2008-02-29 200 180 110 60
6 2008-03-02 120 250 110 120
7 2008-03-04 100 100 300 60
7 rows selected.
SELECT today
, "drink(today)"
, "drink(dayB4)"
, "food(today)"
, "food(dayB4)"
, "food(2dayB4)"
FROM
(
select the_date today
, food_sales "food(today)"
, drink_sales "drink(today)"
, FIRST_VALUE(drink_sales) OVER
(order by the_date
range between 1 preceding and 1 preceding)
"drink(dayB4)"
, FIRST_VALUE(food_sales) OVER
(order by the_date
range between 1 preceding and 1 preceding)
"food(dayB4)"
, FIRST_VALUE(food_sales) OVER
(order by the_date
range between 2 preceding and 2 preceding)
"food(2dayB4)"
from t
order by the_date DESC
)
WHERE "drink(today)" < "drink(dayB4)"
AND "food(today)" > "food(dayB4)"
AND "food(dayB4)" < "food(2dayB4)"
/
no rows selected
LAG vs FIRST_VALUE. Subtle, no?
Duke Ganote, February 28, 2008 - 7:21 pm UTC
There's a subtle difference between the inner queries, which I noticed when testing with the non-contiguous data.
The LAG(...,2) query doesn't show show food sales from two days prior on non-contiguous data because LAG(...,2) is a ROW offset (p. 564 of EO1O1):
select the_date,
food_sales "food(today)",
case when lag(the_date ,2) over (order by the_date) = the_date-2
then lag(food_sales,2) over (order by the_date)
end "food(2dayB4)"
from t
order by the_date
/
THE_DATE food(today) food(2dayB4)
---------- ----------- ------------
2008-02-21 120
2008-02-23 120
2008-02-25 120
2008-02-27 200
2008-02-29 180
2008-03-02 250
2008-03-04 100
7 rows selected.
This query shows sales from two days prior because of the windowing RANGE clause:
select the_date today
, food_sales "food(today)"
, FIRST_VALUE(food_sales) OVER
(order by the_date
range between 2 preceding and 2 preceding)
"food(2dayB4)"
from t
order by the_date
/
TODAY food(today) food(2dayB4)
---------- ----------- ------------
2008-02-21 120
2008-02-23 120 120
2008-02-25 120 120
2008-02-27 200 120
2008-02-29 180 200
2008-03-02 250 180
2008-03-04 100 250
7 rows selected.
Actually, it shows the same result with FIRST_VALUE, MIN, MAX, SUM, etc.
Learned alot
Toby, February 29, 2008 - 8:45 am UTC
Thanks Tom and Duke
I learned an awful lot from your posts
Learned a lot
Duke Ganote, February 29, 2008 - 9:45 am UTC
Me too! Analytic functions offer lots of power. However, I tend to only use what I originally needed, learned (and still regularly use).
Until Tom "pushed back", I'd never looked at the row offset parameter on LAG, or used a windowing clause. I regularly use PARTITIONING and ORDER BY clauses, sure. But never a windowing clause like this:
RANGE BETWEEN 2 PRECEDING AND 2 PRECEDING
to capture a value "the day before yesterday", that is, 2 days ago (in calendar days, not business days).
I toyed with it further by compressing the time slices to half-daily:
update t set the_date = the_date-mykey*.5;
7 rows updated.
select * from t;
MYKEY THE_DATE DRINK_SALES FOOD_SALES PETROL_SALES TYRE_SALES
---------- ------------------- ----------- ---------- ------------ ----------
1 2008-02-19 12:00:00 100 120 100 0
2 2008-02-20 00:00:00 150 120 100 0
3 2008-02-20 12:00:00 90 120 140 10
4 2008-02-21 00:00:00 200 200 110 60
5 2008-02-21 12:00:00 200 180 110 60
6 2008-02-22 00:00:00 120 250 110 120
7 2008-02-22 12:00:00 100 100 300 60
7 rows selected.
and got the sum of yesterday's and the day-before-yesterday's sales like this:
select the_date this_datetime
, food_sales "food(this_datetime)"
, sum(food_sales) OVER
(order by trunc(the_date)
range between 1 preceding and 1 preceding)
"food(dayB4)"
, sum(food_sales) OVER
(order by trunc(the_date)
range between 2 preceding and 2 preceding)
"food(2dayB4)"
from t
order by the_date
/
THIS_DATETIME food(this_datetime) food(dayB4) food(2dayB4)
------------------- ------------------- ----------- ------------
2008-02-19 12:00:00 120
2008-02-20 00:00:00 120 120
2008-02-20 12:00:00 120 120
2008-02-21 00:00:00 200 240 120
2008-02-21 12:00:00 180 240 120
2008-02-22 00:00:00 250 380 240
2008-02-22 12:00:00 100 380 240
7 rows selected.
ah, the possibilities...
How about if the column is a computed value?
Chuck jolley, February 29, 2008 - 12:15 pm UTC
As an example, would it be possible to create a factorial table by referencing prior calculations?
eg
1 * 1 = 1
2 * 1 = 2
3 * 2 = 6
4 * 6 = 24
5 * 24 = 120
etc...
Something along the lines of
select level,
level * nvl(lag(factorial) over (order by level), level) factorial
from dual connect by level <= 5
Except, of course, that that won't compile because of the self reverencing computed column.
But, is there a way it can be done?
March 01, 2008 - 11:59 am UTC
ops$tkyte%ORA10GR2> with data as
2 (select level l from dual connect by level <= 10)
3 select l, exp( sum( ln(l) ) over (order by l) ) fact
4 from data
5 /
L FACT
---------- ----------
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10 3628800
10 rows selected.
RE: factorial table by referencing prior calculations?
Duke Ganote, February 29, 2008 - 7:20 pm UTC
10g's MODEL clause makes it easy:
SELECT * FROM
(
SELECT level r#
FROM dual
CONNECT BY LEVEL <= 5
) x
MODEL
DIMENSION BY ( r# )
MEASURES ( CAST(NULL AS NUMBER) factorial )
RULES AUTOMATIC ORDER
( factorial[r#] = cv(r#)* CASE cv(r#)
WHEN 1 THEN 1
ELSE factorial[cv()-1]
END
)
/
R# FACTORIAL
---------- ----------
1 1
2 2
3 6
4 24
5 120
A whole lesson in one statement
Chuck Jolley, March 01, 2008 - 8:52 am UTC
That is very interesting, thanks.
It's going to take me a bit to pick it apart and understand the general case.
If the factorial example was too easy, how about a Pascal's Triangle. ;)
Chuck Jolley, March 01, 2008 - 4:19 pm UTC
Tom,
I knew about the factorial formula.
I really just wondered if there was a way to reference prior calculated rows and used factorials as an example for simplicity.
Still, interesting stuff!
Chuck Jolley, March 01, 2008 - 4:21 pm UTC
I should have followed your link before I replied :D
I guess I didn't have anything else to do for the next week but wade through that.
pascal in sql
A reader, March 02, 2008 - 11:18 am UTC
RE: pick it apart and understand the general case
Duke Ganote, March 03, 2008 - 5:19 am UTC
Difference between Lag/Lead and range/row preceding
Raghu, September 13, 2012 - 4:34 pm UTC
Tom,
The question might sound noobish, but is there a difference between
LAG(SUM(salary), 1) OVER (ORDER BY hiredate)
and
sum(salary) over (order by hiredate rows/range 1 preceding)?
Thanks in advance
September 14, 2012 - 6:53 pm UTC
if you just want the preceding salary, please just use
lag(salary) over (order by hiredate)
why are you confusing everything with sum()s and such.
use the one that makes sense
lag(salary) over (order by hiredate)
Selecting previous rows
Raghu, September 18, 2012 - 1:42 pm UTC
Thanks Tom.
I've a table
create table t (attribute_name varchar2(50), attribute_length number
);
insert into t values ('ATTRIBUTE_1',10);
insert into t values ('ATTRIBUTE_2',10);
insert into t values ('ATTRIBUTE_2',20);
insert into t values ('ATTRIBUTE_3',30);
insert into t values ('ATTRIBUTE_4',10);
insert into t values ('ATTRIBUTE_4',40);
insert into t values ('ATTRIBUTE_5',50);
insert into t values ('ATTRIBUTE_6',60);
insert into t values ('ATTRIBUTE_7',70);
insert into t values ('ATTRIBUTE_8',80);
insert into t values ('ATTRIBUTE_8',80);
commit;
/
Now,
select attribute_name, attribute_length,
row_number () over (partition by attribute_name order by attribute_name) rn
from t;
would give me
ATTRIBUTE_NAME ATTRIBUTE_LENGTH RN
ATTRIBUTE_1 10 1
ATTRIBUTE_2 20 1
ATTRIBUTE_2 10 2
ATTRIBUTE_3 30 1
ATTRIBUTE_4 10 1
ATTRIBUTE_4 40 2
ATTRIBUTE_5 50 1
ATTRIBUTE_6 60 1
ATTRIBUTE_7 70 1
ATTRIBUTE_8 80 1
ATTRIBUTE_8 80 2
Is there a way to select only attributes 2,4 and not 8 (since the difference in lengths are only in 2 and 4)
I'm sure analytics is the way to go but uncertain which function to use.
Thanks
September 20, 2012 - 5:26 am UTC
ops$tkyte%ORA11GR2> select attribute_name, attribute_length,
2 row_number() over (partition by attribute_name order by attribute_length) rn
3 from (
4 select attribute_name, attribute_length,
5 count(distinct attribute_length) over (partition by attribute_name ) cnt
6 from t
7 )
8 where cnt > 1
9 order by attribute_name, attribute_length
10 /
ATTRIBUTE_NAME ATTRIBUTE_LENGTH RN
-------------------------------------------------- ---------------- ----------
ATTRIBUTE_2 10 1
ATTRIBUTE_2 20 2
ATTRIBUTE_4 10 1
ATTRIBUTE_4 40 2
4 rows selected.
Rephrasing the question
Raghu, September 18, 2012 - 1:47 pm UTC
I need both the rows of attributes 2 and 4
i.e. result should return,
ATTRIBUTE_NAME ATTRIBUTE_LENGTH RN
ATTRIBUTE_2 20 1
ATTRIBUTE_2 10 2
ATTRIBUTE_4 10 1
ATTRIBUTE_4 40 2
only.
Thanks a million.