Excellent!!!
A reader, October 07, 2003 - 5:05 am UTC
Hi Tom,
'been thinking about writing a book just about analytics' ... please make this book available soon and I am sure it will be yet another gift from you to the Oracle world :)
Wow!!
Michael T, October 07, 2003 - 7:00 am UTC
This is exactly what I needed! Analytics do rock! I just
need to understand them better. If you do decide to write a
book on analytics, it would be at the top of my must have
list. Thanks again!!!
Small correction
Michael T, October 07, 2003 - 7:33 am UTC
After looking at it a little closer it looks like there is
one small error. The start date for the first MACH1 entry
should be the close date of the prior different station. In
this case 07/01/2003. However, by making some small changes
to your query I can get the results I want.
SELECT order#,
       station,
       lag(close_date) over (partition by order# order by close_date) start_date,
       close_date
  FROM (SELECT order#,
               station,
               close_date
          FROM (SELECT order#,
                       lag(station) over (partition by order# order by close_date) lag_station,
                       lead(station) over (partition by order# order by close_date) lead_station,
                       station,
                       close_date
                  FROM t)
         WHERE lead_station <> station
            OR lead_station is null
            OR lag_station is null)
There might be an easier way to construct this query, but
it works great for me. Thanks a lot for your help!
October 07, 2003 - 8:25 am UTC
sorry about that -- you are right -- when we have "a pair", we want to use lag/lead again to get and keep the right dates.
So, we want to keep rows that are:
a) the first row in the partition "where lag_station is null"
b) the last row in the partition "where lead_station is null"
c) the first of a possible pair "where lag_station <> station"
d) the second of a possible pair "where lead_station <> station"
This query does that:
ops$tkyte@ORA920> select order#,
2 station,
3 lag_close_date,
4 close_date,
5 decode( lead_station, station, 1, 0 ) first_of_pair,
6 decode( lag_station, station, 1, 0 ) second_of_pair
7 from (
8 select order#,
9 lag(station) over (partition by order# order by close_date)
10 lag_station,
11 lead(station) over (partition by order# order by close_date)
12 lead_station,
13 station,
14 close_date,
15 lag(close_date) over (partition by order# order by close_date)
16 lag_close_date,
17 lead(close_date) over (partition by order# order by close_date)
18 lead_close_date
19 from t
20 )
21 where lag_station is null
22 or lead_station is null
23 or lead_station <> station
24 or lag_station <> station
25 /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
12345 RECV 07/01/2003 0 0
12345 MACH1 07/01/2003 07/02/2003 1 0
12345 MACH1 07/05/2003 07/11/2003 0 1
12345 INSP1 07/11/2003 07/12/2003 0 0
12345 MACH1 07/12/2003 07/16/2003 0 0
12345 MACH2 07/16/2003 07/30/2003 0 0
12345 STOCK 07/30/2003 08/01/2003 0 0
7 rows selected.
<b>we can see with the 1's the first/second of a pair in there. All we need to do now is "reach forward" for the first of a pair and grab the close date from the next record:</b>
ops$tkyte@ORA920> select order#,
2 station,
3 lag_close_date,
4 close_date
5 from (
6 select order#,
7 station,
8 lag_close_date,
9 decode( lead_station,
10 station,
11 lead(close_date) over (partition by order# order by close_date),
12 close_date ) close_date,
13 decode( lead_station, station, 1, 0 ) first_of_pair,
14 decode( lag_station, station, 1, 0 ) second_of_pair
15 from (
16 select order#,
17 lag(station) over (partition by order# order by close_date)
18 lag_station,
19 lead(station) over (partition by order# order by close_date)
20 lead_station,
21 station,
22 close_date,
23 lag(close_date) over (partition by order# order by close_date)
24 lag_close_date,
25 lead(close_date) over (partition by order# order by close_date)
26 lead_close_date
27 from t
28 )
29 where lag_station is null
30 or lead_station is null
31 or lead_station <> station
32 or lag_station <> station
33 )
34 where second_of_pair <> 1
35 /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
12345 RECV 07/01/2003
12345 MACH1 07/01/2003 07/11/2003
12345 INSP1 07/11/2003 07/12/2003
12345 MACH1 07/12/2003 07/16/2003
12345 MACH2 07/16/2003 07/30/2003
12345 STOCK 07/30/2003 08/01/2003
6 rows selected.
<b>and discard the second-of-pair rows</b>
That is another way to do it (and an insight into how I develop analytic queries -- adding extra columns like that just to see visually what I want to do)
Another good book on the list -- please go ahead on this one too
Vijay Sehgal, October 07, 2003 - 8:58 am UTC
Best Regards,
Vijay Sehgal
Very useful
Michael T., October 07, 2003 - 12:05 pm UTC
Excellent, as always!
Can we reach to the end of the group?
Steve, December 15, 2003 - 11:28 am UTC
For example, say our analytic query returns the following result:
master_record sub_record nxt_record
95845433 25860032 95118740
95118740 25860032 95837497
95837497 25860032
What I'd like to do is grab the final master_record, 95837497, and have that populated in the final column. There could be 2, 3 or more in each group.
December 15, 2003 - 3:45 pm UTC
so the nxt_record of the last record should be the master_record of that row?
then just select
nvl( lead(master_record) over (....), master_record ) nxt_record
when the lead is NULL, return the master_record of the current row
Almost....
Steve, December 15, 2003 - 5:52 pm UTC
but I didn't explain it well enough. What I'd like to see is a result set that looks like:
master_record sub_record nxt_record
95845433 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
The data comes from this:
table activity
cllocn moddate
25860032 18/06/2003
95118740 26/08/2003
95837497 15/12/2003
95845433 19/08/2003
table ext_dedupe
master_cllocn dupe_cllocn
25860032 95118740
25860032 95837497
25860032 95845433
My query is:
select *
  from ( select master_record, sub_record,
                lead(master_record) over (partition by sub_record order by lst_activity asc) nxt_activity
           from ( select *
                    from ( select case when dupelast_ackdate > last_ackdate then dupe_cllocn
                                       when last_ackdate > dupelast_ackdate then master_cllocn
                                       else master_cllocn
                                  end master_record,
                                  greatest(last_ackdate, dupelast_ackdate) lst_activity,
                                  case when dupelast_ackdate > last_ackdate then master_cllocn
                                       when last_ackdate > dupelast_ackdate then dupe_cllocn
                                       else dupe_cllocn
                                  end sub_record
                             from ( select master_cllocn,
                                           (select max(moddate) from activity a where a.cllocn = ed.master_cllocn) last_ackdate,
                                           dupe_cllocn,
                                           (select max(moddate) from activity a where a.cllocn = ed.dupe_cllocn) dupelast_ackdate
                                      from ext_dedupe ed ) ) ) )
Am I on the right track or is there a simpler way to this?
Thanks
December 16, 2003 - 6:50 am UTC
can you explain in "just text" how you got from your inputs to your outputs.
it is not clear (and i didn't feel like parsing that sql to reverse engineer what it does)
Is this what you are looking for ?
Venkat, December 15, 2003 - 6:44 pm UTC
select master, sub, moddate
, min(master) keep (dense_rank first order by moddate) over (partition by sub) first_in_list
, max(master) keep (dense_rank last order by moddate) over (partition by sub) last_in_list
from (select master, sub, moddate from (
select 95845433 master, 25860032 sub, to_date('19-aug-03','dd/mon/yy') moddate from dual union all
select 95118740, 25860032, to_date('26-aug-03','dd/mon/yy') from dual union all
select 95837497, 25860032, to_date('15-dec-03','dd/mon/yy') from dual))
MASTER SUB MODDATE FIRST_IN_LIST LAST_IN_LIST
95845433 25860032 8/19/2003 95845433 95837497
95118740 25860032 8/26/2003 95845433 95837497
95837497 25860032 12/15/2003 95845433 95837497
Tom's Book
umesh, December 16, 2003 - 4:13 am UTC
Tom
Do not announce it until you are finished with the book .. when you talk of a book .. we can't wait until we have it here
An analytics book .. that must be real good
Is it possible to get the same result in standard edition ?
Ninoslav, December 16, 2003 - 4:21 am UTC
Hi Tom,
yes, analytic functions are great. However, we can use them only in the enterprise edition of the database. We have a few small customers that want only a standard edition.
So, is it possible in this question to get the same result without analytic functions?
It would be nice to have some kind of mapping between analytics and 'standard' queries. But, that is probably impossible...
December 16, 2003 - 7:27 am UTC
Oracle 9iR2 and up -- analytics are a feature of standard edition.
there are things you can do in analytics that are quite simply NOT PRACTICAL in any sense without them.
ok
Steve, December 16, 2003 - 8:41 am UTC
I have two tables - activity and ext_dedupe.
table activity
cllocn moddate
25860032 18/06/2003
95118740 26/08/2003
95837497 15/12/2003
95845433 19/08/2003
table ext_dedupe
master_cllocn dupe_cllocn
25860032 95118740
25860032 95837497
25860032 95845433
Ext_dedupe is a table created by a third party app which has identified duplicate records within our database. The first column is supposed to be the master and the second the duplicate. The idea is to mark as archived all our duplicate records with a pointer to the master. Notwithstanding the order of the columns, what we want to do is find out which record has the most recent activity (from the activity table) and archive off the others.
So, in this example although the master is listed as 25860032 against the other 3, an examination of the activity dates means I want to keep 95837497, mark the others as archived, and have a pointer on each of them to 95837497. That's why I thought if I could get to the following result it would make it simpler.
master_record sub_record nxt_record
95845433 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
Hope that makes sense!
December 16, 2003 - 11:33 am UTC
oh, then nxt_record is just
last_value(master_record) over (partition by sub_record order by moddate)
Why...
Steve, December 16, 2003 - 1:31 pm UTC
it didn't work for me. I had to change it to
first_value(master_record) over (partition by sub_record order by moddate desc)
Is there a reason for that?
December 16, 2003 - 2:00 pm UTC
doh, the default window clause is "range between unbounded preceding and current row"
i would have needed a window clause that looks forward rather than backward (reason #1 why I should always set up a test case instead of just answering on the fly)
your solution of reversing the data works just fine.
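For reference, a sketch of the forward-looking window (using the activity and ext_dedupe tables posted above; the window clause is the point, everything else is assumption):
select cllocn master_record,
       master_cllocn sub_record,
       last_value(cllocn) over
         ( partition by master_cllocn
           order by moddate
           rows between unbounded preceding and unbounded following ) nxt_record
  from activity, ext_dedupe
 where cllocn = dupe_cllocn
/
with the window spanning the whole partition, every row sees the cllocn having the latest moddate -- no need to reverse the sort.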
Another solution
A reader, December 16, 2003 - 4:03 pm UTC
The following gives the same result ...
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
from activity, ext_dedupe where cllocn = dupe_cllocn
MASTER_RECORD SUB_RECORD NXT_RECORD
95118740 25860032 95837497
95837497 25860032 95837497
95845433 25860032 95837497
December 16, 2003 - 5:44 pm UTC
yes, there are many many ways to do this.
first_value
last_value
substring of max() without keep
sure.
A reader, December 16, 2003 - 4:15 pm UTC
Actually the nvl(master_cllocn...) is required only if you need all 4 rows in the output as follows (there is an outer join involved). If you need only the 3 rows as shown in the above post, there is no need for the nvl's....
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
, last_value(cllocn) over (partition by nvl(master_cllocn,cllocn) order by moddate) nxt
from activity, ext_dedupe where cllocn = dupe_cllocn (+)
MASTER_RECORD SUB_RECORD NXT_RECORD
25860032 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
95845433 25860032 95837497
still q's on analytics
A reader, January 30, 2004 - 10:13 am UTC
Okay, so my web application logs "web transaction" statistics to a table. This actually amounts to 0 to many database transactions... but anyway.. I need to summarize (sum, min, max, count, average) each day's transaction times for each class (name2) and action (name3) and ultimately "archive" this data to a history table. I am running 8.1.7 and am pretty new to analytics.
My table looks like this:
SQL> desc tran_stats
Name Null? Type
----------------------- -------- ----------------
ID NOT NULL NUMBER(9)
NAME1 VARCHAR2(100)
NAME2 VARCHAR2(100)
NAME3 VARCHAR2(100)
NAME4 VARCHAR2(100)
SEC NOT NULL NUMBER(9,3)
TS_CR NOT NULL DATE
ID NAME1 NAME2 NAME3 SEC NAME4 TS_CR
---------- ----- ------------------------- ---------- ------ ----- ---------
35947 /CM01_PersonManagement CREATE .484 15-JAN-04
35987 /CM01_PersonManagement CREATE .031 15-JAN-04
36086 /CM01_PersonManagement EDIT .312 16-JAN-04
36555 /CM01_PersonManagement CREATE .297 19-JAN-04
36623 /CM01_PersonManagement EDIT .375 19-JAN-04
36627 /CM01_PersonManagement CREATE .047 19-JAN-04
36756 /CM01_AddressManagement CREATE .375 20-JAN-04
36766 /CM01_AddressManagement CREATE .305 20-JAN-04
36757 /CM01_AddressManagement INSERT .391 20-JAN-04
37178 /CM01_PersonManagement EDIT .203 20-JAN-04
and I need output like this:
TS_CR NAME2 NAME3 M_SUM M_MIN M_MAX M_COUNT M_AVG
--------- ------------------------- ---------- ------ ------ ------ ------- ------
20-JAN-04 /CM01_AddressManagement CREATE .680 .305 .375 2 .340
20-JAN-04 /CM01_AddressManagement INSERT .391 .391 .391 1 .391
20-JAN-04 /CM01_PersonManagement EDIT .203 .203 .203 1 .203
19-JAN-04 /CM01_PersonManagement CREATE .344 .047 .297 2 .172
19-JAN-04 /CM01_PersonManagement EDIT .375 .375 .375 1 .375
16-JAN-04 /CM01_PersonManagement EDIT .312 .312 .312 1 .312
15-JAN-04 /CM01_PersonManagement CREATE .515 .031 .484 2 .258
This seems to work, but there has to be a better/cleaner/more efficient way to do this:
select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
from (
select trunc(ts_cr) ts_cr,id, name2, name3, sum(sec) m_dummy
, min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min
, max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max
, round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg
, count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count
, sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
from tran_stats group by name2, name3,trunc(ts_cr),id
)n order by 1 desc, 2, 3;
Any help or pointers would be appreciated. Thanks in advance.
January 30, 2004 - 10:31 am UTC
why does there "have to be"?
what is "unclean" about this? I could make it more verbose (and perhaps more readable) but this does exactly what you ask for?
It seems pretty "good", very "clean" and probably the most efficient method to get this result?
Regarding the previous post ...
A reader, January 30, 2004 - 11:45 am UTC
Am I missing something or will the following do the same ..
select trunc(ts_cr) ts_cr, name2, name3,
count(*) m_count, min(sec) m_min, max(sec) m_max,
sum(sec) m_sum, avg(sec) m_avg
from tran_stats
group by trunc(ts_cr), name2, name3
order by 1 desc, 2, 3
January 30, 2004 - 7:43 pm UTC
with the supplied data -- since "group by trunc(ts_cr), name2, name3" happened to be unique
yes.
In general -- no. consider:
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
2 from ( select trunc(ts_cr) ts_cr,
3 id,
4 name2,
5 name3,
6 sum(sec) m_dummy ,
7 min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
8 max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
9 round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
10 count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
11 sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
12 from tran_stats
13 group by name2, name3,trunc(ts_cr),id
14 )n
15 MINUS
16 select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
17 from (
18 select trunc(ts_cr) ts_cr, name2, name3,
19 count(*) m_count, min(sec) m_min, max(sec) m_max,
20 sum(sec) m_sum, avg(sec) m_avg
21 from tran_stats
22 group by trunc(ts_cr), name2, name3 )
23 /
no rows selected
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tran_stats
2 select 35947,'/CM01_PersonManagement','CREATE', .484 ,'15-JAN-04'
3 from all_users where rownum <= 5;
5 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
2 from ( select trunc(ts_cr) ts_cr,
3 id,
4 name2,
5 name3,
6 sum(sec) m_dummy ,
7 min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
8 max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
9 round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
10 count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
11 sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
12 from tran_stats
13 group by name2, name3,trunc(ts_cr),id
14 )n
15 MINUS
16 select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
17 from (
18 select trunc(ts_cr) ts_cr, name2, name3,
19 count(*) m_count, min(sec) m_min, max(sec) m_max,
20 sum(sec) m_sum, avg(sec) m_avg
21 from tran_stats
22 group by trunc(ts_cr), name2, name3 )
23 /
TS_CR NAME2 NAME3 M_SUM M_MIN M_MAX M_COUNT M_AVG
--------- ----------------------- -------- ---------- ---------- ---------- ---------- ----------
15-JAN-04 /CM01_PersonManagement CREATE 2.935 .031 2.904 2 1.4675
add more data and it won't be the same.
OK
Siva, January 31, 2004 - 9:05 am UTC
Dear Tom,
Can analytics be used for the following formats of the same query?
sql>select ename,nvl(ename,'Name is null') from emp
sql>select ename,decode(ename,null,'Name is null',ename)
from emp
If you know other ways, please let me know
Bye!
January 31, 2004 - 10:03 am UTC
umm, why ?
with analytics
A reader, February 18, 2004 - 7:30 am UTC
with the following data
-- ------
1 val1_1
1 val1_2
1 val1_3
2 val2_1
2 val2_2
can i produce
-- ------ --------------------
1 val1_1 val1_1,val1_2,val1_3
1 val1_2 val1_1,val1_2,val1_3
1 val1_3 val1_1,val1_2,val1_3
2 val2_1 val2_1,val2_2
2 val2_2 val2_1,val2_2
with an analytic that rocks
February 18, 2004 - 8:47 pm UTC
if
select max(count(*)) from t group by id
has a reasonable maximum -- yes, but it would be a trick lag/lead thing.
I would probably join using stragg. join the details to the aggregate using inline views.
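For what it is worth, a sketch of that join (assuming a table t(id, val) as in the example, and a string-aggregate such as the user-defined stragg -- LISTAGG in 11gR2 and later would do the same job):
select t.id, t.val, agg.vals
  from t,
       ( select id, stragg(val) vals
           from t
          group by id ) agg
 where t.id = agg.id
 order by t.id, t.val
/
each detail row is joined back to its group's aggregated string, which is the output asked for.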
OK
Siddiq, March 01, 2004 - 9:26 am UTC
Hi Tom,
What can be the business use cases of the analytic functions
1)cume_dist
2)percentile_disc
3)percentile_cont
Where can they be of immense use?
Bye!
March 01, 2004 - 10:17 am UTC
they are just statistical functions for analysis.
2 and 3 are really variations on each other (disc=discrete, cont=continuous) and would be used to compute percentiles (like you might see on an SAT test report from back in high school). percentile_* can be used to find a median for example :)
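For example, a median (a quick sketch against the familiar EMP table):
select percentile_cont(0.5) within group (order by sal) median_sal,
       percentile_disc(0.5) within group (order by sal) median_sal_discrete
  from emp
/
percentile_cont interpolates between the two middle values when needed; percentile_disc always returns a value actually present in the set.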
cume_dist is a variation on that. I'll cheat on an example, from the doc:
Analytic Example
The following example calculates the salary percentile for each employee in the purchasing area. For example, 40% of clerks have salaries less than or equal to Himuro.
SELECT job_id, last_name, salary, CUME_DIST() OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist FROM employees WHERE job_id LIKE 'PU%';
JOB_ID LAST_NAME SALARY CUME_DIST
---------- ------------------------- ---------- ----------
PU_CLERK Colmenares 2500 .2
PU_CLERK Himuro 2600 .4
PU_CLERK Tobias 2800 .6
PU_CLERK Baida 2900 .8
PU_CLERK Khoo 3100 1
PU_MAN Raphaely 11000 1
Stumped on Analytics
Dave Thompson, March 04, 2004 - 9:56 am UTC
Hi Tom,
I have the following two tables:
CREATE TABLE PAY_M
(
PAY_ID NUMBER,
PAYMENT NUMBER
)
--
--
CREATE TABLE PREM
(
PREM_ID NUMBER,
PREM_PAYMENT NUMBER
)
With the following data:
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
1, 100);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
2, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
3, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
4, 50);
COMMIT;
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
1, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
2, 25);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
3, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
4, 50);
COMMIT;
PAY_M contains payments made against the premiums in the table prem.
Payments:
PAY_ID PAYMENT
---------- ----------
1 50
2 25
3 50
4 50
Prem:
PREM_ID PREM_PAYMENT
---------- ------------
1 100
2 50
3 50
4 50
We are trying to find which payment Ids paid each premium payment in Prem. The payments are assigned sequentially to the premiums.
For example payments 1,2 & 3 pay off the £100 in premium 1 leaving £25. Then the remaining payment from payment 3 & payment 4 pay off premium 2 leaving a balance of £25, and so on.
We are trying to create a query that will use the analytic functions to find all the payment IDs that pay off the associated premium ids. We want to keep this SQL-based as we need to process about 30 million payments!
Thanks.
Great website, hope you enjoyed your recent visit to the UK.
March 04, 2004 - 1:52 pm UTC
let me make sure I have this straight -- you want to
o sum up the first 3 records in payments
o discover they are 125 which exceeds 100
o output the fact that prem_id 1 is paid for by pay_id 1..3
o carry forward 25 from 3, discover that leftover 3+4 = 75 pays for prem_id 2
with 25 extra
while I believe (not sure) that the 10g MODEL clause might be able to do this (if you can do it in a spreadsheet, we can use the MODEL clause to do it).....
I'm pretty certain that analytics cannot -- we would need to recursively use lag (eg: after finding that 1,2,3 pay off 1, we'd need to -- well, it's hard to explain...)
I cannot see analytics doing this -- future rows depend on functions of the analytics from past rows and that is just "not allowed".
I can see how to do this in a pipelined PLSQL function -- will that work for you?
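For what it's worth, a minimal sketch of that pipelined approach (untested; the type and function names are made up, and it assumes payments are applied in pay_id order and premiums consumed in prem_id order, per the tables above):
create type prem_pay_row as object ( prem_id number, pay_id number )
/
create type prem_pay_tab as table of prem_pay_row
/
create or replace function match_payments return prem_pay_tab
pipelined
as
    cursor c_pay is
        select pay_id, payment from pay_m order by pay_id;
    l_pay     c_pay%rowtype;
    l_avail   number := 0;   -- unapplied remainder of the current payment
    l_due     number;        -- unpaid remainder of the current premium
    l_applied number;
begin
    open c_pay;
    for p in ( select prem_id, prem_payment from prem order by prem_id )
    loop
        l_due := p.prem_payment;
        while l_due > 0
        loop
            if l_avail = 0 then
                fetch c_pay into l_pay;       -- pull the next payment
                exit when c_pay%notfound;     -- ran out of payments
                l_avail := l_pay.payment;
            end if;
            l_applied := least( l_due, l_avail );
            l_due     := l_due   - l_applied;
            l_avail   := l_avail - l_applied;
            -- this payment contributed something to this premium
            pipe row( prem_pay_row( p.prem_id, l_pay.pay_id ) );
        end loop;
    end loop;
    close c_pay;
    return;
end;
/
select * from table( match_payments() ) would then list every pay_id that contributed to each prem_id -- a single ordered pass over each table.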
Oops - Error in previous post
Dave Thompson, March 04, 2004 - 10:17 am UTC
Tom,
Sorry, ignore the above tables as they are missing the joining column:
CREATE TABLE PAY_M
(
PREM_ID NUMBER,
PAY_ID NUMBER,
PAYMENT NUMBER
)
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 1, 50);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 2, 25);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 3, 50);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 4, 50);
COMMIT;
CREATE TABLE PREM
(
PREM_ID NUMBER,
PAY_ID NUMBER,
PREM_PAYMENT NUMBER
)
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 1, 100);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 2, 50);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 3, 50);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 4, 50);
COMMIT;
SQL> l
1 SELECT *
2* FROM PAY_M
SQL> /
PREM_ID PAY_ID PAYMENT
---------- ---------- ----------
1 1 50
1 2 25
1 3 50
1 4 50
SQL> select *
2 from prem;
PREM_ID PAY_ID PREM_PAYMENT
---------- ---------- ------------
1 1 100
1 2 50
1 3 50
1 4 50
Thanks.....
Dave Thompson, March 05, 2004 - 4:23 am UTC
Tom,
Thanks for your prompt response.
I am familiar with Pipeline functions.
I was however hoping we could do this as a set-based operation because of the volume of data involved.
Thanks for your time.
analytics book
Ron Chennells, March 05, 2004 - 5:52 am UTC
Just another vote and pre order for the analytics book
OK
Gerhard, March 19, 2004 - 12:33 am UTC
Dear Tom,
I used the following query to find the difference of salaries between employees.
SQL> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal from emp;
ENAME SAL DIFF_SAL
---------- ---------- ----------
SMITH 800
JAMES 950 150
ADAMS 1100 150
WARD 1250 150
MARTIN 1250 0
MILLER 1300 50
TURNER 1500 200
ALLEN 1600 100
CLARK 2450 850
BLAKE 2850 400
JONES 2975 125
ENAME SAL DIFF_SAL
---------- ---------- ----------
SCOTT 3000 25
FORD 3000 0
KING 5000 2000
14 rows selected.
My Question is:
" What is the difference between King's sal with other
employees?".Could you please help with the query?
Bye!
March 19, 2004 - 8:58 am UTC
scott@ORA9IR2> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal ,
2 sal-king_sal king_sal_diff
3 from (select sal king_sal from emp where ename = 'KING'),
4 emp
5 /
ENAME SAL DIFF_SAL KING_SAL_DIFF
---------- ---------- ---------- -------------
SMITH 800 -4200
JAMES 950 150 -4050
ADAMS 1100 150 -3900
WARD 1250 150 -3750
MARTIN 1250 0 -3750
MILLER 1300 50 -3700
TURNER 1500 200 -3500
ALLEN 1600 100 -3400
CLARK 2450 850 -2550
BLAKE 2850 400 -2150
JONES 2975 125 -2025
SCOTT 3000 25 -2000
FORD 3000 0 -2000
KING 5000 2000 0
14 rows selected.
Will this be faster?
Venkat, March 19, 2004 - 4:20 pm UTC
select ename, sal,
sal-lag(sal) over(order by sal) as diff_sal,
sal - max(case when ename='KING' then sal
else null end) over () king_sal_diff
from emp
March 20, 2004 - 9:47 am UTC
when you benchmarked it and tested it to scale, what did you see? it would be interesting no?
lead/lag on different dataset
Stalin, May 03, 2004 - 9:22 pm UTC
Hi Tom,
I've a similar requirement but i'm not sure how to use lead or lag to refer to a different dataset.
Eg. the logs table has both login and logout information, identified by the action column. There can be different login/logout modes: records with action in (1,2) are logins and records with action in (3,4,5,6,7) are logouts. Now i need to find signon and signoff times and also session duration in mins.
here is some sample data of logs table :
LOG_ID LOG_CREATION_DATE USER_ID SERVICE ACTION
---------- ------------------- ---------- ---------- ----------
1 04/29/2004 10:48:36 3 5 2
3 04/29/2004 10:53:44 3 5 3
5 04/29/2004 11:11:35 3 5 1
1003 05/03/2004 15:18:53 3 5 5
1004 05/03/2004 15:19:50 8 5 1
here is a query i came up with (not exactly what i want) :
select log_id signon_id, lead(log_id, 1) over (partition by account_id, user_id, mac order by log_id) signoff_id,
user_id, log_creation_date signon_date,
lead(log_creation_date, 1) over (partition by account_id, user_id, mac order by log_creation_date) signoff_date,
nvl(round(((lead(log_creation_date, 1)
over (partition by account_id, user_id order by log_creation_date)-log_creation_date)*1440), 2), 0) Usage_Mins
from logs
where account_id = 'Robert'
and service = 5
order by user_id
desired output :
SIGNON_ID SIGNOFF_ID USER_ID SIGNON_DATE SIGNOFF_DATE USAGE_MINS
---------- ---------- ---------- ------------------- ------------------- ----------
1 3 3 04/29/2004 10:48:36 04/29/2004 10:53:44 5.13
5 1003 3 04/29/2004 11:11:35 05/03/2004 15:18:53 6007.3
1004 8 05/03/2004 15:19:50 0
Thanks in Advance,
Stalin
May 04, 2004 - 7:11 am UTC
maybe if you supply simple create table and insert ... values ... statements for me.... this stuff would go faster.
Your query references columns that are not in the example as well.
Create table scripts
Stalin, May 04, 2004 - 1:29 pm UTC
Sorry for not giving this info in the first place.
here goes the scripts....
create table logs (log_id number, log_creation_date date, account_id varchar2(25), user_id number,
service number, action number, mac varchar2(50))
/
insert into logs values (1, to_date('04/29/2004 10:48:36', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 2, '00-00-00-00')
/
insert into logs values (3, to_date('04/29/2004 10:53:44', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 3, '00-00-00-00')
/
insert into logs values (5, to_date('04/29/2004 11:11:35', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 1, '00-00-00-00')
/
insert into logs values (1003, to_date('05/03/2004 15:18:53', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 5, '00-00-00-00')
/
insert into logs values (1004, to_date('05/03/2004 15:19:50', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 8, 5, 1, '00-00-00-00')
/
The reason for including mac in the partition group is because users can log in via multiple pc's without logging out, hence i grouped on account_id, user_id and mac.
Thanks,
Stalin
May 04, 2004 - 2:38 pm UTC
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 ) minutes
2 from (
3 select log_id,
4 case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
5 then lead(log_id) over (partition by account_id, user_id, mac order by log_creation_date)
6 end signoff_id,
7 user_id,
8 log_creation_date signon_date,
9 case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
10 then lead(log_creation_date) over (partition by account_id, user_id, mac order by log_creation_date)
11 end signoff_date,
12 action
13 from logs
14 where account_id = 'Robert'
15 and service = 5
16 order by user_id
17 ) a
18 where action in (1,2)
19 /
LOG_ID SIGNOFF_ID USER_ID SIGNON_DATE SIGNOFF_DATE ACTION MINUTES
---------- ---------- ---------- ------------------- ------------------- ---------- ----------
1 3 3 04/29/2004 10:48:36 04/29/2004 10:53:44 2 5.13
5 1003 3 04/29/2004 11:11:35 05/03/2004 15:18:53 1 6007.3
1004 8 05/03/2004 15:19:50 1
Excellent
Stalin, May 04, 2004 - 3:42 pm UTC
This is exactly what i'm looking for.
Thanks so much!
Help On SQL
VKOUL, May 04, 2004 - 8:05 pm UTC
I want to carry the last non-null value of a column forward into the following null rows. e.g.
If I have records like the following
year month column_value
----- ------ --------------------
2002 06 55
2002 06 57
2002 07 NULL
2002 08 NULL
2002 09 NULL
2002 10 100
2002 11 101
I want the results as below
year month column_value
----- ------ --------------------
2002 06 55
2002 06 57
2002 07 57 ------> Repeated
2002 08 57 ------> Repeated
2002 09 57 ------> Repeated
2002 10 100
2002 11 101
May 04, 2004 - 9:08 pm UTC
create table,
insert into table
much appreciated......... (so i don't spend days of my life making create tables and insert into statements. I've added this request to all pages where you can input stuff and I'll just be asking for it from now on in...... Not picking on you, just reminding everyone that i need a script like I provide.....)
but..... asked and answered:
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10286792840956
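In short (a sketch, assuming a table t(year, month, column_value); last_value ... ignore nulls needs 10g and up -- the linked page covers the substr/max trick for earlier releases):
select year, month,
       last_value(column_value ignore nulls)
         over (order by year, month, column_value) column_value
  from t
/
column_value is added to the order by so that, of the two 2002/06 rows, 57 (not 55) is the value carried forward, matching the desired output.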
Help On SQL
VKoul, May 04, 2004 - 11:27 pm UTC
Beautiful !!!
I'll keep in mind "create table etc."
Thanks
VKoul
analytic q
A reader, May 11, 2004 - 6:38 pm UTC
Tom
Please look at the following schema and data.
---------
spool schema
set echo on
drop table host_instances;
drop table rac_instances;
drop table instance_tablespaces;
create table host_instances
(
host_name varchar2(50),
instance_name varchar2(50)
);
create table rac_instances
(
rac_name varchar2(50),
instance_name varchar2(50)
);
create table instance_tablespaces
(
instance_name varchar2(50),
tablespace_name varchar2(50),
tablespace_size number
);
-- host to instance mapping data
insert into host_instances values ( 'h1', 'i1' );
insert into host_instances values ( 'h2', 'i2' );
insert into host_instances values ( 'h3', 'i3' );
insert into host_instances values ( 'h4', 'i4' );
insert into host_instances values ( 'h5', 'i5' );
-- rac to instance mapping data
insert into rac_instances values ( 'rac1', 'i1' );
insert into rac_instances values ( 'rac1', 'i2' );
insert into rac_instances values ( 'rac2', 'i3' );
insert into rac_instances values ( 'rac2', 'i4' );
--- instance to tablespace mapping data
insert into instance_tablespaces values( 'i1', 't11', 100 );
insert into instance_tablespaces values( 'i1', 't12', 200 );
insert into instance_tablespaces values( 'i2', 't11', 100 );
insert into instance_tablespaces values( 'i2', 't12', 200 );
insert into instance_tablespaces values( 'i3', 't31', 500 );
insert into instance_tablespaces values( 'i3', 't32', 300 );
insert into instance_tablespaces values( 'i4', 't31', 500 );
insert into instance_tablespaces values( 'i4', 't32', 300 );
insert into instance_tablespaces values( 'i5', 't51', 400 );
commit;
---------
What I need is to sum up all tablespaces of all instances
for a list of hosts. However, if two hosts in the list
belong to a RAC then I should only pick one of the
hosts (I can pick any one of them.)
e.g. in the above data I should only pick i1 or i2 not
both since they both belong to the same RAC 'rac1'.
Following is the select I came up with for the above data.
Let me know if you have any comments on it.
Any other alternative solutions you can think of would
also be educating to me. I have not benchmarked this
select yet. The number of hosts could reach up to 2000
approximately. On an average we can assume each will have
one instance - some of these will be RACs.
Thank you!
-----------
scott@ora10g> set echo on
scott@ora10g> column host_name format a10
scott@ora10g> column instance_name format a10
scott@ora10g> column rac_name format a10
scott@ora10g> column row_number format 999
scott@ora10g>
scott@ora10g> select a.instance_name, sum( tablespace_size )
2 from
3 (
4 select instance_name
5 from
6 (
7 select host_name, instance_name, rac_name,
8 row_number() over
9 (
10 partition by rac_name
11 order by rac_name, instance_name
12 ) row_number
13 from
14 (
15 select hi.host_name, hi.instance_name, ri.rac_name
16 from host_instances hi, rac_instances ri
17 where hi.instance_name = ri.instance_name(+)
18 )
19 )
20 where row_number <= 1
21 ) a, instance_tablespaces e
22 where a.instance_name = e.instance_name
23 group by a.instance_name;
i1 300
i3 800
i5 400
---
Also do you prefer the .sql file (as above) or
the spooled output of schema.sql (i.e. schema.lst.)
The above is more convenient to reproduce - but the spooled output makes for better reading in some cases.
May 11, 2004 - 9:15 pm UTC
I like the cut and paste from sqlplus truth be told.
sure, I have to do two vi commands and a couple of deletes to fix it up but.... I'm fairly certain that the poster *actually ran the commands successfully!* which is most relevant to me....
Besides, I do it to you ;)
ops$tkyte@ORA9IR2> select *
2 from (
3 select h.host_name, h.instance_name, r.rac_name, sum(t.tablespace_size),
4 row_number() over (partition by r.rac_name order by h.host_name ) rn
5 from host_instances h,
6 rac_instances r,
7 instance_tablespaces t
8 where h.instance_name = r.instance_name(+)
9 and h.instance_name = t.instance_name
10 group by h.host_name, h.instance_name, r.rac_name
11 )
12 where rn = 1
13 /
HO IN RAC_N SUM(T.TABLESPACE_SIZE) RN
-- -- ----- ---------------------- ----------
h1 i1 rac1 300 1
h3 i3 rac2 800 1
h5 i5 400 1
is the first thing that popped into my head.
with just a couple hundred rows -- any of them will perform better than good enough.
thanx!
A reader, May 11, 2004 - 9:54 pm UTC
"I like the cut and paste from sqlplus truth be told."
Actually I was going to post just that - but your example at the point of posting led me to believe that you wanted straight sql - maybe you wanna fix that (not that many people seem to care anyways! :))
Thanx for the sql - it looks good and a tad simpler
than the one I wrote...
How to compute this running total (sort of...)
Kishan, May 18, 2004 - 11:33 am UTC
create table investment (
investment_id number,
asset_id number,
agreement_id number,
constraint pk_i primary key (investment_id)
)
/
create table period (
period_id number,
business_domain varchar2(10),
status_code varchar2(10),
constraint pk_p primary key (period_id)
)
/
create table entry (
entry_id number,
period_id number,
investment_id number,
constraint pk_e primary key(entry_id),
constraint fk_e_period foreign key(period_id) references period(period_id),
constraint fk_e_investment foreign key (investment_id) references investment(investment_id)
)
/
create table entry_detail(
entry_id number,
account_type varchar2(10),
amount number,
constraint pk_ed primary key(entry_id, account_type),
constraint fk_ed_entry foreign key(entry_id) references entry(entry_id)
)
/
insert into period (period_id, business_domain, status_code)
SELECT rownum AS period_id,
'BDG' AS business_domain,
'2' AS status_code
from all_objects where rownum <= 5
/
insert into investment(investment_id, asset_id, agreement_id)
select rownum+10 AS investment_id,
rownum+100 AS asset_id,
rownum+1000 AS agreement_id
from all_objects where rownum <=5
/
insert into entry(entry_id, period_id, investment_id) values (1, 1, 11)
/
insert into entry(entry_id, period_id, investment_id) values (2, 2, 11)
/
insert into entry(entry_id, period_id, investment_id) values (3, 3, 11)
/
insert into entry(entry_id, period_id, investment_id) values (4, 3, 13)
/
insert into entry(entry_id, period_id, investment_id) values (5, 4, 13)
/
insert into entry(entry_id, period_id, investment_id) values (6, 4, 14)
/
insert into entry(entry_id, period_id, investment_id) values (7, 5, 14)
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC1', 1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC2', -200 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC3', 300 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(5, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC1', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC3', 500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC4', 1200 )
/
scott@LDB.US.ORACLE.COM> select * from period;
PERIOD_ID BUSINESS_D STATUS_COD
---------- ---------- ----------
1 BDG 2
2 BDG 2
3 BDG 2
4 BDG 2
5 BDG 2
scott@LDB.US.ORACLE.COM> select * from investment;
INVESTMENT_ID ASSET_ID AGREEMENT_ID
------------- ---------- ------------
11 101 1001
12 102 1002
13 103 1003
14 104 1004
15 105 1005
scott@LDB.US.ORACLE.COM> select * from entry;
ENTRY_ID PERIOD_ID INVESTMENT_ID
---------- ---------- -------------
1 1 11
2 2 11
3 3 11
4 3 13
5 4 13
6 4 14
7 5 14
7 rows selected.
scott@LDB.US.ORACLE.COM> select * from entry_detail;
ENTRY_ID ACCOUNT_TY AMOUNT
---------- ---------- ----------
1 AC1 1000
1 AC2 -200
1 AC3 300
2 AC1 200
2 AC4 -1000
2 AC2 -500
3 AC2 2200
3 AC1 200
4 AC4 -1000
4 AC2 -500
5 AC2 2200
6 AC1 200
6 AC4 -1000
6 AC2 -500
7 AC1 2200
7 AC3 500
7 AC4 1200
17 rows selected.
The resultant view needed is given below.
To give an example from the result below, the first entry for investment_id 14
is from period 4. The account types entered on period 4 are AC1, AC4, AC2. We
need these three account types in all subsequent periods. Also, on period 5 a
new account type AC3 is added. So, if there is another period, say period_id 6, we need
information for AC1, AC2, AC3, AC4 (that's 4 account types). If there's no entry
for any of these account_types for any subsequent periods, the amount_for_period for such
periods are considered to be 0.00 and the balance will be sum(amount_for_period)
until that period.
PERIOD_ID INVESTMENT_ID ACCOUNT_TYPE AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
--------- ------------- ------------ ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
3 11 AC1 200 1400
3 11 AC2 200 -500
3 11 AC3 0 300
3 11 AC4 0 1000
4 11 AC1 0 1400
4 11 AC2 0 -500
4 11 AC3 0 300
4 11 AC4 0 1000
5 11 AC1 0 1400
5 11 AC2 0 -500
5 11 AC3 0 300
5 11 AC4 0 1000
3 13 AC4 -1000 -1000
3 13 AC2 -500 -500
4 13 AC4 0 -1000
4 13 AC2 -500 -1000
5 13 AC4 0 -1000
5 13 AC4 0 -1000
4 14 AC1 200 200
4 14 AC4 -1000 -1000
4 14 AC2 -500 -500
5 14 AC1 2200 2400
5 14 AC3 500 500
5 14 AC4 1200 200
5 14 AC2 0 -500
The blank lines in between are just for clarity. As always, grateful for all your efforts.
Regards,
Kishan.
May 18, 2004 - 6:14 pm UTC
so, what does your first try look like :) at least get the join written up for the details - maybe the running total will be obvious from that.
This is how far I went...and no further
Kishan, May 19, 2004 - 10:18 am UTC
select distinct period_id,
investment_id,
account_type,
amount_for_period,
balance_till_period
from ( select period.period_id,
entry.investment_id,
entry_detail.account_type,
(case when entry.period_id = period.period_id then entry_detail.amount else 0 end) amount_for_period,
sum(amount) over(partition by period.period_id, investment_id, account_type) balance_till_period
from period left outer join (entry join entry_detail on (entry.entry_id = entry_detail.entry_id)) on (entry.period_id <= period.period_id))
order by investment_id
The result looks as below:
PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 0 1200
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC2 0 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
3 11 AC1 0 1400
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
4 11 AC1 0 1400
4 11 AC2 0 1500
4 11 AC3 0 300
4 11 AC4 0 -1000
5 11 AC1 0 1400
5 11 AC2 0 1500
5 11 AC3 0 300
5 11 AC4 0 -1000
3 13 AC2 -500 -500
3 13 AC4 -1000 -1000
4 13 AC2 0 1700
4 13 AC2 2200 1700
4 13 AC4 0 -1000
5 13 AC2 0 1700
5 13 AC4 0 -1000
4 14 AC1 200 200
4 14 AC2 -500 -500
4 14 AC4 -1000 -1000
5 14 AC1 0 2400
5 14 AC1 2200 2400
5 14 AC2 0 -500
5 14 AC3 500 500
5 14 AC4 0 200
5 14 AC4 1200 200
First, I am sorry my originally constructed result (by hand..;) misses a couple of rows.
However, other than that, I am unable to remove the redundant rows that show up for a particular investment and account_type for a period, as the logic beats me.
Basically, I need to remove rows where the amount_for_period is 0 for an account_type only if it's a redundant row for that set. That is, the first row of period_id 2 and 3 are redundant but the rows for period 4 are not redundant.
Could you help me out?
Regards,
Kishan.
May 19, 2004 - 11:06 am UTC
are we missing some more order bys? I mean -- what if:
3 11 AC1 0 1400
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
was really:
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
3 11 AC1 0 1400
would that still be redundant? missing something here?
Yes...they are redundant
A reader, May 19, 2004 - 12:16 pm UTC
Tom:
Yes, for that particular set, those rows are redundant, no matter what the order is.
Regards,
Kishan.
May 19, 2004 - 2:24 pm UTC
ok, so what is the "key" of that result set? what can we partition the result set by?
my idea will be to use your query in an inline view and analytics on that to weed out what you want.
Kishan, May 19, 2004 - 3:08 pm UTC
The key would be period_id, investment_id and accout_type. Basically, what the result represents is the amount and the balance-to-date for a particular account_type of an investment_id for a period.
Eg: Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
If there's no activity on that investment and account_type for the next period, say Period 2, the amount will be 0 for that period, and the balance will be previous period's balance.
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=0->Balance = 1000
But, if there's an activity on that account_type for that investment, then the amount will be the amount for that period and balance will be the sum of previous balance and current amount. Say for Period 2, the amount is 500, then
Period 1->Investment 1->Account_Type AC1->Amount=1000-> Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500-> Balance=1500
And if there's a new account type entry, say AC2 and amount, say 2000 created for period 2, then the result set will be
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500->Balance=1500
Period 2->Investment 1->Account_Type AC2->Amount=2000->Balance=2000
There may be many investments per period and many account_types per investment. Hope I am clear....
Regards,
Kishan.
May 19, 2004 - 5:34 pm UTC
so... if you have:
PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 0 1200
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC2 0 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
you see though, why isn't the 4th line here "redundant" then?
But it is redundant..
Kishan, May 19, 2004 - 11:51 pm UTC
Tom, I am assuming the 4th line you mention is 2->11->AC2->0->-700. Yes, it is redundant.
We need amount and balance for every period_id, investment_id and account_type. One line per period_id, investment_id and account_type; anything more is redundant.
Issue is, there may not be entries for a specific account_type of an investment for a particular period. In such cases, we need to assume amount for such periods are 0 and compute the balances accordingly.
Regards,
Kishan
May 20, 2004 - 10:55 am UTC
so, if you partition by
PERIOD_ID INVESTMENT_ID ACCOUNT_TY BALANCE_TILL_PERIOD
order by
AMOUNT_FOR_PERIOD
select a.*, lead(amount_for_period) over (partition by .... order by ... ) nxt
from (YOUR_QUERY)
you can then
select *
from (that_query)
where nxt is NULL or (nxt is not null and amount_for_period <> 0)
if nxt is null -- last row in the partition, keep it.
if nxt is not null AND we are zero -- remove it.
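Putting that together with the query above (a sketch, untested):
select period_id, investment_id, account_type,
       amount_for_period, balance_till_period
  from ( select x.*,
                lead(amount_for_period) over
                  ( partition by period_id, investment_id,
                                 account_type, balance_till_period
                    order by amount_for_period ) nxt
           from ( select distinct period.period_id,
                         entry.investment_id,
                         entry_detail.account_type,
                         case when entry.period_id = period.period_id
                              then entry_detail.amount else 0
                         end amount_for_period,
                         sum(amount) over (partition by period.period_id,
                                 investment_id, account_type) balance_till_period
                    from period left outer join
                         ( entry join entry_detail
                               on entry.entry_id = entry_detail.entry_id )
                      on entry.period_id <= period.period_id ) x )
 where nxt is null
    or amount_for_period <> 0
/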
Almost there?
Dave Thompson, May 20, 2004 - 12:30 pm UTC
Hi Tom,
We have the following table of data:
CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)
With:
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;
We are trying to sequentially de-duplicate this data.
Basically, from the top of the table we go down and check each row against the previous. If they are the same, the duplicate row is marked as such, as is the original row.
So far we have this query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
FROM DEDUP_TEST
ORDER
BY 1, 5 ) ) )
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-00 01-OCT-01 DUP
2 A B B 01-OCT-01 01-OCT-03 DUP
2 A B B 02-OCT-01 05-OCT-03 DUP
2 a f f 06-FEB-04 07-FEB-04 NOT
2 A B B 02-OCT-05 03-OCT-05 DUP
2 A B B 04-OCT-05 06-OCT-05 DUP
3 A F F 10-FEB-04 20-FEB-04 NOT
The resultset from this is almost what I am after.
However where there are groups of duplicate rows I only want to return one row. I take the attributes, the start_date of the first row duplicated and the end_date of the last row duplicated.
I do not want to group all the duplicates together, so for example the rows with the attributes
ID COLUMN_A COLUMN_B COLUMN_C
2 A B B
will result in two output rows:
2 A B B 01-OCT-00 01-OCT-03
2 A B B 02-OCT-05 06-OCT-05
This is the final piece I cannot work out.
Any help would be appreciated.
Thanks.
May 20, 2004 - 2:18 pm UTC
what happens in your data if you had
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
that might or might not be "dup" since you just order by ID? don't we need to order by a, b, and c?
Follow up
Dave Thompson, May 21, 2004 - 5:02 am UTC
Hi Tom,
In response to your question:
what happens in your data if you had
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
Then the first row would be classed as unique, as would the second and the third. We are only looking at duplicates that occur sequentially.
Sequential duplicates are then turned into one row by taking the start date of the first row and the end date of the last row in the group.
The test data should have had sequential dates:
CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)
/
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2009 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2010 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2007 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2008 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2011 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2012 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2013 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2015 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;
The query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
FROM DEDUP_TEST
ORDER
BY ID, START_DATE ) ) )
Gives:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
From this the sequentially duplicated rows with the attributes a, b, c will become:
2 A B C 01-OCT-03 05-OCT-08
2 A B C 02-OCT-11 06-OCT-14
Thanks.
May 21, 2004 - 10:50 am UTC
define sequentially.
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
ordered by ID is the same (exact same) as:
1 A1 B1 C1 ....
1 A1 B1 C1 ....
1 A2 B2 C2 ....
and
1 A2 B2 C2 ....
1 A1 B1 C1 ....
1 A1 B1 C1 ....
and in fact, two runs of your query could return different answers given the SAME exact data. To handle that, you must have something more to sort by.
Typo in previous post
Dave Thompson, May 21, 2004 - 5:56 am UTC
Tom,
The final output should be:
From this the sequentially duplicated rows with the attributes a, b, c will
become:
2 A B B 01-OCT-03 05-OCT-08
2 A B B 02-OCT-11 06-OCT-14
Thanks.
Order
Dave Thompson, May 21, 2004 - 10:57 am UTC
Hi Tom,
The order of the dataset should be on the ID and Start Date.
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
Thanks.
May 21, 2004 - 11:42 am UTC
Ok, your example doesn't do that -- it is "non-deterministic", given the same data, it could/would return two different answers at different times during the day!
so, i think you want one of these:
ops$tkyte@ORA9IR2> select *
2 from (
3 select id, a,b,c, start_date, end_date,
4 case when (a = lag(a) over (order by id, start_date desc) and
5 b = lag(b) over (order by id, start_date desc) and
6 c = lag(c) over (order by id, start_date desc) )
7 then row_number() over (order by id, start_date)
8 end rn
9 from v
10 )
11 where rn is null
12 /
ID A B C START_DAT END_DATE RN
---------- ---------- ---------- ---------- --------- --------- ----------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 02-OCT-07 05-OCT-08
2 a f f 06-FEB-09 07-FEB-10
2 A B B 04-OCT-13 06-OCT-14
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
ops$tkyte@ORA9IR2> select *
2 from (
3 select id, a,b,c, start_date, end_date,
4 case when (a = lag(a) over (order by id, start_date) and
5 b = lag(b) over (order by id, start_date) and
6 c = lag(c) over (order by id, start_date) )
7 then row_number() over (order by id, start_date)
8 end rn
9 from v
10 )
11 where rn is null
12 /
ID A B C START_DAT END_DATE RN
---------- ---------- ---------- ---------- --------- --------- ----------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 01-OCT-03 01-OCT-04
2 a f f 06-FEB-09 07-FEB-10
2 A B B 02-OCT-11 03-OCT-12
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
we just need to mark records whose preceding record is the "same" after sorting -- then nuke them.
More Info
Dave Thompson, May 21, 2004 - 12:25 pm UTC
Hi Tom,
Thanks for the prompt reply.
I re-wrote the base query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
ROWID ROWID_R,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_C
FROM DEDUP_TEST
ORDER
BY ID, START_DATE ) ) )
And got:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
Looking at the column LETSEE I want to add a unique identifier to each row, treating duplicated rows as 1.
For example:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET DUP_ID
---------- ---------- ---------- ---------- --------- --------- --- ------
1 A B C 01-OCT-99 01-OCT-00 NOT 1
1 D B C 01-OCT-01 01-OCT-02 NOT 2
1 A B C 01-OCT-02 01-OCT-03 NOT 3
2 A B B 01-OCT-03 01-OCT-04 DUP 4
2 A B B 01-OCT-05 01-OCT-06 DUP 4
2 A B B 02-OCT-07 05-OCT-08 DUP 4
2 a f f 06-FEB-09 07-FEB-10 NOT 5
2 A B B 02-OCT-11 03-OCT-12 DUP 6
2 A B B 04-OCT-13 06-OCT-14 DUP 6
3 A F F 10-FEB-14 20-FEB-15 NOT 7
Then I could use the Dup_Id to partition on to do the analysis I need.
Any idea?
Have a nice weekend.
Thanks.
May 21, 2004 - 1:59 pm UTC
the above query doesn't work?
Hi Again
Dave Thompson, May 21, 2004 - 2:05 pm UTC
Hi Tom,
The above didn't work.
From the source query:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
I want to output the following resultset:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
In the resultset from your queries the start and end dates were incorrect.
Where duplicate rows occur one after another, we need to take the start_date of the first row and the end_date of the last row in that block.
So for the following:
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
You would get
2 A B B 01-OCT-03 05-OCT-08 DUP
Does this make sense?
Thanks again for your input on this.
May 21, 2004 - 2:19 pm UTC
ops$tkyte@ORA9IR2> select id, a,b,c, min(start_date) start_date, max(end_date) end_date
2 from (
3 select id, a,b,c, start_date, end_date,
4 max(grp) over (order by id, start_date desc) grp
5 from (
6 select id, a,b,c, start_date, end_date,
7 case when (a <> lag(a) over (order by id, start_date desc) or
8 b <> lag(b) over (order by id, start_date desc) or
9 c <> lag(c) over (order by id, start_date desc) )
10 then row_number() over (order by id, start_date desc)
11 end grp
12 from v
13 )
14 )
15 group by id, a,b,c,grp
16 order by 1, 5
17 /
ID A B C START_DAT END_DATE
---------- ---------- ---------- ---------- --------- ---------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 01-OCT-03 05-OCT-08
2 a f f 06-FEB-09 07-FEB-10
2 A B B 02-OCT-11 06-OCT-14
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
One of my (current) favorite analytic tricks -- the old "carry forward". We mark rows such that the preceding row was different -- subsequent dup rows would have NULLS there for grp.
Then, we use max(grp) to "carry" that number down....
Now we have something to group by -- we've divided the rows up into groups we can deal with.
(note: if a,b,c allow NULLS, we'll need to accommodate for that!)
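For example, the GRP marker could be made NULL-safe something like this (just a sketch against the same V) -- DECODE treats two NULLs as equal, so it can stand in for the <> comparisons; it also marks the very first row, which is harmless here:
select id, a,b,c, start_date, end_date,
       case when ( decode( a, lag(a) over (order by id, start_date desc), 1, 0 ) = 0 or
                   decode( b, lag(b) over (order by id, start_date desc), 1, 0 ) = 0 or
                   decode( c, lag(c) over (order by id, start_date desc), 1, 0 ) = 0 )
            then row_number() over (order by id, start_date desc)
       end grp
  from v
/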
Great Stuff
Dave Thompson, May 21, 2004 - 5:02 pm UTC
Tom,
Thanks very much for that.
I'll go over it in more detail when I'm in the Office Monday but it looks great from here.
Enjoy the weekend.
Excellent
Dave Thompson, June 02, 2004 - 4:53 am UTC
Hi Tom,
This solution was spot on.
Thanks.
Any more thoughts on an Analytics book?
Stalin, June 09, 2004 - 6:03 pm UTC
hi tom,
wondering what the SQL below would look like if the LEAD and PARTITION BY analytic capabilities didn't exist. Is PL/SQL the only option?
snippet from the "lead/lag on different dataset" thread (it has the create and insert stmts)
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 )
minutes
2 from (
3 select log_id,
4 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
5 then lead(log_id) over (partition by account_id, user_id, mac
order by log_creation_date)
6 end signoff_id,
7 user_id,
8 log_creation_date signon_date,
9 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
10 then lead(log_creation_date) over (partition by account_id,
user_id, mac order by log_creation_date)
11 end signoff_date,
12 action
13 from logs
14 where account_id = 'Robert'
15 and service = 5
16 order by user_id
17 ) a
18 where action in (1,2)
19 /
Thanks,
Stalin
June 09, 2004 - 6:27 pm UTC
you could use a non-equi self join to achieve the same. Many orders of magnitude slower.
scalar subqueries could be used as well -- with the same "slower" caveat.
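for illustration, the scalar subquery version of "find the next row" might look something like this (a sketch against the LOGS table from that thread, ignoring the action in (3,4,5,6,7) check for brevity) -- every row re-probes LOGS, which is where the time goes:
select l.log_id, l.user_id, l.log_creation_date signon_date,
       ( select min(l2.log_creation_date)        -- re-scan to find the "next" row
           from logs l2
          where l2.account_id = l.account_id
            and l2.user_id = l.user_id
            and l2.mac = l.mac
            and l2.log_creation_date > l.log_creation_date ) signoff_date
  from logs l
 where l.account_id = 'Robert'
   and l.service = 5
   and l.action in (1,2)
/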
Is this solvable with ANALYTICS too?
Peter Tran, June 10, 2004 - 12:14 am UTC
Hi Tom,
Can the following problem be solved using Analytics?
I have a 10-column table where 9 of the fields are dimensions and one is an attribute. I would like to get a report of the D1/D2 combinations where ATTR1 is 1 for all values of the other dimensions. Furthermore, the PK consists of all the dimension columns.
The column names below aren't the real ones, but I didn't want to make the example table too wide for illustrative purposes.
D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA AA AA AA AA AA AA AA AA 1
AA AA BB AA AA AA AA AA AA 1
AA AA AA CC AA AA AA AA AA 1
AA AA AA AA DD AA AA AA AA 1
AA AA AA AA EE AA AA AA AA 1
AA BB AA AA AA AA AA GG AA 1
AA BB AA AA AA AA AA AA AA 1
AA BB CC AA AA AA AA AA AA 0
AA BB AA DD AA AA AA AA AA 1
EE DD JJ LL MM NN OO PP QQ 1
EE DD TT LL MM NN OO PP QQ 1
I want the query to return:
D1 D2
--------
AA AA
EE DD
It would not return AA/BB, because of the record:
D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA BB CC AA AA AA AA AA AA 0
Thanks,
-Peter
June 10, 2004 - 7:43 am UTC
yes they can, but they are not needed. regular aggregates do the job. I'd give you the real query if I had a create table/inserts to demo against. this is "pseudo code", might or might not actually work:
select d1, d2
from t
group by d1, d2
having count(distinct attribute) = 1
Michael T., June 10, 2004 - 9:01 am UTC
Peter,
I think the following may give you what you want.
SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) > 0;
Tom's pseudo code will work except for the case when all D1/D2 combinations have the same ATTR1 value, but that value is not 1.
June 10, 2004 - 9:45 am UTC
ahh, good eye -- i was thinking "all attribute values are the same"
but yours doesn't do it, this will
having count( decode( attr1, 1, 1 ) ) = count(*)
count(decode(attr1,1,1)) will return a count of non-null occurrences (all of the 1's)
count(*) returns a count of all records
output when count(decode) = count(*)
Thank you!
Peter Tran, June 10, 2004 - 10:37 am UTC
Hi Tom/Michael T.,
Thank you. It's so much clearer now.
-Peter
Michael T., June 10, 2004 - 10:46 am UTC
I did screw up in my previous response. The query I submitted gives the entirely wrong answer. It should have been
SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) = 0
Even though, incorrectly, I wasn't originally considering null values for ATTR1, the above query seems to produce the correct answer even if ATTR1 is NULL. The DECODE will evaluate a null ATTR1 entry to 1.
Tom, many thanks for this site. I have learned so much from it. It is a daily must read for me.
You said a book on analytics?
Jeff, June 10, 2004 - 12:30 pm UTC
A book by you on analytics would be a best seller I think.
Go for it.
quick analytic question
A reader, June 16, 2004 - 5:03 pm UTC
schema creation---
---
scott@ora92> drop table t1;
Table dropped.
scott@ora92> create table t1
2 (
3 x varchar2(10),
4 y number
5 );
Table created.
scott@ora92>
scott@ora92> insert into t1 values( 'x1', 1 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 2 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 4 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 0 );
1 row created.
scott@ora92> commit;
Commit complete.
scott@ora92> select x, y, min(y) over() min_y
2 from t1;
X Y MIN_Y
---------- ---------- ----------
x1 1 0
x1 2 0
x1 4 0
x1 0 0
scott@ora92> spool off
---
how do I get the minimum of y over all values that are greater than 0 (if one exists)? In the above case I should get the result as
X Y MIN_Y
---------- ---------- ----------
x1 1 1
x1 2 1
x1 4 1
x1 0 1
Thanx for your excellent site and brilliant work!
June 16, 2004 - 6:09 pm UTC
min( case when y > 0 then y end ) over ()
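that is, against the T1 above:
select x, y,
       min( case when y > 0 then y end ) over () min_y
  from t1;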
Great!!!
A reader, June 16, 2004 - 6:46 pm UTC
Thank you very much
Gj, July 02, 2004 - 9:16 am UTC
The Oracle docs are a little light on examples, but thank you for giving us the quick start to analytics. I can't say I understand the complex examples yet, but the simple stuff seems so easy to understand now. I can't wait until a real problem comes along that I can apply this feature to.
July 02, 2004 - 10:39 am UTC
How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?
Sergey, July 06, 2004 - 8:08 am UTC
Hi Tom,
I need to 'fill the gaps' with the values from the last existing row in a table that is outer joined to another table. The other table serves as a source of regular [time] intervals. The task seems to be conceptually very simple, so I looked into the Oracle docs (they happen to be the Ora10g docs) and pretty soon found exactly what I need: LAST_VALUE with IGNORE NULLS. Unfortunately, neither Ora8i nor Ora9i accepts IGNORE NULLS. Is there any way to mimic this feature with the 'older' analytic functions?
I tried a sort of ORDER BY SIGN(NVL(VALUE, 0)) in the analytic ORDER BY clause, but it does not work (I do not have a clue why)
Thanks in advance
Here is the test:
DROP TABLE TD;
CREATE TABLE TD AS
(SELECT TRUNC(SYSDATE, 'DD') + ROWNUM T
FROM ALL_OBJECTS
WHERE ROWNUM <= 15
);
DROP TABLE TV;
CREATE TABLE TV AS
(SELECT
TRUNC(SYSDATE, 'DD') + ROWNUM * 3 T
,ROWNUM V
FROM ALL_OBJECTS
WHERE ROWNUM <= 5
);
SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V IGNORE NULLS) -- IGNORE NULLS does not work on Ora8i, Ora9i
OVER
(
ORDER BY TD.T
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
ERROR at line 6:
ORA-00907: missing right parenthesis
SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V)
OVER
(
ORDER BY SIGN(NVL(TV.V, 0)), TD.T -- Does not work
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
T SIGN(NVL(TV.V,0)) V
------------------- ------------------ ------------------
07.07.2004 00:00:00 0
08.07.2004 00:00:00 0
09.07.2004 00:00:00 1 1
10.07.2004 00:00:00 0
11.07.2004 00:00:00 0
12.07.2004 00:00:00 1 2
13.07.2004 00:00:00 0
14.07.2004 00:00:00 0
15.07.2004 00:00:00 1 3
16.07.2004 00:00:00 0
17.07.2004 00:00:00 0
18.07.2004 00:00:00 1 4
19.07.2004 00:00:00 0
20.07.2004 00:00:00 0
21.07.2004 00:00:00 1 5
July 06, 2004 - 8:26 am UTC
This is a trick I call "carry down", we use analytics on analytics to accomplish this. We output "marker rows" with ROW_NUMBER() on the leading edge. Using MAX() in the outer query, we "carry down" these marker rows -- substr gets rid of the row_number for us:
ops$tkyte@ORA10G> select t,
2 sign_v,
3 v,
4 substr( max(data) over (order by t), 7 ) v2
5 from (
6 SELECT TD.T,
7 SIGN(NVL(TV.V, 0)) sign_v,
8 NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
9 case when tv.v is not null
10 then to_char( row_number()
over (order by td.t), 'fm000000' ) || tv.v
11 end data
12 FROM TD, TV
13 WHERE TV.T(+) = TD.T
14 )
15 ORDER BY T
16 ;
T SIGN_V V V2
--------- ---------- ---------- -----------------------------------------
07-JUL-04 0
08-JUL-04 0
09-JUL-04 1 1 1
10-JUL-04 0 1 1
11-JUL-04 0 1 1
12-JUL-04 1 2 2
13-JUL-04 0 2 2
14-JUL-04 0 2 2
15-JUL-04 1 3 3
16-JUL-04 0 3 3
17-JUL-04 0 3 3
18-JUL-04 1 4 4
19-JUL-04 0 4 4
20-JUL-04 0 4 4
21-JUL-04 1 5 5
15 rows selected.
So, in 9ir2 this would simply be:
ops$tkyte@ORA9IR2> select t,
2 sign_v,
3 substr( max(data) over (order by t), 7 ) v2
4 from (
5 SELECT TD.T,
6 SIGN(NVL(TV.V, 0)) sign_v,
7 case when tv.v is not null
8 then to_char( row_number() over (order by td.t), 'fm000000' ) || tv.v
9 end data
10 FROM TD, TV
11 WHERE TV.T(+) = TD.T
12 )
13 ORDER BY T
14 ;
T SIGN_V V2
--------- ---------- -----------------------------------------
07-JUL-04 0
08-JUL-04 0
09-JUL-04 1 1
10-JUL-04 0 1
11-JUL-04 0 1
12-JUL-04 1 2
13-JUL-04 0 2
14-JUL-04 0 2
15-JUL-04 1 3
16-JUL-04 0 3
17-JUL-04 0 3
18-JUL-04 1 4
19-JUL-04 0 4
20-JUL-04 0 4
21-JUL-04 1 5
15 rows selected.
Doesn't work with PL/SQL ????????
A reader, July 20, 2004 - 9:31 am UTC
Dear Tom
Are analytics fully compatible with PL/SQL?
Please see
SQL> ed
Wrote file afiedt.buf
1 select empno,deptno,
2 count(empno) over (partition by deptno order by empno
3 rows between unbounded preceding and current row) run_count
4* from emp
SQL> /
EMPNO DEPTNO RUN_COUNT
---------- ---------- ----------
7782 10 1
7839 10 2
7934 10 3
7369 20 1
7566 20 2
7788 20 3
7876 20 4
7902 20 5
7499 30 1
7521 30 2
7654 30 3
EMPNO DEPTNO RUN_COUNT
---------- ---------- ----------
7698 30 4
7844 30 5
7900 30 6
14 rows selected.
SQL>
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is
3 select empno,deptno,
4 count(empno) over (partition by deptno order by empno
5 rows between unbounded preceding and current row) run_count
6 from emp;
7 begin
8 for rec in c1 loop
9 null;
10 end loop;
11* end;
SQL> /
end;
*
ERROR at line 11:
ORA-06550: line 5, column 72:
PL/SQL: ORA-00905: missing keyword
ORA-06550: line 3, column 1:
PL/SQL: SQL Statement ignored
SQL>
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for 32-bit Windows: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
SQL>
July 20, 2004 - 8:08 pm UTC
You can contact support and reference <Bug:3083373>, but the workaround would be to use native dynamic sql or a view to "hide" this construct.
the problem turns out to be the word "current" which had meaning in plsql.
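for example, the view workaround could look like this (hypothetical view name):
create or replace view emp_run_count_v    -- hides the analytic from the plsql parser
as
select empno, deptno,
       count(empno) over (partition by deptno order by empno
                          rows between unbounded preceding and current row) run_count
  from emp;

declare
    cursor c1 is select empno, deptno, run_count from emp_run_count_v;
begin
    for rec in c1 loop
        null;
    end loop;
end;
/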
Effect of distinct on lag
John Murphy, July 29, 2004 - 1:48 pm UTC
I am trying to use analytics to find accounts with receipts in 3 consecutive years. The analytic code seems to work, however, when I add DISTINCT (to find each account once), I get strange results. This is on 9.2.0.1.0.
create table jcm_test(acct_id number(10), rcpt_date date);
insert into jcm_test
values (1 , to_date('01-JAN-2000', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2001', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2003', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('02-JAN-2001', 'dd-mon-yyyy'));
select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2;
ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2000 1
1 2001 2
1 2003 3 2000 1 3 2
select * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;
no rows selected
select distinct * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;
ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2001 2 2000 1 1 1
1 2003 4 2001 2 2 2
In your book, you say that because analytics are performed last, you must push them into an inline view. However, that doesn't seem to do the trick here. Thanks, john
July 29, 2004 - 2:18 pm UTC
what release -- i don't see what you see.
Distinct effect release
John Murphy, July 29, 2004 - 3:12 pm UTC
Tom, we are using the following.
Oracle9i Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
I tried searching Metalink, but couldn't find any bugs.
July 29, 2004 - 4:03 pm UTC
i found one, not published; it was solved via 9202 -- at least it did not reproduce, so they did not pursue it further.
Distinct effect release
John Murphy, July 29, 2004 - 4:01 pm UTC
Actually, I suspect that this may be related to bug 2258035. Do you agree? Thanks, john
July 29, 2004 - 4:18 pm UTC
yes, i can confirm that in 9205, it is not happening that way.
how to write this query
Teddy, July 30, 2004 - 6:33 am UTC
Hi
using the original poster´s example:
ORDER OPN STATION CLOSE_DATE
----- --- ------- ----------
12345 10 RECV 07/01/2003
12345 20 MACH1 07/02/2003
12345 25 MACH1 07/05/2003
12345 30 MACH1 07/11/2003
12345 36 INSP1 07/12/2003
12345 50 MACH1 08/16/2003
12346 90 MACH2 07/30/2003
12346 990 STOCK 07/31/2003
How do you write a query to determine that an order has passed a manufacturing operation in several months?
In the above example,
12345 has rows in July and August but 12346 has rows in July only. How can we write a query to find orders such as 12345?
July 30, 2004 - 4:40 pm UTC
select order#, min(close_date), max(close_date)
from t
group by order#
having months_between( max(close_date), min(close_date) ) > your_threshold;
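or, counting calendar months directly (a sketch, assuming close_date is never null):
select order#
  from t
 group by order#
having count( distinct trunc(close_date,'mm') ) > 1;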
Finding pairs in result set
PJ, August 11, 2004 - 10:05 am UTC
Tom,
CREATE TABLE A
(
N NUMBER,
C CHAR(1),
V VARCHAR2(20)
);
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '1st e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '2nd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '3rd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '1st w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '2nd w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'e', '1st e of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '1st w of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '2nd w of 2nd N');
commit;
SO the data I've is
select * from a;
-------------------------
N C V
1 e 1st e of 1st N
1 e 2nd e of 1st N
1 e 3rd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N
2 w 2nd w of 2nd N
---------------------------------------
And the output I'm looking for is
1 e 1st e of 1st N
1 e 2nd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N
So basically I need the first pairs of (e-w/w-e) for each N.
I hope I'm clear here.
Thanks as usual in advance,
August 11, 2004 - 12:40 pm UTC
do you have a field that can be "sorted on" for finding "1st, 2nd" and so on?
If not, there is no such thing as "first", or "third"
PJ, August 11, 2004 - 12:58 pm UTC
Tom,
Sorry if I was not clear.
We need to pick pairs for each N. For example, we have 5 rows with N=1, so we have to pick 4 rows, leaving 1 UNPAIRED "e" out.
We want the data in the same order as it is in the table. We can sort it by --> order by N,C
August 11, 2004 - 1:58 pm UTC
ops$tkyte@ORA920> select n, c, rn, cnt2
2 from (
3 select n, c, rn,
4 min(cnt) over (partition by n) cnt2
5 from (
6 select n, c,
7 row_number() over (partition by n, c order by c) rn,
8 count(*) over (partition by n, c) cnt
9 from a
10 )
11 )
12 where rn <= cnt2
13 /
N C RN CNT2
---------- - ---------- ----------
1 e 1 2
1 e 2 2
1 w 1 2
1 w 2 2
2 e 1 1
2 w 1 1
6 rows selected.
Brilliant as usual !!
A reader, August 11, 2004 - 2:04 pm UTC
PJ's query
Kevin, August 11, 2004 - 2:04 pm UTC
PJ - you can drop the column 'v' from your table, and just use this query (which I think will answer your question using N and C alone, and generate an appropriate 'v' as it runs).
CREATE TABLE b
(
N NUMBER,
C CHAR(1)
);
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
INSERT INTO b ( N, C ) VALUES ( 2, 'e');
INSERT INTO b ( N, C ) VALUES ( 2, 'w');
INSERT INTO b ( N, C ) VALUES ( 2, 'w');
COMMIT;
SELECT n,c,v1
FROM (
SELECT lag (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1) c3,
lead (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1)c4,
c1 ||
CASE WHEN c1 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c1,10),1,'st',2,'nd',3,'rd','th')
END || ' ' || c || ' of ' || c2 ||
CASE WHEN c2 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c2,10),1,'st',2,'nd',3,'rd','th')
END || ' N' v1,
t1.*
FROM (
SELECT b.*,
row_number() OVER (PARTITION BY n, c ORDER BY n,c) c1,
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
FROM b
) t1
) t2
WHERE c3 IS NOT NULL OR c4 IS NOT NULL
/
Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
COMMIT;
Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
1 e 3rd e of 1st N
1 w 3rd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N
oops
Kevin, August 11, 2004 - 2:12 pm UTC
replace
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
with
DENSE_RANK() OVER (PARTITION BY c ORDER BY c) c2
my bad.
A reader, August 11, 2004 - 3:27 pm UTC
Your bad what?
toe? leg?
Cool....
PJ, August 12, 2004 - 7:25 am UTC
analytic q
A reader, October 22, 2004 - 6:34 pm UTC
First the schema:
scott@ORA92I> drop table t1;
Table dropped.
scott@ORA92I> create table t1( catg1 varchar2(10), catg2 varchar2(10), total number );
Table created.
scott@ORA92I>
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 5 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 6 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 9 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V2', 'T2', 10 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V3', 'T1', 11 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V4', 'T1', 1 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V5', 'T2', 2 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V6', 'T2', 3 );
1 row created.
The catg2 can only take two values, 'T1', 'T2'.
I want to sum the total column for catg1, catg2
and order by their total sum for each catg1 and catg2 values. Then
I want to list the top 3 catg1, catg2 combinations
based on their sum values of total column.
If there are more than 3 such combinations then I
club the remaining ones into a catg1 value of 'Others'.
my first cut solution is:
scott@ORA92I> select catg1, catg2, sum( total_sum )
2 from
3 (
4 select case
5 when dr > 3 then
6 'Others'
7 when dr <= 3 then
8 catg1
9 end catg1,
10 catg2,
11 total_sum
12 from
13 (
14 select catg1, catg2, total_sum,
15 dense_rank() over( order by total_sum desc) dr
16 from
17 (
18 select catg1, catg2, sum( total ) total_sum
19 from t1
20 group by catg1, catg2
21 )
22 )
23 )
24 group by catg1, catg2;
CATG1 CATG2 SUM(TOTAL_SUM)
---------- ---------- --------------
V1 T1 20
V2 T2 10
V3 T1 11
Others T1 1
Others T2 5
Does it look ok or do you have any better solution?
Thank you as always.
October 23, 2004 - 9:36 am UTC
you could skip a layer of inline view, but it looks fine as is.
thanx!
A reader, October 24, 2004 - 12:37 pm UTC
SQL query
Reader, November 03, 2004 - 1:45 pm UTC
I have a table which stores receipts against Purchase Orders. The users want the following o/p:
For each of the months of Jan, Feb and March 2004, provide a count of number of receipts which fall in each of the following Dollar value range
< $5000
Between $5000 and $9999
> $10,000
(There can be a number of receipts against one Purchase Order, so those need to be grouped together first)
I wrote this query using an inline view which is the UNION of 3 SQLs, one for each dollar range.
However, I am sure there is a more elegant and efficient method to do this, maybe using analytic functions, CASE, decode.... Appreciate your help.
Thanks
November 05, 2004 - 10:49 am UTC
select trunc(date_col,'mm') Month,
count( case when amt < 5000 then 1 end ) "lt 5000",
count( case when amt between 5000 and 9999 then 1 end ) "between 5/9k",
count( case when amt >= 10000 then 1 end ) "10k or more"
from t
where date_col between :a and :b
group by trunc(date_col,'mm')
single pass....
Great -
syed, November 10, 2004 - 7:09 am UTC
Tom
I have a table as follows
create table matches
( reference varchar2(9),
endname varchar2(20),
beginname varchar2(30),
DOB date,
ni varchar2(9)
)
/
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A1','SMITH','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A2','JONES','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A3','JONES','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A4','BROWN','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234570');
insert into matches values ('A4','JONES','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234571');
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A3','JACKSON','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A2','JACKSON','BOB',to_date('1/1/1962','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A5','JACKSON','TOM',to_date('1/1/1920','dd/mm/yyyy'),'AA1234569');
commit;
SQL> select rownum,REFERENCE,ENDNAME,BEGINNAME,DOB,NI from matches;
ROWNUM REFERENCE ENDNAME BEGINNAME DOB NI
------- --------- -------- ---------- --------- ---------
1 A1 SMITH BOB 01-JAN-76 AA1234567
2 A1 SMITH TOM 01-JAN-70 AA1234568
3 A2 JONES TOM 01-JAN-70 AA1234568
4 A3 JONES TOM 01-JAN-71 AA1234569
5 A4 BROWN BRAD 01-JAN-61 AA1234570
6 A4 JONES BRAD 01-JAN-61 AA1234571
7 A1 SMITH BOB 01-JAN-76 AA1234567
8 A3 JACKSON TOM 01-JAN-71 AA1234569
9 A2 JACKSON BOB 01-JAN-62 AA1234568
10 A5 JACKSON TOM 01-JAN-20 AA1234569
I need to show duplicates where the following column values are the same.
a) REFERENCE, ENDNAME,BEGINNAME,DOB,NI
b) ENDNAME,BEGINNAME,NI
c) REFERENCE,NI
So,
rownum 1 and 7 match criteria a)
rownum 8 and 10 match criteria b)
rownum 1 and 7, rownum 3 and 9, rownum 4 and 8 match criteria c)
How can I select this data out to show number matching each criteria ?
November 10, 2004 - 7:23 am UTC
"How can I select this data out to show number matching each criteria ?"
is ambiguous.
If you add columns:
count(*) over (partition by reference, endname, beginname, dob, ni ) cnt1,
count(*) over (partition by endname, beginname, ni) cnt2,
count(*) over (partition by reference,ni) cnt3
it'll give you the "dup count" by each partition -- technically showing you the "number matching each criteria"
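in full, that might look something like this (a sketch against your MATCHES table -- a count greater than 1 means the row is duplicated under that rule):
select reference, endname, beginname, dob, ni, cnt1, cnt2, cnt3
  from ( select m.*,
                count(*) over (partition by reference, endname, beginname, dob, ni) cnt1,
                count(*) over (partition by endname, beginname, ni) cnt2,
                count(*) over (partition by reference, ni) cnt3
           from matches m )
 where cnt1 > 1 or cnt2 > 1 or cnt3 > 1;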
analytics problem
David, November 19, 2004 - 9:37 am UTC
I am newish to analytic functions and have hit a problem as follows:
create table a
(accno number(8) not null,
total_paid number(7,2) not null)
/
create table b
(accno number(8) not null,
due_date date not null,
amount_due number(7,2) not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into b values (1, '01-oct-04', 1000);
insert into b values (1, '01-jan-05', 900);
insert into b values (1, '01-apr-05', 700);
insert into b values (2, '01-oct-04', 1000);
insert into b values (2, '01-jan-05', 900);
insert into b values (2, '01-apr-05', 700);
insert into b values (3, '01-oct-04', 1000);
insert into b values (3, '01-jan-05', 900);
insert into b values (3, '01-apr-05', 700);
insert into b values (4, '01-oct-04', 1000);
insert into b values (4, '01-jan-05', 900);
insert into b values (4, '01-apr-05', 700);
If I then do this query...
SQL> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 order by a.accno,
21 to_date(b.due_date, 'dd-mon-rr')
22 /
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 1000
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
2 1500 01-OCT-04 1000 1000
2 1500 01-JAN-05 900 900
2 1500 01-APR-05 700 700
3 2000 01-OCT-04 1000 1000
3 2000 01-JAN-05 900 900
3 2000 01-APR-05 700 700
4 3000 01-OCT-04 1000 1000
4 3000 01-JAN-05 900 900
4 3000 01-APR-05 700 700
12 rows selected.
...TO_PAY does not give what I was expecting. But if I do it by individual accno I get what I'm after:
SQL> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 and a.accno = &accno
21 order by a.accno,
22 to_date(b.due_date, 'dd-mon-rr')
23 /
Enter value for accno: 1
old 20: and a.accno = &accno
new 20: and a.accno = 1
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 0
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
3 rows selected.
SQL> /
Enter value for accno: 2
old 20: and a.accno = &accno
new 20: and a.accno = 2
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
2 1500 01-OCT-04 1000 0
2 1500 01-JAN-05 900 400
2 1500 01-APR-05 700 700
3 rows selected.
SQL> /
Enter value for accno: 3
old 20: and a.accno = &accno
new 20: and a.accno = 3
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
3 2000 01-OCT-04 1000 0
3 2000 01-JAN-05 900 0
3 2000 01-APR-05 700 600
3 rows selected.
SQL> /
Enter value for accno: 4
old 20: and a.accno = &accno
new 20: and a.accno = 4
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
4 3000 01-OCT-04 1000 0
4 3000 01-JAN-05 900 0
4 3000 01-APR-05 700 0
3 rows selected.
What is needed for the first query above to work?
cheers,
David
November 19, 2004 - 11:31 am UTC
ops$tkyte@ORA9IR2> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (<b>partition by a.accno</b> order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 order by a.accno,
21 to_date(b.due_date, 'dd-mon-rr')
22 /
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 0
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
2 1500 01-OCT-04 1000 0
2 1500 01-JAN-05 900 400
2 1500 01-APR-05 700 700
3 2000 01-OCT-04 1000 0
3 2000 01-JAN-05 900 0
3 2000 01-APR-05 700 600
4 3000 01-OCT-04 1000 0
4 3000 01-JAN-05 900 0
4 3000 01-APR-05 700 0
12 rows selected.
excellent
David, November 19, 2004 - 12:02 pm UTC
many thanks
Limitation of Analytic Functions
Nilanjan Ray, December 16, 2004 - 4:27 am UTC
I am using the following view
create or replace view vw_history as
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history
The following SQL statement invariably does a full table scan on 112,861,91 rows of ADC_HISTORY and runs for 20-25 mins.
select *
from vw_history
where txm_dt between to_date('01/01/2002','dd/mm/yyyy') and to_date('01/01/2002','dd/mm/yyyy');
The query returns 4200 rows. ADC_HISTORY has 112,861,91 rows. I have the following indexes: ADC_HISTORY_IDX8 on the txm_dt column and ADC_HISTORY_IDX1 on the spot_key column. Both have good selectivities.
But when the required query is run without the view, it properly uses the index ADC_HISTORY_IDX8
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history
I had raised a tar and it says: This is the expected behaviour -- "PREDICATES ARE NOT PUSHED IN THE VIEW IF ANY ANALYTIC FUNCTIONS ARE USED"
Is there any way to work around this limitation? I just cannot bear to think of the painful situation if I am unable to use views with analytics!!!!
Your help is absolutely necessary. Thanks in advance
December 16, 2004 - 8:27 am UTC
guess what -- your two queries <b>return different answers</b>..
did you consider that? did you check that?
they are TOTALLY DIFFERENT. Analytics are applied after predicates. The view -- it has no predicate. The query -- it has a predicate. You'll find that you have DIFFERENT result sets.
don't you see that as a problem?
It is not that you are "unable to use views"
It is that "when I use a view, I get answer 1, when I do not use a view, I get answer 2"
which answer is technically correct here?
Think about it.
consider this example (using RBO just to make it so that "if an index could be used it would" to stress the point):
ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
Table created.
ops$tkyte@ORA9IR2> create index job_idx on emp(job);
Index created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace view v
2 as
3 select ename, sal, job,
4 sum(sal) over (partition by job) sal_by_job,
5 sum(sal) over (partition by deptno) sal_by_deptno
6 from emp
7 /
View created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace on explain
ops$tkyte@ORA9IR2> select *
2 from v
3 where job = 'CLERK'
4 /
ENAME SAL JOB SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER 1300 CLERK 4150 8750
JAMES 950 CLERK 4150 9400
SMITH 800 CLERK 4150 10875
ADAMS 1100 CLERK 4150 10875
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 VIEW OF 'V'
2 1 WINDOW (SORT)
3 2 WINDOW (SORT)
4 3 TABLE ACCESS (FULL) OF 'EMP'
<b>so, one might ask "well -- hey, I've got that beautiful index on JOB, I said "where job = 'CLERK'", what's up with that full scan?"
in fact, when I do it "right" -- without the evil view:</b>
ops$tkyte@ORA9IR2> select ename, sal, job,
2 sum(sal) over (partition by job) sal_by_job,
3 sum(sal) over (partition by deptno) sal_by_deptno
4 from emp
5 where job = 'CLERK'
6 /
ENAME SAL JOB SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER 1300 CLERK 4150 1300
SMITH 800 CLERK 4150 1900
ADAMS 1100 CLERK 4150 1900
JAMES 950 CLERK 4150 950
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 WINDOW (SORT)
2 1 WINDOW (SORT)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
4 3 INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
<b>it very rapidly uses my index !!! stupid views...
but wait.
what's up with SAL_BY_DEPTNO, that appears to be wrong... hmmm, what happened?
What happened was we computed the sal_by_deptno in the query without the view AFTER doing "where job = 'CLERK'"
YOU are doing your LAG() analysis AFTER applying the predicate. Your lags in your query without the view -- they are pretty much "not accurate"
Note that when the predicate CAN be pushed:</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select ename, sal, sal_by_job
2 from v
3 where job = 'CLERK'
4 /
ENAME SAL SAL_BY_JOB
---------- ---------- ----------
SMITH 800 4150
ADAMS 1100 4150
JAMES 950 4150
MILLER 1300 4150
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 VIEW OF 'V'
2 1 WINDOW (BUFFER)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
4 3 INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
<b>it most certainly is. here the predicate can safely be pushed -- since the analytic is computed "by job", a predicate on "job" can be applied FIRST and then the analytic computed.
When pushing would change the answer -- we cannot do it.
When pushing the predicate would not change the answer -- we do it.
This is not a 'limitation', this is about "getting the right answer"</b>
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> alter session set optimizer_mode = choose;
Session altered.
Great!!!
Nilanjan Ray, December 17, 2004 - 12:59 pm UTC
Simply amazing explanation. Cleared my doubts still further. One of the best explanations, in simple concise terms, I have seen on "Ask Tom". You know what, people should take enough caution and learn lessons from you before making misleading statements like "...LIMITATIONS...". In your terms yet again, "Analytics Rock".
Regards
Using analytical function, LEAD, LAG
Praveen, December 24, 2004 - 9:26 am UTC
Hi Tom,
The analytic function LEAD (or LAG) accepts the offset parameter as an integer, which is a count of rows to be skipped from the current row before accessing the leading/lagging row. What if I want to access leading rows based on the value of a column of the current row -- i.e. a function applied to the current row's column value to locate the leading row?
As an example: I have a table
create table t(id integer, dt date);
For each id, start with the first record, after ordering by dt ASC. Get the next record where dt = 10 min + first_row.dt. Then the next record where dt = 20 min + first_row.dt, and so on. Each time the offset is cumulatively increased by 10 min.
Suppose we don't get an exact match from the next record (i.e. next_row.dt <> first_row.dt + 10 min, say), then we select the row closest to the expected record, but lying within +/- 10 seconds.
insert into t values (1, to_date('12/20/2004 00:00:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:05:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:09:55', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:10:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:15:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:19:54', 'mm/dd/yyyy hh24:mi:ss')); --Not selected.
insert into t values (1, to_date('12/20/2004 00:19:55', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:25:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:30:05', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:30:06', 'mm/dd/yyyy hh24:mi:ss')); --Not Selected.
insert into t values (1, to_date('12/20/2004 00:35:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:39:55', 'mm/dd/yyyy hh24:mi:ss')); --Either this or below record is selected.
insert into t values (1, to_date('12/20/2004 00:40:05', 'mm/dd/yyyy hh24:mi:ss')); --Either this or above record is selected.
My output would be:
id dt
-----------
1 12/20/2004 00:00:00 AM
1 12/20/2004 00:10:00 AM --Exactly matches first_row.dt + 10min
1 12/20/2004 00:19:55 AM --Closest to first_row.dt + 20min +/- 10sec
1 12/20/2004 00:30:05 AM --Closest to first_row.dt + 30min +/- 10sec
1 12/20/2004 00:39:55 AM OR 12/20/2004 00:40:05 AM --Closest to first_row.dt + 40min +/- 10sec
The method I followed, after failing with LEAD, is:
Step#1
------
Get a subset of the dt column, which is a 10-minute cumulative series of dts from the dt value of the first row (after rounding down to the nearest multiple of 10 minutes).
In this example I will get a subset:
12/20/2004 00:00:00 AM
12/20/2004 00:10:00 AM
12/20/2004 00:20:00 AM
12/20/2004 00:30:00 AM
12/20/2004 00:40:00 AM
This query will do it:
SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1
Step#2:
-------
This subquery is joined with table t to get only those records from t which are either equal to the dts in the resultset returned by the subquery or fall within the range of 10 min +/- 10 sec (not the closest only, but all).
SELECT t.id, dt_rounded, ABS (t.dt - dt_rounded) * 24 * 60 * 60 dt_diff_in_sec
FROM t,
(SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1) t2
WHERE t.id = 1
AND ABS (t.dt - dt_rounded) * 24 * 60 * 60 <= 10
ORDER BY t.id, dt_rounded, dt_diff_in_sec;
I agree, this resultset will include duplicate records which I need to remove procedurally, while looping through the cursor; the order by clause simplifies this.
Now you might have guessed the problem. If table t contains more than 1000 records, the query asks me to wait at least 2 min! And that too when I am planning to put in at least 70,000 records!
I wrote a procedure which handles the situation a little better. But I don't know if an analytic query can help me bring back the performance. I could do it if LEAD had the functionality I mentioned in the first paragraph. Do you have any hints?
Thanks and regards
Praveen
December 24, 2004 - 9:54 am UTC
you'd be looking at first_value with range windows, not lag and lead in this case.
Windowing clause and range function.
Praveen, December 25, 2004 - 1:29 pm UTC
Hi Tom,
Thank you for the suggestion. I am not very familiar with analytic queries. I have tried based on your advice but am unable to even get started. I am stuck at the first step itself -- specifying the range in the windowing clause. In the windowing clause, we specify an integer to get the preceding rows based on the current column value (CLARK's example, page 556, Analytic Functions).
In my above example I wrote a query which contains:
FIRST_VALUE(id)
OVER (ORDER BY dt DESC
RANGE 10 PRECEDING)
10, in the windowing clause, will give me records that fall within 10 days preceding the current row. But I need the records 10 minutes preceding. And at the same time, all those records that fall within +/- 10 sec, if exact 10-minutes-later records are not found (please see the description of the problem given in the previous question).
Kindly give me a clearer picture of the windowing clause.
Also, how would you approach the above problem?
Thanks and regards
Praveen
December 26, 2004 - 12:19 pm UTC
do you have Expert One on One Oracle? I have extensive examples in there.
range 10 = 10 days.
range 10/24 = 10 hours
range 10/24/60 = 10 minutes......
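so, against the T(ID,DT) table above, a 10 minute window might look like (just a sketch):
select id, dt,
       first_value(dt) over (partition by id
                             order by dt
                             range 10/24/60 preceding) first_dt_in_window
  from t;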
I do have Expert One on One
Praveen, December 26, 2004 - 2:24 pm UTC
Hi Tom,
I got my first glimpse into analytic queries through your book. Although I had attempted to learn them through the Oracle documentation a couple of times earlier, I was never able to write a decent query using analytic functions. Now, after spending a few hours with your book, I can see that these functions are not as complex as I thought earlier.
The 'hiredate' example you have given in the book is calculating in terms of days. (Pg:555)
"select ename, sal, hiredate, hiredate-100 window_top
first_value(ename)
over(order by hiredate asc
range 100 preceding) ename_prec,...."
I got the hint from your follow-up. I should have to think a little myself.
Thankyou Tom,
Praveen.
A reader, December 26, 2004 - 5:49 pm UTC
Tom,
Any dates when you would be releasing your book on Analytic?
Thanks.
December 26, 2004 - 6:00 pm UTC
doing a 2nd edition of Expert One on One Oracle now -- not on the list yet.
Great answer!
Shimon Tourgeman, December 27, 2004 - 2:21 am UTC
Dear Tom,
Could you please tell us when you are going to publish the next edition of your books, covering 9iR2 and maybe 10g, as you stated here?
Merry Christmas and a Happy New Year!
Shimon.
December 27, 2004 - 10:06 am UTC
sometime in 2005, but not the first 1/2 :)
Using range windows
Praveen, January 03, 2005 - 8:09 am UTC
Hi Tom,
Please allow me to explain the problem again which you had
followed up earlier (Please refer: "Using analytical
function, LEAD, LAG"). In the table t(id integer, dt date)
I have records which only differ by seconds ('dt' column).
Could you please help me to write a query to create windows
such that each window groups records based on the
expression 590 <= dt_1 <= 610 (590 & 610 are date
difference between first record and current record in
seconds and dt1 is the 'dt' column value of first record in
each window after ordering by 'id' and 'dt' ASC).
The idea is to find a record following the first record
which leads by 10 minutes. If exact match is not found
apply a tolerance of +/-10 seconds. Once the nearest match
is found (if multiple matches are found, select any), start
from the next record and repeat the process. (Please see
the scripts I had given earlier).
In your follow up, you had suggested the use of
first_value() analytical function with range windows. But
it looks like it is pretty difficult to generate the kind
of windows I specified above. And in your book, examples of
such complex nature where not given (pardon me for being
critical).
Your answer will help me to get a deeper and practical
understanding of analytical functions while at the same
time may help us to bring down a 12 hour procedure to less
than 5 hours.
Thanks and regards
Praveen
January 03, 2005 - 9:11 am UTC
no idea what 590 is. days? hours? seconds?
sorry - this doesn't compute to me.
590 <= dt_1 <= 610???
Delete Records Older Than 90 Days While Keeping Max
Mac, January 03, 2005 - 10:24 am UTC
There is a DATE column in a table. I need to delete all records older than 90 days -- except if the newest record for a unique key happens to be older than 90 days, I want to keep it and delete the prior records for that key value.
How?
January 03, 2005 - 10:26 am UTC
if the "newest record for a unique key"
if the key is unique.... then the date column is the only thing to be looked at?
that is, if the key is unique, then the oldest record is the newest record is in fact the only record....
Oops, but
A reader, January 03, 2005 - 11:01 am UTC
Sorry, forgot to mention that the DATE column is a part of the unique key.
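Something like this might do it -- an untested sketch, assuming the non-date part of the key is KEY_COL and the date column is DT:
delete from t
 where rowid in (
    select rid
      from ( select rowid rid, dt,
                    max(dt) over (partition by key_col) max_dt   -- newest date per key
               from t )
     where dt < sysdate - 90     -- older than 90 days...
       and dt <> max_dt );       -- ...and not the key's newest row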
Sorry, I went a bit fast...
Praveen, January 03, 2005 - 2:00 pm UTC
Hi Tom,
Sorry, I didn't explain it properly.
590 = (10 minutes * 60) seconds - 10 seconds
610 = (10 minutes * 60) seconds + 10 seconds
Here I am looking for a record (say rn) exactly 600 sec (10 min) later than the first record in the range window. If I don't get an exact match, I try to find the record which is closest to rn but lies within a range 10 seconds less than or more than rn.
And the condition "590 <= dt_1 <= 610" tries to eliminate all other records inside the range window that do not follow the above rule.
dt_1 is the dt column value of any row following the
first row in a given range window, such that the
difference between dt_1 and dt of first row is between
590 seconds and 610 seconds. I am interested in only
one record which lies closest to 600 seconds.
I hope the picture is clearer to you now. As an example,
example,
id dt
-----------------------------
1 12/20/2004 00:00:00 AM --Range window #1
1 12/20/2004 00:09:55 AM
1 12/20/2004 00:10:00 AM --Selected (Closest to 12/20/2004 00:10:00 AM)
............................
1 12/20/2004 00:10:10 AM --Range window #2
1 12/20/2004 00:19:55 AM --Selected (Closest to 12/20/2004 00:20:00 AM)
1 12/20/2004 00:20:55 AM
............................
1 12/20/2004 00:20:55 AM --Range window #3
1 12/20/2004 00:25:00 AM --Nothing to select
1 12/20/2004 00:29:10 AM --Nothing to select
...........................
1 12/20/2004 00:30:05 AM --Range window #4
1 12/20/2004 00:39:55 AM --Either one is selected
1 12/20/2004 00:40:05 AM --Either one is selected
-----------------------------
Thanks and regards
Praveen
January 03, 2005 - 10:24 pm UTC
that is first_value, last_value with a range window and the time range is
N * 1/24/60/60 -- for N seconds.
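for example, a window reaching back 590 to 610 seconds could be phrased like this (a sketch against the T(ID,DT) table above -- it shows the window mechanics only, not the whole grouping problem):
select id, dt,
       min(dt) over ( partition by id
                      order by dt
                      range between 610*1/24/60/60 preceding
                                and 590*1/24/60/60 preceding ) candidate_dt
  from t;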
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?
jayaramj@quinnox.com, January 13, 2005 - 3:11 pm UTC
Hi Tom,
In answer to the question 'How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?' from reviewer Sergey (from Norway) in this post you have proposed the following solution:
ops$tkyte@ORA10G> select t,
2 sign_v,
3 v,
4 substr( max(data) over (order by t), 7 ) v2
5 from (
6 SELECT TD.T,
7 SIGN(NVL(TV.V, 0)) sign_v,
8 NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
9 case when tv.v is not null
10 then to_char( row_number()
over (order by td.t), 'fm000000' ) || tv.v
11 end data
12 FROM TD, TV
13 WHERE TV.T(+) = TD.T
14 )
15 ORDER BY T
16 ;
The problem is that this solution converts the data type of the column (in this case column TV.V) to a string (V2 in the result is a string). The result would then need to be converted back to the original data type.
It is best to avoid such data type conversion. Is there a solution to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS) in Oracle 9i without the datatype conversion?
January 13, 2005 - 3:45 pm UTC
encode the date as a string using to_char( v, 'yyyymmddhh24miss' ) and in the substr of it back out -- to_date( substr(...), 'yyyymmddhh24miss' )
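that is, something like this (a sketch, pretending TV.V were a DATE -- the carry down is unchanged, only the encode/decode differs):
select t,
       to_date( substr( max(data) over (order by t), 7 ),
                'yyyymmddhh24miss' ) v2
  from ( select td.t,
                case when tv.v is not null
                     then to_char( row_number() over (order by td.t), 'fm000000' )
                          || to_char( tv.v, 'yyyymmddhh24miss' )
                end data
           from td, tv
          where tv.t(+) = td.t )
 order by t;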
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?
Jay, January 14, 2005 - 12:44 am UTC
In response to your post above - Taking care of dates (for datatype conversion) is not complex (though timestamp variants would require a different format string). Object columns are a different story altogether. These cannot be easily converted to strings. Is there a better solution that does not require datatype conversion (and hence does not require any knowledge of the column datatype in this SQL).
January 14, 2005 - 8:06 am UTC
upgrade to 10g.
find prior collect_date to the max collect_date for each customer
JANE, January 25, 2005 - 4:30 pm UTC
Hello,Tom!
I work in ORACLE 8I
I have table with 2 columns:cstmr_no,collect_date
CREATE TABLE CSTMR_dates
(
CSTMR_NO NUMBER(8) NOT NULL,
COLLECT_DATE DATE NOT NULL);
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy'));
How can I rewrite this query using an analytic
function:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no
In production I have thousands of records in the table. THANKS A LOT
JANE
January 25, 2005 - 6:59 pm UTC
no idea what "return_collect_date" is. or where it comes from.
the sql is not sql...
Mistake: return_collect_date should be collect_date
JANE, January 26, 2005 - 2:58 am UTC
Thank you for answer
JANE
January 26, 2005 - 8:46 am UTC
but this sql:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(COLLECT_DATE)
group by cstmr_no
is still not sql and I don't know if you want to
a) delete all old data BY CSTMR_NO (eg: keep just the record with the max(collect_date) BY CSTMR_NO
b) delete all data such that the collect_date is not equal to the max(collect_date)
I cannot suggest a way to rewrite an invalid sql query.
No, I want to do the following:
A reader, January 26, 2005 - 9:08 am UTC
I just have to present the data without deleting anything.
For each cstmr I have to see:
cstmr_no max(collect_date) last prior date to max
======== ================= ======================
18 01/05/04 01/03/04
248 01/11/04 01/03/04
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy'));
January 26, 2005 - 9:31 am UTC
wow, how we got from:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no
to this, well -- just "wow". horse of a very different color.
I have to sort of guess -- maybe I'll get it right -- you want
a) every cstmr_no,
b) the last two dates recorded for them.
well, after editing your inserts to make them become actual sql that can run.... (you don't really use YY in real life do you? please please say "no, that was a mistake...")
ops$tkyte@ORA9IR2> select cstmr_no,
2 max( decode(rn,1,collect_date) ) d1,
3 max( decode(rn,2,collect_date) ) d2
4 from (
5 select cstmr_no,
6 collect_date,
7 row_number() over (partition by cstmr_no order by collect_date desc nulls last) rn
8 from cstmr_dates
9 )
10 where rn <= 2
11 group by cstmr_no
12 /
CSTMR_NO D1 D2
---------- --------- ---------
18 01-MAY-04 01-MAR-04
248 01-NOV-04 01-MAR-04
Lead/Lag and Indexes
Rob H, February 22, 2005 - 6:12 pm UTC
We are using the LEAD and LAG functions and I have run into an issue with index usage.
let's say I have 2 tables
select customer_account, prod_id, sales_date, total_sales from sales_table_NA
and
select customer_account, prod_id, sales_date, total_sales from sales_table_EUR
if I do a
create view eur_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_EUR
group by customer_account, prod_id, trunc(sales_date,'mon')
create view na_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_NA
group by customer_account, prod_id, trunc(sales_date,'mon')
There are indexes on the tables for customer_account
Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove the lead function and UNION them, the index is used.
Any help?
February 23, 2005 - 1:56 am UTC
do you really want UNION or UNION ALL.........
(do you know the difference between the two)....
if you had given me simple setup scripts, I would have been happy to see if that makes a difference, but oh well.
Potential Solution
Rob H, February 22, 2005 - 6:54 pm UTC
Rather than pre-summing the data into 2 views I found that union'ing (actually UNION ALL) the data, then summing with LEAD works fine,
ie
select
customer_account, prod_id, month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by
customer_account, prod_id order by month_purch desc) sales_last
from(
select customer_account, prod_id, trunc(sales_date,'mon') month_purch, total_sales from sales_table_NA
union all
select customer_account, prod_id, trunc(sales_date,'mon') month_purch, total_sales from sales_table_EUR)
group by customer_account, prod_id, month_purch
Attitude....
Rob H, February 23, 2005 - 9:54 am UTC
What's the deal? Having a bad day? I'm sorry, but I assumed from the select statements you could infer structure. Yes, I was using UNION ALL, yes, I know the difference (uh, feeling a bit rude are we?) but I didn't realize until after I posted that I missed that (a nice feature would be to be able to edit a post for a certain time after post). I generalized the data structure and SQL for confidentiality reasons. For a guy who is so hard on people's IM speak, you forget to capitalize your sentences :)
Now, UNION vs UNION ALL didn't affect index usage (it did however have 'other' performance issues). You can see from my next post that I worked on the issue and resolved it by not pre-summing each table. With the new query, if someone issues a select with no 'where customer_account=' then it's slower (but that also wasn't the goal).
Thanks
February 24, 2005 - 4:35 am UTC
No? I was simply asking "do you know the difference between the two", for I find most people
a) don't know union all exists
b) don't know the semantic difference between union and union all
c) don't realize the performance penalty involved with union vs union all when they didn't need to use UNION
Your example, as posted, did not use UNION ALL. Look at your text:
<quote>
Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it
does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove
the lead function and UNION them, the index is used.
</quote>
I quite simply asked:
does union all change the behaviour? (i did not have an example with table creates and such to work with, so I couldn't really 'test it', I don't have your tables, your indexes, your datatypes, etc)
do you need to use union, you said union, you did not say union all. do you know the difference between the two.
Sorry if you took it as an insult, I can only comment based on the data provided. I had to assume you like most of the world was using UNION, not UNION ALL and simply wanted to know if you could use union all, if union all made a difference, if you knew the difference between the two.
If I had prescience, I could have read your subsequent post and not asked any questions, I guess.
Not having a bad day, just working with information provided. I was not trying to insult you -- I was simply "asking".
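(for anyone else following along -- the semantic difference in one tiny example:
select 1 from dual union     select 1 from dual;   -- one row, an implicit DISTINCT is done
select 1 from dual union all select 1 from dual;   -- two rows, no DISTINCT processing
that implicit distinct -- and the sort it typically implies -- is the extra work you pay for when UNION ALL would have done.)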
Analytics
Neelz, February 24, 2005 - 5:34 am UTC
Dear Sir,
I had gone through the above examples and was wondering whether analytical functions could be used when aggregating multiple columns from a table,
CREATE TABLE T (
SUPPLIER_CD CHAR(4) NOT NULL,
ORDERRPT_NO CHAR(8) NOT NULL,
ORDER_DATE CHAR(8) NOT NULL,
STORE_CD CHAR(4) NOT NULL,
POSITION_NO CHAR(3) NOT NULL,
CONTORL_FLAG CHAR(2),
ORDERQUANTITY_EXP NUMBER(3) DEFAULT (0) NOT NULL,
ORDERQUANTITY_RES NUMBER(3) DEFAULT (0) NOT NULL,
ENT_DATE DATE DEFAULT (SYSDATE) NOT NULL,
UPD_DATE DATE DEFAULT (SYSDATE) NOT NULL,
CONSTRAINT PK_T PRIMARY KEY(SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE, STORE_CD));
CREATE INDEX IDX_T ON T (SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE);
insert into t values('5636','62108373','20041129','0007','2','00',1,1, to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108373','20041129','0012','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'), to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0014','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0015','3','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1309','4','00',8,8,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1312','12' ,'00',8,8,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1313','13' ,'00',12,12,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
Currently the following query is used:-
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
The execution plan when this query is executed on the real table which has 4m records is : -
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=103002 Card=3571095 Bytes=107132850)
1 0 SORT (GROUP BY NOSORT) (Cost=103002 Card=3571095 Bytes=107132850)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=103002 Card=3571095 Bytes=107132850)
3 2 INDEX (FULL SCAN) OF 'IDX_T' (NON-UNIQUE) (Cost=26942 Card=3571095)
Could you please tell me whether analytical functions could be used over here or a better approach for this query.
Thanks for your great help
February 24, 2005 - 5:49 am UTC
there would be no need of analytics here. analytics would be useful to get the 'aggregates' while preserving the 'details'
eg:
select empno, sal, sum(sal) over (partition by deptno)
from emp;
shows the empno, their sal and the sum of all salaries in their dept. that would be instead of coding:
select empno, sal, sum_sal
from emp, (select deptno, sum(sal) sum_sal from emp group by deptno) t
where emp.deptno = t.deptno
/
I was just wondering
A reader, February 24, 2005 - 6:25 am UTC
how would analytics help in the following example (the data nodes are implemented as rows in a table with two columns as pointers: split-from and merge-to, and the third column is "value", some number, not shown on diagram):
http://img23.exs.cx/my.php?loc=img23&image=directedgraph11th.png
The task is to use this directed dependency graph and prorate the "value" column in each row/node in the following way:
foreach node
-start with a node, for example 16
-visit each hierarchy on which 16 depends, in this case hierarchies for 14 and 15, SUM their values and the current value of node 16, and that will be new, prorated value for node 16
-repeat this recursively for each sub-hierarchy
until all nodes are prorated
I was thinking maybe to use combination of sys_connect_by_path and AF but not sure how. Any thoughts?
February 24, 2005 - 6:51 am UTC
you won't get very far with that structure in 9i and before. connect by "loop" will be an error you see lots of with a directed graph.
analytics won't be appropriate either, they work on windows - not on hierarchies.
sys_connect_by_path is going to give you a string, not a sum
a scalar subquery in 10g with NOCYCLE on the query might work.
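a rough, untested sketch of that idea -- the table and column names (graph: node_id, split_from, merge_to, value) and the edge directions are assumptions based on the description above:
select g.node_id,
       ( select sum(x.value)
           from graph x
          start with x.node_id = g.node_id
        connect by nocycle x.node_id = prior x.split_from
                        or x.merge_to = prior x.node_id
       ) prorated_value
  from graph g;
the scalar subquery walks each node's dependency hierarchy, and NOCYCLE suppresses the "connect by loop" error a directed graph would otherwise raise.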
What if there is no closure inside the graph?
A reader, February 24, 2005 - 9:08 am UTC
i.e. if the link between node 9 and 5 is removed, and the link between node 6 and 0 is removed.
Would that make a difference? It would be a tree in that case. How should we proceed if that is the case? I was thinking maybe to use sys_connect_by_path to pack all sub-hierarchies one after another, with the marker in the window being the depth or level. If the level switches from n to 1, that would mean the end of a sub-hierarchy. If the level switches from 1 to 2, that is the beginning of a hierarchy. And then aggregate over a partition inside the hierarchy view. Or is there a better approach?
February 24, 2005 - 9:22 am UTC
Lead/Lag and 0 Sales
Rob H, February 24, 2005 - 1:00 pm UTC
Thanks for all of the help so far. I have run into an issue where I have Companies and Contacts at that company. Here are the tables.
create table SALES_TRANS
(
CUSTOMER_ACCOUNT VARCHAR2(8) ,
STATION_NUMBER VARCHAR2(7) ,
PRODUCT_CODE VARCHAR2(8) ,
QUANTITY NUMBER ,
DATE_ISSUE DATE ,
PRICE NUMBER ,
VALUE NUMBER );
/
Create table COMPANY_CUSTOMER
(
COMPANY_ID NUMBER(9),
CUSTOMER_ACCOUNT VARCHAR2(8));
/
Create table PRODUCT_INFO
(
PRODUCT_CODE VARCHAR2(8) ,
PRODUCT_GROUP VARCHAR2(25),
PRODUCT_DESC VARCHAR2(100)
);
/
Running a query by customer (this select is a view called - SUM_CUST_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
t.CUSTOMER_ACCOUNT,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group, FISCAL_YEAR(DATE_ISSUE)
I get
COMPANY_ID,CUSTOMER_ACCOUNT,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","27294370","Product1",2002,1460.08,0
"F0009631","27294370","Product2",2005,0,27926.31
"F0009631","27294370","Product2",2004,27926.31,18086.17
"F0009631","27294370","Product2",2003,18086.17,47597.05
"F0009631","27294370","Product2",2002,47597.05,0
"F0009631","27294370","Product2",2001,0,0
"F0009631","27294370","Product3",2004,64582.6,51041
"F0009631","27294370","Product3",2003,51041,60225
"F0009631","27294370","Product3",2002,60225,43150
"F0009631","27294370","Product3",2001,43150,50491
"F0009631","27294370","Product3",2000,50491,664
"F0009631","27294370","Product3",1999,664,0
"F0009631","27294370","Product4",2005,2119.1,1708.61
"F0009631","27294370","Product4",2004,1708.61,4050.82
"F0009631","27294370","Product4",2003,4050.82,15662.57
"F0009631","27294370","Product4",2002,15662.57,0
"F0009631","27294370","Product5",2005,0,351.64
"F0009631","27294370","Product5",2004,351.64,5873.61
"F0009631","27294370","Product5",2003,5873.61,2548.83
"F0009631","27294370","Product5",2002,2548.83,0
"F0009631","27294370","Product6",2004,17347.84,16781.33
"F0009631","27294370","Product6",2003,16781.33,10575
"F0009631","27294370","Product6",2002,10575,3659.67
"F0009631","27294370","Product6",2001,3659.67,4901.67
"F0009631","27294370","Product6",2000,4901.67,4073.47
"F0009631","27294370","Product6",1999,4073.47,0
"F0009631","27294370","Product7",2004,5377.5,2588
"F0009631","27294370","Product7",2003,2588,245
"F0009631","27294370","Product7",2000,245,0
"F0009631","27340843","Product2",2003,3013.71,0
"F0009631","27340843","Product3",1999,1411,0
"F0009631","27340843","Product5",2003,3254.9,0
Now if I run the same grouping by only company (this select is a view called - SUM_COMPANY_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, p.product_group, FISCAL_YEAR(DATE_ISSUE)
we get
COMPANY_ID,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","Product1",2002,1460.08,0
"F0009631","Product2",2005,0,27926.31
"F0009631","Product2",2004,27926.31,21099.88
"F0009631","Product2",2003,21099.88,47597.05
"F0009631","Product2",2002,47597.05,0
"F0009631","Product2",2001,0,0
"F0009631","Product3",2004,64582.6,51041
"F0009631","Product3",2003,51041,60225
"F0009631","Product3",2002,60225,43150
"F0009631","Product3",2001,43150,50491
"F0009631","Product3",2000,50491,2075
"F0009631","Product3",1999,2075,0
"F0009631","Product4",2005,2119.1,1708.61
"F0009631","Product4",2004,1708.61,4050.82
"F0009631","Product4",2003,4050.82,15662.57
"F0009631","Product4",2002,15662.57,0
"F0009631","Product5",2005,0,351.64
"F0009631","Product5",2004,351.64,9128.51
"F0009631","Product5",2003,9128.51,2548.83
"F0009631","Product5",2002,2548.83,0
"F0009631","Product6",2004,17347.84,16781.33
"F0009631","Product6",2003,16781.33,10575
"F0009631","Product6",2002,10575,3659.67
"F0009631","Product6",2001,3659.67,4901.67
"F0009631","Product6",2000,4901.67,4073.47
"F0009631","Product6",1999,4073.47,0
"F0009631","Product7",2004,5377.5,2588
"F0009631","Product7",2003,2588,245
"F0009631","Product7",2000,245,0
The problem is that if I
select * from SUM_CUST_TRANS_PRODUCT_FY_V where fiscal_year=2004
Customer 27340843 will not show up (no 2004 purchases), but that also means the total_VALUE_pre_y for 2004 will never summarize by customer to the total_VALUE_pre_y for 2004 for the company. Is there a better way to do this? The goal is that we can show current year sales vs previous year sales by company, by customer, and potentially a larger summary higher than company (city).
I guess the idea would be that I could somehow show, for all customers in a company, all years and all products in which the company has purchases (a cartesian) for every purchasing year. This, I think, is difficult for large customer and sales transaction tables.
ie
"F0009631","27340843","Product2",2004,0,3013.71 <--- ***
"F0009631","27340843","Product2",2003,3013.71,0
*** This row doesn't exist in the customer view. There are no 2004 sales, so doesn't appear, but we would like to see it so that the year previous shows.
I would love to "attach" some of the transactions if it would help. Is there a better way?
hierarchical cubes + MV?
Rob H, February 25, 2005 - 2:52 pm UTC
Would hierarchical cubes and an MV be the solution? It seems like a lot of metadata to create. We would have to create it for all customers, for all years, for all product groups.
February 25, 2005 - 6:40 pm UTC
if you have "missing data", the only way i know to "make it up" is an outer join (partitioned outer joins in 10g rock, removing the need to create cartesian products of every dimension first)
Neelz, February 27, 2005 - 2:32 am UTC
Dear Sir,
This is with regard to my previous post, the 5th one above this.
<quote>
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
</quote>
As you mentioned, analytics could not be used, but could you please advise me on my problem.
The query is in fact big; for brevity I just put in a few columns. The actual query is
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
.....
.....
.....
.....
.....
.....
SUM(DECODE(RTRIM(POSITION_NO),'197',ORDERQUANTITY_RES,0)) Q197,
SUM(DECODE(RTRIM(POSITION_NO),'198',ORDERQUANTITY_RES,0)) Q198,
SUM(DECODE(RTRIM(POSITION_NO),'199',ORDERQUANTITY_RES,0)) Q199,
SUM(DECODE(RTRIM(POSITION_NO),'200',ORDERQUANTITY_RES,0)) Q200,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
As you could see there is a definite pattern on the sum function. Could you please help me in tuning this query?
Thanks in advance
February 27, 2005 - 8:32 am UTC
you are doing a pivot -- looks great to me. It is "classic"
Neelz, February 27, 2005 - 9:51 am UTC
Dear Sir,
I am sorry if you felt like that. It is quite a new world for me here; I started visiting this site 3-4 months back, then realized the enormity of it and it has become like an addiction. I bought both books by you and started working through them, and am reading the Oracle concepts guide. Every day I try many times to ask a question but till now no luck, might be because of the timezone difference.
Coming back to my question: since it is a huge query and was taking 35 min to execute, after reading through many articles here and in the books I was really confused as to what approach I should take. Still am. Analytic functions (not useful, as you told me), function based indexes (no, because we have Standard Edition), materialized views (no, because it's an OLTP system), stored SQL functions, the DETERMINISTIC keyword, user defined aggregates, optimizer hints... at present it is all confusing for me.
I am working on it with different approaches and could reduce the execution time to 9.08 minutes. The query was written with an index hint earlier; by removing it, the execution time dropped to 9+ minutes.
I was wondering whether you could advise on what approach I should take.
Thanks for your valuable time,
February 27, 2005 - 10:04 am UTC
if that is taking 35 minutes you either
a) have the memory settings like pga_aggregate_target/sort_area_size set way too low
b) you have billions of records that are hundreds of bytes in width
c) really slow disks
d) an overloaded system
I mean -- that query is pretty "simple" full scan, aggregate, nothing to it -- unless it is a gross simplification, it should not take 35 minutes. Can you trace it with the 10046 level 12 trace and post the tkprof section that is relevant to just this query with the waits and all?
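(for reference -- the usual recipe, trace file name made up:
alter session set events '10046 trace name context forever, level 12';
-- run the query, then:
alter session set events '10046 trace name context off';
$ tkprof ora_12345.trc query.prf sys=no
and look at the fetch line and the wait events for the statement.)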
Neelz, February 27, 2005 - 10:56 am UTC
Dear Sir,
Thank you for your kind reply,
This report is taken from the development system.
I used alter session set events '10046 trace name context forever, level 12'. The query execution time was 00:08:15.03
select
supplier_cd, orderrpt_no, order_date,
sum(decode(rtrim(position_no),'1',orderquantity_res,0)) q1,
sum(decode(rtrim(position_no),'2',orderquantity_res,0)) q2,
sum(decode(rtrim(position_no),'3',orderquantity_res,0)) q3,
sum(decode(rtrim(position_no),'4',orderquantity_res,0)) q4,
sum(decode(rtrim(position_no),'5',orderquantity_res,0)) q5,
.....
.....
sum(decode(rtrim(position_no),'197',orderquantity_res,0)) q197,
sum(decode(rtrim(position_no),'198',orderquantity_res,0)) q198,
sum(decode(rtrim(position_no),'199',orderquantity_res,0)) q199,
sum(decode(rtrim(position_no),'200',orderquantity_res,0)) q200,
sum(orderquantity_res) order_total
from
t
group by
supplier_cd, orderrpt_no, order_date
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.04 0 0 0 0
Execute 2 0.02 0.04 0 0 0 0
Fetch 15 431.55 488.37 37147 36118 74 211
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 18 431.60 488.46 37147 36118 74 211
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66
Rows Row Source Operation
------- ---------------------------------------------------
211 SORT GROUP BY
4205484 TABLE ACCESS FULL T
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 16 0.00 0.00
SQL*Net more data to client 30 0.00 0.00
db file sequential read 3 0.04 0.05
db file scattered read 2280 0.78 30.62
direct path write 4 0.00 0.00
direct path read 147 0.05 1.45
SQL*Net message from client 16 140.57 166.58
SQL*Net break/reset to client 2 0.01 0.01
********************************************************************************
Thank you
February 27, 2005 - 11:13 am UTC
that is 8 minutes?
but I see some writes to temp here -- for 211 aggregated rows, perhaps your sort/pga is set small
Also, why do you need to rtrim() 4,205,484 rows? (and why is something called POSITION_NO stored in a string?) is that rtrim there "just in case" or is it really needed? why would it have trailing blanks, and is that not a data integrity issue that needs to be fixed?
(but this is an 8 minute query, not a 35 minute query, if it takes longer on production -- it'll be because it is waiting for something -- like IO...)
Neelz, February 27, 2005 - 11:30 am UTC
Dear Sir,
This is a 3rd party application and the query was written with an index hint earlier. After removing the hint, the query execution time reduced to 8 min. Regarding the rtrim, I have to check with the team if it is really needed. I will try the trace on production tomorrow.
And at last I could see the link for "Submit a New Question"! I think I should try around 1:00 AM.
Thanking You a lot
February 27, 2005 - 11:31 am UTC
depends on your time zone, rarely am I up at 1am east coast (gmt-5) time doing this stuff!
Miki, March 02, 2005 - 8:51 am UTC
Tom,
I need to produce a moving average which has an even window size. If I want a 28 sized window, I need to look backward 14 but I need the first value of the window to be divided by 2 and I need to look forward 14 and the last value of the window to be divided by 2 also.
(a1/2+a2+...+a28+a29/2)/28
How could I accomplish it with the function:
avg() over(...)?
Thanks in advance
March 02, 2005 - 10:03 am UTC
this is the first thought that popped into my head:
a) get the sum(val) over 13 before and 13 after (27 rows possible).
b) get the lag(val,14)/2 and lead(val,14)/2
c) add those three numbers
d) divide by the count of non-null VALS observed (count(val) 13 before/after + 1 if lag is not null + 1 if lead is not null)
ops$tkyte@ORA9IR2> create table t
2 as
3 select rownum id, object_id val
4 from all_objects
5 where rownum <= 30;
Table created.
<b>so, this was my "debug" query, just to see the data:</b>
ops$tkyte@ORA9IR2> select id,
2 sum(val) over
(order by id rows between 13 preceding and 13 following) sum,
3 count(val) over
(order by id rows between 13 preceding and 13 following)+
4 decode(lag(val,14) over (order by id),null,0,1)+
5 decode(lead(val,14) over (order by id),null,0,1) cnt,
6 lag(id,14) over (order by id) lagid,
7 lag(val,14) over (order by id) lagval,
8 lead(id,14) over (order by id) leadid,
9 lead(val,14) over (order by id) leadval
10 from t
11 order by id;
ID SUM CNT LAGID LAGVAL LEADID LEADVAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
1 218472 15 15 6399
2 224871 16 16 19361
3 244232 17 17 23637
4 267869 18 18 14871
5 282740 19 19 20668
6 303408 20 20 18961
7 322369 21 21 15767
8 338136 22 22 20654
9 358790 23 23 7065
10 365855 24 24 17487
11 383342 25 25 11077
12 394419 26 26 20772
13 415191 27 27 15505
14 430696 28 28 12849
15 425648 29 1 17897 29 23195
16 441314 29 2 7529 30 18523
17 436505 28 3 23332
18 422306 27 4 14199
19 399409 26 5 22897
20 389266 25 6 10143
21 365728 24 7 23538
22 342135 23 8 23593
23 332316 22 9 9819
24 320581 21 10 11735
25 303084 20 11 17497
26 295369 19 12 7715
27 276010 18 13 19359
28 266791 17 14 9219
29 260392 16 15 6399
30 241031 15 16 19361
30 rows selected.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id,
2 (sum(val) over
(order by id rows between 13 preceding and 13 following)+
3 nvl(lag(val,14) over (order by id)/2,0)+
4 nvl(lead(val,14) over (order by id)/2,0))/
5 nullif(
6 count(val) over
(order by id rows between 13 preceding and 13 following)+
7 decode(lag(val,14) over (order by id),null,0,1)+
8 decode(lead(val,14) over (order by id),null,0,1)
9 ,0) avg
10 from t
11 order by id;
ID AVG
---------- ----------
1 14778.1
2 14659.4688
3 15061.7941
4 15294.6944
5 15424.9474
6 15644.425
7 15726.3095
8 15839.2273
9 15753.1522
10 15608.2708
11 15555.22
12 15569.4231
13 15664.5741
14 15611.4464
15 15386
16 15666.8966
17 16006.1071
18 15903.9074
19 15802.2115
20 15773.5
21 15729.0417
22 15388.3261
23 15328.4318
24 15545.1667
25 15591.625
26 15748.7632
27 15871.6389
28 15964.7353
29 16474.4688
30 16714.1
30 rows selected.
ops$tkyte@ORA9IR2>
<b>I did not do a detailed check of the results -- but that should get you going (remember -- there are 29 rows -- 14+1+14!!! and beware NULLs)</b>
Miki, March 02, 2005 - 10:54 am UTC
Tom,
Your answer is excellent. That is - almost - what I needed.
If my window size is odd I can simply use the avg() over() function. I am looking for a solution where I can also use avg() over() instead of sum() over()/count() when the window size is even.
Is it possible?
Thank you!
March 02, 2005 - 11:15 am UTC
if you want to do things to row 1 and row 29 in the window "special" like this -- this was the only thing I thought of.
Miki, March 02, 2005 - 11:18 am UTC
Thank you! I will use your recommended code.
consecutive days... 8.1.7
Dean, March 09, 2005 - 1:07 pm UTC
create table day_cd
(dt date
,cd varchar2(2))
/
insert into day_cd values ('08-MAR-05', 'BD');
insert into day_cd values ('09-MAR-05', 'AD');
insert into day_cd values ('10-MAR-05', 'AD');
insert into day_cd values ('11-MAR-05', 'AD');
insert into day_cd values ('12-MAR-05', 'AD');
insert into day_cd values ('13-MAR-05', 'AD');
insert into day_cd values ('14-MAR-05', 'CD');
insert into day_cd values ('15-MAR-05', 'CD');
insert into day_cd values ('16-MAR-05', 'AD');
insert into day_cd values ('17-MAR-05', 'AD');
insert into day_cd values ('18-MAR-05', 'AD');
insert into day_cd values ('19-MAR-05', 'CD')
/
SELECT * FROM DAY_CD;
DT CD
--------- --
08-MAR-05 BD
09-MAR-05 AD
10-MAR-05 AD
11-MAR-05 AD
12-MAR-05 AD
13-MAR-05 AD
14-MAR-05 CD
15-MAR-05 CD
16-MAR-05 AD
17-MAR-05 AD
18-MAR-05 AD
19-MAR-05 CD
I'd like to count the occurrences of each code, treating consecutive days with the same code as one occurrence.
So that the output would be:
CD OCCURRENCES
-- -----------
AD 2
BD 1
CD 2
nevermind...
Dean, March 09, 2005 - 1:59 pm UTC
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
we were responding at the same time...
Dean, March 09, 2005 - 2:01 pm UTC
:)
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
CD COUNT(*)
-- ----------
AD 2
BD 1
CD 2
Thanks for all of your help...
max() over() till not the current row
Miki, March 10, 2005 - 4:12 am UTC
Tom,
I have the following input
DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 1 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 1 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0
When the value of column T is 'T', a rule determines which columns (col1, ..., col4) get 1 or 0.
Unfortunately, with the rule more than one column can get value 1. So, if col1+...+col4 > 1 then I would like colx to be the previous colx where t = 'T' and col1+...+col4 = 1.
So, the output is the following
DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 0 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 0 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0
I tried to use a max() over() function to replace the 'wrong' value but it doesn't work because I can't see the max datum up to the previous record where t = 'T' and col1+...+col4 = 1
...
case when t = 'T' and col1+...+col4 > 1 and
greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col1,
...
case when t = 'T' and col1+...+col4 > 1 and
greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col4 ...
Could you give me a solution to my problem?
Thanks in advance
miki
Miki, March 10, 2005 - 8:09 am UTC
Here is my table populated with data:
create table T
(
DATUM DATE,
T VARCHAR2(1),
COL1 NUMBER,
COL2 NUMBER,
COL3 NUMBER,
COL4 NUMBER
);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('16-01-2005 13:17:46', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-01-2005 17:23:13', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-03-2005 02:59:17', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('11-12-2004 21:59:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('10-01-2005 12:00:22', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('24-02-2005 02:36:51', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('08-12-2004 11:21:15', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('07-01-2005 20:52:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-02-2005 23:44:33', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-03-2005 16:25:12', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 19:02:28', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('22-01-2005 11:21:41', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-01-2005 15:32:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-12-2004 03:07:10', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('21-02-2005 16:25:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 01:02:39', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('15-12-2004 05:49:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-02-2005 14:35:34', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-12-2004 15:01:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
commit;
select t.* from t t
order by 1;
DATUM T COL1 COL2 COL3 COL4
1 2004.12.02. 15:01:42 0 0 0 0
2 2004.12.08. 11:21:15 0 0 0 0
3 2004.12.11. 21:59:18 T 1 0 0 0
4 2004.12.15. 5:49:26 0 0 0 0
5 2004.12.19. 3:07:10 0 0 0 0
6 2005.01.01. 1:02:39 T 0 1 0 0
7 2005.01.01. 19:02:28 0 0 0 0
8 2005.01.04. 17:23:13 T 1 1 0 0
9 2005.01.07. 20:52:26 0 0 0 0
10 2005.01.10. 12:00:22 0 0 0 0
11 2005.01.16. 13:17:46 0 0 0 0
12 2005.01.19. 15:32:18 T 1 1 0 0
13 2005.01.22. 11:21:41 0 0 0 0
14 2005.02.02. 23:44:33 0 0 0 0
15 2005.02.04. 14:35:34 T 0 1 0 0
16 2005.02.21. 16:25:42 0 0 0 0
17 2005.02.24. 2:36:51 0 0 0 0
18 2005.03.01. 2:59:17 0 0 0 0
19 2005.03.04. 16:25:12 T 1 0 0 0
Lines 8 and 12 have more than one column that contains 1.
So, I need to "copy" every colx from line 6, because it is the closest prior line (ordered by datum) that has value 'T' for column T and only one colx with value 1.
Thank you
March 10, 2005 - 8:28 am UTC
ops$tkyte@ORA9IR2> select t, col1, col2, col3, col4,
2 substr(max(data) over (order by datum),11,1) c1,
3 substr(max(data) over (order by datum),12,1) c2,
4 substr(max(data) over (order by datum),13,1) c3,
5 substr(max(data) over (order by datum),14,1) c4,
6 case when col1+col2+col3+col4 > 1 then '<---' end fix
7 from (
8 select t.*,
9 case when t = 'T' and col1+col2+col3+col4 = 1
10 then to_char(row_number() over (order by datum) ,'fm0000000000') || col1 || col2 || col3 || col4
11 end data
12 from t
13 )
14 order by datum;
T COL1 COL2 COL3 COL4 C C C C FIX
- ---------- ---------- ---------- ---------- - - - - ----
0 0 0 0
0 0 0 0
T 1 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
T 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 1 0 0 0 1 0 0 <---
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 1 0 0 0 1 0 0 <---
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 0 0 0 1 0 0 0
19 rows selected.
Great!
Miki, March 10, 2005 - 9:38 am UTC
Great solution!
Thank you, it is just what I expected.
book on Analytics
A reader, March 10, 2005 - 11:07 am UTC
Hi Tom,
It is high time that you publish the book on 'Analytic functions' - there is a lot one can do with these, but very few people are fully aware of it.
When is this book due?
thanks
A variation of Dean's question ...
Julius, March 10, 2005 - 8:13 pm UTC
create table tt (
did number,
dd date,
status number);
alter table tt add constraint tt_pk primary key (did,dd) using index;
insert into tt values (-111,to_date('03/03/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/04/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/05/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/06/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-111,to_date('03/08/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/09/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/10/2005','mm/dd/yyyy'),22);
insert into tt values (-222,to_date('03/04/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/05/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/06/2005','mm/dd/yyyy'),77);
insert into tt values (-222,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/08/2005','mm/dd/yyyy'),55);
insert into tt values (-222,to_date('03/09/2005','mm/dd/yyyy'),11);
I need a query which would return the following result set, where days_in_status is a count of the consecutive days the did has been in its current status (dd values are days only). I've been trying to use analytics but without much success so far. Any ideas? Thanks!!
DID DD STATUS DAYS_IN_STATUS
----- ---------- ------ --------------
-111 03/10/2005 22 3
-222 03/09/2005 11 1
March 10, 2005 - 9:04 pm UTC
ops$tkyte@ORA9IR2> select did, max(dd), count(*)
2 from (
3 select x.*, max(grp) over (partition by did order by dd desc) maxgrp
4 from (
5 select tt.*,
6 case when lag(status) over (partition by did order by dd desc) <> status
7 then 1
8 end grp
9 from tt
10 ) x
11 )
12 where maxgrp is null
13 group by did
14 /
DID MAX(DD) COUNT(*)
---------- --------- ----------
-222 09-MAR-05 1
-111 10-MAR-05 3
is one approach...
SQL Query
a reader, March 15, 2005 - 6:50 pm UTC
Hi Tom,
create table a
(accno number(8) not null,
amount_paid number(7) not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into a values (5, 3000);
Could you please help me in writing the following query without using rownum or analytics:
list the accno corresponding to the maximum amount paid. In case of more than one account having the same max amount paid, list any one.
I am expecting the result to be accno 4 or 5
Thanks for your time.
Regards
March 15, 2005 - 9:27 pm UTC
sounds like homework.
I give a similar quiz question in interviews (find the most frequently occurring month)
tkyte@ORA8IW> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno
2 from a;
ACCNO
-----------------------------------------
5
is one possible approach (assuming that amount_paid is positive)
tkyte@ORA8IW> select max(accno)
2 from a
3 where amount_paid = ( select max(amount_paid) from a );
MAX(ACCNO)
----------
5
is another (that would work well if amount_paid,accno were indexed....)
negatives to worry about ...
Gabe, March 15, 2005 - 9:56 pm UTC
SQL> select * from a;
ACCNO AMOUNT_PAID
---------- -----------
1 -2
2 -1
SQL> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno from a;
ACCNO
-----------------------------------------
21
March 15, 2005 - 10:04 pm UTC
....
(assuming that amount_paid is positive)
.......
that was caveated and why I gave two answers ;)
cannot read ...
Gabe, March 15, 2005 - 10:57 pm UTC
Sorry about that ... missed it completely.
following an idea of mikito ...
Matthias Rogel, March 16, 2005 - 8:11 am UTC
1 select accno
2 from a
3 start with amount_paid = (select max(amount_paid) from a)
4 and accno = (select min(accno) from a where amount_paid = (select max(amount_paid) from a))
5* connect by prior null is not null
SQL> /
ACCNO
----------
4
would be a third solution
March 16, 2005 - 8:38 am UTC
there are many solutions -- this one would win a Rube Goldberg award though :)
another query using analytics
A reader, March 29, 2005 - 11:32 am UTC
I've got 2 tables, t1 and t2.
t1(1 column):
t1.x(int ,primary key)
1
2
3
and t2(3 columns,index on t2.y):
t2.x(int) t2.y(int) t2.z(int)
1 7000 1
1 7000 6
1 8000 8
2 7000 1
2 7000 5
3 7000 3
3 8000 1
3 8000 7
3 9000 5
I would like to have a report like this:
t1.x t2.y count min max
1 7000 2 1 8
1 8000 1 1 8
2 7000 2 1 5
3 7000 1 1 7
3 8000 2 1 7
3 9000 1 1 7
What I came up with is:
select distinct t1.x,t2.y,
count(*) over (partition by t1.x,t2.y) as count,
min(t2.z) over (partition by t1.x) as min,
max(t2.z) over (partition by t1.x) as max
from t1, t2
where t1.x = t2.x;
I was wondering if this query is good enough, or if there's a better way (in terms of performance) to write it. I'm new to analytics, and your help would be very much appreciated.
March 29, 2005 - 12:25 pm UTC
we could probably do this in analytics without the distinct, something like
select t1.x, t2.y, t2.cnt,
       min(t2.mn) over (partition by t1.x) mn,
       max(t2.mx) over (partition by t1.x) mx
  from t1, (select x, y, count(*) cnt, min(z) mn, max(z) mx
              from t2 group by x, y ) t2
 where t1.x = t2.x;
that is, the min/max() of Z get pushed down into the inline view (so they survive the group by) and the analytic then spreads them across the whole T1.X partition.
Analytics problem
Mark, April 08, 2005 - 12:19 pm UTC
Hi Tom,
I have a problem whose solution I'm pretty sure involves analytic functions. I've been struggling with it for some time, but analytics are new to me. I want to go from this:
/* create and inserts */
create table test.test (ordernum varchar2(10),
tasktype char(3),
feetype varchar2(20),
amount number(10,2));
insert into test.test(ordernum, tasktype, feetype, amount)
values('123123', 'DOC', 'Product Fee', 15);
insert into test.test(ordernum, tasktype, feetype, amount)
values('123123', 'DOC', 'Copy Fee', 1);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'Setup Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'File Review Fee', 27);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'Statutory Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
values('56432', 'DOC', 'Product Fee', 80);
insert into test.test(ordernum, tasktype, feetype, amount)
values('56432', 'DOC', 'Prepayment', -16);
SQL> select tasktype, ordernum, feetype, amount from test.test;
TAS ORDERNUM FEETYPE AMOUNT
--- ---------- -------------------- ----------
DOC 123123 Product Fee 15
DOC 123123 Copy Fee 1
COS 34864 Setup Fee 23
COS 34864 File Review Fee 27
COS 34864 Statutory Fee 23
DOC 56432 Product Fee 80
DOC 56432 Prepayment -16
...to this:
TAS ORDERNUM FEE1 FEE2 FEE3 FEE4 FEE5
--- -------- ----------- -------- ---------- -------- --------
DOC Product Fee Copy Fee Prepayment
DOC 123123 15 1
DOC 56432 80 -16
COS Setup Fee File Review Fee Statutory Fee
COS 34864 23 27 23
Allow me to explain. For each tasktype I would like a heading row, which, going across, contains all the feetypes found in test.test for that particular tasktype. There should never be more than five feetypes.
For each ordernum under each tasktype, I would like to have the amounts going across, underneath the appropriate feetypes.
I'm pretty sure my solution involves the lag and/or lead functions, partitioning over tasktype. I particularly seem to have trouble wrapping my brain around the problem of how to get a distinct ordernum while keeping intact the data in other columns (where ordernums duplicate).
I hope my explanation is clear enough.
Hope you can help. Thanks in advance. I will continue working on this.
April 08, 2005 - 12:51 pm UTC
ops$tkyte@ORA9IR2> with columns
2 as
3 (select tasktype, feetype, row_number() over (partition by tasktype order by feetype) rn
4 from (select distinct tasktype, feetype from test )
5 )
6 select a.tasktype, a.ordernum,
7 to_char( max( decode( rn, 1, amount ) )) fee1,
8 to_char( max( decode( rn, 2, amount ) )) fee2,
9 to_char( max( decode( rn, 3, amount ) )) fee3,
10 to_char( max( decode( rn, 4, amount ) )) fee4,
11 to_char( max( decode( rn, 5, amount ) )) fee5
12 from test a, columns b
13 where a.tasktype = b.tasktype
14 and a.feetype = b.feetype
15 group by a.tasktype, a.ordernum
16 union all
17 select tasktype, null,
18 ( max( decode( rn, 1, feetype ) )) fee1,
19 ( max( decode( rn, 2, feetype ) )) fee2,
20 ( max( decode( rn, 3, feetype ) )) fee3,
21 ( max( decode( rn, 4, feetype ) )) fee4,
22 ( max( decode( rn, 5, feetype ) )) fee5
23 from columns
24 group by tasktype
25 order by 1 desc, 2 nulls first
26 /
TAS ORDERNUM FEE1 FEE2 FEE3 FEE4 FEE5
--- ---------- --------------- --------------- --------------- ---- ----
DOC Copy Fee Prepayment Product Fee
DOC 123123 1 15
DOC 56432 -16 80
COS File Review Fee Setup Fee Statutory Fee
COS 34864 27 23 23
of course. :)
(suggestion, break it out, run each of the bits to see what they do. basically, columns is a view used to "pivot" on -- we needed to assign a column number to each FEETYPE by TASKTYPE. That is all that view does.
Then, we join that to test and "pivot" naturally.
Union all in the pivot of the column names....
and sort)
RE: Analytics problem
Mark, April 08, 2005 - 1:27 pm UTC
Excellent! I'll definitely break it down to figure out exactly what you did. Thank you very much.
Re: “another query using analytics”
Gabe, April 08, 2005 - 3:27 pm UTC
You weren't given any resources ... so, I understand your solution was in fact merely an [untested] suggestion.
create table t1 ( x int primary key );
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
create table t2 ( x int not null references t1(x), y int not null, z int not null );
insert into t2 values ( 1,7000,1);
insert into t2 values ( 1,7000,6);
insert into t2 values ( 1,8000,8);
insert into t2 values ( 2,7000,1);
insert into t2 values ( 2,7000,5);
insert into t2 values ( 3,7000,3);
insert into t2 values ( 3,8000,1);
insert into t2 values ( 3,8000,7);
insert into t2 values ( 3,9000,5);
My solution (avoiding the distinct) is not necessarily better than the one presented by the “A reader”, but here it goes:
flip@FLOP> select x, y, c
2 ,min(f) over (partition by x) f
3 ,max(l) over (partition by x) l
4 from (
5 select t2.x, t2.y, count(*) c
6 ,min(t2.z) keep (dense_rank first order by t2.z) f
7 ,max(t2.z) keep (dense_rank last order by t2.z) l
8 from t1, t2
9 where t1.x = t2.x
10 group by t2.x, t2.y
11 ) t
12 ;
X Y C F L
---------- ---------- ---------- ---------- ----------
1 7000 2 1 8
1 8000 1 1 8
2 7000 2 1 5
3 7000 1 1 7
3 8000 2 1 7
3 9000 1 1 7
Cheers.
April 08, 2005 - 3:34 pm UTC
without create tables and inserts, I guess :)
takes too much time to create the setup for every case (wish people would read the page that they have to page down through to put something up here...)
I'm confused
Mikito, April 18, 2005 - 9:55 pm UTC
Given that
select distinct deptno
from emp
is essentially
select deptno
from emp
group by deptno
how is distinct query should be rewritten in case with analytics columns? Neither
SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,min(sal) over (partition by deptno);
nor
SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,f;
seems to be a valid syntax.
(To repeat: "Does analytics scale?")
April 19, 2005 - 7:22 am UTC
why would you use analytics that way?
Tell us the question, we'll tell you the method.
select deptno, count(*) /* because count(1) is counter-intuitive */,
min(sal) over (partition by deptno) f
from emp
group by deptno, min(sal) over (partition by deptno)
would not make sense. You are saying "get all deptnos, by deptno find the minimum salary and associate that number with each one, then aggregate by deptno/min salary to count records"
You should just ask:
find the minimum salary and count of records by deptno.
select deptno, count(*), min(sal) from emp group by deptno;
is what you were looking for. analytics scale up wonderfully. Say the question was instead:
you have a table full of records that have a customer_id and a last_sale_date, I would like you to retrieve the last record for each customer.
select *
from ( select cust.*, max(sale_date) over (partition by cust_id) lsd
from cust )
where sale_date = lsd;
versus
select *
from cust
where sale_date =
(select max(sale_date) from cust c2 where cust_id = cust.cust_id )
/
or
select *
from cust, (select cust_id, max(sale_date) lsd from cust group by cust_id)x
where cust.cust_id = x.cust_id
and cust.sale_date = x.lsd
/
for example
Tricky SQL?
A reader, April 19, 2005 - 10:29 am UTC
CREATE TABLE master
(
m_no INTEGER PRIMARY KEY,
m_name VARCHAR2(255) NOT NULL UNIQUE
);
create table detail
(
d_pk integer primary key,
d_no integer not null references master(m_no),
d_date date,
d_data varchar2(255)
);
Given a d_pk, how can I get the second-to-last (ordered by d_date) record from M for that M_NAME? In other words, for a given m_name, there are multiple records in "detail" with different dates. Given one of those records, I want the prior record in "detail" (there might not be any)
I tried to design a simple master detail table, but maybe I over-normalized?
Thanks
April 19, 2005 - 12:00 pm UTC
are you saying "i have a detail record, I want the detail record that came 'in front' of this one"?
that is what I sort of hear, but the second to last is confusing me.
select *
from (
select ...., lead(d_pk) over (order by d_date) next_pk
from master, detail
where master.m_no = (select d_no from detail where d_pk = :x)
and master.m_no = detail.d_no
)
where next_pk = :x;
I think that does that. You get the master/detail for that d_pk (inline view)
Use lead to assign to each record the "next pk" after sorting by d_date
Keep the record whose 'next' records primary key was the one you wanted..
a little inconsistency
mikito, April 19, 2005 - 1:24 pm UTC
I meant inconsistency, not scalability. Why "distinct"
SELECT distinct deptno,
min(sal) over (partition by deptno) f
from emp
is allowed, whereas "group by" isn't? If someone has trouble understanding what analytics with "group by" means, the same should apply to analytics with "distinct" as well.
April 19, 2005 - 1:26 pm UTC
because group by is not distinct, they are frankly very different concepts.
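(when you do combine them, remember the order of evaluation: GROUP BY aggregates first and the analytics then run over the grouped rows, so the analytic may only reference group by expressions and aggregates, e.g.
select deptno, count(*) cnt,
       sum(count(*)) over () total_cnt
  from emp
 group by deptno;
with DISTINCT it is the other way around -- the analytic is computed against every detail row and the distinct is applied to the finished result.)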
detail and summary in one sql statement
A reader, April 27, 2005 - 3:02 pm UTC
hi tom,
quick shot. i have to process many detail records (columns a - f) and one summary record (containing sum(column c) + count(*) over all recs + some literal placeholders) within one sql statement. is there another way than using a classical UNION ALL select? any new way with analytical functions?
April 27, 2005 - 3:22 pm UTC
need small example, did not follow your example as stated.
detail and summary in one sql statement
A reader, April 28, 2005 - 10:08 am UTC
hi tom,
here is the small and simple test case to show what i mean.
SQL> create table t1 (col1 number primary key, col2 number, col3 number);
Table created.
SQL> create table t2 (col0 number primary key, col1 number references t1 (col1), col2 number, col3 number, col4 number);
Table created.
SQL> create index t2_col1 on t2 (col1);
Index created.
SQL> insert into t1 values (1, 1, 1);
1 row created.
SQL> insert into t2 values (1, 1, 1, 1, 1);
1 row created.
SQL> insert into t2 values (2, 1, 2, 2, 2);
1 row created.
SQL> insert into t2 values (3, 1, 3, 3, 3);
1 row created.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> analyze table t2 compute statistics;
Table analyzed.
SQL> select 0 rowtype, t1.col1 display1, t1.col2 display2, t2.col3 display3, t2.col4 display4
2 from t1 join t2 on (t1.col1 = t2.col1)
3 where t1.col1 = 1
4 UNION ALL
5 select 1 rowtype, t1.col1, count (*), null, sum (t2.col4)
6 from t1 join t2 on (t1.col1 = t2.col1)
7 where t1.col1 = 1
8 group by t1.col1
9* order by rowtype
ROWTYPE DISPLAY1 DISPLAY2 DISPLAY3 DISPLAY4
---------- ---------- ---------- ---------- ----------
0 1 1 1 1
0 1 1 2 2
0 1 1 3 3
1 1 3 6
that is creating detail + summary record within one sql statement!
April 28, 2005 - 10:18 am UTC
ops$tkyte@ORA10G> select grouping_id(t1.col2) rowtype,
2 t1.col1 d1,
3 decode( grouping_id(t1.col2), 0, t1.col2, count(*) ) d2,
4 decode( grouping_id(t1.col2), 0, t2.col3, null ) d3,
5 decode( grouping_id(t1.col2), 0, t2.col4, sum(t2.col4) ) d4
6 from t1, t2
7 where t1.col1 = t2.col1
8 group by grouping sets((t1.col1),(t1.col1,t1.col2,t2.col3,t2.col4))
9 /
ROWTYPE D1 D2 D3 D4
---------- ---------- ---------- ---------- ----------
0 1 1 1 1
0 1 1 2 2
0 1 1 3 3
1 1 3 6
detail and summary in one sql statement
A reader, April 29, 2005 - 10:05 am UTC
hi tom,
thanks for your help. that's exactly what I need. analytics rock, analytics roll as you said. :)
unfortunately it is hard to get. :(
I looked in the documentation but cannot understand the grouping_id values in the example. Could you please explain? What is "2" or "3" in the grouping column?
Examples
The following example shows how to extract grouping IDs from a query of the sample table sh.sales:
SELECT channel_id, promo_id, sum(amount_sold) s_sales,
GROUPING(channel_id) gc,
GROUPING(promo_id) gp,
GROUPING_ID(channel_id, promo_id) gcp,
GROUPING_ID(promo_id, channel_id) gpc
FROM sales
WHERE promo_id > 496
GROUP BY CUBE(channel_id, promo_id);
C PROMO_ID S_SALES GC GP GCP GPC
- ---------- ---------- ---------- ---------- ---------- ----------
C 497 26094.35 0 0 0 0
C 498 22272.4 0 0 0 0
C 499 19616.8 0 0 0 0
C 9999 87781668 0 0 0 0
C 87849651.6 0 1 1 2
I 497 50325.8 0 0 0 0
I 498 52215.4 0 0 0 0
I 499 58445.85 0 0 0 0
I 9999 169497409 0 0 0 0
I 169658396 0 1 1 2
P 497 31141.75 0 0 0 0
P 498 46942.8 0 0 0 0
P 499 24156 0 0 0 0
P 9999 70890248 0 0 0 0
P 70992488.6 0 1 1 2
S 497 110629.75 0 0 0 0
S 498 82937.25 0 0 0 0
S 499 80999.15 0 0 0 0
S 9999 267205791 0 0 0 0
S 267480357 0 1 1 2
T 497 8319.6 0 0 0 0
T 498 5347.65 0 0 0 0
T 499 19781 0 0 0 0
T 9999 28095689 0 0 0 0
T 28129137.3 0 1 1 2
497 226511.25 1 0 2 1
498 209715.5 1 0 2 1
499 202998.8 1 0 2 1
9999 623470805 1 0 2 1
624110031 1 1 3 3
April 29, 2005 - 10:21 am UTC
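grouping_id() simply treats the grouping() bits of its arguments as a binary number, leftmost argument first. So gcp = gc*2 + gp: the per-channel subtotal rows have gc=0, gp=1, giving gcp = 01 = 1 and gpc = 10 = 2; the grand total row has both bits set, so 11 = 3. In the grouping sets query above, grouping_id(t1.col2) is therefore 0 on the detail rows and 1 on the summary row -- exactly the ROWTYPE flag we wanted.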
How to do this using Analytics
A reader, May 05, 2005 - 5:11 pm UTC
Hello Sir,
I have a denormalized table dept_emp, part of which I have reproduced here. It has/will have dupes.
I need to find out all emps which belong to more than one dept using analytics (want to avoid a self join).
So the required output must be :
DEPTNO DNAME EMPNO ENAME
------ ---------- ----- --------------------
10 D10 1 E1
10 D10 1 E1
10 D10 2 E2
10 D10 2 E2
20 D20 1 E1
20 D20 1 E1
20 D20 2 E2
20 D20 2 E2
From the total set of :
SELECT * FROM DEPT_EMP ORDER BY DEPTNO ,EMPNO
DEPTNO DNAME EMPNO ENAME
------ ---------- ----- --------------------
10 D10 1 E1
10 D10 1 E1
10 D10 2 E2
10 D10 2 E2
10 D10 3 E3
10 D10 3 E3
20 D20 1 E1
20 D20 1 E1
20 D20 2 E2
20 D20 2 E2
20 D20 4 E4
20 D20 4 E4
20 D20 5 E5
20 D20 5 E5
14 rows selected
create table dept_emp (deptno number , dname varchar2(10) ,empno number ,ename varchar2(20) ) ;
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 2, 'E2');
COMMIT ;
Thanx
May 05, 2005 - 6:10 pm UTC
no analytics
select empno, count(distinct deptno)
from t
group by empno
having count(distinct deptno) > 1;
Thanx Sir
A reader, May 05, 2005 - 9:20 pm UTC
Actually I was planning to use analytics to get the whole row info; will do the same trick with analytics, then.
You are a Genius.
May 06, 2005 - 7:17 am UTC
select *
from (
select t.*, count(distinct deptno) over (partition by empno) cnt
from t
)
where cnt > 1;
Analytical solution
Baiju Menon, May 10, 2005 - 6:29 am UTC
Sir,
I want to list the department and the maximum number of employees working in it, using an analytic function (only the department(s) in which the maximum number of employees are working).
The query without the analytic function is:
select deptno, count(deptno)
  from emp
 group by deptno
having count(deptno) in (select max(count(deptno)) from emp group by deptno);
Thanks
May 10, 2005 - 9:15 am UTC
1 select deptno, cnt
2 from (
3 select deptno, cnt, max(cnt) over() max_cnt
4 from (
5 select deptno, count(*) cnt
6 from emp
7 group by deptno
8 )
9 )
10* where cnt = max_cnt
scott@ORA9IR2> /
DEPTNO CNT
---------- ----------
30 6
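A variation on the same idea, if you prefer one less level of nesting -- a sketch that ranks the aggregated counts directly (untested here, but it keeps all tied departments just like the cnt = max_cnt filter):

select deptno, cnt
  from (select deptno, count(*) cnt,
               rank() over (order by count(*) desc) rnk
          from emp
         group by deptno)
 where rnk = 1;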
group by
Anoop Gupta, May 11, 2005 - 4:15 am UTC
Hi Tom,
I have a table in table data is like this
empid leavelname
1001 Level1
1001 Level2
1001 Level3
1001 Level4
1002 Level1
1002 Level2
1002 Level3
...
...
This table tells which levels an employee is assigned to.
Is there a query possible that will return data like this, without writing a function?
empid emp_assigned on leavel
1001 level1,level2,level3,level4
1002 level1,level2,level3
...
...
Waiting for your response...
May 11, 2005 - 7:30 am UTC
only if there is some reasonable maximum number of levelname rows per empid.
is there?
Analytics Rock - But why are they slower for me
Jeff Plumb, May 13, 2005 - 1:00 am UTC
Hi Tom,
I have followed your example about analytics from Effective Oracle by Design, page 516 (find a specific row in a partition). When I run the example and TKPROF the 3 different queries, the analytic version actually takes a lot longer to run, although it does fewer logical I/Os. It does a lot more physical I/O, so I am guessing that it is using a temporary segment on disk to perform the window sort. To perform the test I created the big_table that you use and populated it with 1,000,000 rows. I am using Oracle 9i Release 2. Here is the output from TKPROF:
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 5.32 6.42 13815 14669 0 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 5.32 6.42 13815 14669 0 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select t.owner, t.object_name, t.created
from big_table t
join (select owner, max(created) maxcreated
from big_table
group by owner) t2
on (t2.owner = t.owner and t2.maxcreated = t.created)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 5.03 5.06 13816 14669 0 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 5.03 5.06 13816 14669 0 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 16.68 40.66 15157 7331 17 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 16.68 40.66 15157 7331 17 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 VIEW
1000000 WINDOW SORT
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
And when I run the query with the analytics using autotrace I get the following which shows a sort to disk:
SQL*Plus: Release 9.2.0.6.0 - Production on Fri May 13 14:53:08 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production
control@DWDEV> set autot traceonly
control@DWDEV> select owner, object_name, created
2 from
3 ( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
4 from big_table
5 )
6 where created = maxcreated;
694 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4399 Card=1000000 Bytes=52000000)
1 0 VIEW (Cost=4399 Card=1000000 Bytes=52000000)
2 1 WINDOW (SORT) (Cost=4399 Card=1000000 Bytes=43000000)
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=637 Card=1000000 Bytes=43000000)
Statistics
----------------------------------------------------------
0 recursive calls
17 db block gets
7331 consistent gets
15348 physical reads
432 redo size
12784 bytes sent via SQL*Net to client
717 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
694 rows processed
So how can I stop the sorts (disk)? I am guessing that pga_aggregate_target needs to be higher, but it already seems to be set quite high.
control@DWDEV> show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 524288000
I hope you can help clarify how to make the analytic version run quicker.
Thanks.
May 13, 2005 - 9:50 am UTC
it'll be a function of the number of "owners" here
You have 1,000,000 records.
You have but 20 distinct owners.
in this extreme case, having 50,000 records per window and swapping out was not as good as squashing the data down to 20 records and joining -- the CBO quite smartly rewrote:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
as
select ...
from big_table t, (select owner,max(created) created from big_table t2 ...)
where ....
So, does the data you analyze to find the "most current record" tend to have 50,000 records/key in real life?
In your case, your hash table didn't spill to disk. In real life though, the numbers would probably be much different: a 1,000,000 row table would have keys with 10 or 100 rows each maybe, not 50,000 (in general). There you would find the answer to be very different.
And if you let the sort run in memory it would be different as well -- you would get a max of about 25m for a single workarea given your pga_aggregate_target setting, which may have been too small.
But consider what happens when the size of the "aggregate" goes up -- diminishing marginal returns set in:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 2.06 2.01 26970 29283 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 2.06 2.01 26970 29283 0 4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 4.57 10.05 30603 14484 15 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 4.57 10.05 30603 14484 15 4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.70 12.04 33787 45393 2 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.71 12.05 33787 45393 2 1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.00 9.60 9336 14484 2 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.00 9.60 9336 14484 2 1000000
and, given sufficient space to work "in memory", these two big queries both benefited:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 1.82 1.96 9909 29283 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 1.83 1.97 9909 29283 0 4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 2.15 2.11 2858 14484 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 2.15 2.11 2858 14484 0 4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.64 7.55 10181 94633 0 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.65 7.56 10181 94633 0 1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 5.69 5.49 2699 14484 0 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 5.69 5.49 2699 14484 0 1000000
(this was a dual cpu xeon using 'nonparallel' query in this case, once with a 256mb pga_aggregate_target and again with a 2gig one)
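If you want to test the in-memory behaviour for a single session in 9i without raising pga_aggregate_target for everyone, one option is to switch that session to manual workarea sizing -- a sketch only; the size is illustrative:

alter session set workarea_size_policy = manual;
alter session set sort_area_size = 104857600;  -- 100mb sort workarea for this session, illustrative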
kuldeep, May 14, 2005 - 3:27 am UTC
Dear Tom,
I have three tables t1, t2 & t3, where t2 & t3 join to t1 on the column "key_id".
Now, for each key_id in table t1, I need the sum of key_val (amount) from t2 and the sum of key_val (amount) from t3.
kuldeep@dlfscg> select * from t1;
KEY_ID KEY_VAL
---------- ----------
2 1980
1 1975
kuldeep@dlfscg> select * from t2;
KEY_ID KEY_VAL
---------- ----------
2 550
2 575
1 500
kuldeep@dlfscg> select * from t3;
KEY_ID KEY_VAL
---------- ----------
2 900
1 1000
1 750
***** QUERY 1 *****
kuldeep@dlfscg> SELECT t1.key_id, SUM(t2.key_val) sum_t2_key_val, SUM(t3.key_val) sum_t3_key_val
2 FROM t1, t2, t3
3 WHERE t1.key_id=t2.key_id
4 AND t1.key_id=t3.key_id
5 GROUP BY t1.key_id
6 /
KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
1 1000 1750
2 1125 1800
***** QUERY 2 *****
kuldeep@dlfscg> SELECT t1.key_id, t2.sum_t2_key_val, t3.sum_t3_key_val
2 FROM t1,
3 (SELECT key_id, SUM(key_val) sum_t2_key_val FROM t2 GROUP BY key_id) t2,
4 (SELECT key_id, SUM(key_val) sum_t3_key_val FROM t3 GROUP BY key_id) t3
5 WHERE t1.key_id=t2.key_id
6 AND t1.key_id=t3.key_id
7 /
KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
1 500 1750
2 1125 900
Query 1 gives the wrong result, and I cannot use query 2, whose performance is very poor.
Oracle 9i has added a lot of new grouping features and a lot of analytic functions (all going over my head).
Is there any "special" sum function, or some way, that picks a value only once per underlying row (per the query's key, here "key_id"), irrespective of how many times that row appears in the query result?
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1000
1 500 750 <---- 500 of t2 should not be calculated, it is repeat
2 550 900
2 575 900 <---- 900 of t3 should not be calculated, it is repeat
thanks and regards,
May 14, 2005 - 9:36 am UTC
select t1.key_id, t2.sum_val, t3.sum_val
from t1,
(select key_id, sum(key_val) sum_val from t2 group by key_id ) t2,
(select key_id, sum(key_val) sum_val from t3 group by key_id ) t3
WHERE t1.key_id=t2.key_id
AND t1.key_id=t3.key_id
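If the inline views bother you, a scalar-subquery form may be worth timing too -- a sketch, untested, assuming key_id is indexed (or the tables are small enough to aggregate quickly):

select t1.key_id,
       (select sum(key_val) from t2 where t2.key_id = t1.key_id) sum_t2_key_val,
       (select sum(key_val) from t3 where t3.key_id = t1.key_id) sum_t3_key_val
  from t1;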
apply an amount across multiple records
Dave, May 15, 2005 - 8:17 pm UTC
I have a problem similar to what I call the invoice payment problem.
It would seem to be a common problem, but I have searched to no avail.
The idea is that a customer may have many outstanding invoices, and sends in a check for an arbitrary amount. So we need to apply the money across the invoices oldest first.
Note that in my specific case, if a payment exceeds the total outstanding, the excess is ignored (obviously not dealing with real money here!)
create table invoices (
cust_nbr integer not null,
invoice_nbr integer not null,
invoice_amt number not null,
payment_amt number not null,
primary key (cust_nbr, invoice_nbr)
);
begin
delete from invoices;
dbms_random.seed(123456789);
for c in 1 .. 2 loop
for i in 1 .. 3 loop
insert into invoices values (c, i, round(dbms_random.value * 10, 2)+1, 0);
end loop;
end loop;
update invoices
set payment_amt = round(dbms_random.value * invoice_amt, 2)
where invoice_nbr = 1;
commit;
end;
/
select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
invoice_amt - payment_amt outstanding_amt
from invoices
where invoice_amt - payment_amt > 0
order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 5.55 3.89
1 2 3.21 0 3.21
1 3 2.78 0 2.78
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
variable cust_nbr number;
variable received_amt number;
begin
:cust_nbr := 1;
:received_amt := 7.25;
end;
/
update invoices i1
set payment_amt = (... some query which applies
:received_amt to outstanding_amt ...)
where cust_nbr = :cust_nbr;
result should be:
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 9.44 0
1 2 3.21 3.21 0
1 3 2.78 .15 2.63
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
This is simple to solve in pl/sql with a cursor, but I thought it would be a good test for a set-based solution with analytics. But after some effort, I'm stumped.
May 16, 2005 - 7:37 am UTC
Using analytics we can see how to apply the inputs:
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
2 least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
3 from (
4 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
5 invoice_amt - payment_amt outstanding_amt,
6 sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
7 from invoices
8 where cust_nbr = :cust_nbr
9 )
10 order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT AMOUNT_TO_APPLY
---------- ----------- ----------- ----------- ---------------
1 1 9.44 5.55 3.89
1 2 3.21 0 3.21
1 3 2.78 0 .15
Just needed a running total of outstanding amounts to take away from the received amount....
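(To see the least/greatest expression at work, take invoice 3: its running total rt is 9.88 and its outstanding_amt is 2.78, so greatest(7.25 - 9.88 + 2.78, 0) = greatest(0.15, 0) = 0.15 and least(0.15, 2.78) = 0.15 -- exactly what is left of the check after the first two invoices are fully paid.)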
Then, merge:
ops$tkyte@ORA9IR2> merge into invoices
2 using
3 (
4 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
5 least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
6 from (
7 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
8 invoice_amt - payment_amt outstanding_amt,
9 sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
10 from invoices
11 where cust_nbr = :cust_nbr
12 )
13 ) x
14 on ( invoices.cust_nbr = x.cust_nbr and invoices.invoice_nbr = x.invoice_nbr )
15 when matched then update set payment_amt = nvl(payment_amt,0)+x.amount_to_apply
16 when not matched /* never happens... */ then insert (cust_nbr) values (null);
3 rows merged.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
2 invoice_amt - payment_amt outstanding_amt
3 from invoices
4 order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 9.44 0
1 2 3.21 3.21 0
1 3 2.78 .15 2.63
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
6 rows selected.
Group by
Anoop Gupta, May 16, 2005 - 10:06 am UTC
Reviewer: Anoop Gupta from INDIA
Hi Tom,
Regarding my question above, about returning each employee's assigned levels as a single comma-separated row without writing a function: yes, there is a reasonable maximum -- suppose we have a limit of 50 levels per employee. Could you give me the way to write that query?
Please reply....
May 16, 2005 - 1:09 pm UTC
select empid,
rtrim(
max(decode(rn,1,leavelname)) || ',' ||
max(decode(rn,2,leavelname)) || ',' ||
....
max(decode(rn,50,leavelname)), ',' )
from (select empid,
row_number() over (partition by empid order by leavelname) rn,
leavelname
from t
)
group by empid;
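(For readers on much later releases: 11.2 introduced LISTAGG, which removes the need for the 50-column decode trick entirely -- a sketch:

select empid,
       listagg(leavelname, ',') within group (order by leavelname) emp_levels
  from t
 group by empid;

On 9i/10g, the decode/row_number approach above, or a user-defined aggregate, is the way to go.)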
special sum
kuldeep, May 17, 2005 - 12:38 am UTC
Dear Tom,
Thanks for your response and for this useful site.
I was looking for a solution that could avoid these inline views, which were making my query run slow. I tried, and came up with this query:
/* DATA VIEW */
kuldeep@dlfscg> SELECT t1.key_id,
2 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
3 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
4 FROM t1, t2, t3
5 WHERE t1.key_id=t2.key_id
6 AND t1.key_id=t3.key_id
7 ORDER BY t1.key_id
8 /
KEY_ID T2_ROWID T2_RN KEY_VAL T3_ROWID T3_RN KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
1 AAANZ5AAHAAAD94AAA 1 500 AAANZ4AAHAAAD9wAAA 1 1000
1 AAANZ5AAHAAAD94AAA 2 500 AAANZ4AAHAAAD9wAAB 1 750
2 AAANZ5AAHAAAD91AAA 1 550 AAANZ4AAHAAAD9tAAA 1 900
2 AAANZ5AAHAAAD91AAB 1 575 AAANZ4AAHAAAD9tAAA 2 900
/* FINAL QUERY */
kuldeep@dlfscg> SELECT key_id,
2 SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
3 SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
4 FROM (SELECT t1.key_id,
5 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
6 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
7 FROM t1, t2, t3
8 WHERE t1.key_id=t2.key_id
9 AND t1.key_id=t3.key_id)
10 GROUP BY key_id
11 /
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1750
2 1125 900
regards,
May 17, 2005 - 8:23 am UTC
one would need more information -- it APPEARS that you are trying to get a "random first hit" from T2 and T3 for each T1.key_id.
That is, for every row in T1 -- find the first match (any match will do) in T2 and in T3, and report that value.
Is that correct?
And how big are t1, t2 and t3 -- and how long is "long", exactly?
group by
Anoop Gupta, May 17, 2005 - 9:42 am UTC
Tom,
Thanks for your prompt response.
Analytical Problem
Imran, May 18, 2005 - 4:16 am UTC
Look at the following two queries.
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17 WHERE phone IN (7629458));
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458 Apr, 2005 2562.52 5265 5265 UP 0
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17 WHERE phone IN (7629459));
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629459 Apr, 2005 3516.62 7834 7834 UP 0
SQL>
Now when I combine the two queries, the results are different.
1 SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17* WHERE phone IN (7629458,7629459))
SQL> /
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458 Apr, 2005 2562.52 5265 2569 DOWN 7834
202 - 7629459 Apr, 2005 3516.62 7834 7834 UP 0
So you can see that the previous-month balance is now badly disturbed.
Please tell me how to do this.
May 18, 2005 - 8:58 am UTC
need test case. create table, insert's (like the page used to submit this said....)
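One hedged guess while the test case is pending: both LEAD calls order the entire result set by MONTH with no PARTITION BY, so as soon as two phones appear in the predicate, a row for one phone can "see" a row of the other phone as its neighbour. If that is the cause, partitioning the window by phone should restore the single-phone behaviour -- a sketch of just the changed expressions, untested:

LEAD (paybefdue, 1, 0) OVER (PARTITION BY phone ORDER BY MONTH DESC) prev_month,
paybefdue
- (LEAD (paybefdue, 1, 0) OVER (PARTITION BY phone ORDER BY MONTH DESC)) up_down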
Use of analytic functions in UPDATE statements
Bob Lyon, May 18, 2005 - 12:29 pm UTC
Tom,
-- Given this sample data
CREATE TABLE GT (
XP_ID INTEGER,
OFFSET INTEGER,
PMAX NUMBER,
PRIOR_PMAX NUMBER
);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 1, 3);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 2, 8);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 155, 3, 5);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 173, 3, 7.3);
-- I want to update the table and set the PRIOR_PMAX column values to be as follows
SELECT XP_ID, OFFSET, PMAX,
LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
ORDER BY XP_ID, OFFSET) PRIOR_PMAX
FROM GT
ORDER BY XP_ID, OFFSET;
XP_ID OFFSET PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
123 1 3
123 2 8 3
155 3 5
173 3 7.3
-- My update to do this tells me "4 rows updated.", but does not do what I want.
UPDATE GT A
SET PRIOR_PMAX = (
SELECT LAG(B.PMAX, 1, NULL) OVER (PARTITION BY B.XP_ID
ORDER BY B.XP_ID, B.OFFSET) PRIOR_PMAX
FROM GT B
WHERE A.ROWID = B.ROWID
);
-- but I get
SELECT xp_id, offset, pmax, prior_pmax
FROM GT
ORDER BY xp_id, offset;
XP_ID OFFSET PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
123 1 3
123 2 8
155 3 5
173 3 7.3
-- Oracle doc states
-- "Therefore, analytic functions can appear only in the select list or ORDER BY clause."
-- which is perhaps a little ambiguous in this case.
-- Is there a way to do this update in "straight SQL"?
May 18, 2005 - 12:54 pm UTC
you can merge
merge into gt a
using ( SELECT rowid rid, XP_ID, OFFSET, PMAX,
               LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
                                        ORDER BY XP_ID, OFFSET) PRIOR_PMAX
          FROM GT ) b
on ( a.rowid = b.rid )
when matched then update set prior_pmax = b.prior_pmax
when not matched then insert ( xp_id ) values ( null );
(the "when not matched" branch never happens -- in 9i you need the dummy insert of a single null; in 10g you can leave the branch off entirely)
special sum
Kuldeep, May 19, 2005 - 1:09 am UTC
My requirement was like this: I have receivables (bills, debit notes, etc.) which I adjust against the received payments and credit notes (both in separate tables). To know the outstanding amount, I was joining (outer join) my receivables with payments and credit notes.
Because one receivable can be adjusted against many payments and credit notes, the outstanding amount is:
outstanding = receivable amount - sum(payment amount) - sum(credit note amount)
This simple query using an outer join gives the wrong result if a receivable is adjusted against one payment and more than one credit note, or vice versa.
In the case where
receivable: 1000   payment: 400   CN: 400, 200
the join produces
1000  400  400
1000  400  200
      ---  ---
      800  600    outstanding = 1000 - 800 - 600 = -400 (wrong)
My t1, t2 and t3 have 600,000, 350,000 and 80,000 rows respectively.
This is my actual inline view query
-----------------------------------
SELECT a.bill_type, a.bill_exact_type, a.period_id,
a.scheme_id, a.property_number, a.bill_number,
a.bill_amount, SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) adj_amt,
NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) pending_amt
FROM ALL_RECEIVABLE a,
(SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adj_amt) adj_amt
FROM CREDIT_NOTE_RECEIVABLE
WHERE bill_type=p_bill_type
AND scheme_id=p_scheme
AND property_number=p_prop
GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) c,
(SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adjust_amount) adjust_amount
FROM PAYMENT_RECEIPT_ADJ
WHERE bill_type=p_bill_type
AND scheme_id=p_scheme
AND property_number=p_prop
GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) p
WHERE a.bill_type=P_BILL_TYPE
AND a.scheme_id=P_SCHEME
AND a.property_number=P_PROP
AND a.bill_type=c.bill_type(+)
AND a.bill_exact_type=c.bill_exact_type(+)
AND a.period_id=c.period_id(+)
AND a.scheme_id=c.scheme_id(+)
AND a.property_number=c.property_number(+)
AND a.bill_number=c.bill_number(+)
AND a.bill_type=p.bill_type(+)
AND a.bill_exact_type=p.bill_exact_type(+)
AND a.period_id=p.period_id(+)
AND a.scheme_id=p.scheme_id(+)
AND a.property_number=p.property_number(+)
AND a.bill_number=p.bill_number(+)
GROUP BY a.bill_type, a.bill_exact_type, a.period_id, a.scheme_id,
a.property_number, a.bill_number, a.bill_date, a.bill_amount
HAVING (NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0))) > 0
ORDER BY a.bill_date;
-----------------------------------
It is not about reporting just the first hit of t1 in t2 and t3. In my last posting I was only trying to exclude any repeated t2 or t3 ROW from the sum calculation -- that is, each row of t2 and t3 should be counted exactly once.
I have tried the query with more rows, and applied the same approach to the actual query; it works fine and gives the same result the previous inline-view query was giving.
kuldeep@dlfscg> SELECT t1.key_id,
2 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
3 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
4 FROM t1, t2, t3
5 WHERE t1.key_id=t2.key_id(+)
6 AND t1.key_id=t3.key_id(+)
7 ORDER BY t1.key_id
8 /
KEY_ID T2_ROWID T2_RN KEY_VAL T3_ROWID T3_RN KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
1 AAANZ5AAHAAAD94AAA 1 500 AAANZ4AAHAAAD9wAAA 1 1000
1 AAANZ5AAHAAAD94AAA 2 500 AAANZ4AAHAAAD9wAAB 1 750
1 AAANZ5AAHAAAD94AAA 3 500 AAANZ4AAHAAAD9wAAC 1 25
2 AAANZ5AAHAAAD91AAA 1 550 AAANZ4AAHAAAD9tAAA 1 900
2 AAANZ5AAHAAAD91AAB 1 575 AAANZ4AAHAAAD9tAAA 2 900
3 AAANZ5AAHAAAD91AAC 1 222 1
3 AAANZ5AAHAAAD91AAD 1 223 2
4 1 AAANZ4AAHAAAD9tAAB 1 333
8 rows selected.
kuldeep@dlfscg> SELECT key_id,
2 SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
3 SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
4 FROM (SELECT t1.key_id,
5 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
6 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
7 FROM t1, t2, t3
8 WHERE t1.key_id=t2.key_id(+)
9 AND t1.key_id=t3.key_id(+))
10 GROUP BY key_id
11 /
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1775
2 1125 900
3 445 0
4 333
kuldeep@dlfscg>
thanks for your responses.
regards,
May 19, 2005 - 7:47 am UTC
do not order by rowid to get a "last" row -- is that what you are trying to do?
Which row do you want to get from t2 to join with t1, and which row do you want to get from t3 to join with t1?
You must specify that based on attributes you manage (eg: there must be an orderable field that helps you determine WHICH record is the right one).
Consider rowid to be a random number that has no meaning when ordered by; it does not imply order of insertion or anything else.
null record
yeshk, May 25, 2005 - 4:12 pm UTC
I need help with this query -- this is just a part of the query I am working with.
I am not able to generate a NULL RECORD in the middle of the result set.
I need to be able to pass this information out as a reference cursor.
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);
insert into test values('IL','DSL',200,300);
insert into test values('CA','DSL',100,300);
insert into test values('MO','NDS',1000,300);
insert into test values('MO','DSL',100,200);
I need a result like this
STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300
CA NDS 100 200
TOTAL 300 500
IL DSL 200 300
TOTAL 200 300
MO DSL 100 200
MO NDS 1000 300
TOTAL 1100 500
I am able to generate the result using a query with analytics, but I don't know how to get an empty row after each state total.
Also, which approach is better:
1) a cursor per state,
2) getting the data, inserting it into a temporary table and inserting a null record, or
3) using analytics to get the complete data and putting it into a reference cursor?
Thanks
yeshk
May 25, 2005 - 7:57 pm UTC
well, that would sort of be the job of the "pretty printing routine" -- eg: the report generator?
what tool is printing this out?
null record
yeshk, May 26, 2005 - 9:20 am UTC
We need to give the result set, with a null record after each state's totals, to a front-end VB application. It will be given in a reference cursor; they will just select * from the reference cursor and display it on a report.
May 26, 2005 - 10:02 am UTC
the VB application should do this (it should be able to do something, shouldn't it...)
ops$tkyte@ORA9IR2> select decode( grp, 0, state ) state,
2 decode( grp, 0, svc_cat) svc_cat,
3 decode( grp, 0, sum_mt ) sum_mt,
4 decode( grp, 0, sum_nmt ) sum_nmt
5 from (
6 select grouping(dummy) grp, state, svc_cat, sum(measd_tkt) sum_mt, sum(non_measd_tkt) sum_nmt
7 from (
8 select state, svc_cat, 1 dummy, measd_tkt, non_measd_tkt
9 from test
10 )
11 group by rollup( state, dummy, svc_cat )
12 )
13 /
ST SVC     SUM_MT    SUM_NMT
-- --- ---------- ----------
CA DSL        100        300
CA NDS        100        200
CA            200        500

IL DSL        200        300
IL            200        300

MO DSL        100        200
MO NDS       1000        300
MO           1100        500

12 rows selected.
Can rollup do the thing??
Bhavesh Ghodasara, May 26, 2005 - 9:39 am UTC
Hi yeshk,
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt
number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);........
insert into test values('CA','DSL',100,300);....
STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300 <== where does measd_tkt=200 come from??
CA NDS 100 200
TOTAL 300 500
Tom, can we do it like this:
break on state
select STATE,SVC_CAT,sum(measd_tkt),sum(non_measd_tkt)
from test
group by rollup(STATE,SVC_CAT)
order by state
If I've made any mistake, please tell me.
Thanks in advance.
May 26, 2005 - 10:19 am UTC
see above
Which analytics to use?
Marc-Andre Larochelle, May 30, 2005 - 9:10 pm UTC
Hi Tom,
I have this 3rd party table:
drop table t;
create table t (atype varchar2(4),
acol# varchar2(3),
adin varchar2(8),
ares varchar2(8));
insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
insert into t (atype, acol#, ares) values ('MACT','1','02246569');
insert into t (atype, acol#, ares) values ('MACT','6','02246569');
insert into t (atype, acol#, ares) values ('MACT','7','00021474');
select * from t;
ATYPE ACOL# ADIN ARES
----- ----- -------- --------
DUPT 001 02246569
DUPT 002 00021474
DUPT 003 02246569
MACT 1 02246569
MACT 6 02246569
MACT 7 00021474
I would like to get the following result :
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569
I need to match DUPT.adin = MACT.ares together, making sure MACT.acol# is different for every DUPT.acol#. Basically this table has different values in its columns depending on the type of the row (atype).
I have tried using lag, lead, rank and nothing seems to work, but I am pretty sure it is doable with analytics, which is why I posted my question here.
Any hint/help would be appreciated.
Thank you,
Marc-Andre
May 31, 2005 - 7:30 am UTC
question for you.
How did you know to put:
DUPT 001 02246569 together with MACT 1 02246569 and
DUPT 003 02246569 together with MACT 6 02246569
and not
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
for example? There is some missing logic here.
Am I Correct??
Bhavesh Ghodasara, May 31, 2005 - 5:15 am UTC
Hi tom,
I solved the above problem with a query like:
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol# bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
count(*) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol<>lb
What I think is that there must be a better way... I know you will do it in a much, much better way.
Please suggest corrections.
Thanks in Advance..
May 31, 2005 - 8:17 am UTC
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 1 02246569
well, it gives a different result than the one you posted, it gives my hypothetical answer -- where 001 was combined with 6, not 1.
We can do this..
Bhavesh Ghodasara, May 31, 2005 - 8:28 am UTC
Hi tom,
I have further modified my query: now it gives the desired result.
(Agreed, the question is ambiguous.)
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
min(b.acol#) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol=lb
or cnt>1
OUTPUT:
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569
So any corrections now??
Thanks in advance
Bhavesh
May 31, 2005 - 8:43 am UTC
I don't know your data well enough, but your query is non-deterministic, if that matters to you. Consider:
ops$tkyte@ORA10G> create table t (atype varchar2(4),
2 acol# varchar2(3),
3 adin varchar2(8),
4 ares varchar2(8));
Table created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
1 row created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
2 from (
3 select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
4 bacol,b.ares bares,
5 nvl(lead(b.acol# ) over(order by a.adin),0) lb,
6 min(b.acol#) over(partition by a.acol#) cnt
7 from t a,t b
8 where a.adin=b.ares
9 order by atyp,acol) t
10 where bacol=lb
11 or cnt>1;
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 002 00021474 MACT 7 00021474
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> truncate table t;
Table truncated.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
1 row created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
2 from (
3 select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
4 bacol,b.ares bares,
5 nvl(lead(b.acol# ) over(order by a.adin),0) lb,
6 min(b.acol#) over(partition by a.acol#) cnt
7 from t a,t b
8 where a.adin=b.ares
9 order by atyp,acol) t
10 where bacol=lb
11 or cnt>1;
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6 02246569
DUPT 002 00021474 MACT 7 00021474
Same data both times, just a different order of insertion. With analytics and ORDER BY, you need to be concerned about duplicates.
Answers
Marc-Andre Larochelle, May 31, 2005 - 11:35 am UTC
Tom, Bhavesh,
The problem resides exactly there: there is no logic to match the records. I know that each DUPT.adin must have a matching MACT.ares somewhere; I just don't know which one (the 1st one? the 2nd one?). This is a decision I will have to make.
DUPT 001 02246569 MACT 1 02246569
DUPT 003 02246569 MACT 6 02246569
and
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
are the same to me. But when I run the query, I want to always get the same results.
Anyway, all in all, your queries (Bhavesh's -- thank you -- and yours) seem to answer my question. I will watch out for duplicates.
Thank you very much for the quick help.
Marc-Andre
What I found
Marc-Andre Larochelle, May 31, 2005 - 5:02 pm UTC
Hi Tom,
Testing the SQL statement Bhavesh provided, I quickly discovered what you meant when saying the query was non-deterministic. When I added a 4th record :
insert into t (atype,acol#,adin) values ('DUPT','004','02246569');
insert into t (atype,acol#,ares) values ('MACT','5','02246569');
only one row was returned. I played with the query and here is what I came up with :
select atyp,acol,aadin,batype,bacol,bares
from (
select atyp,acol,aadin,batype,bacol,bares,drnk ,
rank() over (partition by acol order by bacol) rnk
from (
select a.atype atyp,
a.acol# acol,
a.adin aadin,
b.atype batype,
b.acol# bacol,
b.ares bares,
dense_rank() over (partition by a.atype,a.adin order by a.acol#) drnk
from t a,t b
where a.adin=b.ares))
where drnk=rnk;
Feel free to comment.
Again thank you (and Bhavesh).
Marc-Andre
Using Analytical Values to find latest info
anirudh, June 03, 2005 - 10:41 am UTC
Hi Tom,
we have a fairly large table with about 100 million rows; among other columns, this table has the following:
CREATE TABLE my_fact_table (
staff_number VARCHAR2 (10), -- staff number
per_end_dt DATE, -- last day of month
engagement_code VARCHAR2 (30), -- engagement code
client_code VARCHAR2 (20), -- client code
revenue NUMBER (15,2) -- revenue
)
In this table the same engagement code can have different client codes for different periods. This was at one point desirable, and that is the reason the client code was stored in this fact table instead of the engagement dimension.
Our users now want us to update the client code in these transactions to the latest value of the client code (meaning: pick the client from the latest month for which we have any transactions for that engagement).
This situation, where the same engagement has multiple clients across periods, exists for about 5% of the rows.
[btw -- we do plan a data-model change to reflect the new relationships, but that may take some time -- hence the interim need to just update the fact table]
To implement these updates, which may happen over several months, I'm trying the approach below,
which involves multiple queries and the creation of a couple of temp tables -- does it seem reasonable? I have a lurking feeling that with a deeper understanding of analytic functions this can be simplified further -- will appreciate your thoughts.
============= My Approach =================
-- Find the Engagements that have multiple Clients
CREATE TABLE amtest_mult_cli AS
WITH
v1 AS (SELECT DISTINCT engagement_code,client_code
FROM my_fact_table)
SELECT engagement_code
FROM v1
GROUP BY engagement_code
HAVING COUNT(*) > 1
-- Find What should be the correct client for those engagements
CREATE TABLE amtest_use_cli AS
SELECT engagement_code,per_end_dt,client_code
FROM
(
SELECT engagement_code,per_end_dt,client_code,
row_number() OVER (PARTITION BY engagement_code
ORDER BY per_end_dt DESC, client_code DESC)
row_num
FROM my_fact_table a,
amtest_mult_cli b
WHERE a.engagement_code = b.engagement_code
)
WHERE row_num = 1;
-- Update Correct Clients for those engagements
UPDATE my_fact_table a
SET a.client_code =
(SELECT b.client_code
FROM amtest_use_cli b
WHERE a.engagement_code = b.engagement_code)
WHERE EXISTS
(SELECT 1
FROM amtest_use_cli c
WHERE a.engagement_code = c.engagement_code);
======================================================
June 03, 2005 - 12:14 pm UTC
why not:
merge into my_fact_table F
using
( select engagement_code,
substr(max(to_char(per_end_dt,'yyyymmddhh24miss')||client_code ),15) cc
from my_fact_table
group by engagement_code
having count(distinct client_code) > 1 ) X
on ( f.engagement_code = x.engagement_code )
when matched
then update set client_code = x.cc
when not matched
then insert ( client_code ) values ( null ); <<== never can happen
<<== in 10g, not needed!
That select finds the client_code for the max per_end_dt by engagement_code for engagement_code's that have more than one distinct client_code....
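(The string trick: to_char(per_end_dt,'yyyymmddhh24miss')||client_code builds values like '20050531235959'||client_code, so max() picks the row with the latest per_end_dt -- ties broken by the highest client_code -- and substr(..., 15) then strips the 14-character timestamp off the front, leaving just the client code.)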
Alternatively, the same pieces could be computed with analytics:
first_value(client_code)
  over (partition by engagement_code
        order by per_end_dt desc, client_code desc ),
count(distinct client_code)
help with lead
Adolph, June 09, 2005 - 1:24 am UTC
I have a table in the following structure:
create table cs_fpc_pr
(PRGM_C VARCHAR2(10) not null,
fpc_date date not null,
TIME_code VARCHAR2(3) not null,
SUN_TYPE varchar2(1))
insert into cs_fpc_pr values ('PRGM000222', to_date('08-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '24','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '23','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '47','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '48','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '23','1');
commit;
select prgm_c,fpc_date,time_code,sun_type,
lead(fpc_date) over(partition by prgm_C order by fpc_date) next_date
from cs_fpc_pr
order by prgm_c,fpc_date,time_code;
PRGM_C FPC_DATE TIM S NEXT_DATE
---------- --------- --- - ---------
PRGM000222 08-MAY-05 33 1 09-MAY-05
PRGM000222 09-MAY-05 05 3 09-MAY-05
PRGM000222 09-MAY-05 25 1 09-MAY-05
PRGM000222 09-MAY-05 45 3 10-MAY-05
PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05
PRGM000242 08-MAY-05 23 1 08-MAY-05
PRGM000242 08-MAY-05 47 3 08-MAY-05
PRGM000242 08-MAY-05 48 3 09-MAY-05
PRGM000242 09-MAY-05 07 3 09-MAY-05
PRGM000242 09-MAY-05 33 1 09-MAY-05
PRGM000242 09-MAY-05 46 3 10-MAY-05
PRGM000242 10-MAY-05 07 3 10-MAY-05
PRGM000242 10-MAY-05 33 1 10-MAY-05
PRGM000242 10-MAY-05 46 3 11-MAY-05
PRGM000242 11-MAY-05 07 3 11-MAY-05
PRGM000242 11-MAY-05 33 1 11-MAY-05
PRGM000242 11-MAY-05 46 3 14-MAY-05
PRGM000242 14-MAY-05 07 3 14-MAY-05
PRGM000242 14-MAY-05 23 1
I need to find, for a particular prgm_c, the next date & time code where the sun_type field = '1'.
A sample of the output should look something like this:
PRGM_C FPC_DATE TIM S NEXT_DATE next_time
---------- --------- --- - --------- -------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1
Tom, Can you please help me with with this?
Regards
June 09, 2005 - 6:53 am UTC
PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1
you've got a problem with those fpc_dates and ordering by them. You have "dups", so none of those 10-may-05 rows comes "first" -- same with the 14th. You need to figure out how to order this data deterministically first.
My first attempt at this is:
tkyte@ORA9IR2W> select prgm_c, fpc_date, time_code, sun_type,
2 to_date(substr( max(data)
over (partition by prgm_c order by fpc_date desc),
6, 14 ),'yyyymmddhh24miss') ndt,
3 to_number( substr( max(data)
over (partition by prgm_c order by fpc_date desc), 20) ) ntc
4 from (
5 select prgm_c,
6 fpc_date,
7 time_code,
8 sun_type,
9 case when lag(sun_type)
over (partition by prgm_c order by fpc_date desc) = '1'
10 then to_char( row_number()
over (partition by prgm_c order by fpc_date desc) , 'fm00000') ||
11 to_char(lag(fpc_date)
over (partition by prgm_c order by fpc_date desc),'yyyymmddhh24miss')||
12 lag(time_code) over (partition by prgm_c order by fpc_date desc)
13 end data
14 from cs_fpc_pr
15 )
16 order by prgm_c,fpc_date,time_code
17 /
PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 09-MAY-05 25
PRGM000222 09-MAY-05 45 3 09-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 10-MAY-05 25
PRGM000222 10-MAY-05 45 3 10-MAY-05 25
PRGM000222 14-MAY-05 05 3
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 08-MAY-05 23
PRGM000242 08-MAY-05 47 3 08-MAY-05 23
PRGM000242 08-MAY-05 48 3 08-MAY-05 23
PRGM000242 09-MAY-05 07 3 10-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 10-MAY-05 33
PRGM000242 10-MAY-05 46 3 10-MAY-05 33
PRGM000242 11-MAY-05 07 3 14-MAY-05 23
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3
PRGM000242 14-MAY-05 23 1
24 rows selected.
but the lack of distinctness in the fpc_date means you might get "a different answer" with the same set of data.
reply
Adolph, June 09, 2005 - 7:48 am UTC
Sorry for not being clear the first time, so here goes... A program (prgm_c) will have at most one entry in the table for a given combination of (fpc_date & time_code).
The time_code actually maps to another table where '01' is '01:00:00', '02' is '01:30:00' & so on (i.e. times stored in varchar2 format).
So basically a program will exist for a given fpc_date and time_code only once.
I hope I'm making sense.
Regards
June 09, 2005 - 7:58 am UTC
tkyte@ORA9IR2W> select prgm_c,
2 fpc_date,
3 time_code,
4 sun_type,
5 to_date(
6 substr( max(data)
7 over (partition by prgm_c
8 order by fpc_date desc,
9 time_code desc),
10 6, 14 ),'yyyymmddhh24miss') ndt,
11 to_number(
12 substr( max(data)
13 over (partition by prgm_c
14 order by fpc_date desc,
15 time_code desc), 20) ) ntc
16 from (
17 select prgm_c,
18 fpc_date,
19 time_code,
20 sun_type,
21 case when lag(sun_type)
22 over (partition by prgm_c
23 order by fpc_date desc,
24 time_code desc) = '1'
25 then
26 to_char( row_number()
27 over (partition by prgm_c
28 order by fpc_date desc,
29 time_code desc) , 'fm00000') ||
30 to_char(lag(fpc_date)
31 over (partition by prgm_c
32 order by fpc_date desc,
time_code desc),'yyyymmddhh24miss')||
34 lag(time_code)
35 over (partition by prgm_c
36 order by fpc_date desc,
37 time_code desc)
38 end data
39 from cs_fpc_pr
40 )
41 order by prgm_c,fpc_date,time_code
42 /
PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 09-MAY-05 33
PRGM000242 08-MAY-05 47 3 09-MAY-05 33
PRGM000242 08-MAY-05 48 3 09-MAY-05 33
PRGM000242 09-MAY-05 07 3 09-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 11-MAY-05 33
PRGM000242 10-MAY-05 46 3 11-MAY-05 33
PRGM000242 11-MAY-05 07 3 11-MAY-05 33
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3 14-MAY-05 23
PRGM000242 14-MAY-05 23 1
24 rows selected.
Just needed to add "time_code DESC".
See
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
("analytics to the rescue") for the "carry down" technique I used here. In 10g, we'd simplify by using "ignore nulls" with the LAST_VALUE function instead of the max() and row_number() trick.
brilliant
Adolph, June 09, 2005 - 9:53 am UTC
Thank you very much, Tom. The query works like a charm. I will read up on the link. Analytics do rock n roll :)
Working on an Analytic Query
Scott, June 09, 2005 - 12:15 pm UTC
Tom,
From your example for Mark's problem on 4/8, it seems that you need to specify a fixed number of columns to output this way. Is there a way to have a varying number of columns? For example, I need a query that takes a date range and makes each date a column heading. Any help would be greatly appreciated.
Thanks,
Scott
June 09, 2005 - 6:15 pm UTC
you need dynamic sql. The number of columns in a query is "well defined, known at parse time" by definition.
If you have access to Expert One-on-One Oracle, I demonstrated how to do this with ref cursors in a stored procedure. But you have to run a query to get the set of column "headings", and then write a query based on that.
Tom any idea how I can re write this piece of code
A reader, June 09, 2005 - 3:00 pm UTC
decode ((SELECT ih.in_date
FROM major_sales ih
WHERE ih.container = i.container
AND sales > i.container_id
AND sales = (SELECT MIN(ihh.container_id)
FROM major_sales ihh
WHERE ihh.container_id > i.container_id
AND ihh.container = i.container)), NULL,
June 09, 2005 - 6:35 pm UTC
not out of context, no.
I am still having problems with analytic functions
A reader, July 01, 2005 - 12:31 pm UTC
select i.container,ssl_user_code,ssl_user_code ssl,cl.code length_code, out_trucker_code, i.chassis,
lead(in_date) over (partition by i.container order by in_date) next_in_date,
out_date,
lead (out_date) over (partition by i.container order by in_date) o_date
from his_containers i,
container_masters cm,
tml_container_lhts clht,
tml_container_lengths cl
WHERE cm.container = i.container
and cm.lht_code = clht.code
and clht.length_code = cl.code
and ssl_user_code = 'ACL'
and i.container like '%408014'
and voided_date is null
and ((in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
results:
----------
CONTAINER SSL_USER_CODE SSL LENGTH_CODE OUT_TRUCKER_CODE CHASSIS NEXT_IN_DATE OUT_DATE O_DATE
ACLU408014 ACL ACL 4 R0480 3/22/2005 2:52:41 PM 3/21/2005 3:45:48 PM 4/6/2005 2:25:59 PM
ACLU408014 ACL ACL 4 J1375 4/6/2005 2:25:59 PM
1. how can I get rid of the 4/6/2005 2:25:59 PM???
July 01, 2005 - 1:52 pm UTC
can you be more specific about why you don't like April 6th at 2:25:59 pm? What is it about that row that you don't like?
That'll help me tell you how, in general, to remove it. What is the criterion for removal?
analytical query
A reader, July 01, 2005 - 2:19 pm UTC
Tom,
We are trying to bill the client within the month, in this case within April. I also would like to know how many days elapsed between 2 days so I can bill them.
July 01, 2005 - 3:15 pm UTC
"how many days elapsed between 2 days"
the answer is: 2
but are you asking how to do date arithmetic? Just subtract.
sorry...within March
A reader, July 01, 2005 - 2:21 pm UTC
more information
A reader, July 01, 2005 - 2:29 pm UTC
Tom,
This is how the data looks
IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014
and I would like to get the pair within the same month
July 01, 2005 - 3:16 pm UTC
the pair of "what"?
I would like to get all the dates within the month
A reader, July 01, 2005 - 4:03 pm UTC
one more try
A reader, July 01, 2005 - 4:26 pm UTC
This is how the data looks as of now with the above query.
IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014
I Would like to get it as the following
IN_DATE OUT_DATE CONTAINER
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM
This is what I am looking for.....this way.
July 01, 2005 - 4:46 pm UTC
still not much of a specification (important thing for those of us in this industry - being able to describe the problem at hand in detail, so someone else can take the problem definition and code it).
Let me try, this is purely a speculative guess on my part:
I would like all records in the table such that the in_date-out_date range covered at least part of the month of march in the year 2005.
If the out_date falls AFTER march, I would like it nulled out.
(this part is a total guess) if the in_date falls BEFORE march, i would like it nulled out as well (for consistency?)
Ok, stated like that I can give you untested pseudo code since there are no create tables and no inserts to play with:
select case when in_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then out_date end,
container
from T
where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( :x, 'dd-mon-yyyy' )
bind in :x = '01-mar-2005' and :y = '01-apr-2005' for your dates.
As you requested
A reader, July 01, 2005 - 5:01 pm UTC
CREATE TABLE CONTAINER_MASTERS
(
CONTAINER VARCHAR2(10 BYTE) NOT NULL,
CHECK_DIGIT VARCHAR2(1 BYTE) NOT NULL,
SSL_OWNER_CODE VARCHAR2(5 BYTE) NOT NULL,
LHT_CODE VARCHAR2(5 BYTE) NOT NULL
);
INSERT INTO CONTAINER_MASTERS ( CONTAINER, CHECK_DIGIT, SSL_OWNER_CODE,
LHT_CODE ) VALUES ( '045404', '1', 'BCL', '5AV');
commit;
CREATE TABLE TML_CONTAINER_LHTS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL,
ISO VARCHAR2(4 BYTE) NOT NULL,
LENGTH_CODE VARCHAR2(5 BYTE) NOT NULL,
HEIGHT_CODE VARCHAR2(5 BYTE) NOT NULL,
TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
);
INSERT INTO TML_CONTAINER_LHTS ( CODE, SHORT_DESCRIPTION, LONG_DESCRIPTION, ISO, LENGTH_CODE,
HEIGHT_CODE, TYPE_CODE ) VALUES ( '5BR', '5BR', '45'' 9''6" Reefer', '5432', '5', 'B', 'R');
commit;
CREATE TABLE TML_CONTAINER_LENGTHS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL
);
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'2', '20''', '20 Ft');
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'4', '40''', '40 Ft');
commit;
July 01, 2005 - 6:06 pm UTC
umm, specification?
did I get it right? if so, did you *try* the query at all???
Here is a SQL puzzle for analytics zealots
Mikito Harakiri, July 01, 2005 - 10:33 pm UTC
OK, if anybody succeeds in writing the following with analytics, I would convert to analytics once and forever. Credit it in the book, of course.
Given:
table Hotels (
name string,
price integer,
distance integer
)
Here is a query that sounds very analytical:
Order hotels by price, distance. Compare each record with its neighbour (lag?), and if one of them is inferior to the other by both criteria -- more pricey and farther from the beach -- then throw it away from the result.
July 02, 2005 - 9:20 am UTC
define neighbor.
is neighbor defined by price or by distance? your specification is lacking many many details (seems to be a recurring theme on this page for some reason)
sounds like you want the cheapest closest hotel to the beach. for each row, if something closer and cheaper exists in the original set, do not keep that row.
sounds like a where not exists, not analytics to me. but then - the specification is lacking.
And let's see, in order to appreciate a tool, you have to be shown that the tool can be the end all, be all answer to everything??!?? that is downright silly, don't you think?
Let's see:
"if anyone succeeds in making the Oracle 9i merge command select data, I would convert to merge once and forever"
"if anyone succeeds in making my car fly into outer space, I would convert to cars once and forever"
Think about your logic here.
There are no zealots here, there are people willing to read the documentation, understand that things work the way they work, not the way THEY think they should have been made to work, and have jobs to do, pragmatic practical things to accomplish and are willing to use the best tool for the job.
specs
Mikito Harakiri, July 03, 2005 - 11:07 pm UTC
Yes, find all the hotels that are not dominated by the others by both price and distance. That is a "not exists" query, but it is a very inefficient one:
select * from hotels h
where not exists (select * from hotels hh
where hh.price < h.price and hh.distance <= h.distance
or hh.price <= h.price and hh.distance < h.distance
)
The one that reformulated is much more efficient, but how do I express it in SQL?
July 04, 2005 - 10:25 am UTC
the one that reformulated?
and why do you have the or in there at all. to dominate by both price and distance would simply be:
where not exists ( select NULL
from hotels hh
where hh.price < h.price
AND hh.distance < h.distance )
You said "by BOTH price and distance", nothing but nothing about ties.
ops$tkyte@ORA9IR2> /*
DOC>
DOC>drop table hotels;
DOC>
DOC>create table hotels
DOC>as
DOC>select object_name name, object_id price, object_id distance, all_objects.*
DOC> from all_objects;
DOC>
DOC>create index hotel_idx on hotels(price,distance);
DOC>
DOC>exec dbms_stats.gather_table_stats( user, 'HOTELS', cascade=>true );
DOC>*/
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select h1.name, h1.price, h1.distance
2 from hotels h1
3 where not exists ( select NULL
4 from hotels h2
5 where h2.price < h1.price
6 AND h2.distance < h1.distance )
7 /
NAME PRICE DISTANCE
------------------------------ ---------- ----------
I_OBJ# 3 3
Elapsed: 00:00:00.22
ops$tkyte@ORA9IR2> select count(*) from hotels;
COUNT(*)
----------
27837
Elapsed: 00:00:00.00
it doesn't seem horribly inefficient.
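For what it is worth, an analytic formulation does exist -- the following is an untested sketch (mine, not from the exchange above) that leans on price being an integer, so "strictly cheaper" can be phrased as RANGE ... 1 PRECEDING:
select name, price, distance
  from ( select h.name, h.price, h.distance,
                -- smallest distance among all strictly cheaper hotels
                min(distance) over ( order by price
                     range between unbounded preceding and 1 preceding ) min_d
           from hotels h )
 where min_d is null        -- nothing is cheaper
    or min_d >= distance    -- nothing cheaper is also strictly closer
/
It reads the table once with a single window sort, and encodes the same strict-dominance rule as the NOT EXISTS above.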
Tom Can we give it one more try
A reader, July 05, 2005 - 9:20 am UTC
Tom, When I ran the query it returned nothing. I am sending you the whole test case. This is what I would like to see
in the report.
out_date in_date container
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM ACLU408014
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM ACLU408014
CREATE TABLE BETA
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
);
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;
select in_date, out_date,container,
case when in_date between to_date('01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( '01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then out_date end
container
from BETA
WHERE in_date <= to_date( '01-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( '31-mar-2005', 'dd-mon-yyyy' )
July 05, 2005 - 9:54 am UTC
you know, this is going beyond....
*s*p*e*c*i*f*i*c*a*t*i*o*n*
pretend you were explaining to your mother (who presumably doesn't work in IT and doesn't know sql or databases or whatever) what needed to be done.
that is what I need to see. I obviously don't know your logic of getting from "A (inputs) to B (outputs)" and you need to explain that.
and when I run my query:
ops$tkyte@ORA10G> variable x varchar2(20)
ops$tkyte@ORA10G> variable y varchar2(20)
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec :x := '01-mar-2005'; :y := '01-apr-2005'
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G> select case when in_date between to_date( :x, 'dd-mon-yyyy' )
2 and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
3 then in_date end,
4 case when out_date between to_date( :x, 'dd-mon-yyyy' )
5 and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
6 then out_date end,
7 container
8 from beta
9 where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
10 and out_date >= to_date( :x, 'dd-mon-yyyy' )
11 /
CASEWHENI CASEWHENO CONTAINER
--------- --------- ----------
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 ACLU408014
I do get output, not what you say you want, but output. you need to tell me THE LOGIC here. (and maybe when you write it down, specify it, the answer will just naturally appear)
so yes, we can definitely give it one more try but if and only if you provide the details, the specification, the logic, the thoughts behind this.
Not just "i have this and want that", it doesn't work that way.
in english
Jean, July 05, 2005 - 10:33 am UTC
We are trying to bill from the time the truck left to the
time it returned. For example in the above query.
I would like to bill him from 1/18/2005 to 3/19/2005. So it must be part of the report. That's the whole key here.
clarification!!
A reader, July 05, 2005 - 10:56 am UTC
the time he left 1/18/2005 6:39:10 AM
the time he came back 3/22/2005 2:52:41 PM
hope this helps....
July 05, 2005 - 11:28 am UTC
ops$tkyte@ORA9IR2> select * from beta order by in_date;
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014 <<<=== gap, no 13
14-JAN-05 18-JAN-05 ACLU408014 <<=== big gap, no 19.... mar 18
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
I don't get it. I don't get it AT ALL. does anyone else ?
nope, not getting it even a teeny tiny bit myself.
give us LOGIC, ALGORITHM, INFORMATION.
like I said, pretend I'm your mother who has never seen a computer -- explain the logic at that level (or I just give up)
BETTER TABLE
A reader, July 05, 2005 - 11:57 am UTC
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;
OUT_DATE IN_DATE
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM
LEFT 1/18 CAME BACK 3/19
LEFT 3/21 CAME BACK 3/22
July 05, 2005 - 12:20 pm UTC
you have totally and utterly missed my point.
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
sigh.
what if the records are
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
specification, you know what, without it, I'm not even going to look anymore. Textual description of precisely what you want. I'm tired of guessing. I think I can guess, but I don't even want to guess about "missing" months like my second example here.
English Explanation
A reader, July 05, 2005 - 1:06 pm UTC
Sorry for going back and forth on this report. All I want is the following: we have trucks that come in and out of a yard. All we are looking for is when the truck came in and the "next record" -- nothing in between -- because a truck can come in many times during a month. So we want when it first came in and the very last time it went out for a particular month. That is to say, the last time it left the yard. So the date and time should give us this information. Finally, this report should be within a month.
example:
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
in this case we want
in_date out_date
-------- --------
3/22/2005 2:52:41PM 1/18/2005 6:39:10 AM
July 05, 2005 - 1:17 pm UTC
so what happened to the 21st/22nd of march this time. the answer keeps changing?
and what if, there are no records for march in the table (nothing in_date/out_date wise)
follow up
jean, July 05, 2005 - 1:57 pm UTC
Tom,
We realized that it may be too much to get the dates in between
so we opted for just getting the in_date and out_date. By the way, there will always be data so do not worry about the "what ifs"....
Thanks!!
July 05, 2005 - 3:11 pm UTC
feb, what about feb? you said there would always be data? I want to run this for feb?
do you or do you not need to be concerned about a missing month.
do not be concerned!
A reader, July 05, 2005 - 3:22 pm UTC
Please do not be concerned about missing a month. This is a report.
July 05, 2005 - 3:46 pm UTC
umm, I want the report for february
it is blank.
now what? it should not be blank, should it? this is a problem, a problem in our industry in general. You get what you ask for (sometimes), and if you ask repeatedly for the wrong thing, that's what you'll get. I am concerned -- by this line of questioning here.
Hey, here you go:
ops$tkyte@ORA9IR2> select *
2 from (
3 select
4 lag(out_date) over (partition by container order by in_date) last_out_date,
5 in_date,
6 container
7 from beta
8 )
9 where trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
10 or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy');
LAST_OUT_ IN_DATE CONTAINER
--------- --------- ----------
18-JAN-05 19-MAR-05 ACLU408014
21-MAR-05 22-MAR-05 ACLU408014
gets the answer given your data, makes a zillion assumptions (50% of which are probably wrong), won't work for FEB, probably doesn't answer the question behind the question, but hey, there you go.
Thanks!!!
A reader, July 06, 2005 - 9:00 am UTC
I will try it ... Thanks a zillion for your efforts and your patience.
Thanks!
A reader, July 06, 2005 - 11:55 am UTC
CREATE TABLE BETA3
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
);
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '07/20/2004 03:08:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '08/10/2004 02:45:52 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 01:55:06 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '03/27/2005 05:05:36 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
commit;
Tom I was able to get the first pair as shown
last_out_date in_date container
8/10/2004 2:45:52 AM 3/19/2005 1:55:06 AM ACLU040312
which is fine...
But can I get the other pair?
last_out_date in_date container
3/27/2005 5:05:36 AM
July 06, 2005 - 12:44 pm UTC
problem is, you are "missing" a row and 'making up' data is hard.
it might be
ops$tkyte@ORA10G> select decode( r, 1, last_out_date, out_date ),
2 decode( r, 1, in_date, next_in_date )
3 from (
4 select
5 lag(out_date) over (partition by container order by in_date) last_out_date,
6 in_date, out_date,
7 lead(in_date) over (partition by container order by in_date) next_in_date,
8 container
9 from beta3
10 ), ( select 1 r from dual union all select 2 r from dual )
11 where ((
12 trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
13 or
14 trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
15 ) and r = 1 )
16 or
17 ( next_in_date is null and r = 2 )
18 /
DECODE(R,1,LAST_OUT_ DECODE(R,1,IN_DATE,N
-------------------- --------------------
10-aug-2004 02:45:52 19-mar-2005 01:55:06
27-mar-2005 05:05:36
still curious what happens in feb.
Please recommend some books for learning Oracle analytic functions
Vijay, July 07, 2005 - 7:58 am UTC
July 07, 2005 - 9:47 am UTC
data warehousing guide (freely available on otn.oracle.com)
Expert one on one Oracle (I have a big chapter on them in there)
Thank you very much!!
Jean, July 08, 2005 - 10:35 am UTC
I want to thank you for the last query!!! it worked very well, even though I still get dates outside of the range. But overall it's fine.
How to get contiguous date ranges from Start_date, end_date pairs?
Bob Lyon, July 11, 2005 - 3:15 pm UTC
-- Tom, Suppose I have a table with data...
-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 01:00:00
-- BBB 07/11/05 01:00:00 07/11/05 02:00:00
-- BBB 07/11/05 02:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 07:00:00
-- BBB 07/11/05 07:00:00 07/11/05 08:00:00
-- What I would like to get is the "contiguous date ranges"
-- by MKT_CD, i.e.,
-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 08:00:00
-- I have played with LAG/LEAD/FIRST_VALUE/LAST_VALUE
-- but seem to just "go in circles" trying to code this.
-- Here is the test data setup (Oracle 9.2.0.6) :
CREATE GLOBAL TEMPORARY TABLE NM_DEMAND_BIDS_API_GT
(
MKT_CD VARCHAR2(6) NOT NULL,
START_DT_GMT DATE NOT NULL,
END_DT_GMT DATE NOT NULL
)
ON COMMIT PRESERVE ROWS;
-- This code has 24 hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('AAA', TRUNC(SYSDATE), TRUNC(SYSDATE) + 1);
-- A second code goes by hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 00/24, TRUNC(SYSDATE) + 01/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 01/24, TRUNC(SYSDATE) + 02/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 02/24, TRUNC(SYSDATE) + 03/24);
-- and has an intentional gap
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 06/24, TRUNC(SYSDATE) + 07/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 07/24, TRUNC(SYSDATE) + 08/24);
-- Query
SELECT MKT_CD, START_DT_GMT, END_DT_GMT
FROM NM_DEMAND_BIDS_API_GT;
July 11, 2005 - 3:49 pm UTC
based on:
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
ops$tkyte@ORA9IR2> select mkt_cd, min(start_dt_gmt), max(end_dt_gmt)
2 from (
3 select mkt_cd, start_dt_gmt, end_dt_gmt,
4 max(grp) over (partition by mkt_cd order by start_dt_gmt) mgrp
5 from (
6 SELECT MKT_CD,
7 START_DT_GMT,
8 END_DT_GMT,
9 case when lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) <> start_dt_gmt
10 or
11 lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) is null
12 then row_number() over (partition by mkt_cd order by start_dt_gmt)
13 end grp
14 FROM NM_DEMAND_BIDS_API_GT
15 )
16 )
17 group by mkt_cd, mgrp
18 order by 1, 2
19 /
MKT_CD MIN(START_DT_GMT) MAX(END_DT_GMT)
------ -------------------- --------------------
AAA 11-jul-2005 00:00:00 12-jul-2005 00:00:00
BBB 11-jul-2005 00:00:00 11-jul-2005 03:00:00
BBB 11-jul-2005 06:00:00 11-jul-2005 08:00:00
Thanks!
Bob Lyon, July 11, 2005 - 5:20 pm UTC
Wow, that was fast.
The trick here is the MAX() analytic function. I could tag the lines where a break was to occur but couldn't figure out how to carry forward the tag/grp.
Thanks Again!
Analytical functions book
Vijay, July 11, 2005 - 11:55 pm UTC
Thanks a lot
More Help
Jean, July 26, 2005 - 5:40 pm UTC
Tom,
How can I get "just" the records within the scope? I am getting records outside of March.
select container,decode( r, 1, last_out_date, out_date )out_date, decode( r, 1, in_date, next_in_date) in_date,
code length_code,chassis,out_trucker_code,ssl_user_code ssl, ssl_user_code,out_mode
from (
select lag(out_date) over (partition by i.container order by in_date)
last_out_date,
i.ssl_user_code,
in_date,
cl.code,
i.out_trucker_code,
i.ssl_user_code ssl,
i.container,
i.chassis,
out_mode,
out_date,
clht.length_code,
lead(in_date) over (partition by i.container order by in_date)
next_in_date
from his_containers i,container_masters cm,tml_container_lhts clht,tml_container_lengths cl
where cm.container = i.container
and cm.lht_code = clht.code
and cl.code = clht.length_code
and ssl_user_code = 'ACL'
and i.container = 'ACLU214285'
and voided_date is null
and chassis is null
and in_mode = 'T'
and out_mode = 'T' ), ( select 1 r from dual union all select 2 r from dual )
where (( trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy'))
and r = 1 ) or ( next_in_date is null and r = 2 )
order by out_date
July 26, 2005 - 5:57 pm UTC
select *
from (Q)
where <any other conditions you like>
order by out_date;
replace Q with your query.
that's what I got in my query.....
A reader, July 26, 2005 - 6:03 pm UTC
July 26, 2005 - 6:23 pm UTC
don't know what you mean
I thought I was doing what you suggested already...
A reader, July 26, 2005 - 6:41 pm UTC
July 26, 2005 - 6:56 pm UTC
I cannot see your output; obviously you are getting more data than you wanted -- add to the predicate in order to filter it out. don't know what else to say.
More information..
Jean, July 27, 2005 - 9:14 am UTC
the way it was before
CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131 ACL ACL T
---with your changes---
CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131
my history tables
CONTAINER_ID OUT_DATE IN_DATE
31779 6/21/2004 10:03:25 AM 6/16/2004 1:33:50 AM
55317 6/25/2004 2:58:01 PM 6/25/2004 2:19:49 PM
672863 3/2/2005 7:03:31 PM 2/26/2005 6:03:49 PM
708598 4/4/2005 3:31:03 PM 3/11/2005 7:36:29 PM
779305 4/16/2005 1:03:36 PM 4/6/2005 2:04:53 PM
as you can see I am not picking up the records within the month of march...with or without
the changes to the query.
July 27, 2005 - 10:27 am UTC
sorry -- you'll need to work through this, you see the techniques involved right -- lag, lead, analytic functions, YOU understand your data much better than I.
(because in part, frankly, the "way it was before" and "with your changes" look, well -- the same to me as displayed here)
Thanks for your help!
A reader, July 27, 2005 - 1:26 pm UTC
I know the data; however, I thought it was going to be something easy just to get the dates within March... I guess not.
count number of rows in a number of ranges
A reader, July 27, 2005 - 6:08 pm UTC
Hi
I would like to count the number of rows I have per range of values. For example
SELECT RANGE, SUM(suma) total_per_deptno
FROM (SELECT CASE
WHEN deptno between 10 and 20 THEN '10-20'
ELSE '30'
END RANGE,
deptno, 1 SUMA
FROM scott$emp)
GROUP BY RANGE
RANGE TOTAL_PER_DEPTNO
----- ----------------
10-20 8
30 6
Can I rewrite that query in some other way so range can be dynamic such as
11-20
21-30
31-40
and counts the number of rows?
Thank you
July 27, 2005 - 6:33 pm UTC
if you can come up with a function f(x) such that f(x) returns what you want, sure.
EG:
for your 11-20, 21-30, 31-40 -- well
f(deptno) = trunc( (deptno-0.1)/10)
(assuming deptno is an integer) -- that'll bin up deptno 0..10, 11..20, 21..30 and so on into groups 0, 1, 2, 3, ....
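For example, one way to apply that f(x) to the scott$emp table from the question (a sketch):
select trunc( (deptno-0.1)/10 ) bin,
       count(*) total_rows
  from scott$emp
 group by trunc( (deptno-0.1)/10 )
 order by 1
/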
A reader, August 02, 2005 - 1:35 pm UTC
Tom,
I hope you can provide an insight to this.
table emp1 is shown below.
EmpId Week Year Day0 Day1 ..... Day14
100 20 2005 8 8 8
200 22 2003 0 0 8
300 25 2004 8 8 0
400 06 2005 0 8 8
500 08 2002 8 0 8
create table emp1(empid varchar2(3), week varchar2(2), year varchar2(4), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', '20', '2005', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('200', '22', '2003', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('300', '25', '2004', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0);
insert into emp1 values('400', '06', '2005', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('500', '08', '2002', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
I am trying to select emp1 records as follows:
EmpId, Date of the day, Hours worked per day
Firstly, I have to calculate date of the day of a record (first day that corresponds to Day0) using
week of the year and year. Then I have to increment the day by 1, 2 ...14
to get the hours worked for each particular date
Example: Assuming that week 20 of 2005 is 05/07/2005. It corresponds to Day0 in the same record
Day1 column corresponds to the next day which is 05/08/2005. Day2 becomes 05/09/2005 and so on ...
Then, I have to print individual rows for each empid as:
100 05/07/2005 8
100 05/08/2005 8
.....
200 05/22/2003 0
200 05/23/2003 8
.. and so on for all empid's ...
Thank you.
August 02, 2005 - 2:09 pm UTC
oh no, columns where rows should be :(
and basically you are saying "i need ROWS where these rows should be!"
tell me, how do you turn 20 into a date?
A reader, August 02, 2005 - 2:19 pm UTC
Tom,
I should've explained it better. Week 20 of 2005, here should be translated to the first day of week 20 of 2005 (Assuming it is 05/07/2005). That corresponds to Day0 of that row. Day1 becomes 05/08/2005 and so on ...
Is there a function or approach that can convert columns to rows?
August 02, 2005 - 3:30 pm UTC
no, i mean -- what function/logic/algorithm are you using to figure out "week 20 is this day"
A reader, August 02, 2005 - 9:06 pm UTC
Tom,
Sorry, firstly, the date is not calculated the way I said above. It's not clear yet how the date is obtained. This issue is under review and I think I'll obtain date by joining empid with some table (say temp1). However, I am sure I will have to use date (such as 05/07/2005), associate it with Day0 column value. Day1 becomes 05/08/2005 and so on .. However, I am trying to obtain a sql or pl/sql that can arrange the rows as described above. Any ideas? Thanks.
August 03, 2005 - 10:06 am UTC
I cannot tell you how much I object to this model.
storing "week" and "year" - UGH.
storing them in STRINGS - UGH UGH UGH.
storing things that should be cross record in record UGH to the power of 10.
I had to fix your inserts, they did not work, added day14 of zero.
ops$tkyte@ORA10G> with dates as
2 (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual connect by level <= 15 )
3 select empid, dt,
4 case when l = 0 then day0
5 when l = 1 then day1
6 when l = 2 then day2
7 /* ... */
8 when l = 13 then day13
9 when l = 14 then day14
10 end data
11 from (select * from emp1 where week = 20), dates
12 /
EMP DT DATA
--- --------- ----------
100 07-MAY-05 8
100 08-MAY-05 8
100 09-MAY-05 0
100 10-MAY-05
100 11-MAY-05
100 12-MAY-05
100 13-MAY-05
100 14-MAY-05
100 15-MAY-05
100 16-MAY-05
100 17-MAY-05
100 18-MAY-05
100 19-MAY-05
100 20-MAY-05 8
100 21-MAY-05 0
15 rows selected.
A reader, August 03, 2005 - 3:18 pm UTC
Tom,
Thanks for the solution. I need some more help if you don't mind. The sql works excellently and I experimented with it.
However, this question is based on a change of design here ... The emp1 table is joined with trn1 table (empid ~ trnid) to obtain values x and y. x and y should be passed to a function that returns date.
The emp1 table is like:
EmpId Day0 Day1 ..... Day14
100 8 8 8
200 0 0 8
300 8 8 0
400 0 8 8
500 8 0 8
trn1 table is like:
trnid x y
100 3 18
200 4 19
300 5 20
400 6 21
500 7 22
etc ...
create table emp1(empid varchar2(3), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('200', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 0);
insert into emp1 values('300', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0, 0);
insert into emp1 values('400', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('500', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
create table trn1(empid varchar2(3), x number(2), y number(2));
insert into trn1 values('100', 3, 18);
insert into trn1 values('200', 4, 19);
insert into trn1 values('300', 5, 20);
insert into trn1 values('400', 6, 21);
insert into trn1 values('500', 7, 22);
I used this function on just one row of emp1 (by hard coding x and y values).
I replaced
with dates as
(select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual
connect by level <= 15 )
with
with dates as
(select getXYDate(x,y)+level-1 dt, level-1 l from dual
connect by level <= 15 )
However, I am trying to implement this on every row of emp1 by obtaining x and y from trn. There is no week or year in emp1 table. Any help? Thanks again.
August 03, 2005 - 6:00 pm UTC
I didn't think it was possible, but now I like this even less than before! didn't think you could do that ;(
ops$tkyte@ORA10G> with dates as
2 (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual
connect by level <= 15 )
3 select empid, dt,
4 case when l = 0 then day0
5 when l = 1 then day1
6 when l = 2 then day2
7 /* ... */
8 when l = 13 then day13
9 when l = 14 then day14
10 end data
11 from ( QUERY ), dates
12 /
replace query with a join of emp with trn and apply the function in there.
A reader, August 03, 2005 - 7:38 pm UTC
Tom,
Sorry to bother you again. In my case, I think
(select to_date( '05/07/2005','mm/dd/yyyy') will not help me anymore because I have to basically find dates for Day0 .. Day14 of every row in emp1 table. The first date (date that corresponds to Day0) for each record should be obtained using a function by passing X and Y values of trn table .. Because each record may have different x, y values.
If it's not achievable this way, can you suggest an alternate approach? I am trying to make a function that would use a loop. Also, the data should be written to a text file once complete; in that case I think a procedure might help, and if so, could you throw some light? Thanks for your patience.
August 03, 2005 - 8:26 pm UTC
well, you just need to generate a set of 15 numbers (L)
and add them in later on. No big change. You have the "start_date" from the function, right -- just add L to dt.
A reader, August 03, 2005 - 8:38 pm UTC
Ok, Can you please show that if possible?
A reader, August 03, 2005 - 9:38 pm UTC
Tom,
I tried this and am getting an error: ORA-00904: "DAY13": invalid identifier
WITH DATES AS
(SELECT FUNC_XY(17,2003)+level-1 dt, level-1 l FROM DUAL
connect by level <= 15)
select empid, day0, day14, x, y, dt,
case when l = 0 then day0
when l = 1 then day1
when l = 2 then day2
when l = 3 then day3
when l = 4 then day4
when l = 5 then day5
when l = 6 then day6
when l = 7 then day7
when l = 8 then day8
when l = 9 then day9
when l = 10 then day10
when l = 11 then day11
when l = 12 then day12
when l = 13 then day13
when l = 14 then day14
end data
from (select emp1.empid, day0, day14, x, y from emp1, trn1 where emp1.empid = trn1.empid), dates
/
As said before ... I also have to use x and y instead of 17 and 2003 in order to compute it for every row.
August 04, 2005 - 8:20 am UTC
yeah, well -- you didn't select it out in the inline view. fix that.
look the concept is thus:
with some_rows as ( select level-1 l from dual connect by level <= 15 )
select a.empid, a.dt+l, case when l=0 then a.day0
...
when l=14 then a.day14
end data
from some_rows,
(select emp1.empid, func_xy(trn1.x, trn1.y) dt,
emp1.day0, emp1.day1, .... <ALL OF THE DAYS>, emp1.day14
from emp1, trn1
where emp1.empid = trn1.empid )
A reader, August 04, 2005 - 9:15 am UTC
Tom,
Here, the sql is using a.empid, a.dt+l ...
whereas the inner sql is using emp1.day0, trn1.empid, etc ... My real inner sql uses some more columns and joins as well. When this gave me an error, I just substituted emp1.day0, emp1.day14 etc ... with day0, day14 etc .. and it worked. However, when there are several joins with alias names, how should it be done?
To make it a bit clear, this sql looks similar to:
select emp1.empid, emp1.day0 from some_rows, (select emp1.empid, emp1.day0) ...
Any idea how to select from select and still use multiple joins etc ... Hope I am clear
August 04, 2005 - 9:56 am UTC
you can join as much as you WANT in the inline views.
Sorry, I cannot go further with this one, I've shown the technique -- it is just a pivot to turn COLUMNS THAT SHOULD HAVE BEEN ROWS into rows -- very common.
A reader, August 04, 2005 - 9:52 am UTC
Please ignore above post.
I need some help
Carlos, August 09, 2005 - 10:25 am UTC
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/15/2004 17:42:56', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/18/2004 15:09:19', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/24/2004 09:38:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/30/2004 04:28:09', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/03/2005 14:36:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/05/2005 10:04:15', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/07/2005 08:54:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/10/2005 10:54:07', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/12/2005 10:13:13', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/18/2005 04:23:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/03/2005 03:15:05', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/09/2005 18:54:11', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005 21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/29/2005 17:05:04', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/30/2005 08:53:13', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 13:16:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 15:08:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
Tom,
I hope you can help since I have been struggling with this report. I would like to get something like this...
IN OTHER WORDS I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX. In this case for example for the month of March, however it can be for any given month. Any ideas how I can accomplish that?
IN_DATE OUT_DATE
3/22/2005 8:27:03 PM 3/30/2005 3:08:39 PM
----from the table above for the month of March
August 09, 2005 - 10:45 am UTC
insufficient detail here, why won't min/max work for you for example.
but I don't understand the logic behind the two values you say you want, I don't get how you arrived at them.
This is what I get
A reader, August 09, 2005 - 10:57 am UTC
select in_date, out_date
from lou_date
where id = 201048
and ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
I get the following:
In_date out_date
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
August 09, 2005 - 11:19 am UTC
ok,
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005
21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
why didn't you get that row. for example.
A reader, August 09, 2005 - 11:48 am UTC
SQL Statement which produced this data:
select in_date, out_date
from lou_date
where ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
order by out_date
3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
I guess my question is: when I get records with dates beyond March, they should be replaced
with blank or NULL... since I can't charge him/her
for April...
August 09, 2005 - 12:00 pm UTC
I am so not following you here.
A reader, August 09, 2005 - 12:24 pm UTC
Tom,
Pretend that you are charging someone for a particular month. Let's say the month of March. So you would like to do a query that reflects just that.. so a group of dates is given to you and in that group of dates you have multiple records with the same id. Also some records contain dates that initiated in March but came back in April. Here are the examples.. but it can work with any dates...
example 1.
in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
would like to see:
in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 1:16:00 PM
example 2
In_date out_date
3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
would like to see:
In_date out_date
3/3/2005 3:15:05 AM 3/15/2005 9:47:41 PM
August 09, 2005 - 12:42 pm UTC
begs the question
in_date out_date
20-feb-2005 15-apr-2005
or
in_date out_date
3/22/2005 8:27:15 PM 3/25/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
what then. Be able to clearly specify the "goal" or the "algorithm" usually leads us straight to the query itself. There are so many ambiguities here. Pretend you were actually documenting this for a junior programmer to program. Give them the specifications. In gory detail.
please don't just answer these two what thens -- think of all of the cases (cause I'll just keep on coming back with "what then" if you don't)
Remember -- I know NOTHING about your data, not a thing. This progression from
... I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS
LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX....
to this has been 'strange' to say the least.
Full explanation of requirements
A reader, August 09, 2005 - 3:10 pm UTC
Sorry for the misunderstanding Tom. Here is the full requirements. I hope I can explain it this time.
The report is a billing report and it goes as follows:
For example for the month of March we have to bill as
in the following way:
out_date date_in Bill
2/23 3/2 3/1 to 3/2
3/1 3/3 3/1 to 3/3
3/1 4/14 3/1 to 3/31
3/1 - 3/1 to 3/31
2/23 - 3/1 to 3/31
August 09, 2005 - 3:38 pm UTC
well, i hope you give your programmers more detail. Here is the best I'll do
ops$tkyte@ORA9IR1> select t.*,
2 greatest( in_date, to_date('mar-2005','mon-yyyy') ) fixed_in_date,
3 least( nvl(out_date,to_date('3000','yyyy')), last_day( to_date( 'mar-2005', 'mon-yyyy' ) ) ) fixed_out_date
4 from t
5 where in_date < last_day( to_date( 'mar-2005', 'mon-yyyy' ) )+1
6 and out_date >= to_date( 'mar-2005', 'mon-yyyy' );
IN_DATE OUT_DATE FIXED_IN_ FIXED_OUT
--------- --------- --------- ---------
03-MAR-05 09-MAR-05 03-MAR-05 09-MAR-05
11-MAR-05 15-MAR-05 11-MAR-05 15-MAR-05
22-MAR-05 29-MAR-05 22-MAR-05 29-MAR-05
22-MAR-05 30-MAR-05 22-MAR-05 30-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
6 rows selected.
predicate finds records that overlap march.
select adjusts the begin/end dates.
Thank!!!
A reader, August 10, 2005 - 12:00 pm UTC
Tom,
One more request. I would like to start the report with
the first time it went out. That is to say...
how it looks now with your help...
fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 3/31/2005
how the data looks
fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM---first went out
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
How I would like to see it since we begin billing from
the first date the truck went out.
fix_in fix_out
3/29/2005 5:05:04 PM 3/30/2005 3:08:39 PM
3/30/2005 3:08:39 PM 3/31/2005
Thanks again Tom
August 10, 2005 - 1:05 pm UTC
try to work it out yourself -- please.
why? because I'll do this little thing and it'll be "oh yeah, one more thing, when the data looks like this...."
specifying requirements is like the most important thing in the world -- it is key, it is crucial. It is obvious you know what you want (well, maybe -- it seems to change over time) but I don't "get it" myself. Your simple example here with two rows begs so so many questions, I don't even want to get started.
You have lag() and lead() at your disposal, they probably come into play here. check them out.
Thanks for help !
A reader, August 11, 2005 - 3:25 pm UTC
The report is kind of tricky. Especially when one of the dates originates in Feb. and the other pair falls in March.
Hooked on Analytics worked for me!!
Greg, August 22, 2005 - 11:15 am UTC
I think I need to find a meeting group to help with my addiction ... I think I'm addicted to analytics .. :\
Finally got a chance to read chapter 12 in "Expert Oracle" ... awesome!! 4 big, hairy Thumbs up!! heh
But I got a question ... an "odd" behaviour that I don't understand ... was wondering if you could help explain:
Test Script:
================
drop table junk2;
drop sequence seq_junk2;
create sequence seq_junk2;
create table junk2
(inv_num number,
cli_num number,
user_id number)
/
insert into junk2
values ( 123, 456, null );
insert into junk2
values ( 123, 678, null );
insert into junk2
values ( 234, 456, null );
insert into junk2
values ( 234, 678, null );
commit;
break on cli_num skip 1
select * from junk2;
select inv_num, cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
seq_junk2.nextval seq,
FIRST_VALUE ( NVL ( user_id, seq_junk2.nextval ) )
OVER ( PARTITION BY cli_num ) user_id
from junk2
/
=====================
The final query shows this:
INV_NUM CLI_NUM CHK1 CHK2 SEQ USER_ID
---------- ---------- ---------- -------- ---------- ----------
123 456 999 NULL 1
234 999 NULL 2
123 678 999 NULL 3 2
234 999 NULL 4 2
4 rows selected.
and I'm kinda confused .. it appears that the analytic functions are not "processing" that sequence ... how do sequences and analytics work together?? (if at all??)
(In short, this is a simplified example of a bigger problem I tripped over. I'm trying to assign new user_ids for existing clients, but only want 1 user_id assigned per client. Trick is, each client can be associated with more than 1 investment ... so I have multiple rows with the same client, but I want the same user_id assigned. kind of: "Has this client got an id yet? if not, give him a new one, otherwise display the one he's already been assigned".)
FIRST_VALUE and LAST_VALUE seemed the logical choice ...
The interesting thing is, when I use DBMS_RANDOM.VALUE (to assign a random PIN to start with) ... it works fine. what am I missing/forgetting about sequences that changes their behaviour in this regard?)
August 23, 2005 - 8:56 am UTC
that will be a tricky one, lots of assumptions on orders of rows processed and such.
that should throw an ora-2287 in my opinion.
I cannot see a safe way to do that without writing a plsql function and performing a lookup off to the side by cli_num
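An untested sketch of that "lookup off to the side" against the junk2/seq_junk2 example (the package name is made up; note also that NVL may evaluate its second argument even for non-null rows, which here would only cost unused sequence values):
create or replace package get_uid_pkg
as
    function get_id( p_cli_num in number ) return number;
end;
/
create or replace package body get_uid_pkg
as
    -- session-level cache: cli_num -> assigned user_id
    type id_map is table of number index by pls_integer;
    g_ids id_map;

    function get_id( p_cli_num in number ) return number
    is
    begin
        if ( not g_ids.exists( p_cli_num ) )
        then
            -- first time we see this client, draw a new id
            select seq_junk2.nextval into g_ids(p_cli_num) from dual;
        end if;
        return g_ids( p_cli_num );
    end;
end;
/
select inv_num, cli_num,
       nvl( user_id, get_uid_pkg.get_id( cli_num ) ) user_id
  from junk2;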
Sorry, I don't understand ...
Greg, August 23, 2005 - 11:36 am UTC
you wrote:
"that will be a tricky one, lots of assumptions on orders of rows processed and such."
I don't understand what assumptions I'm making ... in my example, I just got 4 rows, I don't care what order they come back in, just so long as it deals with them in "groups of cli_nums" .. (hence the partition by cli_num portion) ... if I "lose" sequence numbers, that's fine, too ... I don't care about gaps in the sequence or "missing userids" ...
The only behaviour I'm seeing, is that the analytic function doesn't seem to be working with the sequence properly ...
I guess I can simplify the question even further:
Why does the following query return "NULL" ?
SQL > select first_value ( seq_junk2.nextval ) over ( )
2 from dual
3 /
------more------
FIRST_VALUE(SEQ_JUNK2.NEXTVAL)OVER()
------------------------------------
1 row selected.
(with a "normal" sequence - nothing fancy):
SQL > select seq_junk2.nextval from dual;
------more------
NEXTVAL
----------
29
1 row selected.
August 24, 2005 - 8:35 am UTC
as i said, i believe it should be raising an error (I have it on my list of things to file when I get back in town).
I cannot make it work, I cannot think of a way to do it in a single statement, short of writing a user defined function.
Connect by with self referenced parent
Joe, August 23, 2005 - 12:30 pm UTC
CONNECT BY works great but I've run into a problem when the ultimate parent is referenced in the parent record. e.g., data looks like:
SQL> select * from t;
OBJ_ID PARENT_ID
---------- ----------
1 1
2 1
3 1
4 2
5 4
But... using connect by generates an error..
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
2 from t
3 connect by t.parent_id = prior t.obj_id;
ERROR:
ORA-01436: CONNECT BY loop in user data
If parent_id is null where obj_id = 1, then it's okay. Any suggestion on how to handle the other case? I'm stumped.
Solution for connect by
Logan Palanisamy, August 23, 2005 - 5:39 pm UTC
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
2 from t
3 connect by t.parent_id = prior t.obj_id and t.parent_id <> t.obj_id;
LEVEL OBJ_ID PARENT_ID
-------------------- ---------- ----------
1 1 1
2 2 1
3 4 2
4 5 4
2 3 1
1 2 1
2 4 2
3 5 4
1 3 1
1 4 2
2 5 4
1 5 4
12 rows selected.
re:Solution for connect by
Joe, August 24, 2005 - 8:43 am UTC
Thanks Logan. Often the solution is so simple! Thanks.
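Worth noting: without a START WITH clause, that CONNECT BY starts a tree at every row -- hence the 12 rows above instead of 5. If only the tree under the self-referencing root is wanted, an untested refinement (assuming the root is the row whose parent_id equals its own obj_id):
select lpad(' ', 2*(level-1)) || level "LEVEL", t.obj_id, t.parent_id
  from t
 start with t.obj_id = t.parent_id
connect by t.parent_id = prior t.obj_id
       and t.parent_id <> t.obj_id;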
Seq problem
Bob B, August 24, 2005 - 11:25 am UTC
SELECT
A.*,
seq_junk2.currval CURR_SEQ,
seq_junk2.nextval - ROWNUM + VAL SEQ
FROM (
SELECT
inv_num,
cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
DENSE_RANK() OVER ( ORDER BY CLI_NUM ) VAL
FROM JUNK2
) A
Might be a starting point. It works on the following ASSUMPTION: ROWNUM corresponds to the number of times the sequence has been called. As Tom stated, this assumption can easily go out the window (throw an analytic function or an order by on the outer query for a simple example).
A safer solution might be to run two updates. Update 1 will give a unique id to each null user id. Update 2 will update the user id to the min or max user id for that cli_num. A little overhead, but safer and simpler than the aforementioned alternative.
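For concreteness, those two updates might look like this against the junk2 example (a sketch, taking min() as the rule):
-- update 1: every null user_id gets its own new id from the sequence
update junk2
   set user_id = seq_junk2.nextval
 where user_id is null;

-- update 2: collapse to one id per client
update junk2 j
   set user_id = ( select min(user_id)
                     from junk2 j2
                    where j2.cli_num = j.cli_num );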
Still confused ... but working on it ...
Greg, August 24, 2005 - 1:42 pm UTC
Thanks, Bob!! Yeah, that does exactly what I wanted it to do, (but still doesn't really explain the "why" part) ...
problem is, it looks like this is more a question on sequences now than analytics, so I'll see if I can find a more appropriate thread to continue this on ..
Thanks!!
A slight twist on lag/lead
Sudha Bhagavatula, September 01, 2005 - 11:08 am UTC
That was useful to me. Could do a lot of queries easily. However I'm stuck at this point.
I have data like this:
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2000
1001 001 2112 01/01/2001 06/30/2001
1001 001 2112 07/01/2001 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 06/30/2004 12/31/9999
I want my output to look like this:
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 06/30/2004 12/31/9999
How do I achieve this ?
September 01, 2005 - 3:49 pm UTC
well, you should start by describing the logic for getting from A to B first.
otherwise it is just text. what are the rules that got you from inputs to outputs.
tell me the procedural algorithm you would use for example.
Rules from A to B
Sudha Bhagavatula, September 02, 2005 - 9:29 am UTC
A member is enrolled in a group for a timeframe. For all contiguous time frames for a group I can take the min(eff_date) and max(term_date). For each break in group a new row with min(eff_date) and max(term_date) again. So say a member was enrolled in a group from 01/01/2001 to 12/31/2001 and then again with the same group from 01/01/2005 to 06/30/2005 then I need 2 rows for this member
with the dates as said just now. This is the sql that I'm running, hopefully I'm on the right track but am stuck at this point:
SELECT SUBR_ID,
DEP_NBR,
GRP,
LAG_EFF_DATE,
LEAD_EFF_DATE,
EFF_DATE,
TERM_DATE,
LAG_TERM_DATE,
LEAD_TERM_DATE,
DECODE( LEAD_GRP, GRP, 1, 0 ) FIRST_OF_SET,
DECODE( LAG_GRP, GRP, 1, 0 ) LAST_OF_SET
FROM (SELECT M.SUBR_ID,
M.DEP_NBR,
LAG(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_GRP,
LEAD(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_GRP,
GRP_NBR||SUB_GRP GRP,
CJ.EFF_DATE,
CJ.TERM_DATE,
LAG(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_EFF_DATE,
LEAD(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_EFF_DATE,
LAG(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_TERM_DATE,
LEAD(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_TERM_DATE
FROM DW.T_MEMBER_GROUP_JUNCTION CJ,
BCBS.T_GROUP_DIMENSION G,
BCBS.T_MEMBER_DIMENSION M
WHERE CJ.GRP_DIM_ID = G.GRP_DIM_ID
AND CJ.MBR_DIM_ID = M.MBR_DIM_ID
AND M.DEP_NBR != '000'
AND G.BENE_PKG IS NOT NULL)
WHERE LAG_GRP IS NULL
OR LEAD_GRP IS NULL
OR LEAD_GRP <> GRP
OR LAG_GRP <> GRP
Thanks for your reply.
September 03, 2005 - 7:15 am UTC
you know, without a table, rows and something more concrete.... I have no comment.
More detail
Sudha Bhagavatula, September 04, 2005 - 10:34 pm UTC
have 3 tables:
Member_dimension
Group_Dimension
Member_Group_Junction
Member_Dimension :- columns are mbr_dim_id, subr_id, dep_nbr
Group dimension :- columns are grp_dim_id, grp_nbr, sub_grp
Member_Group_Junction :- columns are mbr_dim_id, grp_dim_id, eff_date, term_date
I have to create one row for each contiguous dates of enrollment with a new row for a new group or a break in date.
Suppose a member (subr_id = 1001, dep_nbr = 001) is enrolled with a group called 001 from 01/01/2001 till 06/30/2001; he then changes group to 002 for the period 07/01/2001 till 12/31/2001. He enrolls with the same group 002 from 01/01/2002 till 06/30/2002 with a change in benefits. He then gets transferred to some other city or changes jobs. He joins back with the group 001 from 09/30/2003 till 11/30/2003 and quits again. He joins back with the same group 001 from 01/01/2004 till present. The data in the junction table will be like this:
mbr_dim_id grp_dim_id eff_date term_date
1 1 01/01/2001 06/30/2001
1 2 07/01/2001 12/31/2001
1 2 01/01/2002 06/30/2002
1 1 09/30/2003 11/30/2003
1 1 01/01/2004 12/31/9999
My output should be like this:
mbr_dim_id grp_dim_id eff_date term_date
1 1 01/01/2001 06/30/2001
1 2 07/01/2001 06/30/2002
1 1 09/30/2003 11/30/2003
1 1 01/01/2004 12/31/9999
For each change in group or a break in the contiguity of the dates I should get a new row. The junction table is joined to the dimension with the respective dim_ids.
Hope I'm clearer this time.
Thanks
Sudha
September 05, 2005 - 10:11 am UTC
tell you what, see
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
it shows a technique in the analytics to the rescue article that will be useful for grouping ranges a records using the LAG() function.
But, you need to read the text that you are supposed to read before putting an example here.
It is something I think I say a lot.
<quote>
If your followup requires a response that might include a query, you had better supply very very simple create tables and insert statements. I cannot create a table and populate it for each and every question. The SMALLEST create table possible (no tablespaces, no schema names, just like I do in my examples for you)
</quote>
that is a direct cut and paste
distinct last_value
Putchi, September 06, 2005 - 4:49 am UTC
When using last_value I am usually only interested in the last value, hence I need a distinct in the select to get it. It gives what I want, but it seems that the database has to do the work twice, first a window sort and after that a unique sort. Is there any way to avoid the distinct but still only get one row per partition key?
create table a (num number(2), var1 varchar2(10), var2 varchar2(10));
insert into a values (1,'a','A');
insert into a values (2,'b','A');
insert into a values (3,'c','A');
insert into a values (1,'a','B');
insert into a values (2,'b','B');
insert into a values (3,'c','B');
commit;
SQL> select distinct
2 var2
3 ,last_value(var1) over (partition by var2 order by num
4 rows between unbounded preceding and unbounded following) var1
5 from a;
VAR2 VAR1
---------- ----------
A c
B c
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (UNIQUE)
2 1 WINDOW (SORT)
3 2 TABLE ACCESS (FULL) OF 'A'
September 06, 2005 - 8:31 am UTC
nope, analytics are not aggregates, aggregates are not analytics.
A trick you can use to skip one or the other step is:
ops$tkyte@ORA817DEV> select var2,
2 substr( max(to_char( num,'fm0000000000') || var1), 11 ) data
3 from a
4 group by var2
5 /
VAR2 DATA
---------- -----------
A c
B c
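Another option worth knowing here (a sketch against the same table): the FIRST/LAST aggregate -- max(...) keep (dense_rank last order by ...) -- also returns one row per key with a single group by, avoiding the distinct:
select var2,
       max(var1) keep (dense_rank last order by num) var1
  from a
 group by var2;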
Analytics to the rescue
Sudha Bhagavatula, September 06, 2005 - 11:28 am UTC
Read that article. Helped me, but now I have another twist.
Create table contracts (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date);
insert into contracts values ('1001', '001', '2112', to_date('01/01/2000','mm/dd/yyyy'), to_date('12/31/2000','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2001','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('07/01/2001','mm/dd/yyyy'), to_date('12/31/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '7552', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/2003','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2004','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
I ran this query to identify breaks in groups and dates for the above table:
select subr_id, dep_nbr, grp,
min_eff_date,
max_term_date
from
(select subr_id, dep_nbr, grp,
min(eff_date) min_eff_date,
max(term_date) max_term_date
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
max(rn)
over(partition by subr_id, dep_nbr order by eff_date) max_rn
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
(case
when eff_date-lag_term_date > 1
or lag_term_date is null
or lag_grp_nbr is null
or lag_grp_nbr <> grp
then row_num
end) rn
from (
select subr_id, dep_nbr, eff_date, term_date, grp_nbr grp,
lag(term_date)
over (partition by subr_id, dep_nbr order by eff_date) lag_term_date,
lag(grp_nbr)
over (partition by subr_id, dep_nbr order by eff_date) lag_grp_nbr,
row_number()
over (partition by subr_id, dep_nbr order by eff_date) row_num
from contracts )))
group by subr_id, dep_nbr, grp, max_rn )
order by subr_id, dep_nbr, min_eff_date
This gave me the output as :
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 01/01/2004 12/31/9999
I now have another table :
create table contract_pcp_junction (subr_id varchar2(15), dep_nbr varchar2(3), pcp_id varchar2(12), eff_date date, term_date date)
insert into contract_pcp_junction values('1001','001','123765', to_date('07/01/2000','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contract_pcp_junction values('1001','001','155165', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
This table identifies the provider coverage for each member. I need to identify the breaks in coverage with regards to the contracts.
Now as per the data above this member does not have a pcp from 01/01/2000 to 06/30/2000 and again from 07/01/2001 to 12/31/2001.
I need to insert the breaks into another table. This table needs to have the subr_id, dep_nbr, grp and eff_date, term_date.
create table contract_pcp_breaks (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)
This table needs to have the data for the breaks
subr_id dep_nbr grp_nbr eff_date term_date
1001 001 2112 01/01/2000 06/30/2000
1001 001 2112 07/01/2001 12/31/2001
How do I do that? Hopefully I have supplied the necessary scripts for you to work with.
Thanks a lot for your patience with this.
--Sudha
September 06, 2005 - 8:51 pm UTC
yah, I have scripts, but no real idea how these tables relate. Your query looks overly complex for the single table.
cannot you take your data, join it, get some "flat relation" that just simply using lag() on will solve the problem?
(please remember, you have been looking at this for hours. To you this data is natural. to everyone else, it is just bits and bytes on the screen)
Combining two tables
Putchi, September 09, 2005 - 6:39 am UTC
Hi Tom!
I want to combine from/to history values from two tables into one sequence like this:
create table a (a varchar2(2)
,from_date date
,to_date date);
create table b (b varchar2(2)
,from_date date
,to_date date);
insert into a ( a, from_date, to_date ) values (
'a1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date('02/10/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a2', to_date( '02/10/2005', 'mm/dd/yyyy'), to_date( '05/01/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a3', to_date( '05/01/2005', 'mm/dd/yyyy'), to_date( '08/12/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date( '01/22/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b2', to_date( '01/22/2005', 'mm/dd/yyyy'), to_date( '04/01/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b3', to_date( '04/01/2005', 'mm/dd/yyyy'), to_date( '09/07/2005', 'mm/dd/yyyy'));
commit;
select * from ("Magic");
A B FROM_DATE TO_DATE
-- -- ---------- ----------
a1 b1 2005-01-13 2005-01-22
a1 b2 2005-01-22 2005-02-10
a2 b2 2005-02-10 2005-04-01
a2 b3 2005-04-01 2005-05-01
a3 b3 2005-05-01 2005-08-12
Is it possible?
September 09, 2005 - 8:30 am UTC
ops$tkyte@ORA10G> select a.* , b.*,
2 greatest(a.from_date,b.from_date),
3 least(a.to_date,b.to_date)
4 from a, b
5 where a.from_date <= b.to_date
6 and a.to_date >= b.from_date;
A FROM_DATE TO_DATE B FROM_DATE TO_DATE GREATEST( LEAST(A.T
-- --------- --------- -- --------- --------- --------- ---------
a1 13-JAN-05 10-FEB-05 b1 13-JAN-05 22-JAN-05 13-JAN-05 22-JAN-05
a1 13-JAN-05 10-FEB-05 b2 22-JAN-05 01-APR-05 22-JAN-05 10-FEB-05
a2 10-FEB-05 01-MAY-05 b2 22-JAN-05 01-APR-05 10-FEB-05 01-APR-05
a2 10-FEB-05 01-MAY-05 b3 01-APR-05 07-SEP-05 01-APR-05 01-MAY-05
a3 01-MAY-05 12-AUG-05 b3 01-APR-05 07-SEP-05 01-MAY-05 12-AUG-05
It won't be blindingly fast on huge things I would guess...
Putchi, September 09, 2005 - 9:14 am UTC
OK, I will try it and see if it works; the real tables will have hundreds of thousands of records. I tried this myself, but I couldn't come up with something that filled in the "null" values.
SQL> select a,b,from_date,lead(from_date) over (order by from_date)
2 from (
3 select a,null b,from_date,to_date from a
4 union all
5 select null a,b,from_date,to_date from b
6 order by from_date
7 );
A B FROM_DATE LEAD(FROM_
-- -- ---------- ----------
a1 2005-01-13 2005-01-13
b1 2005-01-13 2005-01-22
b2 2005-01-22 2005-02-10
a2 2005-02-10 2005-04-01
b3 2005-04-01 2005-05-01
a3 2005-05-01
September 09, 2005 - 9:36 am UTC
that query won't work -- you need to join: the union gives each interval only once, but a single row from a can overlap several rows from b (and vice versa), and only a join produces all of those pairings.
How to get the 1ST row of this distinct value in a single SELECT
Sean Chang, September 16, 2005 - 11:48 am UTC
Thank you, Tom.
I have been reading about analytic functions for a while, but still
can't figure out a way to select the first row of a distinct
column value in a single SELECT statement, i.e.
>>by running below Create and Insert
create table INV (
inv# number(7),
add_time date ,
inv_type varchar2(10),
amount number(8,2));
insert into inv values(1, sysdate-1, 'CASH', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(1, sysdate+1, 'COD', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(2, sysdate, 'MC', 10);
insert into inv values(3, sysdate-1, 'AMEX', 30);
insert into inv values(3, sysdate, 'CASH', 30);
I can get the first row of distinct INV# this way:
select * from (select a.*,
rank() over (partition by inv# order by add_time) time_order
from inv a) where time_order=1;
But how can I achieve this in a single SELECT statement?
The reason is that we have lots of tables where we only need
to look at the very first row of the same column value, and I
don't want to end up with lots of in-line views in the SELECT
statement.
September 16, 2005 - 1:59 pm UTC
that is a single select.
why not? (on the lots of in-line views). If you think they are evil - then you wouldn't like my code ;)
Is analytical fitting in this situation?
A reader, October 03, 2005 - 10:29 am UTC
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
October 03, 2005 - 11:29 am UTC
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
should be a single query without union's - you don't need to make three passes on that data
select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)
Great!
A reader, October 03, 2005 - 4:05 pm UTC
Tom,
When I put in the changes, it says "missing keyword". What am I doing wrong?
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id))
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
October 03, 2005 - 8:57 pm UTC
sorry, I am not a sql compiler, I cannot reproduce since I don't have the tables or anything.
Case when ... then ... end
Greg, October 04, 2005 - 8:26 am UTC
Just lucked out and saw this:
"select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"
Should be:
"select ..., count(distinct case when damage_code = 'A' then gate_id end),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"
Tom just missed the "end" for the case statement ... (I got lucky and spotted it .. heh)
October 04, 2005 - 4:25 pm UTC
(that is why i always ask for create tables and inserts - without them, it is not possible to test)
thanks!!
A reader, October 04, 2005 - 2:24 pm UTC
Well Taken
A reader, October 05, 2005 - 10:53 am UTC
Tom,
This is what I would like to see..
damage_inspection_date damage_inspection_by counts
xx/xx/xxxx Louis 2 minors
xx/xx/xxxx juan 1 major
thanks.
can analytics help me?
Susan, October 05, 2005 - 2:41 pm UTC
My result set must be ordered by the sum of multiple columns, with weights assigned to the columns. The SQL below works and gives me what I want, but maybe there is an analytic function solution? Thanks for all your help.
SELECT ename, job, sal, comm FROM scott.BONUS
ORDER BY DECODE(job, -2, 0, job)*100000+DECODE(sal, -2, 0, sal)*10000+DECODE(comm, -2,0,comm)*100 DESC
October 05, 2005 - 3:05 pm UTC
not in this case - you want to order by a simple function of attributes of a single row.
You don't need to look across rows - analytics look across rows.
Thanks Tom
Susan, October 05, 2005 - 3:58 pm UTC
Thanks for your reply. Do you agree with the DECODE approach or am I missing a more elegant solution?
October 05, 2005 - 8:23 pm UTC
the decode looks fine here - shorter than case but in this "case" just as easy to read.
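For comparison, the CASE spelling of that ORDER BY would be (a direct transliteration of the DECODEs, not tested here):
SELECT ename, job, sal, comm FROM scott.BONUS
ORDER BY CASE job  WHEN -2 THEN 0 ELSE job  END * 100000
       + CASE sal  WHEN -2 THEN 0 ELSE sal  END * 10000
       + CASE comm WHEN -2 THEN 0 ELSE comm END * 100 DESC;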
Tom
A reader, October 05, 2005 - 4:25 pm UTC
Tom,
Can you please point me in the right direction...
This is what I am getting with the following query...
damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR
and this is what I would like to get....
damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT ab.damage_inspection_date,
damage_inspection_by,
STATUS_CODE,
count(distinct case when ac.damage_location_code = 'A' then ab.gate_id end),
count(distinct case when ac.damage_location_code = 'F' then ab.gate_id end),
count(distinct ab.gate_id )
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by ab.damage_inspection_date,ab.damage_inspection_by,status_code, ab.gate_id))a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
group by (b.damage_inspection_date, b.damage_inspection_by,b.status)
October 05, 2005 - 8:35 pm UTC
....
damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR
and this is what I would like to get....
damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1
...
by what "logic"? can you explain how you get from A to B?
follow up
A reader, October 06, 2005 - 9:48 am UTC
Tom,
I already got the first part done. All I need is to somehow show the count in another column - how many minor, major and total I have. Is that possible?
Maybe just like in the second example.
October 06, 2005 - 11:54 am UTC
first part of WHAT?
more information
A reader, October 06, 2005 - 12:47 pm UTC
Sorry about the lack of information before.
Here I will try to do better. I am trying to write
a query where I need to count the major and minor damages
and then get a total.
requirements:
1. if a container has majors and minors, total the
counts = major + minor = total count
2. where a container has minors and no majors, count the minors only.
count = minor
inspector                  major  minor  total
1 major, 0 minor, other        1             1
2 major, 1 minor, other        2      1      3
0 major, 1 minor, other        0      1      1
October 06, 2005 - 1:25 pm UTC
sorry -- going back to your original example, I still cannot see the logic behind "what I have" and "what I want" there.
I don't know what you mean by "i have the first part"
this what I have now
A reader, October 06, 2005 - 2:11 pm UTC
Tom,
This is my query and result...
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
RESULT:
SQL Statement which produced this data:
select * from MAJOR_MINOR_COUNT_VIEW
where rownum < 10
6/12/2004 CCCT TOTAL 1
6/12/2004 CRAIG TOTAL 6
6/13/2004 CCCT TOTAL 5
6/14/2004 CCCT TOTAL 46
6/14/2004 FYFE TOTAL 30
6/14/2004 HALM TOTAL 38
6/14/2004 MUTH MAJOR 2
6/14/2004 MUTH MINOR 14
6/14/2004 MUTH TOTAL 40
AND I WOULD LIKE TO HAVE IT AS PER
THE REQUIREMENTS ABOVE... HOPE THIS HELPS.
October 06, 2005 - 2:57 pm UTC
take your query - call it Q
select inspector,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (Q)
group by inspector
Year to dt + month to date
reader, October 06, 2005 - 2:35 pm UTC
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id, TO_CHAR(invoice_dt, 'MON-YYYY') ORDER BY TO_CHAR(sale_dt, 'MON-YYYY') ASC) mon_sal,
SUM (sale_price_usd * qty_sold)
OVER (PARTITION BY id, TO_CHAR(sale_dt, 'YYYY') ORDER BY TO_CHAR(sale_dt, 'YYYY') ASC) yr_sal,
FROM test
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 237
aa 14-OCT-05 121.5 164.75 237
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 1298.25
bb 13-OCT-05 235.25 600.5 1298.25
bb 15-OCT-05 365.25 600.5 1298.25
bb 14-NOV-05 465.25 465.25 1298.25
7 rows selected.
Ideally ,it should have been ----
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25
How can I do this ?
Will appreciate your help .
THANKS
October 06, 2005 - 3:02 pm UTC
ideally - there would be a qty_sold column somewhere :)
ideally you will ONLY use to_char to *format* data, never to process it.
trunc(invoice_dt,'y') NOT to_char(invoice_dt,'yyyy')
trunc(sale_dt,'mm') NOT to_char(sale_dt, 'MON-YYYY' )
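Concretely, the month-to-date column then becomes something like (just a sketch of the partition/order change - the duplicate-date handling is worked out in the followups below):
SUM(amount) OVER (PARTITION BY id, TRUNC(sale_dt,'MM')
                  ORDER BY sale_dt) mon_sal
TRUNC keeps the column a DATE, so the ORDER BY stays chronological; TO_CHAR(sale_dt,'MON-YYYY') would sort alphabetically.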
Year to Date and Month to date
READER, October 06, 2005 - 10:04 pm UTC
As per your suggestion, I made the changes, but... I still need your help.
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'MM') ORDER BY trunc(sale_dt,'MM') ASC) mon_sal,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'Y') ORDER BY trunc(sale_dt,'Y') ASC) yr_sal
FROM test
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------------------- ---------- ---------- ----------
aa 9/15/2005 72.25 72.25 237
aa 10/14/2005 193.75 164.75 237
aa 10/19/2005 237 164.75 237
bb 9/14/2005 232.5 232.5 1298.25
bb 10/13/2005 467.75 600.5 1298.25
bb 10/15/2005 833 600.5 1298.25
bb 11/14/2005 1298.25 465.25 1298.25
7 rows selected
Ideally ,it should have been ----
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25
Thanks again .
October 07, 2005 - 8:13 am UTC
you shall have to explain how you derived your "optimal" output.
it certainly isn't sorted by anything, and I don't get the numbers.
Year to date /Month to date
Reader, October 07, 2005 - 9:49 am UTC
I wish to create a summary table where we will have the sale for every day, the sale up to that day in that month, and then up to that day in that year -
i.e. a running or cumulative total
Thanks
October 07, 2005 - 8:22 pm UTC
ok?
Follow up
A reader, October 07, 2005 - 9:52 am UTC
Tom,
The above pivot worked well, however my counts are off since
I ONLY want to count the minor when there is no Major.
Something like this..
                           major  minor  count
1 major, 0 minor, other        1             1
2 major, 1 minor, other        2             2
0 major, 1 minor, other        0      1      1
* count the minor when there is no major
CREATE TABLE GATE_CONTAINERS
(
GATE_ID NUMBER,
VISIT NUMBER,
REFERENCE_ID NUMBER,
DAMAGE_INSPECTION_BY VARCHAR2(30),
DAMAGE_INSPECTION_DATE DATE
)
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(1, 1);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(17, 10);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(21, 12);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(31, 18);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(33, 19);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(36, 22, TO_DATE('06/12/2004 11:48:49', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(37, 23, TO_DATE('06/12/2004 11:50:11', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(39, 25, TO_DATE('06/12/2004 11:48:19', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(45, 30);
COMMIT;
CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
COMMIT;
October 07, 2005 - 8:35 pm UTC
The above pivot worked well, however my counts are off since
I ONLY want to count the minor when there is no Major.
Something like this..
                           major  minor  count
1 major, 0 minor, other        1             1
2 major, 1 minor, other        2             2
0 major, 1 minor, other        0      1      1
so tell me why there are minor counts when major > 0???
and this is my query
A reader, October 07, 2005 - 9:53 am UTC
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
I got it....
A reader, October 07, 2005 - 11:16 am UTC
Tom,
I got it... I just had to put in the following. Let me know
what you think, if you have any suggestions!
Thanks for all your patience...
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,z.damage_type_code,
( case z.damage_type_code
when 'F' then 0
ELSE Count(distinct g.gate_id)
end ) CNT
--- count(distinct g.gate_id) cnt
from gate_containers g,gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
Year to Date and Month to date
Tim, October 07, 2005 - 11:52 pm UTC
Just a guess - could this be what you're looking for?
SELECT DISTINCT id,sale_dt
,SUM (amount) OVER
(PARTITION BY id, sale_dt ORDER BY id ASC, sale_dt ASC) sale_daily
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM TEST
ORDER BY id, sale_dt
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.75
bb 15-OCT-05 365.25 600.5 833
bb 14-NOV-05 465.25 465.25 1298.25
-- another variation
SELECT a.*
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM
(
SELECT id,sale_dt
,SUM(amount) sale_daily
FROM TEST
GROUP BY id, sale_dt
) a
EXCELLENT !
reader, October 08, 2005 - 12:24 am UTC
Thanks !
to answer your question
A reader, October 11, 2005 - 12:54 pm UTC
The above pivot worked well, however my counts are off since
I ONLY want to count the minor when there is no Major.
Something like this..
                           major  minor  count
1 major, 0 minor, other        1             1
2 major, 1 minor, other        2             2
0 major, 1 minor, other        0      1      1
so tell me why there are minor counts when major > 0???
because when there are majors and minors I want to count
only the majors; when there is just a minor and no
major, I just want to count the minor. Those are the only 2 situations there should be.
How can I ignore some selected columns in my group by?
Neil, October 12, 2005 - 3:14 am UTC
Tom,
I have a set of data that is recorded daily and I want to
compress it; so this:
87654321 1 5 21-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 22-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 23-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 24-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 25-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 26-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 27-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 28-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 31-AUG-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 01-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 02-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 03-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 04-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 05-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 06-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 07-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 08-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 09-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 10-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 11-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 12-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 13-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 14-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 15-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 16-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 17-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 18-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 19-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 20-SEP-2005 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10
87654321 1 5 21-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 22-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 23-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 24-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 25-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 26-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
Needs to be converted into this:
87654321 1 5 21-AUG-2005 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 12-SEP-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 13-SEP-2005 19-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 20-SEP-2005 01-JAN-4000 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10
The column of interest is the 7th one. Whenever it changes,
I want to create a new row beginning with the day's date,
and ending either on the day before the next change, or, if
there is no next change (LEAD analytic function), substitute
in 01-JAN-4000 to show that this is the current amount.
The problem is, I need to ignore the other figures in columns 5 & 6 and 8 & 9. If I group by all the columns, I get separate
entries for these lines. That's my stumbling block - I have got close with analytics, but so far, no cigar!
I'm on 8.1.7, although I'd be interested in solutions possible in
later versions, too. If you think this is possible, I can paste a create table and SQL*Loader script here, but it would detract from the post: It's a bit of a mess anyway - if only AskTom allowed 132 columns :)
T.I.A
October 12, 2005 - 7:26 am UTC
"i want to create a new row" - that is hard as analytics don't "create rows", they just don't "squish them out" like an aggregate wou.d
make the example smaller - you don't need all of the columns, seems two or three might suffice. show the table, the data (via inserts) and the expected output if you like.
Maybe this should be a GROUP BY question, then
Neil, October 13, 2005 - 4:12 am UTC
OK - here's the table creation scripts and a couple of loader files.
My goal is to create a SQL statement to change the old data into the new.
I can use analytics to give me the start and end dates, but my problem is that I wish to ignore the actin, actout, expin and expout columns and concentrate on the act column. When it changes, I want to take the row, and give it an end date of the day before the date on which it changes again, or the default date of 01-JAN-4000 if no such row exists.
If I could just partition by the earliest date and the latest date where the act figure is the same within serial, volume and part, I could pick off the FIRST and the LAST and use LAG and LEAD to work out the dates...
CREATE TABLE t_old (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,ASAT DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_old
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,ASAT
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 25-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 26-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 31-AUG-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 01-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 02-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 04-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 05-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 1 5 07-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 08-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 14-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 15-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 16-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 420 5 17-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 18-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 19-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 20-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 21-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 23-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 664 5 28-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 0 -2750000 1478550 0 -2750000 1478550
DEPOT1 08226111 664 5 31-AUG-2005 0 0 1478550 0 0 1478550
DEPOT1 08226111 664 5 01-SEP-2005 0 0 1478550 0 0 1478550
CREATE TABLE t_new (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,FROM_D DATE
,UNTIL_D DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_new
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,FROM_D
,UNTIL_D
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 01-JAN-4000 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 02-SEP-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 05-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 01-JAN-4000 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 15-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 16-SEP-2005 18-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 19-SEP-2005 21-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 01-JAN-4000 0 -555555 0 0 -555555 0
DEPOT1 08226111 664 5 28-AUG-2005 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 01-JAN-4000 0 -2750000 1478550 0 -2750000 1478550
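The transformation Neil describes is the same "start of group" pattern used earlier in this thread: mark a row whenever ACT changes within DEPOT/SERIAL/VOLUME/PART, carry the marker forward with MAX() OVER, then aggregate per group. A sketch under that reading of the rules (the 420-to-495 volume boundary in the expected output suggests the real boundary rule may be looser, so treat this as a starting point); only ACT and ACTIN are carried here - the other columns follow the same min/substr pattern:
select depot, serial, volume, part,
       min(asat) from_d,
       nvl( lead(min(asat)) over
                (partition by depot, serial, volume, part
                     order by min(asat)) - 1,
            to_date('01-JAN-4000','DD-MON-YYYY') ) until_d,
       -- pick ACT and ACTIN off the first row of each group
       to_number(substr(min(to_char(asat,'yyyymmdd')||act), 9)) act,
       to_number(substr(min(to_char(asat,'yyyymmdd')||actin), 9)) actin
  from (
        select x.*,
               max(grp_start) over
                   (partition by depot, serial, volume, part
                        order by asat) grp
          from (
                select t.*,
                       case when lag(act) over
                                     (partition by depot, serial, volume, part
                                          order by asat) is null
                              or lag(act) over
                                     (partition by depot, serial, volume, part
                                          order by asat) <> act
                            then row_number() over
                                     (partition by depot, serial, volume, part
                                          order by asat)
                       end grp_start
                  from t_old t
               ) x
       )
 group by depot, serial, volume, part, grp
 order by serial, volume, min(asat)
/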
Some Help needed!
A reader, October 18, 2005 - 10:54 am UTC
Tom,
How can I count the double moves as 1? For example,
in the case of 1690371:
I want to count
1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L
I want to count "A" AS ONE MOVE using this query
select trunc(g.damage_inspection_date) damage_inspection_date,g.damage_inspection_by, 'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and g.DAMAGE_INSPECTION_BY = 'COLUMBO'
and trunc(G.damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
1690355 59 A
1690355 59 E
1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A
1690405 71 I
1690433 71 A
1690433 71 I
1690486 54 F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A
1690540 59 E
1690636 63 A
1690636 63 X
1690781 67 X
One solution
A reader, October 19, 2005 - 9:29 am UTC
Tom,
Can decode work here...
decode(count(distinct g.gate_id,'A','F',0,NULL)
October 19, 2005 - 9:45 am UTC
I didn't really understand the question right above, nor did I see any table creates or inserts, so I sort of ignored it...
More information
A reader, October 19, 2005 - 10:44 am UTC
create table gate_containers
(gate_id number,
action varchar2(5),
damage_inspection_date date,
damage_inspection_by varchar2(30))
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686439, 'RNC', TO_DATE('06/14/2005 11:16:16', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688372, 'RNC', TO_DATE('06/14/2005 13:26:59', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688374, 'RNC', TO_DATE('06/14/2005 13:27:08', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688235, 'RNC', TO_DATE('06/14/2005 13:18:15', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688609, 'RNC', TO_DATE('06/14/2005 13:43:35', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686827, 'RNC', TO_DATE('06/14/2005 11:42:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688508, 'RNC', TO_DATE('06/14/2005 13:36:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686044, 'RNC', TO_DATE('06/14/2005 10:50:47', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685720, 'RNC', TO_DATE('06/14/2005 10:27:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686276, 'RNC', TO_DATE('06/14/2005 11:05:23', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)
--
--SQL Statement which produced this data:
-- SELECT * FROM GATE_DAMAGES
-- WHERE ROWNUM < 20
--
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '11', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '22', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '24', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '11', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '22', 'D');
COMMIT;
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686279, 'RNC', TO_DATE('06/14/2005 11:05:34', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686285, 'RNC', TO_DATE('06/14/2005 11:05:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685831, 'RNC', TO_DATE('06/14/2005 10:36:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685417, 'RNC', TO_DATE('06/14/2005 10:06:00', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685579, 'RNC', TO_DATE('06/14/2005 10:17:18', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685828, 'RNC', TO_DATE('06/14/2005 10:34:44', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686007, 'RNC', TO_DATE('06/14/2005 10:47:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686131, 'RNC', TO_DATE('06/14/2005 10:56:42', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688019, 'RNC', TO_DATE('06/14/2005 13:05:56', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
COMMIT;
Tom,
Let me see if I can explain it better now. I am looking for
a report like this:
damages_inspection_date damages_inspection_by cnt minor major total
6/12/2005 MUTH XX XXX XX XX
requirements
MINOR = A
MAJOR = F
TOTAL != 'C'
and also, very importantly, when there is a major and a minor
I should only count the major, thus ignoring the minor.
October 19, 2005 - 12:34 pm UTC
sorry - but I'll need much more "text" than that. Remember, I haven't been staring at these tables for hours/days, I'm not familiar with your vernacular, I don't know what problem you are trying to solve.
spec it out like we used to in the olden days - someone wrote spec (requirements) and someone else might have written the code from the spec.
follow up
A reader, October 19, 2005 - 2:51 pm UTC
GATE_ID DAMAGE_TYPE_CODE
1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
I think this is a good example. In this case I got
A = MINOR DAMAGES
F = MAJOR DAMAGES
TOTAL = NOT EQUAL TO C ( !C)
1. If you look at it closely you can see that for
every gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371
2 When you have multiples MINORS (F's)
Like gate_id = 1690486 count them as one
3 When you have gate_id with F and A like
Gate_id = 1687912 then just Count F(MAJORS)
damages_inspection_date damages_inspection_by cnt minor major Total
6/12/2005 MUTH XX XXX XXX XX
October 19, 2005 - 4:30 pm UTC
you seem to be using F as major:
F = MAJOR DAMAGES
but also as minor:
When you have multiples MINORS (Fs)
sorry, I'm not being a "hard whatever", I'm not getting it. step back, pretend you were trying to explain this to your mom.
this is what I got so far
A reader, October 19, 2005 - 3:02 pm UTC
Tom,
This is what I got so far, but the query is
not following the rule for the MINOR DAMAGES..
----------------------------------------
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
SORRY!!
A reader, October 19, 2005 - 3:05 pm UTC
COPIED THE WRONG QUERY...
This is what I got so far, but the query
is not following the rules for the MINORS
as stated above.
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
follow up
A reader, October 19, 2005 - 4:47 pm UTC
1. If you look at it closely you can see that for
every gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371
2 When you have multiples MINORS (F's)
Like gate_id = 1690486 count them as one
3 When you have gate_id with F and A ALIKE
Gate_id = 1687912 then just Count F(MAJORS)
F FOR MAJOR
A FOR MINOR
SOMETIMES A RECORD WILL HAVE MAJORS AND MINORS;
I JUST WANT TO COUNT THE MAJOR AND IGNORE THE MINORS.
I HAVE GIVEN 3 EXAMPLES OF THE RULES... DON'T KNOW
WHAT ELSE TO SAY... ALSO PLEASE LOOK AT THE QUERY -
IT'S ALL IN THE UNION. THE ONLY PROBLEM THAT I HAVE
IS THAT I AM COUNTING THE MAJORS AND THE MINORS IN
THE MINOR UNION.
October 19, 2005 - 4:57 pm UTC
1) Ok, I'm looking at that gate id:
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2) Ok, I'm looking at that gate id:
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get it?
3) Ok, I'm looking at that:
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
and we are back to F being a major, not a minor again?
So, no, I don't get it, it is not clear, you can shout loud, but it won't matter.
(am I the only one not really following this??)
ANOTHER EXAMPLE!
A reader, October 19, 2005 - 4:55 pm UTC
1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X
A 12
F 9
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
result from the query:
damage_inspection_date
6/14/2005
damage_inspection_by
COLUMBO
status
MINOR
CNT
9 -------- IT SHOULD BE 8. WHY? BECAUSE I WANT TO COUNT
THE "A" (MINORS) UNIQUELY, DISTINCTLY. IN OTHER WORDS...
follow up
A reader, October 19, 2005 - 5:15 pm UTC
1) Ok, I'm looking at that gate id:
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
*****this counts as one A = MINOR
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2) Ok, I'm looking at that gate id:
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
*******THIS COUNTS AS ONE F = MAJOR
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the
primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get
it?
3) Ok, I'm looking at that:
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
***** IN THIS CASE IGNORE THE MINOR(A) AND COUNT JUST THE MAJOR(F)
****THE FAR LETTER ON THE RIGHT IS TO SHOW YOU HOW I AM
CALCULATING WHAT IS ON THE RIGHT...
October 19, 2005 - 7:44 pm UTC
so, by gate_id compute how many A's and how many F's
select gate_id, count(case when col='A' then col end) A,
count(case when col='F' then col end) F
from t;
now you have gate_id and a count of A's and F's
call that Q
select ...
from (Q);
use CASE to look at A and F and return whatever you want.
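Spelled out against the two tables posted above, one reading of those rules (a gate counts as a major if it has any F, and as a minor only when it has an A and no F) would look like - a sketch, not a verified answer:
select damage_inspection_date,
       damage_inspection_by,
       count(case when f_cnt > 0 then 1 end) major,
       count(case when f_cnt = 0 and a_cnt > 0 then 1 end) minor
  from (
        -- one row per gate, with its counts of A's and F's
        select g.gate_id,
               trunc(g.damage_inspection_date) damage_inspection_date,
               g.damage_inspection_by,
               count(case when z.damage_type_code = 'A' then 1 end) a_cnt,
               count(case when z.damage_type_code = 'F' then 1 end) f_cnt
          from gate_containers g, gate_damages z
         where g.gate_id = z.gate_id
         group by g.gate_id,
                  trunc(g.damage_inspection_date),
                  g.damage_inspection_by
       )
 group by damage_inspection_date, damage_inspection_by;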
Year to date Business Day
Reader, October 19, 2005 - 6:48 pm UTC
I am trying to calculate the number of days we did business (i.e. sold anything) and then a running total for the year.
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
COMMIT;
SELECT a.*
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mtd_day_of_business
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) ytd_day_of_business
FROM
(
SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
GROUP BY ID, sale_dt
) a
ID SALE_DT SALE_DAILY MON_SAL YR_SAL MTD_DAYOFBUS YTD_DAYOFBUS
---------- --------- ---------- ---------- ---------- ------------ ------------
aa 15-SEP-05 216.75 216.75 216.75 1 1
aa 14-OCT-05 299.25 299.25 516 1 2
aa 19-OCT-05 129.75 429 645.75 2 3
bb 14-SEP-05 697.5 697.5 697.5 1 1
bb 13-OCT-05 705.75 705.75 1403.25 1 2
bb 15-OCT-05 1095.75 1801.5 2499 2 3
bb 14-NOV-05 1395.75 1395.75 3894.75 1 4
Ideally, the business days are: Sep - 14, 15; Oct - 13, 14, 15, 19; Nov - 14.
So the year count should be 7 and the monthly count should be Sep 1,2; Oct 1,2,3,4; and Nov 1.
Can this be done using analytical function or is there any other way .
Thanks
October 19, 2005 - 7:56 pm UTC
you mean like this?
ops$tkyte@ORA10GR1> SELECT a.*
2 ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
3 ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
4 ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
5 ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
6 ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM') order by sale_dt )
7 mtd_day_of_business
8 ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y') )
9 ytd_day_of_business
10 FROM
11 (
12 SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
13 GROUP BY ID, sale_dt
14 ) a
15 order by sale_dt
16 /
ID SALE_DT SALE_DAILY MON_SAL YR_SAL MTD_DAY_OF_BUSINESS YTD_DAY_OF_BUSINESS
---------- --------- ---------- ---------- ---------- ------------------- -------------------
bb 14-SEP-05 2092.5 2092.5 2092.5 1 7
aa 15-SEP-05 650.25 650.25 650.25 2 7
bb 13-OCT-05 2117.25 2117.25 4209.75 1 7
aa 14-OCT-05 1093.5 1093.5 1743.75 2 7
bb 15-OCT-05 3287.25 5404.5 7497 3 7
aa 19-OCT-05 389.25 1482.75 2133 4 7
bb 14-NOV-05 4187.25 4187.25 11684.25 1 7
7 rows selected.
Year to date ..
Reader, October 20, 2005 - 12:27 am UTC
The months are fine.
But the year count should increment, i.e. 1,2,3,4,5,6,7;
right now it shows the 7th business day for all transactions.
Thanks
October 20, 2005 - 8:06 am UTC
I did that, because you asked for that.
... So the year count should be 7 and the monthly count should be sep 1,2 oct
1,2,3,4 and Nov 1 .....
add the order by to the year count just like I did for the month.
the order by will make it a running total.
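In other words, the YTD column becomes a running count as soon as it gets its own ORDER BY. A sketch only, reusing the inline view from above and showing just the two count columns:
SELECT a.*
      ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM')
                               ORDER BY sale_dt) mtd_day_of_business
      ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y')
                               ORDER BY sale_dt) ytd_day_of_business
FROM (SELECT id, sale_dt, SUM(amount) sale_daily
        FROM test
       GROUP BY id, sale_dt) a
ORDER BY sale_dt;
(on data where several IDs sell on the same day, dense_rank() over (order by sale_dt) would be the safer running day counter, since the default window would count each ID's row on that day.)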
Thank you
A reader, October 20, 2005 - 8:56 am UTC
Tom,
Thank you for your solution above. However,
can you give me a solution using CASE where
I can count A and NOT F?
Thank you again.
October 20, 2005 - 8:59 am UTC
select case when cnta > 0 and cntf > 0
then ...
when cnta = 0 and cntf > 0
then ...
when cnta > 0 and cntf = 0
then ...
use a boolean expression after computing the cnt of A and the cnt of F
SOMETHING LIKE THIS...
A reader, October 20, 2005 - 10:18 am UTC
Tom,
you mean something like this..
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,
'MINOR' STATUS,
SUM (case WHEN z.DAMAGE_TYPE_CODE= 'A' THEN 1 ELSE 0 end)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
October 20, 2005 - 4:33 pm UTC
sure
FINAL SOLUTION
A reader, October 20, 2005 - 11:04 am UTC
Tom,
here is the problem that I was facing....I hope
this clears things up.
SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
--and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
-------------------------------------------------------------------------
6/14/2005 COLUMBO MINOR 1688235 A
6/14/2005 COLUMBO MINOR 1688609 A
6/14/2005 COLUMBO MINOR 1688273 A------was counting this a minor when it should be counted as a major
6/14/2005 COLUMBO MINOR 1686769 A
6/14/2005 COLUMBO MINOR 1686517 A
6/14/2005 COLUMBO MINOR 1687985 A
6/14/2005 COLUMBO MINOR 1686483 A
6/14/2005 COLUMBO MINOR 1685361 A
6/14/2005 COLUMBO MINOR 1686414 A
SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
--and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
6/14/2005 COLUMBO MINOR 1688273 A--------throw away!
6/14/2005 COLUMBO MINOR 1688273 E
6/14/2005 COLUMBO MINOR 1688273 C
6/14/2005 COLUMBO MINOR 1688273 F-------keep
6/14/2005 COLUMBO MINOR 1688273 I
follow up
A reader, October 20, 2005 - 5:01 pm UTC
Tom,
I am still not able to get the correct result using
a CASE statement. Maybe I should use a function to return only the F when there are both an F and an A in the record.
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
sum (case when z.damage_type_code = 'F' then 1 else 0 end) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
Can this be done using analytical functions??
A reader, October 25, 2005 - 2:59 pm UTC
Tom,
I am done with my query....I am looking for a better
approach or how I can improve it. Thanks!!
select damage_inspection_date, damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and not exists
(select z.gate_id from gate_damages z
where z.gate_id = g.gate_id
and z.damage_type_code = 'F')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+))
group by damage_inspection_date, damage_inspection_by;
October 26, 2005 - 11:24 am UTC
sorry - too big to reverse engineer here as a review/followup....
Analytic Question
Yoav, November 20, 2005 - 8:16 am UTC
Hi Tom.
I'm trying to calculate a weighted moving average.
I'm having a problem calculating the values under the column SUM_D.
Can you please demonstrate how to achieve the values that appear under the column SUM_D?
create table test
(stock_date date,
close_value number(8,2));
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('02-OCT-2005','DD-MON-YYYY'),759.56);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('29-SEP-2005','DD-MON-YYYY'),753.59);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('28-SEP-2005','DD-MON-YYYY'),749.20);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('27-SEP-2005','DD-MON-YYYY'),741.71);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('26-SEP-2005','DD-MON-YYYY'),729.93);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('25-SEP-2005','DD-MON-YYYY'),719.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('22-SEP-2005','DD-MON-YYYY'),727.30);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('21-SEP-2005','DD-MON-YYYY'),735.81);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('20-SEP-2005','DD-MON-YYYY'),740.38);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('19-SEP-2005','DD-MON-YYYY'),739.86);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('18-SEP-2005','DD-MON-YYYY'),745.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES (TO_DATE('15-SEP-2005','DD-MON-YYYY'),744.65);
COMMIT;
select RN, day, stock_date,close_value,weight
from(
select rownum RN,to_char(stock_date,'d') Day,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,stock_date,close_value
from test
order by stock_date asc)
order by 1
)
ORDER BY 1
/
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
--------- - --------- ----------- ---------- ----------
1 5 15-SEP-05 744.65 3723.25 5
2 1 18-SEP-05 745.48 745.48 1 <==
3 2 19-SEP-05 739.86 1479.72 3
4 3 20-SEP-05 740.38 2221.14 6
5 4 21-SEP-05 735.81 2943.24 10
6 5 22-SEP-05 727.3 3636.5 15
7 1 25-SEP-05 719.48 719.48 1 <==
8 2 26-SEP-05 729.93 1459.86 3
9 3 27-SEP-05 741.71 2225.13 6
10 4 28-SEP-05 749.2 2996.8 10
11 5 29-SEP-05 753.59 3767.95 15
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
--------- - --------- ----------- ---------- ----------
12 1 02-OCT-05 759.56 759.56 1
Thank You.
November 20, 2005 - 8:31 am UTC
would you like to explain what sum_d is to you? explain the logic behind it.
Analytic Question
Yoav, November 20, 2005 - 10:10 am UTC
Hi Tom.
I'm sorry if my explanation wasn't clear enough.
The column SUM_D is actually a "running total" of the column Day.
The thing is that I need to reset the value of the
column SUM_D to 1 at the beginning of each week (Sunday).
select RN, day, TheDayIs, stock_date
from(
select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'Day')TheDayIs,
stock_date,close_value
from( select rownum,stock_date,close_value
from test
order by stock_date asc)
order by 1
)
ORDER BY 1
/
RN D SUM_D THEDAYIS STOCK_DAT
--------- - ----- --------- ---------
1 5 5 Thursday 15-SEP-05
2 1 1 Sunday 18-SEP-05
3 2 3 Monday 19-SEP-05
4 3 6 Tuesday 20-SEP-05
5 4 10 Wednesday 21-SEP-05
6 5 15 Thursday 22-SEP-05
7 1 1 Sunday 25-SEP-05
8 2 3 Monday 26-SEP-05
9 3 6 Tuesday 27-SEP-05
10 4 10 Wednesday 28-SEP-05
11 5 15 Thursday 29-SEP-05
12 1 1 Sunday 02-OCT-05
Thank you for your quick response
November 21, 2005 - 8:20 am UTC
you might have to "adjust" your stock_date by a day if 'ww' doesn't group right for you (the week can end on a different day depending on your NLS settings - a locale issue)
ops$tkyte@ORA9IR2> select row_number() over (order by stock_date) rn,
2 to_char(stock_date,'d') day,
3 stock_date,
4 close_value,
5 to_number(to_char(stock_date,'d'))*close_value weight,
6 sum(to_number(to_char(stock_date,'d')))
over (partition by to_char(stock_date,'ww')
order by stock_date) sum_d
7 from test
8 order by stock_date
9 /
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
---------- - --------- ----------- ---------- ----------
1 5 15-SEP-05 744.65 3723.25 5
2 1 18-SEP-05 745.48 745.48 1
3 2 19-SEP-05 739.86 1479.72 3
4 3 20-SEP-05 740.38 2221.14 6
5 4 21-SEP-05 735.81 2943.24 10
6 5 22-SEP-05 727.3 3636.5 15
7 1 25-SEP-05 719.48 719.48 1
8 2 26-SEP-05 729.93 1459.86 3
9 3 27-SEP-05 741.71 2225.13 6
10 4 28-SEP-05 749.2 2996.8 10
11 5 29-SEP-05 753.59 3767.95 15
12 1 02-OCT-05 759.56 759.56 1
12 rows selected.
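(An aside of mine, not from the original follow-up: if the 'ww' boundary is a worry, one NLS-independent way to get Sunday-through-Saturday weeks is to partition by trunc(stock_date + 1, 'iw'), since 'iw' always truncates to the ISO week starting Monday:
sum(to_number(to_char(stock_date,'d')))
  over (partition by trunc(stock_date + 1, 'iw')
        order by stock_date) sum_d
the day number from 'd' is still territory-dependent, but the week boundary no longer is.)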
Analytics Question
Yoav, November 21, 2005 - 7:41 am UTC
Hi Tom.
I'm sorry for wasting your time.
I found the solution.
select RN, day, week_no,
sum(day) over
(partition by week_no
order by day) sum_d,
stock_date
from(
select RN, day, week_no, stock_date
from(select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'ww') week_no,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,'Y',stock_date,close_value
from test
order by stock_date asc)
order by 1)
)
order by 1
/
RN D WE SUM_D STOCK_DAT
------ - -- ---------- ---------
1 5 37 5 15-SEP-05
2 1 38 1 18-SEP-05
3 2 38 3 19-SEP-05
4 3 38 6 20-SEP-05
5 4 38 10 21-SEP-05
6 5 38 15 22-SEP-05
7 1 39 1 25-SEP-05
8 2 39 3 26-SEP-05
9 3 39 6 27-SEP-05
10 4 39 10 28-SEP-05
11 5 39 15 29-SEP-05
RN D WE SUM_D STOCK_DAT
------ - -- ---------- ---------
12 1 40 1 02-OCT-05
Thank You. !!
November 21, 2005 - 8:52 am UTC
see above, you can skip lots of steps here!
Analytics Question
Yoav, November 22, 2005 - 5:29 am UTC
Tom.
Your solution is better than mine.
Thank you !
Could you please help me with this
A reader, November 29, 2005 - 4:17 am UTC
I am trying to output a report with different aggregates for different price ranges
create table t(
id number(3),
year number(4),
month number(2),
slno number(2),
colorcd number(2),
sizecd number(2),
itemid number(4),
prdno number(3),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
insert into t values (1,2005,1,1,1,10,1000,101,150,100,10);
insert into t values (1,2005,1,2,1,11,1000,101,150,120,2);
insert into t values (1,2005,1,3,1,12,1000,101,150,100,10);
insert into t values (1,2005,1,4,1,13,1000,102,150,200,2);
insert into t values (1,2005,2,5,2,10,1000,102,150,100,20);
insert into t values (1,2005,2,6,2,11,1000,102,150,100,12);
insert into t values (1,2005,2,7,3,10,1000,103,150,100,20);
insert into t values (1,2005,3,8,4,10,1000,103,150,100,22);
insert into t values (1,2005,4,9,4,11,1000,103,150,100,12);
insert into t values (1,2005,1,10,5,10,1000,104,450,100,10);
insert into t values (1,2005,1,11,5,11,1000,104,450,120,2);
insert into t values (1,2005,1,12,5,12,1000,104,450,100,10);
insert into t values (1,2005,1,13,5,13,1000,104,450,200,2);
insert into t values (1,2005,2,14,5,14,1000,104,450,100,20);
insert into t values (1,2005,1,15,6,10,1001,105,150,100,10);
insert into t values (1,2005,1,16,6,11,1001,105,150,120,2);
insert into t values (1,2005,1,17,6,12,1001,105,150,100,10);
insert into t values (1,2005,1,18,6,13,1001,105,150,200,2);
insert into t values (1,2005,2,19,7,10,1001,105,150,100,20);
insert into t values (1,2005,2,20,7,11,1002,106,400,100,12);
insert into t values (1,2005,2,21,8,10,1002,106,400,100,20);
insert into t values (1,2005,3,22,9,10,1002,107,400,100,22);
insert into t values (1,2005,4,23,10,11,1002,107,400,100,12);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
Desc Itemname <199 <299 <399 <499
----------------------------------------------------------------------------
Count of distinct prdnos Item0 3 null null 1
(Count of distinct prdnos
group by colorcd, sizecd) 3 null null 1
sum of sl_qty group by itemid 110 null null 44
Count of distinct prdnos Item1 1 null null null
(Count of distinct prdnos
group by colorcd, sizecd) 1 null null null
sum of sl_qty group by itemid 44 null null null
Count of distinct prdnos Item2 null null null 2
(Count of distinct prdnos
group by colorcd, sizecd) null null null 1
sum of sl_qty group by itemid null null null 66
Is this possible? The Desc column is not needed and 'null' should be blank.
Thank you
November 29, 2005 - 10:22 am UTC
I don't get the "group by colorcd, sizecd" bit. If you group by those attributes, you'll get a row per unique ITEMNAME, COLORCD, SIZECD.
I don't understand the logic.
A reader, November 29, 2005 - 10:48 am UTC
Dear Tom,
For each unique combination of ITEMNAME, COLORCD, SIZECD, the PRODNOs repeat, don't they? I need a count of distinct PRODNOs.
COLORCD-SIZECD-ITEMID-PRODNO in that order, please see below
first group
-------------------------
1-10-1000-101
1-11-1000-101
1-12-1000-101
----------------------
second group
----------------------
1-13-1000-102
2-10-1000-102
2-11-1000-102
Both of these groups come under price range < 199, so the distinct count of PRODNO for price range < 199 = 2.
Hope this makes sense.
Thank you
November 30, 2005 - 10:46 am UTC
not understanding how this gets down to a single row. I did not get it.
why would that be different from the count of distinct prodno's by itemid?
how about this query
steve, November 30, 2005 - 8:04 pm UTC
Hi Tom,
Is there a simple way to do this with an analytic function?
select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id
Thanks a lot!
Steve
November 30, 2005 - 9:17 pm UTC
hows about you
a) set up a small example
b) explain what it is supposed to do in text (so we don't have to reverse engineer what you might have been thinking)
[RE] to Steve NYC
Marcio Portes, November 30, 2005 - 10:43 pm UTC
Maybe he is looking for this:
ops$marcio@LNX10GR2> select dept_num, id, sum(curr_adj_qty)
2 from
3 (
4 select dept_num, id, sum(current_adjust_qty) curr_adj_qty
5 from adjust
6 where applied_ind = 'N'
7 and expired_ind = 'N'
8 group by dept_num, id
9 UNION
10 select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
11 from adjust
12 where expired_ind = 'N'
13 and applied_ind = 'Y'
14 group by dept_num, id
15 ) adj_tmp
16 group by dept_num, id
17 /
DEPT_NUM ID SUM(CURR_ADJ_QTY)
------------- ------------- -----------------
1 0 185
1 2 186
0 2 77
0 0 81
1 1 165
0 1 56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2815735809
--------------------------------------------------------------------------------
|Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 390 | 11 (46)| 00:00:01 |
| 1 | HASH GROUP BY | | 10 | 390 | 11 (46)| 00:00:01 |
| 2 | VIEW | | 10 | 390 | 10 (40)| 00:00:01 |
| 3 | SORT UNIQUE | | 10 | 120 | 10 (70)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 6 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
| 7 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 8 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("APPLIED_IND"='N' AND "EXPIRED_IND"='N')
8 - filter("EXPIRED_IND"='N' AND "APPLIED_IND"='Y')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
624 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
ops$marcio@LNX10GR2>
ops$marcio@LNX10GR2> select dept_num, id,
2 sum( case when applied_ind = 'N'
3 then current_adjust_qty
4 else 0 end )
5 - sum( case when applied_ind = 'Y'
6 then current_adjust_qty
7 else 0 end ) curr_adj_qty
8 from adjust
9 where expired_ind = 'N'
10 group by dept_num, id
11 /
DEPT_NUM ID CURR_ADJ_QTY
------------- ------------- -------------
1 0 185
1 2 186
0 2 77
1 1 165
0 0 81
0 1 56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3658272021
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 60 | 4 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 5 | 60 | 4 (25)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| ADJUST | 500 | 6000 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXPIRED_IND"='N')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
622 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
I used this script to produce output above.
set echo on
drop table adjust purge;
create table
adjust (
dept_num int,
id int,
applied_ind char(1),
expired_ind char(1),
current_adjust_qty int
);
insert /*+ append */ into adjust
with v
as ( select level l from dual connect by level <= 1000 )
select mod(l, 2), mod(l, 3),
decode(mod(l,8), 0, 'Y', 'N'),
decode(mod(l,5), 0, 'N', 'Y'),
trunc(dbms_random.value(1,10.000))
from v
/
commit;
exec dbms_stats.gather_table_stats( user, 'adjust' )
set autotrace on
select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id
/
select dept_num, id,
sum( case when applied_ind = 'N'
then current_adjust_qty
else 0 end )
- sum( case when applied_ind = 'Y'
then current_adjust_qty
else 0 end ) curr_adj_qty
from adjust
where expired_ind = 'N'
group by dept_num, id
/
set autotrace off
set echo off
Multiple aggregates
Raj, December 01, 2005 - 7:45 am UTC
Dear Tom,
This continues my previous post, where the given data for the problem was wrong. I was trying to make a sample test case. Here are the requirements along with the create table statements and corrected data.
This is to output sales figures for a given period of different products.
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item group by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item group by PRICE
6. Total SL_QTY for an item group by PRICE
7. Total SL_QTY*PRICE for an item group by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/
With this query I am able to get the report like this,
ITEMID ITEMNAME PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
---------- -------------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1000 Item0 150 2 8 280 42000 150 40
1000 Item0 450 1 4 110 49500 450 46.6
1001 Item1 350 1 5 270 94500 350 32.6
Is it possible to get the report in the following format, with the nonmatching itemnames included and null cells shown as blanks?
Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6
Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null
Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Many thanks for your help and patience
Sorry
Raj, December 01, 2005 - 9:45 pm UTC
Dear Tom,
I am sorry, I didn't post the insert statements. Sorry for being careless. I did execute everything on my system and was formatting and copying it one by one; I previewed and reread it before posting and still missed it. I will repost the requirements below.
This is to output sales figures for a given period of different products.
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item grouped by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item grouped by PRICE
6. Total SL_QTY for an item grouped by PRICE
7. Total SL_QTY*PRICE for an item grouped by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
drop table t;
drop table p;
create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
insert into t values (1,1,2005,1,1000,101,1,10,150,90,10);
insert into t values (1,2,2005,1,1000,101,1,11,150,80,20);
insert into t values (1,3,2005,1,1000,101,1,12,150,90,10);
insert into t values (1,4,2005,1,1000,101,1,13,150,80,20);
insert into t values (1,5,2005,2,1000,101,1,10,150,80,20);
insert into t values (1,25,2005,1,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,1,1000,104,1,13,150,80,20);
insert into t values (1,25,2005,2,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,2,1000,104,1,13,150,80,20);
insert into t values (1,26,2005,2,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,2,1000,104,1,10,150,90,10);
insert into t values (1,26,2005,1,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,1,1000,104,1,10,150,90,10);
insert into t values (1,6,2005,2,1000,101,1,11,150,60,40);
insert into t values (1,7,2005,2,1000,101,1,12,150,80,20);
insert into t values (1,14,2005,2,1000,101,1,13,150,80,20);
insert into t values (1,15,2005,1,1001,103,1,10,350,90,10);
insert into t values (1,23,2005,3,1001,103,1,11,350,10,90);
insert into t values (1,22,2005,3,1001,103,1,10,350,90,10);
insert into t values (1,21,2005,2,1001,103,1,11,350,80,20);
insert into t values (1,20,2005,2,1001,103,1,10,350,80,20);
insert into t values (1,19,2005,1,1001,103,1,14,350,80,20);
insert into t values (1,18,2005,1,1001,103,1,13,350,70,30);
insert into t values (1,17,2005,1,1001,103,1,12,350,40,60);
insert into t values (1,16,2005,1,1001,103,1,11,350,90,10);
insert into t values (1,8,2005,1,1000,102,1,10,450,80,20);
insert into t values (1,9,2005,1,1000,102,1,11,450,90,10);
insert into t values (1,10,2005,1,1000,102,1,12,450,90,10);
insert into t values (1,11,2005,1,1000,102,1,13,450,90,10);
insert into t values (1,12,2005,2,1000,102,1,10,450,80,10);
insert into t values (1,13,2005,2,1000,102,1,11,450,50,50);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
commit;
With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/
ITEMID ITEMNAME PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
------ -------- ------ --------- -------- --------- ----------- ---------- ----------
1000 Item0 150 2 8 280 42000 150 40
1000 Item0 450 1 4 110 49500 450 47
1001 Item1 350 1 5 270 94500 350 33
Is it possible to get the report in the following format, with the
nonmatching itemnames included and null cells shown as blanks?
Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6
Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null
Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Thanking you
December 02, 2005 - 10:48 am UTC
it wasn't just that - it was "this was too big to answer in a couple of seconds and since I get over 1,000 of these a month, I cannot spend too much time on each one, I'd rather take NEW questions sometimes"
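For what it's worth, a sketch of the general direction only (not a reviewed solution): outer join the item table P to the aggregates so the non-matching items survive, then pivot with MAX(CASE ...) by price band. This covers just the first metric, the count of distinct PRDNOs, against the tables posted above:
select i.itemname,
       max(case when a.price < 199 then a.prdno_cnt end) lt_199,
       max(case when a.price >= 199 and a.price < 299 then a.prdno_cnt end) lt_299,
       max(case when a.price >= 299 and a.price < 399 then a.prdno_cnt end) lt_399,
       max(case when a.price >= 399 and a.price < 499 then a.prdno_cnt end) lt_499
  from p i,
       (select itemid, price, count(distinct prdno) prdno_cnt
          from t
         where id = 1 and year = 2005 and month in (1,2,3)
         group by itemid, price) a
 where a.itemid (+) = i.itemid
 group by i.itemname
 order by i.itemname;
each remaining metric would be another MAX(CASE ...) over a column computed in the inline view.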
a single query to collapse date ranges
Bob Lyon, December 06, 2005 - 12:56 pm UTC
Tom,
I know what I want to do but can't quite get my mind around the syntax...
We want a single query to collapse date ranges under the assumption that a date range that
starts later than another range has a better value.
So given this test case
CREATE GLOBAL TEMPORARY TABLE RDL (
DATE_FROM DATE,
DATE_TO DATE,
VALUE NUMBER
);
INSERT INTO RDL VALUES (TO_DATE('01/03/2005', 'MM/DD/YYYY'), TO_DATE('01/12/2005', 'MM/DD/YYYY'), 5);
INSERT INTO RDL VALUES (TO_DATE('01/05/2005', 'MM/DD/YYYY'), TO_DATE('01/10/2005', 'MM/DD/YYYY'), 8);
-- I assume the innermost subquery would
-- use the DUAL CONNECT BY LEVEL trick to generate individual days for each grouping
-- ORDER BY DATE_FROM
1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
1 01/05/2005 01/06/2005 5
1 01/06/2005 01/07/2005 5
1 01/07/2005 01/08/2005 5
1 01/08/2005 01/09/2005 5
1 01/09/2005 01/10/2005 5
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5
2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8
-- an outer subquery would use analytics to get the max grouping
1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5
-- And the outermost subquery would use analytics to collapse the dates into contiguous groups
-- for the desired result
1 01/03/2005 01/05/2005 5
2 01/05/2005 01/10/2005 8
1 01/10/2005 01/12/2005 5
The trick is to do all of the above in a single query!
Any suggestions (Yeah, I know, REALLY learn analytics!)
Thanks in advance,
Bob Lyon
OK, I think I got it
Bob Lyon, December 06, 2005 - 2:25 pm UTC
SELECT d date_from, d2 date_to, value
FROM (
SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
FROM (
SELECT DATE_FROM D, VALUE FROM RDL
UNION
SELECT DATE_TO D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
)
)
WHERE D2 IS NOT NULL
/
DATE_FROM DATE_TO VALUE
----------------- ----------------- ----------
01/03/05 00:00:00 01/05/05 00:00:00 5
01/05/05 00:00:00 01/10/05 00:00:00 8
01/10/05 00:00:00 01/12/05 00:00:00 5
December 06, 2005 - 3:50 pm UTC
depends on how many overlaps you allow, take your create and:
...
ops$tkyte@ORA9IR2> INSERT INTO RDL VALUES (TO_DATE('01/06/2005', 'MM/DD/YYYY'),
2 TO_DATE('01/7/2005', 'MM/DD/YYYY'), 99);
1 row created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT d date_from, d2 date_to, value
2 FROM (
3 SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
4 FROM (
5 SELECT DATE_FROM D, VALUE FROM RDL
6 UNION
7 SELECT DATE_TO D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
8 )
9 )
10 WHERE D2 IS NOT NULL
11 /
DATE_FROM DATE_TO VALUE
--------- --------- ----------
03-JAN-05 05-JAN-05 5
05-JAN-05 06-JAN-05 8
06-JAN-05 07-JAN-05 99
07-JAN-05 10-JAN-05 8
10-JAN-05 12-JAN-05 5
so, maybe we expand out and keep the row we want:
ops$tkyte@ORA9IR2> with
2 data
3 as
4 (select level-1 l
5 from (select max(date_to-date_from+1) n from rdl) n
6 connect by level <= n)
7 select rdl.date_from+l,
8 to_number( substr( max( to_char(date_from,'yyyymmdd') || value ), 9 ) ) value
9 from rdl, data
10 where data.l <= rdl.date_to-rdl.date_from
11 group by rdl.date_from+l
12 ;
RDL.DATE_ VALUE
--------- ----------
03-JAN-05 5
04-JAN-05 5
05-JAN-05 8
06-JAN-05 99
07-JAN-05 99
08-JAN-05 8
09-JAN-05 8
10-JAN-05 8
11-JAN-05 5
12-JAN-05 5
10 rows selected.
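From there, the day-level rows can be collapsed back into contiguous ranges with the "day minus row_number" trick: within an unbroken run of days carrying the same value, day - row_number() is constant. A sketch, where DAYS stands in for the expand-out query above (columns DAY_DT and VALUE):
select min(day_dt) date_from, max(day_dt) date_to, value
  from (select day_dt, value,
               day_dt - row_number() over (partition by value
                                           order by day_dt) grp
          from days)
 group by value, grp
 order by date_from;
whether DATE_TO should be MAX(DAY_DT) or MAX(DAY_DT)+1 depends on whether the ranges are meant to be inclusive or half-open, as in the original post.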
Select with Analytics Working Partially
denni50, December 06, 2005 - 4:34 pm UTC
Hi Tom
I have a question about the script below that is puzzling me.
I'm using the 4 idnumbers as test data. I'm looking to select the most recent record where the appealcode is like '_R%' for 2005.
When I run the script it only pulls 2 of the idnumbers.
I've been looking at this and can't see why the other two are being bypassed. I'm trying to use more analytics in my code; it's working for two records and not the other two.
any tips/help greatly appreciated.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
2 paydate,payamount,transnum,ltransnum,appealcode
3 from (
4 select x.*, row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
5 from (
6 select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
7 payamount, transnum,ltransnum,appealcode,
8 max(paydate) over (partition by idnumber) maxpd
9 from payment
10 where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
11 ) x
12 where appealcode like '_R%'
13 and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
15 )
16 where rn = 1;
IDNUMBER USER FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
4004594 ACDC May2005 17-MAY-05 0 10159410 10086183 DRE0505
5618190 ACDC Mar2005 11-MAR-05 0 9918802 9845638 DRJ0503
SQL>
**** 4 TEST IDNUMBERS FROM BASE TABLE*********************
SQL> select idnumber,appealcode,paydate,payamount,transnum
2 from payment where appealcode like '_R%'
3 and idnumber=4004594 order by paydate desc;
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
4004594 DRE0505 17-MAY-05 0 10159410
4004594 GRG0502 08-FEB-05 0 9804766
4004594 GRF0501 31-JAN-05 0 9750332
4004594 GRK0410 01-NOV-04 0 9303510
4004594 GRC0403 19-MAR-04 0 8371053
4004594 GRA0305 12-AUG-03 0 7543911
4004594 GRG0303 16-APR-03 0 7209503
4004594 GRA0301 16-FEB-03 0 7026840
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
4002401 GRG0502 16-MAR-05 25 9862647
4002401 GRG0502 23-FEB-05 0 9826142
4002401 GRA0501 19-JAN-05 0 9712904
4002401 GRF0412 05-JAN-05 0 9630884
4002401 GRK0410 21-OCT-04 0 9299106
4002401 GRG0303 03-MAR-03 0 7066423
4002401 GRA0301 09-FEB-03 0 7022121
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
5406454 DRJ0503 03-MAR-05 0 9887770
5406454 DRG0502 28-FEB-05 0 9870637
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
5618190 DRJ0503 11-MAR-05 0 9918802
5618190 DRG0502 28-FEB-05 0 9870090
5618190 GRG0502 21-FEB-05 0 9824705
December 07, 2005 - 1:32 am UTC
(i would need a create table and insert statements if you really want me to play with it)
but this predicate:
12 where appealcode like '_R%'
13 and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
says
"only keep _R% records that had the max paydate over ALL records for that id"
to satisfy:
I'm looking to select the most recent
record where the appealcode is like '_R%' for 2005.
perhaps you mean:
select *
from (select t.*,
row_number() over (partition by idnumber order by paydate DESC) rn
from t
where appealcode like '_R%'
and idnumber in ( 1,2,3,4 ) )
where rn = 1;
that says
"find the _R% records"
"break them up by idnumber"
"sort each group from big to small by paydate"
"keep only the first record in each group"
to dennis
Oraboy, December 06, 2005 - 6:09 pm UTC
Hi ,
I tried your problem and it looks like it's working fine.
Just a quick question... did you check that the dates are really 2005 and not 0005?
(Create scripts for anyone who wants to try in future)
Create table Test_T
(IdNumber number,
AppealCode Varchar2(100),
PayDate date,
PayAmount NUmber,
TransNum Number)
/
Insert into Test_t values ( 4004594 ,'DRE0505',to_date('17-May-05','DD-MON-RR'),0,10159410 );
Insert into Test_t values ( 4004594 ,'GRG0502',to_date('8-Feb-05','DD-MON-RR'),0,9804766 );
Insert into Test_t values ( 4004594 ,'GRF0501',to_date('31-Jan-05','DD-MON-RR'),0,9750332 );
Insert into Test_t values ( 4004594 ,'GRK0410',to_date('1-Nov-04','DD-MON-RR'),0,9303510 );
Insert into Test_t values ( 4004594 ,'GRC0403',to_date('19-Mar-04','DD-MON-RR'),0,8371053 );
Insert into Test_t values ( 4004594 ,'GRA0305',to_date('12-Aug-03','DD-MON-RR'),0,7543911 );
Insert into Test_t values ( 4004594 ,'GRG0303',to_date('16-Apr-03','DD-MON-RR'),0,7209503 );
Insert into Test_t values ( 4004594 ,'GRA0301',to_date('16-Feb-03','DD-MON-RR'),0,7026840 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('16-Mar-05','DD-MON-RR'),25,9862647 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('23-Feb-05','DD-MON-RR'),0,9826142 );
Insert into Test_t values ( 4002401 ,'GRA0501',to_date('19-Jan-05','DD-MON-RR'),0,9712904 );
Insert into Test_t values ( 4002401 ,'GRF0412',to_date('5-Jan-05','DD-MON-RR'),0,9630884 );
Insert into Test_t values ( 4002401 ,'GRK0410',to_date('21-Oct-04','DD-MON-RR'),0,9299106 );
Insert into Test_t values ( 4002401 ,'GRG0303',to_date('3-Mar-03','DD-MON-RR'),0,7066423 );
Insert into Test_t values ( 4002401 ,'GRA0301',to_date('9-Feb-03','DD-MON-RR'),0,7022121 );
Insert into Test_t values ( 5406454 ,'DRJ0503',to_date('3-Mar-05','DD-MON-RR'),0,9887770 );
Insert into Test_t values ( 5406454 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870637 );
Insert into Test_t values ( 5618190 ,'DRJ0503',to_date('11-Mar-05','DD-MON-RR'),0,9918802 );
Insert into Test_t values ( 5618190 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870090 );
Insert into Test_t values ( 5618190 ,'GRG0502',to_date('21-Feb-05','DD-MON-RR'),0,9824705 );
--since the other columns are not relevant , I used dummy values in your Select statement
s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,1 usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum,transnum ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1
s61>/
IDNUMBER USERCODE1 FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---------- ------- --------- ---------- ---------- ---------- -----------
4002401 1 Mar2005 16-MAR-05 25 9862647 9862647 GRG0502
4004594 1 May2005 17-MAY-05 0 10159410 10159410 DRE0505
5406454 1 Mar2005 03-MAR-05 0 9887770 9887770 DRJ0503
5618190 1 Mar2005 11-MAR-05 0 9918802 9918802 DRJ0503
--added the other two columns
s61>alter table test_t add (usercode1 varchar2(100),ltransnum varchar2(100));
Table altered.
s61>update test_t set usercode1 = chr(65+ mod(rownum,3)), ltransnum=transnum+rownum;
20 rows updated.
S61> @<<ursql.txt>>
IDNUMBER USERCODE1 FIRSTPA PAYDATE
---------- ---------------------------------------------------------------------------------------------------- ------- ---------
4002401 A Mar2005 16-MAR-05
4004594 B May2005 17-MAY-05
5406454 B Mar2005 03-MAR-05
5618190 A Mar2005 11-MAR-05
-- this is just a guess on why the other two numbers didn't
-- show up in your result
-- updating 2005 to 05
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=5406454
2 /
2 rows updated.
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=4002401
2 /
7 rows updated.
s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum, ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1
s61>/
IDNUMBER US FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- -- ------- --------- ---------- ---------- ---------- -------------------------------------
4004594 B May2005 17-MAY-05 0 10159410 10159411 DRE0505
5618190 A Mar2005 11-MAR-05 0 9918802 9918820 DRJ0503
-- same as what you see
Thanks Tom and Oraboy
denni50, December 07, 2005 - 8:37 am UTC
Oraboy... you brought up a good possibility. The data gets posted through canned software, but users are responsible for creating the batch headers before posting batches, and what may have happened is that a user inadvertently inserted year 0005 instead of '2005' for a particular batch that included the idnumbers I am testing with here. It's happened before.
thanks for that helpful tip!
:~)
Oraboy and Tom
denni50, December 07, 2005 - 8:57 am UTC
Oraboy:
it was not the year; I did some testing using '0005' to see
if those two records would output results, and they did not.
I changed the script based on the logic that Tom suggested
and it worked... see the changes below.
thanks Oraboy for your input and help.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
2 paydate,payamount,transnum,ltransnum,appealcode
3 from (
4 select x.*, row_number() over (partition by idnumber order by paydate desc,payamount desc, transnum desc) rn
5 from (
6 select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
7 payamount, transnum,ltransnum,appealcode
8 --max(paydate) over (partition by idnumber) maxpd
9 from payment
10 where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
11 ) x
12 where appealcode like '_R%'
13 --and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
15 )
16 where rn = 1;
IDNUMBER USER FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
4002401 ACGA Mar2005 16-MAR-05 25 9862647 9789477 GRG0502
4004594 ACDC May2005 17-MAY-05 0 10159410 10086183 DRE0505
5406454 ACDC Mar2005 03-MAR-05 0 9887770 9814606 DRJ0503
5618190 ACDC Mar2005 11-MAR-05 0 9918802 9845638 DRJ0503
SQL>
anyway to do a dynamic lag?
Ryan, December 07, 2005 - 10:27 pm UTC
Is it possible to use lag when you don't know how many rows you want to go back?
create table history (
history_id number,
history_sequence number,
history_status varchar2(20),
history_balance number);
insert into history values (1,123,'HISTORY 1',10);
insert into history values (1,128,'PROCESSED',0);
insert into history values (1,130,'PROCESSED',0);
insert into history values (1,131,'HISTORY 8',15);
insert into history values (1,145,'PROCESSED',0);
for each history_id ordered by history_sequence
loop
if status = 'PROCESSED' then
history_balance = the history_balance of the last record where status != 'PROCESSED'
end if;
end loop;
Typically with lag you have to state how many rows you are looking back; in this case my discriminator is based on the value in the status field.
After this is run, I expect the values to be
1,123,'HISTORY 1',10
1,128,'PROCESSED',10
1,130,'PROCESSED',10
1,131,'HISTORY 8',15
1,145,'PROCESSED',15
I can do this with pl/sql. I am trying to figure out how to do this with straight sql.
December 08, 2005 - 2:03 am UTC
last_value with IGNORE NULLS in 10g, or the to_number(substr(max(...))) trick in 9i and before, can be used (the 9i version prefixes the balance with the zero-padded history_sequence so that MAX picks out the latest non-PROCESSED row, and SUBSTR then strips the sequence back off)....
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
2 last_value(
3 case when history_status <> 'PROCESSED'
4 then history_balance
5 end IGNORE NULLS ) over (order by history_sequence ) last_hb
6 from history
7 /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS HISTORY_BALANCE LAST_HB
---------- ---------------- -------------------- --------------- ----------
1 123 HISTORY 1 10 10
1 128 PROCESSED 0 10
1 130 PROCESSED 0 10
1 131 HISTORY 8 15 15
1 145 PROCESSED 0 15
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
2 to_number( substr( max(
3 case when history_status <> 'PROCESSED'
4 then to_char(history_sequence,'fm0000000000' ) || history_balance
5 end ) over (order by history_sequence ), 11 ) ) last_hb
6 from history
7 /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS HISTORY_BALANCE LAST_HB
---------- ---------------- -------------------- --------------- ----------
1 123 HISTORY 1 10 10
1 128 PROCESSED 0 10
1 130 PROCESSED 0 10
1 131 HISTORY 8 15 15
1 145 PROCESSED 0 15
Query
Mark, January 18, 2006 - 5:02 pm UTC
Hi Tom,
Given a Table:
PK_ID NUMBER (PK)
CLASS_ID NUMBER
MY_DATE DATE
I'd like to develop an output like:
CLASS_ID W1 W2 W3...
where W1, W2, W3... are 'weeks' back from SYSDATE (computed from MY_DATE), each holding the count of records for that CLASS_ID in that week.
How could I do that?
Thanks!
About to read Ch. 12 in Expert one-on-one...
January 19, 2006 - 12:28 pm UTC
if
a) you have a finite number of weeks (eg: a sql query has a fixed number of columns at parse time, so we need to know what N is...)
b) an example <<<=== creates/inserts
we could play with this.
ok, here we go
Mark, January 20, 2006 - 1:31 pm UTC
drop table cts_temp
/
create table cts_temp
( class_id number,
cts_date date)
/
insert into cts_temp
select round(dbms_random.value(1, 20)), trunc(created) from user_objects
/
Output sought:
CLASS_ID W1 W2 W3 W4 W5 W6 W7 W8 W9 W10+
----------------------------------------
1 1 3 7 5 1 0 0 0 0 12
2 1 5 1 0 0 0 5 3 4 6
3 1 0 0 9 0 1 1 5 1 10
...
where W# = # of Weeks away from current date
therefore, W1 is within 7 days of today, W10+ is everything 10 weeks and older. These numbers are counts of records.
I have done this in the past with DECODE statements, but am looking for a more efficient way to do this using Analytics.
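One possible shape, a sketch against the cts_temp table above (whether it actually beats the DECODE version would need testing, and a plain COUNT(CASE ...) pivot turns out to be enough here, no analytics required): bucket each row by week number in an inline view, then pivot:
select class_id,
       count(case when wk = 1 then 1 end) w1,
       count(case when wk = 2 then 1 end) w2,
       count(case when wk = 3 then 1 end) w3,
       count(case when wk = 4 then 1 end) w4,
       count(case when wk = 5 then 1 end) w5,
       count(case when wk = 6 then 1 end) w6,
       count(case when wk = 7 then 1 end) w7,
       count(case when wk = 8 then 1 end) w8,
       count(case when wk = 9 then 1 end) w9,
       count(case when wk >= 10 then 1 end) w10_plus
  from (select class_id,
               trunc((trunc(sysdate) - cts_date) / 7) + 1 wk
          from cts_temp)
 group by class_id
 order by class_id;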