Excellent!!!
A reader, October 07, 2003 - 5:05 am UTC
Hi Tom,
'been thinking about writing a book just about analytics' ... please make this book available soon and I am sure it will be yet another gift from you to the Oracle world :)
Wow!!
Michael T, October 07, 2003 - 7:00 am UTC
This is exactly what I needed! Analytics do rock! I just
need to understand them better. If you do decide to write a
book on analytics, it would be at the top of my must have
list. Thanks again!!!
Small correction
Michael T, October 07, 2003 - 7:33 am UTC
After looking at it a little closer it looks like there is
one small error. The start date for the first MACH1 entry
should be the close date of the prior different station. In
this case 07/01/2003. However, by making some small changes
to your query I can get the results I want.
SELECT order#,
       station,
       lag(close_date) over (partition by order# order by close_date) start_date,
       close_date
  FROM (SELECT order#,
               station,
               close_date
          FROM (SELECT order#,
                       lag(station) over (partition by order# order by close_date) lag_station,
                       lead(station) over (partition by order# order by close_date) lead_station,
                       station,
                       close_date
                  FROM t)
         WHERE lead_station <> station
            OR lead_station is null
            OR lag_station is null)
There might be an easier way to construct this query, but
it works great for me. Thanks a lot for your help!
October 07, 2003 - 8:25 am UTC
sorry about that -- you are right -- when we have "a pair", we want to use lag/lead again to get and keep the right dates.
So, we want to keep rows that are:
a) the first row in the partition "where lag_station is null"
b) the last row in the partition "where lead_station is null"
c) the first of a possible pair "where lag_station <> station"
d) the second of a possible pair "where lead_station <> station"
This query does that:
ops$tkyte@ORA920> select order#,
2 station,
3 lag_close_date,
4 close_date,
5 decode( lead_station, station, 1, 0 ) first_of_pair,
6 decode( lag_station, station, 1, 0 ) second_of_pair
7 from (
8 select order#,
9 lag(station) over (partition by order# order by close_date)
10 lag_station,
11 lead(station) over (partition by order# order by close_date)
12 lead_station,
13 station,
14 close_date,
15 lag(close_date) over (partition by order# order by close_date)
16 lag_close_date,
17 lead(close_date) over (partition by order# order by close_date)
18 lead_close_date
19 from t
20 )
21 where lag_station is null
22 or lead_station is null
23 or lead_station <> station
24 or lag_station <> station
25 /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
12345 RECV 07/01/2003 0 0
12345 MACH1 07/01/2003 07/02/2003 1 0
12345 MACH1 07/05/2003 07/11/2003 0 1
12345 INSP1 07/11/2003 07/12/2003 0 0
12345 MACH1 07/12/2003 07/16/2003 0 0
12345 MACH2 07/16/2003 07/30/2003 0 0
12345 STOCK 07/30/2003 08/01/2003 0 0
7 rows selected.
<b>we can see with the 1's the first/second of a pair in there. All we need to do now is "reach forward" for the first of a pair and grab the close date from the next record:</b>
ops$tkyte@ORA920> select order#,
2 station,
3 lag_close_date,
4 close_date
5 from (
6 select order#,
7 station,
8 lag_close_date,
9 decode( lead_station,
10 station,
11 lead(close_date) over (partition by order# order by close_date),
12 close_date ) close_date,
13 decode( lead_station, station, 1, 0 ) first_of_pair,
14 decode( lag_station, station, 1, 0 ) second_of_pair
15 from (
16 select order#,
17 lag(station) over (partition by order# order by close_date)
18 lag_station,
19 lead(station) over (partition by order# order by close_date)
20 lead_station,
21 station,
22 close_date,
23 lag(close_date) over (partition by order# order by close_date)
24 lag_close_date,
25 lead(close_date) over (partition by order# order by close_date)
26 lead_close_date
27 from t
28 )
29 where lag_station is null
30 or lead_station is null
31 or lead_station <> station
32 or lag_station <> station
33 )
34 where second_of_pair <> 1
35 /
ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
12345 RECV 07/01/2003
12345 MACH1 07/01/2003 07/11/2003
12345 INSP1 07/11/2003 07/12/2003
12345 MACH1 07/12/2003 07/16/2003
12345 MACH2 07/16/2003 07/30/2003
12345 STOCK 07/30/2003 08/01/2003
6 rows selected.
<b>and discard the second of pairs row</b>
That is another way to do it (and an insight into how I develop analytic queries -- adding extra columns like that just to see visually what I want to do)
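To see the shape of what the final query computes outside SQL, here is a small Python sketch (illustrative only, not Oracle code): collapse consecutive rows with the same station into one row whose start date is the previous different station's close date.

```python
def collapse(rows):
    # rows: list of (station, close_date) already sorted by close_date
    # within one order#.  Consecutive rows with the same station are
    # merged; each merged row keeps the close_date of the prior
    # (different) row as its start date -- what the lag/lead query does.
    out = []
    prev_close = None
    for station, close in rows:
        if out and out[-1][0] == station:
            out[-1] = (station, out[-1][1], close)  # extend the run
        else:
            out.append((station, prev_close, close))
        prev_close = close
    return out

rows = [("RECV", "07/01/2003"), ("MACH1", "07/02/2003"),
        ("MACH1", "07/05/2003"), ("MACH1", "07/11/2003"),
        ("INSP1", "07/12/2003"), ("MACH1", "07/16/2003"),
        ("MACH2", "07/30/2003"), ("STOCK", "08/01/2003")]
```

Running `collapse(rows)` yields the same six rows as the SQL above: RECV, then one MACH1 row spanning 07/01 to 07/11, and so on.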
another good book on the list please go ahead on this one too
Vijay Sehgal, October 07, 2003 - 8:58 am UTC
Best Regards,
Vijay Sehgal
Very useful
Michael T., October 07, 2003 - 12:05 pm UTC
Excellent, as always!
Can we reach to the end of the group?
Steve, December 15, 2003 - 11:28 am UTC
For example, say our analytic query returns the following result:
master_record sub_record nxt_record
95845433 25860032 95118740
95118740 25860032 95837497
95837497 25860032
What I'd like is to do is grab the final master_record, 95837497, and have that populated in the final column. There could be 2,3 or more in each group.
December 15, 2003 - 3:45 pm UTC
so the nxt_record of the last record should be the master_record of that row?
then just select
nvl( lead(master_record) over (....), master_record ) nxt_record
when the lead is NULL, return the master_record of the current row
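The nvl(lead(...), ...) pattern is easy to see in miniature; a hypothetical plain-Python equivalent (not the SQL itself):

```python
def lead_with_default(values):
    # lead() returns the next row's value; off the end it returns NULL,
    # which nvl() then replaces -- here with the current row's own value.
    return [values[i + 1] if i + 1 < len(values) else values[i]
            for i in range(len(values))]

masters = [95845433, 95118740, 95837497]
```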
Almost....
Steve, December 15, 2003 - 5:52 pm UTC
but I didn't explain it well enough. What I'd like to see is a result set that looks like:
master_record sub_record nxt_record
95845433 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
The data comes from this:
table activity
cllocn moddate
25860032 18/06/2003
95118740 26/08/2003
95837497 15/12/2003
95845433 19/08/2003
table ext_dedupe
master_cllocn dupe_cllocn
25860032 95118740
25860032 95837497
25860032 95845433
My query is:
select * from ( select master_record, sub_record, lead(master_record) over (partition by sub_record order by lst_activity asc) nxt_activity
from ( select * from (select case when dupelast_ackdate>last_ackdate then dupe_cllocn
when last_ackdate>dupelast_ackdate then master_cllocn
else master_cllocn
end master_record, greatest(last_ackdate,dupelast_ackdate) lst_activity,
case when dupelast_ackdate>last_ackdate then master_cllocn
when last_ackdate>dupelast_ackdate then dupe_cllocn
else dupe_cllocn
end sub_record
from (select master_cllocn, (select max(moddate) from activity a where a.cllocn=ed.master_cllocn) last_ackdate,
dupe_cllocn, (select max(moddate) from activity a where a.cllocn=ed.dupe_cllocn) dupelast_ackdate
from ext_dedupe ed))))
Am I on the right track or is there a simpler way to this?
Thanks
December 16, 2003 - 6:50 am UTC
can you explain in "just text" how you got from your inputs to your outputs.
it is not clear (and i didn't feel like parsing that sql to reverse engineer what it does)
Is this what you are looking for ?
Venkat, December 15, 2003 - 6:44 pm UTC
select master, sub, moddate
, min(master) keep (dense_rank first order by moddate) over (partition by sub) first_in_list
, max(master) keep (dense_rank last order by moddate) over (partition by sub) last_in_list
from (select master, sub, moddate from (
select 95845433 master, 25860032 sub, to_date('19-aug-03','dd/mon/yy') moddate from dual union all
select 95118740, 25860032, to_date('26-aug-03','dd/mon/yy') from dual union all
select 95837497, 25860032, to_date('15-dec-03','dd/mon/yy') from dual))
MASTER SUB MODDATE FIRST_IN_LIST LAST_IN_LIST
95845433 25860032 8/19/2003 95845433 95837497
95118740 25860032 8/26/2003 95845433 95837497
95837497 25860032 12/15/2003 95845433 95837497
Tom's Book
umesh, December 16, 2003 - 4:13 am UTC
Tom
Do not announce until you are finished with the book .. when you talk of a book we can't wait until we have it here.
An analytics book -- that must be real good
Is it possible to get the same result in standard edition ?
Ninoslav, December 16, 2003 - 4:21 am UTC
Hi Tom,
yes, analytic functions are great. However, we can use them only in the enterprise edition of the database. We have a few small customers that want only a standard edition.
So, is it possible in this question to get the same result without analytic functions?
It would be nice to have some kind of mapping between analytics and 'standard' queries. But, that is probably impossible...
December 16, 2003 - 7:27 am UTC
Oracle 9iR2 and up -- analytics are a feature of standard edition.
there are things you can do in analytics that are quite simply NOT PRACTICAL in any sense without them.
ok
Steve, December 16, 2003 - 8:41 am UTC
I have two tables - activity and ext_dedupe.
table activity
cllocn moddate
25860032 18/06/2003
95118740 26/08/2003
95837497 15/12/2003
95845433 19/08/2003
table ext_dedupe
master_cllocn dupe_cllocn
25860032 95118740
25860032 95837497
25860032 95845433
Ext_dedupe is a table created by a third party app which has identified duplicate records within our database. The first column is supposed to be the master and the second the duplicate. The idea is to mark as archived all our duplicate records with a pointer to the master. Notwithstanding the order of the columns, what we want to do is find out which record has the most recent activity (from the activity table) and archive off the others.
So, in this example although the master is listed as 25860032 against the other 3, an examination of the activity dates means I want to keep 95837497 and mark the others as archived, with a pointer on each of them to 95837497. That's why I thought if I could get to the following result it would make it simpler.
master_record sub_record nxt_record
95845433 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
Hope that makes sense!
December 16, 2003 - 11:33 am UTC
oh, then nxt_record is just
last_value(master_record) over (partition by sub_record order by moddate)
Why...
Steve, December 16, 2003 - 1:31 pm UTC
it didn't work for me. I had to change it to
first_value(master_record) over (partition by sub_record order by moddate desc)
Is there a reason for that?
December 16, 2003 - 2:00 pm UTC
doh, the default window clause is "range between unbounded preceding and current row"
I would have needed a window clause that looks forward rather than backwards (reason #1 why I should always set up a test case instead of just answering on the fly)
your solution of reversing the data works just fine.
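The pitfall is worth spelling out: with the default frame (range between unbounded preceding and current row), last_value can only "see" up to the current row, so it just returns the current row's value. Ordering descending and taking first_value reaches the true last row; specifying "rows between current row and unbounded following" is the other fix. A tiny Python sketch of the two behaviors (illustrative only):

```python
def last_value_default_frame(values, i):
    # default frame = rows 0..i, so the "last" value visible
    # from row i is simply row i's own value
    return values[i]

def first_value_desc(sorted_values):
    # order by the key descending and take first_value:
    # that is the overall last value of the ascending order
    return sorted_values[-1]
```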
Another solution
A reader, December 16, 2003 - 4:03 pm UTC
The following gives the same result ...
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
from activity, ext_dedupe where cllocn = dupe_cllocn
MASTER_RECORD SUB_RECORD NXT_RECORD
95118740 25860032 95837497
95837497 25860032 95837497
95845433 25860032 95837497
December 16, 2003 - 5:44 pm UTC
yes, there are many many ways to do this.
first_value
last_value
substring of max() without keep
sure.
A reader, December 16, 2003 - 4:15 pm UTC
Actually the nvl(master_cllocn...) is required only if you need all 4 rows in the output, as follows (there is an outer join involved). If you need only the 3 rows shown in the above post, there is no need for the nvl's....
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
, last_value(cllocn) over (partition by nvl(master_cllocn,cllocn) order by moddate) nxt
from activity, ext_dedupe where cllocn = dupe_cllocn (+)
MASTER_RECORD SUB_RECORD NXT_RECORD
25860032 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497
95845433 25860032 95837497
still q's on analytics
A reader, January 30, 2004 - 10:13 am UTC
Okay, so my web application logs "web transaction" statistics to a table. This actually amounts to 0 to many database transactions... but anyway... I need to summarize (sum, min, max, count, average) each day's transaction times for each class (name2) and action (name3) and ultimately "archive" this data to a history table. I am running 8.1.7 and am pretty new to analytics.
My table looks like this:
SQL> desc tran_stats
Name Null? Type
----------------------- -------- ----------------
ID NOT NULL NUMBER(9)
NAME1 VARCHAR2(100)
NAME2 VARCHAR2(100)
NAME3 VARCHAR2(100)
NAME4 VARCHAR2(100)
SEC NOT NULL NUMBER(9,3)
TS_CR NOT NULL DATE
ID NAME1 NAME2 NAME3 SEC NAME4 TS_CR
---------- ----- ------------------------- ---------- ------ ----- ---------
35947 /CM01_PersonManagement CREATE .484 15-JAN-04
35987 /CM01_PersonManagement CREATE .031 15-JAN-04
36086 /CM01_PersonManagement EDIT .312 16-JAN-04
36555 /CM01_PersonManagement CREATE .297 19-JAN-04
36623 /CM01_PersonManagement EDIT .375 19-JAN-04
36627 /CM01_PersonManagement CREATE .047 19-JAN-04
36756 /CM01_AddressManagement CREATE .375 20-JAN-04
36766 /CM01_AddressManagement CREATE .305 20-JAN-04
36757 /CM01_AddressManagement INSERT .391 20-JAN-04
37178 /CM01_PersonManagement EDIT .203 20-JAN-04
and I need output like this:
TS_CR NAME2 NAME3 M_SUM M_MIN M_MAX M_COUNT M_AVG
--------- ------------------------- ---------- ------ ------ ------ ------- ------
20-JAN-04 /CM01_AddressManagement CREATE .680 .305 .375 2 .340
20-JAN-04 /CM01_AddressManagement INSERT .391 .391 .391 1 .391
20-JAN-04 /CM01_PersonManagement EDIT .203 .203 .203 1 .203
19-JAN-04 /CM01_PersonManagement CREATE .344 .047 .297 2 .172
19-JAN-04 /CM01_PersonManagement EDIT .375 .375 .375 1 .375
16-JAN-04 /CM01_PersonManagement EDIT .312 .312 .312 1 .312
15-JAN-04 /CM01_PersonManagement CREATE .515 .031 .484 2 .258
This seems to work, but there has to be a better/cleaner/more efficient way to do this:
select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
from (
select trunc(ts_cr) ts_cr,id, name2, name3, sum(sec) m_dummy
, min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min
, max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max
, round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg
, count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count
, sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
from tran_stats group by name2, name3,trunc(ts_cr),id
)n order by 1 desc, 2, 3;
Any help or pointers would be appreciated. Thanks in advance.
January 30, 2004 - 10:31 am UTC
why does there "have to be"?
what is "unclean" about this? I could make it more verbose (and perhaps more readable) but this does exactly what you ask for?
It seems pretty "good", very "clean" and probably the most efficient method to get this result?
Regarding the previous post ...
A reader, January 30, 2004 - 11:45 am UTC
Am I missing something or will the following do the same ..
select trunc(ts_cr) ts_cr, name2, name3,
count(*) m_count, min(sec) m_min, max(sec) m_max,
sum(sec) m_sum, avg(sec) m_avg
from tran_stats
group by trunc(ts_cr), name2, name3
order by 1 desc, 2, 3
January 30, 2004 - 7:43 pm UTC
with the supplied data -- since "group by trunc(ts_cr), name2, name3" happened to be unique
yes.
In general -- no. consider:
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
2 from ( select trunc(ts_cr) ts_cr,
3 id,
4 name2,
5 name3,
6 sum(sec) m_dummy ,
7 min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
8 max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
9 round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
10 count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
11 sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
12 from tran_stats
13 group by name2, name3,trunc(ts_cr),id
14 )n
15 MINUS
16 select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
17 from (
18 select trunc(ts_cr) ts_cr, name2, name3,
19 count(*) m_count, min(sec) m_min, max(sec) m_max,
20 sum(sec) m_sum, avg(sec) m_avg
21 from tran_stats
22 group by trunc(ts_cr), name2, name3 )
23 /
no rows selected
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tran_stats (id, name2, name3, sec, ts_cr)
2 select 35947,'/CM01_PersonManagement','CREATE', .484 ,'15-JAN-04'
3 from all_users where rownum <= 5;
5 rows created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
2 from ( select trunc(ts_cr) ts_cr,
3 id,
4 name2,
5 name3,
6 sum(sec) m_dummy ,
7 min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
8 max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
9 round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
10 count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
11 sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
12 from tran_stats
13 group by name2, name3,trunc(ts_cr),id
14 )n
15 MINUS
16 select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
17 from (
18 select trunc(ts_cr) ts_cr, name2, name3,
19 count(*) m_count, min(sec) m_min, max(sec) m_max,
20 sum(sec) m_sum, avg(sec) m_avg
21 from tran_stats
22 group by trunc(ts_cr), name2, name3 )
23 /
TS_CR NAME2 NAME3 M_SUM M_MIN M_MAX M_COUNT M_AVG
--------- ----------------------- -------- ---------- ---------- ---------- ---------- ----------
15-JAN-04 /CM01_PersonManagement CREATE 2.935 .031 2.904 2 1.4675
add more data and it won't be the same.
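The difference is that the analytic version first sums sec per (day, name2, name3, id) and then takes min/max over those per-id sums, while the plain GROUP BY takes min/max over individual rows; they only agree while each id contributes a single row. A small Python sketch of the divergence (illustrative, not the SQL):

```python
from collections import defaultdict

def per_id_min_max(rows):
    # rows: list of (id, sec) within one (day, name2, name3) group;
    # aggregate per id first, then take min/max of the per-id sums
    sums = defaultdict(float)
    for id_, sec in rows:
        sums[id_] += sec
    return min(sums.values()), max(sums.values())

def per_row_min_max(rows):
    # min/max over the raw rows, as the plain GROUP BY computes
    secs = [sec for _, sec in rows]
    return min(secs), max(secs)

one_row_each = [(1, .484), (2, .031)]      # ids unique: both agree
dup_id = [(1, .484), (1, .484), (2, .031)]  # id 1 repeated: they differ
```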
OK
Siva, January 31, 2004 - 9:05 am UTC
Dear Tom,
Can analytics be used for the following formats of the same query
sql>select ename,nvl(ename,'Name is null') from emp
sql>select ename,decode(ename,null,'Name is null',ename)
from emp
If you know other ways,Please let me know
Bye!
January 31, 2004 - 10:03 am UTC
umm, why ?
with analytics
A reader, February 18, 2004 - 7:30 am UTC
with the following data
-- ------
1 val1_1
1 val1_2
1 val1_3
2 val1_1
2 val2_2
can i produce
-- ------ --------------------
1 val1_1 val1_1,val1_2,val1_3
1 val1_2 val1_1,val1_2,val1_3
1 val1_3 val1_1,val1_2,val1_3
2 val2_1 val2_1,val2_2
2 val2_2 val2_1,val2_2
with an analytic that rocks
February 18, 2004 - 8:47 pm UTC
if
select max(count(*)) from t group by id
has a reasonable maximum -- yes, but it would be a tricky lag/lead thing.
I would probably use stragg instead: join the details to the aggregate using inline views.
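(stragg is Tom's user-defined string-aggregate function; in 11gR2 and later, LISTAGG does this natively.) The join-details-to-the-aggregate idea, sketched in Python for illustration:

```python
from collections import defaultdict

def with_group_list(rows):
    # build the comma-separated list per id (the "stragg" step),
    # then attach it to every detail row (the join step)
    groups = defaultdict(list)
    for id_, val in rows:
        groups[id_].append(val)
    return [(id_, val, ",".join(groups[id_])) for id_, val in rows]

rows = [(1, "val1_1"), (1, "val1_2"), (1, "val1_3"),
        (2, "val2_1"), (2, "val2_2")]
```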
OK
Siddiq, March 01, 2004 - 9:26 am UTC
Hi Tom,
What can be the business use cases of the analytic functions
1)cume_dist
2)percentile_disc
3)percentile_cont
Where can they be of immense use?
Bye!
March 01, 2004 - 10:17 am UTC
they are just statistical functions for analysis.
2 and 3 are really variations on each other (disc=discrete, cont=continuous) and would be used to compute percentiles (like you might see on an SAT test report from back in high school). percentile_* can be used to find a median for example :)
cume_dist is a variation on that. I'll cheat on an example, from the doc:
Analytic Example
The following example calculates the salary percentile for each employee in the purchasing area. For example, 40% of clerks have salaries less than or equal to Himuro's.
SELECT job_id, last_name, salary, CUME_DIST() OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist FROM employees WHERE job_id LIKE 'PU%';
JOB_ID LAST_NAME SALARY CUME_DIST
---------- ------------------------- ---------- ----------
PU_CLERK Colmenares 2500 .2
PU_CLERK Himuro 2600 .4
PU_CLERK Tobias 2800 .6
PU_CLERK Baida 2900 .8
PU_CLERK Khoo 3100 1
PU_MAN Raphaely 11000 1
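The definition behind that output can be checked by hand: within a partition, cume_dist for a row is (number of rows with a value less than or equal to it) divided by (rows in the partition). A short Python sketch (illustrative, not Oracle code):

```python
def cume_dist(values):
    # cumulative distribution of each value within its partition
    n = len(values)
    return [sum(1 for w in values if w <= v) / n for v in values]

pu_clerk_salaries = [2500, 2600, 2800, 2900, 3100]
```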
Stumped on Analytics
Dave Thompson, March 04, 2004 - 9:56 am UTC
Hi Tom,
I have the following two tables:
CREATE TABLE PAY_M
(
PAY_ID NUMBER,
PAYMENT NUMBER
)
--
--
CREATE TABLE PREM
(
PREM_ID NUMBER,
PREM_PAYMENT NUMBER
)
With the following data:
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
1, 100);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
2, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
3, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
4, 50);
COMMIT;
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
1, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
2, 25);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
3, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
4, 50);
COMMIT;
PAY_M contains payments made against the premiums in the table prem.
Payments:
PAY_ID PAYMENT
---------- ----------
1 50
2 25
3 50
4 50
Prem:
PREM_ID PREM_PAYMENT
---------- ------------
1 100
2 50
3 50
4 50
We are trying to find which payment Ids paid each premium payment in Prem. The payments are assigned sequentially to the premiums.
For example payments 1,2 & 3 pay off the £100 in premium 1 leaving £25. Then the remaining payment from payment 3 & payment 4 pay off premium 2 leaving a balance of £25, and so on.
We are trying to create a query that will use the analytic functions to find all the payment IDs that pay off the associated premium ids. We want to keep this SQL-based as we need to process about 30 million payments!
Thanks.
Great website, hope you enjoyed your recent visit to the UK.
March 04, 2004 - 1:52 pm UTC
let me make sure I have this straight -- you want to
o sum up the first 3 records in payments
o discover they are 125 which exceeds 100
o output the fact that prem_id 1 is paid for by pay_id 1..3
o carry forward 25 from 3, discover that leftover 3+4 = 75 pays for prem_id 2
with 25 extra
while I believe (not sure) that the 10g MODEL clause might be able to do this (if you can do it in a spreadsheet, we can use the MODEL clause to do it).....
I'm pretty certain that analytics cannot -- we would need to recursively use lag (eg: after finding that 1,2,3 pay off 1, we'd need to -- well, it's hard to explain...)
I cannot see analytics doing this -- future rows depend on functions of the analytics from past rows and that is just "not allowed".
I can see how to do this in a pipelined PLSQL function -- will that work for you?
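The row-by-row logic such a pipelined function would implement can be sketched in Python (illustrative only; column and table names follow the example above): walk the payments in order, pouring each into the current premium until it is covered, carrying any remainder forward.

```python
def allocate(premiums, payments):
    # premiums: [(prem_id, amount)], payments: [(pay_id, amount)],
    # both in sequence order.  Returns [(prem_id, pay_ids, shortfall)];
    # shortfall > 0 means the premium was not fully paid.
    out = []
    pays = iter(payments)
    carry_id, carry_amt = None, 0   # leftover from the previous premium
    for prem_id, due in premiums:
        funders = []
        if carry_amt > 0:
            funders.append(carry_id)
            due -= carry_amt
            carry_amt = 0
        while due > 0:
            nxt = next(pays, None)
            if nxt is None:
                break               # ran out of payments
            pay_id, amt = nxt
            funders.append(pay_id)
            due -= amt
        if due < 0:                 # overpaid: carry surplus forward
            carry_id, carry_amt = funders[-1], -due
            due = 0
        out.append((prem_id, funders, due))
    return out

premiums = [(1, 100), (2, 50), (3, 50), (4, 50)]
payments = [(1, 50), (2, 25), (3, 50), (4, 50)]
```

With the sample data, premium 1 is funded by payments 1, 2 and 3 (with 25 carried forward), premium 2 by the remainder of 3 plus payment 4, and the rest are short, matching the worked example in the question.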
Oops - Error in previous post
Dave Thompson, March 04, 2004 - 10:17 am UTC
Tom,
Sorry, ignore the above tables as they are missing the joining column:
CREATE TABLE PAY_M
(
PREM_ID NUMBER,
PAY_ID NUMBER,
PAYMENT NUMBER
)
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 1, 50);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 2, 25);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 3, 50);
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES (
1, 4, 50);
COMMIT;
CREATE TABLE PREM
(
PREM_ID NUMBER,
PAY_ID NUMBER,
PREM_PAYMENT NUMBER
)
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 1, 100);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 2, 50);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 3, 50);
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES (
1, 4, 50);
COMMIT;
SQL> l
1 SELECT *
2* FROM PAY_M
SQL> /
PREM_ID PAY_ID PAYMENT
---------- ---------- ----------
1 1 50
1 2 25
1 3 50
1 4 50
SQL> select *
2 from prem;
PREM_ID PAY_ID PREM_PAYMENT
---------- ---------- ------------
1 1 100
1 2 50
1 3 50
1 4 50
Thanks.....
Dave Thompson, March 05, 2004 - 4:23 am UTC
Tom,
Thanks for your prompt response.
I am familiar with Pipeline functions.
I was however hoping we could do this as a set-based operation because of the volume of data involved.
Thanks for your time.
analytics book
Ron Chennells, March 05, 2004 - 5:52 am UTC
Just another vote and pre order for the analytics book
OK
Gerhard, March 19, 2004 - 12:33 am UTC
Dear Tom,
I used the following query to find the difference of salaries between employees.
SQL> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal from emp;
ENAME SAL DIFF_SAL
---------- ---------- ----------
SMITH 800
JAMES 950 150
ADAMS 1100 150
WARD 1250 150
MARTIN 1250 0
MILLER 1300 50
TURNER 1500 200
ALLEN 1600 100
CLARK 2450 850
BLAKE 2850 400
JONES 2975 125
ENAME SAL DIFF_SAL
---------- ---------- ----------
SCOTT 3000 25
FORD 3000 0
KING 5000 2000
14 rows selected.
My Question is:
"What is the difference between King's sal and that of each other employee?" Could you please help with the query?
Bye!
March 19, 2004 - 8:58 am UTC
scott@ORA9IR2> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal ,
2 sal-king_sal king_sal_diff
3 from (select sal king_sal from emp where ename = 'KING'),
4 emp
5 /
ENAME SAL DIFF_SAL KING_SAL_DIFF
---------- ---------- ---------- -------------
SMITH 800 -4200
JAMES 950 150 -4050
ADAMS 1100 150 -3900
WARD 1250 150 -3750
MARTIN 1250 0 -3750
MILLER 1300 50 -3700
TURNER 1500 200 -3500
ALLEN 1600 100 -3400
CLARK 2450 850 -2550
BLAKE 2850 400 -2150
JONES 2975 125 -2025
SCOTT 3000 25 -2000
FORD 3000 0 -2000
KING 5000 2000 0
14 rows selected.
Will this be faster?
Venkat, March 19, 2004 - 4:20 pm UTC
select ename, sal,
sal-lag(sal) over(order by sal) as diff_sal,
sal - max(case when ename='KING' then sal
else null end) over () king_sal_diff
from emp
March 20, 2004 - 9:47 am UTC
when you benchmarked it and tested it to scale, what did you see? it would be interesting no?
lead/lag on different dataset
Stalin, May 03, 2004 - 9:22 pm UTC
Hi Tom,
I have a similar requirement, but I'm not sure how to use lead or lag to refer to a different dataset.
E.g., the logs table has both login and logout information, identified by the action column. There can be different login/logout modes, so records with action in (1,2) are login records and records with action in (3,4,5,6,7) are logout records. Now I need to find signon and signout times and also session duration in minutes.
here is some sample data of logs table :
LOG_ID LOG_CREATION_DATE USER_ID SERVICE ACTION
---------- ------------------- ---------- ---------- ----------
1 04/29/2004 10:48:36 3 5 2
3 04/29/2004 10:53:44 3 5 3
5 04/29/2004 11:11:35 3 5 1
1003 05/03/2004 15:18:53 3 5 5
1004 05/03/2004 15:19:50 8 5 1
here is a query I came up with (not exactly what I want):
select log_id signon_id, lead(log_id, 1) over (partition by account_id, user_id, mac order by log_id) signoff_id,
user_id, log_creation_date signon_date,
lead(log_creation_date, 1) over (partition by account_id, user_id, mac order by log_creation_date) signoff_date,
nvl(round(((lead(log_creation_date, 1)
over (partition by account_id, user_id order by log_creation_date)-log_creation_date)*1440), 2), 0) Usage_Mins
from logs
where account_id = 'Robert'
and service = 5
order by user_id
desired output :
SIGNON_ID SIGNOFF_ID USER_ID SIGNON_DATE SIGNOFF_DATE USAGE_MINS
---------- ---------- ---------- ------------------- ------------------- ----------
1 3 3 04/29/2004 10:48:36 04/29/2004 10:53:44 5.13
5 1003 3 04/29/2004 11:11:35 05/03/2004 15:18:53 6007.3
1004 8 05/03/2004 15:19:50 0
Thanks in Advance,
Stalin
May 04, 2004 - 7:11 am UTC
maybe if you supply simple create table and insert ... values ... statements for me.... this stuff would go faster.
Your query references columns that are not in the example as well.
Create table scripts
Stalin, May 04, 2004 - 1:29 pm UTC
Sorry for not giving this info in the first place.
here goes the scripts....
create table logs (log_id number, log_creation_date date, account_id varchar2(25), user_id number,
service number, action number, mac varchar2(50))
/
insert into logs values (1, to_date('04/29/2004 10:48:36', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 2, '00-00-00-00')
/
insert into logs values (3, to_date('04/29/2004 10:53:44', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 3, '00-00-00-00')
/
insert into logs values (5, to_date('04/29/2004 11:11:35', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 1, '00-00-00-00')
/
insert into logs values (1003, to_date('05/03/2004 15:18:53', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 3, 5, 5, '00-00-00-00')
/
insert into logs values (1004, to_date('05/03/2004 15:19:50', 'mm/dd/yyyy hh24:mi:ss'), 'Robert', 8, 5, 1, '00-00-00-00')
/
The reason for including mac in the partition group is that users can log in via multiple PCs without logging out, hence I grouped on account_id, user_id and mac.
Thanks,
Stalin
May 04, 2004 - 2:38 pm UTC
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 ) minutes
2 from (
3 select log_id,
4 case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
5 then lead(log_id) over (partition by account_id, user_id, mac order by log_creation_date)
6 end signoff_id,
7 user_id,
8 log_creation_date signon_date,
9 case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
10 then lead(log_creation_date) over (partition by account_id, user_id, mac order by log_creation_date)
11 end signoff_date,
12 action
13 from logs
14 where account_id = 'Robert'
15 and service = 5
16 order by user_id
17 ) a
18 where action in (1,2)
19 /
LOG_ID SIGNOFF_ID USER_ID SIGNON_DATE SIGNOFF_DATE ACTION MINUTES
---------- ---------- ---------- ------------------- ------------------- ---------- ----------
1 3 3 04/29/2004 10:48:36 04/29/2004 10:53:44 2 5.13
5 1003 3 04/29/2004 11:11:35 05/03/2004 15:18:53 1 6007.3
1004 8 05/03/2004 15:19:50 1
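The pairing rule in that query is simple to state: a login row (action 1 or 2) picks up the next row in its partition only when that next row is a logout (action 3 through 7); otherwise it is a dangling login. A Python sketch of just that rule (illustrative, not the SQL; timestamps abbreviated):

```python
LOGIN, LOGOUT = {1, 2}, {3, 4, 5, 6, 7}

def pair_sessions(rows):
    # rows: [(log_id, ts, action)] ordered by ts within one partition
    out = []
    for i, (log_id, ts, action) in enumerate(rows):
        if action not in LOGIN:
            continue                      # logout rows are consumed by leads
        nxt = rows[i + 1] if i + 1 < len(rows) else None
        if nxt is not None and nxt[2] in LOGOUT:
            out.append((log_id, nxt[0], ts, nxt[1]))   # signon/signoff pair
        else:
            out.append((log_id, None, ts, None))       # dangling login
    return out

rows = [(1, "10:48", 2), (3, "10:53", 3), (5, "11:11", 1),
        (1003, "15:18", 5), (1004, "15:19", 1)]
```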
Excellent
Stalin, May 04, 2004 - 3:42 pm UTC
This is exactly what i'm looking for.
Thanks so much!
Help On SQL
VKOUL, May 04, 2004 - 8:05 pm UTC
I want to carry the last non-null value of a column down into the following null rows. e.g.
If I have records like the following
year month column_value
----- ------ --------------------
2002 06 55
2002 06 57
2002 07 NULL
2002 08 NULL
2002 09 NULL
2002 10 100
2002 11 101
I want the results as below
year month column_value
----- ------ --------------------
2002 06 55
2002 06 57
2002 07 57 ------> Repeated
2002 08 57 ------> Repeated
2002 09 57 ------> Repeated
2002 10 100
2002 11 101
May 04, 2004 - 9:08 pm UTC
create table,
insert into table
much appreciated......... (so i don't spend days of my life making create tables and insert into statements. I've added this request to all pages where you can input stuff and I'll just be asking for it from now on in...... Not picking on you, just reminding everyone that i need a script like I provide.....)
but..... asked and answered:
</code>
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10286792840956 <code>
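The carry-down itself (the effect of last_value with ignore nulls, available in later releases) is a one-pass operation; a minimal Python sketch for illustration:

```python
def carry_forward(values):
    # replace each None with the most recent non-None value above it
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

vals = [55, 57, None, None, None, 100, 101]
```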
Help On SQL
VKoul, May 04, 2004 - 11:27 pm UTC
Beautiful !!!
I'll keep in mind "create table etc."
Thanks
VKoul
analytic q
A reader, May 11, 2004 - 6:38 pm UTC
Tom
Please look at the following schema and data.
---------
spool schema
set echo on
drop table host_instances;
drop table rac_instances;
drop table instance_tablespaces;
create table host_instances
(
host_name varchar2(50),
instance_name varchar2(50)
);
create table rac_instances
(
rac_name varchar2(50),
instance_name varchar2(50)
);
create table instance_tablespaces
(
instance_name varchar2(50),
tablespace_name varchar2(50),
tablespace_size number
);
-- host to instance mapping data
insert into host_instances values ( 'h1', 'i1' );
insert into host_instances values ( 'h2', 'i2' );
insert into host_instances values ( 'h3', 'i3' );
insert into host_instances values ( 'h4', 'i4' );
insert into host_instances values ( 'h5', 'i5' );
-- rac to instance mapping data
insert into rac_instances values ( 'rac1', 'i1' );
insert into rac_instances values ( 'rac1', 'i2' );
insert into rac_instances values ( 'rac2', 'i3' );
insert into rac_instances values ( 'rac2', 'i4' );
--- instance to tablespace mapping data
insert into instance_tablespaces values( 'i1', 't11', 100 );
insert into instance_tablespaces values( 'i1', 't12', 200 );
insert into instance_tablespaces values( 'i2', 't11', 100 );
insert into instance_tablespaces values( 'i2', 't12', 200 );
insert into instance_tablespaces values( 'i3', 't31', 500 );
insert into instance_tablespaces values( 'i3', 't32', 300 );
insert into instance_tablespaces values( 'i4', 't31', 500 );
insert into instance_tablespaces values( 'i4', 't32', 300 );
insert into instance_tablespaces values( 'i5', 't51', 400 );
commit;
---------
What I need is to sum up all tablespaces of all instances
for a list of hosts. However, if two hosts in the list
belong to a RAC then I should only pick one of the
hosts (I can pick any one of them.)
e.g. in the above data I should only pick i1 or i2 not
both since they both belong to the same RAC 'rac1'.
Following is the select I came up with for the above data.
Let me know if you have any comments on it.
Any other alternative solutions you can think of would
also be educating to me. I have not benchmarked this
select yet. The number of hosts could reach up to 2000
approximately. On average we can assume each will have
one instance - some of these will be RACs.
Thank you!
-----------
scott@ora10g> set echo on
scott@ora10g> column host_name format a10
scott@ora10g> column instance_name format a10
scott@ora10g> column rac_name format a10
scott@ora10g> column row_number format 999
scott@ora10g>
scott@ora10g> select a.instance_name, sum( tablespace_size )
2 from
3 (
4 select instance_name
5 from
6 (
7 select host_name, instance_name, rac_name,
8 row_number() over
9 (
10 partition by rac_name
11 order by rac_name, instance_name
12 ) row_number
13 from
14 (
15 select hi.host_name, hi.instance_name, ri.rac_name
16 from host_instances hi, rac_instances ri
17 where hi.instance_name = ri.instance_name(+)
18 )
19 )
20 where row_number <= 1
21 ) a, instance_tablespaces e
22 where a.instance_name = e.instance_name
23 group by a.instance_name;
i1 300
i3 800
i5 400
---
Also do you prefer the .sql file (as above) or
the spooled output of schema.sql (i.e. schema.lst.)
The above is more convenient to reproduce - but the spooled output makes for better reading in some cases.
May 11, 2004 - 9:15 pm UTC
I like the cut and paste from sqlplus truth be told.
sure, I have to do two vi commands and a couple of deletes to fix it up but.... I'm fairly certain that the poster *actually ran the commands successfully!* which is most relevant to me....
Besides, I do it to you ;)
ops$tkyte@ORA9IR2> select *
2 from (
3 select h.host_name, h.instance_name, r.rac_name, sum(t.tablespace_size),
4 row_number() over (partition by r.rac_name order by h.host_name ) rn
5 from host_instances h,
6 rac_instances r,
7 instance_tablespaces t
8 where h.instance_name = r.instance_name(+)
9 and h.instance_name = t.instance_name
10 group by h.host_name, h.instance_name, r.rac_name
11 )
12 where rn = 1
13 /
HO IN RAC_N SUM(T.TABLESPACE_SIZE) RN
-- -- ----- ---------------------- ----------
h1 i1 rac1 300 1
h3 i3 rac2 800 1
h5 i5 400 1
is the first thing that popped into my head.
with just a couple hundred rows -- any of them will perform better than good enough.
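The "keep one row per group" idea in that query (row_number() partitioned by rac_name, then rn = 1) can be mirrored outside the database; a minimal Python sketch over the sample data above, where a standalone instance simply forms a group of its own:

```python
from collections import defaultdict

host_instances = [("h1", "i1"), ("h2", "i2"), ("h3", "i3"), ("h4", "i4"), ("h5", "i5")]
rac_instances = {"i1": "rac1", "i2": "rac1", "i3": "rac2", "i4": "rac2"}
tablespaces = [("i1", "t11", 100), ("i1", "t12", 200), ("i2", "t11", 100),
               ("i2", "t12", 200), ("i3", "t31", 500), ("i3", "t32", 300),
               ("i4", "t31", 500), ("i4", "t32", 300), ("i5", "t51", 400)]

# Outer-join hosts to RACs; an instance with no RAC gets a group of its own.
chosen = {}
for host, inst in sorted(host_instances):      # "order by h.host_name"
    group = rac_instances.get(inst, inst)      # rac_name, or the instance itself
    chosen.setdefault(group, inst)             # keep only the first per group (rn = 1)

# Sum tablespace sizes for the chosen instances only.
totals = defaultdict(int)
for inst, ts, size in tablespaces:
    if inst in chosen.values():
        totals[inst] += size
```

Running this gives the same totals as the SQL above: i1 gets 300, i3 gets 800, i5 gets 400.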
thanx!
A reader, May 11, 2004 - 9:54 pm UTC
"I like the cut and paste from sqlplus truth be told."
Actually I was going to post exactly that - but your
example at the point of posting led me to believe
that you wanted a straight .sql script - maybe you wanna
fix that (not that many people seem to care anyway! :))
Thanks for the sql - it looks good and a tad simpler
than the one I wrote...
How to compute this running total (sort of...)
Kishan, May 18, 2004 - 11:33 am UTC
create table investment (
investment_id number,
asset_id number,
agreement_id number,
constraint pk_i primary key (investment_id)
)
/
create table period (
period_id number,
business_domain varchar2(10),
status_code varchar2(10),
constraint pk_p primary key (period_id)
)
/
create table entry (
entry_id number,
period_id number,
investment_id number,
constraint pk_e primary key(entry_id),
constraint fk_e_period foreign key(period_id) references period(period_id),
constraint fk_e_investment foreign key (investment_id) references investment(investment_id)
)
/
create table entry_detail(
entry_id number,
account_type varchar2(10),
amount number,
constraint pk_ed primary key(entry_id, account_type),
constraint fk_ed_entry foreign key(entry_id) references entry(entry_id)
)
/
insert into period (period_id, business_domain, status_code)
SELECT rownum AS period_id,
'BDG' AS business_domain,
'2' AS status_code
from all_objects where rownum <= 5
/
insert into investment(investment_id, asset_id, agreement_id)
select rownum+10 AS investment_id,
rownum+100 AS asset_id,
rownum+1000 AS agreement_id
from all_objects where rownum <=5
/
insert into entry(entry_id, period_id, investment_id) values (1, 1, 11)
/
insert into entry(entry_id, period_id, investment_id) values (2, 2, 11)
/
insert into entry(entry_id, period_id, investment_id) values (3, 3, 11)
/
insert into entry(entry_id, period_id, investment_id) values (4, 3, 13)
/
insert into entry(entry_id, period_id, investment_id) values (5, 4, 13)
/
insert into entry(entry_id, period_id, investment_id) values (6, 4, 14)
/
insert into entry(entry_id, period_id, investment_id) values (7, 5, 14)
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC1', 1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC2', -200 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC3', 300 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(5, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC1', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC3', 500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC4', 1200 )
/
scott@LDB.US.ORACLE.COM> select * from period;
PERIOD_ID BUSINESS_D STATUS_COD
---------- ---------- ----------
1 BDG 2
2 BDG 2
3 BDG 2
4 BDG 2
5 BDG 2
scott@LDB.US.ORACLE.COM> select * from investment;
INVESTMENT_ID ASSET_ID AGREEMENT_ID
------------- ---------- ------------
11 101 1001
12 102 1002
13 103 1003
14 104 1004
15 105 1005
scott@LDB.US.ORACLE.COM> select * from entry;
ENTRY_ID PERIOD_ID INVESTMENT_ID
---------- ---------- -------------
1 1 11
2 2 11
3 3 11
4 3 13
5 4 13
6 4 14
7 5 14
7 rows selected.
scott@LDB.US.ORACLE.COM> select * from entry_detail;
ENTRY_ID ACCOUNT_TY AMOUNT
---------- ---------- ----------
1 AC1 1000
1 AC2 -200
1 AC3 300
2 AC1 200
2 AC4 -1000
2 AC2 -500
3 AC2 2200
3 AC1 200
4 AC4 -1000
4 AC2 -500
5 AC2 2200
6 AC1 200
6 AC4 -1000
6 AC2 -500
7 AC1 2200
7 AC3 500
7 AC4 1200
17 rows selected.
The resultant view needed is given below.
To give an example from the result below, the first entry for investment_id 14
is from period 4. The account types entered on period 4 are AC1, AC4, AC2. We
need these three account types in all subsequent periods. Also, on period 5 a
new account type AC3 is added. So, if there is another period, say period_id 6, we need
information for AC1, AC2, AC3, AC4 (that's 4 account types). If there's no entry
for any of these account_types in any subsequent period, the amount_for_period for such
periods are considered to be 0.00 and the balance will be sum(amount_for_period)
until that period.
PERIOD_ID INVESTMENT_ID ACCOUNT_TYPE AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
--------- ------------- ------------ ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
3 11 AC1 200 1400
3 11 AC2 200 -500
3 11 AC3 0 300
3 11 AC4 0 1000
4 11 AC1 0 1400
4 11 AC2 0 -500
4 11 AC3 0 300
4 11 AC4 0 1000
5 11 AC1 0 1400
5 11 AC2 0 -500
5 11 AC3 0 300
5 11 AC4 0 1000
3 13 AC4 -1000 -1000
3 13 AC2 -500 -500
4 13 AC4 0 -1000
4 13 AC2 -500 -1000
5 13 AC4 0 -1000
5 13 AC4 0 -1000
4 14 AC1 200 200
4 14 AC4 -1000 -1000
4 14 AC2 -500 -500
5 14 AC1 2200 2400
5 14 AC3 500 500
5 14 AC4 1200 200
5 14 AC2 0 -500
The blank lines in between are just for clarity. As always, grateful for all your efforts.
Regards,
Kishan.
May 18, 2004 - 6:14 pm UTC
so, what does your first try look like :) at least get the join written up for the details - maybe the running total will be obvious from that.
This is how far I went...and no further
Kishan, May 19, 2004 - 10:18 am UTC
select distinct period_id,
investment_id,
account_type,
amount_for_period,
balance_till_period
from ( select period.period_id,
entry.investment_id,
entry_detail.account_type,
(case when entry.period_id = period.period_id then entry_detail.amount else 0 end) amount_for_period,
sum(amount) over(partition by period.period_id, investment_id, account_type) balance_till_period
from period left outer join (entry join entry_detail on (entry.entry_id = entry_detail.entry_id)) on (entry.period_id <= period.period_id))
order by investment_id
The result looks as below:
PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 0 1200
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC2 0 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
3 11 AC1 0 1400
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
4 11 AC1 0 1400
4 11 AC2 0 1500
4 11 AC3 0 300
4 11 AC4 0 -1000
5 11 AC1 0 1400
5 11 AC2 0 1500
5 11 AC3 0 300
5 11 AC4 0 -1000
3 13 AC2 -500 -500
3 13 AC4 -1000 -1000
4 13 AC2 0 1700
4 13 AC2 2200 1700
4 13 AC4 0 -1000
5 13 AC2 0 1700
5 13 AC4 0 -1000
4 14 AC1 200 200
4 14 AC2 -500 -500
4 14 AC4 -1000 -1000
5 14 AC1 0 2400
5 14 AC1 2200 2400
5 14 AC2 0 -500
5 14 AC3 500 500
5 14 AC4 0 200
5 14 AC4 1200 200
First, I am sorry my originally constructed result (by hand..;) misses a couple of rows.
However, other than that, I am unable to remove the redundant rows that show up for a particular investment and account_type in a period, as the logic beats me.
Basically, I need to remove rows where the amount_for_period is 0 for an account_type only if it's a redundant row for that set. That is, the first row of period_id 2 and 3 are redundant but the rows for period 4 are not redundant.
Could you help me out?
Regards,
Kishan.
May 19, 2004 - 11:06 am UTC
are we missing some more order bys? I mean -- what if:
3 11 AC1 0 1400
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
was really:
3 11 AC1 200 1400
3 11 AC2 0 1500
3 11 AC2 2200 1500
3 11 AC3 0 300
3 11 AC4 0 -1000
3 11 AC1 0 1400
would that still be redundant? am I missing something here?
Yes...they are redundant
A reader, May 19, 2004 - 12:16 pm UTC
Tom:
Yes, for that particular set, those rows are redundant, no matter what the order is.
Regards,
Kishan.
May 19, 2004 - 2:24 pm UTC
ok, so what is the "key" of that result set? what can we partition the result set by.
my idea will be to use your query in an inline view and analytics on that to weed out what you want.
Kishan, May 19, 2004 - 3:08 pm UTC
The key would be period_id, investment_id and account_type. Basically, what the result represents is the amount and the balance-to-date for a particular account_type of an investment_id for a period.
Eg: Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
If there's no activity on that investment and account_type for the next period, say Period 2, the amount will be 0 for that period, and the balance will be previous period's balance.
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=0->Balance = 1000
But, if there's an activity on that account_type for that investment, then the amount will be the amount for that period and balance will be the sum of previous balance and current amount. Say for Period 2, the amount is 500, then
Period 1->Investment 1->Account_Type AC1->Amount=1000-> Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500-> Balance=1500
And if there's a new account type entry, say AC2 and amount, say 2000 created for period 2, then the result set will be
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500->Balance=1500
Period 2->Investment 1->Account_Type AC2->Amount=2000->Balance=2000
There may be many investments per period and many account_types per investment. Hope I am clear....
Regards,
Kishan.
May 19, 2004 - 5:34 pm UTC
so... if you have:
PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
1 11 AC1 1000 1000
1 11 AC2 -200 -200
1 11 AC3 300 300
2 11 AC1 0 1200
2 11 AC1 200 1200
2 11 AC2 -500 -700
2 11 AC2 0 -700
2 11 AC3 0 300
2 11 AC4 -1000 -1000
you see though, why isn't the 4th line here "redundant" then?
But it is redundant..
Kishan, May 19, 2004 - 11:51 pm UTC
Tom, I am assuming the 4th line you mention is 2->11->AC2->0->-700. Yes, it is redundant.
We need the amount and balance for every period_id, investment_id and account_type. One line per period_id, investment_id and account_type; anything more is redundant.
The issue is, there may not be entries for a specific account_type of an investment for a particular period. In such cases, we need to assume the amount for those periods is 0 and compute the balances accordingly.
Regards,
Kishan
May 20, 2004 - 10:55 am UTC
so, if you partition by
PERIOD_ID INVESTMENT_ID ACCOUNT_TY BALANCE_TILL_PERIOD
order by
AMOUNT_FOR_PERIOD
select a.*, lead(amount_for_period) over (partition by .... order by ... ) nxt
from (YOUR_QUERY)
you can then
select *
from (that_query)
where nxt is NULL or (nxt is not null and amount_for_period <> 0)
if nxt is null -- last row in the partition, keep it.
if nxt is not null AND we are zero -- remove it.
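That lead()-based weed-out can be sketched in plain Python; the rows below are hypothetical stand-ins for what the inline view would produce, including one redundant zero row:

```python
from itertools import groupby

# (period_id, investment_id, account_type, amount_for_period, balance)
rows = [(2, 11, "AC1", 0, 1200),
        (2, 11, "AC1", 200, 1200),
        (2, 11, "AC4", -1000, -1000)]

part = lambda r: (r[0], r[1], r[2], r[4])                # the "partition by" columns
ordered = sorted(rows, key=lambda r: part(r) + (r[3],))  # plus "order by amount"

keep = []
for _, grp in groupby(ordered, key=part):
    grp = list(grp)
    for i, r in enumerate(grp):
        has_next = i < len(grp) - 1              # lead(...) is not null
        if not has_next or r[3] != 0:            # keep last-in-partition, or nonzero
            keep.append(r)
```

The redundant (amount 0, followed by another row in its partition) line is dropped; the last row of each partition and every nonzero amount survive.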
Almost there?
Dave Thompson, May 20, 2004 - 12:30 pm UTC
Hi Tom,
We have the following table of data:
CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)
With:
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;
We are trying to sequentially de-duplicate this data.
Basically, from the top of the table we go down and check each row against the previous one. If they are the same, both the duplicate row and the original row are marked as duplicates.
So far we have this query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
FROM DEDUP_TEST
ORDER
BY 1, 5 ) ) )
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-00 01-OCT-01 DUP
2 A B B 01-OCT-01 01-OCT-03 DUP
2 A B B 02-OCT-01 05-OCT-03 DUP
2 a f f 06-FEB-04 07-FEB-04 NOT
2 A B B 02-OCT-05 03-OCT-05 DUP
2 A B B 04-OCT-05 06-OCT-05 DUP
3 A F F 10-FEB-04 20-FEB-04 NOT
The resultset from this is almost what I am after.
However where there are groups of duplicate rows I only want to return one row. I take the attributes, the start_date of the first row duplicated and the end_date of the last row duplicated.
I do not want to group all the duplicates together, so for example the rows with the attributes
ID COLUMN_A COLUMN_B COLUMN_C
2 A B B
will result in two output rows:
2 A B B 01-OCT-00 01-OCT-03
2 A B B 02-OCT-05 06-OCT-05
This is the final piece I cannot work out.
Any help would be appreciated.
Thanks.
May 20, 2004 - 2:18 pm UTC
what happens in your data if you had
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
that might or might not be "dup" since you just order by ID? don't we need to order by a, b, and c?
Follow up
Dave Thompson, May 21, 2004 - 5:02 am UTC
Hi Tom,
In response to your question:
what happens in your data if you had
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
Then the first row would be classed as unique, as would the second and the third. We are only looking at duplicates that occur sequentially.
Sequential duplicates are then turned into one row by taking the start date of the first row and the end date of the last row in the group.
The test data should have had sequential dates:
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2009 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2010 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2007 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2008 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2011 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2012 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2013 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2015 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;
CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)
The query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
FROM DEDUP_TEST
ORDER
BY ID, START_DATE ) ) )
Gives:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
From this the sequentially duplicated rows with the attributes a, b, c will become:
2 A B C 01-OCT-03 05-OCT-08
2 A B C 02-OCT-11 06-OCT-14
Thanks.
May 21, 2004 - 10:50 am UTC
define sequentially.
1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....
ordered by ID is the same (exact same) as:
1 A1 B1 C1 ....
1 A1 B1 C1 ....
1 A2 B2 C2 ....
and
1 A2 B2 C2 ....
1 A1 B1 C1 ....
1 A1 B1 C1 ....
and in fact, two runs of your query could return different answers given the SAME exact data. To handle that, you must have something more to sort by.
Typo in previous post
Dave Thompson, May 21, 2004 - 5:56 am UTC
Tom,
The final output should be:
From this the sequentially duplicated rows with the attributes a, b, c will
become:
2 A B B 01-OCT-03 05-OCT-08
2 A B B 02-OCT-11 06-OCT-14
Thanks.
Order
Dave Thompson, May 21, 2004 - 10:57 am UTC
Hi Tom,
The order of the dataset should be on the ID and Start Date.
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
Thanks.
May 21, 2004 - 11:42 am UTC
Ok, your example doesn't do that -- it is "non-deterministic", given the same data, it could/would return two different answers at different times during the day!
so, i think you want one of these:
ops$tkyte@ORA9IR2> select *
2 from (
3 select id, a,b,c, start_date, end_date,
4 case when (a = lag(a) over (order by id, start_date desc) and
5 b = lag(b) over (order by id, start_date desc) and
6 c = lag(c) over (order by id, start_date desc) )
7 then row_number() over (order by id, start_date)
8 end rn
9 from v
10 )
11 where rn is null
12 /
ID A B C START_DAT END_DATE RN
---------- ---------- ---------- ---------- --------- --------- ----------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 02-OCT-07 05-OCT-08
2 a f f 06-FEB-09 07-FEB-10
2 A B B 04-OCT-13 06-OCT-14
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
ops$tkyte@ORA9IR2> select *
2 from (
3 select id, a,b,c, start_date, end_date,
4 case when (a = lag(a) over (order by id, start_date) and
5 b = lag(b) over (order by id, start_date) and
6 c = lag(c) over (order by id, start_date) )
7 then row_number() over (order by id, start_date)
8 end rn
9 from v
10 )
11 where rn is null
12 /
ID A B C START_DAT END_DATE RN
---------- ---------- ---------- ---------- --------- --------- ----------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 01-OCT-03 01-OCT-04
2 a f f 06-FEB-09 07-FEB-10
2 A B B 02-OCT-11 03-OCT-12
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
we just need to mark records whose preceding record is the "same" after sorting -- then nuke them.
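The mark-and-remove idea translates directly to a single ordered pass in plain Python (a sketch over simplified stand-in tuples, not the real table):

```python
# (id, a, b, c) rows, already ordered by id, start_date
rows = [(1, "A", "B", "C"),
        (2, "A", "B", "B"),
        (2, "A", "B", "B"),
        (2, "a", "f", "f"),
        (2, "A", "B", "B")]

keep = []
prev = None
for r in rows:
    # lag(): compare the attribute columns (not id) with the preceding row,
    # just as the query's CASE does -- note 'a' != 'A', comparisons are case-sensitive
    if prev is None or r[1:] != prev[1:]:
        keep.append(r)        # "rn is null" -> not a sequential duplicate
    prev = r
```

Only the first row of each consecutive run is kept; a value that reappears later (after a different row) starts a new run and is kept again.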
More Info
Dave Thompson, May 21, 2004 - 12:25 pm UTC
Hi Tom,
Thanks for the prompt reply.
I re-wrote the base query:
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
ROWID ROWID_R,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_C
FROM DEDUP_TEST
ORDER
BY ID, START_DATE ) ) )
And got:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
Looking at the column LETSEE, I want to add a unique identifier to each row, treating each run of duplicated rows as one.
For example:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET DUP_ID
---------- ---------- ---------- ---------- --------- --------- --- ------
1 A B C 01-OCT-99 01-OCT-00 NOT 1
1 D B C 01-OCT-01 01-OCT-02 NOT 2
1 A B C 01-OCT-02 01-OCT-03 NOT 3
2 A B B 01-OCT-03 01-OCT-04 DUP 4
2 A B B 01-OCT-05 01-OCT-06 DUP 4
2 A B B 02-OCT-07 05-OCT-08 DUP 4
2 a f f 06-FEB-09 07-FEB-10 NOT 5
2 A B B 02-OCT-11 03-OCT-12 DUP 6
2 A B B 04-OCT-13 06-OCT-14 DUP 6
3 A F F 10-FEB-14 20-FEB-15 NOT 7
Then I could use the Dup_Id to partition on to do the analysis I need.
Any idea?
Have a nice weekend.
Thanks.
May 21, 2004 - 1:59 pm UTC
the above query doesn't work?
Hi Again
Dave Thompson, May 21, 2004 - 2:05 pm UTC
Hi Tom,
The above didn't work.
From the source query:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
I want to output the following resultset:
ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT
On the resultset from your queries the start and end dates were incorrect.
Where duplicate rows occur one after another, we need to take the start_date of the first row and the end_date of the last row in that block.
So for the following:
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
You would get
2 A B B 01-OCT-03 05-OCT-08 DUP
Does this make sense?
Thanks again for your input on this.
May 21, 2004 - 2:19 pm UTC
ops$tkyte@ORA9IR2> select id, a,b,c, min(start_date) start_date, max(end_date) end_date
2 from (
3 select id, a,b,c, start_date, end_date,
4 max(grp) over (order by id, start_date desc) grp
5 from (
6 select id, a,b,c, start_date, end_date,
7 case when (a <> lag(a) over (order by id, start_date desc) or
8 b <> lag(b) over (order by id, start_date desc) or
9 c <> lag(c) over (order by id, start_date desc) )
10 then row_number() over (order by id, start_date desc)
11 end grp
12 from v
13 )
14 )
15 group by id, a,b,c,grp
16 order by 1, 5
17 /
ID A B C START_DAT END_DATE
---------- ---------- ---------- ---------- --------- ---------
1 A B C 01-OCT-99 01-OCT-00
1 D B C 01-OCT-01 01-OCT-02
1 A B C 01-OCT-02 01-OCT-03
2 A B B 01-OCT-03 05-OCT-08
2 a f f 06-FEB-09 07-FEB-10
2 A B B 02-OCT-11 06-OCT-14
3 A F F 10-FEB-14 20-FEB-15
7 rows selected.
One of my (current) favorite analytic tricks -- the old "carry forward". We mark rows whose preceding row was different -- subsequent dup rows have NULLs there for grp.
Then, we use max(grp) to "carry" that number down....
Now we have something to group by -- we've divided the rows up into groups we can deal with.
(note: if a,b,c allow NULLs, we'll need to accommodate that!)
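The carry-forward trick is easy to follow in plain Python too; this sketch scans ascending (the last non-None mark carried forward gives the same grouping as the query's max(grp) over a descending order) and uses hypothetical ISO-date strings so string comparison orders correctly:

```python
# (id, a, b, c, start, end), already ordered by id, start -- stand-in values
rows = [(2, "A", "B", "B", "2003-10-01", "2004-10-01"),
        (2, "A", "B", "B", "2005-10-01", "2006-10-01"),
        (2, "A", "B", "B", "2007-10-02", "2008-10-05"),
        (2, "a", "f", "f", "2009-02-06", "2010-02-07"),
        (2, "A", "B", "B", "2011-10-02", "2012-10-03")]

# Step 1: mark each row that starts a new run with its row number (the analytic
# CASE); step 2: carry the last mark forward over the run (the max(grp) trick).
groups = {}
grp = None
prev = None
for n, r in enumerate(rows, 1):
    if prev is None or r[:4] != prev[:4]:
        grp = n
    groups.setdefault((r[0], r[1], r[2], r[3], grp), []).append(r)
    prev = r

# Step 3: group by (id, a, b, c, grp), taking min(start) and max(end).
result = [(k[0], k[1], k[2], k[3], min(x[4] for x in v), max(x[5] for x in v))
          for k, v in groups.items()]
```

The three consecutive A/B/B rows collapse into one row spanning 2003-10-01 to 2008-10-05, while the later A/B/B run (after the a/f/f row) stays a separate group -- exactly the output above.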
Great Stuff
Dave Thompson, May 21, 2004 - 5:02 pm UTC
Tom,
Thanks very much for that.
I'll go over it in more detail when I'm in the Office Monday but it looks great from here.
Enjoy the weekend.
Excellent
Dave Thompson, June 02, 2004 - 4:53 am UTC
Hi Tom,
This solution was spot on.
Thanks.
Any more thoughts on an Analytics book?
Stalin, June 09, 2004 - 6:03 pm UTC
hi tom,
wondering what the sql below would look like if the lead or partition analytic functions didn't exist. is pl/sql the only option?
snippet from the "lead/lag on different dataset" thread (it has the create and insert stmts)
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 )
minutes
2 from (
3 select log_id,
4 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
5 then lead(log_id) over (partition by account_id, user_id, mac
order by log_creation_date)
6 end signoff_id,
7 user_id,
8 log_creation_date signon_date,
9 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
10 then lead(log_creation_date) over (partition by account_id,
user_id, mac order by log_creation_date)
11 end signoff_date,
12 action
13 from logs
14 where account_id = 'Robert'
15 and service = 5
16 order by user_id
17 ) a
18 where action in (1,2)
19 /
Thanks,
Stalin
June 09, 2004 - 6:27 pm UTC
you could use a non-equi self join to achieve the same. Many orders of magnitude slower.
scalar subqueries could be used as well -- with the same "slower" caveat.
Is this solvable with ANALYTICS too?
Peter Tran, June 10, 2004 - 12:14 am UTC
Hi Tom,
Can the following problem be solved using Analytics?
I have a 10-column table where 9 of the fields are dimensions and one is an attribute. I would like to get a report of the D1/D2 combinations where ATTR1 is 1 for every combination of the other dimensions. Furthermore, the PK consists of all the dimension columns.
The column names below aren't the real ones, but I didn't want to make the example table too wide for illustrative purposes.
D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA AA AA AA AA AA AA AA AA 1
AA AA BB AA AA AA AA AA AA 1
AA AA AA CC AA AA AA AA AA 1
AA AA AA AA DD AA AA AA AA 1
AA AA AA AA EE AA AA AA AA 1
AA BB AA AA AA AA AA GG AA 1
AA BB AA AA AA AA AA AA AA 1
AA BB CC AA AA AA AA AA AA 0
AA BB AA DD AA AA AA AA AA 1
EE DD JJ LL MM NN OO PP QQ 1
EE DD TT LL MM NN OO PP QQ 1
I want the query to return:
D1 D2
--------
AA AA
EE DD
It would not return AA/BB, because of the record:
D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA BB CC AA AA AA AA AA AA 0
Thanks,
-Peter
June 10, 2004 - 7:43 am UTC
yes they can, but they are not needed. regular aggregates do the job. I'd give you the real query if I had a create table/inserts to demo against. this is "pseudo code", might or might not actually work:
select d1, d2
from t
group by d1, d2
having count(distinct attribute) = 1
Michael T., June 10, 2004 - 9:01 am UTC
Peter,
I think the following may give you what you want.
SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) > 0;
Tom's pseudo code will work except for the case when all D1/D2 combinations have the same ATTR1 value, but that value is not 1.
June 10, 2004 - 9:45 am UTC
ahh, good eye -- i was thinking "all attribute values are the same"
but yours doesn't do it, this will
having count( decode( attr1, 1, 1 ) ) = count(*)
count(decode(attr1,1,1)) will return a count of non-null occurrences (all of the 1's)
count(*) returns a count of all records
output when count(decode) = count(*)
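The predicate boils down to "keep a (d1, d2) group only when every row in it has attr1 = 1". A plain-Python restatement of that check, with invented sample data:

```python
from collections import defaultdict

# Group rows by (d1, d2) and keep only groups where *every* row has
# attr1 = 1 -- the same test HAVING count(decode(attr1,1,1)) = count(*)
# expresses in SQL.
data = [
    ('AA', 'AA', 1),
    ('AA', 'AA', 1),
    ('AA', 'BB', 1),
    ('AA', 'BB', 0),   # one bad row disqualifies the whole AA/BB group
    ('EE', 'DD', 1),
]
groups = defaultdict(list)
for d1, d2, attr1 in data:
    groups[(d1, d2)].append(attr1)

keep = sorted(k for k, attrs in groups.items()
              if all(a == 1 for a in attrs))
```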
Thank you!
Peter Tran, June 10, 2004 - 10:37 am UTC
Hi Tom/Michael T.,
Thank you. It's so much clearer now.
-Peter
Michael T., June 10, 2004 - 10:46 am UTC
I did screw up in my previous response. The query I submitted gives the entirely wrong answer. It should have been
SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) = 0
Even though, incorrectly, I wasn't originally considering null values for ATTR1, the above query seems to produce the correct answer even if ATTR1 is NULL. The DECODE will evaluate a null ATTR1 entry to 1.
Tom, many thanks for this site. I have learned so much from it. It is a daily must read for me.
You said a book on analytics?
Jeff, June 10, 2004 - 12:30 pm UTC
A book by you on analytics would be a best seller I think.
Go for it.
quick analytic question
A reader, June 16, 2004 - 5:03 pm UTC
schema creation---
---
scott@ora92> drop table t1;
Table dropped.
scott@ora92> create table t1
2 (
3 x varchar2(10),
4 y number
5 );
Table created.
scott@ora92>
scott@ora92> insert into t1 values( 'x1', 1 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 2 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 4 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 0 );
1 row created.
scott@ora92> commit;
Commit complete.
scott@ora92> select x, y, min(y) over() min_y
2 from t1;
X Y MIN_Y
---------- ---------- ----------
x1 1 0
x1 2 0
x1 4 0
x1 0 0
scott@ora92> spool off
---
how do i get the minimum of y for all values
that is greater than 0 (if one exists). In the above case
I should get the result as
X Y MIN_Y
---------- ---------- ----------
x1 1 1
x1 2 1
x1 4 1
x1 0 1
Thanx for your excellent site and brilliant work!
June 16, 2004 - 6:09 pm UTC
min( case when y > 0 then y end ) over ()
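The CASE yields NULL for y <= 0, and aggregates ignore NULLs, so the window MIN only ever sees the positive values. The same trick runs unchanged in SQLite (window functions need SQLite 3.25+, bundled with recent Pythons) -- a small sketch mirroring the table above:

```python
import sqlite3

# Recreate the t1 example and apply the conditional window minimum.
con = sqlite3.connect(":memory:")
con.execute("create table t1 (x text, y integer)")
con.executemany("insert into t1 values (?, ?)",
                [('x1', 1), ('x1', 2), ('x1', 4), ('x1', 0)])
rows = con.execute("""
    select x, y,
           min(case when y > 0 then y end) over () as min_y
    from t1
""").fetchall()
```

Every row comes back with min_y = 1, the smallest y that is greater than 0.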
Great!!!
A reader, June 16, 2004 - 6:46 pm UTC
Thank you very much
Gj, July 02, 2004 - 9:16 am UTC
The Oracle docs are a little light on examples, but thank you for giving us the quick start to analytics. I can't say I understand the complex examples yet, but the simple stuff seems so easy to understand now. Can't wait until a real problem comes along that I can apply this feature to.
July 02, 2004 - 10:39 am UTC
How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?
Sergey, July 06, 2004 - 8:08 am UTC
Hi Tom,
I need to 'fill the gaps' with the values from the last existing row in a table that is outer joined to another table. The other table serves as a source of regular [time] intervals. The task seems conceptually very simple, so I looked in the Oracle docs (the Ora10g docs, as it happens) and pretty soon found exactly what I need: LAST_VALUE with IGNORE NULLS. Unfortunately, neither Ora8i nor Ora9i accepts IGNORE NULLS. Is there any way to mimic this feature with 'older' analytic functions?
I tried something like ORDER BY SIGN(NVL(VALUE, 0)) in the analytic ORDER BY clause, but it does not work (I do not have a clue why)
Thanks in advance
Here is the test:
DROP TABLE TD;
CREATE TABLE TD AS
(SELECT TRUNC(SYSDATE, 'DD') + ROWNUM T
FROM ALL_OBJECTS
WHERE ROWNUM <= 15
);
DROP TABLE TV;
CREATE TABLE TV AS
(SELECT
TRUNC(SYSDATE, 'DD') + ROWNUM * 3 T
,ROWNUM V
FROM ALL_OBJECTS
WHERE ROWNUM <= 5
);
SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V IGNORE NULLS) -- IGNORE NULLS does not work on Ora8i, Ora9i
OVER
(
ORDER BY TD.T
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
ERROR at line 6:
ORA-00907: missing right parenthesis
SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V)
OVER
(
ORDER BY SIGN(NVL(TV.V, 0)), TD.T -- Does not work
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
T SIGN(NVL(TV.V,0)) V
------------------- ------------------ ------------------
07.07.2004 00:00:00 0
08.07.2004 00:00:00 0
09.07.2004 00:00:00 1 1
10.07.2004 00:00:00 0
11.07.2004 00:00:00 0
12.07.2004 00:00:00 1 2
13.07.2004 00:00:00 0
14.07.2004 00:00:00 0
15.07.2004 00:00:00 1 3
16.07.2004 00:00:00 0
17.07.2004 00:00:00 0
18.07.2004 00:00:00 1 4
19.07.2004 00:00:00 0
20.07.2004 00:00:00 0
21.07.2004 00:00:00 1 5
July 06, 2004 - 8:26 am UTC
This is a trick I call "carry down"; we use analytics on analytics to accomplish this. We output "marker rows" with ROW_NUMBER() on the leading edge. Using MAX() in the outer query, we "carry down" these marker rows -- SUBSTR gets rid of the row_number for us:
ops$tkyte@ORA10G> select t,
2 sign_v,
3 v,
4 substr( max(data) over (order by t), 7 ) v2
5 from (
6 SELECT TD.T,
7 SIGN(NVL(TV.V, 0)) sign_v,
8 NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
9 case when tv.v is not null
10 then to_char( row_number()
over (order by td.t), 'fm000000' ) || tv.v
11 end data
12 FROM TD, TV
13 WHERE TV.T(+) = TD.T
14 )
15 ORDER BY T
16 ;
T SIGN_V V V2
--------- ---------- ---------- -----------------------------------------
07-JUL-04 0
08-JUL-04 0
09-JUL-04 1 1 1
10-JUL-04 0 1 1
11-JUL-04 0 1 1
12-JUL-04 1 2 2
13-JUL-04 0 2 2
14-JUL-04 0 2 2
15-JUL-04 1 3 3
16-JUL-04 0 3 3
17-JUL-04 0 3 3
18-JUL-04 1 4 4
19-JUL-04 0 4 4
20-JUL-04 0 4 4
21-JUL-04 1 5 5
15 rows selected.
So, in 9ir2 this would simply be:
ops$tkyte@ORA9IR2> select t,
2 sign_v,
3 substr( max(data) over (order by t), 7 ) v2
4 from (
5 SELECT TD.T,
6 SIGN(NVL(TV.V, 0)) sign_v,
7 case when tv.v is not null
8 then to_char( row_number() over (order by td.t), 'fm000000' ) || tv.v
9 end data
10 FROM TD, TV
11 WHERE TV.T(+) = TD.T
12 )
13 ORDER BY T
14 ;
T SIGN_V V2
--------- ---------- -----------------------------------------
07-JUL-04 0
08-JUL-04 0
09-JUL-04 1 1
10-JUL-04 0 1
11-JUL-04 0 1
12-JUL-04 1 2
13-JUL-04 0 2
14-JUL-04 0 2
15-JUL-04 1 3
16-JUL-04 0 3
17-JUL-04 0 3
18-JUL-04 1 4
19-JUL-04 0 4
20-JUL-04 0 4
21-JUL-04 1 5
15 rows selected.
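Stripped of the SQL machinery, what LAST_VALUE(... IGNORE NULLS) does -- and what the row_number/substr/max trick emulates on 8i/9i -- is just a running memory of the last non-NULL value. A minimal Python sketch:

```python
# Fill each None with the most recent non-None value seen while scanning
# in order; leading Nones stay None, exactly like the 07/08-JUL rows above.
def carry_down(values):
    last = None
    out = []
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out
```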
Doesn't work with PL/SQL ????????
A reader, July 20, 2004 - 9:31 am UTC
Dear Tom
Are analytics fully compatible with PL/SQL?
Please see
SQL> ed
Wrote file afiedt.buf
1 select empno,deptno,
2 count(empno) over (partition by deptno order by empno
3 rows between unbounded preceding and current row) run_count
4* from emp
SQL> /
EMPNO DEPTNO RUN_COUNT
---------- ---------- ----------
7782 10 1
7839 10 2
7934 10 3
7369 20 1
7566 20 2
7788 20 3
7876 20 4
7902 20 5
7499 30 1
7521 30 2
7654 30 3
EMPNO DEPTNO RUN_COUNT
---------- ---------- ----------
7698 30 4
7844 30 5
7900 30 6
14 rows selected.
SQL>
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is
3 select empno,deptno,
4 count(empno) over (partition by deptno order by empno
5 rows between unbounded preceding and current row) run_count
6 from emp;
7 begin
8 for rec in c1 loop
9 null;
10 end loop;
11* end;
SQL> /
end;
*
ERROR at line 11:
ORA-06550: line 5, column 72:
PL/SQL: ORA-00905: missing keyword
ORA-06550: line 3, column 1:
PL/SQL: SQL Statement ignored
SQL>
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for 32-bit Windows: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
SQL>
July 20, 2004 - 8:08 pm UTC
You can contact support and reference <Bug:3083373>, but the workaround would be to use native dynamic sql or a view to "hide" this construct.
the problem turns out to be the word "current", which has meaning in PL/SQL.
Effect of distinct on lag
John Murphy, July 29, 2004 - 1:48 pm UTC
I am trying to use analytics to find accounts with receipts in 3 consecutive years. The analytic code seems to work, however, when I add DISTINCT (to find each account once), I get strange results. This is on 9.2.0.1.0.
create table jcm_test(acct_id number(10), rcpt_date date);
insert into jcm_test
values (1 , to_date('01-JAN-2000', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2001', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2003', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('02-JAN-2001', 'dd-mon-yyyy'));
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2);
ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2000 1
1 2001 2
1 2003 3 2000 1 3 2
select * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;
no rows selected
select distinct * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;
ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2001 2 2000 1 1 1
1 2003 4 2001 2 2 2
In your book, you say that because analytics are performed last, you must push them into an inline view. However, that doesn't seem to do the trick here. Thanks, john
July 29, 2004 - 2:18 pm UTC
what release -- i don't see what you see.
Distinct effect release
John Murphy, July 29, 2004 - 3:12 pm UTC
Tom, we are using the following.
Oracle9i Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
I tried searching Metalink, but couldn't find any bugs.
July 29, 2004 - 4:03 pm UTC
i found one, not published, that was resolved as of 9.2.0.2 -- at least it did not reproduce there, so they did not pursue it further.
Distinct effect release
John Murphy, July 29, 2004 - 4:01 pm UTC
Actually, I suspect that this may be related to bug 2258035. Do you agree? Thanks, john
July 29, 2004 - 4:18 pm UTC
yes, i can confirm that in 9205, it is not happening that way.
how to write this query
Teddy, July 30, 2004 - 6:33 am UTC
Hi
using the original poster's example:
ORDER OPN STATION CLOSE_DATE
----- --- ------- ----------
12345 10 RECV 07/01/2003
12345 20 MACH1 07/02/2003
12345 25 MACH1 07/05/2003
12345 30 MACH1 07/11/2003
12345 36 INSP1 07/12/2003
12345 50 MACH1 08/16/2003
12346 90 MACH2 07/30/2003
12346 990 STOCK 07/31/2003
How do you write a query to determine that an order has passed a manufacturing operation across several months?
In the above example,
12345 has rows in July and August but 12346 has rows in July only. How can we write a query to find orders such as 12345?
July 30, 2004 - 4:40 pm UTC
select order#, min(close_date), max(close_date)
from t
group by order#
having months_between( max(close_date), min(close_date) ) > your_threshold;
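The test can also be stated without date arithmetic: an order "spans several months" exactly when its close dates fall in more than one calendar month. A plain-Python sketch using the thread's sample data:

```python
from collections import defaultdict

# Map each order to the set of distinct YYYY-MM months its rows close in;
# orders with more than one month are the ones we want.
rows = [
    ('12345', '2003-07-01'), ('12345', '2003-07-02'),
    ('12345', '2003-08-16'),
    ('12346', '2003-07-30'), ('12346', '2003-07-31'),
]
months = defaultdict(set)
for order, close_date in rows:
    months[order].add(close_date[:7])        # keep the YYYY-MM part

spanning = sorted(o for o, m in months.items() if len(m) > 1)
```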
Finding pairs in result set
PJ, August 11, 2004 - 10:05 am UTC
Tom,
CREATE TABLE A
(
N NUMBER,
C CHAR(1),
V VARCHAR2(20)
)
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '1st e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '2nd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '3rd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '1st w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '2nd w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'e', '1st e of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '1st w of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '2nd w of 2nd N');
commit;
SO the data I've is
select * from a;
-------------------------
N C V
1 e 1st e of 1st N
1 e 2nd e of 1st N
1 e 3rd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N
2 w 2nd w of 2nd N
---------------------------------------
And the output I'm looking for is
1 e 1st e of 1st N
1 e 2nd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N
So basically I need the first pairs of (e-w/w-e) for each N.
I hope I'm clear here.
Thanks as usual in advance,
August 11, 2004 - 12:40 pm UTC
do you have a field that can be "sorted on" for finding "1st, 2nd" and so on?
If not, there is no such thing as "first" or "third"
PJ, August 11, 2004 - 12:58 pm UTC
Tom,
Sorry if I was not clear.
we need to pick pairs for N. Like we have 5 rows with N=1. so we have to pick 4 rows leaving 1 UNPAIRED "e" out.
We want the data in the same order as it is in table. We can sort it by --> order by N,C
August 11, 2004 - 1:58 pm UTC
ops$tkyte@ORA920> select n, c, rn, cnt2
2 from (
3 select n, c, rn,
4 min(cnt) over (partition by n) cnt2
5 from (
6 select n, c,
7 row_number() over (partition by n, c order by c) rn,
8 count(*) over (partition by n, c) cnt
9 from a
10 )
11 )
12 where rn <= cnt2
13 /
N C RN CNT2
---------- - ---------- ----------
1 e 1 2
1 e 2 2
1 w 1 2
1 w 2 2
2 e 1 1
2 w 1 1
6 rows selected.
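The row_number/count logic boils down to: per n, each letter contributes exactly min(count of 'e', count of 'w') rows -- that is the `rn <= cnt2` filter. A plain-Python restatement with the same sample data:

```python
from collections import Counter, defaultdict

# For each n, the number of complete e/w pairs is the smaller of the two
# per-letter counts; each letter then contributes exactly that many rows.
rows = [(1, 'e'), (1, 'e'), (1, 'e'), (1, 'w'), (1, 'w'),
        (2, 'e'), (2, 'w'), (2, 'w')]

cnt = Counter(rows)                          # (n, c) -> occurrences
per_n = defaultdict(dict)
for (n, c), k in cnt.items():
    per_n[n][c] = k

paired = []
for n in sorted(per_n):
    keep = min(per_n[n].values())            # cnt2: the smaller count
    for c in sorted(per_n[n]):
        paired.extend([(n, c)] * keep)
```

This leaves out the unpaired third 'e' for n = 1 and the second 'w' for n = 2, matching the 6-row result above.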
Brilliant as usual !!
A reader, August 11, 2004 - 2:04 pm UTC
PJ's query
Kevin, August 11, 2004 - 2:04 pm UTC
PJ - you can drop the column 'v' from your table, and just use this query (which I think will answer your question using N and C alone, and generate an appropriate 'v' as it runs).
CREATE TABLE b
(
N NUMBER,
C CHAR(1)
)
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
INSERT INTO b ( N, C ) VALUES ( 2, 'e');
INSERT INTO b ( N, C ) VALUES ( 2, 'w');
INSERT INTO b ( N, C ) VALUES ( 2, 'w');
COMMIT;
SELECT n,c,v1
FROM (
SELECT lag (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1) c3,
lead (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1)c4,
c1 ||
CASE WHEN c1 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c1,10),1,'st',2,'nd',3,'rd','th')
END || ' ' || c || ' of ' || c2 ||
CASE WHEN c2 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c2,10),1,'st',2,'nd',3,'rd','th')
END || ' N' v1,
t1.*
FROM (
SELECT b.*,
row_number() OVER (PARTITION BY n, c ORDER BY n,c) c1,
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
FROM b
) t1
) t2
WHERE c3 IS NOT NULL OR c4 IS NOT NULL
/
Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N
INSERT INTO b ( N, C ) VALUES ( 1, 'w');
COMMIT;
Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
1 e 3rd e of 1st N
1 w 3rd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N
oops
Kevin, August 11, 2004 - 2:12 pm UTC
replace
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
with
DENSE_RANK() OVER (PARTITION BY c ORDER BY c) c2
my bad.
A reader, August 11, 2004 - 3:27 pm UTC
Your bad what?
toe? leg?
Cool....
PJ, August 12, 2004 - 7:25 am UTC
analytic q
A reader, October 22, 2004 - 6:34 pm UTC
First the schema:
scott@ORA92I> drop table t1;
Table dropped.
scott@ORA92I> create table t1( catg1 varchar2(10), catg2 varchar2(10), total number );
Table created.
scott@ORA92I>
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 5 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 6 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 9 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V2', 'T2', 10 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V3', 'T1', 11 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V4', 'T1', 1 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V5', 'T2', 2 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V6', 'T2', 3 );
1 row created.
The catg2 column can only take two values, 'T1' and 'T2'.
I want to sum the total column for each catg1/catg2 combination
and order the combinations by that sum. Then
I want to list the top 3 catg1/catg2 combinations
based on their summed totals.
If there are more than 3 such combinations then I
club the remaining ones into a catg1 value of 'Others'.
my first cut solution is:
scott@ORA92I> select catg1, catg2, sum( total_sum )
2 from
3 (
4 select case
5 when dr > 3 then
6 'Others'
7 when dr <= 3 then
8 catg1
9 end catg1,
10 catg2,
11 total_sum
12 from
13 (
14 select catg1, catg2, total_sum,
15 dense_rank() over( order by total_sum desc) dr
16 from
17 (
18 select catg1, catg2, sum( total ) total_sum
19 from t1
20 group by catg1, catg2
21 )
22 )
23 )
24 group by catg1, catg2;
CATG1 CATG2 SUM(TOTAL_SUM)
---------- ---------- --------------
V1 T1 20
V2 T2 10
V3 T1 11
Others T1 1
Others T2 5
Does it look ok or do you have any better solution?
Thank you as always.
October 23, 2004 - 9:36 am UTC
you could skip a layer of inline view, but it looks fine as is.
thanx!
A reader, October 24, 2004 - 12:37 pm UTC
SQL query
Reader, November 03, 2004 - 1:45 pm UTC
I have a table which stores receipts against Purchase Orders. The users want the following o/p:
For each of the months of Jan, Feb and March 2004, provide a count of number of receipts which fall in each of the following Dollar value range
< $5000
Between $5000 to $9999
> $10,000
(There can be a number of receipts against one Purchase Order, so those need to be grouped together first)
I wrote this query using an inline view which is the UNION of 3 SQLs, one for each dollar range.
However, I am sure there is a more elegant and efficient way to do this, maybe using analytic functions, CASE, or DECODE.... Appreciate your help.
Thanks
November 05, 2004 - 10:49 am UTC
select trunc(date_col,'mm') Month,
count( case when amt < 5000 then 1 end ) "lt 5000",
count( case when amt between 5000 and 9999 then 1 end ) "between 5/9k",
count( case when amt >= 10000 then 1 end ) "10k or more"
from t
where date_col between :a and :b
group by trunc(date_col,'mm')
single pass....
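The shape of that query -- one pass over the rows, one conditional COUNT per bucket, grouped by month -- looks like this in plain Python (sample receipts are invented):

```python
from collections import defaultdict

# One pass over the receipts, incrementing the right bucket for each row,
# grouped by month -- the same work the single-pass CASE query does.
receipts = [('2004-01', 4200), ('2004-01', 7500), ('2004-01', 15000),
            ('2004-02', 3000), ('2004-03', 9999)]

buckets = defaultdict(lambda: {'lt_5000': 0, '5000_9999': 0, 'ge_10000': 0})
for month, amt in receipts:
    if amt < 5000:
        buckets[month]['lt_5000'] += 1
    elif amt <= 9999:
        buckets[month]['5000_9999'] += 1
    else:
        buckets[month]['ge_10000'] += 1
```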
Great -
syed, November 10, 2004 - 7:09 am UTC
Tom
I have a tables as follows
create table matches
( reference varchar2(9),
endname varchar2(20),
beginname varchar2(30),
DOB date,
ni varchar2(9)
)
/
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A1','SMITH','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A2','JONES','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A3','JONES','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A4','BROWN','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234570');
insert into matches values ('A4','JONES','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234571');
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A3','JACKSON','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A2','JACKSON','BOB',to_date('1/1/1962','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A5','JACKSON','TOM',to_date('1/1/1920','dd/mm/yyyy'),'AA1234569');
commit;
SQL> select rownum,REFERENCE,ENDNAME,BEGINNAME,DOB,NI from matches;
ROWNUM REFERENCE ENDNAME BEGINNAME DOB NI
------- --------- -------- ---------- --------- ---------
1 A1 SMITH BOB 01-JAN-76 AA1234567
2 A1 SMITH TOM 01-JAN-70 AA1234568
3 A2 JONES TOM 01-JAN-70 AA1234568
4 A3 JONES TOM 01-JAN-71 AA1234569
5 A4 BROWN BRAD 01-JAN-61 AA1234570
6 A4 JONES BRAD 01-JAN-61 AA1234571
7 A1 SMITH BOB 01-JAN-76 AA1234567
8 A3 JACKSON TOM 01-JAN-71 AA1234569
9 A2 JACKSON BOB 01-JAN-62 AA1234568
10 A5 JACKSON TOM 01-JAN-20 AA1234569
I need to show duplicates where the following column values are the same.
a) REFERENCE, ENDNAME,BEGINNAME,DOB,NI
b) ENDNAME,BEGINNAME,NI
c) REFERENCE,NI
So,
rownum 1 and 7 match criteria a)
rownum 8 and 10 match criteria b)
rownum 1 and 7, rownum 3 and 9, rownum 4 and 8 match criteria c)
How can I select this data out to show number matching each criteria ?
November 10, 2004 - 7:23 am UTC
"How can I select this data out to show number matching each criteria ?"
is ambiguous.
If you add columns:
count(*) over (partition by reference, endname, beginname, dob, ni ) cnt1,
count(*) over (partition by endname, beginname, ni) cnt2,
count(*) over (partition by reference,ni) cnt3
it'll give you the "dup count" by each partition -- technically showing you the "number matching each criteria"
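Each count(*) over (partition by ...) column is simply "how many rows share this row's key"; a Counter per key set gives the same annotation. A sketch using a few of the rows above:

```python
from collections import Counter

# Annotate each row with the size of its group under each match criterion,
# the same numbers the three COUNT(*) OVER columns produce.
rows = [
    ('A1', 'SMITH', 'BOB', '1976-01-01', 'AA1234567'),
    ('A1', 'SMITH', 'TOM', '1970-01-01', 'AA1234568'),
    ('A1', 'SMITH', 'BOB', '1976-01-01', 'AA1234567'),   # dup of row 1
]

cnt_a = Counter(rows)                                  # all five columns
cnt_b = Counter((r[1], r[2], r[4]) for r in rows)      # endname, beginname, ni
cnt_c = Counter((r[0], r[4]) for r in rows)            # reference, ni

annotated = [r + (cnt_a[r], cnt_b[(r[1], r[2], r[4])], cnt_c[(r[0], r[4])])
             for r in rows]
```

Rows with a count above 1 in any of the three trailing columns are duplicates under that criterion.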
analytics problem
David, November 19, 2004 - 9:37 am UTC
I am newish to analytic functions and have hit a problem, as follows:-
create table a
(accno number(8) not null,
total_paid number(7,2) not null)
/
create table b
(accno number(8) not null,
due_date date not null,
amount_due number(7,2) not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into b values (1, '01-oct-04', 1000);
insert into b values (1, '01-jan-05', 900);
insert into b values (1, '01-apr-05', 700);
insert into b values (2, '01-oct-04', 1000);
insert into b values (2, '01-jan-05', 900);
insert into b values (2, '01-apr-05', 700);
insert into b values (3, '01-oct-04', 1000);
insert into b values (3, '01-jan-05', 900);
insert into b values (3, '01-apr-05', 700);
insert into b values (4, '01-oct-04', 1000);
insert into b values (4, '01-jan-05', 900);
insert into b values (4, '01-apr-05', 700);
If I then do this query...
SQL> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 order by a.accno,
21 to_date(b.due_date, 'dd-mon-rr')
22 /
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 1000
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
2 1500 01-OCT-04 1000 1000
2 1500 01-JAN-05 900 900
2 1500 01-APR-05 700 700
3 2000 01-OCT-04 1000 1000
3 2000 01-JAN-05 900 900
3 2000 01-APR-05 700 700
4 3000 01-OCT-04 1000 1000
4 3000 01-JAN-05 900 900
4 3000 01-APR-05 700 700
12 rows selected.
...TO_PAY does not give what I was expecting. But if I run it for an individual accno, I get what I'm after:-
SQL> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 and a.accno = &accno
21 order by a.accno,
22 to_date(b.due_date, 'dd-mon-rr')
23 /
Enter value for accno: 1
old 20: and a.accno = &accno
new 20: and a.accno = 1
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 0
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
3 rows selected.
SQL> /
Enter value for accno: 2
old 20: and a.accno = &accno
new 20: and a.accno = 2
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
2 1500 01-OCT-04 1000 0
2 1500 01-JAN-05 900 400
2 1500 01-APR-05 700 700
3 rows selected.
SQL> /
Enter value for accno: 3
old 20: and a.accno = &accno
new 20: and a.accno = 3
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
3 2000 01-OCT-04 1000 0
3 2000 01-JAN-05 900 0
3 2000 01-APR-05 700 600
3 rows selected.
SQL> /
Enter value for accno: 4
old 20: and a.accno = &accno
new 20: and a.accno = 4
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
4 3000 01-OCT-04 1000 0
4 3000 01-JAN-05 900 0
4 3000 01-APR-05 700 0
3 rows selected.
What is needed for first query above to work?
cheers,
David
November 19, 2004 - 11:31 am UTC
ops$tkyte@ORA9IR2> select a.accno,
2 a.total_paid,
3 b.due_date,
4 b.amount_due,
5 case
6 when sum(b.amount_due)
7 over (<b>partition by a.accno</b> order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
8 then 0
9 when sum(b.amount_due)
10 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
11 then sum(b.amount_due)
12 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
13 when sum(b.amount_due)
14 over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
15 and a.total_paid >= 0
16 then b.amount_due
17 end to_pay
18 from a,b
19 where a.accno = b.accno
20 order by a.accno,
21 to_date(b.due_date, 'dd-mon-rr')
22 /
ACCNO TOTAL_PAID DUE_DATE AMOUNT_DUE TO_PAY
---------- ---------- --------- ---------- ----------
1 1000 01-OCT-04 1000 0
1 1000 01-JAN-05 900 900
1 1000 01-APR-05 700 700
2 1500 01-OCT-04 1000 0
2 1500 01-JAN-05 900 400
2 1500 01-APR-05 700 700
3 2000 01-OCT-04 1000 0
3 2000 01-JAN-05 900 0
3 2000 01-APR-05 700 600
4 3000 01-OCT-04 1000 0
4 3000 01-JAN-05 900 0
4 3000 01-APR-05 700 0
12 rows selected.
excellent
David, November 19, 2004 - 12:02 pm UTC
many thanks
Limitation of Analytic Functions
Nilanjan Ray, December 16, 2004 - 4:27 am UTC
I am using the following view
create or replace view vw_history as
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history
The following SQL statement invariably does a full table scan on 112,861,91 rows of ADC_HISTORY and runs for 20-25 mins.
select *
from vw_history
where txm_dt between to_date('01/01/2002','dd/mm/yyyy') and to_date('01/01/2002','dd/mm/yyyy');
The query return 4200 rows. ADC_HISTORY has 112,861,91 rows. I have the following indexes : ADC_HISTORY_IDX8 on txm_dt and ADC_HISTORY_IDX1 on spot_key columns. Both have good selectivities.
But when the required query is ran without the view it properly uses the index ADC_HISTORY_IDX8
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history
I had raised a TAR and it says: this is the expected behaviour -- "PREDICATES ARE NOT PUSHED IN THE VIEW IF ANY ANALYTIC FUNCTIONS ARE USED"
Is there any way to work around this limitation? I just cannot face the painful prospect of being unable to use views with analytics!!!!
Your help is absolutely necessary. Thanks in advance
December 16, 2004 - 8:27 am UTC
guess what -- your two queries <b>return different answers</b>..
did you consider that? did you check that?
they are TOTALLY DIFFERENT. Analytics are applied after predicates. The view -- it has no predicate. The query -- it has a predicate. You'll find that you have DIFFERENT result sets.
don't you see that as a problem?
It is not that you are "unable to use views"
It is that "when I use a view, I get answer 1, when I do not use a view, I get answer 2"
which answer is technically correct here?
Think about it.
consider this example (using RBO just to make it so that "if an index could be used it would" to stress the point):
ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
Table created.
ops$tkyte@ORA9IR2> create index job_idx on emp(job);
Index created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace view v
2 as
3 select ename, sal, job,
4 sum(sal) over (partition by job) sal_by_job,
5 sum(sal) over (partition by deptno) sal_by_deptno
6 from emp
7 /
View created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace on explain
ops$tkyte@ORA9IR2> select *
2 from v
3 where job = 'CLERK'
4 /
ENAME SAL JOB SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER 1300 CLERK 4150 8750
JAMES 950 CLERK 4150 9400
SMITH 800 CLERK 4150 10875
ADAMS 1100 CLERK 4150 10875
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 VIEW OF 'V'
2 1 WINDOW (SORT)
3 2 WINDOW (SORT)
4 3 TABLE ACCESS (FULL) OF 'EMP'
<b>so, one might ask "well -- hey, I've got that beautiful index on JOB, I said where job = 'CLERK', what's up with that full scan?"
in fact, when I do it "right" -- without the evil view:</b>
ops$tkyte@ORA9IR2> select ename, sal, job,
2 sum(sal) over (partition by job) sal_by_job,
3 sum(sal) over (partition by deptno) sal_by_deptno
4 from emp
5 where job = 'CLERK'
6 /
ENAME SAL JOB SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER 1300 CLERK 4150 1300
SMITH 800 CLERK 4150 1900
ADAMS 1100 CLERK 4150 1900
JAMES 950 CLERK 4150 950
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 WINDOW (SORT)
2 1 WINDOW (SORT)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
4 3 INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
<b>it very rapidly uses my index!!! stupid views...
but wait.
what's up with SAL_BY_DEPTNO? that appears to be wrong... hmmm, what happened?
What happened was we computed the sal_by_deptno in the query without the view AFTER doing "where job = 'CLERK'"
YOU are doing your LAG() analysis AFTER applying the predicate. The lags in your query without the view -- they are pretty much "not accurate"
Note that when the predicate CAN be pushed:</b>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select ename, sal, sal_by_job
2 from v
3 where job = 'CLERK'
4 /
ENAME SAL SAL_BY_JOB
---------- ---------- ----------
SMITH 800 4150
ADAMS 1100 4150
JAMES 950 4150
MILLER 1300 4150
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=RULE
1 0 VIEW OF 'V'
2 1 WINDOW (BUFFER)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
4 3 INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
<b>it most certainly is. here the predicate can safely be pushed -- since the analytic is computed "by job", a predicate on "job" can be applied FIRST and then the analytic computed.
When pushing would change the answer -- we cannot do it.
When pushing the predicate would not change the answer -- we do it.
This is not a 'limitation', this is about "getting the right answer"</b>
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> alter session set optimizer_mode = choose;
Session altered.
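The ordering issue Tom demonstrates (analytics are computed over whatever rows survive the predicate) can be sketched outside the database too. A minimal Python sketch with made-up EMP-like rows (not the exact table above), showing that filtering before versus after a partitioned sum yields different SAL_BY_DEPTNO values:

```python
from collections import defaultdict

rows = [
    {"ename": "SMITH", "sal": 800,  "job": "CLERK",   "deptno": 20},
    {"ename": "ADAMS", "sal": 1100, "job": "CLERK",   "deptno": 20},
    {"ename": "FORD",  "sal": 3000, "job": "ANALYST", "deptno": 20},
    {"ename": "JAMES", "sal": 950,  "job": "CLERK",   "deptno": 30},
]

def with_partitioned_sum(data, key):
    """Mimic SUM(sal) OVER (PARTITION BY key): attach the group total to every row."""
    totals = defaultdict(int)
    for r in data:
        totals[r[key]] += r["sal"]
    return [dict(r, sal_by_deptno=totals[r[key]]) for r in data]

# View semantics: the analytic runs over ALL rows, the predicate filters afterwards.
view_style = [r for r in with_partitioned_sum(rows, "deptno") if r["job"] == "CLERK"]

# Pushed-predicate semantics: filter first, then compute the analytic.
pushed_style = with_partitioned_sum(
    [r for r in rows if r["job"] == "CLERK"], "deptno")

smith_view = next(r for r in view_style if r["ename"] == "SMITH")
smith_pushed = next(r for r in pushed_style if r["ename"] == "SMITH")
print(smith_view["sal_by_deptno"], smith_pushed["sal_by_deptno"])  # 4900 vs 1900
```

Same detail rows come back either way; only the partition total differs, which is exactly why the database refuses to push the predicate when it would change the answer.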
Great!!!
Nilanjan Ray, December 17, 2004 - 12:59 pm UTC
Simply amazing explanation. Cleared my doubts still further. One of the best explanations, in simple concise terms, I have seen on "Ask Tom". You know what, people should take enough caution and learn lessons from you before making misleading statements like "...LIMITATIONS...". In your terms yet again, "Analytics Rock".
Regards
Using analytical function, LEAD, LAG
Praveen, December 24, 2004 - 9:26 am UTC
Hi Tom,
The analytic functions LEAD (and LAG) accept the offset parameter as an integer, which is a count of rows to be skipped from the current row before accessing the leading/lagging row. What if I want to access leading rows based on the value of a column of the current row -- that is, a function applied to the current row's column value to locate the leading row?
As an example: I have a table
create table t(id integer, dt date);
For each id, start with the first record after ordering by dt ASC. Get the next record where dt = first_row.dt + 10 min. Then the next record where dt = first_row.dt + 20 min, and so on: each time the offset is cumulatively increased by 10 min.
If we don't get an exact match (i.e. next_row.dt <> first_row.dt + 10 min, say), then we select the row closest to the expected record, but lying within +/- 10 seconds.
insert into t values (1, to_date('12/20/2004 00:00:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:05:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:09:55', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:10:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:15:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:19:54', 'mm/dd/yyyy hh24:mi:ss')); --Not selected.
insert into t values (1, to_date('12/20/2004 00:19:55', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:25:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:30:05', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:30:06', 'mm/dd/yyyy hh24:mi:ss')); --Not Selected.
insert into t values (1, to_date('12/20/2004 00:35:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:39:55', 'mm/dd/yyyy hh24:mi:ss')); --Either this or below record is selected.
insert into t values (1, to_date('12/20/2004 00:40:05', 'mm/dd/yyyy hh24:mi:ss')); --Either this or above record is selected.
My output would be:
id dt
-----------
1 12/20/2004 00:00:00 AM
1 12/20/2004 00:10:00 AM --Exactly matches first_row.dt + 10min
1 12/20/2004 00:19:55 AM --Closest to first_row.dt + 20min +/- 10sec
1 12/20/2004 00:30:05 AM --Closest to first_row.dt + 30min +/- 10sec
1 12/20/2004 00:39:55 AM OR 12/20/2004 00:40:05 AM --Closest to first_row.dt + 40min +/- 10sec
The method I followed, after failed using LEAD is:
Step#1
------
Get a subset of the dt column: a series of cumulative 10-minute dts starting from the dt value of the first row (after rounding down to the nearest multiple of 10 minutes).
In this example I will get a subset:
12/20/2004 00:00:00 AM
12/20/2004 00:10:00 AM
12/20/2004 00:20:00 AM
12/20/2004 00:30:00 AM
12/20/2004 00:40:00 AM
This query will do it:
SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1
Step#2:
-------
This subquery is joined with table t to get only those records from t which are either equal to the dts in the result set returned by the subquery or fall within the range of 10 min +/- 10 sec (not just the closest, but all of them).
SELECT t.id, dt_rounded, ABS (t.dt - dt_rounded) * 24 * 60 * 60 dt_diff_in_sec
FROM t,
(SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1) t2
WHERE t.id = 1
AND ABS (t.dt - dt_rounded) * 24 * 60 * 60 <= 10
ORDER BY t.id, dt_rounded, dt_diff_in_sec;
I agree, this result set will include duplicate records which I need to remove procedurally, while looping through the cursor; the order by clause simplifies this.
Now you might have guessed the problem. If table t contains more than 1000 records, the query makes me wait at least 2 min! And that is when I am planning to put in at least 70,000 records!
I wrote a procedure which handles the situation a little better. But I don't know if an analytical query can help me bring back the performance. I could do it if LEAD had the functionality I mentioned in the first paragraph. Do you have any hints?
Thanks and regards
Praveen
December 24, 2004 - 9:54 am UTC
you'd be looking at first_value with range windows, not lag and lead in this case.
Windowing clause and range function.
Praveen, December 25, 2004 - 1:29 pm UTC
Hi Tom,
Thank you for the suggestion. I am not very experienced with analytical queries. I have tried based on your advice but am unable even to get started -- I am stuck at the very first step, specifying the range in the windowing clause. In the windowing clause, we specify an integer to get the preceding rows based on the current column value (CLARK's example, page 556, Analytic Functions).
In my above example I wrote a query which contains:
FIRST_VALUE(id)
OVER (ORDER BY dt DESC
RANGE 10 PRECEDING)
10, in the windowing clause, will give me records that fall within 10 days preceding the current row. But I need the records 10 minutes preceding -- and at the same time all those records that fall within +/- 10 sec, if an exact 10-minutes-later record is not found (please see the description of the problem given in the previous question).
Kindly give me a clearer picture of the windowing clause, and also how you would approach the above problem.
Thanks and regards
Praveen
December 26, 2004 - 12:19 pm UTC
do you have Expert One on One Oracle? I have extensive examples in there.
range 10 = 10 days.
range 10/24 = 10 hours
range 10/24/60 = 10 minutes......
I do have Expert One on One
Praveen, December 26, 2004 - 2:24 pm UTC
Hi Tom,
I got my first glimpse into analytical queries through your book only. Although I had attempted to learn them through the Oracle documentation a couple of times earlier, I was never able to write a decent query using analytic functions. Now, after spending a few hours with your book, I can see that these functions are not as complex as I thought earlier.
The 'hiredate' example you have given in the book is calculating in terms of days. (Pg:555)
"select ename, sal, hiredate, hiredate-100 window_top
first_value(ename)
over(order by hiredate asc
range 100 preceding) ename_prec,...."
I got the hint from your follow-up. I should have to think a little myself.
Thankyou Tom,
Praveen.
A reader, December 26, 2004 - 5:49 pm UTC
Tom,
Any dates when you would be releasing your book on Analytic?
Thanks.
December 26, 2004 - 6:00 pm UTC
doing a 2nd edition of Expert One on One Oracle now -- not on the list yet.
Great answer!
Shimon Tourgeman, December 27, 2004 - 2:21 am UTC
Dear Tom,
Could you please tell us when you are going to publish the next edition of your books, covering 9iR2 and maybe 10g, as you stated here?
Merry Christmas and a Happy New Year!
Shimon.
December 27, 2004 - 10:06 am UTC
sometime in 2005, but not the first 1/2 :)
Using range windows
Praveen, January 03, 2005 - 8:09 am UTC
Hi Tom,
Please allow me to explain the problem again which you had followed up earlier (please refer to "Using analytical function, LEAD, LAG"). In the table t(id integer, dt date) I have records which differ only by seconds (the 'dt' column). Could you please help me write a query to create windows such that each window groups records based on the expression 590 <= dt_1 <= 610 (590 and 610 being the date difference, in seconds, between the first record and the current record, and dt_1 being the 'dt' column value relative to the first record in each window after ordering by 'id' and 'dt' ASC).
The idea is to find a record following the first record which leads it by 10 minutes. If an exact match is not found, apply a tolerance of +/- 10 seconds. Once the nearest match is found (if multiple matches are found, select any), start from the next record and repeat the process. (Please see the scripts I had given earlier.)
In your follow-up, you had suggested the use of the first_value() analytic function with range windows. But it looks like it is pretty difficult to generate the kind of windows I specified above, and in your book examples of such a complex nature were not given (pardon me for being critical).
Your answer will help me get a deeper and practical understanding of analytic functions, and at the same time may help us bring a 12-hour procedure down to less than 5 hours.
Thanks and regards
Praveen
January 03, 2005 - 9:11 am UTC
no idea what 590 is. days? hours? seconds?
sorry - this doesn't compute to me.
590 <= dt_1 <= 610???
Delete Records Older Than 90 Days While Keeping Max
Mac, January 03, 2005 - 10:24 am UTC
There is a DATE column in a table. I need to delete all records older than 90 days -- except if the newest record for a unique key happens to be older than 90 days, I want to keep it and delete the prior records for that key value.
How?
January 03, 2005 - 10:26 am UTC
if the "newest record for a unique key"
if the key is unique.... then the date column is the only thing to be looked at?
that is, if the key is unique, then the oldest record is the newest record is in fact the only record....
Oops, but
A reader, January 03, 2005 - 11:01 am UTC
Sorry, forgot to mention that the DATE column is a part of the unique key.
Sorry, I went a bit fast...
Praveen, January 03, 2005 - 2:00 pm UTC
Hi Tom,
Sorry, I didn't explain properly.
590 = (10 minutes * 60) seconds - 10 seconds
610 = (10 minutes * 60) seconds + 10 seconds
Here I am looking for a record (say rn) exactly 600 sec (10 min) later than the first record in the range window. If I don't get an exact match, I try to find the record closest to rn that lies within a range 10 seconds less than or more than rn.
The condition "590 <= dt_1 <= 610" eliminates all other records inside the range window that do not follow the above rule.
dt_1 is the dt column value of any row following the first row in a given range window, such that the difference between dt_1 and the dt of the first row is between 590 seconds and 610 seconds. I am interested in only the one record which lies closest to 600 seconds.
I hope the picture is clearer to you now. As an example,
id dt
-----------------------------
1 12/20/2004 00:00:00 AM --Range window #1
1 12/20/2004 00:09:55 AM
1 12/20/2004 00:10:00 AM --Selected (Closest to 12/20/2004 00:10:00 AM)
............................
1 12/20/2004 00:10:10 AM --Range window #2
1 12/20/2004 00:19:55 AM --Selected (Closest to 12/20/2004 00:20:00 AM)
1 12/20/2004 00:20:55 AM
............................
1 12/20/2004 00:20:55 AM --Range window #3
1 12/20/2004 00:25:00 AM --Nothing to select
1 12/20/2004 00:29:10 AM --Nothing to select
...........................
1 12/20/2004 00:30:05 AM --Range window #4
1 12/20/2004 00:39:55 AM --Either one is selected
1 12/20/2004 00:40:05 AM --Either one is selected
-----------------------------
Thanks and regards
Praveen
January 03, 2005 - 10:24 pm UTC
that is first_value, last_value with a range window and the time range is
N * 1/24/60/60 -- for N seconds.
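However the window is expressed in SQL, the selection rule itself is small when stated procedurally. A hypothetical Python sketch of one reading of the requirement -- "closest row to each cumulative 10-minute target, within a +/- 10 second tolerance" -- run against the sample timestamps from the earlier post:

```python
from datetime import datetime, timedelta

fmt = "%m/%d/%Y %H:%M:%S"
ts = [datetime.strptime("12/20/2004 " + s, fmt) for s in [
    "00:00:00", "00:05:00", "00:09:55", "00:10:00", "00:15:00",
    "00:19:54", "00:19:55", "00:25:00", "00:30:05", "00:30:06",
    "00:35:00", "00:39:55", "00:40:05",
]]

def pick_samples(times, step=600, tol=10):
    """Keep the first row, then for each cumulative target
    (first + k*step seconds) keep the row closest to the target,
    provided it lies within +/- tol seconds of it."""
    first, last = times[0], times[-1]
    picked, k = [first], 1
    while first + timedelta(seconds=k * step - tol) <= last:
        target = first + timedelta(seconds=k * step)
        in_tol = [t for t in times if abs((t - target).total_seconds()) <= tol]
        if in_tol:  # ties (e.g. 00:39:55 vs 00:40:05) resolve to the earlier row
            picked.append(min(in_tol, key=lambda t: abs((t - target).total_seconds())))
        k += 1
    return picked

print([t.strftime("%H:%M:%S") for t in pick_samples(ts)])
```

This reproduces the selections listed in the original post (00:00:00, 00:10:00, 00:19:55, 00:30:05, then one of the two 40-minute candidates); the tie-break rule is an assumption, since the poster allowed either row.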
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?
jayaramj@quinnox.com, January 13, 2005 - 3:11 pm UTC
Hi Tom,
In answer to the question 'How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?' from reviewer Sergey (from Norway) in this post you have proposed the following solution:
ops$tkyte@ORA10G> select t,
2 sign_v,
3 v,
4 substr( max(data) over (order by t), 7 ) v2
5 from (
6 SELECT TD.T,
7 SIGN(NVL(TV.V, 0)) sign_v,
8 NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
9 case when tv.v is not null
10 then to_char( row_number()
over (order by td.t), 'fm000000' ) || tv.v
11 end data
12 FROM TD, TV
13 WHERE TV.T(+) = TD.T
14 )
15 ORDER BY T
16 ;
The problem is that this solution converts the data type of the column (in this case column TV.V) to a string (V2 in the result is a string). The result would then need to be converted back to the original data type.
It is best to avoid such data type conversion. Is there a solution to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS) in Oracle 9i without the datatype conversion?
January 13, 2005 - 3:45 pm UTC
encode the date as a string using to_char( v, 'yyyymmddhh24miss' ) and, in the substr, decode it back out -- to_date( substr(...), 'yyyymmddhh24miss' )
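The encode/decode trick generalizes: prefix each non-null value with its zero-padded row number, take a running MAX (string order equals row order because of the fixed-width prefix), then strip the prefix back off. A small Python sketch of the same idea on hypothetical data:

```python
vals = [None, 10, None, None, 7, None, 3, None]

carried, running_max = [], None
for i, v in enumerate(vals):
    if v is not None:
        # to_char(row_number, 'fm000000') || value: the fixed-width row-number
        # prefix guarantees the running MAX always comes from the latest non-null row
        running_max = f"{i:06d}{v}"
    # substr(..., 7): drop the 6-digit prefix, keep the payload
    carried.append(int(running_max[6:]) if running_max is not None else None)

print(carried)  # each null replaced by the last preceding non-null value
```

For dates, the payload would be the 'yyyymmddhh24miss' string Tom mentions, decoded back with to_date after the substr.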
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?
Jay, January 14, 2005 - 12:44 am UTC
In response to your post above: taking care of dates (for the datatype conversion) is not complex (though timestamp variants would require a different format string). Object columns are a different story altogether; these cannot easily be converted to strings. Is there a better solution that does not require datatype conversion (and hence does not require any knowledge of the column datatype in the SQL)?
January 14, 2005 - 8:06 am UTC
upgrade to 10g.
find prior collect_date to the max collect_date for each customer
JANE, January 25, 2005 - 4:30 pm UTC
Hello, Tom!
I work in Oracle 8i.
I have a table with 2 columns: cstmr_no, collect_date
CREATE TABLE CSTMR_dates
(
CSTMR_NO NUMBER(8) NOT NULL,
COLLECT_DATE DATE NOT NULL);
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy');
How can I rewrite this query using an analytic
function:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no
In production I have thousands of records in the table. THANKS A LOT
JANE
January 25, 2005 - 6:59 pm UTC
no idea what "return_collect_date" is. or where it comes from.
the sql is not sql...
Mistake: return_collect_date is collect_date
JANE, January 26, 2005 - 2:58 am UTC
Thank you for answer
JANE
January 26, 2005 - 8:46 am UTC
but this sql:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(COLLECT_DATE)
group by cstmr_no
is still not sql and I don't know if you want to
a) delete all old data BY CSTMR_NO (eg: keep just the record with the max(collect_date) BY CSTMR_NO
b) delete all data such that the collect_date is not equal to the max(collect_date)
I cannot suggest a way to rewrite an invalid sql query.
No,i want to do the next:
A reader, January 26, 2005 - 9:08 am UTC
I just have to present the data without deleting anything.
For each cstmr I have to see:
cstmr_no max(collect_date) last prior date to max
======== ================= ======================
18 01/05/04 01/03/04
248 01/11/04 01/03/04
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy');
January 26, 2005 - 9:31 am UTC
wow, how we got from:
select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no
to this, well -- just "wow". horse of a very different color.
I have to sort of guess -- maybe I'll get it right -- you want
a) every cstmr_no,
b) the last two dates recorded for them.
well, after editing your inserts to make them become actual sql that can run.... (you don't really use YY in real life do you? please please say "no, that was a mistake...")
ops$tkyte@ORA9IR2> select cstmr_no,
2 max( decode(rn,1,collect_date) ) d1,
3 max( decode(rn,2,collect_date) ) d2
4 from (
5 select cstmr_no,
6 collect_date,
7 row_number() over (partition by cstmr_no order by collect_date desc nulls last) rn
8 from cstmr_dates
9 )
10 where rn <= 2
11 group by cstmr_no
12 /
CSTMR_NO D1 D2
---------- --------- ---------
18 01-MAY-04 01-MAR-04
248 01-NOV-04 01-MAR-04
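The pattern in that answer -- rank within each group, keep the top N, pivot with max(decode(...)) -- is worth remembering on its own. A rough Python equivalent using the same sample dates, just to make the mechanics visible:

```python
from collections import defaultdict

rows = [(18, "2004-02-01"), (18, "2004-03-01"), (18, "2004-05-01"),
        (248, "2004-11-01"), (248, "2004-02-01"), (248, "2004-03-01")]

by_cust = defaultdict(list)
for cstmr_no, collect_date in rows:
    by_cust[cstmr_no].append(collect_date)

# row_number() over (partition by cstmr_no order by collect_date desc),
# keep rn <= 2, then pivot the two dates onto one row per customer
top_two = {c: sorted(dates, reverse=True)[:2] for c, dates in by_cust.items()}
print(top_two)
```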
Lead/Lag and Indexes
Rob H, February 22, 2005 - 6:12 pm UTC
We are using the Lead and Lag functions and I have run into an issue of Index usage.
let's say I have 2 tables:
select customer_account, prod_id, sales_date, total_sales from sales_table_NA
and
select customer_account, prod_id, sales_date, total_sales from sales_table_EUR
if i do a
create view eur_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_EUR
group by customer_account, prod_id, trunc(sales_date,'mon')
create view na_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_NA
group by customer_account, prod_id, trunc(sales_date,'mon')
There are indexes on the tables for customer_account.
Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove the lead function and UNION them, the index is used.
Any help?
February 23, 2005 - 1:56 am UTC
do you really want UNION or UNION ALL.........
(do you know the difference between the two)....
if you had given me simple setup scripts, I would have been happy to see if that makes a difference, but oh well.
Potential Solution
Rob H, February 22, 2005 - 6:54 pm UTC
Rather than pre-summing the data into 2 views, I found that UNION ALL-ing the data and then applying the sum and LEAD works fine.
ie
select
customer_account, prod_id, month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by
customer_account, prod_id order by month_purch desc) sales_last
from(
select customer_account, prod_id, trunc(sales_date,'mon') month_purch, total_sales from sales_table_NA
union all
select customer_account, prod_id, trunc(sales_date,'mon') month_purch, total_sales from sales_table_EUR)
group by customer_account, prod_id, month_purch
Attitude....
Rob H, February 23, 2005 - 9:54 am UTC
What's the deal? Having a bad day? I'm sorry, but I assumed from the select statements you could infer structure. Yes, I was using UNION ALL, yes, I know the difference (uh, feeling a bit rude are we?) but I didn't realize until after I posted that I missed that (a nice feature would be to be able to edit a post for a certain time after post). I generalized the data structure and SQL for confidentiality reasons. For a guy who is so hard on people's IM speak, you forget to capitalize your sentences :)
Now, UNION vs UNION ALL didn't affect index usage (it did, however, have 'other' performance issues). You can see from my next post that I worked on the issue and resolved it by not pre-summing each table. With the new query, if someone issues a select with no 'where customer_account=' then it's slower (but that also wasn't the goal).
Thanks
February 24, 2005 - 4:35 am UTC
No? I was simply asking "do you know the difference between the two?" for I find most people
a) don't know UNION ALL exists
b) don't know the semantic difference between UNION and UNION ALL
c) don't know the performance penalty involved with UNION vs UNION ALL when they didn't need to use UNION
Your example, as posted, did not use UNION ALL. Look at your text:
<quote>
Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it
does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove
the lead function and UNION them, the index is used.
</quote>
I quite simply asked:
does union all change the behaviour? (i did not have an example with table creates and such to work with, so I couldn't really 'test it', I don't have your tables, your indexes, your datatypes, etc)
do you need to use union, you said union, you did not say union all. do you know the difference between the two.
Sorry if you took it as an insult; I can only comment on the data provided. I had to assume you, like most of the world, were using UNION, not UNION ALL, and simply wanted to know if you could use UNION ALL, if UNION ALL made a difference, and if you knew the difference between the two.
If I had prescience, I could have read your subsequent post and not asked any questions, I guess.
Not having a bad day, just working with information provided. I was not trying to insult you -- I was simply "asking".
Analytics
Neelz, February 24, 2005 - 5:34 am UTC
Dear Sir,
I had gone through the above examples and was wondering whether analytical functions could be used when aggregating multiple columns from a table,
CREATE TABLE T (
SUPPLIER_CD CHAR(4) NOT NULL,
ORDERRPT_NO CHAR(8) NOT NULL,
ORDER_DATE CHAR(8) NOT NULL,
STORE_CD CHAR(4) NOT NULL,
POSITION_NO CHAR(3) NOT NULL,
CONTORL_FLAG CHAR(2),
ORDERQUANTITY_EXP NUMBER(3) DEFAULT (0) NOT NULL,
ORDERQUANTITY_RES NUMBER(3) DEFAULT (0) NOT NULL,
ENT_DATE DATE DEFAULT (SYSDATE) NOT NULL,
UPD_DATE DATE DEFAULT (SYSDATE) NOT NULL,
CONSTRAINT PK_T PRIMARY KEY(SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE, STORE_CD));
CREATE INDEX IDX_T ON T (SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE);
insert into t values('5636','62108373','20041129','0007','2','00',1,1, to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108373','20041129','0012','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'), to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0014','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0015','3','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1309','4','00',8,8,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1312','12' ,'00',8,8,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1313','13' ,'00',12,12,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
Currently the following query is used:-
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
The execution plan when this query is executed on the real table which has 4m records is : -
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=103002 Card=3571095 Bytes=107132850)
1 0 SORT (GROUP BY NOSORT) (Cost=103002 Card=3571095 Bytes=107132850)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=103002 Card=3571095 Bytes=107132850)
3 2 INDEX (FULL SCAN) OF 'IDX_T' (NON-UNIQUE) (Cost=26942 Card=3571095)
Could you please tell me whether analytic functions could be used here, or suggest a better approach for this query?
Thanks for your great help
February 24, 2005 - 5:49 am UTC
there would be no need of analytics here. analytics would be useful to get the 'aggregates' while preserving the 'details'
eg:
select empno, sal, sum(sal) over (partition by deptno)
from emp;
shows the empno, their sal and the sum of all salaries in their dept. that would be instead of coding:
select empno, sal, sum_sal
from emp, (select deptno, sum(sal) sum_sal from emp group by deptno) t
where emp.deptno = t.deptno
/
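Tom's "aggregates while preserving the details" point is the core of what analytics buy you. A tiny Python sketch, with hypothetical rows, of what SUM(sal) OVER (PARTITION BY deptno) produces compared with the aggregate-and-join workaround:

```python
from collections import defaultdict

emp = [("SMITH", 800, 20), ("JONES", 2975, 20),
       ("ALLEN", 1600, 30), ("WARD", 1250, 30)]

# one pass to total per dept, one pass to attach the total to each detail row
dept_total = defaultdict(int)
for ename, sal, deptno in emp:
    dept_total[deptno] += sal

analytic = [(ename, sal, dept_total[deptno]) for ename, sal, deptno in emp]
print(analytic)  # every row keeps its detail columns plus its department's total
```

The self-join formulation computes the same result; the analytic form simply avoids scanning the table twice and joining it back to itself.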
I was just wondering
A reader, February 24, 2005 - 6:25 am UTC
how would analytics help in the following example (the data nodes are implemented as rows in a table with two columns as pointers: split-from and merge-to, and the third column is "value", some number, not shown on diagram):
</code>
http://img23.exs.cx/my.php?loc=img23&image=directedgraph11th.png <code>
The task is to use this directed dependency graph and prorate the "value" column in each row/node in the following way:
foreach node
-start with a node, for example 16
-visit each hierarchy on which 16 depends, in this case hierarchies for 14 and 15, SUM their values and the current value of node 16, and that will be new, prorated value for node 16
-repeat this recursively for each sub-hierarchy
until all nodes are prorated
I was thinking maybe to use combination of sys_connect_by_path and AF but not sure how. Any thoughts?
February 24, 2005 - 6:51 am UTC
you won't get very far with that structure in 9i and before. connect by "loop" will be an error you see lots of with a directed graph.
analytics won't be appropriate either, they work on windows - not on hierarchies.
sys_connect_by_path is going to give you a string, not a sum
a scalar subquery in 10g with NOCYCLE on the query might work.
What if there is no closure inside the graph?
A reader, February 24, 2005 - 9:08 am UTC
i.e. if the link between node 9 and 5 is removed, and the link between node 6 and 0 is removed.
Would that make a difference? It would be a tree in that case. How should we proceed if that is the case? I was thinking maybe to use sys_connect_by_path to pack all sub-hierarchies one after another, with the marker in the window being the depth or level. If the level switches from n to 1, that would mean the end of a sub-hierarchy. If the level switches from 1 to 2, that is the beginning of the next hierarchy. And then aggregate over a partition inside the hierarchy view. Or is there a better approach?
February 24, 2005 - 9:22 am UTC
Lead/Lag and 0 Sales
Rob H, February 24, 2005 - 1:00 pm UTC
Thanks for all of the help so far. I have run into an issue where I have Companies and Contacts at that company. Here are the tables.
create table SALES_TRANS
(
CUSTOMER_ACCOUNT VARCHAR2(8) ,
STATION_NUMBER VARCHAR2(7) ,
PRODUCT_CODE VARCHAR2(8) ,
QUANTITY NUMBER ,
DATE_ISSUE DATE ,
PRICE NUMBER ,
VALUE NUMBER );
/
Create table COMPANY_CUSTOMER
(
COMPANY_ID NUMBER(9),
CUSTOMER_ACCOUNT VARCHAR2(8));
/
Create table PRODUCT_INFO
(
PRODUCT_CODE VARCHAR2(8) ,
PRODUCT_GROUP VARCHAR2(25),
PRODUCT_DESC VARCHAR2(100)
);
/
Running a query by customer (this select is a view called - SUM_CUST_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
t.CUSTOMER_ACCOUNT,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group, FISCAL_YEAR(DATE_ISSUE)
I get
COMPANY_ID,CUSTOMER_ACCOUNT,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","27294370","Product1",2002,1460.08,0
"F0009631","27294370","Product2",2005,0,27926.31
"F0009631","27294370","Product2",2004,27926.31,18086.17
"F0009631","27294370","Product2",2003,18086.17,47597.05
"F0009631","27294370","Product2",2002,47597.05,0
"F0009631","27294370","Product2",2001,0,0
"F0009631","27294370","Product3",2004,64582.6,51041
"F0009631","27294370","Product3",2003,51041,60225
"F0009631","27294370","Product3",2002,60225,43150
"F0009631","27294370","Product3",2001,43150,50491
"F0009631","27294370","Product3",2000,50491,664
"F0009631","27294370","Product3",1999,664,0
"F0009631","27294370","Product4",2005,2119.1,1708.61
"F0009631","27294370","Product4",2004,1708.61,4050.82
"F0009631","27294370","Product4",2003,4050.82,15662.57
"F0009631","27294370","Product4",2002,15662.57,0
"F0009631","27294370","Product5",2005,0,351.64
"F0009631","27294370","Product5",2004,351.64,5873.61
"F0009631","27294370","Product5",2003,5873.61,2548.83
"F0009631","27294370","Product5",2002,2548.83,0
"F0009631","27294370","Product6",2004,17347.84,16781.33
"F0009631","27294370","Product6",2003,16781.33,10575
"F0009631","27294370","Product6",2002,10575,3659.67
"F0009631","27294370","Product6",2001,3659.67,4901.67
"F0009631","27294370","Product6",2000,4901.67,4073.47
"F0009631","27294370","Product6",1999,4073.47,0
"F0009631","27294370","Product7",2004,5377.5,2588
"F0009631","27294370","Product7",2003,2588,245
"F0009631","27294370","Product7",2000,245,0
"F0009631","27340843","Product2",2003,3013.71,0
"F0009631","27340843","Product3",1999,1411,0
"F0009631","27340843","Product5",2003,3254.9,0
Now if I run the same grouping by only company (this select is a view called - SUM_COMPANY_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, p.product_group, FISCAL_YEAR(DATE_ISSUE)
we get
COMPANY_ID,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","Product1",2002,1460.08,0
"F0009631","Product2",2005,0,27926.31
"F0009631","Product2",2004,27926.31,21099.88
"F0009631","Product2",2003,21099.88,47597.05
"F0009631","Product2",2002,47597.05,0
"F0009631","Product2",2001,0,0
"F0009631","Product3",2004,64582.6,51041
"F0009631","Product3",2003,51041,60225
"F0009631","Product3",2002,60225,43150
"F0009631","Product3",2001,43150,50491
"F0009631","Product3",2000,50491,2075
"F0009631","Product3",1999,2075,0
"F0009631","Product4",2005,2119.1,1708.61
"F0009631","Product4",2004,1708.61,4050.82
"F0009631","Product4",2003,4050.82,15662.57
"F0009631","Product4",2002,15662.57,0
"F0009631","Product5",2005,0,351.64
"F0009631","Product5",2004,351.64,9128.51
"F0009631","Product5",2003,9128.51,2548.83
"F0009631","Product5",2002,2548.83,0
"F0009631","Product6",2004,17347.84,16781.33
"F0009631","Product6",2003,16781.33,10575
"F0009631","Product6",2002,10575,3659.67
"F0009631","Product6",2001,3659.67,4901.67
"F0009631","Product6",2000,4901.67,4073.47
"F0009631","Product6",1999,4073.47,0
"F0009631","Product7",2004,5377.5,2588
"F0009631","Product7",2003,2588,245
"F0009631","Product7",2000,245,0
The problem is that if I
select * from SUM_CUST_TRANS_PRODUCT_FY_V where fiscal_year=2004
customer 27340843 will not show up (no 2004 purchases), but that also means the total_VALUE_pre_y for 2004 will never summarize by customer to the total_VALUE_pre_y for 2004 for the company. Is there a better way to do this? The goal is to show current year sales vs. previous year sales by company, by customer, and potentially at a summary level higher than company (city).
I guess the idea would be to somehow generate, for all customers in a company, a row for every year and product in which the company made purchases (a cartesian product). I think this is difficult for large customer and sales transaction tables.
ie
"F0009631","27340843","Product2",2004,0,3013.71 <--- ***
"F0009631","27340843","Product2",2003,3013.71,0
*** This row doesn't exist in the customer view. There are no 2004 sales, so it doesn't appear, but we would like to see it so that the previous year shows.
I would love to "attach" some of the transactions if it would help. Is there a better way?
hierarchical cubes + MV?
Rob H, February 25, 2005 - 2:52 pm UTC
Would hierarchical cubes and an MV be the solution? It seems like a lot of metadata to create. We would have to create it for all customers, all years, and all product groups.
February 25, 2005 - 6:40 pm UTC
if you have "missing data", the only way i know to "make it up" is an outer join (partitioned outer joins in 10g rock, removing the need to create cartesian products of every dimension first)
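Since the thread doesn't show the 10g syntax itself, here is a plain-Python sketch of the densification idea: invent the missing (customer, product, year) rows so the prior-year value has somewhere to land. The data and helper names below are made up for illustration, using two rows from the example above.

```python
# Illustrative sketch (not Oracle SQL): "densifying" sparse data so a
# previous-year value can be carried into years with no sales.
# Rows: (customer, product, year, total). Customer 27340843 has Product2
# sales only in 2003, so a plain per-year query never shows 2004.
rows = [
    ("27294370", "Product2", 2003, 18086.17),
    ("27340843", "Product2", 2003, 3013.71),
]

def densify(rows, years):
    """Fill in zero rows for missing years, mimicking what an outer join
    against a year dimension (a partitioned outer join in 10g) does."""
    totals = {(c, p, y): v for c, p, y, v in rows}
    keys = {(c, p) for c, p, _, _ in rows}
    out = []
    for c, p in sorted(keys):
        for y in years:
            cur = totals.get((c, p, y), 0)
            prev = totals.get((c, p, y - 1), 0)  # previous-year total
            out.append((c, p, y, cur, prev))
    return out

result = densify(rows, years=[2003, 2004])
# The made-up 2004 row now carries the 2003 total as "previous year":
made_up = [r for r in result if r[:3] == ("27340843", "Product2", 2004)][0]
```

With the year dimension outer-joined in, the "missing" 2004 row exists and its previous-year value summarizes correctly.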
Neelz, February 27, 2005 - 2:32 am UTC
Dear Sir,
This is with regard to my previous post, the fifth one above this.
<quote>
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
</quote>
As you mentioned, analytics could not be used, but could you please advise me on my problem?
The query is in fact big; for brevity I only included a few columns. The actual query is
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
.....
.....
.....
.....
.....
.....
SUM(DECODE(RTRIM(POSITION_NO),'197',ORDERQUANTITY_RES,0)) Q197,
SUM(DECODE(RTRIM(POSITION_NO),'198',ORDERQUANTITY_RES,0)) Q198,
SUM(DECODE(RTRIM(POSITION_NO),'199',ORDERQUANTITY_RES,0)) Q199,
SUM(DECODE(RTRIM(POSITION_NO),'200',ORDERQUANTITY_RES,0)) Q200,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
As you can see, there is a definite pattern in the sum functions. Could you please help me tune this query?
Thanks in advance
February 27, 2005 - 8:32 am UTC
you are doing a pivot -- looks great to me. It is "classic"
Neelz, February 27, 2005 - 9:51 am UTC
Dear Sir,
I am sorry if you felt like that. It is quite a new world for me here; I started visiting this site 3-4 months back, then realized the enormity of it, and it has become like an addiction. I bought both of your books and started working through them, and I am reading the Oracle concepts guide. Every day I try many times to ask a question, but no luck so far -- it might be because of the timezone difference.
Coming back to my question: since it is a huge query that was taking 35 minutes to execute, after reading many articles here and in the books I was really confused as to what approach to take. I still am. Analytic functions (not useful, as you said), function-based indexes (no, because we have Standard Edition), materialized views (no, because it is an OLTP system), stored SQL functions, the DETERMINISTIC keyword, user-defined aggregates, optimizer hints... at present it is all confusing to me.
I am trying different approaches and have reduced the execution time to 9.08 minutes. The query was written with an index hint earlier; by removing it, the execution time decreased to a little over 9 minutes.
I was wondering whether you could advise on what approach I should take.
Thanks for your valuable time,
February 27, 2005 - 10:04 am UTC
if that is taking 35 minutes you either
a) have the memory settings like pga_aggregate_target/sort_area_size set way too low
b) you have billions of records that are hundreds of bytes in width
c) really slow disks
d) an overloaded system
I mean -- that query is pretty "simple" full scan, aggregate, nothing to it -- unless it is a gross simplification, it should not take 35 minutes. Can you trace it with the 10046 level 12 trace and post the tkprof section that is relevant to just this query with the waits and all?
Neelz, February 27, 2005 - 10:56 am UTC
Dear Sir,
Thank you for your kind reply,
This report was taken from the development system.
I used alter session set events '10046 trace name context forever, level 12'. The query execution time was 00:08:15.03
select
supplier_cd, orderrpt_no, order_date,
sum(decode(rtrim(position_no),'1',orderquantity_res,0)) q1,
sum(decode(rtrim(position_no),'2',orderquantity_res,0)) q2,
sum(decode(rtrim(position_no),'3',orderquantity_res,0)) q3,
sum(decode(rtrim(position_no),'4',orderquantity_res,0)) q4,
sum(decode(rtrim(position_no),'5',orderquantity_res,0)) q5,
.....
.....
sum(decode(rtrim(position_no),'197',orderquantity_res,0)) q197,
sum(decode(rtrim(position_no),'198',orderquantity_res,0)) q198,
sum(decode(rtrim(position_no),'199',orderquantity_res,0)) q199,
sum(decode(rtrim(position_no),'200',orderquantity_res,0)) q200,
sum(orderquantity_res) order_total
from
t
group by
supplier_cd, orderrpt_no, order_date
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.04 0 0 0 0
Execute 2 0.02 0.04 0 0 0 0
Fetch 15 431.55 488.37 37147 36118 74 211
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 18 431.60 488.46 37147 36118 74 211
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66
Rows Row Source Operation
------- ---------------------------------------------------
211 SORT GROUP BY
4205484 TABLE ACCESS FULL T
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 16 0.00 0.00
SQL*Net more data to client 30 0.00 0.00
db file sequential read 3 0.04 0.05
db file scattered read 2280 0.78 30.62
direct path write 4 0.00 0.00
direct path read 147 0.05 1.45
SQL*Net message from client 16 140.57 166.58
SQL*Net break/reset to client 2 0.01 0.01
********************************************************************************
Thank you
February 27, 2005 - 11:13 am UTC
that is 8 minutes?
but I see some writes to temp here -- for 211 aggregated rows, perhaps your sort/pga is set small
Also, why do you need to rtrim() 4,205,484 rows? (and why is something called position NUMBER in a string?) is that rtrim there "just in case" or is it really needed? why would it have trailing blanks and is that not a data integrity issue that needs to be fixed?
(but this is an 8 minute query, not a 35 minute query, if it takes longer on production -- it'll be because it is waiting for something -- like IO...)
Neelz, February 27, 2005 - 11:30 am UTC
Dear Sir,
This is a 3rd party application and the query was written with an index hint earlier. After removing the hint query execution time reduced to 8 min. Regarding the rtrim I have to check with the team if it is really needed. I will try the trace on production tomorrow.
And at last I could see the link for "Submit a New Question"! I think I should try around 1:00 AM.
Thanking You a lot
February 27, 2005 - 11:31 am UTC
depends on your time zone, rarely am I up at 1am east coast (gmt-5) time doing this stuff!
Miki, March 02, 2005 - 8:51 am UTC
Tom,
I need to produce a moving average with an even window size. If I want a window of size 28, I need to look backward 14 and forward 14, with the first and last values of the window each divided by 2.
(a1/2+a2+...+a28+a29/2)/28
How could I accomplish it with the function:
avg() over(...)?
Thanks in advance
March 02, 2005 - 10:03 am UTC
this is the first thought that popped into my head:
a) get the sum(val) over 13 before and 13 after (27 rows possible).
b) get the lag(val,14)/2 and lead(val,14)/2
c) add those three numbers
d) divide by the count of non-null VALS observed (count(val) 13 before/after + 1 if lag is not null + 1 if lead is not null)
ops$tkyte@ORA9IR2> create table t
2 as
3 select rownum id, object_id val
4 from all_objects
5 where rownum <= 30;
Table created.
<b>so, this was my "debug" query, just to see the data:</b>
ops$tkyte@ORA9IR2> select id,
2 sum(val) over
(order by id rows between 13 preceding and 13 following) sum,
3 count(val) over
(order by id rows between 13 preceding and 13 following)+
4 decode(lag(val,14) over (order by id),null,0,1)+
5 decode(lead(val,14) over (order by id),null,0,1) cnt,
6 lag(id,14) over (order by id) lagid,
7 lag(val,14) over (order by id) lagval,
8 lead(id,14) over (order by id) leadid,
9 lead(val,14) over (order by id) leadval
10 from t
11 order by id;
ID SUM CNT LAGID LAGVAL LEADID LEADVAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
1 218472 15 15 6399
2 224871 16 16 19361
3 244232 17 17 23637
4 267869 18 18 14871
5 282740 19 19 20668
6 303408 20 20 18961
7 322369 21 21 15767
8 338136 22 22 20654
9 358790 23 23 7065
10 365855 24 24 17487
11 383342 25 25 11077
12 394419 26 26 20772
13 415191 27 27 15505
14 430696 28 28 12849
15 425648 29 1 17897 29 23195
16 441314 29 2 7529 30 18523
17 436505 28 3 23332
18 422306 27 4 14199
19 399409 26 5 22897
20 389266 25 6 10143
21 365728 24 7 23538
22 342135 23 8 23593
23 332316 22 9 9819
24 320581 21 10 11735
25 303084 20 11 17497
26 295369 19 12 7715
27 276010 18 13 19359
28 266791 17 14 9219
29 260392 16 15 6399
30 241031 15 16 19361
30 rows selected.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id,
2 (sum(val) over
(order by id rows between 13 preceding and 13 following)+
3 nvl(lag(val,14) over (order by id)/2,0)+
4 nvl(lead(val,14) over (order by id)/2,0))/
5 nullif(
6 count(val) over
(order by id rows between 13 preceding and 13 following)+
7 decode(lag(val,14) over (order by id),null,0,1)+
8 decode(lead(val,14) over (order by id),null,0,1)
9 ,0) avg
10 from t
11 order by id;
ID AVG
---------- ----------
1 14778.1
2 14659.4688
3 15061.7941
4 15294.6944
5 15424.9474
6 15644.425
7 15726.3095
8 15839.2273
9 15753.1522
10 15608.2708
11 15555.22
12 15569.4231
13 15664.5741
14 15611.4464
15 15386
16 15666.8966
17 16006.1071
18 15903.9074
19 15802.2115
20 15773.5
21 15729.0417
22 15388.3261
23 15328.4318
24 15545.1667
25 15591.625
26 15748.7632
27 15871.6389
28 15964.7353
29 16474.4688
30 16714.1
30 rows selected.
ops$tkyte@ORA9IR2>
<b>I did not do a detailed check of the results -- but that should get you going (remember -- there are 29 rows -- 14+1+14!!! and beware NULLs)</b>
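The endpoint-halving formula from the question, (a1/2 + a2 + ... + a28 + a29/2)/28, can be sketched in plain Python as a cross-check of the arithmetic (this illustrates the formula, not the SQL; the handling of truncated windows near the edges, where the divisor shrinks to the weights actually present, is an assumption in the same spirit as the count() adjustment above).

```python
# Centered moving average with an even effective window: half points back,
# half points forward, the two endpoint values weighted by 1/2 each.
def even_window_avg(vals, i, half=14):
    lo, hi = i - half, i + half
    total, weight = 0.0, 0.0
    for j in range(max(lo, 0), min(hi, len(vals) - 1) + 1):
        w = 0.5 if j in (lo, hi) else 1.0  # endpoints count half
        total += w * vals[j]
        weight += w
    return total / weight
```

For a full interior window of 29 values that are all equal, the result is that value; for [1, 2, 3] with half=1 the result is (1/2 + 2 + 3/2)/2 = 2.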
Miki, March 02, 2005 - 10:54 am UTC
Tom,
Your answer is excellent. That is - almost - what I needed.
If my window size is odd I can simply use the avg() over () function. I am looking for a solution where I can use avg() over () instead of sum() over () / count() here as well.
Is it possible?
Thank you!
March 02, 2005 - 11:15 am UTC
if you want to do things to row 1 and row 29 in the window "special" like this -- this was the only thing I thought of.
Miki, March 02, 2005 - 11:18 am UTC
Thank you! I will use your recommended code.
consecutive days... 8.1.7
Dean, March 09, 2005 - 1:07 pm UTC
create table day_cd
(dt date
,cd varchar2(2))
/
insert into day_cd values ('08-MAR-05', 'BD');
insert into day_cd values ('09-MAR-05', 'AD');
insert into day_cd values ('10-MAR-05', 'AD');
insert into day_cd values ('11-MAR-05', 'AD');
insert into day_cd values ('12-MAR-05', 'AD');
insert into day_cd values ('13-MAR-05', 'AD');
insert into day_cd values ('14-MAR-05', 'CD');
insert into day_cd values ('15-MAR-05', 'CD');
insert into day_cd values ('16-MAR-05', 'AD');
insert into day_cd values ('17-MAR-05', 'AD');
insert into day_cd values ('18-MAR-05', 'AD');
insert into day_cd values ('19-MAR-05', 'CD')
/
SELECT * FROM DAY_CD;
DT CD
--------- --
08-MAR-05 BD
09-MAR-05 AD
10-MAR-05 AD
11-MAR-05 AD
12-MAR-05 AD
13-MAR-05 AD
14-MAR-05 CD
15-MAR-05 CD
16-MAR-05 AD
17-MAR-05 AD
18-MAR-05 AD
19-MAR-05 CD
I'd like to count each run of a code over consecutive days as one occurrence.
So that the output would be:
CD OCCURRENCES
-- -----------
AD 2
BD 1
CD 2
nevermind...
Dean, March 09, 2005 - 1:59 pm UTC
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
we were responding at the same time...
Dean, March 09, 2005 - 2:01 pm UTC
:)
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
CD COUNT(*)
-- ----------
AD 2
BD 1
CD 2
Thanks for all of your help...
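The run-counting logic can be cross-checked in plain Python (the data below is the day_cd table retyped; the helper is illustrative only): a new run starts whenever the code changes or a calendar day is skipped.

```python
from datetime import date, timedelta

# The day_cd data from above: BD, five AD, two CD, three AD, one CD.
days = [(date(2005, 3, 8), "BD")] + \
       [(date(2005, 3, 9) + timedelta(d), "AD") for d in range(5)] + \
       [(date(2005, 3, 14), "CD"), (date(2005, 3, 15), "CD")] + \
       [(date(2005, 3, 16) + timedelta(d), "AD") for d in range(3)] + \
       [(date(2005, 3, 19), "CD")]

def run_counts(days):
    """Count each run of consecutive days with the same code as one."""
    counts = {}
    prev_dt, prev_cd = None, None
    for dt, cd in sorted(days):
        # a new run starts when the code changes or a day is skipped
        if cd != prev_cd or (dt - prev_dt).days != 1:
            counts[cd] = counts.get(cd, 0) + 1
        prev_dt, prev_cd = dt, cd
    return counts
```

This reproduces the expected output: AD 2, BD 1, CD 2, matching what the lead()-based SQL returns.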
max() over() till not the current row
Miki, March 10, 2005 - 4:12 am UTC
Tom,
I have the following input
DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 1 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 1 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0
When the value of column T is 'T', a rule determines which columns (col1, ..., col4) get 1 or 0.
Unfortunately, with the rule more than one column can get value 1. So, if col1+...+col4 > 1 then I would like each colx to take the value of colx from the previous row where t = 'T' and col1+...+col4 = 1
So, the output is the following
DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 0 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 0 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0
I tried to use a max() over () function to replace the wrong value, but it doesn't work because I can't see the max datum up through the previous record where t = 'T' and col1+...+col4 = 1
...
case when t = 'T' and col1+...+col4 > 1 and
greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col1,
Case when t = 'T' and col1+...+col4 > 1 and
Greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col4
Could you give me a solution to my problem?
Thanks in advance
miki
Miki, March 10, 2005 - 8:09 am UTC
Here is my table populated with data:
create table T
(
DATUM DATE,
T VARCHAR2(1),
COL1 NUMBER,
COL2 NUMBER,
COL3 NUMBER,
COL4 NUMBER
);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('16-01-2005 13:17:46', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-01-2005 17:23:13', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-03-2005 02:59:17', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('11-12-2004 21:59:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('10-01-2005 12:00:22', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('24-02-2005 02:36:51', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('08-12-2004 11:21:15', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('07-01-2005 20:52:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-02-2005 23:44:33', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-03-2005 16:25:12', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 19:02:28', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('22-01-2005 11:21:41', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-01-2005 15:32:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-12-2004 03:07:10', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('21-02-2005 16:25:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 01:02:39', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('15-12-2004 05:49:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-02-2005 14:35:34', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-12-2004 15:01:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
commit;
select t.* from t t
order by 1;
DATUM T COL1 COL2 COL3 COL4
1 2004.12.02. 15:01:42 0 0 0 0
2 2004.12.08. 11:21:15 0 0 0 0
3 2004.12.11. 21:59:18 T 1 0 0 0
4 2004.12.15. 5:49:26 0 0 0 0
5 2004.12.19. 3:07:10 0 0 0 0
6 2005.01.01. 1:02:39 T 0 1 0 0
7 2005.01.01. 19:02:28 0 0 0 0
8 2005.01.04. 17:23:13 T 1 1 0 0
9 2005.01.07. 20:52:26 0 0 0 0
10 2005.01.10. 12:00:22 0 0 0 0
11 2005.01.16. 13:17:46 0 0 0 0
12 2005.01.19. 15:32:18 T 1 1 0 0
13 2005.01.22. 11:21:41 0 0 0 0
14 2005.02.02. 23:44:33 0 0 0 0
15 2005.02.04. 14:35:34 T 0 1 0 0
16 2005.02.21. 16:25:42 0 0 0 0
17 2005.02.24. 2:36:51 0 0 0 0
18 2005.03.01. 2:59:17 0 0 0 0
19 2005.03.04. 16:25:12 T 1 0 0 0
Line 8 and 12 have more then one column that contain 1.
So, I need to "copy" every colx from line 6 because it is the most recent preceding line (ordered by datum) that has 'T' in column T and exactly one colx with value 1.
Thank you
March 10, 2005 - 8:28 am UTC
ops$tkyte@ORA9IR2> select t, col1, col2, col3, col4,
2 substr(max(data) over (order by datum),11,1) c1,
3 substr(max(data) over (order by datum),12,1) c2,
4 substr(max(data) over (order by datum),13,1) c3,
5 substr(max(data) over (order by datum),14,1) c4,
6 case when col1+col2+col3+col4 > 1 then '<---' end fix
7 from (
8 select t.*,
9 case when t = 'T' and col1+col2+col3+col4 = 1
10 then to_char(row_number() over (order by datum) ,'fm0000000000') || col1 || col2 || col3 || col4
11 end data
12 from t
13 )
14 order by datum;
T COL1 COL2 COL3 COL4 C C C C FIX
- ---------- ---------- ---------- ---------- - - - - ----
0 0 0 0
0 0 0 0
T 1 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
T 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 1 0 0 0 1 0 0 <---
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 1 0 0 0 1 0 0 <---
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
T 1 0 0 0 1 0 0 0
19 rows selected.
Great!
Miki, March 10, 2005 - 9:38 am UTC
Great solution!
Thank you, it is just what I expected.
book on Analytics
A reader, March 10, 2005 - 11:07 am UTC
Hi Tom,
It is high time that you published the book on analytic functions -- there is a lot one can do with them, but very few people are fully aware of it.
When is this book due?
thanks
A variation of Dean's question ...
Julius, March 10, 2005 - 8:13 pm UTC
create table tt (
did number,
dd date,
status number);
alter table tt add constraint tt_pk primary key (did,dd) using index;
insert into tt values (-111,to_date('03/03/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/04/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/05/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/06/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-111,to_date('03/08/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/09/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/10/2005','mm/dd/yyyy'),22);
insert into tt values (-222,to_date('03/04/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/05/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/06/2005','mm/dd/yyyy'),77);
insert into tt values (-222,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/08/2005','mm/dd/yyyy'),55);
insert into tt values (-222,to_date('03/09/2005','mm/dd/yyyy'),11);
I need a query which would return the following result set, where days_in_status is the count of consecutive days the did has been in its current status (dd values are days only). I've been trying to use analytics but without much success so far. Any ideas? Thanks!!
DID DD STATUS DAYS_IN_STATUS
----- ---------- ------ --------------
-111 03/10/2005 22 3
-222 03/09/2005 11 1
March 10, 2005 - 9:04 pm UTC
ops$tkyte@ORA9IR2> select did, max(dd), count(*)
2 from (
3 select x.*, max(grp) over (partition by did order by dd desc) maxgrp
4 from (
5 select tt.*,
6 case when lag(status) over (partition by did order by dd desc) <> status
7 then 1
8 end grp
9 from tt
10 ) x
11 )
12 where maxgrp is null
13 group by did
14 /
DID MAX(DD) COUNT(*)
---------- --------- ----------
-222 09-MAR-05 1
-111 10-MAR-05 3
is one approach...
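The same "count rows until the status changes, newest first" idea can be sketched in plain Python against an abbreviated version of the tt data (illustrative only; the SQL above does this per partition with lag() over a descending order).

```python
from datetime import date

rows = [  # (did, dd, status) -- abbreviated from the tt inserts above
    (-111, date(2005, 3, 8), 22), (-111, date(2005, 3, 9), 22),
    (-111, date(2005, 3, 10), 22), (-111, date(2005, 3, 7), 33),
    (-222, date(2005, 3, 8), 55), (-222, date(2005, 3, 9), 11),
]

def days_in_status(rows):
    """For each did: latest day, latest status, and run length of that status."""
    by_did = {}
    for did, dd, st in rows:
        by_did.setdefault(did, []).append((dd, st))
    out = {}
    for did, recs in by_did.items():
        recs.sort(reverse=True)          # newest first
        latest_dd, latest_st = recs[0]
        n = 0
        for dd, st in recs:
            if st != latest_st:
                break                    # status changed: stop counting
            n += 1
        out[did] = (latest_dd, latest_st, n)
    return out
```

This gives -111 three days in status 22 (through 03/10) and -222 one day in status 11, matching the expected result set.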
SQL Query
a reader, March 15, 2005 - 6:50 pm UTC
Hi Tom,
create table a
(accno number(8) not null,
amount_paid number(7) not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into a values (5, 3000);
Could you please help me write the following query without using rownum or analytics?
List the accno corresponding to the maximum amount paid. If more than one account has the same max amount paid, list any one.
I am expecting the result to be accno 4 or 5
Thanks for your time.
Regards
March 15, 2005 - 9:27 pm UTC
sounds like homework.
I give a similar quiz question in interviews (find the most frequently occurring month)
tkyte@ORA8IW> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno
2 from a;
ACCNO
-----------------------------------------
5
is one possible approach (assuming that amount_paid is positive)
tkyte@ORA8IW> select max(accno)
2 from a
3 where amount_paid = ( select max(amount_paid) from a );
MAX(ACCNO)
----------
5
is another (that would work well if amount_paid,accno were indexed....)
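The encode-and-max trick in the first answer can be sketched in plain Python: zero-pad the amount so that string order matches numeric order, append the accno, take the max string, and strip the prefix back off. As caveated above, this assumes non-negative amounts; the dict-based helper here is made up for illustration.

```python
# accno -> amount_paid, from the inserts above
accounts = {1: 1000, 2: 1500, 3: 2000, 4: 3000, 5: 3000}

def accno_of_max(accounts, width=7):
    """Pick the accno of the max amount by encoding (amount, accno)
    into one sortable string, like to_char(amount,'fm0000000')||accno."""
    encoded = max(f"{paid:0{width}d}{accno}" for accno, paid in accounts.items())
    return int(encoded[width:])  # drop the zero-padded amount prefix
```

With two accounts tied at 3000, string comparison falls through to the appended accno, so account 5 wins (the same tie-break the SQL version exhibits).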
negatives to worry about ...
Gabe, March 15, 2005 - 9:56 pm UTC
SQL> select * from a;
ACCNO AMOUNT_PAID
---------- -----------
1 -2
2 -1
SQL> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno from a;
ACCNO
-----------------------------------------
21
March 15, 2005 - 10:04 pm UTC
....
(assuming that amount_paid is positive)
.......
that was caveated and why I gave two answers ;)
cannot read ...
Gabe, March 15, 2005 - 10:57 pm UTC
Sorry about that ... missed it completely.
following an idea of mikito ...
Matthias Rogel, March 16, 2005 - 8:11 am UTC
1 select accno
2 from a
3 start with amount_paid = (select max(amount_paid) from a)
4 and accno = (select min(accno) from a where amount_paid = (select max(amount_paid) from a))
5* connect by prior null is not null
SQL> /
ACCNO
----------
4
would be a third solution
March 16, 2005 - 8:38 am UTC
there are many solutions -- this one would win a Rube Goldberg award though :)
another query using analytics
A reader, March 29, 2005 - 11:32 am UTC
I've got 2 tables, t1 and t2.
t1(1 column):
t1.x(int ,primary key)
1
2
3
and t2(3 columns,index on t2.y):
t2.x(int) t2.y(int) t2.z(int)
1 7000 1
1 7000 6
1 8000 8
2 7000 1
2 7000 5
3 7000 3
3 8000 1
3 8000 7
3 9000 5
I would like to have a report like this:
t1.x t2.y count min max
1 7000 2 1 8
1 8000 1 1 8
2 7000 2 1 5
3 7000 1 1 7
3 8000 2 1 7
3 9000 1 1 7
What I came up with is:
select distinct t1.x,t2.y,
count(*) over (partition by t1.x,t2.y) as count,
min(t2.z) over (partition by t1.x) as min,
max(t2.z) over (partition by t1.x) as max
from t1, t2
where t1.x = t2.x;
I was wondering if this query is good enough, or if there's a better way (in terms of performance) to write it. I'm new to analytics, and your help would be very much appreciated.
March 29, 2005 - 12:25 pm UTC
we could probably do this in analytics without the distinct, something like
select t1.x, t2.y, t2.cnt,
min(t2.z) over (partition by t1.x),
max(t2.z) over (partition by t1.x)
from t1, (select x, y, count(*) cnt from t2 group by x, y ) t2
where t1.x = t2.x;
and maybe even push the min/max() down into the inline view.
Analytics problem
Mark, April 08, 2005 - 12:19 pm UTC
Hi Tom,
I have a problem whose solution I'm pretty sure involves analytic functions. I've been struggling with it for some time, but analytics are new to me. I want to go from this:
/* create and inserts */
create table test.test (ordernum varchar2(10),
tasktype char(3),
feetype varchar2(20),
amount number(10,2));
insert into test.test(ordernum, tasktype, feetype, amount)
values('123123', 'DOC', 'Product Fee', 15);
insert into test.test(ordernum, tasktype, feetype, amount)
values('123123', 'DOC', 'Copy Fee', 1);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'Setup Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'File Review Fee', 27);
insert into test.test(ordernum, tasktype, feetype, amount)
values('34864', 'COS', 'Statutory Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
values('56432', 'DOC', 'Product Fee', 80);
insert into test.test(ordernum, tasktype, feetype, amount)
values('56432', 'DOC', 'Prepayment', -16);
SQL> select tasktype, ordernum, feetype, amount from test.test;
TAS ORDERNUM FEETYPE AMOUNT
--- ---------- -------------------- ----------
DOC 123123 Product Fee 15
DOC 123123 Copy Fee 1
COS 34864 Setup Fee 23
COS 34864 File Review Fee 27
COS 34864 Statutory Fee 23
DOC 56432 Product Fee 80
DOC 56432 Prepayment -16
...to this:
TAS ORDERNUM FEE1 FEE2 FEE3 FEE4 FEE5
--- -------- ----------- -------- ---------- -------- --------
DOC Product Fee Copy Fee Prepayment
DOC 123123 15 1
DOC 56432 80 -16
COS Setup Fee File Review Fee Statutory Fee
COS 34864 23 27 23
Allow me to explain. For each tasktype I would like a heading row, which, going across, contains all the feetypes found in test.test for that particular tasktype. There should never be more than five feetypes.
For each ordernum under each tasktype, I would like to have the amounts going across, underneath the appropriate feetypes.
I'm pretty sure my solution involves the lag and/or lead functions, partitioning over tasktype. I particularly seem to have trouble wrapping my brain around the problem of how to get a distinct ordernum while keeping intact the data in other columns (where ordernums duplicate).
I hope my explanation is clear enough.
Hope you can help. Thanks in advance. I will continue working on this.
April 08, 2005 - 12:51 pm UTC
ops$tkyte@ORA9IR2> with columns
2 as
3 (select tasktype, feetype, row_number() over (partition by tasktype order by feetype) rn
4 from (select distinct tasktype, feetype from test )
5 )
6 select a.tasktype, a.ordernum,
7 to_char( max( decode( rn, 1, amount ) )) fee1,
8 to_char( max( decode( rn, 2, amount ) )) fee2,
9 to_char( max( decode( rn, 3, amount ) )) fee3,
10 to_char( max( decode( rn, 4, amount ) )) fee4,
11 to_char( max( decode( rn, 5, amount ) )) fee5
12 from test a, columns b
13 where a.tasktype = b.tasktype
14 and a.feetype = b.feetype
15 group by a.tasktype, a.ordernum
16 union all
17 select tasktype, null,
18 ( max( decode( rn, 1, feetype ) )) fee1,
19 ( max( decode( rn, 2, feetype ) )) fee2,
20 ( max( decode( rn, 3, feetype ) )) fee3,
21 ( max( decode( rn, 4, feetype ) )) fee4,
22 ( max( decode( rn, 5, feetype ) )) fee5
23 from columns
24 group by tasktype
25 order by 1 desc, 2 nulls first
26 /
TAS ORDERNUM FEE1 FEE2 FEE3 FEE4 FEE5
--- ---------- --------------- --------------- --------------- ---- ----
DOC Copy Fee Prepayment Product Fee
DOC 123123 1 15
DOC 56432 -16 80
COS File Review Fee Setup Fee Statutory Fee
COS 34864 27 23 23
of course. :)
(suggestion, break it out, run each of the bits to see what they do. basically, columns is a view used to "pivot" on -- we needed to assign a column number to each FEETYPE by TASKTYPE. That is all that view does.
Then, we join that to test and "pivot" naturally.
Union all in the pivot of the column names....
and sort)
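The two steps described above -- number the distinct feetypes within each tasktype (the "columns" view), then drop each amount into its numbered slot -- can be sketched in plain Python. The dict-based helper is made up for illustration; the data is the test table retyped.

```python
rows = [  # (tasktype, ordernum, feetype, amount)
    ("DOC", "123123", "Product Fee", 15), ("DOC", "123123", "Copy Fee", 1),
    ("COS", "34864", "Setup Fee", 23), ("COS", "34864", "File Review Fee", 27),
    ("COS", "34864", "Statutory Fee", 23),
    ("DOC", "56432", "Product Fee", 80), ("DOC", "56432", "Prepayment", -16),
]

def pivot(rows, width=5):
    # step 1: rn per (tasktype, feetype), ordered by feetype
    # (this plays the role of the "columns" inline view)
    cols = {}
    for task in {r[0] for r in rows}:
        fees = sorted({r[2] for r in rows if r[0] == task})
        cols[task] = {fee: i for i, fee in enumerate(fees)}
    # step 2: one output row per (tasktype, ordernum), amounts in slots
    out = {}
    for task, order, fee, amt in rows:
        slots = out.setdefault((task, order), [None] * width)
        slots[cols[task][fee]] = amt
    return cols, out
```

For DOC the sorted feetypes are Copy Fee, Prepayment, Product Fee, so order 123123 pivots to [1, None, 15, None, None] -- the same placement the DECODE-on-rn query produces.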
RE: Analytics problem
Mark, April 08, 2005 - 1:27 pm UTC
Excellent! I'll definitely break it down to figure out exactly what you did. Thank you very much.
Re: another query using analytics
Gabe, April 08, 2005 - 3:27 pm UTC
You weren't given any resources, so I understand your solution was in fact merely an [untested] suggestion.
create table t1 ( x int primary key );
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
create table t2 ( x int not null references t1(x), y int not null, z int not null );
insert into t2 values ( 1,7000,1);
insert into t2 values ( 1,7000,6);
insert into t2 values ( 1,8000,8);
insert into t2 values ( 2,7000,1);
insert into t2 values ( 2,7000,5);
insert into t2 values ( 3,7000,3);
insert into t2 values ( 3,8000,1);
insert into t2 values ( 3,8000,7);
insert into t2 values ( 3,9000,5);
My solution (avoiding the distinct) is not necessarily better than the one presented by the A reader, but here it goes:
flip@FLOP> select x, y, c
2 ,min(f) over (partition by x) f
3 ,max(l) over (partition by x) l
4 from (
5 select t2.x, t2.y, count(*) c
6 ,min(t2.z) keep (dense_rank first order by t2.z) f
7 ,max(t2.z) keep (dense_rank last order by t2.z) l
8 from t1, t2
9 where t1.x = t2.x
10 group by t2.x, t2.y
11 ) t
12 ;
X Y C F L
---------- ---------- ---------- ---------- ----------
1 7000 2 1 8
1 8000 1 1 8
2 7000 2 1 5
3 7000 1 1 7
3 8000 2 1 7
3 9000 1 1 7
Cheers.
April 08, 2005 - 3:34 pm UTC
without create tables and inserts, I guess :)
takes too much time to create the setup for every case (wish people would read the page that they have to page down through to put something up here...)
I'm confused
Mikito, April 18, 2005 - 9:55 pm UTC
Given that
select distinct deptno
from emp
is essentially
select deptno
from emp
group by deptno
how should the distinct query be rewritten when analytic columns are involved? Neither
SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,min(sal) over (partition by deptno);
nor
SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,f;
seems to be valid syntax.
(To repeat: "Does analytics scale?")
April 19, 2005 - 7:22 am UTC
why would you use analytics that way?
Tell us the question, we'll tell you the method.
select deptno, count(*) /* because count(1) is counter-intuitive */,
min(sal) over (partition by deptno) f
from emp
group by deptno, min(sal) over (partition by deptno)
would not make sense. You are saying "get all deptnos, by deptno find the minimum salary and associate that number with each one, then aggregate by deptno/min salary to count records"
You should just ask:
find the minimum salary and count of records by deptno.
select deptno, count(*), min(sal) from emp group by deptno;
is what you were looking for. Analytics scale up wonderfully. Say the question was instead:
you have a table full of records that have a customer_id and a last_sale_date, I would like you to retrieve the last record for each customer.
select *
from ( select cust.*, max(sale_date) over (partition by cust_id) lsd
from cust )
where sale_date = lsd;
versus
select *
from cust
where sale_date =
(select max(sale_date) from cust c2 where cust_id = cust.cust_id )
/
or
select *
from cust, (select cust_id, max(sale_date) lsd from cust group by cust_id)x
where cust.cust_id = x.cust_id
and cust.sale_date = x.lsd
/
for example
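All three variants answer "latest row per key". As a minimal out-of-database sketch of that logic (Python, names my own; ties on the max date are kept, just as in the SQL):

```python
def last_per_customer(rows):
    # rows: iterable of (cust_id, sale_date, payload) tuples.
    # Mirrors: keep rows where sale_date = max(sale_date) over
    # (partition by cust_id) -- all rows tied on the max date survive.
    latest = {}
    for cust, sale_date, _payload in rows:
        if cust not in latest or sale_date > latest[cust]:
            latest[cust] = sale_date
    return [r for r in rows if r[1] == latest[r[0]]]
```

The analytic form makes a single pass to find each partition's max and a second pass to filter, which is exactly what the one-table-scan WINDOW SORT plan does.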
Tricky SQL?
A reader, April 19, 2005 - 10:29 am UTC
CREATE TABLE master
(
m_no INTEGER PRIMARY KEY,
m_name VARCHAR2(255) NOT NULL UNIQUE
);
create table detail
(
d_pk integer primary key,
d_no integer not null references master(m_no),
d_date date,
d_data varchar2(255)
);
Given a d_pk, how can I get the second-to-last (ordered by d_date) record from M for that M_NAME? In other words, for a given m_name, there are multiple records in "detail" with different dates. Given one of those records, I want the prior record in "detail" (there might not be any)
I tried to design a simple master detail table, but maybe I over-normalized?
Thanks
April 19, 2005 - 12:00 pm UTC
are you saying "i have a detail record, I want the detail record that came 'in front' of this one"?
that is what I sort of hear, but the second to last is confusing me.
select *
from (
select ...., lead(d_pk) over (order by d_date) next_pk
from master, detail
where master.m_no = (select d_no from detail where d_pk = :x)
and master.m_no = detail.d_no
)
where next_pk = :x;
I think that does it. You get the master/detail rows for that d_pk (inline view),
use lead to assign to each record the "next pk" after sorting by d_date,
and keep the record whose 'next' record's primary key was the one you wanted.
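The lead() trick above can be checked outside the database; a minimal Python sketch (function and tuple shapes are my own assumptions):

```python
def prior_detail(details, target_pk):
    # details: (d_pk, d_date) pairs for one master row, in any order.
    # Mirrors lead(d_pk) over (order by d_date): sort by date, then
    # keep the row whose "next" row's primary key is the target.
    ordered = sorted(details, key=lambda r: r[1])
    for i in range(len(ordered) - 1):
        if ordered[i + 1][0] == target_pk:
            return ordered[i]
    return None  # target was the earliest row (or not present)
```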
a little inconsistency
mikito, April 19, 2005 - 1:24 pm UTC
I meant inconsistency, not scalability. Why "distinct"
SELECT distinct deptno,
min(sal) over (partition by deptno) f
from emp
is allowed, whereas "group by" isn't? If someone has trouble understanding what analytics with "group by" means, the same should apply to analytics with "distinct" as well.
April 19, 2005 - 1:26 pm UTC
because group by is not distinct, they are frankly very different concepts.
detail and summary in one sql statement
A reader, April 27, 2005 - 3:02 pm UTC
hi tom,
quick question: I have to produce many detail rows (columns a - f) and one summary record (containing sum(column c) + count(*) over all records + some literal placeholders) within one SQL statement. Is there another way than the classical UNION ALL select? Any new way with analytic functions?
April 27, 2005 - 3:22 pm UTC
need small example, did not follow your example as stated.
detail and summary in one sql statement
A reader, April 28, 2005 - 10:08 am UTC
hi tom,
here is the small and simple test case to show what i mean.
SQL> create table t1 (col1 number primary key, col2 number, col3 number);
Table created.
SQL> create table t2 (col0 number primary key, col1 number references t1 (col1), col2 number, col3 number, col4 number);
Table created.
SQL> create index t2_col1 on t2 (col1);
Index created.
SQL> insert into t1 values (1, 1, 1);
1 row created.
SQL> insert into t2 values (1, 1, 1, 1, 1);
1 row created.
SQL> insert into t2 values (2, 1, 2, 2, 2);
1 row created.
SQL> insert into t2 values (3, 1, 3, 3, 3);
1 row created.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> analyze table t2 compute statistics;
Table analyzed.
SQL> select 0 rowtype, t1.col1 display1, t1.col2 display2, t2.col3 display3, t2.col4 display4
2 from t1 join t2 on (t1.col1 = t2.col1)
3 where t1.col1 = 1
4 UNION ALL
5 select 1 rowtype, t1.col1, count (*), null, sum (t2.col4)
6 from t1 join t2 on (t1.col1 = t2.col1)
7 where t1.col1 = 1
8 group by t1.col1
9* order by rowtype
ROWTYPE DISPLAY1 DISPLAY2 DISPLAY3 DISPLAY4
---------- ---------- ---------- ---------- ----------
0 1 1 1 1
0 1 1 2 2
0 1 1 3 3
1 1 3 6
that is creating detail + summary record within one sql statement!
April 28, 2005 - 10:18 am UTC
ops$tkyte@ORA10G> select grouping_id(t1.col2) rowtype,
2 t1.col1 d1,
3 decode( grouping_id(t1.col2), 0, t1.col2, count(*) ) d2,
4 decode( grouping_id(t1.col2), 0, t2.col3, null ) d3,
5 decode( grouping_id(t1.col2), 0, t2.col4, sum(t2.col4) ) d4
6 from t1, t2
7 where t1.col1 = t2.col1
8 group by grouping sets((t1.col1),(t1.col1,t1.col2,t2.col3,t2.col4))
9 /
ROWTYPE D1 D2 D3 D4
---------- ---------- ---------- ---------- ----------
0 1 1 1 1
0 1 1 2 2
0 1 1 3 3
1 1 3 6
detail and summary in one sql statement
A reader, April 29, 2005 - 10:05 am UTC
hi tom,
thanks for your help. That's exactly what I need. Analytics rock, analytics roll, as you said. :)
Unfortunately it is hard to grasp. :(
I looked in the documentation but cannot understand the GROUPING_ID values in the example. Could you please explain? What do the "2" and "3" mean in the grouping columns?
Examples
The following example shows how to extract grouping IDs from a query of the sample table sh.sales:
SELECT channel_id, promo_id, sum(amount_sold) s_sales,
GROUPING(channel_id) gc,
GROUPING(promo_id) gp,
GROUPING_ID(channel_id, promo_id) gcp,
GROUPING_ID(promo_id, channel_id) gpc
FROM sales
WHERE promo_id > 496
GROUP BY CUBE(channel_id, promo_id);
C PROMO_ID S_SALES GC GP GCP GPC
- ---------- ---------- ---------- ---------- ---------- ----------
C 497 26094.35 0 0 0 0
C 498 22272.4 0 0 0 0
C 499 19616.8 0 0 0 0
C 9999 87781668 0 0 0 0
C 87849651.6 0 1 1 2
I 497 50325.8 0 0 0 0
I 498 52215.4 0 0 0 0
I 499 58445.85 0 0 0 0
I 9999 169497409 0 0 0 0
I 169658396 0 1 1 2
P 497 31141.75 0 0 0 0
P 498 46942.8 0 0 0 0
P 499 24156 0 0 0 0
P 9999 70890248 0 0 0 0
P 70992488.6 0 1 1 2
S 497 110629.75 0 0 0 0
S 498 82937.25 0 0 0 0
S 499 80999.15 0 0 0 0
S 9999 267205791 0 0 0 0
S 267480357 0 1 1 2
T 497 8319.6 0 0 0 0
T 498 5347.65 0 0 0 0
T 499 19781 0 0 0 0
T 9999 28095689 0 0 0 0
T 28129137.3 0 1 1 2
497 226511.25 1 0 2 1
498 209715.5 1 0 2 1
499 202998.8 1 0 2 1
9999 623470805 1 0 2 1
624110031 1 1 3 3
April 29, 2005 - 10:21 am UTC
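The values follow mechanically from the definition: GROUPING_ID reads the GROUPING() bits of its arguments as a binary number, leftmost argument as the most significant bit. A small sketch (Python, function name my own):

```python
def grouping_id(*grouping_bits):
    # Each GROUPING() flag is 0 for a real value and 1 for a
    # superaggregate placeholder; GROUPING_ID simply concatenates
    # those flags into one binary number.
    value = 0
    for bit in grouping_bits:
        value = value * 2 + bit
    return value

# The "2" rows above: promo_id is aggregated (1), channel_id is real (0),
# so GROUPING_ID(promo_id, channel_id) = 1*2 + 0 = 2.
# The grand-total row aggregates both columns: 1*2 + 1 = 3.
```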
How to do this using Analytics
A reader, May 05, 2005 - 5:11 pm UTC
Hello Sir,
I have a denormalized table DEPT_EMP, part of which I have reproduced here. It has/will have duplicates.
I need to find all employees that belong to more than one dept, using analytics (I want to avoid a self join).
So the required output must be:
DEPTNO DNAME EMPNO ENAME
------ ---------- ----- --------------------
10 D10 1 E1
10 D10 1 E1
10 D10 2 E2
10 D10 2 E2
20 D20 1 E1
20 D20 1 E1
20 D20 2 E2
20 D20 2 E2
From the total set of :
SELECT * FROM DEPT_EMP ORDER BY DEPTNO ,EMPNO
DEPTNO DNAME EMPNO ENAME
------ ---------- ----- --------------------
10 D10 1 E1
10 D10 1 E1
10 D10 2 E2
10 D10 2 E2
10 D10 3 E3
10 D10 3 E3
20 D20 1 E1
20 D20 1 E1
20 D20 2 E2
20 D20 2 E2
20 D20 4 E4
20 D20 4 E4
20 D20 5 E5
20 D20 5 E5
14 rows selected
create table dept_emp (deptno number , dname varchar2(10) ,empno number ,ename varchar2(20) ) ;
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES (
20, 'D20', 2, 'E2');
COMMIT ;
Thanx
May 05, 2005 - 6:10 pm UTC
no analytics
select empno, count(distinct deptno)
from t
group by empno
having count(distinct deptno) > 1;
Thanx Sir
A reader, May 05, 2005 - 9:20 pm UTC
Actually I was planning to use analytics to get the whole row info; I will do the same trick with analytics, then.
You are a Genius.
May 06, 2005 - 7:17 am UTC
select *
from (
select t.*, count(distinct deptno) over (partition by empno) cnt
from t
)
where cnt > 1;
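Outside the database, the same "keep whole rows for keys seen in more than one group" logic looks like this (a minimal Python sketch, names assumed):

```python
from collections import defaultdict

def rows_in_multiple_depts(rows):
    # rows: (deptno, dname, empno, ename) tuples, duplicates allowed.
    # Mirrors: count(distinct deptno) over (partition by empno) > 1,
    # which keeps every row -- duplicates included -- for qualifying empnos.
    depts = defaultdict(set)
    for deptno, _dname, empno, _ename in rows:
        depts[empno].add(deptno)
    return [r for r in rows if len(depts[r[2]]) > 1]
```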
Analytical solution
Baiju Menon, May 10, 2005 - 6:29 am UTC
Sir,
I want to list the department with the maximum number of employees working in it, using an analytic function (only the department in which the maximum number of employees are working).
the query without the Analytical function is
select deptno, count(deptno) from emp group by deptno having count(deptno) in (select max(count(deptno)) from emp group by deptno)
Thanks
May 10, 2005 - 9:15 am UTC
1 select deptno, cnt
2 from (
3 select deptno, cnt, max(cnt) over() max_cnt
4 from (
5 select deptno, count(*) cnt
6 from emp
7 group by deptno
8 )
9 )
10* where cnt = max_cnt
scott@ORA9IR2> /
DEPTNO CNT
---------- ----------
30 6
group by
Anoop Gupta, May 11, 2005 - 4:15 am UTC
Hi Tom,
I have a table whose data is like this:
empid leavelname
1001 Level1
1001 Level2
1001 Level3
1001 Level4
1002 Level1
1002 Level2
1002 Level3
...
...
This table tells which levels an employee is assigned to.
Is there any query possible that will return data like this, without writing a function?
empid emp_assigned on leavel
1001 level1,level2,level3,level4
1002 level1,level2,level3
...
...
Waiting for your response.....
May 11, 2005 - 7:30 am UTC
only if there is some reasonable maximum number of levelname rows per empid.
is there?
Analytics Rock - But why are they slower for me
Jeff Plumb, May 13, 2005 - 1:00 am UTC
Hi Tom,
I have followed your example about analytics from Effective Oracle by Design, page 516 ("Find a specific row in a partition"). When I run the example and tkprof the 3 different queries, the analytic version actually takes a lot longer to run, though it does fewer logical I/Os. It does a lot more physical I/O, so I am guessing it is using a temporary segment on disk to perform the window sort. To perform the test I created the big_table that you use and populated it with 1,000,000 rows. I am using Oracle 9i Release 2. Here is the output from TKPROF:
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 5.32 6.42 13815 14669 0 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 5.32 6.42 13815 14669 0 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select t.owner, t.object_name, t.created
from big_table t
join (select owner, max(created) maxcreated
from big_table
group by owner) t2
on (t2.owner = t.owner and t2.maxcreated = t.created)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 5.03 5.06 13816 14669 0 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 5.03 5.06 13816 14669 0 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 8 16.68 40.66 15157 7331 17 694
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 10 16.68 40.66 15157 7331 17 694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows Row Source Operation
------- ---------------------------------------------------
694 VIEW
1000000 WINDOW SORT
1000000 TABLE ACCESS FULL BIG_TABLE
********************************************************************************
And when I run the query with the analytics using autotrace I get the following which shows a sort to disk:
SQL*Plus: Release 9.2.0.6.0 - Production on Fri May 13 14:53:08 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production
control@DWDEV> set autot traceonly
control@DWDEV> select owner, object_name, created
2 from
3 ( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
4 from big_table
5 )
6 where created = maxcreated;
694 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4399 Card=1000000 Bytes=52000000)
1 0 VIEW (Cost=4399 Card=1000000 Bytes=52000000)
2 1 WINDOW (SORT) (Cost=4399 Card=1000000 Bytes=43000000)
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=637 Card=1000000 Bytes=43000000)
Statistics
----------------------------------------------------------
0 recursive calls
17 db block gets
7331 consistent gets
15348 physical reads
432 redo size
12784 bytes sent via SQL*Net to client
717 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
694 rows processed
So how can I stop the sorts (disk)? I am guessing that the pga_aggregate_target needs to be higher, but it seems to already be set quite high.
control@DWDEV> show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 524288000
I hope you can help clarify how to make the analytic version run quicker.
Thanks.
May 13, 2005 - 9:50 am UTC
it'll be a function of the number of "owners" here
You have 1,000,000 records.
You have but 20 users.
in this extreme case, having 50,000 records per window and swapping out was not as good as squashing the data down to 20 records and joining -- the CBO quite smartly rewrote:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
as
select ...
from big_table t, (select owner,max(created) created from big_table t2 ...)
where ....
So, does the data you analyze to find the "most current record" tend to have 50,000 records/key in real life?
In your case, your hash table didn't spill to disk. In real life though, the numbers would probably be much different. a 1,000,000 row table would have keys with 10 or 100 rows maybe, not 50,000 (in general). There you would find the answer to be very different.
And if you let the sort run in memory it would be different as well -- you would get a max of 25m given your pga aggregate target setting that may have been too small.
but consider what happens when the size of the "aggregate" goes up; diminishing marginal returns set in:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 2.06 2.01 26970 29283 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 2.06 2.01 26970 29283 0 4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 4.57 10.05 30603 14484 15 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 4.57 10.05 30603 14484 15 4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.70 12.04 33787 45393 2 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.71 12.05 33787 45393 2 1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.00 9.60 9336 14484 2 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.00 9.60 9336 14484 2 1000000
and, given sufficient space to work "in memory", these two big queries both benefited:
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 1.82 1.96 9909 29283 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 1.83 1.97 9909 29283 0 4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 320 2.15 2.11 2858 14484 0 4775
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 322 2.15 2.11 2858 14484 0 4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 7.64 7.55 10181 94633 0 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 7.65 7.56 10181 94633 0 1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 66668 5.69 5.49 2699 14484 0 1000000
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 66670 5.69 5.49 2699 14484 0 1000000
(This was a dual CPU Xeon using 'nonparallel' query in this case, once with a 256MB pga_aggregate_target and again with a 2GB one.)
kuldeep, May 14, 2005 - 3:27 am UTC
Dear Tom,
I have three tables t1, t2 & t3, where t2 & t3 are each joined to t1 on column "key_id".
Now I need the sum of key_val (amount) of t2 and the sum of key_val (amount) of t3 for each key_id
in table t1.
kuldeep@dlfscg> select * from t1;
KEY_ID KEY_VAL
---------- ----------
2 1980
1 1975
kuldeep@dlfscg> select * from t2;
KEY_ID KEY_VAL
---------- ----------
2 550
2 575
1 500
kuldeep@dlfscg> select * from t3;
KEY_ID KEY_VAL
---------- ----------
2 900
1 1000
1 750
***** QUERY 1 *****
kuldeep@dlfscg> SELECT t1.key_id, SUM(t2.key_val) sum_t2_key_val, SUM(t3.key_val) sum_t3_key_val
2 FROM t1, t2, t3
3 WHERE t1.key_id=t2.key_id
4 AND t1.key_id=t3.key_id
5 GROUP BY t1.key_id
6 /
KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
1 1000 1750
2 1125 1800
***** QUERY 2 *****
kuldeep@dlfscg> SELECT t1.key_id, t2.sum_t2_key_val, t3.sum_t3_key_val
2 FROM t1,
3 (SELECT key_id, SUM(key_val) sum_t2_key_val FROM t2 GROUP BY key_id) t2,
4 (SELECT key_id, SUM(key_val) sum_t3_key_val FROM t3 GROUP BY key_id) t3
5 WHERE t1.key_id=t2.key_id
6 AND t1.key_id=t3.key_id
7 /
KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
1 500 1750
2 1125 900
Query 1 is giving the wrong result, and I cannot use query 2, whose performance is very poor.
Oracle 9i has added a lot of new grouping features and a lot of analytic functions (all going over my head).
Is there any "special" sum function, or some way to pick a value only once per row (or per query key, here "key_id"),
irrespective of how many times it appears in the query result?
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1000
1 500 750 <---- 500 of t2 should not be counted again, it is a repeat
2 550 900
2 575 900 <---- 900 of t3 should not be counted again, it is a repeat
thanks and regards,
May 14, 2005 - 9:36 am UTC
select t1.key_id, t2.sum_val, t3.sum_val
from t1,
(select key_id, sum(key_val) sum_val from t2 group by key_id ) t2,
(select key_id, sum(key_val) sum_val from t3 group by key_id ) t3
WHERE t1.key_id=t2.key_id
AND t1.key_id=t3.key_id
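The reason query 1 was wrong is the join fan-out: each t2 row repeats once per matching t3 row (and vice versa), so a plain SUM counts each value multiple times. A small sketch with the sample data from the question:

```python
# Sample data keyed by key_id, taken from the question above.
t2 = {1: [500], 2: [550, 575]}
t3 = {1: [1000, 750], 2: [900]}

def fanned_out_sum(a, b, key):
    # SUM over the three-way join: every a-value appears once per
    # b-row with the same key, inflating the total.
    return sum(v for v in a[key] for _ in b[key])

def correct_sum(a, key):
    # Pre-aggregating before the join (query 2 / Tom's answer)
    # counts each value exactly once.
    return sum(a[key])
```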
apply an amount across multiple records
Dave, May 15, 2005 - 8:17 pm UTC
I have a problem similar to what I call the invoice payment problem.
It would seem to be a common problem, but I have searched to no avail.
The idea is that a customer may have many outstanding invoices, and sends in a check for an arbitrary amount. So we need to apply the money across the invoices oldest first.
Note that in my specific case, if a payment exceeds the total outstanding, the excess is ignored (obviously not dealing with real money here!)
create table invoices (
cust_nbr integer not null,
invoice_nbr integer not null,
invoice_amt number not null,
payment_amt number not null,
primary key (cust_nbr, invoice_nbr)
);
begin
delete from invoices;
dbms_random.seed(123456789);
for c in 1 .. 2 loop
for i in 1 .. 3 loop
insert into invoices values (c, i, round(dbms_random.value * 10, 2)+1, 0);
end loop;
end loop;
update invoices
set payment_amt = round(dbms_random.value * invoice_amt, 2)
where invoice_nbr = 1;
commit;
end;
/
select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
invoice_amt - payment_amt outstanding_amt
from invoices
where invoice_amt - payment_amt > 0
order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 5.55 3.89
1 2 3.21 0 3.21
1 3 2.78 0 2.78
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
variable cust_nbr number;
variable received_amt number;
begin
:cust_nbr := 1;
:received_amt := 7.25;
end;
/
update invoices i1
set payment_amt = (... some query which applies
:received_amt to outstanding_amt ...)
where cust_nbr = :cust_nbr;
result should be:
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 9.44 0
1 2 3.21 3.21 0
1 3 2.78 .15 2.63
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
This is simple to solve in pl/sql with a cursor, but I thought it would be a good test for a set-based solution with analytics. But after some effort, I'm stumped.
May 16, 2005 - 7:37 am UTC
Using analytics we can see how to apply the inputs:
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
2 least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
3 from (
4 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
5 invoice_amt - payment_amt outstanding_amt,
6 sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
7 from invoices
8 where cust_nbr = :cust_nbr
9 )
10 order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT AMOUNT_TO_APPLY
---------- ----------- ----------- ----------- ---------------
1 1 9.44 5.55 3.89
1 2 3.21 0 3.21
1 3 2.78 0 .15
Just needed a running total of outstanding amounts to take away from the received amount....
Then, merge:
ops$tkyte@ORA9IR2> merge into invoices
2 using
3 (
4 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
5 least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
6 from (
7 select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
8 invoice_amt - payment_amt outstanding_amt,
9 sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
10 from invoices
11 where cust_nbr = :cust_nbr
12 )
13 ) x
14 on ( invoices.cust_nbr = x.cust_nbr and invoices.invoice_nbr = x.invoice_nbr )
15 when matched then update set payment_amt = nvl(payment_amt,0)+x.amount_to_apply
16 when not matched /* never happens... */ then insert (cust_nbr) values (null);
3 rows merged.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
2 invoice_amt - payment_amt outstanding_amt
3 from invoices
4 order by cust_nbr, invoice_nbr;
CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
1 1 9.44 9.44 0
1 2 3.21 3.21 0
1 3 2.78 .15 2.63
2 1 7.57 4.3 3.27
2 2 9.46 0 9.46
2 3 5.92 0 5.92
6 rows selected.
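The least/greatest arithmetic in the merge can be checked outside SQL. A minimal Python sketch of the same running-total logic (rounding to cents is my own assumption):

```python
def apply_payment(received, outstanding):
    # outstanding: amounts in oldest-first order. For each invoice,
    # mirror least(greatest(received - running_total + amt, 0), amt):
    # pay it in full while money remains, then a partial amount,
    # then zero; any excess over the total is simply ignored.
    running_total, applied = 0.0, []
    for amt in outstanding:
        running_total += amt
        portion = min(max(received - running_total + amt, 0.0), amt)
        applied.append(round(portion, 2))
    return applied
```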
Group by
Anoop Gupta, May 16, 2005 - 10:06 am UTC
Reviewer: Anoop Gupta from INDIA
Hi Tom,
As I asked in my question above:
I have a table whose data is like this:
empid leavelname
1001 Level1
1001 Level2
1001 Level3
1001 Level4
1002 Level1
1002 Level2
1002 Level3
...
...
This table tells which levels an employee is assigned to.
Is there any query possible that will return data like this, without writing a
function?
empid emp_assigned on leavel
1001 level1,level2,level3,level4
1002 level1,level2,level3
...
...
Please show me how to write the query, supposing we have a limit of 50 levels per employee.
Please reply....
May 16, 2005 - 1:09 pm UTC
select empid,
rtrim(
max(decode(rn,1,leavelname)) || ',' ||
max(decode(rn,2,leavelname)) || ',' ||
....
max(decode(rn,50,leavelname)), ',' )
from (select empid,
row_number() over (partition by empid order by leavelname) rn,
leavelname
from t
)
group by empid;
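Outside SQL, the same pivot-and-concatenate (with the 50-level cap from the question as the assumed limit) looks like this:

```python
from collections import defaultdict

def concat_levels(rows, max_levels=50):
    # rows: (empid, levelname) pairs. Mirrors the decode pivot:
    # number the levels per empid (ordered by name), keep at most
    # max_levels of them, and join with commas.
    by_emp = defaultdict(list)
    for empid, level in rows:
        by_emp[empid].append(level)
    return {e: ",".join(sorted(v)[:max_levels]) for e, v in by_emp.items()}
```

The SQL needs a fixed upper bound because it spells out one decode per slot; the sketch makes that bound an explicit parameter.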
special sum
kuldeep, May 17, 2005 - 12:38 am UTC
Dear Tom,
Thanks for your response and for this useful site.
I was looking for a solution that could avoid the inline views that were making my query run slow. I tried and came up with this query:
/* DATA VIEW */
kuldeep@dlfscg> SELECT t1.key_id,
2 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
3 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
4 FROM t1, t2, t3
5 WHERE t1.key_id=t2.key_id
6 AND t1.key_id=t3.key_id
7 ORDER BY t1.key_id
8 /
KEY_ID T2_ROWID T2_RN KEY_VAL T3_ROWID T3_RN KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
1 AAANZ5AAHAAAD94AAA 1 500 AAANZ4AAHAAAD9wAAA 1 1000
1 AAANZ5AAHAAAD94AAA 2 500 AAANZ4AAHAAAD9wAAB 1 750
2 AAANZ5AAHAAAD91AAA 1 550 AAANZ4AAHAAAD9tAAA 1 900
2 AAANZ5AAHAAAD91AAB 1 575 AAANZ4AAHAAAD9tAAA 2 900
/* FINAL QUERY */
kuldeep@dlfscg> SELECT key_id,
2 SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
3 SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
4 FROM (SELECT t1.key_id,
5 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
6 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
7 FROM t1, t2, t3
8 WHERE t1.key_id=t2.key_id
9 AND t1.key_id=t3.key_id)
10 GROUP BY key_id
11 /
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1750
2 1125 900
regards,
May 17, 2005 - 8:23 am UTC
One would need more information -- it APPEARS that you are trying to get a "random first hit" from T2 and T3 by T1.key_id.
That is, for every row in T1, find the first match (any match will do) in T2 and in T3
and report that value.
Is that correct?
And how big are t1, t2 and t3 -- and how long is "slow"?
group by
Anoop Gupta, May 17, 2005 - 9:42 am UTC
Tom,
Thanks for your prompt response.
Analytical Problem
Imran, May 18, 2005 - 4:16 am UTC
Look at the following two queries.
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17 WHERE phone IN (7629458));
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458 Apr, 2005 2562.52 5265 5265 UP 0
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17 WHERE phone IN (7629459));
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629459 Apr, 2005 3516.62 7834 7834 UP 0
SQL>
Now, when I combine the two queries, the results are different.
1 SELECT phone, MONTH, arrears, this_month, ABS (up_down),
2 CASE
3 WHEN up_down < 0
4 THEN 'DOWN'
5 WHEN up_down > 0
6 THEN 'UP'
7 ELSE 'BALANCE'
8 END CASE,
9 prev_month
10 FROM (SELECT exch || ' - ' || phone phone,
11 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
12 instdate, paybefdue this_month, arrears,
13 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
14 paybefdue
15 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
16 FROM ptc
17* WHERE phone IN (7629458,7629459))
SQL> /
PHONE MONTH ARREARS THIS_MONTH ABS(UP_DOWN) CASE PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458 Apr, 2005 2562.52 5265 2569 DOWN 7834
202 - 7629459 Apr, 2005 3516.62 7834 7834 UP 0
So you can see that the previous-month balance is now badly disturbed.
Please tell me how to do this.
May 18, 2005 - 8:58 am UTC
need a test case: create table, inserts (like the page used to submit this said....)
Use of analytic functions in UPDATE statements
Bob Lyon, May 18, 2005 - 12:29 pm UTC
Tom,
-- Given this sample data
CREATE TABLE GT (
XP_ID INTEGER,
OFFSET INTEGER,
PMAX NUMBER,
PRIOR_PMAX NUMBER
);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 1, 3);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 2, 8);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 155, 3, 5);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 173, 3, 7.3);
-- I want to update the table and set the PRIOR_PMAX column values to be as follows
SELECT XP_ID, OFFSET, PMAX,
LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
ORDER BY XP_ID, OFFSET) PRIOR_PMAX
FROM GT
ORDER BY XP_ID, OFFSET;
XP_ID OFFSET PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
123 1 3
123 2 8 3
155 3 5
173 3 7.3
-- My update to do this tells me "4 rows updated.", but does not do what I want.
UPDATE GT A
SET PRIOR_PMAX = (
SELECT LAG(B.PMAX, 1, NULL) OVER (PARTITION BY B.XP_ID
ORDER BY B.XP_ID, B.OFFSET) PRIOR_PMAX
FROM GT B
WHERE A.ROWID = B.ROWID
);
-- but I get
SELECT xp_id, offset, pmax, prior_pmax
FROM GT
ORDER BY xp_id, offset;
XP_ID OFFSET PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
123 1 3
123 2 8
155 3 5
173 3 7.3
-- Oracle doc states
-- "Therefore, analytic functions can appear only in the select list or ORDER BY clause."
-- which is perhaps a little ambiguous in this case.
-- Is there a way to do this update in "straight SQL"?
May 18, 2005 - 12:54 pm UTC
you can merge
merge into gt a
using ( SELECT rowid rid, XP_ID, OFFSET, PMAX,
LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
ORDER BY XP_ID, OFFSET) PRIOR_PMAX
FROM GT )b
on (a.rowid = b.rowid)
when matched then update ...
when not matched (never happens, just do a dummy insert of a single null in 9i or leave off entirely in 10g)
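The pattern here -- compute LAG per partition, then write it back row by row -- can be illustrated outside the database. A minimal Python sketch using the GT sample data above (the MERGE-by-rowid mechanics themselves are Oracle-specific):

```python
# Rows mirror the GT sample above: (xp_id, offset, pmax)
gt = [(123, 1, 3), (123, 2, 8), (155, 3, 5), (173, 3, 7.3)]

def with_prior_pmax(rows):
    """Emulate LAG(pmax) OVER (PARTITION BY xp_id ORDER BY offset):
    each row gets the previous pmax within its xp_id, None for the first."""
    prev = {}     # xp_id -> last pmax seen, in offset order
    result = []
    for xp_id, offset, pmax in sorted(rows):
        result.append((xp_id, offset, pmax, prev.get(xp_id)))
        prev[xp_id] = pmax
    return result

for row in with_prior_pmax(gt):
    print(row)
```

Only the second row of xp_id 123 gets a non-null prior_pmax, matching the desired SELECT output above.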
special sum
Kuldeep, May 19, 2005 - 1:09 am UTC
My requirement was like this: I have receivables (bills, debit notes etc.) which I adjust against received payments and credit notes (both are in separate tables). To know the outstanding amount I was joining (outer join) my receivables with payments and credit notes.
Because one receivable can be adjusted against many payments and credit notes, the outstanding amount was calculated like this:
outstanding = receivable amount - sum(payment amount) - sum(credit note amount)
This simple query using an outer join was giving the wrong result if a receivable was adjusted against one payment and more than one credit note, or vice versa.
In this case, where
receivable: 1000    payment: 400    CNs: 400, 200
the joined rows appear as
1000  400  400
1000  400  200
      ---  ---
      800  600    outstanding = 1000 - 800 - 600 = -400 (wrong)
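That double counting comes from join fan-out; a tiny Python sketch with the numbers from this example makes it concrete:

```python
receivable = 1000
payments = [400]
credit_notes = [400, 200]

# Joining one receivable to 1 payment and 2 credit notes yields 2 rows;
# a naive SUM over the joined rows repeats the single payment.
joined = [(p, c) for p in payments for c in credit_notes]
naive = receivable - sum(p for p, _ in joined) - sum(c for _, c in joined)
correct = receivable - sum(payments) - sum(credit_notes)
print(naive, correct)  # -400 0
```

The naive total repeats the 400 payment once per credit-note row, giving -400 instead of the true outstanding of 0.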
My t1, t2 and t3 have 600,000, 350,000 and 80,000 rows respectively.
This is my actual inline view query
-----------------------------------
SELECT a.bill_type, a.bill_exact_type, a.period_id,
a.scheme_id, a.property_number, a.bill_number,
a.bill_amount, SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) adj_amt,
NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) pending_amt
FROM ALL_RECEIVABLE a,
(SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adj_amt) adj_amt
FROM CREDIT_NOTE_RECEIVABLE
WHERE bill_type=p_bill_type
AND scheme_id=p_scheme
AND property_number=p_prop
GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) c,
(SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adjust_amount) adjust_amount
FROM PAYMENT_RECEIPT_ADJ
WHERE bill_type=p_bill_type
AND scheme_id=p_scheme
AND property_number=p_prop
GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) p
WHERE a.bill_type=P_BILL_TYPE
AND a.scheme_id=P_SCHEME
AND a.property_number=P_PROP
AND a.bill_type=c.bill_type(+)
AND a.bill_exact_type=c.bill_exact_type(+)
AND a.period_id=c.period_id(+)
AND a.scheme_id=c.scheme_id(+)
AND a.property_number=c.property_number(+)
AND a.bill_number=c.bill_number(+)
AND a.bill_type=p.bill_type(+)
AND a.bill_exact_type=p.bill_exact_type(+)
AND a.period_id=p.period_id(+)
AND a.scheme_id=p.scheme_id(+)
AND a.property_number=p.property_number(+)
AND a.bill_number=p.bill_number(+)
GROUP BY a.bill_type, a.bill_exact_type, a.period_id, a.scheme_id,
a.property_number, a.bill_number, a.bill_date, a.bill_amount
HAVING (NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0))) > 0
ORDER BY a.bill_date;
-----------------------------------
It is not about reporting just the first hit of t1 in t2 and t3. In my last posting I was trying to exclude any repeat of a t2 or t3 row from the sum calculation; that is, each row of t2 and t3 should be counted only once.
I have tried this query with more rows and applied the same approach to the actual query; it works fine and gives the same result the previous inline-view query was giving.
kuldeep@dlfscg> SELECT t1.key_id,
2 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
3 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
4 FROM t1, t2, t3
5 WHERE t1.key_id=t2.key_id(+)
6 AND t1.key_id=t3.key_id(+)
7 ORDER BY t1.key_id
8 /
KEY_ID T2_ROWID T2_RN KEY_VAL T3_ROWID T3_RN KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
1 AAANZ5AAHAAAD94AAA 1 500 AAANZ4AAHAAAD9wAAA 1 1000
1 AAANZ5AAHAAAD94AAA 2 500 AAANZ4AAHAAAD9wAAB 1 750
1 AAANZ5AAHAAAD94AAA 3 500 AAANZ4AAHAAAD9wAAC 1 25
2 AAANZ5AAHAAAD91AAA 1 550 AAANZ4AAHAAAD9tAAA 1 900
2 AAANZ5AAHAAAD91AAB 1 575 AAANZ4AAHAAAD9tAAA 2 900
3 AAANZ5AAHAAAD91AAC 1 222 1
3 AAANZ5AAHAAAD91AAD 1 223 2
4 1 AAANZ4AAHAAAD9tAAB 1 333
8 rows selected.
kuldeep@dlfscg> SELECT key_id,
2 SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
3 SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
4 FROM (SELECT t1.key_id,
5 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
6 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
7 FROM t1, t2, t3
8 WHERE t1.key_id=t2.key_id(+)
9 AND t1.key_id=t3.key_id(+))
10 GROUP BY key_id
11 /
KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1775
2 1125 900
3 445 0
4 333
kuldeep@dlfscg>
thanks for your responses.
regards,
May 19, 2005 - 7:47 am UTC
do not order by rowid to get a last row -- is that what you are trying to do??
which row do you want to get from t2 to join with t1
and which row do you want to get from t3 to join with t1
You must specify that based on attributes you manage (eg: there must be an orderable field that helps you determine WHICH record is the right one)
Consider rowid to be a random value that has no meaning when ordered by; it does not imply order of insertion or anything else.
null record
yeshk, May 25, 2005 - 4:12 pm UTC
I need help with this query - This is just a part of the query I am working with.
I am not able to generate a NULL RECORD in between the result set.
I should be able to pass this information out as a reference cursor.
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);
insert into test values('IL','DSL',200,300);
insert into test values('CA','DSL',100,300);
insert into test values('MO','NDS',1000,300);
insert into test values('MO','DSL',100,200);
I need a result like this
STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300
CA NDS 100 200
TOTAL 300 500
IL DSL 200 300
TOTAL 200 300
MO DSL 100 200
MO NDS 1000 300
TOTAL 1100 500
I am able to generate the result using a query with analytics, but I don't know how to get an empty row after each state total.
Also, which is better:
1) a cursor based on state,
2) getting the data and inserting it into a temporary table,
3) inserting a null record,
or using analytics to get the complete data and putting it into a reference cursor?
Thanks
yeshk
May 25, 2005 - 7:57 pm UTC
well, that would sort of be the job of the "pretty printing routine" -- eg: the report generator?
what tool is printing this out?
null record
yeshk, May 26, 2005 - 9:20 am UTC
We need to give the result set, with a null record after each state's totals, to a front-end VB application. It will be given in a reference cursor; they will just select * from the reference cursor and display it on a report.
May 26, 2005 - 10:02 am UTC
the VB application should do this (it should be able to do something, shouldn't it...)
ops$tkyte@ORA9IR2> select decode( grp, 0, state ) state,
2 decode( grp, 0, svc_cat) svc_cat,
3 decode( grp, 0, sum_mt ) sum_mt,
4 decode( grp, 0, sum_nmt ) sum_nmt
5 from (
6 select grouping(dummy) grp, state, svc_cat, sum(measd_tkt) sum_mt, sum(non_measd_tkt) sum_nmt
7 from (
8 select state, svc_cat, 1 dummy, measd_tkt, non_measd_tkt
9 from test
10 )
11 group by rollup( state, dummy, svc_cat )
12 )
13 /
ST SVC SUM_MT SUM_NMT
-- --- ---------- ----------
CA DSL 100 300
CA NDS 100 200
CA 200 500
IL DSL 200 300
IL 200 300
MO DSL 100 200
MO NDS 1000 300
MO 1100 500
12 rows selected.
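The shape that query produces -- detail rows, a subtotal, then a blanked-out separator row per state -- can be sketched in plain Python (here the subtotal and separator are emitted explicitly; in Tom's SQL, the extra ROLLUP level over the dummy column plus GROUPING() does the same job):

```python
from itertools import groupby

data = [('CA', 'DSL', 100, 300), ('CA', 'NDS', 100, 200),
        ('IL', 'DSL', 200, 300),
        ('MO', 'DSL', 100, 200), ('MO', 'NDS', 1000, 300)]

def report(rows):
    """Detail rows per state, then a subtotal row, then an all-None
    separator row -- the shape the extra ROLLUP level produces."""
    out = []
    for state, grp in groupby(sorted(rows), key=lambda r: r[0]):
        grp = list(grp)
        out.extend(grp)
        out.append((state, None,
                    sum(r[2] for r in grp), sum(r[3] for r in grp)))
        out.append((None, None, None, None))   # blank separator row
    return out

for row in report(data):
    print(row)
```

For CA the subtotal row comes out as ('CA', None, 200, 500), matching the rollup output above.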
Can rollup do the thing??
Bhavesh Ghodasara, May 26, 2005 - 9:39 am UTC
Hi yeshk,
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt
number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);........
insert into test values('CA','DSL',100,300);....
STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300 <== where does measd_tkt=200 come from?
CA NDS 100 200
TOTAL 300 500
Tom, can we do it like this?
break on state
select STATE,SVC_CAT,sum(measd_tkt),sum(non_measd_tkt)
from test
group by rollup(STATE,SVC_CAT)
order by state
............
If I have made any mistake, please tell me.
Thanks in advance.
May 26, 2005 - 10:19 am UTC
see above
Which analytics to use?
Marc-Andre Larochelle, May 30, 2005 - 9:10 pm UTC
Hi Tom,
I have this 3rd party table:
drop table t;
create table t (atype varchar2(4),
acol# varchar2(3),
adin varchar2(8),
ares varchar2(8));
insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
insert into t (atype, acol#, ares) values ('MACT','1','02246569');
insert into t (atype, acol#, ares) values ('MACT','6','02246569');
insert into t (atype, acol#, ares) values ('MACT','7','00021474');
select * from t;
ATYPE ACOL# ADIN ARES
----- ----- -------- --------
DUPT 001 02246569
DUPT 002 00021474
DUPT 003 02246569
MACT 1 02246569
MACT 6 02246569
MACT 7 00021474
I would like to get the following result :
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569
I need to match DUPT.adin=MACT.ares together, but making sure MACT.acol# is different for every DUPT.acol#. Basically this table has different values in each column depending on the type of row (atype).
I have tried using lag, lead and rank, and nothing seems to work, but I am pretty sure it is doable with analytics, which is why I posted my question here.
Any hint/help would be appreciated.
Thank you,
Marc-Andre
May 31, 2005 - 7:30 am UTC
question for you.
How did you know to put:
DUPT 001 02246569 together with MACT 1 02246569 and
DUPT 003 02246569 together with MACT 6 02246569
and not
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
for example. some missing logic here.
Am I Correct??
Bhavesh Ghodasara, May 31, 2005 - 5:15 am UTC
Hi Tom,
I solved the above problem.
The query is like:
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol# bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
count(*) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol<>lb
What I think is that there must be a better way...
I know you will do it in a much better way.
Please suggest corrections.
Thanks in Advance..
May 31, 2005 - 8:17 am UTC
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 1 02246569
well, it gives a different result than the one you posted, it gives my hypothetical answer -- where 001 was combined with 6, not 1.
We can do this..
Bhavesh Ghodasara, May 31, 2005 - 8:28 am UTC
Hi Tom,
I have further modified my query;
now it gives the desired result.
(Agreed that the question is ambiguous.)
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
min(b.acol#) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol=lb
or cnt>1
OUTPUT:
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569
So any corrections now??
Thanks in advance
Bhavesh
May 31, 2005 - 8:43 am UTC
I don't know your data well enough, but your query is non-deterministic if you care. Consider:
ops$tkyte@ORA10G> create table t (atype varchar2(4),
2 acol# varchar2(3),
3 adin varchar2(8),
4 ares varchar2(8));
Table created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
1 row created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
2 from (
3 select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
4 bacol,b.ares bares,
5 nvl(lead(b.acol# ) over(order by a.adin),0) lb,
6 min(b.acol#) over(partition by a.acol#) cnt
7 from t a,t b
8 where a.adin=b.ares
9 order by atyp,acol) t
10 where bacol=lb
11 or cnt>1;
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 002 00021474 MACT 7 00021474
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> truncate table t;
Table truncated.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
1 row created.
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
1 row created.
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
2 from (
3 select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
4 bacol,b.ares bares,
5 nvl(lead(b.acol# ) over(order by a.adin),0) lb,
6 min(b.acol#) over(partition by a.acol#) cnt
7 from t a,t b
8 where a.adin=b.ares
9 order by atyp,acol) t
10 where bacol=lb
11 or cnt>1;
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6 02246569
DUPT 002 00021474 MACT 7 00021474
Same data both times, just different order of insertions. With analytics and order by, you need to be concerned about duplicates.
Answers
Marc-Andre Larochelle, May 31, 2005 - 11:35 am UTC
Tom, Bhavesh,
The problem resides exactly there: there is no logic to match the records. I know that DUPT.din1 must have a MACT.din1 somewhere; I just don't know which one (the 1st, the 2nd?). This is a decision I will have to make.
DUPT 001 02246569 MACT 1 02246569
DUPT 003 02246569 MACT 6 02246569
and
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
are the same to me. But when I run the query, I want to always get the same results.
Anyways, all in all, your queries (Bhavesh - thank you - and yours) seem to answer to my question. I will watch out for duplicates.
Thank you very much for the quick help.
Marc-Andre
What I found
Marc-Andre Larochelle, May 31, 2005 - 5:02 pm UTC
Hi Tom,
Testing the SQL statement Bhavesh provided, I quickly discovered what you meant when saying the query was non-deterministic. When I added a 4th record :
insert into t (atype,acol#,adin) values ('DUPT','004','02246569');
insert into t (atype,acol#,ares) values ('MACT','5','02246569');
only one row was returned. I played with the query and here is what I came up with :
select atyp,acol,aadin,batype,bacol,bares
from (
select atyp,acol,aadin,batype,bacol,bares,drnk ,
rank() over (partition by acol order by bacol) rnk
from (
select a.atype atyp,
a.acol# acol,
a.adin aadin,
b.atype batype,
b.acol# bacol,
b.ares bares,
dense_rank() over (partition by a.atype,a.adin order by a.acol#) drnk
from t a,t b
where a.adin=b.ares))
where drnk=rnk;
Feel free to comment.
Again thank you (and Bhavesh).
Marc-Andre
Using Analytical Values to find latest info
anirudh, June 03, 2005 - 10:41 am UTC
Hi Tom,
We have a fairly large table with about 100 million rows; among others, this table has
the following columns:
CREATE TABLE my_fact_table (
staff_number VARCHAR2 (10), -- staff number
per_end_dt DATE, -- last day of month
engagement_code VARCHAR2 (30), -- engagement code
client_code VARCHAR2 (20), -- client code
revenue NUMBER (15,2) -- revenue
)
In this table the same engagement code can have different client codes for different periods. This was at one point desirable, and that is the reason the client code was stored in this fact table instead of the engagement dimension.
Our users now want us to update the client code in these transactions to the latest value of the client code (meaning - pick the client from the latest month for which we have got any transactions for that engagement)
This situation where same engagement has multiple clients across periods is there for about 5 % of the rows.
[btw - we do plan to do data-model change to reflect the new relationships - but that may take some time - hence the interim need to just update the fact table]
To implement these updates, which may happen for several months, I'm trying to take the approach below,
which involves multiple queries and the creation of a couple of temp tables. Does it seem reasonable? I have a lurking feeling that with a deeper understanding of analytic functions this can be further simplified. I will appreciate your thoughts.
============= My Approach =================
-- Find the Engagements that have multiple Clients
CREATE TABLE amtest_mult_cli AS
WITH
v1 AS (SELECT DISTINCT engagement_code,client_code
FROM my_fact_table)
SELECT engagement_code
FROM v1
GROUP BY engagement_code
HAVING COUNT(*) > 1;
-- Find What should be the correct client for those engagements
CREATE TABLE amtest_use_cli AS
SELECT engagement_code,per_end_dt,client_code
FROM
(
SELECT engagement_code,per_end_dt,client_code,
row_number() OVER (PARTITION BY engagement_code
ORDER BY per_end_dt DESC, client_code DESC)
row_num
FROM my_fact_table a,
amtest_mult_cli b
WHERE a.engagement_code = b.engagement_code
)
WHERE row_num = 1;
-- Update Correct Clients for those engagements
UPDATE my_fact_table a
SET a.client_code =
(SELECT b.client_code
FROM amtest_use_cli b
WHERE a.engagement_code = b.engagement_code)
WHERE EXISTS
(SELECT 1
FROM amtest_use_cli c
WHERE a.engagement_code = c.engagement_code);
======================================================
June 03, 2005 - 12:14 pm UTC
why not:
merge into my_fact_table F
using
( select engagement_code,
substr(max(to_char(per_end_dt,'yyyymmddhh24miss')||client_code ),15) cc
from my_fact_table
group by engagement_code
having count(distinct client_code) > 1 ) X
on ( f.engagement_code = x.engagement_code )
when matched
then update set client_code = x.cc
when not matched
then insert ( client_code ) values ( null ); <<== never can happen
<<== in 10g, not needed!
That select finds the client_code for the max per_end_dt by engagement_code for engagement_code's that have more than one distinct client_code....
first_value(client_code)
over (partition by engagement_code
order by per_end_dt desc, client_code desc ),
count(distinct client_code)
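The substr(max(to_char(per_end_dt)||client_code)) idiom picks the client code paired with the greatest date by making the date a fixed-width sortable prefix. A small Python sketch of the same trick (hypothetical dates and client codes, not the poster's data):

```python
from datetime import datetime

def latest_client(rows):
    """Mimic SUBSTR(MAX(TO_CHAR(dt,'yyyymmddhh24miss')||client), 15):
    tag each client with a 14-character sortable date prefix, take the
    string MAX, then strip the prefix off."""
    tagged = max(dt.strftime('%Y%m%d%H%M%S') + client for dt, client in rows)
    return tagged[14:]   # Oracle's SUBSTR(..., 15) is 1-indexed

rows = [(datetime(2005, 1, 31), 'C001'),
        (datetime(2005, 3, 31), 'C777'),
        (datetime(2005, 2, 28), 'C002')]
print(latest_client(rows))  # C777
```

Because the date prefix is fixed width and sorts lexically like it sorts chronologically, the string MAX is the row with the latest per_end_dt.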
help with lead
Adolph, June 09, 2005 - 1:24 am UTC
I have a table in the following structure:
create table cs_fpc_pr
(PRGM_C VARCHAR2(10) not null,
fpc_date date not null,
TIME_code VARCHAR2(3) not null,
SUN_TYPE varchar2(1))
insert into cs_fpc_pr values ('PRGM000222', to_date('08-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '24','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '23','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '47','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '48','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '23','1');
commit;
select prgm_c,fpc_date,time_code,sun_type,
lead(fpc_date) over(partition by prgm_C order by fpc_date) next_date
from cs_fpc_pr
order by prgm_c,fpc_date,time_code;
PRGM_C FPC_DATE TIM S NEXT_DATE
---------- --------- --- - ---------
PRGM000222 08-MAY-05 33 1 09-MAY-05
PRGM000222 09-MAY-05 05 3 09-MAY-05
PRGM000222 09-MAY-05 25 1 09-MAY-05
PRGM000222 09-MAY-05 45 3 10-MAY-05
PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05
PRGM000242 08-MAY-05 23 1 08-MAY-05
PRGM000242 08-MAY-05 47 3 08-MAY-05
PRGM000242 08-MAY-05 48 3 09-MAY-05
PRGM000242 09-MAY-05 07 3 09-MAY-05
PRGM000242 09-MAY-05 33 1 09-MAY-05
PRGM000242 09-MAY-05 46 3 10-MAY-05
PRGM000242 10-MAY-05 07 3 10-MAY-05
PRGM000242 10-MAY-05 33 1 10-MAY-05
PRGM000242 10-MAY-05 46 3 11-MAY-05
PRGM000242 11-MAY-05 07 3 11-MAY-05
PRGM000242 11-MAY-05 33 1 11-MAY-05
PRGM000242 11-MAY-05 46 3 14-MAY-05
PRGM000242 14-MAY-05 07 3 14-MAY-05
PRGM000242 14-MAY-05 23 1
I need to find, for a particular prgm_c, the next date and time code where the sun_type field = '1'.
A sample of the output should look something like this:
PRGM_C FPC_DATE TIM S NEXT_DATE next_time
---------- --------- --- - --------- -------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1
Tom, can you please help me with this?
Regards
June 09, 2005 - 6:53 am UTC
PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1
You've got a problem with those fpc_dates and ordering by them: you have "dups", so no single one of those 10-may-05 rows comes "first"; same with the 14th. You need to figure out how to order this data deterministically first.
My first attempt at this is:
tkyte@ORA9IR2W> select prgm_c, fpc_date, time_code, sun_type,
2 to_date(substr( max(data)
over (partition by prgm_c order by fpc_date desc),
6, 14 ),'yyyymmddhh24miss') ndt,
3 to_number( substr( max(data)
over (partition by prgm_c order by fpc_date desc), 20) ) ntc
4 from (
5 select prgm_c,
6 fpc_date,
7 time_code,
8 sun_type,
9 case when lag(sun_type)
over (partition by prgm_c order by fpc_date desc) = '1'
10 then to_char( row_number()
over (partition by prgm_c order by fpc_date desc) , 'fm00000') ||
11 to_char(lag(fpc_date)
over (partition by prgm_c order by fpc_date desc),'yyyymmddhh24miss')||
12 lag(time_code) over (partition by prgm_c order by fpc_date desc)
13 end data
14 from cs_fpc_pr
15 )
16 order by prgm_c,fpc_date,time_code
17 /
PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 09-MAY-05 25
PRGM000222 09-MAY-05 45 3 09-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 10-MAY-05 25
PRGM000222 10-MAY-05 45 3 10-MAY-05 25
PRGM000222 14-MAY-05 05 3
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 08-MAY-05 23
PRGM000242 08-MAY-05 47 3 08-MAY-05 23
PRGM000242 08-MAY-05 48 3 08-MAY-05 23
PRGM000242 09-MAY-05 07 3 10-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 10-MAY-05 33
PRGM000242 10-MAY-05 46 3 10-MAY-05 33
PRGM000242 11-MAY-05 07 3 14-MAY-05 23
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3
PRGM000242 14-MAY-05 23 1
24 rows selected.
but the lack of distinctness on the fpc_date means you might get "a different answer" with the same set of data.
reply
Adolph, June 09, 2005 - 7:48 am UTC
Sorry for not being clear the first time, so here goes: a program (prgm_c) will have a maximum of one entry in the table for a given combination of (fpc_date, time_code).
The time_code actually maps to another table where '01' is '01:00:00', '02' is '01:30:00' and so on (i.e. times stored in varchar2 format).
So basically a program will exist for a given fpc_date and time_code only once.
I hope I'm making sense.
Regards
June 09, 2005 - 7:58 am UTC
tkyte@ORA9IR2W> select prgm_c,
2 fpc_date,
3 time_code,
4 sun_type,
5 to_date(
6 substr( max(data)
7 over (partition by prgm_c
8 order by fpc_date desc,
9 time_code desc),
10 6, 14 ),'yyyymmddhh24miss') ndt,
11 to_number(
12 substr( max(data)
13 over (partition by prgm_c
14 order by fpc_date desc,
15 time_code desc), 20) ) ntc
16 from (
17 select prgm_c,
18 fpc_date,
19 time_code,
20 sun_type,
21 case when lag(sun_type)
22 over (partition by prgm_c
23 order by fpc_date desc,
24 time_code desc) = '1'
25 then
26 to_char( row_number()
27 over (partition by prgm_c
28 order by fpc_date desc,
29 time_code desc) , 'fm00000') ||
30 to_char(lag(fpc_date)
31 over (partition by prgm_c
32 order by fpc_date desc,
33 time_code desc),'yyyymmddhh24miss')||
34 lag(time_code)
35 over (partition by prgm_c
36 order by fpc_date desc,
37 time_code desc)
38 end data
39 from cs_fpc_pr
40 )
41 order by prgm_c,fpc_date,time_code
42 /
PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 09-MAY-05 33
PRGM000242 08-MAY-05 47 3 09-MAY-05 33
PRGM000242 08-MAY-05 48 3 09-MAY-05 33
PRGM000242 09-MAY-05 07 3 09-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 11-MAY-05 33
PRGM000242 10-MAY-05 46 3 11-MAY-05 33
PRGM000242 11-MAY-05 07 3 11-MAY-05 33
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3 14-MAY-05 23
PRGM000242 14-MAY-05 23 1
24 rows selected.
Just needed to add "time_code DESC"
See
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
("analytics to the rescue") for the "carry down" technique I used here. In 10g, we'd simplify by using "ignore nulls" in the LAST_VALUE function instead of the max() and row_number() trick.
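In plain terms, "carry down" means each row gets the most recent non-null value seen so far in the ordering. A Python sketch of just that idea (not the SQL itself):

```python
def carry_down(values):
    """Emulate LAST_VALUE(... IGNORE NULLS) over a running window:
    each position receives the most recent non-None value seen so far."""
    last, out = None, []
    for v in values:
        if v is not None:
            last = v      # remember the newest non-null value
        out.append(last)  # carry it down to this row
    return out

print(carry_down([None, 'a', None, None, 'b', None]))
# [None, 'a', 'a', 'a', 'b', 'b']
```

The max() over a row_number-prefixed string in the queries above achieves the same carry-down in pre-10g SQL.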
brilliant
Adolph, June 09, 2005 - 9:53 am UTC
Thank you very much Tom. The query works like a charm. I will read up on the link. Analytics do rock n roll :)
Working on an Analytic Query
Scott, June 09, 2005 - 12:15 pm UTC
Tom,
From your example for Mark's problem on 4/8, it seems that you need to specify a fixed number of columns to output this way. Is there a way to have a varying number of columns? For example, I need to have a query that takes a date range and makes each date a column heading. Any help would be greatly appreciated.
Thanks,
Scott
June 09, 2005 - 6:15 pm UTC
you need dynamic sql. the number of columns in a query is "well defined, known at parse time" by definition.
If you have access to Expert One-on-One Oracle, I demonstrated how to do this with ref cursors in a stored procedure. but you have to run a query to get the set of column "headings" and write a query based on that.
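A minimal sketch of that ref-cursor technique (the table T and columns grp, dt, val are hypothetical; a real version needs bind variables and careful quoting):

```sql
-- sketch: discover the column "headings" from the data first,
-- then open a ref cursor over the dynamically built pivot query
create or replace procedure pivot_by_date(
    p_result in out sys_refcursor )      -- caller fetches from this
as
    l_sql varchar2(32767) := 'select grp';
begin
    -- one CASE column per distinct date in the hypothetical table t
    for x in ( select distinct dt from t order by dt )
    loop
        l_sql := l_sql ||
            ', max(case when dt = to_date(''' ||
            to_char(x.dt,'yyyymmdd') ||
            ''',''yyyymmdd'') then val end) "' ||
            to_char(x.dt,'dd-mon-yyyy') || '"';
    end loop;
    l_sql := l_sql || ' from t group by grp';
    open p_result for l_sql;
end;
```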
Tom any idea how I can re write this piece of code
A reader, June 09, 2005 - 3:00 pm UTC
decode ((SELECT ih.in_date
FROM major_sales ih
WHERE ih.container = i.container
AND sales > i.container_id
AND sales = (SELECT MIN(ihh.container_id)
FROM major_sales ihh
WHERE ihh.container_id > i.container_id
AND ihh.container = i.container)), NULL,
June 09, 2005 - 6:35 pm UTC
not out of context, no.
I am still having problem with analytical function
A reader, July 01, 2005 - 12:31 pm UTC
select i.container,ssl_user_code,ssl_user_code ssl,cl.code length_code, out_trucker_code, i.chassis,
lead(in_date) over (partition by i.container order by in_date) next_in_date,
out_date,
lead (out_date) over (partition by i.container order by in_date) o_date
from his_containers i,
container_masters cm,
tml_container_lhts clht,
tml_container_lengths cl
WHERE cm.container = i.container
and cm.lht_code = clht.code
and clht.length_code = cl.code
and ssl_user_code = 'ACL'
and i.container like '%408014'
and voided_date is null
and ((in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
results:
----------
CONTAINER SSL_USER_CODE SSL LENGTH_CODE OUT_TRUCKER_CODE CHASSIS NEXT_IN_DATE OUT_DATE O_DATE
ACLU408014 ACL ACL 4 R0480 3/22/2005 2:52:41 PM 3/21/2005 3:45:48 PM 4/6/2005 2:25:59 PM
ACLU408014 ACL ACL 4 J1375 4/6/2005 2:25:59 PM
1. how can I get rid of the 4/6/2005 2:25:59 PM???
July 01, 2005 - 1:52 pm UTC
can you be more specific about why you don't like April 6th at 2:25:59 pm? what is it about it that you don't like?
That'll help me tell you how, in general, to remove it. What is the criteria for removal?
analytical query
A reader, July 01, 2005 - 2:19 pm UTC
Tom,
We are trying to bill the client within the month, in this case within April. I would also like to know how many days elapsed between 2 days so I can bill them.
July 01, 2005 - 3:15 pm UTC
"how many days elapsed between 2 days"
the answer is: 2
but are you asking how to do date arithmetic? Just subtract.
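For example, subtracting one DATE from another yields the difference in days, fractional part included, since Oracle DATEs carry a time component (dates here are from the poster's data):

```sql
-- DATE minus DATE is a number of days (with a fractional time part)
select to_date('19-mar-2005 02:10:24','dd-mon-yyyy hh24:mi:ss')
     - to_date('18-jan-2005 06:39:10','dd-mon-yyyy hh24:mi:ss') days
  from dual;
-- roughly 59.8 days; apply round() or trunc() as the billing rules dictate
```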
sorry...within March
A reader, July 01, 2005 - 2:21 pm UTC
more information
A reader, July 01, 2005 - 2:29 pm UTC
Tom,
This is how the data looks
IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014
and I would like to get the pair within the same month
July 01, 2005 - 3:16 pm UTC
the pair of "what"?
I would like to get all the dates within the month
A reader, July 01, 2005 - 4:03 pm UTC
one more try
A reader, July 01, 2005 - 4:26 pm UTC
This is how the data looks as of now with the above query.
IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014
I Would like to get it as the following
IN_DATE OUT_DATE CONTAINER
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM
This is what I am looking for.....this way.
July 01, 2005 - 4:46 pm UTC
still not much of a specification (important thing for those of us in this industry - being able to describe the problem at hand in detail, so someone else can take the problem definition and code it).
Let me try, this is purely a speculative guess on my part:
I would like all records in the table such that the in_date-out_date range covered at least part of the month of march in the year 2005.
If the out_date falls AFTER march, I would like it nulled out.
(this part is a total guess) if the in_date falls BEFORE march, i would like it nulled out as well (for consistency?)
Ok, stated like that I can give you untested pseudo code since there are no create tables and no inserts to play with:
select case when in_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then out_date end,
container
from T
where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( :x, 'dd-mon-yyyy' )
bind in :x = '01-mar-2005' and :y = '01-apr-2005' for your dates.
As you requested
A reader, July 01, 2005 - 5:01 pm UTC
CREATE TABLE CONTAINER_MASTERS
(
CONTAINER VARCHAR2(10 BYTE) NOT NULL,
CHECK_DIGIT VARCHAR2(1 BYTE) NOT NULL,
SSL_OWNER_CODE VARCHAR2(5 BYTE) NOT NULL,
LHT_CODE VARCHAR2(5 BYTE) NOT NULL
)
INSERT INTO CONTAINER_MASTERS ( CONTAINER, CHECK_DIGIT, SSL_OWNER_CODE,
LHT_CODE ) VALUES ( '045404', '1', 'BCL', '5AV');
commit;
CREATE TABLE TML_CONTAINER_LHTS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL,
ISO VARCHAR2(4 BYTE) NOT NULL,
LENGTH_CODE VARCHAR2(5 BYTE) NOT NULL,
HEIGHT_CODE VARCHAR2(1 BYTE) NOT NULL,
TYPE_CODE VARCHAR2(1 BYTE) NOT NULL
)
INSERT INTO TML_CONTAINER_LHTS ( CODE, SHORT_DESCRIPTION, LONG_DESCRIPTION, ISO, LENGTH_CODE,
HEIGHT_CODE, TYPE_CODE ) VALUES ( '5BR', '5BR', '45'' 9''6" Reefer', '5432', '5', 'B', 'R');
commit;
CREATE TABLE TML_CONTAINER_LENGTHS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL
)
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'2', '20''', '20 Ft');
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'4', '40''', '40 Ft');
commit;
July 01, 2005 - 6:06 pm UTC
umm, specification?
did I get it right? if so, did you *try* the query at all???
Here is a SQL puzzle for analytics zealots
Mikito Harakiri, July 01, 2005 - 10:33 pm UTC
OK, if anybody succeeds in writing the following with analytics, I will convert to analytics once and forever. Credit it in the book, of course.
Given:
table Hotels (
name string,
price integer,
distance
)
Here is a query that sounds very analytical:
Order hotels by price, distance. Compare each record with its neighbour (lag?), and if one of them is inferior to the other by both criteria -- more pricey and farther from the beach -- then throw it away from the result.
July 02, 2005 - 9:20 am UTC
define neighbor.
is neighbor defined by price or by distance? your specification is lacking many many details (seems to be a recurring theme on this page for some reason)
sounds like you want the cheapest closest hotel to the beach. for each row, if something closer and cheaper exists in the original set, do not keep that row.
sounds like a where not exists, not analytics to me. but then - the specification is lacking.
And let's see, in order to appreciate a tool, you have to be shown that the tool can be the end-all, be-all answer to everything??!?? that is downright silly, don't you think?
Let's see:
"if anyone succeeds in making the Oracle 9i merge command select data, I would convert to merge once and forever"
"if anyone succeeds in making my car fly into outer space, I would convert to cars once and forever"
Think about your logic here.
There are no zealots here, there are people willing to read the documentation, understand that things work the way they work, not the way THEY think they should have been made to work, and have jobs to do, pragmatic practical things to accomplish and are willing to use the best tool for the job.
specs
Mikito Harakiri, July 03, 2005 - 11:07 pm UTC
Yes, find all the hotels that are not dominated by the others by both price and distance. That is a "not exists" query, but it is a very inefficient one:
select * from hotels h
where not exists (select * from hotels hh
where hh.price < h.price and hh.distance <= h.distance
or hh.price <= h.price and hh.distance < h.distance
)
The one that reformulated is much more efficient, but how do I express it in SQL?
July 04, 2005 - 10:25 am UTC
the one that reformulated?
and why do you have the or in there at all? to dominate by both price and distance would simply be:
where not exists ( select NULL
from hotels hh
where hh.price < h.price
AND hh.distance < h.distance )
You said "by BOTH price and distance", nothing but nothing about ties.
ops$tkyte@ORA9IR2> /*
DOC>
DOC>drop table hotels;
DOC>
DOC>create table hotels
DOC>as
DOC>select object_name name, object_id price, object_id distance, all_objects.*
DOC> from all_objects;
DOC>
DOC>create index hotel_idx on hotels(price,distance);
DOC>
DOC>exec dbms_stats.gather_table_stats( user, 'HOTELS', cascade=>true );
DOC>*/
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select h1.name, h1.price, h1.distance
2 from hotels h1
3 where not exists ( select NULL
4 from hotels h2
5 where h2.price < h1.price
6 AND h2.distance < h1.distance )
7 /
NAME PRICE DISTANCE
------------------------------ ---------- ----------
I_OBJ# 3 3
Elapsed: 00:00:00.22
ops$tkyte@ORA9IR2> select count(*) from hotels;
COUNT(*)
----------
27837
Elapsed: 00:00:00.00
it doesn't seem horribly inefficient.
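For what it is worth, the dominance filter can be phrased with analytics as well. A sketch, assuming prices are distinct (ties in price would need extra care, since a same-price, closer hotel does not strictly dominate): sort by price, and keep a row only if no cheaper row seen so far was also strictly closer.

```sql
-- analytic "skyline": after ordering by price (then distance),
-- a row is dominated iff some earlier row had a strictly smaller distance
select name, price, distance
  from (select name, price, distance,
               min(distance)
                 over (order by price, distance
                       rows between unbounded preceding
                                and 1 preceding) best_dist_so_far
          from hotels)
 where best_dist_so_far is null          -- the cheapest row always survives
    or distance < best_dist_so_far;
```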
Tom Can we give it one more try
A reader, July 05, 2005 - 9:20 am UTC
Tom, When I ran the query it returned nothing. I am sending you the whole test case. This is what I would like to see
in the report.
out_date in_date container
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM ACLU408014
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM ACLU408014
CREATE TABLE BETA
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
)
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;
select in_date, out_date,container,
case when in_date between to_date('01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( '01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then out_date end
container
from BETA
WHERE in_date <= to_date( '01-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( '31-mar-2005', 'dd-mon-yyyy' )
July 05, 2005 - 9:54 am UTC
you know, this is going beyond....
*s*p*e*c*i*f*i*c*a*t*i*o*n*
pretend you were explaining to your mother (who presumably doesn't work in IT and doesn't know sql or databases or whatever) what needed to be done.
that is what I need to see. I obviously don't know your logic of getting from "A (inputs) to B (outputs)" and you need to explain that.
and when I run my query:
ops$tkyte@ORA10G> variable x varchar2(20)
ops$tkyte@ORA10G> variable y varchar2(20)
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec :x := '01-mar-2005'; :y := '01-apr-2005'
PL/SQL procedure successfully completed.
ops$tkyte@ORA10G> select case when in_date between to_date( :x, 'dd-mon-yyyy' )
2 and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
3 then in_date end,
4 case when out_date between to_date( :x, 'dd-mon-yyyy' )
5 and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
6 then out_date end,
7 container
8 from beta
9 where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
10 and out_date >= to_date( :x, 'dd-mon-yyyy' )
11 /
CASEWHENI CASEWHENO CONTAINER
--------- --------- ----------
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 ACLU408014
I do get output, not what you say you want, but output. you need to tell me THE LOGIC here. (and maybe when you write it down, specify it, the answer will just naturally appear)
so yes, we can definitely give it one more try but if and only if you provide the details, the specification, the logic, the thoughts behind this.
Not just "i have this and want that", it doesn't work that way.
in english
Jean, July 05, 2005 - 10:33 am UTC
We are trying to bill from the time the truck left to the
time it returned. For example in the above query.
I would like to bill him from 1/18/2005 to 3/19/2005. So it must be part of the report. That's the whole key here.
clarification!!
A reader, July 05, 2005 - 10:56 am UTC
the time he left 1/18/2005 6:39:10 AM
the time he came back 3/22/2005 2:52:41 PM
hope this helps....
July 05, 2005 - 11:28 am UTC
ops$tkyte@ORA9IR2> select * from beta order by in_date;
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014 <<<=== gap, no 13
14-JAN-05 18-JAN-05 ACLU408014 <<=== big gap, no 19.... mar 18
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
I don't get it. I don't get it AT ALL. does anyone else ?
nope, not getting it even a teeny tiny bit myself.
give us LOGIC, ALGORITHM, INFORMATION.
like I said, pretend I'm your mother who has never seen a computer -- explain the logic at that level (or I just give up)
BETTER TABLE
A reader, July 05, 2005 - 11:57 am UTC
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;
OUT_DATE IN_DATE
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM
LEFT 1/18 CAME BACK 3/19
LEFT 3/21 CAME BACK 3/22
July 05, 2005 - 12:20 pm UTC
you have totally and utterly missed my point.
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
sigh.
what if the records are
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
specification, you know what, without it, I'm not even going to look anymore. Textual description of precisely what you want. I'm tired of guessing. I think I can guess, but I don't even want to guess about "missing" months like my second example here.
English Explanation
A reader, July 05, 2005 - 1:06 pm UTC
Sorry for going back and forth on this report. All I want is the following: We have trucks that come in and out of the yard. All we are looking for is when the truck came in and the "next record", nothing in between, because a truck can come in many times during a month. So we want when it first came in and the very last time it went out for a particular month. That is to say, the last time it left the yard. So the date and time should give us this information. Finally, this report should be within a month.
example:
IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
6 rows selected.
in this case we want
in_date out_date
-------- --------
3/22/2005 2:52:41PM 1/18/2005 6:39:10 AM
July 05, 2005 - 1:17 pm UTC
so what happened to the 21st/22nd of march this time. the answer keeps changing?
and what if, there are no records for march in the table (nothing in_date/out_date wise)
follow up
jean, July 05, 2005 - 1:57 pm UTC
Tom,
We realized that it may be too much to get the dates in between,
so we opted for just getting the in_date and out_date. By the way, there will always be data, so do not worry about it....
Thanks!!
July 05, 2005 - 3:11 pm UTC
feb, what about feb? you said there would always be data? I want to run this for feb?
do you or do you not need to be concerned about a missing month.
do not be concerned!
A reader, July 05, 2005 - 3:22 pm UTC
Please do not be concerned about missing a month. This is a report.
July 05, 2005 - 3:46 pm UTC
umm, I want the report for february
it is blank.
now what? it should not be blank, should it? this is a problem, this is a problem in our industry in general. You get what you ask for (sometimes) and if you ask repeatedly for the wrong thing, that's what you'll get. I am concerned -- by this line of questioning here.
Hey, here you go:
ops$tkyte-ORA9IR2> select *
2 from (
3 select
4 lag(out_date) over (partition by container order by in_date) last_out_date,
5 in_date,
6 container
7 from beta
8 )
9 where trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
10 or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy');
LAST_OUT_ IN_DATE CONTAINER
--------- --------- ----------
18-JAN-05 19-MAR-05 ACLU408014
21-MAR-05 22-MAR-05 ACLU408014
gets the answer given your data, makes a zillion assumptions (50% of which are probably wrong), won't work for FEB, probably doesn't answer the question behind the question, but hey, there you go.
Thanks!!!
A reader, July 06, 2005 - 9:00 am UTC
I will try it ... Thanks a zillion for your efforts and your patience.
Thanks!
A reader, July 06, 2005 - 11:55 am UTC
CREATE TABLE BETA3
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
)
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '07/20/2004 03:08:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '08/10/2004 02:45:52 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 01:55:06 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '03/27/2005 05:05:36 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
commit;
Tom I was able to get the first pair as show
last_out_date in_date container
8/10/2004 2:45:52 AM 3/19/2005 1:55:06 AM ACLU040312
which is fine...
But can I get the other pair?
last_out_date in_date container
3/27/2005 5:05:36 AM
July 06, 2005 - 12:44 pm UTC
problem is, you are "missing" a row and 'making up' data is hard.
it might be
ops$tkyte-ORA10G> select decode( r, 1, last_out_date, out_date ),
2 decode( r, 1, in_date, next_in_date )
3 from (
4 select
5 lag(out_date) over (partition by container order by in_date) last_out_date,
6 in_date, out_date,
7 lead(in_date) over (partition by container order by in_date) next_in_date,
8 container
9 from beta3
10 ), ( select 1 r from dual union all select 2 r from dual )
11 where ((
12 trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
13 or
14 trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
15 ) and r = 1 )
16 or
17 ( next_in_date is null and r = 2 )
18 /
DECODE(R,1,LAST_OUT_ DECODE(R,1,IN_DATE,N
-------------------- --------------------
10-aug-2004 02:45:52 19-mar-2005 01:55:06
27-mar-2005 05:05:36
still curious what happens in feb.
Please refer some books to learn Oracle Analytic functions
Vijay, July 07, 2005 - 7:58 am UTC
July 07, 2005 - 9:47 am UTC
data warehousing guide (freely available on otn.oracle.com)
Expert one on one Oracle (I have a big chapter on them in there)
Thank you very much!!
Jean, July 08, 2005 - 10:35 am UTC
I want to thank you for the last query!!! It worked very well, even though I still get dates outside of the range. But overall it's fine.
How to get contiguous date ranges from Start_date, end_date pairs?
Bob Lyon, July 11, 2005 - 3:15 pm UTC
-- Tom, Suppose I have a table with data...
-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 01:00:00
-- BBB 07/11/05 01:00:00 07/11/05 02:00:00
-- BBB 07/11/05 02:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 07:00:00
-- BBB 07/11/05 07:00:00 07/11/05 08:00:00
-- What I would like to get is the "contiguous date ranges"
-- by MKT_CD, i.e.,
-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 08:00:00
-- I have played with LAG/LEAD/FIRST_VALUE/LAST_VALUE
-- but seem to just "go in circles" trying to code this.
-- Here is the test data setup (Oracle 9.2.0.6) :
CREATE GLOBAL TEMPORARY TABLE NM_DEMAND_BIDS_API_GT
(
MKT_CD VARCHAR2(6) NOT NULL,
START_DT_GMT DATE NOT NULL,
END_DT_GMT DATE NOT NULL
)
ON COMMIT PRESERVE ROWS;
-- This code has 24 hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('AAA', TRUNC(SYSDATE), TRUNC(SYSDATE) + 1);
-- A second code goes by hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 00/24, TRUNC(SYSDATE) + 01/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 01/24, TRUNC(SYSDATE) + 02/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 02/24, TRUNC(SYSDATE) + 03/24);
-- and has an intentional gap
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 06/24, TRUNC(SYSDATE) + 07/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 07/24, TRUNC(SYSDATE) + 08/24);
-- Query
SELECT MKT_CD, START_DT_GMT, END_DT_GMT
FROM NM_DEMAND_BIDS_API_GT;
July 11, 2005 - 3:49 pm UTC
based on:
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
ops$tkyte@ORA9IR2> select mkt_cd, min(start_dt_gmt), max(end_dt_gmt)
2 from (
3 select mkt_cd, start_dt_gmt, end_dt_gmt,
4 max(grp) over (partition by mkt_cd order by start_dt_gmt) mgrp
5 from (
6 SELECT MKT_CD,
7 START_DT_GMT,
8 END_DT_GMT,
9 case when lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) <> start_dt_gmt
10 or
11 lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) is null
12 then row_number() over (partition by mkt_cd order by start_dt_gmt)
13 end grp
14 FROM NM_DEMAND_BIDS_API_GT
15 )
16 )
17 group by mkt_cd, mgrp
18 order by 1, 2
19 /
MKT_CD MIN(START_DT_GMT) MAX(END_DT_GMT)
------ -------------------- --------------------
AAA 11-jul-2005 00:00:00 12-jul-2005 00:00:00
BBB 11-jul-2005 00:00:00 11-jul-2005 03:00:00
BBB 11-jul-2005 06:00:00 11-jul-2005 08:00:00
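The same grouping can also be built by turning the break condition into a 1/0 flag and doing a running SUM(), instead of carrying a row_number forward via MAX() (equivalent result, sketched and untested against the same table):

```sql
-- gaps-and-islands via a running sum of "new group starts here" flags
select mkt_cd, min(start_dt_gmt), max(end_dt_gmt)
  from (select mkt_cd, start_dt_gmt, end_dt_gmt,
               sum(flag) over (partition by mkt_cd
                               order by start_dt_gmt) grp
          from (select mkt_cd, start_dt_gmt, end_dt_gmt,
                       -- flag = 0 when this row continues the prior range,
                       -- 1 when it starts a new one (or is the first row)
                       case when lag(end_dt_gmt)
                                 over (partition by mkt_cd
                                       order by start_dt_gmt) = start_dt_gmt
                            then 0 else 1
                       end flag
                  from nm_demand_bids_api_gt))
 group by mkt_cd, grp
 order by 1, 2;
```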
Thanks!
Bob Lyon, July 11, 2005 - 5:20 pm UTC
Wow, that was fast.
The trick here is the MAX() analytic function. I could tag the lines where a break was to occur but couldn't figure out how to carry forward the tag/grp.
Thanks Again!
Analytical functions book
Vijay, July 11, 2005 - 11:55 pm UTC
Thanks a lot
More Help
Jean, July 26, 2005 - 5:40 pm UTC
Tom,
How can I get "just" the records within the scope? I am getting records outside of march.
select container,decode( r, 1, last_out_date, out_date )out_date, decode( r, 1, in_date, next_in_date) in_date,
code length_code,chassis,out_trucker_code,ssl_user_code ssl, ssl_user_code,out_mode
from (
select lag(out_date) over (partition by i.container order by in_date)
last_out_date,
i.ssl_user_code,
in_date,
cl.code,
i.out_trucker_code,
i.ssl_user_code ssl,
i.container,
i.chassis,
out_mode,
out_date,
clht.length_code,
lead(in_date) over (partition by i.container order by in_date)
next_in_date
from his_containers i,container_masters cm,tml_container_lhts clht,tml_container_lengths cl
where cm.container = i.container
and cm.lht_code = clht.code
and cl.code = clht.length_code
and ssl_user_code = 'ACL'
and i.container = 'ACLU214285'
and voided_date is null
and chassis is null
and in_mode = 'T'
and out_mode = 'T' ), ( select 1 r from dual union all select 2 r from dual )
where (( trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy'))
and r = 1 ) or ( next_in_date is null and r = 2 )
order by out_date
July 26, 2005 - 5:57 pm UTC
select *
from (Q)
where <any other conditions you like>
order by out_date;
replace Q with your query.
that's what I got in my query.....
A reader, July 26, 2005 - 6:03 pm UTC
July 26, 2005 - 6:23 pm UTC
don't know what you mean
I thought I was doing what you suggested already...
A reader, July 26, 2005 - 6:41 pm UTC
July 26, 2005 - 6:56 pm UTC
I cannot see your output, obviously you are getting more data than you wanted it -- add to the predicate in order to filter it out. don't know what else to say.
More information..
Jean, July 27, 2005 - 9:14 am UTC
the way it was before
CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131 ACL ACL T
---with your changes---
CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131
my history tables
CONTAINER_ID OUT_DATE IN_DATE
31779 6/21/2004 10:03:25 AM 6/16/2004 1:33:50 AM
55317 6/25/2004 2:58:01 PM 6/25/2004 2:19:49 PM
672863 3/2/2005 7:03:31 PM 2/26/2005 6:03:49 PM
708598 4/4/2005 3:31:03 PM 3/11/2005 7:36:29 PM
779305 4/16/2005 1:03:36 PM 4/6/2005 2:04:53 PM
as you can see I am not picking up the records within the month of march...with or without
the changes to the query.
July 27, 2005 - 10:27 am UTC
sorry -- you'll need to work through this. you see the techniques involved, right -- lag, lead, analytic functions. YOU understand your data much better than I do.
(because in part, frankly, the "way it was before" and "with your changes" look, well, I don't know -- the same I think to me as displayed here)
Thanks for your help!
A reader, July 27, 2005 - 1:26 pm UTC
I know the data; however, I thought it was going to be something easy just to get the dates within march ... I guess not.
count number of rows in a number of ranges
A reader, July 27, 2005 - 6:08 pm UTC
Hi
I would like to count the number of rows I have per range of values. For example
SELECT RANGE, SUM(suma) total_per_deptno
FROM (SELECT CASE
WHEN deptno between 10 and 20 THEN '10-20'
ELSE '30'
END RANGE,
deptno, 1 SUMA
FROM scott$emp)
GROUP BY RANGE
RANGE TOTAL_PER_DEPTNO
----- ----------------
10-20 8
30 6
Can I rewrite that query in some other way so range can be dynamic such as
11-20
21-30
31-40
and counts the number of rows?
Thank you
July 27, 2005 - 6:33 pm UTC
if you can come up with a function f(x) such that f(x) returns what you want, sure.
EG:
for you 11-20, 21-30, 31-40 -- well
f(deptno) = trunc( (deptno-0.1)/10)
(assuming deptno is an integer) -- that'll bin up deptno 0..10, 11..20, 21..30 and so on into groups 0, 1, 2, 3, ....
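To make that concrete (scott.emp and the bucket bounds are just for illustration): the trunc() expression bins directly, and for equal-width bins the built-in WIDTH_BUCKET does the same thing declaratively, with bucket numbers shifted up by one.

```sql
-- trunc-based binning: deptnos 1..10 -> 0, 11..20 -> 1, 21..30 -> 2, ...
select trunc((deptno-0.1)/10) bin, count(*) cnt
  from scott.emp
 group by trunc((deptno-0.1)/10);

-- same bins via width_bucket: 4 buckets of width 10 over [1,41),
-- so 1..10 -> 1, 11..20 -> 2, and so on
select width_bucket(deptno, 1, 41, 4) bin, count(*) cnt
  from scott.emp
 group by width_bucket(deptno, 1, 41, 4);
```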
A reader, August 02, 2005 - 1:35 pm UTC
Tom,
I hope you can provide an insight to this.
table emp1 is shown below.
EmpId Week Year Day0 Day1 ..... Day14
100 20 2005 8 8 8
200 22 2003 0 0 8
300 25 2004 8 8 0
400 06 2005 0 8 8
500 08 2002 8 0 8
create table emp1(empid varchar2(3), week varchar2(2), year varchar2(4), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', '20', '2005', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('200', '22', '2003', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('300', '25', '2004', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0);
insert into emp1 values('400', '06', '2005', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('500', '08', '2002', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
I am trying to select emp1 records as follows:
EmpId, Date of the day, Hours worked per day
Firstly, I have to calculate date of the day of a record (first day that corresponds to Day0) using
week of the year and year. Then I have to increment the day by 1, 2 ...14
to get the hours worked for each particular date
Example: Assuming that week 20 of 2005 is 05/07/2005. It corresponds to Day0 in the same record
Day1 column corresponds to the next day which is 05/08/2005. Day2 becomes 05/09/2005 and so on ...
Then, I have to print individual rows for each empid as:
100 05/07/2005 8
100 05/08/2005 8
.....
200 05/22/2003 0
200 05/23/2003 8
.. and so on for all empid's ...
Thank you.
August 02, 2005 - 2:09 pm UTC
oh no, columns where rows should be :(
and basically you are saying "i need ROWS where these columns are!"
tell me, how do you turn 20 into a date?
A reader, August 02, 2005 - 2:19 pm UTC
Tom,
I should've explained it better. Week 20 of 2005, here should be translated to the first day of week 20 of 2005 (Assuming it is 05/07/2005). That corresponds to Day0 of that row. Day1 becomes 05/08/2005 and so on ...
Is there a function or approach that can convert columns to rows?
August 02, 2005 - 3:30 pm UTC
no, i mean -- what function/logic/algorithm are you using to figure out "week 20 is this day"
A reader, August 02, 2005 - 9:06 pm UTC
Tom,
Sorry, firstly, the date is not calculated the way I said above. It's not clear yet how the date is obtained. This issue is under review and I think I'll obtain date by joining empid with some table (say temp1). However, I am sure I will have to use date (such as 05/07/2005), associate it with Day0 column value. Day1 becomes 05/08/2005 and so on .. However, I am trying to obtain a sql or pl/sql that can arrange the rows as described above. Any ideas? Thanks.
August 03, 2005 - 10:06 am UTC
I cannot tell you how much I object to this model.
storing "week" and "year" - UGH.
storing them in STRINGS - UGH UGH UGH.
storing things that should be cross record in record UGH to the power of 10.
I had to fix your inserts, they did not work, added day14 of zero.
ops$tkyte@ORA10G> with dates as
2 (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual connect by level <= 15 )
3 select empid, dt,
4 case when l = 0 then day0
5 when l = 1 then day1
6 when l = 2 then day2
7 /* ... */
8 when l = 13 then day13
9 when l = 14 then day14
10 end data
11 from (select * from emp1 where week = 20), dates
12 /
EMP DT DATA
--- --------- ----------
100 07-MAY-05 8
100 08-MAY-05 8
100 09-MAY-05 0
100 10-MAY-05
100 11-MAY-05
100 12-MAY-05
100 13-MAY-05
100 14-MAY-05
100 15-MAY-05
100 16-MAY-05
100 17-MAY-05
100 18-MAY-05
100 19-MAY-05
100 20-MAY-05 8
100 21-MAY-05 0
15 rows selected.
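For anyone without a database handy, the columns-to-rows mechanics of that query can be sketched outside SQL. This is a minimal Python sketch, not the query itself; the row data is made up to match the shape of the output above (day0=8, day1=8, day2=0, a run of nulls, day13=8, day14=0):

```python
from datetime import date, timedelta

# one emp1 row for week 20: the day0..day14 column values as a list
days = [8, 8, 0] + [None] * 10 + [8, 0]

start = date(2005, 5, 7)   # the date Day0 maps to

# cross the row with 15 generated numbers (l = 0..14), just like
# "connect by level <= 15", picking the l-th day column for each date
unpivoted = [("100", start + timedelta(days=l), days[l]) for l in range(15)]

for row in unpivoted:
    print(row)
```

The cartesian join against a 15-row generator is what turns one wide row into fifteen narrow ones; the CASE expression (here, indexing `days[l]`) selects the matching column.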
A reader, August 03, 2005 - 3:18 pm UTC
Tom,
Thanks for the solution. I need some more help if you don't mind. The sql works excellently and I experimented with it.
However, this question is based on a change of design here ... The emp1 table is joined with trn1 table (empid ~ trnid) to obtain values x and y. x and y should be passed to a function that returns date.
The emp1 table is like:
EmpId Day0 Day1 ..... Day14
100 8 8 8
200 0 0 8
300 8 8 0
400 0 8 8
500 8 0 8
trn1 table is like:
trnid x y
100 3 18
200 4 19
300 5 20
400 6 21
500 7 22
etc ...
create table emp1(empid varchar2(3), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('200', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 0);
insert into emp1 values('300', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0, 0);
insert into emp1 values('400', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('500', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
create table trn1(empid varchar2(3), x number(2), y number(2));
insert into trn1 values('100', 3, 18);
insert into trn1 values('200', 4, 19);
insert into trn1 values('300', 5, 20);
insert into trn1 values('400', 6, 21);
insert into trn1 values('500', 7, 22);
I used this function on just one row of emp1 (by hard coding x and y values).
I replaced
with dates as
(select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual
connect by level <= 15 )
with
with dates as
(select getXYDate(x,y)+level-1 dt, level-1 l from dual
connect by level <= 15 )
However, I am trying to implement this on every row of emp1 by obtaining x and y from trn. There is no week or year in emp1 table. Any help? Thanks again.
August 03, 2005 - 6:00 pm UTC
I didn't think it was possible, but now I like this even less than before ;(
ops$tkyte@ORA10G> with dates as
2 (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual
connect by level <= 15 )
3 select empid, dt,
4 case when l = 0 then day0
5 when l = 1 then day1
6 when l = 2 then day2
7 /* ... */
8 when l = 13 then day13
9 when l = 14 then day14
10 end data
11 from ( QUERY ), dates
12 /
replace query with a join of emp with trn and apply the function in there.
A reader, August 03, 2005 - 7:38 pm UTC
Tom,
Sorry to bother you again. In my case, I think
(select to_date( '05/07/2005','mm/dd/yyyy') will not help me anymore because I have to basically find dates for Day0 .. Day14 of every row in emp1 table. The first date (date that corresponds to Day0) for each record should be obtained using a function by passing X and Y values of trn table .. Because each record may have different x, y values.
If it's not achievable using this way, can you suggest an alternate approach. I am trying to make a function that would use a loop. Also, the data should be written to a text file once complete, in that case I think a procedure might help and if so, could you throw some light? Thanks for your patience.
August 03, 2005 - 8:26 pm UTC
well, you just need to generate a set of 15 numbers (L)
and add them in later on. No big change. You have the "start_date" from the function, right -- just add L to dt.
A reader, August 03, 2005 - 8:38 pm UTC
Ok, Can you please show that if possible?
A reader, August 03, 2005 - 9:38 pm UTC
Tom,
I tried this and am getting an error: ORA-00904: "DAY13": invalid identifier
WITH DATES AS
(SELECT FUNC_XY(17,2003)+level-1 dt, level-1 l FROM DUAL
connect by level <= 15)
select empid, day0, day14, x, y, dt,
case when l = 0 then day0
when l = 1 then day1
when l = 2 then day2
when l = 3 then day3
when l = 4 then day4
when l = 5 then day5
when l = 6 then day6
when l = 7 then day7
when l = 8 then day8
when l = 9 then day9
when l = 10 then day10
when l = 11 then day11
when l = 12 then day12
when l = 13 then day13
when l = 14 then day14
end data
from (select emp1.empid, day0, day14, x, y from emp1, trn1 where emp1.empid = trn1.empid), dates
/
As said before ... I also have to use x and y instead of 17 and 2003 in order to compute it for every row.
August 04, 2005 - 8:20 am UTC
yeah, well -- you didn't select it out in the inline view. fix that.
look the concept is thus:
with some_rows as ( select level-1 l from dual connect by level <= 15 )
select a.empid, a.dt+l, case when l=0 then a.day0
...
when l=14 then a.day14
end data
from some_rows,
(select emp1.empid, func_xy(trn1.x, trn1.y) dt,
emp1.day0, emp1.day1, .... <ALL OF THE DAYS>, emp1.day14
from emp1, trn1
where emp1.empid = trn1.empid )
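A rough Python sketch of that concept, for illustration only -- `func_xy` below stands in for the reader's real date-returning function, and its body is invented purely so the sketch runs:

```python
from datetime import date, timedelta

def func_xy(x, y):
    # stand-in for the reader's real function; the real one derives
    # the Day0 date from x and y somehow -- this body is made up
    return date(2005, 1, 1) + timedelta(weeks=x)

emp1 = {"100": [8, 8, 0] + [0] * 12}   # empid -> [day0 .. day14]
trn1 = {"100": (3, 18)}                # empid -> (x, y)

rows = []
for empid, days in emp1.items():
    x, y = trn1[empid]                 # the emp1/trn1 join on empid
    dt = func_xy(x, y)                 # start date computed once per row
    for l in range(15):                # the generated set of 15 numbers
        rows.append((empid, dt + timedelta(days=l), days[l]))

for r in rows:
    print(r)
```

The point is the shape: compute `dt` once inside the "inline view" (the outer loop), then add `l` to it while picking the l-th day column.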
A reader, August 04, 2005 - 9:15 am UTC
Tom,
Here, the sql is using a.empid, a.dt+l ...
whereas the inner sql is using emp1.day0, trn1.empid, etc ... My real inner sql uses some more columns and joins as well. When this gave me an error, I just substituted emp1.day0, emp1.day14 etc ... with day0, day14 etc ... and it worked. However, when there are several joins with alias names, how should it be done?
To make it a bit clear, this sql looks similar to:
select emp1.empid, emp1.day0 from some_rows, (select emp1.empid, emp1.day0) ...
Any idea how to select from select and still use multiple joins etc ... Hope I am clear
August 04, 2005 - 9:56 am UTC
you can join as much as you WANT in the inline views.
Sorry, I cannot go further with this one, I've shown the technique -- it is just a pivot to turn COLUMNS THAT SHOULD HAVE BEEN ROWS into rows -- very common.
A reader, August 04, 2005 - 9:52 am UTC
Please ignore above post.
I need some help
Carlos, August 09, 2005 - 10:25 am UTC
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/15/2004 17:42:56', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/18/2004 15:09:19', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/24/2004 09:38:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/30/2004 04:28:09', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/03/2005 14:36:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/05/2005 10:04:15', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/07/2005 08:54:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/10/2005 10:54:07', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/12/2005 10:13:13', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/18/2005 04:23:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/03/2005 03:15:05', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/09/2005 18:54:11', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005 21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/29/2005 17:05:04', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/30/2005 08:53:13', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 13:16:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 15:08:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
Tom,
I hope you can help since I have been struggling with this report. I would like to get something like this...
IN OTHER WORDS I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX. In this case for example for the month of March, however it can be for any given month. Any ideas how I can accomplish that?
IN_DATE OUT_DATE
3/22/2005 8:27:03 PM 3/30/2005 3:08:39 PM
----from the table above for the month of March
August 09, 2005 - 10:45 am UTC
insufficient detail here, why won't min/max work for you for example.
but I don't understand the logic behind the two values you say you want, I don't get how you arrived at them.
This is what I get
A reader, August 09, 2005 - 10:57 am UTC
select in_date, out_date
from lou_date
where id = 201048
and ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
I get the following:
In_date out_date
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
August 09, 2005 - 11:19 am UTC
ok,
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005
21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
why didn't you get that row. for example.
A reader, August 09, 2005 - 11:48 am UTC
SQL Statement which produced this data:
select in_date, out_date
from lou_date
where ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
order by out_date
3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
I guess my question is: when I get records that extend beyond March, the part beyond March should be replaced with blank or null ... since I can't charge him/her for April ...
August 09, 2005 - 12:00 pm UTC
I am so not following you here.
A reader, August 09, 2005 - 12:24 pm UTC
Tom,
Pretend that you are charging someone for a particular month, let's say the month of March. So you would like to do a query that reflects just that. A group of dates is given to you, and in that group you have multiple records with the same id. Also, some records initiated in March but came back in April. Here are the examples, but it can work with any dates ...
example 1.
in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
would like to see:
in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 1:16:00 PM
example 2
In_date out_date
3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
would like to see:
In_date out_date
3/3/2005 3:15:05 AM 3/15/2005 9:47:41 PM
August 09, 2005 - 12:42 pm UTC
begs the question
in_date out_date
20-feb-2005 15-apr-2005
or
in_date out_date
3/22/2005 8:27:15 PM 3/25/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
what then. Being able to clearly specify the "goal" or the "algorithm" usually leads us straight to the query itself. There are so many ambiguities here. Pretend you were actually documenting this for a junior programmer to program. Give them the specifications. In gory detail.
please don't just answer these two what thens -- think of all of the cases (cause I'll just keep on coming back with "what then" if you don't)
Remember -- I know NOTHING about your data, not a thing. This progression from
... I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS
LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX....
to this has been 'strange' to say the least.
Full explanation of requirements
A reader, August 09, 2005 - 3:10 pm UTC
Sorry for the misunderstanding Tom. Here is the full requirements. I hope I can explain it this time.
The report is a billing report and it goes as follows:
For example for the month of March we have to bill as
in the following way:
out_date date_in Bill
2/23 3/2 3/1 to 3/2
3/1 3/3 3/1 to 3/3
3/1 4/14 3/1 to 3/31
3/1 - 3/1 to 3/31
2/23 - 3/1 to 3/31
August 09, 2005 - 3:38 pm UTC
well, i hope you give your programmers more detail. Here is the best I'll do
ops$tkyte@ORA9IR1> select t.*,
2 greatest( in_date, to_date('mar-2005','mon-yyyy') ) fixed_in_date,
3 least( nvl(out_date,to_date('3000','yyyy')), last_day( to_date( 'mar-2005', 'mon-yyyy' ) ) ) fixed_out_date
4 from t
5 where in_date < last_day( to_date( 'mar-2005', 'mon-yyyy' ) )+1
6 and out_date >= to_date( 'mar-2005', 'mon-yyyy' );
IN_DATE OUT_DATE FIXED_IN_ FIXED_OUT
--------- --------- --------- ---------
03-MAR-05 09-MAR-05 03-MAR-05 09-MAR-05
11-MAR-05 15-MAR-05 11-MAR-05 15-MAR-05
22-MAR-05 29-MAR-05 22-MAR-05 29-MAR-05
22-MAR-05 30-MAR-05 22-MAR-05 30-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
6 rows selected.
predicate finds records that overlap march.
select adjusts the begin/end dates.
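The overlap-and-clamp logic is worth seeing procedurally too. A minimal Python sketch of the same two steps (the sample rows are abbreviated stand-ins for the ones inserted above; March 2005 is hard-coded, as in the query):

```python
from datetime import datetime

month_start = datetime(2005, 3, 1)                 # to_date('mar-2005',...)
month_end   = datetime(2005, 3, 31, 23, 59, 59)    # end of last_day(March)

stays = [
    (datetime(2005, 2, 23, 0, 0, 0),   datetime(2005, 3, 2, 12, 0, 0)),
    (datetime(2005, 3, 30, 15, 8, 39), datetime(2005, 4, 16, 13, 40, 44)),
    (datetime(2005, 1, 3, 14, 36, 24), datetime(2005, 1, 5, 10, 4, 15)),
]

billed = []
for in_date, out_date in stays:
    # the predicate: keep only stays that overlap March at all
    if in_date <= month_end and out_date >= month_start:
        # greatest()/least(): clamp both ends into the billing month
        billed.append((max(in_date, month_start), min(out_date, month_end)))

for b in billed:
    print(b)
```

A stay entirely inside the month passes through unchanged; one straddling a month boundary gets its out-of-month end replaced by the boundary, which is exactly the FIXED_IN/FIXED_OUT columns above.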
Thanks!!!
A reader, August 10, 2005 - 12:00 pm UTC
Tom,
One more request. I would like to start the report with
the first time it went out. That is to say...
how it looks now with your help...
fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 3/31/2005
how the data looks
fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM---first went out
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM
How I would like to see it since we begin billing from
the first date the truck went out.
fix_in fix_out
3/29/2005 5:05:04 PM 3/30/2005 3:08:39 PM
3/30/2005 3:08:39 PM 3/31/2005
Thanks again Tom
August 10, 2005 - 1:05 pm UTC
try to work it out yourself -- please.
why? because I'll do this little thing and it'll be "oh yeah, one more thing, when the data looks like this...."
specifying requirements is like the most important thing in the world -- it is key, it is crucial. It is obvious you know what you want (well, maybe -- it seems to change over time) but I don't "get it" myself. Your simple example here with two rows begs so many questions, I don't even want to get started.
You have lag() and lead() at your disposal; they probably come into play here. Check them out.
Thanks for the help!
A reader, August 11, 2005 - 3:25 pm UTC
The report is kind of tricky. Especially when one of the dates originates in February and the other falls in March.
Hooked on Analytics worked for me!!
Greg, August 22, 2005 - 11:15 am UTC
I think I need to find a meeting group to help with my addiction ... I think I'm addicted to analytics .. :\
Finally got a chance to read chapter 12 in "Expert Oracle" ... awesome!! 4 big, hairy Thumbs up!! heh
But I got a question ... an "odd" behaviour that I don't understand ... was wondering if you could help explain:
Test Script:
================
drop table junk2;
drop sequence seq_junk2;
create sequence seq_junk2;
create table junk2
(inv_num number,
cli_num number,
user_id number)
/
insert into junk2
values ( 123, 456, null );
insert into junk2
values ( 123, 678, null );
insert into junk2
values ( 234, 456, null );
insert into junk2
values ( 234, 678, null );
commit;
break on cli_num skip 1
select * from junk2;
select inv_num, cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
seq_junk2.nextval seq,
FIRST_VALUE ( NVL ( user_id, seq_junk2.nextval ) )
OVER ( PARTITION BY cli_num ) user_id
from junk2
/
=====================
The final query shows this:
INV_NUM CLI_NUM CHK1 CHK2 SEQ USER_ID
---------- ---------- ---------- -------- ---------- ----------
123 456 999 NULL 1
234 999 NULL 2
123 678 999 NULL 3 2
234 999 NULL 4 2
4 rows selected.
and I'm kinda confused .. it appears that the analytic functions are not "processing" that sequence ... how do sequences and analytics work together?? (if at all??)
(In short, this is a simplified example of a bigger problem I tripped over. I'm trying to assign new user_ids for existing clients, but only want 1 user_id assigned per client. Trick is, each client can be associated with more than 1 investment ... so I have multiple rows with the same client, but I want the same user_id assigned. kind of: "Has this client got an id yet? if not, give him a new one, otherwise display the one he's already been assigned".)
FIRST_VALUE and LAST_VALUE seemed the logical choice ...
The interesting thing is, when I use DBMS_RANDOM.VALUE (to assign a random PIN to start with) ... it works fine, what am I missing/forgetting about sequences that changes their behaviour in this regards?)
August 23, 2005 - 8:56 am UTC
that will be a tricky one, lots of assumptions on orders of rows processed and such.
that should throw an ORA-02287 (sequence number not allowed here) in my opinion.
I cannot see a safe way to do that without writing a plsql function and performing a lookup off to the side by cli_num
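The "lookup off to the side" Tom describes amounts to memoizing: hand out a new id the first time a client is seen and reuse it afterwards. A Python sketch of that idea (in the database this would be a PL/SQL function caching by cli_num; the id values here are made up):

```python
import itertools

seq = itertools.count(1)   # stands in for the Oracle sequence
assigned = {}              # the lookup "off to the side", keyed by cli_num

def user_id_for(cli_num):
    # first time we see a client, draw a new id; after that, reuse it
    if cli_num not in assigned:
        assigned[cli_num] = next(seq)
    return assigned[cli_num]

rows = [(123, 456), (123, 678), (234, 456), (234, 678)]  # (inv_num, cli_num)
result = [(inv, cli, user_id_for(cli)) for inv, cli in rows]
print(result)
```

Unlike the FIRST_VALUE-over-a-sequence attempt, this makes no assumption about the order rows are processed in: every row for a given cli_num gets the same id regardless.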
Sorry, I don't understand ...
Greg, August 23, 2005 - 11:36 am UTC
you wrote:
"that will be a tricky one, lots of assumptions on orders of rows processed and such."
I don't understand what assumptions I'm making ... in my example, I just got 4 rows, I don't care what order they come back in, just so long as it deals with them in "groups of cli_nums" .. (hence the partition by cli_num portion) ... if I "lose" sequence numbers, that's fine, too ... I don't care about gaps in the sequence or "missing userids" ...
The only behaviour I'm seeing, is that the analytic function doesn't seem to be working with the sequence properly ...
I guess I can simplify the question even further:
Why does the following query return "NULL" ?
SQL > select first_value ( seq_junk2.nextval ) over ( )
2 from dual
3 /
------more------
FIRST_VALUE(SEQ_JUNK2.NEXTVAL)OVER()
------------------------------------
1 row selected.
(with a "normal" sequence - nothing fancy):
SQL > select seq_junk2.nextval from dual;
------more------
NEXTVAL
----------
29
1 row selected.
August 24, 2005 - 8:35 am UTC
as i said, i believe it should be raising an error (I have it on my list of things to file when I get back in town).
I cannot make it work, I cannot think of a way to do it in a single statement, short of writing a user defined function.
Connect by with self referenced parent
Joe, August 23, 2005 - 12:30 pm UTC
CONNECT BY works great but I've run into a problem when the ultimate parent is referenced in the parent record. e.g., data looks like:
SQL> select * from t;
OBJ_ID PARENT_ID
---------- ----------
1 1
2 1
3 1
4 2
5 4
But... using connect by generates an error..
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
2 from t
3 connect by t.parent_id = prior t.obj_id;
ERROR:
ORA-01436: CONNECT BY loop in user data
If parent_id is null where obj_id = 1, then it's okay. Any suggestion on how to handle the other case? I'm stumped.
Solution for connect by
Logan Palanisamy, August 23, 2005 - 5:39 pm UTC
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
2 from t
3 connect by t.parent_id = prior t.obj_id and t.parent_id <> t.obj_id;
LEVEL OBJ_ID PARENT_ID
-------------------- ---------- ----------
1 1 1
2 2 1
3 4 2
4 5 4
2 3 1
1 2 1
2 4 2
3 5 4
1 3 1
1 4 2
2 5 4
1 5 4
12 rows selected.
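The effect of that extra `t.parent_id <> t.obj_id` condition is easy to see procedurally. Note the posted query has no START WITH, so every row acts as a root, which is why it returns 12 rows; the Python sketch below adds the equivalent of START WITH obj_id = 1 and walks one tree, skipping the self-reference that would otherwise loop forever:

```python
rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 4)]   # (obj_id, parent_id)

def walk(parent, level, out):
    for obj_id, parent_id in rows:
        # the extra CONNECT BY condition: ignore self-references,
        # otherwise obj 1 (its own parent) loops forever (ORA-01436)
        if parent_id == parent and parent_id != obj_id:
            out.append((level, obj_id, parent_id))
            walk(obj_id, level + 1, out)
    return out

# START WITH obj_id = 1: begin at the self-referencing root
tree = walk(1, 2, [(1, 1, 1)])
for row in tree:
    print(row)
```

Each tuple is (level, obj_id, parent_id); the guard turns the one-node cycle at the root into an ordinary leaf test.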
re:Solution for connect by
Joe, August 24, 2005 - 8:43 am UTC
Thanks Logan. Often the solution is so simple! Thanks.
Seq problem
Bob B, August 24, 2005 - 11:25 am UTC
SELECT
A.*,
seq_junk2.currval CURR_SEQ,
seq_junk2.nextval - ROWNUM + VAL SEQ
FROM (
SELECT
inv_num,
cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
DENSE_RANK() OVER ( ORDER BY CLI_NUM ) VAL
FROM JUNK2
) A
Might be a starting point. It works on the following ASSUMPTION: ROWNUM corresponds to the number of times the sequence has been called. As Tom stated, this assumption can easily go out the window (throw an analytic function or an order by on the outer query for a simple example).
A safer solution might be to run two updates. Update 1 will give a unique id to each null user id. Update 2 will update the user id to the min or max user id for that cli_num. A little overhead, but safer and simpler than the aforementioned alternative.
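Bob's two-update approach can be sketched procedurally as well. This is just an illustration, with a list of lists standing in for the table and made-up id values:

```python
# rows: [inv_num, cli_num, user_id]; one client already has an id (7)
rows = [[123, 456, None], [123, 678, None], [234, 456, None], [234, 678, 7]]

# update 1: give every null user_id a unique id
next_id = 1000
for r in rows:
    if r[2] is None:
        r[2] = next_id
        next_id += 1

# update 2: collapse to one id per client -- min(user_id) per cli_num
per_client = {}
for inv, cli, uid in rows:
    per_client[cli] = min(per_client.get(cli, uid), uid)
for r in rows:
    r[2] = per_client[r[1]]

print(rows)
```

The first pass may hand out more ids than needed (gaps are fine); the second pass makes the assignment deterministic per client, which is the property the sequence-inside-analytics attempt could not guarantee.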
Still confused ... but working on it ...
Greg, August 24, 2005 - 1:42 pm UTC
Thanks, Bob!! Yeah, that does exactly what I wanted it to do, (but still doesn't really explain the "why" part) ...
problem is, it looks like this is more a question on sequences now than analytics, so I'll see if I can find a more appropriate thread to continue this on ..
Thanks!!
A slight twist on lag/lead
Sudha Bhagavatula, September 01, 2005 - 11:08 am UTC
That was useful to me. Could do a lot of queries easily. However I'm stuck at this point.
I have data like this:
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2000
1001 001 2112 01/01/2001 06/30/2001
1001 001 2112 07/01/2001 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 06/30/2004 12/31/9999
I want my output to look like this:
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 06/30/2004 12/31/9999
How do I achieve this ?
September 01, 2005 - 3:49 pm UTC
well, you should start by describing the logic from getting from A to B first.
otherwise it is just text. what are the rules that got you from inputs to outputs.
tell me the procedural algorithm you would use for example.
Rules from A to B
Sudha Bhagavatula, September 02, 2005 - 9:29 am UTC
A member is enrolled in a group for a timeframe. For all contiguous time frames for a group I can take the min(eff_date) and max(term_date). For each break in group a new row with min(eff_date) and max(term_date) again. So say a member was enrolled in a group from 01/01/2001 to 12/31/2001 and then again with the same group from 01/01/2005 to 06/30/2005 then I need 2 rows for this member
with the dates as said just now. This is the sql that I'm running, hopefully I'm on the right track but am stuck at this point:
SELECT SUBR_ID,
DEP_NBR,
GRP,
LAG_EFF_DATE,
LEAD_EFF_DATE,
EFF_DATE,
TERM_DATE,
LAG_TERM_DATE,
LEAD_TERM_DATE,
DECODE( LEAD_GRP, GRP, 1, 0 ) FIRST_OF_SET,
DECODE( LAG_GRP, GRP, 1, 0 ) LAST_OF_SET
FROM (SELECT M.SUBR_ID,
M.DEP_NBR,
LAG(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_GRP,
LEAD(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_GRP,
GRP_NBR||SUB_GRP GRP,
CJ.EFF_DATE,
CJ.TERM_DATE,
LAG(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_EFF_DATE,
LEAD(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_EFF_DATE,
LAG(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_TERM_DATE,
LEAD(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_TERM_DATE
FROM DW.T_MEMBER_GROUP_JUNCTION CJ,
BCBS.T_GROUP_DIMENSION G,
BCBS.T_MEMBER_DIMENSION M
WHERE CJ.GRP_DIM_ID = G.GRP_DIM_ID
AND CJ.MBR_DIM_ID = M.MBR_DIM_ID
AND M.DEP_NBR != '000'
AND G.BENE_PKG IS NOT NULL)
WHERE LAG_GRP IS NULL
OR LEAD_GRP IS NULL
OR LEAD_GRP <> GRP
OR LAG_GRP <> GRP
Thanks for your reply.
September 03, 2005 - 7:15 am UTC
you know, without a table, rows and something more concrete.... I have no comment.
More detail
Sudha Bhagavatula, September 04, 2005 - 10:34 pm UTC
have 3 tables:
Member_dimension
Group_Dimension
Member_Group_Junction
Member_Dimension :- columns are mbr_dim_id, subr_id, dep_nbr
Group dimension :- columns are grp_dim_id, grp_nbr, sub_grp
Member_Group_Junction :- columns are mbr_dim_id, grp_dim_id, eff_date, term_date
I have to create one row for each contiguous dates of enrollment with a new row for a new group or a break in date.
Suppose a member (subr_id = 1001, dep_nbr = 001) is enrolled with a group called 001 from 01/01/2001 till 06/30/2001, he then changes group to 002 for the period 07/01/2001 till 12/31/2001. He enrolls with the same group 002 from 01/01/2002 till 06/30/2002 with a change in benefits. He then gets transferred to some other city or changes jobs. He joins back with the group 001 from 09/30/2003 till 11/30/2003 and quits again. He joins back with the same group 001 from 01/01/2004 till present. The data in the junction table will be like this:
mbr_dim_id grp_dim_id eff_date term_date
1 1 01/01/2001 06/30/2001
1 2 07/01/2001 12/31/2001
1 2 01/01/2002 06/30/2002
1 1 09/30/2003 11/30/2003
1 1 01/01/2004 12/31/9999
My output should be like this:
mbr_dim_id grp_dim_id eff_date term_date
1 1 01/01/2001 06/30/2001
1 2 07/01/2001 06/30/2002
1 1 09/30/2003 11/30/2003
1 1 01/01/2004 12/31/9999
For each change in group or a break in the contiguity of the dates I should get a new row. The junction table is joined to the dimension with the respective dim_ids.
Hope I'm clearer this time.
Thanks
Sudha
September 05, 2005 - 10:11 am UTC
tell you what, see
</code>
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html <code>
it shows a technique, in the "Analytics to the Rescue" article, that will be useful for grouping ranges of records using the LAG() function.
But, you need to read the text that you are supposed to read before putting an example here.
It is something I think I say a lot.
<quote>
If your followup requires a response that might include a query, you had better supply very very simple create tables and insert statements. I cannot create a table and populate it for each and every question. The SMALLEST create table possible (no tablespaces, no schema names, just like I do in my examples for you)
</quote>
that is a direct cut and paste
distinct last_value
Putchi, September 06, 2005 - 4:49 am UTC
When using last_value I am usually only interested in the last value, hence I need a distinct in the select to get it. It gives what I want but it seems that the database has to do the work twice, first a window sort and after that a unique sort. Is there any way to avoid the distinct but still only get one row per partition key?
create table a (num number(2), var1 varchar2(10), var2 varchar2(10));
insert into a values (1,'a','A');
insert into a values (2,'b','A');
insert into a values (3,'c','A');
insert into a values (1,'a','B');
insert into a values (2,'b','B');
insert into a values (3,'c','B');
commit;
SQL> select distinct
2 var2
3 ,last_value(var1) over (partition by var2 order by num
4 rows between unbounded preceding and unbounded following) var1
5 from a;
VAR2 VAR1
---------- ----------
A c
B c
Execution plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (UNIQUE)
2 1 WINDOW (SORT)
3 2 TABLE ACCESS (FULL) OF 'A'
September 06, 2005 - 8:31 am UTC
nope, analytics are not aggregates, aggregates are not analytics.
A trick you can use to skip one or the other step is:
ops$tkyte@ORA817DEV> select var2,
2 substr( max(to_char( num,'fm0000000000') || var1), 11 ) data
3 from a
4 group by var2
5 /
VAR2 DATA
---------- -----------
A c
B c
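The trick works because the fixed-width to_char makes string order agree with numeric order, so the var1 value "rides along" with the max. A Python sketch of the same pad, max, substr steps on the table a data:

```python
rows = [(1, "a", "A"), (2, "b", "A"), (3, "c", "A"),
        (1, "a", "B"), (2, "b", "B"), (3, "c", "B")]   # (num, var1, var2)

best = {}
for num, var1, var2 in rows:
    # zero-pad num so string comparison orders like numeric comparison,
    # then keep the max concatenation -- the to_char(...,'fm0000000000')||var1 trick
    key = f"{num:010d}{var1}"
    if var2 not in best or key > best[var2]:
        best[var2] = key

# substr(..., 11): strip the padded number, keeping the var1 that rode along
result = {var2: s[10:] for var2, s in best.items()}
print(result)
```

This is a single aggregation pass, which is why it avoids the WINDOW SORT + SORT UNIQUE pair in the plan above.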
Analytics to the rescue
Sudha Bhagavatula, September 06, 2005 - 11:28 am UTC
Read that article. Helped me, but now I have another twist.
Create table contracts (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)
insert into contracts values ('1001', '001', '2112', to_date('01/01/2000','mm/dd/yyyy'), to_date('12/31/2000','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2001','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('07/01/2001','mm/dd/yyyy'), to_date('12/31/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '7552', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/2003','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2004','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
I ran this query to identify breaks in groups and dates for the above table:
select subr_id, dep_nbr, grp,
min_eff_date,
max_term_date
from
(select subr_id, dep_nbr, grp,
min(eff_date) min_eff_date,
max(term_date) max_term_date
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
max(rn)
over(partition by subr_id, dep_nbr order by eff_date) max_rn
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
(case
when eff_date-lag_term_date > 1
or lag_term_date is null
or lag_grp_nbr is null
or lag_grp_nbr <> grp
then row_num
end) rn
from (
select subr_id, dep_nbr, eff_date, term_date, grp_nbr grp,
lag(term_date)
over (partition by subr_id, dep_nbr order by eff_date) lag_term_date,
lag(grp_nbr||sub_grp)
over (partition by subr_id, dep_nbr order by eff_date) lag_grp_nbr,
row_number()
over (partition by subr_id, dep_nbr order by eff_date) row_num
from contracts )))
group by subr_id, dep_nbr, grp, max_rn )
order by subr_id, dep_nbr, min_eff_date
This gave me the output as :
subr_id dep_nbr grp eff_date term_date
1001 001 2112 01/01/2000 12/31/2001
1001 001 7552 01/01/2003 12/31/2003
1001 001 2112 01/01/2004 12/31/9999
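In procedural terms, that query is doing island detection: start a new group whenever the group number changes or the dates leave a gap of more than one day, then take min(eff_date)/max(term_date) per island. A plain-Python sketch of that logic over the contracts data above, just to make the lag-and-running-max mechanics visible:

```python
from datetime import date, timedelta

# (grp, eff_date, term_date), already ordered by eff_date
rows = [
    ("2112", date(2000, 1, 1), date(2000, 12, 31)),
    ("2112", date(2001, 1, 1), date(2001, 6, 30)),
    ("2112", date(2001, 7, 1), date(2001, 12, 31)),
    ("7552", date(2003, 1, 1), date(2003, 12, 31)),
    ("2112", date(2004, 1, 1), date(9999, 12, 31)),
]

islands, prev = [], None
for grp, eff, term in rows:
    # a new island starts when the group changes or the dates have a gap
    new_island = (prev is None or grp != prev[0]
                  or eff - prev[2] > timedelta(days=1))
    if new_island:
        islands.append([grp, eff, term])
    else:
        islands[-1][2] = max(islands[-1][2], term)   # extend max(term_date)
    prev = (grp, eff, term)

for i in islands:
    print(i)
```

The CASE/row_number in the SQL marks the island starts and the max(rn) over (... order by eff_date) propagates the start marker forward; the loop above does both at once.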
I now have another table :
create table contract_pcp_junction (subr_id varchar2(15), dep_nbr varchar2(3), pcp_id varchar2(12), eff_date date, term_date date)
insert into contract_pcp_junction values('1001','001','123765', to_date('07/01/2000','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contract_pcp_junction values('1001','001','155165', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
This table identifies the provider coverage for each member. I need to identify the breaks in coverage with regards to the contracts.
Now as per the data above this member does not have a pcp from 01/01/2000 to 06/30/2000 and again from 07/01/2001 to 12/31/2001.
I need to insert the breaks into another table. This table needs to have the subr_id, dep_nbr, grp and eff_date, term_date.
create table contract_pcp_breaks (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)
This table needs to have the data for the breaks
subr_id dep_nbr grp_nbr eff_date term_date
1001 001 2112 01/01/2000 06/30/2000
1001 001 2112 07/01/2001 12/31/2001
How do I do that? Hopefully I have given you the necessary scripts to work with.
Thanks a lot for your patience with this.
--Sudha
September 06, 2005 - 8:51 pm UTC
yah, I have scripts, but no real idea how these tables relate. Your query looks overly complex for the single table.
can't you take your data, join it, and get some "flat relation" on which simply using lag() will solve the problem?
(please remember, you have been looking at this for hours. To you this data is natural. to everyone else, it is just bits and bytes on the screen)
Combining two tables
Putchi, September 09, 2005 - 6:39 am UTC
Hi Tom!
I want to combine from/to history values from two tables into one sequence like this:
create table a (a varchar2(2)
,from_date date
,to_date date);
create table b (b varchar2(2)
,from_date date
,to_date date);
insert into a ( a, from_date, to_date ) values (
'a1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date('02/10/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a2', to_date( '02/10/2005', 'mm/dd/yyyy'), to_date( '05/01/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a3', to_date( '05/01/2005', 'mm/dd/yyyy'), to_date( '08/12/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date( '01/22/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b2', to_date( '01/22/2005', 'mm/dd/yyyy'), to_date( '04/01/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b3', to_date( '04/01/2005', 'mm/dd/yyyy'), to_date( '09/07/2005', 'mm/dd/yyyy'));
commit;
select * from ("Magic");
A B FROM_DATE TO_DATE
-- -- ---------- ----------
a1 b1 2005-01-13 2005-01-22
a1 b2 2005-01-22 2005-02-10
a2 b2 2005-02-10 2005-04-01
a2 b3 2005-04-01 2005-05-01
a3 b3 2005-05-01 2005-08-12
Is it possible?
September 09, 2005 - 8:30 am UTC
ops$tkyte@ORA10G> select a.* , b.*,
2 greatest(a.from_date,b.from_date),
3 least(a.to_date,b.to_date)
4 from a, b
5 where a.from_date <= b.to_date
6 and a.to_date >= b.from_date;
A FROM_DATE TO_DATE B FROM_DATE TO_DATE GREATEST( LEAST(A.T
-- --------- --------- -- --------- --------- --------- ---------
a1 13-JAN-05 10-FEB-05 b1 13-JAN-05 22-JAN-05 13-JAN-05 22-JAN-05
a1 13-JAN-05 10-FEB-05 b2 22-JAN-05 01-APR-05 22-JAN-05 10-FEB-05
a2 10-FEB-05 01-MAY-05 b2 22-JAN-05 01-APR-05 10-FEB-05 01-APR-05
a2 10-FEB-05 01-MAY-05 b3 01-APR-05 07-SEP-05 01-APR-05 01-MAY-05
a3 01-MAY-05 12-AUG-05 b3 01-APR-05 07-SEP-05 01-MAY-05 12-AUG-05
It won't be blindingly fast on huge things I would guess...
Putchi, September 09, 2005 - 9:14 am UTC
OK, I will try it; the real tables will have hundreds of thousands of records. I tried this myself, but I couldn't come up with something that filled in the "null" values.
SQL> select a,b,from_date,lead(from_date) over (order by from_date)
2 from (
3 select a,null b,from_date,to_date from a
4 union all
5 select null a,b,from_date,to_date from b
6 order by from_date
7 );
A B FROM_DATE LEAD(FROM_
-- -- ---------- ----------
a1 2005-01-13 2005-01-13
b1 2005-01-13 2005-01-22
b2 2005-01-22 2005-02-10
a2 2005-02-10 2005-04-01
b3 2005-04-01 2005-05-01
a3 2005-05-01
September 09, 2005 - 9:36 am UTC
that query won't work -- you need to join.
How to get the 1st row of a distinct value in a single SELECT
Sean Chang, September 16, 2005 - 11:48 am UTC
Thank you, Tom.
I have been reading about analytic functions for a while, but still
can't figure out a way to select the first row per distinct
column value in a single SELECT statement, i.e.
>> by running the Create and Insert below
create table INV (
inv# number(7),
add_time date ,
inv_type varchar2(10),
amount number(8,2));
insert into inv values(1, sysdate-1, 'CASH', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(1, sysdate+1, 'COD', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(2, sysdate, 'MC', 10);
insert into inv values(3, sysdate-1, 'AMEX', 30);
insert into inv values(3, sysdate, 'CASH', 30);
I can get the first row of distinct INV# this way:
select * from (select a.*,
rank() over (partition by inv# order by add_time) time_order
from inv a) where time_order=1;
But how can I achieve this in a single SELECT statement?
The reason is that we have lots of tables where we only need to
look at the very first row for the same column value, and I
don't want to end up with lots of in-line views in the SELECT
statement.
September 16, 2005 - 1:59 pm UTC
that is a single select.
why not? (on the lots of in-line views). If you think they are evil - then you wouldn't like my code ;)
Is an analytic function fitting in this situation?
A reader, October 03, 2005 - 10:29 am UTC
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
October 03, 2005 - 11:29 am UTC
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
should be a single query without UNIONs - you don't need to make three passes over that data
select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)
Great!
A reader, October 03, 2005 - 4:05 pm UTC
Tom,
When I put in the changes, it says "missing keyword". What am I doing wrong?
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id))
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
October 03, 2005 - 8:57 pm UTC
sorry, I am not a sql compiler, I cannot reproduce since I don't have the tables or anything.
Case when ... then ... end
Greg, October 04, 2005 - 8:26 am UTC
Just lucked out and saw this:
"select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"
Should be:
"select ..., count(distinct case when damage_code = 'A' then gate_id end),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"
Tom just missed the "end" for the case statement ... (I got lucky and spotted it .. heh)
October 04, 2005 - 4:25 pm UTC
(that is why i always ask for create tables and inserts - without them, it is not possible to test)
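To spell out the corrected pattern in full (the table and column names below follow the thread, but the query is an illustrative sketch, not something tested against the real tables):

```sql
-- One pass replaces the three UNION ALL branches: each CASE returns the
-- gate_id only for the damage type being counted and NULL otherwise,
-- and COUNT(DISTINCT ...) simply ignores the NULLs.
select trunc(c.damage_inspection_date) damage_inspection_date,
       c.damage_inspection_by,
       count(distinct case when d.damage_type_code = 'A' then c.gate_id end) type_a_cnt,
       count(distinct case when d.damage_type_code = 'F' then c.gate_id end) type_f_cnt,
       count(distinct c.gate_id) total_cnt
from gate_containers c, gate_damages d
where c.gate_id = d.gate_id
group by trunc(c.damage_inspection_date), c.damage_inspection_by;
```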
thanks!!
A reader, October 04, 2005 - 2:24 pm UTC
Well Taken
A reader, October 05, 2005 - 10:53 am UTC
Tom,
This is what I would like to see..
damage_inspection_date damage_inspection_by counts
xx/xx/xxxx Louis 2 minors
xx/xx/xxxx juan 1 major
thanks.
can analytics help me?
Susan, October 05, 2005 - 2:41 pm UTC
My result set must be ordered by the sum of multiple columns, with weights assigned to the columns. The SQL below works and gives me what I want, but maybe there is an analytic function solution? Thanks for all your help.
SELECT ename, job, sal, comm FROM scott.BONUS
ORDER BY DECODE(job, -2, 0, job)*100000+DECODE(sal, -2, 0, sal)*10000+DECODE(comm, -2,0,comm)*100 DESC
October 05, 2005 - 3:05 pm UTC
not in this case - you want to order by a simple function of attributes of a single row.
You don't need to look across rows - analytics look across rows.
Thanks Tom
Susan, October 05, 2005 - 3:58 pm UTC
Thanks for your reply. Do you agree with the DECODE approach or am I missing a more elegant solution?
October 05, 2005 - 8:23 pm UTC
the decode looks fine here - shorter than case but in this "case" just as easy to read.
Tom
A reader, October 05, 2005 - 4:25 pm UTC
Tom,
Can you please point in the right direction...
This is what I am getting with the following query...
damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR
and this is what I would like to get....
damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT ab.damage_inspection_date,
damage_inspection_by,
STATUS_CODE,
count(distinct case when ac.damage_location_code = 'A' then ab.gate_id end),
count(distinct case when ac.damage_location_code = 'F' then ab.gate_id end),
count(distinct ab.gate_id )
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by ab.damage_inspection_date,ab.damage_inspection_by,status_code, ab.gate_id))a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
group by (b.damage_inspection_date, b.damage_inspection_by,b.status)
October 05, 2005 - 8:35 pm UTC
....
damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR
and this is what I would like to get....
damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1
...
by what "logic"? can you explain how you get from A to B?
follow up
A reader, October 06, 2005 - 9:48 am UTC
Tom,
I already got the first part done. All I need now is the count in another column: how many minor, major and total I have. Is that possible?
Just maybe like in the second example.
October 06, 2005 - 11:54 am UTC
first part of WHAT?
more information
A reader, October 06, 2005 - 12:47 pm UTC
Sorry about the lack of information before.
Here I will try to do better. I am trying to write
a query where I need to count the major and minor
damages and then get a total.
requirements:
1. if there is a container with majors and minors, total the
counts: major + minor = total count
2. where a container has a minor and no major, count only the minor:
count = minor
inspector major minor total
1 major, 0 minor , other 1 1
inspector
2 major , 1 minor , other 2 1 3
inspector
0 major, 1 minor, other 0 1 1
October 06, 2005 - 1:25 pm UTC
sorry -- going back to your original example, I still cannot see the logic behind "what I have" and "what I want" there.
I don't know what you mean by "i have the first part"
this what I have now
A reader, October 06, 2005 - 2:11 pm UTC
Tom,
This is my query and result...
select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);
RESULT:
SQL Statement which produced this data:
select * from MAJOR_MINOR_COUNT_VIEW
where rownum < 10
6/12/2004 CCCT TOTAL 1
6/12/2004 CRAIG TOTAL 6
6/13/2004 CCCT TOTAL 5
6/14/2004 CCCT TOTAL 46
6/14/2004 FYFE TOTAL 30
6/14/2004 HALM TOTAL 38
6/14/2004 MUTH MAJOR 2
6/14/2004 MUTH MINOR 14
6/14/2004 MUTH TOTAL 40
AND I WOULD LIKE TO HAVE IT AS
THE REQUIREMENTS ABOVE... HOPE THIS HELPS.
October 06, 2005 - 2:57 pm UTC
take your query - call it Q
select inspector,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (Q)
group by inspector
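A minimal, self-contained illustration of that max(decode()) pivot, using a made-up inline row set rather than the thread's tables:

```sql
-- Pivot status rows into columns: one output row per inspector.
-- DECODE returns cnt only for the matching status (NULL otherwise);
-- MAX then collapses the three rows into one.
select inspector,
       max(decode(status, 'MINOR', cnt)) minor,
       max(decode(status, 'MAJOR', cnt)) major,
       max(decode(status, 'TOTAL', cnt)) total
from ( select 'MUTH' inspector, 'MAJOR' status,  2 cnt from dual
       union all
       select 'MUTH', 'MINOR', 14 from dual
       union all
       select 'MUTH', 'TOTAL', 40 from dual )
group by inspector;

-- INSPECTOR  MINOR  MAJOR  TOTAL
-- MUTH          14      2     40
```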
Year to dt + month to date
reader, October 06, 2005 - 2:35 pm UTC
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id, TO_CHAR(invoice_dt, 'MON-YYYY') ORDER BY TO_CHAR(sale_dt, 'MON-YYYY') ASC) mon_sal,
SUM (sale_price_usd * qty_sold)
OVER (PARTITION BY id, TO_CHAR(sale_dt, 'YYYY') ORDER BY TO_CHAR(sale_dt, 'YYYY') ASC) yr_sal,
FROM test
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 237
aa 14-OCT-05 121.5 164.75 237
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 1298.25
bb 13-OCT-05 235.25 600.5 1298.25
bb 15-OCT-05 365.25 600.5 1298.25
bb 14-NOV-05 465.25 465.25 1298.25
7 rows selected.
Ideally ,it should have been ----
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25
How can I do this ?
Will appreciate your help .
THANKS
October 06, 2005 - 3:02 pm UTC
ideally - there would be a qty_sold column somewhere :)
ideally you will ONLY use to_char to *format* data, never to process it.
trunc(invoice_dt,'y') NOT to_char(invoice_dt,'yyyy')
trunc(sale_dt,'mm') NOT to_char(sale_dt, 'MON-YYYY' )
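The distinction matters because trunc() keeps the value a DATE (so partitions compare and sort chronologically), while to_char() produces a string that sorts alphabetically ('APR-2005' before 'JAN-2005'). A sketch of the corrected partitioning against the TEST table above:

```sql
-- trunc(sale_dt,'mm') yields the first day of the month as a DATE,
-- trunc(sale_dt,'y') the first day of the year, so each partition is
-- a real calendar period and ORDER BY sale_dt makes the SUM cumulative.
select id, sale_dt,
       sum(amount) over (partition by id, trunc(sale_dt,'mm')
                         order by sale_dt) mon_sal,
       sum(amount) over (partition by id, trunc(sale_dt,'y')
                         order by sale_dt) yr_sal
from test;
```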
Year to Date and Month to date
READER, October 06, 2005 - 10:04 pm UTC
As per your suggestion, I made the changes, but I still need your help.
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'MM') ORDER BY trunc(sale_dt,'MM') ASC) mon_sal,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'Y') ORDER BY trunc(sale_dt,'Y') ASC) yr_sal
FROM test
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------------------- ---------- ---------- ----------
aa 9/15/2005 72.25 72.25 237
aa 10/14/2005 193.75 164.75 237
aa 10/19/2005 237 164.75 237
bb 9/14/2005 232.5 232.5 1298.25
bb 10/13/2005 467.75 600.5 1298.25
bb 10/15/2005 833 600.5 1298.25
bb 11/14/2005 1298.25 465.25 1298.25
7 rows selected
Ideally ,it should have been ----
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25
Thanks again .
October 07, 2005 - 8:13 am UTC
you will have to explain how you derived your "optimal" output.
It certainly isn't sorted by anything, and I don't get the numbers.
Year to date /Month to date
Reader, October 07, 2005 - 9:49 am UTC
I wish to create a summary table where we will have the sale for every day, the sale up to that day in that month, and the sale up to that day in that year,
i.e. a running or cumulative total.
Thanks
October 07, 2005 - 8:22 pm UTC
ok?
Follow up
A reader, October 07, 2005 - 9:52 am UTC
Tom,
The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this..
major minor count
1 major, 0 minor , other 1 1
2 major , 1 minor , other 2 2
0 major, 1 minor, other 0 1 1
* count the minor when there is no major
CREATE TABLE GATE_CONTAINERS
(
GATE_ID NUMBER ,
VISIT NUMBER ,
REFERENCE_ID NUMBER ,
DAMAGE_INSPECTION_BY VARCHAR2(30),
DAMAGE_INSPECTION_DATE DATE,
)
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(1, 1);
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(17, 10);
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(21, 12);
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(31, 18);
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(33, 19);
Insert into GATE_TBL
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(36, 22, TO_DATE('06/12/2004 11:48:49', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_TBL
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(37, 23, TO_DATE('06/12/2004 11:50:11', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_TBL
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(39, 25, TO_DATE('06/12/2004 11:48:19', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_TBL
(GATE_ID, VISIT)
Values
(45, 30);
COMMIT;
CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into damages_tbl
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
COMMIT;
October 07, 2005 - 8:35 pm UTC
The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this..
major minor count
1 major, 0 minor , other 1 1
2 major , 1 minor , other 2 2
0 major, 1 minor, other 0 1 1
so tell me why there are minor counts when major > 0???
and this is my query
A reader, October 07, 2005 - 9:53 am UTC
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
I got it....
A reader, October 07, 2005 - 11:16 am UTC
Tom,
I got it....I just had to put the following. Let me know
what you think? If you have any suggestions!
Thanks for all your patience...
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,z.damage_type_code,
( case z.damage_type_code
when 'F' then 0
ELSE Count(distinct g.gate_id)
end ) CNT
--- count(distinct g.gate_id) cnt
from gate_containers g,gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
Year to Date and Month to date
Tim, October 07, 2005 - 11:52 pm UTC
Just a guess - could this be what you're looking for?
SELECT DISTINCT id,sale_dt
,SUM (amount) OVER
(PARTITION BY id, sale_dt ORDER BY id ASC, sale_dt ASC) sale_daily
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM TEST
ORDER BY id, sale_dt
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.75
bb 15-OCT-05 365.25 600.5 833
bb 14-NOV-05 465.25 465.25 1298.25
-- another variation
SELECT a.*
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM
(
SELECT id,sale_dt
,SUM(amount) sale_daily
FROM TEST
GROUP BY id, sale_dt
) a
EXCELLENT !
reader, October 08, 2005 - 12:24 am UTC
Thanks !
to answer your question
A reader, October 11, 2005 - 12:54 pm UTC
The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this..
major minor count
1 major, 0 minor , other 1 1
2 major , 1 minor , other 2 2
0 major, 1 minor, other 0 1 1
so tell me why there are minor counts when major > 0???
because when there are majors and minors I want to count
only the majors. When there is just a minor and no
major, I want to count the minor. Those are the only 2 situations there should be.
How can I ignore some selected columns in my group by?
Neil, October 12, 2005 - 3:14 am UTC
Tom,
I have a set of data that is recorded daily and I want to
compress it; so this:
87654321 1 5 21-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 22-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 23-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 24-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 25-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 26-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 27-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 28-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 31-AUG-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 01-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 02-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 03-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 04-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 05-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 06-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 07-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 08-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 09-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 10-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 11-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 12-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 13-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 14-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 15-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 16-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 17-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 18-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 19-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 20-SEP-2005 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10
87654321 1 5 21-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 22-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 23-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 24-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 25-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 26-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
Needs to be converted into this:
87654321 1 5 21-AUG-2005 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 12-SEP-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 13-SEP-2005 19-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 20-SEP-2005 01-JAN-4000 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10
The column of interest is the 7th one. Whenever it changes,
I want to create a new row beginning with the day's date,
and ending either on the day before the next change, or, if
there is no next change (LEAD analytic function), substitute
in 01-JAN-4000 to show that this is the current amount.
The problem is, I need to ignore the other figures in columns 5 & 6 and 8 & 9. If I group by all the columns, I get separate
entries for these lines. That's my stumbling block - I have got close with analytics, but so far, no cigar!
I'm on 8.1.7, although I'd be interested in solutions possible in
later versions, too. If you think this is possible, I can paste a create table and SQL*Loader script here, but it would detract from the post: It's a bit of a mess anyway - if only AskTom allowed 132 columns :)
T.I.A
October 12, 2005 - 7:26 am UTC
"i want to create a new row" - that is hard as analytics don't "create rows", they just don't "squish them out" like an aggregate wou.d
make the example smaller - you don't need all of the columns, seems two or three might suffice. show the table, the data (via inserts) and the expected output if you like.
Maybe this should be a GROUP BY question, then
Neil, October 13, 2005 - 4:12 am UTC
OK - here's the table creation scripts and a couple of loader files.
My goal is to create a SQL statement to change the old data into the new.
I can use analytics to give me the start and end dates, but my problem is that I wish to ignore the actin, actout, expin and expout columns and concentrate on the act column. When it changes, I want to take the row, and give it an end date of the day before the date on which it changes again, or the default date of 01-JAN-4000 if no such row exists.
If I could just partition by the earliest date and the latest date where the act figure is the same within serial, volume and part, I could pick off the FIRST and the LAST and use LAG and LEAD to work out the dates...
CREATE TABLE t_old (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,ASAT DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_old
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,ASAT
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 25-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 26-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 31-AUG-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 01-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 02-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 04-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 05-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 1 5 07-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 08-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 14-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 15-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 16-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 420 5 17-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 18-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 19-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 20-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 21-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 23-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 664 5 28-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 0 -2750000 1478550 0 -2750000 1478550
DEPOT1 08226111 664 5 31-AUG-2005 0 0 1478550 0 0 1478550
DEPOT1 08226111 664 5 01-SEP-2005 0 0 1478550 0 0 1478550
CREATE TABLE t_new (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,FROM_D DATE
,UNTIL_D DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_new
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,FROM_D
,UNTIL_D
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 01-JAN-4000 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 02-SEP-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 05-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 01-JAN-4000 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 15-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 16-SEP-2005 18-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 19-SEP-2005 21-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 01-JAN-4000 0 -555555 0 0 -555555 0
DEPOT1 08226111 664 5 28-AUG-2005 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 01-JAN-4000 0 -2750000 1478550 0 -2750000 1478550
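One way to derive t_new from t_old is the classic "start of group" technique: flag each row whose ACT differs from the previous row's ACT, keep only those flagged rows, and use LEAD to find each group's end date. This is only a sketch following the stated rule (groups reset within depot, serial, volume and part) - a couple of rows in the expected output above don't quite follow that rule, so treat the details as assumptions:

```sql
select depot, serial, volume, part,
       asat from_d,
       -- end date = day before the next group starts, or the 01-JAN-4000 default
       nvl( lead(asat) over (partition by depot, serial, volume, part
                             order by asat) - 1,
            to_date('01-JAN-4000','DD-MON-YYYY') ) until_d,
       actin, actout, act, expin, expout, expd
  from (select t.*,
               -- 1 = ACT changed versus the prior row (or this is the first row)
               decode( act,
                       lag(act) over (partition by depot, serial, volume, part
                                      order by asat),
                       0, 1 ) new_grp
          from t_old t)
 where new_grp = 1;
```

The WHERE clause filters down to group-start rows before the outer LEAD is evaluated, so LEAD naturally sees the next group's start date.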
Some Help needed!
A reader, October 18, 2005 - 10:54 am UTC
Tom,
How can I count the double moves as 1? For example,
in the case of 1690371.
I want to count
1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L
I want to count "A" AS ONE MOVE using this query
select trunc(g.damage_inspection_date) damage_inspection_date,g.damage_inspection_by, 'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and g.DAMAGE_INSPECTION_BY = 'COLUMBO'
and trunc(G.damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
1690355 59 A
1690355 59 E
1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A
1690405 71 I
1690433 71 A
1690433 71 I
1690486 54 F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A
1690540 59 E
1690636 63 A
1690636 63 X
1690781 67 X
One solution
A reader, October 19, 2005 - 9:29 am UTC
Tom,
Can decode work here...
decode(count(distinct g.gate_id),'A','F',0,NULL)
October 19, 2005 - 9:45 am UTC
I didn't really understand the question right above, nor did I see any table creates or inserts, so I sort of ignored it...
More information
A reader, October 19, 2005 - 10:44 am UTC
create table gate_containers
(gate_id number,
action varchar2(5),
damage_inspection_date date,
damage_inspection_by varchar2(30));
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686439, 'RNC', TO_DATE('06/14/2005 11:16:16', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688372, 'RNC', TO_DATE('06/14/2005 13:26:59', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688374, 'RNC', TO_DATE('06/14/2005 13:27:08', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688235, 'RNC', TO_DATE('06/14/2005 13:18:15', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688609, 'RNC', TO_DATE('06/14/2005 13:43:35', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686827, 'RNC', TO_DATE('06/14/2005 11:42:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688508, 'RNC', TO_DATE('06/14/2005 13:36:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686044, 'RNC', TO_DATE('06/14/2005 10:50:47', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685720, 'RNC', TO_DATE('06/14/2005 10:27:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686276, 'RNC', TO_DATE('06/14/2005 11:05:23', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
);
--
--SQL Statement which produced this data:
-- SELECT * FROM GATE_DAMAGES
-- WHERE ROWNUM < 20
--
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '11', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '22', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '24', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '11', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '22', 'D');
COMMIT;
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686279, 'RNC', TO_DATE('06/14/2005 11:05:34', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686285, 'RNC', TO_DATE('06/14/2005 11:05:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685831, 'RNC', TO_DATE('06/14/2005 10:36:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685417, 'RNC', TO_DATE('06/14/2005 10:06:00', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685579, 'RNC', TO_DATE('06/14/2005 10:17:18', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685828, 'RNC', TO_DATE('06/14/2005 10:34:44', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686007, 'RNC', TO_DATE('06/14/2005 10:47:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686131, 'RNC', TO_DATE('06/14/2005 10:56:42', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688019, 'RNC', TO_DATE('06/14/2005 13:05:56', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
COMMIT;
Tom,
Let me see if I can explain it better now. I am looking for
a report like this:
damages_inspection_date damages_inspection_by cnt minor major total
6/12/2005 MUTH XX XXX XX XX
requirements
MINOR = A
MAJOR = F
TOTAL != 'C'
Also, very importantly, when there is both a major and a minor
I should only count the major, ignoring the minor.
October 19, 2005 - 12:34 pm UTC
sorry - but I'll need much more "text" than that. Remember, I haven't been staring at these tables for hours/days, I'm not familiar with your vernacular, I don't know what problem you are trying to solve.
spec it out like we used to in the olden days - someone wrote spec (requirements) and someone else might have written the code from the spec.
follow up
A reader, October 19, 2005 - 2:51 pm UTC
GATE_ID DAMAGE_TYPE_CODE
1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
I think this is a good example. In this case I got
A = MINOR DAMAGES
F = MAJOR DAMAGES
TOTAL = NOT EQUAL TO C ( !C)
1. If you look at it closely you can see for every
gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371
2 When you have multiples MINORS (Fs)
Like gate_id = 1690486 count them as one
3 When you have gate_id with F and A like
Gate_id = 1687912 then just Count F(MAJORS)
damages_inspection_date damages_inspection_by cnt minor major Total
6/12/2005 MUTH XX XXX XXX XX
October 19, 2005 - 4:30 pm UTC
you seem to be using F as major:
F = MAJOR DAMAGES
but also as minor:
When you have multiples MINORS (Fs)
sorry, I'm not being a "hard whatever", I'm not getting it. step back, pretend you were trying to explain this to your mom.
this is what I got so far
A reader, October 19, 2005 - 3:02 pm UTC
Tom,
This is what I got so far, but the query is
not following the rule for the MINOR damages.
----------------------------------------
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
SORRY!!
A reader, October 19, 2005 - 3:05 pm UTC
COPIED THE WRONG QUERY...
This is what I got so far but the query
is not following the rules with the MINOR
AS stated above.
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by
follow up
A reader, October 19, 2005 - 4:47 pm UTC
1. If you look at it closely you can see for every
gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371
2 When you have multiples MINORS (Fs)
Like gate_id = 1690486 count them as one
3 When you have gate_id with F and A ALIKE
Gate_id = 1687912 then just Count F(MAJORS)
F FOR MAJOR
A FOR MINOR
SOMETIMES IN THE RECORD WILL HAVE MAJOR AND MINORS
I JUST WANT TO COUNT THE MAJOR AND IGNORE THE MINORS.
I HAVE GIVEN 3 EXAMPLES OF THE RULES....DON'T KNOW
WHAT ELSE TO SAY...ALSO PLEASE LOOK AT THE QUERY
IT'S ALL IN THE UNION. THE ONLY PROBLEM THAT I HAVE
IS THAT I AM COUNTING THE MAJOR AND THE MINORS IN
THE MINOR UNION.
October 19, 2005 - 4:57 pm UTC
1) Ok, I'm looking at that gate id:
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2) Ok, I'm looking at that gate id:
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get it?
3) Ok, I'm looking at that:
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
and we are back to F being a major, not a minor again?
So, no, I don't get it, it is not clear, you can shout loud, but it won't matter.
(am I the only one not really following this??)
ANOTHER EXAMPLE!
A reader, October 19, 2005 - 4:55 pm UTC
1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X
A 12
F 9
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
result from the query:
damage_inspection_date
6/14/2005
damage_inspection_by
COLUMBO
status
MINOR
CNT
9 --------IT SHOULD BE 8 WHY? BECAUSE I WANT TO COUNT
THE "A"(MINORS) UNIQUELY,DISTINCTLY. IN OTHER WORDS
follow up
A reader, October 19, 2005 - 5:15 pm UTC
) Ok, I'm looking at that gate id:
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
*****this count as one A = MINOR
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2) Ok, I'm looking at that gate id:
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
*******THIS COUNT AS ONE F = MAJOR
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the
primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get
it?
3) Ok, I'm looking at that:
1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X
***** IN THIS CASE IGNORE THE MINOR(A) AND COUNT JUST THE MAJOR(F)
****THE FAR LETTER ON THE RIGHT IS TO SHOW YOU HOW I AM
CALCULATING WHAT IS ON THE RIGHT...
October 19, 2005 - 7:44 pm UTC
so, by gate_id compute how many A's and how many F's
select gate_id, count(case when col='A' then col end) A,
count(case when col='F' then col end) F
from t;
now you have gate_id and a count of A's and F's
call that Q
select ...
from (Q);
use CASE to look at A and F and return whatever you want.
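Filled in, that outline might look like this - a sketch only; the table and column names come from the creates above, and the MAJOR/MINOR labels are assumptions based on the rules stated in the thread (any F counts the gate once as major; otherwise any A counts it once as minor):

```sql
select gate_id,
       case when cntf > 0 then 'MAJOR'   -- at least one F: major, regardless of A's
            when cnta > 0 then 'MINOR'   -- A's only: minor
       end status
  from (select gate_id,
               count(case when damage_type_code = 'A' then 1 end) cnta,
               count(case when damage_type_code = 'F' then 1 end) cntf
          from gate_damages
         group by gate_id);
```

Each gate_id comes out exactly once, so a simple COUNT by status at the next level gives the distinct-gate counts the report needs.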
Year to date Business Day
Reader, October 19, 2005 - 6:48 pm UTC
I am trying to calculate the number of days we did business, i.e. sold anything, and then a running total for the year.
CREATE TABLE TEST (ID VARCHAR2(10), sale_dt DATE, amount NUMBER(6,2));
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
COMMIT;
SELECT a.*
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mtd_day_of_business
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) ytd_day_of_business
FROM
(
SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
GROUP BY ID, sale_dt
) a
ID SALE_DT SALE_DAILY MON_SAL YR_SAL MTD_DAYOFBUS YTD_DAYOFBUS
---------- --------- ---------- ---------- ---------- ------------ ------------
aa 15-SEP-05 216.75 216.75 216.75 1 1
aa 14-OCT-05 299.25 299.25 516 1 2
aa 19-OCT-05 129.75 429 645.75 2 3
bb 14-SEP-05 697.5 697.5 697.5 1 1
bb 13-OCT-05 705.75 705.75 1403.25 1 2
bb 15-OCT-05 1095.75 1801.5 2499 2 3
bb 14-NOV-05 1395.75 1395.75 3894.75 1 4
Ideally, the business days are Sep 14 and 15; Oct 13, 14, 15 and 19; and Nov 14.
So the year count should be 7, and the monthly count should be Sep 1,2; Oct 1,2,3,4; and Nov 1.
Can this be done using an analytic function, or is there another way?
Thanks
October 19, 2005 - 7:56 pm UTC
you mean like this?
ops$tkyte@ORA10GR1> SELECT a.*
2 ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
3 ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
4 ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
5 ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
6 ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM') order by sale_dt )
7 mtd_day_of_business
8 ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y') )
9 ytd_day_of_business
10 FROM
11 (
12 SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
13 GROUP BY ID, sale_dt
14 ) a
15 order by sale_dt
16 /
ID SALE_DT SALE_DAILY MON_SAL YR_SAL MTD_DAY_OF_BUSINESS YTD_DAY_OF_BUSINESS
---------- --------- ---------- ---------- ---------- ------------------- -------------------
bb 14-SEP-05 2092.5 2092.5 2092.5 1 7
aa 15-SEP-05 650.25 650.25 650.25 2 7
bb 13-OCT-05 2117.25 2117.25 4209.75 1 7
aa 14-OCT-05 1093.5 1093.5 1743.75 2 7
bb 15-OCT-05 3287.25 5404.5 7497 3 7
aa 19-OCT-05 389.25 1482.75 2133 4 7
bb 14-NOV-05 4187.25 4187.25 11684.25 1 7
7 rows selected.
Year to date ..
Reader, October 20, 2005 - 12:27 am UTC
The months are fine .
But the year count should increment, i.e. 1,2,3,4,5,6,7;
right now it shows the 7th business day for all transactions.
Thanks
October 20, 2005 - 8:06 am UTC
I did that, because you asked for that.
... So the year count should be 7 and the monthly count should be sep 1,2 oct
1,2,3,4 and Nov 1 .....
add the order by to the year count just like I did for the month.
the order by will make it a running total.
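Putting that together, the YTD column just gains the same ORDER BY as the MTD column. A sketch against the TEST table above, keeping only the count columns for brevity:

```sql
SELECT a.*,
       COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM')
                               ORDER BY sale_dt) mtd_day_of_business,
       COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y')
                               ORDER BY sale_dt) ytd_day_of_business
  FROM (SELECT id, sale_dt, SUM(amount) sale_daily
          FROM test
         GROUP BY id, sale_dt) a
 ORDER BY sale_dt;
```

With the default RANGE window, rows that share a sale date share a count, which matches the "business day" semantics (two customers on the same day are still one day of business).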
Thank you
A reader, October 20, 2005 - 8:56 am UTC
Tom,
Thank you for your solution, above. However,
Can you give me a solution using CASE when
I can count either A and NOT F?
Thank you again.
October 20, 2005 - 8:59 am UTC
select case when cnta > 0 and cntf > 0
then ...
when cnta = 0 and cntf > 0
then ...
when cnta > 0 and cntf = 0
then ...
use a boolean expression after computing the cnt of A and the cnt of F
SOMETHING LIKE THIS...
A reader, October 20, 2005 - 10:18 am UTC
Tom,
you mean something like this..
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,
'MINOR' STATUS,
SUM (case WHEN z.DAMAGE_TYPE_CODE= 'A' THEN 1 ELSE 0 end)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
October 20, 2005 - 4:33 pm UTC
sure
FINAL SOLUTION
A reader, October 20, 2005 - 11:04 am UTC
Tom,
here is the problem I was facing.... I hope
this clears things up.
SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
--and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
-------------------------------------------------------------------------
6/14/2005 COLUMBO MINOR 1688235 A
6/14/2005 COLUMBO MINOR 1688609 A
6/14/2005 COLUMBO MINOR 1688273 A------was counting this as a minor when it should be counted as a major
6/14/2005 COLUMBO MINOR 1686769 A
6/14/2005 COLUMBO MINOR 1686517 A
6/14/2005 COLUMBO MINOR 1687985 A
6/14/2005 COLUMBO MINOR 1686483 A
6/14/2005 COLUMBO MINOR 1685361 A
6/14/2005 COLUMBO MINOR 1686414 A
SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
--and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
6/14/2005 COLUMBO MINOR 1688273 A--------throw away!
6/14/2005 COLUMBO MINOR 1688273 E
6/14/2005 COLUMBO MINOR 1688273 C
6/14/2005 COLUMBO MINOR 1688273 F-------keep
6/14/2005 COLUMBO MINOR 1688273 I
follow up
A reader, October 20, 2005 - 5:01 pm UTC
Tom,
I am still not able to get the correct result using
a case statement. Maybe I should use a function to return only the F when there are both an F and an A in the record.
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
sum (case when z.damage_type_code = 'F' then 1 else 0 end) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
Can this be done using analytical functions??
A reader, October 25, 2005 - 2:59 pm UTC
Tom,
I am done with my query....I am looking for a better
approach or how I can improve it. Thanks!!
select damage_inspection_date, damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and not exists
(select z.gate_id from gate_damages z
where z.gate_id = g.gate_id
and z.damage_type_code = 'F')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+))
group by damage_inspection_date, damage_inspection_by;
October 26, 2005 - 11:24 am UTC
sorry - too big to reverse engineer here as a review/followup....
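For what it's worth, a three-way UNION like that can usually be collapsed: classify each gate once (any F beats any A), then pivot. This is only a sketch - it assumes the MAJOR/MINOR/TOTAL rules as stated earlier in the thread, and it skips the manufactured calendar of every date/inspector/status combination that the original query builds:

```sql
select trunc(gc.damage_inspection_date) damage_inspection_date,
       gc.damage_inspection_by,
       count(case when gd.sev = 'MINOR' then 1 end) minor,
       count(case when gd.sev = 'MAJOR' then 1 end) major,
       count(case when substr(gc.action,2,1) != 'C' then 1 end) total
  from gate_containers gc,
       (select gate_id,
               -- per gate: any F makes it MAJOR; otherwise any A makes it MINOR
               case when max(case when damage_type_code = 'F' then 1 else 0 end) = 1
                    then 'MAJOR'
                    when max(case when damage_type_code = 'A' then 1 else 0 end) = 1
                    then 'MINOR'
               end sev
          from gate_damages
         group by gate_id) gd
 where gc.gate_id = gd.gate_id(+)
 group by trunc(gc.damage_inspection_date), gc.damage_inspection_by;
```

The inline view emits each gate_id once, so no DISTINCT is needed in the outer counts, and the outer join keeps gates with no damage rows in the TOTAL count.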
Analytic Question
Yoav, November 20, 2005 - 8:16 am UTC
Hi Tom.
I'm trying to calculate a weighted moving average.
I'm having a problem calculating the values under the column SUM_D.
Can you please demonstrate how to achieve the values that appear under the column SUM_D?
create table t
(stock_date date,
close_value number(8,2));
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('02-OCT-2005',759.56);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('29-SEP-2005',753.59);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('28-SEP-2005',749.20);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('27-SEP-2005',741.71);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('26-SEP-2005',729.93);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('25-SEP-2005',719.48);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('22-SEP-2005',727.30);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('21-SEP-2005',735.81);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('20-SEP-2005',740.38);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('19-SEP-2005',739.86);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('18-SEP-2005',745.48);
INSERT INTO T(STOCK_DATE, CLOSE_VALUE) VALUES ('15-SEP-2005',744.65);
COMMIT;
select RN, day, stock_date,close_value,weight
from(
select rownum RN,to_char(stock_date,'d') Day,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,stock_date,close_value
from t
order by stock_date asc)
order by 1
)
ORDER BY 1
/
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
--------- - --------- ----------- ---------- ----------
1 5 15-SEP-05 744.65 3723.25 5
2 1 18-SEP-05 745.48 745.48 1 <==
3 2 19-SEP-05 739.86 1479.72 3
4 3 20-SEP-05 740.38 2221.14 6
5 4 21-SEP-05 735.81 2943.24 10
6 5 22-SEP-05 727.3 3636.5 15
7 1 25-SEP-05 719.48 719.48 1 <==
8 2 26-SEP-05 729.93 1459.86 3
9 3 27-SEP-05 741.71 2225.13 6
10 4 28-SEP-05 749.2 2996.8 10
11 5 29-SEP-05 753.59 3767.95 15
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
--------- - --------- ----------- ---------- ----------
12 1 02-OCT-05 759.56 759.56 1
Thank You.
November 20, 2005 - 8:31 am UTC
would you like to explain what sum_d is to you? explain the logic behind it.
Analytic Question
Yoav, November 20, 2005 - 10:10 am UTC
Hi Tom.
I'm sorry if my explanation wasn't clear enough.
The column SUM_D is actually a "running total" of the column
Day.
The thing is that I need to reset the value of the
column SUM_D to 1 at the beginning of each week (Sunday).
select RN, day, TheDayIs, stock_date
from(
select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'Day')TheDayIs,
stock_date,close_value
from( select rownum,stock_date,close_value
from test
order by stock_date asc)
order by 1
)
ORDER BY 1
/
RN D SUM_D THEDAYIS STOCK_DAT
--------- - ----- --------- ---------
1 5 5 Thursday 15-SEP-05
2 1 1 Sunday 18-SEP-05
3 2 3 Monday 19-SEP-05
4 3 6 Tuesday 20-SEP-05
5 4 10 Wednesday 21-SEP-05
6 5 15 Thursday 22-SEP-05
7 1 1 Sunday 25-SEP-05
8 2 3 Monday 26-SEP-05
9 3 6 Tuesday 27-SEP-05
10 4 10 Wednesday 28-SEP-05
11 5 15 Thursday 29-SEP-05
12 1 1 Sunday 02-OCT-05
Thank you for you quick response
November 21, 2005 - 8:20 am UTC
you might have to "adjust" your stock_date by a day if 'ww' doesn't group right for you with your NLS settings (sometimes the week ends on a different day depending on your NLS settings - locale issue)
ops$tkyte@ORA9IR2> select row_number() over (order by stock_date) rn,
2 to_char(stock_date,'d') day,
3 stock_date,
4 close_value,
5 to_number(to_char(stock_date,'d'))*close_value weight,
6 sum(to_number(to_char(stock_date,'d')))
over (partition by to_char(stock_date,'ww')
order by stock_date) sum_d
7 from t
8 order by stock_date
9 /
RN D STOCK_DAT CLOSE_VALUE WEIGHT SUM_D
---------- - --------- ----------- ---------- ----------
1 5 15-SEP-05 744.65 3723.25 5
2 1 18-SEP-05 745.48 745.48 1
3 2 19-SEP-05 739.86 1479.72 3
4 3 20-SEP-05 740.38 2221.14 6
5 4 21-SEP-05 735.81 2943.24 10
6 5 22-SEP-05 727.3 3636.5 15
7 1 25-SEP-05 719.48 719.48 1
8 2 26-SEP-05 729.93 1459.86 3
9 3 27-SEP-05 741.71 2225.13 6
10 4 28-SEP-05 749.2 2996.8 10
11 5 29-SEP-05 753.59 3767.95 15
12 1 02-OCT-05 759.56 759.56 1
12 rows selected.
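Picking up Tom's caveat about 'ww' and NLS settings: one way to make the week partition NLS-independent is to partition by the actual Sunday that starts each week, using the ISO-week TRUNC format. This is only a sketch against the same T table as above (note that to_char(stock_date,'d') itself still depends on the NLS territory, so the day numbers in SUM_D remain locale-sensitive):

```sql
-- trunc(d+1,'iw') is the Monday on or before d+1, so trunc(d+1,'iw')-1
-- is the Sunday on or before d, regardless of NLS territory.
-- Partitioning on it resets the running total every Sunday.
select row_number() over (order by stock_date) rn,
       to_char(stock_date,'d') day,
       stock_date,
       sum(to_number(to_char(stock_date,'d')))
         over (partition by trunc(stock_date + 1, 'iw') - 1
               order by stock_date) sum_d
  from t
 order by stock_date;
```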
Analytics Question
Yoav, November 21, 2005 - 7:41 am UTC
Hi Tom.
I'm sorry for wasting your time.
I found the solution.
select RN, day, week_no,
sum(day) over
(partition by week_no
order by day) sum_d,
stock_date
from(
select RN, day, week_no, stock_date
from(select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'ww') week_no,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,'Y',stock_date,close_value
from test
order by stock_date asc)
order by 1)
)
order by 1
/
RN D WE SUM_D STOCK_DAT
------ - -- ---------- ---------
1 5 37 5 15-SEP-05
2 1 38 1 18-SEP-05
3 2 38 3 19-SEP-05
4 3 38 6 20-SEP-05
5 4 38 10 21-SEP-05
6 5 38 15 22-SEP-05
7 1 39 1 25-SEP-05
8 2 39 3 26-SEP-05
9 3 39 6 27-SEP-05
10 4 39 10 28-SEP-05
11 5 39 15 29-SEP-05
RN D WE SUM_D STOCK_DAT
------ - -- ---------- ---------
12 1 40 1 02-OCT-05
Thank you!!
November 21, 2005 - 8:52 am UTC
see above, you can skip lots of steps here!
Analytics Question
Yoav, November 22, 2005 - 5:29 am UTC
Tom.
Your solution is better than mine.
Thank you!
Could you please help me with this
A reader, November 29, 2005 - 4:17 am UTC
I am trying to output a report with different aggregates for different price ranges
create table t(
id number(3),
year number(4),
month number(2),
slno number(2),
colorcd number(2),
sizecd number(2),
itemid number(4),
prdno number(3),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
insert into t values (1,2005,1,1,1,10,1000,101,150,100,10);
insert into t values (1,2005,1,2,1,11,1000,101,150,120,2);
insert into t values (1,2005,1,3,1,12,1000,101,150,100,10);
insert into t values (1,2005,1,4,1,13,1000,102,150,200,2);
insert into t values (1,2005,2,5,2,10,1000,102,150,100,20);
insert into t values (1,2005,2,6,2,11,1000,102,150,100,12);
insert into t values (1,2005,2,7,3,10,1000,103,150,100,20);
insert into t values (1,2005,3,8,4,10,1000,103,150,100,22);
insert into t values (1,2005,4,9,4,11,1000,103,150,100,12);
insert into t values (1,2005,1,10,5,10,1000,104,450,100,10);
insert into t values (1,2005,1,11,5,11,1000,104,450,120,2);
insert into t values (1,2005,1,12,5,12,1000,104,450,100,10);
insert into t values (1,2005,1,13,5,13,1000,104,450,200,2);
insert into t values (1,2005,2,14,5,14,1000,104,450,100,20);
insert into t values (1,2005,1,15,6,10,1001,105,150,100,10);
insert into t values (1,2005,1,16,6,11,1001,105,150,120,2);
insert into t values (1,2005,1,17,6,12,1001,105,150,100,10);
insert into t values (1,2005,1,18,6,13,1001,105,150,200,2);
insert into t values (1,2005,2,19,7,10,1001,105,150,100,20);
insert into t values (1,2005,2,20,7,11,1002,106,400,100,12);
insert into t values (1,2005,2,21,8,10,1002,106,400,100,20);
insert into t values (1,2005,3,22,9,10,1002,107,400,100,22);
insert into t values (1,2005,4,23,10,11,1002,107,400,100,12);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
Desc Itemname <199 <299 <399 <499
----------------------------------------------------------------------------
Count of distinct prdnos Item0 3 null null 1
(Count of distinct prdnos
group by colorcd, sizecd) 3 null null 1
sum of sl_qty group by itemid 110 null null 44
Count of distinct prdnos Item1 1 null null null
(Count of distinct prdnos
group by colorcd, sizecd) 1 null null null
sum of sl_qty group by itemid 44 null null null
Count of distinct prdnos Item2 null null null 2
(Count of distinct prdnos
group by colorcd, sizecd) null null null 1
sum of sl_qty group by itemid null null null 66
Is this possible? The Desc column is not needed and 'null' should be blank.
Thank you
November 29, 2005 - 10:22 am UTC
I don't get the "group by colorcd, sizecd" bit. If you group by those attributes, you'll get a row per unique ITEMNAME, COLORCD, SIZECD.
I don't understand the logic.
A reader, November 29, 2005 - 10:48 am UTC
Dear Tom,
For each unique combination of ITEMNAME, COLORCD, and SIZECD, the PRDNOs repeat, don't they? I need a count of the distinct PRDNOs.
COLORCD-SIZECD-ITEMID-PRODNO in that order, please see below
first group
-------------------------
1-10-1000-101
1-11-1000-101
1-12-1000-101
----------------------
second group
----------------------
1-13-1000-102
2-10-1000-102
2-11-1000-102
Both of these groups come under the price range < 199, so the distinct count of PRDNO for price range < 199 = 2.
Hope this makes sense.
Thank you
November 30, 2005 - 10:46 am UTC
not understanding how this gets down to a single row. I did not get it.
why would that be different than the count of distinct prdno's by itemid?
how about this query
steve, November 30, 2005 - 8:04 pm UTC
Hi Tom,
Is there a simpler way to do this with an analytic function?
select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id
Thanks a lot!
Steve
November 30, 2005 - 9:17 pm UTC
how about you
a) set up a small example
b) explain what it is supposed to do in text (so we don't have to reverse engineer what you might have been thinking)
[RE] to Steve NYC
Marcio Portes, November 30, 2005 - 10:43 pm UTC
Maybe he is looking for this:
ops$marcio@LNX10GR2> select dept_num, id, sum(curr_adj_qty)
2 from
3 (
4 select dept_num, id, sum(current_adjust_qty) curr_adj_qty
5 from adjust
6 where applied_ind = 'N'
7 and expired_ind = 'N'
8 group by dept_num, id
9 UNION
10 select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
11 from adjust
12 where expired_ind = 'N'
13 and applied_ind = 'Y'
14 group by dept_num, id
15 ) adj_tmp
16 group by dept_num, id
17 /
DEPT_NUM ID SUM(CURR_ADJ_QTY)
------------- ------------- -----------------
1 0 185
1 2 186
0 2 77
0 0 81
1 1 165
0 1 56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2815735809
--------------------------------------------------------------------------------
|Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 390 | 11 (46)| 00:00:01 |
| 1 | HASH GROUP BY | | 10 | 390 | 11 (46)| 00:00:01 |
| 2 | VIEW | | 10 | 390 | 10 (40)| 00:00:01 |
| 3 | SORT UNIQUE | | 10 | 120 | 10 (70)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 6 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
| 7 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 8 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("APPLIED_IND"='N' AND "EXPIRED_IND"='N')
8 - filter("EXPIRED_IND"='N' AND "APPLIED_IND"='Y')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
624 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
ops$marcio@LNX10GR2>
ops$marcio@LNX10GR2> select dept_num, id,
2 sum( case when applied_ind = 'N'
3 then current_adjust_qty
4 else 0 end )
5 - sum( case when applied_ind = 'Y'
6 then current_adjust_qty
7 else 0 end ) curr_adj_qty
8 from adjust
9 where expired_ind = 'N'
10 group by dept_num, id
11 /
DEPT_NUM ID CURR_ADJ_QTY
------------- ------------- -------------
1 0 185
1 2 186
0 2 77
1 1 165
0 0 81
0 1 56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3658272021
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 60 | 4 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 5 | 60 | 4 (25)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| ADJUST | 500 | 6000 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXPIRED_IND"='N')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
622 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
I used this script to produce the output above.
set echo on
drop table adjust purge;
create table
adjust (
dept_num int,
id int,
applied_ind char(1),
expired_ind char(1),
current_adjust_qty int
);
insert /*+ append */ into adjust
with v
as ( select level l from dual connect by level <= 1000 )
select mod(l, 2), mod(l, 3),
decode(mod(l,8), 0, 'Y', 'N'),
decode(mod(l,5), 0, 'N', 'Y'),
trunc(dbms_random.value(1,10.000))
from v
/
commit;
exec dbms_stats.gather_table_stats( user, 'adjust' )
set autotrace on
select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id
/
select dept_num, id,
sum( case when applied_ind = 'N'
then current_adjust_qty
else 0 end )
- sum( case when applied_ind = 'Y'
then current_adjust_qty
else 0 end ) curr_adj_qty
from adjust
where expired_ind = 'N'
group by dept_num, id
/
set autotrace off
set echo off
Multiple aggregates
Raj, December 01, 2005 - 7:45 am UTC
Dear Tom,
This continues my previous post, where the given data for the problem was wrong. I was trying to make a sample test case. Here are the requirements, along with the create table statements and corrected data.
This is to output sales figures for a given period of different products.
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item group by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item group by PRICE
6. Total SL_QTY for an item group by PRICE
7. Total SL_QTY*PRICE for an item group by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/
With this query I am able to get the report like this,
ITEMID ITEMNAME PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
---------- -------------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1000 Item0 150 2 8 280 42000 150 40
1000 Item0 450 1 4 110 49500 450 46.6
1001 Item1 350 1 5 270 94500 350 32.6
Is it possible to get the report in the following format, including the non-matching item names, with the null cells shown as blanks?
Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6
Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null
Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Many thanks for your help and patience
Sorry
Raj, December 01, 2005 - 9:45 pm UTC
Dear Tom,
I am sorry, I didn't post the insert statements. Sorry for being careless. I did execute everything on my system and was formatting and copying it piece by piece; I previewed and reread it before posting and still missed them. I will repost the requirements below.
This is to output sales figures for a given period of different products.
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item grouped by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item grouped by PRICE
6. Total SL_QTY for an item grouped by PRICE
7. Total SL_QTY*PRICE for an item grouped by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
drop table t;
drop table p;
create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));
insert into t values (1,1,2005,1,1000,101,1,10,150,90,10);
insert into t values (1,2,2005,1,1000,101,1,11,150,80,20);
insert into t values (1,3,2005,1,1000,101,1,12,150,90,10);
insert into t values (1,4,2005,1,1000,101,1,13,150,80,20);
insert into t values (1,5,2005,2,1000,101,1,10,150,80,20);
insert into t values (1,25,2005,1,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,1,1000,104,1,13,150,80,20);
insert into t values (1,25,2005,2,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,2,1000,104,1,13,150,80,20);
insert into t values (1,26,2005,2,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,2,1000,104,1,10,150,90,10);
insert into t values (1,26,2005,1,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,1,1000,104,1,10,150,90,10);
insert into t values (1,6,2005,2,1000,101,1,11,150,60,40);
insert into t values (1,7,2005,2,1000,101,1,12,150,80,20);
insert into t values (1,14,2005,2,1000,101,1,13,150,80,20);
insert into t values (1,15,2005,1,1001,103,1,10,350,90,10);
insert into t values (1,23,2005,3,1001,103,1,11,350,10,90);
insert into t values (1,22,2005,3,1001,103,1,10,350,90,10);
insert into t values (1,21,2005,2,1001,103,1,11,350,80,20);
insert into t values (1,20,2005,2,1001,103,1,10,350,80,20);
insert into t values (1,19,2005,1,1001,103,1,14,350,80,20);
insert into t values (1,18,2005,1,1001,103,1,13,350,70,30);
insert into t values (1,17,2005,1,1001,103,1,12,350,40,60);
insert into t values (1,16,2005,1,1001,103,1,11,350,90,10);
insert into t values (1,8,2005,1,1000,102,1,10,450,80,20);
insert into t values (1,9,2005,1,1000,102,1,11,450,90,10);
insert into t values (1,10,2005,1,1000,102,1,12,450,90,10);
insert into t values (1,11,2005,1,1000,102,1,13,450,90,10);
insert into t values (1,12,2005,2,1000,102,1,10,450,80,10);
insert into t values (1,13,2005,2,1000,102,1,11,450,50,50);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
commit;
With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/
ITEMID ITEMNAME PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
------ -------- ------ --------- -------- --------- ----------- ---------- ----------
1000 Item0 150 2 8 280 42000 150 40
1000 Item0 450 1 4 110 49500 450 47
1001 Item1 350 1 5 270 94500 350 33
Is it possible to get the report in the following format, including the non-matching item names, with the null cells shown as blanks?
Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6
Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null
Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null
Thanking you
December 02, 2005 - 10:48 am UTC
it wasn't just that - it was "this was too big to answer in a couple of seconds, and since I get over 1,000 of these a month, I cannot spend too much time on each one - I'd rather take NEW questions sometimes"
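For what it's worth, the usual shape of a cross-tab like the one requested is conditional aggregation: pivot each measure into a price-band column with CASE. This is only a sketch against the posted T and P tables, showing just the first measure (distinct PRDNO count per band); the other measures would each need a similar set of CASE expressions, or a union of one row per measure:

```sql
-- Outer join from P so item names with no matching sales still appear;
-- unmatched cells come out NULL, which most report tools show as blank.
select i.itemname,
       count(distinct case when x.price < 199 then x.prdno end) "<199",
       count(distinct case when x.price between 199 and 298 then x.prdno end) "<299",
       count(distinct case when x.price between 299 and 398 then x.prdno end) "<399",
       count(distinct case when x.price between 399 and 498 then x.prdno end) "<499"
  from p i
  left join t x
    on x.itemid = i.itemid
   and x.id = 1
   and x.year = 2005
   and x.month in (1,2,3)
 group by i.itemname
 order by i.itemname;
```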
a single query to collapse date ranges
Bob Lyon, December 06, 2005 - 12:56 pm UTC
Tom,
I know what I want to do but can't quite get my mind around the syntax...
We want a single query to collapse date ranges under the assumption that a date range that
starts later than another range has a better value.
So given this test case
CREATE GLOBAL TEMPORARY TABLE RDL (
DATE_FROM DATE,
DATE_TO DATE,
VALUE NUMBER
);
INSERT INTO RDL VALUES (TO_DATE('01/03/2005', 'MM/DD/YYYY'), TO_DATE('01/12/2005', 'MM/DD/YYYY'), 5);
INSERT INTO RDL VALUES (TO_DATE('01/05/2005', 'MM/DD/YYYY'), TO_DATE('01/10/2005', 'MM/DD/YYYY'), 8);
-- I assume the innermost subquery would
-- use the DUAL CONNECT BY LEVEL trick to generate individual days for each grouping
-- ORDER BY DATE_FROM
1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
1 01/05/2005 01/06/2005 5
1 01/06/2005 01/07/2005 5
1 01/07/2005 01/08/2005 5
1 01/08/2005 01/09/2005 5
1 01/09/2005 01/10/2005 5
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5
2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8
-- an outer subquery would use analytics to get the max grouping
1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5
-- And the outermost subquery would use analytics to collapse the dates into contiguous groups
-- for the desired result
1 01/03/2005 01/05/2005 5
2 01/05/2005 01/10/2005 8
1 01/10/2005 01/12/2005 5
The trick is to do all of the above in a single query!
Any suggestions? (Yeah, I know - REALLY learn analytics!)
Thanks in advance,
Bob Lyon
OK, I think I got it
Bob Lyon, December 06, 2005 - 2:25 pm UTC
SELECT d date_from, d2 date_to, value
FROM (
SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
FROM (
SELECT DATE_FROM D, VALUE FROM RDL
UNION
SELECT DATE_TO D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
)
)
WHERE D2 IS NOT NULL
/
DATE_FROM DATE_TO VALUE
----------------- ----------------- ----------
01/03/05 00:00:00 01/05/05 00:00:00 5
01/05/05 00:00:00 01/10/05 00:00:00 8
01/10/05 00:00:00 01/12/05 00:00:00 5
December 06, 2005 - 3:50 pm UTC
depends on how many overlaps you allow, take your create and:
...
ops$tkyte@ORA9IR2> INSERT INTO RDL VALUES (TO_DATE('01/06/2005', 'MM/DD/YYYY'),
2 TO_DATE('01/7/2005', 'MM/DD/YYYY'), 99);
1 row created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT d date_from, d2 date_to, value
2 FROM (
3 SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
4 FROM (
5 SELECT DATE_FROM D, VALUE FROM RDL
6 UNION
7 SELECT DATE_TO D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
8 )
9 )
10 WHERE D2 IS NOT NULL
11 /
DATE_FROM DATE_TO VALUE
--------- --------- ----------
03-JAN-05 05-JAN-05 5
05-JAN-05 06-JAN-05 8
06-JAN-05 07-JAN-05 99
07-JAN-05 10-JAN-05 8
10-JAN-05 12-JAN-05 5
so, maybe we expand out and keep the row we want:
ops$tkyte@ORA9IR2> with
2 data
3 as
4 (select level-1 l
5 from (select max(date_to-date_from+1) n from rdl) n
6 connect by level <= n)
7 select rdl.date_from+l,
8 to_number( substr( max( to_char(date_from,'yyyymmdd') || value ), 9 ) ) value
9 from rdl, data
10 where data.l <= rdl.date_to-rdl.date_from
11 group by rdl.date_from+l
12 ;
RDL.DATE_ VALUE
--------- ----------
03-JAN-05 5
04-JAN-05 5
05-JAN-05 8
06-JAN-05 99
07-JAN-05 99
08-JAN-05 8
09-JAN-05 8
10-JAN-05 8
11-JAN-05 5
12-JAN-05 5
10 rows selected.
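To finish the thought, the expanded day/value rows above can be collapsed back into contiguous ranges with the classic "row_number difference" grouping trick. An untested sketch, building on Tom's expansion query (here DATE_FROM/DATE_TO come out as inclusive per-day bounds; add 1 to DATE_TO if half-open ranges like the original desired output are wanted):

```sql
with
data as (
  select level-1 l
    from (select max(date_to-date_from+1) n from rdl) n
 connect by level <= n),
days as (
  -- one row per day, keeping the value of the latest-starting range
  select rdl.date_from+l dt,
         to_number(substr(max(to_char(date_from,'yyyymmdd')||value),9)) value
    from rdl, data
   where data.l <= rdl.date_to-rdl.date_from
   group by rdl.date_from+l)
select value, min(dt) date_from, max(dt) date_to
  from (select dt, value,
               -- consecutive days with the same value share this group key
               dt - row_number() over (partition by value order by dt) grp
          from days)
 group by value, grp
 order by date_from;
```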
Select with Analytics Working Partially
denni50, December 06, 2005 - 4:34 pm UTC
Hi Tom
I have a question about the script below that is puzzling.
I'm using the 4 idnumbers as test data. I'm looking to select the most recent record where the appealcode is like '_R%' for 2005.
When I run the script it only pulls 2 of the idnumbers.
I've been looking at this and can't see why the other two are being bypassed. I'm trying to use more analytics in my code; it's working for two records and not the other two.
Any tips/help greatly appreciated.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
2 paydate,payamount,transnum,ltransnum,appealcode
3 from (
4 select x.*, row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
5 from (
6 select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
7 payamount, transnum,ltransnum,appealcode,
8 max(paydate) over (partition by idnumber) maxpd
9 from payment
 10 where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
11 ) x
12 where appealcode like '_R%'
13 and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
15 )
16 where rn = 1;
IDNUMBER USER FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
4004594 ACDC May2005 17-MAY-05 0 10159410 10086183 DRE0505
5618190 ACDC Mar2005 11-MAR-05 0 9918802 9845638 DRJ0503
SQL>
**** 4 TEST IDNUMBERS FROM BASE TABLE*********************
SQL> select idnumber,appealcode,paydate,payamount,transnum
2 from payment where appealcode like '_R%'
3 and idnumber=4004594 order by paydate desc;
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
4004594 DRE0505 17-MAY-05 0 10159410
4004594 GRG0502 08-FEB-05 0 9804766
4004594 GRF0501 31-JAN-05 0 9750332
4004594 GRK0410 01-NOV-04 0 9303510
4004594 GRC0403 19-MAR-04 0 8371053
4004594 GRA0305 12-AUG-03 0 7543911
4004594 GRG0303 16-APR-03 0 7209503
4004594 GRA0301 16-FEB-03 0 7026840
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
4002401 GRG0502 16-MAR-05 25 9862647
4002401 GRG0502 23-FEB-05 0 9826142
4002401 GRA0501 19-JAN-05 0 9712904
4002401 GRF0412 05-JAN-05 0 9630884
4002401 GRK0410 21-OCT-04 0 9299106
4002401 GRG0303 03-MAR-03 0 7066423
4002401 GRA0301 09-FEB-03 0 7022121
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
5406454 DRJ0503 03-MAR-05 0 9887770
5406454 DRG0502 28-FEB-05 0 9870637
IDNUMBER APPEALCODE PAYDATE PAYAMOUNT TRANSNUM
---------- ---------- --------- ---------- ----------
5618190 DRJ0503 11-MAR-05 0 9918802
5618190 DRG0502 28-FEB-05 0 9870090
5618190 GRG0502 21-FEB-05 0 9824705
December 07, 2005 - 1:32 am UTC
(i would need a create table and insert statements if you really want me to play with it)
but this predicate:
12 where appealcode like '_R%'
13 and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
says
"only keep _R% records that had the max paydate over ALL records for that id"
to satisfy:
I'm looking to select the most recent
record where the appealcode is like '_R%' for 2005.
perhaps you mean:
select *
from (select t.*,
row_number() over (partition by idnumber order by paydate DESC) rn
from t
where appealcode like '_R%'
and idnumber in ( 1,2,3,4 ) )
where rn = 1;
that says
"find the _R% records"
"break them up by idnumber"
"sort each group from big to small by paydate"
"keep only the first record in each group"
to dennis
Oraboy, December 06, 2005 - 6:09 pm UTC
Hi ,
I tried your problem and looks its working fine.
Just a quick question..Did you check the dates are really 2005 and not 0005?
(Create scripts for anyone who wants to try in future)
Create table Test_T
(IdNumber number,
AppealCode Varchar2(100),
PayDate date,
PayAmount NUmber,
TransNum Number)
/
Insert into Test_t values ( 4004594 ,'DRE0505',to_date('17-May-05','DD-MON-RR'),0,10159410 );
Insert into Test_t values ( 4004594 ,'GRG0502',to_date('8-Feb-05','DD-MON-RR'),0,9804766 );
Insert into Test_t values ( 4004594 ,'GRF0501',to_date('31-Jan-05','DD-MON-RR'),0,9750332 );
Insert into Test_t values ( 4004594 ,'GRK0410',to_date('1-Nov-04','DD-MON-RR'),0,9303510 );
Insert into Test_t values ( 4004594 ,'GRC0403',to_date('19-Mar-04','DD-MON-RR'),0,8371053 );
Insert into Test_t values ( 4004594 ,'GRA0305',to_date('12-Aug-03','DD-MON-RR'),0,7543911 );
Insert into Test_t values ( 4004594 ,'GRG0303',to_date('16-Apr-03','DD-MON-RR'),0,7209503 );
Insert into Test_t values ( 4004594 ,'GRA0301',to_date('16-Feb-03','DD-MON-RR'),0,7026840 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('16-Mar-05','DD-MON-RR'),25,9862647 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('23-Feb-05','DD-MON-RR'),0,9826142 );
Insert into Test_t values ( 4002401 ,'GRA0501',to_date('19-Jan-05','DD-MON-RR'),0,9712904 );
Insert into Test_t values ( 4002401 ,'GRF0412',to_date('5-Jan-05','DD-MON-RR'),0,9630884 );
Insert into Test_t values ( 4002401 ,'GRK0410',to_date('21-Oct-04','DD-MON-RR'),0,9299106 );
Insert into Test_t values ( 4002401 ,'GRG0303',to_date('3-Mar-03','DD-MON-RR'),0,7066423 );
Insert into Test_t values ( 4002401 ,'GRA0301',to_date('9-Feb-03','DD-MON-RR'),0,7022121 );
Insert into Test_t values ( 5406454 ,'DRJ0503',to_date('3-Mar-05','DD-MON-RR'),0,9887770 );
Insert into Test_t values ( 5406454 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870637 );
Insert into Test_t values ( 5618190 ,'DRJ0503',to_date('11-Mar-05','DD-MON-RR'),0,9918802 );
Insert into Test_t values ( 5618190 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870090 );
Insert into Test_t values ( 5618190 ,'GRG0502',to_date('21-Feb-05','DD-MON-RR'),0,9824705 );
-- since the other columns are not relevant, I used dummy values in your select statement
s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,1 usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum,transnum ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1
s61>/
IDNUMBER USERCODE1 FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---------- ------- --------- ---------- ---------- ---------- -----------
4002401 1 Mar2005 16-MAR-05 25 9862647 9862647 GRG0502
4004594 1 May2005 17-MAY-05 0 10159410 10159410 DRE0505
5406454 1 Mar2005 03-MAR-05 0 9887770 9887770 DRJ0503
5618190 1 Mar2005 11-MAR-05 0 9918802 9918802 DRJ0503
--added the other two columns
s61>alter table test_t add (usercode1 varchar2(100),ltransnum varchar2(100));
Table altered.
s61>update test_t set usercode1 = chr(65+ mod(rownum,3)), ltransnum=transnum+rownum;
20 rows updated.
S61> @<<ursql.txt>>
IDNUMBER USERCODE1 FIRSTPA PAYDATE
---------- ---------------------------------------------------------------------------------------------------- ------- ---------
4002401 A Mar2005 16-MAR-05
4004594 B May2005 17-MAY-05
5406454 B Mar2005 03-MAR-05
5618190 A Mar2005 11-MAR-05
-- this is just a guess on why the other two numbers didn't
-- show up in your result
-- updating 2005 to 05
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=5406454
2 /
2 rows updated.
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=4002401
2 /
7 rows updated.
s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum, ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1
s61>/
IDNUMBER US FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- -- ------- --------- ---------- ---------- ---------- -------------------------------------
4004594 B May2005 17-MAY-05 0 10159410 10159411 DRE0505
5618190 A Mar2005 11-MAR-05 0 9918802 9918820 DRJ0503
-- same as what you see
Thanks Tom and Oraboy
denni50, December 07, 2005 - 8:37 am UTC
Oraboy... you brought up a good possibility. Although the data gets posted through canned software, users are responsible for creating the batch headers before posting batches, and what may have happened is that a user inadvertently entered year 0005 instead of 2005 for a particular batch that included the idnumbers I am testing with here. It's happened before.
Thanks for that helpful tip!
:~)
Oraboy and Tom
denni50, December 07, 2005 - 8:57 am UTC
Oraboy:
it was not the year, I did testing using '0005' to see
if those two records would output results and it did not.
I changed the script based on logic that Tom suggested
and it worked...see changes below.
thanks Oraboy for your input and help.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
2 paydate,payamount,transnum,ltransnum,appealcode
3 from (
4 select x.*, row_number() over (partition by idnumber order by paydate desc, payamount desc, transnum desc) rn
5 from (
6 select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
7 payamount, transnum,ltransnum,appealcode
8 --max(paydate) over (partition by idnumber) maxpd
9 from payment
10 where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
11 ) x
12 where appealcode like '_R%'
13 --and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)
15 )
16 where rn = 1;
IDNUMBER USER FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
4002401 ACGA Mar2005 16-MAR-05 25 9862647 9789477 GRG0502
4004594 ACDC May2005 17-MAY-05 0 10159410 10086183 DRE0505
5406454 ACDC Mar2005 03-MAR-05 0 9887770 9814606 DRJ0503
5618190 ACDC Mar2005 11-MAR-05 0 9918802 9845638 DRJ0503
SQL>
anyway to do a dynamic lag?
Ryan, December 07, 2005 - 10:27 pm UTC
Is it possible to use lag, but you don't know how many rows you want to go back?
create table history (
history_id number,
history_sequence number,
history_status varchar2(20),
history_balance number);
insert into history values (1,123,'HISTORY 1',10);
insert into history values (1,128,'PROCESSED',0);
insert into history values (1,130,'PROCESSED',0);
insert into history values (1,131,'HISTORY 8',15);
insert into history values (1,145,'PROCESSED',0);
for each history_id ordered by history_sequence
loop
if status = 'PROCESSED' then
history_balance = the history_balance of the last record where status != 'PROCESSED'
end if;
end loop;
Typically with lag you have to state how many rows you are looking back; in this case my discriminator is based on the value in the status field.
After this is run, I expect the values to be
1,123,'HISTORY 1',10
1,128,'PROCESSED',10
1,130,'PROCESSED',10
1,131,'HISTORY 8',15
1,145,'PROCESSED',15
I can do this with pl/sql. I am trying to figure out how to do this with straight sql.
December 08, 2005 - 2:03 am UTC
LAST_VALUE with IGNORE NULLS in 10g - or the to_number(substr(max(...))) trick in 9i and before - can be used....
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
2 last_value(
3 case when history_status <> 'PROCESSED'
4 then history_balance
5 end IGNORE NULLS ) over (order by history_sequence ) last_hb
6 from history
7 /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS HISTORY_BALANCE LAST_HB
---------- ---------------- -------------------- --------------- ----------
1 123 HISTORY 1 10 10
1 128 PROCESSED 0 10
1 130 PROCESSED 0 10
1 131 HISTORY 8 15 15
1 145 PROCESSED 0 15
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
2 to_number( substr( max(
3 case when history_status <> 'PROCESSED'
4 then to_char(history_sequence,'fm0000000000' ) || history_balance
5 end ) over (order by history_sequence ), 11 ) ) last_hb
6 from history
7 /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS HISTORY_BALANCE LAST_HB
---------- ---------------- -------------------- --------------- ----------
1 123 HISTORY 1 10 10
1 128 PROCESSED 0 10
1 130 PROCESSED 0 10
1 131 HISTORY 8 15 15
1 145 PROCESSED 0 15
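Tom's 9i technique (prefix the value with a fixed-width ordering key, take a running MAX, then strip the key back off) works in any engine with window functions. Here is a sketch using SQLite via Python - chosen because, like Oracle 9i, SQLite has no IGNORE NULLS option, so only the second query's trick applies there (assumes an SQLite build with window-function support, 3.25 or later):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table history (
  history_id integer, history_sequence integer,
  history_status text, history_balance integer);
insert into history values (1,123,'HISTORY 1',10);
insert into history values (1,128,'PROCESSED',0);
insert into history values (1,130,'PROCESSED',0);
insert into history values (1,131,'HISTORY 8',15);
insert into history values (1,145,'PROCESSED',0);
""")
# Encode the ordering key (zero-padded sequence) in front of the balance,
# take the running MAX over the sequence ordering, then strip off the
# 10-character key to recover the last "non-PROCESSED" balance.
rows = conn.execute("""
select history_sequence, history_balance,
       cast(substr(max(case when history_status <> 'PROCESSED'
                            then printf('%010d', history_sequence)
                                 || history_balance
                       end) over (order by history_sequence), 11)
            as integer) last_hb
from history
order by history_sequence
""").fetchall()
for r in rows:
    print(r)
```

The fixed-width key is essential: MAX compares the concatenated strings lexically, so the padded sequence must dominate the comparison.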
Query
Mark, January 18, 2006 - 5:02 pm UTC
Hi Tom,
Given a Table:
PK_ID NUMBER (PK)
CLASS_ID NUMBER
MY_DATE DATE
I'd like to develop an output like:
CLASS_ID W1 W2 W3...
where W1, W2, W3... are 'weeks' from SYSDATE, using MY_DATE, holding the counts of the CLASS_ID for that row.
How could I do that?
Thanks!
About to read Ch. 12 in Expert one-on-one...
January 19, 2006 - 12:28 pm UTC
if
a) you have a finite number of weeks (a sql query has a fixed number of columns N at parse time - you need to know what N is...)
b) an example <<<=== creates/inserts
we could play with this.
ok, here we go
Mark, January 20, 2006 - 1:31 pm UTC
drop table cts_temp
/
create table cts_temp
( class_id number,
cts_date date)
/
insert into cts_temp
select round(dbms_random.value(1, 20)), trunc(created) from user_objects
/
Output sought:
CLASS_ID W1 W2 W3 W4 W5 W6 W7 W8 W9 W10+
----------------------------------------
1 1 3 7 5 1 0 0 0 0 12
2 1 5 1 0 0 0 5 3 4 6
3 1 0 0 9 0 1 1 5 1 10
...
where W# = # of Weeks away from current date
therefore, W1 is within 7 days of today and W10+ is everything 10 weeks and older. These numbers are counts of records.
I have done this in the past with DECODE statements, but am looking for a more efficient way to do this using Analytics.
January 20, 2006 - 2:47 pm UTC
but you will not want to use analytics since you NEED TO AGGREGATE.
decode (or case) is the correct approach to this problem, keep using that.
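Tom's point - aggregate with CASE, pivoting the week buckets into columns - can be sketched as below (SQLite via Python; the table contents, the fixed "today" anchor, and the reduced set of buckets W1-W3/W10+ are all invented for the demo - a real query would spell out all ten buckets):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table cts_temp (class_id integer, cts_date text);
insert into cts_temp values (1, '2006-01-18');  -- 2 days old   -> W1
insert into cts_temp values (1, '2006-01-05');  -- 15 days old  -> W3
insert into cts_temp values (2, '2005-10-01');  -- 111 days old -> W10+
""")
today = '2006-01-20'  # fixed anchor so the output is deterministic
# One row per class_id; each CASE assigns a row to its week bucket,
# SUM pivots the buckets into columns (W10+ = 63 days and beyond).
rows = conn.execute("""
select class_id,
       sum(case when age < 7               then 1 else 0 end) w1,
       sum(case when age >= 7  and age < 14 then 1 else 0 end) w2,
       sum(case when age >= 14 and age < 21 then 1 else 0 end) w3,
       sum(case when age >= 63             then 1 else 0 end) w10_plus
from (select class_id,
             julianday(:today) - julianday(cts_date) age
      from cts_temp)
group by class_id
order by class_id
""", {"today": today}).fetchall()
print(rows)
```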
SQL Query
Parag J Patankar, January 23, 2006 - 7:26 am UTC
Hi Tom,
Wish you very happy new year 2006. I have a table
create table t ( a number(5), b number(6), c number(1), d varchar2(8), e number(10));
insert into t values ( 09009, 1000, 1, 'RIS00001', 100);
insert into t values ( 09009, 1000, 0, 'RIS00001', 200);
insert into t values ( 09009, 1000, 0, 'RIS00001', 300);
insert into t values ( 09009, 1000, 2, 'RIS00001', 400)
insert into t values(09009, 5000, 2, 'BIC77777', 100);
insert into t values(09009, 5000, 2, 'BIC77777', 100);
insert into t values(09009, 6000, 0, 'DIG00077', 100);
insert into t values(09009, 6000, 0, 'DIG00077', 200);
commit;
17:33:38 SQL> select * from t;
A B C D E
---------- ---------- ---------- -------- ----------
9009 1000 1 RIS00001 100
9009 1000 0 RIS00001 200
9009 1000 0 RIS00001 300
9009 5000 2 BIC77777 100
9009 5000 2 BIC77777 100
9009 6000 0 DIG00077 100
9009 6000 0 DIG00077 200
7 rows selected.
In Column "C" values can be 0, 1, 2.
Now I want to select only those sets of records where column c is never 1 for the same combination of a, b, and d.
For e.g I want output like
9009 5000 2 BIC77777 100
9009 5000 2 BIC77777 100
9009 6000 0 DIG00077 100
9009 6000 0 DIG00077 200
RIS00001 records should not appear because column c value 1 appeared once for the combination 09009 1000.
How can I do this in most efficient way ?
Currently it is very large table and joined to few tables in a query.
best regards
pjp
January 23, 2006 - 10:34 am UTC
for big "all rows" I would go analytics
ops$tkyte@ORA9IR2> select *
2 from (
3 select t.*,
4 max( case when c = 1 then c end ) over (partition by a, b, d ) c_one
5 from t
6 )
7 where c_one is null
8 /
A B C D E C_ONE
---------- ---------- ---------- -------- ---------- ----------
9009 5000 2 BIC77777 100
9009 5000 2 BIC77777 100
9009 6000 0 DIG00077 100
9009 6000 0 DIG00077 200
for possibly getting "first row as fast as I can", I might opt for not exists or not in
ops$tkyte@ORA9IR2> select *
2 from t
3 where (a,b,d) not in (select a,b,d from t where c = 1 and a is not null and b is not null and d is
4 not null )
5 /
A B C D E
---------- ---------- ---------- -------- ----------
9009 5000 2 BIC77777 100
9009 5000 2 BIC77777 100
9009 6000 0 DIG00077 100
9009 6000 0 DIG00077 200
ops$tkyte@ORA9IR2> select *
2 from t
3 where not exists (select null from t t2
4 where t2.a = t.a and t2.b = t.b and t2.d = t.d and t2.c = 1 )
5 /
A B C D E
---------- ---------- ---------- -------- ----------
9009 5000 2 BIC77777 100
9009 5000 2 BIC77777 100
9009 6000 0 DIG00077 100
9009 6000 0 DIG00077 200
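Tom's analytic version translates almost verbatim to any engine with window functions. As a self-contained check of the MAX(CASE ...) OVER (PARTITION BY a, b, d) approach, here is a sketch using SQLite via Python with the question's data (only the rows the transcript shows as actually inserted):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (a integer, b integer, c integer, d text, e integer);
insert into t values (9009,1000,1,'RIS00001',100);
insert into t values (9009,1000,0,'RIS00001',200);
insert into t values (9009,1000,0,'RIS00001',300);
insert into t values (9009,5000,2,'BIC77777',100);
insert into t values (9009,5000,2,'BIC77777',100);
insert into t values (9009,6000,0,'DIG00077',100);
insert into t values (9009,6000,0,'DIG00077',200);
""")
# c_one is non-null for EVERY row of an (a,b,d) group that contains c=1,
# so "where c_one is null" drops those groups in their entirety.
rows = conn.execute("""
select a, b, c, d, e
from (select t.*,
             max(case when c = 1 then c end)
                 over (partition by a, b, d) c_one
      from t)
where c_one is null
order by b, e
""").fetchall()
print(rows)
```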
Eliminating distinct rows
Avishay, February 20, 2006 - 8:48 am UTC
Hello Tom,
I have a view that UNION ALL 3 tables
here is the result for a select * on that view(f_emp_v):
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 5/12/2005
111111 123456 1/23/2005 1/24/2005
111111 123459 1/25/2005
111111 60 75 5/12/2005 5/13/2005
111111 30 5/13/2005 1/1/2006
111111 85 55 5/13/2005
Using the following SQL and analytic functions I filled in the NULLs and built the IN_USE_FROM/IN_USE_UNTIL columns differently: IN_USE_UNTIL receives the next IN_USE_FROM date in ascending order.
Here is the SQL:
SELECT Person_Id,
Substr(MAX(Decode(Cost_Account_Id,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Id))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Id,
Substr(MAX(Decode(Cost_Account_Code,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Code))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Code,
Substr(MAX(Decode(Working_Hours,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Working_Hours))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Working_Hours,
Substr(MAX(Decode(Fte,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Fte,
Substr(MAX(Decode(Calc_Fte_Type,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Calc_Fte_Type))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Calc_Fte_Type,
In_Use_From,
Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER BY t.In_Use_From ASC) In_Use_Until
FROM Fact_Employee_List_v t
WHERE person_id = 111111
ORDER BY Person_Id,
In_Use_From;
Here are the results:
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 1/23/2005
111111 123456 20 1/23/2005 1/25/2005
111111 123459 20 1/25/2005 5/12/2005
111111 123459 20 60 75 5/12/2005 5/13/2005
111111 123459 30 85 55 5/13/2005 5/13/2005
111111 123459 30 85 55 5/13/2005
The table "Fills" in accordance with the dates.
As you can see the last 2 rows except for the IN_USE_UNTIL are identical.
How can I get 'Rid' of the row with IN_USE_UNTIL NOT NULL ?
Is there a way to do it in the above select?
Maybe change the analytic function used for IN_USE_UNTIL?
Your remarks will be appreciated
Best Regards,
Avishay
Update using LAG
Zahir M, February 21, 2006 - 2:16 pm UTC
SQL> desc tab1
Name Null? Type
----------------------------------------- -------- ----------------------------
MEMBER_ID NUMBER(10)
START_DATE DATE
STOP_DATE DATE
SQL> Select * from tab1 where member_id = 125;
MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
125 23-OCT-00
125 05-MAY-04
125 10-MAY-04
125 30-MAR-05
SQL> Select wh.* , lag(start_date) over ( partition by member_id order by start_date asc) - 1 new_s
top_date
2 from tab1 wh where member_id = 125
3
SQL> /
MEMBER_ID START_DAT STOP_DATE NEW_STOP_
---------- --------- --------- ---------
125 23-OCT-00
125 05-MAY-04 22-OCT-00
125 10-MAY-04 04-MAY-04
125 30-MAR-05 09-MAY-04
SQL> Update tab1 a
2 set a.stop_date = ( SElect
3 lag(b.start_date)
4 over ( partition by b.member_id order by b.start_date asc) - 1 new_stop_date
5 from tab1 b where a.member_id = b.member_id and a.rowid = b.rowid ) ;
4 rows updated.
SQL> Select * from tab1 where member_id = 125;
MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
125 23-OCT-00
125 05-MAY-04
125 10-MAY-04
125 30-MAR-05
SQL> select * from v$version
2 /
BANNER
------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.7.4.1 - Production
PL/SQL Release 8.1.7.4.0 - Production
CORE 8.1.7.2.1 Production
TNS for 32-bit Windows: Version 8.1.7.4.0 - Production
NLSRTL Version 3.4.1.0.0 - Production
I am trying to use the LAG analytic function for an update statement . It does not seem to work.
Still, the STOP_DATE column is not updated with the new values (i.e. from the lag).
Please advise.
February 22, 2006 - 8:16 am UTC
Oh, it is absolutely working!!! The problem is that the where clause is applied first:
where a.member_id = b.member_id and a.rowid = b.rowid
AND THEN the analytic is performed - by that point there is only one row, so, well - the "previous" row isn't there anymore.
ops$tkyte@ORA10GR2> create table emp
2 as
3 select job, hiredate, to_date(null) last_hiredate
4 from scott.emp;
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> merge into emp
2 using ( select rowid rid, lag(hiredate) over (partition by job order by hiredate) last_hiredate
3 from emp) e2
4 on ( emp.rowid = e2.rid )
5 when matched then update set last_hiredate = e2.last_hiredate
6 -- when not matched then insert (job) values (NULL)
7 /
14 rows merged.
ops$tkyte@ORA10GR2> select * from emp order by job, hiredate;
JOB HIREDATE LAST_HIRE
--------- --------- ---------
ANALYST 03-DEC-81
ANALYST 09-DEC-82 03-DEC-81
CLERK 17-DEC-80
CLERK 03-DEC-81 17-DEC-80
CLERK 23-JAN-82 03-DEC-81
CLERK 12-JAN-83 23-JAN-82
MANAGER 02-APR-81
MANAGER 01-MAY-81 02-APR-81
MANAGER 09-JUN-81 01-MAY-81
PRESIDENT 17-NOV-81
SALESMAN 20-FEB-81
SALESMAN 22-FEB-81 20-FEB-81
SALESMAN 08-SEP-81 22-FEB-81
SALESMAN 28-SEP-81 08-SEP-81
14 rows selected.
In 9i, you need the "insert" part - but it'll never happen.
(but really, this looks like a bad idea, you'll have to just keep doing this over and over and over)
In 8i, you'll likely want to "two step this", create global temporary table, insert and update the join.
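The "compute the analytic first, then match back by rowid" idea behind the MERGE can also be written as a correlated subquery whose derived table computes LAG over the whole table *before* the rowid filter is applied - the opposite order from the broken update above. A sketch in SQLite via Python (SQLite has no MERGE; the data is abbreviated from scott.emp):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table emp (job text, hiredate text, last_hiredate text);
insert into emp values ('ANALYST','1981-12-03',null);
insert into emp values ('ANALYST','1982-12-09',null);
insert into emp values ('PRESIDENT','1981-11-17',null);
""")
# The derived table x computes LAG over ALL rows; only afterwards is the
# result matched to the row being updated via rowid.
conn.execute("""
update emp set last_hiredate =
  (select lh
   from (select rowid rid,
                lag(hiredate) over (partition by job
                                    order by hiredate) lh
         from emp) x
   where x.rid = emp.rowid)
""")
rows = conn.execute(
    "select job, hiredate, last_hiredate from emp order by job, hiredate"
).fetchall()
print(rows)
```

Had the rowid filter been inside the derived table, LAG would see a single row per invocation and always return NULL - exactly the failure described above.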
LEAD UPDATE
Zahir M, February 22, 2006 - 10:45 am UTC
Tom ,
I did the "two steps process" in 8i as you have suggested .
( except that I created a another table - not a global temp table ) . But it takes long duration to perform the update .
130874 rows updated.
Elapsed: 07:469:28146.33
Statistics
----------------------------------------------------------
0 recursive calls
4455554 db block gets
105093306 consistent gets
14702345 physical reads
30683572 redo size
852 bytes sent via SQL*Net to client
670 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
130874 rows processed
February 22, 2006 - 11:05 am UTC
no example :(
should only take seconds for so few rows.
ops$xp8i\tkyte@ORA8IR3W> alter table big_table add last_created date;
Table altered.
Elapsed: 00:00:00.63
ops$xp8i\tkyte@ORA8IR3W> create table t
2 ( rid rowid primary key, last_created date );
Table created.
Elapsed: 00:00:00.78
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> insert into t
2 select rowid, lag(created) over (partition by object_type order by created)
3 from big_table;
130874 rows created.
Elapsed: 00:00:36.03
ops$xp8i\tkyte@ORA8IR3W> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.38
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> update (select a.last_created new_last_created,
2 b.last_created old_last_created
3 from t a, big_table b
4 where a.rid = b.rowid )
5 set old_last_created = new_last_created;
130874 rows updated.
Elapsed: 00:00:32.41
Maybe you are getting blocked by other users, since you are updating every row, lock the table first - then update it.
LAG Update
Zahir M, February 22, 2006 - 11:32 am UTC
Thanks , Tom.
I re-ran the update after locking the table.
It took only 47 seconds for the update operation.
I guess it was locked by some other users / processes.
Thanks again !
Reposting
Avishay, February 23, 2006 - 4:28 am UTC
Hello Tom,
I have a view that UNION ALL 3 tables
here is the result for a select * on that view(f_emp_v):
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 5/12/2005
111111 123456 1/23/2005 1/24/2005
111111 123459 1/25/2005
111111 60 75 5/12/2005 5/13/2005
111111 30 5/13/2005 1/1/2006
111111 85 55 5/13/2005
Using the following SQL and analytic functions I filled in the NULLs and built the IN_USE_FROM/IN_USE_UNTIL columns differently: IN_USE_UNTIL receives the next IN_USE_FROM date in ascending order.
Here is the SQL:
SELECT Person_Id,
Substr(MAX(Decode(Cost_Account_Id,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Id))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Id,
Substr(MAX(Decode(Cost_Account_Code,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Code))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Code,
Substr(MAX(Decode(Working_Hours,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Working_Hours))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Working_Hours,
Substr(MAX(Decode(Fte,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Fte,
Substr(MAX(Decode(Calc_Fte_Type,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Calc_Fte_Type))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Calc_Fte_Type,
In_Use_From,
Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER
BY t.In_Use_From ASC) In_Use_Until
FROM Fact_Employee_List_v t
WHERE person_id = 111111
ORDER BY Person_Id,
In_Use_From;
Here are the results:
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 1/23/2005
111111 123456 20 1/23/2005 1/25/2005
111111 123459 20 1/25/2005 5/12/2005
111111 123459 20 60 75 5/12/2005 5/13/2005
111111 123459 30 85 55 5/13/2005 5/13/2005
111111 123459 30 85 55 5/13/2005
The table "Fills" in accordance with the dates.
As you can see the last 2 rows except for the IN_USE_UNTIL are identical.
How can I get 'Rid' of the row with IN_USE_UNTIL NOT NULL ?
Is there a way to do it in the above select?
Maybe change the analytic function used for IN_USE_UNTIL?
Your remarks will be appreciated
Best Regards,
Avishay
February 23, 2006 - 8:07 am UTC
why did you repost it.
You must have seen the page you used to post this. did you *READ* that page? I ignore all things that look like this - you have the classic example of what I ignore.
slow down, read the page you used to post this repost.
Analytics
Mark, February 23, 2006 - 10:42 am UTC
Hi Tom,
Oracle 9i latest and greatest...
At a loss for this one. Don't know where to start...
I have a table:
Create Table MY_NUMS
(N1 NUMBER, N2 NUMBER, N3 NUMBER, N4 NUMBER, N5 NUMBER, N6 NUMBER)
/
I populate each column with random values between 1-46:
INSERT INTO my_nums
SELECT TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
FROM user_objects;
/
and I get data similar to this:
HT4:DEVDB001101007:10:10 - DEV> select * from my_nums where rownum <= 10
2 /
N1 N2 N3 N4 N5 N6
---------- ---------- ---------- ---------- ---------- ----------
6 13 11 6 21 36
33 23 45 11 24 32
36 19 43 19 8 44
11 39 9 14 35 25
42 8 29 15 26 4
1 25 12 41 21 10
20 6 43 29 39 28
16 18 36 15 38 26
16 33 15 16 40 18
17 1 20 39 20 46
10 rows selected.
And my question is how do I get counts of how many times each number appears with every other number?
Output might look like this:
MY_NUMBER OTHER_NUM COUNT(*)
--------- --------- --------
16 15 3
...
This says that the number 16 appeared 3 times with the number 15 in the same row.
I'd have to check each column (N1 - N6) against each of the other 5 columns for each value and sum them up...
Regards,
Mark
February 23, 2006 - 10:47 am UTC
are my_number and other_num INPUTS into your query or what? where did 16 and 15 come from.
Analytics
Mark, February 23, 2006 - 12:28 pm UTC
Oh, ok.
MY_NUMBER is the number I am counting the combinations with all OTHER_NUMBERs.
Output would ideally look like this:
MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
1 2 5
1 3 6
1 4 2
...
2 3 4
2 4 7
2 5 8
...
etc., all the way down to
MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
45 46 7
There are no INPUTS to the query as it calculates counts for all combinations.
This query somewhat does it, but I feel there is a way better way than doing it with decodes:
SELECT n1
,SUM(n2_2 + n3_2 + n4_2 + n5_2 + n6_2) two
,SUM(n2_3 + n3_3 + n4_3 + n5_3 + n6_3) three
,SUM(n2_4 + n3_4 + n4_4 + n5_4 + n6_4) four
,SUM(n2_5 + n3_5 + n4_5 + n5_5 + n6_5) five
,SUM(n2_46 + n3_46 + n4_46 + n5_46 + n6_46) fortysix
FROM (SELECT n1
,DECODE(n2, 2, 1, 0) n2_2
,DECODE(n2, 3, 1, 0) n2_3
,DECODE(n2, 4, 1, 0) n2_4
,DECODE(n2, 5, 1, 0) n2_5
/* n2_6 through n2_45 here ... */
,DECODE(n2, 46, 1, 0) n2_46
,DECODE(n3, 2, 1, 0) n3_2
,DECODE(n3, 3, 1, 0) n3_3
,DECODE(n3, 4, 1, 0) n3_4
,DECODE(n3, 5, 1, 0) n3_5
,DECODE(n3, 46, 1, 0) n3_46
,DECODE(n4, 2, 1, 0) n4_2
,DECODE(n4, 3, 1, 0) n4_3
,DECODE(n4, 4, 1, 0) n4_4
,DECODE(n4, 5, 1, 0) n4_5
,DECODE(n4, 46, 1, 0) n4_46
,DECODE(n5, 2, 1, 0) n5_2
,DECODE(n5, 3, 1, 0) n5_3
,DECODE(n5, 4, 1, 0) n5_4
,DECODE(n5, 5, 1, 0) n5_5
,DECODE(n5, 46, 1, 0) n5_46
,DECODE(n6, 2, 1, 0) n6_2
,DECODE(n6, 3, 1, 0) n6_3
,DECODE(n6, 4, 1, 0) n6_4
,DECODE(n6, 5, 1, 0) n6_5
,DECODE(n6, 46, 1, 0) n6_46
FROM my_nums)
GROUP BY n1
/
N1 TWO THREE FOUR FIVE FORTYSIX
---------- ---------- ---------- ---------- ---------- ----------
1 27 10 14 19 16
2 10 12 12 22 21
3 13 15 13 12 16
4 25 15 24 13 23
5 19 15 16 14 19
6 16 15 13 8 30
7 18 13 13 19 15
8 14 14 12 18 14
9 16 16 18 16 22
10 14 12 16 14 19
...
for the entire matrix of number.
This output says that the number 2 in columns N2,N3,N4,N5,N6 appeared 27 times with the number 1. The number 3 in columns N2 - N6 appeared 10 times with the number 1, etc.
A sort of "how many times does this number appear with all these other numbers in the same row" query...
Thanks.
February 23, 2006 - 7:01 pm UTC
I was beaten to the punch on this one ;) see below
A solution
Michel Cadot, February 23, 2006 - 3:10 pm UTC
Hi Mark,
Here's a solution to your issue.
I changed the values to be able to post the whole example but the query does not care about what's inside the table.
I let all the steps but i think you can compact the query.
SQL> INSERT INTO my_nums
2 SELECT TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
3 ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
4 ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
5 ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
6 ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
7 ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
8 FROM user_views
9 where rownum <= 3
10 /
3 rows created.
SQL> commit;
Commit complete.
SQL> select * from my_nums;
N1 N2 N3 N4 N5 N6
---------- ---------- ---------- ---------- ---------- ----------
7 4 6 8 2 1
1 2 8 8 7 2
6 3 5 1 4 8
3 rows selected.
SQL> with
2 a as ( select row_number () over (order by n1) rn,
3 n1, n2, n3, n4, n5, n6
4 from my_nums
5 ),
6 b as ( select rn, n1, n2, n3, n4, n5, n6, part
7 from a,
8 (select rownum part from dual connect by level <= 15)
9 ),
10 c as ( select distinct rn,
11 case part
12 when 1 then least(n1,n2)
13 when 2 then least(n1,n3)
14 when 3 then least(n1,n4)
15 when 4 then least(n1,n5)
16 when 5 then least(n1,n6)
17 when 6 then least(n2,n3)
18 when 7 then least(n2,n4)
19 when 8 then least(n2,n5)
20 when 9 then least(n2,n6)
21 when 10 then least(n3,n4)
22 when 11 then least(n3,n5)
23 when 12 then least(n3,n6)
24 when 13 then least(n4,n5)
25 when 14 then least(n4,n6)
26 when 15 then least(n5,n6)
27 end v1,
28 case part
29 when 1 then greatest(n1,n2)
30 when 2 then greatest(n1,n3)
31 when 3 then greatest(n1,n4)
32 when 4 then greatest(n1,n5)
33 when 5 then greatest(n1,n6)
34 when 6 then greatest(n2,n3)
35 when 7 then greatest(n2,n4)
36 when 8 then greatest(n2,n5)
37 when 9 then greatest(n2,n6)
38 when 10 then greatest(n3,n4)
39 when 11 then greatest(n3,n5)
40 when 12 then greatest(n3,n6)
41 when 13 then greatest(n4,n5)
42 when 14 then greatest(n4,n6)
43 when 15 then greatest(n5,n6)
44 end v2
45 from b
46 )
47 select v1, v2, count(*) nb
48 from c
49 where v1 != v2
50 group by v1, v2
51 order by v1, v2
52 /
V1 V2 NB
---------- ---------- ----------
1 2 2
1 3 1
1 4 2
1 5 1
1 6 2
1 7 2
1 8 3
2 4 1
2 6 1
2 7 2
2 8 2
3 4 1
3 5 1
3 6 1
3 8 1
4 5 1
4 6 2
4 7 1
4 8 2
5 6 1
5 8 1
6 7 1
6 8 2
7 8 2
24 rows selected.
Regards
Michel
Excellent
Mark, February 23, 2006 - 3:35 pm UTC
Thanks
To Mark ... COUNT of what?
A reader, February 23, 2006 - 4:07 pm UTC
<quote>And my question is how do I get counts of how many times each number appears with every other number?</quote>
flip@FLOP> select * from my_nums;
N1 N2 N3 N4 N5 N6
---------- ---------- ---------- ---------- ---------- ----------
1 2 2 2 2 2
flip@FLOP> @michel_qry
V1 V2 NB
---------- ---------- ----------
1 2 1
Should the answer here be:
A. "1" [as in the number of rows where 1 is together with 2]
or
B. "5"
?
To A reader
Michel Cadot, February 23, 2006 - 4:46 pm UTC
The "distinct" in c definition is there to count only 1 per each row.
If you want to count all occurrences then remove "distinct".
Regards
Michel
Analytics
Mark, February 23, 2006 - 5:30 pm UTC
The answer to that should be "5" as 1 and 2 appear as a combination 5 times.
Regards,
Mark
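The two possible semantics - "count once per row" versus "count every column pairing" - are easy to pin down outside SQL. A small Python sketch using the reader's one-row example; Michel's DISTINCT corresponds to the once-per-row counter, while Mark's intended answer of 5 corresponds to counting every pairing:

```python
from collections import Counter
from itertools import combinations

rows = [(1, 2, 2, 2, 2, 2)]  # the reader's deliberately degenerate row

once_per_row = Counter()   # a pair counts at most once per row (DISTINCT)
every_pairing = Counter()  # every column pairing counts
for r in rows:
    # unordered pairs of column values, self-pairs excluded
    pairs = [tuple(sorted(p)) for p in combinations(r, 2) if p[0] != p[1]]
    once_per_row.update(set(pairs))
    every_pairing.update(pairs)

print(once_per_row[(1, 2)])    # DISTINCT semantics
print(every_pairing[(1, 2)])   # all-pairings semantics
```

Removing DISTINCT from Michel's query, as he notes, switches the SQL from the first counter's behavior to the second's.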
Difficulty with min(id) over partition by
Ken, February 27, 2006 - 11:38 pm UTC
Hi Tom:
We have a table with over 4 million records. A bug in the java application has created duplicate records in this table. I am brought in to dedupe the records based on certain criteria. For every duplicate record set, we need to keep the record with the smallest value in the ID field and delete the rest.
I am close to identifying the IDs to be deleted but the query is not identifying the records with min(id) consistently. Here is the table structure (created for this test case -- no index or anything) and query.
Thanks in advance for your help.
ken@DEV9206> desc DEDUPE_TEST_01
Name Null? Type
----------------------------------- -------- --------
EQUIP_ID NOT NULL NUMBER
EQUIP_TYPE_ID NOT NULL NUMBER
TYPE_ID NOT NULL NUMBER
CREATED NOT NULL DATE
COL_1 NUMBER
COL_0 NUMBER
ID NOT NULL NUMBER
CT NUMBER
ken@DEV9206> l
1 SELECT equip_id, equip_type_id, type_id, TO_CHAR(CREATED, 'DDMMYYYY HH24:MI:SS') created,
2 col_1, col_0, MIN(id) OVER (PARTITION BY equip_id
3 ORDER BY equip_id, equip_type_id, type_id, TO_CHAR(created, 'DDMMYYYY HH24:MI:SS'),
4 col_1, col_0) min_id,
5 id
6 from dedupe_test_01
7* where rownum < 31
ken@DEV9206> /
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 MIN_ID ID
-------- ------------- ------- ----------------- ----- ----- --------- --------
3011 29 221 30110002 00:00:00 3 77 26445635 26445635
3011 29 221 30110002 00:00:00 3 77 26445651
3011 29 221 30110002 00:00:00 3 86 26445626 26445626
3011 29 221 30110002 00:00:00 3 86 26445653
3011 29 221 30110002 00:00:00 3 112 26445617 26445617
3011 29 221 30110002 00:00:00 3 112 26445620
3011 29 221 30110002 00:00:00 3 125 26445631*
3011 29 221 30110002 00:00:00 3 125 26445641
3551 29 221 11032005 17:01:00 1 186 3209093 3209093
3551 29 221 11032005 17:01:00 1 186 6072228
3551 29 221 11032005 17:01:00 1 186 8894681
3551 29 221 11032005 17:01:00 1 186 3837758
3551 29 221 11032005 17:01:00 1 186 3837738
3551 29 221 11032005 20:44:00 1 190 3209092 3209092
3551 29 221 11032005 20:44:00 1 190 3837757
3551 29 221 11032005 20:44:00 1 190 6072227
3551 29 221 11032005 20:44:00 1 190 8894680
3551 29 221 11032005 20:44:00 1 190 3837737
3551 29 221 11032005 23:00:00 1 227 3209091 3209091
3551 29 221 11032005 23:00:00 1 227 3837736
3551 29 221 30110002 00:00:00 3 112 3209094*
3551 29 221 30110002 00:00:00 3 112 3837739
3551 29 221 30110002 00:00:00 3 112 3837759
3551 29 221 30110002 00:00:00 3 112 6072229
3551 29 221 30110002 00:00:00 3 112 8894682
3551 29 221 30110002 00:00:00 3 118 3209095*
3551 29 221 30110002 00:00:00 3 118 3837740
3551 29 221 30110002 00:00:00 3 118 3837760
3551 29 221 30110002 00:00:00 3 118 6072230
3551 29 221 30110002 00:00:00 3 118 8894683
30 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 WINDOW (SORT)
2 1 COUNT (STOPKEY)
3 2 TABLE ACCESS (FULL) OF 'DEDUPE_TEST_01'
February 28, 2006 - 7:14 am UTC
... A bug in the java application has
created duplicate records in this table. ...
You really meant to say "a horribly flawed implementation whereby the java coders decided to do all DATA LOGIC in the wrongest place in the world - the application has exposed itself. We know there are dozens more of these lying in wait for us"
That is identifying the min id very very very very consistently.
but why are you sorting? If you want the min(id) by equip_id, you should leave out the order by - else you get the minimum ID for the current row and every row in front of it - not over all rows by equip_id.
Don't know why you are using:
TO_CHAR(created, 'DDMMYYYY HH24:MI:SS')
why would you take something incredibly sortable (a DATE) and turn it into an ascii string that doesn't even sort correctly? (it will not sort by date at all, it'll sort as a string, wrong)
But - I don't believe you want the order by - but I cannot really say because you do not tell us what the primary key that should have been in place is?
Believe me - data and business logic belongs in the DB
Ken, February 28, 2006 - 8:32 am UTC
I have been telling the Java folks the same. And I have been preaching about the bind variables as well. The database was designed by someone quite famous in the Java tech community a few years back. The data abstraction has created so many issues, and only the application knows what column contains what data based on a value in another field. But they are listening. They have updated the code to use preparedStatements and more changes are coming. A tiny victory, thanks in large part to your books and the forum. (The blog too, I must add. But I enjoy the non-tech pieces more since they show the other side we hardly see here. Keep it up!)
The only reason for using TO_CHAR was to expose the time values, but this can be done later. I will remove it and the sort and run it again.
It is unbelievable how this app works. Pulling all the data and sorting in java, yikes!
Thank you so much, Tom.
February 28, 2006 - 9:05 am UTC
but dates sort by
century, year, month, day, hour, minute, second
quite naturally!!!!!
By "exposing" the time in this example, you mucked up the sort order!!!
You put DD first - it would put the first of ANY MONTH before the second of ANY MONTH - messing it up.
it should just order by DATE - period. Never order by to_char(date)....
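Tom's point about to_char(date,'DDMMYYYY...') is easy to demonstrate: sorted as strings, the March 2005 dates land before the November 2002 date. A quick sketch (dates taken from Ken's output):

```python
from datetime import datetime

# 30-Nov-2002, 11-Mar-2005, 12-Mar-2005 rendered as 'DDMMYYYY' strings
dates = ['30112002', '11032005', '12032005']

# String sort: compares character by character, so day dominates year
as_strings = sorted(dates)

# Correct sort: parse back into real dates first
as_dates = sorted(dates, key=lambda s: datetime.strptime(s, '%d%m%Y'))

print(as_strings)  # 2005 dates sort before the 2002 date - wrong
print(as_dates)    # chronological order
```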
Analytics rock Analytics roll
Alf, February 28, 2006 - 2:11 pm UTC
Hello Tom,
I have four tables proc_event, visit, patient, and event
I need to get a list of patients with all records in the proc_event table for proc_id = 123 and 3456.
Patient proc_id
Patient A 123
3456
Patient B 123
3456
The query below is listing all records from the patient table that have either of the proc_id in the proc_event table.
Would you please direct me how or if I might be able to rewrite this in order to gets the desire listing?
Any information would be greatly appreciated. Thanks.
SELECT p.P_NAME,
p.MRN,
pe.proc_id,
e.date_time
FROM ud.patient p,
ud.proc_event pe,
ud.event e,
ud.visit v
WHERE ( (p.patient_id = v.patient_id)
AND (v.visit_id = e.visit_id)
AND (e.event_id(+) = pe.event_id)
AND (pe.proc_id in('123','3456'))
AND (e.date_time BETWEEN
to_date('01-jan-2005 00:00','dd-mon-yyyy hh24:mi') AND
to_date('20-jan-2005 23:59','dd-mon-yyyy hh24:mi')
)
)
GROUP BY p.P_NAME,
p.mrn,
pe.proc_id,
e.date_time
ORDER BY p.P_NAME ASC,
p.mrn ASC,
pe.proc_id ASC
March 01, 2006 - 7:57 am UTC
...
I have four tables proc_event, visit, patient, and event
I need to get a list of patients with all records in the proc_event table for
proc_id = 123 and 3456.
...
Well given that
a) we don't know how these relate
b) what columns might be available
c) pretty much don't know anything
It is sort of difficult.
It sounds like you simply want to
a) join patients to proc_event (but we don't even know if you CAN!!!!!!!)
b) use an IN
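If the requirement is read as "patients having BOTH proc_ids" (as the sample listing suggests), one common shape is the join plus IN that Tom describes, with a HAVING COUNT(DISTINCT ...) to demand both procedures. The table and column names below are guessed from the joins in the question, and the sample data is invented - a sketch only, in SQLite via Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table patient    (patient_id integer, p_name text);
create table visit      (visit_id integer, patient_id integer);
create table event      (event_id integer, visit_id integer, date_time text);
create table proc_event (event_id integer, proc_id text);
insert into patient values (1,'Patient A');
insert into patient values (2,'Patient B');
insert into visit   values (10,1);
insert into visit   values (20,2);
insert into event   values (100,10,'2005-01-05');
insert into event   values (200,20,'2005-01-06');
insert into proc_event values (100,'123');
insert into proc_event values (100,'3456');
insert into proc_event values (200,'123');   -- Patient B has only one proc
""")
# Keep only patients whose events cover BOTH procedure ids.
rows = conn.execute("""
select p.p_name
from patient p
join visit v       on v.patient_id = p.patient_id
join event e       on e.visit_id   = v.visit_id
join proc_event pe on pe.event_id  = e.event_id
where pe.proc_id in ('123','3456')
group by p.patient_id, p.p_name
having count(distinct pe.proc_id) = 2
""").fetchall()
print(rows)
```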
Difficulty with min(id) over partition by -- part deux
Ken, March 01, 2006 - 9:29 am UTC
Hi Tom:
I am still not getting this to group correctly. CREATED is now left as a plain DATE and the to_char is removed. Could you please let me know what could be causing this? Thanks.
ken@DEV9206> break on min_id skip 1
ken@DEV9206> l
1 SELECT equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id
5 ORDER BY equip_type_id,
6 type_id,
7 created,
8 col_1,
9 col_0) min_id
10 FROM dedupe_test_01
11* WHERE ROWNUM < 51
ken@DEV9206> /
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
---------- ------------- ---------- --------- ---------- ---------- ---------- ----------
3011 29 221 30-NOV-02 3 77 26445635 26445635
3011 29 221 30-NOV-02 3 77 26445651
3011 29 221 30-NOV-02 3 86 26445626 26445626
3011 29 221 30-NOV-02 3 86 26445653
3011 29 221 30-NOV-02 3 112 26445617 26445617
3011 29 221 30-NOV-02 3 112 26445620
3011 29 221 30-NOV-02 3 125 26445631
3011 29 221 30-NOV-02 3 125 26445641
3551 29 221 30-NOV-02 3 112 3209094 3209094
3551 29 221 30-NOV-02 3 112 3837739
3551 29 221 30-NOV-02 3 112 3837759
3551 29 221 30-NOV-02 3 112 6072229
3551 29 221 30-NOV-02 3 112 8894682
3551 29 221 30-NOV-02 3 118 3209095***
3551 29 221 30-NOV-02 3 118 3837740
3551 29 221 30-NOV-02 3 118 3837760
3551 29 221 30-NOV-02 3 118 6072230
3551 29 221 30-NOV-02 3 118 8894683
3551 29 221 11-MAR-05 1 186 3209093 3209093
3551 29 221 11-MAR-05 1 186 3837738
3551 29 221 11-MAR-05 1 186 3837758
3551 29 221 11-MAR-05 1 186 6072228
3551 29 221 11-MAR-05 1 186 8894681
3551 29 221 11-MAR-05 1 190 3209092 3209092
3551 29 221 11-MAR-05 1 190 3837737
3551 29 221 11-MAR-05 1 190 3837757
3551 29 221 11-MAR-05 1 190 6072227
3551 29 221 11-MAR-05 1 190 8894680
3551 29 221 11-MAR-05 1 227 3209091 3209091
3551 29 221 11-MAR-05 1 227 3837736
3551 29 221 11-MAR-05 1 227 3837756
3551 29 221 11-MAR-05 1 227 6072226
3551 29 221 11-MAR-05 1 227 8894679
3551 29 221 12-MAR-05 1 153 3209090 3209090
3551 29 221 12-MAR-05 1 153 3837735
3551 29 221 12-MAR-05 1 153 3837755
3551 29 221 12-MAR-05 1 153 6072225
3551 29 221 12-MAR-05 1 153 8894678
3551 29 221 16-MAR-05 1 109 3837734***
3551 29 221 16-MAR-05 1 109 3837754
3551 29 221 16-MAR-05 1 109 6072224
3551 29 221 16-MAR-05 1 109 8894677
3551 29 221 16-MAR-05 1 128 3837733***
3551 29 221 16-MAR-05 1 128 3837753
3551 29 221 16-MAR-05 1 128 6072223
3551 29 221 16-MAR-05 1 128 8894676
3551 29 221 17-MAR-05 1 181 3837732
3551 29 221 17-MAR-05 1 181 3837752
3551 29 221 17-MAR-05 1 181 6072222
3551 29 221 17-MAR-05 1 181 8894675
50 rows selected.
March 01, 2006 - 9:52 am UTC
I said you don't want the order by at all.
If you are trying to associate the MIN(ID) with every record in a group - you just partition - you do NOT order by (else you get the min id over the records from the start of the partition through the current row).
I don't know what you are really trying to retrieve since all I have is "non functioning sql" to work with - I don't know what your logic is.
The query is doing precisely what you asked - break the data up by X, sort by A,B,C and assign the min(id) of the rows from the start of the group through the current row.
I think "no order by" is called for.
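Tom's point about ORDER BY changing the window can be illustrated outside Oracle as well. The sketch below uses Python's bundled sqlite3 module (window functions need SQLite 3.25+), with hypothetical toy data standing in for dedupe_test_01; it is not the original query, just the same mechanism:

```python
import sqlite3

# Hypothetical toy data standing in for dedupe_test_01: (equip_id, id) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (equip_id INTEGER, id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 30), (1, 10), (1, 20), (2, 5)])

# With ORDER BY, the default window frame runs from the start of the
# partition through the current row, so MIN(id) is a *running* minimum.
running = conn.execute("""
    SELECT equip_id, id,
           MIN(id) OVER (PARTITION BY equip_id ORDER BY id DESC) AS m
    FROM t ORDER BY equip_id, id DESC""").fetchall()

# Without ORDER BY, the frame is the whole partition: one MIN per group,
# repeated on every row of that group.
whole = conn.execute("""
    SELECT equip_id, id,
           MIN(id) OVER (PARTITION BY equip_id) AS m
    FROM t ORDER BY equip_id, id""").fetchall()
```

In the `running` result the minimum keeps shrinking as later rows arrive; in `whole` every row of a partition already sees the group-wide minimum, which is what "associate MIN(ID) with every record in a group" requires.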
Difficulty with min(id) over partition by -- part deux - take 2
Ken, March 01, 2006 - 9:49 am UTC
Please disregard my previous posting. It did not come out right.
I still cannot get this to group correctly. TO_CHAR was removed and order WAS removed. This gave me the following.
1 SELECT rownum, equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id) min_id
5 FROM dedupe_test_01
6* WHERE ROWNUM < 51
ken@DEV9206/
ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
1 3011 29 221 30-NOV-02 3 77 26445635 26445617
2 3011 29 221 30-NOV-02 3 77 26445651
3 3011 29 221 30-NOV-02 3 86 26445626
4 3011 29 221 30-NOV-02 3 86 26445653
5 3011 29 221 30-NOV-02 3 112 26445617
6 3011 29 221 30-NOV-02 3 112 26445620
7 3011 29 221 30-NOV-02 3 125 26445631
8 3011 29 221 30-NOV-02 3 125 26445641
9 3551 29 221 30-NOV-02 3 112 3209094 3209090
10 3551 29 221 30-NOV-02 3 112 3837739
11 3551 29 221 30-NOV-02 3 112 3837759
12 3551 29 221 30-NOV-02 3 112 6072229
13 3551 29 221 30-NOV-02 3 112 8894682
14 3551 29 221 30-NOV-02 3 118 3209095
15 3551 29 221 30-NOV-02 3 118 3837740
16 3551 29 221 30-NOV-02 3 118 3837760
17 3551 29 221 30-NOV-02 3 118 6072230
18 3551 29 221 30-NOV-02 3 118 8894683
19 3551 29 221 11-MAR-05 1 186 3209093
20 3551 29 221 11-MAR-05 1 186 3837738
21 3551 29 221 11-MAR-05 1 186 3837758
22 3551 29 221 11-MAR-05 1 186 6072228
23 3551 29 221 11-MAR-05 1 186 8894681
24 3551 29 221 11-MAR-05 1 190 3209092
25 3551 29 221 11-MAR-05 1 190 3837737
26 3551 29 221 11-MAR-05 1 190 3837757
27 3551 29 221 11-MAR-05 1 190 6072227
28 3551 29 221 11-MAR-05 1 190 8894680
29 3551 29 221 11-MAR-05 1 227 3209091
30 3551 29 221 11-MAR-05 1 227 3837736
31 3551 29 221 11-MAR-05 1 227 3837756
32 3551 29 221 11-MAR-05 1 227 6072226
33 3551 29 221 11-MAR-05 1 227 8894679
34 3551 29 221 12-MAR-05 1 153 3209090
35 3551 29 221 12-MAR-05 1 153 3837735
36 3551 29 221 12-MAR-05 1 153 3837755
37 3551 29 221 12-MAR-05 1 153 6072225
38 3551 29 221 12-MAR-05 1 153 8894678
39 3551 29 221 16-MAR-05 1 109 3837734
40 3551 29 221 16-MAR-05 1 109 3837754
41 3551 29 221 16-MAR-05 1 109 6072224
42 3551 29 221 16-MAR-05 1 109 8894677
43 3551 29 221 16-MAR-05 1 128 3837733
44 3551 29 221 16-MAR-05 1 128 3837753
45 3551 29 221 16-MAR-05 1 128 6072223
46 3551 29 221 16-MAR-05 1 128 8894676
47 3551 29 221 17-MAR-05 1 181 3837732
48 3551 29 221 17-MAR-05 1 181 3837752
49 3551 29 221 17-MAR-05 1 181 6072222
50 3551 29 221 17-MAR-05 1 181 8894675
50 rows selected.
I added the ORDER BY back in the PARTITION BY clause. Here's the result:
ken@DEV9206get q_1
1 SELECT rownum, equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id
5 ORDER BY equip_type_id,
6 type_id,
7 created,
8 col_1,
9 col_0) min_id
10 FROM dedupe_test_01
11* WHERE ROWNUM < 51
ken@DEV9206/
ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
1 3011 29 221 30-NOV-02 3 77 26445635 26445635
2 3011 29 221 30-NOV-02 3 77 26445651
3 3011 29 221 30-NOV-02 3 86 26445626 26445626
4 3011 29 221 30-NOV-02 3 86 26445653
5 3011 29 221 30-NOV-02 3 112 26445617 26445617
6 3011 29 221 30-NOV-02 3 112 26445620
7 3011 29 221 30-NOV-02 3 125 26445631
8 3011 29 221 30-NOV-02 3 125 26445641
9 3551 29 221 30-NOV-02 3 112 3209094 3209094
10 3551 29 221 30-NOV-02 3 112 3837739
11 3551 29 221 30-NOV-02 3 112 3837759
12 3551 29 221 30-NOV-02 3 112 6072229
13 3551 29 221 30-NOV-02 3 112 8894682
14 3551 29 221 30-NOV-02 3 118 3209095
15 3551 29 221 30-NOV-02 3 118 3837740
16 3551 29 221 30-NOV-02 3 118 3837760
17 3551 29 221 30-NOV-02 3 118 6072230
18 3551 29 221 30-NOV-02 3 118 8894683
19 3551 29 221 11-MAR-05 1 186 3209093 3209093
20 3551 29 221 11-MAR-05 1 186 3837738
21 3551 29 221 11-MAR-05 1 186 3837758
22 3551 29 221 11-MAR-05 1 186 6072228
23 3551 29 221 11-MAR-05 1 186 8894681
24 3551 29 221 11-MAR-05 1 190 3209092 3209092
25 3551 29 221 11-MAR-05 1 190 3837737
26 3551 29 221 11-MAR-05 1 190 3837757
27 3551 29 221 11-MAR-05 1 190 6072227
28 3551 29 221 11-MAR-05 1 190 8894680
29 3551 29 221 11-MAR-05 1 227 3209091 3209091
30 3551 29 221 11-MAR-05 1 227 3837736
31 3551 29 221 11-MAR-05 1 227 3837756
32 3551 29 221 11-MAR-05 1 227 6072226
33 3551 29 221 11-MAR-05 1 227 8894679
34 3551 29 221 12-MAR-05 1 153 3209090 3209090
35 3551 29 221 12-MAR-05 1 153 3837735
36 3551 29 221 12-MAR-05 1 153 3837755
37 3551 29 221 12-MAR-05 1 153 6072225
38 3551 29 221 12-MAR-05 1 153 8894678
39 3551 29 221 16-MAR-05 1 109 3837734
40 3551 29 221 16-MAR-05 1 109 3837754
41 3551 29 221 16-MAR-05 1 109 6072224
42 3551 29 221 16-MAR-05 1 109 8894677
43 3551 29 221 16-MAR-05 1 128 3837733
44 3551 29 221 16-MAR-05 1 128 3837753
45 3551 29 221 16-MAR-05 1 128 6072223
46 3551 29 221 16-MAR-05 1 128 8894676
47 3551 29 221 17-MAR-05 1 181 3837732
48 3551 29 221 17-MAR-05 1 181 3837752
49 3551 29 221 17-MAR-05 1 181 6072222
50 3551 29 221 17-MAR-05 1 181 8894675
50 rows selected.
Row numbers 14, 39, 43, and 47 should have started new groups. What am I doing wrong that it is not seeing these as new groups?
Thanks much for your help.
March 01, 2006 - 10:32 am UTC
you need to just state in english what you are trying to do rather than posting SQL that does not achieve it.
Tell us your LOGIC.
Difficulty with min(id) over partition by -- part deux - take 3
Ken, March 01, 2006 - 10:38 am UTC
Thanks, Tom, fair enough.
We would like to select duplicate rows from a table based on the values in certain fields (not all fields will have duplicate data, since some are unique to each row), identify the row within each set with the lowest value in one column (min(id)), and show this value in a separate column.
Please let me know how we can achieve that.
Thanks again.
March 01, 2006 - 10:42 am UTC
tell me what values perhaps.
In general, you will partition by the unique key
You will get the MIN(ID) by that key
No order by
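Tom's recipe above - partition by every column that defines a duplicate, take MIN(ID), no ORDER BY - can be sketched with Python's sqlite3 module (SQLite 3.25+). The four rows below are a hypothetical cut-down of Ken's data, keeping only enough columns to show the grouping:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dedupe (equip_id INT, created TEXT,
                                     col_0 INT, id INT)""")
# Two duplicate pairs, differing only in their unique id column.
rows = [(3011, '2002-11-30', 77, 26445635),
        (3011, '2002-11-30', 77, 26445651),
        (3011, '2002-11-30', 86, 26445626),
        (3011, '2002-11-30', 86, 26445653)]
conn.executemany("INSERT INTO dedupe VALUES (?,?,?,?)", rows)

# Partition by ALL the columns that define a duplicate; with no ORDER BY
# the frame is the whole partition, so every row in a duplicate group
# sees that group's MIN(id).
out = conn.execute("""
    SELECT equip_id, col_0, id,
           MIN(id) OVER (PARTITION BY equip_id, created, col_0) AS min_id
    FROM dedupe ORDER BY equip_id, col_0, id""").fetchall()
```

Rows where `id <> min_id` are then the deletable duplicates, which is the usual next step in a de-dup.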
Difficulty with min(id) over partition by -- part deux - take 4
Ken, March 01, 2006 - 11:54 am UTC
Thanks, Tom, but I think we are not on the same page yet.
Please let me try again re: logic.
We have the following table:
ken@DEV9206desc dedupe_test_01
Name Null? Type
----------------------------- -------- ------------------
ID NOT NULL NUMBER
EQUIP_ID NOT NULL NUMBER
EQUIP_TYPE_ID NOT NULL NUMBER
TYPE_ID NOT NULL NUMBER
CREATED NOT NULL DATE
COL_0 NUMBER
COL_1 NUMBER
ken@DEV9206l
1 select * from dedupe_test_01
2 where equip_id in (3011, 3551)
3 and rownum < 51
4* order by 1,2,3,4,5,6
ken@DEV9206/
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_0 COL_1 ID
-------- ------------- ------- --------- ----- ----- --------
3011 29 221 30-NOV-02 77 3 26445635
3011 29 221 30-NOV-02 77 3 26445651
3011 29 221 30-NOV-02 86 3 26445626
3011 29 221 30-NOV-02 86 3 26445653
3011 29 221 30-NOV-02 112 3 26445617
3011 29 221 30-NOV-02 112 3 26445620
3011 29 221 30-NOV-02 125 3 26445631
3011 29 221 30-NOV-02 125 3 26445641
3551 29 221 30-NOV-02 112 3 3209094
3551 29 221 30-NOV-02 112 3 3837739
3551 29 221 30-NOV-02 112 3 3837759
3551 29 221 30-NOV-02 112 3 6072229
3551 29 221 30-NOV-02 112 3 8894682
3551 29 221 30-NOV-02 118 3 3209095
3551 29 221 30-NOV-02 118 3 3837740
3551 29 221 30-NOV-02 118 3 3837760
3551 29 221 30-NOV-02 118 3 6072230
3551 29 221 30-NOV-02 118 3 8894683
3551 29 221 11-MAR-05 186 1 3209093
3551 29 221 11-MAR-05 186 1 3837738
3551 29 221 11-MAR-05 186 1 3837758
3551 29 221 11-MAR-05 186 1 6072228
3551 29 221 11-MAR-05 186 1 8894681
3551 29 221 11-MAR-05 190 1 3209092
3551 29 221 11-MAR-05 190 1 3837737
3551 29 221 11-MAR-05 190 1 3837757
3551 29 221 11-MAR-05 190 1 6072227
3551 29 221 11-MAR-05 190 1 8894680
3551 29 221 11-MAR-05 227 1 3209091
3551 29 221 11-MAR-05 227 1 3837736
3551 29 221 11-MAR-05 227 1 3837756
3551 29 221 11-MAR-05 227 1 6072226
3551 29 221 11-MAR-05 227 1 8894679
3551 29 221 12-MAR-05 153 1 3209090
3551 29 221 12-MAR-05 153 1 3837735
3551 29 221 12-MAR-05 153 1 3837755
3551 29 221 12-MAR-05 153 1 6072225
3551 29 221 12-MAR-05 153 1 8894678
3551 29 221 16-MAR-05 109 1 3837734
3551 29 221 16-MAR-05 109 1 3837754
3551 29 221 16-MAR-05 109 1 6072224
3551 29 221 16-MAR-05 109 1 8894677
3551 29 221 16-MAR-05 128 1 3837733
3551 29 221 16-MAR-05 128 1 3837753
3551 29 221 16-MAR-05 128 1 6072223
3551 29 221 16-MAR-05 128 1 8894676
3551 29 221 17-MAR-05 181 1 3837732
3551 29 221 17-MAR-05 181 1 3837752
3551 29 221 17-MAR-05 181 1 6072222
3551 29 221 17-MAR-05 181 1 8894675
50 rows selected.
And here's the answer I need:
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_0 COL_1 ID MIN_ID
-------- ------------- ------- --------- ----- ----- -------- --------
3011 29 221 30-NOV-02 77 3 26445635 26445635
3011 29 221 30-NOV-02 77 3 26445651
3011 29 221 30-NOV-02 86 3 26445626 26445626
3011 29 221 30-NOV-02 86 3 26445653
3011 29 221 30-NOV-02 112 3 26445617 26445617
3011 29 221 30-NOV-02 112 3 26445620
3011 29 221 30-NOV-02 125 3 26445631 26445631
3011 29 221 30-NOV-02 125 3 26445641
3551 29 221 30-NOV-02 112 3 3209094 3209094
3551 29 221 30-NOV-02 112 3 3837739
3551 29 221 30-NOV-02 112 3 3837759
3551 29 221 30-NOV-02 112 3 6072229
3551 29 221 30-NOV-02 112 3 8894682
3551 29 221 30-NOV-02 118 3 3209095 3209095
3551 29 221 30-NOV-02 118 3 3837740
3551 29 221 30-NOV-02 118 3 3837760
3551 29 221 30-NOV-02 118 3 6072230
3551 29 221 30-NOV-02 118 3 8894683
3551 29 221 11-MAR-05 186 1 3209093 3209093
3551 29 221 11-MAR-05 186 1 3837738
3551 29 221 11-MAR-05 186 1 3837758
3551 29 221 11-MAR-05 186 1 6072228
3551 29 221 11-MAR-05 186 1 8894681
3551 29 221 11-MAR-05 190 1 3209092 3209092
3551 29 221 11-MAR-05 190 1 3837737
3551 29 221 11-MAR-05 190 1 3837757
3551 29 221 11-MAR-05 190 1 6072227
3551 29 221 11-MAR-05 190 1 8894680
Perhaps, PARTITION BY is not appropriate? Thanks.
March 01, 2006 - 1:49 pm UTC
I know we are not on the same page - because all i want is a textual description of the logic.
I do not, will not, reverse engineer "this is what I get" and "this is what I want"
I want you to write it down as if you would give it to someone to implement
because....... That is precisely what you are doing.
Difficulty with min(id) over partition by -- part deux - take 5
Ken, March 01, 2006 - 12:23 pm UTC
Thanks, Tom. Your questions forced me to look at it again, and I was able to resolve it without using PARTITION BY.
Thanks again.
Analytics Question
Mike, March 01, 2006 - 2:48 pm UTC
I am having an issue with using analytics. What I need to be able to accomplish is:
I have a table called entity; this table is recursive in that every entity,
with the exception of the highest-level entity, has a parent entity.
Normally this list is easily generated by using 'connect by' in the SQL to show the hierarchy.
The problem I have is now we have to show all the tickets that have been opened for each entity
and all the entities under it.
A simple join will get the open tickets for each entity, but in order to sum all of the tickets
for each entity and all entities under it takes too many resources to be considered usable.
I have tried several ways to get this query to work with as few resources as possible but still
believe that this SQL will degrade to unacceptable performance as the table(s) grow.
what I really need to do is sum the entities tickets up in reverse order from the bottom up to
the level 1 entity so that I don't have to traverse the tree from level 1 to level 6, then each
level2 to level6, etc,..
I wrote a quick procedure to populate a table (t) with each uuid that has a ticket, its sum of
the tickets tied to that entity, and a concatenated list of uuids under that entity in the tree.
This resulted in a much smaller table, but this is not the way I would want to implement it.
This table is going to grow into several hundred thousand entities and I need to be able
to generate this list much faster and using less resources than the connect by statements below.
Any help would definitely be appreciated.
Without using a procedure and a two step process here is what I have to work with for a test,
CREATE TABLE ENTITY (
ENTITY_UUID VARCHAR2(32),
NAME VARCHAR2(256),
PARENT_UUID VARCHAR2(32)
)
/
Table created.
CREATE TABLE ENTITY_TCKT (
ENTITY_UUID VARCHAR2(32),
CURRENT_LIFECYCLE_STATE NUMBER
)
/
Table created.
Then populate the entity table,
insert into entity values ('13E7CAA5FDEB42518A798A77A19F70B0','Level1 Entity',NULL);
insert into entity values ('66A6A6EFFA9D46BE82EC8F5CFFAC91B9','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('DCF6B6366D6449DB95A5AEA6B14F31F7','Level5 Entity','66A6A6EFFA9D46BE82EC8F5CFFAC91B9');
insert into entity values ('E2FD444948714528805EBFFA102511F5','Level5 Entity','CB4E1B74035947B9A5B9B0FE264DF4E7');
insert into entity values ('2E1E0646AC9F4BB9A6E4A747B20B2595','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('54133391FDD54221B11382A20DFC38AA','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('95A5F85D68184DB7A49F9DF7A236F9AF','Level5 Entity','99762DC75A5D42DCBEA6950D7011F130');
insert into entity values ('EAD30C5578BD491991B0D7049CD4F277','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('536FCF7E4A5D457B8C3AECBED878FDBF','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('883FD970DF264B7A9DC0DFBAF225012A','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('C23E104B5AA044F795A6896B1C0B08E4','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('64BA9F3295194A3B955FD446DBB2E7EC','Level4 Entity','C23E104B5AA044F795A6896B1C0B08E4');
insert into entity values ('0FE28D72C40C4D6FBB439A49B0BE6D3F','Level5 Entity','64BA9F3295194A3B955FD446DBB2E7EC');
----------list goes on------------------
Then to generate some random ticket data
DECLARE
l_tckt_ktr NUMBER;
l_entity_ktr NUMBER;
BEGIN
FOR i IN (select entity_uuid from entity)
LOOP
l_entity_ktr := round(dbms_random.value(1,6),0);
DBMS_OUTPUT.PUT_LINE('Processing entity -> '||i.entity_uuid||' l_entity_ktr = '||l_entity_ktr);
IF l_entity_ktr = 1 THEN
l_tckt_ktr := round(dbms_random.value(1,10),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 3 THEN
l_tckt_ktr := round(dbms_random.value(50,100),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 5 THEN
l_tckt_ktr := round(dbms_random.value(500,1000),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
END IF;
END LOOP;
END;
/
PL/SQL procedure successfully completed.
Now that the tables have data, try two ways to get the query
set lines 256
column lpad('',2*level)||e.name format a50
column lpad('',2*level)||e.entity_uuid format a50
select lpad(' ', 2 * level)||e.name, lpad(' ', 2 * level)||e.entity_uuid, level,
(select count(*) from entity_tckt where current_lifecycle_state = 0 and entity_uuid in (
select a.entity_uuid from entity a start with a.entity_uuid = e.entity_uuid connect by prior a.entity_uuid = a.parent_uuid)) as nbr
from entity e
start with e.entity_uuid = '13E7CAA5FDEB42518A798A77A19F70B0' connect by prior e.entity_uuid = e.parent_uuid
/
LPAD('',2*LEVEL)||E.NAME LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
Level1 Entity 13E7CAA5FDEB42518A798A77A19F70B0 1 15533
Level2 Entity 2E1E0646AC9F4BB9A6E4A747B20B2595 2 7
Level2 Entity 54133391FDD54221B11382A20DFC38AA 2 15526
Level3 Entity 536FCF7E4A5D457B8C3AECBED878FDBF 3 890
Level4 Entity 66A6A6EFFA9D46BE82EC8F5CFFAC91B9 4 51
Level5 Entity DCF6B6366D6449DB95A5AEA6B14F31F7 5 0
Level4 Entity EAD30C5578BD491991B0D7049CD4F277 4 76
Level4 Entity 883FD970DF264B7A9DC0DFBAF225012A 4 0
Level4 Entity 0F01178489CB421CA5A4C55AEC98300E 4 4
Level5 Entity 5B3B033B4A30443BB1B6F57C1BB05399 5 4
Level4 Entity CB4E1B74035947B9A5B9B0FE264DF4E7 4 69
LPAD('',2*LEVEL)||E.NAME LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
Level5 Entity E2FD444948714528805EBFFA102511F5 5 69
Level4 Entity 99762DC75A5D42DCBEA6950D7011F130 4 607
Level5 Entity 95A5F85D68184DB7A49F9DF7A236F9AF 5 517
-------------------and on and on----------------
122 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
4 2 FILTER
5 4 CONNECT BY (WITH FILTERING)
6 5 NESTED LOOPS
7 6 TABLE ACCESS (FULL) OF 'ENTITY'
8 6 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
9 5 NESTED LOOPS
10 9 BUFFER (SORT)
11 10 CONNECT BY PUMP
12 9 TABLE ACCESS (FULL) OF 'ENTITY'
13 0 CONNECT BY (WITH FILTERING)
14 13 NESTED LOOPS
15 14 TABLE ACCESS (FULL) OF 'ENTITY'
16 14 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
17 13 NESTED LOOPS
18 17 BUFFER (SORT)
19 18 CONNECT BY PUMP
20 17 TABLE ACCESS (FULL) OF 'ENTITY'
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
208383 consistent gets
0 physical reads
0 redo size
9007 bytes sent via SQL*Net to client
591 bytes received via SQL*Net from client
10 SQL*Net roundtrips to/from client
27913 sorts (memory)
0 sorts (disk)
122 rows processed
And now using 'WITH'
with alm as (select entity_uuid, count(*) as nbr
from entity_tckt
group by entity_uuid)
select e.entity_uuid, sum(nbr) from entity e, alm
where alm.entity_uuid in (select entity_uuid from entity start with entity_uuid = e.entity_uuid connect by prior entity_uuid = parent_uuid)
group by e.entity_uuid
/
ENTITY_UUID SUM(NBR)
-------------------------------- ----------
026D4A6B547544E08CFEC5DDED3B7777 772
02AD7930E9B94A20A9F4E245B3F4B8C4 1747
04B3AD5F804B4A66AC6C91606EA7019B 74
070547E30C604D4680089959A6DB7684 677
08ABA73521F94F2DB19FF8C1C53ADF06 2
0D65B9BB906B442B9A6BDFEEF3D858AC 86
0F01178489CB421CA5A4C55AEC98300E 4
0FE28D72C40C4D6FBB439A49B0BE6D3F 716
10801AFBCBC44F2F9851E16BC1CCA442 703
13E7CAA5FDEB42518A798A77A19F70B0 15533
1677A02C0B89479BA27F95D7AA3DAC03 9516
ENTITY_UUID SUM(NBR)
-------------------------------- ----------
1AE6BCDA1FA74B0B8DEBEFF13F9A444A 96
1F9BD4CD72FA4879AC338397FB59FD39 74
21FAD03A5EB34D79908E96DA647FF24C 592
235F6C1600584CDD8B3FB1FFCD742E7E 839
23EA0B4618A94C1586BDF08437B97EB0 3067
25CFE1E64F36468DB291CBCF0867B314 59
26C635064E83447B91BA8D125FE74C2A 776
2ABC45D4A5D84684AC29A578C5CCBD3E 1759
2DF37BBB64254DB29081F03D38B5CE33 876
2E1E0646AC9F4BB9A6E4A747B20B2595 7
2FF85FEF82864C0BB75BA4513885ED0E 76
----------------more data ----------------
71 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 FILTER
3 2 NESTED LOOPS
4 3 VIEW
5 4 SORT (GROUP BY)
6 5 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
7 3 TABLE ACCESS (FULL) OF 'ENTITY'
8 2 FILTER
9 8 CONNECT BY (WITH FILTERING)
10 9 NESTED LOOPS
11 10 TABLE ACCESS (FULL) OF 'ENTITY'
12 10 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
13 9 NESTED LOOPS
14 13 BUFFER (SORT)
15 14 CONNECT BY PUMP
16 13 TABLE ACCESS (FULL) OF 'ENTITY'
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
197127 consistent gets
0 physical reads
0 redo size
3707 bytes sent via SQL*Net to client
547 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
27902 sorts (memory)
0 sorts (disk)
71 rows processed
The logical reads will kill us as the table grows - Any help you could give would be fantastic
P.S. - I love the site, use it every day - Thanx
Mike
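One way to sketch Mike's "sum from the bottom up" idea outside SQL: a single post-order pass over the parent/child map totals every subtree exactly once, instead of re-running a CONNECT BY traversal per entity as the scalar subquery does. This is a minimal Python sketch with hypothetical miniature data, not a drop-in replacement for the real query:

```python
from collections import defaultdict

# Hypothetical miniature of the ENTITY / ENTITY_TCKT data: each tuple is
# (entity_uuid, parent_uuid); ticket_counts holds direct tickets per entity.
entities = [('L1', None), ('L2a', 'L1'), ('L2b', 'L1'), ('L3', 'L2b')]
ticket_counts = {'L1': 0, 'L2a': 7, 'L2b': 2, 'L3': 5}

# Invert the parent pointers into a children map (one pass over the table).
children = defaultdict(list)
for uuid, parent in entities:
    children[parent].append(uuid)

subtree = {}
def total(uuid):
    # Post-order: a node's total is its own tickets plus its children's
    # subtree totals, so each node is visited exactly once.
    subtree[uuid] = ticket_counts.get(uuid, 0) + sum(total(c) for c in children[uuid])
    return subtree[uuid]

for root in children[None]:
    total(root)
```

Each entity's `subtree` value is the count of tickets for it and everything under it, computed in O(number of entities) instead of one tree walk per node; the same bottom-up idea is what a single hierarchical pass in SQL would need to express.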
Analytics rock Analytics roll
Alf, March 01, 2006 - 2:50 pm UTC
Sorry, I should have mentioned that there is no direct relation between patient and proc_event:
The only way that I'm able to relate patients to proc_event is by joining event to proc_event as (e.event_id(+) = pe.event_id) and then visit to patient (p.patient_id = v.patient_id). The IN operator is working as expected; however, I tried to use AND, as pe.proc_id = '123' AND pe.proc_id = '3456', because I want to return only records for patients who have both 123 and 3456 (proc_id) records in the proc_event table.
Hello Tom, Analytics rock Analytics roll
Alf, March 02, 2006 - 4:51 pm UTC
Hi Tom,
I've been trying many different approaches for this.
Here are the table descs; I'm including the whole output from the desc command. Not sure if you'd want me to include everything or cut the non-relevant columns out.
SQL> desc ud_master.patient
Name Null? Type
----------------------------------------- -------- ----------------------------
PATIENT_ID NOT NULL NUMBER(12)
NAME VARCHAR2(100)
TITLE_ID NUMBER(12)
MEDICAL_RECORD_NUMBER VARCHAR2(20)
SEX VARCHAR2(8)
BIRTHDATE DATE
DATE_OF_DEATH DATE
APT_SUITE VARCHAR2(100)
STREET_ADDRESS VARCHAR2(100)
CITY VARCHAR2(50)
STATE VARCHAR2(50)
COUNTRY VARCHAR2(50)
MAILING_CODE VARCHAR2(50)
MARITAL_STATUS_ID NUMBER(12)
RACE_ID NUMBER(12)
RELIGION_ID NUMBER(12)
FREE_TEXT_RELIGION VARCHAR2(100)
OCCUPATION_ID NUMBER(12)
FREE_TEXT_OCCUPATION VARCHAR2(100)
EMPLOYER_ID NUMBER(12)
FREE_TEXT_EMPLOYER VARCHAR2(150)
MOTHER_PATIENT_ID NUMBER(12)
COLLAPSED_INTO_PATIENT_ID NUMBER(12)
SOCIAL_SECURITY_NUMBER VARCHAR2(15)
LIFECARE_VISIT_ID NUMBER(12)
CONFIDENTIAL_FLAG VARCHAR2(1)
HOME_PHONE VARCHAR2(20)
DAY_PHONE VARCHAR2(20)
SMOKER_FLAG VARCHAR2(1)
CURRENT_LOCATION VARCHAR2(15)
SEC_LANG_NAME VARCHAR2(100)
ADDR_STRING VARCHAR2(50)
BLOCK_CODE VARCHAR2(50)
SQL> desc ud_master.visit
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
VISIT_NUMBER VARCHAR2(40)
PATIENT_ID NUMBER(12)
VISIT_TYPE_ID NUMBER(12)
VISIT_SUBTYPE_ID NUMBER(12)
VISIT_STATUS_ID NUMBER(12)
FACILITY_ID NUMBER(12)
ATTENDING_EMP_PROVIDER_ID NUMBER(12)
RESIDENT_EMP_PROVIDER_ID NUMBER(12)
ADMISSION_DATE_TIME DATE
DISCHARGE_DATE_TIME DATE
DISCHARGE_TYPE_ID NUMBER(12)
MARITAL_STATUS_ID NUMBER(12)
RELIGION_ID NUMBER(12)
FREE_TEXT_RELIGION VARCHAR2(100)
FINANCIAL_CLASS_ID NUMBER(12)
OCCUPATION_ID NUMBER(12)
FREE_TEXT_OCCUPATION VARCHAR2(100)
EMPLOYER_ID NUMBER(12)
FREE_TEXT_EMPLOYER VARCHAR2(100)
PHYSICIAN_SERVICE_ID VARCHAR2(12)
LOCATION_ID VARCHAR2(12)
ADDL_RESP_EMP_PROVIDER_ID NUMBER(12)
ADDL_RESP_STRING VARCHAR2(100)
ATTENDING_STRING VARCHAR2(100)
RESIDENT_STRING VARCHAR2(100)
LAST_LOCATION VARCHAR2(15)
ADDL_RESP_RESIDENT_SERVICE_ID NUMBER(12)
TRIAGE_ACUITY_ID NUMBER(12)
SERIES_VISIT_FLAG VARCHAR2(5)
NEWBORN_FLAG VARCHAR2(5)
SQL> desc ud_master.proc_event
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
ORDER_SPAN_ID NUMBER(12)
ORDER_SPAN_STATE_ID NUMBER(12)
PROC_ID NUMBER(12)
ORIG_SCHEDULE_BEGIN_DATE_TIME DATE
ORIG_SCHEDULE_END_DATE_TIME DATE
FINAL_SCHEDULE_BEGIN_DATE_TIME DATE
FINAL_SCHEDULE_END_DATE_TIME DATE
ABNORMAL_STATE_ID VARCHAR2(3)
MODIFIED_PROC_NAME VARCHAR2(250)
FACILITY_ID NUMBER(12)
PRIORITY_ID NUMBER(12)
CORRECTED_FLAG VARCHAR2(5)
RX_FLAG VARCHAR2(5)
SPEC_RECOLLECT_FLAG VARCHAR2(5)
COMPLETE_RESULT_RPT NUMBER(12)
ORDER_VISIT_ID NUMBER(12)
ORDER_DEFINITION_ID VARCHAR2(25)
PROC_ORDER_NBR NUMBER(12)
SQL> desc ud_master.event
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
DATE_TIME DATE
EVENT_STATUS_ID NUMBER(12)
EVENT_TYPE_ID NUMBER(12)
PATIENT_SCHEDULE_DISPLAY VARCHAR2(100)
I've created the following query with a subquery and the EXISTS clause; would you please review it and let me know whatever I need to correct? Many thanks in advance.
SELECT DISTINCT patient.NAME,
patient.medical_record_number Medical_RN,
sub_epe.proc_id,
to_date(sub_epe.date_time,'dd-mon-yy') event_dt
FROM ud_master.patient,
(SELECT proc_event.proc_id, event.date_time,proc_event.VISIT_ID
FROM ud_master.event, ud_master.proc_event
WHERE (exists (select *
from ud_master.proc_event sub_p
where sub_p.proc_id = sub_p.proc_id)
and (proc_event.visit_id = event.visit_id)))sub_epe,ud_master.visit
WHERE ( (patient.patient_id = visit.patient_id)
AND (sub_epe.proc_id = 21078) and (sub_epe.proc_id = 22025)
AND (sub_epe.date_time BETWEEN to_date('25-jan-2005','dd-mon-yyyy') AND
to_date('31-jan-2005','dd-mon-yyyy')))
GROUP BY patient.NAME,
patient.medical_record_number,
sub_epe.proc_id,
sub_epe.date_time
Another Select Query
A reader, March 08, 2006 - 2:55 am UTC
Hi Tom,
Have a requirement like this:
Create table test(
main number,
a1 varchar2(10),
a2 varchar2(10),
a3 varchar2(10),
a4 varchar2(10),
a5 varchar2(10)) nologging;
Insert into test values (1,'A1','A2','A3','A4','A5');
Normal Output:
Main A1 A2 A3 A4 A5
---- -- -- -- -- --
1 A1 A2 A3 A4 A5
Desired Output:
---------------
Main Txt
---- ---
1 A1
1 A2
1 A3
1 A4
1 A5
I know it is a simple query.
Please advise
March 09, 2006 - 12:13 pm UTC
ops$tkyte@ORA10GR2> with r as
2 (select level l from dual connect by level <= 5)
3 select test.main,
4 decode( r.l, 1, a1, 2, a2, 3, a3, 4, a4, 5, a5 ) data
5 from test, r;
MAIN DATA
---------- ----------
1 A1
1 A2
1 A3
1 A4
1 A5
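Tom's trick - cross join the row against a generated row source, then pick one column per copy - ports to other engines too. Here is a sketch with Python's sqlite3, where a recursive CTE plays the role of CONNECT BY LEVEL and CASE the role of DECODE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE test (main INT, a1 TEXT, a2 TEXT, a3 TEXT,
                                   a4 TEXT, a5 TEXT)""")
conn.execute("INSERT INTO test VALUES (1,'A1','A2','A3','A4','A5')")

# Generate 5 rows (l = 1..5), cross join them against the wide row, and
# select column number l from each copy - one output row per column.
out = conn.execute("""
    WITH RECURSIVE r(l) AS (SELECT 1 UNION ALL SELECT l + 1 FROM r WHERE l < 5)
    SELECT test.main,
           CASE r.l WHEN 1 THEN a1 WHEN 2 THEN a2 WHEN 3 THEN a3
                    WHEN 4 THEN a4 ELSE a5 END AS data
    FROM test, r ORDER BY r.l""").fetchall()
```

With more than one row in `test`, the cross join fans every wide row out into five narrow rows, which is exactly the desired unpivot.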
Analytics rock Analytics roll
Alf, March 09, 2006 - 4:37 pm UTC
Hello Tom,
How about this?
I have four tables (descs included below) that hold information about patients along with their visits and the tests (procedures, hence the proc table) performed. In some cases patients are ordered two tests/procedures within a certain period of time.
I need to create a report listing patients that have had at least two procedures done, in this case procedures 21078 and 22025; the desired output would be:
P_name P_MRN Proc_ID Event_Date_Time
---------------- ----------- ----------- --------------------
Patient Test, A 44422244555 21078 01-jan-2005 00:00
22025 05-jan-2005 10:35
Patient Test, B 3334442222 21078 28-feb-2005 11:15
22025 31-jul-2005 01:35
...
My challenge is that there's no direct relation between patient and proc_event. Would you please review these tables and give a hint on how to approach this? Thanks.
Patient
=======
Name Null? Type
----------------------------------------- -------- ----------------------------
PATIENT_ID NOT NULL NUMBER(12)
NAME VARCHAR2(100)
TITLE_ID NUMBER(12)
MEDICAL_RECORD_NUMBER VARCHAR2(20)
SEX VARCHAR2(8)
.....
Visit
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
VISIT_NUMBER VARCHAR2(40)
PATIENT_ID NUMBER(12)
VISIT_TYPE_ID NUMBER(12)
VISIT_SUBTYPE_ID NUMBER(12)
VISIT_STATUS_ID NUMBER(12)
FACILITY_ID NUMBER(12)
......
proc_event
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
ORDER_SPAN_ID NUMBER(12)
ORDER_SPAN_STATE_ID NUMBER(12)
PROC_ID NUMBER(12)
......
event
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
DATE_TIME DATE
EVENT_STATUS_ID NUMBER(12)
EVENT_TYPE_ID NUMBER(12)
PATIENT_SCHEDULE_DISPLAY VARCHAR2(100)
......
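The "patients with both procedures" requirement is a classic relational-division problem: a single row can never satisfy proc_id = x AND proc_id = y, but grouping the filtered rows per patient and demanding two distinct proc_ids can. Below is a minimal sketch with Python's sqlite3, using a hypothetical flattened patient_id column (against the real schema you would first join patient to visit to event to proc_event to derive it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proc_event (patient_id INT, proc_id INT)")
# Patient 1 had both procedures; 2 had only one; 3 had one procedure twice.
conn.executemany("INSERT INTO proc_event VALUES (?,?)",
                 [(1, 21078), (1, 22025), (2, 21078), (3, 22025), (3, 22025)])

# Filter to the two procedure ids, group per patient, and keep only
# patients whose rows cover BOTH ids (COUNT(DISTINCT ...) = 2).
out = conn.execute("""
    SELECT patient_id FROM proc_event
    WHERE proc_id IN (21078, 22025)
    GROUP BY patient_id
    HAVING COUNT(DISTINCT proc_id) = 2
    ORDER BY patient_id""").fetchall()
```

Note that patient 3 is excluded even with two matching rows, because both rows are the same procedure; the DISTINCT is what enforces "both", where a plain COUNT(*) = 2 would not.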
Analytics Rock
Alf, March 10, 2006 - 3:31 pm UTC
Hello Tom,
Follow-up to my previous question above regarding a list of patients who have had two specific tests performed/ordered, in this case proc_id 22025 and 21078.
I finally got the below query to work this time. Would you please review it and let me know if there's anything I need to change to improve performance? Thanks.
SELECT
DISTINCT patient.medical_record_number mrn,
patient.name p_name,
proc.proc_id,proc.name,
last_day(to_date(event.date_time, 'dd-mon-yy')) D_time
FROM ud_master.event,
ud_master.proc_event,
ud_master.patient,
ud_master.visit,
ud_master.proc
WHERE (
(patient.patient_id = visit.patient_id)
AND (visit.visit_id = proc_event.visit_id)
AND (visit.visit_id = event.visit_id)
AND (proc_event.visit_id = event.visit_id)
AND (proc_event.event_id = event.event_id)
AND (proc.proc_id in (22025,21078))
AND (event.date_time BETWEEN to_date('01-jan-2005','dd-mon-yyyy') AND
to_date('31-dec-2005','dd-mon-yyyy')))
GROUP BY patient.medical_record_number,
patient.name,
proc.proc_id,
proc.name,
to_date(event.date_time, 'dd-mon-yy')
ORDER BY patient.medical_record_number,'p_name',
proc.proc_id,
proc.name,'D_time'
Analytic function
Mohamed from France, March 23, 2006 - 4:20 am UTC
Dear Tom,
I have a table T1 as follows:
IDE NOT NULL NUMBER
PTFL_IDE NOT NULL NUMBER
BATY_TYP NOT NULL VARCHAR2(6)
CTP_IC NOT NULL NUMBER
LABL_PLY NOT NULL VARCHAR2(6)
NMFI_IDE NOT NULL NUMBER
QTY NOT NULL NUMBER
and a table T2 as follows
IDE NOT NULL NUMBER
SNAP_NMFI_IDE NOT NULL NUMBER
IDE_PORTF VARCHAR2(50) ===>
SNAP_BATY_TYP VARCHAR2(6)
USR_INS VARCHAR2(48)
DAT_INS DATE
USR_UPD VARCHAR2(48)
DAT_UPD DATE
PGM_INS VARCHAR2(48)
PGM_UPD VARCHAR2(48)
and table T3 as follows
PTFL_IDE NOT NULL NUMBER
IDE_PORTF NOT NULL VARCHAR2(50)
PTF_STATUS NOT NULL VARCHAR2(3)
NLAB_COD_LABL NOT NULL VARCHAR2(6)
I would like to get from table T1 the sum(T1.QTY) grouped by T1.PTFL_IDE,T1.BATY_TYP,T1.CTP_IC, T1.LABL_PLY +
the max(T2.dat_ins) for each (T3.IDE_PORTF,T3.SNAP_BATY_TYP) in this group
Remark: T3.SNAP_BATY_TYP = T1. BATY_TYP
T3.IDE_PORTF = SELECT T3.IDE_PORTF FROM T3 WHERE T3.PTFL_IDE = T1.PTFL_IDE
Thanks in advance for your help
March 23, 2006 - 10:54 am UTC
sorry, no help.
no create table.
no insert into.
no real explanation of the goal. the remark doesn't make sense (don't know what you are trying to remark), the "i would like to get" isn't very clear either.
Simple Analytical Query
A reader, March 24, 2006 - 2:31 am UTC
Hi Tom,
I have finally got a chance to make use of an analytic query but I am not able to get it working. Following is my table data:-
ITEM MYORDERS QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
ABC 5 50 10
ABC 45 50 10
ABC 25 50 10
DEF 30 60 10
DEF 40 60 10
DEF 30 50 10
DEF 30 60 10
DEF 40 60 10
XYZ 20 80 10
XYZ 10 80 10
XYZ 10 80 10
ITEM MYORDERS QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
XYZ 20 80 10
Now I want the Output to be as :-
Item MYORDERS qtyonhand safetystock tomake --> (qtyonhand-safetystock-sum(MYORDERS))
==== ========= ======= ============ ========
ABC 5 50 10 0 (haven't depleted stock)
ABC 45 50 10 10 -1 x (50-10-5-45)
ABC 25 50 10 35 -1 x (50-10-5-45-25)
so on........
I tried the following query but it doesn't give me a running total for the MYORDERS column :-
SQL> select item,MYORDERS,sum(MYORDERS) over (partition by item) tomake from test;
ITEM MYORDERS TOMAKE
---------- ---------- ----------
ABC 5 75
ABC 45 75
ABC 25 75
DEF 30 170
DEF 40 170
DEF 30 170
DEF 30 170
DEF 40 170
XYZ 20 60
XYZ 10 60
XYZ 10 60
ITEM MYORDERS TOMAKE
---------- ---------- ----------
XYZ 20 60
12 rows selected.
If I get a running total in TOMAKE then I can just subtract it from the value of SAFETYSTOCK for that particular row to get my desired Output.
I know I am missing the order by clause; I tried using the other 2 columns (QTYONHAND, SAFETYSTOCK) but did not get the desired output. Can you please help me out?
--Create Table Script
CREATE TABLE TEST
(
ITEM VARCHAR2(10),
MYORDERS NUMBER,
QTYONHAND NUMBER,
SAFETYSTOCK NUMBER
)
--Insert Statements
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'ABC', 5, 50, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'ABC', 45, 50, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'ABC', 25, 50, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'DEF', 30, 60, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'DEF', 40, 60, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'DEF', 30, 60, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'DEF', 40, 60, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'DEF', 30, 50, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'XYZ', 10, 80, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'XYZ', 20, 80, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'XYZ', 10, 80, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES (
'XYZ', 20, 80, 10);
COMMIT;
Thanks
March 24, 2006 - 9:49 am UTC
you seem to be missing something to sort by? what orders this data - a running total implies "SORT BY SOMETHING"
Simple Analytical Query
A reader, March 24, 2006 - 12:16 pm UTC
Any one of the other 2 columns can be used to sort the data.
Thanks
March 24, 2006 - 3:34 pm UTC
so, order by them. in the over () clause.
Simple Analytical Query
A reader, March 24, 2006 - 3:50 pm UTC
Hi Tom
I think the below query will work fine for me; I have ordered the data by rowid :-
SQL> select item,MYORDERS,sum(MYORDERS) over (partition by item order by rowid) run_tot from test;
ITEM MYORDERS RUN_TOT
---------- ---------- ----------
ABC 25 25
ABC 45 70
ABC 5 75
DEF 30 30
DEF 40 70
DEF 30 100
DEF 40 140
DEF 30 170
XYZ 10 10
XYZ 20 30
XYZ 10 40
ITEM MYORDERS RUN_TOT
---------- ---------- ----------
XYZ 20 60
12 rows selected.
Thanks for all your help
March 24, 2006 - 4:16 pm UTC
as long as you don't care that it gives different answers for the same data on different databases, sure.
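Tom's caveat is worth seeing concretely: ROWID order is not stable, so a running total needs a real ordering column. A minimal sketch of the same windowed sum, using Python's bundled SQLite (which supports the same OVER() syntax) and a hypothetical seq column standing in for a proper ordering column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (item TEXT, seq INTEGER, myorders INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?)",
                 [("ABC", 1, 5), ("ABC", 2, 45), ("ABC", 3, 25)])

# ORDER BY seq inside OVER() makes the running total deterministic:
# the same data always accumulates in the same order.
rows = conn.execute("""
    SELECT item, myorders,
           SUM(myorders) OVER (PARTITION BY item ORDER BY seq) AS run_tot
    FROM test
    ORDER BY seq
""").fetchall()
# run_tot accumulates 5, then 50, then 75
```

With a deterministic ordering column, the result is repeatable across runs and across databases, which is exactly what ROWID ordering cannot promise.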
Full Join of inline views with analytics
Anwar, March 25, 2006 - 8:22 am UTC
I have a table with a date column. I want to display dates from two different months in two columns, with dates from both months in the same rows rather than alternating.
create table test
(tdate date);
insert into test values('01-jan-2006');
insert into test values('05-jan-2006');
insert into test values('15-jan-2006');
insert into test values('02-feb-2006');
insert into test values('07-feb-2006');
insert into test values('25-feb-2006');
Then i issue following command to retrieve data.
1 select jan06,feb06
2 from
3 (select tdate jan06,
4 row_number() over (order by tdate) rn
5 from test
6 where tdate between '01-jan-2006' and '31-jan-2006') j,
7 (select tdate feb06,
8 row_number() over (order by tdate) rn
9 from test
10 where tdate between '01-feb-2006' and '28-feb-2006') f
11* where j.rn=f.rn
SQL> /
JAN06 FEB06
--------- ---------
01-JAN-06 02-FEB-06
05-JAN-06 07-FEB-06
15-JAN-06 25-FEB-06
So far so good. But then i insert another record in the table.
insert into test values('27-feb-2006');
Now number of records for february is more than that of january. I use full outer join and have this unexpected result.
1 select j.rn,jan06,f.rn,feb06
2 from
3 (select tdate jan06,
4 row_number() over (order by tdate) rn
5 from test
6 where tdate between '01-jan-2006' and '31-jan-2006') j
7 full join
8 (select tdate feb06,
9 row_number() over (order by tdate) rn
10 from test
11 where tdate between '01-feb-2006' and '28-feb-2006') f
12* on j.rn=f.rn
SQL> /
RN JAN06 RN FEB06
---------- --------- ---------- ---------
1 01-JAN-06 1 02-FEB-06
2 05-JAN-06 2 07-FEB-06
3 15-JAN-06 3 25-FEB-06
27-FEB-06
25-FEB-06
07-FEB-06
02-FEB-06
7 rows selected.
I create one table each for both months.
create table j_test
as select tdate JAN06,row_number() over (order by tdate) rn
from test
where tdate between '01-jan-2006' and '31-jan-2006'
/
create table f_test
as select tdate FEB06,row_number() over (order by tdate) rn
from test
where tdate between '01-feb-2006' and '28-feb-2006'
/
SQL> select j.rn,jan06,f.rn,feb06
2 from j_test j full join f_test f
3 on j.rn=f.rn
4 /
RN JAN06 RN FEB06
---------- --------- ---------- ---------
1 01-JAN-06 1 02-FEB-06
2 05-JAN-06 2 07-FEB-06
3 15-JAN-06 3 25-FEB-06
4 25-FEB-06
This result is what i was expecting with inline views. Why is the behaviour different with analytics in inline views and tables?
Regards
Anwar
March 25, 2006 - 9:13 am UTC
reproduces in 9ir2, 10gr1 but not 10gr2 - please contact support with this test case.
Full Join of inline views with analytics - follow-up
Michel Cadot, March 25, 2006 - 10:27 am UTC
I don't get exactly the same thing on 9.2.0.6 but I still get a wrong answer. (By the way, the answer shown for the last query is not correct - maybe the input rows are not the same - and I wonder why f.rn is not filled in the previous query's answer):
SQL> select j.rn,jan06,f.rn,feb06
2 from
3 (select tdate jan06,
4 row_number() over (order by tdate) rn
5 from test
6 where tdate between '01-jan-2006' and '31-jan-2006') j
7 full join
8 (select tdate feb06,
9 row_number() over (order by tdate) rn
10 from test
11 where tdate between '01-feb-2006' and '28-feb-2006') f
12 on j.rn=f.rn
13 /
RN JAN06 RN FEB06
---------- ----------- ---------- -----------
1 01-jan-2006 1 02-feb-2006
2 05-jan-2006 2 07-feb-2006
3 15-jan-2006 3 25-feb-2006
3 25-feb-2006
4 27-feb-2006
5 rows selected.
But this does not happen if you use factoring clause:
SQL> with
2 j_test as
3 (select tdate jan06,
4 row_number() over (order by tdate) rn
5 from test
6 where tdate between '01-jan-2006' and '31-jan-2006'),
7 f_test as
8 (select tdate feb06,
9 row_number() over (order by tdate) rn
10 from test
11 where tdate between '01-feb-2006' and '28-feb-2006')
12 select j.rn,jan06,f.rn,feb06
13 from j_test j full join f_test f
14 on j.rn=f.rn
15 /
RN JAN06 RN FEB06
---------- ----------- ---------- -----------
1 01-jan-2006 1 02-feb-2006
2 05-jan-2006 2 07-feb-2006
3 15-jan-2006 3 25-feb-2006
4 27-feb-2006
4 rows selected.
Regards
Michel
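The pairing these queries aim for - the n-th January date next to the n-th February date, with a gap when one month has more rows - can be sketched outside the database. This Python fragment (with hypothetical date strings) only illustrates the intended result, not the Oracle bug under discussion:

```python
from itertools import zip_longest

# Sorting each list mimics row_number() over (order by tdate) in each
# inline view; zip_longest mimics the full outer join on rn, padding
# the shorter side with None.
jan = ["01-jan", "05-jan", "15-jan"]
feb = ["02-feb", "07-feb", "25-feb", "27-feb"]
paired = list(zip_longest(sorted(jan), sorted(feb)))
# the extra February date pairs with None, as in Michel's 4-row result
```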
But is it possible to divide x days over a week using analytics?
Martijn, April 13, 2006 - 9:39 am UTC
Hello Tom,
First of all: Thanks for your great work and site!
I wonder if it's possible to execute the following in SQL, using rocking and rolling analytics:
Each day of the week a query has to select a balanced set of ids from a table, for example the all_objects table.
Connected to Oracle9i Enterprise Edition Release 9.2.0.5.0
SQL> select count(*) from all_objects;
COUNT(*)
----------
26652
SQL> select round(count(*)/7) from all_objects;
ROUND(COUNT(*)/7)
-----------------
3807
So each day about 3807 records should be selected.
SQL> SELECT distinct(MOD(t.object_id, 7))+1 day_of_week
2 , count(*) over (PARTITION BY (MOD(t.object_id, 7))) nof_rec_by_objid
3 FROM all_objects t
4 ;
DAY_OF_WEEK NOF_REC_BY_OBJID
----------- ----------------
1 3810
2 3820
3 3808
4 3811
5 3807
6 3791
7 3805
7 rows selected
Now the data is more or less evenly distributed per day over a week. Not much of a problem BUT:
By using a parameter we want to be able to do the following:
Imagine that the number of object_id-records queried each day represents a set of persons an email will be sent to. I want to be able to send the same person 'x' emails per week, equally spread over the week. So if today is day 1 of the week and the parameter value = 3 (emails per week) then I would like to get this result:
DAY_OF_WEEK NOF_REC_BY_OBJID
----------- ----------------
1 3810 <-mailrun 1
2 3820
3 3808 <-mailrun 2
4 3811
5 3807 <-mailrun 3
6 3791
7 3805
So:
if parameter = 3 and day = 1 then
Select a group of numbers: 1, 3, 5
All persons with mod(person_uid, 7)+1 in (1, 3, 5) need to be selected on day 1, 3 and 5 of the week.
if parameter = 3 and day of week = 2 then
Select a group of numbers: 2, 4, 6
All persons with mod(person_uid, 7)+1 in (2, 4, 6) need to be selected on day 2, 4 and 6 of the week.
Parameter = 3, day of week = 3
Select a group of numbers: 3, 5, 7
All persons with mod(person_uid, 7)+1 in ( 3, 5, 7) need to be selected on day 3, 5 and 7 of the week.
Etc. (for all days in 1..7).
If the parameter is set to 2 then the query should get the following:
Parameter = 2, day of week = 1
Select a group of numbers: 1, 5
All persons with mod(person_uid, 7)+1 in (1, 5) need to be selected on day 1 and 5 of the week.
Etc. for all days of the week.
This is the current version of the daily selection:
SELECT ao.object_id
FROM all_objects ao
WHERE mod(ao.object_id, 7)+1 IN
( to_number(to_char(sysdate, 'D'))
, to_number(to_char(sysdate, 'D'))+3
, to_number(to_char(sysdate, 'D'))-4);
I would like to get rid of the hardcoded ' +3 , -4' lines and replace them by a more flexible approach. Can analytics be of help in this question or is the only solution using a PL/SQL Function? So the current query needs to be rewritten to:
SELECT ao.object_id
FROM all_objects ao
WHERE mod(ao.object_id, 7)+1 IN (select all available days of the week, evenly distributed over the week, based on the value of the (cursor) parameter );
Or
SELECT ao.object_id
FROM all_objects ao
WHERE EXISTS (select all available days of the week, evenly distributed over the week, based on the value of the (cursor) parameter where mod(ao.object_id, 7)+1 IN (the available days) );
Thanks in advance for any clue!
Regards,
Martijn
Continued from earlier post
Martijn, April 19, 2006 - 8:58 am UTC
With a PL/SQL function,
the point where I get stuck is that if the parameter value is 3, I can't get the distribution right (in an elegant way, that is). The odd day numbers mess up the distribution.
I.e., this is what i get when i choose 3 times a week for groups to receive email:
Day Mailto "mods" with no.
-------------------------------
1 1, 3, 5
2 2, 4, 6
3 3, 5, 7 (7 should be 1?)
4 4, 6, 2
5 5, 7, 1 (wrong too)
6 6, 2, 4
7 7, 1, 3 (wrong too)
Still I'm sure this could be done in SQL, with some formula or analytics, but I got lost finding the correct way to do so.
Analytical function
Amit, April 26, 2006 - 7:50 pm UTC
I have a result set like this
SELECT sku, price, pct_off,
price * ((100 - pct_off) / 100) calculated_price
FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
FROM user_all_tables
WHERE rownum <= 5);
SKU PRICE PCT_OFF CALCULATED_PRICE
--------- --------- --------- ----------------
101 1000 10 900
101 1000 11 890 -> 801
101 1000 12 880 -> 704.88
101 1000 13 870 -> 613.25
101 1000 14 860 -> 527.39
The calculated price is simply being computed here as price less pct_off percent of the original price.
What I would like to do is to calculate the price based on previous calculated value.
So for row2, I want the calculated price to be
900 - 11% off = 801
for row3, it should be
801 - 12% off = 704.88 and so on.
April 27, 2006 - 2:54 pm UTC
no create
no insert intos
no example basically
no look, not promising anything, I just don't look at these until I have an example I can cut and paste.
read about LAG() - it will do this, you can figure this out - I'm sure you can (you will need an ORDER BY which is WOEFULLY missing from your example - rows have NO ORDER until you give it to them)
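To make the ORDER BY point concrete: LAG() only has a "prior row" once the OVER() clause orders the partition. A minimal sketch using Python's bundled SQLite (same window syntax), with a hypothetical apply_day column supplying the order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (sku INTEGER, apply_day INTEGER, price INTEGER)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?)",
                 [(101, 1, 900), (101, 3, 890), (101, 5, 880)])

# Without ORDER BY apply_day inside OVER(), "previous row" is undefined.
rows = conn.execute("""
    SELECT apply_day, price,
           LAG(price) OVER (PARTITION BY sku ORDER BY apply_day) AS prev
    FROM a
    ORDER BY apply_day
""").fetchall()
# prev is NULL (None) on the first row, then trails price by one row
```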
Analytical function
Amit, April 27, 2006 - 4:39 pm UTC
Tom,
There are no inserts needed to run my query. You can directly run this statement. Please run this statement once and you'll understand what I am talking about.
SELECT sku, price, pct_off,
price * ((100 - pct_off) / 100) calculated_price
FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
FROM user_all_tables
WHERE rownum <= 5);
I have read about Lag() function and tried to use it here but it does not give me the results I need.
SQL> ed
Wrote file afiedt.buf
1 SELECT sku, price, pct_off, cal_price,
2 lag(cal_price, 1, cal_price) over(ORDER BY 1) new_cal_price
3 FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price
4 FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
5 FROM user_all_tables
6* WHERE rownum <= 5))
SQL> /
SKU PRICE PCT_OFF CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- --------- -------------
101 1000 10 900 900 -- Correct
101 1000 11 890 900 -- Should be 801
101 1000 12 880 890 -- Should be 704.88
101 1000 13 870 880 -- Should be 613.25
101 1000 14 860 870 -- Should be 527.39
I want to calculate the price based on previous calculated
value.
For row2, I want the calculated price to be
900 - 11% off = 801
for row3, it should be
801 - 12% off = 704.88 and so on.
April 27, 2006 - 4:49 pm UTC
umm, without an order by - I'm afraid there is no "prior row" concept AT ALL.
Need real example, YOU NEED SOMETHING TO ACTUALLY SORT THE DATA BY
order by 1
that orders (in this case) by the CONSTANT NUMBER ONE, which means "they are all equally "first" or "last" or "next")
order, order in the court. give us a real example to deal with here.
Analytical question
A reader, April 27, 2006 - 4:50 pm UTC
I have tried this variation. This is closer to my desired result but only up to row 2.
SQL> SELECT sku, price, pct_off, cal_price,
2 lag(cal_price * ((100 - pct_off) / 100), 1, cal_price) over(ORDER BY 1) new_cal_price
3 FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price
4 FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
5 FROM user_all_tables
6 WHERE rownum <= 5));
SKU PRICE PCT_OFF CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- --------- -------------
101 1000 10 900 900
101 1000 11 890 810
101 1000 12 880 792.1
101 1000 13 870 774.4
101 1000 14 860 756.9
April 28, 2006 - 1:43 am UTC
I give up.
These rows have NO ORDER WHATSOEVER, you need something to order by.
You don't have anything in your example to tell me what "row 2" is. Your rows could be anywhere - you could run this query tomorrow and your row 2 might become row 4 - or row 5000 and not even appear.
Analytical question.
Amit, April 27, 2006 - 5:12 pm UTC
Ok..let me give you some real data with an order by clause.
SQL> CREATE TABLE A AS
2 SELECT 101 sku, 1000 price, 9 + rownum pct_off, trunc(sysdate) + (2 * rownum) apply_dt
3 FROM user_all_tables
4 WHERE rownum <= 5;
Table created.
SQL> select * from a;
SKU PRICE PCT_OFF APPLY_DT
--------- --------- --------- --------
101 1000 10 04/29/06
101 1000 11 05/01/06
101 1000 12 05/03/06
101 1000 13 05/05/06
101 1000 14 05/07/06
SQL> SELECT sku, price, pct_off, apply_dt, cal_price,
2 lag(cal_price * ((100 - pct_off) / 100), 1, cal_price) over(ORDER BY 1) new_cal_price
3 FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price, apply_dt
4 FROM a
5 ORDER BY apply_dt);
SKU PRICE PCT_OFF APPLY_DT CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- -------- --------- -------------
101 1000 10 04/29/06 900 900
101 1000 11 05/01/06 890 810
101 1000 12 05/03/06 880 792.1
101 1000 13 05/05/06 870 774.4
101 1000 14 05/07/06 860 756.9
model is great - but 10G only
vlad, April 27, 2006 - 7:09 pm UTC
SQL> select sku, price, pct_off, new_price
2 from (select aaa.*, rownum rn from aaa order by apply_dt)
3 model
4 return updated rows
5 partition by(sku) dimension by(rn)
6 measures(price, pct_off,0 new_price)
7 rules(new_price [any] = nvl(new_price[cv()-1]/100*(100-pct_off[cv()]),price[cv()]/100*(100-pct_off[cv()])));
SKU PRICE PCT_OFF NEW_PRICE
---------- ---------- ---------- ----------
101 1000 10 900
101 1000 11 801
101 1000 12 704.88
101 1000 13 613.2456
101 1000 14 527.391216
SQL>
10G is cool, but all you need is to search carefully Tom's archive :-))
Vlad, April 27, 2006 - 10:19 pm UTC
Amit,
it took me some time to figure out that there is no need for all these cool MODEL features - just common sense and Tom's archive. You actually don't need the previous price value - it's enough to know the initial value and all the factors for calculating new_price for the current row:
SQL> select ord, 1000 * exp(sum(ln((100 - pct) / 100)) over(order by ord)) new_price,
2 price,
3 pct
4 from (select rownum ord, rownum + 9 pct, 1000 price from all_tables where rownum < 10);
ORD NEW_PRICE PRICE PCT
---------- ---------- ---------- ----------
1 900 1000 10
2 801 1000 11
3 704.88 1000 12
4 613.2456 1000 13
5 527.391216 1000 14
6 448.282533 1000 15
7 376.557328 1000 16
8 312.542582 1000 17
9 256.284917 1000 18
9 rows selected
SQL>
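The identity behind Vlad's query is that a running product of discount factors equals exp of a running sum of logs, which is exactly what exp(SUM(ln((100 - pct) / 100)) OVER (ORDER BY ord)) computes. A quick Python check of the arithmetic:

```python
import math

price = 1000
pcts = [10, 11, 12, 13, 14]

# exp(sum of logs over a growing prefix) is a running product of the
# discount factors, so each price compounds on the previous one.
running_log = 0.0
compounded = []
for pct in pcts:
    running_log += math.log((100 - pct) / 100)
    compounded.append(round(price * math.exp(running_log), 2))
# 1000 -> 900 -> 801 -> 704.88 -> 613.25 -> 527.39
```

This matches the NEW_PRICE column above (to two decimal places), confirming that no LAG or self-reference to the prior computed row is needed.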
Thanks Vlad
Amit, April 27, 2006 - 11:06 pm UTC
Thanks Vlad. This is exactly what I needed.
I will also checkout the model function in 10G. It looks very interesting.
Is it possible to implement with LAG?
A reader, April 28, 2006 - 1:20 pm UTC
Tom, just curious - how were you going to implement this with LAG function call? Doesn't look like this was the right suggestion (if possible at all?)
April 28, 2006 - 1:32 pm UTC
I did not really look at it until there was an example to go with it (said "not promising anything"). I don't look at them in detail until everything is good to go.
And by then, someone else had already done it :)
Analytical question
Amit, April 28, 2006 - 1:43 pm UTC
I need some more help on the question I asked earlier.
The solution Vlad provided works very well if I need to do percent off in all the rows.
I need to break this pattern based on conditions and then continue again.
create table AA
(
BATCH_ID NUMBER(10) not null,
APPLY_DT DATE not null,
SKU_NO NUMBER(9) not null,
SITE_NO NUMBER(4) not null,
PCT NUMBER(5,2) not null,
PRICE NUMBER(10,2) not null,
QTY NUMBER(4) not null,
UNIT_PRICE NUMBER
);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6282, to_date('06-09-2006', 'dd-mm-yyyy'), 10007, 999, 12, 0, 0, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6283, to_date('15-09-2006', 'dd-mm-yyyy'), 10007, 999, 5, 0, 0, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6287, to_date('16-09-2006', 'dd-mm-yyyy'), 10007, 999, 0, 12, 1, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6286, to_date('22-09-2006', 'dd-mm-yyyy'), 10007, 999, 10, 0, 0, 8.55);
commit;
SQL> SELECT apply_dt, pct, price, qty, unit_price,
2 round(CASE
3 WHEN pct > 0 THEN
4 unit_price * exp(SUM(ln((100 - pct) / 100)) over(ORDER BY apply_dt))
5 ELSE
6 price / qty
7 END, 2) new_price
8 FROM aa;
APPLY_DT PCT PRICE QTY UNIT_PRICE NEW_PRICE
-------- --------- --------- --------- ---------- ---------
09/06/06 12 0 0 8.55 7.52-> 12% Off 8.55
09/15/06 5 0 0 8.55 7.15-> 5% Off 7.52
09/16/06 0 12 1 8.55 12-> pct = 0 then price/qty
09/22/06 10 0 0 8.55 6.43-> 10% off 12
In the example above row4 is the problem.
In row3 I do not want to do pct off because pct = 0 for row3. I want to use price/qty instead. I am able to do that using a CASE statement.
Now, for row4, I want 10% off row3 value. I want 10% of 12 and should get 10.8 and not 10% off 7.15 which is 6.43.
added some more rows to your table...
Vlad, April 28, 2006 - 5:18 pm UTC
SQL> select zzz.*,
2 first_value(unit_price) over(partition by part order by apply_dt) *
3 exp(SUM(ln((100 - pct) / 100)) over(partition by part ORDER BY apply_dt)) new_price
4 from (select zz.*,
5 nvl((sum(rn) over(order by apply_dt desc) - nvl(rn, 0)),0) part
6 from (select aa.batch_id,
7 aa.apply_dt,
8 aa.sku_no,
9 aa.site_no,
10 aa.pct,
11 aa.price,
12 aa.qty,
13 decode(aa.pct, 0, aa.price / aa.qty, aa.unit_price) unit_price,
14 z.rn
15 from aa,
16 (select rowid rd, rownum rn
17 from aa
18 where pct = 0
19 order by apply_dt) z
20 where aa.rowid = z.rd(+)
21 order by apply_dt) zz) zzz
22 order by apply_dt;
BATCH_ID APPLY_DT SKU_NO SITE_NO PCT PRICE QTY UNIT_PRICE RN PART NEW_PRICE
----------- ----------- ---------- ------- ------- ------------ ----- ---------- ---------- ---------- ----------
6282 9/6/2006 10007 999 12.00 0.00 0 8.55 3 7.524
6283 9/15/2006 10007 999 5.00 0.00 0 8.55 3 7.1478
6287 9/16/2006 10007 999 0.00 12.00 1 12 1 2 12
6286 9/22/2006 10007 999 10.00 0.00 0 8.55 2 10.8
6282 9/6/2007 10007 999 12.00 0.00 0 8.55 2 9.504
6283 9/15/2007 10007 999 5.00 0.00 0 8.55 2 9.0288
6287 9/16/2007 10007 999 0.00 12.00 1 12 2 0 12
6286 9/22/2007 10007 999 10.00 0.00 0 8.55 0 10.8
8 rows selected
SQL>
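The rule Vlad's query encodes - a row with pct = 0 restarts the running price from price/qty, and every other row discounts the current running value - can be sketched in a few lines of Python. This is a check of the arithmetic only, not a translation of the SQL:

```python
def reprice(rows, start):
    """rows: (pct, price, qty) tuples in apply_dt order; start: initial unit price."""
    out, current = [], start
    for pct, price, qty in rows:
        if pct == 0:
            current = price / qty                   # reset row: new base price
        else:
            current = current * (100 - pct) / 100   # compound the discount
        out.append(round(current, 4))
    return out

# the four rows of table AA in apply_dt order, starting from 8.55:
# 12% off, 5% off, reset to 12/1, then 10% off the new base
prices = reprice([(12, 0, 0), (5, 0, 0), (0, 12, 1), (10, 0, 0)], 8.55)
```

The fourth value is 10% off 12 (10.8), not 10% off the pre-reset 7.1478, which is the behaviour Amit asked for.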
Excellent Vlad
Amit, April 28, 2006 - 9:08 pm UTC
This works very well. Excellent work Vlad. Thanks
Tom, can this be done in any other way ?
Is it possible with Analytics function ?
Jonty, May 01, 2006 - 2:25 pm UTC
Hello Tom,
This will help you to create table and populate data for my problem.
CREATE TABLE T_Order
(Item VARCHAR2(20),
Order_Date DATE NOT NULL,
Order_No NUMBER(10,0))
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','1-JAN-2006',9001)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','2-JAN-2006',9002)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','8-JAN-2006',9003)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','16-JAN-2006',9004)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','18-JAN-2006',9005)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','1-JAN-2006',9006)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','5-JAN-2006',9007)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','7-JAN-2006',9008)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','20-JAN-2006',9009)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','5-JAN-2006',9010)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','7-JAN-2006',9011)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','9-JAN-2006',9012)
/
SELECT a.item, a.order_date, a.order_no
FROM t_order a
/
I have following data in my table.
Row# Item Order_Date Order_No
---- ----- ---------- ---------
1 101 1-Jan-2006 9001
2 101 2-Jan-2006 9002
3 101 8-Jan-2006 9003
4 101 16-Jan-2006 9004
5 101 18-Jan-2006 9005
6 102 1-Jan-2006 9006
7 102 5-Jan-2006 9007
8 102 7-Jan-2006 9008
9 102 20-Jan-2006 9009
10 103 5-Jan-2006 9010
11 103 7-Jan-2006 9011
12 103 9-Jan-2006 9012
I need to find out which are the latest group of Orders for each Item.
Group of Orders - any 2 orders whose order_dates are within a 1-week gap.
In above example,
For
Item 101 - Row # 4,5 will be group # 1
Row# 1,2,3 wiil be group # 2
Item 102 - Row # 9 will be group # 1
Row # 6,7,8 will be group # 2
Item 103 - Row # 10, 11, 12 will be group # 1
I am only interested in group # 1. Query results should be like this
Item # Order No
-------- ------------
101 9004
101 9005
102 9009
103 9010
103 9011
103 9012
I really appreciate your help on this.
May 02, 2006 - 3:21 am UTC
a tad "ambiguous" -
do you mean within the last week of the LAST RECORD (eg: for item 101 - any date between 18-Jan-2006 and 12-Jan-2006)
OR
any set of records such that the date of the current record is within one week of the prior record (eg: if item 101 had observations:
18-jan, 15-jan, 12-jan, 9-jan, 6-jan, 3-jan
they would all "count")
we can do either, here is the "first" one:
ops$tkyte@ORA10GR2> select *
2 from (
3 SELECT a.item, a.order_date, a.order_no,
4 max(order_date) over (partition by item) max_order_date
5 FROM t_order a
6 )
7 where max_order_date - order_date <= 7
8 /
ITEM ORDER_DAT ORDER_NO MAX_ORDER
-------------------- --------- ---------- ---------
101 16-JAN-06 9004 18-JAN-06
18-JAN-06 9005 18-JAN-06
102 20-JAN-06 9009 20-JAN-06
103 05-JAN-06 9010 09-JAN-06
07-JAN-06 9011 09-JAN-06
09-JAN-06 9012 09-JAN-06
6 rows selected.
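The filter above - keep every row within 7 days of its item's latest order - can be sketched with Python's bundled SQLite, using integer day numbers as a stand-in for Oracle DATEs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (item TEXT, order_day INTEGER, order_no INTEGER)")
conn.executemany("INSERT INTO t_order VALUES (?, ?, ?)",
                 [("101", 1, 9001), ("101", 2, 9002), ("101", 8, 9003),
                  ("101", 16, 9004), ("101", 18, 9005)])

# MAX() OVER (PARTITION BY item) tags every row with the item's latest
# day; the outer WHERE keeps rows within 7 days of it.
rows = conn.execute("""
    SELECT item, order_no FROM (
        SELECT item, order_day, order_no,
               MAX(order_day) OVER (PARTITION BY item) AS max_day
        FROM t_order) x
    WHERE max_day - order_day <= 7
    ORDER BY order_no
""").fetchall()
# only orders 9004 (day 16) and 9005 (day 18) survive for item 101
```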
2nd Option
Jonty, May 02, 2006 - 10:11 am UTC
Thanks a lot for quick response.
Yes, I am looking for this option answer.
"any set of records such that the date of the current record is within one week
of the prior record (eg: if item 101 had observations:
18-jan, 15-jan, 12-jan, 9-jan, 6-jan, 3-jan
they would all "count")"
Here is the Insert St. for an example.
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','01-JAN-2006',9091)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','02-JAN-2006',9094)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','08-JAN-2006',9092)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','14-JAN-2006',9093)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','18-JAN-2006',9095)
/
ITEM ORDER_DATE ORDER_NO
-------------------- ---------------------- ----------
104 1-Jan-2006 9091
104 2-Jan-2006 9094
104 18-Jan-2006 9092
104 24-Jan-2006 9093
104 28-Jan-2006 9095
Results should bring 18-jan, 24-jan, 28-jan records.
Thanks in advance.
May 02, 2006 - 3:47 pm UTC
arg - why do selects NOT match "supplied data"
inserts <> select....
oh well, work with this:
ops$tkyte@ORA10GR2> select * from t_order;
ITEM ORDER_DAT ORDER_NO
-------------------- --------- ----------
101 01-JAN-06 9001
101 02-JAN-06 9002
101 08-JAN-06 9003
101 16-JAN-06 9004
101 18-JAN-06 9005
102 01-JAN-06 9006
102 05-JAN-06 9007
102 07-JAN-06 9008
102 20-JAN-06 9009
103 05-JAN-06 9010
103 07-JAN-06 9011
103 09-JAN-06 9012
104 01-JAN-06 9091
104 02-JAN-06 9094
104 08-JAN-06 9092
104 14-JAN-06 9093
104 18-JAN-06 9095
17 rows selected.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
2 from (
3 select item, order_date, order_no,
4 max(grp) over (partition by item order by order_date DESC) maxgrp
5 from (
6 select item, order_date, order_no,
7 case when (lag(order_date) over (partition by item order by order_date DESC)-order_date)>7
8 then row_number() over (partition by item order by order_date DESC)
9 when lag(order_date) over (partition by item order by order_date DESC) is null then 1
10 end grp
11 from t_order
12 )
13 )
14 where maxgrp = 1
15 order by item, order_date
16 /
ITEM ORDER_DAT ORDER_NO MAXGRP
-------------------- --------- ---------- ----------
101 16-JAN-06 9004 1
101 18-JAN-06 9005 1
102 20-JAN-06 9009 1
103 05-JAN-06 9010 1
103 07-JAN-06 9011 1
103 09-JAN-06 9012 1
104 01-JAN-06 9091 1
104 02-JAN-06 9094 1
104 08-JAN-06 9092 1
104 14-JAN-06 9093 1
104 18-JAN-06 9095 1
11 rows selected.
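The grouping rule the query above encodes - walk each item's orders from newest to oldest, and a gap of more than 7 days closes the latest group - reads naturally as a small procedural sketch (an illustration of the rule, not a translation of the SQL):

```python
def latest_group(days):
    """days: one item's order days in any order; returns the newest group."""
    days = sorted(days, reverse=True)      # newest first
    kept = [days[0]]
    for prev, cur in zip(days, days[1:]):
        if prev - cur > 7:                 # gap of more than a week: group ends
            break
        kept.append(cur)
    return kept

# item 101 (days 1,2,8,16,18): the 8-day gap below day 16 cuts the group
# item 104 (days 1,2,8,14,18): every gap is <= 7, so all rows are group 1
```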
It works.
Jonty, May 03, 2006 - 10:19 am UTC
Thanks a lot.
It works well.
Where can I find detailed material about Analytic Functions ?
May 03, 2006 - 1:01 pm UTC
Merge rows with first non-null value for each column
narendra, May 08, 2006 - 7:48 am UTC
Tom,
It sounds pretty simple to achieve when I put it in English. However, I am not able to write SQL to achieve the same.
Following are the details:
select * from v$version ;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE 9.2.0.6.0 Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
create table t_src (id number not null, name varchar2(10), st_id number, created_date date not null)
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,'2',to_date('16-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,null,to_date('08-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6','Row6','1',to_date('05-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7',null,null,to_date('07-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7','Row10','1',to_date('06-MAY-06','DD-MON-YY'))
/
select * from t_src order by id, created_date desc ;
ID NAME ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
6 2 16-05-2006 00:00:00
6 08-05-2006 00:00:00
6 Row6 1 05-05-2006 00:00:00
7 07-05-2006 00:00:00
7 Row10 1 06-05-2006 00:00:00
Desired Output:
ID NAME ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
6 Row6 2 16-05-2006 00:00:00
7 Row10 1 07-05-2006 00:00:00
The logic behind desired output is:
Generate a row for each ID, with latest CREATED_DATE.
Fill columns NAME & ST_ID with first not null value in respective columns for the ID.
Can you please help me out ?
May 08, 2006 - 8:31 am UTC
...
Generate a row for each ID, with latest CREATED_DATE.
Fill columns NAME & ST_ID with first not null value in respective columns for
the ID.
.....
your example does not do that. Your ST_ID for row6 is for the latest row - not the first non-null value.
This is what I come up with:
ops$tkyte@ORA10GR2> select id, name, st_id, max_date
2 from (
3 select id, name, st_id, created_date,
4 max(created_date) over (partition by id) max_date ,
5 min(case when name is not null then created_date end) over (partition by id) min_date
6 from t_src
7 )
8 where created_date = min_date
9 order by id, created_date
10 /
ID NAME ST_ID MAX_DATE
---------- ------------------------------ ---------- ---------
6 Row6 1 16-MAY-06
7 Row10 1 07-MAY-06
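The shape of that query - per id, the partition's latest date paired with the name from the earliest non-null-name row - can be mimicked in plain Python (integer days stand in for created_date; purely illustrative):

```python
rows = [  # (id, name, day) -- days stand in for created_date
    (6, None, 16), (6, None, 8), (6, "Row6", 5),
    (7, None, 7), (7, "Row10", 6),
]

result = {}
for rid, name, day in rows:
    cur = result.setdefault(rid, {"name": None, "name_day": None, "max_day": day})
    cur["max_day"] = max(cur["max_day"], day)          # max(created_date) per id
    if name is not None and (cur["name_day"] is None or day < cur["name_day"]):
        cur["name"], cur["name_day"] = name, day       # earliest non-null name
```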
Incomplete Test Data
Narendra, May 08, 2006 - 11:57 pm UTC
Sorry Tom,
My sample data did not cater for all conditions.
select * from t_src order by id, created_date desc ;
ID NAME ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
6 2 16-05-2006 00:00:00
6 Row5 08-05-2006 00:00:00
6 Row6 1 05-05-2006 00:00:00
7 07-05-2006 00:00:00
7 Row10 1 06-05-2006 00:00:00
7 Row11 3 06-05-2006 00:00:00
Desired Output:
ID NAME ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
6 Row5 2 16-05-2006 00:00:00
7 Row10 1 07-05-2006 00:00:00
"Row5" in first row is due to
i) "Name" column for ID = 6 with max created date is null &
ii) For remaining rows having ID = 6, "Row5" is first not null value.
With same logic, row with ID = 7 has "Row10" value in NAME column & "1" in ST_ID column.
Hope I am clear.
May 09, 2006 - 7:42 am UTC
I didn't see any inserts here for me to use - did I miss something?
and your explanation is still not very clear.
I believe you are trying to say:
get me the unique ID's
for each unique ID show me:
a) the max(created_date)
b) the last non-null value of name when sorted by created_date
c) the last non-null value of st_id when sorted by created_date
You don't seem to be mentioning "how row with id=6" got st_id = 2.
More details
Narendra, May 10, 2006 - 12:13 am UTC
Sorry Tom for missing out on inserts...
Following are details.
create table t_src (id number not null, name varchar2(10), st_id number,
created_date date not null)
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (6,null,null,to_date('16-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (6,'asdf',null,to_date('10-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (6,'Row5',null,to_date('08-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (6,null,6,to_date('05-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (7,null,null,to_date('07-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (7,null,1,to_date('06-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values (7,'Row11',3,to_date('04-MAY-06','DD-MON-YY'))
/
select * from t_src order by id, created_date desc
/
ID NAME ST_ID CREATED_DATE
6 2 16-MAY-06
6 10-MAY-06
6 Row5 08-MAY-06
6 6 05-MAY-06
7 07-MAY-06
7 1 06-MAY-06
7 Row11 3 04-MAY-06
Desired Output:
ID NAME ST_ID CREATED_DATE
6 Row5 2 16-MAY-06
7 Row11 1 07-MAY-06
Let me try explaining what I am trying to achieve.
1. For each unique ID, get the record with max CREATED_DATE. So in the above example, I get the following 2 rows:
ID NAME ST_ID CREATED_DATE
6 16-MAY-06
7 07-MAY-06
2. For each record (i.e. unique ID) derived in step (1),
fill rest of the columns as follows
2.1 If the record for a unique ID, with max CREATED_DATE, has any of the other columns with not null values, get those values as it is. So the above results become:
ID NAME ST_ID CREATED_DATE
6 2 16-MAY-06
7 07-MAY-06
2.2 For all columns of the record with unique ID, having max CREATED_DATE, get the FIRST NOT NULL value from remaining rows (from the output of query "select * from t_src order by id, created_date desc") for the same ID.
So the above results become:
ID NAME ST_ID CREATED_DATE
6 Row5 2 16-MAY-06
7 Row11 1 07-MAY-06
Hope I am clear about the same.
Awaiting your reply.
Regards
May 10, 2006 - 7:52 am UTC
no, this is not clear - 2.2 is NOT clear at all. Not really.
Actually, I don't see why there is a 2.1 and a 2.2 - seems you could say it all at once.
Are the columns treated independently so that what I stated:
get me the unique ID's
for each unique ID show me:
a) the max(created_date)
b) the last non-null value of name when sorted by created_date desc
c) the last non-null value of st_id when sorted by created_date desc
is actually what you mean. It seems so. In which case - I disagree with your sample output. Your input data:
ops$tkyte@ORA10GR2> select * from t_src order by id, created_date desc;
ID NAME ST_ID CREATED_D
---------- ------------------------------ ---------- ---------
6 16-MAY-06
6 asdf 10-MAY-06
6 Row5 08-MAY-06
6 6 05-MAY-06
7 07-MAY-06
7 1 06-MAY-06
7 Row11 3 04-MAY-06
7 rows selected.
So, I don't see ANY row with id=6 and st_id = 2. I don't get where your example comes up with that. Also, I see that when id=6, asdf is going to be the last non-null value for NAME when sorted by created_date desc.
Stating a requirement and having examples that correspond to what you type in - sort of important.
Ok, I'll guess my statement of the problem is close to what you really mean, then:
ops$tkyte@ORA10GR2> select id,
2 max(created_date),
3 substr( max( case when name is not null
4 then to_char(created_date,'yyyymmddhh24miss')||name
5 end ), 15 ) name,
6 to_number( substr( max( case when st_id is not null
7 then to_char(created_date,'yyyymmddhh24miss')|| st_id
8 end ), 15 ) ) st_id
9 from t_src
10 group by id;
ID MAX(CREAT NAME ST_ID
---------- --------- ------------------------------ ----------
6 16-MAY-06 asdf 6
7 07-MAY-06 Row11 1
is what you were really looking for perhaps.
Results....bang on target
Narendra, May 11, 2006 - 12:10 am UTC
Tom,
Thanks a lot.
That is the exact query I am looking at.
However, just one question in the confusion created by my explanation.
You said:
get me the unique ID's
for each unique ID show me:
a) the max(created_date)
b) the last non-null value of name when sorted by created_date desc
c) the last non-null value of st_id when sorted by created_date desc
With input data as:
ID NAME ST_ID CREATED_D
---------- ------------------------------ ---------- ---------
6 16-MAY-06
6 asdf 10-MAY-06
6 Row5 08-MAY-06
6 6 05-MAY-06
7 07-MAY-06
7 1 06-MAY-06
7 Row11 3 04-MAY-06
Now, "the last non-null value of name when sorted by created_date desc" for ID = 6 becomes "Row5" and not "asdf", Right?
Anyway, that is what I needed. "The first non-null value of name when sorted by created_date desc".
I think my interpretation of "first" and "last" created confusion. Sorry for the same.
Thanks once again.
( Will keep bothering you...:)
Should I use Analytics
David Piazza, May 25, 2006 - 11:08 pm UTC
Tom,
I have the following problem. Table A has multiple rows per x value, while each x appears just once in Table B. I need to update column y in table B with the value from the table A row having the latest date in column z, as long as that record is older than 16-MAY-06. The columns are shown below along with the way I'd like table B to end up:
Table A (Test Schema) Table B
x y z x y
-- -- --------- -- --
1 10 14-MAY-06 1 10
2 25 14-MAY-06 2 NULL
2 2 15-MAY-06 3 11
3 25 10-MAY-06 4 24
3 3 18-MAY-06
3 1 15-MAY-06
4 42 14-APR-06
4 4 14-MAY-05
AFTER:
Table B
x y
-- --
1 10
2 2
3 1
4 42
I have successfully updated one value for value x, but when I try them all using the following query, I don't get the results I need. I've spent a lot of time trying different things but to no avail. Can this be done without analytics, and how would you do it?
UPDATE B
SET y=
(SELECT *
FROM (SELECT y
FROM test.A,B
WHERE test.A.x=B.x AND
test.z < TO_DATE('16-MAY-2006','dd-mon-yy')
ORDER by test.z DESC )
WHERE rownum = 1);
May 26, 2006 - 8:36 am UTC
I would start by setting up a create table and inserts so that when I asked someone to play around with it - they would have an out of the box test case to work with.
Either MERGE using an aggregate query, or a correlated subquery likely.
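A sketch of the MERGE-with-aggregate approach Tom suggests, using the poster's table and column names (untested against real data, so treat it as a starting point; the KEEP (DENSE_RANK LAST) aggregate picks the y belonging to the latest qualifying z per x):

```sql
merge into b
using ( select x,
               -- y from the row with the latest z older than the cutoff
               max(y) keep (dense_rank last order by z) y
          from a
         where z < to_date('16-MAY-2006','DD-MON-YYYY')
         group by x ) src
on ( b.x = src.x )
when matched then update set b.y = src.y;
```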
count's between boundaries
A reader, June 01, 2006 - 5:25 am UTC
tom,
can you please show me a more elegant solution for my query below?
TEST @ adm1 SQL>create table emp as select * from scott.emp;
Table created.
Elapsed: 00:00:00.19
TEST @ adm1 SQL>insert into emp select * from scott.emp;
15 rows created.
Elapsed: 00:00:00.01
TEST @ adm1 SQL>insert into emp select * from scott.emp;
15 rows created.
Elapsed: 00:00:00.00
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
1 select lb, rb, count(*) sal_between
2 from
3 (
4 select b.l_bound lb, b.r_bound rb, s.sal ss
5 from
6 (
7 select (rownum - 1) * 1000 l_bound
8 ,(rownum - 1) * 1000 + 1000 r_bound
9 from all_objects
10 where rownum <= 20 -- max. l_bound / r_bound.
11 ) b,
12 (
13 select sal
14 from emp
15 where deptno between 10 and 30 -- data from.
16 ) s
17 where s.sal between b.l_bound and b.r_bound
18 )
19 group by lb, rb
20* order by lb, rb
21 /
LB RB SAL_BETWEEN
---------- ---------- -----------
0 1000 6
1000 2000 18
2000 3000 15
3000 4000 6
4000 5000 3
5000 6000 3
6 rows selected.
Elapsed: 00:00:00.20
June 01, 2006 - 10:36 am UTC
it looks like you just want to "divide" doesn't it?? This is not 100% identical to yours (you double count things on the "boundary", dividing won't)
ops$tkyte@ORA10GR2> select lb, rb, count(*) sal_between
2 from
3 (
4 select b.l_bound lb, b.r_bound rb, s.sal ss
5 from
6 (
7 select (rownum - 1) * 1000 l_bound
8 ,(rownum - 1) * 1000 + 1000 r_bound
9 from all_objects
10 where rownum <= 20 -- max. l_bound / r_bound.
11 ) b,
12 (
13 select sal
14 from emp
15 where deptno between 10 and 30 -- data from.
16 ) s
17 where s.sal between b.l_bound and b.r_bound
18 )
19 group by lb, rb
20 order by lb, rb
21 /
LB RB SAL_BETWEEN
---------- ---------- -----------
0 1000 2
1000 2000 6
2000 3000 5 <<<==== counted 3000 here
3000 4000 2 <<<==== and here
4000 5000 1 <<<==== counted 5000 here
5000 6000 1 <<<==== and here
6 rows selected.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select minsal, maxsal, cnt, truncsal*1000 || ' >= sal < ' || (truncsal+1)*1000 rng
2 from (
3 select min(sal) minsal, max(sal) maxsal, count(*) cnt, trunc(sal/1000) truncsal
4 from emp
5 group by trunc(sal/1000)
6 )
7 order by truncsal
8 /
MINSAL MAXSAL CNT RNG
---------- ---------- ---------- --------------------
800 950 2 0 >= sal < 1000
1100 1600 6 1000 >= sal < 2000
2450 2975 3 2000 >= sal < 3000
3000 3000 2 3000 >= sal < 4000
5000 5000 1 5000 >= sal < 6000
count between boundaries
A reader, June 01, 2006 - 6:29 am UTC
Oops, I made a mistake: overlapping partition boundaries.
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
1 select lb, rb, count(*) sal_between
2 from
3 (
4 select b.l_bound lb, b.r_bound rb, s.sal ss
5 from
6 (
7 select (rownum - 1) * 1000 + 1 l_bound
8 ,(rownum - 1) * 1000 + 1000 r_bound
9 from all_objects
10 where rownum <= 20 -- max. l_bound / r_bound.
11 ) b,
12 (
13 select sal
14 from emp
15 where deptno between 10 and 30 -- data from.
16 ) s
17 where s.sal between b.l_bound and b.r_bound
18 )
19 group by lb, rb
20* order by lb, rb
TEST @ adm1 SQL>/
LB RB SAL_BETWEEN
---------- ---------- -----------
1 1000 6
1001 2000 18
2001 3000 15
4001 5000 3
Elapsed: 00:00:00.07
June 01, 2006 - 10:43 am UTC
see above, same concept will work if you "tweak the math in the division" to get 3000/5000 to go into the group you think they should.
Beware of decimal numbers with your solution?
count's between boundaries
A reader, June 01, 2006 - 7:03 am UTC
I tried the width_bucket function. Would this be the best way?
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
1 select lb, rb, count(*) rates_between
2 from
3 (
4 select sal
5 ,width_bucket (sal - 1, 0, 50000, 50) * 1000 - 999 lb
6 ,width_bucket (sal - 1, 0, 50000, 50) * 1000 rb
7 from emp
8 )
9 group by lb, rb
10* order by lb, rb
TEST @ adm1 SQL>/
LB RB RATES_BETWEEN
---------- ---------- -------------
1 1000 6
1001 2000 18
2001 3000 15
4001 5000 3
3
Elapsed: 00:00:00.01
June 01, 2006 - 10:43 am UTC
I think you just want to "divide"
count's between boundaries
A reader, June 02, 2006 - 8:37 am UTC
Divide! Yes, that's it! I wish I could find such a simple, concise and elegant solution on the first try! ;(
What about your book concerning "analytics"?
TEST @ adm1 SQL>SELECT minsal, maxsal, cnt,
2 LPAD ((DECODE (truncsal * 1000, 0, 0, truncsal * 1000 + 1)),
3 4
4 )
5 || ' >= sal <= '
6 || ((truncsal + 1) * 1000) rng
7 FROM (SELECT MIN (sal) minsal, MAX (sal) maxsal, COUNT (*) cnt,
8 TRUNC ((sal - 1) / 1000) truncsal
9 FROM emp
10 GROUP BY TRUNC ((sal - 1) / 1000))
11 ORDER BY truncsal
12 /
MINSAL MAXSAL CNT RNG
---------- ---------- ---------- ----------------------------------------
800 950 6 0 >= sal <= 1000
1100 1600 18 1001 >= sal <= 2000
2450 3000 15 2001 >= sal <= 3000
5000 5000 3 4001 >= sal <= 5000
3 >= sal <=
Elapsed: 00:00:00.02
TO_NUMBER PROBLEM
A reader, June 07, 2006 - 7:33 pm UTC
Hi
I have an external table which has a column which is char(19).
When I do SELECT TO_NUMBER(col_num) FROM et_table
it comes back as 4.70793001110283E15.
I want the output as a plain number.
June 07, 2006 - 8:26 pm UTC
that is a number - sqlplus is printing a string so you can read it but it is in fact "a number"
you can
select to_char( to_number(col_num), '99999999999999999999.99999' ) from et_table;
but rest assured, it is a nice number and sqlplus is just trying to make it fit reasonably on the screen for us human beings to read.
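An alternative, display-side fix is to leave the value as a NUMBER and simply widen the column in SQL*Plus (COL_NUM and ET_TABLE are the poster's names; a sketch, not tested against the poster's data):

```sql
-- tell SQL*Plus to render up to 19 digits instead of scientific notation
column col_num format 9999999999999999999
select to_number(col_num) col_num from et_table;
```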
Analytics or Not
Kandy Train, June 09, 2006 - 11:03 am UTC
Hi Tom,
Forgive me if this is not the proper thread to put this in.
I have never written an analytic query and came across this requirement; I felt it can't be done with a normal ORDER BY.
Here is the example..
create table t1 (priority varchar2(20), planned_ship_date date, order_no varchar2(20));
insert into t1 values(1, sysdate + 5, 111111);
insert into t1 values(2, sysdate + 1, 333333);
insert into t1 values(3, sysdate + 1, 444444);
insert into t1 values(4, sysdate, 222222);
insert into t1 values(5, sysdate + 3, 555555);
--column planned_ship_date format a25
--select order_no, priority, planned_ship_date from t1;
I want the results to be sorted
1). All the Priority = 1 rows should come first
2). The rest should be orderd on the min(Planned_Ship_Date) --> earliest
3). When there is a tie on the Planned_Ship_date, priority = 2 rows should come first(If there is any).
The Order No column is there just to see the result, if we have orderes them correctly.
Can this be done??
Thank you so much for your help.
June 09, 2006 - 1:24 pm UTC
order by case when priority = 1 then 1 else 2 end,
planned_ship_date,
case when priority = 2 then 1 else 2 end
/
why is priority a varchar2(20)? that is so wrong, hope that is not "real life"
same with order_no
never ever use a string to store a number, or a date. never.
LEAST of all dates
Mave, June 12, 2006 - 10:55 am UTC
Tom, I want to get Least of all the selected dates [Trying to get earliest date]
eg:
select least(d1,d2,d3) from (select sysdate-3 d1,sysdate-2 d2,sysdate d3 from dual)
should work. But if one of them is null, I want the least of the other two. Nulls should not be included in the list for the LEAST function.
Thanks for all your help in advance.
June 13, 2006 - 10:04 am UTC
nvl() them and have the nvl return a date far into the future (like the year 4000)
least( nvl(d1,to_date('4000','yyyy')), nvl(d2,...)
and if you think all three might be null, use decode or case to return NULL if the returned least value is 4000
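Putting both pieces of that advice together, a minimal sketch (using the poster's d1/d2/d3 example, with NULLIF handling the all-null case instead of decode/case):

```sql
select nullif( least( nvl(d1, sentinel),
                      nvl(d2, sentinel),
                      nvl(d3, sentinel) ),
               sentinel ) earliest   -- NULL when all three dates were null
  from (select sysdate-3              d1,
               cast(null as date)     d2,   -- simulate a missing date
               sysdate                d3,
               to_date('01014000','DDMMYYYY') sentinel   -- far-future marker
          from dual);
```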
Works for me....
Mave, June 13, 2006 - 10:33 am UTC
Thanks tom, that would work for me.
But I am curious... how does this to_date('4000','yyyy') work? I tried it and got 06/01/4000 as the date. Why did it show June 1st of 4000?
Thanks,
June 13, 2006 - 12:28 pm UTC
the default day and month for yyyy is "first day of the current month"
you can of course use 01014000 and a format of ddmmyyyy if you like
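A quick illustration of that default (the first result depends on the month you run it in):

```sql
-- defaults day and month to the first day of the current month, year 4000
select to_date('4000','YYYY') from dual;

-- fully specified and therefore unambiguous: 01-JAN-4000
select to_date('01014000','DDMMYYYY') from dual;
```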
If /how I can use analytics for this problem?
Ping, June 14, 2006 - 10:00 am UTC
I have a table with info like this:
c_id a_id reg_dt
10 1 1/1/2006
10 2 1/1/2006
10 3 1/1/2006
10 1 1/2/2006
10 2 1/2/2006
10 3 1/2/2006
10 1 1/3/2006
10 2 1/3/2006
10 3 1/3/2006
What I want to get is 3 rows:
c_id a_id reg_dt
10 1 1/1/2006
10 2 1/2/2006
10 3 1/3/2006
I was trying to use the lag/lead functions but didn't seem to be able to make them work. Any suggestions?
Thanks!
June 14, 2006 - 12:37 pm UTC
fascinating.
a bunch of data and no description of what it means or why you got the three rows you did.
that and there is no create table, no inserts, nothing to play with.
Query
Rambabu, June 15, 2006 - 7:10 am UTC
Hi Tom,
I have a table like this.
Pid seq value1 value2
100 1 X A
100 2 Y B
101 1 M O
101 2 N P
I want the output in the following way.
Pid value1 value2
100 X B
101 M P
For all Pids, I want value1 from the row which has "seq =1" and value2 from the row which has "seq =2".
I got the above result using the following query.
select t1.pid,t1.value1, t2.value2
from (select *
from t tt
where seq = (select min(seq) from t where pid = tt.pid)) t1,
(select *
from t tt
where seq = (select max(seq) from t where pid = tt.pid)) t2
where t1.pid = t2.pid
Here I used two instances (t1, t2) of table t.
How can I get the above result using a single instance of table t and analytics?
create table t(
Pid number(6),
seq number(6),
value1 varchar2(20),
value2 varchar2(20));
insert into t values(100,1,'X','A');
insert into t values(100,2,'Y','B');
insert into t values(101,1,'M','O');
insert into t values(101,2,'N','P');
commit;
Thanks in advance,
Rambabu B
June 15, 2006 - 8:47 am UTC
ops$tkyte@ORA10GR2> select pid,
2 max( decode( seq, 1, value1 ) ) v1,
3 max( decode( seq, 2, value2 ) ) v2
4 from t
5 group by pid;
PID V1 V2
---------- -------------------- --------------------
100 X B
101 M P
<b>sorry, analytics didn't fit into the answer</b>, they would not make sense here.
You can use analytics
Anwar, June 15, 2006 - 10:56 am UTC
Tom's answer seems better, more concise, but you can use analytics if you really want to.
SQL> SELECT pid,value1,v2
2 FROM
3 (
4 SELECT pid,
5 value1,
6 lead(value2) over(partition by pid order by seq) v2,
7 seq
8 FROM t
9 )
10 WHERE seq=1
11 /
PID VALUE1 V2
---------- -------------------- --------------------
100 X B
101 M P
Sorry, let me try...
A reader, June 15, 2006 - 11:55 am UTC
I think the table I showed earlier is coming from a cross join b/w two tables:
Table 1, stores acct_ids that are cancelled
customer_id acct_id
10 1
10 2
10 3
table 2, stores acct_ids that are currently in use
customer_id acct_id reg_dt
10 11 1/1/2006
10 12 1/2/2006
10 13 1/3/2006
The two tables join on customer_id, so it got 9 rows.
Customer_id 10 cancelled 3 of his acct_ids, and then registered 3 new acct_ids. What my client really needs is: after customer_id 10 cancelled his acct_ids 1, 2, 3, when did he register acct_ids 11, 12, 13?
customer_id acct_id_canceled acct_id_new reg_dt
10 1 11 1/1/2006
10 2 12 1/2/2006
10 3 13 1/3/2006
Hope this is clear.
Thanks, Ping
June 16, 2006 - 5:51 pm UTC
I have no idea what you are referring to.
and still, hmm, not a create, not an insert - no real detail...
For create table and insert statements...
A reader, June 15, 2006 - 12:00 pm UTC
create table a
(c_id integer,a_id integer)
;
insert into a values (10, 1);
insert into a values (10, 2);
insert into a values (10, 3);
create table b
(c_id integer,a_id integer,reg_dt date);
insert into b values (10, 11, to_date('1/1/2006','MM/DD/YYYY'));
insert into b values (10, 12, to_date('1/2/2006','MM/DD/YYYY'));
insert into b values (10, 13, to_date('1/3/2006','MM/DD/YYYY'));
Thanks!
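One possible reading of the desired output (pairing the Nth cancelled acct_id with the Nth registered acct_id per customer) can be sketched with row_number; this is a guess at the intent, using the poster's tables a and b:

```sql
select ca.c_id,
       ca.a_id  acct_id_canceled,
       nw.a_id  acct_id_new,
       nw.reg_dt
  from (select c_id, a_id,
               row_number() over (partition by c_id order by a_id) rn
          from a) ca,                 -- cancelled accounts, numbered
       (select c_id, a_id, reg_dt,
               row_number() over (partition by c_id order by reg_dt) rn
          from b) nw                  -- new accounts, numbered by reg date
 where ca.c_id = nw.c_id
   and ca.rn   = nw.rn;              -- pair 1st with 1st, 2nd with 2nd, ...
```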
SQL Query
Sankar Kumar, June 15, 2006 - 12:50 pm UTC
Hi Tom,
I have a table like this:
Personid Change_Sequence Class Cls_EffDate Location Loc_EffDate
----------------------------------------------------------------------------
1000 1 FullTime Hawaii
1000 2 FullTime California 1/1/2005
1000 3 PartTime 1/1/2006 California 1/1/2005
1000 4 PartTime 1/1/2006 Texas 10/1/2005
1000 5 FullTime 1/1/2007 Boston 1/1/2007
1000 6 COBRA 1/1/2005 Boston 1/1/2007
1000 7 Outside 5/1/2006 Boston 1/1/2007
1. The primary key is (Personid, change_sequence)
2. The effective dates of the first row for each person is null (i.e. for change_sequence = 1)
3. For each row only one column value will be effected (i.e. Either Class or Location can be changed) and the remaining data will be copied from the previous row
4. Both Class and Location can be changed in a row, only if the effective date is same. [See where change_sequence = 5]
I am using the following query for getting Class and Location of the person as per given Effective Date.
Select ECls.personid, ECls.class, ELoc.Location, ECls.change_sequence Cls_change_seq, ELoc.change_Sequence Loc_change_seq
from (select * from (select personid, class, change_sequence,
row_number() over(partition by personid order by change_sequence desc) r
from employee
where nvl(Cls_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= to_date('&givenDate','MM/DD/YYYY')) Cls
where Cls.r = 1) ECls,
(Select * from (select personid, Location, change_sequence,
row_number() over(partition by personid order by change_sequence desc) r
from employee
where nvl(Loc_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= to_date('&givenDate','MM/DD/YYYY')) Loc
where Loc.r = 1) ELoc
where ECls.personid = ELoc.personid;
Now lets run query with three different dates
For 1/1/2005 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 COBRA-Class California 6 3
For 1/1/2006 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 COBRA-Class Texas 6 4
For 1/1/2007 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 Outside Boston 7 7
The table has more than 100,000 (1 lakh) rows and more than 12 columns, similar to CLASS and LOCATION, based on effective dates. (For example Dept, Dept_EffDate, Status, Status_EffDate, ...)
In the above query I created two instances ECls, ELoc. If I want dept column too, I would need to create one more instance.
The goal is to get all the columns using fewer joins, if possible using only one instance of the table.
If this is not possible, what would be the better design to handle this type of data?
create table employee
(personid number(10),
change_sequence number(3),
class varchar2(50),
cls_effdate date,
location varchar2(50),
loc_effdate date,
primary key (personid,change_sequence));
insert into employee values(1000,1,'FullTime',null,'Hawaii',null);
insert into employee
values(1000,2,'FullTime',null,'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee
values(1000,3,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee
values(1000,4,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'Texas',to_date('10/1/2005','MM/DD/YYYY'));
insert into employee
values(1000,5,'FullTime',to_date('1/1/2007','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee
values(1000,6,'COBRA-Class',to_date('1/1/2005','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee
values(1000,7,'Outside',to_date('5/1/2006','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
Thanks in Advance
Sankar
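One possible single-pass approach, adapting the to_char()||value trick Tom used earlier in this thread: per person, pick the value from the highest change_sequence whose effective date qualifies. This is a sketch against the posted table only; each extra effective-dated column becomes one more max(case) expression instead of one more join.

```sql
select personid,
       substr( max( case when nvl(cls_effdate, d.gd) <= d.gd
                         then to_char(change_sequence,'fm000000')||class
                    end ), 7 ) class,
       substr( max( case when nvl(loc_effdate, d.gd) <= d.gd
                         then to_char(change_sequence,'fm000000')||location
                    end ), 7 ) location
  from employee,
       (select to_date('&givenDate','MM/DD/YYYY') gd from dual) d
 group by personid;
```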
A reader, June 16, 2006 - 8:01 am UTC
Dear Tom,
I have read chapter 12(analytic function) of your book and don't find the solution to my problem. May be you can help me.
CREATE TABLE T
(IDE NUMBER,
COMPTECLI VARCHAR2(13),
DAT_JOB DATE,
ACTION VARCHAR2(255),
FAG_SUCCESS VARCHAR2(1));
insert into t values (115, 'XX12', TRUNC(SYSDATE), '01','N');
insert into t values (120, 'YY14', TRUNC(SYSDATE), '01','N');
insert into t values (145, 'ZZ25', TRUNC(SYSDATE),'01','N');
insert into t values (255, 'XX12', TRUNC(SYSDATE+1), '01','N');
insert into t values (301, 'AB13', TRUNC(SYSDATE+1), '02','N');
commit;
select * from t
order by dat_job;
IDE COMPTECLI DAT_JOB ACTION FAG_SUCCESS
115 XX12 16/06/2006 01 N
120 YY14 16/06/2006 01 N
145 ZZ25 16/06/2006 01 N
301 AB13 17/06/2006 02 N
255 XX12 17/06/2006 01 N
What I would like to get is:
if the same COMPTECLI with the same ACTION and FAG_SUCCESS = 'N' occurs many times, then select only the row with the biggest dat_job. In this example I would like to get this:
IDE COMPTECLI DAT_JOB ACTION FAG_SUCCESS
120 YY14 16/06/2006 01 N
145 ZZ25 16/06/2006 01 N
301 AB13 17/06/2006 02 N
255 XX12 17/06/2006 01 N
where comptecli XX12 of date 16/06/2006 has been ignored
Could you please help me
Thanks in advance
June 16, 2006 - 7:13 pm UTC
there actually are examples of your problem in the book... you just didn't recognize them :)
however, what about records where fag_success <> 'N' - what becomes of them?
are they treated differently? does this only apply to 'N's?
answer is maybe this:
ops$tkyte@ORA10GR2> select * from t
2 order by dat_job;
IDE COMPTECLI DAT_JOB ACTIO F
---------- ------------- --------- ----- -
115 XX12 16-JUN-06 01 N
120 YY14 16-JUN-06 01 N
145 ZZ25 16-JUN-06 01 N
255 XX12 17-JUN-06 01 N
301 AB13 17-JUN-06 02 N
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
2 from (
3 select ide, comptecli, dat_job, action, fag_success,
4 max(case when fag_success = 'N' then dat_job end) over (partition by comptecli, action) max_date
5 from t
6 )
7 where fag_success <> 'N'
8 or (fag_success = 'N' and dat_job = max_date)
9 or max_date is null;
IDE COMPTECLI DAT_JOB ACTIO F MAX_DATE
---------- ------------- --------- ----- - ---------
301 AB13 17-JUN-06 02 N 17-JUN-06
255 XX12 17-JUN-06 01 N 17-JUN-06
120 YY14 16-JUN-06 01 N 16-JUN-06
145 ZZ25 16-JUN-06 01 N 16-JUN-06
but you didn't provide the details - and the border test cases!
A reader, June 17, 2006 - 5:42 am UTC
Thanks very much Tom,
Only records with Fag_Success = 'N' are considered. Records with Fag_Success = 'Y' are deleted before the process starts.
Again Thanks a lot
June 17, 2006 - 7:14 am UTC
then mentioning 'fag-success' at all in the problem description was what we call a 'red herring', something to throw us off the trail :)
distinct and average
riyaz, June 28, 2006 - 12:29 am UTC
The data from a select is:
XVAL PO_NO BILLNO BILL VALUE
000001 PO1000000 BL0000001 1000.00
000001 PO1000001 BL0000002 2000.00
000001 PO1000002 BL0000002 2000.00
000001 PO1000003 BL0000003 1000.00
000001 PO1000003 BL0000004 1000.00
I need a report like:
XVAL PO_count BILL Count Avg BILL VALUE
000001 4 4 1250.00
distinct (count) and average - contd
riyaz, June 28, 2006 - 12:36 am UTC
For one bill no the amount is repeated twice, so only one amount is to be taken,
i.e. average --> 5000/4 = 1250
XVAL PO_NO BILLNO BILL VALUE
000001 PO1000000 BL0000001 1000.00
000001 PO1000001 BL0000002 2000.00
000001 PO1000002 BL0000002 2000.00
(value is repeating, not originally available in table)
000001 PO1000003 BL0000003 1000.00
000001 PO1000003 BL0000004 1000.00
Please give us an analytic query.
June 28, 2006 - 7:45 am UTC
no creates, no inserts, no look
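Without creates and inserts the question can only be sketched; under assumed names (a table "bills" with the columns shown in the posting), one way to count each bill's value exactly once for the average is to number the duplicate rows per bill:

```sql
select xval,
       count(distinct po_no)                     po_count,
       count(distinct billno)                    bill_count,
       sum(case when rn = 1 then bill_value end)
         / count(distinct billno)                avg_bill_value
  from (select xval, po_no, billno, bill_value,
               -- keep only one amount per (xval, billno) for the average
               row_number() over (partition by xval, billno
                                  order by po_no) rn
          from bills)
 group by xval;
```

With the sample data above this yields 5000/4 = 1250 as requested.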
Does it perform better?
A reader, July 05, 2006 - 11:54 am UTC
Tom, please take a look at this query. Basically I am trying to pull all the tables and their columns from all_tab_columns and want them listed as a group,
e.g.:
Table A Column1
Column2
Column3
Table B Column1
Column2
Column3
I do not want table name to repeat for each column [like BREAK in SQL Plus]
I got it using this query:
select
case when lag(table_name) over (order by table_name) is null
then table_name
when lag(table_name) over(order by table_name) = table_name
then ''
else table_name
end tab_name,
column_name
from all_tab_columns
Do you see any problem with this one, or should it be OK in terms of performance?
Thanks,
July 08, 2006 - 8:11 am UTC
I might have used row_number with a partition - but only because it "made more sense to me" that way... You need to incorporate OWNER in there as well though, technically your query isn't quite right without it.
select decode( row_number() over (partition by owner, table_name order by column_name),
1, owner || '.' || table_name ) tname,
column_name
from all_tab_columns
order by owner, table_name, column_name
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3637 4.36 4.26 0 390552 0 54535
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3639 4.37 4.27 0 390552 0 54535
select
case when lag(table_name) over (order by table_name) is null
then table_name
when lag(table_name) over(order by table_name) = table_name
then ''
else table_name
end tab_name,
column_name
from all_tab_columns
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3637 4.51 4.42 0 390552 0 54535
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3639 4.51 4.42 0 390552 0 54535
Items sold Together
Reader, July 11, 2006 - 6:43 pm UTC
I was exploring whether you could help me write a SQL query to get the popular items sold together. Something like on some websites where they say that people who bought this item also bought that item.
E.g. if a customer buys a shirt he always buys a pen.
THANKS !
July 12, 2006 - 3:28 pm UTC
sort of hard without a table isn't it?
Can I use an analytic function for this in Oracle 10g?
Mariana, July 13, 2006 - 5:17 pm UTC
I have a select
SELECT SUPPLIER_NO,ACCOUNT_NO,LINE_NO,AMOUNT
FROM KM_SUP_PAYMENTS
ORDER BY SUPPLIER_NO
that gives such result:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
123 555555 10001 34
123 555555 20034 56
123 555555 30034 12
234 345555 34555 56
234 454555 45455 34
But I also need to see, after each group of suppliers, the
total amount for the supplier,
like here. I tried to do it with analytics but my solution was wrong;
can you please help me with a solution in order to get this result:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
123 555555 10001 34
123 555555 20034 56
123 555555 30034 12
123 102
234 345555 34555 56
234 454555 45455 34
234 90
July 13, 2006 - 6:24 pm UTC
scott@ORA10GR2> select deptno, ename, sum(sal)
2 from emp
3 group by grouping sets((deptno,ename),(deptno));
DEPTNO ENAME SUM(SAL)
---------- ---------- ----------
10 KING 5000
10 CLARK 2450
10 MILLER 1300
10 8750
20 FORD 3000
20 ADAMS 1100
20 JONES 2975
20 SCOTT 3000
20 SMITH 800
20 10875
30 WARD 1250
30 ALLEN 1600
30 BLAKE 2850
30 JAMES 950
30 MARTIN 1250
30 TURNER 1500
30 9400
17 rows selected.
Continue grouping sets in ORACLE 10G
Marirana, July 14, 2006 - 12:12 am UTC
HI Tom,
Thank for answer,but at this example:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
123 555555 10001 34
123 555555 20034 56
123 555555 30034 12
123 102
234 345555 34555 56
234 454555 45455 34
234 90
the amount is not summed for each row, but just when supplier_no changes; even though I have two identical rows
before, I have to display them and make the sum when supplier_no changes.
July 14, 2006 - 8:22 am UTC
i don't see a create table and inserts for your example?
but using emp.... we can do this again:
ops$tkyte@ORA10GR2> select supplier_no, account_no, line_no, sum(amount)
2 from (
3 select deptno supplier_no, job account_no, empno line_no, sal amount
4 from emp
5 )
6 group by grouping sets((supplier_no,account_no,line_no),(supplier_no))
7 /
SUPPLIER_NO ACCOUNT_N LINE_NO SUM(AMOUNT)
----------- --------- ---------- -----------
10 CLERK 7934 1300
10 MANAGER 7782 2450
10 PRESIDENT 7839 5000
10 8750
20 CLERK 7369 800
20 CLERK 7876 1100
20 ANALYST 7788 3000
20 ANALYST 7902 3000
20 MANAGER 7566 2975
20 10875
30 CLERK 7900 950
30 MANAGER 7698 2850
30 SALESMAN 7499 1600
30 SALESMAN 7521 1250
30 SALESMAN 7654 1250
30 SALESMAN 7844 1500
30 9400
17 rows selected.
I don't see two *identical rows* in your example - do you?
if you pretend my 30 = your 123 and my salesman = your 555555 - there you go.
Can Analytic do this?
Steve Read, July 14, 2006 - 1:32 am UTC
Here is a simple self-contained query with no creates or inserts needed:
SELECT keynbr
,MAX(MAX(status)) OVER (PARTITION BY keynbr ORDER BY MAX(datetm) DESC) status
FROM -- should return 1, 4, X, X
( SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual
UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual
UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
-- should return 2, 2, X, NULL
UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual
UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual
UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual
UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual
)
GROUP BY keynbr
What it returns now is the MAX(status), but what I actually want is the status associated with the most recent datetm. I think this is probably very simple, but I can't figure out how to do it. I read through this whole thread (took a long time) but I couldn't find the answer.
We are using 8.1.7.
Thanks in advance for your help.
Can Analytic do this, part 2
Steve Read, July 14, 2006 - 1:36 am UTC
Correcting the comment lines from the previous post (I forgot to fix them after simplifying my example):
SELECT keynbr
,MAX(MAX(status)) OVER (PARTITION BY keynbr ORDER BY MAX(datetm) DESC) status
FROM -- should return 1, 4
( SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual
UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual
UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
-- should return 2, 2
UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual
UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual
UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual
UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual
)
GROUP BY keynbr
Michel Cadot, July 14, 2006 - 3:03 am UTC
SQL> with
2 raw_data as (
3 SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
4 UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual
5 UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual
6 UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
7 UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual
8 UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual
9 UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual
10 UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual
11 ),
12 numbered_data as (
13 select keynbr, status,
14 row_number() over (partition by keynbr order by datetm desc) rn
15 from raw_data
16 )
17 select keynbr, status
18 from numbered_data
19 where rn = 1
20 order by keynbr
21 /
K S
- -
1 4
2 2
2 rows selected.
Have a look at the RANK and DENSE_RANK functions, depending on the result you want in case of duplicate dates.
But if you have a large table and (keynbr, datetm) is indexed, the following query will be faster and cheaper:
SQL> with
2 raw_data as (
3 SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
4 UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual
5 UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual
6 UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
7 UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual
8 UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual
9 UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual
10 UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual
11 ),
12 max_data as (
13 select keynbr, max(datetm) max_date
14 from raw_data
15 group by keynbr
16 )
17 select r.keynbr, r.status
18 from raw_data r, max_data m
19 where r.keynbr = m.keynbr
20 and r.datetm = m.max_date
21 order by r.keynbr
22 /
K S
- -
1 4
2 2
2 rows selected.
Michel
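As a third option, a sketch for 9i and later (not available on Steve's 8.1.7): the FIRST/LAST aggregate functions can pick the status belonging to the greatest datetm directly in a single GROUP BY, with no self-join and no inline view:

```sql
-- A sketch only, assuming Oracle 9i or later for the FIRST/LAST aggregates;
-- raw_data is the same factored subquery as in the examples above.
select keynbr,
       -- keep only the row(s) with the greatest datetm per keynbr,
       -- then take MAX(status) among those to break ties deterministically
       max(status) keep (dense_rank last order by datetm) status
  from raw_data
 group by keynbr
 order by keynbr;
```

With duplicate maximum dates this returns the largest status among the tied rows, which may or may not be the tie-break you want.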
Complex Query
Sankar, July 14, 2006 - 7:41 am UTC
Hi Tom,
I have a table like this:
Personid Change_Sequence Class Cls_EffDate Location Loc_EffDate
----------------------------------------------------------------------------
1000 1 FullTime Hawaii
1000 2 FullTime California 1/1/2005
1000 3 PartTime 1/1/2006 California 1/1/2005
1000 4 PartTime 1/1/2006 Texas 10/1/2005
1000 5 FullTime 1/1/2007 Boston 1/1/2007
1000 6 COBRA 1/1/2005 Boston 1/1/2007
1000 7 Outside 5/1/2006 Boston 1/1/2007
1. The primary key is (Personid, change_sequence)
2. The effective dates of the first row for each person is null (i.e. for
change_sequence = 1)
3. For each row only one column value will be effected (i.e. Either Class or
Location can be changed) and the remaining data will be copied from the previous
row
4. Both Class and Location can be changed in a row, only if the effective date
is same. [See where change_sequence = 5]
I am using the following query for getting Class and Location of the person as
per given Effective Date.
Select ECls.personid, ECls.class, ELoc.Location, ECls.change_sequence
Cls_change_seq, ELoc.change_Sequence Loc_change_seq
from (select * from (select personid, class, change_sequence,
row_number() over(partition by personid order by
change_sequence desc) r
from employee
where nvl(Cls_EffDate, to_date('&givenDate','MM/DD/YYYY')) <=
to_date('&givenDate','MM/DD/YYYY')) Cls
where Cls.r = 1) ECls,
(Select * from (select personid, Location, change_sequence,
row_number() over(partition by personid order by
change_sequence desc) r
from employee
where nvl(Loc_EffDate, to_date('&givenDate','MM/DD/YYYY')) <=
to_date('&givenDate','MM/DD/YYYY')) Loc
where Loc.r = 1) ELoc
where ECls.personid = ELoc.personid;
Now lets run query with three different dates
For 1/1/2005 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 COBRA-Class California 6 3
For 1/1/2006 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 COBRA-Class Texas 6 4
For 1/1/2007 the output would be
PERSONID CLASS LOCATION CLS_CHANGE_SEQ LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000 Outside Boston 7 7
The table has more than 100,000 rows and more than 12 columns similar to CLASS
and LOCATION, each driven by an effective date (for example Dept, Dept_EffDate, Status,
Status_EffDate, ...).
In the above query I created two instances of the table, ECls and ELoc. If I wanted the dept column
too, I would need to create one more instance.
The goal is to get all the columns using fewer joins -- if possible,
using only one instance of the table.
If this is not possible, what would be the better design to handle this type of
data?
create table employee
(personid number(10),
change_sequence number(3),
class varchar2(50),
cls_effdate date,
location varchar2(50),
loc_effdate date,
primary key (personid,change_sequence));
insert into employee values(1000,1,'FullTime',null,'Hawaii',null);
insert into employee values(1000,2,'FullTime',null,'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee values(1000,3,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee values(1000,4,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'Texas',to_date('10/1/2005','MM/DD/YYYY'));
insert into employee values(1000,5,'FullTime',to_date('1/1/2007','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee values(1000,6,'COBRA-Class',to_date('1/1/2005','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee values(1000,7,'Outside',to_date('5/1/2006','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
Thanks in Advance
Sankar
July 14, 2006 - 8:37 am UTC
it might just be me, but points 4 and 5 seem to conflict with each other.
also, why do the dates go UP and DOWN as the sequence increases? That doesn't make sense.
it seems the dates should only INCREASE or stay the same over time.
Continue with grouping sets in ORACLE10G
Mariana, July 14, 2006 - 9:48 am UTC
Hi,Tom,
This is just an example:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
123 555555 10001 34
123 555555 20034 56
123 555555 30034 12
123 102
234 345555 34555 56
234 454555 45455 34
234 90
But what i really need in my project is
to run through the cursor
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT_NO
  FROM KM_SUP_PAYMENTS A,
       KM_SUP_SUPPLIERS B
 WHERE A.SUPPLIER_NO = B.SUPPLIER_NO
 ORDER BY A.SUPPLIER_NO
For each different supplier I have to insert sum(amount) into an array, but for each row I also have to make other checks. I thought about computing sum(amount) in code, but for that I would have to compare previous cursor values with current ones. I also thought about using the analytic sum(amount)
on every row, but I am not sure it is good. Help me please.
Thank you very much
July 14, 2006 - 9:59 am UTC
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       sum(A.AMOUNT_NO)
  FROM KM_SUP_PAYMENTS A,
       KM_SUP_SUPPLIERS B
 WHERE A.SUPPLIER_NO = B.SUPPLIER_NO
 group by grouping sets(
    (a.supplier_no, b.supplier_name, a.line_no, a.payment_no, b.vat_sw),
    (a.supplier_no)
 )
 ORDER BY A.SUPPLIER_NO
I am going to keep saying basically the same thing over and over - you want a subtotal. Hence, I'll keep bringing up GROUPING SETS!!!!!!
Answer to Complex Query
Andy, July 14, 2006 - 10:32 am UTC
To Sankar,
This query should work for you.
select
personid,
/* repeat in the same pattern for other fields */
max(cls_max_chg_seq) cls_chg_seq,
max(case when (change_sequence = cls_max_chg_seq) then class end) class,
max(case when (change_sequence = cls_max_chg_seq) then cls_effdate end) cls_effdate,
/* repeat in the same pattern for other fields */
max(loc_max_chg_seq) loc_chg_seq,
max(case when (change_sequence = loc_max_chg_seq) then location end) location,
max(case when (change_sequence = loc_max_chg_seq) then loc_effdate end) loc_effdate
from
(
select personid, change_sequence,
/* repeat in the same pattern for other fields */
class,
cls_effdate,
max(case when cls_effdate <= the_date then change_sequence else 1 end) over(partition by personid) cls_max_chg_seq,
/* repeat in the same pattern for other fields */
location,
loc_effdate,
max(case when loc_effdate <= the_date then change_sequence else 1 end) over(partition by personid) loc_max_chg_seq
from
employee
)
group by personid;
Here analytics gives you the max change sequence with effective date less than the input date (the_date => replace it with your own input).
Then group by, with max, allows you select the required value. This query can be readily extended to multiple attributes.
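As a sketch of that extension: a hypothetical dept/dept_effdate pair (not in Sankar's posted table, shown only to illustrate the pattern) would add one analytic column in the inline view and two projections in the outer query:

```sql
-- A sketch only: dept and dept_effdate are hypothetical columns added
-- to the employee table to show how the pattern extends; the_date is
-- the input date, as in Andy's query above.
select personid,
       max(dept_max_chg_seq) dept_chg_seq,
       max(case when change_sequence = dept_max_chg_seq then dept end) dept,
       max(case when change_sequence = dept_max_chg_seq then dept_effdate end) dept_effdate
  from (
        select personid, change_sequence, dept, dept_effdate,
               -- max change sequence whose effective date is <= the input date
               max(case when dept_effdate <= the_date
                        then change_sequence else 1 end)
                   over (partition by personid) dept_max_chg_seq
          from employee
       )
 group by personid;
```

Each additional attribute costs one analytic column and a couple of CASE projections, but no extra join or extra pass over the table.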
Continue to grouping sets
Mariana, July 14, 2006 - 10:35 am UTC
I understand you, but what confuses me is that I do this through a cursor, and I would have to put all the fields that appear in the SELECT into the grouping sets. What about using the analytic sum(amount) on every row instead, and writing out the sum when the previous supplier_no != the current supplier_no?
Thank you.
July 14, 2006 - 12:34 pm UTC
not understanding you.
where does a cursor come into the discussion, I'm getting the answer in SQL - no code.
Less code = less bugs. Strive to write less code.
Analytics use
Thiru, July 14, 2006 - 11:55 am UTC
Hello Tom,
Great site. Thanks so much.
Can you advise how analytics could be used (if possible -- or any other way to do it) to get the results for the query below:
CREATE TABLE Z (A CHAR(2),B NUMBER,C NUMBER,D NUMBER);
CREATE TABLE Y (A CHAR(2) PRIMARY KEY,B NUMBER,C NUMBER);
BEGIN
INSERT INTO Y VALUES('AA',200,NULL);
INSERT INTO Y VALUES('AB',200,NULL);
INSERT INTO Y VALUES('AC',200,NULL);
INSERT INTO Y VALUES('AD',200,NULL);
INSERT INTO Z VALUES('AB',100,1,0);
INSERT INTO Z VALUES('AB',100,1,0);
INSERT INTO Z VALUES('AC',100,1,1);
INSERT INTO Z VALUES('AC',100,1,0);
INSERT INTO Z VALUES('AD',100,0,0);
INSERT INTO Z VALUES('AD',100,0,0);
COMMIT;
END;
I would like to update Table Y col c with the sum(Z.B) grouped by Z.A where SUM(Z.B)=Y.B AND Z.C =1 for all records in the group and Z.D=0 for all records in the group .
Y.A=Z.A and Y.A is primary key.
July 14, 2006 - 12:53 pm UTC
AND Z.C =1 for all records in the group and Z.D=0 for all records
in the group
you lost me there.
do you mean something like
update y
set c = b
where (a,b) in
(
select z.a, sum(z.b)
from z
where z.c = 1
and z.d = 0
group by z.a
)
since SUM(Z.B) must be Y.B, we don't need to update from Z, we just need to find the records to update right?
I messed up the question
Thiru, July 14, 2006 - 1:39 pm UTC
Sorry Tom for the confusion, I really meant:
I would like to update Table Y col c with the sum(Z.B) grouped by Z.A where
Z.C =1 for all records in the group and Z.D=0 for all records
in the group .
Y.A=Z.A and Y.A is primary key.
The sum(Z.b)=y.b is not part of the condition. I will have to take the values from Z. It is just a coincidence that I made the scenario where y.b looked the same as sum(z.b).
Will that make the subquery more complex?
Also can we use analytics for this looking at the millions of rows in the two tables?
July 14, 2006 - 1:57 pm UTC
Not just a coincidence - you said it:
....
grouped by Z.A where SUM(Z.B)=Y.B AND Z.C =1
........
what version you got going here?
are there any Z.A values that are NOT in Y.A?
analytics will NOT come into play here - aggregates will.
Thiru, July 14, 2006 - 2:01 pm UTC
"what version you got going here?
are there any Z.A values that are NOT in Y.A?"
I am on 10gR2.
Yes it is possible that Z.A values will not be in Y.A.
July 14, 2006 - 2:47 pm UTC
ops$tkyte@ORA10GR2> select * from y;
A B C
-- ---------- ----------
AA 200
AB 200
AC 200
AD 200
ops$tkyte@ORA10GR2> merge into y
2 using
3 (
4 select z.a, sum(z.b) new_c
5 from z
6 where z.c = 1
7 and z.d = 0
8 group by z.a
9 ) z
10 on (y.a = z.a)
11 when matched then update set c = z.new_c;
2 rows merged.
ops$tkyte@ORA10GR2> select * from y;
A B C
-- ---------- ----------
AA 200
AB 200 200
AC 200 100
AD 200
Thiru, July 14, 2006 - 3:03 pm UTC
The Y table should have been updated for only one row, i.e., 'AB', and not 'AC'.
Only AB in table Z satisfies both conditions: all records in group AB have 1 for column c and 0 for column d, while for 'AC' one record has the value 1 in column d, so that group does not satisfy the condition.
select * from z
/
A    B   C  D
AB  100  1  0
AB  100  1  0
AC  100  1  1
AC  100  1  0
AD  100  0  0
AD  100  0  0
July 14, 2006 - 3:16 pm UTC
that was not entirely clear.
easy enough however
ops$tkyte@ORA10GR2> merge into y
2 using
3 (
4 select z.a, sum(z.b) new_c
5 from z
6 group by z.a
7 having count(*) = count( case when c=1 and d=0 then 1 end)
8 ) z
9 on (y.a = z.a)
10 when matched then update set c = z.new_c;
1 row merged.
ops$tkyte@ORA10GR2> select * from y;
A B C
-- ---------- ----------
AA 200
AB 200 200
AC 200
AD 200
Wow!
Thiru, July 14, 2006 - 3:21 pm UTC
Hats off to you! I was trying using CASE but just could not get it right.
Thanks.
Continue to grouping sets in ORACLE10G
MARIANA, July 14, 2006 - 5:12 pm UTC
What I meant here is that in my stored procedure I have to use this SELECT in a cursor: I have to process each row of the cursor because there is logic I need to do in code.
My question is: may I use the analytic function instead of grouping sets?
like:
CURSOR
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)
  FROM KM_SUP_PAYMENTS A,
       KM_SUP_SUPPLIERS B
 WHERE A.SUPPLIER_NO = B.SUPPLIER_NO
 ORDER BY A.SUPPLIER_NO
Thank you
July 14, 2006 - 5:28 pm UTC
you may do whatever you like, yes. of course.
(you could also just fetch out grouping:
ops$tkyte@ORA10GR2> select supplier_no, account_no, line_no, sum(amount),
2 grouping(account_no)
3 from (
4 select deptno supplier_no, job account_no, empno line_no, sal amount
5 from emp
6 )
7 group by grouping sets((supplier_no,account_no,line_no),(supplier_no))
8 /
SUPPLIER_NO ACCOUNT_N LINE_NO SUM(AMOUNT) GROUPING(ACCOUNT_NO)
----------- --------- ---------- ----------- --------------------
10 CLERK 7934 1300 0
10 MANAGER 7782 2450 0
10 PRESIDENT 7839 5000 0
10 8750 1
20 CLERK 7369 800 0
20 CLERK 7876 1100 0
20 ANALYST 7788 3000 0
20 ANALYST 7902 3000 0
20 MANAGER 7566 2975 0
20 10875 1
30 CLERK 7900 950 0
30 MANAGER 7698 2850 0
30 SALESMAN 7499 1600 0
30 SALESMAN 7521 1250 0
30 SALESMAN 7654 1250 0
30 SALESMAN 7844 1500 0
30 9400 1
17 rows selected.
so you know what rows are "made up for you" instead of doing the old "if curr <> last then ..." trick.
Continue with grouping sets in ORACLE 10G
Mariana, July 14, 2006 - 6:32 pm UTC
Thank you,Tom,
I have a question about the analytic function in this context:
Is this function expensive?
Is it calculated for each row,
or just when supplier_no changes?
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)
  FROM KM_SUP_PAYMENTS A,
       KM_SUP_SUPPLIERS B
 WHERE A.SUPPLIER_NO = B.SUPPLIER_NO
 ORDER BY A.SUPPLIER_NO
July 15, 2006 - 3:19 pm UTC
it'll have to
a) join
b) sort/hash/partition by supplier_no
c) compute the sum(a.amount)
"expensive" - probably less so than you doing it yourself since (a) and (b) would have to be done for you - letting the server do C is a good idea.
However, grouping sets *STILL* seems more appropriate.
Can this be done?
A reader, July 15, 2006 - 11:43 am UTC
Hi Tom,
Can this be done in one select through analytic function, instead of two sql stmnts.
1 select 'client, '||count(*) from client
2 union all
3* select to_char(client_key) from client
SQL> /
'CLIENT,'||COUNT(*)
------------------------------------------------
client, 10
10
12
16
22
23
24
25
26
29
31
11 rows selected.
Regards,
Tarun
July 15, 2006 - 3:24 pm UTC
assuming client_key is "unique"
scott%ORA10GR2> select decode( grouping(empno),
2 0, to_char(empno),
3 1, 'client, ' || count(empno) ) data
4 from emp
5 group by rollup(empno)
6 order by grouping(empno) DESC, empno
7 /
DATA
------------------------------------------------
client, 14
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876
7900
7902
7934
15 rows selected.
Good one!
Michel Cadot, July 15, 2006 - 3:48 pm UTC
Continue to grouping sets ORACLE 10G
Mariana, July 15, 2006 - 4:22 pm UTC
Hi Tom,
Thank you for question,
I didn't understand: which join is executed in this analytic function? And is the analytic function executed for each row, or once per distinct supplier_no?
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)
  FROM KM_SUP_PAYMENTS A,
       KM_SUP_SUPPLIERS B
 WHERE A.SUPPLIER_NO = B.SUPPLIER_NO
 ORDER BY A.SUPPLIER_NO
July 16, 2006 - 9:36 am UTC
you are doing the join? right there - A to B.
I was describing in effect what would have to take place to process said query.
a) join
b) sort
c) compute analytic.
If you did it yourself it would be:
a) join
b) sort
c) YOU compute the analytic
I'm going to guess it will be better for US to compute the analytic
I'll even guess that GROUPING SETS *is still the right answer* :)
Complex Query
Sankar Kumar, July 16, 2006 - 10:59 am UTC
To Andy,
That is what I am looking for. Thanks a lot
What if there is nothing to group on?
Phil, July 21, 2006 - 9:11 am UTC
Hi Tom
This is great but unlike "traditional" SQL I find it hard to learn. Can you recommend a book that deciphers it for the likes of me please?
I have some data which I would like to analyse but I am not sure if I can do it:
WHEN (date) DESCRIPTION (varchar2)
7/21/2006 1:10:44 PM Starting processX. (From 14-MAY-06 for 20 days)
7/21/2006 1:10:42 PM Completed balance update processing in pr_update_lagging_RTS_bals. 7419 balances processed
7/21/2006 11:10:41 AM Completed processing in processcdr. 433923 events,28567 accounts*days
7/21/2006 11:10:41 AM Starting balance update processing in pr_update_lagging_RTS_bals.
7/20/2006 9:42:48 PM Starting processX. (From 24-APR-06 for 20 days)
7/20/2006 9:42:46 PM Completed balance update processing in pr_update_lagging_RTS_bals. 2855 balances processed
7/20/2006 8:58:01 PM Completed processing in processcdr. 286963 events,7211 accounts*days
7/20/2006 8:58:01 PM Starting balance update processing in pr_update_lagging_RTS_bals.
7/20/2006 8:25:48 PM Starting processX. (From 04-APR-06 for 20 days)
7/20/2006 8:25:46 PM Completed balance update processing in pr_update_lagging_RTS_bals. 528 balances processed
7/20/2006 8:17:30 PM Completed processing in processX. 106057 CDR events,1009 accounts*days
7/20/2006 8:17:30 PM Starting balance update processing in pr_update_lagging_RTS_bals.
7/20/2006 8:06:00 PM Starting processX. (From 15-MAR-06 for 20 days)
I only have these two columns. Can I query this for a result as follows:
Process #1 completed in xx minutes
Process #2 completed in yy minutes
...
If I could, I could then do some extra substr to get the frequency and performance (process per min) on this aspect of the app.
Cheers,
Phil
July 23, 2006 - 7:29 am UTC
the data warehousing guide is a pretty good read. And remember "traditional" SQL looked like Klingon to you until you learned it. Then and only then did you start thinking of it as "traditional" easy stuff.
If you have access to Expert One On One Oracle - I have a chapter on them in there as well that explains them in some detail.
we could do what you need in all likelihood - what you would have to do is
a) provide create tables
b) with inserts to populate them
c) and most importantly describe how to find the begin/end records for a process. I see two possible records - "starting processx" and "starting balance update"
A simple lag() or lead() will do this easily - we just select out the records that represent start and stop and using lag - you can get the prior record attached to the current record.
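A hedged sketch of that lag()/lead() approach -- the table and column names (log_t, when_ts, description) are assumptions, since Phil did not post a create table -- keeps only the marker rows and attaches each "Completed" timestamp to the "Starting" row before it:

```sql
-- A sketch only: log_t(when_ts date, description varchar2(200)) is an
-- assumed layout for Phil's two columns, and the LIKE patterns are
-- guesses at how to find the begin/end records.
select when_ts                                          start_ts,
       end_ts,
       round((end_ts - when_ts) * 24 * 60, 1)           minutes
  from (
        select when_ts, description,
               -- the next marker row, in time order, is the matching end
               lead(when_ts) over (order by when_ts) end_ts
          from log_t
         where description like 'Starting processX%'
            or description like 'Completed balance update%'
       )
 where description like 'Starting processX%';
```

The inline view throws away everything except the start/stop markers, so lead() reliably pairs each start with the completion that follows it, provided the markers strictly alternate.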
Gints, July 21, 2006 - 3:10 pm UTC
create table a1(a number);
insert into a1 values(1);
insert into a1 values(2);
insert into a1 values(3);
insert into a1 values(4);
insert into a1 values(5);
SQL> select max (a) over (order by a)
2 from a1;
MAX(A)OVER(ORDERBYA)
--------------------
1
2
3
4
5
SQL> select min (a) over (order by a desc)
2 from a1;
MIN(A)OVER(ORDERBYADESC)
------------------------
5
4
3
2
1
SQL> select max (a) over (order by a), min (a) over (order by a desc)
2 from a1;
MAX(A)OVER(ORDERBYA) MIN(A)OVER(ORDERBYADESC)
-------------------- ------------------------
5 5
4 4
3 3
2 2
1 1
Why does max(a) over (order by a) change its appearance? :OOOO
Even when I explicitly specify the window clause, I get the same (seemingly wrong) results, although separately everything is ok:
select max (a) over (order by a rows between unbounded preceding and current row),
max (a) over (order by a desc rows between current row and unbounded following)
from a1;
gives the same....
July 23, 2006 - 7:52 am UTC
maybe when we include "a" in the query and order each query by "a" so they are presenting the data in the same order - it'll make more sense and you'll understand why the answer is in fact "correct" (remember, the rows can only be sorted ONE WAY :)
ops$tkyte%ORA10GR2> select a, max (a) over (order by a) from a1 order by a;
A MAX(A)OVER(ORDERBYA)
---------- --------------------
1 1
2 2
3 3
4 4
5 5
ops$tkyte%ORA10GR2> select a, min (a) over (order by a desc) from a1 order by a;
A MIN(A)OVER(ORDERBYADESC)
---------- ------------------------
1 1
2 2
3 3
4 4
5 5
ops$tkyte%ORA10GR2> select a, max (a) over (order by a), min (a) over (order by a desc) from a1 order by a;
A MAX(A)OVER(ORDERBYA) MIN(A)OVER(ORDERBYADESC)
---------- -------------------- ------------------------
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
The max(a) when you order by a in the over clause is the same as A
The min(a) when you order by a DESC in the over clause is the same as A
hence, for your data - max(a) over ... is the SAME as min(a) over ...
and both are the same as A!
Problem in Analytic function
Mariana, July 26, 2006 - 4:44 pm UTC
Hello,Tom,
I have a table:
create table Lines
(LINE_NO NUMBER(4),
MONTH DATE,
POINTS_NUM NUMBER(3));
INSERT INTO LINES VALUES(1001,TO_DATE('01/03/2006','DD/MM/YYYY'),25);
INSERT INTO LINES VALUES(1001,TO_DATE('01/04/2006','DD/MM/YYYY'),15);
INSERT INTO LINES VALUES(1001,TO_DATE('01/05/2006','DD/MM/YYYY'),40);
INSERT INTO LINES VALUES(1002,TO_DATE('01/03/2006','DD/MM/YYYY'),35);
INSERT INTO LINES VALUES(1002,TO_DATE('01/04/2006','DD/MM/YYYY'),12);
INSERT INTO LINES VALUES(1002,TO_DATE('01/05/2006','DD/MM/YYYY'),76);
SELECT LINE_NO,MONTH,POINTS_NUM FROM LINES
ORDER BY LINE_NO,MONTH;
LINE_NO MONTH POINTS_NUM
======= ===== ==========
1001 03/06 25
1001 04/06 15
1001 05/06 40
1002 03/06 35
1002 04/06 12
1002 05/06 76
What I really need is:
1) For each group of lines (I have 2 groups, 1001 and 1002),
from the second row of each group I have to calculate the percentage change of POINTS_NUM from the previous month, and for each group I also have to calculate the cumulative percentage change of POINTS_NUM.
It will look like:
LINE_NO MONTH POINTS_NUM PERCENTAGE_CHANGE CUMULATIVE
======= ===== ========== ================= ==========
1001 03/06 25
1001 04/06 15 (15-25)/25*100 (15-25)/25*100
1001 05/06 40 (40-15)/15*100 (40-25)/25*100
1002 03/06 35
1002 04/06 12 (12-35)/35*100 (12-35)/35*100
1002 05/06 76 (76-12)/12*100 (76-35)/35*100
and I have to calculate the total cumulative percentage:
SELECT SUM(POINTS_NUM) SUM1 FROM LINES
WHERE MONTH='01-MAR-2006'
SUM1:=25+35=60;
SELECT SUM(POINTS_NUM) SUM2 FROM LINES
WHERE MONTH='01-MAY-2006'
SUM2:=40+76=116;
TOTAL CUMULATIVE PERCENTAGE=((116-60)/60)*100;
My question is: how do I do everything I wrote above in one select, with analytic functions?
Thank you very much,
Mariana
July 26, 2006 - 5:08 pm UTC
here is your start:
ops$tkyte%ORA10GR2> select line_no, month, points_num,
2 lag(points_num) over (partition by line_no order by month) last_points_num,
3 first_value(points_num) over (partition by line_no order by month) first_points_num
4 from lines
5 /
LINE_NO MONTH POINTS_NUM LAST_POINTS_NUM FIRST_POINTS_NUM
---------- --------- ---------- --------------- ----------------
1001 01-MAR-06 25 25
1001 01-APR-06 15 25 25
1001 01-MAY-06 40 15 25
1002 01-MAR-06 35 35
1002 01-APR-06 12 35 35
1002 01-MAY-06 76 12 35
6 rows selected.
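Building on that start, one possible sketch of the finished query (taking the "cumulative" percentage as current month vs. the first month of the group, which is what Mariana's expected output shows):

```sql
-- A sketch extending the starting query above; the rounding to 2
-- decimal places is an assumption, adjust as needed.
select line_no, month, points_num,
       -- change vs. previous month (null on the first row of each group)
       round((points_num - last_points_num)  / last_points_num  * 100, 2) pct_change,
       -- change vs. first month of the group
       round((points_num - first_points_num) / first_points_num * 100, 2) cum_pct_change
  from (select line_no, month, points_num,
               lag(points_num)         over (partition by line_no order by month) last_points_num,
               first_value(points_num) over (partition by line_no order by month) first_points_num
          from lines)
 order by line_no, month;
```

The total cumulative percentage is a different "shape" of result, as Tom notes below the fold: it is a plain aggregate over two months, not an analytic per row.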
Continue to calc cumulative percentage
Mariana, July 26, 2006 - 5:12 pm UTC
Hi Tom,
Thank you for answer,
but how can I calculate the total cumulative percentage?
July 26, 2006 - 5:20 pm UTC
you already seem to have those queries coded?!?
Continue to cumulative percentage
Mariana, July 26, 2006 - 5:33 pm UTC
Sorry,i didn't understand you.
What do you mean?
July 26, 2006 - 5:42 pm UTC
you wrote the summation queries for that - union all them if you want.
You want analytics for the lag()/first_value()
you need plain old aggregation for the "cumulative percentage" - and the two results are entirely different "shapes"
don't see how they fit together really
Can you show me by example ,please, what you mean
A reader, July 26, 2006 - 5:46 pm UTC
Mariana
July 26, 2006 - 5:53 pm UTC
problem is - I wasn't sure what you meant by cumulative percent.
but your two sum queries certainly retrieve what you say you wanted? You have already written them.
....
and i have to calc total cumulative percentage:
SELECT SUM(POINTS_NUM) SUM1 FROM LINES
WHERE MONTH='01-MAR-2006'
SUM1:=25+35=60;
SELECT SUM(POINTS_NUM) SUM2 FROM LINES
WHERE MONTH='01-MAY-2006'
SUM2:=40+76=116;
TOTAL CUMULATIVE PERCENTAGE=((116-60)/60)*100;
..........
Analytic Query with other functions
A reader, July 26, 2006 - 5:52 pm UTC
Hi Tom,
How do i write the following query:
select 1, max(decode(dummy,null,'NN',null)) D,
max(dummy) over (partition by dummy order by dummy desc) S
from dual
group by 1;
I get the following error:
SQL> /
max(dummy) over (partition by dummy order by dummy desc) S
*
ERROR at line 2:
ORA-00979: not a GROUP BY expression
Do not have access to Scott Schema, so used dual instead.
Oracle version: 9.2.0.7
Thanks as always
July 26, 2006 - 6:00 pm UTC
why are you grouping by.
I cannot tell you how to write a query when you haven't told me what question you are trying to ask.
you don't use group by with analytics, you use group by with aggregates.
Sorry for the improper posting
A reader, July 27, 2006 - 10:58 am UTC
Hi Tom,
sorry for the question.
Well the logic is like this;
I have a query on similar lines as follows:
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from tab_a a,
tab_b b,
tab_c c
where
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
and a.dt = (select max(in_a.dt) from tab_a in_a
where in_a.col1 = a.col1
and in_a.dt = a.dt)
and b.col2 = (select max(in_b.col2) from tab_b in_b
where in_b.col1 = b.col1
and in_b.col2 = b.col2)
group by a.col1, b.col2, c.col3, a.dt;
I want to know how can i rewrite that query using Analytic Functions or in a alternate manner.
Thanks as always.
July 27, 2006 - 12:16 pm UTC
instead of using tab_a in the from list use:
(select *
from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
where dt = max_dt
)
(I assume you meant to use col1, col2 not col1, dt in the subquery)
do the same for b
and drop the correlated subquery from the where clause
Thanks for the quick response.
A reader, July 27, 2006 - 1:56 pm UTC
Hi Tom,
Dropping the correlated query was the idea. But I did not know how to reformat the query.
So as per your suggestion it should be as follows:
Original Query:
----------------
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from tab_a a,
tab_b b,
tab_c c
where
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
and a.dt = (select max(in_a.dt) from tab_a in_a
where in_a.col1 = a.col1
and in_a.dt = a.dt)
and b.col2 = (select max(in_b.col2) from tab_b in_b
where in_b.col1 = b.col1
and in_b.col2 = b.col2)
group by a.col1, b.col2, c.col3, a.dt;
Your Suggestion:
----------------
instead of using tab_a in the from list use:
(select *
from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
where dt = max_dt
)
Query Rewritten as:
--------------------
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from (select *
from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
where dt = max_dt
) a,
(select *
from (select b.*, max(col2) over (partition by col1, col2) max_dt from tab_b b)
where dt = max_dt
) b,
tab_c c
where
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
group by a.col1, b.col2, c.col3, a.dt;
Please correct me if I am wrong.
Thanks as always
July 27, 2006 - 2:28 pm UTC
that is correct, all you want is the "most current record by col1, col2"
that analytic gets that record.
second max would be max(dt), not col2
What in case...
A reader, July 28, 2006 - 3:59 pm UTC
Hi Tom,
With reference to the question above:
What should be the approach if i have to do a max(col) on 2 or more cols from the same table.
can you explain with an example using the same query.
Thanks in advance.
July 28, 2006 - 8:53 pm UTC
give me a test case to work with - not sure what you want to do with the max of these two columns
RBO and Analytical Functions
Muhammad Riaz Shahid, July 30, 2006 - 4:42 am UTC
Tom,
Can we use analytical functions while using RBO?
SQL> create table x(a number);
Table created.
SQL> select row_number() over(order by 1) from x;
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 WINDOW (NOSORT)
2 1 TABLE ACCESS (FULL) OF 'X'
DB Version 9.2.0.5.0
July 30, 2006 - 8:42 am UTC
you just did.
Analytics and RBO
Jonathan Lewis, July 31, 2006 - 5:44 am UTC
You need to be a little careful with RBO and analytics. RBO is allowed to use an index to implement an "order by" clause without sorting - but if you include an analytic function in the select list the run-time engine may have to sort the data to partition or order by inside the analytic call - leaving the data in the wrong order ... RBO does not know about everything that can happen inside analytics.
I think there's a note about this, with an example, on my website.
A reader, August 05, 2006 - 1:50 am UTC
Hi Tom,
I have the following analytic query which works fine, but I have one little problem with it.
I need to separate how the query counts the values.
The Code column can only be A or B; I use this column for my partitioning clause.
The columns pos1, pos2, pos3 ... pos41 can only be 0, 1, or 2:
0 normally counts as 0
1 normally counts as 1
2 normally counts as 1
Now the problem: the 0 and 1 values can be calculated in one query to get a result,
but the columns with the value 2 cannot be calculated within the same query.
I must have two results: one for the values 1 and 0, and one for the values 2 and 0.
17/07/2006;05:59:58;17/07/2006;0388;XXXXXX1;A;1;2;0;0
17/07/2006;06:00:05;17/07/2006;1242;XXXXXX2;A;1;2;1;0
17/07/2006;06:00:11;17/07/2006;1146;XXXXXX3;A;2;1;2;0
24/07/2006;05:59:58;24/07/2006;0388;XXXXXX4;B;2;1;0;0
24/07/2006;06:00:05;24/07/2006;1242;XXXXXX5;B;1;2;1;0
24/07/2006;06:00:11;24/07/2006;1146;XXXXXX6;B;1;2;2;0
for the values with 0,1 the result should look like this:
C POS1 POS2 POS3 POS3
- ---- ---- ---- ----
A 2 1 1 0
B 2 1 1 0
for the values with 0,2 the result should look like this:
C POS1 POS2 POS3 POS3
- ---- ---- ---- ----
A 1 2 1 0
B 1 2 1 0
select distinct code,pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9,pos10,pos11,pos12,
pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24
from (
select vin_code,
sum(pos1) over (partition by code order by code) as pos1,
sum(pos2) over (partition by code order by code) as pos2,
sum(pos3) over (partition by code order by code) as pos3,
sum(pos1) over (partition by code order by code) as pos4,
sum(pos2) over (partition by code order by code) as pos5,
sum(pos1) over (partition by code order by code) as pos6,
sum(pos2) over (partition by code order by code) as pos7,
sum(pos3) over (partition by code order by code) as pos8,
sum(pos1) over (partition by code order by code) as pos9,
sum(pos2) over (partition by code order by code) as pos10,
sum(pos3) over (partition by code order by code) as pos11,
sum(pos1) over (partition by code order by code) as pos12,
sum(pos2) over (partition by code order by code) as pos13,
sum(pos3) over (partition by code order by code) as pos14,
sum(pos1) over (partition by code order by code) as pos15,
sum(pos2) over (partition by code order by code) as pos16,
sum(pos3) over (partition by code order by code) as pos17,
sum(pos1) over (partition by code order by code) as pos18,
sum(pos2) over (partition by code order by code) as pos19,
sum(pos3) over (partition by code order by code) as pos20,
sum(pos1) over (partition by code order by code) as pos21,
sum(pos2) over (partition by code order by code) as pos22,
sum(pos3) over (partition by code order by code) as pos23,
sum(pos1) over (partition by code order by code) as pos24
from monitoring)
group by code,pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9,pos10,pos11,pos12,
pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24;
So I hope you have an idea ;-)
Regards
Marcel
August 05, 2006 - 10:50 am UTC
no idea, didn't follow this logic at all.
And I see no table creates, no inserts.....
Question review
Marcel, August 07, 2006 - 1:31 am UTC
Hi Tom,
I will try it again without errors in the question !
First the create table and some inserts
create table vin_check
(
seq number(4),
vin_code varchar2(10),
system varchar2(1),
pos1 number(1),
pos2 number(1),
pos3 number(1)
);
insert into vin_check values(0001,'XXXXXXXX01','A',1,1,1);
insert into vin_check values(0002,'XXXXXXXX02','A',0,1,2);
insert into vin_check values(0003,'XXXXXXXX03','A',2,2,0);
insert into vin_check values(0004,'XXXXXXXX04','A',1,0,1);
insert into vin_check values(0005,'XXXXXXXX05','B',1,1,1);
insert into vin_check values(0006,'XXXXXXXX06','B',0,1,2);
insert into vin_check values(0006,'XXXXXXXX06','B',2,2,0);
insert into vin_check values(0007,'XXXXXXXX07','B',1,0,1);
The problem I have is the following: I must calculate the errors not all together,
but separated by system; the errors are summed, and that sum is the error value.
So I took system as the partition column (query below).
Here is what the numbers mean:
0 is no error,
1 internal error,
2 external error.
SEQ VIN_CODE S POS1 POS2 POS3
---------- ---------- - ---------- ---------- ----------
1 XXXXXXXX01 A 1 1 1
2 XXXXXXXX02 A 0 1 0
3 XXXXXXXX03 A 2 2 0
4 XXXXXXXX04 A 2 0 1
5 XXXXXXXX05 B 1 1 1
6 XXXXXXXX06 B 0 1 2
6 XXXXXXXX06 B 2 2 0
7 XXXXXXXX07 B 0 0 2
select distinct system,pos1,pos2,pos3 from (
select system,
sum(pos1) over (partition by system order by system) as pos1,
sum(pos2) over (partition by system order by system) as pos2,
sum(pos3) over (partition by system order by system) as pos3
from vin_check)
group by system,pos1,pos2,pos3;
S POS1 POS2 POS3
- ---------- ---------- ----------
A 5 4 2
B 3 4 5
This would be the result with the query above, but I need a different result:
the errors do not have the same weight (for me) and must be treated separately. So I need to calculate the
errors with value 1 and the errors with value 2 separately.
If I run the query for errors with the value 1, the result should look like this:
S POS1 POS2 POS3
- ---------- ---------- ----------
A 1 2 2
B 1 2 1
If I run the query for errors with the value 2, the result should look like this:
S POS1 POS2 POS3
- ---------- ---------- ----------
A 2 1 0
B 1 1 2
Thanks Marcel
August 07, 2006 - 8:00 am UTC
this:
select distinct system,pos1,pos2,pos3 from (
select system,
sum(pos1) over (partition by system order by system) as pos1,
sum(pos2) over (partition by system order by system) as pos2,
sum(pos3) over (partition by system order by system) as pos3
from vin_check)
group by system,pos1,pos2,pos3;
is just wrong, you use analytics when you DON'T want to aggregate, just aggregate otherwise:
select system, sum(pos1), sum(pos2), sum(pos3) from vin_check group by system;
that is all that query should ever be.
Now, it would be nice if your supplied example data (create+inserts) matched your text example!!! but they don't, we'll make do:
tkyte%ORCL> select * from vin_check;
SEQ VIN_CODE S POS1 POS2 POS3
---- ---------- - ---- ---- ----
1 XXXXXXXX01 A 1 1 1
2 XXXXXXXX02 A 0 1 2
3 XXXXXXXX03 A 2 2 0
4 XXXXXXXX04 A 1 0 1
5 XXXXXXXX05 B 1 1 1
6 XXXXXXXX06 B 0 1 2
6 XXXXXXXX06 B 2 2 0
7 XXXXXXXX07 B 1 0 1
8 rows selected.
tkyte%ORCL> select system,
2 sum(pos1) pos1_sum,
3 sum(pos2) pos2_sum,
4 sum(pos3) pos3_sum,
5 count(case when pos1=1 then 1 end) "pos1=1",
6 count(case when pos2=1 then 1 end) "pos2=1",
7 count(case when pos3=1 then 1 end) "pos3=1",
8 count(case when pos1=2 then 1 end) "pos1=2",
9 count(case when pos2=2 then 1 end) "pos2=2",
10 count(case when pos3=2 then 1 end) "pos3=2"
11 from vin_check
12 group by system;
S POS1_SUM POS2_SUM POS3_SUM pos1=1 pos2=1 pos3=1 pos1=2 pos2=2 pos3=2
- -------- -------- -------- ------ ------ ------ ------ ------ ------
A 4 4 4 2 2 2 1 1 1
B 4 4 4 2 2 2 1 1 1
that gets it all at once, better than three queries, but if you need three queries, use the columns from above - and a simple group by.
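For example, one of the three single-purpose queries (errors with value 1) might look like this; a sketch built from the same CASE columns used above:

```sql
-- Per-system counts of cells equal to 1 -- a plain GROUP BY,
-- no analytics needed.
select system,
       count(case when pos1 = 1 then 1 end) pos1,
       count(case when pos2 = 1 then 1 end) pos2,
       count(case when pos3 = 1 then 1 end) pos3
from   vin_check
group  by system;
```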
Again confusion.....
Vinayak Awasthi, August 15, 2006 - 9:47 am UTC
Hi Tom,
Just started reading the Analytics chapter in your book and got a bit puzzled. See the following queries:
<<QUERY1>>
select sum(sal) over(order by ename,deptno) from emp
<<QUERY1>>
<<QUERY2>>
select sum(sal) over(order by deptno,ename) from emp
<<QUERY2>>
<<QUERY3>>
select sum(sal) over(order by deptno,ename),sum(sal) over(order by ename,deptno) from emp
<<QUERY3>>
<<QUERY4>>
select sum(sal) over(order by ename,deptno), sum(sal) over(order by deptno,ename) from emp
<<QUERY4>>
The results from query1 and query2 are obvious and understandable. But query3 and query4 stumped me. I am not able to understand how this result set is arrived at. Moreover, query3 and query4 are exactly the same with just the two columns interchanged, yet the results differ considerably.
Tom, can you please explain this behaviour of analytics.
Thanks
August 15, 2006 - 12:18 pm UTC
if you order the results consistently, you'll find 3 and 4 are identical with the columns reversed:
scott%ORA9IR2> select sum(sal) over(order by deptno,ename),sum(sal) over(order by ename,deptno)
2 from emp
3 order by rowid
4 /
SUM(SAL)OVER(ORDERBYDEPTNO,ENAME) SUM(SAL)OVER(ORDERBYENAME,DEPTNO)
--------------------------------- ---------------------------------
19625 26275
21225 2700
29025 29025
15825 14925
26275 21175
24075 5550
2450 8000
18825 25475
7450 19925
27775 27775
9850 1100
25025 11950
12850 11000
8750 22475
14 rows selected.
scott%ORA9IR2> select sum(sal) over(order by ename,deptno), sum(sal) over(order by
2 deptno,ename) from emp
3 order by rowid
4 /
SUM(SAL)OVER(ORDERBYENAME,DEPTNO) SUM(SAL)OVER(ORDERBYDEPTNO,ENAME)
--------------------------------- ---------------------------------
26275 19625
2700 21225
29025 29025
14925 15825
21175 26275
5550 24075
8000 2450
25475 18825
19925 7450
27775 27775
1100 9850
11950 25025
11000 12850
22475 8750
14 rows selected.
Addendum to my question above
Vinayak Awasthi, August 15, 2006 - 10:17 am UTC
Tom,
In addition to the above question, I wanted to ask how ORDER BY is processed by Oracle in the case of analytics. If the query does not have an outer ORDER BY, but the analytic columns contain different ORDER BY clauses, which ORDER BY does Oracle actually pick to generate the final result set? From running different queries, I believe it is the ORDER BY of the last analytic column in the select.
Please advise.
August 15, 2006 - 12:19 pm UTC
unless and until you have an order by on your SQL statement - please make no assumptions about the order of rows, you cannot.
until you have order by, there is no order.
Things not clear !!!!
Vinayak Awasthi, August 16, 2006 - 5:32 am UTC
Tom,
I agree that if we order the results consistently, then the results of both queries are the same, just reversed. But my confusion is about the case where we do not provide any ORDER BY clause. If you look at my query3 and query4, I have not given any ORDER BY clause in the SQL, so what causes Oracle to return the result sets differently? Should it not give the same results in both queries, with the columns reversed?
Please guide.
August 16, 2006 - 8:36 am UTC
if you do not give an order by
YOU CANNOT HAVE ANY EXPECTATION ON THE ORDER OF ROWS.
period.
the result sets are exactly the same - with or without the order by. They are just returned with the rows in different orders.
Thanks Tom...
Vinayak Awasthi, August 16, 2006 - 10:17 am UTC
Thanks Tom,
After running the queries multiple times, I understood what you were trying to explain.
Thanks once again for your time and guidance.
getting wrong answer
Vinayak, August 21, 2006 - 9:19 am UTC
Hi Tom,
I have the following table with the set of data. This is just a short version of the actual table. The actual table has around 13 million records.
CREATE TABLE MY_TEST_TABLE(ID NUMBER,KEY NUMBER,TYPEOFRECORD VARCHAR2(20BYTE),
MYDATE DATE);
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values(6366556, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(6366516, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(6366565, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(6366568, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(7076940, 404887, 'CE', TO_DATE('11/04/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18197564, 404887, 'CE', TO_DATE('08/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(17561339, 404887, 'CE', TO_DATE('05/05/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18381063, 404887, 'CE', TO_DATE('05/19/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18381260, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18386869, 404887, 'CE', TO_DATE('06/10/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18895620, 404887, 'CE', TO_DATE('06/10/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(17769950, 404887, 'CE', TO_DATE('05/06/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18096803, 404887, 'CE', TO_DATE('05/19/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18381262, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE(ID, KEY, TYPEOFRECORD, MYDATE)
Values
(18381270, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
My requirement is to get, for a given KEY value, the row with the highest ID value along with its TYPEOFRECORD. So I wrote this query:
select KEY, max(ID) over (partition by KEY) myid, TYPEOFRECORD
from MY_TEST_TABLE;
but this query returns as many records as there are rows for a particular KEY value.
But if I write this query it gives the correct result:
select tt1.key,tt1.id,tt1.MYDATE,tt1.TYPEOFRECORD
from MY_TEST_TABLE tt1 where (tt1.KEY,tt1.ID) in
(select tt.KEY, max(tt.ID) myid
from MY_TEST_TABLE tt
group by tt.key)
What am I doing wrong in the analytic query? I wanted to get the result through analytics rather than through the traditional GROUP BY clause.
Thanks
Vinayak
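One way to collapse to a single row per KEY without the IN-subquery is MAX() KEEP (DENSE_RANK LAST). This is a sketch against the table above, not the only solution:

```sql
-- One row per KEY: the TYPEOFRECORD and MYDATE belonging to the row
-- with the highest ID.  Plain aggregation, so no duplicate rows.
select key,
       max(id) myid,
       max(typeofrecord) keep (dense_rank last order by id) typeofrecord,
       max(mydate)       keep (dense_rank last order by id) mydate
from   my_test_table
group  by key;
```

The analytic form returns one output row per input row by design; to get one row per group you need GROUP BY (or a filter such as ROW_NUMBER() = 1 on top of the analytic query).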
Help required on one scenario related to total promotion calculation
Hardik Panchal, September 30, 2006 - 7:00 am UTC
Hi Tom,
I have to calculate promotion in percentage (if % off given, keep the value as it is, if amount off given, convert it to percentage) per customer and item id in one base table and then need to fill summary table based on the base table.
For example,
CUSTOMR: 01
ITEM: 123456
BASE-SDV: 100
Abbreviations used:
PKG: Package Number
GRP: Group Number
EVE: Event Number
SEQ: Sequence Number
Base-SDV: synonym to Item Price. SDV = Store Door Value
Case 1 (Same package):
Promotion data defined:
PKG GRP EVE %off Amt.off SEQ
01 01 01 0 10 01
01 01 02 10 0 02
Calculation required:
For the same package and group-customer-product, the first event is applied to the Base-SDV (100 - 10 = 90). After that, the second event is applied to the Base-SDV already reduced by the first event (90 - 10% of 90 = 81).
Data to fill in Base-Table:
PKG GRP EVE %off Amt.off Discount SEQ
01 01 01 0 10 10% 01
01 01 02 10 0 9% 02 *
* Although 10% discount is defined, actually 9% applied
Data to fill in Summary Table:
Customer ITEM TotalPromo
01 123456 19%
Case 2 (Different Package):
PKG GRP EVE %off Amt.off SEQ
01 01 01 0 10 01
02 01 02 10 0 02
Calculation required:
For a different package and group-customer-product, the event in the 1st package is applied to the Base-SDV (100 - 10 = 90). After that, the event of the 2nd package is applied to the original item Base-SDV, without deducting any previous promotion discount (90 - 10% of 100 = 80).
Data to fill in Base-Table:
PKG GRP EVE %off Amt.off Discount SEQ
01 01 01 0 10 10% 01
02 01 01 10 0 10% 02
Data to fill in Summary Table:
Customer ITEM TotalPromo
01 123456 20%
Quick reply is appreciated !
Thanks in Advance :-)
Hardik Panchal
September 30, 2006 - 8:11 am UTC
no create
no inserts
no look
don't know if it can be done, did not really look
Not getting your comments
Hardik Panchal, September 30, 2006 - 9:12 am UTC
Do you mean to say it doesn't look possible or you require more information like table structures , insert stmts.
Thanks
October 01, 2006 - 12:28 am UTC
correct, without a table create and insert intos - I'm not going to be able to test a sql solution and therefore could not look at it.
analytical Functions
Maulesh Jani, September 30, 2006 - 9:30 am UTC
Hi ,
Sorry for the last mail.
LOGIC:
This table has the base data for customers' product promotion information. Now, on the basis of this table, I have to create a replica table that shows the information as a cumulative percentage of promotion.
Like
cust item event price percent-promotion
01 01 01 100 10
01 01 02 100 5
requested output table will contain one more column
cust item event price percent-promotion cumu-promo
01 01 01 100 10 10
01 01 02 100 5 15
I just want to know whether analytic functions can be used for such a scenario. If yes, can you please give me an example?
Thanks and Pl. guide
Regards
Maulesh Jani
October 01, 2006 - 12:30 am UTC
sum(percent_promotion) over (partition by YOUR_GROUP order by event)
I think your group might be (cust, item), but I'm not sure, as the example is "not very example-ish".
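Spelled out against the sample rows in the question, the suggestion might look like this (a sketch; the table name promotions and the column names are taken from the example text, not a real schema):

```sql
-- Running total of the promotion percentage per (cust, item),
-- accumulated in event order: 10, then 10 + 5 = 15.
select cust,
       item,
       event,
       price,
       percent_promotion,
       sum(percent_promotion)
         over (partition by cust, item order by event) cumu_promo
from   promotions;
```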
Details Information Provided
Hardik Panchal, September 30, 2006 - 10:22 am UTC
I have following table:
CREATE TABLE TEMP_PROMO_CALC_2
(
CTRY_CODE CHAR(3 BYTE) NOT NULL,
CO_CODE CHAR(3 BYTE) NOT NULL,
CUST_NBR NUMBER NOT NULL,
ITEM_ID CHAR(20 BYTE),
EVENT_NBR NUMBER NOT NULL,
PROMO_SEQ_NBR NUMBER(2),
EVENT_ALLOW_PCT NUMBER(5,4),
EVENT_ALLOW_AMT NUMBER,
BASE_PRICE NUMBER,
NEW_PRICE NUMBER,
PERCENT_DIC NUMBER
)
Having following data in this table:
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE ) VALUES (
'138', '002', 1, '100', 1, 1, 4, 0, 100, NULL);
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE ) VALUES (
'138', '002', 1, '100', 5, 2, 0, 20, 100, NULL);
COMMIT;
Here, EVENT_ALLOW_PCT is the % discount, while EVENT_ALLOW_AMT is the discount amount in $.
I want following things:
- First convert all the promotions in % form
- Apply the promotion in the seq. mentioned by PROMO_SEQ_NBR
In above example,
- If 4% off (event seq 1, so it is applied first) is given on a $100 price, the NEW_PRICE value after applying this promotion will be $96. The PERCENT_DIC value will be 4% for PROMO_SEQ_NBR 1.
- Then a $20 amount-off promotion is given. This $20 needs to be converted into a % off: the $20 is deducted from the $96 (the amount calculated after applying the promotions in PROMO_SEQ_NBR order, i.e. the 1st seq. is 4% off and the 2nd seq. is $20 amount off). In this example, the PERCENT_DIC for this 2nd seq. will be 20.83% [calculated as (20*100)/96] and the NEW_PRICE field value will be $76.
Desired Result:
CUST_NBR ITEM_ID EVENT_NBR PROMO_SEQ_NBR EVENT_ALLOW_PCT
--------- -------- --------- ------------- ----------------
1 100 1 1 4
1 100 5 2 0
EVENT_ALLOW_AMT BASE_PRICE NEW_PRICE PERCENT_DIC
---------------- ---------- --------- -----------
0 100 96 4
20 100 76 20.83
I believe this can be achieved with the analytic functions !
Regards,
Hardik Panchal
October 01, 2006 - 12:41 am UTC
ok, so, what happens with other rows, what are the groups here, is event_nbr there just to 'confuse' (is it used for anything in the example or just an extra bit)
start with this I guess
ops$tkyte%ORA10GR2> select cust_nbr, item_id, promo_seq_nbr,
2 event_allow_pct, event_allow_amt,
3 sum(pr * (100-event_allow_pct)/100 - event_allow_amt) over (partition by cust_nbr, item_id
4 order by promo_seq_nbr) new_pr
5 from (
6 select cust_nbr, item_id, promo_seq_nbr,
7 event_allow_pct, event_allow_amt,
8 case when row_number() over (partition by cust_nbr, item_id
9 order by promo_seq_nbr) = 1 then base_price else 0 end pr
10 from temp_promo_calc_2
11 )
12 /
CUST_NBR ITE PROMO_SEQ_NBR EVENT_ALLOW_PCT EVENT_ALLOW_AMT NEW_PR
---------- --- ------------- --------------- --------------- ----------
1 100 1 4 0 96
1 100 2 0 20 76
Analytical Question
Hardik Panchal, October 01, 2006 - 2:31 am UTC
Hi TOM,
Thanks for your help. Yes, you are right that EVENT_NBR is not used in the calculation. Your query helped me, but when I tried it on other data it returned the wrong result. I am trying to adapt your query for the desired output, but I still can't get it.
Here are the insert statements for the base table data:
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES (
'138', '002', 1, '100 ', 1, 1, 4, 0, 100, NULL, NULL);
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES (
'138', '002', 1, '100 ', 5, 2, 0, 20, 100, NULL, NULL);
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES (
'138', '002', 1, '100 ', 3, 3, 5, 0, 100, NULL, NULL);
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES (
'138', '002', 2, '200 ', 4, 1, 0, 10, 100, NULL, NULL);
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES (
'138', '002', 2, '200 ', 5, 2, 5, 0, 100, NULL, NULL);
COMMIT;
When I apply your query to this data, it does not show the correct output for NEW_PR (that is, in the case where a customer has 3 promotions on the same product, or where the amount-off and percent-off promotions come in a different order).
The query works well only in the case where the % off is first and the amount off second.
October 01, 2006 - 3:06 am UTC
"ur"????? what is that.
you'd best supply a really good, detailed specification. eg: explain precisely, in such detail that someone could write code from your specification, even if they had never seen your data before and didn't really know your data.
eg: that is what we need here. so, specify it out in really good detail, as if you were going to hire a consultant to write code from your specification.
Over?
Anderson Haertel Rodrigues, October 03, 2006 - 6:12 pm UTC
Hello All,
Oracle 9.2.0.5.0
My doubt is as follows (please):
CREATE GLOBAL TEMPORARY TABLE ETMPTESTE
(
CdPerfilAgrupamento integer,
CdFuncAgrupamento integer,
CdOperacao integer,
UO char(1),
NatVinc char(1),
RelTrab char(1),
Regiao char(1),
RegTrab char(1),
SitPrev char(1),
Carreira char(1),
Cargo char(1),
Funcao char(1)
)
ON COMMIT PRESERVE ROWS;
insert into ETMPTESTE
(
CdPerfilAgrupamento,
CdFuncAgrupamento,
CdOperacao,
UO,
NatVinc,
RelTrab,
Regiao,
RegTrab,
SitPrev,
Carreira,
Cargo,
Funcao
)
values
(
1,85,1,'N','E','T','E','N','N','N','N','N'
);
insert into ETMPTESTE
(
CdPerfilAgrupamento,
CdFuncAgrupamento,
CdOperacao,
UO,
NatVinc,
RelTrab,
Regiao,
RegTrab,
SitPrev,
Carreira,
Cargo,
Funcao
)
values
(
2,85,1,'E','N','E','N','E','N','N','E','N'
);
insert into ETMPTESTE
(
CdPerfilAgrupamento,
CdFuncAgrupamento,
CdOperacao,
UO,
NatVinc,
RelTrab,
Regiao,
RegTrab,
SitPrev,
Carreira,
Cargo,
Funcao
)
values
(
3,85,1,'T','T','T','T','T','T','T','T','T'
);
insert into ETMPTESTE
(
CdPerfilAgrupamento,
CdFuncAgrupamento,
CdOperacao,
UO,
NatVinc,
RelTrab,
Regiao,
RegTrab,
SitPrev,
Carreira,
Cargo,
Funcao
)
values
(
4,85,1,'E','E','E','E','E','E','E','E','E'
);
My Query:
select FA,
Op,
UOAbrangencia,
NatVincAbrangencia,
RelTrabAbrangencia,
RegiaoAbrangencia,
RegTrabAbrangencia,
SitPrevAbrangencia,
CEFAbrangencia,
CCOAbrangencia,
FUCAbrangencia
from
(
select FA,
Op,
CASE WHEN UO = 'T' AND UOLead = 'T' THEN 'T'
WHEN UO = 'T' AND UOLead = 'E' THEN 'T'
WHEN UO = 'T' AND UOLead = 'N' THEN 'T'
WHEN UO = 'E' AND UOLead = 'T' THEN 'T'
WHEN UO = 'E' AND UOLead = 'E' THEN 'E'
WHEN UO = 'E' AND UOLead = 'N' THEN 'E'
WHEN UO = 'N' AND UOLead = 'T' THEN 'T'
WHEN UO = 'N' AND UOLead = 'E' THEN 'E'
WHEN UO = 'N' AND UOLead = 'N' THEN 'N'
END UOAbrangencia,
CASE WHEN NatVinc = 'T' AND NatVincLead = 'T' THEN 'T'
WHEN NatVinc = 'T' AND NatVincLead = 'E' THEN 'T'
WHEN NatVinc = 'T' AND NatVincLead = 'N' THEN 'T'
WHEN NatVinc = 'E' AND NatVincLead = 'T' THEN 'T'
WHEN NatVinc = 'E' AND NatVincLead = 'E' THEN 'E'
WHEN NatVinc = 'E' AND NatVincLead = 'N' THEN 'E'
WHEN NatVinc = 'N' AND NatVincLead = 'T' THEN 'T'
WHEN NatVinc = 'N' AND NatVincLead = 'E' THEN 'E'
WHEN NatVinc = 'N' AND NatVincLead = 'N' THEN 'N'
END NatVincAbrangencia,
CASE WHEN RelTrab = 'T' AND RelTrabLead = 'T' THEN 'T'
WHEN RelTrab = 'T' AND RelTrabLead = 'E' THEN 'T'
WHEN RelTrab = 'T' AND RelTrabLead = 'N' THEN 'T'
WHEN RelTrab = 'E' AND RelTrabLead = 'T' THEN 'T'
WHEN RelTrab = 'E' AND RelTrabLead = 'E' THEN 'E'
WHEN RelTrab = 'E' AND RelTrabLead = 'N' THEN 'E'
WHEN RelTrab = 'N' AND RelTrabLead = 'T' THEN 'T'
WHEN RelTrab = 'N' AND RelTrabLead = 'E' THEN 'E'
WHEN RelTrab = 'N' AND RelTrabLead = 'N' THEN 'N'
END RelTrabAbrangencia,
CASE WHEN Regiao = 'T' AND RegiaoLead = 'T' THEN 'T'
WHEN Regiao = 'T' AND RegiaoLead = 'E' THEN 'T'
WHEN Regiao = 'T' AND RegiaoLead = 'N' THEN 'T'
WHEN Regiao = 'E' AND RegiaoLead = 'T' THEN 'T'
WHEN Regiao = 'E' AND RegiaoLead = 'E' THEN 'E'
WHEN Regiao = 'E' AND RegiaoLead = 'N' THEN 'E'
WHEN Regiao = 'N' AND RegiaoLead = 'T' THEN 'T'
WHEN Regiao = 'N' AND RegiaoLead = 'E' THEN 'E'
WHEN Regiao = 'N' AND RegiaoLead = 'N' THEN 'N'
END RegiaoAbrangencia,
CASE WHEN RegTrab = 'T' AND RegTrabLead = 'T' THEN 'T'
WHEN RegTrab = 'T' AND RegTrabLead = 'E' THEN 'T'
WHEN RegTrab = 'T' AND RegTrabLead = 'N' THEN 'T'
WHEN RegTrab = 'E' AND RegTrabLead = 'T' THEN 'T'
WHEN RegTrab = 'E' AND RegTrabLead = 'E' THEN 'E'
WHEN RegTrab = 'E' AND RegTrabLead = 'N' THEN 'E'
WHEN RegTrab = 'N' AND RegTrabLead = 'T' THEN 'T'
WHEN RegTrab = 'N' AND RegTrabLead = 'E' THEN 'E'
WHEN RegTrab = 'N' AND RegTrabLead = 'N' THEN 'N'
END RegTrabAbrangencia,
CASE WHEN SitPrev = 'T' AND SitPrevLead = 'T' THEN 'T'
WHEN SitPrev = 'T' AND SitPrevLead = 'E' THEN 'T'
WHEN SitPrev = 'T' AND SitPrevLead = 'N' THEN 'T'
WHEN SitPrev = 'E' AND SitPrevLead = 'T' THEN 'T'
WHEN SitPrev = 'E' AND SitPrevLead = 'E' THEN 'E'
WHEN SitPrev = 'E' AND SitPrevLead = 'N' THEN 'E'
WHEN SitPrev = 'N' AND SitPrevLead = 'T' THEN 'T'
WHEN SitPrev = 'N' AND SitPrevLead = 'E' THEN 'E'
WHEN SitPrev = 'N' AND SitPrevLead = 'N' THEN 'N'
END SitPrevAbrangencia,
CASE WHEN Carreira = 'T' AND CarreiraLead = 'T' THEN 'T'
WHEN Carreira = 'T' AND CarreiraLead = 'E' THEN 'T'
WHEN Carreira = 'T' AND CarreiraLead = 'N' THEN 'T'
WHEN Carreira = 'E' AND CarreiraLead = 'T' THEN 'T'
WHEN Carreira = 'E' AND CarreiraLead = 'E' THEN 'E'
WHEN Carreira = 'E' AND CarreiraLead = 'N' THEN 'E'
WHEN Carreira = 'N' AND CarreiraLead = 'T' THEN 'T'
WHEN Carreira = 'N' AND CarreiraLead = 'E' THEN 'E'
WHEN Carreira = 'N' AND CarreiraLead = 'N' THEN 'N'
END CEFAbrangencia,
CASE WHEN Cargo = 'T' AND CargoLead = 'T' THEN 'T'
WHEN Cargo = 'T' AND CargoLead = 'E' THEN 'T'
WHEN Cargo = 'T' AND CargoLead = 'N' THEN 'T'
WHEN Cargo = 'E' AND CargoLead = 'T' THEN 'T'
WHEN Cargo = 'E' AND CargoLead = 'E' THEN 'E'
WHEN Cargo = 'E' AND CargoLead = 'N' THEN 'E'
WHEN Cargo = 'N' AND CargoLead = 'T' THEN 'T'
WHEN Cargo = 'N' AND CargoLead = 'E' THEN 'E'
WHEN Cargo = 'N' AND CargoLead = 'N' THEN 'N'
END CCOAbrangencia,
CASE WHEN Funcao = 'T' AND FuncaoLead = 'T' THEN 'T'
WHEN Funcao = 'T' AND FuncaoLead = 'E' THEN 'T'
WHEN Funcao = 'T' AND FuncaoLead = 'N' THEN 'T'
WHEN Funcao = 'E' AND FuncaoLead = 'T' THEN 'T'
WHEN Funcao = 'E' AND FuncaoLead = 'E' THEN 'E'
WHEN Funcao = 'E' AND FuncaoLead = 'N' THEN 'E'
WHEN Funcao = 'N' AND FuncaoLead = 'T' THEN 'T'
WHEN Funcao = 'N' AND FuncaoLead = 'E' THEN 'E'
WHEN Funcao = 'N' AND FuncaoLead = 'N' THEN 'N'
END FUCAbrangencia
from
( select CdFuncAgrupamento FA,
CdOperacao Op,
UO,
LEAD(UO) over (order by CdFuncAgrupamento, cdOperacao ) UOLead,
NatVinc,
LEAD(NatVinc) over (order by CdFuncAgrupamento, cdOperacao ) NatVincLead,
RelTrab,
LEAD(RelTrab) over (order by CdFuncAgrupamento, cdOperacao ) RelTrabLead,
Regiao,
LEAD(Regiao) over (order by CdFuncAgrupamento, cdOperacao ) RegiaoLead,
RegTrab,
LEAD(RegTrab) over (order by CdFuncAgrupamento, cdOperacao ) RegTrabLead,
SitPrev,
LEAD(SitPrev) over (order by CdFuncAgrupamento, cdOperacao ) SitPrevLead,
Carreira,
LEAD(Carreira) over (order by CdFuncAgrupamento, cdOperacao ) CarreiraLead,
Cargo,
LEAD(Cargo) over (order by CdFuncAgrupamento, cdOperacao ) CargoLead,
Funcao,
LEAD(Funcao) over (order by CdFuncAgrupamento, cdOperacao ) FuncaoLead
from ETMPTESTE
)
)
where UOAbrangencia is not null
group by FA,
Op,
UOAbrangencia,
NatVincAbrangencia,
RelTrabAbrangencia,
RegiaoAbrangencia,
RegTrabAbrangencia,
SitPrevAbrangencia,
CEFAbrangencia,
CCOAbrangencia,
FUCAbrangencia
/
The result is:
FA OP U N R R R S C C F
--- -- - - - - - - - - -
85 1 E E T E E N N E N
85 1 T T T T T T T T T
But, desired:
FA OP U N R R R S C C F
--- -- - - - - - - - - -
85 1 T T T T T T T T T
Thanks All
October 03, 2006 - 7:28 pm UTC
"sorry"????
I have no idea why you expected what you expected.
or - WHAT YOU EXPECTED at all.
I see lots of code that quite simply does not work and no explanation of why you expected what you think should have come out.
and frankly, I don't know what to do about it anymore. People think other people should be mentally linked or something; that we should be able to look at "code that doesn't work" and "this is what I expected" from abysmally small test cases that hardly exercise the boundary conditions or anything...
it does get discouraging.
please, just pretend you are trying to explain this to your mom, use simple terms, present it simply, yet COMPLETELY. I have *no clue* why you expected that row and not the two that are correctly returned...
Over?
Anderson Haertel Rodrigues, October 04, 2006 - 8:45 am UTC
Tom, Sorry!
My written English is very bad! ;-)
But,
CREATE GLOBAL TEMPORARY TABLE ETMPTESTE
(
CdPerfilAgrupamento integer,
CdFuncAgrupamento integer,
CdOperacao integer,
UO char(1),
NatVinc char(1),
RelTrab char(1),
Regiao char(1),
RegTrab char(1),
SitPrev char(1),
Carreira char(1),
Cargo char(1),
Funcao char(1)
)
ON COMMIT PRESERVE ROWS;
insert into ETMPTESTE
(CdPerfilAgrupamento, CdFuncAgrupamento,CdOperacao,
UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
Carreira, Cargo, Funcao)
values
(1,85,1,'N','E','T','E','N','N','N','N','N');
insert into ETMPTESTE
(CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
Carreira, Cargo, Funcao)
values
(2,85,1,'E','N','E','N','E','N','N','E','N');
insert into ETMPTESTE
(CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
Carreira, Cargo, Funcao )
values
(3,85,1,'T','T','T','T','T','T','T','T','T');
insert into ETMPTESTE
(CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
Carreira, Cargo, Funcao )
values
(4,85,1,'E','E','E','E','E','E','E','E','E');
sigrharq@SIGRHDES>select CDPERFILAGRUPAMENTO, CDFUNCAGRUPAMENTO, CDOPERACAO, UO from etmpteste;
CDPERFILAGRUPAMENTO CDFUNCAGRUPAMENTO CDOPERACAO U
------------------- ------------------ ------------------ -
1 85 1 N
2 85 1 E
3 85 1 T
4 85 1 E
But, my desired result as:
CDFUNCAGRUPAMENTO CDOPERACAO UO
85 1 T
without CDPERFILAGRUPAMENTO, where:
IF UO = 'T' AND Lead = 'T'
Column := 'T'
elsif UO = 'T' AND Lead = 'E'
Column := 'T'
elsif UO = 'T' AND Lead = 'N'
Column := 'T'
elsif UO = 'E' AND Lead = 'T'
Column := 'T'
elsif UO = 'E' AND Lead = 'E'
Column := 'E'
elsif UO = 'E' AND Lead = 'N'
Column := 'E'
elsif UO = 'N' AND Lead = 'T'
Column := 'T'
elsif UO = 'N' AND Lead = 'E'
Column := 'E'
elsif UO = 'N' AND Lead = 'N'
Column := 'N'
my query:
select FA,
Op,
UOAbrangencia
from
(
select FA,
Op,
CASE WHEN UO = 'T' AND UOLead = 'T' THEN 'T'
WHEN UO = 'T' AND UOLead = 'E' THEN 'T'
WHEN UO = 'T' AND UOLead = 'N' THEN 'T'
WHEN UO = 'E' AND UOLead = 'T' THEN 'T'
WHEN UO = 'E' AND UOLead = 'E' THEN 'E'
WHEN UO = 'E' AND UOLead = 'N' THEN 'E'
WHEN UO = 'N' AND UOLead = 'T' THEN 'T'
WHEN UO = 'N' AND UOLead = 'E' THEN 'E'
WHEN UO = 'N' AND UOLead = 'N' THEN 'N'
END UOAbrangencia
from
( select CdFuncAgrupamento FA,
CdOperacao Op,
UO,
LEAD(UO) over (order by CdFuncAgrupamento, cdOperacao ) UOLead
from ETMPTESTE
order by CdOperacao, CdFuncAgrupamento
)
)
where UOAbrangencia is not null
group by FA,
Op,
UOAbrangencia
/
My result:
FA OP U
--- -- -
85 1 E
85 1 T
Thanks!!!
October 04, 2006 - 5:06 pm UTC
no, see, I need to know the "logic" here; reading a failed implementation doesn't help.
no more selects. Just tell us WHY there is just one row, what this row represents, how you arrived at it: what is the LOGIC?
Over?
Anderson Haertel Rodrigues, October 06, 2006 - 8:51 am UTC
Hello Tom!
Thanks for your answers and patience.
My query is now OK!
Thanks all!
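For the record, the T/E/N precedence described in the question (T beats E, E beats N) can be collapsed into a simple aggregation over a mapped rank, with no LEAD at all. A sketch for the UO column only:

```sql
-- Map T/E/N to 3/2/1, take the MAX per group, and map back.
-- With the sample data (N, E, T, E) this returns the single row 85/1/T.
select cdfuncagrupamento fa,
       cdoperacao        op,
       decode(max(decode(uo, 'T', 3, 'E', 2, 'N', 1)),
              3, 'T', 2, 'E', 1, 'N') uoabrangencia
from   etmpteste
group  by cdfuncagrupamento, cdoperacao;
```

The same DECODE pattern applies to the other nine columns.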
Top-n query
Anu, October 10, 2006 - 11:10 am UTC
Hi Tom,
I have to find the most recent 1000 unique values from a set of data. To make it easy here, I've created a small table with sample data.
create table ANALYTICS_QUESTION
(
PK_COL NUMBER(2) not null,
COL1 VARCHAR2(10),
COL2 VARCHAR2(10),
A_DATE DATE
);
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (1, 'alpha', 'alpha-val', to_date('09-10-2006 17:49:30', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (2, 'alpha', 'alpha-val', to_date('09-10-2006 17:49:33', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (3, 'alpha', 'alpha-val', to_date('29-09-2006 17:49:36', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (4, 'beta', 'beta-val', to_date('04-10-2006 17:49:52', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (5, 'gamma', 'gamma-val', to_date('07-10-2006 17:50:06', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (6, 'gamma', 'gamma-val', to_date('07-10-2006 17:50:09', 'dd-mm-yyyy hh24:mi:ss'));
The table now has:
SQL> select * from analytics_question;
PK_COL COL1 COL2 A_DATE
------ ---------- ---------- -----------
1 alpha alpha-val 10/9/2006 5
2 alpha alpha-val 10/9/2006 5
3 alpha alpha-val 9/29/2006 5
4 beta beta-val 10/4/2006 5
5 gamma gamma-val 10/7/2006 5
6 gamma gamma-val 10/7/2006 5
The output I need is (consider top 2 rows for this example):
COL1 COL2
----- ---------
alpha alpha-val
gamma gamma-val
If I had to do this in a spreadsheet, I would remove duplicates in column col1 while retaining the rows that have the most recent date in column a_date. Next, I would sort this by date and pick the top 2 rows. (For a given value in col1, col2 always has the same value.)
Since it involves top-n, I considered using analytic functions. To do it the traditional way, I could use:
SQL> select *
2 from (select *
3 from (select col1,
4 col2,
5 max(a_date) as a_date
6 from analytics_question
7 group by col1,
8 col2) tab
9 order by a_date desc)
10 where rownum <= 2;
COL1 COL2 A_DATE
---------- ---------- -----------
alpha alpha-val 10/9/2006 5
gamma gamma-val 10/7/2006 5
In reality, the "analytics_question" would be an inline view over a table that has over 600,000 rows, and it does not perform well.
I've attempted to use row_number() and rank() functions as below, but I wonder if there is any way to avoid the group by clause.
SQL> select *
2 from (select col1,
3 col2,
4 max(a_date),
5 row_number() over(order by max(a_date) desc) as rn
6 from analytics_question
7 group by col1,
8 col2)
9 where rn <= 2;
COL1 COL2 MAX(A_DATE) RN
---------- ---------- ----------- ----------
alpha alpha-val 10/9/2006 5 1
gamma gamma-val 10/7/2006 5 2
Is there a way to simplify this further? Otherwise, it's back to tuning.
Thanks and regards,
Anu
October 10, 2006 - 7:51 pm UTC
600,000 rows is pretty small - can you define "does not perform well"?
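Picking up Anu's point that col2 is functionally dependent on col1, one way to drop the GROUP BY is to de-duplicate per col1 with row_number() and then apply the usual top-n filter. A sketch only - untested here, and whether it actually beats the GROUP BY on the real 600,000-row view is something the poster would have to measure:

```sql
select col1, col2
  from ( select col1, col2, a_date
           from ( select col1, col2, a_date,
                         row_number() over (partition by col1
                                            order by a_date desc) rn
                    from analytics_question )
          where rn = 1           -- most recent row per col1 value
          order by a_date desc ) -- newest distinct values first
 where rownum <= 2;              -- top-n
```

Against the sample data this should return the alpha and gamma rows, matching the requested output.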
Your Help --
Reader, October 10, 2006 - 5:44 pm UTC
create table testa (a1 date,h_flg varchar2(1),b_flg varchar2(1),amt number(6,2))
insert into testa values( sysdate,'B',null,25.00);
insert into testa values( sysdate,null,'H',55.00);
insert into testa values( sysdate,null,'H',35.00);
insert into testa values( sysdate,'B',null,45.00);
commit;
select distinct sum(amt) over(partition by nvl(h_flg,0)) h_amt,
sum(amt) over(partition by nvl(b_flg,0)) b_amt,
sum(amt) over() tot_amt from testa
H_AMT B_AMT B_AMT
---------- ---------- ----------
160 70 70
160 90 90
2 rows selected.
How can I get the output in one line --
160 70 90
Thanks
October 10, 2006 - 8:24 pm UTC
umm, you have a table with just three rows in it? your answer is very dependent on there being precisely three rows - is this the "real problem" or so overly simplified as to not be relevant?
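If the sample really is as small and fixed-shape as it looks, one way to get the requested single line is conditional aggregation rather than analytics - a sketch, untested:

```sql
-- grand total, sum of the 'B'-flagged rows, sum of the 'H'-flagged rows
select sum(amt)                                tot_amt,
       sum(case when b_flg = 'B' then amt end) b_amt,
       sum(case when h_flg = 'H' then amt end) h_amt
  from testa;
-- against the four inserted rows this should give 160, 70, 90
```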
Help Required on Promotion Calculation
Hardik Panchal, October 12, 2006 - 9:40 am UTC
Hi Tom,
I have following table:
CREATE TABLE PROMOTION_CALC
(
CUST_NBR NUMBER NOT NULL,
ITEM_ID CHAR(20 BYTE),
EVENT_NBR NUMBER NOT NULL,
PROMO_SEQ_NBR NUMBER(2),
PCT NUMBER(5,4),
AMT NUMBER,
BASE_PRICE NUMBER
);
The table data is as below:
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 1, 1, 4, 0, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 6, 1, 0, 5, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 5, 2, 0, 20, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 2, 3, 5, 0, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 7, 3, 0, 10, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
2, '400 ', 9, 1, 0, 15, 200);
COMMIT;
following is the tabular form of data:
CUST_NBR ITEM_ID EVENT_NBR PROMO_SEQ_NBR PCT AMT BASE_PRICE
-------- ------- --------- ------------- --- --- ----------
1 300 1 1 4 0 100
1 300 6 1 0 5 100
1 300 5 2 0 20 100
1 300 2 3 5 0 100
1 300 7 3 0 10 100
2 400 9 1 0 15 200
Now, I want to calculate NEW_PRICE and PERCENT_DIC as mentioned below:
CUST ITEM EVENT PROMO_ PCT AMT BASE_PRICE NEW_PRICE PERCENT_DIC
_NBR _ID _NBR SEQ_NBR
---- ---- ------ -------- --- --- ---------- --------- -----------
1 300 1 1 4 0 100 91 4
1 300 6 1 0 5 100 91 5
1 300 5 2 0 20 100 71 21.97
1 300 2 3 5 0 100 57.4532 5
1 300 7 3 0 10 100 57.4532 14.08
2 400 9 1 0 15 200 185 7.5
The logic of calculation is as below:
- First of all, the promotion (whether in percent off or amount off) will be converted into percent off
- The summation of percent off (group by cust_nbr, item_nbr, promo_seq_nbr) will be applied
- The percent off calculated will be applied on the base_price for the least promo_seq_nbr per cust_nbr, item_nbr combination. For the next set of promo_seq_nbr per cust_nbr, item_nbr combination, the percent off calculated will be applied on the new_price (amount calculated with previous promo_seq_nbr), and not on the base_price.
In the above example,
- Row 1 and Row 2 will be applied to base_price: 100$ individually (as both having same promo_seq_nbr: 1)
i.e.
Row 1: Directly 4 percent off given
Row 2: Amount Off 5$ (on base_price: 100$) will be converted to percent off, which will result into 5%
- The summation of percent off: 9% (4% from row 1 + 5% from row 2) will be applied per cust_nbr, item_nbr, promo_seq_nbr (cust_nbr: 1, item_id: 300, promo_seq_nbr:1 in this case), which will result into new_price 91 $. Note here that the new_price column will be calculated as 91 in both the rows (row 1 and row 2) as a result of combined promotion of 9% applied on base_price.
- Now, for subsequent promo_seq_nbr for the cust_nbr, item_nbr combination, the converted percent off will be applied on new_price and not on the item base_price. i.e. in row 3, the 20$ amount-off promotion will first be converted into percent off (applied on 91$ and not on 100$), resulting in 21.97%.
- In a same way, 3rd promo_seq_nbr (again having two rows for cust_nbr, item_nbr combination with same promo_seq_nbr) will be applied (as a summation of row 4 and row 5) on a new_price (which is 71$ after applying 2nd promo_seq_nbr).
The event_nbr field is not used in the promotion calculation; it is kept only because the business requires it for display.
Thanks a lot in advance !
Regards,
Hardik Panchal
bulk analytic update
raajesh, October 12, 2006 - 3:32 pm UTC
Hi Tom
I would like to know if I can achieve what is listed below using analytic functions.
I have a table like this
Dept ID Dept Name Unique ID
1 A1
2 A1
3 A1
4 A2
5 A2
6 A2
7 A3
8 A4
9 A4
10 A4
Now I want my unique ID field to be generated like this
Dept ID Dept Name Unique ID
1 A1 123
2 A1 123
3 A1 123
4 A2 456
5 A2 456
6 A2 456
7 A3 124
8 A4 125
9 A4 125
10 A4 125
Also, the unique ID needs to be generated via an Oracle Sequence. The numbers given are just examples. i.e. unique ID takes values based on the Dept Name partition.
How to achieve this by an Update statement on the table? Please suggest a solution.
Raajesh
October 13, 2006 - 6:55 am UTC
no create, no inserts, no lookie :)
not promising it can be done, just stating "I do not look at the problem if you expect me to create a table, insert your data into it, create sequences, develop a working solution"
I only do the last bit, if and when possible
to raajesh
Michel Cadot, October 13, 2006 - 8:17 am UTC
I don't see the need of an Oracle sequence.
You can try this one:
update dept a
set unqid = (select val
from (select rownum val, deptname
from (select distinct deptname from dept)
) b
where b.deptname=a.deptname)
/
I'm not sure analytics are faster in this case but you can try (assuming deptid is a unique key):
update dept a
set unqid = (select val
from (select deptid,
first_value (deptid)
over (partition by deptname order by deptid) val
from dept) b
where b.deptid=a.deptid)
/
Regards
Michel
October 13, 2006 - 8:24 am UTC
if you really wanted to use "rownum", you best use order by in the select distinct, else there is no assurance that rownum would be assigned deterministically.
Michel Cadot, October 13, 2006 - 8:45 am UTC
Yes, I agree (about rownum and order) but I did not care about what the unique number is (as raajesh said: The numbers given are just examples) as long as it is different for each deptname.
Anyway, the second query gives a number that fully depends on the data as the department (name) is associated to the first deptid for each name (and it could be any other one). (This to explain to other readers and not to you, Tom, as I know you perfectly understand my queries.)
October 13, 2006 - 2:28 pm UTC
problem is that the calls to the subquery are NOT deterministic, the same rownum could in theory be assigned to different rows in different invocations.
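A deterministic variant of Michel's first update (a sketch, untested, using his assumed table and column names; it still does not use an Oracle sequence) numbers the distinct names with dense_rank, so the same data always produces the same ids:

```sql
update dept a
   set unqid = ( select val
                   from ( select deptname,
                                 -- one stable number per distinct name
                                 dense_rank() over (order by deptname) val
                            from (select distinct deptname from dept) ) b
                  where b.deptname = a.deptname )
/
```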
One more analytics question
Serge Shmygelsky, October 13, 2006 - 8:55 am UTC
Hello Tom,
I have a table with the following data:
DN_NUM LAC CELL_ID DURATION
-------------------- -------------------- ---------- ----------
503111011 Kiev 044 10
503111011 Kiev 044 10
503111011 Kiev 044 10
503111011 Kiev region 044 10
503111011 Kiev region 044 10
503111011 Kiev region 044 10
503111011 Kiev region 044 10
503111011 Odesa 048 10
503111011 Odesa region 048 10
503111011 Odesa region 048 10
503111011 Odesa region 048 10
503111011 Odesa region 048 10
503111011 Odesa region 048 10
and I need for every phone number (there is only one for simplicity) to find cell_id with maximum number of calls (rows actually) and within this cell_id find lac with maximum number of calls, e.g. output should look like the following:
DN_NUM CELL_ID LAC COUNT_BY_CELL_ID COUNT_BY_REGION
503111011 044 Kiev region 7 4
If we just do 'GROUP BY cell_id, lac' then we'll get an incorrect answer.
I've spent 2 hours playing with rank and rownum and cannot find a solution. Is it possible to get it within one query?
Thanks in advance.
P.S. The real table is huge. So I'd rather not use any joins
October 13, 2006 - 2:29 pm UTC
no table create
no inserts into table
NO LOOK
to Serge Shmygelsky
Michel Cadot, October 13, 2006 - 10:09 am UTC
Hi Serge,
Do you understand "no create, no inserts, no lookie"?
If so, provide the data if you want an answer.
Regards
Michel
Analytical Calculation Problem
Hardik Panchal, October 14, 2006 - 9:30 am UTC
Hi Tom,
I am posting one problem scenario, hoping an excellent solution from you as always :-)
I have following table:
CREATE TABLE PROMOTION_CALC
(
CUST_NBR NUMBER NOT NULL,
ITEM_ID CHAR(20 BYTE),
EVENT_NBR NUMBER NOT NULL,
PROMO_SEQ_NBR NUMBER(2),
PCT NUMBER(5,4),
AMT NUMBER,
BASE_PRICE NUMBER
);
The table data is as below:
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 1, 1, 4, 0, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 6, 1, 0, 5, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 5, 2, 0, 20, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 2, 3, 5, 0, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
1, '300 ', 7, 3, 0, 10, 100);
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES (
2, '400 ', 9, 1, 0, 15, 200);
COMMIT;
following is the tabular form of data:
CUST_NBR ITEM_ID EVENT_NBR PROMO_SEQ_NBR PCT AMT BASE_PRICE
-------- ------- --------- ------------- --- --- ----------
1 300 1 1 4 0 100
1 300 6 1 0 5 100
1 300 5 2 0 20 100
1 300 2 3 5 0 100
1 300 7 3 0 10 100
2 400 9 1 0 15 200
Now, I want to calculate NEW_PRICE and PERCENT_DIC as mentioned below:
CUST ITEM EVENT PROMO_ PCT AMT BASE_PRICE NEW_PRICE PERCENT_DIC
_NBR _ID _NBR SEQ_NBR
---- ---- ------ -------- --- --- ---------- --------- -----------
1 300 1 1 4 0 100 91 4
1 300 6 1 0 5 100 91 5
1 300 5 2 0 20 100 71 21.97
1 300 2 3 5 0 100 57.4532 5
1 300 7 3 0 10 100 57.4532 14.08
2 400 9 1 0 15 200 185 7.5
The logic of calculation is as below:
- First of all, the promotion (whether in percent off or amount off) will be
converted into percent off
- The summation of percent off (group by cust_nbr, item_nbr, promo_seq_nbr)
will be applied
- The percent off calculated will be applied on the base_price for the least
promo_seq_nbr per cust_nbr, item_nbr combination. For the next set of
promo_seq_nbr per cust_nbr, item_nbr combination, the percent off calculated
will be applied on the new_price (amount calculated with previous
promo_seq_nbr), and not on the base_price.
In the above example,
- Row 1 and Row 2 will be applied to base_price: 100$ individually (as both
having same promo_seq_nbr: 1)
i.e.
Row 1: Directly 4 percent off given
Row 2: Amount Off 5$ (on base_price: 100$) will be converted to percent off,
which will result into 5%
- The summation of percent off: 9% (4% from row 1 + 5% from row 2) will be
applied per cust_nbr, item_nbr, promo_seq_nbr (cust_nbr: 1, item_id: 300,
promo_seq_nbr:1 in this case), which will result into new_price 91 $. Note here
that the new_price column will be calculated as 91 in both the rows (row 1 and
row 2) as a result of combined promotion of 9% applied on base_price.
- Now, for subsequent promo_seq_nbr for the cust_nbr, item_nbr combination, the
converted percent off will be applied on new_price and not on the item
base_price. i.e. in row 3, the 20$ amount-off promotion will first be converted
into percent off (applied on 91$ and not on 100$), resulting in 21.97%.
- In a same way, 3rd promo_seq_nbr (again having two rows for cust_nbr, item_nbr
combination with same promo_seq_nbr) will be applied (as a summation of row 4
and row 5) on a new_price (which is 71$ after applying 2nd promo_seq_nbr).
The event_nbr field is not used in the promotion calculation; it is kept only
because the business requires it for display.
Thanks a lot in advance !
Regards,
Hardik Panchal
October 14, 2006 - 9:39 am UTC
first - I'm taking new questions now, why wouldn't you use new questions to ask a.... new question?
second, I have no idea why 91 is the new price on the first row. 4% off of 100 would be 96 no?
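For what it is worth, reading the poster's rules literally resolves the 91: the 4% and the 5$ (5% of 100) in sequence 1 sum to 9% off, and each later sequence is applied to the price left by the previous one. That carry-forward makes the calculation recursive across promo_seq_nbr, which suggests the 10g MODEL clause. A sketch only, untested, against the PROMOTION_CALC table above:

```sql
select cust_nbr, item_id, promo_seq_nbr, pct_total, amt_total,
       base_price, round(new_price, 4) new_price
  from ( select cust_nbr, item_id, promo_seq_nbr,
                sum(pct) pct_total, sum(amt) amt_total,
                max(base_price) base_price,
                row_number() over (partition by cust_nbr, item_id
                                   order by promo_seq_nbr) rn
           from promotion_calc
          group by cust_nbr, item_id, promo_seq_nbr )
 model
   partition by (cust_nbr, item_id)
   dimension by (rn)
   measures (promo_seq_nbr, pct_total, amt_total, base_price, 0 new_price)
   rules ( -- price left by the prior sequence (base_price for rn = 1),
           -- reduced by this sequence's percents, minus its amounts off
           new_price[any] order by rn =
             nvl(new_price[cv() - 1], base_price[cv()])
               * (1 - pct_total[cv()] / 100)
             - amt_total[cv()] )
```

With exact arithmetic this gives 91, 71 and 57.45 for customer 1 (the 57.4532 in the requested output seems to come from rounding the intermediate 14.0845% to 14.08) and 185 for customer 2; the result could be joined back by cust_nbr, item_id and promo_seq_nbr to attach new_price to each detail row.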
How to do the reverse of this analytical function question?
Patty Hoth, October 16, 2006 - 1:05 pm UTC
I really find your website invaluable! I have studied and played around with this answer to try to solve my sql issue. I want to do the reverse of what this person was looking for.
My data looks like:
EMPLID EFFDT EFFSEQ DEPTID
----------- --------- ---------- ----------
0003496 01-JUL-99 0 157000
0003496 01-JAN-00 0 157000
0003496 01-JUL-00 1 157999
0003496 01-JAN-01 0 157999
0003496 01-JUL-01 1 157999
0003496 01-JAN-02 0 157999
0003496 01-JUL-02 1 157999
0003496 01-JAN-03 0 157999
0003496 01-JUL-03 1 157999
0003496 01-JAN-04 0 157999
0003496 01-JAN-05 0 157999
0003496 01-JUL-05 1 157999
0003496 01-JUL-05 2 157999
0003496 01-JAN-06 0 157999
14 rows selected.
I want to create sql to answer the question of department movement. So, for this data, the answer would like like:
eid effdt dept end effdt
3496 7/1/1999 157000 7/1/2000
3496 7/1/2000 157999
I have tried variations of the answer you provided, come very close, but am not able to get the right answer:
SELECT emplid, deptid, effdt, ending_effdt
FROM
(SELECT emplid,
deptid,
effdt,
ending_effdt,
DECODE( next_deptid, deptid, 1, 0 ) first_of_pair,
DECODE( lead_deptid, deptid, 1, 0 ) second_of_pair
FROM (
SELECT emplid,
lag(deptid) OVER (PARTITION BY emplid ORDER BY effdt, effseq) next_deptid,
lead(deptid) OVER (PARTITION BY emplid ORDER BY effdt, effseq) lead_deptid,
deptid,
effdt,
-- lag(effdt) OVER (PARTITION BY emplid ORDER BY effdt) ending_effdt,
-- lead(effdt) OVER (PARTITION BY emplid ORDER BY effdt) lead_effdt
lead(effdt) OVER (PARTITION BY emplid ORDER BY effdt, effseq) ending_effdt,
lag(effdt) OVER (PARTITION BY emplid ORDER BY effdt, effseq) lead_effdt
FROM ps_job
WHERE emplid = '0003496' )
WHERE next_deptid IS NULL
OR lead_deptid IS NULL
OR lead_deptid <> deptid
OR next_deptid <> deptid
)
WHERE first_of_pair <> 1
Result:
EMPLID DEPTID EFFDT ENDING_EF
----------- ---------- --------- ---------
0003496 157000 7/1/1999 1/1/2000
0003496 157999 7/1/2000 1/1/2001
The effdt are correct, but the ending dates are not (second one should be null and first should be 7/1/2000). I will continue to work on this, but thought I'd try your assistance! Thanks very much in advance.
October 16, 2006 - 5:21 pm UTC
no creates
no insert intos
definitely no lookie...
One more analytics question - ctd
Serge Shmygelsky, October 23, 2006 - 8:10 am UTC
Hello Tom,
thanks for your attention. I'll try to re-formulate my question hoping to get the answer.
I have created the following table:
OPS$SHMYG@REX> create table test (dn_Num varchar2(11), region varchar2(20), area varchar2(20));
After that I put some data in there:
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
Now my table looks like this:
OPS$SHMYG@REX> select * from test order by 1, 2, 3;
DN_NUM REGION AREA
-------------------- -------------------- --------------------
503111011 Kiev Kiev
503111011 Kiev Kiev
503111011 Kiev Kiev
503111011 Kiev Kiev region
503111011 Kiev Kiev region
503111011 Kiev Kiev region
503111011 Kiev Kiev region
503111011 Odesa Odesa
503111011 Odesa Odesa region
503111011 Odesa Odesa region
503111011 Odesa Odesa region
503111011 Odesa Odesa region
503111011 Odesa Odesa region
Actually these are phone calls. Each phone call has 2 attributes:
region it is made from
area it is made from.
Each region contains 2 or more areas.
The question is as follows:
for any given phone number I need to find a region with maximum number of calls. For this phone and this region I need to find area with maximum number of calls.
If I just run (group by region, area), I'll have
Phone number Region CNT_REGION Area CNT_AREA
503111011 Odesa 6 Odesa region 5
but I need the following:
Phone number Region CNT_REGION Area CNT_AREA
503111011 Kiev 7 Kiev region 4
I cannot find a way to do it in one SQL statement and hope this is possible, as the real table contains 100,000,000 rows per day.
Hope this time my explanation is acceptable.
Best regards.
October 23, 2006 - 10:29 am UTC
ops$tkyte%ORA10GR2> select * from (
2 select region, count(*)
3 from test
4 group by region
5 order by 2 desc
6 )
7 where rownum = 1;
REGION COUNT(*)
-------------------- ----------
Kiev 7
ops$tkyte%ORA10GR2> select * from (
2 select region, area, count(*)
3 from test
4 group by region, area
5 order by 3 desc
6 )
7 where rownum = 1;
REGION AREA COUNT(*)
-------------------- -------------------- ----------
Odesa Odesa region 5
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select *
2 from (
3 select region, area, max_area, area_cnt, reg_cnt,
4 max(reg_cnt) over () max_reg
5 from (
6 select region, area, area_cnt,
7 max(area_cnt) over (partition by region) max_area,
8 sum(area_cnt) over (partition by region) reg_cnt
9 from (
10 select region, area, count(*) area_cnt
11 from test
12 where dn_num = '503111011'
13 group by region, area
14 )
15 )
16 )
17 where area_cnt = max_area or reg_cnt = max_reg
18 /
REGION AREA MAX_AREA AREA_CNT REG_CNT MAX_REG
-------------------- -------------------- ---------- ---------- ---------- ----------
Kiev Kiev 4 3 7 7
Kiev Kiev region 4 4 7 7
Odesa Odesa region 5 5 6 7
last query is the one you asked for, if dn_num is indexed, it doesn't really matter if the table is big, i would presume the phone number is fairly selective.
One more analytics question - ctd
Serge Shmygelsky, October 23, 2006 - 9:03 am UTC
Hello Tom,
looks like I've finally found the answer. If I'm not wrong, it should be like that:
OPS$SHMYG@REX> select dn_num, region, area, count(*), sum(count(*)) over (partition by dn_num, region) from test group by dn_num, region, area order by 5 desc, 4 desc;
DN_NUM REGION AREA COUNT(*) SUM(COUNT(*))OVER(PARTITIONBYDN_NUM,REGION)
-------------------- -------------------- -------------------- ---------- -------------------------------------------
503111011 Kiev Kiev region 4 7
503111011 Kiev Kiev 3 7
503111011 Odesa Odesa region 5 6
503111011 Odesa Odesa 1 6
The first row is the one I was looking for.
October 23, 2006 - 10:31 am UTC
no, not according to your problem definition - you asked for two rows (max region, max area - no relation between area and region in your problem statement) AND further, there are ties - meaning this query can return lots of rows and it should
or you need to be much much more precise.
One more analytics question - ctd
Serge Shmygelsky, October 24, 2006 - 2:06 am UTC
Hello Tom,
thanks for your solution. It solves the problem as well as the one I mentioned. They'll need to be customized a little but I'm on my way.
I'll pay some attention to my English :).
Do you plan to create a logo for PL/SQL? 'Analytics rocks' would be a good one :)
Best regards.
One probable answer to Serge !!!!!
Vinayak, October 24, 2006 - 5:32 am UTC
select * from (
select * from(
select dn_num,region,area,count(region) over(partition by region) reg_cnt,count(area) over(partition by region,area) area_cnt from test1
where dn_num='503111011'
)
order by reg_cnt desc,area_cnt desc
) where rownum=1
Hope this helps you!!! Also wait for Tom's comments on this.
October 24, 2006 - 9:16 am UTC
well, not really knowing what he wanted, I cannot really comment much more than I did above - it is from his description that two rows should regularly appear (max region count, max area count)
How to move to analytic function ?
Lena, October 26, 2006 - 6:12 pm UTC
Hi Tom,
There are times when you need to bring different things twice from the same table.
select test.name name,
test2.name name2,
p.name,
p.main_num,
p.sec_num
from test test, <===
test test2, <===
prod p
where p.main_num = test.id
and p.sec_num = test2.id
Is it possible get the same result using analytic function ?
create table test
(id number,
name varchar2(20))
insert into test values(251,'U-M');
insert into test values(279,'PAL');
insert into test values(300,'U-A');
insert into test values(301,'E');
insert into test values(303,'S');
insert into test values(337,'A')
commit;
create table prod
(id number,
name varchar2(20),
main_num number,
sec_num number);
insert into prod values (1,'XX',300,251);
insert into prod values (2,'YY',300,null);
insert into prod values (3,'ZZ',279,301);
insert into prod values (4,'UU',303,337);
COMMIT;
Regards
October 27, 2006 - 7:38 am UTC
in this case, no, using analytics would not suffice "in general"
Second question ...
Lena, October 27, 2006 - 1:32 pm UTC
Hi Tom,
I tried to find a second way to write this statement
but didn't succeed.
Could you please show how to do it?
select test.name name,
test2.name name2,
p.name,
p.main_num,
p.sec_num
from test test, <===
test test2, <===
prod p
where p.main_num = test.id
and p.sec_num = test2.id
Thanks again
October 27, 2006 - 8:08 pm UTC
from test test2,
test test,
prod p
there you go, just mix up the order of the tables in the from clause and you have written it in a second way.....
Not sure what you mean?
Second way = Different way
Lena, October 28, 2006 - 1:11 am UTC
Hi Tom,
I found in one of our reports a select statement that
fetches from 20 tables in the FROM clause.
The reason there are so many tables is the need to bring things twice (or more) from the same
table.
So, i built a simple test case that look like the following query:
select test.name name,
test2.name name2,
p.name,
p.main_num,
p.sec_num
from test test, <===
test test2, <===
prod p
where p.main_num = test.id
and p.sec_num = test2.id
I'm trying to rewrite this query in a different way (a second way...), but can't find any way to do that.
I thought about analytics, but understood that it's not
a case for analytics.
So my question is: could you please show how to do it?
Thanks again
October 28, 2006 - 10:35 am UTC
well, the problem is that each row in "prod" can join to a different set of rows in test each time, hence - well, you sort of need TWO JOINS.
In this case, it is somewhat "not really avoidable", p.main_num in general is not equal to p.sec_num (I presume, else you would not have two columns) therefore and thusly - two joins.
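For completeness, one different-looking (not analytic, and not necessarily faster) form replaces the two joins with two scalar subqueries - a sketch, untested. Note the semantics change slightly: a prod row with a null main_num or sec_num (like 'YY' in the sample data) is kept with a null name, whereas the original inner joins drop it:

```sql
select (select t.name from test t where t.id = p.main_num) name,
       (select t.name from test t where t.id = p.sec_num)  name2,
       p.name, p.main_num, p.sec_num
  from prod p;
```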
A reader, November 17, 2006 - 6:50 am UTC
Dear Tom,
Sorry to ask my question here. I have the following situation:
create table t1 (x number, y varchar2(3), z varchar2(3));
create table t2 (z varchar2(3), w number);
INSERT INTO T1 ( X, Y, Z ) VALUES ( 1, '003', 'aaa');
INSERT INTO T1 ( X, Y, Z ) VALUES ( 2, '004', 'bbb');
INSERT INTO T1 ( X, Y, Z ) VALUES ( 3, '005', 'ccc');
INSERT INTO T1 ( X, Y, Z ) VALUES ( 4, '006', 'ddd');
INSERT INTO T2 ( Z, W ) VALUES ( 'aaa', 1);
INSERT INTO T2 ( Z, W ) VALUES ( 'bbb', 2);
INSERT INTO T2 ( Z, W ) VALUES ( 'ccc', 3);
commit;
I would like then to do the following update
update t1
set t1.x = 99
where t1.y = '003'
and t1.z in (select z from t2)
Then, I would like to insert into another table all t1.z
that have been concerned by this update.
(I don't want to issue the following select again in order to get those t1.z
select t1.z from t1
where t1.y = '003'
and t1.z in (select z from t2)
)
Is it possible to do this directly within the update statement ?
Thanks for your help as you do always
Best Regards
November 17, 2006 - 8:09 am UTC
ops$tkyte%ORA10GR2> declare
2 type array is table of t3.z%type index by binary_integer;
3 l_data array;
4 begin
5 update t1 set x = 99 where y = '003' and z in (select z from t2)
6 returning z bulk collect into l_data;
7
8 forall i in 1 .. l_data.count
9 insert into t3 values (l_data(i));
10 end;
11 /
PL/SQL procedure successfully completed.
How can I use analytics to answer this question
A reader, November 27, 2006 - 11:15 am UTC
I have this data:
ORDER NUM UPDATE DATE TOTAL
--------- ----------- ------
100 01-JAN-06 150.00
100 15-FEB-06 160.00
100 02-JUN-06 85.00
200 05-FEB-06 300.00
200 07-NOV-06 400.00
200 25-NOV-06 100.00
300 01-MAY-06 500.00
300 20-JUN-06 600.00
300 26-NOV-06 750.00
400 16-AUG-06 800.00
400 20-SEP-06 800.00
400 10-OCT-06 200.00
I need to calculate the total difference between the last update date and the next to last update date for orders where the next to last update date is < 10/19 and the last total < the next to last total.
Using the sample data, the query should return
ORDER NUM DIFFERENCE
--------- ----------
100 -75.00
400 -600.00
Your expert advice will be appreciated.
November 27, 2006 - 7:33 pm UTC
no create
no inserts
no lookie
no promises that when we do look we can answer...
Here's the missing info
A reader, November 28, 2006 - 9:53 am UTC
create table orders (order_num number, update_date date, total number);
insert into orders values(100,to_date('01-JAN-06','DD-MON-YY'),150);
insert into orders values(100,to_date('15-FEB-06','DD-MON-YY'),160);
insert into orders values(100,to_date('02-JUN-06','DD-MON-YY'), 85);
insert into orders values(200,to_date('05-FEB-06','DD-MON-YY'),300);
insert into orders values(200,to_date('07-NOV-06','DD-MON-YY'),400);
insert into orders values(200,to_date('25-NOV-06','DD-MON-YY'),100);
insert into orders values(300,to_date('01-MAY-06','DD-MON-YY'),500);
insert into orders values(300,to_date('20-JUN-06','DD-MON-YY'),600);
insert into orders values(300,to_date('26-NOV-06','DD-MON-YY'),750);
insert into orders values(400,to_date('16-AUG-06','DD-MON-YY'),800);
insert into orders values(400,to_date('20-SEP-06','DD-MON-YY'),800);
insert into orders values(400,to_date('10-OCT-06','DD-MON-YY'),200);
November 28, 2006 - 11:56 am UTC
this is a total guess since "10/19" is a bit ambiguous, and you don't really specify how to compare total with last total (assume "by order_num")
ops$tkyte%ORA10GR2> select *
2 from (
3 select order_num, update_date, total,
4 lag(total) over (partition by order_num order by update_date) last,
5 total-lag(total) over (partition by order_num order by update_date) diff
6 from orders
7 )
8 where diff < 0
9 and update_date < to_date( '19-oct-2006', 'dd-mon-yyyy')
10 order by order_num, update_date
11 /
ORDER_NUM UPDATE_DA TOTAL LAST DIFF
---------- --------- ---------- ---------- ----------
100 02-JUN-06 85 160 -75
400 10-OCT-06 200 800 -600
Thanks!
A reader, November 28, 2006 - 3:59 pm UTC
Your reply gave me an excellent foundation to achieve what I was trying to accomplish.
Analytics definitely rock!
Thanks!
How about the model clause
Bart, November 29, 2006 - 8:31 am UTC
Often I see you're quote :
(but wait'll you see the
SQL Model clause in 10g)
But it seems very quiet around the model clause.
It is just too complicated
It doesn't fill any needs
or
??
Any thoughts...practical (real world) examples...
Re: How about the model clause
Frank Zhou, November 30, 2006 - 12:16 pm UTC
Bart,
The SQL Model clause is one of the most powerful and useful tools in the Oracle database. This is truly a SQL evolution.
It opens up limitless opportunities to process data in a single SQL query. A fast and efficient single-query solution becomes a reality for many tough real-world problems. It is an excellent invention from the Oracle database team!!
In 10G release 2 the following trick is allowed in the database:
with Model_Clause_1 as ( <a SQL Model clause query> ),
     Model_Clause_2 as ( <another SQL Model clause query that selects from Model_Clause_1> )
select * from Model_Clause_2
model
  partition by ( * )
  dimension by ( analytic functions are allowed here! )
  measures     ( analytic functions are allowed here too! )
  rules        ( analytic functions are "FINALLY" allowed here!!!! )
Just think about it: with this kind of powerful data-processing capability, people should have very few excuses for not being able to implement a single query to solve a database-related problem.
Here are more Model Clause examples :
</code>
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1941199930674#77866736858360 http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4273679444401#76833955117389 <code>
SQL Model clause rock......
Analytics roll....
Frank
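For readers who have not met the clause yet, here is a minimal sketch (untested) of the idea Frank describes - MODEL lets you address the result set like an array, and rules can even upsert new rows:

```sql
select x, y
  from ( select 1 x, 10 y from dual
         union all
         select 2 x, 20 y from dual )
 model
   dimension by (x)
   measures    (y)
   rules       ( y[3] = y[1] + y[2] )  -- upserts a new row x = 3 with y = 30
 order by x;
```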
select distinct
Sandeep, December 06, 2006 - 1:39 am UTC
Tom, I have one table, say table1, and I have to select two rows from it, say row1 and row2, but row1 has some duplicates. While selecting I want to skip the duplicates and select only those rows having the maximum effective date.
We have 11 million records in the table, so I have to write an analytic function.
Please help me do this.
December 07, 2006 - 8:01 am UTC
how can row1 have "some duplicates".
if you mean:
we have a history table, the key is:
X, EFF_DT
for any given X value, we only want to keep the record such that for that X value, EFF_DT = max(EFF_DT)
then
select *
from (select t.*, max(eff_dt) over (partition by X) max_dt
from t)
where eff_dt = max_dt;
Reader, December 06, 2006 - 1:43 am UTC
Sorry Tom rather than saying column I said rows
ORA-00907: missing right parenthesis in Pro*C
Jimmy, December 07, 2006 - 9:16 pm UTC
Greetings Tom,
I have a Pro*C program that works most of the time; the problem is that it sometimes raises error ORA-00907: missing right parenthesis when executing the following INSERT statement:
EXEC SQL INSERT INTO TABLAF
( N1M_F5CT3NT4RN2 ,
N1M_F2L32 ,
N1M_PR2C4S2 ,
T3P_D2C1M ,
T3P_PR2C4S2 ,
C2D_V4NT5 ,
C2D_C4NTR4M3 ,
C2D_CL34NT4 ,
F4C_4M3S32N ,
F4C_V4NC3M34NT2 ,
C2D_C3CLF5CT ,
C2D_M2N4D5 ,
F4C_3N3D5TC2BR2S,
T2T_F5CT1R5 ,
T2T_N4T2 ,
T2T_P5G5R ,
5C1M_D4SC14NT2 ,
5C1M_3V5 ,
5C1M_3C5 ,
5C1M_R4T4NC32N ,
3MP_5C1M1L5D2 ,
3MP_S5LD25NT ,
3MP_5J1ST4S ,
3MP_P5G2S ,
3MP_M2R530 ,
3MP_M2R560 ,
3MP_M2R590 ,
3MP_M2R5120 ,
3MP_M2R5150 ,
N1M_F5CTR4F ,
N2M_1S15R2R5 ,
3ND_P5C2BR2S ,
3ND_P5C2NT5BL4 ,
3ND_3MPR4S32N ,
N1M_D35N ,
3MP_3V5S5LD25NT ,
F4C_S1SP4NS32N )
V5L14S (
T2_N1MB4R(:stH3stD2c1.szN1mF5ct3nt4rn2) ,
T2_N1MB4R(:stH3stD2c1.szN1mF2l32) ,
:stH3stD2c1.lN1mPr2c4s2 ,
:stH3stD2c1.3T3pD2c1m ,
:stH3stD2c1.3T3pPr2c4s2 ,
:stH3stD2c1.lC2dV4nt5 :3_shC2dV4nt5 ,
:stH3stD2c1.3C2dC4ntr4m3 ,
:stH3stD2c1.lC2dCl34nt4 ,
T2_D5T4(:stH3stD2c1.szF4c4m3s32n ,'YYYYMMDDHH24M3SS') ,
T2_D5T4(:stH3stD2c1.szF4cV4nc3m34nt2 ,'YYYYMMDDHH24M3SS') ,
:stH3stD2c1.lC2dC3clF5ct :3_shC2dC3clF5ct ,
:stH3stD2c1.szC2dM2n4d5 ,
T2_D5T4(:stH3stD2c1.szF4c3n3D5tC2br2s :3_shF4c3n3D5tC2br2s,
'YYYYMMDDHH24M3SS') ,
:stH3stD2c1.dT2tF5ct1r5 ,
:stH3stD2c1.dT2tN4t2 ,
:stH3stD2c1.dT2tP5g5r ,
:stH3stD2c1.d5c1mD4sc14nt2 ,
:stH3stD2c1.d5c1m3V5 ,
:stH3stD2c1.d5c1m3C5 ,
:stH3stD2c1.d5c1mR4t4nc32n ,
:stH3stD2c1.d3mp5c1m1l5d2 ,
:stH3stD2c1.d3mpS5ld25nt ,
:stH3stD2c1.d3mp5j1st4s :3_sh3mp5j1st4s ,
:stH3stD2c1.d3mpP5g2s :3_sh3mpP5g2s ,
:stH3stD2c1.d3mpM2r530 :3_sh3mpM2r530 ,
:stH3stD2c1.d3mpM2r560 :3_sh3mpM2r560 ,
:stH3stD2c1.d3mpM2r590 :3_sh3mpM2r590 ,
:stH3stD2c1.d3mpM2r5120 :3_sh3mpM2r5120 ,
:stH3stD2c1.d3mpM2r5150 :3_sh3mpM2r5150 ,
T2_N1MB4R(:stH3stD2c1.szN1mF5ctR4f :3_shN1mF5ctR4f) ,
:stH3stD2c1.szN2m1s15r2r5 :3_shN2m1s15r2r5 ,
:stH3stD2c1.33ndP5C2br2s ,
:stH3stD2c1.33ndP5C2nt5bl4 ,
:stH3stD2c1.33nd3mpr4s32n ,
:stH3stD2c1.szN1mD35N ,
:stH3stD2c1.d3mp3v5S5ld25nt ,
T2_D5T4(:stH3stD2c1.szF4cS1sp4ns32n :3_shF4cS1sp4ns32n ,
'YYYYMMDDHH24M3SS') );
The indicator variables, such as i_shF4c3n3D5tC2br2s, are initialized beforehand according to the values of the host variables.
E.g.: i_shFecIniDatCobros = (stHistDocu.szFecIniDatCobros[0] == '\0') ? ORA_NULL : 0; where ORA_NULL has the value -1.
Could this have something to do with the host variables and indicator flags? In what cases can they fail? Should I upgrade the Oracle library, or the OS?
Thanks for any help you can offer,
December 08, 2006 - 7:35 am UTC
man, those are the wackiest variable names ever aren't they.
I don't see how a prepared statement like that could or would "sometimes" say "missing parenthesis"; it seems it should either be
a) always
b) never
Are you saying "intermittent"? Because that error would only be raised during a parse, and the statement should really be parsed only ONCE during program execution - if it parsed at least once, it should always parse.
(And if it does not, I would have to say "you have a memory overwrite somewhere and you are munging this static string during your program execution - that is, you are doing it yourself.")
Michel Cadot, December 08, 2006 - 8:05 am UTC
It seems he tried to obfuscate the real names, as we can see: VALUES was converted to V5L14S, TO_DATE to T2_D5T4...
At a quick look we can see:
A -> 5
E -> 4
I -> 3
O -> 2(?)
U -> 1
In his idiom "Analytics question" means "Syntax error". :)
Michel
Secret decoder ring...
Jmv, December 08, 2006 - 11:20 am UTC
Start of decoding....
T2_N1MB4R is TO_NUMBER so 2=O, 1=U, 4=E
V5L14S is VALUES so 5=A, 1=U and 4=E (correlation)
T2_D5T4 is TO_DATE so 2=O (corr.), 5=A (corr.), 4=E (correlated)
Possible partial Rosetta stone...
12345
UO?EA Is 3=I?
Possible rewrite...
EXEC SQL INSERT INTO TABLAF
( NUM_FACTINTERNO ,
NUM_FOLIO ,
NUM_PROCESO ,
TIP_DOCUM ,
TIP_PROCESO ,
COD_VENTA ,
COD_CENTREMI ,
COD_CLIENTE ,
FEC_EMISION ,
FEC_VENCIMIENTO ,
COD_CICLFACT ,
COD_MONEDA ,
FEC_INIDATCOBROS,
TOT_FACTURA ,
TOT_NETO ,
TOT_PAGAR ,
ACUM_DESCUENTO ,
ACUM_IVA ,
ACUM_ICA ,
ACUM_RETENCION ,
IMP_ACUMULADO ,
IMP_SALDOANT ,
IMP_AJUSTES ,
IMP_PAGOS ,
IMP_MORAI0 ,
IMP_MORA60 ,
IMP_MORA90 ,
IMP_MORAUO0 ,
IMP_MORAUA0 ,
NUM_FACTREF ,
NOM_USUARORA ,
IND_PACOBROS ,
IND_PACONTABLE ,
IND_IMPRESION ,
NUM_DIAN ,
IMP_IVASALDOANT ,
FEC_SUSPENSION )
VALUES (
TO_NUMBER(:stHIstDOcU.szNUmFActIntErnO) ,
TO_NUMBER(:stHIstDOcU.szNUmFOlIO) ,
:stHIstDOcU.lNUmPrOcEsO ,
:stHIstDOcU.ITIpDOcUm ,
:stHIstDOcU.ITIpPrOcEsO ,
:stHIstDOcU.lCOdVEntA :I_shCOdVEntA ,
:stHIstDOcU.ICOdCEntrEmI ,
:stHIstDOcU.lCOdClIEntE ,
TO_DATE(:stHIstDOcU.szFEcEmIsIOn ,'YYYYMMDDHHOEMISS') ,
TO_DATE(:stHIstDOcU.szFEcVEncImIEntO ,'YYYYMMDDHHOEMISS') ,
:stHIstDOcU.lCOdCIclFAct :I_shCOdCIclFAct ,
:stHIstDOcU.szCOdMOnEdA ,
TO_DATE(:stHIstDOcU.szFEcInIDAtCObrOs :I_shFEcInIDAtCObrOs,
'YYYYMMDDHHOEMISS') ,
:stHIstDOcU.dTOtFActUrA ,
:stHIstDOcU.dTOtNEtO ,
:stHIstDOcU.dTOtPAgAr ,
:stHIstDOcU.dAcUmDEscUEntO ,
:stHIstDOcU.dAcUmIVA ,
:stHIstDOcU.dAcUmICA ,
:stHIstDOcU.dAcUmREtEncIOn ,
:stHIstDOcU.dImpAcUmUlAdO ,
:stHIstDOcU.dImpSAldOAnt ,
:stHIstDOcU.dImpAjUstEs :I_shImpAjUstEs ,
:stHIstDOcU.dImpPAgOs :I_shImpPAgOs ,
:stHIstDOcU.dImpMOrAI0 :I_shImpMOrAI0 ,
:stHIstDOcU.dImpMOrA60 :I_shImpMOrA60 ,
:stHIstDOcU.dImpMOrA90 :I_shImpMOrA90 ,
:stHIstDOcU.dImpMOrAUO0 :I_shImpMOrAUO0 ,
:stHIstDOcU.dImpMOrAUA0 :I_shImpMOrAUA0 ,
TO_NUMBER(:stHIstDOcU.szNUmFActREf :I_shNUmFActREf) ,
:stHIstDOcU.szNOmUsUArOrA :I_shNOmUsUArOrA ,
:stHIstDOcU.IIndPACObrOs ,
:stHIstDOcU.IIndPACOntAblE ,
:stHIstDOcU.IIndImprEsIOn ,
:stHIstDOcU.szNUmDIAN ,
:stHIstDOcU.dImpIvASAldOAnt ,
TO_DATE(:stHIstDOcU.szFEcSUspEnsIOn :I_shFEcSUspEnsIOn ,
'YYYYMMDDHHOEMISS') );
If he was trying to obfuscate, then one of the parentheses might have been missed or accidentally replaced.
Otherwise, perhaps there is a problem with a variable assignment wherein the variable contains a quote or other character (e.g. SQL injection) which causes the Pro*C routine to report the error.
Need more info.
December 09, 2006 - 12:39 pm UTC
laughing out loud - nice decoder there.
Analytic question
Carlos, December 11, 2006 - 8:05 am UTC
Tom I am using Oracle 10g and I need your feedback on the following:
I need a way to generate zero amount value records on dates when the blasts did not generate any amounts.
Here are the DDLs for the table and data:
create table test
(transaction_date date,
blast_id number(11),
blast_amount number(11)
)
/
insert into test values ('01-DEC-06', 1, 10);
insert into test values ('01-DEC-06', 2, 5);
insert into test values ('01-DEC-06', 3, 15);
insert into test values ('01-DEC-06', 4, 12);
insert into test values ('01-DEC-06', 5, 9);
insert into test values ('02-DEC-06', 7, 30);
insert into test values ('02-DEC-06', 8, 10);
insert into test values ('02-DEC-06', 1, 20);
insert into test values ('02-DEC-06', 9, 40);
insert into test values ('02-DEC-06', 2, 10);
insert into test values ('03-DEC-06', 10, 100);
insert into test values ('03-DEC-06', 6, 45);
insert into test values ('03-DEC-06', 12, 200);
insert into test values ('03-DEC-06', 13, 90);
insert into test values ('03-DEC-06', 14, 32);
commit;
If I query the table for transaction date = '02-DEC-06' the output should be
'02-DEC-06', 1, 20
'02-DEC-06', 2, 10
'02-DEC-06', 3, 0
'02-DEC-06', 4, 0
'02-DEC-06', 5, 0
'02-DEC-06', 7, 30
'02-DEC-06', 8, 10
'02-DEC-06', 9, 40
If I query the table for transaction date = '03-DEC-06' the output should be
'03-DEC-06', 1, 0
'03-DEC-06', 2, 0
'03-DEC-06', 3, 0
'03-DEC-06', 4, 0
'03-DEC-06', 5, 0
'03-DEC-06', 6, 45
'03-DEC-06', 7, 0
'03-DEC-06', 8, 0
'03-DEC-06', 9, 0
'03-DEC-06', 10, 100
'03-DEC-06', 12, 200
'03-DEC-06', 13, 90
'03-DEC-06', 14, 32
If I query the table for transaction date between '02-DEC-06' and '03-DEC-06', I should get
'02-DEC-06', 1, 20
'02-DEC-06', 2, 10
'02-DEC-06', 3, 0
'02-DEC-06', 4, 0
'02-DEC-06', 5, 0
'02-DEC-06', 7, 30
'02-DEC-06', 8, 10
'02-DEC-06', 9, 40
'03-DEC-06', 1, 0
'03-DEC-06', 2, 0
'03-DEC-06', 3, 0
'03-DEC-06', 4, 0
'03-DEC-06', 5, 0
'03-DEC-06', 6, 45
'03-DEC-06', 7, 0
'03-DEC-06', 8, 0
'03-DEC-06', 9, 0
'03-DEC-06', 10, 100
'03-DEC-06', 12, 200
'03-DEC-06', 13, 90
'03-DEC-06', 14, 32
My goal is not to store blasts that did not generate any amounts as I would be wasting space.
Thanks.
December 11, 2006 - 8:33 am UTC
I don't know why "6" for 2-dec and "11" for 3-dec are "missing", I will assume that is a mistake in your example - I cannot see any logic to them not being there.
What we need is a set of rows that is equal to the max blast_id for the rows of interest. That is:
7 (select max(max(blast_id))
8 from test
9 where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')
10 group by transaction_date)
11 )
take that number and have at least that many rows. That is what "data" is below.
then, using 10gr2 partitioned outer joins, we can make up the missing rows and fill them in:
ops$tkyte%ORA10GR2> variable ldate varchar2(25);
ops$tkyte%ORA10GR2> variable hdate varchar2(25);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :ldate := '02-dec-2006';
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> exec :hdate := '03-dec-2006';
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with
2 data
3 as
4 (select level l
5 from dual
6 connect by level <=
7 (select max(max(blast_id))
8 from test
9 where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')
10 group by transaction_date)
11 )
12 select *
13 from (
14 select test.transaction_date, data.l, nvl(test.blast_amount,0) blast_amount,
15 max(blast_id) over (partition by test.transaction_date) max_id
16 from data left join
17 (select *
18 from test
19 where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')) test
20 partition by (transaction_date) on (data.l = test.blast_id)
21 )
22 where l <= max_id
23 order by transaction_date, l
24 /
TRANSACTI L BLAST_AMOUNT MAX_ID
--------- ---------- ------------ ----------
02-DEC-06 1 20 9
02-DEC-06 2 10 9
02-DEC-06 3 0 9
02-DEC-06 4 0 9
02-DEC-06 5 0 9
02-DEC-06 6 0 9
02-DEC-06 7 30 9
02-DEC-06 8 10 9
02-DEC-06 9 40 9
03-DEC-06 1 0 14
03-DEC-06 2 0 14
03-DEC-06 3 0 14
03-DEC-06 4 0 14
03-DEC-06 5 0 14
03-DEC-06 6 45 14
03-DEC-06 7 0 14
03-DEC-06 8 0 14
03-DEC-06 9 0 14
03-DEC-06 10 100 14
03-DEC-06 11 0 14
03-DEC-06 12 200 14
03-DEC-06 13 90 14
03-DEC-06 14 32 14
23 rows selected.
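The partitioned outer join above densifies each date out to that date's own maximum blast_id, filling the gaps with zero amounts. As an illustration of the same fill-in-the-gaps idea, here is a hedged Python sketch (not from the thread), assuming rows are (date, blast_id, amount) tuples:

```python
from collections import defaultdict

def densify(rows):
    """For each transaction date, emit blast_ids 1..max(blast_id) for
    that date, filling missing ids with a zero amount -- the effect of
    joining each date's rows to the generated 'data' row source."""
    by_date = defaultdict(dict)
    for d, blast_id, amount in rows:
        by_date[d][blast_id] = amount
    out = []
    for d in sorted(by_date):
        max_id = max(by_date[d])            # max blast_id per date
        for i in range(1, max_id + 1):
            out.append((d, i, by_date[d].get(i, 0)))
    return out

rows = [("02-DEC-06", 1, 20), ("02-DEC-06", 3, 15)]
print(densify(rows))
# [('02-DEC-06', 1, 20), ('02-DEC-06', 2, 0), ('02-DEC-06', 3, 15)]
```

The SQL version does this in one pass by partitioning the outer join on transaction_date and capping each partition at its own max_id.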
Thanks a million
Carlos, December 11, 2006 - 8:49 am UTC
Just what I needed!
Analytics Question
Carlos, December 12, 2006 - 6:41 pm UTC
Tom, regarding your comment on what you perceived as missing records: the blast_id is generated from a database sequence. A sequence is not guaranteed to be gap-free, as I understand it; it guarantees uniqueness and in some cases can skip values.
Would your solution change as a result of this?
Thanks again.
December 12, 2006 - 10:10 pm UTC
sorry - this comment is not making sense to me.
I assumed
a) blast_id starts at 1
b) blast_id goes to max by the transaction date
that is all - I was wondering about the missing records in your example; I couldn't figure out why they were absent
Finding Duplicates in large unpartitioned table
Ravi, December 26, 2006 - 12:29 pm UTC
Hi Tom,
Can you help me find ways to optimize the routine below? It takes 38 hours to run in production. The table is huge, with about 17 million records, of which about 2 million are duplicates. The explain plan for the query shows it is doing a fast full scan, but the problem appears to be in the delete statement, which takes very long.
Any help or insight on this problem is greatly appreciated.
SQL> select count(*) from claims_temp;
COUNT(*)
----------
17353894
Number of duplicates about 2,000,000
CREATE TABLE CLAIMS_TEMP
(
HEALTH_SERVICE_ID NUMBER(20) NOT NULL,
HEALTH_SERVICE_ITEM_NO NUMBER(2) NOT NULL,
PDE_REC_NO NUMBER(3) NOT NULL,
PATIENT_ID NUMBER(10) NOT NULL,
ADJ_CARDHOLDER_ID CHAR(20 BYTE) NOT NULL,
CLIENT_ID CHAR(15 BYTE) NOT NULL,
PLAN_ID CHAR(8 BYTE) NOT NULL,
ADJUD_DT DATE NOT NULL,
I_SERVICE_DT DATE NOT NULL,
LABEL_NAME VARCHAR2(50 BYTE) NOT NULL,
I_QTY_DISPENSED NUMBER(10,3) NOT NULL,
TOTAL_CLIENT_AMT_BILLED NUMBER(8,2) NOT NULL,
O_PATIENT_PAY_AMT NUMBER(8,2) NOT NULL
);
CREATE UNIQUE INDEX EOB_CLAIMS_TEMP_UX1 ON CLAIMS_TEMP
(HEALTH_SERVICE_ID, HEALTH_SERVICE_ITEM_NO, PDE_REC_NO);
CREATE OR REPLACE PROCEDURE sp_del_dups (p_limit IN PLS_INTEGER, p_exit_status OUT PLS_INTEGER)
IS
CURSOR dups_cur
IS
SELECT row_id
FROM (SELECT ROWID row_id,
ROW_NUMBER () OVER (PARTITION BY health_service_id, health_service_item_no ORDER BY pde_rec_no DESC)
dups_claims
FROM claims_temp)
WHERE dups_claims > 1;
TYPE rowidarray IS TABLE OF ROWID
INDEX BY PLS_INTEGER;
dups_rowid_tb pkg_eob.rowidarray;
BEGIN
OPEN dups_claims_cur;
LOOP
FETCH dups_claims_cur
BULK COLLECT INTO dups_rowid_tb LIMIT p_limit;
FORALL i IN 1 .. dups_rowid_tb.COUNT
DELETE FROM claims_temp
WHERE ROWID = dups_rowid_tb (i);
COMMIT;
EXIT WHEN dups_claims_cur%NOTFOUND;
END LOOP;
CLOSE dups_claims_cur;
COMMIT;
EXCEPTION
WHEN OTHERS
THEN
p_exit_status := 1;
RAISE;
END sp_del_dups;
Thank you
Ravi
December 26, 2006 - 9:23 pm UTC
lots of indexes or not lots of indexes?
what do you have p_limit set to?
you do know that setting p_exit_status in the exception block is a BIG OLD WASTE of keystrokes right?
Commit inside a loop
A reader, December 27, 2006 - 5:36 am UTC
Why are you committing inside a loop? This can cause an ORA-01555 error.
Why not using this
delete from claims_temp
where rowid not in (select min(rowid) from claims_temp
group by health_service_id
,health_service_item_no
,pde_rec_no
);
However, I am not sure that you will not get the ORA-01555 error. In that case the above delete must be done part by part.
Finding Duplicates in large unpartitioned table
Ravi, December 27, 2006 - 12:33 pm UTC
First of all I am sorry: the code I posted will not compile for you because the cursor name in the declaration section does not match the cursor name where it is referenced.
In the declaration section it should have been "dups_claims_cur".
I tested p_limit with 50,000 and 500:
with 500 --> takes 4 hours
with 50,000 --> takes 38 hours
But I need this script to finish in seconds, not hours. Is there any better way of achieving this without bulk collect?
For the reply from "THE READER": that solution wouldn't work, because it looks for duplicates on the combination "health_service_id, health_service_item_no, pde_rec_no", but my requirement is the combination "health_service_id, health_service_item_no", deleting the records with pde_rec_no < max(pde_rec_no) for multiple occurrences of "health_service_id, health_service_item_no".
And for the commit, I assume the commit is in the best place it can be; here the commit frequency is once per BULK COLLECT LIMIT.
December 28, 2006 - 9:33 am UTC
the commit should be where the commit belongs - and it should not necessarily be committing after every delete call.
you mentioned nothing about number of indexes
and if you just run the query and fetch the rows (no delete), how long does that take.
For Ravi
Tyler, December 27, 2006 - 5:31 pm UTC
"reader" had the right idea for you, probably just not all of your requirements.
This should give you a starting point. I don't profess the code to be optimal; you may need to tweak it for your needs. It should still be faster than your procedural code.
DELETE FROM claims_temp
WHERE ROWID IN
(
SELECT MY_ROWID
FROM
(
SELECT
ROWID as MY_ROWID,
pde_rec_no,
MAX(pde_rec_no) over (PARTITION BY health_service_id ,health_service_item_no) as MAX_pde_rec_no,
COUNT(*) over (PARTITION BY health_service_id ,health_service_item_no) as MY_COUNT
FROM claims_temp
)
WHERE MY_COUNT > 1
AND pde_rec_no <> MAX_pde_rec_no
);
General idea, find all the items with more than 1 record (per health_service_id ,health_service_item_no) where the pde_rec_no DOES NOT EQUAL the MAX pde_rec_no for the given per health_service_id ,health_service_item_no combination.
Based on your previous post, that should encompass the algorithm you require. I hope :)
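As a sanity check on that algorithm, here is a hedged Python sketch (not from the thread) of which rows Tyler's query would mark for deletion, assuming rows are (health_service_id, health_service_item_no, pde_rec_no) tuples. Note that since pde_rec_no is unique within a group (the unique index guarantees it), "count > 1 and rec <> max" reduces to "rec < max":

```python
from collections import defaultdict

def rows_to_delete(rows):
    """Return the rows whose pde_rec_no is below the max pde_rec_no of
    their (health_service_id, health_service_item_no) group -- the set
    the DELETE ... WHERE ROWID IN (...) would remove."""
    max_rec = defaultdict(int)
    # pass 1: max pde_rec_no per (hs_id, item_no) group
    for hs, item, rec in rows:
        max_rec[(hs, item)] = max(max_rec[(hs, item)], rec)
    # pass 2: everything below its group's max is a duplicate to delete
    return [(hs, item, rec) for hs, item, rec in rows
            if rec < max_rec[(hs, item)]]

rows = [(1, 1, 1), (1, 1, 2), (2, 1, 5)]
print(rows_to_delete(rows))  # [(1, 1, 1)]
```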
Finding Duplicates in large unpartitioned table
Ravi, December 28, 2006 - 6:33 pm UTC
For Tom's feedback:
My mistake, I didn't include index counts.
Number of indexes:
Unique index on (health_service_id, health_service_item_no, pde_rec_no)
Bitmap index on (plan_id) - not part of the query
I used Tyler's query; the full data set comes back (without the delete) in 20 minutes.
--------------------------------------------------------
For Tyler's feedback:
Your query comes back with a much better cost than mine - the window sort is avoided - but the cost is still too high and it runs forever when accompanied by the delete.
--------------------------------------------------------
Tyler's Query Uses
WINDOW BUFFER
--------------------------------------------------------
SELECT MY_ROWID
FROM
(
SELECT
ROWID as MY_ROWID,
pde_rec_no,
MAX(pde_rec_no) over (PARTITION BY health_service_id ,health_service_item_no) as MAX_pde_rec_no,
COUNT(*) over (PARTITION BY health_service_id ,health_service_item_no) as MY_COUNT
FROM claims_temp
)
WHERE MY_COUNT > 1
AND pde_rec_no <> MAX_pde_rec_no
--------------------------------------------------------
Ravi's Query
Uses WINDOW SORT (more expensive)
--------------------------------------------------------
SELECT row_id
FROM (SELECT ROWID row_id,
ROW_NUMBER () OVER (PARTITION BY health_service_id, health_service_item_no ORDER BY pde_rec_no DESC)
dups_claims
FROM claims_temp)
WHERE dups_claims > 1;
--------------------------------------------------------
For both:
Each query seems to perform well, but when wrapped in
Delete .. where rowid in (query)
the cost went up by 20 times.
My new approach (not very elegant, I assume, but it does the job in 45 minutes):
CREATE TABLE rowid_temp
(seq_no NUMBER UNIQUE, row_id VARCHAR2(30))
INSERT /*+APPEND */INTO rowid_temp
(seq_no, row_id)
SELECT ROWNUM, ROWIDTOCHAR (row_id)
FROM (SELECT ROWID AS row_id, pde_rec_no,
MAX (pde_rec_no) OVER (PARTITION BY health_service_id, health_service_item_no)
AS max_pde_rec_no,
COUNT (*) OVER (PARTITION BY health_service_id, health_service_item_no)
AS dups_count
FROM claims_temp)
WHERE dups_count > 1 AND pde_rec_no <> max_pde_rec_no;
COMMIT;
DELETE FROM claims_temp
WHERE ROWID IN (SELECT /*+ FULL(r) PARALLEL(r,4) */
CHARTOROWID (row_id)
FROM rowid_temp r);
COMMIT;
TRUNCATE TABLE rowid_temp;
December 29, 2006 - 9:38 am UTC
all indexes are necessarily involved here!!!!! you are deleting entries.
have you thought about a maintenance window where you can disable indexes, do the bulk delete and then rebuild them
Thank you
Ravi, December 29, 2006 - 11:42 am UTC
Unfortunately the DBAs here will not allow me to drop and recreate indexes; I have to live with them.
Thanks a lot for all the help and guidance provided - it helped me a lot; execution time dropped from 38 hours to 45 minutes.
Tyler, thanks for your query - it did help me.
Thank you
Ravi
Analytics rock
Nengbing, January 09, 2007 - 1:26 pm UTC
Tom,
Thank you so much for your help to many who are using Oracle. I enjoy reading your books and this site.
I hope the following is relevant to this thread.
drop table t1 purge;
create table t1 (d number);
insert into t1 values(452911);
insert into t1 values(794034);
insert into t1 values(794057);
insert into t1 values(893069);
insert into t1 values(924952);
insert into t1 values(1068608);
insert into t1 values(1068617);
insert into t1 values(1237362);
insert into t1 values(1268008);
Desired output: groups of numbers that are within 2*1000, each group with a distinctive group id.
D P1 P2 G
---------- ---------- ---------- ----------
452911 451911 453911 1
794034 793034 795034 2
794057 793057 795057 2
893069 892069 894069 3
924952 923952 925952 4
1068608 1067608 1069608 5
1068617 1067617 1069617 5
1237362 1236362 1238362 6
1268008 1267008 1269008 7
I have managed to do the following, but am not sure how to assign distinctive groups.
1 select d,p1,p2,l1+NVL(lag(l1,1) over (order by d),0) g
2 from (
3 select d,d-1000 p1,d+1000 p2,
4 case when (lead(d,1) over (order by d) - d) < 2000 then 1
5 else 0
6* end l1 from t1)
SQL> /
D P1 P2 G
---------- ---------- ---------- ----------
452911 451911 453911 0
794034 793034 795034 1
794057 793057 795057 1
893069 892069 894069 0
924952 923952 925952 0
1068608 1067608 1069608 1
1068617 1067617 1069617 1
1237362 1236362 1238362 0
1268008 1267008 1269008 0
9 rows selected.
Basically, I want to put the consecutive rows that have 1 in column G into distinctive groups; each row with 0 in column G should be in its own group.
Any help is greatly appreciated.
Nengbing
January 11, 2007 - 8:59 pm UTC
sorry, this is very very confusing - don't get p1, p2, g AT ALL
... groups of numbers that are are within 2*1000, ...
means nothing - within 2*1000 of what? why 2*1000 why not 2000?
Analytics rock
Nengbing, January 10, 2007 - 7:00 pm UTC
create table t1 (d number);
---- thought of using a sequence to generate group ids
create sequence seq1 start with 1 increment by 1;
insert into t1 values (452911);
insert into t1 values (794012);
insert into t1 values (794034);
insert into t1 values (794234);
insert into t1 values (794345);
insert into t1 values (924952);
insert into t1 values (1068608);
insert into t1 values (1068617);
insert into t1 values (1237362);
insert into t1 values (1268008);
select t2.*,
case when (1=l1)
and ((lag(l1,1) over (order by d)) = 1)
then seq1.currval
else seq1.nextval
end g
from (
select d,d-1000 p1,d+1000 p2,
case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
else 0
end l1 from t1
) t2
/
D P1 P2 L1 G
---------- ---------- ---------- ---------- ----------
452911 451911 453911 0 281
794012 793012 795012 1 282
794034 793034 795034 1 283
794234 793234 795234 1 284
794345 793345 795345 1 285
924952 923952 925952 0 286
1068608 1067608 1069608 1 287
1068617 1067617 1069617 1 288
1237362 1236362 1238362 0 289
1268008 1267008 1269008 1 290
10 rows selected.
---- Why does column G always increment?
---- replace seq1.nextval with 100
select t2.*,
case when (1=l1)
and ((lag(l1,1) over (order by d)) = 1)
then seq1.currval
else 100
end g
from (
select d,d-1000 p1,d+1000 p2,
case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
else 0
end l1 from t1
) t2
/
D P1 P2 L1 G
---------- ---------- ---------- ---------- ----------
452911 451911 453911 0 100
794012 793012 795012 1 100
794034 793034 795034 1 290
794234 793234 795234 1 290
794345 793345 795345 1 290
924952 923952 925952 0 100
1068608 1067608 1069608 1 100
1068617 1067617 1069617 1 290
1237362 1236362 1238362 0 100
1268008 1267008 1269008 1 100
10 rows selected.
---- it seems that the condition testing is working
create sequence seq2 start with 1 increment by 1;
select t2.*,
case when (1=l1)
and ((lag(l1,1) over (order by d)) = 1)
then seq1.currval
else seq2.nextval
end g
from (
select d,d-1000 p1,d+1000 p2,
case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
else 0
end l1 from t1
) t2
/
D P1 P2 L1 G
---------- ---------- ---------- ---------- ----------
452911 451911 453911 0 1
794012 793012 795012 1 2
794034 793034 795034 1 290
794234 793234 795234 1 290
794345 793345 795345 1 290
924952 923952 925952 0 6
1068608 1067608 1069608 1 7
1068617 1067617 1069617 1 290
1237362 1236362 1238362 0 9
1268008 1267008 1269008 1 10
10 rows selected.
---- condition testing is OK
select t2.*,
case when (1=l1)
and ((lag(l1,1) over (order by d)) = 1)
then seq1.currval
else seq1.nextval
end g
from (
select d,d-1000 p1,d+1000 p2,
case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
else 0
end l1 from t1
) t2
/
D P1 P2 L1 G
---------- ---------- ---------- ---------- ----------
452911 451911 453911 0 291
794012 793012 795012 1 292
794034 793034 795034 1 293
794234 793234 795234 1 294
794345 793345 795345 1 295
924952 923952 925952 0 296
1068608 1067608 1069608 1 297
1068617 1067617 1069617 1 298
1237362 1236362 1238362 0 299
1268008 1267008 1269008 1 300
10 rows selected.
---- How come G always increments?
analytics rock
Nengbing, January 12, 2007 - 2:38 pm UTC
Tom,
I apologize for the confusion and hope this makes it clear.
Assuming I have a set of numbers, I'd like to put them into distinct groups such that, when the numbers are sorted, the difference between each number and the next one in the same group is less than 2000; all numbers that differ from some number in a group by less than 2000 should be in the same group.
For example:
452911 1 ---- put 452911 in group 1
794012 2 ---- in a new group because it differs from 1st number by more than 2000
794034 2 ---- in group 2 because it differs from last number by less than 2000
794234 2 ---- same group 2
794345 2 ---- same group 2
924952 3 ---- new group 3
1068608 4 ---- new group 4
1068617 4 ---- same group 4
1237362 5 ---- new group 5
1268008 6 ---- new group 6
By the way, I did find in the Oracle documentation that it says "......If any of these locations contains references to both CURRVAL and NEXTVAL, then Oracle increments the sequence and returns the same value for both CURRVAL and NEXTVAL......"
So using a sequence as I intended in my last reply does not work.
Thank you so much, Tom!
To: Nengbing
Michel Cadot, January 16, 2007 - 5:01 am UTC
SQL> with
2 step1 as (
3 select d,
4 case
5 when d-nvl(lag(d) over(order by d),0) > 2000
6 then rownum
7 end flag
8 from t1
9 ),
10 step2 as (
11 select d, max(flag) over (order by d) grp
12 from step1
13 )
14 select d,
15 dense_rank() over (order by grp) grp
16 from step2
17 order by d
18 /
D GRP
---------- ----------
452911 1
794012 2
794034 2
794234 2
794345 2
924952 3
1068608 4
1068617 4
1237362 5
1268008 6
10 rows selected.
First we flag (with their rownum) the rows with a difference greater than 2000.
Then we propagate this flag to all members of the same group.
Finally we convert this flag to a rank.
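These flag/propagate/rank steps collapse into a single pass when done procedurally. A minimal illustrative Python sketch (not from the thread), assuming numeric input and the 2000 threshold from the question:

```python
def gap_groups(values, gap=2000):
    """Assign a group id that increments whenever the difference from
    the previous value (in sorted order) exceeds the gap -- the
    flag-propagate-rank idea in one pass over the sorted data."""
    out, grp, prev = [], 0, None
    for v in sorted(values):
        if prev is None or v - prev > gap:
            grp += 1                 # gap too large: start a new group
        out.append((v, grp))
        prev = v
    return out

vals = [452911, 794012, 794034, 794234, 794345, 924952,
        1068608, 1068617, 1237362, 1268008]
print(gap_groups(vals))
```

Running this on the thread's data reproduces the group ids 1..6 shown in the SQL output above.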
Regards
Michel
To: Nengbing
Michel Cadot, January 16, 2007 - 3:40 pm UTC
Another solution, just writing what you said in the rule of a model clause:
SQL> select d, grp
2 from (select d from t1 order by d)
3 model
4 dimension by (rownum rn)
5 measures (d, cast(null as number) grp)
6 rules
7 ( grp[ANY] = case when d[cv()-1] is null then 1 -- case first line
8 when d[cv()]-d[cv()-1]>2000 then grp[cv()-1]+1
9 else grp[cv()-1]
10 end
11 )
12 order by 1
13 /
D GRP
---------- ----------
452911 1
794012 2
794034 2
794234 2
794345 2
924952 3
1068608 4
1068617 4
1237362 5
1268008 6
10 rows selected.
Michel
Thanks, Michel
Nengbing, January 16, 2007 - 6:30 pm UTC
Very nicely done!
I have only minor modifications: replace t1 on line 8 with "select d from t1 order by d", and 0 on line 5 with -2001, so the 1st row always gets a flag.
Thank you very much!
To: Nengbing
Michel Cadot, January 17, 2007 - 12:37 am UTC
You're right about line 5, but ordering table t1 in line 8 is useless: the rows are compared with their previous one in the order given in the lag function (that is, "d"), whatever the physical order in the table.
Michel
Michel Cadot, January 17, 2007 - 2:20 am UTC
I have to correct my query with the model clause.
I used "rownum" to dimension the result set, it works here 1) because the model clause does not generate any row and 2) thanks to the current implementation. But this is not the correct way to do it: I should number my lines before, for instance with the row_number function.
SQL> select d, grp
2 from (select d, row_number() over (order by d) rn from t1)
3 model
4 dimension by (rn)
5 measures (d, cast(null as number) grp)
6 rules
7 ( grp[ANY] = case when d[cv()-1] is null then 1 -- case first line
8 when d[cv()]-d[cv()-1]>2000 then grp[cv()-1]+1
9 else grp[cv()-1]
10 end
11 )
12 order by 1
13 /
D GRP
---------- ----------
452911 1
794012 2
794034 2
794234 2
794345 2
924952 3
1068608 4
1068617 4
1237362 5
1268008 6
10 rows selected.
Michel
Thanks, Michel
Nengbing, January 17, 2007 - 11:55 am UTC
I tend to agree with you. However, rownum as the flag may throw things off. For example, if I insert a few more rows like:
insert into t1 values (3000);
insert into t1 values (4000);
insert into t1 values (200);
insert into t1 values (400);
the query returns:
D GRP
---------- ----------
200 1
400 1
3000 1
4000 1
452911 1
794012 1
794034 1
794234 1
794345 1
924952 1
1068608 1
1068617 1
1237362 1
1268008 1
Because step1 returns:
D FLAG
---------- ----------
200 13
400
3000 11
4000
452911 1
794012 2
794034
794234
794345
924952 6
1068608 7
1068617
1237362 9
1268008 10
Thanks again, Michel; because of contributions from Oracle users like you, this site is even more helpful!
Thanks Nengbing
Michel Cadot, January 17, 2007 - 3:00 pm UTC
Oh yes, rownum is wrong, row_number should be used instead.
SQL> with
2 step1 as (
3 select d,
4 case
5 when d-nvl(lag(d) over(order by d),-2001) > 2000
6 then row_number() over(order by d)
7 end flag
8 from t1
9 ),
10 step2 as (
11 select d, max(flag) over (order by d) grp
12 from step1
13 )
14 select d,
15 dense_rank() over (order by grp) grp
16 from step2
17 order by d
18 /
D GRP
---------- ----------
200 1
400 1
3000 2
4000 2
452911 3
794012 4
794034 4
794234 4
794345 4
924952 5
1068608 6
1068617 6
1237362 7
1268008 8
14 rows selected.
Thanks for correcting that.
Michel
analytics to solve credit / debit allocation
RVH, January 24, 2007 - 12:44 am UTC
Tom,
You mentioned in a response to Dave Thompson on March 04, 2004, that the problem described by him in this thread cannot be solved using analytic functions. Just to let you know - the following does the job:
SQL> select * from pay_m;
PAY_ID PAYMENT
---------- ----------
1 50
2 25
3 50
4 50
SQL>
SQL>
SQL> select * from prem;
PREM_ID PREM_PAYMENT
---------- ------------
1 100
2 50
3 50
4 50
SQL>
The following query will allocate the actual payments (identified by PAY_ID) to the premiums (identified by PREM_ID). I also include the credit/debit amount remaining after each payment has been allocated.
SQL> select vprem.prem_id,
2 vpay.pay_id,
3 greatest(vprem.sum_prem_payment - vpay.sum_payment, 0) remaining_debit,
4 greatest(vpay.sum_payment - vprem.sum_prem_payment, 0) available_credit
5 from (select pay_id,
6 payment,
7 sum(payment) over (order by pay_id) - payment sum_prev_payment,
8 sum(payment) over (order by pay_id) sum_payment
9 from pay_m) vpay,
10 (select prem_id,
11 prem_payment,
12 sum(prem_payment) over (order by prem_id) - prem_payment sum_prev_prem_payment,
13 sum(prem_payment) over (order by prem_id) sum_prem_payment
14 from prem) vprem
15 where ( vpay.sum_payment > vprem.sum_prev_prem_payment
16 and vpay.sum_payment <= vprem.sum_prem_payment)
17 or ( vpay.sum_prev_payment > vprem.sum_prev_prem_payment
18 and vpay.sum_prev_payment < vprem.sum_prem_payment)
19 order by 1 asc, 2 asc
20 /
PREM_ID PAY_ID REMAINING_DEBIT AVAILABLE_CREDIT
---------- ---------- --------------- ----------------
1 1 50 0
1 2 25 0
1 3 0 25
2 3 25 0
2 4 0 25
3 4 25 0
6 rows selected.
Thanks for this very useful web site. I am learning a lot by browsing it.
Query - correction
RVH, January 24, 2007 - 7:37 pm UTC
Tom,
When I tested the query from my previous post more thoroughly, I realised that it has a flaw under certain boundary conditions. The change in the WHERE clause below fixes the problem. It is working fine now ... Cheers
select vprem.prem_id,
vpay.pay_id,
greatest(vprem.sum_prem_payment - vpay.sum_payment, 0) remaining_debit,
greatest(vpay.sum_payment - vprem.sum_prem_payment, 0) available_credit
from (select pay_id,
sum(payment) over (order by pay_id) - payment sum_prev_payment,
sum(payment) over (order by pay_id) sum_payment
from pay_m) vpay,
(select prem_id,
sum(prem_payment) over (order by prem_id) - prem_payment sum_prev_prem_payment,
sum(prem_payment) over (order by prem_id) sum_prem_payment
from prem) vprem
where greatest (vpay.sum_prev_payment, vprem.sum_prev_prem_payment)
< least (vpay.sum_payment, vprem.sum_prem_payment)
order by 1, 2
/
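The corrected WHERE clause is the standard interval-overlap test: each payment and each premium owns a half-open interval of the running sum, and a payment applies to a premium exactly when the two intervals overlap, i.e. greatest(starts) < least(ends). As a sanity check, here is a minimal Python sketch of the same allocation logic (the payment amounts are hypothetical, since the pay_m data is not shown in the thread):

```python
# Sketch of the running-sum allocation; the payment amounts are made up.
from itertools import accumulate

payments = [75, 50, 100]   # pay_id 1..3 (hypothetical)
premiums = [100, 50, 50]   # prem_id 1..3

def intervals(amounts):
    # [previous running sum, running sum) for each row
    sums = list(accumulate(amounts))
    return [(s - a, s) for a, s in zip(amounts, sums)]

rows = []
for prem_id, (p_lo, p_hi) in enumerate(intervals(premiums), 1):
    for pay_id, (q_lo, q_hi) in enumerate(intervals(payments), 1):
        if max(p_lo, q_lo) < min(p_hi, q_hi):    # intervals overlap
            rows.append((prem_id, pay_id,
                         max(p_hi - q_hi, 0),    # remaining_debit
                         max(q_hi - p_hi, 0)))   # available_credit
```

The SQL does the same thing in one pass per rowsource, with the analytic SUM supplying the running-sum interval endpoints.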
Any single query for this scenario
Subramanian T S, January 25, 2007 - 10:03 am UTC
Hi Tom,
I have a lookup table
company brand area ndays
XX {null} {null} 3
XX BB {null} 10
XX BB 1 4
XX BB 2 8
I want a query to retrieve ndays. The logic is as follows:
First search by company, brand and area
If match found return ndays.
Else search by company and brand
If match found return ndays
Else search by company
If match found return ndays
Else return 0
This can be done with simple PL/SQL. I am looking for a single query, maybe using some built-in functions.
Thanks in advance :)
Cheers !
Subbu.
To: Subramanian T S
Michel Cadot, January 29, 2007 - 6:20 am UTC
One way to do it is (with your data):
SQL> var comp varchar2(10)
SQL> var brand varchar2(10)
SQL> var area number
SQL> begin
2 :comp := 'XX';
3 :brand := 'BB';
4 :area := 1;
5 end;
6 /
PL/SQL procedure successfully completed.
SQL> with
2 data as (
3 select t.company, t.brand, t.area, t.ndays,
4 row_number () over
5 (order by t.company, t.brand nulls last, t.area nulls last) nb
6 from t
7 where t.company = :comp
8 and ( t.brand = :brand or t.brand is null )
9 and ( t.area = :area or t.area is null )
10 )
11 select company, brand, area, nvl(ndays,0) ndays
12 from (select 1 nb from dual) dual, data
13 where data.nb (+) = dual.nb
14 /
COMPANY BRAND AREA NDAYS
---------- ---------- ---------- ----------
XX BB 1 4
1 row selected.
SQL> exec :area := 3
PL/SQL procedure successfully completed.
SQL> /
COMPANY BRAND AREA NDAYS
---------- ---------- ---------- ----------
XX BB 10
1 row selected.
SQL> exec :brand := 'CC'
PL/SQL procedure successfully completed.
SQL> /
COMPANY BRAND AREA NDAYS
---------- ---------- ---------- ----------
XX 3
1 row selected.
SQL> exec :comp := 'YY'
PL/SQL procedure successfully completed.
SQL> /
COMPANY BRAND AREA NDAYS
---------- ---------- ---------- ----------
0
1 row selected.
The first subquery ("data") searches all lines that match your criteria and orders them according to your algorithm. The main query outputs the first row, or 0 if there is none.
Regards
Michel
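The trick in Michel's query is that ORDER BY with NULLS LAST makes the most specific match sort first, so row 1 of the ordered set is always the answer. The same fallback can be sketched procedurally in Python (same lookup data as the question):

```python
# Lookup rows: (company, brand, area, ndays); None plays the role of NULL.
rows = [
    ('XX', None, None, 3),
    ('XX', 'BB', None, 10),
    ('XX', 'BB', 1,    4),
    ('XX', 'BB', 2,    8),
]

def lookup(comp, brand, area):
    # Keep rows matching on company, where brand/area are either equal or NULL.
    candidates = [r for r in rows
                  if r[0] == comp
                  and r[1] in (brand, None)
                  and r[2] in (area, None)]
    if not candidates:
        return 0
    # NULLS LAST ordering: a concrete brand/area beats a NULL wildcard.
    candidates.sort(key=lambda r: (r[1] is None, r[2] is None))
    return candidates[0][3]
```

This mirrors the query: filter once, order by specificity, take row number 1, default to 0 via the outer join.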
Excellent !!!
A reader, January 30, 2007 - 6:04 am UTC
Thank you Michel! Excellent one..
Cheers !
Subbu.
excellent information.
kde, February 07, 2007 - 2:28 pm UTC
Tom,
I am having problems with the following requirement.
Would you suggest using analytics for this?
My data in the source looks like this.
1,abc,500
1,abc,200
1,abc,100
1,abc,150
2,aaa,200
2,aaa,150
2,aaa,50
3,asa,100
3,asa,30
3,asa,60
I need to generate
0,1,abc,500
1,1,abc,300(i.e.,500-200)
2,1,abc,200(i.e.,300-100)
3,1,abc,50(i.e.,200-150)
0,2,aaa,200
1,2,aaa,50(i.e.,200-150)
2,2,aaa,0(i.e.,50-50)
0,3,asa,100
1,3,asa,70(i.e.,100-30)
2,3,asa,10(i.e.,70-60)
I tried using analytics. Here is the query that I have so far
select
A.COL1 col1
,A.TXT1 txt1
,A.COL2 col2
,sum(A.COL2) as moving_sum
,lead(SUM(A.COL2),1) OVER (PARTITION BY a.col1, a.txt1 ORDER BY A.COL2 DESC) as next_row -- ROWS 1 PRECEDING)
FROM t1 a
group by A.COL1,A.TXT1,A.COL2
ORDER BY A.COL1 ASC,A.TXT1 ASC,A.COL2 DESC
--
I tried using the result of the above query but couldn't process the results.
I would really appreciate any suggestions/help.
thanks
kde
February 07, 2007 - 6:54 pm UTC
there is no explanation of how to get from input to output "requirements" wise. You don't explain what your outputs are - we just have a "picture" and we are supposed to derive the requirements from that - that is a flawed approach.
tell us what the outputs actually mean.
and give us create table and inserts to work with.
To kde ...
Gabe, February 07, 2007 - 8:20 pm UTC
The processing order is quite specific and cannot be derived from your input. To that end I have added a "sequence" column to your data set.
create table z ( id number, nm varchar2(5), val number, seq number );
insert into z values (1,'abc',500,1);
insert into z values (1,'abc',200,2);
insert into z values (1,'abc',100,3);
insert into z values (1,'abc',150,4);
insert into z values (2,'aaa',200,1);
insert into z values (2,'aaa',150,2);
insert into z values (2,'aaa', 50,3);
insert into z values (3,'asa',100,1);
insert into z values (3,'asa', 30,2);
insert into z values (3,'asa', 60,3);
20:08:28 session_139> select id,nm,seq
20:08:30 2 ,first_value(val) over (partition by id order by seq)
20:08:30 3 - sum(case when seq=1 then 0 else val end)
20:08:30 4 over (partition by id order by seq) balance
20:08:30 5 from z
20:08:30 6 order by 1,3
20:08:30 7 ;
ID NM SEQ BALANCE
---------- ----- ---------- ----------
1 abc 1 500
1 abc 2 300
1 abc 3 200
1 abc 4 50
2 aaa 1 200
2 aaa 2 50
2 aaa 3 0
3 asa 1 100
3 asa 2 70
3 asa 3 10
10 rows selected.
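Gabe's query computes, per id, the first value minus a running sum of every later value: first_value(val) anchors the opening amount, and the CASE zeroes out the first row inside the running SUM. A plain-Python sketch of the same balance computation, using the data from table z:

```python
# Running balance per id: opening amount minus the sum of subsequent values.
from itertools import groupby

data = [
    (1, 'abc', 500, 1), (1, 'abc', 200, 2), (1, 'abc', 100, 3), (1, 'abc', 150, 4),
    (2, 'aaa', 200, 1), (2, 'aaa', 150, 2), (2, 'aaa',  50, 3),
]

def balances(rows):
    out = []
    for _, grp in groupby(sorted(rows, key=lambda r: (r[0], r[3])),
                          key=lambda r: r[0]):
        grp = list(grp)
        bal = grp[0][2]                 # first_value(val): the opening amount
        out.append((grp[0][0], grp[0][1], grp[0][3], bal))
        for id_, nm, val, seq in grp[1:]:
            bal -= val                  # running subtraction of later rows
            out.append((id_, nm, seq, bal))
    return out
```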
apologize
a user, February 07, 2007 - 10:44 pm UTC
<quote>
there is no explanation of how to get from input to output "requirements" wise. You don't explain what your outputs are - we just have a "picture" and we are supposed to derive the requirements from that - that is a flawed approach.
<quote>
Tom,
I apologize for asking questions with incomplete information. I should have given you what Gabe had created.
regards
Using analytics in subqueries
sharvu, February 08, 2007 - 3:15 pm UTC
Tom,
I need your suggestion:
create table a(id number,src number,dt date);
There are 50 Mill records in this table that actually has 50 more columns.
I would need something like:
select count(*) from a t1
where src=999 and dt
= (select max(dt) from a t2 where t2.src=t1.src and t2.id=t1.id)
The query runs forever almost. Is there an effective way to do this?
Thanks,
February 08, 2007 - 4:25 pm UTC
select count(*)
from (select dt, max(dt) over (partition by id) max_dt
from a
where src = 999
)
where dt = max_dt;
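Tom's rewrite replaces the correlated subquery (re-evaluated per row) with a single analytic pass: compute max(dt) per id once, then keep rows where dt equals it. The effect can be sketched in Python (ids and dates below are hypothetical, with the src = 999 filter already applied):

```python
# Keep rows whose dt equals the per-id maximum, then count them.
rows = [(1, '2007-01-01'), (1, '2007-02-01'), (2, '2007-01-15')]  # (id, dt)

max_dt = {}
for id_, dt in rows:                 # one pass: max(dt) over (partition by id)
    if id_ not in max_dt or dt > max_dt[id_]:
        max_dt[id_] = dt

count = sum(dt == max_dt[id_] for id_, dt in rows)
```

One scan of the filtered data instead of one subquery execution per candidate row is what makes the analytic version viable on a 50 million row table.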
Can this be done through Analytical Function?
A reader, February 10, 2007 - 2:19 am UTC
Hi Tom,
I have a set of very large tables (ranging from around 10 million to 40 million rows). I have created a test table just to set up an example of what I want to achieve:
create table test (start_date date, end_date date, amnt number(10));
insert into test values ('01-JAN-2000','01-JAN-2001',100);
insert into test values ('01-JAN-2001','01-JAN-2002',200);
insert into test values ('01-JAN-2002','01-JAN-2003',200);
insert into test values ('01-JAN-2003','01-JAN-2004',500);
insert into test values ('18-JAN-2004','01-JAN-2004',1000);
SQL> select * from test;
START_DAT END_DATE AMNT
--------- --------- ----------
01-JAN-00 01-JAN-01 100
01-JAN-01 01-JAN-02 200
01-JAN-02 01-JAN-03 200
01-JAN-03 01-JAN-04 500
18-JAN-04 01-JAN-04 1000
select * FROM test
2 WHERE (start_date, end_date) IN (SELECT max(start_date), max(END_DATE) FROM test
3 where add_months(start_date,12)-END_DATE = 0);
START_DAT END_DATE AMNT
--------- --------- ----------
01-JAN-03 01-JAN-04 500
i.e., I want to travel backwards and get the amount/record where the difference between the dates is 1 year.
Regards,
February 12, 2007 - 9:51 am UTC
you want to select specific rows - for that you need a where clause.
analytics are useful to look "across rows", not for getting "specific rows" so much.
another way for you would be
select *
from
(
select *
from test
where months_between(end_date,start_date) = 12
order by start_date DESC
)
where rownum = 1;
To: A Reader
Michel Cadot, February 12, 2007 - 2:41 am UTC
SQL> select start_date, end_date, amnt
2 from ( select start_date, end_date, amnt,
3 max(start_date) over () max_start,
4 max(end_date) over () max_end
5 from test
6 where add_months(start_date,12)-END_DATE = 0
7 )
8 where start_date = max_start and end_date = max_end
9 /
START_DATE END_DATE AMNT
----------- ----------- ----------
01-JAN-2003 01-JAN-2004 500
Regards
Michel
Is Analytic Query an answer for this?
A Reader, March 01, 2007 - 12:11 am UTC
Tom,
I really need some help with this. Please Help!!!!
I have a table that looks like this.
CREATE TABLE T5
( COL1 VARCHAR2(26) NOT NULL,
COL2 NUMBER(10) NOT NULL,
COL3 NUMBER(10) NOT NULL,
COL4 VARCHAR2(4) NOT NULL,
COL5 VARCHAR2(10)
)
Insert into T5 values ('2007-01-25 08:22:16',101,1,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,1,'L','XYZ');
Insert into T5 values ('2007-01-25 08:22:16',101,2,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,2,'L','XYZ');
Insert into T5 values ('2007-01-25 08:22:16',101,3,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,3,'L','XYZ');
COL1 COL2 COL3 COL4 COL5
2007-01-25 08:22:16 101 1 A ABC
2007-01-25 08:22:16 101 1 L XYZ
2007-01-25 08:22:16 101 2 A ABC
2007-01-25 08:22:16 101 2 L XYZ
2007-01-25 08:22:16 101 3 A ABC
2007-01-25 08:22:16 101 3 L XYZ
I need the output like this.
COL1 COL2 COL3 A_GRP L_GRP
2007-01-25 08:22:16 101 1 ABC XYZ
2007-01-25 08:22:16 101 2 ABC XYZ
2007-01-25 08:22:16 101 3 ABC XYZ
i.e.
For each group of COL1, COL2, COL3 do the following.
For each ~COL5 where COL4='A'~ put the value of ~COL5 where COL4='L'~ right next to it.
I did try the pivot query,
select
col1, col2, col3,
max(decode(col4,'A',COL5,null)) a,
max(decode(col4,'L',COL5,null)) b
from t5
group by col1, col2, col3;
COL1 COL2 COL3 A B
2007-01-25 08:22:16 101 2 ABC XYZ
2007-01-25 08:22:16 101 1 ABC XYZ
2007-01-25 08:22:16 101 3 ABC XYZ
But this would not work once I add more columns to the query.
I was thinking that this could be easily done with Analytic Query but I have not been able to figure it out yet.
Is there another way?
thanks for your help
March 01, 2007 - 8:30 am UTC
why doesn't it work - it is a typical pivot, works for any number of "columns"
tell us what isn't working first.
OK
Kumar, March 05, 2007 - 6:15 am UTC
Hi Tom,
I have a query which gives the price difference at three levels namely
Product,Pack( A pack is group of similar products) and market(all packs).
DDL,DML and query is below.
SQL> create table pack(ccode varchar2(30),pack_code varchar2(30),prod_code number,
2 curr_price number,prev_price number,trans_date date)
3 /
Table created.
SQL> insert into pack values('UK','PK1',100,100.25,140.90,sysdate-6);
SQL> insert into pack values('UK','PK1',101,10.25,40.90,sysdate-6);
SQL> insert into pack values('UK','PK1',100,55,45,SYSDATE-6);
SQL> insert into pack values('UK','PK2',200,22.75,20.75,SYSDATE-5);
SQL> insert into pack values('UK','PK2',200,30.25,27.25,SYSDATE-5);
SQL> insert into pack values('UK','PK2',200,102.50,90.50,SYSDATE-5);
SQL> commit
2 /
Commit complete.
SQL> select * from pack
2 /
CCODE PACK_CODE PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT
------------------------------ ------------------------------ ---------- ---------- ---------- -----
UK PK1 100 100.25 140.9 27-FEB-07
UK PK1 101 10.25 40.9 27-FEB-07
UK PK1 100 55 45 27-FEB-07
UK PK2 200 22.75 20.75 28-FEB-07
UK PK2 200 30.25 27.25 28-FEB-07
UK PK2 200 102.5 90.5 28-FEB-07
6 rows selected.
When I issue the below query the computed columns namely prod_prc_diff and pack_prc_diff
show the same values.
SQL> col ccode for a5
SQL> col pack_code for a5
SQL> select p.*,sum(abs(curr_price-prev_price)) over(partition by pack_code,prod_code,trans_date) as
prod_prc_diff,sum(abs(curr_price-prev_price)) over(partition by pack_code,trans_date)
as pack_prc_diff from pack p;
CCODE PACK_ PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT PROD_PRC_DIFF PACK_PRC_DIFF
----- ----- ---------- ---------- ---------- --------- ------------- -------------
UK PK1 100 100.25 140.9 27-FEB-07 40.65 40.65
UK PK1 101 10.25 40.9 27-FEB-07 30.65 30.65
UK PK1 100 55 45 27-FEB-07 10 10
UK PK2 200 22.75 20.75 28-FEB-07 2 2
UK PK2 200 30.25 27.25 28-FEB-07 3 3
UK PK2 200 102.5 90.5 28-FEB-07 12 12
6 rows selected.
But I want the pack_prc_diff column to show two rows as
pack_prc_diff
=============
80.3
17
I also need one more column as below which would be sum of 80.3+17
mkt_prc_diff
============
97.3
Please let me know how to arrive at this.
Thanks Tom.
March 05, 2007 - 1:17 pm UTC
I'm confused, why do our outputs differ given the same inputs
ops$tkyte%ORA10GR2> select p.*,sum(abs(curr_price-prev_price)) over(partition by pack_code,prod_code,trans_date) as
2 prod_prc_diff,sum(abs(curr_price-prev_price)) over(partition by pack_code,trans_date)
3 as pack_prc_diff from pack p;
CCODE PACK_ PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT PROD_PRC_DIFF PACK_PRC_DIFF
----- ----- ---------- ---------- ---------- --------- ------------- -------------
UK PK1 100 100.25 140.9 27-FEB-07 50.65 81.3
UK PK1 100 55 45 27-FEB-07 50.65 81.3
UK PK1 101 10.25 40.9 27-FEB-07 30.65 81.3
UK PK2 200 102.5 90.5 28-FEB-07 17 17
UK PK2 200 22.75 20.75 28-FEB-07 17 17
UK PK2 200 30.25 27.25 28-FEB-07 17 17
Is Analytic Query an answer for this?
A reader, March 05, 2007 - 10:05 am UTC
<quote>
why doesn't it work - it is a typical pivot, works for any number of "columns"
tell us what isn't working first.
</quote>
Pivot would work if I add more columns to the max(decode()) list. I want to add more columns to the 'group by' list.
drop table t5;
CREATE TABLE T5
(
COL1 VARCHAR2(26 BYTE) NOT NULL,
COL2 NUMBER(10) NOT NULL,
COL3 NUMBER(10) NOT NULL,
COL4 VARCHAR2(4 BYTE) NOT NULL,
COL5 VARCHAR2(10 BYTE),
COL6 NUMBER(10),
COL7 VARCHAR2(10 BYTE)
)
/
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 1, 'A', 'ABC',
1, NULL);
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 1, 'L', 'XYZ',
2, 'BB');
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 2, 'A', 'ABC',
3, NULL);
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 2, 'L', 'XYZ',
4, 'DD');
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 3, 'A', 'ABC',
5, NULL);
Insert into T5
(COL1, COL2, COL3, COL4, COL5,
COL6, COL7)
Values
('2007-01-25 08:22:16', 101, 3, 'L', 'XYZ',
6, 'FF');
COMMIT;
SQL> select
2 col1, col2, col3, col6, col7,
3 max(decode(col4,'A',COL5,null)) a,
4 max(decode(col4,'L',COL5,null)) b
5 from T5
6 group by col1, col2, col3, col6, col7
7 order by col1, col2, col3
8 ;
COL1 COL2 COL3 COL6 COL7 A B
------------------- ----- ------- ---------- ---------- ------- -------
2007-01-25 08:22:16 101 1 1 ABC
2007-01-25 08:22:16 101 1 2 BB XYZ
2007-01-25 08:22:16 101 2 3 ABC
2007-01-25 08:22:16 101 2 4 DD XYZ
2007-01-25 08:22:16 101 3 5 ABC
2007-01-25 08:22:16 101 3 6 FF XYZ
I do realize that the Pivot Query above is working exactly as it is supposed to.
But I need the output to be like
COL1 COL2 COL3 COL6 COL7 A B
------------------- ----- ------- ---------- ---------- ------- -------
2007-01-25 08:22:16 101 1 2 BB ABC XYZ
2007-01-25 08:22:16 101 2 4 DD ABC XYZ
2007-01-25 08:22:16 101 3 6 FF ABC XYZ
And that's why I thought that Analytic Query could be the answer.
thanks
March 05, 2007 - 2:07 pm UTC
and you would have to tell us the LOGIC BY WHICH YOUR OUTPUT IS COMING FROM
did you want the max(col6), and the max(col7) - eg; you did not mean to group by them.
ops$tkyte%ORA10GR2> select
2 col1, col2, col3, max(col6), max(col7),
3 max(decode(col4,'A',COL5,null)) a,
4 max(decode(col4,'L',COL5,null)) b
5 from T5
6 group by col1, col2, col3
7 order by col1, col2, col3
8 ;
COL1 COL2 COL3 MAX(COL6) MAX(COL7) A B
-------------------------- ---------- ---------- ---------- ---------- ---------- ----------
2007-01-25 08:22:16 101 1 2 BB ABC XYZ
2007-01-25 08:22:16 101 2 4 DD ABC XYZ
2007-01-25 08:22:16 101 3 6 FF ABC XYZ
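Tom's fix keeps the pivot group at (col1, col2, col3) and collapses col6/col7 with MAX instead of grouping by them. A Python sketch of that shape, using a subset of the T5 rows (note that Python's max over a dict here is written out explicitly so NULL/None is skipped, as SQL's MAX does):

```python
# Pivot sketch: group by (col1,col2,col3); the decode on col4 routes col5 into
# the A or L column; col6/col7 are collapsed with MAX, ignoring NULLs.
rows = [  # (col1, col2, col3, col4, col5, col6, col7), from the T5 example
    ('2007-01-25 08:22:16', 101, 1, 'A', 'ABC', 1, None),
    ('2007-01-25 08:22:16', 101, 1, 'L', 'XYZ', 2, 'BB'),
    ('2007-01-25 08:22:16', 101, 2, 'A', 'ABC', 3, None),
    ('2007-01-25 08:22:16', 101, 2, 'L', 'XYZ', 4, 'DD'),
]

out = {}
for c1, c2, c3, c4, c5, c6, c7 in rows:
    g = out.setdefault((c1, c2, c3),
                       {'A': None, 'L': None, 'col6': None, 'col7': None})
    g[c4] = c5                                     # max(decode(col4, ..., col5))
    if g['col6'] is None or c6 > g['col6']:
        g['col6'] = c6                             # max(col6)
    if c7 is not None and (g['col7'] is None or c7 > g['col7']):
        g['col7'] = c7                             # max(col7), NULLs skipped
```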
Logic you asked for!!!
A Reader, March 05, 2007 - 3:27 pm UTC
...LOGIC BY WHICH YOUR OUTPUT IS COMING FROM...
I guess that would help. haha. Here it is...
For the table like T5 which has n (where n is defined) number of keys,
For every row with COL4 as L (call it X),
there may be multiple rows with COL4 as A (call it Y).
Among these matching rows there is a common key (which is a part of n keys above).
Data from Y contains values in multiple columns which need to be joined with columns from X in order to create a ~complete~ row for the purpose of the process.
for example, consider the table below
4KEYS 1key COL4 COL5
----- ---- ---- ---------
P 1 L L_VALUES
Q 1 A A_VALUES
R 1 A A_VALUES
From the table above, create TWO rows (because there are TWO 'A' type lines) that look like the following.
4KEYS 1key L A
----- ---- --------- ---------
P 1 L_VALUES A_VALUES
Q 1 L_VALUES A_VALUES
Would you use analytics for that or just a simple self join?
thanks
March 05, 2007 - 4:24 pm UTC
I did not get the 'four keys' bit?
how did you know to keep "Q" but discard "R"
we are missing something here.
Logic (Updated!!!)
A Reader, March 05, 2007 - 3:31 pm UTC
I think the 4KEYS column would have to go...
1key L A
---- --------- ---------
1 L_VALUES A_VALUES
1 L_VALUES A_VALUES
I was wrong!!!
reader, March 06, 2007 - 9:32 am UTC
Actually I was wrong when I gave this table as an example.
4KEYS 1key COL4 COL5
----- ---- ---- ---------
P 1 L L_VALUES
Q 1 A A_VALUES
R 1 A A_VALUES
THE CORRECT EXAMPLE WOULD BE
4KEYS 1key COL4 COL5
----- ---- ---- ---------
P 1 L L_VALUES
P 1 A A_VALUES
P 1 A A_VALUES
i.e. the 4KEYS would remain same for that rowset.
And from this table I would have to create TWO rows (because there are TWO 'A' type lines) that look like the following.
4KEYS 1key L A
----- ---- --------- ---------
P 1 L_VALUES A_VALUES
P 1 L_VALUES A_VALUES
Sorry for the confusion there...
thanks
March 06, 2007 - 11:11 am UTC
Now you entirely, utterly and totally have lost me.
start over - use much more TEXT with WORDS than "pictures of inputs and outputs"
I fail to see how you get from your input to your output.
Across Rows
A reader, March 08, 2007 - 9:41 pm UTC
Hi Tom,
In relation to data set below:-
create table test (start_date date, end_date date, amnt number(10));
insert into test values ('01-JAN-2000','01-JAN-2001',100);
insert into test values ('01-JAN-2001','01-JAN-2002',200);
insert into test values ('01-JAN-2002','01-JAN-2003',200);
insert into test values ('01-JAN-2003','01-JAN-2004',500);
insert into test values ('18-JAN-2004','01-JAN-2004',1000);
SQL> select * from test;
START_DAT END_DATE AMNT
--------- --------- ----------
01-JAN-00 01-JAN-01 100
01-JAN-01 01-JAN-02 200
01-JAN-02 01-JAN-03 200
01-JAN-03 01-JAN-04 500
18-JAN-04 01-JAN-04 1000
How do I get a record by comparing the end dates of two adjacent records (traversing backwards) where the end dates changed/are different? I.e., I want to get the second-last record (amount 500) since, comparing its end date with the next record's date, the date is different.
Regards,
March 09, 2007 - 10:45 am UTC
ops$tkyte%ORA10GR2> select *
2 from (
3 select start_date,
4 end_date,
5 lead(start_date) over (order by start_date) next_start_date,
6 amnt
7 from test
8 )
9 where end_date <> next_start_date
10 /
START_DAT END_DATE NEXT_STAR AMNT
--------- --------- --------- ----------
01-JAN-03 01-JAN-04 18-JAN-04 500
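The lead() call lets each row see the following row's start date, so the filter can keep only rows where the dates break. The same pairwise comparison in Python, with the test-table data written as ISO strings so plain string comparison orders them correctly:

```python
# Rows as (start, end, amnt); data from the test table, ISO date strings.
rows = [('2000-01-01', '2001-01-01', 100),
        ('2001-01-01', '2002-01-01', 200),
        ('2002-01-01', '2003-01-01', 200),
        ('2003-01-01', '2004-01-01', 500),
        ('2004-01-18', '2004-01-01', 1000)]
rows.sort()                                  # order by start_date

hits = [r for r, nxt in zip(rows, rows[1:])
        if r[1] != nxt[0]]                   # end_date <> next start_date
```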
Thanks a lot!
A reader, March 09, 2007 - 5:43 pm UTC
Variation
Juan Velez, March 10, 2007 - 4:15 pm UTC
This is related to the topic, so I would like to ask:
I have a table with effective-dated data, this is, effective/termination dates
CREATE TABLE EFFECTIVEDATED(ID NUMBER NOT NULL, EFFECTIVEDAY DATE NOT NULL, TERMINATIONDAY DATE NOT NULL, SOMEVALUE NUMBER);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20070301','YYYYMMDD'), TO_DATE('20070601','YYYYMMDD'), 10);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20070901','YYYYMMDD'), TO_DATE('20080601','YYYYMMDD'), 8);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20080601','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'), 12);
The data in this example has gaps if we consider the range 20070101 and 20090101. I would like to get those missing segments as part of a query whose "somevalue" column would be 0, this is, given start/end dates 20070101/20090101 and the above data, a query should return the following records (assume that the query looks for ID=1)
ID EFFECTIVEDAY TERMINATIONDAY SOMEVALUE
-- ------------ -------------- ---------
1 20070101 20070301 0
1 20070301 20070601 10
1 20070601 20070901 0
1 20070901 20080601 8
1 20080601 20090101 12
I have accomplished this using PL/SQL but I know it can be done using analytics, I just do not know how, and that's what I am hoping you can help me with
Thanks
March 12, 2007 - 5:38 pm UTC
you know, by doing that - you've utterly handicapped the optimizer. You should have used NULL, not 99991231
start with this... it can be solved using this set:
ops$tkyte%ORA10GR2> with data
2 as
3 (select 1 l from dual union all select 2 l from dual)
4 select *
5 from data d, (select id, effectiveday edt, terminationday tdt, somevalue val,
6 lag( terminationday ) over (order by effectiveday ) last_tdt,
7 lead( effectiveday ) over (order by effectiveday ) next_edt,
8 to_date(:x,'yyyymmdd') x,
9 to_date(:y,'yyyymmdd') y
10 from effectivedated
11 where id = :z
12 ) e
13 order by edt, l
14 /
L ID EDT TDT VAL LAST_TDT NEXT_EDT X Y
---------- ---------- --------- --------- ---------- --------- --------- --------- ---------
1 1 01-MAR-07 01-JUN-07 10 01-SEP-07 01-JAN-07 01-JAN-09
2 1 01-MAR-07 01-JUN-07 10 01-SEP-07 01-JAN-07 01-JAN-09
1 1 01-SEP-07 01-JUN-08 8 01-JUN-07 01-JUN-08 01-JAN-07 01-JAN-09
2 1 01-SEP-07 01-JUN-08 8 01-JUN-07 01-JUN-08 01-JAN-07 01-JAN-09
1 1 01-JUN-08 31-DEC-99 12 01-JUN-08 01-JAN-07 01-JAN-09
2 1 01-JUN-08 31-DEC-99 12 01-JUN-08 01-JAN-07 01-JAN-09
6 rows selected.
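Tom's doubled row set (each source row appearing with l = 1 and l = 2, plus lag/lead neighbours) gives every row the chance to emit both itself and the gap before it. To make the target output concrete, here is a plain procedural Python sketch of the gap filling (this is not Tom's SQL, which he leaves as an exercise; it just produces Juan's desired rows, dates as ISO strings):

```python
# Fill gaps in effective-dated segments over a query window with zero-value rows.
segments = [('2007-03-01', '2007-06-01', 10),
            ('2007-09-01', '2008-06-01', 8),
            ('2008-06-01', '9999-12-31', 12)]
window_lo, window_hi = '2007-01-01', '2009-01-01'

out = []
cursor = window_lo
for eff, term, val in sorted(segments):
    if cursor < eff:                       # gap before this segment
        out.append((cursor, eff, 0))
    out.append((eff, min(term, window_hi), val))   # clip to the window end
    cursor = max(cursor, term)
if cursor < window_hi:                     # trailing gap up to the window end
    out.append((cursor, window_hi, 0))
```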
Analytics to the rescue
Livio, March 12, 2007 - 6:05 am UTC
I hope my problem can be considered a followup of this thread. Otherwise, I'll try to put a question at a later time.
I have a table like this:
create table eub.test (date_time date, validity number);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27336);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27340);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27336);
commit;
I'd like to make a report like the following:
date_time validity group
---------- -------- -----
08-01-2007 27334 0
08-01-2007 27335 0
08-01-2007 27336 0
09-01-2007 27334 1
09-01-2007 27335 1
09-01-2007 27340 1
10-01-2007 27334 0
10-01-2007 27335 0
10-01-2007 27336 0
In other words, dates with the very same validities belong to the same group (or partition).
Is it possible to figure out a solution using analytics, without writing down procedural code?
Thank you very much in advance.
March 12, 2007 - 8:01 pm UTC
you'll need to be a tad more precise in your definition of a "group"
what logic did you follow to put the first 3 and last 3 together and what happens if there was a group with more "validities" that covered that group and all......
define, precisely, how to identify a group - in a manner a programmer could produce code from.
Analytics to the rescue
Livio, March 13, 2007 - 4:36 am UTC
Tom,
I should have used the word "set" instead of "group". In fact, I put the first 3 and the last three together because
the set of values s1 = {27334, 27335, 27336} associated to the date 08-01-2007 is the same set of values associated to the date 10-01-2007.
The set s2 = {27334, 27335, 27340} associated to the date 09-01-2007 is not equal to s1, so dates 09-01-2007 belong to a different set.
If I insert the following rows:
insert into test (date_time, validity)
values (to_date('11-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('11-01-2007', 'dd-mm-yyyy'), 27335);
commit;
I would add a third set s3 = {27334, 27335} that, even if subset of s1 and s2, is not equal to s1, s2. Rows with date 11-01-2007 should be marked as belonging to a third set, say 2.
date_time validity group
---------- -------- -----
08-01-2007 27334 0
08-01-2007 27335 0
08-01-2007 27336 0
09-01-2007 27334 1
09-01-2007 27335 1
09-01-2007 27340 1
10-01-2007 27334 0
10-01-2007 27335 0
10-01-2007 27336 0
11-01-2007 27334 2
11-01-2007 27335 2
To sum up: set equality is the criterion to group rows of table test together. Given two dates d1 and d2, let us call s1 and s2 the set of validities associated to d1 and d2, respectively. d1 and d2 belong to the same set (group, partition) if and only if set s1 equals to set s2.
I'd like to solve the above problem without writing down a procedure going through table test to identify all dates with the same set of validities. I thought that an analytic approach might help.
Thank you
March 13, 2007 - 11:24 am UTC
doubt you would ever come up with an efficient SQL based method for this.
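To pin down what the later solutions (stragg, MODEL, pivot) are all computing: each date maps to the set of its validities, and dates with equal sets share a group number. A minimal Python sketch of that definition, using Livio's data:

```python
# Group dates whose validity sets are equal; data from Livio's example.
from collections import defaultdict

rows = [('2007-01-08', v) for v in (27334, 27335, 27336)] + \
       [('2007-01-09', v) for v in (27334, 27335, 27340)] + \
       [('2007-01-10', v) for v in (27334, 27335, 27336)] + \
       [('2007-01-11', v) for v in (27334, 27335)]

sets = defaultdict(set)
for d, v in rows:
    sets[d].add(v)

# First date seen with a given set of validities assigns its group number.
group_of_set = {}
for d in sorted(sets):
    group_of_set.setdefault(frozenset(sets[d]), len(group_of_set))
grouped = {d: group_of_set[frozenset(sets[d])] for d in sets}
```

The SQL challenge is doing this set comparison without procedural code, which is what the follow-ups below attempt.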
SQL Model and Analytics to the rescue
Frank Zhou, March 13, 2007 - 2:43 pm UTC
Hi Livio,
Here is a SQL model clause solution for you.
I was able to put the identical sets into the same group
(but the group number is not exactly the same as yours)
Frank
SQL> SELECT date_time ,
2 SUBSTR(fin_str,
3 INSTR(fin_str, ',', 1, LEVEL ) + 1,
4 INSTR(fin_str, ',', 1, LEVEL+1) -
5 INSTR(fin_str, ',', 1, LEVEL) -1 ) validity,
6 grp
7 FROM (
8 SELECT date_time , ','||fin_str||',' as fin_str, dense_rank( ) over (order by fin_str) -1 grp
9 FROM
10 (
11 SELECT date_time, validity, validity_str, fin_str
12 FROM test
13 MODEL
14 PARTITION BY (date_time)
15 DIMENSION BY ( row_number() OVER (PARTITION BY date_time ORDER BY validity) rn )
16 MEASURES
17 (
18 CAST(NULL AS VARCHAR2(3255)) validity_str, validity, CAST(NULL AS VARCHAR2(3255)) fin_str
19 )
20 RULES
21 (
22 validity_str[ANY] ORDER BY rn =
23 CASE WHEN validity[cv() - 1 ] IS NULL
24 THEN TO_CHAR(validity[cv()])
25 ELSE validity_str[cv()-1]||','|| TO_CHAR(validity[cv()])
26 END,
27 fin_str[ANY] ORDER BY rn =
28 CASE WHEN validity[cv() + 1 ] IS NULL
29 THEN validity_str[CV()]
30 END
31 )
32 )
33 WHERE fin_str IS NOT NULL
34 )
35 CONNECT BY PRIOR date_time = date_time
36 AND INSTR (fin_str, ',', 1, LEVEL+1) > 0
37 AND PRIOR dbms_random.string ('p', 10) IS NOT NULL
38 order by date_time, validity;
DATE_TIME VALIDITY GRP
--------- ------------------ ----------
08-JAN-07 27334 1
08-JAN-07 27335 1
08-JAN-07 27336 1
09-JAN-07 27334 2
09-JAN-07 27335 2
09-JAN-07 27340 2
10-JAN-07 27334 1
10-JAN-07 27335 1
10-JAN-07 27336 1
11-JAN-07 27334 0
11-JAN-07 27335 0
11 rows selected.
SQL> spool off;
sys_connect_by_path and Analytics to the rescue
Frank Zhou, March 13, 2007 - 3:19 pm UTC
In addition to my 10G SQL Model Clause solution above,
Here is a 9I pure SQL solution.
Frank
SQL> SELECT date_time ,
2 SUBSTR(fin_str,
3 INSTR(fin_str, ',', 1, LEVEL ) + 1,
4 INSTR(fin_str, ',', 1, LEVEL+1) -
5 INSTR(fin_str, ',', 1, LEVEL) -1 ) validity,
6 grp
7 FROM
8 (SELECT date_time,fin_str||',' as fin_str,
9 dense_rank( ) over (order by fin_str) -1 grp
10 FROM
11 (SELECT date_time, validity, fin_str,
12 MAX(LENGTH(fin_str) - LENGTH(REPLACE(fin_str, ',', '')))
13 OVER (PARTITION BY date_time) max_path,
14 LENGTH(fin_str)-LENGTH(REPLACE(fin_str, ',', '')) str_len
15 FROM (SELECT date_time, validity,
16 sys_connect_by_path(validity, ',') fin_str
17 from test
18 connect by prior date_time = date_time
19 and prior validity < validity
20 )
21 )
22 WHERE max_path = str_len
23 )
24 CONNECT BY PRIOR date_time = date_time
25 AND INSTR (fin_str, ',', 1, LEVEL+1) > 0
26 AND PRIOR dbms_random.string ('p', 10) IS NOT NULL
27 order by date_time, validity;
DATE_TIME VALIDITY GRP
--------- ------------------ ----------
08-JAN-07 27334 1
08-JAN-07 27335 1
08-JAN-07 27336 1
09-JAN-07 27334 2
09-JAN-07 27335 2
09-JAN-07 27340 2
10-JAN-07 27334 1
10-JAN-07 27335 1
10-JAN-07 27336 1
11-JAN-07 27334 0
11-JAN-07 27335 0
11 rows selected.
SQL> spool off;
Easier Analytics to the rescue
Michel Cadot, March 13, 2007 - 3:34 pm UTC
Using the well-known stragg function defined at:
http://asktom.oracle.com/pls/asktom/f?p=100:11:2073682287744121::::P11_QUESTION_ID:2196162600402
SQL> with
2 data as (
3 select date_time, validity,
4 stragg(validity) over (partition by date_time) same_date
5 from (select date_time, validity from test order by validity, date_time)
6 )
7 select date_time, validity,
8 dense_rank() over (order by same_date) "GROUP"
9 from data
10 order by 1, 2
11 /
DATE_TIME VALIDITY GROUP
---------- ---------- ----------
01/08/2007 27334 3
01/08/2007 27335 3
01/08/2007 27336 3
01/09/2007 27334 1
01/09/2007 27335 1
01/09/2007 27340 1
01/10/2007 27334 3
01/10/2007 27335 3
01/10/2007 27336 3
01/11/2007 27334 2
01/11/2007 27335 2
Regards
Michel
March 13, 2007 - 3:54 pm UTC
you'd need to make sure to use a stragg that SORTS for this to work reliably (and probably "distincts" too)
but yes, why didn't I think of a simple pivot....
so perhaps something even as simple as:
ops$tkyte%ORA10GR2> select date_time,
2 v1, v2, v3, v4, v5, v6, v7, v8
3 from (
4 select date_time,
5 max(decode(rn,1,validity)) v1,
6 max(decode(rn,2,validity)) v2,
7 max(decode(rn,3,validity)) v3,
8 max(decode(rn,4,validity)) v4,
9 max(decode(rn,5,validity)) v5,
10 max(decode(rn,6,validity)) v6,
11 max(decode(rn,7,validity)) v7,
12 max(decode(rn,8,validity)) v8
13 from (
14 select date_time, row_number() over (partition by date_time order by validity) rn, validity
15 from test
16 )
17 group by date_time
18 )
19 order by v1, v2, v3, v4, v5, v6, v7, v8
20 /
DATE_TIME V1 V2 V3 V4 V5 V6 V7 V8
--------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
08-JAN-07 27334 27335 27336
10-JAN-07 27334 27335 27336
09-JAN-07 27334 27335 27340
would be useful - you can partition by v1, v2, .... vN
(I know, 8 columns = limit, you can raise limit. stragg = limit as well though, varchar2(4000))
RE: Analytics to the rescue
Frank Zhou, March 13, 2007 - 4:19 pm UTC
Hi Tom,
My two SQL solutions above took care of the "sorting" (the 9i pure SQL solution also handles the "distinct"). For the SQL model clause solution to handle the "distinct", the following SQL pattern can be used:
http://www.jlcomp.demon.co.uk/faq/Exclude_duplicate.html
Thanks,
Frank
Easier Analytics to the rescue (2)
Michel Cadot, March 13, 2007 - 4:19 pm UTC
Tom,
For your first remark, this is why I sorted the result set before applying stragg, but of course it works only if we do not PARALLEL_ENABLE the function.
For the pivot, I thought about it, but the problem is that when you "group by date_time" to build the set of values per date, you lose "validity", which is required in the final result; so joining this back to the original test table seems unavoidable, and we get 2 scans of the test table instead of 1 with stragg.
Regards
Michel
March 13, 2007 - 8:53 pm UTC
not sure that we can 100% rely on the sort for the aggregate to work.... it might, it might not. there is nothing saying it will (there is a version of stragg that saves the inputs, sorts them and THEN concatenates them...)
group by date_time - gives you a set of rows you can now PARTITION BY v1, v2, v..... on
Easier Analytics to the rescue (3)
Michel Cadot, March 14, 2007 - 3:03 am UTC
You mean grouping, ranking, degrouping; something like:
SQL> with
2 data as (
3 select date_time, validity,
4 row_number() over (partition by date_time order by validity) rn
5 from test
6 ),
7 sets as (
8 select date_time,
9 max(decode(rn,1,validity)) v1,
10 max(decode(rn,2,validity)) v2,
11 max(decode(rn,3,validity)) v3,
12 max(decode(rn,4,validity)) v4,
13 max(decode(rn,5,validity)) v5,
14 max(decode(rn,6,validity)) v6,
15 max(decode(rn,7,validity)) v7,
16 max(decode(rn,8,validity)) v8,
17 max(decode(rn,9,validity)) v9
18 from data
19 group by date_time
20 ),
21 groups as (
22 select date_time, v1, v2, v3, v4, v5, v6, v7, v8, v9,
23 dense_rank() over (order by v1, v2, v3, v4, v5, v6, v7, v8, v9) grp
24 from sets
25 ),
26 lines as ( select rownum line from dual connect by level <= 9 )
27 select date_time,
28 decode(line,1,v1,2,v2,3,v3,4,v4,5,v5,6,v6,7,v7,8,v8,9,v9) validity,
29 grp "GROUP"
30 from groups, lines
31 where decode(line,1,v1,2,v2,3,v3,4,v4,5,v5,6,v6,7,v7,8,v8,9,v9)
32 is not null
33 order by 1, 2
34 /
DATE_TIME VALIDITY GROUP
---------- ---------- ----------
01/08/2007 27334 1
01/08/2007 27335 1
01/08/2007 27336 1
01/09/2007 27334 2
01/09/2007 27335 2
01/09/2007 27340 2
01/10/2007 27334 1
01/10/2007 27335 1
01/10/2007 27336 1
01/11/2007 27334 3
01/11/2007 27335 3
I noticed that you seldom use the "with" clause and prefer inline views. Is this just your way of writing SQL or is there an underlying reason?
Regards
Michel
March 14, 2007 - 7:52 am UTC
i would not bother ungrouping again, once the data is inline like that (v1....vn) it is better than good enough :)
it is my way of writing SQL. I typically start with simple query "Q" - wrap it in parentheses, put the next layer on it.
rather than go "down" the page - my queries tend to explode up and down the page - add a bit on top, add a bit on the bottom...
Analytics to the rescue
Livio, March 14, 2007 - 7:59 am UTC
Hi folks,
thank you all for your help.
Eventually, I could figure out a solution based on the fact that a set of integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ...+ 2^sn.
Here's my solution:
select date_time, validity, dns_rnk, sum(power(2, dns_rnk)) over (partition by date_time) set_id from (
select date_time, validity, dense_rank() over (order by validity) - 1 dns_rnk
from test
)
order by date_time;
DATE_TIME VALIDITY DNS_RNK SET_ID
--------- ---------- ---------- ----------
08-GEN-07 27334 0 7
08-GEN-07 27335 1 7
08-GEN-07 27336 2 7
09-GEN-07 27334 0 11
09-GEN-07 27335 1 11
09-GEN-07 27340 3 11
10-GEN-07 27334 0 7
10-GEN-07 27335 1 7
10-GEN-07 27336 2 7
11-GEN-07 27334 0 3
11-GEN-07 27335 1 3
I use dense_rank() in the inner query to "normalize" validities and avoid a numeric overflow error while computing the sum of the powers of 2.
Yet, should I have a large number of validities for each date, the value of set_id can become very large (something like 1,563E+112 in my original table) making two different sets look "the same". Anyway, I think this is just a matter of formatting the output.
I will compare the proposed solutions with the one I devised
and apply the best that suits my case.
Thanks again
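To make the precision concern above concrete: Oracle NUMBER carries roughly 38 significant digits, so once the ranks grow large the low-order powers of two can be rounded away. A hypothetical illustration (whether the two values actually collide depends on NUMBER rounding in your version):

```sql
-- set {130} versus set {0, 130}: the + power(2,0) term is far below
-- NUMBER's ~38 significant digits of precision at this magnitude
select power(2,130)              set_a,
       power(2,130) + power(2,0) set_b
from dual;
-- if the addition is rounded, set_a = set_b even though the sets differ
```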
Analytics to the rescue
Michel Cadot, March 14, 2007 - 8:56 am UTC
Hi Livio,
Nicely thought.
Regards
Michel
To Livio ... constraints matter
Gabe, March 14, 2007 - 11:35 pm UTC
<quote>a set of integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ...+ 2^sn</quote>
Your table definition does not have a unique constraint; the potential need to "distinct" has been mentioned and you have not clarified if duplicates can exist (and how it would impact the definition of "set equality").
If they do, your approach will give you false grouping since, for any n, Power(2,n) + power(2,n) = power(2,n+1) ... and it is just one of the combinations that would give you false positives.
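A minimal illustration of the collision Gabe describes, as pure arithmetic against dual:

```sql
-- two copies of rank 3 encode to the same value as a single rank 4
select power(2,3) + power(2,3) two_copies_of_3,
       power(2,4)              one_copy_of_4
from dual;
-- both columns show 16, so the multiset {3,3} and the set {4}
-- would wrongly get the same set_id
```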
Re: To Livio ... constraints matter
Livio, March 15, 2007 - 8:07 am UTC
Hi Gabe;
thanks for your remark.
You're right: I did not clarify whether duplicate values were possible or not. Actually, I don't have to cope with duplicates. My observation should have been written more precisely as
<quote>a set of distinct integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ...+
2^sn</quote>
In fact, columns date_time and validity make up the PK of table test.
Regards
Livio
Analytics to the rescue... continue
Michel Cadot, March 15, 2007 - 8:46 am UTC
Livio,
You can change your big numbers into smaller ones by applying the dense_rank function to the set_id:
SQL> with
2 data as (
3 select date_time, validity,
4 dense_rank() over (order by validity) - 1 val_rk
5 from test
6 ),
7 groups as (
8 select date_time, validity,
9 sum(power(2,val_rk)) over (partition by date_time) set_id
10 from data
11 )
12 select date_time, validity,
13 dense_rank () over (order by set_id) - 1 "GROUP"
14 from groups
15 order by 1, 2
16 /
DATE_TIME VALIDITY GROUP
---------- ---------- ----------
01/08/2007 27334 1
01/08/2007 27335 1
01/08/2007 27336 1
01/09/2007 27334 2
01/09/2007 27335 2
01/09/2007 27340 2
01/10/2007 27334 1
01/10/2007 27335 1
01/10/2007 27336 1
01/11/2007 27334 0
01/11/2007 27335 0
Regards
Michel
Re: Analytics to the rescue... continue
Livio, March 16, 2007 - 5:33 am UTC
Hi Michel;
you're right. One more analytic makes the output more intelligible.
Thanks for your suggestion
Best Regards
Livio
find the ranges
vincenzo, March 23, 2007 - 1:18 pm UTC
Hi Tom,
I have a big table with about 500 million rows:
drop table big_table;
create table big_table
(serial_number number,
product_code char(10),
constraint pk_bt primary key (serial_number)
);
insert into big_table values (1,'product a');
insert into big_table values (2,'product a');
insert into big_table values (3,'product a');
insert into big_table values (6,'product a');
insert into big_table values (7,'product a');
insert into big_table values (10,'product b');
insert into big_table values (11,'product b');
insert into big_table values (12,'product b');
insert into big_table values (13,'product a');
SQL> select * from big_table;
SERIAL_NUMBER PRODUCT_CO
------------- ----------
1 product a
2 product a
3 product a
6 product a
7 product a
10 product b
11 product b
12 product b
13 product a
I need to aggregate the serial_number by ranges and product_code.
The output should look like this:
FIRST_SERIAL LAST_SERIAL PRODUCT_CODE
------------- ------------- ----------
1 3 product a
6 7 product a
10 12 product b
13 13 product a
How can this be done using analytic functions?
Thanks
March 26, 2007 - 7:04 am UTC
ops$tkyte%ORA10GR2> select product_code, min(serial_number), max(serial_number)
2 from (
3 select product_code, serial_number,
4 max(grp) over (partition by product_code order by serial_number) maxgrp
5 from (
6 select product_code, serial_number,
7 case when lag(serial_number)
8 over (partition by product_code
9 order by serial_number) <> serial_number-1
10 or
11 row_number()
12 over (partition by product_code
13 order by serial_number) =1
14 then row_number()
15 over (partition by product_code
16 order by serial_number)
17 end grp
18 from big_table
19 )
20 )
21 group by product_code, maxgrp
22 order by product_code, maxgrp
23 /
PRODUCT_CO MIN(SERIAL_NUMBER) MAX(SERIAL_NUMBER)
---------- ------------------ ------------------
product a 1 3
product a 6 7
product a 13 13
product b 10 12
4 rows selected.
To vincenzo
Michel Cadot, March 24, 2007 - 5:50 pm UTC
With the assumption that your serial_number is strictly positive (otherwise change the -1 in the nvl function):
SQL> with
2 data as (
3 select serial_number, product_code,
4 case
5 when nvl(lag(serial_number)
6 over (partition by product_code order by serial_number),-1)
7 != serial_number-1
8 then serial_number
9 end flag
10 from big_table
11 ),
12 grouping as (
13 select serial_number, product_code,
14 max(flag) over (partition by product_code order by serial_number) grp
15 from data
16 )
17 select min(serial_number) first_serial,
18 max(serial_number) last_serial,
19 max(product_code) product_code
20 from grouping
21 group by grp
22 order by 1
23 /
FIRST_SERIAL LAST_SERIAL PRODUCT_CO
------------ ----------- ----------
1 3 product a
6 7 product a
10 12 product b
13 13 product a
4 rows selected.
Regards
Michel
Dates with overlaps
Sudha Bhagavatula, March 26, 2007 - 9:28 am UTC
I have a question regarding lag and lead.
create table t_mbr_enrol ( subr_id varchar2(15),
dep_nbr varchar2(2),
eff_date date,
term_date date);
insert into t_mbr_enrol values('1001','0', TO_DATE('19990101','YYYYMMDD'), TO_DATE('19991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20020101','YYYYMMDD'), TO_DATE('20021231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20030101','YYYYMMDD'), TO_DATE('20031231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20040101','YYYYMMDD'), TO_DATE('20041231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20050101','YYYYMMDD'), TO_DATE('20051231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20060101','YYYYMMDD'), TO_DATE('20061231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20040701','YYYYMMDD'), TO_DATE('20041231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20050101','YYYYMMDD'), TO_DATE('20051231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20060101','YYYYMMDD'), TO_DATE('20061231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','2', TO_DATE('20050801','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','3', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
create table t_mdcr_elig ( subr_id varchar2(15),
dep_nbr varchar2(2),
eff_date date);
insert into t_mdcr_elig values('1001','0',TO_DATE('20040301','YYYYMMDD'));
insert into t_mdcr_elig values('1001','1',TO_DATE('20060801','YYYYMMDD'));
select * from t_mbr_enrol;
SUBR_ID DEP_NBR EFF_DATE TERM_DATE
--------------- ------- ----------- -----------
1001 0 01/01/1999 12/31/1999
1001 0 01/01/2002 12/31/2002
1001 0 01/01/2003 12/31/2003
1001 0 01/01/2004 12/31/2004
1001 0 01/01/2005 12/31/2005
1001 0 01/01/2006 12/31/2006
1001 0 01/01/2007 12/31/9999
1001 1 07/01/2004 12/31/2004
1001 1 01/01/2005 12/31/2005
1001 1 01/01/2006 12/31/2006
1001 1 01/01/2007 12/31/9999
1001 2 08/01/2005 12/31/9999
1001 3 01/01/2007 12/31/9999
13 rows selected
select * from t_mdcr_elig;
SUBR_ID DEP_NBR EFF_DATE
--------------- ------- -----------
1001 0 03/01/2004
1001 1 08/01/2006
My final output should be like this:
SUBR_ID DEP_NBR EFF_DATE TERM_DATE
--------------- ------- ----------- -----------
1001 0 01/01/1999 12/31/1999
1001 0 01/01/2002 12/31/2002
1001 0 01/01/2003 12/31/2003
1001 0 01/01/2004 02/29/2004
1001 0 03/01/2004 12/31/2004
1001 0 01/01/2005 12/31/2005
1001 0 01/01/2006 07/31/2006
1001 0 08/01/2006 12/31/2006
1001 0 01/01/2007 12/31/9999
1001 1 07/01/2004 12/31/2004
1001 1 01/01/2005 12/31/2005
1001 1 01/01/2006 07/31/2006
1001 1 08/01/2006 12/31/2006
1001 1 01/01/2007 12/31/9999
1001 2 08/01/2005 07/31/2006
1001 2 08/01/2006 12/31/9999
1001 3 01/01/2007 12/31/9999
Is there a way of doing this in a single sql query with analytics ?
Thanks a lot
Sudha
March 26, 2007 - 11:12 am UTC
how's about you explain the logic behind your output - so we can understand what we are looking at instead of having to try to understand what you might have been thinking...
Analytics
Sudha Bhagavatula, March 26, 2007 - 12:44 pm UTC
The table t_mbr_enrol contains data for a member's enrollment into a medical plan.
The subscriber id is common to all members of the same contract. A contract can have only one primary subscriber, which is identified by dep_nbr (dependent number) = 0. The rest are dependents of this primary subscriber. This table contains no overlaps and the date boundaries for each subr_id, dep_nbr are set.
The second table t_mdcr_elig contains the date when each member became eligible for medicare, either when they turn 65 years of age or due to some medical condition.
When this happens, the dates in enrollment need to be sliced further to reflect medicare eligibility.
So in the example given earlier, the final output is like this:
SUBR_ID DEP_NBR EFF_DATE TERM_DATE
--------------- ------- ----------- -----------
1001 0 01/01/1999 12/31/1999 -- no change
1001 0 01/01/2002 12/31/2002 -- no change
1001 0 01/01/2003 12/31/2003 -- no change
1001 0 01/01/2004 02/29/2004 -- last date before medicare for dep nbr = 0
1001 0 03/01/2004 12/31/2004 -- start of medicare eligibility for subscriber
1001 0 01/01/2005 12/31/2005 -- no change
1001 0 01/01/2006 07/31/2006 -- last date before medicare eligibility of a dependent (dep_nbr = 1)
1001 0 08/01/2006 12/31/2006 -- medicare eligibility starts for dep nbr = 1
1001 0 01/01/2007 12/31/9999 -- no change
1001 1 07/01/2004 12/31/2004 -- no change
1001 1 01/01/2005 12/31/2005 -- no change
1001 1 01/01/2006 07/31/2006 -- last date before medicare for dep nbr = 1
1001 1 08/01/2006 12/31/2006 -- medicare eligibility for dep nbr = 1
1001 1 01/01/2007 12/31/9999 -- no change
1001 2 08/01/2005 07/31/2006 -- dates split for medicare eligibility for dep nbr = 1
1001 2 08/01/2006 12/31/9999 -- dates split for medicare eligibility for dep nbr = 1
1001 3 01/01/2007 12/31/9999 -- no change since this member became active after the splits
I need to split dates for medicare eligibility.
Thanks
Sudha
March 26, 2007 - 2:11 pm UTC
that does not really explain the logic at all.
MODEL clause ?
Gary, March 27, 2007 - 3:24 am UTC
The logic appears to be...
DECLARE
CURSOR c_1 is
select subr_id, dep_nbr, eff_date, term_date
from t_mbr_enrol
order by 1,2,3;
cursor c_2
(p_subr in varchar2, p_start in date, p_end in date) is
select eff_date - 1 eff_date
from t_mdcr_elig
where subr_id = p_subr
and eff_date between p_start and p_end
order by eff_date;
v_out_line varchar2(100);
BEGIN
FOR c_r1 in c_1 LOOP
FOR c_r2 in c_2
(c_r1.subr_id, c_r1.eff_date, c_r1.term_date) LOOP
dbms_output.put_line(c_r1.subr_id||':'||c_r1.dep_nbr||
' From '||c_r1.eff_date||' to '||c_r2.eff_date);
c_r1.eff_date := c_r2.eff_date+1;
END LOOP;
dbms_output.put_line(c_r1.subr_id||':'||c_r1.dep_nbr||
' From '||c_r1.eff_date||' to '||c_r1.term_date);
END LOOP;
END;
/
The following SQL is close, but it needs to generate an extra row whenever the split date is present. If they are on 10G, the MODEL clause would be an option
select t1.subr_id, t1.dep_nbr,
t1.eff_date, t1.term_date, t2.eff_date split_date
from t_mbr_enrol t1 left outer join t_mdcr_elig t2
on (t1.subr_id = t2.subr_id
and t2.eff_date between t1.eff_date and t1.term_date)
order by t1.subr_id, t1.dep_nbr, t1.eff_date, t2.eff_date;
how to get this select
A reader, March 27, 2007 - 5:02 am UTC
Thanks Tom for this web site,
I have the following question
create table t (name varchar2(50), ident varchar2(13))
insert into t values ('Jean','12456');
insert into t values ('Jean','12457');
insert into t values ('Jean','12458');
insert into t values ('Paul','67850');
insert into t values ('Paul','67851');
insert into t values ('Remy','4879');
commit;
I would like to issue a select that will give me:
Jean 12456 -- only once for Jean whatever the ident
Paul 67850 -- only once for Paul whatever the ident
Remy 4879 -- only once for Remy whatever the ident
Thanks a lot
March 27, 2007 - 9:35 am UTC
select name, min(ident) from t group by name;
I have found
A reader, March 27, 2007 - 5:47 am UTC
Tom,
I have found the solution to get my desired select
select * from (
select name, ident, row_number() over (partition by name order by ident) seq
from t
)
where seq = 1
Thanks
March 27, 2007 - 9:35 am UTC
use the infinitely more straightforward query right above
this is an inappropriate use of analytics.
A reader, March 27, 2007 - 10:58 am UTC
and it is more performant.
Thanks very much Tom
Thanks Gary
Sudha Bhagavatula, March 28, 2007 - 9:50 am UTC
Thanks Gary for the input. I managed to get an extra row by doing a union instead of an outer join.
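Sudha's finished union-based query is not shown; a hedged sketch of the idea, assuming (as in the sample data) that each enrollment row overlaps at most one medicare eligibility date: the first branch produces the portion of each row before the split (or the unchanged row when no eligibility date falls inside it), and the second branch produces the portion from the eligibility date onward.

```sql
-- hedged sketch: assumes at most one eligibility date per enrollment row
select t1.subr_id, t1.dep_nbr, t1.eff_date,
       nvl(min(t2.eff_date) - 1, max(t1.term_date)) term_date
from t_mbr_enrol t1 left outer join t_mdcr_elig t2
  on (t1.subr_id = t2.subr_id
      and t2.eff_date between t1.eff_date and t1.term_date)
group by t1.subr_id, t1.dep_nbr, t1.eff_date
union all
select t1.subr_id, t1.dep_nbr, t2.eff_date, t1.term_date
from t_mbr_enrol t1 join t_mdcr_elig t2
  on (t1.subr_id = t2.subr_id
      and t2.eff_date between t1.eff_date and t1.term_date)
order by 1, 2, 3;
```

On the sample data this yields the 17 rows of the expected output; if a single enrollment row could overlap several eligibility dates, the second branch would need further slicing.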
pagination
shay, March 29, 2007 - 3:44 am UTC
hi tom,
further to pagination result set i have another question:
i have this statement
select *
from (
select
id,type_id,
row_number() over
(order by sortby) rn
from wscportal_mpower.wsc where father_id=1382)
where rn between 1 and 16
order by rn
/
and it returns :
ID TYPE_ID RN
---------- ---------- ----------
1087809 0 1
29359 0 2
113 0 3
1064393 0 4
447340 0 5
462072 0 6
1031179 0 7
458014 0 8
1014710 1 9
464113 0 10
1014711 1 11
1066580 1 12
1032922 0 13
827134 0 14
838371 0 15
864671 0 16
16 rows selected.
i would also like to look at the type_id: if i find 2 rows with type_id=1 i would like to stop the result set there and get the answer - in other words, to stop at row 11 (inclusive)?!
is it possible?
thanks a lot
March 30, 2007 - 12:02 pm UTC
no table create
no insert intos
no look
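Since no DDL was posted, the schema here is only a guess, but shay's "stop after the second type_id = 1 row" requirement can be sketched with a running count (all names are taken from shay's own query; the cutoff logic is an assumption about what "stop at row 11 inclusive" means):

```sql
-- hedged sketch: keep rows up to and including the 2nd type_id = 1 row
select id, type_id, rn
from (
  select id, type_id,
         row_number() over (order by sortby) rn,
         count(case when type_id = 1 then 1 end)
             over (order by sortby) ones_so_far   -- running count of type_id=1
  from wscportal_mpower.wsc
  where father_id = 1382
)
where rn between 1 and 16
  and (ones_so_far < 2 or (ones_so_far = 2 and type_id = 1))
order by rn;
```

Against the posted output this would return rows 1 through 11 and discard the rest of the page.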
Help with queries
Jayadevan, March 30, 2007 - 6:30 am UTC
Tom,
Could you please help with the following issues?
1)
There is a table x with 2 columns
SQL> desc x
Name Null? Type
----------------------------------------- -------- ----------------------------
CUS_ID NUMBER
CLMAMT NUMBER
SQL> select * from x;
CUS_ID CLMAMT
---------- ----------
1 13008
3 10500
2 9306
6 4500
4 3002
5 3000
7 3000
7 rows selected.
We have to classify the customers into A, B and C depending on whether the running total of their claims falls within the first 50%, the next 40% (up to 90%), or the remaining 10% of the total. If we do it manually in Excel, this is what we get:
CusID Claim amt Running total Running total as % total ABC class Description
1 13008 13008 28.08532688 A Top customers adding to 50% of total claims
3 10500 23508 50.75567838 A Top customers adding to 50% of total claims
2 9306 32814 70.84808705 B Top customers adding to 90% of total claims
6 4500 37314 80.56395198 B Top customers adding to 90% of total claims
4 3002 40316 87.04551343 B Top customers adding to 90% of total claims
5 3000 43316 93.52275671 C Rest
7 3000 46316 100 C Rest
How can we get this using a query?
There is something wrong with the way I am using a simple sum and over: it is clubbing together rows that have the same claim amount and adding them all in at the first instance.
SELECT cus_id, clmamt,
SUM(clmamt)
OVER (ORDER BY clmamt desc ) x
FROM x
1 13008 13008
3 10500 23508
2 9306 32814
6 4500 37314
4 3002 40316
5 3000 46316
7 3000 46316
2) I have a table with 3 columns
tariff, start_date, end_date
There is one record with values
1000 01-Jan-2006 31-Jan-2006
I need to insert into another table records such as
1000 01-Jan-2006
1000 02-Jan-2006
1000 03-Jan-2006
.....
to
1000 31-Jan-2006
i.e. for each date starting with the start date up to the end date in this table,
insert a record in the target table. Can I do this with one query?
March 30, 2007 - 1:41 pm UTC
1) order by something "distinct" - else - neither of the claimamts "come first", they arrive "together"
order by clmamt desc, rowid
2) yes...
ops$tkyte%ORA10GR2> select * from t;
X Y Z
---------- --------- ---------
1000 01-JAN-07 10-JAN-07
2000 15-JAN-07 22-JAN-07
ops$tkyte%ORA10GR2> with data
2 as
3 (select level l from dual connect by level <= (select max(z-y+1) from t)
4 )
5 select t.*,
6 t.y+data.l-1
7 from data, t
8 where data.l <= t.z-t.y+1
9 order by x, t.y+data.l-1
10 /
X Y Z T.Y+DATA.
---------- --------- --------- ---------
1000 01-JAN-07 10-JAN-07 01-JAN-07
1000 01-JAN-07 10-JAN-07 02-JAN-07
1000 01-JAN-07 10-JAN-07 03-JAN-07
1000 01-JAN-07 10-JAN-07 04-JAN-07
1000 01-JAN-07 10-JAN-07 05-JAN-07
1000 01-JAN-07 10-JAN-07 06-JAN-07
1000 01-JAN-07 10-JAN-07 07-JAN-07
1000 01-JAN-07 10-JAN-07 08-JAN-07
1000 01-JAN-07 10-JAN-07 09-JAN-07
1000 01-JAN-07 10-JAN-07 10-JAN-07
2000 15-JAN-07 22-JAN-07 15-JAN-07
2000 15-JAN-07 22-JAN-07 16-JAN-07
2000 15-JAN-07 22-JAN-07 17-JAN-07
2000 15-JAN-07 22-JAN-07 18-JAN-07
2000 15-JAN-07 22-JAN-07 19-JAN-07
2000 15-JAN-07 22-JAN-07 20-JAN-07
2000 15-JAN-07 22-JAN-07 21-JAN-07
2000 15-JAN-07 22-JAN-07 22-JAN-07
18 rows selected.
Great
Jayadevan, March 31, 2007 - 3:27 am UTC
Tom,
Thanks a lot for the replies. I am able to do the ABC classification by adding a case statement. I have learned more about Oracle from asktom than from books or training sessions.
Well, some Indians say - 'Cricket is our religion and Tendulkar is our God'. With India crashing out of the World Cup, maybe Indian Oracle developers can switch to 'Oracle is our religion and Tom is our God' :).
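Jayadevan's finished classification query is not shown; one hedged sketch of how the CASE bucketing might look, combining Tom's rowid tie-breaker with two running percentages. The exact boundary rules at 50% and 90% are an assumption read off the Excel sample (a customer stays in A while the running total before it is still under 50%, in B while the running total including it is at most 90%):

```sql
-- hedged sketch only: boundary handling is inferred from the sample output
select cus_id, clmamt,
       case
         when nvl(prev_pct, 0) < 50 then 'A'  -- still reaching the first 50%
         when cur_pct <= 90         then 'B'  -- filling up to 90%
         else                            'C'  -- the rest
       end abc_class
from (
  select cus_id, clmamt,
         100 * sum(clmamt) over (order by clmamt desc, rowid)
             / sum(clmamt) over () cur_pct,
         100 * sum(clmamt) over (order by clmamt desc, rowid
                  rows between unbounded preceding and 1 preceding)
             / sum(clmamt) over () prev_pct
  from x
)
order by clmamt desc;
```

On the seven sample rows this reproduces the A, A, B, B, B, C, C classification from the spreadsheet.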
about running %
wawan, April 02, 2007 - 4:46 am UTC
dear Tom,
create table SCORING
(Name varchar2(5),
Amount number(10,2));
insert into SCORING values('AA',200);
insert into SCORING values('BB',400);
insert into SCORING values('CC',-200);
insert into SCORING values('DD',-300);
insert into SCORING values('EE',400);
insert into SCORING values('FF',500);
commit;
compute sum of Amount on report
break on report
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING;
NAME AMOUNT PRCTG
----- ---------- ----------
AA 200 .2
BB 400 .4
CC -200 -.2
DD -300 -.3
EE 400 .4
FF 500 .5
----------
sum 1000
how do I make the result look like this, with the % calculated from the sum of the positive amounts only?
(1500 ---> 200/1500, 400/1500, 500/1500)
NAME AMOUNT PRCTG
----- ---------- ---------- ----- -------
AA 200 .2 200 .1333
BB 400 .4 400 .2666
CC -200 -.2
DD -300 -.3
EE 400 .4 400 .2666
FF 500 .5 500 .3333
---------- -----
sum 1000 1500
I added the two last result columns (of course) by typing.
regards
April 03, 2007 - 8:35 pm UTC
ops$tkyte%ORA10GR2> select name,Amount ,
2 ratio_to_report(case when amount>0 then amount end) over() prctg
3 from SCORING;
NAME AMOUNT PRCTG
------------------------------ ---------- ----------
AA 200 .133333333
BB 400 .266666667
CC -200
DD -300
EE 400 .266666667
FF 500 .333333333
6 rows selected.
To wawan
Michel Cadot, April 02, 2007 - 10:59 am UTC
SQL> select name, amount,
2 ratio_to_report(case when amount >= 0 then amount end) over() prctg
3 from SCORING;
NAME AMOUNT PRCTG
----- ---------- ----------
AA 200 .133333333
BB 400 .266666667
CC -200
DD -300
EE 400 .266666667
FF 500 .333333333
6 rows selected.
Regards
Michel
Thanks
wawan, April 02, 2007 - 9:06 pm UTC
ok. thanks for your query.
To wawan: to get your result
Michel CADOT, April 04, 2007 - 4:30 am UTC
SQL> select case when grouping(name) = 1 then 'Sum' else name end name,
2 sum(amount1) amount1, prctg1, sum(amount2) amount2, prctg2
3 from (
4 select name, amount amount1,
5 ratio_to_report(amount) over() prctg1,
6 case when amount >= 0 then amount end amount2,
7 ratio_to_report(case when amount >= 0 then amount end) over() prctg2
8 from SCORING
9 )
10 group by grouping sets ((),(name,prctg1,prctg2))
11 order by 1
12 /
NAME AMOUNT1 PRCTG1 AMOUNT2 PRCTG2
----- ---------- ---------- ---------- ----------
AA 200 .2 200 .133333333
BB 400 .4 400 .266666667
CC -200 -.2
DD -300 -.3
EE 400 .4 400 .266666667
FF 500 .5 500 .333333333
Sum 1000 1500
7 rows selected.
Regards
Michel
Stumped on Analytics!!
Saurabh, April 05, 2007 - 6:43 am UTC
Hi Tom,
This is in response to the review "Stumped on Analytics" posted by "Dave Thompson" on March 04, 2004.
I know it's too late (ages late) to reply, but the case presented was interesting. Besides, I had a similar (more complex) situation in one of my projects. Unfortunately, I had to do it on Teradata, and procedural code was not allowed as a standard (by the DBAs). So I gave analytics a shot. The objective was finally achieved and the performance was worth the effort spent, but the resulting SQL was too complex to understand and maintain.
For the presented case, I think the SQL below will do.
The cumulative sums (cusums) are used to achieve the looping effect. Analytics rock!! A lot of procedural code can be eliminated using analytics.
select p.pay_id, p.payment, q.prem_payment, q.prem_id,
p.cusum, q.cusum,
-- p.prev_cusum, q.prev_cusum,
case when p.cusum <= q.cusum then p.payment else p.cusum - q.cusum end amt_applied
from
(
select
pay_id,
payment,
sum(payment) over(order by pay_id rows between unbounded preceding and current row) cusum,
sum(payment) over(order by pay_id rows between unbounded preceding and 1 preceding ) prev_cusum
from pay_m pay
) p ,
(
select prem_id, prem_payment,
sum(prem_payment) over(order by prem_id rows between unbounded preceding and current row) cusum,
sum(prem_payment) over(order by prem_id rows between unbounded preceding and 1 preceding ) prev_cusum
from prem ) q
where nvl(p.prev_cusum,0) < q.cusum
and p.cusum >= nvl(q.prev_cusum,0)
PAY_ID PAYMENT PREM_PAYMENT PREM_ID
---------- ---------- ------------ ----------
1 50 100 1
2 25 100 1
3 50 100 1
3 50 50 2
4 50 50 2
4 50 50 3
If a grouped output is required, a user-defined aggregate function, such as your STRAGG, can be used.
regards
saurabh
looking at two different prior rows at same time
Kevin Meade, April 09, 2007 - 1:56 pm UTC
--
-- this continues to be cool stuff but I have a variation on analytics I have not seen yet
-- I have read through the analytic post for hours now
--
-- guys in the office had a problem and wanted an all analytic solution
-- essentially they want to look at two different prior rows at the same time
-- (at least I think that is the best way to describe it)
--
--
drop table temp1
/
create table temp1
(
claim_id number
,transtype varchar2(3)
,dollartype varchar2(3)
,transdate date
,amount number
)
/
insert into temp1 values (1,'TEK','USD',to_date('01','dd'),1100);
insert into temp1 values (1,'REV','USD',to_date('02','dd'),1200);
insert into temp1 values (1,'TEK','USD',to_date('04','dd'),1400);
insert into temp1 values (1,'TEK','USD',to_date('05','dd'),1500);
insert into temp1 values (1,'REV','USD',to_date('06','dd'),1600);
insert into temp1 values (1,'REV','USD',to_date('07','dd'),1700);
insert into temp1 values (1,'TEK','USD',to_date('08','dd'),1800);
insert into temp1 values (1,'TEK','USD',to_date('09','dd'),1900);
insert into temp1 values (1,'TEK','USD',to_date('10','dd'),1950);
insert into temp1 values (1,'REV','USD',to_date('11','dd'),2000);
commit
/
select * from temp1 order by transdate
/
/*
CLAIM_ID TRA DOL TRANSDATE AMOUNT
---------- --- --- --------- ----------
1 TEK USD 01-APR-07 1100
1 REV USD 02-APR-07 1200
1 TEK USD 04-APR-07 1400
1 TEK USD 05-APR-07 1500
1 REV USD 06-APR-07 1600
1 REV USD 07-APR-07 1700
1 TEK USD 08-APR-07 1800
1 TEK USD 09-APR-07 1900
1 TEK USD 10-APR-07 1950
1 REV USD 11-APR-07 2000
10 rows selected.
*/
--
-- by claim, for each row, get the amount from the prior REV row
-- get the amount from the prior TEK row
--
-- using scalars it looks like this
--
select a.*
,(
select amount
from temp1 b
where b.claim_id = a.claim_id
and b.transtype = 'TEK'
and b.transdate =
(
select max(c.transdate)
from temp1 c
where c.claim_id = a.claim_id
and c.transdate < a.transdate
and c.transtype = 'TEK'
)
) prior_TEK
,(
select amount
from temp1 b
where b.claim_id = a.claim_id
and b.transtype = 'REV'
and b.transdate =
(
select max(c.transdate)
from temp1 c
where c.claim_id = a.claim_id
and c.transdate < a.transdate
and c.transtype = 'REV'
)
) prior_REV
from temp1 a
order by transdate
/
/*
CLAIM_ID TRA DOL TRANSDATE AMOUNT PRIOR_TEK PRIOR_REV
---------- --- --- --------- ---------- ---------- ----------
1 TEK USD 01-APR-07 1100
1 REV USD 02-APR-07 1200 1100
1 TEK USD 04-APR-07 1400 1100 1200
1 TEK USD 05-APR-07 1500 1400 1200
1 REV USD 06-APR-07 1600 1500 1200
1 REV USD 07-APR-07 1700 1500 1600
1 TEK USD 08-APR-07 1800 1500 1700
1 TEK USD 09-APR-07 1900 1800 1700
1 TEK USD 10-APR-07 1950 1900 1700
1 REV USD 11-APR-07 2000 1950 1700
10 rows selected.
*/
--
-- but they wanted a one pass analytic solution
-- I racked my brain for a couple of days and couldn't figure out how to do this with analytics
-- in one pass. What I need is something like this: (FICTITIOUS WHERE CLAUSE IN THE ANALYTIC)
-- maybe I am just being dense
--
/*
select a.*
,lag(amount) over (partition by claim_id order by transdate WHERE TRANSTYPE = 'REV')
,lag(amount) over (partition by claim_id order by transdate WHERE TRANSTYPE = 'TEK')
from temp1 a
order by transdate
/
*/
--
-- But of course this WHERE clause is not part of the analytic syntax so...
-- I gave them the scalars and said if they wanted a one pass analytic solution then
-- they would need someone smarter than me cause I couldn't figure it out
-- to which they said "what about asking TOM :)", so here I am
--
-- Kevin from CT USA
April 09, 2007 - 2:05 pm UTC
first thing that popped into my head was....
ops$tkyte%ORA10GR2> select claim_id, transtype, dollartype, transdate, amount,
2 lag(last_tek) over (order by transdate) real_last_tek,
3 lag(last_rev) over (order by transdate) real_last_rev
4 from (
5 select claim_id, transtype, dollartype, transdate, amount,
6 last_value(case when transtype='TEK' then amount end ignore nulls) over (order by transdate) last_tek,
7 last_value(case when transtype='REV' then amount end ignore nulls) over (order by transdate) last_rev
8 from temp1
9 )
10 order by transdate
11 /
CLAIM_ID TRA DOL TRANSDATE AMOUNT REAL_LAST_TEK REAL_LAST_REV
---------- --- --- --------- ---------- ------------- -------------
1 TEK USD 01-APR-07 1100
1 REV USD 02-APR-07 1200 1100
1 TEK USD 04-APR-07 1400 1100 1200
1 TEK USD 05-APR-07 1500 1400 1200
1 REV USD 06-APR-07 1600 1500 1200
1 REV USD 07-APR-07 1700 1500 1600
1 TEK USD 08-APR-07 1800 1500 1700
1 TEK USD 09-APR-07 1900 1800 1700
1 TEK USD 10-APR-07 1950 1900 1700
1 REV USD 11-APR-07 2000 1950 1700
10 rows selected.
carry down on 9i
Scott, April 09, 2007 - 11:41 pm UTC
G'day Tom,
I've been trying to translate your "carry down" example to my seemingly simple problem, but I cannot seem to get it right.
Given
create table swtest (a number, b varchar2(2));
insert into swtest values (3,'51');
insert into swtest values (4,null);
insert into swtest values (6,'E9');
9i> select * from swtest order by a
A B
---------- --
3 51
4
6 E9
I want a column C that would return the most recent value of B, eg:
A B MI
---------- -- --
3 51 51
4 51
6 E9 E9
Something like this sounds kinda right in theory, but doesn't return 51 for the 2nd row.
1 select a, b,last_value(b) over (order by a rows between unbounded preceding and current row )
2 from swtest
3* order by a
9i> /
A B LA
---------- -- --
3 51 51
4
6 E9 E9
My colleague successfully used
last_value(b IGNORE NULLS) over (order by a range between unbounded preceding and current row )
But I'm on 9i.
Nulls first or any other "order by" fiddling doesn't have an effect on the windowing function.
Any help?
April 10, 2007 - 10:28 am UTC
ops$tkyte%ORA9IR2> select a, b,
2 substr(max( case when b is not null then to_char(a,'fm0000000000' ) || b end ) over ( order by a ),11) maxb
3 from swtest
4 /
A B MAX
---------- -- ---
3 51 51
4 51
6 E9 E9
Saurabh, April 10, 2007 - 12:39 am UTC
select a, b,
max(decode(a,pm, b,null)) over(order by a rows between unbounded preceding and current row)
from (
select a, b,
max(decode(b,null,null,a)) over(order by a rows between unbounded preceding and current row) pm
from swtest
) x
order by a
A B MA
---------- -- --
3 51 51
4 51
5 51
6 E9 E9
close...
Scott, April 10, 2007 - 1:26 am UTC
Ahh, but if you add
insert into swtest values (7,'60');
insert into swtest values (8,NULL);
It's erroneous for my problem. That solution relies on the ordering of the character set.
I have managed this:
select a, first_value(b) over (partition by r) b
from (
select a, b, max(r) over (order by a) r
from (
select a, b, case when b is not null then row_number() over (order by a) end r
from swtest
)
)
order by a;
But I was hoping for something with less inline views, as of course it's ultimately part of a larger query.
Most elegant... that's going in the 9i notebook.
Scott, April 10, 2007 - 7:56 pm UTC
thanks muchly again
Kevin, April 11, 2007 - 4:42 pm UTC
thanks very much for your timely response. I'll look it over in depth.
I am amazed every time I ask you something, how easily you seem to grasp my problem, and how hard it is for me to grasp your solution. hmm.. all of a sudden I am thinking Jethro Tull: you are the wise man and I feel "thick as a brick".
Kevin
analytic function in union query
wawan, April 12, 2007 - 12:49 am UTC
dear Tom , dear All
as a subject ,
create table SCORING1
(Name varchar2(5),
Amount number(10,2));
create table SCORING2
(Name varchar2(5),
Amount number(10,2));
insert into SCORING1 values('AA',200);
insert into SCORING1 values('BB',400);
insert into SCORING1 values('CC',-200);
insert into SCORING2 values('DD',-300);
insert into SCORING2 values('EE',400);
insert into SCORING2 values('FF',500);
commit;
compute sum of Amount on report
break on report
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING1
/
NAME AMOUNT PRCTG
----- ---------- ----------
AA 200 .5
BB 400 1
CC -200 -.5
----------
sum 400
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING2
/
NAME AMOUNT PRCTG
----- ---------- ----------
DD -300 -.5
EE 400 .666666667
FF 500 .833333333
----------
sum 600
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING1
union
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING2
/
NAME AMOUNT PRCTG
----- ---------- ----------
AA 200 .5
BB 400 1
CC -200 -.5
DD -300 -.5
EE 400 .666666667
FF 500 .833333333
----------
sum 1000
what is the correct query so the % is
calculated from 1000 for each rows ?
April 13, 2007 - 11:41 am UTC
ops$tkyte%ORA9IR2> compute sum of amount on report
ops$tkyte%ORA9IR2> break on report
ops$tkyte%ORA9IR2> select name, amount, ratio_to_report(amount) over () prctg
2 from (select * from scoring1 UNION ALL select * from scoring2)
3 /
NAME AMOUNT PRCTG
------------------------------ ---------- ----------
AA 200 .2
BB 400 .4
CC -200 -.2
DD -300 -.3
EE 400 .4
FF 500 .5
----------
sum 1000
6 rows selected.
User defined analytic function
Miki, April 16, 2007 - 3:55 pm UTC
Dear Tom,
I have a difficult query to identify rows that are members of a continuous constant run (for example, 0 traffic for 7 days).
I solved this problem using a couple of analytic functions and subqueries. The idea was to separate the non-0 and the 0 rows and assign a virtual partition number to each row... But I found this method too difficult to understand and it generated a lot of code.
I rethought my solution and tried to write a user defined analytic function.
CREATE TABLE "T"
( "TRAFFIC_DATE" DATE,
"MC_ID" NUMBER(*,0),
"TRAFFIC" NUMBER(*,0)
);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('04-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('05-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('06-04-2007', 'dd-mm-yyyy'), 1, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('07-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('08-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('09-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('10-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('11-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('12-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('13-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('14-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('15-04-2007', 'dd-mm-yyyy'), 1, 3);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('04-04-2007', 'dd-mm-yyyy'), 2, 2);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('05-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('06-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('07-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('08-04-2007', 'dd-mm-yyyy'), 2, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('09-04-2007', 'dd-mm-yyyy'), 2, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('10-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('11-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('12-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('13-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('14-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('15-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('16-04-2007', 'dd-mm-yyyy'), 2, 2);
commit;
My desired output:
SELECT a.*
,f_normal_day_fg(a.traffic) over(PARTITION BY a.mc_id ORDER BY a.traffic_date ROWS BETWEEN 6 preceding AND 6 following)
FROM t a;
TRAFFIC_DATE MC_ID TRAFFIC F_NORMAL_DAY_FG(A.TRAFFIC)OVER
2007. 04. 04. 1 0 1
2007. 04. 05. 1 0 1
2007. 04. 06. 1 100 1
2007. 04. 07. 1 0 0
2007. 04. 08. 1 0 0
2007. 04. 09. 1 0 0
2007. 04. 10. 1 0 0
2007. 04. 11. 1 0 0
2007. 04. 12. 1 0 0
2007. 04. 13. 1 0 0
2007. 04. 14. 1 0 0
2007. 04. 15. 1 3 1
2007. 04. 04. 2 2 1
2007. 04. 05. 2 0 1
2007. 04. 06. 2 0 1
2007. 04. 07. 2 0 1
2007. 04. 08. 2 100 1
2007. 04. 09. 2 100 1
2007. 04. 10. 2 0 1
2007. 04. 11. 2 0 1
2007. 04. 12. 2 0 1
2007. 04. 13. 2 0 1
2007. 04. 14. 2 0 1
2007. 04. 15. 2 0 1
2007. 04. 16. 2 2 1
My function:
create or replace type F_Normal_Day as object
(
NemNullaForgDB NUMBER,
cnt NUMBER,
static function
ODCIAggregateInitialize(sctx IN OUT F_Normal_Day )
return number,
member function
ODCIAggregateIterate(self IN OUT F_Normal_Day ,
value IN NUMBER )
return number,
member function
ODCIAggregateTerminate(self IN F_Normal_Day,
returnValue OUT NUMBER,
flags IN NUMBER)
return NUMBER,
member function
ODCIAggregateMerge(self IN OUT F_Normal_Day,
ctx2 IN F_Normal_Day)
return NUMBER
)
;
create or replace type body F_Normal_Day
is
static function ODCIAggregateInitialize(sctx IN OUT F_Normal_Day)
return number
is
begin
sctx := F_Normal_Day( 0,0 );
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(self IN OUT F_Normal_Day,
value IN NUMBER )
return number
is
begin
IF VALUE=0 THEN
self.NemNullaForgDB:=self.NemNullaForgDB+1;
ELSE
self.NemNullaForgDB:=0;
END IF;
self.cnt:=greatest(SELF.cnt,SELF.NemNullaForgDB);
RETURN ODCIConst.Success;
end;
member function ODCIAggregateTerminate(self IN F_Normal_Day,
returnValue OUT NUMBER,
flags IN NUMBER)
return number
is
BEGIN
IF SELF.cnt>=7 THEN
returnValue := 0;
ELSE
returnValue := 1;
END IF;
-- returnValue:=SELF.cnt;
-- returnValue:=self.NemNullaForgDB;
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(self IN OUT F_Normal_Day,
ctx2 IN F_Normal_Day)
return number
is
BEGIN
return ODCIConst.Success;
end;
end;
CREATE OR REPLACE FUNCTION f_normal_day_fg (input NUMBER) RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING f_normal_day;
My question is: how can I (if it is possible) pass the number 7 as a parameter to the F_Normal_Day object? I mean, I would like an analytic function like lag() that takes 2 parameters. I cannot add another parameter to f_normal_day_fg, because it raises an error...
Thanks in advance,
Miki
April 17, 2007 - 9:39 am UTC
aggregate functions - which you are building - take precisely one argument.
You could do something like:
SELECT a.*
,f_normal_day_fg(a.traffic || ',7' ) over(PARTITION BY a.m
perhaps and substr it out - or use an application context and call dbms_session.set_context to the value prior to calling the function from SQL.
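A sketch of the concatenation idea against the thread's T table. The names here are hypothetical, and the aggregate's declared input type would have to change from NUMBER to VARCHAR2 so the packed string can be split apart again inside ODCIAggregateIterate:

```sql
-- Hypothetical sketch: pack the threshold into the single allowed argument.
SELECT a.*,
       f_normal_day_fg( a.traffic || ',' || 7 )
         over (PARTITION BY a.mc_id
               ORDER BY a.traffic_date
               ROWS BETWEEN 6 preceding AND 6 following) normal_day_fg
  FROM t a;

-- ...and inside ODCIAggregateIterate, unpack the two pieces again:
--   l_traffic   := to_number( substr( value, 1, instr(value, ',') - 1 ) );
--   l_threshold := to_number( substr( value, instr(value, ',') + 1 ) );
```

The application-context alternative avoids changing the input type: call dbms_session.set_context once before running the query, and read the threshold back with sys_context inside the type body.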
calculate % of sum
wawan, April 17, 2007 - 6:14 am UTC
drop table scoring1;
drop table scoring2;
create table SCORING1
(Name varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING2
(Name varchar2(5),
Location number(2),
Amount number(10,2));
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,-400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-200);
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,-400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-200);
commit;
col prctg1 format 99.99
col prctg2 format 99.99
compute sum of AAA on name
compute sum of BBB on name
compute sum of prctg1 on name
compute sum of prctg2 on name
break on name skip 1
select A.name,A.location,sum(A.amount) AAA,
case when sum(A.amount)>0 then sum(A.amount) end BBB,
ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
over(partition by A.name) prctg2
from
(
select name,location,amount
from SCORING1
union all
select name,location,amount
from SCORING2
) A
group by A.name, A.location
/
NAME LOCATION AAA BBB PRCTG2
----- ---------- ---------- ---------- ------
AA 1 400 400 .40
2 600 600 .60
3 -800
***** ---------- ---------- ------
sum 200 1000 1.00
CC 1 400 400 .29
2 -400
3 1000 1000 .71
***** ---------- ---------- ------
sum 1000 1400 1.00
how add a new column , which calculate
PRCTG2 * sum of each name : .40* 200 and .60*200 for AA
and .29*1000 and .71*1000 for CC ?
like this :
NAME LOCATION AAA BBB PRCTG2 NEW_COLUMN
----- ---------- ---------- ---------- ------ ----------
AA 1 400 400 .40 80
2 600 600 .60 120
3 -800
***** ---------- ---------- ------ ----------
sum 200 1000 1.00 200
CC 1 400 400 .29 290
2 -400
3 1000 1000 .71 710
***** ---------- ---------- ------ ----------
sum 1000 1400 1.00 1000
it's too complicated for me :)
thanks before
April 17, 2007 - 10:13 am UTC
(please discover the CODE button!!! very hard to read)
...
PRCTG2 * sum of each name : .40* 200 and .60*200 for AA
......
where does 200 come from?
calculate query question
wawan, April 17, 2007 - 9:04 pm UTC
col prctg1 format 99.99
col prctg2 format 99.99
compute sum of AAA on name
compute sum of BBB on name
compute sum of prctg1 on name
compute sum of prctg2 on name
break on name skip 1
select A.name,A.location,sum(A.amount) AAA,
case when sum(A.amount)>0 then sum(A.amount) end BBB,
ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
over(partition by A.name) prctg2
from
(
select name,location,amount
from SCORING1
union all
select name,location,amount
from SCORING2
) A
group by A.name, A.location
/
NAME LOCATION AAA BBB PRCTG2
----- ---------- ---------- ---------- ------
AA 1 400 400 .40
2 600 600 .60
3 -800
***** ---------- ---------- ------
sum 200 1000 1.00
CC 1 400 400 .29
2 -400
3 1000 1000 .71
***** ---------- ---------- ------
sum 1000 1400 1.00
how add a new column , which calculate
PRCTG2 * sum of each name : .40* 200 and .60*200 for AA
and .29*1000 and .71*1000 for CC ?
like this :
NAME LOCATION AAA BBB PRCTG2 NEW_COLUMN
----- ---------- ---------- ---------- ------ ----------
AA 1 400 400 .40 80
2 600 600 .60 120
3 -800
***** ---------- ---------- ------ ----------
sum 200 1000 1.00 200
CC 1 400 400 .29 290
2 -400
3 1000 1000 .71 710
***** ---------- ---------- ------ ----------
sum 1000 1400 1.00 1000
it's too complicated for me :)
thanks before
the 200 came from sum of amount of each name
sum AA is 200
sum CC is 1000
April 18, 2007 - 11:56 am UTC
ops$tkyte%ORA10GR2> select name,
2 location,
3 aaa,
4 bbb,
5 prctg2,
6 prctg2*sum(aaa) over (partition by name) sum_aaa
7 from (
8 select A.name,
9 A.location,
10 sum(A.amount) AAA,
11 case when sum(A.amount)>0 then sum(A.amount) end BBB,
12 ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
13 over(partition by A.name) prctg2
14 from
15 (
16 select name,location,amount
17 from SCORING1
18 union all
19 select name,location,amount
20 from SCORING2
21 ) A
22 group by A.name, A.location
23 )
24 /
NAME LOCATION AAA BBB PRCTG2 SUM_AAA
------------------------------ ---------- ---------- ---------- ---------- ----------
AA 1 400 400 .4 80
AA 2 600 600 .6 120
AA 3 -800
CC 1 400 400 .285714286 285.714286
CC 2 -400
CC 3 1000 1000 .714285714 714.285714
6 rows selected.
asking query
wawan, April 23, 2007 - 2:18 am UTC
Tom,
drop table scoring1;
drop table scoring2;
drop table scoring3;
create table SCORING1
(Name varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING2
(Name varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING3
(Name varchar2(5),
Location number(2),
Amount number(10,2));
insert into SCORING1 values('AA',1,-500);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-600);
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,500);
insert into SCORING1 values('AA',3,400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,400);
insert into SCORING1 values('CC',2,-200);
insert into SCORING3 values('AA',4,500);
insert into SCORING3 values('AA',5,500);
insert into SCORING3 values('CC',6,600);
insert into SCORING3 values('CC',7,300);
insert into SCORING3 values('CC',8,300);
commit;
col prctg1 format 99.99
col prctg2 format 99.99
compute sum of AAA on name
compute sum of BBB on name
compute sum of prctg1 on name
compute sum of prctg2 on name
break on name skip 1
select name,
location,
aaa,
bbb,
prctg2,
prctg2*sum(aaa) over (partition by name) sum_aaaXprctg2
from (
select A.name,
A.location,
sum(A.amount) AAA,
case when sum(A.amount)>0 then sum(A.amount) end BBB,
ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
over(partition by A.name) prctg2
from
(
select name,location,amount
from SCORING1
union all
select name,location,amount
from SCORING2
) A
group by A.name, A.location
)
/
select name,location,ratio_to_report(amount)
over(partition by name) prctg3 from scoring3
/
Commit complete.
NAME LOCATION AAA BBB PRCTG2 SUM_AAAXPRCTG2
----- ---------- ---------- ---------- ------ --------------
AA 1 -300
2 800 800 .50 650
3 800 800 .50 650
***** ---------- ---------- ------
sum 1300 1600 1.00
CC 1 600 600 .38 300
2 -800
3 1000 1000 .63 500
***** ---------- ---------- ------
sum 800 1600 1.00
6 rows selected.
NAME LOCATION PRCTG3
----- ---------- ----------
AA 4 .5
5 .5
CC 6 .5
7 .25
8 .25
how to make a query with result like this ?
NAME LOCATION AAA BBB PRCTG2 SUM_AAA new_col prctg3
----- -------- ------ ------- ------ ------- ------- ------
AA 1 -300
2 800 800 .50 650
3 800 800 .50 650
4 325 .5
5 325 .5
***** ------ ------- ------ ------- ------
sum 1300 1600 1.00
CC 1 600 600 .38 300
2 -800
3 1000 1000 .63 500
6 250 .5
7 125 .25
8 125 .25
***** ------ ------- ------
sum 800 1600 1.00
the new column is computed as the amount in location 3 * prctg3,
for locations 4, 5 and so on.
regards
Analytics
Vamsi Krishna, April 27, 2007 - 2:31 am UTC
Tom you are the BEST.
I am exploring the world of Oracle with your invaluable support.
Thanks TOM Keep going........
Using SQL to convert nulls to the value
ht, May 01, 2007 - 12:37 am UTC
Hello Tom,
I've searched your site and am not sure if this is the correct place to ask this question so my apologies up front. I've been stuck on this concept for some time. All I want to do is to use sql (as opposed to pl/sql) to display (and not store) a row's previous value if it is null.
For instance, in the snippet below, I provide a script that adds the emp.week column. I then populate the column using pl/sql. Then, I query the table for empno=7902.
The resulting output is close to what I want but instead of:
40 7902 8092
41
42 7902 8094
43
44 7902 8096
45
46 7902 8098
I would like:
40 7902 8092
41 7902 8092
42 7902 8094
43 7902 8094
44 7902 8096
45 7902 8096
46 7902 8098
The idea is I would like to save "space" in my database by not storing rows that, when checked, were the previous value (which could also be < the previous value). But I would like to provide a report like the one above to compare empno commissions. The actual output is just a sample, I'm really storing the growth of extents over time in multiple databases.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Connected.
desc emp
Name Null? Type
----------------------------------------- -------- ----------------------------
EMPNO NOT NULL NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
alter table emp disable constraint pk_emp;
Table altered.
alter table emp add week number;
Table altered.
declare
2 x number;
3 y number;
4 z number;
5 begin
6 for i in 1 .. 52 loop
7 for j in (select distinct empno from emp) loop
8 -- every other week for some empnos
9 y:=mod(j.empno,2);
10 x:=mod(i,2);
11 if y=0 then
12 if x=0 then
13 insert into emp (empno,comm,week)values(j.empno,j.empno+150+i,i);
14 end if;
15 end if;
16 -- every week for some empnos
17 z:=mod(j.empno,2);
18 if z!=0 then
19 insert into emp (empno,comm,week)values(j.empno,j.empno-175-i,i);
20 end if;
21 end loop;
22 end loop;
23 end;
24 /
PL/SQL procedure successfully completed.
--select week,empno,comm from emp where week is not null order by week,empno;
/*
select x.week,x.empno,sum(comm) commission4week from emp x where week is not null
group by x.week,x.empno
order by x.week,x.empno;
*/
select distinct e.week,xyz.empno,xyz.commission4week from emp e ,
2 (select x.week,x.empno,sum(comm) commission4week from emp x where week is not null
3 and x.empno=7902
4 group by x.week,x.empno) xyz
5 where e.week=xyz.week(+)
6 order by week,empno;
WEEK EMPNO COMMISSION4WEEK
---------- ---------- ---------------
1
2 7902 8054
3
4 7902 8056
5
6 7902 8058
7
8 7902 8060
9
10 7902 8062
11
12 7902 8064
13
14 7902 8066
15
16 7902 8068
17
18 7902 8070
19
20 7902 8072
21
22 7902 8074
23
24 7902 8076
25
26 7902 8078
27
28 7902 8080
29
30 7902 8082
31
32 7902 8084
33
34 7902 8086
35
36 7902 8088
37
38 7902 8090
39
40 7902 8092
41
42 7902 8094
43
44 7902 8096
45
46 7902 8098
47
48 7902 8100
49
50 7902 8102
51
52 7902 8104
53 rows selected.
desc emp
Name Null? Type
----------------------------------------- -------- ----------------------------
EMPNO NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
WEEK NUMBER
delete emp where week is not null;
468 rows deleted.
alter table emp drop column week;
Table altered.
desc emp
Name Null? Type
----------------------------------------- -------- ----------------------------
EMPNO NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)
alter table emp enable constraint pk_emp;
Table altered.
exit;
May 01, 2007 - 9:52 am UTC
last_value analytic function with the new in 10g option of "ignore nulls" is what you want:
scott%ORA10GR2> select empno, comm, last_value( comm ignore nulls ) over (order by empno) carry_down
2 from emp
3 order by empno;
EMPNO COMM CARRY_DOWN
---------- ---------- ----------
7369
7499 300 300
7521 500 500
7566 500
7654 1400 1400
7698 1400
7782 1400
7788 1400
7839 1400
7844 0 0
7876 0
7900 0
7902 0
7934 0
14 rows selected.
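For the week-based report in the question, the same idea would presumably be partitioned per employee and ordered by week. A sketch against the poster's modified EMP table (with the week and comm columns as populated above):

```sql
-- Sketch only: carry each empno's last known commission down the weeks.
select week,
       empno,
       comm,
       last_value( comm ignore nulls )
         over (partition by empno order by week) carry_down
  from emp
 where week is not null
 order by week, empno;
```

To also produce output rows for the weeks that have no stored row at all (41, 43, ...), the query would still need to be joined to a row source listing every week, as in the original attempt.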
Works great!
ht, May 01, 2007 - 1:42 pm UTC
Tom,
It's amazing that we can be stuck on a problem for a few days, "give up", submit it to you, go to bed, and wake up with a solution that not only avoids pl/sql but is quicker than the flawed sql I had settled on.
This solution will save me gigabytes of storage since I don't have to store a value if it hasn't changed since the last check. Also, it saves storage because I don't have to create a table that would have periodically stored my summarized data (using slower pl/sql) so queries would run acceptably.
Thank you.
reader
A reader, May 08, 2007 - 4:12 pm UTC
Using an analytic function, is it possible to match two adjacent rows and print just the columns and values that have changed during an update?
For example, if I use CDC (Change Data Capture), it prints out the before and after image of a row that has been updated. Ex
UU 123 456 789 101 102
UN 123 555 789 101 102
In the above example UU is the before image and UN is the after image. The row has four columns.
This row has been updated, but only one column value has changed, i.e. column2 changed from 456 to 555.
Is there a way to scan the subscriber view to identify the column and value that have changed, and print them?
May 11, 2007 - 8:36 am UTC
do you have
a) a 'primary key'
b) a flag that says "this is before, this is after"
if so,
select pk,
decode( old_c1, new_c1, cast( null as <datatype> ), new_c1 ) c1,
....
from (
select pk,
max( decode( flag, 'before', c1 ) ) old_c1,
max( decode( flag, 'after', c1 ) ) new_c1,
....
from t
group by pk
)
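Spelling that sketch out for the example in the question - a hypothetical table CHG holding both images, with a flag column of 'before'/'after'; the table and column names are made up for illustration:

```sql
-- Hypothetical worked example of the pivot-and-compare technique.
-- Note: decode() treats two NULLs as equal, which here conveniently
-- reports "no change" when a column is null in both images.
select pk,
       decode( old_c2, new_c2, to_number(null), new_c2 ) c2,
       decode( old_c3, new_c3, to_number(null), new_c3 ) c3
  from (
        select pk,
               max( decode( flag, 'before', c2 ) ) old_c2,
               max( decode( flag, 'after',  c2 ) ) new_c2,
               max( decode( flag, 'before', c3 ) ) old_c3,
               max( decode( flag, 'after',  c3 ) ) new_c3
          from chg
         group by pk
       );
```

For the sample row (pk 123), only the c2 column would come back non-null, showing the new value 555.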
Avoiding a full table scan when using outer joins?
ht, May 10, 2007 - 5:45 am UTC
Tom,
Is it possible to avoid the full table scan occurring below? After implementing your carry_over technique above, it takes over 5 hours to run this query for a specific instance using my "production data".
I believe the predicate causing the issue is:
WHERE e.batch_id = xyz.batch_id(+)
Is there a way to get the output desired and prevent a full table scan?
Thanks again.
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 10 02:35:41 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Connected.
SELECT DISTINCT
2 e.batch_id,
3 sum(xyz.total)size_mb,
4 last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
5 FROM
6 testeextent e,
7 (
8 SELECT
9 x.batch_id,
10 SUM(x.size_mb) total
11 FROM
12 testeextent x
13 WHERE
14 x.instance_id =7383
15 GROUP BY
16 x.batch_id
17 )xyz
18 WHERE e.batch_id = xyz.batch_id(+)
19 group by e.batch_id
20 ORDER BY e.batch_id;
BATCH_ID SIZE_MB CARRY_OVER
---------- ---------- ----------
1 4 4
2 12 12
3 12
4 12
5 40 40
6 60 60
7 60
8 68 68
8 rows selected.
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 137322112
--------------------------------------------------------------------------------
----------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
--------------------------------------------------------------------------------
----------
| 0 | SELECT STATEMENT | | 5 | 85 | 17 (12)|
00:00:01 |
| 1 | WINDOW BUFFER | | 5 | 85 | 17 (12)|
00:00:01 |
| 2 | SORT GROUP BY | | 5 | 85 | 17 (12)|
00:00:01 |
|* 3 | HASH JOIN OUTER | | 8 | 136 | 16 (7)|
00:00:01 |
| 4 | TABLE ACCESS FULL | TESTEEXTENT | 8 | 16 | 14 (0)|
00:00:01 |
| 5 | VIEW | | 5 | 75 | 1 (0)|
00:00:01 |
| 6 | HASH GROUP BY | | 5 | 35 | 1 (0)|
00:00:01 |
|* 7 | INDEX RANGE SCAN| TESTEEXTENT_IDX1 | 5 | 35 | 1 (0)|
00:00:01 |
--------------------------------------------------------------------------------
----------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
7 - access("X"."INSTANCE_ID"=7383)
filter("X"."INSTANCE_ID"=7383)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
23 consistent gets
0 physical reads
0 redo size
783 bytes sent via SQL*Net to client
469 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
8 rows processed
May 11, 2007 - 10:38 am UTC
umm, think about this please..
how would anything short of a full scan be even remotely efficient here?
5 FROM
6 testeextent e,
7 (
8 SELECT
9 x.batch_id,
10 SUM(x.size_mb) total
11 FROM
12 testeextent x
13 WHERE
14 x.instance_id =7383
15 GROUP BY
16 x.batch_id
17 )xyz
18 WHERE e.batch_id = xyz.batch_id(+)
19 group by e.batch_id
20 ORDER BY e.batch_id;
you are getting - by definition - everything from testeextent - E is retrieved in FULL
tell me, what do you think should happen in this query?
Analytic question
A reader, May 11, 2007 - 9:49 am UTC
Hi Tom,
Thanks for the book expert one on one I've bought. I read carrefuly Chapter 12 : Analytic Functions and may be I've found an error in the explanation (page 557)
----------------------------------
Here, if we look at CLARK again, since we understand........ These are the values of CLARK's salary and the rows preceeding (isn't following instead?)
-----------------
Anyway, I have a question
CREATE TABLE T1
(
TYP_TAR VARCHAR2(3),
FAG_SLC VARCHAR2(1),
TYP_CAT VARCHAR2(2),
PCT_FEE NUMBER(6,4),
PCT_RED NUMBER(6,4),
AMT_MAX_CND_FEE NUMBER(38,17),
AMT_MIN_CND_FEE NUMBER(38,17),
DAT_START_CND DATE,
DAT_END_CND DATE,
SLC_LWR NUMBER(38,17),
SLC_UPR NUMBER(38,17),
TYP_XTR_FAC VARCHAR2(3),
XTR_FAC_VAL VARCHAR2(250),
FAG_STD VARCHAR2(1),
FAG_DONE VARCHAR2(1),
PRC_CTT_BKA_VAL VARCHAR2(1),
AMT_MAX_TAR_FEE NUMBER(38,17),
FAG_APE VARCHAR2(1),
IDE NUMBER(38),
TPER_TYP_PDI VARCHAR2(2)
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('02', 'N', '01', 1, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
999999, 25000, '02', 'Br/PB', 'Y', 'N',
NULL, 999999, 'Y', 1, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('02', 'N', '01', 14.5, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
999999, 25000, '02', 'Br/PB', 'Y', 'N',
NULL, 999999, 'Y', 1, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('02', 'N', '01', 1, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
25000, 999999, '02', 'Br/PB', 'Y', 'N',
NULL, 999999, 'Y', 2, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('02', 'N', '02', 0.5, NULL, 999999,
30,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
25000, 999999, '02', 'HB', 'Y', 'N',
NULL, 999999, 'Y', 8, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('01', 'N', '08', 0.0011, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
999999, 999999, '###', '######', 'Y', 'N',
NULL, 999999, 'Y', 20, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('01', 'N', '08', 0.0012, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
999999, 999999, '###', '######', 'Y', 'N',
NULL, 999999, 'Y', 20, '00'
);
INSERT INTO t1
(typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
amt_min_cnd_fee,
dat_start_cnd,
dat_end_cnd,
slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
)
VALUES ('01', 'N', '08', 0.0009, NULL, 999999,
40,
TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
999999, 999999, '###', '######', 'Y', 'Y',
NULL, 999999, 'Y', 20, '00'
);
I would like to select several rows plus the min(pct_fee) as in the below select
select ide
,prc_ctt_bka_val
,amt_min_cnd_fee
,fag_slc
,dat_start_cnd
,dat_end_cnd
,fag_ape
,typ_tar
,typ_cat
,slc_lwr
,slc_upr
,typ_xtr_fac
,xtr_fac_val
,min(pct_fee) over( partition by typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val) min_pct_fee
from t1
where fag_done = 'N'
My problem is that I am getting two rows for ide 1 and 20. I would like to get only one row when there exists more than one ide. Distinct is not what I want.
Is it possible to do this using analytics ?
Thanks
May 11, 2007 - 11:38 am UTC
i don't understand what you are looking for.
if you want one row by some key - you are looking for GROUP BY, not analytics.
Maybe a summary table can be used to prevent the full table scan?
ht, May 11, 2007 - 12:01 pm UTC
Tom,
Thanks for the info. I was hoping there was a way to avoid the full table scan the outer join seems to be causing. The query takes over 5 hours so I was hoping I could somehow tune the query.
I'm going to test retrieving distinct batch_ids into another table to see if that speeds up the query. That will probably not help though. Please let me know if you know of a more efficient method to retrieve this data.
Thanks again.
ht
May 11, 2007 - 12:07 pm UTC
no, look at your query
the outer join has NOTHING AT ALL to do with the full scan
you need each and every row from that table, all of them, each and every single one.
the outer join is NOT causing the full scan.
Your question mandates a full scan.
why do you think, how do you think a full scan can be avoided?
could a summary table that grouped up the table by batch_id be used? sure, but now you have to create it, maintain it.
are you sure you have the right QUERY here? do you understand what this query you are showing us does?
remove the outer join, just query testeextent itself once. Now, you have
select
from table
group by col
please, explain how you would "avoid a full scan"
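If the aggregation really must be precomputed, one option in the "summary table" direction (table and column names assumed from the thread) is a materialized view, which maintains the GROUP BY for you at the cost of extra overhead on every load:

```sql
-- Sketch, not a drop-in: REFRESH FAST ON COMMIT additionally requires
-- a materialized view log on testeextent covering the referenced columns.
create materialized view testeextent_by_batch
  build immediate
  refresh fast on commit
as
select instance_id,
       batch_id,
       sum(size_mb)   total_mb,
       count(size_mb) cnt_mb,   -- count of the summed column is
       count(*)       cnt       -- required for fast refresh of SUM
  from testeextent
 group by instance_id, batch_id;
```

Whether the maintenance cost is worth it depends on how often the base table is loaded versus how often the report runs.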
Avoid scanning 823k recs?
ht, May 11, 2007 - 2:21 pm UTC
Tom,
You are absolutely correct. I should not have focused on the full table scan but the actual query elapsed time. I guess my question should be "Is there any other way (besides inserting batch_ids into another table when a batch_id is created) to improve the query elapsed time below?".
select count(*) from testextent2;
COUNT(*)
----------
823205
1 row selected.
Elapsed: 00:00:00.66
select count(*) from testbatches2;
COUNT(*)
----------
51
1 row selected.
Elapsed: 00:00:00.01
set autotrace on;
SELECT DISTINCT
2 e.batch_id,
3 sum(xyz.total)size_mb,
4 last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
5 FROM
6 testbatches2 e,
7 (
8 SELECT
9 x.batch_id,
10 SUM(x.size_mb) total
11 FROM
12 testextent2 x
13 WHERE
14 x.instance_id =7383
15 GROUP BY
16 x.batch_id
17 )xyz
18 WHERE e.batch_id = xyz.batch_id(+)
19 group by e.batch_id
20 ORDER BY e.batch_id;
BATCH_ID SIZE_MB CARRY_OVER
---------- ---------- ----------
800 78545 78545
820 47232 47232
840 34587 34587
860 40894 40894
880 28310 28310
900 22759 22759
920 36445 36445
921 26179 26179
940 29027 29027
941 36768 36768
960 35733 35733
980 35377 35377
1000 25364 25364
1020 39610 39610
1040 31137 31137
1041 35676 35676
1060 34423 34423
1061 29731 29731
1080 26837 26837
1100 34240 34240
1120 40619 40619
1140 18641 18641
1160 45161 45161
1180 23378 23378
1200 17084 17084
1220 19251 19251
1240 25816 25816
1260 17200 17200
1280 20338 20338
1300 20338
1320 36871 36871
1340 33504 33504
1360 33514 33514
1380 47218 47218
1400 28894 28894
1420 34509 34509
1440 36182 36182
1460 30872 30872
1480 34677 34677
1500 30393 30393
1520 5841 5841
1540 10515 10515
1560 10644 10644
1561 2503 2503
1562 23232 23232
1580 25329 25329
1581 45908 45908
1600 36838 36838
1620 43275 43275
1640 42380 42380
1660 37154 37154
51 rows selected.
<b>Elapsed: 00:00:00.06</b>
Execution Plan
----------------------------------------------------------
Plan hash value: 575510713
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 51 | 1479 | 53 (4)| 00:00:01 |
| 1 | WINDOW BUFFER | | 51 | 1479 | 53 (4)| 00:00:01 |
| 2 | SORT GROUP BY | | 51 | 1479 | 53 (4)| 00:00:01 |
|* 3 | HASH JOIN OUTER | | 51 | 1479 | 52 (2)| 00:00:01 |
<b>| 4 | TABLE ACCESS FULL | TESTBATCHES2 | 51 | 663 | 3 (0)| 00:00:01 |</b>
| 5 | VIEW | | 51 | 816 | 48 (0)| 00:00:01 |
| 6 | HASH GROUP BY | | 51 | 408 | 48 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN| TESTEXTENT2_IDX1 | 5488 | 43904 | 48 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
7 - access("X"."INSTANCE_ID"=7383)
filter("X"."INSTANCE_ID"=7383)
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
28 recursive calls
0 db block gets
123 consistent gets
1 physical reads
0 redo size
2055 bytes sent via SQL*Net to client
502 bytes received via SQL*Net from client
5 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
51 rows processed
set autotrace off;
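A side note on the carry_over column in the output above: LAST_VALUE(... IGNORE NULLS) OVER (ORDER BY batch_id) carries the most recent non-null group total forward, which is why batch 1300 (which has no rows for instance 7383) still shows 20338. A minimal plain-Python sketch of that carry-forward, using a few of the values above as sample data (illustrative only, not Oracle's implementation):

```python
# Sketch of LAST_VALUE(expr IGNORE NULLS) OVER (ORDER BY batch_id):
# walk the rows in order and remember the last non-null value seen.
rows = [(1280, 20338), (1300, None), (1320, 36871)]  # (batch_id, size_mb)

carry = None
result = []
for batch_id, size_mb in rows:          # rows already ordered by batch_id
    if size_mb is not None:             # IGNORE NULLS: only non-nulls update
        carry = size_mb                 # ...the running "last value"
    result.append((batch_id, size_mb, carry))
# batch 1300 has a null size_mb but inherits 1280's total as carry_over
```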
--------------------------------------------------------------
--------------------------------------------------------------
-- test #1
--------------------------------------------------------------
select count(*) from testextent2;
COUNT(*)
----------
823205
1 row selected.
Elapsed: 00:00:00.53
set autotrace on;
SELECT DISTINCT
2 e.batch_id,
3 sum(xyz.total)size_mb,
4 last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
5 FROM
6 testextent2 e,
7 (
8 SELECT
9 x.batch_id,
10 SUM(x.size_mb) total
11 FROM
12 testextent2 x
13 WHERE
14 x.instance_id =7383
15 GROUP BY
16 x.batch_id
17 )xyz
18 WHERE e.batch_id = xyz.batch_id(+)
19 group by e.batch_id
20 ORDER BY e.batch_id;
BATCH_ID SIZE_MB CARRY_OVER
---------- ---------- ----------
800 6.2573E+10 6.2573E+10
820 52474752 52474752
840 40017159 40017159
860 13290550 13290550
880 7219050 7219050
900 4005584 4005584
920 7216110 7216110
921 3926850 3926850
940 9259613 9259613
941 8493408 8493408
960 8790318 8790318
980 35978409 35978409
1000 7380924 7380924
1020 9466790 9466790
1040 6351948 6351948
1041 7741692 7741692
1060 8261520 8261520
1061 10584236 10584236
1080 3918202 3918202
1100 25816960 25816960
1120 19294025 19294025
1140 4063738 4063738
1160 25606287 25606287
1180 27819820 27819820
1200 11958800 11958800
1220 17961183 17961183
1240 7512456 7512456
1260 8514000 8514000
1280 4738754 4738754
1300 4738754
1320 11872462 11872462
1340 8778048 8778048
1360 8646612 8646612
1380 69032716 69032716
1400 8148108 8148108
1420 7453944 7453944
1440 10963146 10963146
1460 75018960 75018960
1480 12206304 12206304
1500 6230565 6230565
1520 4485888 4485888
1540 10262640 10262640
1560 2182020 2182020
1561 1056266 1056266
1562 15542208 15542208
1580 46605360 46605360
1581 17307316 17307316
1600 16282396 16282396
1620 58377975 58377975
1640 14027780 14027780
1660 7802340 7802340
51 rows selected.
<b>Elapsed: 00:00:03.90</b>
Execution Plan
----------------------------------------------------------
Plan hash value: 698401733
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 51 | 969 | | 5471 (12)| 00:01:06 |
| 1 | WINDOW BUFFER | | 51 | 969 | | 5471 (12)| 00:01:06 |
| 2 | SORT GROUP BY NOSORT | | 51 | 969 | | 5471 (12)| 00:01:06 |
| 3 | MERGE JOIN OUTER | | 39M| 712M| | 5471 (12)| 00:01:06 |
| 4 | SORT JOIN | | 823K| 2411K| 18M| 4908 (3)| 00:00:59 |
<b>| 5 | TABLE ACCESS FULL | TESTEXTENT2 | 823K| 2411K| | 2628 (2)| 00:00:32 |</b>
|* 6 | SORT JOIN | | 51 | 816 | | 49 (3)| 00:00:01 |
| 7 | VIEW | | 51 | 816 | | 48 (0)| 00:00:01 |
| 8 | HASH GROUP BY | | 51 | 408 | | 48 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN| TESTEXTENT2_IDX1 | 5488 | 43904 | | 48 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
filter("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
9 - access("X"."INSTANCE_ID"=7383)
filter("X"."INSTANCE_ID"=7383)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
13427 consistent gets
0 physical reads
0 redo size
2162 bytes sent via SQL*Net to client
502 bytes received via SQL*Net from client
5 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
51 rows processed
set autotrace off;
--------------------------------------------------------------
exit;
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
May 11, 2007 - 3:41 pm UTC
what is unacceptable here?
Takes 5 hours.
ht, May 11, 2007 - 3:57 pm UTC
Tom,
The scan of 800k+ records in the second example causes the query against the superset of data to take 5 hours to return. I'm just wondering if there is a method to avoid the 800k+ record scan, or if I should just create a table that stores the batch_ids upon batch_id creation.
May 11, 2007 - 4:33 pm UTC
if it takes 5 hours to full scan 800k records, you have a serious problem.
How about a TKPROF of that there query with wait events enabled so we can see what it is doing, how it is doing it, how long it is waiting for what.
Analytics to generate multiple types of date dimensions
David, May 11, 2007 - 6:40 pm UTC
Tom,
I'm trying to write the query that joins
my CONFIG table with ALL_OBJECTS and
produces as many records per config item as
the NUM_RECORDS field indicates:
--Generate dynamic date dimension
create table CONFIG
(
NAME VARCHAR2(50),
NUM_RECORDS NUMBER
);
insert into CONFIG
values ('TEST_2_rec',2);
insert into CONFIG
values ('TEST_3_rec',3);
insert into CONFIG
values ('TEST_5_rec',5);
e.g. there should be 3 groups of records returned,
one group with 2 records, another with 3, and the last one with 5.
I only came up with lame sql such as this
one, which apparently doesn't work:
col name format a12
SELECT rownum as rnum,name,NUM_RECORDS
from ALL_OBJECTS JOIN CONFIG ON (1=1)
where rownum<= NUM_RECORDS
;
RNUM NAME NUM_RECORDS
---------- ------------ -----------
1 TEST_2_rec 2
2 TEST_2_rec 2
3 TEST_3_rec 3
4 TEST_5_rec 5
5 TEST_5_rec 5
I tried to produce this with analytics,
but the results are too scary to post :) The expected result is:
RNUM NAME NUM_RECORDS
---------- ------------ -----------
1 TEST_2_rec 2
2 TEST_2_rec 2
3 TEST_3_rec 3
4 TEST_3_rec 3
5 TEST_3_rec 3
6 TEST_5_rec 5
7 TEST_5_rec 5
8 TEST_5_rec 5
9 TEST_5_rec 5
10 TEST_5_rec 5
Any help or advice would be appreciated :)
Thank you for your time
David
May 14, 2007 - 12:42 pm UTC
To David
martina, May 12, 2007 - 1:35 am UTC
Hi, is that what you want?
INFO/INFO_PROD> r
1 select name,count(*)
2 from ( select c.*
3 from
4 mpa_config c
5 ,
6 (select level l from dual
7 connect by level < 10 -- Here you need the maximum num_records
8 ) d
9 where c.num_records >= d.l
10 )
11* group by name
NAME COUNT(*)
-------------------------------------------------- ----------
TEST_2_rec 2
TEST_3_rec 3
TEST_5_rec 5
INFO/INFO_PROD>
is that what you want?
martina
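Martina's CONNECT BY LEVEL trick above can be paraphrased in plain Python to show the row-multiplication idea (illustrative only; the data mirrors the CONFIG example, not anything Oracle does internally): cross-join each config row with a generated 1..max sequence, then keep only levels up to that row's num_records.

```python
# Plain-Python paraphrase of the "connect by level" row generator:
# each config row is paired with the numbers 1..max(num_records),
# and the filter keeps exactly num_records copies per row.
config = [("TEST_2_rec", 2), ("TEST_3_rec", 3), ("TEST_5_rec", 5)]

max_n = max(n for _, n in config)            # "connect by level <= max"
rows = [(name, lvl)
        for name, n in config
        for lvl in range(1, max_n + 1)
        if lvl <= n]                         # "where c.num_records >= d.l"

counts = {name: sum(1 for nm, _ in rows if nm == name) for name, _ in config}
# counts matches NUM_RECORDS per name: 2, 3 and 5 rows respectively
```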
A reader, May 12, 2007 - 9:27 am UTC
HI Tom,
Back to the question above
CREATE TABLE T1
(
TYP_TAR VARCHAR2(3),
FAG_SLC VARCHAR2(1),
TYP_CAT VARCHAR2(2),
PCT_FEE NUMBER(6,4),
PCT_RED NUMBER(6,4),
AMT_MAX_CND_FEE NUMBER(38,17),
AMT_MIN_CND_FEE NUMBER(38,17),
DAT_START_CND DATE,
DAT_END_CND DATE,
SLC_LWR NUMBER(38,17),
SLC_UPR NUMBER(38,17),
TYP_XTR_FAC VARCHAR2(3),
XTR_FAC_VAL VARCHAR2(250),
FAG_STD VARCHAR2(1),
FAG_DONE VARCHAR2(1),
PRC_CTT_BKA_VAL VARCHAR2(1),
AMT_MAX_TAR_FEE NUMBER(38,17),
FAG_APE VARCHAR2(1),
IDE NUMBER(38),
TPER_TYP_PDI VARCHAR2(2)
)
../..
One of my colleagues wrote the following very non-performant code:
cursor c1 is
select prc_ctt_bka_val
,amt_min_cnd_fee
,fag_slc
,dat_start_cnd
,dat_end_cnd
,pct_fee,fag_ape
,typ_tar
,typ_cat
,slc_lwr
,slc_upr
,typ_xtr_fac
,xtr_fac_val
from t1
where fag_done = 'N' ;
cursor c2 is
select prc_ctt_bka_val
,amt_min_cnd_fee
,fag_slc
,dat_start_cnd
,dat_end_cnd
,pct_fee,fag_ape
,typ_tar
,typ_cat
,slc_lwr
,slc_upr
,typ_xtr_fac
,xtr_fac_val
from t1
where typ_tar = p_typ_tar
and typ_cat = p_typ_cat
and slc_lwr = p_slc_lwr
and slc_upr = p_slc_upr
and typ_xtr_fac = p_typ_xtr_fac
and xtr_fac_val = p_xtr_fac_val
and fag_done = 'N'
and pct_fee = (select min(pct_fee) from t1
where typ_tar = p_typ_tar
and typ_cat = p_typ_cat
and slc_lwr = p_slc_lwr
and slc_upr = p_slc_upr
and typ_xtr_fac = p_typ_xtr_fac
and xtr_fac_val = p_xtr_fac_val
and fag_done = 'N' );
r2 c2%rowtype;
BEGIN
for r1 in c1 LOOP
open c2(r1.typ_tar, r1.typ_cat, r1.slc_lwr, r1.slc_upr, r1.typ_xtr_fac, r1.xtr_fac_val);
fetch c2 into r2;
if c2%notfound then
insert into table t2 values coming from cursor r1;
else
insert into table t2 values coming from cursor r2;
end if;
close c2;
END;
I would like to replace the above code by this one
BEGIN
insert into table t2
select prc_ctt_bka_val
,amt_min_cnd_fee
,fag_slc
,dat_start_cnd
,dat_end_cnd
,pct_fee,fag_ape
,typ_tar
,typ_cat
,slc_lwr
,slc_upr
,typ_xtr_fac
,xtr_fac_val
,min(pct_fee) over( PARTITION BY typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val) min_pct_fee
from t1
where fag_done = 'N' ;
END;
1. Knowing that table t1 has no unique constraint (PK or UK) implemented, how can I ensure that the above analytic select gives only one row (the one with the min(pct_fee)) when there are duplicates on (typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val)?
2. Could you please advise something else if you see that we are both wrong?
Thanks very much
May 14, 2007 - 12:59 pm UTC
if you want a single record by x,y,z,..... you use group by (aggregation), not analytics.
analytics specifically do not "squish out" records; you seem to want one observation per key - hence, you want aggregation.
something like:
ops$tkyte%ORA10GR2> select id,
2 min(fnm) KEEP (dense_rank first order by rowid) fnm,
3 min(lnm) KEEP (dense_rank first order by rowid) lnm
4 from test
5 group by id;
ID FNM LNM
---------- ------------------------------ ------------------------------
1 Tom Jones
2 Sue Snue
3 Robert Smith
but you group by your big list of columns, not just id
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:228182900346230020
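A rough Python analogue of the KEEP (DENSE_RANK FIRST ORDER BY rowid) aggregate Tom shows above (illustrative data shaped like his test table; strings stand in for rowids, this is not Oracle's implementation): for each group, take the non-key columns from the row that sorts first on the ordering key.

```python
# Per id, keep the fnm/lnm from the row with the smallest "rowid" --
# the effect of min(col) KEEP (dense_rank first order by rowid).
rows = [  # (id, fnm, lnm, rowid)
    (1, "Tom", "Jones", "AAA1"),
    (1, "Thomas", "J.", "AAA2"),
    (2, "Sue", "Snue", "AAA3"),
]

best = {}
for id_, fnm, lnm, rid in rows:
    if id_ not in best or rid < best[id_][0]:   # first row by rowid wins
        best[id_] = (rid, fnm, lnm)

picked = {k: v[1:] for k, v in best.items()}    # drop the rowid again
```

As Tom says, the real query would group by the big list of key columns, not just id; the mechanics are the same.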
martina, May 13, 2007 - 1:57 am UTC
hi,
i think the 2 cursors you show insert all rows of t1 into t2, fetching them in a complicated way. if you only wanted the c2 values, wouldn't c2 alone be sufficient?
This might be what you are looking for:
insert into t2
select .....
from
( select d.*
,row_number() over (partition by typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val
order by pct_fee) rd
from t1 d ) z
WHERE RD = 1
;
if it is not please clarify WHAT you want to achieve, not HOW you want to achieve something.
regards, martina
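The row_number() pattern in martina's insert can be sketched in plain Python (hypothetical sample data; not Oracle): within each partition key, keep only the row with the smallest pct_fee, i.e. the rd = 1 survivor.

```python
# Sketch of: row_number() over (partition by key order by pct_fee)
# ... where rd = 1 -- keep the cheapest row per partition.
rows = [  # (partition key, pct_fee, payload)
    (("A", "X"), 5.0, "row1"),
    (("A", "X"), 2.5, "row2"),
    (("B", "Y"), 7.0, "row3"),
]

winners = {}
for key, fee, payload in rows:
    if key not in winners or fee < winners[key][0]:
        winners[key] = (fee, payload)     # the rd = 1 row for this key
```

Note that on ties in pct_fee, row_number() picks one row arbitrarily; this sketch keeps the first one seen, which matches that non-determinism.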
Thank you
David, May 14, 2007 - 3:34 pm UTC
Thank you very much to Tom and Martina
:)
A reader, May 15, 2007 - 5:46 am UTC
Thanks Tom, Thanks Martina,
What I would like is
select a,
b,
c,
d,
x,
y,
z,
min(r)
from t
group by
x,
y,
z
This will not compile because I am also selecting a, b, c and d.
If there is a possibility to select all of what I want using group by (x,y,z) then I will use group by.
Since I have not succeeded in doing that, I am using an analytic function, which allows me to have a select like this:
select a,
b,
c,
d,
x,
y,
z,
min(r) over (partition by x,y,z)
from t
May 15, 2007 - 4:35 pm UTC
answer is already here:
ops$tkyte%ORA10GR2> select id,
2 min(fnm) KEEP (dense_rank first order by rowid) fnm,
3 min(lnm) KEEP (dense_rank first order by rowid) lnm
4 from test
5 group by id;
id = x,y,z
you'll min(a) keep (dense....) a
and just min(r)
Finding Previous Item within range
Jonty, May 15, 2007 - 5:51 pm UTC
Hi,
I want to find out the last Item No used in the previous year for each item - the catch is: if it is the same item as the current one, take whichever item came before it.
I have given code below, which creates the sample table TB_TRANS, insert statements to populate data, the TB_TRANS_OUT output table, and how the results should look.
Scenario #
1. for the Tran_no 1 record - the item is 10390 - the latest item used last year was also 10390, which is the same, so go beyond that and get the 2nd last, which is 17365.
2. for Tran_no 2 record - same as 1
3. for the Tran_no 3 record - the item is 17365 - the latest item used last year was 10390, so the previous item no will be 10390.
I hope this gives a good idea of what I am looking for.
I tried using analytic functions but somehow I am not getting the same results. I can get the last item in the previous year, but for scenario 1 I am not able to get it.
Any help will be highly appreciated.
CREATE TABLE TB_TRANS
(TRAN_DATE DATE,
TRAN_NO NUMBER,
CUST_NO NUMBER,
ITEM_NO NUMBER,
QTY NUMBER)
/
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '07/29/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 6, 101, 15448, 40);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '07/04/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 5, 101, 10390, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '07/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 4, 101, 17365, 20);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '07/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 3, 101, 17365, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '05/20/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 2, 101, 10390, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '03/04/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 1, 101, 10390, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '12/24/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 1000, 101, 10390, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '12/14/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 999, 101, 17365, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '12/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 998, 101, 10390, 30);
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES (
TO_Date( '11/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 997, 101, 15448, 30);
COMMIT;
SELECT * FROM TB_TRANS
ORDER BY tran_date, TRaN_NO
/
CREATE TABLE TB_TRANS_OUT
(TRAN_DATE DATE,
TRAN_NO NUMBER,
CUST_NO NUMBER,
ITEM_NO NUMBER,
QTY NUMBER,
PREV_ITEM_NO NUMBER
)
/
/* EXPECTED RESULT SET */
TRAN_DATE TRAN_NO CUST_NO ITEM_NO QTY
11/4/2005 997 101 15448 30
12/4/2005 998 101 10390 30
12/14/2005 999 101 17365 30
12/24/2005 1000 101 10390 30
3/4/2006 1 101 10390 30
5/20/2006 2 101 10390 30
7/1/2006 3 101 17365 30
7/1/2006 4 101 17365 20
7/4/2006 5 101 10390 30
7/29/2006 6 101 15448 40
SELECT * FROM TB_TRANS_OUT
ORDER BY tran_date, TRaN_NO
/
TRAN_DATE TRAN_NO CUST_NO ITEM_NO QTY PREV_ITEM_NO
11/4/2005 997 101 15448 30
12/4/2005 998 101 10390 30
12/14/2005 999 101 17365 30
12/24/2005 1000 101 10390 30
3/4/2006 1 101 10390 30 17365
5/20/2006 2 101 10390 30 17365
7/1/2006 3 101 17365 30 10390
7/1/2006 4 101 17365 20 10390
7/4/2006 5 101 10390 30 17365
7/29/2006 6 101 15448 40 10390
Thanks a lot in advance.
Jonty
May 16, 2007 - 10:18 am UTC
give us more details - will this always be for JUST two years?
are the two years INPUTS into the query itself?
analytics will probably not be used here.
To: Jonty
Narendra, May 16, 2007 - 6:08 am UTC
Following is something I could think of:
Assumptions:
1. Your year starts on 1st January and ends on 31st December.
2. CUST_NO remains same for all the rows. (If it changes the query will change a bit).
SQL> select a.tran_date, a.tran_no, a.cust_no, a.item_no, a.qty,
2 ( select b.item_no from ( select item_no, tran_date from tb_trans order by tran_date desc) b
3 where b.item_no <> a.item_no and b.tran_date < trunc(a.tran_date, 'YYYY')
4 and rownum = 1 ) prev_item_no
5 from tb_trans a
6 ORDER BY tran_date, TRaN_NO ;
TRAN_DATE TRAN_NO CUST_NO ITEM_NO QTY PREV_ITEM_NO
----------- ---------- ---------- ---------- ---------- ------------
04-Nov-05 997 101 15448 30
04-Dec-05 998 101 10390 30
14-Dec-05 999 101 17365 30
24-Dec-05 1000 101 10390 30
04-Mar-06 1 101 10390 30 17365
20-May-06 2 101 10390 30 17365
01-Jul-06 3 101 17365 30 10390
01-Jul-06 4 101 17365 20 10390
04-Jul-06 5 101 10390 30 17365
29-Jul-06 6 101 15448 40 10390
10 rows selected
But I sincerely feel there might be a better approach.
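The rule Narendra's scalar subquery implements can be restated in plain Python (illustrative only, on a subset of the TB_TRANS sample data; not a substitute for the SQL): for each transaction, find the most recent item used strictly before 1 January of the transaction's year whose item_no differs from the current row's.

```python
from datetime import date

txns = [  # (tran_date, item_no) -- subset of the sample data above
    (date(2005, 11, 4), 15448),
    (date(2005, 12, 4), 10390),
    (date(2005, 12, 14), 17365),
    (date(2005, 12, 24), 10390),
    (date(2006, 3, 4), 10390),
    (date(2006, 7, 1), 17365),
]

def prev_item(cur_date, cur_item):
    # "b.tran_date < trunc(a.tran_date, 'YYYY') and b.item_no <> a.item_no"
    cutoff = date(cur_date.year, 1, 1)
    earlier = [(d, i) for d, i in txns if d < cutoff and i != cur_item]
    return max(earlier)[1] if earlier else None   # latest qualifying item
```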
Follow-up
Jonty, May 16, 2007 - 11:33 am UTC
Hi,
Thanks for looking into it.
Unfortunately this will not be a good solution.
1. I have to do it on high-volume data (1 to 300 million records), and there will be many more calculations on the same record. I want to avoid using the table twice in the query. I am thinking more of using an analytic function.
2. Years will be varying but to make it easier lets assume Year is fix.
3. Customer No. - there will be many more customers. For each customer and item I have to do this.
4.
May 17, 2007 - 10:46 am UTC
2) no, how about you give the real requirement.
I don't see the opportunity for analytics really here. You would partition by year, but you want last years stuff in this year - analytics work within a partition - not across partitions
how does customer fit into all of this?? you just mentioned "customer and item"
please be amazingly precise in your specification and if customer is part of the equation, make sure your example includes at least two customers to demonstrate with!
Richard, May 16, 2007 - 4:13 pm UTC
In our data warehouse environment, we are receiving updated sales orders on a nightly basis from a 3rd party system via ODBC.
The process is prebuilt and we cannot modify. The only solutions we have available are to rebuild the entire interface process, or add additional processing on top of the delivered functionality.
Management decided we are better off adding customized processes on top of the delivered product.
We can only add new interfaces for additional data needed from the 3rd party app beyond the scope of the delivered interfaces.
SQL*Plus: Release 10.1.0.2.0 - Production on Wed May 16 14:25:12 2007
Copyright (c) 1982, 2004, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> drop table historical_txns;
Table dropped.
SQL> create table historical_txns (
2 order_date date not null,
3 order_no integer not null,
4 order_line integer not null,
5 txn_date date not null,
6 blocked_flag varchar2(1) not null,
7 constraint pk_historical_txns
8 primary key (order_date, order_no, order_line, txn_date));
Table created.
SQL> drop table order_status_hist;
Table dropped.
SQL> create table order_status_hist (
2 order_date date not null,
3 order_no integer not null,
4 order_line integer not null,
5 effective_dt date not null,
6 expired_dt date default to_date('99991231','yyyymmdd') not null,
7 constraint ck_eff_exp_dates
8 check (effective_dt <= expired_dt),
9 constraint pk_order_status_hist
10 primary key (order_date, order_no, order_line, effective_dt));
Table created.
SQL> alter session set nls_date_format='yyyymmdd';
Session altered.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070201'),'Y');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070203'),'N');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070210'),'Y');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070211'),'Y');
1 row created.
SQL> commit;
Commit complete.
SQL> select * from historical_txns;
ORDER_DA ORDER_NO ORDER_LINE TXN_DATE B
-------- ---------- ---------- -------- -
20070201 100 1 20070201 Y
20070201 100 1 20070203 N
20070201 100 1 20070210 Y
20070201 100 1 20070211 Y
ASSUMPTIONS: A given order line may be blocked and released multiple times, depending on item availability, customer credit, etc.
In the order_status_hist table, only one record for a given order line may be "active" at any given time. For example,
given the above data, the following is the expected output:
SQL> select * from order_status_hist;
ORDER_DA ORDER_NO ORDER_LINE EFFECTIV EXPIRED_
-------- ---------- ---------- -------- --------
20070201 100 1 20070201 20070203
20070201 100 1 20070210 99991231
QUESTION:
To do the initial load of data from historical_txns into order_status_hist, do we need analytic SQL or a PL/SQL procedure?
Note that the transaction dated 11-Feb-2007 did not change the flag.
Going forward, as inserts and updates are processed on the ORDERS table which contains a BLOCKED column with Y or N values, we want to capture changes to the BLOCKED column via a trigger.
If the updated record is blocked and the order was previously not blocked, a new effective date record is added to the order_status_hist table.
If the updated record unblocks a previously blocked order, the currently active record is given the current date as the expired_date.
If the BLOCKED field did not change, do nothing to the history table.
The following trigger should perform the required ongoing maintenance, correct?
CREATE OR REPLACE TRIGGER order_blocked_status_trg
BEFORE DELETE or INSERT OR UPDATE OF blocked
ON ORDERS
FOR EACH ROW
BEGIN
IF INSERTING
and :new.blocked='Y'
THEN
-- when adding new records that are blocked, also add
-- records to the blocked status table for tracking
INSERT INTO order_status_hist (order_date, order_no, order_line, effective_dt)
VALUES (:new.order_date, :new.order_no, :new.order_line, trunc(SYSDATE));
ELSIF UPDATING
AND :OLD.blocked <> :NEW.blocked
THEN
-- if the line was previously blocked and is now unblocked,
-- expire the currently blocked status record
UPDATE order_status_hist
SET expired_dt = trunc(sysdate)
where order_date = :old.order_date
and order_no = :old.order_no
and order_line = :old.order_line
and :old.blocked = 'Y'
and trunc(sysdate) between effective_dt and expired_dt;
-- If the line was previously unblocked and is now blocked,
-- create a blocked status record.
if :new.blocked = 'Y' then
INSERT INTO order_status_hist (order_date, order_no, order_line, effective_dt)
values (:new.order_date, :new.order_no, :new.order_line, trunc(sysdate));
end if;
ELSIF DELETING
then
-- if the record being deleted has blocked status records,
-- also delete these blocked status records for data integrity
delete from order_status_hist
where order_date = :old.order_date
and order_no = :old.order_no
and order_line = :old.order_line;
END IF;
END order_blocked_status_trg;
/
May 17, 2007 - 10:55 am UTC
please, I'm begging you to not use that expired date in the year 9999 - please, use null. you'll whack out the optimizer's ability to estimate card= values.
I would not use a trigger - hate triggers.
If I can give you a query (using analytics) from the one table to provide you with the desired output, is that sufficient?
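One possible shape for that analytic query, sketched in plain Python on the sample data above (this is an assumption about the approach, not Tom's actual SQL): keep only transactions where blocked_flag differs from the previous row (a LAG-style change filter), then pair each 'Y' change with the date of the following change (LEAD-style) to form effective/expired ranges. The 20070211 'Y' row drops out, as the expected output requires, and the open range ends in None rather than year 9999, per Tom's advice.

```python
txns = [  # (txn_date, blocked_flag) for one order line, ordered by date
    ("20070201", "Y"),
    ("20070203", "N"),
    ("20070210", "Y"),
    ("20070211", "Y"),
]

# lag(flag) <> flag: keep only rows where the flag actually changed
changes = [t for i, t in enumerate(txns)
           if i == 0 or t[1] != txns[i - 1][1]]

intervals = []
for i, (d, flag) in enumerate(changes):
    if flag == "Y":                                 # a block starts here
        nxt = changes[i + 1][0] if i + 1 < len(changes) else None
        intervals.append((d, nxt))                  # None = still blocked
```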
Richard, May 17, 2007 - 12:27 pm UTC
Analytic sql would be the ideal solution for me. I can update the process to use nvl(expired_dt,sysdate) instead of 9999/12/31 easily enough.
TO Richard
Kevin, May 17, 2007 - 12:54 pm UTC
> I can update the process to use nvl(expired_dt,sysdate) instead of 9999/12/31 easily enough.
Don't do this:
SYSDATE <= NVL(expired_dt,sysdate)
Instead, use:
(EXPIRED_DT IS NULL OR SYSDATE <= EXPIRED_DT)
Yes, we can see the equivalence, and yes, yours is more "succinct". But the NVL() on EXPIRED_DT will prevent the CBO from using expired_dt column stats when estimating cardinalities. In fact, regardless of the distribution of EXPIRED_DT, the CBO will use the default 5% selectivity estimate for your version of the predicate.
GROUP BY order date
A reader, May 21, 2007 - 1:05 pm UTC
tom, I have a query built out of 3-4 sub queries.
SELECT date,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT TRUNC(date) date,product_id,product_structure, side, COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY TRUNC(date),product_id, product_structure,side
)
GROUP BY date,product_id, product_structure
the problem is, if I take the date off the group by clause I get total askprice + price = 1656,
and if I put the date clause in (as above) I get
askprice + price = 1510.
looking at the query, do you see why this is happening?
May 21, 2007 - 1:27 pm UTC
I don't understand. If you take date off of the group by - the query no longer FUNCTIONS.
nvl( x, null ) is 'redundant'
GROUP BY DATE
A reader, May 21, 2007 - 1:31 pm UTC
SELECT product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT product_id,product_structure, side, COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY product_id, product_structure,side
)
GROUP BY product_id, product_structure
but now, if you sum askprice + price it is not the same as before
(
even though we have
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
)
May 21, 2007 - 1:55 pm UTC
I'm not understanding it - I don't see any outputs, so I'm not sure what you mean really.
the CODE button is really good for things CODE related (fixed width font)
GROUP BY DATE
A reader, May 21, 2007 - 1:32 pm UTC
by the way, where do you see nvl(x,null)?
May 21, 2007 - 1:56 pm UTC
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price,
x = lag(price) over (partition by order_id order by date asc)
nvl( x, null ) previous_price
GROUP BY DATE
A reader, May 21, 2007 - 2:17 pm UTC
code 1:
SELECT <b>date</b>,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT <b>TRUNC(date)</b> date,product_id,product_structure, side, COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY <b>TRUNC(date)</b>,product_id, product_structure,side
)
GROUP BY <b>date</b>,product_id, product_structure
code 2:
SELECT product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT product_id,product_structure, side, COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY product_id, product_structure,side
)
GROUP BY product_id, product_structure
now in the above two queries, the outer query of one has date in the group by clause and the 2nd does not, and both have the same where clause.
Q: DO YOU see any reason why, if I sum up the counts of askprice + price, the counts will be different?
May 22, 2007 - 9:18 am UTC
it would be so cool to see the outputs you say are different just to, well, sort of see what it is you are comparing.
GROUP BY DATE -- worked ....
A reader, May 21, 2007 - 5:15 pm UTC
SELECT date,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT TRUNC(date) date, product_id,product_structure, side, COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY TRUNC(date),product_id, product_structure,side
)
GROUP BY date,product_id, product_structure
did not work...
but after changing the position of the date column it produced the same results...
SELECT product_id,product_structure,date, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
FROM (SELECT product_id,product_structure, side, TRUNC(date), COUNT (*) cnt
FROM (SELECT order_id, product_id, side, last_user, date,
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
FROM order o,user u
where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
AND last_user = u.code ) o, products p
WHERE 1 = 1
AND product_id = p.pid(+)
and status = 2
and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
GROUP BY product_id, product_structure,side,TRUNC(date)
)
GROUP BY product_id, product_structure,date
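The MAX(DECODE(side, ...)) construction in these queries is plain conditional aggregation (a pivot). A Python/sqlite3 sketch of the pattern, with invented data and CASE standing in for Oracle's DECODE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table counts(product_id int, side int, cnt int)")
con.executemany("insert into counts values (?,?,?)",
                [(1, 0, 7), (1, 1, 4), (2, 0, 9)])

# one row per product: side 0 becomes askprice, side 1 becomes price
rows = con.execute("""
  select product_id,
         max(case when side = 0 then cnt else 0 end) askprice,
         max(case when side = 1 then cnt else 0 end) price
  from counts
  group by product_id
  order by product_id""").fetchall()
```

Product 2 has no side-1 row, so its price column comes out as the 0 supplied by the CASE's ELSE branch, exactly as in the DECODE version.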
Analytics and PLSQL
A reader, May 23, 2007 - 8:13 am UTC
Hi Tom,
I have the following plsql code in a Procedure
Execute immediate 'insert into t1 (aa,
bb,
cc,
dd)
select a,
b,
c,
min (d) over (partition by a,b)
from t
where fag_done = ''N''
';
The trace file, when run through tkprof, shows for this SELECT a parse/execute ratio of 1, which needs to be corrected.
Could you please help me correct this high ratio of parse count to execute count?
Thanks a lot
Any Analytic Function to Help?
Hitesh, June 14, 2007 - 2:48 am UTC
Hi Tom,
I am trying to calculate the % variation of the number of records inserted in the past hour versus the number of records inserted in the past week.
Is there an analytic function to do that?
Select 'T' "Table Name", A.Hrly_Cnt "Past Hour Statistics", B.Weekly_Cnt "Past Week Statistics", Ceil((B.Weekly_Cnt-A.Hrly_Cnt)* 100 /(B.Weekly_Cnt)) "%Variation"
from
(
Select /*+Parallel(X,4) */ Count(*) Hrly_cnt
from
t X
Where
Insert_date >= Sysdate-1/24 And Insert_date < Sysdate) A,
(
Select /*+Parallel(Y,4) */ Ceil(Count(*)/7/24) Weekly_cnt
from
t Y
Where
Insert_date >= Sysdate-7-1/24 And Insert_date < Sysdate-1/24) B
/
Thanks
Hitesh
June 14, 2007 - 7:24 am UTC
no analytics needed, but a single pass would more than do this
select count( case when insert_date between sysdate-1/24 and sysdate then 1 end),
count( case when insert_date between sysdate-7-1/24 and sysdate-1/24 then 1 end)
from t
where insert_date between sysdate-7-1/24 and sysdate
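Tom's single-pass approach translates directly to any language: filter once to the widest range, then count conditionally inside that one pass. A minimal Python sketch of the same idea (the timestamps and cutoffs below are made up for illustration):

```python
from datetime import datetime, timedelta

now = datetime(2007, 6, 14, 12, 0, 0)            # stand-in for SYSDATE
hour_ago = now - timedelta(hours=1)
week_ago = now - timedelta(days=7, hours=1)

# fabricated insert timestamps: 5 and 30 minutes ago fall in the past
# hour; the rest only in the past week
inserts = [now - timedelta(minutes=m) for m in (5, 30, 90, 600, 5000)]

hourly = weekly = 0
for ts in inserts:                               # one pass over the data
    if week_ago <= ts <= now:
        weekly += 1                              # whole-week count
        if hour_ago <= ts <= now:
            hourly += 1                          # past-hour subset
```

The WHERE clause in Tom's SQL plays the role of the outer `if`; the two CASE expressions play the role of the two counters.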
Can we avoid multi pass of table T1
Ritesh, June 16, 2007 - 4:52 am UTC
Hi Tom,
Can we aviod the multi-pass of Table T1 in the below mentioned SQL. T1 is the parent table based on which ceratin records gets invalid status and they are inserted in table T. We need to know the %of Invalid activity based on the table T1.
Select
'T' "Table Name",
Cnt "Base Count",
Invalid_Match_Cnt "Invalid Match Count",
Invalid_Match_Cnt*100/Cnt "%Invalid"
from
(Select /*+Parallel(X 16) */ Count(*) Cnt
from
T1 X
WHERE Insert_date Between Sysdate-4/24 AND Sysdate-3/24) A,
(Select /*+ Parallel(X 16) Parallel(Y 16) Use_Hash(X Y) */ Count(*) Invalid_Match_Cnt
from
T X, T1 Y
WHERE
X.Insert_date Between Sysdate-4/24 AND Sysdate AND
Y.Insert_date Between Sysdate-4/24 AND Sysdate-3/24 AND
X.tag = Y.tag) B
/
Thanks
Ritesh
Last_Value vs First_Value - Please explain
Jay, July 19, 2007 - 10:56 am UTC
Quote:
"it didn't work for me. I had to change it to first_value(master_record) over (partition by sub_record order by moddate desc)
Is there a reason for that?
Followup December 16, 2003 - 2pm US/Eastern:
doh, default window clause is current row and unbounded preceding
i would have needed a window clause that looks forwards rather than backwards (reason #1 why I
should always set up a test case instead of just answering on the fly)
your solution of reversing the data works just fine. "
---------------------
Hi tom,
I had the same doubt. I read your explanation but couldn't fully comprehend what you were saying.
Why would,
Last_Value() over (partition by .. order by ... )
be different from..
First_Value() over (partition by .. order by ... desc) ??
The latter case provides the correct result; the first one doesn't. I'm confused. Thanks for your time.
Jay
July 19, 2007 - 12:20 pm UTC
give us the example where the latter is correct and the former is not - and then we'll explain what happened.
NULLS would definitely affect that (unless you use nulls first/nulls last to control their placement)
Thanks..
Jay, July 19, 2007 - 1:41 pm UTC
-----------------------------case 1 query
with
population as
(
select 123 as test_id, 'Alpha' as test1, 1 as test2 from dual
union all
select 123 as test_id, 'Beta' as test1, 2 as test2 from dual
union all
select 123 as test_id, 'Gamma' as test1, 3 as test2 from dual
union all
select 123 as test_id, 'Delta' as test1, 4 as test2 from dual
union all
select 123 as test_id, 'Pi' as test1, 5 as test2 from dual
)
select test_id,
test1,
test2,
first_value(test1) over (partition by test_id
order by test2 desc) as "Correct"
from population p
-----------------------------case 2 query
with
population as
(
select 123 as test_id, 'Alpha' as test1, 1 as test2 from dual
union all
select 123 as test_id, 'Beta' as test1, 2 as test2 from dual
union all
select 123 as test_id, 'Gamma' as test1, 3 as test2 from dual
union all
select 123 as test_id, 'Delta' as test1, 4 as test2 from dual
union all
select 123 as test_id, 'Pi' as test1, 5 as test2 from dual
)
select test_id,
test1,
test2,
last_value(test1) over (partition by test_id
order by test2) as "Incorrect.. why??"
from population p
case 1 is correct and returns 'pi' which is expected.
case 2 should return 'pi' as well since I am ordering it by test2 ascending and am partitioning it by the ID.
Could you please explain me as to why this is happening?
Thanks for your time as always!
July 19, 2007 - 2:53 pm UTC
oh, ok - easy, the default window is:
default window clause is current row and unbounded preceding
so, first_value/last_value works from the current record BACKWARDS, up to the top of the result set.
first_value will always be - the first record.
last_value (in this case) will always be - the CURRENT RECORD
14 select test_id,
15 test1,
16 test2,
17 first_value(test1) over (partition by test_id
18 order by test2 desc) as "Correct" ,
19 last_value(test1) over (partition by test_id
20 order by test2) as "Incorrect.. why??"
21 from population p
22 order by test2
23 /
TEST_ID TEST1 TEST2 Corre Incor
---------- ----- ---------- ----- -----
123 Alpha 1 Pi Alpha
123 Beta 2 Pi Beta
123 Gamma 3 Pi Gamma
123 Delta 4 Pi Delta
123 Pi 5 Pi Pi
here, when processing row 1 - last value has to be alpha, because if you look at the current record and all of the preceding ones, the last value of test1 is alpha.
when you go to row 2, the last value has to be beta - because when you look at the last row in the current window (which is from the current row and all preceding), the last value is Beta - and so on...
14 select test_id,
15 test1,
16 test2,
17 first_value(test1) over (partition by test_id
18 order by test2 desc) as "Correct" ,
19 last_value(test1) over (partition by test_id
20 order by test2<b>
21 range between current row and unbounded following ) as "Incorrect.. why??"</b>
22 from population p
23 order by test2
24 /
TEST_ID TEST1 TEST2 Corre Incor
---------- ----- ---------- ----- -----
123 Alpha 1 Pi Pi
123 Beta 2 Pi Pi
123 Gamma 3 Pi Pi
123 Delta 4 Pi Pi
123 Pi 5 Pi Pi
Here, we changed the window so that last value would look at the current row and everything AFTER it, so now, Pi is the answer because no matter what row we look at - using the current row and unbounded following - the last value is the last row in the window and the last row is always the SAME in that window.
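The frame behaviour Tom describes is standard SQL, so it can be checked with Python's built-in sqlite3 module (SQLite uses the same default frame: unbounded preceding through the current row). A sketch of Jay's data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table population(test_id int, test1 text, test2 int)")
con.executemany("insert into population values (?,?,?)",
                [(123, "Alpha", 1), (123, "Beta", 2), (123, "Gamma", 3),
                 (123, "Delta", 4), (123, "Pi", 5)])

# default frame (unbounded preceding .. current row): last_value is
# always just the current row
default_frame = [r[0] for r in con.execute("""
    select last_value(test1) over (partition by test_id order by test2)
    from population order by test2""")]

# widen the frame to current row .. unbounded following: last_value is
# now always the final row of the partition
full_frame = [r[0] for r in con.execute("""
    select last_value(test1) over (partition by test_id order by test2
             rows between current row and unbounded following)
    from population order by test2""")]
```

`default_frame` reproduces the "Incorrect.. why??" column (each row sees itself as the last value), while `full_frame` returns Pi for every row, matching Tom's corrected query.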
Thank you very much..
Jay, July 19, 2007 - 3:03 pm UTC
Hello Tom,
Thanks a lot for explaining the concept. It makes complete sense now. Thank you so much!!
Jay
last_value
Sokrates, July 20, 2007 - 1:55 am UTC
now, even me, I got it !
LAST_VALUE, unbounded frustration
Duke Ganote, August 24, 2007 - 11:19 am UTC
That's why I never use LAST_VALUE. FIRST_VALUE lets me intuitively use the default window clause (current row and unbounded preceding).
FIRST_VALUE behaves as FIFO (first inline, first out).
The default clauses turn LAST_VALUE into LINF (last inline, never found).
Analytics, configurable date window
Phil, September 02, 2007 - 7:54 am UTC
Hello Tom
I struggle with Analytics, have you finished writing a book that explains it from the start?
I need to produce a report based on time and split the data into chunks. Initially each minute but the user needs to be able to select the time "window" for example 5 minutes.
Is there any easy way of doing this using analytics please?
here are 2 hours of random data I've been working with...
create table tom
(when date)
/
insert into tom select sysdate+((dbms_random.value)-.5)/14 from all_objects
/
September 05, 2007 - 1:27 pm UTC
I did,
years ago - expert one on one Oracle - the chapter on Analytics. Covers it from start to finish.
Also, the Data Warehousing guide does a really good job!
you need to define your problem a little better - normally to split things into 1 or N minutes slices - you do NOT need analytics at all. eg: one minute slices, just group by:
to_char( dt, 'yyyymmddhhmm' )
everything in the same minute will be together - you need to be a little more precise as to what you mean here though before I can say anything.
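Tom's group-by-a-truncated-timestamp idea extends to any slice length: divide seconds-since-epoch by the window size and truncate, and everything in the same N-minute slice shares a bucket id. A small Python sketch (the timestamps are invented; `bucket` is a hypothetical helper, not anything from the thread):

```python
from collections import Counter
from datetime import datetime

EPOCH = datetime(1970, 1, 1)

def bucket(ts: datetime, minutes: int) -> int:
    """All timestamps inside the same N-minute slice share one bucket id."""
    return int((ts - EPOCH).total_seconds()) // (minutes * 60)

times = [datetime(2007, 9, 2, 10, 0, 30),    # 10:00-10:05 slice
         datetime(2007, 9, 2, 10, 3, 0),     # same slice
         datetime(2007, 9, 2, 10, 7, 15)]    # 10:05-10:10 slice

per_5min = Counter(bucket(t, 5) for t in times)   # {slice_id: row count}
```

The user-selectable window size is just the `minutes` argument; in SQL the equivalent would be grouping by trunc((when - some_fixed_date) * 24 * 60 / :n).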
Can LAG do what I need? Version 9.2.0.6.0
Peter, September 13, 2007 - 7:16 am UTC
Dear Tom,
Thanks for the site - I have learned many things and use Analytics a lot now - very performant in many cases. If you can spare the time to look at my problem I will be most grateful.
I have a table like:
SQL> r
1 create table lag_test (
2 claim varchar2 (10) not null,
3 event varchar2 (10),
4 period number (6,0) not null,
5 amount number (10,2) not null
6* )
Table created.
With this data:
SQL> insert into lag_test values ('1', 'A', 200701, 50);
1 row created.
SQL> insert into lag_test values ('1', null, 200702, 75);
1 row created.
SQL> insert into lag_test values ('1', 'A', 200703, 100);
1 row created.
SQL> insert into lag_test values ('1', 'A', 200704, 1000);
1 row created.
SQL> commit;
Commit complete.
And I am using the following query to compute movements in amount:
SQL> edit
Wrote file afiedt.buf
1 select
2 event,
3 period,
4 amount - lag (amount, 1, 0) over (partition by claim order by claim, period) amt
5 from
6* lag_test
SQL> /
EVENT PERIOD AMT
---------- ---------- ----------
A 200701 50
200702 25
A 200703 25
A 200704 900
But I was expecting the following output:
EVENT PERIOD AMT
---------- ---------- ----------
A 200701 50
A 200702 -50
200702 75
A 200703 100
200703 -75
A 200704 900
Because at period 200702 "A" has gone away and been replaced by "null". And at period 200703 "null" has gone away and been replaced by "A".
I have scoured Chapter 12 in your book but maybe my understanding of how LAG works is flawed - or is there some way I can get my desired output.
Would really appreciate any help you can give me, and apologise if this is off topic.
Kind Regards,
Peter
September 15, 2007 - 7:27 pm UTC
I can only tell you if your desired output is possible if.....
You state in a factual, requirementy sort of way what it is. Saying "I expect 42 to be displayed" is insufficient.
You have 4 rows; I fail to see how a query against a simple 4 row table would return more than 4 rows in this case.
'A' has nothing to do with this query - you partition by claim (always '1') you order by claim and period - other than you selecting event, it is not part of the picture here.
Thanks for the tip above
Phil, September 15, 2007 - 5:57 pm UTC
Hi Tom Re:
Hello Tom
I struggle with Analytics, have you finished writing a book that explains it from the start?
I need to produce a report based on time and split the data into chunks. Initially each minute but
the user needs to be able to select the time "window" for example 5 minutes.
Is there any easy way of doing this using analytics please?
here are 2 hours of random data I've been working with...
create table tom
(when date)
/
insert into tom select sysdate+((dbms_random.value)-.5)/14 from all_objects
/
OK, I understand the to_char(t.when, 'yyyymmddhhmm'), which makes perfect sense, but is this an efficient way to search for a max frequency in a certain window? I thought analytics would offer me some performance advantages. I'm confident you'll improve on the following, which finds the top frequency per period based on the seed data above...
select five_sec_count, five_sec_period, seconds secfrom, seconds + 5 secto
from (select count(*) five_sec_count,
to_char(t.when, 'yyyymmddhhmm') five_sec_period,
case
when to_char(when, 'ss') between 1 and 5 then
5
when to_char(when, 'ss') between 6 and 10 then
10
when to_char(when, 'ss') between 11 and 15 then
15
when to_char(when, 'ss') between 16 and 20 then
20
when to_char(when, 'ss') between 21 and 25 then
25
when to_char(when, 'ss') between 26 and 30 then
30
when to_char(when, 'ss') between 31 and 35 then
35
when to_char(when, 'ss') between 36 and 40 then
40
when to_char(when, 'ss') between 41 and 45 then
45
when to_char(when, 'ss') between 46 and 50 then
50
when to_char(when, 'ss') between 51 and 55 then
55
else
60
end seconds
from tom t
group by to_char(t.when, 'yyyymmddhhmm'),
case
when to_char(when, 'ss') between 1 and 5 then
5
when to_char(when, 'ss') between 6 and 10 then
10
when to_char(when, 'ss') between 11 and 15 then
15
when to_char(when, 'ss') between 16 and 20 then
20
when to_char(when, 'ss') between 21 and 25 then
25
when to_char(when, 'ss') between 26 and 30 then
30
when to_char(when, 'ss') between 31 and 35 then
35
when to_char(when, 'ss') between 36 and 40 then
40
when to_char(when, 'ss') between 41 and 45 then
45
when to_char(when, 'ss') between 46 and 50 then
50
when to_char(when, 'ss') between 51 and 55 then
55
else
60
end
order by 1 desc)
where rownum = 1
Thanks for everything you've provided over the years.
September 15, 2007 - 10:03 pm UTC
well, the case isn't needed - a simple divide and trunc could do that - otherwise - looks fine.
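Tom's "divide and trunc" remark, sketched in Python for the seconds component. Note this labels buckets 0..11 instead of the CASE's 5..60, and it also quietly fixes the edge in the CASE where second 0 falls into the 60 bucket:

```python
def five_sec_bucket(second: int) -> int:
    """trunc(ss / 5): buckets 0..11 covering seconds 0-4, 5-9, ..., 55-59."""
    return second // 5
```

In the SQL itself this would be something like trunc(to_number(to_char(when,'ss'))/5) in the SELECT and GROUP BY, replacing the twelve-branch CASE expression.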
Can LAG do what I need? Version 9.2.0.6.0
Peter, September 19, 2007 - 11:35 am UTC
>>I can only tell you if your desired output is possible if.....
>>You state in a factual, requirementy sort of way what it is. Saying "I expect 42 to be displayed" is insufficient.
Sorry for being unclear Tom. My example was cut down too much and didn't really represent what I was trying to do.
I have found the answer in your article
https://asktom.oracle.com/Misc/oramag/on-uniqueness-space-and-numbers.html where you mention partitioned "sparse" outer joins. Unfortunately we are not yet on 10g, and my cartesian product will be extremely large.
Many thanks anyway for your response.
Kind Regards,
Peter
Combine row data for date event
Robert, September 28, 2007 - 4:57 pm UTC
Hello Tom,
I have a small module that tracks employee activity within a department. However, the tracker registers the activity as a single event (marked by 'A' - active or 'I' - inactive). Chronologically, for the same employee/department, an event of type 'A' can be either followed by a 'I' or by nothing (in which case the emp is still active).
I need to get the activity as a period, where the start date/time is when the emp became active at a particular time and the end date/time is when the emp became inactive. There's a way to accomplish this using table joins but I'm pretty sure there's a (simpler) way to accomplish it using analytics. How can I do this?
Here's the scripts (Oracle 10g).
create table emp_activity
(
dept_id number,
emp_id number,
dt date,
activity char(1)
);
One set of sample data:
insert into emp_activity(dept_id,emp_id,dt,activity)
values(15, 973, to_Date('23-AUG-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(15, 973, to_Date('12-APR-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(15, 973, to_Date('02-OCT-06 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
the result for this would be
dept | emp | start_date | end_date
----------------------------------------
15 | 973 | 23-AUG-07 | null
15 | 973 | 02-OCT-06 | 12-APR-07
Another set of data:
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('16-AUG-01 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('25-APR-03 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('30-JUL-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('22-AUG-07 11.08.54 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('22-AUG-07 11.35.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity)
values(10, 100, to_Date('22-AUG-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS PM'), 'I');
in this case, the result should be
dept | emp | start_date | end_date
----------------------------------------
10 | 100 | 22-AUG-07 | 22-AUG-07
10 | 100 | 30-JUL-07 | 22-AUG-07
10 | 100 | 16-AUG-01 | 25-APR-03
I left out the time fields in the results, for clarity. Thank you.
October 03, 2007 - 1:10 pm UTC
your last output looks wrong to me, this is what I think it should be:
ops$tkyte%ORA10GR2> select *
2 from (
3 select dept_id, emp_id,
4 dt, lead(dt) over (partition by dept_id, emp_id order by dt) last_dt,
5 activity, lead(activity) over (partition by dept_id, emp_id order by dt) last_activity
6 from emp_activity
7 )
8 where activity = 'A'
9 order by dept_id, emp_id, dt
10 /
DEPT_ID EMP_ID DT LAST_DT A L
---------- ---------- --------- --------- - -
10 100 16-AUG-01 25-APR-03 A I
10 100 30-JUL-07 22-AUG-07 A I
10 100 22-AUG-07 A
15 973 02-OCT-06 12-APR-07 A I
15 973 23-AUG-07 A
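Tom's lead() pairing can be reproduced outside Oracle; here is a Python/sqlite3 sketch with simplified sample rows (the ambiguous '12.00.00 AM' row from the original data is left out so the times sort cleanly):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""create table emp_activity
               (dept_id int, emp_id int, dt text, activity text)""")
con.executemany("insert into emp_activity values (?,?,?,?)", [
    (10, 100, "2001-08-16 00:00", "A"),
    (10, 100, "2003-04-25 00:00", "I"),
    (10, 100, "2007-07-30 00:00", "A"),
    (10, 100, "2007-08-22 11:08", "I"),
    (10, 100, "2007-08-22 11:35", "A")])

# each 'A' row picks up the next event's date; a trailing 'A' (still
# active) pairs with NULL
periods = con.execute("""
  select dt, next_dt from (
    select dt, activity,
           lead(dt) over (partition by dept_id, emp_id order by dt) next_dt
    from emp_activity)
  where activity = 'A'
  order by dt""").fetchall()
```

Each tuple is one activity period: (start, end), with a None end for the period that is still open.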
Re: Combine row data for date event
Robert, October 02, 2007 - 8:54 am UTC
I actually solved the problem myself after playing with lag and lead!
select dept_id, emp_id, start_date, end_date
from (
select dept_id, emp_id, dt start_date
,lag(dt) over(partition by dept_id, emp_id order by dt desc) end_date
,activity
from emp_activity
) where activity='A'
order by dept_id, emp_id, start_date desc
window clause on time
Ryan, October 10, 2007 - 1:31 pm UTC
This might be on here already, but I couldn't find it. I think I need an analytic function with a window clause to do this:
mytimestamp is a timestamp field
mymetric is a number field
select max(mytimestamp) as time, sum(mymetric) as bytes
from metrictable
where mytimestamp > ( select max(mytimestamp) from metrictable) - (1/96)
I want to get this rolled up by 15 minute chunks. do i use a group by, analytic function?
So it is essentially a set of buckets or chunks
first 15 minute period
second 15 minute period
The "where" clause is just me trying to get the last 15 minutes.
analytic functions
RAJ, November 08, 2007 - 4:16 pm UTC
Hi Tom,
I have the following scenario and given the exepected output as well.
I am trying to achieve with the below said query.
SELECT DISTINCT a.emp_id, CASE WHEN a.operation$ IN ('UU', 'UN') THEN 'U'
ELSE a.operation$ END operation,
CASE
WHEN a.operation$ = 'I' THEN NULL
ELSE FIRST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
END prev_amount,
LAST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
AS curr_amount,
CASE WHEN a.operation$ = 'I'
THEN NULL ELSE FIRST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
END prev_timestamp,
LAST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
AS curr_timestamp
FROM DEPT a
ORDER BY emp_id
Result -- 3 rows, but 1 row is expected. Below I have given the expected output as well. I am not sure what I am missing in the above query.
Here is the scenario..
create table dept (
OPERATION$ CHAR(2)
,CSCN$ NUMBER
,COMMIT_TIMESTAMP$ DATE
,RSID$ NUMBER
,USERNAME$ VARCHAR2(30)
,TIMESTAMP$ DATE
,SOURCE_COLMAP$ RAW(128)
,TARGET_COLMAP$ RAW(128)
,EMP_ID NUMBER
,TIMESTAMP DATE
,AMOUNT NUMBER
)
INSERT INTO
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
VALUES
('UU ',1234,SYSDATE,40264,'TEST',SYSDATE,'08','08',104,SYSDATE,90000);
INSERT INTO
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
VALUES
('UN ',1234,SYSDATE,40265,'TEST',SYSDATE,'08','08',104,SYSDATE,100000);
INSERT INTO
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
VALUES
('UU ',1234,SYSDATE,40270,'TEST',SYSDATE,'08','08',104,SYSDATE,100000);
INSERT INTO
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
VALUES
('UN ',1234,SYSDATE,40271,'TEST',SYSDATE,'08','08',104,SYSDATE,30000);
UU,UN -- 'U'
Expected result set
EMP_ID, OPERATION$, PREV_AMOUNT, PREV_TIMESTAMP, CURRENT_AMOUNT, CURR_TIMESTAMP
104 , U 90000 SYSDATE , 30000 SYSDATE
November 09, 2007 - 12:10 pm UTC
funny, your example doesn't work AT ALL. operation$ is 2 characters, you have 3. there is no table name. sigh.
Funnier still, I have no clue why you expect a SINGLE ROW - when there are clearly three distinct rows.
ops$tkyte%ORA10GR2> SELECT DISTINCT a.emp_id, CASE WHEN a.operation$ IN ('UU', 'UN') THEN 'U'
2 ELSE a.operation$ END operation,
3 CASE
4 WHEN a.operation$ = 'I' THEN NULL
5 ELSE FIRST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
6 END prev_amount,
7 LAST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
8 AS curr_amount,
9 CASE WHEN a.operation$ = 'I'
10 THEN NULL ELSE FIRST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
11 END prev_timestamp,
12 LAST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
13 AS curr_timestamp
14 FROM DEPT a
15 ORDER BY emp_id
16 /
EMP_ID OP PREV_AMOUNT CURR_AMOUNT PREV_TIME CURR_TIME
---------- -- ----------- ----------- --------- ---------
104 U 90000 30000 09-NOV-07 09-NOV-07
104 U 90000 90000 09-NOV-07 09-NOV-07
104 U 90000 100000 09-NOV-07 09-NOV-07
It appears you are misunderstanding how the analytic functions work in this case - so presenting your "answer" without presenting "your question" (phrased in text) is not useful.
how do you carry lag to more than one row?
ee, December 05, 2007 - 6:11 pm UTC
How can you carry lag analytic results forward to multiple rows? Given table t, based on a view of two Cartesian-joined tables that uses analytics, the dataset is as follows:
e.g.
CREATE TABLE T(
cert_number varchar2(10),
inspection_date DATE,
risk_assign_date DATE,
risk_number NUMBER,
derived_risk NUMBER
);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('11-dec-2006'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('12-dec-2006'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('01-jan-2007'),TO_DATE('06-jan-2007'),3,3);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('25-jan-2007'),NULL,NULL,3);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('27-jan-2007'),TO_DATE('27-jan-2007'),8,8);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('13-feb-2007'),NULL,NULL,3);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('17-feb-2007'),TO_DATE('17-feb-2007'),2,2);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('18-apr-2007'),NULL,NULL,2);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('23-may-2007'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('17-aug-2007'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('22-aug-2007'),TO_DATE('22-aug-2007'),1,1);
The analytic code ...DECODE (risk_number, NULL, first_value(risk_number) over (ORDER BY cert_number, inspection_date ASC ROWS 1 preceding), risk_number) derived_risk... within the view correctly determines the risk number for rows where the risk number is null, but it leaves several rows without a derived risk_number value. How can I carry the lagged derived_risk value of 2 forward to more than just one row, so that the rows with inspection dates of 5/23/2007 and 8/17/2007 also get a derived_risk value of 2 assigned?
The desired output would look like:
cert_ inspection_ risk_assign_ risk_ derived_
number date date number risk
====== ========== ===== ====== ========
1 12/11/2006 null null null
1 12/12/2006 null null null
1 1/1/2007 1/6/2007 3 3
1 1/25/2007 null null 3
1 1/27/2007 1/27/2007 8 8
1 2/13/2007 null null 3
1 2/17/2007 2/17/2007 2 2
1 4/18/2007 null null 2
1 5/23/2007 null null 2
1 8/17/2007 null null 2
1 8/22/2007 8/22/2007 1 1
December 10, 2007 - 9:52 am UTC
ops$tkyte%ORA10GR2> select t.*,
2 last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
3 from t;
CERT_NUMBE INSPECTIO RISK_ASSI RISK_NUMBER DERIVED_RISK DR
---------- --------- --------- ----------- ------------ ----------
1 11-DEC-06
1 12-DEC-06
1 01-JAN-07 06-JAN-07 3 3 3
1 25-JAN-07 3 3
1 27-JAN-07 27-JAN-07 8 8 8
1 13-FEB-07 3 8
1 17-FEB-07 17-FEB-07 2 2 2
1 18-APR-07 2 2
1 23-MAY-07 2
1 17-AUG-07 2
1 22-AUG-07 22-AUG-07 1 1 1
11 rows selected.
I don't know why your example has
null, null, 3,3,8,3 <<<===???? why 3 - i think it should be 8.
how do you carry lag values forward to more than one row?
ee, December 06, 2007 - 8:55 am UTC
How can you carry lag analytic results forward to multiple rows? The derived risk value carries forward the most recent risk number and changes only when a new risk_assign_date arrives with a new risk number. Given table t, based on a view of two Cartesian-joined tables that uses analytics, the dataset is as follows:
CREATE TABLE T(
cert_number varchar2(10),
inspection_date DATE,
risk_assign_date DATE,
risk_number NUMBER,
derived_risk NUMBER
);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('11-dec-2006'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('12-dec-2006'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('01-jan-2007'),TO_DATE('06-jan-2007'),3,3);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('25-jan-2007'),NULL,NULL,3);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('27-jan-2007'),TO_DATE('27-jan-2007'),8,8);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('13-feb-2007'),NULL,NULL,8);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('17-feb-2007'),TO_DATE('17-feb-2007'),2,2);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('18-apr-2007'),NULL,NULL,2);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('23-may-2007'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('17-aug-2007'),NULL,NULL,NULL);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk )
VALUES(1,TO_DATE('22-aug-2007'),TO_DATE('22-aug-2007'),1,1);
The analytic code ...DECODE (risk_number, NULL, first_value(risk_number) over (ORDER BY cert_number, inspection_date ASC ROWS 1 preceding), risk_number) derived_risk... within the view correctly determines the risk number for rows where the risk number is null, but it leaves several rows without a derived risk_number value. How can I carry the lagged derived_risk value of 2 forward to more than just one row, so that the rows with inspection dates of 5/23/2007 and 8/17/2007 also get a derived_risk value of 2 assigned?
The desired output would look like:
cert_ inspection_ risk_assign_ risk_ derived_
number date date number risk
====== ========== ===== ====== ========
1 12/11/2006 null null null
1 12/12/2006 null null null
1 1/1/2007 1/6/2007 3 3
1 1/25/2007 null null 3
1 1/27/2007 1/27/2007 8 8
1 2/13/2007 null null 8
1 2/17/2007 2/17/2007 2 2
1 4/18/2007 null null 2
1 5/23/2007 null null 2
1 8/17/2007 null null 2
1 8/22/2007 8/22/2007 1 1
10g only?
Jay, December 10, 2007 - 10:05 am UTC
Hi Tom,
Thanks! The questioner corrected his output in the following statement. I had a question: is 'ignore nulls' a 10g thing? It does not seem to be working for me on 9i - I get an ORA-00907 missing right parenthesis error when I use the 'ignore nulls' clause.
Thanks for your help!
Jay
December 10, 2007 - 10:16 am UTC
ignore nulls is new with 10g - yes.
before that you can use this "carry down" technique:
ops$tkyte%ORA9IR2> select t.*,
2 last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
3 from t;
last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
*
ERROR at line 2:
ORA-00907: missing right parenthesis
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select cert_number,inspection_date,risk_assign_date,risk_number,derived_risk,
2 to_number( substr( max(data) over (order by cert_number, inspection_date), 11 ) ) dr
3 from (
4 select t.*,
5 case when risk_number is not null
6 then to_char(row_number() over
7 (order by cert_number, inspection_date),'fm0000000000') ||
8 risk_number
9 end data
10 from t
11 ) x
12 /
CERT_NUMBE INSPECTIO RISK_ASSI RISK_NUMBER DERIVED_RISK DR
---------- --------- --------- ----------- ------------ ----------
1 11-DEC-06
1 12-DEC-06
1 01-JAN-07 06-JAN-07 3 3 3
1 25-JAN-07 3 3
1 27-JAN-07 27-JAN-07 8 8 8
1 13-FEB-07 3 8
1 17-FEB-07 17-FEB-07 2 2 2
1 18-APR-07 2 2
1 23-MAY-07 2
1 17-AUG-07 2
1 22-AUG-07 22-AUG-07 1 1 1
11 rows selected.
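The pre-10g carry-down trick ports to any engine with row_number and a running MAX. A Python/sqlite3 sketch of the same data (dates converted to ISO strings so they sort as text; printf stands in for Oracle's 'fm0000000000' format mask, and the fixed width assumes single-digit risk numbers, as in the original):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t(inspection_date text, risk_number int)")
con.executemany("insert into t values (?,?)", [
    ("2006-12-11", None), ("2006-12-12", None), ("2007-01-01", 3),
    ("2007-01-25", None), ("2007-01-27", 8), ("2007-02-13", None),
    ("2007-02-17", 2), ("2007-04-18", None), ("2007-05-23", None),
    ("2007-08-17", None), ("2007-08-22", 1)])

# tag each non-null risk with a zero-padded, sortable row number, then
# take the running max: MAX skips NULLs, so the most recent tagged value
# carries down; substr strips the tag back off
dr = [r[0] for r in con.execute("""
  select cast(substr(max(data) over (order by inspection_date), 11)
              as integer)
  from (select inspection_date,
               case when risk_number is not null
                    then printf('%010d',
                           row_number() over (order by inspection_date))
                         || risk_number
               end data
        from t)
  order by inspection_date""")]
```

The result reproduces Tom's DR column, including the leading NULLs before the first assigned risk.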
WOW!
Jay, December 10, 2007 - 10:24 am UTC
Thanks Tom. You are simply amazing. I get to learn so much from your site!
Thanks a ton and have a wonderful day.
Jay
carrying lag value forward to more than one row
ee, December 11, 2007 - 12:40 pm UTC
Tom, as always you find the answers to difficult questions and provide succinct code we can follow and learn from. I struggled for several days to find a solution for carrying lag values forward to multiple rows and you have delivered once again. As I stated to you once at an RMOUG meeting, I thank the maker above for your insight and willingness to share your talents and knowledge with others.
Thanks again.
carry multiple lag values forward to multple rows
ee, December 13, 2007 - 12:00 pm UTC
TOM: I am still trying to grasp the capabilities of using analytics and have a follow up question:
as you stated: your code.....
ops$tkyte%ORA9IR2>
1 select cert_number,inspection_date,risk_assign_date,risk_number,derived_risk,
2 to_number( substr( max(data) over (order by cert_number, inspection_date), 11 ) ) dr
3 from (
4 select t.*,
5 case when risk_number is not null
6 then to_char(row_number() over
7 (order by cert_number, inspection_date),'fm0000000000')
||
8 risk_number
9 end data
10 from t
11 ) x
12 /
RETURNS this output:
cert_ inspection_ risk_assign_ risk_ derived_
number date date number risk
====== ========== ===== ====== ========
1 12/11/2006 null null null
1 12/12/2006 null null null
1 1/6/2007 1/6/2007 3 3
1 1/25/2007 null null 3
1 1/27/2007 1/27/2007 8 8
1 2/13/2007 null null 8
1 2/17/2007 2/17/2007 2 2
1 4/18/2007 null null 2
1 5/23/2007 null null 2
1 8/17/2007 null null 2
1 8/22/2007 8/22/2007 1 1
But, how would I code the query to return this output?
The desired output would look like:
cert_ inspection_ risk_assign_ risk_ derived_ derived_
number date date number risk risk_date
====== ========== ===== ====== ======== =========
1 12/11/2006 null null null null
1 12/12/2006 null null null null
1 1/6/2007 1/6/2007 3 3 1/6/2007
1 1/25/2007 null null 3 1/6/2007
1 1/27/2007 1/27/2007 8 8 1/27/2007
1 2/13/2007 null null 8 1/27/2007
1 2/17/2007 2/17/2007 2 2 2/17/2007
1 4/18/2007 null null 2 2/17/2007
1 5/23/2007 null null 2 2/17/2007
1 8/17/2007 null null 2 2/17/2007
1 8/22/2007 8/22/2007 1 1 8/22/2007
Thank you so very much for your help !!!
December 13, 2007 - 12:37 pm UTC
using the same technique on risk_assign_date
Look at what I did bit by bit here.... This is a technique applicable to anything...
carry down bit by bit...
ops$tkyte%ORA10GR2> select x, y
2 from t
3 order by x
4 /
X Y
-------------------- --------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40
14-dec-2007 00:00:00
we start with that, we want to carry down Y. Now, Y is not 'sortable' right now - the 13th comes before the 12th. In general, this will be true - that the attribute we want is not sortable directly... We need something to add to Y to make it "sortable" - row_number will help:
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select x, y,
2 case when y is not null then row_number() over (order by x) end rn
3 from t
4 order by x
5 /
X Y RN
-------------------- -------------------- ----------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 2
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40 4
14-dec-2007 00:00:00
we want row_number in a field whenever Y is not null (for the MAX() trick later!). We need it to sort - and it would, it is a number - but we need to piggyback our Y value onto this RN field, so we need a string:
ops$tkyte%ORA10GR2> select x, y,
2 case when y is not null then to_char(row_number() over (order by x),'fm000000') end rn
3 from t
4 order by x
5 /
X Y RN
-------------------- -------------------- -------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 000002
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40 000004
14-dec-2007 00:00:00
Now, we need to add our value to this RN field - it is a date, we convert date into something neutral for our string:
ops$tkyte%ORA10GR2> select x, y,
2 case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
3 to_char(y,'yyyymmddhh24miss') rn
4 from t
5 order by x
6 /
X Y RN
-------------------- -------------------- ---------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 00000220071213123940
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40 00000420071212123940
14-dec-2007 00:00:00
Now, we can apply the MAX trick to 'carry' down:
ops$tkyte%ORA10GR2> select x, y, MAX(rn) over (order by x) max_rn
2 from (
3 select x, y,
4 case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
5 to_char(y,'yyyymmddhh24miss') rn
6 from t
7 )
8 order by x
9 /
X Y MAX_RN
-------------------- -------------------- ---------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 00000220071213123940
12-dec-2007 00:00:00 00000220071213123940
13-dec-2007 00:00:00 12-dec-2007 12:39:40 00000420071212123940
14-dec-2007 00:00:00 00000420071212123940
substr gets our bit back out again:
ops$tkyte%ORA10GR2> select x, y, substr(MAX(rn) over (order by x),7) max_rn
2 from (
3 select x, y,
4 case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
5 to_char(y,'yyyymmddhh24miss') rn
6 from t
7 )
8 order by x
9 /
X Y MAX_RN
-------------------- -------------------- ---------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 20071213123940
12-dec-2007 00:00:00 20071213123940
13-dec-2007 00:00:00 12-dec-2007 12:39:40 20071212123940
14-dec-2007 00:00:00 20071212123940
we are almost done - a to_date and voila!
ops$tkyte%ORA10GR2> select x, y, to_date( substr(MAX(rn) over (order by x),7),'yyyymmddhh24miss') max_rn
2 from (
3 select x, y,
4 case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
5 to_char(y,'yyyymmddhh24miss') rn
6 from t
7 )
8 order by x
9 /
X Y MAX_RN
-------------------- -------------------- --------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 13-dec-2007 12:39:40
12-dec-2007 00:00:00 13-dec-2007 12:39:40
13-dec-2007 00:00:00 12-dec-2007 12:39:40 12-dec-2007 12:39:40
14-dec-2007 00:00:00 12-dec-2007 12:39:40
so, apply that technique to your data - I purposely am not giving the final solution :) I want you to code it.
carry lag values forward to multiple rows
ee, December 13, 2007 - 4:06 pm UTC
Thank you sooooo very much. I studied your example and was successful in applying it to my data situation.
PS.
During the daytime, I pore through your books and your askTom website and I carefully study the wonderful examples.... and then at night, I place your books under my pillow hoping that through osmosis, all of the wonderful content within the pages will be absorbed and understood when I awake. It is much better than counting sheep.
Single query to replace 'Union All'
Matthew, January 08, 2008 - 7:44 am UTC
Given the following script (v10.2):
DROP TABLE issue_category;
DROP TABLE category_type;
DROP TABLE issue_class;
DROP TABLE fd_scorecard;
DROP TABLE class_type;
DROP TABLE fd;
CREATE TABLE fd(
fd_id VARCHAR2(20),
fd_name VARCHAR2(100));
CREATE TABLE class_type(
cl_id NUMBER(6),
description VARCHAR2(100));
CREATE TABLE fd_scorecard(
fd_id VARCHAR2(20),
scorecard_id NUMBER(10),
date_of_meeting DATE,
status VARCHAR2(100));
CREATE TABLE category_type(
ct_id NUMBER(6),
class_type_id NUMBER(6),
description VARCHAR2(100),
weighting NUMBER(5,2));
CREATE TABLE issue_class(
fd_id VARCHAR2(20),
scorecard_id NUMBER(10),
class_type_id NUMBER(6),
ic_name VARCHAR2(100),
num_grade NUMBER(1),
weighting NUMBER(5,2));
CREATE TABLE issue_category(
fd_id VARCHAR2(20),
scorecard_id NUMBER(10),
category_type_id NUMBER(10),
class_type_id NUMBER(6),
num_grade NUMBER(1),
weighting NUMBER(5,2),
date_created DATE,
created_by VARCHAR2(100),
notes VARCHAR2(4000));
/* Create the data */
DECLARE
TYPE class_type_tab IS TABLE OF class_type%ROWTYPE;
t_class_type class_type_tab := class_type_tab();
TYPE category_type_tab IS TABLE OF category_type%ROWTYPE;
t_category_type category_type_tab := category_type_tab();
indx PLS_INTEGER;
PROCEDURE ins(
p_id class_type.cl_id%TYPE,
p_desc class_type.description%TYPE) IS
BEGIN
indx := t_class_type.NEXT(indx);
t_class_type(indx).cl_id := p_id;
t_class_type(indx).description := p_desc;
END ins;
PROCEDURE ins(
p_class_type_id category_type.class_type_id%TYPE,
p_id category_type.ct_id%TYPE,
p_desc category_type.description%TYPE,
p_weighting category_type.weighting%TYPE) IS
BEGIN
indx := t_category_type.NEXT(indx);
t_category_type(indx).ct_id := p_id;
t_category_type(indx).class_type_id := p_class_type_id;
t_category_type(indx).description := p_desc;
t_category_type(indx).weighting := p_weighting;
END ins;
BEGIN
INSERT INTO fd(
fd_id,
fd_name)
VALUES('FD000001', 'FD000001');
indx := 0;
/* Class_Type data */
t_class_type.EXTEND(12);
ins(1,'Class Type 1');
ins(2,'Class Type 2');
ins(3,'Class Type 3');
ins(4,'Class Type 4');
ins(5,'Class Type 5');
ins(6,'Class Type 6');
ins(7,'Class Type 7');
ins(8,'Class Type 8');
ins(9,'Class Type 9');
ins(10,'Class Type 10');
ins(11,'Class Type 11');
ins(12,'Class Type 12');
FORALL i IN 1..t_class_type.COUNT
INSERT INTO class_type
VALUES t_class_type(i);
INSERT INTO fd_scorecard(
fd_id,
scorecard_id,
date_of_meeting,
status)
SELECT fd_id,
ROWNUM Scorecard_ID,
SYSDATE Meeting_Date,
CASE WHEN MOD(ROWNUM,2) = 0 THEN 'Draft'
ELSE 'Final'
END Status
FROM fd;
INSERT INTO issue_class(
fd_id,
scorecard_id,
class_type_id,
ic_name,
num_grade,
weighting)
SELECT fs.fd_id,
fs.scorecard_id,
cl.cl_id Class_Type_ID,
'Issue Class Name '||TO_CHAR(ROWNUM) Ic_Name,
CASE MOD(ROWNUM, 5)
WHEN 0 THEN 5
WHEN 1 THEN 4
WHEN 2 THEN 3
WHEN 3 THEN 2
ELSE 1 END Num_Grade,
ROUND(100 / ROWNUM, 2) Weighting
FROM fd_scorecard fs,
class_type cl;
/* Category_Type data */
t_category_type.EXTEND(56);
indx := 0;
ins(1,1,'Category Type 11',150);
ins(1,2,'Category Type 12',30);
ins(1,3,'Category Type 13',60);
ins(1,4,'Category Type 14',40);
ins(1,5,'Category Type 15',10);
ins(1,6,'Category Type 16',5);
ins(1,7,'Category Type 17',5);
ins(2,1,'Category Type 21',150);
ins(2,2,'Category Type 22',25);
ins(2,3,'Category Type 23',20);
ins(2,4,'Category Type 24',10);
ins(2,5,'Category Type 25',10);
ins(2,6,'Category Type 26',10);
ins(3,1,'Category Type 31',50);
ins(3,2,'Category Type 32',25);
ins(3,3,'Category Type 33',25);
ins(4,1,'Category Type 41',25);
ins(4,2,'Category Type 42',75);
ins(4,3,'Category Type 43',50);
ins(5,1,'Category Type 51',30);
ins(5,2,'Category Type 52',30);
ins(5,3,'Category Type 53',30);
ins(5,4,'Category Type 54',10);
ins(6,1,'Category Type 61',50);
ins(6,2,'Category Type 62',25);
ins(6,3,'Category Type 63',50);
ins(6,4,'Category Type 64',50);
ins(6,5,'Category Type 65',25);
ins(6,6,'Category Type 66',25);
ins(6,7,'Category Type 67',25);
ins(7,1,'Category Type 71',100);
ins(7,2,'Category Type 72',50);
ins(7,3,'Category Type 73',20);
ins(7,4,'Category Type 74',5);
ins(7,5,'Category Type 75',25);
ins(7,6,'Category Type 76',20);
ins(7,7,'Category Type 77',25);
ins(7,8,'Category Type 78',40);
ins(7,9,'Category Type 79',10);
ins(7,10,'Category Type 710',5);
ins(8,1,'Category Type 81',25);
ins(8,2,'Category Type 82',25);
ins(9,1,'Category Type 91',50);
ins(9,2,'Category Type 92',25);
ins(9,3,'Category Type 93',50);
ins(9,4,'Category Type 94',25);
ins(9,5,'Category Type 95',25);
ins(9,6,'Category Type 96',25);
ins(10,1,'Category Type 101',100);
ins(10,2,'Category Type 102',100);
ins(10,3,'Category Type 103',75);
ins(10,4,'Category Type 104',25);
ins(11,1,'Category Type 111',30);
ins(11,2,'Category Type 112',30);
ins(11,3,'Category Type 113',20);
ins(11,4,'Category Type 114',20);
FORALL i IN 1..t_category_type.COUNT
INSERT INTO category_type
VALUES t_category_type(i);
INSERT INTO issue_category(
fd_id,
scorecard_id,
category_type_id,
class_type_id,
num_grade,
weighting,
date_created,
created_by,
notes)
SELECT ic.fd_id,
ic.scorecard_id,
ct.ct_id Category_Type_ID,
ct.class_type_id Class_Type_ID,
CASE MOD(ROWNUM, 5)
WHEN 0 THEN 5
WHEN 1 THEN 4
WHEN 2 THEN 3
WHEN 3 THEN 2
ELSE 1 END Num_Grade,
DBMS_RANDOM.VALUE(1,999) Weighting,
SYSDATE Date_Created,
USER Created_By,
NULL Notes
FROM issue_class ic,
category_type ct
WHERE ic.class_type_id = ct.class_type_id;
COMMIT;
END;
/
/* Constraints */
ALTER TABLE fd ADD(
CONSTRAINT fd_pk
PRIMARY KEY
(fd_id));
ALTER TABLE class_type ADD(
CONSTRAINT class_type_pk
PRIMARY KEY
(cl_id));
ALTER TABLE category_type ADD(
CONSTRAINT category_type_pk
PRIMARY KEY
(ct_id, class_type_id));
ALTER TABLE issue_class ADD(
CONSTRAINT issue_class_pk
PRIMARY KEY
(fd_id, scorecard_id, class_type_id));
ALTER TABLE issue_category ADD(
CONSTRAINT issue_class_category_pk
PRIMARY KEY
(fd_id, scorecard_id, category_type_id, class_type_id));
ALTER TABLE fd_scorecard ADD(
CONSTRAINT fd_scorecard_pk
PRIMARY KEY
(fd_id, scorecard_id));
ALTER TABLE fd_scorecard ADD(
CONSTRAINT fd_scorecard_fd_fk
FOREIGN KEY (fd_id)
REFERENCES fd(fd_id));
ALTER TABLE issue_class ADD(
CONSTRAINT issue_class_fd_scorecard_fk
FOREIGN KEY (fd_id,scorecard_id)
REFERENCES fd_scorecard(fd_id,scorecard_id));
ALTER TABLE issue_class ADD(
CONSTRAINT issue_class_class_type_fk
FOREIGN KEY (class_type_id)
REFERENCES class_type(cl_id));
ALTER TABLE issue_class ADD(
CHECK (num_grade BETWEEN 1 AND 5));
ALTER TABLE category_type ADD(
CONSTRAINT category_type_class_type_fk
FOREIGN KEY (class_type_id)
REFERENCES class_type(cl_id));
ALTER TABLE issue_category ADD(
CONSTRAINT issue_category_category_typ_fk
FOREIGN KEY (category_type_id, class_type_id)
REFERENCES category_type(ct_id, class_type_id));
the following query produces the required results (i.e. what in my Cobol days I'd have called File Header, (Batch Header, Details, Batch Trailer) times n, File Trailer):
SELECT 0 Class_ID,
f.fd_name Class_Description,
0 Class_Type_ID,
NULL Catg_ID,
fs.status Catg_Description,
TO_NUMBER(NULL) Num_Grade,
TO_NUMBER(NULL) Weighting,
fs.date_of_meeting Date_Created,
NULL Created_By
FROM fd f,
fd_scorecard fs
WHERE fs.fd_id = f.fd_id
AND f.fd_id = 'FD000001'
UNION ALL
SELECT cl_id Class_ID,
cl.description Class_Description,
0 Class_Type_ID,
NULL Catg_ID,
NULL Catg_Description,
TO_NUMBER(NULL) Num_Grade,
TO_NUMBER(NULL) Weighting,
TO_DATE(NULL) Date_Created,
NULL Created_By
FROM class_type cl
UNION ALL
SELECT cl.cl_id Class_ID,
NULL Class_Description,
ic.class_type_id,
CASE ic.class_type_id
WHEN 1 THEN 'A'
WHEN 2 THEN 'B'
WHEN 3 THEN 'C'
WHEN 4 THEN 'D'
WHEN 5 THEN 'E'
WHEN 6 THEN 'F'
WHEN 7 THEN 'G'
WHEN 8 THEN 'H'
WHEN 9 THEN 'I'
WHEN 10 THEN 'J'
WHEN 11 THEN 'K'
WHEN 12 THEN 'L'
END||
TO_CHAR(ct.ct_id) Catg_ID,
CASE GROUPING (ct.ct_id)
WHEN 0 THEN ct.description
ELSE
CASE GROUPING(ic.class_type_id)
WHEN 0 THEN 'Grade'
ELSE 'Overall Grade And Conclusion'
END
END Catg_Description,
ROUND(SUM(ic.num_grade * ic.weighting) / SUM(ic.weighting)) Num_Grade,
ROUND(AVG(ct.weighting)) Weighting,
ic.date_created,
ic.created_by
FROM fd f,
issue_class icl,
class_type cl,
category_type ct,
issue_category ic
WHERE f.fd_id = icl.fd_id
AND icl.class_type_id = cl.cl_id
AND ct.class_type_id = cl.cl_id
AND ic.class_type_id = icl.class_type_id
AND ic.scorecard_id = icl.scorecard_id
AND ic.fd_id = icl.fd_id
AND ic.category_type_id = ct.ct_id
AND f.fd_id = 'FD000001'
GROUP BY GROUPING SETS((ic.class_type_id),(cl.cl_id,cl.description, ic.class_type_id, ct.ct_id, ct.description, ic.date_created, ic.created_by),())
ORDER BY class_id, class_type_id, catg_id;
Can this be done with a single select?
Single query to replace 'Union All'
Matthew, January 08, 2008 - 9:14 am UTC
Apologies, the query I gave you was leaving the 'batch trailer' rows till the end. This one is OK:
SELECT 0 Class_ID,
f.fd_name Class_Description,
0 Class_Type_ID,
0 Ct_ID,
NULL Catg_ID,
fs.status Catg_Description,
0 Batch_Trailer,
TO_NUMBER(NULL) Num_Grade,
TO_NUMBER(NULL) Weighting,
fs.date_of_meeting Date_Created,
NULL Created_By
FROM fd f,
fd_scorecard fs
WHERE fs.fd_id = f.fd_id
AND f.fd_id = :p_fd_id
UNION ALL
SELECT cl_id Class_ID,
cl.description Class_Description,
0 Class_Type_ID,
0 Ct_ID,
NULL Catg_ID,
NULL Catg_Description,
0 Batch_Trailer,
TO_NUMBER(NULL) Num_Grade,
TO_NUMBER(NULL) Weighting,
TO_DATE(NULL) Date_Created,
NULL Created_By
FROM class_type cl
UNION ALL
SELECT NVL(cl.cl_id, ic.class_type_id) Class_ID,
NULL Class_Description,
ic.class_type_id,
NVL(ct.ct_id,0) Ct_ID,
CASE ic.class_type_id
WHEN 1 THEN 'A'
WHEN 2 THEN 'B'
WHEN 3 THEN 'C'
WHEN 4 THEN 'D'
WHEN 5 THEN 'E'
WHEN 6 THEN 'F'
WHEN 7 THEN 'G'
WHEN 8 THEN 'H'
WHEN 9 THEN 'I'
WHEN 10 THEN 'J'
WHEN 11 THEN 'K'
WHEN 12 THEN 'L'
END||
TO_CHAR(ct.ct_id) Catg_ID,
CASE GROUPING (ct.ct_id)
WHEN 0 THEN ct.description
ELSE
CASE GROUPING(ic.class_type_id)
WHEN 0 THEN 'Grade'
ELSE 'Overall Grade And Conclusion'
END
END Catg_Description,
GROUPING_ID(ct.description) Batch_Trailer,
ROUND(SUM(ic.num_grade * ic.weighting) / SUM(ic.weighting)) Num_Grade,
ROUND(AVG(ct.weighting)) Weighting,
ic.date_created,
ic.created_by
FROM fd f,
issue_class icl,
class_type cl,
category_type ct,
issue_category ic
WHERE f.fd_id = icl.fd_id
AND icl.class_type_id = cl.cl_id
AND ct.class_type_id = cl.cl_id
AND ic.class_type_id = icl.class_type_id
AND ic.scorecard_id = icl.scorecard_id
AND ic.fd_id = icl.fd_id
AND ic.category_type_id = ct.ct_id
AND f.fd_id = 'FD000001'
GROUP BY GROUPING SETS((ic.class_type_id),(cl.cl_id,cl.description, ic.class_type_id, ct.ct_id,
ct.description, ic.date_created, ic.created_by),())
ORDER BY class_id, class_type_id, batch_trailer, ct_id;
January 08, 2008 - 9:49 am UTC
I sort of doubt it based on the from lists and the group by grouping sets.
SELECT
FROM fd f,
fd_scorecard fs
UNION ALL
SELECT
FROM class_type cl
UNION ALL
SELECT
FROM fd f,
issue_class icl,
class_type cl,
category_type ct,
issue_category ic
if the union alls were subsets of each other - maybe. But here you have fd+fd_scorecard, while fd_scorecard isn't present in the last query - and since fd is 1:M with fd_scorecard, adding it would tend to "add rows" to the last query (assuming an OUTER JOIN were used in case the relationship is 1:M OPTIONAL - each fd row could have 0, 1 or more entries in fd_scorecard)
Single query to replace 'Union All'
Matthew, January 08, 2008 - 9:57 am UTC
I won't waste any more time on it then. Many thanks.
Using analytics for getting hourly intervals
Thiru, January 14, 2008 - 10:45 am UTC
Tom,
What is a good way using analytics for getting the result for:
create table t1 ( id number, sid number,qty number,stime date)
insert into t1 values(1,10,100,to_date('01-15-2008 12:00','MM-DD-YYYY HH24:MI'))
insert into t1 values(1,10,200,to_date('01-15-2008 12:05','MM-DD-YYYY HH24:MI'))
insert into t1 values(1,10,200,to_date('01-15-2008 13:05','MM-DD-YYYY HH24:MI'))
insert into t1 values(1,10,200,to_date('01-15-2008 13:10','MM-DD-YYYY HH24:MI'))
insert into t1 values(1,10,200,to_date('01-15-2008 14:10','MM-DD-YYYY HH24:MI'))
the output for sum (qty) needs to be grouped by id and sid and hourly time with the top of the hour being shown.
ID SID TOP_OF_HOUR TOTAL_QTY
1 10 01-15-2008 12:00 300
1 10 01-15-2008 13:00 400
1 10 01-15-2008 14:00 200
Thanks for the time
January 14, 2008 - 3:43 pm UTC
you don't want analytics, you just want to trunc the time to the HOUR level and group by id, sid, date.... sum(qty) by that.
code..
Jay, January 14, 2008 - 5:45 pm UTC
Tom refers to a code like this..
select id,
sid,
trunc(stime,'hh'),
sum(qty)
from t1
group by id, sid, trunc(stime,'hh')
Thanks!
Selecting columns that do not change
Thiru, January 15, 2008 - 8:51 am UTC
Thanks for the input. If I have let's say 15 columns in the above t1 table and all the columns except the one that is being grouped on are the same, is there a way I can avoid putting all the columns in the group by clause?
if columns are c1....c15 and all columns from c4 to c15 are constant values for the related groups, how do I avoid doing the following?
select c1,c2,sum(c3),....c15
group by c1,c2,c4....c15
Only c1 and c2 are the actual grouping columns
Thanks for the time again.
January 15, 2008 - 1:02 pm UTC
you only have to type it once
and it is mandatory to be typed
You will not avoid the following (the above), but feel free to hide it in a view by all means.
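A sketch of the view approach Tom suggests, assuming a trimmed-down T1 (column names hypothetical, and only a few of the fifteen columns shown):

```sql
-- hide the long GROUP BY list behind a view; callers just
-- "select * from t1_summary" and never retype the key columns
create or replace view t1_summary
as
select c1, c2, sum(c3) total_c3, c4, c5
  from t1
 group by c1, c2, c4, c5;
```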
do a max...
Jay, January 15, 2008 - 9:04 am UTC
select c1, c2, sum(c3), max(c4), max(c5)...,max(c15)
group by c1, c2
Thanks,
Jay
January 15, 2008 - 1:10 pm UTC
do not do that.
the goal in programming is not to write obfuscated code for the ease of the current human typing.
the goal is to generate correct code that is maintainable, understandable and correct.
That query would be so misleading as to the intent, the purpose. I would look at it later and say "what sort of goofy person did this, this is so wrong, don't they understand this data model - we'll have to review all of their code now, they don't know what this data is"
Please don't do something like this to avoid a few keystrokes once. This is as simple as CUT AND PASTE
select a, b, c, d, .... z, sum(xx)
from t
group by <yank that select list and paste it here>
max worked.
Thiru, January 15, 2008 - 10:45 am UTC
Thanks. That did it. Is MAX as efficient as MIN in this case, given that the other column is doing a SUM()?
January 15, 2008 - 2:59 pm UTC
stop it, please don't be lazy.
Your group by key is your group by key.
Your aggregates are your aggregates.
Please do not confuse the two. You will confuse someone down the road or they'll be convinced you had no idea what you were doing.
I'll never, in a billion years never, be able to understand the fear of typing.
Jay, January 15, 2008 - 11:02 am UTC
I don't understand your question. Either MAX or MIN would work in this case. No pros or cons... it's the same banana!
Thanks!
Jay
January 15, 2008 - 3:23 pm UTC
yeah, they are both WRONG and INAPPROPRIATE to use.
And that they "asked" implies they don't understand what it is actually doing and that is even worse.
Thiru, January 15, 2008 - 11:33 am UTC
Though the max function did the trick, don't you feel that when working with huge sets of data the function would be an expensive operation, as all these other constant-value columns are not going to be indexed? I would like to just pick up one value without using an aggregate function call or grouping by all 15 constant-value columns.
Hope I am making myself clear here. It's like this:
c1 c2 c3 c4 c5 c6
1 2 1 1 1 1
1 5 1 1 1 1
2 10 2 2 2 2
2 20 2 2 2 2
As you see : select c1,sum(c2),max(c3),max(c4),max(c5),max(c6) from t1 group by c1
this has got a) the grouping operation b) the max operation for each of the other columns
would like to avoid this max to get the result:
c1 c2 c3 c4 c5 c6
1 7 1 1 1 1
2 30 2 2 2 2
Thanks again.
January 15, 2008 - 3:34 pm UTC
hey - just use GROUP BY !!!!! (come on, this is a trivial CUT AND PASTE people)
think about this - using MAX() is harder for you to code. *harder*
It is not only wrong
It is *harder*
In your attempt to shave a millisecond of your time, you add many seconds to your time
<b>
select c1, c2, c3, c4, c5, c6, sum(x)
from t
group by </b> c1, c2, c3, c4, c5, c6
the code in bold - what you have to type. the stuff not in bold, the stuff you have to copy and paste.
Now, do it with max and you are - typing MORE
heck, way more - because you probably have to alias the columns
select max(c1) c1, max(c2) c2, .......
all in a misguided attempt to "type less"...
and yeah, you don't want to max all over the place for performance reasons.
Jay, January 15, 2008 - 1:06 pm UTC
Did you test run the query using the max function? Are you having any performance issues? Tom might talk to you more about this. But, I feel that you should evaluate your query and measure the performance before making a 'generic' comment without any basis.
Now, if you really "hate" doing 'group by', one way to do this via analytics would be something like this.
select distinct c1,
sum_c2,
c3,
c4
from
(select c1,
c2,
sum(c2) over (partition by c1) as sum_c2,
c3,
c4
from t1)
But, again, I don't understand why you would do this!
Thanks,
Jay
January 15, 2008 - 3:37 pm UTC
it will take a little more cpu to max lots of columns than to group by...
this was a quick and dirty 10g test:
SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, OBJECT_TYPE, CREATED,
LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY,
SUM(DATA_OBJECT_ID)
FROM
T GROUP BY OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, OBJECT_TYPE,
CREATED, LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 10 0.00 0.00 0 0 0 0
Fetch 5000 7.68 13.12 4800 6900 0 499010
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5011 7.69 13.13 4800 6900 0 499010
********************************************************************************
SELECT MAX(OWNER), MAX( OBJECT_NAME), MAX( SUBOBJECT_NAME), OBJECT_ID,
MAX(OBJECT_TYPE), MAX( CREATED), MAX( LAST_DDL_TIME), MAX( TIMESTAMP), MAX(
STATUS), MAX(TEMPORARY), MAX( GENERATED), MAX( SECONDARY),
SUM(DATA_OBJECT_ID)
FROM
T GROUP BY OBJECT_ID
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.02 0 1 0 0
Execute 10 0.00 0.00 0 0 0 0
Fetch 5000 9.27 16.05 7800 6900 0 499010
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 5011 9.29 16.08 7800 6901 0 499010
but the right answer is to group by that which you want to group by.
Thanks Tom
Jay, January 15, 2008 - 1:29 pm UTC
Quote --
"... the goal is to generate correct code that is maintainable, understandable and correct......"
Tom, thanks for the comment. I get your point!
Jay
Conditional sorting
Reader, January 22, 2008 - 7:03 pm UTC
Sample data
create table t
(
num_col number,
var_col varchar2(3)
);
insert into t values(1,'abc');
insert into t values(1,'def');
insert into t values(1,'pqr');
insert into t values(2,'cde');
insert into t values(2,'def');
insert into t values(2,'rst');
insert into t values(2,'xyz');
insert into t values(3,'bcd');
insert into t values(3,'def');
insert into t values(3,'pqr');
Expected Output
1 abc <== This num_col group comes first as abc is the least value among all var_col values in the table
1 def
1 pqr
3 bcd <== This num_col group comes next as bcd will be the next sorted value after abc among all var_col values in the table
3 def
3 pqr
2 cde <== This num_col group comes last as cde is the next sorted value after bcd among all var_col values in the table
2 def
2 rst
2 xyz
Is it possible to achieve the above expected output in a single SQL query, maybe using Analytics?
January 22, 2008 - 7:13 pm UTC
ops$tkyte%ORA10GR2> select num_col, var_col, min(var_col) over (partition by num_col) min_var_col
2 from t
3 order by 3
4 /
NUM_COL VAR MIN
---------- --- ---
1 pqr abc
1 abc abc
1 def abc
3 bcd bcd
3 def bcd
3 pqr bcd
2 rst cde
2 xyz cde
2 cde cde
2 def cde
10 rows selected.
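A hedged variant of the same idea - Oracle permits analytic functions directly in the ORDER BY clause, so the helper column need not be projected at all:

```sql
-- same ordering without exposing MIN_VAR_COL; a secondary
-- sort on var_col makes rows deterministic within each group
select num_col, var_col
  from t
 order by min(var_col) over (partition by num_col), var_col;
```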
Fantastic
Reader, January 22, 2008 - 7:22 pm UTC
s devarshi, January 29, 2008 - 8:59 am UTC
i have table t.
create table t( x number(3),y varchar2(3),z number(3));
the rows are
insert into t values (1,'a',5);
insert into t values (1,'a',10);
insert into t values (1,'b',25);
insert into t values (1,'b',12);
insert into t values (1,'c',5);
insert into t values (1,'c',5);
insert into t values (2,'a',5);
insert into t values (2,'a',15);
insert into t values (2,'c',5);
insert into t values (2,'c',15);
insert into t values (3,'a',5);
insert into t values (3,'a',5);
insert into t values (3,'b',5);
insert into t values (4,'a',5);
insert into t values (4,'a',35);
insert into t values (5,'a',55);
insert into t values (5,'a',5);
insert into t values (5,'b',55);
insert into t values (6,'b',35);
insert into t values (6,'c',35);
i need an output like this
x y z
- - -
5 a 60
5 b 55
5 tot 115
6 b 35
6 c 35
6 tot 70
4 a 40
4 tot 40
the sorting is on z (tot), grouped by x and y.
i can get the output
x,y1,y2,y3,tot
format using decode but not this.
January 30, 2008 - 8:59 am UTC
the sorting is on X,Y, and then SUM(z)
you missed X,Y in your "specification", you must order by X,Y and then sum(z). You cannot sort by Z as that would cause X,Y to not be "grouped" in the result set of course.
ops$tkyte%ORA10GR2> select x, decode( grouping(y), 1, 'tot', y ), sum(z)
2 from t
3 group by grouping sets ((x,y),(x))
4 order by x, y, sum(z)
5 /
X DEC SUM(Z)
---------- --- ----------
1 a 15
1 b 37
1 c 10
1 tot 62
2 a 20
2 c 20
2 tot 40
3 a 10
3 b 5
3 tot 15
4 a 40
4 tot 40
5 a 60
5 b 55
5 tot 115
6 b 35
6 c 35
6 tot 70
18 rows selected.
devarshi, January 31, 2008 - 12:58 am UTC
superb
analytics and 8.1.7
devarshi, February 05, 2008 - 12:59 am UTC
i have two more queries.
1. how to do it in 8.1.7 ?
2. can analytic function be used here ?
thanks
February 05, 2008 - 8:13 am UTC
ops$tkyte@ORA817DEV> select decode( grouping(x), 1, 'tot', x ) x,
2 decode( grouping(y), 1, 'tot', y ),
3 sum(z),
4 grouping(x) gx, grouping(y) gy
5 from t
6 group by rollup(x,y)
7 /
X DEC SUM(Z) GX GY
--- --- ---------- ---------- ----------
1 a 15 0 0
1 b 37 0 0
1 c 10 0 0
1 tot 62 0 1
2 a 20 0 0
2 c 20 0 0
2 tot 40 0 1
3 a 10 0 0
3 b 5 0 0
3 tot 15 0 1
4 a 40 0 0
4 tot 40 0 1
5 a 60 0 0
5 b 55 0 0
5 tot 115 0 1
6 b 35 0 0
6 c 35 0 0
6 tot 70 0 1
tot tot 342 1 1
19 rows selected.
you can use a where clause to get rid of the tot/tot row if you want.
analytics (the over () functions) would not be appropriate, they do not "make up" rows like grouping sets, rollup and cube do.
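The filter Tom mentions can be sketched like this, assuming the same table T - GROUPING() can be tested directly in a HAVING clause, avoiding an inline view:

```sql
-- drop only the grand-total (tot/tot) row; per-X totals survive
select decode( grouping(x), 1, 'tot', x ) x,
       decode( grouping(y), 1, 'tot', y ) y,
       sum(z) z
  from t
 group by rollup(x,y)
having not ( grouping(x) = 1 and grouping(y) = 1 );
```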
Grouped Pagination
reader, February 07, 2008 - 8:26 am UTC
Hi Tom,
I have a view like this(Scott EMP table).
select min(job) over (partition by deptno) mnjb, deptno dnm , e.* from emp e
order by mnjb,deptno;
This view returns an ordered result set. On this ordered result set I want to generate a sequential number against each distinct DEPTNO, which I will use for pagination.
The expected output is:
1 ANALYST 20 7369 SMITH CLERK 7902 12/17/1980 00:00:00 800 20
1 ANALYST 20 7876 ADAMS CLERK 7788 01/12/1983 00:00:00 1100 20
1 ANALYST 20 7788 SCOTT ANALYST 7566 12/09/1982 00:00:00 3000 20
1 ANALYST 20 7902 FORD ANALYST 7566 12/03/1981 00:00:00 3000 20
1 ANALYST 20 7566 JONES MANAGER 7839 04/02/1981 00:00:00 2975 20
2 CLERK 10 7782 CLARK MANAGER 7839 06/09/1981 00:00:00 2450 10
2 CLERK 10 7839 KING PRESIDENT 11/17/1981 00:00:00 5000 10
2 CLERK 10 7934 MILLER CLERK 7782 01/23/1982 00:00:00 1300 10
3 CLERK 30 7499 ALLEN SALESMAN 7698 02/20/1981 00:00:00 1600 300 30
3 CLERK 30 7698 BLAKE MANAGER 7839 05/01/1981 00:00:00 2850 30
3 CLERK 30 7654 MARTIN SALESMAN 7698 09/28/1981 00:00:00 1250 1400 30
3 CLERK 30 7900 JAMES CLERK 7698 12/03/1981 00:00:00 950 30
3 CLERK 30 7844 TURNER SALESMAN 7698 09/08/1981 00:00:00 1500 0 30
3 CLERK 30 7521 WARD SALESMAN 7698 02/22/1981 00:00:00 1250 500 30
Please refer the first column. The sequence changes based on the ordered distinct DEPTNO as received from the view.
Is it possible to get the sequences using Analytics or any other method?
February 07, 2008 - 9:07 am UTC
ops$tkyte%ORA10GR2> create or replace view v
2 as
3 select dense_rank() over (order by mnjb, deptno) rnk,
4 x.*
5 from (
6 select min(job) over (partition by deptno) mnjb,
7 deptno dnm ,
8 e.*
9 from emp e
10 ) x
11 order by mnjb,deptno;
View created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from v;
RNK MNJB DNM EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- --------- ---------- ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
1 ANALYST 20 7876 ADAMS CLERK 7788 12-JAN-83 1100 20
1 ANALYST 20 7566 JONES MANAGER 7839 02-APR-81 2975 20
1 ANALYST 20 7902 FORD ANALYST 7566 03-DEC-81 3000 20
1 ANALYST 20 7788 SCOTT ANALYST 7566 09-DEC-82 20182.53 20
1 ANALYST 20 7369 SMITH CLERK 7902 17-DEC-80 800 20
2 CLERK 10 123 HelloWorld 10
2 CLERK 10 7839 KING PRESIDENT 17-NOV-81 5000 10
2 CLERK 10 7782 CLARK MANAGER 7839 09-JUN-81 2450 10
2 CLERK 10 7934 MILLER CLERK 7782 23-JAN-82 1300 10
3 CLERK 30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
3 CLERK 30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
3 CLERK 30 7900 JAMES CLERK 7698 03-DEC-81 950 30
3 CLERK 30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
3 CLERK 30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
3 CLERK 30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
15 rows selected.
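With RNK in place, the pagination itself can be sketched as (bind variable names hypothetical):

```sql
-- fetch one dense_rank() group, or a range of groups, per page
select *
  from v
 where rnk between :first_group and :last_group;
```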
Fantastic...superb....stunning
reader, February 07, 2008 - 9:42 am UTC
When will I be able to think like you?...just thinking out loud....
Ordering Rows by specific values
Humble, February 19, 2008 - 10:49 am UTC
Tom I have the below SQL
SELECT LPAD(NVL(pm.item,'0'),8,'0') item,
'00000000000' supplier,
TO_CHAR(LPAD(NVL(pm.dept,0),3,0)) dept,
TO_CHAR(LPAD(NVL(pm.class,0),4,'0')) class,
RPAD('R',1,'0') dummy,
TO_CHAR(LPAD((ROUND(NVL(pm.new_price,0),2) * 100),7,'0')) new_price, --izp.unit_retail,
'0000000' unit_cost,
' ' tax_category,
RPAD(NVL(pm.item_short_desc,' '),18,' ') item_short_desc,
TO_CHAR(LPAD(NVL(pm.old_multi_units,0),3,'0')) old_multi_units,
TO_CHAR(LPAD((ROUND(NVL(pm.old_multi_unit_retail,0),2) * 100),7,'0')) old_multi_unit_retail,
NVL(RPAD(DECODE(pm.tran_type,1,'A',10,'M',13,'M',21,'D'),1,' '),' ') tran_type
FROM nex_pos_mods pm
WHERE pm.store = TO_NUMBER(p_StoreNo)
AND tran_type IN (1,10,13,21)
GROUP BY pm.item,
'00000000000',
pm.dept,
pm.class,
pm.new_price,
item_short_desc,
pm.old_multi_units,
pm.old_multi_unit_retail,
'0000000',
pm.tran_type;
I would like to order the results by specific tran_types.
I would like all 'D' values first,
all 'A' values second,
and all 'M' values third.
What could I do using SQL analytics?
February 19, 2008 - 5:03 pm UTC
why does everyone say "how can I do this with analytics" when faced with any "i cannot figure out how to write this in sql"?
I know analytics are pretty amazing, but they are not magic.
Analytics do not sort result sets.
Analytics do not aggregate
Analytics do not do many things.
this is a very simple order by requirement:
order by decode( tran_type, 'D', 1, 'A', 2, 'M', 3 )
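The same custom ordering can also be written with a searched CASE. A minimal, self-contained sketch (using three sample tran_type values from DUAL rather than the poster's table):

```sql
-- 'D' rows sort first, then 'A', then 'M'.
select tran_type
  from ( select 'A' tran_type from dual union all
         select 'M' from dual union all
         select 'D' from dual )
 order by case tran_type
            when 'D' then 1
            when 'A' then 2
            when 'M' then 3
          end;
```

Either form works; CASE is ANSI standard and often easier to read when the mapping grows.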
Most popular price
Thomas, February 25, 2008 - 11:36 am UTC
Hi Tom,
I have a table containing articles and a table containing
the current price for each article in each store.
For each article, I'd like to know the most popular price,
that is, the price used in the most number of markets. In
case of a tie (several prices being the most popular for an
article), I'd like to have the lowest of these most popular
prices.
Here is some simple test data:
CREATE TABLE article (article INTEGER PRIMARY KEY);
CREATE TABLE price (article INTEGER REFERENCES article,store INTEGER,price NUMBER);
INSERT INTO article VALUES(1);
INSERT INTO article VALUES(2);
INSERT INTO price VALUES(1,1,4.5);
INSERT INTO price VALUES(1,2,4.5);
INSERT INTO price VALUES(1,3,3);
INSERT INTO price VALUES(1,4,5);
INSERT INTO price VALUES(2,1,6);
INSERT INTO price VALUES(2,2,6.5);
INSERT INTO price VALUES(2,3,6);
INSERT INTO price VALUES(2,4,6.5);
INSERT INTO price VALUES(2,5,6);
INSERT INTO price VALUES(2,6,6.5);
INSERT INTO price VALUES(2,6,5);
I've tried it like this (without analytics) for an
individual article, which works fine:
SELECT * FROM (SELECT price FROM price p WHERE p.article=1
GROUP BY p.price ORDER BY COUNT(*) DESC,price)
WHERE ROWNUM=1
However, when I try to use this as a subquery like this,
Oracle 10gR2 won't let me:
SELECT a.article,(SELECT * FROM (SELECT price FROM price p
WHERE p.article=a.article GROUP BY p.price
ORDER BY COUNT(*) DESC,price) WHERE ROWNUM=1)
FROM article a;
Oracle tells me it doesn't know the alias a. The alias a
seems to be present in the first level of the subquery, but
not in the second level, i.e., where I want to use it.
Questions:
1. Why does Oracle tell me it doesn't know about a? To me,
this looks like a valid query. Is there anything I can do to
allow me access to a from the inner level of the subquery?
2. Can this be done better using a subquery with analytics?
February 25, 2008 - 2:06 pm UTC
1) the correlation name goes one level down. period.
2)
ops$tkyte@ORA920> select article, price
2 from (
3 select article, price,
4 row_number() over (partition by article order by cnt desc, price ) rn
5 from (
6 select article, price,
7 count(*) over (partition by article, price) cnt
8 from price
9 )
10 )
11 where rn = 1
12 order by article
13 /
ARTICLE PRICE
---------- ----------
1 4.5
2 6
analytics
A reader, February 26, 2008 - 6:08 pm UTC
Tom:
If I have a table
Mag_no, contract, stage, create_date
1 ABC 5 1/1/2008
1 ABC 5 1/15/2008
1 ABC 6 2/2/2008
1 ABC 6 3/3/2008
...............
How do I have a query to display the last create date record with highest stage
1 ABC 6 3/3/2008
February 27, 2008 - 2:26 am UTC
The lack of specificity here is huge....
apparently "stage" is not relevant, you want to collapse over that.
but what about mag_no, contract....
and where is the create table, the insert intos
Partition is detached from Order by?
Marat, April 12, 2008 - 4:47 am UTC
Hi Tom,
lets see a table
CREATE TABLE TTT
(
F1 VARCHAR2(10),
F2 NUMBER
);
Insert into TTT (F1, F2) Values ('1', 1);
Insert into TTT (F2) Values (2);
Insert into TTT (F2) Values (3);
Insert into TTT (F1, F2)Values ('2', 4);
Insert into TTT (F2) Values(5);
Insert into TTT (F2) Values(6);
Insert into TTT (F2) Values(7);
select ttt.*,
row_number() over ( partition by f1 order by f2) rn,
first_value(f2) over ( partition by f1 order by f2)fv
from ttt
order by f2;
F1 F2 RN FV
1 1 1 1
2 1 2
3 2 2
2 4 1 4
5 3 2
6 4 2
7 5 2
Note the values in RN - they continue from the first partition (f1=1), and FIRST_VALUE points to the second row, where the nulls in F1 begin.
So, how to partition same values to make them based on Order by?
Actually, I need the result like
F1 F2 RN FV
1 1 1 1
1 2 1 2
1 3 2 2
2 4 1 4
2 5 1 5
2 6 2 5
2 7 3 5
that is, I need null values to be filled with previous non-null value.
Thank you.
April 13, 2008 - 8:32 am UTC
... Note the values in RN - they continue from the first partition (f1=1), and FIRST_VALUE points the second row, where nulls in F1 begin. ...
no, they don't:
F1 F2 RN FV
1 1 1 1 <<<<= partition 1
2 4 1 4 <<<<= partition 2
2 1 2 <<<<= partition 3
3 2 2
5 3 2
6 4 2
7 5 2
Now, since ROWS IN A TABLE HAVE NO ORDER WHATSOEVER, I have to presume you want to carry down the last non-null value of F1 after sorting the entire set by F2 - F2 is what "sequences" or "orders" this data - right?
So, your requirement is simply stated as "carry down the last non-null value of F1 after sorting all data by F2"
ops$tkyte%ORA10GR2> select last_value(f1 ignore nulls) over (order by f2) f1_new,
2 f2
3 from ttt
4 order by f2
5 /
F1_NEW F2
---------- ----------
1 1
1 2
1 3
2 4
2 5
2 6
2 7
7 rows selected.
that works in 10g, in earlier releases, assuming f2 is NOT NULL and POSITIVE - the following would cut it:
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select substr( max( case when f1 is not null then to_char(f2,'fm0000000') || f1 end ) over (order by f2), 8, 10 ) f1_new,
2 f2
3 from ttt
4 order by f2
5 /
F1_NEW F2
---------- ----------
1 1
1 2
1 3
2 4
2 5
2 6
2 7
7 rows selected.
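To see why the pre-10g version works, it can help to expose the intermediate string before SUBSTR strips the sort key off. A sketch against the same TTT table:

```sql
-- The running MAX carries forward the string built on the last row where
-- F1 was not null: the 'fm0000000'-formatted F2 prefix makes strings from
-- later rows sort higher, so MAX ... OVER (ORDER BY f2) always holds the
-- most recent non-null F1 (with its prefix); SUBSTR(..., 8) then discards
-- the 7-digit prefix, leaving just F1.
select f2,
       max( case when f1 is not null
                 then to_char(f2,'fm0000000') || f1 end )
         over (order by f2) carried,
       substr( max( case when f1 is not null
                         then to_char(f2,'fm0000000') || f1 end )
                 over (order by f2), 8, 10 ) f1_new
  from ttt
 order by f2;
```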
How does it work?
A reader, April 14, 2008 - 3:30 am UTC
Excellent! Thank you, Tom! It works fine.
But... I still can't understand why it works NOT like this:
F1 F2 RN FV
1 1 1 1 => Partition 1
2 1 2 => Partition 2
3 2 2
2 4 1 4 => Partition 3
5 3 2 => Partition 4
6 4 2
7 5 2
April 16, 2008 - 1:50 pm UTC
umm, because F1 has three values in your table
f1 = 1
f1 = 2
f1 is null
therefore, if you partition by f1, you will have three partitions.
Does PARTITION BY work before ORDER BY?
Marat, April 15, 2008 - 5:25 am UTC
Well, now I realize that PARTITION BY works before ORDER BY, doesn't it?
April 16, 2008 - 2:42 pm UTC
the data is partitioned by your partition by
THEN, the data in that partition is sorted.
THEN, the window is "set up" (range or row window), the default window is typically "current row and all preceding rows" with an order by.
THEN the function is applied.
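That sequence of steps can be made visible by spelling out the default window explicitly; the two expressions below return the same thing (a sketch against the TTT table from above):

```sql
-- With an ORDER BY in the analytic clause, the default window is
-- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; writing it out
-- changes nothing.
select f1, f2,
       first_value(f2) over (partition by f1 order by f2) fv_default,
       first_value(f2) over (partition by f1 order by f2
                             range between unbounded preceding
                                   and current row) fv_explicit
  from ttt
 order by f2;
```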
Query
Tim, April 17, 2008 - 2:12 pm UTC
N1 N2
1 345
2 645
3 378
4 95
5 2557
6 95
7 111
8 8756
9 40
I need to write SQL for the below algorithm:
As you can see, N1 is sequential, so the algorithm starts from the end (MAX N1).
1. Get the current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
Given the above data, the query must return:
N1 N2
1 345 -> return this (345 is less than 645)
2 645
3 378
4 95 -> return this (95 is less than 2557)
5 2557
6 95
7 111 -> return this (111 is less than 8756)
8 8756
9 40
Could you pls help to write this query?
April 17, 2008 - 4:33 pm UTC
no create table
no insert into table
no look
this is trivial with lag or lead - read up on them, you'll probably be able to figure it out. lag lets you look "back N records" in a result set. Lead - look forward N records...
create table
Tim, April 18, 2008 - 12:05 am UTC
Sorry Tom.
Pls see below for table creation.
create table test (N1 int, N2 int);
insert into test values (1,345);
insert into test values (2,645);
insert into test values (3,378);
insert into test values (4,95);
insert into test values (5,2557);
insert into test values (6,95);
insert into test values (7,111);
insert into test values (8,8756);
insert into test values (9,40);
Thanks;
April 18, 2008 - 8:31 am UTC
did you even try, sigh.... the answer was given, hundreds of examples exist on this site...
did you notice your logic doesn't match your picture?
1. Get the current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
Given the above data, the query must return:
N1 N2
1 345 -> return this (345 is less than 645)
2 645
3 378
4 95 -> return this (95 is less than 2557)
5 2557
6 95
7 111 -> return this (111 is less than 8756)
8 8756
9 40
1. Get the current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
1) assume current = N1=7, N2_1=111
2) previous record is N1=6, N2_2=95
3) n2_2 < n2_1, but you wrote "111 is less than 8756"
Ok, play with lag and lead:
ops$tkyte%ORA10GR2> select n1, n2,
2 lead(n2) over (order by n1) next_n2,
3 lag(n2) over (order by n1) prior_n2
4 from test
5 /
N1 N2 NEXT_N2 PRIOR_N2
---------- ---------- ---------- ----------
1 345 645
2 645 378 345
3 378 95 645
4 95 2557 378
5 2557 95 95
6 95 111 2557
7 111 8756 95
8 8756 40 111
9 40 8756
9 rows selected.
and tell us what you really meant - if you cannot get it yourself given this example.
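If the intent (reading the picture rather than the numbered steps) is "keep rows whose N2 is smaller than the next row's N2", a minimal sketch against the TEST table above would be:

```sql
-- Keep a row when the following row (by N1) has a larger N2.
select n1, n2
  from ( select n1, n2,
                lead(n2) over (order by n1) next_n2
           from test )
 where n2 < next_n2;
```

Note that this also returns N1=6 (95 < 111), which the picture does not mark - precisely the ambiguity pointed out above, so the requirement still needs to be pinned down.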
keen relationship
Ken, April 23, 2008 - 6:29 pm UTC
If IDs are available then writing the query is easy, but what can be used when there are only name-matching patterns?
For instance, the convention is that the second name of a son is the first name of his father, the third name of the son is the first name of his grandfather, and the fourth name is the first name of his great-grandfather. This rule applies to all names; that's why we have four name fields other than the family name.
I have a table with following fields
CREATE TABLE KEEN_RELATIONSHIP
( "TESTID" NUMBER,
"AGE" NUMBER(3,0),
"FIRSTNAME" VARCHAR2(25 BYTE),
"SECONDNAME" VARCHAR2(25 BYTE),
"THIRDNAME" VARCHAR2(25 BYTE),
"FORTHNAME" VARCHAR2(25 BYTE),
"FAMILYNAME" VARCHAR2(25 BYTE));
And have 39 following records in this table
-- INSERTING into KEEN_RELATIONSHIP
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961527492,47,'Mishal','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961562389,47,'Waleed','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962101220,42,'Khalid','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964338909,30,'Faisal','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961746095,44,'Abdullah','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960785299,56,'Suliman','Salah','Suliman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960986913,55,'Abdul Aziz','Salah','Suliman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962749770,38,'Ali','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962579046,39,'Maman','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961747704,45,'Bader','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962145270,43,'Khalid','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960714745,66,'Ibrahim','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960792751,57,'Mohd','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960111460,82,'Ali','Abdul Mohsin','Mohd',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961886076,44,'Abdul Latif','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964355027,33,'Abdul Aziz','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963164687,36,'Abdul Mohsin','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960519063,72,'Baderia','Ali',null,null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963027715,36,'Shekha','Ibrahim','Abdul Rehman','Salah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960985516,52,'Noora','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961282916,50,'Fozia','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962082670,64,'Naja','Rashid','Abdul Rehman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961971436,45,'Mana','Ali','Salah',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962094798,82,'Munira','Abdul Mohsin','Mohd',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964127289,35,'Alia','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961377460,48,'Nawal','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (967763537,23,'Noora','Abdullah','Mohd','Ali Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (967775024,23,'Shekha','Bader','Abdul Aziz','Abdul Mohsin','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962214823,41,'Khlood','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960958623,54,'Baderia','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961775619,62,'Wasam','Ali','Abdullah',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961541876,51,'Maryaam','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962343452,61,'Husa','Al Salah','Al Mansoor',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (966099168,26,'Afnan','Ibrahim','Mohd','Ibrahim Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961669035,45,'Mana','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961076755,51,'Shekha','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960791757,91,'Lulu','Ibrahim','Al Awada',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (966364107,30,'Ghzlan','Mohd','Ali','Abdul Mohsin Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963587153,73,'Latifa','Abdul Rehman','Salah',null,'Jaan');
April 28, 2008 - 9:37 am UTC
... If Ids are available then the query writing gets easy ...
no it doesn't; in fact, it can be easier to get wrong - but we digress.
there is no question here as far as I can see - at all.
and if the rule were "my name is derived from this rule", I would have a hierarchy - NOT a flat table like this.
Meaning, I have my name, my father has his name and so on - and if you want to see my full name, we retrieve the data on the fly.
A reader, May 20, 2008 - 7:03 pm UTC
create table fas157_mref_inp_load_status_b
(Load_id integer,
Load_Status char(1),
month_id integer,
updated_dt date default sysdate,
updated_by varchar2(4000) default sys_context('USERENV','OS_USER'));
alter table fas157_mref_inp_load_status add constraint fas157_mref_inp_load_status_pk
primary key(Load_id);
create table fas157_mref_out_load_status_b
(Load_id integer,
Load_Status char(1),
month_id integer,
updated_dt date default sysdate,
updated_by varchar2(4000) default sys_context('USERENV','OS_USER'));
alter table fas157_mref_out_load_status add constraint fas157_mref_out_load_status_pk
primary key(Load_id);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (2, null, 200804, to_date('16-05-2008 15:46:52', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (3, 'P', 200804, to_date('09-05-2008 15:58:44', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (1, 'P', 200804, to_date('09-05-2008 15:12:21', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (4, 'p', 200804, to_date('20-05-2008 09:12:21', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (5, 'P', 200804, to_date('20-05-2008 09:11:21', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (6, 'P', 200804, to_date('09-05-2008 15:59:44', 'dd-mm-yyyy hh24:mi:ss'), null);
commit;
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (86, 'P', 200804, to_date('14-05-2008 09:56:51', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (87, 'P', 200804, to_date('20-05-2008 11:56:51', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (88, 'P', 200804, to_date('19-05-2008 11:56:51', 'dd-mm-yyyy hh24:mi:ss'), null);
commit;
DESC fas157_mref_INP_load_status_b
Name Null? Type
----------------------------------------- -------- ----------------------------
LOAD_ID NUMBER(38)
LOAD_STATUS CHAR(1 CHAR)
MONTH_ID NUMBER(38)
UPDATED_DT DATE
UPDATED_BY VARCHAR2(4000 CHAR)
desc FAS157_MREF_OUT_LOAD_STATUS_B
Name Null? Type
----------------------------------------- -------- ----------------------------
LOAD_ID NUMBER(38)
LOAD_STATUS CHAR(1 CHAR)
MONTH_ID NUMBER(38)
UPDATED_DT DATE
UPDATED_BY VARCHAR2(4000 CHAR)
For all load_ids in output table(FAS157_MREF_OUT_LOAD_STATUS_B
) with load_status='P', I would like to get the load_id in input table (FAS157_MREF_INP_LOAD_STATUS_B) whose updated_dt is nearest to and less than the updated_dt in output table. Can we do this in a single SQL ?
thanks
Christian
May 20, 2008 - 10:00 pm UTC
ops$tkyte%ORA10GR2> select out.*,
2 (select max( load_id ) keep ( dense_rank first order by inp.updated_dt desc )
3 from FAS157_MREF_INP_LOAD_STATUS_B inp
4 where inp.updated_dt <= out.updated_dt ) that_load_id
5 from FAS157_MREF_OUT_LOAD_STATUS_B out
6 where out.load_status = 'P'
7 /
LOAD_ID L MONTH_ID UPDATED_DT UPDATED_BY THAT_LOAD_ID
---------- - ---------- -------------------- ---------- ------------
86 P 200804 14-may-2008 09:56:51 mamarti 6
87 P 200804 20-may-2008 11:56:51 4
88 P 200804 19-may-2008 11:56:51 2
A reader, May 20, 2008 - 7:06 pm UTC
Correction above
alter table fas157_mref_inp_load_status_b add constraint fas157_mref_inp_load_status_pk
primary key(Load_id);
alter table fas157_mref_out_load_status_b add constraint fas157_mref_out_load_status_pk
primary key(Load_id);
Reader, May 21, 2008 - 12:59 pm UTC
create table test
(id varchar2(10),
dt date,
amt number);
insert into test
values
('ABC', to_date('01/02/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/02/2008','mm/dd/yyyy'),40);
insert into test
values
('ABC', to_date('01/03/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/04/2008','mm/dd/yyyy'),50);
insert into test
values
('ABC', to_date('01/05/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/06/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/07/2008','mm/dd/yyyy'),0);
--
insert into test
values
('PQR', to_date('01/02/2008','mm/dd/yyyy'),30);
insert into test
values
('PQR', to_date('01/03/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/04/2008','mm/dd/yyyy'),80);
insert into test
values
('PQR', to_date('01/05/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/06/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/07/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/08/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/09/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/10/2008','mm/dd/yyyy'),30);
select * from test;
ID DT AMT
---------- --------- ----------
ABC 02-JAN-08 20
ABC 02-JAN-08 40
ABC 03-JAN-08 0
ABC 04-JAN-08 50
ABC 05-JAN-08 0
ABC 06-JAN-08 20
ABC 07-JAN-08 0
PQR 02-JAN-08 30
PQR 03-JAN-08 0
PQR 04-JAN-08 80
PQR 05-JAN-08 0
PQR 06-JAN-08 10
PQR 07-JAN-08 0
PQR 08-JAN-08 10
PQR 09-JAN-08 0
PQR 10-JAN-08 30
16 rows selected.
I need to find sum for the last three days for each id
For example
if today is 1/4/2008, I need to find the sum of 1/2/2008, 1/3/2008 and 1/4/2008
if today is 1/5/2008, I need to find the sum of 1/5/2008, 1/4/2008 and 1/3/2008
Can you please show me how to do this using analytic functions?
May 21, 2008 - 3:14 pm UTC
that would not involve analytics at all. It is a rather simple "where" and group by.
ops$tkyte%ORA10GR2> variable dt varchar2(20)
ops$tkyte%ORA10GR2> exec :dt := '1/4/2008'
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select id, min(dt), max(dt), sum(amt)
2 from test where dt between to_date( :dt, 'mm/dd/yyyy')-2 and to_date( :dt, 'mm/dd/yyyy')
3 group by id;
ID MIN(DT) MAX(DT) SUM(AMT)
---------- --------- --------- ----------
PQR 02-JAN-08 04-JAN-08 110
ABC 02-JAN-08 04-JAN-08 110
ops$tkyte%ORA10GR2> exec :dt := '1/5/2008'
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> /
ID MIN(DT) MAX(DT) SUM(AMT)
---------- --------- --------- ----------
PQR 03-JAN-08 05-JAN-08 80
ABC 03-JAN-08 05-JAN-08 50
A reader, May 21, 2008 - 1:46 pm UTC
thanks
Christian
Reader, May 21, 2008 - 3:56 pm UTC
Tom,
Sorry missed to mention this in my previous question.
We have data since 2004. I need to get the sum for all dates. I need to create a view for this, so that I can get data for any date whenever I query.
If the date is 05/07/2004, I should sum the data for 05/05/2004, 05/06/2004 and 05/07/2004.
If the date is 05/06/2004, I should sum the data for 05/04/2004, 05/05/2004 and 05/06/2004, and so on. I am not sure if this is called a moving sum. Please advise.
Reader, May 21, 2008 - 4:28 pm UTC
With regard to the above question -
if I do dt-2, it will include weekends and holidays as well. In the test table, I do not store data for weekends and holidays.
May 21, 2008 - 5:21 pm UTC
ok, so that is different - it would/could be an analytic with a range window - but there is no such thing as "a holiday" - what do you need to have happen here, and what do you have existing that will help make it happen?
Vaibhav, May 22, 2008 - 12:39 am UTC
hi tom,
I had to write a query that would give me the number of errors that occurred for a particular error code over the past 3 hours, split into 15-minute intervals...
e.g.: if it is 12 o'clock now, then I would want the number of errors that occurred between 9:00 and 9:15, 9:15 and 9:30 ... 11:45 and 12:00 for each error code... I went through your site and could write the following query
SELECT *
FROM (SELECT *
FROM (SELECT me.error_code,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) AND (p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') THEN 1 ELSE NULL END) in1,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') THEN 1 ELSE NULL END) in2,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') THEN 1 ELSE NULL END) in3,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') THEN 1 ELSE NULL END) in4,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') THEN 1 ELSE NULL END) in5,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') THEN 1 ELSE NULL END) in6,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') THEN 1 ELSE NULL END) in7,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') THEN 1 ELSE NULL END) in8,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') THEN 1 ELSE NULL END) in9,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') THEN 1 ELSE NULL END) in10,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') THEN 1 ELSE NULL END) in11,
COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(180, 'MINUTE') THEN 1 ELSE NULL END) in12,
SUM (COUNT(1)) OVER (PARTITION BY error_code ORDER BY error_code) total_count
FROM mhs_errors me, mhs_transaction_status mts
WHERE me.transaction_id = mts.transaction_id
AND me.error_timestamp BETWEEN p_error_date - 3/24 AND p_error_date
AND status_type = p_status_type
GROUP BY me.error_code)
ORDER BY total_count desc)
WHERE ROWNUM <= NVL(p_batchsize, 10)
I actually used a function that was called 12 times for each error code; then I went through your queries, came to know about analytic functions, and only then could I write the above query.
I want to know if there is a better way to write this query.
please help
May 22, 2008 - 7:10 am UTC
I have no idea what to do here. No create tables, no inserts...
I don't know why you have so many columns for such a simple return - seems like it would have three columns in the output:
time (rounded to 15 minute intervals)
error code
count(*)
grouped by time, error code
no analytics.
Given that is what I think - you must not be presenting your requirements.... sorry.
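The three-column shape described above can be sketched like this, assuming the posted MHS_ERRORS table and a :p_error_date bind variable (the bucket number counts 15-minute intervals from the start of the 3-hour window; this is a sketch of the idea, not a drop-in replacement for the poster's procedure):

```sql
-- Bucket 0 = first 15 minutes of the window, ..., bucket 11 = the last.
-- (date1 - date2) is in days; * 24 * 4 converts to quarter-hours.
select trunc( (cast(me.error_timestamp as date)
                 - (:p_error_date - 3/24)) * 24 * 4 ) bucket#,
       me.error_code,
       count(*) cnt
  from mhs_errors me
 where me.error_timestamp >  :p_error_date - 3/24
   and me.error_timestamp <= :p_error_date
 group by trunc( (cast(me.error_timestamp as date)
                    - (:p_error_date - 3/24)) * 24 * 4 ),
          me.error_code
 order by bucket#, error_code;
```

If the fixed 12-column layout is truly required for the report, the result of this query can be pivoted afterwards; the aggregation itself needs no analytics.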
Can try this
Ayan, May 22, 2008 - 3:39 am UTC
You can write a view as follows:
select sysdate,
(trunc(sysdate,'hh')-3/24)+r*15/(24*60) start_dt,
(trunc(sysdate,'hh')-3/24)+(r+1)*15/(24*60) end_dt
from(
select sysdate,rownum r
from user_objects
where rownum<13
)
and then compare your data as
error_timestamp >= start_dt and error_timestamp<end_dt
and then do the grouping.
I don't see a use for analytic functions here, but Tom is the best one to answer.
Reader, May 22, 2008 - 10:01 am UTC
create table test
(id varchar2(10),
dt date,
amt number);
insert into test
values
('ABC', to_date('01/10/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/11/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/14/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/15/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/16/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/18/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/22/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/11/2008','mm/dd/yyyy'),30);
insert into test
values
('PQR', to_date('01/14/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/15/2008','mm/dd/yyyy'),80);
insert into test
values
('PQR', to_date('01/16/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/18/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/22/2008','mm/dd/yyyy'),0);
commit;
01/15/2008 is tuesday
01/14/2008 is monday
01/11/2008 is friday
I have zero for amount on all the above days for id = ABC. I do not store data for weekends and holidays in the test table.
When I do the 3-day sum using the windowing option, I get 1E-34 instead of zero. I am not sure why I am getting 1E-34. Please advise.
select dt
,id
,sum(amt) as sum_daily
,sum(sum(amt)) OVER (PARTITION BY id order by dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as SUM_3d
from test
group by dt, id
order by dt;
DT ID SUM_DAILY SUM_3D
--------- ---------- ---------- ----------
11-JAN-08 ABC 0 0
11-JAN-08 PQR 30 30
14-JAN-08 ABC 0 0
14-JAN-08 PQR 0 30
15-JAN-08 PQR 80 110
15-JAN-08 ABC 0 1E-34
16-JAN-08 ABC 20 20
16-JAN-08 PQR 0 80
18-JAN-08 PQR 10 90
18-JAN-08 ABC 0 20
22-JAN-08 PQR 0 10
22-JAN-08 ABC 0 20
12 rows selected.
May 23, 2008 - 8:01 am UTC
I cannot reproduce.
any other information here - like platform, precise version
Vaibhav, May 23, 2008 - 12:52 am UTC
Hi
Thanks for that quick reply
I apologise for not providing any create and insert scripts.
The reason I am using PARTITION BY is that I want the total count as well for that particular error code.
Kindly see below the kind of output that I am looking for.
Also, Ayan mentioned using TRUNC(error_date, 'HH');
I cannot use this because it will round the error timestamp, and I don't want that.
What I want is: if the user has passed the error date as 2008/22/05 12:35:56, then
I should give him the error counts as
ERROR_CODE IN1 IN2 IN3 IN4 IN5 IN6 IN7 IN8 IN9 IN10 IN11 IN12 TOTAL_COUNT
err1 0 0 0 0 0 0 0 0 0 2 0 5 7
err2 0 0 0 0 0 0 0 0 0 0 0 5 5
err3 0 0 0 0 0 0 0 0 0 0 1 5 6
err4 0 0 0 0 0 0 0 1 0 0 0 8 9
here IN1 = 09:35:56 to 09:50:56
IN2 = 09:50:56 to 10:05:56
IN3 = 10:05:56 to 10:20:56
and so on...the last interval would be
IN12 = 12:20:56 to 12:35:56
the last column TOTAL_COUNT would give me the total count for that error code for that 3 hours ie 09:35:56 to 12:35:56
Here are the scripts:
CREATE TABLE mhs_errors
(
error_id NUMBER(10),
transaction_id VARCHAR2(36) NOT NULL,
error_code VARCHAR2(50) NOT NULL,
error_severity VARCHAR2(10),
error_timestamp TIMESTAMP,
error_text VARCHAR2(700)
);
ALTER TABLE mhs_errors
ADD CONSTRAINT pk_mhs_errors PRIMARY KEY(error_id);
Insert into MHS_ERRORS Values (2, 'tran1', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (3, 'tran2', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (4, 'tran3', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (5, 'tran4', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (6, 'tran5', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (7, 'tran6', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (8, 'tran7', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (9, 'tran8', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (10, 'tran9', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (11, 'tran10', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (12, 'tran11', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (13, 'tran12', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (14, 'tran13', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (15, 'tran14', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (16, 'tran15', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (17, 'tran16', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (18, 'tran17', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (19, 'tran18', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (20, 'tran19', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (21, 'tran20', 'err4', 'fatal', systimestamp, 'dummy error text');
COMMIT;
SELECT *
FROM (SELECT *
FROM (SELECT me.error_code,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE')
THEN 1 ELSE NULL END) in1,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE')
THEN 1 ELSE NULL END) in2,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE')
THEN 1 ELSE NULL END) in3,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE')
THEN 1 ELSE NULL END) in4,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE')
THEN 1 ELSE NULL END) in5,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE')
THEN 1 ELSE NULL END) in6,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE')
THEN 1 ELSE NULL END) in7,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE')
THEN 1 ELSE NULL END) in8,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE')
THEN 1 ELSE NULL END) in9,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE')
THEN 1 ELSE NULL END) in10,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE')
THEN 1 ELSE NULL END) in11,
COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(180, 'MINUTE')
THEN 1 ELSE NULL END) in12,
SUM (COUNT(1)) OVER (PARTITION BY error_code ORDER BY error_code) total_count
FROM mhs_errors me
WHERE me.error_timestamp BETWEEN &&p_error_date - 3/24 AND &&p_error_date
GROUP BY me.error_code)
ORDER BY total_count desc)
WHERE ROWNUM <= NVL(:p_batchsize, 10)
I get the following output after executing the above query:
ERROR_CODE IN1 IN2 IN3 IN4 IN5 IN6 IN7 IN8 IN9 IN10 IN11 IN12 TOTAL_COUNT
err1 0 0 0 0 0 0 0 0 0 0 0 5 5
err2 0 0 0 0 0 0 0 0 0 0 0 5 5
err3 0 0 0 0 0 0 0 0 0 0 0 5 5
err4 0 0 0 0 0 0 0 0 0 0 0 5 5
Kindly guide me how to write this in a better way
thanks in advance
May 23, 2008 - 9:02 am UTC
I see nothing wrong with your existing query.
You have every function / column you want in the output - your counts are correct as is.
You are done. (well, use a bind variable in real life, no &&p_error_date of course, and always - ALWAYS - use TO_DATE on a string to convert it to a date, using the right date format as well)
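As a side note (my sketch, not from the thread): the twelve hand-written BETWEEN predicates can be collapsed into one bucket-index computation - subtract the window start and integer-divide by the interval width. The same arithmetic works in SQL with TRUNC on the elapsed interval. All names below are made up for illustration.

```python
from datetime import datetime, timedelta

def bucket_counts(timestamps, window_end, n_buckets=12, minutes=15):
    """Count timestamps into n_buckets fixed-width intervals ending at window_end."""
    start = window_end - timedelta(minutes=n_buckets * minutes)
    counts = [0] * n_buckets
    for ts in timestamps:
        if start < ts <= window_end:
            idx = int((ts - start).total_seconds() // (minutes * 60))
            counts[min(idx, n_buckets - 1)] += 1  # clamp ts == window_end into last bucket
    return counts

end = datetime(2008, 5, 22, 12, 35, 56)
hits = [datetime(2008, 5, 22, 12, 30, 0),   # falls in IN12
        datetime(2008, 5, 22, 9, 40, 0)]    # falls in IN1
print(bucket_counts(hits, end))
```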
Vaibhav, May 23, 2008 - 12:44 pm UTC
Hi
Thanks for that piece of advice...
I will make sure I am using the TO_DATE function henceforth...
Now I will tell you what the actual business scenario would be...
On the screen, the user has an option to select the time interval...
i.e. last 3 hours, last 6 hours, last 12 hours, last 24 hours
The only difference is: for 3 hours the error interval is 15 mins, i.e. for each error code we get 4 error counts per hour, making 12 error counts for 3 hours (12 cols + 1 total count = 13 cols in o/p)
For 6 hours, the interval is again 15 min, therefore for each error code we have 6*4=24 error counts (24 cols in o/p)
For 12 hours, the interval is 30 min, therefore for each error code we have 12*2=24 error counts (24 cols in o/p)
For 24 hours, the interval is 1 hour, therefore for each error code we have 24*1=24 error counts (24 cols in o/p)
What I have done is use an IF condition, rewriting the same query 4 times with different time comparisons...
Is there a way by which I can write just one select query and still get the desired output...
Kindly help me with this..
Thanks a million again for all the help that you provide to people...
And yes, I have been reading throughout that you have written a book on analytics...
Where can I purchase it from?
Let me know about this
May 23, 2008 - 6:19 pm UTC
... is there a way by which i can write just one select query and still get the
desired output...
...
a query has a fixed number of columns - unless you want to include nulls in the other queries, no. the single query would have a SINGLE set of columns
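One sketch of the alternative this answer implies (my illustration, not code from the thread): if the fixed-column constraint pinches, return one row per (error_code, bucket) from SQL and pivot client-side, so a single query shape serves the 3/6/12/24-hour variants with different bucket widths.

```python
from collections import defaultdict

def pivot_counts(rows, n_buckets):
    """rows: iterable of (error_code, bucket_index) pairs.
    Returns {error_code: [per-bucket counts..., total]}."""
    table = defaultdict(lambda: [0] * (n_buckets + 1))  # last slot = total
    for code, bucket in rows:
        table[code][bucket] += 1
        table[code][n_buckets] += 1
    return dict(table)

print(pivot_counts([('err1', 11), ('err1', 9), ('err2', 11)], 12))
```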
Vaibhav, May 24, 2008 - 1:52 am UTC
Thanks...
I was just wondering if there is any way by which we can filter out the number of columns dynamically... it's not possible, as I don't want null values in those undesired columns... thanks for confirming this...
Tom... I asked you about your book Expert One-on-One.
I surfed some sites like amazon.com and all...
but I am from India... how do I purchase the book here?
May 24, 2008 - 7:00 pm UTC
for books: take ISBN to any reputable bookstore and say "I want this book"
expert one on one is out of print, effective oracle by design and expert oracle database architecture are not.
a sql query has a FIXED NUMBER OF COLUMNS, period.
Vaibhav, May 25, 2008 - 2:27 am UTC
Thanks...
I will grab a copy of the available books asap...
Min And Max of date ranges
Shivdeep Modi, May 27, 2008 - 8:16 am UTC
Hi,
I've got something like
create table tsreport_date ( report_date date);
insert into tsreport_date values(to_date('09-MAY-2008 17:52:22','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('10-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('11-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('12-MAY-2008 10:00:06','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('13-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('14-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('15-MAY-2008 10:00:05','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('16-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('17-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('18-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('19-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('20-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('21-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('22-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('23-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('23-MAY-2008 14:46:27','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('24-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('25-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('26-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('27-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('01-JUN-2008 11:03:16','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('02-JUN-2008 11:03:21','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('03-JUN-2008 11:03:23','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('26-JUN-2008 11:05:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('01-JUL-2008 11:05:10','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('02-JUL-2008 11:16:25','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('03-JUL-2008 11:16:27','DD-MON-YYYY HH24:MI:SS'));
select to_char(report_date,'MON-YYYY'), min(report_date), max(report_date)
from tsreport_date
group by to_char(report_Date,'MON-YYYY')
/
TO_CHAR( MIN(REPORT_DATE) MAX(REPORT_DATE)
-------- -------------------- --------------------
JUL-2008 01-JUL-2008 11:05:10 03-JUL-2008 11:16:27
MAY-2008 09-MAY-2008 17:52:22 27-MAY-2008 10:00:03
JUN-2008 01-JUN-2008 11:03:16 26-JUN-2008 11:05:01
3 rows selected.
To get the min and max dates of a month, is it possible to get the result using analytics?
If yes which one of them would be the best?
Regards,
Shivdeep
May 27, 2008 - 8:32 am UTC
ops$tkyte%ORA10GR2> select to_char(trunc(report_date,'mm'),'MON-YYYY'), min(report_date), max(report_date)
2 from tsreport_date
3 group by trunc(report_Date,'mm')
4 order by trunc(report_date,'mm')
5 /
TO_CHAR( MIN(REPORT_DATE) MAX(REPORT_DATE)
-------- -------------------- --------------------
MAY-2008 09-may-2008 17:52:22 27-may-2008 10:00:03
JUN-2008 01-jun-2008 11:03:16 26-jun-2008 11:05:01
JUL-2008 01-jul-2008 11:05:10 03-jul-2008 11:16:27
that would be the more efficient way - do not convert a date into a string for operations on it - use TRUNC whenever possible.
Analytics would not be appropriate here - you want to aggregate - you only want three records output - analytics do not "aggregate", aggregation does.
trunc(date,'mm') is a lot less cpu intensive than to_char(date,'mon-yyyy') - does the same thing and is SORTABLE (mon-yyyy isn't sortable at all, you'd have to to_date that string to make it a date again in order to sort!!!)
use to_char on a date only for formatting in a report - not as a way to truncate.
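A small sketch of the sortability point, in Python purely for illustration: a truncated date sorts chronologically, while a 'MON-YYYY' string sorts alphabetically.

```python
from datetime import date

def month_trunc(d):
    # analogous to TRUNC(report_date, 'mm')
    return date(d.year, d.month, 1)

dates = [date(2008, 5, 9), date(2008, 6, 1), date(2008, 7, 3)]

chronological = sorted({month_trunc(d) for d in dates})
# %b abbreviations assume the default C locale (May/Jun/Jul)
alphabetical = sorted({d.strftime('%b-%Y').upper() for d in dates})

print(chronological)  # May, June, July - the order you want
print(alphabetical)   # JUL, JUN, MAY - alphabetical, not chronological
```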
Min And Max of date ranges
Shivdeep Modi, May 27, 2008 - 11:30 am UTC
Thanks for the advice to use trunc. I was not aware that to_char was not the best option.
Regards,
Shivdeep
Vaibhav, May 28, 2008 - 12:17 am UTC
Hey Tom,
I found both the books,
Expert One-on-One Oracle and Effective Oracle by Design
they will reach me in 10 days
I am very excited
Karteek, May 28, 2008 - 9:51 am UTC
Tom,
col1 is a grouping column, col2 is actually the result of the row_number() analytic function on an inner query (not shown), and col3 is something like a flag that tells about col2.
create table test(col1 varchar2(2), col2 number(5), col3 number(5));
insert into test values('A', 1, NULL);
insert into test values('A', 2, 2);
insert into test values('A', 3, 3);
insert into test values('A', 4, 4);
insert into test values('A', 5, NULL);
insert into test values('A', 6, 6);
insert into test values('A', 7, NULL);
insert into test values('A', 8, 8);
insert into test values('A', 9, 9);
A 1
A 2 2
A 3 3
A 4 4
A 5
A 6 6
A 7
A 8 8
A 9 9
Output should be like...
A 2
A 8
The condition here is: I need to consider only the first occurrence of each sequence of at least 2 consecutive numbers (2,3,4 - 1st; 8,9 - 2nd). Hope I am clear!
Thanks Tom!
May 28, 2008 - 10:11 am UTC
ops$tkyte%ORA10GR2> select *
2 from (
3 select col1, col2, col3,
4 lag(col3) over (partition by col1 order by col2) last_col3,
5 lead(col3) over (partition by col1 order by col2) next_col3
6 from test
7 )
8 where last_col3 is null and next_col3 = col3+1
9 /
CO COL2 COL3 LAST_COL3 NEXT_COL3
-- ---------- ---------- ---------- ----------
A 2 2 3
A 8 8 9
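The same lag/lead filter, sketched procedurally in Python for illustration (a single group is assumed; the PARTITION BY col1 just repeats this per group):

```python
def first_of_runs(rows):
    """rows: (col1, col2, col3) tuples already ordered by col2.
    Keep col2 where the previous col3 is NULL (None) and the next
    col3 equals col3 + 1 - i.e. the start of a run of length >= 2."""
    keep = []
    for i, (_, col2, col3) in enumerate(rows):
        prev = rows[i - 1][2] if i > 0 else None
        nxt = rows[i + 1][2] if i + 1 < len(rows) else None
        if prev is None and col3 is not None and nxt == col3 + 1:
            keep.append(col2)
    return keep

data = [('A', 1, None), ('A', 2, 2), ('A', 3, 3), ('A', 4, 4),
        ('A', 5, None), ('A', 6, 6), ('A', 7, None), ('A', 8, 8),
        ('A', 9, 9)]
print(first_of_runs(data))  # [2, 8]
```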
Analytics
Shivdeep Modi, June 09, 2008 - 12:34 pm UTC
Hi,
I need to formulate drop statement for the package/package bodies.
I've got something like:
select owner,object_name,object_type
from dba_objects
where object_type like 'PACKAGE%'
and owner = 'NCLDBA'
order by object_name
/
Owner Object Name Object Type
--------------- -------------------------- ---------------
NCLDBA TEST PACKAGE BODY
NCLDBA TEST PACKAGE
NCLDBA TEST2 PACKAGE
NCLDBA TEST3 PACKAGE BODY
4 rows selected.
Now I want to formulate drop statements for the above.
The result set should be:
drop package test;
drop package test2;
drop package body test3;
I think this can be done using analytic functions. However I get the same rank for package/package body combination. What is the mistake here?
select 'drop '||object_type||' '||owner||'.'||object_name||';' Statement,
object_name,
object_type,
rank() over (partition by object_name order by object_name) as rank
from dba_objects
where object_type like 'PACKAGE%'
and owner = 'NCLDBA'
order by object_name
/
STATEMENT Object Name Object Type RANK
-------------------------------- ------------ ------------- -----
drop PACKAGE BODY NCLDBA.TEST; TEST PACKAGE BODY 1
drop PACKAGE NCLDBA.TEST; TEST PACKAGE 1
drop PACKAGE NCLDBA.TEST2; TEST2 PACKAGE 1
drop PACKAGE BODY NCLDBA.TEST3; TEST3 PACKAGE BODY 1
June 09, 2008 - 2:19 pm UTC
ops$tkyte%ORA10GR2> select * from (
2 select 'drop '||object_type||' '||owner||'.'||object_name||';' Statement,
3 object_name,
4 object_type,
5 rank() over (partition by object_name order by object_type) as rank
6 from dba_objects
7 where object_type like 'PACKAGE%'
8 and owner = user
9 ) where rank = 1
10 order by object_name
11 /
STATEMENT OBJECT_NAME OBJECT_TYPE RANK
----------------------------------- ------------------------------ ------------------- ----------
drop PACKAGE OPS$TKYTE.TEST; TEST PACKAGE 1
drop PACKAGE OPS$TKYTE.TEST2; TEST2 PACKAGE 1
drop PACKAGE BODY OPS$TKYTE.TEST3; TEST3 PACKAGE BODY 1
In an analytic, if what you order by isn't UNIQUE within the partition - neither row is "first", they are considered the same, a tie - so order by object type to make it unique.
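The tie-breaking fix, sketched in Python for illustration: within each object name, order by object type so exactly one row ranks first ('PACKAGE' sorts before 'PACKAGE BODY').

```python
def drop_statements(objects, owner):
    """objects: list of (object_name, object_type) pairs.
    Pick the rank-1 type per name by sorting on (name, type)."""
    best = {}
    for name, otype in sorted(objects):
        best.setdefault(name, otype)  # first type seen per name = rank 1
    return ['drop %s %s.%s;' % (t, owner, n) for n, t in sorted(best.items())]

objs = [('TEST', 'PACKAGE BODY'), ('TEST', 'PACKAGE'),
        ('TEST2', 'PACKAGE'), ('TEST3', 'PACKAGE BODY')]
print(drop_statements(objs, 'NCLDBA'))
```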
Analytics
Shivdeep Modi, June 10, 2008 - 4:31 am UTC
Thanks for the solution.
Regards,
Shivdeep
Is there a way to do this using Analytics ?
Rajesh Srivastava, September 17, 2008 - 11:55 pm UTC
Hi Tom,
I have a Table with the following structure
Table : TAB1
---------------------------------------------------------
Client HQ-Country Engagement-Country Revenue
---------------------------------------------------------
C1 US UK 100
C1 US US 200
C1 US Japan 50
C2 US India 100
C2 US Japan 50
-------------------------------------------------------
What we want to do is to select all records for those clients where NO record exists with HQ_COUNTRY = ENGAGEMENT_COUNTRY. So the three records of C1 should not be selected and the two records for C2 should be selected.
I can think of :
select * from TAB1
where client_id NOT IN
(select distinct client_id from TAB1
where HQ_COUNTRY = ENGAGEMENT_COUNTRY)
but it looks clunky and I have a feeling it will perform badly
(TAB1 is large and is a complex view)
Since the query is about checking a condition in a partition, I think analytics with its PARTITION clause should have a magic bullet - but I can't come up with it ...
Any pointers would be much appreciated
September 18, 2008 - 7:58 am UTC
there are no silver bullets.
this can be done with analytics, but if tab1 is a large complex view that takes a long time to execute - well - it'll still take a long time to execute
select *
from (select tab1.*,
count( case when hq_country = engagement_country then 1 end)
over (partition by client_id) cnt
from tab1 )
where cnt = 0;
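In procedural terms (a Python sketch of the same idea, for illustration): count, per client, the rows where HQ equals engagement country, then keep only the rows of clients whose count is zero.

```python
from collections import defaultdict

def clients_with_no_home_engagement(rows):
    """rows: (client, hq_country, engagement_country, revenue) tuples."""
    cnt = defaultdict(int)
    for client, hq, eng, _ in rows:
        cnt[client] += (hq == eng)            # the COUNT(CASE ...) OVER (...)
    return [r for r in rows if cnt[r[0]] == 0]  # the WHERE cnt = 0

tab1 = [('C1', 'US', 'UK', 100), ('C1', 'US', 'US', 200),
        ('C1', 'US', 'Japan', 50), ('C2', 'US', 'India', 100),
        ('C2', 'US', 'Japan', 50)]
print(clients_with_no_home_engagement(tab1))  # only the two C2 rows
```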
Above logic did not work for me..
Ashish, September 25, 2008 - 11:10 am UTC
Hi Tom,
Thank you for your time! The above query did not work for me - please help!
SELECT pc_no,av_cd,jcd,rga,rgm
from (
select pc_no,av_cd,substr(jct,1,5)||'00' jcd,rga,rgm,
count (case when (nvl(rga,0)>0 OR nvl(rgm,0)>0) then 1 end)
over (partition by av_cd, pc_no, SUBSTR(jct,1,5)||'00') cnt
from ca_cd ca
where av_cd = 'T2'
)
where cnt=1
and avs = '0000422'
September 25, 2008 - 3:38 pm UTC
well, short of giving you another random piece of logic - how about you actually sit down and tell us exactly what it is you are trying to do??????????
Seriously - look at this for a second and ask yourself "how could any reasonable person read what I wrote and know what I need, I didn't tell them what I needed - I only told them that the answer to someone else's question does not work for me"
Basically, all you have done is ruled out one solution from an infinite set of all solutions to all problems. We could be here a while throwing out logic/ algorithms and never get to what you need.
So, it will go a lot better if.... you stated precisely what you are trying to do
and remember - create table, insert into - a test case - would be necessary if you want sql that actually compiles.
Lag/Lead Windowing...
J, September 25, 2008 - 2:02 pm UTC
Hello Tom,
Here's a situation that we find ourselves up against. We're hopelessly stuck and are hoping you can advise us on how best to proceed (thanks!)
The setup:
==========
create table jt
(cust_cd varchar2(20),
visit_id1 number(20),
visit_id2 number(8,2),
visit_dt date,
c_flag varchar2(1),
l_flag varchar2(1));
insert into jt values ('P1',1,60961,to_date('11/27/2007','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',2,60964.02,to_date('11/30/2007','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',3,61055,to_date('02/29/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',4,61055.01,to_date('02/29/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',5,61072,to_date('03/03/2008','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',6,61100,to_date('04/14/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',7,61121,to_date('05/15/2008','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',8,61163,to_date('06/16/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',9,61163.01,to_date('06/16/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',10,61163.02,to_date('06/16/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P2',11,61105,to_date('03/18/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P2',12,62100,to_date('03/25/2008','MM/DD/YYYY'),'Y','N');
commit;
========
Here's the objective: For each Customer's "C" visit (C_Flag = 'Y'), find the *nearest* "L" visit(s) (L_Flag = 'Y') by Visit_Dt - backwards two weeks or forwards two weeks in time and associate the Visit_Id1's.
========
So the "C" visits are the pivot points. The sorting order is: Cust_Cd, Visit_Dt, Visit_Id2. Just to make it more complicated, in the sample data, Visit_Id1 # 8 is a C-Visit. It's also an L-Visit. And... Visits 9 and 10 are on the same day as #9. The number of associated L-Visits for a given C-Visit is three.
The resulting set should look like:
Cust ID C-Visit ID L-Visit ID1 L-Visit ID2 L-Visit ID3
------- ---------- ----------- ----------- -----------
P1 1 2 (null) (null)
P1 3 3 4 (null)
P1 5 3 4 (null)
P1 6 6 (null) (null)
P1 7 (null) (null) (null)
P1 8 8 9 10
P2 12 11 (null) (null)
Can this be answered through some nifty analytics?
Three of us have been wrestling with this all week long and can't figure out an approach - Any assistance or advice would be most appreciated.
Best Regards,
- J
September 25, 2008 - 4:35 pm UTC
here was my thought process on this:
ops$tkyte%ORA10GR2> select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
2 decode( l_flag, 'Y', visit_dt ) curr_l,
3 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
4 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
5 from jt;
CUST_CD VISIT_ID1 VISIT_ID2 VISIT_DT C L CURR_L LAST_L NEXT_L
-------------------- ---------- ---------- --------- - - --------- --------- ---------
P1 1 60961 27-NOV-07 Y N 30-NOV-07
P1 2 60964.02 30-NOV-07 N Y 30-NOV-07 30-NOV-07 30-NOV-07
P1 3 61055 29-FEB-08 Y Y 29-FEB-08 29-FEB-08 29-FEB-08
P1 4 61055.01 29-FEB-08 N Y 29-FEB-08 29-FEB-08 29-FEB-08
P1 5 61072 03-MAR-08 Y N 29-FEB-08 14-APR-08
P1 6 61100 14-APR-08 Y Y 14-APR-08 14-APR-08 14-APR-08
P1 7 61121 15-MAY-08 Y N 14-APR-08 16-JUN-08
P1 8 61163 16-JUN-08 Y Y 16-JUN-08 16-JUN-08 16-JUN-08
P1 9 61163.01 16-JUN-08 N Y 16-JUN-08 16-JUN-08 16-JUN-08
P1 10 61163.02 16-JUN-08 N Y 16-JUN-08 16-JUN-08 16-JUN-08
P2 11 61105 18-MAR-08 N Y 18-MAR-08 18-MAR-08 18-MAR-08
P2 12 62100 25-MAR-08 Y N 18-MAR-08
12 rows selected.
Get the current row's L-flag date if it exists, get the immediately prior one, and get the immediately next one - one of those three (if present) will be the L-date of interest....
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select cust_cd, visit_id1, visit_id2, visit_dt, curr_l, last_l, next_l,
2 case when curr_l is not null then curr_l
3 when last_l is not null and next_l is null and visit_dt-last_l <= 14 then last_l
4 when last_l is null and next_l is not null and next_l-visit_dt <= 14 then next_l
5 else decode( least( (visit_dt-last_l), (next_l-visit_dt), 14),
6 (visit_dt-last_l), last_l,
7 (next_l-visit_dt), next_l )
8 end interesting_l
9 from (
10 select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
11 decode( l_flag, 'Y', visit_dt ) curr_l,
12 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
13 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
14 from jt
15 )
16 where c_flag = 'Y'
17 /
CUST_CD VISIT_ID1 VISIT_ID2 VISIT_DT CURR_L LAST_L NEXT_L INTERESTI
-------------------- ---------- ---------- --------- --------- --------- --------- ---------
P1 1 60961 27-NOV-07 30-NOV-07 30-NOV-07
P1 3 61055 29-FEB-08 29-FEB-08 29-FEB-08 29-FEB-08 29-FEB-08
P1 5 61072 03-MAR-08 29-FEB-08 14-APR-08 29-FEB-08
P1 6 61100 14-APR-08 14-APR-08 14-APR-08 14-APR-08 14-APR-08
P1 7 61121 15-MAY-08 14-APR-08 16-JUN-08
P1 8 61163 16-JUN-08 16-JUN-08 16-JUN-08 16-JUN-08 16-JUN-08
P2 12 62100 25-MAR-08 18-MAR-08 18-MAR-08
7 rows selected.
So, we can take those three dates and for the C records - find the interesting l date....
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with A
2 as
3 (
4 select cust_cd, visit_id1, visit_id2, visit_dt, curr_l, last_l, next_l,
5 case when curr_l is not null then curr_l
6 when last_l is not null and next_l is null and visit_dt-last_l <= 14 then last_l
7 when last_l is null and next_l is not null and next_l-visit_dt <= 14 then next_l
8 else decode( least( (visit_dt-last_l), (next_l-visit_dt), 14),
9 (visit_dt-last_l), last_l,
10 (next_l-visit_dt), next_l )
11 end interesting_l
12 from (
13 select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
14 decode( l_flag, 'Y', visit_dt ) curr_l,
15 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
16 last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
17 from jt
18 )
19 where c_flag = 'Y'
20 ),
21 B
22 as
23 (select * from jt where l_flag = 'Y' )
24 select a.cust_cd, a.visit_dt, a.visit_id1, b.visit_id1
25 from A left join B
26 on (a.cust_cd = b.cust_cd and a.interesting_l = b.visit_dt)
27 order by a.cust_cd, a.visit_dt, a.visit_id1
28 /
CUST_CD VISIT_DT VISIT_ID1 VISIT_ID1
-------------------- --------- ---------- ----------
P1 27-NOV-07 1 2
P1 29-FEB-08 3 3
P1 29-FEB-08 3 4
P1 03-MAR-08 5 3
P1 03-MAR-08 5 4
P1 14-APR-08 6 6
P1 15-MAY-08 7
P1 16-JUN-08 8 8
P1 16-JUN-08 8 9
P1 16-JUN-08 8 10
P2 25-MAR-08 12 11
11 rows selected.
and then outer join back to pick up the records of interest.... You could pivot this if you want - but basically this is your set of data.
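The core of the three-date approach above can be condensed into a sketch (Python, days as plain integers for brevity; the tie-breaking and the join back for same-day multiplicity are left out): for each C visit, the interesting L date is the nearest L date within 14 days.

```python
def interesting_l_dates(rows):
    """rows: (visit_id, day, c_flag, l_flag) tuples.
    For each C visit, return the nearest L-visit day within 14 days
    (None if there is none). Ties and the pivot are omitted."""
    l_days = [day for _, day, _, l in rows if l]
    result = {}
    for vid, day, c, _ in rows:
        if not c:
            continue
        near = [d for d in l_days if abs(d - day) <= 14]
        result[vid] = min(near, key=lambda d: abs(d - day)) if near else None
    return result

# visit 1 (a C visit on day 0) has an L visit 3 days later;
# visit 3 (a C visit on day 30) has no L visit within 14 days
visits = [(1, 0, True, False), (2, 3, False, True), (3, 30, True, False)]
print(interesting_l_dates(visits))  # {1: 3, 3: None}
```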
RE: Lag/Lead Windowing
J, September 25, 2008 - 6:25 pm UTC
Thank you Tom!
This is a *completely* different approach from what we were trying. Will dissect this to try to tease the nuances of your "how to apply analytics" mindset.
All the best,
- J
Analytics
Ashish, October 07, 2008 - 6:05 am UTC
Tom,
First of all, I apologize for not providing 100% of the input you needed to help me resolve the problem.
Here I go... the code is:
SELECT pc_no,av_cd,jcd,rga,rgm
from (
select pc_no,av_cd,substr(jct,1,5)||'00' jcd,rga,rgm,
count (case when (nvl(rga,0)>0 OR nvl(rgm,0)>0) then 1 end)
over (partition by av_cd, pc_no, SUBSTR(jct,1,5)||'00') cnt
from ca_cd ca
where av_cd = 'T2'
)
where cnt=1
and avs = '0000424'
If I run above query I get below output:
pc_no av_cd jcd rga rgm
0000424 T2 7166500 0 0
0000424 T2 7166500 6 6
0000424 T2 7166500 0 0
However, I only want the (rga>0 or rgm>0) data, i.e. only one row out of the above 3 rows.
Please help!
Thanks,
October 08, 2008 - 9:13 pm UTC
add a where clause?
where cnt = 1 and (rga >0 or rgm>0)
what about your book on Analytical functions
Vijay'S, April 15, 2009 - 6:49 am UTC
Dear Tom,
it's been a long time since you said you were thinking of writing a book on analytical functions - have you started on it already, or is it going to be announced soon? I know I am too optimistic here, but reading a few more of your posts on analytics, I can't wait to have your book on this subject to understand them better.
Please make this happen - it will be a great help for everyone, as your other books have helped a lot.
Regards,
Vijay'S
can this be done in analytics -or single sql
A reader, July 28, 2009 - 9:40 am UTC
Hi Tom,
I have a table like below and the sample data as given below.
drop table T ;
create table T ( ordernumber number,stockid number, side char(1),quantity number,price number) ;
sample data :
insert into t values (1, 1, 'S', 100, 50.00) ;
insert into t values (2, 5, 'B', 25, 12.50) ;
insert into t values (3, 1, 'S', 100, 50.00) ;
insert into t values (4, 1, 'B', 150, 51.00) ;
insert into t values (5, 1, 'B', 50, 49.00) ;
insert into t values (6, 8, 'B', 100, 16.75) ;
insert into t values (7, 5, 'S', 50, 12.25) ;
insert into t values (8, 8, 'S', 100, 17.00) ;
My sql should be able to give me an output for all the possible records where I can sell a stock to someone who wants to buy. The ordernumber is a system generated number and is typically an order. The sell and buy match-off should have 2 restrictions:
the buy price must be >= the sell price.
and the stockid should be the same for the sell and buy records.
here "sell" records are the ones with side = 'S' and buy records are the ones with side = 'B' ...so the expected output of the sql should be
stockId sell_order buy_order sell_qty buy_quty
1 1 4 100 100
1 3 4 50 50
5 7 2 25 25
In case of order #7, the remaining qty to sell should be updated as part of this process... so the table should be updated so that for order number 7 there are still 25 stocks, and they can be considered for the next run..
The same is true for order #3.
So far I have taken this approach - but have not been able to crack it...
column diff format 99999
select 'T'
, buy.ordernumber buy_ord
,sell.ordernumber sell_ord
,sell.stockid
,sell.quantity sell_qty
-- ,sell.tot_sell_qty
-- ,buy.tot_buy_qty
,buy.quantity buy_qty
,sell.price sell_price
,buy.price buy_price
,lag(sell.quantity - buy.quantity) over ( order by buy.price desc) diff
from
(select t.* , sum(quantity) over (partition by side,price) tot_sell_qty from t where side ='S' and quantity >0 and price > 0
order by ordernumber,stockid,price ) sell,
(select t.* ,sum(quantity) over (partition by side,price) tot_buy_qty from t where side ='B' and quantity >0 and price > 0
order by ordernumber,stockid,price ) buy
where sell.stockid = buy.stockid and buy.price >= sell.price ;
I would really appreciate any help on this.
Is analytics better in this case?
Joe, July 29, 2009 - 7:50 pm UTC
Tom,
I have become a huge fan of analytics and I find myself trying to use them whenever possible, even if there may be an easier or better to get the result. So I figured I better get your opinion as to whether the fictional example below is better with or without the analytics.
declare
s1 varchar2(30);
s2 varchar2(30);
function get_first_index_loop(t in varchar2) return varchar2 is
result varchar2(30) := null;
begin
for c in (select 1 priority,
constraint_name index_name
from user_constraints
where table_name = t
and constraint_type = 'P'
union
select 2 priority,
index_name
from user_indexes
where table_name = t
order by priority)
loop
result := c.index_name;
exit;
end loop;
return result;
end;
function get_first_index_analytics(t in varchar2) return varchar2 is
result varchar2(30) := null;
begin
select distinct first_value(index_name) over(order by priority)
into result
from (select 1 priority,
constraint_name index_name
from user_constraints
where constraint_type = 'P'
and table_name = t
union
select 2 priority,
index_name
from user_indexes
where table_name = t
order by priority);
return result;
exception
when no_data_found then
return null;
end;
begin
for t in (select table_name
from user_tables
order by table_name)
loop
s1 := get_first_index_loop(t.table_name);
s2 := get_first_index_analytics(t.table_name);
end loop;
end;
I know that "if it can be done in SQL then do it, else use as little pl/sql as possible". But in the above example there are other factors like the exception handler that could affect performance.
Is one way better than the other (peformance, etc) and if not, could you share you opinion as to which way you prefer?
Thanks,
Joe
August 03, 2009 - 5:10 pm UTC
I would not use procedural code for this - no. You shouldn't have any code at all - forget comparing "procedural code - no analytics" vs "procedural code with analytics"
There should never have been a function - ever.
ops$tkyte%ORA10GR2> select table_name,
2 nvl( ( select constraint_name index_name
3 from user_constraints
4 where constraint_type = 'P'
5 and table_name = t.table_name
6 and rownum = 1 ),
7 ( select index_name
8 from user_indexes
9 where table_name = t.table_name
10 and rownum = 1 ) ) iname
11 from user_tables t
12 order by table_name
13 /
TABLE_NAME INAME
------------------------------ ------------------------------
C
CASE CASE_SEARCH1
DEPT DEPT_PK
DOCUMENTS T_IOT_PK
more details on question asked on 28th Jul 2009
A reader, July 31, 2009 - 1:04 am UTC
Hi Tom,
Please consider these details as more accurate.
Hi Tom,
I have a table like below and the sample data as given below.
drop table T ;
create table T ( ordernumber number,stockid number, side char(1),quantity number,price number) ;
sample data :
insert into t values (1, 1, 'S', 100, 50.00) ;
insert into t values (2, 5, 'B', 25, 12.50) ;
insert into t values (3, 1, 'S', 100, 50.00) ;
insert into t values (4, 1, 'B', 150, 51.00) ;
insert into t values (5, 1, 'B', 50, 49.00) ;
insert into t values (6, 8, 'B', 100, 16.75) ;
insert into t values (7, 5, 'S', 50, 12.25) ;
insert into t values (8, 8, 'S', 100, 17.00) ;
my sql should be able to give me a output for all the possible records where i can sell a stock to
somone who want to buy..the ordernunber is a system generated number and that is typical a
order.the sell and buy match off should have 3 restrictions
1) the buy price must be >= the sell price (as in the expected output and the query below).
2) the stockid must be the same for the sell and buy records; that is, a sell can only be matched with a buy for the same stockid. We cannot sell stockid 1 to someone who wants to buy stockid 2, even if the pricing would otherwise match.
3) for the same stockid, if there are two or more buy orders then the one with the maximum buy price should be filled first.
If there are 2 or more buyers with the same price, then the allotment should happen by ordernumber.
here "sell" records are the one for which side = 'S' and buy records are the one with side = 'B'
...so the expected output of the sql should be
stockId sell_order buy_order sell_qty buy_quty
1 1 4 100 100
1 3 4 50 50
5 7 2 25 25
in case of order #7 the remaining quantity to sell should be updated as part of this process, so the
table should reflect that for order number 7 there are still 25 stocks, and they can be
considered for the next run.
The same is true for order #3.
so far I have taken this approach - but I am not able to crack it...
column diff format 99999
select 'T'
, buy.ordernumber buy_ord
,sell.ordernumber sell_ord
,sell.stockid
,sell.quantity sell_qty
-- ,sell.tot_sell_qty
-- ,buy.tot_buy_qty
,buy.quantity buy_qty
,sell.price sell_price
,buy.price buy_price
,lag(sell.tot_sell_qty - buy.quantity) over ( partition by sell.stockid order by buy.price desc) diff
from
(select t.* , sum(quantity) over (partition by stockid, side,price) tot_sell_qty from t where side ='S' and quantity >0 and price > 0
order by ordernumber,stockid,price ) sell,
(select t.* ,sum(quantity) over (partition by stockid, side,price) tot_buy_qty from t where side ='B' and quantity >0 and price > 0
order by ordernumber,stockid,price ) buy
where sell.stockid = buy.stockid and buy.price >= sell.price ;
really appreciate any help on this.
Thanks
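This question was not answered in the thread. As a hedged illustration only (the name `match_orders` and the greedy strategy are one reading of the three rules above, not a vetted solution), the match-off logic can be prototyped procedurally before attempting it in SQL:

```python
from collections import namedtuple

# Hypothetical names for illustration only.
Order = namedtuple("Order", "ordernumber stockid side quantity price")

def match_orders(orders):
    """Greedy match-off: fill each sell (in ordernumber order) from
    eligible buys of the same stockid (buy price >= sell price),
    highest buy price first, ties broken by ordernumber.
    Returns (stockid, sell_order, buy_order, qty) matches plus the
    leftover quantity per order."""
    matches = []
    remaining = {o.ordernumber: o.quantity for o in orders}
    sells = sorted((o for o in orders if o.side == "S"),
                   key=lambda o: o.ordernumber)
    for s in sells:
        # eligible buys for this sell: best price first, then FIFO
        buys = sorted((o for o in orders
                       if o.side == "B" and o.stockid == s.stockid
                       and o.price >= s.price),
                      key=lambda o: (-o.price, o.ordernumber))
        for b in buys:
            qty = min(remaining[s.ordernumber], remaining[b.ordernumber])
            if qty > 0:
                matches.append((s.stockid, s.ordernumber, b.ordernumber, qty))
                remaining[s.ordernumber] -= qty
                remaining[b.ordernumber] -= qty
    return matches, remaining

orders = [Order(1, 1, "S", 100, 50.00), Order(2, 5, "B", 25, 12.50),
          Order(3, 1, "S", 100, 50.00), Order(4, 1, "B", 150, 51.00),
          Order(5, 1, "B",  50, 49.00), Order(6, 8, "B", 100, 16.75),
          Order(7, 5, "S",  50, 12.25), Order(8, 8, "S", 100, 17.00)]
matches, remaining = match_orders(orders)
```

Run against the sample data, this reproduces the expected output rows (1→4 for 100, 3→4 for 50, 7→2 for 25) and leaves orders 3 and 7 with residual quantity, matching the update requirement described above.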
How to give group id for the particular group of rows
Suji, January 27, 2010 - 3:20 pm UTC
Ex:
CREATE TABLE TAB_A(A_ID NUMBER, A_DT DATE, A_TTL VARCHAR2(20));
BEGIN
INSERT INTO TAB_A VALUES(1,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(2,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(3,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABCD');
INSERT INTO TAB_A VALUES(4,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABCD');
INSERT INTO TAB_A VALUES(5,TO_DATE('02/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(6,TO_DATE('02/01/2009','MM/DD/YYYY'),'ABCDE');
INSERT INTO TAB_A VALUES(6,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
INSERT INTO TAB_A VALUES(7,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
INSERT INTO TAB_A VALUES(8,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
END;
SELECT DISTINCT A.A_ID, A.A_DT, A.A_TTL
FROM TAB_A A
WHERE (A.A_DT, A.A_TTL) IN
(SELECT A.A_DT, A.A_TTL
FROM TAB_A A
GROUP BY A.A_DT, A.A_TTL
HAVING COUNT(DISTINCT A.A_ID) > 1)
ORDER BY A.A_ID
Answer is
A_ID A_DT A_TTL
1 1/1/2009 ABC
2 1/1/2009 ABC
3 1/1/2009 ABCD
4 1/1/2009 ABCD
6 3/1/2009 ABCDEF
7 3/1/2009 ABCDEF
8 3/1/2009 ABCDEF
I just want a 4th column which generates a unique number for each group of the A_DT and A_TTL columns, like:
all the 1/1/2009 and ABC rows should be displayed as 1
all the 1/1/2009 and ABCD rows should be 2
all the 3/1/2009 and ABCDEF rows should be 3
etc..........
How do I write that query?
Thanks in advance
January 29, 2010 - 3:23 pm UTC
ops$tkyte%ORA11GR2> select a.*, dense_rank() over (order by a_ttl)
2 FROM TAB_A A
3 WHERE (A.A_DT, A.A_TTL) IN
4 (SELECT A.A_DT, A.A_TTL
5 FROM TAB_A A
6 GROUP BY A.A_DT, A.A_TTL
7 HAVING COUNT(DISTINCT A.A_ID) > 1)
8 ORDER BY A.A_ID
9 /
A_ID A_DT A_TTL DENSE_RANK()OVER(ORDERBYA_TTL)
---------- --------- -------------------- ------------------------------
1 01-JAN-09 ABC 1
2 01-JAN-09 ABC 1
3 01-JAN-09 ABCD 2
4 01-JAN-09 ABCD 2
6 01-MAR-09 ABCDEF 3
7 01-MAR-09 ABCDEF 3
8 01-MAR-09 ABCDEF 3
7 rows selected.
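To see what DENSE_RANK is doing here, a quick Python sketch (illustrative only) that assigns the same group number to every row sharing an (A_DT, A_TTL) pair - note it ranks on the full (A_DT, A_TTL) key rather than A_TTL alone, which would matter if the same title appeared on two dates:

```python
rows = [(1, "1/1/2009", "ABC"), (2, "1/1/2009", "ABC"),
        (3, "1/1/2009", "ABCD"), (4, "1/1/2009", "ABCD"),
        (6, "3/1/2009", "ABCDEF"), (7, "3/1/2009", "ABCDEF"),
        (8, "3/1/2009", "ABCDEF")]

# dense_rank: sort the distinct keys, number them 1..n with no gaps,
# then map each row to its key's number
keys = sorted({(dt, ttl) for _, dt, ttl in rows})
rank = {k: i + 1 for i, k in enumerate(keys)}
grouped = [(a_id, dt, ttl, rank[(dt, ttl)]) for a_id, dt, ttl in rows]
```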
Excellent
suji, February 01, 2010 - 12:59 pm UTC
My problem is solved. Excellent. Thank you very much.
Thanks
A reader, February 11, 2010 - 5:58 am UTC
> create table test (id number, token varchar2(10));
> insert into test values (1,'ABC');
> insert into test values (2,'XYZ');
> insert into test values (3,'DEF');
> insert into test values (3,'DEF');
> insert into test values (4,'GHI');
> insert into test values (5,'RST');
> insert into test values (6,'JKL');
> insert into test values (6,'JKL');
> insert into test values (7,'MNO');
> insert into test values (8,'PQR');
> insert into test values (9,'DEF');
> insert into test values (10,'RST');
> SELECT * FROM TEST;
ID TOKEN
---------- ----------
1 ABC
2 XYZ
3 DEF
3 DEF
4 GHI
5 RST
6 JKL
6 JKL
7 MNO
8 PQR
9 DEF
10 RST
I want a query that will return all duplicate Token values unless the duplicates all belong to the same Id. For each duplicate returned the Id should also be displayed. Based on the above data the query should return:
ID TOKEN
--- ------
10 RST
5 RST
3 DEF
9 DEF
February 16, 2010 - 9:43 am UTC
1 select distinct *
2 from (
3 select id, token, count(distinct id) over (partition by token) cnt
4 from test
5 )
6* where cnt > 1
ops$tkyte%ORA10GR2> /
ID TOKEN CNT
---------- ---------- ----------
10 RST 2
9 DEF 2
5 RST 2
3 DEF 2
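The same logic, sketched in Python for illustration: a row survives only if its token maps to more than one distinct id, which is exactly what `count(distinct id) over (partition by token)` computes per row:

```python
rows = [(1, "ABC"), (2, "XYZ"), (3, "DEF"), (3, "DEF"), (4, "GHI"),
        (5, "RST"), (6, "JKL"), (6, "JKL"), (7, "MNO"), (8, "PQR"),
        (9, "DEF"), (10, "RST")]

# count(distinct id) over (partition by token): distinct ids per token
ids_per_token = {}
for rid, token in rows:
    ids_per_token.setdefault(token, set()).add(rid)

# keep the distinct rows whose token is shared by more than one id
dupes = sorted({(rid, tok) for rid, tok in rows
                if len(ids_per_token[tok]) > 1})
```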
Great work Tom
Dominic, February 25, 2010 - 6:27 am UTC
Tom,
First of all thanks for all the work you've done for the Oracle development community.
create table test1
(object_no number,
abi_code varchar2(3),
priority_over varchar2(3))
/
insert into test1 values(1,'G07','Z99');
insert into test1 values(1,'J01','A22');
insert into test1 values(1,'J01','G06');
insert into test1 values(1,'J01','G07');
insert into test1 values(1,'J01','Z99');
insert into test1 values(1,'J01','P01');
insert into test1 values(2,'P01','Z99');
commit;
SQL> select *
2 from test1
3 /
OBJECT_NO ABI PRI
---------- --- ---
1 G07 Z99
1 J01 A22
1 J01 G06
1 J01 G07
1 J01 Z99
1 J01 P01
2 P01 Z99
7 rows selected.
I need to flag those rows where the abi_code exists in the priority_over column for another abi_code for that object. The end output I am after would be
OBJECT_NO ABI
---------- ---
1 J01
2 P01
which I can get with
select distinct object_no, abi_code
from test1 t
where abi_code not in (select priority_over
from test1
where object_no = t.object_no)
but this is 2 full table scans so I was wondering if there was an analytic function I could use to flag those rows that I want to exclude
I tried
select object_no,
abi_code,
priority_over,
first_value(priority_over)
over (partition by object_no
order by decode(priority_over,
abi_code,1,2)) match
from test1
but that failed miserably.
Can I do this with analytics, or am I stuck with the correlated subquery?
Thanks in advance
Dom
March 01, 2010 - 10:15 am UTC
even with the analytics - you would have to make two passes -
pass1: identify
pass2: update
because you cannot update the result of an analytic directly.
You are not stuck with a correlated subquery - you could use not in (select object_no, priority_over ...) - but I would suggest this instead:
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'TEST1', numrows=> 1000000, numblks => 10000);
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select object_no, abi_code
2 from test1 t1
3 where NOT EXISTS
4 (select null
5 from test1 t2
6 where t2.object_no = t1.object_no
7 and t2.priority_over = t1.abi_code)
8 /
Execution Plan
----------------------------------------------------------
Plan hash value: 2822440888
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 999K| 30M| | 1981 (4)| 00:00:15 |
|* 1 | HASH JOIN ANTI | | 999K| 30M| 26M| 1981 (4)| 00:00:15 |
| 2 | TABLE ACCESS FULL| TEST1 | 1000K| 15M| | 709 (3)| 00:00:06 |
| 3 | TABLE ACCESS FULL| TEST1 | 1000K| 15M| | 711 (4)| 00:00:06 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."OBJECT_NO"="T1"."OBJECT_NO" AND
"T2"."PRIORITY_OVER"="T1"."ABI_CODE")
ops$tkyte%ORA10GR2> set autotrace off
looking for that nice juicy hash anti-join.
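The NOT EXISTS / anti-join semantics are easy to check outside the database. A small Python sketch (illustrative only): build the set of (object_no, priority_over) pairs and keep only the codes that never appear as a priority_over for their own object:

```python
test1 = [(1, "G07", "Z99"), (1, "J01", "A22"), (1, "J01", "G06"),
         (1, "J01", "G07"), (1, "J01", "Z99"), (1, "J01", "P01"),
         (2, "P01", "Z99")]

# NOT EXISTS: an abi_code survives if no row for the same object_no
# lists it in priority_over (this is what the hash anti-join computes)
overridden = {(obj, pri) for obj, _, pri in test1}
keep = sorted({(obj, abi) for obj, abi, _ in test1
               if (obj, abi) not in overridden})
```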
Thanks Tom
Dominic, March 01, 2010 - 11:40 am UTC
Excellent as usual.
The actual problem was complicated by the fact that test1 is actually an inline view, which initially caused me problems, but when I built the view in a WITH clause, that hash anti-join comes up lovely.
1 with il as (
2 select hi.object_no,
3 hi.item_description,
4 hfm.abi_code,
5 himp.priority_over
6 from obj_hitem hi,
7 hp_freetext_map hfm,
8 hp_item_map_priority himp
9 where instr(upper(hi.item_description),hfm.rule) > 0
10 and himp.abi_code = hfm.abi_code)
11 select il.object_no,
12 il.item_description,
13 il.abi_code,
14 il.priority_over
15 from il
16 where not exists (select null
17 from il il2
18 where il2.object_no = il.object_no
19* and il2.priority_over = il.abi_code)
SQL> set autotrace traceonly explain
SQL> /
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=376 Bytes=60
912)
1 2 RECURSIVE EXECUTION OF 'SYS_LE_2_0'
2 0 TEMP TABLE TRANSFORMATION
3 2 HASH JOIN (ANTI) (Cost=21 Card=376 Bytes=60912)
4 3 VIEW (Cost=4 Card=388 Bytes=56648)
5 4 TABLE ACCESS (FULL) OF 'SYS_TEMP_0FD9D6602_77604101'
(Cost=4 Card=388 Bytes=5432)
6 3 VIEW (Cost=4 Card=388 Bytes=6208)
7 6 TABLE ACCESS (FULL) OF 'SYS_TEMP_0FD9D6602_77604101'
(Cost=4 Card=388 Bytes=5432)
Once more you allow me to build a reputation of competence based on your hard work. It's solutions like this that are allowing me to turn 14000 rows of slow by slow pl/sql processing into one 3500 line insert statement :-)
Seriously though, I plug your website in every interview I attend, or site I work on. Who knows, if management finally learn that there is such a thing as good SQL, they might start hiring good developers to write it. Won't hold my breath.
Thanks again for all your work, just don't retire before I do.
Dom
CezarN, March 01, 2010 - 2:57 pm UTC
@Dominic:
I thought about analytics as a solution to your question:
select object_no, abi_code, priority_over
from (select object_no,
cod,
src,
count(distinct src) over(partition by cod) as csrc,
count(distinct object_no) over(partition by cod) as cobj,
abi_code,
priority_over
from (select object_no,
case d.src
when 1 then abi_code
else priority_over
end as cod,
d.src,
abi_code,
priority_over
from test1
join (select 1 as src from dual
union all
select 2 as src from dual) d
on 1 = 1))
where csrc = 2 and cobj > 1
order by cod, src, object_no;
@CezarN
Dominic, March 02, 2010 - 3:53 am UTC
Nice way of looking at the problem, I'm pretty impressed.
Unfortunately, as it is at the moment, it relies on the existence of the row where abi_code = 'J01' and priority_over = 'P01', but I tweaked it a bit and got this
select distinct object_no, abi_code
from (
select object_no,
cod,
src,
count(distinct src) over(partition by object_no,
cod) as csrc,
abi_code,
priority_over
from (select object_no,
case d.src
when 1 then abi_code
else priority_over
end as cod,
d.src,
abi_code,
priority_over
from test1
join (select 1 as src from dual
union all
select 2 as src from dual) d
on 1 = 1))
where csrc = 1
and src = 1
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (UNIQUE)
2 1 VIEW
3 2 WINDOW (SORT)
4 3 NESTED LOOPS
5 4 VIEW
6 5 UNION-ALL
7 6 TABLE ACCESS (FULL) OF 'DUAL'
8 6 TABLE ACCESS (FULL) OF 'DUAL'
9 4 TABLE ACCESS (FULL) OF 'TEST1'
I'll have to do some benchmarking to see which method performs best, and also which one can be more easily merged into the main query but it's always nice to have several options.
Thanks
Dom
March 02, 2010 - 8:28 am UTC
I'd lay odds on a nice juicy pair of full scans and a hash operation.
Always bet on Tom
Dominic, March 02, 2010 - 10:52 am UTC
SQL> get get_stats
1 declare
2 l_start number;
3 begin
4 insert into run_stats
5 select 'before',
6 stats.*
7 from stats;
8 l_start := dbms_utility.get_time;
9 for i in 1 .. 100
10 loop
11 for x in (with il as (select hi.object_no,
12 hi.item_description,
13 hfm.abi_code,
14 himp.priority_over
15 from obj_hitem hi,
16 hp_freetext_map hfm,
17 hp_item_map_priority himp
18 where instr(upper(hi.item_description),
19 hfm.rule(+)) > 0
20 and himp.abi_code(+) = hfm.abi_code)
21 select il.object_no,
22 il.item_description,
23 il.abi_code
24 from il
25 where not exists (select null
26 from il il2
27 where il2.object_no = il.object_no
28 and il2.priority_over = il.abi_code))
29 loop
30 null;
31 end loop;
32 end loop;
33 dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
34 insert into run_stats select 'after 1', stats.* from stats;
35 l_start := dbms_utility.get_time;
36 for i in 1 .. 100
37 loop
38 for x in (with test1 as (select hi.object_no,
39 hi.item_description,
40 hfm.abi_code,
41 himp.priority_over
42 from obj_hitem hi,
43 hp_freetext_map hfm,
44 hp_item_map_priority himp
45 where instr(upper(hi.item_description),
46 hfm.rule(+)) > 0
47 and himp.abi_code(+) = hfm.abi_code)
48 select distinct object_no,
49 item_description,
50 abi_code
51 from (select object_no,
52 item_description,
53 cod,
54 src,
55 count(distinct src) over(partition by object_no,
56 cod) as csrc,
57 abi_code,
58 priority_over
59 from (select object_no,
60 item_description,
61 case d.src
62 when 1 then abi_code
63 else priority_over
64 end as cod,
65 d.src,
66 abi_code,
67 priority_over
68 from test1
69 join (select 1 as src from dual
70 union all
71 select 2 as src from dual) d
72 on 1 = 1))
73 where csrc = 1
74 and src = 1)
75 loop
76 null;
77 end loop;
78 end loop;
79 dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
80 insert into run_stats select 'after 2', stats.* from stats;
81* end;
82 /
1240 hsecs
1480 hsecs
SQL> get extract_stats
1 select a.name, b.value-a.value run1, c.value-b.value run2,
2 ( (c.value-b.value)-(b.value-a.value)) diff
3 from run_stats a, run_stats b, run_stats c
4 where a.name = b.name
5 and b.name = c.name
6 and a.runid = 'before'
7 and b.runid = 'after 1'
8 and c.runid = 'after 2'
9 and (c.value-a.value) > 0
10 and (c.value-b.value) <> (b.value-a.value)
11* order by abs( (c.value-b.value)-(b.value-a.value))
SQL> /
NAME RUN1 RUN2 DIFF
---------------------------------------- ---------- ---------- ----------
LATCH.FOB s.o list latch 1 0 -1
LATCH.lgwr LWN SCN 4 5 1
LATCH.session timer 4 5 1
STAT...calls to kcmgcs 6 7 1
LATCH.mostly latch-free SCN 4 5 1
LATCH.active checkpoint queue latch 8 10 2
LATCH.channel operations parent latch 708 710 2
LATCH.Consistent RBA 4 1 -3
LATCH.redo writing 25 22 -3
STAT...active txn count during cleanout 0 4 4
STAT...cleanout - number of ktugct calls 0 4 4
STAT...consistent gets - examination 0 4 4
LATCH.messages 35 40 5
LATCH.simulator hash latch 4372 4348 -24
LATCH.multiblock read objects 9554 9502 -52
LATCH.sort extent pool 700 800 100
STAT...parse count (total) 106 6 -100
STAT...sorts (disk) 0 100 100
STAT...opened cursors cumulative 106 6 -100
STAT...execute count 205 105 -100
LATCH.session idle bit 102 1 -101
LATCH.simulator lru latch 561 672 111
STAT...db block gets 759 637 -122
STAT...prefetched blocks 26031 25893 -138
STAT...free buffer requested 30972 30775 -197
LATCH.dml lock allocation 202 4 -198
STAT...calls to kcmgas 200 0 -200
STAT...sorts (memory) 4 204 200
LATCH.session allocation 204 3 -201
STAT...session uga memory max 0 216 216
LATCH.checkpoint queue latch 1536 1920 384
LATCH.undo global data 3807 3408 -399
LATCH.loader state object freelist 200 600 400
STAT...consistent changes 400 0 -400
LATCH.redo allocation 455 47 -408
STAT...redo entries 445 31 -414
STAT...calls to get snapshot scn: kcmgss 20607 20105 -502
STAT...enqueue releases 708 202 -506
STAT...enqueue requests 709 202 -507
LATCH.enqueues 4727 4021 -706
LATCH.cache buffers lru chain 900 100 -800
STAT...db block changes 870 53 -817
LATCH.enqueue hash chains 5028 4026 -1002
LATCH.shared pool 1428 233 -1195
LATCH.library cache pin 1501 305 -1196
LATCH.library cache pin allocation 1284 88 -1196
LATCH.row cache objects 9630 8254 -1376
LATCH.row cache enqueue latch 9630 8244 -1386
STAT...buffer is not pinned count 81716 80316 -1400
STAT...table scan blocks gotten 81700 80300 -1400
STAT...no work - consistent read gets 81708 80308 -1400
STAT...consistent gets 101917 100320 -1597
STAT...session logical reads 102676 100957 -1719
LATCH.cache buffers chains 243285 241440 -1845
LATCH.library cache 3893 497 -3396
STAT...physical writes 800 4900 4100
STAT...physical writes direct 800 4900 4100
STAT...physical writes non checkpoint 800 4900 4100
STAT...physical reads 30858 35669 4811
STAT...physical reads direct 0 5000 5000
STAT...redo size 82956 29868 -53088
STAT...recursive calls 66331 7731 -58600
STAT...table scan rows gotten 8852600 8717200 -135400
STAT...sorts (rows) 3084 201984 198900
64 rows selected.
I'm guessing it's the sort that does for the analytic version?
Dom
March 02, 2010 - 12:47 pm UTC
I'd use get_cpu_time instead of just get_time.
hash (anti/non-anti) joins are just brutally efficient - one of the reasons we started using hash operations for many more things in 10g (over sort operations)
@Dominic
CezarN, March 03, 2010 - 12:22 am UTC
Your where condition:
where csrc = 1
and src = 1
means that we have to have only ONE src and that src must be abi_code, right?
So we can reduce the amount of sorting the analytics imply by working around that COUNT(DISTINCT src). I changed the coding set for src from (1,2) to (0,1). This means that a SUM of ZERO on src means that we have only ONE src, and that src must be abi_code. No more DISTINCT:
select distinct object_no, abi_code
from (
select object_no,
cod,
src,
sum(src) over(partition by object_no, cod) as csrc,
abi_code,
priority_over
from (select object_no,
case d.src
when 0 then abi_code
else priority_over
end as cod,
d.src,
abi_code,
priority_over
from test1
join (select 0 as src from dual
union all
select 1 as src from dual) d
on 1 = 1)
)
where csrc = 0
Hope this does better than with DISTINCT.
@CezarN
Dominic, March 03, 2010 - 5:30 am UTC
Actually you can have more than 1 src as shown below, but fortunately, because we are partitioning on cod, multiple values are still picked up. Having said that, I do like the sum method more; it seems sharper to me somehow.
SQL> select *
2 from test1
3 /
OBJECT_NO ABI PRI
---------- --- ---
1 G07 Z99
1 J01 A22
1 J01 G06
1 J01 G07
1 J01 Z99
2 P01 Z99
6 rows selected.
SQL> insert into test1
2 values(1,'H01','Z99')
3 /
1 row created.
SQL> commit;
Commit complete.
select distinct object_no,
abi_code
from (select object_no,
cod,
src,
sum(src) over(partition by object_no, cod) as csrc,
abi_code,
priority_over
from (select object_no,
case d.src
when 0 then abi_code
else priority_over
end as cod,
d.src,
abi_code,
priority_over
from test1
join (select 0 as src from dual
union all
select 1 as src from dual) d
on 1 = 1)
)
where csrc = 0
SQL> /
OBJECT_NO ABI
---------- ---
1 H01
1 J01
2 P01
The run times don't change much to be honest (sorry Tom, should have mentioned it before - I'm on version 9.2.0.7.0, so no get_cpu_time) and the stats seem to be the same, so the hash anti-join looks to be the winner over analytics in this case.
(looks like that's another set of pre-conceptions for the bin) :-)
Thanks
Dom
Reader, June 10, 2010 - 12:17 pm UTC
Tom,
In what cases should we use analytics and in what cases group by should be used.
Ranking can be done without using rank function. Can you tell which one is better? Ranking using rank analytic function or just group by and get rownumber
thanks
June 10, 2010 - 2:49 pm UTC
that is like asking "when should I join and when should I group by"
Show me an example of an analytic (analytics do not ever aggregate - by definition) that can be replaced by group by (which always aggregates - in general it returns FEWER rows than a non-group by) and vice versa
when you do that, I'll let you know the answer.
Use rank for example, do ranking with group by and ranking with analytics and then we can have a discussion.
Analytical Query
NS, June 22, 2010 - 3:49 pm UTC
I'd really appreciate your help solving a particular scenario using an analytic query.
I'm working on Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit.
Following are the create and insert statements:
create table ntest(a number, pdt varchar2(10),sec_pdt varchar2(10),htime number);
insert into ntest values(1,'RE','ST',1000);
insert into ntest values(2,'FF','RE',1100);
insert into ntest values(3,'RE','NR',1400);
insert into ntest values(4,'NR','FF',1450);
insert into ntest values(5,'RE','NR',1700);
insert into ntest values(6,'NR','FF',2100);
insert into ntest values(7,'TT','FA',2105);
insert into ntest values(8,'MM','FJ',2107);
insert into ntest values(9,'RE','TT',2115);
insert into ntest values(10,'RE','FF',2500);
insert into ntest values(11,'MN','RT',2510);
insert into ntest values(12,'MN','NT',2600);
insert into ntest values(13,'ZZ','FF',3000);
insert into ntest values(14,'ZN','FF',3100);
commit;
I want to select column a for those rows where the pdt in one row matches the pdt in another row whose htime is within +150 or -150 of the first row's htime.
e.g.: if pdt is 'NS' in a row with htime 1000 and there is another row with htime between 850 and 1150 with the same pdt, then I want to select that a value.
The same applies to the sec_pdt column.
I'm able to tackle this situation by writing following query :
select a from
(
select a,
count(*) OVER(Partition by pdt
order by htime RANGE BETWEEN 150 PRECEDING AND 150 FOLLOWING) pdtcnt,
count(*) OVER(Partition by sec_pdt
order by htime RANGE BETWEEN 150 PRECEDING AND 150 FOLLOWING) secpdtcnt
from ntest
) n
where pdtcnt > 1
or secpdtcnt > 1;
I want to add 2 more OR conditions to this query
1. I want to select column a for those values where pdt_cd in that row matches with the sec_pdt_cd in another row which is within the range of +150 or -150 of htime
2. I want to select column a for those values where sec_pdt_cd in that row matches with the pdt_cd in another row which is within the range of +150 or -150 of htime
I'm not able to figure out how to add these 2 conditions to the above query - or is there a better way to do this in a single query?
June 24, 2010 - 6:18 am UTC
it would have been interesting to have at least one observation such that:
where pdt_cd in that row matches
with the sec_pdt_cd in another row which is within the range of +150 or -150 of
htime
in the example wouldn't it?
I added:
insert into ntest values(42,'RE','NR',1800);
to create one. Then, if you add:
count(*) over (partition by pdt, sec_pdt order by htime range between 150 preceding and 150 following)
you just want to keep where that is greater than one.
and isn't:
I want to select column a for those values where sec_pdt_cd in that row
matches with the pdt_cd in another row which is within the range of +150 or
-150 of htime
the same as the prior - if pdt_cd has a sec_pdt_cd within +-150 - then that same thing is true in reverse already.
Analytics Question
NS, June 24, 2010 - 2:53 pm UTC
When I said
"where pdt_cd in that row matches
with the sec_pdt_cd in another row which is within the range of +150 or -150 of htime", I meant that the pdt_cd in the first row matches the sec_pdt_cd in other rows (like 2, 3, ...) that are within the range of +150 or -150 of the first row.
In my example RE(pdt_cd) from row 1 matches with RE(sec_pdt_cd) of row 2 and is within +150 or -150 of 1000(htime of row 1)
The other scenario is reverse
NR(sec_pdt_cd) from row 3 matches with NR(pdt_cd) of row 4 which is within +150 or -150 of 1400(htime of row 3)
With someone's help I was able to come up with the following query, which brings the output I'm looking for.
with t as
(select a, pdt combined_pdt, htime,pdt,sec_pdt
from ntest
union
select a, sec_pdt combined_pdt, htime,pdt,sec_pdt
from ntest)
select distinct a,pdt,sec_pdt,htime
from (select a,
combined_pdt,
pdt,
sec_pdt,
htime,
count(*) over(partition by combined_pdt order by htime range between 150 preceding and 150 following) pdt_cnt
from t)
where pdt_cnt > 1
and combined_pdt is not null
order by htime;
Here is the result.
A PDT SEC_PDT HTIME
1 RE ST 1000
2 FF RE 1100
3 RE NR 1400
4 NR FF 1450
5 RE NR 1700
42 RE NR 1800
7 TT FA 2105
9 RE TT 2115
11 MN RT 2510
12 MN NT 2600
13 ZZ FF 3000
14 ZN FF 3100
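A hedged Python re-check of that union-then-window idea (names are invented for illustration): a row qualifies if either of its codes also appears, in either column of the combined set, within ±150 of its htime:

```python
ntest = [(1, "RE", "ST", 1000), (2, "FF", "RE", 1100), (3, "RE", "NR", 1400),
         (4, "NR", "FF", 1450), (5, "RE", "NR", 1700), (6, "NR", "FF", 2100),
         (7, "TT", "FA", 2105), (8, "MM", "FJ", 2107), (9, "RE", "TT", 2115),
         (10, "RE", "FF", 2500), (11, "MN", "RT", 2510), (12, "MN", "NT", 2600),
         (13, "ZZ", "FF", 3000), (14, "ZN", "FF", 3100), (42, "RE", "NR", 1800)]

# the UNION: one (a, code, htime) row per code column
combined = {(a, pdt, h) for a, pdt, sec, h in ntest}
combined |= {(a, sec, h) for a, pdt, sec, h in ntest}

# count(*) over (partition by combined_pdt order by htime
#                range between 150 preceding and 150 following)
def window_count(code, h):
    return sum(1 for _, c, h2 in combined
               if c == code and abs(h2 - h) <= 150)

# keep the a values where that count exceeds 1 for either code
hits = sorted({a for a, code, h in combined if window_count(code, h) > 1})
```

This reproduces the result set above: only a = 6, 8 and 10 drop out.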
Thanks.
reader
A reader, July 12, 2010 - 12:03 pm UTC
On Toad Grid output,
break on column
does not work
Is there a way to simulate that function with analytics so that I can just execute a SQL query without "break on column_name" at the top and get the same result
Thanks
July 19, 2010 - 10:53 am UTC
ops$tkyte%ORA10GR2> select deptno, ename from scott.emp order by deptno, ename;
DEPTNO ENAME
---------- ----------
10 CLARK
10 KING
10 MILLER
20 ADAMS
20 FORD
20 JONES
20 SCOTT
20 SMITH
30 ALLEN
30 BLAKE
30 JAMES
30 MARTIN
30 TURNER
30 WARD
14 rows selected.
ops$tkyte%ORA10GR2> break on deptno
ops$tkyte%ORA10GR2> select deptno, ename from scott.emp order by deptno, ename;
DEPTNO ENAME
---------- ----------
10 CLARK
KING
MILLER
20 ADAMS
FORD
JONES
SCOTT
SMITH
30 ALLEN
BLAKE
JAMES
MARTIN
TURNER
WARD
14 rows selected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select decode( row_number() over (partition by deptno order by ename), 1, deptno ) new_deptno,
2 ename
3 from scott.emp
4 order by deptno, ename
5 /
NEW_DEPTNO ENAME
---------- ----------
10 CLARK
KING
MILLER
20 ADAMS
FORD
JONES
SCOTT
SMITH
30 ALLEN
BLAKE
JAMES
MARTIN
TURNER
WARD
14 rows selected.
it will not be "as efficient" of course - it takes extra work to compute row_number
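The DECODE/ROW_NUMBER trick above is just "print the group key only on the first row of its group". A small Python sketch (illustrative only) of the same idea:

```python
rows = [(10, "CLARK"), (10, "KING"), (10, "MILLER"),
        (20, "ADAMS"), (20, "FORD"),
        (30, "ALLEN"), (30, "BLAKE")]

# row_number() = 1 on the first row of each deptno: show the key;
# on every other row the DECODE returns NULL, blanking it out
report, prev = [], None
for deptno, ename in rows:   # already ordered by deptno, ename
    report.append((deptno if deptno != prev else None, ename))
    prev = deptno
```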
Analytics Stumper
Kevin Kirkpatrick, August 27, 2010 - 4:55 pm UTC
Hey Tom,
I'm presently stumped on this analytics problem - and am about to bite the bullet and code it out in some very ugly PL/SQL. But before throwing in the towel, I figured I'd let a pro take a stab.
2 Tables, DEBTS and ASSETS, defined as
WITH DEBTS AS
(
SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL
) ,
ASSETS AS
(SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL)
-- insert query here.
/*
DEBTS can be interpreted as: "Transaction <TRANSACTION_ID> must deduct <UNITS> units from
assets of type <ATYPE>."
ASSETS can be interpreted as: "Asset <ASSET_ID> is of type <ATYPE> and has <UNITS> available"
Desired results: Show the required series of steps to properly deduct DEBTS units from ASSETS units of the same ASSET TYPE.
Rules:
1) Process DEBTS in order of TRANSACTION_ID.
2) Debt units must be removed from ASSETS with a matching asset type, and must be
deducted from assets in order of ASSET_ID*.
3) Uncovered debts of a given transaction are not associated with any ASSET, but
are available in the result set for error handling.
So, for TRANSACTION_ID 1, we need to remove 4 units from ATYPE = 'X'. We first remove 3 units
from ASSET_ID A, then 1 more unit for ASSET_ID B. Transaction 2 must remove an additional 5
units from ATYPE='X'. It will take the remaining 2 units from ASSET_ID B, and then 3 more units
from ASSET_ID C. No units will be taken from ASSET_ID D.
The total activity for ATYPE=X would thus appear as follows in the query results:
TRANSACTION_ID UNITS_REMOVED ASSET_ID
1 3 A
1 1 B
2 2 B
2 3 C
Following this logic, transaction ID 3 is a new asset type, Y, and would thus have the following:
TRANSACTION_ID UNITS_REMOVED ASSET_ID
3 2 E
The special case is if there are insufficient UNITS in ASSETS to cover a DEBT. In that case
the remaining units need to be left as <null> ASSET_ID. So Transaction ID 4 would give:
TRANSACTION_ID UNITS_REMOVED ASSET_ID
4 2 F
4 1 <null>
Anyway, to recap: can you substitute for <insert query here> such that this result set would be obtained:
WITH DEBTS AS
(
SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL ) ,
ASSETS AS
(SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL)
-- insert query here.
TRANSACTION_ID UNITS_REMOVED ASSET_ID
1 3 A
1 1 B
2 2 B
2 3 C
3 2 E
4 2 F
4 1 <null>
Or do I have to bite the bullet and cram out some PL/SQL? (The main reason I want to avoid that is performance - I've got to process hundreds of thousands of DEBTS against millions of ASSETS, and I need it to happen quickly, e.g. PARALLEL FTS of ASSETS, PARALLEL FTS of DEBTS, hash join, and crunch.)
Solved
Kevin Kirkpatrick, August 30, 2010 - 10:30 am UTC
Tom,
Feel free to ignore the above - I've come up with the following solution (spoiler alert - to those who like to work through a neatly packaged SQL puzzler, I'd put this as a prime example...)
WITH
DEBTS AS (
SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL )
,ASSETS AS (
SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL)
-- Solution:
-- Asset ranges define the number of units each asset_id will contribute
-- to a given asset type. Define into existence the "null" asset, which
-- effectively contributes infinite units to each asset type.
,ASSET_RANGES AS (
SELECT ASSET_ID, ATYPE,
NVL(LAG(RUNNING_UNITS) OVER (PARTITION BY ATYPE ORDER BY ASSET_ID),0)+1 MIN_UNIT,
RUNNING_UNITS MAX_UNIT
FROM
(SELECT SUM(UNITS) OVER (PARTITION BY ATYPE ORDER BY ASSET_ID) RUNNING_UNITS,
A.*
FROM ASSETS A)
UNION ALL
SELECT NULL ASSET_ID, ATYPE, SUM(UNITS)+1 MIN_UNIT, 1000000000000 MAX_UNIT
FROM ASSETS
GROUP BY ATYPE)
-- Debt ranges define the number of units each transaction_id will deduct
-- from a given asset type.
,DEBT_RANGES AS (
SELECT TRANSACTION_ID, ATYPE,
NVL(LAG(RUNNING_UNITS) OVER (PARTITION BY ATYPE ORDER BY TRANSACTION_ID),0)+1 MIN_UNIT,
RUNNING_UNITS MAX_UNIT
FROM
(SELECT SUM(UNITS) OVER (PARTITION BY ATYPE ORDER BY TRANSACTION_ID) RUNNING_UNITS,
D.*
FROM DEBTS D))
-- Using the ranges defined above, I get my desired "HASH JOIN AND CRUNCH" performance
SELECT TRANSACTION_ID,
ASSET_ID,
LEAST(A.MAX_UNIT, D.MAX_UNIT) - GREATEST(A.MIN_UNIT,D.MIN_UNIT) +1
FROM DEBT_RANGES D, ASSET_RANGES A
WHERE D.ATYPE = A.ATYPE AND D.MIN_UNIT <= A.MAX_UNIT AND D.MAX_UNIT >= A.MIN_UNIT
ORDER BY TRANSACTION_ID, ASSET_ID
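The trick above maps each unit of each asset type onto an interval [min_unit, max_unit] via running sums; debts get intervals the same way, and the overlapping intervals are the allocations. A Python sketch of the same arithmetic (illustrative only, including the artificial "null" asset that absorbs uncovered debt):

```python
DEBTS = [(1, "X", 4), (2, "X", 5), (3, "Y", 2), (4, "Z", 3)]
ASSETS = [("A", "X", 3), ("B", "X", 3), ("C", "X", 4),
          ("D", "X", 2), ("E", "Y", 3), ("F", "Z", 2)]

def ranges(rows):
    """Running-sum intervals per atype: a row with <units> covers units
    (prev_total + 1 .. prev_total + units).  Input must already be
    ordered by the key within each atype."""
    out, totals = [], {}
    for key, atype, units in rows:
        lo = totals.get(atype, 0) + 1
        totals[atype] = totals.get(atype, 0) + units
        out.append((key, atype, lo, totals[atype]))
    return out, totals

debt_r, _ = ranges(DEBTS)
asset_r, asset_tot = ranges(ASSETS)
# the "null" asset contributes effectively infinite units per type
asset_r += [(None, t, tot + 1, 10**12) for t, tot in asset_tot.items()]

# the hash join and crunch: overlapping intervals of the same atype
result = sorted(
    ((tid, min(amax, dmax) - max(amin, dmin) + 1, aid)
     for tid, dt, dmin, dmax in debt_r
     for aid, at, amin, amax in asset_r
     if dt == at and dmin <= amax and dmax >= amin),
    key=lambda r: (r[0], r[2] is None, r[2] or ""))
```

Run against the sample data, this reproduces the desired result set, including the (4, 1, null) row for the uncovered unit of transaction 4.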
analytics
A reader, October 15, 2010 - 9:09 am UTC
Hi Tom,
you always say that analytic functions *open up a NEW* path.
Quote from your book :
"This new set of functionality holds some exciting possibilities. It opens up a whole new way of looking
at the data. It will remove a lot of procedural code and complex (or inefficient) queries that would have
taken a long time to develop, to achieve the same result".
What is the new path? What is the 'mechanism' behind it? How does it happen? Without doing a 'self join', how does it apply the analytic function and calculate AVG (for example)?
Thanks
October 15, 2010 - 9:25 am UTC
not sure how to answer this.
close your eyes, envision a table of data. Now, envision a sliding range of data in that big table of data - a range of rows. Analytics has access to the rows in that range and can reference them. The "new path" is "new code" that was added to the database kernel to open up access to that range - what we call a "window" - and added steps to plans like "window sort" and the like.
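That sliding-range idea can be sketched procedurally. This is only an illustrative Python analogue of AVG(val) OVER (ORDER BY ... ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) - it shows what the window "sees", not how the kernel implements it:

```python
def windowed_avg(vals, before=1, after=1):
    """For each row, average the values in a sliding range of
    neighbouring rows - the 'window' an analytic function sees."""
    out = []
    for i in range(len(vals)):
        window = vals[max(0, i - before): i + after + 1]
        out.append(sum(window) / len(window))
    return out

print(windowed_avg([2, 4, 6, 8]))  # [3.0, 4.0, 6.0, 7.0]
```

No self join is needed: the engine keeps the ordered range buffered and each row simply references its neighbours within it.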
A reader, October 15, 2010 - 9:32 am UTC
Many thanks Tom,
"The "new path" is "new code" that was added to the database kernel"
Yes, understood now. One more doubt: does the Oracle engine have to access the FULL table first before applying ANY analytic function (assume we are applying analytics on a TABLE) - correct?
October 15, 2010 - 10:04 am UTC
... Oracle engine has to access the
FULL table first before applying ANY analytical function ...
of course the oracle engine has access to the full table - that is because we "own" the table, we manage all of the data.
However, it is not a REQUIREMENT to access the full table before applying ANY analytical function - it would depend on the schema (indexing), the query, the partitioning clause and so on.
Sometimes the entire query is "executed" before the first rows are returned and sometimes the first rows may be returned immediately without hitting the entire table. "it depends" - for all queries "it depends". It is not a requirement.
Slow analytic function
lalu, October 17, 2010 - 4:17 am UTC
Hi Tom,
I have a table with contents as below:
id notes
1 abcdefghijklmnopqrstuvwxyz
I want the output as below
id line notes
1 1 abcd (varchar of 4 characters)
1 2 efgh
1 3 ijkl
.....
....
1 7 yz
I use the below query to get the data:
SELECT DISTINCT A.ID, LEVEL LINE, SUBSTR(A.T, (LEVEL - 1) * 4 + 1, 4) NOTES
FROM (SELECT ID, CONTENTS T FROM MY_TABLE) A
CONNECT BY LENGTH(SUBSTR(A.T, (LEVEL - 1) * 4 + 1, (LEVEL - 1) * 4)) >= 1
ORDER BY ID, LINE
In my production system the table has got 50000 rows and the notes field is 1000char.
I want the data in chunk of 80 chars(notes) in the output.
Tried the above query by replacing 4 with 80. The query runs for 3/4 hours but still no output.
Is there any way to rewrite the query to get the data faster?
Thanks.
lalu.
October 25, 2010 - 8:17 am UTC
ops$tkyte%ORA10GR2> create table t
2 as
3 select rownum id,
4 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
5 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
6 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
7 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
8 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
9 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
10 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
11 rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
12 rpad('y',81,'y') data
13 from all_objects
14 where rownum <= 5000
15 /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, 5 ) new_data
2 from t, table( cast( multiset( select to_char( rownum, 'fm0009' ) || substr( data, 1+(rownum-1)*80, 80 )
3 from dual
4 connect by level <= ceil(length(data)/80) ) as sys.odciVarchar2List ) )
5 /
70000 rows selected.
Elapsed: 00:00:01.29
Execution Plan
----------------------------------------------------------
Plan hash value: 442444467
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50M| 25G| 59040 |
| 1 | NESTED LOOPS | | 50M| 25G| 59040 |
| 2 | TABLE ACCESS FULL | T | 6130 | 3202K| 84 |
| 3 | COLLECTION ITERATOR SUBQUERY FETCH| | | | |
| 4 | COUNT | | | | |
|* 5 | CONNECT BY WITHOUT FILTERING | | | | |
| 6 | FAST DUAL | | 1 | | 2 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - filter(LEVEL<=CEIL(LENGTH(:B1)/80))
Note
-----
- cpu costing is off (consider enabling it)
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
68 recursive calls
0 db block gets
4903 consistent gets
834 physical reads
0 redo size
6447972 bytes sent via SQL*Net to client
51726 bytes received via SQL*Net from client
4668 SQL*Net roundtrips to/from client
5000 sorts (memory)
0 sorts (disk)
70000 rows processed
ops$tkyte%ORA10GR2> set autotrace off
seems to work fairly well.
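The row-expansion logic in the TABLE(CAST(MULTISET(...))) trick above - each row becomes CEIL(LENGTH(data)/80) chunk rows - can be sketched procedurally. A minimal Python analogue, using the poster's own 4-character example:

```python
from math import ceil

def chunk_rows(row_id, notes, size=80):
    """Expand one (id, notes) row into (id, line, chunk) rows,
    mirroring CONNECT BY LEVEL <= CEIL(LENGTH(data)/size)."""
    return [(row_id, line + 1, notes[line * size:(line + 1) * size])
            for line in range(ceil(len(notes) / size))]

# The original poster's example: 26 characters in chunks of 4.
print(chunk_rows(1, "abcdefghijklmnopqrstuvwxyz", size=4))
```

The last chunk is simply whatever characters remain ((1, 7, 'yz') here), just as SUBSTR returns the short tail chunk in the SQL version.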
Tried but got error
A reader, October 25, 2010 - 9:18 am UTC
Hi Tom,
I tried your example.
Created table T, then
select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, ) new_data
from t, table( cast( multiset( select to_char( rownum, 'fm0009' ) || substr( data,
1+(rownum-1)*80, 80 ) from dual connect by level <= ceil(length(data)/80) )as sys.odciVarchar2List ) )
returns:
select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, ) new_data
*
ERROR at line 1:
ORA-00936: missing expression
Working
lalu, October 25, 2010 - 9:33 am UTC
Hi Tom,
Thanks a lot.
I was missing a very small char from the script.
It's running now. Will check the performance.
Thanks.
Achieve this with analytics or pivot query?
A reader, January 20, 2011 - 3:35 pm UTC
Hi
I have following data:
phone state state_mod_date service_id
-------- ------- ---------------- ------------
9087624 online 2010/02/22 1
9087624 online 2010/02/22 2
9087624 online 2010/02/23 6
9087624 offline 2010/02/23 2
9087624 offline 2010/02/24 6
9200921 online 2010/01/25 1
9200921 online 2010/03/12 3
9200921 online 2010/04/21 7
9200921 offline 2010/09/11 3
9760293 online 2010/01/02 1
I need to transform it to
phone service_id start_date end_date
-------- ----------- ----------- ----------
9087624 1 2010/02/22
9087624 2 2010/02/22 2010/02/23
9087624 6 2010/02/23 2010/02/24
9200921 1 2010/01/25
9200921 3 2010/03/12 2010/09/11
9200921 7 2010/04/21
9760293 1 2010/01/02
I have been doing this with PL/SQL but the amount of data has been increasing in the last few months and the PL/SQL code is running slower and slower.
I have been looking at analytic functions and pivot queries but can't find a way of doing it; can you please shed some light?
I am running 10.2.0.3
Thanks a lot
Pedro
January 24, 2011 - 6:58 am UTC
you would need to
a) supply create tables
b) insert into tables (so we have data)
and most importantly:
c) explain your requirement, sometimes a picture is worth a thousand words - but we need an explanation of your inputs and outputs here and the processing that needs to take place.
It looks like just getting the min(state_mod_date) and max(state_mod_date) (where max<>min) by phone and service_id, it is not clear what the state column is used for
A reader, January 24, 2011 - 11:39 am UTC
use group by as below...
select phone, service_id, min(case when state='online' then state_mod_date END) as start_date , max(case when state='offline' then state_mod_date END) as end_date
from data
Group by phone, service_id
Achieve this with analytics or pivot query?
Ravi Kumar Pilla, January 27, 2011 - 9:00 am UTC
Dear Tom,
I'm a silent reader of your site and learned a lot and appreciate your work.
I don't know whether it is the best way, but it's working for me. Correct me if there is another way which works faster for the person who asked.
SELECT PHONE,SERVICEID,MAX(DECODE(status,'ONLINE',START_date)) startdate,
MAX(decode(status,'OFFLINE',START_DATE)) ENDDATE FROM PHONE GROUP BY PHONE,
SERVICEID
February 01, 2011 - 3:16 pm UTC
as I said above:
It looks like just getting the min(state_mod_date) and max(state_mod_date) (where max<>min) by phone and service_id, it is not clear what the state column is used for
I cannot validate your query as I don't know what the question is
Combine dates
Bob C, January 28, 2011 - 3:12 pm UTC
I have a similar challenge as above, except I want to combine rows to show continuous date range when rate plan and customer are the same. Records are not supposed to overlap, so that does not need to be handled.
drop table t1;
create table t1 ( key_col char(1), d_begin date, d_end date, cust char(10), rate_plan char(10) );
insert into t1 values ('a',TO_DATE('01/01/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/02/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/03/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/04/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/05/2011 01:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/06/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/07/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/08/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/09/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/10/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/11/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/12/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r2');
insert into t1 values ('b',TO_DATE('01/01/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/02/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/03/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/04/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/05/2011 01:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/06/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/07/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/08/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/09/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/10/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r2');
commit;
Would like to get results:
a, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
a, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
a, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
b, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
b, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
b, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
following query is close, but does not account for the gap between rows 2 and 3
select distinct key_col, cust, rate_plan,
first_value(d_begin) over (partition by key_col, cust, rate_plan order by d_begin rows unbounded preceding) as min_eff,
last_value(d_end) over (partition by key_col, cust, rate_plan order by d_begin ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as max_term
from t1
order by 1,2,4
KEY_COL CUST RATE_PLAN MIN_EFF MAX_TERM
a c1 r1 1/1/2011 1/10/2011 11:59:59 PM
a c1 r2 1/11/2011 1/12/2011 11:59:59 PM
b c1 r1 1/1/2011 1/8/2011 11:59:59 PM
b c1 r2 1/9/2011 1/10/2011 11:59:59 PM
February 01, 2011 - 3:31 pm UTC
I don't understand your output
a, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
a, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
a, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
I see why the third row is there, but I don't get why the first two are not merged.
I don't see any "gap" there.
@Bob C re:combine dates
Stew Ashton, January 29, 2011 - 5:04 am UTC
Hi Bob,
Here's a solution based on a contribution by Steve from Pleasanton CA:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1594948200346140013#1596748700346101469
select key_col, cust, rate_plan, min(d_begin) d_begin, max(d_end) d_end from (
select key_col, cust, rate_plan, d_begin, d_end,
sum(new_start) over(partition by key_col,cust,rate_plan order by d_begin) grp from (
select key_col, cust, rate_plan, d_begin, d_end,
case when
lag(d_end + 1/24/60/60) over(partition by key_col,cust,rate_plan order by d_begin)
>= d_begin
then 0 else 1 end new_start
from t1
)
)
group by key_col, cust, rate_plan, grp
order by key_col, cust, rate_plan, d_begin;
The innermost query sets NEW_START to 0 when there is no date gap with the previous record, and to 1 when you have to start afresh. The middle query sets GRP as a running total, so all records within the same continuous date range will belong to the same GRP. The outer query uses GROUP BY to get one record per date range.
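The flag-plus-running-sum pattern described above (often called "gaps and islands") can be sketched procedurally. A minimal Python analogue for one key/cust/rate_plan partition, using a 1-second tolerance just like the d_end + 1/24/60/60 in the query:

```python
from datetime import datetime, timedelta

ONE_SECOND = timedelta(seconds=1)

def group_ranges(rows):
    """rows: (d_begin, d_end) tuples sorted by d_begin for one partition.
    Bump the group number whenever a gap appears - the same
    flag-plus-running-sum trick as the query above."""
    grp, prev_end, out = 0, None, []
    for d_begin, d_end in rows:
        if prev_end is None or prev_end + ONE_SECOND < d_begin:
            grp += 1  # gap found: start a new island
        out.append((grp, d_begin, d_end))
        prev_end = d_end
    return out

rows = [(datetime(2011, 1, 1), datetime(2011, 1, 2, 23, 59, 59)),
        (datetime(2011, 1, 3), datetime(2011, 1, 4, 23, 59, 59)),
        (datetime(2011, 1, 5, 1), datetime(2011, 1, 6, 23, 59, 59))]
print([g for g, _, _ in group_ranges(rows)])  # [1, 1, 2]
```

The third row starts at 01:00, one hour past the previous end plus one second, so it opens island 2 - that is the "subtle gap" in Bob's data. Grouping by the island number then yields one merged range per island.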
Perfect answer thanks
A reader, February 01, 2011 - 10:24 am UTC
Works great, thanks.
Your Feb. 1 followup
Stew Ashton, February 01, 2011 - 4:35 pm UTC
Tom, you said 'I don't see any "gap" there.'
The second row starts at 1 A.M., not midnight. I didn't see that at first either. Or did you mean something else?
February 01, 2011 - 6:30 pm UTC
indeed, the gap was too subtle for my eyes :) I missed it entirely. After putting my glasses on:
ops$tkyte%ORA11GR2> select key_col, cust, rate_plan, min(d_begin), max(d_end)
2 from (
3 select key_col, cust, rate_plan, d_begin, d_end, last_d_end,
4 last_value(rn IGNORE NULLS) over (partition by key_col, cust, rate_plan order by d_end) grp
5 from (
6 select key_col, cust, rate_plan, d_begin, d_end,
7 lag(d_end) over (partition by key_col, cust, rate_plan order by d_end) last_d_end,
8 case when lag(d_end) over (partition by key_col, cust, rate_plan order by d_end) = d_begin-1/24/60/60 then null
9 else row_number() over (partition by key_col, cust, rate_plan order by d_end )
10 end rn
11 from t1
12 )
13 )
14 group by key_col, cust, rate_plan, grp
15 order by key_col, cust, rate_plan
16 /
K CUST RATE_PLAN MIN(D_BEGIN) MAX(D_END)
- ---------- ---------- -------------------- --------------------
a c1 r1 01-jan-2011 00:00:00 04-jan-2011 23:59:59
a c1 r1 05-jan-2011 01:00:00 10-jan-2011 23:59:59
a c1 r2 11-jan-2011 00:00:00 12-jan-2011 23:59:59
b c1 r1 01-jan-2011 00:00:00 04-jan-2011 23:59:59
b c1 r1 05-jan-2011 01:00:00 08-jan-2011 23:59:59
b c1 r2 09-jan-2011 00:00:00 10-jan-2011 23:59:59
6 rows selected.
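The LAST_VALUE(rn IGNORE NULLS) step above is a "carry forward" trick: rows that continue a range get a NULL marker, and each row then picks up the most recent non-null marker as its group id. A minimal procedural sketch of just that piece:

```python
def carry_forward(values):
    """Mimic LAST_VALUE(x IGNORE NULLS) OVER (ORDER BY ...):
    each position gets the most recent non-null value seen so far."""
    last, out = None, []
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

print(carry_forward([1, None, None, 4, None]))  # [1, 1, 1, 4, 4]
```

Rows 2 and 3 inherit marker 1, so they group with row 1; row 5 inherits marker 4 - exactly how contiguous ranges end up sharing a GRP value.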
for Pedro from Spain
CezarN, February 02, 2011 - 3:36 am UTC
/*--Here are create/insert statements you have to supply whenever you ask something
--drop table t1;
--create table t1 (phone varchar2(10), state varchar2(10), state_mod_date date, service_id number(2));
insert into t1 (phone, state, state_mod_date, service_id) values ('9087624', 'online', to_date('2010/02/22','yyyy/mm/dd'), 1);
insert into t1 (phone, state, state_mod_date, service_id) values ('9087624', 'online', to_date('2010/02/22','yyyy/mm/dd'), 2);
insert into t1 (phone, state, state_mod_date, service_id) values ('9087624', 'online', to_date('2010/02/23','yyyy/mm/dd'), 6);
insert into t1 (phone, state, state_mod_date, service_id) values ('9087624', 'offline', to_date('2010/02/23','yyyy/mm/dd'), 2);
insert into t1 (phone, state, state_mod_date, service_id) values ('9087624', 'offline', to_date('2010/02/24','yyyy/mm/dd'), 6);
insert into t1 (phone, state, state_mod_date, service_id) values ('9200921', 'online', to_date('2010/01/25','yyyy/mm/dd'), 1);
insert into t1 (phone, state, state_mod_date, service_id) values ('9200921', 'online', to_date('2010/03/12','yyyy/mm/dd'), 3);
insert into t1 (phone, state, state_mod_date, service_id) values ('9200921', 'online', to_date('2010/04/21','yyyy/mm/dd'), 7);
insert into t1 (phone, state, state_mod_date, service_id) values ('9200921', 'offline', to_date('2010/09/11','yyyy/mm/dd'), 3);
insert into t1 (phone, state, state_mod_date, service_id) values ('9760293', 'online', to_date('2010/01/02','yyyy/mm/dd'), 1);
commit;
select * from t1;
*/
--Hope this is what you want
select phone, service_id, start_date, end_date
from (
select phone, service_id, start_date, last_value(end_date) over (partition by phone, service_id) as end_date
from (
select
phone,
service_id,
case when state='online' then state_mod_date else null end as start_date,
case when state='offline' then state_mod_date else null end as end_date,
state
from t1
order by phone, service_id, start_date
)
)
where start_date is not null
order by phone, service_id
to_date function
Nitesh Jyotirmay, April 14, 2011 - 6:16 am UTC
SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'DD:MM:YYYY HH24:MI:SS';
Session altered.
SQL> SELECT SYSDATE,to_date('13:14:00', 'HH24:MI:SS') FROM DUAL;
SYSDATE to_date('13:14:00',
------------------- -------------------
14:04:2011 11:17:43 01:04:2011 13:14:00
why the to date function gives the output like that.
i am using the Oracle 9i
April 14, 2011 - 10:00 am UTC
because you told us to.
you are running a query that selected two dates out. When you select a date in sqlplus, it converts it to a string for display. It uses your NLS date format for that.
If you want to control what the STRING representation of a date looks like - you would have to TO_CHAR it with a format - the format being the format you want.
TO_DATE with only TIME elements in the string
Kim Berg Hansen, April 14, 2011 - 10:23 am UTC
Hi, Tom
A followup to the review from Nitesh above.
Nitesh did this:
SQL> SELECT SYSDATE,to_date('13:14:00', 'HH24:MI:SS') FROM DUAL;
SYSDATE to_date('13:14:00',
------------------- -------------------
14:04:2011 11:17:43 01:04:2011 13:14:00
That is, he used TO_DATE on a string that only contains TIME elements. Since he did not specify any day, month or year, Oracle will have to "make up" something. Just as if you use TO_DATE only with DD-MM-YYYY elements, Oracle "makes up" that this must be midnight 00:00:00 hour.
The curious question is why Oracle picks the "first day of the current month"? It could have just as well have picked "today" or "January first Year 1"?
(Maybe Nitesh should consider using an INTERVAL datatype if he is only concerned with a time period? :-)
Anyway - just curious :-)
April 14, 2011 - 10:27 am UTC
because we documented it to be that way.
If you do not specify a year, the year is the current year.
If you do not specify a month, the month is the current month.
If you do not specify a day, the day is 1.
Thanks Tom
Kim Berg Hansen, April 14, 2011 - 10:33 am UTC
Thanks - I just couldn't find that bit in the documentation when I searched :-)
OK Great :-)
Kim Berg Hansen, April 14, 2011 - 11:42 am UTC
I was looking in the documentation of TO_DATE and in the "date formats" and those parts of the documentation. It did not occur to me to look at the datatype documentation :-)
Thanks for your patience...
April 14, 2011 - 5:25 pm UTC
I'll admit, it is a bit odd to be there - since it is really to_date related more so than data type related.
SQL query needs to be modified
A reader, April 17, 2011 - 1:02 pm UTC
SQL query needs to be modified according to the parameters for the date range provided.
Question: the below query is for sales compensation based on the date range. I have to modify it so it will change the dates automatically; can anyone please tell me how to go about it?
--Use following query for total tab of excel; to make sure for the current three months, change OSR DATE FOR THAT
select decode(to_char(osr_date),
'12/10/2010 00:00:00', '12/2010',
'12/22/2010 00:00:00', '12/2010',
'01/04/2011 00:00:00', '12/2010',
'01/10/2011 00:00:00', '01/2011',
'01/22/2011 00:00:00', '01/2011',
'02/04/2011 00:00:00', '01/2011',
'02/10/2011 00:00:00', '02/2011',
'02/22/2011 00:00:00', '02/2011',
'03/04/2011 00:00:00', '02/2011',
'03/10/2011 00:00:00', '03/2011',
'03/22/2011 00:00:00', '03/2011',
'04/04/2011 00:00:00', '03/2011',
'1') osr_run,
lst_nm,sum(less_usage) "LESS_USAGE"
from ( select a.LST_NM , round(sum(comp_amt),2)less_usage,
decode (to_char(trunc(a.end_dt)),
'12/10/2010 00:00:00', '1',
'12/22/2010 00:00:00', '1',
'01/04/2011 00:00:00', '1',
'01/10/2011 00:00:00', '2',
'01/22/2011 00:00:00', '2',
'02/04/2011 00:00:00', '2',
'02/10/2011 00:00:00', '3',
'02/22/2011 00:00:00', '3',
'03/04/2011 00:00:00', '3',
'03/10/2011 00:00:00', '4',
'03/22/2011 00:00:00', '4',
'04/04/2011 00:00:00', '4',
to_char(a.end_dt)) osr_run,
to_char(trunc(a.end_dt))osr_date
from amk_arm_comp_2011 a
where lst_nm <>'House Account - CRM '
and a.end_dt between to_date('12/10/2010','mm/dd/yyyy') and to_date('04/04/2011','mm/dd/yyyy') --update each month by 1. 4 cycle month range for ARMS.
and a.sub_id <>2002003611
group by a.LST_NM, decode (to_char(trunc(a.end_dt)),
'12/10/2010 00:00:00', '1',
'12/22/2010 00:00:00', '1',
'01/04/2011 00:00:00', '1',
'01/10/2011 00:00:00', '2',
'01/22/2011 00:00:00', '2',
'02/04/2011 00:00:00', '2',
'02/10/2011 00:00:00', '3',
'02/22/2011 00:00:00', '3',
'03/04/2011 00:00:00', '3',
'03/10/2011 00:00:00', '4',
'03/22/2011 00:00:00', '4',
'04/04/2011 00:00:00', '4',
to_char(a.end_dt)),
to_char(trunc(a.end_dt))
)
group by decode(to_char(osr_date),
'12/10/2010 00:00:00', '12/2010',
'12/22/2010 00:00:00', '12/2010',
'01/04/2011 00:00:00', '12/2010',
'01/10/2011 00:00:00', '01/2011',
'01/22/2011 00:00:00', '01/2011',
'02/04/2011 00:00:00', '01/2011',
'02/10/2011 00:00:00', '02/2011',
'02/22/2011 00:00:00', '02/2011',
'03/04/2011 00:00:00', '02/2011',
'03/10/2011 00:00:00', '03/2011',
'03/22/2011 00:00:00', '03/2011',
'04/04/2011 00:00:00', '03/2011',
'1'), lst_nm
order by lst_nm, osr_run desc;
April 18, 2011 - 10:27 am UTC
You'll need to be a tad more clear on this
"i have to modify it so automatically it will change the date"
I presume you mean you'd like what are now hard coded dates to be replaced with some sort of function that figures out what the dates should be. BUT in order to do that - you would sort of need to fill us in on WHAT THE RULES ARE - what the LOGIC is.
I'm looking at what appear to be a bunch of random dates to me. Consider:
ops$tkyte%ORA11GR2> select dt, lag(dt) over (order by dt) ,
2 dt-lag(dt) over (order by dt)
3 from
4 (select to_date( '12/10/2010 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
5 select to_date( '12/22/2010 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
6 select to_date( '01/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
7 select to_date( '01/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
8 select to_date( '01/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
9 select to_date( '02/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
10 select to_date( '02/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
11 select to_date( '02/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
12 select to_date( '03/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
13 select to_date( '03/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
14 select to_date( '03/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
15 select to_date( '04/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual)
16 /
DT LAG(DT)OV DT-LAG(DT)OVER(ORDERBYDT)
--------- --------- -------------------------
10-DEC-10
22-DEC-10 10-DEC-10 12
04-JAN-11 22-DEC-10 13
10-JAN-11 04-JAN-11 6
22-JAN-11 10-JAN-11 12
04-FEB-11 22-JAN-11 13
10-FEB-11 04-FEB-11 6
22-FEB-11 10-FEB-11 12
04-MAR-11 22-FEB-11 10
10-MAR-11 04-MAR-11 6
22-MAR-11 10-MAR-11 12
04-APR-11 22-MAR-11 13
12 rows selected.
what is the "logic" behind these dates that are 13, 12, sometimes 6, other times 10 days apart???!?!?!?!?
I will say - DO NOT DO NOT DO NOT code this:
(to_char(trunc(a.end_dt)),
'12/10/2010 00:00:00', '1',...
DO code DO code DO code this:
(to_char(trunc(a.end_dt)),
to_date('12/10/2010 00:00:00','mm/dd/yyyy hh24:mi:ss'), '1', ....
NEVER NEVER NEVER rely on DEFAULTS.
Also - do not DO NOT do not code this:
decode (to_char(trunc(a.end_dt)),
'12/10/2010 00:00:00', '1',
you are once again relying on IMPLICIT CONVERSIONS - do not do that - ALWAYS (repeat ALWAYS) use an explicit format!@!!!!!!!!!!!!!!!!
a reader, April 18, 2011 - 1:19 pm UTC
The dates on the 10th, 22nd and 4th (of the following month) are considered one "cycle".. so
12/10/2010, 12/22/2010 & 04/04/2011 fall under Dec-10 cycle. 01/10/2011, 01/22/2011 &
02/04/2011 fall under Jan-11 cycle... & so on. These cycle dates are being inserted from another table.
What I'm looking to do is get rid of the hard-coded values from the 2 sets of the decode.
So I want to run this SQL for a given date range, say 12/05/2010 to 04/05/2011; it should pick up all "cycle" dates and name
them with the right month in order to figure out commissions for the sales rep.
Believe me, I could not agree more with you on the implicit conversions. But this is what we are dealing with for now.
April 18, 2011 - 1:51 pm UTC
keep going, explain the number system and dates in these too:
'12/10/2010 00:00:00', '12/2010',
'12/22/2010 00:00:00', '12/2010',
'01/04/2011 00:00:00', '12/2010',
'01/10/2011 00:00:00', '01/2011',
'01/22/2011 00:00:00', '01/2011',
'02/04/2011 00:00:00', '01/2011',
'02/10/2011 00:00:00', '02/2011',
'02/22/2011 00:00:00', '02/2011',
'03/04/2011 00:00:00', '02/2011',
'03/10/2011 00:00:00', '03/2011',
'03/22/2011 00:00:00', '03/2011',
'04/04/2011 00:00:00', '03/2011',
'1') osr_run,
lst_nm,sum(less_usage) "LESS_USAGE"
from ( select a.LST_NM , round(sum(comp_amt),2)less_usage,
decode (to_char(trunc(a.end_dt)),
'12/10/2010 00:00:00', '1',
'12/22/2010 00:00:00', '1',
'01/04/2011 00:00:00', '1',
'01/10/2011 00:00:00', '2',
'01/22/2011 00:00:00', '2',
'02/04/2011 00:00:00', '2',
'02/10/2011 00:00:00', '3',
'02/22/2011 00:00:00', '3',
'03/04/2011 00:00:00', '3',
'03/10/2011 00:00:00', '4',
'03/22/2011 00:00:00', '4',
'04/04/2011 00:00:00', '4',
to_char(a.end_dt)) osr_run,
o is the first week of a month always counted toward the prior month?
o are the numbers always 111,222,333,444 in order from first date to last date or are they assigned based on the date or what?
... Believe me, I could not agree more with you on the implicit conversions. But this is what we are dealing with for now. ...
you are obviously in the midst of fixing things - this should have been the first fix!
A reader, April 27, 2011 - 12:53 pm UTC
Yes. The first week of a month is always to be for the last month & the numbers are in order (111,222,333) from first date to last date.
The result of the above query produces results in this fashion -
OSR_RUN LST_NM COMP_AMT
Dec-10 Jennings 495129.66
Jan-11 Jennings 484153.76
Feb-11 Jennings 472022.35
Mar-11 Jennings 481284.15
Dec-10 Parker 512221.55
Jan-11 Parker 519161.53
Feb-11 Parker 518672.03
Mar-11 Parker 518291.82
Dec-10 Scott 473346.21
Jan-11 Scott 478488.66
Feb-11 Scott 483854.59
Mar-11 Scott 480323.2
Dec-10 Spencer 501227.09
Jan-11 Spencer 512199.9
Feb-11 Spencer 511897.21
Mar-11 Spencer 506145.24
April 27, 2011 - 1:55 pm UTC
so, to be clear, the first three lowest dates are assigned one (even if they are not in the same time period - eg, if the first three dates we hit are:
22-dec, 04-jan, 10-jan
they will be assigned one?
A reader, April 28, 2011 - 9:54 am UTC
The first three dates we should be hitting are -
10-dec, 22-dec, 04-jan. These will be assigned 1.
10-jan, 22-jan, 04-feb. These will be assigned 2.
10-feb, 22-feb, 04-mar. These will be assigned 3.
April 28, 2011 - 10:20 am UTC
so, you are saying there is an implied constraint here that your input date ranges will ALWAYS be on the tenth of some month.
Ok, what if the date range I put in is
10-JUN-2010 through 04-DEC-2010
does june get one?
In any case - use whichever function below that works for you:
ops$tkyte%ORA11GR2> variable start_date varchar2(20)
ops$tkyte%ORA11GR2> variable end_date varchar2(20)
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec :start_date := '10-dec-2010'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec :end_date := '04-apr-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> with four as
2 (select decode(level,1,4,2,22,10) dy from dual connect by level <= 3),
3 twelve as
4 (select level mm from dual connect by level <= 12),
5 one_year as
6 (select to_date( dy||'/'||mm, 'dd/mm' ) dt from four, twelve),
7 days as
8 (select decode( l, 1, dt, add_months(dt,-12) ) dt
9 from one_year, (select level l from dual connect by level <= 2)
10 )
11 select dt,
12 to_char( dt-5, 'mm/yyyy' ),
13 to_char( add_months(dt-5,1), 'fmmm' ) first,
14 to_char( add_months(dt-5,1), 'fmmm' )-to_char( add_months(to_date(:start_date, 'dd-mon-yyyy')-5,1),'fmmm') + 1
15 second
16 from days
17 where dt between to_date( :start_date, 'dd-mon-yyyy' ) and to_date( :end_date, 'dd-mon-yyyy' )
18 order by dt;
DT TO_CHAR FI SECOND
--------- ------- -- ----------
10-DEC-10 12/2010 1 1
22-DEC-10 12/2010 1 1
04-JAN-11 12/2010 1 1
10-JAN-11 01/2011 2 2
22-JAN-11 01/2011 2 2
04-FEB-11 01/2011 2 2
10-FEB-11 02/2011 3 3
22-FEB-11 02/2011 3 3
04-MAR-11 02/2011 3 3
10-MAR-11 03/2011 4 4
22-MAR-11 03/2011 4 4
04-APR-11 03/2011 4 4
12 rows selected.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec :start_date := '10-feb-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec :end_date := '04-jul-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> /
DT TO_CHAR FI SECOND
--------- ------- -- ----------
10-FEB-11 02/2011 3 1
22-FEB-11 02/2011 3 1
04-MAR-11 02/2011 3 1
10-MAR-11 03/2011 4 2
22-MAR-11 03/2011 4 2
04-APR-11 03/2011 4 2
10-APR-11 04/2011 5 3
22-APR-11 04/2011 5 3
04-MAY-11 04/2011 5 3
10-MAY-11 05/2011 6 4
22-MAY-11 05/2011 6 4
04-JUN-11 05/2011 6 4
10-JUN-11 06/2011 7 5
22-JUN-11 06/2011 7 5
04-JUL-11 06/2011 7 5
15 rows selected.
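The key observation in the query above is that subtracting 5 days maps each cycle date onto a day inside its cycle month: the 10th and 22nd stay in the same month, and the 4th of the following month slides back into the prior month. A procedural sketch of just that mapping (Python, illustrative only):

```python
from datetime import date, timedelta

def cycle_month(dt):
    """Map a cycle date (the 10th, the 22nd, or the 4th of the following
    month) to its cycle month, via the dt - 5 days trick used above."""
    shifted = dt - timedelta(days=5)
    return f"{shifted.month:02d}/{shifted.year}"

print(cycle_month(date(2010, 12, 10)))  # 12/2010
print(cycle_month(date(2011, 1, 4)))    # 12/2010
print(cycle_month(date(2011, 1, 10)))   # 01/2011
```

With that function in hand there is nothing to hard-code: any cycle date in any range resolves to its month label, and the 1/2/3/4 run numbers fall out of the month offsets from the start date.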
Divide into 3 groups
A reader, May 03, 2011 - 2:59 pm UTC
Table has following data
TEAM member
team1 2
team2 5
team3 3
team4 1
team5 10
team6 4
team7 38
...
How to divide these teams relatively evenly into 3 groups?
May 04, 2011 - 1:45 pm UTC
read about ntile() or just simply add a column "mod(rownum,3) grp" to your select list.
I would have given you an example, but you didn't give us a create table/insert intos to work with!
Divide into 3 groups
A reader, May 05, 2011 - 8:49 am UTC
Tom,
I have tried with mod(rownum,3). But the result is not that even.
create table t (team varchar2(8), member_num number, group_no number);
insert into t values ('team1',3, null);
insert into t values ('team2',5, null);
insert into t values ('team3',8, null);
insert into t values ('team4',9, null);
insert into t values ('team5',2, null);
insert into t values ('team6',7, null);
insert into t values ('team7',4, null);
insert into t values ('team8',3, null);
insert into t values ('team9',3, null);
insert into t values ('team10',25, null);
insert into t values ('team11',33, null);
commit;
update t
set t.group_no = (select r.rg from (select replace(mod(rownum,3),0,3) rg, team
from t order by member_num) r
where t.team = r.team
);
SQL> select group_no, sum(member_num) from t group by group_no;
GROUP_NO SUM(MEMBER_NUM)
---------- ---------------
1 41
2 43
3 18
Thanks.
May 06, 2011 - 10:06 am UTC
When you said "How to divide these teams relatively evenly into 3 groups?", what immediately pops into my mind?
Well, to break them into three groups of equal sizes....
I see 11 teams. I would put
1,4,7,10 into a group of 4
2,5,8,11 into a group of 4
3,6,9 into a group of 3
there I have three groups of equal sizes.
You did not say anything like:
given that member_num column, can you split the values in the above list up into 3 groups such that the sum of member_num within each group is about the same magnitude.
Which is what you are apparently asking to do.
That is a 'bin fitting' type of problem and does not lend itself easily to SQL based solutions.
see
http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22bin+fitting%22 for past discussions on this
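For completeness, a common heuristic for this kind of bin-fitting problem is greedy assignment: walk the teams from largest to smallest, always placing the next one into the group with the smallest running total. This is only a procedural sketch (Python) and is not guaranteed optimal, but it usually lands close on data like the poster's:

```python
def greedy_split(teams, n_groups=3):
    """Greedy bin-fitting heuristic: place each team, largest first,
    into the group with the smallest running total."""
    groups = [{"total": 0, "teams": []} for _ in range(n_groups)]
    for name, size in sorted(teams, key=lambda t: -t[1]):
        g = min(groups, key=lambda grp: grp["total"])
        g["teams"].append(name)
        g["total"] += size
    return groups

teams = [("team1", 3), ("team2", 5), ("team3", 8), ("team4", 9),
         ("team5", 2), ("team6", 7), ("team7", 4), ("team8", 3),
         ("team9", 3), ("team10", 25), ("team11", 33)]
for g in greedy_split(teams):
    print(g["total"], g["teams"])
```

On the data above this produces group totals of 33, 35 and 34 - far more even than the 41/43/18 split the mod(rownum,3) approach gave, because the two outlier teams (33 and 25 members) anchor separate groups first.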
Similar to the topic
Yar, June 01, 2011 - 4:17 am UTC
CREATE TABLE GRTAB
(
ID NUMBER,
INS_DATE DATE
)
/
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-25 23:13:32','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 02:14:19','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 04:15:30','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 05:14:31','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 07:15:19','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 10:15:50','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 13:44:46','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 15:14:54','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 16:15:01','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 17:14:38','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 19:15:36','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(2,TO_DATE('2011-05-30 11:30:17','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(2,TO_DATE('2011-05-30 14:30:22','YYYY-MM-DD HH24:MI:SS'));
I'd like to group INS_DATE every 6 hours from the first INS_DATE of every ID. And on each group, set the BASE_DATE on the first ins_date per group, like:
ID INS_DATE BASE_DATE
-- ------------------- -------------------
1 2011-05-25 23:13:32 2011-05-25 23:13:32
1 2011-05-26 02:14:19 2011-05-25 23:13:32
1 2011-05-26 04:15:30 2011-05-25 23:13:32
1 2011-05-26 05:14:31 2011-05-26 05:14:31
1 2011-05-26 07:15:19 2011-05-26 05:14:31
1 2011-05-26 10:15:50 2011-05-26 05:14:31
1 2011-05-26 13:44:46 2011-05-26 13:44:46
1 2011-05-26 15:14:54 2011-05-26 13:44:46
1 2011-05-26 16:15:01 2011-05-26 13:44:46
1 2011-05-26 17:14:38 2011-05-26 13:44:46
1 2011-05-26 19:15:36 2011-05-26 13:44:46
2 2011-05-30 11:30:17 2011-05-30 11:30:17
2 2011-05-30 14:30:22 2011-05-30 11:30:17
I've been struggling with this for a while; any help is appreciated.
June 01, 2011 - 2:29 pm UTC
please look at the other place you already asked this and please do not do that anymore, just ask in ONE PLACE
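For readers who land here rather than on the duplicate: the "new base every 6 hours from the first INS_DATE" grouping needs iterative logic, since each row's group depends on where the previous group started. A sketch using the MODEL clause (the same technique Stew Ashton applies to a similar problem further down this page; assumes the GRTAB table posted above):

```sql
select id, ins_date, base_date
  from (select id, ins_date, ins_date base_date from grtab)
 model
   partition by (id)
   dimension by (row_number() over (partition by id
                                    order by ins_date) rn)
   measures (ins_date, base_date)
   rules (
     -- walk the rows in order: keep the current base while we are
     -- within 6 hours of it, otherwise this row starts a new base
     base_date[rn > 1] order by rn =
       case when ins_date[cv()] < base_date[cv()-1] + 6/24
            then base_date[cv()-1]
            else ins_date[cv()]
       end
   )
 order by id, ins_date;
```

Against the posted data this should reproduce the requested BASE_DATE column; the first row of each partition keeps its own INS_DATE as the base.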
Lead/lag or connect by prior..
Jayadevan, June 08, 2011 - 3:55 am UTC
Hello Tom,
I have the following scenario -
CREATE TABLE "HR"."ACTIVITY"
( "ACTIVITY_ID" NUMBER(*,0),
"PERSON_ID" NUMBER(*,0),
"START_DATE" DATE,
"END_DATE" DATE,
"ACTIVITY_NAME" VARCHAR2(10 BYTE)
)
Sample data -
Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (1,1,to_date('08-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'LEAVE');
Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (2,1,to_date('06-JUN-11','DD-MON-RR'),to_date('07-JUN-11','DD-MON-RR'),'LEAVE');
Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (3,1,to_date('01-JUN-11','DD-MON-RR'),to_date('04-JUN-11','DD-MON-RR'),'LEAVE');
select * from activity order by start_date;
ACTIVITY_ID PERSON_ID START_DAT END_DATE ACTIVITY_N
----------- ---------- --------- --------- ----------
3 1 01-JUN-11 04-JUN-11 LEAVE
2 1 06-JUN-11 07-JUN-11 LEAVE
1 1 08-JUN-11 18-JUN-11 LEAVE
I will get one date as input - say sysdate. Now I have to get the activity a person is assigned for that date (i.e. start date <= sysdate and end_date>= sysdate) and walk back through all the data till there is a blank date, and walk forward till there is a blank date, and display all those records.
For example, if the user enters '10-JUN-2011' as input, activity_id 1 is the 'current' activity. Then I step back to the previous activity (activity_id 2, with end date 07-JUN). I step back again - now the end date of the previous activity is 04-JUN. This implies a break, so I stop there and stop going back. Similarly, I have to go forward. I hope the situation is clear. There is another break condition - if the previous activity is not the same as the current one, that implies a stop. But that should be easy enough if I can figure out the first case. Any suggestions?
June 08, 2011 - 10:42 am UTC
are you doing this for a single person or for a set of people or for all people in the table.
A more interesting test case would have more than one person in it to ensure the answer actually works too.
can a person have more than one activity assigned at the same time?
do the start/end times overlap?
"this implies a break"
why? and what does that mean? how did you know that was a 'break'.
There is another break condition - if previous activity is not the same as current, that implies stop.
if that implied a stop, then we would have stopped on the first record???? it had activity 2, which is not 1.
Answers
Jayadevan, June 08, 2011 - 11:43 pm UTC
Thanks for the reply.
1) are you doing this for a single person or for a set of people or for all people in the table.
This can happen for a number of persons. The maximum number of persons is, as of now, 6000. I am sure it can go up, so we could select up to 6000 persons. Since we have to maintain the data for these persons for a few years, the number of records is already in the millions. We already have a temp table which stores the 'filtered' persons - filtered on various conditions the user will choose (all persons from a location, or all persons with some designation...) - and that is used for first-level filtering. Now we are iterating through a loop to get the 'break', and as the number of persons/activities goes up, the iterations go up - not very scalable, I feel.
A more interesting test case would have more than one person in it to ensure the answer actually works too.
2)can a person have more than one activity assigned at the same time?
No
3) do the start/end times overlap?
Not to the minute/second level. But the end time can be 10 AM today and the next start time can be 11 PM tonight, or something like that.
"this implies a break"
why? and what does that mean? how did you know that was a 'break'.
If a person was not assigned any activity, we do not have to display what he was doing before that day.
There is another break condition - if previous activity is not the same as current, that implies stop.
4)if that implied a stop, then we would have stopped on the first record???? it had activity 2, which is not 1.
I mean the type of activity - which is 'LEAVE' in all the cases here. If the activity had been, say, 'TRAINING', then yes, we would not have taken the record with activity_id 2 (i.e. we would take only activity 1). Activity 2's name would be different, so we would not take it and would not traverse further.
I hope it is clear.
The column activity_id is just a column I created as PK. It is the activity_name that I need to check for 'is it different or same'.
June 09, 2011 - 9:56 am UTC
provide me, as requested, a better test case.
Give me an employee that has a date based break.
Give me another one that has an activity based break.
Give me another that has no breaks - all data would be returned.
Give me one that has both a date based break and an activity based break.
Make your test case actually test all of the different cases
SQL Queries
Raghu, June 10, 2011 - 5:08 am UTC
Hi Tom,
I am an Oracle developer and in my work I rarely get requirements to write complex SQL queries where I can make use of analytic functions and other features.
So I was wondering if you had any set of questions (On EMP/DEPT tables), which would test our query writing ability.
Or what queries you might ask in an interview.
There is a daily PL/SQL quiz happening; I wish we had a SQL quiz too.
Thanks for all the help, I am a big fan of yours.
Activities and breaks
Jayadevan, June 11, 2011 - 1:05 am UTC
Hello Tom,
Apologies for not providing enough sample data...
CREATE TABLE "ACTIVITY" ("ACTIVITY_ID" NUMBER(*,0), "PERSON_ID" NUMBER(*,0), "START_DATE" DATE, "END_DATE" DATE, "ACTIVITY_NAME" VARCHAR2(10))
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (4,1,to_date('19-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'TRAINING');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (2,1,to_date('05-JUN-11','DD-MON-RR'),to_date('07-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (3,1,to_date('01-JUN-11','DD-MON-RR'),to_date('04-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (5,1,to_date('27-MAY-11','DD-MON-RR'),to_date('29-MAY-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (1,1,to_date('08-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (14,2,to_date('01-JUN-11','DD-MON-RR'),to_date('02-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (12,2,to_date('03-JUN-11','DD-MON-RR'),to_date('06-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (13,2,to_date('09-JUN-11','DD-MON-RR'),to_date('10-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (15,2,to_date('11-JUN-11','DD-MON-RR'),to_date('12-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (11,2,to_date('07-JUN-11','DD-MON-RR'),to_date('08-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (20,3,to_date('07-JUN-11','DD-MON-RR'),to_date('09-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (21,3,to_date('13-JUN-11','DD-MON-RR'),to_date('19-JUN-11','DD-MON-RR'),'TRAINING');
select * from activity order by person_id, start_date, end_date ;
5 1 27-MAY-11 29-MAY-11 LEAVE
3 1 01-JUN-11 04-JUN-11 LEAVE
2 1 05-JUN-11 07-JUN-11 LEAVE
1 1 08-JUN-11 18-JUN-11 LEAVE
4 1 19-JUN-11 18-JUN-11 TRAINING
14 2 01-JUN-11 02-JUN-11 GROUND
12 2 03-JUN-11 06-JUN-11 GROUND
11 2 07-JUN-11 08-JUN-11 GROUND
13 2 09-JUN-11 10-JUN-11 GROUND
15 2 11-JUN-11 12-JUN-11 GROUND
20 3 07-JUN-11 09-JUN-11 LEAVE
21 3 13-JUN-11 19-JUN-11 TRAINING
The input filter is '07-JUN-11'.
For pers_id = 1 , I have to go back till activity_id 3. The start_date of the previous activity (id = 5) activity is not the same as the end_date of activity_id 5. So we stop there. This is a date-based break. While going forward, I have to go till activity_id 1. The activity_name for activity_id 4 is 'TRAINING', which is not the same as the ACTIVITY_NAME for person_id for the date '07-JUN-11'. This is a change of activity-based break.
For pers_id 3, there is an activity and date based_break on activity_id 21 and we stop there (actually, if one break condition is met, we do not have to check for the other condition)
For pers_id 2, I have to fetch all the records - going back from '07-JUN-11' and forward from '07-JUN-11', there is no 'break' as far as one activity's end_date and the next activity's start_dates are concerned. The activity names are also same. In the end, the output I have to provide is
PERSON_ID START_DAT END_DATE ACTIVITY_NAME
---------- --------- --------- ----------
1 01-JUN-11 18-JUN-11 LEAVE
2 01-JUN-11 12-JUN-11 GROUND
3 07-JUN-11 09-JUN-11 LEAVE
Correction
Jayadevan, June 11, 2011 - 1:14 am UTC
A correction in the sentence
"The start_date of the previous activity (id = 5) activity is not the same as the end_date of activity_id 5. So we stop there. "
It should be
"The end_date of the previous activity (id = 5) is not the same as the start_date of activity_id 3. It is not the previous day relative to the start_date of activity 3 either, so we stop there." In effect, there is some gap between the assignments - there is one or more days when the employee is not assigned any activity. That is a break.
@Jayadevan: Generic Analytic Solution
Brendan, June 11, 2011 - 8:06 am UTC
The requirement is expressed rather procedurally in terms of going forward and back from a given record, then stopping at the break points. If you want a solution by analytics, I would view it rather as a simple extension of a general class of problems where you want to group records primarily by contiguity of ranges (usually date ranges). I wondered whether it might not be interesting to present the class generically, propose a generic analytic solution, then apply it to your specific case.
I would describe the class for your problem as follows. Consider the fields in a record set to divide into the following categories:
key - partition by fields
range_start, range_end - range fields
attr_brk - break fields
attr_oth - any other fields
The problem is then to obtain for each record a group_start, group_end pair that are the range_start and range_end values for the records that respectively start and end the break group of the current record. The records are to be ordered by range_start, range_end within the partitioning key, and a new break group starts when, between successive records, either there is a gap between range_end and range_start fields, or any of the attr_brk fields change value. In this case no overlaps are allowed in the ranges within a key.
It's worth noting that if you drop the break fields overlapping ranges can easily be accommodated, making a related second class that can be solved by similar methods.
I have written a kind of pseudo-SQL solution for both classes, and have also implemented them for your particular table, with my own test sets. I could post these if there is any interest, and TK has no objection. However, I will just state the approach used for now to keep it brief.
1. Within an inline view, use Lag and Lead functions with CASE expressions to set group_start and group_end values on the respective start and end records of the break groups, leaving other values null.
2. Select all the original fields from the inline view, as well as the new fields within First_Value, Last_Value functions with the IGNORE NULLS option
3. The output from step 2 solves the problem as defined, but if necessary it can be used within another inline view to restrict the output to certain groups only (e.g. a 'current' group)
Correction
Brendan, June 12, 2011 - 12:45 pm UTC
The second class actually needs an additional prior step. I've put an article on Scribd on this.
Query
Jayadevan, June 13, 2011 - 9:09 am UTC
Hello,
If you do have a working query, please do share it. I hope some amount of filtering/elimination will happen in the first query that gets executed, since the table in question has millions of records. I think Tom too would be interested in seeing that query. I considered a simple lead/lag inner query with filters applied in the outer query, but that would mean a full scan of this huge table, and hence I thought I should seek Tom's help.
Regards,
Jayadevan
Code as requested
Brendan, June 14, 2011 - 2:33 am UTC
Hi Jayadevan
Here is the code:
SELECT /* NO_OVERLAP */
person_id, start_date, end_date, activity_name, activity_id id,
Last_Value (group_start IGNORE NULLS) OVER (PARTITION BY person_id ORDER BY start_date) group_start,
First_Value (group_end IGNORE NULLS) OVER (PARTITION BY person_id ORDER BY start_date RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) group_end
FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id,
CASE WHEN (start_date > Nvl (Lag (end_date) OVER (PARTITION BY person_id ORDER BY start_date), start_date-1)) OR
(activity_name != Lag (activity_name) OVER (PARTITION BY person_id ORDER BY start_date)) THEN start_date END group_start,
CASE WHEN (Nvl (Lead (start_date) OVER (PARTITION BY person_id ORDER BY start_date), end_date+1) > end_date) OR
(activity_name != Lead (activity_name) OVER (PARTITION BY person_id ORDER BY start_date)) THEN end_date END group_end
FROM activity
WHERE person_id IN (1, 2)
)
ORDER BY person_id, start_date
The shape of the data matters as much as the size - is it wide and shallow or deep and narrow for example, in terms of persons and time. I've only tested on 30 records. For wide and shallow, it may be efficient. However, if you have a very deep data set but with only a few records in the 'current' group you might be better using a more procedural approach. Assuming you've indexed start and end times, you might 'anchor' from the current records, then execute indexed selects to pass from one record to the adjacent until you hit a break. You could also use the method mentioned in the recent thread on '3 second...' where a recursive subquery approach is mentioned, you'd have to put it in a union to go both ways I think - I'd have tried that myself but have not had time to do it. Alternatively, within the analytic approach, would it help to do an 'Exists current record' subquery? I don't know your data so I can't say.
re: Activities and breaks
Stew Ashton, June 14, 2011 - 11:13 am UTC
This seems like a job for procedural logic. Analytics let you look forward and backward in the data you are selecting, but they don't let you look backward in the result set you are creating. Brendan used the MODEL clause in another thread and I think it does the trick here. The MODEL clause carries the start date forward as long as there is not a "break", then the GROUP BY clause gives us one record with the latest end date:
> select person_id, start_date, max(end_date) end_date, activity_name from (
SELECT person_id, start_date, end_date, activity_name
FROM activity
MODEL
PARTITION BY (person_id)
DIMENSION BY (Row_Number() OVER (PARTITION BY person_id ORDER BY start_date) rn)
MEASURES (start_date, end_date, activity_name)
RULES (
start_date[rn>1] order by rn =
case when end_date[cv()-1] >= start_date[cv()]-1
and activity_name[cv()-1] = activity_name[cv()]
then start_date[cv()-1]
else start_date[cv()] end
)
)
where to_date('07-JUN-11', 'DD-MON-RR') between start_date and end_date
group by person_id, start_date, activity_name
order by 1,2,4
/
PERSON_ID START_DATE END_DATE ACTIVITY_NAME
---------- ------------ ---------- -------------
1 01-JUN-11 18-JUN-11 LEAVE
2 01-JUN-11 12-JUN-11 GROUND
3 07-JUN-11 09-JUN-11 LEAVE
Recursive Solution
Brendan, June 14, 2011 - 5:56 pm UTC
Some analytics sneaked in though, basic testing done...
WITH rsq_f (person_id, start_date, end_date, activity_name, activity_id) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM activity
WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
AND person_id IN (1, 2)
UNION ALL
SELECT act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id
FROM rsq_f
JOIN activity act
ON act.start_date = rsq_f.end_date
AND act.person_id = rsq_f.person_id
AND act.activity_name = rsq_f.activity_name
AND act.person_id IN (1, 2)
), rsq_b (person_id, start_date, end_date, activity_name, activity_id) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM activity
WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
AND person_id IN (1, 2)
UNION ALL
SELECT act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id
FROM rsq_b
JOIN activity act
ON act.end_date = rsq_b.start_date
AND act.person_id = rsq_b.person_id
AND act.activity_name = rsq_b.activity_name
)
SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
Min (start_date) OVER (PARTITION BY person_id) grp_start, Max (end_date) OVER (PARTITION BY person_id) grp_end
FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM rsq_b
UNION
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM rsq_f
)
ORDER BY person_id, start_date
/
PERSON_ID START_DAT END_DATE ACTIVITY_N ACTIVITY_ID GRP_START GRP_END
---------- --------- --------- ---------- ----------- --------- ---------
1 08-JUN-11 09-JUN-11 LEAVE 4 08-JUN-11 14-JUN-11
09-JUN-11 14-JUN-11 LEAVE 5 08-JUN-11 14-JUN-11
2 09-JUN-11 14-JUN-11 TRAINING 11 09-JUN-11 14-JUN-11
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | WINDOW SORT | | 8 | 408 | 2 (100)| 00:00:01 |
| 2 | VIEW | | 8 | 408 | 2 (100)| 00:00:01 |
| 3 | SORT UNIQUE | | 8 | 408 | 2 (100)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | VIEW | | 4 | 204 | 0 (0)| |
| 6 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
| 7 | INLIST ITERATOR | | | | | |
|* 8 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 2 | 102 | 0 (0)| |
|* 9 | INDEX RANGE SCAN | ACTIVITY_U1 | 1 | | 0 (0)| |
| 10 | NESTED LOOPS | | | | | |
| 11 | NESTED LOOPS | | 2 | 160 | 0 (0)| |
| 12 | RECURSIVE WITH PUMP | | | | | |
|* 13 | INDEX UNIQUE SCAN | ACTIVITY_U2 | 1 | | 0 (0)| |
|* 14 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 1 | 51 | 0 (0)| |
| 15 | VIEW | | 4 | 204 | 0 (0)| |
| 16 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
| 17 | INLIST ITERATOR | | | | | |
|* 18 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 2 | 102 | 0 (0)| |
|* 19 | INDEX RANGE SCAN | ACTIVITY_U1 | 1 | | 0 (0)| |
| 20 | NESTED LOOPS | | | | | |
| 21 | NESTED LOOPS | | 2 | 160 | 0 (0)| |
| 22 | RECURSIVE WITH PUMP | | | | | |
|* 23 | INDEX UNIQUE SCAN | ACTIVITY_U1 | 1 | | 0 (0)| |
|* 24 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 1 | 51 | 0 (0)| |
-------------------------------------------------------------------------------------------------------------
Recursion Refined
Brendan, June 15, 2011 - 1:57 am UTC
This version might be more efficient, it does half the table accesses...
DEFINE TODAY='11-JUN-2011'
PROMPT Recursive Subquery (OR) for '&TODAY'
WITH act AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM activity
WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
AND person_id IN (1, 2)
), rsq (person_id, start_date, end_date, activity_name, activity_id, go_back) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id, go_back
FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id, 'N' go_back
FROM act
UNION ALL
SELECT person_id, start_date, end_date, activity_name, activity_id, 'Y'
FROM act
)
UNION ALL
SELECT act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id, CASE WHEN act.start_date = rsq.end_date THEN 'N' ELSE 'Y' END
FROM rsq
JOIN activity act
ON ((act.start_date = rsq.end_date AND go_back = 'N') OR
(act.end_date = rsq.start_date AND go_back = 'Y'))
AND act.person_id = rsq.person_id
AND act.activity_name = rsq.activity_name
AND act.person_id IN (1, 2)
)
SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
Min (start_date) OVER (PARTITION BY person_id) grp_start, Max (end_date) OVER (PARTITION BY person_id) grp_end
FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM rsq
UNION
SELECT person_id, start_date, end_date, activity_name, activity_id
FROM rsq
)
ORDER BY person_id, start_date
/
------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 14 (100)| |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | | | | | |
| 3 | INLIST ITERATOR | | | | | |
|* 4 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 2 | 102 | 0 (0)| |
|* 5 | INDEX RANGE SCAN | ACTIVITY_U1 | 1 | | 0 (0)| |
| 6 | LOAD AS SELECT | | | | | |
| 7 | UNION ALL (RECURSIVE WITH) BREADTH FIRST| | | | | |
| 8 | VIEW | | 4 | 216 | 4 (0)| 00:00:01 |
| 9 | UNION-ALL | | | | | |
| 10 | VIEW | | 2 | 102 | 2 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_BC16D2 | 2 | 102 | 2 (0)| 00:00:01 |
| 12 | VIEW | | 2 | 102 | 2 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_BC16D2 | 2 | 102 | 2 (0)| 00:00:01 |
| 14 | NESTED LOOPS | | | | | |
| 15 | NESTED LOOPS | | 1 | 92 | 4 (0)| 00:00:01 |
| 16 | RECURSIVE WITH PUMP | | | | | |
|* 17 | INDEX RANGE SCAN | ACTIVITY_U1 | 2 | | 0 (0)| |
|* 18 | TABLE ACCESS BY INDEX ROWID | ACTIVITY | 1 | 51 | 0 (0)| |
| 19 | WINDOW SORT | | 10 | 510 | 6 (34)| 00:00:01 |
| 20 | VIEW | | 10 | 510 | 6 (34)| 00:00:01 |
| 21 | SORT UNIQUE | | 10 | 510 | 6 (67)| 00:00:01 |
| 22 | UNION-ALL | | | | | |
| 23 | VIEW | | 5 | 255 | 2 (0)| 00:00:01 |
| 24 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661B_BC16D2 | 5 | 270 | 2 (0)| 00:00:01 |
| 25 | VIEW | | 5 | 255 | 2 (0)| 00:00:01 |
| 26 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661B_BC16D2 | 5 | 270 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
Output...
Recursive Subquery (OR) for '11-JUN-2011'
old 4: WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
new 4: WHERE '11-JUN-2011' BETWEEN start_date AND Nvl(end_date, '11-JUN-2011')
old 25: SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
new 25: SELECT /* RSQ '11-JUN-2011' */ person_id, start_date, end_date, activity_name, activity_id,
PERSON_ID START_DAT END_DATE ACTIVITY_N ACTIVITY_ID GRP_START GRP_END
---------- --------- --------- ---------- ----------- --------- ---------
1 08-JUN-11 09-JUN-11 LEAVE 4 08-JUN-11 14-JUN-11
09-JUN-11 14-JUN-11 LEAVE 5 08-JUN-11 14-JUN-11
2 09-JUN-11 14-JUN-11 TRAINING 11 09-JUN-11 14-JUN-11
Analytical question
John, June 15, 2011 - 7:24 pm UTC
have a table data
col1 col2
1 2
5 6
8 9
2 3
10 12
3 20
6 25
want to display data as
col1 col2
1 2
2 3
3 20
5 6
6 25
8 9
10 12
Thanks in advance
June 17, 2011 - 1:25 pm UTC
if I had such tables, I could show you how to do that.
But I don't, so I cannot.
If you provide me with a create table and a bunch of inserts, I could
no create
no inserts
no look
I was wrong about "Activities and breaks"
Stew Ashton, June 16, 2011 - 8:58 am UTC
I thought analytics couldn't do this, but I was wrong. In this solution,
Q1 adds a "break" column that is set to 1 when a break occurs.
Q2 does a running total on the "break" column; this assigns each set of lines a rank that increments at each break.
Finally, the main select gets the earliest start date and latest end date for each rank.
This solution is based on
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1594948200346140013#1596748700346101469 By the way, in my tests the MODEL-based solution was somewhat faster :)
with q1 as (
select person_id, start_date, end_date, activity_name
, case when nvl(
lag(activity_name)
over (partition by person_id order by start_date, activity_name)
, activity_name
) = activity_name
and nvl(
lag(end_date)
over (partition by person_id order by start_date, activity_name)
, start_date-1
) >= start_date-1
then 0 else 1 end as break
from activity
), q2 as (
select person_id, start_date, end_date, activity_name
, sum(break)
over(partition by person_id order by start_date, activity_name)
+ 1 as rank
from q1
)
select person_id, min(start_date) start_date, max(end_date) end_date, activity_name
from q2
group by person_id, rank, activity_name
having to_date('07-JUN-11', 'DD-MON-RR') between min(start_date) and max(end_date)
order by 1, 2
/
PERSON_ID START_DATE END_DATE ACTIVITY_NAME
---------- -------------------- -------------------- -------------
1 2011/06/01 00:00:00 2011/06/18 00:00:00 LEAVE
2 2011/06/01 00:00:00 2011/06/12 00:00:00 GROUND
3 2011/06/07 00:00:00 2011/06/09 00:00:00 LEAVE
Analytics
John, June 17, 2011 - 9:56 pm UTC
Please see the following
create table t_hierarchy( col1 number(10), col2 number(10));
insert into t_hierarchy values(1,80);
insert into t_hierarchy values(60,10);
insert into t_hierarchy values(80,60);
insert into t_hierarchy values(65,200);
insert into t_hierarchy values(10,100);
insert into t_hierarchy values(200,55);
insert into t_hierarchy values(101,500);
The data display should as follows
col1 col2
---- ----
1 80
80 60
60 10
10 100
65 200
200 55
101 500
Thanks in advance
June 20, 2011 - 9:44 am UTC
neat, I can do that by:
select 1 oc, t.* from t_hierarchy t where col1 = 1 and col2 = 80
UNION ALL
select 2 oc, t.* from t_hierarchy t where col1 = 80 and col2 = 60
UNION ALL
...
order by 1;
there you go - now, if you would care to explain the logic behind your requested output and ALL of the rules governing your data (I can probably see what you are trying to do here) - such as
what describes the parent child relation?
can there be more than one child of a given parent?
if yes, what then?
is there only one root - only one "master parent"?
if not, what then?
can there be loops?
etc etc etc
In other words, write a SPECIFICATION.
I completely fail to understand the questions with:
a) here is my data
b) here is my output
now please write a query. Without providing ANY OTHER INFORMATION.
If your data always conforms to the rules I've made up in my head, then this might work (and it also might be overkill for you - you don't give me ANYTHING to work with)
ops$tkyte%ORA11GR2> select *
2 from t_hierarchy t1
3 start with rowid in (select rowid from t_hierarchy t2 where not exists (select null from t_hierarchy t3 where t3.col2 = t2.col1))
4 connect by prior t1.col2 = t1.col1
5 /
COL1 COL2
---------- ----------
1 80
80 60
60 10
10 100
65 200
200 55
101 500
7 rows selected.
About the t_hierarchy question
Stew Ashton, June 20, 2011 - 12:52 pm UTC
Tom, I see that the "root" values are returned in ascending order. Is this guaranteed, or must one add the "order siblings by" clause?
June 20, 2011 - 1:55 pm UTC
hierarchical queries will return the children of children of children .... in order
but the children in a level will not come out in any order in particular.
You don't need an order siblings by in this case because (in the example, and given one of my assumptions) there is ONLY ONE sibling per level - there is nothing to sort at that level
Quick followup
Stew Ashton, June 20, 2011 - 2:08 pm UTC
There are 3 "root" values: 1, 65 and 101. Those are the "siblings" I'm talking about. They happen to come back in ascending order, but can I count on that without an "order siblings by"?
select *
from t_hierarchy t1
start with rowid in (
select rowid from T_HIERARCHY T2
where not exists (select null from T_HIERARCHY T3 where T3.COL2 = T2.COL1)
)
connect by prior T1.COL2 = T1.COL1
order siblings by T1.COL1 desc
COL1 COL2
---------------------- ----------------------
101 500
65 200
200 55
1 80
80 60
60 10
10 100
June 20, 2011 - 3:11 pm UTC
ah, I see.
yes, you would have to add the "order siblings by" to get them in order;
else they could come back in any order.
am jealous
venkata, July 26, 2011 - 7:39 am UTC
Tom,
I don't know anything about the MODEL, DIMENSION BY, MEASURES and PARTITION BY clauses. Can you give me some pointers on where I can learn about these? I stumbled upon the following URL, which I didn't understand at all. Please help.
http://www.sqlsnippets.com/en/topic-11663.html
July 28, 2011 - 6:30 pm UTC
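A minimal sketch of how those clauses fit together - the table and values here are invented purely for illustration, not taken from any example above:

```sql
-- PARTITION BY : compute each product's cells independently
-- DIMENSION BY : yr is the "array index" that addresses a cell
-- MEASURES     : amt is the value stored in each cell
-- RULES        : assigning to a yr that does not yet exist creates a new row
with sales as (
  select 'widget' product, 2010 yr, 100 amt from dual union all
  select 'widget' product, 2011 yr, 120 amt from dual
)
select product, yr, amt
  from sales
 model
   partition by (product)
   dimension by (yr)
   measures (amt)
   rules ( amt[2012] = amt[2011] * 1.1 )
 order by product, yr;
```

The rule upserts one new cell, so three rows come back: the two original years plus a computed 2012 row holding 120 * 1.1.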
Activity dates
Jayadevan, July 29, 2011 - 5:06 am UTC
Hi Stew,
I got back to working on the activity/date query. The second query you provided -
with q1 as (
select person_id, start_date, end_date, activity_name
, case when nvl(
lag(activity_name)
over (partition by person_id order by start_date, activity_name)
, activity_name
) = activity_name
.......
works great.
Thank you.
help
Venkata, August 05, 2011 - 6:29 am UTC
Hi,
To learn about data modelling I have created a table and inserted data as shown below.
create table xtest(cust_id number(10), sale_val number(20,4), asoflastday date);
insert into xtest
select trunc(dbms_random.value*100000,0) c,
       trunc(dbms_random.value*100000000,4) amt,
       last_day(add_months(trunc(sysdate,'yyyy'),rownum-1)) last
  from all_objects
 where rownum < 13;
Now when you do
select * from xtest
you will get data which is different from mine, since the values are random. What I need is:
sales as on Feb = (FebSale + MarSale)/2
sales as on Mar = (MarSale + AprSale)/2
Hope you are getting what I am trying to say :)
I need your help in solving this using only data modelling; let me know if you can use analytics also.
August 05, 2011 - 8:23 am UTC
How can you solve a question like that with "data modeling"?
Data modeling is the art of coming up with your schema. It doesn't involve actually answering any business questions.
No, I don't know what you are trying to say.
with analytics
Venkata, August 05, 2011 - 3:00 pm UTC
can you please get me a solution with analytics please
August 05, 2011 - 3:05 pm UTC
ops$tkyte%ORA11GR2> select cust_id, sale_val, asoflastday, (sale_val+lead(sale_val) over (order by asoflastday))/2 what_you_asked_for
2 from xtest
3 order by asoflastday
4 /
CUST_ID SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
50333 59710311.4 31-JAN-11 44289384.5
43525 28868457.6 28-FEB-11 40566135.3
2827 52263813 31-MAR-11 72036300.6
44790 91808788.3 30-APR-11 82621668.8
10819 73434549.3 31-MAY-11 67047508.9
35 60660468.5 30-JUN-11 31365013.1
93257 2069557.62 31-JUL-11 17826510.8
90286 33583464 31-AUG-11 28697602.1
59528 23811740.1 30-SEP-11 18998234.4
88679 14184728.7 31-OCT-11 8633988.97
48826 3083249.24 30-NOV-11 17567744.3
55996 32052239.3 31-DEC-11
12 rows selected.
seems to be what you asked for - but hard to say since it was pretty darn vague.
a little tweak
Venkata, August 05, 2011 - 3:39 pm UTC
Tom,
First, thank you for the prompt reply; I will elaborate in detail. I was waiting for your reply so that I could explain it a little more.
you gave this
CUST_ID SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
50333 59710311.4 31-JAN-11 44289384.5
43525 28868457.6 28-FEB-11 40566135.3
2827 52263813 31-MAR-11 72036300.6
44790 91808788.3 30-APR-11 82621668.8
10819 73434549.3 31-MAY-11 67047508.9
35 60660468.5 30-JUN-11 31365013.1
93257 2069557.62 31-JUL-11 17826510.8
90286 33583464 31-AUG-11 28697602.1
59528 23811740.1 30-SEP-11 18998234.4
88679 14184728.7 31-OCT-11 8633988.97
48826 3083249.24 30-NOV-11 17567744.3
55996 32052239.3 31-DEC-11
i need something like this
CUST_ID SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
50333 59710311.4 31-JAN-11
43525 28868457.6 28-FEB-11
---------------------------------------------
88578769.0/2 <----- sale for feb
------^ here (2) is processing month number
---------------------------------------------
88578769.0 28-feb-11
2827 52263813 31-MAR-11
---------------------------------------------
140842582/3 <------- avg sale for 31-Mar-2011
------^ here 3 is processing month number
The result should be like this for the other months also. Would that be possible?
August 06, 2011 - 8:01 am UTC
"here i was waiting for your reply so that i can explain it a little more"
why - why OH WHY would you do something like that? Just to waste both of our time? Think about this please.
When you ask a question, you are asking for something from someone - that something is "their time". Their time is valuable to them. Do not waste it by asking incomplete, vague questions (in short - by asking the WRONG question the first N times around). Rather - use some of your precious time to precisely and concisely ask what you really meant to ask the first time - get it right the first time around.
There is nothing, repeat nothing, more frustrating than getting a response "ok, that was cool - but what I *really* meant to ask was....."
I cannot read your stuff - you did not use the code button - therefore, your text is using a proportional font - it is not readable.
Additionally, it makes no sense.
I *think*, I'm GUESSING, that you want a running total - divided by the number of observations - in short, an average.
You do not want it divided by the month number (what if the months didn't run from 1..12, or there were more than 12, or a month was missing, etc...). You want an average, plain and simple.
Now, since you used random numbers.... I cannot reproduce your example exactly - but here:
ops$tkyte%ORA11GR2> select cust_id, sale_val, asoflastday, avg(sale_val) over (order by asoflastday) what_you_PROBABLY_asked_for
2 from xtest
3 order by asoflastday
4 /
CUST_ID SALE_VAL ASOFLASTD WHAT_YOU_PROBABLY_ASKED_FOR
---------- ---------- --------- ---------------------------
44944 58494945.3 31-JAN-11 58494945.3
72912 47251551.8 28-FEB-11 52873248.6
17580 99228681.8 31-MAR-11 68325059.6
56824 76260729.8 30-APR-11 70308977.2
24469 11107307.6 31-MAY-11 58468643.3
20531 10186073.3 30-JUN-11 50421548.3
14909 42877408.5 31-JUL-11 49343814
11919 32108787.5 31-AUG-11 47189435.7
79820 4212567.56 30-SEP-11 42414228.1
28397 53316727.6 31-OCT-11 43504478.1
27438 96206453.3 30-NOV-11 48295566.7
22665 65783689.2 31-DEC-11 49752910.3
12 rows selected.
Analytics
A reader, August 08, 2011 - 2:37 pm UTC
Hi Tom,
I have a question.
create table demo_1(col1 number, col2 varchar2(10));
insert into demo_1 values(1,'P');
insert into demo_1 values(2,'P');
insert into demo_1 values(3,'P');
insert into demo_1 values(4,'P');
insert into demo_1 values(5,'P');
create table demo(col1 number, col2 number, col3 varchar2(10));
insert into demo values(1,1,'A');
insert into demo values(2,1,'A');
insert into demo values(3,1,'A');
insert into demo values(4,2,'C');
insert into demo values(5,2,'B');
insert into demo values(6,3,'B');
insert into demo values(7,3,'B');
insert into demo values(8,4,'B');
insert into demo values(9,4,'C');
insert into demo values(10,5,'X');
We have to join the two tables demo and demo_1 where demo_1.col1 = demo.col2
The query should select col2 values based on the following criteria:
If a col2 value has multiple rows in demo, select it only when all of those rows have the same col3 value.
If a col2 value has only one row, select it.
The result will be
1 A
3 B
5 X
col2 = 2 and col2 = 4 are discarded because they have multiple col3 values in the demo table.
Can we achieve this using Analytic function?
August 13, 2011 - 3:36 pm UTC
ops$tkyte%ORA11GR2> select t1.col1, count( distinct t2.col3), max(t2.col3)
2 from demo_1 t1, demo t2
3 where t1.col1 = t2.col2
4 group by t1.col1
5 having count(distinct t2.col3) = 1
6 /
COL1 COUNT(DISTINCTT2.COL3) MAX(T2.COL
---------- ---------------------- ----------
1 1 A
5 1 X
3 1 B
no analytics this time, just old fashioned aggregates.
consider using oracle forum
chris227, August 11, 2011 - 8:57 am UTC
In my opinion, the Oracle forums are a good place for this kind of question.
Anyway, you may try this:
with a as (
select 1 a,'P' b from dual
union all
select 2,'P' from dual
union all
select 3,'P' from dual
union all
select 4,'P' from dual
union all
select 5,'P' from dual
),
b as(
select 1 a,1 b,'A' c from dual
union all
select 2,1,'A' from dual
union all
select 3,1,'A' from dual
union all
select 4,2,'C' from dual
union all
select 5,2,'B' from dual
union all
select 6,3,'B' from dual
union all
select 7,3,'B' from dual
union all
select 8,4,'B' from dual
union all
select 9,4,'C' from dual
union all
select 10,5,'X' from dual
)
select distinct * from (
select b.b,b.c, count(distinct b.c) over (partition by b.b) num from a,b
where
a.a=b.b)
where num = 1
pre-Analytics
A reader, August 11, 2011 - 4:35 pm UTC
I've not been "fortunate" enough to use Oracle pre-Analytics; I just wonder what you could do WITHOUT analytics for this:
select *
from (select t1.*,
row_number() over (partition by t1.val order by t2.priority desc) rn
from t1, t2
where t2.id = t1.t2_id
)
where rn = 1
The above selects all rows from t1 (with a FK to t2) and, for each t1.val, returns the row with the highest t2.priority.
What would be a good equivalent without Analytics?
Just been having a chat about the good old days with colleagues and we agree Analytics "rock", but this would make work more challenging :)
answer to pre-analytics
chris227, August 12, 2011 - 3:40 am UTC
Consider subqueries. Here is a slightly modified example from the question before:
with t2 as (
select 1 id,5 priority from dual
union all
select 2,1 from dual
union all
select 3,2 from dual
union all
select 4,3 from dual
union all
select 5,4 from dual
),
t1 as(
select 1 val,1 t2_id,'A' c from dual
union all
select 2,1,'A' from dual
union all
select 3,1,'A' from dual
union all
select 4,2,'C' from dual
union all
select 5,2,'B' from dual
union all
select 6,3,'B' from dual
union all
select 7,3,'B' from dual
union all
select 8,4,'B' from dual
union all
select 9,4,'C' from dual
union all
select 10,5,'X' from dual
),
j1 as (
select *
from t1, t2
where t2.id = t1.t2_id
)
select val, t2_id, c
from j1
where
priority >= (
select max(priority) from j1 j2
where
j1.val = j2.val
)
VAL T2_ID C
---------- ---------- -
1 1 A
2 1 A
3 1 A
4 2 C
5 2 B
6 3 B
7 3 B
8 4 B
9 4 C
10 5 X
10 rows selected.
Same with but now old school
select val, t2_id, c
from (select val, t2_id, c,
row_number() over (partition by j1.val order by j1.priority desc) rn
from j1
)
where rn = 1
VAL T2_ID C
---------- ---------- -
1 1 A
2 1 A
3 1 A
4 2 C
5 2 B
6 3 B
7 3 B
8 4 B
9 4 C
10 5 X
10 rows selected.
answer to pre-analytics 2
chris227, August 12, 2011 - 3:52 am UTC
Sorry, slip of the pen - the two statements should be swapped.
Notice that the window of the analytic function corresponds to the predicate in the WHERE clause of the subquery.
One question
A reader, August 18, 2011 - 3:16 pm UTC
How do I select the rows whose count equals the number of rows in the master table?
Let's say an employee id has some number of different flag rows in the first table.
In my second table I have object_ids assigned per employee. I want to select only those object_ids that occur as many times as there are rows for that employee in the first table.
table_1:
emp_id flag
1 XX
1 YY
1 ZZ
2 PP
2 XX
3 RR
table_2
emp_id object_id
1 10
1 10
1 10
1 20
1 20
1 45
1 45
1 45
2 50
2 50
2 45
3 25
3 45
So my expected output should be:
emp_1: 10, 45 (these occur 3 times, equal to the number of rows in table_1); we discard 20 as it occurs only 2 times
emp_2: 50 (occurs 2 times, equal to the number of rows in table_1); we discard 45 as it occurs only 1 time
emp_3: 25, 45 (each occurs 1 time, equal to the number of rows in table_1)
Could you please suggest a solution.
Oracle 10G R1 .. Estimated rows in table_1 : 1 million
table_2 : 500 MB
August 30, 2011 - 12:54 pm UTC
no creates
no inserts
no look
if you want a query, provide create tables and inserts.
21 days sql question
Muhammet, August 26, 2011 - 4:41 pm UTC
Hi,
Consider a table of patients and inspection dates.
The business rule says there can be no more than one inspection within 21 days, but the table has dirty data.
They want a report of patient and inspection date with the data cleaned: for every 21-day window starting
with an inspection date, take only the first inspection.
Sample
Patient Inspection_Date
------ --------------
Ahmet 01/01/2011 ==> DD/MM/YYYY
Ahmet 15/01/2011
Ahmet 01/02/2011
Ahmet 15/02/2011
Ahmet 01/06/2011
Ahmet 15/06/2011
Ahmet 01/07/2011
So the report will be
Ahmet 01/01/2011
Ahmet 01/02/2011
Ahmet 01/06/2011
Ahmet 01/07/2011
create table T (patient varchar2(20),inspection_date date);
insert into T values('Ahmet',to_date('01/01/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/01/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/02/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/02/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/06/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/06/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/07/2011','DD/MM/YYYY'));
Thanks
re: 21 days sql question
Stew Ashton, August 29, 2011 - 11:41 am UTC
Hello Muhammet,
This solution may not be to everyone's taste, but the only alternative I know is worse:
SELECT distinct patient, keep_date
FROM t
MODEL
PARTITION BY (patient)
DIMENSION BY (Row_Number() OVER (PARTITION BY patient ORDER BY inspection_date) rn)
MEASURES (inspection_date, inspection_date keep_date)
RULES (
keep_date[rn>1] ORDER BY rn =
CASE WHEN inspection_date[cv()] < keep_date[cv()-1]+21
then keep_date[cv()-1]
ELSE inspection_date[cv()] END
)
order by 1, 2;
PATIENT KEEP_DATE
-------------------- -------------------------
Ahmet 01-JAN-11 00.00.00
Ahmet 01-FEB-11 00.00.00
Ahmet 01-JUN-11 00.00.00
Ahmet 01-JUL-11 00.00.00
Re: One Question: Tables for Analytics question
A reader, August 30, 2011 - 4:29 pm UTC
Hi Tom,
Please find the create and inserts:
create table table_1(emp_id number, flag varchar2(10));
insert into table_1 values(1,'XX');
insert into table_1 values(1,'YY');
insert into table_1 values(1,'ZZ');
insert into table_1 values(2,'PP');
insert into table_1 values(2,'XX');
insert into table_1 values(3,'RR');
create table table_2(emp_id number, object_id varchar2(10));
insert into table_2 values(1,'10');
insert into table_2 values(1,'10');
insert into table_2 values(1,'10');
insert into table_2 values(1,'20');
insert into table_2 values(1,'20');
insert into table_2 values(1,'45');
insert into table_2 values(1,'45');
insert into table_2 values(1,'45');
insert into table_2 values(2,'50');
insert into table_2 values(2,'50');
insert into table_2 values(2,'45');
insert into table_2 values(3,'25');
insert into table_2 values(3,'45');
Thanks for the time.
To A Reader
Michel Cadot, August 31, 2011 - 2:21 am UTC
You have no need of analytic functions for this:
SQL> select b.emp_id, b.object_id
2 from (select emp_id, count(*) nb from table_1 group by emp_id) a,
3 (select emp_id, object_id, count(*) nb from table_2 group by emp_id, object_id) b
4 where a.emp_id = b.emp_id and a.nb = b.nb
5 order by 1, 2
6 /
EMP_ID OBJECT_ID
---------- ----------
1 10
1 45
2 50
3 25
3 45
5 rows selected.
Regards
Michel
To Muhammet
Michel Cadot, August 31, 2011 - 6:54 am UTC
With one more row in your table, here's a query with no model clause:
SQL> select patient, inspection_date
2 from t
3 order by 1, 2
4 /
PATIENT INSPECTION_
-------------------- -----------
Ahmet 01-JAN-2011
Ahmet 15-JAN-2011
Ahmet 01-FEB-2011
Ahmet 10-FEB-2011
Ahmet 15-FEB-2011
Ahmet 01-JUN-2011
Ahmet 15-JUN-2011
Ahmet 01-JUL-2011
8 rows selected.
SQL> with
2 data as (
3 select patient, inspection_date,
4 min(inspection_date)
5 over (partition by patient order by inspection_date
6 range between 22 following and unbounded following)
7 next_21
8 from t a
9 ),
10 compute (patient, inspection_date, next_21) as (
11 select patient, min(inspection_date), min(next_21)
12 from data
13 group by patient
14 union all
15 select data.patient, data.inspection_date, data.next_21
16 from data, compute c
17 where data.patient = c.patient
18 and data.inspection_date = c.next_21
19 )
20 select patient, inspection_date
21 from compute
22 order by 1, 2
23 /
PATIENT INSPECTION_
-------------------- -----------
Ahmet 01-JAN-2011
Ahmet 01-FEB-2011
Ahmet 01-JUN-2011
Ahmet 01-JUL-2011
4 rows selected.
Regards
Michel
Analytic Question
A reader, August 31, 2011 - 11:04 am UTC
Is it possible to re write something like this?
SELECT
CASE
WHEN a.status IN (1,2,3,5)
AND a.pm_id = b.pm_id
AND c.ac_id =1
AND b.ho_id=1
THEN MIN(a.col2) keep (dense_rank FIRST
ORDER BY (
CASE
WHEN a.status IN (1,2,3,5)
AND a.pm_id = b.pm_id
AND c.ac_id =1
AND b.ho_id=1
THEN 0
ELSE 1
END), a.col2) over (partition BY a.col1)
ELSE NULL
END pp1_account_id,
CASE
WHEN a.status IN (1,2,3,5)
AND a.pm_id = b.pm_id
AND c.ac_id =1
AND b.ho_id=1
THEN MIN(b.col3) keep (dense_rank FIRST
ORDER BY (
CASE
WHEN a.status IN (1,2,3,5)
AND a.pm_id = b.pm_id
AND c.ac_id =1
AND b.ho_id=1
THEN 0
ELSE 1
END), b.col3) over (partition BY a.col1)
ELSE NULL END
FROM table_a a,
table_b b,
table_c c
WHERE a.col1 = b.col1(+)
AND b.col1 = c.col1
AND c.col1 = a.col1
August 31, 2011 - 2:11 pm UTC
maybe - I don't have the specification for the question being asked, I don't have the schema and constraints, and I don't know if the query itself as stated is correct.
just for readability I'd be using an inline view at the very least - so I didn't have to keep reproducing
a.status IN (1,2,3,5)
AND a.pm_id = b.pm_id
AND c.ac_id =1
AND b.ho_id=1
over and over again
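That refactor might look something like this - a sketch only, since the schema and the intended result are unknown (the second column alias, pp2_account_id, is invented here; the original query never named it):

```sql
-- compute the repeated predicate once as a flag in an inline view,
-- then let both the CASE and the KEEP (DENSE_RANK FIRST) reuse it
select case when flag = 0
            then min(col2) keep (dense_rank first order by flag, col2)
                     over (partition by col1)
       end pp1_account_id,
       case when flag = 0
            then min(col3) keep (dense_rank first order by flag, col3)
                     over (partition by col1)
       end pp2_account_id
  from (select a.col1, a.col2, b.col3,
               case when a.status in (1,2,3,5)
                     and a.pm_id = b.pm_id
                     and c.ac_id = 1
                     and b.ho_id = 1
                    then 0
                    else 1
               end flag
          from table_a a, table_b b, table_c c
         where a.col1 = b.col1(+)
           and b.col1 = c.col1
           and c.col1 = a.col1);
```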
@Michel re:Muhammet's question
Stew Ashton, August 31, 2011 - 2:10 pm UTC
Hi Michel,
Yes, that was the alternative I was thinking of, although I used the old-fashioned hierarchical query syntax. Amusing that we probably live near each other but meet only on a server in Austin :)
Restriction on analytic function result on fly
Sonal, September 01, 2011 - 1:44 pm UTC
Hi Tom,
Is it possible to restrict a query's result based on cumulative data while it is being processed?
i.e. table has following data
empno sal
10 100
11 200
12 300
13 400
by using analytic function, we can get
empno sal cummulative sal
10 100 100
11 200 300
12 300 600
13 400 1000
select empno, sal, sum(sal) over (order by empno) cummulative_sal from emp;
Now, if I want the employee where the cumulative salary first reaches 500, I will do:
select * from (
select a.*, rank() over (order by cummulative_sal) rnk from (
select empno, sal, sum(sal) over (order by empno) cummulative_sal from emp) a
where cummulative_sal >= 500
) where rnk = 1
If I want this per department or by some other grouping, I will add a partition clause. Here I have used the FROM clause twice to get the result. I am wondering if there is any function that tells Oracle to stop processing more rows once my cumulative total reaches the limit. If the inner query returns 100K rows and I am only interested in the first 3, why should I process the other rows? Maybe there is no performance benefit, but just to know: is there any way of doing this?
Regards
Sonal
September 01, 2011 - 2:36 pm UTC
analytics is not able to do this (yet), no
A reader, September 03, 2011 - 6:12 pm UTC
If you are on 11gR2 you can do this with a recursive WITH clause, as below. I assume you want to reset the running total once it reaches 500:
with t(deptno, empno, sal) as
(
select 1, 10, 100 from dual union all
select 1, 11, 200 from dual union all
select 1, 12, 300 from dual union all
select 1, 13, 400 from dual
),
data as
(
select t.*,row_number()over(partition by deptno order by empno) as rno,
sum(sal)over(partition by deptno order by empno) as Run_Tot
from t
)
,
rec (rno, deptno, empno, sal, Run_Tot, Total, flg) as
(
select rno, deptno, empno, sal, Run_Tot, sal, case when sal >=500 Then 1 ELSE 0 END
from data
where rno=1
union all
select d.rno, d.deptno, d.empno, d.sal, d.Run_Tot, case when r.Total + d.sal >=500 Then d.sal ELSE r.Total + d.sal END,
case when r.Total + d.sal >=500 Then 1 ELSE 0 END
from rec r, data d
where r.rno+1=d.rno and r.deptno=d.deptno
)
select * from rec
where flg=1
hth...
Thanks
@Sonal on "Restriction on analytic function result on fly"
Stew Ashton, September 04, 2011 - 4:01 am UTC
Hi Sonal,
You can remove a FROM in your query if you say
SELECT empno, sal, cum_sal FROM (
SELECT empno, sal,
sum(sal) OVER (ORDER BY empno) cum_sal,
nvl(sum(sal) OVER (
ORDER BY empno ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),0) prev_cum_sal
FROM emp
) WHERE cum_sal >=500 AND prev_cum_sal < 500;
You could get down to one FROM with the MODEL clause
SELECT empno, sal, cum_sal FROM emp
MODEL
RETURN UPDATED ROWS
DIMENSION BY (
sum(sal) OVER (ORDER BY empno) cum_sal,
nvl(sum(sal) OVER (
ORDER BY empno ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
),0) prev_cum_sal
)
MEASURES (empno, sal)
RULES (
empno[cum_sal>=500, prev_cum_sal<500] = empno[cv(),cv()]
);
Neither would do what you are looking for, but the execution plans have fewer steps. (The MODEL clause is just for fun.)
21 days sql question
CezarN, November 01, 2011 - 7:17 am UTC
I have modified Michel Cadot's solution so as not to use the recursive CTE, which works only on 11gR2 and above (still waiting for that "above" :D):
select patient, inspection_date from
(
select patient, inspection_date,
min(inspection_date)
over (partition by patient order by inspection_date
range between 22 following and unbounded following)
next_21,
row_number() over (partition by patient order by inspection_date) as rn
from t a
)
connect by prior next_21 = inspection_date
start with rn = 1
;
PATIENT INSPECTION_DATE
-------------------- ---------------
Ahmet 01/01/2011
Ahmet 01/02/2011
Ahmet 01/06/2011
Ahmet 01/07/2011
Regards,
Cezar
Counting partners who appeared together with each other
Ramis, November 27, 2011 - 8:58 am UTC
Hi Tom
I am posting here because I think it can perhaps be solved with analytics.
I have a table t
create table T
(player varchar2(40),
for_team varchar2(10),
Vs_team varchar2(10),
matchid number,
points number)
/
INSERT INTO T VALUES ('John','A','B',1,2);
INSERT INTO T VALUES ('Fulton','A','B',1,10);
INSERT INTO T VALUES ('Sarah','A','B',1,9);
INSERT INTO T VALUES ('Peter','B','A',1,7);
INSERT INTO T VALUES ('Carlos','B','A',1,9);
INSERT INTO T VALUES ('Jose','B','A',1,6);
INSERT INTO T VALUES ('Joe','A','B',2,8);
INSERT INTO T VALUES ('Peter','A','B',2,9);
INSERT INTO T VALUES ('Carlos','A','B',2,1);
INSERT INTO T VALUES ('Rubben','B','A',2,10);
INSERT INTO T VALUES ('John','B','A',2,0);
INSERT INTO T VALUES ('Fulton','B','A',2,1);
INSERT INTO T VALUES ('Marcelo','A','B',3,7);
INSERT INTO T VALUES ('Daniela','A','B',3,1);
INSERT INTO T VALUES ('John','A','B',3,2);
INSERT INTO T VALUES ('Jose','B','A',3,5);
INSERT INTO T VALUES ('Abrao','B','A',3,3);
INSERT INTO T VALUES ('Carlos','B','A',3,10);
Select * from t
order by matchid, for_team
/
Player For_team vs_team matchid Points
John A B 1 2
Fulton A B 1 10
Sarah A B 1 9
Carlos B A 1 9
Peter B A 1 7
Jose B A 1 6
Peter A B 2 9
Carlos A B 2 1
Joe A B 2 8
Fulton B A 2 1
John B A 2 0
Rubben B A 2 10
Daniela A B 3 1
John A B 3 2
Marcelo A B 3 7
Jose B A 3 5
Abrao B A 3 3
Carlos B A 3 10
Note: Each player can appear for more than one team in different matches. For example, John appeared for team A in matchid = 1 and then in matchid = 2 he appeared for team B. The same goes for other players.
Requirement:
For each player, I want the total number of matches and the sum of points (which is easy using COUNT and SUM), along with the total number of different teammates he played with across all the matches he appeared in [for his team(s)], the total number of opposition players, and lastly the total number of different players he played with or against across all his matches.
Just to clarify some terms, in case of any doubt:
teammates = all players who appeared (for_team) in a match for the same team the respective player appeared for in that match.
opposition players = all players who appeared for the vs_team in each match the player played in.
total different players = all unique players who appeared for for_team or vs_team in the matches in which the player appeared.
Here is my desired outout:
Player    Matches  Sum(points)  Diff. teammates  Diff. opposition  Total diff. players
John      3        4            5                5                 10
Fulton    2        11           3                4                 6
Sarah     1        9            2                3                 5
Peter     2        16           3                4                 7
Carlos    3        20           4                6                 10
Jose      2        11           3                5                 8
Joe       1        8            2                3                 5
Rubben    1        10           2                3                 5
Marcelo   1        7            2                3                 5
Daniela   1        1            2                3                 5
Abrao     1        3            2                3                 5
I want one simple, short query to achieve the above output, since the data in my actual table is huge.
thanks in advance.
regards
Asim
November 29, 2011 - 7:09 am UTC
I don't think analytics are appropriate, but an aggregation would be:
ops$tkyte%ORA11GR2> select t1_player,
2 count(case when t1_player = t2_player then matchid end) "total matches",
3 sum(case when t1_player = t2_player then points end) "total points",
4 count(distinct case when t1_for_team = t2_for_team and t1_player <> t2_player then t2_player end) "diff teammates",
5 count(distinct case when t1_for_team <>t2_for_team then t2_player end) "diff opp"
6 from (
7 select t1.player t1_player, t1.for_team t1_for_team, t2.player t2_player, t2.for_team t2_for_team, t1.matchid, t1.points
8 from t t1, t t2
9 where t1.matchid = t2.matchid
10 )
11 group by t1_player
12 order by t1_player
13 /
T1_PLAYER total matches total points diff teammates diff opp
---------- ------------- ------------ -------------- ----------
Abrao 1 3 2 3
Carlos 3 20 4 6
Daniela 1 1 2 3
Fulton 2 11 3 4
Joe 1 8 2 3
John 3 4 5 5
Jose 2 11 3 5
Marcelo 1 7 2 3
Peter 2 16 3 4
Rubben 1 10 2 3
Sarah 1 9 2 3
11 rows selected.
we cartesian product each matchid group - that'll join every player with every other player - including themselves - by matchid.
then - if we count the rows where the players are the same - that is the number of matches that player played in.
if we sum the points where the players are the same - that is the sum of their points.
if we count the number of distinct player names where the teams are the same (but the player name is different) that is the total number of teammates.
if we count the number of distinct player names where the teams are different - that is the total number of opposing players...
Now, if this table is "HUGE", remember that this will make it temporarily 16 times "more huge".
I put airquotes on "HUGE" because some people think a few million rows is "HUGE" - but it really isn't.
You do not want a single index to be used - you want two full scans and a nice big hash join.
thanks alot
Ramis, December 01, 2011 - 8:34 am UTC
Dear Tom
This is really amazing - thank you so much. I didn't expect it to be so simple. On my actual table it takes just 11 seconds to produce output for 2600+ players and 3000+ matches. I appreciate your time and help. Thanks again.
regards
Ramis
Requirement to capture boundary values
Rajeshwaran, Jeyabal, December 07, 2011 - 8:06 pm UTC
Tom:
I have a requirement to calculate the below values (the table 'T' in the application is an external table that reads from text files):
1) ST_START_ID should be calculated as value of x when y = 'ST'
2) ST_END_ID should be calculated as value of x-1 when y ='CLM'
3) CLM_START_ID should be calculated as value of x when y ='CLM'
4) CLM_END_ID should be calculated as value of x-1 when y ='SE' or y ='CLM'
drop table t purge;
create table t(x number,y varchar2(10));
insert into t values(1,'ISA');
insert into t values(2,'ST');
insert into t values(3,'D1');
insert into t values(4,'D2');
insert into t values(5,'CLM1');
insert into t values(6,'X1');
insert into t values(7,'X2');
insert into t values(8,'X3');
insert into t values(9,'SE');
insert into t values(10,'ST');
insert into t values(11,'D1');
insert into t values(12,'D2');
insert into t values(13,'D3');
insert into t values(14,'CLM1');
insert into t values(15,'X1');
insert into t values(16,'CLM2');
insert into t values(17,'X1');
insert into t values(18,'X2');
insert into t values(19,'CLM3');
insert into t values(20,'X1');
insert into t values(21,'SE');
insert into t values(22,'IEA');
commit;
I need the output like this
ST_START_ID ST_END_ID CLM_START_ID CLM_END_ID
2 4 5 8
10 13 14 15
10 13 16 18
10 13 19 20
December 08, 2011 - 12:37 pm UTC
umm, this doesn't make sense at all.
the pivoting is unexplained and doesn't make sense.
You wrote:
1) ST_START_ID should be calculated as value of x when y = 'ST'
2) ST_END_ID should be calculated as value of x-1 when y ='CLM'
3) CLM_START_ID should be calculated as value of x when y ='CLM'
4) CLM_END_ID should be calculated as value of x-1 when y ='SE' or y ='CLM'
I see nothing where y = 'CLM'
Going further, assuming you meant "y like 'CLM%'": how did you know to "go on" when you hit CLM1 with x=5? You used it twice - for st_end_id and clm_start_id - so why didn't you use it a third time, for clm_end_id?
also, why did you use it twice, if that is always going to be true, you don't really need the st_end_id do you - it is rather obvious what it is - you don't need to compute it in the result set.
and what happens if in between an 'ST' record and the next 'ST' record - there are no clm records.
and you had best explain those last two lines of output - they make no sense to me.
Ask yourself - honestly and truly ask yourself - have you given enough information in the form of a specification here that anyone could write code from it without ambiguity. I could code up something that gets your result set - that is easy enough, but I have no idea if it is what you really meant. I won't take my time until you give us a lot more detail.
Requirement to capture boundary values
Rajeshwaran, Jeyabal, December 07, 2011 - 8:08 pm UTC
Tom, sorry I missed this in the previous post: the size of the file mapped to external table 'T' will be around 54MB to 80MB.
December 08, 2011 - 12:37 pm UTC
so it is very small, ok, post more details from above please...
Requirement to capture boundary values
Rajeshwaran, Jeyabal, December 08, 2011 - 1:07 pm UTC
Tom:
Here are more details:
1) Within ST and SE (that is, where x between 2 and 9) there can be multiple CLM% segments (at least one CLM will be present between ST and SE).
2) I need to output the boundary of each CLM segment along with its ST and SE combination.
a) I have one CLM% between x>=2 and x<=9, so I need one output row, like this:
ST_START_ID ST_END_ID CLM_START_ID CLM_END_ID
2 4 5 8
b) I have three CLM% segments between x>=10 and x<=21, so I need three output rows, one for each CLM% value:
ST_START_ID ST_END_ID CLM_START_ID CLM_END_ID
10 13 14 15
10 13 16 18
10 13 19 20
December 09, 2011 - 4:00 pm UTC
this is as close as I can get you - YOU will take it from here.
ops$tkyte%ORA11GR2> select *
2 from (
3 select x, y, grp,
4 max(st_x) over (partition by grp) new_st_x,
5 max(se_x) over (partition by grp) new_se_x
6 from (
7 select x, y,
8 count( case when y = 'ST' then 'ST' end ) over (order by x) grp,
9 case when y = 'ST' then x end st_x,
10 case when y = 'SE' then x end se_x
11 from t
12 )
13 )
14 where y like 'CLM_'
15 order by x
16 /
X Y GRP NEW_ST_X NEW_SE_X
---------- ---------- ---------- ---------- ----------
5 CLM1 1 2 9
14 CLM1 2 10 21
16 CLM2 2 10 21
19 CLM3 2 10 21
your description leaves something to be - well, I'll leave it at that.
Anyway - what you have there is every CLM_ record "joined" with its prior ST record and post SE record. That should be everything you need.
If you need the prior CLM_ record "joined" to each existing CLM_ record - please use LAG()
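One possible sketch of that LAG()/LEAD() step, layered over the query above. The fallback to new_se_x - 1 when there is no next CLM_ row is an assumption based on the desired output:

```sql
-- each CLM_ row with its ST/SE bounds plus its neighboring CLM_ rows;
-- the WHERE is applied before LAG/LEAD are evaluated in this block,
-- so the neighbors really are the previous/next CLM_ within the group
select x, y, new_st_x, new_se_x,
       lag(x)  over (partition by grp order by x) prev_clm_x,
       nvl(lead(x) over (partition by grp order by x), new_se_x) - 1 clm_end_id
  from (
        select x, y, grp,
               max(st_x) over (partition by grp) new_st_x,
               max(se_x) over (partition by grp) new_se_x
          from (
                select x, y,
                       count(case when y = 'ST' then 'ST' end) over (order by x) grp,
                       case when y = 'ST' then x end st_x,
                       case when y = 'SE' then x end se_x
                  from t
               )
       )
 where y like 'CLM_'
 order by x;
```

Against the sample data this yields clm_end_id values of 8 for x=5, then 15, 18 and 20 for the three CLM rows in the second group - matching the requested output.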
Requirement to capture boundry values
Rajeshwaran, Jeyabal, December 10, 2011 - 2:41 pm UTC
Thank Tom.
I think i got it.
rajesh@ORA11GR2> select st_x as st_start_id,
2 st_end_id - 1 as st_end_id,
3 min(x) as clm_start_id,
4 max(x) as clm_end_id
5 from (
6 select x,y,st_x,st_grp,last_value(clm_grp ignore nulls) over(order by x) as clm_grp,
7 min(case when y like 'CLM%' then x end) over(partition by st_grp order by x) as st_end_id
8 from (
9 select x,y,
10 case when y like 'CLM%' then row_number() over(order by x)
11 when y like 'SE' then 1 end as clm_grp,
12 last_value(case when y ='ST' then x end ignore nulls) over(order by x) as st_x,
13 count(case when y='ST' then 'ST' end) over(order by x) as st_grp
14 from t
15 )
16 )
17 where clm_grp > 1
18 group by st_x,st_end_id,clm_grp
19 order by clm_grp
20 /
ST_START_ID ST_END_ID CLM_START_ID CLM_END_ID
----------- ---------- ------------ ----------
2 4 5 8
10 13 14 15
10 13 16 18
10 13 19 20
Elapsed: 00:00:00.11
rajesh@ORA11GR2>
rajesh@ORA11GR2>
Alternate approach
Matt McPeak, December 14, 2011 - 2:56 pm UTC
Another way to do it:
select *
  from (
        select t.*,
               max(case when y = 'ST' then x end)
                   over (order by x range between unbounded preceding and current row) st_start_id,
               max(case when y = 'CLM1' then x - 1 end)
                   over (order by x range between unbounded preceding and current row) st_end_id,
               max(case when y like 'CLM%' then x end)
                   over (order by x range between unbounded preceding and current row) clm_start_id,
               min(case when y like 'CLM%' or y in ('SE', 'IEA') then x - 1 end)
                   over (order by x range between 1 following and unbounded following) clm_end_id
          from t
       )
 where y like 'CLM%'
Analytic function
A reader, January 13, 2012 - 9:55 pm UTC
Hello Tom,
I have a table TEST which contain one column id.
And it has values from 99 to 1 in desc manner.
ID
99
98
..
..
1
Please help me getting a output as below
id1 id2 id3 id4 id5 id6 id7
99 98 97 96 95 94 93
86 87 88 89 90 91 92
85 84 83 .. ..
Thanks.
lalu.
January 17, 2012 - 11:23 am UTC
And it has values from 99 to 1 in desc manner.
No, I'm sorry, it does not.
It might well have 1 to 99 in it, but they are not in any "order". There is no order to any set of rows unless and until you add an ORDER BY.
your output doesn't make sense to me.
you go 99 down to 93
then from 86 up to 92
then from 85 down to ....
do you want to go up, or down, or did you really mean to interleave it?
AND WHY DOESN'T ANYONE SEEM TO SAY STUFF LIKE THIS ANYMORE. Where are the specifications, where is the explanation of what you really mean? Pictograms like this don't tell the whole story.
pick the one you want
ops$tkyte%ORA11GR2> with data(r)
2 as
3 (select 1 r from dual
4 union all
5 select r+1 from data where r < 99
6 )
7 select max(decode(m,6,r)) id1,
8 max(decode(m,5,r)) id2,
9 max(decode(m,4,r)) id3,
10 max(decode(m,3,r)) id4,
11 max(decode(m,2,r)) id5,
12 max(decode(m,1,r)) id6,
13 max(decode(m,0,r)) id7
14 from (
15 select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
16 from data
17 )
18 group by d
19 order by d desc;
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
99 98 97 96 95 94 93
92 91 90 89 88 87 86
85 84 83 82 81 80 79
78 77 76 75 74 73 72
71 70 69 68 67 66 65
64 63 62 61 60 59 58
57 56 55 54 53 52 51
50 49 48 47 46 45 44
43 42 41 40 39 38 37
36 35 34 33 32 31 30
29 28 27 26 25 24 23
22 21 20 19 18 17 16
15 14 13 12 11 10 9
8 7 6 5 4 3 2
1
15 rows selected.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> with data(r)
2 as
3 (select 1 r from dual
4 union all
5 select r+1 from data where r < 99
6 )
7 select max(decode(m,0,r)) id1,
8 max(decode(m,1,r)) id2,
9 max(decode(m,2,r)) id3,
10 max(decode(m,3,r)) id4,
11 max(decode(m,4,r)) id5,
12 max(decode(m,5,r)) id6,
13 max(decode(m,6,r)) id7
14 from (
15 select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
16 from data
17 )
18 group by d
19 order by d desc
20 /
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
93 94 95 96 97 98 99
86 87 88 89 90 91 92
79 80 81 82 83 84 85
72 73 74 75 76 77 78
65 66 67 68 69 70 71
58 59 60 61 62 63 64
51 52 53 54 55 56 57
44 45 46 47 48 49 50
37 38 39 40 41 42 43
30 31 32 33 34 35 36
23 24 25 26 27 28 29
16 17 18 19 20 21 22
9 10 11 12 13 14 15
2 3 4 5 6 7 8
1
15 rows selected.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> with data(r)
2 as
3 (select 1 r from dual
4 union all
5 select r+1 from data where r < 99
6 )
7 select max(decode(m,decode( mod(d,2), 1, 0, 6) ,r)) id1,
8 max(decode(m,decode( mod(d,2), 1, 1, 5) ,r)) id2,
9 max(decode(m,decode( mod(d,2), 1, 2, 4) ,r)) id3,
10 max(decode(m,decode( mod(d,2), 1, 3, 3) ,r)) id4,
11 max(decode(m,decode( mod(d,2), 1, 4, 2) ,r)) id5,
12 max(decode(m,decode( mod(d,2), 1, 5, 1) ,r)) id6,
13 max(decode(m,decode( mod(d,2), 1, 6, 0) ,r)) id7
14 from (
15 select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
16 from data
17 )
18 group by d
19 order by d desc
20 /
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
99 98 97 96 95 94 93
86 87 88 89 90 91 92
85 84 83 82 81 80 79
72 73 74 75 76 77 78
71 70 69 68 67 66 65
58 59 60 61 62 63 64
57 56 55 54 53 52 51
44 45 46 47 48 49 50
43 42 41 40 39 38 37
30 31 32 33 34 35 36
29 28 27 26 25 24 23
16 17 18 19 20 21 22
15 14 13 12 11 10 9
2 3 4 5 6 7 8
1
15 rows selected.
ops$tkyte%ORA11GR2>
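All three queries share the same bucketing arithmetic; only the decode that maps slots to columns changes. A small Python check of that arithmetic (an illustration only, not part of the original answer):

```python
# Check of the bucketing arithmetic shared by all three queries:
#   d = trunc((r + 7 - 1.99) / 7)   -- which 7-wide output row r belongs to
#   m = mod(r - 2 + 7, 7)           -- r's slot within that row
# (1.99 rather than 2 keeps the division away from exact integer boundaries)
import math

def bucket(r):
    d = math.trunc((r + 7 - 1.99) / 7)
    m = (r - 2 + 7) % 7
    return d, m
```

For example, 99 through 93 share one bucket with m running 6 down to 0, which is why decode(m, 6, r) puts 99 in ID1 and 93 in ID7 in the first query, while 1 sits alone in bucket 0, matching the final single-value row.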
Ouch
Chuck, January 17, 2012 - 1:28 pm UTC
To me #3 wins the "Least Likely to be Useful" award hands down.
That that 'with' clause passes the compiler, let alone produces more than one row, simply makes my head hurt.
SQL fun at its best. Thanks.
Analytics and Tom both rock.
A reader, January 18, 2012 - 7:32 am UTC
Thanks Tom.
I could not explain my requirement, but you have provided me the output.
January 18, 2012 - 7:44 am UTC
what did analytics have to do with anything? We didn't use any in this particular problem.
Alternative approach using pivot
AndyP, January 30, 2012 - 5:10 am UTC
Tom has of course provided the exact answer to the question, and it probably isn't worth pursuing for its own sake, but I just wondered how to achieve the same results using the pivot syntax, which might be of interest to someone, I suppose.
In doing so I found that the solution provided relies on knowing the starting value of the dataset being used, so I've made it work for any highest number, not just 99.
This method seems to require more work generating the data in the right format, but then the presentation of it using pivot is straightforward.
(I too found the with clause a surprising formulation, so I used a more explicit one)
col rowval noprint
prompt This one has each row descending
with
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
prompt This one has each row ascending
with
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(6 as id1,5 as id2,4 as id3,3 as id4,2 as id5,1 as id6,0 as id7))
order by rowval
/
prompt This one has each row ascending too, achieved by reversing the data rather than altering the pivot column sequence
with
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,6-mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
prompt This one has rows alternately descending and ascending, achieved by ordering the data like that before pivoting
with
data as
(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as
(select r dataval,trunc((rownum-0.1)/7) rowval,decode(mod(trunc((rownum-0.1)/7),2),0,mod((rownum-1),7),6-mod((rownum-1),7)) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
SQL > @pivdown
This one has each row descending
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
25 24 23 22 21 20 19
18 17 16 15 14 13 12
11 10 9 8 7 6 5
4 3 2 1
This one has each row ascending
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
19 20 21 22 23 24 25
12 13 14 15 16 17 18
5 6 7 8 9 10 11
1 2 3 4
This one has each row ascending too, achieved by reversing the data rather than altering the pivot column sequence
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
19 20 21 22 23 24 25
12 13 14 15 16 17 18
5 6 7 8 9 10 11
1 2 3 4
This one has rows alternately descending and ascending, achieved by ordering the data like that before pivoting
ID1 ID2 ID3 ID4 ID5 ID6 ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
25 24 23 22 21 20 19
12 13 14 15 16 17 18
11 10 9 8 7 6 5
1 2 3 4
January 31, 2012 - 5:48 pm UTC
;)
thanks
RANGE BETWEEN UNBOUNDED PRECEDING confusion
biswaranjan., April 05, 2012 - 6:29 am UTC
Hi Tom,
I am confused by the output of the below query.
The emp table has the below 14 rows:
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3001 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
.........
SELECT deptno, empno, sal,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal
DEPTNO EMPNO SAL CNT_LT_HALF CNT_MT_HALF
------- ------ ----- ----------- -----------
20 7369 800 0 3
20 7876 1100 0 3
20 7566 2975 2 0
20 7788 3000 2 0
20 7902 3000 2 0
30 7900 950 0 3
30 7521 1250 0 1
30 7654 1250 0 1
30 7844 1500 0 1
30 7499 1600 0 1
30 7698 2850 3 0
11 rows selected.
Can you please explain the output of the CNT_LT_HALF and CNT_MT_HALF
columns? I thought it through myself and found no confusion over the output of the cnt_lt_half column, but I am confused about the cnt_mt_half column's.
Could you please explain the logic?
thanks and regards,
Biswaranjan.
April 06, 2012 - 10:17 am UTC
does this make more sense:
ops$tkyte%ORA11GR2> break on deptno skip 1
ops$tkyte%ORA11GR2> SELECT deptno, empno, sal,
2 Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
3 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) cnt1,
4 first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
5 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) fv1,
6 last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
7 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) lv1,
8 COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
9 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) cnt2,
10 first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
11 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) fv2,
12 last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
13 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) lv2
14 FROM emp
15 WHERE deptno IN (20, 30)
16 ORDER BY deptno, sal
17 /
DEPTNO EMPNO SAL CNT1 FV1 LV1 CNT2 FV2 LV2
------ ----- ----- ----- ----- ----- ----- ----- -----
20 7369 800 0 3 7566 7902
7876 1100 0 3 7566 7902
7566 2975 2 7369 7876 0
7788 3000 2 7369 7876 0
7902 3000 2 7369 7876 0
30 7900 950 0 3 7844 7698
7521 1250 0 1 7698 7698
7654 1250 0 1 7698 7698
7844 1500 0 1 7698 7698
7499 1600 0 1 7698 7698
7698 2850 3 7900 7654 0
11 rows selected.
see what the empno ranges are now - you can see exactly which records are included in the range...
A reader, April 06, 2012 - 12:24 pm UTC
Dear Sir,
the range for PRECEDING is from the unlimited lower bound to sal/2, and for FOLLOWING it is from sal/2 to the unlimited upper bound.
My above understanding matches the above output for the PRECEDING clause but not for FOLLOWING.
For example, for the record deptno=20 and empno=7369, the range should be 400 to unlimited.
can you please clarify this?
Thanks in Advance
April 08, 2012 - 5:44 pm UTC
does this help:
ops$tkyte%ORA11GR2> SELECT deptno, empno, sal,
2 'btwn 1st and ' || (sal-sal/2) txt1,
3 Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
4 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) cnt1,
5 first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
6 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) fv1,
7 last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
8 BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) lv1,
9 'btwn ' || (sal+sal/2) || ' and last' txt2,
10 COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
11 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) cnt2,
12 first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
13 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) fv2,
14 last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
15 BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) lv2
16 FROM emp
17 WHERE deptno IN (20, 30)
18 ORDER BY deptno, sal
19 /
DEPTNO EMPNO SAL TXT1 CNT1 FV1 LV1 TXT2 CNT2 FV2 LV2
------ ----- ----- ----------------- ----- ----- ----- ------------------ ----- ----- -----
20 7369 800 btwn 1st and 400 0 btwn 1200 and last 3 7566 7902
7876 1100 btwn 1st and 550 0 btwn 1650 and last 3 7566 7902
7566 2975 btwn 1st and 1487 2 7369 7876 btwn 4462.5 and la 0
.5 st
7788 3000 btwn 1st and 1500 2 7369 7876 btwn 4500 and last 0
7902 3000 btwn 1st and 1500 2 7369 7876 btwn 4500 and last 0
30 7900 950 btwn 1st and 475 0 btwn 1425 and last 3 7844 7698
7521 1250 btwn 1st and 625 0 btwn 1875 and last 1 7698 7698
7654 1250 btwn 1st and 625 0 btwn 1875 and last 1 7698 7698
7844 1500 btwn 1st and 750 0 btwn 2250 and last 1 7698 7698
7499 1600 btwn 1st and 800 0 btwn 2400 and last 1 7698 7698
7698 2850 btwn 1st and 1425 3 7900 7654 btwn 4275 and last 0
11 rows selected.
unbound following
Biswaranjan, April 07, 2012 - 12:28 am UTC
Hi Tom,
First of all thanks for your quick reply.
I finally got it after thinking deeply your query's output.
@reader
You are thinking the way I was thinking, so you faced the same confusion.
For your example:
for the record deptno=20 and empno=7369, the FOLLOWING range starts sal/2 = 400 ahead of the current sal,
so the lower limit is 800 + 800/2 = 1200
and the upper limit is unlimited; so the count for that record is, of course,
3.
thanks again tom.
regards,
Biswaranjan
A reader, April 07, 2012 - 4:20 pm UTC
@Biswaranjan - Thanks, I understand that from the output for FOLLOWING, but the same does not work for PRECEDING; maybe I am missing something.
For example:
when processing deptno=20 and empno=7876 for PRECEDING, the range would run from unbounded preceding to 800 (the previous record's salary) + 800/2 = 1200 as the upper limit,
so shouldn't the count in this case be 1 rather than 0?
Is it that for PRECEDING the range runs from unlimited TO sal/2 of the previous record, and for FOLLOWING it
runs from sal*1.5 TO unlimited following?
@Tom can you please clarify?
Thanks in Advance
range window
biswaranjan, April 09, 2012 - 1:06 am UTC
@reader
For PRECEDING the range runs from unbounded preceding to sal - sal/2, and for FOLLOWING
the range runs from sal + sal/2 to unbounded following (sal being the current row's salary).
You got the wrong output because for PRECEDING you added sal/2 to sal where you should have subtracted it from sal.
thanks,
Biswaranjan.
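The window resolution being discussed can be checked directly. A minimal Python sketch (sals taken from the dept 20 output above; an illustration, not part of the original thread): with ORDER BY sal, "(sal/2) PRECEDING" admits rows whose sal is at most current_sal - current_sal/2, and "(sal/2) FOLLOWING" admits rows whose sal is at least current_sal + current_sal/2; the offset always comes from the CURRENT row.

```python
# RANGE window bounds for dept 20, sals from the thread's output.
# Offsets are computed from the CURRENT row's sal:
#   ... AND (sal/2) PRECEDING  -> rows with sal <= cur - cur/2
#   (sal/2) FOLLOWING AND ...  -> rows with sal >= cur + cur/2
sals = [800, 1100, 2975, 3000, 3000]

def cnt_lt_half(cur):
    return sum(1 for s in sals if s <= cur - cur / 2)

def cnt_mt_half(cur):
    return sum(1 for s in sals if s >= cur + cur / 2)
```

Running both functions over the five salaries reproduces the CNT_LT_HALF = 0, 0, 2, 2, 2 and CNT_MT_HALF = 3, 3, 0, 0, 0 columns from the original output.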
range window with example
biswaranjan, April 09, 2012 - 1:15 am UTC
@reader
Hope the below example will clarify your doubt completely.
SELECT deptno, empno, sal,sal-sal/4,sal+sal/4,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/4) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/4) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal;
DEPTNO EMPNO SAL SAL-SAL/4 SAL+SAL/4 PRE_CNT FOLLOW_CNT
20 7369 800 600 1000 0 4
20 7876 1100 825 1375 1 3
20 7566 2975 2231.25 3718.75 2 0
20 7788 3000 2250 3750 2 0
20 7902 3000 2250 3750 2 0
30 7900 950 712.5 1187.5 0 5
30 7521 1250 937.5 1562.5 0 2
30 7654 1250 937.5 1562.5 0 2
30 7844 1500 1125 1875 1 1
30 7499 1600 1200 2000 1 1
30 7698 2850 2137.5 3562.5 5 0
The pre_cnt column counts rows with sal less than (sal - sal/4), and the follow_cnt column counts rows with sal more than (sal + sal/4).
thanks,
Biswaranjan.
A reader, April 09, 2012 - 10:53 am UTC
Thank you very much Biswaranjan!!
analytic performance
Biswaranjan, April 18, 2012 - 4:04 am UTC
Hi Tom,
Hope you are doing fine :).
Can you please tell which of the below queries is better for performance?
select ename,deptno,(
select max(sal) from emp e1
where emp.deptno = e1.deptno ) max_sal_deptwise
from emp
order by deptno;
select ename,deptno,max(sal) over(partition by deptno) max_sal_deptwise from emp;
thanks,
Biswaranjan
April 18, 2012 - 8:01 am UTC
In almost every case - the analytic would be superior to the scalar subquery in general.
an exception to that might be if you wanted to get the first row from the query as fast as possible - the scalar subquery might have a slight advantage.
ops$tkyte%ORA11GR2> set linesize 1000
ops$tkyte%ORA11GR2> set echo on
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> /*
ops$tkyte%ORA11GR2> drop table t;
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table t
ops$tkyte%ORA11GR2> as
ops$tkyte%ORA11GR2> select object_name ename, object_id empno,
ops$tkyte%ORA11GR2> mod(object_id,1000) deptno, mod(object_id,1000) sal,
ops$tkyte%ORA11GR2> a.*
ops$tkyte%ORA11GR2> from all_objects a
ops$tkyte%ORA11GR2> /
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create index t_idx on t(deptno,sal);
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte%ORA11GR2> */
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> variable n number
ops$tkyte%ORA11GR2> set autotrace traceonly
ops$tkyte%ORA11GR2> exec :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select ename,deptno,(
2 select max(sal) from t e1
3 where emp.deptno = e1.deptno ) max_sal_deptwise
4 from t emp
5 order by deptno;
72889 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4042721663
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72889 | 2064K| | 978 (1)| 00:00:12 |
| 1 | SORT AGGREGATE | | 1 | 8 | | | |
| 2 | FIRST ROW | | 1 | 8 | | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN (MIN/MAX)| T_IDX | 1 | 8 | | 2 (0)| 00:00:01 |
| 4 | SORT ORDER BY | | 72889 | 2064K| 2872K| 978 (1)| 00:00:12 |
| 5 | TABLE ACCESS FULL | T | 72889 | 2064K| | 395 (1)| 00:00:05 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E1"."DEPTNO"=:B1)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
17441 consistent gets
0 physical reads
0 redo size
2767962 bytes sent via SQL*Net to client
53869 bytes received via SQL*Net from client
4861 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
72889 rows processed
ops$tkyte%ORA11GR2> exec dbms_output.put_line( (dbms_utility.get_cpu_time-:n) || ' hsecs' );
37 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> exec :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select ename,deptno,max(sal) over(partition by deptno) max_sal_deptwise from t;
72889 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2335850315
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72889 | 2348K| | 1049 (1)| 00:00:13 |
| 1 | WINDOW SORT | | 72889 | 2348K| 3160K| 1049 (1)| 00:00:13 |
| 2 | TABLE ACCESS FULL| T | 72889 | 2348K| | 395 (1)| 00:00:05 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1422 consistent gets
0 physical reads
0 redo size
2767962 bytes sent via SQL*Net to client
53869 bytes received via SQL*Net from client
4861 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
72889 rows processed
ops$tkyte%ORA11GR2> exec dbms_output.put_line( (dbms_utility.get_cpu_time-:n) || ' hsecs' );
32 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> set autotrace off
Thanks.
Biswaranjan, April 18, 2012 - 11:34 pm UTC
Hi Tom,
Thanks for a nice and simple explanation.
simply you rock!
One little request: can you please tell me which book is good for a developer who is a beginner at performance tuning? (Sorry if I am not supposed to post these
requests.)
thanks,
Biswaranjan.
April 19, 2012 - 5:44 am UTC
you can post these requests - but i always say the same thing
I don't know of a single book - or even a set of books - that can teach a developer the 15 steps to performance. The optimizer is that check list.
My approach to performance has always been:
a) understand how it (the database) works
b) know as much as you can about your options
c) understand how those options work in general
d) put a, b, c together logically to say "these are probably the best ways"
e) benchmark them.
Just like I did above.
and then with experience - like 5-10-15 years - you'll be able to recognize general patterns/problems and be able to solve them without even thinking about it.
answer says it all!
Ravi B, April 19, 2012 - 3:50 pm UTC
WOW nice
Biswaranjan, April 19, 2012 - 11:36 pm UTC
Thanks a lot.
I will follow your vital points mentioned.
A long way to go for me (3.5 years in Oracle :)).
I will try to tackle performance issues myself; if not able, I will surely post on this site.
thanks again,
Biswaranjan
"scnhealthcheck.sql" script for Oracle version 9.2.0.1.0
Menaka, May 16, 2012 - 6:54 am UTC
Hi Tom
We're searching for the scnhealthcheck.sql script for Oracle version 9.2.0.1.0.
Could you please help me download it?
Thanks
May 17, 2012 - 2:16 am UTC
please utilize support for something like this.
Grouping moving set
Sanji, May 24, 2012 - 1:07 pm UTC
Tom,
I'm trying to write a query that'd return one record per group from a set of entries coming into the system.
CREATE TABLE EQP_TEST (SEQ_NUM INT, JTYPE CHAR(1),EQP_ID INT, EQP_AMT INT, DATE_TIME TIME);
INSERT INTO EQP_TEST VALUES(1,'A',10,10,CAST ('10:00:00' AS TIME));
INSERT INTO EQP_TEST VALUES(2,'A',10,10,CAST ('12:00:00' AS TIME));
INSERT INTO EQP_TEST VALUES(3,'A',10,10,CAST ('12:30:00' AS TIME));
INSERT INTO EQP_TEST VALUES(4,'B',10,20,CAST ('12:35:00' AS TIME));
INSERT INTO EQP_TEST VALUES(5,'B',10,20,CAST ('13:00:00' AS TIME));
INSERT INTO EQP_TEST VALUES(6,'C',20,40,CAST ('13:30:00' AS TIME));
INSERT INTO EQP_TEST VALUES(7,'C',20,40,CAST ('14:00:00' AS TIME));
INSERT INTO EQP_TEST VALUES(8,'A',10,10,CAST ('14:30:00' AS TIME));
INSERT INTO EQP_TEST VALUES(9,'C',20,40,CAST ('15:00:00' AS TIME));
What i need is
SEQ_NUM JTYPE EQP_ID EQP_AMT DATE_TIME
3 A 10 10 12:30:00
5 B 10 20 13:00:00
7 C 20 40 14:00:00
8 A 10 10 14:30:00
9 C 20 40 15:00:00
I know that max(date_time) over (partition by jtype rows unbounded following) would also include seq_num 8 for JTYPE 'A', which is what I do not want.
I am unable to come up with a way of grouping sequence numbers 1, 2, 3 as one group, 4 and 5 as another, and so on.
Any help is much appreciated.
Thanks
May 24, 2012 - 3:10 pm UTC
I'm sorry but did you mean to ask "askbill" or someone else this question?
CREATE TABLE EQP_TEST (SEQ_NUM INT, JTYPE CHAR(1),EQP_ID INT, EQP_AMT INT,
DATE_TIME TIME);
doesn't really compute in Oracle - and any answer I gave you probably wouldn't work in whatever else you are actually using.
Grouping moving set
Sanji, May 24, 2012 - 4:10 pm UTC
My apologies. I should have known better.
Could you take the time to explain how this requirement can be achieved in Oracle?
CREATE TABLE EQP_TEST (SEQ_NUM NUMBER, JTYPE CHAR(1), EQP_ID NUMBER, EQP_AMT NUMBER, DATE_TIME DATE);
INSERT INTO EQP_TEST VALUES(1,'A',10,10, TO_DATE('2012-05-01:10:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(2,'A',10,10, TO_DATE('2012-05-01:10:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(3,'A',10,10, TO_DATE('2012-05-01:11:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(4,'B',10,20, TO_DATE('2012-05-01:12:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(5,'B',10,20, TO_DATE('2012-05-01:13:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(6,'C',20,40, TO_DATE('2012-05-01:14:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(7,'C',20,40, TO_DATE('2012-05-01:14:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(8,'A',10,10, TO_DATE('2012-05-01:15:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(9,'D',20,40, TO_DATE('2012-05-01:15:45:00','YYYY-MM-DD:HH24:MI:SS'));
ALTER SESSION SET NLS_DATE_FORMAT = 'HH24:MI:SS';
SELECT * FROM EQP_TEST;
SEQ_NUM J EQP_ID EQP_AMT DATE_TIM
---------- - ---------- ---------- --------
1 A 10 10 10:00:00
2 A 10 10 10:30:00
3 A 10 10 11:30:00
4 B 10 20 12:30:00
5 B 10 20 13:00:00
6 C 20 40 14:00:00
7 C 20 40 14:30:00
8 A 10 10 15:30:00
9 D 20 40 15:45:00
I'm looking for
3 A 10 10 11:30:00
5 B 10 20 13:00:00
7 C 20 40 14:30:00
8 A 10 10 15:30:00
9 D 20 40 15:45:00
Appreciate your input.
Thanks
Sanji
May 25, 2012 - 8:51 am UTC
I made the assumption that you wanted to keep the last eqp_* values in each group.
ops$tkyte%ORA11GR2> select *
2 from (
3 select seq_num,
4 jtype,
5 eqp_id,
6 eqp_amt,
7 decode( lead(jtype) over (order by date_time), jtype, null, '<=' ) end_of_grp,
8 date_time
9 from eqp_test
10 )
11 where end_of_grp is not null
12 order by date_time
13 /
SEQ_NUM J EQP_ID EQP_AMT EN DATE_TIM
---------- - ---------- ---------- -- --------
3 A 10 10 <= 11:30:00
5 B 10 20 <= 13:00:00
7 C 20 40 <= 14:30:00
8 A 10 10 <= 15:30:00
9 D 20 40 <= 15:45:00
ops$tkyte%ORA11GR2>
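The LEAD trick above translates directly into a procedural sketch; here is a minimal Python rendering (the (seq_num, jtype) pairs come from the thread's data, already in date_time order): a row is the end of its consecutive group when the next row's jtype differs, or when there is no next row.

```python
# LEAD(jtype) OVER (ORDER BY date_time) compares each row's jtype to the
# next row's; a row ends its consecutive group when they differ or when
# there is no next row. (seq_num, jtype) pairs from the thread:
rows = [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B'),
        (6, 'C'), (7, 'C'), (8, 'A'), (9, 'D')]

ends = [rows[i] for i in range(len(rows))
        if i == len(rows) - 1 or rows[i + 1][1] != rows[i][1]]
```

This keeps seq_nums 3, 5, 7, 8 and 9, matching the SQL output above; note that the later 'A' group (seq_num 8) stays separate from the first, which a plain PARTITION BY jtype could not achieve.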
RE: Grouping moving set
Duke Ganote, May 24, 2012 - 8:39 pm UTC
Tom has a name for this technique, although it escapes my memory at the moment. Just use the LEAD analytic function:
select e.*
from (
select e.seq_num
, e.jtype
, e.eqp_id
, e.eqp_amt
, e.date_time
, case LEAD(Jtype) OVER (order by seq_num)
when Jtype then NULL
else seq_num
end as jagged
from eqp_test e
) e
where jagged IS NOT NULL
order by seq_num
/
SEQ_NUM J EQP_ID EQP_AMT DATE_TIM JAGGED
------- - ---------- ---------- -------- ----------
3 A 10 10 11:30:00 3
5 B 10 20 13:00:00 5
7 C 20 40 14:30:00 7
8 A 10 10 15:30:00 8
9 D 20 40 15:45:00 9
Grouping moving set
Sanji, May 25, 2012 - 9:26 am UTC
Brilliant... Really appreciate all your help Tom/ Duke.
Thanks much...
Sanji
Query
A reader, May 26, 2012 - 10:42 am UTC
Hi Tom,
I need to derive the PK_REF_COL column; it represents the value of the PK column of another record, and it should be derived based on the PARENT_FK and DIR columns.
For each record I must find the next record whose DIR value differs from that of the current record. The order of the records is indicated by the ORDER_NO column.
PK PARENT_FK ORDER_NO DIR PK_REF_COL
10 100 1 N 12
11 100 2 N 12
12 100 3 S 13
13 100 4 N 14
14 100 5 S 15
15 100 6 N NULL
16 200 1 N 20
17 200 2 N 20
18 200 3 N 20
19 200 4 N 20
20 200 5 S 21
21 200 6 N NULL
22 300 1 S 23
23 300 2 N NULL
24 300 3 N NULL
25 300 4 N NULL
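As I read the rule, it can be stated procedurally: within each PARENT_FK, look forward by ORDER_NO for the first row whose DIR differs from the current row's, and take its PK. A minimal Python sketch of that reading (using the rows above) reproduces the expected PK_REF_COL values:

```python
# For each row, PK_REF_COL is the PK of the next row (by ORDER_NO, within
# the same PARENT_FK) whose DIR differs from the current row's DIR.
# (pk, parent_fk, order_no, dir) rows as posted above:
rows = [
    (10, 100, 1, 'N'), (11, 100, 2, 'N'), (12, 100, 3, 'S'),
    (13, 100, 4, 'N'), (14, 100, 5, 'S'), (15, 100, 6, 'N'),
    (16, 200, 1, 'N'), (17, 200, 2, 'N'), (18, 200, 3, 'N'),
    (19, 200, 4, 'N'), (20, 200, 5, 'S'), (21, 200, 6, 'N'),
    (22, 300, 1, 'S'), (23, 300, 2, 'N'), (24, 300, 3, 'N'),
    (25, 300, 4, 'N'),
]

def pk_ref(row):
    pk, fk, order_no, d = row
    later = [r for r in rows if r[1] == fk and r[2] > order_no and r[3] != d]
    return min(later, key=lambda r: r[2])[0] if later else None

refs = [pk_ref(r) for r in rows]
```

In SQL terms this is what an IGNORE NULLS look-ahead over a DESC ordering achieves in the answers below.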
query for PK_REF_COL for reader
Biswaranjan, May 26, 2012 - 9:36 pm UTC
Hi,
@reader
############Input
create table table2(PK number,PARENT_FK number,ORDER_NO number,DIR char(1));
insert into table2 values(10 , 100 , 1, 'N');
insert into table2 values(11 , 100 , 2 , 'N' );
insert into table2 values(12 , 100 , 3 , 'S' );
insert into table2 values(13 , 100 , 4 , 'N' );
insert into table2 values(14 , 100 , 5 , 'S' );
insert into table2 values(15 , 100 ,6 , 'N' );
insert into table2 values(16 , 200 , 1 , 'N' );
insert into table2 values(17 , 200 , 2 , 'N' );
insert into table2 values(18 , 200 , 3 , 'N' );
insert into table2 values(19 , 200 , 4 , 'N' );
insert into table2 values(20 , 200 , 5 , 'S' );
insert into table2 values(21 , 200 ,6 , 'N' );
insert into table2 values(22 , 300 , 1 , 'S' );
insert into table2 values(23 , 300 , 2 , 'N' );
insert into table2 values(24 , 300 , 3 , 'N' );
insert into table2 values(25 , 300 , 4 , 'N' );
############ sql query
SELECT pk,parent_fk,order_no,dir
,decode(x,null,pk+to_number(last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY ROWNUM)-order_no),to_char(pk+1)) PK_REF_COL
FROM(
SELECT table2.*
,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
then null
when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null
then null
else order_no
end X
FROM table2) ask_all order by parent_fk,order_no;
########################output/result
PK PARENT_FK ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
10 100 1 N 12
11 100 2 N 12
12 100 3 S 13
13 100 4 N 14
14 100 5 S 15
15 100 6 N
16 200 1 N 20
17 200 2 N 20
18 200 3 N 20
19 200 4 N 20
20 200 5 S 21
21 200 6 N
22 300 1 S 23
23 300 2 N
24 300 3 N
25 300 4 N
16 rows selected.
@reader
Please always provide the CREATE TABLE and INSERT statements so
Tom or others can make use of them without wasting time
creating and inserting on their own.
If you are new to asktom then that's OK, but please make a habit of doing that. :)
Thanks & Regards,
Biswaranjan.
Continuation of the last posted result
Biswaranjan, May 26, 2012 - 10:00 pm UTC
Hi,
@reader
I just removed the to_number call from my above posted query.
#############
SELECT pk,parent_fk,order_no,dir
,decode(x,null,pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY ROWNUM)-order_no,pk+1) PK_REF_COL
FROM(
SELECT table2.*
,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
then null
when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null
then null
else order_no
end X
FROM table2) order by parent_fk,order_no;
#########
regards,
Biswaranjan
query
A reader, May 26, 2012 - 10:11 pm UTC
Thanks, I will..
Biswaranjan, May 27, 2012 - 7:38 am UTC
I was just checking my above posted pk_ref_col query with different inputs.
I got the logic slightly wrong.
test cases with diff inputs########
SQL> select * from table2;
PK PARENT_FK ORDER_NO D
---------- ---------- ---------- -
10 100 1 N
11 100 2 N
12 100 3 S
13 100 4 N
14 100 5 N
15 100 6 S
16 200 1 N
17 200 2 N
18 200 3 N
19 200 4 N
20 200 5 S
21 200 6 N
22 300 1 S
23 300 2 N
24 300 3 S
25 300 4 N
16 rows selected.
SQL> SELECT pk,parent_fk,order_no,dir
2 ,decode(x,null,pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY
3 ROWNUM)-order_no,pk+1) PK_REF_COL
4 FROM(
5 SELECT table2.*
6 ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
7 or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
8 then null
9 when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null
10 then null
11 else order_no
12 end X
13 FROM table2) order by parent_fk,order_no;
PK PARENT_FK ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
10 100 1 N 12
11 100 2 N 12
12 100 3 S 13
13 100 4 N
14 100 5 N
15 100 6 S
16 200 1 N 20
17 200 2 N 20
18 200 3 N 20
19 200 4 N 20
20 200 5 S 21
21 200 6 N
22 300 1 S 23
23 300 2 N 24
24 300 3 S 25
25 300 4 N
16 rows selected.
So the above query was not giving the expected result.
So I rewrote the query; I tried many inputs and found no error.
@reader Please ignore the last posted query and use the following one instead.
###############
SQL> SELECT pk,parent_fk,order_no,dir
2 ,case when x is null
3 then pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY order_no desc)-order_no
4 when s =1
5 then null
6 else
7 pk+1
8 end pk_ref_col
9 FROM(
10 SELECT table2.*
11 ,row_number() over (PARTITION BY PARENT_FK order by ORDER_NO desc) s
12 ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
13 or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
14 then null
15 else order_no
16 end X
17 FROM table2) order by parent_fk,order_no;
PK PARENT_FK ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
10 100 1 N 12
11 100 2 N 12
12 100 3 S 13
13 100 4 N 15
14 100 5 N 15
15 100 6 S
16 200 1 N 20
17 200 2 N 20
18 200 3 N 20
19 200 4 N 20
20 200 5 S 21
21 200 6 N
22 300 1 S 23
23 300 2 N 24
24 300 3 S 25
25 300 4 N
16 rows selected.
Thanks & Regards,
Biswaranjan.
query
A reader, May 27, 2012 - 1:22 pm UTC
Hi Biswaranjan,
The pk column is not a consecutive number; it is random, but it is unique.
query
A reader, May 27, 2012 - 1:24 pm UTC
Biswaranjan,
I mean in the example it is consecutive, but it could be random.
query for PK_REF_COL for reader for consecutive or non consecutive pk
Biswaranjan, May 27, 2012 - 3:59 pm UTC
SQL> select * from table1;
PK PARENT_FK ORDER_NO D
---------- ---------- ---------- -
10 100 1 N
11 100 2 N
13 100 3 S
14 100 4 N
15 100 5 S
16 100 6 N
16 200 1 N
17 200 2 N
18 200 3 N
19 200 4 N
20 200 5 S
21 200 6 N
22 300 1 S
24 300 2 N
25 300 3 N
28 300 4 N
16 rows selected.
SQL> SELECT pk,parent_fk,order_no,dir
2 ,case when x is null
3 then last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY order_no desc)
4 when s =1
5 then null
6 else
7 lag(pk) over(partition by parent_fk order by order_no desc)
8 end pk_ref_col
9 FROM(
10 SELECT table1.*
11 ,row_number() over (PARTITION BY PARENT_FK order by ORDER_NO desc) s
12 ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
13 or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
14 then null
15 else pk
16 end X
17 FROM table1) order by parent_fk,order_no;
PK PARENT_FK ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
10 100 1 N 13
11 100 2 N 13
13 100 3 S 14
14 100 4 N 15
15 100 5 S 16
16 100 6 N
16 200 1 N 20
17 200 2 N 20
18 200 3 N 20
19 200 4 N 20
20 200 5 S 21
21 200 6 N
22 300 1 S 24
24 300 2 N
25 300 3 N
28 300 4 N
16 rows selected.
SQL>
@reader you can make use of the above query. Please check with
various inputs and let us know whether it works (I tested it and it worked fine).
Thanks & Regards,
Biswaranjan.
results that never changed over certain period of time
Ashrf, June 18, 2012 - 5:32 am UTC
create table emps_salaries (emp_id number, r_year number(4), r_month number(2), r_code number, r_value number);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 8, 102, 10187.097);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 8, 221, 516.129);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 8, 597, 6000);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 8, 598, 3783.871);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 9, 102, 840);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 9, 221, 46.667);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 10, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 10, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 11, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 11, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 12, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2010, 12, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 1, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 1, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 2, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 2, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 3, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 3, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 4, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 4, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 5, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 5, 199, 918.333);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 5, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 6, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 6, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 6, 599, 918.333);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 7, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 7, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 8, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 8, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 9, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 9, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 10, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 10, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 11, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 11, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 12, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2011, 12, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 1, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 1, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 2, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 2, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 3, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 3, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 4, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 4, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 5, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 5, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 6, 102, 900);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 6, 221, 50);
insert into emps_salaries
(emp_id, r_year, r_month, r_code, r_value)
values
(1000000, 2012, 6, 233, 150);
commit;
I want to write a SQL statement that decides whether a salary changed over a certain period of time, where the period is supplied as bind variables (i.e. did the salary change in the period 10/2010 through 04/2011). THANKS
June 18, 2012 - 9:09 am UTC
you could have made the example a tad bit shorter.
but this is trivial
select count(distinct r_value) from table where date between :x and :y;
if count > 1, it changed.
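Since emps_salaries stores the period as separate r_year/r_month columns rather than a DATE, the sketch above needs the period folded into a comparable value. One hedged adaptation (assuming bind variables :from_ym and :to_ym are supplied in YYYYMM form, e.g. 201010 and 201104, and that "the salary" means the monthly total across pay codes - adjust the inner query if a single r_code is what matters):

```sql
-- More than one distinct monthly total in the window => the salary changed.
SELECT COUNT(DISTINCT total) AS distinct_totals
  FROM (SELECT r_year, r_month, SUM(r_value) AS total
          FROM emps_salaries
         WHERE emp_id = :emp_id
           AND r_year * 100 + r_month BETWEEN :from_ym AND :to_ym
         GROUP BY r_year, r_month);
```

If distinct_totals > 1, the salary changed during the period.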
Help/guidance writing analytical function SQL
A reader, June 28, 2012 - 8:03 am UTC
DROP TABLE SK_T;
CREATE TABLE SK_T
(ITEM VARCHAR2(25) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE,
PRICE_TYPE VARCHAR2(20),
PRICE NUMBER(10,2));
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.5);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.75);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/15/2012', 'MM/DD/RRRR'), TO_DATE('07/22/2012', 'MM/DD/RRRR'), 'SALE', 1.75);
Is it possible to generate the following using just SQL? The table can have 3-4 million rows.
ITEM START_DATE END_DATE PRICE_TYPE PRICE
123456 07/01/2012 07/14/2012 REGULAR 2.00
123456 07/15/2012 07/22/2012 SALE 1.75
123456 07/23/2012 07/31/2012 REGULAR 2.00
123456 08/01/2012 08/31/2012 REGULAR 2.50
123456 09/01/2012 NULL REGULAR 1.75
In the table Regular Priced items will only have the Start Date and Sale Priced will have a Start Date and End Date.
If the Regular Priced Item's date range falls between a Sale priced item's date range then I have to generate a line for Start date and End Date and then a new row after the sale ends.
June 29, 2012 - 9:17 am UTC
explain your logic thoroughly - precisely.
and if this has to work for multiple items, by all means expand your test case to be inclusive of multiple items so one can check for edge conditions.
but most importantly - explain this to death. where did that jul-23-jul31 record magically appear from. where did the 1.75 come from for september. detail your logic.
Help/guidance writing analytical function SQL
A reader, June 28, 2012 - 8:43 am UTC
Continuation to my previous question.........
Using Lead() I can generate the End Date value for Regular Items, my problem is generating new lines for the overlapping rows
ITEM START_DATE END_DATE PRICE_TYPE PRICE
123456 07/01/2012 07/31/2012 REGULAR 2.00
123456 07/15/2012 07/22/2012 SALE 1.75
123456 07/26/2012 07/30/2012 SALE 1.70
Should be converted to
ITEM START_DATE END_DATE PRICE_TYPE PRICE
123456 07/01/2012 07/31/2012 REGULAR 2.00
123456 07/15/2012 07/22/2012 SALE 1.75
123456 07/23/2012 07/25/2012 REGULAR 2.00
123456 07/26/2012 07/30/2012 SALE 1.70
123456 07/31/2012 07/31/2012 REGULAR 2.00
June 29, 2012 - 9:22 am UTC
why do you have overlapping ranges? what if you have multiple overlapping ranges? This doesn't make sense - to store anything like that. Are you basically saying "sale overrides regular", and do you guarantee that ONLY ONE sale will override any given regular???
Help/guidance writing analytical function SQL
A reader, June 29, 2012 - 3:46 pm UTC
The overlapping date range in the above example was a typing error. Sorry for the confusion.
We have a table that stores the regular prices of the item, It has item_id, Start Date(/Effective Date) for the price and the Price.
Example
ITEM START_DATE PRICE
123456 08/01/2012 2.50
123456 09/01/2012 2.75
The regular Price is valid until the next Start Date/Effective date.
That means from Aug 01, 2012 thru Aug 31, 2012 the Price is $2.50 and from Sep 01 the Price will be $2.75. Sep 01, 2012's price does not have an End date because there is no Start Date greater than Sep 01, 2012.
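The "valid until the next start date" rule maps directly onto LEAD(); a minimal sketch (table and column names assumed from the test case posted later in this thread):

```sql
-- Derive each regular price's implicit end date from the next row's
-- start date; the most recent price gets a NULL end date (open-ended).
SELECT item,
       start_date,
       LEAD(start_date) OVER (PARTITION BY item ORDER BY start_date) - 1 AS end_date,
       price
  FROM sk_reg_temp;
```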
There is another table that has Sale/Special Prices on that item. That table has Item Id, Start date, End date and the Price.
Start date and End date are NOT NULL and there won't be an overlap in the date ranges for the same item.
Example
ITEM START_DATE END_DATE PRICE_TYPE PRICE
123456 08/01/2012 08/06/2012 SALE 1.50
123456 08/21/2012 08/28/2012 SALE 1.75
That means the Price on the Item is $1.50 from Aug 01, 2012 thru Aug 06, 2012 and then reverts back to the Regular Price after the Sale Period ends.
I need to combine the two tables and generate the price for the item with its corresponding date ranges.
Please note that after the sale price period ends, the price should revert back to the regular price.
For the above example we should get the following output
ITEM START_DATE END_DATE PRICE_TYPE PRICE
123456 08/01/2012 08/06/2012 SALE 1.50
123456 08/07/2012 08/20/2012 REGULAR 2.50
123456 08/21/2012 08/28/2012 SALE 1.75
123456 08/29/2012 08/31/2012 REGULAR 2.50
123456 09/01/2012 NULL REGULAR 2.75
I am resending the test script again with more data for two items.
Also, I am attaching a SQL statement I was able to come up with which gives the output in the above format. I don't know whether it's efficient or not.
DROP TABLE SK_T;
CREATE TABLE SK_T
(ITEM VARCHAR2(25) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE,
PRICE_TYPE VARCHAR2(20),
PRICE NUMBER(10,2));
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/15/2012', 'MM/DD/RRRR'), TO_DATE('07/22/2012', 'MM/DD/RRRR'), 'SALE', 1.50);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/23/2012', 'MM/DD/RRRR'), TO_DATE('07/26/2012', 'MM/DD/RRRR'), 'SALE', 1.25);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), TO_DATE('08/06/2012', 'MM/DD/RRRR'), 'SALE', 1.50);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/21/2012', 'MM/DD/RRRR'), TO_DATE('08/28/2012', 'MM/DD/RRRR'), 'SALE', 1.75);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/21/2012', 'MM/DD/RRRR'), TO_DATE('09/28/2012', 'MM/DD/RRRR'), 'SALE', 1.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.5);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.75);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('07/17/2012', 'MM/DD/RRRR'), TO_DATE('07/24/2012', 'MM/DD/RRRR'), 'SALE', 7.50);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), TO_DATE('09/07/2012', 'MM/DD/RRRR'), 'SALE', 8);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('06/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 11.50);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('07/25/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 10.75);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('10/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 11.25);
COMMIT;
WITH a AS --Using this to generate Regular Price row after the Sale Ends.
(SELECT LEVEL L
FROM DUAL
CONNECT BY LEVEL <= 2),
I AS
(SELECT ITEM,
START_DATE,
(CASE WHEN PRICE_TYPE='SALE' THEN
END_DATE
WHEN END_DATE > LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE) THEN
LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE)-1
ELSE
END_DATE
END) END_DATE,
LAG(PRICE_TYPE,1) OVER (PARTITION BY ITEM ORDER BY START_DATE) PREV_PRICE_TYPE,
LEAD(PRICE_TYPE,1) OVER (PARTITION BY ITEM ORDER BY START_DATE) NEXT_PRICE_TYPE,
(CASE WHEN PRICE_TYPE='SALE' THEN
LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN END_DATE END),1) IGNORE NULLS OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE)
ELSE
END_DATE
END) ORIG_END_DATE,
(CASE WHEN PRICE_TYPE='SALE' THEN
LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN PRICE END),1) IGNORE NULLS OVER (PARTITION BY ITEM ORDER BY START_DATE, ORD_TYPE)
END) ORIG_PRICE,
PRICE_TYPE,
PRICE,
ORD_TYPE,
LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE) NEXT_SALE_START_DATE
FROM (SELECT ITEM,
START_DATE,
NVL((CASE WHEN END_DATE IS NOT NULL THEN
END_DATE
ELSE
(LEAD(START_DATE,1) IGNORE NULLS
OVER (PARTITION BY ITEM, PRICE_TYPE
ORDER BY START_DATE))-1
END), TO_DATE('31-DEC-5099', 'DD-MON-YYYY')) END_DATE,
PRICE_TYPE,
PRICE,
DECODE(PRICE_TYPE, 'REGULAR', 1, 2) ORD_TYPE
FROM SK_T
ORDER BY 1,2)),
items AS
(SELECT ITEM, START_DATE, END_DATE, PRICE_TYPE, PRICE, ORIG_END_DATE, PREV_PRICE_TYPE, NEXT_PRICE_TYPE, ORIG_PRICE, L,ORD_TYPE, NEXT_SALE_START_DATE
FROM A, I
WHERE (L=1
OR (PRICE_TYPE='SALE'
AND END_DATE < ORIG_END_DATE
AND ((PREV_PRICE_TYPE ='REGULAR' AND NEXT_PRICE_TYPE='REGULAR')
OR (PREV_PRICE_TYPE ='SALE' AND NEXT_PRICE_TYPE='REGULAR')
OR NEXT_SALE_START_DATE-1 != END_DATE
OR NEXT_PRICE_TYPE IS NULL)))),
qry AS
(SELECT ITEM,
(CASE WHEN L= 2 THEN
(LAST_VALUE(END_DATE) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE, LEVEL))+1
ELSE
START_DATE
END) START_DATE,
(CASE WHEN L= 2 THEN
(CASE WHEN (NEXT_SALE_START_DATE-1) < ORIG_END_DATE THEN
NEXT_SALE_START_DATE-1
ELSE
ORIG_END_DATE
END)
ELSE
END_DATE
END) END_DATE,
(CASE WHEN L= 2 THEN
'REGULAR'
ELSE
PRICE_TYPE
END) PRICE_TYPE,
(CASE WHEN L= 2 THEN
ORIG_PRICE
ELSE
PRICE
END) PRICE
FROM items)
SELECT item, start_date, end_date, price_type, price
FROM qry
WHERE start_date <= NVL(end_date,start_date)
ORDER BY 1,2;
July 02, 2012 - 6:52 am UTC
There is another table than has Sale/Special Prices on that item.
then why the heck does your example have but one table??? why would you do that - why would you say "i have two" and then give me but one???????? How can someone develop an answer for you if you do stuff like that????? Your real life problem has nothing to do with your test case???
I doubt your query works if there are two sales in the window of a regular price, that is, if you have a regular price for the month of august, but it goes on sale on the 5th-10th and the 20th-25th. You only produce two rows of a regular price - you might need N rows.
when you run this query:
a) will you do this for an item or for N items (where N is generally what number) or all items?
b) how large do you anticipate the date ranges you have to deal with being? weeks, months, years?
c) please fix your example to model REALITY, if you have two tables - please use that in your example - it would be a waste of my time to provide a solution that doesn't work in real life.
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 8:37 am UTC
The reason I didn't give the example with two tables was because I thought it's only a UNION ALL statement between the two tables and I didn't want to confuse you with a lot of details.
a) will you do this for an item or for N items (where N is generally what number) or all items?
The query is all Items in the tables(SK_REG_TEMP and SK_SALE_TEMP)
There can be items in SK_REG_TEMP and not in SK_SALE_TEMP and vice versa.
b) how large do you anticipate the date ranges you have to deal with being? weeks, months, years?
Most of the date ranges will be in weeks, but there will be few that are in months(4-6 months)
c) please fix your example to model REALITY, if you have two tables - please use that in your example - it would be a waste of my time to provide a solution that doesn't work in real life.
Please find the updated test case below.
And sorry for the confusion.
DROP TABLE SK_REG_TEMP;
DROP TABLE SK_SALE_TEMP;
CREATE TABLE SK_REG_TEMP
(ITEM VARCHAR2(25) NOT NULL,
START_DATE DATE NOT NULL,
PRICE NUMBER(10,2));
CREATE TABLE SK_SALE_TEMP
(ITEM VARCHAR2(25) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE NOT NULL,
PRICE NUMBER(10,2));
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/01/2012', 'MM/DD/RRRR'), 2.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('08/15/2012', 'MM/DD/RRRR'), 2.50);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('12/15/2012', 'MM/DD/RRRR'), 2.75);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('07/15/2012', 'MM/DD/RRRR'), 25.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('01/15/2013', 'MM/DD/RRRR'), 24.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM3', TO_DATE('07/23/2012', 'MM/DD/RRRR'), 15.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM4', TO_DATE('07/12/2012', 'MM/DD/RRRR'), 31.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM4', TO_DATE('01/15/2013', 'MM/DD/RRRR'), 33.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/08/2012', 'MM/DD/RRRR'), TO_DATE('07/15/2012', 'MM/DD/RRRR'), 1.50);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/23/2012', 'MM/DD/RRRR'), TO_DATE('07/30/2012', 'MM/DD/RRRR'), 1.75);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('08/15/2012', 'MM/DD/RRRR'), TO_DATE('01/15/2013', 'MM/DD/RRRR'), 1.90);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('07/19/2012', 'MM/DD/RRRR'), TO_DATE('07/26/2012', 'MM/DD/RRRR'), 22.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('09/23/2012', 'MM/DD/RRRR'), TO_DATE('09/30/2012', 'MM/DD/RRRR'), 24.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM3', TO_DATE('08/14/2012', 'MM/DD/RRRR'), TO_DATE('08/21/2012', 'MM/DD/RRRR'), 12.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM5', TO_DATE('10/20/2012', 'MM/DD/RRRR'), TO_DATE('10/27/2012', 'MM/DD/RRRR'), 15.70);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM5', TO_DATE('12/21/2012', 'MM/DD/RRRR'), TO_DATE('02/15/2013', 'MM/DD/RRRR'), 14.70);
COMMIT;
Desired Output........
ITEM START_DATE END_DATE PRICE_TYPE PRICE
ITEM1 07/01/2012 07/07/2012 REG 2.00
ITEM1 07/08/2012 07/15/2012 SALE 1.50
ITEM1 07/16/2012 07/22/2012 REG 2.00
ITEM1 07/23/2012 07/30/2012 SALE 1.75
ITEM1 07/31/2012 08/14/2012 REG 2.00
ITEM1 08/15/2012 01/15/2013 SALE 1.90
ITEM1 01/16/2013 NULL REG 2.75
ITEM2 07/15/2012 07/18/2012 REG 25.00
ITEM2 07/19/2012 07/26/2012 SALE 22.00
ITEM2 07/27/2012 09/22/2012 REG 25.00
ITEM2 09/23/2012 09/30/2012 SALE 24.00
ITEM2 10/01/2012 01/14/2013 REG 25.00
ITEM2 01/15/2013 NULL REG 24.00
ITEM3 07/23/2012 08/13/2012 REG 15.00
ITEM3 08/14/2012 08/21/2012 SALE 12.00
ITEM3 08/22/2012 NULL REG 15.00
ITEM4 07/12/2012 01/14/2013 REG 31.00
ITEM4 01/15/2013 NULL REG 33.00
ITEM5 10/20/2012 10/27/2012 SALE 15.70
ITEM5 12/21/2012 02/15/2013 SALE 14.70
July 02, 2012 - 12:52 pm UTC
you might consider a data model more suited to representing this data relationally....
It isn't pretty, but I'm pretty sure it works.
The concept is to turn the ranges into rows. For the regular price we have to figure out the end date - which is easy for all of the rows except the last row in the set, by item. That "end date" will be the greater of a) the last observed regular price date, or b) the last observed sale price date for that item.
Once we have the end date - it is easy to turn the single row into a row per day per record.
Same with the sale rows.
Then just join (by_date),
and then do a bit of grouping to get the ranges back again, with a fancy case at the end to figure out whether to print the end date or leave it null.
ops$tkyte%ORA11GR2> with reg
2 as
3 (
4 select item, start_date,
5 nvl( lead(start_date-1) over (partition by item order by start_date),
6 greatest( r.start_date+1,
7 (select max(s.end_date+1)
8 from sk_sale_temp s
9 where s.item = r.item)) ) end_date,
10 price
11 from sk_reg_temp r
12 ),
13 reg_by_date
14 as
15 (
16 select item, start_date+column_value curr_date, start_date, end_date,
17 price reg_price, cast( null as number ) sale_price, 'R' price_type
18 from reg,
19 TABLE( cast(multiset(select level-1 from dual
20 connect by level <= end_date-start_date+1) as sys.odciNumberList) )
21 ),
22 sale_by_date
23 as
24 (select item, start_date+column_value curr_date, start_date, end_date,
25 cast( null as number ) reg_price, price sale_price, 'S' price_type
26 from sk_sale_temp,
27 TABLE( cast(multiset(select level-1 from dual
28 connect by level <= end_date-start_date+1) as sys.odciNumberList) )
29 ),
30 by_date
31 as
32 (
33 select item, curr_date,
34 case when max(price_type) = 'R'
35 then max(reg_price)
36 else max(sale_price)
37 end the_price,
38 max(price_type) the_price_type
39 from (
40 select *
41 from reg_by_date
42 union all
43 select *
44 from sale_by_date
45 )
46 group by item, curr_date
47 )
48 select item, sdate,
49 case when the_price_type <> 'R' or edate <> last_date
50 then edate
51 end edate, the_price_type, the_price
52 from (
53 select item, min(curr_date) sdate, max(curr_date) edate, the_price, the_price_type,
54 max( max(curr_date) ) over (partition by item) last_date
55 from (
56 select item, curr_date, the_price, the_price_type, count(flag) over (partition by item order by curr_date) grp
57 from (
58 select item, curr_date, the_price, the_price_type,
59 case when lag(curr_date) over (partition by item order by curr_date) <> curr_date-1
60 or lag(the_price) over (partition by item order by curr_date) <> the_price
61 or lag(the_price_type) over (partition by item order by curr_date) <> the_price_type
62 then 1
63 end flag
64 from by_date
65 )
66 )
67 group by item, the_price, the_price_type, grp
68 )
69 order by 1, 2
70 /
ITEM SDATE EDATE T THE_PRICE
------------------------- --------- --------- - ----------
ITEM1 01-JUL-12 07-JUL-12 R 2
08-JUL-12 15-JUL-12 S 1.5
16-JUL-12 22-JUL-12 R 2
23-JUL-12 30-JUL-12 S 1.75
31-JUL-12 14-AUG-12 R 2
15-AUG-12 15-JAN-13 S 1.9
16-JAN-13 R 2.75
ITEM2 15-JUL-12 18-JUL-12 R 25
19-JUL-12 26-JUL-12 S 22
27-JUL-12 22-SEP-12 R 25
23-SEP-12 30-SEP-12 S 24
01-OCT-12 14-JAN-13 R 25
15-JAN-13 R 24
ITEM3 23-JUL-12 13-AUG-12 R 15
14-AUG-12 21-AUG-12 S 12
22-AUG-12 R 15
ITEM4 12-JUL-12 14-JAN-13 R 31
15-JAN-13 R 33
ITEM5 20-OCT-12 27-OCT-12 S 15.7
21-DEC-12 15-FEB-13 S 14.7
20 rows selected.
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 9:14 am UTC
The following query looks like it's working for the above test data. I have two sale rows in the regular price window for item ITEM1 (07/08/2012 to 07/15/2012 and 07/23/2012 to 07/30/2012). I am producing two rows for every sale row.
It looks ugly and I don't know if it's efficient.
WITH a AS --Using this to generate Regular Price row after the Sale Ends.
(SELECT LEVEL L
FROM DUAL
CONNECT BY LEVEL <= 2),
STG AS
(SELECT ITEM, START_DATE, TO_DATE(NULL) END_DATE, 'REGULAR' PRICE_TYPE, PRICE
FROM SK_REG_TEMP
UNION ALL
SELECT ITEM, START_DATE, END_DATE, 'SALE' PRICE_TYPE, PRICE
FROM SK_SALE_TEMP),
I AS
(SELECT ITEM,
(CASE WHEN PRICE_TYPE='SALE' THEN
START_DATE
WHEN RN > 1 AND START_DATE < LAG((CASE WHEN PRICE_TYPE='SALE' THEN END_DATE +1 END),1, SYSDATE-3000) OVER (PARTITION BY ITEM ORDER BY END_DATE) THEN
LAG((CASE WHEN PRICE_TYPE='SALE' THEN END_DATE +1 END),1) OVER (PARTITION BY ITEM ORDER BY END_DATE)
ELSE
START_DATE
END) START_DATE,
(CASE WHEN PRICE_TYPE='SALE' THEN
END_DATE
WHEN END_DATE > LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE-1 END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE) THEN
LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE-1 END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE)
ELSE
END_DATE
END) END_DATE,
LAG(PRICE_TYPE,1) OVER (PARTITION BY ITEM ORDER BY START_DATE) PREV_PRICE_TYPE,
LEAD(PRICE_TYPE,1) OVER (PARTITION BY ITEM ORDER BY START_DATE) NEXT_PRICE_TYPE,
(CASE WHEN PRICE_TYPE='SALE' THEN
LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN END_DATE END),1) IGNORE NULLS OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE)
ELSE
END_DATE
END) ORIG_END_DATE,
(CASE WHEN PRICE_TYPE='SALE' THEN
LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN PRICE END),1) IGNORE NULLS OVER (PARTITION BY ITEM ORDER BY START_DATE, ORD_TYPE)
END) ORIG_PRICE,
PRICE_TYPE,
PRICE,
ORD_TYPE,
LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE) NEXT_SALE_START_DATE
FROM (SELECT ITEM,
START_DATE,
NVL((CASE WHEN END_DATE IS NOT NULL THEN
END_DATE
ELSE
(LEAD(START_DATE,1) IGNORE NULLS
OVER (PARTITION BY ITEM, PRICE_TYPE
ORDER BY START_DATE))-1
END), TO_DATE('31-DEC-5099', 'DD-MON-YYYY')) END_DATE,
PRICE_TYPE,
PRICE,
DECODE(PRICE_TYPE, 'REGULAR', 1, 2) ORD_TYPE,
ROW_NUMBER() OVER (PARTITION BY ITEM ORDER BY START_DATE) RN
FROM STG
ORDER BY 1,2)),
items AS
(SELECT ITEM, START_DATE, END_DATE, PRICE_TYPE, PRICE, ORIG_END_DATE, PREV_PRICE_TYPE, NEXT_PRICE_TYPE, ORIG_PRICE, L,ORD_TYPE, NEXT_SALE_START_DATE
FROM A, I
WHERE (L=1
OR (PRICE_TYPE='SALE'
AND END_DATE < ORIG_END_DATE
AND ((PREV_PRICE_TYPE ='REGULAR' AND NEXT_PRICE_TYPE='REGULAR')
OR (PREV_PRICE_TYPE ='SALE' AND NEXT_PRICE_TYPE='REGULAR')
OR NEXT_SALE_START_DATE-1 != END_DATE
OR NEXT_PRICE_TYPE IS NULL)))),
qry AS
(SELECT ITEM,
(CASE WHEN L= 2 THEN
(LAST_VALUE(END_DATE) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE, LEVEL))+1
ELSE
START_DATE
END) START_DATE,
(CASE WHEN L= 2 THEN
(CASE WHEN (NEXT_SALE_START_DATE-1) < ORIG_END_DATE THEN
NEXT_SALE_START_DATE-1
ELSE
ORIG_END_DATE
END)
ELSE
END_DATE
END) END_DATE,
(CASE WHEN L= 2 THEN
'REGULAR'
ELSE
PRICE_TYPE
END) PRICE_TYPE,
(CASE WHEN L= 2 THEN
ORIG_PRICE
ELSE
PRICE
END) PRICE
FROM items)
SELECT item, start_date, end_date, price_type, price
FROM qry
WHERE start_date <= NVL(end_date,start_date)
ORDER BY 1,2;
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 2:58 pm UTC
Thanks a lot for the SQL.
The SQL I wrote was not dividing the regular and sales rows by date, but trying to find the end_date+1 from the Sales row to get the next regular price row.
Your SQL is much cleaner and easier to understand. Thanks again for the nice SQL.
Is there any difference between CAST(NULL AS NUMBER) and TO_NUMBER(NULL)?
July 03, 2012 - 8:04 am UTC
cast(null as number) and to_number(null) lead to pretty much the same effect. I just personally find cast( null as number ) to be cleaner to read.
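A quick way to see that both forms behave the same - a sketch only, run against DUAL:

```sql
-- Both expressions yield a NULL of datatype NUMBER; in a CTAS both
-- columns would come out described as NUMBER.
SELECT CAST(NULL AS NUMBER) AS via_cast,
       TO_NUMBER(NULL)      AS via_to_number
  FROM dual;
```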
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 3:13 pm UTC
You had mentioned earlier
"you might consider a data model more suited to representing this data relationally.... "
It's not possible for me to change the data model right now.
If you were to design the data model, how would you have done this?
Requirement
1) Maintain Regular Price of the Items with effective/start date.
2) Maintain a sale price. The sale price will have a start and end date.
The sale price can be a fixed price ($2.25), a percentage off retail (25% off), or a fixed dollar amount off retail ($0.50 off).
July 03, 2012 - 8:19 am UTC
The requirement can be more readily stated as:
maintain the correct price of something during a period of time.
I probably would have gone with:
item, start_date, end_date, price, description
where description was 'regular' or 'sale'. Single table, non-overlapping ranges.
In short, the table would have been your report, your report would have been "select * from table order by item, start_date"
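One possible shape for that suggestion - the names here are illustrative, not taken from the thread, and the sale-price variants (fixed, percent off, dollars off) would need an extra column or two if they must be preserved rather than pre-computed:

```sql
-- A single effective-dated price history table; rows never overlap per item,
-- and the current price is the row with a NULL end_date.
CREATE TABLE item_price_history
( item        VARCHAR2(25)  NOT NULL,
  start_date  DATE          NOT NULL,
  end_date    DATE,                        -- NULL = open-ended current price
  price_type  VARCHAR2(10)  NOT NULL,      -- 'REGULAR' or 'SALE'
  price       NUMBER(10,2)  NOT NULL,
  CONSTRAINT iph_pk PRIMARY KEY (item, start_date)
);
```

With this model the desired report collapses to `SELECT * FROM item_price_history ORDER BY item, start_date` - no analytics needed at query time.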
Analytics or not
Tony Fernandez, August 01, 2012 - 3:05 pm UTC
Dear Tom,
We have this query we are trying to implement using analytic functions, but the cost and statistics increase dramatically. We tried "with" in the select statement to "prepare data", but no gain at all. We need your help:
( data provided is very small set, tables go to millions of rows each )
Here is query:
select case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
else dic.distributor_id
end distributor_id
,case when dic.din_mapping_grp_flag = 1 then do.distributor_name
else dc.distributor_name
end distributor_name
,dic.din
,dic.active_flag
,min ( dic.dist_item_source ) dist_item_source
,max ( dic.brand ) brand
,max ( dic.description ) description
from dist_item_catalog dic
,distributor do
,distributor dc
where dic.dist_org_id = do.distributor_id
and dic.distributor_id = dc.distributor_id
and dic.active_flag = 1
group by case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
else dic.distributor_id
end,
case when dic.din_mapping_grp_flag = 1 then do.distributor_name
else dc.distributor_name
end,
dic.din,
dic.active_flag;
Table distributor is:
CREATE TABLE DISTRIBUTOR
( DISTRIBUTOR_ID VARCHAR2(13 BYTE) NOT NULL,
DISTRIBUTOR_NAME VARCHAR2(60 BYTE) NOT NULL );
SET DEFINE OFF;
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00729', 'Sysco Sacramento');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00705', 'AFI Foodservice INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00699', 'US Foods Detroit INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00697', 'Hansen Sales - VEND');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00696', 'Fox River');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00703', 'DiCarlo Distributors');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00701', 'US Foods Cincinnati INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00700', 'US Foods Knoxville 2270 6H');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00730', 'US Foods Paris (formerly PYA ) / 5Q INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00718', 'Upper Lakes Foods Inc');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00719', 'Fleming INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00720', 'Camellia INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00724', 'Reinhart Foodservice Twin Cities');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00725', 'Indianhead Foodservice ');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00726', 'Martin Preferred Foods');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00727', 'Wenning and Sons INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00706', 'J Kings Food Service Professionals - Pro Act');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00707', 'Texoma Meat Company INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00710', 'Q S I Foods INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00711', 'Sysco Jamestown INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00712', 'City Meats INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00713', 'Merit Day Food Service INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00714', 'US Foods Milwaukee 248 (formerly Alliant) 3D');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2001-00715', 'Sysco Pittsburgh');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2002-01122', 'Sysco Hampton Roads');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2002-01106', 'Martin Brothers');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2003-00231', 'Dean Foods - Pet Dairy Johnson City TN (Land O Sun)');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2003-00232', 'Dean Foods - Pet Dairy Spartanburg SC INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2003-00233', 'Dean Foods - Schepps Dairy Houston INACTIVE');
Insert into GPO.DISTRIBUTOR
(DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
Values
('DO-2003-00234', 'Dean Foods - Shenandoah''s Pride Dairy');
COMMIT;
and table dist_item_catalog:
CREATE TABLE DIST_ITEM_CATALOG
(
DIN VARCHAR2(20 BYTE) NOT NULL,
DIST_ITEM_SOURCE VARCHAR2(1 BYTE) NOT NULL,
DISTRIBUTOR_ID VARCHAR2(13 BYTE),
DIST_ORG_ID VARCHAR2(13 BYTE),
DIN_MAPPING_GRP_FLAG NUMBER(1),
ACTIVE_FLAG NUMBER,
DESCRIPTION VARCHAR2(100 BYTE),
BRAND VARCHAR2(40 BYTE)
);
SET DEFINE OFF;
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '17253', 1,
'D', 'PACTIV', 'TRAY FOAM SCHL 5COMP WHT 8X10');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '55659', 1,
'D', 'SUNKSTS', 'DRINK CRANBERRY CKTL 10% 4+1');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '59859', 1,
'D', 'SUBWAY', 'TURKEY BRST CKD SLI .5 OZ');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '83737', 1,
'D', 'PRIMOS', 'BOX BAKERY 10X10X4 PRINT PRIMO');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '83790', 1,
'D', 'C&M', 'LID PLAS DOME F/9" DEEP');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '100628', 1,
'D', 'SYS IMP', 'JUICE ORANGE 100% NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '101675', 1,
'D', 'SYS IMP', 'WATER LEMON NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '102020', 1,
'D', 'SYS IMP', 'MILK 2% DAIRY HONEY THICK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '117341', 1,
'D', 'BASICAM', 'POTATO MASHED NATURE S OWN');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '127672', 1,
'D', 'SUBWAY', 'FILM PVC RL 18"X2000 W/CUTTER');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '138792', 1,
'D', 'PACKER', 'SUGAR COARSE CON AA');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '143356', 1,
'D', 'SYS CLS', 'POTATO H/BRN DEHY GLDN GRL');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '156901', 1,
'D', 'HELLMAN', 'DRESSING 1000 ISL FF PKT');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '172623', 1,
'D', 'CHFMXWL', 'PEA GREEN FRZN');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '213452', 1,
'D', 'SUBWAY', 'DOUGH COOKIE RASP CHSCAK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '218962', 1,
'D', 'SYS CLS', 'CHICKEN WING BUFF GLZE 1&2 PCK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '236964', 1,
'D', 'SLIKRIK', 'MIX SPICE CHILI ORGANIC');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '237018', 1,
'D', 'SLIKRIK', 'MIX SPICE RED HOT ORGANIC');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '284137', 1,
'D', 'BRILL', 'CAKE RED VELVET DBL LYR 8"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '322376', 1,
'D', 'BRKBUSH', 'CHICKEN WING BRD BRBN WILD');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '325223', 1,
'D', 'CORNER', 'BACON SLI APLWD SMKD 13/15');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '349245', 1,
'D', 'THOMPSN', 'STEAK PORTERHOUSE SEL ANGUS');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '350514', 1,
'D', 'THOMPSN', 'BEEF ROUND INSIDE DENUDE');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '353237', 1,
'D', 'AREZZIO', 'CHEESE STRING MOZZARELLA');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '354219', 1,
'D', 'THOMPSN', 'PORK CHOP FRCHD 8-10 OZ');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '356024', 1,
'D', 'FARMLND', 'BACON CANADIAN SLI .75 OZ FRZN');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '356857', 1,
'D', 'SCHWANS', 'ICE CREAM STWBRY');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '359453', 1,
'D', 'CORNER', 'SAUCE BASIL PSTO');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '361028', 1,
'D', 'SUBWAY', 'CUP PLAS CLD 40 OZ "GRN LEAF"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '366904', 1,
'D', 'MSGROWN', 'POTATO SWEET FRSH');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '384495', 1,
'D', 'KRSPY K', 'DOUGH BISCUIT BTRMLK DROP');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '387318', 1,
'D', 'SEABEST', 'SNAPPER FIL SKON SCAR 8/10 IDN');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '389280', 1,
'D', 'BKRSCLS', 'BATTER MUFFIN CHOC CHOC CHIP');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '392490', 1,
'D', 'COMET', 'CUP PLAS CAR 32TV OZ WHT');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '395802', 1,
'D', 'PRAIRIE', 'STRAW WRPD TRANS GNT 10.25"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '429274', 1,
'D', 'SYS IMP', 'MILK 2% DAIRY NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '435046', 1,
'D', 'PLDRHNO', 'SAUCE BBQ');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '449280', 1,
'D', 'GILSTER', 'PASTA SPAGHETTI 10"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '455099', 1,
'D', 'PACKER', 'PIE LEMON MERNGE');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '471922', 1,
'D', 'SUBWAY', 'BAG PLAS SNDW "GRN LEAF"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '476224', 1,
'D', 'CABBHNP', 'STEAK FILET C\C CH EXCLUSIVE');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '477513', 1,
'D', 'TODDS', 'SAUCE TERIYAKI SWEET THICK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '491567', 1,
'D', 'KINGCHS', 'CAKE BUNDT ASST 9"');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '496578', 1,
'D', 'BHB/NPM', 'STEAK STRIP E\E 1"TL PR FRZN');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '501957', 1,
'D', 'MARY B', 'DOUGH BISCUIT BUTTERMILK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '504169', 1,
'D', 'AEP INC', 'LINER TRASH 38X58 1.5M BLK');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '533517', 1,
'D', 'BTRBALL', 'TURKEY BRGR SAVRY WHT RTC');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '557575', 1,
'D', 'RUFINOS', 'APTZR MUSHROOM STFD W/CRABMEAT');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '566824', 1,
'D', 'FIRECLS', 'BEEF GROUND BULK FINE 73/27');
Insert into GPO.DIST_ITEM_CATALOG
(DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG,
DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
Values
(1, 'DO-2004-00404', 'DO-2001-00589', '570273', 1,
'D', 'CYCREEK', 'ALLIGATOR MEAT FARM RAISED');
COMMIT;
August 01, 2012 - 3:26 pm UTC
I do not see a single analytic function in that entire query.
????
?
Tony Fernandez, August 01, 2012 - 3:50 pm UTC
That is correct, Tom -- no analytic function is shown. The ones we attempted did not help with the tuning; they made the query more costly.
We added analytics to the aggregates, like "over ( partition by case
when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
else dic.distributor_id
end,
case
when dic.din_mapping_grp_flag = 1 then do.distributor_name
else dc.distributor_name
end,
dic.din,
dic.active_flag )
We need these items either in the "group by" or in the "over ( partition by )".
Thanks,
August 01, 2012 - 3:52 pm UTC
??? I don't understand at all.
I don't see how analytics would fit into this query at all, you appear to want to aggregate.
I'm not sure where you are trying to go with analytics, I don't see them as being useful in this query - not if your need is to AGGREGATE.
continued
Tony Fernandez, August 01, 2012 - 4:04 pm UTC
Sorry Tom,
Here is the analytic version:
select case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
else dic.distributor_id
end distributor_id
,case when dic.din_mapping_grp_flag = 1 then do.distributor_name
else dc.distributor_name
end distributor_name
,dic.din
,dic.active_flag
,min ( dic.dist_item_source ) over ( partition by case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
else dic.distributor_id
end,
dic.din,
dic.active_flag )
,max ( dic.brand ) -- analytic, same as the prior column directly above
,max ( dic.description ) -- idem
from dist_item_catalog dic
,distributor do
,distributor dc
where dic.dist_org_id = do.distributor_id
and dic.distributor_id = dc.distributor_id
and dic.active_flag = 1
;
Please keep in mind that analytics are only one alternative we attempted in order to expedite the query; any other suggestion, even aside from analytics, will be fine.
Bottom line is that the query for a GUI transaction takes up to 20 minutes on the whole data set, and we would like to get that down to seconds for the GUI user.
Thanks,
August 01, 2012 - 4:10 pm UTC
but that gives the wrong answer doesn't it.
you need to aggregate, you want to GET RID of rows. analytics do not do that.
have you considered a materialized view.
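A sketch of what such a materialized view might look like, precomputing only the expensive aggregation over dist_item_catalog (the MV and log names are hypothetical; fast refresh has real restrictions -- a COUNT(*) column is required, and MIN/MAX aggregates are generally only fast-refreshable after inserts, not deletes or updates -- so this would need to be verified against the actual workload):

```sql
-- materialized view log so the MV can be refreshed incrementally
create materialized view log on dist_item_catalog with rowid
  (din, active_flag, din_mapping_grp_flag, dist_org_id, distributor_id,
   dist_item_source, brand, description)
  including new values;

-- precompute the group-by; join to DISTRIBUTOR at query time
create materialized view dic_agg_mv
  refresh fast on commit
as
select decode(din_mapping_grp_flag, 1, dist_org_id, distributor_id) dist_id,
       din,
       active_flag,
       min(dist_item_source) dist_item_source,
       max(brand)            brand,
       max(description)      description,
       count(*)              cnt    -- required for fast refresh
  from dist_item_catalog
 group by decode(din_mapping_grp_flag, 1, dist_org_id, distributor_id),
          din,
          active_flag;
```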
Mat view
Tony Fernandez, August 01, 2012 - 4:15 pm UTC
That is correct, more rows.
A materialized view is out of the question because this is a financial system, and so many things happen per second that the mat view would become stale almost instantaneously.
Bottom line is that there are many aggregates with the same group by, and we noticed that just dropping one aggregated column yields a great gain in time. But unfortunately, all columns in the query are needed.
One more attempt
Tony Fernandez, August 06, 2012 - 9:10 am UTC
Tom,
Here is the query written with no analytics. Understood that analytics will not help in this case. Can you please point out any other direction yet to be explored? My DBA group all agree that a mat view is not a solution here due to the high OLTP rate, and we want to continue with that norm.
Here is the query:
with dic2 as ( select min( dist_item_catalog_id ) dicid
,din
,decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id ) distid
from dist_item_catalog
group by din
,decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id )
)
select 'Available' type
,decode( dic.din_mapping_grp_flag, 1, dic.dist_org_id, dic.distributor_id ) distributor_id
,decode( dic.din_mapping_grp_flag, 1, do.distributor_name, dc.distributor_name ) distributor_name
,dic.din
,dic.active_flag
,dic.dist_item_source
,dic.brand
,dic.description
,dic.dist_category_id
,dic.category_name
,dic.dist_sub_category_id
,dic.subcategory_name
,dic.pack_size
,dic.dist_mfr_name
,dic.dist_mfr_id
from dist_item_catalog dic
,dic2
,distributor do
,distributor dc
where dic2.dicid = dic.dist_item_catalog_id
and dic.dist_org_id = do.distributor_id
and dic.distributor_id = dc.distributor_id
and dic.active_flag = 1
and not exists ( select 1
from dist_term_set_item dtsi
where dtsi.dist_item_catalog_id =
dic.dist_item_catalog_id );
August 17, 2012 - 12:20 pm UTC
do you have a plan
and are the estimated cardinalities in the plan close to being correct? use your knowledge of the data to answer that.
Book?
Nelson, August 10, 2012 - 12:08 pm UTC
So Tom... 10 years later.. did you finally get a chance to publish a book on analytics?
Plan
Tony Fernandez, August 17, 2012 - 1:47 pm UTC
Tom,
Here is my explain for two questions ago in current thread:
Execution Plan
----------------------------------------------------------
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1652K| 623M| 86663 |
| 1 | HASH JOIN | | 1652K| 623M| 86663 |
| 2 | TABLE ACCESS FULL | DISTRIBUTOR | 5673 | 238K| 58 |
| 3 | HASH JOIN | | 1652K| 556M| 86585 |
| 4 | TABLE ACCESS FULL | DISTRIBUTOR | 5673 | 238K| 58 |
| 5 | HASH JOIN | | 1652K| 488M| 86506 |
| 6 | VIEW | | 1652K| 20M| 17601 |
| 7 | HASH GROUP BY | | 1652K| 85M| 17601 |
| 8 | TABLE ACCESS FULL | DIST_ITEM_CATALOG | 1652K| 85M| 17418 |
| 9 | HASH JOIN RIGHT ANTI | | 1652K| 467M| 42682 |
| 10 | INDEX FAST FULL SCAN| DIST_TERM_SET_ITEM_IDX | 1056K| 6192K| 476 |
| 11 | TABLE ACCESS FULL | DIST_ITEM_CATALOG | 1652K| 458M| 17517 |
-----------------------------------------------------------------------------------
and Statistics:
Statistics
----------------------------------------------------------
107 recursive calls
0 db block gets
47967 consistent gets
5140 physical reads
0 redo size
25155 bytes sent via SQL*Net to client
507 bytes received via SQL*Net from client
15 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
209 rows processed
Cardinalities are fine.
Thanks,
August 17, 2012 - 3:37 pm UTC
no they aren't.
we estimated 1.6 MILLION ROWS as output.
you got 209
that sure doesn't add up.
can you run the query with gather_plan_statistics in sqlplus:
http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/ or get a tkprof and merge the actual row counts with the estimated and give us a report that shows the plan - then estimated - then actual rows.
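The mechanics look like this (shown against a placeholder query; substitute the real statement):

```sql
-- 1) execute the statement with the hint so row-source statistics
--    are collected for this execution:
select /*+ gather_plan_statistics */ count(*) from dual;

-- 2) in the same session, fetch the plan of the last statement with
--    estimated (E-Rows) and actual (A-Rows) counts side by side:
select *
  from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```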
Analytics - considering a flag in both senarios
ARC, October 02, 2012 - 3:59 am UTC
Hi Tom,
Please help me with analytics.
Table Data:
SNO FLG AMT
1 NPI 1000
2 NPI 2000
3 AFM 1400
4 AFM 1500
5 AFM 1700
6 NFM 500
7 NFM 700
8 NPI 8000
9 NPI 6000
10 OTH 9000
In the above data I have to consider NFM flag records under both NPI and AFM records. When I select NPI the amounts under NPI and NFM should sum up, When I select AFM the amounts under AFM and NFM should sum up.
I tried the SQL below, but it only works for one condition, either NPI or AFM. I'm trying to achieve this in a single query without any filters on the flag.
==========
SELECT case when flg in ('NPI','NFM') then 'NPI' when flg in ('AFM','NFM') then 'AFM' else FLG end SSS,
sum(amt) amt
FROM TEST_FLG
group by case when flg in ('NPI','NFM') then 'NPI' when flg in ('AFM','NFM') then 'AFM' else FLG end
==========
My desired out put is
NPI 18200
AFM 5800
OTH 9000
Please help me correct the query. Thanks in advance.
-ARC
October 09, 2012 - 11:34 am UTC
no create
no inserts
no look
no promises either, i really didn't look at this
rebisco, October 10, 2012 - 2:29 am UTC
Hello ARC,
I'm not so sure if this is what you wanted, but for me it can be solved using UNION ALL and a few filters:
SQL> CREATE TABLE TEST AS
2 SELECT 1 SNO, 'NPI' FLG, 1000 AMT FROM DUAL
3 UNION ALL
4 SELECT 2 SNO, 'NPI' FLG, 2000 AMT FROM DUAL
5 UNION ALL
6 SELECT 3 SNO, 'AFM' FLG, 1400 AMT FROM DUAL
7 UNION ALL
8 SELECT 4 SNO, 'AFM' FLG, 1500 AMT FROM DUAL
9 UNION ALL
10 SELECT 5 SNO, 'AFM' FLG, 1700 AMT FROM DUAL
11 UNION ALL
12 SELECT 6 SNO, 'NFM' FLG, 500 AMT FROM DUAL
13 UNION ALL
14 SELECT 7 SNO, 'NFM' FLG, 700 AMT FROM DUAL
15 UNION ALL
16 SELECT 8 SNO, 'NPI' FLG, 8000 AMT FROM DUAL
17 UNION ALL
18 SELECT 9 SNO, 'NPI' FLG, 6000 AMT FROM DUAL
19 UNION ALL
20 SELECT 10 SNO, 'OTH' FLG, 9000 AMT FROM DUAL;
Table created
SQL> SELECT 'NFM' flag, SUM(amt)amt FROM TEST WHERE flg IN ('NPI','AFM')
2 UNION ALL
3 SELECT 'NPI' flag, SUM(amt)amt FROM TEST WHERE flg IN ('NPI','NFM')
4 UNION ALL
5 SELECT 'AFM' flag, SUM(amt)amt FROM TEST WHERE flg IN ('AFM','NFM')
6 UNION ALL
7 SELECT 'OTH' flag, SUM(amt) amt FROM TEST WHERE flg NOT IN ('AFM','NPI','NFM');
FLAG AMT
-------------------------------- ----------
NFM 21600
NPI 18200
AFM 5800
OTH 9000
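An alternative that scans the table once is to fan each flag out through a small mapping set, so the NFM rows are counted toward both NPI and AFM (a sketch; the inline `flg_map` view is hypothetical, and the table name matches the TEST table created above):

```sql
with flg_map (flg, grp) as (
  select 'NPI','NPI' from dual union all
  select 'AFM','AFM' from dual union all
  select 'OTH','OTH' from dual union all
  select 'NFM','NPI' from dual union all   -- NFM counts toward NPI...
  select 'NFM','AFM' from dual             -- ...and also toward AFM
)
select m.grp, sum(t.amt) amt
  from test t
       join flg_map m on m.flg = t.flg
 group by m.grp
 order by m.grp;

-- GRP   AMT
-- ---  -----
-- AFM   5800
-- NPI  18200
-- OTH   9000
```

This reproduces the originally requested output (NPI 18200, AFM 5800, OTH 9000) without a per-bucket scan.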
Printing range with comma separated values
Arvind, December 06, 2012 - 12:01 am UTC
Dear Tom,
In the original question, the user wants the data categorised into continuous ranges. I have a somewhat similar, but slightly different, requirement as given below.
I have table and data as given below
create table ag_t1
(col1 varchar2(5),
col2 varchar2(10)
);
insert into ag_t1 values ('01.01', 'v1');
insert into ag_t1 values ('01.02', 'v1');
insert into ag_t1 values ('01.03', 'v1');
insert into ag_t1 values ('01.04', 'v2');
insert into ag_t1 values ('01.05', 'v1');
insert into ag_t1 values ('01.06', 'v2');
insert into ag_t1 values ('02.01', 'v1');
insert into ag_t1 values ('02.02', 'v1');
insert into ag_t1 values ('02.03', 'v2');
insert into ag_t1 values ('02.04', 'v2');
insert into ag_t1 values ('02.05', 'v1');
insert into ag_t1 values ('02.06', 'v2');
insert into ag_t1 values ('08.01', 'v1');
insert into ag_t1 values ('08.02', 'v1');
insert into ag_t1 values ('08.03', 'v1');
insert into ag_t1 values ('08.04', 'v2');
insert into ag_t1 values ('08.05', 'v1');
insert into ag_t1 values ('08.06', 'v2');
I want the output as given below.
Output
v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
December 14, 2012 - 1:57 pm UTC
why isn't 2.03, 2.04 using "to" like 2.01 to 2.02 did.
define this better.
always 2 digit number "." 2 digit number?
always collapse out consecutive numbers using "to"?
Printing range with comma separated values
Rajeshwaran, Jeyabal, December 15, 2012 - 7:19 am UTC
drop table t purge;
create table t
(col1 number(5,2),
col2 varchar2(10)
);
insert into t values (01.01, 'v1');
insert into t values (01.02, 'v1');
insert into t values (01.03, 'v1');
insert into t values (01.04, 'v2');
insert into t values (01.05, 'v1');
insert into t values (01.06, 'v2');
insert into t values (02.01, 'v1');
insert into t values (02.02, 'v1');
insert into t values (02.03, 'v2');
insert into t values (02.04, 'v2');
insert into t values (02.05, 'v1');
insert into t values (02.06, 'v2');
insert into t values (08.01, 'v1');
insert into t values (08.02, 'v1');
insert into t values (08.03, 'v1');
insert into t values (08.04, 'v2');
insert into t values (08.05, 'v1');
insert into t values (08.06, 'v2');
commit;
rajesh@ORA11G> select col2,min(col1) min_col1,max(col1) max_col1,
2 case when nullif(min(col1),max(col1)) is not null
3 then min(col1)||' to '||max(col1)
4 else to_char(min(col1)) end reslt
5 from (
6 select col1,col2,max(val) over(order by col1,col2) grp
7 from (
8 select t.*,
9 case when lag(col2) over(order by col1,col2) is null
10 or lag(col2) over(order by col1,col2) <> col2 then
11 row_number() over(order by col1,col2) end as val
12 from t
13 )
14 )
15 group by col2,grp
16 order by col2
17 /
COL2 MIN_COL1 MAX_COL1 RESLT
---------- ---------- ---------- --------------------
v1 1.01 1.03 1.01 to 1.03
v1 1.05 1.05 1.05
v1 2.01 2.02 2.01 to 2.02
v1 2.05 2.05 2.05
v1 8.01 8.03 8.01 to 8.03
v1 8.05 8.05 8.05
v2 1.04 1.04 1.04
v2 1.06 1.06 1.06
v2 2.03 2.04 2.03 to 2.04
v2 2.06 2.06 2.06
v2 8.04 8.04 8.04
v2 8.06 8.06 8.06
12 rows selected.
Elapsed: 00:00:00.01
rajesh@ORA11G> select col2,listagg(reslt,',') within group(order by min_col1) as listed_values
2 from (
3 select col2,min(col1) min_col1,max(col1) max_col1,
4 case when nullif(min(col1),max(col1)) is not null
5 then min(col1)||' to '||max(col1)
6 else to_char(min(col1)) end reslt
7 from (
8 select col1,col2,max(val) over(order by col1,col2) grp
9 from (
10 select t.*,
11 case when lag(col2) over(order by col1,col2) is null
12 or lag(col2) over(order by col1,col2) <> col2 then
13 row_number() over(order by col1,col2) end as val
14 from t
15 )
16 )
17 group by col2,grp
18 )
19 group by col2
20 /
COL2 LISTED_VALUES
---------- ------------------------------------------------------------
v1 1.01 to 1.03,1.05,2.01 to 2.02,2.05,8.01 to 8.03,8.05
v2 1.04,1.06,2.03 to 2.04,2.06,8.04,8.06
Elapsed: 00:00:00.03
rajesh@ORA11G>
December 18, 2012 - 7:41 am UTC
you did not produce their requested output, they did not provide sufficient information to actually answer the question....
Muhammad Bilal, December 31, 2012 - 4:44 am UTC
I have the following dataset:
CREATE TABLE RoomGroups
(group_id NUMBER,
group_name VARCHAR2(30),
group_capacity NUMBER);
INSERT INTO RoomGroups VALUES(1, 'SEECS UG Block', 100);
INSERT INTO RoomGroups VALUES(2, 'IAEC Building', 70);
INSERT INTO RoomGroups VALUES(3, 'RIMMS Building', 90);
Commit;
CREATE TABLE DegreeBatches
(batch_id NUMBER, batch_name VARCHAR2(30), batch_strength NUMBER);
INSERT INTO DegreeBatches VALUES(10,'BIT-10',35);
INSERT INTO DegreeBatches VALUES(20,'BIT-11',40);
INSERT INTO DegreeBatches VALUES(30,'BSCS-2',35);
INSERT INTO DegreeBatches VALUES(40,'BSCS-3',40);
INSERT INTO DegreeBatches VALUES(50,'BEE-4',50);
INSERT INTO DegreeBatches VALUES(60,'BICSE-7',25);
INSERT INTO DegreeBatches VALUES(70,'BESE-3',30);
Commit;
I want to achieve the following through single SQL statement:
1. Get all possible sets of DegreeBatches for each group - in such a way that for each set the SUM(batch_strength) <= Capacity of that group as show below.
GroupId Batches StrengthStr StrengthTotal Capacity
1. {BIT-10, BIT-11, BICSE-7} {35, 40, 25} 100 100
1. {BSCS-2, BSCS-3, BICSE-7} {35, 40, 25} 100 100
1. {BEE-4, BSCS-3} {50, 40} 90 100
...
...
2. {BIT-10, BSCS-2} {35, 35} 70 70
2. {BSCS-3, BESE-3} {40, 30} 70 70
2. {BIT-11, BICSE-7} {40, 25} 65 70
and so on...
2. Next I want to retrieve only non-overlapping sets based on minimum value of Capacity-StrengthTotal for each group
GroupId Batches StrengthStr StrengthTotal Capacity
1. {BIT-10, BIT-11, BICSE-7} {35, 40, 25} 100 100
2. {BSCS-3, BESE-3} {40, 30} 70 70
3. {BSCS-2, BEE-4} {35, 50} 85 90
3. The final result must have all the batches in it...
Which in this case is true...
Any help will be highly appreciated...
Thanks
Bilal
How to squash/combine rows around current row
Matt, January 08, 2013 - 11:30 am UTC
Hi Tom
I am trying to figure out how to combine similar, connected rows. I think I need analytics but all my attempts are meeting failure.
drop table mtemp;
create table mtemp (startdt date,enddt date, ra number, nqc number);
insert into mtemp values (DATE '2013-01-01', DATE '2013-01-02', 1, 2);
insert into mtemp values (DATE '2013-01-02', DATE '2013-01-03', 1, 2);
insert into mtemp values (DATE '2013-01-03', DATE '2013-01-04', 1, 2);
insert into mtemp values (DATE '2013-01-04', DATE '2013-01-05', 1000, 2000);
insert into mtemp values (DATE '2013-01-05', DATE '2013-01-06', 1, 2);
commit;
what I need is a report that "squashes" the first three records where ra and nqc values are the same into one row. Then the next record where the values change to 1000 and 2000 and then a third record where the values go back to their initial values.
The output I would like is:
startdt enddt ra nqc
-------- -------- ------- --------
1-1-2013 1-4-2013 1 2 <- three records "squashed"
1-4-2013 1-5-2013 1000 2000
1-5-2013 1-6-2013 1 2 <- back to initial vals, need as SEPARATE record
I thought this was a simple analytic function using first_value and last_value. But the fact that the partition by values "return" to something previous is the troublesome portion.
select first_value(startdt) over (partition by ra,nqc order by startdt) minstart
, last_value(enddt) over (partition by ra,nqc order by startdt) maxend
, ra
, nqc
from mtemp
But with that I get
MINSTART MAXEND RA NQC
1/1/2013 1/2/2013 1 2
1/1/2013 1/3/2013 1 2
1/1/2013 1/4/2013 1 2
1/1/2013 1/6/2013 1 2
1/4/2013 1/5/2013 1000 2000
I have also tried various windowing with RANGE BETWEEN and ROWS BETWEEN but I am not finding good examples and I am failing.
Thanks so much for your help over the years.
matt
January 14, 2013 - 11:53 am UTC
ops$tkyte%ORA11GR2> select min(startdt), max(enddt), min(ra), min(nqc)
2 from (
3 select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
4 from (
5 select startdt, enddt, ra, nqc,
6 case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
7 and
8 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
9 then row_number() over (order by startdt)
10 end tag
11 from mtemp
12 )
13 )
14 group by grp
15 order by grp
16 /
MIN(START MAX(ENDDT MIN(RA) MIN(NQC)
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13 1 2
04-JAN-13 05-JAN-13 1000 2000
05-JAN-13 06-JAN-13 1 2
is one approach.
the inner most query tags the beginning of each group - based on whether the last ra/nqc are the same:
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc,
2 lag(ra) over (order by startdt) lrq,
3 lag(nqc) over (order by startdt) lnqc,
4 decode( lag(ra) over (order by startdt), ra, 1,0 ) dra,
5 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) dnqc,
6 case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
7 and
8 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
9 then row_number() over (order by startdt)
10 end tag
11 from mtemp
12 /
STARTDT ENDDT RA NQC LRQ LNQC DRA DNQC TAG
--------- --------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
01-JAN-13 02-JAN-13 1 2 0 0 1
02-JAN-13 03-JAN-13 1 2 1 2 1 1
03-JAN-13 04-JAN-13 1 2 1 2 1 1
04-JAN-13 05-JAN-13 1000 2000 1 2 0 0 4
05-JAN-13 06-JAN-13 1 2 1000 2000 0 0 5
the second level carries down this value for us:
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
2 from (
3 select startdt, enddt, ra, nqc,
4 case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
5 and
6 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
7 then row_number() over (order by startdt)
8 end tag
9 from mtemp
10 )
11 /
STARTDT ENDDT RA NQC GRP
--------- --------- ---------- ---------- ----------
01-JAN-13 02-JAN-13 1 2 1
02-JAN-13 03-JAN-13 1 2 1
03-JAN-13 04-JAN-13 1 2 1
04-JAN-13 05-JAN-13 1000 2000 4
05-JAN-13 06-JAN-13 1 2 5
and then we just aggregate.
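(Editorial aside: outside SQL, the same squash is a single pass over the sorted rows. A hedged Python sketch of what the tag / carry-down / aggregate layers compute together:)

```python
# Collapse runs of consecutive rows sharing the same (ra, nqc)
# into one row spanning min(startdt) .. max(enddt).
# Row layout: (startdt, enddt, ra, nqc), assumed sorted by startdt.
def squash(rows):
    out = []
    prev = None
    for startdt, enddt, ra, nqc in rows:
        if (ra, nqc) != prev:        # start of a new group (the "tag")
            out.append([startdt, enddt, ra, nqc])
        else:                        # same group: extend the end date
            out[-1][1] = enddt
        prev = (ra, nqc)
    return [tuple(r) for r in out]

rows = [
    ("2013-01-01", "2013-01-02", 1, 2),
    ("2013-01-02", "2013-01-03", 1, 2),
    ("2013-01-03", "2013-01-04", 1, 2),
    ("2013-01-04", "2013-01-05", 1000, 2000),
    ("2013-01-05", "2013-01-06", 1, 2),
]
print(squash(rows))
# -> [('2013-01-01', '2013-01-04', 1, 2),
#     ('2013-01-04', '2013-01-05', 1000, 2000),
#     ('2013-01-05', '2013-01-06', 1, 2)]
```

SQL needs the two window passes (tag, then carry down) because it cannot hold state between rows the way the loop's `prev` variable does.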
Question asked by Matt - bottom of page
Artieboy, January 15, 2013 - 12:33 am UTC
As you can tell I'm a novice, but would this work Oracle?
select unique startdt_new, enddt_new, ra, nqc
from
(--3
select startdt, enddt, ra, nqc, startdt_new,
last_value(enddt) over (partition by startdt_new) enddt_new from
(--2
select startdt, enddt, ra, nqc,
case when ra = lead(ra) over(partition by ra, nqc order by startdt) and nqc = lead(nqc) over(partition by ra, nqc order by startdt)
then c2 else startdt end startdt_new
from
(--1
select startdt, enddt, ra, nqc,
min(startdt) over(partition by ra, nqc) c2
from mtemp
)--1
)--2
)--3
order by startdt_new
CHEERS!
January 15, 2013 - 10:34 am UTC
explain your logic, like I did mine. reverse engineering what someone was thinking can be error prone ;)
it doesn't work in general, no
ops$tkyte%ORA11GR2> select * from mtemp order by startdt;
STARTDT ENDDT RA NQC
--------- --------- ---------- ----------
01-JAN-13 02-JAN-13 1 2
02-JAN-13 03-JAN-13 1 2
03-JAN-13 04-JAN-13 1 2
04-JAN-13 05-JAN-13 1000 2000
05-JAN-13 06-JAN-13 1 2
07-JAN-13 08-JAN-13 5000 6000
09-JAN-13 10-JAN-13 1 2
7 rows selected.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select unique startdt_new, enddt_new, ra, nqc
2 from
3 (--3
4 select startdt, enddt, ra, nqc, startdt_new,
5 last_value(enddt) over (partition by startdt_new) enddt_new from
6 (--2
7 select startdt, enddt, ra, nqc,
8 case when ra = lead(ra) over(partition by ra, nqc order by startdt) and nqc =
9 lead(nqc) over(partition by ra, nqc order by startdt)
10 then c2 else startdt end startdt_new
11 from
12 (--1
13 select startdt, enddt, ra, nqc,
14 min(startdt) over(partition by ra, nqc) c2
15 from mtemp
16 )--1
17 )--2
18 )--3
19 order by startdt_new
20 /
STARTDT_N ENDDT_NEW RA NQC
--------- --------- ---------- ----------
01-JAN-13 06-JAN-13 1 2
04-JAN-13 05-JAN-13 1000 2000
07-JAN-13 08-JAN-13 5000 6000
09-JAN-13 10-JAN-13 1 2
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select min(startdt), max(enddt), min(ra), min(nqc)
2 from (
3 select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
4 from (
5 select startdt, enddt, ra, nqc,
6 case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
7 and
8 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
9 then row_number() over (order by startdt)
10 end tag
11 from mtemp
12 )
13 )
14 group by grp
15 order by grp
16 /
MIN(START MAX(ENDDT MIN(RA) MIN(NQC)
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13 1 2
04-JAN-13 05-JAN-13 1000 2000
05-JAN-13 06-JAN-13 1 2
07-JAN-13 08-JAN-13 5000 6000
09-JAN-13 10-JAN-13 1 2
Approach to How to squash/combine rows
Matt, January 15, 2013 - 12:13 pm UTC
Tom,
I like your approach. We ended up solving it a different way that appears to work, though. It relies on two "counters" using row_number(). The first creates a running counter and the other a counter for like rows - rows with the same RA and NQC. Subtract one from the other and you get a number that is unique for the grouping.
Do you see anything wrong with this:
-- first the inner query showing the counters:
SELECT startdt, enddt, ra, nqc
, ROW_NUMBER() OVER(ORDER BY startdt, enddt) grpa
, ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grpb
, ROW_NUMBER() OVER(ORDER BY startdt, enddt) - ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grp
FROM mtemp;
STARTDT ENDDT RA NQC GRPA GRPB GRP
--------- --------- ---------- ---------- ---------- ---------- ----------
01-JAN-13 02-JAN-13 1 2 1 1 0
02-JAN-13 03-JAN-13 1 2 2 2 0
03-JAN-13 04-JAN-13 1 2 3 3 0
04-JAN-13 05-JAN-13 1000 2000 4 1 3
05-JAN-13 06-JAN-13 1 2 5 4 1
Now we aggregate and remove the counting/grouping
SELECT MIN(startdt) startdt, MAX(enddt) enddt, ra, nqc
FROM (SELECT startdt, enddt, ra, nqc
, ROW_NUMBER() OVER(ORDER BY startdt, enddt) grpa
, ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grpb
, ROW_NUMBER() OVER(ORDER BY startdt, enddt) - ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grp
FROM mtemp)
GROUP BY ra, nqc, grp
ORDER BY startdt, enddt;
STARTDT ENDDT RA NQC
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13 1 2
04-JAN-13 05-JAN-13 1000 2000
05-JAN-13 06-JAN-13 1 2
My colleague found this while researching this as a "Gaps and Islands" problem.
Thanks again.
January 15, 2013 - 2:33 pm UTC
interesting approach - it looks to be roughly equivalent workwise - same number of window sort operations, they both full scan - window sort - window sort - aggregate.
I'll have to file that one away in my head for future reference :)
I've used what I call the "carry down" technique (my approach above) so so so many times now - it just types itself anymore...
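(Editorial aside: why the counter subtraction works - the global row counter advances on every row, while the per-(ra, nqc) counter advances only on rows with that value pair, so their difference is constant within each run and changes whenever the run is interrupted. A hedged Python sketch:)

```python
from collections import defaultdict

# Gaps-and-islands: grp = global row_number - per-(ra, nqc) row_number.
# (ra, nqc, grp) then uniquely identifies each run of consecutive rows.
def island_keys(rows):
    per_key = defaultdict(int)   # row_number() partitioned by (ra, nqc)
    keys = []
    for i, (startdt, enddt, ra, nqc) in enumerate(rows, start=1):
        per_key[(ra, nqc)] += 1
        keys.append((ra, nqc, i - per_key[(ra, nqc)]))
    return keys

rows = [
    ("2013-01-01", "2013-01-02", 1, 2),
    ("2013-01-02", "2013-01-03", 1, 2),
    ("2013-01-03", "2013-01-04", 1, 2),
    ("2013-01-04", "2013-01-05", 1000, 2000),
    ("2013-01-05", "2013-01-06", 1, 2),
]
print(island_keys(rows))
# -> [(1, 2, 0), (1, 2, 0), (1, 2, 0), (1000, 2000, 3), (1, 2, 1)]
```

The grp values 0, 0, 0, 3, 1 match Matt's GRP column above: three islands in all.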
Re: Approach to How to squash/combine rows
Narendra, January 17, 2013 - 3:19 am UTC
..question
A Reader, February 05, 2013 - 6:21 am UTC
Hello Tom,
SQL> create table t1 ( x varchar2(31));
Table created.
SQL>insert into t1 values('zzz90000(Zh900000)xyz~0001234');
SQL>insert into t1 values('zzz90000(7i900000)xyz~0001234');
SQL>insert into t1 values('zzz90000(2i900000)xyz~0001234');
SQL>insert into t1 values('zzz80000(7i900000)xyz~0001234');
SQL>insert into t1 values('zzz70000(Ve900000)xyz~0001234');
SQL>insert into t1 values('zzz70000(2f900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Zh900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Ve900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(ui900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Ui900000)xyz~0001234');
man@ora10g:rac1> select * from t1 order by 1;
X
-------------------------------
zzz60000(Ui900000)xyz~0001234
zzz60000(Ve900000)xyz~0001234
zzz60000(Zh900000)xyz~0001234
zzz60000(ui900000)xyz~0001234
zzz70000(2f900000)xyz~0001234
zzz70000(Ve900000)xyz~0001234
zzz80000(7i900000)xyz~0001234
zzz90000(2i900000)xyz~0001234
zzz90000(7i900000)xyz~0001234
zzz90000(Zh900000)xyz~0001234
10 rows selected.
man@ora10g:rac1> select max(x) from t1;
MAX(X)
-------------------------------
zzz90000(Zh900000)xyz~0001234
man@ora10g:rac1> select * from t1 where x>'zzz70000(Ve900000)xyz~0001234';
X
-------------------------------
zzz90000(Zh900000)xyz~0001234
zzz90000(7i900000)xyz~0001234
zzz90000(2i900000)xyz~0001234
zzz80000(7i900000)xyz~0001234
Question.
I am interested in finding the result for top 4 values with the given info of max(x).
Had x been a number data type, I would have written:
select * from t1 where x > ( select max(x) - 4 from t1 );
(I know it can be done using analytics, but for some reasons I want to use max(x).)
How do I select something similar in the above case (when x is varchar2, with the 10 rows as shown)?
select * from t1 where x> ( select ( max(x) - ??) from t1);
February 06, 2013 - 8:24 am UTC
Help with this one
Doug, February 06, 2013 - 10:21 pm UTC
Hi Tom,
Here is my query
With tbs_name as (select tablespace_name
from dba_tablespaces
where tablespace_name = UPPER ('ADMIN_INDEX') )
SELECT 'free space' owner, ' ' OBJECT, file_id, block_id, blocks,
(blocks * 8192) / (1024 * 1024) meg
FROM dba_free_space a, Tbs_name
WHERE a.tablespace_name = tbs_name.tablespace_name
UNION
SELECT SUBSTR (owner, 1, 20), SUBSTR (segment_name, 1, 32), file_id,
block_id, blocks,
(blocks * 8192) / (1024 * 1024) meg
FROM dba_extents a, tbs_name
WHERE a.tablespace_name = tbs_name.tablespace_name
ORDER BY 3, 4;
I would like to take the output of this and summarise or
consolidate it into contiguous space,
i.e. if the output was
OWNER OBJECT FILE_ID BLOCK_ID BLOCKS MEG
---------------------------------------------------
USER IDX_1 2 9 128 1
USER IDX_2 2 137 128 1
USER IDX_2 2 265 128 1
USER IDX_2 2 393 128 1
free space 2 521 128 1
free space 2 649 128 1
USER IDX_2 2 777 128 1
Then I would want to roll up block ids 137, 265 and 393
as contiguous space for IDX_2, and also roll up the free space to report it as a total of contiguous space as well.
I know defrags are normally a waste of time; my goal
is to identify indexes I can rebuild to free up free space.
I dropped 400Gb of indexes that were not being used and the space is available within the index tablespace but I need it
for other purposes.
Regards
Doug
February 07, 2013 - 6:46 am UTC
.. I dropped 400Gb of indexes that were not being used and the space is available within the index tablespace but I need it for other purposes ..

great, so use it - why do you need to do anything to use this space? it is already 100% useable???????
ops$tkyte%ORA11GR2> with tbs_name
2 as
3 (select tablespace_name
4 from dba_tablespaces
5 where tablespace_name = UPPER ('USERS')
6 ),
7 space
8 as
9 (
10 SELECT 'free space' owner,
11 ' ' OBJECT,
12 file_id,
13 block_id,
14 blocks
15 FROM dba_free_space a, Tbs_name
16 WHERE a.tablespace_name = tbs_name.tablespace_name
17 UNION ALL
18 SELECT SUBSTR (owner, 1, 20),
19 SUBSTR (segment_name, 1, 32), file_id,
20 block_id,
21 blocks
22 FROM dba_extents a, tbs_name
23 WHERE a.tablespace_name = tbs_name.tablespace_name
24 )
25 select owner, object, file_id, min(block_id), sum(blocks), count(*) cnt
26 from (
27 select owner,
28 object,
29 file_id,
30 block_id,
31 blocks,
32 last_value( grp ignore nulls ) over (partition by owner, object, file_id order by block_id ) lgrp
33 from (
34 select owner,
35 object,
36 file_id,
37 block_id,
38 blocks,
39 case when nvl( lag(block_id+blocks) over (partition by owner, object, file_id order by block_id), block_id+blocks) <> block_id
40 then row_number() over (partition by owner, object, file_id order by block_id)
41 end grp
42 from space
43 )
44 )
45 group by owner, object, file_id, lgrp
46 order by owner, object, file_id
47 /
OWNER OBJECT FILE_ID MIN(BLOCK_ID) SUM(BLOCKS) CNT
---------- -------------------- ---------- ------------- ----------- ----------
OPS$TKYTE FY_CAL 4 160 8 1
OPS$TKYTE SYS_C0046904 4 200 8 1
OPS$TKYTE SYS_C0046904 4 8872 8 1
OPS$TKYTE SYS_C0046905 4 248 8 1
OPS$TKYTE SYS_C0046905 4 280 8 1
OPS$TKYTE SYS_C0046905 4 328 8 1
OPS$TKYTE SYS_C0046905 4 1232 8 1
OPS$TKYTE SYS_C0046905 4 1264 8 1
OPS$TKYTE SYS_C0046905 4 8840 8 1
OPS$TKYTE SYS_C0046905 4 8856 8 1
OPS$TKYTE SYS_C0046905 4 9088 128 1
OPS$TKYTE SYS_C0046905 4 16640 8 1
OPS$TKYTE SYS_C0046905 4 16656 8 1
OPS$TKYTE SYS_C0046905 4 16672 8 1
OPS$TKYTE SYS_C0046905 4 18944 8 1
OPS$TKYTE SYS_C0046905 4 18960 8 1
OPS$TKYTE SYS_C0046905 4 18976 8 1
OPS$TKYTE SYS_C0046905 4 19032 8 1
OPS$TKYTE SYS_C0046905 4 19048 8 1
OPS$TKYTE SYS_C0046905 4 19064 8 1
OPS$TKYTE SYS_C0046907 4 272 8 1
OPS$TKYTE SYS_C0046907 4 296 8 1
OPS$TKYTE SYS_C0046907 4 344 8 1
OPS$TKYTE SYS_C0046907 4 1240 8 1
OPS$TKYTE SYS_C0046907 4 1272 8 1
OPS$TKYTE SYS_C0046907 4 8832 8 1
OPS$TKYTE SYS_C0046907 4 8848 8 1
OPS$TKYTE SYS_C0046907 4 8864 8 1
OPS$TKYTE SYS_C0046907 4 9216 128 1
OPS$TKYTE SYS_C0046907 4 16648 8 1
OPS$TKYTE SYS_C0046907 4 16664 8 1
OPS$TKYTE SYS_C0046907 4 16680 8 1
OPS$TKYTE SYS_C0046907 4 18952 8 1
OPS$TKYTE SYS_C0046907 4 18984 8 1
OPS$TKYTE SYS_C0046907 4 19024 8 1
OPS$TKYTE SYS_C0046907 4 19040 8 1
OPS$TKYTE SYS_C0046907 4 19056 8 1
OPS$TKYTE T 4 8880 8 1
OPS$TKYTE T1 4 152 8 1
OPS$TKYTE T2 4 136 8 1
OPS$TKYTE TBL_HISTORY 4 29960 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 144 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 168 16 2
OPS$TKYTE TDETAIL_REF_MANY 4 232 16 2
OPS$TKYTE TDETAIL_REF_MANY 4 256 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 288 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 336 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 1184 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 1200 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 1216 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 1248 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 2176 128 1
OPS$TKYTE TDETAIL_REF_MANY 4 6272 128 1
OPS$TKYTE TDETAIL_REF_MANY 4 8704 128 1
OPS$TKYTE TDETAIL_REF_MANY 4 10496 128 1
OPS$TKYTE TDETAIL_REF_MANY 4 16384 128 1
OPS$TKYTE TDETAIL_REF_MANY 4 16752 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 18968 8 1
OPS$TKYTE TDETAIL_REF_MANY 4 19000 16 2
OPS$TKYTE TDETAIL_REF_ONE 4 184 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 216 16 2
OPS$TKYTE TDETAIL_REF_ONE 4 264 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 304 24 3
OPS$TKYTE TDETAIL_REF_ONE 4 1176 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 1192 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 1208 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 1224 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 1256 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 2304 128 1
OPS$TKYTE TDETAIL_REF_ONE 4 8576 128 1
OPS$TKYTE TDETAIL_REF_ONE 4 8960 128 1
OPS$TKYTE TDETAIL_REF_ONE 4 16256 128 1
OPS$TKYTE TDETAIL_REF_ONE 4 16744 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 16760 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 18992 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 19016 8 1
OPS$TKYTE TDETAIL_REF_ONE 4 27776 128 1
OPS$TKYTE TMASTER 4 192 8 1
OPS$TKYTE TMASTER 4 208 8 1
OPS$TKYTE TMASTER 4 352 824 13
OPS$TKYTE TMASTER 4 1280 896 7
OPS$TKYTE TMASTER 4 2432 3840 30
OPS$TKYTE TMASTER 4 6400 2176 3
OPS$TKYTE TMASTER 4 9344 1152 2
OPS$TKYTE TMASTER 4 10624 5632 9
OPS$TKYTE TMASTER 4 16512 128 1
OPS$TKYTE TMASTER 4 16688 56 7
OPS$TKYTE TMASTER 4 16768 2176 3
OPS$TKYTE TMASTER 4 19072 8704 19
OPS$TKYTE TMASTER 4 27904 2048 2
free space 4 128 8 1
free space 4 8888 72 9
free space 4 29952 8 1
free space 4 29968 1488 1
94 rows selected.
Printing range with comma separated values
Arvind, February 06, 2013 - 11:36 pm UTC
Dear Tom,
This is further information, as asked by you in my earlier post. In that question, the user wants the data to be categorised in continuous ranges. But I have a somewhat similar but slightly different requirement, as given below.
I have table and data as given below
create table ag_t1
(col1 varchar2(5),
col2 varchar2(10)
);
insert into ag_t1 values ('01.01', 'v1');
insert into ag_t1 values ('01.02', 'v1');
insert into ag_t1 values ('01.03', 'v1');
insert into ag_t1 values ('01.04', 'v2');
insert into ag_t1 values ('01.05', 'v1');
insert into ag_t1 values ('01.06', 'v2');
insert into ag_t1 values ('02.01', 'v1');
insert into ag_t1 values ('02.02', 'v1');
insert into ag_t1 values ('02.03', 'v2');
insert into ag_t1 values ('02.04', 'v2');
insert into ag_t1 values ('02.05', 'v1');
insert into ag_t1 values ('02.06', 'v2');
insert into ag_t1 values ('08.01', 'v1');
insert into ag_t1 values ('08.02', 'v1');
insert into ag_t1 values ('08.03', 'v1');
insert into ag_t1 values ('08.04', 'v2');
insert into ag_t1 values ('08.05', 'v1');
insert into ag_t1 values ('08.06', 'v2');
I want the output as given below.
Output
v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
col1 values can be in three formats
a) 01.01
b) 01.001
c) 1
February 11, 2013 - 7:58 am UTC
you answered nothing that I asked above actually???
sigh.
if there is a 01.001 why isn't it in your example, where does it fit in. Would it be "contiguous" with 01.01? what does it mean.
if there is a 1, why isn't it in your example where does it fit in. Would it be contiguous with 01?
be precise, pretend you were explaining this to your mom - be that detailed.
arvind's analytic problem.
Biswaranjan, February 07, 2013 - 1:13 pm UTC
Hi Arvind,
I just analyzed your request deeply (initially it was very difficult to understand the logic behind it, but finally I
got it totally ;)).
Please run the below query and let us know whether it works or not.
I tested it for your input and some random inputs and it works fine,
but please test it.
#############(tested in 11g) I have just modified rajesh's query to fit your output #######
select col2,listagg(reslt,',') within group(order by min_col1) as listed_values
from (
select col2,min(col1) min_col1,max(col1) max_col1,
case when nullif(min(col1),max(col1)) is not null
then min(col1)||' to '||max(col1)
else to_char(min(col1)) end reslt
from (
select col1,col2,case when col2<> 'v2' then max(val) over(order by rownum) else val end grp
from (
select ag_t1.*,
case when lag(col2) over(order by rownum) is null
or lag(col2) over(order by rownum) <> col2 or col2<>'v1' then
row_number() over(order by rownum) end as val
from ag_t1
)
)
group by col2,grp
)
group by col2;
############################
for you data I got the result as below.
v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
But I know tom can write it more beautifully(regarding performance ).
regards,
Biswaranjan.
February 11, 2013 - 8:50 am UTC
you got it totally?
where does 1 and 01.001 fit into the scheme of things? what would your code do with that?
it is not possible for you to have gotten it totally, it was not specified at a level of detail for anyone to get it right without guessing at what should happen with those values.
slightly different
A reader, February 07, 2013 - 4:22 pm UTC
Hi Tom,
I would like to reclaim the space because the system does not neet 400Gb of free space in the index tablespace.
It does however need disk space to turn on archive log mode and currently the system is at full capacity, no place to add more disks.
I don't think I explained my requirement very well sorry however I liked seeing your reply.
What I am trying to do is work out which indexes should I
rebuild to try and get the free space towards the end of a data file do I can shrink it.
Regards Doug
February 11, 2013 - 9:04 am UTC
oh, the way to do this would be to rebuild all of the indexes in that tablespace into a new tablespace and then drop the old, now empty, tablespace.
the new tablespace will be 100% contiguous (use an autoextend datafile that starts "small" and grows as you put more in there) and the old tablespace can just be dropped.
If you do it by identifying (via dba_extents) what indexes are at the "end" of your datafile - you can move the furthest out index first, shrinking the old datafile as you are growing the new one.
and if you want, just rename the new tablespace to the old name after you are done.
Arvind's question logic.
Biswaranjan., February 08, 2013 - 12:51 am UTC
for Arvinda's question logic.
('01.01', 'v1')
('01.02', 'v1')
('01.03', 'v1')
('01.04', 'v2')
('01.05', 'v1')
('01.06', 'v2')
('02.01', 'v1')
('02.02', 'v1')
('02.03', 'v2')
('02.04', 'v2')
('02.05', 'v1')
('02.06', 'v2')
('08.01', 'v1')
('08.02', 'v1')
('08.03', 'v1')
('08.04', 'v2')
('08.05', 'v1')
('08.06', 'v2')
v1 - 01.01 to 01.03 (because it is continuous up to 01.03, so 01.02 is skipped), 01.05
(though it has no v1 values directly above and below it, we keep it as is),
now coming down, 02.01 to 02.02 (because this run of v1 values starts at 02.01 and is continuous up to 02.02),
02.05 (though it has no v1 values directly above and below it, we keep it as is between commas),
now coming down, 08.01 to 08.03 (because this run of v1 values starts at 08.01 and is continuous up to 08.03),
and finally 08.05 (though it has no v1 values directly above and below it, we keep it as is).
So without comments it would be as below for 'v1':
01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
And for v2 we should not collapse anything; place each and every value for v2 as comma separated,
so simply as below:
01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
That is all the logic, as I understood it.
thanks & regards,
Biswaranjan.
:) you opened my eyes further.
Biswaranjan, February 11, 2013 - 9:28 am UTC
Frankly speaking, I just guessed that logic (as you know :)). I should not have written "got it totally".
But I know guessing might be disastrous and a waste of time
in many cases.
Thank you Tom.
By the way ,
I have been using this site for 3 years but never
faced any error opening it (I mean clicking any link on your homepage).
But now I am getting the below error.
######
ORA-01654: unable to extend index ASKTOM.ATE_QUESTION_LOG_IDX by 8192 in tablespace FLOW_19246
##############
I even tried from a linux machine but the same thing happened.
By trying many times I was able to open the site and am writing now :).
Is it some issue with your site or my system (I am confused)?
Thanks again.
February 11, 2013 - 10:12 am UTC
sorry about that, I ran out of space on apex.oracle.com and had to up my quota.
Continuation from Matt’s request on 1/15/2013
Artieboy, February 19, 2013 - 6:19 pm UTC
Hey Tom:
Thanks for letting me know my query did not work. So how about this? It’s kind of like your carry down approach but it looks for the first value instead.
First I added in some extra rows after Matt’s original records and the two records you added:
INSERT INTO mtemp VALUES('11-jan-2013', '12-jan-2013', 1, 2);
INSERT INTO mtemp VALUES('13-jan-2013', '14-jan-2013', 1, 2);
INSERT INTO mtemp VALUES('15-jan-2013', '16-jan-2013', 5000, 6000);
INSERT INTO mtemp VALUES('17-jan-2013', '18-jan-2013', 1, 2);
Query:
select min(startdt), max(enddt), ra, nqc
from
(--3
select startdt, enddt, ra, nqc,
case when c1 is null then first_value(c1 ignore nulls)
over(order by startdt rows between current row and unbounded following)-1 else c1 end grp
from
(--2
select startdt, enddt, ra, nqc, lag_ra, lead_ra,
case when (ra=lag_ra or ra=lead_ra) AND (nqc=lag_nqc or nqc=lead_nqc)
then null else startdt end c1
from
(--1
select startdt, enddt, ra,
lag(ra) over(order by startdt)lag_ra,
lead(ra) over (order by startdt) lead_ra,
nqc,
lag(nqc) over (order by startdt)lag_nqc,
lead(nqc) over (order by startdt)lead_nqc
from mtemp
)--1
)--2
)--3
group by grp, ra, nqc
order by 1
Explanation:
Where ra does not equal the ra of the next or previous record (same thing for nqc), that is a stand-alone record. For those records give me the startdt of that row; for all others return null.
Now I have all the "break points" - places where a continuous set is broken. For all null values, starting from the current row and going down, give me the first non-null date seen in column c1, minus 1 day. This creates a grouping based on the date. Then aggregate.
In addition:
I added one more record:
INSERT INTO mtemp VALUES('30-dec-2012', '31-dec-2012', 5000, 6000);
This gives the first row of column c1 a date entry and the query still works.
Is this a viable solution, Oracle? It just seems easier to group records that have a date field using that date field (min(date), max(date)).
Cheers!
February 25, 2013 - 8:27 am UTC
.. Just seems easier to group records that have a
date field (min(date), max(date)) with a date field.
...
how so? Your query has more levels of nesting - and seems more complex to me.
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by
startdt) grp
2 from (
3 select startdt, enddt, ra, nqc,
4 case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
5 and
6 decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
7 then row_number() over (order by startdt)
8 end tag
9 from mtemp
10 )
11 /
two layers of nesting.
I didn't prove your query works (or not) - I'm just stating that I find your approach much less obvious.
arvind's analytic problem
A reader, February 21, 2013 - 10:17 pm UTC
Dear Tom,
This is further information, as asked by you in my earlier post. In that question, the user wants the
data to be categorised in continuous ranges. But I have a somewhat
similar but slightly different requirement, as given below.
I have table and data as given below
create table ag_t1
(col3 varchar2(4),
col1 varchar2(5),
col2 varchar2(10)
);
here my primary key is col3, and col1
Data set 1 -----
insert into ag_t1 values ('4000','01.01', 'v1');
insert into ag_t1 values ('4000','01.02', 'v1');
insert into ag_t1 values ('4000','01.03', 'v1');
insert into ag_t1 values ('4000','01.04', 'v2');
insert into ag_t1 values ('4000','01.05', 'v1');
insert into ag_t1 values ('4000','01.06', 'v2');
insert into ag_t1 values ('4000','02.01', 'v1');
insert into ag_t1 values ('4000','02.02', 'v1');
insert into ag_t1 values ('4000','02.03', 'v2');
insert into ag_t1 values ('4000','02.04', 'v2');
insert into ag_t1 values ('4000','02.05', 'v1');
insert into ag_t1 values ('4000','02.06', 'v2');
insert into ag_t1 values ('4000','08.01', 'v1');
insert into ag_t1 values ('4000','08.02', 'v1');
insert into ag_t1 values ('4000','08.03', 'v1');
insert into ag_t1 values ('4000','08.04', 'v2');
insert into ag_t1 values ('4000','08.05', 'v1');
insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output
v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
Data set 2 -----
insert into ag_t1 values ('4001','1', 'v3');
insert into ag_t1 values ('4001','2', 'v3');
insert into ag_t1 values ('4001','3', 'v3');
insert into ag_t1 values ('4001','4', 'v4');
insert into ag_t1 values ('4001','5', 'v3');
insert into ag_t1 values ('4001','6', 'v4');
for above data set, I want the output as given below.
Output
v3 - 1 to 3, 5
v4 - 4, 6
Data set 3 -----
insert into ag_t1 values ('4002','01.001', 'v1');
insert into ag_t1 values ('4002','01.002', 'v1');
insert into ag_t1 values ('4002','01.003', 'v1');
insert into ag_t1 values ('4002','01.004', 'v2');
insert into ag_t1 values ('4002','01.005', 'v1');
insert into ag_t1 values ('4002','01.006', 'v2');
insert into ag_t1 values ('4002','02.001', 'v1');
insert into ag_t1 values ('4002','02.002', 'v1');
insert into ag_t1 values ('4002','02.003', 'v2');
insert into ag_t1 values ('4002','02.004', 'v2');
insert into ag_t1 values ('4002','02.005', 'v1');
insert into ag_t1 values ('4002','02.006', 'v2');
insert into ag_t1 values ('4002','08.001', 'v1');
insert into ag_t1 values ('4002','08.002', 'v1');
insert into ag_t1 values ('4002','08.003', 'v1');
insert into ag_t1 values ('4002','08.004', 'v2');
insert into ag_t1 values ('4002','08.005', 'v1');
insert into ag_t1 values ('4002','08.006', 'v2');
for above data set, I want the output as given below.
Output
v1 - 01.001 to 01.003, 01.005, 02.001 to 02.002, 02.005, 08.001 to 08.003, 08.005
v2 - 01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006
February 25, 2013 - 11:11 am UTC
ops$tkyte%ORA11GR2> select c1, listagg( label, ',' ) within group (order by max_grp) str
2 from (
3 select c1, max_grp, to_char(min(c2),'fm00.00') || case when min(c2) <> max(c2)
4 then ' to ' || to_char(max(c2),'fm00.00')
5 end label
6 from (
7 select c1, c2, max(grp) over (partition by c1 order by c2) max_grp
8 from (
9 select c1, c2,
10 case when nvl(lag(c2) over (partition by c1 order by c2), c2) <> c2-0.01
11 then row_number() over (partition by c1 order by c2)
12 end grp
13 from (select col2 C1, to_number(col1) C2
14 from ag_t1)
15 )
16 )
17 group by c1, max_grp
18 )
19 group by c1
20 order by c1
21 /
C1 STR
---------- ------------------------------------------------------------------------
v1 01.01 to 01.03,01.05,02.01 to 02.02,02.05,08.01 to 08.03,08.05
v2 01.04,01.06,02.03 to 02.04,02.06,08.04,08.06
now you see the technique (break the query out layer by layer, running the innermost query, then the next layer and so on so you can understand what is going on)
and then you can easily do it for the other data sets.
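(Editorial aside: the same layered logic - tag runs of codes that step by 0.01 within each label, then render each run as "first to last" - can be mirrored in a few lines of Python. This sketch assumes the codes are zero-padded strings so a plain sort matches numeric order, and, like the query above, it renders even a two-element run as "a to b":)

```python
# Group codes by label, collapse runs that increase by `step`
# into "first to last", and join each label's pieces with commas.
def collapse(pairs, step=0.01):
    by_label = {}
    for code, label in sorted(pairs):        # zero-padded: lexical sort is ok
        val = float(code)
        runs = by_label.setdefault(label, [])
        if runs and abs(val - (runs[-1][2] + step)) < 1e-9:
            runs[-1][1] = code               # extend the current run
            runs[-1][2] = val
        else:
            runs.append([code, code, val])   # start a new run
    return {
        label: ", ".join(a if a == b else f"{a} to {b}" for a, b, _ in runs)
        for label, runs in by_label.items()
    }

pairs = [
    ("01.01", "v1"), ("01.02", "v1"), ("01.03", "v1"), ("01.04", "v2"),
    ("01.05", "v1"), ("01.06", "v2"), ("02.01", "v1"), ("02.02", "v1"),
    ("02.03", "v2"), ("02.04", "v2"), ("02.05", "v1"), ("02.06", "v2"),
    ("08.01", "v1"), ("08.02", "v1"), ("08.03", "v1"), ("08.04", "v2"),
    ("08.05", "v1"), ("08.06", "v2"),
]
for label, text in sorted(collapse(pairs).items()):
    print(label, "-", text)
```

Note that, just like the SQL above, this produces "02.03 to 02.04" for v2, which differs from Arvind's requested output where pairs stay comma-separated.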
Analytical Query
Arvind, February 28, 2013 - 3:13 am UTC
Dear Tom,
Thanks for your response to my query. I have gone through the query. Sometimes the data comes with alphabetic characters, like 01.01A, 01.01B, 01.02 etc. In that case it may not be possible to convert the data to a number.
Thanks & Regards
Arvind
February 28, 2013 - 8:02 am UTC
you didn't specify that.
get us a FULL specification that covers 100% of the issue and maybe we can take a look at it.
we can only deal with what we are given. even here you don't say what to do with these values, you give no example (creates, inserts) to work with.
Arvind query
Stew Ashton, February 28, 2013 - 9:51 am UTC
I'm being a bad boy. Tom said to specify more fully, so I should wait for Arvind to answer, but I'm not.
My variant is a bit "generic", but it only works if two assumptions are true:
- first, that the col1 values are formatted so that a simple ORDER BY works, without having to convert to numbers;
- and second, that any two successive col1 values are "consecutive" as long as the col3 and col2 values are the same.
These assumptions fit the datasets given. If they are wrong, then Arvind has a lot more to specify.
One other thing, it appears that Arvind only wants " to " when there are at least 3 consecutive values.
P.S. This is another example of the "Tabibitosan method" as explained at
https://forums.oracle.com/forums/thread.jspa?threadID=1005478
select col3, col2, listagg(
min_col1 || decode(cnt, 1, null, 2, ', ' || max_col1, ' to ' || max_col1)
, ', '
) within group(order by min_col1) col1
from (
select col3, col2, count(*) cnt, min(col1) min_col1, max(col1) max_col1
from (
select col3, col2, col1,
row_number() over(partition by col3 order by col1) -
row_number() over(partition by col3, col2 order by col1) grp
from ag_t1 t
)
group by col3, col2, grp
)
group by col3, col2
order by col3, col2;
COL3 COL2 COL1
----- ----- --------------------------------------------------------------------------
4000 v1 01.01 to 01.03, 01.05, 02.01, 02.02, 02.05, 08.01 to 08.03, 08.05
4000 v2 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
4001 v3 1 to 3, 5
4001 v4 4, 6
4002 v1 01.001 to 01.003, 01.005, 02.001, 02.002, 02.005, 08.001 to 08.003, 08.005
4002 v2 01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006
February 28, 2013 - 11:52 am UTC
nope, doesn't work - assumption 2 cannot be true. They want values to be "consecutive" only if they differ by one unit - 0.01 in this case.
for example, if you remove a row:
-- insert into ag_t1 values ('4000','01.04', 'v2');
then your result becomes:
ops$tkyte%ORA11GR2> select col3, col2, listagg(
2 min_col1 || decode(cnt, 1, null, 2, ', ' || max_col1, ' to ' || max_col1)
3 , ', '
4 ) within group(order by min_col1) col1
5 from (
6 select col3, col2, count(*) cnt, min(col1) min_col1, max(col1) max_col1
7 from (
8 select col3, col2, col1,
9 row_number() over(partition by col3 order by col1) -
10 row_number() over(partition by col3, col2 order by col1) grp
11 from ag_t1 t
12 )
13 group by col3, col2, grp
14 )
15 group by col3, col2
16 order by col3, col2;
COL3 COL2 COL1
---- ---------- ------------------------------------------------------------------------
4000 v1 01.01 to 01.05, 02.01, 02.02, 02.05, 08.01 to 08.03, 08.05
4000 v2 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
but we don't have 1.01 to 1.05 - 1.04 is missing...
they have not specified this adequately to answer - but I know that assumption #2 cannot be correct - removing a row for V2 should not affect the answer of V1.
Analytical Query
Arvind, March 03, 2013 - 11:38 pm UTC
This sample includes all cases
This is further information, as asked by you in my earlier post. In this question, the user wants the
data to be categorised into continuous ranges. I have a somewhat
similar, but slightly different, requirement as given below.
I have table and data as given below
create table ag_t1
(col3 varchar2(4)
col1 varchar2(5),
col2 varchar2(10)
);
Here my primary key is (col3, col1).
Data set 1 -----
insert into ag_t1 values ('4000','01.01', 'v1');
insert into ag_t1 values ('4000','01.02', 'v1');
insert into ag_t1 values ('4000','01.03', 'v1');
insert into ag_t1 values ('4000','01.04', 'v2');
insert into ag_t1 values ('4000','01.05', 'v1');
insert into ag_t1 values ('4000','01.06', 'v2');
insert into ag_t1 values ('4000','02.01', 'v1');
insert into ag_t1 values ('4000','02.02', 'v1');
insert into ag_t1 values ('4000','02.03', 'v2');
insert into ag_t1 values ('4000','02.04', 'v2');
insert into ag_t1 values ('4000','02.05', 'v1');
insert into ag_t1 values ('4000','02.06', 'v2');
insert into ag_t1 values ('4000','08.01', 'v1');
insert into ag_t1 values ('4000','08.02', 'v1');
insert into ag_t1 values ('4000','08.03', 'v1');
insert into ag_t1 values ('4000','08.04', 'v2');
insert into ag_t1 values ('4000','08.05', 'v1');
insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output
v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
Data set 2 -----
insert into ag_t1 values ('4001','1', 'v3');
insert into ag_t1 values ('4001','2', 'v3');
insert into ag_t1 values ('4001','3', 'v3');
insert into ag_t1 values ('4001','4', 'v4');
insert into ag_t1 values ('4001','5', 'v3');
insert into ag_t1 values ('4001','6', 'v4');
for above data set, I want the output as given below.
Output
v3 - 1 to 3, 5
v4 - 4, 6
Data set 3 -----
insert into ag_t1 values ('4002','01.001', 'v1');
insert into ag_t1 values ('4002','01.002', 'v1');
insert into ag_t1 values ('4002','01.003', 'v1');
insert into ag_t1 values ('4002','01.004', 'v2');
insert into ag_t1 values ('4002','01.005', 'v1');
insert into ag_t1 values ('4002','01.006', 'v2');
insert into ag_t1 values ('4002','02.001', 'v1');
insert into ag_t1 values ('4002','02.002', 'v1');
insert into ag_t1 values ('4002','02.003', 'v2');
insert into ag_t1 values ('4002','02.004', 'v2');
insert into ag_t1 values ('4002','02.005', 'v1');
insert into ag_t1 values ('4002','02.006', 'v2');
insert into ag_t1 values ('4002','08.001', 'v1');
insert into ag_t1 values ('4002','08.002', 'v1');
insert into ag_t1 values ('4002','08.003', 'v1');
insert into ag_t1 values ('4002','08.004', 'v2');
insert into ag_t1 values ('4002','08.005', 'v1');
insert into ag_t1 values ('4002','08.006', 'v2');
for above data set, I want the output as given below.
Output
v1 - 01.001 to 01.003, 01.005, 02.001 to 02.002, 02.005, 08.001 to 08.003, 08.005
v2 - 01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006
Data set 4 -----
insert into ag_t1 values ('4000','01.01A', 'v1');
insert into ag_t1 values ('4000','01.01B', 'v1');
insert into ag_t1 values ('4000','01.01C', 'v1');
insert into ag_t1 values ('4000','01.02', 'v1');
insert into ag_t1 values ('4000','01.03', 'v1');
insert into ag_t1 values ('4000','01.04', 'v2');
insert into ag_t1 values ('4000','01.05', 'v1');
insert into ag_t1 values ('4000','01.06', 'v2');
insert into ag_t1 values ('4000','02.01A', 'v1');
insert into ag_t1 values ('4000','02.01B', 'v1');
insert into ag_t1 values ('4000','02.01C', 'v1');
insert into ag_t1 values ('4000','02.02', 'v1');
insert into ag_t1 values ('4000','02.03', 'v2');
insert into ag_t1 values ('4000','02.04', 'v2');
insert into ag_t1 values ('4000','02.05', 'v1');
insert into ag_t1 values ('4000','02.06', 'v2');
insert into ag_t1 values ('4000','08.01A', 'v1');
insert into ag_t1 values ('4000','08.01B', 'v1');
insert into ag_t1 values ('4000','08.01C', 'v1');
insert into ag_t1 values ('4000','08.02', 'v1');
insert into ag_t1 values ('4000','08.03', 'v1');
insert into ag_t1 values ('4000','08.04', 'v2');
insert into ag_t1 values ('4000','08.05', 'v1');
insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output
v1 - 01.01A to 01.01C, 01.02, 01.03, 01.05, 02.01A to 02.01C, 02.02, 02.05, 08.01A to 08.01C, 08.02, 08.03, 08.05
v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
March 04, 2013 - 4:32 pm UTC
why did you post all of the redundant stuff ...
and a create table that does not work
with inserts that fail after you fix the create table...
*
ERROR at line 1:
ORA-12899: value too large for column "OPS$TKYTE"."AG_T1"."COL1" (actual: 6,
maximum: 5)
ugh....
if we are allowed to assume:
all data will either be in the form
NN.NN where N is a digit between 0 and 9
or
NN.NNC where N is a digit between 0 and 9 and C is a character between A and Z
then, a simple variation on a theme can work - we'll convert NN.NN into a number and NN.NNC into a number by taking NN.NN + ascii-code(C)/10000
then the original query just works - albeit with a little more complex formatting of the output.
If these assumptions are not correct - FIX YOUR SPECIFICATION (actually, provide one, you never ever have...)
ops$tkyte%ORA11GR2> select c1, listagg( label, ',' ) within group (order by max_grp) str
2 from (
3 select c1, max_grp,
4 to_char( trunc(min(c2),2) ,'fm00.00') ||
5 case when (min(c2)-trunc(min(c2),2)) <> 0
6 then chr( (min(c2)-trunc(min(c2),2))*10000 )
7 end ||
8 case when min(c2) <> max(c2)
9 then ' to ' || to_char( trunc(max(c2)),'fm00.00') ||
10 case when (max(c2)-trunc(max(c2),2)) <> 0
11 then chr( (max(c2)-trunc(max(c2),2))*10000 )
12 end
13 end label
14 from (
15 select c1, c2, max(grp) over (partition by c1 order by c2) max_grp
16 from (
17 select c1, c2,
18 case when nvl(lag(c2) over (partition by c1 order by c2), c2) <> c2-0.0001
19 then row_number() over (partition by c1 order by c2)
20 end grp
21 from (select col2 C1, to_number(substr( col1, 1, 5 ))+ nvl(ascii(substr(col1,6,1))/10000,0) C2
22 from ag_t1)
23 )
24 )
25 group by c1, max_grp
26 )
27 group by c1
28 order by c1
29 /
C1
----------
STR
-------------------------------------------------------------------------------
v1
01.01A to 01.00C,01.02,01.03,01.05,02.01A to 02.00C,02.02,02.05,08.01A to 08.00
C,08.02,08.03,08.05
v2
01.04,01.06,02.03,02.04,02.06,08.04,08.06
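The encoding assumption above can be sketched in Python (hypothetical code, not part of the thread): 'NN.NN' maps to the number NN.NN, and 'NN.NNC' maps to NN.NN + ascii(C)/10000, so '01.01A', '01.01B', '01.01C' become 0.0001 steps.

```python
# Hypothetical Python sketch of the NN.NN / NN.NNC encoding, not Oracle code.
def encode(col1: str) -> float:
    num = float(col1[:5])
    suffix = col1[5:6]
    return num + (ord(suffix) / 10000 if suffix else 0.0)

def decode(x: float) -> str:
    # Round rather than truncate, to sidestep binary floating-point artifacts.
    cents, suffix = divmod(round(x * 10000), 100)
    return f"{cents / 100:05.2f}" + (chr(suffix) if suffix else "")
```

Round-tripping decode(encode('02.01C')) gives back '02.01C'. Incidentally, the '01.00C' in the output above appears to come from trunc(max(c2)) in the query missing its precision argument (trunc(max(c2),2) would format as '01.01C').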
problem understanding statement
preet, March 20, 2013 - 4:56 pm UTC
Hi Tom,
I am going through ORACLE documentation for understanding Analytic function. Unfortunately, I couldn't get this statement "To filter the results of a query based on an analytic function, nest these functions within the parent query, and then filter the results of the nested subquery."
Could you please provide some example on this statement? I couldn't understand it.
Thank you
March 25, 2013 - 3:46 pm UTC
see my original answer, it does just this.
I cannot write:
where lead(station) over (partition by order# order by close_date) <> station
so I have to alias lead(station) over (partition by order# order by close_date) as "lead_station" and in another level of the query use
where lead_station <> station
Basically you cannot "where" on the analytics, you have to create an inline view (or with factored subquery) and then where on the results of that inline view.
Anand, March 25, 2013 - 6:57 am UTC
Hi Tom,
I have table as below :
DROP TABLE test
/
CREATE TABLE test (branch_code NUMBER,tot_cnt NUMBER ,process_id NUMBER )
/
INSERT INTO test VALUES (1,5000,99)
/
INSERT INTO test VALUES (2,2500,99)
/
INSERT INTO test VALUES (3,2500,99)
/
INSERT INTO test VALUES (4,3000,99)
/
INSERT INTO test VALUES (5,2000,99)
/
INSERT INTO test VALUES (6,1000,99)
/
INSERT INTO test VALUES (7,4000,99)
/
INSERT INTO test VALUES (8,1200,99)
/
Now I want to update the process_id from the default 99 to values 1 through 6. The rows should be divided so that
the total tot_cnt is as equal as possible for each process_id. I know it is not possible
to have the counts exactly equal.
If I use the query below, it gives me a higher count on some process_ids and a lower count on others:
MERGE INTO test a
USING (SELECT branch_code,
ntile(6) over(order by branch_code) stream
FROM test) b
ON (a.branch_code = b.branch_code)
WHEN MATCHED THEN
UPDATE SET a.process_id = b.stream;
Result :
SELECT SUM(TOT_CNT), PROCESS_ID FROM TEST GROUP BY PROCESS_ID
SUM(TOT_CNT) PROCESS_ID
7500 1 ---> Higher count
5500 2
2000 3
1000 4 ---> Lower count
4000 5
1200 6
Anand, April 23, 2013 - 1:27 pm UTC
Hi Tom,
Waiting for your reply on above question.
April 23, 2013 - 2:56 pm UTC
search for
bin fitting
on this site (and others). there are approaches with the model clause, many demonstrated.
Anand, April 25, 2013 - 1:47 pm UTC
Hi Tom,
Thanks for your reply. I never knew about this.
You suggested the site below in one of the forums:
http://www.jlcomp.demon.co.uk/faq/Bin_Fitting.html
Now my question: what if I want to divide the records into 15 or 20 buckets? Do I have to modify the query and add buckets 3, 4, 5 ... 20 (as per the example)?
April 25, 2013 - 7:49 pm UTC
left as an exercise for the reader... we've pointed you to the "technique". Learn the technique and you'll be able to use it in thousands of places over your career ;)
Recursive Subquery Factors...
Brendan, April 26, 2013 - 2:40 pm UTC
I had a look at that link to a MODEL (approximate) solution, and you have to rewrite the query to change the number of buckets. I implemented a similar algorithm using the v11.2 feature Recursive Subquery Factoring, in which I parametrise the number of buckets. It's not quite identical, as it distributes the items in batches across the set of buckets, but it gave the same answers on my test problems, which involved 10,001 items of random values.
Insert test data:
INSERT INTO items
SELECT 'item' || n, DBMS_Random.Value (0, 10000) FROM
(SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <10002)
Set the variable for number of buckets:
VAR N_BUCKETS NUMBER
EXECUTE :N_BUCKETS := 3
Recursive Subquery Factor query:
WITH buckets AS (
SELECT LEVEL bucket, :N_BUCKETS n_buckets FROM DUAL CONNECT BY LEVEL <= :N_BUCKETS
), items_desc AS (
SELECT item_name, item_value, Row_Number () OVER (ORDER BY item_value DESC) rn
FROM items
), rsf (bucket, item_name, item_value, bin_value, lev, bin_rank, n_buckets) AS (
SELECT b.bucket,
i.item_name,
i.item_value,
i.item_value,
1,
b.n_buckets - i.rn + 1,
b.n_buckets
FROM buckets b
JOIN items_desc i
ON i.rn = b.bucket
UNION ALL
SELECT r.bucket,
i.item_name,
i.item_value,
r.bin_value + i.item_value,
r.lev + 1,
Row_Number () OVER (ORDER BY r.bin_value + i.item_value),
r.n_buckets
FROM rsf r
JOIN items_desc i
ON (i.rn - r.lev * r.n_buckets) = r.bin_rank
)
SELECT r.item_name,
r.bucket, r.item_value, r.bin_value, r.lev, r.bin_rank
FROM rsf r
ORDER BY item_value DESC
Output extract for linked Model for 3 buckets, and RSF for 3 and 20 buckets:
Model 3 buckets
NAME VALUE BUCKET_NAME B1 B2 B3
------------------------------ ---------- ----------- ---------- ---------- ----------
item8937 9999 1 9999
item4315 9999 2 9999 9999
item3367 9999 3 9999 9999 9999
...
item5523 1 2 16734557 16734557 16734557
item145 1 1 16734558 16734557 16734557
item3670 0 2 16734558 16734557 16734557
10001 rows selected.
Elapsed: 00:03:00.79
RSF 3 buckets
ITEM_NAME BUCKET ITEM_VALUE BIN_VALUE LEV BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item3367 3 9999 9999 1 1
item8937 1 9999 9999 1 3
item4315 2 9999 9999 1 2
...
item5523 3 1 16734557 3333 2
item145 2 1 16734558 3334 2
item3670 3 0 16734557 3334 1
10001 rows selected.
Elapsed: 00:02:37.65
RSF 20 buckets
ITEM_NAME BUCKET ITEM_VALUE BIN_VALUE LEV BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item8937 1 9999 9999 1 20
item3367 3 9999 9999 1 18
item4315 2 9999 9999 1 19
...
item5523 20 1 2510177 500 1
item145 19 1 2510178 500 3
item3670 20 0 2510177 501 1
10001 rows selected.
Elapsed: 00:02:25.73
...and Plain Old SQL
Brendan, April 26, 2013 - 3:53 pm UTC
On further thought, you might consider a very simple solution where you just assign items sequentially to buckets in value order. This won't give quite as good an answer but often it might be good enough and it's much simpler and faster. Here are the results for 3 buckets, showing model and rsf giving the same very good results, with the simple solution a bit worse, but much quicker:
Model 3 buckets
NAME VALUE BUCKET_NAME B1 B2 B3
------------------------------ ---------- ----------- ---------- ---------- ----------
item906 9999 1 9999
item1343 9998 2 9999 9998
item2819 9995 3 9999 9998 9995
...
item3158 7 3 16701010 16701010 16701010
item814 5 1 16701015 16701010 16701010
item5304 4 2 16701015 16701014 16701010
10001 rows selected.
Elapsed: 00:03:01.91
RSF 3 buckets
ITEM_NAME BUCKET ITEM_VALUE BIN_VALUE LEV BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item906 1 9999 9999 1 3
item1343 2 9998 9998 1 2
item2819 3 9995 9995 1 1
...
item3158 1 7 16701010 3333 2
item814 2 5 16701015 3334 2
item5304 1 4 16701014 3334 1
10001 rows selected.
Elapsed: 00:02:37.84
POS 3 buckets
ITEM_NAME BUCKET ITEM_VALUE BUCKET_TOTAL
------------------------------ ---------- ---------- ------------
item906 2 9999 16704330
item1343 3 9998 16701019
item2819 1 9995 16697690
...
item3158 1 7 16697690
item814 2 5 16704330
item5304 3 4 16701019
10001 rows selected.
Elapsed: 00:00:03.00
Here is the simple SQL:
WITH items_desc AS (
SELECT item_name, item_value, Mod (Row_Number () OVER (ORDER BY item_value DESC), :N_BUCKETS) + 1 bucket
FROM items
)
SELECT item_name, bucket, item_value, Sum (item_value) OVER (PARTITION BY bucket) bucket_total
FROM items_desc
ORDER BY item_value DESC
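The round-robin idea can be sketched in Python (hypothetical code, not part of the thread): sort by value descending and deal items out like cards with Mod(row_number, n_buckets).

```python
# Hypothetical Python sketch of the round-robin bucket assignment above.
def round_robin(values, n_buckets):
    buckets = [[] for _ in range(n_buckets)]
    for i, v in enumerate(sorted(values, reverse=True)):
        buckets[i % n_buckets].append(v)     # Mod(row_number, n_buckets)
    return buckets

totals = [sum(b) for b in round_robin([9, 8, 7, 6, 5, 4], 3)]  # [15, 13, 11]
```

As the totals show, the result is close to balanced but not optimal - the trade-off Brendan describes.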
Improved MODEL
Brendan, April 28, 2013 - 2:18 pm UTC
And, for MODEL aficionados, here I've taken the same underlying algorithm as in the link above, but used MODEL in a slightly different way that avoids embedding the number of buckets in the query structure, and also avoids the expensive repeated summing. I use the first :N_BINS rows to store the running totals, rather than columns. It runs in about 50s on my PC compared with around 110s for both the original MODEL and my RSF (it's a different PC from above). As a bonus, it's quite a lot simpler too.
SELECT item_name, bin, item_value, CASE WHEN rn_m <= :N_BINS THEN bin_value END bin_value
FROM items
MODEL
DIMENSION BY (Row_Number() OVER (ORDER BY item_value DESC) rn)
MEASURES (item_name,
item_value,
0 bin,
item_value bin_value,
Row_Number() OVER (ORDER BY item_value DESC) rn_m,
0 min_bin,
Count(*) OVER () - :N_BINS - 1 n_iters
)
RULES ITERATE(10000) UNTIL (ITERATION_NUMBER >= n_iters[1]) (
bin[rn <= :N_BINS] = CV(rn),
min_bin[1] = Min(rn_m) KEEP (DENSE_RANK FIRST ORDER BY bin_value)[rn <= :N_BINS],
bin[ITERATION_NUMBER + :N_BINS + 1] = min_bin[1],
bin_value[min_bin[1]] = bin_value[CV()] + item_value[ITERATION_NUMBER + :N_BINS + 1]
)
ORDER BY item_value DESC
@Brendan re: Improved MODEL
Stew Ashton, April 28, 2013 - 7:57 pm UTC
Brendan,
Your last MODEL solution is brilliant. Studying it, I noticed that it doesn't quite work when the number of buckets is greater than or equal to the number of items. I suggest this slight adjustment to handle the edge case. It adds an nvl() near the end and filters out a possible extra row with a null item_name.
SELECT item_name, bin, item_value, CASE WHEN rn_m <= :N_BINS THEN bin_value END bin_value
FROM (
select * from items
MODEL
DIMENSION BY (Row_Number() OVER (ORDER BY item_value DESC) rn)
MEASURES (item_name,
item_value,
0 bin,
item_value bin_value,
Row_Number() OVER (ORDER BY item_value DESC) rn_m,
0 min_bin,
Count(*) OVER () - :N_BINS - 1 n_iters
)
RULES ITERATE(10000) UNTIL (ITERATION_NUMBER >= n_iters[1]) (
bin[rn <= :N_BINS] = CV(rn),
min_bin[1] =
Min(rn_m) KEEP (DENSE_RANK FIRST ORDER BY bin_value)[rn <= :N_BINS],
bin[ITERATION_NUMBER + :N_BINS + 1] = min_bin[1],
bin_value[min_bin[1]] =
bin_value[CV()] + nvl(item_value[ITERATION_NUMBER + :N_BINS + 1],0)
)
)
where item_name is not null
ORDER BY item_value DESC
@Brendan: another suggestion
Stew Ashton, April 28, 2013 - 8:53 pm UTC
In the MEASURES clause, put
Row_Number() OVER (ORDER BY item_value DESC) bin,
instead of
0 bin,
then remove the first RULE. In my tests, it cuts elapsed time by about one third.
@Stew: Thanks
Brendan, April 29, 2013 - 11:16 am UTC
Thanks, Stew
You are correct on both points. I hadn't tested the special case where the buckets outnumber the items, and I get a reduction from 68s to 39s by initialising bin to its desired value instead of setting it in the rules.
Bin fitting: PL/SQL seems better
Stew Ashton, April 29, 2013 - 12:16 pm UTC
Further testing indicates that my second suggestion is no help. However, I did try a pipelined table function for comparison, and it runs one or two orders of magnitude faster than the MODEL solution. I suspect it scales better too, assuming the MODEL solution keeps the whole table in memory.
> truncate table items
/
table ITEMS truncated.
> insert /*+ append */ into items
select level, dbms_random.value(0,10000) from dual connect by level <= 10002
/
10,002 rows inserted.
> commit
/
committed.
> create or replace package bin_fit as
cursor cur_out is select item_name, item_value, 0 num_bin from items;
type tt_out is table of cur_out%rowtype;
function do(p_numbins number) return tt_out pipelined;
end bin_fit;
/
PACKAGE BIN_FIT compiled
> create or replace package body bin_fit as
function do(p_numbins number) return tt_out pipelined is
l_bins sys.odcinumberlist := sys.odcinumberlist();
min_bin number := 1;
begin
l_bins.extend(p_numbins);
for i in 1..l_bins.count loop
l_bins(i) := 0;
end loop;
for rec in (
select item_name, item_value, 0 num_bin
from items order by item_value desc
) loop
l_bins(min_bin) := l_bins(min_bin) + rec.item_value;
rec.num_bin := min_bin;
pipe row (rec);
for i in 1..l_bins.count loop
if l_bins(i) < l_bins(min_bin) then
min_bin := i;
end if;
end loop;
end loop;
return;
end do;
end bin_fit;
/
PACKAGE BODY BIN_FIT compiled
> select num_bin, sum(item_value) from (
select * from table(bin_fit.do(20))
)
group by num_bin
order by num_bin
/
NUM_BIN SUM(ITEM_VALUE)
---------- ---------------
1 2506121.265
2 2506121.058
3 2506126.829
4 2506126.745
5 2506128.011
6 2506123.159
7 2506125.575
8 2506126.78
9 2506120.774
10 2506120.798
11 2506120.623
12 2506123.07
13 2506120.905
14 2506123.8
15 2506128.144
16 2506126.882
17 2506120.592
18 2506124.556
19 2506128.09
20 2506125.791
20 rows selected
Elapsed: 00:00:00.130
PL/SQL
Brendan, April 29, 2013 - 12:26 pm UTC
Yes, you'd expect PL/SQL to be much faster as it's an essentially procedural algorithm.
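The greedy algorithm shared by the MODEL and PL/SQL versions can be sketched in Python (hypothetical code, not the thread's implementation): take items largest first and always drop the next item into whichever bucket currently has the smallest total.

```python
# Hypothetical Python sketch of the greedy "smallest bucket first" bin fitting.
import heapq

def bin_fit(values, n_buckets):
    heap = [(0, b) for b in range(n_buckets)]     # (running total, bucket id)
    assignment = []
    for v in sorted(values, reverse=True):
        total, b = heapq.heappop(heap)            # bucket with the least in it
        assignment.append((v, b))
        heapq.heappush(heap, (total + v, b))
    return assignment, sorted(total for total, _ in heap)
```

A heap makes the "find the minimum bucket" step logarithmic, where the PL/SQL loop above scans all buckets linearly after each item.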
multiple record matching using self join or analytics
santosh, June 12, 2013 - 3:16 am UTC
We have a requirement to find matching rows (3 or 4 columns match) within the same table. There may be 1-to-1 matches, as well as 1-to-many and many-to-many matches.
Is there an efficient way to do this if the data in scope could be in the range of millions of rows?
June 18, 2013 - 2:50 pm UTC
the database is brutally efficient at doing joins. Just join.
Anand, June 20, 2013 - 11:47 am UTC
Hi Tom,
I have a table like below :
create table test (limit_id NUMBER,stream NUMBER)
/
insert into test values (123,1)
/
insert into test values (123,2)
/
insert into test values (456,3)
/
insert into test values (678,2)
/
Now I want to use an analytic function to get the limit_ids that have 2 different streams.
June 20, 2013 - 2:52 pm UTC
no analytics, just good old aggregates:
ops$tkyte%ORA11GR2> select limit_id from test group by limit_id having count(distinct stream) = 2;
LIMIT_ID
----------
123
Anand, June 20, 2013 - 4:06 pm UTC
Thanks tom....
nth_value
Rajeshwaran, June 21, 2013 - 1:26 pm UTC
Tom:
What additional information I need to provide nth_value function to get the output as 2975 rather than 3000? I know I can easily achieve results using dense_rank but would like to know how can this be handled using nth_value.
rajesh@ORA11G> select empno,deptno,sal,
2 nth_value(sal,2) over(partition by
3 deptno order by sal desc) nth_sal
4 from emp
5 where deptno = 20
6 order by deptno,sal desc ;
EMPNO DEPTNO SAL NTH_SAL
---------- ---------- ---------- ----------
7788 20 3000 3000
7902 20 3000 3000
7566 20 2975 3000
7876 20 1100 3000
7369 20 800 3000
July 01, 2013 - 4:50 pm UTC
nth value gives you the nth value in a window - that is 3,000 in this case. that is simply what this function does. It isn't looking for the nth DISTINCT value, just the nth value after ordering and partitioning.
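The distinction can be restated outside SQL; this is a hypothetical Python illustration, not the thread's code: the 2nd value of the ordered window is 3000 because of the tie, while 2975 is the 2nd DISTINCT value.

```python
# Hypothetical sketch of nth value vs. nth DISTINCT value for deptno 20's salaries.
sals = sorted([3000, 3000, 2975, 1100, 800], reverse=True)
nth_value = sals[1]                                  # 2nd value in order -> 3000
nth_distinct = sorted(set(sals), reverse=True)[1]    # 2nd distinct value -> 2975
```

This is why DENSE_RANK (which ranks distinct values) reaches 2975 while NTH_VALUE(sal, 2) does not.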
Need a help on formulating report query
kumar, July 10, 2013 - 8:01 am UTC
Hi Tom,
I need to generate a weekly (as of Monday 0:01 AM) and a monthly (as of the 1st of the month)
report for the following set of data, which is a modification history of a product.
I need to produce a count of products for each latest status (max of seq_id within/group by prod_id) for every week or month.
Let's take an example for the weekly report:
I need to generate the report for a given period - start date: 27-May-2013; end date: 09-Jul-2013.
We need to generate the report as of each week of the year (a week is considered to start Monday 0:01 am).
The week numbers for the given dates are 22/2013 (Mon - 27-05-2013 00:01 am), 23/2013 ... 28/2013.
Here is the query for each weekly report, but I want your help in combining them all into a single query.
CREATE TABLE PRODUCT_TEST
(
PROD_SEQ_ID NUMBER PRIMARY KEY,
PROD_ID NUMBER,
PRD_CHNG_DTTM DATE,
PROD_STTS VARCHAR2(25)
);
Insert into PRODUCT_TEST values (1, 101, to_date('27-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (2, 101, to_date('28-MAY-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (3, 101, to_date('30-MAY-2013','DD-MON-YYYY'),'Error');
Insert into PRODUCT_TEST values (4, 102, to_date('31-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (5, 102, to_date('04-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (6, 102, to_date('08-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (7, 103, to_date('12-JUN-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (8, 103, to_date('19-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (9, 103, to_date('26-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (10, 104, to_date('10-JUL-2013','DD-MON-YYYY'),'New');
COMMIT;
First we need to find all the week numbers, i.e., Mondays at 00:01 AM, so between 27th May and 9th Jul
we will have weeks
22/2013(date will be 27/05/2013 0:01 am)
23/2013(date will be 03/06/2013 0:01 am)
24/2013(date will be 10/06/2013 0:01 am) and so on....till end date
For weekly report
For week 22/2013=======PROD_CHNG_DTTM < 27-May-2013 00:01==========
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
THEN 1
END ) cnt_new,
COUNT( CASE WHEN PROD_STTS = 'Open'
THEN 1
END ) cnt_open,
COUNT( CASE WHEN PROD_STTS = 'Error'
THEN 1
END ) cnt_error,
COUNT( CASE WHEN PROD_STTS = 'Closed'
THEN 1
END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
For week 23/2013========PROD_CHNG_DTTM < 03-Jun-2013 00:01==========
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
THEN 1
END ) cnt_new,
COUNT( CASE WHEN PROD_STTS = 'Open'
THEN 1
END ) cnt_open,
COUNT( CASE WHEN PROD_STTS = 'Error'
THEN 1
END ) cnt_error,
COUNT( CASE WHEN PROD_STTS = 'Closed'
THEN 1
END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('03-Jun-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
............. and so on for every week till end date and need to produce a
weekly report like
For the period 27th May to 9th Jul--
As of New_Count Open_Count Error_Count Closed_count
2013 FW22 1 0 0 0
2013 FW23 1 0 1 0
2013 FW24 0 0 1 1
..........................so on
Thanks in advance..
Can you please help me in formulating above query.
kumar, July 13, 2013 - 2:55 pm UTC
Need your help Tom....
help needed
kumar, July 16, 2013 - 5:11 pm UTC
Hi Tom,
please help me in formulating above requirement in combining all the queries into one....
Anxiously waiting for response..
Need your help on reporting query.
kumar, July 17, 2013 - 10:58 am UTC
Hi Tom,
I need to prepare a monthly, weekly and yearly count report from a modification history based on status.
CREATE TABLE PRODUCT_TEST
(
PROD_SEQ_ID NUMBER PRIMARY KEY,
PROD_ID NUMBER,
PRD_CHNG_DTTM DATE,
PROD_STTS VARCHAR2(25)
);
Insert into PRODUCT_TEST values (1, 101, to_date('27-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (2, 101, to_date('28-MAY-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (3, 101, to_date('30-MAY-2013','DD-MON-YYYY'),'Error');
Insert into PRODUCT_TEST values (4, 102, to_date('31-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (5, 102, to_date('04-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (6, 102, to_date('08-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (7, 103, to_date('12-JUN-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (8, 103, to_date('19-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (9, 103, to_date('26-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (10, 104, to_date('10-JUL-2013','DD-MON-YYYY'),'New');
COMMIT;
If the user wants the report from date 26-May-2013 to date 09-Jul-2013 -
Let's just take a weekly report example:
First we need to find all the week numbers, i.e., all the Mondays at 00:01 AM, so for the 26th May to 9th Jul range we will have weeks as
21/2013 (date will be 20/05/2013 0:01 am)
22/2013 (date will be 27/05/2013 0:01 am)
......
28/2013 (date will be 08/07/2013 0:01 am)
Now I have to produce a trend report which will have records as of each week:
For 21/2013 my query will have PROD_CHNG_DTTM < 20-May-2013 00:01 i.e., take all records from beginning till 20/05/13
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
THEN 1
END ) cnt_new,
COUNT( CASE WHEN PROD_STTS = 'Open'
THEN 1
END ) cnt_open,
COUNT( CASE WHEN PROD_STTS = 'Error'
THEN 1
END ) cnt_error,
COUNT( CASE WHEN PROD_STTS = 'Closed'
THEN 1
END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
For 22/2013 my query will have PROD_CHNG_DTTM < 27-May-2013 00:01 i.e., take all records from beginning till 27/05/13
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
THEN 1
END ) cnt_new,
COUNT( CASE WHEN PROD_STTS = 'Open'
THEN 1
END ) cnt_open,
COUNT( CASE WHEN PROD_STTS = 'Error'
THEN 1
END ) cnt_error,
COUNT( CASE WHEN PROD_STTS = 'Closed'
THEN 1
END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
One way is to write queries for all the weeks in the user's requested date range and use UNION ALL, but I think that will be the slowest option if the user asks for a date range spanning over 100 weeks.
I am sure you must have a faster solution for combining all the queries into one (I mean the week-finding query, plus all the report queries taking the max record for each varying to-date...).
I hope I have made myself clear.
Thanks a lot in Advance
July 17, 2013 - 6:08 pm UTC
I don't really have anything to add - this is not really a followup question - maybe when I have time to take a new question.
since you seem to use
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
multiple times, perhaps in your union all you can
with data
as
( that select )
select .. from data ..
union all
select ... from data. ...
union all
select ... from data ..
we'll build the result once (data) and use it over and over.
Thanks a lot TOM
kumar, July 30, 2013 - 9:12 am UTC
Thank you very much Tom..
For the time being I have built my query as per your suggestion, but I would still like to see some good or flexible options from your pen, just for learning and to enhance my analytical ability.
Maybe I will try to put this into the new question section whenever I am able to see the new question button......
@kumar on reporting query
Stew Ashton, July 31, 2013 - 1:02 pm UTC
Hi kumar,
You may not ever read this, but I think you misled Tom. You repeated
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
when you said you needed two different dates.
I think you want a weekly summary of product status. For that, you need:
- the "as of" dates you want the summary for
- the
range of dates the status applies to
- join status to "as of" dates based on the date range
- then group by "as of" date.
with as_of_dates as (
select min_as_of_date + level*7 as_of_date
from (
select trunc(min(prd_chng_dttm)-1/24/60/60, 'IW') min_as_of_date
from product_test
)
connect by min_as_of_date + level*7 <= sysdate
), data as (
select prd_chng_dttm,
lead(prd_chng_dttm-1/24/60/60,1,sysdate)
over(partition by prod_id order by prd_chng_dttm) prd_chng_dttm_end,
CASE WHEN PROD_STTS = 'New' THEN 1 else 0 END cnt_new,
CASE WHEN PROD_STTS = 'Open' THEN 1 else 0 END cnt_Open,
CASE WHEN PROD_STTS = 'Error' THEN 1 else 0 END cnt_Error,
CASE WHEN PROD_STTS = 'Closed' THEN 1 else 0 END cnt_Closed
from product_test
)
select as_of_date, sum(cnt_new) cnt_new, sum(cnt_open) cnt_open,
sum(cnt_error) cnt_error, sum(cnt_closed) cnt_closed
from data a, as_of_dates b
where b.as_of_date between a.prd_chng_dttm and a.prd_chng_dttm_end
group by as_of_date
order by as_of_date;
AS_OF_DATE CNT_NEW CNT_OPEN CNT_ERROR CNT_CLOSED
-------------------- ---------- ---------- ---------- ----------
27-MAY-13 00.00.00 1 0 0 0
03-JUN-13 00.00.00 1 0 1 0
10-JUN-13 00.00.00 0 0 1 1
17-JUN-13 00.00.00 1 0 1 1
24-JUN-13 00.00.00 0 1 1 1
01-JUL-13 00.00.00 0 0 1 2
08-JUL-13 00.00.00 0 0 1 2
15-JUL-13 00.00.00 1 0 1 2
22-JUL-13 00.00.00 1 0 1 2
29-JUL-13 00.00.00 1 0 1 2
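The core of this approach can be sketched in Python (hypothetical code, not the SQL above): each status change is effective until the next change, so the status in force on an as-of date is the last change at or before that date, and we count products by that status.

```python
# Hypothetical Python sketch of the "status as of a date" counting idea,
# using two products from kumar's data.
import datetime as dt
from collections import Counter

history = {  # prod_id -> [(change_date, status), ...] in date order
    101: [(dt.date(2013, 5, 27), "New"), (dt.date(2013, 5, 28), "Open"),
          (dt.date(2013, 5, 30), "Error")],
    102: [(dt.date(2013, 5, 31), "New"), (dt.date(2013, 6, 4), "Open"),
          (dt.date(2013, 6, 8), "Closed")],
}

def as_of_counts(history, as_of):
    counts = Counter()
    for changes in history.values():
        current = None
        for d, status in changes:
            if d <= as_of:          # last change at or before as_of wins
                current = status
        if current is not None:
            counts[current] += 1
    return counts
```

For an as-of date of 2013-06-03 (week 23), this counts one 'New' (product 102) and one 'Error' (product 101), matching the 2013 FW23 row above.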
using partition
A, August 01, 2013 - 6:57 pm UTC
Hello Tom,
I was wondering: can we use partitioning to display "Multiple" for Col3 when the value is repeated?
If the data is like this,
Col1 Col2 Col3
111 AA NY
222 BB LON
111 AA PAR
333 AA MUM
111 AA MUM
222 BB LON
444 XX LON
We want the output like this:
Col1 Col2 Col3
111 AA Multiple
222 BB Multiple
333 AA MUM
444 XX LON
If there are multiple values of Col3 for a combination of Col1 and Col2, then Col3 should display "Multiple"; otherwise it should show the value.
Thanks
August 02, 2013 - 7:26 pm UTC
ops$tkyte%ORA11GR2> with data
2 as
3 (select 111 c1, 'aa' c2, 'NY' c3 from dual union all
4 select 111 c1, 'aa' c2, 'XX' c3 from dual union all
5 select 222 c1, 'aa' c2, 'YY' c3 from dual union all
6 select 222 c1, 'aa' c2, 'YY' c3 from dual
7 )
8 select c1, c2, case when count(distinct c3) > 1 then 'multiple' else max(c3) end c3
9 from data
10 group by c1, c2
11 order by c1, c2
12 /
C1 C2 C3
---------- -- --------
111 aa multiple
222 aa YY
Thanks
A, August 05, 2013 - 7:14 am UTC
Hello Tom,
Thanks for the reply. This is hard-coded to only 2 distinct rows. There are more than 500 rows in the table and any combination of these
some thing like ...
Col1 Col2 Col3
111 AA NY
222 BB LON
111 AA PAR
333 AA MUM
111 AA MUM
222 BB LON
444 XX LON
100 AQ JJ
101 BB PAR
...
...
...
How to write a query for this?
August 08, 2013 - 4:26 pm UTC
there is no hard coded limit? what are you talking about?
my query works for any number of rows.
Response
Raj B, August 05, 2013 - 12:46 pm UTC
Tom has already answered the query! All you need to do is just try it.
SQL> select * from t3;
COL1 COL COL
---------- --- ---
222 BB ZZZ
345 MM YYY
111 AA CAL
222 BB IAD
222 BB BLT
111 AA LON
123 AA MUM
7 rows selected.
SQL> select col1,col2, case when count(distinct col3) > 1 then 'MUL' else max(col3) end col3
from t3
group by col1, col2 order by col1, col2;
COL1 COL COL
---------- --- ---
111 AA MUL
123 AA MUM
222 BB MUL
345 MM YYY
tough query
Bhavesh Ghodasara, August 21, 2013 - 1:27 pm UTC
Hi Tom,
I was a regular at your blog 7-8 years ago and learned a lot from you at the beginning of my career; it helped a lot. Ironically, I became a Teradata DBA :).. Thanks for all your help and wonderful solutions.
So I am back with a query again.
create table t
(
Artcl integer,
store integer,
yr integer,
week integer,
Inventory integer);
insert into t values( 1,1,2011,50,2);
insert into t values( 1,1,2011,51,-5);
insert into t values( 1,1,2011,52,4);
insert into t values( 1,1,2012,1,-5);
insert into t values( 1,1,2012,4,3);
insert into t values( 1,1,2012,7,2);
insert into t values( 1,1,2012,8,4);
insert into t values( 1,1,2012,10,-2);
insert into t values( 1,1,2012,15,7);
insert into t values( 2,1,2011,52,-5);
insert into t values( 2,1,2012,4,2);
insert into t values( 2,1,2012,6,-1);
insert into t values( 2,1,2012,50,5);
insert into t values( 2,1,2012,52,2);
insert into t values( 2,1,2013,1,-2);
insert into t values( 2,1,2013,2,-7);
insert into t values( 2,1,2013,3,-2);
insert into t values( 2,1,2013,4,4);
insert into t values( 2,1,2013,5,-2);
insert into t values( 3,2,2012,6,1);
insert into t values( 3,2,2012,50,2);
insert into t values( 3,2,2012,52,3);
insert into t values( 3,2,2013,1,-2);
insert into t values( 3,2,2013,2,-1);
insert into t values( 3,2,2013,3,-1);
insert into t values( 3,2,2013,4,5);
insert into t values( 3,2,2013,5,-2);
Now data will look like (except last column)
Artcl store year week Inventory Rolling total
1 1 2011 50 2 2
1 1 2011 51 -5 0
1 1 2011 52 4 4
1 1 2012 1 -5 0
1 1 2012 4 3 3
1 1 2012 7 2 5
1 1 2012 8 4 9
1 1 2012 10 -2 7
1 1 2012 15 7 14
2 1 2011 52 -5 0
2 1 2012 4 2 2
2 1 2012 6 -1 1
2 1 2012 50 5 6
2 1 2012 52 2 8
2 1 2013 1 -2 6
2 1 2013 2 -7 0
2 1 2013 3 -2 0
2 1 2013 4 4 4
2 1 2013 5 -2 2
3 2 2012 6 1 1
3 2 2012 50 2 3
3 2 2012 52 3 6
3 2 2013 1 -2 4
3 2 2013 2 -1 3
3 2 2013 3 -1 2
3 2 2013 4 5 7
3 2 2013 5 -2 5
I need the rolling total column as in the report above.
1) partition by artcl, store, yr, week
2) rolling total of inventory with the following condition:
a) if the total becomes 0 or negative, treat it as 0 and start over.
I tried hard to get a solution, but to no avail. I am even starting to think it is probably not possible with plain SQL and will need a function/procedure.
Please help. Left with no other option, I came back to my childhood hero for help :)
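The reset rule itself is simple to state procedurally, which is exactly why a single window function struggles with it: each running total depends on whether an earlier total was clamped to zero. A small sketch of the rule in Python, just to pin down the arithmetic (not an Oracle solution):

```python
def rolling_reset(values):
    """Running total that clamps at zero and restarts,
    per the rule: if the total hits 0 or goes negative, reset to 0."""
    total = 0
    out = []
    for v in values:
        total = max(total + v, 0)  # clamp-and-continue
        out.append(total)
    return out

# Artcl 1 / store 1 inventory deltas from the sample data
print(rolling_reset([2, -5, 4, -5, 3, 2, 4, -2, 7]))
# -> [2, 0, 4, 0, 3, 5, 9, 7, 14], matching the expected report
```

Because the clamp makes the accumulation non-associative, it cannot be expressed as a plain SUM() OVER (...); hence the user-defined aggregate shown further down the thread.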
String Diff
TCS, September 17, 2013 - 10:44 am UTC
I want to do a string diff between two strings: if I have two input strings, say ABCDCAFGV and ABDCAF, then my output should be CGV.
Basically I want to do a position match between the two strings and output the difference.
Can it be done through an Oracle SQL query only, or do we need to write a UDF for that?
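One reading that fits the example is a longest-common-subsequence diff: ABCDCAFGV minus its LCS with ABDCAF (the LCS is ABDCAF itself) leaves exactly CGV. A hypothetical Python sketch of that interpretation — in plain SQL this would be painful, so a UDF is the practical route:

```python
def lcs_diff(s, t):
    """Characters of s not used by a longest common subsequence of s and t."""
    m, n = len(s), len(t)
    # classic LCS length table, dp[i][j] = LCS length of s[i:] and t[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            dp[i][j] = dp[i+1][j+1] + 1 if s[i] == t[j] else max(dp[i+1][j], dp[i][j+1])
    # walk the table, collecting characters of s that are NOT matched
    out, i, j = [], 0, 0
    while i < m and j < n:
        if s[i] == t[j]:
            i += 1; j += 1
        elif dp[i+1][j] >= dp[i][j+1]:
            out.append(s[i]); i += 1
        else:
            j += 1
    out.extend(s[i:])
    return ''.join(out)

print(lcs_diff('ABCDCAFGV', 'ABDCAF'))  # -> CGV
```

If the requirement is something simpler (e.g. a strict position-by-position compare), the answer changes, so it is worth confirming the rule with more examples.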
reply to Bhavesh's question
Ranjan., September 19, 2013 - 5:19 pm UTC
Hi Bhavesh,
For your scenario, here is code that will produce the desired result.
################################
CREATE TYPE typ_obj_breaksum AS OBJECT
(
sum NUMBER,
STATIC FUNCTION ODCIAggregateInitialize (
sctx IN OUT typ_obj_breaksum
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateIterate (
self IN OUT typ_obj_breaksum,
value IN NUMBER
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateTerminate (
self IN typ_obj_breaksum,
retval OUT NUMBER,
flags IN NUMBER
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateMerge (
self IN OUT typ_obj_breaksum,
ctx2 IN typ_obj_breaksum
) RETURN NUMBER
);
/
CREATE TYPE BODY typ_obj_breaksum IS
STATIC FUNCTION ODCIAggregateInitialize (
sctx IN OUT typ_obj_breaksum
) RETURN NUMBER IS
BEGIN
sctx := typ_obj_breaksum(0);
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate (
self IN OUT typ_obj_breaksum,
value IN NUMBER
) RETURN NUMBER IS
BEGIN
self.sum := CASE
WHEN value >= 0
OR (value < 0 AND self.sum + value > 0)
THEN self.sum + value
ELSE 0
END;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateTerminate (
self IN typ_obj_breaksum,
retval OUT NUMBER,
flags IN NUMBER
) RETURN NUMBER IS
BEGIN
retval := self.sum;
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateMerge (
self IN OUT typ_obj_breaksum,
ctx2 IN typ_obj_breaksum
) RETURN NUMBER IS
BEGIN
self.sum := CASE
WHEN self.sum + ctx2.sum > 0
THEN self.sum + ctx2.sum
ELSE 0
END;
RETURN ODCIConst.Success;
END;
END;
/
CREATE FUNCTION breaksum (input NUMBER) RETURN NUMBER
PARALLEL_ENABLE
AGGREGATE USING typ_obj_breaksum;
/
SELECT artcl, store, yr, week, inventory,
       breaksum(inventory) over (partition by artcl order by rownum) AS rolling_total
FROM t;
And here is the output:
##################################
Artcl store year week Inventory Rolling total
1 1 2011 50 2 2
1 1 2011 51 -5 0
1 1 2011 52 4 4
1 1 2012 1 -5 0
1 1 2012 4 3 3
1 1 2012 7 2 5
1 1 2012 8 4 9
1 1 2012 10 -2 7
1 1 2012 15 7 14
2 1 2011 52 -5 0
2 1 2012 4 2 2
2 1 2012 6 -1 1
2 1 2012 50 5 6
2 1 2012 52 2 8
2 1 2013 1 -2 6
2 1 2013 2 -7 0
2 1 2013 3 -2 0
2 1 2013 4 4 4
2 1 2013 5 -2 2
3 2 2012 6 1 1
3 2 2012 50 2 3
3 2 2012 52 3 6
3 2 2013 1 -2 4
3 2 2013 2 -1 3
3 2 2013 3 -1 2
3 2 2013 4 5 7
3 2 2013 5 -2 5
############################
Hope that helps you, and from now on you will be able to create your own user-defined aggregate functions. :)
cont to my above post for Bhavesh
Ranjan., September 19, 2013 - 5:53 pm UTC
Hi Bhavesh,
Looking at your expected result, your data needs to be partitioned by artcl, store and ordered by yr, week.
Because for "2 1 2013 1 -2 6",
the 6 comes from the year 2012 balance, which is 8, plus the "-2" of year 2013.
So the query should be as below.
################
SELECT artcl, store, yr, week, inventory,
       breaksum(inventory) over (partition by artcl, store order by yr, week) AS rolling_total
FROM t;
########################
Enjoy :).
thanks
Bhavesh Ghodasara, September 24, 2013 - 11:50 am UTC
Thank you so much, Ranjan.
I really appreciate your effort.
I would like to know whether it is possible to get this particular output with just a query; I don't want to use any function.
cont to my last post for @Bhavesh
Ranjan, September 26, 2013 - 8:43 pm UTC
Hi Bhavesh,
For your requirement, it is not that easy to produce the desired report directly with a select query.
Long back I read about "creating our own aggregate function" on the site below.
http://www.oracle-developer.net/display.php?id=215 I suggest you spend 10 minutes reading that page, and
you will know why it is not that easy :).
I have only adapted the logic from that site in my last post.
Regards,
Ranjan.
Rich, October 29, 2013 - 3:26 pm UTC
Tom,
Thanks for providing the "Old Carry Forward Trick." It worked well for me!
"One of my (current) favorite analytic tricks -- the old "carry forward". We mark rows such that
the preceding row was different -- subsequent dup rows would have NULLS there for grp.
Then, we use max(grp) to "carry" that number down....
Now we have something to group by -- we've divided the rows up into groups we can deal with."
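The trick quoted above has two steps: mark each row whose value differs from the preceding row (duplicates get NULL), then carry the last non-NULL marker down with a running MAX. A miniature Python emulation, just to make the mechanics concrete (the SQL version uses a CASE over LAG() for the first step and MAX(grp) OVER (ORDER BY ...) for the second):

```python
def carry_forward(values):
    """Mark change points; carry the marker down over duplicate rows."""
    # step 1: the LAG()-based CASE -- NULL (None) when the row repeats
    marks = [i if i == 0 or v != values[i-1] else None
             for i, v in enumerate(values)]
    # step 2: the running MAX(grp) -- carry the last marker forward
    grp, out = None, []
    for m in marks:
        if m is not None:
            grp = m
        out.append(grp)
    return out

print(carry_forward(['A', 'A', 'B', 'B', 'B', 'A']))
# -> [0, 0, 2, 2, 2, 5]: three groups we can now GROUP BY
```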
selecting selective subset of rows
Ravi B, April 05, 2014 - 12:00 am UTC
Hi Tom,
Could you please help me with this problem.
CREATE TABLE TEST_VERSION(VERSION_ID NUMBER,SUBVERSION VARCHAR2(100));
insert into TEST_VERSION VALUES(1,null);
insert into TEST_VERSION VALUES(1,'Alpha');
insert into TEST_VERSION VALUES(1,'Internal Beta');
insert into TEST_VERSION VALUES(1,'Technical Preview');
insert into TEST_VERSION VALUES(1,'RTM');
insert into TEST_VERSION VALUES(2,null);
insert into TEST_VERSION VALUES(2,'Alpha');
insert into TEST_VERSION VALUES(3,'Internal Beta');
insert into TEST_VERSION VALUES(4,'Technical Preview');
SELECT * FROM TEST_VERSION;
select rules:
for given version_id:
1) if there are row(s) with SUBVERSION = 'RTM' in the set and also a row with SUBVERSION = NULL, ignore the row with SUBVERSION = NULL and select all the other rows, including RTM
2) if there is a row with SUBVERSION = NULL and there are no rows with SUBVERSION = 'RTM', then pick all the rows, including SUBVERSION = NULL
In essence, a row with SUBVERSION = NULL cannot co-exist with SUBVERSION = 'RTM' for the same version_id.
April 16, 2014 - 5:43 pm UTC
I think #2 should be phrased as:
2) if there are no rows with RTM, select them all.
ops$tkyte%ORA11GR2> select *
2 from (
3 select version_id, subversion,
4 max( case when subversion = 'RTM' then 1 else 0 end ) over (partition by version_id ) has_rtm
5 from test_version
6 )
7 where (has_rtm = 1 and subversion is not null)
8 or (has_rtm = 0)
9 order by version_id, subversion
10 /
VERSION_ID SUBVERSION HAS_RTM
---------- -------------------- ----------
1 Alpha 1
1 Internal Beta 1
1 RTM 1
1 Technical Preview 1
2 Alpha 0
2 0
3 Internal Beta 0
4 Technical Preview 0
8 rows selected.
selecting selective subset of rows
A reader, April 17, 2014 - 3:51 am UTC
Brilliant!!
alternatively
Igor, April 17, 2014 - 4:02 pm UTC
SQL> select * from test_version
2 minus
3 select version_id, decode(subversion,'RTM',null) from test_version where subversion='RTM'
4 order by 1,2;
VERSION_ID SUBVERSION
---------- -----------------
1 Alpha
1 Internal Beta
1 RTM
1 Technical Preview
2 Alpha
2
3 Internal Beta
4 Technical Preview
8 rows selected.
April 17, 2014 - 5:40 pm UTC
depends, we know nothing about their primary keys, unique constraints and so on. don't forget that MINUS adds a "distinct" to the query. That means that in general, you cannot substitute MINUS for what I did above.
just re-run the inserts and:
ops$tkyte%ORA11GR2> select *
2 from (
3 select version_id, subversion,
4 max( case when subversion = 'RTM' then 1 else 0 end ) over (partition by version_id ) has_rtm
5 from test_version
6 )
7 where (has_rtm = 1 and subversion is not null)
8 or (has_rtm = 0)
9 order by version_id, subversion
10 /
VERSION_ID SUBVERSION HAS_RTM
---------- -------------------- ----------
1 Alpha 1
1 Alpha 1
1 Internal Beta 1
1 Internal Beta 1
1 RTM 1
1 RTM 1
1 Technical Preview 1
1 Technical Preview 1
2 Alpha 0
2 Alpha 0
2 0
2 0
3 Internal Beta 0
3 Internal Beta 0
4 Technical Preview 0
4 Technical Preview 0
16 rows selected.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select * from test_version
2 minus
3 select version_id, decode(subversion,'RTM',null) from test_version where subversion='RTM'
4 order by 1,2;
VERSION_ID SUBVERSION
---------- --------------------
1 Alpha
1 Internal Beta
1 RTM
1 Technical Preview
2 Alpha
2
3 Internal Beta
4 Technical Preview
8 rows selected.
and you don't need your decode, just select version_id, null - since you are only getting RTM subversions - the decode is an extra step you would not need.
a non-analytic query that would work could be:
ops$tkyte%ORA11GR2> select *
2 from test_version a
3 where not exists (select null
4 from test_version b
5 where a.version_id = b.version_id
6 and b.subversion = 'RTM'
7 and a.subversion is null)
8 order by 1, 2
9 /
but be careful trying to substitute MINUS, UNION, UNION ALL, INTERSECT, etc - they have "set based side effects" that can and will change the answer you receive - unless there are constraints in place that tell you "that side effect cannot happen"
but yes, there are a huge number of equivalent ways to get this answer...
Igor, April 17, 2014 - 9:03 pm UTC
Tom, you are right, the decode was not needed in my query; thanks for pointing that out. And yes, these set operators (union, minus, etc.) may produce different results depending on the data (for example, duplicate records will be lost). Thanks!
April 17, 2014 - 10:52 pm UTC
thanks for the ideas though :) it is always good to see alternatives...
A reader, August 11, 2015 - 7:51 pm UTC
In this example, is there a way to get the values without doing the join twice? I need to get the values from table n where the MAX date (field d) is less than or equal to the date in table g.
create table g(a int, b varchar2(2), d date); -- a, b and d are together unique
insert into g values (1, 'ER', to_date('01012005','mmddyyyy'));
insert into g values (1, 'ER', to_date('06012005','mmddyyyy'));
insert into g values (1, 'ER', to_date('01012015','mmddyyyy'));
insert into g values (2, 'PR', to_date('01012005','mmddyyyy'));
insert into g values (2, 'PR', to_date('01012015','mmddyyyy'));
insert into g values (2, 'PR', to_date('06012015','mmddyyyy'));
insert into g values (3, 'FR', to_date('06012015','mmddyyyy'));
create table n(a int, b varchar2(2), d date, e varchar2(3), f varchar2(3)); -- a, b and d are together unique
insert into n values (1, 'ER', to_date('01012005','mmddyyyy'), 'FSC', 'COL');
insert into n values (1, 'ER', to_date('01012014','mmddyyyy'), 'FOO', 'CLL');
insert into n values (1, 'PR', to_date('01012014','mmddyyyy'), 'FOI', 'COL');
insert into n values (1, 'PR', to_date('06012014','mmddyyyy'), 'FIT', 'CLL');
select g.a, g.b, g.d,
       (select n.e from n where n.a = g.a and n.b = g.b
           and n.d = (select max(d) from n where a = g.a and b = g.b and d <= g.d)) e,
       (select n.f from n where n.a = g.a and n.b = g.b
           and n.d = (select max(d) from n where a = g.a and b = g.b and d <= g.d)) f
  from g;
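The shape of the problem is an "as of" lookup: for each (a, b, d) in g, find the single row of n with the same (a, b) and the greatest date not after g's date, then read e and f from that one row, so the history only has to be probed once per g row. A small Python sketch of that single-lookup shape (dates shown as yyyymmdd integers for brevity; this illustrates the idea, it is not the SQL answer):

```python
from bisect import bisect_right

def as_of_lookup(g_rows, n_rows):
    """For each g row (a, b, d), find the n row with the same (a, b) whose
    date is the greatest date <= d, returning e and f from one lookup."""
    # index n by (a, b), with each history sorted by date
    idx = {}
    for a, b, d, e, f in sorted(n_rows, key=lambda r: r[2]):
        idx.setdefault((a, b), []).append((d, e, f))
    out = []
    for a, b, d in g_rows:
        hist = idx.get((a, b), [])
        pos = bisect_right([h[0] for h in hist], d) - 1  # last date <= d
        match = hist[pos] if pos >= 0 else (None, None, None)
        out.append((a, b, d, match[1], match[2]))
    return out
```

In Oracle the same one-probe effect can typically be had by turning the two scalar subqueries into a single correlated subquery using MAX(e) KEEP (DENSE_RANK LAST ORDER BY d) and MAX(f) KEEP (DENSE_RANK LAST ORDER BY d), so the history is scanned once rather than twice per column.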
Analytics based on Logical offset
Rajeshwaran, Jeyabal, July 13, 2016 - 11:24 am UTC
set feedback off
drop table t purge;
create table t(
HYB_PROJECT_KEY NUMBER,
SAMP_CNFG_KEY NUMBER,
SAMPLE_RESULT_KEY NUMBER,
RESULT_SOURCE VARCHAR2(1),
RESULT_SOURCE_CDC1 VARCHAR2(1),
RESULT_DATE DATE,
RESULT_VALUE_TYPE VARCHAR2(10),
RESULT_VALUE NUMBER,
IS_ACTIVE NUMBER);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'A','A',to_date('09-JUN-2014','DD-MON-YYYY'),'TEST',7.2,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'A','A',to_date('14-JUN-2014','DD-MON-YYYY'),'TEST',8,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('01-JUN-2014','DD-MON-YYYY'),'TEST',4,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('03-JUN-2014','DD-MON-YYYY'),'TEST',5,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('06-JUN-2014','DD-MON-YYYY'),'TEST',4,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('07-JUN-2014','DD-MON-YYYY'),'TEST',3,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('15-JUN-2014','DD-MON-YYYY'),'TEST',10,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE)
values (84,114,1365970,'H','H',to_date('17-JUN-2014','DD-MON-YYYY'),'TEST',0,1);
commit;
set feedback on
Team,
1) for each sample_result_key and result_date, retain only the minimum non-zero value as result_value.
2) for each sample_result_key, if the maximum result_date has result_value = 0, then go back seven days and pick the minimum result_value and its corresponding result_source and result_source_cdc1.
I was able to come up with a query like this, but do you have a better way of doing it?
demo@ORA11G>
demo@ORA11G> select hyb_project_key, samp_cnfg_key , sample_result_key,
2 max(result_date) result_date,
3 min(case when result_value > 0 then result_value end) result_value,
4 min(result_source) keep(dense_rank first order by case when result_value > 0 then result_value end asc nulls last) result_source,
5 min(result_source_cdc1) keep(dense_rank first order by case when result_value > 0 then result_value end asc nulls last) result_source_cdc1
6 from (
7 select hyb_project_key, samp_cnfg_key,
8 sample_result_key,result_date, result_value,rnk,
9 result_source,result_source_cdc1
10 from (
11 select hyb_project_key, samp_cnfg_key,
12 sample_result_key,result_date, result_value,rnk,
13 result_source,result_source_cdc1,nxt_value,
14 last_value(nxt_value ignore nulls) over(partition by sample_result_key order by rnk) nxt_value2
15 from (
16 select hyb_project_key, samp_cnfg_key,
17 sample_result_key,result_date,
18 min(case when result_value > 0 then result_value end) result_value,
19 min(result_source) keep(dense_rank first order by case when result_value > 0 then result_value end nulls last) result_source,
20 min(result_source_cdc1) keep(dense_rank first order by case when result_value > 0 then result_value end nulls last) result_source_cdc1 ,
21 row_number() over(partition by sample_result_key order by result_date desc) rnk,
22 case when row_number() over(partition by sample_result_key order by result_date desc) = 1
23 and min(case when result_value > 0 then result_value end) is null
24 then last_value( result_date ) over( partition by sample_result_key order by result_date desc
25 range between current row and interval '7' day following) end nxt_value
26 from t
27 group by hyb_project_key, samp_cnfg_key,
28 sample_result_key,result_date
29 order by result_date desc
30 )
31 )
32 where rnk =1 or nxt_value2 <= result_date
33 ) t2
34 group by hyb_project_key, samp_cnfg_key , sample_result_key ;
HYB_PROJECT_KEY SAMP_CNFG_KEY SAMPLE_RESULT_KEY RESULT_DATE RESULT_VALUE R R
--------------- ------------- ----------------- ----------- ------------ - -
84 114 1365970 17-JUN-2014 8 A A
1 row selected.
July 13, 2016 - 1:24 pm UTC
I'm not sure I understand the requirement. Based on the example, do you mean:
"Return the maximum result date for each sample. If this has a value of zero, return the lowest (non-zero) result in the seven previous days. Also show the source and CDC that gave this minimum"
?
If so, the following does that:
select hyb_project_key, samp_cnfg_key, sample_result_key, mx,
min(case when result_value > 0 then result_value end) res_value,
min(result_source) keep (
dense_rank first order by case when result_value > 0 then result_value end nulls last
) score,
min(result_source_cdc1) keep (
dense_rank first order by case when result_value > 0 then result_value end
) cdc
from (
select t.*,
max(result_date) over (partition by sample_result_key) mx
from t
)
where mx - result_date <= 7
group by hyb_project_key, samp_cnfg_key, sample_result_key, mx;
HYB_PROJECT_KEY SAMP_CNFG_KEY SAMPLE_RESULT_KEY MX RES_VALUE SCORE CDC
84 114 1,365,970 17-JUN-2014 00:00:00 8 A A
Better is always a bit subjective. But this is far less code!
Obviously the where clause (mx - result_date <= 7) will need adjusting if there are multiple result keys in the data.
Analytics based on Logical offset
Rajeshwaran, Jeyabal, July 13, 2016 - 2:36 pm UTC
Thanks, this helps.
On "Analytics based on Logical offset"
Stew Ashton, January 09, 2017 - 11:03 am UTC
In 12c there is a MATCH_RECOGNIZE solution:
select * from t
match_recognize(
partition by HYB_PROJECT_KEY, SAMP_CNFG_KEY, sample_result_key
order by result_date desc
measures a.RESULT_DATE RESULT_DATE,
ab.RESULT_SOURCE RESULT_SOURCE,
ab.RESULT_SOURCE_CDC1 RESULT_SOURCE_CDC1,
ab.RESULT_VALUE_TYPE RESULT_VALUE_TYPE,
ab.RESULT_VALUE RESULT_VALUE,
ab.IS_ACTIVE IS_ACTIVE
pattern (^a (b|c)*)
subset bc = (b, c), ab = (a, b)
define
b as a.result_value = 0 and result_date >= a.result_date - 7
and result_value = min(bc.result_value),
c as a.result_value = 0 and result_date >= a.result_date - 7
);
January 10, 2017 - 4:04 am UTC
Love your work :-)