
Question and Answer

Tom Kyte

Thanks for the question, Michael.

Asked: October 06, 2003 - 3:41 pm UTC

Answered by: Tom Kyte - Last updated: January 10, 2017 - 4:04 am UTC

Category: Database - Version: 9.2.0.

Viewed 50K+ times!

You Asked


I have a table from a 3rd party application that is used to track
an order through the various manufacturing operations. A subset of
the information looks like this:

ORDER OPN STATION CLOSE_DATE
----- --- ------- ----------
12345 10 RECV 07/01/2003
12345 20 MACH1 07/02/2003
12345 25 MACH1 07/05/2003
12345 30 MACH1 07/11/2003
12345 36 INSP1 07/12/2003
12345 50 MACH1 07/16/2003
12345 90 MACH2 07/30/2003
12345 990 STOCK 08/01/2003

Where each row is a process that the order had to go through,
with OPN being the order of the processes.

What I would like to receive is the output grouped by consecutive
STATION values and include the start and close dates for each
STATION group. The start date is defined as the date the prior
station closed. So the output expected from the above data subset
would be:

ORDER STATION START_DATE CLOSE_DATE
----- ------- ---------- ----------
12345 RECV 07/01/2003
12345 MACH1 07/01/2003 07/11/2003
12345 INSP1 07/11/2003 07/12/2003
12345 MACH1 07/12/2003 07/16/2003
12345 MACH2 07/16/2003 07/30/2003
12345 STOCK 07/30/2003 08/01/2003

Is this possible? I've tried using analytics, but I can't seem to
get what I want. I can use the LAG function to get the start and
close dates, grouped by STATION, but it will group all the different
STATION values together (i.e. all MACH1 STATIONS will be grouped
together), not just the consecutive STATION values. I could use
procedural code to get this answer, but I was wanting to see if
it could be done in 1 statement.

I'm sure it will be something easy, but I've been racking my tiny
brain over this for the last few days and can't come up with a
solution. Can you help?

Many thanks,

Michael T.





and we said...

Analytics rock
Analytics roll

been thinking about writing a book just about analytics (but wait'll you see the SQL Model clause in 10g)

ops$tkyte@ORA920> select order#, station, lag_close_date, close_date
  2    from (
  3  select order#,
  4         lag(station) over (partition by order# order by close_date) lag_station,
  5         lead(station) over (partition by order# order by close_date) lead_station,
  6         station,
  7         close_date,
  8         lag(close_date) over (partition by order# order by close_date) lag_close_date,
  9         lead(close_date) over (partition by order# order by close_date) lead_close_date
 10    from t
 11         )
 12   where lag_station is null
 13      or lead_station is null
 14      or lead_station <> station
 15  /

    ORDER# STATION    LAG_CLOSE_ CLOSE_DATE
---------- ---------- ---------- ----------
     12345 RECV                  07/01/2003
     12345 MACH1      07/05/2003 07/11/2003
     12345 INSP1      07/11/2003 07/12/2003
     12345 MACH1      07/12/2003 07/16/2003
     12345 MACH2      07/16/2003 07/30/2003
     12345 STOCK      07/30/2003 08/01/2003

6 rows selected.



Reviews

Excellent!!!

October 07, 2003 - 5:05 am UTC

Reviewer: A reader

Hi Tom,

'been thinking about writing a book just about analytics' ... please make this book available soon and I am sure it will be yet another gift from you to the Oracle world :)

Wow!!

October 07, 2003 - 7:00 am UTC

Reviewer: Michael T from Dallas, Tx

This is exactly what I needed! Analytics do rock! I just
need to understand them better. If you do decide to write a
book on analytics, it would be at the top of my must have
list. Thanks again!!!

Small correction

October 07, 2003 - 7:33 am UTC

Reviewer: Michael T from Dallas, Tx

After looking at it a little closer it looks like there is
one small error. The start date for the first MACH1 entry
should be the close date of the prior different station. In
this case 07/01/2003. However, by making some small changes
to your query I can get the results I want.

SELECT order#,
       station,
       lag(close_date) over (partition by order# order by close_date) start_date,
       close_date
  FROM (SELECT order#,
               station,
               close_date
          FROM (SELECT order#,
                       lag(station) over (partition by order# order by close_date) lag_station,
                       lead(station) over (partition by order# order by close_date) lead_station,
                       station,
                       close_date
                  FROM t)
         WHERE lead_station <> station
            OR lead_station is null
            OR lag_station is null)

There might be an easier way to construct this query, but
it works great for me. Thanks a lot for your help!



Tom Kyte

Followup  

October 07, 2003 - 8:25 am UTC

sorry about that -- you are right -- when we have "a pair", we want to use lag/lead again to get and keep the right dates.  

So, we want to keep rows that are:

a) the first row in the partition  "where lag_station is null"
b) the last row in the partition "where lead_station is null"
c) the first of a possible pair "where lag_station <> station"
d) the second of a possible pair "where lead_station <> station"

This query does that:

ops$tkyte@ORA920> select order#,
  2         station,
  3         lag_close_date,
  4         close_date,
  5         decode( lead_station, station, 1, 0 ) first_of_pair,
  6         decode( lag_station, station, 1, 0 ) second_of_pair
  7    from (
  8  select order#,
  9         lag(station) over (partition by order# order by close_date)
 10                                                              lag_station,
 11         lead(station) over (partition by order# order by close_date)
 12                                                              lead_station,
 13             station,
 14             close_date,
 15         lag(close_date) over (partition by order# order by close_date)
 16                                                             lag_close_date,
 17         lead(close_date) over (partition by order# order by close_date)
 18                                                             lead_close_date
 19    from t
 20         )
 21   where lag_station is null
 22          or lead_station is null
 23          or lead_station <> station
 24          or lag_station <> station
 25  /
 
ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
 12345 RECV               07/01/2003             0              0
 12345 MACH1   07/01/2003 07/02/2003             1              0
 12345 MACH1   07/05/2003 07/11/2003             0              1
 12345 INSP1   07/11/2003 07/12/2003             0              0
 12345 MACH1   07/12/2003 07/16/2003             0              0
 12345 MACH2   07/16/2003 07/30/2003             0              0
 12345 STOCK   07/30/2003 08/01/2003             0              0
 
7 rows selected.
 

<b>we can see with the 1's the first/second of a pair in there.  All we need to do now is "reach forward" for the first of a pair and grab the close date from the next record:</b>

ops$tkyte@ORA920> select order#,
  2         station,
  3         lag_close_date,
  4         close_date
  5    from (
  6  select order#,
  7         station,
  8         lag_close_date,
  9         decode( lead_station,
 10                 station,
 11                 lead(close_date) over (partition by order# order by close_date),
 12                 close_date ) close_date,
 13         decode( lead_station, station, 1, 0 ) first_of_pair,
 14         decode( lag_station, station, 1, 0 ) second_of_pair
 15    from (
 16  select order#,
 17         lag(station) over (partition by order# order by close_date)
 18                                                              lag_station,
 19         lead(station) over (partition by order# order by close_date)
 20                                                              lead_station,
 21             station,
 22             close_date,
 23         lag(close_date) over (partition by order# order by close_date)
 24                                                             lag_close_date,
 25         lead(close_date) over (partition by order# order by close_date)
 26                                                             lead_close_date
 27    from t
 28         )
 29   where lag_station is null
 30          or lead_station is null
 31          or lead_station <> station
 32          or lag_station <> station
 33         )
 34   where second_of_pair <> 1
 35  /
 
ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
 12345 RECV               07/01/2003
 12345 MACH1   07/01/2003 07/11/2003
 12345 INSP1   07/11/2003 07/12/2003
 12345 MACH1   07/12/2003 07/16/2003
 12345 MACH2   07/16/2003 07/30/2003
 12345 STOCK   07/30/2003 08/01/2003
 
6 rows selected.


<b>and discard the second of pairs row</b>


That is another way to do it (and an insight into how I develop analytic queries -- adding extra columns like that just to see visually what I want to do) 
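Tom's lag/lead approach can also be expressed with the "gaps and islands" trick: the difference of two ROW_NUMBER()s is constant within each run of consecutive STATION values, so one GROUP BY collapses each run. A minimal sketch of that idea, run here through SQLite (3.25+ window functions) via Python so it is executable anywhere; table and column names follow the question, and the same SELECT should work on Oracle apart from identifier quoting.

```python
import sqlite3

# "Gaps and islands": within a run of consecutive identical stations, the
# two row_number() values advance in lockstep, so their difference (grp)
# is constant for the run and changes when the station changes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t ("order#" INTEGER, opn INTEGER, station TEXT, close_date TEXT);
INSERT INTO t VALUES
 (12345, 10,  'RECV',  '2003-07-01'),
 (12345, 20,  'MACH1', '2003-07-02'),
 (12345, 25,  'MACH1', '2003-07-05'),
 (12345, 30,  'MACH1', '2003-07-11'),
 (12345, 36,  'INSP1', '2003-07-12'),
 (12345, 50,  'MACH1', '2003-07-16'),
 (12345, 90,  'MACH2', '2003-07-30'),
 (12345, 990, 'STOCK', '2003-08-01');
""")

rows = conn.execute("""
SELECT "order#", station,
       lag(close_date) OVER (PARTITION BY "order#" ORDER BY close_date)
         AS start_date,                 -- prior group's close = our start
       close_date
FROM (
  SELECT "order#", station, max(close_date) AS close_date
  FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY "order#" ORDER BY opn)
         - row_number() OVER (PARTITION BY "order#", station ORDER BY opn) AS grp
    FROM t
  )
  GROUP BY "order#", station, grp      -- one row per consecutive run
)
ORDER BY close_date
""").fetchall()
for r in rows:
    print(r)
```

This produces the six grouped rows from the question in one pass, without the pair-handling second step.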

another good book on the list please go ahead on this one too

October 07, 2003 - 8:58 am UTC

Reviewer: Vijay Sehgal from India

Best Regards,
Vijay Sehgal

Very useful

October 07, 2003 - 12:05 pm UTC

Reviewer: Michael T. from Dallas, Tx

Excellent, as always!

Can we reach to the end of the group?

December 15, 2003 - 11:28 am UTC

Reviewer: Steve from UK

For example, say our analytic query returns the following result:

master_record  sub_record  nxt_record
-------------  ----------  ----------
     95845433    25860032    95118740
     95118740    25860032    95837497
     95837497    25860032

What I'd like to do is grab the final master_record, 95837497, and have that populated in the final column. There could be 2, 3 or more in each group.

Tom Kyte

Followup  

December 15, 2003 - 3:45 pm UTC

so the nxt_record of the last record should be the master_record of that row?

then just select


nvl( lead(master_record) over (....), master_record ) nxt_record


when the lead is NULL, return the master_record of the current row

Almost....

December 15, 2003 - 5:52 pm UTC

Reviewer: Steve from UK

but I didn't explain it well enough. What I'd like to see is a result set that looks like:

master_record  sub_record  nxt_record
-------------  ----------  ----------
     95845433    25860032    95837497
     95118740    25860032    95837497
     95837497    25860032    95837497

The data comes from this:

table activity

cllocn      moddate
--------    ----------
25860032    18/06/2003
95118740    26/08/2003
95837497    15/12/2003
95845433    19/08/2003

table ext_dedupe

master_cllocn  dupe_cllocn
-------------  -----------
     25860032     95118740
     25860032     95837497
     25860032     95845433

My query is:

select *
  from ( select master_record, sub_record,
                lead(master_record) over (partition by sub_record
                                          order by lst_activity asc) nxt_activity
           from ( select *
                    from (select case when dupelast_ackdate > last_ackdate then dupe_cllocn
                                      when last_ackdate > dupelast_ackdate then master_cllocn
                                      else master_cllocn
                                 end master_record,
                                 greatest(last_ackdate, dupelast_ackdate) lst_activity,
                                 case when dupelast_ackdate > last_ackdate then master_cllocn
                                      when last_ackdate > dupelast_ackdate then dupe_cllocn
                                      else dupe_cllocn
                                 end sub_record
                            from (select master_cllocn,
                                         (select max(moddate) from activity a
                                           where a.cllocn = ed.master_cllocn) last_ackdate,
                                         dupe_cllocn,
                                         (select max(moddate) from activity a
                                           where a.cllocn = ed.dupe_cllocn) dupelast_ackdate
                                    from ext_dedupe ed))))

Am I on the right track or is there a simpler way to this?

Thanks

Tom Kyte

Followup  

December 16, 2003 - 6:50 am UTC

can you explain in "just text" how you got from your inputs to your outputs.

it is not clear (and i didn't feel like parsing that sql to reverse engineer what it does)



Is this what you are looking for ?

December 15, 2003 - 6:44 pm UTC

Reviewer: Venkat from Detroit, MI USA

select master, sub, moddate,
       min(master) keep (dense_rank first order by moddate) over (partition by sub) first_in_list,
       max(master) keep (dense_rank last order by moddate) over (partition by sub) last_in_list
  from (select master, sub, moddate
          from (select 95845433 master, 25860032 sub, to_date('19-aug-03','dd-mon-yy') moddate from dual union all
                select 95118740, 25860032, to_date('26-aug-03','dd-mon-yy') from dual union all
                select 95837497, 25860032, to_date('15-dec-03','dd-mon-yy') from dual))

  MASTER       SUB  MODDATE     FIRST_IN_LIST  LAST_IN_LIST
--------  --------  ----------  -------------  ------------
95845433  25860032  8/19/2003        95845433      95837497
95118740  25860032  8/26/2003        95845433      95837497
95837497  25860032  12/15/2003       95845433      95837497


Tom's Book

December 16, 2003 - 4:13 am UTC

Reviewer: umesh from blore india

Tom,
Do not announce it until you are finished with the book .. when you talk of a book, we can't wait until we have it here.
An analytics book! That must be real good.

Is it possible to get the same result in standard edition ?

December 16, 2003 - 4:21 am UTC

Reviewer: Ninoslav from croatia

Hi Tom,
yes, analytic functions are great. However, we can use them only in the enterprise edition of the database. We have a few small customers that want only a standard edition.
So, for this question, is it possible to get the same result without analytic functions?
It would be nice to have some kind of mapping between analytics and 'standard' queries. But that is probably impossible...

Tom Kyte

Followup  

December 16, 2003 - 7:27 am UTC

Oracle 9iR2 and up -- analytics are a feature of standard edition.

there are things you can do in analytics that are quite simply NOT PRACTICAL in any sense without them.


ok

December 16, 2003 - 8:41 am UTC

Reviewer: Steve from uk

I have two tables - activity and ext_dedupe.

table activity
cllocn moddate
25860032 18/06/2003
95118740 26/08/2003
95837497 15/12/2003
95845433 19/08/2003

table ext_dedupe
master_cllocn dupe_cllocn
25860032 95118740
25860032 95837497
25860032 95845433

Ext_dedupe is a table created by a third party app which has identified duplicate records within our database. The first column is supposed to be the master and the second the duplicate. The idea is to mark as archived all our duplicate records with a pointer to the master. Notwithstanding the order of the columns, what we want to do is find out which record has the most recent activity (from the activity table) and archive off the others.

So, in this example, although the master is listed as 25860032 against the other 3, an examination of the activity dates means I want to keep 95837497 and mark the others as archived, with a pointer on each of them to 95837497. That's why I thought that if I could get to the following result it would make it simpler.

master_record sub_record nxt_record
95845433 25860032 95837497
95118740 25860032 95837497
95837497 25860032 95837497

Hope that makes sense!

Tom Kyte

Followup  

December 16, 2003 - 11:33 am UTC

oh, then nxt_record is just

last_value(master_record) over (partition by sub_record order by moddate)




Why...

December 16, 2003 - 1:31 pm UTC

Reviewer: Steve from UK

it didn't work for me. I had to change it to

first_value(master_record) over (partition by sub_record order by moddate desc)

Is there a reason for that?

Tom Kyte

Followup  

December 16, 2003 - 2:00 pm UTC

doh, the default window clause is "range between unbounded preceding and current row"

i would have needed a window clause that looks forwards rather than backwards (reason #1 why I should always set up a test case instead of just answering on the fly)

your solution of reversing the data works just fine.
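The frame gotcha above is easy to see side by side: with an ORDER BY, the default frame stops at the current row, so last_value() just echoes the current row's value; extending the frame to UNBOUNDED FOLLOWING (or reversing the sort and using first_value(), as Steve did) gives the intended answer. A small sketch using the data from Steve's post (SQLite 3.25+ via Python; the OVER clauses are the same in Oracle):

```python
import sqlite3

# Compare last_value() under the default frame vs an explicit full frame.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (sub INTEGER, master INTEGER, moddate TEXT)")
conn.executemany("INSERT INTO d VALUES (?,?,?)",
                 [(25860032, 95845433, '2003-08-19'),
                  (25860032, 95118740, '2003-08-26'),
                  (25860032, 95837497, '2003-12-15')])

rows = conn.execute("""
SELECT master,
       last_value(master) OVER (PARTITION BY sub ORDER BY moddate)
         AS default_frame,            -- frame ends at the current row
       last_value(master) OVER (PARTITION BY sub ORDER BY moddate
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
         AS full_frame                -- frame covers the whole partition
FROM d ORDER BY moddate
""").fetchall()
for r in rows:
    print(r)
```

The default-frame column returns each row's own master; the full-frame column returns 95837497 on every row, which is the result Steve wanted.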

Another solution

December 16, 2003 - 4:03 pm UTC

Reviewer: A reader

The following gives the same result ...

select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
from activity, ext_dedupe where cllocn = dupe_cllocn

MASTER_RECORD SUB_RECORD NXT_RECORD
95118740 25860032 95837497
95837497 25860032 95837497
95845433 25860032 95837497


Tom Kyte

Followup  

December 16, 2003 - 5:44 pm UTC

yes, there are many many ways to do this.

first_value
last_value

substring of max() without keep

sure.

December 16, 2003 - 4:15 pm UTC

Reviewer: A reader

Actually the nvl(master_cllocn...) is required only if you need all 4 rows in the output, as follows (there is an outer join involved). If you need only the 3 rows shown in the above post, there is no need for the nvl's....

select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
, max(cllocn) keep (dense_rank last order by moddate)
over (partition by nvl(master_cllocn,cllocn)) nxt_record
, last_value(cllocn) over (partition by nvl(master_cllocn,cllocn) order by moddate) nxt
from activity, ext_dedupe where cllocn = dupe_cllocn (+)

MASTER_RECORD  SUB_RECORD  NXT_RECORD
     25860032    25860032    95837497
     95118740    25860032    95837497
     95837497    25860032    95837497
     95845433    25860032    95837497

still q's on analytics

January 30, 2004 - 10:13 am UTC

Reviewer: A reader from Madison, wi

Okay, so my web application logs "web transaction" statistics to a table.  This actually amounts to 0 to many database transactions... but anyway... I need to summarize (sum, min, max, count, average) each day's transaction times for each class (name2) and action (name3) and ultimately "archive" this data to a history table.  I am running 8.1.7 and am pretty new to analytics.

My table looks like this:

SQL> desc tran_stats
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                      NOT NULL NUMBER(9)
 NAME1                            VARCHAR2(100)
 NAME2                            VARCHAR2(100)
 NAME3                            VARCHAR2(100)
 NAME4                            VARCHAR2(100)
 SEC                     NOT NULL NUMBER(9,3)
 TS_CR                   NOT NULL DATE

        ID NAME1 NAME2                     NAME3         SEC NAME4 TS_CR
---------- ----- ------------------------- ---------- ------ ----- ---------
     35947       /CM01_PersonManagement    CREATE       .484       15-JAN-04
     35987       /CM01_PersonManagement    CREATE       .031       15-JAN-04
     36086       /CM01_PersonManagement    EDIT         .312       16-JAN-04
     36555       /CM01_PersonManagement    CREATE       .297       19-JAN-04
     36623       /CM01_PersonManagement    EDIT         .375       19-JAN-04
     36627       /CM01_PersonManagement    CREATE       .047       19-JAN-04
     36756       /CM01_AddressManagement   CREATE       .375       20-JAN-04
     36766       /CM01_AddressManagement   CREATE       .305       20-JAN-04
     36757       /CM01_AddressManagement   INSERT       .391       20-JAN-04
     37178       /CM01_PersonManagement    EDIT         .203       20-JAN-04

and I need output like this:

TS_CR     NAME2                     NAME3       M_SUM  M_MIN  M_MAX M_COUNT  M_AVG
--------- ------------------------- ---------- ------ ------ ------ ------- ------
20-JAN-04 /CM01_AddressManagement   CREATE       .680   .305   .375       2   .340
20-JAN-04 /CM01_AddressManagement   INSERT       .391   .391   .391       1   .391
20-JAN-04 /CM01_PersonManagement    EDIT         .203   .203   .203       1   .203
19-JAN-04 /CM01_PersonManagement    CREATE       .344   .047   .297       2   .172
19-JAN-04 /CM01_PersonManagement    EDIT         .375   .375   .375       1   .375
16-JAN-04 /CM01_PersonManagement    EDIT         .312   .312   .312       1   .312
15-JAN-04 /CM01_PersonManagement    CREATE       .515   .031   .484       2   .258


This seems to work, but there has to be a better/cleaner/more efficient way to do this:

select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
from (
select  trunc(ts_cr) ts_cr,id, name2, name3, sum(sec) m_dummy
 , min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min
 , max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max
 , round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg
 , count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count
 , sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
        from tran_stats group by name2, name3,trunc(ts_cr),id
)n order by 1 desc, 2, 3;


Any help or pointers would be appreciated.  Thanks in advance.



 

Tom Kyte

Followup  

January 30, 2004 - 10:31 am UTC

why does there "have to be"?

what is "unclean" about this? I could make it more verbose (and perhaps more readable) but this does exactly what you ask for?

It seems pretty "good", very "clean" and probably the most efficient method to get this result?

Regarding the previous post ...

January 30, 2004 - 11:45 am UTC

Reviewer: A reader

Am I missing something or will the following do the same ..

select trunc(ts_cr) ts_cr, name2, name3,
count(*) m_count, min(sec) m_min, max(sec) m_max,
sum(sec) m_sum, avg(sec) m_avg
from tran_stats
group by trunc(ts_cr), name2, name3
order by 1 desc, 2, 3

Tom Kyte

Followup  

January 30, 2004 - 7:43 pm UTC

with the supplied data -- since "group by trunc(ts_cr), name2, name3" happened to be unique

yes.

In general -- no.  consider:

ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
  2  from ( select trunc(ts_cr) ts_cr,
  3                id,
  4                            name2,
  5                            name3,
  6                            sum(sec) m_dummy ,
  7                            min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
  8                            max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
  9                            round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
 10                            count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
 11                            sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
 12               from tran_stats
 13                  group by name2, name3,trunc(ts_cr),id
 14        )n
 15  MINUS
 16  select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
 17  from (
 18  select trunc(ts_cr) ts_cr, name2, name3,
 19         count(*) m_count, min(sec) m_min, max(sec) m_max,
 20                    sum(sec) m_sum, avg(sec) m_avg
 21                            from tran_stats
 22                            group by trunc(ts_cr), name2, name3 )
 23  /
 
no rows selected
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tran_stats
  2  select 35947,'/CM01_PersonManagement','CREATE', .484  ,'15-JAN-04'
  3   from all_users where rownum <= 5;
 
5 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
  2  from ( select trunc(ts_cr) ts_cr,
  3                id,
  4                            name2,
  5                            name3,
  6                            sum(sec) m_dummy ,
  7                            min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
  8                            max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
  9                            round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
 10                            count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
 11                            sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
 12               from tran_stats
 13                  group by name2, name3,trunc(ts_cr),id
 14        )n
 15  MINUS
 16  select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
 17  from (
 18  select trunc(ts_cr) ts_cr, name2, name3,
 19         count(*) m_count, min(sec) m_min, max(sec) m_max,
 20                    sum(sec) m_sum, avg(sec) m_avg
 21                            from tran_stats
 22                            group by trunc(ts_cr), name2, name3 )
 23  /
 
TS_CR     NAME2                   NAME3         M_SUM      M_MIN      M_MAX    M_COUNT      M_AVG
--------- ----------------------- -------- ---------- ---------- ---------- ---------- ----------
15-JAN-04 /CM01_PersonManagement  CREATE        2.935       .031      2.904          2     1.4675



add more data and it won't be the same. 

OK

January 31, 2004 - 9:05 am UTC

Reviewer: Siva from Bangalore,India

Dear Tom,
Can analytics be used for the following formats of the same query
sql>select ename,nvl(ename,'Name is null') from emp
sql>select ename,decode(ename,null,'Name is null',ename)
from emp
If you know other ways,Please let me know
Bye!


Tom Kyte

Followup  

January 31, 2004 - 10:03 am UTC

umm, why ?




with analytics

February 18, 2004 - 7:30 am UTC

Reviewer: A reader

with the following data

-- ------
1 val1_1
1 val1_2
1 val1_3
2 val2_1
2 val2_2

can i produce

-- ------ --------------------
1 val1_1 val1_1,val1_2,val1_3
1 val1_2 val1_1,val1_2,val1_3
1 val1_3 val1_1,val1_2,val1_3
2 val2_1 val2_1,val2_2
2 val2_2 val2_1,val2_2

with an analytic that rocks


Tom Kyte

Followup  

February 18, 2004 - 8:47 pm UTC

if

select max(count(*)) from t group by id

has a reasonable maximum -- yes, but it would be a tricky lag/lead thing.

I would probably join using stragg. join the details to the aggregate using inline views.
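stragg was Tom's user-defined string-aggregate function; later Oracle releases provide LISTAGG for the same job. As a sketch of the requested output (every detail row carrying the full list for its group), here is the idea in SQLite 3.25+ via Python, where any aggregate, including group_concat(), can be used as a window function. Note the concatenation order inside the list is not guaranteed without extra work:

```python
import sqlite3

# Each row gets the comma-separated list of ALL values in its id group,
# by using the group_concat() aggregate as a window function over the
# partition (no ORDER BY in OVER => the frame is the whole partition).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO t2 VALUES (?,?)",
                 [(1, 'val1_1'), (1, 'val1_2'), (1, 'val1_3'),
                  (2, 'val2_1'), (2, 'val2_2')])

rows = conn.execute("""
SELECT id, val,
       group_concat(val, ',') OVER (PARTITION BY id) AS all_vals
FROM t2 ORDER BY id, val
""").fetchall()
for r in rows:
    print(r)
```

On Oracle the equivalent is LISTAGG(val, ',') WITHIN GROUP (ORDER BY val) OVER (PARTITION BY id), which also pins the order of the list.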

OK

March 01, 2004 - 9:26 am UTC

Reviewer: Siddiq from UAE

Hi Tom,
What can be the business use cases of the analytic functions
1) cume_dist
2) percentile_disc
3) percentile_cont
Where can they be of immense use?
Bye!

Tom Kyte

Followup  

March 01, 2004 - 10:17 am UTC

they are just statistical functions for analysis.

2 and 3 are really variations on each other (disc = discrete, cont = continuous) and would be used to compute percentiles (like you might see on an SAT test report from back in high school). percentile_* can be used to find a median for example :)

cume_dist is a variation on that. I'll cheat on an example, from the doc:

Analytic Example

The following example calculates the salary percentile for each employee in the purchasing area. For example, 40% of clerks have salaries less than or equal to Himuro.

SELECT job_id, last_name, salary, CUME_DIST() OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist FROM employees WHERE job_id LIKE 'PU%';

JOB_ID     LAST_NAME                     SALARY  CUME_DIST
---------- ------------------------- ---------- ----------
PU_CLERK   Colmenares                      2500         .2
PU_CLERK   Himuro                          2600         .4
PU_CLERK   Tobias                          2800         .6
PU_CLERK   Baida                           2900         .8
PU_CLERK   Khoo                            3100          1
PU_MAN     Raphaely                       11000          1
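The cume_dist() behaviour in the doc example is easy to verify: it returns the fraction of rows with a value less than or equal to the current row's. A quick sketch with the same clerk salaries (SQLite 3.25+ via Python; the OVER clause is identical in Oracle):

```python
import sqlite3

# cume_dist() = (rows with salary <= current salary) / (total rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp2 (last_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp2 VALUES (?,?)",
                 [('Colmenares', 2500), ('Himuro', 2600),
                  ('Tobias', 2800), ('Baida', 2900), ('Khoo', 3100)])

rows = conn.execute("""
SELECT last_name, salary,
       cume_dist() OVER (ORDER BY salary) AS cd
FROM emp2 ORDER BY salary
""").fetchall()
for r in rows:
    print(r)   # 40% of clerks earn <= Himuro's 2600, hence cd = 0.4 there
```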

Stumped on Analytics

March 04, 2004 - 9:56 am UTC

Reviewer: Dave Thompson from West Yorkshire, England.

Hi Tom,

I have the following two tables:

CREATE TABLE PAY_M
(
PAY_ID NUMBER,
PAYMENT NUMBER
)

--
--

CREATE TABLE PREM
(
PREM_ID NUMBER,
PREM_PAYMENT NUMBER
)

With the following data:

INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
1, 100);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
2, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
3, 50);
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES (
4, 50);
COMMIT;
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
1, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
2, 25);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
3, 50);
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES (
4, 50);
COMMIT;

PAY_M contains payments made against the premiums in the table prem.

Payments:

    PAY_ID    PAYMENT
---------- ----------
         1         50
         2         25
         3         50
         4         50

Prem:

   PREM_ID PREM_PAYMENT
---------- ------------
         1          100
         2           50
         3           50
         4           50

We are trying to find which payment IDs paid each premium payment in prem. The payments are assigned sequentially to the premiums.

For example, payments 1, 2 & 3 pay off the £100 in premium 1, leaving £25. Then the £25 remaining from payment 3 plus payment 4 pay off premium 2, leaving a balance of £25, and so on.

We are trying to create a query that will use the analytic functions to find all the payment IDs that pay off the associated premium IDs. We want to keep this SQL-based as we need to process about 30 million payments!

Thanks.

Great website, hope you enjoyed your recent visit to the UK.

Tom Kyte

Followup  

March 04, 2004 - 1:52 pm UTC

let me make sure I have this straight -- you want to

o sum up the first 3 records in payments
o discover they are 125 which exceeds 100
o output the fact that prem_id 1 is paid for by pay_id 1..3
o carry forward 25 from 3, discover that leftover 3+4 = 75 pays for prem_id 2
with 25 extra

while I believe (not sure) that the 10g MODEL clause might be able to do this (if you can do it in a spreadsheet, we can use the MODEL clause to do it).....

I'm pretty certain that analytics cannot -- we would need to recursively use lag (eg: after finding that 1,2,3 pay off 1, we'd need to -- well, it's hard to explain...)

I cannot see analytics doing this -- future rows depend on functions of the analytics from past rows and that is just "not allowed".


I can see how to do this in a pipelined PLSQL function -- will that work for you?
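The carry-forward logic the pipelined function would implement can be sketched procedurally. This is purely an illustration of the allocation rule from the question, not Tom's PL/SQL; the allocate() helper and its data literals are hypothetical. It walks payments in order, letting any surplus from the last payment roll into the next premium:

```python
# Data from Dave's post: (prem_id, prem_payment) and (pay_id, payment).
premiums = [(1, 100), (2, 50), (3, 50), (4, 50)]
payments = [(1, 50), (2, 25), (3, 50), (4, 50)]

def allocate(premiums, payments):
    """Yield (prem_id, [pay_ids]) showing which payments settle each premium."""
    pay_iter = iter(payments)
    available = 0          # money carried forward from earlier payments
    used = []              # pay_ids contributing to the current premium
    for prem_id, due in premiums:
        while available < due:
            nxt = next(pay_iter, None)
            if nxt is None:
                return     # out of payments; remaining premiums unpaid
            pay_id, amount = nxt
            available += amount
            used.append(pay_id)
        available -= due
        yield prem_id, list(used)
        # a payment with money left over also contributes to the next premium
        used = [used[-1]] if available > 0 else []

result = list(allocate(premiums, payments))
for prem_id, pay_ids in result:
    print(prem_id, pay_ids)
```

With the sample data this yields premium 1 paid by payments 1-3 and premium 2 by the remainder of 3 plus 4, exactly the narrative in the question; premiums 3 and 4 stay unpaid because the money runs out. Each output row depends on state carried from prior rows, which is why Tom says plain analytics cannot express it.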

Oops - Error in previous post

March 04, 2004 - 10:17 am UTC

Reviewer: Dave Thompson from West Yorkshire, England

Tom,

Sorry, ignore the above tables as they are missing the joining column:

CREATE TABLE PAY_M
(
  PREM_ID  NUMBER,
  PAY_ID   NUMBER,
  PAYMENT  NUMBER
)

INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 1, 50); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 2, 25); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 3, 50); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 4, 50); 
COMMIT;

CREATE TABLE PREM
(
  PREM_ID       NUMBER,
  PAY_ID        NUMBER,
  PREM_PAYMENT  NUMBER
)

INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 1, 100); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 2, 50); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 3, 50); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 4, 50); 
COMMIT;


SQL> l
  1  SELECT *
  2* FROM   PAY_M
SQL> /

   PREM_ID     PAY_ID    PAYMENT
---------- ---------- ----------
         1          1         50
         1          2         25
         1          3         50
         1          4         50

SQL> select *
  2  from prem;

   PREM_ID     PAY_ID PREM_PAYMENT
---------- ---------- ------------
         1          1          100
         1          2           50
         1          3           50
         1          4           50

 

Thanks.....

March 05, 2004 - 4:23 am UTC

Reviewer: Dave Thompson from West Yorkshire, England.

Tom,

Thanks for your prompt response.

I am familiar with Pipeline functions.

I was, however, hoping we could do this as a set-based operation because of the volume of data involved.

Thanks for your time.

analytics book

March 05, 2004 - 5:52 am UTC

Reviewer: Ron Chennells from UK

Just another vote and pre order for the analytics book

OK

March 19, 2004 - 12:33 am UTC

Reviewer: Gerhard from Dusseldorf,Germany

Dear Tom,
 I used the following query to find the difference of salaries between employees.

SQL> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal from emp;

ENAME             SAL   DIFF_SAL                                                
---------- ---------- ----------                                                
SMITH             800                                                           
JAMES             950        150                                                
ADAMS            1100        150                                                
WARD             1250        150                                                
MARTIN           1250          0                                                
MILLER           1300         50                                                
TURNER           1500        200                                                
ALLEN            1600        100                                                
CLARK            2450        850                                                
BLAKE            2850        400                                                
JONES            2975        125                                                

ENAME             SAL   DIFF_SAL                                                
---------- ---------- ----------                                                
SCOTT            3000         25                                                
FORD             3000          0                                                
KING             5000       2000                                                

14 rows selected.

My question is: what is the difference between KING's sal and that of each of the other
employees? Could you please help with the query?
Bye! 

Tom Kyte

Followup  

March 19, 2004 - 8:58 am UTC

scott@ORA9IR2> select ename, sal, sal-lag(sal) over (order by sal) as diff_sal,
  2         sal - king_sal king_sal_diff
  3    from (select sal king_sal from emp where ename = 'KING'),
  4         emp
  5  /

ENAME             SAL   DIFF_SAL KING_SAL_DIFF
---------- ---------- ---------- -------------
SMITH             800                    -4200
JAMES             950        150         -4050
ADAMS            1100        150         -3900
WARD             1250        150         -3750
MARTIN           1250          0         -3750
MILLER           1300         50         -3700
TURNER           1500        200         -3500
ALLEN            1600        100         -3400
CLARK            2450        850         -2550
BLAKE            2850        400         -2150
JONES            2975        125         -2025
SCOTT            3000         25         -2000
FORD             3000          0         -2000
KING             5000       2000             0

14 rows selected.
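The same two computations -- LAG ordered by salary, plus a scalar "difference from KING" -- can be sketched outside the database. This Python fragment is an illustration only (it mimics the EMP sample rows above, truncated here for brevity), not Oracle code:

```python
# Illustration of LAG(sal) OVER (ORDER BY sal) and sal - king_sal.
# Sample data mimics a subset of the EMP rows shown above.
rows = [("SMITH", 800), ("JAMES", 950), ("ADAMS", 1100),
        ("WARD", 1250), ("MARTIN", 1250), ("MILLER", 1300),
        ("KING", 5000)]

rows.sort(key=lambda r: r[1])           # ORDER BY sal
king_sal = next(s for n, s in rows if n == "KING")

result = []
prev = None                             # LAG returns NULL for the first row
for name, sal in rows:
    diff_sal = None if prev is None else sal - prev
    result.append((name, sal, diff_sal, sal - king_sal))
    prev = sal
```

As in the SQL, the first row's diff is NULL (None), and KING's own difference is 0.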


Will this be faster?

March 19, 2004 - 4:20 pm UTC

Reviewer: Venkat from Detroit

select ename, sal,
       sal - lag(sal) over (order by sal) as diff_sal,
       sal - max(case when ename = 'KING' then sal else null end) over () king_sal_diff
  from emp


Tom Kyte

Followup  

March 20, 2004 - 9:47 am UTC

when you benchmarked it and tested it to scale, what did you see? it would be interesting no?

lead/lag on different dataset

May 03, 2004 - 9:22 pm UTC

Reviewer: Stalin from CA, USA

Hi Tom,

I have a similar requirement, but I'm not sure how to use LEAD or LAG to refer to a different set of rows.

E.g. the logs table holds both login and logout records, identified by the action column. There are several login/logout modes: rows with action in (1,2) are logins and rows with action in (3,4,5,6,7) are logouts. I need to find sign-on and sign-off times, and the session duration in minutes.

Here is some sample data from the logs table:

    LOG_ID LOG_CREATION_DATE      USER_ID    SERVICE     ACTION
---------- ------------------- ---------- ---------- ----------
         1 04/29/2004 10:48:36          3          5          2
         3 04/29/2004 10:53:44          3          5          3
         5 04/29/2004 11:11:35          3          5          1
      1003 05/03/2004 15:18:53          3          5          5
      1004 05/03/2004 15:19:50          8          5          1

Here is a query I came up with (not exactly what I want):

select log_id signon_id,
       lead(log_id, 1) over (partition by account_id, user_id, mac order by log_id) signoff_id,
       user_id,
       log_creation_date signon_date,
       lead(log_creation_date, 1) over (partition by account_id, user_id, mac order by log_creation_date) signoff_date,
       nvl(round(((lead(log_creation_date, 1)
                    over (partition by account_id, user_id order by log_creation_date)
                   - log_creation_date) * 1440), 2), 0) usage_mins
  from logs
 where account_id = 'Robert'
   and service = 5
 order by user_id

desired output :

 SIGNON_ID SIGNOFF_ID    USER_ID SIGNON_DATE         SIGNOFF_DATE        USAGE_MINS
---------- ---------- ---------- ------------------- ------------------- ----------
         1          3          3 04/29/2004 10:48:36 04/29/2004 10:53:44       5.13
         5       1003          3 04/29/2004 11:11:35 05/03/2004 15:18:53     6007.3
      1004                     8 05/03/2004 15:19:50                              0

Thanks in Advance,
Stalin


Tom Kyte

Followup  

May 04, 2004 - 7:11 am UTC

maybe if you supply simple create table and insert ... values ... statements for me.... this stuff would go faster.

Your query references columns that are not in the example as well.

Create table scripts

May 04, 2004 - 1:29 pm UTC

Reviewer: Stalin from CA, USA

Sorry for not giving this info in the first place.

here goes the scripts....

create table logs (log_id number, log_creation_date date, account_id varchar2(25), user_id number,
service number, action number, mac varchar2(50))
/

insert into logs values (1, to_date('04/29/2004 10:48:36', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 2, '00-00-00-00')
/
insert into logs values (3, to_date('04/29/2004 10:53:44', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 3, '00-00-00-00')
/
insert into logs values (5, to_date('04/29/2004 11:11:35', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 1, '00-00-00-00')
/
insert into logs values (1003, to_date('05/03/2004 15:18:53', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 5, '00-00-00-00')
/
insert into logs values (1004, to_date('05/03/2004 15:19:50', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 8, 5, 1, '00-00-00-00')
/

The reason for including mac in the partition group is that users can log in from multiple PCs without logging out, hence I grouped on account_id, user_id and mac.

Thanks,
Stalin

Tom Kyte

Followup  

May 04, 2004 - 2:38 pm UTC

ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 ) minutes
  2    from (
  3  select log_id,
  4         case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
  5              then lead(log_id) over (partition by account_id, user_id, mac order by log_creation_date)
  6          end signoff_id,
  7         user_id,
  8         log_creation_date signon_date,
  9         case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
 10              then lead(log_creation_date) over (partition by account_id, user_id, mac order by log_creation_date)
 11          end signoff_date,
 12                  action
 13  from   logs
 14  where  account_id = 'Robert'
 15  and    service = 5
 16  order  by user_id
 17         ) a
 18   where action in (1,2)
 19  /
 
    LOG_ID SIGNOFF_ID    USER_ID SIGNON_DATE         SIGNOFF_DATE            ACTION    MINUTES
---------- ---------- ---------- ------------------- ------------------- ---------- ----------
         1          3          3 04/29/2004 10:48:36 04/29/2004 10:53:44          2       5.13
         5       1003          3 04/29/2004 11:11:35 05/03/2004 15:18:53          1     6007.3
      1004                     8 05/03/2004 15:19:50                              1
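Tom's LEAD-based pairing can also be sketched procedurally. This Python fragment is an illustration only (it mimics the LOGS sample data): within each (account, user) partition it walks the rows in timestamp order and pairs each login (action 1 or 2) with the next row when that row is a logout (action 3..7):

```python
from collections import defaultdict
from datetime import datetime

# (log_id, timestamp, account, user_id, action) -- mimics the LOGS sample data
logs = [
    (1,    datetime(2004, 4, 29, 10, 48, 36), "Robert", 3, 2),
    (3,    datetime(2004, 4, 29, 10, 53, 44), "Robert", 3, 3),
    (5,    datetime(2004, 4, 29, 11, 11, 35), "Robert", 3, 1),
    (1003, datetime(2004, 5, 3, 15, 18, 53),  "Robert", 3, 5),
    (1004, datetime(2004, 5, 3, 15, 19, 50),  "Robert", 8, 1),
]

LOGIN, LOGOUT = {1, 2}, {3, 4, 5, 6, 7}

# PARTITION BY account, user_id; ORDER BY timestamp within each partition
parts = defaultdict(list)
for row in logs:
    parts[(row[2], row[3])].append(row)

sessions = []
for rows in parts.values():
    rows.sort(key=lambda r: r[1])
    for i, (log_id, ts, acct, uid, action) in enumerate(rows):
        if action not in LOGIN:
            continue                    # keep only login rows, like the outer filter
        nxt = rows[i + 1] if i + 1 < len(rows) else None   # LEAD(...)
        if nxt is not None and nxt[4] in LOGOUT:
            minutes = round((nxt[1] - ts).total_seconds() / 60, 2)
            sessions.append((log_id, nxt[0], uid, minutes))
        else:
            sessions.append((log_id, None, uid, None))     # dangling login

sessions.sort()
```

This reproduces the 5.13 and 6007.3 minute sessions above, with the last login left open.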
 

Excellent

May 04, 2004 - 3:42 pm UTC

Reviewer: Stalin from CA, USA

This is exactly what i'm looking for.

Thanks so much!

Help On SQL

May 04, 2004 - 8:05 pm UTC

Reviewer: VKOUL from Lacey, WA

I want to carry a column's last non-null value down into the following null rows. e.g.

If I have records like the following

year  month column_value
----- ----- ------------
2002  06              55
2002  06              57
2002  07            NULL
2002  08            NULL
2002  09            NULL
2002  10             100
2002  11             101

I want the results as below

year  month column_value
----- ----- ------------
2002  06              55
2002  06              57
2002  07              57 ------> Repeated
2002  08              57 ------> Repeated
2002  09              57 ------> Repeated
2002  10             100
2002  11             101


Tom Kyte

Followup  

May 04, 2004 - 9:08 pm UTC

create table,
insert into table

much appreciated......... (so I don't spend days of my life writing create table and insert into statements. I've added this request to all pages where you can input stuff and I'll just be asking for it from now on...... Not picking on you, just reminding everyone that I need a script like the ones I provide.....)


but..... asked and answered:

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10286792840956
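The linked answer uses the "carry down the last non-NULL value" technique (LAST_VALUE ... IGNORE NULLS in later releases, or a running MAX over groups in older ones). A minimal Python sketch of the same idea, illustration only:

```python
# Carry-forward of the last non-null value, as in
# LAST_VALUE(column_value IGNORE NULLS) OVER (ORDER BY year, month).
rows = [(2002, "06", 55), (2002, "06", 57), (2002, "07", None),
        (2002, "08", None), (2002, "09", None), (2002, "10", 100),
        (2002, "11", 101)]

filled, last = [], None
for year, month, value in rows:
    if value is not None:
        last = value                    # remember the most recent non-null
    filled.append((year, month, last))
```

The three NULL months pick up 57, and the later real values 100 and 101 pass through unchanged.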






Help On SQL

May 04, 2004 - 11:27 pm UTC

Reviewer: VKoul

Beautiful !!!

I'll keep in mind "create table etc."

Thanks

VKoul

analytic q

May 11, 2004 - 6:38 pm UTC

Reviewer: A reader

Tom
Please look at the following schema and data.
---------
spool schema
set echo on
drop table host_instances;
drop table rac_instances;
drop table instance_tablespaces;

create table host_instances
(
host_name varchar2(50),
instance_name varchar2(50)
);

create table rac_instances
(
rac_name varchar2(50),
instance_name varchar2(50)
);

create table instance_tablespaces
(
instance_name varchar2(50),
tablespace_name varchar2(50),
tablespace_size number
);

-- host to instance mapping data
insert into host_instances values ( 'h1', 'i1' );
insert into host_instances values ( 'h2', 'i2' );
insert into host_instances values ( 'h3', 'i3' );
insert into host_instances values ( 'h4', 'i4' );
insert into host_instances values ( 'h5', 'i5' );

-- rac to instance mapping data

insert into rac_instances values ( 'rac1', 'i1' );
insert into rac_instances values ( 'rac1', 'i2' );
insert into rac_instances values ( 'rac2', 'i3' );
insert into rac_instances values ( 'rac2', 'i4' );

--- instance to tablespace mapping data
insert into instance_tablespaces values( 'i1', 't11', 100 );
insert into instance_tablespaces values( 'i1', 't12', 200 );
insert into instance_tablespaces values( 'i2', 't11', 100 );
insert into instance_tablespaces values( 'i2', 't12', 200 );
insert into instance_tablespaces values( 'i3', 't31', 500 );
insert into instance_tablespaces values( 'i3', 't32', 300 );
insert into instance_tablespaces values( 'i4', 't31', 500 );
insert into instance_tablespaces values( 'i4', 't32', 300 );
insert into instance_tablespaces values( 'i5', 't51', 400 );
commit;
---------

What I need is to sum up all tablespaces of all instances
for a list of hosts. However, if two hosts in the list
belong to a RAC then I should only pick one of the
hosts (I can pick any one of them.)
e.g. in the above data I should only pick i1 or i2 not
both since they both belong to the same RAC 'rac1'.

Following is the select I came up with for the above data.
Let me know if you have any comments on it.
Any alternative solutions you can think of would
also be educational. I have not benchmarked this
select yet. The number of hosts could reach up to
approximately 2000. On average we can assume each will have
one instance - some of these will be RACs.

Thank you!
-----------
scott@ora10g> set echo on
scott@ora10g> column host_name format a10
scott@ora10g> column instance_name format a10
scott@ora10g> column rac_name format a10
scott@ora10g> column row_number format 999
scott@ora10g>
scott@ora10g> select a.instance_name, sum( tablespace_size )
2 from
3 (
4 select instance_name
5 from
6 (
7 select host_name, instance_name, rac_name,
8 row_number() over
9 (
10 partition by rac_name
11 order by rac_name, instance_name
12 ) row_number
13 from
14 (
15 select hi.host_name, hi.instance_name, ri.rac_name
16 from host_instances hi, rac_instances ri
17 where hi.instance_name = ri.instance_name(+)
18 )
19 )
20 where row_number <= 1
21 ) a, instance_tablespaces e
22 where a.instance_name = e.instance_name
23 group by a.instance_name;

i1 300
i3 800
i5 400


---

Also, do you prefer the .sql file (as above) or
the spooled output of schema.sql (i.e. schema.lst)?
The above is more convenient to reproduce - but the spooled output makes for better reading in some cases.




Tom Kyte

Followup  

May 11, 2004 - 9:15 pm UTC

I like the cut and paste from sqlplus truth be told.

sure, I have to do two vi commands and a couple of deletes to fix it up but.... I'm fairly certain that the poster *actually ran the commands successfully!* which is most relevant to me....

Besides, I do it to you ;)




ops$tkyte@ORA9IR2> select *
  2    from (
  3  select h.host_name, h.instance_name, r.rac_name, sum(t.tablespace_size),
  4         row_number() over (partition by r.rac_name order by h.host_name ) rn
  5    from host_instances h,
  6             rac_instances r,
  7             instance_tablespaces t
  8   where h.instance_name = r.instance_name(+)
  9     and h.instance_name = t.instance_name
 10   group by h.host_name, h.instance_name, r.rac_name
 11         )
 12   where rn = 1
 13  /
 
HO IN RAC_N SUM(T.TABLESPACE_SIZE)         RN
-- -- ----- ---------------------- ----------
h1 i1 rac1                     300          1
h3 i3 rac2                     800          1
h5 i5                          400          1


is the first thing that popped into my head.

with just a couple hundred rows -- any of them will perform better than good enough. 
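The pick-one-row-per-group step in both queries is the standard ROW_NUMBER() OVER (PARTITION BY ...) filter with rn = 1. One caveat worth noting: PARTITION BY rac_name puts every NULL rac_name into a single partition, so with more than one standalone host you would want something like NVL(rac_name, host_name) as the partition key. A Python sketch of that pick-first-per-group idea, illustration only, using the sample hosts:

```python
# ROW_NUMBER() OVER (PARTITION BY nvl(rac_name, host_name) ORDER BY host_name),
# keeping only rn = 1. A None rac means "standalone host", which gets its own
# partition so several standalone hosts would all survive.
host_rac = [("h1", "rac1"), ("h2", "rac1"), ("h3", "rac2"),
            ("h4", "rac2"), ("h5", None)]

picked, seen = [], set()
for host, rac in sorted(host_rac):      # ORDER BY host_name
    key = rac if rac is not None else ("standalone", host)
    if key not in seen:                 # rn = 1: first host seen per partition
        seen.add(key)
        picked.append(host)
```

This keeps one host per RAC plus every standalone host, matching the h1/h3/h5 result above.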

thanx!

May 11, 2004 - 9:54 pm UTC

Reviewer: A reader

"I like the cut and paste from sqlplus truth be told."
Actually I was going to post just that - but your
example at the point of posting led me to believe
that you wanted straight SQL - maybe you want to
fix that (not that many people seem to care anyway! :))

Thanks for the SQL - it looks good, and a tad simpler
than the one I wrote...


How to compute this running total (sort of...)

May 18, 2004 - 11:33 am UTC

Reviewer: Kishan from USA

create table investment (
investment_id number,
asset_id number,
agreement_id number,
constraint pk_i primary key (investment_id)
)
/
create table period (
period_id number,
business_domain varchar2(10),
status_code varchar2(10),
constraint pk_p primary key (period_id)
)
/
create table entry (
entry_id number,
period_id number,
investment_id number,
constraint pk_e primary key(entry_id),
constraint fk_e_period foreign key(period_id) references period(period_id),
constraint fk_e_investment foreign key (investment_id) references investment(investment_id)
)
/
create table entry_detail(
entry_id number,
account_type varchar2(10),
amount number,
constraint pk_ed primary key(entry_id, account_type),
constraint fk_ed_entry foreign key(entry_id) references entry(entry_id)
)
/
insert into period (period_id, business_domain, status_code)
SELECT rownum AS period_id,
'BDG' AS business_domain,
'2' AS status_code
from all_objects where rownum <= 5
/

insert into investment(investment_id, asset_id, agreement_id)
select rownum+10 AS investment_id,
rownum+100 AS asset_id,
rownum+1000 AS agreement_id
from all_objects where rownum <=5
/
insert into entry(entry_id, period_id, investment_id) values (1, 1, 11)
/
insert into entry(entry_id, period_id, investment_id) values (2, 2, 11)
/
insert into entry(entry_id, period_id, investment_id) values (3, 3, 11)
/
insert into entry(entry_id, period_id, investment_id) values (4, 3, 13)
/
insert into entry(entry_id, period_id, investment_id) values (5, 4, 13)
/
insert into entry(entry_id, period_id, investment_id) values (6, 4, 14)
/
insert into entry(entry_id, period_id, investment_id) values (7, 5, 14)
/

insert into entry_detail(entry_id, account_type, amount) values(1, 'AC1', 1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC2', -200 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC3', 300 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(5, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC1', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC3', 500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC4', 1200 )
/

scott@LDB.US.ORACLE.COM> select * from period;

 PERIOD_ID BUSINESS_D STATUS_COD
---------- ---------- ----------
         1 BDG        2
         2 BDG        2
         3 BDG        2
         4 BDG        2
         5 BDG        2

scott@LDB.US.ORACLE.COM> select * from investment;

INVESTMENT_ID   ASSET_ID AGREEMENT_ID
------------- ---------- ------------
           11        101         1001
           12        102         1002
           13        103         1003
           14        104         1004
           15        105         1005

scott@LDB.US.ORACLE.COM> select * from entry;

  ENTRY_ID  PERIOD_ID INVESTMENT_ID
---------- ---------- -------------
         1          1            11
         2          2            11
         3          3            11
         4          3            13
         5          4            13
         6          4            14
         7          5            14

7 rows selected.

scott@LDB.US.ORACLE.COM> select * from entry_detail;

  ENTRY_ID ACCOUNT_TY     AMOUNT
---------- ---------- ----------
         1 AC1              1000
         1 AC2              -200
         1 AC3               300
         2 AC1               200
         2 AC4             -1000
         2 AC2              -500
         3 AC2              2200
         3 AC1               200
         4 AC4             -1000
         4 AC2              -500
         5 AC2              2200
         6 AC1               200
         6 AC4             -1000
         6 AC2              -500
         7 AC1              2200
         7 AC3               500
         7 AC4              1200

17 rows selected.


The resultant view needed is given below.

To give an example from the result below, the first entry for investment_id 14
is from period 4. The account types entered on period 4 are AC1, AC4, AC2. We
need these three account types in all subsequent periods. Also, on period 5 a
new account type AC3 is added. So, if there is another period, say period_id 6, we need
information for AC1, AC2, AC3, AC4 (that's 4 account types). If there's no entry
for any of these account_types in a subsequent period, the amount_for_period for
that period is considered to be 0.00 and the balance will be sum(amount_for_period)
up to that period.


PERIOD_ID INVESTMENT_ID ACCOUNT_TYPE AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
--------- ------------- ------------ ----------------- -------------------
        1            11 AC1                       1000                1000
        1            11 AC2                       -200                -200
        1            11 AC3                        300                 300

        2            11 AC1                        200                1200
        2            11 AC2                       -500                -700
        2            11 AC3                          0                 300
        2            11 AC4                      -1000               -1000

        3            11 AC1                        200                1400
        3            11 AC2                        200                -500
        3            11 AC3                          0                 300
        3            11 AC4                          0                1000

        4            11 AC1                          0                1400
        4            11 AC2                          0                -500
        4            11 AC3                          0                 300
        4            11 AC4                          0                1000

        5            11 AC1                          0                1400
        5            11 AC2                          0                -500
        5            11 AC3                          0                 300
        5            11 AC4                          0                1000

        3            13 AC4                      -1000               -1000
        3            13 AC2                       -500                -500

        4            13 AC4                          0               -1000
        4            13 AC2                       -500               -1000

        5            13 AC4                          0               -1000
        5            13 AC4                          0               -1000

        4            14 AC1                        200                 200
        4            14 AC4                      -1000               -1000
        4            14 AC2                       -500                -500

        5            14 AC1                       2200                2400
        5            14 AC3                        500                 500
        5            14 AC4                       1200                 200
        5            14 AC2                          0                -500

The blank lines in between are just for clarity. As always, grateful for all your efforts.

Regards,
Kishan.


Tom Kyte

Followup  

May 18, 2004 - 6:14 pm UTC

so, what does your first try look like :) at least get the join written up for the details - maybe the running total will be obvious from that.

This is how far I went...and no further

May 19, 2004 - 10:18 am UTC

Reviewer: Kishan from USA

select distinct period_id,
       investment_id,
       account_type,
       amount_for_period,
       balance_till_period
  from (select period.period_id,
               entry.investment_id,
               entry_detail.account_type,
               (case when entry.period_id = period.period_id
                     then entry_detail.amount
                     else 0
                end) amount_for_period,
               sum(amount) over (partition by period.period_id,
                                              investment_id,
                                              account_type) balance_till_period
          from period
               left outer join (entry join entry_detail
                                on (entry.entry_id = entry_detail.entry_id))
               on (entry.period_id <= period.period_id))
 order by investment_id

The result looks as below:

 PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
         1            11 AC1                     1000                1000
         1            11 AC2                     -200                -200
         1            11 AC3                      300                 300

         2            11 AC1                        0                1200
         2            11 AC1                      200                1200
         2            11 AC2                     -500                -700
         2            11 AC2                        0                -700
         2            11 AC3                        0                 300
         2            11 AC4                    -1000               -1000

         3            11 AC1                        0                1400
         3            11 AC1                      200                1400
         3            11 AC2                        0                1500
         3            11 AC2                     2200                1500
         3            11 AC3                        0                 300
         3            11 AC4                        0               -1000

         4            11 AC1                        0                1400
         4            11 AC2                        0                1500
         4            11 AC3                        0                 300
         4            11 AC4                        0               -1000

         5            11 AC1                        0                1400
         5            11 AC2                        0                1500
         5            11 AC3                        0                 300
         5            11 AC4                        0               -1000

         3            13 AC2                     -500                -500
         3            13 AC4                    -1000               -1000

         4            13 AC2                        0                1700
         4            13 AC2                     2200                1700
         4            13 AC4                        0               -1000

         5            13 AC2                        0                1700
         5            13 AC4                        0               -1000

         4            14 AC1                      200                 200
         4            14 AC2                     -500                -500
         4            14 AC4                    -1000               -1000

         5            14 AC1                        0                2400
         5            14 AC1                     2200                2400
         5            14 AC2                        0                -500
         5            14 AC3                      500                 500
         5            14 AC4                        0                 200
         5            14 AC4                     1200                 200

First, I am sorry - my originally constructed result (built by hand.. ;) misses a couple of rows.
Other than that, I am unable to remove the redundant rows that show up for a particular investment and account_type within a period - the logic beats me.

Basically, I need to remove rows where amount_for_period is 0 for an account_type only if it is a redundant row for that set. That is, the first row of period_id 2 and 3 is redundant, but the rows for period 4 are not.

Could you help me out?

Regards,
Kishan.

Tom Kyte

Followup  

May 19, 2004 - 11:06 am UTC

are we missing some more order bys? I mean -- what if:

3 11 AC1     0   1400
3 11 AC1   200   1400
3 11 AC2     0   1500
3 11 AC2  2200   1500
3 11 AC3     0    300
3 11 AC4     0  -1000

was really:


3 11 AC1   200   1400
3 11 AC2     0   1500
3 11 AC2  2200   1500
3 11 AC3     0    300
3 11 AC4     0  -1000
3 11 AC1     0   1400

would that still be redundant? missing something here?

Yes...they are redundant

May 19, 2004 - 12:16 pm UTC

Reviewer: A reader

Tom:
Yes, for that particular set, those rows are redundant, no matter what the order is.


Regards,
Kishan.

Tom Kyte

Followup  

May 19, 2004 - 2:24 pm UTC

ok, so what is the "key" of that result set? what can we partition the result set by.

my idea will be to use your query in an inline view and analytics on that to weed out what you want.

May 19, 2004 - 3:08 pm UTC

Reviewer: Kishan from USA

The key would be period_id, investment_id and accout_type. Basically, what the result represents is the amount and the balance-to-date for a particular account_type of an investment_id for a period.

Eg: Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000

If there's no activity on that investment and account_type for the next period, say Period 2, the amount will be 0 for that period, and the balance will be previous period's balance.

Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=0->Balance = 1000

But, if there's an activity on that account_type for that investment, then the amount will be the amount for that period and balance will be the sum of previous balance and current amount. Say for Period 2, the amount is 500, then

Period 1->Investment 1->Account_Type AC1->Amount=1000-> Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500-> Balance=1500

And if there's a new account type entry, say AC2 and amount, say 2000 created for period 2, then the result set will be

Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500->Balance=1500
Period 2->Investment 1->Account_Type AC2->Amount=2000->Balance=2000

There may be many investments per period and many account_types per investment. Hope I am clear....

Regards,
Kishan.


Tom Kyte

Followup  

May 19, 2004 - 5:34 pm UTC

so... if you have:

 PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
         1            11 AC1                     1000                1000
         1            11 AC2                     -200                -200
         1            11 AC3                      300                 300

         2            11 AC1                        0                1200
         2            11 AC1                      200                1200
         2            11 AC2                     -500                -700
         2            11 AC2                        0                -700
         2            11 AC3                        0                 300
         2            11 AC4                    -1000               -1000

you see though, why isn't the 4th line here "redundant" then?

But it is redundant..

May 19, 2004 - 11:51 pm UTC

Reviewer: Kishan from USA

Tom, I am assuming the 4th line you mention is 2->11->AC2->0->-700. Yes, it is redundant.

We need amount and balance for every period_id, investment_id and account_type. One line, per period_id, investment_id and account_type, anything more, is redundant.

Issue is, there may not be entries for a specific account_type of an investment for a particular period. In such cases, we need to assume amount for such periods are 0 and compute the balances accordingly.

Regards,
Kishan

Tom Kyte

Followup  

May 20, 2004 - 10:55 am UTC

so, if you partition by

PERIOD_ID INVESTMENT_ID ACCOUNT_TY BALANCE_TILL_PERIOD

order by
AMOUNT_FOR_PERIOD

select a.*, lead(amount_for_period) over (partition by .... order by ... ) nxt
from (YOUR_QUERY)


you can then

select *
from (that_query)
where nxt is NULL or (nxt is not null and amount_for_period <> 0)

if nxt is null -- last row in the partition, keep it.
if nxt is not null AND we are zero -- remove it.
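Tom's filter can be sketched procedurally as well. This Python fragment is an illustration only, using a small slice of the data above: within each (period, investment, account_type, balance) group, order by amount and keep a row only when it is the last in the group (LEAD is NULL) or its amount is non-zero:

```python
from collections import defaultdict

# (period, investment, account_type, amount, balance) -- a slice of the data above
rows = [
    (2, 11, "AC1", 0, 1200),
    (2, 11, "AC1", 200, 1200),
    (2, 11, "AC3", 0, 300),
]

# PARTITION BY period, investment, account_type, balance
groups = defaultdict(list)
for r in rows:
    groups[(r[0], r[1], r[2], r[4])].append(r)

kept = []
for g in groups.values():
    g.sort(key=lambda r: r[3])          # ORDER BY amount_for_period
    for i, r in enumerate(g):
        nxt = g[i + 1] if i + 1 < len(g) else None   # LEAD(amount_for_period)
        if nxt is None or r[3] != 0:    # keep last row, or any non-zero row
            kept.append(r)

kept.sort()
```

The redundant (2, 11, AC1, 0) row is dropped, while the standalone zero row for AC3 survives because it is the only row in its group.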




Almost there?

May 20, 2004 - 12:30 pm UTC

Reviewer: Dave Thompson from UK

Hi Tom,

We have the following table of data:

CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)

With:

INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;

We are trying to sequentially de-duplicate this data.

Basically, from the top of the table we go down and check each row against the previous. If they are the same, both the duplicate row and the original row are marked as such.

So far we have this query:

SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE,
       CASE WHEN (DUP = 'DUP' OR DUPER = 'DUP') THEN 'DUP' ELSE 'NOT' END LETSEE
  FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE, DUP,
               CASE WHEN COLUMN_A = NEXT_A
                     AND COLUMN_B = NEXT_B
                     AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
          FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE,
                       NEXT_A, NEXT_B, NEXT_C,
                       CASE WHEN COLUMN_A = PREV_A
                             AND COLUMN_B = PREV_B
                             AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
                  FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C,
                               START_DATE, END_DATE,
                               LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS PREV_A,
                               LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS PREV_B,
                               LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS PREV_C,
                               LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS NEXT_A,
                               LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS NEXT_B,
                               LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS NEXT_C
                          FROM DEDUP_TEST
                         ORDER BY 1, 5)))

        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-00 01-OCT-01 DUP
         2 A          B          B          01-OCT-01 01-OCT-03 DUP
         2 A          B          B          02-OCT-01 05-OCT-03 DUP
         2 a          f          f          06-FEB-04 07-FEB-04 NOT
         2 A          B          B          02-OCT-05 03-OCT-05 DUP
         2 A          B          B          04-OCT-05 06-OCT-05 DUP
         3 A          F          F          10-FEB-04 20-FEB-04 NOT

The resultset from this is almost what I am after.

However where there are groups of duplicate rows I only want to return one row. I take the attributes, the start_date of the first row duplicated and the end_date of the last row duplicated.

I do not want to group all the duplicates together, so for example the rows with the attributes

ID COLUMN_A COLUMN_B COLUMN_C

2 A B B

will result in two output rows:

2 A B B 01-OCT-00 01-OCT-03

2 A B B 02-OCT-05 06-OCT-05

This is the final piece I cannot work out.

Any help would be appreciated.

Thanks.
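As an aside, the missing piece -- collapsing each run of consecutive duplicates into one row carrying the first start_date and the last end_date -- is exactly what itertools.groupby does over the ordered rows. A Python sketch, illustration only (in SQL this is typically done by tagging runs with the difference of two ROW_NUMBER()s before grouping):

```python
from itertools import groupby

# Rows already ordered by id, start_date; collapse consecutive runs that share
# (id, column_a, column_b, column_c) into first-start / last-end.
rows = [
    (2, "A", "B", "B", "2000-10-01", "2001-10-01"),
    (2, "A", "B", "B", "2001-10-01", "2003-10-01"),
    (2, "a", "f", "f", "2004-02-06", "2004-02-07"),
    (2, "A", "B", "B", "2005-10-02", "2005-10-03"),
    (2, "A", "B", "B", "2005-10-04", "2005-10-06"),
]

collapsed = []
for key, run in groupby(rows, key=lambda r: r[:4]):   # consecutive runs only
    run = list(run)
    collapsed.append(key + (run[0][4], run[-1][5]))   # first start, last end
```

The two separated A/B/B runs stay separate, matching the two output rows the poster wants.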



Tom Kyte

Followup  

May 20, 2004 - 2:18 pm UTC

what happens in your data if you had

1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....

that might or might not be "dup" since you just order by ID. don't we need to order by a, b, and c?

Follow up

May 21, 2004 - 5:02 am UTC

Reviewer: Dave Thompson from UK

Hi Tom,

In response to your question:

what happens in your data if you had

1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....

Then the first row would be classed as unique, as would the second and the third. We are only looking at duplicates that occur sequentially.

Sequential duplicates are then turned into one row by taking the start date of the first row and the end date of the last row in the group.

The test data should have had sequential dates:

INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'D', 'B', 'C', TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
1, 'A', 'B', 'C', TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'a', 'f', 'f', TO_Date( '02/06/2009 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/07/2010 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/01/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2007 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/05/2008 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/02/2011 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/03/2012 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
2, 'A', 'B', 'B', TO_Date( '10/04/2013 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '10/06/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES (
3, 'A', 'F', 'F', TO_Date( '02/10/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '02/20/2015 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'));
COMMIT;

CREATE TABLE DEDUP_TEST
(
ID NUMBER,
COLUMN_A VARCHAR2(10 BYTE),
COLUMN_B VARCHAR2(10 BYTE),
COLUMN_C VARCHAR2(10 BYTE),
START_DATE DATE,
END_DATE DATE
)

The query:

SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE,
       CASE WHEN (DUP = 'DUP' OR DUPER = 'DUP') THEN 'DUP' ELSE 'NOT' END LETSEE
  FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE, DUP,
               CASE WHEN COLUMN_A = NEXT_A
                     AND COLUMN_B = NEXT_B
                     AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
          FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE, END_DATE,
                       NEXT_A, NEXT_B, NEXT_C,
                       CASE WHEN COLUMN_A = PREV_A
                             AND COLUMN_B = PREV_B
                             AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
                  FROM (SELECT ID, COLUMN_A, COLUMN_B, COLUMN_C,
                               START_DATE, END_DATE,
                               LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS PREV_A,
                               LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS PREV_B,
                               LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS PREV_C,
                               LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS NEXT_A,
                               LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS NEXT_B,
                               LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS NEXT_C
                          FROM DEDUP_TEST
                         ORDER BY ID, START_DATE)))

Gives:

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT

From this the sequentially duplicated rows with the attributes a, b, c will become:

2 A B C 01-OCT-03 05-OCT-08

2 A B C 02-OCT-11 06-OCT-14

Thanks.

Tom Kyte

Followup  

May 21, 2004 - 10:50 am UTC

define sequentially.

1 A1 B1 C1 ....
1 A2 B2 C2 ....
1 A1 B1 C1 ....

ordered by ID is the same (exact same) as:

1 A1 B1 C1 ....
1 A1 B1 C1 ....
1 A2 B2 C2 ....

and

1 A2 B2 C2 ....
1 A1 B1 C1 ....
1 A1 B1 C1 ....


and in fact, two runs of your query could return different answers given the SAME exact data. To handle that, you must have something more to sort by.

Typo in previous post

May 21, 2004 - 5:56 am UTC

Reviewer: Dave Thompson from England

Tom,

The final output should be:

From this the sequentially duplicated rows with the attributes a, b, c will
become:

2 A B B 01-OCT-03 05-OCT-08

2 A B B 02-OCT-11 06-OCT-14

Thanks.


Order

May 21, 2004 - 10:57 am UTC

Reviewer: Dave Thompson from England

Hi Tom,

The order of the dataset should be on the ID and Start Date.

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT

Thanks.

Tom Kyte

Followup  

May 21, 2004 - 11:42 am UTC

Ok, your example doesn't do that -- it is "non-deterministic": given the same data, it could return two different answers at different times during the day!


so, i think you want one of these:

ops$tkyte@ORA9IR2> select *
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         case when (a = lag(a) over (order by id, start_date desc) and
  5                    b = lag(b) over (order by id, start_date desc) and
  6                    c = lag(c) over (order by id, start_date desc) )
  7              then row_number() over (order by id, start_date)
  8          end rn
  9    from v
 10         )
 11   where rn is null
 12  /
 
        ID A          B          C          START_DAT END_DATE          RN
---------- ---------- ---------- ---------- --------- --------- ----------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          02-OCT-07 05-OCT-08
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          04-OCT-13 06-OCT-14
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.
 
ops$tkyte@ORA9IR2> select *
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         case when (a = lag(a) over (order by id, start_date) and
  5                    b = lag(b) over (order by id, start_date) and
  6                    c = lag(c) over (order by id, start_date) )
  7              then row_number() over (order by id, start_date)
  8          end rn
  9    from v
 10         )
 11   where rn is null
 12  /
 
        ID A          B          C          START_DAT END_DATE          RN
---------- ---------- ---------- ---------- --------- --------- ----------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          01-OCT-03 01-OCT-04
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          02-OCT-11 03-OCT-12
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.

we just need to mark records that the preceding record is the "same" after sorting -- then nuke them. 
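Outside the database, the same mark-and-nuke idea can be sketched in plain Python (a hypothetical illustration of the logic, not the SQL itself): order the rows, then keep a row only when its attributes differ from its predecessor's.

```python
def dedup_consecutive(rows):
    """Keep a row only when its (id, a, b, c) differ from the previous row's
    after ordering by id and start_date -- the first row of each run survives."""
    rows = sorted(rows, key=lambda r: (r[0], r[4]))  # order by id, start_date
    kept, prev_key = [], None
    for r in rows:
        key = (r[0], r[1], r[2], r[3])
        if key != prev_key:          # predecessor differs -> keep this row
            kept.append(r)
        prev_key = key
    return kept

# rows echo part of the thread's ID=2 data (ISO date strings sort correctly)
rows = [
    (2, 'A', 'B', 'B', '2003-10-01', '2004-10-01'),
    (2, 'A', 'B', 'B', '2005-10-01', '2006-10-01'),
    (2, 'a', 'f', 'f', '2009-02-06', '2010-02-07'),
    (2, 'A', 'B', 'B', '2011-10-02', '2012-10-03'),
]
```

Only the first of the two consecutive A/B/B rows survives, matching the second SQL variant above.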

More Info

May 21, 2004 - 12:25 pm UTC

Reviewer: Dave Thompson from England, Sunny spells with cloud today.

Hi Tom,

Thanks for the prompt reply.

I re-wrote the base query:

SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
DUP,
CASE WHEN COLUMN_A = NEXT_A
AND COLUMN_B = NEXT_B
AND COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
NEXT_A,
NEXT_B,
NEXT_C,
CASE WHEN COLUMN_A = PREV_A
AND COLUMN_B = PREV_B
AND COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM ( SELECT ID,
COLUMN_A,
COLUMN_B,
COLUMN_C,
START_DATE,
END_DATE,
ROWID ROWID_R,
LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_A,
LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_B,
LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_C,
LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_A,
LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_B,
LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_C
FROM DEDUP_TEST
ORDER
BY ID, START_DATE ) ) )

And got:

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT

Looking at the column LETSEE I want to add a unique identifier to each row, treating duplicated rows as 1.

For example:

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET DUP_ID
---------- ---------- ---------- ---------- --------- --------- --- ------
1 A B C 01-OCT-99 01-OCT-00 NOT 1
1 D B C 01-OCT-01 01-OCT-02 NOT 2
1 A B C 01-OCT-02 01-OCT-03 NOT 3
2 A B B 01-OCT-03 01-OCT-04 DUP 4
2 A B B 01-OCT-05 01-OCT-06 DUP 4
2 A B B 02-OCT-07 05-OCT-08 DUP 4
2 a f f 06-FEB-09 07-FEB-10 NOT 5
2 A B B 02-OCT-11 03-OCT-12 DUP 6
2 A B B 04-OCT-13 06-OCT-14 DUP 6
3 A F F 10-FEB-14 20-FEB-15 NOT 7

Then I could use the Dup_Id to partition on to do the analysis I need.

Any idea?

Have a nice weekend.

Thanks.


Tom Kyte

Followup  

May 21, 2004 - 1:59 pm UTC

the above query doesn't work?

Hi Again

May 21, 2004 - 2:05 pm UTC

Reviewer: Dave Thompson from England

Hi Tom,

The above didn't work.

From the source query:

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 03-OCT-12 DUP
2 A B B 04-OCT-13 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT

I want to output the following resultset:

ID COLUMN_A COLUMN_B COLUMN_C START_DAT END_DATE LET
---------- ---------- ---------- ---------- --------- --------- ---
1 A B C 01-OCT-99 01-OCT-00 NOT
1 D B C 01-OCT-01 01-OCT-02 NOT
1 A B C 01-OCT-02 01-OCT-03 NOT
2 A B B 01-OCT-03 05-OCT-08 DUP
2 a f f 06-FEB-09 07-FEB-10 NOT
2 A B B 02-OCT-11 06-OCT-14 DUP
3 A F F 10-FEB-14 20-FEB-15 NOT

On the resultset from your queries the start and end dates were incorrect.

Where duplicate rows occur one after another, we need to take the start_date of the first row and the end_date of the last row in that block.

So for the following:

2 A B B 01-OCT-03 01-OCT-04 DUP
2 A B B 01-OCT-05 01-OCT-06 DUP
2 A B B 02-OCT-07 05-OCT-08 DUP

You would get

2 A B B 01-OCT-03 05-OCT-08 DUP


Does this make sense?

Thanks again for your input on this.



Tom Kyte

Followup  

May 21, 2004 - 2:19 pm UTC

ops$tkyte@ORA9IR2> select id, a,b,c, min(start_date) start_date, max(end_date) end_date
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         max(grp) over (order by id, start_date desc) grp
  5    from (
  6  select id, a,b,c, start_date, end_date,
  7         case when (a <> lag(a) over (order by id, start_date desc) or
  8                    b <> lag(b) over (order by id, start_date desc) or
  9                    c <> lag(c) over (order by id, start_date desc) )
 10              then row_number() over (order by id, start_date desc)
 11          end grp
 12    from v
 13         )
 14         )
 15   group by id, a,b,c,grp
 16   order by 1, 5
 17  /
 
        ID A          B          C          START_DAT END_DATE
---------- ---------- ---------- ---------- --------- ---------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          01-OCT-03 05-OCT-08
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          02-OCT-11 06-OCT-14
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.


One of my (current) favorite analytic tricks -- the old "carry forward".  We mark rows where the preceding row was different -- subsequent dup rows have NULLs there for grp.

Then, we use max(grp) to "carry" that number down....

Now we have something to group by -- we've divided the rows up into groups we can deal with.


(note: if a, b, c allow NULLs, we'll need to accommodate that!)
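For readers outside SQL, the carry-forward mechanics can be sketched hypothetically in Python: number a row only when its attributes change, carry that number down to the duplicate rows beneath it, then take min(start)/max(end) per group.

```python
from itertools import groupby

def collapse_runs(rows):
    """rows: (id, a, b, c, start, end), already ordered by id, start.
    Mirrors the marker + max(grp) + group-by trick from the SQL above."""
    grp, prev, marks = 0, None, []
    for r in rows:
        key = (r[0], r[1], r[2], r[3])
        if key != prev:              # attributes changed -> new marker (row_number)
            grp += 1
        prev = key
        marks.append((grp, r))       # appending grp to every row = "carry forward"
    out = []
    for g, items in groupby(marks, key=lambda m: m[0]):
        rs = [r for _, r in items]
        first = rs[0]
        out.append((first[0], first[1], first[2], first[3],
                    min(r[4] for r in rs),   # min(start_date) per group
                    max(r[5] for r in rs)))  # max(end_date) per group
    return out
```

With the thread's ID=2 rows, the three consecutive A/B/B rows collapse into one row spanning the first start date to the last end date.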

Great Stuff

May 21, 2004 - 5:02 pm UTC

Reviewer: Dave Thompson from England, Overnight frost expected!

Tom,

Thanks very much for that.

I'll go over it in more detail when I'm in the Office Monday but it looks great from here.

Enjoy the weekend.

Excellent

June 02, 2004 - 4:53 am UTC

Reviewer: Dave Thompson from Yorkshire

Hi Tom,

This solution was spot on.

Thanks.

Any more thoughts on an Analytics book?

June 09, 2004 - 6:03 pm UTC

Reviewer: Stalin from CA, US

hi tom,

wondering what the sql below would look like if the lead or partition analytic functions didn't exist. is pl/sql the only option?

snippet from the "lead/lag on different dataset" thread (it has the create and insert stmts)

ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 )
minutes
2 from (
3 select log_id,
4 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
5 then lead(log_id) over (partition by account_id, user_id, mac
order by log_creation_date)
6 end signoff_id,
7 user_id,
8 log_creation_date signon_date,
9 case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
10 then lead(log_creation_date) over (partition by account_id,
user_id, mac order by log_creation_date)
11 end signoff_date,
12 action
13 from logs
14 where account_id = 'Robert'
15 and service = 5
16 order by user_id
17 ) a
18 where action in (1,2)
19 /

Thanks,
Stalin

Tom Kyte

Followup  

June 09, 2004 - 6:27 pm UTC

you could use a non-equi self join to achieve the same. Many orders of magnitude slower.

scalar subqueries could be used as well -- with the same "slower" caveat.
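To see why the pre-analytics alternatives are so much slower, here is a hypothetical Python sketch of what the non-equi self join (or scalar subquery) amounts to: every row rescans the whole set for its nearest successor, O(n^2) work instead of one sort.

```python
def lead_without_analytics(rows):
    """Each row: (key, ts, val). Emulates LEAD(val) OVER (PARTITION BY key
    ORDER BY ts) the pre-analytics way: for every row, scan all rows for the
    nearest later one in the same partition -- a full pass per row."""
    out = []
    for k, ts, v in rows:
        later = [(t2, v2) for k2, t2, v2 in rows if k2 == k and t2 > ts]
        next_val = min(later)[1] if later else None  # earliest later row's value
        out.append((k, ts, v, next_val))
    return out
```

The column names here are illustrative, not the LOGS table from the snippet; the point is only the quadratic shape of the scan.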

Is this solvable with ANALYTICS too?

June 10, 2004 - 12:14 am UTC

Reviewer: Peter Tran from Houston, TX USA

Hi Tom,

Can the following problem be solved using Analytics?

I have a 10-column table where 9 of the columns are dimensions and one is an attribute. I would like to get a report of the D1/D2 combinations where ATTR1 is 1 for every combination of the other dimensions. Furthermore, the PK consists of all the dimension columns.

The names below aren't the real column names; I didn't want to make the example table too wide for illustrative purposes.


D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA AA AA AA AA AA AA AA AA 1
AA AA BB AA AA AA AA AA AA 1
AA AA AA CC AA AA AA AA AA 1
AA AA AA AA DD AA AA AA AA 1
AA AA AA AA EE AA AA AA AA 1

AA BB AA AA AA AA AA GG AA 1
AA BB AA AA AA AA AA AA AA 1
AA BB CC AA AA AA AA AA AA 0
AA BB AA DD AA AA AA AA AA 1

EE DD JJ LL MM NN OO PP QQ 1
EE DD TT LL MM NN OO PP QQ 1

I want the query to return:

D1 D2
--------
AA AA
EE DD

It would not return AA/BB, because of the record:

D1 D2 D3 D4 D5 D6 D7 D8 D9 ATTR1
--------------------------------------------
AA BB CC AA AA AA AA AA AA 0

Thanks,
-Peter

Tom Kyte

Followup  

June 10, 2004 - 7:43 am UTC

yes they can, but they are not needed. regular aggregates do the job. I'd give you the real query if I had a create table/inserts to demo against. this is "pseudo code", it might or might not actually work:


select d1, d2
from t
group by d1, d2
having count(distinct attribute) = 1




June 10, 2004 - 9:01 am UTC

Reviewer: Michael T. from Dallas, Tx

Peter,
I think the following may give you what you want.

SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) > 0;

Tom's pseudo code will work except for the case when all rows for a D1/D2 combination have the same ATTR1 value, but that value is not 1.


Tom Kyte

Followup  

June 10, 2004 - 9:45 am UTC

ahh, good eye -- i was thinking "all attribute values are the same"

but yours doesn't do it, this will

having count( decode( attr1, 1, 1 ) ) = count(*)



count(decode(attr1,1,1)) will return a count of non-null occurrences (all of the 1's)

count(*) returns a count of all records

output when count(decode) = count(*)
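The same filter can be sketched hypothetically in Python, mirroring count(decode(attr1,1,1)) = count(*): keep a D1/D2 group only when the number of rows with attr1 = 1 equals the total row count of that group.

```python
from collections import defaultdict

def groups_all_ones(rows):
    """rows: (d1, d2, attr1). Return the (d1, d2) pairs where every row
    has attr1 == 1 -- the count of 1s equals the total count."""
    total, ones = defaultdict(int), defaultdict(int)
    for d1, d2, attr in rows:
        total[(d1, d2)] += 1
        if attr == 1:                 # decode(attr1, 1, 1) counts only the 1s
            ones[(d1, d2)] += 1
    return sorted(k for k in total if ones[k] == total[k])
```

A D1/D2 group with even one row of attr1 = 0 (or NULL) drops out, exactly as AA/BB does in Peter's example.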



Thank you!

June 10, 2004 - 10:37 am UTC

Reviewer: Peter Tran from Houston, TX USA

Hi Tom/Michael T.,

Thank you. It's so much clearer now.

-Peter

June 10, 2004 - 10:46 am UTC

Reviewer: Michael T. from Dallas, Tx

I did screw up in my previous response. The query I submitted gives the entirely wrong answer. It should have been

SELECT d1, d2
FROM t
GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) = 0

Even though I incorrectly wasn't considering NULL values for ATTR1 originally, the above query seems to produce the correct answer even if ATTR1 is NULL. The DECODE evaluates a NULL ATTR1 entry to 1.

Tom, many thanks for this site. I have learned so much from it. It is a daily must read for me.


You said a book on analytics?

June 10, 2004 - 12:30 pm UTC

Reviewer: Jeff from Atlanta, GA

A book by you on analytics would be a best seller I think.
Go for it.

quick analytic question

June 16, 2004 - 5:03 pm UTC

Reviewer: A reader

schema creation---
---
scott@ora92> drop table t1;

Table dropped.

scott@ora92> create table t1
2 (
3 x varchar2(10),
4 y number
5 );

Table created.

scott@ora92>
scott@ora92> insert into t1 values( 'x1', 1 );

1 row created.

scott@ora92> insert into t1 values( 'x1', 2 );

1 row created.

scott@ora92> insert into t1 values( 'x1', 4 );

1 row created.

scott@ora92> insert into t1 values( 'x1', 0 );

1 row created.

scott@ora92> commit;

Commit complete.

scott@ora92> select x, y, min(y) over() min_y
2 from t1;

X Y MIN_Y
---------- ---------- ----------
x1 1 0
x1 2 0
x1 4 0
x1 0 0

scott@ora92> spool off
---
how do i get the minimum of y for all values
that is greater than 0 (if one exists). In the above case
I should get the result as

X Y MIN_Y
---------- ---------- ----------
x1 1 1
x1 2 1
x1 4 1
x1 0 1



Thanx for your excellent site and brilliant work!


Tom Kyte

Followup  

June 16, 2004 - 6:09 pm UTC

min( case when y > 0 then y end ) over ()
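The CASE makes the non-positive values NULL so MIN ignores them, and the empty OVER () broadcasts the single result to every row. A hypothetical Python sketch of the same computation:

```python
def min_positive_over(rows):
    """rows: (x, y). Mirrors MIN(CASE WHEN y > 0 THEN y END) OVER ():
    compute the minimum over positive y's once, repeat it on every row."""
    positives = [y for _, y in rows if y > 0]   # CASE nulls out y <= 0
    m = min(positives) if positives else None   # MIN ignores the "nulls"
    return [(x, y, m) for x, y in rows]         # OVER () -> same value per row
```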

Great!!!

June 16, 2004 - 6:46 pm UTC

Reviewer: A reader


Thank you very much

July 02, 2004 - 9:16 am UTC

Reviewer: Gj from UK

The Oracle docs are a little light on examples, but thank you for giving us the quick start to analytics. I can't say I understand the complex examples yet, but the simple stuff seems so easy to understand now. I can't wait until a real problem comes along that I can apply this feature to.

Tom Kyte

Followup  

July 02, 2004 - 10:39 am UTC

</code> http://docs.oracle.com/docs/cd/B10501_01/server.920/a96520/analysis.htm#1020 <code>


How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?

July 06, 2004 - 8:08 am UTC

Reviewer: Sergey from Norway

Hi Tom,

I need to 'fill the gaps' with the values from the last existing row in a table that is outer joined to another table. The other table serves as a source of regular [time] intervals. The task seems conceptually very simple, so I looked into the Oracle docs (the Ora10g docs, as it happens) and pretty soon found exactly what I need: LAST_VALUE with IGNORE NULLS. Unfortunately, neither Ora8i nor Ora9i accepts IGNORE NULLS. Is there any way to mimic this feature with 'older' analytic functions?
I tried something like ORDER BY SIGN(NVL(VALUE, 0)) in the analytic ORDER BY clause, but it does not work (I do not have a clue why).

Thanks in advance

Here is the test:
DROP TABLE TD;
CREATE TABLE TD AS
(SELECT TRUNC(SYSDATE, 'DD') + ROWNUM T
FROM ALL_OBJECTS
WHERE ROWNUM <= 15
);

DROP TABLE TV;
CREATE TABLE TV AS
(SELECT
TRUNC(SYSDATE, 'DD') + ROWNUM * 3 T
,ROWNUM V
FROM ALL_OBJECTS
WHERE ROWNUM <= 5
);


SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V IGNORE NULLS) -- IGNORE NULLS does not work on Ora8i, Ora9i
OVER
(
ORDER BY TD.T
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
ERROR at line 6:
ORA-00907: missing right parenthesis

SELECT
TD.T
,SIGN(NVL(TV.V, 0))
,NVL
(TV.V,
LAST_VALUE(TV.V)
OVER
(
ORDER BY SIGN(NVL(TV.V, 0)), TD.T -- Does not work
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
) V
FROM TD, TV
WHERE TV.T(+) = TD.T
ORDER BY TD.T
;
T SIGN(NVL(TV.V,0)) V
------------------- ------------------ ------------------
07.07.2004 00:00:00 0
08.07.2004 00:00:00 0
09.07.2004 00:00:00 1 1
10.07.2004 00:00:00 0
11.07.2004 00:00:00 0
12.07.2004 00:00:00 1 2
13.07.2004 00:00:00 0
14.07.2004 00:00:00 0
15.07.2004 00:00:00 1 3
16.07.2004 00:00:00 0
17.07.2004 00:00:00 0
18.07.2004 00:00:00 1 4
19.07.2004 00:00:00 0
20.07.2004 00:00:00 0
21.07.2004 00:00:00 1 5




Tom Kyte

Followup  

July 06, 2004 - 8:26 am UTC

This is a trick I call "carry down"; we use analytics on analytics to accomplish it.  We output "marker rows" with ROW_NUMBER() on the leading edge.  Using MAX() in the outer query, we "carry down" these marker rows -- SUBSTR gets rid of the row_number for us:



ops$tkyte@ORA10G> select t,
  2         sign_v,
  3             v,
  4             substr( max(data) over (order by t), 7 ) v2
  5    from (
  6  SELECT TD.T,
  7         SIGN(NVL(TV.V, 0)) sign_v,
  8          NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
  9           case when tv.v is not null
 10                 then to_char( row_number() 
                                  over (order by td.t), 'fm000000' ) || tv.v
 11                    end data
 12      FROM TD, TV
 13      WHERE TV.T(+) = TD.T
 14          )
 15   ORDER BY T
 16      ;
 
T             SIGN_V          V V2
--------- ---------- ---------- -----------------------------------------
07-JUL-04          0
08-JUL-04          0
09-JUL-04          1          1 1
10-JUL-04          0          1 1
11-JUL-04          0          1 1
12-JUL-04          1          2 2
13-JUL-04          0          2 2
14-JUL-04          0          2 2
15-JUL-04          1          3 3
16-JUL-04          0          3 3
17-JUL-04          0          3 3
18-JUL-04          1          4 4
19-JUL-04          0          4 4
20-JUL-04          0          4 4
21-JUL-04          1          5 5
 
15 rows selected.


So, in 9ir2 this would simply be:


ops$tkyte@ORA9IR2> select t,
  2         sign_v,
  3             substr( max(data) over (order by t), 7 ) v2
  4    from (
  5  SELECT TD.T,
  6         SIGN(NVL(TV.V, 0)) sign_v,
  7           case when tv.v is not null
  8                        then to_char( row_number() over (order by td.t), 'fm000000' ) || tv.v
  9                    end data
 10      FROM TD, TV
 11      WHERE TV.T(+) = TD.T
 12          )
 13   ORDER BY T
 14      ;
 
T             SIGN_V V2
--------- ---------- -----------------------------------------
07-JUL-04          0
08-JUL-04          0
09-JUL-04          1 1
10-JUL-04          0 1
11-JUL-04          0 1
12-JUL-04          1 2
13-JUL-04          0 2
14-JUL-04          0 2
15-JUL-04          1 3
16-JUL-04          0 3
17-JUL-04          0 3
18-JUL-04          1 4
19-JUL-04          0 4
20-JUL-04          0 4
21-JUL-04          1 5
 
15 rows selected.
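Stripped of the SQL machinery, both queries compute a forward fill: carry the last non-null value down to the gap rows. A hypothetical Python sketch of that core operation:

```python
def carry_down(values):
    """Forward-fill Nones with the last non-null value seen so far --
    what LAST_VALUE(... IGNORE NULLS) computes in 10g, and what the
    to_char(row_number())||value / MAX() / SUBSTR trick emulates in 9i."""
    last, out = None, []
    for v in values:
        if v is not None:
            last = v                 # remember the latest real value
        out.append(last)             # gaps inherit it; leading gaps stay None
    return out
```

Note that, like the SQL output, rows before the first real value stay null.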
 

Doesn't work with PL/SQL ????????

July 20, 2004 - 9:31 am UTC

Reviewer: A reader

Dear Tom
Are analytics fully compatible with PL/SQL?
Please see
SQL> ed
Wrote file afiedt.buf

  1  select empno,deptno,
  2         count(empno) over (partition by deptno order by empno
  3                            rows between unbounded preceding and current row) run_count
  4* from emp
SQL> /

     EMPNO     DEPTNO  RUN_COUNT
---------- ---------- ----------
      7782         10          1
      7839         10          2
      7934         10          3
      7369         20          1
      7566         20          2
      7788         20          3
      7876         20          4
      7902         20          5
      7499         30          1
      7521         30          2
      7654         30          3

     EMPNO     DEPTNO  RUN_COUNT
---------- ---------- ----------
      7698         30          4
      7844         30          5
      7900         30          6

14 rows selected.

SQL> 
SQL> ed
Wrote file afiedt.buf

  1  declare
  2  cursor c1 is
  3  select empno,deptno,
  4         count(empno) over (partition by deptno order by empno
  5                            rows between unbounded preceding and current row) run_count
  6  from emp;
  7  begin
  8   for rec in c1 loop
  9    null;
 10   end loop;
 11* end;
SQL> /
end;
   *
ERROR at line 11:
ORA-06550: line 5, column 72:
PL/SQL: ORA-00905: missing keyword
ORA-06550: line 3, column 1:
PL/SQL: SQL Statement ignored


SQL> 
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for 32-bit Windows: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production

SQL>  

Tom Kyte

Followup  

July 20, 2004 - 8:08 pm UTC

You can contact support and reference <Bug:3083373>, but the workaround would be to use native dynamic sql or a view to "hide" this construct.

the problem turns out to be the word "current", which has meaning in plsql.

Effect of distinct on lag

July 29, 2004 - 1:48 pm UTC

Reviewer: John Murphy from Vienna, VA

I am trying to use analytics to find accounts with receipts in 3 consecutive years. The analytic code seems to work; however, when I add DISTINCT (to find each account once), I get strange results. This is on 9.2.0.1.0.

create table jcm_test(acct_id number(10), rcpt_date date);

insert into jcm_test
values (1 , to_date('01-JAN-2000', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2001', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('01-JAN-2003', 'dd-mon-yyyy'));
insert into jcm_test
values (1 , to_date('02-JAN-2001', 'dd-mon-yyyy'));

(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2);

ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2000 1
1 2001 2
1 2003 3 2000 1 3 2

select * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;

no rows selected

select distinct * from
(select j2.*,
rcpt_year - lag_yr as year_diff,
rank_year - lag_rank as rank_diff
from (select acct_id, rcpt_year, rank_year,
lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
from (select acct_id,
rcpt_year,
rank() over (partition by acct_id order by j.rcpt_year) rank_year
from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
from jcm_test) j )
) j2)
where year_diff = rank_diff;

ACCT_ID RCPT RANK_YEAR LAG_ LAG_RANK YEAR_DIFF RANK_DIFF
---------- ---- ---------- ---- ---------- ---------- ----------
1 2001 2 2000 1 1 1
1 2003 4 2001 2 2 2

In your book, you say that because analytics are performed last, you must push them into an inline view. However, that doesn't seem to do the trick here. Thanks, john

Tom Kyte

Followup  

July 29, 2004 - 2:18 pm UTC

what release -- i don't see what you see.

Distinct effect release

July 29, 2004 - 3:12 pm UTC

Reviewer: John Murphy from Vienna, VA

Tom, we are using the following.

Oracle9i Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE 9.2.0.1.0 Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production

I tried searching Metalink, but couldn't find any bugs.

Tom Kyte

Followup  

July 29, 2004 - 4:03 pm UTC

i found one, not published, that was solved via 9202 -- at least it did not reproduce, so they did not pursue it further.



Distinct effect release

July 29, 2004 - 4:01 pm UTC

Reviewer: John Murphy from Vienna, VA

Actually, I suspect that this may be related to bug 2258035. Do you agree? Thanks, john

Tom Kyte

Followup  

July 29, 2004 - 4:18 pm UTC

yes, i can confirm that in 9205, it is not happening that way.

how to write this query

July 30, 2004 - 6:33 am UTC

Reviewer: Teddy

Hi

using the original poster´s example:


ORDER OPN STATION CLOSE_DATE
----- --- ------- ----------
12345 10 RECV 07/01/2003
12345 20 MACH1 07/02/2003
12345 25 MACH1 07/05/2003
12345 30 MACH1 07/11/2003
12345 36 INSP1 07/12/2003
12345 50 MACH1 08/16/2003
12346 90 MACH2 07/30/2003
12346 990 STOCK 07/31/2003

How do you write a query to determine that an order has passed through manufacturing operations in several months?
In the above example,

12345 has rows in July and August, but 12346 has rows in July only. How can we write a query to find orders such as 12345?


Tom Kyte

Followup  

July 30, 2004 - 4:40 pm UTC

select order, min(close_date), max(close_date)
from t
group by order
having months_between( max(close_date), min(close_date) ) > your_threshold;

Finding pairs in result set

August 11, 2004 - 10:05 am UTC

Reviewer: PJ

Tom,

CREATE TABLE A
(
N NUMBER,
C CHAR(1),
V VARCHAR2(20)
)

INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '1st e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '2nd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '3rd e of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '1st w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '2nd w of 1st N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'e', '1st e of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '1st w of 2nd N');
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '2nd w of 2nd N');
commit;

SO the data I've is
select * from a;
-------------------------
N C V

1 e 1st e of 1st N
1 e 2nd e of 1st N
1 e 3rd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N
2 w 2nd w of 2nd N

---------------------------------------
And the output I'm looking for is

1 e 1st e of 1st N
1 e 2nd e of 1st N
1 w 1st w of 1st N
1 w 2nd w of 1st N
2 e 1st e of 2nd N
2 w 1st w of 2nd N


So basically I need the first pairs of (e-w/w-e) for each N.

I hope I'm clear here.

Thanks as usual in advance,

Tom Kyte

Followup  

August 11, 2004 - 12:40 pm UTC

do you have a field that can be "sorted on" for finding "1st, 2nd" and so on.

If not, there is no such thing as "first", or "third"





August 11, 2004 - 12:58 pm UTC

Reviewer: PJ

Tom,

Sorry if I was not clear.
We need to pick pairs for each N. For N=1 we have 5 rows, so we have to pick 4 rows, leaving 1 unpaired "e" out.
We want the data in the same order as it is in the table. We can sort it by --> order by N, C

Tom Kyte

Followup  

August 11, 2004 - 1:58 pm UTC

ops$tkyte@ORA920> select n, c, rn, cnt2
  2    from (
  3  select n, c, rn,
  4             min(cnt) over (partition by n) cnt2
  5    from (
  6  select n, c,
  7             row_number() over (partition by n, c order by c) rn,
  8             count(*) over (partition by n, c) cnt
  9    from a
 10             )
 11             )
 12   where rn <= cnt2
 13  /
 
         N C         RN       CNT2
---------- - ---------- ----------
         1 e          1          2
         1 e          2          2
         1 w          1          2
         1 w          2          2
         2 e          1          1
         2 w          1          1
 
6 rows selected.
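The logic of the query, sketched hypothetically in Python: count the rows per (N, C), take the smallest count within each N as the number of complete pairs (Tom's min(cnt) over (partition by n)), and keep only that many rows of each C.

```python
from collections import Counter

def first_pairs(rows):
    """rows: (n, c) with c in {'e', 'w'}. Keep only as many rows of each c
    per n as can be paired -- the rn <= min(cnt) filter from the SQL."""
    counts = Counter(rows)                       # count(*) per (n, c)
    limit = {}
    for (n, c), cnt in counts.items():
        limit[n] = min(limit.get(n, cnt), cnt)   # min count over the c's per n
    seen, kept = Counter(), []
    for n, c in rows:                            # row_number() per (n, c)
        seen[(n, c)] += 1
        if seen[(n, c)] <= limit[n]:
            kept.append((n, c))
    return kept
```

For N=1 (three e's, two w's) this keeps two of each, dropping the unpaired e, and for N=2 it keeps one of each.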
 

Brilliant as usual !!

August 11, 2004 - 2:04 pm UTC

Reviewer: A reader


PJ's query

August 11, 2004 - 2:04 pm UTC

Reviewer: Kevin from St. Louis

PJ - you can drop the column 'v' from your table, and just use this query (which I think will answer your question using N and C alone, and generate an appropriate 'v' as it runs).

CREATE TABLE b
(
N NUMBER,
C CHAR(1)
)


INSERT INTO b ( N, C ) VALUES ( 1, 'e');

INSERT INTO b ( N, C ) VALUES ( 1, 'e');

INSERT INTO b ( N, C ) VALUES ( 1, 'e');

INSERT INTO b ( N, C ) VALUES ( 1, 'w');

INSERT INTO b ( N, C ) VALUES ( 1, 'w');

INSERT INTO b ( N, C ) VALUES ( 2, 'e');

INSERT INTO b ( N, C ) VALUES ( 2, 'w');

INSERT INTO b ( N, C ) VALUES ( 2, 'w');

COMMIT;


SELECT n,c,v1
FROM (
SELECT lag (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1) c3,
lead (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1)c4,
c1 ||
CASE WHEN c1 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c1,10),1,'st',2,'nd',3,'rd','th')
END || ' ' || c || ' of ' || c2 ||
CASE WHEN c2 BETWEEN 10 AND 20
THEN 'th'
ELSE DECODE(MOD(c2,10),1,'st',2,'nd',3,'rd','th')
END || ' N' v1,
t1.*
FROM (
SELECT b.*,
row_number() OVER (PARTITION BY n, c ORDER BY n,c) c1,
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
FROM b
) t1
) t2
WHERE c3 IS NOT NULL OR c4 IS NOT NULL
/
Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N

INSERT INTO b ( N, C ) VALUES ( 1, 'w');

COMMIT;

Results:
N C V1
1 e 1st e of 1st N
1 w 1st w of 1st N
1 e 2nd e of 1st N
1 w 2nd w of 1st N
1 e 3rd e of 1st N
1 w 3rd w of 1st N
2 e 1st e of 1st N
2 w 1st w of 1st N


oops

August 11, 2004 - 2:12 pm UTC

Reviewer: Kevin from St. Louis

replace
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
with
DENSE_RANK() OVER (PARTITION BY c ORDER BY c) c2

my bad.

August 11, 2004 - 3:27 pm UTC

Reviewer: A reader

Your bad what?
toe? leg?


Cool....

August 12, 2004 - 7:25 am UTC

Reviewer: PJ


analytic q

October 22, 2004 - 6:34 pm UTC

Reviewer: A reader

First the schema:

scott@ORA92I> drop table t1;

Table dropped.

scott@ORA92I> create table t1( catg1 varchar2(10), catg2 varchar2(10), total number );

Table created.

scott@ORA92I>
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 5 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 6 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 9 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V2', 'T2', 10 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V3', 'T1', 11 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V4', 'T1', 1 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V5', 'T2', 2 );

1 row created.

scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V6', 'T2', 3 );

1 row created.

The catg2 column can only take two values, 'T1' and 'T2'.
I want to sum the total column for each catg1, catg2
combination and order the combinations by that sum. Then
I want to list the top 3 catg1, catg2 combinations
based on their sums of the total column.

If there are more than 3 such combinations then I
club the remaining ones into a catg1 value of 'Others'.

my first cut solution is:

scott@ORA92I> select catg1, catg2, sum( total_sum )
2 from
3 (
4 select case
5 when dr > 3 then
6 'Others'
7 when dr <= 3 then
8 catg1
9 end catg1,
10 catg2,
11 total_sum
12 from
13 (
14 select catg1, catg2, total_sum,
15 dense_rank() over( order by total_sum desc) dr
16 from
17 (
18 select catg1, catg2, sum( total ) total_sum
19 from t1
20 group by catg1, catg2
21 )
22 )
23 )
24 group by catg1, catg2;

CATG1 CATG2 SUM(TOTAL_SUM)
---------- ---------- --------------
V1 T1 20
V2 T2 10
V3 T1 11
Others T1 1
Others T2 5


Does it look ok or do you have a better solution?

Thank you as always.


Tom Kyte

Followup  

October 23, 2004 - 9:36 am UTC

you could skip a layer of inline view, but it looks fine as is.
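For readers who want to see the "one inline view fewer" shape concretely, here is a sketch of the same query using Python's sqlite3 as a stand-in for Oracle (SQLite also supports DENSE_RANK), with the CASE folded directly over the ranked sums. The data is the set posted above; the alias catg1_out is mine, to keep the name unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (catg1 TEXT, catg2 TEXT, total NUMBER);
    INSERT INTO t1 VALUES
        ('V1','T1',5), ('V1','T1',6), ('V1','T1',9),
        ('V2','T2',10), ('V3','T1',11),
        ('V4','T1',1), ('V5','T2',2), ('V6','T2',3);
""")

# Rank the (catg1, catg2) sums, relabel everything past the top 3
# as 'Others', then re-aggregate -- one inline view fewer than the
# original because the CASE sits directly over the ranked sums.
rows = conn.execute("""
    SELECT CASE WHEN dr <= 3 THEN catg1 ELSE 'Others' END AS catg1_out,
           catg2,
           SUM(total_sum) AS total_sum
      FROM (SELECT catg1, catg2, total_sum,
                   DENSE_RANK() OVER (ORDER BY total_sum DESC) dr
              FROM (SELECT catg1, catg2, SUM(total) total_sum
                      FROM t1
                     GROUP BY catg1, catg2))
     GROUP BY catg1_out, catg2
     ORDER BY catg1_out, catg2
""").fetchall()
for r in rows:
    print(r)
```

The output matches the poster's: V1, V2 and V3 survive, the rest collapse into 'Others'.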

thanx!

October 24, 2004 - 12:37 pm UTC

Reviewer: A reader


SQL query

November 03, 2004 - 1:45 pm UTC

Reviewer: Reader from USA

I have a table which stores receipts against Purchase Orders. The users want the following output:
For each of the months of Jan, Feb and March 2004, provide a count of number of receipts which fall in each of the following Dollar value range
< $5000
Between $5000 to $9999
> $10,000

(There can be a number of receipts against one Purchase Order, so those need to be grouped together first.)

I wrote this query using an inline view which is the UNION of 3 SQLs, one for each dollar range.
However, I am sure there is a more elegant and efficient method to do this, maybe using analytic functions, CASE, decode.... I'd appreciate your help.

Thanks

Tom Kyte

Followup  

November 05, 2004 - 10:49 am UTC

select trunc(date_col,'mm') Month,
count( case when amt < 5000 then 1 end ) "lt 5000",
count( case when amt between 5000 and 9999 then 1 end ) "between 5/9k",
count( case when amt >= 10000 then 1 end ) "10k or more"
from t
where date_col between :a and :b
group by trunc(date_col,'mm')


single pass....
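The trick relies on COUNT() ignoring NULLs and on a CASE with no ELSE yielding NULL. Below is a runnable sketch using Python's sqlite3; since the original receipts table was never posted, the table `t`, the columns `date_col` and `amt`, and the sample rows are hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (date_col TEXT, amt NUMBER);
    INSERT INTO t VALUES
        ('2004-01-05', 1200), ('2004-01-20', 7500),
        ('2004-02-02', 15000), ('2004-02-14', 4999),
        ('2004-03-09', 9999),  ('2004-03-10', 10000);
""")

# COUNT() ignores NULLs, and a CASE with no ELSE yields NULL, so
# each COUNT(CASE ...) counts only the rows in its dollar band --
# the whole report is a single pass over the table.
rows = conn.execute("""
    SELECT strftime('%Y-%m', date_col)                           AS month,
           COUNT(CASE WHEN amt < 5000 THEN 1 END)                AS lt_5000,
           COUNT(CASE WHEN amt BETWEEN 5000 AND 9999 THEN 1 END) AS between_5_9k,
           COUNT(CASE WHEN amt >= 10000 THEN 1 END)              AS gte_10k
      FROM t
     GROUP BY month
     ORDER BY month
""").fetchall()
for r in rows:
    print(r)
```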

Great -

November 10, 2004 - 7:09 am UTC

Reviewer: syed from UK

Tom

I have a table as follows

create table matches
( reference varchar2(9),
  endname varchar2(20),
  beginname varchar2(30),
  DOB date, 
  ni varchar2(9)
)
/


insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A1','SMITH','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A2','JONES','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A3','JONES','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A4','BROWN','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234570');
insert into matches values ('A4','JONES','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234571');
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A3','JACKSON','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A2','JACKSON','BOB',to_date('1/1/1962','dd/mm/yyyy'),'AA1234568');
 insert into matches values ('A5','JACKSON','TOM',to_date('1/1/1920','dd/mm/yyyy'),'AA1234569');
commit;

SQL> select rownum,REFERENCE,ENDNAME,BEGINNAME,DOB,NI from matches;

 ROWNUM REFERENCE ENDNAME  BEGINNAME  DOB       NI
------- --------- -------- ---------- --------- ---------
      1 A1        SMITH    BOB        01-JAN-76 AA1234567
      2 A1        SMITH    TOM        01-JAN-70 AA1234568
      3 A2        JONES    TOM        01-JAN-70 AA1234568
      4 A3        JONES    TOM        01-JAN-71 AA1234569
      5 A4        BROWN    BRAD       01-JAN-61 AA1234570
      6 A4        JONES    BRAD       01-JAN-61 AA1234571
      7 A1        SMITH    BOB        01-JAN-76 AA1234567
      8 A3        JACKSON  TOM        01-JAN-71 AA1234569
      9 A2        JACKSON  BOB        01-JAN-62 AA1234568
     10 A5        JACKSON  TOM        01-JAN-20 AA1234569

I need to show duplicates where the following columns values are the same.

a) REFERENCE, ENDNAME,BEGINNAME,DOB,NI
b) ENDNAME,BEGINNAME,NI
c) REFERENCE,NI

So, 
rownum 1 and 7 match criteria a)
rownum 8 and 10 match criteria b) 
rownum 1 and 7, rownum 3 and 9, rownum 4 and 8 match criteria c)

How can I select this data out to show number matching each criteria ?
 

Tom Kyte

Followup  

November 10, 2004 - 7:23 am UTC

"How can I select this data out to show number matching each criteria ?"

is ambiguous.


If you add columns:

count(*) over (partition by reference, endname, beginname, dob, ni ) cnt1,
count(*) over (partition by endname, beginname, ni) cnt2,
count(*) over (partition by reference,ni) cnt3


it'll give you the "dup count" by each partition -- technically showing you the "number matching each criteria"
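To make the three COUNT(*) OVER columns concrete, here is the matches data run through them in Python's sqlite3 (a stand-in for Oracle; the dates become ISO strings). Any row whose count exceeds 1 is a duplicate under that criterion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (reference TEXT, endname TEXT, beginname TEXT,
                          dob TEXT, ni TEXT);
    INSERT INTO matches VALUES
        ('A1','SMITH','BOB','1976-01-01','AA1234567'),
        ('A1','SMITH','TOM','1970-01-01','AA1234568'),
        ('A2','JONES','TOM','1970-01-01','AA1234568'),
        ('A3','JONES','TOM','1971-01-01','AA1234569'),
        ('A4','BROWN','BRAD','1961-01-01','AA1234570'),
        ('A4','JONES','BRAD','1961-01-01','AA1234571'),
        ('A1','SMITH','BOB','1976-01-01','AA1234567'),
        ('A3','JACKSON','TOM','1971-01-01','AA1234569'),
        ('A2','JACKSON','BOB','1962-01-01','AA1234568'),
        ('A5','JACKSON','TOM','1920-01-01','AA1234569');
""")

# One pass: each COUNT(*) OVER tags every row with how many rows
# share its key under that criterion; a count > 1 means "duplicated".
rows = conn.execute("""
    SELECT reference, ni,
           COUNT(*) OVER (PARTITION BY reference, endname,
                          beginname, dob, ni)         AS cnt1,
           COUNT(*) OVER (PARTITION BY endname,
                          beginname, ni)              AS cnt2,
           COUNT(*) OVER (PARTITION BY reference, ni) AS cnt3
      FROM matches
     ORDER BY reference, ni, endname
""").fetchall()
for r in rows:
    print(r)
```

Criterion a) flags the two identical A1/SMITH/BOB rows, b) the two JACKSON/TOM/AA1234569 rows, and c) three pairs sharing reference and ni.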

analytics problem

November 19, 2004 - 9:37 am UTC

Reviewer: David from United Kingdom

I am newish to analytic functions and have hit a problem as follows:-

create table a
(accno         number(8)     not null,
 total_paid    number(7,2)   not null)
/

create table b
(accno         number(8)     not null,
 due_date      date          not null,
 amount_due    number(7,2)   not null)
/

insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);

insert into b values (1, '01-oct-04', 1000);
insert into b values (1, '01-jan-05', 900);
insert into b values (1, '01-apr-05', 700);

insert into b values (2, '01-oct-04', 1000);
insert into b values (2, '01-jan-05', 900);
insert into b values (2, '01-apr-05', 700);

insert into b values (3, '01-oct-04', 1000);
insert into b values (3, '01-jan-05', 900);
insert into b values (3, '01-apr-05', 700);

insert into b values (4, '01-oct-04', 1000);
insert into b values (4, '01-jan-05', 900);
insert into b values (4, '01-apr-05', 700);

If I then do this query...

SQL> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  order by a.accno,
 21           to_date(b.due_date, 'dd-mon-rr')
 22  /

     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000       1000
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700
         2       1500 01-OCT-04       1000       1000
         2       1500 01-JAN-05        900        900
         2       1500 01-APR-05        700        700
         3       2000 01-OCT-04       1000       1000
         3       2000 01-JAN-05        900        900
         3       2000 01-APR-05        700        700
         4       3000 01-OCT-04       1000       1000
         4       3000 01-JAN-05        900        900
         4       3000 01-APR-05        700        700

12 rows selected.

...TO_PAY does not give what I was expecting. But if I run it for an individual accno I get what I'm after:-

SQL> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  and a.accno = &accno
 21  order by a.accno,
 22           to_date(b.due_date, 'dd-mon-rr')
 23  /
Enter value for accno: 1
old  20: and a.accno = &accno
new  20: and a.accno = 1

     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000          0
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700

3 rows selected.

SQL> /
Enter value for accno: 2
old  20: and a.accno = &accno
new  20: and a.accno = 2

     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         2       1500 01-OCT-04       1000          0
         2       1500 01-JAN-05        900        400
         2       1500 01-APR-05        700        700

3 rows selected.

SQL> /
Enter value for accno: 3
old  20: and a.accno = &accno
new  20: and a.accno = 3

     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         3       2000 01-OCT-04       1000          0
         3       2000 01-JAN-05        900          0
         3       2000 01-APR-05        700        600

3 rows selected.

SQL> /
Enter value for accno: 4
old  20: and a.accno = &accno
new  20: and a.accno = 4

     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         4       3000 01-OCT-04       1000          0
         4       3000 01-JAN-05        900          0
         4       3000 01-APR-05        700          0

3 rows selected.

What is needed for first query above to work?

cheers,
David 

Tom Kyte

Followup  

November 19, 2004 - 11:31 am UTC

ops$tkyte@ORA9IR2> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (<b>partition by a.accno</b> order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  order by a.accno,
 21           to_date(b.due_date, 'dd-mon-rr')
 22  /
 
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000          0
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700
         2       1500 01-OCT-04       1000          0
         2       1500 01-JAN-05        900        400
         2       1500 01-APR-05        700        700
         3       2000 01-OCT-04       1000          0
         3       2000 01-JAN-05        900          0
         3       2000 01-APR-05        700        600
         4       3000 01-OCT-04       1000          0
         4       3000 01-JAN-05        900          0
         4       3000 01-APR-05        700          0
 
12 rows selected.
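The essential fix is the PARTITION BY accno, which restarts the running total of dues for each account. A sketch of the same logic in Python's sqlite3, with two simplifications of mine: ISO date strings instead of Oracle DATEs, and the third WHEN arm folded into ELSE (equivalent here because total_paid is never negative).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (accno INTEGER, total_paid NUMBER);
    CREATE TABLE b (accno INTEGER, due_date TEXT, amount_due NUMBER);
    INSERT INTO a VALUES (1,1000), (2,1500), (3,2000), (4,3000);
    INSERT INTO b VALUES
        (1,'2004-10-01',1000), (1,'2005-01-01',900), (1,'2005-04-01',700),
        (2,'2004-10-01',1000), (2,'2005-01-01',900), (2,'2005-04-01',700),
        (3,'2004-10-01',1000), (3,'2005-01-01',900), (3,'2005-04-01',700),
        (4,'2004-10-01',1000), (4,'2005-01-01',900), (4,'2005-04-01',700);
""")

# The running total of dues must restart for every account, hence
# PARTITION BY accno; without it one cumulative sum runs across ALL
# accounts and the CASE arms compare against the wrong balance.
rows = conn.execute("""
    SELECT a.accno, b.due_date,
           CASE
             WHEN SUM(b.amount_due) OVER w - a.total_paid <= 0 THEN 0
             WHEN SUM(b.amount_due) OVER w - a.total_paid < b.amount_due
               THEN SUM(b.amount_due) OVER w - a.total_paid
             ELSE b.amount_due
           END AS to_pay
      FROM a JOIN b ON a.accno = b.accno
    WINDOW w AS (PARTITION BY a.accno ORDER BY b.due_date)
     ORDER BY a.accno, b.due_date
""").fetchall()
for r in rows:
    print(r)
```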
 
 

excellent

November 19, 2004 - 12:02 pm UTC

Reviewer: David from UK

many thanks

Limitation of Analytic Functions

December 16, 2004 - 4:27 am UTC

Reviewer: Nilanjan Ray from India

I am using the following view
create or replace view vw_history as
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history

The following SQL statement invariably does a full table scan on the 11,286,191 rows of ADC_HISTORY and runs for 20-25 mins.

select *
from vw_history
where txm_dt between to_date('01/01/2002','dd/mm/yyyy') and to_date('01/01/2002','dd/mm/yyyy');

The query returns 4,200 rows. ADC_HISTORY has 11,286,191 rows. I have the following indexes: ADC_HISTORY_IDX8 on txm_dt and ADC_HISTORY_IDX1 on spot_key columns. Both have good selectivities.

But when the required query is run without the view, it properly uses the index ADC_HISTORY_IDX8

select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history

I had raised a tar and it says:This is the expected behaviour "PREDICATES ARE NOT PUSHED IN THE VIEW IF ANY ANALYTIC FUNCTIONS ARE USED"

Is there any way to work around this limitation? I just cannot think of the painful situation if I am unable to use views with analytics!!!!

Your help is absolutely necessary. Thanks in advance

Tom Kyte

Followup  

December 16, 2004 - 8:27 am UTC

guess what -- your two queries <b>return different answers</b>..


did you consider that?  did you check that?  

they are TOTALLY DIFFERENT.  Analytics are applied after predicates.  The view -- it has no predicate.  The query -- it has a predicate.  You'll find that you have DIFFERENT result sets.

don't you see that as a problem?

It is not that you are "unable to use views"

It is that "when I use a view, I get answer 1, when I do not use a view, I get answer 2"

which answer is technically correct here?


Think about it.


consider this example (using RBO just to make it so that "if an index could be used it would" to stress the point):


ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
 
Table created.
 
ops$tkyte@ORA9IR2> create index job_idx on emp(job);
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace view v
  2  as
  3  select ename, sal, job,
  4         sum(sal) over (partition by job) sal_by_job,
  5             sum(sal) over (partition by deptno) sal_by_deptno
  6    from emp
  7  /
 
View created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace on explain
ops$tkyte@ORA9IR2> select *
  2    from v
  3   where job = 'CLERK'
  4  /
 
ENAME             SAL JOB       SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER           1300 CLERK           4150          8750
JAMES             950 CLERK           4150          9400
SMITH             800 CLERK           4150         10875
ADAMS            1100 CLERK           4150         10875
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   VIEW OF 'V'
   2    1     WINDOW (SORT)
   3    2       WINDOW (SORT)
   4    3         TABLE ACCESS (FULL) OF 'EMP'
 
 
<b>so, one might ask "well -- hey, I've got that beautiful index on JOB, I said where job = 'CLERK', what's up with that full scan?"

in fact, when I do it "right" -- without the evil view:</b>

 
ops$tkyte@ORA9IR2> select ename, sal, job,
  2         sum(sal) over (partition by job) sal_by_job,
  3             sum(sal) over (partition by deptno) sal_by_deptno
  4    from emp
  5   where job = 'CLERK'
  6  /
 
ENAME             SAL JOB       SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER           1300 CLERK           4150          1300
SMITH             800 CLERK           4150          1900
ADAMS            1100 CLERK           4150          1900
JAMES             950 CLERK           4150           950
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   WINDOW (SORT)
   2    1     WINDOW (SORT)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
   4    3         INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
 
<b>it very rapidly uses my index !!!   stupid views...

but wait.

what's up with SAL_BY_DEPTNO? That appears to be wrong... hmmm, what happened?

What happened was we computed the sal_by_deptno in the query without the view AFTER doing "where job = 'CLERK'"


YOU are doing your LAG() analysis AFTER applying the predicate.  Your lags in your query without the view -- they are pretty much "not accurate"


Note that when the predicate CAN be pushed:</b>
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select ename, sal, sal_by_job
  2    from v
  3   where job = 'CLERK'
  4  /
 
ENAME             SAL SAL_BY_JOB
---------- ---------- ----------
SMITH             800       4150
ADAMS            1100       4150
JAMES             950       4150
MILLER           1300       4150
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   VIEW OF 'V'
   2    1     WINDOW (BUFFER)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
   4    3         INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
 

<b>it most certainly is.  here the predicate can safely be pushed -- since the analytic is computed "by job", a predicate on "job" can be applied FIRST and then the analytic computed.  

When pushing would change the answer -- we cannot do it.

When pushing the predicate would not change the answer -- we do it.


This is not a 'limitation', this is about "getting the right answer"</b>
 
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> alter session set optimizer_mode = choose;
 
Session altered.
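The sal_by_deptno point can be reproduced outside Oracle too. The sketch below (Python's sqlite3, over a hand-picked subset of scott.emp) runs the window both ways: over the whole table with the filter applied afterwards (the view's answer), and with the filter applied first (what pushing the predicate would do). MILLER's department total differs because KING is filtered out in the second form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal NUMBER, job TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?,?,?,?)", [
    ('SMITH',   800, 'CLERK',     20), ('ADAMS', 1100, 'CLERK',   20),
    ('JAMES',   950, 'CLERK',     30), ('MILLER',1300, 'CLERK',   10),
    ('KING',   5000, 'PRESIDENT', 10), ('BLAKE', 2850, 'MANAGER', 30),
])

# Window over the WHOLE table, filter afterwards (what the view does):
whole = conn.execute("""
    SELECT ename, sal_by_deptno FROM (
        SELECT ename, job,
               SUM(sal) OVER (PARTITION BY deptno) sal_by_deptno
          FROM emp)
     WHERE job = 'CLERK' ORDER BY ename
""").fetchall()

# Filter FIRST, then window (what pushing the predicate would do):
pushed = conn.execute("""
    SELECT ename,
           SUM(sal) OVER (PARTITION BY deptno) sal_by_deptno
      FROM emp
     WHERE job = 'CLERK' ORDER BY ename
""").fetchall()

# MILLER's dept 10 total includes KING only before the filter, so
# the two orderings genuinely return different numbers.
print(whole)
print(pushed)
```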
 

 

Great!!!

December 17, 2004 - 12:59 pm UTC

Reviewer: Nilanjan Ray from India

Simply amazing explanation. Cleared my doubts still further. One of the best explanations, in simple concise terms, I have seen on "Ask Tom". You know what, people should take enough caution and learn lessons from you before making misleading statements like "...LIMITATIONS...". In your terms yet again "Analytics Rock".

Regards

Using analytical function, LEAD, LAG

December 24, 2004 - 9:26 am UTC

Reviewer: Praveen from Bangalore

Hi Tom,

The analytic functions LEAD (and LAG) accept the offset parameter as an integer, a count of rows to skip from the current row before reaching the leading/lagging row. What if I want to access the leading row based on a column value of the current row, i.e. a function applied to the current row's column value determines which leading row to access?

As an example: I have a table

create table t(id integer, dt date);

For each id, start with the first record, after ordering by dt ASC. Get the next record where dt = 10 min + first_row.dt. Then the next record where dt = 20 min + first_row.dt, and so on. Each time the offset is cumulatively increased by 10 min.

Suppose we don't get an exact match from the next record (i.e. next_row.dt <> first_row.dt + 10 min, say); then we select the row closest to the expected record, but lying within +/- 10 seconds.

insert into t values (1, to_date('12/20/2004 00:00:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.

insert into t values (1, to_date('12/20/2004 00:05:00', 'mm/dd/yyyy hh24:mi:ss'));

insert into t values (1, to_date('12/20/2004 00:09:55', 'mm/dd/yyyy hh24:mi:ss'));

insert into t values (1, to_date('12/20/2004 00:10:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.

insert into t values (1, to_date('12/20/2004 00:15:00', 'mm/dd/yyyy hh24:mi:ss'));

insert into t values (1, to_date('12/20/2004 00:19:54', 'mm/dd/yyyy hh24:mi:ss')); --Not selected.

insert into t values (1, to_date('12/20/2004 00:19:55', 'mm/dd/yyyy hh24:mi:ss')); --Selected.

insert into t values (1, to_date('12/20/2004 00:25:00', 'mm/dd/yyyy hh24:mi:ss'));

insert into t values (1, to_date('12/20/2004 00:30:05', 'mm/dd/yyyy hh24:mi:ss')); --Selected.

insert into t values (1, to_date('12/20/2004 00:30:06', 'mm/dd/yyyy hh24:mi:ss')); --Not Selected.

insert into t values (1, to_date('12/20/2004 00:35:00', 'mm/dd/yyyy hh24:mi:ss'));

insert into t values (1, to_date('12/20/2004 00:39:55', 'mm/dd/yyyy hh24:mi:ss')); --Either this or below record is selected.

insert into t values (1, to_date('12/20/2004 00:40:05', 'mm/dd/yyyy hh24:mi:ss')); --Either this or above record is selected.

My output would be:
id dt
-----------
1 12/20/2004 00:00:00 AM
1 12/20/2004 00:10:00 AM --Exactly matches first_row.dt + 10min
1 12/20/2004 00:19:55 AM --Closest to first_row.dt + 20min +/- 10sec
1 12/20/2004 00:30:05 AM --Closest to first_row.dt + 30min +/- 10sec
1 12/20/2004 00:39:55 AM OR 12/20/2004 00:40:05 AM --Closest to first_row.dt + 40min +/- 10sec

The method I followed, after failed using LEAD is:

Step#1
------
Get a subset of dt values: 10 min cumulative steps from the dt value of the first row (after rounding down to the nearest multiple of 10 minutes).
In this example I will get a subset:

12/20/2004 00:00:00 AM
12/20/2004 00:10:00 AM
12/20/2004 00:20:00 AM
12/20/2004 00:30:00 AM
12/20/2004 00:40:00 AM

This query will do it:

SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1

Step#2:
-------
This subquery is joined with table t to get only those records from t which are either equal to the dts in the resultset returned by the subquery or fall within the range 10 min +/- 10 sec (not the closest only, but all).

SELECT t.id, dt_rounded, ABS (t.dt - dt_rounded) * 24 * 60 * 60 dt_diff_in_sec
FROM t,
(SELECT t1.id,
( min_dt - MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
FROM (SELECT id, MIN (dt) min_dt,
ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
FROM t
WHERE id = 1
GROUP BY id) t1, t
WHERE ROWNUM <= max_rows + 1) t2
WHERE t.id = 1
AND ABS (t.dt - dt_rounded) * 24 * 60 * 60 <= 10
ORDER BY t.id, dt_rounded, dt_diff_in_sec;

I agree, this resultset will include duplicate records which I need to remove procedurally, while looping through the cursor; the order by clause simplifies this.

Now you might have guessed the problem. If table t contains more than 1000 records, the query asks me to wait at least 2 min! And that too when I am planning to put in at least 70,000 records!

I wrote a procedure which handles the situation a little better. But I don't know if an analytic query can help me bring back the performance. I could do it if LEAD had the functionality I mentioned in the first paragraph. Do you have any hints?

Thanks and regards

Praveen

Tom Kyte

Followup  

December 24, 2004 - 9:54 am UTC

you'd be looking at first_value with range windows, not lag and lead in this case.



Windowing clause and range function.

December 25, 2004 - 1:29 pm UTC

Reviewer: Praveen from Bangalore

Hi Tom,
Thank you for the suggestion. I am not very familiar with analytic queries. I tried, based on your advice, but was unable to even get started. I am stuck at the first step itself - specifying the range in the windowing clause. In the windowing clause, we specify an integer to get the preceding rows based on the current column value (CLARK's example, Page: 556, Analytic Functions).

In my above example I wrote a query which contains:

FIRST_VALUE(id)
OVER (ORDER BY dt DESC
RANGE 10 PRECEDING)

10, in the windowing clause, will give me records that fall within 10 days preceding the current row. But I need records 10 minutes preceding. Also, at the same time, all those records that fall within +/- 10 sec, if exact 10-minute-later records are not found (please see the description of the problem given in the previous question).

Kindly give me a more clear picture about windowing clause.
Also how you will approch the above problem.

Thanks and regards

Praveen

Tom Kyte

Followup  

December 26, 2004 - 12:19 pm UTC

do you have Expert One on One Oracle? I have extensive examples in there.


range 10 = 10 days.

range 10/24 = 10 hours

range 10/24/60 = 10 minutes......
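In Oracle, date arithmetic is in days, so those fractions are the window widths directly (e.g. RANGE 10/24/60 PRECEDING). As a runnable illustration, here is a 10-minute (600-second) range window in Python's sqlite3; SQLite wants a numeric ordering key for RANGE frames, so the sketch orders by epoch seconds instead of day fractions. The timestamps are invented.

```python
import sqlite3
import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT)")
base = datetime.datetime(2004, 12, 20)
for minutes in (0, 5, 10, 15, 25):
    conn.execute("INSERT INTO t VALUES (1, ?)",
                 ((base + datetime.timedelta(minutes=minutes)).isoformat(sep=' '),))

# RANGE frames need a numeric ordering key; epoch seconds play the
# role that day-fraction arithmetic (10/24/60 = 10 minutes) plays in
# Oracle.  Here: earliest row within the last 10 minutes, inclusive.
rows = conn.execute("""
    SELECT dt,
           FIRST_VALUE(dt) OVER (
               ORDER BY strftime('%s', dt) + 0
               RANGE BETWEEN 600 PRECEDING AND CURRENT ROW) AS win_start
      FROM t ORDER BY dt
""").fetchall()
for r in rows:
    print(r)
```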




I do have Expert One on One

December 26, 2004 - 2:24 pm UTC

Reviewer: Praveen from Bangalore

Hi Tom,

I got my first glimpse into analytic queries through your book only. Although I had attempted to learn them through the Oracle documentation a couple of times earlier, I never was able to write a decent query using analytic functions. Now, after spending a few hours with your book, I can see that these functions are not as complex as I thought earlier.

The 'hiredate' example you have given in the book is calculating in terms of days. (Pg:555)

"select ename, sal, hiredate, hiredate-100 window_top,
first_value(ename)
over(order by hiredate asc
range 100 preceding) ename_prec,...."

I got the hint from your follow-up. I should have to think a little myself.

Thank you, Tom,

Praveen.

December 26, 2004 - 5:49 pm UTC

Reviewer: A reader

Tom,

Any dates when you would be releasing your book on Analytic?

Thanks.

Tom Kyte

Followup  

December 26, 2004 - 6:00 pm UTC

doing a 2nd edition of Expert One on One Oracle now -- not on the list yet.

Great answer!

December 27, 2004 - 2:21 am UTC

Reviewer: Shimon Tourgeman

Dear Tom,
Could you please tell us when you are going to publish the next edition of your books, covering 9iR2 and maybe 10g, as you stated here?

Merry Christmas and a Happy New Year!
Shimon.


Tom Kyte

Followup  

December 27, 2004 - 10:06 am UTC

sometime in 2005, but not the first 1/2 :)

Using range windows

January 03, 2005 - 8:09 am UTC

Reviewer: Praveen from Bangalore

Hi Tom,

Please allow me to explain the problem again which you had
followed up earlier (Please refer: "Using analytical
function, LEAD, LAG"). In the table t(id integer, dt date)
I have records which only differ by seconds ('dt' column).
Could you please help me to write a query to create windows
such that each window groups records based on the
expression 590 <= dt_1 <= 610 (590 & 610 are date
difference between first record and current record in
seconds and dt1 is the 'dt' column value of first record in
each window after ordering by 'id' and 'dt' ASC).
The idea is to find a record following the first record
which leads by 10 minutes. If exact match is not found
apply a tolerance of +/-10 seconds. Once the nearest match
is found (if multiple matches are found, select any), start
from the next record and repeat the process. (Please see
the scripts I had given earlier).

In your follow up, you had suggested the use of
first_value() analytical function with range windows. But
it looks like it is pretty difficult to generate the kind
of windows I specified above. And in your book, examples of
such complex nature where not given (pardon me for being
critical).

Your answer will help me to get a deeper and practical
understanding of analytical functions while at the same
time may help us to bring down a 12 hour procedure to less
than 5 hours.

Thanks and regards


Praveen

Tom Kyte

Followup  

January 03, 2005 - 9:11 am UTC

no idea what 590 is. days? hours? seconds?

sorry - this doesn't compute to me.

590 <= dt_1 <= 610???



Delete Records Older Than 90 Days While Keeping Max

January 03, 2005 - 10:24 am UTC

Reviewer: Mac

There is a DATE column in a table. I need to delete all records older than 90 days -- except if the newest record for a unique key happens to be older than 90 days, I want to keep it and delete the prior records for that key value.

How?

Tom Kyte

Followup  

January 03, 2005 - 10:26 am UTC

if the "newest record for a unique key"

if the key is unique.... then the date column is the only thing to be looked at?

that is, if the key is unique, then the oldest record is the newest record is in fact the only record....

Oops, but

January 03, 2005 - 11:01 am UTC

Reviewer: A reader

Sorry, forgot to mention that the DATE column is a part of the unique key.

Sorry, I went a bit fast...

January 03, 2005 - 2:00 pm UTC

Reviewer: Praveen from Bangalore

Hi Tom,

Sorry, I didn't explain properly.

590 = (10 minutes * 60) seconds - 10 seconds
610 = (10 minutes * 60) seconds + 10 seconds

Here I am looking for a record (say rn) exactly
600 sec (10 min) later than the first record in
the range window. If I don't get an exact match,
I try to find the record which is closest to rn
but lies within a range 10 seconds less than or
more than rn.

And the condition

"590 <= dt_1 <= 610" tries to eliminate all other
records inside the range window that do not follow
the above rule.

dt_1 is the dt column value of any row following the
first row in a given range window, such that the
difference between dt_1 and dt of first row is between
590 seconds and 610 seconds. I am interested in only
one record which lies closest to 600 seconds.

I hope, the picture is more clear to you now. As an
example,

id dt
-----------------------------
1 12/20/2004 00:00:00 AM --Range window #1
1 12/20/2004 00:09:55 AM
1 12/20/2004 00:10:00 AM --Selected (Closest to 12/20/2004 00:10:00 AM)

............................
1 12/20/2004 00:10:10 AM --Range window #2
1 12/20/2004 00:19:55 AM --Selected (Closest to 12/20/2004 00:20:00 AM)
1 12/20/2004 00:20:55 AM
............................
1 12/20/2004 00:20:55 AM --Range window #3
1 12/20/2004 00:25:00 AM --Nothing to select
1 12/20/2004 00:29:10 AM --Nothing to select
...........................
1 12/20/2004 00:30:05 AM --Range window #4
1 12/20/2004 00:39:55 AM --Either one is selected
1 12/20/2004 00:40:05 AM --Either one is selected
-----------------------------

Thanks and regards

Praveen



Tom Kyte

Followup  

January 03, 2005 - 10:24 pm UTC

that is first_value, last_value with a range window and the time range is

N * 1/24/60/60 -- for N seconds.
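Worth noting: the "closest to each 10-minute target, within +/- 10 seconds" selection is hard to express with window functions alone when each selected row anchors the next target, because the anchor depends on the previous selection. A plain greedy pass in Python over the sample rows from the question shows the intended output; it assumes (one reading of the spec) that each selected row, not the first row, anchors the next 10-minute target.

```python
import datetime as dt

rows = [dt.datetime(2004, 12, 20, 0, m, s) for m, s in
        [(0, 0), (5, 0), (9, 55), (10, 0), (15, 0), (19, 54), (19, 55),
         (25, 0), (30, 5), (30, 6), (35, 0), (39, 55), (40, 5)]]

STEP, TOL = dt.timedelta(seconds=600), dt.timedelta(seconds=10)

# Greedy single pass: from each selected row, look for the later row
# closest to +10 min, accepting only candidates within +/- 10 sec.
selected = [rows[0]]
i = 0
while True:
    target = selected[-1] + STEP
    candidates = [(abs(r - target), r) for r in rows[i + 1:]
                  if abs(r - target) <= TOL]
    if not candidates:
        break
    best = min(candidates)[1]   # smallest distance to the target wins
    selected.append(best)
    i = rows.index(best)
print([r.time().isoformat() for r in selected])
```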

How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?

January 13, 2005 - 3:11 pm UTC

Reviewer: jayaramj@quinnox.com

Hi Tom,

In answer to the question 'How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?' from reviewer Sergey (from Norway) in this post you have proposed the following solution:

ops$tkyte@ORA10G> select t,
2 sign_v,
3 v,
4 substr( max(data) over (order by t), 7 ) v2
5 from (
6 SELECT TD.T,
7 SIGN(NVL(TV.V, 0)) sign_v,
8 NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
9 case when tv.v is not null
10 then to_char( row_number()
over (order by td.t), 'fm000000' ) || tv.v
11 end data
12 FROM TD, TV
13 WHERE TV.T(+) = TD.T
14 )
15 ORDER BY T
16 ;

The problem is that this solution converts the data type of the column (in this case column TV.V) to a string (V2 in the result is a string). The result would then need to be converted back to the original data type.
It is best to avoid such data type conversion. Is there a solution to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS) in Oracle 9i without the datatype conversion?


Tom Kyte

Followup  

January 13, 2005 - 3:45 pm UTC

encode the date as a string using to_char( v, 'yyyymmddhh24miss' ) and, in the substr, convert it back out -- to_date( substr(...), 'yyyymmddhh24miss' )
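The trick behind the workaround is worth spelling out: prefix each non-null value with a fixed-width row number, take a running max of the tagged strings, then strip the prefix to recover the most recent non-null value. A plain-Python sketch of that idea (names and data are illustrative, not from the thread):

```python
# Mimic LAST_VALUE(... IGNORE NULLS) OVER (ORDER BY ...) the way the
# substr(max(data) over (order by t), 7) trick does: tag each non-null
# value with a zero-padded row position ('fm000000' || value), keep a
# running maximum of the tagged strings (later positions always compare
# larger), and strip the 6-character tag back off.

def last_value_ignore_nulls(values):
    out, running_max = [], None
    for i, v in enumerate(values):
        if v is not None:
            tagged = f"{i:06d}{v}"
            if running_max is None or tagged > running_max:
                running_max = tagged
        out.append(running_max[6:] if running_max is not None else None)
    return out

print(last_value_ignore_nulls([None, "a", None, None, "b", None]))
# -> [None, 'a', 'a', 'a', 'b', 'b']
```

As in the SQL version, the values must survive a round trip through strings, which is exactly the datatype-conversion objection Jay raises below.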

How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?

January 14, 2005 - 12:44 am UTC

Reviewer: Jay

In response to your post above - taking care of dates (for datatype conversion) is not complex (though timestamp variants would require a different format string). Object columns are a different story altogether; these cannot be easily converted to strings. Is there a better solution that does not require datatype conversion (and hence does not require any knowledge of the column datatype in this SQL)?

Tom Kyte

Followup  

January 14, 2005 - 8:06 am UTC

upgrade to 10g.

find prior collect_date to the max collect_date for each customer

January 25, 2005 - 4:30 pm UTC

Reviewer: JANE

Hello, Tom!
I work in ORACLE 8I.
I have a table with 2 columns: cstmr_no, collect_date
CREATE TABLE CSTMR_dates
(
CSTMR_NO NUMBER(8) NOT NULL,
COLLECT_DATE DATE NOT NULL);
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy');
How can I rewrite this query using an analytical
function:

select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no

In production I have thousands of records in the table. THANKS A LOT
JANE

Tom Kyte

Followup  

January 25, 2005 - 6:59 pm UTC

no idea what "return_collect_date" is. or where it comes from.

the sql is not sql...

Mistake: return_collect_date is collect_date

January 26, 2005 - 2:58 am UTC

Reviewer: JANE

Thank you for answer
JANE

Tom Kyte

Followup  

January 26, 2005 - 8:46 am UTC

but this sql:

select cstmr_no,max(collect_date) from
CSTMR_dates
where collect_date<(select max(COLLECT_DATE)
group by cstmr_no

is still not sql and I don't know if you want to

a) delete all old data BY CSTMR_NO (eg: keep just the record with the max(collect_date) BY CSTMR_NO

b) delete all data such that the collect_date is not equal to the max(collect_date)


I cannot suggest a way to rewrite an invalid sql query.

No, I want to do the following:

January 26, 2005 - 9:08 am UTC

Reviewer: A reader

I just have to present the data, without deleting anything.
For each cstmr I have to see:
cstmr_no max(collect_date) last prior date to max
======== ================= ======================
18 01/05/04 01/03/04
248 01/11/04 01/03/04

insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy');
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy');
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy');


Tom Kyte

Followup  

January 26, 2005 - 9:31 am UTC

wow, how we got from:

select cstmr_no,max(collect_date) from 
CSTMR_dates
where  collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no 

to this, well -- just "wow".  horse of a very different color.


I have to sort of guess -- maybe I'll get it right -- you want 

a) every cstmr_no, 
b) the last two dates recorded for them.


well, after editing your inserts to make them become actual sql that can run.... (you don't really use YY in real life do you? please please say "no, that was a mistake...")

ops$tkyte@ORA9IR2> select cstmr_no,
  2         max( decode(rn,1,collect_date) ) d1,
  3         max( decode(rn,2,collect_date) ) d2
  4    from (
  5  select cstmr_no,
  6         collect_date,
  7             row_number() over (partition by cstmr_no order by collect_date desc nulls last) rn
  8    from cstmr_dates
  9         )
 10   where rn <= 2
 11   group by cstmr_no
 12  /
 
  CSTMR_NO D1        D2
---------- --------- ---------
        18 01-MAY-04 01-MAR-04
       248 01-NOV-04 01-MAR-04
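The same top-two-dates logic can be sketched in plain Python (an illustration only; the rows mirror Jane's inserts, with the ranking step standing in for row_number() over (partition by cstmr_no order by collect_date desc)):

```python
from collections import defaultdict
from datetime import date

# Rank each customer's dates descending and keep ranks 1 and 2 --
# the same thing the row_number()/decode pivot above computes.

rows = [(18, date(2004, 2, 1)), (18, date(2004, 3, 1)), (18, date(2004, 5, 1)),
        (248, date(2004, 11, 1)), (248, date(2004, 2, 1)), (248, date(2004, 3, 1))]

by_cust = defaultdict(list)
for cstmr_no, collect_date in rows:
    by_cust[cstmr_no].append(collect_date)

top_two = {c: sorted(ds, reverse=True)[:2] for c, ds in by_cust.items()}
print(top_two[18])   # [date(2004, 5, 1), date(2004, 3, 1)]
```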
 

 

Lead/Lag and Indexes

February 22, 2005 - 6:12 pm UTC

Reviewer: Rob H from Winnipeg Canada

We are using the Lead and Lag functions and I have run into an issue of Index usage.

lets say I have 2 tables

select customer_account, prod_id, sales_date, total_sales from sales_table_NA

and

select customer_account, prod_id, sales_date, total_sales from sales_table_EUR

if i do a

create view eur_sales
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_EUR
group by customer_account, prod_id

create view na_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_NA
group by customer_account, prod_id, trunc(sales_date,'mon')

There are indexes on the tables for customer_account

Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove the lead function and UNION them, the index is used.

Any help?


Tom Kyte

Followup  

February 23, 2005 - 1:56 am UTC

do you really want UNION or UNION ALL.........

(do you know the difference between the two)....

if you had given me simple setup scripts, I would have been happy to see if that makes a difference, but oh well.

Potential Solution

February 22, 2005 - 6:54 pm UTC

Reviewer: Rob H from Winnipeg Canada

Rather than pre-sum the data into 2 views I found that union'ing (actually UNION ALL) the data, then sum and Lag works fine.
ie
select
customer_account, prod_id, sales_date month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by
customer_account, prod_id order by sales_date desc) sales_last
from(
select customer_account, prod_id, trunc(sales_date,'mon') sales_date, total_sales from sales_table_NA
union all
select customer_account, prod_id, trunc(sales_date,'mon') sales_date, total_sales from sales_table_EUR)
group by customer_account, prod_id, sales_date

Attitude....

February 23, 2005 - 9:54 am UTC

Reviewer: Rob H from Winnipeg Canada

What's the deal? Having a bad day? I'm sorry, but I assumed from the select statements you could infer the structure. Yes, I was using UNION ALL; yes, I know the difference (uh, feeling a bit rude are we?), but I didn't realize until after I posted that I missed that (a nice feature would be to be able to edit a post for a certain time after posting). I generalized the data structure and SQL for confidentiality reasons. For a guy who is so hard on people's IM speak, you forget to capitalize your sentences :)

Now, UNION vs UNION ALL didn't affect index usage (it did however have 'other' performance issues). You can see from my next post that I worked on the issue and resolved it by not pre-summing each table. With the new query, if someone issues a select with no 'where customer_account=' then it's slower (but that also wasn't the goal).

Thanks

Tom Kyte

Followup  

February 24, 2005 - 4:35 am UTC

No? I was simply asking "do you know the difference between the two" for I find most people

a) don't know union all exists
b) don't know the semantic difference between union and union all
c) don't realize the performance penalty involved with union vs union all when they didn't need to use UNION

Your example, as posted, did not use UNION ALL. Look at your text:

<quote>
Now, if I
select * from na_sales where customer_account=1
the index is used. Same for eur_sales. However, if I UNION them together it
does not (WINDOW SORT on first select and WINDOW BUFFER on second). If I remove
the lead function and UNION them, the index is used.
</quote>


I quite simply asked:

does union all change the behaviour? (i did not have an example with table creates and such to work with, so I couldn't really 'test it', I don't have your tables, your indexes, your datatypes, etc)

do you need to use union, you said union, you did not say union all. do you know the difference between the two.


Sorry if you took it as an insult, I can only comment based on the data provided. I had to assume you like most of the world was using UNION, not UNION ALL and simply wanted to know if you could use union all, if union all made a difference, if you knew the difference between the two.


If I had prescience, I could have read your subsequent post and not asked any questions, I guess.


Not having a bad day, just working with information provided. I was not trying to insult you -- I was simply "asking".



Analytics

February 24, 2005 - 5:34 am UTC

Reviewer: Neelz from Japan

Dear Sir,

I had gone through the above examples and was wondering whether analytical functions could be used when aggregating multiple columns from a table,
CREATE TABLE T (
SUPPLIER_CD CHAR(4) NOT NULL,
ORDERRPT_NO CHAR(8) NOT NULL,
ORDER_DATE CHAR(8) NOT NULL,
STORE_CD CHAR(4) NOT NULL,
POSITION_NO CHAR(3) NOT NULL,
CONTROL_FLAG CHAR(2),
ORDERQUANTITY_EXP NUMBER(3) DEFAULT (0) NOT NULL,
ORDERQUANTITY_RES NUMBER(3) DEFAULT (0) NOT NULL,
ENT_DATE DATE DEFAULT (SYSDATE) NOT NULL,
UPD_DATE DATE DEFAULT (SYSDATE) NOT NULL,
CONSTRAINT PK_T PRIMARY KEY(SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE, STORE_CD));

CREATE INDEX IDX_T ON T (SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE);

insert into t values('5636','62108373','20041129','0007','2','00',1,1, to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));

insert into t values('5636','62108373','20041129','0012','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'), to_date('2004/11/30', 'yyyy/mm/dd'));

insert into t values('5636','62108384','20041129','0014','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));

insert into t values('5636','62108384','20041129','0015','3','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));

insert into t values('1000','11169266','20040805','1309','4','00',8,8,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));

insert into t values('1000','11169266','20040805','1312','12' ,'00',8,8,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));

insert into t values('1000','11169266','20040805','1313','13' ,'00',12,12,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));

Currently the following query is used:-

SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE

The execution plan when this query is executed on the real table which has 4m records is : -

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=103002 Card=3571095 Bytes=107132850)

1 0 SORT (GROUP BY NOSORT) (Cost=103002 Card=3571095 Bytes=107 132850)

2 1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=10 3002 Card=3571095 Bytes=107132850)

3 2 INDEX (FULL SCAN) OF 'IDX_T' (NON-UNIQUE) (Cost=26942 Card=3571095)

Could you please tell me whether analytical functions could be used over here or a better approach for this query.

Thanks for your great help


Tom Kyte

Followup  

February 24, 2005 - 5:49 am UTC

there would be no need for analytics here. analytics would be useful to get the 'aggregates' while preserving the 'details'

eg:

select empno, sal, sum(sal) over (partition by deptno)
from emp;

shows the empno, their sal and the sum of all salaries in their dept. that would be instead of coding:

select empno, sal, sum_sal
from emp, (select deptno, sum(sal) sum_sal from emp group by deptno) t
where emp.deptno = t.deptno
/
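The difference between the two forms can be sketched in plain Python (an illustration only; the emp rows here are made up):

```python
from collections import defaultdict

# What sum(sal) over (partition by deptno) computes: each row keeps its
# detail columns (empno/ename, sal) AND carries the total for its
# department, without collapsing the rows the way a plain GROUP BY would.

emp = [("SMITH", 10, 800), ("ALLEN", 10, 1600), ("WARD", 20, 1250)]

dept_total = defaultdict(int)
for _, deptno, sal in emp:
    dept_total[deptno] += sal            # the "aggregate" pass

result = [(ename, sal, dept_total[deptno])   # the "details preserved" pass
          for ename, deptno, sal in emp]
print(result)
# -> [('SMITH', 800, 2400), ('ALLEN', 1600, 2400), ('WARD', 1250, 1250)]
```

The join form Tom shows does the same two passes explicitly: one scan to aggregate, one scan to reattach the totals to the detail rows.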




I was just wondering

February 24, 2005 - 6:25 am UTC

Reviewer: A reader

how would analytics help in the following example (the data nodes are implemented as rows in a table with two columns as pointers: split-from and merge-to, and the third column is "value", some number, not shown on diagram):

http://img23.exs.cx/my.php?loc=img23&image=directedgraph11th.png


The task is to use this directed dependency graph and prorate the "value" column in each row/node in the following way:

foreach node
-start with a node, for example 16
-visit each hierarchy on which 16 depends, in this case hierarchies for 14 and 15, SUM their values and the current value of node 16, and that will be new, prorated value for node 16
-repeat this recursively for each sub-hierarchy
until all nodes are prorated

I was thinking maybe to use combination of sys_connect_by_path and AF but not sure how. Any thoughts?


Tom Kyte

Followup  

February 24, 2005 - 6:51 am UTC

you won't get very far with that structure in 9i and before. connect by "loop" will be an error you see lots of with a directed graph.

analytics won't be appropriate either, they work on windows - not on hierarchies.

sys_connect_by_path is going to give you a string, not a sum


a scalar subquery in 10g with NOCYCLE on the query might work.

What if there is no closure inside the graph?

February 24, 2005 - 9:08 am UTC

Reviewer: A reader

i.e. if the link between node 9 and 5 is removed, and the link between node 6 and 0 is removed.
Would that make a difference? It would be a tree in that case. How should we proceed then? I was thinking maybe to use sys_connect_by_path to pack all sub-hierarchies one after another, with the marker in the window being the depth or level. If the level switches from n to 1, that marks the end of a sub-hierarchy; if it switches from 1 to 2, that is the beginning of one. And then aggregate over a partition inside the hierarchy view. Or is there a better approach?

Tom Kyte

Followup  

February 24, 2005 - 9:22 am UTC

scalar subqueries.

http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:30609389052804



Lead/Lag and 0 Sales

February 24, 2005 - 1:00 pm UTC

Reviewer: Rob H from Winnipeg Canada

Thanks for all of the help so far. I have run into an issue where I have Companies and Contacts at that company. Here are the tables.

create table SALES_TRANS
(
CUSTOMER_ACCOUNT VARCHAR2(8) ,
STATION_NUMBER VARCHAR2(7) ,
PRODUCT_CODE VARCHAR2(8) ,
QUANTITY NUMBER ,
DATE_ISSUE DATE ,
PRICE NUMBER ,
VALUE NUMBER );

Create table COMPANY_CUSTOMER
(
COMPANY_ID NUMBER(9),
CUSTOMER_ACCOUNT VARCHAR2(8));

Create table PRODUCT_INFO
(
PRODUCT_CODE VARCHAR2(8) ,
PRODUCT_GROUP VARCHAR2(25),
PRODUCT_DESC VARCHAR2(100)
);

Running a query by customer (this select is a view called - SUM_CUST_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
t.CUSTOMER_ACCOUNT,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group, FISCAL_YEAR(DATE_ISSUE)


I get
COMPANY_ID,CUSTOMER_ACCOUNT,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","27294370","Product1",2002,1460.08,0
"F0009631","27294370","Product2",2005,0,27926.31
"F0009631","27294370","Product2",2004,27926.31,18086.17
"F0009631","27294370","Product2",2003,18086.17,47597.05
"F0009631","27294370","Product2",2002,47597.05,0
"F0009631","27294370","Product2",2001,0,0
"F0009631","27294370","Product3",2004,64582.6,51041
"F0009631","27294370","Product3",2003,51041,60225
"F0009631","27294370","Product3",2002,60225,43150
"F0009631","27294370","Product3",2001,43150,50491
"F0009631","27294370","Product3",2000,50491,664
"F0009631","27294370","Product3",1999,664,0
"F0009631","27294370","Product4",2005,2119.1,1708.61
"F0009631","27294370","Product4",2004,1708.61,4050.82
"F0009631","27294370","Product4",2003,4050.82,15662.57
"F0009631","27294370","Product4",2002,15662.57,0
"F0009631","27294370","Product5",2005,0,351.64
"F0009631","27294370","Product5",2004,351.64,5873.61
"F0009631","27294370","Product5",2003,5873.61,2548.83
"F0009631","27294370","Product5",2002,2548.83,0
"F0009631","27294370","Product6",2004,17347.84,16781.33
"F0009631","27294370","Product6",2003,16781.33,10575
"F0009631","27294370","Product6",2002,10575,3659.67
"F0009631","27294370","Product6",2001,3659.67,4901.67
"F0009631","27294370","Product6",2000,4901.67,4073.47
"F0009631","27294370","Product6",1999,4073.47,0
"F0009631","27294370","Product7",2004,5377.5,2588
"F0009631","27294370","Product7",2003,2588,245
"F0009631","27294370","Product7",2000,245,0
"F0009631","27340843","Product2",2003,3013.71,0
"F0009631","27340843","Product3",1999,1411,0
"F0009631","27340843","Product5",2003,3254.9,0


Now if I run the same grouping by only company (this select is a view called - SUM_COMPANY_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc) total_VALUE_pre_y
From SALES_TRANS t
inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, p.product_group, FISCAL_YEAR(DATE_ISSUE)

we get
COMPANY_ID,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","Product1",2002,1460.08,0
"F0009631","Product2",2005,0,27926.31
"F0009631","Product2",2004,27926.31,21099.88
"F0009631","Product2",2003,21099.88,47597.05
"F0009631","Product2",2002,47597.05,0
"F0009631","Product2",2001,0,0
"F0009631","Product3",2004,64582.6,51041
"F0009631","Product3",2003,51041,60225
"F0009631","Product3",2002,60225,43150
"F0009631","Product3",2001,43150,50491
"F0009631","Product3",2000,50491,2075
"F0009631","Product3",1999,2075,0
"F0009631","Product4",2005,2119.1,1708.61
"F0009631","Product4",2004,1708.61,4050.82
"F0009631","Product4",2003,4050.82,15662.57
"F0009631","Product4",2002,15662.57,0
"F0009631","Product5",2005,0,351.64
"F0009631","Product5",2004,351.64,9128.51
"F0009631","Product5",2003,9128.51,2548.83
"F0009631","Product5",2002,2548.83,0
"F0009631","Product6",2004,17347.84,16781.33
"F0009631","Product6",2003,16781.33,10575
"F0009631","Product6",2002,10575,3659.67
"F0009631","Product6",2001,3659.67,4901.67
"F0009631","Product6",2000,4901.67,4073.47
"F0009631","Product6",1999,4073.47,0
"F0009631","Product7",2004,5377.5,2588
"F0009631","Product7",2003,2588,245
"F0009631","Product7",2000,245,0


The problem is that if I
select * from SUM_CUST_TRANS_PRODUCT_FY_V where fiscal_year=2004

Customer 27340843 will not show up (no 2004 purchases), but that also means that the total_VALUE_pre_y for 2004 will never summarize by customer to the total_VALUE_pre_y for 2004 for the company. Is there a better way to do this? The goal is that we can show current year sales vs previous years' sales by company, by customer, and potentially a larger summary higher than company (city).

I guess the idea would be that I could somehow show, for all customers in a company, all years and all products for which the company has purchases (a cartesian product for every purchasing year). This, I think, is difficult for large customer and sales-transaction tables.

ie

"F0009631","27340843","Product2",2004,0,3013.71 <--- ***
"F0009631","27340843","Product2",2003,3013.71,0

*** This row doesn't exist in the customer view. There are no 2004 sales, so doesn't appear, but we would like to see it so that the year previous shows.

I would love to "attach" some of the transactions if it would help. Is there a better way?

hierarchical cubes + MV?

February 25, 2005 - 2:52 pm UTC

Reviewer: Rob H from Winnipeg Canada

Would hierarchical cubes and MV be the solution. It seems like a lot of meta data to create. We would have to create it for all customers, for all years, for all product groups.

Tom Kyte

Followup  

February 25, 2005 - 6:40 pm UTC

if you have "missing data", the only way i know to "make it up" is an outer join (partitioned outer joins in 10g rock, removing the need to create cartesian products of every dimension first)
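The densification idea (manufacture the missing (customer, product, year) rows before applying lag/lead, which is what a partitioned outer join does for you in 10g) can be sketched in plain Python -- an illustration with made-up names and one of Rob's example rows, not partitioned-outer-join syntax:

```python
from itertools import product

# Build the cartesian product of (customer, product) x years, then
# outer-join the real sales onto it with 0.0 for missing years, so a
# subsequent lead/lag always has a row to land on.

sales = {("27340843", "Product2", 2003): 3013.71}
customers_products = {("27340843", "Product2")}
years = [2003, 2004]

dense = {(c, p, y): sales.get((c, p, y), 0.0)
         for (c, p), y in product(customers_products, years)}
print(dense[("27340843", "Product2", 2004)])  # missing year now present as 0.0
```

This is exactly the "cartesian products of every dimension first" step that the 10g partitioned outer join lets you skip.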

February 27, 2005 - 2:32 am UTC

Reviewer: Neelz from Japan

Dear Sir,

This is with regard to my previous post, the 5th above from this one.

<quote>
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE
</quote>

As you mentioned analytics could not be used, but could you please advice me on my problem,

The query is in fact big; for brevity I just put a few columns. The actual query is
SELECT
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE,
SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1,
SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2,
SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3,
.....
.....
.....
.....
.....
.....
SUM(DECODE(RTRIM(POSITION_NO),'197',ORDERQUANTITY_RES,0)) Q197,
SUM(DECODE(RTRIM(POSITION_NO),'198',ORDERQUANTITY_RES,0)) Q198,
SUM(DECODE(RTRIM(POSITION_NO),'199',ORDERQUANTITY_RES,0)) Q199,
SUM(DECODE(RTRIM(POSITION_NO),'200',ORDERQUANTITY_RES,0)) Q200,
SUM(ORDERQUANTITY_RES) ORDER_TOTAL
FROM
T
GROUP BY
SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE

As you could see there is a definite pattern on the sum function. Could you please help me in tuning this query?
Thanks in advance





Tom Kyte

Followup  

February 27, 2005 - 8:32 am UTC

you are doing a pivot -- looks great to me. It is "classic"

February 27, 2005 - 9:51 am UTC

Reviewer: Neelz from Japan

Dear Sir,

I am sorry if you felt like that. It is quite a new world for me here; I started visiting this site 3-4 months back, then realized the enormity of it, and it has become like an addiction. I bought both books by you and started working on them, and am reading the Oracle concepts guide. Every day I try many times to ask a question but till now no luck, might be because of the timezone difference.

Coming back to my question, since it is a huge query that was taking 35 min to execute, after reading through many articles here and in the books I was really confused as to what approach I should take. Still am. Analytical functions (not useful, as you told me), function based indexes (no, because we have a standard edition), materialized views (no, because it is an OLTP system), stored SQL functions, the deterministic keyword, user defined aggregates, optimizer hints.. at present it is confusing for me.

I am working on it with different approaches and could reduce the execution time to 9.08 minutes. The query was written with an index hint earlier and by removing it, the execution time decreased to 9+ minutes.

I was wondering whether you could advise on what approach I should take.

Thanks for your valuable time,





Tom Kyte

Followup  

February 27, 2005 - 10:04 am UTC

if that is taking 35 minutes you either

a) have the memory settings like pga_aggregate_target/sort_area_size set way too low

b) have billions of records that are hundreds of bytes in width

c) really slow disks

d) an overloaded system


I mean -- that query is pretty "simple" full scan, aggregate, nothing to it -- unless it is a gross simplification, it should not take 35 minutes. Can you trace it with the 10046 level 12 trace and post the tkprof section that is relevant to just this query with the waits and all?

February 27, 2005 - 10:56 am UTC

Reviewer: Neelz from Japan

Dear Sir,

Thank you for your kind reply,

This report is taken for the development system.
I used alter session set events '10046 trace name context forever, level 12'. The query execution time was 00:08:15.03


select
supplier_cd, orderrpt_no, order_date,
sum(decode(rtrim(position_no),'1',orderquantity_res,0)) q1,
sum(decode(rtrim(position_no),'2',orderquantity_res,0)) q2,
sum(decode(rtrim(position_no),'3',orderquantity_res,0)) q3,
sum(decode(rtrim(position_no),'4',orderquantity_res,0)) q4,
sum(decode(rtrim(position_no),'5',orderquantity_res,0)) q5,
.....
.....
sum(decode(rtrim(position_no),'197',orderquantity_res,0)) q197,
sum(decode(rtrim(position_no),'198',orderquantity_res,0)) q198,
sum(decode(rtrim(position_no),'199',orderquantity_res,0)) q199,
sum(decode(rtrim(position_no),'200',orderquantity_res,0)) q200,
sum(orderquantity_res) order_total
from
t
group by
supplier_cd, orderrpt_no, order_date

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.04 0 0 0 0
Execute 2 0.02 0.04 0 0 0 0
Fetch 15 431.55 488.37 37147 36118 74 211
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 18 431.60 488.46 37147 36118 74 211

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66

Rows Row Source Operation
------- ---------------------------------------------------
211 SORT GROUP BY
4205484 TABLE ACCESS FULL T


Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 16 0.00 0.00
SQL*Net more data to client 30 0.00 0.00
db file sequential read 3 0.04 0.05
db file scattered read 2280 0.78 30.62
direct path write 4 0.00 0.00
direct path read 147 0.05 1.45
SQL*Net message from client 16 140.57 166.58
SQL*Net break/reset to client 2 0.01 0.01
********************************************************************************

Thank you

Tom Kyte

Followup  

February 27, 2005 - 11:13 am UTC

that is 8 minutes?

but I see some writes to temp here -- for 211 aggregated rows, perhaps your sort/pga is set small


Also, why do you need to rtrim() 4,205,484 rows? (and why is something called POSITION_NO stored in a string?) is that rtrim there "just in case" or is it really needed? why would it have trailing blanks, and is that not a data integrity issue that needs to be fixed?

(but this is an 8 minute query, not a 35 minute query, if it takes longer on production -- it'll be because it is waiting for something -- like IO...)

February 27, 2005 - 11:30 am UTC

Reviewer: Neelz from Japan

Dear Sir,

This is a 3rd party application and the query was written with an index hint earlier. After removing the hint query execution time reduced to 8 min. Regarding the rtrim I have to check with the team if it is really needed. I will try the trace on production tomorrow.

And at last I could see the link for "Submit a New Question"! I think I should try around 1.00 AM

Thanking You a lot

Tom Kyte

Followup  

February 27, 2005 - 11:31 am UTC

depends on your time zone, rarely am I up at 1am east coast (gmt-5) time doing this stuff!

March 02, 2005 - 8:51 am UTC

Reviewer: Miki from Hungary

Tom,

I need to produce a moving average which has an even window size. If I want a window of size 28, I need to look backward 14 and forward 14, with the first value and the last value of the window each divided by 2:
(a1/2+a2+...+a28+a29/2)/28
How could I accomplish it with the function:
avg() over(...)?

Thanks in advance

Tom Kyte

Followup  

March 02, 2005 - 10:03 am UTC

this is the first thought that popped into my head:

a) get the sum(val) over 13 before and 13 after (27 rows possible).
b) get the lag(val,14)/2 and lead(val,14)/2
c) add those three numbers
d) divide by the count of non-null VALS observed (count(val) 13 before/after + 1 if lag is not null + 1 if lead is not null)


ops$tkyte@ORA9IR2> create table t
  2  as
  3  select rownum id, object_id val
  4    from all_objects
  5   where rownum <= 30;
 
Table created.
 
<b>so, this was my "debug" query, just to see the data:</b>


ops$tkyte@ORA9IR2> select id,
  2         sum(val) over 
                 (order by id rows between 13 preceding and 13 following) sum,
  3         count(val) over 
                 (order by id rows between 13 preceding and 13 following)+
  4             decode(lag(val,14) over (order by id),null,0,1)+
  5             decode(lead(val,14) over (order by id),null,0,1) cnt,
  6             lag(id,14) over (order by id) lagid,
  7             lag(val,14) over (order by id) lagval,
  8             lead(id,14) over (order by id) leadid,
  9             lead(val,14) over (order by id) leadval
 10    from t
 11   order by id;
 
        ID        SUM        CNT      LAGID     LAGVAL     LEADID    LEADVAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
         1     218472         15                               15       6399
         2     224871         16                               16      19361
         3     244232         17                               17      23637
         4     267869         18                               18      14871
         5     282740         19                               19      20668
         6     303408         20                               20      18961
         7     322369         21                               21      15767
         8     338136         22                               22      20654
         9     358790         23                               23       7065
        10     365855         24                               24      17487
        11     383342         25                               25      11077
        12     394419         26                               26      20772
        13     415191         27                               27      15505
        14     430696         28                               28      12849
        15     425648         29          1      17897         29      23195
        16     441314         29          2       7529         30      18523
        17     436505         28          3      23332
        18     422306         27          4      14199
        19     399409         26          5      22897
        20     389266         25          6      10143
        21     365728         24          7      23538
        22     342135         23          8      23593
        23     332316         22          9       9819
        24     320581         21         10      11735
        25     303084         20         11      17497
        26     295369         19         12       7715
        27     276010         18         13      19359
        28     266791         17         14       9219
        29     260392         16         15       6399
        30     241031         15         16      19361
 
30 rows selected.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id,
  2         (sum(val) over 
                (order by id rows between 13 preceding and 13 following)+
  3              nvl(lag(val,14) over (order by id)/2,0)+
  4              nvl(lead(val,14) over (order by id)/2,0))/
  5             nullif(
  6           count(val) over 
                  (order by id rows between 13 preceding and 13 following)+
  7               decode(lag(val,14) over (order by id),null,0,1)+
  8               decode(lead(val,14) over (order by id),null,0,1)
  9                  ,0) avg
 10    from t
 11   order by id;
 
        ID        AVG
---------- ----------
         1    14778.1
         2 14659.4688
         3 15061.7941
         4 15294.6944
         5 15424.9474
         6  15644.425
         7 15726.3095
         8 15839.2273
         9 15753.1522
        10 15608.2708
        11   15555.22
        12 15569.4231
        13 15664.5741
        14 15611.4464
        15      15386
        16 15666.8966
        17 16006.1071
        18 15903.9074
        19 15802.2115
        20    15773.5
        21 15729.0417
        22 15388.3261
        23 15328.4318
        24 15545.1667
        25  15591.625
        26 15748.7632
        27 15871.6389
        28 15964.7353
        29 16474.4688
        30    16714.1
 
30 rows selected.
 
ops$tkyte@ORA9IR2>

<b>I did not do a detailed check of the results -- but that should get you going (remember -- there are 29 rows -- 14+1+14!!! and beware NULLs)</b> 
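The half-weighted-endpoint trick can be sanity-checked at a smaller scale with any engine that has window functions. Below is a sketch in Python with SQLite (window functions need SQLite >= 3.25): a centered window of +-1 row plus half-weighted neighbours at distance 2, mirroring the 13-preceding/13-following window plus half-weighted lag/lead 14 above. The table `t` and its values are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (id int, val int)")
con.executemany("insert into t values (?, ?)", [(i, i * 10) for i in range(1, 6)])

# Centered 5-point average where the two outermost points (distance 2)
# each count with weight 1/2; coalesce/case handle the missing neighbours
# at the edges, as nvl/decode do in the Oracle version.
sql = """
select id,
       (sum(val) over w
        + coalesce(lag(val, 2)  over (order by id) / 2.0, 0)
        + coalesce(lead(val, 2) over (order by id) / 2.0, 0)) * 1.0
       /
       (count(val) over w
        + case when lag(val, 2)  over (order by id) is null then 0 else 1 end
        + case when lead(val, 2) over (order by id) is null then 0 else 1 end) avg
from t
window w as (order by id rows between 1 preceding and 1 following)
order by id
"""
rows = con.execute(sql).fetchall()
print(rows)  # [(1, 15.0), (2, 20.0), (3, 24.0), (4, 32.5), (5, 35.0)]
```

The denominator can never be zero here (the ROWS window always contains the current row), so the NULLIF guard from the Oracle query is not needed in this reduced form.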

March 02, 2005 - 10:54 am UTC

Reviewer: Miki from Hungary

Tom,

Your answer is excellent. That is - almost - what I needed.
If my window size is odd I can simply use the avg() over () function. I am looking for a solution where I can also use avg() over () instead of sum() over () / count() over ().
Is it possible?

Thank you!

Tom Kyte

Followup  

March 02, 2005 - 11:15 am UTC

if you want to do things to row 1 and row 29 in the window "special" like this -- this was the only thing I thought of.

March 02, 2005 - 11:18 am UTC

Reviewer: Miki from Hungary

Thank you! I will use your recommended code.

consecutive days... 8.1.7

March 09, 2005 - 1:07 pm UTC

Reviewer: Dean from IL

create table day_cd
(dt date
,cd varchar2(2))
/


insert into day_cd values ('08-MAR-05', 'BD');
insert into day_cd values ('09-MAR-05', 'AD');
insert into day_cd values ('10-MAR-05', 'AD');
insert into day_cd values ('11-MAR-05', 'AD');
insert into day_cd values ('12-MAR-05', 'AD');
insert into day_cd values ('13-MAR-05', 'AD');
insert into day_cd values ('14-MAR-05', 'CD');
insert into day_cd values ('15-MAR-05', 'CD');
insert into day_cd values ('16-MAR-05', 'AD');
insert into day_cd values ('17-MAR-05', 'AD');
insert into day_cd values ('18-MAR-05', 'AD');
insert into day_cd values ('19-MAR-05', 'CD')
/


SELECT * FROM DAY_CD;

DT CD
--------- --
08-MAR-05 BD
09-MAR-05 AD
10-MAR-05 AD
11-MAR-05 AD
12-MAR-05 AD
13-MAR-05 AD
14-MAR-05 CD
15-MAR-05 CD
16-MAR-05 AD
17-MAR-05 AD
18-MAR-05 AD
19-MAR-05 CD

I'd like to count occurrences of each code, treating a run of consecutive days with the same code as a single occurrence.

So that the output would be:

CD OCCURRENCES
-- -----------
AD 2
BD 1
CD 2


Tom Kyte

Followup  

March 09, 2005 - 1:55 pm UTC

take a look at this thread, try to understand analytics and see if you cannot figure out how you might use lag to do this.

see also:

</code> http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html <code>

for some hints (analytics to the rescue)

nevermind...

March 09, 2005 - 1:59 pm UTC

Reviewer: Dean from IL

select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd


we were responding at the same time...

March 09, 2005 - 2:01 pm UTC

Reviewer: Dean from IL

:)

select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd


CD COUNT(*)
-- ----------
AD 2
BD 1
CD 2

Thanks for all of your help...
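Dean's lead()-based run counting can be reproduced outside Oracle with any engine that supports window functions. A sketch in Python with SQLite (>= 3.25), using Dean's own data; dates are stored as ISO text and julianday() stands in for Oracle's native date arithmetic:

```python
import sqlite3

rows = [
    ("2005-03-08", "BD"), ("2005-03-09", "AD"), ("2005-03-10", "AD"),
    ("2005-03-11", "AD"), ("2005-03-12", "AD"), ("2005-03-13", "AD"),
    ("2005-03-14", "CD"), ("2005-03-15", "CD"), ("2005-03-16", "AD"),
    ("2005-03-17", "AD"), ("2005-03-18", "AD"), ("2005-03-19", "CD"),
]
con = sqlite3.connect(":memory:")
con.execute("create table day_cd (dt text, cd text)")
con.executemany("insert into day_cd values (?, ?)", rows)

# A row "ends" a run when the next date for the same code is not the
# following day (or there is no next date); counting run-ending rows
# counts the runs themselves.
sql = """
select cd, count(*) from (
  select cd,
         case when julianday(lead(dt) over (partition by cd order by dt))
                   - julianday(dt) = 1
              then 1 else 0 end day
  from day_cd)
where day = 0
group by cd
order by cd
"""
result = con.execute(sql).fetchall()
print(result)  # [('AD', 2), ('BD', 1), ('CD', 2)]
```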

max() over() till not the current row

March 10, 2005 - 4:12 am UTC

Reviewer: Miki from Hungary

Tom,

I have the following input

DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 1 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 1 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0

When the value of column T is 'T', a rule determines which columns (col1, ..., col4) get 1 or 0.
Unfortunately, the rule can set more than one column to 1. So, if col1+...+col4 > 1 then I would like colx to take the value it had on the previous row where t = 'T' and col1+...+col4 = 1.

So, the output is the following
DATUM T COL1 COL2 COL3 COL4
2005.02.19 9:29 T 1 0 0 0
2005.02.20 9:29 0 0 0 0
2005.02.21 9:29 0 0 0 0
2005.02.22 9:29 T 1 0 0 0
2005.02.23 9:29 0 0 0 0
2005.02.24 9:29 0 0 0 0
2005.02.25 9:29 0 0 0 0
2005.02.26 9:29 0 0 0 0
2005.02.27 9:29 T 0 1 0 0
2005.02.28 9:29 0 0 0 0
2005.03.01 9:29 0 0 0 0
2005.03.02 9:29 T 0 1 0 0
2005.03.03 9:29 0 0 0 0
2005.03.04 9:29 T 0 1 0 0
2005.03.05 9:29 0 0 0 0
2005.03.06 9:29 T 1 0 0 0
2005.03.07 9:29 0 0 0 0
2005.03.08 9:29 0 0 0 0
2005.03.09 9:29 0 0 0 0
I tried to use a max() over () function to replace the 'wrong' value, but it doesn't work because I can't see the max datum up to the previous record where t = 'T' and col1+...+col4 = 1.

...
case when t = 'T' and col1+...+col4 > 1 and
     greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
     ) = nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col1,
...
case when t = 'T' and col1+...+col4 > 1 and
     greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
              nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
     ) = nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000) then 1 else 0 end col4 ...

Could you give me a solution to my problem?

Thanks in advance
miki


Tom Kyte

Followup  

March 10, 2005 - 7:42 am UTC

I can, but I'd need a create table and some inserts.


You might look at:

</code> http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html <code>

analytics to the rescue because I'll be using that exact technique.

March 10, 2005 - 8:09 am UTC

Reviewer: Miki from Hungary

Here is my table populated with data:
create table T
(
DATUM DATE,
T VARCHAR2(1),
COL1 NUMBER,
COL2 NUMBER,
COL3 NUMBER,
COL4 NUMBER
);

insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('16-01-2005 13:17:46', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-01-2005 17:23:13', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-03-2005 02:59:17', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('11-12-2004 21:59:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('10-01-2005 12:00:22', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('24-02-2005 02:36:51', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('08-12-2004 11:21:15', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('07-01-2005 20:52:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-02-2005 23:44:33', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-03-2005 16:25:12', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 19:02:28', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('22-01-2005 11:21:41', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-01-2005 15:32:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-12-2004 03:07:10', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('21-02-2005 16:25:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 01:02:39', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('15-12-2004 05:49:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-02-2005 14:35:34', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-12-2004 15:01:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
commit;

select t.* from t t
order by 1;

DATUM T COL1 COL2 COL3 COL4
1 2004.12.02. 15:01:42 0 0 0 0
2 2004.12.08. 11:21:15 0 0 0 0
3 2004.12.11. 21:59:18 T 1 0 0 0
4 2004.12.15. 5:49:26 0 0 0 0
5 2004.12.19. 3:07:10 0 0 0 0
6 2005.01.01. 1:02:39 T 0 1 0 0
7 2005.01.01. 19:02:28 0 0 0 0
8 2005.01.04. 17:23:13 T 1 1 0 0
9 2005.01.07. 20:52:26 0 0 0 0
10 2005.01.10. 12:00:22 0 0 0 0
11 2005.01.16. 13:17:46 0 0 0 0
12 2005.01.19. 15:32:18 T 1 1 0 0
13 2005.01.22. 11:21:41 0 0 0 0
14 2005.02.02. 23:44:33 0 0 0 0
15 2005.02.04. 14:35:34 T 0 1 0 0
16 2005.02.21. 16:25:42 0 0 0 0
17 2005.02.24. 2:36:51 0 0 0 0
18 2005.03.01. 2:59:17 0 0 0 0
19 2005.03.04. 16:25:12 T 1 0 0 0
Lines 8 and 12 have more than one column containing 1.
So, I need to "copy" every colx from line 6, because it is the closest preceding line (ordered by datum) that has value 'T' for column T and exactly one colx with value 1.

Thank you

Tom Kyte

Followup  

March 10, 2005 - 8:28 am UTC

ops$tkyte@ORA9IR2> select t, col1, col2, col3, col4,
  2         substr(max(data) over (order by datum),11,1) c1,
  3         substr(max(data) over (order by datum),12,1) c2,
  4         substr(max(data) over (order by datum),13,1) c3,
  5         substr(max(data) over (order by datum),14,1) c4,
  6             case when col1+col2+col3+col4 > 1 then '<---' end fix
  7    from (
  8  select t.*,
  9         case when t = 'T' and col1+col2+col3+col4 = 1
 10                  then to_char(row_number() over (order by datum) ,'fm0000000000') || col1 || col2 || col3 || col4
 11                  end data
 12    from t
 13         )
 14   order by datum;

T       COL1       COL2       COL3       COL4 C C C C FIX
- ---------- ---------- ---------- ---------- - - - - ----
           0          0          0          0
           0          0          0          0
T          1          0          0          0 1 0 0 0
           0          0          0          0 1 0 0 0
           0          0          0          0 1 0 0 0
T          0          1          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          1          0          0 0 1 0 0 <---
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          1          0          0 0 1 0 0 <---
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          0          1          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          0          0          0 1 0 0 0

19 rows selected.
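Tom's encode-and-carry-forward trick also works in SQLite, where printf() stands in for to_char(). The sketch below reduces the problem to two flag columns and made-up dates: each "good" row (t = 'T' and exactly one flag set) is encoded as a zero-padded row number followed by its flag values, so a running max() over (order by datum) always holds the latest good row, and substr() peels the flags back out for the "bad" row.

```python
import sqlite3

# Row 2005-01-04 is "bad" (two 1s) and should inherit the flags of the
# closest preceding good row (2005-01-03: col1=0, col2=1).
rows = [("2005-01-01", "T", 1, 0), ("2005-01-02", None, 0, 0),
        ("2005-01-03", "T", 0, 1), ("2005-01-04", "T", 1, 1),
        ("2005-01-05", "T", 1, 0)]
con = sqlite3.connect(":memory:")
con.execute("create table t (datum text, t text, col1 int, col2 int)")
con.executemany("insert into t values (?, ?, ?, ?)", rows)

sql = """
select datum,
       substr(max(data) over (order by datum), 11, 1) c1,
       substr(max(data) over (order by datum), 12, 1) c2
from (select t.*,
             case when t = 'T' and col1 + col2 = 1
                  then printf('%010d', row_number() over (order by datum))
                       || col1 || col2 end data
      from t)
order by datum
"""
out = con.execute(sql).fetchall()
print(out)
```

Because the encoded strings all have the same length, string max() is effectively "the good row with the highest row number seen so far", which is exactly the carry-forward semantics wanted here.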
 

Great!

March 10, 2005 - 9:38 am UTC

Reviewer: Miki from Hungary

Great solution!
Thank you, it is just what I expected.

book on Analytics

March 10, 2005 - 11:07 am UTC

Reviewer: A reader

Hi Tom,

It is high time that you published a book on analytic functions - there is a lot one can do with them, but very few people are fully aware of it.

When is this book due?

thanks

A variation of Dean's question ...

March 10, 2005 - 8:13 pm UTC

Reviewer: Julius from Fremont, CA

create table tt (
did number,
dd date,
status number);

alter table tt add constraint tt_pk primary key (did,dd) using index;

insert into tt values (-111,to_date('03/03/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/04/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/05/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/06/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-111,to_date('03/08/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/09/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/10/2005','mm/dd/yyyy'),22);

insert into tt values (-222,to_date('03/04/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/05/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/06/2005','mm/dd/yyyy'),77);
insert into tt values (-222,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/08/2005','mm/dd/yyyy'),55);
insert into tt values (-222,to_date('03/09/2005','mm/dd/yyyy'),11);

I need a query which would return following result set where days_in_status is a count of consecutive days the did has been in its current status (dd values are days only). I've been trying to use analytics but without much success so far. Any idea? Thanks!!

DID DD STATUS DAYS_IN_STATUS
----- ---------- ------ --------------
-111 03/10/2005 22 3
-222 03/09/2005 11 1


Tom Kyte

Followup  

March 10, 2005 - 9:04 pm UTC

ops$tkyte@ORA9IR2> select did, max(dd), count(*)
  2    from (
  3  select x.*, max(grp) over (partition by did order by dd desc) maxgrp
  4    from (
  5  select tt.*,
  6         case when lag(status) over (partition by did order by dd desc) <> status
  7                  then 1
  8                  end grp
  9    from tt
 10         ) x
 11             )
 12   where maxgrp is null
 13   group by did
 14  /

       DID MAX(DD)     COUNT(*)
---------- --------- ----------
      -222 09-MAR-05          1
      -111 10-MAR-05          3



is one approach... 
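The same approach runs unchanged (apart from date handling) in Python with SQLite (>= 3.25), using Julius's data with dates stored as ISO text. Scanning each did from the latest date backwards, a row is marked where the status changes; carrying that marker down with max() over and keeping only the unmarked prefix isolates the current run:

```python
import sqlite3

rows = [(-111, "2005-03-03", 11), (-111, "2005-03-04", 22),
        (-111, "2005-03-05", 22), (-111, "2005-03-06", 11),
        (-111, "2005-03-07", 33), (-111, "2005-03-08", 22),
        (-111, "2005-03-09", 22), (-111, "2005-03-10", 22),
        (-222, "2005-03-04", 33), (-222, "2005-03-05", 33),
        (-222, "2005-03-06", 77), (-222, "2005-03-07", 33),
        (-222, "2005-03-08", 55), (-222, "2005-03-09", 11)]
con = sqlite3.connect(":memory:")
con.execute("create table tt (did int, dd text, status int)")
con.executemany("insert into tt values (?, ?, ?)", rows)

sql = """
select did, max(dd), count(*)
from (select x.*, max(grp) over (partition by did order by dd desc) maxgrp
      from (select tt.*,
                   case when lag(status) over (partition by did order by dd desc)
                             <> status then 1 end grp
            from tt) x)
where maxgrp is null
group by did
order by did
"""
result = con.execute(sql).fetchall()
print(result)  # [(-222, '2005-03-09', 1), (-111, '2005-03-10', 3)]
```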

SQL Query

March 15, 2005 - 6:50 pm UTC

Reviewer: a reader

Hi Tom,

create table a
(accno number(8) not null,
amount_paid number(7) not null)
/


insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into a values (5, 3000);

Could you please help me write the following query without using rownum or analytics.

List the accno corresponding to the maximum amount paid. If more than one account has the same maximum amount paid, list any one of them.

I am expecting the result to be accno 4 or 5


Thanks for your time.

Regards

Tom Kyte

Followup  

March 15, 2005 - 9:27 pm UTC

sounds like homework.

I give a similar quiz question in interviews (find the most frequently occurring month)


tkyte@ORA8IW> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno
  2    from a;

ACCNO
-----------------------------------------
5


is one possible approach (assuming that amount_paid is positive)



tkyte@ORA8IW> select max(accno)
  2    from a
  3   where amount_paid = ( select max(amount_paid) from a );

MAX(ACCNO)
----------
5


is another (that would work well if amount_paid,accno were indexed....)


negatives to worry about ...

March 15, 2005 - 9:56 pm UTC

Reviewer: Gabe

SQL> select * from a;

     ACCNO AMOUNT_PAID
---------- -----------
         1          -2
         2          -1

SQL> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno from a;

ACCNO
-----------------------------------------
21
 

Tom Kyte

Followup  

March 15, 2005 - 10:04 pm UTC

....
(assuming that amount_paid is positive)
.......

that was caveated and why I gave two answers ;)


cannot read ...

March 15, 2005 - 10:57 pm UTC

Reviewer: Gabe

Sorry about that ... missed it completely.



following an idea of mikito ...

March 16, 2005 - 8:11 am UTC

Reviewer: Matthias Rogel from Kaiserslautern, Germany

  1  select accno
  2  from a
  3  start with amount_paid = (select max(amount_paid) from a)
  4         and accno = (select min(accno) from a where amount_paid = (select max(amount_paid) from a))
  5* connect by prior null is not null
SQL> /

     ACCNO
----------
         4

would be a third solution 

Tom Kyte

Followup  

March 16, 2005 - 8:38 am UTC

there are many solutions -- this one would win a Rube Goldberg award though :)

another query using analytics

March 29, 2005 - 11:32 am UTC

Reviewer: A reader

I've got 2 tables, t1 and t2.
t1(1 column):

t1.x(int ,primary key)

1
2
3

and t2(3 columns,index on t2.y):

t2.x(int) t2.y(int) t2.z(int)

1 7000 1
1 7000 6
1 8000 8
2 7000 1
2 7000 5
3 7000 3
3 8000 1
3 8000 7
3 9000 5

I would like to have a report like this:

t1.x t2.y count min max

1 7000 2 1 8
1 8000 1 1 8
2 7000 2 1 5
3 7000 1 1 7
3 8000 2 1 7
3 9000 1 1 7

What I came up with is:

select distinct t1.x,t2.y,
count(*) over (partition by t1.x,t2.y) as count,
min(t2.z) over (partition by t1.x) as min,
max(t2.z) over (partition by t1.x) as max
from t1, t2
where t1.x = t2.x;

I was wondering if this query is good enough, or if there's a better way(in terms of performance) to write this query. I'm new to analytics, and your help would be very much appreciated.


Tom Kyte

Followup  

March 29, 2005 - 12:25 pm UTC

we could probably do this in analytics without the distinct, something like


select t1.x, t2.y, t2.cnt,
min(t2.z) over (partition by t1.x),
max(t2.z) over (partition by t1.x)
from t1, (select x, y, count(*) cnt from t2 group by x, y ) t2
where t1.x = t2.x;

and maybe even push the min/max() down into the inline view.
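Pushing the min/max down into the inline view can be sketched concretely in Python with SQLite, using the t2 data from the question (the join to t1 is omitted for brevity, since every t2.x exists in t1). The inline view collapses to one row per (x, y), and analytics then spread the per-x min/max across those already-collapsed rows, so no DISTINCT is needed:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t2 (x int, y int, z int)")
con.executemany("insert into t2 values (?, ?, ?)",
                [(1, 7000, 1), (1, 7000, 6), (1, 8000, 8),
                 (2, 7000, 1), (2, 7000, 5),
                 (3, 7000, 3), (3, 8000, 1), (3, 8000, 7), (3, 9000, 5)])

# Aggregate once per (x, y); the window min/max of the pre-aggregated
# mn/mx columns equals the min/max of z over the whole x partition.
sql = """
select x, y, cnt,
       min(mn) over (partition by x) f,
       max(mx) over (partition by x) l
from (select x, y, count(*) cnt, min(z) mn, max(z) mx
      from t2 group by x, y)
order by x, y
"""
rows = con.execute(sql).fetchall()
print(rows)
```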


Analytics problem

April 08, 2005 - 12:19 pm UTC

Reviewer: Mark from NY

Hi Tom,

I have a problem whose solution I'm pretty sure involves analytic functions. I've been struggling with it for some time, but analytics are new to me. I want to go from this:

/* create and inserts */
create table test.test (ordernum varchar2(10), 
                         tasktype char(3),
                         feetype varchar2(20),
                         amount number(10,2));

insert into test.test(ordernum, tasktype, feetype, amount)
               values('123123', 'DOC', 'Product Fee', 15);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('123123', 'DOC', 'Copy Fee', 1);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'Setup Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'File Review Fee', 27);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'Statutory Fee', 23);               
insert into test.test(ordernum, tasktype, feetype, amount)
               values('56432', 'DOC', 'Product Fee', 80);    
insert into test.test(ordernum, tasktype, feetype, amount)
               values('56432', 'DOC', 'Prepayment', -16);

SQL> select tasktype, ordernum, feetype, amount from test.test;

TAS ORDERNUM   FEETYPE                  AMOUNT
--- ---------- -------------------- ----------
DOC 123123     Product Fee                  15
DOC 123123     Copy Fee                      1
COS 34864      Setup Fee                    23
COS 34864      File Review Fee              27
COS 34864      Statutory Fee                22
DOC 56432      Product Fee                  80
DOC 56432      Prepayment                  -16

...to this:

TAS ORDERNUM FEE1        FEE2            FEE3          FEE4     FEE5
--- -------- ----------- --------        ----------    -------- --------
DOC          Product Fee Copy Fee        Prepayment
DOC 123123   15          1
DOC 56432    80                          -16
COS          Setup Fee   File Review Fee Statutory Fee
COS 34864    23          27              22

Allow me to explain. For each tasktype I would like a heading row, which, going across, contains all the feetypes found in test.test for that particular tasktype. There should never be more than five feetypes.

For each ordernum under each tasktype, I would like to have the amounts going across, underneath the appropriate feetypes. 

I'm pretty sure my solution involves the lag and/or lead functions, partitioning over tasktype. I particularly seem to have trouble wrapping my brain around the problem of how to get a distinct ordernum while keeping intact the data in other columns (where ordernums duplicate).

I hope my explanation is clear enough.

Hope you can help. Thanks in advance. I will continue working on this. 

Tom Kyte

Followup  

April 08, 2005 - 12:51 pm UTC

ops$tkyte@ORA9IR2> with columns
  2  as
  3  (select tasktype, feetype, row_number() over (partition by tasktype order by feetype) rn
  4     from (select distinct tasktype, feetype from test )
  5  )
  6  select a.tasktype, a.ordernum,
  7         to_char( max( decode( rn, 1, amount ) )) fee1,
  8         to_char( max( decode( rn, 2, amount ) )) fee2,
  9         to_char( max( decode( rn, 3, amount ) )) fee3,
 10         to_char( max( decode( rn, 4, amount ) )) fee4,
 11         to_char( max( decode( rn, 5, amount ) )) fee5
 12    from test a, columns b
 13   where a.tasktype = b.tasktype
 14     and a.feetype = b.feetype
 15   group by a.tasktype, a.ordernum
 16   union all
 17  select tasktype, null,
 18         ( max( decode( rn, 1, feetype ) )) fee1,
 19         ( max( decode( rn, 2, feetype ) )) fee2,
 20         ( max( decode( rn, 3, feetype ) )) fee3,
 21         ( max( decode( rn, 4, feetype ) )) fee4,
 22         ( max( decode( rn, 5, feetype ) )) fee5
 23    from columns
 24   group by tasktype
 25   order by 1 desc, 2 nulls first
 26  /
 
TAS ORDERNUM   FEE1            FEE2            FEE3            FEE4 FEE5
--- ---------- --------------- --------------- --------------- ---- ----
DOC            Copy Fee        Prepayment      Product Fee
DOC 123123     1                               15
DOC 56432                      -16             80
COS            File Review Fee Setup Fee       Statutory Fee
COS 34864      27              23              23


of course. :)


(suggestion, break it out, run each of the bits to see what they do.  basically, columns is a view used to "pivot" on -- we needed to assign a column number to each FEETYPE by TASKTYPE.  That is all that view does.

Then, we join that to test and "pivot" naturally.

Union all in the pivot of the column names....

and sort) 
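The pivot half of this query can be reproduced in Python with SQLite using Mark's data. CASE stands in for DECODE, three fee columns are enough for this data, and the heading-row UNION ALL branch is left out for brevity; the `columns` CTE assigns each FEETYPE a column number within its TASKTYPE exactly as above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""create table test (ordernum text, tasktype text,
                                  feetype text, amount int)""")
con.executemany("insert into test values (?, ?, ?, ?)", [
    ("123123", "DOC", "Product Fee", 15), ("123123", "DOC", "Copy Fee", 1),
    ("34864", "COS", "Setup Fee", 23), ("34864", "COS", "File Review Fee", 27),
    ("34864", "COS", "Statutory Fee", 23), ("56432", "DOC", "Product Fee", 80),
    ("56432", "DOC", "Prepayment", -16)])

# columns maps each (tasktype, feetype) to a column slot rn; the outer
# group-by then pivots amounts into fee1..fee3 via max(case ...).
sql = """
with columns as (
  select tasktype, feetype,
         row_number() over (partition by tasktype order by feetype) rn
  from (select distinct tasktype, feetype from test))
select a.tasktype, a.ordernum,
       max(case rn when 1 then amount end) fee1,
       max(case rn when 2 then amount end) fee2,
       max(case rn when 3 then amount end) fee3
from test a join columns b
  on a.tasktype = b.tasktype and a.feetype = b.feetype
group by a.tasktype, a.ordernum
order by a.tasktype desc, a.ordernum
"""
rows = con.execute(sql).fetchall()
print(rows)
```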

RE: Analytics problem

April 08, 2005 - 1:27 pm UTC

Reviewer: Mark from NY

Excellent! I'll definitely break it down to figure out exactly what you did. Thank you very much.

Re: “another query using analytics”

April 08, 2005 - 3:27 pm UTC

Reviewer: Gabe

You weren't given any resources ... so, I understand your solution was in fact merely an [untested] suggestion.

create table t1 ( x int primary key );

insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);

create table t2 ( x int not null references t1(x), y int not null, z int not null );

insert into t2 values ( 1,7000,1);
insert into t2 values ( 1,7000,6);
insert into t2 values ( 1,8000,8);
insert into t2 values ( 2,7000,1);
insert into t2 values ( 2,7000,5);
insert into t2 values ( 3,7000,3);
insert into t2 values ( 3,8000,1);
insert into t2 values ( 3,8000,7);
insert into t2 values ( 3,9000,5);

My solution (avoiding the distinct) is not necessarily better than the one presented by the “A reader”, but here it goes:

flip@FLOP> select x, y, c
  2        ,min(f) over (partition by x) f
  3        ,max(l) over (partition by x) l
  4    from (
  5  select t2.x, t2.y, count(*) c
  6        ,min(t2.z) keep (dense_rank first order by t2.z) f
  7        ,max(t2.z) keep (dense_rank last order by t2.z) l
  8    from t1, t2
  9   where t1.x = t2.x
 10   group by t2.x, t2.y
 11         ) t
 12  ;

         X          Y          C          F          L
---------- ---------- ---------- ---------- ----------
         1       7000          2          1          8
         1       8000          1          1          8
         2       7000          2          1          5
         3       7000          1          1          7
         3       8000          2          1          7
         3       9000          1          1          7

Cheers.

Tom Kyte

Followup  

April 08, 2005 - 3:34 pm UTC

without create tables and inserts, I guess :)

takes too much time to create the setup for every case (wish people would read the page that they have to page down through to put something up here...)

I'm confused

April 18, 2005 - 9:55 pm UTC

Reviewer: Mikito

Given that

select distinct deptno
from emp

is essentially

select deptno
from emp
group by deptno

how should the distinct query be rewritten when there are analytic columns? Neither

SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,min(sal) over (partition by deptno);

nor

SELECT deptno, count(1),
min(sal) over (partition by deptno) f
from emp
group by deptno,f;

seems to be valid syntax.

(To repeat: "Does analytics scale?")





Tom Kyte

Followup  

April 19, 2005 - 7:22 am UTC

why would you use analytics that way?

Tell us the question, we'll tell you the method.


select deptno, count(*) /* because count(1) is counter-intuitive */,
min(sal) over (partition by deptno) f
from emp
group by deptno, min(sal) over (partition by deptno)

would not make sense. You are saying "get all deptnos, by deptno find the minimum salary and associate that number with each one, then aggregate by deptno/min salary to count records"

You should just ask:

find the minimum salary and count of records by deptno.


select deptno, count(*), min(sal) from emp group by deptno;


is what you were looking for. analytics scale up wonderfully. Say the question was instead:

you have a table full of records that have a customer_id and a last_sale_date, I would like you to retrieve the last record for each customer.


select *
from ( select cust.*, max(sale_date) over (partition by cust_id) lsd
from cust )
where sale_date = lsd;

versus

select *
from cust
where sale_date =
(select max(sale_date) from cust c2 where cust_id = cust.cust_id )
/

or

select *
from cust, (select cust_id, max(sale_date) lsd from cust group by cust_id)x
where cust.cust_id = x.cust_id
and cust.sale_date = x.lsd
/

for example
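The first, analytic form can be sketched in Python with SQLite. The cust table here is an assumption (only cust_id and sale_date are given in the question; the amount column and all data are made up for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table cust (cust_id int, sale_date text, amount int)")
con.executemany("insert into cust values (?, ?, ?)", [
    (1, "2005-01-10", 100), (1, "2005-02-01", 150),
    (2, "2005-01-05", 80), (2, "2005-03-09", 90), (2, "2005-02-20", 70)])

# One pass over cust: tag every row with its partition's max sale_date,
# then keep only the rows that equal it -- the last record per customer.
sql = """
select cust_id, sale_date, amount
from (select cust.*, max(sale_date) over (partition by cust_id) lsd
      from cust)
where sale_date = lsd
order by cust_id
"""
last = con.execute(sql).fetchall()
print(last)  # [(1, '2005-02-01', 150), (2, '2005-03-09', 90)]
```

Note that, unlike the correlated-subquery and join rewrites, this form reads the table once.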

Tricky SQL?

April 19, 2005 - 10:29 am UTC

Reviewer: A reader

CREATE TABLE master
(
m_no INTEGER PRIMARY KEY,
m_name VARCHAR2(255) NOT NULL UNIQUE
);

create table detail
(
d_pk integer primary key,
d_no integer not null references m(m_no),
d_date date,
d_data varchar2(255)
);

Given a d_pk, how can I get the second-to-last (ordered by d_date) record from M for that M_NAME? In other words, for a given m_name, there are multiple records in "detail" with different dates. Given one of those records, I want the prior record in "detail" (there might not be any)

I tried to design a simple master detail table, but maybe I over-normalized?

Thanks

Tom Kyte

Followup  

April 19, 2005 - 12:00 pm UTC



are you saying "i have a detail record, I want the detail record that came 'in front' of this one"?

that is what I sort of hear, but the second to last is confusing me.


select *
from (
select ...., lead(d_pk) over (order by d_date) next_pk
from master, detail
where master.m_no = (select d_no from detail where d_pk = :x)
and master.m_no = detail.d_no
)
where next_pk = :x;

I think that does that. You get the master/detail for that d_pk (inline view)

Use lead to assign to each record the "next pk" after sorting by d_date

Keep the record whose 'next' records primary key was the one you wanted..


a little inconsistency

April 19, 2005 - 1:24 pm UTC

Reviewer: mikito

I meant inconsistency, not scalability. Why "distinct"

SELECT distinct deptno,
min(sal) over (partition by deptno) f
from emp

is allowed, whereas "group by" isn't? If someone has trouble understanding what analytics with "group by" means, the same should apply to analytics with "distinct" as well.


Tom Kyte

Followup  

April 19, 2005 - 1:26 pm UTC

because group by is not distinct, they are frankly very different concepts.



detail and summary in one sql statement

April 27, 2005 - 3:02 pm UTC

Reviewer: A reader

hi tom,

quick shot. i have to process many detail records (columns a - f) and one summary record (containing sum(column c) + count(*) over all recs + some literal placeholders) within one sql statement. is there another way than using a classical UNION ALL select? any new way with analytical functions?

Tom Kyte

Followup  

April 27, 2005 - 3:22 pm UTC

need small example, did not follow your example as stated.

detail and summary in one sql statement

April 28, 2005 - 10:08 am UTC

Reviewer: A reader

hi tom,

here is the small and simple test case to show what i mean.

SQL> create table t1 (col1 number primary key, col2 number, col3 number);

Table created.

SQL> create table t2 (col0 number primary key, col1 number references t1 (col1), col2 number, col3 number, col4 number);

Table created.

SQL> create index t2_col1 on t2 (col1);

Index created.

SQL> insert into t1 values (1, 1, 1);

1 row created.

SQL> insert into t2 values (1, 1, 1, 1, 1);

1 row created.

SQL> insert into t2 values (2, 1, 2, 2, 2);

1 row created.

SQL> insert into t2 values (3, 1, 3, 3, 3);

1 row created.

SQL> analyze table t1 compute statistics;

Table analyzed.

SQL> analyze table t2 compute statistics;

Table analyzed.

SQL> select 0 rowtype, t1.col1 display1, t1.col2 display2, t2.col3 display3, t2.col4 display4
  2  from   t1 join t2 on (t1.col1 = t2.col1)
  3  where  t1.col1 = 1
  4  UNION ALL
  5  select 1 rowtype, t1.col1, count (*), null, sum (t2.col4)
  6  from   t1 join t2 on (t1.col1 = t2.col1)
  7  where  t1.col1 = 1
  8  group  by t1.col1
  9* order  by rowtype

   ROWTYPE   DISPLAY1   DISPLAY2   DISPLAY3   DISPLAY4
---------- ---------- ---------- ---------- ----------
         0          1          1          1          1
         0          1          1          2          2
         0          1          1          3          3
         1          1          3                     6

that is creating detail + summary record within one sql statement! 

Tom Kyte

Followup  

April 28, 2005 - 10:18 am UTC

ops$tkyte@ORA10G> select grouping_id(t1.col2) rowtype,
  2         t1.col1 d1,
  3             decode( grouping_id(t1.col2), 0, t1.col2, count(*) ) d2,
  4             decode( grouping_id(t1.col2), 0, t2.col3, null ) d3,
  5             decode( grouping_id(t1.col2), 0, t2.col4, sum(t2.col4) ) d4
  6    from t1, t2
  7   where t1.col1 = t2.col1
  8   group by grouping sets((t1.col1),(t1.col1,t1.col2,t2.col3,t2.col4))
  9  /
 
   ROWTYPE         D1         D2         D3         D4
---------- ---------- ---------- ---------- ----------
         0          1          1          1          1
         0          1          1          2          2
         0          1          1          3          3
         1          1          3                     6
 

detail and summary in one sql statement

April 29, 2005 - 10:05 am UTC

Reviewer: A reader

hi tom,

thanks for your help. that's exactly what i need. analytics rock, analytics roll as you said. :)

unfortunately it is hard to get. :(

i looked in the documentation but cannot understand the grouping_id values in the example. could you please explain what the "2" or "3" values in the grouping columns mean?


Examples
The following example shows how to extract grouping IDs from a query of the sample table sh.sales:

SELECT channel_id, promo_id, sum(amount_sold) s_sales,
GROUPING(channel_id) gc,
GROUPING(promo_id) gp,
GROUPING_ID(channel_id, promo_id) gcp,
GROUPING_ID(promo_id, channel_id) gpc
FROM sales
WHERE promo_id > 496
GROUP BY CUBE(channel_id, promo_id);

C   PROMO_ID    S_SALES         GC         GP        GCP        GPC
- ---------- ---------- ---------- ---------- ---------- ----------
C        497   26094.35          0          0          0          0
C        498    22272.4          0          0          0          0
C        499    19616.8          0          0          0          0
C       9999   87781668          0          0          0          0
C            87849651.6          0          1          1          2
I        497    50325.8          0          0          0          0
I        498    52215.4          0          0          0          0
I        499   58445.85          0          0          0          0
I       9999  169497409          0          0          0          0
I             169658396          0          1          1          2
P        497   31141.75          0          0          0          0
P        498    46942.8          0          0          0          0
P        499      24156          0          0          0          0
P       9999   70890248          0          0          0          0
P            70992488.6          0          1          1          2
S        497  110629.75          0          0          0          0
S        498   82937.25          0          0          0          0
S        499   80999.15          0          0          0          0
S       9999  267205791          0          0          0          0
S             267480357          0          1          1          2
T        497     8319.6          0          0          0          0
T        498    5347.65          0          0          0          0
T        499      19781          0          0          0          0
T       9999   28095689          0          0          0          0
T            28129137.3          0          1          1          2
         497  226511.25          1          0          2          1
         498   209715.5          1          0          2          1
         499   202998.8          1          0          2          1
        9999  623470805          1          0          2          1
              624110031          1          1          3          3



Tom Kyte

Followup  

April 29, 2005 - 10:21 am UTC

</code> http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:37355353762363 <code>
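To answer the reviewer's question directly: GROUPING_ID is documented as a bit vector. It concatenates the GROUPING() bit of each of its arguments, leftmost argument as the high-order bit, and reads the result as a binary number. A minimal sketch against the same sh.sales query used above (no new columns beyond those already shown):

```sql
-- GROUPING_ID(a, b) = GROUPING(a) * 2 + GROUPING(b)
-- In the documentation output above, GCP = 2 (binary 10) marks rows where
-- channel_id was aggregated away but promo_id was kept; GCP = 3 (binary 11)
-- marks the grand-total row where both columns were aggregated away.
SELECT channel_id, promo_id,
       GROUPING(channel_id)              gc,   -- 1 when channel_id is rolled up
       GROUPING(promo_id)                gp,   -- 1 when promo_id is rolled up
       GROUPING_ID(channel_id, promo_id) gcp   -- = gc*2 + gp
  FROM sales
 WHERE promo_id > 496
 GROUP BY CUBE(channel_id, promo_id);
```

With more arguments the same rule applies, e.g. GROUPING_ID(a, b, c) = GROUPING(a)*4 + GROUPING(b)*2 + GROUPING(c).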



How to do this using Analytics

May 05, 2005 - 5:11 pm UTC

Reviewer: A reader

Hello Sir,
I have a denormalized table dept_emp, part of which I have reproduced here. It has/will have dupes.

I need to find all emps that belong to more than one dept using analytics (I want to avoid a self join).

So the required output must be :



DEPTNO DNAME      EMPNO ENAME
------ ---------- ----- --------------------
    10 D10            1 E1
    10 D10            1 E1
    10 D10            2 E2
    10 D10            2 E2

    20 D20            1 E1
    20 D20            1 E1
    20 D20            2 E2
    20 D20            2 E2

From the total set of:
SELECT * FROM DEPT_EMP ORDER BY DEPTNO, EMPNO

DEPTNO DNAME      EMPNO ENAME
------ ---------- ----- --------------------
    10 D10            1 E1
    10 D10            1 E1
    10 D10            2 E2
    10 D10            2 E2
    10 D10            3 E3
    10 D10            3 E3
    20 D20            1 E1
    20 D20            1 E1
    20 D20            2 E2
    20 D20            2 E2
    20 D20            4 E4
    20 D20            4 E4
    20 D20            5 E5
    20 D20            5 E5

14 rows selected.


create table dept_emp (deptno number , dname varchar2(10) ,empno number ,ename varchar2(20) ) ;

INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 2, 'E2');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 10, 'D10', 3, 'E3');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 4, 'E4');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 5, 'E5');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 1, 'E1');
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 20, 'D20', 2, 'E2');
COMMIT;

Thanx

Tom Kyte

Followup  

May 05, 2005 - 6:10 pm UTC

no analytics

select empno, count(distinct deptno)
  from t
 group by empno
having count(distinct deptno) > 1;

Thanx Sir

May 05, 2005 - 9:20 pm UTC

Reviewer: A reader

Actually I was planning to use analytics to get the whole row info; I will do the same trick with analytics, then.

You are a Genius.

Tom Kyte

Followup  

May 06, 2005 - 7:17 am UTC

select *
  from (
        select t.*, count(distinct deptno) over (partition by empno) cnt
          from t
       )
 where cnt > 1;


Analytical solution

May 10, 2005 - 6:29 am UTC

Reviewer: Baiju Menon from India

Sir,
I want to list the department(s) with the maximum number of employees, using an analytic function (only the department(s) in which the maximum number of employees work).

The query without the analytic function is:

select deptno, count(deptno)
  from emp
 group by deptno
having count(deptno) in (select max(count(deptno)) from emp group by deptno);
Thanks


Tom Kyte

Followup  

May 10, 2005 - 9:15 am UTC

  1  select deptno, cnt
  2    from (
  3  select deptno, cnt, max(cnt) over() max_cnt
  4    from (
  5  select deptno, count(*) cnt
  6    from emp
  7   group by deptno
  8         )
  9         )
 10* where cnt = max_cnt
scott@ORA9IR2> /

    DEPTNO        CNT
---------- ----------
        30          6
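For what it's worth, the same result can also be had with DENSE_RANK over the aggregated counts instead of MAX(cnt) OVER (); like the query above, this sketch returns all departments in the event of a tie:

```sql
select deptno, cnt
  from ( select deptno, count(*) cnt,
                dense_rank() over (order by count(*) desc) rnk
           from emp
          group by deptno
       )
 where rnk = 1;
```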


group by

May 11, 2005 - 4:15 am UTC

Reviewer: Anoop Gupta from INDIA

Hi Tom,

I have a table whose data looks like this:

empid  leavelname
1001   Level1
1001   Level2
1001   Level3
1001   Level4
1002   Level1
1002   Level2
1002   Level3
...
...

This table tells which levels an employee is assigned to.
Is there any query possible that will return data like this, without writing a function?

empid  emp_assigned_on_leavel
1001   level1,level2,level3,level4
1002   level1,level2,level3
...
...

Waiting for your response.....



Tom Kyte

Followup  

May 11, 2005 - 7:30 am UTC

only if there is some reasonable maximum number of levelname rows per empid.

is there?

Analytics Rock - But why are they slower for me

May 13, 2005 - 1:00 am UTC

Reviewer: Jeff Plumb from Melbourne, Australia

Hi Tom,

I have followed your example about analytics from Effective Oracle by Design, page 516 (find a specific row in a partition). When I run the example and tkprof the 3 different queries, the analytic version actually takes a lot longer to run, although it does fewer logical I/Os. It does a lot more physical I/Os, so I am guessing it is using a temporary segment on disk to perform the window sort. To perform the test I created the big_table that you use and populated it with 1,000,000 rows. I am using Oracle 9i Release 2. Here is the output from TKPROF:

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
********************************************************************************

select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8      5.32       6.42      13815      14669          0         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      5.32       6.42      13815      14669          0         694

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33

Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE

********************************************************************************

select t.owner, t.object_name, t.created
from big_table t
join (select owner, max(created) maxcreated
from big_table
group by owner) t2
on (t2.owner = t.owner and t2.maxcreated = t.created)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8      5.03       5.06      13816      14669          0         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      5.03       5.06      13816      14669          0         694

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33

Rows Row Source Operation
------- ---------------------------------------------------
694 HASH JOIN
20 VIEW
20 SORT GROUP BY
1000000 TABLE ACCESS FULL BIG_TABLE
1000000 TABLE ACCESS FULL BIG_TABLE

********************************************************************************

select owner, object_name, created
from
( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8     16.68      40.66      15157       7331         17         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10     16.68      40.66      15157       7331         17         694

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33

Rows Row Source Operation
------- ---------------------------------------------------
694 VIEW
1000000 WINDOW SORT
1000000 TABLE ACCESS FULL BIG_TABLE

********************************************************************************

And when I run the query with the analytics using autotrace I get the following which shows a sort to disk:
SQL*Plus: Release 9.2.0.6.0 - Production on Fri May 13 14:53:08 2005

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production

control@DWDEV> set autot traceonly
control@DWDEV> select owner, object_name, created
2 from
3 ( select owner, object_name, created, max(created) over (partition by owner) as maxcreated
4 from big_table
5 )
6 where created = maxcreated;

694 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4399 Card=1000000 Bytes=52000000)
1 0 VIEW (Cost=4399 Card=1000000 Bytes=52000000)
2 1 WINDOW (SORT) (Cost=4399 Card=1000000 Bytes=43000000)
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=637 Card=1000000 Bytes=43000000)




Statistics
----------------------------------------------------------
0 recursive calls
17 db block gets
7331 consistent gets
15348 physical reads
432 redo size
12784 bytes sent via SQL*Net to client
717 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
694 rows processed

So how can I stop the sorts (disk)? I am guessing that the pga_aggregate_target needs to be higher, but it seems to already be set quite high.

control@DWDEV> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 524288000

I hope you can help clarify how to make the analytic version run quicker.

Thanks.


Tom Kyte

Followup  

May 13, 2005 - 9:50 am UTC

it'll be a function of the number of "owners" here

You have 1,000,000 records.

You have but 20 users.

in this extreme case, having 50,000 records per window and swapping out was not as good as squashing the data down to 20 records and joining -- the CBO quite smartly rewrote:

select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)

as

select ...
from big_table t, (select owner,max(created) created from big_table t2 ...)
where ....



So, does the data you analyze to find the "most current record" tend to have 50,000 records/key in real life?

In your case, your hash table didn't spill to disk. In real life though, the numbers would probably be much different. a 1,000,000 row table would have keys with 10 or 100 rows maybe, not 50,000 (in general). There you would find the answer to be very different.

And if you let the sort run in memory it would be different as well -- you would get a max of 25m given your pga aggregate target setting that may have been too small.

but consider what happens when the size of the "aggregate" goes up -- diminishing marginal returns set in:

select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      2.06       2.01      26970      29283          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      2.06       2.01      26970      29283          0        4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      4.57      10.05      30603      14484         15        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      4.57      10.05      30603      14484         15        4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.70      12.04      33787      45393          2     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.71      12.05      33787      45393          2     1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.00       9.60       9336      14484          2     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.00       9.60       9336      14484          2     1000000





and, given sufficient space to work "in memory", these two big queries both benefited:


select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.owner = t.owner)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      1.82       1.96       9909      29283          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      1.83       1.97       9909      29283          0        4775
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by owner) as maxcreated
from big_table
)
where created = maxcreated

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      2.15       2.11       2858      14484          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      2.15       2.11       2858      14484          0        4775
********************************************************************************
select owner, object_name, created
from big_table t
where created = (select max(created)
from big_table t2
where t2.id = t.id)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.64       7.55      10181      94633          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.65       7.56      10181      94633          0     1000000
********************************************************************************
select owner, object_name, created
from
( select owner, object_name, created,
max(created) over (partition by id) as maxcreated
from big_table
)
where created = maxcreated

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      5.69       5.49       2699      14484          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      5.69       5.49       2699      14484          0     1000000


(this was a dual cpu xeon using 'nonparallel' query in this case, once with a 256mb pga aggregate target and again with a 2gig one.)
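A way to see whether large workareas are still spilling after a change to pga_aggregate_target is to check the PGA statistics views; a sketch, using the standard v$ views as documented for 9i and up:

```sql
-- overall PGA usage versus the target
select name, value
  from v$pgastat
 where name in ('aggregate PGA target parameter',
                'total PGA allocated',
                'over allocation count');

-- workarea executions by size band: "optimal" ran entirely in memory,
-- "onepass"/"multipass" executions had to spill to temp
select low_optimal_size, high_optimal_size,
       optimal_executions, onepass_executions, multipasses_executions
  from v$sql_workarea_histogram
 where total_executions > 0;
```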

May 14, 2005 - 3:27 am UTC

Reviewer: kuldeep from India

Dear Tom,

I have three tables t1, t2 & t3, where t2 & t3 are each joined to t1 on column "key_id".
Now I need the sum of key_val (an amount) from t2 and the sum of key_val from t3 for each key_id
in table t1.

kuldeep@dlfscg> select * from t1;

    KEY_ID    KEY_VAL
---------- ----------
         2       1980
         1       1975

kuldeep@dlfscg> select * from t2;

    KEY_ID    KEY_VAL
---------- ----------
         2        550
         2        575
         1        500

kuldeep@dlfscg> select * from t3;

    KEY_ID    KEY_VAL
---------- ----------
         2        900
         1       1000
         1        750

***** QUERY 1 *****

kuldeep@dlfscg> SELECT t1.key_id, SUM(t2.key_val) sum_t2_key_val, SUM(t3.key_val) sum_t3_key_val
  2    FROM t1, t2, t3
  3   WHERE t1.key_id=t2.key_id
  4     AND t1.key_id=t3.key_id
  5   GROUP BY t1.key_id
  6  /

    KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
         1           1000           1750
         2           1125           1800

***** QUERY 2 *****

kuldeep@dlfscg> SELECT t1.key_id, t2.sum_t2_key_val, t3.sum_t3_key_val
  2    FROM t1,
  3         (SELECT key_id, SUM(key_val) sum_t2_key_val FROM t2 GROUP BY key_id) t2,
  4         (SELECT key_id, SUM(key_val) sum_t3_key_val FROM t3 GROUP BY key_id) t3
  5   WHERE t1.key_id=t2.key_id
  6     AND t1.key_id=t3.key_id
  7  /

    KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
         1            500           1750
         2           1125            900

Query 1 gives the wrong result, and I cannot use query 2 because its performance is very poor.

Oracle 9i has added a lot of new grouping features and a lot of analytic functions (all going over my head).

Is there any "special" SUM function, or some other way, to pick up a row's value only once per query key (here "key_id"),
irrespective of how many times the row appears in the query result?

    KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
         1        500       1000
         1        500        750   <---- 500 of t2 should not be counted; it is a repeat
         2        550        900
         2        575        900   <---- 900 of t3 should not be counted; it is a repeat

thanks and regards,



Tom Kyte

Followup  

May 14, 2005 - 9:36 am UTC

select t1.key_id, t2.sum_val, t3.sum_val
  from t1,
       (select key_id, sum(key_val) sum_val from t2 group by key_id) t2,
       (select key_id, sum(key_val) sum_val from t3 group by key_id) t3
 WHERE t1.key_id=t2.key_id
   AND t1.key_id=t3.key_id


apply an amount across multiple records

May 15, 2005 - 8:17 pm UTC

Reviewer: Dave from Seattle

I have a problem similar to what I call the invoice payment problem.
It would seem to be a common problem, but I have searched to no avail.

The idea is that a customer may have many outstanding invoices, and sends in a check for an arbitrary amount. So we need to apply the money across the invoices oldest first.
Note that in my specific case, if a payment exceeds the total outstanding, the excess is ignored (obviously not dealing with real money here!)

create table invoices (
  cust_nbr    integer not null,
  invoice_nbr integer not null,
  invoice_amt number  not null,
  payment_amt number  not null,
  primary key (cust_nbr, invoice_nbr)
);

begin
  delete from invoices;
  dbms_random.seed(123456789);
  for c in 1 .. 2 loop
    for i in 1 .. 3 loop
      insert into invoices values (c, i, round(dbms_random.value * 10, 2)+1, 0);
    end loop;
  end loop;
  update invoices
     set payment_amt = round(dbms_random.value * invoice_amt, 2)
   where invoice_nbr = 1;
  commit;
end;
/

select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
       invoice_amt - payment_amt outstanding_amt
  from invoices
 where invoice_amt - payment_amt > 0
 order by cust_nbr, invoice_nbr;

  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        5.55            3.89
         1           2        3.21           0            3.21
         1           3        2.78           0            2.78
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92

variable cust_nbr number;
variable received_amt number;
begin
  :cust_nbr := 1;
  :received_amt := 7.25;
end;
/

update invoices i1
   set payment_amt = (... some query which applies
                      :received_amt to outstanding_amt ...)
 where cust_nbr = :cust_nbr;


result should be:

  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        9.44               0
         1           2        3.21        3.21               0
         1           3        2.78         .15            2.63
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92


This is simple to solve in pl/sql with a cursor, but I thought it would be a good test for a set-based solution with analytics. But after some effort, I'm stumped.


Tom Kyte

Followup  

May 16, 2005 - 7:37 am UTC

Using analytics we can see how to apply the inputs:

ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  2         least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
  3    from (
  4  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  5         invoice_amt - payment_amt outstanding_amt,
  6         sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
  7    from invoices
  8   where cust_nbr = :cust_nbr
  9         )
 10    order by cust_nbr, invoice_nbr;
 
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT AMOUNT_TO_APPLY
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        5.55            3.89
         1           2        3.21           0            3.21
         1           3        2.78           0             .15


Just needed a running total of outstanding amounts to take away from the received amount....

Then, merge:

ops$tkyte@ORA9IR2> merge into invoices
  2  using
  3  (
  4  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  5         least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
  6    from (
  7  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  8         invoice_amt - payment_amt outstanding_amt,
  9         sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
 10    from invoices
 11   where cust_nbr = :cust_nbr
 12         )
 13  ) x
 14  on ( invoices.cust_nbr = x.cust_nbr and invoices.invoice_nbr = x.invoice_nbr )
 15  when matched then update set payment_amt = nvl(payment_amt,0)+x.amount_to_apply
 16  when not matched /* never happens... */ then insert (cust_nbr) values (null);
 
3 rows merged.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  2         invoice_amt - payment_amt outstanding_amt
  3    from invoices
  4    order by cust_nbr, invoice_nbr;
 
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        9.44               0
         1           2        3.21        3.21               0
         1           3        2.78         .15            2.63
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92
 
6 rows selected.
 

Group by

May 16, 2005 - 10:06 am UTC

Reviewer: Anoop Gupta from INDIA


Hi Tom,

As I asked earlier:

I have a table whose data looks like this:

empid  leavelname
1001   Level1
1001   Level2
1001   Level3
1001   Level4
1002   Level1
1002   Level2
1002   Level3
...
...

This table tells which levels an employee is assigned to.
Is there any query possible that will return data like this, without writing a
function?

empid  emp_assigned_on_leavel
1001   level1,level2,level3,level4
1002   level1,level2,level3
...
...

Please show me how to write the query, supposing we have a limit of 50 levels per employee.

Please reply....




Tom Kyte

Followup  

May 16, 2005 - 1:09 pm UTC

select empid,
       rtrim( max(decode(rn, 1,leavelname)) || ',' ||
              max(decode(rn, 2,leavelname)) || ',' ||
              ....
              max(decode(rn,50,leavelname)), ',' )
  from (select empid,
               row_number() over (partition by empid order by leavelname) rn,
               leavelname
          from t
       )
 group by empid;
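On releases that postdate this thread (11.2 onward), the built-in LISTAGG aggregate does this string aggregation directly, with no fixed limit on rows per empid (subject to the VARCHAR2 length limit); a sketch against the same table t:

```sql
select empid,
       listagg(leavelname, ',') within group (order by leavelname) emp_levels
  from t
 group by empid;
```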

special sum

May 17, 2005 - 12:38 am UTC

Reviewer: kuldeep from India

Dear Tom,

Thanks for your response and for this useful site.

I was looking for a solution that would avoid these inline views, which were making my query run slow. I tried, and arrived at this query,

/* DATA VIEW */
kuldeep@dlfscg> SELECT t1.key_id,
  2         t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
  3         t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
  4    FROM t1, t2, t3
  5   WHERE t1.key_id=t2.key_id
  6     AND t1.key_id=t3.key_id
  7   ORDER BY t1.key_id
  8  /

    KEY_ID T2_ROWID                T2_RN    KEY_VAL T3_ROWID                T3_RN    KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
         1 AAANZ5AAHAAAD94AAA          1        500 AAANZ4AAHAAAD9wAAA          1       1000
         1 AAANZ5AAHAAAD94AAA          2        500 AAANZ4AAHAAAD9wAAB          1        750
         2 AAANZ5AAHAAAD91AAA          1        550 AAANZ4AAHAAAD9tAAA          1        900
         2 AAANZ5AAHAAAD91AAB          1        575 AAANZ4AAHAAAD9tAAA          2        900

/* FINAL QUERY */

kuldeep@dlfscg> SELECT key_id,
  2         SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
  3         SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
  4    FROM (SELECT t1.key_id,
  5                 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
  6                 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
  7            FROM t1, t2, t3
  8           WHERE t1.key_id=t2.key_id
  9             AND t1.key_id=t3.key_id)
 10   GROUP BY key_id
 11  /

    KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
         1        500       1750
         2       1125        900

regards,


Tom Kyte

Followup  

May 17, 2005 - 8:23 am UTC

one would need more information -- it APPEARS that you are trying to get a "random first hit" from T2 and T3 by T1.key_id.

That is, for every row in T1 -- find the first match (any match will do) in T2 and in T3, and report that value.

Is that correct?

And how big are t1, t2 and t3 -- and how long is long?

group by

May 17, 2005 - 9:42 am UTC

Reviewer: Anoop Gupta from INDIA

Tom,

Thanks for your prompt response.

Analytical Problem

May 18, 2005 - 4:16 am UTC

Reviewer: Imran

Look at the following two queries.

SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17           WHERE phone IN (7629458));

PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458   Apr, 2005          2562.52       5265         5265 UP               0

SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17           WHERE phone IN (7629459));

PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629459   Apr, 2005          3516.62       7834         7834 UP               0

SQL> 

Now when I combine the two queries results are different. 

  1  SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17*          WHERE phone IN (7629458,7629459))
SQL> /

PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458   Apr, 2005          2562.52       5265         2569 DOWN          7834
202 - 7629459   Apr, 2005          3516.62       7834         7834 UP               0

So you can see that the previous-month balance is now badly disturbed.

Please tell me how to fix this. 

Tom Kyte

Followup  

May 18, 2005 - 8:58 am UTC

need test case. create table, inserts (like the page used to submit this said....)
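Pending that test case, one plausible cause can be sketched: LEAD (...) OVER (ORDER BY MONTH DESC) defines a single window over the whole result set, so with two phones in the IN list each phone's last row "sees" the other phone's data. If that is the issue, partitioning the window by phone keeps the two series separate:

```sql
-- each phone's months form their own window, so prev_month
-- can no longer pick up another phone's paybefdue
LEAD (paybefdue, 1, 0) OVER (PARTITION BY exch, phone
                             ORDER BY MONTH DESC) prev_month
```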

Use of analytic functions in UPDATE statements

May 18, 2005 - 12:29 pm UTC

Reviewer: Bob Lyon from Houston

Tom,


-- Given this sample data

CREATE TABLE GT (
  XP_ID      INTEGER,
  OFFSET     INTEGER,
  PMAX       NUMBER,
  PRIOR_PMAX NUMBER
);

INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 1, 3);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 2, 8);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 155, 3, 5);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 173, 3, 7.3);

-- I want to update the table and set the PRIOR_PMAX column values to be as follows

SELECT XP_ID, OFFSET, PMAX,
       LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
                                ORDER BY XP_ID, OFFSET) PRIOR_PMAX
  FROM GT
 ORDER BY XP_ID, OFFSET;

     XP_ID     OFFSET       PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
       123          1          3
       123          2          8          3
       155          3          5
       173          3        7.3

-- My update to do this tells me "4 rows updated.", but does not do what I want.

UPDATE GT A
   SET PRIOR_PMAX = (
       SELECT LAG(B.PMAX, 1, NULL) OVER (PARTITION BY B.XP_ID
                                         ORDER BY B.XP_ID, B.OFFSET) PRIOR_PMAX
         FROM GT B
        WHERE A.ROWID = B.ROWID
       );

-- but I get

SELECT xp_id, offset, pmax, prior_pmax
  FROM GT
 ORDER BY xp_id, offset;

     XP_ID     OFFSET       PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
       123          1          3
       123          2          8
       155          3          5
       173          3        7.3

-- Oracle doc states
-- "Therefore, analytic functions can appear only in the select list or ORDER BY clause."
-- which is perhaps a little ambiguous in this case.
-- Is there a way to do this update in "straight SQL"?



Tom Kyte

Followup  

May 18, 2005 - 12:54 pm UTC

you can merge

merge into gt a
using ( SELECT rowid rid, XP_ID, OFFSET, PMAX,
               LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
                                        ORDER BY XP_ID, OFFSET) PRIOR_PMAX
          FROM GT ) b
on ( a.rowid = b.rid )
when matched then update ...
when not matched (never happens, just do a dummy insert of a single null in 9i, or leave off entirely in 10g)
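Filled out against the GT table above, one way to complete that sketch for 9i looks like this; the SET list and the dummy insert are the elided parts, so treat it as an illustration rather than tested code:

```sql
merge into gt a
using ( select rowid rid,
               lag(pmax, 1, null) over (partition by xp_id
                                        order by offset) prior_pmax
          from gt ) b
on ( a.rowid = b.rid )
when matched then
  update set prior_pmax = b.prior_pmax
when not matched then                -- never fires: b is built from gt itself
  insert ( xp_id ) values ( null );
```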

special sum

May 19, 2005 - 1:09 am UTC

Reviewer: Kuldeep from India

My requirement was like this: I have receivables (bills, debit notes, etc.) which I adjust against received payments and credit notes (each in a separate table). To find the outstanding amount I was outer-joining my receivables to the payments and credit notes.

Because one receivable can be adjusted against many payments and credit notes, the outstanding amount is:

outstanding = receivable amount - sum(payment amount) - sum(credit note amount)

This simple query using an outer join gives the wrong result if a receivable is adjusted against one payment and more than one credit note, or vice versa.

In this case, where

receivable: 1000    payment: 400    CN: 400, 200

the join produces

1000   400   400
1000   400   200
       ---   ---
       800   600     outstanding = 1000 - 800 - 600 = -400 (wrong)

My t1, t2 and t3 have 600,000, 350,000 and 80,000 rows respectively.

This is my actual inline view query
-----------------------------------
SELECT a.bill_type, a.bill_exact_type, a.period_id,
       a.scheme_id, a.property_number, a.bill_number,
       a.bill_amount, SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) adj_amt,
       NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) pending_amt
  FROM ALL_RECEIVABLE a,
       (SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number,
               SUM(adj_amt) adj_amt
          FROM CREDIT_NOTE_RECEIVABLE
         WHERE bill_type=p_bill_type
           AND scheme_id=p_scheme
           AND property_number=p_prop
         GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) c,
       (SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number,
               SUM(adjust_amount) adjust_amount
          FROM PAYMENT_RECEIPT_ADJ
         WHERE bill_type=p_bill_type
           AND scheme_id=p_scheme
           AND property_number=p_prop
         GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) p
 WHERE a.bill_type=P_BILL_TYPE
   AND a.scheme_id=P_SCHEME
   AND a.property_number=P_PROP
   AND a.bill_type=c.bill_type(+)
   AND a.bill_exact_type=c.bill_exact_type(+)
   AND a.period_id=c.period_id(+)
   AND a.scheme_id=c.scheme_id(+)
   AND a.property_number=c.property_number(+)
   AND a.bill_number=c.bill_number(+)
   AND a.bill_type=p.bill_type(+)
   AND a.bill_exact_type=p.bill_exact_type(+)
   AND a.period_id=p.period_id(+)
   AND a.scheme_id=p.scheme_id(+)
   AND a.property_number=p.property_number(+)
   AND a.bill_number=p.bill_number(+)
 GROUP BY a.bill_type, a.bill_exact_type, a.period_id, a.scheme_id,
          a.property_number, a.bill_number, a.bill_date, a.bill_amount
HAVING (NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0))) > 0
 ORDER BY a.bill_date;
-----------------------------------

It is not reporting just the first hit of t1 in t2 and t3. In my last posting, I was trying to exclude any repeat of a t2 or t3 row from the sum calculation; that is, each row of t2 and t3 should be counted only once.

I have tried this query putting more rows and applied the same on actual query, it is working fine and giving the same result as previous inline view query was giving.

kuldeep@dlfscg> SELECT t1.key_id,
2 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
3 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t2_rn, t3.key_val
4 FROM t1, t2, t3
5 WHERE t1.key_id=t2.key_id(+)
6 AND t1.key_id=t3.key_id(+)
7 ORDER BY t1.key_id
8 /

KEY_ID T2_ROWID T2_RN KEY_VAL T3_ROWID T2_RN KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
1 AAANZ5AAHAAAD94AAA 1 500 AAANZ4AAHAAAD9wAAA 1 1000
1 AAANZ5AAHAAAD94AAA 2 500 AAANZ4AAHAAAD9wAAB 1 750
1 AAANZ5AAHAAAD94AAA 3 500 AAANZ4AAHAAAD9wAAC 1 25
2 AAANZ5AAHAAAD91AAA 1 550 AAANZ4AAHAAAD9tAAA 1 900
2 AAANZ5AAHAAAD91AAB 1 575 AAANZ4AAHAAAD9tAAA 2 900
3 AAANZ5AAHAAAD91AAC 1 222 1
3 AAANZ5AAHAAAD91AAD 1 223 2
4 1 AAANZ4AAHAAAD9tAAB 1 333

8 rows selected.

kuldeep@dlfscg> SELECT key_id,
2 SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
3 SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
4 FROM (SELECT t1.key_id,
5 t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
6 t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
7 FROM t1, t2, t3
8 WHERE t1.key_id=t2.key_id(+)
9 AND t1.key_id=t3.key_id(+))
10 GROUP BY key_id
11 /

KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
1 500 1775
2 1125 900
3 445 0
4 333

kuldeep@dlfscg>

thanks for your responses.

regards,

Tom Kyte

Followup  

May 19, 2005 - 7:47 am UTC

do not order by rowid to get a last row -- is that what you are trying to do??


which row do you want to get from t2 to join with t1
and which row do you want to get from t3 to join with t1

You must specify that based on attributes you manage (eg: there must be an orderable field that helps you determine WHICH record is the right one)


consider rowid to be a random value that has no meaning when ordered by; it does not imply order of insertion or anything else.
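Put differently: pick the joining row with row_number() over an attribute you manage. A hedged sketch, assuming a hypothetical created_date column on t2 (the posted tables have no such column):

```sql
select *
  from ( select t2.*,
                row_number() over (partition by key_id
                                   order by created_date desc) rn
           from t2 )
 where rn = 1;  -- deterministically the latest t2 row per key_id
```

If created_date is not unique within a key_id, add a tie-breaker column to the order by, or the choice is again non-deterministic.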

null record

May 25, 2005 - 4:12 pm UTC

Reviewer: yeshk from St.louis,mo,usa

I need help with this query - This is just a part of the query I am working with.

I am not able to generate a NULL RECORD in between the result set.
I should be able to pass this information out as a reference cursor.

create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt number,non_measd_tkt number);

insert into test values('CA','NDS',100,200);
insert into test values('IL','DSL',200,300);
insert into test values('CA','DSL',100,300);
insert into test values('MO','NDS',1000,300);
insert into test values('MO','DSL',100,200);

I need a result like this
STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300
CA NDS 100 200
TOTAL 300 500

IL DSL 200 300
TOTAL 200 300

MO DSL 100 200
MO NDS 1000 300
TOTAL 1100 500

I am able to generate the result using a query with analytics, but I don't know how to get an empty row after each state total.

Also, which is better:
1) a cursor based on state,
2) get the data and insert it into a temporary table, or
3) insert a null record?

Or should we use analytics to get the complete data and put it into a reference cursor?

Thanks
yeshk

Tom Kyte

Followup  

May 25, 2005 - 7:57 pm UTC

well, that would sort of be the job of the "pretty printing routine" -- eg: the report generator?

what tool is printing this out?

null record

May 26, 2005 - 9:20 am UTC

Reviewer: yeshk from St.louis,mo,usa

We need to give the result set, with a null record after each state's totals, to a front-end VB application. It will be given in a reference cursor; they will just select * from the reference cursor and display it on a report.

Tom Kyte

Followup  

May 26, 2005 - 10:02 am UTC

the VB application should do this (it should be able to do something, shouldn't it...)

ops$tkyte@ORA9IR2> select decode( grp, 0, state ) state,
  2         decode( grp, 0, svc_cat) svc_cat,
  3             decode( grp, 0, sum_mt ) sum_mt,
  4             decode( grp, 0, sum_nmt ) sum_nmt
  5    from (
  6  select grouping(dummy) grp, state, svc_cat, sum(measd_tkt) sum_mt, sum(non_measd_tkt) sum_nmt
  7    from (
  8  select state, svc_cat, 1 dummy, measd_tkt, non_measd_tkt
  9    from test
 10         )
 11   group by rollup( state, dummy, svc_cat )
 12         )
 13  /
 
ST SVC     SUM_MT    SUM_NMT
-- --- ---------- ----------
CA DSL        100        300
CA NDS        100        200
CA            200        500
 
IL DSL        200        300
IL            200        300
 
MO DSL        100        200
MO NDS       1000        300
MO           1100        500
 
 
 
12 rows selected.
 
 

Can rollup do the thing??

May 26, 2005 - 9:39 am UTC

Reviewer: Bhavesh Ghodasara from Ahmedabad,Gujarat,India

Hi yeshk,

create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt
number,non_measd_tkt number);

insert into test values('CA','NDS',100,200);........

insert into test values('CA','DSL',100,300);....



STATE SVC_CAT MEASD_TKT NON MEASD TKT
CA DSL 200 300 <== where does measd_tkt=200 come from?
CA NDS 100 200
TOTAL 300 500

Tom, can we do it like this:

break on state
select STATE,SVC_CAT,sum(measd_tkt),sum(non_measd_tkt)
from test
group by rollup(STATE,SVC_CAT)
order by state

............
If I have made any mistake, please tell me.
Thanks in advance.

Tom Kyte

Followup  

May 26, 2005 - 10:19 am UTC

see above

Which analytics to use?

May 30, 2005 - 9:10 pm UTC

Reviewer: Marc-Andre Larochelle from Montreal, PQ

Hi Tom,

I have this 3rd party table:

drop table t;
create table t (atype varchar2(4),
acol# varchar2(3),
adin varchar2(8),
ares varchar2(8));

insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
insert into t (atype, acol#, ares) values ('MACT','1','02246569');
insert into t (atype, acol#, ares) values ('MACT','6','02246569');
insert into t (atype, acol#, ares) values ('MACT','7','00021474');

select * from t;

ATYPE ACOL# ADIN ARES
----- ----- -------- --------
DUPT 001 02246569
DUPT 002 00021474
DUPT 003 02246569
MACT 1 02246569
MACT 6 02246569
MACT 7 00021474

I would like to get the following result :

DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569

I need to match DUPT.adin = MACT.ares together, making sure MACT.acol# is different for every DUPT.acol#. Basically this table has different values in its columns depending on the type of row (atype).

I have tried using lag, lead, and rank, and nothing seems to work, but I am pretty sure it is doable with analytics, which is why I posted my question here.

Any hint/help would be appreciated.

Thank you,

Marc-Andre

Tom Kyte

Followup  

May 31, 2005 - 7:30 am UTC


question for you.

How did you know to put:

DUPT 001 02246569 together with MACT 1 02246569 and
DUPT 003 02246569 together with MACT 6 02246569

and not

DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569

for example. some logic is missing here.

Am I Correct??

May 31, 2005 - 5:15 am UTC

Reviewer: Bhavesh Ghodasara from Ahmedabad,Gujarat,India

Hi tom,
I solved the above problem.
The query is:

select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol# bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
count(*) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol<>lb

I think there must be a better way; I know you will do it in a much better way.
Please suggest corrections.
Thanks in Advance..


Tom Kyte

Followup  

May 31, 2005 - 8:17 am UTC


ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 1 02246569


well, it gives a different result than the one you posted; it gives my hypothetical answer, where 001 was combined with 6, not 1.

We can do this..

May 31, 2005 - 8:28 am UTC

Reviewer: Bhavesh Ghodasara from Ahmedabad,Gujarat,India

Hi tom,
I have further modified my query;
now it gives the desired result.
(Agreed that the question is ambiguous.)
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
min(b.acol#) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol=lb
or cnt>1

OUTPUT:
ATYP ACO AADIN BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569

So any corrections now??
Thanks in advance
Bhavesh


Tom Kyte

Followup  

May 31, 2005 - 8:43 am UTC

I don't know your data well enough, but your query is non-deterministic if you care.  Consider:


ops$tkyte@ORA10G> create table t (atype varchar2(4),
  2                  acol# varchar2(3),
  3                  adin varchar2(8),
  4                  ares varchar2(8));
 
Table created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
 
1 row created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
  2  from (
  3  select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
  4  bacol,b.ares bares,
  5  nvl(lead(b.acol# ) over(order by a.adin),0) lb,
  6  min(b.acol#) over(partition by a.acol#) cnt
  7  from t a,t b
  8  where a.adin=b.ares
  9  order by atyp,acol) t
 10  where bacol=lb
 11  or cnt>1;
 
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 002 00021474 MACT 7   00021474
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
 
1 row created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
  2  from (
  3  select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
  4  bacol,b.ares bares,
  5  nvl(lead(b.acol# ) over(order by a.adin),0) lb,
  6  min(b.acol#) over(partition by a.acol#) cnt
  7  from t a,t b
  8  where a.adin=b.ares
  9  order by atyp,acol) t
 10  where bacol=lb
 11  or cnt>1;
 
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6   02246569
DUPT 002 00021474 MACT 7   00021474


Same data both times, just different order of insertions.  With analytics and order by, you need to be concerned about duplicates. 

Answers

May 31, 2005 - 11:35 am UTC

Reviewer: Marc-Andre Larochelle from Montreal, PQ

Tom, Bhavesh,

The problem resides exactly there: there is no logic to match the records. I know that a DUPT.adin must have a matching MACT.ares somewhere; I just don't know which one (the 1st one? the 2nd one?). This is a decision I will have to make.

DUPT 001 02246569 MACT 1 02246569
DUPT 003 02246569 MACT 6 02246569
and
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569

are the same to me. But when I run the query, I want to always get the same results.

Anyways, all in all, your queries (Bhavesh's - thank you - and yours) seem to answer my question. I will watch out for duplicates.

Thank you very much for the quick help.

Marc-Andre

What I found

May 31, 2005 - 5:02 pm UTC

Reviewer: Marc-Andre Larochelle from Montreal, PQ

Hi Tom,

Testing the SQL statement Bhavesh provided, I quickly discovered what you meant about the query being non-deterministic. When I added a 4th record:

insert into t (atype,acol#,adin) values ('DUPT','004','02246569');
insert into t (atype,acol#,ares) values ('MACT','5','02246569');

only one row was returned. I played with the query and here is what I came up with :

select atyp,acol,aadin,batype,bacol,bares
from (
select atyp,acol,aadin,batype,bacol,bares,drnk ,
rank() over (partition by acol order by bacol) rnk
from (
select a.atype atyp,
a.acol# acol,
a.adin aadin,
b.atype batype,
b.acol# bacol,
b.ares bares,
dense_rank() over (partition by a.atype,a.adin order by a.acol#) drnk
from t a,t b
where a.adin=b.ares))
where drnk=rnk;

Feel free to comment.

Again thank you (and Bhavesh).

Marc-Andre

Using Analytical Values to find latest info

June 03, 2005 - 10:41 am UTC

Reviewer: anirudh from newyork, NY

Hi Tom,

we have a fairly large table with about 100 million rows, among others this table has
the following columns

CREATE TABLE my_fact_table (
staff_number VARCHAR2 (10), -- staff number
per_end_dt DATE, -- last day of month
engagement_code VARCHAR2 (30), -- engagement code
client_code VARCHAR2 (20), -- client code
revenue NUMBER (15,2) -- revenue
)

in this table the same engagement code can have different client codes for different periods. This was at one point desirable, and that is the reason the client code was stored in this fact table instead of the engagement dimension.

Our users now want us to update the client code in these transactions to the latest value of the client code (meaning - pick the client from the latest month for which we have got any transactions for that engagement)

This situation, where the same engagement has multiple clients across periods, applies to about 5% of the rows.

[btw - we do plan to do data-model change to reflect the new relationships - but that may take some time - hence the interim need to just update the fact table]

To implement these updates, which may happen for several months, I'm trying the approach below,
which involves multiple queries and the creation of a couple of temp tables. Does it seem reasonable? I have a lurking feeling that with a deeper understanding of analytic functions this could be simplified further; I would appreciate your thoughts.

============= My Approach =================

-- Find the Engagements that have multiple Clients
CREATE TABLE amtest_mult_cli AS
WITH
v1 AS (SELECT DISTINCT engagement_code,client_code
FROM my_fact_table)
SELECT engagement_code
FROM v1
GROUP BY engagement_code
HAVING COUNT(*) > 1;

-- Find What should be the correct client for those engagements
CREATE TABLE amtest_use_cli AS
SELECT engagement_code,per_end_dt,client_code
FROM
(
SELECT engagement_code,per_end_dt,client_code,
row_number() OVER (PARTITION BY engagement_code
ORDER BY per_end_dt DESC, client_code DESC)
row_num
FROM my_fact_table a,
amtest_mult_cli b
WHERE a.engagement_code = b.engagement_code
)
WHERE row_num = 1;

-- Update Correct Clients for those engagements
UPDATE my_fact_table a
SET a.client_code =
(SELECT b.client_code
FROM amtest_use_cli b
WHERE a.engagement_code = b.engagement_code)
WHERE EXISTS
(SELECT 1
FROM amtest_use_cli c
WHERE a.engagement_code = c.engagement_code);

======================================================

Tom Kyte

Followup  

June 03, 2005 - 12:14 pm UTC

why not:

merge into my_fact_table F
using
( select engagement_code,
substr(max(to_char(per_end_dt,'yyyymmddhh24miss')||client_code ),15) cc
from my_fact_table
group by engagement_code
having count(distinct client_code) > 1 ) X
on ( f.engagement_code = x.engagement_code )
when matched
then update set client_code = x.cc
when not matched
then insert ( client_code ) values ( null ); -- never can happen; in 10g this branch is not needed at all


That select finds the client_code for the max per_end_dt by engagement_code for engagement_code's that have more than one distinct client_code....


Alternatively, the same values could be computed as analytic columns in the select list, roughly:

first_value(client_code)
  over (partition by engagement_code
        order by per_end_dt desc, client_code desc ) latest_client,
count(distinct client_code)
  over (partition by engagement_code) client_cnt

help with lead

June 09, 2005 - 1:24 am UTC

Reviewer: Adolph from india

I have a table in the following structure:

create table cs_fpc_pr
(PRGM_C VARCHAR2(10) not null,
fpc_date date not null,
TIME_code VARCHAR2(3) not null,
SUN_TYPE varchar2(1))

insert into cs_fpc_pr values ('PRGM000222', to_date('08-may-2005','dd-mon-rrrr'), '33','1');

insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '45','3');

insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '45','3');

insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '24','1');


insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '23','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '47','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '48','3');

insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '46','3');

insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '46','3');

insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '46','3');

insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '23','1');

commit;

select prgm_c,fpc_date,time_code,sun_type,
lead(fpc_date) over(partition by prgm_C order by fpc_date) next_date
from cs_fpc_pr
order by prgm_c,fpc_date,time_code;


PRGM_C FPC_DATE TIM S NEXT_DATE
---------- --------- --- - ---------
PRGM000222 08-MAY-05 33 1 09-MAY-05
PRGM000222 09-MAY-05 05 3 09-MAY-05
PRGM000222 09-MAY-05 25 1 09-MAY-05
PRGM000222 09-MAY-05 45 3 10-MAY-05
PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05
PRGM000242 08-MAY-05 23 1 08-MAY-05
PRGM000242 08-MAY-05 47 3 08-MAY-05
PRGM000242 08-MAY-05 48 3 09-MAY-05
PRGM000242 09-MAY-05 07 3 09-MAY-05
PRGM000242 09-MAY-05 33 1 09-MAY-05
PRGM000242 09-MAY-05 46 3 10-MAY-05
PRGM000242 10-MAY-05 07 3 10-MAY-05
PRGM000242 10-MAY-05 33 1 10-MAY-05
PRGM000242 10-MAY-05 46 3 11-MAY-05
PRGM000242 11-MAY-05 07 3 11-MAY-05
PRGM000242 11-MAY-05 33 1 11-MAY-05
PRGM000242 11-MAY-05 46 3 14-MAY-05
PRGM000242 14-MAY-05 07 3 14-MAY-05
PRGM000242 14-MAY-05 23 1

I need to find, for a particular 'prgm_c', the next date and time code where the 'sun_type' field = '1'.

A sample of the output should look something like this:

PRGM_C FPC_DATE TIM S NEXT_DATE next_time
---------- --------- --- - --------- -------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1

Tom, Can you please help me with with this?

Regards



Tom Kyte

Followup  

June 09, 2005 - 6:53 am UTC

PRGM000222 10-MAY-05 05 3 10-MAY-05
PRGM000222 10-MAY-05 25 1 10-MAY-05
PRGM000222 10-MAY-05 45 3 14-MAY-05
PRGM000222 14-MAY-05 05 3 14-MAY-05
PRGM000222 14-MAY-05 24 1

you've got a problem with those fpc_dates and ordering by them. you have "dups", so no one of those 10-may-05 rows comes "first"; same with the 14th. You need to figure out how to order this data deterministically first.

My first attempt at this is:


tkyte@ORA9IR2W> select prgm_c, fpc_date, time_code, sun_type,
       to_date( substr( max(data)
                          over (partition by prgm_c order by fpc_date desc),
                        6, 14 ), 'yyyymmddhh24miss') ndt,
       to_number( substr( max(data)
                            over (partition by prgm_c order by fpc_date desc),
                          20 ) ) ntc
  from (
select prgm_c,
       fpc_date,
       time_code,
       sun_type,
       case when lag(sun_type)
                   over (partition by prgm_c order by fpc_date desc) = '1'
            then to_char( row_number()
                            over (partition by prgm_c order by fpc_date desc),
                          'fm00000') ||
                 to_char( lag(fpc_date)
                            over (partition by prgm_c order by fpc_date desc),
                          'yyyymmddhh24miss') ||
                 lag(time_code) over (partition by prgm_c order by fpc_date desc)
       end data
  from cs_fpc_pr
       )
 order by prgm_c, fpc_date, time_code
/

PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 09-MAY-05 25
PRGM000222 09-MAY-05 45 3 09-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 10-MAY-05 25
PRGM000222 10-MAY-05 45 3 10-MAY-05 25
PRGM000222 14-MAY-05 05 3
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 08-MAY-05 23
PRGM000242 08-MAY-05 47 3 08-MAY-05 23
PRGM000242 08-MAY-05 48 3 08-MAY-05 23
PRGM000242 09-MAY-05 07 3 10-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 10-MAY-05 33
PRGM000242 10-MAY-05 46 3 10-MAY-05 33
PRGM000242 11-MAY-05 07 3 14-MAY-05 23
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3
PRGM000242 14-MAY-05 23 1

24 rows selected.

but the lack of distinctness on the fpc_date means you might get "a different answer" with the same set of data.

reply

June 09, 2005 - 7:48 am UTC

Reviewer: Adolph from India

Sorry for not being clear the first time, so here goes: a program (prgm_c) will have at most one entry in the table for a given combination of (fpc_date, time_code).

This time_code actually maps to another table where '01' is '01:00:00', '02' is '01:30:00', and so on (i.e., times stored in varchar2 format).

So basically a program will exist for a given fpc_date and time_code only once.

I hope I'm making sense.

Regards


Tom Kyte

Followup  

June 09, 2005 - 7:58 am UTC

tkyte@ORA9IR2W> select prgm_c,
2 fpc_date,
3 time_code,
4 sun_type,
5 to_date(
6 substr( max(data)
7 over (partition by prgm_c
8 order by fpc_date desc,
9 time_code desc),
10 6, 14 ),'yyyymmddhh24miss') ndt,
11 to_number(
12 substr( max(data)
13 over (partition by prgm_c
14 order by fpc_date desc,
15 time_code desc), 20) ) ntc
16 from (
17 select prgm_c,
18 fpc_date,
19 time_code,
20 sun_type,
21 case when lag(sun_type)
22 over (partition by prgm_c
23 order by fpc_date desc,
24 time_code desc) = '1'
25 then
26 to_char( row_number()
27 over (partition by prgm_c
28 order by fpc_date desc,
29 time_code desc) , 'fm00000') ||
30 to_char(lag(fpc_date)
31 over (partition by prgm_c
32 order by fpc_date desc,
33 time_code desc),'yyyymmddhh24miss')||
34 lag(time_code)
35 over (partition by prgm_c
36 order by fpc_date desc,
37 time_code desc)
38 end data
39 from cs_fpc_pr
40 )
41 order by prgm_c,fpc_date,time_code
42 /

PRGM_C FPC_DATE TIM S NDT NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33 1 09-MAY-05 25
PRGM000222 09-MAY-05 05 3 09-MAY-05 25
PRGM000222 09-MAY-05 25 1 10-MAY-05 25
PRGM000222 09-MAY-05 45 3 10-MAY-05 25
PRGM000222 10-MAY-05 05 3 10-MAY-05 25
PRGM000222 10-MAY-05 25 1 14-MAY-05 24
PRGM000222 10-MAY-05 45 3 14-MAY-05 24
PRGM000222 14-MAY-05 05 3 14-MAY-05 24
PRGM000222 14-MAY-05 24 1
PRGM000242 08-MAY-05 07 3 08-MAY-05 23
PRGM000242 08-MAY-05 23 1 09-MAY-05 33
PRGM000242 08-MAY-05 47 3 09-MAY-05 33
PRGM000242 08-MAY-05 48 3 09-MAY-05 33
PRGM000242 09-MAY-05 07 3 09-MAY-05 33
PRGM000242 09-MAY-05 33 1 10-MAY-05 33
PRGM000242 09-MAY-05 46 3 10-MAY-05 33
PRGM000242 10-MAY-05 07 3 10-MAY-05 33
PRGM000242 10-MAY-05 33 1 11-MAY-05 33
PRGM000242 10-MAY-05 46 3 11-MAY-05 33
PRGM000242 11-MAY-05 07 3 11-MAY-05 33
PRGM000242 11-MAY-05 33 1 14-MAY-05 23
PRGM000242 11-MAY-05 46 3 14-MAY-05 23
PRGM000242 14-MAY-05 07 3 14-MAY-05 23
PRGM000242 14-MAY-05 23 1

24 rows selected.

Just needed to add "time_code DESC"


See

</code> http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html <code>

analytics to the rescue

for the "carry down" technique I used here. In 10g, we'd simplify using "ignore nulls" in the LAST_VALUE function instead of the max() and row_number() trick
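A rough, untested sketch of that 10g simplification (the explicit window clause excludes the current row, so a sun_type = '1' row reports the next such row rather than itself):

```sql
select prgm_c, fpc_date, time_code, sun_type,
       last_value(case when sun_type = '1' then fpc_date end ignore nulls)
         over (partition by prgm_c
               order by fpc_date desc, time_code desc
               rows between unbounded preceding and 1 preceding) ndt,
       last_value(case when sun_type = '1' then time_code end ignore nulls)
         over (partition by prgm_c
               order by fpc_date desc, time_code desc
               rows between unbounded preceding and 1 preceding) ntc
  from cs_fpc_pr
 order by prgm_c, fpc_date, time_code;
```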

brilliant

June 09, 2005 - 9:53 am UTC

Reviewer: Adolph from India

Thank you very much Tom. The query works like a charm. I will read up on the link. Analytics do rock 'n' roll :)



Working on an Analytic Query

June 09, 2005 - 12:15 pm UTC

Reviewer: Scott from Long Island, NY USA

Tom,
From your example for Mark's problem on 4/8, it seems that you need to specify a fixed number of columns to output this way. Is there a way to have a varying number of columns? For example, I need a query that takes a date range and makes each date a column heading. Any help would be greatly appreciated.
Thanks,
Scott

Tom Kyte

Followup  

June 09, 2005 - 6:15 pm UTC

you need dynamic sql. the number of columns in a query is "well defined, known at parse time" by definition.

If you have access to Expert One-on-One Oracle, I demonstrated how to do this with ref cursors in a stored procedure. But you have to run a query to get the set of column "headings" and write a query based on that.
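The shape of that approach, sketched (all names here are hypothetical and this is untested): run one query to discover the column headings, build the SQL string, then open a ref cursor on it. The column count of the generated query is still fixed at its parse time; it just varies between calls.

```sql
create or replace function pivot_by_date( p_from in date, p_to in date )
  return sys_refcursor
as
  l_sql varchar2(32767) := 'select state';
  l_rc  sys_refcursor;
begin
  -- one aggregated column per distinct date in the range
  for r in ( select distinct trunc(dt) dt
               from some_table            -- hypothetical source table
              where dt between p_from and p_to
              order by 1 )
  loop
    l_sql := l_sql || ', sum( decode( trunc(dt), to_date('''
                   || to_char(r.dt,'yyyymmdd') || ''',''yyyymmdd''), amt ) ) "'
                   || to_char(r.dt,'dd-mon-yyyy') || '"';
  end loop;
  l_sql := l_sql || ' from some_table group by state';
  open l_rc for l_sql;
  return l_rc;
end;
/
```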

Tom any idea how I can re write this piece of code

June 09, 2005 - 3:00 pm UTC

Reviewer: A reader

decode ((SELECT ih.in_date
FROM major_sales ih
WHERE ih.container = i.container
AND sales > i.container_id
AND sales = (SELECT MIN(ihh.container_id)
FROM major_sales ihh
WHERE ihh.container_id > i.container_id
AND ihh.container = i.container)), NULL,

Tom Kyte

Followup  

June 09, 2005 - 6:35 pm UTC

not out of context, no.

I am still having problem with analytical function

July 01, 2005 - 12:31 pm UTC

Reviewer: A reader

select i.container,ssl_user_code,ssl_user_code ssl,cl.code length_code, out_trucker_code, i.chassis,
lead(in_date) over (partition by i.container order by in_date) next_in_date,
out_date,
lead (out_date) over (partition by i.container order by in_date) o_date
from his_containers i,
container_masters cm,
tml_container_lhts clht,
tml_container_lengths cl
WHERE cm.container = i.container
and cm.lht_code = clht.code
and clht.length_code = cl.code
and ssl_user_code = 'ACL'
and i.container like '%408014'
and voided_date is null
and ((in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))

results:
----------
CONTAINER SSL_USER_CODE SSL LENGTH_CODE OUT_TRUCKER_CODE CHASSIS NEXT_IN_DATE OUT_DATE O_DATE
ACLU408014 ACL ACL 4 R0480 3/22/2005 2:52:41 PM 3/21/2005 3:45:48 PM 4/6/2005 2:25:59 PM
ACLU408014 ACL ACL 4 J1375 4/6/2005 2:25:59 PM



1. How can I get rid of the 4/6/2005 2:25:59 PM?






Tom Kyte

Followup  

July 01, 2005 - 1:52 pm UTC

can you be more specific about why you don't like April 6th as 2:25:59pm? what is it about that you don't like?

That'll help me tell you how to in general remove it. What is the criteria for removal

analytical query

July 01, 2005 - 2:19 pm UTC

Reviewer: A reader

Tom,

We are trying to bill the client within the month, in this case within April. I also would like to know how many days elapsed between two dates so I can bill them.


Tom Kyte

Followup  

July 01, 2005 - 3:15 pm UTC

"how many days elapsed between 2 days"

the answer is: 2

but are you asking how to do date arithmetic? Just subtract.
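In Oracle, subtracting one DATE from another yields the difference in days, fractions included. A quick sketch against the in_date/out_date columns shown in the follow-up data:

```sql
select container,
       out_date - in_date               days_elapsed,  -- fractional days
       trunc(out_date) - trunc(in_date) whole_days     -- calendar-day difference
  from his_containers;
```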

sorry...within March

July 01, 2005 - 2:21 pm UTC

Reviewer: A reader


more information

July 01, 2005 - 2:29 pm UTC

Reviewer: A reader

Tom,

This is how the data looks

IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014

and I would like to get the pair within the same month



Tom Kyte

Followup  

July 01, 2005 - 3:16 pm UTC

the pair of "what"?

I would like to get all the dates within the month

July 01, 2005 - 4:03 pm UTC

Reviewer: A reader


Tom Kyte

Followup  

July 01, 2005 - 4:13 pm UTC

please be much much more specific. pretend you are trying to explain this to a newbie...


not following the requirement at all..
http://asktom.oracle.com/Misc/how-to-ask-questions.html

one more try

July 01, 2005 - 4:26 pm UTC

Reviewer: A reader

This is how the data looks as of now with the above query.
IN_DATE OUT_DATE CONTAINER
1/3/2005 2:23:05 PM 1/10/2005 5:05:16 PM ACLU408014
1/11/2005 1:04:49 PM 1/12/2005 8:49:06 AM ACLU408014
1/14/2005 12:09:50 PM 1/18/2005 6:39:10 AM ACLU408014
3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM 4/6/2005 2:25:59 PM ACLU408014
4/7/2005 1:24:43 PM 4/10/2005 2:21:59 AM ACLU408014

I Would like to get it as the following


IN_DATE OUT_DATE CONTAINER

3/19/2005 2:10:24 AM 3/21/2005 3:45:48 PM ACLU408014
3/22/2005 2:52:41 PM

This is what I am looking for.....this way.


Tom Kyte

Followup  

July 01, 2005 - 4:46 pm UTC

still not much of a specification (important thing for those of us in this industry - being able to describe the problem at hand in detail, so someone else can take the problem definition and code it).


Let me try, this is purely a speculative guess on my part:


I would like all records in the table such that the in_date-out_date range covered at least part of the month of march in the year 2005.

If the out_date falls AFTER march, I would like it nulled out.

(this part is a total guess) if the in_date falls BEFORE march, i would like it nulled out as well (for consistency?)


Ok, stated like that, I can give you untested pseudo code since there are no create tables and no inserts to play with:


select case when in_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( :x, 'dd-mon-yyyy' )
and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
then out_date end,
container
from T
where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( :x, 'dd-mon-yyyy' )


bind in :x = '01-mar-2005' and :y = '01-apr-2005' for your dates.
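The shape of that answer -- keep rows whose interval overlaps the window, and null out whichever endpoint falls outside it -- is easy to sanity-check outside the database. Here is a small Python sketch of the same logic (illustrative only; `clip_to_window` is a made-up helper name, and the sample rows come from the data posted above):

```python
from datetime import datetime, timedelta

def clip_to_window(rows, start, end_exclusive):
    """Keep rows whose [in_date, out_date] interval overlaps the window,
    nulling out either endpoint that falls outside it (the CASE above)."""
    hi = end_exclusive - timedelta(seconds=1)   # like to_date(:y) - 1/24/60/60
    kept = []
    for in_d, out_d, container in rows:
        if in_d <= hi and out_d >= start:       # interval overlaps the window
            kept.append((
                in_d  if start <= in_d  <= hi else None,
                out_d if start <= out_d <= hi else None,
                container,
            ))
    return kept

rows = [
    (datetime(2005, 3, 19, 2, 10, 24),  datetime(2005, 3, 21, 15, 45, 48), 'ACLU408014'),
    (datetime(2005, 3, 22, 14, 52, 41), datetime(2005, 4, 6, 14, 25, 59),  'ACLU408014'),
    (datetime(2005, 4, 7, 13, 24, 43),  datetime(2005, 4, 10, 2, 21, 59),  'ACLU408014'),
]
result = clip_to_window(rows, datetime(2005, 3, 1), datetime(2005, 4, 1))
for r in result:
    print(r)
```

The third row drops out entirely (it never touches March), and the April 6th out_date comes back as None, mirroring the SQL.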


As you requested

July 01, 2005 - 5:01 pm UTC

Reviewer: A reader

CREATE TABLE CONTAINER_MASTERS
(
CONTAINER VARCHAR2(10 BYTE) NOT NULL,
CHECK_DIGIT VARCHAR2(1 BYTE) NOT NULL,
SSL_OWNER_CODE VARCHAR2(5 BYTE) NOT NULL,
LHT_CODE VARCHAR2(5 BYTE) NOT NULL

)

INSERT INTO CONTAINER_MASTERS ( CONTAINER, CHECK_DIGIT, SSL_OWNER_CODE,
LHT_CODE ) VALUES ( '045404', '1', 'BCL', '5AV');
commit;


CREATE TABLE TML_CONTAINER_LHTS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL,
ISO VARCHAR2(4 BYTE) NOT NULL,
LENGTH_CODE VARCHAR2(5 BYTE) NOT NULL,
HEIGHT_CODE VARCHAR2(5 BYTE) NOT NULL,
TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)

INSERT INTO TML_CONTAINER_LHTS ( CODE, SHORT_DESCRIPTION, LONG_DESCRIPTION, ISO, LENGTH_CODE,
HEIGHT_CODE, TYPE_CODE ) VALUES ( '5BR', '5BR', '45'' 9''6" Reefer', '5432', '5', 'B', 'R');
commit;



CREATE TABLE TML_CONTAINER_LENGTHS
(
CODE VARCHAR2(5 BYTE) NOT NULL,
SHORT_DESCRIPTION VARCHAR2(10 BYTE) NOT NULL,
LONG_DESCRIPTION VARCHAR2(30 BYTE) NOT NULL
)


INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'2', '20''', '20 Ft');
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES (
'4', '40''', '40 Ft');
commit;


Tom Kyte

Followup  

July 01, 2005 - 6:06 pm UTC

umm, specification?

did I get it right? if so, did you *try* the query at all???

Here is a SQL puzzle for analytics zealots

July 01, 2005 - 10:33 pm UTC

Reviewer: Mikito Harakiri

OK, if anybody succeeds in writing the following with analytics, I will convert to analytics once and forever. Credit in the book, of course.

Given:
table Hotels (
name string,
price integer,
distance
)

Here is a query that sounds very analytical:
Order hotels by price, distance. Compare each record with its neighbour (lag?), and if one of them is inferior to the other by both criteria -- more pricey and farther from the beach -- then throw it away from the result.

Tom Kyte

Followup  

July 02, 2005 - 9:20 am UTC

define neighbor.

is neighbor defined by price or by distance? your specification is lacking many many details (seems to be a recurring theme on this page for some reason)

sounds like you want the cheapest closest hotel to the beach. for each row, if something closer and cheaper exists in the original set, do not keep that row.

sounds like a where not exists, not analytics to me. but then - the specification is lacking.

And let's see: in order to appreciate a tool, you have to be shown that the tool can be the end-all, be-all answer to everything??!?? that is downright silly, don't you think?

Let's see:

"if anyone succeeds in making the Oracle 9i merge command select data, I would convert to merge once and forever"

"if anyone succeeds in making my car fly into outer space, I would convert to cars once and forever"

Think about your logic here.


There are no zealots here, there are people willing to read the documentation, understand that things work the way they work, not the way THEY think they should have been made to work, and have jobs to do, pragmatic practical things to accomplish and are willing to use the best tool for the job.

specs

July 03, 2005 - 11:07 pm UTC

Reviewer: Mikito Harakiri

Yes, find all the hotels that are not dominated by the others by both price and distance. That is a "not exists" query, but it is a very inefficient one:

select * from hotels h
where not exists (select * from hotels hh
where hh.price < h.price and hh.distance <= h.distance
or hh.price <= h.price and hh.distance < h.distance
)

The one that reformulated is much more efficient, but how do I express it in SQL?

Tom Kyte

Followup  

July 04, 2005 - 10:25 am UTC

the one that reforumulated?  

and why do you have the OR in there at all? to dominate by both price and distance would simply be:

where not exists ( select NULL
                     from hotels hh
                    where hh.price < h.price 
                      AND hh.distance < h.distance )

You said "by BOTH price and distance", nothing but nothing about ties.


ops$tkyte@ORA9IR2> /*
DOC>
DOC>drop table hotels;
DOC>
DOC>create table hotels
DOC>as
DOC>select object_name name, object_id price, object_id distance, all_objects.*
DOC>  from all_objects;
DOC>
DOC>create index hotel_idx on hotels(price,distance);
DOC>
DOC>exec dbms_stats.gather_table_stats( user, 'HOTELS', cascade=>true );
DOC>*/
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select h1.name, h1.price, h1.distance
  2    from hotels h1
  3   where not exists ( select NULL
  4                        from hotels h2
  5                       where h2.price < h1.price
  6                         AND h2.distance < h1.distance )
  7  /
 
NAME                                PRICE   DISTANCE
------------------------------ ---------- ----------
I_OBJ#                                  3          3
 
Elapsed: 00:00:00.22
ops$tkyte@ORA9IR2> select count(*) from hotels;
 
  COUNT(*)
----------
     27837
 
Elapsed: 00:00:00.00

it doesn't seem horribly inefficient. 
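For what it's worth, Mikito's puzzle is the classic "skyline" (Pareto frontier) problem, and the more efficient reformulation he hints at is a single pass over the rows sorted by price. A hedged Python sketch of that idea (`undominated` is an illustrative name, nothing from Oracle; equal-price ties are handled so that same-priced hotels never dominate each other):

```python
from itertools import groupby

def undominated(hotels):
    """Keep hotels for which no other hotel is strictly cheaper AND
    strictly closer -- the NOT EXISTS above, done in one sorted pass."""
    best, min_dist = [], float('inf')
    ordered = sorted(hotels, key=lambda h: h[1])           # by price
    for _, group in groupby(ordered, key=lambda h: h[1]):  # equal-price ties together
        group = list(group)
        best += [h for h in group if h[2] < min_dist]      # beats every cheaper hotel
        min_dist = min(min_dist, min(h[2] for h in group))
    return best

hotels = [('A', 100, 5), ('B', 120, 3), ('C', 150, 4), ('D', 90, 9)]
print(undominated(hotels))   # [('D', 90, 9), ('A', 100, 5), ('B', 120, 3)]
```

'C' is dropped because 'B' is both cheaper and closer; the survivors form a frontier where distance strictly decreases as price increases.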

Tom Can we give it one more try

July 05, 2005 - 9:20 am UTC

Reviewer: A reader

Tom, When I ran the query it returned nothing. I am sending you the whole test case. This is what I would like to see
in the report.

out_date in_date container
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM ACLU408014
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM ACLU408014




CREATE TABLE BETA
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
)

INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;

select in_date, out_date,container,
case when in_date between to_date('01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then in_date end,
case when out_date between to_date( '01-mar-2005', 'dd-mon-yyyy' )
and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
then out_date end
container
from BETA
WHERE in_date <= to_date( '01-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
and out_date >= to_date( '31-mar-2005', 'dd-mon-yyyy' )

Tom Kyte

Followup  

July 05, 2005 - 9:54 am UTC

you know, this is going beyond....

*s*p*e*c*i*f*i*c*a*t*i*o*n*

pretend you were explaining to your mother (who presumably doesn't work in IT and doesn't know sql or databases or whatever) what needed to be done.  

that is what I need to see.  I obviously don't know your logic of getting from "A (inputs) to B (outputs)" and you need to explain that.


and when I run my query:

ops$tkyte@ORA10G> variable x varchar2(20)
ops$tkyte@ORA10G> variable y varchar2(20)
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec :x := '01-mar-2005'; :y := '01-apr-2005'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select case when in_date between to_date( :x, 'dd-mon-yyyy' )
  2                               and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
  3              then in_date end,
  4         case when out_date between to_date( :x, 'dd-mon-yyyy' )
  5                                and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
  6              then out_date end,
  7         container
  8    from beta
  9   where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
 10     and out_date >= to_date( :x, 'dd-mon-yyyy' )
 11  /
 
CASEWHENI CASEWHENO CONTAINER
--------- --------- ----------
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05           ACLU408014


I do get output, not what you say you want, but output.  you need to tell me THE LOGIC here.  (and maybe when you write it down, specify it, the answer will just naturally appear)

so yes, we can definitely give it one more try but if and only if you provide the details, the specification, the logic, the thoughts behind this.

Not just "i have this and want that", it doesn't work that way. 

in english

July 05, 2005 - 10:33 am UTC

Reviewer: Jean


We are trying to bill from the time the truck left to the
time it returned. For example, in the above query
I would like to bill him from 1/18/2005 to 3/19/2005. So it must be part of the report. That's the whole key here.

clarification!!

July 05, 2005 - 10:56 am UTC

Reviewer: A reader

the time he left 1/18/2005 6:39:10 AM
the time he came back 3/22/2005 2:52:41 PM

hope this helps....

Tom Kyte

Followup  

July 05, 2005 - 11:28 am UTC

ops$tkyte@ORA9IR2> select * from beta order by in_date;
 
IN_DATE   OUT_DATE  CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014   <<<=== gap, no 13
14-JAN-05 18-JAN-05 ACLU408014   <<=== big gap, no 19.... mar 18
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
 
6 rows selected.


I don't get it.  I don't get it AT ALL.   does anyone else ?  

nope, not getting it even a teeny tiny bit myself.


give us LOGIC, ALGORITHM, INFORMATION.


like I said, pretend I'm your mother who has never seen a computer -- explain the logic at that level (or I just give up) 

BETTER TABLE

July 05, 2005 - 11:57 am UTC

Reviewer: A reader

INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'), TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014');
commit;


OUT_DATE IN_DATE
1/18/2005 6:39:10 AM 3/19/2005 2:10:24 AM
3/21/2005 3:45:48 PM 3/22/2005 2:52:41 PM


LEFT 1/18 CAME BACK 3/19
LEFT 3/21 CAME BACK 3/22


Tom Kyte

Followup  

July 05, 2005 - 12:20 pm UTC

you have totally and utterly missed my point.



IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014

6 rows selected.


sigh.

what if the records are

IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014

specification. you know what -- without it, I'm not even going to look anymore. a textual description of precisely what you want. I'm tired of guessing. I think I can guess, but I don't even want to guess about "missing" months like my second example here.


English Explanation

July 05, 2005 - 1:06 pm UTC

Reviewer: A reader

Sorry for going back and forth on this report. All I want is the following: we have trucks that come in and out of the yard. We are looking for when the truck first came in and the very last time it went out for a particular month -- nothing in between, because a truck can come in many times during a month. The date and time should give us this information. Finally, this report should be scoped to a single month.

example:

IN_DATE OUT_DATE CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014

6 rows selected.

in this case we want

in_date out_date
-------- --------
3/22/2005 2:52:41PM 1/18/2005 6:39:10 AM








Tom Kyte

Followup  

July 05, 2005 - 1:17 pm UTC

so what happened to the 21st/22nd of March this time? the answer keeps changing?

and what if, there are no records for march in the table (nothing in_date/out_date wise)



follow up

July 05, 2005 - 1:57 pm UTC

Reviewer: jean

Tom,

We realized that it may be too much to get the dates in between,
so we opted for just getting the in_date and out_date. By the way, there will always be data, so do not worry about that....

Thanks!!



Tom Kyte

Followup  

July 05, 2005 - 3:11 pm UTC

feb, what about feb? you said there would always be data? I want to run this for feb?

do you or do you not need to be concerned about a missing month.

do not be concerned!

July 05, 2005 - 3:22 pm UTC

Reviewer: A reader

Please do not be concerned about missing a month. This is a report.

Tom Kyte

Followup  

July 05, 2005 - 3:46 pm UTC

umm, I want the report for February

it is blank.

now what? it should not be blank, should it? this is a problem, and it's a problem in our industry in general. You get what you ask for (sometimes), and if you ask repeatedly for the wrong thing, that's what you'll get. I am concerned -- by this line of questioning here.

Hey, here you go:

ops$tkyte@ORA9IR2> select *
  2    from (
  3  select
  4         lag(out_date) over (partition by container order by in_date) last_out_date,
  5         in_date,
  6             container
  7    from beta
  8         )
  9   where trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 10      or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy');
 
LAST_OUT_ IN_DATE   CONTAINER
--------- --------- ----------
18-JAN-05 19-MAR-05 ACLU408014
21-MAR-05 22-MAR-05 ACLU408014

gets the answer given your data, makes a zillion assumptions (50% of which are probably wrong), won't work for FEB, probably doesn't answer the question behind the question, but hey, there you go.   
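The LAG trick in that query is easy to mimic procedurally, which can help when validating the report against the raw history. A Python sketch of the same pairing (assumes a single container, as in the sample; `month_report` is a made-up name and the rows are the BETA data from this thread):

```python
from datetime import datetime

def month_report(rows, year, month):
    """Pair each visit's in_date with the previous visit's out_date
    (what LAG(out_date) does above), keeping pairs where either date
    falls in the requested month. Assumes all rows are one container;
    the general case needs a partition per container."""
    in_month = lambda d: d is not None and (d.year, d.month) == (year, month)
    result, prev_out = [], None
    for in_d, out_d, container in sorted(rows):   # order by in_date
        if in_month(in_d) or in_month(prev_out):
            result.append((prev_out, in_d, container))
        prev_out = out_d
    return result

beta = [
    (datetime(2005, 1, 14, 12, 9, 50),  datetime(2005, 1, 18, 6, 39, 10),  'ACLU408014'),
    (datetime(2005, 3, 19, 2, 10, 24),  datetime(2005, 3, 21, 15, 45, 48), 'ACLU408014'),
    (datetime(2005, 3, 22, 14, 52, 41), datetime(2005, 4, 6, 14, 25, 59),  'ACLU408014'),
    (datetime(2005, 4, 7, 13, 24, 43),  datetime(2005, 4, 10, 2, 21, 59),  'ACLU408014'),
]
for last_out, in_d, c in month_report(beta, 2005, 3):
    print(last_out, in_d, c)
```

Like the SQL, this yields the 18-Jan/19-Mar and 21-Mar/22-Mar pairs for March 2005 -- and, like the SQL, it returns nothing at all for February, which is exactly the concern raised above.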

Thanks!!!

July 06, 2005 - 9:00 am UTC

Reviewer: A reader

I will try it ... Thanks a zillion for your efforts and your patience.

Thanks!

July 06, 2005 - 11:55 am UTC

Reviewer: A reader

CREATE TABLE BETA3
(
IN_DATE DATE NOT NULL,
OUT_DATE DATE,
CONTAINER VARCHAR2(10 BYTE) NOT NULL
)



INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '07/20/2004 03:08:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '08/10/2004 02:45:52 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES (
TO_Date( '03/19/2005 01:55:06 AM', 'MM/DD/YYYY HH:MI:SS AM'),
TO_Date( '03/27/2005 05:05:36 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312');
commit;

Tom I was able to get the first pair as show

last_out_date in_date container
8/10/2004 2:45:52 AM 3/19/2005 1:55:06 AM ACLU040312

which is fine...

But can I get the other pair?

last_out_date in_date container
3/27/2005 5:05:36 AM


Tom Kyte

Followup  

July 06, 2005 - 12:44 pm UTC

problem is, you are "missing" a row and 'making up' data is hard.

it might be

ops$tkyte@ORA10G> select decode( r, 1, last_out_date, out_date ),
  2         decode( r, 1, in_date, next_in_date )
  3    from (
  4  select
  5         lag(out_date) over (partition by container order by in_date) last_out_date,
  6         in_date, out_date,
  7         lead(in_date) over (partition by container order by in_date) next_in_date,
  8             container
  9    from beta3
 10         ), ( select 1 r from dual union all select 2 r from dual )
 11   where ((
 12          trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 13          or
 14                  trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 15             ) and r = 1 )
 16             or
 17             ( next_in_date is null and r = 2 )
 18  /
 
DECODE(R,1,LAST_OUT_ DECODE(R,1,IN_DATE,N
-------------------- --------------------
10-aug-2004 02:45:52 19-mar-2005 01:55:06
27-mar-2005 05:05:36

still curious what happens in feb. 

Please suggest some books to learn Oracle analytic functions

July 07, 2005 - 7:58 am UTC

Reviewer: Vijay from India


Tom Kyte

Followup  

July 07, 2005 - 9:47 am UTC

data warehousing guide (freely available on otn.oracle.com)

Expert one on one Oracle (I have a big chapter on them in there)

Thank you very much!!

July 08, 2005 - 10:35 am UTC

Reviewer: Jean

I want to thank you for the last query!!! It worked very well, even though I still get dates outside of the range. But overall it's fine.




How to get contiguous date ranges from Start_date, end_date pairs?

July 11, 2005 - 3:15 pm UTC

Reviewer: Bob Lyon from Houston


-- Tom, Suppose I have a table with data...

-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 01:00:00
-- BBB 07/11/05 01:00:00 07/11/05 02:00:00
-- BBB 07/11/05 02:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 07:00:00
-- BBB 07/11/05 07:00:00 07/11/05 08:00:00

-- What I would like to get is the "contiguous date ranges"
-- by MKT_CD, i.e.,

-- MKT_CD START_DT_GMT END_DT_GMT
-- ------ ----------------- -----------------
-- AAA 07/11/05 00:00:00 07/12/05 00:00:00
-- BBB 07/11/05 00:00:00 07/11/05 03:00:00
-- BBB 07/11/05 06:00:00 07/11/05 08:00:00

-- I have played with LAG/LEAD/FIRST_VALUE/LAST_VALUE
-- but seem to just "go in circles" trying to code this.

-- Here is the test data setup (Oracle 9.2.0.6) :

CREATE GLOBAL TEMPORARY TABLE NM_DEMAND_BIDS_API_GT
(
MKT_CD VARCHAR2(6) NOT NULL,
START_DT_GMT DATE NOT NULL,
END_DT_GMT DATE NOT NULL
)
ON COMMIT PRESERVE ROWS;

-- This code has 24 hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('AAA', TRUNC(SYSDATE), TRUNC(SYSDATE) + 1);
-- A second code goes by hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 00/24, TRUNC(SYSDATE) + 01/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 01/24, TRUNC(SYSDATE) + 02/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 02/24, TRUNC(SYSDATE) + 03/24);
-- and has an intentional gap
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 06/24, TRUNC(SYSDATE) + 07/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 07/24, TRUNC(SYSDATE) + 08/24);

-- Query

SELECT MKT_CD, START_DT_GMT, END_DT_GMT
FROM NM_DEMAND_BIDS_API_GT;



Tom Kyte

Followup  

July 11, 2005 - 3:49 pm UTC

based on:
http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html

ops$tkyte@ORA9IR2> select mkt_cd, min(start_dt_gmt), max(end_dt_gmt)
  2    from (
  3  select mkt_cd, start_dt_gmt, end_dt_gmt,
  4         max(grp) over (partition by mkt_cd order by start_dt_gmt) mgrp
  5    from (
  6  SELECT MKT_CD,
  7         START_DT_GMT,
  8         END_DT_GMT,
  9         case when lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) <> start_dt_gmt
 10                   or
 11                   lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) is null
 12              then row_number() over (partition by mkt_cd order by start_dt_gmt)
 13          end grp
 14    FROM NM_DEMAND_BIDS_API_GT
 15         )
 16         )
 17   group by mkt_cd, mgrp
 18   order by 1, 2
 19  /
 
MKT_CD MIN(START_DT_GMT)    MAX(END_DT_GMT)
------ -------------------- --------------------
AAA    11-jul-2005 00:00:00 12-jul-2005 00:00:00
BBB    11-jul-2005 00:00:00 11-jul-2005 03:00:00
BBB    11-jul-2005 06:00:00 11-jul-2005 08:00:00
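The same gap-and-islands collapse is straightforward to express procedurally, which makes the carry-the-group-marker-forward SQL easier to verify. A Python sketch (hour offsets stand in for the DATE values, and `contiguous_ranges` is an illustrative name):

```python
def contiguous_ranges(rows):
    """Collapse abutting [start, end) intervals per key -- the same
    result as the MAX() OVER group-marker query above."""
    out = {}
    for key, start, end in sorted(rows):
        runs = out.setdefault(key, [])
        if runs and runs[-1][1] == start:   # abuts the previous run: extend it
            runs[-1][1] = end
        else:                               # gap: start a new run
            runs.append([start, end])
    return out

rows = [
    ('AAA', 0, 24),
    ('BBB', 0, 1), ('BBB', 1, 2), ('BBB', 2, 3),   # contiguous hours
    ('BBB', 6, 7), ('BBB', 7, 8),                  # after a gap
]
print(contiguous_ranges(rows))   # {'AAA': [[0, 24]], 'BBB': [[0, 3], [6, 8]]}
```

The SQL version does the same thing declaratively: tag each row that starts a new run, carry the tag forward with MAX() OVER, then GROUP BY the tag.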
 
 

Thanks!

July 11, 2005 - 5:20 pm UTC

Reviewer: Bob Lyon from Houston

Wow, that was fast.

The trick here is the MAX() analytic function. I could tag the lines where a break was to occur but couldn't figure out how to carry forward the tag/grp.

Thanks Again!

Analytical functions book

July 11, 2005 - 11:55 pm UTC

Reviewer: Vijay from India

Thanks a lot

More Help

July 26, 2005 - 5:40 pm UTC

Reviewer: Jean

Tom,

How can I get "just" the record within the scope? I am getting record outside of march.

select container,decode( r, 1, last_out_date, out_date )out_date, decode( r, 1, in_date, next_in_date) in_date,
code length_code,chassis,out_trucker_code,ssl_user_code ssl, ssl_user_code,out_mode
from (
select lag(out_date) over (partition by i.container order by in_date)
last_out_date,
i.ssl_user_code,
in_date,
cl.code,
i.out_trucker_code,
i.ssl_user_code ssl,
i.container,
i.chassis,
out_mode,
out_date,
clht.length_code,
lead(in_date) over (partition by i.container order by in_date)
next_in_date
from his_containers i,container_masters cm,tml_container_lhts clht,tml_container_lengths cl
where cm.container = i.container
and cm.lht_code = clht.code
and cl.code = clht.length_code
and ssl_user_code = 'ACL'
and i.container = 'ACLU214285'
and voided_date is null
and chassis is null
and in_mode = 'T'
and out_mode = 'T' ), ( select 1 r from dual union all select 2 r from dual )
where (( trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy'))
and r = 1 ) or ( next_in_date is null and r = 2 )
order by out_date


Tom Kyte

Followup  

July 26, 2005 - 5:57 pm UTC

select *
from (Q)
where <any other conditions you like>
order by out_date;


replace Q with your query.

that's what I got in my query.....

July 26, 2005 - 6:03 pm UTC

Reviewer: A reader


Tom Kyte

Followup  

July 26, 2005 - 6:23 pm UTC

don't know what you mean

I thought I was doing what you suggested already...

July 26, 2005 - 6:41 pm UTC

Reviewer: A reader


Tom Kyte

Followup  

July 26, 2005 - 6:56 pm UTC

I cannot see your output; obviously you are getting more data than you wanted -- add to the predicate in order to filter it out. don't know what else to say.

More information..

July 27, 2005 - 9:14 am UTC

Reviewer: Jean

the way it was before

CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131 ACL ACL T



---with your changes---

CONTAINER OUT_DATE IN_DATE LENGTH_CODE CHASSIS OUT_TRUCKER_CODE
ACLU217150 6/25/2004 2:58:01 PM 3/11/2005 7:36:29 PM 4 E2131



my history tables


CONTAINER_ID OUT_DATE IN_DATE
31779 6/21/2004 10:03:25 AM 6/16/2004 1:33:50 AM
55317 6/25/2004 2:58:01 PM 6/25/2004 2:19:49 PM
672863 3/2/2005 7:03:31 PM 2/26/2005 6:03:49 PM
708598 4/4/2005 3:31:03 PM 3/11/2005 7:36:29 PM
779305 4/16/2005 1:03:36 PM 4/6/2005 2:04:53 PM

as you can see I am not picking up the records within the month of march...with or without
the changes to the query.

Tom Kyte

Followup  

July 27, 2005 - 10:27 am UTC

sorry -- you'll need to work through this. you see the techniques involved, right -- lag, lead, analytic functions; YOU understand your data much better than I do.

(because, in part, frankly, "the way it was before" and "with your changes" look, well, the same to me as displayed here)

Thanks for your help!

July 27, 2005 - 1:26 pm UTC

Reviewer: A reader

I know the data; however, I thought it was going to be something easy just to get the dates within March... I guess not.

count number of rows in a number of ranges

July 27, 2005 - 6:08 pm UTC

Reviewer: A reader

Hi

I would like to count the number of rows I have per range of values. For example

SELECT RANGE, SUM(suma) total_per_deptno
FROM (SELECT CASE
WHEN deptno between 10 and 20 THEN '10-20'
ELSE '30'
END RANGE,
deptno, 1 SUMA
FROM scott$emp)
GROUP BY RANGE

RANGE TOTAL_PER_DEPTNO
----- ----------------
10-20 8
30 6

Can I rewrite that query in some other way so range can be dynamic such as

11-20
21-30
31-40

and counts the number of rows?

Thank you



Tom Kyte

Followup  

July 27, 2005 - 6:33 pm UTC

if you can come up with a function f(x) such that f(x) returns what you want, sure.

EG:


for you 11-20, 21-30, 31-40 -- well

f(deptno) = trunc( (deptno-0.1)/10)

(assuming deptno is an integer) -- that'll bin up deptno 0..10, 11..20, 21..30 and so on into groups 0, 1, 2, 3, ....
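That f(x) binning is easy to check with a few values. A Python sketch of the same formula (the range labels are derived purely for display; nothing here is Oracle-specific):

```python
from collections import Counter

def bin_of(deptno, width=10):
    # trunc((deptno - 0.1) / width): 1..10 -> 0, 11..20 -> 1, 21..30 -> 2, ...
    return int((deptno - 0.1) // width)

deptnos = [10, 20, 20, 30, 30, 30]
counts = Counter(bin_of(d) for d in deptnos)
labels = {b: f"{b * 10 + 1}-{(b + 1) * 10}" for b in counts}
print({labels[b]: n for b, n in counts.items()})   # {'1-10': 1, '11-20': 2, '21-30': 3}
```

The key property is that f lands the boundary values (10, 20, 30) in the lower bin, exactly as trunc((deptno-0.1)/10) does in SQL.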



August 02, 2005 - 1:35 pm UTC

Reviewer: A reader

Tom,

I hope you can provide an insight to this.

table emp1 is shown below.

EmpId Week Year Day0 Day1 ..... Day14

100 20 2005 8 8 8
200 22 2003 0 0 8
300 25 2004 8 8 0
400 06 2005 0 8 8
500 08 2002 8 0 8

create table emp1(empid varchar2(3), week varchar2(2), year varchar2(4), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));

insert into emp1 values('100', '20', '2005', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('200', '22', '2003', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('300', '25', '2004', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0);
insert into emp1 values('400', '06', '2005', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('500', '08', '2002', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);

I am trying to select emp1 records as follows:

EmpId, Date of the day, Hours worked per day

Firstly, I have to calculate date of the day of a record (first day that corresponds to Day0) using
week of the year and year. Then I have to increment the day by 1, 2 ...14
to get the hours worked for each particular date

Example: Assuming that week 20 of 2005 is 05/07/2005. It corresponds to Day0 in the same record

Day1 column corresponds to the next day which is 05/08/2005. Day2 becomes 05/09/2005 and so on ...

Then, I have to print individual rows for each empid as:

100 05/07/2005 8
100 05/08/2005 8
.....
200 05/22/2003 0
200 05/23/2003 8
.. and so on for all empid's ...


Thank you.

Tom Kyte

Followup  

August 02, 2005 - 2:09 pm UTC

oh no, columns where rows should be :(


and basically you are saying "i need ROWS where these columns are!"


tell me, how do you turn 20 into a date?

August 02, 2005 - 2:19 pm UTC

Reviewer: A reader

Tom,

I should've explained it better. Week 20 of 2005, here should be translated to the first day of week 20 of 2005 (Assuming it is 05/07/2005). That corresponds to Day0 of that row. Day1 becomes 05/08/2005 and so on ...

Is there a function or approach that can convert columns to rows?

Tom Kyte

Followup  

August 02, 2005 - 3:30 pm UTC

no, i mean -- what function/logic/algorithm are you using to figure out "week 20 is this day"

August 02, 2005 - 9:06 pm UTC

Reviewer: A reader

Tom,

Sorry -- firstly, the date is not calculated the way I said above. It's not clear yet how the date is obtained; this issue is under review, and I think I'll obtain the date by joining empid with some table (say temp1). However, I am sure I will have to take a date (such as 05/07/2005) and associate it with the Day0 column value; Day1 becomes 05/08/2005 and so on. I am trying to obtain SQL or PL/SQL that can arrange the rows as described above. Any ideas? Thanks.

Tom Kyte

Followup  

August 03, 2005 - 10:06 am UTC

I cannot tell you how much I object to this model.  

storing "week" and "year" - UGH.

storing them in STRINGS - UGH UGH UGH.

storing things that should be cross record in record UGH to the power of 10.

I had to fix your inserts -- they did not work; I added a day14 value of zero.


ops$tkyte@ORA10G> with dates as
  2  (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual connect by level <= 15 )
  3  select empid, dt,
  4         case when l = 0 then day0
  5                  when l = 1 then day1
  6                  when l = 2 then day2
  7                          /* ... */
  8                  when l = 13 then day13
  9                  when l = 14 then day14
 10                  end data
 11    from (select * from emp1 where week = 20), dates
 12  /
 
EMP DT              DATA
--- --------- ----------
100 07-MAY-05          8
100 08-MAY-05          8
100 09-MAY-05          0
100 10-MAY-05
100 11-MAY-05
100 12-MAY-05
100 13-MAY-05
100 14-MAY-05
100 15-MAY-05
100 16-MAY-05
100 17-MAY-05
100 18-MAY-05
100 19-MAY-05
100 20-MAY-05          8
100 21-MAY-05          0
 
15 rows selected.
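The CASE-per-LEVEL cross join above is an unpivot: one wide record becomes fifteen narrow rows. The same transform sketched in Python (the 05/07/2005 week-start is the assumed value from the question, and `unpivot` is an illustrative name):

```python
from datetime import date, timedelta

def unpivot(row, week_start):
    """Turn one wide (empid, [day0..day14]) record into 15
    (empid, date, hours) rows, as the SQL above does."""
    empid, hours = row
    return [(empid, week_start + timedelta(days=i), h)
            for i, h in enumerate(hours)]

row = ('100', [8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8])
rows = unpivot(row, date(2005, 5, 7))
print(rows[0])    # ('100', datetime.date(2005, 5, 7), 8)
print(len(rows))  # 15
```

Each day column becomes its own row keyed by empid and date -- which is also why storing the days cross-record in the first place draws the "UGH" above.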


 

August 03, 2005 - 3:18 pm UTC

Reviewer: A reader

Tom,

Thanks for the solution. I need some more help if you don't mind. The sql works excellently and I experimented with it.

However, this question is based on a change of design here ... The emp1 table is joined with trn1 table (empid ~ trnid) to obtain values x and y. x and y should be passed to a function that returns date.

The emp1 table is like:

EmpId  Day0  Day1  ...  Day14

100       8     8           8
200       0     0           8
300       8     8           0
400       0     8           8
500       8     0           8

trn1 table is like:

trnid   x   y
100     3  18
200     4  19
300     5  20
400     6  21
500     7  22

etc ...



create table emp1(empid varchar2(3), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));

insert into emp1 values('100', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('200', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 0);
insert into emp1 values('300', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0, 0);
insert into emp1 values('400', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('500', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);



create table trn1(empid varchar2(3), x number(2), y number(2));

insert into trn1 values('100', 3, 18);
insert into trn1 values('200', 4, 19);
insert into trn1 values('300', 5, 20);
insert into trn1 values('400', 6, 21);
insert into trn1 values('500', 7, 22);



I used this function on just one row of emp1 (by hard coding x and y values).

I replaced

with dates as
(select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual
connect by level <= 15 )

with

with dates as
(select getXYDate(x,y)+level-1 dt, level-1 l from dual
connect by level <= 15 )

However, I am trying to implement this on every row of emp1 by obtaining x and y from trn. There is no week or year in emp1 table. Any help? Thanks again.



Tom Kyte

Followup  

August 03, 2005 - 6:00 pm UTC

I didn't think it was possible, but now I like this even less than before -- I didn't think you could do that ;(


ops$tkyte@ORA10G> with dates as
  2  (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual 
connect by level <= 15 )
  3  select empid, dt,
  4         case when l = 0 then day0
  5                  when l = 1 then day1
  6                  when l = 2 then day2
  7                          /* ... */
  8                  when l = 13 then day13
  9                  when l = 14 then day14
 10                  end data
 11    from ( QUERY ), dates
 12  /

replace query with a join of emp with trn and apply the function in there. 

August 03, 2005 - 7:38 pm UTC

Reviewer: A reader

Tom,

Sorry to bother you again. In my case, I think
(select to_date( '05/07/2005','mm/dd/yyyy') ...) will not help me anymore, because I basically have to find the dates for Day0 .. Day14 of every row in the emp1 table. The first date (the one that corresponds to Day0) for each record should be obtained by passing the X and Y values of the trn table to a function, because each record may have different x, y values.
If it's not achievable this way, can you suggest an alternate approach? I am trying to make a function that would use a loop. Also, the data should be written to a text file once complete; in that case I think a procedure might help, and if so, could you throw some light? Thanks for your patience.

Tom Kyte

Followup  

August 03, 2005 - 8:26 pm UTC

well, you just need to generate a set of 15 numbers (L)

and add them in later on. No big change. You have the "start_date" from the function, right -- just add L to dt.
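Concretely, that suggestion might look like this (a sketch only; it assumes the poster's function is named func_xy, takes trn1's x and y, and returns the Day0 date):

```sql
-- Generate offsets 0..14 once, then add each offset to the
-- per-row start date returned by the (assumed) func_xy function
with offsets as (select level-1 l from dual connect by level <= 15)
select e.empid, e.start_dt + o.l dt, o.l
  from offsets o,
       (select emp1.empid, func_xy(trn1.x, trn1.y) start_dt
          from emp1, trn1
         where emp1.empid = trn1.empid) e;
```

From there, the CASE expression from the earlier answer picks day0..day14 based on l.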

August 03, 2005 - 8:38 pm UTC

Reviewer: A reader

OK, can you please show that if possible?

August 03, 2005 - 9:38 pm UTC

Reviewer: A reader

Tom,

I tried this and am getting an error: ORA-00904: "DAY13": invalid identifier

WITH DATES AS
(SELECT FUNC_XY(17,2003)+level-1 dt, level-1 l FROM DUAL
connect by level <= 15)
select empid, day0, day14, x, y, dt,
case when l = 0 then day0
when l = 1 then day1
when l = 2 then day2
when l = 3 then day3
when l = 4 then day4
when l = 5 then day5
when l = 6 then day6
when l = 7 then day7
when l = 8 then day8
when l = 9 then day9
when l = 10 then day10
when l = 11 then day11
when l = 12 then day12
when l = 13 then day13
when l = 14 then day14
end data
from (select emp1.empid, day0, day14, x, y from emp1, trn1 where emp1.empid = trn1.empid), dates
/

As said before ... I also have to use x and y instead of 17 and 2003 in order to compute it for every row.


Tom Kyte

Followup  

August 04, 2005 - 8:20 am UTC

yeah, well -- you didn't select it out in the inline view. fix that.


look the concept is thus:


with some_rows as ( select level-1 l from dual connect by level <= 15 )
select a.empid, a.dt+l,
       case when l=0  then a.day0
            ...
            when l=14 then a.day14
       end data
  from some_rows,
       (select emp1.empid, func_xy(trn1.x, trn1.y) dt,
               emp1.day0, emp1.day1, .... <ALL OF THE DAYS>, emp1.day14
          from emp1, trn1
         where emp1.empid = trn1.empid ) a



August 04, 2005 - 9:15 am UTC

Reviewer: A reader

Tom,

Here, the sql is using a.empid, a.dt+l ...

whereas the inner sql is using emp1.day0, trn1.empid, etc ... My real inner sql uses some more columns and joins as well. When this gave me an error, I just substituted emp1.day0, emp1.day14, etc ... with day0, day14, etc ... and it worked. However, when there are several joins with alias names, how should it be done?

To make it a bit clear, this sql looks similar to:

select emp1.empid, emp1.day0 from some_rows, (select emp1.empid, emp1.day0) ...

Any idea how to select from a select and still use multiple joins, etc.? Hope I am clear.

Tom Kyte

Followup  

August 04, 2005 - 9:56 am UTC

you can join as much as you WANT in the inline views.

Sorry, I cannot go further with this one, I've shown the technique -- it is just a pivot to turn COLUMNS THAT SHOULD HAVE BEEN ROWS into rows -- very common.
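Stripped of this schema, the pivot has a very regular shape; here t, id, and c0..c2 are hypothetical names standing in for any table with repeating columns:

```sql
-- Columns-to-rows pivot: cross join the table with a row
-- generator, then pick one column per generated row
with n as (select level-1 l from dual connect by level <= 3)
select t.id, n.l,
       case n.l when 0 then t.c0
                when 1 then t.c1
                when 2 then t.c2
       end val
  from t, n;
```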

August 04, 2005 - 9:52 am UTC

Reviewer: A reader

Please ignore above post.

I need some help

August 09, 2005 - 10:25 am UTC

Reviewer: Carlos

Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/15/2004 17:42:56', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/18/2004 15:09:19', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('11/24/2004 09:38:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/30/2004 04:28:09', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/03/2005 14:36:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/05/2005 10:04:15', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/07/2005 08:54:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/10/2005 10:54:07', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('01/12/2005 10:13:13', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/18/2005 04:23:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/03/2005 03:15:05', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/09/2005 18:54:11', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005 21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/29/2005 17:05:04', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/22/2005 20:27:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/30/2005 08:53:13', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 13:16:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/30/2005 15:08:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;


Tom,

I hope you can help since I have been struggling with this report. I would like to get something like this...


In other words, I want to get when it was first logged in (IN_DATE) and when it was last logged out (OUT_DATE) -- sort of like MIN and MAX. In this case, for example, for the month of March, but it could be for any given month. Any ideas how I can accomplish that?

IN_DATE OUT_DATE
3/22/2005 8:27:03 PM 3/30/2005 3:08:39 PM

----from the table above for the month of March

Tom Kyte

Followup  

August 09, 2005 - 10:45 am UTC

insufficient detail here, why won't min/max work for you for example.

but I don't understand the logic behind the two values you say you want, I don't get how you arrived at them.

This is what I get

August 09, 2005 - 10:57 am UTC

Reviewer: A reader

select in_date, out_date
from lou_date
where id = 201048
and ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))

I get the following:

In_date out_date

3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM

Tom Kyte

Followup  

August 09, 2005 - 11:19 am UTC

ok,

Insert into LOU_DATE
(IN_DATE, OUT_DATE)
Values
(TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005
21:47:41', 'MM/DD/YYYY HH24:MI:SS'));

why didn't you get that row. for example.

August 09, 2005 - 11:48 am UTC

Reviewer: A reader

SQL Statement which produced this data:
select in_date, out_date
from lou_date
where ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
order by out_date

3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM

I guess my question is: when I get records that extend beyond March, the portion beyond March should be replaced with blank or NULL, since I can't charge him/her for April...


Tom Kyte

Followup  

August 09, 2005 - 12:00 pm UTC

I am so not following you here.

August 09, 2005 - 12:24 pm UTC

Reviewer: A reader

Tom,

Pretend that you are charging someone for a particular month; let's say the month of March. So you would like to do a query that reflects just that. A group of dates is given to you, and in that group of dates you have multiple records with the same id. Also, some records initiated in March but came back in April. Here are the examples, but it should work with any dates...

example 1.

in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM


would like to see:
in_date out_date
3/22/2005 8:27:15 PM 3/30/2005 1:16:00 PM

example 2

In_date out_date
3/3/2005 3:15:05 AM 3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM

would like to see:

In_date out_date
3/3/2005 3:15:05 AM 3/15/2005 9:47:41 PM

Tom Kyte

Followup  

August 09, 2005 - 12:42 pm UTC

begs the question


in_date out_date
20-feb-2005 15-apr-2005

or

in_date out_date
3/22/2005 8:27:15 PM 3/25/2005 8:53:13 AM
3/30/2005 1:16:00 PM 4/16/2005 1:40:44 PM

what then? Being able to clearly specify the "goal" or the "algorithm" usually leads us straight to the query itself. There are so many ambiguities here. Pretend you were actually documenting this for a junior programmer to program. Give them the specifications. In gory detail.

please don't just answer these two what-thens -- think of all of the cases (because I'll just keep on coming back with "what then" if you don't)

Remember -- I know NOTHING about your data, not a thing. This progression from

... I want to get when it was first logged in (IN_DATE) and when it was
last logged out (OUT_DATE) -- sort of like MIN and MAX ...

to this has been 'strange' to say the least.

Full explanation of requirements

August 09, 2005 - 3:10 pm UTC

Reviewer: A reader

Sorry for the misunderstanding, Tom. Here are the full requirements. I hope I can explain them this time.

The report is a billing report, and it goes as follows.
For example, for the month of March we have to bill
in the following way:

out_date  date_in  Bill

2/23      3/2      3/1 to 3/2
3/1       3/3      3/1 to 3/3
3/1       4/14     3/1 to 3/31
3/1       -        3/1 to 3/31
2/23      -        3/1 to 3/31

Tom Kyte

Followup  

August 09, 2005 - 3:38 pm UTC

well, i hope you give your programmers more detail.  Here is the best I'll do

ops$tkyte@ORA9IR1> select t.*,
  2         greatest( in_date, to_date('mar-2005','mon-yyyy') ) fixed_in_date,
  3         least( nvl(out_date,to_date('3000','yyyy')),  last_day( to_date( 'mar-2005', 'mon-yyyy' ) ) ) fixed_out_date
  4    from t
  5   where in_date < last_day( to_date( 'mar-2005', 'mon-yyyy' ) )+1
  6     and out_date >= to_date( 'mar-2005', 'mon-yyyy' );

 
IN_DATE   OUT_DATE  FIXED_IN_ FIXED_OUT
--------- --------- --------- ---------
03-MAR-05 09-MAR-05 03-MAR-05 09-MAR-05
11-MAR-05 15-MAR-05 11-MAR-05 15-MAR-05
22-MAR-05 29-MAR-05 22-MAR-05 29-MAR-05
22-MAR-05 30-MAR-05 22-MAR-05 30-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
 
6 rows selected.


predicate finds records that overlap march.

select adjusts the begin/end dates. 
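The same pattern works for any billing month; a sketch with a single bind variable (:month_start is an assumed name, bound to the first day of the month):

```sql
select t.*,
       greatest( in_date, :month_start ) fixed_in_date,
       least( nvl(out_date, to_date('3000','yyyy')),
              last_day(:month_start) ) fixed_out_date
  from t
 where in_date < last_day(:month_start) + 1
   and out_date >= :month_start;
```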

Thanks!!!

August 10, 2005 - 12:00 pm UTC

Reviewer: A reader

Tom,

One more request. I would like to start the report with
the first time it went out. That is to say...

how it looks now with your help...

fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM 3/31/2005


how the data looks


fix_in fix_out
3/22/2005 8:27:03 PM 3/29/2005 5:05:04 PM---first went out
3/30/2005 3:08:39 PM 4/16/2005 1:40:44 PM

How I would like to see it since we begin billing from
the first date the truck went out.

fix_in fix_out
3/29/2005 5:05:04 PM 3/30/2005 3:08:39 PM
3/30/2005 3:08:39 PM 3/31/2005

Thanks again Tom

Tom Kyte

Followup  

August 10, 2005 - 1:05 pm UTC

try to work it out yourself -- please.

why? because I'll do this little thing and it'll be "oh yeah, one more thing, when the data looks like this...."

specifying requirements is like the most important thing in the world -- it is key, it is crucial. It is obvious you know what you want (well, maybe -- it seems to change over time) but I don't "get it" myself. Your simple example here with two rows begs so many questions, I don't even want to get started.


You have lag() and lead() at your disposal; they probably come into play here. Check them out.

Thanks for help !

August 11, 2005 - 3:25 pm UTC

Reviewer: A reader

The report is kind of tricky. Especially when one of the dates originates in Feb. and the other of the pair falls in March.


Hooked on Analytics worked for me!!

August 22, 2005 - 11:15 am UTC

Reviewer: Greg from Toronto

I think I need to find a meeting group to help with my addiction ... I think I'm addicted to analytics .. :\

Finally got a chance to read chapter 12 in "Expert Oracle" ... awesome!! 4 big, hairy Thumbs up!! heh

But I got a question ... an "odd" behaviour that I don't understand ... was wondering if you could help explain:

Test Script:
================
drop table junk2;
drop sequence seq_junk2;

create sequence seq_junk2;

create table junk2
(inv_num number,
cli_num number,
user_id number)
/

insert into junk2
values ( 123, 456, null );
insert into junk2
values ( 123, 678, null );
insert into junk2
values ( 234, 456, null );
insert into junk2
values ( 234, 678, null );

commit;

break on cli_num skip 1

select * from junk2;

select inv_num, cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
seq_junk2.nextval seq,
FIRST_VALUE ( NVL ( user_id, seq_junk2.nextval ) )
OVER ( PARTITION BY cli_num ) user_id
from junk2
/
=====================

The final query shows this:


INV_NUM CLI_NUM CHK1 CHK2 SEQ USER_ID
---------- ---------- ---------- -------- ---------- ----------
123 456 999 NULL 1
234 999 NULL 2

123 678 999 NULL 3 2
234 999 NULL 4 2


4 rows selected.

and I'm kinda confused .. it appears that the analytic functions are not "processing" that sequence ... how do sequences and analytics work together?? (if at all??)

(In short, this is a simplified example of a bigger problem I tripped over. I'm trying to assign new user_ids for existing clients, but only want 1 user_id assigned per client. The trick is, each client can be associated with more than 1 investment ... so I have multiple rows with the same client, but I want the same user_id assigned. Kind of: "Has this client got an id yet? If not, give him a new one; otherwise display the one he's already been assigned".)

FIRST_VALUE and LAST_VALUE seemed the logical choice ...
The interesting thing is, when I use DBMS_RANDOM.VALUE (to assign a random PIN to start with) ... it works fine. What am I missing/forgetting about sequences that changes their behaviour in this regard?


Tom Kyte

Followup  

August 23, 2005 - 8:56 am UTC

that will be a tricky one, lots of assumptions on orders of rows processed and such.

that should throw an ora-2287 in my opinion.

I cannot see a safe way to do that without writing a plsql function and performing a lookup off to the side by cli_num

Sorry, I don't understand ...

August 23, 2005 - 11:36 am UTC

Reviewer: Greg from Toronto

you wrote:
"that will be a tricky one, lots of assumptions on orders of rows processed and such."

I don't understand what assumptions I'm making ... in my example, I just got 4 rows, I don't care what order they come back in, just so long as it deals with them in "groups of cli_nums" .. (hence the partition by cli_num portion) ... if I "lose" sequence numbers, that's fine, too ... I don't care about gaps in the sequence or "missing userids" ...

The only behaviour I'm seeing, is that the analytic function doesn't seem to be working with the sequence properly ...

I guess I can simplify the question even further:

Why does the following query return "NULL" ?

SQL > select first_value ( seq_junk2.nextval ) over ( )
2 from dual
3 /
------more------

FIRST_VALUE(SEQ_JUNK2.NEXTVAL)OVER()
------------------------------------


1 row selected.

(with a "normal" sequence - nothing fancy):

SQL > select seq_junk2.nextval from dual;
------more------

NEXTVAL
----------
29

1 row selected.


Tom Kyte

Followup  

August 24, 2005 - 8:35 am UTC

as i said, i believe it should be raising an error (I have it on my list of things to file when I get back in town).

I cannot make it work, I cannot think of a way to do it in a single statement, short of writing a user defined function.

Connect by with self referenced parent

August 23, 2005 - 12:30 pm UTC

Reviewer: Joe from Reston, VA

CONNECT BY works great but I've run into a problem when the ultimate parent is referenced in the parent record.  e.g., data looks like:
SQL> select * from t;

    OBJ_ID  PARENT_ID
---------- ----------
         1          1
         2          1
         3          1
         4          2
         5          4

But... using connect by generates an error..

SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
  2  from t
  3  connect by t.parent_id = prior t.obj_id;
ERROR:
ORA-01436: CONNECT BY loop in user data

If parent_id is null where obj_id = 1, then it's okay.  Any suggestion on how to handle the other case?  I'm stumped.
 

Solution for connect by

August 23, 2005 - 5:39 pm UTC

Reviewer: Logan Palanisamy from Sunnyvale, CA USA

SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
  2  from t
  3  connect by t.parent_id = prior t.obj_id and t.parent_id <> t.obj_id;

LEVEL                    OBJ_ID  PARENT_ID
-------------------- ---------- ----------
1                             1          1
  2                           2          1
    3                         4          2
      4                       5          4
  2                           3          1
1                             2          1
  2                           4          2
    3                         5          4
1                             3          1
1                             4          2
  2                           5          4
1                             5          4

12 rows selected.
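If 10g is available, an alternative sketch uses NOCYCLE to stop the loop and START WITH to limit the roots to the self-parented row, returning one row per node instead of twelve:

```sql
select lpad(' ', 2*(level-1)) || level "LEVEL", t.obj_id, t.parent_id
  from t
 start with t.obj_id = t.parent_id
connect by nocycle t.parent_id = prior t.obj_id;
```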
 

re:Solution for connect by

August 24, 2005 - 8:43 am UTC

Reviewer: Joe from Reston, VA

Thanks, Logan. Often the solution is so simple!

Seq problem

August 24, 2005 - 11:25 am UTC

Reviewer: Bob B from Albany, NY

SELECT
A.*,
seq_junk2.currval CURR_SEQ,
seq_junk2.nextval - ROWNUM + VAL SEQ
FROM (
SELECT
inv_num,
cli_num,
NVL ( user_id, 999 ) chk1,
NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
DENSE_RANK() OVER ( ORDER BY CLI_NUM ) VAL
FROM JUNK2
) A

Might be a starting point. It works on the following ASSUMPTION: ROWNUM corresponds to the number of times the sequence has been called. As Tom stated, this assumption can easily go out the window (throw an analytic function or an order by on the outer query for a simple example).

A safer solution might be to run two updates. Update 1 will give a unique id to each null user id. Update 2 will update the user id to the min or max user id for that cli_num. A little overhead, but safer and simpler than the aforementioned alternative.
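The two-update approach, sketched against the junk2 table and seq_junk2 sequence from the test script above:

```sql
-- 1) give every row with a null user_id its own new id
update junk2
   set user_id = seq_junk2.nextval
 where user_id is null;

-- 2) collapse to one id per client: every row for a cli_num
--    gets that client's smallest assigned id
update junk2 j
   set user_id = (select min(user_id)
                    from junk2
                   where cli_num = j.cli_num);
```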

Still confused ... but working on it ...

August 24, 2005 - 1:42 pm UTC

Reviewer: Greg from Toronto

Thanks, Bob!! Yeah, that does exactly what I wanted it to do, (but still doesn't really explain the "why" part) ...

problem is, it looks like this is more a question on sequences now than analytics, so I'll see if I can find a more appropriate thread to continue this on ..

Thanks!!


A slight twist on lag/lead

September 01, 2005 - 11:08 am UTC

Reviewer: Sudha Bhagavatula from Buffalo, NY

That was useful to me. Could do a lot of queries easily. However I'm stuck at this point.

I have data like this:

subr_id  dep_nbr  grp   eff_date    term_date
1001     001      2112  01/01/2000  12/31/2000
1001     001      2112  01/01/2001  06/30/2001
1001     001      2112  07/01/2001  12/31/2001
1001     001      7552  01/01/2003  12/31/2003
1001     001      2112  06/30/2004  12/31/9999

I want my output to look like this:

subr_id  dep_nbr  grp   eff_date    term_date
1001     001      2112  01/01/2000  12/31/2001
1001     001      7552  01/01/2003  12/31/2003
1001     001      2112  06/30/2004  12/31/9999

How do I achieve this ?


Tom Kyte

Followup  

September 01, 2005 - 3:49 pm UTC

well, you should start by describing the logic from getting from A to B first.

otherwise it is just text. what are the rules that got you from inputs to outputs.

tell me the procedural algorithm you would use for example.

Rules from A to B

September 02, 2005 - 9:29 am UTC

Reviewer: Sudha Bhagavatula from Buffalo, NY

A member is enrolled in a group for a timeframe. For all contiguous timeframes in a group I can take the min(eff_date) and max(term_date); for each break in group, a new row with min(eff_date) and max(term_date) again. So, say a member was enrolled in a group from 01/01/2001 to 12/31/2001 and then again with the same group from 01/01/2005 to 06/30/2005; then I need 2 rows for this member
with the dates as just described. This is the SQL that I'm running; hopefully I'm on the right track, but I am stuck at this point:

SELECT SUBR_ID,
DEP_NBR,
GRP,
LAG_EFF_DATE,
LEAD_EFF_DATE,
EFF_DATE,
TERM_DATE,
LAG_TERM_DATE,
LEAD_TERM_DATE,
DECODE( LEAD_GRP, GRP, 1, 0 ) FIRST_OF_SET,
DECODE( LAG_GRP, GRP, 1, 0 ) LAST_OF_SET
FROM (SELECT M.SUBR_ID,
M.DEP_NBR,
LAG(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_GRP,
LEAD(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_GRP,
GRP_NBR||SUB_GRP GRP,
CJ.EFF_DATE,
CJ.TERM_DATE,
LAG(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_EFF_DATE,
LEAD(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_EFF_DATE,
LAG(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_TERM_DATE,
LEAD(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_TERM_DATE
FROM DW.T_MEMBER_GROUP_JUNCTION CJ,
BCBS.T_GROUP_DIMENSION G,
BCBS.T_MEMBER_DIMENSION M
WHERE CJ.GRP_DIM_ID = G.GRP_DIM_ID
AND CJ.MBR_DIM_ID = M.MBR_DIM_ID
AND M.DEP_NBR != '000'
AND G.BENE_PKG IS NOT NULL)
WHERE LAG_GRP IS NULL
OR LEAD_GRP IS NULL
OR LEAD_GRP <> GRP
OR LAG_GRP <> GRP

Thanks for your reply.

Tom Kyte

Followup  

September 03, 2005 - 7:15 am UTC

you know, without a table, rows and something more concrete.... I have no comment.

More detail

September 04, 2005 - 10:34 pm UTC

Reviewer: Sudha Bhagavatula from buffalo, NY

I have 3 tables:

Member_dimension
Group_Dimension
Member_Group_Junction

Member_Dimension :- columns are mbr_dim_id, subr_id, dep_nbr

Group dimension :- columns are grp_dim_id, grp_nbr, sub_grp

Member_Group_Junction :- columns are mbr_dim_id, grp_dim_id, eff_date, term_date

I have to create one row for each contiguous dates of enrollment with a new row for a new group or a break in date.

Suppose a member (subr_id = 1001, dep_nbr = 001) is enrolled with a group called 001 from 01/01/2001 till 06/30/2001; he then changes group to 002 for the period 07/01/2001 till 12/31/2001. He enrolls with the same group 002 from 01/01/2002 till 06/30/2002 with a change in benefits. He then gets transferred to some other city or changes jobs. He joins back with the group 001 from 09/30/2003 till 11/30/2003 and quits again, then joins back with the same group 001 from 01/01/2004 till present. The data in the junction table will be like this:


mbr_dim_id  grp_dim_id  eff_date    term_date

1           1           01/01/2001  06/30/2001
1           2           07/01/2001  12/31/2001
1           2           01/01/2002  06/30/2002
1           1           09/30/2003  11/30/2003
1           1           01/01/2004  12/31/9999

My output should be like this:

mbr_dim_id  grp_dim_id  eff_date    term_date

1           1           01/01/2001  06/30/2001
1           2           07/01/2001  06/30/2002
1           1           09/30/2003  11/30/2003
1           1           01/01/2004  12/31/9999

For each change in group or a break in the contiguity of the dates I should get a new row. The junction table is joined to the dimension with the respective dim_ids.

Hope I'm clearer this time.

Thanks
Sudha





Tom Kyte

Followup  

September 05, 2005 - 10:11 am UTC

tell you what, see
</code> http://www.oracle.com/technology/oramag/oracle/04-mar/o24asktom.html <code>

it shows a technique, in the "analytics to the rescue" article, that will be useful for grouping ranges of records using the LAG() function.
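In outline, that technique applied to the sample data earlier in this thread would look something like the following (a sketch only; it assumes a contracts-style table with subr_id, dep_nbr, grp, eff_date, term_date and whole-day dates):

```sql
-- Flag the start of each new group, running-sum the flags into a
-- group number, then aggregate per group
select subr_id, dep_nbr, grp,
       min(eff_date) eff_date, max(term_date) term_date
  from (select x.*,
               sum(new_grp) over (partition by subr_id, dep_nbr
                                  order by eff_date) grp_id
          from (select c.*,
                       case when grp = lag(grp) over (partition by subr_id, dep_nbr order by eff_date)
                             and eff_date - lag(term_date) over (partition by subr_id, dep_nbr order by eff_date) <= 1
                            then 0 else 1
                       end new_grp
                  from contracts c) x)
 group by subr_id, dep_nbr, grp, grp_id
 order by subr_id, dep_nbr, eff_date;
```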



But, you need to read the text that you are supposed to read before putting an example here.

It is something I think I say a lot.


<quote>
If your followup requires a response that might include a query, you had better supply very very simple create tables and insert statements. I cannot create a table and populate it for each and every question. The SMALLEST create table possible (no tablespaces, no schema names, just like I do in my examples for you)
</quote>

that is a direct cut and paste

distinct last_value

September 06, 2005 - 4:49 am UTC

Reviewer: Putchi from Stockholm, Sweden

When using last_value I am usually only interested in the last value, hence I need a distinct in the select to get it. It gives what I want, but it seems that the database has to do the work twice: first a window sort and after that a unique sort. Is there any way to avoid the distinct but still get only one row per partition key?

create table a (num number(2), var1 varchar2(10), var2 varchar2(10));

insert into a values (1,'a','A');
insert into a values (2,'b','A');
insert into a values (3,'c','A');
insert into a values (1,'a','B');
insert into a values (2,'b','B');
insert into a values (3,'c','B');

commit;

SQL> select distinct
  2         var2
  3        ,last_value(var1) over (partition by var2 order by num
  4                                rows between unbounded preceding and unbounded following) var1
  5  from a;

VAR2       VAR1
---------- ----------
A          c
B          c


Execution plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (UNIQUE)
   2    1     WINDOW (SORT)
   3    2       TABLE ACCESS (FULL) OF 'A'

 

Tom Kyte

Followup  

September 06, 2005 - 8:31 am UTC

nope, analytics are not aggregates, aggregates are not analytics. 


A trick you can use to skip one or the other step is:


ops$tkyte@ORA817DEV> select var2,
  2         substr( max(to_char( num,'fm0000000000') || var1), 11 ) data
  3    from a
  4   group by var2
  5  /

VAR2       DATA
---------- -----------
A          c
B          c
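Another option that avoids the DISTINCT entirely, assuming a FIRST/LAST aggregate is acceptable, is KEEP (DENSE_RANK LAST):

```sql
-- "the var1 of the row with the highest num, per var2"
select var2,
       max(var1) keep (dense_rank last order by num) var1
  from a
 group by var2;
```

This is a true aggregate, so it produces one row per var2 without a separate SORT (UNIQUE) step.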


 

Analytics to the rescue

September 06, 2005 - 11:28 am UTC

Reviewer: Sudha Bhagavatula from Buffalo, NY

Read that article. Helped me, but now I have another twist.

Create table contracts (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)

insert into contracts values ('1001', '001', '2112', to_date('01/01/2000','mm/dd/yyyy'), to_date('12/31/2000','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2001','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('07/01/2001','mm/dd/yyyy'), to_date('12/31/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '7552', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/2003','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2004','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));


I ran this query to identify breaks in groups and dates for the above table:

select subr_id, dep_nbr, grp,
min_eff_date,
max_term_date
from
(select subr_id, dep_nbr, grp,
min(eff_date) min_eff_date,
max(term_date) max_term_date
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
max(rn)
over(partition by subr_id, dep_nbr order by eff_date) max_rn
from
(select subr_id, dep_nbr, eff_date, term_date, grp,
(case
when eff_date-lag_term_date > 1
or lag_term_date is null
or lag_grp_nbr is null
or lag_grp_nbr <> grp
then row_num
end) rn
from (
select subr_id, dep_nbr, eff_date, term_date, grp_nbr grp,
lag(term_date)
over (partition by subr_id, dep_nbr order by eff_date) lag_term_date,
lag(grp_nbr||sub_grp)
over (partition by subr_id, dep_nbr order by eff_date) lag_grp_nbr,
row_number()
over (partition by subr_id, dep_nbr order by eff_date) row_num
from contracts )))
group by subr_id, dep_nbr, grp, max_rn )
order by subr_id, dep_nbr, min_eff_date

This gave me the output as :

subr_id  dep_nbr  grp   eff_date    term_date
1001     001      2112  01/01/2000  12/31/2001
1001     001      7552  01/01/2003  12/31/2003
1001     001      2112  06/30/2004  12/31/9999


I now have another table :

create table contract_pcp_junction (subr_id varchar2(15), dep_nbr varchar2(3), pcp_id varchar2(12), eff_date date, term_date date)

insert into contract_pcp_junction values('1001','001','123765', to_date('07/01/2000','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contract_pcp_junction values('1001','001','155165', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));

This table identifies the provider coverage for each member. I need to identify the breaks in coverage with regards to the contracts.

Now as per the data above this member does not have a pcp from 01/01/2000 to 06/30/2000 and again from 07/01/2001 to 12/31/2001.

I need to insert the breaks into another table. This table needs to have the subr_id, dep_nbr, grp and eff_date, term_date.

create table contract_pcp_breaks (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)

This table needs to have the data for the breaks

subr_id  dep_nbr  grp_nbr  eff_date    term_date

1001     001      2112     01/01/2000  06/30/2000
1001     001      2112     07/01/2001  12/31/2001


How do I do that? Hopefully I have given you the necessary scripts to work with.

Thanks a lot for your patience with this.

--Sudha

Tom Kyte

Followup  

September 06, 2005 - 8:51 pm UTC

yah, I have scripts, but no real idea how these tables relate. Your query looks overly complex for the single table.

can't you take your data, join it, and get some "flat relation" on which simply using lag() will solve the problem?

(please remember, you have been looking at this for hours. To you this data is natural. to everyone else, it is just bits and bytes on the screen)

Combining two tables

September 09, 2005 - 6:39 am UTC

Reviewer: Putchi from Stockholm, Sweden

Hi Tom!

I want to combine from/to history values from two tables into one sequence like this:

create table a (a varchar2(2)
,from_date date
,to_date date);

create table b (b varchar2(2)
,from_date date
,to_date date);

insert into a ( a, from_date, to_date ) values (
'a1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date('02/10/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a2', to_date( '02/10/2005', 'mm/dd/yyyy'), to_date( '05/01/2005', 'mm/dd/yyyy'));
insert into a ( a, from_date, to_date ) values (
'a3', to_date( '05/01/2005', 'mm/dd/yyyy'), to_date( '08/12/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b1', to_date( '01/13/2005', 'mm/dd/yyyy'), to_date( '01/22/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b2', to_date( '01/22/2005', 'mm/dd/yyyy'), to_date( '04/01/2005', 'mm/dd/yyyy'));
insert into b ( b, from_date, to_date ) values (
'b3', to_date( '04/01/2005', 'mm/dd/yyyy'), to_date( '09/07/2005', 'mm/dd/yyyy'));
commit;


select * from ("Magic");

A B FROM_DATE TO_DATE
-- -- ---------- ----------
a1 b1 2005-01-13 2005-01-22
a1 b2 2005-01-22 2005-02-10
a2 b2 2005-02-10 2005-04-01
a2 b3 2005-04-01 2005-05-01
a3 b3 2005-05-01 2005-08-12

Is it possible?

Tom Kyte

Followup  

September 09, 2005 - 8:30 am UTC

ops$tkyte@ORA10G> select a.* , b.*,
  2         greatest(a.from_date,b.from_date),
  3             least(a.to_date,b.to_date)
  4    from a, b
  5   where a.from_date <=  b.to_date
  6     and a.to_date >= b.from_date;
 
A  FROM_DATE TO_DATE   B  FROM_DATE TO_DATE   GREATEST( LEAST(A.T
-- --------- --------- -- --------- --------- --------- ---------
a1 13-JAN-05 10-FEB-05 b1 13-JAN-05 22-JAN-05 13-JAN-05 22-JAN-05
a1 13-JAN-05 10-FEB-05 b2 22-JAN-05 01-APR-05 22-JAN-05 10-FEB-05
a2 10-FEB-05 01-MAY-05 b2 22-JAN-05 01-APR-05 10-FEB-05 01-APR-05
a2 10-FEB-05 01-MAY-05 b3 01-APR-05 07-SEP-05 01-APR-05 01-MAY-05
a3 01-MAY-05 12-AUG-05 b3 01-APR-05 07-SEP-05 01-MAY-05 12-AUG-05


It won't be blindingly fast on huge things I would guess... 
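The overlap join above can be mirrored procedurally; a rough Python sketch of the same greatest/least logic (nested loops, so it carries the same performance caveat on large inputs; the sample data is the a/b data from the question):

```python
from datetime import date

def overlap_join(a_rows, b_rows):
    """Pair every a-interval with every overlapping b-interval, clipped
    to the intersection -- mirroring the SQL predicate
    a.from_date <= b.to_date AND a.to_date >= b.from_date
    together with GREATEST(from)/LEAST(to)."""
    out = []
    for a_name, a_from, a_to in a_rows:
        for b_name, b_from, b_to in b_rows:
            if a_from <= b_to and a_to >= b_from:
                out.append((a_name, b_name,
                            max(a_from, b_from), min(a_to, b_to)))
    return out

a = [('a1', date(2005, 1, 13), date(2005, 2, 10)),
     ('a2', date(2005, 2, 10), date(2005, 5, 1)),
     ('a3', date(2005, 5, 1), date(2005, 8, 12))]
b = [('b1', date(2005, 1, 13), date(2005, 1, 22)),
     ('b2', date(2005, 1, 22), date(2005, 4, 1)),
     ('b3', date(2005, 4, 1), date(2005, 9, 7))]
for row in overlap_join(a, b):
    print(row)
```

It produces the same five clipped pairs as the SQL output above.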

September 09, 2005 - 9:14 am UTC

Reviewer: Putchi from Stockholm, Sweden

OK, I will try it and see if it works; the real tables will have hundreds of thousands of records. I tried this myself, but I couldn't come up with something that filled in the "null" values.

SQL> select a,b,from_date,lead(from_date) over (order by from_date)
  2  from (
  3  select a,null b,from_date,to_date from a
  4  union all
  5  select null a,b,from_date,to_date from b
  6  order by from_date
  7  );

A  B  FROM_DATE  LEAD(FROM_
-- -- ---------- ----------
a1    2005-01-13 2005-01-13
   b1 2005-01-13 2005-01-22
   b2 2005-01-22 2005-02-10
a2    2005-02-10 2005-04-01
   b3 2005-04-01 2005-05-01
a3    2005-05-01

 

Tom Kyte

Followup  

September 09, 2005 - 9:36 am UTC

that query won't work -- you need to join.

How to get the 1ST row of this distinct value in a single SELECT

September 16, 2005 - 11:48 am UTC

Reviewer: Sean Chang from VA,USA

Thank you, Tom.

I have been reading about analytic functions for a while, but still
can't figure out a way to select the first row for each distinct
column value in a single SELECT statement, i.e.
>>by running the Create and Insert below


create table INV (
inv# number(7),
add_time date ,
inv_type varchar2(10),
amount number(8,2));

insert into inv values(1, sysdate-1, 'CASH', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(1, sysdate+1, 'COD', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(2, sysdate, 'MC', 10);
insert into inv values(3, sysdate-1, 'AMEX', 30);
insert into inv values(3, sysdate, 'CASH', 30);


I can get the first row of distinct INV# this way:

select * from (select a.*,
rank() over (partition by inv# order by add_time) time_order
from inv a) where time_order=1;

But how can I achieve this with a single SELECT statement?
The reason is that we have lots of tables where we only need to
look at the very first row for each column value, and I don't
want to end up with lots of in-line views in the SELECT
statement.



Tom Kyte

Followup  

September 16, 2005 - 1:59 pm UTC

that is a single select.

why not? (on the lots of in-line views). If you think they are evil - then you wouldn't like my code ;)
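The rank-then-filter pattern in that inline view can be sketched outside SQL too; a minimal Python sketch (the function name `first_rows` and the sample tuples are illustrative, not from the thread):

```python
def first_rows(rows, key, order):
    """All rows tied for the smallest `order` value within each `key`
    group -- the effect of
    RANK() OVER (PARTITION BY key ORDER BY order) = 1, ties included."""
    best = {}
    for r in rows:
        k = key(r)
        if k not in best or order(r) < best[k]:
            best[k] = order(r)
    return [r for r in rows if order(r) == best[key(r)]]

# (inv#, day_offset, inv_type) -- shaped like the INV table above,
# with sysdate-relative dates reduced to integer offsets
inv = [(1, -1, 'CASH'), (1, 0, 'VISA'), (1, 1, 'COD'),
       (2, 0, 'MC'), (3, -1, 'AMEX'), (3, 0, 'CASH')]
print(first_rows(inv, key=lambda r: r[0], order=lambda r: r[1]))
```

Two passes over the data, just as the SQL does one pass to rank and one to filter.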



Is an analytic function fitting in this situation?

October 03, 2005 - 10:29 am UTC

Reviewer: A reader

select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT

from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,


((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by

UNION ALL

select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by

UNION ALL

select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by

)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a

where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);

Tom Kyte

Followup  

October 03, 2005 - 11:29 am UTC

((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'A'
group by trunc(c.damage_inspection_date),c.damage_inspection_by

UNION ALL

select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by

UNION ALL

select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by

)


should be a single query without union's - you don't need to make three passes on that data

select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)
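The single-pass idea -- one scan of the join, with each count filtered by a CASE -- looks like this in a Python sketch (column names follow the thread; the data is made up for illustration):

```python
rows = [  # (gate_id, damage_type_code) pairs, illustrative data only
    (1, 'A'), (1, 'F'), (2, 'A'), (3, 'F'), (3, 'F'), (4, 'B'),
]

# One pass over the data, three conditional distinct counts -- the
# procedural twin of COUNT(DISTINCT CASE WHEN code = 'A' THEN gate_id END)
# etc., instead of three UNION ALL branches each rescanning the tables.
major, minor, total = set(), set(), set()
for gate_id, code in rows:
    if code == 'A':
        major.add(gate_id)
    if code == 'F':
        minor.add(gate_id)
    total.add(gate_id)

print(len(major), len(minor), len(total))
```

Each set plays the role of one COUNT(DISTINCT ...) column; the rows are visited exactly once.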


Great!

October 03, 2005 - 4:05 pm UTC

Reviewer: A reader

Tom,

When I put in the changes, it says "missing keyword". What am I doing wrong?

select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id))
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);

Tom Kyte

Followup  

October 03, 2005 - 8:57 pm UTC

sorry, I am not a sql compiler, I cannot reproduce since I don't have the tables or anything.

Case when ... then ... end

October 04, 2005 - 8:26 am UTC

Reviewer: Greg from Toronto

Just lucked out and saw this:

"select ..., count(distinct case when damage_code = 'A' then gate_id),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"

Should be:

"select ..., count(distinct case when damage_code = 'A' then gate_id end),
count(distinct case when damage_code = 'F' then gate_id end),
count(distinct gate_id)"

Tom just missed the "end" for the case statement ... (I got lucky and spotted it .. heh)



Tom Kyte

Followup  

October 04, 2005 - 4:25 pm UTC

(that is why i always ask for create tables and inserts - without them, it is not possible to test)

thanks!!

October 04, 2005 - 2:24 pm UTC

Reviewer: A reader


Well Taken

October 05, 2005 - 10:53 am UTC

Reviewer: A reader

Tom,

This is what I would like to see..

damage_inspection_date damage_inspection_by counts
xx/xx/xxxx Louis 2 minors
xx/xx/xxxx juan 1 major

thanks.

can analytics help me?

October 05, 2005 - 2:41 pm UTC

Reviewer: Susan from Watertown, MA

My result set must be ordered by the sum of multiple columns, with weights assigned to the columns. The SQL below works and gives me what I want, but maybe there is an analytic function solution? Thanks for all your help.

SELECT ename, job, sal, comm FROM scott.BONUS
ORDER BY DECODE(job, -2, 0, job)*100000+DECODE(sal, -2, 0, sal)*10000+DECODE(comm, -2,0,comm)*100 DESC



Tom Kyte

Followup  

October 05, 2005 - 3:05 pm UTC

not in this case - you want to order by a simple function of attributes of a single row.

You don't need to look across rows - analytics look across rows.

Thanks Tom

October 05, 2005 - 3:58 pm UTC

Reviewer: Susan from Watertown, MA

Thanks for your reply. Do you agree with the DECODE approach or am I missing a more elegant solution?

Tom Kyte

Followup  

October 05, 2005 - 8:23 pm UTC

the decode looks fine here - shorter than case but in this "case" just as easy to read.

Tom

October 05, 2005 - 4:25 pm UTC

Reviewer: A reader

Tom,

Can you please point in the right direction...

This is what I am getting with the following query...


damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR


and this is what I would like to get....

damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1




select b.damage_inspection_date,
b.damage_inspection_by
,b.status
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT ab.damage_inspection_date,
damage_inspection_by,
STATUS_CODE,
count(distinct case when ac.damage_location_code = 'A' then ab.gate_id end),
count(distinct case when ac.damage_location_code = 'F' then ab.gate_id end),
count(distinct ab.gate_id )
from gate_containers ab,gate_damages ac
where ab.gate_id = ac.gate_id
group by ab.damage_inspection_date,ab.damage_inspection_by,status_code, ab.gate_id))a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
group by (b.damage_inspection_date, b.damage_inspection_by,b.status)



Tom Kyte

Followup  

October 05, 2005 - 8:35 pm UTC

....
damage_inspection_date damage_inspection_by status
6/12/2004 CCCT MAJOR
6/12/2004 CCCT MINOR
6/12/2004 CCCT TOTAL
6/12/2004 LOU MAJOR
6/12/2004 LOU MINOR


and this is what I would like to get....

damage_inspection_date damage_inspection_by status count
6/12/2004 CCCT MAJOR 2
6/12/2004 CCCT MINOR 2
6/12/2004 CCCT TOTAL 1
...

by what "logic"? can you explain how you get from A to B?

follow up

October 06, 2005 - 9:48 am UTC

Reviewer: A reader

Tom,

I already got the first part done. All I need to show is the count in another column: how many minors, majors, and the total I have. Is that possible?
Maybe just like in the second example.



Tom Kyte

Followup  

October 06, 2005 - 11:54 am UTC

first part of WHAT?

more information

October 06, 2005 - 12:47 pm UTC

Reviewer: A reader

Sorry about the lack of information before.

Here I will try to do better. I am trying to write
a query where I need to count the majors and minors
and then get a total.

requirements:

1. If a container has majors and minors, total the counts:
major + minor = total count.

2. Where a container has a minor and no major, count the minor only:
count = minor.



inspector                  major  minor  total

1 major, 0 minor, other      1             1
2 major, 1 minor, other      2      1      3
0 major, 1 minor, other      0      1      1

Tom Kyte

Followup  

October 06, 2005 - 1:25 pm UTC

sorry -- going back to your original example, I still cannot see the logic behind "what I have" and "what I want" there.

I don't know what you mean by "i have the first part"




this is what I have now

October 06, 2005 - 2:11 pm UTC

Reviewer: A reader

Tom,

This is my query and result...

select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT

from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,


((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by

UNION ALL

select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by

UNION ALL

select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by

)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a

where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+);

RESULT:

SQL Statement which produced this data:
select * from MAJOR_MINOR_COUNT_VIEW
where rownum < 10

6/12/2004 CCCT TOTAL 1
6/12/2004 CRAIG TOTAL 6
6/13/2004 CCCT TOTAL 5
6/14/2004 CCCT TOTAL 46
6/14/2004 FYFE TOTAL 30
6/14/2004 HALM TOTAL 38
6/14/2004 MUTH MAJOR 2
6/14/2004 MUTH MINOR 14
6/14/2004 MUTH TOTAL 40

And I would like to have it as per the
requirements above... hope this helps.

Tom Kyte

Followup  

October 06, 2005 - 2:57 pm UTC

take your query - call it Q


select inspector,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (Q)
group by inspector
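The pivot step -- one output row per inspector, with each status spread into its own column via max(decode(...)) -- has this procedural shape in a Python sketch (the sample rows mimic Q's output above and are illustrative):

```python
rows = [  # (inspector, status, cnt) -- the shape of Q's result set
    ('MUTH', 'MAJOR', 2), ('MUTH', 'MINOR', 14), ('MUTH', 'TOTAL', 40),
    ('CCCT', 'TOTAL', 1),
]

# Pivot: collapse the status rows of each inspector into one record --
# what MAX(DECODE(status, 'MINOR', cnt)) ... GROUP BY inspector does.
pivot = {}
for inspector, status, cnt in rows:
    pivot.setdefault(inspector, {})[status] = cnt

for inspector, cols in sorted(pivot.items()):
    # missing statuses come out as None, like DECODE's implicit NULL
    print(inspector, cols.get('MINOR'), cols.get('MAJOR'), cols.get('TOTAL'))
```

The max() in the SQL is only there to collapse the group; each (inspector, status) pair carries a single cnt.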

Year to dt + month to date

October 06, 2005 - 2:35 pm UTC

Reviewer: reader from US

CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )

INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;


SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id, TO_CHAR(invoice_dt, 'MON-YYYY') ORDER BY TO_CHAR(sale_dt, 'MON-YYYY') ASC) mon_sal,
SUM (sale_price_usd * qty_sold)
OVER (PARTITION BY id, TO_CHAR(sale_dt, 'YYYY') ORDER BY TO_CHAR(sale_dt, 'YYYY') ASC) yr_sal,
FROM test

ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 237
aa 14-OCT-05 121.5 164.75 237
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 1298.25
bb 13-OCT-05 235.25 600.5 1298.25
bb 15-OCT-05 365.25 600.5 1298.25
bb 14-NOV-05 465.25 465.25 1298.25

7 rows selected.


Ideally ,it should have been ----

ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25

How can I do this ?

Will appreciate your help .

THANKS




Tom Kyte

Followup  

October 06, 2005 - 3:02 pm UTC

ideally - there would be a qty_sold column somewhere :)


ideally you will ONLY use to_char to *format* data, never to process it.

trunc(invoice_dt,'y') NOT to_char(invoice_dt,'yyyy')
trunc(sale_dt,'mm') NOT to_char(sale_dt, 'MON-YYYY' )
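The same distinction holds outside SQL: truncate a date so it stays a date, and only turn it into a string at display time. A small Python analogue of TRUNC(sale_dt,'MM') and TRUNC(sale_dt,'Y') (the variable names are illustrative):

```python
from datetime import date

d = date(2005, 10, 14)

# Truncate to the first of the month / year while STAYING a date --
# the analogue of TRUNC(sale_dt, 'MM') and TRUNC(sale_dt, 'Y').
month_start = d.replace(day=1)
year_start = d.replace(month=1, day=1)

# Formatting (the analogue of TO_CHAR) is only for display:
print(month_start, year_start, d.strftime('%b-%Y'))
```

Grouping on the truncated date keeps sorting and date arithmetic correct, where grouping on a 'MON-YYYY' string would sort alphabetically.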




Year to Date and Month to date

October 06, 2005 - 10:04 pm UTC

Reviewer: READER from US

As per your suggestion, I made the changes, but I still need your help.


CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )

INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;


SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'MM') ORDER BY trunc(sale_dt,'MM') ASC) mon_sal,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'Y') ORDER BY trunc(sale_dt,'Y') ASC) yr_sal
FROM test
ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------------------- ---------- ---------- ----------
aa 9/15/2005 72.25 72.25 237
aa 10/14/2005 193.75 164.75 237
aa 10/19/2005 237 164.75 237
bb 9/14/2005 232.5 232.5 1298.25
bb 10/13/2005 467.75 600.5 1298.25
bb 10/15/2005 833 600.5 1298.25
bb 11/14/2005 1298.25 465.25 1298.25
7 rows selected


Ideally ,it should have been ----

ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.5
bb 15-OCT-05 365.25 600.5 833.0
bb 14-NOV-05 465.25 465.25 1298.25


Thanks again .

Tom Kyte

Followup  

October 07, 2005 - 8:13 am UTC

you shall have to explain how you derived your "optimal" output.

It certainly isn't sorted by anything, and I don't get the numbers.

Year to date /Month to date

October 07, 2005 - 9:49 am UTC

Reviewer: Reader from US

I wish to create a summary table where we will have the sale for every day, the sale up to that day in that month, and then up to that day in that year,
i.e. a running or cumulative total.

Thanks


Tom Kyte

Followup  

October 07, 2005 - 8:22 pm UTC

ok?

Follow up

October 07, 2005 - 9:52 am UTC

Reviewer: A reader

Tom,

The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this:

                           major  minor  count

1 major, 0 minor, other      1             1
2 major, 1 minor, other      2             2
0 major, 1 minor, other      0      1      1


* count the minor when there is no major



CREATE TABLE GATE_CONTAINERS
(
GATE_ID NUMBER,
VISIT NUMBER,
REFERENCE_ID NUMBER,
DAMAGE_INSPECTION_BY VARCHAR2(30),
DAMAGE_INSPECTION_DATE DATE
)

Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(1, 1);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(17, 10);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(21, 12);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(31, 18);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(33, 19);
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(36, 22, TO_DATE('06/12/2004 11:48:49', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(37, 23, TO_DATE('06/12/2004 11:50:11', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(39, 25, TO_DATE('06/12/2004 11:48:19', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
(GATE_ID, VISIT)
Values
(45, 30);
COMMIT;




CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)

Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
COMMIT;


Tom Kyte

Followup  

October 07, 2005 - 8:35 pm UTC

The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this:

                           major  minor  count

1 major, 0 minor, other      1             1
2 major, 1 minor, other      2             2
0 major, 1 minor, other      0      1      1



so tell me why there are minor counts when major > 0???

and this is my query

October 07, 2005 - 9:53 am UTC

Reviewer: A reader

select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by

I got it....

October 07, 2005 - 11:16 am UTC

Reviewer: A reader

Tom,

I got it... I just had to put in the following. Let me know
what you think, and whether you have any suggestions!
Thanks for all your patience...

select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,z.damage_type_code,
( case z.damage_type_code
when 'F' then 0
ELSE Count(distinct g.gate_id)
end ) CNT
--- count(distinct g.gate_id) cnt
from gate_containers g,gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'

Year to Date and Month to date

October 07, 2005 - 11:52 pm UTC

Reviewer: Tim from CA USA

Just a guess - could this be what you're looking for?

SELECT DISTINCT id,sale_dt
,SUM (amount) OVER
(PARTITION BY id, sale_dt ORDER BY id ASC, sale_dt ASC) sale_daily
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (amount) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM TEST
ORDER BY id, sale_dt

ID SALE_DT SALE_DAILY MON_SAL YR_SAL
---------- --------- ---------- ---------- ----------
aa 15-SEP-05 72.25 72.25 72.25
aa 14-OCT-05 121.5 121.5 193.75
aa 19-OCT-05 43.25 164.75 237
bb 14-SEP-05 232.5 232.5 232.5
bb 13-OCT-05 235.25 235.25 467.75
bb 15-OCT-05 365.25 600.5 833
bb 14-NOV-05 465.25 465.25 1298.25


-- another variation
SELECT a.*
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'MM')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) mon_sal
,SUM (sale_daily) OVER
(PARTITION BY id, TRUNC(sale_dt,'Y')
ORDER BY id ASC, sale_dt ASC
RANGE UNBOUNDED PRECEDING
) yr_sal
FROM
(
SELECT id,sale_dt
,SUM(amount) sale_daily
FROM TEST
GROUP BY id, sale_dt
) a
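The month-to-date / year-to-date windows above have a simple procedural shape; a Python sketch of the same running sums (the sample rows are the pre-summed daily totals for id 'aa' from the expected output, for illustration):

```python
from datetime import date
from collections import defaultdict

sales = [  # (id, sale_dt, daily amount) -- daily totals already summed
    ('aa', date(2005, 9, 15), 72.25),
    ('aa', date(2005, 10, 14), 121.50),
    ('aa', date(2005, 10, 19), 43.25),
]

# Running month-to-date and year-to-date sums -- the procedural twin of
# SUM(amount) OVER (PARTITION BY id, TRUNC(sale_dt,'MM') ORDER BY sale_dt)
# and the same with TRUNC(sale_dt,'Y').
mtd = defaultdict(float)
ytd = defaultdict(float)
report = []
for sid, d, amt in sorted(sales, key=lambda r: (r[0], r[1])):
    mtd[(sid, d.year, d.month)] += amt   # resets per (id, month)
    ytd[(sid, d.year)] += amt            # resets per (id, year)
    report.append((sid, d, amt, mtd[(sid, d.year, d.month)],
                   ytd[(sid, d.year)]))

for row in report:
    print(row)
```

The ORDER BY inside the window is what turns the partition sum into a running sum, just as the sorted iteration order does here.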


EXCELLENT !

October 08, 2005 - 12:24 am UTC

Reviewer: reader from US

Thanks !

to answer your question

October 11, 2005 - 12:54 pm UTC

Reviewer: A reader

The above pivot worked well; however, my counts are off since
I ONLY want to count the minor when there is no major.
Something like this:

                           major  minor  count

1 major, 0 minor, other      1             1
2 major, 1 minor, other      2             2
0 major, 1 minor, other      0      1      1



so tell me why there are minor counts when major > 0???

Because when there are majors and minors, I want to count
only the majors. When there is just a minor and no major, I
want to count only the minor. Those are the only two situations there should be.



How can I ignore some selected columns in my group by?

October 12, 2005 - 3:14 am UTC

Reviewer: Neil from London

Tom,
I have a set of data that is recorded daily and I want to
compress it; so this:

87654321 1 5 21-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 22-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 23-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 24-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 25-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 26-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 27-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 28-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 31-AUG-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 01-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 02-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 03-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 04-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 05-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 06-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 07-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 08-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 09-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 10-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 11-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 12-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 13-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 14-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 15-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 16-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 17-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 18-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 19-SEP-2005 0 0 -1.991E+10 0 0 -1.991E+10
87654321 1 5 20-SEP-2005 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10
87654321 1 5 21-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 22-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 23-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 24-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 25-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10
87654321 1 5 26-SEP-2005 0 0 -1.436E+10 0 0 -1.436E+10

Needs to be converted into this:

87654321 1 5 21-AUG-2005 29-AUG-2005 0 0 -4.186E+10 0 0 -4.186E+10
87654321 1 5 30-AUG-2005 12-SEP-2005 2.7500E+10 0 -1.436E+10 2.7500E+10 0 -1.436E+10
87654321 1 5 13-SEP-2005 19-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321 1 5 20-SEP-2005 01-JAN-4000 5555550000 0 -1.436E+10 5555550000 0 -1.436E+10

The column of interest is the 7th one. Whenever it changes,
I want to create a new row beginning with the day's date,
and ending either on the day before the next change, or, if
there is no next change (LEAD analytic function), substitute
in 01-JAN-4000 to show that this is the current amount.
The problem is, I need to ignore the other figures in columns 5 & 6 and 8 & 9. If I group by all the columns, I get separate
entries for these lines. That's my stumbling block - I have got close with analytics, but so far, no cigar!
I'm on 8.1.7, although I'd be interested in solutions possible in
later versions, too. If you think this is possible, I can paste a create table and SQL*Loader script here, but it would detract from the post: It's a bit of a mess anyway - if only AskTom allowed 132 columns :)
T.I.A


Tom Kyte

Followup  

October 12, 2005 - 7:26 am UTC

"i want to create a new row" - that is hard as analytics don't "create rows", they just don't "squish them out" like an aggregate would.

make the example smaller - you don't need all of the columns, seems two or three might suffice. show the table, the data (via inserts) and the expected output if you like.

Maybe this should be a GROUP BY question, then

October 13, 2005 - 4:12 am UTC

Reviewer: Neil from London

OK - here are the table creation scripts and a couple of loader files.
My goal is to create a SQL statement to change the old data into the new.
I can use analytics to give me the start and end dates, but my problem is that I wish to ignore the actin, actout, expin and expout columns and concentrate on the act column. When it changes, I want to take the row, and give it an end date of the day before the date on which it changes again, or the default date of 01-JAN-4000 if no such row exists.
If I could just partition by the earliest date and the latest date where the act figure is the same within serial, volume and part, I could pick off the FIRST and the LAST and use LAG and LEAD to work out the dates...
CREATE TABLE t_old (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,ASAT DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/

LOAD DATA
INFILE *
INTO TABLE t_old
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,ASAT
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 25-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 00822000 6086 5 26-SEP-2005 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 31-AUG-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 01-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 02-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 04-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 05-SEP-2005 0 0 -1991105 0 0 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 1 5 07-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 1 5 08-SEP-2005 0 0 -1435550 0 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 14-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 15-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 420 5 16-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 420 5 17-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 18-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 495 5 19-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 20-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 21-SEP-2005 0 0 555555 0 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 23-SEP-2005 0 0 0 0 0 0
DEPOT1 08226111 664 5 28-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 0 -2750000 1478550 0 -2750000 1478550
DEPOT1 08226111 664 5 31-AUG-2005 0 0 1478550 0 0 1478550
DEPOT1 08226111 664 5 01-SEP-2005 0 0 1478550 0 0 1478550

CREATE TABLE t_new (
DEPOT VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART NUMBER(2)
,FROM_D DATE
,UNTIL_D DATE
,ACTIN NUMBER(8)
,ACTOUT NUMBER(8)
,ACT NUMBER(8)
,EXPIN NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD NUMBER(8)
)
/

LOAD DATA
INFILE *
INTO TABLE t_new
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,FROM_D
,UNTIL_D
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 01-JAN-4000 0 0 -1796200 0 0 -1796200
DEPOT1 08226111 1 5 29-AUG-2005 29-AUG-2005 0 0 -4185550 0 0 -4185550
DEPOT1 08226111 1 5 30-AUG-2005 02-SEP-2005 2750000 0 -1435550 2750000 0 -1435550
DEPOT1 08226111 1 5 03-SEP-2005 05-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111 1 5 06-SEP-2005 01-JAN-4000 555555 0 -1435550 555555 0 -1435550
DEPOT1 08226111 420 5 11-SEP-2005 12-SEP-2005 0 0 1150 0 0 1150
DEPOT1 08226111 420 5 13-SEP-2005 15-SEP-2005 3329555 -2775150 555555 3329555 -2775150 555555
DEPOT1 08226111 420 5 16-SEP-2005 18-SEP-2005 0 -555555 0 0 -555555 0
DEPOT1 08226111 495 5 19-SEP-2005 21-SEP-2005 555555 0 555555 555555 0 555555
DEPOT1 08226111 495 5 22-SEP-2005 01-JAN-4000 0 -555555 0 0 -555555 0
DEPOT1 08226111 664 5 28-AUG-2005 29-AUG-2005 0 0 4228550 0 0 4228550
DEPOT1 08226111 664 5 30-AUG-2005 01-JAN-4000 0 -2750000 1478550 0 -2750000 1478550
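Neil's t_old to t_new transformation doesn't actually need new rows, only collapsed ones, and the usual recipe fits: flag each row where ACT differs from the previous row's ACT (LAG), turn the flags into group numbers with a running SUM, collapse each group to its first date, then use LEAD on those start dates for UNTIL_D. Below is a minimal sketch of that recipe - an illustration, not from the thread - run in SQLite (whose LAG/LEAD/SUM ... OVER behave like Oracle's here) against a cut-down three-column version of t_old:

```python
import sqlite3

# A cut-down t_old: one serial, the ASAT date, and the ACT figure we care about.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t_old (serial TEXT, asat TEXT, act INTEGER);
INSERT INTO t_old VALUES
  ('08226111', '2005-08-29', -4185550),
  ('08226111', '2005-08-30', -1435550),
  ('08226111', '2005-08-31', -1435550),
  ('08226111', '2005-09-03', -1991105),
  ('08226111', '2005-09-04', -1991105),
  ('08226111', '2005-09-06', -1435550);
""")

rows = con.execute("""
WITH flagged AS (          -- 1 whenever ACT differs from the previous row's ACT
  SELECT serial, asat, act,
         CASE WHEN act = LAG(act) OVER (PARTITION BY serial ORDER BY asat)
              THEN 0 ELSE 1 END AS chg
  FROM t_old
), grouped AS (            -- running sum of the flags = consecutive-group number
  SELECT serial, asat, act,
         SUM(chg) OVER (PARTITION BY serial ORDER BY asat) AS grp
  FROM flagged
), starts AS (             -- collapse each group to its first day
  SELECT serial, act, MIN(asat) AS from_d
  FROM grouped
  GROUP BY serial, grp, act
)
SELECT serial, act, from_d,
       COALESCE(date(LEAD(from_d) OVER (PARTITION BY serial ORDER BY from_d),
                     '-1 day'),
                '4000-01-01') AS until_d   -- no next change => open-ended
FROM starts
ORDER BY from_d
""").fetchall()
```

In Oracle the same shape works with PARTITION BY depot, serial, volume, part and `LEAD(from_d) - 1` in place of SQLite's date() arithmetic; analytic functions exist on 8.1.7, though the WITH clause arrived later, so the CTEs would become nested inline views there.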





Some Help needed!

October 18, 2005 - 10:54 am UTC

Reviewer: A reader

Tom,

How can I count the double moves as 1? For example,
in the case of 1690371?

I want to count
1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L

I want to count "A" AS ONE MOVE using this query

select trunc(g.damage_inspection_date) damage_inspection_date,g.damage_inspection_by, 'MINOR' STATUS,
count(distinct g.gate_id) cnt
from gate_containers g,
gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and g.DAMAGE_INSPECTION_BY = 'COLUMBO'
and trunc(G.damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by


1690355 59 A
1690355 59 E

1690371 63 A
1690371 63 X
1690371 64 A
1690371 64 L

1690405 71 A
1690405 71 I

1690433 71 A
1690433 71 I

1690486 54 F
1690486 54 L
1690486 72 F
1690486 72 I

1690540 59 A
1690540 59 E

1690636 63 A
1690636 63 X

1690781 67 X




One solution

October 19, 2005 - 9:29 am UTC

Reviewer: A reader

Tom,

Can decode work here...

decode(count(distinct g.gate_id,'A','F',0,NULL)

Tom Kyte

Followup  

October 19, 2005 - 9:45 am UTC

I didn't really understand the question right above, nor did I see any table creates or inserts, so I sort of ignored it...

More information

October 19, 2005 - 10:44 am UTC

Reviewer: A reader

create table gate_containers
(gate_id number,
action varchar2(5),
damage_inspection_date date,
damage_inspection_by varchar2(30))





Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686439, 'RNC', TO_DATE('06/14/2005 11:16:16', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688372, 'RNC', TO_DATE('06/14/2005 13:26:59', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688374, 'RNC', TO_DATE('06/14/2005 13:27:08', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688235, 'RNC', TO_DATE('06/14/2005 13:18:15', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688609, 'RNC', TO_DATE('06/14/2005 13:43:35', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686827, 'RNC', TO_DATE('06/14/2005 11:42:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688508, 'RNC', TO_DATE('06/14/2005 13:36:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686044, 'RNC', TO_DATE('06/14/2005 10:50:47', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685720, 'RNC', TO_DATE('06/14/2005 10:27:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686276, 'RNC', TO_DATE('06/14/2005 11:05:23', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
CREATE TABLE GATE_DAMAGES
(
GATE_ID NUMBER NOT NULL,
DAMAGE_LOCATION_CODE VARCHAR2(5 BYTE) NOT NULL,
DAMAGE_TYPE_CODE VARCHAR2(5 BYTE) NOT NULL
)


--
--SQL Statement which produced this data:
-- SELECT * FROM GATE_DAMAGES
-- WHERE ROWNUM < 20
--
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '01', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(34, '02', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(37, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(62, '05', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(101, '23', 'C');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(183, '99', '9');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '01', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '04', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '07', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '11', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '22', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(188, '24', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '01', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '08', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '11', 'B');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '18', 'D');
Insert into GATE_DAMAGES
(GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
Values
(189, '22', 'D');
COMMIT;
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686279, 'RNC', TO_DATE('06/14/2005 11:05:34', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686285, 'RNC', TO_DATE('06/14/2005 11:05:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685831, 'RNC', TO_DATE('06/14/2005 10:36:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685417, 'RNC', TO_DATE('06/14/2005 10:06:00', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685579, 'RNC', TO_DATE('06/14/2005 10:17:18', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1685828, 'RNC', TO_DATE('06/14/2005 10:34:44', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686007, 'RNC', TO_DATE('06/14/2005 10:47:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1686131, 'RNC', TO_DATE('06/14/2005 10:56:42', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
(GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
Values
(1688019, 'RNC', TO_DATE('06/14/2005 13:05:56', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
COMMIT;


Tom,

Let me see if I can try now to explain it better. I am looking for
a report like this..

damages_inspection_date damages_inspection_by cnt minor major total
6/12/2005 MUTH XX XXX XX XX


requirements


MINOR = A
MAJOR = F
TOTAL != 'C'

and also, very importantly, when there is a major and a minor
I should only count the major, ignoring the minor.



Tom Kyte

Followup  

October 19, 2005 - 12:34 pm UTC

sorry - but I'll need much more "text" than that. Remember, I haven't been staring at these tables for hours/days, I'm not familiar with your vernacular, I don't know what problem you are trying to solve.

spec it out like we used to in the olden days - someone wrote spec (requirements) and someone else might have written the code from the spec.

follow up

October 19, 2005 - 2:51 pm UTC

Reviewer: A reader

GATE_ID DAMAGE_TYPE_CODE
1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X

1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X

I think this is a good example. In this case I got
A = MINOR DAMAGES
F = MAJOR DAMAGES
TOTAL = NOT EQUAL TO C ( !C)

1. If you look at it closely you can see for every
gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371

2 When you have multiples MINORS (F’s)
Like gate_id = 1690486 count them as one

3 When you have gate_id with F and A like
Gate_id = 1687912 then just Count F(MAJORS)

damages_inspection_date damages_inspection_by cnt minor major Total
6/12/2005 MUTH XX XXX XXX XX



Tom Kyte

Followup  

October 19, 2005 - 4:30 pm UTC

you seem to be using F as major:

F = MAJOR DAMAGES


but also as minor:

When you have multiples MINORS (Fs)

sorry, I'm not being a "hard whatever", I'm not getting it. step back, pretend you were trying to explain this to your mom.

this is what I got so far

October 19, 2005 - 3:02 pm UTC

Reviewer: A reader

Tom,

This is what I got so far, but the query is
not following the rule with the MINOR damages..
----------------------------------------
select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by


SORRY!!

October 19, 2005 - 3:05 pm UTC

Reviewer: A reader

COPIED THE WRONG QUERY...

This is what I got so far but the query
is not following the rules with the MINOR
AS stated above.

select damage_inspection_date,damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
count(distinct g.gate_id)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+)
)
group by damage_inspection_date,damage_inspection_by


follow up

October 19, 2005 - 4:47 pm UTC

Reviewer: A reader

1. If you look at it closely you can see for every
gate_id where I have multiple
Major damages just count them as one like for example
Gate_id = 1690371

2 When you have multiples MINORS (F’s)
Like gate_id = 1690486 count them as one

3 When you have gate_id with F and A LIKE
Gate_id = 1687912 then just Count F(MAJORS)


F FOR MAJOR
A FOR MINOR

SOMETIMES IN THE RECORD WILL HAVE MAJOR AND MINORS
I JUST WANT TO COUNT THE MAJOR AND IGNORE THE MINORS.
I HAVE GIVEN 3 EXAMPLES OF THE RULES....DON'T KNOW
WHAT ELSE TO SAY...ALSO PLEASE LOOK AT THE QUERY
IT'S ALL IN THE UNION. THE ONLY PROBLEM THAT I HAVE
IS THAT I AM COUNTING THE MAJOR AND THE MINORS IN
THE MINOR UNION.



Tom Kyte

Followup  

October 19, 2005 - 4:57 pm UTC

1) Ok, I'm looking at that gate id:

1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L

IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?

2) Ok, I'm looking at that gate id:

1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I

Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get it?

3) Ok, I'm looking at that:


1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X

and we are back to F being a major, not a minor again?


So, no, I don't get it, it is not clear, you can shout loud, but it won't matter.


(am I the only one not really following this??)

ANOTHER EXAMPLE!

October 19, 2005 - 4:55 pm UTC

Reviewer: A reader

1690355 59 A A
1690355 59 E
1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L
1690405 71 A A
1690405 71 I
1690433 71 A A
1690433 71 I
1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I
1690540 59 A A
1690540 59 E
1690636 63 A A
1690636 63 X
1690781 67 X
A 12
F 9


select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by

result from the query:


damage_inspection_date
6/14/2005

damage_inspection_by
COLUMBO

status
MINOR

CNT
9 --------IT SHOULD BE 8 WHY? BECAUSE I WANT TO COUNT
THE "A"(MINORS) UNIQUELY,DISTINCTLY. IN OTHER WORDS


follow up

October 19, 2005 - 5:15 pm UTC

Reviewer: A reader

) Ok, I'm looking at that gate id:

1690371 63 A A
1690371 63 X
1690371 64 A
1690371 64 L

*****this count as one A = MINOR




IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?

2) Ok, I'm looking at that gate id:

1690486 54 F F
1690486 54 L
1690486 72 F
1690486 72 I

*******THIS COUNT AS ONE F = MAJOR

Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the
primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get
it?

3) Ok, I'm looking at that:


1687912 56 F F
1687912 56 I
1687912 66 A
1687912 66 X


***** IN THIS CASE IGNORE THE MINOR(A) AND COUNT JUST THE MAJOR(F)



****THE FAR LETTER ON THE RIGHT IS TO SHOW YOU HOW I AM
CALCULATING WHAT IS ON THE RIGHT...




Tom Kyte

Followup  

October 19, 2005 - 7:44 pm UTC

so, by gate_id compute how many A's and how many F's


select gate_id, count(case when col='A' then col end) A,
       count(case when col='F' then col end) F
  from t
 group by gate_id;

now you have gate_id and a count of A's and F's

call that Q

select ...
from (Q);

use CASE to look at A and F and return whatever you want.
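Tom's skeleton can be fleshed out like so - a hedged sketch, with SQLite standing in for Oracle and sample rows invented from the gate ids shown earlier in the thread (1690371 has only A's, 1690486 only F's, 1687912 both an F and an A): compute the per-gate counts of A and F first, then let CASE classify each gate, counting a gate as MAJOR whenever any F is present.

```python
import sqlite3

# Invented sample rows based on the gate ids discussed above (an assumption,
# not the poster's real data).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gate_damages (gate_id INTEGER, damage_type_code TEXT);
INSERT INTO gate_damages VALUES
  (1690371, 'A'), (1690371, 'X'), (1690371, 'A'), (1690371, 'L'),
  (1690486, 'F'), (1690486, 'L'), (1690486, 'F'), (1690486, 'I'),
  (1687912, 'F'), (1687912, 'I'), (1687912, 'A'), (1687912, 'X');
""")

rows = con.execute("""
SELECT CASE WHEN f > 0 THEN 'MAJOR'        -- any F at all wins
            WHEN a > 0 THEN 'MINOR' END AS status,
       COUNT(*) AS cnt                     -- each gate counted exactly once
FROM (SELECT gate_id,
             COUNT(CASE WHEN damage_type_code = 'A' THEN 1 END) AS a,
             COUNT(CASE WHEN damage_type_code = 'F' THEN 1 END) AS f
      FROM gate_damages
      GROUP BY gate_id)
GROUP BY status
ORDER BY status
""").fetchall()
```

Because the classification happens after the GROUP BY gate_id, double moves within a gate collapse to one, and a gate carrying both an F and an A lands in MAJOR only.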

Year to date Business Day

October 19, 2005 - 6:48 pm UTC

Reviewer: Reader from US

I am trying to calculate the number of days we did business, i.e. sold anything, and then a running total for the year.

CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )

INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
COMMIT;

SELECT a.*
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mtd_day_of_business
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) ytd_day_of_business
FROM
(
SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
GROUP BY ID, sale_dt
) a
ID SALE_DT SALE_DAILY MON_SAL YR_SAL MTD_DAYOFBUS YTD_DAYOFBUS
---------- --------- ---------- ---------- ---------- ------------ ------------
aa 15-SEP-05 216.75 216.75 216.75 1 1
aa 14-OCT-05 299.25 299.25 516 1 2
aa 19-OCT-05 129.75 429 645.75 2 3
bb 14-SEP-05 697.5 697.5 697.5 1 1
bb 13-OCT-05 705.75 705.75 1403.25 1 2
bb 15-OCT-05 1095.75 1801.5 2499 2 3
bb 14-NOV-05 1395.75 1395.75 3894.75 1 4

Ideally, the business days are: Sep - 14, 15; Oct - 13, 14, 15, 19; Nov - 14.

So the year count should be 7, and the monthly count should be Sep 1,2; Oct 1,2,3,4; and Nov 1.

Can this be done using an analytic function, or is there another way?

Thanks





Tom Kyte

Followup  

October 19, 2005 - 7:56 pm UTC

you mean like this?

ops$tkyte@ORA10GR1> SELECT a.*
  2  ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
  3       ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
  4  ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
  5       ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
  6  ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM') order by sale_dt )
  7  mtd_day_of_business
  8  ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y') )
  9  ytd_day_of_business
 10  FROM
 11  (
 12  SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
 13  GROUP BY ID, sale_dt
 14  ) a
 15  order by sale_dt
 16  /

ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL MTD_DAY_OF_BUSINESS YTD_DAY_OF_BUSINESS
---------- --------- ---------- ---------- ---------- ------------------- -------------------
bb         14-SEP-05     2092.5     2092.5     2092.5                   1                   7
aa         15-SEP-05     650.25     650.25     650.25                   2                   7
bb         13-OCT-05    2117.25    2117.25    4209.75                   1                   7
aa         14-OCT-05     1093.5     1093.5    1743.75                   2                   7
bb         15-OCT-05    3287.25     5404.5       7497                   3                   7
aa         19-OCT-05     389.25    1482.75       2133                   4                   7
bb         14-NOV-05    4187.25    4187.25   11684.25                   1                   7

7 rows selected.

 

Year to date ..

October 20, 2005 - 12:27 am UTC

Reviewer: Reader from US

The months are fine .
But the year count should increment, i.e. 1,2,3,4,5,6,7;
right now it shows the 7th business day for all transactions.

Thanks



Tom Kyte

Followup  

October 20, 2005 - 8:06 am UTC

I did that, because you asked for that.

... So the year count should be 7 and the monthly count should be sep 1,2 oct
1,2,3,4 and Nov 1 .....

add the order by to the year count just like I did for the month.

the order by will make it a running total.
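Concretely, the fix Tom describes looks like this. A small SQLite sketch using the dates from the example (Oracle's TRUNC(sale_dt,'MM') and TRUNC(sale_dt,'Y') play the role of strftime here, and the real query would first collapse to one row per distinct sale day across ids):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (sale_dt TEXT)")
con.executemany("INSERT INTO test VALUES (?)",
                [("2005-09-14",), ("2005-09-15",), ("2005-10-13",),
                 ("2005-10-14",), ("2005-10-15",), ("2005-10-19",),
                 ("2005-11-14",)])

rows = con.execute("""
SELECT sale_dt,
       -- ORDER BY inside the window turns the count into a running count
       COUNT(*) OVER (PARTITION BY strftime('%m', sale_dt)
                      ORDER BY sale_dt) AS mtd_day_of_business,
       COUNT(*) OVER (PARTITION BY strftime('%Y', sale_dt)
                      ORDER BY sale_dt) AS ytd_day_of_business
FROM test
ORDER BY sale_dt
""").fetchall()
```

Dropping the ORDER BY from either COUNT gives the whole-partition total on every row, which is exactly the "7 for all transactions" symptom above.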

Thank you

October 20, 2005 - 8:56 am UTC

Reviewer: A reader

Tom,

Thank you for your solution, above. However,
Can you give me a solution using CASE when
I can count either A and NOT F?

Thank you again.

Tom Kyte

Followup  

October 20, 2005 - 8:59 am UTC

select case when cnta > 0 and cntf > 0
then ...
when cnta = 0 and cntf > 0
then ...
when cnta > 0 and cntf = 0
then ...


use a boolean expression after computing the cnt of A and the cnt of F

SOMETHING LIKE THIS...

October 20, 2005 - 10:18 am UTC

Reviewer: A reader

Tom,

you mean something like this..

select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,
'MINOR' STATUS,
SUM (case WHEN z.DAMAGE_TYPE_CODE= 'A' THEN 1 ELSE 0 end)
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
group by trunc(g.damage_inspection_date),g.damage_inspection_by


Tom Kyte

Followup  

October 20, 2005 - 4:33 pm UTC

sure

FINAL SOLUTION

October 20, 2005 - 11:04 am UTC

Reviewer: A reader

Tom,

here is the problem that I was facing....I hope
this clears things up.


SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
--and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
-------------------------------------------------------------------------

6/14/2005 COLUMBO MINOR 1688235 A
6/14/2005 COLUMBO MINOR 1688609 A
6/14/2005 COLUMBO MINOR 1688273 A------was counting this as a minor when it should be counted as a major
6/14/2005 COLUMBO MINOR 1686769 A
6/14/2005 COLUMBO MINOR 1686517 A
6/14/2005 COLUMBO MINOR 1687985 A
6/14/2005 COLUMBO MINOR 1686483 A
6/14/2005 COLUMBO MINOR 1685361 A
6/14/2005 COLUMBO MINOR 1686414 A



SQL Statement which produced this data:
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
--and z.damage_type_code = 'A'
AND damage_inspection_by = 'COLUMBO'
and z.gate_id = '1688273'
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')

6/14/2005 COLUMBO MINOR 1688273 A--------throw away!
6/14/2005 COLUMBO MINOR 1688273 E
6/14/2005 COLUMBO MINOR 1688273 C
6/14/2005 COLUMBO MINOR 1688273 F-------keep
6/14/2005 COLUMBO MINOR 1688273 I


follow up

October 20, 2005 - 5:01 pm UTC

Reviewer: A reader

Tom,

I am still not able to get the correct result using
a case statement. Maybe I should use a function to return only the F when there is an F and an A in the record.

select trunc(g.damage_inspection_date) damage_inspection_date,
g.damage_inspection_by,
'MINOR' STATUS,
sum (case when z.damage_type_code = 'F' then 1 else 0 end) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id and
z.damage_type_code = 'F'
group by trunc(g.damage_inspection_date),g.damage_inspection_by



Can this be done using analytical functions??

October 25, 2005 - 2:59 pm UTC

Reviewer: A reader

Tom,

I am done with my query....I am looking for a better
approach or how I can improve it. Thanks!!


select damage_inspection_date, damage_inspection_by,
max(decode(status,'MINOR',cnt)) minor,
max(decode(status,'MAJOR',cnt)) major,
max(decode(status,'TOTAL',cnt)) total
from (select b.damage_inspection_date,
b.damage_inspection_by
,b.status
,NVL(a.cnt,0) CNT
from
(select aa.damage_inspection_date,
aa.damage_inspection_by,
bb.status
from (select distinct trunc(gc.damage_inspection_date)
damage_inspection_date, gc.damage_inspection_by
from gate_damages gd, gate_containers gc
where gd.gate_id = gc.gate_id
) aa,
(select *
from (select 'MAJOR' STATUS from dual
union all
select 'MINOR' STATUS from dual
union all
select 'TOTAL' STATUS from dual
)
) bb
)b,
((SELECT damage_inspection_date,
damage_inspection_by,
Status,
cnt
FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
c.damage_inspection_by,
'MAJOR' STATUS,
count(distinct c.gate_id) cnt
from gate_containers c,
gate_damages d
where c.gate_id = d.gate_id and
d.damage_type_code = 'F'
group by trunc(c.damage_inspection_date),c.damage_inspection_by
UNION ALL
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by, 'MINOR' status,
count(distinct g.gate_id) cnt
from gate_containers g, gate_damages z
where g.gate_id = z.gate_id
and z.damage_type_code = 'A'
and not exists
(select z.gate_id from gate_damages z
where z.gate_id = g.gate_id
and z.damage_type_code = 'F')
group by trunc(g.damage_inspection_date),g.damage_inspection_by
UNION ALL
select trunc(ab.damage_inspection_date) damage_inspection_date,
ab.damage_inspection_by,
'TOTAL' STATUS,
count(distinct ab.gate_id) cnt
from gate_containers ab,
gate_damages ac
where ab.gate_id = ac.gate_id(+) and
SUBSTR(ab.action,2,1) != 'C'
group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
)
group by damage_inspection_date, damage_inspection_by, status, cnt
)
) a
where b.damage_inspection_by = a.damage_inspection_by(+)
and b.damage_inspection_date = a.damage_inspection_date(+)
and b.status = a.status(+))
group by damage_inspection_date, damage_inspection_by;




Tom Kyte

Followup  

October 26, 2005 - 11:24 am UTC

sorry - too big to reverse engineer here as a review/followup....
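For what it's worth, a hedged sketch (an assumption, not Tom's answer): the three UNION ALL branches plus the outer-joined status scaffold in the query above can usually collapse into one pass of conditional aggregation over per-gate flags. SQLite stands in for Oracle, the data is invented, and the date/inspector grouping and the SUBSTR(action,2,1) != 'C' rule for TOTAL are omitted for brevity:

```python
import sqlite3

# Invented cut-down data: one minor-only gate, one major-only gate,
# one gate with both an F and an A.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gate_damages (gate_id INTEGER, damage_type_code TEXT);
INSERT INTO gate_damages VALUES
  (1690371, 'A'), (1690371, 'X'),
  (1690486, 'F'), (1690486, 'L'),
  (1687912, 'F'), (1687912, 'A');
""")

rows = con.execute("""
SELECT COUNT(CASE WHEN has_f = 0 AND has_a = 1 THEN 1 END) AS minor,
       COUNT(CASE WHEN has_f = 1 THEN 1 END)               AS major,
       COUNT(*)                                            AS total
FROM (SELECT gate_id,
             MAX(damage_type_code = 'A') AS has_a,   -- boolean flag per gate
             MAX(damage_type_code = 'F') AS has_f
      FROM gate_damages
      GROUP BY gate_id)
""").fetchone()
```

One scan of the detail, one GROUP BY, no NOT EXISTS: the per-gate flags make the "a major suppresses the minors" rule a plain CASE test, which tends to be both shorter and cheaper than the three-branch UNION ALL with outer joins.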

Analytic Question

November 20, 2005 - 8:16 am UTC

Reviewer: Yoav

Hi Tom.
I'm trying to calculate a weighted moving average.
I'm having a problem calculating the values under the column SUM_D.
Can you please demonstrate how to achieve the values that appear under the column SUM_D?

create table test
(stock_date date,
close_value number(8,2));

INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('02-OCT-2005',759.56);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('29-SEP-2005',753.59);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('28-SEP-2005',749.20);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('27-SEP-2005',741.71);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('26-SEP-2005',729.93);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('25-SEP-2005',719.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('22-SEP-2005',727.30);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('21-SEP-2005',735.81);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('20-SEP-2005',740.38);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('19-SEP-2005',739.86);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('18-SEP-2005',745.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('15-SEP-2005',744.65);
COMMIT;

select RN, day, stock_date,close_value,weight
from(
select rownum RN,to_char(stock_date,'d') Day,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,stock_date,close_value
from test
order by stock_date asc)
order by 1
)
ORDER BY 1
/

RN D STOCK_DAT CLOSE_VALUE     WEIGHT      SUM_D
-- - --------- ----------- ---------- ----------
 1 5 15-SEP-05      744.65    3723.25          5
 2 1 18-SEP-05      745.48     745.48          1  <==
 3 2 19-SEP-05      739.86    1479.72          3
 4 3 20-SEP-05      740.38    2221.14          6
 5 4 21-SEP-05      735.81    2943.24         10
 6 5 22-SEP-05       727.3     3636.5         15
 7 1 25-SEP-05      719.48     719.48          1  <==
 8 2 26-SEP-05      729.93    1459.86          3
 9 3 27-SEP-05      741.71    2225.13          6
10 4 28-SEP-05       749.2     2996.8         10
11 5 29-SEP-05      753.59    3767.95         15

RN D STOCK_DAT CLOSE_VALUE     WEIGHT      SUM_D
-- - --------- ----------- ---------- ----------
12 1 02-OCT-05      759.56     759.56          1

Thank You.

Tom Kyte

Followup  

November 20, 2005 - 8:31 am UTC

would you like to explain what sum_d is to you? explain the logic behind it.

Analytic Question

November 20, 2005 - 10:10 am UTC

Reviewer: Yoav

Hi Tom.
I'm sorry if my explanation wasn't clear enough.
The SUM_D column is actually a running total of the DAY column.
The thing is that I need to reset the value of the SUM_D column to 1 at the beginning of each week (Sunday).

select RN, day, TheDayIs, stock_date
from(
select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'Day')TheDayIs,
stock_date,close_value
from( select rownum,stock_date,close_value
from test
order by stock_date asc)
order by 1
)
ORDER BY 1
/

RN D SUM_D THEDAYIS  STOCK_DAT
-- - ----- --------- ---------
 1 5     5 Thursday  15-SEP-05
 2 1     1 Sunday    18-SEP-05
 3 2     3 Monday    19-SEP-05
 4 3     6 Tuesday   20-SEP-05
 5 4    10 Wednesday 21-SEP-05
 6 5    15 Thursday  22-SEP-05
 7 1     1 Sunday    25-SEP-05
 8 2     3 Monday    26-SEP-05
 9 3     6 Tuesday   27-SEP-05
10 4    10 Wednesday 28-SEP-05
11 5    15 Thursday  29-SEP-05
12 1     1 Sunday    02-OCT-05

Thank you for you quick response



Tom Kyte

Followup  

November 21, 2005 - 8:20 am UTC

you might have to "adjust" your stock_date by a day if 'ww' doesn't group correctly for you - the day a week starts and ends on depends on your NLS settings (a locale issue)



ops$tkyte@ORA9IR2> select row_number() over (order by stock_date) rn,
  2         to_char(stock_date,'d') day,
  3         stock_date,
  4         close_value,
  5         to_number(to_char(stock_date,'d'))*close_value weight,
  6          sum(to_number(to_char(stock_date,'d')))  
                  over (partition by to_char(stock_date,'ww') 
                            order by stock_date) sum_d
  7    from t
  8   order by stock_date
  9  /
 
        RN D STOCK_DAT CLOSE_VALUE     WEIGHT      SUM_D
---------- - --------- ----------- ---------- ----------
         1 5 15-SEP-05      744.65    3723.25          5
         2 1 18-SEP-05      745.48     745.48          1
         3 2 19-SEP-05      739.86    1479.72          3
         4 3 20-SEP-05      740.38    2221.14          6
         5 4 21-SEP-05      735.81    2943.24         10
         6 5 22-SEP-05       727.3     3636.5         15
         7 1 25-SEP-05      719.48     719.48          1
         8 2 26-SEP-05      729.93    1459.86          3
         9 3 27-SEP-05      741.71    2225.13          6
        10 4 28-SEP-05       749.2     2996.8         10
        11 5 29-SEP-05      753.59    3767.95         15
        12 1 02-OCT-05      759.56     759.56          1
 
12 rows selected.
 

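[Editorial note] Tom's caveat about 'ww' misbehaving under some NLS settings can be sidestepped with date arithmetic. This is a sketch (not from the original thread, and untested against the poster's data): `trunc(stock_date + 1, 'iw') - 1` is always the Sunday on or before `stock_date`, because 'iw' truncation always lands on the ISO Monday regardless of NLS_TERRITORY.

```sql
-- NLS-independent rewrite (editorial sketch):
-- trunc(d + 1, 'iw') - 1 = the Sunday on or before d, in any locale.
select row_number() over (order by stock_date) rn,
       stock_date - (trunc(stock_date + 1, 'iw') - 1) + 1 day_no,   -- 1 = Sunday
       stock_date,
       close_value,
       (stock_date - (trunc(stock_date + 1, 'iw') - 1) + 1) * close_value weight,
       sum(stock_date - (trunc(stock_date + 1, 'iw') - 1) + 1)
           over (partition by trunc(stock_date + 1, 'iw') - 1
                 order by stock_date) sum_d
  from t
 order by stock_date;
```

With the sample data this should reproduce the SUM_D column above without depending on the session's week-start setting, since both the day number and the week partition come from the same Sunday anchor.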
Analytics Question

November 21, 2005 - 7:41 am UTC

Reviewer: Yoav

Hi Tom.
I'm sorry for wasting your time.
I found the solution.

select RN, day, week_no,
sum(day) over
(partition by week_no
order by day) sum_d,
stock_date
from(
select RN, day, week_no, stock_date
from(select rownum RN,to_char(stock_date,'d') Day,
to_char(stock_date,'ww') week_no,
stock_date,close_value,
(case when to_char(stock_date,'d') = 1 then
1*close_value
when to_char(stock_date,'d') = 2 then
2*close_value
when to_char(stock_date,'d') = 3 then
3*close_value
when to_char(stock_date,'d') = 4 then
4*close_value
when to_char(stock_date,'d') = 5 then
5*close_value
end) weight
from( select rownum,'Y',stock_date,close_value
from test
order by stock_date asc)
order by 1)
)
order by 1
/


RN D WE SUM_D STOCK_DAT
-- - -- ----- ---------
 1 5 37     5 15-SEP-05
 2 1 38     1 18-SEP-05
 3 2 38     3 19-SEP-05
 4 3 38     6 20-SEP-05
 5 4 38    10 21-SEP-05
 6 5 38    15 22-SEP-05
 7 1 39     1 25-SEP-05
 8 2 39     3 26-SEP-05
 9 3 39     6 27-SEP-05
10 4 39    10 28-SEP-05
11 5 39    15 29-SEP-05

RN D WE SUM_D STOCK_DAT
-- - -- ----- ---------
12 1 40     1 02-OCT-05

Thank You. !!

Tom Kyte

Followup  

November 21, 2005 - 8:52 am UTC

see above, you can skip lots of steps here!

Analytics Question

November 22, 2005 - 5:29 am UTC

Reviewer: Yoav

Tom.
Your solution is better than mine.
Thank you!

Could you please help me with this

November 29, 2005 - 4:17 am UTC

Reviewer: A reader

I am trying to output a report with different aggregates for different price ranges

create table t(
id number(3),
year number(4),
month number(2),
slno number(2),
colorcd number(2),
sizecd number(2),
itemid number(4),
prdno number(3),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));

create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));

insert into t values (1,2005,1,1,1,10,1000,101,150,100,10);
insert into t values (1,2005,1,2,1,11,1000,101,150,120,2);
insert into t values (1,2005,1,3,1,12,1000,101,150,100,10);
insert into t values (1,2005,1,4,1,13,1000,102,150,200,2);
insert into t values (1,2005,2,5,2,10,1000,102,150,100,20);
insert into t values (1,2005,2,6,2,11,1000,102,150,100,12);
insert into t values (1,2005,2,7,3,10,1000,103,150,100,20);
insert into t values (1,2005,3,8,4,10,1000,103,150,100,22);
insert into t values (1,2005,4,9,4,11,1000,103,150,100,12);
insert into t values (1,2005,1,10,5,10,1000,104,450,100,10);
insert into t values (1,2005,1,11,5,11,1000,104,450,120,2);
insert into t values (1,2005,1,12,5,12,1000,104,450,100,10);
insert into t values (1,2005,1,13,5,13,1000,104,450,200,2);
insert into t values (1,2005,2,14,5,14,1000,104,450,100,20);
insert into t values (1,2005,1,15,6,10,1001,105,150,100,10);
insert into t values (1,2005,1,16,6,11,1001,105,150,120,2);
insert into t values (1,2005,1,17,6,12,1001,105,150,100,10);
insert into t values (1,2005,1,18,6,13,1001,105,150,200,2);
insert into t values (1,2005,2,19,7,10,1001,105,150,100,20);
insert into t values (1,2005,2,20,7,11,1002,106,400,100,12);
insert into t values (1,2005,2,21,8,10,1002,106,400,100,20);
insert into t values (1,2005,3,22,9,10,1002,107,400,100,22);
insert into t values (1,2005,4,23,10,11,1002,107,400,100,12);


insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');


Desc                             Itemname  <199  <299  <399  <499
-----------------------------------------------------------------
Count of distinct prdnos         Item0        3  null  null     1
Count of distinct prdnos
  (group by colorcd, sizecd)                  3  null  null     1
Sum of sl_qty (group by itemid)             110  null  null    44

Count of distinct prdnos         Item1        1  null  null  null
Count of distinct prdnos
  (group by colorcd, sizecd)                  1  null  null  null
Sum of sl_qty (group by itemid)              44  null  null  null

Count of distinct prdnos         Item2     null  null  null     2
Count of distinct prdnos
  (group by colorcd, sizecd)               null  null  null     1
Sum of sl_qty (group by itemid)            null  null  null    66

Is this possible? The Desc column is not needed and 'null' should be blank.

Thank you

Tom Kyte

Followup  

November 29, 2005 - 10:22 am UTC

I don't get the "group by colorcd, sizecd" bit. If you group by those attributes, you'll get a row per unique ITEMNAME, COLORCD, SIZECD.

I don't understand the logic.

November 29, 2005 - 10:48 am UTC

Reviewer: A reader

Dear Tom,

For each unique combination of ITEMNAME, COLORCD, SIZECD, the PRODNOs repeat, don't they? I need a count of the distinct PRODNOs.

COLORCD-SIZECD-ITEMID-PRODNO in that order, please see below

first group
-------------------------
1-10-1000-101
1-11-1000-101
1-12-1000-101
----------------------
second group
----------------------
1-13-1000-102
2-10-1000-102
2-11-1000-102

Both of these groups come under price range < 199, so the distinct count of PRODNO for price range < 199 = 2.


I hope this makes sense.

Thank you
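
[Editorial note] The per-band distinct count described above can be expressed directly, before any pivoting question arises. A sketch (assuming the tables T and P posted earlier in this thread, and the band edges 199/299/399/499 from the desired report):

```sql
-- count(distinct case ...) pivots the price bands into columns; rows whose
-- price falls outside a band contribute NULL, which count() ignores.
-- The outer join keeps items with no sales (Item2, Item3) in the report.
select p.itemname,
       count(distinct case when t.price < 199 then t.prdno end) lt_199,
       count(distinct case when t.price >= 199 and t.price < 299 then t.prdno end) lt_299,
       count(distinct case when t.price >= 299 and t.price < 399 then t.prdno end) lt_399,
       count(distinct case when t.price >= 399 and t.price < 499 then t.prdno end) lt_499
  from p left join t
    on t.itemid = p.itemid
   and t.id = 1 and t.year = 2005 and t.month in (1, 2, 3)
 group by p.itemname
 order by p.itemname;
```

One wrinkle: count() returns 0, not null, for an empty band; wrapping each column in nullif(count(...), 0) would blank them as in the desired report.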

Tom Kyte

Followup  

November 30, 2005 - 10:46 am UTC

I'm not understanding how this gets down to a single row. I did not get it.

Why would that be different from the count of distinct prodnos by itemid?

how about this query

November 30, 2005 - 8:04 pm UTC

Reviewer: steve from NYC, USA

Hi Tom,

Is there a simpler way to do this with an analytic function?

select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id


Thanks a lot!
Steve

Tom Kyte

Followup  

November 30, 2005 - 9:17 pm UTC

hows about you

a) set up a small example
b) explain what it is supposed to do in text (so we don't have to reverse engineer what you might have been thinking)

[RE] to Steve NYC

November 30, 2005 - 10:43 pm UTC

Reviewer: Marcio Portes

Maybe he is looking for this:

ops$marcio@LNX10GR2> select dept_num, id, sum(curr_adj_qty)
2 from
3 (
4 select dept_num, id, sum(current_adjust_qty) curr_adj_qty
5 from adjust
6 where applied_ind = 'N'
7 and expired_ind = 'N'
8 group by dept_num, id
9 UNION
10 select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
11 from adjust
12 where expired_ind = 'N'
13 and applied_ind = 'Y'
14 group by dept_num, id
15 ) adj_tmp
16 group by dept_num, id
17 /

  DEPT_NUM         ID SUM(CURR_ADJ_QTY)
---------- ---------- -----------------
         1          0               185
         1          2               186
         0          2                77
         0          0                81
         1          1               165
         0          1                56

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2815735809

--------------------------------------------------------------------------------
|Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 390 | 11 (46)| 00:00:01 |
| 1 | HASH GROUP BY | | 10 | 390 | 11 (46)| 00:00:01 |
| 2 | VIEW | | 10 | 390 | 10 (40)| 00:00:01 |
| 3 | SORT UNIQUE | | 10 | 120 | 10 (70)| 00:00:01 |
| 4 | UNION-ALL | | | | | |
| 5 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 6 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
| 7 | HASH GROUP BY | | 5 | 60 | 5 (40)| 00:00:01 |
|* 8 | TABLE ACCESS FULL| ADJUST | 250 | 3000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - filter("APPLIED_IND"='N' AND "EXPIRED_IND"='N')
8 - filter("EXPIRED_IND"='N' AND "APPLIED_IND"='Y')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
624 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed

ops$marcio@LNX10GR2>
ops$marcio@LNX10GR2> select dept_num, id,
2 sum( case when applied_ind = 'N'
3 then current_adjust_qty
4 else 0 end )
5 - sum( case when applied_ind = 'Y'
6 then current_adjust_qty
7 else 0 end ) curr_adj_qty
8 from adjust
9 where expired_ind = 'N'
10 group by dept_num, id
11 /

  DEPT_NUM         ID CURR_ADJ_QTY
---------- ---------- ------------
         1          0          185
         1          2          186
         0          2           77
         1          1          165
         0          0           81
         0          1           56

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3658272021

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 60 | 4 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 5 | 60 | 4 (25)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| ADJUST | 500 | 6000 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("EXPIRED_IND"='N')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
622 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed

I used this script to produce the output above.
set echo on
drop table adjust purge;

create table
adjust (
dept_num int,
id int,
applied_ind char(1),
expired_ind char(1),
current_adjust_qty int
);

insert /*+ append */ into adjust
with v
as ( select level l from dual connect by level <= 1000 )
select mod(l, 2), mod(l, 3),
decode(mod(l,8), 0, 'Y', 'N'),
decode(mod(l,5), 0, 'N', 'Y'),
trunc(dbms_random.value(1,10.000))
from v
/
commit;
exec dbms_stats.gather_table_stats( user, 'adjust' )
set autotrace on
select dept_num, id, sum(curr_adj_qty)
from
(
select dept_num, id, sum(current_adjust_qty) curr_adj_qty
from adjust
where applied_ind = 'N'
and expired_ind = 'N'
group by dept_num, id
UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
from adjust
where expired_ind = 'N'
and applied_ind = 'Y'
group by dept_num, id
) adj_tmp
group by dept_num, id
/

select dept_num, id,
sum( case when applied_ind = 'N'
then current_adjust_qty
else 0 end )
- sum( case when applied_ind = 'Y'
then current_adjust_qty
else 0 end ) curr_adj_qty
from adjust
where expired_ind = 'N'
group by dept_num, id
/
set autotrace off
set echo off



Multiple aggregates

December 01, 2005 - 7:45 am UTC

Reviewer: Raj

Dear Tom,

This continues my previous post, where the given data for the problem was wrong while I was putting together a sample test case. Here are the requirements, along with the create table statements and the corrected data.

This is to output sales figures for a given period of different products.


The output format should be,

1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item group by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item group by PRICE
6. Total SL_QTY for an item group by PRICE
7. Total SL_QTY*PRICE for an item group by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item

create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));

create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));

With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/

With this query I am able to get the report like this,

ITEMID ITEMNAME  PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
------ -------- ------ --------- ------- -------- ---------- --------- ---------
  1000 Item0       150         2       8      280      42000       150      40
  1000 Item0       450         1       4      110      49500       450      46.6
  1001 Item1       350         1       5      270      94500       350      32.6


Is it possible to get the report in the following format along with the nonmatching itemnames and null cells as blanks.

Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6

Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null

Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null

Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null

Many thanks for your help and patience

Sorry

December 01, 2005 - 9:45 pm UTC

Reviewer: Raj

Dear Tom,

I am sorry, I didn't post the insert statements. Sorry for being careless. I executed everything on my system and was formatting and copying it one piece at a time; I previewed and reread it before posting and still missed it. I will repost the requirements below.

This is to output sales figures for a given period of different products.

The output format should be,

1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD
3. PRICE
4. Count of distinct PRODNOs for an item grouped by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item grouped by PRICE
6. Total SL_QTY for an item grouped by PRICE
7. Total SL_QTY*PRICE for an item grouped by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item

drop table t;
drop table p;

create table t(
id number(3),
slno number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));

create table p(
itemid number(4) primary key,
displaycd varchar2(2),
itemname varchar2(10));

insert into t values (1,1,2005,1,1000,101,1,10,150,90,10);
insert into t values (1,2,2005,1,1000,101,1,11,150,80,20);
insert into t values (1,3,2005,1,1000,101,1,12,150,90,10);
insert into t values (1,4,2005,1,1000,101,1,13,150,80,20);
insert into t values (1,5,2005,2,1000,101,1,10,150,80,20);
insert into t values (1,25,2005,1,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,1,1000,104,1,13,150,80,20);
insert into t values (1,25,2005,2,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,2,1000,104,1,13,150,80,20);
insert into t values (1,26,2005,2,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,2,1000,104,1,10,150,90,10);
insert into t values (1,26,2005,1,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,1,1000,104,1,10,150,90,10);
insert into t values (1,6,2005,2,1000,101,1,11,150,60,40);
insert into t values (1,7,2005,2,1000,101,1,12,150,80,20);
insert into t values (1,14,2005,2,1000,101,1,13,150,80,20);
insert into t values (1,15,2005,1,1001,103,1,10,350,90,10);
insert into t values (1,23,2005,3,1001,103,1,11,350,10,90);
insert into t values (1,22,2005,3,1001,103,1,10,350,90,10);
insert into t values (1,21,2005,2,1001,103,1,11,350,80,20);
insert into t values (1,20,2005,2,1001,103,1,10,350,80,20);
insert into t values (1,19,2005,1,1001,103,1,14,350,80,20);
insert into t values (1,18,2005,1,1001,103,1,13,350,70,30);
insert into t values (1,17,2005,1,1001,103,1,12,350,40,60);
insert into t values (1,16,2005,1,1001,103,1,11,350,90,10);
insert into t values (1,8,2005,1,1000,102,1,10,450,80,20);
insert into t values (1,9,2005,1,1000,102,1,11,450,90,10);
insert into t values (1,10,2005,1,1000,102,1,12,450,90,10);
insert into t values (1,11,2005,1,1000,102,1,13,450,90,10);
insert into t values (1,12,2005,2,1000,102,1,10,450,80,10);
insert into t values (1,13,2005,2,1000,102,1,11,450,50,50);

insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');

commit;

With Items as(
select
itemid, displaycd, itemname
from
p),
DistinctCounts as(
select
min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
from
(select
distinct prdno,colorcd, sizecd, price, itemid
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
price, prdno, colorcd, sizecd )
group by price),
Aggregates as(
select
price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
from
(
select
itemid, month, prdno,colorcd, sizecd,
count(distinct prdno) over (partition by price) c1,
sl_qty c3,
sl_qty*price c4,
price,
trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7 c6
from
t
where
id = 1 and
year= 2005 and
month in(1,2,3)
order by
prdno,colorcd, sizecd)
group by
price)
select
a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
avg(c6) avg_trend
from
DistinctCounts d, Aggregates a, Items i
where
d.prdno=a.prdno and
d.price=a.price and
i.itemid=d.itemid
group by
a.price,a.itemid, i.itemname, c2
order by
a.itemid, i.itemname, a.price
/
ITEMID ITEMNAME PRICE PRDNO_CNT SKU_CNT SALE_CNT SALE_PRICE AVG_PRICE AVG_TREND
------ -------- ------ --------- -------- --------- ----------- ---------- ----------
1000 Item0 150 2 8 280 42000 150 40
1000 Item0 450 1 4 110 49500 450 47
1001 Item1 350 1 5 270 94500 350 33

Is it possible to get the report in the following format along with the
nonmatching itemnames and null cells as blanks.

Itemname <199 <299 <399 <499
------------------------------------------------
Item0 2 null null 1
8 null null 4
280 null null 110
42000 null null 49500
150 null null 450
40.0 null null 46.6

Item1 null null 1 null
null null 5 null
null null 270 null
null null 94500 null
null null 350 null
null null 32.6 null

Item2 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null

Item3 null null null null
null null null null
null null null null
null null null null
null null null null
null null null null



Thanking you



Tom Kyte

Followup  

December 02, 2005 - 10:48 am UTC

it wasn't just that - it was "this was too big to answer in a couple of seconds, and since I get over 1,000 of these a month, I cannot spend too much time on each one; I'd rather take NEW questions sometimes"




a single query to collapse date ranges

December 06, 2005 - 12:56 pm UTC

Reviewer: Bob Lyon from Houston

Tom,

I know what I want to do but can't quite get my mind around the syntax...

We want a single query to collapse date ranges under the assumption that a date range that
starts later than another range has a better value.

So given this test case

CREATE GLOBAL TEMPORARY TABLE RDL (
DATE_FROM DATE,
DATE_TO DATE,
VALUE NUMBER
);

INSERT INTO RDL VALUES (TO_DATE('01/03/2005', 'MM/DD/YYYY'), TO_DATE('01/12/2005', 'MM/DD/YYYY'), 5);
INSERT INTO RDL VALUES (TO_DATE('01/05/2005', 'MM/DD/YYYY'), TO_DATE('01/10/2005', 'MM/DD/YYYY'), 8);


-- I assume the innermost subquery would
-- use the DUAL CONNECT BY LEVEL trick to generate individual days for each grouping
-- ORDER BY DATE_FROM

1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
1 01/05/2005 01/06/2005 5
1 01/06/2005 01/07/2005 5
1 01/07/2005 01/08/2005 5
1 01/08/2005 01/09/2005 5
1 01/09/2005 01/10/2005 5
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5

2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8

-- an outer subquery would use analytics to get the max grouping

1 01/03/2005 01/04/2005 5
1 01/04/2005 01/05/2005 5
2 01/05/2005 01/06/2005 8
2 01/06/2005 01/07/2005 8
2 01/07/2005 01/08/2005 8
2 01/08/2005 01/09/2005 8
2 01/09/2005 01/10/2005 8
1 01/10/2005 01/11/2005 5
1 01/11/2005 01/12/2005 5

-- And the outermost subquery would use analytics to collapse the dates into contiguous groups
-- for the desired result

1 01/03/2005 01/05/2005 5
2 01/05/2005 01/10/2005 8
1 01/10/2005 01/12/2005 5

The trick is to do all of the above in a single query!

Any suggestions? (Yeah, I know, REALLY learn analytics!)

Thanks in advance,

Bob Lyon

OK, I think I got it

December 06, 2005 - 2:25 pm UTC

Reviewer: Bob Lyon from Houston

SELECT d date_from, d2 date_to, value
FROM (
SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
FROM (
SELECT DATE_FROM D, VALUE FROM RDL
UNION
SELECT DATE_TO D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
)
)
WHERE D2 IS NOT NULL
/

DATE_FROM DATE_TO VALUE
----------------- ----------------- ----------
01/03/05 00:00:00 01/05/05 00:00:00 5
01/05/05 00:00:00 01/10/05 00:00:00 8
01/10/05 00:00:00 01/12/05 00:00:00 5


Tom Kyte

Followup  

December 06, 2005 - 3:50 pm UTC

depends on how many overlaps you allow, take your create and:


...
ops$tkyte@ORA9IR2> INSERT INTO RDL VALUES (TO_DATE('01/06/2005', 'MM/DD/YYYY'),
  2  TO_DATE('01/7/2005', 'MM/DD/YYYY'), 99);

1 row created.

ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT d date_from, d2 date_to, value
  2  FROM (
  3     SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
  4     FROM (
  5     SELECT DATE_FROM D, VALUE FROM RDL
  6     UNION
  7     SELECT DATE_TO   D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
  8     )
  9  )
 10  WHERE D2 IS NOT NULL
 11  /

DATE_FROM DATE_TO        VALUE
--------- --------- ----------
03-JAN-05 05-JAN-05          5
05-JAN-05 06-JAN-05          8
06-JAN-05 07-JAN-05         99
07-JAN-05 10-JAN-05          8
10-JAN-05 12-JAN-05          5


so, maybe we expand out and keep the row we want:

ops$tkyte@ORA9IR2> with
  2  data
  3  as
  4  (select level-1 l
  5    from (select max(date_to-date_from+1) n from rdl) n
  6  connect by level <= n)
  7  select rdl.date_from+l,
  8         to_number( substr( max( to_char(date_from,'yyyymmdd') ||  value ), 9 ) ) value
  9    from rdl, data
 10   where data.l <= rdl.date_to-rdl.date_from
 11   group by rdl.date_from+l
 12  ;

RDL.DATE_      VALUE
--------- ----------
03-JAN-05          5
04-JAN-05          5
05-JAN-05          8
06-JAN-05         99
07-JAN-05         99
08-JAN-05          8
09-JAN-05          8
10-JAN-05          8
11-JAN-05          5
12-JAN-05          5

10 rows selected.



 
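[Editorial note] Tom's expansion gets down to one row per day carrying the winning VALUE; the remaining step is collapsing those days back into contiguous ranges. A sketch (untested, building on the RDL table and Tom's day-level query above) using the classic "value minus row_number" grouping trick:

```sql
-- Consecutive days sharing a VALUE also share (day - row_number() within
-- that value), so GRP identifies each contiguous run of days.
with daily as (
  select rdl.date_from + d.l day,
         to_number(substr(max(to_char(rdl.date_from,'yyyymmdd') || rdl.value), 9)) value
    from rdl,
         (select level - 1 l
            from (select max(date_to - date_from + 1) n from rdl)
          connect by level <= n) d
   where d.l <= rdl.date_to - rdl.date_from
   group by rdl.date_from + d.l
)
select min(day) date_from, max(day) date_to, value
  from (select day, value,
               day - row_number() over (partition by value order by day) grp
          from daily)
 group by value, grp
 order by date_from;
```

On Tom's three-row example this should yield one row per contiguous run, including a separate 06-JAN/07-JAN range for the value 99 that interrupts the 8s.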

Select with Analytics Working Partially

December 06, 2005 - 4:34 pm UTC

Reviewer: denni50 from na

Hi Tom

I have a question about the script below that is puzzling me.
I'm using the 4 idnumbers as test data. I'm looking to select the most recent record where the appealcode is like '_R%' for 2005.

When I run the script it only pulls 2 of the idnumbers.
I've been looking at this and can't see why the other two are being bypassed. I'm trying to use more analytics in my code; it's working for two records but not the other two.

any tips/help greatly appreciated.


SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
  2  paydate,payamount,transnum,ltransnum,appealcode
  3    from (
  4  select x.*, row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
 
  5    from (
  6  select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
  7  payamount, transnum,ltransnum,appealcode,
  8  max(paydate) over (partition by idnumber) maxpd
  9         from payment
 10   where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-Y
YYY')
 11         ) x
 12    where appealcode like '_R%' 
 13    and paydate=maxpd
 14    and idnumber in(4002401,4004594,5406454,5618190)
 15         )
 16   where rn = 1;

  IDNUMBER USER FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM  LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
   4004594 ACDC May2005 17-MAY-05          0   10159410   10086183 DRE0505
   5618190 ACDC Mar2005 11-MAR-05          0    9918802    9845638 DRJ0503

SQL> 


**** 4 TEST IDNUMBERS FROM BASE TABLE*********************

SQL> select idnumber,appealcode,paydate,payamount,transnum
  2  from payment where appealcode like '_R%'
  3  and idnumber=4004594 order by paydate desc;

  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   4004594 DRE0505    17-MAY-05          0   10159410
   4004594 GRG0502    08-FEB-05          0    9804766
   4004594 GRF0501    31-JAN-05          0    9750332
   4004594 GRK0410    01-NOV-04          0    9303510
   4004594 GRC0403    19-MAR-04          0    8371053
   4004594 GRA0305    12-AUG-03          0    7543911
   4004594 GRG0303    16-APR-03          0    7209503
   4004594 GRA0301    16-FEB-03          0    7026840




  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   4002401 GRG0502    16-MAR-05         25    9862647
   4002401 GRG0502    23-FEB-05          0    9826142
   4002401 GRA0501    19-JAN-05          0    9712904
   4002401 GRF0412    05-JAN-05          0    9630884
   4002401 GRK0410    21-OCT-04          0    9299106
   4002401 GRG0303    03-MAR-03          0    7066423
   4002401 GRA0301    09-FEB-03          0    7022121



  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   5406454 DRJ0503    03-MAR-05          0    9887770
   5406454 DRG0502    28-FEB-05          0    9870637



  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   5618190 DRJ0503    11-MAR-05          0    9918802
   5618190 DRG0502    28-FEB-05          0    9870090
   5618190 GRG0502    21-FEB-05          0    9824705

 

Tom Kyte

Followup  

December 07, 2005 - 1:32 am UTC

(i would need a create table and insert statements if you really want me to play with it)

but this predicate:

12 where appealcode like '_R%'
13 and paydate=maxpd
14 and idnumber in(4002401,4004594,5406454,5618190)

says

"only keep _R% records that had the max paydate over ALL records for that id"


to satisfy:

I'm looking to select the most recent
record where the appealcode is like '_R%' for 2005.

perhaps you mean:


select *
from (select t.*,
row_number() over (partition by idnumber order by paydate DESC) rn
from t
where appealcode like '_R%'
and idnumber in ( 1,2,3,4 ) )
where rn = 1;



that says

"find the _R% records"
"break them up by idnumber"
"sort each group from big to small by paydate"
"keep only the first record in each group"

if


to dennis

December 06, 2005 - 6:09 pm UTC

Reviewer: Oraboy from MI, USA

Hi,

I tried your problem and it looks like it's working fine.

Just a quick question: did you check that the dates are really 2005 and not 0005?

(Create scripts for anyone who wants to try in future)

Create table Test_T
(IdNumber number,
AppealCode Varchar2(100),
PayDate date,
PayAmount NUmber,
TransNum Number)
/

Insert into Test_t values ( 4004594 ,'DRE0505',to_date('17-May-05','DD-MON-RR'),0,10159410 );
Insert into Test_t values ( 4004594 ,'GRG0502',to_date('8-Feb-05','DD-MON-RR'),0,9804766 );
Insert into Test_t values ( 4004594 ,'GRF0501',to_date('31-Jan-05','DD-MON-RR'),0,9750332 );
Insert into Test_t values ( 4004594 ,'GRK0410',to_date('1-Nov-04','DD-MON-RR'),0,9303510 );
Insert into Test_t values ( 4004594 ,'GRC0403',to_date('19-Mar-04','DD-MON-RR'),0,8371053 );
Insert into Test_t values ( 4004594 ,'GRA0305',to_date('12-Aug-03','DD-MON-RR'),0,7543911 );
Insert into Test_t values ( 4004594 ,'GRG0303',to_date('16-Apr-03','DD-MON-RR'),0,7209503 );
Insert into Test_t values ( 4004594 ,'GRA0301',to_date('16-Feb-03','DD-MON-RR'),0,7026840 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('16-Mar-05','DD-MON-RR'),25,9862647 );
Insert into Test_t values ( 4002401 ,'GRG0502',to_date('23-Feb-05','DD-MON-RR'),0,9826142 );
Insert into Test_t values ( 4002401 ,'GRA0501',to_date('19-Jan-05','DD-MON-RR'),0,9712904 );
Insert into Test_t values ( 4002401 ,'GRF0412',to_date('5-Jan-05','DD-MON-RR'),0,9630884 );
Insert into Test_t values ( 4002401 ,'GRK0410',to_date('21-Oct-04','DD-MON-RR'),0,9299106 );
Insert into Test_t values ( 4002401 ,'GRG0303',to_date('3-Mar-03','DD-MON-RR'),0,7066423 );
Insert into Test_t values ( 4002401 ,'GRA0301',to_date('9-Feb-03','DD-MON-RR'),0,7022121 );
Insert into Test_t values ( 5406454 ,'DRJ0503',to_date('3-Mar-05','DD-MON-RR'),0,9887770 );
Insert into Test_t values ( 5406454 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870637 );
Insert into Test_t values ( 5618190 ,'DRJ0503',to_date('11-Mar-05','DD-MON-RR'),0,9918802 );
Insert into Test_t values ( 5618190 ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870090 );
Insert into Test_t values ( 5618190 ,'GRG0502',to_date('21-Feb-05','DD-MON-RR'),0,9824705 );

--since the other columns are not relevant , I used dummy values in your Select statement

s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,1 usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum,transnum ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1

s61>/

IDNUMBER USERCODE1 FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- ---------- ------- --------- ---------- ---------- ---------- -----------
4002401 1 Mar2005 16-MAR-05 25 9862647 9862647 GRG0502
4004594 1 May2005 17-MAY-05 0 10159410 10159410 DRE0505
5406454 1 Mar2005 03-MAR-05 0 9887770 9887770 DRJ0503
5618190 1 Mar2005 11-MAR-05 0 9918802 9918802 DRJ0503

--added the other two columns
s61>alter table test_t add (usercode1 varchar2(100),ltransnum varchar2(100));

Table altered.

s61>update test_t set usercode1 = chr(65+ mod(rownum,3)), ltransnum=transnum+rownum;

20 rows updated.

S61> @<<ursql.txt>>
IDNUMBER USERCODE1 FIRSTPA PAYDATE
---------- ---------------------------------------------------------------------------------------------------- ------- ---------
4002401 A Mar2005 16-MAR-05
4004594 B May2005 17-MAY-05
5406454 B Mar2005 03-MAR-05
5618190 A Mar2005 11-MAR-05

-- this is just a guess on why the other two numbers didn't
-- show up in your result

-- updating 2005 to 05
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=5406454
2 /

2 rows updated.

s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=4002401
2 /

7 rows updated.


s61>l
1 select
2 idnumber,
3 usercode1,
4 substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
5 paydate,
6 payamount,
7 transnum,
8 ltransnum,
9 appealcode
10 from (
11 select x.*,
12 row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
13 from (
14 select
15 idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
16 payamount, transnum, ltransnum,appealcode,
17 max(paydate) over (partition by idnumber) maxpd
18 from Test_t
19 where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
20 and to_date('31-OCT-2005','DD-MON-YYYY')
21 ) x
22 where appealcode like '_R%'
23 and paydate=maxpd
24 and idnumber in(4002401,4004594,5406454,5618190)
25 )
26* where rn = 1
s61>/

IDNUMBER US FIRSTPA PAYDATE PAYAMOUNT TRANSNUM LTRANSNUM APPEALCODE
---------- -- ------- --------- ---------- ---------- ---------- -------------------------------------
4004594 B May2005 17-MAY-05 0 10159410 10159411 DRE0505
5618190 A Mar2005 11-MAR-05 0 9918802 9918820 DRJ0503

-- same as what you see






Thanks Tom and Oraboy

December 07, 2005 - 8:37 am UTC

Reviewer: denni50 from na

Oraboy...you brought up a good possibility. Although
the data gets posted through canned software, users
are responsible for creating the batch headers before
posting batches, and what may have happened is that a user
inadvertently entered year 0005 instead of 2005
for a particular batch that included the idnumbers
I am testing with here. It's happened before.

thanks for that helpful tip!

:~)



Oraboy and Tom

December 07, 2005 - 8:57 am UTC

Reviewer: denni50 from na

Oraboy:
it was not the year; I tested using '0005' to see
if those two records would show up in the results, and they did not.

I changed the script based on logic that Tom suggested
and it worked...see changes below.

thanks Oraboy for your input and help.

SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode
  2  paydate,payamount,transnum,ltransnum,appealcode
  3    from (
  4  select x.*, row_number() over (partition by idnumber order by paydate desc, payamount desc, transnum desc) rn 
  5    from (
  6  select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
  7  payamount, transnum,ltransnum,appealcode
  8  --max(paydate) over (partition by idnumber) maxpd
  9         from payment
 10   where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
 11         ) x
 12    where appealcode like '_R%' 
 13    --and paydate=maxpd
 14    and idnumber in(4002401,4004594,5406454,5618190)
 15         )
 16   where rn = 1;

  IDNUMBER USER FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM  LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
   4002401 ACGA Mar2005 16-MAR-05         25    9862647    9789477 GRG0502
   4004594 ACDC May2005 17-MAY-05          0   10159410   10086183 DRE0505
   5406454 ACDC Mar2005 03-MAR-05          0    9887770    9814606 DRJ0503
   5618190 ACDC Mar2005 11-MAR-05          0    9918802    9845638 DRJ0503

SQL>  

any way to do a dynamic lag?

December 07, 2005 - 10:27 pm UTC

Reviewer: Ryan from Northern Virginia

Is it possible to use lag, but you don't know how many rows you want to go back?
create table history (
history_id number,
history_sequence number,
history_status varchar2(20),
history_balance number);
insert into history values (1,123,'HISTORY 1',10);
insert into history values (1,128,'PROCESSED',0);
insert into history values (1,130,'PROCESSED',0);
insert into history values (1,131,'HISTORY 8',15);
insert into history values (1,145,'PROCESSED',0);
for each history_id ordered by history_sequence
loop
if status = 'PROCESSED' then
history_balance = the history_balance of the last record where status != 'PROCESSED'
end if;
end loop;
Typically with lag you have to state how many rows you are looking back; in this case my discriminator is based on the value in the status field.
After this is run, I expect the values to be
1,123,'HISTORY 1',10
1,128,'PROCESSED',10
1,130,'PROCESSED',10
1,131,'HISTORY 8',15
1,145,'PROCESSED',15
I can do this with pl/sql. I am trying to figure out how to do this with straight sql.

Tom Kyte

Followup  

December 08, 2005 - 2:03 am UTC

last_value with ignore nulls in 10g, or the to_number(substr(max(...))) trick in 9i and before, can be used....


ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
  2         last_value(
  3         case when history_status <> 'PROCESSED'
  4                  then history_balance
  5                  end IGNORE NULLS ) over (order by history_sequence ) last_hb
  6    from history
  7  /

HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS       HISTORY_BALANCE    LAST_HB
---------- ---------------- -------------------- --------------- ----------
         1              123 HISTORY 1                         10         10
         1              128 PROCESSED                          0         10
         1              130 PROCESSED                          0         10
         1              131 HISTORY 8                         15         15
         1              145 PROCESSED                          0         15

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
  2         to_number( substr( max(
  3         case when history_status <> 'PROCESSED'
  4                  then to_char(history_sequence,'fm0000000000' ) || history_balance
  5                  end ) over (order by history_sequence ), 11 ) ) last_hb
  6    from history
  7  /

HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS       HISTORY_BALANCE    LAST_HB
---------- ---------------- -------------------- --------------- ----------
         1              123 HISTORY 1                         10         10
         1              128 PROCESSED                          0         10
         1              130 PROCESSED                          0         10
         1              131 HISTORY 8                         15         15
         1              145 PROCESSED                          0         15
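The pre-10g MAX/SUBSTR trick above (a running MAX over sequence||value, then SUBSTR to peel the value back out) can be reproduced in any engine with window functions; here is a sketch using Python's sqlite3 (SQLite 3.25+). SQLite has no IGNORE NULLS, so only the MAX/SUBSTR variant is shown, with printf standing in for Oracle's 'fm0000000000' format mask.

```python
import sqlite3

# Carry the last non-NULL balance forward: running MAX of a
# zero-padded-sequence||balance string, then substr off the balance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table history (
  history_id integer, history_sequence integer,
  history_status text, history_balance integer);
insert into history values
  (1, 123, 'HISTORY 1', 10),
  (1, 128, 'PROCESSED',  0),
  (1, 130, 'PROCESSED',  0),
  (1, 131, 'HISTORY 8', 15),
  (1, 145, 'PROCESSED',  0);
""")
rows = conn.execute("""
  select history_sequence, history_status, history_balance,
         cast(substr(max(case when history_status <> 'PROCESSED'
                              then printf('%010d', history_sequence)
                                   || history_balance
                         end) over (order by history_sequence),
                     11) as integer) last_hb
    from history
   order by history_sequence
""").fetchall()
for r in rows:
    print(r)
```

The zero-padding matters: the running MAX compares strings, so the sequence must sort lexically the same way it sorts numerically.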

 

Query

January 18, 2006 - 5:02 pm UTC

Reviewer: Mark from Boston, MA

Hi Tom,

Given a Table:
PK_ID NUMBER (PK)
CLASS_ID NUMBER
MY_DATE DATE

I'd like to develop an output like:

CLASS_ID W1 W2 W3...

where W1, W2, W3... are 'weeks' from SYSDATE, using MY_DATE, holding the counts of the CLASS_ID for that row.

How could I do that?

Thanks!

About to read Ch. 12 in Expert one-on-one...

Tom Kyte

Followup  

January 19, 2006 - 12:28 pm UTC

if

a) you have a finite number of weeks (eg: a sql query has a fixed number of columns at parse time, so you need to know what N is...)

b) an example <<<=== creates/inserts

we could play with this.

ok, here we go

January 20, 2006 - 1:31 pm UTC

Reviewer: Mark from Boston, MA

drop table cts_temp
/
create table cts_temp
( class_id number,
cts_date date)
/
insert into cts_temp
select round(dbms_random.value(1, 20)), trunc(created) from user_objects
/
Output sought:

CLASS_ID W1 W2 W3 W4 W5 W6 W7 W8 W9 W10+
----------------------------------------
1 1 3 7 5 1 0 0 0 0 12
2 1 5 1 0 0 0 5 3 4 6
3 1 0 0 9 0 1 1 5 1 10
...

where W# = # of weeks away from the current date; therefore, W1 is within 7 days of today, and W10+ is everything 10 weeks and older. These numbers are counts of records.

I have done this in the past with DECODE statements, but am looking for a more efficient way to do this using Analytics.

Tom Kyte

Followup  

January 20, 2006 - 2:47 pm UTC

but you will not want to use analytics since you NEED TO AGGREGATE.

decode (or case) is the correct approach to this problem, keep using that.
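The CASE-plus-aggregate approach Tom recommends can be sketched like this, using Python's sqlite3 and a fixed "as of" date in place of SYSDATE; the three-bucket layout (rather than W1..W10+) and all data are made up for illustration.

```python
import sqlite3

# Pivot row counts into week buckets with SUM(CASE ...): one bucket
# column per age band, aggregated per class_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table cts_temp (class_id integer, cts_date text);
insert into cts_temp values
  (1, '2006-01-19'), (1, '2006-01-10'), (1, '2005-12-01'),
  (2, '2006-01-20'), (2, '2006-01-01');
""")
rows = conn.execute("""
  select class_id,
         sum(case when age <  7              then 1 else 0 end) w1,
         sum(case when age >= 7 and age < 14 then 1 else 0 end) w2,
         sum(case when age >= 14             then 1 else 0 end) w3_plus
    from (select class_id,
                 cast(julianday('2006-01-20') - julianday(cts_date)
                      as integer) age
            from cts_temp)
   group by class_id
   order by class_id
""").fetchall()
print(rows)
```

Extending to ten buckets is just nine more SUM(CASE...) columns; the point is that this is aggregation, not an analytic (windowed) computation.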

SQL Query

January 23, 2006 - 7:26 am UTC

Reviewer: Parag J Patankar from India

Hi Tom,

Wish you a very happy new year 2006. I have a table 

create table t ( a number(5), b number(6), c number(1), d varchar2(8), e number(10));

insert into t values ( 09009, 1000, 1, 'RIS00001', 100);
insert into t values ( 09009, 1000, 0, 'RIS00001', 200);
insert into t values ( 09009, 1000, 0, 'RIS00001', 300);
insert into t values ( 09009, 1000, 2, 'RIS00001', 400);

insert into t values(09009, 5000, 2, 'BIC77777', 100);
insert into t values(09009, 5000, 2, 'BIC77777', 100);

insert into t values(09009, 6000, 0, 'DIG00077', 100);
insert into t values(09009, 6000, 0, 'DIG00077', 200);
commit;

17:33:38 SQL> select * from t;

         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       1000          1 RIS00001        100
      9009       1000          0 RIS00001        200
      9009       1000          0 RIS00001        300
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200

7 rows selected.

In Column "C" values can be 0, 1, 2. 

Now I want to select only those sets of records where column c is never 1 for the same combination of a, b, and d.

For e.g I want output like 

      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200

RIS00001 records should not appear because column c value 1 appeared once for 09009/1000.

How can I do this in the most efficient way?

Currently it is a very large table, joined to a few other tables in a query.

best regards
pjp  

Tom Kyte

Followup  

January 23, 2006 - 10:34 am UTC

for big "all rows" I would go analytics

ops$tkyte@ORA9IR2> select *
  2    from (
  3  select t.*,
  4         max( case when c = 1 then c end ) over (partition by a, b, d ) c_one
  5    from t
  6        )
  7   where c_one is null
  8  /

         A          B          C D                 E      C_ONE
---------- ---------- ---------- -------- ---------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200


for possibly getting "first row as fast as I can", I might opt for not exists or not in


ops$tkyte@ORA9IR2> select *
  2    from t
  3   where (a,b,d) not in (select a,b,d from t where c = 1 and a is not null and b is not null and d is
  4   not null )
  5  /

         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200

ops$tkyte@ORA9IR2> select *
  2    from t
  3   where not exists (select null from t t2
  4                      where t2.a = t.a and t2.b = t.b and t2.d = t.d and t2.c = 1 )
  5  /

         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
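Both of Tom's formulations - the analytic MAX(CASE...) filter and the NOT EXISTS anti-join - can be checked side by side; here is a sketch using Python's sqlite3 (SQLite 3.25+), with the thread's own sample rows.

```python
import sqlite3

# Drop every (a, b, d) group that contains a c = 1 row,
# two ways, and confirm both return the same rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t (a integer, b integer, c integer, d text, e integer);
insert into t values
  (9009, 1000, 1, 'RIS00001', 100),
  (9009, 1000, 0, 'RIS00001', 200),
  (9009, 5000, 2, 'BIC77777', 100),
  (9009, 5000, 2, 'BIC77777', 100),
  (9009, 6000, 0, 'DIG00077', 100),
  (9009, 6000, 0, 'DIG00077', 200);
""")
analytic = conn.execute("""
  select a, b, c, d, e
    from (select t.*,
                 max(case when c = 1 then c end)
                     over (partition by a, b, d) c_one
            from t)
   where c_one is null
   order by b, e
""").fetchall()
anti_join = conn.execute("""
  select a, b, c, d, e
    from t
   where not exists (select 1 from t t2
                      where t2.a = t.a and t2.b = t.b
                        and t2.d = t.d and t2.c = 1)
   order by b, e
""").fetchall()
# Both keep only the BIC77777 and DIG00077 groups
print(analytic == anti_join)
```

The analytic version reads the table once; the anti-join version can stop probing a group as soon as it finds a c = 1 row, which is why Tom suggests it for "first rows fast".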
 

Eliminating distinct rows

February 20, 2006 - 8:48 am UTC

Reviewer: Avishay from Petah-Tikva, ISRAEL

Hello Tom,

I have a view that UNION ALLs 3 tables;
here is the result of a select * on that view (f_emp_v):

PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 5/12/2005
111111 123456 1/23/2005 1/24/2005
111111 123459 1/25/2005
111111 60 75 5/12/2005 5/13/2005
111111 30 5/13/2005 1/1/2006
111111 85 55 5/13/2005

Using the following SQL and analytic functions, I filled in the NULLs and derived the IN_USE_FROM/IN_USE_UNTIL columns differently: IN_USE_UNTIL receives the next IN_USE_FROM date in ascending order.
Here is the SQL:

SELECT Person_Id,
Substr(MAX(Decode(Cost_Account_Id,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Id))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Id,
Substr(MAX(Decode(Cost_Account_Code,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Code))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Code,
Substr(MAX(Decode(Working_Hours,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Working_Hours))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Working_Hours,
Substr(MAX(Decode(Fte,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Fte,
Substr(MAX(Decode(Calc_Fte_Type,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Calc_Fte_Type))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Calc_Fte_Type,
In_Use_From,
Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER BY t.In_Use_From ASC) In_Use_Until
FROM Fact_Employee_List_v t
WHERE person_id = 111111
ORDER BY Person_Id,
In_Use_From;
Here are the results:

PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 1/23/2005
111111 123456 20 1/23/2005 1/25/2005
111111 123459 20 1/25/2005 5/12/2005
111111 123459 20 60 75 5/12/2005 5/13/2005
111111 123459 30 85 55 5/13/2005 5/13/2005
111111 123459 30 85 55 5/13/2005

The table "fills in" in accordance with the dates.
As you can see, the last 2 rows are identical except for IN_USE_UNTIL.
How can I get rid of the row where IN_USE_UNTIL is NOT NULL?
Is there a way to do it in the above select?
Maybe by changing the analytic function for IN_USE_UNTIL?

Your remarks will be appreciated
Best Regards,
Avishay

Update using LAG

February 21, 2006 - 2:16 pm UTC

Reviewer: Zahir M from Monroe NJ

SQL> desc tab1
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MEMBER_ID                                          NUMBER(10)
 START_DATE                                         DATE
 STOP_DATE                                          DATE

SQL> Select * from tab1 where member_id = 125;

 MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04
       125 10-MAY-04
       125 30-MAR-05

SQL> Select wh.* , lag(start_date) over ( partition by member_id order by start_date asc)  - 1 new_s
top_date
  2   from tab1 wh where member_id = 125
  3  
SQL> /

 MEMBER_ID START_DAT STOP_DATE NEW_STOP_
---------- --------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04           22-OCT-00
       125 10-MAY-04           04-MAY-04
       125 30-MAR-05           09-MAY-04

SQL> Update tab1 a
  2  set a.stop_date = ( SElect 
  3        lag(b.start_date) 
  4        over ( partition by b.member_id order by b.start_date asc)  - 1 new_stop_date 
  5               from tab1 b where a.member_id = b.member_id and a.rowid = b.rowid ) ;

4 rows updated.

SQL>  Select * from tab1 where member_id = 125;

 MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04
       125 10-MAY-04
       125 30-MAR-05


SQL>  select * from v$version
  2  /

BANNER
------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.7.4.1 - Production
PL/SQL Release 8.1.7.4.0 - Production
CORE    8.1.7.2.1       Production
TNS for 32-bit Windows: Version 8.1.7.4.0 - Production
NLSRTL Version 3.4.1.0.0 - Production

I am trying to use the LAG analytic function in an update statement, but it does not seem to work:
the column STOP_DATE is not updated with the new values (ie from the lag).

Please advise.

Tom Kyte

Followup  

February 22, 2006 - 8:16 am UTC

Oh, it is absolutely working!!! The problem is that the where clause

where a.member_id = b.member_id and a.rowid = b.rowid

is applied FIRST, and only THEN is the analytic performed - at that point there is only one row in the set, so the "previous" row isn't there anymore.



ops$tkyte@ORA10GR2> create table emp
  2  as
  3  select job, hiredate, to_date(null) last_hiredate
  4    from scott.emp;

Table created.

ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> merge into emp
  2  using ( select rowid rid, lag(hiredate) over (partition by job order by hiredate) last_hiredate
  3            from emp) e2
  4  on ( emp.rowid = e2.rid )
  5  when matched then update set last_hiredate = e2.last_hiredate
  6  -- when not matched then insert (job) values (NULL)
  7  /

14 rows merged.

ops$tkyte@ORA10GR2> select * from emp order by job, hiredate;

JOB       HIREDATE  LAST_HIRE
--------- --------- ---------
ANALYST   03-DEC-81
ANALYST   09-DEC-82 03-DEC-81
CLERK     17-DEC-80
CLERK     03-DEC-81 17-DEC-80
CLERK     23-JAN-82 03-DEC-81
CLERK     12-JAN-83 23-JAN-82
MANAGER   02-APR-81
MANAGER   01-MAY-81 02-APR-81
MANAGER   09-JUN-81 01-MAY-81
PRESIDENT 17-NOV-81
SALESMAN  20-FEB-81
SALESMAN  22-FEB-81 20-FEB-81
SALESMAN  08-SEP-81 22-FEB-81
SALESMAN  28-SEP-81 08-SEP-81

14 rows selected.


In 9i, you need the "insert" part - but it'll never happen.


(but really, this looks like a bad idea, you'll have to just keep doing this over and over and over)


In 8i, you'll likely want to "two step this", create global temporary table, insert and update the join. 
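The essence of the fix - compute LAG over the whole table first, then join the results back by rowid - works the same way outside Oracle; here is a sketch using Python's sqlite3 (SQLite 3.25+), substituting a correlated scalar subquery for MERGE, with a small made-up emp table standing in for scott.emp.

```python
import sqlite3

# Correct LAG-based update: the window runs over ALL rows inside the
# derived table; only afterwards is each result matched back (by rowid)
# to the row being updated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table emp (job text, hiredate text, last_hiredate text);
insert into emp values
  ('CLERK',   '1980-12-17', null),
  ('CLERK',   '1981-12-03', null),
  ('MANAGER', '1981-04-02', null),
  ('MANAGER', '1981-05-01', null);
""")
conn.execute("""
  update emp
     set last_hiredate = (select last_hiredate
                            from (select rowid rid,
                                         lag(hiredate) over
                                           (partition by job
                                            order by hiredate) last_hiredate
                                    from emp)
                           where rid = emp.rowid)
""")
rows = conn.execute(
    "select job, hiredate, last_hiredate from emp order by job, hiredate"
).fetchall()
print(rows)
```

Correlating only on rowid, after the window has been evaluated, is exactly what distinguishes this from the broken update earlier in the thread, where the correlation predicate filtered the set down to one row before LAG ran.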

LEAD UPDATE

February 22, 2006 - 10:45 am UTC

Reviewer: Zahir M from Monroe NJ

Tom ,

I did the "two-step process" in 8i as you suggested
(except that I created another table - not a global temporary table). But it takes a long time to perform the update.



130874 rows updated.

Elapsed: 07:469:28146.33

Statistics
----------------------------------------------------------
0 recursive calls
4455554 db block gets
105093306 consistent gets
14702345 physical reads
30683572 redo size
852 bytes sent via SQL*Net to client
670 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
130874 rows processed


Tom Kyte

Followup  

February 22, 2006 - 11:05 am UTC

no example :(


should only take seconds for so few rows.  



ops$xp8i\tkyte@ORA8IR3W> alter table big_table add last_created date;

Table altered.

Elapsed: 00:00:00.63
ops$xp8i\tkyte@ORA8IR3W> create table t
  2  ( rid rowid primary key, last_created date );

Table created.

Elapsed: 00:00:00.78
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> insert into t
  2  select rowid, lag(created) over (partition by object_type order by created)
  3    from big_table;

130874 rows created.

Elapsed: 00:00:36.03
ops$xp8i\tkyte@ORA8IR3W> exec dbms_stats.gather_table_stats( user, 'T' );

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.38
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> update (select a.last_created new_last_created,
  2                 b.last_created old_last_created
  3                    from t a, big_table b
  4                   where a.rid = b.rowid )
  5     set old_last_created = new_last_created;

130874 rows updated.

Elapsed: 00:00:32.41



Maybe you are getting blocked by other users. Since you are updating every row, lock the table first - then update it. 

LAG Update

February 22, 2006 - 11:32 am UTC

Reviewer: Zahir M from Monroe NJ

Thanks , Tom.

I re-ran the update after locking the table.
It took only 47 seconds for the update operation.

I guess it was locked by some other users / processes.

Thanks again !

Reposting

February 23, 2006 - 4:28 am UTC

Reviewer: Avishay from Petah-Tikva, Israel

Hello Tom,

I have a view that UNION ALLs 3 tables;
here is the result of a select * on that view (f_emp_v):

PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 5/12/2005
111111 123456 1/23/2005 1/24/2005
111111 123459 1/25/2005
111111 60 75 5/12/2005 5/13/2005
111111 30 5/13/2005 1/1/2006
111111 85 55 5/13/2005

Using the following SQL and analytic functions, I filled in the NULLs and
derived the IN_USE_FROM/IN_USE_UNTIL columns differently: IN_USE_UNTIL
receives the next IN_USE_FROM date in ascending order.
Here is the SQL:

SELECT Person_Id,
Substr(MAX(Decode(Cost_Account_Id,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Id))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Id,
Substr(MAX(Decode(Cost_Account_Code,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Cost_Account_Code))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Cost_Account_Code,
Substr(MAX(Decode(Working_Hours,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Working_Hours))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Working_Hours,
Substr(MAX(Decode(Fte,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Fte,
Substr(MAX(Decode(Calc_Fte_Type,
NULL,
NULL,
To_Char(In_Use_From, 'yyyymmddhh24miss') ||
Calc_Fte_Type))
Over(PARTITION BY Person_Id ORDER BY In_Use_From),
15) Calc_Fte_Type,
In_Use_From,
Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER
BY t.In_Use_From ASC) In_Use_Until
FROM Fact_Employee_List_v t
WHERE person_id = 111111
ORDER BY Person_Id,
In_Use_From;
Here are the results:

PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111 20 1/1/2005 1/23/2005
111111 123456 20 1/23/2005 1/25/2005
111111 123459 20 1/25/2005 5/12/2005
111111 123459 20 60 75 5/12/2005 5/13/2005
111111 123459 30 85 55 5/13/2005 5/13/2005
111111 123459 30 85 55 5/13/2005

The table "fills in" in accordance with the dates.
As you can see, the last 2 rows are identical except for IN_USE_UNTIL.
How can I get rid of the row where IN_USE_UNTIL is NOT NULL?
Is there a way to do it in the above select?
Maybe by changing the analytic function for IN_USE_UNTIL?

Your remarks will be appreciated
Best Regards,
Avishay




Tom Kyte

Followup  

February 23, 2006 - 8:07 am UTC

why did you repost it.

You must have seen the page you used to post this. did you *READ* that page? I ignore all things that look like this - you have the classic example of what I ignore.

slow down, read the page you used to post this repost.

Analytics

February 23, 2006 - 10:42 am UTC

Reviewer: Mark from Boston, MA

Hi Tom,

Oracle 9i latest and greatest...

At a loss for this one. Don't know where to start...

I have a table:
Create Table MY_NUMS
(N1 NUMBER, N2 NUMBER, N3 NUMBER, N4 NUMBER, N5 NUMBER, N6 NUMBER)
/

I populate each column with random values between 1-46:

INSERT INTO my_nums
SELECT TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
FROM user_objects;
/

and I get data similar to this:
HT4:DEVDB001101007:10:10 - DEV> select * from my_nums where rownum <= 10
2 /

N1 N2 N3 N4 N5 N6
---------- ---------- ---------- ---------- ---------- ----------
6 13 11 6 21 36
33 23 45 11 24 32
36 19 43 19 8 44
11 39 9 14 35 25
42 8 29 15 26 4
1 25 12 41 21 10
20 6 43 29 39 28
16 18 36 15 38 26
16 33 15 16 40 18
17 1 20 39 20 46

10 rows selected.

And my question is how do I get counts of how many times each number appears with every other number?

Output might look like this:

MY_NUMBER OTHER_NUM COUNT(*)
--------- --------- --------
16 15 3
...

This says that the number 16 appeared 3 times with the number 15 in the same row.

I'd have to check each column (N1 - N6) against each of the other 5 columns for each value and sum them up...

Regards,
Mark



Tom Kyte

Followup  

February 23, 2006 - 10:47 am UTC

are my_number and other_num INPUTS into your query or what? where did 16 and 15 come from.

Analytics

February 23, 2006 - 12:28 pm UTC

Reviewer: Mark from Boston, MA

Oh, ok.

MY_NUMBER is the number for which I am counting combinations with every OTHER_NUMBER.

Output would ideally look like this:

MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
1 2 5
1 3 6
1 4 2
...
2 3 4
2 4 7
2 5 8
...

etc., all the way down to
MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
45 46 7

There are no INPUTS to the query as it calculates counts for all combinations.
This query somewhat does it, but I feel there is a way better way than doing it with decodes:

SELECT n1
,SUM(n2_2 + n3_2 + n4_2 + n5_2 + n6_2) two
,SUM(n2_3 + n3_3 + n4_3 + n5_3 + n6_3) three
,SUM(n2_4 + n3_4 + n4_4 + n5_4 + n6_4) four
,SUM(n2_5 + n3_5 + n4_5 + n5_5 + n6_5) five
,SUM(n2_46 + n3_46 + n4_46 + n5_46 + n6_46) fortysix
FROM (SELECT n1
,DECODE(n2, 2, 1, 0) n2_2
,DECODE(n2, 3, 1, 0) n2_3
,DECODE(n2, 4, 1, 0) n2_4
,DECODE(n2, 5, 1, 0) n2_5
/* n2_6 through n2_45 here ... */
,DECODE(n2, 46, 1, 0) n2_46
,DECODE(n3, 2, 1, 0) n3_2
,DECODE(n3, 3, 1, 0) n3_3
,DECODE(n3, 4, 1, 0) n3_4
,DECODE(n3, 5, 1, 0) n3_5
,DECODE(n3, 46, 1, 0) n3_46
,DECODE(n4, 2, 1, 0) n4_2
,DECODE(n4, 3, 1, 0) n4_3
,DECODE(n4, 4, 1, 0) n4_4
,DECODE(n4, 5, 1, 0) n4_5
,DECODE(n4, 46, 1, 0) n4_46
,DECODE(n5, 2, 1, 0) n5_2
,DECODE(n5, 3, 1, 0) n5_3
,DECODE(n5, 4, 1, 0) n5_4
,DECODE(n5, 5, 1, 0) n5_5
,DECODE(n5, 46, 1, 0) n5_46
,DECODE(n6, 2, 1, 0) n6_2
,DECODE(n6, 3, 1, 0) n6_3
,DECODE(n6, 4, 1, 0) n6_4
,DECODE(n6, 5, 1, 0) n6_5
,DECODE(n6, 46, 1, 0) n6_46
FROM my_nums)
GROUP BY n1
/
N1 TWO THREE FOUR FIVE FORTYSIX
---------- ---------- ---------- ---------- ---------- ----------
1 27 10 14 19 16
2 10 12 12 22 21
3 13 15 13 12 16
4 25 15 24 13 23
5 19 15 16 14 19
6 16 15 13 8 30
7 18 13 13 19 15
8 14 14 12 18 14
9 16 16 18 16 22
10 14 12 16 14 19
...

for the entire matrix of number.

This output says that the number 2 in columns N2,N3,N4,N5,N6 appeared 27 times with the number 1. The number 3 in columns N2 - N6 appeared 10 times with the number 1, etc.

A sort of "how many times does this number appear with all these other numbers in the same row" query...

Thanks.

Tom Kyte

Followup  

February 23, 2006 - 7:01 pm UTC

I was beaten to the punch on this one ;) see below

A solution

February 23, 2006 - 3:10 pm UTC

Reviewer: Michel Cadot from France

Hi Mark,

Here's a solution to your issue.
I changed the values so I could post the whole example, but the query does not care what's inside the table.
I left in all the steps, but I think you can compact the query.

SQL> INSERT INTO my_nums
  2     SELECT TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  3           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  4           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  5           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  6           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  7           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  8     FROM user_views
  9     where rownum <= 3
 10  /

3 rows created.

SQL> commit;

Commit complete.

SQL> select * from my_nums;
        N1         N2         N3         N4         N5         N6
---------- ---------- ---------- ---------- ---------- ----------
         7          4          6          8          2          1
         1          2          8          8          7          2
         6          3          5          1          4          8

3 rows selected.

SQL> with 
  2    a as ( select row_number () over (order by n1) rn,
  3                  n1, n2, n3, n4, n5, n6
  4           from my_nums
  5         ),
  6    b as ( select rn, n1, n2, n3, n4, n5, n6, part
  7           from a, 
  8                (select rownum part from dual connect by level <= 15)
  9         ),
 10    c as ( select distinct rn, 
 11                           case part
 12                           when  1 then least(n1,n2)
 13                           when  2 then least(n1,n3)
 14                           when  3 then least(n1,n4)
 15                           when  4 then least(n1,n5)
 16                           when  5 then least(n1,n6)
 17                           when  6 then least(n2,n3)
 18                           when  7 then least(n2,n4)
 19                           when  8 then least(n2,n5)
 20                           when  9 then least(n2,n6)
 21                           when 10 then least(n3,n4)
 22                           when 11 then least(n3,n5)
 23                           when 12 then least(n3,n6)
 24                           when 13 then least(n4,n5)
 25                           when 14 then least(n4,n6)
 26                           when 15 then least(n5,n6)
 27                           end v1,
 28                           case part
 29                           when  1 then greatest(n1,n2)
 30                           when  2 then greatest(n1,n3)
 31                           when  3 then greatest(n1,n4)
 32                           when  4 then greatest(n1,n5)
 33                           when  5 then greatest(n1,n6)
 34                           when  6 then greatest(n2,n3)
 35                           when  7 then greatest(n2,n4)
 36                           when  8 then greatest(n2,n5)
 37                           when  9 then greatest(n2,n6)
 38                           when 10 then greatest(n3,n4)
 39                           when 11 then greatest(n3,n5)
 40                           when 12 then greatest(n3,n6)
 41                           when 13 then greatest(n4,n5)
 42                           when 14 then greatest(n4,n6)
 43                           when 15 then greatest(n5,n6)
 44                           end v2
 45           from b
 46         )
 47  select v1, v2, count(*) nb
 48  from c
 49  where v1 != v2
 50  group by v1, v2
 51  order by v1, v2
 52  /
        V1         V2         NB
---------- ---------- ----------
         1          2          2
         1          3          1
         1          4          2
         1          5          1
         1          6          2
         1          7          2
         1          8          3
         2          4          1
         2          6          1
         2          7          2
         2          8          2
         3          4          1
         3          5          1
         3          6          1
         3          8          1
         4          5          1
         4          6          2
         4          7          1
         4          8          2
         5          6          1
         5          8          1
         6          7          1
         6          8          2
         7          8          2

24 rows selected.

Regards
Michel 

Excellent

February 23, 2006 - 3:35 pm UTC

Reviewer: Mark from Boston, MA

Thanks

To Mark ... COUNT of what?

February 23, 2006 - 4:07 pm UTC

Reviewer: A reader

<quote>And my question is how do I get counts of how many times each number appears with every other number?</quote>

flip@FLOP> select * from my_nums;

N1 N2 N3 N4 N5 N6
---------- ---------- ---------- ---------- ---------- ----------
1 2 2 2 2 2

flip@FLOP> @michel_qry

V1 V2 NB
---------- ---------- ----------
1 2 1

Should the answer here be:
A. "1" [as in the number of rows where 1 is together with 2]
or
B. "5"
?

To A reader

February 23, 2006 - 4:46 pm UTC

Reviewer: Michel Cadot from France

The "distinct" in c definition is there to count only 1 per each row.
If you want to count all occurrences then remove "distinct".

Regards
Michel
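
A minimal sketch of the "count all occurrences" variant Michel mentions (no DISTINCT), using the my_nums table from above. For brevity only the five pairs involving n1 are written out here as a UNION ALL; the remaining ten column pairs would follow the same pattern:

```sql
with pairs as (
  -- one row per (value, value) combination, smallest first
  select least(n1, n2) v1, greatest(n1, n2) v2 from my_nums union all
  select least(n1, n3), greatest(n1, n3) from my_nums union all
  select least(n1, n4), greatest(n1, n4) from my_nums union all
  select least(n1, n5), greatest(n1, n5) from my_nums union all
  select least(n1, n6), greatest(n1, n6) from my_nums
  -- ... the other 10 column pairs elided
)
select v1, v2, count(*) nb
from   pairs
where  v1 != v2
group  by v1, v2
order  by v1, v2;
```

With no DISTINCT step, a row like "1 2 2 2 2 2" contributes to the (1,2) count once per qualifying column pair rather than once per row.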


Analytics

February 23, 2006 - 5:30 pm UTC

Reviewer: Mark from Boston, MA

The answer to that should be "5" as 1 and 2 appear as a combination 5 times.

Regards,
Mark

Difficulty with min(id) over partition by

February 27, 2006 - 11:38 pm UTC

Reviewer: Ken from Los Angeles, CA

Hi Tom:

We have a table with over 4 million records. A bug in the java application has created duplicate records in this table. I am brought in to dedupe the records based on certain criteria. For every duplicate record set, we need to keep the record with the smallest value in the ID field and delete the rest.

I am close to identifying the IDs to be deleted but the query is not identifying the records with min(id) consistently. Here is the table structure (created for this test case -- no index or anything) and query.

Thanks in advance for your help.




ken@DEV9206> desc DEDUPE_TEST_01
Name Null? Type
----------------------------------- -------- --------
EQUIP_ID NOT NULL NUMBER
EQUIP_TYPE_ID NOT NULL NUMBER
TYPE_ID NOT NULL NUMBER
CREATED NOT NULL DATE
COL_1 NUMBER
COL_0 NUMBER
ID NOT NULL NUMBER
CT NUMBER

ken@DEV9206> l
1 SELECT equip_id, equip_type_id, type_id, TO_CHAR(CREATED, 'DDMMYYYY HH24:MI:SS') created,
2 col_1, col_0, MIN(id) OVER (PARTITION BY equip_id
3 ORDER BY equip_id, equip_type_id, type_id, TO_CHAR(created, 'DDMMYYYY HH24:MI:SS'),
4 col_1, col_0) min_id,
5 id
6 from dedupe_test_01
7* where rownum < 31
ken@DEV9206> /

EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 MIN_ID ID
-------- ------------- ------- ----------------- ----- ----- --------- --------
3011 29 221 30110002 00:00:00 3 77 26445635 26445635
3011 29 221 30110002 00:00:00 3 77 26445651
3011 29 221 30110002 00:00:00 3 86 26445626 26445626
3011 29 221 30110002 00:00:00 3 86 26445653
3011 29 221 30110002 00:00:00 3 112 26445617 26445617
3011 29 221 30110002 00:00:00 3 112 26445620
3011 29 221 30110002 00:00:00 3 125 26445631*
3011 29 221 30110002 00:00:00 3 125 26445641
3551 29 221 11032005 17:01:00 1 186 3209093 3209093
3551 29 221 11032005 17:01:00 1 186 6072228
3551 29 221 11032005 17:01:00 1 186 8894681
3551 29 221 11032005 17:01:00 1 186 3837758
3551 29 221 11032005 17:01:00 1 186 3837738
3551 29 221 11032005 20:44:00 1 190 3209092 3209092
3551 29 221 11032005 20:44:00 1 190 3837757
3551 29 221 11032005 20:44:00 1 190 6072227
3551 29 221 11032005 20:44:00 1 190 8894680
3551 29 221 11032005 20:44:00 1 190 3837737
3551 29 221 11032005 23:00:00 1 227 3209091 3209091
3551 29 221 11032005 23:00:00 1 227 3837736
3551 29 221 30110002 00:00:00 3 112 3209094*
3551 29 221 30110002 00:00:00 3 112 3837739
3551 29 221 30110002 00:00:00 3 112 3837759
3551 29 221 30110002 00:00:00 3 112 6072229
3551 29 221 30110002 00:00:00 3 112 8894682
3551 29 221 30110002 00:00:00 3 118 3209095*
3551 29 221 30110002 00:00:00 3 118 3837740
3551 29 221 30110002 00:00:00 3 118 3837760
3551 29 221 30110002 00:00:00 3 118 6072230
3551 29 221 30110002 00:00:00 3 118 8894683

30 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 WINDOW (SORT)
2 1 COUNT (STOPKEY)
3 2 TABLE ACCESS (FULL) OF 'DEDUPE_TEST_01'


Tom Kyte

Followup  

February 28, 2006 - 7:14 am UTC

... A bug in the java application has
created duplicate records in this table. ...

You really meant to say "a horribly flawed implementation whereby the java coders decided to do all DATA LOGIC in the wrongest place in the world - the application has exposed itself. We know there are dozens more of these lying in wait for us"

That is identifying the min id very very very very consistently.

but why are you sorting? If you want the min(id) by equip_id, you should leave out the order by - else you get the minimum ID for the current row and every row in front of it - not over all rows by equip_id.

Don't know why you are using:

TO_CHAR(created, 'DDMMYYYY HH24:MI:SS')

why would you take something incredibly sortable (a DATE) and turn it into an ascii string that doesn't even sort correctly? (it will not sort by date at all, it'll sort as a string, wrong)


But - I don't believe you want the order by - but I cannot really say because you do not tell us what the primary key that should have been in place is?

Believe me - data and business logic belongs in the DB

February 28, 2006 - 8:32 am UTC

Reviewer: Ken

I have been telling the Java folks the same. And I have been preaching about bind variables as well. The database was designed by someone quite famous in the Java tech community a few years back. The data abstraction has created so many issues, and only the application knows which column contains what data, based on a value in another field. But they are listening. They have updated the code to use preparedStatements, and more changes are coming. A tiny victory, thanks in large part to your books and the forum. (The blog too, I must add. But I enjoy the non-tech pieces more since they show the other side we hardly see here. Keep it up!)

The only reason for using TO_CHAR was to expose the time values, but that can be done later. I will remove it and the sort, and run it again.

It is unbelievable how this app works. Pulling all the data and sorting in java, yikes!

Thank you so much, Tom.

Tom Kyte

Followup  

February 28, 2006 - 9:05 am UTC

but dates sort by

century, year, month, day, hour, minute, second

quite naturally!!!!!

By "exposing" the time in this example, you mucked up the sort order!!!

You put DD first - it would put the first of ANY MONTH before the second of ANY MONTH - messing it up.

it should just order by DATE - period. Never order by to_char(date)....
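
A quick sketch of the trap Tom describes, using two hypothetical dates:

```sql
-- DD-first strings sort by day-of-month first, not chronologically
select d
from ( select date '2002-11-30' d from dual
       union all
       select date '2005-03-11' from dual )
order by to_char(d, 'DDMMYYYY');
-- '11032005' < '30112002' as strings, so 11-Mar-2005 sorts ahead
-- of 30-Nov-2002; ORDER BY d itself gives the chronological order
```

Formatting with TO_CHAR belongs in the SELECT list for display; the ORDER BY (and any window ORDER BY) should use the raw DATE column.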

Analytics rock Analytics roll

February 28, 2006 - 2:11 pm UTC

Reviewer: Alf from NYC

Hello Tom,

I have four tables proc_event, visit, patient, and event

I need to get a list of patients with all records in the proc_event table for proc_id = 123 and 3456.

Patient proc_id
Patient A 123
3456
Patient B 123
3456

The query below lists all records from the patient table that have either of the proc_id values in the proc_event table.

Would you please direct me on how, or whether, I might be able to rewrite this in order to get the desired listing?

Any information would be greatly appreciated. Thanks.

SELECT p.P_NAME,
p.MRN,
pe.proc_id,
e.date_time
FROM ud.patient p,
ud.proc_event pe,
ud.event e,
ud.visit v
WHERE ( (p.patient_id = v.patient_id)
AND (v.visit_id = e.visit_id)
AND (e.event_id(+) = pe.event_id)
AND (pe.proc_id in('123','3456'))
AND (e.date_time BETWEEN
to_date('01-jan-2005 00:00','dd-mon-yyyy hh24:mi') AND
to_date('20-jan-2005 23:59','dd-mon-yyyy hh24:mi')
)
)
GROUP BY p.P_NAME,
p.mrn,
pe.proc_id,
e.date_time
ORDER BY p.P_NAME ASC,
p.mrn ASC,
pe.proc_id ASC


Tom Kyte

Followup  

March 01, 2006 - 7:57 am UTC

...
I have four tables proc_event, visit, patient, and event

I need to get a list of patients with all records in the proc_event table for
proc_id = 123 and 3456.
...


Well given that

a) we don't know how these relate
b) what columns might be available
c) pretty much don't know anything

It is sort of difficult.

It sounds like you simply want to

a) join patients to proc_event (but we don't even know if you CAN!!!!!!!)
b) use an IN

Difficulty with min(id) over partition by -- part deux

March 01, 2006 - 9:29 am UTC

Reviewer: Ken

Hi Tom:

I am still not getting this to group correctly. The date is a plain DATE now, and the ORDER BY is removed. Could you please let me know what could be causing this? Thanks.

ken@DEV9206> break on min_id skip 1
ken@DEV9206> l
1 SELECT equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id
5 ORDER BY equip_type_id,
6 type_id,
7 created,
8 col_1,
9 col_0) min_id
10 FROM dedupe_test_01
11* WHERE ROWNUM < 51
ken@DEV9206> /

EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
---------- ------------- ---------- --------- ---------- ---------- ---------- ----------
3011 29 221 30-NOV-02 3 77 26445635 26445635
3011 29 221 30-NOV-02 3 77 26445651

3011 29 221 30-NOV-02 3 86 26445626 26445626
3011 29 221 30-NOV-02 3 86 26445653

3011 29 221 30-NOV-02 3 112 26445617 26445617
3011 29 221 30-NOV-02 3 112 26445620
3011 29 221 30-NOV-02 3 125 26445631
3011 29 221 30-NOV-02 3 125 26445641

3551 29 221 30-NOV-02 3 112 3209094 3209094
3551 29 221 30-NOV-02 3 112 3837739
3551 29 221 30-NOV-02 3 112 3837759
3551 29 221 30-NOV-02 3 112 6072229
3551 29 221 30-NOV-02 3 112 8894682
3551 29 221 30-NOV-02 3 118 3209095***
3551 29 221 30-NOV-02 3 118 3837740
3551 29 221 30-NOV-02 3 118 3837760
3551 29 221 30-NOV-02 3 118 6072230
3551 29 221 30-NOV-02 3 118 8894683

3551 29 221 11-MAR-05 1 186 3209093 3209093
3551 29 221 11-MAR-05 1 186 3837738
3551 29 221 11-MAR-05 1 186 3837758
3551 29 221 11-MAR-05 1 186 6072228
3551 29 221 11-MAR-05 1 186 8894681

3551 29 221 11-MAR-05 1 190 3209092 3209092
3551 29 221 11-MAR-05 1 190 3837737
3551 29 221 11-MAR-05 1 190 3837757
3551 29 221 11-MAR-05 1 190 6072227
3551 29 221 11-MAR-05 1 190 8894680

3551 29 221 11-MAR-05 1 227 3209091 3209091
3551 29 221 11-MAR-05 1 227 3837736
3551 29 221 11-MAR-05 1 227 3837756
3551 29 221 11-MAR-05 1 227 6072226
3551 29 221 11-MAR-05 1 227 8894679

3551 29 221 12-MAR-05 1 153 3209090 3209090
3551 29 221 12-MAR-05 1 153 3837735
3551 29 221 12-MAR-05 1 153 3837755
3551 29 221 12-MAR-05 1 153 6072225
3551 29 221 12-MAR-05 1 153 8894678
3551 29 221 16-MAR-05 1 109 3837734***
3551 29 221 16-MAR-05 1 109 3837754
3551 29 221 16-MAR-05 1 109 6072224
3551 29 221 16-MAR-05 1 109 8894677
3551 29 221 16-MAR-05 1 128 3837733***
3551 29 221 16-MAR-05 1 128 3837753
3551 29 221 16-MAR-05 1 128 6072223
3551 29 221 16-MAR-05 1 128 8894676
3551 29 221 17-MAR-05 1 181 3837732
3551 29 221 17-MAR-05 1 181 3837752
3551 29 221 17-MAR-05 1 181 6072222
3551 29 221 17-MAR-05 1 181 8894675


50 rows selected.

Tom Kyte

Followup  

March 01, 2006 - 9:52 am UTC

I said you don't want the order by at all.

If you are trying to associate the MIN(ID) with every record in a group - you just partition - you do NOT order by (else you get the min id of every record from the current one on up)

I don't know what you are trying to retrieve really since all I have is "non functioning sql" to work with - I don't know what your logic is.


The query is doing precisely what you asked - break the data up by X, sort by A, B, C, and assign min(id) from the current row on up within the group.

I think "no order by" is called for.

Difficulty with min(id) over partition by -- part deux - take 2

March 01, 2006 - 9:49 am UTC

Reviewer: Ken

Please disregard my previous posting. It did not come out right.

I still cannot get this to group correctly. TO_CHAR was removed and the ORDER BY was removed. That gave me the following.

1 SELECT rownum, equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id) min_id
5 FROM dedupe_test_01
6* WHERE ROWNUM < 51
ken@DEV9206/

ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
1 3011 29 221 30-NOV-02 3 77 26445635 26445617
2 3011 29 221 30-NOV-02 3 77 26445651
3 3011 29 221 30-NOV-02 3 86 26445626
4 3011 29 221 30-NOV-02 3 86 26445653
5 3011 29 221 30-NOV-02 3 112 26445617
6 3011 29 221 30-NOV-02 3 112 26445620
7 3011 29 221 30-NOV-02 3 125 26445631
8 3011 29 221 30-NOV-02 3 125 26445641

9 3551 29 221 30-NOV-02 3 112 3209094 3209090
10 3551 29 221 30-NOV-02 3 112 3837739
11 3551 29 221 30-NOV-02 3 112 3837759
12 3551 29 221 30-NOV-02 3 112 6072229
13 3551 29 221 30-NOV-02 3 112 8894682
14 3551 29 221 30-NOV-02 3 118 3209095
15 3551 29 221 30-NOV-02 3 118 3837740
16 3551 29 221 30-NOV-02 3 118 3837760
17 3551 29 221 30-NOV-02 3 118 6072230
18 3551 29 221 30-NOV-02 3 118 8894683
19 3551 29 221 11-MAR-05 1 186 3209093
20 3551 29 221 11-MAR-05 1 186 3837738
21 3551 29 221 11-MAR-05 1 186 3837758
22 3551 29 221 11-MAR-05 1 186 6072228
23 3551 29 221 11-MAR-05 1 186 8894681
24 3551 29 221 11-MAR-05 1 190 3209092
25 3551 29 221 11-MAR-05 1 190 3837737
26 3551 29 221 11-MAR-05 1 190 3837757
27 3551 29 221 11-MAR-05 1 190 6072227
28 3551 29 221 11-MAR-05 1 190 8894680
29 3551 29 221 11-MAR-05 1 227 3209091
30 3551 29 221 11-MAR-05 1 227 3837736
31 3551 29 221 11-MAR-05 1 227 3837756
32 3551 29 221 11-MAR-05 1 227 6072226
33 3551 29 221 11-MAR-05 1 227 8894679
34 3551 29 221 12-MAR-05 1 153 3209090
35 3551 29 221 12-MAR-05 1 153 3837735
36 3551 29 221 12-MAR-05 1 153 3837755
37 3551 29 221 12-MAR-05 1 153 6072225
38 3551 29 221 12-MAR-05 1 153 8894678
39 3551 29 221 16-MAR-05 1 109 3837734
40 3551 29 221 16-MAR-05 1 109 3837754
41 3551 29 221 16-MAR-05 1 109 6072224
42 3551 29 221 16-MAR-05 1 109 8894677
43 3551 29 221 16-MAR-05 1 128 3837733
44 3551 29 221 16-MAR-05 1 128 3837753
45 3551 29 221 16-MAR-05 1 128 6072223
46 3551 29 221 16-MAR-05 1 128 8894676
47 3551 29 221 17-MAR-05 1 181 3837732
48 3551 29 221 17-MAR-05 1 181 3837752
49 3551 29 221 17-MAR-05 1 181 6072222
50 3551 29 221 17-MAR-05 1 181 8894675


50 rows selected.


I added the ORDER BY back into the analytic clause. Here's the result:

ken@DEV9206get q_1
1 SELECT rownum, equip_id, equip_type_id, type_id, created,
2 col_1, col_0,
3 ID,
4 MIN(id) OVER (PARTITION BY equip_id
5 ORDER BY equip_type_id,
6 type_id,
7 created,
8 col_1,
9 col_0) min_id
10 FROM dedupe_test_01
11* WHERE ROWNUM < 51
ken@DEV9206/

ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_1 COL_0 ID MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
1 3011 29 221 30-NOV-02 3 77 26445635 26445635
2 3011 29 221 30-NOV-02 3 77 26445651

3 3011 29 221 30-NOV-02 3 86 26445626 26445626
4 3011 29 221 30-NOV-02 3 86 26445653

5 3011 29 221 30-NOV-02 3 112 26445617 26445617
6 3011 29 221 30-NOV-02 3 112 26445620
7 3011 29 221 30-NOV-02 3 125 26445631
8 3011 29 221 30-NOV-02 3 125 26445641

9 3551 29 221 30-NOV-02 3 112 3209094 3209094
10 3551 29 221 30-NOV-02 3 112 3837739
11 3551 29 221 30-NOV-02 3 112 3837759
12 3551 29 221 30-NOV-02 3 112 6072229
13 3551 29 221 30-NOV-02 3 112 8894682
14 3551 29 221 30-NOV-02 3 118 3209095
15 3551 29 221 30-NOV-02 3 118 3837740
16 3551 29 221 30-NOV-02 3 118 3837760
17 3551 29 221 30-NOV-02 3 118 6072230
18 3551 29 221 30-NOV-02 3 118 8894683

19 3551 29 221 11-MAR-05 1 186 3209093 3209093
20 3551 29 221 11-MAR-05 1 186 3837738
21 3551 29 221 11-MAR-05 1 186 3837758
22 3551 29 221 11-MAR-05 1 186 6072228
23 3551 29 221 11-MAR-05 1 186 8894681

24 3551 29 221 11-MAR-05 1 190 3209092 3209092
25 3551 29 221 11-MAR-05 1 190 3837737
26 3551 29 221 11-MAR-05 1 190 3837757
27 3551 29 221 11-MAR-05 1 190 6072227
28 3551 29 221 11-MAR-05 1 190 8894680

29 3551 29 221 11-MAR-05 1 227 3209091 3209091
30 3551 29 221 11-MAR-05 1 227 3837736
31 3551 29 221 11-MAR-05 1 227 3837756
32 3551 29 221 11-MAR-05 1 227 6072226
33 3551 29 221 11-MAR-05 1 227 8894679

34 3551 29 221 12-MAR-05 1 153 3209090 3209090
35 3551 29 221 12-MAR-05 1 153 3837735
36 3551 29 221 12-MAR-05 1 153 3837755
37 3551 29 221 12-MAR-05 1 153 6072225
38 3551 29 221 12-MAR-05 1 153 8894678
39 3551 29 221 16-MAR-05 1 109 3837734
40 3551 29 221 16-MAR-05 1 109 3837754
41 3551 29 221 16-MAR-05 1 109 6072224
42 3551 29 221 16-MAR-05 1 109 8894677
43 3551 29 221 16-MAR-05 1 128 3837733
44 3551 29 221 16-MAR-05 1 128 3837753
45 3551 29 221 16-MAR-05 1 128 6072223
46 3551 29 221 16-MAR-05 1 128 8894676
47 3551 29 221 17-MAR-05 1 181 3837732
48 3551 29 221 17-MAR-05 1 181 3837752
49 3551 29 221 17-MAR-05 1 181 6072222
50 3551 29 221 17-MAR-05 1 181 8894675


50 rows selected.

Row numbers 14, 39, 43, and 47 should have started new groups. What am I doing wrong that it is not seeing these as new groups?

Thanks much for your help.




Tom Kyte

Followup  

March 01, 2006 - 10:32 am UTC

you need to just state in english what you are trying to do rather than posting SQL that does not achieve it.

Tell us your LOGIC.

Difficulty with min(id) over partition by -- part deux - take 3

March 01, 2006 - 10:38 am UTC

Reviewer: Ken

Thanks, Tom, fair enough.

We would like to select duplicate rows from a table based on the values in certain fields (not all fields will have duplicate data, since some are unique to each row), identify the row within each set with the lowest value in one column -- min(id) -- and have that value shown in a separate column.

Please let me know how we can achieve that.

Thanks again.

Tom Kyte

Followup  

March 01, 2006 - 10:42 am UTC

tell me what values perhaps.


In general, you will partition by the unique key
You will get the MIN(ID) by that key

No order by
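
Under the assumption that the duplicate-defining key is every column except ID, Tom's advice might be sketched like this:

```sql
-- tag each row with the smallest ID among its exact duplicates:
-- partition by all the columns that define a duplicate, no ORDER BY
select t.*,
       min(id) over (partition by equip_id, equip_type_id, type_id,
                     created, col_0, col_1) min_id
from   dedupe_test_01 t;

-- the corresponding dedupe delete keeps only the min(id) row per group
delete from dedupe_test_01
where id not in ( select min(id)
                  from dedupe_test_01
                  group by equip_id, equip_type_id, type_id,
                           created, col_0, col_1 );
```

Which columns actually belong in the partition list depends on what the missing primary key should have been, which is exactly the point Tom keeps pressing.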

Difficulty with min(id) over partition by -- part deux - take 4

March 01, 2006 - 11:54 am UTC

Reviewer: Ken

Thanks, Tom, but I think we are not on the same page yet.
Please let me try again re: logic.

We have the following table:
ken@DEV9206desc dedupe_test_01
Name Null? Type
----------------------------- -------- ------------------
ID NOT NULL NUMBER
EQUIP_ID NOT NULL NUMBER
EQUIP_TYPE_ID NOT NULL NUMBER
TYPE_ID NOT NULL NUMBER
CREATED NOT NULL DATE
COL_0 NUMBER
COL_1 NUMBER

ken@DEV9206l
1 select * from dedupe_test_01
2 where equip_id in (3011, 3551)
3 and rownum < 51
4* order by 1,2,3,4,5,6
ken@DEV9206/

EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_0 COL_1 ID
-------- ------------- ------- --------- ----- ----- --------
3011 29 221 30-NOV-02 77 3 26445635
3011 29 221 30-NOV-02 77 3 26445651
3011 29 221 30-NOV-02 86 3 26445626
3011 29 221 30-NOV-02 86 3 26445653
3011 29 221 30-NOV-02 112 3 26445617
3011 29 221 30-NOV-02 112 3 26445620
3011 29 221 30-NOV-02 125 3 26445631
3011 29 221 30-NOV-02 125 3 26445641
3551 29 221 30-NOV-02 112 3 3209094
3551 29 221 30-NOV-02 112 3 3837739
3551 29 221 30-NOV-02 112 3 3837759
3551 29 221 30-NOV-02 112 3 6072229
3551 29 221 30-NOV-02 112 3 8894682
3551 29 221 30-NOV-02 118 3 3209095
3551 29 221 30-NOV-02 118 3 3837740
3551 29 221 30-NOV-02 118 3 3837760
3551 29 221 30-NOV-02 118 3 6072230
3551 29 221 30-NOV-02 118 3 8894683
3551 29 221 11-MAR-05 186 1 3209093
3551 29 221 11-MAR-05 186 1 3837738
3551 29 221 11-MAR-05 186 1 3837758
3551 29 221 11-MAR-05 186 1 6072228
3551 29 221 11-MAR-05 186 1 8894681
3551 29 221 11-MAR-05 190 1 3209092
3551 29 221 11-MAR-05 190 1 3837737
3551 29 221 11-MAR-05 190 1 3837757
3551 29 221 11-MAR-05 190 1 6072227
3551 29 221 11-MAR-05 190 1 8894680
3551 29 221 11-MAR-05 227 1 3209091
3551 29 221 11-MAR-05 227 1 3837736
3551 29 221 11-MAR-05 227 1 3837756
3551 29 221 11-MAR-05 227 1 6072226
3551 29 221 11-MAR-05 227 1 8894679
3551 29 221 12-MAR-05 153 1 3209090
3551 29 221 12-MAR-05 153 1 3837735
3551 29 221 12-MAR-05 153 1 3837755
3551 29 221 12-MAR-05 153 1 6072225
3551 29 221 12-MAR-05 153 1 8894678
3551 29 221 16-MAR-05 109 1 3837734
3551 29 221 16-MAR-05 109 1 3837754
3551 29 221 16-MAR-05 109 1 6072224
3551 29 221 16-MAR-05 109 1 8894677
3551 29 221 16-MAR-05 128 1 3837733
3551 29 221 16-MAR-05 128 1 3837753
3551 29 221 16-MAR-05 128 1 6072223
3551 29 221 16-MAR-05 128 1 8894676
3551 29 221 17-MAR-05 181 1 3837732
3551 29 221 17-MAR-05 181 1 3837752
3551 29 221 17-MAR-05 181 1 6072222
3551 29 221 17-MAR-05 181 1 8894675

50 rows selected.

And here's the answer I need:
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED COL_0 COL_1 ID MIN_ID
-------- ------------- ------- --------- ----- ----- -------- --------
3011 29 221 30-NOV-02 77 3 26445635 26445635
3011 29 221 30-NOV-02 77 3 26445651
3011 29 221 30-NOV-02 86 3 26445626 26445626
3011 29 221 30-NOV-02 86 3 26445653
3011 29 221 30-NOV-02 112 3 26445617 26445617
3011 29 221 30-NOV-02 112 3 26445620
3011 29 221 30-NOV-02 125 3 26445631 26445631
3011 29 221 30-NOV-02 125 3 26445641
3551 29 221 30-NOV-02 112 3 3209094 3209094
3551 29 221 30-NOV-02 112 3 3837739
3551 29 221 30-NOV-02 112 3 3837759
3551 29 221 30-NOV-02 112 3 6072229
3551 29 221 30-NOV-02 112 3 8894682
3551 29 221 30-NOV-02 118 3 3209095 3209095
3551 29 221 30-NOV-02 118 3 3837740
3551 29 221 30-NOV-02 118 3 3837760
3551 29 221 30-NOV-02 118 3 6072230
3551 29 221 30-NOV-02 118 3 8894683
3551 29 221 11-MAR-05 186 1 3209093 3209093
3551 29 221 11-MAR-05 186 1 3837738
3551 29 221 11-MAR-05 186 1 3837758
3551 29 221 11-MAR-05 186 1 6072228
3551 29 221 11-MAR-05 186 1 8894681
3551 29 221 11-MAR-05 190 1 3209092 3209092
3551 29 221 11-MAR-05 190 1 3837737
3551 29 221 11-MAR-05 190 1 3837757
3551 29 221 11-MAR-05 190 1 6072227
3551 29 221 11-MAR-05 190 1 8894680

Perhaps, PARTITION BY is not appropriate? Thanks.


Tom Kyte

Followup  

March 01, 2006 - 1:49 pm UTC

I know we are not on the same page - because all I want is a textual description of the logic.

I do not, will not, reverse engineer "this is what I get" and "this is what I want"

I want you to write it down as if you would give it to someone to implement

because....... That is precisely what you are doing.

Difficulty with min(id) over partition by -- part deux - take 5

March 01, 2006 - 12:23 pm UTC

Reviewer: Ken

Thanks, Tom. Your questions forced me to look at it again. I was able to resolve it without using PARTITION BY.

Thanks again.

Analytics Question

March 01, 2006 - 2:48 pm UTC

Reviewer: Mike from Dallas, TX

I am having an issue with using analytics. Here is what I need to accomplish:
I have a table called entity. This table is recursive in that every entity,
with the exception of the highest-level entity, has a parent entity.

Normally this list is easily generated by using 'connect by' in the SQL to show the hierarchy.

The problem I have is that now we have to show all the tickets that have been opened for each entity
and all the entities under it.

A simple join will get the open tickets for each entity, but summing all of the tickets
for each entity and all entities under it takes too many resources to be considered usable.

I have tried several ways to get this query to work with as few resources as possible, but I still
believe this SQL will degrade to unacceptable performance as the table(s) grow.

What I really need to do is sum the entities' tickets in reverse order, from the bottom up to
the level 1 entity, so that I don't have to traverse the tree from level 1 to level 6, then each
level 2 to level 6, and so on.

I wrote a quick procedure to populate a table (t) with each uuid that has a ticket, the sum of
the tickets tied to that entity, and a concatenated list of uuids under that entity in the tree.
This resulted in a much smaller table, but it is not the way I would want to implement it.

This table is going to grow to several hundred thousand entities, and I need to be able
to generate this list much faster, using fewer resources than the connect by statements below.
Any help would definitely be appreciated.

Without using a procedure and a two-step process, here is what I have to work with for a test:

CREATE TABLE ENTITY (
ENTITY_UUID VARCHAR2(32),
NAME VARCHAR2(256),
PARENT_UUID VARCHAR2(32)
)
/

Table created.

CREATE TABLE ENTITY_TCKT (
ENTITY_UUID VARCHAR2(32),
CURRENT_LIFECYCLE_STATE NUMBER
)
/

Table created.

Then populate the entity table,

insert into entity values ('13E7CAA5FDEB42518A798A77A19F70B0','Level1 Entity',NULL);
insert into entity values ('66A6A6EFFA9D46BE82EC8F5CFFAC91B9','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('DCF6B6366D6449DB95A5AEA6B14F31F7','Level5 Entity','66A6A6EFFA9D46BE82EC8F5CFFAC91B9');
insert into entity values ('E2FD444948714528805EBFFA102511F5','Level5 Entity','CB4E1B74035947B9A5B9B0FE264DF4E7');
insert into entity values ('2E1E0646AC9F4BB9A6E4A747B20B2595','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('54133391FDD54221B11382A20DFC38AA','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('95A5F85D68184DB7A49F9DF7A236F9AF','Level5 Entity','99762DC75A5D42DCBEA6950D7011F130');
insert into entity values ('EAD30C5578BD491991B0D7049CD4F277','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('536FCF7E4A5D457B8C3AECBED878FDBF','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('883FD970DF264B7A9DC0DFBAF225012A','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('C23E104B5AA044F795A6896B1C0B08E4','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('64BA9F3295194A3B955FD446DBB2E7EC','Level4 Entity','C23E104B5AA044F795A6896B1C0B08E4');
insert into entity values ('0FE28D72C40C4D6FBB439A49B0BE6D3F','Level5 Entity','64BA9F3295194A3B955FD446DBB2E7EC');
----------list goes on------------------

Then to generate some random ticket data

DECLARE
l_tckt_ktr NUMBER;
l_entity_ktr NUMBER;
BEGIN
FOR i IN (select entity_uuid from entity)
LOOP
l_entity_ktr := round(dbms_random.value(1,6),0);
DBMS_OUTPUT.PUT_LINE('Processing entity -> '||i.entity_uuid||' l_entity_ktr = '||l_entity_ktr);
IF l_entity_ktr = 1 THEN
l_tckt_ktr := round(dbms_random.value(1,10),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 3 THEN
l_tckt_ktr := round(dbms_random.value(50,100),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
ELSIF l_entity_ktr = 5 THEN
l_tckt_ktr := round(dbms_random.value(500,1000),0);
FOR n IN 1 .. l_tckt_ktr
LOOP
insert into entity_tckt values (i.entity_uuid,0);
END LOOP;
END IF;
END LOOP;
END;
/

PL/SQL procedure successfully completed.

Now that the tables have data, try two ways to get the query

set lines 256
column lpad('',2*level)||e.name format a50
column lpad('',2*level)||e.entity_uuid format a50
select lpad(' ', 2 * level)||e.name, lpad(' ', 2 * level)||e.entity_uuid, level,
(select count(*) from entity_tckt where current_lifecycle_state = 0 and entity_uuid in (
select a.entity_uuid from entity a start with a.entity_uuid = e.entity_uuid connect by prior a.entity_uuid = a.parent_uuid)) as nbr
from entity e
start with e.entity_uuid = '13E7CAA5FDEB42518A798A77A19F70B0' connect by prior e.entity_uuid = e.parent_uuid
/
LPAD('',2*LEVEL)||E.NAME LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
Level1 Entity 13E7CAA5FDEB42518A798A77A19F70B0 1 15533
Level2 Entity 2E1E0646AC9F4BB9A6E4A747B20B2595 2 7
Level2 Entity 54133391FDD54221B11382A20DFC38AA 2 15526
Level3 Entity 536FCF7E4A5D457B8C3AECBED878FDBF 3 890
Level4 Entity 66A6A6EFFA9D46BE82EC8F5CFFAC91B9 4 51
Level5 Entity DCF6B6366D6449DB95A5AEA6B14F31F7 5 0
Level4 Entity EAD30C5578BD491991B0D7049CD4F277 4 76
Level4 Entity 883FD970DF264B7A9DC0DFBAF225012A 4 0
Level4 Entity 0F01178489CB421CA5A4C55AEC98300E 4 4
Level5 Entity 5B3B033B4A30443BB1B6F57C1BB05399 5 4
Level4 Entity CB4E1B74035947B9A5B9B0FE264DF4E7 4 69

LPAD('',2*LEVEL)||E.NAME LPAD('',2*LEVEL)||E.ENTITY_UUID LEVEL NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
Level5 Entity E2FD444948714528805EBFFA102511F5 5 69
Level4 Entity 99762DC75A5D42DCBEA6950D7011F130 4 607
Level5 Entity 95A5F85D68184DB7A49F9DF7A236F9AF 5 517
-------------------and on and on----------------

122 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
4 2 FILTER
5 4 CONNECT BY (WITH FILTERING)
6 5 NESTED LOOPS
7 6 TABLE ACCESS (FULL) OF 'ENTITY'
8 6 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
9 5 NESTED LOOPS
10 9 BUFFER (SORT)
11 10 CONNECT BY PUMP
12 9 TABLE ACCESS (FULL) OF 'ENTITY'
13 0 CONNECT BY (WITH FILTERING)
14 13 NESTED LOOPS
15 14 TABLE ACCESS (FULL) OF 'ENTITY'
16 14 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
17 13 NESTED LOOPS
18 17 BUFFER (SORT)
19 18 CONNECT BY PUMP
20 17 TABLE ACCESS (FULL) OF 'ENTITY'




Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
208383 consistent gets
0 physical reads
0 redo size
9007 bytes sent via SQL*Net to client
591 bytes received via SQL*Net from client
10 SQL*Net roundtrips to/from client
27913 sorts (memory)
0 sorts (disk)
122 rows processed

And now using 'WITH'

with alm as (select entity_uuid, count(*) as nbr
from entity_tckt
group by entity_uuid)
select e.entity_uuid, sum(nbr) from entity e, alm
where alm.entity_uuid in (select entity_uuid from entity start with entity_uuid = e.entity_uuid connect by prior entity_uuid = parent_uuid)
group by e.entity_uuid
/

ENTITY_UUID SUM(NBR)
-------------------------------- ----------
026D4A6B547544E08CFEC5DDED3B7777 772
02AD7930E9B94A20A9F4E245B3F4B8C4 1747
04B3AD5F804B4A66AC6C91606EA7019B 74
070547E30C604D4680089959A6DB7684 677
08ABA73521F94F2DB19FF8C1C53ADF06 2
0D65B9BB906B442B9A6BDFEEF3D858AC 86
0F01178489CB421CA5A4C55AEC98300E 4
0FE28D72C40C4D6FBB439A49B0BE6D3F 716
10801AFBCBC44F2F9851E16BC1CCA442 703
13E7CAA5FDEB42518A798A77A19F70B0 15533
1677A02C0B89479BA27F95D7AA3DAC03 9516

ENTITY_UUID SUM(NBR)
-------------------------------- ----------
1AE6BCDA1FA74B0B8DEBEFF13F9A444A 96
1F9BD4CD72FA4879AC338397FB59FD39 74
21FAD03A5EB34D79908E96DA647FF24C 592
235F6C1600584CDD8B3FB1FFCD742E7E 839
23EA0B4618A94C1586BDF08437B97EB0 3067
25CFE1E64F36468DB291CBCF0867B314 59
26C635064E83447B91BA8D125FE74C2A 776
2ABC45D4A5D84684AC29A578C5CCBD3E 1759
2DF37BBB64254DB29081F03D38B5CE33 876
2E1E0646AC9F4BB9A6E4A747B20B2595 7
2FF85FEF82864C0BB75BA4513885ED0E 76

----------------more data ----------------
71 rows selected.


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (GROUP BY)
2 1 FILTER
3 2 NESTED LOOPS
4 3 VIEW
5 4 SORT (GROUP BY)
6 5 TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
7 3 TABLE ACCESS (FULL) OF 'ENTITY'
8 2 FILTER
9 8 CONNECT BY (WITH FILTERING)
10 9 NESTED LOOPS
11 10 TABLE ACCESS (FULL) OF 'ENTITY'
12 10 TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
13 9 NESTED LOOPS
14 13 BUFFER (SORT)
15 14 CONNECT BY PUMP
16 13 TABLE ACCESS (FULL) OF 'ENTITY'




Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
197127 consistent gets
0 physical reads
0 redo size
3707 bytes sent via SQL*Net to client
547 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
27902 sorts (memory)
0 sorts (disk)
71 rows processed

The logical reads will kill us as the table grows - Any help you could give would be fantastic

P.S. - I love the site, use it every day - Thanx
Mike
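
One way to attack the logical-read problem (a sketch of an alternative, untested against this schema and not from the thread's answer) is to walk the hierarchy once, tagging every visited row with its root via CONNECT_BY_ROOT (available in 10g and later), instead of running a separate CONNECT BY for each ENTITY row:

```sql
-- Sketch (10g+, untested): one tree walk tags every descendant with its
-- root via connect_by_root; omitting START WITH makes every row a root,
-- so each entity gets its whole subtree enumerated in a single pass.
select tree.root_uuid, sum(t.nbr)
  from (select connect_by_root entity_uuid as root_uuid,
               entity_uuid
          from entity
        connect by prior entity_uuid = parent_uuid) tree,
       (select entity_uuid, count(*) as nbr
          from entity_tckt
         group by entity_uuid) t
 where tree.entity_uuid = t.entity_uuid
 group by tree.root_uuid;
```

Whether this beats the WITH version depends on the data; it trades a per-row tree walk for one larger join.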





Analytics rock Analytics roll

March 01, 2006 - 2:50 pm UTC

Reviewer: Alf from NYC

Sorry, I should have mentioned that there is no direct relation between patient and proc_event:

The only way I'm able to relate patients to proc_event is by joining event to proc_event as (e.event_id(+) = pe.event_id) and then visit to patient (p.patient_id = v.patient_id). The IN operator works as expected; however, I tried to use AND, as in pe.proc_id = '123' AND pe.proc_id = '3456', because this should return only records for patients who have both 123 and 3456 (proc_id) records in the proc_event table.




Hello Tom, Analytics rock Analytics roll

March 02, 2006 - 4:51 pm UTC

Reviewer: Alf from NYC

Hi Tom,

I've been trying many different approaches for this.

Here are the table descriptions; I'm including the whole output from the desc command. Not sure if you'd want me to include everything or cut the non-relevant columns out.

SQL> desc ud_master.patient
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PATIENT_ID                                NOT NULL NUMBER(12)
 NAME                                               VARCHAR2(100)
 TITLE_ID                                           NUMBER(12)
 MEDICAL_RECORD_NUMBER                              VARCHAR2(20)
 SEX                                                VARCHAR2(8)
 BIRTHDATE                                          DATE
 DATE_OF_DEATH                                      DATE
 APT_SUITE                                          VARCHAR2(100)
 STREET_ADDRESS                                     VARCHAR2(100)
 CITY                                               VARCHAR2(50)
 STATE                                              VARCHAR2(50)
 COUNTRY                                            VARCHAR2(50)
 MAILING_CODE                                       VARCHAR2(50)
 MARITAL_STATUS_ID                                  NUMBER(12)
 RACE_ID                                            NUMBER(12)
 RELIGION_ID                                        NUMBER(12)
 FREE_TEXT_RELIGION                                 VARCHAR2(100)
 OCCUPATION_ID                                      NUMBER(12)
 FREE_TEXT_OCCUPATION                               VARCHAR2(100)
 EMPLOYER_ID                                        NUMBER(12)
 FREE_TEXT_EMPLOYER                                 VARCHAR2(150)
 MOTHER_PATIENT_ID                                  NUMBER(12)
 COLLAPSED_INTO_PATIENT_ID                          NUMBER(12)
 SOCIAL_SECURITY_NUMBER                             VARCHAR2(15)
 LIFECARE_VISIT_ID                                  NUMBER(12)
 CONFIDENTIAL_FLAG                                  VARCHAR2(1)
 HOME_PHONE                                         VARCHAR2(20)
 DAY_PHONE                                          VARCHAR2(20)
 SMOKER_FLAG                                        VARCHAR2(1)
 CURRENT_LOCATION                                   VARCHAR2(15)
 SEC_LANG_NAME                                      VARCHAR2(100)
 ADDR_STRING                                        VARCHAR2(50)
 BLOCK_CODE                                         VARCHAR2(50)

SQL> desc ud_master.visit
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 VISIT_NUMBER                                       VARCHAR2(40)
 PATIENT_ID                                         NUMBER(12)
 VISIT_TYPE_ID                                      NUMBER(12)
 VISIT_SUBTYPE_ID                                   NUMBER(12)
 VISIT_STATUS_ID                                    NUMBER(12)
 FACILITY_ID                                        NUMBER(12)
 ATTENDING_EMP_PROVIDER_ID                          NUMBER(12)
 RESIDENT_EMP_PROVIDER_ID                           NUMBER(12)
 ADMISSION_DATE_TIME                                DATE
 DISCHARGE_DATE_TIME                                DATE
 DISCHARGE_TYPE_ID                                  NUMBER(12)
 MARITAL_STATUS_ID                                  NUMBER(12)
 RELIGION_ID                                        NUMBER(12)
 FREE_TEXT_RELIGION                                 VARCHAR2(100)
 FINANCIAL_CLASS_ID                                 NUMBER(12)
 OCCUPATION_ID                                      NUMBER(12)
 FREE_TEXT_OCCUPATION                               VARCHAR2(100)
 EMPLOYER_ID                                        NUMBER(12)
 FREE_TEXT_EMPLOYER                                 VARCHAR2(100)
 PHYSICIAN_SERVICE_ID                               VARCHAR2(12)
 LOCATION_ID                                        VARCHAR2(12)
 ADDL_RESP_EMP_PROVIDER_ID                          NUMBER(12)
 ADDL_RESP_STRING                                   VARCHAR2(100)
 ATTENDING_STRING                                   VARCHAR2(100)
 RESIDENT_STRING                                    VARCHAR2(100)
 LAST_LOCATION                                      VARCHAR2(15)
 ADDL_RESP_RESIDENT_SERVICE_ID                      NUMBER(12)
 TRIAGE_ACUITY_ID                                   NUMBER(12)
 SERIES_VISIT_FLAG                                  VARCHAR2(5)
 NEWBORN_FLAG                                       VARCHAR2(5)

SQL> desc ud_master.proc_event
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 ORDER_SPAN_ID                                      NUMBER(12)
 ORDER_SPAN_STATE_ID                                NUMBER(12)
 PROC_ID                                            NUMBER(12)
 ORIG_SCHEDULE_BEGIN_DATE_TIME                      DATE
 ORIG_SCHEDULE_END_DATE_TIME                        DATE
 FINAL_SCHEDULE_BEGIN_DATE_TIME                     DATE
 FINAL_SCHEDULE_END_DATE_TIME                       DATE
 ABNORMAL_STATE_ID                                  VARCHAR2(3)
 MODIFIED_PROC_NAME                                 VARCHAR2(250)
 FACILITY_ID                                        NUMBER(12)
 PRIORITY_ID                                        NUMBER(12)
 CORRECTED_FLAG                                     VARCHAR2(5)
 RX_FLAG                                            VARCHAR2(5)
 SPEC_RECOLLECT_FLAG                                VARCHAR2(5)
 COMPLETE_RESULT_RPT                                NUMBER(12)
 ORDER_VISIT_ID                                     NUMBER(12)
 ORDER_DEFINITION_ID                                VARCHAR2(25)
 PROC_ORDER_NBR                                     NUMBER(12)

SQL> desc ud_master.event
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 DATE_TIME                                          DATE
 EVENT_STATUS_ID                                    NUMBER(12)
 EVENT_TYPE_ID                                      NUMBER(12)
 PATIENT_SCHEDULE_DISPLAY                           VARCHAR2(100)

I've created the following query with a subquery and an EXISTS clause. Would you please review it and let me know whatever I need to correct? Many thanks in advance.

SELECT DISTINCT  patient.NAME,
         patient.medical_record_number Medical_RN,
         sub_epe.proc_id,
         to_date(sub_epe.date_time,'dd-mon-yy') event_dt
    FROM ud_master.patient,
         (SELECT proc_event.proc_id, event.date_time,proc_event.VISIT_ID
          FROM ud_master.event, ud_master.proc_event
          WHERE (exists (select * 
                         from ud_master.proc_event sub_p
                        where sub_p.proc_id = sub_p.proc_id) 
                          and (proc_event.visit_id = event.visit_id)))sub_epe,ud_master.visit
    WHERE (   (patient.patient_id = visit.patient_id)
          AND (sub_epe.proc_id = 21078) and (sub_epe.proc_id = 22025)
          AND (sub_epe.date_time BETWEEN to_date('25-jan-2005','dd-mon-yyyy') AND
                                         to_date('31-jan-2005','dd-mon-yyyy')))

GROUP BY patient.NAME,
         patient.medical_record_number,
         sub_epe.proc_id,
         sub_epe.date_time     
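
As the earlier follow-up noticed, a single row can never satisfy proc_id = 21078 AND proc_id = 22025 at once, so ANDing the two predicates returns nothing. The usual pattern is to accept either value and then keep only patients matched on both, via HAVING. A sketch against the described tables (untested; join paths as given in the thread):

```sql
-- "Both procedures" pattern (sketch, untested): filter with IN,
-- then require two distinct proc_ids per patient.
select p.name, p.medical_record_number
  from ud_master.patient p,
       ud_master.visit v,
       ud_master.proc_event pe
 where p.patient_id = v.patient_id
   and v.visit_id   = pe.visit_id
   and pe.proc_id in (21078, 22025)
 group by p.name, p.medical_record_number
having count(distinct pe.proc_id) = 2;
```

The HAVING COUNT(DISTINCT ...) = 2 is what turns "either procedure" into "both procedures".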
  

Another Select Query

March 08, 2006 - 2:55 am UTC

Reviewer: A reader

Hi Tom,
Have a requirement like this:
Create table test(
main number,
a1 varchar2(10),
a2 varchar2(10),
a3 varchar2(10),
a4 varchar2(10),
a5 varchar2(10)) nologging;

Insert into test values (1,'A1','A2','A3','A4','A5');

Normal Output:

Main A1 A2 A3 A4 A5
---- -- -- -- -- --
1 A1 A2 A3 A4 A5

Desired Output:
---------------
Main Txt
---- ---
1 A1
1 A2
1 A3
1 A4
1 A5


I know it is a simple query.
Please advise.


Tom Kyte

Followup  

March 09, 2006 - 12:13 pm UTC

ops$tkyte@ORA10GR2> with r as
  2  (select level l from dual connect by level <= 5)
  3  select test.main,
  4         decode( r.l, 1, a1, 2, a2, 3, a3, 4, a4, 5, a5 ) data
  5    from test, r;

      MAIN DATA
---------- ----------
         1 A1
         1 A2
         1 A3
         1 A4
         1 A5
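
On 11g and later the same rows-from-columns result can also be written with UNPIVOT (a sketch, not from the original answer; the DECODE-plus-row-generator approach above works on older releases):

```sql
-- 11g+ alternative (sketch): UNPIVOT turns the five columns into rows.
select main, data
  from test
unpivot (data for col in (a1, a2, a3, a4, a5));
```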
 

Analytics rock Analytics roll

March 09, 2006 - 4:37 pm UTC

Reviewer: Alf from NYC

Hello Tom,

How about this?

I have four tables (descs included below) that hold information about patients along with their visits and the tests (procedures, hence the proc table) performed. In some cases patients are ordered two tests/procedures within a certain period of time.

I need to create a report listing patients that have had at least two procedures done, in this case procedures 21078 and 22025. The desired output would be:

P_name P_MRN Proc_ID Event_Date_Time
---------------- ----------- ----------- --------------------
Patient Test, A 44422244555 21078 01-jan-2005 00:00
22025 05-jan-2005 10:35
Patient Test, B 3334442222 21078 28-feb-2005 11:15
22025 31-jul-2005 01:35
…...

My challenge is that there's no direct relation between patient and proc_event. Would you please review these tables and give a hint on how to approach this? Thanks.



Patient
=======
Name Null? Type
----------------------------------------- -------- ----------------------------
PATIENT_ID NOT NULL NUMBER(12)
NAME VARCHAR2(100)
TITLE_ID NUMBER(12)
MEDICAL_RECORD_NUMBER VARCHAR2(20)
SEX VARCHAR2(8)
.....

Visit
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
VISIT_NUMBER VARCHAR2(40)
PATIENT_ID NUMBER(12)
VISIT_TYPE_ID NUMBER(12)
VISIT_SUBTYPE_ID NUMBER(12)
VISIT_STATUS_ID NUMBER(12)
FACILITY_ID NUMBER(12)
......

proc_event
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
ORDER_SPAN_ID NUMBER(12)
ORDER_SPAN_STATE_ID NUMBER(12)
PROC_ID NUMBER(12)
......

event
=====
Name Null? Type
----------------------------------------- -------- ----------------------------
VISIT_ID NOT NULL NUMBER(12)
EVENT_ID NOT NULL NUMBER(12)
DATE_TIME DATE
EVENT_STATUS_ID NUMBER(12)
EVENT_TYPE_ID NUMBER(12)
PATIENT_SCHEDULE_DISPLAY VARCHAR2(100)
......

Analytics Rock

March 10, 2006 - 3:31 pm UTC

Reviewer: Alf from NYC

Hello Tom,

Follow-up to my previous question above regarding a list of patients who have had two specific tests performed/ordered, in this case proc_id '22025' and '21078'.

I finally got the below query to work this time. Would you please review it and let me know if there's anything I need to change to improve performance? Thanks.

SELECT
DISTINCT patient.medical_record_number mrn,
patient.name p_name,
proc.proc_id,proc.name,
last_day(to_date(event.date_time, 'dd-mon-yy')) D_time
FROM ud_master.event,
ud_master.proc_event,
ud_master.patient,
ud_master.visit,
ud_master.proc
WHERE (
(patient.patient_id = visit.patient_id)
AND (visit.visit_id = proc_event.visit_id)
AND (visit.visit_id = event.visit_id)
AND (proc_event.visit_id = event.visit_id)
AND (proc_event.event_id = event.event_id)
AND (proc.proc_id in (22025,21078))
AND (event.date_time BETWEEN to_date('01-jan-2005','dd-mon-yyyy') AND
to_date('31-dec-2005','dd-mon-yyyy')))


GROUP BY patient.medical_record_number,
patient.name,
proc.proc_id,
proc.name,
to_date(event.date_time, 'dd-mon-yy')

ORDER BY patient.medical_record_number,'p_name',
proc.proc_id,
proc.name,'D_time'
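
Two small points on the query above. First, event.date_time is already a DATE, so to_date(event.date_time, 'dd-mon-yy') forces an implicit date-to-string-to-date round trip and, with 'yy', can silently shift the century; TRUNC does the intended job directly. Second, ORDER BY 'p_name' and 'D_time' sort by constant string literals, not by the columns. A sketch of the corrected expressions only (not a full, tested rewrite):

```sql
-- event.date_time is a DATE: no to_date() needed.
-- Instead of: last_day(to_date(event.date_time, 'dd-mon-yy'))
select last_day(trunc(event.date_time)) as d_time
  from ud_master.event
 order by d_time;   -- order by the column alias, not the literal 'D_time'
```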

Analytic function

March 23, 2006 - 4:20 am UTC

Reviewer: Mohamed from France

Dear Tom,

I have a table T1 as follows:

IDE NOT NULL NUMBER
PTFL_IDE NOT NULL NUMBER
BATY_TYP NOT NULL VARCHAR2(6)
CTP_IC NOT NULL NUMBER
LABL_PLY NOT NULL VARCHAR2(6)
NMFI_IDE NOT NULL NUMBER
QTY NOT NULL NUMBER

and a table T2 as follows

IDE NOT NULL NUMBER
SNAP_NMFI_IDE NOT NULL NUMBER
IDE_PORTF VARCHAR2(50) ===>
SNAP_BATY_TYP VARCHAR2(6)
USR_INS VARCHAR2(48)
DAT_INS DATE
USR_UPD VARCHAR2(48)
DAT_UPD DATE
PGM_INS VARCHAR2(48)
PGM_UPD VARCHAR2(48)


and table T3 as follows

PTFL_IDE NOT NULL NUMBER
IDE_PORTF NOT NULL VARCHAR2(50)
PTF_STATUS NOT NULL VARCHAR2(3)
NLAB_COD_LABL NOT NULL VARCHAR2(6)

I would like to get from table T1 the sum(T1.QTY) grouped by T1.PTFL_IDE,T1.BATY_TYP,T1.CTP_IC, T1.LABL_PLY +
the max(T2.dat_ins) for each (T3.IDE_PORTF,T3.SNAP_BATY_TYP) in this group

Remark: T3.SNAP_BATY_TYP = T1. BATY_TYP
T3.IDE_PORTF = SELECT T3.IDE_PORTF FROM T3 WHERE T3.PTFL_IDE = T1.PTFL_IDE

Thanks in advance for your help


Tom Kyte

Followup  

March 23, 2006 - 10:54 am UTC

sorry, no help.

no create table.
no insert into.
no real explanation of the goal. the remark doesn't make sense (don't know what you are trying to remark), the "i would like to get" isn't very clear either.

Simple Analytical Query

March 24, 2006 - 2:31 am UTC

Reviewer: A reader from CA

Hi Tom,

I have finally got a chance to make use of an analytical query, but I am not able to get it to work. Following is my table data:-

ITEM          MYORDERS  QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
ABC                 5         50          10
ABC                45         50          10
ABC                25         50          10
DEF                30         60          10
DEF                40         60          10
DEF                30         50          10
DEF                30         60          10
DEF                40         60          10
XYZ                20         80          10
XYZ                10         80          10
XYZ                10         80          10

ITEM          MYORDERS  QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
XYZ                20         80          10

Now I want the Output to be as :-
Item     MYORDERS   qtyonhand  safetystock  tomake  --> (qtyonhand-safetystock-sum(MYORDERS))
====  =========  =======   ============  ========
ABC           5       50             10         0   (haven't depleted stock)
ABC          45       50             10        10   -1 x (50-10-5-45)
ABC          25       50             10        35   -1 x (50-10-5-45-25)
so on........

I tried the following query, but it doesn't give me a running total for the MYORDERS column :-
SQL> select item,MYORDERS,sum(MYORDERS) over (partition by item) tomake from test;
ITEM          MYORDERS    TOMAKE
---------- ---------- ----------
ABC                 5         75
ABC                45         75
ABC                25         75
DEF                30        170
DEF                40        170
DEF                30        170
DEF                30        170
DEF                40        170
XYZ                20         60
XYZ                10         60
XYZ                10         60

ITEM          MYORDERS    TOMAKE
---------- ---------- ----------
XYZ                20         60

12 rows selected.

If I get a running total in TOMAKE then I can just subtract it from the value of SAFETYSTOCK for that particular row to get my desired output.
I know I am missing the order by clause; I tried using the other 2 columns (QTYONHAND, SAFETYSTOCK), but did not get the desired output. Can you please help me out?

--Create Table Script
CREATE TABLE TEST
(
  ITEM         VARCHAR2(10),
  MYORDERS     NUMBER,
  QTYONHAND    NUMBER,
  SAFETYSTOCK  NUMBER
)

--Insert Statements
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 5, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 45, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 25, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 40, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 40, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 10, 80, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 20, 80, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 10, 80, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 20, 80, 10);  

COMMIT;

Thanks 

Tom Kyte

Followup  

March 24, 2006 - 9:49 am UTC

you seem to be missing something to sort by? what orders this data - a running total implies "SORT BY SOMETHING"

Simple Analytical Query

March 24, 2006 - 12:16 pm UTC

Reviewer: A reader

Any one of the other 2 columns can be used to sort the data.

Thanks

Tom Kyte

Followup  

March 24, 2006 - 3:34 pm UTC

so, order by them. in the over () clause.

Simple Analytical Query

March 24, 2006 - 3:50 pm UTC

Reviewer: A reader

Hi Tom

I think the below query will work fine for me; I have ordered the data by rowid:-

SQL> select item,myorders,sum(myorders) over (partition by item order by rowid) run_tot from test;

ITEM          MYORDERS    RUN_TOT
---------- ---------- ----------
ABC                25         25
ABC                45         70
ABC                 5         75
DEF                30         30
DEF                40         70
DEF                30        100
DEF                40        140
DEF                30        170
XYZ                10         10
XYZ                20         30
XYZ                10         40

ITEM          MYORDERS    RUN_TOT
---------- ---------- ----------
XYZ                20         60

12 rows selected.

Thanks for all your help 

Tom Kyte

Followup  

March 24, 2006 - 4:16 pm UTC

as long as you don't care that it gives different answers for the same data on different databases, sure.
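
To make that caveat concrete: rowid reflects a row's physical location, not its insertion order, so the running total above can change after an export/import or a reorganization. A deterministic version needs a real ordering column; a sketch assuming a hypothetical SEQ column populated at insert time:

```sql
-- Deterministic running total (sketch): SEQ is a hypothetical column
-- populated at insert time, e.g. from a sequence.
select item,
       myorders,
       sum(myorders) over (partition by item order by seq) as run_tot
  from test;
```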



Full Join of inline views with analytics

March 25, 2006 - 8:22 am UTC

Reviewer: Anwar from Islamabad, Pakistan

I have a table with a date column. I want to display dates from two different months in two columns, with dates from both months in the same rows, not in alternate ones.

create table test
(tdate date);
insert into test values(to_date('01-jan-2006','dd-mon-yyyy'));
insert into test values(to_date('05-jan-2006','dd-mon-yyyy'));
insert into test values(to_date('15-jan-2006','dd-mon-yyyy'));
insert into test values(to_date('02-feb-2006','dd-mon-yyyy'));
insert into test values(to_date('07-feb-2006','dd-mon-yyyy'));
insert into test values(to_date('25-feb-2006','dd-mon-yyyy'));

Then I issue the following command to retrieve data.

  1  select jan06,feb06
  2  from
  3     (select tdate jan06,
  4     row_number() over (order by tdate) rn
  5     from test
  6     where tdate between '01-jan-2006' and '31