Excellent!!!
A reader, October   07, 2003 - 5:05 am UTC
 
 
Hi Tom,
 'been thinking about writing a book just about analytics' ... please make this book available soon and I am sure it will be yet another gift from you to the Oracle world :) 
 
 
Wow!!
Michael T, October   07, 2003 - 7:00 am UTC
 
 
This is exactly what I needed!  Analytics do rock!  I just
need to understand them better.  If you do decide to write a
book on analytics, it would be at the top of my must have
list.  Thanks again!!! 
 
 
Small correction
Michael T, October   07, 2003 - 7:33 am UTC
 
 
After looking at it a little closer, it looks like there is
one small error.  The start date for the first MACH1 entry
should be the close date of the prior different station.  In
this case 07/01/2003.  However, by making some small changes 
to your query I can get the results I want.  
SELECT order#,
       station,
       lag(close_date) over (partition by order# order by close_date) 
           start_date,
       close_date
  FROM (SELECT order#, 
               station, 
               close_date
          FROM (SELECT order#,
                       lag(station) over (partition by order# order by 
                                          close_date) lag_station,
                       lead(station) over (partition by order# order by 
                                          close_date) lead_station,
                       station,
                       close_date
                  FROM t)
         WHERE lead_station <> station
            OR lead_station is null
            OR lag_station is null)
There might be an easier way to construct this query, but
it works great for me.  Thanks a lot for your help!
 
 
October   07, 2003 - 8:25 am UTC 
 
sorry about that -- you are right -- when we have "a pair", we want to use lag/lead again to get and keep the right dates.  
So, we want to keep rows that are:
a) the first row in the partition  "where lag_station is null"
b) the last row in the partition "where lead_station is null"
c) the first of a possible pair "where lag_station <> station"
d) the second of a possible pair "where lead_station <> station"
This query does that:
ops$tkyte@ORA920> select order#,
  2         station,
  3         lag_close_date,
  4         close_date,
  5         decode( lead_station, station, 1, 0 ) first_of_pair,
  6         decode( lag_station, station, 1, 0 ) second_of_pair
  7    from (
  8  select order#,
  9         lag(station) over (partition by order# order by close_date)
 10                                                              lag_station,
 11         lead(station) over (partition by order# order by close_date)
 12                                                              lead_station,
 13             station,
 14             close_date,
 15         lag(close_date) over (partition by order# order by close_date)
 16                                                             lag_close_date,
 17         lead(close_date) over (partition by order# order by close_date)
 18                                                             lead_close_date
 19    from t
 20         )
 21   where lag_station is null
 22          or lead_station is null
 23          or lead_station <> station
 24          or lag_station <> station
 25  /
 
ORDER# STATION LAG_CLOSE_ CLOSE_DATE FIRST_OF_PAIR SECOND_OF_PAIR
------ ------- ---------- ---------- ------------- --------------
 12345 RECV               07/01/2003             0              0
 12345 MACH1   07/01/2003 07/02/2003             1              0
 12345 MACH1   07/05/2003 07/11/2003             0              1
 12345 INSP1   07/11/2003 07/12/2003             0              0
 12345 MACH1   07/12/2003 07/16/2003             0              0
 12345 MACH2   07/16/2003 07/30/2003             0              0
 12345 STOCK   07/30/2003 08/01/2003             0              0
 
7 rows selected.
 
<b>we can see with the 1's the first/second of a pair in there.  All we need to do now is "reach forward" for the first of a pair and grab the close date from the next record:</b>
ops$tkyte@ORA920> select order#,
  2         station,
  3         lag_close_date,
  4         close_date
  5    from (
  6  select order#,
  7         station,
  8         lag_close_date,
  9         decode( lead_station,
 10                 station,
 11                 lead(close_date) over (partition by order# order by close_date),
 12                 close_date ) close_date,
 13         decode( lead_station, station, 1, 0 ) first_of_pair,
 14         decode( lag_station, station, 1, 0 ) second_of_pair
 15    from (
 16  select order#,
 17         lag(station) over (partition by order# order by close_date)
 18                                                              lag_station,
 19         lead(station) over (partition by order# order by close_date)
 20                                                              lead_station,
 21             station,
 22             close_date,
 23         lag(close_date) over (partition by order# order by close_date)
 24                                                             lag_close_date,
 25         lead(close_date) over (partition by order# order by close_date)
 26                                                             lead_close_date
 27    from t
 28         )
 29   where lag_station is null
 30          or lead_station is null
 31          or lead_station <> station
 32          or lag_station <> station
 33         )
 34   where second_of_pair <> 1
 35  /
 
ORDER# STATION LAG_CLOSE_ CLOSE_DATE
------ ------- ---------- ----------
 12345 RECV               07/01/2003
 12345 MACH1   07/01/2003 07/11/2003
 12345 INSP1   07/11/2003 07/12/2003
 12345 MACH1   07/12/2003 07/16/2003
 12345 MACH2   07/16/2003 07/30/2003
 12345 STOCK   07/30/2003 08/01/2003
 
6 rows selected.
<b>and discard the "second of pair" rows</b>
That is another way to do it (and an insight into how I develop analytic queries -- adding extra columns like that just to see visually what I want to do) 
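The collapse the lag/lead filtering accomplishes can also be sketched procedurally, which makes the logic easy to eyeball. A minimal Python emulation (data hypothetical, mirroring the example above): consecutive rows with the same station merge into one interval whose start date is the prior different station's close date.

```python
# Collapse consecutive rows with the same station into one interval:
# start_date = close_date of the prior (different-station) row,
# close_date = the last close_date of the run.
def collapse(rows):
    # rows: list of (station, close_date) already ordered by close_date
    out = []
    prev_close = None
    for station, close in rows:
        if out and out[-1][0] == station:
            out[-1] = (station, out[-1][1], close)   # extend the current run
        else:
            out.append((station, prev_close, close))  # start a new run
        prev_close = close
    return out

rows = [("RECV",  "07/01/2003"),
        ("MACH1", "07/02/2003"),
        ("MACH1", "07/05/2003"),
        ("MACH1", "07/11/2003"),
        ("INSP1", "07/12/2003"),
        ("MACH1", "07/16/2003"),
        ("MACH2", "07/30/2003"),
        ("STOCK", "08/01/2003")]
```

Running `collapse(rows)` reproduces the six-row result of the final query above.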
 
 
 
 
Another good book for the list -- please go ahead with this one too
Vijay Sehgal, October   07, 2003 - 8:58 am UTC
 
 
Best Regards,
Vijay Sehgal 
 
 
Very useful
Michael T., October   07, 2003 - 12:05 pm UTC
 
 
Excellent, as always! 
 
 
Can we reach to the end of the group?
Steve, December  15, 2003 - 11:28 am UTC
 
 
For example, say our analytic query returns the following result:
master_record    sub_record   nxt_record
95845433           25860032     95118740
95118740           25860032     95837497
95837497           25860032     
What I'd like to do is grab the final master_record, 95837497, and have that populated in the final column.  There could be 2, 3 or more in each group. 
 
December  15, 2003 - 3:45 pm UTC 
 
so the nxt_record of the last record should be the master_record of that row?
then just select
  nvl( lead(master_record) over (....), master_record ) nxt_record
when the lead is NULL, return the master_record of the current row 
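That nvl-around-lead pattern is simple to picture procedurally; a small Python sketch of the same idea (every row takes the next row's value, and the last row, where lead would be NULL, falls back to its own):

```python
# lead() with an nvl() fallback: each row gets the next row's value;
# the final row in the partition keeps its own value.
def lead_with_fallback(values):
    return [values[i + 1] if i + 1 < len(values) else values[i]
            for i in range(len(values))]
```

For the three master_record values in the question, this yields 95118740, 95837497, 95837497.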
 
 
 
Almost....
Steve, December  15, 2003 - 5:52 pm UTC
 
 
but I didn't explain it well enough.  What I'd like to see is a result set that looks like:
master_record    sub_record   nxt_record
95845433           25860032     95837497
95118740           25860032     95837497
95837497           25860032     95837497
The data comes from this:
table activity
cllocn        moddate
25860032      18/06/2003
95118740      26/08/2003
95837497      15/12/2003
95845433      19/08/2003
table ext_dedupe
master_cllocn    dupe_cllocn
25860032         95118740
25860032         95837497
25860032         95845433
My query is:
select *
  from ( select master_record, sub_record,
                lead(master_record) over (partition by sub_record
                                          order by lst_activity asc) nxt_activity
           from ( select *
                    from ( select case when dupelast_ackdate > last_ackdate then dupe_cllocn
                                       when last_ackdate > dupelast_ackdate then master_cllocn
                                       else master_cllocn
                                  end master_record,
                                  greatest(last_ackdate, dupelast_ackdate) lst_activity,
                                  case when dupelast_ackdate > last_ackdate then master_cllocn
                                       when last_ackdate > dupelast_ackdate then dupe_cllocn
                                       else dupe_cllocn
                                  end sub_record
                             from ( select master_cllocn,
                                           (select max(moddate) from activity a
                                             where a.cllocn = ed.master_cllocn) last_ackdate,
                                           dupe_cllocn,
                                           (select max(moddate) from activity a
                                             where a.cllocn = ed.dupe_cllocn) dupelast_ackdate
                                      from ext_dedupe ed ) ) ) )
Am I on the right track or is there a simpler way to this?
Thanks 
 
December  16, 2003 - 6:50 am UTC 
 
can you explain in "just text" how you got from your inputs to your outputs.
it is not clear (and i didn't feel like parsing that sql to reverse engineer what it does)
 
 
 
 
Is this what you are looking for ?
Venkat, December  15, 2003 - 6:44 pm UTC
 
 
select master, sub, moddate
       , min(master) keep (dense_rank first order by moddate) over (partition by sub) first_in_list
       , max(master) keep (dense_rank last order by moddate) over (partition by sub) last_in_list
  from (select master, sub, moddate from (
        select 95845433 master, 25860032 sub, to_date('19-aug-03','dd/mon/yy') moddate from dual union all
        select 95118740, 25860032, to_date('26-aug-03','dd/mon/yy') from dual union all
        select 95837497, 25860032, to_date('15-dec-03','dd/mon/yy') from dual))
MASTER      SUB        MODDATE     FIRST_IN_LIST  LAST_IN_LIST
--------    --------   ----------  -------------  ------------
95845433    25860032   8/19/2003   95845433       95837497
95118740    25860032   8/26/2003   95845433       95837497
95837497    25860032   12/15/2003  95845433       95837497
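The `keep (dense_rank first/last ...)` construct can be emulated outside SQL to check the expected answer; a hedged Python sketch (data from this post): each row in a partition is tagged with the master belonging to the earliest and latest moddate in that partition.

```python
# Emulates min(master) keep (dense_rank first order by moddate) over (partition by sub)
# and max(master) keep (dense_rank last ...): every row gets the master of the
# earliest / latest moddate within its sub partition.
def first_last_by_date(rows):
    # rows: list of (master, sub, moddate); moddate sortable as yyyy-mm-dd text
    groups = {}
    for master, sub, moddate in rows:
        groups.setdefault(sub, []).append((moddate, master))
    out = []
    for master, sub, moddate in rows:
        ordered = sorted(groups[sub])
        out.append((master, sub, ordered[0][1], ordered[-1][1]))
    return out

rows = [(95845433, 25860032, "2003-08-19"),
        (95118740, 25860032, "2003-08-26"),
        (95837497, 25860032, "2003-12-15")]
```

All three rows come back tagged first_in_list = 95845433, last_in_list = 95837497, matching the output above.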
 
 
 
Tom's Book 
umesh, December  16, 2003 - 4:13 am UTC
 
 
Tom
Do not announce it until you are finished with the book .. when you talk of a book, we can't wait until we have it here.
An analytics book -- that must be real good 
 
 
Is it possible to get the same result in standard edition ?
Ninoslav, December  16, 2003 - 4:21 am UTC
 
 
Hi Tom,
yes, analytic functions are great. However, we can use them only in the enterprise edition of the database. We have a few small customers that want only a standard edition.
So, is it possible in this question to get the same result without analytic functions ? 
It would be nice to have some kind of mapping between analytics and 'standard' queries. But that is probably impossible... 
 
December  16, 2003 - 7:27 am UTC 
 
Oracle 9iR2 and up -- analytics are a feature of standard edition.
there are things you can do in analytics that are quite simply NOT PRACTICAL in any sense without them.
 
 
 
 
ok
Steve, December  16, 2003 - 8:41 am UTC
 
 
I have two tables - activity and ext_dedupe.
table activity
cllocn        moddate
25860032      18/06/2003
95118740      26/08/2003
95837497      15/12/2003
95845433      19/08/2003
table ext_dedupe
master_cllocn    dupe_cllocn
25860032         95118740
25860032         95837497
25860032         95845433
Ext_dedupe is a table created by a third party app which has identified duplicate records within our database.  The first column is supposed to be the master and the second the duplicate.  The idea is to mark as archived all our duplicate records with a pointer to the master.  Notwithstanding the order of the columns, what we want to do is find out which record has the most recent activity (from the activity table) and archive off the others.
So, in this example although the master is listed as 25860032 against the other 3, an examination of the activity dates means I want to keep 95837497 and mark the others as archived and have a pointer on each of them to 95837497.  That's why I thought if I could get to the following result it would make it simpler.
master_record    sub_record   nxt_record
95845433           25860032     95837497
95118740           25860032     95837497
95837497           25860032     95837497
Hope that makes sense! 
 
December  16, 2003 - 11:33 am UTC 
 
oh, then nxt_record is just 
last_value(master_record) over (partition by sub_record order by moddate)
  
 
 
 
Why...
Steve, December  16, 2003 - 1:31 pm UTC
 
 
it didn't work for me.  I had to change it to
first_value(master_record) over (partition by sub_record order by moddate desc)
Is there a reason for that? 
 
December  16, 2003 - 2:00 pm UTC 
 
doh, the default window clause is "range between unbounded preceding and current row"
i would have needed a window clause that looks forwards rather than backwards (reason #1 why I should always set up a test case instead of just answering on the fly)
your solution of reversing the data works just fine. 
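The frame issue is easy to demonstrate outside SQL; a minimal Python sketch (not Oracle code, just the frame semantics): with the default frame the "last value" each row sees is itself, while a frame extended to the end of the partition sees the true last value.

```python
# Default frame: rows from the start of the partition up to the current row,
# so last_value() just returns the current row's own value.
def last_value_default_frame(values):
    return [values[:i + 1][-1] for i in range(len(values))]   # == values[i]

# Frame extended to "unbounded following": every row sees the partition's
# true last value -- equivalent to first_value() over the reversed order.
def last_value_full_partition(values):
    return [values[-1]] * len(values)
```

This is why `first_value(... order by moddate desc)` worked where the default-framed `last_value` did not.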
 
 
 
Another solution
A reader, December  16, 2003 - 4:03 pm UTC
 
 
The following gives the same result ...
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
       , max(cllocn) keep (dense_rank last order by moddate)
                     over (partition by nvl(master_cllocn,cllocn)) nxt_record
  from activity, ext_dedupe where cllocn = dupe_cllocn
MASTER_RECORD    SUB_RECORD    NXT_RECORD
95118740        25860032    95837497
95837497        25860032    95837497
95845433        25860032    95837497
 
 
December  16, 2003 - 5:44 pm UTC 
 
yes, there are many many ways to do this.
first_value
last_value
substring of max() without keep 
sure. 
 
 
 
A reader, December  16, 2003 - 4:15 pm UTC
 
 
Actually the nvl(master_cllocn...) is required only if you need all 4 rows in the output as follows (there is an outer join involved). If you need only the 3 rows as shown in the above post, there is no need for the nvl's....
select cllocn master_record, nvl(master_cllocn,cllocn) sub_record
       , max(cllocn) keep (dense_rank last order by moddate)
                     over (partition by nvl(master_cllocn,cllocn)) nxt_record
       , last_value(cllocn) over (partition by nvl(master_cllocn,cllocn) order by moddate) nxt
  from activity, ext_dedupe where cllocn = dupe_cllocn (+)
MASTER_RECORD    SUB_RECORD    NXT_RECORD
25860032        25860032    95837497
95118740        25860032    95837497
95837497        25860032    95837497
95845433        25860032    95837497 
 
 
still q's on analytics
A reader, January   30, 2004 - 10:13 am UTC
 
 
Okay, so my web application logs "web transaction" statistics to a table.  This actually amounts to 0 to many database transactions... but anyway.. I need to summarize (sum, min, max, count, average) each day's transaction times for each class (name2) and action (name3) and ultimately "archive" this data to a history table.  I am running 8.1.7 and am pretty new to analytics.
My table looks like this:
SQL> desc tran_stats
 Name                    Null?    Type
 ----------------------- -------- ----------------
 ID                      NOT NULL NUMBER(9)
 NAME1                            VARCHAR2(100)
 NAME2                            VARCHAR2(100)
 NAME3                            VARCHAR2(100)
 NAME4                            VARCHAR2(100)
 SEC                     NOT NULL NUMBER(9,3)
 TS_CR                   NOT NULL DATE
        ID NAME1 NAME2                     NAME3         SEC NAME4 TS_CR
---------- ----- ------------------------- ---------- ------ ----- ---------
     35947       /CM01_PersonManagement    CREATE       .484       15-JAN-04
     35987       /CM01_PersonManagement    CREATE       .031       15-JAN-04
     36086       /CM01_PersonManagement    EDIT         .312       16-JAN-04
     36555       /CM01_PersonManagement    CREATE       .297       19-JAN-04
     36623       /CM01_PersonManagement    EDIT         .375       19-JAN-04
     36627       /CM01_PersonManagement    CREATE       .047       19-JAN-04
     36756       /CM01_AddressManagement   CREATE       .375       20-JAN-04
     36766       /CM01_AddressManagement   CREATE       .305       20-JAN-04
     36757       /CM01_AddressManagement   INSERT       .391       20-JAN-04
     37178       /CM01_PersonManagement    EDIT         .203       20-JAN-04
and I need output like this:
TS_CR     NAME2                     NAME3       M_SUM  M_MIN  M_MAX M_COUNT  M_AVG
--------- ------------------------- ---------- ------ ------ ------ ------- ------
20-JAN-04 /CM01_AddressManagement   CREATE       .680   .305   .375       2   .340
20-JAN-04 /CM01_AddressManagement   INSERT       .391   .391   .391       1   .391
20-JAN-04 /CM01_PersonManagement    EDIT         .203   .203   .203       1   .203
19-JAN-04 /CM01_PersonManagement    CREATE       .344   .047   .297       2   .172
19-JAN-04 /CM01_PersonManagement    EDIT         .375   .375   .375       1   .375
16-JAN-04 /CM01_PersonManagement    EDIT         .312   .312   .312       1   .312
15-JAN-04 /CM01_PersonManagement    CREATE       .515   .031   .484       2   .258
This seems to work, but there has to be a better/cleaner/more efficient way to do this:
select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
from (
select  trunc(ts_cr) ts_cr,id, name2, name3, sum(sec) m_dummy
 , min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min
 , max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max
 , round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg
 , count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count
 , sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
        from tran_stats group by name2, name3,trunc(ts_cr),id
)n order by 1 desc, 2, 3;
Any help or pointers would be appreciated.  Thanks in advance.
 
 
 
January   30, 2004 - 10:31 am UTC 
 
why does there "have to be"?
what is "unclean" about this?  I could make it more verbose (and perhaps more readable) but this does exactly what you ask for?
It seems pretty "good", very "clean" and probably the most efficient method to get this result? 
 
 
 
Regarding the previous post ...
A reader, January   30, 2004 - 11:45 am UTC
 
 
Am I missing something or will the following do the same ..
select trunc(ts_cr) ts_cr, name2, name3,
       count(*) m_count, min(sec) m_min, max(sec) m_max,
       sum(sec) m_sum, avg(sec) m_avg
from tran_stats
group by trunc(ts_cr), name2, name3
order by 1 desc, 2, 3 
 
January   30, 2004 - 7:43 pm UTC 
 
with the supplied data -- since "group by trunc(ts_cr), name2, name3" happened to be unique -- yes.
In general -- no.  consider:
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
  2  from ( select trunc(ts_cr) ts_cr,
  3                id,
  4                            name2,
  5                            name3,
  6                            sum(sec) m_dummy ,
  7                            min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
  8                            max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
  9                            round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
 10                            count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
 11                            sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
 12               from tran_stats
 13                  group by name2, name3,trunc(ts_cr),id
 14        )n
 15  MINUS
 16  select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
 17  from (
 18  select trunc(ts_cr) ts_cr, name2, name3,
 19         count(*) m_count, min(sec) m_min, max(sec) m_max,
 20                    sum(sec) m_sum, avg(sec) m_avg
 21                            from tran_stats
 22                            group by trunc(ts_cr), name2, name3 )
 23  /
 
no rows selected
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> insert into tran_stats (id, name2, name3, sec, ts_cr)
  2  select 35947,'/CM01_PersonManagement','CREATE', .484  ,'15-JAN-04'
  3   from all_users where rownum <= 5;
 
5 rows created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select distinct ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
  2  from ( select trunc(ts_cr) ts_cr,
  3                id,
  4                            name2,
  5                            name3,
  6                            sum(sec) m_dummy ,
  7                            min(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_min ,
  8                            max(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_max ,
  9                            round(avg(sum(sec)) over(partition by name2,name3,trunc(ts_cr)),5) as m_avg ,
 10                            count(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_count ,
 11                            sum(sum(sec)) over(partition by name2,name3,trunc(ts_cr)) as m_sum
 12               from tran_stats
 13                  group by name2, name3,trunc(ts_cr),id
 14        )n
 15  MINUS
 16  select ts_cr, name2, name3, m_sum, m_min,m_max,m_count,m_avg
 17  from (
 18  select trunc(ts_cr) ts_cr, name2, name3,
 19         count(*) m_count, min(sec) m_min, max(sec) m_max,
 20                    sum(sec) m_sum, avg(sec) m_avg
 21                            from tran_stats
 22                            group by trunc(ts_cr), name2, name3 )
 23  /
 
TS_CR     NAME2                   NAME3         M_SUM      M_MIN      M_MAX    M_COUNT      M_AVG
--------- ----------------------- -------- ---------- ---------- ---------- ---------- ----------
15-JAN-04 /CM01_PersonManagement  CREATE        2.935       .031      2.904          2     1.4675
add more data and it won't be the same. 
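The difference is easy to reproduce away from the database; a hedged Python sketch of the two aggregation shapes (data mirroring the demo: id 35947 now contributes several rows): flat aggregation counts and averages raw rows, while the original query sums per id first and then aggregates the per-id sums.

```python
# Flat "group by day, name2, name3": stats over the raw rows.
def flat_stats(rows):
    secs = [sec for _id, sec in rows]
    return len(secs), round(sum(secs), 3), round(sum(secs) / len(secs), 4)

# The original query's shape: sum(sec) per id first, then
# count/sum/avg over the per-id sums within the partition.
def per_id_stats(rows):
    sums = {}
    for _id, sec in rows:
        sums[_id] = sums.get(_id, 0.0) + sec
    per_id = list(sums.values())
    return len(per_id), round(sum(per_id), 3), round(sum(per_id) / len(per_id), 4)

rows = [(35947, 0.484)] * 6 + [(35987, 0.031)]   # id 35947 repeated, as in the demo
```

The per-id shape gives count 2, sum 2.935, avg 1.4675 -- the row shown above -- while the flat shape counts 7 rows and averages .4193.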
 
 
 
 
OK
Siva, January   31, 2004 - 9:05 am UTC
 
 
Dear Tom,
Can analytics be used for the following forms of the same query?
 sql>select ename, nvl(ename,'Name is null') from emp
 sql>select ename, decode(ename, null, 'Name is null', ename) from emp
If you know other ways, please let me know.
Bye!
 
 
January   31, 2004 - 10:03 am UTC 
 
umm, why ?
 
 
 
 
with analytics
A reader, February  18, 2004 - 7:30 am UTC
 
 
with the following data
-- ------
1  val1_1
1  val1_2
1  val1_3
2  val2_1
2  val2_2
 
can i produce 
-- ------  --------------------
1  val1_1  val1_1,val1_2,val1_3
1  val1_2  val1_1,val1_2,val1_3 
1  val1_3  val1_1,val1_2,val1_3
2  val2_1  val2_1,val2_2
2  val2_2  val2_1,val2_2
with an analytic that rocks
 
 
February  18, 2004 - 8:47 pm UTC 
 
if
select max(count(*)) from t group by id
has a reasonable maximum -- yes, but it would be a trick lag/lead thing.
I would probably join using stragg.  join the details to the aggregate using inline views. 
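The stragg-style shape Tom describes -- build the per-key list once, then join it back to each detail row -- can be sketched in a few lines of Python (data from the question; stragg itself is Tom's user-defined string-aggregate, not shown here):

```python
# Build the comma-separated value list per key, then attach it to
# every detail row of that key (the "join details to aggregate" idea).
def with_group_list(rows):
    # rows: ordered list of (key, val)
    lists = {}
    for key, val in rows:
        lists.setdefault(key, []).append(val)
    return [(key, val, ",".join(lists[key])) for key, val in rows]

rows = [(1, "val1_1"), (1, "val1_2"), (1, "val1_3"),
        (2, "val2_1"), (2, "val2_2")]
```

Each row of key 1 comes back paired with "val1_1,val1_2,val1_3", each row of key 2 with "val2_1,val2_2".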
 
 
 
OK
Siddiq, March     01, 2004 - 9:26 am UTC
 
 
Hi Tom,
What can be the business use cases of the analytic functions
1)cume_dist
2)percentile_disc
3)percentile_cont
Where can they be of immense use?
Bye! 
 
March     01, 2004 - 10:17 am UTC 
 
they are just statistical functions for analysis.
2 and 3 are really variations on each other (disc=discrete, cont=continuous) and would be used to compute percentiles (like you might see on an SAT test report from back in high school).  percentile_* can be used to find a median for example :)
cume_dist is a variation on that.  I'll cheat on an example, from the doc:
Analytic Example 
The following example calculates the salary percentile for each employee in the purchasing area. For example, 40% of clerks have salaries less than or equal to Himuro's. 
SELECT job_id, last_name, salary, CUME_DIST() OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist FROM employees WHERE job_id LIKE 'PU%'; 
JOB_ID     LAST_NAME                 SALARY     CUME_DIST 
---------- ------------------------- ---------- ---------- 
PU_CLERK   Colmenares                2500        .2 
PU_CLERK   Himuro                    2600        .4 
PU_CLERK   Tobias                    2800        .6 
PU_CLERK   Baida                     2900        .8 
PU_CLERK   Khoo                      3100         1 
PU_MAN     Raphaely                  11000        1 
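The definition behind that output is simple: for each row, cume_dist is the fraction of rows in its partition whose order-by value is less than or equal to the current value. A minimal Python sketch of that definition (salaries from the doc example):

```python
# cume_dist(): fraction of partition rows with value <= the current row's value.
def cume_dist(values):
    n = len(values)
    return [sum(1 for v in values if v <= x) / n for x in values]
```

For the five PU_CLERK salaries, `cume_dist([2500, 2600, 2800, 2900, 3100])` gives .2, .4, .6, .8, 1 -- the column shown above.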
 
 
 
Stumped on Analytics
Dave Thompson, March     04, 2004 - 9:56 am UTC
 
 
Hi Tom,
I have the following two tables:
CREATE TABLE PAY_M
(
  PAY_ID  NUMBER,
  PAYMENT  NUMBER
)
--
--
CREATE TABLE PREM
(
  PREM_ID       NUMBER,
  PREM_PAYMENT  NUMBER
)
With the following data:
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES ( 
1, 100); 
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES ( 
2, 50); 
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES ( 
3, 50); 
INSERT INTO PREM ( PREM_ID, PREM_PAYMENT ) VALUES ( 
4, 50); 
COMMIT;
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES ( 
1, 50); 
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES ( 
2, 25); 
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES ( 
3, 50); 
INSERT INTO PAY_M ( PAY_ID, PAYMENT ) VALUES ( 
4, 50); 
COMMIT;
PAY_M contains payments made against the premiums in the table prem.
Payments:
    PAY_ID    PAYMENT
---------- ----------
         1         50
         2         25
         3         50
         4         50
Prem:
   PREM_ID PREM_PAYMENT
---------- ------------
         1          100
         2           50
         3           50
         4           50
We are trying to find which payment Ids paid each premium payment in Prem.  The payments are assigned sequentially to the premiums.
For example payments 1,2 & 3 pay off the £100 in premium 1 leaving £25.  Then the remaining payment from payment 3 & payment 4 pay off premium 2 leaving a balance of £25, and so on.
We are trying to create a query that will use the analytical functions to find all the payment IDs that pay off the associated premium ids.  We want to keep this SQL based as we need to Process about 30 million payments!
Thanks.
Great website, hope you enjoyed your recent visit to the UK. 
 
March     04, 2004 - 1:52 pm UTC 
 
let me make sure I have this straight -- you want to 
o sum up the first 3 records in payments
o discover they are 125 which exceeds 100
o output the fact that prem_id 1 is paid for by pay_id 1..3
o carry forward 25 from 3, discover that leftover 3+4 = 75 pays for prem_id 2 
  with 25 extra
while I believe (not sure) that the 10g MODEL clause might be able to do this (if you can do it in a spreadsheet, we can use the MODEL clause to do it).....
I'm pretty certain that analytics cannot -- we would need to recursively use lag (eg: after finding that 1,2,3 pay off 1, we'd need to -- well, it's hard to explain...)
I cannot see analytics doing this -- future rows depend on functions of the analytics from past rows and that is just "not allowed".
I can see how to do this in a pipelined PLSQL function -- will that work for you? 
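The sequential, carry-forward allocation Tom describes is straightforward in procedural code; a hedged Python sketch of what a pipelined-function body might compute (not PL/SQL, and the £ example's IDs are from the question): walk payments in order, apply each to the current premium until it is covered, and carry any remainder forward.

```python
# Apply payments to premiums sequentially, carrying leftovers forward.
# payments: [(pay_id, amount)], premiums: [(prem_id, amount)], both in order.
def allocate(payments, premiums):
    result = []                      # (prem_id, [pay_ids that contributed])
    pay_iter = iter(payments)
    pay_id, avail = None, 0
    for prem_id, due in premiums:
        contributors = []
        while due > 0:
            if avail == 0:
                try:
                    pay_id, avail = next(pay_iter)
                except StopIteration:
                    return result    # out of payments; the rest stay unpaid
            used = min(avail, due)   # spend what we can of this payment
            due -= used
            avail -= used            # leftover carries to the next premium
            contributors.append(pay_id)
        result.append((prem_id, contributors))
    return result
```

With the thread's data -- payments 50, 25, 50, 50 against premiums 100, 50, ... -- premium 1 is paid by payments 1..3 and premium 2 by the remainder of 3 plus part of 4, exactly as described. Note that future rows here really do depend on computed results of past rows, which is why plain analytics can't express it.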
 
 
 
Oops - Error in previous post
Dave Thompson, March     04, 2004 - 10:17 am UTC
 
 
Tom,
Sorry, ignore the above tables as they are missing the joining column:
CREATE TABLE PAY_M
(
  PREM_ID  NUMBER,
  PAY_ID   NUMBER,
  PAYMENT  NUMBER
)
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 1, 50); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 2, 25); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 3, 50); 
INSERT INTO PAY_M ( PREM_ID, PAY_ID, PAYMENT ) VALUES ( 
1, 4, 50); 
COMMIT;
CREATE TABLE PREM
(
  PREM_ID       NUMBER,
  PAY_ID        NUMBER,
  PREM_PAYMENT  NUMBER
)
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 1, 100); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 2, 50); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 3, 50); 
INSERT INTO PREM ( PREM_ID, PAY_ID, PREM_PAYMENT ) VALUES ( 
1, 4, 50); 
COMMIT;
SQL> l
  1  SELECT *
  2* FROM   PAY_M
SQL> /
   PREM_ID     PAY_ID    PAYMENT
---------- ---------- ----------
         1          1         50
         1          2         25
         1          3         50
         1          4         50
SQL> select *
  2  from prem;
   PREM_ID     PAY_ID PREM_PAYMENT
---------- ---------- ------------
         1          1          100
         1          2           50
         1          3           50
         1          4           50
 
 
 
 
Thanks.....
Dave Thompson, March     05, 2004 - 4:23 am UTC
 
 
Tom,
Thanks for your prompt response.
I am familiar with Pipeline functions.
I was, however, hoping we could do this as a set-based operation because of the volume of data involved.
Thanks for your time. 
 
 
analytics book
Ron Chennells, March     05, 2004 - 5:52 am UTC
 
 
Just another vote and pre order for the analytics book 
 
 
OK
Gerhard, March     19, 2004 - 12:33 am UTC
 
 
Dear Tom,
 I used the following query to find the difference of salaries between employees.
SQL> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal from emp;
ENAME             SAL   DIFF_SAL                                                
---------- ---------- ----------                                                
SMITH             800                                                           
JAMES             950        150                                                
ADAMS            1100        150                                                
WARD             1250        150                                                
MARTIN           1250          0                                                
MILLER           1300         50                                                
TURNER           1500        200                                                
ALLEN            1600        100                                                
CLARK            2450        850                                                
BLAKE            2850        400                                                
JONES            2975        125                                                
ENAME             SAL   DIFF_SAL                                                
---------- ---------- ----------                                                
SCOTT            3000         25                                                
FORD             3000          0                                                
KING             5000       2000                                                
14 rows selected.
My Question is:
" What is the difference between King's sal with other 
employees?".Could you please help with the query?
Bye! 
 
 
March     19, 2004 - 8:58 am UTC 
 
scott@ORA9IR2> select ename,sal,sal-lag(sal) over(order by sal) as diff_sal ,
  2         sal-king_sal king_sal_diff
  3    from (select sal king_sal from emp where ename = 'KING'),
  4         emp
  5  /
 
ENAME             SAL   DIFF_SAL KING_SAL_DIFF
---------- ---------- ---------- -------------
SMITH             800                    -4200
JAMES             950        150         -4050
ADAMS            1100        150         -3900
WARD             1250        150         -3750
MARTIN           1250          0         -3750
MILLER           1300         50         -3700
TURNER           1500        200         -3500
ALLEN            1600        100         -3400
CLARK            2450        850         -2550
BLAKE            2850        400         -2150
JONES            2975        125         -2025
SCOTT            3000         25         -2000
FORD             3000          0         -2000
KING             5000       2000             0
 
14 rows selected.
 
 
 
 
Will this be faster?
Venkat, March     19, 2004 - 4:20 pm UTC
 
 
select ename, sal, 
       sal-lag(sal) over(order by sal) as diff_sal,
       sal - max(case when ename='KING' then sal
                 else null end) over () king_sal_diff
from emp
 
 
March     20, 2004 - 9:47 am UTC 
 
when you benchmarked it and tested it to scale, what did you see?  it would be interesting no? 
 
 
 
lead/lag on different dataset
Stalin, May       03, 2004 - 9:22 pm UTC
 
 
Hi Tom, 
I've a similar requirement but I'm not sure how to use lead or lag to refer to a different dataset.
E.g. the logs table has both login and logout information, identified by the action column. There could be different login/logout modes, so records with action in (1,2) are login records and those with action in (3,4,5,6,7) are logout records. Now I need to find signon and signoff times and also the session duration in mins.
here is some sample data of logs table :
LOG_ID LOG_CREATION_DATE       USER_ID    SERVICE     ACTION 
---------- -------------------  ---------- ---------- ---------- 
         1 04/29/2004 10:48:36           3          5          2 
         3 04/29/2004 10:53:44           3          5          3 
         5 04/29/2004 11:11:35           3          5          1 
      1003 05/03/2004 15:18:53           3          5          5 
      1004 05/03/2004 15:19:50           8          5          1 
here is a query I came up with (not exactly what I want):
select log_id signon_id, lead(log_id, 1) over (partition by account_id, user_id, mac order by log_id) signoff_id,
       user_id, log_creation_date signon_date,
       lead(log_creation_date, 1) over (partition by account_id, user_id, mac order by log_creation_date) signoff_date,
       nvl(round(((lead(log_creation_date, 1)
           over (partition by account_id, user_id order by log_creation_date)-log_creation_date)*1440), 2), 0)  Usage_Mins
from   logs
where  account_id = 'Robert'
and    service = 5
order  by user_id
desired output :
 SIGNON_ID SIGNOFF_ID     USER_ID SIGNON_DATE         SIGNOFF_DATE        USAGE_MINS
---------- ----------  ---------- ------------------- ------------------- ----------
         1          3           3 04/29/2004 10:48:36 04/29/2004 10:53:44       5.13
         5       1003           3 04/29/2004 11:11:35 05/03/2004 15:18:53     6007.3
      1004                      8 05/03/2004 15:19:50                              0
Thanks in Advance,
Stalin
 
 
May       04, 2004 - 7:11 am UTC 
 
maybe if you supply simple create table and insert ... values ... statements for me.... this stuff would go faster.
Your query references columns that are not in the example as well. 
 
 
 
Create table scripts
Stalin, May       04, 2004 - 1:29 pm UTC
 
 
Sorry for not giving this info in the first place.
Here are the scripts....
create table logs (log_id number, log_creation_date date, account_id varchar2(25), user_id number,
service number, action number, mac varchar2(50))
/
insert into logs values (1, to_date('04/29/2004 10:48:36', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 2, '00-00-00-00')
/
insert into logs values (3, to_date('04/29/2004 10:53:44', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 3, '00-00-00-00')
/
insert into logs values (5, to_date('04/29/2004 11:11:35', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 1, '00-00-00-00')
/
insert into logs values (1003, to_date('05/03/2004 15:18:53', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 3, 5, 5, '00-00-00-00')
/
insert into logs values (1004, to_date('05/03/2004 15:19:50', 'MM/DD/YYYY HH24:MI:SS'), 'Robert', 8, 5, 1, '00-00-00-00')
/
The reason for including mac in the partition group is that users can log in from multiple PCs without logging out, hence I grouped on account_id, user_id and mac.
Thanks,
Stalin 
 
May       04, 2004 - 2:38 pm UTC 
 
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 ) minutes
  2    from (
  3  select log_id,
  4         case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
  5              then lead(log_id) over (partition by account_id, user_id, mac order by log_creation_date)
  6          end signoff_id,
  7         user_id,
  8         log_creation_date signon_date,
  9         case when action in (1,2) and lead(action) over (partition by account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
 10              then lead(log_creation_date) over (partition by account_id, user_id, mac order by log_creation_date)
 11          end signoff_date,
 12                  action
 13  from   logs
 14  where  account_id = 'Robert'
 15  and    service = 5
 16  order  by user_id
 17         ) a
 18   where action in (1,2)
 19  /
 
    LOG_ID SIGNOFF_ID    USER_ID SIGNON_DATE         SIGNOFF_DATE            ACTION    MINUTES
---------- ---------- ---------- ------------------- ------------------- ---------- ----------
         1          3          3 04/29/2004 10:48:36 04/29/2004 10:53:44          2       5.13
         5       1003          3 04/29/2004 11:11:35 05/03/2004 15:18:53          1     6007.3
      1004                     8 05/03/2004 15:19:50                              1
 
 
 
 
 
Excellent
Stalin, May       04, 2004 - 3:42 pm UTC
 
 
This is exactly what I'm looking for. 
Thanks so much! 
 
 
Help On SQL
VKOUL, May       04, 2004 - 8:05 pm UTC
 
 
I want to replace each NULL value in a column with the most recent preceding non-NULL value, e.g.
If I have records like the following
year  month  column_value
----- ------ --------------------
2002   06    55
2002   06    57
2002   07    NULL
2002   08    NULL
2002   09    NULL
2002   10    100
2002   11    101
I want the results as below
year  month  column_value
----- ------ --------------------
2002   06    55
2002   06    57
2002   07    57                      ------> Repeated
2002   08    57                      ------> Repeated
2002   09    57                      ------> Repeated
2002   10    100
2002   11    101
 
 
May       04, 2004 - 9:08 pm UTC 
 
create table, 
insert into table
much appreciated......... (so i don't spend days of my life making create tables and insert into statements.  I've added this request to all pages where you can input stuff and I'll just be asking for it from now on in......  Not picking on you, just reminding everyone that i need a script like I provide.....)
but..... asked and answered:
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:10286792840956
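For reference, on 10g and later the same carry-down can also be written with LAST_VALUE ... IGNORE NULLS; a sketch, assuming a hypothetical table t(yr, mon, val) holding the data above:

```sql
-- IGNORE NULLS on LAST_VALUE requires 10g or later
select yr, mon,
       nvl(val,                                  -- keep real values as-is
           last_value(val ignore nulls)          -- else carry the last non-NULL down
               over (order by yr, mon
                     rows between unbounded preceding
                              and 1 preceding)) val_filled
  from t
 order by yr, mon
/
```

The ROWS window deliberately excludes the current row, so a NULL picks up the most recent prior non-NULL value and the two tied rows for 2002/06 keep their own values.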
 
 
Help On SQL
VKoul, May       04, 2004 - 11:27 pm UTC
 
 
Beautiful !!!
I'll keep in mind "create table etc."
Thanks
VKoul 
 
 
analytic q
A reader, May       11, 2004 - 6:38 pm UTC
 
 
Tom
Please look at the following schema and data.
---------
spool schema
set echo on
drop table host_instances;
drop table rac_instances;
drop table instance_tablespaces;
create table host_instances
(
  host_name varchar2(50),
  instance_name varchar2(50)
);
create table rac_instances
(
  rac_name varchar2(50),
  instance_name varchar2(50)
);
create table instance_tablespaces
(
  instance_name varchar2(50),
  tablespace_name varchar2(50),
  tablespace_size number
);
-- host to instance mapping data
insert into host_instances values ( 'h1', 'i1' );
insert into host_instances values ( 'h2', 'i2' );
insert into host_instances values ( 'h3', 'i3' );
insert into host_instances values ( 'h4', 'i4' );
insert into host_instances values ( 'h5', 'i5' );
-- rac to instance mapping data
insert into rac_instances values ( 'rac1', 'i1' );
insert into rac_instances values ( 'rac1', 'i2' );
insert into rac_instances values ( 'rac2', 'i3' );
insert into rac_instances values ( 'rac2', 'i4' );
--- instance to tablespace mapping data
insert into instance_tablespaces values( 'i1', 't11', 100 );
insert into instance_tablespaces values( 'i1', 't12', 200 );
insert into instance_tablespaces values( 'i2', 't11', 100 );
insert into instance_tablespaces values( 'i2', 't12', 200 );
insert into instance_tablespaces values( 'i3', 't31', 500 );
insert into instance_tablespaces values( 'i3', 't32', 300 );
insert into instance_tablespaces values( 'i4', 't31', 500 );
insert into instance_tablespaces values( 'i4', 't32', 300 );
insert into instance_tablespaces values( 'i5', 't51', 400 );
commit;
---------
What I need is to sum up all tablespaces of all instances
for a list of hosts. However, if two hosts in the list
belong to a RAC then I should only pick one of the
hosts (I can pick any one of them.)
e.g. in the above data I should only pick i1 or i2 not
both since they both belong to the same RAC 'rac1'.
Following is the select I came up with for the above data.
Let me know if you have any comments on it.
Any other alternative solutions you can think of would
also be educating to me. I have not benchmarked this
select yet. The number of hosts could reach up to 2000
approximately. On an average we can assume each will have
one instance - some of these will be RACs.
Thank you!
-----------
scott@ora10g> set echo on
scott@ora10g> column host_name format a10
scott@ora10g> column instance_name format a10
scott@ora10g> column rac_name format a10
scott@ora10g> column row_number format 999
scott@ora10g> 
scott@ora10g> select a.instance_name, sum( tablespace_size )
  2  from
  3  (
  4    select instance_name
  5    from
  6    (
  7       select host_name, instance_name, rac_name,
  8         row_number() over
  9           (
 10             partition by rac_name
 11             order by rac_name, instance_name
 12           ) row_number
 13       from
 14       (
 15         select hi.host_name, hi.instance_name, ri.rac_name
 16         from host_instances hi, rac_instances ri
 17         where hi.instance_name = ri.instance_name(+)
 18       )
 19    )
 20    where row_number <= 1
 21  ) a, instance_tablespaces e
 22  where a.instance_name = e.instance_name
 23  group by a.instance_name;
INSTANCE_N SUM(TABLESPACE_SIZE)
---------- --------------------
i1                          300
i3                          800
i5                          400
---
Also do you prefer the .sql file (as above) or
the spooled output of schema.sql (i.e. schema.lst.) 
The above is more convenient to reproduce - but the spooled output makes for better reading in some cases.
 
 
May       11, 2004 - 9:15 pm UTC 
 
I like the cut and paste from sqlplus truth be told.
sure, I have to do two vi commands and a couple of deletes to fix it up but.... I'm fairly certain that the poster *actually ran the commands successfully!* which is most relevant to me....
Besides, I do it to you ;)
ops$tkyte@ORA9IR2> select *
  2    from (
  3  select h.host_name, h.instance_name, r.rac_name, sum(t.tablespace_size),
  4         row_number() over (partition by r.rac_name order by h.host_name ) rn
  5    from host_instances h,
  6             rac_instances r,
  7             instance_tablespaces t
  8   where h.instance_name = r.instance_name(+)
  9     and h.instance_name = t.instance_name
 10   group by h.host_name, h.instance_name, r.rac_name
 11         )
 12   where rn = 1
 13  /
 
HO IN RAC_N SUM(T.TABLESPACE_SIZE)         RN
-- -- ----- ---------------------- ----------
h1 i1 rac1                     300          1
h3 i3 rac2                     800          1
h5 i5                          400          1
is the first thing that popped into my head.
with just a couple hundred rows -- any of them will perform better than good enough. 
 
 
 
 
thanx!
A reader, May       11, 2004 - 9:54 pm UTC
 
 
"I like the cut and paste from sqlplus truth be told."
Actually I was going to post exactly that - but your example at the point of posting led me to believe that you wanted straight SQL - maybe you want to fix that (not that many people seem to care anyway! :))
Thanks for the SQL - it looks good and a tad simpler than the one I wrote...
 
 
 
How to compute this running total (sort of...)
Kishan, May       18, 2004 - 11:33 am UTC
 
 
create table investment (
 investment_id number,
 asset_id number,
 agreement_id number,
 constraint pk_i primary key (investment_id)
)
/
create table period (
 period_id number,
 business_domain  varchar2(10),
 status_code      varchar2(10),
 constraint pk_p primary key (period_id)
)
/
create table entry (
 entry_id number,
 period_id number,
 investment_id number,
 constraint pk_e primary key(entry_id),
 constraint fk_e_period     foreign key(period_id) references period(period_id),
 constraint fk_e_investment foreign key (investment_id) references investment(investment_id)
)
/
create table entry_detail(
 entry_id number,
 account_type varchar2(10),
 amount      number,
 constraint pk_ed primary key(entry_id, account_type),
 constraint fk_ed_entry foreign key(entry_id) references entry(entry_id)
)
/
insert into period (period_id, business_domain, status_code)
SELECT  rownum     AS period_id,
        'BDG'      AS business_domain,
        '2'        AS status_code
from all_objects where rownum <= 5
/
insert into investment(investment_id, asset_id, agreement_id)
select rownum+10   AS investment_id,
       rownum+100  AS asset_id,
       rownum+1000 AS agreement_id
from all_objects where rownum <=5
/
insert into entry(entry_id, period_id, investment_id) values (1, 1, 11)
/
insert into entry(entry_id, period_id, investment_id) values (2, 2, 11)
/
insert into entry(entry_id, period_id, investment_id) values (3, 3, 11)
/
insert into entry(entry_id, period_id, investment_id) values (4, 3, 13)
/
insert into entry(entry_id, period_id, investment_id) values (5, 4, 13)
/
insert into entry(entry_id, period_id, investment_id) values (6, 4, 14)
/
insert into entry(entry_id, period_id, investment_id) values (7, 5, 14)
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC1', 1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC2', -200 )
/
insert into entry_detail(entry_id, account_type, amount) values(1, 'AC3', 300 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(2, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(3, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(4, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(5, 'AC2', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC1', 200 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC4', -1000 )
/
insert into entry_detail(entry_id, account_type, amount) values(6, 'AC2', -500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC1', 2200 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC3', 500 )
/
insert into entry_detail(entry_id, account_type, amount) values(7, 'AC4', 1200 )
/
scott@LDB.US.ORACLE.COM> select * from period;
 PERIOD_ID BUSINESS_D STATUS_COD
---------- ---------- ----------
         1 BDG        2
         2 BDG        2
         3 BDG        2
         4 BDG        2
         5 BDG        2
scott@LDB.US.ORACLE.COM> select * from investment;
INVESTMENT_ID   ASSET_ID AGREEMENT_ID
------------- ---------- ------------
           11        101         1001
           12        102         1002
           13        103         1003
           14        104         1004
           15        105         1005
scott@LDB.US.ORACLE.COM> select * from entry;
  ENTRY_ID  PERIOD_ID INVESTMENT_ID
---------- ---------- -------------
         1          1            11
         2          2            11
         3          3            11
         4          3            13
         5          4            13
         6          4            14
         7          5            14
7 rows selected.
scott@LDB.US.ORACLE.COM> select * from entry_detail;
  ENTRY_ID ACCOUNT_TY     AMOUNT
---------- ---------- ----------
         1 AC1              1000
         1 AC2              -200
         1 AC3               300
         2 AC1               200
         2 AC4             -1000
         2 AC2              -500
         3 AC2              2200
         3 AC1               200
         4 AC4             -1000
         4 AC2              -500
         5 AC2              2200
         6 AC1               200
         6 AC4             -1000
         6 AC2              -500
         7 AC1              2200
         7 AC3               500
         7 AC4              1200
17 rows selected.
The resultant view needed is given below.
To give an example from the result below, the first entry for investment_id 14
is from period 4. The account types entered on period 4 are AC1, AC4, AC2. We
need these three account types in all subsequent periods. Also, on period 5 a
new account type AC3 is added. So, if there is another period, say period_id 6, we need
information for AC1, AC2, AC3, AC4 (that's 4 account types). If there's no entry
for any of these account_types for any subseqent periods, the amount_for_period for such
periods are considered to be 0.00 and the balance will be sum(amount_for_period)
until that period.
PERIOD_ID INVESTMENT_ID ACCOUNT_TYPE AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
--------- ------------- ------------ ----------------- -------------------
1                    11       AC1                 1000              1000
1                    11       AC2                 -200              -200
1                    11       AC3                  300               300
2                    11       AC1                  200              1200
2                    11       AC2                 -500              -700
2                    11       AC3                    0               300
2                    11       AC4                -1000             -1000
3                    11       AC1                  200              1400
3                    11       AC2                  200              -500
3                    11       AC3                    0               300
3                    11       AC4                    0              1000
4                    11       AC1                    0              1400
4                    11       AC2                    0              -500
4                    11       AC3                    0               300
4                    11       AC4                    0              1000
5                    11       AC1                    0              1400
5                    11       AC2                    0              -500
5                    11       AC3                    0               300
5                    11       AC4                    0              1000
3                    13       AC4                -1000             -1000
3                    13       AC2                 -500              -500
4                    13       AC4                    0             -1000
4                    13       AC2                 -500             -1000
5                    13       AC4                    0             -1000
5                    13       AC4                    0             -1000
4                    14       AC1                  200               200
4                    14       AC4                -1000             -1000
4                    14       AC2                 -500              -500
5                    14       AC1                 2200              2400
5                    14       AC3                  500               500
5                    14       AC4                 1200               200
5                    14       AC2                    0              -500
The blank lines in between are just for clarity. As always, grateful for all your efforts.
Regards,
Kishan.
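On 10g, one way to manufacture the missing period/account_type rows before doing the running total is data densification with a partitioned outer join. A sketch against the tables above (this is not Tom's answer; the first_period filter is my assumption that each account_type's history starts at its first real entry):

```sql
select period_id, investment_id, account_type,
       amount_for_period, balance_till_period
  from (
    select p.period_id, d.investment_id, d.account_type,
           nvl(d.amount, 0) amount_for_period,
           -- running total per investment/account_type across densified periods
           sum(nvl(d.amount, 0))
               over (partition by d.investment_id, d.account_type
                     order by p.period_id) balance_till_period,
           -- first period in which this investment/account_type really appears
           min(case when d.period_id is not null then p.period_id end)
               over (partition by d.investment_id, d.account_type) first_period
      from (select e.period_id, e.investment_id, ed.account_type, ed.amount
              from entry e join entry_detail ed
                on e.entry_id = ed.entry_id) d
           partition by (d.investment_id, d.account_type)
           right outer join period p
                on (d.period_id = p.period_id)
       )
 where period_id >= first_period
 order by investment_id, period_id, account_type
/
```

The partitioned outer join pads every (investment_id, account_type) partition out to all five periods; the outer filter then trims away the periods before the first actual entry.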
 
 
May       18, 2004 - 6:14 pm UTC 
 
so, what does your first try look like :)  at least get the join written up for the details - maybe the running total will be obvious from that. 
 
 
 
This is how far I went...and no further
Kishan, May       19, 2004 - 10:18 am UTC
 
 
select distinct period_id,
                investment_id,
                account_type,
                amount_for_period,
                balance_till_period
from (  select  period.period_id,
                entry.investment_id,
                entry_detail.account_type,
                  (case when entry.period_id = period.period_id then entry_detail.amount else 0 end) amount_for_period,
                   sum(amount) over(partition by period.period_id, investment_id, account_type) balance_till_period
        from    period left outer join (entry join entry_detail on (entry.entry_id = entry_detail.entry_id)) on (entry.period_id <= period.period_id))
order by investment_id
The result looks as below:
 PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
         1            11 AC1                     1000                1000
         1            11 AC2                     -200                -200
         1            11 AC3                      300                 300
         2            11 AC1                        0                1200
         2            11 AC1                      200                1200
         2            11 AC2                     -500                -700
         2            11 AC2                        0                -700
         2            11 AC3                        0                 300
         2            11 AC4                    -1000               -1000
         3            11 AC1                        0                1400
         3            11 AC1                      200                1400
         3            11 AC2                        0                1500
         3            11 AC2                     2200                1500
         3            11 AC3                        0                 300
         3            11 AC4                        0               -1000
         4            11 AC1                        0                1400
         4            11 AC2                        0                1500
         4            11 AC3                        0                 300
         4            11 AC4                        0               -1000
         5            11 AC1                        0                1400
         5            11 AC2                        0                1500
         5            11 AC3                        0                 300
         5            11 AC4                        0               -1000
         3            13 AC2                     -500                -500
         3            13 AC4                    -1000               -1000
         4            13 AC2                        0                1700
         4            13 AC2                     2200                1700
         4            13 AC4                        0               -1000
         5            13 AC2                        0                1700
         5            13 AC4                        0               -1000
         4            14 AC1                      200                 200
         4            14 AC2                     -500                -500
         4            14 AC4                    -1000               -1000
         5            14 AC1                        0                2400
         5            14 AC1                     2200                2400
         5            14 AC2                        0                -500
         5            14 AC3                      500                 500
         5            14 AC4                        0                 200
         5            14 AC4                     1200                 200
First, I am sorry my originally constructed result (by hand..;) misses a couple of rows.
However, other than that, I am unable to remove the redundant rows that show up for a particular investment and account_type in a period; the logic beats me.
Basically, I need to remove rows where the amount_for_period is 0 for an account_type, but only when it is a redundant row for that set. That is, the first rows of periods 2 and 3 are redundant, but the rows for period 4 are not.
 
Could you help me out?
Regards,
Kishan. 
 
May       19, 2004 - 11:06 am UTC 
 
are we missing some more order bys?  I mean -- what if:
         3            11 AC1                        0                1400
         3            11 AC1                      200                1400
         3            11 AC2                        0                1500
         3            11 AC2                     2200                1500
         3            11 AC3                        0                 300
         3            11 AC4                        0               -1000
was really:
         3            11 AC1                      200                1400
         3            11 AC2                        0                1500
         3            11 AC2                     2200                1500
         3            11 AC3                        0                 300
         3            11 AC4                        0               -1000
         3            11 AC1                        0                1400
would that still be redundant?  missing something here? 
 
 
 
Yes...they are redundant
A reader, May       19, 2004 - 12:16 pm UTC
 
 
Tom:
Yes, for that particular set, those rows are redundant, no matter what the order is. 
Regards,
Kishan. 
 
May       19, 2004 - 2:24 pm UTC 
 
ok, so what is the "key" of that result set?  what can we partition the result set by.
my idea will be to use your query in an inline view and analytics on that to weed out what you want. 
 
 
 
Kishan, May       19, 2004 - 3:08 pm UTC
 
 
The key would be period_id, investment_id and account_type. Basically, what the result represents is the amount and the balance-to-date for a particular account_type of an investment_id for a period. 
Eg: Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
If there's no activity on that investment and account_type for the next period, say Period 2, the amount will be 0 for that period, and the balance will be previous period's balance. 
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=0->Balance = 1000
But, if there's an activity on that account_type for that investment, then the amount will be the amount for that period and balance will be the sum of previous balance and current amount. Say for Period 2, the amount is 500, then
Period 1->Investment 1->Account_Type AC1->Amount=1000-> Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500-> Balance=1500
And if there's a new account type entry, say AC2 and amount, say 2000 created for period 2, then the result set will be
Period 1->Investment 1->Account_Type AC1->Amount=1000->Balance=1000
Period 2->Investment 1->Account_Type AC1->Amount=500->Balance=1500
Period 2->Investment 1->Account_Type AC2->Amount=2000->Balance=2000
There may be many investments per period and many account_types per investment. Hope I am clear....
Regards,
Kishan.
 
 
May       19, 2004 - 5:34 pm UTC 
 
so... if you have:
 PERIOD_ID INVESTMENT_ID ACCOUNT_TY AMOUNT_FOR_PERIOD BALANCE_TILL_PERIOD
---------- ------------- ---------- ----------------- -------------------
         1            11 AC1                     1000                1000
         1            11 AC2                     -200                -200
         1            11 AC3                      300                 300
         2            11 AC1                        0                1200
         2            11 AC1                      200                1200
         2            11 AC2                     -500                -700
         2            11 AC2                        0                -700
         2            11 AC3                        0                 300
         2            11 AC4                    -1000               -1000
you see though, why isn't the 4th line here "redundant" then? 
 
 
 
But it is redundant..
Kishan, May       19, 2004 - 11:51 pm UTC
 
 
Tom, I am assuming the 4th line you mention is  2->11->AC2->0->-700. Yes, it is redundant. 
We need the amount and balance for every period_id, investment_id and account_type. One line per period_id, investment_id and account_type; anything more is redundant.  
Issue is, there may not be entries for a specific account_type of an investment for a particular period. In such cases, we need to assume amount for such periods are 0 and compute the balances accordingly.
Regards,
Kishan 
 
May       20, 2004 - 10:55 am UTC 
 
so, if you partition by 
 PERIOD_ID INVESTMENT_ID ACCOUNT_TY BALANCE_TILL_PERIOD
order by 
AMOUNT_FOR_PERIOD 
select a.*, lead(amount_for_period) over (partition by .... order by ... ) nxt
  from (YOUR_QUERY)
you can then
select *
  from (that_query)
 where nxt is NULL or (nxt is not null and amount_for_period <> 0)
if nxt is null -- last row in the partition, keep it.
if nxt is not null AND we are zero -- remove it.
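Assembled into one statement, with the inline view standing for Kishan's query from the previous review, Tom's sketch reads:

```sql
select *
  from (
    select a.*,
           lead(amount_for_period)
               over (partition by period_id, investment_id,
                                  account_type, balance_till_period
                     order by amount_for_period) nxt
      from ( /* Kishan's query goes here */ ) a
       )
 where nxt is null
    or (nxt is not null and amount_for_period <> 0)
/
```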
 
 
 
 
Almost there?
Dave Thompson, May       20, 2004 - 12:30 pm UTC
 
 
Hi Tom,
We have the following table of data:
CREATE TABLE DEDUP_TEST
(
  ID          NUMBER,
  COLUMN_A    VARCHAR2(10 BYTE),
  COLUMN_B    VARCHAR2(10 BYTE),
  COLUMN_C    VARCHAR2(10 BYTE),
  START_DATE  DATE,
  END_DATE    DATE
)
With:
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'A', 'B', 'C',  TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'D', 'B', 'C',  TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'A', 'B', 'C',  TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'a', 'f', 'f',  TO_Date( '02/06/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '02/07/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/02/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/05/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/02/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/03/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/06/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
3, 'A', 'F', 'F',  TO_Date( '02/10/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '02/20/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
COMMIT;
We are trying to sequentially de-duplicate this data.
Basically, from the top of the table we go down and check each row against the previous one.  If they are the same, the duplicate row is marked as such, as is the original row.
So far we have this query:
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       DUP,
       CASE  WHEN COLUMN_A = NEXT_A 
                AND  COLUMN_B = NEXT_B
             AND  COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       NEXT_A,
       NEXT_B,
       NEXT_C,
       CASE  WHEN COLUMN_A = PREV_A 
                AND  COLUMN_B = PREV_B
             AND  COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM (SELECT ID,
             COLUMN_A,
             COLUMN_B,
             COLUMN_C,
             START_DATE,
             END_DATE,
             LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
             LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
             LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
             LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
             LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
             LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
        FROM DEDUP_TEST
       ORDER BY 1, 5 ) ) )
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-00 01-OCT-01 DUP
         2 A          B          B          01-OCT-01 01-OCT-03 DUP
         2 A          B          B          02-OCT-01 05-OCT-03 DUP
         2 a          f          f          06-FEB-04 07-FEB-04 NOT
         2 A          B          B          02-OCT-05 03-OCT-05 DUP
         2 A          B          B          04-OCT-05 06-OCT-05 DUP
         3 A          F          F          10-FEB-04 20-FEB-04 NOT
The resultset from this is almost what I am after.
However, where there are groups of duplicate rows I only want to return one row: the attributes, the start_date of the first duplicated row, and the end_date of the last.
I do not want to group all the duplicates together, so for example the rows with the attributes
 ID COLUMN_A   COLUMN_B   COLUMN_C
  2 A          B          B
will result in two output rows:
2 A          B          B          01-OCT-00 01-OCT-03
2 A          B          B          02-OCT-05 06-OCT-05
This is the final piece I cannot work out.
Any help would be appreciated.
Thanks.
 
 
May       20, 2004 - 2:18 pm UTC 
 
what happens in your data if you had
1     A1   B1    C1     ....
1     A2   B2    C2     ....
1     A1   B1    C1     ....
that might or might not be "dup" since you just order by ID?  don't we need to order by a, b, and c? 
 
 
 
Follow up
Dave Thompson, May       21, 2004 - 5:02 am UTC
 
 
Hi Tom, 
In response to your question:
what happens in your data if you had
1     A1   B1    C1     ....
1     A2   B2    C2     ....
1     A1   B1    C1     ....
Then the first row would be classed as unique, as would the second and the third.  We are only looking at duplicates that occur sequentially.  
Sequential duplicates are then turned into one row by taking the start date of the first row and the end date of the last row in the group.
The test data should have had sequential dates:
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'A', 'B', 'C',  TO_Date( '10/01/1999 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2000 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'D', 'B', 'C',  TO_Date( '10/01/2001 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
1, 'A', 'B', 'C',  TO_Date( '10/01/2002 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'a', 'f', 'f',  TO_Date( '02/06/2009 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '02/07/2010 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/01/2003 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2004 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/01/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/02/2007 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/05/2008 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/02/2011 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/03/2012 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
2, 'A', 'B', 'B',  TO_Date( '10/04/2013 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '10/06/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
INSERT INTO DEDUP_TEST ( ID, COLUMN_A, COLUMN_B, COLUMN_C, START_DATE,
END_DATE ) VALUES ( 
3, 'A', 'F', 'F',  TO_Date( '02/10/2014 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '02/20/2015 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM')); 
COMMIT;
CREATE TABLE DEDUP_TEST
(
  ID          NUMBER,
  COLUMN_A    VARCHAR2(10 BYTE),
  COLUMN_B    VARCHAR2(10 BYTE),
  COLUMN_C    VARCHAR2(10 BYTE),
  START_DATE  DATE,
  END_DATE    DATE
)
The query:
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (       
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       DUP,
       CASE  WHEN COLUMN_A = NEXT_A 
                AND  COLUMN_B = NEXT_B
             AND  COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       NEXT_A,
       NEXT_B,
       NEXT_C,
       CASE  WHEN COLUMN_A = PREV_A 
                AND  COLUMN_B = PREV_B
             AND  COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM (SELECT ID,
             COLUMN_A,
             COLUMN_B,
             COLUMN_C,
             START_DATE,
             END_DATE,
             LAG (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS prev_A,
             LAG (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS prev_B,
             LAG (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS prev_C,
             LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID) AS next_A,
             LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID) AS next_B,
             LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID) AS next_C
        FROM DEDUP_TEST
       ORDER BY ID, START_DATE ) ) ) 
Gives:
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-03 01-OCT-04 DUP
         2 A          B          B          01-OCT-05 01-OCT-06 DUP
         2 A          B          B          02-OCT-07 05-OCT-08 DUP
         2 a          f          f          06-FEB-09 07-FEB-10 NOT
         2 A          B          B          02-OCT-11 03-OCT-12 DUP
         2 A          B          B          04-OCT-13 06-OCT-14 DUP
         3 A          F          F          10-FEB-14 20-FEB-15 NOT
From this the sequentially duplicated rows with the attributes a, b, c will become:
2  A B C 01-OCT-03 05-OCT-08
2  A B C 02-OCT-11 06-OCT-14
Thanks. 
 
May       21, 2004 - 10:50 am UTC 
 
define sequentially.
1     A1   B1    C1     ....
1     A2   B2    C2     ....
1     A1   B1    C1     ....
ordered by ID is the same (exact same) as:
1     A1   B1    C1     ....
1     A1   B1    C1     ....
1     A2   B2    C2     ....
and
1     A2   B2    C2     ....
1     A1   B1    C1     ....
1     A1   B1    C1     ....
and in fact, two runs of your query could return different answers given the SAME exact data.  To handle that, you must have something more to sort by. 
 
 
 
Typo in previous post
Dave Thompson, May       21, 2004 - 5:56 am UTC
 
 
Tom, 
The final output should be:
From this the sequentially duplicated rows with the attributes a, b, c will 
become:
2  A B B 01-OCT-03 05-OCT-08
2  A B B 02-OCT-11 06-OCT-14
Thanks.
 
 
 
Order
Dave Thompson, May       21, 2004 - 10:57 am UTC
 
 
Hi Tom,
The order of the dataset should be on the ID and Start Date.
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-03 01-OCT-04 DUP
         2 A          B          B          01-OCT-05 01-OCT-06 DUP
         2 A          B          B          02-OCT-07 05-OCT-08 DUP
         2 a          f          f          06-FEB-09 07-FEB-10 NOT
         2 A          B          B          02-OCT-11 03-OCT-12 DUP
         2 A          B          B          04-OCT-13 06-OCT-14 DUP
         3 A          F          F          10-FEB-14 20-FEB-15 NOT
Thanks. 
 
May       21, 2004 - 11:42 am UTC 
 
Ok, your example doesn't do that -- it is "non-deterministic", given the same data, it could/would return two different answers at different times during the day!
so, i think you want one of these:
ops$tkyte@ORA9IR2> select *
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         case when (a = lag(a) over (order by id, start_date desc) and
  5                    b = lag(b) over (order by id, start_date desc) and
  6                    c = lag(c) over (order by id, start_date desc) )
  7              then row_number() over (order by id, start_date)
  8          end rn
  9    from v
 10         )
 11   where rn is null
 12  /
 
        ID A          B          C          START_DAT END_DATE          RN
---------- ---------- ---------- ---------- --------- --------- ----------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          02-OCT-07 05-OCT-08
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          04-OCT-13 06-OCT-14
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.
 
ops$tkyte@ORA9IR2> select *
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         case when (a = lag(a) over (order by id, start_date) and
  5                    b = lag(b) over (order by id, start_date) and
  6                    c = lag(c) over (order by id, start_date) )
  7              then row_number() over (order by id, start_date)
  8          end rn
  9    from v
 10         )
 11   where rn is null
 12  /
 
        ID A          B          C          START_DAT END_DATE          RN
---------- ---------- ---------- ---------- --------- --------- ----------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          01-OCT-03 01-OCT-04
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          02-OCT-11 03-OCT-12
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.
we just need to mark records whose preceding record is the "same" after sorting -- then nuke them. 
 
 
 
 
More Info
Dave Thompson, May       21, 2004 - 12:25 pm UTC
 
 
Hi Tom,
Thanks for the prompt reply.
I re-wrote the base query:
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       CASE WHEN ( DUP = 'DUP' OR DUPER = 'DUP' ) THEN 'DUP' ELSE 'NOT' END LETSEE
FROM (       
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       DUP,
       CASE  WHEN COLUMN_A = NEXT_A 
                AND  COLUMN_B = NEXT_B
             AND  COLUMN_C = NEXT_C THEN 'DUP' ELSE 'NOT' END DUPER
FROM (
SELECT ID,
       COLUMN_A,
       COLUMN_B,
       COLUMN_C,
       START_DATE,
       END_DATE,
       NEXT_A,
       NEXT_B,
       NEXT_C,
       CASE  WHEN COLUMN_A = PREV_A 
                AND  COLUMN_B = PREV_B
             AND  COLUMN_C = PREV_C THEN 'DUP' ELSE 'NOT' END DUP
FROM (SELECT ID,
             COLUMN_A,
             COLUMN_B,
             COLUMN_C,
             START_DATE,
             END_DATE,
             ROWID AS ROWID_R,
             LAG  (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_A,
             LAG  (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_B,
             LAG  (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS prev_C,
             LEAD (COLUMN_A, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_A,
             LEAD (COLUMN_B, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_B,
             LEAD (COLUMN_C, 1, 0) OVER (ORDER BY ID, START_DATE) AS next_C
        FROM DEDUP_TEST
       ORDER BY ID, START_DATE ) ) )
And got:
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-03 01-OCT-04 DUP
         2 A          B          B          01-OCT-05 01-OCT-06 DUP
         2 A          B          B          02-OCT-07 05-OCT-08 DUP
         2 a          f          f          06-FEB-09 07-FEB-10 NOT
         2 A          B          B          02-OCT-11 03-OCT-12 DUP
         2 A          B          B          04-OCT-13 06-OCT-14 DUP
         3 A          F          F          10-FEB-14 20-FEB-15 NOT
Looking at the column LETSEE I want to add a unique identifier to each row, treating duplicated rows as 1.
For example:
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET DUP_ID
---------- ---------- ---------- ---------- --------- --------- --- ------
         1 A          B          C          01-OCT-99 01-OCT-00 NOT 1
         1 D          B          C          01-OCT-01 01-OCT-02 NOT 2
         1 A          B          C          01-OCT-02 01-OCT-03 NOT 3
         2 A          B          B          01-OCT-03 01-OCT-04 DUP 4
         2 A          B          B          01-OCT-05 01-OCT-06 DUP 4
         2 A          B          B          02-OCT-07 05-OCT-08 DUP 4
         2 a          f          f          06-FEB-09 07-FEB-10 NOT 5
         2 A          B          B          02-OCT-11 03-OCT-12 DUP 6
         2 A          B          B          04-OCT-13 06-OCT-14 DUP 6
         3 A          F          F          10-FEB-14 20-FEB-15 NOT 7
Then I could use the Dup_Id to partition on to do the analysis I need.
Any idea?
Have a nice weekend.
Thanks.
 
 
May       21, 2004 - 1:59 pm UTC 
 
the above query doesn't work? 
 
 
 
Hi Again
Dave Thompson, May       21, 2004 - 2:05 pm UTC
 
 
Hi Tom,
The above didn't work.
From the source query:
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-03 01-OCT-04 DUP
         2 A          B          B          01-OCT-05 01-OCT-06 DUP
         2 A          B          B          02-OCT-07 05-OCT-08 DUP
         2 a          f          f          06-FEB-09 07-FEB-10 NOT
         2 A          B          B          02-OCT-11 03-OCT-12 DUP
         2 A          B          B          04-OCT-13 06-OCT-14 DUP
         3 A          F          F          10-FEB-14 20-FEB-15 NOT
I want to output the following resultset:
        ID COLUMN_A   COLUMN_B   COLUMN_C   START_DAT END_DATE  LET
---------- ---------- ---------- ---------- --------- --------- ---
         1 A          B          C          01-OCT-99 01-OCT-00 NOT
         1 D          B          C          01-OCT-01 01-OCT-02 NOT
         1 A          B          C          01-OCT-02 01-OCT-03 NOT
         2 A          B          B          01-OCT-03 05-OCT-08 DUP
         2 a          f          f          06-FEB-09 07-FEB-10 NOT
         2 A          B          B          02-OCT-11 06-OCT-14 DUP
         3 A          F          F          10-FEB-14 20-FEB-15 NOT
On the resultset from your queries the start and end dates were incorrect.
Where duplicate rows occur one after another, we need to take the start_date of the first row and the end_date of the last row in that block.
So for the following:
         2 A          B          B          01-OCT-03 01-OCT-04 DUP
         2 A          B          B          01-OCT-05 01-OCT-06 DUP
         2 A          B          B          02-OCT-07 05-OCT-08 DUP
You would get
         2 A          B          B          01-OCT-03 05-OCT-08 DUP
Does this make sense?
Thanks again for your input on this.
 
 
May       21, 2004 - 2:19 pm UTC 
 
ops$tkyte@ORA9IR2> select id, a,b,c, min(start_date) start_date, max(end_date) end_date
  2    from (
  3  select id, a,b,c, start_date, end_date,
  4         max(grp) over (order by id, start_date desc) grp
  5    from (
  6  select id, a,b,c, start_date, end_date,
  7         case when (a <> lag(a) over (order by id, start_date desc) or
  8                    b <> lag(b) over (order by id, start_date desc) or
  9                    c <> lag(c) over (order by id, start_date desc) )
 10              then row_number() over (order by id, start_date desc)
 11          end grp
 12    from v
 13         )
 14         )
 15   group by id, a,b,c,grp
 16   order by 1, 5
 17  /
 
        ID A          B          C          START_DAT END_DATE
---------- ---------- ---------- ---------- --------- ---------
         1 A          B          C          01-OCT-99 01-OCT-00
         1 D          B          C          01-OCT-01 01-OCT-02
         1 A          B          C          01-OCT-02 01-OCT-03
         2 A          B          B          01-OCT-03 05-OCT-08
         2 a          f          f          06-FEB-09 07-FEB-10
         2 A          B          B          02-OCT-11 06-OCT-14
         3 A          F          F          10-FEB-14 20-FEB-15
 
7 rows selected.
One of my (current) favorite analytic tricks -- the old "carry forward".  We mark rows such that the preceding row was different -- subsequent dup rows would have NULLS there for grp.  
Then, we use max(grp) to "carry" that number down....
Now we have something to group by -- we've divided the rows up into groups we can deal with.
(note: if a,b,c allow NULLS, we'll need to accommodate that!) 
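One way to make the comparison null-safe -- a sketch only, not run against the data above -- is to wrap each pair in DECODE, which treats two NULLs as equal, so a NULL-to-NULL change of value is not mistaken for "different":

```sql
-- sketch: DECODE(x, y, 1, 0) returns 1 when x = y, and also when both are NULL,
-- so a row gets a grp marker whenever any of a, b, c truly differs from the prior row
select id, a, b, c, start_date, end_date,
       case when decode(a, lag(a) over (order by id, start_date desc), 1, 0) = 0
              or decode(b, lag(b) over (order by id, start_date desc), 1, 0) = 0
              or decode(c, lag(c) over (order by id, start_date desc), 1, 0) = 0
            then row_number() over (order by id, start_date desc)
        end grp
  from v
```

The outer max(grp)/group by layers would stay exactly as in the query above.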
 
 
 
 
Great Stuff
Dave Thompson, May       21, 2004 - 5:02 pm UTC
 
 
Tom,
Thanks very much for that.  
I'll go over it in more detail when I'm in the Office Monday but it looks great from here.
Enjoy the weekend. 
 
 
Excellent
Dave Thompson, June      02, 2004 - 4:53 am UTC
 
 
Hi Tom,
This solution was spot on.
Thanks.
Any more thoughts on an Analytics book?  
 
 
Stalin, June      09, 2004 - 6:03 pm UTC
 
 
hi tom,
wondering what the SQL below would look like if the LEAD and PARTITION BY analytic features didn't exist. is pl/sql the only option?
snippet from the "lead/lag on different dataset" thread (it has the create and insert statements)
ops$tkyte@ORA9IR2> select a.* , round( (signoff_date-signon_date) * 24 * 60, 2 )
minutes
  2    from (
  3  select log_id,
  4         case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
  5              then lead(log_id) over (partition by account_id, user_id, mac
order by log_creation_date)
  6          end signoff_id,
  7         user_id,
  8         log_creation_date signon_date,
  9         case when action in (1,2) and lead(action) over (partition by
account_id,user_id,mac order by log_creation_date) in (3,4,5,6,7)
 10              then lead(log_creation_date) over (partition by account_id,
user_id, mac order by log_creation_date)
 11          end signoff_date,
 12                  action
 13  from   logs
 14  where  account_id = 'Robert'
 15  and    service = 5
 16  order  by user_id
 17         ) a
 18   where action in (1,2)
 19  /
Thanks,
Stalin 
 
June      09, 2004 - 6:27 pm UTC 
 
you could use a non-equi self join to achieve the same.  Many orders of magnitude slower.
scalar subqueries could be used as well -- with the same "slower" caveat. 
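For instance, the LEAD(log_creation_date) in the query above might be rewritten as a correlated scalar subquery -- an untested sketch, reusing the column names from the quoted query:

```sql
-- sketch: each row probes the table again for the next log_creation_date
-- in its account/user/mac group -- one probe per row instead of a single
-- pass with LEAD(), which is why it is orders of magnitude slower
select l.log_id,
       l.user_id,
       l.log_creation_date signon_date,
       (select min(l2.log_creation_date)
          from logs l2
         where l2.account_id = l.account_id
           and l2.user_id    = l.user_id
           and l2.mac        = l.mac
           and l2.log_creation_date > l.log_creation_date) signoff_date
  from logs l
 where l.account_id = 'Robert'
   and l.service    = 5
   and l.action in (1,2)
```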
 
 
 
Is this solvable with ANALYTICS too?
Peter Tran, June      10, 2004 - 12:14 am UTC
 
 
Hi Tom,
Can the following problem be solved using Analytics?
I have a 10-column table where 9 of the columns are dimensions and one is an attribute.  I would like to get a report of the D1/D2 combinations where ATTR1 is 1 for every combination of the other dimensions.  Furthermore, the PK consists of all the dimension columns.
These aren't the real column names, but I didn't want to make the example table too wide for illustrative purposes.
 D1  D2  D3  D4  D5  D6  D7  D8  D9  ATTR1
-------------------------------------------- 
 AA  AA  AA  AA  AA  AA  AA  AA  AA    1
 AA  AA  BB  AA  AA  AA  AA  AA  AA    1
 AA  AA  AA  CC  AA  AA  AA  AA  AA    1
 AA  AA  AA  AA  DD  AA  AA  AA  AA    1
 AA  AA  AA  AA  EE  AA  AA  AA  AA    1
 AA  BB  AA  AA  AA  AA  AA  GG  AA    1 
 AA  BB  AA  AA  AA  AA  AA  AA  AA    1 
 AA  BB  CC  AA  AA  AA  AA  AA  AA    0 
 AA  BB  AA  DD  AA  AA  AA  AA  AA    1 
 EE  DD  JJ  LL  MM  NN  OO  PP  QQ    1
 EE  DD  TT  LL  MM  NN  OO  PP  QQ    1 
 
I want the query to return:
 D1  D2
-------- 
 AA  AA
 EE  DD
 
It would not return AA/BB, because of the record:
 D1  D2  D3  D4  D5  D6  D7  D8  D9  ATTR1
-------------------------------------------- 
 AA  BB  CC  AA  AA  AA  AA  AA  AA    0 
Thanks,
-Peter 
 
June      10, 2004 - 7:43 am UTC 
 
yes they can, but they are not needed -- regular aggregates do the job.  I'd give you the real query if I had a create table/inserts to demo against.  this is "pseudo code", might or might not actually work:
select d1, d2
  from t
 group by d1, d2
having count(distinct attribute) = 1
 
 
 
 
Michael T., June      10, 2004 - 9:01 am UTC
 
 
Peter,
I think the following may give you what you want.
SELECT d1, d2 
  FROM t 
 GROUP BY d1, d2 
HAVING SUM(DECODE(attr1, 1, 0, 1)) > 0;
Tom's pseudo code will work except for the case when all rows for a D1/D2 combination have the same ATTR1 value, but that value is not 1.
 
 
June      10, 2004 - 9:45 am UTC 
 
ahh, good eye -- i was thinking "all attribute values are the same"
but yours doesn't do it; this will:
having count( decode( attr1, 1, 1 ) ) = count(*)
count(decode(attr1,1,1)) will return a count of non-null occurrences (all of the 1's)
count(*) returns a count of all records
output when count(decode) = count(*)
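Putting that HAVING clause into the earlier query gives (untested, against the same hypothetical table T):

```sql
-- a D1/D2 combination survives only when every one of its rows has attr1 = 1
select d1, d2
  from t
 group by d1, d2
having count( decode( attr1, 1, 1 ) ) = count(*);
```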
 
 
 
 
Thank you!
Peter Tran, June      10, 2004 - 10:37 am UTC
 
 
Hi Tom/Michael T.,
Thank you.  It's so much clearer now.
-Peter 
 
 
Michael T., June      10, 2004 - 10:46 am UTC
 
 
I did screw up in my previous response.  The query I submitted gives entirely the wrong answer.  It should have been
SELECT d1, d2
  FROM t
 GROUP BY d1, d2
HAVING SUM(DECODE(attr1, 1, 0, 1)) = 0
Even though, incorrectly, I wasn't originally considering null values for ATTR1, the above query seems to produce the correct answer even if ATTR1 is NULL.  The DECODE will evaluate a null ATTR1 entry to 1.
Tom, many thanks for this site.  I have learned so much from it.  It is a daily must read for me.
 
 
 
You said a book on analytics?
Jeff, June      10, 2004 - 12:30 pm UTC
 
 
A book by you on analytics would be a best seller I think. 
Go for it. 
 
 
quick analytic question
A reader, June      16, 2004 - 5:03 pm UTC
 
 
schema creation---
---
scott@ora92> drop table t1;
Table dropped.
scott@ora92> create table t1
  2  (
  3    x varchar2(10),
  4    y number
  5  );
Table created.
scott@ora92> 
scott@ora92> insert into t1 values( 'x1', 1 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 2 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 4 );
1 row created.
scott@ora92> insert into t1 values( 'x1', 0 );
1 row created.
scott@ora92> commit;
Commit complete.
scott@ora92> select x, y, min(y) over() min_y
  2  from t1;
X                   Y      MIN_Y
---------- ---------- ----------
x1                  1          0
x1                  2          0
x1                  4          0
x1                  0          0
scott@ora92> spool off
---
how do i get the minimum of y for all values
that is greater than 0 (if one exists). In the above case
I should get the result as
X                   Y      MIN_Y
---------- ---------- ----------
x1                  1          1
x1                  2          1
x1                  4          1
x1                  0          1
Thanx for your excellent site and brilliant work!
 
 
June      16, 2004 - 6:09 pm UTC 
 
min( case when y > 0 then y end ) over () 
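Dropped into the reader's query, that would be:

```sql
-- the CASE maps non-positive y values to NULL, which MIN() ignores,
-- so every row sees the smallest y greater than zero
select x, y,
       min( case when y > 0 then y end ) over () min_y
  from t1;
```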
 
 
 
Great!!!
A reader, June      16, 2004 - 6:46 pm UTC
 
 
 
 
Thank you very much
Gj, July      02, 2004 - 9:16 am UTC
 
 
The Oracle docs are a little light on examples, so thank you for giving us the quick start to analytics.  I can't say I understand the complex examples yet, but the simple stuff seems so easy to understand now.  I can't wait until a real problem comes along that I can apply this feature to. 
 
July      02, 2004 - 10:39 am UTC 
 
 
 
How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?
Sergey, July      06, 2004 - 8:08 am UTC
 
 
Hi Tom,
I need to 'fill the gaps' with the values from the last existing row in a table that is outer joined to another table.  The other table serves as a source of regular [time] intervals.  The task seems conceptually very simple, so I looked in the Oracle docs (the Ora10g docs, as it happens) and pretty soon found exactly what I need: LAST_VALUE with IGNORE NULLS.  Unfortunately, neither Ora8i nor Ora9i accepts IGNORE NULLS.  Is there any way to mimic this feature with 'older' analytic functions?
I tried something like ORDER BY SIGN(NVL(VALUE, 0)) in the analytic ORDER BY clause, but it does not work (I do not have a clue why)
Thanks in advance
Here is the test:
DROP TABLE TD;
CREATE TABLE TD AS
        (SELECT TRUNC(SYSDATE, 'DD') + ROWNUM T
            FROM ALL_OBJECTS
            WHERE ROWNUM <= 15
        );
DROP TABLE TV;
CREATE TABLE TV AS
        (SELECT 
                 TRUNC(SYSDATE, 'DD') + ROWNUM * 3 T
                ,ROWNUM V
            FROM ALL_OBJECTS
            WHERE ROWNUM <= 5
        );
SELECT 
         TD.T
        ,SIGN(NVL(TV.V, 0))
        ,NVL
            (TV.V, 
                LAST_VALUE(TV.V IGNORE NULLS) -- IGNORE NULLS does not work on Ora8i, Ora9i
                    OVER 
                        (
                            ORDER BY TD.T
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                        )
            ) V
    FROM TD, TV
    WHERE TV.T(+) = TD.T
    ORDER BY TD.T
    ;
ERROR at line 6:
ORA-00907: missing right parenthesis
SELECT 
         TD.T
        ,SIGN(NVL(TV.V, 0))
        ,NVL
            (TV.V, 
                LAST_VALUE(TV.V)
                    OVER 
                        (
                            ORDER BY SIGN(NVL(TV.V, 0)), TD.T -- Does not work
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                        )
            ) V
    FROM TD, TV
    WHERE TV.T(+) = TD.T
    ORDER BY TD.T
    ;
T                    SIGN(NVL(TV.V,0))                  V
------------------- ------------------ ------------------
07.07.2004 00:00:00                  0
08.07.2004 00:00:00                  0
09.07.2004 00:00:00                  1                  1
10.07.2004 00:00:00                  0
11.07.2004 00:00:00                  0
12.07.2004 00:00:00                  1                  2
13.07.2004 00:00:00                  0
14.07.2004 00:00:00                  0
15.07.2004 00:00:00                  1                  3
16.07.2004 00:00:00                  0
17.07.2004 00:00:00                  0
18.07.2004 00:00:00                  1                  4
19.07.2004 00:00:00                  0
20.07.2004 00:00:00                  0
21.07.2004 00:00:00                  1                  5
 
 
July      06, 2004 - 8:26 am UTC 
 
This is a trick I call "carry down", we use analytics on analytics to accomplish this.  We output "marker rows" with ROW_NUMBER() on the leading edge.  Using MAX() in the outer query, we "carry down" these marker rows -- substr gets rid of the row_number for us:
ops$tkyte@ORA10G> select t,
  2         sign_v,
  3             v,
  4             substr( max(data) over (order by t), 7 ) v2
  5    from (
  6  SELECT TD.T,
  7         SIGN(NVL(TV.V, 0)) sign_v,
  8          NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
  9           case when tv.v is not null
 10                 then to_char( row_number() 
                                  over (order by td.t), 'fm000000' ) || tv.v
 11                    end data
 12      FROM TD, TV
 13      WHERE TV.T(+) = TD.T
 14          )
 15   ORDER BY T
 16      ;
 
T             SIGN_V          V V2
--------- ---------- ---------- -----------------------------------------
07-JUL-04          0
08-JUL-04          0
09-JUL-04          1          1 1
10-JUL-04          0          1 1
11-JUL-04          0          1 1
12-JUL-04          1          2 2
13-JUL-04          0          2 2
14-JUL-04          0          2 2
15-JUL-04          1          3 3
16-JUL-04          0          3 3
17-JUL-04          0          3 3
18-JUL-04          1          4 4
19-JUL-04          0          4 4
20-JUL-04          0          4 4
21-JUL-04          1          5 5
 
15 rows selected.
So, in 9ir2 this would simply be:
ops$tkyte@ORA9IR2> select t,
  2         sign_v,
  3             substr( max(data) over (order by t), 7 ) v2
  4    from (
  5  SELECT TD.T,
  6         SIGN(NVL(TV.V, 0)) sign_v,
  7           case when tv.v is not null
  8                        then to_char( row_number() over (order by td.t), 'fm000000' ) || tv.v
  9                    end data
 10      FROM TD, TV
 11      WHERE TV.T(+) = TD.T
 12          )
 13   ORDER BY T
 14      ;
 
T             SIGN_V V2
--------- ---------- -----------------------------------------
07-JUL-04          0
08-JUL-04          0
09-JUL-04          1 1
10-JUL-04          0 1
11-JUL-04          0 1
12-JUL-04          1 2
13-JUL-04          0 2
14-JUL-04          0 2
15-JUL-04          1 3
16-JUL-04          0 3
17-JUL-04          0 3
18-JUL-04          1 4
19-JUL-04          0 4
20-JUL-04          0 4
21-JUL-04          1 5
 
15 rows selected.
 
 
 
 
 
Doesn't work with PL/SQL ????????
A reader, July      20, 2004 - 9:31 am UTC
 
 
Dear Tom
Are analytics fully compatible with PL/SQL?
Please see
SQL> ed
Wrote file afiedt.buf
  1  select empno,deptno,
  2         count(empno) over (partition by deptno order by empno
  3                            rows between unbounded preceding and current row) run_count
  4* from emp
SQL> /
     EMPNO     DEPTNO  RUN_COUNT
---------- ---------- ----------
      7782         10          1
      7839         10          2
      7934         10          3
      7369         20          1
      7566         20          2
      7788         20          3
      7876         20          4
      7902         20          5
      7499         30          1
      7521         30          2
      7654         30          3
     EMPNO     DEPTNO  RUN_COUNT
---------- ---------- ----------
      7698         30          4
      7844         30          5
      7900         30          6
14 rows selected.
SQL> 
SQL> ed
Wrote file afiedt.buf
  1  declare
  2  cursor c1 is
  3  select empno,deptno,
  4         count(empno) over (partition by deptno order by empno
  5                            rows between unbounded preceding and current row) run_count
  6  from emp;
  7  begin
  8   for rec in c1 loop
  9    null;
 10   end loop;
 11* end;
SQL> /
end;
   *
ERROR at line 11:
ORA-06550: line 5, column 72:
PL/SQL: ORA-00905: missing keyword
ORA-06550: line 3, column 1:
PL/SQL: SQL Statement ignored
SQL> 
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
PL/SQL Release 9.2.0.4.0 - Production
CORE    9.2.0.3.0       Production
TNS for 32-bit Windows: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
SQL>  
 
 
July      20, 2004 - 8:08 pm UTC 
 
You can contact support and reference <Bug:3083373>, but the workaround would be to use native dynamic sql or a view to "hide" this construct.
the problem turns out to be the keyword "current", which has its own meaning in PL/SQL. 
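A minimal sketch of the native dynamic SQL workaround (the variable names and the empty processing loop are illustrative; the query text is the one from above). Because the statement lives in a string, the PL/SQL parser never sees the word "current":

```sql
declare
    type rc is ref cursor;
    c1        rc;
    l_empno   emp.empno%type;
    l_deptno  emp.deptno%type;
    l_cnt     number;
begin
    -- the analytic clause is hidden inside a string literal,
    -- so the PL/SQL compiler does not have to parse it
    open c1 for
        'select empno, deptno,
                count(empno) over (partition by deptno order by empno
                                   rows between unbounded preceding and current row) run_count
           from emp';
    loop
        fetch c1 into l_empno, l_deptno, l_cnt;
        exit when c1%notfound;
        null;  -- process the row here
    end loop;
    close c1;
end;
/
```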
 
 
 
Effect of distinct on lag
John Murphy, July      29, 2004 - 1:48 pm UTC
 
 
I am trying to use analytics to find accounts with receipts in 3 consecutive years.  The analytic code seems to work; however, when I add DISTINCT (to find each account once), I get strange results.  This is on 9.2.0.1.0.
create table jcm_test(acct_id number(10), rcpt_date date);
insert into jcm_test
values (1 , to_date('01-JAN-2000', 'dd-mon-yyyy'));                                                            
insert into jcm_test
values (1 , to_date('01-JAN-2001', 'dd-mon-yyyy'));                                                             
insert into jcm_test
values (1 , to_date('01-JAN-2003', 'dd-mon-yyyy'));                                                             
insert into jcm_test
values (1 , to_date('02-JAN-2001', 'dd-mon-yyyy'));   
(select j2.*,
  rcpt_year - lag_yr as year_diff,
  rank_year - lag_rank as rank_diff
  from (select acct_id, rcpt_year, rank_year,
       lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
       lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
       from (select acct_id,
            rcpt_year,
            rank() over (partition by acct_id order by j.rcpt_year) rank_year
            from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
                     from jcm_test) j )
     ) j2);
   ACCT_ID RCPT  RANK_YEAR LAG_   LAG_RANK  YEAR_DIFF  RANK_DIFF        
---------- ---- ---------- ---- ---------- ---------- ----------        
         1 2000          1
         1 2001          2
         1 2003          3 2000          1          3          2
select * from
  (select j2.*,
  rcpt_year - lag_yr as year_diff,
  rank_year - lag_rank as rank_diff
  from (select acct_id, rcpt_year, rank_year,
       lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
       lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
       from (select acct_id,
             rcpt_year,
             rank() over (partition by acct_id order by j.rcpt_year) rank_year
             from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
                    from jcm_test) j )
   ) j2)
where  year_diff = rank_diff;
no rows selected
select distinct * from
  (select j2.*,
  rcpt_year - lag_yr as year_diff,
  rank_year - lag_rank as rank_diff
  from (select acct_id, rcpt_year, rank_year,
       lag(rcpt_year, 2) over (partition by acct_id order by rcpt_year) lag_yr,
       lag(rank_year, 2) over (partition by acct_id order by rcpt_year) lag_rank
       from (select acct_id,
             rcpt_year,
             rank() over (partition by acct_id order by j.rcpt_year) rank_year
             from (select distinct acct_id, to_char(rcpt_date, 'YYYY') rcpt_year
                  from jcm_test) j )
     ) j2)
   where  year_diff = rank_diff;
   ACCT_ID RCPT  RANK_YEAR LAG_   LAG_RANK  YEAR_DIFF  RANK_DIFF        
---------- ---- ---------- ---- ---------- ---------- ----------        
         1 2001          2 2000          1          1          1
         1 2003          4 2001          2          2          2
In your book, you say that because analytics are performed last, you must push them into an inline view.  However, that doesn't seem to do the trick here.  Thanks, john 
 
July      29, 2004 - 2:18 pm UTC 
 
what release -- i don't see what you see. 
 
 
 
Distinct effect release
John Murphy, July      29, 2004 - 3:12 pm UTC
 
 
Tom, we are using the following.
Oracle9i Release 9.2.0.1.0 - Production
PL/SQL Release 9.2.0.1.0 - Production
CORE    9.2.0.1.0       Production
TNS for 32-bit Windows: Version 9.2.0.1.0 - Production
NLSRTL Version 9.2.0.1.0 - Production
I tried searching Metalink, but couldn't find any bugs. 
 
July      29, 2004 - 4:03 pm UTC 
 
i found one, not published, was solved via 9202 -- at least it did not reproduce, they did not pursue it further for that reason.  
 
 
 
 
Distinct effect release
John Murphy, July      29, 2004 - 4:01 pm UTC
 
 
Actually, I suspect that this may be related to bug 2258035.  Do you agree?  Thanks, john 
 
July      29, 2004 - 4:18 pm UTC 
 
yes, i can confirm that in 9205, it is not happening that way. 
 
 
 
how to write this query
Teddy, July      30, 2004 - 6:33 am UTC
 
 
Hi
using the original poster's example:
   ORDER  OPN  STATION  CLOSE_DATE
   -----  ---  -------  ----------
   12345   10  RECV     07/01/2003
   12345   20  MACH1    07/02/2003
   12345   25  MACH1    07/05/2003
   12345   30  MACH1    07/11/2003
   12345   36  INSP1    07/12/2003
   12345   50  MACH1    08/16/2003
   12346   90  MACH2    07/30/2003
   12346  990  STOCK    07/31/2003
How do you write a query to determine that an order has passed a manufacturing operation in several months?
In the above example,
12345 has rows in July and August but 12346 has rows in July only. How can we write a query to find orders such as 12345?
 
 
July      30, 2004 - 4:40 pm UTC 
 
select order#, min(close_date), max(close_date)
  from t
 group by order#
having months_between( max(close_date), min(close_date) ) > your_threshold; 
 
 
 
Finding pairs in result set
PJ, August    11, 2004 - 10:05 am UTC
 
 
Tom,
CREATE TABLE A
(
  N  NUMBER,
  C  CHAR(1),
  V  VARCHAR2(20)
)
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '1st e of 1st N'); 
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '2nd e of 1st N'); 
INSERT INTO A ( N, C, V ) VALUES ( 1, 'e', '3rd e of 1st N'); 
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '1st w of 1st N'); 
INSERT INTO A ( N, C, V ) VALUES ( 1, 'w', '2nd w of 1st N'); 
INSERT INTO A ( N, C, V ) VALUES ( 2, 'e', '1st e of 2nd N'); 
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '1st w of 2nd N'); 
INSERT INTO A ( N, C, V ) VALUES ( 2, 'w', '2nd w of 2nd N'); 
commit;
 
SO the data I've is 
select * from a;
-------------------------
N    C    V
1    e    1st e of 1st N
1    e    2nd e of 1st N
1    e    3rd e of 1st N
1    w    1st w of 1st N
1    w    2nd w of 1st N
2    e    1st e of 2nd N
2    w    1st w of 2nd N
2    w    2nd w of 2nd N
---------------------------------------
And the output I'm looking for is
1    e    1st e of 1st N
1    e    2nd e of 1st N
1    w    1st w of 1st N
1    w    2nd w of 1st N
2    e    1st e of 2nd N
2    w    1st w of 2nd N
So basically I need the first pairs of (e-w/w-e) for each N.
I hope I'm clear here.
Thanks as usual in advance, 
 
August    11, 2004 - 12:40 pm UTC 
 
do you have a field that can be "sorted on" for finding "1st", "2nd" and so on?
If not, there is no such thing as "first" or "third".
 
 
 
 
PJ, August    11, 2004 - 12:58 pm UTC
 
 
Tom,
Sorry if I was not clear.
we need to pick pairs for N. Like we have 5 rows with N=1. so we have to pick 4 rows leaving 1 UNPAIRED "e" out.
We want the data in the same order as it is in table. We can sort it by --> order by N,C 
 
August    11, 2004 - 1:58 pm UTC 
 
ops$tkyte@ORA920> select n, c, rn, cnt2
  2    from (
  3  select n, c, rn,
  4             min(cnt) over (partition by n) cnt2
  5    from (
  6  select n, c,
  7             row_number() over (partition by n, c order by c) rn,
  8             count(*) over (partition by n, c) cnt
  9    from a
 10             )
 11             )
 12   where rn <= cnt2
 13  /
 
         N C         RN       CNT2
---------- - ---------- ----------
         1 e          1          2
         1 e          2          2
         1 w          1          2
         1 w          2          2
         2 e          1          1
         2 w          1          1
 
6 rows selected.
 
 
 
 
 
Brilliant as usual !!
A reader, August    11, 2004 - 2:04 pm UTC
 
 
 
 
PJ's query
Kevin, August    11, 2004 - 2:04 pm UTC
 
 
PJ - you can drop the column 'v' from your table, and just use this query (which I think will answer your question using N and C alone, and generate an appropriate 'v' as it runs).
CREATE TABLE b
(
  N  NUMBER,
  C  CHAR(1)
) 
INSERT INTO b ( N, C ) VALUES ( 1, 'e');
 
INSERT INTO b ( N, C ) VALUES ( 1, 'e'); 
INSERT INTO b ( N, C ) VALUES ( 1, 'e'); 
INSERT INTO b ( N, C ) VALUES ( 1, 'w'); 
INSERT INTO b ( N, C ) VALUES ( 1, 'w'); 
INSERT INTO b ( N, C ) VALUES ( 2, 'e'); 
INSERT INTO b ( N, C ) VALUES ( 2, 'w'); 
INSERT INTO b ( N, C ) VALUES ( 2, 'w'); 
COMMIT;
SELECT n,c,v1 
FROM   (       
         SELECT lag (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1) c3, 
                lead (c1) OVER (PARTITION BY n,c1 ORDER BY n,c1)c4,
                c1 ||   
                 CASE WHEN c1 BETWEEN 10 AND 20 
                      THEN 'th' 
                      ELSE DECODE(MOD(c1,10),1,'st',2,'nd',3,'rd','th')
                 END || ' ' || c || ' of ' || c2 ||
                 CASE WHEN c2 BETWEEN 10 AND 20 
                      THEN 'th' 
                      ELSE DECODE(MOD(c2,10),1,'st',2,'nd',3,'rd','th')
                 END || ' N' v1,                
                t1.*
         FROM   (       
                  SELECT b.*,
                         row_number() OVER (PARTITION BY n, c ORDER BY n,c) c1,
                         DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
                  FROM   b
                ) t1     
       ) t2
WHERE c3 IS NOT NULL OR c4 IS NOT NULL
/
Results:
N    C    V1
1    e    1st e of 1st N    
1    w    1st w of 1st N    
1    e    2nd e of 1st N    
1    w    2nd w of 1st N    
2    e    1st e of 1st N    
2    w    1st w of 1st N    
INSERT INTO b ( N, C ) VALUES ( 1, 'w'); 
COMMIT;
Results:
N    C    V1
1    e    1st e of 1st N    
1    w    1st w of 1st N    
1    e    2nd e of 1st N    
1    w    2nd w of 1st N    
1    e    3rd e of 1st N    
1    w    3rd w of 1st N    
2    e    1st e of 1st N    
2    w    1st w of 1st N    
 
 
 
oops
Kevin, August    11, 2004 - 2:12 pm UTC
 
 
replace
DENSE_RANK() OVER (PARTITION BY n, c ORDER BY n,c) c2
with
DENSE_RANK() OVER (PARTITION BY c ORDER BY c) c2
my bad. 
 
 
A reader, August    11, 2004 - 3:27 pm UTC
 
 
Your bad what?
toe? leg?
 
 
 
Cool....
PJ, August    12, 2004 - 7:25 am UTC
 
 
 
 
analytic q
A reader, October   22, 2004 - 6:34 pm UTC
 
 
First the schema:
scott@ORA92I> drop table t1;
Table dropped.
scott@ORA92I> create table t1( catg1 varchar2(10), catg2 varchar2(10), total number );
Table created.
scott@ORA92I> 
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 5 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 6 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V1', 'T1', 9 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V2', 'T2', 10 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V3', 'T1', 11 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V4', 'T1', 1 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V5', 'T2', 2 );
1 row created.
scott@ORA92I> insert into t1( catg1, catg2, total) values( 'V6', 'T2', 3 );
1 row created.
 The catg2 can only take two values, 'T1', 'T2'.
 I want to sum the total column for catg1, catg2
 and order by their total sum for each catg1 and catg2 values. Then
 I want to list the top 3 catg1, catg2 combinations
 based on their sum values of total column.
 If there are more than 3 such combinations then I 
 club the remaining ones into a catg1 value of 'Others'.
my first cut solution is:
scott@ORA92I> select catg1, catg2, sum( total_sum )
  2  from
  3  (
  4    select case
  5          when dr > 3 then
  6            'Others'
  7          when dr <= 3 then
  8            catg1
  9            end catg1,
 10            catg2,
 11            total_sum
 12    from
 13    (
 14       select catg1, catg2, total_sum,
 15             dense_rank() over( order by total_sum desc) dr
 16       from
 17       (
 18         select catg1, catg2, sum( total ) total_sum
 19         from t1
 20         group by catg1, catg2
 21       )
 22    )
 23  )
 24  group by catg1, catg2;
CATG1      CATG2      SUM(TOTAL_SUM)
---------- ---------- --------------
V1         T1                     20
V2         T2                     10
V3         T1                     11
Others     T1                      1
Others     T2                      5
Does it look ok or do you have any better solution?
Thank you as always.
 
 
October   23, 2004 - 9:36 am UTC 
 
you could skip a layer of inline view, but it looks fine as is. 
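For the record, one inline view can be dropped because analytic functions may be applied on top of a GROUP BY in the same query block; a sketch against the same t1 table (untested, but it should return the same rows):

```sql
select catg1, catg2, sum( total_sum )
  from (
    select case when dr <= 3 then catg1 else 'Others' end catg1,
           catg2,
           total_sum
      from (
        -- dense_rank over the aggregate collapses two inline views into one
        select catg1, catg2, sum(total) total_sum,
               dense_rank() over (order by sum(total) desc) dr
          from t1
         group by catg1, catg2
           )
       )
 group by catg1, catg2;
```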
 
 
 
thanx!
A reader, October   24, 2004 - 12:37 pm UTC
 
 
 
 
SQL query
Reader, November  03, 2004 - 1:45 pm UTC
 
 
I have a table which stores receipts against Purchase Orders. The users want the following o/p:
For each of the months of Jan, Feb and March 2004, provide a count of number of receipts which fall in each of the following Dollar value range
< $5000
Between $5000 and $9999
> $10,000
(There can be a number of receipts against one Purchase Order, so those need to be grouped together first)
I wrote this query using an inline view which is the UNION of 3 SQLs, one for each dollar range.
However, I am sure there is a more elegant and efficient method to do this, maybe using analytic functions, CASE, decode.... Appreciate your help.
Thanks 
 
November  05, 2004 - 10:49 am UTC 
 
select trunc(date_col,'mm') Month, 
       count( case when amt < 5000 then 1 end ) "lt 5000",
       count( case when amt between 5000 and 9999 then 1 end ) "between 5/9k",
       count( case when amt >= 10000 then 1 end ) "10k or more"
  from t
 where date_col between :a and :b
 group by trunc(date_col,'mm')
single pass.... 
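If the receipts really do need to be rolled up to one dollar amount per Purchase Order first (as the question mentions), one extra level of grouping keeps it tidy; a sketch assuming hypothetical po#, amt and date_col column names:

```sql
select mth,
       count( case when po_total < 5000 then 1 end )                "lt 5000",
       count( case when po_total between 5000 and 9999 then 1 end ) "between 5/9k",
       count( case when po_total >= 10000 then 1 end )              "10k or more"
  from (
        -- first roll receipts up to one row per purchase order per month
        select po#, trunc(date_col,'mm') mth, sum(amt) po_total
          from t
         where date_col between :a and :b
         group by po#, trunc(date_col,'mm')
       )
 group by mth;
```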
 
 
 
Great - 
syed, November  10, 2004 - 7:09 am UTC
 
 
Tom
I have a tables as follows
create table matches
( reference varchar2(9),
  endname varchar2(20),
  beginname varchar2(30),
  DOB date, 
  ni varchar2(9)
)
/
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A1','SMITH','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A2','JONES','TOM',to_date('1/1/1970','dd/mm/yyyy'),'AA1234568');
insert into matches values ('A3','JONES','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A4','BROWN','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234570');
insert into matches values ('A4','JONES','BRAD',to_date('1/1/1961','dd/mm/yyyy'),'AA1234571');
insert into matches values ('A1','SMITH','BOB',to_date('1/1/1976','dd/mm/yyyy'),'AA1234567');
insert into matches values ('A3','JACKSON','TOM',to_date('1/1/1971','dd/mm/yyyy'),'AA1234569');
insert into matches values ('A2','JACKSON','BOB',to_date('1/1/1962','dd/mm/yyyy'),'AA1234568');
 insert into matches values ('A5','JACKSON','TOM',to_date('1/1/1920','dd/mm/yyyy'),'AA1234569');
commit;
SQL> select rownum,REFERENCE,ENDNAME,BEGINNAME,DOB,NI from matches;
 ROWNUM REFERENCE ENDNAME  BEGINNAME  DOB       NI
------- --------- -------- ---------- --------- ---------
      1 A1        SMITH    BOB        01-JAN-76 AA1234567
      2 A1        SMITH    TOM        01-JAN-70 AA1234568
      3 A2        JONES    TOM        01-JAN-70 AA1234568
      4 A3        JONES    TOM        01-JAN-71 AA1234569
      5 A4        BROWN    BRAD       01-JAN-61 AA1234570
      6 A4        JONES    BRAD       01-JAN-61 AA1234571
      7 A1        SMITH    BOB        01-JAN-76 AA1234567
      8 A3        JACKSON  TOM        01-JAN-71 AA1234569
      9 A2        JACKSON  BOB        01-JAN-62 AA1234568
     10 A5        JACKSON  TOM        01-JAN-20 AA1234569
I need to show duplicates where the following columns values are the same.
a) REFERENCE, ENDNAME,BEGINNAME,DOB,NI
b) ENDNAME,BEGINNAME,NI
c) REFERENCE,NI
So, 
rownum 1 and 7 match criteria a)
rownum 8 and 10 match criteria b) 
rownum 1 and 7, rownum 3 and 9, rownum 4 and 8 match criteria c)
How can I select this data out to show number matching each criteria ?
 
 
 
November  10, 2004 - 7:23 am UTC 
 
"How can I select this data out to show number matching each criteria ?"
is ambiguous.
If you add columns:
count(*) over (partition by reference, endname, beginname, dob, ni ) cnt1,
count(*) over (partition by endname, beginname, ni) cnt2,
count(*) over (partition by reference,ni) cnt3
it'll give you the "dup count" by each partition -- technically showing you the "number matching each criteria" 
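Putting those together, a sketch of a complete query against the MATCHES table above (the outer filter keeps only rows that are duplicated under at least one of the three criteria):

```sql
select *
  from (
        select m.*,
               count(*) over (partition by reference, endname, beginname, dob, ni) cnt1,
               count(*) over (partition by endname, beginname, ni)                 cnt2,
               count(*) over (partition by reference, ni)                          cnt3
          from matches m
       )
 where cnt1 > 1 or cnt2 > 1 or cnt3 > 1;
```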
 
 
 
analytics problem
David, November  19, 2004 - 9:37 am UTC
 
 
I am newish to analytic functions and have hit a problem as follows:
create table a
(accno         number(8)     not null,
 total_paid    number(7,2)   not null)
/
create table b
(accno         number(8)     not null,
 due_date      date          not null,
 amount_due    number(7,2)   not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into b values (1, '01-oct-04', 1000);
insert into b values (1, '01-jan-05', 900);
insert into b values (1, '01-apr-05', 700);
insert into b values (2, '01-oct-04', 1000);
insert into b values (2, '01-jan-05', 900);
insert into b values (2, '01-apr-05', 700);
insert into b values (3, '01-oct-04', 1000);
insert into b values (3, '01-jan-05', 900);
insert into b values (3, '01-apr-05', 700);
insert into b values (4, '01-oct-04', 1000);
insert into b values (4, '01-jan-05', 900);
insert into b values (4, '01-apr-05', 700);
If I then do this query...
SQL> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  order by a.accno,
 21           to_date(b.due_date, 'dd-mon-rr')
 22  /
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000       1000
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700
         2       1500 01-OCT-04       1000       1000
         2       1500 01-JAN-05        900        900
         2       1500 01-APR-05        700        700
         3       2000 01-OCT-04       1000       1000
         3       2000 01-JAN-05        900        900
         3       2000 01-APR-05        700        700
         4       3000 01-OCT-04       1000       1000
         4       3000 01-JAN-05        900        900
         4       3000 01-APR-05        700        700
12 rows selected.
...TO_PAY does not give what I was expecting. But if I do it by individual accno I get what I'm after:
SQL> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  and a.accno = &accno
 21  order by a.accno,
 22           to_date(b.due_date, 'dd-mon-rr')
 23  /
Enter value for accno: 1
old  20: and a.accno = &accno
new  20: and a.accno = 1
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000          0
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700
3 rows selected.
SQL> /
Enter value for accno: 2
old  20: and a.accno = &accno
new  20: and a.accno = 2
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         2       1500 01-OCT-04       1000          0
         2       1500 01-JAN-05        900        400
         2       1500 01-APR-05        700        700
3 rows selected.
SQL> /
Enter value for accno: 3
old  20: and a.accno = &accno
new  20: and a.accno = 3
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         3       2000 01-OCT-04       1000          0
         3       2000 01-JAN-05        900          0
         3       2000 01-APR-05        700        600
3 rows selected.
SQL> /
Enter value for accno: 4
old  20: and a.accno = &accno
new  20: and a.accno = 4
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         4       3000 01-OCT-04       1000          0
         4       3000 01-JAN-05        900          0
         4       3000 01-APR-05        700          0
3 rows selected.
What is needed for first query above to work?
cheers,
David 
 
 
November  19, 2004 - 11:31 am UTC 
 
ops$tkyte@ORA9IR2> select a.accno,
  2         a.total_paid,
  3         b.due_date,
  4         b.amount_due,
  5         case
  6  when sum(b.amount_due)
  7  over (<b>partition by a.accno</b> order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid <= 0
  8  then 0
  9  when sum(b.amount_due)
 10  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid < b.amount_due
 11  then sum(b.amount_due)
 12  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid
 13  when sum(b.amount_due)
 14  over (partition by a.accno order by to_date(b.due_date, 'dd-mon-rr')) - a.total_paid >= b.amount_due
 15  and a.total_paid >= 0
 16  then b.amount_due
 17  end to_pay
 18  from a,b
 19  where a.accno = b.accno
 20  order by a.accno,
 21           to_date(b.due_date, 'dd-mon-rr')
 22  /
 
     ACCNO TOTAL_PAID DUE_DATE  AMOUNT_DUE     TO_PAY
---------- ---------- --------- ---------- ----------
         1       1000 01-OCT-04       1000          0
         1       1000 01-JAN-05        900        900
         1       1000 01-APR-05        700        700
         2       1500 01-OCT-04       1000          0
         2       1500 01-JAN-05        900        400
         2       1500 01-APR-05        700        700
         3       2000 01-OCT-04       1000          0
         3       2000 01-JAN-05        900          0
         3       2000 01-APR-05        700        600
         4       3000 01-OCT-04       1000          0
         4       3000 01-JAN-05        900          0
         4       3000 01-APR-05        700          0
 
12 rows selected.
 
 
 
 
 
 
excellent
David, November  19, 2004 - 12:02 pm UTC
 
 
many thanks 
 
 
Limitation of Analytic Functions
Nilanjan Ray, December  16, 2004 - 4:27 am UTC
 
 
I am using the following view
create or replace view vw_history as
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history 
The following SQL statement invariably does a full table scan on 11,286,191 rows of ADC_HISTORY and runs for 20-25 mins.
select *
from vw_history
where txm_dt between to_date('01/01/2002','dd/mm/yyyy') and to_date('01/01/2002','dd/mm/yyyy');
The query returns 4200 rows. ADC_HISTORY has 11,286,191 rows. I have the following indexes: ADC_HISTORY_IDX8 on the txm_dt column and ADC_HISTORY_IDX1 on the spot_key column. Both have good selectivities.
But when the required query is ran without the view it properly uses the index ADC_HISTORY_IDX8
select
txm_dt,s_key,s_hist_slno,cm_key,burst_key,cm_channel_key
,(lag(s_hist_slno,1,0) over(partition by s_key,s_hist_slno order by s_key,s_hist_slno)) prv_hist_slno
from adc_history 
 
I had raised a tar and it says: This is the expected behaviour - "PREDICATES ARE NOT PUSHED IN THE VIEW IF ANY ANALYTIC FUNCTIONS ARE USED"
Is there any way to work around this limitation? I just cannot think of the painful situation if I am unable to use views with analytics!!!!
Your help is absolutely necessary. Thanks in advance  
 
December  16, 2004 - 8:27 am UTC 
 
guess what -- your two queries <b>return different answers</b>..
did you consider that?  did you check that?  
they are TOTALLY DIFFERENT.  Analytics are applied after predicates.  The view -- it has no predicate.  The query -- it has a predicate.  You'll find that you have DIFFERENT result sets.
don't you see that as a problem?
It is not that you are "unable to use views"
It is that "when I use a view, I get answer 1, when I do not use a view, I get answer 2"
which answer is technically correct here?
Think about it.
consider this example (using RBO just to make it so that "if an index could be used it would" to stress the point):
ops$tkyte@ORA9IR2> create table emp as select * from scott.emp;
 
Table created.
 
ops$tkyte@ORA9IR2> create index job_idx on emp(job);
 
Index created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> create or replace view v
  2  as
  3  select ename, sal, job,
  4         sum(sal) over (partition by job) sal_by_job,
  5             sum(sal) over (partition by deptno) sal_by_deptno
  6    from emp
  7  /
 
View created.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> set autotrace on explain
ops$tkyte@ORA9IR2> select *
  2    from v
  3   where job = 'CLERK'
  4  /
 
ENAME             SAL JOB       SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER           1300 CLERK           4150          8750
JAMES             950 CLERK           4150          9400
SMITH             800 CLERK           4150         10875
ADAMS            1100 CLERK           4150         10875
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   VIEW OF 'V'
   2    1     WINDOW (SORT)
   3    2       WINDOW (SORT)
   4    3         TABLE ACCESS (FULL) OF 'EMP'
 
 
<b>so, one might ask "well - hey, I've got that beautiful index on JOB, I said "where job = 'CLERK'", what's up with that full scan?"
in fact, when I do it "right" -- without the evil view:</b>
 
ops$tkyte@ORA9IR2> select ename, sal, job,
  2         sum(sal) over (partition by job) sal_by_job,
  3             sum(sal) over (partition by deptno) sal_by_deptno
  4    from emp
  5   where job = 'CLERK'
  6  /
 
ENAME             SAL JOB       SAL_BY_JOB SAL_BY_DEPTNO
---------- ---------- --------- ---------- -------------
MILLER           1300 CLERK           4150          1300
SMITH             800 CLERK           4150          1900
ADAMS            1100 CLERK           4150          1900
JAMES             950 CLERK           4150           950
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   WINDOW (SORT)
   2    1     WINDOW (SORT)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
   4    3         INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
 
<b>it very rapidly uses my index !!!   stupid views...
but wait.
what's up with SAL_BY_DEPTNO, that appears to be wrong... hmmm, what happened?
What happened was we computed the sal_by_deptno in the query without the view AFTER doing "where job = 'CLERK'"
YOU are doing your LAG() analysis AFTER applying the predicate.  Your lags in your query without the view -- they are pretty much "not accurate"
Note that when the predicate CAN be pushed:</b>
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select ename, sal, sal_by_job
  2    from v
  3   where job = 'CLERK'
  4  /
 
ENAME             SAL SAL_BY_JOB
---------- ---------- ----------
SMITH             800       4150
ADAMS            1100       4150
JAMES             950       4150
MILLER           1300       4150
 
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=RULE
   1    0   VIEW OF 'V'
   2    1     WINDOW (BUFFER)
   3    2       TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
   4    3         INDEX (RANGE SCAN) OF 'JOB_IDX' (NON-UNIQUE)
 
<b>it most certainly is.  here the predicate can safely be pushed -- since the analytic is computed "by job", a predicate on "job" can be applied FIRST and then the analytic computed.  
When pushing would change the answer -- we cannot do it.
When pushing the predicate would not change the answer -- we do it.
This is not a 'limitation', this is about "getting the right answer"</b>
 
 
ops$tkyte@ORA9IR2> set autotrace off
ops$tkyte@ORA9IR2> alter session set optimizer_mode = choose;
 
Session altered.
 
 
 
 
 
 
Great!!!
Nilanjan Ray, December  17, 2004 - 12:59 pm UTC
 
 
Simply amazing explanation. Cleared my doubts still further. One of the best explanations, in simple concise terms, I have seen on "Ask Tom". You know what, people should take enough caution and learn lessons from you before making misleading statements like "...LIMITATIONS...". In your terms yet again "Analytics Rock".
Regards 
 
 
Using analytical function, LEAD, LAG
Praveen, December  24, 2004 - 9:26 am UTC
 
 
Hi Tom,
    The analytic functions LEAD (and LAG) accept the offset parameter as an integer, which is a count of rows to be skipped from the current row before accessing the leading/lagging row. What if I want to access leading rows based on the value of a column of the current row, i.e. a function applied to the current row's column value to locate the leading row?
As an example: I have a table 
create table t(id integer, dt date);
For each id, start with the first record, after ordering by dt ASC. Get the next record where dt = first_row.dt + 10 min, then the next where dt = first_row.dt + 20 min, and so on; each time the offset is cumulatively increased by 10 min.
Suppose we don't get an exact match (i.e. no row has dt = first_row.dt + 10 min, say); then we select the row closest to the expected one, but lying within +/- 10 seconds.
insert into t values (1, to_date('12/20/2004 00:00:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:05:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:09:55', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:10:00', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:15:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:19:54', 'mm/dd/yyyy hh24:mi:ss')); --Not selected.
insert into t values (1, to_date('12/20/2004 00:19:55', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:25:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:30:05', 'mm/dd/yyyy hh24:mi:ss')); --Selected.
insert into t values (1, to_date('12/20/2004 00:30:06', 'mm/dd/yyyy hh24:mi:ss')); --Not Selected.
insert into t values (1, to_date('12/20/2004 00:35:00', 'mm/dd/yyyy hh24:mi:ss'));
insert into t values (1, to_date('12/20/2004 00:39:55', 'mm/dd/yyyy hh24:mi:ss')); --Either this or below record is selected. 
insert into t values (1, to_date('12/20/2004 00:40:05', 'mm/dd/yyyy hh24:mi:ss')); --Either this or above record is selected.
My output would be:
id    dt
-----------
1    12/20/2004 00:00:00 AM
1    12/20/2004 00:10:00 AM  --Exactly matches first_row.dt + 10min
1    12/20/2004 00:19:55 AM  --Closest to first_row.dt + 20min +/- 10sec
1    12/20/2004 00:30:05 AM  --Closest to first_row.dt + 30min +/- 10sec
1    12/20/2004 00:39:55 AM OR 12/20/2004 00:40:05 AM --Closest to first_row.dt + 40min +/- 10sec
The method I followed, after failing with LEAD, is:
Step#1
------
Get a subset of the dt column values: a cumulative series of 10-minute steps starting from the dt value of the first row (after rounding down to the nearest multiple of 10 minutes).
In this example I will get a subset:
12/20/2004 00:00:00 AM
12/20/2004 00:10:00 AM
12/20/2004 00:20:00 AM
12/20/2004 00:30:00 AM
12/20/2004 00:40:00 AM
This query will do it:
SELECT t1.id,
         (  min_dt -   MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
  FROM (SELECT   id, MIN (dt) min_dt,
                 ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
            FROM t
           WHERE id = 1
        GROUP BY id) t1, t
 WHERE ROWNUM <= max_rows + 1     
Step#2:
-------
This subquery is joined with table t to get only those records from t that are either equal to the dts in the result set returned by the subquery or fall within the +/- 10 sec range (not only the closest, but all of them).
SELECT t.id, dt_rounded, ABS (t.dt - dt_rounded) * 24 * 60 * 60 dt_diff_in_sec
FROM t, 
    (SELECT t1.id,
         (  min_dt -   MOD ((ROUND (min_dt, 'mi') - ROUND (min_dt, 'hh')) * 24 * 60, 10) / (24 * 60)) + (ROWNUM - 1) * 10 / (24 * 60) dt_rounded
     FROM (SELECT   id, MIN (dt) min_dt,
                 ROUND ((MAX (dt) - MIN (dt)) * 24 * 60 / 10) max_rows
            FROM t
           WHERE id = 1
        GROUP BY id) t1, t
     WHERE ROWNUM <= max_rows + 1) t2     
WHERE t.id = 1          
AND ABS (t.dt - dt_rounded) * 24 * 60 * 60 <= 10
ORDER BY t.id, dt_rounded, dt_diff_in_sec;
I agree, this result set will include duplicate records, which I need to remove procedurally while looping through the cursor; the order by clause simplifies this.
Now you might have guessed the problem. If table t contains more than 1000 records, the query makes me wait at least 2 minutes! And that is when I am planning to load at least 70,000 records! 
I wrote a procedure which handles the situation a little better, but I don't know if an analytic query can help me bring back the performance. I could do it if LEAD had the functionality I mentioned in the first paragraph.  Do you have any hints?
Thanks and regards
Praveen 
 
December  24, 2004 - 9:54 am UTC 
 
you'd be looking at first_value with range windows, not lag and lead in this case.
 
 
 
 
Windowing clause and range function.
Praveen, December  25, 2004 - 1:29 pm UTC
 
 
Hi Tom,
    Thank you for the suggestion. I am not very familiar with analytic queries; I have tried based on your advice but was unable even to start. I am stuck on the first step itself - specifying the range in the windowing clause. In the windowing clause, we specify an integer to get the preceding rows based on the current column value (CLARK's example, page 556, Analytic Functions). 
In my above example I wrote a query which contains:
 
FIRST_VALUE(id)
OVER (ORDER BY dt DESC
      RANGE 10 PRECEDING)
10, in the windowing clause, will give me records that fall within the 10 days preceding the current row. But I need the records 10 minutes preceding - and, at the same time, all records that fall within +/- 10 sec if no record exactly 10 minutes later is found (please see the description of the problem in my previous question).
Kindly give me a clearer picture of the windowing clause,
and also how you would approach the above problem.
Thanks and regards
Praveen 
 
December  26, 2004 - 12:19 pm UTC 
 
do you have Expert One on One Oracle?   I have extensive examples in there.
range 10 = 10 days.
range 10/24 = 10 hours
range 10/24/60 = 10 minutes......
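As a minimal sketch of that (untested, using the t(id, dt) table from the question above): RANGE is expressed in the units of the ORDER BY column, and for a DATE column the unit is days, so fractions of a day give hours, minutes and seconds.

```sql
-- first value of dt among rows within the 10 minutes
-- preceding the current row, per id
select id, dt,
       first_value(dt) over (partition by id
                             order by dt
                             range 10/24/60 preceding) first_dt_last_10min
  from t
 order by id, dt;
```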
 
 
 
 
I do have Expert One on One
Praveen, December  26, 2004 - 2:24 pm UTC
 
 
Hi Tom,
   I got my first glimpse into analytic queries through your book only. Although I had attempted to learn them through the Oracle documentation a couple of times earlier, I was never able to write a decent query using analytic functions. Now, after spending a few hours with your book, I can see that these functions are not as complex as I thought earlier. 
The 'hiredate' example you have given in the book is calculating in terms of days. (Pg:555)
"select ename, sal, hiredate, hiredate-100 window_top
       first_value(ename)
       over(order by hiredate asc
       range 100 preceding) ename_prec,...."
I got the hint from your follow-up; I should have thought a little for myself.
Thank you Tom,
Praveen. 
 
 
A reader, December  26, 2004 - 5:49 pm UTC
 
 
Tom,
Any dates when you would be releasing your book on Analytic?
Thanks. 
 
December  26, 2004 - 6:00 pm UTC 
 
doing a 2nd edition of Expert One on One Oracle now -- not on the list yet. 
 
 
 
Great answer!
Shimon Tourgeman, December  27, 2004 - 2:21 am UTC
 
 
Dear Tom,
Could you please tell us when you  are going to publish the next edition of your books, covering 9iR2 and maybe 10g, as you stated here?
Merry Christmas and a Happy New Year!
Shimon.
 
 
December  27, 2004 - 10:06 am UTC 
 
sometime in 2005, but not the first 1/2 :) 
 
 
 
Using range windows
Praveen, January   03, 2005 - 8:09 am UTC
 
 
Hi Tom,
Please allow me to explain the problem again which you had 
followed up earlier (Please refer: "Using analytical 
function, LEAD, LAG"). In the table t(id integer, dt date) 
I have records which only differ by seconds ('dt' column). 
Could you please help me to write a query to create windows 
such that each window groups records based on the 
expression 590 <= dt_1 <= 610 (590 & 610 are date 
difference between first record and current record in 
seconds and dt_1 is the 'dt' column value of the first record in 
each window after ordering by 'id' and 'dt' ASC).
The idea is to find a record following the first record 
which leads by 10 minutes. If exact match is not found 
apply a tolerance of +/-10 seconds. Once the nearest match 
is found (if multiple matches are found, select any), start 
from the next record and repeat the process. (Please see 
the scripts I had given earlier). 
In your follow up, you had suggested the use of 
first_value() analytical function with range windows. But 
it looks like it is pretty difficult to generate the kind 
of windows I specified above. And in your book, examples of
such complex nature were not given (pardon me for being 
critical).
Your answer will help me to get a deeper and practical 
understanding of analytical functions while at the same 
time may help us to bring down a 12 hour procedure to less 
than 5 hours.
Thanks and regards
Praveen 
 
January   03, 2005 - 9:11 am UTC 
 
no idea what 590 is.  days? hours? seconds?
sorry - this doesn't compute to me.
590 <= dt_1 <= 610???
 
 
 
 
Delete Records Older Than 90 Days While Keeping Max
Mac, January   03, 2005 - 10:24 am UTC
 
 
There is a DATE column in a table. I need to delete all records older than 90 days -- except if the newest record for a unique key happens to be older than 90 days, I want to keep it and delete the prior records for that key value.
How? 
 
January   03, 2005 - 10:26 am UTC 
 
if the "newest record for a unique key"
if the key is unique.... then the date column is the only thing to be looked at?
that is, if the key is unique, then the oldest record is the newest record is in fact the only record.... 
 
 
 
Oops, but
A reader, January   03, 2005 - 11:01 am UTC
 
 
Sorry, forgot to mention that the DATE column is a part of the unique key. 
 
 
Sorry, I went a bit fast...
Praveen, January   03, 2005 - 2:00 pm UTC
 
 
Hi Tom,
    
Sorry, I didn't explain properly. 
    
590 = (10 minutes * 60) seconds - 10 seconds
610 = (10 minutes * 60) seconds + 10 seconds
    
Here I am looking for a record (say rn) exactly 
600 sec (10 min) later than the first record in 
the range window. If I don't get an exact match, 
I try to find the record closest to rn, 
but lying within a range 10 seconds less 
than or more than rn.
And the condition
         
"590 <= dt_1 <= 610" tries to eliminate all other 
records inside the range window that does not follow 
the above rule.
         
 dt_1 is the dt column value of any row following the 
 first row in a given range window, such that the 
 difference between dt_1 and the dt of the first row is
 between 590 seconds and 610 seconds. I am interested in
 only the one record which lies closest to 600 seconds.
     
 I hope, the picture is more clear to you now. As an 
 example, 
     
 id   dt
 -----------------------------
 1    12/20/2004 00:00:00 AM   --Range window #1
 1    12/20/2004 00:09:55 AM
 1    12/20/2004 00:10:00 AM   --Selected (Closest to 12/20/2004 00:10:00 AM)
 ............................
 1    12/20/2004 00:10:10 AM   --Range window #2  
 1    12/20/2004 00:19:55 AM   --Selected (Closest to 12/20/2004 00:20:00 AM)
 1    12/20/2004 00:20:55 AM   
 ............................
 1    12/20/2004 00:20:55 AM   --Range window #3  
 1    12/20/2004 00:25:00 AM   --Nothing to select
 1    12/20/2004 00:29:10 AM   --Nothing to select      
 ...........................
 1    12/20/2004 00:30:05 AM   --Range window #4
 1    12/20/2004 00:39:55 AM   --Either one is selected
 1    12/20/2004 00:40:05 AM   --Either one is selected
 -----------------------------
Thanks and regards
Praveen     
          
          
 
January   03, 2005 - 10:24 pm UTC 
 
that is first_value, last_value with a range window and the time range is
N * 1/24/60/60 -- for N seconds. 
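One possible sketch of that hint (hypothetical, untested against the original data): a range window running from 590 to 610 seconds after each row collects the candidate rows; picking the row truly closest to the 600-second mark would still need an extra ranking step on top of this.

```sql
-- for each row, the earliest dt falling 590..610 seconds after it
-- (N seconds is N/24/60/60 of a day)
select id, dt,
       min(dt) over (partition by id
                     order by dt
                     range between 590/24/60/60 following
                               and 610/24/60/60 following) candidate_dt
  from t
 order by id, dt;
```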
 
 
 
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?
jayaramj@quinnox.com, January   13, 2005 - 3:11 pm UTC
 
 
Hi Tom,
In answer to the question 'How to mimic Ora10g LAST_VALUE(... IGNORE NULLS)?' from reviewer Sergey (from Norway) in this post you have proposed the following solution:
ops$tkyte@ORA10G> select t,
  2         sign_v,
  3             v,
  4             substr( max(data) over (order by t), 7 ) v2
  5    from (
  6  SELECT TD.T,
  7         SIGN(NVL(TV.V, 0)) sign_v,
  8          NVL(TV.V, LAST_VALUE(TV.V IGNORE NULLS) OVER ( ORDER BY TD.T )) V,
  9           case when tv.v is not null
 10                 then to_char( row_number() 
                                  over (order by td.t), 'fm000000' ) || tv.v
 11                    end data
 12      FROM TD, TV
 13      WHERE TV.T(+) = TD.T
 14          )
 15   ORDER BY T
 16      ;
The problem is that this solution converts the data type of the column (in this case column TV.V) to a string (V2 in the result is a string). The result would then need to be converted back to the original data type. 
It is best to avoid such data type conversion. Is there a solution to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS) in Oracle 9i without the datatype conversion?
 
 
January   13, 2005 - 3:45 pm UTC 
 
encode the date as a string using to_char( v, 'yyyymmddhh24miss' ) and, in the substr step, decode it back out: to_date( substr(...), 'yyyymmddhh24miss' ) 
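Putting the whole trick together for a DATE column (a sketch only; tv here stands in for any table with an ordering column t and a sparse date column v, as in the earlier example):

```sql
-- pack "row_number || date-as-fixed-width-string" so that the running
-- MAX() carries the most recent non-null value forward, then
-- substr/to_date decode the date back out with no precision loss
select t,
       to_date( substr( max(data) over (order by t), 7 ),
                'yyyymmddhh24miss' ) last_v_ignore_nulls
  from (select t,
               case when v is not null
                    then to_char( row_number() over (order by t), 'fm000000' )
                         || to_char( v, 'yyyymmddhh24miss' )
               end data
          from tv)
 order by t;
```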
 
 
 
How to mimic Oracle 10g LAST_VALUE(... IGNORE NULLS)?  
Jay, January   14, 2005 - 12:44 am UTC
 
 
In response to your post above - taking care of dates (for datatype conversion) is not complex (though timestamp variants would require a different format string). Object columns are a different story altogether; these cannot be easily converted to strings. Is there a better solution that does not require datatype conversion (and hence does not require any knowledge of the column datatype in this SQL)? 
 
January   14, 2005 - 8:06 am UTC 
 
upgrade to 10g. 
 
 
 
find prior collect_date to the max collect_date for each customer 
JANE, January   25, 2005 - 4:30 pm UTC
 
 
Hello,Tom!
I work in ORACLE 8I
I have table with 2 columns:cstmr_no,collect_date
CREATE TABLE CSTMR_dates
(
  CSTMR_NO             NUMBER(8)                NOT NULL,
  COLLECT_DATE  DATE                     NOT NULL);
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy'));
How can i do instead this query the query using analytical
function:
select cstmr_no,max(collect_date) from 
CSTMR_dates
where  collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no 
In production I have thousands of records in the table.
THANKS A LOT
JANE 
 
January   25, 2005 - 6:59 pm UTC 
 
no idea what "return_collect_date" is. or where it comes from.
the sql is not sql... 
 
 
 
Mistake:return_collect_date is a collect_date
JANE, January   26, 2005 - 2:58 am UTC
 
 
Thank you for answer
JANE 
 
January   26, 2005 - 8:46 am UTC 
 
but this sql:
select cstmr_no,max(collect_date) from 
CSTMR_dates
where  collect_date<(select max(COLLECT_DATE)
group by cstmr_no 
is still not sql and I don't know if you want to 
a) delete all old data BY CSTMR_NO (eg: keep just the record with the max(collect_date) BY CSTMR_NO
b) delete all data such that the collect_date is not equal to the max(collect_date)
I cannot suggest a way to rewrite an invalid sql query. 
 
 
 
No,i want to do the next:
A reader, January   26, 2005 - 9:08 am UTC
 
 
I just have to present the data, without deleting anything.
For each cstmr I have to see:
cstmr_no max(collect_date) last prior date to max
======== ================= ====================== 
18       01/05/04          01/03/04     
248      01/11/04          01/03/04     
insert into cstmr_dates
values(18,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/03/04','dd/mm/yy'));
insert into cstmr_dates
values(18,to_date('01/05/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/11/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/02/04','dd/mm/yy'));
insert into cstmr_dates
values(248,to_date('01/03/04','dd/mm/yy'));
 
 
January   26, 2005 - 9:31 am UTC 
 
wow, how we got from:
select cstmr_no,max(collect_date) from 
CSTMR_dates
where  collect_date<(select max(RETURN_COLLECT_DATE)
group by cstmr_no 
to this, well -- just "wow".  horse of a very different color.
I have to sort of guess -- maybe I'll get it right -- you want 
a) every cstmr_no, 
b) the last two dates recorded for them.
well, after editing your inserts to make them actual sql that can run.... (you don't really use YY in real life, do you? please please say "no, that was a mistake...")
ops$tkyte@ORA9IR2> select cstmr_no,
  2         max( decode(rn,1,collect_date) ) d1,
  3         max( decode(rn,2,collect_date) ) d1
  4    from (
  5  select cstmr_no,
  6         collect_date,
  7             row_number() over (partition by cstmr_no order by collect_date desc nulls last) rn
  8    from cstmr_dates
  9         )
 10   where rn <= 2
 11   group by cstmr_no
 12  /
 
  CSTMR_NO D1        D1
---------- --------- ---------
        18 01-MAY-04 01-MAR-04
       248 01-NOV-04 01-MAR-04
 
 
 
 
 
 
Lead/Lag and Indexes
Rob H, February  22, 2005 - 6:12 pm UTC
 
 
We are using the Lead and Lag functions and I have run into an issue of Index usage.
lets say I have 2 tables
select customer_num, prod_id, date_sold, total_sales from sales_table_NA
and 
select customer_num, prod_id, date_sold, total_sales from sales_table_EUR
if i do a
create view eur_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_EUR
group by customer_account, prod_id, trunc(sales_date,'mon')
create view na_sales as
select customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from sales_table_NA
group by customer_account, prod_id, trunc(sales_date,'mon')
There are indexes on the tables for customer_acccount
Now, if I 
select * from na_sales where customer_account=1 
the index is used. Same for eur_sales.  However, if I UNION them together it does not (WINDOW SORT on first select and WINDOW BUFFER on second).  If I remove the lead function and UNION them, the index is used.
Any help?
 
 
February  23, 2005 - 1:56 am UTC 
 
do you really want UNION or UNION ALL.........
(do you know the difference between the two)....
if you had given me simple setup scripts, I would have been happy to see if that makes a difference, but oh well. 
 
 
 
Potential Solution
Rob H, February  22, 2005 - 6:54 pm UTC
 
 
Rather than pre-sum the data into 2 views I found that union'ing (actually UNION ALL) the data, then sum and Lag works fine.
ie
select
customer_account, prod_id, trunc(sales_date,'mon') month_purch,
sum(total_sales) sales_current, lead(sum(total_sales),1) over(partition by 
customer_account, prod_id order by trunc(sales_date,'mon') desc) sales_last
from(
select customer_account, prod_id, sales_date, total_sales from sales_table_NA
union all 
select customer_account, prod_id, sales_date, total_sales from sales_table_EUR)
group by customer_account, prod_id, trunc(sales_date,'mon')
 
 
Attitude....
Rob H, February  23, 2005 - 9:54 am UTC
 
 
What's the deal? Having a bad day?  I'm sorry, but I assumed from the select statements you could infer structure.  Yes, I was using UNION ALL, yes, I know the difference (uh, feeling a bit rude are we?) but I didn't realize until after I posted that I missed that (a nice feature would be to be able to edit a post for a certain time after post).  I generalized the data structure and SQL for confidentiality reasons.  For a guy who is so hard on people's IM speak, you forget to capitalize your sentences :)
Now, UNION vs UNION ALL didn't affect index usage (it did, however, have 'other' performance issues).  You can see from my next post that I worked on the issue and resolved it by not pre-summing each table.  With the new query, if someone issues a select with no 'where customer_account=' then it's slower (but that also wasn't the goal).
Thanks 
 
February  24, 2005 - 4:35 am UTC 
 
No?  I was simply asking "do you know the difference between the two", for I find most people
a) don't know union all exists
b) don't know the semantic difference between union and union all
c) don't know the performance penalty involved with union vs union all when they didn't need to use UNION
Your example, as posted, did not use UNION ALL.  Look at your text:
<quote>
Now, if I 
select * from na_sales where customer_account=1 
the index is used. Same for eur_sales.  However, if I UNION them together it 
does not (WINDOW SORT on first select and WINDOW BUFFER on second).  If I remove 
the lead function and UNION them, the index is used.
</quote>
I quite simply asked:
does union all change the behaviour? (i did not have an example with table creates and such to work with, so I couldn't really 'test it', I don't have your tables, your indexes, your datatypes, etc)
do you need to use union, you said union, you did not say union all.  do you know the difference between the two.
Sorry if you took it as an insult, I can only comment based on the data provided.  I had to assume you like most of the world was using UNION, not UNION ALL and simply wanted to know if you could use union all, if union all made a difference, if you knew the difference between the two.
If I had prescience, I could have read your subsequent post and not asked any questions, I guess.
Not having a bad day, just working with information provided.  I was not trying to insult you -- I was simply "asking".
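For reference, the semantic difference is easy to demonstrate with a self-contained example against DUAL:

```sql
-- UNION removes duplicates (and must sort or hash the entire result
-- set to do so); UNION ALL simply concatenates the two row sources.
select 1 x from dual
union
select 1 x from dual;      -- one row

select 1 x from dual
union all
select 1 x from dual;      -- two rows
```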
 
 
 
 
Analytics 
Neelz, February  24, 2005 - 5:34 am UTC
 
 
Dear Sir,
I had gone through the above examples and was wondering whether analytical functions could be used when aggregating multiple columns from a table, 
CREATE TABLE T (
    SUPPLIER_CD CHAR(4) NOT NULL, 
    ORDERRPT_NO CHAR(8) NOT NULL, 
        ORDER_DATE CHAR(8) NOT NULL, 
    STORE_CD CHAR(4) NOT NULL, 
    POSITION_NO CHAR(3) NOT NULL, 
    CONTROL_FLAG CHAR(2), 
    ORDERQUANTITY_EXP NUMBER(3) DEFAULT (0) NOT NULL, 
    ORDERQUANTITY_RES NUMBER(3) DEFAULT (0) NOT NULL, 
        ENT_DATE DATE DEFAULT (SYSDATE) NOT NULL, 
    UPD_DATE DATE DEFAULT (SYSDATE) NOT NULL, 
    CONSTRAINT PK_T PRIMARY KEY(SUPPLIER_CD, ORDERRPT_NO,  ORDER_DATE, STORE_CD));
CREATE INDEX IDX_T ON T (SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE);
insert into t values('5636','62108373','20041129','0007','2','00',1,1, to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108373','20041129','0012','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'), to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0014','2','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('5636','62108384','20041129','0015','3','00',1,1,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1309','4','00',8,8,to_date('2004/11/29', 'yyyy/mm/dd'),to_date('2004/11/30', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1312','12' ,'00',8,8,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
insert into t values('1000','11169266','20040805','1313','13' ,'00',12,12,to_date('2004/04/22', 'yyyy/mm/dd'),to_date('2004/11/23', 'yyyy/mm/dd'));
Currently the following query is used:-
SELECT 
    SUPPLIER_CD,  ORDERRPT_NO,  ORDER_DATE,  
    SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1, 
    SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2, 
    SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3, 
    SUM(ORDERQUANTITY_RES) ORDER_TOTAL 
FROM 
    T 
GROUP BY 
    SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE 
The execution plan when this query is executed on the real table which has 4m records is : - 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=103002 Card=3571095 Bytes=107132850)
   1    0   SORT (GROUP BY NOSORT) (Cost=103002 Card=3571095 Bytes=107 132850)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=10 3002 Card=3571095 Bytes=107132850)
   3    2       INDEX (FULL SCAN) OF 'IDX_T' (NON-UNIQUE) (Cost=26942 Card=3571095)
Could you please tell me whether analytical functions could be used over here or a better approach for this query.
Thanks for your great help 
 
 
February  24, 2005 - 5:49 am UTC 
 
there would be no need of analytics here.  analytics would be useful to get the 'aggregates' while preserving the 'details'
eg:
select empno, sal, sum(sal) over (partition by deptno)
  from emp;
shows the empno, their sal and the sum of all salaries in their dept.  that would be instead of coding:
select empno, sal, sum_sal
  from emp, (select deptno, sum(sal) sum_sal from emp group by deptno) t
 where emp.deptno = t.deptno
/
 
 
 
 
I was just wondering
A reader, February  24, 2005 - 6:25 am UTC
 
 
how would analytics help in the following example (the data nodes are implemented as rows in a table with two columns as pointers: split-from and merge-to, and the third column is "value", some number, not shown on diagram):
http://img23.exs.cx/my.php?loc=img23&image=directedgraph11th.png
The task is to use this directed dependency graph and prorate the "value" column in each row/node in the following way:
foreach node
-start with a node, for example 16
-visit each hierarchy on which 16 depends, in this case hierarchies for 14 and 15, SUM their values and the current value of node 16, and that will be new, prorated value for node 16
-repeat this recursively for each sub-hierarchy
until all nodes are prorated
I was thinking maybe to use a combination of sys_connect_by_path and analytic functions, but I am not sure how. Any thoughts?  
February  24, 2005 - 6:51 am UTC 
 
you won't get very far with that structure in 9i and before.  connect by "loop" will be an error you see lots of with a directed graph.
analytics won't be appropriate either, they work on windows - not on hierarchies.
sys_connect_by_path is going to give you a string, not a sum
a scalar subquery in 10g with NOCYCLE on the query might work. 
 
 
 
What if there is no closure inside the graph?
A reader, February  24, 2005 - 9:08 am UTC
 
 
i.e. if the link between node 9 and 5 is removed, and the link between node 6 and 0 is removed.
Would that make difference? It would be a tree in that case. How should we proceed if that is the case? I was thinking maybe to use sys_connect_by_path to pack all sub-hierarchies one after another, and marker in window to be the depth or level. If the level switch from n to 1 that would mean the end of sub-hierarchy. If the level switch from 1 to 2 that is the begining of the hierarchy. And then aggregate over partition inside hierarchy view. Or is there a better approach? 
 
February  24, 2005 - 9:22 am UTC 
 
 
 
Lead/Lag and 0 Sales
Rob H, February  24, 2005 - 1:00 pm UTC
 
 
Thanks for all of the help so far. I have run into an issue where I have Companies and Contacts at that company.  Here are the tables.
create table SALES_TRANS
(
  CUSTOMER_ACCOUNT                   VARCHAR2(8)    ,
  STATION_NUMBER                     VARCHAR2(7)    ,
  PRODUCT_CODE                       VARCHAR2(8)    ,
  QUANTITY                           NUMBER         ,
  DATE_ISSUE                         DATE           ,
  PRICE                              NUMBER         ,
  VALUE                              NUMBER         );
/
Create table COMPANY_CUSTOMER
(
  COMPANY_ID                          NUMBER(9),
  CUSTOMER_ACCOUNT                    VARCHAR2(8));
/
Create table PRODUCT_INFO
(
  PRODUCT_CODE                       VARCHAR2(8) ,
  PRODUCT_GROUP                      VARCHAR2(25),
  PRODUCT_DESC                       VARCHAR2(100)
);
/
Running a query by customer (this select is a view called - SUM_CUST_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
t.CUSTOMER_ACCOUNT,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc)  total_VALUE_pre_y
From SALES_TRANS t
   inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
   inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, t.CUSTOMER_ACCOUNT, p.product_group, FISCAL_YEAR(DATE_ISSUE)
I get
COMPANY_ID,CUSTOMER_ACCOUNT,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","27294370","Product1",2002,1460.08,0
"F0009631","27294370","Product2",2005,0,27926.31
"F0009631","27294370","Product2",2004,27926.31,18086.17
"F0009631","27294370","Product2",2003,18086.17,47597.05
"F0009631","27294370","Product2",2002,47597.05,0
"F0009631","27294370","Product2",2001,0,0
"F0009631","27294370","Product3",2004,64582.6,51041
"F0009631","27294370","Product3",2003,51041,60225
"F0009631","27294370","Product3",2002,60225,43150
"F0009631","27294370","Product3",2001,43150,50491
"F0009631","27294370","Product3",2000,50491,664
"F0009631","27294370","Product3",1999,664,0
"F0009631","27294370","Product4",2005,2119.1,1708.61
"F0009631","27294370","Product4",2004,1708.61,4050.82
"F0009631","27294370","Product4",2003,4050.82,15662.57
"F0009631","27294370","Product4",2002,15662.57,0
"F0009631","27294370","Product5",2005,0,351.64
"F0009631","27294370","Product5",2004,351.64,5873.61
"F0009631","27294370","Product5",2003,5873.61,2548.83
"F0009631","27294370","Product5",2002,2548.83,0
"F0009631","27294370","Product6",2004,17347.84,16781.33
"F0009631","27294370","Product6",2003,16781.33,10575
"F0009631","27294370","Product6",2002,10575,3659.67
"F0009631","27294370","Product6",2001,3659.67,4901.67
"F0009631","27294370","Product6",2000,4901.67,4073.47
"F0009631","27294370","Product6",1999,4073.47,0
"F0009631","27294370","Product7",2004,5377.5,2588
"F0009631","27294370","Product7",2003,2588,245
"F0009631","27294370","Product7",2000,245,0
"F0009631","27340843","Product2",2003,3013.71,0
"F0009631","27340843","Product3",1999,1411,0
"F0009631","27340843","Product5",2003,3254.9,0
Now if I run the same grouping by only company (this select is a view called - SUM_COMPANY_TRANS_PRODUCT_FY_V)
Select
c.COMPANY_ID,
p.product_group,
FISCAL_YEAR(DATE_ISSUE) fiscal_year,
sum(VALUE) total_VALUE_curr_y,
lead(sum(VALUE),1) over (partition by c.COMPANY_ID, p.product_group order by FISCAL_YEAR(DATE_ISSUE) desc)  total_VALUE_pre_y
From SALES_TRANS t
   inner join COMPANY_CUSTOMER c on t.CUSTOMER_ACCOUNT = C.CUSTOMER_ACCOUNT
   inner join PRODUCT_INFO P ON t.PRODUCT_CODE = p.PRODUCT_CODE
group by c.COMPANY_ID, p.product_group, FISCAL_YEAR(DATE_ISSUE)
we get
COMPANY_ID,PRODUCT_GROUP,FISCAL_YEAR,TOTAL_VALUE_CURR_Y,TOTAL_VALUE_PRE_Y
"F0009631","Product1",2002,1460.08,0
"F0009631","Product2",2005,0,27926.31
"F0009631","Product2",2004,27926.31,21099.88
"F0009631","Product2",2003,21099.88,47597.05
"F0009631","Product2",2002,47597.05,0
"F0009631","Product2",2001,0,0
"F0009631","Product3",2004,64582.6,51041
"F0009631","Product3",2003,51041,60225
"F0009631","Product3",2002,60225,43150
"F0009631","Product3",2001,43150,50491
"F0009631","Product3",2000,50491,2075
"F0009631","Product3",1999,2075,0
"F0009631","Product4",2005,2119.1,1708.61
"F0009631","Product4",2004,1708.61,4050.82
"F0009631","Product4",2003,4050.82,15662.57
"F0009631","Product4",2002,15662.57,0
"F0009631","Product5",2005,0,351.64
"F0009631","Product5",2004,351.64,9128.51
"F0009631","Product5",2003,9128.51,2548.83
"F0009631","Product5",2002,2548.83,0
"F0009631","Product6",2004,17347.84,16781.33
"F0009631","Product6",2003,16781.33,10575
"F0009631","Product6",2002,10575,3659.67
"F0009631","Product6",2001,3659.67,4901.67
"F0009631","Product6",2000,4901.67,4073.47
"F0009631","Product6",1999,4073.47,0
"F0009631","Product7",2004,5377.5,2588
"F0009631","Product7",2003,2588,245
"F0009631","Product7",2000,245,0
The problem is that if I 
select * from SUM_CUST_TRANS_PRODUCT_FY_V where fiscal_year=2004 
customer 27340843 will not show up (no 2004 purchases), but that also means that the total_VALUE_pre_y for 2004 will never summarize by customer to the total_VALUE_pre_y for 2004 for the company.  Is there a better way to do this?  The goal is to show current year sales vs previous year sales by company, by customer, and potentially at a summary level higher than company (city).  
I guess the idea would be that I could somehow show, for all customers in a company, all years and all products for which the company has purchases (a cartesian product) for every purchasing year.  This, I think, is difficult with large customer and sales transaction tables.
ie
"F0009631","27340843","Product2",2004,0,3013.71 <--- ***
"F0009631","27340843","Product2",2003,3013.71,0
*** This row doesn't exist in the customer view.  There are no 2004 sales, so it doesn't appear, but we would like to see it so that the previous year shows.
I would love to "attach" some of the transactions if it would help.  Is there a better way? 
 
 
hierarchical cubes + MV?
Rob H, February  25, 2005 - 2:52 pm UTC
 
 
Would hierarchical cubes and an MV be the solution?  It seems like a lot of metadata to create.  We would have to create it for all customers, all years, all product groups.   
 
February  25, 2005 - 6:40 pm UTC 
 
if you have "missing data", the only way i know to "make it up" is an outer join (partitioned outer joins in 10g rock, removing the need to create cartesian products of every dimension first) 
 
 
 
Neelz, February  27, 2005 - 2:32 am UTC
 
 
Dear Sir,
This is with regards to my previous post which is 5th above from this.
<quote>
SELECT 
    SUPPLIER_CD,  ORDERRPT_NO,  ORDER_DATE,  
    SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1, 
    SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2, 
    SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3, 
    SUM(ORDERQUANTITY_RES) ORDER_TOTAL 
FROM 
    T 
GROUP BY 
    SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE 
</quote>
As you mentioned, analytics could not be used, but could you please advise me on my problem?
The query is in fact big; for brevity I only put a few columns. The actual query is
SELECT 
    SUPPLIER_CD,  ORDERRPT_NO,  ORDER_DATE,  
    SUM(DECODE(RTRIM(POSITION_NO),'1',ORDERQUANTITY_RES,0)) Q1, 
    SUM(DECODE(RTRIM(POSITION_NO),'2',ORDERQUANTITY_RES,0)) Q2, 
    SUM(DECODE(RTRIM(POSITION_NO),'3',ORDERQUANTITY_RES,0)) Q3, 
    .....
    .....
    .....
    .....
    .....
    .....
    SUM(DECODE(RTRIM(POSITION_NO),'197',ORDERQUANTITY_RES,0)) Q197, 
    SUM(DECODE(RTRIM(POSITION_NO),'198',ORDERQUANTITY_RES,0)) Q198, 
    SUM(DECODE(RTRIM(POSITION_NO),'199',ORDERQUANTITY_RES,0)) Q199, 
    SUM(DECODE(RTRIM(POSITION_NO),'200',ORDERQUANTITY_RES,0)) Q200, 
    SUM(ORDERQUANTITY_RES) ORDER_TOTAL 
FROM 
    T 
GROUP BY 
    SUPPLIER_CD, ORDERRPT_NO, ORDER_DATE 
As you can see there is a definite pattern in the sum functions. Could you please help me in tuning this query?
Thanks in advance
 
 
February  27, 2005 - 8:32 am UTC 
 
you are doing a pivot -- looks great to me?  It is "classic" 
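For what it's worth, on 11g and later the same classic DECODE pivot can be expressed with the PIVOT clause. A hedged sketch against the same T (the 200-entry IN list is abbreviated, and the ORDER_TOTAL column would still need its own separate SUM):

```sql
-- Sketch only: 11g PIVOT equivalent of the DECODE-based pivot above.
select *
  from (select supplier_cd, orderrpt_no, order_date,
               rtrim(position_no) position_no, orderquantity_res
          from t)
 pivot (sum(orderquantity_res)
        for position_no in ('1' as q1, '2' as q2, '3' as q3
                            /* ... through '200' as q200 */));
```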
 
 
 
Neelz, February  27, 2005 - 9:51 am UTC
 
 
Dear Sir,
I am sorry if you felt like that. It is quite a new world for me here; I started visiting this site 3-4 months back, then realized the enormity of it, and it has become like an addiction. I bought both of your books and started working through them, and am reading the Oracle Concepts Guide.  Every day I try many times to ask a question, but so far no luck -- it might be because of the timezone difference.
Coming back to my question: since it is a huge query and was taking 35 min to execute, after reading through many articles here and in the books I was really confused as to what approach I should take. Still am. Analytic functions (not useful, as you told me), function-based indexes (no, because we have Standard Edition), materialized views (no, because it is an OLTP system), stored SQL functions, the DETERMINISTIC keyword, user-defined aggregates, optimizer hints... at present it is all confusing for me.  
I am working on it with different approaches, and could reduce the execution time to 9.08 minutes. The query was written with an index hint earlier; by removing it, the execution time decreased to 9+ minutes. 
I was thinking whether you could advice on what approach should I take 
Thanks for your valuable time, 
 
 
February  27, 2005 - 10:04 am UTC 
 
if that is taking 35 minutes you either
a) have memory settings like pga_aggregate_target/sort_area_size set way too low
b) you have billions of records that are hundreds of bytes in width
c) really slow disks
d) an overloaded system
I mean -- that query is pretty "simple" full scan, aggregate, nothing to it -- unless it is a gross simplification, it should not take 35 minutes.  Can you trace it with the 10046 level 12 trace and post the tkprof section that is relevant to just this query with the waits and all? 
 
 
 
Neelz, February  27, 2005 - 10:56 am UTC
 
 
Dear Sir,
Thank you for your kind reply,
This report is taken for the development system.
I used alter session set events '10046 trace name context forever, level 12'. The query execution time was 00:08:15.03
select
    supplier_cd, orderrpt_no, order_date,
    sum(decode(rtrim(position_no),'1',orderquantity_res,0)) q1,
    sum(decode(rtrim(position_no),'2',orderquantity_res,0)) q2,
    sum(decode(rtrim(position_no),'3',orderquantity_res,0)) q3,
    sum(decode(rtrim(position_no),'4',orderquantity_res,0)) q4,
    sum(decode(rtrim(position_no),'5',orderquantity_res,0)) q5,
    .....
    .....
    sum(decode(rtrim(position_no),'197',orderquantity_res,0)) q197,
    sum(decode(rtrim(position_no),'198',orderquantity_res,0)) q198,
    sum(decode(rtrim(position_no),'199',orderquantity_res,0)) q199,
    sum(decode(rtrim(position_no),'200',orderquantity_res,0)) q200,
    sum(orderquantity_res) order_total
from
    t
group by
    supplier_cd, orderrpt_no, order_date
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.04          0          0          0           0
Execute      2      0.02       0.04          0          0          0           0
Fetch       15    431.55     488.37      37147      36118         74         211
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       18    431.60     488.46      37147      36118         74         211
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 66  
Rows     Row Source Operation
-------  ---------------------------------------------------
    211  SORT GROUP BY 
4205484   TABLE ACCESS FULL T 
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                      16        0.00          0.00
  SQL*Net more data to client                    30        0.00          0.00
  db file sequential read                         3        0.04          0.05
  db file scattered read                       2280        0.78         30.62
  direct path write                               4        0.00          0.00
  direct path read                              147        0.05          1.45
  SQL*Net message from client                    16      140.57        166.58
  SQL*Net break/reset to client                   2        0.01          0.01
********************************************************************************
Thank you 
 
February  27, 2005 - 11:13 am UTC 
 
that is 8 minutes?
but I see some writes to temp here -- for 211 aggregated rows, perhaps your sort/pga is set small
Also, why do you need to rtrim() 4,205,484 rows?  (and why is something called position NUMBER in a string?) is that rtrim there "just in case" or is it really needed?  why would it have trailing blanks and is that not a data integrity issue that needs to be fixed?
(but this is an 8 minute query, not a 35 minute query, if it takes longer on production -- it'll be because it is waiting for something -- like IO...) 
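If the rtrim() turns out to be "just in case", one hedged way to enforce the data-integrity fix hinted at above -- a sketch only, against the same T, to be applied only after verifying the existing data:

```sql
-- Hypothetical: clean any existing rows, then forbid trailing blanks
-- at the source so the per-row rtrim() over millions of rows goes away.
update t
   set position_no = rtrim(position_no)
 where position_no <> rtrim(position_no);

alter table t add constraint t_position_no_trimmed
  check (position_no = rtrim(position_no));
```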
 
 
 
Neelz, February  27, 2005 - 11:30 am UTC
 
 
Dear Sir,
This is a 3rd party application and the query was written with an index hint earlier. After removing the hint query execution time reduced to 8 min. Regarding the rtrim I have to check with the team if it is really needed. I will try the trace on production tomorrow.
And at last I could see the link for "Submit a New Question"!, I think I should try around 1.00 AM
Thanking You a lot 
 
February  27, 2005 - 11:31 am UTC 
 
depends on your time zone, rarely am I up at 1am east coast (gmt-5) time doing this stuff! 
 
 
 
Miki, March     02, 2005 - 8:51 am UTC
 
 
Tom,
I need to produce a moving average which has an even window size. If I want a 28 sized window, I need to look backward 14 but I need the first value of the window to be divided by 2 and I need to look forward 14 and the last value of the window to be divided by 2 also.
(a1/2+a2+...+a28+a29/2)/28
How could I accomplish it with the function: 
avg() over(...)?
Thanks in advance 
 
March     02, 2005 - 10:03 am UTC 
 
this is the first thought that popped into my head:
a) get the sum(val) over 13 before and 13 after (27 rows possible).
b) get the lag(val,14)/2 and lead(val,14)/2
c) add those three numbers
d) divide by the count of non-null VALs observed (count(val) 13 before/after + 1 if lag is not null + 1 if lead is not null)
ops$tkyte@ORA9IR2> create table t
  2  as
  3  select rownum id, object_id val
  4    from all_objects
  5   where rownum <= 30;
 
Table created.
 
<b>so, this was my "debug" query, just to see the data:</b>
ops$tkyte@ORA9IR2> select id,
  2         sum(val) over 
                 (order by id rows between 13 preceding and 13 following) sum,
  3         count(val) over 
                 (order by id rows between 13 preceding and 13 following)+
  4             decode(lag(val,14) over (order by id),null,0,1)+
  5             decode(lead(val,14) over (order by id),null,0,1) cnt,
  6             lag(id,14) over (order by id) lagid,
  7             lag(val,14) over (order by id) lagval,
  8             lead(id,14) over (order by id) leadid,
  9             lead(val,14) over (order by id) leadval
 10    from t
 11   order by id;
 
        ID        SUM        CNT      LAGID     LAGVAL     LEADID    LEADVAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
         1     218472         15                               15       6399
         2     224871         16                               16      19361
         3     244232         17                               17      23637
         4     267869         18                               18      14871
         5     282740         19                               19      20668
         6     303408         20                               20      18961
         7     322369         21                               21      15767
         8     338136         22                               22      20654
         9     358790         23                               23       7065
        10     365855         24                               24      17487
        11     383342         25                               25      11077
        12     394419         26                               26      20772
        13     415191         27                               27      15505
        14     430696         28                               28      12849
        15     425648         29          1      17897         29      23195
        16     441314         29          2       7529         30      18523
        17     436505         28          3      23332
        18     422306         27          4      14199
        19     399409         26          5      22897
        20     389266         25          6      10143
        21     365728         24          7      23538
        22     342135         23          8      23593
        23     332316         22          9       9819
        24     320581         21         10      11735
        25     303084         20         11      17497
        26     295369         19         12       7715
        27     276010         18         13      19359
        28     266791         17         14       9219
        29     260392         16         15       6399
        30     241031         15         16      19361
 
30 rows selected.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select id,
  2         (sum(val) over 
                (order by id rows between 13 preceding and 13 following)+
  3              nvl(lag(val,14) over (order by id)/2,0)+
  4              nvl(lead(val,14) over (order by id)/2,0))/
  5             nullif(
  6           count(val) over 
                  (order by id rows between 13 preceding and 13 following)+
  7               decode(lag(val,14) over (order by id),null,0,1)+
  8               decode(lead(val,14) over (order by id),null,0,1)
  9                  ,0) avg
 10    from t
 11   order by id;
 
        ID        AVG
---------- ----------
         1    14778.1
         2 14659.4688
         3 15061.7941
         4 15294.6944
         5 15424.9474
         6  15644.425
         7 15726.3095
         8 15839.2273
         9 15753.1522
        10 15608.2708
        11   15555.22
        12 15569.4231
        13 15664.5741
        14 15611.4464
        15      15386
        16 15666.8966
        17 16006.1071
        18 15903.9074
        19 15802.2115
        20    15773.5
        21 15729.0417
        22 15388.3261
        23 15328.4318
        24 15545.1667
        25  15591.625
        26 15748.7632
        27 15871.6389
        28 15964.7353
        29 16474.4688
        30    16714.1
 
30 rows selected.
 
ops$tkyte@ORA9IR2>
<b>I did not do a detailed check of the results -- but that should get you going (remember -- there are 29 rows -- 14+1+14!!! and beware NULLs)</b> 
 
 
 
 
Miki, March     02, 2005 - 10:54 am UTC
 
 
Tom,
Your answer is excellent. That is - almost - what I needed.
If my window size is odd I can simply use the avg() over () function. I am looking for a solution where I can also use avg() over () instead of sum() over ()/count().
Is it possible?
Thank you!  
 
March     02, 2005 - 11:15 am UTC 
 
if you want to do things to row 1 and row 29 in the window "special" like this -- this was the only thing I thought of. 
 
 
 
Miki, March     02, 2005 - 11:18 am UTC
 
 
Thank you! I will use your recommended code. 
 
 
consecutive days...  8.1.7
Dean, March     09, 2005 - 1:07 pm UTC
 
 
create table day_cd
(dt  date
,cd  varchar2(2))
/
insert into day_cd values (to_date('08-MAR-05','DD-MON-RR'), 'BD');
insert into day_cd values (to_date('09-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('10-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('11-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('12-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('13-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('14-MAR-05','DD-MON-RR'), 'CD');
insert into day_cd values (to_date('15-MAR-05','DD-MON-RR'), 'CD');
insert into day_cd values (to_date('16-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('17-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('18-MAR-05','DD-MON-RR'), 'AD');
insert into day_cd values (to_date('19-MAR-05','DD-MON-RR'), 'CD')
/
SELECT * FROM DAY_CD;
DT        CD
--------- --
08-MAR-05 BD
09-MAR-05 AD
10-MAR-05 AD
11-MAR-05 AD
12-MAR-05 AD
13-MAR-05 AD
14-MAR-05 CD
15-MAR-05 CD
16-MAR-05 AD
17-MAR-05 AD
18-MAR-05 AD
19-MAR-05 CD
I'd like to count the occurrences of each code, treating a run of consecutive days with the same code as one occurrence.
So that the output would be:
CD      OCCURRENCES
--    -----------
AD    2
BD    1
CD    2
 
 
 
nevermind...
Dean, March     09, 2005 - 1:59 pm UTC
 
 
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
 
 
 
we were responding at the same time...
Dean, March     09, 2005 - 2:01 pm UTC
 
 
:)
select cd, count(*)
from
(
select cd, dt, case when (lead(dt) over (partition by cd order by dt) - dt) = 1 then 1 else 0 end day
from day_cd
)
where day = 0
group by cd
CD   COUNT(*)
-- ----------
AD          2
BD          1
CD          2
Thanks for all of your help... 
 
 
max() over() till not the current row
Miki, March     10, 2005 - 4:12 am UTC
 
 
Tom,
I have the following input
DATUM    T    COL1    COL2    COL3    COL4
2005.02.19 9:29    T    1    0    0    0
2005.02.20 9:29        0    0    0    0
2005.02.21 9:29        0    0    0    0
2005.02.22 9:29    T    1    0    0    0
2005.02.23 9:29        0    0    0    0
2005.02.24 9:29        0    0    0    0
2005.02.25 9:29        0    0    0    0
2005.02.26 9:29        0    0    0    0
2005.02.27 9:29    T    0    1    0    0
2005.02.28 9:29        0    0    0    0
2005.03.01 9:29        0    0    0    0
2005.03.02 9:29    T    1    1    0    0
2005.03.03 9:29        0    0    0    0
2005.03.04 9:29    T    1    1    0    0
2005.03.05 9:29        0    0    0    0
2005.03.06 9:29    T    1    0    0    0
2005.03.07 9:29        0    0    0    0
2005.03.08 9:29        0    0    0    0
2005.03.09 9:29        0    0    0    0
When the value of column T is 'T', a rule determines which columns (col1, ..., col4) get 1 or 0.
Unfortunately, with the rule more than one column can get the value 1. So, if col1+...+col4 > 1 then I would like colx to take the value of colx from the previous row where t = 'T' and col1+...+col4 = 1
So, the output is the following
DATUM    T    COL1    COL2    COL3    COL4
2005.02.19 9:29    T    1    0    0    0
2005.02.20 9:29        0    0    0    0
2005.02.21 9:29        0    0    0    0
2005.02.22 9:29    T    1    0    0    0
2005.02.23 9:29        0    0    0    0
2005.02.24 9:29        0    0    0    0
2005.02.25 9:29        0    0    0    0
2005.02.26 9:29        0    0    0    0
2005.02.27 9:29    T    0    1    0    0
2005.02.28 9:29        0    0    0    0
2005.03.01 9:29        0    0    0    0
2005.03.02 9:29    T    0    1    0    0
2005.03.03 9:29        0    0    0    0
2005.03.04 9:29    T    0    1    0    0
2005.03.05 9:29        0    0    0    0
2005.03.06 9:29    T    1    0    0    0
2005.03.07 9:29        0    0    0    0
2005.03.08 9:29        0    0    0    0
2005.03.09 9:29        0    0    0    0
I tried to use a max() over () function to replace the wrong value, but it doesn't work because I can't see the max datum up to the previous record where t = 'T' and col1+...+col4 = 1
...
case when t = T and col1+...+col4 > 1 and
greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000), 
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000)  then 1 else 0 end col1,
case when t = T and col1+...+col4 > 1 and
Greatest(nvl(max(decode(col1,1,datum)) over(order by datum), sysdate-10000), 
nvl(max(decode(col2,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col3,1,datum)) over(order by datum), sysdate-10000),
nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)
) = nvl(max(decode(col4,1,datum)) over(order by datum), sysdate-10000)  then 1 else 0 end col4
Could you give me a solution to my problem?
Thanks in advance
miki
 
 
 
Miki, March     10, 2005 - 8:09 am UTC
 
 
Here is my table populated with data:
create table T
(
  DATUM DATE,
  T     VARCHAR2(1),
  COL1  NUMBER,
  COL2  NUMBER,
  COL3  NUMBER,
  COL4  NUMBER
);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('16-01-2005 13:17:46', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-01-2005 17:23:13', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-03-2005 02:59:17', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('11-12-2004 21:59:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('10-01-2005 12:00:22', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('24-02-2005 02:36:51', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('08-12-2004 11:21:15', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('07-01-2005 20:52:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-02-2005 23:44:33', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-03-2005 16:25:12', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 19:02:28', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('22-01-2005 11:21:41', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-01-2005 15:32:18', 'dd-mm-yyyy hh24:mi:ss'), 'T', 1, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('19-12-2004 03:07:10', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('21-02-2005 16:25:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('01-01-2005 01:02:39', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('15-12-2004 05:49:26', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('04-02-2005 14:35:34', 'dd-mm-yyyy hh24:mi:ss'), 'T', 0, 1, 0, 0);
insert into T (DATUM, T, COL1, COL2, COL3, COL4)
values (to_date('02-12-2004 15:01:42', 'dd-mm-yyyy hh24:mi:ss'), null, 0, 0, 0, 0);
commit;
select t.* from t t
order by 1;
       DATUM    T    COL1    COL2    COL3    COL4
1    2004.12.02. 15:01:42        0    0    0    0
2    2004.12.08. 11:21:15        0    0    0    0
3    2004.12.11. 21:59:18    T    1    0    0    0
4    2004.12.15. 5:49:26        0    0    0    0
5    2004.12.19. 3:07:10        0    0    0    0
6    2005.01.01. 1:02:39    T    0    1    0    0
7    2005.01.01. 19:02:28        0    0    0    0
8    2005.01.04. 17:23:13    T    1    1    0    0
9    2005.01.07. 20:52:26        0    0    0    0
10    2005.01.10. 12:00:22        0    0    0    0
11    2005.01.16. 13:17:46        0    0    0    0
12    2005.01.19. 15:32:18    T    1    1    0    0
13    2005.01.22. 11:21:41        0    0    0    0
14    2005.02.02. 23:44:33        0    0    0    0
15    2005.02.04. 14:35:34    T    0    1    0    0
16    2005.02.21. 16:25:42        0    0    0    0
17    2005.02.24. 2:36:51        0    0    0    0
18    2005.03.01. 2:59:17        0    0    0    0
19    2005.03.04. 16:25:12    T    1    0    0    0
Lines 8 and 12 have more than one column containing 1.
So, I need to "copy" every colx from line 6, because it is the most recent earlier line (ordered by datum) that has the value 'T' in column T and exactly one colx with the value 1.
Thank you 
 
March     10, 2005 - 8:28 am UTC 
 
ops$tkyte@ORA9IR2> select t, col1, col2, col3, col4,
  2         substr(max(data) over (order by datum),11,1) c1,
  3         substr(max(data) over (order by datum),12,1) c2,
  4         substr(max(data) over (order by datum),13,1) c3,
  5         substr(max(data) over (order by datum),14,1) c4,
  6             case when col1+col2+col3+col4 > 1 then '<---' end fix
  7    from (
  8  select t.*,
  9         case when t = 'T' and col1+col2+col3+col4 = 1
 10                  then to_char(row_number() over (order by datum) ,'fm0000000000') || col1 || col2 || col3 || col4
 11                  end data
 12    from t
 13         )
 14   order by datum;
T       COL1       COL2       COL3       COL4 C C C C FIX
- ---------- ---------- ---------- ---------- - - - - ----
           0          0          0          0
           0          0          0          0
T          1          0          0          0 1 0 0 0
           0          0          0          0 1 0 0 0
           0          0          0          0 1 0 0 0
T          0          1          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          1          0          0 0 1 0 0 <---
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          1          0          0 0 1 0 0 <---
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          0          1          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
           0          0          0          0 0 1 0 0
T          1          0          0          0 1 0 0 0
19 rows selected.
 
 
 
 
 
Great!
Miki, March     10, 2005 - 9:38 am UTC
 
 
Great solution!
Thank you, it is just what I expected. 
 
 
book on Analytics
A reader, March     10, 2005 - 11:07 am UTC
 
 
Hi Tom,
It is high time that you publish the book on 'Analytic functions' - there is a lot one can do with these , but very few people are fully aware of it
When is this book due ?
thanks 
 
 
A variation of Dean's question ...
Julius, March     10, 2005 - 8:13 pm UTC
 
 
create table tt (
 did     number,
 dd     date,
 status number);
 
alter table tt add constraint tt_pk primary key (did,dd) using index;
insert into tt values (-111,to_date('03/03/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/04/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/05/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/06/2005','mm/dd/yyyy'),11);
insert into tt values (-111,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-111,to_date('03/08/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/09/2005','mm/dd/yyyy'),22);
insert into tt values (-111,to_date('03/10/2005','mm/dd/yyyy'),22);
insert into tt values (-222,to_date('03/04/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/05/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/06/2005','mm/dd/yyyy'),77);
insert into tt values (-222,to_date('03/07/2005','mm/dd/yyyy'),33);
insert into tt values (-222,to_date('03/08/2005','mm/dd/yyyy'),55);
insert into tt values (-222,to_date('03/09/2005','mm/dd/yyyy'),11);
I need a query which would return the following result set, where days_in_status is a count of the consecutive days the did has been in its current status (dd values are days only). I've been trying to use analytics but without much success so far. Any ideas? Thanks!!
 DID          DD STATUS DAYS_IN_STATUS
----- ---------- ------ --------------
-111  03/10/2005     22              3
-222  03/09/2005     11              1
 
 
March     10, 2005 - 9:04 pm UTC 
 
ops$tkyte@ORA9IR2> select did, max(dd), count(*)
  2    from (
  3  select x.*, max(grp) over (partition by did order by dd desc) maxgrp
  4    from (
  5  select tt.*,
  6         case when lag(status) over (partition by did order by dd desc) <> status
  7                  then 1
  8                  end grp
  9    from tt
 10         ) x
 11             )
 12   where maxgrp is null
 13   group by did
 14  /
       DID MAX(DD)     COUNT(*)
---------- --------- ----------
      -222 09-MAR-05          1
      -111 10-MAR-05          3
is one approach... 
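A common alternative sketch for this kind of "consecutive rows" grouping is the row_number-difference (a.k.a. Tabibitosan) technique -- shown here untested, and assuming, as in the sample data, one row per did per day:

```sql
-- Within an unbroken run of the same status, both row_numbers grow by
-- 1 per row, so their difference (grp) is constant for the whole run.
select did, status,
       min(dd) run_start, max(dd) run_end, count(*) days_in_status
  from (select tt.*,
               row_number() over (partition by did order by dd)
             - row_number() over (partition by did, status order by dd) grp
          from tt)
 group by did, status, grp
 order by did, run_start;
```

Keeping only the row with the latest run_end per did then reproduces the two-row result above.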
 
 
 
 
SQL Query
a reader, March     15, 2005 - 6:50 pm UTC
 
 
Hi Tom,
create table a
(accno         number(8)     not null,
 amount_paid    number(7)   not null)
/
insert into a values (1, 1000);
insert into a values (2, 1500);
insert into a values (3, 2000);
insert into a values (4, 3000);
insert into a values (5, 3000);
Could you please help me in writing the following query without using rownum and analytics. 
list the accno corresponding to the maximum amount paid. If more than one account has the same maximum amount paid, list any one of them.
I am expecting the result to be accno 4 or 5
Thanks for your time.
Regards 
 
March     15, 2005 - 9:27 pm UTC 
 
sounds like homework.
I give a similar quiz question in interviews (find the most frequently occurring month)
tkyte@ORA8IW> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno
  2    from a;
ACCNO
-----------------------------------------
5
is one possible approach (assuming that amount_paid is positive) 
tkyte@ORA8IW> select max(accno)
  2    from a
  3   where amount_paid = ( select max(amount_paid) from a );
MAX(ACCNO)
----------
         5
is another (that would work well if amount_paid,accno were indexed....)
 
 
 
 
negatives to worry about ...
Gabe, March     15, 2005 - 9:56 pm UTC
 
 
SQL> select * from a;
     ACCNO AMOUNT_PAID
---------- -----------
         1          -2
         2          -1
SQL> select substr( max( to_char(amount_paid,'fm0000000') || accno ), 8 ) accno from a;
ACCNO
-----------------------------------------
21
 
 
 
March     15, 2005 - 10:04 pm UTC 
 
....
(assuming that amount_paid is positive) 
.......
that was caveated and why I gave two answers ;)
 
 
 
 
cannot read ...
Gabe, March     15, 2005 - 10:57 pm UTC
 
 
Sorry about that ... missed it completely.
 
 
 
following an idea of mikito ...
Matthias Rogel, March     16, 2005 - 8:11 am UTC
 
 
  1  select accno
  2  from a
  3  start with amount_paid = (select max(amount_paid) from a)
  4         and accno = (select min(accno) from a where amount_paid = (select max(amount_paid) from a))
  5* connect by prior null is not null
SQL> /
     ACCNO
----------
         4
would be a third solution 
 
 
March     16, 2005 - 8:38 am UTC 
 
there are many solutions -- this one would win a Rube Goldberg award though :) 
 
 
 
another query using analytics 
A reader, March     29, 2005 - 11:32 am UTC
 
 
I've got 2 tables, t1 and t2.
t1(1 column):
t1.x(int ,primary key)
1
2
3
and t2(3 columns,index on t2.y):
t2.x(int) t2.y(int)  t2.z(int)
1         7000        1      
1         7000        6
1         8000        8 
2         7000        1
2         7000        5
3         7000        3 
3         8000        1
3         8000        7
3         9000        5
I would like to have a report like this:
t1.x     t2.y   count  min max
1        7000   2       1   8
1        8000   1       1   8
2        7000   2       1   5
3        7000   1       1   7
3        8000   2       1   7
3        9000   1       1   7
What I came up with is:
select distinct t1.x,t2.y,
count(*) over (partition by t1.x,t2.y) as count,
min(t2.z) over (partition by t1.x) as min,
max(t2.z) over (partition by t1.x) as max
from t1, t2
where t1.x=t2.x;
I was wondering if this query is good enough, or if there's a better way(in terms of performance) to write this query. I'm new to analytics, and your help would be very much appreciated.
 
 
March     29, 2005 - 12:25 pm UTC 
 
we could probably do this in analytics without the distinct, something like
select t1.x, t2.y, t2.cnt, 
       min(t2.z) over (partition by t1.x), 
       max(t2.z) over (partition by t1.x)
  from t1, (select x, y, count(*) cnt from t2 group by x, y ) t2
 where t1.x = t2.x;
and maybe even push the min/max() down into the inline view.
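That pushed-down variant might look like this (a sketch, not run against real data -- it relies on the min of the per-group minimums being the overall minimum per x, and likewise for max):

```sql
-- Aggregate once per (x, y) in the inline view, then take the analytic
-- min of the group minimums (and max of maximums) per x -- no DISTINCT.
select t1.x, t2.y, t2.cnt,
       min(t2.mn) over (partition by t1.x) min_z,
       max(t2.mx) over (partition by t1.x) max_z
  from t1,
       (select x, y, count(*) cnt, min(z) mn, max(z) mx
          from t2
         group by x, y) t2
 where t1.x = t2.x;
```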
 
 
 
 
Analytics problem
Mark, April     08, 2005 - 12:19 pm UTC
 
 
Hi Tom,
I have a problem whose solution I'm pretty sure involves analytic functions. I've been struggling with it for some time, but analytics are new to me. I want to go from this:
/* create and inserts */
create table test.test (ordernum varchar2(10), 
                         tasktype char(3),
                         feetype varchar2(20),
                         amount number(10,2));
insert into test.test(ordernum, tasktype, feetype, amount)
               values('123123', 'DOC', 'Product Fee', 15);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('123123', 'DOC', 'Copy Fee', 1);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'Setup Fee', 23);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'File Review Fee', 27);
insert into test.test(ordernum, tasktype, feetype, amount)
               values('34864', 'COS', 'Statutory Fee', 22);               
insert into test.test(ordernum, tasktype, feetype, amount)
               values('56432', 'DOC', 'Product Fee', 80);    
insert into test.test(ordernum, tasktype, feetype, amount)
               values('56432', 'DOC', 'Prepayment', -16);
SQL> select tasktype, ordernum, feetype, amount from test.test;
TAS ORDERNUM   FEETYPE                  AMOUNT
--- ---------- -------------------- ----------
DOC 123123     Product Fee                  15
DOC 123123     Copy Fee                      1
COS 34864      Setup Fee                    23
COS 34864      File Review Fee              27
COS 34864      Statutory Fee                22
DOC 56432      Product Fee                  80
DOC 56432      Prepayment                  -16
...to this:
TAS ORDERNUM FEE1        FEE2            FEE3          FEE4     FEE5
--- -------- ----------- --------        ----------    -------- --------
DOC          Product Fee Copy Fee        Prepayment
DOC 123123   15          1
DOC 56432    80                          -16
COS          Setup Fee   File Review Fee Statutory Fee
COS 34864      23          27              23
Allow me to explain. For each tasktype I would like a heading row, which, going across, contains all the feetypes found in test.test for that particular tasktype. There should never be more than five feetypes.
For each ordernum under each tasktype, I would like to have the amounts going across, underneath the appropriate feetypes. 
I'm pretty sure my solution involves the lag and/or lead functions, partitioning over tasktype. I particularly seem to have trouble wrapping my brain around the problem of how to get a distinct ordernum while keeping intact the data in other columns (where ordernums duplicate).
I hope my explanation is clear enough.
Hope you can help. Thanks in advance. I will continue working on this. 
 
 
April     08, 2005 - 12:51 pm UTC 
 
ops$tkyte@ORA9IR2> with columns
  2  as
  3  (select tasktype, feetype, row_number() over (partition by tasktype order by feetype) rn
  4     from (select distinct tasktype, feetype from test )
  5  )
  6  select a.tasktype, a.ordernum,
  7         to_char( max( decode( rn, 1, amount ) )) fee1,
  8         to_char( max( decode( rn, 2, amount ) )) fee2,
  9         to_char( max( decode( rn, 3, amount ) )) fee3,
 10         to_char( max( decode( rn, 4, amount ) )) fee4,
 11         to_char( max( decode( rn, 5, amount ) )) fee5
 12    from test a, columns b
 13   where a.tasktype = b.tasktype
 14     and a.feetype = b.feetype
 15   group by a.tasktype, a.ordernum
 16   union all
 17  select tasktype, null,
 18         ( max( decode( rn, 1, feetype ) )) fee1,
 19         ( max( decode( rn, 2, feetype ) )) fee2,
 20         ( max( decode( rn, 3, feetype ) )) fee3,
 21         ( max( decode( rn, 4, feetype ) )) fee4,
 22         ( max( decode( rn, 5, feetype ) )) fee5
 23    from columns
 24   group by tasktype
 25   order by 1 desc, 2 nulls first
 26  /
 
TAS ORDERNUM   FEE1            FEE2            FEE3            FEE4 FEE5
--- ---------- --------------- --------------- --------------- ---- ----
DOC            Copy Fee        Prepayment      Product Fee
DOC 123123     1                               15
DOC 56432                      -16             80
COS            File Review Fee Setup Fee       Statutory Fee
COS 34864      27              23              23
of course. :)
(suggestion, break it out, run each of the bits to see what they do.  basically, columns is a view used to "pivot" on -- we needed to assign a column number to each FEETYPE by TASKTYPE.  That is all that view does.
Then, we join that to test and "pivot" naturally.
Union all in the pivot of the column names....
and sort) 
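To make the break-out concrete, running the WITH view by itself against the test data above should yield one column number per feetype within each tasktype -- a sketch (output reformatted; the rn values are what the decode() calls key on in both halves of the UNION ALL):

```sql
-- the "columns" inline view in isolation: number each distinct
-- feetype within its tasktype, alphabetically
select tasktype, feetype,
       row_number() over (partition by tasktype order by feetype) rn
  from (select distinct tasktype, feetype from test);

-- TAS FEETYPE           RN
-- COS File Review Fee    1
-- COS Setup Fee          2
-- COS Statutory Fee      3
-- DOC Copy Fee           1
-- DOC Prepayment         2
-- DOC Product Fee        3
```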
 
 
 
 
RE: Analytics problem
Mark, April     08, 2005 - 1:27 pm UTC
 
 
Excellent! I'll definitely break it down to figure out exactly what you did. Thank you very much.  
 
 
Re: another query using analytics
Gabe, April     08, 2005 - 3:27 pm UTC
 
 
You weren't given any resources, so I understand your solution was in fact merely an [untested] suggestion.
create table t1 ( x int primary key );
insert into t1 values (1);
insert into t1 values (2);
insert into t1 values (3);
create table t2 ( x int not null references t1(x), y int not null, z int not null );
insert into t2 values ( 1,7000,1);
insert into t2 values ( 1,7000,6);
insert into t2 values ( 1,8000,8);
insert into t2 values ( 2,7000,1);
insert into t2 values ( 2,7000,5);
insert into t2 values ( 3,7000,3);
insert into t2 values ( 3,8000,1);
insert into t2 values ( 3,8000,7);
insert into t2 values ( 3,9000,5);
My solution (avoiding the distinct) is not necessarily better than the one presented by "A reader", but here it goes:
flip@FLOP> select x, y, c
  2        ,min(f) over (partition by x) f
  3        ,max(l) over (partition by x) l
  4  from (
  5  select t2.x, t2.y, count(*) c
  6        ,min(t2.z) keep (dense_rank first order by t2.z) f
  7        ,max(t2.z) keep (dense_rank last  order by t2.z) l
  8  from  t1, t2
  9  where t1.x = t2.x
 10  group by t2.x, t2.y
 11  ) t
 12  ;
         X          Y          C          F          L
---------- ---------- ---------- ---------- ----------
         1       7000          2          1          8
         1       8000          1          1          8
         2       7000          2          1          5
         3       7000          1          1          7
         3       8000          2          1          7
         3       9000          1          1          7
Cheers. 
 
April     08, 2005 - 3:34 pm UTC 
 
without create tables and inserts, I guess :)
takes too much time to create the setup for every case (wish people would read the page that they have to page down through to put something up here...) 
 
 
 
I'm confused
Mikito, April     18, 2005 - 9:55 pm UTC
 
 
Given that
select distinct deptno
from emp
is essentially
select deptno
from emp
group by deptno
how should the distinct query be rewritten when analytic columns are involved? Neither
SELECT deptno, count(1),
       min(sal) over (partition by deptno) f
from emp
group by deptno,min(sal) over (partition by deptno);
nor
SELECT deptno, count(1),
       min(sal) over (partition by deptno) f
from emp
group by deptno,f;
seems to be valid syntax. 
(To repeat: "Does analytics scale?")
 
 
April     19, 2005 - 7:22 am UTC 
 
why would you use analytics that way?  
Tell us the question, we'll tell you the method.
select deptno, count(*)  /* because count(1) is counter-intuitive */, 
       min(sal) over (partition by deptno) f
  from emp
 group by deptno, min(sal) over (partition by deptno)
would not make sense.  You are saying "get all deptnos, by deptno find the minimum salary and associate that number with each one, then aggregate by deptno/min salary to count records"
You should just ask:
find the minimum salary and count of records by deptno.
select deptno, count(*), min(sal) from emp group by deptno;
is what you were looking for.  analytics scale up wonderfully.  Say the question was instead:
you have a table full of records that have a customer_id and a last_sale_date, I would like you to retrieve the last record for each customer.
select * 
  from ( select cust.*, max(sale_date) over (partition by cust_id) lsd
           from cust )
 where sale_date = lsd;
versus
select * 
  from cust
 where sale_date = 
    (select max(sale_date) from cust c2 where cust_id = cust.cust_id )
/
or
select * 
  from cust, (select cust_id, max(sale_date) lsd from cust group by cust_id)x
 where cust.cust_id = x.cust_id
   and cust.sale_date = x.lsd
/
for example 
 
 
 
Tricky SQL?
A reader, April     19, 2005 - 10:29 am UTC
 
 
CREATE TABLE master
(
    m_no         INTEGER       PRIMARY KEY,
    m_name        VARCHAR2(255) NOT NULL UNIQUE
);
create table detail
(
  d_pk integer primary key,
  d_no integer not null references master(m_no),
  d_date date,
  d_data varchar2(255)
);
Given a d_pk, how can I get the second-to-last (ordered by d_date) record from M for that M_NAME? In other words, for a given m_name, there are multiple records in "detail" with different dates. Given one of those records, I want the prior record in "detail" (there might not be any)
I tried to design a simple master detail table, but maybe I over-normalized?
Thanks 
 
April     19, 2005 - 12:00 pm UTC 
 
are you saying "i have a detail record, I want the detail record that came 'in front' of this one"?  
that is what I sort of hear, but the second to last is confusing me.
select * 
  from (
select ...., lead(d_pk) over (order by d_date) next_pk
  from master, detail
 where master.m_no = (select d_no from detail where d_pk = :x)
   and master.m_no = detail.d_no
       )
 where next_pk = :x;
I think that does that.  You get the master/detail for that d_pk (inline view)
Use lead to assign to each record the "next pk" after sorting by d_date
Keep the record whose 'next' record's primary key was the one you wanted.
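An equivalent phrasing (an untested sketch against the same tables) turns it around with lag(), so each detail row carries its predecessor's key and we can look up the row for :x directly:

```sql
select *
  from ( select detail.*,
                lag(d_pk) over (order by d_date) prior_pk
           from detail
          where d_no = (select d_no from detail where d_pk = :x) )
 where d_pk = :x;
-- prior_pk is null when :x is the earliest detail row for that master
```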
 
 
 
 
a little inconsistency
mikito, April     19, 2005 - 1:24 pm UTC
 
 
I meant inconsistency, not scalability. Why "distinct"
SELECT distinct deptno, 
       min(sal) over (partition by deptno) f
from emp
is allowed, whereas "group by" isn't? If someone has trouble understanding what analytics with "group by" means, the same should apply to analytics with "distinct" as well.
 
 
April     19, 2005 - 1:26 pm UTC 
 
because group by is not distinct, they are frankly very different concepts.
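One way to see the difference in practice: group by is evaluated before analytic functions, so an analytic in a grouped query could only reference group by expressions or aggregates. To mix the two you aggregate first and apply the analytic to the grouped rows -- an untested sketch:

```sql
-- aggregate per deptno first, then run the analytic
-- over the (already grouped) result set
select deptno, cnt, min_sal,
       max(min_sal) over () highest_min_sal
  from (select deptno, count(*) cnt, min(sal) min_sal
          from emp
         group by deptno);
```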
 
 
 
 
detail and summary in one sql statement
A reader, April     27, 2005 - 3:02 pm UTC
 
 
hi tom,
quick shot. i have to process many detail rows (columns a - f) and one summary record (containing sum (column c) + count (*) over all recs + some literal placeholders) within one sql statement. is there another way than using a classical UNION ALL select? any new way with analytical functions? 
 
April     27, 2005 - 3:22 pm UTC 
 
need small example, did not follow your example as stated. 
 
 
 
detail and summary in one sql statement  
A reader, April     28, 2005 - 10:08 am UTC
 
 
hi tom,
here is the small and simple test case to show what i mean.
SQL> create table t1 (col1 number primary key, col2 number, col3 number);
Table created.
SQL> create table t2 (col0 number primary key, col1 number references t1 (col1), col2 number, col3 number, col4 number);
Table created.
SQL> create index t2_col1 on t2 (col1);
Index created.
SQL> insert into t1 values (1, 1, 1);
1 row created.
SQL> insert into t2 values (1, 1, 1, 1, 1);
1 row created.
SQL> insert into t2 values (2, 1, 2, 2, 2);
1 row created.
SQL> insert into t2 values (3, 1, 3, 3, 3);
1 row created.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> analyze table t2 compute statistics;
Table analyzed.
SQL> select 0 rowtype, t1.col1 display1, t1.col2 display2, t2.col3 display3, t2.col4 display4
  2  from   t1 join t2 on (t1.col1 = t2.col1)
  3  where  t1.col1 = 1
  4  UNION ALL
  5  select 1 rowtype, t1.col1, count (*), null, sum (t2.col4)
  6  from   t1 join t2 on (t1.col1 = t2.col1)
  7  where  t1.col1 = 1
  8  group  by t1.col1
  9* order  by rowtype
   ROWTYPE   DISPLAY1   DISPLAY2   DISPLAY3   DISPLAY4
---------- ---------- ---------- ---------- ----------
         0          1          1          1          1
         0          1          1          2          2
         0          1          1          3          3
         1          1          3                     6
that is creating detail + summary record within one sql statement! 
 
 
April     28, 2005 - 10:18 am UTC 
 
ops$tkyte@ORA10G> select grouping_id(t1.col2) rowtype,
  2         t1.col1 d1,
  3             decode( grouping_id(t1.col2), 0, t1.col2, count(*) ) d2,
  4             decode( grouping_id(t1.col2), 0, t2.col3, null ) d3,
  5             decode( grouping_id(t1.col2), 0, t2.col4, sum(t2.col4) ) d4
  6    from t1, t2
  7   where t1.col1 = t2.col1
  8   group by grouping sets((t1.col1),(t1.col1,t1.col2,t2.col3,t2.col4))
  9  /
 
   ROWTYPE         D1         D2         D3         D4
---------- ---------- ---------- ---------- ----------
         0          1          1          1          1
         0          1          1          2          2
         0          1          1          3          3
         1          1          3                     6
 
 
 
 
 
detail and summary in one sql statement   
A reader, April     29, 2005 - 10:05 am UTC
 
 
hi tom,
thanks for your help. that's exactly what i need. analytics rock, analytics roll as you said. :)
unfortunately it is hard to get. :(
i looked in the documentation but cannot understand the grouping_id values in the example. please could you explain? what is "2" or "3" in the grouping column?
Examples
The following example shows how to extract grouping IDs from a query of the sample table sh.sales:
SELECT channel_id, promo_id, sum(amount_sold) s_sales,
   GROUPING(channel_id) gc,
   GROUPING(promo_id) gp,
   GROUPING_ID(channel_id, promo_id) gcp,
   GROUPING_ID(promo_id, channel_id) gpc
   FROM sales
   WHERE promo_id > 496
   GROUP BY CUBE(channel_id, promo_id);
 
C   PROMO_ID    S_SALES         GC         GP        GCP        GPC
- ---------- ---------- ---------- ---------- ---------- ----------
C        497   26094.35          0          0          0          0
C        498    22272.4          0          0          0          0
C        499    19616.8          0          0          0          0
C       9999   87781668          0          0          0          0
C            87849651.6          0          1          1          2
I        497    50325.8          0          0          0          0
I        498    52215.4          0          0          0          0
I        499   58445.85          0          0          0          0
I       9999  169497409          0          0          0          0
I             169658396          0          1          1          2
P        497   31141.75          0          0          0          0
P        498    46942.8          0          0          0          0
P        499      24156          0          0          0          0
P       9999   70890248          0          0          0          0
P            70992488.6          0          1          1          2
S        497  110629.75          0          0          0          0
S        498   82937.25          0          0          0          0
S        499   80999.15          0          0          0          0
S       9999  267205791          0          0          0          0
S             267480357          0          1          1          2
T        497     8319.6          0          0          0          0
T        498    5347.65          0          0          0          0
T        499      19781          0          0          0          0
T       9999   28095689          0          0          0          0
T            28129137.3          0          1          1          2
         497  226511.25          1          0          2          1
         498   209715.5          1          0          2          1
         499   202998.8          1          0          2          1
        9999  623470805          1          0          2          1
              624110031          1          1          3          3
 
 
April     29, 2005 - 10:21 am UTC 
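GROUPING_ID is simply the GROUPING() bits of its arguments read as a binary number, high-order bit first. In the output above, GCP = GROUPING(channel_id)*2 + GROUPING(promo_id): a value of 2 (binary 10) marks rows where channel_id has been aggregated away but promo_id has not, and 3 (binary 11) marks the grand-total row. A sketch of the relationship against the same sh.sales example:

```sql
-- grouping_id(a, b) = grouping(a) * 2 + grouping(b)
--   0 = binary 00: detail row (neither column aggregated)
--   1 = binary 01: second argument aggregated
--   2 = binary 10: first argument aggregated
--   3 = binary 11: both aggregated (grand total)
select channel_id, promo_id,
       grouping(channel_id) gc,
       grouping(promo_id)  gp,
       grouping(channel_id) * 2 + grouping(promo_id) same_as_gcp
  from sales
 where promo_id > 496
 group by cube(channel_id, promo_id);
```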
 
 
 
How to do this using Analytics
A reader, May       05, 2005 - 5:11 pm UTC
 
 
Hello Sir,
         I have a denormalized table dept_emp, part of which I have reproduced here. It has/will have dupes.
I need to find out all emps which belong to more than one dept using analytics (want to avoid a self join).
So the required output must be :
DEPTNO DNAME      EMPNO ENAME               
------ ---------- ----- --------------------
    10 D10            1 E1                  
    10 D10            1 E1                  
    10 D10            2 E2                  
    10 D10            2 E2                  
                  
    20 D20            1 E1                  
    20 D20            1 E1                  
    20 D20            2 E2                  
    20 D20            2 E2                  
   
From the total set of :
SELECT * FROM DEPT_EMP ORDER BY DEPTNO ,EMPNO
DEPTNO DNAME      EMPNO ENAME               
------ ---------- ----- --------------------
    10 D10            1 E1                  
    10 D10            1 E1                  
    10 D10            2 E2                  
    10 D10            2 E2                  
    10 D10            3 E3                  
    10 D10            3 E3                  
    20 D20            1 E1                  
    20 D20            1 E1                  
    20 D20            2 E2                  
    20 D20            2 E2                  
    20 D20            4 E4                  
    20 D20            4 E4                  
    20 D20            5 E5                  
    20 D20            5 E5                  
14 rows selected
create table dept_emp (deptno number , dname varchar2(10) ,empno number ,ename varchar2(20) ) ;
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 1, 'E1'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 2, 'E2'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 3, 'E3'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 4, 'E4'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 5, 'E5'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 1, 'E1'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 2, 'E2'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 1, 'E1'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 2, 'E2'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
10, 'D10', 3, 'E3'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 4, 'E4'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 5, 'E5'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 1, 'E1'); 
INSERT INTO DEPT_EMP ( DEPTNO, DNAME, EMPNO, ENAME ) VALUES ( 
20, 'D20', 2, 'E2'); 
COMMIT ;
Thanx 
 
May       05, 2005 - 6:10 pm UTC 
 
no analytics
select empno, count(distinct deptno)
  from t
 group by empno
having count(distinct deptno) > 1; 
 
 
 
Thanx Sir
A reader, May       05, 2005 - 9:20 pm UTC
 
 
Actually I was planning to use analytics to get the whole row info; will do the same trick with analytics, then.
You are a Genius. 
 
May       06, 2005 - 7:17 am UTC 
 
select *
  from (
select t.*, count(distinct deptno) over (partition by empno) cnt
  from t
       )
 where cnt > 1;
 
 
 
 
Analytical solution
Baiju Menon, May       10, 2005 - 6:29 am UTC
 
 
Sir,
I want to list the department and the maximum number of employees working in that department by using an analytic function (only the department in which the maximum number of employees are working).
the query without the analytic function is
select deptno, count(deptno)
  from emp
 group by deptno
having count(deptno) in (select max(count(deptno)) from emp group by deptno)
Thanks
 
 
May       10, 2005 - 9:15 am UTC 
 
  1  select deptno, cnt
  2    from (
  3  select deptno, cnt, max(cnt) over() max_cnt
  4    from (
  5  select deptno, count(*) cnt
  6    from emp
  7   group by deptno
  8         )
  9        )
 10*  where cnt = max_cnt
scott@ORA9IR2> /
 
    DEPTNO        CNT
---------- ----------
        30          6
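A variation with one less level of nesting (untested) ranks the aggregated counts directly; like the query above, it returns every department that ties for the maximum:

```sql
select deptno, cnt
  from (select deptno, count(*) cnt,
               rank() over (order by count(*) desc) rnk
          from emp
         group by deptno)
 where rnk = 1;
```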
 
 
 
 
group by
Anoop Gupta, May       11, 2005 - 4:15 am UTC
 
 
Hi Tom,
        
       I have a table whose data looks like this:
   empid levelname
   1001  Level1
   1001  Level2
   1001  Level3
   1001  Level4
   1002  Level1
   1002  Level2
   1002  Level3
   ...
   ...
This table tells on which levels an employee is assigned.
Is there any query possible that will return data like this, without writing a function?
empid  assigned levels
1001   level1,level2,level3,level4
1002   level1,level2,level3
...
...
Waiting for your response.....
        
 
May       11, 2005 - 7:30 am UTC 
 
only if there is some reasonable maximum number of levelname rows per empid.
is there? 
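Assuming such a cap exists (say, four levels), one 9i-era approach is the same row_number()/max(decode()) pivot used earlier in this thread, with the pivoted columns concatenated -- a hypothetical, untested sketch against a table t(empid, levelname):

```sql
select empid,
       rtrim( max(decode(rn, 1, levelname)) || ',' ||
              max(decode(rn, 2, levelname)) || ',' ||
              max(decode(rn, 3, levelname)) || ',' ||
              max(decode(rn, 4, levelname)), ',' ) level_list
  from (select empid, levelname,
               row_number() over (partition by empid
                                  order by levelname) rn
          from t)
 group by empid;
-- rtrim strips the separators left behind by empids
-- with fewer than four levels
```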
 
 
 
Analytics Rock - But why are they slower for me
Jeff Plumb, May       13, 2005 - 1:00 am UTC
 
 
Hi Tom,
I have followed your example about analytics from Effective Oracle by Design, page 516 (Find a specific row in a partition). When I run the example and tkprof the 3 different queries, the analytic version actually takes a lot longer to run, although it does less logical I/O. It is doing a lot more physical I/O, so I am guessing that it is using a temporary segment on disk to perform the window sort. To perform the test I created the big_table that you use and populated it with 1,000,000 rows. I am using Oracle 9i Release 2. Here is the output from TKPROF:
********************************************************************************
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.owner = t.owner)
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8      5.32       6.42      13815      14669          0         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      5.32       6.42      13815      14669          0         694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows     Row Source Operation
-------  ---------------------------------------------------
    694  HASH JOIN
     20   VIEW
     20    SORT GROUP BY
1000000     TABLE ACCESS FULL BIG_TABLE
1000000   TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select t.owner, t.object_name, t.created
from   big_table t
join (select owner, max(created) maxcreated
      from   big_table
      group by owner) t2
  on (t2.owner = t.owner and t2.maxcreated = t.created)
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8      5.03       5.06      13816      14669          0         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      5.03       5.06      13816      14669          0         694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows     Row Source Operation
-------  ---------------------------------------------------
    694  HASH JOIN
     20   VIEW
     20    SORT GROUP BY
1000000     TABLE ACCESS FULL BIG_TABLE
1000000   TABLE ACCESS FULL BIG_TABLE
********************************************************************************
select owner, object_name, created
from
(   select owner, object_name, created, max(created) over (partition by owner) as maxcreated
    from   big_table
)
where created = maxcreated
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        8     16.68      40.66      15157       7331         17         694
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10     16.68      40.66      15157       7331         17         694
Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 33
Rows     Row Source Operation
-------  ---------------------------------------------------
    694  VIEW
1000000   WINDOW SORT
1000000    TABLE ACCESS FULL BIG_TABLE
********************************************************************************
And when I run the analytic query using autotrace, I get the following, which shows a sort to disk:
SQL*Plus: Release 9.2.0.6.0 - Production on Fri May 13 14:53:08 2005
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production
control@DWDEV> set autot traceonly
control@DWDEV> select owner, object_name, created
  2  from
  3  (   select owner, object_name, created, max(created) over (partition by owner) as maxcreated
  4      from   big_table
  5  )
  6  where created = maxcreated;
694 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4399 Card=1000000 Bytes=52000000)
   1    0   VIEW (Cost=4399 Card=1000000 Bytes=52000000)
   2    1     WINDOW (SORT) (Cost=4399 Card=1000000 Bytes=43000000)
   3    2       TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=637 Card=1000000 Bytes=43000000)
Statistics
----------------------------------------------------------
          0  recursive calls
         17  db block gets
       7331  consistent gets
      15348  physical reads
        432  redo size
      12784  bytes sent via SQL*Net to client
        717  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
        694  rows processed
So how can I stop the sorts (disk)? I am guessing that the pga_aggregate_target needs to be higher, but it seems to already be set quite high.
control@DWDEV> show parameter pga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 524288000
I hope you can help clarify how to make the analytic version run quicker.
Thanks.
 
 
May       13, 2005 - 9:50 am UTC 
 
it'll be a function of the number of "owners" here
You have 1,000,000 records.
You have but 20 users.
in this extreme case, having 50,000 records per window and swapping out was not as good as squashing the data down to 20 records and joining -- the CBO quite smartly rewrote:
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.owner = t.owner)
as
select ...
  from big_table t, (select owner,max(created) created from big_table t2 ...)
 where ....
So, does the data you analyze to find the "most current record" tend to have 50,000 records/key in real life?
In your case, your hash table didn't spill to disk.  In real life though, the numbers would probably be much different.  a 1,000,000 row table would have keys with 10 or 100 rows maybe, not 50,000 (in general).  There you would find the answer to be very different.
And if you let the sort run in memory it would be different as well -- you would get a max of 25m (roughly 5% of your pga_aggregate_target), a setting that may have been too small.
but consider what happens when the size of the "aggregate" goes up -- diminishing marginal returns set in:
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.owner = t.owner)
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      2.06       2.01      26970      29283          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      2.06       2.01      26970      29283          0        4775
********************************************************************************
select owner, object_name, created
from
(   select owner, object_name, created,
           max(created) over (partition by owner) as maxcreated
    from   big_table
)
where created = maxcreated
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      4.57      10.05      30603      14484         15        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      4.57      10.05      30603      14484         15        4775
********************************************************************************
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.id = t.id)
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.70      12.04      33787      45393          2     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.71      12.05      33787      45393          2     1000000
********************************************************************************
select owner, object_name, created
from
(   select owner, object_name, created,
           max(created) over (partition by id) as maxcreated
    from   big_table
)
where created = maxcreated
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.00       9.60       9336      14484          2     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.00       9.60       9336      14484          2     1000000
and, given sufficient space to work "in memory", these two big queries both benefited:
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.owner = t.owner)
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      1.82       1.96       9909      29283          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      1.83       1.97       9909      29283          0        4775
********************************************************************************
select owner, object_name, created
from
(   select owner, object_name, created,
           max(created) over (partition by owner) as maxcreated
    from   big_table
)
where created = maxcreated
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch      320      2.15       2.11       2858      14484          0        4775
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      322      2.15       2.11       2858      14484          0        4775
********************************************************************************
select owner, object_name, created
from   big_table t
where  created = (select max(created)
                  from   big_table t2
                  where  t2.id = t.id)
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      7.64       7.55      10181      94633          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      7.65       7.56      10181      94633          0     1000000
********************************************************************************
select owner, object_name, created
from
(   select owner, object_name, created,
           max(created) over (partition by id) as maxcreated
    from   big_table
)
where created = maxcreated
                                                                                         
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch    66668      5.69       5.49       2699      14484          0     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    66670      5.69       5.49       2699      14484          0     1000000
(This was a dual-CPU Xeon using non-parallel query in this case, run once with a 256MB pga_aggregate_target and again with a 2GB one.)
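The rewrite pattern being benchmarked above -- replace a correlated MAX() subquery with MAX() OVER (PARTITION BY ...) and a filter -- can be sanity-checked on a toy data set. This is only a sketch: it uses Python's bundled SQLite (3.25+ for window functions) instead of Oracle, and a five-row big_table invented for the example, but the two query shapes are the same ones traced above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table big_table (owner text, object_name text, created int);
insert into big_table values
  ('A','t1',1),('A','t2',3),('A','t3',3),('B','t4',2),('B','t5',1);
""")

# correlated scalar subquery: max(created) is recomputed per owner on demand
q1 = con.execute("""
  select owner, object_name, created
    from big_table t
   where created = (select max(created) from big_table t2
                     where t2.owner = t.owner)
   order by owner, object_name
""").fetchall()

# analytic rewrite: max(created) carried along each row of its owner partition
q2 = con.execute("""
  select owner, object_name, created
    from (select owner, object_name, created,
                 max(created) over (partition by owner) as maxcreated
            from big_table)
   where created = maxcreated
   order by owner, object_name
""").fetchall()
```

Both shapes keep every row that ties for the newest created date within its owner, which is why the row counts in the traces above match.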
 
 
 
kuldeep, May       14, 2005 - 3:27 am UTC
 
 
Dear Tom,
I have three tables t1, t2 & t3, where t2 & t3 are joined with t1 on column "key_id".
Now I need the sum of key_val (amount) from t2 and the sum of key_val (amount) from t3 for each key_id
in table t1.
kuldeep@dlfscg> select * from t1;
    KEY_ID    KEY_VAL
---------- ----------
         2       1980
         1       1975
kuldeep@dlfscg> select * from t2;
    KEY_ID    KEY_VAL
---------- ----------
         2        550
         2        575
         1        500
kuldeep@dlfscg> select * from t3;
    KEY_ID    KEY_VAL
---------- ----------
         2        900
         1       1000
         1        750
***** QUERY 1 *****
kuldeep@dlfscg> SELECT t1.key_id, SUM(t2.key_val) sum_t2_key_val, SUM(t3.key_val) sum_t3_key_val
  2  FROM t1, t2, t3
  3  WHERE t1.key_id=t2.key_id
  4  AND t1.key_id=t3.key_id
  5  GROUP BY t1.key_id
  6  /
    KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
         1           1000           1750
         2           1125           1800
***** QUERY 2 *****
kuldeep@dlfscg> SELECT t1.key_id, t2.sum_t2_key_val, t3.sum_t3_key_val
  2  FROM   t1, 
  3      (SELECT key_id, SUM(key_val) sum_t2_key_val FROM t2 GROUP BY key_id) t2,
  4      (SELECT key_id, SUM(key_val) sum_t3_key_val FROM t3 GROUP BY key_id) t3
  5  WHERE  t1.key_id=t2.key_id
  6  AND    t1.key_id=t3.key_id 
  7  /
    KEY_ID SUM_T2_KEY_VAL SUM_T3_KEY_VAL
---------- -------------- --------------
         1            500           1750
         2           1125            900
Query 1 is giving the wrong result, and I cannot use query 2, whose performance is very poor.
Oracle 9i has added a lot of new grouping features and a lot of analytic functions (all going over my head).
Is there any "special" sum function, or some other way, that picks up a value only once per row (or per the query's key, here "key_id"),
irrespective of how many times it appears in the query result?
    KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
         1        500       1000 
         1        500        750 <---- the 500 from t2 is a repeat; it should not be counted again
         2        550        900
         2        575        900 <---- the 900 from t3 is a repeat; it should not be counted again
thanks and regards,
 
 
May       14, 2005 - 9:36 am UTC 
 
select t1.key_id, t2.sum_val, t3.sum_val
  from t1, 
       (select key_id, sum(key_val) sum_val from t2 group by key_id ) t2,
       (select key_id, sum(key_val) sum_val from t3 group by key_id ) t3
 WHERE  t1.key_id=t2.key_id
 AND    t1.key_id=t3.key_id 
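The reason this works where the original three-way join did not: each detail table is collapsed to one row per key_id *before* the join, so the t2-by-t3 fan-out can no longer double-count anything. A runnable sketch of the same shape, with SQLite standing in for Oracle and using the sample rows from the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t1 (key_id int, key_val int);
create table t2 (key_id int, key_val int);
create table t3 (key_id int, key_val int);
insert into t1 values (2,1980),(1,1975);
insert into t2 values (2,550),(2,575),(1,500);
insert into t3 values (2,900),(1,1000),(1,750);
""")

# aggregate each detail table to ONE row per key_id before joining,
# so joining t2 and t3 through t1 cannot multiply rows
rows = con.execute("""
  select t1.key_id, t2.sum_val, t3.sum_val
    from t1
    join (select key_id, sum(key_val) sum_val from t2 group by key_id) t2
      on t1.key_id = t2.key_id
    join (select key_id, sum(key_val) sum_val from t3 group by key_id) t3
      on t1.key_id = t3.key_id
   order by t1.key_id
""").fetchall()
```

The sums match the question's QUERY 2 intent: 500/1750 for key 1 and 1125/900 for key 2, with no cross-product inflation.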
  
 
 
 
apply an amount across multiple records
Dave, May       15, 2005 - 8:17 pm UTC
 
 
I have a problem similar to what I call the invoice payment problem.
It would seem to be a common problem, but I have searched to no avail.
The idea is that a customer may have many outstanding invoices, and sends in a check for an arbitrary amount.  So we need to apply the money across the invoices oldest first.
Note that in my specific case, if a payment exceeds the total outstanding, the excess is ignored (obviously not dealing with real money here!)
create table invoices (
  cust_nbr    integer not null,
  invoice_nbr integer not null,
  invoice_amt number not null,
  payment_amt number not null,
  primary key (cust_nbr, invoice_nbr)
);
begin
  delete from invoices;
  dbms_random.seed(123456789);
  for c in 1 .. 2 loop
    for i in 1 .. 3 loop
      insert into invoices values (c, i, round(dbms_random.value * 10, 2)+1, 0);
    end loop;
  end loop;
  update invoices
    set payment_amt = round(dbms_random.value * invoice_amt, 2)
    where invoice_nbr = 1;
  commit;
end;
/
select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
       invoice_amt - payment_amt outstanding_amt
  from invoices
  where invoice_amt - payment_amt > 0
  order by cust_nbr, invoice_nbr;
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        5.55            3.89
         1           2        3.21           0            3.21
         1           3        2.78           0            2.78
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92
variable cust_nbr number;
variable received_amt number;
begin
  :cust_nbr := 1;
  :received_amt := 7.25;
end;
/
update invoices i1
  set payment_amt = (... some query which applies
                      :received_amt to outstanding_amt ...)
  where cust_nbr = :cust_nbr;
result should be:
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        9.44               0
         1           2        3.21        3.21               0
         1           3        2.78         .15            2.63
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92
This is simple to solve in pl/sql with a cursor, but I thought it would be a good test for a set-based solution with analytics.  But after some effort, I'm stumped.
 
 
May       16, 2005 - 7:37 am UTC 
 
Using analytics we can see how to apply the inputs:
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  2         least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
  3    from (
  4  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  5         invoice_amt - payment_amt outstanding_amt,
  6         sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
  7    from invoices
  8   where cust_nbr = :cust_nbr
  9         )
 10    order by cust_nbr, invoice_nbr;
 
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT AMOUNT_TO_APPLY
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        5.55            3.89
         1           2        3.21           0            3.21
         1           3        2.78           0             .15
Just needed a running total of outstanding amounts to take away from the received amount....
Then, merge:
ops$tkyte@ORA9IR2> merge into invoices
  2  using
  3  (
  4  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  5         least( greatest( :received_amt - rt + outstanding_amt, 0 ), outstanding_amt ) amount_to_apply
  6    from (
  7  select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  8         invoice_amt - payment_amt outstanding_amt,
  9         sum(invoice_amt - payment_amt) over (partition by cust_nbr order by invoice_nbr) rt
 10    from invoices
 11   where cust_nbr = :cust_nbr
 12         )
 13  ) x
 14  on ( invoices.cust_nbr = x.cust_nbr and invoices.invoice_nbr = x.invoice_nbr )
 15  when matched then update set payment_amt = nvl(payment_amt,0)+x.amount_to_apply
 16  when not matched /* never happens... */ then insert (cust_nbr) values (null);
 
3 rows merged.
 
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select cust_nbr, invoice_nbr, invoice_amt, payment_amt,
  2         invoice_amt - payment_amt outstanding_amt
  3    from invoices
  4    order by cust_nbr, invoice_nbr;
 
  CUST_NBR INVOICE_NBR INVOICE_AMT PAYMENT_AMT OUTSTANDING_AMT
---------- ----------- ----------- ----------- ---------------
         1           1        9.44        9.44               0
         1           2        3.21        3.21               0
         1           3        2.78         .15            2.63
         2           1        7.57         4.3            3.27
         2           2        9.46           0            9.46
         2           3        5.92           0            5.92
 
6 rows selected.
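The least(greatest(:received_amt - rt + outstanding_amt, 0), outstanding_amt) expression deserves a second look: rt - outstanding_amt is the money already consumed by earlier invoices, so received_amt - rt + outstanding_amt is what is left over for this invoice, clamped between 0 and the invoice's own outstanding amount. A pure-Python sketch of the same arithmetic (the function name is invented for illustration):

```python
def amounts_to_apply(received, outstanding):
    """Mimic least(greatest(received - rt + out, 0), out) over a running total rt."""
    applied, rt = [], 0.0
    for out in outstanding:
        rt += out                               # running total of outstanding amounts
        # what remains for this invoice, clamped to the range [0, out]
        applied.append(min(max(received - rt + out, 0.0), out))
    return applied
```

For the data above, amounts_to_apply(7.25, [3.89, 3.21, 2.78]) yields 3.89, 3.21 and 0.15 (modulo binary floating-point rounding; the real table uses NUMBER, which has no such issue).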
 
 
 
 
 
Group by
Anoop Gupta, May       16, 2005 - 10:06 am UTC
 
 
Reviewer:  Anoop Gupta  from INDIA 
Hi Tom,
As I asked in my earlier question,
       I have a table whose data looks like this:
   empid leavelname
   1001  Level1
   1001  Level2
   1001  Level3
   1001  Level4
   1002  Level1
   1002  Level2
   1002  Level3
   ...
   ...
This table tells which levels an employee is assigned to.
Is there any query possible that will return data like this, without writing a
function?
empid  emp_assigned_on_level
1001   Level1,Level2,Level3,Level4
1002   Level1,Level2,Level3
...
...
Please show me a way to write such a query, assuming a limit of 50 levels per employee.
Please reply....
 
 
May       16, 2005 - 1:09 pm UTC 
 
select empid,
       rtrim(
       max(decode(rn, 1,leavelname)) || ',' ||
       max(decode(rn, 2,leavelname)) || ',' ||
       ....
       max(decode(rn,50,leavelname)), ',' )
  from (select empid,   
               row_number() over (partition by empid order by leavelname) rn,
               leavelname
          from t 
       )
 group by empid; 
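A runnable sketch of the same decode-pivot idea, with SQLite 3.25+ standing in for Oracle, CASE standing in for DECODE, and only four slots instead of 50:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t (empid int, leavelname text);
insert into t values
  (1001,'Level1'),(1001,'Level2'),(1001,'Level3'),(1001,'Level4'),
  (1002,'Level1'),(1002,'Level2'),(1002,'Level3');
""")

# number the levels per employee, pivot slot n out with max(case rn when n ...),
# then trim the trailing commas left behind by empty slots
rows = con.execute("""
  select empid,
         rtrim(    coalesce(max(case rn when 1 then leavelname end), '')
           || ',' || coalesce(max(case rn when 2 then leavelname end), '')
           || ',' || coalesce(max(case rn when 3 then leavelname end), '')
           || ',' || coalesce(max(case rn when 4 then leavelname end), ''), ',')
    from (select empid,
                 row_number() over (partition by empid order by leavelname) rn,
                 leavelname
            from t)
   group by empid
   order by empid
""").fetchall()
```

The coalesce() calls are needed because SQLite's || propagates NULL, whereas Oracle treats NULL as an empty string in concatenation; otherwise the shape is identical.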
 
 
 
special sum
kuldeep, May       17, 2005 - 12:38 am UTC
 
 
Dear Tom,
Thanks for your response and for this useful site.
I was looking for a solution that could avoid the inline views that were making my query run slowly. I tried for a solution and arrived at this query:
/*  DATA VIEW  */
kuldeep@dlfscg> SELECT t1.key_id,    
  2      t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
  3      t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
  4  FROM t1, t2, t3
  5  WHERE t1.key_id=t2.key_id
  6  AND t1.key_id=t3.key_id
  7  ORDER BY t1.key_id
  8  /
    KEY_ID T2_ROWID                T2_RN    KEY_VAL T3_ROWID                T3_RN    KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
         1 AAANZ5AAHAAAD94AAA          1        500 AAANZ4AAHAAAD9wAAA          1       1000
         1 AAANZ5AAHAAAD94AAA          2        500 AAANZ4AAHAAAD9wAAB          1        750
         2 AAANZ5AAHAAAD91AAA          1        550 AAANZ4AAHAAAD9tAAA          1        900
         2 AAANZ5AAHAAAD91AAB          1        575 AAANZ4AAHAAAD9tAAA          2        900
/*  FINAL QUERY  */
kuldeep@dlfscg> SELECT key_id,
  2      SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
  3      SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
  4  FROM   (SELECT t1.key_id,    
  5           t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
  6           t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
  7          FROM t1, t2, t3
  8    WHERE t1.key_id=t2.key_id
  9    AND t1.key_id=t3.key_id)
 10  GROUP BY key_id
 11  /
    KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
         1        500       1750
         2       1125        900
regards,
 
 
May       17, 2005 - 8:23 am UTC 
 
One would need more information -- it APPEARS that you are trying to get a "random first hit" from T2 and T3 by T1.key_id.
That is, for every row in T1 -- find the first match (any match will do) in T2 and in T3,
and report that value.
Is that correct?
And how big are t1, t2 and t3 -- and how long, exactly, is "slow"? 
 
 
 
group by
Anoop Gupta, May       17, 2005 - 9:42 am UTC
 
 
Tom,
Thanks for your prompt response. 
 
 
Analytical Problem
Imran, May       18, 2005 - 4:16 am UTC
 
 
Look at the following two queries.
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17           WHERE phone IN (7629458));
PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458   Apr, 2005          2562.52       5265         5265 UP               0
SQL> SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17           WHERE phone IN (7629459));
PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629459   Apr, 2005          3516.62       7834         7834 UP               0
SQL> 
Now, when I combine the two queries, the results are different. 
  1  SELECT phone, MONTH, arrears, this_month, ABS (up_down),
  2         CASE
  3            WHEN up_down < 0
  4               THEN 'DOWN'
  5            WHEN up_down > 0
  6               THEN 'UP'
  7            ELSE 'BALANCE'
  8         END CASE,
  9         prev_month
 10    FROM (SELECT exch || ' - ' || phone phone,
 11                 TO_CHAR (TO_DATE (MONTH, 'YYMM'), 'Mon, YYYY') MONTH, region,
 12                 instdate, paybefdue this_month, arrears,
 13                 LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC) prev_month,
 14                   paybefdue
 15                 - (LEAD (paybefdue, 1, 0) OVER (ORDER BY MONTH DESC)) up_down
 16            FROM ptc
 17*          WHERE phone IN (7629458,7629459))
SQL> /
PHONE           MONTH              ARREARS THIS_MONTH ABS(UP_DOWN) CASE    PREV_MONTH
--------------- --------------- ---------- ---------- ------------ ------- ----------
202 - 7629458   Apr, 2005          2562.52       5265         2569 DOWN          7834
202 - 7629459   Apr, 2005          3516.62       7834         7834 UP               0
So you can see that the previous month's balance is now badly disturbed.
Please tell me how to do this.
 
 
May       18, 2005 - 8:58 am UTC 
 
need a test case: create table, inserts (like the page used to submit this said....) 
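For what it is worth, the symptom -- correct per phone, wrong when the phones are combined -- is exactly what LEAD ... OVER (ORDER BY month DESC) produces when the OVER clause lacks PARTITION BY phone: the "previous month" can then come from the other phone's rows. A sketch of that likely fix, on invented ptc data (SQLite 3.25+ standing in for Oracle; column names follow the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table ptc (phone int, month text, paybefdue real)")
con.executemany("insert into ptc values (?,?,?)",
                [(7629458, '0504', 5265), (7629458, '0503', 5000),
                 (7629459, '0504', 7834), (7629459, '0503', 7000)])

# partition by phone so each phone's LEAD only ever sees that phone's months
rows = con.execute("""
  select phone, month,
         lead(paybefdue, 1, 0)
           over (partition by phone order by month desc) prev_month
    from ptc
   order by phone, month desc
""").fetchall()
```

With the partition in place, each April row picks up its own phone's March figure, and each oldest row falls back to the LEAD default of 0.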
 
 
 
Use of analytic functions in UPDATE statements
Bob Lyon, May       18, 2005 - 12:29 pm UTC
 
 
Tom,
-- Given this sample data
CREATE TABLE GT (
   XP_ID           INTEGER,
   OFFSET          INTEGER,
   PMAX            NUMBER,
   PRIOR_PMAX      NUMBER
);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 1, 3);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 123, 2, 8);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 155, 3, 5);
INSERT INTO GT (XP_ID, OFFSET, PMAX) VALUES( 173, 3, 7.3);
-- I want to update the table and set the PRIOR_PMAX column values to be as follows
SELECT XP_ID, OFFSET,  PMAX,
       LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
                                ORDER BY     XP_ID, OFFSET) PRIOR_PMAX
FROM   GT
ORDER BY  XP_ID, OFFSET;
     XP_ID     OFFSET       PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
       123          1          3
       123          2          8          3
       155          3          5
       173          3        7.3
-- My update to do this tells me "4 rows updated.", but does not do what I want.
UPDATE GT A
SET    PRIOR_PMAX = (
          SELECT LAG(B.PMAX, 1, NULL) OVER (PARTITION BY B.XP_ID
                                            ORDER BY     B.XP_ID, B.OFFSET)  PRIOR_PMAX
          FROM   GT B
          WHERE  A.ROWID = B.ROWID
          );
-- but I get
SELECT  xp_id, offset, pmax, prior_pmax
FROM    GT
ORDER BY  xp_id, offset;
     XP_ID     OFFSET       PMAX PRIOR_PMAX
---------- ---------- ---------- ----------
       123          1          3
       123          2          8
       155          3          5
       173          3        7.3
-- Oracle doc states
--  "Therefore, analytic functions can appear only in the select list or ORDER BY clause."
-- which is perhaps a little ambiguous in this case.
-- Is there a way to do this update in "straight SQL"?
 
 
May       18, 2005 - 12:54 pm UTC 
 
you can merge
merge into gt a
using ( SELECT rowid rid, XP_ID, OFFSET,  PMAX,
       LAG(PMAX, 1, NULL) OVER (PARTITION BY XP_ID
                                ORDER BY     XP_ID, OFFSET) PRIOR_PMAX
FROM   GT )b
on (a.rowid = b.rowid) 
when matched then update ...
when not matched then ... (never happens; just do a dummy insert of a single null in 9i, or leave the clause off entirely in 10g) 
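A runnable sketch of the same merge-by-rowid idea, with SQLite standing in for Oracle. SQLite has LAG but no MERGE, so the computed values are applied back by rowid in a second statement (MERGE does both steps in one in Oracle); the OFFSET column is renamed offset_ because that word is reserved there:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table gt (xp_id int, offset_ int, pmax real, prior_pmax real);
insert into gt (xp_id, offset_, pmax) values
  (123,1,3),(123,2,8),(155,3,5),(173,3,7.3);
""")

# step 1: compute LAG(pmax) per xp_id once, keyed by rowid
pairs = con.execute("""
  select rowid, lag(pmax) over (partition by xp_id order by offset_)
    from gt
""").fetchall()

# step 2: apply the computed values back by rowid
con.executemany("update gt set prior_pmax = ? where rowid = ?",
                [(prior, rid) for rid, prior in pairs])

rows = con.execute(
    "select xp_id, offset_, pmax, prior_pmax from gt order by xp_id, offset_"
).fetchall()
```

Only the second row of xp_id 123 has a predecessor within its partition; all other rows keep a NULL prior_pmax, exactly as in the desired output above.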
 
 
 
special sum
Kuldeep, May       19, 2005 - 1:09 am UTC
 
 
My requirement was like this: I have receivables (bills, debit notes etc.) which I adjust against the received payments and credit notes (both are in separate tables). To know the outstanding amount, I was joining (outer join) my receivables with payments and credit notes. 
Because one receivable can be adjusted against many payments and credit notes, the outstanding amount was:
  outstanding = receivable amount - sum(payment amount) - sum(credit note amount)
This simple query using an outer join gives the wrong result if a receivable is adjusted against one payment and more than one credit note, or vice versa.
In the case where
  receivable : 1000         payment : 400         CN : 400, 200
it will appear as
  1000        400         400
  1000        400         200
              ---         ---
              800         600      outstanding = -400 (wrong)
My t1, t2 and t3 have 600,000, 350,000 and 80,000 rows respectively.
This is my actual inline view query
-----------------------------------
SELECT a.bill_type, a.bill_exact_type, a.period_id, 
       a.scheme_id, a.property_number, a.bill_number, 
       a.bill_amount, SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) adj_amt,
       NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0)) pending_amt
FROM   ALL_RECEIVABLE a, 
       (SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adj_amt) adj_amt
        FROM   CREDIT_NOTE_RECEIVABLE
        WHERE  bill_type=p_bill_type
        AND    scheme_id=p_scheme
        AND    property_number=p_prop
        GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) c, 
    (SELECT bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number, SUM(adjust_amount) adjust_amount
     FROM    PAYMENT_RECEIPT_ADJ
     WHERE  bill_type=p_bill_type
     AND    scheme_id=p_scheme
     AND    property_number=p_prop
     GROUP BY bill_type, scheme_id, property_number, bill_exact_type, period_id, bill_number) p
WHERE   a.bill_type=P_BILL_TYPE
AND    a.scheme_id=P_SCHEME
AND    a.property_number=P_PROP
AND    a.bill_type=c.bill_type(+)
AND    a.bill_exact_type=c.bill_exact_type(+)
AND    a.period_id=c.period_id(+)
AND    a.scheme_id=c.scheme_id(+)
AND    a.property_number=c.property_number(+)
AND    a.bill_number=c.bill_number(+)
AND    a.bill_type=p.bill_type(+)
AND    a.bill_exact_type=p.bill_exact_type(+)
AND    a.period_id=p.period_id(+)
AND    a.scheme_id=p.scheme_id(+)
AND    a.property_number=p.property_number(+)
AND    a.bill_number=p.bill_number(+)
GROUP BY a.bill_type, a.bill_exact_type, a.period_id, a.scheme_id, 
             a.property_number, a.bill_number, a.bill_date, a.bill_amount
HAVING (NVL(a.bill_amount,0) - SUM(NVL(c.adj_amt,0)+NVL(p.adjust_amount,0))) > 0
ORDER BY a.bill_date;
-----------------------------------
It is not about reporting just the first hit of t1 in t2 and t3. In my last posting, I was trying to exclude any repeated t2 or t3 ROW from the sum calculation. That is, each row of t2 and t3 should be counted only once.
I have tried this query with more rows, and applied the same approach to the actual query; it works fine and gives the same result the inline-view query was giving.
kuldeep@dlfscg> SELECT t1.key_id,    
  2      t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val,
  3      t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val
  4  FROM t1, t2, t3
  5  WHERE t1.key_id=t2.key_id(+)
  6  AND t1.key_id=t3.key_id(+)
  7  ORDER BY t1.key_id
  8  /
    KEY_ID T2_ROWID                T2_RN    KEY_VAL T3_ROWID                T3_RN    KEY_VAL
---------- ------------------ ---------- ---------- ------------------ ---------- ----------
         1 AAANZ5AAHAAAD94AAA          1        500 AAANZ4AAHAAAD9wAAA          1       1000
         1 AAANZ5AAHAAAD94AAA          2        500 AAANZ4AAHAAAD9wAAB          1        750
         1 AAANZ5AAHAAAD94AAA          3        500 AAANZ4AAHAAAD9wAAC          1         25
         2 AAANZ5AAHAAAD91AAA          1        550 AAANZ4AAHAAAD9tAAA          1        900
         2 AAANZ5AAHAAAD91AAB          1        575 AAANZ4AAHAAAD9tAAA          2        900
         3 AAANZ5AAHAAAD91AAC          1        222                             1
         3 AAANZ5AAHAAAD91AAD          1        223                             2
         4                             1            AAANZ4AAHAAAD9tAAB          1        333
8 rows selected.
kuldeep@dlfscg> SELECT key_id,
  2      SUM(DECODE(t2_rn,1,t2_key_val,0)) t2_key_val,
  3      SUM(DECODE(t3_rn,1,t3_key_val,0)) t3_key_val
  4  FROM   (SELECT t1.key_id,    
  5           t2.ROWID t2_rowid, row_number() over (PARTITION BY t2.ROWID ORDER BY t3.ROWID) t2_rn, t2.key_val t2_key_val,
  6           t3.ROWID t3_rowid, row_number() over (PARTITION BY t3.ROWID ORDER BY t2.ROWID) t3_rn, t3.key_val t3_key_val
  7          FROM t1, t2, t3
  8    WHERE t1.key_id=t2.key_id(+)
  9    AND t1.key_id=t3.key_id(+))
 10  GROUP BY key_id
 11  /
    KEY_ID T2_KEY_VAL T3_KEY_VAL
---------- ---------- ----------
         1        500       1775
         2       1125        900
         3        445          0
         4                   333
kuldeep@dlfscg> 
thanks for your responses.
regards, 
 
May       19, 2005 - 7:47 am UTC 
 
Do not order by rowid to get a "last" row -- is that what you are trying to do?
Which row do you want to get from t2 to join with t1,
and which row do you want to get from t3 to join with t1?
You must specify that based on attributes you manage (e.g., there must be an orderable field that helps you determine WHICH record is the right one).
Consider rowid to be a random number that has no meaning when ordered by; it does not imply order of insertion or anything else. 
 
 
 
null record
yeshk, May       25, 2005 - 4:12 pm UTC
 
 
I need help with this query - This is just a part of the query I am working with. 
I am not able to generate a NULL RECORD in the middle of the result set.
I should be able to pass this information out as a reference cursor.
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);
insert into test values('IL','DSL',200,300);
insert into test values('CA','DSL',100,300);
insert into test values('MO','NDS',1000,300);
insert into test values('MO','DSL',100,200);
I need a result like this
STATE   SVC_CAT  MEASD_TKT        NON MEASD TKT
CA    DSL          200              300
CA    NDS          100              200
      TOTAL        300              500
IL    DSL          200              300
      TOTAL        200              300
MO    DSL          100              200
MO    NDS         1000              300
      TOTAL       1100              500
I am able to generate the result using a query with analytics, but I don't know how to get an empty row after each state total.
Also, which is better:
  1) a cursor based on state,
  2) getting the data and inserting it into a temporary table, or
  3) inserting a null record,
or using analytics to get the complete data and putting it into a reference cursor?
Thanks
yeshk 
 
May       25, 2005 - 7:57 pm UTC 
 
well, that would sort of be the job of the "pretty printing routine" -- eg: the report generator?
what tool is printing this out? 
 
 
 
null record  
yeshk, May       26, 2005 - 9:20 am UTC
 
 
We need to give the resultant set, with a null record after each state calculation, to a front-end VB application. It will be given in a reference cursor. They will just select * from the reference cursor and display it on a report.  
 
May       26, 2005 - 10:02 am UTC 
 
The VB application should do this (it should be able to do something, shouldn't it...)
ops$tkyte@ORA9IR2> select decode( grp, 0, state ) state,
  2         decode( grp, 0, svc_cat) svc_cat,
  3             decode( grp, 0, sum_mt ) sum_mt,
  4             decode( grp, 0, sum_nmt ) sum_nmt
  5    from (
  6  select grouping(dummy) grp, state, svc_cat, sum(measd_tkt) sum_mt, sum(non_measd_tkt) sum_nmt
  7    from (
  8  select state, svc_cat, 1 dummy, measd_tkt, non_measd_tkt
  9    from test
 10         )
 11   group by rollup( state, dummy, svc_cat )
 12         )
 13  /
 
ST SVC     SUM_MT    SUM_NMT
-- --- ---------- ----------
CA DSL        100        300
CA NDS        100        200
CA            200        500
 
IL DSL        200        300
IL            200        300
 
MO DSL        100        200
MO NDS       1000        300
MO           1100        500
 
 
 
12 rows selected.
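If the VB side really cannot be changed, the separator row can also be stitched in procedurally before handing the rows over; a pure-Python sketch of what that fetch-side code would do with ordinary grouped rows (names invented for illustration):

```python
from itertools import groupby

# detail rows as (state, svc_cat, measd_tkt, non_measd_tkt), already ordered by state
detail = [('CA', 'DSL', 100, 300), ('CA', 'NDS', 100, 200),
          ('IL', 'DSL', 200, 300),
          ('MO', 'DSL', 100, 200), ('MO', 'NDS', 1000, 300)]

def with_totals_and_blanks(rows):
    """After each state's detail rows, append a TOTAL row, then an all-None separator."""
    out = []
    for state, grp in groupby(rows, key=lambda r: r[0]):
        grp = list(grp)
        out.extend(grp)
        out.append((None, 'TOTAL', sum(r[2] for r in grp), sum(r[3] for r in grp)))
        out.append((None, None, None, None))   # the blank line the report wants
    return out
```

This is the "pretty printing routine" referred to above, just pushed down into whatever code fills the reference cursor or fetches from it.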
 
 
 
 
 
 
Can rollup do the thing??
Bhavesh Ghodasara, May       26, 2005 - 9:39 am UTC
 
 
Hi yeshk,
 
create table test(state varchar2(2),svc_cat varchar2(3),measd_tkt 
number,non_measd_tkt number);
insert into test values('CA','NDS',100,200);........
insert into test values('CA','DSL',100,300);....
STATE   SVC_CAT  MEASD_TKT        NON MEASD TKT
CA    DSL          200              300     <== where does measd_tkt = 200 come from?
CA    NDS          100              200
      TOTAL        300              500
Tom, can we do it like this?
break on state
select STATE,SVC_CAT,sum(measd_tkt),sum(non_measd_tkt)
from test       
group by rollup(STATE,SVC_CAT)
order by state
............
If I have made any mistake, please tell me.
Thanks in advance. 
 
May       26, 2005 - 10:19 am UTC 
 
see above 
 
 
 
Which analytics to use?
Marc-Andre Larochelle, May       30, 2005 - 9:10 pm UTC
 
 
Hi Tom,
I have this 3rd party table: 
drop table t;
create table t (atype varchar2(4),
                acol# varchar2(3),
                adin varchar2(8),
                ares varchar2(8));
insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
insert into t (atype, acol#, ares) values ('MACT','1','02246569');
insert into t (atype, acol#, ares) values ('MACT','6','02246569');
insert into t (atype, acol#, ares) values ('MACT','7','00021474');
select * from t;
ATYPE ACOL# ADIN     ARES
----- ----- -------- --------
DUPT  001   02246569
DUPT  002   00021474
DUPT  003   02246569
MACT  1              02246569
MACT  6              02246569
MACT  7              00021474
I would like to get the following result :
DUPT 001 02246569 MACT 1 02246569
DUPT 002 00021474 MACT 7 00021474
DUPT 003 02246569 MACT 6 02246569
I need to match DUPT.adin = MACT.ares, making sure a different MACT.acol# is used for every DUPT.acol#. Basically, this table holds different values in each column depending on the type of row (atype).
I have tried using lag, lead and rank, and nothing seems to work, but I am pretty sure it is doable with analytics, which is why I posted my question here.
Any hint/help would be appreciated.
Thank you,
Marc-Andre 
 
May       31, 2005 - 7:30 am UTC 
 
question for you.
How did you know to put:
DUPT 001 02246569 together with MACT 1 02246569  and
DUPT 003 02246569 together with MACT 6 02246569
and not
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
for example. Some logic is missing here. 
 
 
 
Am I Correct??
Bhavesh Ghodasara, May       31, 2005 - 5:15 am UTC
 
 
Hi tom,
I solved the above problem.
The query is like this:
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol# bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
count(*) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol<>lb
I think there must be a better way...
I know you will do it in a much, much better way.
Please suggest corrections.
Thanks in advance.
 
 
May       31, 2005 - 8:17 am UTC 
 
 
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6   02246569
DUPT 002 00021474 MACT 7   00021474
DUPT 003 02246569 MACT 1   02246569
 
Well, it gives a different result than the one you posted -- it gives my hypothetical answer, where 001 was combined with 6, not 1. 
 
 
 
We can do this..
Bhavesh Ghodasara, May       31, 2005 - 8:28 am UTC
 
 
Hi Tom,
I have further modified my query;
now it gives the desired result.
(Agreed, the question was ambiguous.)
select atyp,acol,aadin,batype,bacol,bares
from (
select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol# 
bacol,b.ares bares,
nvl(lead(b.acol# ) over(order by a.adin),0) lb,
min(b.acol#) over(partition by a.acol#) cnt
from t a,t b
where a.adin=b.ares
order by atyp,acol) t
where bacol=lb
or cnt>1
OUTPUT:
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 1   02246569
DUPT 002 00021474 MACT 7   00021474
DUPT 003 02246569 MACT 6   02246569
So any corrections now??
Thanks in advance
Bhavesh
 
 
May       31, 2005 - 8:43 am UTC 
 
I don't know your data well enough, but your query is non-deterministic if you care.  Consider:
ops$tkyte@ORA10G> create table t (atype varchar2(4),
  2                  acol# varchar2(3),
  3                  adin varchar2(8),
  4                  ares varchar2(8));
 
Table created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
 
1 row created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
  2  from (
  3  select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
  4  bacol,b.ares bares,
  5  nvl(lead(b.acol# ) over(order by a.adin),0) lb,
  6  min(b.acol#) over(partition by a.acol#) cnt
  7  from t a,t b
  8  where a.adin=b.ares
  9  order by atyp,acol) t
 10  where bacol=lb
 11  or cnt>1;
 
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 002 00021474 MACT 7   00021474
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> truncate table t;
 
Table truncated.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','001','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','002','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, adin) values ('DUPT','003','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','1','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','6','02246569');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','7','00021474');
 
1 row created.
 
ops$tkyte@ORA10G> insert into t (atype, acol#, ares) values ('MACT','5','02246569');
 
1 row created.
 
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> select atyp,acol,aadin,batype,bacol,bares
  2  from (
  3  select a.atype atyp,a.acol# acol,a.adin aadin,b.atype batype,b.acol#
  4  bacol,b.ares bares,
  5  nvl(lead(b.acol# ) over(order by a.adin),0) lb,
  6  min(b.acol#) over(partition by a.acol#) cnt
  7  from t a,t b
  8  where a.adin=b.ares
  9  order by atyp,acol) t
 10  where bacol=lb
 11  or cnt>1;
 
ATYP ACO AADIN    BATY BAC BARES
---- --- -------- ---- --- --------
DUPT 001 02246569 MACT 6   02246569
DUPT 002 00021474 MACT 7   00021474
Same data both times, just different order of insertions.  With analytics and order by, you need to be concerned about duplicates. 
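One way to remove that danger is to make every analytic ORDER BY a total order by adding a unique tie-breaker column. A hedged sketch against the same table t (assuming acol# is unique within each atype, which holds for the sample data above):

```sql
-- Sketch (assumption: acol# is unique within each atype).
-- Adding the unique columns to the ORDER BY makes the window ordering
-- total, so lead() returns the same row regardless of insertion order.
select a.atype atyp, a.acol# acol, a.adin aadin,
       b.atype batype, b.acol# bacol, b.ares bares,
       nvl(lead(b.acol#) over (order by a.adin, a.acol#, b.acol#), 0) lb
  from t a, t b
 where a.adin = b.ares;
```

With the tie broken deterministically, the same data always produces the same lead() values, whichever order the rows were inserted in.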
 
 
 
 
Answers
Marc-Andre Larochelle, May       31, 2005 - 11:35 am UTC
 
 
Tom, Bhavesh,
The problem resides exactly there: there is no logic to match the records. I know that DUPT.din1 must have a MACT.din1 somewhere. I just don't know which one (the 1st one? the 2nd one?). This is a decision I will have to make.
DUPT 001 02246569 MACT 1 02246569
DUPT 003 02246569 MACT 6 02246569
and
DUPT 001 02246569 MACT 6 02246569
DUPT 003 02246569 MACT 1 02246569
are the same to me. But when I run the query, I want to always get the same results.
Anyways, all in all, your queries (Bhavesh - thank you - and yours) seem to answer my question. I will watch out for duplicates.
Thank you very much for the quick help.
Marc-Andre 
 
 
What I found
Marc-Andre Larochelle, May       31, 2005 - 5:02 pm UTC
 
 
Hi Tom,
Testing the SQL statement Bhavesh provided, I quickly discovered what you meant when saying the query was non-deterministic. When I added a 4th record :
insert into t (atype,acol#,adin) values ('DUPT','004','02246569');
insert into t (atype,acol#,ares) values ('MACT','5','02246569');
only one row was returned. I played with the query and here is what I came up with :
select atyp,acol,aadin,batype,bacol,bares
from (
select atyp,acol,aadin,batype,bacol,bares,drnk ,
       rank() over (partition by acol order by bacol) rnk
from (
select a.atype atyp,
         a.acol# acol,
         a.adin aadin,
         b.atype batype,
         b.acol# bacol,
         b.ares bares,
         dense_rank() over (partition by a.atype,a.adin order by a.acol#) drnk
from t a,t b
where a.adin=b.ares))
where drnk=rnk;
Feel free to comment.
Again thank you (and Bhavesh).
Marc-Andre 
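An alternative, deterministic way to get this pairing is to number the DUPT rows and the MACT rows independently with row_number() and join on that ordinal, so the Nth duplicate is always paired with the Nth match. A hedged sketch (assuming acol# gives a stable order within each atype; it pairs 001 with 1, 003 with 5 for the sample data -- one deterministic choice among the equally valid ones):

```sql
-- Sketch: pair the Nth DUPT row with the Nth MACT row for each din value.
-- row_number() over a unique ORDER BY is deterministic, so the result is
-- the same no matter the insertion order.
select d.atype, d.acol#, d.adin, m.atype, m.acol#, m.ares
  from (select t.*, row_number() over (partition by adin order by acol#) rn
          from t where atype = 'DUPT') d
  join (select t.*, row_number() over (partition by ares order by acol#) rn
          from t where atype = 'MACT') m
    on d.adin = m.ares and d.rn = m.rn;
```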
 
 
Using Analytical Values to find latest info
anirudh, June      03, 2005 - 10:41 am UTC
 
 
Hi Tom,
we have a fairly large table with about 100 million rows, among others this table has 
the following columns
CREATE TABLE my_fact_table ( 
  staff_number     VARCHAR2 (10),   -- staff number 
  per_end_dt       DATE,            -- last day of month
  engagement_code  VARCHAR2 (30),   -- engagement code
  client_code      VARCHAR2 (20),   -- client code
  revenue          NUMBER (15,2)    -- revenue
  )
in this table the same engagement code can have different client codes for different periods. This was at one point desirable, and that is the reason client code was stored in this fact table instead of the engagement dimension. 
Our users now want us to update the client code in these transactions to the latest value of the client code (meaning - pick the client from the latest month for which we have got any transactions for that engagement).
This situation, where the same engagement has multiple clients across periods, exists for about 5% of the rows.
[btw - we do plan to do a data-model change to reflect the new relationships - but that may take some time - hence the interim need to just update the fact table]
To implement these updates, which may happen for several months, I'm trying to take the approach below,
which involves multiple queries and the creation of a couple of temp tables - does it seem reasonable? I have a lurking feeling that with a deeper understanding of analytic functions this can be further simplified - will appreciate your thoughts.
============= My Approach =================
-- Find the Engagements that have multiple Clients
CREATE TABLE amtest_mult_cli AS
WITH
v1 AS (SELECT DISTINCT engagement_code,client_code 
       FROM my_fact_table)
SELECT  engagement_code 
FROM v1
GROUP BY engagement_code
HAVING COUNT(*) > 1
-- Find What should be the correct client for those engagements
CREATE TABLE amtest_use_cli AS
SELECT engagement_code,per_end_dt,client_code
FROM
(
 SELECT engagement_code,per_end_dt,client_code,
        row_number() OVER (PARTITION BY engagement_code 
          ORDER BY per_end_dt DESC, client_code DESC) 
        row_num
 FROM my_fact_table a,
     amtest_mult_cli b
 WHERE a.engagement_code = b.engagement_code
)
WHERE row_num = 1;
-- Update Correct Clients for those engagements
UPDATE my_fact_table a
SET a.client_code = 
    (SELECT b.client_code
     FROM amtest_use_cli b
     WHERE a.engagement_code = b.engagement_code)
WHERE EXISTS 
    (SELECT 1 
     FROM amtest_use_cli c
     WHERE a.engagement_code = c.engagement_code);
     
====================================================== 
 
June      03, 2005 - 12:14 pm UTC 
 
why not:
merge into my_fact_table F
using 
( select engagement_code, 
         substr(max(to_char(per_end_dt,'yyyymmddhh24miss')||client_code ),15) cc
    from my_fact_table  
   group by engagement_code
  having count(distinct client_code) > 1 ) X
on ( f.engagement_code = x.engagement_code )
when matched 
     then update set client_code = x.cc
when not matched 
     then insert ( client_code ) values ( null ); <<== never can happen 
                                                  <<== in 10g, not needed!
That select finds the client_code for the max per_end_dt by engagement_code for engagement_code's that have more than one distinct client_code.  The same two values could also be computed with analytics:
               first_value(client_code) 
                      over (partition by engagement_code 
                            order by per_end_dt desc, client_code desc ),
               count(distinct client_code)  
                      over (partition by engagement_code)
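A complete, hedged sketch of that analytic alternative (untested) -- the driving set for the MERGE built from first_value() instead of the max(to_char(...)||...) trick:

```sql
-- Sketch (untested): for each engagement_code with more than one distinct
-- client_code, pick the client_code of the latest per_end_dt.
select distinct engagement_code,
       first_value(client_code)
         over (partition by engagement_code
               order by per_end_dt desc, client_code desc) cc
  from (select engagement_code, per_end_dt, client_code,
               count(distinct client_code)
                 over (partition by engagement_code) cnt
          from my_fact_table)
 where cnt > 1;
```

Either form can be plugged in as the USING clause of the merge above; the group-by version reads the table once with aggregation, the analytic version once with windowing -- which is cheaper depends on the data.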
 
 
 
help with lead
Adolph, June      09, 2005 - 1:24 am UTC
 
 
I have a table in the following structure:
create table cs_fpc_pr
(PRGM_C       VARCHAR2(10) not null,
 fpc_date     date  not null,
 TIME_code    VARCHAR2(3) not null,
 SUN_TYPE     varchar2(1)) 
insert into cs_fpc_pr values ('PRGM000222', to_date('08-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('09-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '25','1');
insert into cs_fpc_pr values ('PRGM000222', to_date('10-may-2005','dd-mon-rrrr'), '45','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '05','3');
insert into cs_fpc_pr values ('PRGM000222', to_date('14-may-2005','dd-mon-rrrr'), '24','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '23','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '47','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('08-may-2005','dd-mon-rrrr'), '48','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('09-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('10-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '33','1');
insert into cs_fpc_pr values ('PRGM000242', to_date('11-may-2005','dd-mon-rrrr'), '46','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '07','3');
insert into cs_fpc_pr values ('PRGM000242', to_date('14-may-2005','dd-mon-rrrr'), '23','1');
commit;
select prgm_c,fpc_date,time_code,sun_type,
lead(fpc_date) over(partition by prgm_C order by fpc_date) next_date
from cs_fpc_pr
order by prgm_c,fpc_date,time_code; 
PRGM_C     FPC_DATE  TIM S NEXT_DATE
---------- --------- --- - ---------
PRGM000222 08-MAY-05 33  1 09-MAY-05
PRGM000222 09-MAY-05 05  3 09-MAY-05
PRGM000222 09-MAY-05 25  1 09-MAY-05
PRGM000222 09-MAY-05 45  3 10-MAY-05
PRGM000222 10-MAY-05 05  3 10-MAY-05
PRGM000222 10-MAY-05 25  1 10-MAY-05
PRGM000222 10-MAY-05 45  3 14-MAY-05
PRGM000222 14-MAY-05 05  3 14-MAY-05
PRGM000222 14-MAY-05 24  1
PRGM000242 08-MAY-05 07  3 08-MAY-05
PRGM000242 08-MAY-05 23  1 08-MAY-05
PRGM000242 08-MAY-05 47  3 08-MAY-05
PRGM000242 08-MAY-05 48  3 09-MAY-05
PRGM000242 09-MAY-05 07  3 09-MAY-05
PRGM000242 09-MAY-05 33  1 09-MAY-05
PRGM000242 09-MAY-05 46  3 10-MAY-05
PRGM000242 10-MAY-05 07  3 10-MAY-05
PRGM000242 10-MAY-05 33  1 10-MAY-05
PRGM000242 10-MAY-05 46  3 11-MAY-05
PRGM000242 11-MAY-05 07  3 11-MAY-05
PRGM000242 11-MAY-05 33  1 11-MAY-05
PRGM000242 11-MAY-05 46  3 14-MAY-05
PRGM000242 14-MAY-05 07  3 14-MAY-05
PRGM000242 14-MAY-05 23  1
I need to find the for a particular 'prgm_c' the next date & time code where the 'sun_type' field = '1'. 
A sample of the output should look something like this:
PRGM_C     FPC_DATE  TIM S NEXT_DATE  next_time
---------- --------- --- - ---------  -------
PRGM000222 08-MAY-05 33  1 09-MAY-05   25
PRGM000222 09-MAY-05 05  3 09-MAY-05   25
PRGM000222 09-MAY-05 25  1 10-MAY-05   25
PRGM000222 09-MAY-05 45  3 10-MAY-05   25
PRGM000222 10-MAY-05 05  3 10-MAY-05   25
PRGM000222 10-MAY-05 25  1 14-MAY-05   24
PRGM000222 10-MAY-05 45  3 14-MAY-05   24
PRGM000222 14-MAY-05 05  3 14-MAY-05   24
PRGM000222 14-MAY-05 24  1
Tom, Can you please help me with with this?
Regards 
 
 
June      09, 2005 - 6:53 am UTC 
 
PRGM000222 10-MAY-05 05  3 10-MAY-05
PRGM000222 10-MAY-05 25  1 10-MAY-05
PRGM000222 10-MAY-05 45  3 14-MAY-05
PRGM000222 14-MAY-05 05  3 14-MAY-05
PRGM000222 14-MAY-05 24  1
you've got a problem with those fpc_dates and ordering by them: you have "dups", so no one of those 10-may-05 rows comes "first"; same with the 14th.  You need to figure out how to order this data deterministically first.
My first attempt at this is:
tkyte@ORA9IR2W> select prgm_c, fpc_date, time_code, sun_type,
  2         to_date(substr( max(data) 
                 over (partition by prgm_c order by fpc_date desc), 
                     6, 14 ),'yyyymmddhh24miss') ndt,
  3             to_number( substr( max(data) 
                  over (partition by prgm_c order by fpc_date desc), 20) ) ntc
  4    from (
  5  select prgm_c,
  6         fpc_date,
  7             time_code,
  8             sun_type,
  9             case when lag(sun_type) 
                        over (partition by prgm_c order by fpc_date desc)  = '1'
 10                  then to_char( row_number() 
              over (partition by prgm_c order by fpc_date desc) , 'fm00000') ||
 11                               to_char(lag(fpc_date) 
               over (partition by prgm_c order by fpc_date desc),'yyyymmddhh24miss')||
 12             lag(time_code) over (partition by prgm_c order by fpc_date desc)
 13                  end data
 14  from cs_fpc_pr
 15       )
 16  order by prgm_c,fpc_date,time_code
 17  /
PRGM_C     FPC_DATE  TIM S NDT              NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33  1 09-MAY-05         25
PRGM000222 09-MAY-05 05  3 09-MAY-05         25
PRGM000222 09-MAY-05 25  1 09-MAY-05         25
PRGM000222 09-MAY-05 45  3 09-MAY-05         25
PRGM000222 10-MAY-05 05  3 10-MAY-05         25
PRGM000222 10-MAY-05 25  1 10-MAY-05         25
PRGM000222 10-MAY-05 45  3 10-MAY-05         25
PRGM000222 14-MAY-05 05  3
PRGM000222 14-MAY-05 24  1
PRGM000242 08-MAY-05 07  3 08-MAY-05         23
PRGM000242 08-MAY-05 23  1 08-MAY-05         23
PRGM000242 08-MAY-05 47  3 08-MAY-05         23
PRGM000242 08-MAY-05 48  3 08-MAY-05         23
PRGM000242 09-MAY-05 07  3 10-MAY-05         33
PRGM000242 09-MAY-05 33  1 10-MAY-05         33
PRGM000242 09-MAY-05 46  3 10-MAY-05         33
PRGM000242 10-MAY-05 07  3 10-MAY-05         33
PRGM000242 10-MAY-05 33  1 10-MAY-05         33
PRGM000242 10-MAY-05 46  3 10-MAY-05         33
PRGM000242 11-MAY-05 07  3 14-MAY-05         23
PRGM000242 11-MAY-05 33  1 14-MAY-05         23
PRGM000242 11-MAY-05 46  3 14-MAY-05         23
PRGM000242 14-MAY-05 07  3
PRGM000242 14-MAY-05 23  1
24 rows selected.
but the lack of distinctness on the fpc_date means you might get "a different answer" with the same set of data. 
 
 
 
reply
Adolph, June      09, 2005 - 7:48 am UTC
 
 
Sorry for not being clear the first time, so here goes... A program (prgm_c) will have a maximum of one entry in the table for a combination of (fpc_date, time_code). 
This time_code actually maps to another table where '01' is '01:00:00', '02' is '01:30:00', and so on (i.e. times stored in varchar2 format). 
So basically a program will exist for a given fpc_date and time_code only once.
I hope I'm making sense.
Regards
 
 
June      09, 2005 - 7:58 am UTC 
 
tkyte@ORA9IR2W> select prgm_c,
  2         fpc_date,
  3         time_code,
  4         sun_type,
  5         to_date(
  6            substr( max(data)
  7              over (partition by prgm_c
  8                    order by fpc_date desc,
  9                             time_code desc),
 10                  6, 14 ),'yyyymmddhh24miss') ndt,
 11             to_number(
 12               substr( max(data)
 13                 over (partition by prgm_c
 14                       order by fpc_date desc,
 15                                time_code desc), 20) ) ntc
 16    from (
 17  select prgm_c,
 18         fpc_date,
 19         time_code,
 20         sun_type,
 21         case when lag(sun_type)
 22                     over (partition by prgm_c
 23                           order by fpc_date desc,
 24                                    time_code desc)  = '1'
 25                  then
 26                  to_char( row_number()
 27                           over (partition by prgm_c
 28                                 order by fpc_date desc,
 29                                    time_code desc) , 'fm00000') ||
 30                  to_char(lag(fpc_date)
 31                            over (partition by prgm_c
 32                                  order by fpc_date desc,
 33                                  time_code desc),'yyyymmddhh24miss')||
 34                  lag(time_code)
 35                     over (partition by prgm_c
 36                           order by fpc_date desc,
 37                                    time_code desc)
 38                  end data
 39  from cs_fpc_pr
 40       )
 41  order by prgm_c,fpc_date,time_code
 42  /
PRGM_C     FPC_DATE  TIM S NDT              NTC
---------- --------- --- - --------- ----------
PRGM000222 08-MAY-05 33  1 09-MAY-05         25
PRGM000222 09-MAY-05 05  3 09-MAY-05         25
PRGM000222 09-MAY-05 25  1 10-MAY-05         25
PRGM000222 09-MAY-05 45  3 10-MAY-05         25
PRGM000222 10-MAY-05 05  3 10-MAY-05         25
PRGM000222 10-MAY-05 25  1 14-MAY-05         24
PRGM000222 10-MAY-05 45  3 14-MAY-05         24
PRGM000222 14-MAY-05 05  3 14-MAY-05         24
PRGM000222 14-MAY-05 24  1
PRGM000242 08-MAY-05 07  3 08-MAY-05         23
PRGM000242 08-MAY-05 23  1 09-MAY-05         33
PRGM000242 08-MAY-05 47  3 09-MAY-05         33
PRGM000242 08-MAY-05 48  3 09-MAY-05         33
PRGM000242 09-MAY-05 07  3 09-MAY-05         33
PRGM000242 09-MAY-05 33  1 10-MAY-05         33
PRGM000242 09-MAY-05 46  3 10-MAY-05         33
PRGM000242 10-MAY-05 07  3 10-MAY-05         33
PRGM000242 10-MAY-05 33  1 11-MAY-05         33
PRGM000242 10-MAY-05 46  3 11-MAY-05         33
PRGM000242 11-MAY-05 07  3 11-MAY-05         33
PRGM000242 11-MAY-05 33  1 14-MAY-05         23
PRGM000242 11-MAY-05 46  3 14-MAY-05         23
PRGM000242 14-MAY-05 07  3 14-MAY-05         23
PRGM000242 14-MAY-05 23  1
24 rows selected.
Just needed to add "time_code DESC"
See
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html
analytics to the rescue
for the "carry down" technique I used here.  In 10g, we'd simplify using "ignore nulls" in the LAST_VALUE function instead of the max() and row_number() trick  
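A hedged sketch of that 10g simplification (untested): LAST_VALUE with IGNORE NULLS and an explicit window carries down the nearest following sun_type = '1' row directly, with no row_number()/max() encoding needed:

```sql
-- Sketch (10g+, untested): scanning in descending order, the window
-- "unbounded preceding to 1 preceding" holds every row strictly later in
-- time; LAST_VALUE ... IGNORE NULLS picks the nearest one with sun_type='1'.
select prgm_c, fpc_date, time_code, sun_type,
       last_value(case when sun_type = '1'
                       then to_char(fpc_date,'yyyymmddhh24miss')||time_code
                  end ignore nulls)
         over (partition by prgm_c
               order by fpc_date desc, time_code desc
               rows between unbounded preceding and 1 preceding) data
  from cs_fpc_pr
 order by prgm_c, fpc_date, time_code;
```

The data column would then be split back into ndt and ntc with substr/to_date exactly as in the query above.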
 
 
brilliant
Adolph, June      09, 2005 - 9:53 am UTC
 
 
Thank you very much Tom. The query works like a charm. I will read up on the link. Analytics do rock n roll :)
 
 
 
Working on an Analytic Query
Scott, June      09, 2005 - 12:15 pm UTC
 
 
Tom,
   From your example for Mark's problem on 4/8, it seems that you need to specify a fixed number of columns to output this way.  Is there a way to have a varying number of columns?  For example, I need to have a query that takes a date range and makes each date a column heading.  Any help would be greatly appreciated.
Thanks,
Scott 
 
June      09, 2005 - 6:15 pm UTC 
 
you need dynamic sql.  the number of columns in a query is "well defined, known at parse time" by definition.
If you have access to Expert One-on-One Oracle, I demonstrated how to do this with ref cursors in a stored procedure.  but you have to run a query to get the set of column "headings" and write a query based on that. 
 
 
 
Tom any idea how I can re write this piece of code
A reader, June      09, 2005 - 3:00 pm UTC
 
 
 decode ((SELECT ih.in_date
                 FROM major_sales ih
                WHERE ih.container = i.container
                  AND sales > i.container_id
                  AND sales = (SELECT MIN(ihh.container_id)
                                          FROM major_sales ihh
                                         WHERE ihh.container_id > i.container_id
                                           AND ihh.container = i.container)), NULL, 
 
June      09, 2005 - 6:35 pm UTC 
 
not out of context, no. 
 
 
 
I am still having problem with analytical function
A reader, July      01, 2005 - 12:31 pm UTC
 
 
select i.container,ssl_user_code,ssl_user_code ssl,cl.code length_code, out_trucker_code, i.chassis,
lead(in_date) over (partition by i.container order by in_date) next_in_date, 
out_date, 
lead (out_date) over (partition by i.container order by in_date) o_date 
from  his_containers i, 
       container_masters cm, 
       tml_container_lhts clht, 
       tml_container_lengths cl 
WHERE cm.container = i.container 
and cm.lht_code = clht.code 
and clht.length_code = cl.code 
and ssl_user_code = 'ACL' 
and  i.container like '%408014' 
and voided_date is null 
and ((in_date between  to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS') 
               and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR 
              (out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS') 
               and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')))
results:
----------
CONTAINER    SSL_USER_CODE    SSL LENGTH_CODE    OUT_TRUCKER_CODE  CHASSIS  NEXT_IN_DATE            OUT_DATE        O_DATE
ACLU408014    ACL    ACL    4        R0480             3/22/2005 2:52:41 PM    3/21/2005 3:45:48 PM  4/6/2005 2:25:59 PM
ACLU408014    ACL    ACL    4        J1375                        4/6/2005 2:25:59 PM    
1. how can I get rid of the 4/6/2005 2:25:59 PM???
  
 
July      01, 2005 - 1:52 pm UTC 
 
can you be more specific about why you don't like April 6th at 2:25:59 pm?  what is it about it that you don't like?
That'll help me tell you how, in general, to remove it.  What is the criterion for removal?  
 
 
 
analytical query
A reader, July      01, 2005 - 2:19 pm UTC
 
 
Tom,
We are trying to bill the client within the month, in this case within April. I also would like to know how many days elapsed during 2 days so I can bill them. 
 
 
July      01, 2005 - 3:15 pm UTC 
 
"how many days elapsed between 2 days"
the answer is: 2
but are you asking how to do date arithmetic?  Just subtract. 
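To spell out the "just subtract" part -- a minimal sketch, assuming a table with DATE columns in_date and out_date (such as the BETA table posted later in this thread): subtracting two Oracle DATEs yields the elapsed time in days, fractions included.

```sql
-- Sketch: DATE subtraction returns a number of days, with fractions
-- (e.g. 2.5 means 2 days 12 hours).
select container,
       out_date - in_date        as days_elapsed,
       trunc(out_date - in_date) as whole_days
  from beta;
```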
 
 
 
sorry...within March
A reader, July      01, 2005 - 2:21 pm UTC
 
 
 
 
more information
A reader, July      01, 2005 - 2:29 pm UTC
 
 
Tom, 
This is how the data looks 
IN_DATE             OUT_DATE                CONTAINER
1/3/2005 2:23:05 PM    1/10/2005 5:05:16 PM    ACLU408014
1/11/2005 1:04:49 PM    1/12/2005 8:49:06 AM    ACLU408014
1/14/2005 12:09:50 PM    1/18/2005 6:39:10 AM    ACLU408014
3/19/2005 2:10:24 AM    3/21/2005 3:45:48 PM    ACLU408014
3/22/2005 2:52:41 PM    4/6/2005 2:25:59 PM    ACLU408014
4/7/2005 1:24:43 PM    4/10/2005 2:21:59 AM    ACLU408014
and I would like to get the pair within  the same month
 
 
 
July      01, 2005 - 3:16 pm UTC 
 
the pair of "what"? 
 
 
 
I would like to get all the dates within the month
A reader, July      01, 2005 - 4:03 pm UTC
 
 
 
 
one more try
A reader, July      01, 2005 - 4:26 pm UTC
 
 
This is how the data looks as of now with the above query.
IN_DATE             OUT_DATE                CONTAINER
1/3/2005 2:23:05 PM    1/10/2005 5:05:16 PM    ACLU408014
1/11/2005 1:04:49 PM    1/12/2005 8:49:06 AM    ACLU408014
1/14/2005 12:09:50 PM    1/18/2005 6:39:10 AM    ACLU408014
3/19/2005 2:10:24 AM    3/21/2005 3:45:48 PM    ACLU408014
3/22/2005 2:52:41 PM    4/6/2005 2:25:59 PM    ACLU408014
4/7/2005 1:24:43 PM    4/10/2005 2:21:59 AM    ACLU408014
I Would like to get it as the following
IN_DATE                  OUT_DATE                CONTAINER
 
3/19/2005 2:10:24 AM    3/21/2005 3:45:48 PM    ACLU408014
3/22/2005 2:52:41 PM    
 
This is what I am looking for.....this way.
 
 
July      01, 2005 - 4:46 pm UTC 
 
still not much of a specification (an important skill for those of us in this industry: being able to describe the problem at hand in detail, so someone else can take the problem definition and code it).
Let me try, this is purely a speculative guess on my part:
I would like all records in the table such that the in_date-out_date range covered at least part of the month of march in the year 2005.
If the out_date falls AFTER march, I would like it nulled out.
(this part is a total guess) if the in_date falls BEFORE march, i would like it nulled out as well (for consistency?)
Ok, stated like that, I can give you untested pseudo code since there are no create tables and no inserts to play with:
select case when in_date between to_date( :x, 'dd-mon-yyyy' )
                             and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
            then in_date end,
       case when out_date between to_date( :x, 'dd-mon-yyyy' )
                              and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
            then out_date end,
       container
  from T
 where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
   and out_date >= to_date( :x, 'dd-mon-yyyy' )
bind in :x = '01-mar-2005' and :y = '01-apr-2005' for your dates.
      
 
 
 
As you requested
A reader, July      01, 2005 - 5:01 pm UTC
 
 
CREATE TABLE CONTAINER_MASTERS
(
  CONTAINER              VARCHAR2(10 BYTE)      NOT NULL,
  CHECK_DIGIT            VARCHAR2(1 BYTE)       NOT NULL,
  SSL_OWNER_CODE         VARCHAR2(5 BYTE)       NOT NULL,
  LHT_CODE               VARCHAR2(5 BYTE)       NOT NULL
  
);
INSERT INTO CONTAINER_MASTERS ( CONTAINER, CHECK_DIGIT, SSL_OWNER_CODE,
LHT_CODE ) VALUES ( '045404', '1', 'BCL', '5AV'); 
commit;
 
CREATE TABLE TML_CONTAINER_LHTS
(
  CODE                     VARCHAR2(5 BYTE)     NOT NULL,
  SHORT_DESCRIPTION        VARCHAR2(10 BYTE)    NOT NULL,
  LONG_DESCRIPTION         VARCHAR2(30 BYTE)    NOT NULL,
  ISO                      VARCHAR2(4 BYTE)     NOT NULL,
  LENGTH_CODE              VARCHAR2(5 BYTE)     NOT NULL,
  HEIGHT_CODE              VARCHAR2(5 BYTE)     NOT NULL,  -- reconstructed to match the INSERT below
  TYPE_CODE                VARCHAR2(5 BYTE)     NOT NULL   -- reconstructed to match the INSERT below
);
INSERT INTO TML_CONTAINER_LHTS ( CODE, SHORT_DESCRIPTION, LONG_DESCRIPTION, ISO, LENGTH_CODE,
HEIGHT_CODE, TYPE_CODE ) VALUES ( '5BR', '5BR', '45'' 9''6" Reefer', '5432', '5', 'B', 'R'); 
commit;
 
CREATE TABLE TML_CONTAINER_LENGTHS
(
  CODE               VARCHAR2(5 BYTE)           NOT NULL,
  SHORT_DESCRIPTION  VARCHAR2(10 BYTE)          NOT NULL,
  LONG_DESCRIPTION   VARCHAR2(30 BYTE)          NOT NULL  -- reconstructed to match the INSERTs below
);
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES ( 
'2', '20''', '20 Ft'); 
INSERT INTO TML_CONTAINER_LENGTHS ( CODE, SHORT_DESCRIPTION,
LONG_DESCRIPTION ) VALUES ( 
'4', '40''', '40 Ft'); 
commit;
  
 
July      01, 2005 - 6:06 pm UTC 
 
umm, specification?
did I get it right?  if so, did you *try* the query at all??? 
 
 
 
Here is a SQL puzzle for analytics zealots
Mikito Harakiri, July      01, 2005 - 10:33 pm UTC
 
 
OK, if anybody succeeds in writing the following with analytics, I will convert to analytics once and forever. Credit it in the book, of course. 
Given:
table Hotels (
   name string,
   price integer,
   distance    
)
Here is a query that sounds very analytical:
Order hotels by price, distance. Compare each record with its neighbour (lag?), and if one of them is inferior to the other by both criteria -- more pricey and farther from the beach -- then throw it away from the result. 
 
July      02, 2005 - 9:20 am UTC 
 
define neighbor.
is neighbor defined by price or by distance?  your specification is lacking many many details (seems to be a recurring theme on this page for some reason)
sounds like you want the cheapest closest hotel to the beach.  for each row, if something closer and cheaper exists in the original set, do not keep that row.
sounds like a where not exists, not analytics to me.  but then - the specification is lacking.
And lets see, in order to appreciate a tool, you have to be shown that the tool can be the end all, be all answer to everything??!??  that is downright silly don't you think.
Let's see:
"if anyone succeeds in making the Oracle 9i merge command select data, I would convert to merge once and forever"
"if anyone succeeds in making my car fly into outer space, I would convert to cars once and forever"
Think about your logic here.
There are no zealots here, there are people willing to read the documentation, understand that things work the way they work, not the way THEY think they should have been made to work, and have jobs to do, pragmatic practical things to accomplish and are willing to use the best tool for the job. 
 
 
 
specs
Mikito Harakiri, July      03, 2005 - 11:07 pm UTC
 
 
Yes, find all the hotels that are not dominated by the others by both price and distance. That is a "not exists" query, but it is a very inefficient one: 
select * from hotels h
where not exists (select * from hotels hh
   where hh.price < h.price and hh.distance <= h.distance
   or hh.price <= h.price and hh.distance < h.distance
) 
The one that reformulated is much more efficient, but how do I express it in SQL? 
 
July      04, 2005 - 10:25 am UTC 
 
the one that reforumulated?  
and why do you have the or in there at all.  to dominate by both price and distance would simply be:
where not exists ( select NULL
                     from hotels hh
                    where hh.price < h.price 
                       AND hh.distance < h.distance )
You said "by BOTH price and distance", nothing but nothing about ties.
ops$tkyte@ORA9IR2> /*
DOC>
DOC>drop table hotels;
DOC>
DOC>create table hotels
DOC>as
DOC>select object_name name, object_id price, object_id distance, all_objects.*
DOC>  from all_objects;
DOC>
DOC>create index hotel_idx on hotels(price,distance);
DOC>
DOC>exec dbms_stats.gather_table_stats( user, 'HOTELS', cascade=>true );
DOC>*/
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> select h1.name, h1.price, h1.distance
  2    from hotels h1
  3   where not exists ( select NULL
  4                        from hotels h2
  5                       where h2.price < h1.price
  6                         AND h2.distance < h1.distance )
  7  /
 
NAME                                PRICE   DISTANCE
------------------------------ ---------- ----------
I_OBJ#                                  3          3
 
Elapsed: 00:00:00.22
ops$tkyte@ORA9IR2> select count(*) from hotels;
 
  COUNT(*)
----------
     27837
 
Elapsed: 00:00:00.00
it doesn't seem horribly inefficient. 
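For what it's worth, the strict-domination version can also be written with analytics in a single pass. A hedged sketch, assuming prices are distinct (so "all preceding rows by price" means exactly "all strictly cheaper hotels"):

```sql
-- Sketch (assumes distinct prices): a hotel is dominated iff some strictly
-- cheaper hotel is also strictly closer, i.e. iff the running minimum
-- distance over all cheaper hotels is less than its own distance.
select name, price, distance
  from (select h.name, h.price, h.distance,
               min(distance) over (order by price
                                   rows between unbounded preceding
                                            and 1 preceding) min_d
          from hotels h)
 where min_d is null          -- the cheapest hotel is never dominated
    or min_d >= distance;     -- no cheaper hotel is strictly closer
```

This reads the table once instead of once per row; with ties on price, the tie-breaking rules would need to match whichever definition of "dominated" is chosen.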
 
 
 
 
Tom Can we give it one more try
A reader, July      05, 2005 - 9:20 am UTC
 
 
Tom, When I ran the query it returned nothing. I am sending you the whole test case. This is what I would like to see
in the report.
out_date                in_date               container
1/18/2005 6:39:10 AM    3/19/2005 2:10:24 AM  ACLU408014
3/21/2005 3:45:48 PM    3/22/2005 2:52:41 PM  ACLU408014
 
CREATE TABLE BETA
(
  IN_DATE    DATE                               NOT NULL,
  OUT_DATE   DATE,
  CONTAINER  VARCHAR2(10 BYTE)                  NOT NULL
)
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
commit;
select in_date, out_date,container,
            case when in_date between to_date('01-mar-2005', 'dd-mon-yyyy' )
            and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
            then in_date end,
            case when out_date between to_date( '01-mar-2005', 'dd-mon-yyyy' )
            and to_date( '31-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
            then out_date end
      container
  from BETA
  WHERE in_date <= to_date( '01-mar-2005', 'dd-mon-yyyy' )-1/24/60/60
  and out_date >= to_date( '31-mar-2005', 'dd-mon-yyyy' ) 
 
July      05, 2005 - 9:54 am UTC 
 
you know, this is going beyond....
*s*p*e*c*i*f*i*c*a*t*i*o*n*
pretend you were explaining to your mother (who presumably doesn't work in IT and doesn't know sql or databases or whatever) what needed to be done.  
that is what I need to see.  I obviously don't know your logic of getting from "A (inputs) to B (outputs)" and you need to explain that.
and when I run my query:
ops$tkyte@ORA10G> variable x varchar2(20)
ops$tkyte@ORA10G> variable y varchar2(20)
ops$tkyte@ORA10G>
ops$tkyte@ORA10G> exec :x := '01-mar-2005'; :y := '01-apr-2005'
 
PL/SQL procedure successfully completed.
 
ops$tkyte@ORA10G> select case when in_date between to_date( :x, 'dd-mon-yyyy' )
  2                               and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
  3              then in_date end,
  4         case when out_date between to_date( :x, 'dd-mon-yyyy' )
  5                                and to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
  6              then out_date end,
  7         container
  8    from beta
  9   where in_date <= to_date( :y, 'dd-mon-yyyy' )-1/24/60/60
 10     and out_date >= to_date( :x, 'dd-mon-yyyy' )
 11  /
 
CASEWHENI CASEWHENO CONTAINER
--------- --------- ----------
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05           ACLU408014
I do get output, not what you say you want, but output.  you need to tell me THE LOGIC here.  (and maybe when you write it down, specify it, the answer will just naturally appear)
so yes, we can definitely give it one more try but if and only if you provide the details, the specification, the logic, the thoughts behind this.
Not just "i have this and want that", it doesn't work that way. 
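The WHERE clause in Tom's query is the standard interval-overlap test: a row's [in_date, out_date] range intersects the reporting window exactly when it begins before the window closes and ends on or after the window opens. A small Python sketch of that predicate (the window bounds below are the March 2005 example from the thread):

```python
from datetime import date

def overlaps(start, end, win_start, win_end):
    # [start, end] intersects the half-open window [win_start, win_end)
    # iff the row starts before the window closes and ends on/after the
    # window opens -- the same shape as
    # "in_date <= :y - 1 second AND out_date >= :x" in the query above
    return start < win_end and end >= win_start

x, y = date(2005, 3, 1), date(2005, 4, 1)                        # march 2005
assert overlaps(date(2005, 1, 18), date(2005, 3, 19), x, y)      # spans into march
assert overlaps(date(2005, 3, 22), date(2005, 4, 6), x, y)       # starts in march
assert not overlaps(date(2005, 1, 3), date(2005, 1, 10), x, y)   # entirely before
```

The reader's original query instead required in_date before the window AND out_date after it, i.e. rows spanning the whole month, which is why it returned nothing.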
 
 
 
 
in english
Jean, July      05, 2005 - 10:33 am UTC
 
 
We are trying to bill from the time the truck left to the
time it returned. For example, in the above query,
I would like to bill him from 1/18/2005 to 3/19/2005, so it must be part of the report. That's the whole key here. 
 
 
clarification!!
A reader, July      05, 2005 - 10:56 am UTC
 
 
the time he left       1/18/2005 6:39:10 AM    
the time he came back  3/22/2005 2:52:41 PM     
 
hope this helps.... 
 
July      05, 2005 - 11:28 am UTC 
 
ops$tkyte@ORA9IR2> select * from beta order by in_date;
 
IN_DATE   OUT_DATE  CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014   <<<=== gap, no 13
14-JAN-05 18-JAN-05 ACLU408014   <<=== big gap, no 19.... mar 18
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
 
6 rows selected.
I don't get it.  I don't get it AT ALL.   does anyone else ?  
nope, not getting it even a teeny tiny bit myself.
give us LOGIC, ALGORITHM, INFORMATION.
like I said, pretend I'm your mother who has never seen a computer -- explain the logic at that level (or I just give up) 
 
 
 
 
BETTER TABLE
A reader, July      05, 2005 - 11:57 am UTC
 
 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/03/2005 02:23:05 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/10/2005 05:05:16 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/11/2005 01:04:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/12/2005 08:49:06 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '01/14/2005 12:09:50 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '01/18/2005 06:39:10 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '03/19/2005 02:10:24 AM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '03/21/2005 03:45:48 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '04/07/2005 01:24:43 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '04/10/2005 02:21:59 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
INSERT INTO BETA ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '03/22/2005 02:52:41 PM', 'MM/DD/YYYY HH:MI:SS AM'),  TO_Date( '04/06/2005 02:25:59 PM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU408014'); 
commit;
 
OUT_DATE            IN_DATE
1/18/2005 6:39:10 AM           3/19/2005 2:10:24 AM    
3/21/2005 3:45:48 PM           3/22/2005 2:52:41 PM
LEFT 1/18 CAME BACK 3/19
LEFT 3/21 CAME BACK 3/22
 
 
July      05, 2005 - 12:20 pm UTC 
 
you have totally and utterly missed my point.
 
IN_DATE   OUT_DATE  CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
 
6 rows selected.
sigh.  
what if the records are
IN_DATE   OUT_DATE  CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
specification, you know what, without it, I'm not even going to look anymore.  Textual description of precisely what you want.  I'm tired of guessing.  I think I can guess, but I don't even want to guess about "missing" months like my second example here.
 
 
 
 
English Explanation
A reader, July      05, 2005 - 1:06 pm UTC
 
 
Sorry for going back and forth on this report. All I want is the following: we have trucks that come in and out of the yard. All we are looking for is when the truck came in and the "next record" -- nothing in between, because a truck can come in many times during a month. So we want when it first came in and the very last time it went out for a particular month, that is to say, the last time it left the yard. The date and time should give us this information. Finally, this report should be within a month.
example:
IN_DATE   OUT_DATE  CONTAINER
--------- --------- ----------
03-JAN-05 10-JAN-05 ACLU408014
11-JAN-05 12-JAN-05 ACLU408014
14-JAN-05 18-JAN-05 ACLU408014
19-MAR-05 21-MAR-05 ACLU408014
22-MAR-05 06-APR-05 ACLU408014
07-APR-05 10-APR-05 ACLU408014
 
6 rows selected.
in this case we want
in_date                  out_date
--------                --------
3/22/2005 2:52:41PM    1/18/2005 6:39:10 AM
 
 
 
July      05, 2005 - 1:17 pm UTC 
 
so what happened to the 21st/22nd of march this time.  the answer keeps changing?
and what if, there are no records for march in the table (nothing in_date/out_date wise)
 
 
 
 
follow up
jean, July      05, 2005 - 1:57 pm UTC
 
 
Tom,
We realized that it may be too much to get the dates in between,
so we opted for just getting the in_date and out_date. By the way, there will always be data, so do not worry about that....
Thanks!!
 
 
July      05, 2005 - 3:11 pm UTC 
 
feb, what about feb?  you said there would always be data?  I want to run this for feb?
do you or do you not need to be concerned about a missing month. 
 
 
 
do not be concerned!
A reader, July      05, 2005 - 3:22 pm UTC
 
 
Please do not be concerned about missing a month. This is a report. 
 
July      05, 2005 - 3:46 pm UTC 
 
umm, I want the report for february
it is blank.
now what?  it should not be blank should it?  this is a problem, this is a problem in our industry in general.  You get what you ask for (sometimes) and if you ask repeatedly for the wrong thing, that's what you'll get.  I am concerned -- by this line of question here.
Hey, here you go:
ops$tkyte-ORA9IR2> select *
  2    from (
  3  select
  4         lag(out_date) over (partition by container order by in_date) last_out_date,
  5         in_date,
  6             container
  7    from beta
  8         )
  9   where trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 10      or trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy');
 
LAST_OUT_ IN_DATE   CONTAINER
--------- --------- ----------
18-JAN-05 19-MAR-05 ACLU408014
21-MAR-05 22-MAR-05 ACLU408014
gets the answer given your data, makes a zillion assumptions (50% of which are probably wrong), won't work for FEB, probably doesn't answer the question behind the question, but hey, there you go.   
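A hedged sketch of what the lag() is doing here, in Python: partition by container, order by in_date, carry the previous row's out_date alongside each row, then keep rows where either date falls in the target month (the tuple shapes are my assumption):

```python
from datetime import datetime
from itertools import groupby

def out_in_pairs(rows, month):
    """rows: (container, in_date, out_date) tuples; month: (year, mon).
    Emulates lag(out_date) over (partition by container order by in_date),
    keeping rows whose in_date or lagged out_date falls in the month."""
    result = []
    for container, grp in groupby(sorted(rows), key=lambda r: r[0]):
        prev_out = None                       # lag restarts in each partition
        for _, in_d, out_d in grp:
            if (in_d.year, in_d.month) == month or \
               (prev_out is not None and (prev_out.year, prev_out.month) == month):
                result.append((prev_out, in_d, container))
            prev_out = out_d
    return result

# the BETA rows from the thread, queried for march 2005
beta = [
    ("ACLU408014", datetime(2005, 1, 3),  datetime(2005, 1, 10)),
    ("ACLU408014", datetime(2005, 1, 11), datetime(2005, 1, 12)),
    ("ACLU408014", datetime(2005, 1, 14), datetime(2005, 1, 18)),
    ("ACLU408014", datetime(2005, 3, 19), datetime(2005, 3, 21)),
    ("ACLU408014", datetime(2005, 3, 22), datetime(2005, 4, 6)),
    ("ACLU408014", datetime(2005, 4, 7),  datetime(2005, 4, 10)),
]
pairs = out_in_pairs(beta, (2005, 3))
```

Run against the thread's data this yields the two rows Tom's query returned: (18-JAN, 19-MAR) and (21-MAR, 22-MAR).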
 
 
 
 
Thanks!!!
A reader, July      06, 2005 - 9:00 am UTC
 
 
I will try it ... Thanks a zillion for your efforts and your patience. 
 
 
Thanks!
A reader, July      06, 2005 - 11:55 am UTC
 
 
CREATE TABLE BETA3
(
  IN_DATE    DATE                               NOT NULL,
  OUT_DATE   DATE,
  CONTAINER  VARCHAR2(10 BYTE)                  NOT NULL
);
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '07/20/2004 03:08:49 PM', 'MM/DD/YYYY HH:MI:SS AM'),  
TO_Date( '08/10/2004 02:45:52 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312'); 
INSERT INTO BETA3 ( IN_DATE, OUT_DATE, CONTAINER ) VALUES ( 
 TO_Date( '03/19/2005 01:55:06 AM', 'MM/DD/YYYY HH:MI:SS AM'), 
 TO_Date( '03/27/2005 05:05:36 AM', 'MM/DD/YYYY HH:MI:SS AM')
, 'ACLU040312'); 
commit;
 
Tom, I was able to get the first pair as shown 
last_out_date          in_date                  container
8/10/2004 2:45:52 AM     3/19/2005 1:55:06 AM     ACLU040312
which is fine...
But can I get the other pair?
last_out_date            in_date                  container
3/27/2005 5:05:36 AM   
 
 
July      06, 2005 - 12:44 pm UTC 
 
problem is, you are "missing" a row and 'making up' data is hard.
it might be
ops$tkyte-ORA10G> select decode( r, 1, last_out_date, out_date ),
  2         decode( r, 1, in_date, next_in_date )
  3    from (
  4  select
  5         lag(out_date) over (partition by container order by in_date) last_out_date,
  6         in_date, out_date,
  7         lead(in_date) over (partition by container order by in_date) next_in_date,
  8             container
  9    from beta3
 10         ), ( select 1 r from dual union all select 2 r from dual )
 11   where ((
 12          trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 13          or
 14                  trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')
 15             ) and r = 1 )
 16             or
 17             ( next_in_date is null and r = 2 )
 18  /
 
DECODE(R,1,LAST_OUT_ DECODE(R,1,IN_DATE,N
-------------------- --------------------
10-aug-2004 02:45:52 19-mar-2005 01:55:06
27-mar-2005 05:05:36
still curious what happens in feb. 
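The cross join with ( select 1 r from dual union all select 2 r from dual ) fans every source row out into two candidate output rows: the r = 1 branch carries the usual (lag(out_date), in_date) pair, and the single r = 2 row, where lead(in_date) is null, appends the final unpaired out_date. A hedged Python sketch of that fan-out (row shapes are assumptions):

```python
def pairs_with_tail(rows):
    """rows: (in_date, out_date) strings ordered by in_date.
    Emit (previous out_date, in_date) for each row -- the r = 1 branch --
    plus one trailing (last out_date, None) row -- the r = 2 branch."""
    out = []
    prev_out = None
    for in_d, out_d in rows:
        out.append((prev_out, in_d))   # r = 1: lag(out_date), in_date
        prev_out = out_d
    out.append((prev_out, None))       # r = 2: the dangling final out_date
    return out

beta3 = [("2004-07-20", "2004-08-10"), ("2005-03-19", "2005-03-27")]
result = pairs_with_tail(beta3)
```

Tom's query then filters: r = 1 rows are kept only when they touch the target month, while the r = 2 row is kept whenever next_in_date is null, which is why the lone 27-mar out_date shows up with no partner.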
 
 
 
 
Please refer some books to learn Oracle Analytic functions
Vijay, July      07, 2005 - 7:58 am UTC
 
 
 
July      07, 2005 - 9:47 am UTC 
 
data warehousing guide (freely available on otn.oracle.com)
Expert one on one Oracle (I have a big chapter on them in there) 
 
 
 
Thank you very much!!
Jean, July      08, 2005 - 10:35 am UTC
 
 
I want to thank you for the last query!!! It worked very well, even though I still get dates outside of the range. But overall it's fine.
  
 
 
How to get contiguous date ranges from Start_date, end_date pairs?
Bob Lyon, July      11, 2005 - 3:15 pm UTC
 
 
-- Tom, Suppose I have a table with data...
-- MKT_CD START_DT_GMT      END_DT_GMT
-- ------ ----------------- -----------------
-- AAA    07/11/05 00:00:00 07/12/05 00:00:00
-- BBB    07/11/05 00:00:00 07/11/05 01:00:00
-- BBB    07/11/05 01:00:00 07/11/05 02:00:00
-- BBB    07/11/05 02:00:00 07/11/05 03:00:00
-- BBB    07/11/05 06:00:00 07/11/05 07:00:00
-- BBB    07/11/05 07:00:00 07/11/05 08:00:00
-- What I would like to get is the "contiguous date ranges"
-- by MKT_CD, i.e., 
-- MKT_CD START_DT_GMT      END_DT_GMT
-- ------ ----------------- -----------------
-- AAA    07/11/05 00:00:00 07/12/05 00:00:00
-- BBB    07/11/05 00:00:00 07/11/05 03:00:00
-- BBB    07/11/05 06:00:00 07/11/05 08:00:00
-- I have played with LAG/LEAD/FIRST_VALUE/LAST_VALUE
-- but seem to just "go in circles" trying to code this.
-- Here is the test data setup (Oracle 9.2.0.6) :
CREATE GLOBAL TEMPORARY TABLE NM_DEMAND_BIDS_API_GT
(
  MKT_CD           VARCHAR2(6) NOT NULL,
  START_DT_GMT     DATE        NOT NULL,
  END_DT_GMT       DATE        NOT NULL
)
ON COMMIT PRESERVE ROWS;
-- This code has 24 hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('AAA', TRUNC(SYSDATE), TRUNC(SYSDATE) + 1);
-- A second code goes by hours
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 00/24, TRUNC(SYSDATE) + 01/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 01/24, TRUNC(SYSDATE) + 02/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 02/24, TRUNC(SYSDATE) + 03/24);
-- and has an intentional gap
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 06/24, TRUNC(SYSDATE) + 07/24);
INSERT INTO NM_DEMAND_BIDS_API_GT ( MKT_CD, START_DT_GMT, END_DT_GMT )
VALUES ('BBB', TRUNC(SYSDATE)+ 07/24, TRUNC(SYSDATE) + 08/24);
-- Query
SELECT MKT_CD, START_DT_GMT, END_DT_GMT
FROM NM_DEMAND_BIDS_API_GT;
 
 
July      11, 2005 - 3:49 pm UTC 
 
based on:
  https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html  
ops$tkyte@ORA9IR2> select mkt_cd, min(start_dt_gmt), max(end_dt_gmt)
  2    from (
  3  select mkt_cd, start_dt_gmt, end_dt_gmt,
  4         max(grp) over (partition by mkt_cd order by start_dt_gmt) mgrp
  5    from (
  6  SELECT MKT_CD,
  7         START_DT_GMT,
  8         END_DT_GMT,
  9         case when lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) <> start_dt_gmt
 10                   or
 11                   lag(end_dt_gmt) over (partition by mkt_cd order by start_dt_gmt) is null
 12              then row_number() over (partition by mkt_cd order by start_dt_gmt)
 13          end grp
 14    FROM NM_DEMAND_BIDS_API_GT
 15         )
 16         )
 17   group by mkt_cd, mgrp
 18   order by 1, 2
 19  /
 
MKT_CD MIN(START_DT_GMT)    MAX(END_DT_GMT)
------ -------------------- --------------------
AAA    11-jul-2005 00:00:00 12-jul-2005 00:00:00
BBB    11-jul-2005 00:00:00 11-jul-2005 03:00:00
BBB    11-jul-2005 06:00:00 11-jul-2005 08:00:00
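The query works by tagging each row that starts a new run (its lagged end_dt differs from its start_dt, or is null) with its row_number, carrying that tag forward with max(grp) over (partition by mkt_cd order by start_dt), and grouping on the carried tag. Procedurally, the same merge looks like this (a sketch, with the hour-of-day intervals as plain numbers):

```python
from itertools import groupby

def merge_contiguous(rows):
    """rows: (mkt_cd, start, end) tuples.  Collapse runs where each start
    equals the previous end -- what the max(grp) carry-forward does in SQL."""
    out = []
    for mkt, grp in groupby(sorted(rows), key=lambda r: r[0]):
        cur_start = cur_end = None
        for _, s, e in grp:
            if cur_end is not None and s == cur_end:
                cur_end = e                        # contiguous: extend the run
            else:
                if cur_start is not None:
                    out.append((mkt, cur_start, cur_end))
                cur_start, cur_end = s, e          # gap: start a new run
        out.append((mkt, cur_start, cur_end))
    return out

# hours of the day as integers, mirroring the BBB gap in the test data
bids = [("AAA", 0, 24), ("BBB", 0, 1), ("BBB", 1, 2), ("BBB", 2, 3),
        ("BBB", 6, 7), ("BBB", 7, 8)]
merged = merge_contiguous(bids)
```

This produces the same three ranges as the SQL: AAA 0-24, BBB 0-3, BBB 6-8.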
 
 
 
 
 
 
Thanks!
Bob Lyon, July      11, 2005 - 5:20 pm UTC
 
 
Wow, that was fast.
The trick here is the MAX() analytic function.  I could tag the lines where a break was to occur but couldn't figure out how to carry forward the tag/grp.
Thanks Again! 
 
 
Analytical functions book
Vijay, July      11, 2005 - 11:55 pm UTC
 
 
Thanks a lot 
 
 
More Help
Jean, July      26, 2005 - 5:40 pm UTC
 
 
Tom,
How can I get "just" the records within the scope? I am getting records outside of march.
select container,decode( r, 1, last_out_date, out_date )out_date, decode( r, 1, in_date, next_in_date) in_date, 
           code length_code,chassis,out_trucker_code,ssl_user_code ssl, ssl_user_code,out_mode 
           from ( 
     select lag(out_date) over (partition by i.container order by in_date) 
           last_out_date, 
           i.ssl_user_code, 
           in_date, 
           cl.code, 
           i.out_trucker_code, 
           i.ssl_user_code ssl, 
           i.container, 
           i.chassis, 
           out_mode, 
           out_date, 
           clht.length_code, 
           lead(in_date) over (partition by i.container order by in_date) 
          next_in_date 
          from his_containers i,container_masters cm,tml_container_lhts clht,tml_container_lengths cl 
          where cm.container = i.container 
          and cm.lht_code = clht.code 
          and cl.code = clht.length_code 
          and ssl_user_code = 'ACL' 
          and i.container = 'ACLU214285' 
          and voided_date is null 
          and chassis is  null 
          and in_mode = 'T' 
          and out_mode = 'T' ), ( select 1 r from dual union all select 2 r from dual ) 
          where (( trunc(in_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy') 
          or  trunc(last_out_date,'mm') = to_date('01-mar-2005','dd-mon-yyyy')) 
          and r = 1 ) or ( next_in_date is null and r = 2 )
          order by out_date
 
 
July      26, 2005 - 5:57 pm UTC 
 
select * 
  from (Q)
 where <any other conditions you like>
 order by out_date;
replace Q with your query. 
 
 
 
that's what I got in my query.....
A reader, July      26, 2005 - 6:03 pm UTC
 
 
 
July      26, 2005 - 6:23 pm UTC 
 
don't know what you mean 
 
 
 
I thought I was doing what you suggested already...
A reader, July      26, 2005 - 6:41 pm UTC
 
 
 
July      26, 2005 - 6:56 pm UTC 
 
I cannot see your output -- obviously you are getting more data than you wanted.  add to the predicate in order to filter it out.  don't know what else to say. 
 
 
 
More information..
Jean, July      27, 2005 - 9:14 am UTC
 
 
the way it was before
CONTAINER   OUT_DATE              IN_DATE               LENGTH_CODE  CHASSIS  OUT_TRUCKER_CODE
ACLU217150  6/25/2004 2:58:01 PM  3/11/2005 7:36:29 PM  4            E2131    ACL    ACL    T
---with your changes---
CONTAINER   OUT_DATE              IN_DATE               LENGTH_CODE  CHASSIS
ACLU217150  6/25/2004 2:58:01 PM  3/11/2005 7:36:29 PM  4            E2131
my history tables
CONTAINER_ID    OUT_DATE    IN_DATE
31779    6/21/2004 10:03:25 AM    6/16/2004 1:33:50 AM
55317    6/25/2004 2:58:01 PM    6/25/2004 2:19:49 PM
672863    3/2/2005 7:03:31 PM    2/26/2005 6:03:49 PM
708598    4/4/2005 3:31:03 PM    3/11/2005 7:36:29 PM
779305    4/16/2005 1:03:36 PM    4/6/2005 2:04:53 PM
as you can see I am not picking up the records within the month of march...with or without 
the changes to the query.  
 
July      27, 2005 - 10:27 am UTC 
 
sorry -- you'll need to work through this, you see the techniques involved right -- lag, lead, analytic functions, YOU understand your data much better than I.
(because in part, frankly, the "way it was before" and "with your changes" look, well, I don't know -- the same I think to me as displayed here) 
 
 
 
Thanks for your help!
A reader, July      27, 2005 - 1:26 pm UTC
 
 
I know the data; however, I thought it was going to be something easy just to get the dates within march... I guess not. 
 
 
count number of rows in a number of ranges
A reader, July      27, 2005 - 6:08 pm UTC
 
 
Hi
I would like to count the number of rows I have per range of values. For example
SELECT   RANGE, SUM(suma) total_per_deptno
  FROM (SELECT CASE
                  WHEN deptno between 10 and 20 THEN '10-20'
                  ELSE '30'
               END RANGE,
               deptno, 1 SUMA
          FROM scott$emp)
GROUP BY RANGE
RANGE TOTAL_PER_DEPTNO
----- ----------------
10-20                8
30                   6
Can I rewrite that query in some other way so range can be dynamic such as
11-20
21-30
31-40
and counts the number of rows?
Thank you
 
 
July      27, 2005 - 6:33 pm UTC 
 
if you can come up with a function f(x) such that f(x) returns what you want, sure.
EG:
for you 11-20, 21-30, 31-40 -- well
f(deptno) = trunc( (deptno-0.1)/10)
(assuming deptno is an integer) -- that'll bin up deptno 0..10, 11..20, 21..30 and so on into groups 0, 1, 2, 3, ....
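Since deptno is an integer, f(deptno) = trunc((deptno - 0.1)/10) is the same as integer-dividing deptno - 1 by 10, binning 1..10, 11..20, 21..30, ... into groups 0, 1, 2, .... A quick check in Python (the deptno list is made up):

```python
from collections import Counter

def bucket(deptno, width=10):
    # trunc((deptno - 0.1) / width) for positive integer deptno:
    # 1..10 -> 0, 11..20 -> 1, 21..30 -> 2, ...
    return (deptno - 1) // width

deptnos = [10, 10, 20, 20, 20, 30, 30, 40]
counts = Counter(bucket(d) for d in deptnos)
```

Grouping by f(deptno) in the SQL then gives one row per bucket with its count, no CASE per range needed.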
 
 
 
 
A reader, August    02, 2005 - 1:35 pm UTC
 
 
Tom,
I hope you can provide an insight to this.
table emp1 is shown below. 
EmpId  Week  Year  Day0 Day1  .....  Day14
100    20    2005   8   8        8
200    22    2003   0    0        8    
300    25    2004   8    8        0    
400    06    2005   0   8        8
500    08    2002   8   0        8
create table emp1(empid varchar2(3), week varchar2(2), year varchar2(4), day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', '20', '2005', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('200', '22', '2003', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('300', '25', '2004', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0);
insert into emp1 values('400', '06', '2005', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
insert into emp1 values('500', '08', '2002', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8);
I am trying to select emp1 records as follows:
EmpId, Date of the day, Hours worked per day
Firstly, I have to calculate date of the day of a record (first day that corresponds to Day0) using 
week of the year and year. Then I have to increment the day by 1, 2 ...14 
to get the hours worked for each particular date
Example: Assuming that week 20 of 2005 is 05/07/2005. It corresponds to Day0 in the same record
Day1 column corresponds to the next day which is 05/08/2005. Day2 becomes 05/09/2005 and so on ... 
Then, I have to print individual rows for each empid as:
100 05/07/2005 8
100 05/08/2005 8 
.....
200 05/22/2003 0
200 05/23/2003 8 
.. and so on for all empid's ...
Thank you. 
 
August    02, 2005 - 2:09 pm UTC 
 
oh no, columns where rows should be :(
and basically you are saying "i need ROWS where these COLUMNS should be!"
tell me, how do you turn 20 into a date? 
 
 
 
A reader, August    02, 2005 - 2:19 pm UTC
 
 
Tom,
I should've explained it better. Week 20 of 2005, here should be translated to the first day of week 20 of 2005 (Assuming it is 05/07/2005). That corresponds to Day0 of that row. Day1 becomes 05/08/2005 and so on ...
Is there a function or approach that can convert columns to rows? 
 
August    02, 2005 - 3:30 pm UTC 
 
no, i mean -- what function/logic/algorithm are you using to figure out "week 20 is this day" 
 
 
 
A reader, August    02, 2005 - 9:06 pm UTC
 
 
Tom,
Sorry, firstly, the date is not calculated the way I said above. It's not clear yet how the date is obtained; this issue is under review, and I think I'll obtain the date by joining empid with some table (say temp1). However, I am sure I will have to use a date (such as 05/07/2005) and associate it with the Day0 column value; Day1 becomes 05/08/2005 and so on. I am trying to obtain sql or pl/sql that can arrange the rows as described above. Any ideas? Thanks. 
 
August    03, 2005 - 10:06 am UTC 
 
I cannot tell you how much I object to this model.  
storing "week" and "year" - UGH.
storing them in STRINGS - UGH UGH UGH.
storing things that should be cross record in record UGH to the power of 10.
I had to fix your inserts, they did not work, added day14 of zero.
ops$tkyte@ORA10G> with dates as
  2  (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual connect by level <= 15 )
  3  select empid, dt,
  4         case when l = 0 then day0
  5                  when l = 1 then day1
  6                  when l = 2 then day2
  7                          /* ... */
  8                  when l = 13 then day13
  9                  when l = 14 then day14
 10                  end data
 11    from (select * from emp1 where week = 20), dates
 12  /
 
EMP DT              DATA
--- --------- ----------
100 07-MAY-05          8
100 08-MAY-05          8
100 09-MAY-05          0
100 10-MAY-05
100 11-MAY-05
100 12-MAY-05
100 13-MAY-05
100 14-MAY-05
100 15-MAY-05
100 16-MAY-05
100 17-MAY-05
100 18-MAY-05
100 19-MAY-05
100 20-MAY-05          8
100 21-MAY-05          0
 
15 rows selected.
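The WITH dates generator plus the CASE is an unpivot: cross join each emp1 record with offsets 0..14 and, for offset l, select column day<l>. The same reshaping in Python (record layout assumed from the create table above):

```python
from datetime import date, timedelta

def unpivot(empid, start, day_cols):
    """Cross join one record with offsets 0..len-1 and select the matching
    column value -- what the CASE WHEN l = n THEN dayn does in the SQL."""
    return [(empid, start + timedelta(days=l), day_cols[l])
            for l in range(len(day_cols))]

# emp 100's day0..day14 values, week starting 05/07/2005
rows = unpivot("100", date(2005, 5, 7),
               [8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 0])
```

Each of the 15 columns becomes one (empid, date, hours) row, matching the 15-row output above.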
 
 
 
 
 
A reader, August    03, 2005 - 3:18 pm UTC
 
 
Tom,
Thanks for the solution. I need some more help if you don't mind. The sql works excellently and I experimented with it. 
However, this question is based on a change of design here ... The emp1 table is joined with trn1 table (empid ~ trnid) to obtain values x and y. x and y should be passed to a function that returns date. 
The emp1 table is like:
EmpId  Day0 Day1  .....  Day14
100    8   8        8
200    0    0        8    
300    8    8        0    
400    0   8        8
500    8   0        8
trn1 table is like:
trnid x y
100   3 18
200   4 19
300   5 20
400   6 21
500   7 22 
etc ...
create table emp1(empid varchar2(3),  day0 number(2), day1 number(2), day2 number(2), day3 number(2), day4 number(2), day5 number(2), day6 number(2), day7 number(2), day8 number(2), day9 number(2), day10 number(2), day11 number(2), day12 number(2), day13 number(2), day14 number(2));
insert into emp1 values('100', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('200', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 0);
insert into emp1 values('300', 8, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 0, 0);
insert into emp1 values('400', 0, 8, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
insert into emp1 values('500', 8, 0, 0, 8, 0, 8, 0, 0, 8, 8, 8, 8, 0, 8, 8);
create table trn1(empid varchar2(3), x number(2), y number(2));
insert into trn1 values('100', 3, 18);
insert into trn1 values('200', 4, 19);
insert into trn1 values('300', 5, 20);
insert into trn1 values('400', 6, 21);
insert into trn1 values('500', 7, 22);
I used this function on just one row of emp1 (by hard coding x and y values).
I replaced 
with dates as
    (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual 
connect by level <= 15 )
with 
with dates as
    (select getXYDate(x,y)+level-1 dt, level-1 l from dual 
connect by level <= 15 )
However, I am trying to implement this on every row of emp1 by obtaining x and y from trn. There is no week or year in emp1 table. Any help? Thanks again.
 
 
August    03, 2005 - 6:00 pm UTC 
 
I didn't think it was possible, but now I like this even less than before!  didn't think you could do that ;(
ops$tkyte@ORA10G> with dates as
  2  (select to_date( '05/07/2005','mm/dd/yyyy')+level-1 dt, level-1 l from dual 
connect by level <= 15 )
  3  select empid, dt,
  4         case when l = 0 then day0
  5                  when l = 1 then day1
  6                  when l = 2 then day2
  7                          /* ... */
  8                  when l = 13 then day13
  9                  when l = 14 then day14
 10                  end data
 11    from ( QUERY ), dates
 12  /
replace query with a join of emp with trn and apply the function in there. 
 
 
 
 
A reader, August    03, 2005 - 7:38 pm UTC
 
 
Tom,
Sorry to bother you again. In my case, I think 
(select to_date( '05/07/2005','mm/dd/yyyy') will not help me anymore because I have to basically find dates for Day0 .. Day14 of every row in emp1 table. The first date (date that corresponds to Day0) for each record should be obtained using a function by passing X and Y values of trn table .. Because each record may have different x, y values.
If it's not achievable using this way, can you suggest an alternate approach. I am trying to make a function that would use a loop. Also, the data should be written to a text file once complete, in that case I think a procedure might help and if so, could you throw some light? Thanks for your patience. 
 
August    03, 2005 - 8:26 pm UTC 
 
well, you just need to generate a set of 15 numbers (L)
and add them on later.  No big change.  You have the "start_date" from the function right -- just add L to dt. 
 
 
 
A reader, August    03, 2005 - 8:38 pm UTC
 
 
Ok, Can you please show that if possible? 
 
 
A reader, August    03, 2005 - 9:38 pm UTC
 
 
Tom,
I tried this and am getting an error: ORA-00904: "DAY13": invalid identifier
WITH DATES AS 
(SELECT  FUNC_XY(17,2003)+level-1 dt, level-1 l FROM DUAL
 connect by level <= 15)
 select empid, day0, day14, x, y, dt,
    case when l = 0 then day0
         when l = 1 then day1
         when l = 2 then day2
         when l = 3 then day3
         when l = 4 then day4
         when l = 5 then day5
         when l = 6 then day6
         when l = 7 then day7
         when l = 8 then day8
         when l = 9 then day9
         when l = 10 then day10
         when l = 11 then day11
         when l = 12 then day12
         when l = 13 then day13
         when l = 14 then day14
         end data
  from (select emp1.empid, day0, day14, x, y from emp1, trn1 where emp1.empid = trn1.empid), dates
/
As said before ... I also have to use x and y instead of 17 and 2003 in order to compute it for every row.
 
 
August    04, 2005 - 8:20 am UTC 
 
yeah, well -- you didn't select it out in the inline view.  fix that.
look the concept is thus:
with some_rows as ( select level-1 l from dual connect by level <= 15 )
select a.empid, a.dt+l, case when l=0 then a.day0
                           ...
                           when l=14 then a.day14
                        end data
  from some_rows,
      (select emp1.empid, func_xy(trn1.x, trn1.y) dt,
              emp1.day0, emp1.day1, .... <ALL OF THE DAYS>, emp1.day14
         from emp1, trn1
        where emp1.empid = trn1.empid )
 
 
 
 
A reader, August    04, 2005 - 9:15 am UTC
 
 
Tom,
Here, the sql is using a.empid, a.dt+l ...
whereas the inner sql is using emp1.day0, trn1.empid, etc. My real inner sql uses some more columns and joins as well. When this gave me an error, I just substituted emp1.day0, emp1.day14, etc. with day0, day14, etc., and it worked. However, when there are several joins with alias names, how should it be done?
To make it a bit clear, this sql looks similar to:
select emp1.empid, emp1.day0 from some_rows, (select emp1.empid, emp1.day0) ...
Any idea how to select from a select and still use multiple joins etc.? Hope I am clear 
 
August    04, 2005 - 9:56 am UTC 
 
you can join as much as you WANT in the inline views.  
Sorry, I cannot go further with this one, I've shown the technique -- it is just a pivot to turn COLUMNS THAT SHOULD HAVE BEEN ROWS into rows -- very common. 
 
 
 
A reader, August    04, 2005 - 9:52 am UTC
 
 
Please ignore above post. 
 
 
I need some help 
Carlos, August    09, 2005 - 10:25 am UTC
 
 
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('11/15/2004 17:42:56', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/18/2004 15:09:19', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('11/24/2004 09:38:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('11/30/2004 04:28:09', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('01/03/2005 14:36:24', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/05/2005 10:04:15', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('01/07/2005 08:54:59', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/10/2005 10:54:07', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('01/12/2005 10:13:13', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('01/18/2005 04:23:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/03/2005 03:15:05', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/09/2005 18:54:11', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005 21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/22/2005 20:27:03', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/29/2005 17:05:04', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/22/2005 20:27:15', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/30/2005 08:53:13', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/30/2005 13:16:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/30/2005 15:08:39', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('04/16/2005 13:40:44', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
Tom, 
I hope you can help since I have been struggling with this report.  I would like to get something like this...
IN OTHER WORDS I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX. In this case for example for the month of March, however it can be for any given month.  Any ideas how I can accomplish that? 
IN_DATE               OUT_DATE
3/22/2005 8:27:03 PM        3/30/2005 3:08:39 PM 
  ----from the table above for the month of March 
 
August    09, 2005 - 10:45 am UTC 
 
Insufficient detail here -- why won't min/max work for you, for example?
Also, I don't understand the logic behind the two values you say you want; I don't see how you arrived at them. 
 
 
 
This is what I get
A reader, August    09, 2005 - 10:57 am UTC
 
 
 select in_date, out_date
from lou_date 
where id = 201048
and ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR
(in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS')
and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS'))) 
I get the following:
In_date                  out_date
3/22/2005 8:27:03 PM  3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM  4/16/2005 1:40:44 PM 
 
August    09, 2005 - 11:19 am UTC 
 
ok, 
Insert into LOU_DATE
   (IN_DATE, OUT_DATE)
 Values
   (TO_DATE('03/11/2005 13:25:40', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/15/2005 
21:47:41', 'MM/DD/YYYY HH24:MI:SS'));
why didn't you get that row.  for example. 
 
 
 
A reader, August    09, 2005 - 11:48 am UTC
 
 
SQL Statement which produced this data:
  select in_date, out_date 
  from lou_date 
  where ((out_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS') 
  and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS')) OR 
  (in_date between to_date('01-MAR-05 00:00:00', 'DD-MON-RR HH24:MI:SS') 
  and to_date('31-MAR-05 23:59:59', 'DD-MON-RR HH24:MI:SS'))) 
  order by out_date
3/3/2005 3:15:05 AM    3/9/2005 6:54:11 PM
3/11/2005 1:25:40 PM    3/15/2005 9:47:41 PM
3/11/2005 1:25:40 PM    3/15/2005 9:47:41 PM
3/22/2005 8:27:03 PM    3/29/2005 5:05:04 PM
3/22/2005 8:27:15 PM    3/30/2005 8:53:13 AM
3/30/2005 1:16:00 PM    4/16/2005 1:40:44 PM
3/30/2005 3:08:39 PM    4/16/2005 1:40:44 PM
I guess my question is: when I get records that extend beyond March,
the portion beyond March should be replaced with blank or NULL...
since I can't charge him/her for April...
  
 
August    09, 2005 - 12:00 pm UTC 
 
I am so not following you here. 
 
 
 
A reader, August    09, 2005 - 12:24 pm UTC
 
 
Tom,
Pretend that you are charging someone for a particular month -- let's say the month of March. So you would like a query that reflects just that: a group of dates is given to you, and in that group of dates you have multiple records with the same id. Also, some records contain trips that initiated in March but came back in April. Here are the examples... but it should work with any dates...
example 1.
   in_date          out_date
 3/22/2005 8:27:15 PM    3/30/2005 8:53:13 AM
 3/30/2005 1:16:00 PM    4/16/2005 1:40:44 PM
 
would like to see:
   in_date          out_date
 3/22/2005 8:27:15 PM     3/30/2005 1:16:00 PM  
  
 example 2
 
In_date               out_date
 3/3/2005 3:15:05 AM  3/9/2005 6:54:11 PM
 3/11/2005 1:25:40 PM 3/15/2005 9:47:41 PM
would like to see:
In_date               out_date
3/3/2005 3:15:05 AM   3/15/2005 9:47:41 PM 
 
August    09, 2005 - 12:42 pm UTC 
 
begs the question
in_date         out_date    
20-feb-2005     15-apr-2005
or
   in_date          out_date
 3/22/2005 8:27:15 PM    3/25/2005 8:53:13 AM
 3/30/2005 1:16:00 PM    4/16/2005 1:40:44 PM
what then.  Being able to clearly specify the "goal" or the "algorithm" usually leads us straight to the query itself.  There are so many ambiguities here.  Pretend you were actually documenting this for a junior programmer to program.  Give them the specifications.  In gory detail.
please don't just answer these two what thens -- think of all of the cases (cause I'll just keep on coming back with "what then" if you don't)
Remember -- I know NOTHING about your data, not a thing.  This progression from 
... I WANT TO GET WHEN IT FIRST WAS LOGGED IN IN_DATE AND WHEN IT WAS 
LAST LOGGED IN OUT_DATE. SORT OF LIKE MIN AND MAX....
to this has been 'strange' to say the least. 
 
 
 
Full explanation of requirements
A reader, August    09, 2005 - 3:10 pm UTC
 
 
Sorry for the misunderstanding Tom. Here is the full requirements. I hope I can explain it this time.
The report is a billing report and it goes as follows:
For example for the month of March we have to bill as 
in the following way:
out_date    date_in    Bill
2/23        3/2        3/1 to 3/2
3/1         3/3        3/1 to 3/3
3/1         4/14       3/1 to 3/31
3/1         -          3/1 to 3/31
2/23        -          3/1 to 3/31 
 
August    09, 2005 - 3:38 pm UTC 
 
well, i hope you give your programmers more detail.  Here is the best I'll do
ops$tkyte@ORA9IR1> select t.*,
  2         greatest( in_date, to_date('mar-2005','mon-yyyy') ) fixed_in_date,
  3         least( nvl(out_date,to_date('3000','yyyy')),  last_day( to_date( 'mar-2005', 'mon-yyyy' ) ) ) fixed_out_date
  4    from t
  5   where in_date < last_day( to_date( 'mar-2005', 'mon-yyyy' ) )+1
  6     and out_date >= to_date( 'mar-2005', 'mon-yyyy' );
 
IN_DATE   OUT_DATE  FIXED_IN_ FIXED_OUT
--------- --------- --------- ---------
03-MAR-05 09-MAR-05 03-MAR-05 09-MAR-05
11-MAR-05 15-MAR-05 11-MAR-05 15-MAR-05
22-MAR-05 29-MAR-05 22-MAR-05 29-MAR-05
22-MAR-05 30-MAR-05 22-MAR-05 30-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
30-MAR-05 16-APR-05 30-MAR-05 31-MAR-05
 
6 rows selected.
predicate finds records that overlap march.
select adjusts the begin/end dates. 
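The clamp-to-month logic in Tom's query (overlap predicate plus GREATEST/LEAST) can be pictured in plain Python; the stay data below is made up for illustration:

```python
from datetime import datetime

# Hypothetical stays as (in_date, out_date) pairs -- illustrative only.
stays = [
    (datetime(2005, 3, 22, 20, 27), datetime(2005, 3, 29, 17, 5)),   # inside March
    (datetime(2005, 3, 30, 15, 8),  datetime(2005, 4, 16, 13, 40)),  # runs into April
    (datetime(2005, 2, 20),         datetime(2005, 4, 15)),          # spans all of March
    (datetime(2005, 1, 3),          datetime(2005, 1, 5)),           # outside March
]

month_start = datetime(2005, 3, 1)
next_month  = datetime(2005, 4, 1)   # exclusive upper bound

billed = []
for in_date, out_date in stays:
    # the WHERE clause: keep only stays that overlap March
    if in_date < next_month and out_date >= month_start:
        # GREATEST / LEAST: clamp the stay to the month boundaries
        billed.append((max(in_date, month_start), min(out_date, next_month)))
```

Note the sketch clamps the upper bound to an exclusive first-of-next-month rather than LAST_DAY, which is a common alternative when times of day matter.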
 
 
 
 
Thank!!!
A reader, August    10, 2005 - 12:00 pm UTC
 
 
Tom,
One more request. I would like to start the report with
the first time it went out. That is to say...
how it looks now with your help...
  fix_in                fix_out
3/22/2005 8:27:03 PM    3/29/2005 5:05:04 PM
3/30/2005 3:08:39 PM    3/31/2005
how the data looks
  fix_in            fix_out
3/22/2005 8:27:03 PM    3/29/2005 5:05:04 PM---first went out
3/30/2005 3:08:39 PM    4/16/2005 1:40:44 PM
How  I would like to see it since we begin billing from 
the first date the truck went out.
  fix_in         fix_out
3/29/2005 5:05:04 PM    3/30/2005 3:08:39 PM
3/30/2005 3:08:39 PM    3/31/2005
Thanks again Tom 
 
August    10, 2005 - 1:05 pm UTC 
 
try to work it out yourself -- please.  
why?  because I'll do this little thing and it'll be "oh yeah, one more thing, when the data looks like this...."
specifying requirements is like the most important thing in the world -- it is key, it is crucial.  It is obvious you know what you want (well, maybe -- it seems to change over time) but I don't "get it" myself.  Your simple example here with two rows begs so so many questions, I don't even want to get started.
You have lag() and lead() at your disposal; they probably come into play here.  Check them out. 
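For what it's worth, lag() and lead() simply hand each row the values of its neighbors within the ordered partition. A plain-Python picture of that (not a solution to the report, whose spec is still ambiguous -- the row values here are just labels):

```python
# Rows ordered as the window's ORDER BY would order them.
rows = ['3/22 out', '3/29 back', '3/30 out', '4/16 back']

# Each tuple is (lag(row), row, lead(row)); None plays the role of
# SQL NULL at the edges of the partition.
lagged = [
    (prev, cur, nxt)
    for prev, cur, nxt in zip([None] + rows[:-1], rows, rows[1:] + [None])
]
```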
 
 
 
Thanks for help !
A reader, August    11, 2005 - 3:25 pm UTC
 
 
The report is kind of tricky, especially when one of the dates originates in February and the other of the pair falls in March.
 
 
 
Hooked on Analytics worked for me!!
Greg, August    22, 2005 - 11:15 am UTC
 
 
I think I need to find a meeting group to help with my addiction ... I think I'm addicted to analytics .. :\
Finally got a chance to read chapter 12 in "Expert Oracle" ... awesome!!  4 big, hairy Thumbs up!! heh
But I got a question ... an "odd" behaviour that I don't understand ... was wondering if you could help explain:
Test Script:
================
drop table junk2;
drop sequence seq_junk2;
create sequence seq_junk2;
create table junk2
  (inv_num   number,
   cli_num   number,
   user_id   number)
/
insert into junk2
  values ( 123, 456, null );
insert into junk2
  values ( 123, 678, null );
insert into junk2
  values ( 234, 456, null );
insert into junk2
  values ( 234, 678, null );
commit;
break on cli_num skip 1
select * from junk2;
select inv_num, cli_num,
       NVL ( user_id, 999 ) chk1,
       NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
       seq_junk2.nextval seq,
       FIRST_VALUE ( NVL ( user_id, seq_junk2.nextval ) )
                    OVER ( PARTITION BY cli_num ) user_id
  from junk2
/
=====================
The final query shows this:
   INV_NUM    CLI_NUM       CHK1 CHK2            SEQ    USER_ID
---------- ---------- ---------- -------- ---------- ----------
       123        456        999 NULL              1
       234                   999 NULL              2
       123        678        999 NULL              3          2
       234                   999 NULL              4          2
4 rows selected.
and I'm kinda confused .. it appears that the analytic functions are not "processing" that sequence ... how do sequences and analytics work together?? (if at all??)
(In short, this is a simplified example of a bigger problem I tripped over.  I'm trying to assign new user_ids for existing clients, but only want 1 user_id assigned per client.  Trick is, each client can be associated with more 1 investment ... so I have multiple rows with same client, but I want the same user_id assigned. kind of: "Has this client got an id yet? if not, give him a new one, otherwise display the one he's already been assigned".)
FIRST_VALUE and LAST_VALUE seemed the logical choice ... 
The interesting thing is, when I use DBMS_RANDOM.VALUE (to assign a random PIN to start with) ... it works fine, what am I missing/forgetting about sequences that changes their behaviour in this regards?)
 
 
August    23, 2005 - 8:56 am UTC 
 
that will be a tricky one, lots of assumptions on orders of rows processed and such.
that should throw an ORA-02287 in my opinion.
I cannot see a safe way to do that without writing a plsql function and performing a lookup off to the side by cli_num 
 
 
 
Sorry, I don't understand ... 
Greg, August    23, 2005 - 11:36 am UTC
 
 
you wrote:
"that will be a tricky one, lots of assumptions on orders of rows processed and such."
I don't understand what assumptions I'm making ... in my example, I just got 4 rows, I don't care what order they come back in, just so long as it deals with them in "groups of cli_nums" .. (hence the partition by cli_num portion) ... if I "lose" sequence numbers, that's fine, too ... I don't care about gaps in the sequence or "missing userids" ... 
The only behaviour I'm seeing, is that the analytic function doesn't seem to be working with the sequence properly ... 
I guess I can simplify the question even further:
Why does the following query return "NULL" ?
SQL > select first_value ( seq_junk2.nextval ) over ( )
  2  from dual
  3  /
------more------
FIRST_VALUE(SEQ_JUNK2.NEXTVAL)OVER()
------------------------------------
1 row selected.
(with a "normal" sequence - nothing fancy):
SQL > select seq_junk2.nextval from dual;
------more------
   NEXTVAL
----------
        29
1 row selected.
 
 
August    24, 2005 - 8:35 am UTC 
 
as i said, i believe it should be raising an error (I have it on my list of things to file when I get back in town).
I cannot make it work, I cannot think of a way to do it in a single statement, short of writing a user defined function. 
 
 
 
Connect by with self referenced parent
Joe, August    23, 2005 - 12:30 pm UTC
 
 
CONNECT BY works great but I've run into a problem when the ultimate parent is referenced in the parent record.  e.g., date looks like:
SQL> select * from t;
    OBJ_ID  PARENT_ID
---------- ----------
         1          1
         2          1
         3          1
         4          2
         5          4
But... using connect by generates an error..
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
  2  from t
  3  connect by t.parent_id = prior t.obj_id;
ERROR:
ORA-01436: CONNECT BY loop in user data
If parent_id is null where obj_id = 1, then it's okay.  Any suggestion on how to handle the other case?  I'm stumped.
 
 
 
 
Solution for connect by
Logan Palanisamy, August    23, 2005 - 5:39 pm UTC
 
 
SQL> select lpad(' ', 2*(level-1)) ||level "LEVEL",t.obj_id, t.parent_id
  2  from t
  3  connect by t.parent_id = prior t.obj_id and t.parent_id <> t.obj_id;
LEVEL                    OBJ_ID  PARENT_ID
-------------------- ---------- ----------
1                             1          1
  2                           2          1
    3                         4          2
      4                       5          4
  2                           3          1
1                             2          1
  2                           4          2
    3                         5          4
1                             3          1
1                             4          2
  2                           5          4
1                             5          4
12 rows selected.
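The same self-referencing-root problem can be seen with a recursive CTE, where the analogue of Logan's extra predicate is a guard in the recursive branch. A sketch using SQLite as a stand-in (unlike the CONNECT BY above, which starts from every row, this traverses only from the self-referencing root):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t (obj_id int, parent_id int);
insert into t values (1,1),(2,1),(3,1),(4,2),(5,4);
""")

# Recursive CTE analogue of CONNECT BY; the c.parent_id <> c.obj_id
# predicate breaks the loop caused by the root pointing at itself.
rows = con.execute("""
with recursive tree(lvl, obj_id, parent_id) as (
  select 1, obj_id, parent_id from t where parent_id = obj_id
  union all
  select p.lvl + 1, c.obj_id, c.parent_id
    from t c join tree p on c.parent_id = p.obj_id
   where c.parent_id <> c.obj_id          -- the cycle guard
)
select lvl, obj_id, parent_id from tree order by lvl, obj_id
""").fetchall()
```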
 
 
 
 
re:Solution for connect by
Joe, August    24, 2005 - 8:43 am UTC
 
 
Thanks Logan.  Often the solution is so simple!  Thanks. 
 
 
Seq problem
Bob B, August    24, 2005 - 11:25 am UTC
 
 
SELECT
  A.*,
  seq_junk2.currval CURR_SEQ,
  seq_junk2.nextval - ROWNUM + VAL SEQ
FROM (
SELECT 
  inv_num, 
  cli_num,
  NVL ( user_id, 999 ) chk1,
  NVL2 ( user_id, 'NOT NULL', 'NULL' ) chk2,
  DENSE_RANK() OVER ( ORDER BY CLI_NUM ) VAL
FROM JUNK2
) A
Might be a starting point.  It works on the following ASSUMPTION: ROWNUM corresponds to the number of times the sequence has been called.  As Tom stated, this assumption can easily go out the window (throw an analytic function or an order by on the outer query for a simple example).
A safer solution might be to run two updates.  Update 1 will give a unique id to each null user id.  Update 2 will update the user id to the min or max user id for that cli_num.  A little overhead, but safer and simpler than the aforementioned alternative. 
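Bob's two-update approach can be sketched end to end; SQLite stands in for Oracle here, with rowid playing the role of the sequence, so the details are illustrative rather than the exact Oracle statements:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table junk2 (inv_num int, cli_num int, user_id int);
insert into junk2 values (123,456,null),(123,678,null),
                         (234,456,null),(234,678,null);
""")

# Update 1: give every NULL user_id a unique value
# (rowid stands in for seq_junk2.nextval).
con.execute("update junk2 set user_id = rowid where user_id is null")

# Update 2: collapse to one id per client -- every row in a cli_num
# group gets that group's minimum id.
con.execute("""
update junk2
   set user_id = (select min(j2.user_id) from junk2 j2
                   where j2.cli_num = junk2.cli_num)
""")

ids = dict(con.execute(
    "select cli_num, min(user_id) from junk2 group by cli_num"))
```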
 
 
Still confused ... but working on it ... 
Greg, August    24, 2005 - 1:42 pm UTC
 
 
Thanks, Bob!!  Yeah, that does exactly what I wanted it to do, (but still doesn't really explain the "why" part) ... 
problem is, it looks like this is more a question on sequences now than analytics, so I'll see if I can find a more appropriate thread to continue this on ..
Thanks!!
 
 
 
A slight twist on lag/lead
Sudha Bhagavatula, September 01, 2005 - 11:08 am UTC
 
 
That was useful to me. Could do a lot of queries easily. However I'm stuck at this point.
I have data like this:
subr_id    dep_nbr    grp    eff_date     term_date
1001       001        2112   01/01/2000   12/31/2000
1001       001        2112   01/01/2001   06/30/2001
1001       001        2112   07/01/2001   12/31/2001
1001       001        7552   01/01/2003   12/31/2003
1001       001        2112   06/30/2004   12/31/9999
I want my output to look like this:
subr_id    dep_nbr    grp    eff_date     term_date
1001       001        2112   01/01/2000   12/31/2001
1001       001        7552   01/01/2003   12/31/2003
1001       001        2112   06/30/2004   12/31/9999
How do I achieve this ?
 
 
September 01, 2005 - 3:49 pm UTC 
 
well, you should start by describing the logic from getting from A to B first.
otherwise it is just text.  what are the rules that got you from inputs to outputs.
tell me the procedural algorithm you would use for example. 
 
 
 
Rules from A to B
Sudha Bhagavatula, September 02, 2005 - 9:29 am UTC
 
 
A member is enrolled in a group for a timeframe. For all contiguous time frames for a group I can take the min(eff_date) and max(term_date). For each break in group a new row with min(eff_date) and max(term_date) again. So say a member was enrolled in a group from 01/01/2001 to 12/31/2001 and then again with the same group from 01/01/2005 to 06/30/2005 then I need 2 rows for this member
with the dates as said just now. This is the sql that I'm running, hopefully I'm on the right track but am stuck at this point:
SELECT   SUBR_ID, 
         DEP_NBR,
         GRP,
         LAG_EFF_DATE,
         LEAD_EFF_DATE,
         EFF_DATE,
         TERM_DATE,
         LAG_TERM_DATE,
         LEAD_TERM_DATE,
         DECODE( LEAD_GRP, GRP, 1, 0 ) FIRST_OF_SET,
         DECODE( LAG_GRP, GRP, 1, 0 ) LAST_OF_SET
  FROM   (SELECT   M.SUBR_ID,
                   M.DEP_NBR,
                   LAG(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_GRP,
                   LEAD(GRP_NBR||SUB_GRP) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_GRP,                                                    
                   GRP_NBR||SUB_GRP GRP,
                   CJ.EFF_DATE,
                   CJ.TERM_DATE,
                   LAG(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_EFF_DATE,
                   LEAD(CJ.EFF_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_EFF_DATE,
                   LAG(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LAG_TERM_DATE,
                   LEAD(CJ.TERM_DATE) OVER (PARTITION BY M.SUBR_ID, M.DEP_NBR ORDER BY CJ.EFF_DATE) LEAD_TERM_DATE                                                            
            FROM   DW.T_MEMBER_GROUP_JUNCTION CJ,
                   BCBS.T_GROUP_DIMENSION G,
                   BCBS.T_MEMBER_DIMENSION M 
           WHERE   CJ.GRP_DIM_ID = G.GRP_DIM_ID 
             AND   CJ.MBR_DIM_ID = M.MBR_DIM_ID         
             AND   M.DEP_NBR != '000' 
             AND   G.BENE_PKG IS NOT NULL)
 WHERE   LAG_GRP IS NULL 
    OR   LEAD_GRP IS NULL 
    OR   LEAD_GRP <> GRP 
    OR   LAG_GRP <> GRP 
Thanks for your reply. 
 
September 03, 2005 - 7:15 am UTC 
 
you know, without a table, rows and something more concrete....  I have no comment. 
 
 
 
More detail
Sudha Bhagavatula, September 04, 2005 - 10:34 pm UTC
 
 
I have 3 tables:
Member_dimension
Group_Dimension
Member_Group_Junction
Member_Dimension :- columns are mbr_dim_id, subr_id, dep_nbr
Group dimension :- columns are grp_dim_id, grp_nbr, sub_grp
Member_Group_Junction :- columns are mbr_dim_id, grp_dim_id, eff_date, term_date
I have to create one row for each contiguous dates of enrollment with a new row for a new group or a break in date.
Suppose a member (subr_id = 1001, dep_nbr = 001) is enrolled with a group called 001 from 01/01/2001 till 06/30/2001, he then changes group to 002 for the period 07/01/2001 till 12/31/2001. He enrolls with the same group 002 from 01/01/2002 till 06/30/2002 with a change in benefits. He then gets transferred to some other city or changes jobs. He joins back with the group 001 from 09/30/2003 till 11/30/2003 and quits again, then joins back with the same group 001 from 01/01/2004 till present. The data in the junction table will be like this:
mbr_dim_id    grp_dim_id   eff_date      term_date
1             1            01/01/2001    06/30/2001
1             2            07/01/2001    12/31/2001
1             2            01/01/2002    06/30/2002
1             1            09/30/2003    11/30/2003
1             1            01/01/2004    12/31/9999  
My output should be like this:
mbr_dim_id    grp_dim_id   eff_date      term_date
1             1            01/01/2001    06/30/2001
1             2            07/01/2001    06/30/2002
1             1            09/30/2003    11/30/2003
1             1            01/01/2004    12/31/9999  
For each change in group or a break in the contiguity of the dates I should get a new row. The junction table is joined to the dimension with the respective dim_ids.
Hope I'm clearer this time.
Thanks
Sudha
 
 
September 05, 2005 - 10:11 am UTC 
 
tell you what, see
</code>  
https://www.oracle.com/technetwork/issue-archive/2014/14-mar/o24asktom-2147206.html  <code>
it shows a technique in the "analytics to the rescue" article that will be useful for grouping ranges of records using the LAG() function.
But, you need to read the text that you are supposed to read before putting an example here.
It is something I think I say a lot.
<quote>
If your followup requires a response that might include a query, you had better supply very very simple create tables and insert statements. I cannot create a table and populate it for each and every question. The SMALLEST create table possible (no tablespaces, no schema names, just like I do in my examples for you)
</quote>
that is a direct cut and paste  
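The core of that LAG() technique -- start a new group whenever the group changes or the dates stop being contiguous -- can be sketched procedurally with Sudha's sample data:

```python
from datetime import date, timedelta

# (grp, eff_date, term_date), already ordered by eff_date -- data
# taken from the example above.
rows = [
    ('2112', date(2000, 1, 1), date(2000, 12, 31)),
    ('2112', date(2001, 1, 1), date(2001, 6, 30)),
    ('2112', date(2001, 7, 1), date(2001, 12, 31)),
    ('7552', date(2003, 1, 1), date(2003, 12, 31)),
    ('2112', date(2004, 1, 1), date(9999, 12, 31)),
]

merged = []
for grp, eff, term in rows:
    # Start a new output row when the group changes or this row's
    # eff_date is not adjacent to the previous term_date; this is
    # the role LAG() plays in the SQL version.
    if merged and merged[-1][0] == grp and eff - merged[-1][2] <= timedelta(days=1):
        merged[-1] = (grp, merged[-1][1], term)   # extend current range
    else:
        merged.append((grp, eff, term))
```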
 
 
distinct last_value
Putchi, September 06, 2005 - 4:49 am UTC
 
 
When using last_value I am usually only interested in the last value, hence I need a distinct in the select to get it. It gives what I want, but it seems that the database has to do the work twice: first a window sort and after that a unique sort. Is there any way to avoid the distinct but still only get one row per partition key?
create table a (num number(2), var1 varchar2(10), var2 varchar2(10));
insert into a values (1,'a','A');
insert into a values (2,'b','A');
insert into a values (3,'c','A');
insert into a values (1,'a','B');
insert into a values (2,'b','B');
insert into a values (3,'c','B');
commit;
SQL> select distinct
  2         var2
  3        ,last_value(var1) over (partition by var2 order by num
  4                                rows between unbounded preceding and unbounded following) var1
  5  from a;
VAR2       VAR1
---------- ----------
A          c
B          c
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (UNIQUE)
   2    1     WINDOW (SORT)
   3    2       TABLE ACCESS (FULL) OF 'A'
 
 
 
September 06, 2005 - 8:31 am UTC 
 
nope, analytics are not aggregates, aggregates are not analytics. 
A trick you can use to skip one or the other step is:
ops$tkyte@ORA817DEV> select var2,
  2         substr( max(to_char( num,'fm0000000000') || var1), 11 ) data
  3    from a
  4   group by var2
  5  /
VAR2       DATA
---------- -----------
A          c
B          c
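Tom's trick works by encoding the ordering column into a fixed-width prefix so a plain MAX() aggregate picks the "last" value, then stripping the prefix off. The same idea in plain Python:

```python
# The pad-and-max trick: to_char(num,'fm0000000000') || var1 becomes a
# fixed-width string whose ordering matches num's ordering, so MAX per
# group finds the row with the largest num.
rows = [(1, 'a', 'A'), (2, 'b', 'A'), (3, 'c', 'A'),
        (1, 'a', 'B'), (2, 'b', 'B'), (3, 'c', 'B')]

best = {}
for num, var1, var2 in rows:
    key = f"{num:010d}{var1}"          # zero-pad num to 10 digits, append var1
    if var2 not in best or key > best[var2]:
        best[var2] = key               # MAX(...) GROUP BY var2

last_per_group = {k: v[10:] for k, v in best.items()}   # substr(..., 11)
```

The padding width just has to be large enough that every num sorts correctly as text; negative numbers would need extra care.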
 
 
 
 
 
Analytics to the rescue
Sudha Bhagavatula, September 06, 2005 - 11:28 am UTC
 
 
Read that article. Helped me, but now I have another twist.
Create table contracts (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)
insert into contracts values ('1001', '001', '2112', to_date('01/01/2000','mm/dd/yyyy'), to_date('12/31/2000','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2001','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('07/01/2001','mm/dd/yyyy'), to_date('12/31/2001','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '7552', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/2003','mm/dd/yyyy'));
insert into contracts values ('1001', '001', '2112', to_date('01/01/2004','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
I ran this query to identify breaks in groups and dates for the above table:
select subr_id, dep_nbr, grp,
       min_eff_date, 
       max_term_date
  from      
(select subr_id, dep_nbr, grp,
       min(eff_date) min_eff_date, 
       max(term_date) max_term_date
  from 
(select subr_id, dep_nbr, eff_date, term_date, grp,
       max(rn) 
         over(partition by subr_id, dep_nbr order by eff_date) max_rn
  from
(select subr_id, dep_nbr, eff_date, term_date, grp,
       (case
       when eff_date-lag_term_date > 1
            or lag_term_date is null
            or lag_grp_nbr is null
            or lag_grp_nbr <> grp
       then row_num
        end) rn
  from (
select subr_id, dep_nbr, eff_date, term_date, grp_nbr grp,
       lag(term_date) 
          over (partition by subr_id, dep_nbr order by eff_date) lag_term_date,
       lag(grp_nbr||sub_grp) 
          over (partition by subr_id, dep_nbr order by eff_date) lag_grp_nbr,   
       row_number() 
          over (partition by subr_id, dep_nbr order by eff_date) row_num
  from contracts         )))
group by subr_id, dep_nbr, grp, max_rn )      
 order by subr_id, dep_nbr, min_eff_date
This gave me the output as :
subr_id    dep_nbr    grp    eff_date     term_date
1001       001        2112   01/01/2000   12/31/2001
1001       001        7552   01/01/2003   12/31/2003
1001       001        2112   06/30/2004   12/31/9999
I now have another table :
create table contract_pcp_junction (subr_id varchar2(15), dep_nbr varchar2(3), pcp_id varchar2(12), eff_date date, term_date date)
insert into contract_pcp_junction values('1001','001','123765', to_date('07/01/2000','mm/dd/yyyy'), to_date('06/30/2001','mm/dd/yyyy'));
insert into contract_pcp_junction values('1001','001','155165', to_date('01/01/2003','mm/dd/yyyy'), to_date('12/31/9999','mm/dd/yyyy'));
This table identifies the provider coverage for each member. I need to identify the breaks in coverage with regards to the contracts.
Now as per the data above this member does not have a pcp from 01/01/2000 to 06/30/2000 and again from 07/01/2001 to 12/31/2001.
I need to insert the breaks into another table. This table needs to have the subr_id, dep_nbr, grp and eff_date, term_date.
create table contract_pcp_breaks (subr_id varchar2(15), dep_nbr varchar2(3), grp_nbr varchar2(12), eff_date date, term_date date)
This table needs to have the data for the breaks
subr_id    dep_nbr   grp_nbr   eff_date    term_date
1001       001       2112      01/01/2000  06/30/2000
1001       001       2112      07/01/2001  12/31/2001
How do I do that -- and hopefully I have given you the necessary scripts to work with.
Thanks a lot for your patience with this.
--Sudha 
 
September 06, 2005 - 8:51 pm UTC 
 
yah, I have scripts, but no real idea how these tables relate.  Your query looks overly complex for the single table.
can't you take your data, join it, and get some "flat relation" such that simply using lag() on it will solve the problem?
(please remember, you have been looking at this for hours.  To you this data is natural.  to everyone else, it is just bits and bytes on the screen) 
 
 
 
Combining two tables
Putchi, September 09, 2005 - 6:39 am UTC
 
 
Hi Tom!
I want to combine from/to history values from two tables into one sequence like this:
create table a (a varchar2(2)
               ,from_date  date
               ,to_date    date);
create table b (b varchar2(2)
               ,from_date  date
               ,to_date    date);
insert into a ( a, from_date, to_date ) values ( 
'a1',  to_date( '01/13/2005', 'mm/dd/yyyy'),  to_date('02/10/2005', 'mm/dd/yyyy')); 
insert into a ( a, from_date, to_date ) values ( 
'a2',  to_date( '02/10/2005', 'mm/dd/yyyy'),  to_date( '05/01/2005', 'mm/dd/yyyy')); 
insert into a ( a, from_date, to_date ) values ( 
'a3',  to_date( '05/01/2005', 'mm/dd/yyyy'),  to_date( '08/12/2005', 'mm/dd/yyyy')); 
insert into b ( b, from_date, to_date ) values ( 
'b1',  to_date( '01/13/2005', 'mm/dd/yyyy'),  to_date( '01/22/2005', 'mm/dd/yyyy')); 
insert into b ( b, from_date, to_date ) values ( 
'b2',  to_date( '01/22/2005', 'mm/dd/yyyy'),  to_date( '04/01/2005', 'mm/dd/yyyy')); 
insert into b ( b, from_date, to_date ) values ( 
'b3',  to_date( '04/01/2005', 'mm/dd/yyyy'),  to_date( '09/07/2005', 'mm/dd/yyyy')); 
commit;
select * from ("Magic");
A  B  FROM_DATE  TO_DATE
-- -- ---------- ----------
a1 b1 2005-01-13 2005-01-22
a1 b2 2005-01-22 2005-02-10
a2 b2 2005-02-10 2005-04-01
a2 b3 2005-04-01 2005-05-01
a3 b3 2005-05-01 2005-08-12
Is it possible? 
 
September 09, 2005 - 8:30 am UTC 
 
ops$tkyte@ORA10G> select a.* , b.*,
  2         greatest(a.from_date,b.from_date),
  3             least(a.to_date,b.to_date)
  4    from a, b
  5   where a.from_date <=  b.to_date
  6     and a.to_date >= b.from_date;
 
A  FROM_DATE TO_DATE   B  FROM_DATE TO_DATE   GREATEST( LEAST(A.T
-- --------- --------- -- --------- --------- --------- ---------
a1 13-JAN-05 10-FEB-05 b1 13-JAN-05 22-JAN-05 13-JAN-05 22-JAN-05
a1 13-JAN-05 10-FEB-05 b2 22-JAN-05 01-APR-05 22-JAN-05 10-FEB-05
a2 10-FEB-05 01-MAY-05 b2 22-JAN-05 01-APR-05 10-FEB-05 01-APR-05
a2 10-FEB-05 01-MAY-05 b3 01-APR-05 07-SEP-05 01-APR-05 01-MAY-05
a3 01-MAY-05 12-AUG-05 b3 01-APR-05 07-SEP-05 01-MAY-05 12-AUG-05
It won't be blindingly fast on huge things I would guess... 
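The "magic" is an interval-overlap join: two ranges intersect when each starts before the other ends, and GREATEST/LEAST clamp each matched pair to the intersection. A runnable sketch with SQLite standing in for Oracle (SQLite's multi-argument max()/min() play the roles of GREATEST/LEAST, and ISO date strings compare correctly as text):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table a (a text, from_date text, to_date text);
create table b (b text, from_date text, to_date text);
insert into a values ('a1','2005-01-13','2005-02-10'),
                     ('a2','2005-02-10','2005-05-01'),
                     ('a3','2005-05-01','2005-08-12');
insert into b values ('b1','2005-01-13','2005-01-22'),
                     ('b2','2005-01-22','2005-04-01'),
                     ('b3','2005-04-01','2005-09-07');
""")

# Overlap join: keep (a,b) pairs whose ranges intersect, and clamp
# each pair to the intersection.
rows = con.execute("""
select a.a, b.b,
       max(a.from_date, b.from_date),   -- GREATEST
       min(a.to_date,   b.to_date)      -- LEAST
  from a, b
 where a.from_date <= b.to_date
   and a.to_date   >= b.from_date
 order by 3
""").fetchall()
```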
 
 
 
 
Putchi, September 09, 2005 - 9:14 am UTC
 
 
OK, I will see if it works; the real tables will have hundreds of thousands of records. I tried this myself, but I couldn't come up with something that filled in the "null" values.
SQL> select a,b,from_date,lead(from_date) over (order by from_date)
  2  from (
  3  select a,null b,from_date,to_date from a
  4  union all
  5  select null a,b,from_date,to_date from b
  6  order by from_date
  7  );
A  B  FROM_DATE  LEAD(FROM_
-- -- ---------- ----------
a1    2005-01-13 2005-01-13
   b1 2005-01-13 2005-01-22
   b2 2005-01-22 2005-02-10
a2    2005-02-10 2005-04-01
   b3 2005-04-01 2005-05-01
a3    2005-05-01
 
 
 
September 09, 2005 - 9:36 am UTC 
 
that query won't work -- you need to join. 
 
 
 
How to get the 1ST row of this distinct value in a single SELECT
Sean Chang, September 16, 2005 - 11:48 am UTC
 
 
Thank you, Tom.
 
I have been reading about analytic functions for a while, but still
can't figure out a way to select the first row per distinct
column value in a single SELECT statement, i.e.
>>by running below Create and Insert
create  table  INV (
   inv# number(7),
   add_time     date ,
   inv_type varchar2(10),
   amount   number(8,2));
insert into inv values(1, sysdate-1, 'CASH', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(1, sysdate+1, 'COD', 100);
insert into inv values(1, sysdate, 'VISA', 200);
insert into inv values(2, sysdate, 'MC', 10);
insert into inv values(3, sysdate-1, 'AMEX', 30);
insert into inv values(3, sysdate, 'CASH', 30);
I can get the first row of distinct INV# this way:
select * from (select a.*,
              rank() over (partition by inv# order by add_time) time_order
               from inv  a)  where time_order=1;
But how can I achieve this with a single SELECT statement?
The reason is that we have lots of tables where we only need
to look at the very first row for each column value, and I
don't want to end up with lots of in-line views in the SELECT
statement.
 
 
September 16, 2005 - 1:59 pm UTC 
 
that is a single select.
why not? (on the lots of in-line views).  If you think they are evil - then you wouldn't like my code ;)
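Tom's point stands: the inline view is still one statement. A runnable version of the "first row per key" pattern, with SQLite (3.25+ for window functions) standing in for Oracle; the column is renamed invnum since inv# is not a portable identifier, and the dates are simplified:

```python
import sqlite3

con = sqlite3.connect(":memory:")   # window functions need SQLite >= 3.25
con.executescript("""
create table inv (invnum int, add_time text, inv_type text, amount real);
insert into inv values
  (1,'2005-09-15','CASH',100),(1,'2005-09-16','VISA',200),
  (1,'2005-09-17','COD',100), (2,'2005-09-16','MC',10),
  (3,'2005-09-15','AMEX',30), (3,'2005-09-16','CASH',30);
""")

# One SELECT statement: rank rows within each invnum by add_time,
# then keep only the first of each partition.
first_rows = con.execute("""
select invnum, add_time, inv_type, amount
  from (select a.*,
               rank() over (partition by invnum order by add_time) rnk
          from inv a)
 where rnk = 1
 order by invnum
""").fetchall()
```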
 
 
 
 
Are analytics fitting in this situation?
A reader, October   03, 2005 - 10:29 am UTC
 
 
select  b.damage_inspection_date,
        b.damage_inspection_by
       ,b.status
       ,NVL(a.cnt,0) CNT
from
     (select aa.damage_inspection_date,
             aa.damage_inspection_by,
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
                from gate_damages gd, gate_containers gc
               where gd.gate_id = gc.gate_id
             ) aa,
             (select * 
                from (select 'MAJOR' STATUS from dual
                      union all
                      select 'MINOR' STATUS from dual
                      union all
                      select 'TOTAL' STATUS from dual
                     )
             ) bb
     )b,
     ((SELECT  damage_inspection_date,
               damage_inspection_by,
               Status,
               cnt
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
                       c.damage_inspection_by,
                       'MAJOR' STATUS,
                       count(distinct c.gate_id) cnt
                  from gate_containers c,
                       gate_damages d
                 where c.gate_id = d.gate_id and
                       d.damage_type_code = 'A'
              group by trunc(c.damage_inspection_date),c.damage_inspection_by
                 
                UNION ALL
                select trunc(g.damage_inspection_date) damage_inspection_date,
                       g.damage_inspection_by,
                       'MINOR' STATUS,
                       count(distinct g.gate_id) cnt
                  from gate_containers g,
                       gate_damages z
                 where g.gate_id = z.gate_id and
                       z.damage_type_code = 'F'
              group by trunc(g.damage_inspection_date),g.damage_inspection_by
             
                UNION ALL
                select  trunc(ab.damage_inspection_date) damage_inspection_date,
                        ab.damage_inspection_by,
                       'TOTAL' STATUS,
                       count(distinct ab.gate_id) cnt
                  from gate_containers ab,
                       gate_damages ac
                 where ab.gate_id = ac.gate_id
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
             
               )
               group by damage_inspection_date, damage_inspection_by, status, cnt
               )
     ) a
where b.damage_inspection_by = a.damage_inspection_by(+)
  and b.damage_inspection_date = a.damage_inspection_date(+)
  and b.status = a.status(+); 
 
October   03, 2005 - 11:29 am UTC 
 
((SELECT  damage_inspection_date,
               damage_inspection_by,
               Status,
               cnt
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
                       c.damage_inspection_by,
                       'MAJOR' STATUS,
                       count(distinct c.gate_id) cnt
                  from gate_containers c,
                       gate_damages d
                 where c.gate_id = d.gate_id and
                       d.damage_type_code = 'A'
              group by trunc(c.damage_inspection_date),c.damage_inspection_by
                 
                UNION ALL
                select trunc(g.damage_inspection_date) damage_inspection_date,
                       g.damage_inspection_by,
                       'MINOR' STATUS,
                       count(distinct g.gate_id) cnt
                  from gate_containers g,
                       gate_damages z
                 where g.gate_id = z.gate_id and
                       z.damage_type_code = 'F'
              group by trunc(g.damage_inspection_date),g.damage_inspection_by
             
                UNION ALL
                select  trunc(ab.damage_inspection_date) damage_inspection_date,
                        ab.damage_inspection_by,
                       'TOTAL' STATUS,
                       count(distinct ab.gate_id) cnt
                  from gate_containers ab,
                       gate_damages ac
                 where ab.gate_id = ac.gate_id
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
             
               )
should be a single query without union's - you don't need to make three passes on that data
select ..., count(distinct case when damage_code = 'A' then gate_id),
            count(distinct case when damage_code = 'F' then gate_id end), 
            count(distinct gate_id)
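Written out in full against the tables named in the question, the single-pass version might look like this (a sketch only -- the gate tables aren't available here to test against, and note each CASE needs its closing END):

```sql
-- One pass instead of three UNION ALL branches: each CASE expression is
-- non-null only for the damage type of interest, so each COUNT(DISTINCT)
-- counts a different subset of gate_ids in the same scan.
select trunc(c.damage_inspection_date) damage_inspection_date,
       c.damage_inspection_by,
       count(distinct case when d.damage_type_code = 'A' then c.gate_id end) major_cnt,
       count(distinct case when d.damage_type_code = 'F' then c.gate_id end) minor_cnt,
       count(distinct c.gate_id) total_cnt
  from gate_containers c, gate_damages d
 where c.gate_id = d.gate_id
 group by trunc(c.damage_inspection_date), c.damage_inspection_by;
```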
 
 
 
 
Great!
A reader, October   03, 2005 - 4:05 pm UTC
 
 
Tom,
When I put in the changes, it says "missing keyword". What am I doing wrong?
select  b.damage_inspection_date,
        b.damage_inspection_by
       ,b.status
       ,NVL(a.cnt,0) CNT
from
     (select aa.damage_inspection_date,
             aa.damage_inspection_by,
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) 
damage_inspection_date, gc.damage_inspection_by
                from gate_damages gd, gate_containers gc
               where gd.gate_id = gc.gate_id
             ) aa,
             (select * 
                from (select 'MAJOR' STATUS from dual
                      union all
                      select 'MINOR' STATUS from dual
                      union all
                      select 'TOTAL' STATUS from dual
                     )
             ) bb
     )b,
     ((SELECT  damage_inspection_date,
               damage_inspection_by,
               Status,
               count(distinct case when damage_code = 'A' then gate_id),
               count(distinct case when damage_code = 'F' then gate_id end), 
               count(distinct gate_id))
             from gate_containers ab,gate_damages ac
                 where ab.gate_id = ac.gate_id
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
                         )
    where b.damage_inspection_by = a.damage_inspection_by(+)
  and b.damage_inspection_date = a.damage_inspection_date(+)
  and b.status = a.status(+);  
 
October   03, 2005 - 8:57 pm UTC 
 
sorry, I am not a sql compiler, I cannot reproduce since I don't have the tables or anything. 
 
 
 
Case when ... then ... end
Greg, October   04, 2005 - 8:26 am UTC
 
 
Just lucked out and saw this:
"select ..., count(distinct case when damage_code = 'A' then gate_id),
            count(distinct case when damage_code = 'F' then gate_id end), 
            count(distinct gate_id)"
Should be:
"select ..., count(distinct case when damage_code = 'A' then gate_id end),
            count(distinct case when damage_code = 'F' then gate_id end), 
            count(distinct gate_id)"
Tom just missed the "end" for the case statement ... (I got lucky and spotted it .. heh)
 
 
October   04, 2005 - 4:25 pm UTC 
 
(that is why i always ask for create tables and inserts - without them, it is not possible to test) 
 
 
 
thanks!!
A reader, October   04, 2005 - 2:24 pm UTC
 
 
 
 
Well Taken
A reader, October   05, 2005 - 10:53 am UTC
 
 
Tom,
This is what I would like to see..
damage_inspection_date  damage_inspection_by    counts
xx/xx/xxxx                    Louis             2 minors
xx/xx/xxxx                    juan              1 major
thanks. 
 
 
can analytics help me?
Susan, October   05, 2005 - 2:41 pm UTC
 
 
My result set must be ordered by the sum of multiple columns, with weights assigned to the columns.  The SQL below works and gives me what I want, but maybe there is an analytic function solution?  Thanks for all your help.
SELECT ename, job, sal, comm  FROM scott.BONUS
ORDER BY DECODE(job, -2, 0, job)*100000+DECODE(sal, -2, 0, sal)*10000+DECODE(comm, -2,0,comm)*100 DESC
 
 
October   05, 2005 - 3:05 pm UTC 
 
not in this case - you want to order by a simple function of attributes of a single row.
You don't need to look across rows - analytics look across rows. 
 
 
 
Thanks Tom
Susan, October   05, 2005 - 3:58 pm UTC
 
 
Thanks for your reply. Do you agree with the DECODE approach or am I missing a more elegant solution? 
 
October   05, 2005 - 8:23 pm UTC 
 
the decode looks fine here - shorter than case but in this "case" just as easy to read. 
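For reference, one term of the DECODE and its CASE equivalent side by side (same semantics; DECODE is just more compact here):

```sql
-- Equivalent ways to map the sentinel value -2 to zero in the ORDER BY:
ORDER BY DECODE(comm, -2, 0, comm) * 100 DESC
-- versus
ORDER BY (CASE WHEN comm = -2 THEN 0 ELSE comm END) * 100 DESC
```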
 
 
 
Tom
A reader, October   05, 2005 - 4:25 pm UTC
 
 
 Tom,
Can you please point in the right direction... 
This is what I am getting with the following query...
 damage_inspection_date damage_inspection_by  status
 6/12/2004           CCCT              MAJOR   
 6/12/2004           CCCT              MINOR
 6/12/2004           CCCT              TOTAL
 6/12/2004           LOU               MAJOR
 6/12/2004           LOU               MINOR
 
and this is what I would like to get....
 
 damage_inspection_date damage_inspection_by status   count
 6/12/2004          CCCT               MAJOR     2 
 6/12/2004          CCCT               MINOR     2
 6/12/2004         CCCT                TOTAL     1
 select b.damage_inspection_date,
        b.damage_inspection_by
       ,b.status
from
     (select aa.damage_inspection_date,
             aa.damage_inspection_by,
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
                from gate_damages gd, gate_containers gc
               where gd.gate_id = gc.gate_id
             ) aa,
             (select * 
                from (select 'MAJOR' STATUS from dual
                      union all
                      select 'MINOR' STATUS from dual
                      union all
                      select 'TOTAL' STATUS from dual
                     )
             ) bb
     )b,
    ((SELECT  ab.damage_inspection_date,
               damage_inspection_by,
               STATUS_CODE,
               count(distinct case when ac.damage_location_code = 'A' then ab.gate_id end),
               count(distinct case when ac.damage_location_code = 'F' then ab.gate_id end), 
               count(distinct ab.gate_id )
                   from gate_containers ab,gate_damages ac
                 where ab.gate_id = ac.gate_id
                group by ab.damage_inspection_date,ab.damage_inspection_by,status_code, ab.gate_id))a
                where b.damage_inspection_by = a.damage_inspection_by(+)
                  and b.damage_inspection_date = a.damage_inspection_date(+)
                group by (b.damage_inspection_date, b.damage_inspection_by,b.status)
 
 
October   05, 2005 - 8:35 pm UTC 
 
....
 damage_inspection_date damage_inspection_by  status
 6/12/2004           CCCT              MAJOR   
 6/12/2004           CCCT              MINOR
 6/12/2004           CCCT              TOTAL
 6/12/2004           LOU               MAJOR
 6/12/2004           LOU               MINOR
 
and this is what I would like to get....
 
 damage_inspection_date damage_inspection_by status   count
 6/12/2004          CCCT               MAJOR     2 
 6/12/2004          CCCT               MINOR     2
 6/12/2004         CCCT                TOTAL     1
...
by what "logic"?   can you explain how you get from A to B? 
 
 
 
follow up
A reader, October   06, 2005 - 9:48 am UTC
 
 
Tom,
I already got the first part done. All I need now is to somehow show the count in another column: how many minor, major and total I have. Is that possible?
Just maybe like in the second example.
 
 
October   06, 2005 - 11:54 am UTC 
 
first part of WHAT?   
 
 
 
more information
A reader, October   06, 2005 - 12:47 pm UTC
 
 
Sorry about the lack of information before.
Here I will try to do better. I am trying to write
a query where I need to count the major, minor
and then get a total.
requirements:
1. if there is a container with  majors and a minors total the     
counts = major+ minor = total count
2. where container has minor and no major count  the minor only.   
count = minor 
inspector                  major  minor  total
1 major, 0 minor, other      1             1
2 major, 1 minor, other      2      1      3
0 major, 1 minor, other      0      1      1
 
October   06, 2005 - 1:25 pm UTC 
 
sorry -- going back to your original example, I still cannot see the logic behind "what I have" and "what I want" there.  
I don't know what you mean by  "i have the first part"
 
 
 
 
this is what I have now
A reader, October   06, 2005 - 2:11 pm UTC
 
 
Tom,
This is my query and result...
select  b.damage_inspection_date,
        b.damage_inspection_by
       ,b.status
       ,NVL(a.cnt,0) CNT
from
     (select aa.damage_inspection_date,
             aa.damage_inspection_by,
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
                from gate_damages gd, gate_containers gc
               where gd.gate_id = gc.gate_id
             ) aa,
             (select * 
                from (select 'MAJOR' STATUS from dual
                      union all
                      select 'MINOR' STATUS from dual
                      union all
                      select 'TOTAL' STATUS from dual
                     )
             ) bb
     )b,
     ((SELECT  damage_inspection_date,
               damage_inspection_by,
               Status,
               cnt
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
                       c.damage_inspection_by,
                       'MAJOR' STATUS,
                       count(distinct c.gate_id) cnt
                  from gate_containers c,
                       gate_damages d
                 where c.gate_id = d.gate_id and
                       d.damage_type_code = 'F'
              group by trunc(c.damage_inspection_date),c.damage_inspection_by
                 
                UNION ALL
                select trunc(g.damage_inspection_date) damage_inspection_date,
                       g.damage_inspection_by,
                       'MINOR' STATUS,
                       count(distinct g.gate_id) cnt
                  from gate_containers g,
                       gate_damages z
                 where g.gate_id = z.gate_id and
                       z.damage_type_code = 'A'
              group by trunc(g.damage_inspection_date),g.damage_inspection_by
             
                UNION ALL
                select  trunc(ab.damage_inspection_date) damage_inspection_date,
                        ab.damage_inspection_by,
                       'TOTAL' STATUS,
                       count(distinct ab.gate_id) cnt
                  from gate_containers ab,
                       gate_damages ac
                 where ab.gate_id = ac.gate_id(+) and
                       SUBSTR(ab.action,2,1) != 'C'
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
             
               )
               group by damage_inspection_date, damage_inspection_by, status, cnt
               )
     ) a
where b.damage_inspection_by = a.damage_inspection_by(+)
  and b.damage_inspection_date = a.damage_inspection_date(+)
  and b.status = a.status(+);
RESULT:
SQL Statement which produced this data:
  select * from MAJOR_MINOR_COUNT_VIEW 
  where rownum < 10
6/12/2004    CCCT    TOTAL    1
6/12/2004    CRAIG    TOTAL    6
6/13/2004    CCCT    TOTAL    5
6/14/2004    CCCT    TOTAL    46
6/14/2004    FYFE    TOTAL    30
6/14/2004    HALM    TOTAL    38
6/14/2004    MUTH    MAJOR    2
6/14/2004    MUTH    MINOR    14
6/14/2004    MUTH    TOTAL    40
AND I WOULD LIKE TO HAVE IT AS IN
THE REQUIREMENTS ABOVE... HOPE THIS HELPS. 
 
October   06, 2005 - 2:57 pm UTC 
 
take your query - call it Q
select inspector, 
       max(decode(status,'MINOR',cnt)) minor,
       max(decode(status,'MAJOR',cnt)) major,
       max(decode(status,'TOTAL',cnt)) total
  from (Q)
 group by inspector 
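The max(decode()) trick pivots rows into columns: each DECODE returns cnt for one status value and NULL for the others, so MAX per group picks out that one row's count. A tiny self-contained illustration:

```sql
-- Pivot two (status, cnt) rows into one row of two columns.
select max(decode(status, 'MAJOR', cnt)) major,
       max(decode(status, 'MINOR', cnt)) minor
  from (select 'MAJOR' status, 2 cnt from dual
        union all
        select 'MINOR' status, 14 cnt from dual);
-- returns one row: MAJOR = 2, MINOR = 14
```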
 
 
 
Year to dt + month to date 
reader, October   06, 2005 - 2:35 pm UTC
 
 
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount) 
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount) 
OVER (PARTITION BY id, TO_CHAR(invoice_dt, 'MON-YYYY') ORDER BY TO_CHAR(sale_dt, 'MON-YYYY') ASC) mon_sal,
SUM (sale_price_usd * qty_sold) 
OVER (PARTITION BY id, TO_CHAR(sale_dt, 'YYYY') ORDER BY TO_CHAR(sale_dt, 'YYYY') ASC) yr_sal,
FROM test
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL
---------- --------- ---------- ---------- ----------
aa         15-SEP-05      72.25      72.25        237
aa         14-OCT-05      121.5     164.75        237
aa         19-OCT-05      43.25     164.75        237
bb         14-SEP-05      232.5      232.5    1298.25
bb         13-OCT-05     235.25      600.5    1298.25
bb         15-OCT-05     365.25      600.5    1298.25
bb         14-NOV-05     465.25     465.25    1298.25
7 rows selected.
Ideally ,it should have been ----
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL
---------- --------- ---------- ---------- ----------
aa         15-SEP-05      72.25      72.25        72.25
aa         14-OCT-05      121.5     121.5        193.75
aa         19-OCT-05      43.25     164.75        237
bb         14-SEP-05     232.5      232.5    232.5
bb         13-OCT-05     235.25     235.25    467.5
bb         15-OCT-05     365.25     600.5    833.0 
bb         14-NOV-05     465.25     465.25    1298.25
How can I do this ?
Will appreciate your help .
THANKS 
 
 
October   06, 2005 - 3:02 pm UTC 
 
ideally - there would be a qty_sold column somewhere :)
ideally you will ONLY use to_char to *format* data, never to process it.
trunc(invoice_dt,'y')  NOT to_char(invoice_dt,'yyyy')
trunc(sale_dt,'mm')    NOT to_char(sale_dt, 'MON-YYYY' )
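The difference matters because trunc() keeps the DATE datatype while to_char() produces a string, so the to_char version sorts and compares alphabetically ('APR-2005' before 'JAN-2005'). A quick sketch:

```sql
-- trunc() buckets dates while keeping them as DATEs:
select trunc(sysdate, 'mm') from dual;  -- first day of the current month
select trunc(sysdate, 'y')  from dual;  -- first day of the current year
```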
 
 
 
 
Year to Date and Month to date 
READER, October   06, 2005 - 10:04 pm UTC
 
 
As per your suggestion, I made the changes, but... I still need your help.
CREATE TABLE TEST (ID VARCHAR2(10),sale_dt DATE ,amount NUMBER(6,2) )
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
commit;
SELECT DISTINCT id,sale_dt,SUM (amount)
OVER (PARTITION BY id ORDER BY sale_dt ASC) sale_daily,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'MM') ORDER BY trunc(sale_dt,'MM') ASC) mon_sal,
SUM (amount)
OVER (PARTITION BY id,trunc(sale_dt,'Y') ORDER BY trunc(sale_dt,'Y') ASC) yr_sal
FROM test
ID         SALE_DT               SALE_DAILY    MON_SAL     YR_SAL
---------- --------------------- ---------- ---------- ----------
aa         9/15/2005                  72.25      72.25        237
aa         10/14/2005                193.75     164.75        237
aa         10/19/2005                   237     164.75        237
bb         9/14/2005                  232.5      232.5    1298.25
bb         10/13/2005                467.75      600.5    1298.25
bb         10/15/2005                   833      600.5    1298.25
bb         11/14/2005               1298.25     465.25    1298.25
7 rows selected
Ideally ,it should have been ----
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL
---------- --------- ---------- ---------- ----------
aa         15-SEP-05      72.25      72.25        72.25
aa         14-OCT-05      121.5     121.5        193.75
aa         19-OCT-05      43.25     164.75        237
bb         14-SEP-05     232.5      232.5    232.5
bb         13-OCT-05     235.25     235.25    467.5
bb         15-OCT-05     365.25     600.5    833.0 
bb         14-NOV-05     465.25     465.25    1298.25
Thanks again . 
 
October   07, 2005 - 8:13 am UTC 
 
you shall have to explain how you derived your "optimal" output.
certainly isn't sorted by anything?  I don't get the numbers. 
 
 
 
Year to date /Month to date 
Reader, October   07, 2005 - 9:49 am UTC
 
 
I wish to create a summary table where we will have the sale for every day, the sale up to that day in that month, and the sale up to that day in that year,
i.e. a running or cumulative total.
Thanks 
 
 
October   07, 2005 - 8:22 pm UTC 
 
ok? 
 
 
Follow up
A reader, October   07, 2005 - 9:52 am UTC
 
 
Tom,
The above pivot worked well; however my counts are off, since
I ONLY want to count the minor when there is no major.
Something like this..
                          major  minor  count
1 major, 0 minor, other     1             1
2 major, 1 minor, other     2             2
0 major, 1 minor, other     0      1      1
* count the minor when there is no major
CREATE TABLE GATE_CONTAINERS
(
  GATE_ID                       NUMBER          ,
  VISIT                         NUMBER          ,
  REFERENCE_ID                  NUMBER          ,
  DAMAGE_INSPECTION_BY          VARCHAR2(30),
  DAMAGE_INSPECTION_DATE        DATE
)
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (1, 1);
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (17, 10);
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (21, 12);
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (31, 18);
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (33, 19);
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (36, 22, TO_DATE('06/12/2004 11:48:49', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (37, 23, TO_DATE('06/12/2004 11:50:11', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (39, 25, TO_DATE('06/12/2004 11:48:19', 'MM/DD/YYYY HH24:MI:SS'), 'CRAIG');
Insert into GATE_CONTAINERS
   (GATE_ID, VISIT)
 Values
   (45, 30);
COMMIT;
CREATE TABLE GATE_DAMAGES
(
  GATE_ID               NUMBER                  NOT NULL,
  DAMAGE_LOCATION_CODE  VARCHAR2(5 BYTE)        NOT NULL,
  DAMAGE_TYPE_CODE      VARCHAR2(5 BYTE)        NOT NULL
)
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (34, '01', '9');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (34, '02', 'C');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (37, '01', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (62, '05', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (101, '23', 'C');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (183, '99', '9');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '01', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '04', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '07', 'B');
COMMIT;
 
 
October   07, 2005 - 8:35 pm UTC 
 
The above pivot worked well; however my counts are off, since
I ONLY want to count the minor when there is no major.
Something like this..
                          major  minor  count
1 major, 0 minor, other     1             1
2 major, 1 minor, other     2             2
0 major, 1 minor, other     0      1      1
so tell me why there are minor counts when major > 0??? 
 
 
 
and this is my query
A reader, October   07, 2005 - 9:53 am UTC
 
 
select damage_inspection_date,damage_inspection_by,
       max(decode(status,'MINOR',cnt)) minor,
       max(decode(status,'MAJOR',cnt)) major,
       max(decode(status,'TOTAL',cnt)) total
  from (select  b.damage_inspection_date,
        b.damage_inspection_by
       ,b.status
       ,NVL(a.cnt,0) CNT
from
     (select aa.damage_inspection_date,
             aa.damage_inspection_by,
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by
                from gate_damages gd, gate_containers gc
               where gd.gate_id = gc.gate_id
             ) aa,
             (select * 
                from (select 'MAJOR' STATUS from dual
                      union all
                      select 'MINOR' STATUS from dual
                      union all
                      select 'TOTAL' STATUS from dual
                     )
             ) bb
     )b,
     ((SELECT  damage_inspection_date,
               damage_inspection_by,
               Status,
               cnt
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date,
                       c.damage_inspection_by,
                       'MAJOR' STATUS,
                       count(distinct c.gate_id) cnt
                  from gate_containers c,
                       gate_damages d
                 where c.gate_id = d.gate_id and
                       d.damage_type_code = 'F'
              group by trunc(c.damage_inspection_date),c.damage_inspection_by
               UNION ALL
                select trunc(g.damage_inspection_date) damage_inspection_date,
                       g.damage_inspection_by,
                       'MINOR' STATUS,
                       count(distinct g.gate_id) cnt
                  from gate_containers g,
                       gate_damages z
                 where g.gate_id = z.gate_id and
                       z.damage_type_code = 'A'
              group by trunc(g.damage_inspection_date),g.damage_inspection_by
              UNION ALL
              select  trunc(ab.damage_inspection_date) damage_inspection_date,
                        ab.damage_inspection_by,
                       'TOTAL' STATUS,
                       count(distinct ab.gate_id) cnt
                  from gate_containers ab,
                       gate_damages ac
                 where ab.gate_id = ac.gate_id(+) and
                       SUBSTR(ab.action,2,1) != 'C'
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by
               )
               group by damage_inspection_date, damage_inspection_by, status, cnt
               )
     ) a
where b.damage_inspection_by = a.damage_inspection_by(+)
  and b.damage_inspection_date = a.damage_inspection_date(+)
  and b.status = a.status(+) 
)
group by damage_inspection_date,damage_inspection_by 
 
 
I got it....
A reader, October   07, 2005 - 11:16 am UTC
 
 
Tom,
I got it.... I just had to put the following. Let me know
what you think, if you have any suggestions!
Thanks for all your patience...
select trunc(g.damage_inspection_date) damage_inspection_date,
             g.damage_inspection_by,z.damage_type_code, 
                       ( case z.damage_type_code  
                                    when 'F' then 0
                              ELSE Count(distinct g.gate_id) 
                            end ) CNT
                 ---      count(distinct g.gate_id) cnt
                  from gate_containers g,gate_damages z
                  where g.gate_id = z.gate_id 
                  and z.damage_type_code = 'A' 
 
 
Year to Date and Month to date
Tim, October   07, 2005 - 11:52 pm UTC
 
 
Just a guess - could this be what you're looking for?
SELECT DISTINCT id,sale_dt
,SUM (amount) OVER
    (PARTITION BY id, sale_dt ORDER BY id ASC, sale_dt ASC) sale_daily
,SUM (amount) OVER
    (PARTITION BY id, TRUNC(sale_dt,'MM')
     ORDER BY id ASC, sale_dt ASC
     RANGE UNBOUNDED PRECEDING
    ) mon_sal
,SUM (amount) OVER
    (PARTITION BY id, TRUNC(sale_dt,'Y')
     ORDER BY id ASC, sale_dt ASC
     RANGE UNBOUNDED PRECEDING
    ) yr_sal
FROM TEST
ORDER BY id, sale_dt
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL
---------- --------- ---------- ---------- ----------
aa         15-SEP-05      72.25      72.25      72.25
aa         14-OCT-05      121.5      121.5     193.75
aa         19-OCT-05      43.25     164.75        237
bb         14-SEP-05      232.5      232.5      232.5
bb         13-OCT-05     235.25     235.25     467.75
bb         15-OCT-05     365.25      600.5        833
bb         14-NOV-05     465.25     465.25    1298.25
-- another variation
SELECT a.*
,SUM (sale_daily) OVER
    (PARTITION BY id, TRUNC(sale_dt,'MM')
     ORDER BY id ASC, sale_dt ASC
     RANGE UNBOUNDED PRECEDING
    ) mon_sal
,SUM (sale_daily) OVER
    (PARTITION BY id, TRUNC(sale_dt,'Y')
     ORDER BY id ASC, sale_dt ASC
     RANGE UNBOUNDED PRECEDING
    ) yr_sal
FROM
(
SELECT id,sale_dt
,SUM(amount) sale_daily
FROM TEST
GROUP BY id, sale_dt
) a
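The reason the earlier attempt produced whole-month totals: it both partitioned and ordered by trunc(sale_dt,'MM'), so every row in the partition was a peer of every other row, and a RANGE window includes all peers of the current row. Ordering by the detail date restores a true running total; the essential shape, as a sketch against the TEST table above:

```sql
-- Month-to-date and year-to-date running totals: the partition resets
-- the sum at each month/year boundary, and ORDER BY sale_dt makes the
-- default window (RANGE UNBOUNDED PRECEDING) accumulate day by day.
select id, sale_dt,
       sum(amount) over (partition by id, trunc(sale_dt,'MM')
                         order by sale_dt) mon_sal,
       sum(amount) over (partition by id, trunc(sale_dt,'Y')
                         order by sale_dt) yr_sal
  from test;
```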
 
 
 
EXCELLENT ! 
reader, October   08, 2005 - 12:24 am UTC
 
 
Thanks ! 
 
 
to answer your question
A reader, October   11, 2005 - 12:54 pm UTC
 
 
The above pivot worked well; however my counts are off, since
I ONLY want to count the minor when there is no major.
Something like this..
                          major  minor  count
1 major, 0 minor, other     1             1
2 major, 1 minor, other     2             2
0 major, 1 minor, other     0      1      1
so tell me why there are minor counts when major > 0??? 
because when there are majors and minors, I want to count
only the majors. When there is just a minor and no
major, I just want to count the minor. Those are the only 2 situations there should be.
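One way to encode that rule -- count a container's minors only when it has no major -- is to first flag each gate_id for the presence of each damage type, then count conditionally. A sketch using the table and column names from this thread (untested, and the thread maps 'A' and 'F' to major/minor inconsistently, so swap the codes as appropriate):

```sql
select damage_inspection_date,
       damage_inspection_by,
       -- a gate counts as "major" if any major damage exists,
       -- as "minor" only if it has minors but no major
       count(case when has_major = 1 then 1 end) major,
       count(case when has_major = 0 and has_minor = 1 then 1 end) minor,
       count(*) total
  from (select gc.gate_id,
               trunc(gc.damage_inspection_date) damage_inspection_date,
               gc.damage_inspection_by,
               max(case when gd.damage_type_code = 'F' then 1 else 0 end) has_major,
               max(case when gd.damage_type_code = 'A' then 1 else 0 end) has_minor
          from gate_containers gc, gate_damages gd
         where gc.gate_id = gd.gate_id
         group by gc.gate_id,
                  trunc(gc.damage_inspection_date),
                  gc.damage_inspection_by)
 group by damage_inspection_date, damage_inspection_by;
```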
 
 
 
How can I ignore some selected columns in my group by?
Neil, October   12, 2005 - 3:14 am UTC
 
 
Tom,
I have a set of data that is recorded daily and I want to 
compress it; so this:
87654321   1    5 21-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 22-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 23-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 24-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 25-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 26-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 27-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 28-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 29-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 30-AUG-2005 2.7500E+10          0 -1.436E+10 2.7500E+10          0 -1.436E+10
87654321   1    5 31-AUG-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 01-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 02-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 03-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 04-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 05-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 06-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 07-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 08-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 09-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 10-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 11-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 12-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 13-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321   1    5 14-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 15-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 16-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 17-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 18-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 19-SEP-2005          0          0 -1.991E+10          0          0 -1.991E+10
87654321   1    5 20-SEP-2005 5555550000          0 -1.436E+10 5555550000          0 -1.436E+10
87654321   1    5 21-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 22-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 23-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 24-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 25-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
87654321   1    5 26-SEP-2005          0          0 -1.436E+10          0          0 -1.436E+10
Needs to be converted into this:
87654321   1    5 21-AUG-2005 29-AUG-2005          0          0 -4.186E+10          0          0 -4.186E+10
87654321   1    5 30-AUG-2005 12-SEP-2005 2.7500E+10          0 -1.436E+10 2.7500E+10          0 -1.436E+10
87654321   1    5 13-SEP-2005 19-SEP-2005 2.7500E+10 -3.306E+10 -1.991E+10 2.7500E+10 -3.306E+10 -1.991E+10
87654321   1    5 20-SEP-2005 01-JAN-4000 5555550000          0 -1.436E+10 5555550000          0 -1.436E+10
The column of interest is the 7th one. Whenever it changes, 
I want to create a new row beginning with the day's date, 
and ending either on the day before the next change, or, if 
there is no next change (LEAD analytic function), substitute 
in 01-JAN-4000 to show that this is the current amount. 
The problem is, I need to ignore the other figures in columns 5 & 6 and 8 & 9. If I group by all the columns, I get separate
entries for these lines. That's my stumbling block - I have got close with analytics, but so far, no cigar!
I'm on 8.1.7, although I'd be interested in solutions possible in
later versions, too. If you think this is possible, I can paste a create table and SQL*Loader script here, but it would detract from the post: It's a bit of a mess anyway - if only AskTom allowed 132 columns :)
T.I.A
 
 
October   12, 2005 - 7:26 am UTC 
 
"i want to create a new row" - that is hard as analytics don't "create rows", they just don't "squish them out" like an aggregate wou.d
make the example smaller - you don't need all of the columns, seems two or three might suffice.  show the table, the data (via inserts) and the expected output if you like. 
 
 
 
Maybe this should be a GROUP BY question, then
Neil, October   13, 2005 - 4:12 am UTC
 
 
OK - here's the table creation scripts and a couple of loader files. 
My goal is to create a SQL statement to change the old data into the new. 
I can use analytics to give me the start and end dates, but my problem is that I wish to ignore the actin, actout, expin and expout columns and concentrate on the act column. When it changes, I want to take the row, and give it an end date of the day before the date on which it changes again, or the default date of 01-JAN-4000 if no such row exists.
If I could just partition by the earliest date and the latest date where the act figure is the same within serial, volume and part, I could pick off the FIRST and the LAST and use LAG and LEAD to work out the dates...
CREATE TABLE t_old (
 DEPOT  VARCHAR2(6)
,SERIAL VARCHAR2(8)
,VOLUME NUMBER(4)
,PART   NUMBER(2)
,ASAT   DATE
,ACTIN  NUMBER(8)
,ACTOUT NUMBER(8)
,ACT    NUMBER(8)
,EXPIN  NUMBER(8)
,EXPOUT NUMBER(8)
,EXPD   NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_old
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,ASAT
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005       0        0  -1796200       0        0 -1796200
DEPOT1 00822000 6086 5 25-SEP-2005       0        0  -1796200       0        0 -1796200
DEPOT1 00822000 6086 5 26-SEP-2005       0        0  -1796200       0        0 -1796200
DEPOT1 08226111    1 5 29-AUG-2005       0        0  -4185550       0        0 -4185550
DEPOT1 08226111    1 5 30-AUG-2005 2750000        0  -1435550 2750000        0 -1435550
DEPOT1 08226111    1 5 31-AUG-2005       0        0  -1435550       0        0 -1435550
DEPOT1 08226111    1 5 01-SEP-2005       0        0  -1435550       0        0 -1435550
DEPOT1 08226111    1 5 02-SEP-2005       0        0  -1435550       0        0 -1435550
DEPOT1 08226111    1 5 03-SEP-2005 2750000 -3305555  -1991105 2750000 -3305555 -1991105
DEPOT1 08226111    1 5 04-SEP-2005       0        0  -1991105       0        0 -1991105
DEPOT1 08226111    1 5 05-SEP-2005       0        0  -1991105       0        0 -1991105
DEPOT1 08226111    1 5 06-SEP-2005  555555        0  -1435550  555555        0 -1435550
DEPOT1 08226111    1 5 07-SEP-2005       0        0  -1435550       0        0 -1435550
DEPOT1 08226111    1 5 08-SEP-2005       0        0  -1435550       0        0 -1435550
DEPOT1 08226111  420 5 11-SEP-2005       0        0      1150       0        0     1150
DEPOT1 08226111  420 5 12-SEP-2005       0        0      1150       0        0     1150
DEPOT1 08226111  420 5 13-SEP-2005 3329555 -2775150    555555 3329555 -2775150   555555
DEPOT1 08226111  420 5 14-SEP-2005       0        0    555555       0        0   555555
DEPOT1 08226111  420 5 15-SEP-2005       0        0    555555       0        0   555555
DEPOT1 08226111  420 5 16-SEP-2005       0  -555555         0       0  -555555        0
DEPOT1 08226111  420 5 17-SEP-2005       0        0         0       0        0        0
DEPOT1 08226111  495 5 18-SEP-2005       0        0         0       0        0        0
DEPOT1 08226111  495 5 19-SEP-2005  555555        0    555555  555555        0   555555
DEPOT1 08226111  495 5 20-SEP-2005       0        0    555555       0        0   555555
DEPOT1 08226111  495 5 21-SEP-2005       0        0    555555       0        0   555555
DEPOT1 08226111  495 5 22-SEP-2005       0  -555555         0       0  -555555        0
DEPOT1 08226111  495 5 23-SEP-2005       0        0         0       0        0        0
DEPOT1 08226111  664 5 28-AUG-2005       0        0   4228550       0        0  4228550
DEPOT1 08226111  664 5 29-AUG-2005       0        0   4228550       0        0  4228550
DEPOT1 08226111  664 5 30-AUG-2005       0 -2750000   1478550       0 -2750000  1478550
DEPOT1 08226111  664 5 31-AUG-2005       0        0   1478550       0        0  1478550
DEPOT1 08226111  664 5 01-SEP-2005       0        0   1478550       0        0  1478550
CREATE TABLE t_new (
 DEPOT   VARCHAR2(6)
,SERIAL  VARCHAR2(8)
,VOLUME  NUMBER(4)
,PART    NUMBER(2)
,FROM_D  DATE
,UNTIL_D DATE
,ACTIN   NUMBER(8)
,ACTOUT  NUMBER(8)
,ACT     NUMBER(8)
,EXPIN   NUMBER(8)
,EXPOUT  NUMBER(8)
,EXPD    NUMBER(8)
)
/
LOAD DATA
INFILE *
INTO TABLE t_new
TRUNCATE
FIELDS TERMINATED BY WHITESPACE
(DEPOT
,SERIAL
,VOLUME
,PART
,FROM_D
,UNTIL_D
,ACTIN
,ACTOUT
,ACT
,EXPIN
,EXPOUT
,EXPD)
BEGINDATA
DEPOT1 00822000 6086 5 24-SEP-2005 01-JAN-4000       0        0 -1796200       0        0 -1796200
DEPOT1 08226111    1 5 29-AUG-2005 29-AUG-2005       0        0 -4185550       0        0 -4185550
DEPOT1 08226111    1 5 30-AUG-2005 02-SEP-2005 2750000        0 -1435550 2750000        0 -1435550
DEPOT1 08226111    1 5 03-SEP-2005 05-SEP-2005 2750000 -3305555 -1991105 2750000 -3305555 -1991105
DEPOT1 08226111    1 5 06-SEP-2005 01-JAN-4000  555555        0 -1435550  555555        0 -1435550
DEPOT1 08226111  420 5 11-SEP-2005 12-SEP-2005       0        0     1150       0        0     1150
DEPOT1 08226111  420 5 13-SEP-2005 15-SEP-2005 3329555 -2775150   555555 3329555 -2775150   555555
DEPOT1 08226111  420 5 16-SEP-2005 18-SEP-2005       0  -555555        0       0  -555555        0
DEPOT1 08226111  495 5 19-SEP-2005 21-SEP-2005  555555        0   555555  555555        0   555555
DEPOT1 08226111  495 5 22-SEP-2005 01-JAN-4000       0  -555555        0       0  -555555        0
DEPOT1 08226111  664 5 28-AUG-2005 29-AUG-2005       0        0  4228550       0        0  4228550
DEPOT1 08226111  664 5 30-AUG-2005 01-JAN-4000       0 -2750000  1478550       0 -2750000  1478550
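Following up on the point that analytics don't create rows but aggregates squish them out: since the goal here is really to squish runs of identical ACT values into one row, one sketch is to keep only the rows where ACT changes (via LAG) and then close each range with LEAD. This is untested against a live instance, uses only 8.1.6+ analytic functions, and follows the rule exactly as stated - note it produces slightly different boundary rows for volumes 420 and 495 than the posted t_new data, which drops/extends some zero-valued segments.

```sql
-- Keep the first row per (depot, serial, volume, part) plus every row
-- where ACT differs from the previous day's ACT, then use LEAD to end
-- each range the day before the next change (or 01-JAN-4000 if none).
insert into t_new
select depot, serial, volume, part,
       asat from_d,
       nvl( lead(asat) over (partition by depot, serial, volume, part
                             order by asat) - 1,
            to_date('01-JAN-4000','DD-MON-YYYY') ) until_d,
       actin, actout, act, expin, expout, expd
  from ( select t.*,
                lag(act) over (partition by depot, serial, volume, part
                               order by asat) prev_act
           from t_old t )
 where prev_act is null        -- first row for this key
    or act <> prev_act         -- or the day the ACT figure changed
```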
 
 
 
Some Help needed!
A reader, October   18, 2005 - 10:54 am UTC
 
 
Tom,
How can I count the double moves as 1. For example,
in the case of 1690371?
I want to count 
1690371     63   A
1690371     63   X
1690371     64   A
1690371     64   L
I want to count "A" AS ONE MOVE using this query
select trunc(g.damage_inspection_date) damage_inspection_date,
       g.damage_inspection_by, 'MINOR' STATUS,
       count(distinct g.gate_id) cnt
  from gate_containers g,
       gate_damages z
 where g.gate_id = z.gate_id
   and z.damage_type_code = 'A'
   and g.DAMAGE_INSPECTION_BY = 'COLUMBO'
   and trunc(G.damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
 group by trunc(g.damage_inspection_date), g.damage_inspection_by
1690355     59   A    
1690355     59   E
1690371     63   A
1690371     63   X
1690371     64   A
1690371     64   L
1690405     71   A     
1690405     71   I
1690433     71   A     
1690433     71   I
1690486     54   F
1690486     54   L     
1690486     72   F
1690486     72   I
1690540    59    A     
1690540    59    E
1690636    63    A
1690636    63    X
1690781    67    X
 
 
 
One solution
A reader, October   19, 2005 - 9:29 am UTC
 
 
Tom,
Can decode work here...
decode(count(distinct g.gate_id,'A','F',0,NULL)  
 
October   19, 2005 - 9:45 am UTC 
 
I didn't really understand the question right above, nor did I see any table creates or inserts, so I sort of ignored it... 
 
 
 
More information 
A reader, October   19, 2005 - 10:44 am UTC
 
 
create table gate_containers
(gate_id number,
 action varchar2(5),
damage_inspection_date date,
damage_inspection_by varchar2(30))
/
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686439, 'RNC', TO_DATE('06/14/2005 11:16:16', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688372, 'RNC', TO_DATE('06/14/2005 13:26:59', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688374, 'RNC', TO_DATE('06/14/2005 13:27:08', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688235, 'RNC', TO_DATE('06/14/2005 13:18:15', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688609, 'RNC', TO_DATE('06/14/2005 13:43:35', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686827, 'RNC', TO_DATE('06/14/2005 11:42:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688508, 'RNC', TO_DATE('06/14/2005 13:36:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686044, 'RNC', TO_DATE('06/14/2005 10:50:47', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1685720, 'RNC', TO_DATE('06/14/2005 10:27:38', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686276, 'RNC', TO_DATE('06/14/2005 11:05:23', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
CREATE TABLE GATE_DAMAGES
(
  GATE_ID               NUMBER                  NOT NULL,
  DAMAGE_LOCATION_CODE  VARCHAR2(5 BYTE)        NOT NULL,
  DAMAGE_TYPE_CODE      VARCHAR2(5 BYTE)        NOT NULL
)
/
--
--SQL Statement which produced this data:
--  SELECT * FROM GATE_DAMAGES 
--  WHERE ROWNUM < 20
--
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (34, '01', '9');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (34, '02', 'C');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (37, '01', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (62, '05', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (101, '23', 'C');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (183, '99', '9');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '01', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '04', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '07', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '08', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '11', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '18', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '22', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (188, '24', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (189, '01', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (189, '08', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (189, '11', 'B');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (189, '18', 'D');
Insert into GATE_DAMAGES
   (GATE_ID, DAMAGE_LOCATION_CODE, DAMAGE_TYPE_CODE)
 Values
   (189, '22', 'D');
COMMIT;
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686279, 'RNC', TO_DATE('06/14/2005 11:05:34', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686285, 'RNC', TO_DATE('06/14/2005 11:05:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1685831, 'RNC', TO_DATE('06/14/2005 10:36:22', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1685417, 'RNC', TO_DATE('06/14/2005 10:06:00', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1685579, 'RNC', TO_DATE('06/14/2005 10:17:18', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1685828, 'RNC', TO_DATE('06/14/2005 10:34:44', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686007, 'RNC', TO_DATE('06/14/2005 10:47:43', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1686131, 'RNC', TO_DATE('06/14/2005 10:56:42', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
Insert into GATE_CONTAINERS
   (GATE_ID, ACTION, DAMAGE_INSPECTION_DATE, DAMAGE_INSPECTION_BY)
 Values
   (1688019, 'RNC', TO_DATE('06/14/2005 13:05:56', 'MM/DD/YYYY HH24:MI:SS'), 'COLUMBO');
COMMIT;
Tom, 
Let me see if I can try now to explain it better. I am looking for
a report like this..
damage_inspection_date  damage_inspection_by  cnt  minor  major  total
6/12/2005               MUTH                  XX   XXX    XX     XX
Requirements:
MINOR = A
MAJOR = F
TOTAL = everything except 'C'
Also - and this is very important - when there is a major and a minor
on the same gate, I should only count the major, ignoring the minor.
 
 
October   19, 2005 - 12:34 pm UTC 
 
sorry - but I'll need much more "text" than that.  Remember, I haven't been staring at these tables for hours/days, I'm not familiar with your vernacular, I don't know what problem you are trying to solve.
spec it out like we used to in the olden days - someone wrote spec (requirements) and someone else might have written the code from the spec. 
 
 
 
follow up
A reader, October   19, 2005 - 2:51 pm UTC
 
 
GATE_ID    LOC   DAMAGE_TYPE_CODE   (counted as)
1690355    59    A    A    
1690355    59    E        
1690371    63    A    A     
1690371    63    X         
1690371    64    A         
1690371    64    L         
1690405    71    A    A    
1690405    71    I        
1690433    71    A    A    
1690433    71    I        
1690486    54    F    F    
1690486    54    L        
1690486    72    F        
1690486    72    I        
1690540    59    A    A    
1690540    59    E        
1690636    63    A    A    
1690636    63    X        
1690781    67    X         
1687912    56    F    F
1687912    56    I    
1687912    66    A    
1687912    66    X    
I think this is a good example. In this case I got 
A  =   MINOR DAMAGES
F =   MAJOR DAMAGES
TOTAL =  NOT EQUAL TO C ( !C)
1.    If you look at it closely you can see that for 
      every gate_id  where I have  multiple
Major damages just count them as one like for example
Gate_id = 1690371 
2      When you have multiples MINORS (Fs) 
Like gate_id = 1690486  count them as one
3     When you have gate_id with F and A like
Gate_id = 1687912 then just Count F(MAJORS) 
damage_inspection_date  damage_inspection_by  cnt  minor  major  total
6/12/2005               MUTH                  XX   XXX    XXX    XX
 
 
October   19, 2005 - 4:30 pm UTC 
 
you seem to be using F as major:
F =   MAJOR DAMAGES
but also as minor:
When you have multiples MINORS (Fs)
sorry, I'm not being a "hard whatever", I'm not getting it.  step back, pretend you were trying to explain this to your mom.   
 
 
 
this is what I got so far
A reader, October   19, 2005 - 3:02 pm UTC
 
 
Tom,
This is what I have so far, but the query is 
not following the rule for the MINOR damages..
----------------------------------------
select damage_inspection_date,damage_inspection_by, 
       max(decode(status,'MINOR',cnt)) minor, 
       max(decode(status,'MAJOR',cnt)) major, 
       max(decode(status,'TOTAL',cnt)) total 
  from (select  b.damage_inspection_date, 
        b.damage_inspection_by 
       ,b.status 
       ,NVL(a.cnt,0) CNT 
from 
     (select aa.damage_inspection_date, 
             aa.damage_inspection_by, 
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by 
                from gate_damages gd, gate_containers gc 
               where gd.gate_id = gc.gate_id 
             ) aa, 
             (select * 
                from (select 'MAJOR' STATUS from dual 
                     union all 
                      select 'MINOR' STATUS from dual 
                      union all 
                      select 'TOTAL' STATUS from dual 
                     ) 
             ) bb 
     )b, 
     ((SELECT  damage_inspection_date, 
               damage_inspection_by, 
               Status, 
               cnt 
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date, 
                       c.damage_inspection_by, 
                       'MAJOR' STATUS, 
                       count(distinct c.gate_id) cnt 
                  from gate_containers c, 
                       gate_damages d 
                 where c.gate_id = d.gate_id and 
                       d.damage_type_code = 'F' 
              group by trunc(c.damage_inspection_date),c.damage_inspection_by 
               UNION ALL 
                select trunc(g.damage_inspection_date) damage_inspection_date, 
                        g.damage_inspection_by, 
                       'MINOR' STATUS, 
                       count(distinct g.gate_id) 
             from gate_containers g, gate_damages z 
                  where g.gate_id = z.gate_id 
                group by trunc(g.damage_inspection_date),g.damage_inspection_by 
                 UNION ALL 
              select  trunc(ab.damage_inspection_date) damage_inspection_date, 
                        ab.damage_inspection_by, 
                       'TOTAL' STATUS, 
                       count(distinct ab.gate_id) cnt 
                  from gate_containers ab, 
                       gate_damages ac 
                 where ab.gate_id = ac.gate_id(+) and 
                       SUBSTR(ab.action,2,1) != 'C' 
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by 
               ) 
               group by damage_inspection_date, damage_inspection_by, status, cnt 
               ) 
     ) a 
where b.damage_inspection_by = a.damage_inspection_by(+) 
  and b.damage_inspection_date = a.damage_inspection_date(+) 
  and b.status = a.status(+) 
) 
group by damage_inspection_date,damage_inspection_by
 
 
 
SORRY!!
A reader, October   19, 2005 - 3:05 pm UTC
 
 
COPIED THE WRONG QUERY...
This is what I have so far, but the query
is not following the rules for the MINOR
damages as stated above.
select damage_inspection_date,damage_inspection_by, 
       max(decode(status,'MINOR',cnt)) minor, 
       max(decode(status,'MAJOR',cnt)) major, 
       max(decode(status,'TOTAL',cnt)) total 
  from (select  b.damage_inspection_date, 
        b.damage_inspection_by 
       ,b.status 
       ,NVL(a.cnt,0) CNT 
from 
     (select aa.damage_inspection_date, 
             aa.damage_inspection_by, 
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) damage_inspection_date, gc.damage_inspection_by 
                from gate_damages gd, gate_containers gc 
               where gd.gate_id = gc.gate_id 
             ) aa, 
             (select * 
                from (select 'MAJOR' STATUS from dual 
                     union all 
                      select 'MINOR' STATUS from dual 
                      union all 
                      select 'TOTAL' STATUS from dual 
                     ) 
             ) bb 
     )b, 
     ((SELECT  damage_inspection_date, 
               damage_inspection_by, 
               Status, 
               cnt 
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date, 
                       c.damage_inspection_by, 
                       'MAJOR' STATUS, 
                       count(distinct c.gate_id) cnt 
                  from gate_containers c, 
                       gate_damages d 
                 where c.gate_id = d.gate_id and 
                       d.damage_type_code = 'F' 
              group by trunc(c.damage_inspection_date),c.damage_inspection_by 
               UNION ALL 
                select trunc(g.damage_inspection_date) damage_inspection_date, 
                        g.damage_inspection_by, 
                       'MINOR' STATUS, 
                       count(distinct g.gate_id) 
                 from gate_containers g, gate_damages z 
                  where g.gate_id = z.gate_id 
                    and z.damage_type_code = 'A' 
                group by trunc(g.damage_inspection_date),g.damage_inspection_by 
                 UNION ALL 
              select  trunc(ab.damage_inspection_date) damage_inspection_date, 
                        ab.damage_inspection_by, 
                       'TOTAL' STATUS, 
                       count(distinct ab.gate_id) cnt 
                  from gate_containers ab, 
                       gate_damages ac 
                 where ab.gate_id = ac.gate_id(+) and 
                       SUBSTR(ab.action,2,1) != 'C' 
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by 
               ) 
               group by damage_inspection_date, damage_inspection_by, status, cnt 
               ) 
     ) a 
where b.damage_inspection_by = a.damage_inspection_by(+) 
  and b.damage_inspection_date = a.damage_inspection_date(+) 
  and b.status = a.status(+) 
) 
group by damage_inspection_date,damage_inspection_by
 
 
 
follow up
A reader, October   19, 2005 - 4:47 pm UTC
 
 
1.    If you look at it closely you can see that for 
      every gate_id  where I have  multiple
Major damages just count them as one like for example
Gate_id = 1690371 
2      When you have multiples MINORS (Fs) 
Like gate_id = 1690486  count them as one
3     When you have gate_id with F and A ALIKE
Gate_id = 1687912 then just Count F(MAJORS) 
F FOR MAJOR
A FOR MINOR
SOMETIMES IN THE RECORD WILL HAVE MAJOR AND MINORS
I JUST WANT TO COUNT THE MAJOR AND IGNORE THE MINORS.
I HAVE GIVEN 3 EXAMPLES OF THE RULES....DON'T KNOW
WHAT ELSE TO SAY...ALSO PLEASE LOOK AT THE QUERY
IT'S ALL IN THE UNION. THE ONLY PROBLEM THAT I HAVE
IS THAT I AM COUNTING THE MAJOR AND THE MINORS IN 
THE MINOR UNION.
 
 
October   19, 2005 - 4:57 pm UTC 
 
1)  Ok, I'm looking at that gate id:
1690371    63    A    A     
1690371    63    X         
1690371    64    A         
1690371    64    L         
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2)  Ok, I'm looking at that gate id:
1690486    54    F    F    
1690486    54    L        
1690486    72    F        
1690486    72    I       
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get it?
3) Ok, I'm looking at that:
1687912    56    F    F
1687912    56    I    
1687912    66    A    
1687912    66    X    
and we are back to F being a major, not a minor again?  
So, no, I don't get it, it is not clear, you can shout loud, but it won't matter.
(am I the only one not really following this??) 
 
 
 
ANOTHER EXAMPLE!
A reader, October   19, 2005 - 4:55 pm UTC
 
 
1690355    59    A    A    
1690355    59    E        
1690371    63    A    A     
1690371    63    X         
1690371    64    A         
1690371    64    L         
1690405    71    A    A    
1690405    71    I        
1690433    71    A    A    
1690433    71    I        
1690486    54    F    F    
1690486    54    L        
1690486    72    F        
1690486    72    I        
1690540    59    A    A    
1690540    59    E        
1690636    63    A    A    
1690636    63    X        
1690781    67    X         
        A    12    
        F    9    
select trunc(g.damage_inspection_date) damage_inspection_date,
       g.damage_inspection_by, 'MINOR' status,
       count(distinct g.gate_id) cnt
  from gate_containers g,
       gate_damages z
 where g.gate_id = z.gate_id
   and z.damage_type_code = 'A'
   and damage_inspection_by = 'COLUMBO'
   and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
 group by trunc(g.damage_inspection_date), g.damage_inspection_by
result from the query:
damage_inspection_date
6/14/2005   
damage_inspection_by
COLUMBO   
status
MINOR 
CNT
 9 -------- IT SHOULD BE 8. WHY? BECAUSE I WANT TO COUNT 
            THE "A" (MINORS) UNIQUELY, DISTINCTLY. IN OTHER WORDS
            
 
 
follow up
A reader, October   19, 2005 - 5:15 pm UTC
 
 
)  Ok, I'm looking at that gate id:
1690371    63    A    A     
1690371    63    X         
1690371    64    A         
1690371    64    L         
*****this count as one A = MINOR
IF f is for major
AND that gate id is a prime example of multiple majors
THEN where the heck is f?
2)  Ok, I'm looking at that gate id:
1690486    54    F    F    
1690486    54    L        
1690486    72    F        
1690486    72    I 
*******THIS COUNT AS ONE F = MAJOR      
Now, I see F's and you said "F IS FOR MAJOR", but now you are saying this is the 
primary example of multiple MINORS... Maybe I'm being "dumb", but I don't get 
it?
3) Ok, I'm looking at that:
1687912    56    F    F
1687912    56    I    
1687912    66    A    
1687912    66    X    
***** IN THIS CASE IGNORE THE MINOR(A) AND COUNT JUST THE MAJOR(F)
****THE FAR LETTER ON THE RIGHT IS TO SHOW YOU HOW I AM 
CALCULATING WHAT IS ON THE RIGHT...
  
 
October   19, 2005 - 7:44 pm UTC 
 
so, by gate_id compute how many A's and how many F's
select gate_id, count(case when col='A' then col end) A, 
                count(case when col='F' then col end) F
  from t
 group by gate_id;
now you have gate_id and a count of A's and F's
call that Q
select ...
  from (Q);
use CASE to look at A and F and return whatever you want. 
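Putting that outline together against the actual tables - a sketch only, untested, and assuming (per rule 3) that a gate with any F counts once as a major, while a gate with A's but no F counts once as a minor:

```sql
-- Q: per inspection date / inspector / gate, count the A's and F's.
-- Outer query: each gate is counted exactly once - as a MAJOR if it
-- has any F, as a MINOR only if it has A's and no F.
select damage_inspection_date, damage_inspection_by,
       count(case when cnt_f > 0 then 1 end)               major,
       count(case when cnt_f = 0 and cnt_a > 0 then 1 end) minor
  from ( select trunc(g.damage_inspection_date) damage_inspection_date,
                g.damage_inspection_by,
                g.gate_id,
                count(case when z.damage_type_code = 'F' then 1 end) cnt_f,
                count(case when z.damage_type_code = 'A' then 1 end) cnt_a
           from gate_containers g, gate_damages z
          where g.gate_id = z.gate_id
          group by trunc(g.damage_inspection_date),
                   g.damage_inspection_by, g.gate_id )
 group by damage_inspection_date, damage_inspection_by
```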
 
 
 
Year to date Business Day 
Reader, October   19, 2005 - 6:48 pm UTC
 
 
I am trying to calculate the number of days we did business (i.e. sold anything), and then a running total for the year.
CREATE TABLE TEST (ID VARCHAR2(10), sale_dt DATE, amount NUMBER(6,2));
INSERT INTO TEST VALUES ('aa','14-OCT-2005',65.25);
INSERT INTO TEST VALUES ('aa','14-OCT-2005',56.25);
INSERT INTO TEST VALUES ('aa','15-SEP-2005',72.25);
INSERT INTO TEST VALUES ('aa','19-OCT-2005',43.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',67.25);
INSERT INTO TEST VALUES ('bb','13-OCT-2005',235.25);
INSERT INTO TEST VALUES ('bb','15-OCT-2005',365.25);
INSERT INTO TEST VALUES ('bb','14-NOV-2005',465.25);
INSERT INTO TEST VALUES ('bb','14-SEP-2005',165.25);
COMMIT;
SELECT a.*
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
     ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
     ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
     ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mtd_day_of_business
,COUNT(sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
     ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) ytd_day_of_business      
FROM 
(
SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
GROUP BY ID, sale_dt
) a
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL MTD_DAYOFBUS YTD_DAYOFBUS
---------- --------- ---------- ---------- ---------- ------------ ------------
aa         15-SEP-05     216.75     216.75     216.75            1            1
aa         14-OCT-05     299.25     299.25        516            1            2
aa         19-OCT-05     129.75        429     645.75            2            3
bb         14-SEP-05      697.5      697.5      697.5            1            1
bb         13-OCT-05     705.75     705.75    1403.25            1            2
bb         15-OCT-05    1095.75     1801.5       2499            2            3
bb         14-NOV-05    1395.75    1395.75    3894.75            1            4
Ideally, the business days are: Sep - 14, 15; Oct - 13, 14, 15, 19; and one day in Nov.
So the year count should run to 7, and the monthly counts should be Sep 1,2; Oct 1,2,3,4; and Nov 1.
Can this be done using analytic functions, or is there another way?
Thanks 
 
 
October   19, 2005 - 7:56 pm UTC 
 
you mean like this?
ops$tkyte@ORA10GR1> SELECT a.*
  2  ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'MM')
  3       ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) mon_sal
  4  ,SUM (sale_daily) OVER (PARTITION BY ID, TRUNC(sale_dt,'Y')
  5       ORDER BY ID ASC, sale_dt ASC RANGE UNBOUNDED PRECEDING ) yr_sal
  6  ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'MM') order by sale_dt )
  7  mtd_day_of_business
  8  ,COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y') )
  9  ytd_day_of_business
 10  FROM
 11  (
 12  SELECT ID,sale_dt,SUM(amount) sale_daily FROM TEST
 13  GROUP BY ID, sale_dt
 14  ) a
 15  order by sale_dt
 16  /
ID         SALE_DT   SALE_DAILY    MON_SAL     YR_SAL MTD_DAY_OF_BUSINESS YTD_DAY_OF_BUSINESS
---------- --------- ---------- ---------- ---------- ------------------- -------------------
bb         14-SEP-05     2092.5     2092.5     2092.5                   1                   7
aa         15-SEP-05     650.25     650.25     650.25                   2                   7
bb         13-OCT-05    2117.25    2117.25    4209.75                   1                   7
aa         14-OCT-05     1093.5     1093.5    1743.75                   2                   7
bb         15-OCT-05    3287.25     5404.5       7497                   3                   7
aa         19-OCT-05     389.25    1482.75       2133                   4                   7
bb         14-NOV-05    4187.25    4187.25   11684.25                   1                   7
7 rows selected.
 
 
 
 
 
Year to date ..
Reader, October   20, 2005 - 12:27 am UTC
 
 
The months are fine.
But the year count should increment, i.e. 1, 2, 3, 4, 5, 6, 7.
Right now it shows the 7th business day for all transactions.
Thanks 
 
 
October   20, 2005 - 8:06 am UTC 
 
I did that, because you asked for that.
... So the year count should be 7 and the monthly count should be sep 1,2 oct 
1,2,3,4 and Nov 1 .....
add the order by to the year count just like I did for the month.
the order by will make it a running total. 
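Concretely (an untested sketch based on the query above), the year-to-date count would become:

```sql
-- same expression as in the query above, with ORDER BY added so the
-- year-to-date count becomes a running total (1,2,3,...) instead of 7 on every row
COUNT(sale_daily) OVER (PARTITION BY TRUNC(sale_dt,'Y')
                        ORDER BY sale_dt) ytd_day_of_business
```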
 
 
 
Thank you
A reader, October   20, 2005 - 8:56 am UTC
 
 
Tom,
Thank you for your solution above. However,
can you give me a solution using CASE where
I can count A but NOT F?
Thank you again. 
 
October   20, 2005 - 8:59 am UTC 
 
select case when cnta > 0 and cntf > 0 
            then ...
            when cnta = 0 and cntf > 0 
            then ...
            when cnta > 0 and cntf = 0 
            then ...
            end
use a boolean expression after computing the cnt of A and the cnt of F 
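A minimal sketch of that approach, assuming (per the follow-up posts below, not the answer itself) that any 'F' damage makes a gate MAJOR and an 'A' with no 'F' makes it MINOR:

```sql
-- hedged sketch: compute the counts first, then classify with a boolean CASE
select gate_id,
       case when cntf > 0              then 'MAJOR'   -- any F  => major
            when cnta > 0 and cntf = 0 then 'MINOR'   -- A, no F => minor
       end status
  from (select gate_id,
               sum(case when damage_type_code = 'A' then 1 else 0 end) cnta,
               sum(case when damage_type_code = 'F' then 1 else 0 end) cntf
          from gate_damages
         group by gate_id)
```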
 
 
 
SOMETHING LIKE THIS...
A reader, October   20, 2005 - 10:18 am UTC
 
 
Tom,
you mean something like this..
select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by, 
'MINOR' STATUS, 
SUM (case  WHEN z.DAMAGE_TYPE_CODE= 'A' THEN 1 ELSE 0 end) 
from gate_containers g, gate_damages z 
where g.gate_id = z.gate_id 
and z.damage_type_code = 'A' 
AND   damage_inspection_by = 'COLUMBO' 
and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy') 
group by trunc(g.damage_inspection_date),g.damage_inspection_by
 
 
October   20, 2005 - 4:33 pm UTC 
 
sure 
 
 
FINAL SOLUTION
A reader, October   20, 2005 - 11:04 am UTC
 
 
Tom,
here is the problem that I was facing. I hope 
this clears things up. 
SQL Statement which produced this data:
  select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code 
  from gate_containers g, gate_damages z 
  where g.gate_id = z.gate_id 
  and z.damage_type_code = 'A' 
  AND   damage_inspection_by = 'COLUMBO' 
  --and z.gate_id = '1688273' 
  and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy') 
  -------------------------------------------------------------------------
6/14/2005    COLUMBO    MINOR    1688235    A
6/14/2005    COLUMBO    MINOR    1688609    A
6/14/2005    COLUMBO    MINOR    1688273    A------was counting this as a minor when it should be counted as a major
6/14/2005    COLUMBO    MINOR    1686769    A
6/14/2005    COLUMBO    MINOR    1686517    A
6/14/2005    COLUMBO    MINOR    1687985    A
6/14/2005    COLUMBO    MINOR    1686483    A
6/14/2005    COLUMBO    MINOR    1685361    A
6/14/2005    COLUMBO    MINOR    1686414    A
SQL Statement which produced this data:
  select trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by,'MINOR' STATUS, g.gate_id, z.damage_type_code 
  from gate_containers g, gate_damages z 
  where g.gate_id = z.gate_id 
  --and z.damage_type_code = 'A' 
  AND   damage_inspection_by = 'COLUMBO' 
  and z.gate_id = '1688273' 
  and trunc(damage_inspection_date) = to_date('06-14-2005','mm-dd-yyyy')
6/14/2005    COLUMBO    MINOR    1688273    A--------throw away!
6/14/2005    COLUMBO    MINOR    1688273    E
6/14/2005    COLUMBO    MINOR    1688273    C
6/14/2005    COLUMBO    MINOR    1688273    F-------keep
6/14/2005    COLUMBO    MINOR    1688273    I
 
 
 
follow up
A reader, October   20, 2005 - 5:01 pm UTC
 
 
Tom,
I am still not able to get the correct result using
a case statement. Maybe I should use a function to return only the F when there is an F and an A in the record.
select trunc(g.damage_inspection_date) damage_inspection_date, 
g.damage_inspection_by, 
'MINOR' STATUS, 
sum (case  when z.damage_type_code = 'F' then 1 else 0 end) cnt 
from gate_containers g, gate_damages z 
where g.gate_id = z.gate_id and 
z.damage_type_code = 'F' 
group by trunc(g.damage_inspection_date),g.damage_inspection_by 
 
 
 
Can this be done using analytical functions??
A reader, October   25, 2005 - 2:59 pm UTC
 
 
Tom,
I am done with my query....I am looking for a better
approach or how I can improve it. Thanks!!
select damage_inspection_date, damage_inspection_by, 
       max(decode(status,'MINOR',cnt)) minor, 
       max(decode(status,'MAJOR',cnt)) major, 
       max(decode(status,'TOTAL',cnt)) total 
  from (select  b.damage_inspection_date, 
        b.damage_inspection_by 
       ,b.status 
      ,NVL(a.cnt,0) CNT 
from 
     (select aa.damage_inspection_date, 
             aa.damage_inspection_by, 
             bb.status 
        from (select distinct trunc(gc.damage_inspection_date) 
damage_inspection_date, gc.damage_inspection_by 
                from gate_damages gd, gate_containers gc 
               where gd.gate_id = gc.gate_id 
             ) aa, 
             (select * 
                from (select 'MAJOR' STATUS from dual 
                      union all 
                      select 'MINOR' STATUS from dual 
                      union all 
                      select 'TOTAL' STATUS from dual 
                     ) 
             ) bb 
     )b, 
     ((SELECT  damage_inspection_date, 
               damage_inspection_by, 
               Status, 
               cnt 
          FROM (select trunc(c.damage_inspection_date) damage_inspection_date, 
                       c.damage_inspection_by, 
                       'MAJOR' STATUS, 
                       count(distinct c.gate_id) cnt 
                    from gate_containers c, 
                       gate_damages d 
                 where c.gate_id = d.gate_id and 
                       d.damage_type_code = 'F' 
              group by trunc(c.damage_inspection_date),c.damage_inspection_by 
                UNION ALL 
          select  trunc(g.damage_inspection_date) damage_inspection_date, g.damage_inspection_by, 'MINOR' status, 
           count(distinct g.gate_id) cnt 
           from gate_containers g, gate_damages z 
           where g.gate_id = z.gate_id 
           and z.damage_type_code = 'A' 
           and not exists 
          (select z.gate_id from gate_damages z 
           where z.gate_id = g.gate_id 
           and z.damage_type_code = 'F') 
           group by trunc(g.damage_inspection_date),g.damage_inspection_by 
                UNION ALL 
                select  trunc(ab.damage_inspection_date) damage_inspection_date, 
                        ab.damage_inspection_by, 
                       'TOTAL' STATUS, 
                       count(distinct ab.gate_id) cnt 
                  from gate_containers ab, 
                       gate_damages ac 
                   where ab.gate_id = ac.gate_id(+) and 
                       SUBSTR(ab.action,2,1) != 'C' 
              group by trunc(ab.damage_inspection_date),ab.damage_inspection_by 
               ) 
               group by damage_inspection_date, damage_inspection_by, status, cnt 
               ) 
     ) a 
where b.damage_inspection_by = a.damage_inspection_by(+) 
  and b.damage_inspection_date = a.damage_inspection_date(+) 
  and b.status = a.status(+)) 
 group by  damage_inspection_date, damage_inspection_by;
 
 
October   26, 2005 - 11:24 am UTC 
 
sorry - too big to reverse engineer here as a review/followup.... 
 
 
 
Analytic Question
Yoav, November  20, 2005 - 8:16 am UTC
 
 
Hi Tom.
I'm trying to calculate a weighted moving average.
I'm having a problem calculating the values under the SUM_D column.
Can you please demonstrate how to achieve the values that appear under the SUM_D column?
create table test
(stock_date  date,
 close_value number(8,2));
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('02-OCT-2005',759.56);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('29-SEP-2005',753.59);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('28-SEP-2005',749.20);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('27-SEP-2005',741.71);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('26-SEP-2005',729.93);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('25-SEP-2005',719.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('22-SEP-2005',727.30);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('21-SEP-2005',735.81);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('20-SEP-2005',740.38);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('19-SEP-2005',739.86);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('18-SEP-2005',745.48);
INSERT INTO TEST(STOCK_DATE, CLOSE_VALUE) VALUES ('15-SEP-2005',744.65);
COMMIT;
select  RN, day, stock_date,close_value,weight          
from(
    select rownum RN,to_char(stock_date,'d') Day,
           stock_date,close_value,
           (case when to_char(stock_date,'d') = 1 then
                 1*close_value   
             when to_char(stock_date,'d') = 2 then
             2*close_value
             when to_char(stock_date,'d') = 3 then
             3*close_value
             when to_char(stock_date,'d') = 4 then
             4*close_value
             when to_char(stock_date,'d') = 5 then
                     5*close_value
        end) weight
      from( select rownum,stock_date,close_value
              from   test
              order by stock_date asc)
        order by 1
     )
ORDER BY 1     
/
       RN D STOCK_DAT CLOSE_VALUE     WEIGHT    SUM_D
--------- - --------- ----------- ---------- ----------
        1 5 15-SEP-05      744.65    3723.25          5
        2 1 18-SEP-05      745.48     745.48          1 <==
        3 2 19-SEP-05      739.86    1479.72          3
        4 3 20-SEP-05      740.38    2221.14          6
        5 4 21-SEP-05      735.81    2943.24         10 
        6 5 22-SEP-05       727.3     3636.5         15
        7 1 25-SEP-05      719.48     719.48          1 <==
        8 2 26-SEP-05      729.93    1459.86          3
        9 3 27-SEP-05      741.71    2225.13          6
       10 4 28-SEP-05       749.2     2996.8         10 
       11 5 29-SEP-05      753.59    3767.95         15
       RN D STOCK_DAT CLOSE_VALUE     WEIGHT    SUM_D
--------- - --------- ----------- ---------- ----------
       12 1 02-OCT-05      759.56     759.56          1
Thank You. 
 
November  20, 2005 - 8:31 am UTC 
 
would you like to explain what sum_d is to you?  explain the logic behind it. 
 
 
 
Analytic Question
Yoav, November  20, 2005 - 10:10 am UTC
 
 
Hi Tom.
I'm sorry if my explanation wasn't clear enough.
The column SUM_D is actually a "running total" of the column 
Day. 
The thing is that I need to reset the value of the
column SUM_D to 1 at the beginning of each week (Sunday).
select  RN, day, TheDayIs, stock_date
from(
    select rownum RN,to_char(stock_date,'d') Day,
           to_char(stock_date,'Day')TheDayIs, 
           stock_date,close_value
      from( select rownum,stock_date,close_value
              from   test
              order by stock_date asc)
        order by 1
     )
ORDER BY 1     
/
       RN D     SUM_D    THEDAYIS  STOCK_DAT   
--------- -    -----    --------- ---------
        1 5      5    Thursday  15-SEP-05
        2 1      1    Sunday    18-SEP-05
        3 2      3    Monday    19-SEP-05
        4 3      6    Tuesday   20-SEP-05
        5 4     10    Wednesday 21-SEP-05
        6 5     15    Thursday  22-SEP-05
        7 1      1     Sunday    25-SEP-05
        8 2      3     Monday    26-SEP-05
        9 3      6    Tuesday   27-SEP-05
       10 4     10    Wednesday 28-SEP-05
       11 5     15    Thursday  29-SEP-05
       12 1      1    Sunday    02-OCT-05
Thank you for you quick response 
 
 
November  21, 2005 - 8:20 am UTC 
 
you might have to "adjust" your stock_date by a day if 'ww' doesn't group right for you with your NLS settings (sometimes the week ends on a different day depending on your NLS settings - locale issue)
ops$tkyte@ORA9IR2> select row_number() over (order by stock_date) rn,
  2         to_char(stock_date,'d') day,
  3         stock_date,
  4         close_value,
  5         to_number(to_char(stock_date,'d'))*close_value weight,
  6          sum(to_number(to_char(stock_date,'d')))  
                  over (partition by to_char(stock_date,'ww') 
                            order by stock_date) sum_d
  7    from t
  8   order by stock_date
  9  /
 
        RN D STOCK_DAT CLOSE_VALUE     WEIGHT      SUM_D
---------- - --------- ----------- ---------- ----------
         1 5 15-SEP-05      744.65    3723.25          5
         2 1 18-SEP-05      745.48     745.48          1
         3 2 19-SEP-05      739.86    1479.72          3
         4 3 20-SEP-05      740.38    2221.14          6
         5 4 21-SEP-05      735.81    2943.24         10
         6 5 22-SEP-05       727.3     3636.5         15
         7 1 25-SEP-05      719.48     719.48          1
         8 2 26-SEP-05      729.93    1459.86          3
         9 3 27-SEP-05      741.71    2225.13          6
        10 4 28-SEP-05       749.2     2996.8         10
        11 5 29-SEP-05      753.59    3767.95         15
        12 1 02-OCT-05      759.56     759.56          1
 
12 rows selected.
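As an aside (my suggestion, not part of the original answer): instead of adjusting the date fed to 'ww', the week bucket can be built with TRUNC(date,'IW'), which truncates to the ISO week start (Monday) independently of NLS settings; adding one day shifts the effective boundary back to Sunday:

```sql
-- hedged sketch: NLS-independent Sunday-start week bucket for the running total
-- (to_char(stock_date,'d') itself remains NLS-dependent, of course)
sum(to_number(to_char(stock_date,'d')))
    over (partition by trunc(stock_date + 1, 'IW')
              order by stock_date) sum_d
```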
 
 
 
 
 
Analytics Question
Yoav, November  21, 2005 - 7:41 am UTC
 
 
Hi Tom.
I'm sorry for wasting your time.
I found the solution.
select RN, day, week_no,
         sum(day) over
         (partition by week_no
         order by day) sum_d,
         stock_date
from(
select  RN, day, week_no, stock_date
from(select rownum RN,to_char(stock_date,'d') Day,
       to_char(stock_date,'ww') week_no,
       stock_date,close_value,
       (case when to_char(stock_date,'d') = 1 then
             1*close_value
      when to_char(stock_date,'d') = 2 then
      2*close_value
      when to_char(stock_date,'d') = 3 then
      3*close_value
      when to_char(stock_date,'d') = 4 then
      4*close_value
      when to_char(stock_date,'d') = 5 then
                    5*close_value
 end) weight
from( select rownum,'Y',stock_date,close_value
             from   test
             order by stock_date asc)
       order by 1)
)
order by 1
/
    RN D WE      SUM_D STOCK_DAT
------ - -- ---------- ---------
     1 5 37          5 15-SEP-05
     2 1 38          1 18-SEP-05
     3 2 38          3 19-SEP-05
     4 3 38          6 20-SEP-05
     5 4 38         10 21-SEP-05
     6 5 38         15 22-SEP-05
     7 1 39          1 25-SEP-05
     8 2 39          3 26-SEP-05
     9 3 39          6 27-SEP-05
    10 4 39         10 28-SEP-05
    11 5 39         15 29-SEP-05
    RN D WE      SUM_D STOCK_DAT
------ - -- ---------- ---------
    12 1 40          1 02-OCT-05
Thank You. !! 
 
November  21, 2005 - 8:52 am UTC 
 
see above, you can skip lots of steps here! 
 
 
 
Analytics Question
Yoav, November  22, 2005 - 5:29 am UTC
 
 
Tom.
Your solution is better than mine.
Thank you ! 
 
 
Could you please help me with this 
A reader, November  29, 2005 - 4:17 am UTC
 
 
I am trying to output a report with different aggregates for different price ranges
create table t(
id number(3),
year number(4),
month number(2),
slno  number(2),
colorcd number(2),
sizecd number(2),
itemid number(4),
prdno number(3),
price  number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2), 
itemname varchar2(10));
insert into t values (1,2005,1,1,1,10,1000,101,150,100,10);
insert into t values (1,2005,1,2,1,11,1000,101,150,120,2);
insert into t values (1,2005,1,3,1,12,1000,101,150,100,10);
insert into t values (1,2005,1,4,1,13,1000,102,150,200,2);
insert into t values (1,2005,2,5,2,10,1000,102,150,100,20);
insert into t values (1,2005,2,6,2,11,1000,102,150,100,12);
insert into t values (1,2005,2,7,3,10,1000,103,150,100,20);
insert into t values (1,2005,3,8,4,10,1000,103,150,100,22);
insert into t values (1,2005,4,9,4,11,1000,103,150,100,12);
insert into t values (1,2005,1,10,5,10,1000,104,450,100,10);
insert into t values (1,2005,1,11,5,11,1000,104,450,120,2);
insert into t values (1,2005,1,12,5,12,1000,104,450,100,10);
insert into t values (1,2005,1,13,5,13,1000,104,450,200,2);
insert into t values (1,2005,2,14,5,14,1000,104,450,100,20);
insert into t values (1,2005,1,15,6,10,1001,105,150,100,10);
insert into t values (1,2005,1,16,6,11,1001,105,150,120,2);
insert into t values (1,2005,1,17,6,12,1001,105,150,100,10);
insert into t values (1,2005,1,18,6,13,1001,105,150,200,2);
insert into t values (1,2005,2,19,7,10,1001,105,150,100,20);
insert into t values (1,2005,2,20,7,11,1002,106,400,100,12);
insert into t values (1,2005,2,21,8,10,1002,106,400,100,20);
insert into t values (1,2005,3,22,9,10,1002,107,400,100,22);
insert into t values (1,2005,4,23,10,11,1002,107,400,100,12);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
Desc                               Itemname   <199   <299   <399   <499
-----------------------------------------------------------------------
Count of distinct prdnos           Item0         3   null   null      1
Count of distinct prdnos
  (group by colorcd, sizecd)                     3   null   null      1
Sum of sl_qty (group by itemid)                110   null   null     44

Count of distinct prdnos           Item1         1   null   null   null
Count of distinct prdnos
  (group by colorcd, sizecd)                     1   null   null   null
Sum of sl_qty (group by itemid)                 44   null   null   null

Count of distinct prdnos           Item2      null   null   null      2
Count of distinct prdnos
  (group by colorcd, sizecd)                  null   null   null      1
Sum of sl_qty (group by itemid)               null   null   null     66
Is this possible? The Desc column is not needed, and 'null' should be blank.
Thank you 
 
November  29, 2005 - 10:22 am UTC 
 
I don't get the "group by colorcd, sizecd" bit.  If you group by those attributes, you'll get a row per unique ITEMNAME, COLORCD, SIZECD.
I don't understand the logic. 
 
 
 
A reader, November  29, 2005 - 10:48 am UTC
 
 
Dear Tom,
For each unique combination of ITEMNAME, COLORCD and SIZECD, the PRODNOs repeat, don't they? I need a count of distinct PRODNOs.
COLORCD-SIZECD-ITEMID-PRODNO in that order, please see below
first group
-------------------------
1-10-1000-101
1-11-1000-101
1-12-1000-101
----------------------
second group
----------------------
1-13-1000-102
2-10-1000-102
2-11-1000-102
Both of these groups come under the price range < 199, so the distinct count of PRODNO for price range < 199 = 2.
Hope this makes sense. 
Thank you 
 
November  30, 2005 - 10:46 am UTC 
 
not understanding how this gets down to a single row.  I did not get it.
why would that be different than the count of distinct prodno's by itemid. 
 
 
 
how about this query
steve, November  30, 2005 - 8:04 pm UTC
 
 
Hi Tom,
Is there simple way to do it by analytic function?
select dept_num, id, sum(curr_adj_qty)
from
(
  select dept_num, id, sum(current_adjust_qty) curr_adj_qty
    from  adjust
   where applied_ind = 'N'
     and expired_ind = 'N'
   group by dept_num, id
   UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
    from  adjust
   where expired_ind = 'N'
     and applied_ind = 'Y'
    group by dept_num, id
) adj_tmp
group by dept_num, id
Thanks a lot!
Steve 
 
November  30, 2005 - 9:17 pm UTC 
 
hows about you 
a) set up a small example
b) explain what it is supposed to do in text (so we don't have to reverse engineer what you might have been thinking) 
 
 
 
[RE] to Steve NYC
Marcio Portes, November  30, 2005 - 10:43 pm UTC
 
 
Maybe he is looking for this:
ops$marcio@LNX10GR2> select dept_num, id, sum(curr_adj_qty)
  2  from
  3  (
  4    select dept_num, id, sum(current_adjust_qty) curr_adj_qty
  5      from  adjust
  6     where applied_ind = 'N'
  7       and expired_ind = 'N'
  8     group by dept_num, id
  9     UNION
 10  select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
 11      from  adjust
 12     where expired_ind = 'N'
 13       and applied_ind = 'Y'
 14      group by dept_num, id
 15  ) adj_tmp
 16  group by dept_num, id
 17  /
     DEPT_NUM            ID SUM(CURR_ADJ_QTY)
------------- ------------- -----------------
            1             0               185
            1             2               186
            0             2                77
            0             0                81
            1             1               165
            0             1                56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2815735809
--------------------------------------------------------------------------------
|Id  | Operation              | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |        |    10 |   390 |    11  (46)| 00:00:01 |
|  1 |  HASH GROUP BY         |        |    10 |   390 |    11  (46)| 00:00:01 |
|  2 |   VIEW                 |        |    10 |   390 |    10  (40)| 00:00:01 |
|  3 |    SORT UNIQUE         |        |    10 |   120 |    10  (70)| 00:00:01 |
|  4 |     UNION-ALL          |        |       |       |            |          |
|  5 |      HASH GROUP BY     |        |     5 |    60 |     5  (40)| 00:00:01 |
|* 6 |       TABLE ACCESS FULL| ADJUST |   250 |  3000 |     3   (0)| 00:00:01 |
|  7 |      HASH GROUP BY     |        |     5 |    60 |     5  (40)| 00:00:01 |
|* 8 |       TABLE ACCESS FULL| ADJUST |   250 |  3000 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - filter("APPLIED_IND"='N' AND "EXPIRED_IND"='N')
   8 - filter("EXPIRED_IND"='N' AND "APPLIED_IND"='Y')
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
        624  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed
ops$marcio@LNX10GR2>
ops$marcio@LNX10GR2> select dept_num, id,
  2         sum( case when applied_ind = 'N'
  3                   then current_adjust_qty
  4                   else 0 end )
  5         - sum( case when applied_ind = 'Y'
  6                   then current_adjust_qty
  7                   else 0 end ) curr_adj_qty
  8    from adjust
  9   where expired_ind = 'N'
 10   group by dept_num, id
 11  /
     DEPT_NUM            ID  CURR_ADJ_QTY
------------- ------------- -------------
            1             0           185
            1             2           186
            0             2            77
            1             1           165
            0             0            81
            0             1            56
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3658272021
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     5 |    60 |     4  (25)| 00:00:01 |
|   1 |  HASH GROUP BY     |        |     5 |    60 |     4  (25)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| ADJUST |   500 |  6000 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("EXPIRED_IND"='N')
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        622  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          6  rows processed
I used this script to produce output above.
set echo on
drop table adjust purge;
create table
adjust (
        dept_num                int,
        id                      int,
        applied_ind             char(1),
        expired_ind             char(1),
        current_adjust_qty      int
);
insert /*+ append */ into adjust
with v
as ( select level l from dual connect by level <= 1000 )
select mod(l, 2), mod(l, 3),
       decode(mod(l,8), 0, 'Y', 'N'),
       decode(mod(l,5), 0, 'N', 'Y'),
       trunc(dbms_random.value(1,10.000))
  from v
/
commit;
exec dbms_stats.gather_table_stats( user, 'adjust' )
set autotrace on
select dept_num, id, sum(curr_adj_qty)
from
(
  select dept_num, id, sum(current_adjust_qty) curr_adj_qty
    from  adjust
   where applied_ind = 'N'
     and expired_ind = 'N'
   group by dept_num, id
   UNION
select dept_num,id,(sum(current_adjust_qty)*-1)curr_adj_qty
    from  adjust
   where expired_ind = 'N'
     and applied_ind = 'Y'
    group by dept_num, id
) adj_tmp
group by dept_num, id
/
select dept_num, id,
       sum( case when applied_ind = 'N'
                 then current_adjust_qty
                 else 0 end )
       - sum( case when applied_ind = 'Y'
                 then current_adjust_qty
                 else 0 end ) curr_adj_qty
  from adjust
 where expired_ind = 'N'
 group by dept_num, id
/
set autotrace off
set echo off
 
 
 
Multiple aggregates
Raj, December  01, 2005 - 7:45 am UTC
 
 
Dear Tom,
  This is continuing my previous post, where the given data for the problem was wrong; I was trying to make a sample test case. Here are the requirements, along with the create table statements and the corrected data.
This is to output sales figures for a given period of different products. 
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD 
3. PRICE
4. Count of distinct PRODNOs for an item group by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item group by PRICE
6. Total SL_QTY for an item group by PRICE
7. Total SL_QTY*PRICE for an item group by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
create table t(
id number(3),
slno  number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price  number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2), 
itemname varchar2(10));
With Items as(
    select
        itemid, displaycd, itemname
    from
        p),
DistinctCounts as(
    select
        min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
    from
        (select
            distinct prdno,colorcd, sizecd, price, itemid
         from
            t
         where
            id = 1 and
            year= 2005 and
            month in(1,2,3)
         order by
            price, prdno, colorcd, sizecd )
    group by price),
Aggregates as(
    select
            price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3, 
        sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
    from
    (
        select
            itemid, month, prdno,colorcd, sizecd,
            count(distinct prdno) over (partition by price) c1,
            sl_qty c3,
            sl_qty*price c4,
            price,
            trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7  c6
        from
            t
        where
            id = 1 and
            year= 2005 and
            month in(1,2,3)
        order by
            prdno,colorcd, sizecd)
        group by
            price)
select
    a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
    c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price, 
    avg(c6) avg_trend
from 
    DistinctCounts d, Aggregates a, Items i
where
    d.prdno=a.prdno and
    d.price=a.price and
    i.itemid=d.itemid
group by
    a.price,a.itemid, i.itemname, c2
order by 
    a.itemid, i.itemname, a.price
/
With this query I am able to get the report like this,
    ITEMID ITEMNAME     PRICE  PRDNO_CNT  SKU_CNT  SALE_CNT  SALE_PRICE  AVG_PRICE  AVG_TREND
---------- ---------- ------- ---------- -------- --------- ----------- ---------- ----------
      1000 Item0          150          2        8       280       42000        150       40.0
      1000 Item0          450          1        4       110       49500        450       46.6
      1001 Item1          350          1        5       270       94500        350       32.6
Is it possible to get the report in the following format, including the non-matching item names, with null cells shown as blanks?
Itemname    <199    <299    <399    <499
----------------------------------------
Item0          2    null    null       1
               8    null    null       4
             280    null    null     110
           42000    null    null   49500
             150    null    null     450
            40.0    null    null    46.6

Item1       null    null       1    null
            null    null       5    null
            null    null     270    null
            null    null   94500    null
            null    null     350    null
            null    null    32.6    null

Item2       null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null

Item3       null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null
            null    null    null    null
Many thanks for your help and patience 
 
 
Sorry 
Raj, December  01, 2005 - 9:45 pm UTC
 
 
Dear Tom,
  I am sorry, I didn't post the insert statements. Sorry for being careless. I executed everything on my system, formatted and copied it piece by piece, previewed and reread it before posting, and still missed it. I will repost the requirements below.
This is to output sales figures for a given period of different products. 
The output format should be,
1. ITEMNAME - All the items from item table whether a match occurs or not.
2. DISPLAYCD 
3. PRICE
4. Count of distinct PRODNOs for an item grouped by PRICE
5. Total count of distinct( PRODNO+COLORCD+SIZECD) for an item grouped by PRICE
6. Total SL_QTY for an item grouped by PRICE
7. Total SL_QTY*PRICE for an item grouped by PRICE
8. Avg of PRICE for an item
9. Avg of (ST_QTY/SL_QTY) * 7 for an item
drop table t;
drop table p;
create table t(
id number(3),
slno  number(2),
year number(4),
month number(2),
itemid number(4),
prdno number(3),
colorcd number(2),
sizecd number(2),
price  number(4),
st_qty number(3),
sl_qty number(3),
constraint pk_t primary key(id, year, month, slno));
create table p(
itemid number(4) primary key,
displaycd varchar2(2), 
itemname varchar2(10));
insert into t values (1,1,2005,1,1000,101,1,10,150,90,10);
insert into t values (1,2,2005,1,1000,101,1,11,150,80,20);
insert into t values (1,3,2005,1,1000,101,1,12,150,90,10);
insert into t values (1,4,2005,1,1000,101,1,13,150,80,20);
insert into t values (1,5,2005,2,1000,101,1,10,150,80,20);
insert into t values (1,25,2005,1,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,1,1000,104,1,13,150,80,20);
insert into t values (1,25,2005,2,1000,104,1,11,150,80,20);
insert into t values (1,27,2005,2,1000,104,1,13,150,80,20);
insert into t values (1,26,2005,2,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,2,1000,104,1,10,150,90,10);
insert into t values (1,26,2005,1,1000,104,1,12,150,90,10);
insert into t values (1,24,2005,1,1000,104,1,10,150,90,10);
insert into t values (1,6,2005,2,1000,101,1,11,150,60,40);
insert into t values (1,7,2005,2,1000,101,1,12,150,80,20);
insert into t values (1,14,2005,2,1000,101,1,13,150,80,20);
insert into t values (1,15,2005,1,1001,103,1,10,350,90,10);
insert into t values (1,23,2005,3,1001,103,1,11,350,10,90);
insert into t values (1,22,2005,3,1001,103,1,10,350,90,10);
insert into t values (1,21,2005,2,1001,103,1,11,350,80,20);
insert into t values (1,20,2005,2,1001,103,1,10,350,80,20);
insert into t values (1,19,2005,1,1001,103,1,14,350,80,20);
insert into t values (1,18,2005,1,1001,103,1,13,350,70,30);
insert into t values (1,17,2005,1,1001,103,1,12,350,40,60);
insert into t values (1,16,2005,1,1001,103,1,11,350,90,10);
insert into t values (1,8,2005,1,1000,102,1,10,450,80,20);
insert into t values (1,9,2005,1,1000,102,1,11,450,90,10);
insert into t values (1,10,2005,1,1000,102,1,12,450,90,10);
insert into t values (1,11,2005,1,1000,102,1,13,450,90,10);
insert into t values (1,12,2005,2,1000,102,1,10,450,80,10);
insert into t values (1,13,2005,2,1000,102,1,11,450,50,50);
insert into p values(1000,'AA','Item0');
insert into p values(1001,'AB','Item1');
insert into p values(1002,'AC','Item2');
insert into p values(1003,'AD','Item3');
commit;
With Items as(
    select
        itemid, displaycd, itemname
    from
        p),
DistinctCounts as(
    select
        min(itemid) itemid, min(prdno) prdno, count(prdno) c2, price
    from
        (select
            distinct prdno,colorcd, sizecd, price, itemid
         from
            t
         where
            id = 1 and
            year= 2005 and
            month in(1,2,3)
         order by
            price, prdno, colorcd, sizecd )
    group by price),
Aggregates as(
    select
            price, min(itemid) itemid, min(prdno) prdno, max(c1) c1, sum(c3) c3,
            sum(c4) c4, avg(price) c5, trunc(avg(c6),1) c6
    from
    (
        select
            itemid, month, prdno,colorcd, sizecd,
            count(distinct prdno) over (partition by price) c1,
            sl_qty c3,
            sl_qty*price c4,
            price,
            trunc(st_qty/decode(sl_qty,0,1,sl_qty),1)*7  c6
        from
            t
        where
            id = 1 and
            year= 2005 and
            month in(1,2,3)
        order by
            prdno,colorcd, sizecd)
        group by
            price)
select
    a.itemid, i.itemname, a.price, sum(c1) prdno_cnt,
    c2 sku_cnt, sum(c3) sale_cnt, sum(c4) sale_price, avg(c5) avg_price,
    avg(c6) avg_trend
from
    DistinctCounts d, Aggregates a, Items i
where
    d.prdno=a.prdno and
    d.price=a.price and
    i.itemid=d.itemid
group by
    a.price,a.itemid, i.itemname, c2
order by
    a.itemid, i.itemname, a.price
/
ITEMID ITEMNAME  PRICE PRDNO_CNT  SKU_CNT  SALE_CNT  SALE_PRICE  AVG_PRICE  AVG_TREND
------ -------- ------ --------- -------- --------- ----------- ---------- ----------
  1000 Item0       150         2        8       280       42000        150         40
  1000 Item0       450         1        4       110       49500        450         47
  1001 Item1       350         1        5       270       94500        350         33
Is it possible to get the report in the following format, including the non-matching item names, with null cells shown as blanks?
Itemname    <199    <299     <399    <499
------------------------------------------------
Item0        2      null     null       1
             8      null     null       4
           280      null     null     110
         42000      null     null   49500
           150      null     null     450
          40.0      null     null    46.6
                    
Item1     null      null        1     null
          null      null        5     null
          null      null      270     null
          null      null    94500     null     
          null      null      350     null
          null      null     32.6     null
Item2     null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
Item3     null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
          null      null     null    null
Thanking you
  
 
December  02, 2005 - 10:48 am UTC 
 
it wasn't just that - it was "this was too big to answer in a couple of seconds, and since I get over 1,000 of these a month, I cannot spend too much time on each one; I'd rather take NEW questions sometimes"
 
 
 
 
a single query to collapse date ranges
Bob Lyon, December  06, 2005 - 12:56 pm UTC
 
 
Tom,
I know what I want to do but can't quite get my mind around the syntax...
We want a single query to collapse date ranges under the assumption that a date range that
starts later than another range has a better value.
So given this test case
CREATE GLOBAL TEMPORARY TABLE RDL (
DATE_FROM DATE,
DATE_TO   DATE,
VALUE     NUMBER
);
INSERT INTO RDL VALUES (TO_DATE('01/03/2005', 'MM/DD/YYYY'), TO_DATE('01/12/2005', 'MM/DD/YYYY'), 5);
INSERT INTO RDL VALUES (TO_DATE('01/05/2005', 'MM/DD/YYYY'), TO_DATE('01/10/2005', 'MM/DD/YYYY'), 8);
-- I assume the innermost subquery would
-- use the DUAL CONNECT BY LEVEL trick to generate individual days for each grouping
-- ORDER BY DATE_FROM
1  01/03/2005 01/04/2005 5 
1  01/04/2005 01/05/2005 5 
1  01/05/2005 01/06/2005 5 
1  01/06/2005 01/07/2005 5 
1  01/07/2005 01/08/2005 5 
1  01/08/2005 01/09/2005 5 
1  01/09/2005 01/10/2005 5 
1  01/10/2005 01/11/2005 5 
1  01/11/2005 01/12/2005 5 
2  01/05/2005 01/06/2005 8 
2  01/06/2005 01/07/2005 8 
2  01/07/2005 01/08/2005 8 
2  01/08/2005 01/09/2005 8 
2  01/09/2005 01/10/2005 8 
-- an outer subquery would use analytics to get the max grouping
1  01/03/2005 01/04/2005 5 
1  01/04/2005 01/05/2005 5 
2  01/05/2005 01/06/2005 8 
2  01/06/2005 01/07/2005 8 
2  01/07/2005 01/08/2005 8 
2  01/08/2005 01/09/2005 8 
2  01/09/2005 01/10/2005 8 
1  01/10/2005 01/11/2005 5 
1  01/11/2005 01/12/2005 5 
-- And the outermost subquery would use analytics to collapse the dates into contiguous groups
-- for the desired result
1  01/03/2005 01/05/2005 5  
2  01/05/2005 01/10/2005 8  
1  01/10/2005 01/12/2005 5  
The trick is to do all of the above in a single query!
Any suggestions? (Yeah, I know, REALLY learn analytics!)
Thanks in advance,
Bob Lyon 
 
 
OK, I think I got it
Bob Lyon, December  06, 2005 - 2:25 pm UTC
 
 
SELECT d date_from, d2 date_to, value
FROM (
   SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
   FROM (
   SELECT DATE_FROM D, VALUE FROM RDL
   UNION 
   SELECT DATE_TO   D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
   )
)   
WHERE D2 IS NOT NULL
/
DATE_FROM         DATE_TO                VALUE
----------------- ----------------- ----------
01/03/05 00:00:00 01/05/05 00:00:00          5
01/05/05 00:00:00 01/10/05 00:00:00          8
01/10/05 00:00:00 01/12/05 00:00:00          5
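Bob's endpoint trick reads naturally as a procedure. Below is an illustrative Python sketch (sample data mirrors the two RDL rows above; all names are my own): each range contributes its start date with its own value and its end date with the previous range's value (the LAG step), then consecutive points are paired up (the LEAD step).

```python
from datetime import date

# Sample data mirroring the two RDL rows above; a later-starting range is
# assumed to carry the better value.
rdl = [(date(2005, 1, 3), date(2005, 1, 12), 5),
       (date(2005, 1, 5), date(2005, 1, 10), 8)]

def collapse(ranges):
    # Sort by DATE_FROM so each DATE_TO can be paired with the value of
    # the range that started just before it.
    ranges = sorted(ranges, key=lambda r: r[0])
    points = []
    for i, (d_from, d_to, value) in enumerate(ranges):
        points.append((d_from, value))                  # a start keeps its own value
        prev = ranges[i - 1][2] if i > 0 else None      # LAG(value) ordered by date_from
        points.append((d_to, prev))                     # an end resumes the prior value
    points.sort(key=lambda p: p[0])
    # LEAD(d): each point spans until the next one; the last point has no lead.
    return [(d, points[i + 1][0], v)
            for i, (d, v) in enumerate(points[:-1]) if v is not None]
```

Like Bob's query, this sketch only handles a single level of overlap, which is exactly the limitation Tom's follow-up demonstrates.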
 
 
December  06, 2005 - 3:50 pm UTC 
 
depends on how many overlaps you allow, take your create and:
...
ops$tkyte@ORA9IR2> INSERT INTO RDL VALUES (TO_DATE('01/06/2005', 'MM/DD/YYYY'),
  2  TO_DATE('01/7/2005', 'MM/DD/YYYY'), 99);
1 row created.
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2>
ops$tkyte@ORA9IR2> SELECT d date_from, d2 date_to, value
  2  FROM (
  3     SELECT D, LEAD (d) OVER (ORDER BY D) d2, VALUE
  4     FROM (
  5     SELECT DATE_FROM D, VALUE FROM RDL
  6     UNION
  7     SELECT DATE_TO   D, LAG (VALUE) OVER (ORDER BY DATE_FROM) VALUE FROM RDL
  8     )
  9  )
 10  WHERE D2 IS NOT NULL
 11  /
DATE_FROM DATE_TO        VALUE
--------- --------- ----------
03-JAN-05 05-JAN-05          5
05-JAN-05 06-JAN-05          8
06-JAN-05 07-JAN-05         99
07-JAN-05 10-JAN-05          8
10-JAN-05 12-JAN-05          5
so, maybe we expand out and keep the row we want:
ops$tkyte@ORA9IR2> with
  2  data
  3  as
  4  (select level-1 l
  5    from (select max(date_to-date_from+1) n from rdl) n
  6  connect by level <= n)
  7  select rdl.date_from+l,
  8         to_number( substr( max( to_char(date_from,'yyyymmdd') ||  value ), 9 ) ) value
  9    from rdl, data
 10   where data.l <= rdl.date_to-rdl.date_from
 11   group by rdl.date_from+l
 12  ;
RDL.DATE_      VALUE
--------- ----------
03-JAN-05          5
04-JAN-05          5
05-JAN-05          8
06-JAN-05         99
07-JAN-05         99
08-JAN-05          8
09-JAN-05          8
10-JAN-05          8
11-JAN-05          5
12-JAN-05          5
10 rows selected.
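The expand-and-keep approach can be sketched procedurally as well. This Python fragment is illustrative only (the data mirrors Tom's example; function names are assumptions): it expands each range to individual days and, per day, keeps the value of the range with the greatest DATE_FROM, just as max(to_char(date_from,'yyyymmdd') || value) does per group.

```python
from datetime import date, timedelta

# Tom's data: the original two ranges plus the (01/06 - 01/07, 99) row.
rdl = [(date(2005, 1, 3), date(2005, 1, 12), 5),
       (date(2005, 1, 5), date(2005, 1, 10), 8),
       (date(2005, 1, 6), date(2005, 1, 7), 99)]

def expand(ranges):
    # One entry per covered day; among overlapping ranges, keep the value of
    # the range with the greatest DATE_FROM -- the procedural analogue of
    # max(to_char(date_from,'yyyymmdd') || value) ... group by day.
    best = {}
    for d_from, d_to, value in ranges:
        day = d_from
        while day <= d_to:
            if day not in best or d_from > best[day][0]:
                best[day] = (d_from, value)
            day += timedelta(days=1)
    return {day: v for day, (_, v) in best.items()}

days = expand(rdl)
```

Collapsing the per-day rows back into contiguous ranges is then a simple ordered scan over `days`.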
 
 
 
 
 
Select with Analytics Working Partially
denni50, December  06, 2005 - 4:34 pm UTC
 
 
Hi Tom
have a question with the below script that is puzzling.
I'm using the 4 idnumbers as test data. I'm looking to select the most recent record where the appealcode is like '_R%' for 2005.
when I run the script it only pulls 2 of the idnumbers.
I've been looking at this and can't see why the other two are being bypassed. I'm trying to use more analytics in my code; it's working for two records and not the other two.
any tips/help greatly appreciated.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
  2  paydate,payamount,transnum,ltransnum,appealcode
  3    from (
  4  select x.*, row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
 
  5    from (
  6  select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
  7  payamount, transnum,ltransnum,appealcode,
  8  max(paydate) over (partition by idnumber) maxpd
  9         from payment
 10   where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
 11         ) x
 12    where appealcode like '_R%' 
 13    and paydate=maxpd
 14    and idnumber in(4002401,4004594,5406454,5618190)
 15         )
 16   where rn = 1;
  IDNUMBER USER FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM  LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
   4004594 ACDC May2005 17-MAY-05          0   10159410   10086183 DRE0505
   5618190 ACDC Mar2005 11-MAR-05          0    9918802    9845638 DRJ0503
SQL> 
**** 4 TEST IDNUMBERS FROM BASE TABLE*********************
SQL> select idnumber,appealcode,paydate,payamount,transnum
  2  from payment where appealcode like '_R%'
  3  and idnumber=4004594 order by paydate desc;
  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   4004594 DRE0505    17-MAY-05          0   10159410
   4004594 GRG0502    08-FEB-05          0    9804766
   4004594 GRF0501    31-JAN-05          0    9750332
   4004594 GRK0410    01-NOV-04          0    9303510
   4004594 GRC0403    19-MAR-04          0    8371053
   4004594 GRA0305    12-AUG-03          0    7543911
   4004594 GRG0303    16-APR-03          0    7209503
   4004594 GRA0301    16-FEB-03          0    7026840
  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   4002401 GRG0502    16-MAR-05         25    9862647
   4002401 GRG0502    23-FEB-05          0    9826142
   4002401 GRA0501    19-JAN-05          0    9712904
   4002401 GRF0412    05-JAN-05          0    9630884
   4002401 GRK0410    21-OCT-04          0    9299106
   4002401 GRG0303    03-MAR-03          0    7066423
   4002401 GRA0301    09-FEB-03          0    7022121
  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   5406454 DRJ0503    03-MAR-05          0    9887770
   5406454 DRG0502    28-FEB-05          0    9870637
  IDNUMBER APPEALCODE PAYDATE    PAYAMOUNT   TRANSNUM
---------- ---------- --------- ---------- ----------
   5618190 DRJ0503    11-MAR-05          0    9918802
   5618190 DRG0502    28-FEB-05          0    9870090
   5618190 GRG0502    21-FEB-05          0    9824705
 
 
 
December  07, 2005 - 1:32 am UTC 
 
(i would need a create table and insert statements if you really want me to play with it)
but this predicate:
 12    where appealcode like '_R%' 
 13    and paydate=maxpd
 14    and idnumber in(4002401,4004594,5406454,5618190)
says 
"only keep _R% records that had the max paydate over ALL records for that id"
to satisfy:
I'm looking to select the most recent 
record where the appealcode is like '_R%' for 2005.
perhaps you mean:
select *
  from (select t.*, 
               row_number() over (partition by idnumber order by paydate DESC) rn
          from t
         where appealcode like '_R%'
           and idnumber in ( 1,2,3,4 ) )
 where rn = 1;
that says
"find the _R% records"
"break them up by idnumber"
"sort each group from big to small by paydate"
"keep only the first record in each group"
 
 
 
 
to dennis
Oraboy, December  06, 2005 - 6:09 pm UTC
 
 
Hi , 
I tried your problem and it looks like it's working fine.
Just a quick question: did you check that the dates are really 2005 and not 0005?
(Create scripts for anyone who wants to try in future)
Create table Test_T
(IdNumber number,
AppealCode Varchar2(100),
PayDate date,
PayAmount NUmber,
TransNum Number) 
/
Insert into Test_t values (    4004594    ,'DRE0505',to_date('17-May-05','DD-MON-RR'),0,10159410    );
Insert into Test_t values (    4004594    ,'GRG0502',to_date('8-Feb-05','DD-MON-RR'),0,9804766    );
Insert into Test_t values (    4004594    ,'GRF0501',to_date('31-Jan-05','DD-MON-RR'),0,9750332    );
Insert into Test_t values (    4004594    ,'GRK0410',to_date('1-Nov-04','DD-MON-RR'),0,9303510    );
Insert into Test_t values (    4004594    ,'GRC0403',to_date('19-Mar-04','DD-MON-RR'),0,8371053    );
Insert into Test_t values (    4004594    ,'GRA0305',to_date('12-Aug-03','DD-MON-RR'),0,7543911    );
Insert into Test_t values (    4004594    ,'GRG0303',to_date('16-Apr-03','DD-MON-RR'),0,7209503    );
Insert into Test_t values (    4004594    ,'GRA0301',to_date('16-Feb-03','DD-MON-RR'),0,7026840    );
Insert into Test_t values (    4002401    ,'GRG0502',to_date('16-Mar-05','DD-MON-RR'),25,9862647    );
Insert into Test_t values (    4002401    ,'GRG0502',to_date('23-Feb-05','DD-MON-RR'),0,9826142    );
Insert into Test_t values (    4002401    ,'GRA0501',to_date('19-Jan-05','DD-MON-RR'),0,9712904    );
Insert into Test_t values (    4002401    ,'GRF0412',to_date('5-Jan-05','DD-MON-RR'),0,9630884    );
Insert into Test_t values (    4002401    ,'GRK0410',to_date('21-Oct-04','DD-MON-RR'),0,9299106    );
Insert into Test_t values (    4002401    ,'GRG0303',to_date('3-Mar-03','DD-MON-RR'),0,7066423    );
Insert into Test_t values (    4002401    ,'GRA0301',to_date('9-Feb-03','DD-MON-RR'),0,7022121    );
Insert into Test_t values (    5406454    ,'DRJ0503',to_date('3-Mar-05','DD-MON-RR'),0,9887770    );
Insert into Test_t values (    5406454    ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870637    );
Insert into Test_t values (    5618190    ,'DRJ0503',to_date('11-Mar-05','DD-MON-RR'),0,9918802    );
Insert into Test_t values (    5618190    ,'DRG0502',to_date('28-Feb-05','DD-MON-RR'),0,9870090    );
Insert into Test_t values (    5618190    ,'GRG0502',to_date('21-Feb-05','DD-MON-RR'),0,9824705    );
--since the other columns are not relevant , I used dummy values in your Select statement
s61>l
  1  select
  2   idnumber,
  3   usercode1,
  4   substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
  5     paydate,
  6   payamount,
  7   transnum,
  8   ltransnum,
  9   appealcode
 10      from (
 11   select x.*,
 12     row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
 13       from (
 14    select
 15     idnumber,1 usercode1,to_char(paydate,'MON') mon_raw, paydate,
 16       payamount, transnum,transnum ltransnum,appealcode,
 17     max(paydate) over (partition by idnumber) maxpd
 18     from Test_t
 19    where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
 20      and to_date('31-OCT-2005','DD-MON-YYYY')
 21           ) x
 22    where appealcode like '_R%'
 23           and paydate=maxpd
 24           and idnumber in(4002401,4004594,5406454,5618190)
 25             )
 26*    where rn = 1
s61>/
  IDNUMBER  USERCODE1 FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM  LTRANSNUM APPEALCODE
---------- ---------- ------- --------- ---------- ---------- ---------- -----------
   4002401          1 Mar2005 16-MAR-05         25    9862647    9862647 GRG0502
   4004594          1 May2005 17-MAY-05          0   10159410   10159410 DRE0505
   5406454          1 Mar2005 03-MAR-05          0    9887770    9887770 DRJ0503
   5618190          1 Mar2005 11-MAR-05          0    9918802    9918802 DRJ0503
--added the other two columns
s61>alter table test_t add (usercode1 varchar2(100),ltransnum varchar2(100));
Table altered.
s61>update test_t set usercode1 = chr(65+ mod(rownum,3)), ltransnum=transnum+rownum;
20 rows updated.
S61> @<<ursql.txt>>
  IDNUMBER USERCODE1                                                                                            FIRSTPA PAYDATE   
---------- ---------------------------------------------------------------------------------------------------- ------- --------- 
   4002401 A                                                                                                    Mar2005 16-MAR-05       
   4004594 B                                                                                                    May2005 17-MAY-05       
   5406454 B                                                                                                    Mar2005 03-MAR-05       
   5618190 A                                                                                                    Mar2005 11-MAR-05       
-- this is just a guess on why the other two numbers didn't
-- show up in your result
-- updating 2005 to 05 
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=5406454
  2  /
2 rows updated.
s61>update test_t set paydate=add_months(paydate,-(2005*12)) where idnumber=4002401
  2  /
7 rows updated.
s61>l
  1  select
  2    idnumber,
  3    usercode1,
  4    substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
  5      paydate,
  6    payamount,
  7    transnum,
  8    ltransnum,
  9    appealcode
 10       from (
 11    select x.*,
 12      row_number() over (partition by idnumber order by payamount desc, transnum desc) rn
 13        from (
 14     select
 15      idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
 16        payamount, transnum, ltransnum,appealcode,
 17      max(paydate) over (partition by idnumber) maxpd
 18      from Test_t
 19     where paydate between to_date('01-JAN-2005','DD-MON-YYYY')
 20       and to_date('31-OCT-2005','DD-MON-YYYY')
 21            ) x
 22     where appealcode like '_R%'
 23            and paydate=maxpd
 24            and idnumber in(4002401,4004594,5406454,5618190)
 25              )
 26*     where rn = 1
s61>/
  IDNUMBER US FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM LTRANSNUM  APPEALCODE
---------- -- ------- --------- ---------- ---------- ---------- -------------------------------------
   4004594 B  May2005 17-MAY-05          0   10159410 10159411   DRE0505
   5618190 A  Mar2005 11-MAR-05          0    9918802 9918820    DRJ0503
-- same as what you see
 
 
 
Thanks Tom and Oraboy
denni50, December  07, 2005 - 8:37 am UTC
 
 
Oraboy...you brought up a good possibility..although
the data gets posted through canned software...users
are responsible for creating the batch headers before
posting batches and what may have happened is a user
inadvertently inserted year 0005 instead of '2005'
for a particular batch that included those idnumbers
I am testing with here. It's happened before.
thanks for that helpful tip!
:~)
 
 
 
Oraboy and Tom
denni50, December  07, 2005 - 8:57 am UTC
 
 
Oraboy:
it was not the year, I did testing using '0005' to see
if those two records would output results and it did not.
I changed the script based on logic that Tom suggested
and it worked...see changes below.
thanks Oraboy for your input and help.
SQL> select idnumber,usercode1,substr(mon_raw,1,1)||lower(substr(mon_raw,2))||'2005' firstpaycode,
  2  paydate,payamount,transnum,ltransnum,appealcode
  3    from (
  4  select x.*, row_number() over (partition by idnumber order by paydate desc,payamount desc, transnum desc) rn 
  5    from (
  6  select idnumber,usercode1,to_char(paydate,'MON') mon_raw, paydate,
  7  payamount, transnum,ltransnum,appealcode
  8  --max(paydate) over (partition by idnumber) maxpd
  9         from payment
 10   where paydate between to_date('01-JAN-2005','DD-MON-YYYY') and to_date('31-OCT-2005','DD-MON-YYYY')
 11         ) x
 12    where appealcode like '_R%' 
 13    --and paydate=maxpd
 14    and idnumber in(4002401,4004594,5406454,5618190)
 15         )
 16   where rn = 1;
  IDNUMBER USER FIRSTPA PAYDATE    PAYAMOUNT   TRANSNUM  LTRANSNUM APPEALCODE
---------- ---- ------- --------- ---------- ---------- ---------- ----------
   4002401 ACGA Mar2005 16-MAR-05         25    9862647    9789477 GRG0502
   4004594 ACDC May2005 17-MAY-05          0   10159410   10086183 DRE0505
   5406454 ACDC Mar2005 03-MAR-05          0    9887770    9814606 DRJ0503
   5618190 ACDC Mar2005 11-MAR-05          0    9918802    9845638 DRJ0503
SQL>  
 
 
 
anyway to do a dynamic lag?
Ryan, December  07, 2005 - 10:27 pm UTC
 
 
Is it possible to use lag when you don't know how many rows you want to go back? 
create table history (
  history_id  number,
  history_sequence number,
  history_status varchar2(20),
  history_balance number);
insert into history values (1,123,'HISTORY 1',10);
insert into history values (1,128,'PROCESSED',0);
insert into history values (1,130,'PROCESSED',0);
insert into history values (1,131,'HISTORY 8',15);
insert into history values (1,145,'PROCESSED',0);
for each history_id ordered by history_sequence
  loop
    if status = 'PROCESSED' then
      history_balance = the history_balance of the last record where status != 'PROCESSED'
    end if;
  end loop;
Typically with lag you have to state how many rows you are looking back; in this case my discriminator is the value in the status field.
After this is run, I expect the values to be
1,123,'HISTORY 1',10
1,128,'PROCESSED',10
1,130,'PROCESSED',10
1,131,'HISTORY  8',15
1,145,'PROCESSED',15
I can do this with pl/sql. I am trying to figure out how to do this with straight sql.  
 
December  08, 2005 - 2:03 am UTC 
 
last_value with ignore nulls in 10g, or the to_number(substr(max(...))) trick in 9i and before, can be used....
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
  2         last_value(
  3         case when history_status <> 'PROCESSED'
  4                  then history_balance
  5                  end IGNORE NULLS ) over (order by history_sequence ) last_hb
  6    from history
  7  /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS       HISTORY_BALANCE    LAST_HB
---------- ---------------- -------------------- --------------- ----------
         1              123 HISTORY 1                         10         10
         1              128 PROCESSED                          0         10
         1              130 PROCESSED                          0         10
         1              131 HISTORY 8                         15         15
         1              145 PROCESSED                          0         15
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select history_id, history_sequence, history_status, history_balance,
  2         to_number( substr( max(
  3         case when history_status <> 'PROCESSED'
  4                  then to_char(history_sequence,'fm0000000000' ) || history_balance
  5                  end ) over (order by history_sequence ), 11 ) ) last_hb
  6    from history
  7  /
HISTORY_ID HISTORY_SEQUENCE HISTORY_STATUS       HISTORY_BALANCE    LAST_HB
---------- ---------------- -------------------- --------------- ----------
         1              123 HISTORY 1                         10         10
         1              128 PROCESSED                          0         10
         1              130 PROCESSED                          0         10
         1              131 HISTORY 8                         15         15
         1              145 PROCESSED                          0         15
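Both queries implement the same "carry the last non-PROCESSED balance forward" scan. A minimal Python sketch of that logic (sample rows taken from the inserts above; the function name is my own):

```python
# Rows mirror the HISTORY inserts above: (history_sequence, status, balance).
history = [(123, 'HISTORY 1', 10),
           (128, 'PROCESSED', 0),
           (130, 'PROCESSED', 0),
           (131, 'HISTORY 8', 15),
           (145, 'PROCESSED', 0)]

def carry_forward(rows):
    # One pass in sequence order, remembering the balance of the last
    # non-PROCESSED row -- what last_value(case ... end IGNORE NULLS)
    # over (order by history_sequence) computes.
    last_hb, out = None, []
    for seq, status, balance in sorted(rows):
        if status != 'PROCESSED':
            last_hb = balance
        out.append((seq, status, balance, last_hb))
    return out

filled = carry_forward(history)
```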
 
 
 
 
 
Query
Mark, January   18, 2006 - 5:02 pm UTC
 
 
Hi Tom,
Given a Table:
PK_ID NUMBER (PK)
CLASS_ID NUMBER
MY_DATE DATE
I'd like to develop an output like:
CLASS_ID   W1  W2  W3...
where W1, W2, W3... are 'weeks' from SYSDATE, using MY_DATE, holding the counts of the CLASS_ID for that row.
How could I do that?
Thanks! 
About to read Ch. 12 in Expert one-on-one... 
 
January   19, 2006 - 12:28 pm UTC 
 
if
a) you have a finite number of weeks (eg: a sql query has N columns at parse time, unless you know what N is...)
b) an example <<<=== creates/inserts
we could play with this. 
 
 
 
ok, here we go
Mark, January   20, 2006 - 1:31 pm UTC
 
 
drop table cts_temp
/
create table cts_temp
( class_id number,
cts_date date)
/
insert into cts_temp
select round(dbms_random.value(1, 20)), trunc(created) from user_objects
/
Output sought:
CLASS_ID W1 W2 W3 W4 W5 W6 W7 W8 W9 W10+
----------------------------------------
1        1  3  7  5  1  0  0  0  0  12    
2        1  5  1  0  0  0  5  3  4  6
3        1  0  0  9  0  1  1  5  1  10        
...
where W# = # of weeks away from the current date;
therefore, W1 is within 7 days of today and W10+ is everything 10 weeks and older. These numbers are counts of records.
I have done this in the past with DECODE statements, but am looking for a more efficient way to do this using Analytics. 
 
January   20, 2006 - 2:47 pm UTC 
 
but you will not want to use analytics since you NEED TO AGGREGATE.
decode (or case) is the correct approach to this problem, keep using that. 
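For completeness, here is a rough Python sketch of the bucketing a CASE/DECODE pivot would perform column by column. The week-boundary convention (W1 = within 7 days, everything at 10 weeks or more capped into W10+) is my reading of the requested output and may need adjusting; all names are assumptions.

```python
from datetime import date, timedelta

def week_bucket(d, today):
    # Age in whole weeks, 1-based, capped at bucket 10 ("W10+").
    weeks = (today - d).days // 7 + 1
    return min(weeks, 10)

def pivot_counts(rows, today):
    # rows: (class_id, cts_date); returns {class_id: [W1..W10+ counts]} --
    # the aggregation a sum(case ...) per week column would produce.
    out = {}
    for class_id, d in rows:
        buckets = out.setdefault(class_id, [0] * 10)
        buckets[week_bucket(d, today) - 1] += 1
    return out

# Hypothetical demo data.
today = date(2006, 1, 20)
sample = [(1, today), (1, today - timedelta(days=3)),
          (1, today - timedelta(days=8)), (2, today - timedelta(days=100))]
counts = pivot_counts(sample, today)
```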
 
 
 
SQL Query
Parag J Patankar, January   23, 2006 - 7:26 am UTC
 
 
Hi Tom,
Wish you a very happy new year 2006. I have a table 
create table t ( a number(5), b number(6), c number(1), d varchar2(8), e number(10));
insert into t values ( 09009, 1000, 1, 'RIS00001', 100);
insert into t values ( 09009, 1000, 0, 'RIS00001', 200);
insert into t values ( 09009, 1000, 0, 'RIS00001', 300);
insert into t values ( 09009, 1000, 2, 'RIS00001', 400);
insert into t values(09009, 5000, 2, 'BIC77777', 100);
insert into t values(09009, 5000, 2, 'BIC77777', 100);
insert into t values(09009, 6000, 0, 'DIG00077', 100);
insert into t values(09009, 6000, 0, 'DIG00077', 200);
commit;
17:33:38 SQL> select * from t;
         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       1000          1 RIS00001        100
      9009       1000          0 RIS00001        200
      9009       1000          0 RIS00001        300
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
7 rows selected.
In Column "C" values can be 0, 1, 2. 
Now I want to select only those sets of records where column C is never 1 for the same combination of A, B, and D.
For e.g I want output like 
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
RIS00001 records should not appear because a C value of 1 appeared once for 09009/1000. 
How can I do this in the most efficient way?
Currently it is very large table and joined to few tables in a query.
best regards
pjp  
 
 
January   23, 2006 - 10:34 am UTC 
 
for big "all rows" I would go analytics
ops$tkyte@ORA9IR2> select *
  2    from (
  3  select t.*,
  4         max( case when c = 1 then c end ) over (partition by a, b, d ) c_one
  5    from t
  6        )
  7   where c_one is null
  8  /
         A          B          C D                 E      C_ONE
---------- ---------- ---------- -------- ---------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
for possibly getting "first row as fast as I can", I might opt for not exists or not in
ops$tkyte@ORA9IR2> select *
  2    from t
  3   where (a,b,d) not in (select a,b,d from t where c = 1 and a is not null and b is not null and d is
  4   not null )
  5  /
         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
ops$tkyte@ORA9IR2> select *
  2    from t
  3   where not exists (select null from t t2
  4                      where t2.a = t.a and t2.b = t.b and t2.d = t.d and t2.c = 1 )
  5  /
         A          B          C D                 E
---------- ---------- ---------- -------- ----------
      9009       5000          2 BIC77777        100
      9009       5000          2 BIC77777        100
      9009       6000          0 DIG00077        100
      9009       6000          0 DIG00077        200
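All three queries answer the same question: drop every (A, B, D) group that contains a C = 1 row. A small Python sketch of that group-level filter (sample rows taken from the inserts above; the NOT IN form maps most directly):

```python
# Rows of T from the inserts above: (a, b, c, d, e).
rows = [(9009, 1000, 1, 'RIS00001', 100),
        (9009, 1000, 0, 'RIS00001', 200),
        (9009, 5000, 2, 'BIC77777', 100),
        (9009, 6000, 0, 'DIG00077', 100)]

def drop_flagged_groups(rows):
    # Same idea as max(case when c = 1 ...) over (partition by a, b, d):
    # first collect every (a, b, d) key where some row has c = 1, then
    # keep only rows whose key is outside that set (the NOT IN form).
    flagged = {(a, b, d) for a, b, c, d, e in rows if c == 1}
    return [r for r in rows if (r[0], r[1], r[3]) not in flagged]

kept = drop_flagged_groups(rows)
```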
 
 
 
 
 
Eliminating distinct rows
Avishay, February  20, 2006 - 8:48 am UTC
 
 
Hello Tom,
I have a view that UNION ALL 3 tables
here is the result for a select * on that view(f_emp_v): 
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111        20                  1/1/2005    5/12/2005
111111 123456                     1/23/2005   1/24/2005
111111 123459                     1/25/2005
111111               60  75       5/12/2005   5/13/2005
111111        30                  5/13/2005   1/1/2006
111111               85  55       5/13/2005
Using the following SQL and analytic functions, I filled in the NULLs and built the IN_USE_FROM/IN_USE_UNTIL columns differently: IN_USE_UNTIL receives the next IN_USE_FROM date, ordered ascending.
Here is the SQL:
SELECT Person_Id,
       Substr(MAX(Decode(Cost_Account_Id,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Cost_Account_Id))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Cost_Account_Id,
       Substr(MAX(Decode(Cost_Account_Code,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Cost_Account_Code))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Cost_Account_Code,
       Substr(MAX(Decode(Working_Hours,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Working_Hours))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Working_Hours,
       Substr(MAX(Decode(Fte,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Fte,
       Substr(MAX(Decode(Calc_Fte_Type,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Calc_Fte_Type))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Calc_Fte_Type,
       In_Use_From,
       Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER BY t.In_Use_From ASC) In_Use_Until
FROM   Fact_Employee_List_v t
WHERE person_id = 111111
ORDER  BY Person_Id,
          In_Use_From;
Here are the results:
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111        20                  1/1/2005    1/23/2005
111111 123456 20                  1/23/2005   1/25/2005
111111 123459 20                  1/25/2005   5/12/2005
111111 123459 20     60  75       5/12/2005   5/13/2005
111111 123459 30     85  55       5/13/2005   5/13/2005
111111 123459 30     85  55       5/13/2005
The table "fills in" in accordance with the dates.
As you can see, the last 2 rows are identical except for IN_USE_UNTIL.
How can I get rid of the row whose IN_USE_UNTIL is NOT NULL?
Is there a way to do it in the above select?
Maybe by changing the analytic function used for IN_USE_UNTIL?
Your remarks will be appreciated
Best Regards,
Avishay 
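One way to drop the redundant non-NULL row (a sketch under simplified assumptions, not from the original thread): when two rows share PERSON_ID and IN_USE_FROM, keep only the one whose IN_USE_UNTIL is NULL. Illustrated here with Python's sqlite3 (needs SQLite 3.25+ for window functions; the table and column names are stand-ins, not Avishay's actual view):

```python
import sqlite3

# In-memory demo table mimicking the last two duplicate rows plus one normal row.
con = sqlite3.connect(":memory:")
con.execute("""create table emp_list
               (person_id int, working_hours int, in_use_from text, in_use_until text)""")
con.executemany("insert into emp_list values (?,?,?,?)",
    [(111111, 20, '2005-05-12', '2005-05-13'),
     (111111, 30, '2005-05-13', '2005-05-13'),   # redundant: same from-date as next row
     (111111, 30, '2005-05-13', None)])          # the row we want to keep

# Keep one row per (person_id, in_use_from); a row with NULL in_use_until wins.
rows = con.execute("""
    select person_id, working_hours, in_use_from, in_use_until
      from (select t.*,
                   row_number() over (partition by person_id, in_use_from
                                      order by in_use_until is null desc) rn
              from emp_list t)
     where rn = 1
     order by in_use_from""").fetchall()
for r in rows:
    print(r)
```

The same ROW_NUMBER filter applied via rowid in a DELETE would remove the row permanently instead of just hiding it in the select.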
 
 
Update using LAG
Zahir M, February  21, 2006 - 2:16 pm UTC
 
 
SQL> desc tab1
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MEMBER_ID                                          NUMBER(10)
 START_DATE                                         DATE
 STOP_DATE                                          DATE
SQL> Select * from tab1 where member_id = 125;
 MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04
       125 10-MAY-04
       125 30-MAR-05
SQL> Select wh.* , lag(start_date) over ( partition by member_id order by start_date asc) - 1 new_stop_date
  2   from tab1 wh where member_id = 125
  3  
SQL> /
 MEMBER_ID START_DAT STOP_DATE NEW_STOP_
---------- --------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04           22-OCT-00
       125 10-MAY-04           04-MAY-04
       125 30-MAR-05           09-MAY-04
SQL> Update tab1 a
  2  set a.stop_date = ( SElect 
  3        lag(b.start_date) 
  4        over ( partition by b.member_id order by b.start_date asc)  - 1 new_stop_date 
  5               from tab1 b where a.member_id = b.member_id and a.rowid = b.rowid ) ;
4 rows updated.
SQL>  Select * from tab1 where member_id = 125;
 MEMBER_ID START_DAT STOP_DATE
---------- --------- ---------
       125 23-OCT-00
       125 05-MAY-04
       125 10-MAY-04
       125 30-MAR-05
SQL>  select * from v$version
  2  /
BANNER
------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.7.4.1 - Production
PL/SQL Release 8.1.7.4.0 - Production
CORE    8.1.7.2.1       Production
TNS for 32-bit Windows: Version 8.1.7.4.0 - Production
NLSRTL Version 3.4.1.0.0 - Production
I am trying to use the LAG analytic function in an update statement. It does not seem to work - the column STOP_DATE is not updated with the new values (i.e., from the lag).
Please advise.
 
 
February  22, 2006 - 8:16 am UTC 
 
Oh, it is absolutely working!!! The problem is - the where clause is applied first:
where a.member_id = b.member_id and a.rowid = b.rowid
AND THEN the analytic is performed - and by then there is only one row, so the "previous" row isn't there anymore.
ops$tkyte@ORA10GR2> create table emp
  2  as
  3  select job, hiredate, to_date(null) last_hiredate
  4    from scott.emp;
Table created.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> merge into emp
  2  using ( select rowid rid, lag(hiredate) over (partition by job order by hiredate) last_hiredate
  3            from emp) e2
  4  on ( emp.rowid = e2.rid )
  5  when matched then update set last_hiredate = e2.last_hiredate
  6  -- when not matched then insert (job) values (NULL)
  7  /
14 rows merged.
ops$tkyte@ORA10GR2> select * from emp order by job, hiredate;
JOB       HIREDATE  LAST_HIRE
--------- --------- ---------
ANALYST   03-DEC-81
ANALYST   09-DEC-82 03-DEC-81
CLERK     17-DEC-80
CLERK     03-DEC-81 17-DEC-80
CLERK     23-JAN-82 03-DEC-81
CLERK     12-JAN-83 23-JAN-82
MANAGER   02-APR-81
MANAGER   01-MAY-81 02-APR-81
MANAGER   09-JUN-81 01-MAY-81
PRESIDENT 17-NOV-81
SALESMAN  20-FEB-81
SALESMAN  22-FEB-81 20-FEB-81
SALESMAN  08-SEP-81 22-FEB-81
SALESMAN  28-SEP-81 08-SEP-81
14 rows selected.
In 9i, you need the "insert" part - but it'll never happen.
(but really, this looks like a bad idea, you'll have to just keep doing this over and over and over)
In 8i, you'll likely want to "two step this", create global temporary table, insert and update the join. 
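Tom's point - that the predicate collapses the partition before LAG ever runs - can be reproduced with any engine that has window functions. A minimal sketch via Python's sqlite3 (SQLite 3.25+; the "- 1 day" adjustment from the original query is omitted for brevity):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table tab1 (member_id int, start_date text, stop_date text)")
con.executemany("insert into tab1(member_id, start_date) values (?,?)",
    [(125, '2000-10-23'), (125, '2004-05-05'), (125, '2004-05-10'), (125, '2005-03-30')])

# Broken form: the rowid predicate leaves a single row, so LAG has no previous row.
broken = con.execute("""
    select (select lag(b.start_date) over (order by b.start_date)
              from tab1 b where b.rowid = a.rowid)
      from tab1 a order by a.start_date""").fetchall()

# Working form: compute LAG over the whole partition first, then join back on rowid
# (the same idea as Tom's MERGE).
con.execute("""
    update tab1 set stop_date =
      (select prev from (select rowid rid,
                                lag(start_date) over (partition by member_id
                                                      order by start_date) prev
                           from tab1) where rid = tab1.rowid)""")
fixed = [r[0] for r in con.execute(
    "select stop_date from tab1 order by start_date")]
print(broken, fixed)
```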
 
 
 
 
LEAD UPDATE
Zahir M, February  22, 2006 - 10:45 am UTC
 
 
Tom , 
I did the "two step" process in 8i as you suggested
(except that I created another table - not a global temporary table). But the update takes a very long time.
130874 rows updated.
Elapsed: 07:469:28146.33
Statistics
----------------------------------------------------------
          0  recursive calls
    4455554  db block gets
  105093306  consistent gets
   14702345  physical reads
   30683572  redo size
        852  bytes sent via SQL*Net to client
        670  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
     130874  rows processed
 
 
February  22, 2006 - 11:05 am UTC 
 
no example :(
should only take seconds for so few rows.  
ops$xp8i\tkyte@ORA8IR3W> alter table big_table add last_created date;
Table altered.
Elapsed: 00:00:00.63
ops$xp8i\tkyte@ORA8IR3W> create table t
  2  ( rid rowid primary key, last_created date );
Table created.
Elapsed: 00:00:00.78
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> insert into t
  2  select rowid, lag(created) over (partition by object_type order by created)
  3    from big_table;
130874 rows created.
Elapsed: 00:00:36.03
ops$xp8i\tkyte@ORA8IR3W> exec dbms_stats.gather_table_stats( user, 'T' );
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.38
ops$xp8i\tkyte@ORA8IR3W>
ops$xp8i\tkyte@ORA8IR3W> update (select a.last_created new_last_created,
  2                 b.last_created old_last_created
  3                    from t a, big_table b
  4                   where a.rid = b.rowid )
  5     set old_last_created = new_last_created;
130874 rows updated.
Elapsed: 00:00:32.41
Maybe you are getting blocked by other users, since you are updating every row, lock the table first - then update it. 
 
 
 
 
LAG Update
Zahir M, February  22, 2006 - 11:32 am UTC
 
 
Thanks , Tom. 
I re-ran the update after locking the table. 
It took only 47 seconds for the update operation.
I guess it was locked by some other users / processes.
Thanks again ! 
 
 
Reposting
Avishay, February  23, 2006 - 4:28 am UTC
 
 
Hello Tom,
I have a view that UNION ALL 3 tables
here is the result for a select * on that view(f_emp_v): 
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111        20                  1/1/2005    5/12/2005
111111 123456                     1/23/2005   1/24/2005
111111 123459                     1/25/2005
111111               60  75       5/12/2005   5/13/2005
111111        30                  5/13/2005   1/1/2006
111111               85  55       5/13/2005
Using the following SQL and analytic functions I filled in the NULLs and created the IN_USE_FROM and IN_USE_UNTIL columns in a different way: IN_USE_UNTIL receives the next IN_USE_FROM date (ascending).
Here is the SQL:
SELECT Person_Id,
       Substr(MAX(Decode(Cost_Account_Id,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Cost_Account_Id))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Cost_Account_Id,
       Substr(MAX(Decode(Cost_Account_Code,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Cost_Account_Code))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Cost_Account_Code,
       Substr(MAX(Decode(Working_Hours,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Working_Hours))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Working_Hours,
       Substr(MAX(Decode(Fte,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') || Fte))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Fte,
       Substr(MAX(Decode(Calc_Fte_Type,
                         NULL,
                         NULL,
                         To_Char(In_Use_From, 'yyyymmddhh24miss') ||
                         Calc_Fte_Type))
              Over(PARTITION BY Person_Id ORDER BY In_Use_From),
              15) Calc_Fte_Type,
       In_Use_From,
       Lead(t.In_Use_From, 1, In_Use_Until) Over(PARTITION BY t.Person_Id ORDER BY t.In_Use_From ASC) In_Use_Until
FROM   Fact_Employee_List_v t
WHERE person_id = 111111
ORDER  BY Person_Id,
          In_Use_From;
Here are the results:
PER_ID C_A_ID WORK_H FTE CALC_FTE IN_USE_FROM IN_USE_UNTIL
111111        20                  1/1/2005    1/23/2005
111111 123456 20                  1/23/2005   1/25/2005
111111 123459 20                  1/25/2005   5/12/2005
111111 123459 20     60  75       5/12/2005   5/13/2005
111111 123459 30     85  55       5/13/2005   5/13/2005
111111 123459 30     85  55       5/13/2005
The table "fills in" in accordance with the dates.
As you can see, the last 2 rows are identical except for IN_USE_UNTIL.
How can I get rid of the row whose IN_USE_UNTIL is NOT NULL?
Is there a way to do it in the above select?
Maybe by changing the analytic function used for IN_USE_UNTIL?
Your remarks will be appreciated
Best Regards,
Avishay 
 
 
February  23, 2006 - 8:07 am UTC 
 
why did you repost it.
You must have seen the page you used to post this.  did you *READ* that page?  I ignore all things that look like this - you have the classic example of what I ignore.
slow down, read the page you used to post this repost. 
 
 
 
Analytics
Mark, February  23, 2006 - 10:42 am UTC
 
 
Hi Tom,
Oracle 9i latest and greatest...
At a loss for this one. Don't know where to start...
I have a table:
Create Table MY_NUMS
(N1 NUMBER, N2 NUMBER, N3 NUMBER, N4 NUMBER, N5 NUMBER, N6 NUMBER)
/
I populate each column with random values between 1-46:
INSERT INTO my_nums
   SELECT TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
         ,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
         ,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
         ,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
         ,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
         ,TRUNC(DBMS_RANDOM.VALUE(1, 47), 0)
     FROM user_objects;
/
and I get data similar to this:
HT4:DEVDB001101007:10:10 - DEV>      select * from my_nums where rownum <= 10
  2  /
        N1         N2         N3         N4         N5         N6
---------- ---------- ---------- ---------- ---------- ----------
         6         13         11          6         21         36
        33         23         45         11         24         32
        36         19         43         19          8         44
        11         39          9         14         35         25
        42          8         29         15         26          4
         1         25         12         41         21         10
        20          6         43         29         39         28
        16         18         36         15         38         26
        16         33         15         16         40         18
        17          1         20         39         20         46
10 rows selected.     
And my question is how do I get counts of how many times each number appears with every other number?
Output might look like this:
MY_NUMBER OTHER_NUM COUNT(*)
--------- --------- --------
       16        15        3
...
This says that the number 16 appeared 3 times with the number 15 in the same row.
I'd have to check each column (N1 - N6) against each of the other 5 columns for each value and sum them up...
Regards,
Mark
       
 
 
February  23, 2006 - 10:47 am UTC 
 
are my_number and other_num INPUTS into your query or what?  where did 16 and 15 come from. 
 
 
 
Analytics
Mark, February  23, 2006 - 12:28 pm UTC
 
 
Oh, ok.
MY_NUMBER is the number for which I am counting combinations with each OTHER_NUMBER.
Output would ideally look like this:
MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
        1            2        5
        1            3        6
        1            4        2
...
        2            3        4
        2            4        7
        2            5        8
...
etc., all the way down to 
MY_NUMBER OTHER_NUMBER COUNT(*)
--------- ------------ --------
       45           46        7
There are no INPUTS to the query as it calculates counts for all combinations.
This query somewhat does it, but I feel there is a way better way than doing it with decodes:
SELECT n1
      ,SUM(n2_2 + n3_2 + n4_2 + n5_2 + n6_2) two
      ,SUM(n2_3 + n3_3 + n4_3 + n5_3 + n6_3) three
      ,SUM(n2_4 + n3_4 + n4_4 + n5_4 + n6_4) four
      ,SUM(n2_5 + n3_5 + n4_5 + n5_5 + n6_5) five
      ,SUM(n2_46 + n3_46 + n4_46 + n5_46 + n6_46) fortysix
  FROM (SELECT n1
              ,DECODE(n2, 2, 1, 0) n2_2
              ,DECODE(n2, 3, 1, 0) n2_3
              ,DECODE(n2, 4, 1, 0) n2_4
              ,DECODE(n2, 5, 1, 0) n2_5
              /* n2_6 through n2_45 here ... */
              ,DECODE(n2, 46, 1, 0) n2_46
              ,DECODE(n3, 2, 1, 0) n3_2
              ,DECODE(n3, 3, 1, 0) n3_3
              ,DECODE(n3, 4, 1, 0) n3_4
              ,DECODE(n3, 5, 1, 0) n3_5
              ,DECODE(n3, 46, 1, 0) n3_46
              ,DECODE(n4, 2, 1, 0) n4_2
              ,DECODE(n4, 3, 1, 0) n4_3
              ,DECODE(n4, 4, 1, 0) n4_4
              ,DECODE(n4, 5, 1, 0) n4_5
              ,DECODE(n4, 46, 1, 0) n4_46
              ,DECODE(n5, 2, 1, 0) n5_2
              ,DECODE(n5, 3, 1, 0) n5_3
              ,DECODE(n5, 4, 1, 0) n5_4
              ,DECODE(n5, 5, 1, 0) n5_5
              ,DECODE(n5, 46, 1, 0) n5_46
              ,DECODE(n6, 2, 1, 0) n6_2
              ,DECODE(n6, 3, 1, 0) n6_3
              ,DECODE(n6, 4, 1, 0) n6_4
              ,DECODE(n6, 5, 1, 0) n6_5
              ,DECODE(n6, 46, 1, 0) n6_46
          FROM my_nums)
 GROUP BY n1
/
        N1        TWO      THREE       FOUR       FIVE   FORTYSIX
---------- ---------- ---------- ---------- ---------- ----------
         1         27         10         14         19         16
         2         10         12         12         22         21
         3         13         15         13         12         16
         4         25         15         24         13         23
         5         19         15         16         14         19
         6         16         15         13          8         30
         7         18         13         13         19         15
         8         14         14         12         18         14
         9         16         16         18         16         22
        10         14         12         16         14         19
...
for the entire matrix of number.
This output says that the number 2 in columns N2-N6 appeared 27 times with the number 1, the number 3 in columns N2-N6 appeared 10 times with the number 1, etc.
A sort of "how many times does this number appear with all these other numbers in the same row" query...
Thanks.    
 
February  23, 2006 - 7:01 pm UTC 
 
I was beaten to the punch on this one ;) see below 
 
 
 
A solution
Michel Cadot, February  23, 2006 - 3:10 pm UTC
 
 
Hi Mark,
Here's a solution to your issue.
I changed the values to be able to post the whole example but the query does not care about what's inside the table.
I left in all the steps, but I think you can compact the query.
SQL> INSERT INTO my_nums
  2     SELECT TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  3           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  4           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  5           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  6           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  7           ,TRUNC(DBMS_RANDOM.VALUE(1, 9), 0)
  8     FROM user_views
  9     where rownum <= 3
 10  /
3 rows created.
SQL> commit;
Commit complete.
SQL> select * from my_nums;
        N1         N2         N3         N4         N5         N6
---------- ---------- ---------- ---------- ---------- ----------
         7          4          6          8          2          1
         1          2          8          8          7          2
         6          3          5          1          4          8
3 rows selected.
SQL> with 
  2    a as ( select row_number () over (order by n1) rn,
  3                  n1, n2, n3, n4, n5, n6
  4           from my_nums
  5         ),
  6    b as ( select rn, n1, n2, n3, n4, n5, n6, part
  7           from a, 
  8                (select rownum part from dual connect by level <= 15)
  9         ),
 10    c as ( select distinct rn, 
 11                           case part
 12                           when  1 then least(n1,n2)
 13                           when  2 then least(n1,n3)
 14                           when  3 then least(n1,n4)
 15                           when  4 then least(n1,n5)
 16                           when  5 then least(n1,n6)
 17                           when  6 then least(n2,n3)
 18                           when  7 then least(n2,n4)
 19                           when  8 then least(n2,n5)
 20                           when  9 then least(n2,n6)
 21                           when 10 then least(n3,n4)
 22                           when 11 then least(n3,n5)
 23                           when 12 then least(n3,n6)
 24                           when 13 then least(n4,n5)
 25                           when 14 then least(n4,n6)
 26                           when 15 then least(n5,n6)
 27                           end v1,
 28                           case part
 29                           when  1 then greatest(n1,n2)
 30                           when  2 then greatest(n1,n3)
 31                           when  3 then greatest(n1,n4)
 32                           when  4 then greatest(n1,n5)
 33                           when  5 then greatest(n1,n6)
 34                           when  6 then greatest(n2,n3)
 35                           when  7 then greatest(n2,n4)
 36                           when  8 then greatest(n2,n5)
 37                           when  9 then greatest(n2,n6)
 38                           when 10 then greatest(n3,n4)
 39                           when 11 then greatest(n3,n5)
 40                           when 12 then greatest(n3,n6)
 41                           when 13 then greatest(n4,n5)
 42                           when 14 then greatest(n4,n6)
 43                           when 15 then greatest(n5,n6)
 44                           end v2
 45           from b
 46         )
 47  select v1, v2, count(*) nb
 48  from c
 49  where v1 != v2
 50  group by v1, v2
 51  order by v1, v2
 52  /
        V1         V2         NB
---------- ---------- ----------
         1          2          2
         1          3          1
         1          4          2
         1          5          1
         1          6          2
         1          7          2
         1          8          3
         2          4          1
         2          6          1
         2          7          2
         2          8          2
         3          4          1
         3          5          1
         3          6          1
         3          8          1
         4          5          1
         4          6          2
         4          7          1
         4          8          2
         5          6          1
         5          8          1
         6          7          1
         6          8          2
         7          8          2
24 rows selected.
Regards
Michel 
 
 
 
Excellent
Mark, February  23, 2006 - 3:35 pm UTC
 
 
Thanks 
 
 
To Mark ... COUNT of what?
A reader, February  23, 2006 - 4:07 pm UTC
 
 
<quote>And my question is how do I get counts of how many times each number appears with every other number?</quote>
flip@FLOP> select * from my_nums;
        N1         N2         N3         N4         N5         N6
---------- ---------- ---------- ---------- ---------- ----------
         1          2          2          2          2          2
flip@FLOP> @michel_qry
        V1         V2         NB
---------- ---------- ----------
         1          2          1
Should the answer here be:
A. "1" [as in the number of rows where 1 is together with 2]
or
B. "5"
? 
 
 
To A reader
Michel Cadot, February  23, 2006 - 4:46 pm UTC
 
 
The "distinct" in the definition of c is there to count each pair only once per row.
If you want to count all occurrences, remove the "distinct".
Regards
Michel
 
 
 
Analytics
Mark, February  23, 2006 - 5:30 pm UTC
 
 
The answer to that should be "5" as 1 and 2 appear as a combination 5 times.
Regards,
Mark 
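Counting every co-occurrence (so that a row of 1,2,2,2,2,2 contributes 5 to the (1,2) pair, per Mark's answer above) can be sketched outside SQL too. This Python sketch normalizes each pair with min/max, the same role least/greatest play in Michel's query; the sample rows are illustrative:

```python
from collections import Counter
from itertools import combinations

rows = [
    (7, 4, 6, 8, 2, 1),
    (1, 2, 8, 8, 7, 2),
    (1, 2, 2, 2, 2, 2),   # 1 pairs with 2 five times in this row alone
]

pairs = Counter()
for row in rows:
    for a, b in combinations(row, 2):           # all 15 column pairs per row
        if a != b:
            pairs[(min(a, b), max(a, b))] += 1  # order-insensitive, duplicates kept

print(pairs[(1, 2)])
```

Dropping the `if a != b` guard (or adding a `set()` around the pairs of each row) reproduces the "distinct" behavior Michel describes.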
 
 
Difficulty with min(id) over partition by
Ken, February  27, 2006 - 11:38 pm UTC
 
 
Hi Tom:
We have a table with over 4 million records. A bug in the java application has created duplicate records in this table. I am brought in to dedupe the records based on certain criteria. For every duplicate record set, we need to keep the record with the smallest value in the ID field and delete the rest. 
I am close to identifying the IDs to be deleted but the query is not identifying the records with min(id) consistently. Here is the table structure (created for this test case -- no index or anything) and query. 
Thanks in advance for your help.
ken@DEV9206> desc DEDUPE_TEST_01
 Name                                Null?    Type
 ----------------------------------- -------- --------
 EQUIP_ID                            NOT NULL NUMBER
 EQUIP_TYPE_ID                       NOT NULL NUMBER
 TYPE_ID                             NOT NULL NUMBER
 CREATED                             NOT NULL DATE
 COL_1                                        NUMBER
 COL_0                                        NUMBER
 ID                                  NOT NULL NUMBER
 CT                                           NUMBER
ken@DEV9206> l
  1  SELECT equip_id, equip_type_id, type_id, TO_CHAR(CREATED, 'DDMMYYYY HH24:MI:SS') created,
  2         col_1, col_0, MIN(id) OVER (PARTITION BY equip_id
  3                                         ORDER BY equip_id, equip_type_id, type_id, TO_CHAR(created, 'DDMMYYYY HH24:MI:SS'),
  4                                                  col_1, col_0) min_id,
  5         id
  6    from dedupe_test_01
  7*  where rownum < 31
ken@DEV9206> /
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED           COL_1 COL_0    MIN_ID       ID
-------- ------------- ------- ----------------- ----- ----- --------- --------
    3011            29     221 30110002 00:00:00     3    77  26445635 26445635
    3011            29     221 30110002 00:00:00     3    77           26445651
    3011            29     221 30110002 00:00:00     3    86  26445626 26445626
    3011            29     221 30110002 00:00:00     3    86           26445653
    3011            29     221 30110002 00:00:00     3   112  26445617 26445617
    3011            29     221 30110002 00:00:00     3   112           26445620
    3011            29     221 30110002 00:00:00     3   125           26445631*
    3011            29     221 30110002 00:00:00     3   125           26445641
    3551            29     221 11032005 17:01:00     1   186   3209093  3209093
    3551            29     221 11032005 17:01:00     1   186            6072228
    3551            29     221 11032005 17:01:00     1   186            8894681
    3551            29     221 11032005 17:01:00     1   186            3837758
    3551            29     221 11032005 17:01:00     1   186            3837738
    3551            29     221 11032005 20:44:00     1   190   3209092  3209092
    3551            29     221 11032005 20:44:00     1   190            3837757
    3551            29     221 11032005 20:44:00     1   190            6072227
    3551            29     221 11032005 20:44:00     1   190            8894680
    3551            29     221 11032005 20:44:00     1   190            3837737
    3551            29     221 11032005 23:00:00     1   227   3209091  3209091
    3551            29     221 11032005 23:00:00     1   227            3837736
    3551            29     221 30110002 00:00:00     3   112            3209094*
    3551            29     221 30110002 00:00:00     3   112            3837739
    3551            29     221 30110002 00:00:00     3   112            3837759
    3551            29     221 30110002 00:00:00     3   112            6072229
    3551            29     221 30110002 00:00:00     3   112            8894682
    3551            29     221 30110002 00:00:00     3   118            3209095*
    3551            29     221 30110002 00:00:00     3   118            3837740
    3551            29     221 30110002 00:00:00     3   118            3837760
    3551            29     221 30110002 00:00:00     3   118            6072230
    3551            29     221 30110002 00:00:00     3   118            8894683
30 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   WINDOW (SORT)
   2    1     COUNT (STOPKEY)
   3    2       TABLE ACCESS (FULL) OF 'DEDUPE_TEST_01'
 
 
February  28, 2006 - 7:14 am UTC 
 
... A bug in the java application has 
created duplicate records in this table. ...
You really meant to say "a horribly flawed implementation whereby the java coders decided to do all DATA LOGIC in the wrongest place in the world - the application has exposed itself.  We know there are dozens more of these lying in wait for us"
That is identifying the min id very very very very consistently.
but why are you sorting?  If you want the min(id) by equip_id, you should leave out the order by - else you get the minimum ID for the current row and every row in front of it - not over all rows by equip_id.
Don't know why you are using:
TO_CHAR(created, 'DDMMYYYY HH24:MI:SS')
why would you take something incredibly sortable (a DATE) and turn it into an ascii string that doesn't even sort correctly? (it will not sort by date at all, it'll sort as a string, wrong)
But - I don't believe you want the order by. I cannot really say, though, because you do not tell us what primary key should have been in place.
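The difference Tom is pointing at can be sketched with any engine that has window functions (here Python's sqlite3, SQLite 3.25+; data simplified): with an ORDER BY in the OVER clause, MIN is a running minimum over the rows up to the current one in that order, while without it MIN covers the whole partition - which is what a dedupe needs.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (equip_id int, id int)")
con.executemany("insert into t values (?,?)",
    [(3011, 26445635), (3011, 26445626), (3011, 26445617)])

# Running minimum: each row only sees rows at or before it in window order.
running = [r[0] for r in con.execute("""
    select min(id) over (partition by equip_id order by id desc) from t
    order by id desc""")]

# Partition-wide minimum: every row sees the whole partition.
whole = [r[0] for r in con.execute(
    "select min(id) over (partition by equip_id) from t")]
print(running, whole)
```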
 
 
 
Believe me - data and business logic belongs in the DB
Ken, February  28, 2006 - 8:32 am UTC
 
 
I have been telling the Java folks the same, and I have been preaching about bind variables as well. The database was designed by someone quite famous in the Java tech community a few years back. The data abstraction has created so many issues, and only the application knows what column contains what data, based on a value in another field. But they are listening. They have updated the code to use preparedStatements, and more changes are coming. A tiny victory, thanks in large part to your books and the forum. (The blog too, I must add. But I enjoy the non-tech pieces more since they show the other side we hardly see here. Keep it up!)
The only reason for using TO_CHAR was to expose the time values, but this can be done later. I will remove it and the sort and run it again.
It is unbelievable how this app works. Pulling all the data and sorting in Java, yikes!
Thank you so much, Tom.  
 
February  28, 2006 - 9:05 am UTC 
 
but dates sort by 
century, year, month, day, hour, minute, second
quite naturally!!!!!
By "exposing" the time in this example, you mucked up the sort order!!!
You put DD first - it would put the first of ANY MONTH before the second of ANY MONTH - messing it up.
it should just order by DATE - period.  Never order by to_char(date).... 
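The string-sort trap is easy to demonstrate; a quick Python sketch (illustrative dates) showing that 'DDMMYYYY' strings sort by day-of-month first, not chronologically:

```python
from datetime import date

dates = [date(2005, 3, 11), date(2002, 11, 30)]  # 11-Mar-2005 vs 30-Nov-2002

as_strings = sorted(d.strftime("%d%m%Y") for d in dates)  # 'DDMMYYYY' strings
chronological = sorted(dates)                             # native date sort

print(as_strings)      # the 2005 date lands first because its string starts '11'
print(chronological)   # 30-Nov-2002 correctly first
```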
 
 
 
Analytics rock Analytics roll
Alf, February  28, 2006 - 2:11 pm UTC
 
 
Hello Tom,
I have four tables proc_event, visit, patient, and event
I need to get a list of patients with all records in the proc_event table for  proc_id =  123 and 3456. 
Patient           proc_id
Patient A      123
                     3456
Patient B       123
                     3456
The query below lists all records from the patient table that have either of the proc_ids in the proc_event table.
Would you please show me how (or whether) I can rewrite this to get the desired listing?
Any information would be greatly appreciated. Thanks.
SELECT   p.P_NAME, 
         p.MRN, 
       pe.proc_id,
         e.date_time
    FROM ud.patient p,
         ud.proc_event pe,
         ud.event e,
         ud.visit v
   WHERE (    (p.patient_id = v.patient_id)
          AND (v.visit_id = e.visit_id)
          AND (e.event_id(+) = pe.event_id)
          AND (pe.proc_id in('123','3456'))
          AND (e.date_time BETWEEN 
               to_date('01-jan-2005 00:00','dd-mon-yyyy hh24:mi') AND
               to_date('20-jan-2005 23:59','dd-mon-yyyy hh24:mi')
              )
         )
GROUP BY p.P_NAME,
         p.mrn,
         pe.proc_id,
         e.date_time 
ORDER BY p.P_NAME ASC,
         p.mrn ASC,
         pe.proc_id ASC        
 
 
March     01, 2006 - 7:57 am UTC 
 
...
I have four tables proc_event, visit, patient, and event
I need to get a list of patients with all records in the proc_event table for  
proc_id =  123 and 3456. 
...
Well given that
a) we don't know how these relate
b) what columns might be available
c) pretty much don't know anything
It is sort of difficult.
It sounds like you simply want to 
a) join patients to proc_event (but we don't even know if you CAN!!!!!!!)
b) use an IN 
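Assuming Alf's joins are otherwise correct, "patients having BOTH procedures" is the classic relational-division pattern: group by patient and demand that both proc_ids appear. A hedged sketch via Python's sqlite3, with table and column names simplified from the original (this is one common approach, not necessarily what Tom would have written given the real schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table proc_event (patient_id int, proc_id text)")
con.executemany("insert into proc_event values (?,?)",
    [(1, '123'), (1, '3456'),     # patient 1 has both -> keep
     (2, '123'),                  # patient 2 has only one -> drop
     (3, '3456'), (3, '3456')])   # duplicates of one proc don't count twice

# COUNT(DISTINCT proc_id) = 2 keeps only patients with both procedures.
both = [r[0] for r in con.execute("""
    select patient_id
      from proc_event
     where proc_id in ('123', '3456')
     group by patient_id
    having count(distinct proc_id) = 2
     order by patient_id""")]
print(both)
```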
 
 
 
Difficulty with min(id) over partition by -- part deux
Ken, March     01, 2006 - 9:29 am UTC
 
 
Hi Tom:
I am still not getting this to group correctly. The date is now a DATE, and the TO_CHAR is removed. Could you please let me know what could be causing this? Thanks.
ken@DEV9206> break on min_id skip 1
ken@DEV9206> l
  1  SELECT equip_id, equip_type_id, type_id, created,
  2         col_1, col_0,
  3         ID,
  4         MIN(id) OVER (PARTITION BY equip_id
  5                       ORDER BY equip_type_id,
  6                                type_id,
  7                                created,
  8                                col_1,
  9                                col_0) min_id
 10   FROM dedupe_test_01
 11* WHERE ROWNUM < 51
ken@DEV9206> /
  EQUIP_ID EQUIP_TYPE_ID    TYPE_ID CREATED        COL_1      COL_0         ID     MIN_ID
---------- ------------- ---------- --------- ---------- ---------- ---------- ----------
      3011            29        221 30-NOV-02          3         77   26445635   26445635
      3011            29        221 30-NOV-02          3         77   26445651
      3011            29        221 30-NOV-02          3         86   26445626   26445626
      3011            29        221 30-NOV-02          3         86   26445653
      3011            29        221 30-NOV-02          3        112   26445617   26445617
      3011            29        221 30-NOV-02          3        112   26445620
      3011            29        221 30-NOV-02          3        125   26445631
      3011            29        221 30-NOV-02          3        125   26445641
      3551            29        221 30-NOV-02          3        112    3209094    3209094
      3551            29        221 30-NOV-02          3        112    3837739
      3551            29        221 30-NOV-02          3        112    3837759
      3551            29        221 30-NOV-02          3        112    6072229
      3551            29        221 30-NOV-02          3        112    8894682
      3551            29        221 30-NOV-02          3        118    3209095***
      3551            29        221 30-NOV-02          3        118    3837740
      3551            29        221 30-NOV-02          3        118    3837760
      3551            29        221 30-NOV-02          3        118    6072230
      3551            29        221 30-NOV-02          3        118    8894683
      3551            29        221 11-MAR-05          1        186    3209093    3209093
      3551            29        221 11-MAR-05          1        186    3837738
      3551            29        221 11-MAR-05          1        186    3837758
      3551            29        221 11-MAR-05          1        186    6072228
      3551            29        221 11-MAR-05          1        186    8894681
      3551            29        221 11-MAR-05          1        190    3209092    3209092
      3551            29        221 11-MAR-05          1        190    3837737
      3551            29        221 11-MAR-05          1        190    3837757
      3551            29        221 11-MAR-05          1        190    6072227
      3551            29        221 11-MAR-05          1        190    8894680
      3551            29        221 11-MAR-05          1        227    3209091    3209091
      3551            29        221 11-MAR-05          1        227    3837736
      3551            29        221 11-MAR-05          1        227    3837756
      3551            29        221 11-MAR-05          1        227    6072226
      3551            29        221 11-MAR-05          1        227    8894679
      3551            29        221 12-MAR-05          1        153    3209090    3209090
      3551            29        221 12-MAR-05          1        153    3837735
      3551            29        221 12-MAR-05          1        153    3837755
      3551            29        221 12-MAR-05          1        153    6072225
      3551            29        221 12-MAR-05          1        153    8894678
      3551            29        221 16-MAR-05          1        109    3837734***
      3551            29        221 16-MAR-05          1        109    3837754
      3551            29        221 16-MAR-05          1        109    6072224
      3551            29        221 16-MAR-05          1        109    8894677
      3551            29        221 16-MAR-05          1        128    3837733***
      3551            29        221 16-MAR-05          1        128    3837753
      3551            29        221 16-MAR-05          1        128    6072223
      3551            29        221 16-MAR-05          1        128    8894676
      3551            29        221 17-MAR-05          1        181    3837732
      3551            29        221 17-MAR-05          1        181    3837752
      3551            29        221 17-MAR-05          1        181    6072222
      3551            29        221 17-MAR-05          1        181    8894675
50 rows selected. 
 
March 01, 2006 - 9:52 am UTC 
 
I said you don't want the ORDER BY at all.
If you are trying to associate the MIN(ID) with every record in a group - you just partition - you do NOT order by (else, for each record, you get the min id of the window from the start of the group up to the current row).
I don't know what you are really trying to retrieve, since all I have to work with is non-functioning SQL - I don't know what your logic is.
The query is doing precisely what you asked: break the data up by X, sort by A,B,C, and assign min(id) over the default window (start of the group up to the current row).
I think "no order by" is called for. 
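Tom's point about the default window can be seen in a few lines. Below is a minimal sketch using Python's sqlite3 as a stand-in for Oracle (SQLite >= 3.25 supports the same analytic syntax); the table name, columns, and data are made up purely for illustration:

```python
import sqlite3

# Hypothetical toy table to show how ORDER BY changes the window of an
# analytic MIN(). Requires SQLite >= 3.25 for window-function support.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (grp TEXT, id INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [("a", 5), ("a", 3), ("a", 9), ("b", 7), ("b", 2)])

# With ORDER BY: the default frame runs from the start of the partition
# up to the current row, so MIN() is a *running* minimum, not the group
# minimum.
running = con.execute("""
    SELECT grp, id,
           MIN(id) OVER (PARTITION BY grp ORDER BY id DESC) AS m
      FROM t ORDER BY grp, id DESC""").fetchall()

# Without ORDER BY: the frame is the whole partition -> one MIN per group.
whole = con.execute("""
    SELECT grp, id,
           MIN(id) OVER (PARTITION BY grp) AS m
      FROM t ORDER BY grp, id""").fetchall()

print(running)  # running minimum follows the sort order
print(whole)    # every row in a group sees the same group minimum
```

With the running version, row (a, 5) gets m = 5 (min of 9 and 5 seen so far), while the no-ORDER-BY version gives every "a" row m = 3.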
 
 
 
Difficulty with min(id) over partition by  -- part deux - take 2
Ken, March 01, 2006 - 9:49 am UTC
 
 
Please disregard my previous posting; it did not come out right.
I still cannot get this to group correctly. The TO_CHAR was removed and the ORDER BY was removed. This gave me the following:
  1  SELECT rownum, equip_id, equip_type_id, type_id, created,
  2         col_1, col_0,
  3         ID,
  4         MIN(id) OVER (PARTITION BY equip_id) min_id
  5   FROM dedupe_test_01
  6* WHERE ROWNUM < 51
ken@DEV9206/
ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED   COL_1 COL_0       ID   MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
     1     3011            29     221 30-NOV-02     3    77 26445635 26445617
     2     3011            29     221 30-NOV-02     3    77 26445651
     3     3011            29     221 30-NOV-02     3    86 26445626
     4     3011            29     221 30-NOV-02     3    86 26445653
     5     3011            29     221 30-NOV-02     3   112 26445617
     6     3011            29     221 30-NOV-02     3   112 26445620
     7     3011            29     221 30-NOV-02     3   125 26445631
     8     3011            29     221 30-NOV-02     3   125 26445641
     9     3551            29     221 30-NOV-02     3   112  3209094  3209090
    10     3551            29     221 30-NOV-02     3   112  3837739
    11     3551            29     221 30-NOV-02     3   112  3837759
    12     3551            29     221 30-NOV-02     3   112  6072229
    13     3551            29     221 30-NOV-02     3   112  8894682
    14     3551            29     221 30-NOV-02     3   118  3209095
    15     3551            29     221 30-NOV-02     3   118  3837740
    16     3551            29     221 30-NOV-02     3   118  3837760
    17     3551            29     221 30-NOV-02     3   118  6072230
    18     3551            29     221 30-NOV-02     3   118  8894683
    19     3551            29     221 11-MAR-05     1   186  3209093
    20     3551            29     221 11-MAR-05     1   186  3837738
    21     3551            29     221 11-MAR-05     1   186  3837758
    22     3551            29     221 11-MAR-05     1   186  6072228
    23     3551            29     221 11-MAR-05     1   186  8894681
    24     3551            29     221 11-MAR-05     1   190  3209092
    25     3551            29     221 11-MAR-05     1   190  3837737
    26     3551            29     221 11-MAR-05     1   190  3837757
    27     3551            29     221 11-MAR-05     1   190  6072227
    28     3551            29     221 11-MAR-05     1   190  8894680
    29     3551            29     221 11-MAR-05     1   227  3209091
    30     3551            29     221 11-MAR-05     1   227  3837736
    31     3551            29     221 11-MAR-05     1   227  3837756
    32     3551            29     221 11-MAR-05     1   227  6072226
    33     3551            29     221 11-MAR-05     1   227  8894679
    34     3551            29     221 12-MAR-05     1   153  3209090
    35     3551            29     221 12-MAR-05     1   153  3837735
    36     3551            29     221 12-MAR-05     1   153  3837755
    37     3551            29     221 12-MAR-05     1   153  6072225
    38     3551            29     221 12-MAR-05     1   153  8894678
    39     3551            29     221 16-MAR-05     1   109  3837734
    40     3551            29     221 16-MAR-05     1   109  3837754
    41     3551            29     221 16-MAR-05     1   109  6072224
    42     3551            29     221 16-MAR-05     1   109  8894677
    43     3551            29     221 16-MAR-05     1   128  3837733
    44     3551            29     221 16-MAR-05     1   128  3837753
    45     3551            29     221 16-MAR-05     1   128  6072223
    46     3551            29     221 16-MAR-05     1   128  8894676
    47     3551            29     221 17-MAR-05     1   181  3837732
    48     3551            29     221 17-MAR-05     1   181  3837752
    49     3551            29     221 17-MAR-05     1   181  6072222
    50     3551            29     221 17-MAR-05     1   181  8894675
50 rows selected.
I added the ORDER BY back into the analytic (OVER) clause. Here's the result:
ken@DEV9206get q_1    
  1  SELECT rownum, equip_id, equip_type_id, type_id, created,
  2         col_1, col_0,
  3         ID,
  4         MIN(id) OVER (PARTITION BY equip_id
  5                       ORDER BY equip_type_id,
  6                                type_id,
  7                                created,
  8                                col_1,
  9                                col_0) min_id
 10   FROM dedupe_test_01
 11* WHERE ROWNUM < 51
ken@DEV9206/
ROWNUM EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED   COL_1 COL_0       ID   MIN_ID
------ -------- ------------- ------- --------- ----- ----- -------- --------
     1     3011            29     221 30-NOV-02     3    77 26445635 26445635
     2     3011            29     221 30-NOV-02     3    77 26445651
     3     3011            29     221 30-NOV-02     3    86 26445626 26445626
     4     3011            29     221 30-NOV-02     3    86 26445653
     5     3011            29     221 30-NOV-02     3   112 26445617 26445617
     6     3011            29     221 30-NOV-02     3   112 26445620
     7     3011            29     221 30-NOV-02     3   125 26445631
     8     3011            29     221 30-NOV-02     3   125 26445641
     9     3551            29     221 30-NOV-02     3   112  3209094  3209094
    10     3551            29     221 30-NOV-02     3   112  3837739
    11     3551            29     221 30-NOV-02     3   112  3837759
    12     3551            29     221 30-NOV-02     3   112  6072229
    13     3551            29     221 30-NOV-02     3   112  8894682
    14     3551            29     221 30-NOV-02     3   118  3209095
    15     3551            29     221 30-NOV-02     3   118  3837740
    16     3551            29     221 30-NOV-02     3   118  3837760
    17     3551            29     221 30-NOV-02     3   118  6072230
    18     3551            29     221 30-NOV-02     3   118  8894683
    19     3551            29     221 11-MAR-05     1   186  3209093  3209093
    20     3551            29     221 11-MAR-05     1   186  3837738
    21     3551            29     221 11-MAR-05     1   186  3837758
    22     3551            29     221 11-MAR-05     1   186  6072228
    23     3551            29     221 11-MAR-05     1   186  8894681
    24     3551            29     221 11-MAR-05     1   190  3209092  3209092
    25     3551            29     221 11-MAR-05     1   190  3837737
    26     3551            29     221 11-MAR-05     1   190  3837757
    27     3551            29     221 11-MAR-05     1   190  6072227
    28     3551            29     221 11-MAR-05     1   190  8894680
    29     3551            29     221 11-MAR-05     1   227  3209091  3209091
    30     3551            29     221 11-MAR-05     1   227  3837736
    31     3551            29     221 11-MAR-05     1   227  3837756
    32     3551            29     221 11-MAR-05     1   227  6072226
    33     3551            29     221 11-MAR-05     1   227  8894679
    34     3551            29     221 12-MAR-05     1   153  3209090  3209090
    35     3551            29     221 12-MAR-05     1   153  3837735
    36     3551            29     221 12-MAR-05     1   153  3837755
    37     3551            29     221 12-MAR-05     1   153  6072225
    38     3551            29     221 12-MAR-05     1   153  8894678
    39     3551            29     221 16-MAR-05     1   109  3837734
    40     3551            29     221 16-MAR-05     1   109  3837754
    41     3551            29     221 16-MAR-05     1   109  6072224
    42     3551            29     221 16-MAR-05     1   109  8894677
    43     3551            29     221 16-MAR-05     1   128  3837733
    44     3551            29     221 16-MAR-05     1   128  3837753
    45     3551            29     221 16-MAR-05     1   128  6072223
    46     3551            29     221 16-MAR-05     1   128  8894676
    47     3551            29     221 17-MAR-05     1   181  3837732
    48     3551            29     221 17-MAR-05     1   181  3837752
    49     3551            29     221 17-MAR-05     1   181  6072222
    50     3551            29     221 17-MAR-05     1   181  8894675
50 rows selected.
Row numbers 14, 39, 43, and 47 should have started new groups. What am I doing wrong that it is not seeing these as new groups? 
Thanks much for your help.
 
 
March 01, 2006 - 10:32 am UTC 
 
You need to just state in English what you are trying to do, rather than posting SQL that does not achieve it. 
Tell us your LOGIC. 
 
 
 
Difficulty with min(id) over partition by -- part deux - take 3 
Ken, March 01, 2006 - 10:38 am UTC
 
 
Thanks, Tom, fair enough.
We would like to select duplicate rows from a table based on the values in certain fields (not all fields will have duplicate data, since some are unique to each row), identify the row within each set with the lowest value in one column - min(id) - and have this value be shown in a separate column. 
Please let me know how we can achieve that. 
Thanks again. 
 
March 01, 2006 - 10:42 am UTC 
 
tell me what values perhaps.
In general, you will partition by the unique key
You will get the MIN(ID) by that key
No order by 
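Tom's recipe - partition by the key that defines a "duplicate", take MIN(ID) over that partition, no ORDER BY - can be sketched in a few lines. This is an illustrative sqlite3 stand-in (not Oracle), using a cut-down version of Ken's columns and a sample of his ids:

```python
import sqlite3

# Partition by the columns whose values define a "duplicate" (here just
# equip_id and col_0 for brevity), take MIN(id) with NO order by, and any
# row whose id differs from min_id is a deletion candidate.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dedupe (equip_id INT, col_0 INT, id INT)")
con.executemany("INSERT INTO dedupe VALUES (?, ?, ?)",
                [(3011, 77, 26445635), (3011, 77, 26445651),
                 (3011, 86, 26445626), (3011, 86, 26445653)])

rows = con.execute("""
    SELECT equip_id, col_0, id,
           MIN(id) OVER (PARTITION BY equip_id, col_0) AS min_id
      FROM dedupe ORDER BY equip_id, col_0, id""").fetchall()

# Rows that are not the "keeper" of their duplicate group:
dups = [r for r in rows if r[2] != r[3]]
print(dups)
```

Because there is no ORDER BY inside the OVER clause, every row in a group carries the same min_id, which is exactly the "shown in a separate column" behavior Ken asked for.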
 
 
 
Difficulty with min(id) over partition by -- part deux - take 4
Ken, March 01, 2006 - 11:54 am UTC
 
 
Thanks, Tom, but I think we are not on the same page yet. 
Please let me try again re: logic.
We have the following table:
ken@DEV9206desc dedupe_test_01
 Name                          Null?    Type
 ----------------------------- -------- ------------------
 ID                            NOT NULL NUMBER
 EQUIP_ID                      NOT NULL NUMBER
 EQUIP_TYPE_ID                 NOT NULL NUMBER
 TYPE_ID                       NOT NULL NUMBER
 CREATED                       NOT NULL DATE
 COL_0                                  NUMBER
 COL_1                                  NUMBER
ken@DEV9206l
  1  select * from dedupe_test_01
  2   where equip_id in (3011, 3551)
  3     and rownum < 51
  4*  order by 1,2,3,4,5,6
ken@DEV9206/
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED   COL_0 COL_1       ID
-------- ------------- ------- --------- ----- ----- --------
    3011            29     221 30-NOV-02    77     3 26445635
    3011            29     221 30-NOV-02    77     3 26445651
    3011            29     221 30-NOV-02    86     3 26445626
    3011            29     221 30-NOV-02    86     3 26445653
    3011            29     221 30-NOV-02   112     3 26445617
    3011            29     221 30-NOV-02   112     3 26445620
    3011            29     221 30-NOV-02   125     3 26445631
    3011            29     221 30-NOV-02   125     3 26445641
    3551            29     221 30-NOV-02   112     3  3209094
    3551            29     221 30-NOV-02   112     3  3837739
    3551            29     221 30-NOV-02   112     3  3837759
    3551            29     221 30-NOV-02   112     3  6072229
    3551            29     221 30-NOV-02   112     3  8894682
    3551            29     221 30-NOV-02   118     3  3209095
    3551            29     221 30-NOV-02   118     3  3837740
    3551            29     221 30-NOV-02   118     3  3837760
    3551            29     221 30-NOV-02   118     3  6072230
    3551            29     221 30-NOV-02   118     3  8894683
    3551            29     221 11-MAR-05   186     1  3209093
    3551            29     221 11-MAR-05   186     1  3837738
    3551            29     221 11-MAR-05   186     1  3837758
    3551            29     221 11-MAR-05   186     1  6072228
    3551            29     221 11-MAR-05   186     1  8894681
    3551            29     221 11-MAR-05   190     1  3209092
    3551            29     221 11-MAR-05   190     1  3837737
    3551            29     221 11-MAR-05   190     1  3837757
    3551            29     221 11-MAR-05   190     1  6072227
    3551            29     221 11-MAR-05   190     1  8894680
    3551            29     221 11-MAR-05   227     1  3209091
    3551            29     221 11-MAR-05   227     1  3837736
    3551            29     221 11-MAR-05   227     1  3837756
    3551            29     221 11-MAR-05   227     1  6072226
    3551            29     221 11-MAR-05   227     1  8894679
    3551            29     221 12-MAR-05   153     1  3209090
    3551            29     221 12-MAR-05   153     1  3837735
    3551            29     221 12-MAR-05   153     1  3837755
    3551            29     221 12-MAR-05   153     1  6072225
    3551            29     221 12-MAR-05   153     1  8894678
    3551            29     221 16-MAR-05   109     1  3837734
    3551            29     221 16-MAR-05   109     1  3837754
    3551            29     221 16-MAR-05   109     1  6072224
    3551            29     221 16-MAR-05   109     1  8894677
    3551            29     221 16-MAR-05   128     1  3837733
    3551            29     221 16-MAR-05   128     1  3837753
    3551            29     221 16-MAR-05   128     1  6072223
    3551            29     221 16-MAR-05   128     1  8894676
    3551            29     221 17-MAR-05   181     1  3837732
    3551            29     221 17-MAR-05   181     1  3837752
    3551            29     221 17-MAR-05   181     1  6072222
    3551            29     221 17-MAR-05   181     1  8894675
50 rows selected.
And here's the answer I need:
EQUIP_ID EQUIP_TYPE_ID TYPE_ID CREATED   COL_0 COL_1       ID   MIN_ID
-------- ------------- ------- --------- ----- ----- -------- --------  
    3011            29     221 30-NOV-02    77     3 26445635 26445635
    3011            29     221 30-NOV-02    77     3 26445651
    3011            29     221 30-NOV-02    86     3 26445626 26445626
    3011            29     221 30-NOV-02    86     3 26445653
    3011            29     221 30-NOV-02   112     3 26445617 26445617
    3011            29     221 30-NOV-02   112     3 26445620
    3011            29     221 30-NOV-02   125     3 26445631 26445631
    3011            29     221 30-NOV-02   125     3 26445641
    3551            29     221 30-NOV-02   112     3  3209094 3209094
    3551            29     221 30-NOV-02   112     3  3837739
    3551            29     221 30-NOV-02   112     3  3837759
    3551            29     221 30-NOV-02   112     3  6072229
    3551            29     221 30-NOV-02   112     3  8894682
    3551            29     221 30-NOV-02   118     3  3209095 3209095
    3551            29     221 30-NOV-02   118     3  3837740
    3551            29     221 30-NOV-02   118     3  3837760
    3551            29     221 30-NOV-02   118     3  6072230
    3551            29     221 30-NOV-02   118     3  8894683
    3551            29     221 11-MAR-05   186     1  3209093 3209093
    3551            29     221 11-MAR-05   186     1  3837738
    3551            29     221 11-MAR-05   186     1  3837758
    3551            29     221 11-MAR-05   186     1  6072228
    3551            29     221 11-MAR-05   186     1  8894681
    3551            29     221 11-MAR-05   190     1  3209092 3209092
    3551            29     221 11-MAR-05   190     1  3837737
    3551            29     221 11-MAR-05   190     1  3837757
    3551            29     221 11-MAR-05   190     1  6072227
    3551            29     221 11-MAR-05   190     1  8894680
Perhaps PARTITION BY is not appropriate? Thanks.
 
 
March 01, 2006 - 1:49 pm UTC 
 
I know we are not on the same page - because all I want is a textual description of the logic.
I do not, will not, reverse engineer "this is what I get" and "this is what I want".
I want you to write it down as if you were going to give it to someone to implement,
because... that is precisely what you are doing. 
 
 
 
Difficulty with min(id) over partition by -- part deux - take 5
Ken, March 01, 2006 - 12:23 pm UTC
 
 
Thanks, Tom. Your questions forced me to look at it again. I was able to resolve it without using PARTITION BY. 
Thanks again. 
 
 
Analytics Question
Mike, March 01, 2006 - 2:48 pm UTC
 
 
I am having an issue with using analytics. Here is what I need to accomplish:
I have a table called entity. This table is recursive in that every entity,
with the exception of the highest-level entity, has a parent entity.
Normally this list is easily generated by using CONNECT BY in the SQL to show the hierarchy.
The problem is that now we have to show all the tickets that have been opened for each entity
and all the entities under it.
A simple join will get the open tickets for each entity, but summing all of the tickets
for each entity and all entities under it takes too many resources to be considered usable.
I have tried several ways to get this query to work with as few resources as possible, but I still
believe that this SQL will degrade to unacceptable performance as the table(s) grow.
What I really need to do is sum the entities' tickets in reverse order, from the bottom up to
the level 1 entity, so that I don't have to traverse the tree from level 1 to level 6, then each
level 2 to level 6, and so on.
I wrote a quick procedure to populate a table (t) with each uuid that has a ticket, its sum of
the tickets tied to that entity, and a concatenated list of uuids under that entity in the tree.
This resulted in a much smaller table, but it is not the way I would want to implement it.
This table is going to grow to several hundred thousand entities, and I need to be able
to generate this list much faster, using fewer resources than the CONNECT BY statements below.
Any help would definitely be appreciated.
Without using a procedure and a two-step process, here is what I have to work with for a test:
CREATE TABLE ENTITY (
 ENTITY_UUID     VARCHAR2(32),
 NAME            VARCHAR2(256),
 PARENT_UUID     VARCHAR2(32)
)
/
Table created.
CREATE TABLE ENTITY_TCKT (
 ENTITY_UUID             VARCHAR2(32),
 CURRENT_LIFECYCLE_STATE NUMBER
)
/
Table created.
Then populate the entity table,
insert into entity values ('13E7CAA5FDEB42518A798A77A19F70B0','Level1 Entity',NULL);
insert into entity values ('66A6A6EFFA9D46BE82EC8F5CFFAC91B9','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('DCF6B6366D6449DB95A5AEA6B14F31F7','Level5 Entity','66A6A6EFFA9D46BE82EC8F5CFFAC91B9');
insert into entity values ('E2FD444948714528805EBFFA102511F5','Level5 Entity','CB4E1B74035947B9A5B9B0FE264DF4E7');
insert into entity values ('2E1E0646AC9F4BB9A6E4A747B20B2595','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('54133391FDD54221B11382A20DFC38AA','Level2 Entity','13E7CAA5FDEB42518A798A77A19F70B0');
insert into entity values ('95A5F85D68184DB7A49F9DF7A236F9AF','Level5 Entity','99762DC75A5D42DCBEA6950D7011F130');
insert into entity values ('EAD30C5578BD491991B0D7049CD4F277','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('536FCF7E4A5D457B8C3AECBED878FDBF','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('883FD970DF264B7A9DC0DFBAF225012A','Level4 Entity','536FCF7E4A5D457B8C3AECBED878FDBF');
insert into entity values ('C23E104B5AA044F795A6896B1C0B08E4','Level3 Entity','54133391FDD54221B11382A20DFC38AA');
insert into entity values ('64BA9F3295194A3B955FD446DBB2E7EC','Level4 Entity','C23E104B5AA044F795A6896B1C0B08E4');
insert into entity values ('0FE28D72C40C4D6FBB439A49B0BE6D3F','Level5 Entity','64BA9F3295194A3B955FD446DBB2E7EC');
----------list goes on------------------
Then to generate some random ticket data
DECLARE
l_tckt_ktr NUMBER;
l_entity_ktr NUMBER;
BEGIN
FOR i IN (select entity_uuid from entity)
LOOP
        l_entity_ktr := round(dbms_random.value(1,6),0);
        DBMS_OUTPUT.PUT_LINE('Processing entity -> '||i.entity_uuid||' l_entity_ktr = '||l_entity_ktr);
        IF l_entity_ktr = 1 THEN
                l_tckt_ktr := round(dbms_random.value(1,10),0);
                FOR n IN 1 .. l_tckt_ktr
                LOOP
                        insert into entity_tckt values (i.entity_uuid,0);
                END LOOP;
        ELSIF l_entity_ktr = 3 THEN
                l_tckt_ktr := round(dbms_random.value(50,100),0);
                FOR n IN 1 .. l_tckt_ktr
                LOOP
                        insert into entity_tckt values (i.entity_uuid,0);
                END LOOP;
        ELSIF l_entity_ktr = 5 THEN
                l_tckt_ktr := round(dbms_random.value(500,1000),0);
                FOR n IN 1 .. l_tckt_ktr
                LOOP
                        insert into entity_tckt values (i.entity_uuid,0);
                END LOOP;
        END IF;
END LOOP;
END;
/
PL/SQL procedure successfully completed.
Now that the tables have data, try two ways to get the query
set lines 256
column lpad('',2*level)||e.name format a50
column lpad('',2*level)||e.entity_uuid format a50
select lpad(' ', 2 * level)||e.name, lpad(' ', 2 * level)||e.entity_uuid, level,
        (select count(*) from entity_tckt where current_lifecycle_state = 0 and entity_uuid in (
                select a.entity_uuid from entity a start with a.entity_uuid = e.entity_uuid connect by prior a.entity_uuid = a.parent_uuid)) as nbr
from entity e
start with e.entity_uuid = '13E7CAA5FDEB42518A798A77A19F70B0' connect by prior e.entity_uuid = e.parent_uuid
/
LPAD('',2*LEVEL)||E.NAME                           LPAD('',2*LEVEL)||E.ENTITY_UUID                         LEVEL        NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
  Level1 Entity                                      13E7CAA5FDEB42518A798A77A19F70B0                          1      15533
    Level2 Entity                                      2E1E0646AC9F4BB9A6E4A747B20B2595                        2          7
    Level2 Entity                                      54133391FDD54221B11382A20DFC38AA                        2      15526
      Level3 Entity                                      536FCF7E4A5D457B8C3AECBED878FDBF                      3        890
        Level4 Entity                                      66A6A6EFFA9D46BE82EC8F5CFFAC91B9                    4         51
          Level5 Entity                                      DCF6B6366D6449DB95A5AEA6B14F31F7                  5          0
        Level4 Entity                                      EAD30C5578BD491991B0D7049CD4F277                    4         76
        Level4 Entity                                      883FD970DF264B7A9DC0DFBAF225012A                    4          0
        Level4 Entity                                      0F01178489CB421CA5A4C55AEC98300E                    4          4
          Level5 Entity                                      5B3B033B4A30443BB1B6F57C1BB05399                  5          4
        Level4 Entity                                      CB4E1B74035947B9A5B9B0FE264DF4E7                    4         69
LPAD('',2*LEVEL)||E.NAME                           LPAD('',2*LEVEL)||E.ENTITY_UUID                         LEVEL        NBR
-------------------------------------------------- -------------------------------------------------- ---------- ----------
          Level5 Entity                                      E2FD444948714528805EBFFA102511F5                  5         69
        Level4 Entity                                      99762DC75A5D42DCBEA6950D7011F130                    4        607
          Level5 Entity                                      95A5F85D68184DB7A49F9DF7A236F9AF                  5        517
-------------------and on and on----------------
122 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (AGGREGATE)
   2    1     FILTER
   3    2       TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
   4    2       FILTER
   5    4         CONNECT BY (WITH FILTERING)
   6    5           NESTED LOOPS
   7    6             TABLE ACCESS (FULL) OF 'ENTITY'
   8    6             TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
   9    5           NESTED LOOPS
  10    9             BUFFER (SORT)
  11   10               CONNECT BY PUMP
  12    9             TABLE ACCESS (FULL) OF 'ENTITY'
  13    0   CONNECT BY (WITH FILTERING)
  14   13     NESTED LOOPS
  15   14       TABLE ACCESS (FULL) OF 'ENTITY'
  16   14       TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
  17   13     NESTED LOOPS
  18   17       BUFFER (SORT)
  19   18         CONNECT BY PUMP
  20   17       TABLE ACCESS (FULL) OF 'ENTITY'
Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
     208383  consistent gets
          0  physical reads
          0  redo size
       9007  bytes sent via SQL*Net to client
        591  bytes received via SQL*Net from client
         10  SQL*Net roundtrips to/from client
      27913  sorts (memory)
          0  sorts (disk)
        122  rows processed
And now using 'WITH'
with alm as (select entity_uuid, count(*) as nbr
            from entity_tckt
            group by entity_uuid)
select e.entity_uuid, sum(nbr) from entity e, alm
where alm.entity_uuid in (select entity_uuid from entity start with entity_uuid = e.entity_uuid connect by prior entity_uuid = parent_uuid)
group by e.entity_uuid
/
ENTITY_UUID                        SUM(NBR)
-------------------------------- ----------
026D4A6B547544E08CFEC5DDED3B7777        772
02AD7930E9B94A20A9F4E245B3F4B8C4       1747
04B3AD5F804B4A66AC6C91606EA7019B         74
070547E30C604D4680089959A6DB7684        677
08ABA73521F94F2DB19FF8C1C53ADF06          2
0D65B9BB906B442B9A6BDFEEF3D858AC         86
0F01178489CB421CA5A4C55AEC98300E          4
0FE28D72C40C4D6FBB439A49B0BE6D3F        716
10801AFBCBC44F2F9851E16BC1CCA442        703
13E7CAA5FDEB42518A798A77A19F70B0      15533
1677A02C0B89479BA27F95D7AA3DAC03       9516
ENTITY_UUID                        SUM(NBR)
-------------------------------- ----------
1AE6BCDA1FA74B0B8DEBEFF13F9A444A         96
1F9BD4CD72FA4879AC338397FB59FD39         74
21FAD03A5EB34D79908E96DA647FF24C        592
235F6C1600584CDD8B3FB1FFCD742E7E        839
23EA0B4618A94C1586BDF08437B97EB0       3067
25CFE1E64F36468DB291CBCF0867B314         59
26C635064E83447B91BA8D125FE74C2A        776
2ABC45D4A5D84684AC29A578C5CCBD3E       1759
2DF37BBB64254DB29081F03D38B5CE33        876
2E1E0646AC9F4BB9A6E4A747B20B2595          7
2FF85FEF82864C0BB75BA4513885ED0E         76
----------------more data ----------------
71 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (GROUP BY)
   2    1     FILTER
   3    2       NESTED LOOPS
   4    3         VIEW
   5    4           SORT (GROUP BY)
   6    5             TABLE ACCESS (FULL) OF 'ENTITY_TCKT'
   7    3         TABLE ACCESS (FULL) OF 'ENTITY'
   8    2       FILTER
   9    8         CONNECT BY (WITH FILTERING)
  10    9           NESTED LOOPS
  11   10             TABLE ACCESS (FULL) OF 'ENTITY'
  12   10             TABLE ACCESS (BY USER ROWID) OF 'ENTITY'
  13    9           NESTED LOOPS
  14   13             BUFFER (SORT)
  15   14               CONNECT BY PUMP
  16   13             TABLE ACCESS (FULL) OF 'ENTITY'
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     197127  consistent gets
          0  physical reads
          0  redo size
       3707  bytes sent via SQL*Net to client
        547  bytes received via SQL*Net from client
          6  SQL*Net roundtrips to/from client
      27902  sorts (memory)
          0  sorts (disk)
         71  rows processed
The logical reads will kill us as the table grows - Any help you could give would be fantastic
P.S. - I love the site, use it every day - Thanx
Mike
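The bottom-up rollup Mike describes can be done in a single pass once the per-entity ticket counts and parent pointers are loaded, instead of running a CONNECT BY per entity. A minimal Python sketch of that idea, with made-up short ids in place of the 32-character uuids (a real implementation over several hundred thousand entities would memoize or accumulate in one post-order walk rather than recompute per node):

```python
from collections import defaultdict

# parent pointers: entity -> parent (None for the level 1 root);
# short placeholder ids stand in for the real entity_uuid values.
parent = {
    "L1": None,
    "L2a": "L1", "L2b": "L1",
    "L3": "L2b",
    "L4": "L3",
}
# direct (own) ticket counts per entity, i.e. the GROUP BY entity_uuid
# result from entity_tckt; entities with no tickets are simply absent.
own = {"L2a": 7, "L3": 890, "L4": 51}

# Invert the parent pointers into a children map - one pass over the data.
children = defaultdict(list)
for node, p in parent.items():
    if p is not None:
        children[p].append(node)

def rolled_up(node):
    """Tickets for this entity plus everything beneath it (post-order)."""
    return own.get(node, 0) + sum(rolled_up(c) for c in children[node])

totals = {n: rolled_up(n) for n in parent}
print(totals)
```

The key point is that each table is read exactly once; the hierarchy work happens in memory, so the consistent gets stop growing with the depth and breadth of the tree.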
 
 
 
Analytics rock Analytics roll
Alf, March 01, 2006 - 2:50 pm UTC
 
 
Sorry, I should have mentioned that there is no direct relation between patient and proc_event.
The only way I am able to relate patients to proc_event is by joining event to proc_event as (e.event_id(+) = pe.event_id) and then visit to patient (p.patient_id = v.patient_id). The IN operator is working as expected; however, I tried to use AND, as in pe.proc_id = '123' AND pe.proc_id = '3456', because I want to return only records for patients who have both 123 and 3456 (proc_id) records in the proc_event table.
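The "AND across rows" Alf is after cannot be expressed as a predicate on a single row - no one row has proc_id equal to both '123' and '3456'. The usual pattern is to filter to the two proc_ids, group per patient, and keep the groups that contain both distinct values. A minimal sqlite3 sketch with simplified, hypothetical table and column names:

```python
import sqlite3

# Toy proc_event table: patient 1 has both procedures, the others do not.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE proc_event (patient_id INT, proc_id TEXT)")
con.executemany("INSERT INTO proc_event VALUES (?, ?)",
                [(1, "123"), (1, "3456"),    # has both -> qualifies
                 (2, "123"),                 # only one
                 (3, "3456"), (3, "3456")])  # same one twice, still only one

# COUNT(DISTINCT ...) = 2 guards against a patient having the same
# procedure repeated; plain COUNT(*) = 2 would wrongly match patient 3.
both = con.execute("""
    SELECT patient_id
      FROM proc_event
     WHERE proc_id IN ('123', '3456')
     GROUP BY patient_id
    HAVING COUNT(DISTINCT proc_id) = 2""").fetchall()
print(both)   # only patient 1
```

In the real query this grouping would sit in a subquery on proc_event (joined back through event/visit to patient), rather than as an AND in the row-level WHERE clause.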
 
 
 
Hello Tom, Analytics rock Analytics roll
Alf, March 02, 2006 - 4:51 pm UTC
 
 
Hi Tom,
I've been trying many different approaches for this.
Here is the table description; I'm including the whole output from the DESC command. Not sure if you'd want me to include everything or cut out the non-relevant columns.
SQL> desc ud_master.patient
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PATIENT_ID                                NOT NULL NUMBER(12)
 NAME                                               VARCHAR2(100)
 TITLE_ID                                           NUMBER(12)
 MEDICAL_RECORD_NUMBER                              VARCHAR2(20)
 SEX                                                VARCHAR2(8)
 BIRTHDATE                                          DATE
 DATE_OF_DEATH                                      DATE
 APT_SUITE                                          VARCHAR2(100)
 STREET_ADDRESS                                     VARCHAR2(100)
 CITY                                               VARCHAR2(50)
 STATE                                              VARCHAR2(50)
 COUNTRY                                            VARCHAR2(50)
 MAILING_CODE                                       VARCHAR2(50)
 MARITAL_STATUS_ID                                  NUMBER(12)
 RACE_ID                                            NUMBER(12)
 RELIGION_ID                                        NUMBER(12)
 FREE_TEXT_RELIGION                                 VARCHAR2(100)
 OCCUPATION_ID                                      NUMBER(12)
 FREE_TEXT_OCCUPATION                               VARCHAR2(100)
 EMPLOYER_ID                                        NUMBER(12)
 FREE_TEXT_EMPLOYER                                 VARCHAR2(150)
 MOTHER_PATIENT_ID                                  NUMBER(12)
 COLLAPSED_INTO_PATIENT_ID                          NUMBER(12)
 SOCIAL_SECURITY_NUMBER                             VARCHAR2(15)
 LIFECARE_VISIT_ID                                  NUMBER(12)
 CONFIDENTIAL_FLAG                                  VARCHAR2(1)
 HOME_PHONE                                         VARCHAR2(20)
 DAY_PHONE                                          VARCHAR2(20)
 SMOKER_FLAG                                        VARCHAR2(1)
 CURRENT_LOCATION                                   VARCHAR2(15)
 SEC_LANG_NAME                                      VARCHAR2(100)
 ADDR_STRING                                        VARCHAR2(50)
 BLOCK_CODE                                         VARCHAR2(50)
SQL> desc ud_master.visit
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 VISIT_NUMBER                                       VARCHAR2(40)
 PATIENT_ID                                         NUMBER(12)
 VISIT_TYPE_ID                                      NUMBER(12)
 VISIT_SUBTYPE_ID                                   NUMBER(12)
 VISIT_STATUS_ID                                    NUMBER(12)
 FACILITY_ID                                        NUMBER(12)
 ATTENDING_EMP_PROVIDER_ID                          NUMBER(12)
 RESIDENT_EMP_PROVIDER_ID                           NUMBER(12)
 ADMISSION_DATE_TIME                                DATE
 DISCHARGE_DATE_TIME                                DATE
 DISCHARGE_TYPE_ID                                  NUMBER(12)
 MARITAL_STATUS_ID                                  NUMBER(12)
 RELIGION_ID                                        NUMBER(12)
 FREE_TEXT_RELIGION                                 VARCHAR2(100)
 FINANCIAL_CLASS_ID                                 NUMBER(12)
 OCCUPATION_ID                                      NUMBER(12)
 FREE_TEXT_OCCUPATION                               VARCHAR2(100)
 EMPLOYER_ID                                        NUMBER(12)
 FREE_TEXT_EMPLOYER                                 VARCHAR2(100)
 PHYSICIAN_SERVICE_ID                               VARCHAR2(12)
 LOCATION_ID                                        VARCHAR2(12)
 ADDL_RESP_EMP_PROVIDER_ID                          NUMBER(12)
 ADDL_RESP_STRING                                   VARCHAR2(100)
 ATTENDING_STRING                                   VARCHAR2(100)
 RESIDENT_STRING                                    VARCHAR2(100)
 LAST_LOCATION                                      VARCHAR2(15)
 ADDL_RESP_RESIDENT_SERVICE_ID                      NUMBER(12)
 TRIAGE_ACUITY_ID                                   NUMBER(12)
 SERIES_VISIT_FLAG                                  VARCHAR2(5)
 NEWBORN_FLAG                                       VARCHAR2(5)
SQL> desc ud_master.proc_event
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 ORDER_SPAN_ID                                      NUMBER(12)
 ORDER_SPAN_STATE_ID                                NUMBER(12)
 PROC_ID                                            NUMBER(12)
 ORIG_SCHEDULE_BEGIN_DATE_TIME                      DATE
 ORIG_SCHEDULE_END_DATE_TIME                        DATE
 FINAL_SCHEDULE_BEGIN_DATE_TIME                     DATE
 FINAL_SCHEDULE_END_DATE_TIME                       DATE
 ABNORMAL_STATE_ID                                  VARCHAR2(3)
 MODIFIED_PROC_NAME                                 VARCHAR2(250)
 FACILITY_ID                                        NUMBER(12)
 PRIORITY_ID                                        NUMBER(12)
 CORRECTED_FLAG                                     VARCHAR2(5)
 RX_FLAG                                            VARCHAR2(5)
 SPEC_RECOLLECT_FLAG                                VARCHAR2(5)
 COMPLETE_RESULT_RPT                                NUMBER(12)
 ORDER_VISIT_ID                                     NUMBER(12)
 ORDER_DEFINITION_ID                                VARCHAR2(25)
 PROC_ORDER_NBR                                     NUMBER(12)
SQL> desc ud_master.event
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 DATE_TIME                                          DATE
 EVENT_STATUS_ID                                    NUMBER(12)
 EVENT_TYPE_ID                                      NUMBER(12)
 PATIENT_SCHEDULE_DISPLAY                           VARCHAR2(100)
I've created the following query with a subquery and the EXISTS clause; would you please review it and let me know what I need to correct? Many thanks in advance.
SELECT DISTINCT  patient.NAME,
         patient.medical_record_number Medical_RN,
         sub_epe.proc_id,
         to_date(sub_epe.date_time,'dd-mon-yy') event_dt
    FROM ud_master.patient,
         (SELECT proc_event.proc_id, event.date_time,proc_event.VISIT_ID
          FROM ud_master.event, ud_master.proc_event
          WHERE (exists (select * 
                         from ud_master.proc_event sub_p
                        where sub_p.proc_id = sub_p.proc_id) 
                          and (proc_event.visit_id = event.visit_id)))sub_epe,ud_master.visit
    WHERE (   (patient.patient_id = visit.patient_id)
          AND (sub_epe.proc_id = 21078) and (sub_epe.proc_id = 22025)
          AND (sub_epe.date_time BETWEEN to_date('25-jan-2005','dd-mon-yyyy') AND
                                         to_date('31-jan-2005','dd-mon-yyyy')))
GROUP BY patient.NAME,
         patient.medical_record_number,
         sub_epe.proc_id,
         sub_epe.date_time     
  
 
 
 
Another Select Query
A reader, March     08, 2006 - 2:55 am UTC
 
 
Hi Tom,
Have a requirement like this:
Create table test(
main number,
a1 varchar2(10),
a2 varchar2(10),
a3 varchar2(10),
a4 varchar2(10),
a5 varchar2(10)) nologging;
Insert into test (1,'A1','A2','A3','A4','A5');
Normal Output:
Main A1 A2 A3 A4 A5
---- -- -- -- -- --
   1 A1 A2 A3 A4 A5
Desired Output:
---------------
Main   Txt
----   ---
1      A1
1      A2
1      A3
1      A4
1      A5
I know it is a simple query.
Please advice
 
 
March     09, 2006 - 12:13 pm UTC 
 
ops$tkyte@ORA10GR2> with r as
  2  (select level l from dual connect by level <= 5)
  3  select test.main,
  4         decode( r.l, 1, a1, 2, a2, 3, a3, 4, a4, 5, a5 ) data
  5    from test, r;
      MAIN DATA
---------- ----------
         1 A1
         1 A2
         1 A3
         1 A4
         1 A5
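Tom's DECODE against a CONNECT BY row generator is the classic way to turn columns into rows. As a side note (assumption: an Oracle release of 11g or later, which postdates this thread), the built-in UNPIVOT operator expresses the same transformation directly:

```sql
-- Sketch: requires Oracle 11g+; equivalent to the DECODE/row-generator
-- approach above for the TEST table defined in this question.
SELECT main, txt
  FROM test
UNPIVOT (txt FOR col IN (a1, a2, a3, a4, a5));
```

UNPIVOT also gives you the source column name for free (the COL column here), which the DECODE version would need another DECODE to produce.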
 
 
 
 
 
Analytics rock Analytics roll 
Alf, March     09, 2006 - 4:37 pm UTC
 
 
Hello Tom,
How about this?
I have four tables (descs included below) that hold information about patients along with their visits and the tests (procedures, hence the proc table) performed. In some cases a patient is ordered two tests/procedures within a certain period of time.
I need to create a report listing patients that have had at least two procedures done, in this case procedures 21078 and 22025. The desired output would be:
P_name            P_MRN           Proc_ID      Event_Date_Time
----------------  -----------     -----------  --------------------
Patient Test, A   44422244555      21078       01-jan-2005 00:00
                                   22025       05-jan-2005 10:35
Patient Test, B   3334442222       21078       28-feb-2005 11:15
                                   22025       31-jul-2005  01:35
...
My challenge is that there's no direct relation between patient and proc_event. Would you please review these tables and give a hint on how to approach this? Thanks.
Patient
=======
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 PATIENT_ID                                NOT NULL NUMBER(12)
 NAME                                               VARCHAR2(100)
 TITLE_ID                                           NUMBER(12)
 MEDICAL_RECORD_NUMBER                              VARCHAR2(20)
 SEX                                                VARCHAR2(8)
.....
Visit
=====
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 VISIT_NUMBER                                       VARCHAR2(40)
 PATIENT_ID                                         NUMBER(12)
 VISIT_TYPE_ID                                      NUMBER(12)
 VISIT_SUBTYPE_ID                                   NUMBER(12)
 VISIT_STATUS_ID                                    NUMBER(12)
 FACILITY_ID                                        NUMBER(12)
......
proc_event
=====
Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 ORDER_SPAN_ID                                      NUMBER(12)
 ORDER_SPAN_STATE_ID                                NUMBER(12)
 PROC_ID                                            NUMBER(12)
......
event
=====
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 VISIT_ID                                  NOT NULL NUMBER(12)
 EVENT_ID                                  NOT NULL NUMBER(12)
 DATE_TIME                                          DATE
 EVENT_STATUS_ID                                    NUMBER(12)
 EVENT_TYPE_ID                                      NUMBER(12)
 PATIENT_SCHEDULE_DISPLAY                           VARCHAR2(100)
...... 
 
 
Analytics Rock
Alf, March     10, 2006 - 3:31 pm UTC
 
 
Hello Tom,
Follow-up to my previous question above regarding a list of patients who have had two specific tests performed/ordered, in this case proc_id 22025 and 21078.
I finally got the query below to work. Would you please review it and let me know if there's anything I need to change to improve performance? Thanks.
SELECT 
DISTINCT patient.medical_record_number mrn,
         patient.name p_name,
         proc.proc_id,proc.name,
         last_day(to_date(event.date_time, 'dd-mon-yy')) D_time
  FROM ud_master.event, 
       ud_master.proc_event,
       ud_master.patient,
       ud_master.visit, 
       ud_master.proc 
 WHERE ( 
       (patient.patient_id = visit.patient_id)
   AND (visit.visit_id = proc_event.visit_id)
   AND (visit.visit_id = event.visit_id)  
   AND (proc_event.visit_id = event.visit_id)
   AND (proc_event.event_id = event.event_id)
   AND (proc.proc_id in (22025,21078))
   AND (event.date_time BETWEEN to_date('01-jan-2005','dd-mon-yyyy') AND
                                to_date('31-dec-2005','dd-mon-yyyy')))
GROUP BY patient.medical_record_number,
         patient.name,
         proc.proc_id,
         proc.name,
         to_date(event.date_time, 'dd-mon-yy')
                  
ORDER BY patient.medical_record_number,'p_name',
         proc.proc_id,
         proc.name,'D_time' 
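A few points about the query above (sketch only, not run against this schema): ud_master.proc is never joined to proc_event, so it is effectively a cross join; DISTINCT is redundant when every selected column appears in the GROUP BY; TO_DATE applied to a DATE column forces an implicit conversion, where TRUNC drops the time portion directly; and ORDER BY 'p_name' / 'D_time' sorts by string constants, not by the aliased columns. A trimmed version might look like:

```sql
-- Sketch: same report with the missing proc join added, the redundant
-- DISTINCT removed, and real columns (not string literals) in ORDER BY.
SELECT pat.medical_record_number mrn,
       pat.name                  p_name,
       pr.proc_id,
       pr.name,
       LAST_DAY(TRUNC(e.date_time)) d_time
  FROM ud_master.event      e,
       ud_master.proc_event pe,
       ud_master.patient    pat,
       ud_master.visit      v,
       ud_master.proc       pr
 WHERE pat.patient_id = v.patient_id
   AND v.visit_id     = pe.visit_id
   AND pe.visit_id    = e.visit_id
   AND pe.event_id    = e.event_id
   AND pe.proc_id     = pr.proc_id          -- join that was missing above
   AND pr.proc_id    IN (22025, 21078)
   AND e.date_time BETWEEN TO_DATE('01-jan-2005','dd-mon-yyyy')
                       AND TO_DATE('31-dec-2005','dd-mon-yyyy')
 GROUP BY pat.medical_record_number, pat.name,
          pr.proc_id, pr.name, LAST_DAY(TRUNC(e.date_time))
 ORDER BY mrn, p_name, pr.proc_id, d_time;
```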
 
 
Analytic function
Mohamed from France, March     23, 2006 - 4:20 am UTC
 
 
Dear Tom,
I have a table T1 as follows:
IDE                             NOT NULL NUMBER
PTFL_IDE                        NOT NULL NUMBER
BATY_TYP                        NOT NULL VARCHAR2(6)
CTP_IC                          NOT NULL NUMBER
LABL_PLY                        NOT NULL VARCHAR2(6)
NMFI_IDE                        NOT NULL NUMBER
QTY                             NOT NULL NUMBER
and a table T2 as follows
IDE                             NOT NULL NUMBER
SNAP_NMFI_IDE                   NOT NULL NUMBER
IDE_PORTF                       VARCHAR2(50)  ===> 
SNAP_BATY_TYP                   VARCHAR2(6)
USR_INS                         VARCHAR2(48)
DAT_INS                         DATE
USR_UPD                         VARCHAR2(48)
DAT_UPD                         DATE
PGM_INS                         VARCHAR2(48)
PGM_UPD                         VARCHAR2(48)
and table T3 as follows
PTFL_IDE                        NOT NULL NUMBER
IDE_PORTF                       NOT NULL VARCHAR2(50)
PTF_STATUS                      NOT NULL VARCHAR2(3)
NLAB_COD_LABL                   NOT NULL VARCHAR2(6)
I would like to get from table T1 the sum(T1.QTY) grouped by T1.PTFL_IDE,T1.BATY_TYP,T1.CTP_IC, T1.LABL_PLY + 
the max(T2.dat_ins) for each (T3.IDE_PORTF,T3.SNAP_BATY_TYP) in this group   
Remark: T3.SNAP_BATY_TYP =  T1. BATY_TYP  
        T3.IDE_PORTF     =  SELECT T3.IDE_PORTF FROM T3 WHERE T3.PTFL_IDE = T1.PTFL_IDE     
Thanks in advance for your help          
 
 
March     23, 2006 - 10:54 am UTC 
 
sorry, no help.
no create table.
no insert into.
no real explanation of the goal.  the remark doesn't make sense (don't know what you are trying to remark), the "i would like to get" isn't very clear either. 
 
 
 
Simple Analytical Query
A reader, March     24, 2006 - 2:31 am UTC
 
 
Hi Tom,
I have finally got a chance to make use of an analytic query but I am not able to get the result I need. Following is my table data:-
ITEM          MYORDERS  QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
ABC                 5         50          10
ABC                45         50          10
ABC                25         50          10
DEF                30         60          10
DEF                40         60          10
DEF                30         50          10
DEF                30         60          10
DEF                40         60          10
XYZ                20         80          10
XYZ                10         80          10
XYZ                10         80          10
ITEM          MYORDERS  QTYONHAND SAFETYSTOCK
---------- ---------- ---------- -----------
XYZ                20         80          10
Now I want the Output to be as :-
Item     MYORDERS   qtyonhand  safetystock  tomake  --> (qtyonhand-safetystock-sum(MYORDERS))
====  =========  =======   ============  ========
ABC           5       50             10         0   (haven't depleted stock)
ABC          45       50             10        10   -1 x (50-10-5-45)
ABC          25       50             10        35   -1 x (50-10-5-45-25)
so on........
I tried the following query but it doesn't give me a running total for the MYORDERS column:-
SQL> select item,MYORDERS,sum(MYORDERS) over (partition by item) tomake from test;
ITEM          MYORDERS    TOMAKE
---------- ---------- ----------
ABC                 5         75
ABC                45         75
ABC                25         75
DEF                30        170
DEF                40        170
DEF                30        170
DEF                30        170
DEF                40        170
XYZ                20         60
XYZ                10         60
XYZ                10         60
ITEM          MYORDERS    TOMAKE
---------- ---------- ----------
XYZ                20         60
12 rows selected.
If I get a running total in TOMAKE then I can just subtract it from the value of SAFETYSTOCK for that particular row to get my desired Output.
I know I am missing the order by clause; I tried using the other 2 columns (QTYONHAND, SAFETYSTOCK) and did not get the desired output. Can you please help me out?
--Create Table Script
CREATE TABLE TEST
(
  ITEM         VARCHAR2(10),
  MYORDERS     NUMBER,
  QTYONHAND    NUMBER,
  SAFETYSTOCK  NUMBER
)
--Insert Statements
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 5, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 45, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'ABC', 25, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 40, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 40, 60, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'DEF', 30, 50, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 10, 80, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 20, 80, 10);
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 10, 80, 10); 
INSERT INTO TEST ( ITEM, MYORDERS, QTYONHAND, SAFETYSTOCK ) VALUES ( 
'XYZ', 20, 80, 10);  
COMMIT;
Thanks 
 
 
March     24, 2006 - 9:49 am UTC 
 
you seem to be missing something to sort by?  what orders this data - a running total implies "SORT BY SOMETHING" 
 
 
 
Simple Analytical Query
A reader, March     24, 2006 - 12:16 pm UTC
 
 
Any one of the other 2 columns can be used to sort the data.
Thanks 
 
March     24, 2006 - 3:34 pm UTC 
 
so, order by them.  in the over () clause. 
 
 
 
Simple Analytical Query  
A reader, March     24, 2006 - 3:50 pm UTC
 
 
Hi Tom
I think the query below will work fine for me; I have ordered the data by rowid:-
SQL> select item,toorder,sum(toorder) over (partition by item order by rowid) run_tot from test;
ITEM          TOORDER    RUN_TOT
---------- ---------- ----------
ABC                25         25
ABC                45         70
ABC                 5         75
DEF                30         30
DEF                40         70
DEF                30        100
DEF                40        140
DEF                30        170
XYZ                10         10
XYZ                20         30
XYZ                10         40
ITEM          TOORDER    RUN_TOT
---------- ---------- ----------
XYZ                20         60
12 rows selected.
Thanks for all your help 
 
 
March     24, 2006 - 4:16 pm UTC 
 
as long as you don't care that it gives different answers for the same data on different databases, sure.
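To make the running total deterministic, the table needs a real ordering column; rowid order can change on export/import or row movement. A sketch assuming a hypothetical line_no column that records insert order (not part of the TEST table as posted), and clamping at zero to match the "haven't depleted stock" row in the desired output:

```sql
-- Sketch: deterministic running total per item, ordered by a hypothetical
-- line_no column. tomake = running orders minus available stock
-- (qtyonhand - safetystock), floored at 0 while stock lasts.
SELECT item, myorders, qtyonhand, safetystock,
       GREATEST(
         SUM(myorders) OVER (PARTITION BY item ORDER BY line_no)
           - (qtyonhand - safetystock),
         0) tomake
  FROM test;
```

For the ABC rows (5, 45, 25 against 50 on hand, 10 safety stock) this yields 0, 10, 35, matching the desired output earlier in the question.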
 
 
 
 
Full Join of inline views with analytics
Anwar, March     25, 2006 - 8:22 am UTC
 
 
I have a table with date column. I want to display dates from two different months in two columns. I want dates from both months in same rows and not alternate ones.
create table test
(tdate date);
insert into test values('01-jan-2006');
insert into test values('05-jan-2006');
insert into test values('15-jan-2006');
insert into test values('02-feb-2006');
insert into test values('07-feb-2006');
insert into test values('25-feb-2006');
Then I issue the following command to retrieve data.
  1  select jan06,feb06
  2  from
  3     (select tdate jan06,
  4     row_number() over (order by tdate) rn
  5     from test
  6     where tdate between '01-jan-2006' and '31-jan-2006') j,
  7     (select tdate feb06,
  8     row_number() over (order by tdate) rn
  9     from test
 10     where tdate between '01-feb-2006' and '28-feb-2006') f
 11* where j.rn=f.rn
SQL> /
JAN06     FEB06
--------- ---------
01-JAN-06 02-FEB-06
05-JAN-06 07-FEB-06
15-JAN-06 25-FEB-06
So far so good. But then I insert another record into the table.
insert into test values('27-feb-2006');
Now the number of records for February is more than that of January. I use a full outer join and get this unexpected result.
  1  select j.rn,jan06,f.rn,feb06
  2  from
  3     (select tdate jan06,
  4     row_number() over (order by tdate) rn
  5     from test
  6     where tdate between '01-jan-2006' and '31-jan-2006') j
  7  full join
  8     (select tdate feb06,
  9     row_number() over (order by tdate) rn
 10     from test
 11     where tdate between '01-feb-2006' and '28-feb-2006') f
 12* on j.rn=f.rn
SQL> /
        RN JAN06             RN FEB06
---------- --------- ---------- ---------
         1 01-JAN-06          1 02-FEB-06
         2 05-JAN-06          2 07-FEB-06
         3 15-JAN-06          3 25-FEB-06
                                27-FEB-06
                                25-FEB-06
                                07-FEB-06
                                02-FEB-06
7 rows selected.
I create one table for each of the two months.
create table j_test
as select tdate JAN06,row_number() over (order by tdate) rn
from test
where tdate between '01-jan-2006' and '31-jan-2006'
/
create table f_test
as select tdate FEB06,row_number() over (order by tdate) rn
from test
where tdate between '01-feb-2006' and '28-feb-2006'
/
SQL> select j.rn,jan06,f.rn,feb06
  2  from j_test j full join f_test f
  3  on j.rn=f.rn
  4  /
        RN JAN06             RN FEB06
---------- --------- ---------- ---------
         1 01-JAN-06          1 02-FEB-06
         2 05-JAN-06          2 07-FEB-06
         3 15-JAN-06          3 25-FEB-06
                              4 25-FEB-06
This result is what I was expecting with the inline views. Why is the behaviour different between analytics in inline views and in tables?
Regards
Anwar 
 
 
March     25, 2006 - 9:13 am UTC 
 
reproduces in 9ir2, 10gr1 but not 10gr2 - please contact support with this test case. 
 
 
 
Full Join of inline views with analytics  - follow-up
Michel Cadot, March     25, 2006 - 10:27 am UTC
 
 
I don't get exactly the same thing on 9.2.0.6 but still get a wrong answer (btw, the answer shown for the last query is not correct, maybe the input rows are not the same, and I wonder why f.rn is not filled in the previous query's answer):
SQL> select j.rn,jan06,f.rn,feb06
  2  from
  3     (select tdate jan06,
  4     row_number() over (order by tdate) rn
  5     from test
  6     where tdate between '01-jan-2006' and '31-jan-2006') j
  7  full join
  8     (select tdate feb06,
  9     row_number() over (order by tdate) rn
 10     from test
 11     where tdate between '01-feb-2006' and '28-feb-2006') f
 12  on j.rn=f.rn
 13  /
        RN JAN06               RN FEB06
---------- ----------- ---------- -----------
         1 01-jan-2006          1 02-feb-2006
         2 05-jan-2006          2 07-feb-2006
         3 15-jan-2006          3 25-feb-2006
                                3 25-feb-2006
                                4 27-feb-2006
5 rows selected.
But this does not happen if you use the subquery factoring clause:
SQL> with
  2    j_test as 
  3     (select tdate jan06,
  4     row_number() over (order by tdate) rn
  5     from test
  6     where tdate between '01-jan-2006' and '31-jan-2006'),
  7   f_test as 
  8     (select tdate feb06,
  9     row_number() over (order by tdate) rn
 10     from test
 11     where tdate between '01-feb-2006' and '28-feb-2006') 
 12  select j.rn,jan06,f.rn,feb06
 13  from j_test j full join f_test f
 14  on j.rn=f.rn
 15  /
        RN JAN06               RN FEB06
---------- ----------- ---------- -----------
         1 01-jan-2006          1 02-feb-2006
         2 05-jan-2006          2 07-feb-2006
         3 15-jan-2006          3 25-feb-2006
                                4 27-feb-2006
4 rows selected.
Regards
Michel
 
 
 
 
But is it possible to divide x  days over a week using analytics?
Martijn, April     13, 2006 - 9:39 am UTC
 
 
Hello Tom,
First of all: Thanks for your great work and site!
I wonder if it's possible to do the following in SQL, using rocking and rolling analytics:
Each day of the week a query has to select a balanced set of ids from a table, for example the all_objects table.
Connected to Oracle9i Enterprise Edition Release 9.2.0.5.0 
SQL> select count(*) from all_objects;
  COUNT(*)
----------
     26652
SQL> select round(count(*)/7) from all_objects;
ROUND(COUNT(*)/7)
-----------------
             3807
So each day about 3807 records should be selected.
SQL> SELECT distinct(MOD(t.object_id, 7))+1                     day_of_week
  2  ,      count(*) over (PARTITION BY (MOD(t.object_id,  7))) nof_rec_by_objid
  3  FROM   all_objects t
  4  ;
DAY_OF_WEEK NOF_REC_BY_OBJID
----------- ----------------
          1             3810
          2             3820
          3             3808
          4             3811
          5             3807
          6             3791
          7             3805
7 rows selected
Now the data is more or less evenly distributed per day over a week. Not much of a problem BUT:
By using a parameter we want to be able to do the following:
Imagine that the number of object_id-records queried each day represents a set of persons an email will be sent to. I want to be able to send the same person 'x' emails per week, equally spread over the week. So if  today is day 1 of the week and the parameter value = 3 (emails per week) then I would like to get this result:
DAY_OF_WEEK NOF_REC_BY_OBJID
----------- ----------------
          1             3810      <-mailrun 1
          2             3820
          3             3808      <-mailrun 2
          4             3811
          5             3807      <-mailrun 3 
          6             3791                   
          7             3805
So:
if parameter = 3 and day = 1 then
Select a group of numbers: 1, 3, 5
All persons with mod(person_uid, 7)+1 in (1, 3, 5) need to be selected on day 1, 3 and 5 of the week.
if parameter = 3 and day of week = 2 then
Select a group of numbers: 2, 4, 6 
All persons with mod(person_uid, 7)+1 in (2, 4, 6) need to be selected on day 2, 4 and 6 of the week.
Parameter = 3, day of week  = 3
Select a group of numbers: 3, 5, 7 
All persons with mod(person_uid, 7)+1 in ( 3, 5, 7) need to be selected on day 3, 5 and 7 of the week.
Etc. (for all days in 1..7).
If the parameter is set to 2 then the query should get the following:
Parameter = 2, day of week = 1
Select a group of numbers: 1, 5 
All persons with mod(person_uid, 7)+1 in (1, 5) need to be selected on days 1 and 5 of the week.
Etc. for all days of the week.
This is the current version of the daily selection:
SELECT ao.object_id
FROM   all_objects ao
WHERE  mod(ao.object_id, 7)+1 IN 
( to_number(to_char(sysdate, 'D'))
, to_number(to_char(sysdate, 'D'))+3
, to_number(to_char(sysdate, 'D'))-4);
I would like to get rid of the hardcoded '+3, -4' lines and replace them with a more flexible approach. Can analytics be of help here, or is the only solution a PL/SQL function? So the current query needs to be rewritten to:
SELECT ao.object_id
FROM   all_objects ao
WHERE  mod(ao.object_id, 7)+1 IN (select all available days of the week, evenly distributed over the week, based on the value of the (cursor) parameter );
Or
SELECT ao.object_id
FROM   all_objects ao
WHERE EXISTS  (select all available days of the week, evenly distributed over the week, based on the value of the (cursor) parameter where mod(ao.object_id, 7)+1  IN (the available days) );
Thanks in advance for any clue!
Regards,
Martijn
 
 
 
 
Continued from earlier post
Martijn, April     19, 2006 - 8:58 am UTC
 
 
With a PL/SQL function, the point where I get stuck is that if the value of the parameter is 3, I can't get the distribution right (in an elegant way, that is). The odd day numbers mess up the distribution.
I.e., this is what I get when I choose 3 mailings a week for groups to receive email:
Day     Mailto "mods" with no.
-------------------------------
1       1, 3, 5  
2       2, 4, 6 
3       3, 5, 7  (7 should be 1?)
4       4, 6, 2
5       5, 7, 1  (wrong too)
6       6, 2, 4
7       7, 1, 3  (wrong too)
Still, I'm sure this could be done in SQL with some formula and analytics, but I got lost finding the correct way to do so.
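One way to generate the day set in pure SQL, hedged as a sketch of a single spacing convention: since 7 is not evenly divisible by 3, some gap must be longer than the others (here the pattern is 2, 2, 3), and other conventions are equally defensible.

```sql
-- Sketch: p mailruns (:p) spread over the 7-day week starting at
-- :start_day. TRUNC((level-1)*7/:p) spaces the runs as evenly as 7/p
-- allows; for :p = 3, :start_day = 1 this produces days 1, 3, 5.
SELECT ao.object_id
  FROM all_objects ao
 WHERE MOD(ao.object_id, 7) + 1 IN
       (SELECT MOD((:start_day - 1) + TRUNC((level - 1) * 7 / :p), 7) + 1
          FROM dual
       CONNECT BY level <= :p);
```

This replaces the hardcoded '+3, -4' offsets: the CONNECT BY row generator produces the :p day numbers for any parameter value and any starting day.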
 
 
 
Analytical function
Amit, April     26, 2006 - 7:50 pm UTC
 
 
I have a result set like this
SELECT sku, price, pct_off, 
price * ((100 - pct_off) / 100) calculated_price
 FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
           FROM user_all_tables
          WHERE rownum <= 5);
      SKU     PRICE   PCT_OFF CALCULATED_PRICE
--------- --------- --------- ----------------
      101      1000        10              900
      101      1000        11              890 -> 801
      101      1000        12              880 -> 704.88
      101      1000        13              870 -> 613.25
      101      1000        14              860 -> 527.39
The calculated price is simply being computed here as the original price less pct_off percent.
What I would like to do is to calculate the price based on previous calculated value.
So for row2, I want the calculated price to be
900 - 11% off  = 801 
for row3, it should be
801 - 12% off = 704.88 and so on.
 
 
April     27, 2006 - 2:54 pm UTC 
 
no create
no insert intos
no example basically
no look, not promising anything, I just don't look at these until I have an example I can cut and paste.
read about LAG() - it will do this, you can figure this out - I'm sure you can (you will need an ORDER BY which is WOEFULLY missing from your example - rows have NO ORDER until you give it to them) 
 
 
 
Analytical function  
Amit, April     27, 2006 - 4:39 pm UTC
 
 
Tom, 
There are no inserts needed to run my query. You can directly run this statement. Please run this statement once and you'll understand what I am talking about.
SELECT sku, price, pct_off, 
       price * ((100 - pct_off) / 100) calculated_price
 FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
           FROM user_all_tables
          WHERE rownum <= 5);
I have read about Lag() function and tried to use it here but it does not give me the results I need.
SQL> ed
Wrote file afiedt.buf
  1  SELECT sku, price, pct_off, cal_price,
  2      lag(cal_price, 1, cal_price) over(ORDER BY 1) new_cal_price
  3    FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price
  4       FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
  5       FROM user_all_tables
  6*        WHERE rownum <= 5))
SQL> /
      SKU     PRICE   PCT_OFF CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- --------- -------------
      101      1000        10       900           900 -- Correct
      101      1000        11       890           900 -- Should be 801
      101      1000        12       880           890 -- Should be 704.88
      101      1000        13       870           880 -- Should be 613.25
      101      1000        14       860           870 -- Should be 527.39
I want to calculate the price based on previous calculated 
value.
For row2, I want the calculated price to be
900 - 11% off  = 801 
for row3, it should be
801 - 12% off = 704.88 and so on. 
 
 
April     27, 2006 - 4:49 pm UTC 
 
umm, without an order by - I'm afraid there is no "prior row" concept AT ALL.
Need real example, YOU NEED SOMETHING TO ACTUALLY SORT THE DATA BY
order by 1
that orders (in this case) by the CONSTANT NUMBER ONE, which means they are all equally "first" or "last" or "next".
order order in the court.  give us a real example to deal with here. 
 
 
 
Analytical question
A reader, April     27, 2006 - 4:50 pm UTC
 
 
I have tried this variation. It is closer to my desired result, but only up to row 2.
SQL> SELECT sku, price, pct_off, cal_price,
  2      lag(cal_price * ((100 - pct_off) / 100), 1, cal_price) over(ORDER BY 1) new_cal_price
  3    FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price
  4       FROM (SELECT 101 sku, 1000 price, 9 + rownum pct_off
  5       FROM user_all_tables
  6         WHERE rownum <= 5));
      SKU     PRICE   PCT_OFF CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- --------- -------------
      101      1000        10       900           900
      101      1000        11       890           810
      101      1000        12       880         792.1
      101      1000        13       870         774.4
      101      1000        14       860         756.9 
 
 
April     28, 2006 - 1:43 am UTC 
 
I give up.
These rows have NO ORDER WHATSOEVER, you need something to order by.  
You don't have anything in your example to tell me what "row 2" is.  Your rows could be anywhere - you could run this query tomorrow and your row 2 might become row 4 - or row 5000 and not even appear.
 
 
 
 
Analytical question.
Amit, April     27, 2006 - 5:12 pm UTC
 
 
OK, let me give you some real data with a column to order by.
SQL> CREATE TABLE A AS
  2   SELECT 101 sku, 1000 price, 9 + rownum pct_off, trunc(sysdate) + (2 * rownum) apply_dt
  3     FROM user_all_tables
  4   WHERE rownum <= 5;
Table created.
SQL> select * from a;
      SKU     PRICE   PCT_OFF APPLY_DT
--------- --------- --------- --------
      101      1000        10 04/29/06
      101      1000        11 05/01/06
      101      1000        12 05/03/06
      101      1000        13 05/05/06
      101      1000        14 05/07/06
SQL> SELECT sku, price, pct_off, apply_dt, cal_price,
  2      lag(cal_price * ((100 - pct_off) / 100), 1, cal_price) over(ORDER BY 1) new_cal_price
  3    FROM (SELECT sku, price, pct_off, price * ((100 - pct_off) / 100) cal_price, apply_dt
  4       FROM a
  5      ORDER BY apply_dt);
      SKU     PRICE   PCT_OFF APPLY_DT CAL_PRICE NEW_CAL_PRICE
--------- --------- --------- -------- --------- -------------
      101      1000        10 04/29/06       900           900
      101      1000        11 05/01/06       890           810
      101      1000        12 05/03/06       880         792.1
      101      1000        13 05/05/06       870         774.4
      101      1000        14 05/07/06       860         756.9
 
 
 
 
model is great - but 10G only
vlad, April     27, 2006 - 7:09 pm UTC
 
 
SQL> select sku, price, pct_off, new_price
  2    from (select aaa.*, rownum rn from aaa order by apply_dt)
  3     model
  4      return updated rows
  5     partition by(sku) dimension by(rn)
  6     measures(price, pct_off,0 new_price)
  7     rules(new_price [any] = nvl(new_price[cv()-1]/100*(100-pct_off[cv()]),price[cv()]/100*(100-pct_off[cv()])));
       SKU      PRICE    PCT_OFF  NEW_PRICE
---------- ---------- ---------- ----------
       101       1000         10        900
       101       1000         11        801
       101       1000         12     704.88
       101       1000         13   613.2456
       101       1000         14 527.391216
SQL>  
 
 
 
10G is cool, but all you need is to search Tom's archive carefully :-))
Vlad, April     27, 2006 - 10:19 pm UTC
 
 
Amit,
it took me some time to figure out that there is no need for all these cool MODEL features - just common sense and Tom's archive. You actually don't need the previous price value - it's enough to know the initial value and all of the factors for calculating new_price for the current row:
SQL> select ord, 1000 * exp(sum(ln((100 - pct) / 100)) over(order by ord)) new_price,
  2         price,
  3         pct
  4    from (select rownum ord, rownum + 9 pct, 1000 price from all_tables where rownum < 10);
       ORD  NEW_PRICE      PRICE        PCT
---------- ---------- ---------- ----------
         1        900       1000         10
         2        801       1000         11
         3     704.88       1000         12
         4   613.2456       1000         13
         5 527.391216       1000         14
         6 448.282533       1000         15
         7 376.557328       1000         16
         8 312.542582       1000         17
         9 256.284917       1000         18
9 rows selected
SQL> 
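For anyone wanting to verify the arithmetic: a running product can be rewritten as EXP of a running SUM of LN, which is exactly why the query above works (SQL has SUM() OVER but no PRODUCT() OVER). A quick sketch of the same numbers outside the database (Python, purely illustrative):

```python
import math

price = 1000
pcts = [10, 11, 12, 13, 14]   # pct per row, in ORD order

# new_price = price * exp(sum(ln((100 - pct)/100)) over (order by ord))
log_sum = 0.0
new_prices = []
for pct in pcts:
    log_sum += math.log((100 - pct) / 100)   # running SUM(LN(factor))
    new_prices.append(price * math.exp(log_sum))

# matches the query output: 900, 801, 704.88, 613.2456, 527.391216
```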
 
 
 
 
Thanks Vlad
Amit, April     27, 2006 - 11:06 pm UTC
 
 
Thanks Vlad. This is exactly what I needed.
I will also check out the MODEL clause in 10g. It looks very interesting.
 
 
 
Is it possible to implement with LAG?
A reader, April     28, 2006 - 1:20 pm UTC
 
 
Tom, just curious - how were you going to implement this with LAG function call? Doesn't look like this was the right suggestion (if possible at all?) 
 
April     28, 2006 - 1:32 pm UTC 
 
I did not really look at it until there was an example to go with it (said "not promising anything").  I don't look at them in detail until everything is good to go.
And by then, someone else had already done it :) 
 
 
 
Analytical question
Amit, April     28, 2006 - 1:43 pm UTC
 
 
I need some more help on the question I asked earlier.
The solution Vlad provided works very well if I need to do percent off in all the rows.
I need to break this pattern based on conditions and then continue again.
create table AA
(
  BATCH_ID       NUMBER(10) not null,
  APPLY_DT       DATE not null,
  SKU_NO         NUMBER(9) not null,
  SITE_NO        NUMBER(4) not null,
  PCT           NUMBER(5,2) not null,
  PRICE          NUMBER(10,2) not null,
  QTY          NUMBER(4) not null,
  UNIT_PRICE      NUMBER
);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6282, to_date('06-09-2006', 'dd-mm-yyyy'), 10007, 999, 12, 0, 0, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6283, to_date('15-09-2006', 'dd-mm-yyyy'), 10007, 999, 5, 0, 0, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6287, to_date('16-09-2006', 'dd-mm-yyyy'), 10007, 999, 0, 12, 1, 8.55);
insert into AA (BATCH_ID, APPLY_DT, SKU_NO, SITE_NO, PCT, PRICE, QTY, UNIT_PRICE)
values (6286, to_date('22-09-2006', 'dd-mm-yyyy'), 10007, 999, 10, 0, 0, 8.55);
commit;
SQL> SELECT apply_dt, pct, price, qty, unit_price,
  2      round(CASE
  3       WHEN pct > 0 THEN
  4        unit_price * exp(SUM(ln((100 - pct) / 100)) over(ORDER BY apply_dt))
  5       ELSE
  6        price / qty
  7      END, 2) new_price
  8    FROM aa;
APPLY_DT       PCT     PRICE       QTY UNIT_PRICE NEW_PRICE
-------- --------- --------- --------- ---------- ---------
09/06/06        12         0         0       8.55      7.52 -> 12% off 8.55
09/15/06         5         0         0       8.55      7.15 -> 5% off 7.52
09/16/06         0        12         1       8.55        12 -> pct = 0, so price/qty
09/22/06        10         0         0       8.55      6.43 -> 10% off 7.15
In the example above, row 4 is the problem.
In row 3 I do not want to apply the percent off, because pct = 0 for that row; I want to use price/qty instead. I am able to do that using the CASE expression.
Now, for row 4, I want 10% off the row 3 value: 10% off 12 should give 10.8 - not 10% off 7.15, which is 6.43.
 
 
 
added some more rows to your table...
Vlad, April     28, 2006 - 5:18 pm UTC
 
 
SQL> select zzz.*,
  2         first_value(unit_price) over(partition by part order by apply_dt) *
  3         exp(SUM(ln((100 - pct) / 100)) over(partition by part ORDER BY apply_dt)) new_price
  4    from (select zz.*,
  5                 nvl((sum(rn) over(order by apply_dt desc) - nvl(rn, 0)),0) part
  6            from (select aa.batch_id,
  7                         aa.apply_dt,
  8                         aa.sku_no,
  9                         aa.site_no,
 10                         aa.pct,
 11                         aa.price,
 12                         aa.qty,
 13                         decode(aa.pct, 0, aa.price / aa.qty, aa.unit_price) unit_price,
 14                         z.rn
 15                    from aa,
 16                         (select rowid rd, rownum rn
 17                            from aa
 18                           where pct = 0
 19                           order by apply_dt) z
 20                   where aa.rowid = z.rd(+)
 21                   order by apply_dt) zz) zzz
 22   order by apply_dt;
   BATCH_ID APPLY_DT        SKU_NO SITE_NO     PCT        PRICE   QTY UNIT_PRICE         RN       PART  NEW_PRICE
----------- ----------- ---------- ------- ------- ------------ ----- ---------- ---------- ---------- ----------
       6282 9/6/2006         10007     999   12.00         0.00     0       8.55                     3      7.524
       6283 9/15/2006        10007     999    5.00         0.00     0       8.55                     3     7.1478
       6287 9/16/2006        10007     999    0.00        12.00     1         12          1          2         12
       6286 9/22/2006        10007     999   10.00         0.00     0       8.55                     2       10.8
       6282 9/6/2007         10007     999   12.00         0.00     0       8.55                     2      9.504
       6283 9/15/2007        10007     999    5.00         0.00     0       8.55                     2     9.0288
       6287 9/16/2007        10007     999    0.00        12.00     1         12          2          0         12
       6286 9/22/2007        10007     999   10.00         0.00     0       8.55                     0       10.8
8 rows selected
SQL>  
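The idea behind Vlad's query: each pct = 0 row starts a new partition, re-basing the running product at price/qty, and later discounts compound from that new base. A procedural sketch of the same rule (Python, using the four original AA rows as hypothetical literal data), just to confirm the expected 10.8:

```python
# (pct, price, qty, unit_price) in APPLY_DT order -- the four AA rows
rows = [(12, 0, 0, 8.55),
        (5,  0, 0, 8.55),
        (0, 12, 1, 8.55),
        (10, 0, 0, 8.55)]

new_prices = []
base = None
for pct, price, qty, unit_price in rows:
    if pct == 0:
        base = price / qty                  # re-base: pct = 0 row uses the list price
    else:
        if base is None:
            base = unit_price               # first partition starts at unit_price
        base = base * (100 - pct) / 100     # compound the discount
    new_prices.append(round(base, 4))
```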
 
 
 
Excellent Vlad
Amit, April     28, 2006 - 9:08 pm UTC
 
 
This works very well. Excellent work Vlad. Thanks
Tom, can this be done in any other way ? 
 
 
Is it possible with Analytics function ?
Jonty, May       01, 2006 - 2:25 pm UTC
 
 
Hello Tom,
This will help you to create table and populate data for my problem.
CREATE TABLE T_Order
    (Item                           VARCHAR2(20),
    Order_Date                     DATE NOT NULL,
    Order_No                       NUMBER(10,0))
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','1-JAN-2006',9001)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','2-JAN-2006',9002)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','8-JAN-2006',9003)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','16-JAN-2006',9004)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('101','18-JAN-2006',9005)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','1-JAN-2006',9006)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','5-JAN-2006',9007)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','7-JAN-2006',9008)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('102','20-JAN-2006',9009)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','5-JAN-2006',9010)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','7-JAN-2006',9011)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('103','9-JAN-2006',9012)
/
SELECT a.item, a.order_date, a.order_no
  FROM t_order a
/
I have following data in my table.
Row#      Item    Order_Date      Order_No
----      -----   -----------     ---------
1         101     1-Jan-2006      9001
2         101     2-Jan-2006      9002
3         101     8-Jan-2006      9003
4         101     16-Jan-2006     9004
5         101     18-Jan-2006     9005
6         102     1-Jan-2006      9006
7         102     5-Jan-2006      9007
8         102     7-Jan-2006      9008
9         102     20-Jan-2006     9009
10        103     5-Jan-2006      9010
11        103     7-Jan-2006      9011
12        103     9-Jan-2006      9012
I need to find out which are the latest group of Orders for each Item. 
A group of orders: any 2 orders whose order_dates are within a 1-week gap of each other.
In above example, 
For 
Item 101 -  Row # 4, 5 will be group # 1
            Row # 1, 2, 3 will be group # 2
Item 102 -  Row # 9 will be group # 1
            Row # 6, 7, 8 will be group # 2
Item 103 -  Row # 10, 11, 12 will be group # 1
I am only interested in group # 1. Query results should be like this
Item #        Order No
--------        ------------
101        9004
101        9005
102        9009
103        9010
103        9011
103        9012
I really appreciate your help on this. 
 
May       02, 2006 - 3:21 am UTC 
 
a tad "ambiguous" - 
do you mean within the last week of the LAST RECORD (eg: for item 101 - any date between 18-Jan-2006 and 12-Jan-2006)
OR
any set of records such that the date of the current record is within one week of the prior record (eg: if item 101 had observations:
18-jan, 15-jan, 12-jan, 9-jan, 6-jan, 3-jan
they would all "count")
we can do either, here is the "first" one:
ops$tkyte@ORA10GR2> select *
  2    from (
  3  SELECT a.item, a.order_date, a.order_no,
  4         max(order_date) over (partition by item) max_order_date
  5    FROM t_order a
  6         )
  7   where max_order_date - order_date <= 7
  8  /
ITEM                 ORDER_DAT   ORDER_NO MAX_ORDER
-------------------- --------- ---------- ---------
101                  16-JAN-06       9004 18-JAN-06
                     18-JAN-06       9005 18-JAN-06
102                  20-JAN-06       9009 20-JAN-06
103                  05-JAN-06       9010 09-JAN-06
                     07-JAN-06       9011 09-JAN-06
                     09-JAN-06       9012 09-JAN-06
6 rows selected.
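The first interpretation is just a per-item filter against MAX(order_date). The same rule restated procedurally (Python, item 101 only, just to show the logic):

```python
from datetime import date

# item 101's order dates from the test data
dates = [date(2006, 1, 1), date(2006, 1, 2), date(2006, 1, 8),
         date(2006, 1, 16), date(2006, 1, 18)]

# keep the rows where max(order_date) - order_date <= 7 days
mx = max(dates)
kept = [d for d in dates if (mx - d).days <= 7]
```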
 
 
 
 
 
2nd Option
Jonty, May       02, 2006 - 10:11 am UTC
 
 
Thanks a lot for quick response.
Yes, I am looking for this option answer.
"any set of records such that the date of the current record is within one week 
of the prior record (eg: if item 101 had observations:
18-jan, 15-jan, 12-jan, 9-jan, 6-jan, 3-jan
they would all "count")"
Here is the Insert St. for an example.
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','01-JAN-2006',9091)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','02-JAN-2006',9094)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','08-JAN-2006',9092)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','14-JAN-2006',9093)
/
INSERT INTO t_order
(ITEM,ORDER_DATE,ORDER_NO)
VALUES
('104','18-JAN-2006',9095)
/
ITEM                             ORDER_DATE   ORDER_NO
-------------------- ---------------------- ----------
104                              1-Jan-2006       9091 
104                              2-Jan-2006       9094 
104                             18-Jan-2006       9092 
104                             24-Jan-2006       9093 
104                             28-Jan-2006       9095 
Results should bring 18-jan, 24-jan, 28-jan records.
Thanks in advance.
 
 
May       02, 2006 - 3:47 pm UTC 
 
arg - why do selects NOT match "supplied data"
inserts <> select....
oh well, work with this:
ops$tkyte@ORA10GR2> select * from t_order;
ITEM                 ORDER_DAT   ORDER_NO
-------------------- --------- ----------
101                  01-JAN-06       9001
101                  02-JAN-06       9002
101                  08-JAN-06       9003
101                  16-JAN-06       9004
101                  18-JAN-06       9005
102                  01-JAN-06       9006
102                  05-JAN-06       9007
102                  07-JAN-06       9008
102                  20-JAN-06       9009
103                  05-JAN-06       9010
103                  07-JAN-06       9011
103                  09-JAN-06       9012
104                  01-JAN-06       9091
104                  02-JAN-06       9094
104                  08-JAN-06       9092
104                  14-JAN-06       9093
104                  18-JAN-06       9095
17 rows selected.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
  2    from (
  3  select item, order_date, order_no,
  4         max(grp) over (partition by item order by order_date DESC) maxgrp
  5    from (
  6  select item, order_date, order_no,
  7         case when (lag(order_date) over (partition by item order by order_date DESC)-order_date)>7
  8                  then row_number() over (partition by item order by order_date DESC)
  9              when lag(order_date) over (partition by item order by order_date DESC) is null then 1
 10             end grp
 11    from t_order
 12         )
 13             )
 14   where maxgrp = 1
 15   order by item, order_date
 16  /
ITEM                 ORDER_DAT   ORDER_NO     MAXGRP
-------------------- --------- ---------- ----------
101                  16-JAN-06       9004          1
101                  18-JAN-06       9005          1
102                  20-JAN-06       9009          1
103                  05-JAN-06       9010          1
103                  07-JAN-06       9011          1
103                  09-JAN-06       9012          1
104                  01-JAN-06       9091          1
104                  02-JAN-06       9094          1
104                  08-JAN-06       9092          1
104                  14-JAN-06       9093          1
104                  18-JAN-06       9095          1
11 rows selected.
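The grp/maxgrp query effectively scans each item's orders from newest to oldest and stops at the first gap larger than 7 days; everything before the break is "group 1". A sketch of that walk (Python, using the combined test data), just to restate the rule:

```python
from datetime import date

def latest_group(dates):
    """Latest run of orders: walk from the newest order backwards and
    stop at the first gap of more than 7 days (the maxgrp = 1 set)."""
    ds = sorted(dates, reverse=True)
    group = [ds[0]]
    for prev, cur in zip(ds, ds[1:]):
        if (prev - cur).days > 7:
            break
        group.append(cur)
    return sorted(group)

# order dates per item, from the combined test data
orders = {
    '101': [date(2006, 1, d) for d in (1, 2, 8, 16, 18)],
    '102': [date(2006, 1, d) for d in (1, 5, 7, 20)],
    '103': [date(2006, 1, d) for d in (5, 7, 9)],
    '104': [date(2006, 1, d) for d in (1, 2, 8, 14, 18)],
}
```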
 
 
 
 
 
It works.
Jonty, May       03, 2006 - 10:19 am UTC
 
 
Thanks a lot.
It works well.
Where can I find detailed material about Analytic Functions ?
 
 
May       03, 2006 - 1:01 pm UTC 
 
 
 
Merge rows with first non-null value for each column
narendra, May       08, 2006 - 7:48 am UTC
 
 
Tom,
It sounds pretty simple to achieve when I put it in english. However, I am not able to write a SQL to achieve the same.
Following are the details:
select * from v$version ;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
PL/SQL Release 9.2.0.6.0 - Production
CORE    9.2.0.6.0       Production
TNS for Solaris: Version 9.2.0.6.0 - Production
NLSRTL Version 9.2.0.6.0 - Production
create table t_src (id number not null, name varchar2(10), st_id number, created_date date not null)
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,'2',to_date('16-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,null,to_date('08-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6','Row6','1',to_date('05-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7',null,null,to_date('07-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7','Row10','1',to_date('06-MAY-06','DD-MON-YY'))
/
select * from t_src order by id, created_date desc ;
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
         6                     2 16-05-2006 00:00:00
         6                       08-05-2006 00:00:00
         6 Row6                1 05-05-2006 00:00:00
         7                       07-05-2006 00:00:00
         7 Row10               1 06-05-2006 00:00:00
Desired Output:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
         6 Row6                2 16-05-2006 00:00:00
         7 Row10               1 07-05-2006 00:00:00
The logic behind desired output is:
Generate a row for each ID, with latest CREATED_DATE.
Fill columns NAME & ST_ID with first not null value in respective columns for the ID.
Can you please help me out ? 
 
May       08, 2006 - 8:31 am UTC 
 
...
Generate a row for each ID, with latest CREATED_DATE.
Fill columns NAME & ST_ID with first not null value in respective columns for 
the ID.
.....
your example does not do that.  Your ST_ID for row6 is for the latest row - not the first non-null value.
This is what I come up with:
ops$tkyte@ORA10GR2> select id, name, st_id, max_date
  2    from (
  3  select id, name, st_id, created_date,
  4         max(created_date) over (partition by id) max_date ,
  5             min(case when name is not null then created_date end) over (partition by id) min_date
  6    from t_src
  7         )
  8   where created_date = min_date
  9   order by id, created_date
 10  /
 
        ID NAME                                ST_ID MAX_DATE
---------- ------------------------------ ---------- ---------
         6 Row6                                    1 16-MAY-06
         7 Row10                                   1 07-MAY-06
 
 
 
 
 
Incomplete Test Data
Narendra, May       08, 2006 - 11:57 pm UTC
 
 
Sorry Tom,
My sample data did not cater for all conditions.
select * from t_src order by id, created_date desc ;
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
         6                     2 16-05-2006 00:00:00
         6 Row5                  08-05-2006 00:00:00
         6 Row6                1 05-05-2006 00:00:00
         7                       07-05-2006 00:00:00
         7 Row10               1 06-05-2006 00:00:00
         7 Row11               3 06-05-2006 00:00:00
Desired Output:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- -------------------
         6 Row5                2 16-05-2006 00:00:00
         7 Row10               1 07-05-2006 00:00:00
"Row5" in first row is due to
i) "Name" column for ID = 6 with max created date is null &
ii) For remaining rows having ID = 6, "Row5" is first not null value.
With same logic, row with ID = 7 has "Row10" value in NAME column & "1" in ST_ID column.
Hope I am clear. 
 
May       09, 2006 - 7:42 am UTC 
 
I didn't see any inserts here for me to use - did I miss something?
And your explanation is still not very clear.
I believe you are trying to say:
get me the unique ID's
for each unique ID show me:
  a) the max(created_date)
  b) the last non-null value of name when sorted by created_date
  c) the last non-null value of st_id when sorted by created_date
You don't seem to be explaining how the row with id=6 got st_id = 2. 
 
 
 
More details
Narendra, May       10, 2006 - 12:13 am UTC
 
 
Sorry Tom for missing out on inserts...
Following are details.
create table t_src (id number not null, name varchar2(10), st_id number, 
created_date date not null)
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,null,to_date('16-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6','asdf',null,to_date('10-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6','Row5',null,to_date('08-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('6',null,'6',to_date('05-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7',null,null,to_date('07-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7',null,'1',to_date('06-MAY-06','DD-MON-YY'))
/
Insert into T_SRC ("ID","NAME","ST_ID","CREATED_DATE") values ('7','Row11','3',to_date('04-MAY-06','DD-MON-YY'))
/
select * from t_src order by id, created_date desc
/
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- ------------
         6                       16-MAY-06
         6 asdf                  10-MAY-06
         6 Row5                  08-MAY-06
         6                     6 05-MAY-06
         7                       07-MAY-06
         7                     1 06-MAY-06
         7 Row11               3 04-MAY-06
Desired Output:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- ------------
         6 Row5                2 16-MAY-06
         7 Row11               1 07-MAY-06
Let me try explaining what I am trying to achieve.
1. For each unique ID, get the record with the max CREATED_DATE. So in the above example, I get the following 2 rows:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- ------------
         6                       16-MAY-06
         7                       07-MAY-06
2. For each record (i.e. unique ID) derived in step (1), fill the rest of the columns as follows:
   2.1 If the record with the max CREATED_DATE for a unique ID has non-null values in any of the other columns, keep those values as they are. So the results become:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- ------------
         6                     2 16-MAY-06
         7                       07-MAY-06
   2.2 For each column that is still null, take the FIRST NOT NULL value from the remaining rows for the same ID (in the output of "select * from t_src order by id, created_date desc"). So the results become:
        ID NAME            ST_ID CREATED_DATE
---------- ---------- ---------- ------------
         6 Row5                2 16-MAY-06
         7 Row11               1 07-MAY-06
Hope I am clear about the same. 
Awaiting your reply.
Regards
 
 
May       10, 2006 - 7:52 am UTC 
 
no, this is not clear - 2.2 is NOT clear at all.  Not really.
Actually, I don't see why there is a 2.1 and a 2.2 - it seems you could say it all at once.
Are the columns treated independently, so that what I stated:
get me the unique ID's
for each unique ID show me:
  a) the max(created_date)
  b) the last non-null value of name when sorted by created_date desc
  c) the last non-null value of st_id when sorted by created_date desc
is actually what you mean.  It seems so.  In which case - I disagree with your sample output.  Your input data:
ops$tkyte@ORA10GR2> select * from t_src order by id, created_date desc;
 
        ID NAME                                ST_ID CREATED_D
---------- ------------------------------ ---------- ---------
         6                                           16-MAY-06
         6 asdf                                      10-MAY-06
         6 Row5                                      08-MAY-06
         6                                         6 05-MAY-06
         7                                           07-MAY-06
         7                                         1 06-MAY-06
         7 Row11                                   3 04-MAY-06
 
7 rows selected.
So, I don't see ANY row with id=6 and st_id = 2.  I don't get where your example comes up with that.  Also, I see that when id=6, asdf is going to be the last non-null value for NAME when sorted by created_date desc.
Stating a requirement and having examples that correspond to what you type in - sort of important.
Ok, I'll guess my statement of the problem is close to what you really mean, then:
ops$tkyte@ORA10GR2> select id,
  2         max(created_date),
  3         substr( max( case when name is not null
  4                    then to_char(created_date,'yyyymmddhh24miss')||name
  5                       end ), 15 ) name,
  6         to_number( substr( max( case when st_id is not null
  7                     then to_char(created_date,'yyyymmddhh24miss')|| st_id
  8                                  end ), 15 ) ) st_id
  9    from t_src
 10   group by id;
 
        ID MAX(CREAT NAME                                ST_ID
---------- --------- ------------------------------ ----------
         6 16-MAY-06 asdf                                    6
         7 07-MAY-06 Row11                                   1
 
is what you were really looking for perhaps. 
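The date-prefix trick deserves a note: TO_CHAR(created_date,'yyyymmddhh24miss') is fixed width (14 characters), so a plain MAX over the concatenated strings is decided by the date first, and SUBSTR(..., 15) peels the prefix back off. A sketch of the same idea (Python; last_non_null is a hypothetical helper name):

```python
def last_non_null(rows):
    """rows: (date_str, value-or-None) pairs, date_str in 'yyyymmddhh24miss'.
    Mimics SUBSTR(MAX(CASE WHEN v IS NOT NULL THEN d || v END), 15):
    the fixed-width date prefix makes plain string MAX pick the value
    carried by the latest non-null row."""
    tagged = [d + str(v) for d, v in rows if v is not None]
    return max(tagged)[14:] if tagged else None   # SUBSTR(..., 15) is 1-based

# the id = 6 rows from the test case, as (created_date, value)
names  = [('20060516000000', None), ('20060510000000', 'asdf'),
          ('20060508000000', 'Row5'), ('20060505000000', None)]
st_ids = [('20060516000000', None), ('20060510000000', None),
          ('20060508000000', None), ('20060505000000', 6)]
```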
 
 
 
 
Results....bang on target
Narendra, May       11, 2006 - 12:10 am UTC
 
 
Tom,
Thanks a lot.
That is the exact query I am looking at.
However, just one question in the confusion created by my explanation.
You said:
get me the unique ID's
for each unique ID show me:
  a) the max(created_date)
  b) the last non-null value of name when sorted by created_date desc
  c) the last non-null value of st_id when sorted by created_date desc
With input data as:
        ID NAME                                ST_ID CREATED_D
---------- ------------------------------ ---------- ---------
         6                                           16-MAY-06
         6 asdf                                      10-MAY-06
         6 Row5                                      08-MAY-06
         6                                         6 05-MAY-06
         7                                           07-MAY-06
         7                                         1 06-MAY-06
         7 Row11                                   3 04-MAY-06
Now, "the last non-null value of name when sorted by created_date desc" for ID = 6 becomes "Row5" and not "asdf", Right?
Anyway, that is what I needed. "The first non-null value of name when sorted by created_date desc".
I think my interpretation of "first" and "last" created confusion. Sorry for the same.
Thanks once again.
( Will keep bothering you...:)
 
 
 
Should I use Analytics
David Piazza, May       25, 2006 - 11:08 pm UTC
 
 
Tom,
I have the following problem.  Table A has multiple rows per x value, while each x has just one row in Table B.  I need to update column y in Table B using the row with the latest date in column z, as long as that record is older than 16-MAY-06.  The columns are shown below, along with the way I'd like Table B to end up:
Table A (Test Schema)          Table B
x   y      z                  x       y    
--  --     ---------          --      -- 
1   10     14-MAY-06          1       10  
2   25     14_MAY-06          2       NULL 
2   25     14-MAY-06          2       NULL
2   2      15-MAY-06          3       11
3   25     10-MAY-06          4       24
3   3      18-MAY-06
3   1      15-MAY-06
4   42     14-APR-06
4   4      14-MAY-05
AFTER:
Table B
x     y
--    --
1     10
2     2
3     1
4     42
I have successfully updated one value for value x, but when I try them all using the following query, I don't get the results I need.  I've spent a lot of time trying different things but to no avail.  Can this be done without analytics, and how would you do it?
UPDATE B
  SET y=
   (SELECT *
    FROM (SELECT y
          FROM test.A,B
          WHERE test.A.x=B.x AND
                test.z < TO_DATE('16-MAY-2006','dd-mon-yy')
          ORDER by test.z DESC )
   WHERE rownum = 1);
 
 
May       26, 2006 - 8:36 am UTC 
 
I would start by setting up a create table and inserts so that when I asked someone to play around with it - they would have an out of the box test case to work with.
Either MERGE using an aggregate query, or a correlated subquery likely. 
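Whichever construct is used (MERGE or a correlated subquery), the rule itself is: for each x, take y from the A row with the latest z strictly before 16-MAY-06. A procedural sketch of that rule (Python, with the posted data typed in as hypothetical literals) reproduces the desired AFTER table:

```python
from datetime import date

# Table A rows as (x, y, z), transcribed from the post
a_rows = [(1, 10, date(2006, 5, 14)),
          (2, 25, date(2006, 5, 14)), (2, 2, date(2006, 5, 15)),
          (3, 25, date(2006, 5, 10)), (3, 3, date(2006, 5, 18)), (3, 1, date(2006, 5, 15)),
          (4, 42, date(2006, 4, 14)), (4, 4, date(2005, 5, 14))]
cutoff = date(2006, 5, 16)

# for each x, keep y from the row with the latest z before the cutoff
best = {}
for x, y, z in a_rows:
    if z < cutoff and (x not in best or z > best[x][0]):
        best[x] = (z, y)
new_b = {x: y for x, (z, y) in best.items()}
```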
 
 
 
counts between boundaries
A reader, June      01, 2006 - 5:25 am UTC
 
 
tom,
can you please show me a more elegant solution for my query below?
TEST @ adm1 SQL>create table emp as select * from scott.emp;
Table created.
Elapsed: 00:00:00.19
TEST @ adm1 SQL>insert into emp select * from scott.emp;
15 rows created.
Elapsed: 00:00:00.01
TEST @ adm1 SQL>insert into emp select * from scott.emp;
15 rows created.
Elapsed: 00:00:00.00
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
  1  select lb, rb, count(*) sal_between
  2  from
  3  (
  4  select b.l_bound lb, b.r_bound rb, s.sal ss
  5  from
  6  (
  7  select (rownum - 1) * 1000        l_bound
  8        ,(rownum - 1) * 1000 + 1000 r_bound
  9  from   all_objects
 10  where  rownum <= 20                -- max. l_bound / r_bound.
 11  ) b,
 12  (
 13  select sal
 14  from   emp
 15  where  deptno between 10 and 30    -- data from.
 16  ) s
 17  where s.sal between b.l_bound and b.r_bound
 18  )
 19  group by lb, rb
 20* order by lb, rb
 21  /
        LB         RB SAL_BETWEEN
---------- ---------- -----------
         0       1000           6
      1000       2000          18
      2000       3000          15
      3000       4000           6
      4000       5000           3
      5000       6000           3
6 rows selected.
Elapsed: 00:00:00.20 
 
 
June      01, 2006 - 10:36 am UTC 
 
it looks like you just want to "divide", doesn't it?  This is not 100% identical to yours (you double count things on the "boundary"; dividing won't).
ops$tkyte@ORA10GR2> select lb, rb, count(*) sal_between
  2  from
  3  (
  4  select b.l_bound lb, b.r_bound rb, s.sal ss
  5  from
  6  (
  7  select (rownum - 1) * 1000        l_bound
  8        ,(rownum - 1) * 1000 + 1000 r_bound
  9  from   all_objects
 10  where  rownum <= 20                -- max. l_bound / r_bound.
 11  ) b,
 12  (
 13  select sal
 14  from   emp
 15  where  deptno between 10 and 30    -- data from.
 16  ) s
 17  where s.sal between b.l_bound and b.r_bound
 18  )
 19  group by lb, rb
 20  order by lb, rb
 21  /
        LB         RB SAL_BETWEEN
---------- ---------- -----------
         0       1000           2
      1000       2000           6
      2000       3000           5  <<<==== counted 3000 here
      3000       4000           2  <<<====              and here
      4000       5000           1  <<<==== counted 5000 here
      5000       6000           1  <<<====              and here
6 rows selected.
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select minsal, maxsal, cnt, truncsal*1000 || ' <= sal < ' || (truncsal+1)*1000 rng
  2    from (
  3  select min(sal) minsal, max(sal) maxsal, count(*) cnt, trunc(sal/1000) truncsal
  4    from emp
  5   group by trunc(sal/1000)
  6         )
  7   order by truncsal
  8  /
    MINSAL     MAXSAL        CNT RNG
---------- ---------- ---------- --------------------
       800        950          2 0 <= sal < 1000
      1100       1600          6 1000 <= sal < 2000
      2450       2975          3 2000 <= sal < 3000
      3000       3000          2 3000 <= sal < 4000
      5000       5000          1 5000 <= sal < 6000
 
 
 
 
 
count between boundaries
A reader, June      01, 2006 - 6:29 am UTC
 
 
Oops, I made a mistake: overlapping partition boundaries.
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
  1  select lb, rb, count(*) sal_between
  2  from
  3  (
  4   select b.l_bound lb, b.r_bound rb, s.sal ss
  5   from
  6   (
  7     select (rownum - 1) * 1000 + 1    l_bound
  8           ,(rownum - 1) * 1000 + 1000 r_bound
  9     from   all_objects
 10     where  rownum <= 20                -- max. l_bound / r_bound.
 11   ) b,
 12   (
 13     select sal
 14     from   emp
 15     where  deptno between 10 and 30    -- data from.
 16   ) s
 17   where s.sal between b.l_bound and b.r_bound
 18  )
 19  group by lb, rb
 20* order by lb, rb
TEST @ adm1 SQL>/
        LB         RB SAL_BETWEEN
---------- ---------- -----------
         1       1000           6
      1001       2000          18
      2001       3000          15
      4001       5000           3
Elapsed: 00:00:00.07
 
 
June      01, 2006 - 10:43 am UTC 
 
see above - the same concept will work if you "tweak the math in the division" to get 3000/5000 to go into the group you think they should.
Beware of decimal numbers with your solution, though.
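The "tweak the math" suggested above is grouping on trunc((sal - 1)/1000) instead of trunc(sal/1000), so that exact multiples like 3000 and 5000 drop into the lower bucket. A minimal sketch (in Python, standing in for the SQL arithmetic) showing both the tweak and why decimals are a hazard: a value such as 1000.50 lands in the bucket labelled 1..1000 even though it is greater than 1000.

```python
import math

def bucket(sal):
    # trunc((sal - 1) / 1000): exact multiples of 1000 drop into the lower bucket
    return math.trunc((sal - 1) / 1000)

def label(b):
    # bucket b is labelled (b*1000 + 1) .. (b+1)*1000, matching the poster's l_bound/r_bound
    return (b * 1000 + 1, (b + 1) * 1000)

# integer salaries: 3000 and 5000 now go where the poster expects
print(label(bucket(3000)))     # (2001, 3000)
print(label(bucket(5000)))     # (4001, 5000)

# decimal salaries: 1000.50 falls into the 1..1000 bucket, outside its label
print(label(bucket(1000.50)))  # (1, 1000)  <- but 1000.50 is not between 1 and 1000
```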
 
 
 
count's between boundaries
A reader, June      01, 2006 - 7:03 am UTC
 
 
I tried the WIDTH_BUCKET function. Would this be the best way?
TEST @ adm1 SQL>edit
Wrote file afiedt.buf
  1  select lb, rb, count(*) rates_between
  2  from
  3  (
  4  select sal
  5        ,width_bucket (sal - 1, 0, 50000, 50) * 1000 - 999 lb
  6        ,width_bucket (sal - 1, 0, 50000, 50) * 1000       rb
  7  from emp
  8  )
  9  group by lb, rb
 10* order by lb, rb
TEST @ adm1 SQL>/
        LB         RB RATES_BETWEEN
---------- ---------- -------------
         1       1000             6
      1001       2000            18
      2001       3000            15
      4001       5000             3
                                  3
Elapsed: 00:00:00.01
 
 
June      01, 2006 - 10:43 am UTC 
 
I think you just want to "divide" 
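The "just divide" idea replaces the join against a generated boundary table with a single GROUP BY on TRUNC(sal/1000). A sketch of the same one-pass bucketing in Python, using the salary values assumed from the standard scott.emp demo table:

```python
from collections import Counter

# salaries assumed from the standard scott.emp demo table
sals = [800, 1600, 1250, 2975, 1250, 2850, 2450,
        3000, 5000, 1500, 1100, 950, 3000, 1300]

# one pass: bucket = sal // 1000 (the "divide"), no boundary table needed
hist = Counter(sal // 1000 for sal in sals)

for b in sorted(hist):
    print(f"{b*1000:>5} <= sal < {(b+1)*1000:>5}  count={hist[b]}")
```

Each bucket's count matches the GROUP BY TRUNC(sal/1000) output shown earlier in the thread.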
 
 
 
count's between boundaries
A reader, June      02, 2006 - 8:37 am UTC
 
 
Divide! Yes, that's it! I wish I could find such a simple, concise, and elegant solution on the first try.
What about your book on analytics?
TEST @ adm1 SQL>SELECT   minsal, maxsal, cnt,
  2              LPAD ((DECODE (truncsal * 1000, 0, 0, truncsal * 1000 + 1)),
  3                    4
  4                   )
  5           || ' <= sal <= '
  6           || ((truncsal + 1) * 1000) rng
  7      FROM (SELECT   MIN (sal) minsal, MAX (sal) maxsal, COUNT (*) cnt,
  8                     TRUNC ((sal - 1) / 1000) truncsal
  9                FROM emp
 10            GROUP BY TRUNC ((sal - 1) / 1000))
 11  ORDER BY truncsal
 12  /
    MINSAL     MAXSAL        CNT RNG
---------- ---------- ---------- ----------------------------------------
       800        950          6    0 <= sal <= 1000
      1100       1600         18 1001 <= sal <= 2000
      2450       3000         15 2001 <= sal <= 3000
      5000       5000          3 4001 <= sal <= 5000
                               3  <= sal <=
Elapsed: 00:00:00.02
 
 
 
TO_NUMBER PROBLEM
A reader, June      07, 2006 - 7:33 pm UTC
 
 
Hi,
I have an external table with a column defined as char(19).
When I do SELECT TO_NUMBER(col_num) FROM et_table
it comes back as 4.70793001110283E15.
I want the output as a plain number.
 
June      07, 2006 - 8:26 pm UTC 
 
that is a number - SQL*Plus is printing a string so you can read it, but it is in fact "a number"
you can
select to_char( to_number(col_num), '99999999999999999999.99999' ) from et_table;
but rest assured, it is a proper number and SQL*Plus is just trying to make it fit reasonably on the screen for us human beings to read.
 
 
 
Analytics or Not
Kandy Train, June      09, 2006 - 11:03 am UTC
 
 
Hi Tom,
Forgive me if this is not the proper thread to put this in.
I have never written an analytic query, and I came across this requirement and felt it couldn't be done with a normal ORDER BY.
Here is the example..
create table t1 (priority varchar2(20), planned_ship_date date, order_no varchar2(20));
insert into t1 values(1, sysdate + 5, 111111);
insert into t1 values(2, sysdate + 1, 333333);
insert into t1 values(3, sysdate + 1, 444444);
insert into t1 values(4, sysdate, 222222);
insert into t1 values(5, sysdate + 3, 555555);
--column planned_ship_date format a25
--select order_no, priority, planned_ship_date from t1;
I want the results to be sorted
1).  All the Priority = 1 rows should come first
2).  The rest should be ordered on the min(Planned_Ship_Date) --> earliest
3).  When there is a tie on the Planned_Ship_Date, priority = 2 rows should come first (if there are any).
The Order No column is there just to check whether we have ordered them correctly.
Can this be done??
Thank you so much for your help. 
 
June      09, 2006 - 1:24 pm UTC 
 
order by case when priority = 1 then 1 else 2 end,
         planned_ship_date,
         case when priority = 2 then 1 else 2 end
/
why is priority a varchar2(20)?  that is so wrong, hope that is not "real life"
same with order_no
never ever use a string to store a number, or a date.  never. 
 
 
 
LEAST of all dates
Mave, June      12, 2006 - 10:55 am UTC
 
 
Tom, I want to get Least of all the selected dates [Trying to get earliest date]
eg:
  select least(d1,d2,d3) from (select sysdate-3 d1,sysdate-2 d2,sysdate d3 from dual)
should work. But if one of them is null, I want the least of the other two. Nulls should not be included in the list for the LEAST function.
Thanks for all your help in advance.
 
 
June      13, 2006 - 10:04 am UTC 
 
nvl() them and have the nvl return a date far into the future (like the year 4000)
least( nvl(d1,to_date('4000','yyyy')), nvl(d2,...)
and if you think all three might be null, use decode or case to return NULL if the returned least value is 4000 
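The sentinel trick in Python terms: replace NULLs with a far-future date before taking the minimum, then map the sentinel back to NULL if every input was NULL. (The date 4000-01-01 is just the assumed sentinel, standing in for to_date('4000','yyyy'); any date beyond your data works.)

```python
from datetime import date

SENTINEL = date(4000, 1, 1)  # stands in for to_date('4000','yyyy')

def least_ignoring_nulls(*dates):
    # nvl() each argument to the sentinel, take the least,
    # then return None if the least is still the sentinel (all inputs were null)
    m = min(d if d is not None else SENTINEL for d in dates)
    return None if m == SENTINEL else m

print(least_ignoring_nulls(date(2006, 6, 10), None, date(2006, 6, 12)))  # 2006-06-10
print(least_ignoring_nulls(None, None, None))                            # None
```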
 
 
 
Works for me....
Mave, June      13, 2006 - 10:33 am UTC
 
 
Thanks Tom, that would work for me.
But I am curious - how does this to_date('4000','yyyy') work? I tried it and got 06/01/4000 as the date. Why did it show June 1st of 4000?
Thanks, 
 
June      13, 2006 - 12:28 pm UTC 
 
the default day and month for yyyy is "first day of the current month"
you can of course use 01014000 and a format of ddmmyyyy if you like  
 
 
 
If /how I can use analytics for this problem?
Ping, June      14, 2006 - 10:00 am UTC
 
 
I have a table with info like this:
c_id         a_id         reg_dt
10             1          1/1/2006
10             2          1/1/2006
10             3          1/1/2006
10             1          1/2/2006
10             2          1/2/2006
10             3          1/2/2006
10             1          1/3/2006
10             2          1/3/2006
10             3          1/3/2006
What I want to get is 3 rows:
c_id         a_id         reg_dt
10             1          1/1/2006
10             2          1/2/2006
10             3          1/3/2006
I was trying to use the lag/lead functions but didn't seem to be able to make it work. Any suggestions?
Thanks!
 
 
June      14, 2006 - 12:37 pm UTC 
 
fascinating.
a bunch of data and no description of what it means or why you got the three rows you did.
that and there is no create table, no inserts, nothing to play with.  
 
 
 
 
Query
Rambabu, June      15, 2006 - 7:10 am UTC
 
 
Hi Tom,
I have a table like this.
    
Pid    seq     value1  value2  
100    1    X    A
100    2    Y    B
101    1    M    O
101    2    N    P
I want the output in the following way. 
Pid    value1 value2  
100    X    B
101    M    P
For all Pids, I want value1 from the row which has "seq =1" and value2 from the row which has "seq =2".
I got the above result using the following query.
select t1.pid,t1.value1, t2.value2
from (select *
           from t tt
           where seq = (select min(seq) from t where pid = tt.pid)) t1,
     (select *
        from t tt
           where seq = (select max(seq) from t where pid = tt.pid)) t2
where t1.pid = t2.pid
Here I used two instances (t1, t2) of table t.
How can I get the above result using a single instance of table t and analytics?
create table t(
    Pid number(6),
    seq number(6),
    value1 varchar2(20),
    value2 varchar2(20));
    
insert into t values(100,1,'X','A');
insert into t values(100,2,'Y','B');
insert into t values(101,1,'M','O');
insert into t values(101,2,'N','P'); 
commit;
Thanks in advance,
Rambabu B
   
 
June      15, 2006 - 8:47 am UTC 
 
ops$tkyte@ORA10GR2> select pid,
  2         max( decode( seq, 1, value1 ) ) v1,
  3             max( decode( seq, 2, value2 ) ) v2
  4    from t
  5   group by pid;
       PID V1                   V2
---------- -------------------- --------------------
       100 X                    B
       101 M                    P
<b>sorry, analytics didn't fit into the answer</b>, they would not make sense here. 
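The same conditional-aggregation pivot can be sketched outside Oracle - here via Python's sqlite3, with the table and data copied from the question. DECODE is Oracle-only, so the portable CASE form is used; the aggregation is otherwise identical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t (pid integer, seq integer, value1 text, value2 text);
insert into t values (100, 1, 'X', 'A');
insert into t values (100, 2, 'Y', 'B');
insert into t values (101, 1, 'M', 'O');
insert into t values (101, 2, 'N', 'P');
""")

# max(case ...) collapses the two rows per pid into one, picking
# value1 from the seq=1 row and value2 from the seq=2 row
rows = con.execute("""
select pid,
       max(case when seq = 1 then value1 end) v1,
       max(case when seq = 2 then value2 end) v2
  from t
 group by pid
 order by pid
""").fetchall()

print(rows)  # [(100, 'X', 'B'), (101, 'M', 'P')]
```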
 
 
 
 
You can use analytics
Anwar, June      15, 2006 - 10:56 am UTC
 
 
Tom's answer seems better, more concise, but you can use analytics if you really want to.
SQL> SELECT pid,value1,v2
  2  FROM
  3  (
  4     SELECT pid,
  5             value1,
  6             lead(value2) over(partition by pid order by seq) v2,
  7             seq
  8     FROM t
  9  )
 10  WHERE seq=1
 11  /
       PID VALUE1               V2
---------- -------------------- --------------------
       100 X                    B
       101 M                    P 
 
 
 
Sorry, let me try...
A reader, June      15, 2006 - 11:55 am UTC
 
 
I think the table I showed earlier is coming from a cross join between two tables:
Table 1, stores acct_ids that are cancelled
customer_id       acct_id
10                  1
10                  2
10                  3
table 2, stores acct_ids that are currently in use
customer_id     acct_id    reg_dt
10                 11     1/1/2006
10                 12     1/2/2006
10                 13     1/3/2006
The two tables join on customer_id, so it got 9 rows. 
Customer_id 10 cancelled 3 of his acct_ids, and then registered 3 new acct_ids. What my client really needs is: after customer_id 10 cancelled acct_ids 1, 2, 3, when did he register acct_ids 11, 12, 13?
customer_id     acct_id_canceled acct_id_new    reg_dt
10                 1                  11     1/1/2006
10                 2                  12     1/2/2006
10                 3                  13     1/3/2006
Hope this is clear.
Thanks, Ping
 
 
June      16, 2006 - 5:51 pm UTC 
 
I have no idea what you are referring to.
and still, hmm, not a create, not an insert - no real detail... 
 
 
 
For create table and insert statements...
A reader, June      15, 2006 - 12:00 pm UTC
 
 
create table a
(c_id integer,a_id integer)
;
insert into a values (10, 1);
insert into a values (10, 2);
insert into a values (10, 3);
create table b
(c_id integer,a_id integer,reg_dt date);
insert into b values (10, 11, to_date('1/1/2006','MM/DD/YYYY'));
insert into b values (10, 12, to_date('1/2/2006','MM/DD/YYYY'));
insert into b values (10, 13, to_date('1/3/2006','MM/DD/YYYY'));
Thanks! 
 
 
SQL Query
Sankar Kumar, June      15, 2006 - 12:50 pm UTC
 
 
Hi Tom,
I have a table like this:
Personid   Change_Sequence Class     Cls_EffDate    Location   Loc_EffDate
----------------------------------------------------------------------------
1000            1          FullTime                 Hawaii     
1000            2          FullTime                 California  1/1/2005
1000            3          PartTime    1/1/2006     California  1/1/2005
1000            4          PartTime    1/1/2006     Texas       10/1/2005
1000            5          FullTime    1/1/2007     Boston      1/1/2007
1000            6          COBRA       1/1/2005     Boston      1/1/2007 
1000            7          Outside     5/1/2006     Boston      1/1/2007  
1. The primary key is (Personid, change_sequence)
2. The effective dates of the first row for each person is null (i.e. for change_sequence = 1)
3. For each row only one column value will be affected (i.e. either Class or Location can be changed) and the remaining data will be copied from the previous row 
4. Both Class and Location can be changed in a row, only if the effective date is same. [See where change_sequence = 5]
I am using the following query for getting Class and Location of the person as per given Effective Date. 
Select ECls.personid, ECls.class, ELoc.Location, ECls.change_sequence Cls_change_seq, ELoc.change_Sequence Loc_change_seq
  from (select * from (select personid, class, change_sequence,
                       row_number() over(partition by personid order by change_sequence desc) r
                  from employee
                 where nvl(Cls_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= to_date('&givenDate','MM/DD/YYYY')) Cls
         where Cls.r = 1) ECls,
       (Select * from (select personid, Location, change_sequence,
                       row_number() over(partition by personid order by change_sequence desc) r
                  from employee
                 where nvl(Loc_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= to_date('&givenDate','MM/DD/YYYY')) Loc
         where Loc.r = 1) ELoc
 where ECls.personid = ELoc.personid;
Now lets run query with three different dates
For 1/1/2005 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        COBRA-Class       California                  6                  3    
For 1/1/2006 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        COBRA-Class       Texas                       6                  4
For 1/1/2007 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        Outside           Boston                      7                  7
The table has more than 100,000 (1 lakh) rows and more than 12 columns based on effective dates, similar to CLASS and LOCATION (for example Dept, Dept_EffDate, Status, Status_EffDate, ...).
In the above query I created two inline views, ECls and ELoc. If I wanted the dept column too, I would need to create one more.
The goal is to get all the columns using fewer joins - if possible, using only one instance of the table.
If this is not possible, what would be a better design to handle this type of data?
create table employee 
(personid number(10),
change_sequence number(3),
class varchar2(50),
cls_effdate date,
location varchar2(50),
loc_effdate date,
primary key (personid,change_sequence));
insert into employee values(1000,1,'FullTime',null,'Hawaii',null);
insert into employee 
values(1000,2,'FullTime',null,'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee 
values(1000,3,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee 
values(1000,4,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'Texas',to_date('10/1/2005','MM/DD/YYYY'));
insert into employee 
values(1000,5,'FullTime',to_date('1/1/2007','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee 
values(1000,6,'COBRA-Class',to_date('1/1/2005','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
insert into employee 
values(1000,7,'Outside',to_date('5/1/2006','MM/DD/YYYY'),'Boston',to_date('1/1/2007','MM/DD/YYYY'));
Thanks in Advance
Sankar
 
 
 
 
A reader, June      16, 2006 - 8:01 am UTC
 
 
Dear Tom,
I have read chapter 12 (analytic functions) of your book and can't find the solution to my problem. Maybe you can help me.
CREATE TABLE T
(IDE            NUMBER,
 COMPTECLI      VARCHAR2(13),
 DAT_JOB        DATE,
 ACTION         VARCHAR2(255),
 FAG_SUCCESS    VARCHAR2(1));
insert into t values (115, 'XX12', TRUNC(SYSDATE),   '01', 'N');
insert into t values (120, 'YY14', TRUNC(SYSDATE),   '01', 'N');
insert into t values (145, 'ZZ25', TRUNC(SYSDATE),   '01', 'N');
insert into t values (255, 'XX12', TRUNC(SYSDATE+1), '01', 'N');
insert into t values (301, 'AB13', TRUNC(SYSDATE+1), '02', 'N');
commit;
select * from t
order by dat_job;
IDE  COMPTECLI  DAT_JOB      ACTION  FAG_SUCCESS
115    XX12     16/06/2006    01        N
120    YY14     16/06/2006    01        N
145    ZZ25     16/06/2006    01        N
301    AB13     17/06/2006    02        N
255    XX12     17/06/2006    01        N
What I would like to get is:
if the same COMPTECLI with the same ACTION and FAG_SUCCESS = 'N' occurs many times, then select only the comptecli with the "biggest" dat_job. In this example I would like to get this:
IDE  COMPTECLI  DAT_JOB      ACTION  FAG_SUCCESS
120    YY14     16/06/2006    01        N
145    ZZ25     16/06/2006    01        N
301    AB13     17/06/2006    02        N
255    XX12     17/06/2006    01        N
where the comptecli XX12 row dated 16/06/2006 has been ignored.
Could you please help me
Thanks in advance
 
 
June      16, 2006 - 7:13 pm UTC 
 
there actually are examples of your problem in the book... you just didn't recognize them :)
however, what about records where fag_success <> 'N' - what becomes of them?
are they treated differently?  does this only apply to 'N's?
answer is maybe this:
ops$tkyte@ORA10GR2> select * from t
  2  order by dat_job;
       IDE COMPTECLI     DAT_JOB   ACTIO F
---------- ------------- --------- ----- -
       115 XX12          16-JUN-06 01    N
       120 YY14          16-JUN-06 01    N
       145 ZZ25          16-JUN-06 01    N
       255 XX12          17-JUN-06 01    N
       301 AB13          17-JUN-06 02    N
ops$tkyte@ORA10GR2>
ops$tkyte@ORA10GR2> select *
  2    from (
  3  select ide, comptecli, dat_job, action, fag_success,
  4         max(case when fag_success = 'N' then dat_job end) over (partition by comptecli, action) max_date
  5    from t
  6         )
  7   where fag_success <> 'N'
  8      or (fag_success = 'N' and dat_job = max_date)
  9          or max_date is null;
       IDE COMPTECLI     DAT_JOB   ACTIO F MAX_DATE
---------- ------------- --------- ----- - ---------
       301 AB13          17-JUN-06 02    N 17-JUN-06
       255 XX12          17-JUN-06 01    N 17-JUN-06
       120 YY14          16-JUN-06 01    N 16-JUN-06
       145 ZZ25          16-JUN-06 01    N 16-JUN-06
but you didn't provide the details - and the border test cases! 
 
 
 
 
A reader, June      17, 2006 - 5:42 am UTC
 
 
Thanks very much Tom,
Only Fag_Success = 'N' are considered. Records with Fag_Success = 'Y' are deleted before the process starts.
Again Thanks a lot 
 
June      17, 2006 - 7:14 am UTC 
 
then mentioning 'fag-success' at all in the problem description was what we call a 'red herring', something to throw us off the trail :) 
 
 
 
distinct and average
riyaz, June      28, 2006 - 12:29 am UTC
 
 
The data from a select is:

XVAL            PO_NO              BILLNO             BILL VALUE 

000001          PO1000000          BL0000001          1000.00 
000001          PO1000001          BL0000002          2000.00 
000001          PO1000002          BL0000002          2000.00 
000001          PO1000003          BL0000003          1000.00 
000001          PO1000003          BL0000004          1000.00 

I need a report like:

XVAL            PO_count           BILL Count         Avg BILL VALUE 

000001          4                  4                  1250.00 
 
 
 
distict (count) and average - contd
riyaz, June      28, 2006 - 12:36 am UTC
 
 
For bill no BL0000002 the amount is repeated twice, so only one amount is to be taken,
i.e. average --> 5000/4 = 1250

XVAL            PO_NO              BILLNO             BILL VALUE 

000001          PO1000000          BL0000001          1000.00 
000001          PO1000001          BL0000002          2000.00 
000001          PO1000002          BL0000002          2000.00 
(value is repeating, not originally available in the table)
000001          PO1000003          BL0000003          1000.00 
000001          PO1000003          BL0000004          1000.00 

Please give us an analytic query. 
 
June      28, 2006 - 7:45 am UTC 
 
no creates, no inserts, no look 
 
 
 
Does it perform better?
A reader, July      05, 2006 - 11:54 am UTC
 
 
Tom, please take a look at this query. Basically I am trying to pull all the tables and their columns from all_tab_columns and want them listed as groups
eg:
   Table A    Column1
              Column2
              Column3
   Table B    Column1
              Column2
              Column3
I do not want the table name to repeat for each column [like BREAK in SQL*Plus]
I got it using this query:
select 
       case when lag(table_name) over (order by table_name) is null 
            then table_name
            when lag(table_name) over(order by table_name) = table_name 
            then '' 
            else table_name       
            end tab_name,
column_name
from all_tab_columns
Do you see any problem with this one, or should it be OK in terms of performance?
Thanks, 
 
July      08, 2006 - 8:11 am UTC 
 
I might have used row_number with a partition - but only because it "made more sense to me" that way...  You need to incorporate OWNER in there as well though, technically your query isn't quite right without it.
select decode( row_number() over (partition by owner, table_name order by column_name),
               1, owner || '.' || table_name ) tname,
       column_name
  from all_tab_columns
 order by owner, table_name, column_name
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     3637      4.36       4.26          0     390552          0       54535
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3639      4.37       4.27          0     390552          0       54535
select
       case when lag(table_name) over (order by table_name) is null
            then table_name
            when lag(table_name) over (order by table_name) = table_name
            then ''
            else table_name
            end tab_name,
       column_name
  from all_tab_columns
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     3637      4.51       4.42          0     390552          0       54535
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     3639      4.51       4.42          0     390552          0       54535
     
 
 
 
Items sold Together 
Reader, July      11, 2006 - 6:43 pm UTC
 
 
I was exploring whether you could help me write a SQL query to get the popular items sold together - something like when you go to some websites and they say that people who bought this item also bought that item.
E.g. if a customer buys a shirt, he always buys a pen.
THANKS!
 
 
July      12, 2006 - 3:28 pm UTC 
 
sort of hard without a table, isn't it? 
 
 
 
Can i use it as analytic func in ORACLE10 G
Mariana, July      13, 2006 - 5:17 pm UTC
 
 
I have a select 
SELECT SUPPLIER_NO,ACCOUNT_NO,LINE_NO,AMOUNT
FROM KM_SUP_PAYMENTS
ORDER BY SUPPLIER_NO
that gives such result:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
   123       555555    10001   34
   123       555555    20034   56
   123       555555    30034   12
   234       345555    34555   56  
   234       454555    45455   34
But I also need to see, after each group of suppliers, the total amount for that supplier, as shown below. I tried to do it with analytics but my solution was wrong. Can you help me with a solution to get this result:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
   123       555555    10001   34
   123       555555    20034   56
   123       555555    30034   12
   123                         102 
   234       345555    34555   56  
   234       454555    45455   34
   234                         90    
 
July      13, 2006 - 6:24 pm UTC 
 
scott@ORA10GR2> select deptno, ename, sum(sal)
  2  from emp
  3  group by grouping sets((deptno,ename),(deptno));
    DEPTNO ENAME        SUM(SAL)
---------- ---------- ----------
        10 KING             5000
        10 CLARK            2450
        10 MILLER           1300
        10                  8750
        20 FORD             3000
        20 ADAMS            1100
        20 JONES            2975
        20 SCOTT            3000
        20 SMITH             800
        20                 10875
        30 WARD             1250
        30 ALLEN            1600
        30 BLAKE            2850
        30 JAMES             950
        30 MARTIN           1250
        30 TURNER           1500
        30                  9400
17 rows selected.
 
 
 
 
Continue grouping sets in ORACLE 10G 
Marirana, July      14, 2006 - 12:12 am UTC
 
 
HI Tom,
Thanks for the answer, but in this example:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
   123       555555    10001   34
   123       555555    20034   56
   123       555555    30034   12
   123                         102 
   234       345555    34555   56  
   234       454555    45455   34
   234                         90    
the amount is not summed on every row; the sum appears only when supplier_no changes. Even if I have two identical rows, I have to display both, and show the sum only when supplier_no changes.
 
July      14, 2006 - 8:22 am UTC 
 
I don't see a create table and inserts for your example.
But using emp, we can do this again:
ops$tkyte@ORA10GR2> select supplier_no, account_no, line_no, sum(amount)
  2    from (
  3  select deptno supplier_no, job account_no, empno line_no, sal amount
  4    from emp
  5         )
  6   group by grouping sets((supplier_no,account_no,line_no),(supplier_no))
  7  /
SUPPLIER_NO ACCOUNT_N    LINE_NO SUM(AMOUNT)
----------- --------- ---------- -----------
         10 CLERK           7934        1300
         10 MANAGER         7782        2450
         10 PRESIDENT       7839        5000
         10                             8750
         20 CLERK           7369         800
         20 CLERK           7876        1100
         20 ANALYST         7788        3000
         20 ANALYST         7902        3000
         20 MANAGER         7566        2975
         20                            10875
         30 CLERK           7900         950
         30 MANAGER         7698        2850
         30 SALESMAN        7499        1600
         30 SALESMAN        7521        1250
         30 SALESMAN        7654        1250
         30 SALESMAN        7844        1500
         30                             9400
17 rows selected.
I don't see two *identical rows* in your example - do you?
if you pretend my 30 = your 123 and my salesman = your 555555 - there you go. 
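GROUPING SETS is Oracle syntax; on engines without it, the same detail-plus-subtotal report can be built with a UNION ALL of the detail rows and the per-supplier sums. A sketch in SQLite via Python's sqlite3, using the numbers from the question (an is_total flag sorts each subtotal after its supplier's detail rows):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table km_sup_payments (supplier_no int, account_no text, line_no int, amount int);
insert into km_sup_payments values (123, '555555', 10001, 34);
insert into km_sup_payments values (123, '555555', 20034, 56);
insert into km_sup_payments values (123, '555555', 30034, 12);
insert into km_sup_payments values (234, '345555', 34555, 56);
insert into km_sup_payments values (234, '454555', 45455, 34);
""")

# detail rows plus one subtotal row per supplier;
# is_total pushes each subtotal after its supplier's detail rows
rows = con.execute("""
select supplier_no, account_no, line_no, amount
  from (
    select supplier_no, account_no, line_no, amount, 0 as is_total
      from km_sup_payments
    union all
    select supplier_no, null, null, sum(amount), 1 as is_total
      from km_sup_payments
     group by supplier_no
  )
 order by supplier_no, is_total, line_no
""").fetchall()

for r in rows:
    print(r)
```

The subtotal rows come back as (123, None, None, 102) and (234, None, None, 90), matching the desired report.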
 
 
 
 
Can Analytic do this?
Steve Read, July      14, 2006 - 1:32 am UTC
 
 
Here is a simple self-contained query with no creates or inserts needed......
SELECT keynbr
      ,MAX(MAX(status)) OVER (PARTITION BY keynbr ORDER BY MAX(datetm) DESC)   status
 FROM -- should return 1, 4, X, X
     (          SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
      UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual 
      UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual 
      UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
      -- should return 2, 2, X, NULL
      UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual 
     )
 GROUP BY keynbr
What it returns now is the MAX(status), but what I actually want is the status associated with the most recent datetm.  I think this is probably very simple, but I can't figure out how to do it.  I read through this whole thread (took a long time) but I couldn't find the answer.
 
We are using 8.1.7.
Thanks in advance for your help. 
 
 
Can Analytic do this, part 2
Steve Read, July      14, 2006 - 1:36 am UTC
 
 
Correcting the comment lines from the previous post (I forgot to fix them after simplifying my example):
SELECT keynbr
      ,MAX(MAX(status)) OVER (PARTITION BY keynbr ORDER BY MAX(datetm) DESC)   status
 FROM -- should return 1, 4
     (          SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
      UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual 
      UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual 
      UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
      -- should return 2, 2
      UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual 
      UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual 
     )
 GROUP BY keynbr
 
 
 
Michel Cadot, July      14, 2006 - 3:03 am UTC
 
 
SQL> with
  2    raw_data as (
  3      SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
  4      UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual 
  5      UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual 
  6      UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
  7      UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual 
  8      UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual 
  9      UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual 
 10      UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual 
 11    ),
 12    numbered_data as (
 13      select keynbr, status,
 14             row_number() over (partition by keynbr order by datetm desc) rn
 15      from raw_data
 16    )
 17  select keynbr, status
 18  from numbered_data
 19  where rn = 1
 20  order by keynbr
 21  /
K S
- -
1 4
2 2
2 rows selected.
Have a look at the rank and dense_rank functions, depending on the result you want in case of duplicate dates.
But if you have huge data and (keynbr, datetm) is indexed, the following one will be faster and cheaper:
SQL> with
  2    raw_data as (
  3      SELECT '1' keynbr, '1' status, sysdate +1 datetm FROM dual
  4      UNION ALL SELECT '1' keynbr, '2' status, sysdate +2 datetm FROM dual 
  5      UNION ALL SELECT '1' keynbr, '3' status, sysdate +3 datetm FROM dual 
  6      UNION ALL SELECT '1' keynbr, '4' status, sysdate +4 datetm FROM dual
  7      UNION ALL SELECT '2' keynbr, '2' status, sysdate +1 datetm FROM dual 
  8      UNION ALL SELECT '2' keynbr, '3' status, sysdate +2 datetm FROM dual 
  9      UNION ALL SELECT '2' keynbr, '4' status, sysdate +3 datetm FROM dual 
 10      UNION ALL SELECT '2' keynbr, '2' status, sysdate +4 datetm FROM dual 
 11    ),
 12    max_data as ( 
 13     select keynbr, max(datetm) max_date
 14     from raw_data
 15     group by keynbr
 16    )
 17  select r.keynbr, r.status
 18  from raw_data r, max_data m
 19  where r.keynbr = m.keynbr
 20    and r.datetm = m.max_date
 21  order by r.keynbr
 22  /
K S
- -
1 4
2 2
2 rows selected.
Michel
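To make Michel's remark about duplicate dates concrete, here is a small hypothetical sketch (reusing the raw_data shape from above, with a deliberately tied datetm — the tied row is an assumption added for illustration): row_number() arbitrarily keeps one of the tied rows, while rank() keeps them all.

```sql
-- Hypothetical sketch: keynbr '1' has two rows sharing the same max datetm.
with raw_data as (
            SELECT '1' keynbr, '3' status, sysdate + 3 datetm FROM dual
  UNION ALL SELECT '1' keynbr, '4' status, sysdate + 4 datetm FROM dual
  UNION ALL SELECT '1' keynbr, '5' status, sysdate + 4 datetm FROM dual  -- tie
),
ranked as (
  select keynbr, status,
         row_number() over (partition by keynbr order by datetm desc) rn,
         rank()       over (partition by keynbr order by datetm desc) rnk
  from raw_data
)
select keynbr, status
from ranked
where rnk = 1;   -- both tied rows ('4' and '5') come back
-- "where rn = 1" instead would return exactly one of them, chosen arbitrarily
```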
 
 
 
 
Complex Query
Sankar, July      14, 2006 - 7:41 am UTC
 
 
Hi Tom,
I have a table like this:
Personid   Change_Sequence Class     Cls_EffDate    Location   Loc_EffDate
----------------------------------------------------------------------------
1000            1          FullTime                 Hawaii     
1000            2          FullTime                 California  1/1/2005
1000            3          PartTime    1/1/2006     California  1/1/2005
1000            4          PartTime    1/1/2006     Texas       10/1/2005
1000            5          FullTime    1/1/2007     Boston      1/1/2007
1000            6          COBRA       1/1/2005     Boston      1/1/2007 
1000            7          Outside     5/1/2006     Boston      1/1/2007  
1. The primary key is (Personid, change_sequence)
2. The effective dates of the first row for each person is null (i.e. for 
change_sequence = 1)
3. For each row only one column value will be effected (i.e. Either Class or 
Location can be changed) and the remaining data will be copied from the previous 
row 
4. Both Class and Location can be changed in a row, only if the effective date 
is same. [See where change_sequence = 5]
I am using the following query for getting Class and Location of the person as 
per given Effective Date. 
Select ECls.personid, ECls.class, ELoc.Location, ECls.change_sequence 
Cls_change_seq, ELoc.change_Sequence Loc_change_seq
  from (select * from (select personid, class, change_sequence,
                       row_number() over(partition by personid order by 
change_sequence desc) r
                  from employee
                 where nvl(Cls_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= 
to_date('&givenDate','MM/DD/YYYY')) Cls
         where Cls.r = 1) ECls,
       (Select * from (select personid, Location, change_sequence,
                       row_number() over(partition by personid order by 
change_sequence desc) r
                  from employee
                 where nvl(Loc_EffDate, to_date('&givenDate','MM/DD/YYYY')) <= 
to_date('&givenDate','MM/DD/YYYY')) Loc
         where Loc.r = 1) ELoc
 where ECls.personid = ELoc.personid;
Now lets run query with three different dates
For 1/1/2005 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        COBRA-Class       California                  6                  3   
 
For 1/1/2006 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        COBRA-Class       Texas                       6                  4
For 1/1/2007 the output would be
PERSONID    CLASS             LOCATION       CLS_CHANGE_SEQ    LOC_CHANGE_SEQ
------------------------------------------------------------------------------
1000        Outside           Boston                      7                  7
The table has more than 100,000 (1 lakh) rows and more than 12 columns based on effective dates, similar to CLASS and LOCATION (for example Dept, Dept_EffDate, Status, Status_EffDate, ...).
In the above query I created two instances, ECls and ELoc. If I want the Dept column too, I would need to create one more instance.
The goal is to get all the columns using fewer joins, if possible using only one instance of the table.
If this is not possible, what would be a better design to handle this type of data?
create table employee 
(personid number(10),
change_sequence number(3),
class varchar2(50),
cls_effdate date,
location varchar2(50),
loc_effdate date,
primary key (personid,change_sequence));
insert into employee values(1000,1,'FullTime',null,'Hawaii',null);
insert into employee 
values(1000,2,'FullTime',null,'California',to_date('1/1/2005','MM/DD/YYYY'));
insert into employee 
values(1000,3,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'California',to_date('1
/1/2005','MM/DD/YYYY'));
insert into employee 
values(1000,4,'PartTime',to_date('1/1/2006','MM/DD/YYYY'),'Texas',to_date('10/1/2
005','MM/DD/YYYY'));
insert into employee 
values(1000,5,'FullTime',to_date('1/1/2007','MM/DD/YYYY'),'Boston',to_date('1/1/2
007','MM/DD/YYYY'));
insert into employee 
values(1000,6,'COBRA-Class',to_date('1/1/2005','MM/DD/YYYY'),'Boston',to_date('1/
1/2007','MM/DD/YYYY'));
insert into employee 
values(1000,7,'Outside',to_date('5/1/2006','MM/DD/YYYY'),'Boston',to_date('1/1/20
07','MM/DD/YYYY'));
Thanks in Advance
Sankar
 
 
July      14, 2006 - 8:37 am UTC 
 
it might just be me, but points 4 and 5 seem to conflict with each other.
also, why do the dates go UP and DOWN as the sequence increases??? doesn't make sense.
seems the dates should only INCREASE or stay the same over time. 
 
 
 
Continue with grouping sets in ORACLE10G
Mariana, July      14, 2006 - 9:48 am UTC
 
 
Hi,Tom,
This is just an example:
SUPPLIER_NO ACCOUNT_NO LINE_NO AMOUNT
=========== ========== ======= ======
   123       555555    10001   34
   123       555555    20034   56
   123       555555    30034   12
   123                         102 
   234       345555    34555   56  
   234       454555    45455   34
   234                         90    
But what i really need in my project is
to run through the cursor
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT_NO 
FROM KM_SUP_PAYMENTS A,
     KM_SUP_SUPPLIERS B
WHERE A.SUPPLIER_NO=B.SUPPLIER_NO
ORDER BY A.SUPPLIER_NO
For each different supplier I have to insert sum(amount) into an array, but for each row I also have to make other checks. I thought about computing sum(amount) in code, but for that I have to use the previous row's values from the cursor. I also thought about using the analytic sum(amount)
on every row, but I am not sure that is good. Please help.
Thank you very much
           
 
 
July      14, 2006 - 9:59 am UTC 
 
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       sum(A.AMOUNT_NO)
FROM KM_SUP_PAYMENTS A,
     KM_SUP_SUPPLIERS B
WHERE A.SUPPLIER_NO=B.SUPPLIER_NO
group by grouping sets( 
    (a.supplier_no,b.supplier_name,a.line_no,a.payment_no,b.vat_sw),
    (a.supplier_no)
)
ORDER BY A.SUPPLIER_NO
I am going to keep saying basically the same thing over and over - you want a subtotal.  Hence, I'll keep bringing up GROUPING SETS!!!!!!
 
 
 
 
Answer to Complex Query
Andy, July      14, 2006 - 10:32 am UTC
 
 
To Sankar,
This query should work for you. 
select 
  personid,
  /* repeat in the same pattern for other fields */
  max(cls_max_chg_seq) cls_chg_seq,
  max(case when (change_sequence = cls_max_chg_seq) then class end) class,
  max(case when (change_sequence = cls_max_chg_seq) then cls_effdate end) cls_effdate,
  /* repeat in the same pattern for other fields */
  max(loc_max_chg_seq) loc_chg_seq,
  max(case when (change_sequence = loc_max_chg_seq) then location end) location,
  max(case when (change_sequence = loc_max_chg_seq) then loc_effdate end) loc_effdate
from
    (
    select personid, change_sequence,
      /* repeat in the same pattern for other fields */
      class,
      cls_effdate,
      max(case when cls_effdate <= the_date then change_sequence else 1 end) over(partition by personid) cls_max_chg_seq,
      /* repeat in the same pattern for other fields */
      location,
      loc_effdate,
      max(case when loc_effdate <= the_date then change_sequence else 1 end) over(partition by personid) loc_max_chg_seq
    from 
      employee
    )
group by personid;
Here analytics gives you the max change sequence with effective date less than the input date (the_date => replace it with your own input).
Then group by, with max, allows you select the required value. This query can be readily extended to multiple attributes. 
 
 
Continue to grouping sets
Mariana, July      14, 2006 - 10:35 am UTC
 
 
I understand you, but what confuses me is that I do this through a cursor, and I would have to put into the grouping sets all the fields that appear in the select. What about using the analytic sum(amount) on every row instead, and writing out the sum when the previous supplier_no != current supplier_no?
Thank you. 
 
July      14, 2006 - 12:34 pm UTC 
 
not understanding you.
where does a cursor come into the discussion, I'm getting the answer in SQL - no code.
Less code = less bugs.  Strive to write less code.
 
 
 
 
Analytics use
Thiru, July      14, 2006 - 11:55 am UTC
 
 
Hello Tom,
Great site. Thank so much.
Can you advise how analytics could be used (if possible, or any other way) to get the results for the query below:
CREATE TABLE Z (A CHAR(2),B NUMBER,C NUMBER,D NUMBER);
CREATE TABLE Y (A CHAR(2) PRIMARY KEY,B NUMBER,C NUMBER);
BEGIN
INSERT INTO Y VALUES('AA',200,NULL);
INSERT INTO Y VALUES('AB',200,NULL);
INSERT INTO Y VALUES('AC',200,NULL);
INSERT INTO Y VALUES('AD',200,NULL);
INSERT INTO Z VALUES('AB',100,1,0);
INSERT INTO Z VALUES('AB',100,1,0);
INSERT INTO Z VALUES('AC',100,1,1);
INSERT INTO Z VALUES('AC',100,1,0);
INSERT INTO Z VALUES('AD',100,0,0);
INSERT INTO Z VALUES('AD',100,0,0);
COMMIT;
END;
I would like to update Table Y col c with the sum(Z.B) grouped by Z.A where SUM(Z.B)=Y.B AND  Z.C =1 for all records in the group and Z.D=0 for all records in the group . 
Y.A=Z.A  and Y.A is primary key.
 
 
July      14, 2006 - 12:53 pm UTC 
 
AND  Z.C =1 for all records in the group and Z.D=0 for all records 
in the group 
you lost me there.
do you mean something like
update y 
   set c = b
 where (a,b) in 
(
select z.a, sum(z.b)
  from z
 where z.c = 1
   and z.d = 0
 group by z.a
)
since SUM(Z.B) must be Y.B, we don't need to update from Z, we just need to find the records to update right?
 
 
 
 
I messed up the question
Thiru, July      14, 2006 - 1:39 pm UTC
 
 
Sorry Tom for the confusion, I really meant:
I would like to update Table Y col c with the sum(Z.B) grouped by Z.A where 
Z.C =1 for all records in the group and Z.D=0 for all records 
in the group . 
Y.A=Z.A  and Y.A is primary key.
The sum(Z.b)=y.b is not in the condition. I will have to take the values from Z. It is just coincidence that I made the scenario where y.b looked the same as sum(z.b).
Will that make the subquery more complex? 
Also, can we use analytics for this, given the millions of rows in the two tables?
 
 
July      14, 2006 - 1:57 pm UTC 
 
Not just a coincidence - you said it:
....
grouped by Z.A where SUM(Z.B)=Y.B AND  Z.C =1 
........
what version you got going here?
are there any Z.A values that are NOT in Y.A?
analytics will NOT come into play here - aggregates will. 
 
 
 
Thiru, July      14, 2006 - 2:01 pm UTC
 
 
"what version you got going here?
are there any Z.A values that are NOT in Y.A?"
I am on 10gR2.
Yes it is possible that Z.A values will not be in Y.A. 
  
 
July      14, 2006 - 2:47 pm UTC 
 
ops$tkyte@ORA10GR2> select * from y;
A           B          C
-- ---------- ----------
AA        200
AB        200
AC        200
AD        200
ops$tkyte@ORA10GR2> merge into y
  2  using
  3  (
  4  select z.a, sum(z.b) new_c
  5    from z
  6   where z.c = 1
  7     and z.d = 0
  8   group by z.a
  9  ) z
 10  on (y.a = z.a)
 11  when matched then update set c = z.new_c;
2 rows merged.
ops$tkyte@ORA10GR2> select * from y;
A           B          C
-- ---------- ----------
AA        200
AB        200        200
AC        200        100
AD        200
 
 
 
 
 
Thiru, July      14, 2006 - 3:03 pm UTC
 
 
The Y table should have been updated for only one row, i.e. 'AB', and not 'AC',
as only AB in table Z satisfies both conditions: all records in group AB have 1 for column c and 0 for column d, while for 'AC' one record has the value 1 in column d, so it does not satisfy the condition.
select * from z
/
A     B   C   D
-- ---- --- ---
AB  100   1   0
AB  100   1   0
AC  100   1   1
AC  100   1   0
AD  100   0   0
AD  100   0   0
 
 
July      14, 2006 - 3:16 pm UTC 
 
that was not entirely clear.
easy enough however
ops$tkyte@ORA10GR2> merge into y
  2  using
  3  (
  4  select z.a, sum(z.b) new_c
  5    from z
  6   group by z.a
  7  having count(*) = count( case when c=1 and d=0 then 1 end)
  8  ) z
  9  on (y.a = z.a)
 10  when matched then update set c = z.new_c;
1 row merged.
ops$tkyte@ORA10GR2> select * from y;
A           B          C
-- ---------- ----------
AA        200
AB        200        200
AC        200
AD        200
 
 
 
 
 
Wow!
Thiru, July      14, 2006 - 3:21 pm UTC
 
 
Hats off to you! I was trying using CASE but just could not get it right.
Thanks. 
 
 
Continue to grouping sets in ORACLE10G
MARIANA, July      14, 2006 - 5:12 pm UTC
 
 
What I meant is that in my stored procedure I have to use this select in a cursor: I have to process each row of the cursor because I have logic that must be done in code.
My question is: may I use the analytic function instead of grouping sets?
like:
CURSOR
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)   
FROM KM_SUP_PAYMENTS A,
     KM_SUP_SUPPLIERS B
WHERE A.SUPPLIER_NO=B.SUPPLIER_NO
ORDER BY A.SUPPLIER_NO
Thank you 
 
July      14, 2006 - 5:28 pm UTC 
 
you may do whatever you like, yes.  of course.
(you could also just fetch out the grouping() flag:
ops$tkyte@ORA10GR2> select supplier_no, account_no, line_no, sum(amount),
  2         grouping(account_no)
  3    from (
  4  select deptno supplier_no, job account_no, empno line_no, sal amount
  5    from emp
  6         )
  7   group by grouping sets((supplier_no,account_no,line_no),(supplier_no))
  8  /
SUPPLIER_NO ACCOUNT_N    LINE_NO SUM(AMOUNT) GROUPING(ACCOUNT_NO)
----------- --------- ---------- ----------- --------------------
         10 CLERK           7934        1300                    0
         10 MANAGER         7782        2450                    0
         10 PRESIDENT       7839        5000                    0
         10                             8750                    1
         20 CLERK           7369         800                    0
         20 CLERK           7876        1100                    0
         20 ANALYST         7788        3000                    0
         20 ANALYST         7902        3000                    0
         20 MANAGER         7566        2975                    0
         20                            10875                    1
         30 CLERK           7900         950                    0
         30 MANAGER         7698        2850                    0
         30 SALESMAN        7499        1600                    0
         30 SALESMAN        7521        1250                    0
         30 SALESMAN        7654        1250                    0
         30 SALESMAN        7844        1500                    0
         30                             9400                    1
17 rows selected.
so you know what rows are "made up for you" instead of doing the old "if curr <> last then ..." trick. 
 
 
 
 
Continue with grouping settings in ORACLE 10G
Mariana, July      14, 2006 - 6:32 pm UTC
 
 
Thank you,Tom,
I have a question about the analytic function in this context:
is this function expensive?
Is it calculated for each row,
or just when supplier_no changes?
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)   
FROM KM_SUP_PAYMENTS A,
     KM_SUP_SUPPLIERS B
WHERE A.SUPPLIER_NO=B.SUPPLIER_NO
ORDER BY A.SUPPLIER_NO
 
 
July      15, 2006 - 3:19 pm UTC 
 
it'll have to 
a) join
b) sort/hash/partition by supplier_no
c) compute the sum(a.amount)
"expensive" - probably less so than you doing it yourself since (a) and (b) would have to be done for you - letting the server do C is a good idea.
However, grouping sets *STILL* seems more appropriate. 
 
 
 
Can this be done?
A reader, July      15, 2006 - 11:43 am UTC
 
 
Hi Tom,
Can this be done in one select through analytic function, instead of two sql stmnts.
1  select 'client, '||count(*) from client
  2  union all
  3* select to_char(client_key) from client
SQL> /
'CLIENT,'||COUNT(*)
------------------------------------------------
client, 10
10
12
16
22
23
24
25
26
29
31
11 rows selected.
Regards,
Tarun
 
 
 
July      15, 2006 - 3:24 pm UTC 
 
assuming client_key is "unique"
scott%ORA10GR2> select decode( grouping(empno),
  2                 0, to_char(empno),
  3                             1, 'client, ' || count(empno) ) data
  4    from emp
  5   group by rollup(empno)
  6   order by grouping(empno) DESC, empno
  7  /
DATA
------------------------------------------------
client, 14
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876
7900
7902
7934
15 rows selected.
 
 
 
 
Good one!
Michel Cadot, July      15, 2006 - 3:48 pm UTC
 
 
 
 
Continue to grouping sets ORACLE 10G
Mariana, July      15, 2006 - 4:22 pm UTC
 
 
Hi Tom,
Thank you for question,
I didn't understand which join is executed in this analytic function. And is the analytic function executed for each row, or once per distinct supplier_no?
SELECT A.SUPPLIER_NO,
       B.SUPPLIER_NAME,
       A.LINE_NO,
       A.PAYMENT_NO,
       B.VAT_SW,
       A.AMOUNT,
       SUM(A.AMOUNT) OVER (PARTITION BY A.SUPPLIER_NO)   
FROM KM_SUP_PAYMENTS A,
     KM_SUP_SUPPLIERS B
WHERE A.SUPPLIER_NO=B.SUPPLIER_NO
ORDER BY A.SUPPLIER_NO
 
 
July      16, 2006 - 9:36 am UTC 
 
you are doing the join?  right there - A to B.
I was describing in effect what would have to take place to process said query.
a) join
b) sort
c) compute analytic.
If you did it yourself it would be:
a) join
b) sort
c) YOU compute the analytic
I'm going to guess it will be better for US to compute the analytic
I'll even guess that GROUPING SETS *is still the right answer* :) 
 
 
 
Complex Query
Sankar Kumar, July      16, 2006 - 10:59 am UTC
 
 
To Andy,
     That is what I am looking for. Thanks a lot 
 
 
What if there is nothing to group on?
Phil, July      21, 2006 - 9:11 am UTC
 
 
Hi Tom
This is great but unlike "traditional" SQL I find it hard to learn. Can you recommend a book that deciphers it for the likes of me please?
I have some data which I would like to analyse but I am not sure if I can do it:
WHEN (date)                  DESCRIPTION (varchar2)  
7/21/2006 1:10:44 PM     Starting processX. (From 14-MAY-06 for 20 days)
7/21/2006 1:10:42 PM     Completed balance update processing in pr_update_lagging_RTS_bals. 7419 balances processed
7/21/2006 11:10:41 AM   Completed processing in processcdr. 433923 events,28567 accounts*days
7/21/2006 11:10:41 AM   Starting balance update processing in pr_update_lagging_RTS_bals. 
7/20/2006 9:42:48 PM     Starting processX. (From 24-APR-06 for 20 days)
7/20/2006 9:42:46 PM     Completed balance update processing in pr_update_lagging_RTS_bals. 2855 balances processed
7/20/2006 8:58:01 PM     Completed processing in processcdr. 286963 events,7211 accounts*days
7/20/2006 8:58:01 PM     Starting balance update processing in pr_update_lagging_RTS_bals. 
7/20/2006 8:25:48 PM     Starting processX. (From 04-APR-06 for 20 days)
7/20/2006 8:25:46 PM     Completed balance update processing in pr_update_lagging_RTS_bals. 528 balances processed
7/20/2006 8:17:30 PM     Completed processing in processX. 106057 CDR events,1009 accounts*days
7/20/2006 8:17:30 PM     Starting balance update processing in pr_update_lagging_RTS_bals. 
7/20/2006 8:06:00 PM     Starting processX. (From 15-MAR-06 for 20 days)
I only have these two columns. Can I query this for a result as follows:
Process #1 completed in xx minutes
Process #2 completed in yy minutes
...
If I could, I could then do some extra substr to get the frequency and performance (process per min) on this aspect of the app.
Cheers,
Phil 
 
July      23, 2006 - 7:29 am UTC 
 
the data warehousing guide is a pretty good read.  And remember "traditional" SQL looked like Klingon to you until you learned it.  Then and only then did you start thinking of it as "traditional" easy stuff.
If you have access to Expert One On One Oracle - I have a chapter on them in there as well that explains them in some detail.
we could do what you need in all likelihood - what you would have to do is
a) provide create tables
b) with inserts to populate them
c) and most importantly describe how to find the begin/end records for a process.  I see two possible records - "starting processx" and "starting balance update"
A simple lag() or lead() will do this easily - we just select out the records that represent start and stop and using lag - you can get the prior record attached to the current record. 
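A sketch of that lag() idea, assuming the log lives in a hypothetical table log_t(when_dt, description) (names assumed; Phil's actual column WHEN would need quoting as a reserved word): keep only the rows that bound a run, attach the prior timestamp with lag(), then report only the rows that close a run.

```sql
-- Hypothetical sketch: a run starts at "Starting processX" and ends at the
-- following "Completed balance update" row; once the other rows are filtered
-- out, lag() attaches each start time to its completion row.
select start_dt,
       end_dt,
       round( (end_dt - start_dt) * 24 * 60, 1 ) elapsed_minutes
  from (select when_dt end_dt,
               description,
               lag(when_dt) over (order by when_dt) start_dt
          from log_t
         where description like 'Starting processX%'
            or description like 'Completed balance update%')
 where description like 'Completed balance update%';
```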
 
 
 
Gints, July      21, 2006 - 3:10 pm UTC
 
 
create table a1(a number);
insert into a1 values(1);
insert into a1 values(2);
insert into a1 values(3);
insert into a1 values(4);
insert into a1 values(5);
SQL> select max (a) over (order by a)
2 from a1;
MAX(A)OVER(ORDERBYA)
--------------------
1
2
3
4
5
SQL> select min (a) over (order by a desc)
2 from a1;
MIN(A)OVER(ORDERBYADESC)
------------------------
5
4
3
2
1
SQL> select max (a) over (order by a), min (a) over (order by a desc)
2 from a1;
MAX(A)OVER(ORDERBYA) MIN(A)OVER(ORDERBYADESC)
-------------------- ------------------------
5 5
4 4
3 3
2 2
1 1
Why does max(a) over (order by a) change its appearance? :OOOO
Even when I explicitly specify the window clause the results look wrong, although separately everything is ok:
select max (a) over (order by a rows between unbounded preceding and current row),
max (a) over (order by a desc rows between current row and unbounded following)
from a1;
gives the same....
 
 
 
July      23, 2006 - 7:52 am UTC 
 
maybe when we include "a" in the query and order each query by "a" so they are presenting the data in the same order - it'll make more sense and you'll understand why the answer is in fact "correct" (remember, the rows can only be sorted ONE WAY :)
ops$tkyte%ORA10GR2> select a, max (a) over (order by a) from a1 order by a;
         A MAX(A)OVER(ORDERBYA)
---------- --------------------
         1                    1
         2                    2
         3                    3
         4                    4
         5                    5
ops$tkyte%ORA10GR2> select a, min (a) over (order by a desc) from a1 order by a;
         A MIN(A)OVER(ORDERBYADESC)
---------- ------------------------
         1                        1
         2                        2
         3                        3
         4                        4
         5                        5
ops$tkyte%ORA10GR2> select a, max (a) over (order by a), min (a) over (order by a desc) from a1 order by a;
         A MAX(A)OVER(ORDERBYA) MIN(A)OVER(ORDERBYADESC)
---------- -------------------- ------------------------
         1                    1                        1
         2                    2                        2
         3                    3                        3
         4                    4                        4
         5                    5                        5
The max(a) when you order by a in the over clause is the same as A
The min(a) when you order by a DESC in the over clause is the same as A
hence, for your data - max(a) over ... is the SAME as min(a) over ...
and both are the same as A! 
 
 
 
 
Problem in Analitic function
Mariana, July      26, 2006 - 4:44 pm UTC
 
 
Hello,Tom,
I have a table:
create table Lines
(LINE_NO NUMBER(4),
 MONTH   DATE,
 POINTS_NUM NUMBER(3));
INSERT INTO LINES
VALUES(1001,TO_DATE('01/03/2006','DD/MM/YYYY'),25);
INSERT INTO LINES
VALUES(1001,TO_DATE('01/04/2006','DD/MM/YYYY'),15);
INSERT INTO LINES
VALUES(1001,TO_DATE('01/05/2006','DD/MM/YYYY'),40);
INSERT INTO LINES
VALUES(1002,TO_DATE('01/03/2006','DD/MM/YYYY'),35);
INSERT INTO LINES
VALUES(1002,TO_DATE('01/04/2006','DD/MM/YYYY'),12);
INSERT INTO LINES
VALUES(1002,TO_DATE('01/05/2006','DD/MM/YYYY'),76);
SELECT LINE_NO,MONTH,POINTS_NUM FROM LINES
ORDER BY LINE_NO,MONTH;
LINE_NO  MONTH POINTS_NUM
=======  ===== ==========
1001     03/06  25
1001     04/06  15  
1001     05/06  40
1002     03/06  35
1002     04/06  12
1002     05/06  76
   
What I really need is:
1) For each group of lines (I have 2 groups, 1001 and 1002), from the second row of each group I have to calculate the percentage change of POINTS_NUM from the previous month; for each group I also have to calculate the cumulative percentage change of POINTS_NUM:
it will look like:
LINE_NO  MONTH POINTS_NUM PERCENTAGE_CHANGE CUMULATIVE
=======  ===== ========== ================= ==========
1001     03/06  25
1001     04/06  15        (15-25)/25*100    (15-25)/25*100
1001     05/06  40        (40-15)/15*100    (40-25)/25*100 
1002     03/06  35
1002     04/06  12        (12-35)/35*100     (12-35)/35*100
1002     05/06  76        (76-12)/12*100   (76-35)/35*100  
   
And I have to calculate the total cumulative percentage:
SELECT SUM(POINTS_NUM) SUM1 FROM LINES
WHERE MONTH='01-MAR-2006' 
SUM1:=25+35=60;
SELECT SUM(POINTS_NUM) SUM2 FROM LINES
WHERE MONTH='01-MAY-2006'
SUM2:=40+76=116;
TOTAL CUMULATIVE PERCENTAGE=((116-60)/60)*100;
My question is: how can I do everything I wrote above in one select with analytic functions?
Thank you very much,
Mariana 
 
July      26, 2006 - 5:08 pm UTC 
 
here is your start:
ops$tkyte%ORA10GR2> select line_no, month, points_num,
  2         lag(points_num) over (partition by line_no order by month) last_points_num,
  3         first_value(points_num) over (partition by line_no order by month) first_points_num
  4    from lines
  5  /
   LINE_NO MONTH     POINTS_NUM LAST_POINTS_NUM FIRST_POINTS_NUM
---------- --------- ---------- --------------- ----------------
      1001 01-MAR-06         25                               25
      1001 01-APR-06         15              25               25
      1001 01-MAY-06         40              15               25
      1002 01-MAR-06         35                               35
      1002 01-APR-06         12              35               35
      1002 01-MAY-06         76              12               35
6 rows selected.
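Building on that start, a hedged sketch of the per-row percentages, plugging Mariana's own formulas into the lag()/first_value() columns (the total cumulative percentage over all lines is a plain aggregate and a separate, single-row result):

```sql
-- Sketch: percentage change vs. the previous month (lag) and cumulative
-- change vs. the first month of each line_no (first_value).
select line_no, month, points_num,
       round( (points_num
               - lag(points_num) over (partition by line_no order by month))
              * 100
              / lag(points_num) over (partition by line_no order by month),
              2) percentage_change,
       round( (points_num
               - first_value(points_num)
                   over (partition by line_no order by month))
              * 100
              / first_value(points_num)
                   over (partition by line_no order by month),
              2) cumulative
  from lines
 order by line_no, month;
-- percentage_change is NULL on each group's first row (lag returns NULL);
-- cumulative is 0 there rather than blank - wrap it in a CASE if unwanted
```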
 
 
 
 
 
Continue to calc cumulative percentage
Mariana, July      26, 2006 - 5:12 pm UTC
 
 
Hi Tom,
Thank you for answer,
but how can i calc the total cumulative percentage? 
 
July      26, 2006 - 5:20 pm UTC 
 
you already seem to have those queries coded?!? 
 
 
 
Continue to cumulative percentage
Mariana, July      26, 2006 - 5:33 pm UTC
 
 
Sorry, I didn't understand you.
What do you mean? 
 
July      26, 2006 - 5:42 pm UTC 
 
you wrote the summation queries for that - union all them if you want.
You want analytics for the lag()/first_value()
you need plain old aggregation for the "cumulative percentage" - and the two results are entirely different "shapes"
don't see how they fit together really 
 
 
 
Can you show me by example ,please, what you mean
A reader, July      26, 2006 - 5:46 pm UTC
 
 
Mariana 
 
July      26, 2006 - 5:53 pm UTC 
 
problem is - I wasn't sure what you meant by cumulative percent.
but your two sum queries certainly retrieve what you say you wanted?  You have already written them.
....
and i have to calc total cumulative percentage:
SELECT SUM(POINTS_NUM) SUM1 FROM LINES
WHERE MONTH='01-MAR-2006' 
SUM1:=25+35=60;
SELECT SUM(POINTS_NUM) SUM2 FROM LINES
WHERE MONTH='01-MAY-2006'
SUM2:=40+76=116;
TOTAL CUMULATIVE PERCENTAGE=((116-60)/60)*100;
..........
 
 
 
 
Analytic Query with other functions
A reader, July      26, 2006 - 5:52 pm UTC
 
 
Hi Tom,
How do i write the following query:
 select 1, max(decode(dummy,null,'NN',null)) D,
 max(dummy) over (partition by dummy order by dummy desc) S
 from dual
 group by 1;
I get the following error:
SQL> /
max(dummy) over (partition by dummy order by dummy desc) S
    *
ERROR at line 2:
ORA-00979: not a GROUP BY expression
I do not have access to the SCOTT schema, so I used dual instead.
Oracle version: 9.2.0.7
Thanks as always
 
 
 
July      26, 2006 - 6:00 pm UTC 
 
why are you grouping by.
I cannot tell you how to write a query when you haven't told me what question you are trying to ask.
you don't use group by with analytics, you use group by with aggregates. 
 
 
 
Sorry for the improper posting
A reader, July      27, 2006 - 10:58 am UTC
 
 
Hi Tom,
sorry for the question.
Well the logic is like this;
I have a query on similar lines as follows:
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from tab_a a,
tab_b b,
tab_c c
where 
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
and a.dt = (select max(in_a.dt) from tab_a in_a
             where in_a.col1 = a.col1
             and in_a.dt = a.dt)
and b.col2 = (select max(in_b.col2) from tab_b in_b
             where in_b.col1 = b.col1
             and in_b.col2 = b.col2)
group by a.col1, b.col2, c.col3, a.dt;
I want to know how can i rewrite that query using Analytic Functions or in a alternate manner.
Thanks as always.
 
 
July      27, 2006 - 12:16 pm UTC 
 
instead of using tab_a in the from list use:
(select *
   from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
  where dt = max_dt 
) 
(I assume you meant to use col1, col2 not col1, dt in the subquery)
do the same for b
and drop the correlated subquery from the where clause 
 
 
 
Thanks for the quick response.
A reader, July      27, 2006 - 1:56 pm UTC
 
 
Hi Tom,
Dropping the correlated subquery was the idea, but I did not know how to rewrite the query.
So as per your suggestion it should be as follows:
Original Query:
----------------
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from tab_a a,
tab_b b,
tab_c c
where 
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
and a.dt = (select max(in_a.dt) from tab_a in_a
             where in_a.col1 = a.col1
             and in_a.dt = a.dt)
and b.col2 = (select max(in_b.col2) from tab_b in_b
             where in_b.col1 = b.col1
             and in_b.col2 = b.col2)
group by a.col1, b.col2, c.col3, a.dt;
Your Suggestion:
----------------
instead of using tab_a in the from list use:
(select *
   from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
  where dt = max_dt 
) 
Query Rewritten as:
--------------------
select a.col1, b.col2, c.col3, a.dt,
max(decode(a.col2,null,'NN',null)) D,
from (select *
   from (select a.*, max(dt) over (partition by col1, col2) max_dt from tab_a a)
  where dt = max_dt 
) a,
(select *
   from (select b.*, max(col2) over (partition by col1, col2) max_dt from tab_b b)
  where dt = max_dt 
) b,
tab_c c
where 
a.col1 = b.col1
and a.col2 = c.col2
and c.col1 = b.col1
group by a.col1, b.col2, c.col3, a.dt;
Please correct me if I am wrong.
Thanks as always
 
 
July      27, 2006 - 2:28 pm UTC 
 
that is correct, all you want is the "most current record by col1, col2" and
that analytic gets that record.
the second max would be max(dt), not max(col2) 
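A hedged sketch of the tab_b inline view with the alias mismatch fixed - the posted rewrite computed max(col2) but then filtered on dt = max_dt, so the alias and the filter column did not match. This reading follows the original correlated subquery (max of col2 per col1); if the real intent was "most current by date", swap col2 for dt:

```sql
-- Inline view for tab_b: the alias and the filter must refer to the same column.
(select *
   from (select b.*,
                max(col2) over (partition by col1) max_col2
           from tab_b b)
  where col2 = max_col2
) b
```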
 
 
 
What in case...
A reader, July      28, 2006 - 3:59 pm UTC
 
 
Hi Tom,
With reference to the question above:
What should be the approach if I have to take a max() over two or more columns from the same table?
Can you explain with an example using the same query?
Thanks in advance.
 
 
July      28, 2006 - 8:53 pm UTC 
 
give me a test case to work with - not sure what you want to do with the max of these two columns 
 
 
 
RBO and Analytical Functions
Muhammad Riaz Shahid, July      30, 2006 - 4:42 am UTC
 
 
Tom,
Can we use analytical functions while using RBO?
SQL> create table x(a number);
Table created.
SQL>  select row_number() over(order by 1) from x;
no rows selected
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   WINDOW (NOSORT)
   2    1     TABLE ACCESS (FULL) OF 'X'
DB Version 9.2.0.5.0
 
 
 
July      30, 2006 - 8:42 am UTC 
 
you just did. 
 
 
 
Analytics and RBO
Jonathan Lewis, July      31, 2006 - 5:44 am UTC
 
 
You need to be a little careful with RBO and analytics.  RBO is allowed to use an index to implement an "order by" clause without sorting - but if you include an analytic function in the select list the run-time engine may have to sort the data to partition or order by inside the analytic call - leaving the data in the wrong order ... RBO does not know about everything that can happen inside analytics.
I think there's a note about this, with an example, on my website.
 
 
 
A reader, August    05, 2006 - 1:50 am UTC
 
 
Hi Tom,
I have the following analytic query which works fine, but I have one little problem with it.
It must be clear how the query is executed.
The Code column can only be A or B; I use this column in my partitioning clause.
The columns pos1, pos2, pos3 ... pos41 can only be 0, 1 or 2:
0 counts as 0
1 counts as 1
2 counts as 1
Now the problem: the 0 and 1 values can be calculated in one query to get a result,
but the columns with the value 2 cannot be calculated within the same query.
I must have two results, one for the values 1 and 0 and one for the values 2 and 0.
17/07/2006;05:59:58;17/07/2006;0388;XXXXXX1;A;1;2;0;0
17/07/2006;06:00:05;17/07/2006;1242;XXXXXX2;A;1;2;1;0
17/07/2006;06:00:11;17/07/2006;1146;XXXXXX3;A;2;1;2;0
24/07/2006;05:59:58;24/07/2006;0388;XXXXXX4;B;2;1;0;0
24/07/2006;06:00:05;24/07/2006;1242;XXXXXX5;B;1;2;1;0
24/07/2006;06:00:11;24/07/2006;1146;XXXXXX6;B;1;2;2;0
for the values 0 and 1 the result should look like this:
C POS1 POS2 POS3 POS3
- ---- ---- ---- ----
A    2    1    1    0
B    2    1    1    0
for the values 0 and 2 the result should look like this:
C POS1 POS2 POS3 POS3
- ---- ---- ---- ----
A    1    2    1    0
B    1    2    1    0
select distinct code,pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9,pos10,pos11,pos12,
       pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24
  from (
select vin_code,
       sum(pos1) over (partition by code order by code) as pos1,
       sum(pos2) over (partition by code order by code) as pos2,
       sum(pos3) over (partition by code order by code) as pos3,
       sum(pos1) over (partition by code order by code) as pos4,
       sum(pos2) over (partition by code order by code) as pos5,
       sum(pos1) over (partition by code order by code) as pos6,
       sum(pos2) over (partition by code order by code) as pos7,
       sum(pos3) over (partition by code order by code) as pos8,
       sum(pos1) over (partition by code order by code) as pos9,
       sum(pos2) over (partition by code order by code) as pos10,
       sum(pos3) over (partition by code order by code) as pos11,
       sum(pos1) over (partition by code order by code) as pos12,
       sum(pos2) over (partition by code order by code) as pos13,
       sum(pos3) over (partition by code order by code) as pos14,
       sum(pos1) over (partition by code order by code) as pos15,
       sum(pos2) over (partition by code order by code) as pos16,
       sum(pos3) over (partition by code order by code) as pos17,
       sum(pos1) over (partition by code order by code) as pos18,
       sum(pos2) over (partition by code order by code) as pos19,
       sum(pos3) over (partition by code order by code) as pos20,
       sum(pos1) over (partition by code order by code) as pos21,
       sum(pos2) over (partition by code order by code) as pos22,
       sum(pos3) over (partition by code order by code) as pos23,
       sum(pos1) over (partition by code order by code) as pos24
  from monitoring)
group by code,pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9,pos10,pos11,pos12,
pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24;
So hope you have an idea ;-)
Regards
Marcel 
 
August    05, 2006 - 10:50 am UTC 
 
no idea, didn't follow this logic at all.
And I see no table creates, no inserts..... 
 
 
 
Question review
Marcel, August    07, 2006 - 1:31 am UTC
 
 
Hi Tom,
I will try it again without errors in the question !
First the create table and some inserts
create table  vin_check 
(
seq number(4),
vin_code varchar2(10),
system varchar2(1),
pos1 number(1),
pos2 number(1),
pos3 number(1)
);
insert into vin_check values(0001,'XXXXXXXX01','A',1,1,1);
insert into vin_check values(0002,'XXXXXXXX02','A',0,1,2);
insert into vin_check values(0003,'XXXXXXXX03','A',2,2,0);
insert into vin_check values(0004,'XXXXXXXX04','A',1,0,1);
insert into vin_check values(0005,'XXXXXXXX05','B',1,1,1);
insert into vin_check values(0006,'XXXXXXXX06','B',0,1,2);
insert into vin_check values(0006,'XXXXXXXX06','B',2,2,0);
insert into vin_check values(0007,'XXXXXXXX07','B',1,0,1);
The problem I have is the following: I must calculate the errors not all together,
but separated by system, and how they sum up depends on the error value.
So I took system as the partition column (query see below).
Here is what the numbers mean:
0 is no error,
1 an internal error,
2 an external error.
       SEQ VIN_CODE   S       POS1       POS2       POS3
---------- ---------- - ---------- ---------- ----------
         1 XXXXXXXX01 A          1          1          1
         2 XXXXXXXX02 A          0          1          0
         3 XXXXXXXX03 A          2          2          0
         4 XXXXXXXX04 A          2          0          1
         5 XXXXXXXX05 B          1          1          1
         6 XXXXXXXX06 B          0          1          2
         6 XXXXXXXX06 B          2          2          0
         7 XXXXXXXX07 B          0          0          2
select distinct system,pos1,pos2,pos3  from (
select system,
       sum(pos1) over (partition by system order by system) as pos1,
       sum(pos2) over (partition by system order by system) as pos2,
       sum(pos3) over (partition by system order by system) as pos3
  from vin_check)
group by system,pos1,pos2,pos3;
S       POS1       POS2       POS3
- ---------- ---------- ----------
A          5          4          2
B          3          4          5
This would be the result of the query above, but I need a different result:
the errors do not have the same weight (for me) and must be treated separately. So I need to calculate the
errors with value 1 and the errors with value 2 separately.
If I ran the query for errors with the value 1, the result should look like this:
S       POS1       POS2       POS3
- ---------- ---------- ----------
A          1          2          2
B          1          2          1
If I ran the query for errors with the value 2, the result should look like this:
S       POS1       POS2       POS3
- ---------- ---------- ----------
A          2          1          0
B          1          1          2
Thanks Marcel 
 
August    07, 2006 - 8:00 am UTC 
 
this:
select distinct system,pos1,pos2,pos3  from (
select system,
       sum(pos1) over (partition by system order by system) as pos1,
       sum(pos2) over (partition by system order by system) as pos2,
       sum(pos3) over (partition by system order by system) as pos3
  from vin_check)
group by system,pos1,pos2,pos3;
is just wrong, you use analytics when you DON'T want to aggregate, just aggregate otherwise:
select system, sum(pos1), sum(pos2), sum(pos3) from vin_check group by system;
that is all that query should ever be.
Now, it would be nice if your supplied example data (create+inserts) matched your text example!!! but they don't, we'll make do:
tkyte%ORCL> select * from vin_check;
 SEQ VIN_CODE   S POS1 POS2 POS3
---- ---------- - ---- ---- ----
   1 XXXXXXXX01 A    1    1    1
   2 XXXXXXXX02 A    0    1    2
   3 XXXXXXXX03 A    2    2    0
   4 XXXXXXXX04 A    1    0    1
   5 XXXXXXXX05 B    1    1    1
   6 XXXXXXXX06 B    0    1    2
   6 XXXXXXXX06 B    2    2    0
   7 XXXXXXXX07 B    1    0    1
8 rows selected.
tkyte%ORCL> select system,
  2         sum(pos1) pos1_sum,
  3         sum(pos2) pos2_sum,
  4         sum(pos3) pos3_sum,
  5         count(case when pos1=1 then 1 end) "pos1=1",
  6         count(case when pos2=1 then 1 end) "pos2=1",
  7         count(case when pos3=1 then 1 end) "pos3=1",
  8         count(case when pos1=2 then 1 end) "pos1=2",
  9         count(case when pos2=2 then 1 end) "pos2=2",
 10         count(case when pos3=2 then 1 end) "pos3=2"
 11    from vin_check
 12   group by system;
S POS1_SUM POS2_SUM POS3_SUM pos1=1 pos2=1 pos3=1 pos1=2 pos2=2 pos3=2
- -------- -------- -------- ------ ------ ------ ------ ------ ------
A        4        4        4      2      2      2      1      1      1
B        4        4        4      2      2      2      1      1      1
that gets it all at once, better than three queries, but if you need three queries, use the columns from above - and a simple group by. 
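If three separate result sets really are needed, the conditional-count columns collapse into one simple GROUP BY per error value - a minimal sketch for the value-1 errors (the value-2 query is identical with 2 in place of 1):

```sql
-- Count, per system, how many rows carry error value 1 in each position.
select system,
       count(case when pos1 = 1 then 1 end) pos1,
       count(case when pos2 = 1 then 1 end) pos2,
       count(case when pos3 = 1 then 1 end) pos3
  from vin_check
 group by system;
```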
 
 
 
Again confusion.....
Vinayak Awasthi, August    15, 2006 - 9:47 am UTC
 
 
Hi Tom,
Just started reading the Analytics chapter in your book and got a bit puzzled. See the following queries:
<<QUERY1>>
select sum(sal) over(order by ename,deptno) from emp
<<QUERY1>>
<<QUERY2>>
select sum(sal) over(order by deptno,ename) from emp
<<QUERY2>>
<<QUERY3>>
select sum(sal) over(order by deptno,ename),sum(sal) over(order by ename,deptno) from emp
<<QUERY3>>
<<QUERY4>>
select sum(sal) over(order by ename,deptno), sum(sal) over(order by deptno,ename) from emp
<<QUERY4>>
 
The results from query1 and query2 are obvious and understandable. But query3 & query4 stumped me. I am not able to understand how this result set is arrived at. Moreover, query3 & query4 are exactly the same with just the two columns interchanged, yet the results differ considerably.
Tom, can you please explain this behaviour of analytics.
Thanks 
 
August    15, 2006 - 12:18 pm UTC 
 
if you order the results consistently, you'll find 3 and 4 are identical with the columns reversed:
scott%ORA9IR2> select sum(sal) over(order by deptno,ename),sum(sal) over(order by ename,deptno)
  2  from emp
  3  order by rowid
  4  /
SUM(SAL)OVER(ORDERBYDEPTNO,ENAME) SUM(SAL)OVER(ORDERBYENAME,DEPTNO)
--------------------------------- ---------------------------------
                            19625                             26275
                            21225                              2700
                            29025                             29025
                            15825                             14925
                            26275                             21175
                            24075                              5550
                             2450                              8000
                            18825                             25475
                             7450                             19925
                            27775                             27775
                             9850                              1100
                            25025                             11950
                            12850                             11000
                             8750                             22475
14 rows selected.
scott%ORA9IR2> select sum(sal) over(order by ename,deptno), sum(sal) over(order by
  2  deptno,ename) from emp
  3  order by rowid
  4  /
SUM(SAL)OVER(ORDERBYENAME,DEPTNO) SUM(SAL)OVER(ORDERBYDEPTNO,ENAME)
--------------------------------- ---------------------------------
                            26275                             19625
                             2700                             21225
                            29025                             29025
                            14925                             15825
                            21175                             26275
                             5550                             24075
                             8000                              2450
                            25475                             18825
                            19925                              7450
                            27775                             27775
                             1100                              9850
                            11950                             25025
                            11000                             12850
                            22475                              8750
14 rows selected.
 
 
 
 
Addendum to my question above
Vinayak Awasthi, August    15, 2006 - 10:17 am UTC
 
 
Tom,
in addition to the above question, I wanted to ask how ORDER BY is processed by Oracle in the case of analytics. If the query does not have an outer ORDER BY but the analytic columns contain different ORDER BY clauses, which ORDER BY does Oracle actually pick to generate the final result set? From running the different queries, I believe it is the ORDER BY of the last column in the select.
Please advice. 
 
August    15, 2006 - 12:19 pm UTC 
 
unless and until you have an order by on your SQL statement - please make no assumptions about the order of rows, you cannot.
until you have order by, there is no order. 
 
 
 
Things not clear !!!!
Vinayak Awasthi, August    16, 2006 - 5:32 am UTC
 
 
Tom,
I agree that if we order the result consistently, then the results of both queries are the same, just with the columns reversed. But my confusion is the case where we do not provide any ORDER BY clause. If you look at my query3 & query4, I have not given any ORDER BY clause to the SQL, so what causes Oracle to return the result sets in different orders? Should it not give the same results in both queries with the columns reversed?
Please guide. 
 
August    16, 2006 - 8:36 am UTC 
 
if you do not give an order by
YOU CANNOT HAVE ANY EXPECTATION ON THE ORDER OF ROWS.
period.
the result sets are exactly the same - with or without the order by.  They are just returned with the rows in different orders. 
 
 
 
Thanks Tom...
Vinayak Awasthi, August    16, 2006 - 10:17 am UTC
 
 
Thanks Tom,
After running the queries multiple times, I understood what you were trying to explain. 
Thanks once again for your time and guidance. 
 
 
getting wrong answer
Vinayak, August    21, 2006 - 9:19 am UTC
 
 
Hi Tom,
I have the following table with the set of data. This is just a short version of the actual table. The actual table has around 13 million records.
CREATE TABLE MY_TEST_TABLE(ID NUMBER,KEY NUMBER,TYPEOFRECORD VARCHAR2(20BYTE),
MYDATE DATE);
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (6366556, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (6366516, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (6366565, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (6366568, 404887, 'GP', TO_DATE('07/23/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (7076940, 404887, 'CE', TO_DATE('11/04/2004 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18197564, 404887, 'CE', TO_DATE('08/29/2005 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (17561339, 404887, 'CE', TO_DATE('05/05/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18381063, 404887, 'CE', TO_DATE('05/19/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18381260, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18386869, 404887, 'CE', TO_DATE('06/10/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18895620, 404887, 'CE', TO_DATE('06/10/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (17769950, 404887, 'CE', TO_DATE('05/06/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18096803, 404887, 'CE', TO_DATE('05/19/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18381262, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into MY_TEST_TABLE (ID, KEY, TYPEOFRECORD, MYDATE)
Values (18381270, 404887, 'CE', TO_DATE('06/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
My requirement is to get the highest ID value and its TYPEOFRECORD value for a given KEY value. So I wrote this query:
select KEY, max(ID) over (partition by KEY) myid, TYPEOFRECORD
from MY_TEST_TABLE;
but this query returns as many records as there are rows for a particular KEY value.
But if I write this query it gives the correct result:
select tt1.key,tt1.id,tt1.MYDATE,tt1.TYPEOFRECORD 
from MY_TEST_TABLE tt1 where (tt1.KEY,tt1.ID) in 
(select tt.KEY, max(tt.ID) myid
from MY_TEST_TABLE tt
group by tt.key)
What am I doing wrong in the analytic query? I wanted to get the result through analytics and not use the traditional GROUP BY clause.
Thanks
Vinayak  
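A note on why the analytic version keeps every row: analytic functions add a computed column, they never collapse rows, so the MAX() OVER has to be followed by a filter. A minimal sketch, assuming one row per KEY is wanted (column names follow the create table above):

```sql
-- Rank the rows within each KEY by ID descending, then keep only the top row.
select key, id, mydate, typeofrecord
  from (select t.*,
               row_number() over (partition by key order by id desc) rn
          from my_test_table t)
 where rn = 1;
```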
 
 
Help required on one scenario related to total promotion calculation 
Hardik Panchal, September 30, 2006 - 7:00 am UTC
 
 
Hi Tom,
I have to calculate the promotion in percentage terms (if % off is given, keep the value as it is; if amount off is given, convert it to a percentage) per customer and item id into one base table, and then fill a summary table based on the base table.
For example,
CUSTOMR:    01        
ITEM:       123456          
BASE-SDV:   100    
Abbreviations used:
PKG: Package Number
GRP: Group Number
EVE: Event Number
SEQ: Sequence Number
Base-SDV: synonym to Item Price. SDV = Store Door Value
Case 1 (Same package):
Promotion data defined:
 
PKG     GRP     EVE     %off    Amt.off     SEQ
01      01      01      0       10          01  
01      01      02      10      0           02   
 
Calculation required: 
For Same Package and Group-Customer-Product, First Event will be applied to its Base-SDV.(means 100 -10 =90). After that Second Event will be applied to Base-SDV which is deducted from first Event.[means 90-(10% of 90)=81]
   
Data to fill in Base-Table:
PKG     GRP     EVE     %off    Amt.off     Discount    SEQ
01      01      01      0       10          10%         01
01      01      02      10      0           9%          02   *
   
* Although 10% discount is defined, actually 9% applied   
  
Data to fill in Summary Table:
  
Customer    ITEM        TotalPromo
01          123456      19%
   
Case 2 (Different Package):
PKG     GRP     EVE     %off    Amt.off     SEQ
01      01      01      0       10          01 
02      01      02      10      0           02 
  
Calculation required: 
For different Package and Group-Customer-Product ,Event in 1st package will be applied to its Base-SDV.(means 100 -10 =90). After that, event of 2nd package will be applied to original item Base-SDV (without deducting any previous promotion discount) [means 90 - (10% of 100) = 80]
  
 
Data to fill in Base-Table:
   
PKG     GRP     EVE     %off     Amt.off    Discount    SEQ 
01      01       01      0       10         10%         01
02      01      01       10      0          10%         02 
   
Data to fill in Summary Table:
Customer     ITEM        TotalPromo
01           123456       20%
Quick reply is appreciated !
Thanks in Advance :-)
Hardik Panchal
 
 
September 30, 2006 - 8:11 am UTC 
 
no create
no inserts
no look
don't know if it can be done, did not really look 
 
 
 
Not getting your comments
Hardik Panchal, September 30, 2006 - 9:12 am UTC
 
 
Do you mean to say it doesn't look possible, or do you require more information like table structures and insert statements?
Thanks  
 
October   01, 2006 - 12:28 am UTC 
 
correct, without a table create and insert intos - I'm not going to be able to test a sql solution and therefore could not look at it. 
 
 
 
analytical Functions
Maulesh Jani, September 30, 2006 - 9:30 am UTC
 
 
Hi , 
   Sorry for the last mail.
LOGIC : 
       This table has base data for customers' product promotion information. Now on the basis of this table I have to create a replicated table showing the information as a cumulative percentage of promotion.
Like 
cust item event price  percent-promotion
01    01   01    100     10
01    01   02    100      5
requested output table will contain one more column
cust item event price  percent-promotion cumu-promo
01    01   01    100     10                 10
01    01   02    100      5                 15
I just want to know whether analytic functions can be used for such a scenario. If yes, can you please give me an example?
Thanks and Pl. guide 
Regards
Maulesh Jani  
 
October   01, 2006 - 12:30 am UTC 
 
sum(percent_promotion) over (partition by YOUR_GROUP order by event)
I think your group might be cust, item - but not sure, as the example is "not very example-ish" 
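Applied to the posted example, the running total might look like this (the table name promo_base and the column names are assumptions taken from the text of the question):

```sql
-- Running total of the promotion percentage per cust/item, in event order.
select cust, item, event, price, percent_promotion,
       sum(percent_promotion) over (partition by cust, item
                                    order by event) cumu_promo
  from promo_base;
```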
 
 
 
Details Information Provided
Hardik Panchal, September 30, 2006 - 10:22 am UTC
 
 
I have following table:
CREATE TABLE TEMP_PROMO_CALC_2
(
  CTRY_CODE        CHAR(3 BYTE)                 NOT NULL,
  CO_CODE          CHAR(3 BYTE)                 NOT NULL,
  CUST_NBR         NUMBER                       NOT NULL,
  ITEM_ID          CHAR(20 BYTE),
  EVENT_NBR        NUMBER                       NOT NULL,
  PROMO_SEQ_NBR    NUMBER(2),
  EVENT_ALLOW_PCT  NUMBER(5,4),
  EVENT_ALLOW_AMT  NUMBER,
  BASE_PRICE       NUMBER,
  NEW_PRICE        NUMBER,
  PERCENT_DIC      NUMBER
)
Having following data in this table:
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE ) VALUES ( 
'138', '002', 1, '100', 1, 1, 4, 0, 100, NULL); 
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE ) VALUES ( 
'138', '002', 1, '100', 5, 2, 0, 20, 100, NULL); 
COMMIT;
Here, EVENT_ALLOW_PCT is the % discount, while EVENT_ALLOW_AMT is the discount amount in $. 
I want following things:
     - First convert all the promotions in % form
     - Apply the promotion in the seq. mentioned by PROMO_SEQ_NBR
In above example, 
     
     - if 4% (Event Seq 1, so this will be applied first) given on a 100 $ price, NEW_PRICE field value after applying this promotion will be 96 $. The PERCENT_DIC field value will be 4% for PROMO_SEQ_NBR -> 1. 
     
     - Then a 20$ amount-off promotion is given. This 20$ amount off needs to be converted into % off, i.e. 20$ will be deducted from the 96 $ (the amount calculated after applying promotions in order of PROMO_SEQ_NBR: 1st seq. is 4% off, then 2nd seq. is 20$ amount off). In this example, the PERCENT_DIC for this 2nd seq. will be 20.83 % [calculated as (20*100)/96] and the NEW_PRICE field value will be 76 $
    
Desired Result:
CUST_NBR   ITEM_ID    EVENT_NBR     PROMO_SEQ_NBR   EVENT_ALLOW_PCT   
---------  --------  ---------   -------------   ----------------  
1           100           1                 1                  4    
1           100           5                 2                  0    
                                                                   
                                                                   
EVENT_ALLOW_AMT    BASE_PRICE  NEW_PRICE   PERCENT_DIC             
----------------   ----------  ---------   -----------             
               0          100         96             4             
              20          100         76         20.83             
I believe this can be achieved with the analytic functions !
Regards,
Hardik Panchal
 
 
October   01, 2006 - 12:41 am UTC 
 
ok, so, what happens with other rows, what are the groups here, is event_nbr there just to 'confuse' (is it used for anything in the example or just an extra bit)
start with this I guess
ops$tkyte%ORA10GR2> select cust_nbr, item_id, promo_seq_nbr,
  2         event_allow_pct, event_allow_amt,
  3             sum(pr * (100-event_allow_pct)/100 - event_allow_amt) over
  4                 (partition by cust_nbr, item_id order by promo_seq_nbr) new_pr
  5    from (
  6  select cust_nbr, item_id, promo_seq_nbr,
  7         event_allow_pct, event_allow_amt,
  8         case when row_number() over (partition by cust_nbr, item_id
  9             order by promo_seq_nbr) = 1 then base_price else 0 end pr
 10    from temp_promo_calc_2
 11         )
 12  /
  CUST_NBR ITE PROMO_SEQ_NBR EVENT_ALLOW_PCT EVENT_ALLOW_AMT     NEW_PR
---------- --- ------------- --------------- --------------- ----------
         1 100             1               4               0         96
         1 100             2               0              20         76
 
 
 
 
 
Analytical Question
Hardik Panchal, October   01, 2006 - 2:31 am UTC
 
 
Hi TOM,
       Thanks for your help. Yes, you are right that event_nbr is not useful in the calculation. Your query helped me, but when I tried it on other data it returned wrong info. I am trying to update your query for the desired output, but still have not got it.
Here is insert stmt for the base table data :
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES ( 
'138', '002', 1, '100                 ', 1, 1, 4, 0, 100, NULL, NULL); 
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES ( 
'138', '002', 1, '100                 ', 5, 2, 0, 20, 100, NULL, NULL); 
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES ( 
'138', '002', 1, '100                 ', 3, 3, 5, 0, 100, NULL, NULL); 
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES ( 
'138', '002', 2, '200                 ', 4, 1, 0, 10, 100, NULL, NULL); 
INSERT INTO TEMP_PROMO_CALC_2 ( CTRY_CODE, CO_CODE, CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR,
EVENT_ALLOW_PCT, EVENT_ALLOW_AMT, BASE_PRICE, NEW_PRICE, PERCENT_DIC ) VALUES ( 
'138', '002', 2, '200                 ', 5, 2, 5, 0, 100, NULL, NULL); 
COMMIT;
after it when I apply ur query then it doesn't show me the correct output for New_pr (meaning in this case the customer has 3 promotions on the same product, and the promotions are in a different order of amount-off and percent-off).
This query works well only in the case when %off is first and amount off second.
 
 
October   01, 2006 - 3:06 am UTC 
 
"ur"????? what is that.
you best supply a really good detailed specification.  eg: explain precisely, in such detail that someone could write code from your specification - even if they never saw your data before and didn't really know your data
eg: that is what we have here.  so, specify it out in really good detail like you were going to hire a consultant to write code from your specification. 
 
 
 
Over?
Anderson Haertel Rodrigues, October   03, 2006 - 6:12 pm UTC
 
 
Hello All,
Oracle 9.2.0.5.0
My doubt is as follows (please):
CREATE GLOBAL TEMPORARY TABLE ETMPTESTE
(
 CdPerfilAgrupamento integer,
 CdFuncAgrupamento integer,
 CdOperacao integer,
 UO char(1),
 NatVinc char(1),
 RelTrab char(1),
 Regiao char(1),
 RegTrab char(1),
 SitPrev char(1),
 Carreira char(1),
 Cargo char(1),
 Funcao char(1)
)
ON COMMIT PRESERVE ROWS;
insert into ETMPTESTE
 (
  CdPerfilAgrupamento,
  CdFuncAgrupamento,
  CdOperacao,
  UO,
  NatVinc,
  RelTrab,
  Regiao,
  RegTrab,
  SitPrev,
  Carreira,
  Cargo,
  Funcao
 )
values
 (
  1,85,1,'N','E','T','E','N','N','N','N','N'
 );
insert into ETMPTESTE
 (
  CdPerfilAgrupamento,
  CdFuncAgrupamento,
  CdOperacao,
  UO,
  NatVinc,
  RelTrab,
  Regiao,
  RegTrab,
  SitPrev,
  Carreira,
  Cargo,
  Funcao
 )
values
 (
  2,85,1,'E','N','E','N','E','N','N','E','N'
 );
insert into ETMPTESTE
 (
  CdPerfilAgrupamento,
  CdFuncAgrupamento,
  CdOperacao,
  UO,
  NatVinc,
  RelTrab,
  Regiao,
  RegTrab,
  SitPrev,
  Carreira,
  Cargo,
  Funcao
 )
values
 (
  3,85,1,'T','T','T','T','T','T','T','T','T'
 );
insert into ETMPTESTE
 (
  CdPerfilAgrupamento,
  CdFuncAgrupamento,
  CdOperacao,
  UO,
  NatVinc,
  RelTrab,
  Regiao,
  RegTrab,
  SitPrev,
  Carreira,
  Cargo,
  Funcao
 )
values
 (
  4,85,1,'E','E','E','E','E','E','E','E','E'
 );
My Query:
select FA,
       Op,
       UOAbrangencia,
       NatVincAbrangencia,
       RelTrabAbrangencia,
       RegiaoAbrangencia,
       RegTrabAbrangencia,
       SitPrevAbrangencia, 
       CEFAbrangencia,
       CCOAbrangencia, 
       FUCAbrangencia
from 
  (
   select FA,
          Op,
          CASE WHEN UO = 'T' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'T' AND UOLead = 'E' THEN 'T'
               WHEN UO = 'T' AND UOLead = 'N' THEN 'T'
               WHEN UO = 'E' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'E' AND UOLead = 'E' THEN 'E'
               WHEN UO = 'E' AND UOLead = 'N' THEN 'E'
               WHEN UO = 'N' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'N' AND UOLead = 'E' THEN 'E'
               WHEN UO = 'N' AND UOLead = 'N' THEN 'N'
          END UOAbrangencia,
          CASE WHEN NatVinc = 'T' AND NatVincLead = 'T' THEN 'T'
               WHEN NatVinc = 'T' AND NatVincLead = 'E' THEN 'T'
               WHEN NatVinc = 'T' AND NatVincLead = 'N' THEN 'T'
               WHEN NatVinc = 'E' AND NatVincLead = 'T' THEN 'T'
               WHEN NatVinc = 'E' AND NatVincLead = 'E' THEN 'E'
               WHEN NatVinc = 'E' AND NatVincLead = 'N' THEN 'E'
               WHEN NatVinc = 'N' AND NatVincLead = 'T' THEN 'T'
               WHEN NatVinc = 'N' AND NatVincLead = 'E' THEN 'E'
               WHEN NatVinc = 'N' AND NatVincLead = 'N' THEN 'N'
          END NatVincAbrangencia,
          CASE WHEN RelTrab = 'T' AND RelTrabLead = 'T' THEN 'T'
               WHEN RelTrab = 'T' AND RelTrabLead = 'E' THEN 'T'
               WHEN RelTrab = 'T' AND RelTrabLead = 'N' THEN 'T'
               WHEN RelTrab = 'E' AND RelTrabLead = 'T' THEN 'T'
               WHEN RelTrab = 'E' AND RelTrabLead = 'E' THEN 'E'
               WHEN RelTrab = 'E' AND RelTrabLead = 'N' THEN 'E'
               WHEN RelTrab = 'N' AND RelTrabLead = 'T' THEN 'T'
               WHEN RelTrab = 'N' AND RelTrabLead = 'E' THEN 'E'
               WHEN RelTrab = 'N' AND RelTrabLead = 'N' THEN 'N'
          END RelTrabAbrangencia,
          CASE WHEN Regiao = 'T' AND RegiaoLead = 'T' THEN 'T'
               WHEN Regiao = 'T' AND RegiaoLead = 'E' THEN 'T'
               WHEN Regiao = 'T' AND RegiaoLead = 'N' THEN 'T'
               WHEN Regiao = 'E' AND RegiaoLead = 'T' THEN 'T'
               WHEN Regiao = 'E' AND RegiaoLead = 'E' THEN 'E'
               WHEN Regiao = 'E' AND RegiaoLead = 'N' THEN 'E'
               WHEN Regiao = 'N' AND RegiaoLead = 'T' THEN 'T'
               WHEN Regiao = 'N' AND RegiaoLead = 'E' THEN 'E'
               WHEN Regiao = 'N' AND RegiaoLead = 'N' THEN 'N'
          END RegiaoAbrangencia,
          CASE WHEN RegTrab = 'T' AND RegTrabLead = 'T' THEN 'T'
               WHEN RegTrab = 'T' AND RegTrabLead = 'E' THEN 'T'
               WHEN RegTrab = 'T' AND RegTrabLead = 'N' THEN 'T'
               WHEN RegTrab = 'E' AND RegTrabLead = 'T' THEN 'T'
               WHEN RegTrab = 'E' AND RegTrabLead = 'E' THEN 'E'
               WHEN RegTrab = 'E' AND RegTrabLead = 'N' THEN 'E'
               WHEN RegTrab = 'N' AND RegTrabLead = 'T' THEN 'T'
               WHEN RegTrab = 'N' AND RegTrabLead = 'E' THEN 'E'
               WHEN RegTrab = 'N' AND RegTrabLead = 'N' THEN 'N'
          END RegTrabAbrangencia,
          CASE WHEN SitPrev = 'T' AND SitPrevLead = 'T' THEN 'T'
               WHEN SitPrev = 'T' AND SitPrevLead = 'E' THEN 'T'
               WHEN SitPrev = 'T' AND SitPrevLead = 'N' THEN 'T'
               WHEN SitPrev = 'E' AND SitPrevLead = 'T' THEN 'T'
               WHEN SitPrev = 'E' AND SitPrevLead = 'E' THEN 'E'
               WHEN SitPrev = 'E' AND SitPrevLead = 'N' THEN 'E'
               WHEN SitPrev = 'N' AND SitPrevLead = 'T' THEN 'T'
               WHEN SitPrev = 'N' AND SitPrevLead = 'E' THEN 'E'
               WHEN SitPrev = 'N' AND SitPrevLead = 'N' THEN 'N'
          END SitPrevAbrangencia,
          CASE WHEN Carreira = 'T' AND CarreiraLead = 'T' THEN 'T'
               WHEN Carreira = 'T' AND CarreiraLead = 'E' THEN 'T'
               WHEN Carreira = 'T' AND CarreiraLead = 'N' THEN 'T'
               WHEN Carreira = 'E' AND CarreiraLead = 'T' THEN 'T'
               WHEN Carreira = 'E' AND CarreiraLead = 'E' THEN 'E'
               WHEN Carreira = 'E' AND CarreiraLead = 'N' THEN 'E'
               WHEN Carreira = 'N' AND CarreiraLead = 'T' THEN 'T'
               WHEN Carreira = 'N' AND CarreiraLead = 'E' THEN 'E'
               WHEN Carreira = 'N' AND CarreiraLead = 'N' THEN 'N'
          END CEFAbrangencia,
          CASE WHEN Cargo = 'T' AND CargoLead = 'T' THEN 'T'
               WHEN Cargo = 'T' AND CargoLead = 'E' THEN 'T'
               WHEN Cargo = 'T' AND CargoLead = 'N' THEN 'T'
               WHEN Cargo = 'E' AND CargoLead = 'T' THEN 'T'
               WHEN Cargo = 'E' AND CargoLead = 'E' THEN 'E'
               WHEN Cargo = 'E' AND CargoLead = 'N' THEN 'E'
               WHEN Cargo = 'N' AND CargoLead = 'T' THEN 'T'
               WHEN Cargo = 'N' AND CargoLead = 'E' THEN 'E'
               WHEN Cargo = 'N' AND CargoLead = 'N' THEN 'N'
          END CCOAbrangencia,
          CASE WHEN Funcao = 'T' AND FuncaoLead = 'T' THEN 'T'
               WHEN Funcao = 'T' AND FuncaoLead = 'E' THEN 'T'
               WHEN Funcao = 'T' AND FuncaoLead = 'N' THEN 'T'
               WHEN Funcao = 'E' AND FuncaoLead = 'T' THEN 'T'
               WHEN Funcao = 'E' AND FuncaoLead = 'E' THEN 'E'
               WHEN Funcao = 'E' AND FuncaoLead = 'N' THEN 'E'
               WHEN Funcao = 'N' AND FuncaoLead = 'T' THEN 'T'
               WHEN Funcao = 'N' AND FuncaoLead = 'E' THEN 'E'
               WHEN Funcao = 'N' AND FuncaoLead = 'N' THEN 'N'
          END FUCAbrangencia
    from
      ( select CdFuncAgrupamento FA,
               CdOperacao Op,
               UO,
               LEAD(UO) over (order by CdFuncAgrupamento, cdOperacao ) UOLead,
               NatVinc,
               LEAD(NatVinc) over (order by CdFuncAgrupamento, cdOperacao ) NatVincLead,
               RelTrab,
               LEAD(RelTrab) over (order by CdFuncAgrupamento, cdOperacao ) RelTrabLead,
               Regiao,
               LEAD(Regiao) over (order by CdFuncAgrupamento, cdOperacao ) RegiaoLead,
               RegTrab,
               LEAD(RegTrab) over (order by CdFuncAgrupamento, cdOperacao ) RegTrabLead,
               SitPrev,
               LEAD(SitPrev) over (order by CdFuncAgrupamento, cdOperacao ) SitPrevLead,
               Carreira,
               LEAD(Carreira) over (order by CdFuncAgrupamento, cdOperacao ) CarreiraLead,
               Cargo,
               LEAD(Cargo) over (order by CdFuncAgrupamento, cdOperacao ) CargoLead,
               Funcao,
               LEAD(Funcao) over (order by CdFuncAgrupamento, cdOperacao ) FuncaoLead
        from ETMPTESTE 
      )
   )
where UOAbrangencia is not null 
group by FA,
         Op,
         UOAbrangencia,
         NatVincAbrangencia,
         RelTrabAbrangencia,
         RegiaoAbrangencia,
         RegTrabAbrangencia,
         SitPrevAbrangencia, 
         CEFAbrangencia,
         CCOAbrangencia, 
         FUCAbrangencia
/
This results in:
 FA OP U N R R R S C C F
--- -- - - - - - - - - -
 85  1 E E T E E N N E N
 85  1 T T T T T T T T T
But the desired result is:
 FA OP U N R R R S C C F
--- -- - - - - - - - - -
 85  1 T T T T T T T T T
Thanks All
 
 
October   03, 2006 - 7:28 pm UTC 
 
"sorry"????
I have no idea why you expected what you expected.
or - WHAT YOU EXPECTED at all.
I see lots of code that quite simply does not work and no explanation of why you expected what you think should have come out.
and frankly, I don't know what to do about it anymore.  People think other people should be mentally linked or something.  That we should be able to look at "code that doesn't work" and "this is what I expected" from abysmally small test cases that hardly exercise the boundary conditions or anything...........
it does get discouraging.
please, just pretend you are trying to explain this to your mom, use simple terms, present it simply, yet COMPLETELY.  I have *no clue* why you expected that row and not the two that are correctly returned... 
 
 
 
Over?
Anderson Haertel Rodrigues, October   04, 2006 - 8:45 am UTC
 
 
Tom, Sorry!
My written English is very bad! ;-)
But, 
CREATE GLOBAL TEMPORARY TABLE ETMPTESTE
(
 CdPerfilAgrupamento integer,
 CdFuncAgrupamento integer,
 CdOperacao integer,
 UO char(1),
 NatVinc char(1),
 RelTrab char(1),
 Regiao char(1),
 RegTrab char(1),
 SitPrev char(1),
 Carreira char(1),
 Cargo char(1),
 Funcao char(1)
)
ON COMMIT PRESERVE ROWS;
insert into ETMPTESTE
 (CdPerfilAgrupamento, CdFuncAgrupamento,CdOperacao,
  UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
  Carreira, Cargo, Funcao)
values
 (1,85,1,'N','E','T','E','N','N','N','N','N');
 
insert into ETMPTESTE
 (CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
  UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
  Carreira, Cargo, Funcao)
 values
 (2,85,1,'E','N','E','N','E','N','N','E','N');
 
insert into ETMPTESTE
 (CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
  UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
  Carreira, Cargo, Funcao )
values
 (3,85,1,'T','T','T','T','T','T','T','T','T');
 
insert into ETMPTESTE
 (CdPerfilAgrupamento, CdFuncAgrupamento, CdOperacao,
  UO, NatVinc, RelTrab, Regiao, RegTrab, SitPrev,
  Carreira, Cargo, Funcao )
values
 (4,85,1,'E','E','E','E','E','E','E','E','E');
sigrharq@SIGRHDES>select CDPERFILAGRUPAMENTO, CDFUNCAGRUPAMENTO, CDOPERACAO, UO from etmpteste;
CDPERFILAGRUPAMENTO  CDFUNCAGRUPAMENTO         CDOPERACAO U
------------------- ------------------ ------------------ -
                  1                 85                  1 N
                  2                 85                  1 E
                  3                 85                  1 T
                  4                 85                  1 E
But my desired result is:
CDFUNCAGRUPAMENTO         CDOPERACAO UO
85                        1          T
without CDPERFILAGRUPAMENTO, where: 
IF UO = 'T' AND Lead = 'T' 
   Column := 'T'
elsif UO = 'T' AND Lead = 'E' 
   Column := 'T'
elsif UO = 'T' AND Lead = 'N'
   Column := 'T'
elsif UO = 'E' AND Lead = 'T' 
   Column := 'T'
elsif UO = 'E' AND Lead = 'E' 
   Column := 'E'
elsif UO = 'E' AND Lead = 'N' 
   Column := 'E'
elsif UO = 'N' AND Lead = 'T'
   Column := 'T'
elsif UO = 'N' AND Lead = 'E' 
   Column := 'E'
elsif UO = 'N' AND Lead = 'N'
   Column := 'N'
my query:
select FA,
       Op,
       UOAbrangencia
from 
  (
   select FA,
          Op,
          CASE WHEN UO = 'T' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'T' AND UOLead = 'E' THEN 'T'
               WHEN UO = 'T' AND UOLead = 'N' THEN 'T'
               WHEN UO = 'E' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'E' AND UOLead = 'E' THEN 'E'
               WHEN UO = 'E' AND UOLead = 'N' THEN 'E'
               WHEN UO = 'N' AND UOLead = 'T' THEN 'T'
               WHEN UO = 'N' AND UOLead = 'E' THEN 'E'
               WHEN UO = 'N' AND UOLead = 'N' THEN 'N'
          END UOAbrangencia
    from
      ( select CdFuncAgrupamento FA,
               CdOperacao Op,
               UO,
               LEAD(UO) over (order by CdFuncAgrupamento, cdOperacao ) UOLead
        from ETMPTESTE 
        order by CdOperacao, CdFuncAgrupamento
      )
   )
where UOAbrangencia is not null 
group by FA,
         Op,
         UOAbrangencia
/
My result: 
 FA OP U
--- -- -
 85  1 E
 85  1 T
Thanks!!!
 
 
October   04, 2006 - 5:06 pm UTC 
 
no, see I need to know the "logic here", reading a failed implementation doesn't help.  
no more select's, just tell us WHY there is just one row, what this row represents, how you arrived at it, what is the LOGIC 
 
 
 
Over?
Anderson Haertel Rodrigues, October   06, 2006 - 8:51 am UTC
 
 
Hello Tom!
Thanks for your answers and patience.
My query now is OK!
Thanks all! 
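For readers following along: every one of the nine-branch CASE ladders in the query above encodes the same precedence rule, 'T' beats 'E' beats 'N' within a group. If that reading is correct, the whole collapse needs no LEAD at all, just GROUP BY plus MAX over a mapped rank. A sketch of that idea, run here against SQLite from Python purely as a stand-in for Oracle (table and column names taken from the post):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table etmpteste (cdperfilagrupamento int,"
             " cdfuncagrupamento int, cdoperacao int, uo char(1))")
# the four rows from the post: UO values N, E, T, E for (85, 1)
conn.executemany("insert into etmpteste values (?,?,?,?)",
                 [(1, 85, 1, 'N'), (2, 85, 1, 'E'),
                  (3, 85, 1, 'T'), (4, 85, 1, 'E')])

# Map each code to a rank, take the group maximum, map back:
# any 'T' in the group wins, else any 'E', else 'N'.
sql = """
select cdfuncagrupamento fa, cdoperacao op,
       case max(case uo when 'T' then 3 when 'E' then 2 else 1 end)
            when 3 then 'T' when 2 then 'E' else 'N' end uoabrangencia
from etmpteste
group by cdfuncagrupamento, cdoperacao
"""
result = conn.execute(sql).fetchall()
print(result)  # -> [(85, 1, 'T')]
```

The same CASE-around-MAX pattern repeats for each of the other nine columns in the original query.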
 
 
Top-n query
Anu, October   10, 2006 - 11:10 am UTC
 
 
Hi Tom,
I have to find the most recent 1000 unique values from a set of data. To make it easy here, I've created a small table with sample data. 
create table ANALYTICS_QUESTION
(
  PK_COL NUMBER(2) not null,
  COL1   VARCHAR2(10),
  COL2   VARCHAR2(10),
  A_DATE DATE
);
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (1, 'alpha', 'alpha-val', to_date('09-10-2006 17:49:30', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (2, 'alpha', 'alpha-val', to_date('09-10-2006 17:49:33', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (3, 'alpha', 'alpha-val', to_date('29-09-2006 17:49:36', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (4, 'beta', 'beta-val', to_date('04-10-2006 17:49:52', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (5, 'gamma', 'gamma-val', to_date('07-10-2006 17:50:06', 'dd-mm-yyyy hh24:mi:ss'));
insert into ANALYTICS_QUESTION (PK_COL, COL1, COL2, A_DATE)
values (6, 'gamma', 'gamma-val', to_date('07-10-2006 17:50:09', 'dd-mm-yyyy hh24:mi:ss'));
The table now has:
SQL> select * from analytics_question;
PK_COL COL1       COL2       A_DATE
------ ---------- ---------- -----------
     1 alpha      alpha-val  10/9/2006 5
     2 alpha      alpha-val  10/9/2006 5
     3 alpha      alpha-val  9/29/2006 5
     4 beta       beta-val   10/4/2006 5
     5 gamma      gamma-val  10/7/2006 5
     6 gamma      gamma-val  10/7/2006 5
The output I need is (consider top 2 rows for this example):
COL1  COL2     
----- ---------
alpha alpha-val
gamma gamma-val
If I had to do this in a spreadsheet, I would remove duplicates in column col1 while retaining the rows that have the most recent date in column a_date. Next, I would sort this by date and pick the top 2 rows. (For a given value in co11, col2 always has the same value.)
Since it involves top-n, I considered using analytic functions. To do it the traditional way, I could use:
SQL> select *
  2  from   (select *
  3          from   (select col1,
  4                         col2,
  5                         max(a_date) as a_date
  6                  from   analytics_question
  7                  group  by col1,
  8                            col2) tab
  9          order  by a_date desc)
 10  where  rownum <= 2;
COL1       COL2       A_DATE
---------- ---------- -----------
alpha      alpha-val  10/9/2006 5
gamma      gamma-val  10/7/2006 5
In reality, "analytics_question" would be an inline view over a table that has over 600,000 rows, and this does not perform well.
I've attempted to use row_number() and rank() functions as below, but I wonder if there is any way to avoid the group by clause.
SQL> select *
  2  from   (select col1,
  3                 col2,
  4                 max(a_date),
  5                 row_number() over(order by max(a_date) desc) as rn
  6          from   analytics_question
  7          group  by col1,
  8                    col2)
  9  where  rn <= 2;
COL1       COL2       MAX(A_DATE)         RN
---------- ---------- ----------- ----------
alpha      alpha-val  10/9/2006 5          1
gamma      gamma-val  10/7/2006 5          2
Is there a way to simplify this further? Otherwise, it's back to tuning.
Thanks and regards,
Anu 
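One note: since the requirement is "latest a_date per distinct col1", some form of GROUP BY (or DISTINCT) is hard to avoid; the ROW_NUMBER version at least ranks the already-deduplicated set in a single pass. A sketch of that query, run against SQLite (3.25+ for window functions) from Python as a stand-in for Oracle, with ISO date strings in place of DATEs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table analytics_question"
             " (pk_col int, col1 text, col2 text, a_date text)")
conn.executemany("insert into analytics_question values (?,?,?,?)", [
    (1, 'alpha', 'alpha-val', '2006-10-09 17:49:30'),
    (2, 'alpha', 'alpha-val', '2006-10-09 17:49:33'),
    (3, 'alpha', 'alpha-val', '2006-09-29 17:49:36'),
    (4, 'beta',  'beta-val',  '2006-10-04 17:49:52'),
    (5, 'gamma', 'gamma-val', '2006-10-07 17:50:06'),
    (6, 'gamma', 'gamma-val', '2006-10-07 17:50:09'),
])

# One pass to dedupe (GROUP BY), one window to rank by recency.
sql = """
select col1, col2
from (select col1, col2,
             row_number() over (order by a_date desc) rn
      from (select col1, col2, max(a_date) a_date
            from analytics_question
            group by col1, col2))
where rn <= 2
"""
result = conn.execute(sql).fetchall()
print(result)  # -> [('alpha', 'alpha-val'), ('gamma', 'gamma-val')]
```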
 
 
October   10, 2006 - 7:51 pm UTC 
 
600,000 rows is pretty small - can you define "does not perform well"?
 
 
 
 
Your Help --
Reader, October   10, 2006 - 5:44 pm UTC
 
 
create table testa (a1 date, h_flg varchar2(1), b_flg varchar2(1), amt number(6,2));

insert into testa values( sysdate,'B',null,25.00);
insert into testa values( sysdate,null,'H',55.00);
insert into testa values( sysdate,null,'H',35.00);
insert into testa values( sysdate,'B',null,45.00);
commit;

select distinct sum(amt) over(partition by nvl(h_flg,0)) h_amt,
                sum(amt) over(partition by nvl(b_flg,0)) b_amt,
                sum(amt) over() tot_amt   from testa

     H_AMT      B_AMT    TOT_AMT
---------- ---------- ----------
        70         70        160
        90         90        160

2 rows selected.

How can I get the output in one line: 160 70 90?

Thanks
 
October   10, 2006 - 8:24 pm UTC 
 
umm, you have a table with just four rows in it?  your answer is very dependent on there being precisely four rows - is this the "real problem" or so overly simplified as to not be relevant? 
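If the real problem really is just these three sums, conditional aggregation (a CASE inside each SUM) produces the single row directly, with no DISTINCT over window functions. A sketch against SQLite from Python as an Oracle stand-in, using the posted rows (note the column names are swapped relative to their contents: h_flg holds 'B' and b_flg holds 'H'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table testa (a1 text, h_flg text, b_flg text, amt real)")
conn.executemany("insert into testa values (?,?,?,?)",
                 [('d', 'B', None, 25.00), ('d', None, 'H', 55.00),
                  ('d', None, 'H', 35.00), ('d', 'B', None, 45.00)])

# Classic pivot: each SUM only counts the rows its CASE lets through,
# so the grand total and both flag subtotals land on one row.
sql = """
select sum(amt)                                tot_amt,
       sum(case when h_flg = 'B' then amt end) b_amt,
       sum(case when b_flg = 'H' then amt end) h_amt
from testa
"""
result = conn.execute(sql).fetchone()
print(result)  # -> (160.0, 70.0, 90.0)
```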
 
 
 
Help Required on Promotion Calculation
Hardik Panchal, October   12, 2006 - 9:40 am UTC
 
 
Hi Tom,
I have following table:
CREATE TABLE PROMOTION_CALC
(
  CUST_NBR         NUMBER                       NOT NULL,
  ITEM_ID          CHAR(20 BYTE),
  EVENT_NBR        NUMBER                       NOT NULL,
  PROMO_SEQ_NBR    NUMBER(2),
  PCT  NUMBER(5,4),
  AMT  NUMBER,
  BASE_PRICE       NUMBER
);
The table data is as below:
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
1, '300                 ', 1, 1, 4, 0, 100); 
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
1, '300                 ', 6, 1, 0, 5, 100); 
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
1, '300                 ', 5, 2, 0, 20, 100); 
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
1, '300                 ', 2, 3, 5, 0, 100); 
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
1, '300                 ', 7, 3, 0, 10, 100); 
INSERT INTO PROMOTION_CALC ( CUST_NBR, ITEM_ID, EVENT_NBR, PROMO_SEQ_NBR, PCT,
AMT, BASE_PRICE ) VALUES ( 
2, '400                 ', 9, 1, 0, 15, 200); 
COMMIT;
following is the tabular form of data:
CUST_NBR  ITEM_ID  EVENT_NBR  PROMO_SEQ_NBR  PCT  AMT  BASE_PRICE  
--------  -------  ---------  -------------  ---  ---  ----------
1         300            1                1    4    0         100    
1         300            6                1    0    5         100    
1         300            5                2    0   20         100    
1         300            2                3    5    0         100    
1         300            7                3    0   10         100  
2         400            9                1    0   15         200
Now, I want to calculate NEW_PRICE and PERCENT_DIC as mentioned below:
CUST  ITEM  EVENT  PROMO_    PCT  AMT  BASE_PRICE  NEW_PRICE  PERCENT_DIC
_NBR  _ID   _NBR   SEQ_NBR
----  ----  ------ --------  ---  ---  ----------  ---------  -----------
1     300     1    1           4   0          100        91            4
1     300     6    1           0   5          100        91            5
1     300     5    2           0  20          100        71        21.97
1     300     2    3           5   0          100   57.4532            5
1     300     7    3           0  10          100   57.4532        14.08
2     400     9    1           0  15          200       185          7.5
The logic of calculation is as below:
- First of all, each promotion (whether percent off or amount off) will be converted into a percent off
- The summation of percent off (group by  cust_nbr, item_nbr, promo_seq_nbr) will be applied 
- The percent off calculated will be applied on the base_price for the least promo_seq_nbr per cust_nbr, item_nbr combination. For the next set of promo_seq_nbr per cust_nbr, item_nbr combination, the percent off calculated will be applied on the new_price (amount calculated with previous promo_seq_nbr), and not on the base_price.
In the above example,
 
- Row 1 and Row 2 will be applied to base_price: 100$ individually (as both having same promo_seq_nbr: 1)
  i.e. 
    Row 1: Directly 4 percent off given
    Row 2: Amount Off 5$ (on base_price: 100$) will be converted to percent off, which will result into 5%
- The summation of percent off: 9% (4% from row 1 + 5% from row 2) will be applied per cust_nbr, item_nbr,  promo_seq_nbr (cust_nbr: 1, item_id: 300, promo_seq_nbr:1 in this case), which will result into new_price 91 $. Note here that the new_price column will be calculated as 91 in both the rows (row 1 and row 2)  as a result of combined promotion of 9% applied on base_price.
- Now, for each subsequent promo_seq_nbr per cust_nbr, item_nbr combination, the converted percent off will be applied on new_price and not on the item base_price; i.e. in row 3, the 20$ amount-off promotion will first be converted into a percent off (applied on 91$ and not on 100$), resulting in 21.97%.
- In a same way, 3rd promo_seq_nbr (again having two rows for cust_nbr, item_nbr combination with same promo_seq_nbr) will be applied (as a summation of row 4 and row 5) on a new_price (which is 71$ after applying 2nd promo_seq_nbr).   
The event_nbr field is not used in the promotion calculation; it is kept only because the business requires it for display.
Thanks a lot in advance !
Regards,
Hardik Panchal
 
 
 
bulk analytic update
raajesh, October   12, 2006 - 3:32 pm UTC
 
 
Hi Tom
I would like to know if I can acheive my problem listed below using anayltic functions.
I have a table like this
Dept ID Dept Name Unique ID
1          A1 
2          A1
3          A1
4          A2
5          A2
6          A2
7          A3
8          A4
9          A4
10         A4
Now I want my unique ID field to be generated like this
Dept ID Dept Name Unique ID
1          A1       123 
2          A1       123
3          A1       123
4          A2       456
5          A2       456
6          A2       456
7          A3       124 
8          A4       125
9          A4       125 
10         A4       125
Also, the unique ID needs to be generated via an Oracle Sequence. The numbers given are just examples. i.e. unique ID takes values based on the Dept Name partition.
How to achieve this by an Update statement on the table? Please suggest a solution.
Raajesh 
 
 
October   13, 2006 - 6:55 am UTC 
 
no create, no inserts, no lookie :)
not promising it can be done, just stating "I do not look at the problem if you expect me to create a table, insert your data into it, create sequences, develop a working solution"
I only do the last bit, if and when possible 
 
 
 
to raajesh  
Michel Cadot, October   13, 2006 - 8:17 am UTC
 
 
I don't see the need of an Oracle sequence.
You can try this one:
update dept a
set unqid = (select val
             from (select rownum val, deptname
                   from (select distinct deptname from dept)
                  ) b 
             where b.deptname=a.deptname)
/
I'm not sure analytics are faster in this case but you can try (assuming deptid is a unique key):
update dept a 
set unqid = (select val
             from (select deptid, 
                          first_value (deptid) 
                            over (partition by deptname order by deptid) val
                   from dept) b
             where b.deptid=a.deptid)
/
Regards
Michel
 
 
October   13, 2006 - 8:24 am UTC 
 
if you really wanted to use "rownum", you best use order by in the select distinct, else there is no assurance that rownum would be assigned deterministically.
 
 
 
 
Michel Cadot, October   13, 2006 - 8:45 am UTC
 
 
Yes, I agree (about rownum and order) but I did not care about what the unique number is (as raajesh said: The numbers given are just examples) as long as it is different for each deptname. 
Anyway, the second query gives a number that fully depends on the data as the department (name) is associated to the first deptid for each name (and it could be any other one). (This to explain to other readers and not to you, Tom, as I know you perfectly understand my queries.) 
 
October   13, 2006 - 2:28 pm UTC 
 
problem is that the calls to the subquery are NOT deterministic, the same rownum could in theory be assigned to different rows in different invocations. 
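A third option that sidesteps the rownum determinism problem entirely: DENSE_RANK over the department name yields a stable, consecutive number per name (though not one drawn from an Oracle sequence, as originally requested). A sketch against SQLite (3.25+) from Python as an Oracle stand-in, with guessed table and column names since no create was posted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dept (deptid int primary key,"
             " deptname text, unqid int)")
conn.executemany("insert into dept(deptid, deptname) values (?,?)",
                 [(1, 'A1'), (2, 'A1'), (3, 'A1'), (4, 'A2'), (5, 'A2'),
                  (6, 'A2'), (7, 'A3'), (8, 'A4'), (9, 'A4'), (10, 'A4')])

# dense_rank() over (order by deptname) is fully determined by the
# data, unlike rownum over an unordered SELECT DISTINCT.
conn.execute("""
update dept
set unqid = (select val
             from (select deptid,
                          dense_rank() over (order by deptname) val
                   from dept) b
             where b.deptid = dept.deptid)
""")
result = conn.execute(
    "select deptid, deptname, unqid from dept order by deptid").fetchall()
print(result)
```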
 
 
 
One more analytics question
Serge Shmygelsky, October   13, 2006 - 8:55 am UTC
 
 
Hello Tom,
I have a table with the following data:
DN_NUM               LAC                  CELL_ID      DURATION
-------------------- -------------------- ---------- ----------
503111011            Kiev                 044                10
503111011            Kiev                 044                10
503111011            Kiev                 044                10
503111011            Kiev region          044                10
503111011            Kiev region          044                10
503111011            Kiev region          044                10
503111011            Kiev region          044                10
503111011            Odesa                048                10
503111011            Odesa region         048                10
503111011            Odesa region         048                10
503111011            Odesa region         048                10
503111011            Odesa region         048                10
503111011            Odesa region         048                10
and I need, for every phone number (there is only one here for simplicity), to find the cell_id with the maximum number of calls (rows, actually) and within this cell_id find the lac with the maximum number of calls, e.g. the output should look like the following:
DN_NUM      CELL_ID  LAC         COUNT_BY_CELL_ID COUNT_BY_REGION
503111011   044      Kiev region                7               4
If we just do 'GROUP BY cell_id, lac' then we'll get an incorrect answer.
I've spent 2 hours playing with rank and rownum and cannot find a solution. Is it possible to get it within one query?
Thanks in advance.
P.S. The real table is huge. So I'd rather not use any joins 
 
October   13, 2006 - 2:29 pm UTC 
 
no table create
no inserts into table
NO LOOK
 
 
 
 
to Serge Shmygelsky  
Michel Cadot, October   13, 2006 - 10:09 am UTC
 
 
Hi Serge,
Do you understand "no create, no inserts, no lookie"?
If so, provide the data if you want an answer.
Regards
Michel
 
 
 
Analytical Calculation Problem
Hardik Panchal, October   14, 2006 - 9:30 am UTC
 
 
Hi Tom,
I am posting my problem scenario again (the same table, data and calculation rules as in my October 12 post above), hoping an excellent solution from you as always :-)
Regards,
Hardik Panchal
 
 
October   14, 2006 - 9:39 am UTC 
 
first - I'm taking new questions now, why wouldn't you use new questions to ask a.... new question?
second, I have no idea why 91 is the new price on the first row.  4% off of 100 would be 96 no? 
 
 
 
How to do the reverse of this analytical function question?
Patty Hoth, October   16, 2006 - 1:05 pm UTC
 
 
I really find your website invaluable!  I have studied and played around with this answer to try to solve my SQL issue.  I want to do the reverse of what this person was looking for.
My data looks like:
EMPLID      EFFDT         EFFSEQ DEPTID    
----------- --------- ---------- ----------
0003496     01-JUL-99          0 157000    
0003496     01-JAN-00          0 157000    
0003496     01-JUL-00          1 157999    
0003496     01-JAN-01          0 157999    
0003496     01-JUL-01          1 157999    
0003496     01-JAN-02          0 157999    
0003496     01-JUL-02          1 157999    
0003496     01-JAN-03          0 157999    
0003496     01-JUL-03          1 157999    
0003496     01-JAN-04          0 157999    
0003496     01-JAN-05          0 157999    
0003496     01-JUL-05          1 157999    
0003496     01-JUL-05          2 157999    
0003496     01-JAN-06          0 157999    
14 rows selected.
I want to create SQL to answer the question of department movement.  So, for this data, the answer would look like:
eid    effdt            dept    end effdt
3496    7/1/1999    157000    7/1/2000
3496    7/1/2000    157999    
I have tried variations of the answer you provided, come very close, but am not able to get the right answer:
SELECT emplid,  deptid, effdt, ending_effdt
       FROM
(SELECT emplid,
       deptid,
       effdt,
       ending_effdt,
       DECODE( next_deptid, deptid, 1, 0 ) first_of_pair,
       DECODE( lead_deptid, deptid, 1, 0 ) second_of_pair
  FROM (
  SELECT emplid,
            lag(deptid) OVER (PARTITION BY emplid ORDER BY effdt, effseq) next_deptid,
          lead(deptid) OVER (PARTITION BY emplid ORDER BY effdt, effseq) lead_deptid,
         deptid,
         effdt,
-- lag(effdt) OVER (PARTITION BY emplid ORDER BY effdt) ending_effdt,
-- lead(effdt) OVER (PARTITION BY emplid ORDER BY effdt) lead_effdt
         lead(effdt) OVER (PARTITION BY emplid ORDER BY effdt, effseq) ending_effdt,
         lag(effdt) OVER (PARTITION BY emplid ORDER BY effdt, effseq) lead_effdt
   FROM ps_job
  WHERE emplid = '0003496' ) 
WHERE next_deptid IS NULL
   OR lead_deptid IS NULL
   OR lead_deptid <> deptid
   OR next_deptid <> deptid
 )
   WHERE first_of_pair <> 1
Result:
EMPLID      DEPTID     EFFDT     ENDING_EF
----------- ---------- --------- ---------
0003496     157000     7/1/1999  1/1/2000
0003496     157999     7/1/2000  1/1/2001
The effdt are correct, but the ending dates are not (second one should be null and first should be 7/1/2000).  I will continue to work on this, but thought I'd try your assistance!  Thanks very much in advance.
 
 
October   16, 2006 - 5:21 pm UTC 
 
no creates
  no insert intos
     definitely no lookie... 
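For what it's worth, a sketch of one reading of the requirement, built from the sample listing in the post (SQLite 3.25+ from Python as an Oracle stand-in, ISO date strings): keep only the rows where the department changes (the stint starts), then LEAD over what is left supplies each stint's end date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table ps_job"
             " (emplid text, effdt text, effseq int, deptid text)")
conn.executemany("insert into ps_job values (?,?,?,?)", [
    ('0003496', '1999-07-01', 0, '157000'),
    ('0003496', '2000-01-01', 0, '157000'),
    ('0003496', '2000-07-01', 1, '157999'),
    ('0003496', '2001-01-01', 0, '157999'),
    ('0003496', '2001-07-01', 1, '157999'),
    ('0003496', '2002-01-01', 0, '157999'),
    ('0003496', '2002-07-01', 1, '157999'),
    ('0003496', '2003-01-01', 0, '157999'),
    ('0003496', '2003-07-01', 1, '157999'),
    ('0003496', '2004-01-01', 0, '157999'),
    ('0003496', '2005-01-01', 0, '157999'),
    ('0003496', '2005-07-01', 1, '157999'),
    ('0003496', '2005-07-01', 2, '157999'),
    ('0003496', '2006-01-01', 0, '157999'),
])

# Step 1: keep rows where deptid differs from the previous row's
# (or there is no previous row) - those are the stint starts.
# Step 2: the next stint's start date is this stint's end date.
sql = """
select emplid, effdt, deptid,
       lead(effdt) over (partition by emplid order by effdt, effseq) end_effdt
from (select emplid, effdt, effseq, deptid,
             lag(deptid) over (partition by emplid
                               order by effdt, effseq) prev_dept
      from ps_job)
where prev_dept is null or prev_dept <> deptid
order by effdt
"""
result = conn.execute(sql).fetchall()
print(result)
```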
 
 
 
One more analytics question - ctd
Serge Shmygelsky, October   23, 2006 - 8:10 am UTC
 
 
Hello Tom,
thanks for your attention. I'll try to re-formulate my question hoping to get the answer.
I have created the following table:
OPS$SHMYG@REX> create table test (dn_Num varchar2(11), region varchar2(20), area varchar2(20));
After that I put some data in there:
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa');
OPS$SHMYG@REX> insert into test values ('503111011', 'Kiev', 'Kiev');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
OPS$SHMYG@REX> insert into test values ('503111011', 'Odesa', 'Odesa region');
Now my table looks like this:
OPS$SHMYG@REX> select * from test order by 1, 2, 3;
DN_NUM               REGION               AREA
-------------------- -------------------- --------------------
503111011            Kiev                 Kiev
503111011            Kiev                 Kiev
503111011            Kiev                 Kiev
503111011            Kiev                 Kiev region
503111011            Kiev                 Kiev region
503111011            Kiev                 Kiev region
503111011            Kiev                 Kiev region
503111011            Odesa                Odesa
503111011            Odesa                Odesa region
503111011            Odesa                Odesa region
503111011            Odesa                Odesa region
503111011            Odesa                Odesa region
503111011            Odesa                Odesa region
Actually these are phone calls. Each phone call has 2 attributes:
region it is made from
area it is made from.
Each region contains 2 or more areas.
The question is as follows:
for any given phone number I need to find a region with maximum number of calls. For this phone and this region I need to find area with maximum number of calls.
If I just run (group by region, area), I'll have
Phone number  Region  CNT_REGION  Area          CNT_AREA
503111011     Odesa   6           Odesa region  5
but I need the following:
Phone number  Region  CNT_REGION  Area          CNT_AREA
503111011     Kiev    7           Kiev region   4
I cannot find a way to do it in one sql and hope this is possible as real table contains 100,000,000 rows per day.
Hope this time my explanation is acceptable.
Best regards. 
 
October   23, 2006 - 10:29 am UTC 
 
ops$tkyte%ORA10GR2> select * from (
  2  select region, count(*)
  3    from test
  4   group by region
  5   order by 2 desc
  6  )
  7  where rownum = 1;
REGION                 COUNT(*)
-------------------- ----------
Kiev                          7
ops$tkyte%ORA10GR2> select * from (
  2  select region, area, count(*)
  3    from test
  4   group by region, area
  5   order by 3 desc
  6  )
  7  where rownum = 1;
REGION               AREA                   COUNT(*)
-------------------- -------------------- ----------
Odesa                Odesa region                  5
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select *
  2    from (
  3  select region, area, max_area, area_cnt, reg_cnt,
  4         max(reg_cnt) over () max_reg
  5    from (
  6  select region, area, area_cnt,
  7         max(area_cnt) over (partition by region) max_area,
  8             sum(area_cnt) over (partition by region) reg_cnt
  9    from (
 10  select region, area, count(*) area_cnt
 11    from test
 12   where dn_num = '503111011'
 13   group by region, area
 14         )
 15             )
 16             )
 17   where area_cnt = max_area or reg_cnt = max_reg
 18  /
REGION               AREA                   MAX_AREA   AREA_CNT    REG_CNT    MAX_REG
-------------------- -------------------- ---------- ---------- ---------- ----------
Kiev                 Kiev                          4          3          7          7
Kiev                 Kiev region                   4          4          7          7
Odesa                Odesa region                  5          5          6          7
the last query is the one you asked for. If dn_num is indexed, it doesn't really matter that the table is big - I would presume the phone number is fairly selective. 
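For what it's worth, the behaviour Serge seems to be describing - pick the busiest region first, then the busiest area inside that region - can be emulated outside the database. A small Python sketch over the sample data above (ties would need an explicit rule, as Tom notes):

```python
from collections import Counter

# (region, area) per call, mirroring the sample inserts above
calls = [("Kiev", "Kiev")] * 3 + [("Kiev", "Kiev region")] * 4 \
      + [("Odesa", "Odesa")] * 1 + [("Odesa", "Odesa region")] * 5

# step 1: the region with the maximum number of calls
region_cnt = Counter(region for region, _ in calls)
top_region, top_region_calls = region_cnt.most_common(1)[0]

# step 2: within that region only, the area with the maximum number of calls
area_cnt = Counter(area for region, area in calls if region == top_region)
top_area, top_area_calls = area_cnt.most_common(1)[0]
```

This picks Kiev (7 calls) and then Kiev region (4 calls), which matches the result Serge said he needed.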
 
 
 
 
One more analytics question - ctd
Serge Shmygelsky, October   23, 2006 - 9:03 am UTC
 
 
Hello Tom,
looks like I've finally found the answer. If I'm not wrong, it should be like that:
OPS$SHMYG@REX> select dn_num, region, area, count(*), sum(count(*)) over (partition by dn_num, region) from test group by dn_num, region, area order by 5 desc, 4 desc;
DN_NUM               REGION               AREA                   COUNT(*) SUM(COUNT(*))OVER(PARTITIONBYDN_NUM,REGION)
-------------------- -------------------- -------------------- ---------- -------------------------------------------
503111011            Kiev                 Kiev region                   4                                           7
503111011            Kiev                 Kiev                          3                                           7
503111011            Odesa                Odesa region                  5                                           6
503111011            Odesa                Odesa                         1                                           6
The first row is the one I was looking for.
 
 
October   23, 2006 - 10:31 am UTC 
 
no, according to your problem definition - you asked for two rows (max region, max area - no relation between area and region in your problem statement) AND further, there are ties - meaning this query can return lots of rows, and it should
or you need to be much, much more precise. 
 
 
 
One more analytics question - ctd
Serge Shmygelsky, October   24, 2006 - 2:06 am UTC
 
 
Hello Tom,
thanks for your solution. It solves the problem as well as the one I mentioned. They'll need to be customized a little but I'm on my way.
I'll pay some attention to my English :).
Do you plan to create a logo for PL/SQL? 'Analytics rocks' would be a good one :)
Best regards. 
 
 
One probable answer to Serge !!!!!
Vinayak, October   24, 2006 - 5:32 am UTC
 
 
select * from (
select * from(
select dn_num,region,area,count(region) over(partition by region) reg_cnt,count(area) over(partition by region,area) area_cnt from test1 
where dn_num='503111011'
)
order by reg_cnt desc,area_cnt desc
) where rownum=1
Hope this helps you!!! Also wait for Tom's comments on this. 
 
October   24, 2006 - 9:16 am UTC 
 
well, not really knowing what he wanted, I cannot really comment much more than I did above - it follows from his description that two rows should regularly appear (max region count, max area count) 
 
 
 
How to move to analytic function ?
Lena, October   26, 2006 - 6:12 pm UTC
 
 
Hi Tom,
There are times when you need to bring different things twice from the same table.
 select test.name      name,
       test2.name     name2,
       p.name,
       p.main_num,
       p.sec_num
 from  test test,   <===
       test test2,  <===
       prod p
where  p.main_num  = test.id
  and  p.sec_num = test2.id
Is it possible get the same result using analytic function ?
create table test
(id number,
 name varchar2(20))
 
 insert into test values(251,'U-M');
 insert into test values(279,'PAL');
 insert into test values(300,'U-A');
 insert into test values(301,'E');
 insert into test values(303,'S');
 insert into test values(337,'A')
commit;
 
 create table prod
 (id       number,
  name     varchar2(20),
  main_num number,
  sec_num  number);
 
 insert into prod values (1,'XX',300,251);
 insert into prod values (2,'YY',300,null);
 insert into prod values (3,'ZZ',279,301);
 insert into prod values (4,'UU',303,337);
 COMMIT;
Regards 
 
October   27, 2006 - 7:38 am UTC 
 
in this case, no, using analytics would not suffice "in general" 
 
 
 
Second question ...
Lena, October   27, 2006 - 1:32 pm UTC
 
 
Hi Tom,
I tried to find a second way to write this statement
but didn't succeed. 
Could you please show how to do it ?
select test.name      name,
       test2.name     name2,
       p.name,
       p.main_num,
       p.sec_num
 from  test test,   <===
       test test2,  <===
       prod p
where  p.main_num  = test.id
  and  p.sec_num = test2.id
Thanks again 
 
October   27, 2006 - 8:08 pm UTC 
 
from test test2,
     test test,
     prod p
there you go, just mix up the order of the tables in the from clause and you have written it in a second way.....
Not sure what you mean? 
 
 
 
Second way = Different way
Lena, October   28, 2006 - 1:11 am UTC
 
 
Hi Tom,
I found in one of our reports a select statement that
fetches from 20 tables in the FROM clause.
The reason there are so many tables is the need to bring things twice (or more) from the same
table.
So, i built a simple test case that look like the following query:
select test.name      name,
       test2.name     name2,
       p.name,
       p.main_num,
       p.sec_num
 from  test test,   <===
       test test2,  <===
       prod p
where  p.main_num  = test.id
  and  p.sec_num = test2.id
I'm trying to rewrite this query in a different way (a second way...), but can't find any way to do that task.
I thought about analytics, but understood that it's not
a case for analytics.
Could you please show how to do it?
Thanks again 
 
October   28, 2006 - 10:35 am UTC 
 
well, the problem is that each row in "prod" can join to a different set of rows in test each time, hence - well, you sort of need TWO JOINS.
In this case, it is somewhat "not really avoidable", p.main_num in general is not equal to p.sec_num (I presume, else you would not have two columns) therefore and thusly - two joins. 
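A rough in-memory picture of why the table has to appear twice: each prod row does two independent lookups into test, one keyed by main_num and one by sec_num. A Python emulation using the test-case data above:

```python
# test as a lookup table: id -> name; prod as (name, main_num, sec_num)
test = {251: "U-M", 279: "PAL", 300: "U-A", 301: "E", 303: "S", 337: "A"}
prod = [("XX", 300, 251), ("YY", 300, None), ("ZZ", 279, 301), ("UU", 303, 337)]

# TWO independent lookups per prod row - which is exactly why test
# appears twice in the FROM clause; the inner join also drops the
# row whose sec_num is NULL (YY), just as the SQL does
rows = [(test[main], test[sec], name, main, sec)
        for name, main, sec in prod
        if main in test and sec in test]
```

Since main_num and sec_num are generally different keys, there is no single join that produces both name columns at once.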
 
 
 
A reader, November  17, 2006 - 6:50 am UTC
 
 
Dear Tom,
Sorry to ask my question here. I've the following situation
create table t1 (x number, y varchar2(3), z varchar2(3));
create table t2 (z varchar2(3), w number);
INSERT INTO T1 ( X, Y, Z ) VALUES ( 1, '003', 'aaa'); 
INSERT INTO T1 ( X, Y, Z ) VALUES ( 2, '004', 'bbb'); 
INSERT INTO T1 ( X, Y, Z ) VALUES ( 3, '005', 'ccc'); 
INSERT INTO T1 ( X, Y, Z ) VALUES ( 4, '006', 'ddd'); 
INSERT INTO T2 ( Z, W ) VALUES ( 'aaa', 1); 
INSERT INTO T2 ( Z, W ) VALUES ( 'bbb', 2); 
INSERT INTO T2 ( Z, W ) VALUES ( 'ccc', 3); 
commit;
I would like then to do the following update
update t1
set    t1.x = 99
where  t1.y = '003'
and    t1.z in (select z from t2)
Then, I would like to insert into another table all t1.z
that have been concerned by this update.
(I don't want to issue the following select again in order  to get those t1.z
   select t1.z from t1 
   where t1.y = '003'
   and t1.z in (select z from t2)
)
Is it possible to do this directly within the update statement ?
Thanks for your help, as always
Best Regards 
 
November  17, 2006 - 8:09 am UTC 
 
ops$tkyte%ORA10GR2> declare
  2          type array is table of t3.z%type index by binary_integer;
  3          l_data array;
  4  begin
  5          update t1 set x = 99 where y = '003' and z in (select z from t2)
  6          returning z bulk collect into l_data;
  7
  8          forall i in 1 .. l_data.count
  9                  insert into t3 values (l_data(i));
 10  end;
 11  /
PL/SQL procedure successfully completed.
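The flow of that block - update, capture the affected keys, insert them - can be sketched in plain Python against the sample data (an emulation of the logic, not the PL/SQL itself):

```python
# tables from the test case above, held in memory
t1 = [{"x": 1, "y": "003", "z": "aaa"}, {"x": 2, "y": "004", "z": "bbb"},
      {"x": 3, "y": "005", "z": "ccc"}, {"x": 4, "y": "006", "z": "ddd"}]
t2 = {"aaa", "bbb", "ccc"}
t3 = []

changed = []                          # plays the role of l_data
for row in t1:
    if row["y"] == "003" and row["z"] in t2:
        row["x"] = 99                 # the UPDATE
        changed.append(row["z"])      # what RETURNING z BULK COLLECT captures

t3.extend(changed)                    # what the FORALL loop inserts into t3
```

The point of RETURNING ... BULK COLLECT is exactly this: the update and the capture of affected keys happen in one statement, with no second query against t1.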
 
 
 
 
 
How can I use analytics to answer this question
A reader, November  27, 2006 - 11:15 am UTC
 
 
I have this data:
ORDER NUM     UPDATE DATE    TOTAL
---------     -----------    ------
100           01-JAN-06      150.00
100           15-FEB-06      160.00
100           02-JUN-06       85.00
200           05-FEB-06      300.00
200           07-NOV-06      400.00
200           25-NOV-06      100.00
300           01-MAY-06      500.00
300           20-JUN-06      600.00
300           26-NOV-06      750.00
400           16-AUG-06      800.00
400           20-SEP-06      800.00 
400           10-OCT-06      200.00 
I need to calculate the total difference between the last update date and the next to last update date for orders where the next to last update date is < 10/19 and the last total < the next to last total.
Using the sample data, the query should return
ORDER NUM   DIFFERENCE
---------   ----------
100          -75.00
400         -600.00
Your expert advice will be appreciated.
 
 
November  27, 2006 - 7:33 pm UTC 
 
no create
   no inserts
      no lookie
         no promises that when we do look we can answer... 
 
 
 
Here's the missing info
A reader, November  28, 2006 - 9:53 am UTC
 
 
create table orders (order_num number, update_date date, total number);
insert into orders values(100,to_date('01-JAN-06','DD-MON-YY'),150);
insert into orders values(100,to_date('15-FEB-06','DD-MON-YY'),160);
insert into orders values(100,to_date('02-JUN-06','DD-MON-YY'), 85);
insert into orders values(200,to_date('05-FEB-06','DD-MON-YY'),300);
insert into orders values(200,to_date('07-NOV-06','DD-MON-YY'),400);
insert into orders values(200,to_date('25-NOV-06','DD-MON-YY'),100);
insert into orders values(300,to_date('01-MAY-06','DD-MON-YY'),500);
insert into orders values(300,to_date('20-JUN-06','DD-MON-YY'),600);
insert into orders values(300,to_date('26-NOV-06','DD-MON-YY'),750);
insert into orders values(400,to_date('16-AUG-06','DD-MON-YY'),800);
insert into orders values(400,to_date('20-SEP-06','DD-MON-YY'),800);
insert into orders values(400,to_date('10-OCT-06','DD-MON-YY'),200);
 
 
November  28, 2006 - 11:56 am UTC 
 
this is a total guess since "10/19" is a bit ambiguous, and you don't really specify how to compare total with last total (assume "by order_num")
ops$tkyte%ORA10GR2> select *
  2    from (
  3  select order_num, update_date, total,
  4         lag(total) over (partition by order_num order by update_date) last,
  5         total-lag(total) over (partition by order_num order by update_date) diff
  6    from orders
  7         )
  8   where diff < 0
  9     and update_date < to_date( '19-oct-2006', 'dd-mon-yyyy')
 10   order by order_num, update_date
 11  /
 ORDER_NUM UPDATE_DA      TOTAL       LAST       DIFF
---------- --------- ---------- ---------- ----------
       100 02-JUN-06         85        160        -75
       400 10-OCT-06        200        800       -600
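The same lag-and-filter logic can be emulated in Python over the sample rows, with a per-order "previous total" playing the role of LAG(total):

```python
from datetime import date

# (order_num, update_date, total) - the sample rows above
orders = [
    (100, date(2006, 1, 1), 150), (100, date(2006, 2, 15), 160),
    (100, date(2006, 6, 2), 85),  (200, date(2006, 2, 5), 300),
    (200, date(2006, 11, 7), 400), (200, date(2006, 11, 25), 100),
    (300, date(2006, 5, 1), 500), (300, date(2006, 6, 20), 600),
    (300, date(2006, 11, 26), 750), (400, date(2006, 8, 16), 800),
    (400, date(2006, 9, 20), 800), (400, date(2006, 10, 10), 200),
]

cutoff = date(2006, 10, 19)
prev_total = {}                # last total seen per order - what LAG(total) sees
result = []
for order_num, upd, total in sorted(orders):
    prev = prev_total.get(order_num)
    if prev is not None and total - prev < 0 and upd < cutoff:
        result.append((order_num, total - prev))
    prev_total[order_num] = total
```

Like Tom's query, this guesses that "10/19" filters the row's own update date; change the condition if the cutoff was meant to apply to the prior row instead.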
 
 
 
 
 
Thanks!
A reader, November  28, 2006 - 3:59 pm UTC
 
 
Your reply gave me an excellent foundation to achieve what I was trying to accomplish.
Analytics definitely rock!
Thanks! 
 
 
How about the model clause
Bart, November  29, 2006 - 8:31 am UTC
 
 
Often I see your quote:
(but wait'll you see the 
SQL Model clause in 10g)
But it seems very quiet around the model clause.
Is it just too complicated?
Does it not fill any need?
or
??
Any thoughts...practical (real world) examples... 
 
 
Re: How about the model clause 
Frank Zhou, November  30, 2006 - 12:16 pm UTC
 
 
Bart,            
     SQL Model Clause is one of the most powerful and useful tools in the Oracle Database. This is truly a SQL evolution.
 It opens up limitless opportunities to process data in a single SQL query. A fast and efficient single-query solution becomes a reality for many tough real-world problems. It is an excellent invention from the Oracle Database team!!
      In 10g Release 2 the following trick is allowed in the database:
with Model_Clause_1 as (SQL_MODEL_Clause_QUERY),
     Model_Clause_2 as (another SQL_MODEL_Clause_QUERY that can select from "Model_Clause_1")
select * from Model_Clause_2
MODEL
Partition by ( * ) 
dimension by (Analytic functions are allowed here!)
Measures (Analytic functions are allowed here too!)
Rules (Analytic functions are "FINALLY" allowed here!!!!)
Just think about it: with this kind of powerful data processing capability, people should have very few excuses not to be able to implement a single query to solve a database-related problem. 
        Here are more Model Clause examples:
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:1941199930674#77866736858360
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:4273679444401#76833955117389
SQL Model clause rock...... 
Analytics        roll.... 
Frank  
 
select distinct
Sandeep, December  06, 2006 - 1:39 am UTC
 
 
Tom, I have one table, suppose table1, and I have to select 2 rows from it, suppose row1 and row2, but row1 has some duplicates. While selecting I want to skip the duplicates and select only those rows having the maximum effective date.
We have 11 million records in the table, so I have to write an analytic function.
Please help me in doing this 
 
December  07, 2006 - 8:01 am UTC 
 
how can row1 have "some duplicates".  
if you mean:
we have a history table, the key is:
X, EFF_DT
for any given X value, we only want to keep the record such that for that X value, EFF_DT = max(EFF_DT)
then
select *
  from (select t.*, max(eff_dt) over (partition by X) max_dt
          from t)
where eff_dt = max_dt;
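The keep-only-the-latest-row idea can be emulated in a few lines of Python (hypothetical sample rows; the max_dt dictionary plays the role of the analytic MAX):

```python
from datetime import date

# hypothetical history rows: (x, eff_dt, payload)
t = [("A", date(2006, 1, 1), "old"), ("A", date(2006, 6, 1), "new"),
     ("B", date(2006, 3, 1), "only")]

# max_dt[x] plays the role of MAX(eff_dt) OVER (PARTITION BY x)
max_dt = {}
for x, dt, _ in t:
    max_dt[x] = max(max_dt.get(x, dt), dt)

# keep only the rows where eff_dt equals the per-key maximum
latest = [row for row in t if row[1] == max_dt[row[0]]]
```

Note that, like the SQL, this keeps every row tied for the maximum date; add a tie-breaker (e.g. ROW_NUMBER in SQL) if exactly one row per key is required.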
 
 
 
 
Reader, December  06, 2006 - 1:43 am UTC
 
 
Sorry Tom, rather than saying 'column' I said 'rows' 
 
 
 
ORA-00907: missing right parenthesis in Pro*C
Jimmy, December  07, 2006 - 9:16 pm UTC
 
 
Greetings Tom, 
I have a program in Pro*C that works most of the time; the problem is that sometimes it raises error ORA-00907: missing right parenthesis when executing the following INSERT statement: 
  EXEC SQL INSERT INTO TABLAF
      ( N1M_F5CT3NT4RN2 ,
        N1M_F2L32       , 
        N1M_PR2C4S2     ,
        T3P_D2C1M       ,
        T3P_PR2C4S2     ,
        C2D_V4NT5       ,
        C2D_C4NTR4M3    ,
        C2D_CL34NT4     ,
        F4C_4M3S32N     ,
        F4C_V4NC3M34NT2 ,
        C2D_C3CLF5CT    ,
        C2D_M2N4D5      ,
        F4C_3N3D5TC2BR2S,
        T2T_F5CT1R5     ,
        T2T_N4T2        ,
        T2T_P5G5R       ,
        5C1M_D4SC14NT2  ,
        5C1M_3V5        ,
        5C1M_3C5        ,
        5C1M_R4T4NC32N  ,
        3MP_5C1M1L5D2   ,
        3MP_S5LD25NT    ,
        3MP_5J1ST4S     ,
        3MP_P5G2S       ,
        3MP_M2R530      ,
        3MP_M2R560      ,
        3MP_M2R590      ,
        3MP_M2R5120     ,
        3MP_M2R5150     ,
        N1M_F5CTR4F     ,
        N2M_1S15R2R5    ,
        3ND_P5C2BR2S    ,
        3ND_P5C2NT5BL4  ,
        3ND_3MPR4S32N   ,
        N1M_D35N        ,
        3MP_3V5S5LD25NT ,
        F4C_S1SP4NS32N  )
    V5L14S (
        T2_N1MB4R(:stH3stD2c1.szN1mF5ct3nt4rn2)                     ,
        T2_N1MB4R(:stH3stD2c1.szN1mF2l32)                           ,      
        :stH3stD2c1.lN1mPr2c4s2                                     ,     
        :stH3stD2c1.3T3pD2c1m                                       ,
        :stH3stD2c1.3T3pPr2c4s2                                     ,           
        :stH3stD2c1.lC2dV4nt5                   :3_shC2dV4nt5       ,
        :stH3stD2c1.3C2dC4ntr4m3                                    ,        
        :stH3stD2c1.lC2dCl34nt4                                     ,
        T2_D5T4(:stH3stD2c1.szF4c4m3s32n      ,'YYYYMMDDHH24M3SS')  ,   
        T2_D5T4(:stH3stD2c1.szF4cV4nc3m34nt2  ,'YYYYMMDDHH24M3SS')  ,
        :stH3stD2c1.lC2dC3clF5ct                :3_shC2dC3clF5ct    ,        
        :stH3stD2c1.szC2dM2n4d5                                     ,      
        T2_D5T4(:stH3stD2c1.szF4c3n3D5tC2br2s   :3_shF4c3n3D5tC2br2s,
                                              'YYYYMMDDHH24M3SS')   ,
        :stH3stD2c1.dT2tF5ct1r5                                     ,        
        :stH3stD2c1.dT2tN4t2                                        ,       
        :stH3stD2c1.dT2tP5g5r                                       ,       
        :stH3stD2c1.d5c1mD4sc14nt2                                  ,
        :stH3stD2c1.d5c1m3V5                                        ,
        :stH3stD2c1.d5c1m3C5                                        ,
        :stH3stD2c1.d5c1mR4t4nc32n                                  ,
        :stH3stD2c1.d3mp5c1m1l5d2                                   ,
        :stH3stD2c1.d3mpS5ld25nt                                    ,
        :stH3stD2c1.d3mp5j1st4s                 :3_sh3mp5j1st4s     ,
        :stH3stD2c1.d3mpP5g2s                   :3_sh3mpP5g2s       ,
        :stH3stD2c1.d3mpM2r530                  :3_sh3mpM2r530      ,
        :stH3stD2c1.d3mpM2r560                  :3_sh3mpM2r560      ,
        :stH3stD2c1.d3mpM2r590                  :3_sh3mpM2r590      ,
        :stH3stD2c1.d3mpM2r5120                 :3_sh3mpM2r5120     ,
        :stH3stD2c1.d3mpM2r5150                 :3_sh3mpM2r5150     ,
        T2_N1MB4R(:stH3stD2c1.szN1mF5ctR4f      :3_shN1mF5ctR4f)    ,              
        :stH3stD2c1.szN2m1s15r2r5               :3_shN2m1s15r2r5    ,
        :stH3stD2c1.33ndP5C2br2s                                    ,
        :stH3stD2c1.33ndP5C2nt5bl4                                  ,
        :stH3stD2c1.33nd3mpr4s32n                                   ,
        :stH3stD2c1.szN1mD35N                                       ,
        :stH3stD2c1.d3mp3v5S5ld25nt                                 ,
        T2_D5T4(:stH3stD2c1.szF4cS1sp4ns32n   :3_shF4cS1sp4ns32n    , 
                                              'YYYYMMDDHH24M3SS')   );
The indicator flag variables, such as i_shF4c3n3D5tC2br2s, are initialized beforehand according to the values of the host variables.
E.g.: i_shFecIniDatCobros = (stHistDocu.szFecIniDatCobros[0] == '\0') ? ORA_NULL : 0; where ORA_NULL has the value -1.
Could this have something to do with the host variables and indicator flags? In what cases can it fail? Should I upgrade the Oracle library, or the OS?
I appreciate any help you can offer.
Thanks, 
  
 
December  08, 2006 - 7:35 am UTC 
 
man, those are the wackiest variable names ever aren't they.
I don't see how a prepared statement like that could or would "sometimes" say "missing parens", it seems it should either be 
a) always
b) never
are you saying "intermittent" - because that would only be raised during a parse and that should only be parsed ONCE really during program execution and if it parsed at least once, it should always parse
(and if it does not, I would have to then say "you have a memory overwrite somewhere and you are munging this static string during your program execution - that is, you are doing it yourself") 
 
 
 
Michel Cadot, December  08, 2006 - 8:05 am UTC
 
 
It seems he tried to obfuscate the real names - we can see VALUES is converted to V5L14S, TO_DATE to T2_D5T4...
With a quick look we can see:
A -> 5
E -> 4
I -> 3
O -> 2(?)
U -> 1
In his idiom "Analytics question" means "Syntax error". :)
Michel
 
 
 
Secret decoder ring...
Jmv, December  08, 2006 - 11:20 am UTC
 
 
Start of decoding....
T2_N1MB4R is TO_NUMBER so 2=O, 1=U, 4=E
V5L14S is VALUES       so 5=A, 1=U and 4=E (correlation)
T2_D5T4 is TO_DATE     so 2=O (corr.), 5=A (corr.), 4=E (correlated)
Possible partial Rosetta stone...
12345
UO?EA   Is 3=I?
Possible rewrite...
EXEC SQL INSERT INTO TABLAF
      ( NUM_FACTINTERNO ,
        NUM_FOLIO       , 
        NUM_PROCESO     ,
        TIP_DOCUM       ,
        TIP_PROCESO     ,
        COD_VENTA       ,
        COD_CENTREMI    ,
        COD_CLIENTE     ,
        FEC_EMISION     ,
        FEC_VENCIMIENTO ,
        COD_CICLFACT    ,
        COD_MONEDA      ,
        FEC_INIDATCOBROS,
        TOT_FACTURA     ,
        TOT_NETO        ,
        TOT_PAGAR       ,
        ACUM_DESCUENTO  ,
        ACUM_IVA        ,
        ACUM_ICA        ,
        ACUM_RETENCION  ,
        IMP_ACUMULADO   ,
        IMP_SALDOANT    ,
        IMP_AJUSTES     ,
        IMP_PAGOS       ,
        IMP_MORAI0      ,
        IMP_MORA60      ,
        IMP_MORA90      ,
        IMP_MORAUO0     ,
        IMP_MORAUA0     ,
        NUM_FACTREF     ,
        NOM_USUARORA    ,
        IND_PACOBROS    ,
        IND_PACONTABLE  ,
        IND_IMPRESION   ,
        NUM_DIAN        ,
        IMP_IVASALDOANT ,
        FEC_SUSPENSION  )
    VALUES (
        TO_NUMBER(:stHIstDOcU.szNUmFActIntErnO)                     ,
        TO_NUMBER(:stHIstDOcU.szNUmFOlIO)                           ,      
        :stHIstDOcU.lNUmPrOcEsO                                     ,     
        :stHIstDOcU.ITIpDOcUm                                       ,
        :stHIstDOcU.ITIpPrOcEsO                                     ,           
        :stHIstDOcU.lCOdVEntA                   :I_shCOdVEntA       ,
        :stHIstDOcU.ICOdCEntrEmI                                    ,        
        :stHIstDOcU.lCOdClIEntE                                     ,
        TO_DATE(:stHIstDOcU.szFEcEmIsIOn      ,'YYYYMMDDHHOEMISS')  ,   
        TO_DATE(:stHIstDOcU.szFEcVEncImIEntO  ,'YYYYMMDDHHOEMISS')  ,
        :stHIstDOcU.lCOdCIclFAct                :I_shCOdCIclFAct    ,        
        :stHIstDOcU.szCOdMOnEdA                                     ,      
        TO_DATE(:stHIstDOcU.szFEcInIDAtCObrOs   :I_shFEcInIDAtCObrOs,
                                              'YYYYMMDDHHOEMISS')   ,
        :stHIstDOcU.dTOtFActUrA                                     ,        
        :stHIstDOcU.dTOtNEtO                                        ,       
        :stHIstDOcU.dTOtPAgAr                                       ,       
        :stHIstDOcU.dAcUmDEscUEntO                                  ,
        :stHIstDOcU.dAcUmIVA                                        ,
        :stHIstDOcU.dAcUmICA                                        ,
        :stHIstDOcU.dAcUmREtEncIOn                                  ,
        :stHIstDOcU.dImpAcUmUlAdO                                   ,
        :stHIstDOcU.dImpSAldOAnt                                    ,
        :stHIstDOcU.dImpAjUstEs                 :I_shImpAjUstEs     ,
        :stHIstDOcU.dImpPAgOs                   :I_shImpPAgOs       ,
        :stHIstDOcU.dImpMOrAI0                  :I_shImpMOrAI0      ,
        :stHIstDOcU.dImpMOrA60                  :I_shImpMOrA60      ,
        :stHIstDOcU.dImpMOrA90                  :I_shImpMOrA90      ,
        :stHIstDOcU.dImpMOrAUO0                 :I_shImpMOrAUO0     ,
        :stHIstDOcU.dImpMOrAUA0                 :I_shImpMOrAUA0     ,
        TO_NUMBER(:stHIstDOcU.szNUmFActREf      :I_shNUmFActREf)    ,            
  
        :stHIstDOcU.szNOmUsUArOrA               :I_shNOmUsUArOrA    ,
        :stHIstDOcU.IIndPACObrOs                                    ,
        :stHIstDOcU.IIndPACOntAblE                                  ,
        :stHIstDOcU.IIndImprEsIOn                                   ,
        :stHIstDOcU.szNUmDIAN                                       ,
        :stHIstDOcU.dImpIvASAldOAnt                                 ,
        TO_DATE(:stHIstDOcU.szFEcSUspEnsIOn   :I_shFEcSUspEnsIOn    , 
                                              'YYYYMMDDHHOEMISS')   );
If trying to obfuscate, then one of the parentheses might have been missed, or accidentally replaced.  
Otherwise perhaps there may be a problem with the variable assignment, wherein a variable contains a quote or other character (e.g. SQL injection) that is causing the Pro*C routine to report the error.
Need more info.
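The inferred substitution can be checked mechanically; a quick Python sketch of the decoder (mapping taken from the partial Rosetta stone above):

```python
# the vowel-substitution cipher inferred above: 1=U, 2=O, 3=I, 4=E, 5=A
decode = str.maketrans("12345", "UOIEA")

print("5C1M_D4SC14NT2".translate(decode))   # ACUM_DESCUENTO
```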
 
 
December  09, 2006 - 12:39 pm UTC 
 
laughing out loud - nice decoder there. 
 
 
 
Analytic question
Carlos, December  11, 2006 - 8:05 am UTC
 
 
Tom I am using Oracle 10g and I need your feedback on the following:
I need a way to generate zero amount value records on dates when the blasts did not generate any amounts.
Here are the DDLs for the table and data:
create table test
(transaction_date date,
 blast_id number(11),
 blast_amount number(11)
)
/
insert into test values ('01-DEC-06', 1, 10);
insert into test values ('01-DEC-06', 2, 5);
insert into test values ('01-DEC-06', 3, 15);
insert into test values ('01-DEC-06', 4, 12);
insert into test values ('01-DEC-06', 5, 9);
insert into test values ('02-DEC-06', 7, 30);
insert into test values ('02-DEC-06', 8, 10);
insert into test values ('02-DEC-06', 1, 20);
insert into test values ('02-DEC-06', 9, 40);
insert into test values ('02-DEC-06', 2, 10);
insert into test values ('03-DEC-06', 10, 100);
insert into test values ('03-DEC-06', 6, 45);
insert into test values ('03-DEC-06', 12, 200);
insert into test values ('03-DEC-06', 13, 90);
insert into test values ('03-DEC-06', 14, 32);
commit;
If I query the table for transaction date = '02-DEC-06' the ouput should be
'02-DEC-06', 1, 20
'02-DEC-06', 2, 10
'02-DEC-06', 3, 0
'02-DEC-06', 4, 0
'02-DEC-06', 5, 0
'02-DEC-06', 7, 30
'02-DEC-06', 8, 10
'02-DEC-06', 9, 40
If I query the table for transaction date = '03-DEC-06' the output should be
'03-DEC-06', 1, 0
'03-DEC-06', 2, 0
'03-DEC-06', 3, 0
'03-DEC-06', 4, 0
'03-DEC-06', 5, 0
'03-DEC-06', 6, 45
'03-DEC-06', 7, 0
'03-DEC-06', 8, 0
'03-DEC-06', 9, 0
'03-DEC-06', 10, 100
'03-DEC-06', 12, 200
'03-DEC-06', 13, 90
'03-DEC-06', 14, 32
If I query the table for transaction date between '02-DEC-06' and '03-DEC-06', I should get
'02-DEC-06', 1, 20
'02-DEC-06', 2, 10
'02-DEC-06', 3, 0
'02-DEC-06', 4, 0
'02-DEC-06', 5, 0
'02-DEC-06', 7, 30
'02-DEC-06', 8, 10
'02-DEC-06', 9, 40
'03-DEC-06', 1, 0
'03-DEC-06', 2, 0
'03-DEC-06', 3, 0
'03-DEC-06', 4, 0
'03-DEC-06', 5, 0
'03-DEC-06', 6, 45
'03-DEC-06', 7, 0
'03-DEC-06', 8, 0
'03-DEC-06', 9, 0
'03-DEC-06', 10, 100
'03-DEC-06', 12, 200
'03-DEC-06', 13, 90
'03-DEC-06', 14, 32
My goal is not to store blasts that did not generate any amounts as I would be wasting space.
Thanks.
 
 
December  11, 2006 - 8:33 am UTC 
 
I don't know why "6" for 2-dec and "11" for 3-dec are "missing", I will assume that is a mistake in your example - I cannot see any logic to them not being there.
What we need is a set of rows that is equal to the max blast_id for the rows of interest.  That is:
  7  (select max(max(blast_id))
  8     from test
  9    where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')
 10    group by transaction_date)
 11  )
take that number and have at least that many rows.  That is what "data" is below.
then, using 10gr2 partitioned outer joins, we can make up the missing rows and fill them in:
ops$tkyte%ORA10GR2> variable ldate varchar2(25);
ops$tkyte%ORA10GR2> variable hdate varchar2(25);
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> exec :ldate := '02-dec-2006';
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> exec :hdate := '03-dec-2006';
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with
  2  data
  3  as
  4  (select level l
  5     from dual
  6  connect by level <=
  7  (select max(max(blast_id))
  8     from test
  9    where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')
 10    group by transaction_date)
 11  )
 12  select *
 13    from (
 14  select test.transaction_date, data.l, nvl(test.blast_amount,0) blast_amount,
 15         max(blast_id) over (partition by test.transaction_date) max_id
 16    from data left join
 17        (select *
 18               from test
 19          where transaction_date between to_date(:ldate,'dd-mon-yyyy') and to_date(:hdate,'dd-mon-yyyy')) test
 20            partition by (transaction_date) on (data.l = test.blast_id)
 21              )
 22   where l <= max_id
 23   order by transaction_date, l
 24  /
TRANSACTI          L BLAST_AMOUNT     MAX_ID
--------- ---------- ------------ ----------
02-DEC-06          1           20          9
02-DEC-06          2           10          9
02-DEC-06          3            0          9
02-DEC-06          4            0          9
02-DEC-06          5            0          9
02-DEC-06          6            0          9
02-DEC-06          7           30          9
02-DEC-06          8           10          9
02-DEC-06          9           40          9
03-DEC-06          1            0         14
03-DEC-06          2            0         14
03-DEC-06          3            0         14
03-DEC-06          4            0         14
03-DEC-06          5            0         14
03-DEC-06          6           45         14
03-DEC-06          7            0         14
03-DEC-06          8            0         14
03-DEC-06          9            0         14
03-DEC-06         10          100         14
03-DEC-06         11            0         14
03-DEC-06         12          200         14
03-DEC-06         13           90         14
03-DEC-06         14           32         14
23 rows selected.
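The densification trick - generate blast_ids 1..max per date and fill the gaps with zero - can be emulated in Python over the sample data (the dictionary get plays the role of the partitioned outer join plus NVL):

```python
from collections import defaultdict

# (date, blast_id, amount) - the rows stored for 02-DEC and 03-DEC above
rows = [("02-DEC-06", 1, 20), ("02-DEC-06", 2, 10), ("02-DEC-06", 7, 30),
        ("02-DEC-06", 8, 10), ("02-DEC-06", 9, 40),
        ("03-DEC-06", 6, 45), ("03-DEC-06", 10, 100), ("03-DEC-06", 12, 200),
        ("03-DEC-06", 13, 90), ("03-DEC-06", 14, 32)]

by_date = defaultdict(dict)
for d, blast_id, amt in rows:
    by_date[d][blast_id] = amt

dense = []
for d in sorted(by_date):                 # per transaction date...
    top = max(by_date[d])                 # MAX(blast_id) OVER (PARTITION BY date)
    for blast_id in range(1, top + 1):    # ...make up the rows 1..max
        dense.append((d, blast_id, by_date[d].get(blast_id, 0)))  # NVL(amount, 0)
```

Like Tom's query, this assumes blast_ids start at 1 and run up to the per-date maximum; if the sequence can skip values (as Carlos notes in the follow-up), the made-up ids would include those skipped values too.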
 
 
 
 
 
Thanks a million
Carlos, December  11, 2006 - 8:49 am UTC
 
 
Just what I needed! 
 
 
Analytics Question
Carlos, December  12, 2006 - 6:41 pm UTC
 
 
Tom, regarding your comment on what you perceived were missing records: the blast_id is generated from a database sequence. A sequence is not guaranteed to always be sequential, as I understand it; it guarantees uniqueness and in some cases can skip values.
Would your solution change as a result of this?
Thanks again. 
 
December  12, 2006 - 10:10 pm UTC 
 
sorry - this comment is not making sense to me.
I assumed
a) blast_id starts at 1
b) blast_id goes to max by the transaction date
that is all - was wondering about the missing records in your example, couldn't figure out why 
 
 
 
Finding Duplicates in large unpartitioned table
Ravi, December  26, 2006 - 12:29 pm UTC
 
 
Hi Tom, 
     Can you help me find a way to optimize the routine below, which takes 38 hours to run in production? The table is huge, with about 17 million records and about 2 million duplicates. The explain plan for the query shows a fast full scan, but the problem appears to be in the delete statement, which takes very long.
Any help or insight on this problem is greatly appreciated.
SQL> select count(*) from claims_temp;
  COUNT(*)
----------
  17353894
Number of duplicates about 2,000,000
CREATE TABLE CLAIMS_TEMP
(
  HEALTH_SERVICE_ID              NUMBER(20)     NOT NULL,
  HEALTH_SERVICE_ITEM_NO         NUMBER(2)      NOT NULL,
  PDE_REC_NO                     NUMBER(3)      NOT NULL,
  PATIENT_ID                     NUMBER(10)     NOT NULL,
  ADJ_CARDHOLDER_ID              CHAR(20 BYTE)  NOT NULL,
  CLIENT_ID                      CHAR(15 BYTE)  NOT NULL,
  PLAN_ID                        CHAR(8 BYTE)   NOT NULL,
  ADJUD_DT                       DATE           NOT NULL,
  I_SERVICE_DT                   DATE           NOT NULL,
  LABEL_NAME                     VARCHAR2(50 BYTE) NOT NULL,
  I_QTY_DISPENSED                NUMBER(10,3)   NOT NULL,
  TOTAL_CLIENT_AMT_BILLED        NUMBER(8,2)    NOT NULL,
  O_PATIENT_PAY_AMT              NUMBER(8,2)    NOT NULL
);
 
CREATE UNIQUE INDEX EOB_CLAIMS_TEMP_UX1 ON CLAIMS_TEMP
(HEALTH_SERVICE_ID, HEALTH_SERVICE_ITEM_NO, PDE_REC_NO);
CREATE OR REPLACE PROCEDURE sp_del_dups (p_limit IN PLS_INTEGER, p_exit_status OUT PLS_INTEGER)
IS
   CURSOR dups_cur
   IS
      SELECT row_id
        FROM (SELECT ROWID row_id,
                     ROW_NUMBER () OVER (PARTITION BY health_service_id, health_service_item_no ORDER BY pde_rec_no DESC)
                                                                                        dups_claims
                FROM claims_temp)
       WHERE dups_claims > 1;
   TYPE rowidarray IS TABLE OF ROWID
      INDEX BY PLS_INTEGER;
   dups_rowid_tb   rowidarray;
BEGIN
   OPEN dups_claims_cur;
   LOOP
      FETCH dups_claims_cur
      BULK COLLECT INTO dups_rowid_tb LIMIT p_limit;
      FORALL i IN 1 .. dups_rowid_tb.COUNT
         DELETE FROM claims_temp
               WHERE ROWID = dups_rowid_tb (i);
      COMMIT;
      EXIT WHEN dups_claims_cur%NOTFOUND;
   END LOOP;
   CLOSE dups_claims_cur;
   COMMIT;
EXCEPTION
   WHEN OTHERS
   THEN
      p_exit_status := 1;
      RAISE;
END sp_del_dups;
Thank you 
Ravi   
December  26, 2006 - 9:23 pm UTC 
 
lots of indexes or not lots of indexes?
what do you have p_limit set to?
you do know that setting p_exit_status in the exception block is a BIG OLD WASTE of keystrokes right? 
 
 
Commit inside a loop
A reader, December  27, 2006 - 5:36 am UTC
 
 
Why are you committing inside a loop? That risks an ORA-01555 error.
Why not use this:
delete from claims_temp
where rowid not in (select min(rowid) from claims_temp
                    group by health_service_id
                            ,health_service_item_no
                            ,pde_rec_no
                    );
However, I am not sure that you will not get an ORA-01555 error. In that case the above delete must be done part by part.
 
 
Finding Duplicates in large unpartitioned table   
Ravi, December  27, 2006 - 12:33 pm UTC
 
 
First, I am sorry: the code I posted will not compile because the cursor name in the declaration section does not match the cursor name in the cursor reference section.
In the declaration section it should have been "dups_claims_cur".
I tested p_limit with 50,000 and 500:
with 500 --> takes 4 hours
with 50,000 --> takes 38 hours
But I need this script to finish in seconds, not hours. Is there any better way of achieving this without bulk collect?
For the reply from "THE READER": that solution wouldn't work because it looks for duplicates on the combination of "health_service_id, health_service_item_no, pde_rec_no", but my requirement is the combination of "health_service_id, health_service_item_no", deleting the records with "pde_rec_no" < max(pde_rec_no) where there are multiple occurrences of "health_service_id, health_service_item_no".
And for the commit, I assume the commit is in the best place it can be; here the commit frequency is once per BULK COLLECT LIMIT batch.
 
December  28, 2006 - 9:33 am UTC 
 
the commit should be where the commit belongs - and that is not necessarily after every delete call.
you mentioned nothing about number of indexes
and if you just run the query and fetch the rows (no delete), how long does that take. 
 
 
For Ravi
Tyler, December  27, 2006 - 5:31 pm UTC
 
 
"reader" had the right idea for you, probably just not all of your requirements. 
This should give you a starting point. I don't profess for the code to be optimal, you may need to tweak it for your needs. Should still be faster than your procedural code. 
DELETE FROM claims_temp
WHERE ROWID IN
(
   SELECT MY_ROWID
   FROM
   (
      SELECT
         ROWID as MY_ROWID,
         pde_rec_no,
         MAX(pde_rec_no) over (PARTITION BY health_service_id ,health_service_item_no) as MAX_pde_rec_no,
         COUNT(*) over (PARTITION BY health_service_id ,health_service_item_no) as MY_COUNT
      FROM claims_temp
   )
   WHERE MY_COUNT > 1
   AND pde_rec_no <> MAX_pde_rec_no
);
General idea, find all the items with more than 1 record (per health_service_id ,health_service_item_no) where the pde_rec_no DOES NOT EQUAL the MAX pde_rec_no for the given per health_service_id ,health_service_item_no combination. 
Based on your previous post, that should encompass the algorithm you require. I hope :) 
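For what it's worth, the keep-the-max rule Tyler describes can be checked outside the database too. A hypothetical Python sketch of the same logic (key columns assumed from the posts above):

```python
def rows_to_delete(rows):
    """rows: tuples of (health_service_id, health_service_item_no, pde_rec_no).
    Flags every row whose pde_rec_no is below its group's max -- the
    MY_COUNT > 1 AND pde_rec_no <> MAX_pde_rec_no filter in the query."""
    max_rec = {}
    for hsid, item, rec in rows:
        key = (hsid, item)
        max_rec[key] = max(max_rec.get(key, rec), rec)
    # a row survives only if it carries its group's maximum pde_rec_no
    return [i for i, (hsid, item, rec) in enumerate(rows)
            if rec < max_rec[(hsid, item)]]
```

Note that, exactly like the SQL, rows tied at the maximum pde_rec_no are all kept.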
 
Finding Duplicates in large unpartitioned table   
Ravi, December  28, 2006 - 6:33 pm UTC
 
 
For Tom's Feedback
   My mistake, I didn't include the index counts.
Number of indexes
Unique index on (Health_service_id, health_service_item_no,pde_rec_no)
Bitmap index on (plan_id) not part of the query
I used Tyler's query; the full data set comes back (without the delete) in 20 min.
--------------------------------------------------------
For Tyler's Feedback 
     your query comes back with a much better cost than mine - the window sort is avoided - but the cost is still too high and it runs forever when accompanied by the delete
--------------------------------------------------------
Tyler's Query Uses
  WINDOW BUFFER 
--------------------------------------------------------
SELECT MY_ROWID 
  FROM 
  ( 
    SELECT 
      ROWID as MY_ROWID, 
      pde_rec_no, 
      MAX(pde_rec_no) over (PARTITION BY health_service_id ,health_service_item_no) as MAX_pde_rec_no, 
      COUNT(*) over (PARTITION BY health_service_id ,health_service_item_no) as MY_COUNT 
    FROM claims_temp 
  ) 
  WHERE MY_COUNT > 1 
  AND pde_rec_no <> MAX_pde_rec_no 
--------------------------------------------------------
  Ravi's Query 
  Uses WINDOW SORT (more expensive)
--------------------------------------------------------
SELECT row_id 
    FROM (SELECT ROWID row_id, 
              ROW_NUMBER () OVER (PARTITION BY health_service_id, health_service_item_no ORDER BY pde_rec_no DESC) 
                                                          dups_claims 
          FROM claims_temp) 
    WHERE dups_claims > 1; 
--------------------------------------------------------
For Both
    The query seems to be performing well, but when wrapped in
Delete .. where rowid in (query)
    the cost went up by 20 times.
    My new approach (not very good, I assume, but it does the job in 45 min):
CREATE TABLE rowid_temp
(seq_no NUMBER UNIQUE, row_id VARCHAR2(30))
INSERT      /*+APPEND */INTO rowid_temp
            (seq_no, row_id)
   SELECT ROWNUM, ROWIDTOCHAR (row_id)
     FROM (SELECT ROWID AS row_id, pde_rec_no,
                  MAX (pde_rec_no) OVER (PARTITION BY health_service_id, health_service_item_no)
                                          AS max_pde_rec_no,
                  COUNT (*) OVER (PARTITION BY health_service_id, health_service_item_no)
                                              AS dups_count
             FROM claims_temp)
    WHERE dups_count > 1 AND pde_rec_no <> max_pde_rec_no;
COMMIT;
DELETE FROM claims_temp
      WHERE ROWID IN (SELECT /*+ FULL(r) PARALLEL(r,4) */
                             CHARTOROWID (row_id)
                        FROM rowid_temp r);
COMMIT;      
      
TRUNCATE TABLE rowid_temp;      
 
December  29, 2006 - 9:38 am UTC 
 
all indexes are necessarily involved here!!!!! you are deleting entries.
have you thought about a maintenance window where you can disable the indexes, do the bulk delete, and then rebuild them?
 
 
Thank you
Ravi, December  29, 2006 - 11:42 am UTC
 
 
Unfortunately the DBAs here will not allow me to drop and recreate indexes; I have to live with it.
Thanks a lot for all the help and guidance provided - it helped me a lot; execution time dropped from 38 hours to 45 minutes.
Tyler, thanks for your query; it did help me.
Thank you
Ravi 
 
Analytics rock
Nengbing, January   09, 2007 - 1:26 pm UTC
 
 
Tom,
  Thank you so much for your help to many who are using Oracle. I enjoy reading your books and this site.
  I hope the following is relevant to this thread.
drop table t1 purge;
create table t1 (d number);
insert into t1 values(452911);
insert into t1 values(794034);
insert into t1 values(794057);
insert into t1 values(893069);
insert into t1 values(924952);
insert into t1 values(1068608);
insert into t1 values(1068617);
insert into t1 values(1237362);
insert into t1 values(1268008);
Desired output, groups of numbers that are within 2*1000, each group with a distinctive group id.
        D         P1         P2          G
---------- ---------- ---------- ----------
    452911     451911     453911          1
    794034     793034     795034          2
    794057     793057     795057          2
    893069     892069     894069          3
    924952     923952     925952          4
   1068608    1067608    1069608          5
   1068617    1067617    1069617          5
   1237362    1236362    1238362          6
   1268008    1267008    1269008          7
  I have managed to do the following, but am not sure how to assign distinctive groups.
  1  select d,p1,p2,l1+NVL(lag(l1,1) over (order by d),0) g
  2  from (
  3  select d,d-1000 p1,d+1000 p2,
  4        case when (lead(d,1) over (order by d) - d) < 2000 then 1
  5        else 0
  6*       end l1 from t1)
SQL> /
         D         P1         P2          G
---------- ---------- ---------- ----------
    452911     451911     453911          0
    794034     793034     795034          1
    794057     793057     795057          1
    893069     892069     894069          0
    924952     923952     925952          0
   1068608    1067608    1069608          1
   1068617    1067617    1069617          1
   1237362    1236362    1238362          0
   1268008    1267008    1269008          0
9 rows selected.
  Basically, I want to put those consecutive rows that have 1 in column G into distinctive groups; each row with 0 in column G should be in its own group.
  Any help is greatly appreciated.
  Nengbing 
January   11, 2007 - 8:59 pm UTC 
 
sorry, this is very very confusing - don't get p1, p2, g AT ALL
... groups of numbers that are are within 2*1000, ...
means nothing - within 2*1000 of what?  why 2*1000 why not 2000? 
 
 
Analytics rock
Nengbing, January   10, 2007 - 7:00 pm UTC
 
 
create table t1 (d number);
---- thought of using sequence to generate group id
create sequence seq1 start with 1 increment by 1;
insert into t1 values (452911);
insert into t1 values (794012);
insert into t1 values (794034);
insert into t1 values (794234);
insert into t1 values (794345);
insert into t1 values (924952);
insert into t1 values (1068608);
insert into t1 values (1068617);
insert into t1 values (1237362);
insert into t1 values (1268008);
 select t2.*,
             case when (1=l1)
                       and  ((lag(l1,1) over (order by d)) = 1)
                         then seq1.currval
                   else  seq1.nextval
             end g
 from (
 select d,d-1000 p1,d+1000 p2,
       case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
            when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
       else 0
       end l1 from t1
 ) t2
/
         D         P1         P2         L1          G
---------- ---------- ---------- ---------- ----------
    452911     451911     453911          0        281
    794012     793012     795012          1        282
    794034     793034     795034          1        283
    794234     793234     795234          1        284
    794345     793345     795345          1        285
    924952     923952     925952          0        286
   1068608    1067608    1069608          1        287
   1068617    1067617    1069617          1        288
   1237362    1236362    1238362          0        289
   1268008    1267008    1269008          1        290
10 rows selected.
---- Why does column G always increment?
---- replace seq1.nextval with 100
 select t2.*,
             case when (1=l1)
                       and  ((lag(l1,1) over (order by d)) = 1)
                         then seq1.currval
                   else  100
             end g
 from (
 select d,d-1000 p1,d+1000 p2,
       case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
            when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
       else 0
       end l1 from t1
 ) t2
/
         D         P1         P2         L1          G
---------- ---------- ---------- ---------- ----------
    452911     451911     453911          0        100
    794012     793012     795012          1        100
    794034     793034     795034          1        290
    794234     793234     795234          1        290
    794345     793345     795345          1        290
    924952     923952     925952          0        100
   1068608    1067608    1069608          1        100
   1068617    1067617    1069617          1        290
   1237362    1236362    1238362          0        100
   1268008    1267008    1269008          1        100
10 rows selected.
---- it seems that the condition testing is working
create sequence seq2 start with 1 increment by 1;
  select t2.*,
              case when (1=l1)
                        and  ((lag(l1,1) over (order by d)) = 1)
                          then seq1.currval
                    else  seq2.nextval
              end g
  from (
  select d,d-1000 p1,d+1000 p2,
        case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
             when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
        else 0
        end l1 from t1
 ) t2
/
         D         P1         P2         L1          G
---------- ---------- ---------- ---------- ----------
    452911     451911     453911          0          1
    794012     793012     795012          1          2
    794034     793034     795034          1        290
    794234     793234     795234          1        290
    794345     793345     795345          1        290
    924952     923952     925952          0          6
   1068608    1067608    1069608          1          7
   1068617    1067617    1069617          1        290
   1237362    1236362    1238362          0          9
   1268008    1267008    1269008          1         10
10 rows selected.
---- condition testing is OK
 select t2.*,
             case when (1=l1)
                       and  ((lag(l1,1) over (order by d)) = 1)
                         then seq1.currval
                   else  seq1.nextval
             end g
 from (
 select d,d-1000 p1,d+1000 p2,
       case when (lead(d,1,0) over (order by d) - d) < 2000 then 1
            when (d - (lag(d,1,0) over (order by d)) ) < 2000 then 1
       else 0
       end l1 from t1
 ) t2
/
         D         P1         P2         L1          G
---------- ---------- ---------- ---------- ----------
    452911     451911     453911          0        291
    794012     793012     795012          1        292
    794034     793034     795034          1        293
    794234     793234     795234          1        294
    794345     793345     795345          1        295
    924952     923952     925952          0        296
   1068608    1067608    1069608          1        297
   1068617    1067617    1069617          1        298
   1237362    1236362    1238362          0        299
   1268008    1267008    1269008          1        300
10 rows selected.
---- How come G always increments?
 
 
analytics rock
Nengbing, January   12, 2007 - 2:38 pm UTC
 
 
Tom,
I apologize for the confusion and hope this makes it clear.
Assuming that I have a set of numbers, I'd like to put them into distinctive groups: when the numbers are sorted, the difference between any number and the next one within a group should be less than 2000, and every number that differs from some number in a group by less than 2000 should belong to that group.
For example:
    452911    1  ---- put 452911 in group 1
    794012    2  ---- in a new group because it differs from 1st number by more than 2000
    794034    2  ---- in group 2 because it differs from last number by less than 2000
    794234    2  ---- same group 2
    794345    2  ---- same group 2
    924952    3  ---- new group 3
   1068608    4  ---- new group 4
   1068617    4  ---- same group 4
   1237362    5  ---- new group 5
   1268008    6  ---- new group 6
By the way, I did find in the Oracle documentation that "......If any of these locations contains references to both CURRVAL and NEXTVAL, then Oracle increments the sequence and returns the same value for both CURRVAL and NEXTVAL......."
 So using a sequence as I intended in my last reply does not work.
 Thank you so much, Tom!
 
 
To: Nengbing 
Michel Cadot, January   16, 2007 - 5:01 am UTC
 
 
SQL> with
  2    step1 as (
  3      select d, 
  4             case 
  5               when d-nvl(lag(d) over(order by d),0) > 2000 
  6               then rownum
  7             end flag
  8      from t1
  9    ),
 10    step2 as (
 11      select d, max(flag) over (order by d) grp
 12      from step1
 13    )
 14  select d,
 15         dense_rank() over (order by grp) grp
 16  from step2
 17  order by d
 18  /
         D        GRP
---------- ----------
    452911          1
    794012          2
    794034          2
    794234          2
    794345          2
    924952          3
   1068608          4
   1068617          4
   1237362          5
   1268008          6
10 rows selected.
First we flag (with their rownum) the rows with a difference greater than 2000.
Then we propagate this flag to all members of the same group.
Finally we convert this flag to a rank.
Regards
Michel 
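Michel's three steps (flag the gap rows, propagate the flag with a running max, rank the flags) collapse to a single pass over the sorted values. A plain-Python sketch of the same grouping rule, offered only as a cross-check of the logic:

```python
def group_within(nums, gap=2000):
    """Assign group ids: sorted values stay in the same group while
    each one is within `gap` of its predecessor -- the lag/flag/
    dense_rank logic from the query above, in procedural form."""
    out = []
    grp = 0
    prev = None
    for d in sorted(nums):
        if prev is None or d - prev > gap:
            grp += 1          # a gap (or the first row) opens a new group
        out.append((d, grp))
        prev = d
    return out
```

Running it over Nengbing's ten values reproduces groups 1 through 6 shown above.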
 
 
To: Nengbing   
Michel Cadot, January   16, 2007 - 3:40 pm UTC
 
 
Another solution, just writing what you said as the rule of a model clause:
SQL> select d, grp
  2  from (select d from t1 order by d)
  3  model
  4    dimension by (rownum rn)
  5    measures (d, cast(null as number) grp) 
  6    rules 
  7      ( grp[ANY] = case when d[cv()-1] is null then 1 -- case first line
  8                        when d[cv()]-d[cv()-1]>2000 then grp[cv()-1]+1
  9                        else grp[cv()-1]
 10                   end
 11       )
 12  order by 1
 13  /
         D        GRP
---------- ----------
    452911          1
    794012          2
    794034          2
    794234          2
    794345          2
    924952          3
   1068608          4
   1068617          4
   1237362          5
   1268008          6
10 rows selected.
Michel 
 
 
Thanks, Michel
Nengbing, January   16, 2007 - 6:30 pm UTC
 
 
Very nicely done!
I have only minor modifications: replace t1 on line 8 with "select d from t1 order by d", and 0 on line 5 with -2001 to always have a flag for the 1st row.
Thank you very much! 
 
To: Nengbing   
Michel Cadot, January   17, 2007 - 12:37 am UTC
 
 
You're right about line 5, but ordering table t1 in line 8 is useless: the rows are compared with their previous one in the order given in the lag function (that is, "d"), whatever the physical order in the table.
Michel 
 
Michel Cadot, January   17, 2007 - 2:20 am UTC
 
 
I have to correct my query with the model clause.
I used "rownum" to dimension the result set; it works here 1) because the model clause does not generate any rows and 2) thanks to the current implementation. But this is not the correct way to do it: I should number the rows beforehand, for instance with the row_number function.
SQL> select d, grp
  2  from (select d, row_number() over (order by d) rn from t1)
  3    model
  4    dimension by (rn)
  5    measures (d, cast(null as number) grp) 
  6      rules 
  7        ( grp[ANY] = case when d[cv()-1] is null then 1 -- case first line
  8                          when d[cv()]-d[cv()-1]>2000 then grp[cv()-1]+1
  9                          else grp[cv()-1]
 10                     end
 11        )
 12  order by 1
 13  /
         D        GRP
---------- ----------
    452911          1
    794012          2
    794034          2
    794234          2
    794345          2
    924952          3
   1068608          4
   1068617          4
   1237362          5
   1268008          6
10 rows selected.
Michel 
 
 
Thanks, Michel
Nengbing, January   17, 2007 - 11:55 am UTC
 
 
I tend to agree with you. However, the rownum as flag may throw things off. For example, if I insert a few more rows like:
insert into t1 values (3000);
insert into t1 values (4000);
insert into t1 values (200);
insert into t1 values (400);
the query returns:
         D        GRP
---------- ----------
       200          1
       400          1
      3000          1
      4000          1
    452911          1
    794012          1
    794034          1
    794234          1
    794345          1
    924952          1
   1068608          1
   1068617          1
   1237362          1
   1268008          1
 Because step1 returns:
         D       FLAG
---------- ----------
       200         13
       400
      3000         11
      4000
    452911          1
    794012          2
    794034
    794234
    794345
    924952          6
   1068608          7
   1068617
   1237362          9
   1268008         10
  Thanks again, Michel; because of contributions from Oracle users like you, this site is even more helpful!
 
 
Thanks Nengbing   
Michel Cadot, January   17, 2007 - 3:00 pm UTC
 
 
Oh yes, rownum is wrong, row_number should be used instead.
SQL> with
  2    step1 as (
  3      select d, 
  4             case 
  5               when d-nvl(lag(d) over(order by d),-2001) > 2000 
  6               then row_number() over(order by d)
  7             end flag
  8      from t1
  9    ),
 10    step2 as (
 11      select d, max(flag) over (order by d) grp
 12      from step1
 13    )
 14  select d,
 15         dense_rank() over (order by grp) grp
 16  from step2
 17  order by d
 18  /
         D        GRP
---------- ----------
       200          1
       400          1
      3000          2
      4000          2
    452911          3
    794012          4
    794034          4
    794234          4
    794345          4
    924952          5
   1068608          6
   1068617          6
   1237362          7
   1268008          8
14 rows selected.
Thanks for correcting that.
Michel 
 
 
analytics to solve credit / debit allocation
RVH, January   24, 2007 - 12:44 am UTC
 
 
Tom,
You mentioned in a response to Dave Thompson on March 04, 2004, that the problem he described in this thread cannot be solved using analytic functions. Just to let you know - the following does the job:
SQL> select * from pay_m;
    PAY_ID    PAYMENT
---------- ----------
         1         50
         2         25
         3         50
         4         50
SQL> 
SQL> 
SQL> select * from prem;
   PREM_ID PREM_PAYMENT
---------- ------------
         1          100
         2           50
         3           50
         4           50
SQL> 
The following query allocates the actual payments (identified by PAY_ID) to the 'premiums' (identified by PREM_ID). I also include the credit/debit amount remaining after the payment has been allocated.
SQL> select vprem.prem_id,
  2         vpay.pay_id,
  3         greatest(vprem.sum_prem_payment - vpay.sum_payment, 0) remaining_debit,
  4         greatest(vpay.sum_payment - vprem.sum_prem_payment, 0) available_credit
  5  from (select pay_id,
  6               payment,
  7               sum(payment) over (order by pay_id) - payment sum_prev_payment,
  8               sum(payment) over (order by pay_id) sum_payment
  9        from pay_m)  vpay,
 10       (select prem_id,
 11               prem_payment,
 12               sum(prem_payment) over (order by prem_id) - prem_payment sum_prev_prem_payment,
 13               sum(prem_payment) over (order by prem_id) sum_prem_payment
 14        from prem) vprem
 15  where  (    vpay.sum_payment > vprem.sum_prev_prem_payment
 16          and vpay.sum_payment <= vprem.sum_prem_payment)
 17     or  (    vpay.sum_prev_payment > vprem.sum_prev_prem_payment
 18          and vpay.sum_prev_payment < vprem.sum_prem_payment)
 19  order by 1 asc, 2 asc
 20  /
   PREM_ID     PAY_ID REMAINING_DEBIT AVAILABLE_CREDIT
---------- ---------- --------------- ----------------
         1          1              50                0
         1          2              25                0
         1          3               0               25
         2          3              25                0
         2          4               0               25
         3          4              25                0
6 rows selected.
Thanks for this very useful web site. I am learning a lot by browsing it.
 
 
Query - correction
RVH, January   24, 2007 - 7:37 pm UTC
 
 
Tom,
When I tested the query from my previous post more thoroughly, I realised it has a flaw under certain boundary conditions. The change in the 'where' clause below fixes the problem. It is working fine now ... Cheers
select vprem.prem_id,
       vpay.pay_id,
       greatest(vprem.sum_prem_payment - vpay.sum_payment, 0) remaining_debit,
       greatest(vpay.sum_payment - vprem.sum_prem_payment, 0) available_credit
from (select pay_id,
             sum(payment) over (order by pay_id) - payment sum_prev_payment,
             sum(payment) over (order by pay_id) sum_payment
      from pay_m)  vpay,
     (select prem_id,
             sum(prem_payment) over (order by prem_id) - prem_payment sum_prev_prem_payment,
             sum(prem_payment) over (order by prem_id) sum_prem_payment
      from prem) vprem
where       greatest (vpay.sum_prev_payment, vprem.sum_prev_prem_payment)
      <  least (vpay.sum_payment, vprem.sum_prem_payment)
order by 1, 2
/
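The corrected where clause is the classic interval-overlap test: two cumulative-sum ranges intersect exactly when greatest(start1, start2) < least(end1, end2). A rough Python sketch of the same allocation (function and variable names are illustrative, not from the original):

```python
from itertools import accumulate

def allocate(payments, premiums):
    """Pair payments with premiums whose cumulative-sum intervals
    overlap, mirroring the greatest(prev) < least(cum) join above.
    payments/premiums: lists of (id, amount) in allocation order."""
    def intervals(items):
        # each item occupies [running total before it, running total after it)
        cum = list(accumulate(a for _, a in items))
        prev = [c - a for c, (_, a) in zip(cum, items)]
        return [(i, lo, hi) for (i, _), lo, hi in zip(items, prev, cum)]
    out = []
    for prem_id, plo, phi in intervals(premiums):
        for pay_id, alo, ahi in intervals(payments):
            if max(plo, alo) < min(phi, ahi):        # intervals overlap
                out.append((prem_id, pay_id,
                            max(phi - ahi, 0),        # remaining_debit
                            max(ahi - phi, 0)))       # available_credit
    return out
```

With the sample data from the earlier post, this reproduces the six allocation rows shown above.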
 
 
Any single query for this scenario
Subramanian T S, January   25, 2007 - 10:03 am UTC
 
 
Hi Tom,
I have a lookup table
company  brand  area     ndays
XX       {null} {null}    3
XX       BB     {null}    10
XX       BB     1         4
XX       BB     2         8
I want a query to retrieve ndays. The logic is as follows:
First search by company, brand and area
If match found return ndays. 
Else search by company and brand 
     If match found return ndays
     Else search by company
          If match found return ndays
          Else return 0
This can be done with simple PL/SQL; I am looking for a single query, maybe using some built-in functions.
Thanks in advance :)
Cheers !
Subbu.     
 
To: Subramanian T S 
Michel Cadot, January   29, 2007 - 6:20 am UTC
 
 
One way to do it is (with your data):
SQL> var comp varchar2(10)
SQL> var brand varchar2(10)
SQL> var area number
SQL> begin 
  2    :comp := 'XX';
  3    :brand := 'BB';
  4    :area := 1;
  5  end;
  6  /
PL/SQL procedure successfully completed.
SQL> with 
  2    data as (
  3      select t.company, t.brand, t.area, t.ndays,
  4             row_number () over 
  5               (order by t.company, t.brand nulls last, t.area nulls last) nb
  6      from t
  7      where t.company = :comp
  8        and ( t.brand = :brand or t.brand is null )
  9        and ( t.area = :area or t.area is null ) 
 10    )
 11  select company, brand, area, nvl(ndays,0) ndays
 12  from (select 1 nb from dual) dual, data
 13  where data.nb (+) = dual.nb
 14  /
COMPANY    BRAND            AREA      NDAYS
---------- ---------- ---------- ----------
XX         BB                  1          4
1 row selected.
SQL> exec :area := 3
PL/SQL procedure successfully completed.
SQL> /
COMPANY    BRAND            AREA      NDAYS
---------- ---------- ---------- ----------
XX         BB                            10
1 row selected.
SQL> exec :brand := 'CC'
PL/SQL procedure successfully completed.
SQL> /
COMPANY    BRAND            AREA      NDAYS
---------- ---------- ---------- ----------
XX                                        3
1 row selected.
SQL> exec :comp := 'YY'
PL/SQL procedure successfully completed.
SQL> /
COMPANY    BRAND            AREA      NDAYS
---------- ---------- ---------- ----------
                                          0
1 row selected.
The first subquery ("data") searches all rows that match your criteria and orders them according to your algorithm. The main query outputs the first row, or 0 if there is none.
Regards
Michel 
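The fallback precedence Michel encodes with row_number and NULLS LAST (exact match first, then company+brand, then company alone, else 0) can also be stated procedurally. A hedged Python sketch, with the table laid out as tuples for illustration:

```python
def lookup_ndays(table, comp, brand, area):
    """Most-specific-match lookup: (company, brand, area), then
    (company, brand), then (company,), else 0 -- the precedence the
    ORDER BY ... NULLS LAST row_number encodes in the query."""
    # table rows: (company, brand_or_None, area_or_None, ndays);
    # a NULL column acts as a wildcard for that level
    candidates = [r for r in table
                  if r[0] == comp
                  and (r[1] is None or r[1] == brand)
                  and (r[2] is None or r[2] == area)]
    if not candidates:
        return 0
    # most specific first: non-NULL brand beats NULL brand, then area
    best = min(candidates, key=lambda r: (r[1] is None, r[2] is None))
    return best[3]
```

Against Subramanian's sample rows this returns 4, 10, 3 and 0 for the four cases Michel ran.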
 
 
Excellent !!!
A reader, January   30, 2007 - 6:04 am UTC
 
 
Thank you Michel! Excellent one..
Cheers !
Subbu. 
 
excellent information.
kde, February  07, 2007 - 2:28 pm UTC
 
 
Tom,
I am having problems with the following requirement.
Would you suggest using analytics for this?
My data in the source looks like this.
1,abc,500
1,abc,200
1,abc,100
1,abc,150
2,aaa,200
2,aaa,150
2,aaa,50
3,asa,100
3,asa,30
3,asa,60
I need to generate
0,1,abc,500
1,1,abc,300(i.e.,500-200)
2,1,abc,200(i.e.,300-100)
3,1,abc,50(i.e.,200-150)
0,2,aaa,200
1,2,aaa,50(i.e.,200-150)
2,2,aaa,0(i.e.,50-50)
0,3,asa,100
1,3,asa,70(i.e.,100-30)
2,3,asa,10(i.e.,70-60)
I tried using analytics. Here is the query that I have so far
select 
   A.COL1 col1
  ,A.TXT1 txt1
  ,A.COL2 col2
  ,sum(A.COL2) as moving_sum
  ,lead(SUM(A.COL2),1) OVER  (PARTITION BY a.col1, a.txt1 ORDER BY A.COL2 DESC) as next_row -- ROWS 1 PRECEDING) 
FROM   t1 a
 group by A.COL1,A.TXT1,A.COL2
 ORDER BY A.COL1 ASC,A.TXT1 ASC,A.COL2 DESC
--
I tried using the result of the above query but couldn't process the results.
I would really appreciate any suggestions/help. 
thanks
kde 
February  07, 2007 - 6:54 pm UTC 
 
there is no explanation of how to get from input to output "requirements" wise.  You don't explain what your outputs are - we just have a "picture" and we are supposed to derive the requirements from that - that is a flawed approach.
tell us what the outputs actually mean.
and give us create table and inserts to work with. 
 
 
To kde ...
Gabe, February  07, 2007 - 8:20 pm UTC
 
 
The processing order is quite specific and cannot be derived from your input. To that end I have added a "sequence" column to your data set.
create table z ( id number, nm varchar2(5), val number, seq  number );
insert into z values (1,'abc',500,1);
insert into z values (1,'abc',200,2);
insert into z values (1,'abc',100,3);
insert into z values (1,'abc',150,4);
insert into z values (2,'aaa',200,1);
insert into z values (2,'aaa',150,2);
insert into z values (2,'aaa', 50,3);
insert into z values (3,'asa',100,1);
insert into z values (3,'asa', 30,2);
insert into z values (3,'asa', 60,3);
20:08:28 session_139> select id,nm,seq
20:08:30   2        ,first_value(val) over (partition by id order by seq)
20:08:30   3        - sum(case when seq=1 then 0 else val end)
20:08:30   4            over (partition by id order by seq) balance
20:08:30   5  from  z
20:08:30   6  order by 1,3
20:08:30   7  ;
        ID NM           SEQ    BALANCE
---------- ----- ---------- ----------
         1 abc            1        500
         1 abc            2        300
         1 abc            3        200
         1 abc            4         50
         2 aaa            1        200
         2 aaa            2         50
         2 aaa            3          0
         3 asa            1        100
         3 asa            2         70
         3 asa            3         10
10 rows selected. 
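Gabe's query computes, for each row, the first value in the partition minus the running sum of all the subsequent values. Since that equals twice the first value minus the full running sum, the CASE expression can be folded away; a sketch against the same z table, producing the same 10 rows:

```sql
-- balance(seq n) = val(seq 1) - sum(val for seq 2..n)
--                = 2 * first_value - running_sum(seq 1..n)
select id, nm, seq,
       2 * first_value(val) over (partition by id order by seq)
         - sum(val)         over (partition by id order by seq) balance
  from z
 order by id, seq;
```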
 
 
apologize
a user, February  07, 2007 - 10:44 pm UTC
 
 
<quote>
there is no explanation of how to get from input to output "requirements" wise. You don't explain what your outputs are - we just have a "picture" and we are supposed to derive the requirements from that - that is a flawed approach. 
</quote>
Tom, 
I apologize for asking questions with incomplete information. I should have given you what Gabe had created. 
regards 
 
Using analytics in subqueries
sharvu, February  08, 2007 - 3:15 pm UTC
 
 
Tom,
I need your suggestion: 
create table a(id number,src number,dt date);
There are 50 Mill records in this table that actually has 50 more columns.
I would need something like:
select count(*) from a t1
where src = 999
  and dt = (select max(dt) from a t2 where t2.src = t1.src and t2.id = t1.id)
The query practically runs forever. Is there an effective way to do this?
Thanks, 
February  08, 2007 - 4:25 pm UTC 
 
select count(*) 
  from (select dt, max(dt) over (partition by id) max_dt
          from a
         where src = 999
       )
 where dt = max_dt;
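The same result can also be expressed with a ranking function instead of a windowed MAX; a sketch against the table a described above (assuming rows tied for the latest dt should all be counted, as in the correlated version):

```sql
-- Count the rows ranked first by dt within each id, among src = 999 rows.
-- rank() leaves every row tied for the latest dt at rank 1, which matches
-- the "dt = max_dt" semantics of the windowed-MAX version.
select count(*)
  from (select rank() over (partition by id order by dt desc) rnk
          from a
         where src = 999
       )
 where rnk = 1;
```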
 
 
 
Can this be done through Analytical Function?
A reader, February  10, 2007 - 2:19 am UTC
 
 
Hi Tom,
I have set of very large tables (ranging from around 40 Million rows to 10 Million rows). I have created a test table just to set up an example what I want to achieve:-
create table test (start_date date, end_date date, amnt number(10));
insert into test values ('01-JAN-2000','01-JAN-2001',100);
insert into test values ('01-JAN-2001','01-JAN-2002',200);
insert into test values ('01-JAN-2002','01-JAN-2003',200);
insert into test values ('01-JAN-2003','01-JAN-2004',500);
insert into test values ('18-JAN-2004','01-JAN-2004',1000);
SQL> select * from test;
START_DAT END_DATE        AMNT
--------- --------- ----------
01-JAN-00 01-JAN-01        100
01-JAN-01 01-JAN-02        200
01-JAN-02 01-JAN-03        200
01-JAN-03 01-JAN-04        500
18-JAN-04 01-JAN-04       1000
select * FROM test
  2  WHERE (start_date, end_date) IN (SELECT max(start_date), max(END_DATE) FROM test
  3  where add_months(start_date,12)-END_DATE = 0);
START_DAT END_DATE        AMNT
--------- --------- ----------
01-JAN-03 01-JAN-04        500
i.e., I want to travel backwards and get the amount/record where the difference between the dates is 1 year.
Regards,
 
February  12, 2007 - 9:51 am UTC 
 
you want to select specific rows - for that you need a where clause.
analytics are useful to look "across rows", not for getting "specific rows" so much.  
another way for you would be
select * 
  from 
(
select * 
  from test
 where months_between(end_date, start_date) = 12
 order by start_date DESC
)
where rownum = 1;
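Under the same assumption (the target row is the one-year row with the latest start date), a single aggregate with KEEP avoids the inline view and the rownum filter entirely; a sketch:

```sql
-- Pick the amount from the row with the greatest start_date, among the
-- rows whose end date is exactly 12 months after the start date.
-- For the sample data this returns 500.
select max(amnt) keep (dense_rank last order by start_date) amnt
  from test
 where months_between(end_date, start_date) = 12;
```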
 
 
 
To: A Reader
Michel Cadot, February  12, 2007 - 2:41 am UTC
 
 
SQL> select start_date, end_date, amnt
  2  from ( select start_date, end_date, amnt,
  3                max(start_date) over () max_start,
  4                max(end_date) over () max_end
  5         from test
  6         where add_months(start_date,12)-END_DATE = 0
  7        )
  8  where start_date = max_start and end_date = max_end
  9  /
START_DATE  END_DATE          AMNT
----------- ----------- ----------
01-JAN-2003 01-JAN-2004        500
Regards
Michel 
 
 
Is Analytic Query an answer for this?
A Reader, March     01, 2007 - 12:11 am UTC
 
 
Tom,
I really need some help with this. Please Help!!!!
I have a table that looks like this. 
CREATE TABLE T5
   ( COL1 VARCHAR2(26) NOT NULL, 
 COL2 NUMBER(10) NOT NULL, 
 COL3 NUMBER(10) NOT NULL, 
 COL4 VARCHAR2(4) NOT NULL, 
 COL5 VARCHAR2(10)
   )
Insert into T5 values ('2007-01-25 08:22:16',101,1,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,1,'L','XYZ');
Insert into T5 values ('2007-01-25 08:22:16',101,2,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,2,'L','XYZ');
Insert into T5 values ('2007-01-25 08:22:16',101,3,'A','ABC');
Insert into T5 values ('2007-01-25 08:22:16',101,3,'L','XYZ');
COL1   COL2 COL3 COL4 COL5
2007-01-25 08:22:16 101 1 A ABC
2007-01-25 08:22:16 101 1 L XYZ
2007-01-25 08:22:16 101 2 A ABC
2007-01-25 08:22:16 101 2 L XYZ
2007-01-25 08:22:16 101 3 A ABC
2007-01-25 08:22:16 101 3 L XYZ
I need the output like this.
COL1   COL2 COL3 A_GRP L_GRP
2007-01-25 08:22:16 101 1 ABC XYZ
2007-01-25 08:22:16 101 2 ABC XYZ
2007-01-25 08:22:16 101 3 ABC XYZ
i.e. 
For each group of COL1, COL2, COL3 do the following.
For each ~COL5 where COL4='A'~ put the value of ~COL5 where COL4='L'~ right next to it.
I did try the pivot query,
select 
      col1, col2, col3, 
      max(decode(col4,'A',COL5,null)) a,
      max(decode(col4,'L',COL5,null)) b
from t5
group by col1, col2, col3;
COL1   COL2 COL3 A B
2007-01-25 08:22:16 101 2 ABC XYZ
2007-01-25 08:22:16 101 1 ABC XYZ
2007-01-25 08:22:16 101 3 ABC XYZ
But this would not work once I add more columns to the query. 
I was thinking that this could be easily done with Analytic Query but I have not been able to figure it out yet.
Is there another way?
thanks for your help 
March     01, 2007 - 8:30 am UTC 
 
why doesn't it work - it is a typical pivot, works for any number of "columns"
tell us what isn't working first. 
 
 
OK
Kumar, March     05, 2007 - 6:15 am UTC
 
 
Hi Tom,
I have a query which gives the price difference at three levels namely
Product,Pack( A pack is group of similar products) and market(all packs).
DDL,DML and query is below.
SQL> create table pack(ccode varchar2(30),pack_code varchar2(30),prod_code number,
  2  curr_price number,prev_price number,trans_date date)
  3  /
Table created.
SQL> insert into pack values('UK','PK1',100,100.25,140.90,sysdate-6);
SQL> insert into pack values('UK','PK1',101,10.25,40.90,sysdate-6);
SQL> insert into pack values('UK','PK1',100,55,45,SYSDATE-6);
SQL> insert into pack values('UK','PK2',200,22.75,20.75,SYSDATE-5);
SQL> insert into pack values('UK','PK2',200,30.25,27.25,SYSDATE-5);
SQL> insert into pack values('UK','PK2',200,102.50,90.50,SYSDATE-5);
SQL> commit
  2  /
Commit complete.
SQL> select * from pack
  2  /
CCODE                          PACK_CODE                       PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT
------------------------------ ------------------------------ ---------- ---------- ---------- -----
UK                             PK1                                   100     100.25      140.9 27-FEB-07
UK                             PK1                                   101      10.25       40.9 27-FEB-07
UK                             PK1                                   100         55         45 27-FEB-07
UK                             PK2                                   200      22.75      20.75 28-FEB-07
UK                             PK2                                   200      30.25      27.25 28-FEB-07
UK                             PK2                                   200      102.5       90.5 28-FEB-07
6 rows selected.
When I issue the below query the computed columns namely prod_prc_diff and pack_prc_diff
show the same values.
SQL> col ccode for a5
SQL> col pack_code for a5
SQL> select p.*,sum(abs(curr_price-prev_price)) over(partition by pack_code,prod_code,trans_date) as
     prod_prc_diff,sum(abs(curr_price-prev_price)) over(partition by pack_code,trans_date)
     as pack_prc_diff from pack p;
CCODE PACK_  PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT PROD_PRC_DIFF PACK_PRC_DIFF
----- ----- ---------- ---------- ---------- --------- ------------- -------------
UK    PK1          100     100.25      140.9 27-FEB-07         40.65         40.65
UK    PK1          101      10.25       40.9 27-FEB-07         30.65         30.65
UK    PK1          100         55         45 27-FEB-07            10            10
UK    PK2          200      22.75      20.75 28-FEB-07             2             2
UK    PK2          200      30.25      27.25 28-FEB-07             3             3
UK    PK2          200      102.5       90.5 28-FEB-07            12            12
6 rows selected.
But I want the pack_prc_diff column to show two rows as 
pack_prc_diff 
=============
80.3
17
I also need one more column as below which would be sum of 80.3+17
mkt_prc_diff
============
97.3
Please let me know how to arrive at this.
Thanks Tom. 
March     05, 2007 - 1:17 pm UTC 
 
I'm confused, why do our outputs differ given the same inputs
ops$tkyte%ORA10GR2> select p.*,sum(abs(curr_price-prev_price)) over(partition by pack_code,prod_code,trans_date) as
  2  prod_prc_diff,sum(abs(curr_price-prev_price)) over(partition by pack_code,trans_date)
  3  as pack_prc_diff from pack p;
CCODE PACK_  PROD_CODE CURR_PRICE PREV_PRICE TRANS_DAT PROD_PRC_DIFF PACK_PRC_DIFF
----- ----- ---------- ---------- ---------- --------- ------------- -------------
UK    PK1          100     100.25      140.9 27-FEB-07         50.65          81.3
UK    PK1          100         55         45 27-FEB-07         50.65          81.3
UK    PK1          101      10.25       40.9 27-FEB-07         30.65          81.3
UK    PK2          200      102.5       90.5 28-FEB-07            17            17
UK    PK2          200      22.75      20.75 28-FEB-07            17            17
UK    PK2          200      30.25      27.25 28-FEB-07            17            17
 
 
 
Is Analytic Query an answer for this?
A reader, March     05, 2007 - 10:05 am UTC
 
 
<quote>
why doesn't it work - it is a typical pivot, works for any number of "columns"
tell us what isn't working first.
</quote>
The pivot would work if I only added more columns to the max(decode()) list. What I need is to add more columns to the 'group by' list.
drop table t5;
CREATE TABLE T5
(
  COL1  VARCHAR2(26 BYTE)                       NOT NULL,
  COL2  NUMBER(10)                              NOT NULL,
  COL3  NUMBER(10)                              NOT NULL,
  COL4  VARCHAR2(4 BYTE)                        NOT NULL,
  COL5  VARCHAR2(10 BYTE),
  COL6  NUMBER(10),
  COL7  VARCHAR2(10 BYTE)
)
/
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 1, 'A', 'ABC', 
    1, NULL);
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 1, 'L', 'XYZ', 
    2, 'BB');
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 2, 'A', 'ABC', 
    3, NULL);
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 2, 'L', 'XYZ', 
    4, 'DD');
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 3, 'A', 'ABC', 
    5, NULL);
Insert into T5
   (COL1, COL2, COL3, COL4, COL5, 
    COL6, COL7)
 Values
   ('2007-01-25 08:22:16', 101, 3, 'L', 'XYZ', 
    6, 'FF');
COMMIT;
SQL> select 
  2        col1, col2, col3, col6, col7,
  3        max(decode(col4,'A',COL5,null)) a,
  4        max(decode(col4,'L',COL5,null)) b
  5  from T5
  6  group by col1, col2, col3, col6, col7
  7  order by col1, col2, col3
  8  ;
COL1                  COL2  COL3  COL6  COL7  A    B
-------------------   ----  ----  ----  ----  ---  ---
2007-01-25 08:22:16    101     1     1        ABC
2007-01-25 08:22:16    101     1     2  BB         XYZ
2007-01-25 08:22:16    101     2     3        ABC
2007-01-25 08:22:16    101     2     4  DD         XYZ
2007-01-25 08:22:16    101     3     5        ABC
2007-01-25 08:22:16    101     3     6  FF         XYZ
I do realize that the Pivot Query above is working exactly as it is supposed to. 
But I need the output to be like
COL1                 COL2  COL3       COL6  COL7       A       B
------------------- ----- ------- ---------- ---------- ------- -------
2007-01-25 08:22:16   101    1          2  BB        ABC     XYZ
2007-01-25 08:22:16   101    2          4  DD        ABC     XYZ
2007-01-25 08:22:16   101    3          6  FF        ABC     XYZ
And that's why I thought that an Analytic Query could be the answer.
thanks 
 
March     05, 2007 - 2:07 pm UTC 
 
and you would have to tell us the LOGIC BY WHICH YOUR OUTPUT IS COMING FROM
did you want the max(col6), and the max(col7) - eg; you did not mean to group by them.
ops$tkyte%ORA10GR2> select
  2        col1, col2, col3, max(col6), max(col7),
  3        max(decode(col4,'A',COL5,null)) a,
  4        max(decode(col4,'L',COL5,null)) b
  5  from T5
  6  group by col1, col2, col3
  7  order by col1, col2, col3
  8  ;
COL1                             COL2       COL3  MAX(COL6) MAX(COL7)  A          B
-------------------------- ---------- ---------- ---------- ---------- ---------- ----------
2007-01-25 08:22:16               101          1          2 BB         ABC        XYZ
2007-01-25 08:22:16               101          2          4 DD         ABC        XYZ
2007-01-25 08:22:16               101          3          6 FF         ABC        XYZ
 
 
 
Logic you asked for!!!
A Reader, March     05, 2007 - 3:27 pm UTC
 
 
...LOGIC BY WHICH YOUR OUTPUT IS COMING FROM... 
I guess that would help. haha. Here it is...
For the table like T5 which has n (where n is defined) number of keys,
For every row with COL4 as L (call it X),
there may be multiple rows with COL4 as A (call it Y).
Among these matching rows there is a common key (which is a part of n keys above).
Data from Y contains values in multiple columns which need to be joined with columns from X in order to create a ~complete~ row for the purpose of the process.
for example, consider the table below
4KEYS 1key   COL4    COL5
----- ----   ----    ---------
P      1      L       L_VALUES
Q      1      A       A_VALUES
R      1      A       A_VALUES
From the table above, create TWO rows (because there are TWO 'A' type lines) that look like the following.
4KEYS 1key   L          A       
----- ----   ---------  ---------
P      1     L_VALUES   A_VALUES
Q      1     L_VALUES   A_VALUES
Would you use analytics for that or just a simple self join?
thanks 
March     05, 2007 - 4:24 pm UTC 
 
I did not get the 'four keys' bit?
how did you know to keep "Q" but discard "R"
we are missing something here. 
 
 
Logic (Updated!!!)
A Reader, March     05, 2007 - 3:31 pm UTC
 
 
I think the 4KEYS column would have to go...
1key  L          A
----  ---------  ---------
1     L_VALUES   A_VALUES
1     L_VALUES   A_VALUES
 
 
I was wrong!!!
reader, March     06, 2007 - 9:32 am UTC
 
 
Actually I was wrong when I gave this table as an example.
4KEYS  1key  COL4  COL5
-----  ----  ----  ---------
P      1     L     L_VALUES
Q      1     A     A_VALUES
R      1     A     A_VALUES
THE CORRECT EXAMPLE WOULD BE
4KEYS  1key  COL4  COL5
-----  ----  ----  ---------
P      1     L     L_VALUES
P      1     A     A_VALUES
P      1     A     A_VALUES
i.e. the 4KEYS would remain the same for that rowset.
And from this table I would have to create TWO rows (because there are TWO 'A' type lines) that look like the following.
4KEYS  1key  L          A
-----  ----  ---------  ---------
P      1     L_VALUES   A_VALUES
P      1     L_VALUES   A_VALUES
Sorry for the confusion there...
thanks
 
March     06, 2007 - 11:11 am UTC 
 
Now you entirely, utterly and totally have lost me.
start over - use much more TEXT with WORDS than "pictures of inputs and outputs"
I fail to see how you get from your input to your output. 
 
 
Across Rows
A reader, March     08, 2007 - 9:41 pm UTC
 
 
Hi Tom,
In relation to data set below:-
create table test (start_date date, end_date date, amnt number(10)); 
insert into test values ('01-JAN-2000','01-JAN-2001',100); 
insert into test values ('01-JAN-2001','01-JAN-2002',200); 
insert into test values ('01-JAN-2002','01-JAN-2003',200); 
insert into test values ('01-JAN-2003','01-JAN-2004',500); 
insert into test values ('18-JAN-2004','01-JAN-2004',1000); 
SQL> select * from test; 
START_DAT END_DATE    AMNT 
--------- --------- ---------- 
01-JAN-00 01-JAN-01    100 
01-JAN-01 01-JAN-02    200 
01-JAN-02 01-JAN-03    200 
01-JAN-03 01-JAN-04    500 
18-JAN-04 01-JAN-04    1000 
How do I get a record by comparing the dates of two adjacent records (traversing backwards) where the dates changed/are different? i.e., I want to get the second-to-last record, with amount 500, because comparing its dates with those of the adjacent record shows a break.
Regards,
 
March     09, 2007 - 10:45 am UTC 
 
ops$tkyte%ORA10GR2> select *
  2    from (
  3  select start_date,
  4         end_date,
  5         lead(start_date) over (order by start_date) next_start_date,
  6         amnt
  7    from test
  8         )
  9   where end_date <> next_start_date
 10  /
START_DAT END_DATE  NEXT_STAR       AMNT
--------- --------- --------- ----------
01-JAN-03 01-JAN-04 18-JAN-04        500
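The same technique works in the other direction too; a sketch with lag() instead of lead(), which flags the row on the far side of the break (the 18-JAN-04 row, whose start date does not line up with the previous row's end date):

```sql
-- Flag the row whose start_date differs from the prior row's end_date.
select *
  from (select start_date, end_date, amnt,
               lag(end_date) over (order by start_date) prev_end_date
          from test
       )
 where start_date <> prev_end_date;
```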
 
 
 
 
Thanks  a lot!
A reader, March     09, 2007 - 5:43 pm UTC
 
 
 
 
Variation
Juan Velez, March     10, 2007 - 4:15 pm UTC
 
 
This is related to the topic, so I would like to ask:
I have a table with effective-dated data, this is, effective/termination dates
CREATE TABLE EFFECTIVEDATED(ID NUMBER NOT NULL, EFFECTIVEDAY DATE NOT NULL, TERMINATIONDAY DATE NOT NULL, SOMEVALUE NUMBER);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20070301','YYYYMMDD'), TO_DATE('20070601','YYYYMMDD'), 10);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20070901','YYYYMMDD'), TO_DATE('20080601','YYYYMMDD'), 8);
INSERT INTO EFFECTIVEDATED VALUES(1, TO_DATE('20080601','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'), 12);
The data in this example has gaps if we consider the range 20070101 to 20090101. I would like to get those missing segments as part of a query, with a "somevalue" of 0. That is, given start/end dates 20070101/20090101 and the above data, the query should return the following records (assume that the query looks for ID=1):
ID EFFECTIVEDAY TERMINATIONDAY SOMEVALUE
-- ------------ -------------- ---------
1  20070101     20070301        0
1  20070301     20070601       10
1  20070601     20070901        0
1  20070901     20080601        8
1  20080601     20090101       12
I have accomplished this using PL/SQL but I know it can be done using analytics, I just do not know how, and that's what I am hoping you can help me with
Thanks 
March     12, 2007 - 5:38 pm UTC 
 
you know, by doing that - you've utterly handicapped the optimizer.  You should have used NULL, not 99991231
start with this...  it can be solved using this set:
ops$tkyte%ORA10GR2> with data
  2  as
  3  (select 1 l from dual union all select 2 l from dual)
  4  select *
  5    from data d, (select id, effectiveday edt, terminationday tdt, somevalue val,
  6                         lag( terminationday ) over (order by effectiveday ) last_tdt,
  7                         lead( effectiveday ) over (order by effectiveday ) next_edt,
  8                         to_date(:x,'yyyymmdd') x,
  9                         to_date(:y,'yyyymmdd') y
 10                    from effectivedated
 11                                   where id = :z
 12                                   ) e
 13   order by edt, l
 14  /
         L         ID EDT       TDT              VAL LAST_TDT  NEXT_EDT  X         Y
---------- ---------- --------- --------- ---------- --------- --------- --------- ---------
         1          1 01-MAR-07 01-JUN-07         10           01-SEP-07 01-JAN-07 01-JAN-09
         2          1 01-MAR-07 01-JUN-07         10           01-SEP-07 01-JAN-07 01-JAN-09
         1          1 01-SEP-07 01-JUN-08          8 01-JUN-07 01-JUN-08 01-JAN-07 01-JAN-09
         2          1 01-SEP-07 01-JUN-08          8 01-JUN-07 01-JUN-08 01-JAN-07 01-JAN-09
         1          1 01-JUN-08 31-DEC-99         12 01-JUN-08           01-JAN-07 01-JAN-09
         2          1 01-JUN-08 31-DEC-99         12 01-JUN-08           01-JAN-07 01-JAN-09
6 rows selected.
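Building on that doubled row set, one way to finish (a sketch, not Tom's worked answer): emit each real row clipped to the [:x, :y] window, plus a synthesized zero row wherever the previous termination day (or :x for the first row) falls short of the current effective day. It assumes the ranges do not overlap and that the 99991231 sentinel covers any trailing gap; otherwise one more synthesized row would be needed at the end.

```sql
select id,
       case l when 1 then nvl(last_tdt, x) else edt end effectiveday,
       case l when 1 then edt else least(tdt, y) end    terminationday,
       case l when 1 then 0 else val end                somevalue
  from (select id, effectiveday edt, terminationday tdt, somevalue val,
               lag(terminationday) over (order by effectiveday) last_tdt,
               to_date(:x,'yyyymmdd') x,
               to_date(:y,'yyyymmdd') y
          from effectivedated
         where id = :z) e,
       (select 1 l from dual union all select 2 l from dual) d
 where l = 2                                 -- the real row, clipped to the window
    or (l = 1 and nvl(last_tdt, x) < edt)    -- a zero row for the gap before it
 order by 2, l;
```

For the sample data and :x/:y of 20070101/20090101 this produces the five requested rows.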
 
 
 
Analytics to the rescue
Livio, March     12, 2007 - 6:05 am UTC
 
 
I hope my problem can be considered a followup of this thread. Otherwise, I'll try to put a question at a later time.
I have a table like this:
create table test (date_time date, validity number);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('08-01-2007', 'dd-mm-yyyy'), 27336);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('09-01-2007', 'dd-mm-yyyy'), 27340);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27335);
insert into test (date_time, validity)
values (to_date('10-01-2007', 'dd-mm-yyyy'), 27336);
commit;
I'd like to make a report like the following:
date_time       validity      group
----------      --------      -----
08-01-2007      27334         0
08-01-2007      27335         0
08-01-2007      27336         0
09-01-2007      27334         1
09-01-2007      27335         1
09-01-2007      27340         1
10-01-2007      27334         0
10-01-2007      27335         0
10-01-2007      27336         0
In other words, dates with the very same validities belong to the same group (or partition).
Is it possible to figure out a solution using analytics, without writing down procedural code?
Thank you very much in advance. 
March     12, 2007 - 8:01 pm UTC 
 
you'll need to be a tad more precise in your definition of a "group"
what logic did you follow to put the first 3 and last 3 together and what happens if there was a group with more "validities" that covered that group and all......
define, precisely, how to identify a group - in a manner a programmer could produce code from. 
 
 
Analytics to the rescue
Livio, March     13, 2007 - 4:36 am UTC
 
 
Tom,
 I should have used the word "set" instead of "group". In fact, I put the first 3 and the last 3 together because
the set of values s1 = {27334, 27335, 27336} associated with the date 08-01-2007 is the same set of values associated with the date 10-01-2007.
The set s2 = {27334, 27335, 27340} associated with the date 09-01-2007 is not equal to s1, so the date 09-01-2007 belongs to a different set.
If I insert the following rows:
insert into test (date_time, validity)
values (to_date('11-01-2007', 'dd-mm-yyyy'), 27334);
insert into test (date_time, validity)
values (to_date('11-01-2007', 'dd-mm-yyyy'), 27335);
commit;
I would add a third set s3 = {27334, 27335} that, even if a subset of s1 and s2, is not equal to s1 or s2. Rows with date 11-01-2007 should be marked as belonging to a third set, say 2.
date_time    validity    group
----------    --------    -----
08-01-2007    27334      0
08-01-2007    27335      0
08-01-2007    27336      0
09-01-2007    27334      1
09-01-2007    27335      1
09-01-2007    27340      1
10-01-2007    27334      0
10-01-2007    27335      0
10-01-2007    27336      0 
11-01-2007    27334      2
11-01-2007    27335      2
To sum up: set equality is the criterion for grouping rows of table test together. Given two dates d1 and d2, let s1 and s2 be the sets of validities associated with d1 and d2, respectively. Then d1 and d2 belong to the same set (group, partition) if and only if s1 equals s2.
I'd like to solve the above problem without writing down a procedure going through table test to identify all dates with the same set of validities. I thought that an analytic approach might help.
Thank you 
March     13, 2007 - 11:24 am UTC 
 
doubt you would ever come up with an efficient SQL based method for this.   
 
 
SQL Model and Analytics to the rescue   
Frank Zhou, March     13, 2007 - 2:43 pm UTC
 
 
Hi Livio,
        Here is a SQL model clause solution for you.
I was able to put the identical sets into the same group
(but the group number is not exactly the same as yours)
Frank
SQL> SELECT date_time  ,
  2         SUBSTR(fin_str,
  3                 INSTR(fin_str, ',', 1, LEVEL  ) + 1,
  4                 INSTR(fin_str, ',', 1, LEVEL+1) -
  5                 INSTR(fin_str, ',', 1, LEVEL) -1 ) validity,
  6        grp
  7  FROM (
  8  SELECT date_time  , ','||fin_str||',' as fin_str, dense_rank( ) over (order by fin_str) -1 grp
  9  FROM
 10  (
 11          SELECT date_time, validity, validity_str, fin_str
 12          FROM  test
 13          MODEL
 14          PARTITION BY  (date_time)
 15          DIMENSION BY ( row_number() OVER (PARTITION BY date_time ORDER BY validity) rn )
 16          MEASURES
 17         ( 
 18          CAST(NULL AS VARCHAR2(3255)) validity_str,  validity, CAST(NULL AS VARCHAR2(3255)) fin_str
 19         )
 20         RULES
 21        (
 22        validity_str[ANY] ORDER BY  rn  =
 23         CASE WHEN validity[cv() - 1 ] IS NULL
 24                   THEN TO_CHAR(validity[cv()])
 25                   ELSE validity_str[cv()-1]||','|| TO_CHAR(validity[cv()])
 26               END,
 27      fin_str[ANY] ORDER BY  rn  =
 28         CASE WHEN validity[cv() + 1 ] IS NULL
 29                   THEN validity_str[CV()]
 30               END
 31       )
 32  )
 33  WHERE fin_str IS NOT NULL
 34  )
 35  CONNECT BY PRIOR date_time = date_time
 36  AND INSTR (fin_str, ',', 1, LEVEL+1) > 0
 37  AND PRIOR dbms_random.string ('p', 10) IS NOT NULL
 38  order by date_time, validity;
DATE_TIME VALIDITY                  GRP                                         
--------- ------------------ ----------                                         
08-JAN-07 27334                       1                                         
08-JAN-07 27335                       1                                         
08-JAN-07 27336                       1                                         
09-JAN-07 27334                       2                                         
09-JAN-07 27335                       2                                         
09-JAN-07 27340                       2                                         
10-JAN-07 27334                       1                                         
10-JAN-07 27335                       1                                         
10-JAN-07 27336                       1                                         
11-JAN-07 27334                       0                                         
11-JAN-07 27335                       0                                         
11 rows selected.
SQL> spool off;
 
 
sys_connect_by_path and Analytics to the rescue   
Frank Zhou, March     13, 2007 - 3:19 pm UTC
 
 
    In addition to my 10G SQL Model Clause solution above, 
Here is a 9I pure SQL solution.
Frank
SQL> SELECT date_time  ,
  2         SUBSTR(fin_str,
  3                INSTR(fin_str, ',', 1, LEVEL  ) + 1,
  4                INSTR(fin_str, ',', 1, LEVEL+1) -
  5                INSTR(fin_str, ',', 1, LEVEL) -1 ) validity,
  6       grp
  7  FROM
  8  (SELECT date_time,fin_str||',' as fin_str,
  9          dense_rank( ) over (order by fin_str) -1 grp
 10     FROM
 11      (SELECT date_time, validity, fin_str,
 12         MAX(LENGTH(fin_str) - LENGTH(REPLACE(fin_str, ',', '')))
 13           OVER (PARTITION BY date_time) max_path,
 14      LENGTH(fin_str)-LENGTH(REPLACE(fin_str, ',', '')) str_len
 15        FROM (SELECT date_time, validity,
 16               sys_connect_by_path(validity, ',') fin_str
 17          from test
 18          connect by prior date_time = date_time
 19          and prior validity < validity
 20          )
 21    )
 22    WHERE max_path = str_len
 23  )
 24  CONNECT BY PRIOR date_time = date_time
 25  AND INSTR (fin_str, ',', 1, LEVEL+1) > 0
 26  AND PRIOR dbms_random.string ('p', 10) IS NOT NULL
 27  order by date_time, validity;
DATE_TIME VALIDITY                  GRP                                         
--------- ------------------ ----------                                         
08-JAN-07 27334                       1                                         
08-JAN-07 27335                       1                                         
08-JAN-07 27336                       1                                         
09-JAN-07 27334                       2                                         
09-JAN-07 27335                       2                                         
09-JAN-07 27340                       2                                         
10-JAN-07 27334                       1                                         
10-JAN-07 27335                       1                                         
10-JAN-07 27336                       1                                         
11-JAN-07 27334                       0                                         
11-JAN-07 27335                       0                                         
11 rows selected.
SQL> spool off;
 
 
Easier Analytics to the rescue   
Michel Cadot, March     13, 2007 - 3:34 pm UTC
 
 
Using the well-known stragg function defined at: 
http://asktom.oracle.com/pls/asktom/f?p=100:11:2073682287744121::::P11_QUESTION_ID:2196162600402
SQL> with
  2    data as (
  3      select date_time, validity,
  4             stragg(validity) over (partition by date_time) same_date
  5      from (select date_time, validity from test order by validity, date_time)
  6    )
  7  select date_time, validity, 
  8         dense_rank() over (order by same_date) "GROUP"
  9  from data
 10  order by 1, 2
 11  /
DATE_TIME    VALIDITY      GROUP
---------- ---------- ----------
01/08/2007      27334          3
01/08/2007      27335          3
01/08/2007      27336          3
01/09/2007      27334          1
01/09/2007      27335          1
01/09/2007      27340          1
01/10/2007      27334          3
01/10/2007      27335          3
01/10/2007      27336          3
01/11/2007      27334          2
01/11/2007      27335          2
Regards
Michel 
 
March     13, 2007 - 3:54 pm UTC 
 
you'd need to make sure to use a stragg that SORTS for this to work reliably (and probably "distincts" too)
but yes, why didn't I think of a simple pivot....
so perhaps something even as simple as:
ops$tkyte%ORA10GR2> select date_time,
  2         v1, v2, v3, v4, v5, v6, v7, v8
  3    from (
  4  select date_time,
  5         max(decode(rn,1,validity)) v1,
  6         max(decode(rn,2,validity)) v2,
  7         max(decode(rn,3,validity)) v3,
  8         max(decode(rn,4,validity)) v4,
  9         max(decode(rn,5,validity)) v5,
 10         max(decode(rn,6,validity)) v6,
 11         max(decode(rn,7,validity)) v7,
 12         max(decode(rn,8,validity)) v8
 13    from (
 14  select date_time, row_number() over (partition by date_time order by validity) rn, validity
 15    from test
 16         )
 17   group by date_time
 18         )
 19   order by v1, v2, v3, v4, v5, v6, v7, v8
 20  /
DATE_TIME         V1         V2         V3         V4         V5         V6         V7         V8
--------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
08-JAN-07      27334      27335      27336
10-JAN-07      27334      27335      27336
09-JAN-07      27334      27335      27340
would be useful - you can partition by v1, v2, .... vN
(I know, 8 columns = limit, you can raise limit.  stragg = limit as well though, varchar2(4000)) 
 
 
RE: Analytics to the rescue   
Frank Zhou, March     13, 2007 - 4:19 pm UTC
 
 
Hi Tom,
My two SQL solutions above took care of the sorting (the 9i pure SQL solution also handles the distinct). For the SQL model clause solution to handle the distinct, the following SQL pattern can be used:
http://www.jlcomp.demon.co.uk/faq/Exclude_duplicate.html
Thanks,
Frank 
 
Easier Analytics to the rescue  (2)
Michel Cadot, March     13, 2007 - 4:19 pm UTC
 
 
Tom, 
For your first remark, this is why I sorted the result set before applying stragg, but of course it works only if we do not PARALLEL_ENABLE the function.
For pivot, I thought about it, but the problem is that when you "group by date_time" to build the set of values per date, you lose "validity", which is required in the final result. Joining this back to the original test table seems unavoidable, and then we have 2 scans of test instead of 1 with stragg.
Regards
Michel
 
March     13, 2007 - 8:53 pm UTC 
 
not sure that we can 100% rely on the sort for the aggregate to work.... it might, it might not.  there is nothing saying it will (there is a version of stragg that saves the inputs, sorts them and THEN concatenates them...)
group by date_time - gives you a set of rows you can now PARTITION BY v1, v2, v..... on
 
 
 
Easier Analytics to the rescue (3)
Michel Cadot, March     14, 2007 - 3:03 am UTC
 
 
You mean grouping, ranking, degrouping; something like:
SQL> with 
  2    data as (
  3      select date_time, validity,
  4             row_number() over (partition by date_time order by validity) rn
  5      from test
  6    ),
  7    sets as (
  8      select date_time, 
  9             max(decode(rn,1,validity)) v1,
 10             max(decode(rn,2,validity)) v2,
 11             max(decode(rn,3,validity)) v3,
 12             max(decode(rn,4,validity)) v4,
 13             max(decode(rn,5,validity)) v5,
 14             max(decode(rn,6,validity)) v6,
 15             max(decode(rn,7,validity)) v7,
 16             max(decode(rn,8,validity)) v8,
 17             max(decode(rn,9,validity)) v9
 18      from data
 19      group by date_time
 20    ),
 21    groups as (
 22      select date_time, v1, v2, v3, v4, v5, v6, v7, v8, v9,
 23             dense_rank() over (order by v1, v2, v3, v4, v5, v6, v7, v8, v9) grp
 24      from sets
 25    ),
 26    lines as ( select rownum line from dual connect by level <= 9 )
 27  select date_time, 
 28         decode(line,1,v1,2,v2,3,v3,4,v4,5,v5,6,v6,7,v7,8,v8,9,v9) validity,
 29         grp "GROUP"
 30  from groups, lines
 31  where decode(line,1,v1,2,v2,3,v3,4,v4,5,v5,6,v6,7,v7,8,v8,9,v9)
 32          is not null
 33  order by 1, 2
 34  /
DATE_TIME    VALIDITY      GROUP
---------- ---------- ----------
01/08/2007      27334          1
01/08/2007      27335          1
01/08/2007      27336          1
01/09/2007      27334          2
01/09/2007      27335          2
01/09/2007      27340          2
01/10/2007      27334          1
01/10/2007      27335          1
01/10/2007      27336          1
01/11/2007      27334          3
01/11/2007      27335          3
I noticed that you seldom use the "with" clause and prefer inline views. Is this just your way of writing SQL or is there an underlying reason?
Regards
Michel 
 
March     14, 2007 - 7:52 am UTC 
 
i would not bother ungrouping again, once the data is inline like that (v1....vn) it is better than good enough :)
it is my way of writing SQL.  I typically start with simple query "Q" - wrap it in parentheses, put the next layer on it.
rather than go "down" the page - my queries tend to explode up and down the page - add a bit on top, add a bit on the bottom...  
 
 
Analytics to the rescue
Livio, March     14, 2007 - 7:59 am UTC
 
 
Hi folks,
thank you all for your help.
Eventually, I figured out a solution based on the fact that a set of integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ... + 2^sn.
Here's my solution:
select date_time, validity, dns_rnk, sum(power(2, dns_rnk)) over (partition by date_time) set_id from (
select date_time, validity, dense_rank() over (order by validity) - 1  dns_rnk
from test
)
order by date_time;
DATE_TIME         VALIDITY    DNS_RNK   SET_ID
--------- ---------- ---------- ----------
08-GEN-07      27334          0          7
08-GEN-07      27335          1          7
08-GEN-07      27336          2          7
09-GEN-07      27334          0         11
09-GEN-07      27335          1         11
09-GEN-07      27340          3         11
10-GEN-07      27334          0          7
10-GEN-07      27335          1          7
10-GEN-07      27336          2          7
11-GEN-07      27334          0          3
11-GEN-07      27335          1          3
I use dense_rank() in the inner query to "normalize" validities and avoid a numeric overflow error while computing the sum of the powers of 2.
Yet, should I have a large number of validities for each date, the value of set_id can become very large (something like 1.563E+112 in my original table), making two different sets look "the same". Anyway, I think this is just a matter of formatting the output.
I will compare the proposed solutions with the one I devised and apply the one that best suits my case.
Thanks again 
Thanks again 
 
Analytics to the rescue   
Michel Cadot, March     14, 2007 - 8:56 am UTC
 
 
Hi Livio,
Nicely thought out.
Regards
Michel
 
 
To Livio ... constraints matter
Gabe, March     14, 2007 - 11:35 pm UTC
 
 
<quote>a set of integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ...+ 2^sn</quote>
Your table definition does not have a unique constraint; the potential need to "distinct" has been mentioned and you have not clarified if duplicates can exist (and how it would impact the definition of "set equality").
If they do, your approach will give you false grouping since, for any n, Power(2,n) + power(2,n) = power(2,n+1) ... and it is just one of the combinations that would give you false positives. 
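A quick arithmetic check of that collision (any Oracle version; the values 3 and 4 are just illustrative):

```sql
-- duplicates collide with the next power: the multiset {3,3}
-- encodes to the same number as the set {4}
select power(2,3) + power(2,3) dup_pair,
       power(2,4)              single_value
  from dual;
-- both columns return 16, so the two "sets" would get the same set_id
```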
 
 
Re: To Livio ... constraints matter
Livio, March     15, 2007 - 8:07 am UTC
 
 
Hi Gabe;
thanks for your remark.
You're right: I did not clarify whether duplicate values are possible. Actually, I don't have to cope with duplicates. My observation should have been written more precisely as
<quote>a set of distinct integers s = {s1, s2, ... sn} can be thought of as the number 2^s1 + 2^s2 + ... + 2^sn</quote>
In fact, columns date_time and validity make up the PK of table test.
Regards
Livio 
 
Analytics to the rescue... continue
Michel Cadot, March     15, 2007 - 8:46 am UTC
 
 
Livio,
You can map your big numbers to smaller ones by applying the dense_rank function on the set_id:
SQL> with 
  2    data as (
  3      select date_time, validity, 
  4             dense_rank() over (order by validity) - 1 val_rk
  5      from test
  6    ),
  7    groups as (
  8      select date_time, validity,
  9             sum(power(2,val_rk)) over (partition by date_time) set_id
 10      from data
 11    )
 12  select date_time, validity,
 13         dense_rank () over (order by set_id) - 1 "GROUP"
 14  from groups
 15  order by 1, 2
 16  /
DATE_TIME    VALIDITY      GROUP
---------- ---------- ----------
01/08/2007      27334          1
01/08/2007      27335          1
01/08/2007      27336          1
01/09/2007      27334          2
01/09/2007      27335          2
01/09/2007      27340          2
01/10/2007      27334          1
01/10/2007      27335          1
01/10/2007      27336          1
01/11/2007      27334          0
01/11/2007      27335          0
Regards
Michel 
 
 
Re: Analytics to the rescue... continue
Livio, March     16, 2007 - 5:33 am UTC
 
 
Hi Michel;
you're right. One more analytic makes the output more intelligible.
Thanks for your suggestion
Best Regards
Livio 
 
find the ranges
vincenzo, March     23, 2007 - 1:18 pm UTC
 
 
Hi Tom,
I have a big table with about 500 million rows:
  
drop table big_table;
create table big_table 
(serial_number number,
product_code char(10),
constraint pk_bt primary key (serial_number)
);
insert into big_table values (1,'product a');
insert into big_table values (2,'product a');
insert into big_table values (3,'product a');
insert into big_table values (6,'product a');
insert into big_table values (7,'product a');
insert into big_table values (10,'product b');
insert into big_table values (11,'product b');
insert into big_table values (12,'product b');
insert into big_table values (13,'product a');
SQL> select * from big_table;
SERIAL_NUMBER PRODUCT_CO
------------- ----------
            1 product a
            2 product a
            3 product a
            6 product a
            7 product a
           10 product b
           11 product b
           12 product b
           13 product a
I need to aggregate the serial_number by ranges and product_code.
The output should look like this:
FIRST_SERIAL LAST_SERIAL PRODUCT_CODE
------------- ------------- ----------
            1             3 product a   
            6             7 product a   
           10            12 product b
           13            13 product a
 
      How can this be done using analytic functions?
Thanks 
March     26, 2007 - 7:04 am UTC 
 
ops$tkyte%ORA10GR2> select product_code, min(serial_number), max(serial_number)
  2    from (
  3  select product_code, serial_number,
  4         max(grp) over (partition by product_code order by serial_number) maxgrp
  5    from (
  6  select product_code, serial_number,
  7         case when lag(serial_number)
  8                   over (partition by product_code
  9                         order by serial_number) <> serial_number-1
 10                   or
 11                   row_number()
 12                   over (partition by product_code
 13                         order by serial_number) =1
 14              then row_number()
 15                   over (partition by product_code
 16                         order by serial_number)
 17              end grp
 18    from big_table
 19         )
 20         )
 21   group by product_code, maxgrp
 22   order by product_code, maxgrp
 23  /
PRODUCT_CO MIN(SERIAL_NUMBER) MAX(SERIAL_NUMBER)
---------- ------------------ ------------------
product a                   1                  3
product a                   6                  7
product a                  13                 13
product b                  10                 12
4 rows selected.
 
 
 
 
To vincenzo 
Michel Cadot, March     24, 2007 - 5:50 pm UTC
 
 
With the assumption that your serial_number is strictly positive (otherwise change the -1 in the nvl function):
SQL> with 
  2    data as (
  3      select serial_number, product_code,
  4             case 
  5             when nvl(lag(serial_number) 
  6                    over (partition by product_code order by serial_number),-1)
  7                  != serial_number-1
  8               then serial_number
  9             end flag
 10      from big_table
 11    ),
 12    grouping as (
 13      select serial_number, product_code,
 14             max(flag) over (partition by product_code order by serial_number) grp
 15      from data
 16    )
 17  select min(serial_number) first_serial,
 18         max(serial_number) last_serial,
 19         max(product_code) product_code
 20  from grouping
 21  group by grp
 22  order by 1
 23  /
FIRST_SERIAL LAST_SERIAL PRODUCT_CO
------------ ----------- ----------
           1           3 product a
           6           7 product a
          10          12 product b
          13          13 product a
4 rows selected.
Regards
Michel 
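For completeness, the classic "difference" trick gives the same islands with a single analytic pass. A sketch, assuming the same big_table with numeric serial numbers as posted:

```sql
-- consecutive serials within one product share a constant
-- (serial_number - row_number) value, which identifies the island
select min(serial_number) first_serial,
       max(serial_number) last_serial,
       product_code
  from (select serial_number, product_code,
               serial_number
               - row_number() over (partition by product_code
                                    order by serial_number) grp
          from big_table)
 group by product_code, grp
 order by 1;
```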
 
 
Dates with overlaps
Sudha Bhagavatula, March     26, 2007 - 9:28 am UTC
 
 
I have a question regarding lag and lead. 
create table t_mbr_enrol ( subr_id varchar2(15),
                           dep_nbr varchar2(2),
                           eff_date date,
                           term_date date);
                              
insert into t_mbr_enrol values('1001','0', TO_DATE('19990101','YYYYMMDD'), TO_DATE('19991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20020101','YYYYMMDD'), TO_DATE('20021231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20030101','YYYYMMDD'), TO_DATE('20031231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20040101','YYYYMMDD'), TO_DATE('20041231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20050101','YYYYMMDD'), TO_DATE('20051231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20060101','YYYYMMDD'), TO_DATE('20061231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','0', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20040701','YYYYMMDD'), TO_DATE('20041231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20050101','YYYYMMDD'), TO_DATE('20051231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20060101','YYYYMMDD'), TO_DATE('20061231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','1', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','2', TO_DATE('20050801','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
insert into t_mbr_enrol values('1001','3', TO_DATE('20070101','YYYYMMDD'), TO_DATE('99991231','YYYYMMDD'));
create table t_mdcr_elig ( subr_id varchar2(15),
                           dep_nbr varchar2(2),
                           eff_date date);  
                              
insert into t_mdcr_elig values('1001','0',TO_DATE('20040301','YYYYMMDD')); 
insert into t_mdcr_elig values('1001','1',TO_DATE('20060801','YYYYMMDD'));  
select * from t_mbr_enrol;
SUBR_ID         DEP_NBR EFF_DATE    TERM_DATE
--------------- ------- ----------- -----------
1001            0       01/01/1999  12/31/1999
1001            0       01/01/2002  12/31/2002
1001            0       01/01/2003  12/31/2003
1001            0       01/01/2004  12/31/2004
1001            0       01/01/2005  12/31/2005
1001            0       01/01/2006  12/31/2006
1001            0       01/01/2007  12/31/9999
1001            1       07/01/2004  12/31/2004
1001            1       01/01/2005  12/31/2005
1001            1       01/01/2006  12/31/2006
1001            1       01/01/2007  12/31/9999
1001            2       08/01/2005  12/31/9999
1001            3       01/01/2007  12/31/9999
13 rows selected
select * from t_mdcr_elig;
SUBR_ID         DEP_NBR EFF_DATE
--------------- ------- -----------
1001            0       03/01/2004
1001            1       08/01/2006
My final output should be like this:
SUBR_ID         DEP_NBR EFF_DATE    TERM_DATE
--------------- ------- ----------- -----------
1001            0       01/01/1999  12/31/1999
1001            0       01/01/2002  12/31/2002
1001            0       01/01/2003  12/31/2003
1001            0       01/01/2004  02/29/2004
1001            0       03/01/2004  12/31/2004
1001            0       01/01/2005  12/31/2005
1001            0       01/01/2006  07/31/2006
1001            0       08/01/2006  12/31/2006
1001            0       01/01/2007  12/31/9999
1001            1       07/01/2004  12/31/2004
1001            1       01/01/2005  12/31/2005
1001            1       01/01/2006  07/31/2006
1001            1       08/01/2006  12/31/2006
1001            1       01/01/2007  12/31/9999
1001            2       08/01/2005  07/31/2006
1001            2       08/01/2006  12/31/9999
1001            3       01/01/2007  12/31/9999
Is there a way of doing this in a single sql query with analytics ?
Thanks a lot
Sudha 
March     26, 2007 - 11:12 am UTC 
 
how's about you explain the logic behind your output - so we can understand what we are looking at instead of having to try to understand what you might have been thinking... 
 
 
Analytics
Sudha Bhagavatula, March     26, 2007 - 12:44 pm UTC
 
 
The table t_mbr_enrol contains data for a member's enrollment into a medical plan. 
The subscriber id is common to all members of the same contract. A contract can have only one primary subscriber, identified by dep_nbr (dependent number) = 0; the rest are dependents of this primary subscriber. This table contains no overlaps, and the date boundaries for each subr_id, dep_nbr are set.
The second table t_mdcr_elig contains the date when each member became eligible for medicare, either on turning 65 years of age or due to some medical condition.
When this happens, the dates in enrollment need to be sliced further to reflect medicare eligibility.
So in the example given earlier, the final output is like this:
SUBR_ID      DEP_NBR EFF_DATE  TERM_DATE 
--------------- ------- ----------- ----------- 
1001        0    01/01/1999 12/31/1999  -- no change
1001        0    01/01/2002 12/31/2002  -- no change 
1001        0    01/01/2003 12/31/2003  -- no change  
1001        0    01/01/2004 02/29/2004  -- last date before medicare for dep nbr = 0 
1001        0    03/01/2004 12/31/2004  -- start of medicare eligibility for subscriber
1001        0    01/01/2005 12/31/2005  -- no change 
1001        0    01/01/2006 07/31/2006  -- last date before medicare eligibilty of a dependent (dep_nbr = 1) 
1001        0    08/01/2006 12/31/2006  -- medicare eligiblity starts for dep nbr = 1 
1001        0    01/01/2007 12/31/9999  -- no change
1001        1    07/01/2004 12/31/2004  -- no change 
1001        1    01/01/2005 12/31/2005  -- no change 
1001        1    01/01/2006 07/31/2006  -- last date before medicare for dep nbr = 1 
1001        1    08/01/2006 12/31/2006  -- medicare eligibility for dep nbr = 1 
1001        1    01/01/2007 12/31/9999  -- no change
1001        2    08/01/2005 07/31/2006  -- dates split for medicare eligibilty for dep nbr = 1
1001        2    08/01/2006 12/31/9999  -- dates split for medicare eligibilty for dep nbr = 1
1001        3    01/01/2007 12/31/9999  -- no change since this member became active after the splits
I need to split the dates for medicare eligibility.
Thanks
Sudha
  
March     26, 2007 - 2:11 pm UTC 
 
that does not really explain the logic at all.  
 
 
MODEL clause ?
Gary, March     27, 2007 - 3:24 am UTC
 
 
The logic appears to be...
DECLARE
  CURSOR c_1 is 
   select subr_id, dep_nbr, eff_date, term_date
   from t_mbr_enrol
   order by 1,2,3;
  cursor c_2 
    (p_subr in varchar2, p_start in date, p_end in date) is 
  select eff_date - 1 eff_date
  from t_mdcr_elig
  where subr_id = p_subr
  and eff_date between p_start and p_end
  order by eff_date;
  v_out_line varchar2(100);
BEGIN
  FOR c_r1 in c_1 LOOP
    FOR c_r2 in c_2 
      (c_r1.subr_id, c_r1.eff_date, c_r1.term_date) LOOP
      dbms_output.put_line(c_r1.subr_id||':'||c_r1.dep_nbr||
        ' From '||c_r1.eff_date||' to '||c_r2.eff_date);
      c_r1.eff_date := c_r2.eff_date+1;
    END LOOP;
    dbms_output.put_line(c_r1.subr_id||':'||c_r1.dep_nbr||
     ' From '||c_r1.eff_date||' to '||c_r1.term_date);
  END LOOP;
END;
/
The following SQL is close, but it needs to generate an extra row whenever the split date is present. If they are on 10G, the MODEL clause would be an option
select t1.subr_id, t1.dep_nbr, 
       t1.eff_date, t1.term_date, t2.eff_date split_date
from t_mbr_enrol t1 left outer join t_mdcr_elig t2
      on (t1.subr_id = t2.subr_id
     and t2.eff_date between t1.eff_date and t1.term_date)
order by t1.subr_id, t1.dep_nbr, t1.eff_date, t2.eff_date;
 
 
how to get this select
A reader, March     27, 2007 - 5:02 am UTC
 
 
Thanks Tom for this web site,
I have the following question
create table t (name varchar2(50), ident varchar2(13))
insert into t values ('Jean','12456');
insert into t values ('Jean','12457');
insert into t values ('Jean','12458');
insert into t values ('Paul','67850');
insert into t values ('Paul','67851');
insert into t values ('Remy','4879');
commit;
I would like to issue a select that will give me:
Jean   12456  -- only once for Jean whatever the ident
Paul   67850  -- only once for Paul whatever the ident
Remy   4879   -- only once for Remy whatever the ident
Thanks a lot 
March     27, 2007 - 9:35 am UTC 
 
select name, min(ident) from t group by name; 
 
 
I have found 
A reader, March     27, 2007 - 5:47 am UTC
 
 
Tom,
I have found the solution to get my desired select
select * from (
select name, ident, row_number() over (partition by name order by ident) seq 
from t
)
where seq = 1
Thanks 
March     27, 2007 - 9:35 am UTC 
 
use the infinitely more straightforward query right above
this is an inappropriate use of analytics. 
 
 
A reader, March     27, 2007 - 10:58 am UTC
 
 
and it is more performant.
Thanks very much Tom 
 
Thanks Gary
Sudha Bhagavatula, March     28, 2007 - 9:50 am UTC
 
 
Thanks Gary for the input. I managed to get an extra row by doing a union instead of an outer join. 
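For anyone following along, here is one hedged sketch of that union approach (assuming the t_mbr_enrol and t_mdcr_elig tables as posted, joining on subr_id only so a split applies family-wide, and assuming at most one eligibility date falls inside any single enrollment row, as in the sample data):

```sql
-- leading fragments: enrollment rows cut the day before a contained split date
select e.subr_id, e.dep_nbr, e.eff_date, m.eff_date - 1 term_date
  from t_mbr_enrol e join t_mdcr_elig m
    on m.subr_id = e.subr_id
   and m.eff_date between e.eff_date + 1 and e.term_date
union all
-- trailing fragments: the remainder, starting at the split date
select e.subr_id, e.dep_nbr, m.eff_date, e.term_date
  from t_mbr_enrol e join t_mdcr_elig m
    on m.subr_id = e.subr_id
   and m.eff_date between e.eff_date + 1 and e.term_date
union all
-- rows that no split date falls into, kept as-is
select e.subr_id, e.dep_nbr, e.eff_date, e.term_date
  from t_mbr_enrol e
 where not exists
       (select null from t_mdcr_elig m
         where m.subr_id = e.subr_id
           and m.eff_date between e.eff_date + 1 and e.term_date)
 order by 1, 2, 3;
```

A row containing two or more eligibility dates would need further slicing, which this sketch does not handle.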
 
pagination 
shay, March     29, 2007 - 3:44 am UTC
 
 
hi tom,
further to pagination result set i have another question:
i have this statement
select *
from (
select
id,type_id,
row_number() over
(order by sortby) rn
from wscportal_mpower.wsc where father_id=1382)
where rn between 1 and 16
order by rn
/
and it returns :
        ID    TYPE_ID         RN
---------- ---------- ----------
   1087809          0          1
     29359          0          2
       113          0          3
   1064393          0          4
    447340          0          5
    462072          0          6
   1031179          0          7
    458014          0          8
   1014710          1          9
    464113          0         10
   1014711          1         11
   1066580          1         12
   1032922          0         13
    827134          0         14
    838371          0         15
    864671          0         16
16 rows selected.
I would also like to look at type_id: as soon as I find two rows with type_id=1, I would like to stop the result set and return the answer - in other words, stop at row 11 (inclusive).
Is that possible?
thanks a lot 
 
March     30, 2007 - 12:02 pm UTC 
 
no table create
no insert intos
no look 
 
 
Help with queries
Jayadevan, March     30, 2007 - 6:30 am UTC
 
 
Tom,
Could you please help with the following issues?
1)
There is a table x with 2 columns
SQL> desc x
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 CUS_ID                                             NUMBER
 CLMAMT                                             NUMBER
SQL> select * from x;
    CUS_ID     CLMAMT
---------- ----------
         1      13008
         3      10500
         2       9306
         6       4500
         4       3002
         5       3000
         7       3000
7 rows selected.
We have to classify the customers into A, B and C depending on whether the running total of their claims reaches 50%, 90% or 100% of the total. If we do it manually, this is what we get in excel:
CusID  ClaimAmt  RunningTotal  Running%       Class  Description
1      13008     13008         28.08532688    A      Top customers adding to 50% of total claims
3      10500     23508         50.75567838    A      Top customers adding to 50% of total claims
2      9306      32814         70.84808705    B      Top customers adding to 90% of total claims
6      4500      37314         80.56395198    B      Top customers adding to 90% of total claims
4      3002      40316         87.04551343    B      Top customers adding to 90% of total claims
5      3000      43316         93.52275671    C      Rest
7      3000      46316         100            C      Rest
How can we get this using a query?
There is something wrong with the way I am using a simple SUM with OVER:
it lumps rows with the same claim amount together and adds them into the running total at the first occurrence.
SELECT cus_id, clmamt,
SUM(clmamt) 
OVER (ORDER BY clmamt desc )  x 
FROM x
1 13008 13008
3 10500 23508
2 9306 32814
6 4500 37314
4 3002 40316
5 3000 46316
7 3000 46316
2) I have a table with 3 columns
tariff, start_date, end_date
There is one record with values
1000 01-Jan-2006 31-Jan-2006
I need to insert into another table records such as 
1000 01-Jan-2006
1000 02-Jan-2006
1000 03-Jan-2006
.....
to
1000 31-Jan-2006
i.e. for each date starting with the start date up to the end date in this table,
insert a record in the target table. Can I do this with one query?     
March     30, 2007 - 1:41 pm UTC 
 
1) order by something "distinct" - else - neither of the claimamts "come first", they arrive "together"
order by clmamt desc, rowid
2) yes...
ops$tkyte%ORA10GR2> select * from t;
         X Y         Z
---------- --------- ---------
      1000 01-JAN-07 10-JAN-07
      2000 15-JAN-07 22-JAN-07
ops$tkyte%ORA10GR2> with data
  2  as
  3  (select level l from dual connect by level <= (select max(z-y+1) from t)
  4  )
  5  select t.*,
  6         t.y+data.l-1
  7    from data, t
  8   where data.l <= t.z-t.y+1
  9   order by x, t.y+data.l-1
 10  /
         X Y         Z         T.Y+DATA.
---------- --------- --------- ---------
      1000 01-JAN-07 10-JAN-07 01-JAN-07
      1000 01-JAN-07 10-JAN-07 02-JAN-07
      1000 01-JAN-07 10-JAN-07 03-JAN-07
      1000 01-JAN-07 10-JAN-07 04-JAN-07
      1000 01-JAN-07 10-JAN-07 05-JAN-07
      1000 01-JAN-07 10-JAN-07 06-JAN-07
      1000 01-JAN-07 10-JAN-07 07-JAN-07
      1000 01-JAN-07 10-JAN-07 08-JAN-07
      1000 01-JAN-07 10-JAN-07 09-JAN-07
      1000 01-JAN-07 10-JAN-07 10-JAN-07
      2000 15-JAN-07 22-JAN-07 15-JAN-07
      2000 15-JAN-07 22-JAN-07 16-JAN-07
      2000 15-JAN-07 22-JAN-07 17-JAN-07
      2000 15-JAN-07 22-JAN-07 18-JAN-07
      2000 15-JAN-07 22-JAN-07 19-JAN-07
      2000 15-JAN-07 22-JAN-07 20-JAN-07
      2000 15-JAN-07 22-JAN-07 21-JAN-07
      2000 15-JAN-07 22-JAN-07 22-JAN-07
18 rows selected.
 
 
 
Great
Jayadevan, March     31, 2007 - 3:27 am UTC
 
 
Tom,
Thanks a lot for the replies. I am able to do the ABC classification by adding a case statement. I have learned more about Oracle from asktom than from books or training sessions.
Well, some Indians say 'Cricket is our religion and Tendulkar is our God'. With India crashing out of the world cup, maybe Indian Oracle developers can switch to 'Oracle is our religion and Tom is our God' :).
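For readers wanting the combined query, a sketch of what that case statement might look like (assuming table x as posted; note the row that crosses each boundary can be argued into either class, so the cut-offs here are one choice, and classify the crossing row into the higher band):

```sql
-- running total with a deterministic tie-break (rowid), then banded by
-- the running percentage of the grand total
select cus_id, clmamt, running,
       round(running_pct, 2) running_pct,
       case when running_pct <= 50 then 'A'
            when running_pct <= 90 then 'B'
            else 'C'
       end abc_class
  from (select cus_id, clmamt,
               sum(clmamt) over (order by clmamt desc, rowid) running,
               100 * sum(clmamt) over (order by clmamt desc, rowid)
                   / sum(clmamt) over () running_pct
          from x)
 order by running;
```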
 
 
about running %
wawan, April     02, 2007 - 4:46 am UTC
 
 
dear Tom,
create table  SCORING
(Name   varchar2(5),
 Amount number(10,2));
insert into SCORING values('AA',200);
insert into SCORING values('BB',400);
insert into SCORING values('CC',-200);
insert into SCORING values('DD',-300);
insert into SCORING values('EE',400);
insert into SCORING values('FF',500);
commit;
compute sum of Amount on report
break on report
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING;
NAME      AMOUNT      PRCTG
----- ---------- ----------
AA           200         .2
BB           400         .4
CC          -200        -.2
DD          -300        -.3
EE           400         .4
FF           500         .5
      ----------
sum         1000
How can I make the result look like this, with the percentage calculated from the sum of the positive amounts only?
(1500 ---> 200/1500, 400/1500, 500/1500)
NAME      AMOUNT      PRCTG
----- ---------- ----------  ----- -------
AA           200         .2    200  .1333  
BB           400         .4    400  .2666
CC          -200        -.2     
DD          -300        -.3     
EE           400         .4    400  .2666
FF           500         .5    500  .3333
      ----------             -----
sum         1000              1500
I added the last two columns (of course) by typing them in.
regards 
April     03, 2007 - 8:35 pm UTC 
 
ops$tkyte%ORA10GR2> select name,Amount ,
  2  ratio_to_report(case when amount>0 then amount end) over() prctg
  3  from SCORING;
NAME                               AMOUNT      PRCTG
------------------------------ ---------- ----------
AA                                    200 .133333333
BB                                    400 .266666667
CC                                   -200
DD                                   -300
EE                                    400 .266666667
FF                                    500 .333333333
6 rows selected.
 
 
 
 
To wawan 
Michel Cadot, April     02, 2007 - 10:59 am UTC
 
 
SQL> select name, amount, 
  2         ratio_to_report(case when amount >= 0 then amount end) over() prctg 
  3  from SCORING; 
NAME      AMOUNT      PRCTG
----- ---------- ----------
AA           200 .133333333
BB           400 .266666667
CC          -200
DD          -300
EE           400 .266666667
FF           500 .333333333
6 rows selected.
Regards
Michel 
 
 
Thanks
wawan, April     02, 2007 - 9:06 pm UTC
 
 
ok. thanks for your query. 
 
To wawan: to get your result
Michel CADOT, April     04, 2007 - 4:30 am UTC
 
 
SQL> select case when grouping(name) = 1 then 'Sum' else name end name, 
  2         sum(amount1) amount1, prctg1, sum(amount2) amount2, prctg2 
  3  from (
  4  select name, amount amount1, 
  5         ratio_to_report(amount) over() prctg1, 
  6         case when amount >= 0 then amount end amount2,
  7         ratio_to_report(case when amount >= 0 then amount end) over() prctg2 
  8  from SCORING
  9  )
 10  group by grouping sets ((),(name,prctg1,prctg2))
 11  order by 1
 12  /
NAME     AMOUNT1     PRCTG1    AMOUNT2     PRCTG2
----- ---------- ---------- ---------- ----------
AA           200         .2        200 .133333333
BB           400         .4        400 .266666667
CC          -200        -.2
DD          -300        -.3
EE           400         .4        400 .266666667
FF           500         .5        500 .333333333
Sum         1000                  1500
7 rows selected.
Regards
Michel 
 
 
Stumped on Analytics!!
Saurabh, April     05, 2007 - 6:43 am UTC
 
 
Hi Tom,
This is in response to the review "Stumped on Analytics" posted by "Dave Thompson" on March 04, 2004.
I know it's too late (ages late) to reply, but the case presented was interesting. Besides, I had a similar (more complex) situation in one of my projects. Unfortunately, I had to do it on Teradata, and procedural code was not allowed as a standard (by the DBAs). So I gave analytics a shot. The objective was finally achieved and the performance was worth the effort spent, but the resulting SQL was too complex to understand and maintain.
For the presented case, I think the SQL below will do.
Cumulative sums (cusums) are used to achieve a looping effect. Analytics rock! A lot of procedural code can be eliminated using analytics.
select p.pay_id, p.payment, q.prem_payment, q.prem_id, 
   p.cusum, q.cusum,
--   p.prev_cusum, q.prev_cusum,
   case when p.cusum <= q.cusum then p.payment else p.cusum - q.cusum end amt_applied
from
(
select 
pay_id,
payment,
sum(payment) over(order by pay_id rows between unbounded preceding and current row) cusum,
sum(payment) over(order by pay_id rows between unbounded preceding and 1 preceding ) prev_cusum
from pay_m pay
) p , 
(
select prem_id, prem_payment, 
sum(prem_payment) over(order by prem_id  rows between unbounded preceding and current row) cusum,
sum(prem_payment) over(order by prem_id  rows between unbounded preceding and 1 preceding ) prev_cusum
from prem ) q
where nvl(p.prev_cusum,0) < q.cusum
and p.cusum  >=  nvl(q.prev_cusum,0)
    PAY_ID    PAYMENT PREM_PAYMENT    PREM_ID   
---------- ---------- ------------ ---------- 
         1         50          100          1   
         2         25          100          1   
         3         50          100          1   
         3         50           50          2   
         4         50           50          2   
         4         50           50          3   
If a grouped output is required, a user-defined aggregation function, such as your STRAGG, can be used.
regards
saurabh 
 
looking at two different prior rows at same time
Kevin Meade, April     09, 2007 - 1:56 pm UTC
 
 
--
-- this continues to be cool stuff but I have a variation on analytics I have not seen yet
-- I have read through the analytic post for hours now
--
-- guys in the office had a problem and wanted an all analytic solution
-- essentially they want to look at two different prior rows at the same time
-- (at least I think that is the best way to describe it)
--
-- 
drop table temp1
/
create table temp1
(
 claim_id number
,transtype varchar2(3)
,dollartype varchar2(3)
,transdate date
,amount number
)
/
insert into temp1 values (1,'TEK','USD',to_date('01','dd'),1100);
insert into temp1 values (1,'REV','USD',to_date('02','dd'),1200);
insert into temp1 values (1,'TEK','USD',to_date('04','dd'),1400);
insert into temp1 values (1,'TEK','USD',to_date('05','dd'),1500);
insert into temp1 values (1,'REV','USD',to_date('06','dd'),1600);
insert into temp1 values (1,'REV','USD',to_date('07','dd'),1700);
insert into temp1 values (1,'TEK','USD',to_date('08','dd'),1800);
insert into temp1 values (1,'TEK','USD',to_date('09','dd'),1900);
insert into temp1 values (1,'TEK','USD',to_date('10','dd'),1950);
insert into temp1 values (1,'REV','USD',to_date('11','dd'),2000);
commit
/
select * from temp1 order by transdate
/
/*
  CLAIM_ID TRA DOL TRANSDATE     AMOUNT
---------- --- --- --------- ----------
         1 TEK USD 01-APR-07       1100
         1 REV USD 02-APR-07       1200
         1 TEK USD 04-APR-07       1400
         1 TEK USD 05-APR-07       1500
         1 REV USD 06-APR-07       1600
         1 REV USD 07-APR-07       1700
         1 TEK USD 08-APR-07       1800
         1 TEK USD 09-APR-07       1900
         1 TEK USD 10-APR-07       1950
         1 REV USD 11-APR-07       2000
10 rows selected.
*/
-- 
-- by claim, for each row, get the amount from the prior REV row
--                         get the amount from the prior TEK row
--
-- using scalars it looks like this
--
select a.*
      ,(
        select amount
        from temp1 b
        where b.claim_id = a.claim_id
        and b.transtype = 'TEK'
        and b.transdate =
           (
            select max(c.transdate)
            from temp1 c
            where c.claim_id = a.claim_id
            and c.transdate < a.transdate
            and c.transtype = 'TEK'
           )
      ) prior_TEK
      ,(
        select amount
        from temp1 b
        where b.claim_id = a.claim_id
        and b.transtype = 'REV'
        and b.transdate =
           (
            select max(c.transdate)
            from temp1 c
            where c.claim_id = a.claim_id
            and c.transdate < a.transdate
            and c.transtype = 'REV'
           )
      ) prior_REV
from temp1 a
order by transdate
/
/*
  CLAIM_ID TRA DOL TRANSDATE     AMOUNT  PRIOR_TEK  PRIOR_REV
---------- --- --- --------- ---------- ---------- ----------
         1 TEK USD 01-APR-07       1100
         1 REV USD 02-APR-07       1200       1100
         1 TEK USD 04-APR-07       1400       1100       1200
         1 TEK USD 05-APR-07       1500       1400       1200
         1 REV USD 06-APR-07       1600       1500       1200
         1 REV USD 07-APR-07       1700       1500       1600
         1 TEK USD 08-APR-07       1800       1500       1700
         1 TEK USD 09-APR-07       1900       1800       1700
         1 TEK USD 10-APR-07       1950       1900       1700
         1 REV USD 11-APR-07       2000       1950       1700
10 rows selected.
*/
--
-- but they wanted a one pass analytic solution
-- I racked my brain for a couple of days and couldn't figure out how to do this with analytics
-- in one pass.  What I need is something like this: (FICTITIOUS WHERE CLAUSE IN THE ANALYTIC)
-- maybe I am just being dense
--
/*
select a.*
      ,lag(amount) over (partition by claim_id order by transdate WHERE TRANSTYPE = 'REV')
      ,lag(amount) over (partition by claim_id order by transdate WHERE TRANSTYPE = 'TEK')
from temp1 a
order by transdate
/
*/
--
-- But of course this WHERE clause is not part of the analytic syntax, so...
-- I gave them the scalars and said if they wanted a one-pass analytic solution then
-- they would need someone smarter than me because I couldn't figure it out
-- to wit they said "what about asking TOM :)", so here I am
--
-- Kevin from CT USA
 
April     09, 2007 - 2:05 pm UTC 
 
first thing that popped into my head was....
ops$tkyte%ORA10GR2> select claim_id, transtype, dollartype, transdate, amount,
  2         lag(last_tek) over (order by transdate) real_last_tek,
  3         lag(last_rev) over (order by transdate) real_last_rev
  4    from (
  5  select claim_id, transtype, dollartype, transdate, amount,
  6         last_value(case when transtype='TEK' then amount end ignore nulls) over (order by transdate) last_tek,
  7         last_value(case when transtype='REV' then amount end ignore nulls) over (order by transdate) last_rev
  8    from temp1
  9         )
 10   order by transdate
 11  /
  CLAIM_ID TRA DOL TRANSDATE     AMOUNT REAL_LAST_TEK REAL_LAST_REV
---------- --- --- --------- ---------- ------------- -------------
         1 TEK USD 01-APR-07       1100
         1 REV USD 02-APR-07       1200          1100
         1 TEK USD 04-APR-07       1400          1100          1200
         1 TEK USD 05-APR-07       1500          1400          1200
         1 REV USD 06-APR-07       1600          1500          1200
         1 REV USD 07-APR-07       1700          1500          1600
         1 TEK USD 08-APR-07       1800          1500          1700
         1 TEK USD 09-APR-07       1900          1800          1700
         1 TEK USD 10-APR-07       1950          1900          1700
         1 REV USD 11-APR-07       2000          1950          1700
10 rows selected. 
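Tom's two-step trick (carry down the last TEK/REV amount with last_value ... ignore nulls, then lag by one row so the current row is excluded) collapses to a single pass when written procedurally. A small Python sketch of the equivalent logic, using the example data, just as an illustration of what the query computes:

```python
def prior_of_type(rows, ttype):
    # emit the carried value *before* updating it: this combines the
    # last_value(... ignore nulls) step and the outer lag() in one pass
    out, last = [], None
    for transtype, amount in rows:
        out.append(last)
        if transtype == ttype:
            last = amount
    return out

rows = [('TEK', 1100), ('REV', 1200), ('TEK', 1400), ('TEK', 1500),
        ('REV', 1600), ('REV', 1700), ('TEK', 1800), ('TEK', 1900),
        ('TEK', 1950), ('REV', 2000)]

prior_tek = prior_of_type(rows, 'TEK')
prior_rev = prior_of_type(rows, 'REV')
```

The two result lists reproduce the REAL_LAST_TEK and REAL_LAST_REV columns shown above (None where the SQL output is blank).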
 
 
carry down on 9i
Scott, April     09, 2007 - 11:41 pm UTC
 
 
G'day Tom,
I've been trying to translate your "carry down" example to my seemingly simple problem, but I cannot seem to get it right.
Given 
create table swtest (a number, b varchar2(2)); 
insert into swtest values (3,'51'); 
insert into swtest values (4,null); 
insert into swtest values (6,'E9'); 
9i> select * from swtest order by a 
         A B 
---------- -- 
         3 51 
         4 
         6 E9 
I want a column C that would return the most recent value of B, eg: 
         A B  MI 
---------- -- -- 
         3 51 51 
         4    51 
         6 E9 E9 
Something like this sounds kinda right in theory, but doesn't return 51 for the 2nd row. 
  1   select a, b,last_value(b) over (order by a rows between unbounded preceding and current row ) 
  2   from swtest 
  3*  order by a 
9i> / 
         A B  LA 
---------- -- -- 
         3 51 51 
         4 
         6 E9 E9 
My colleague successfully used 
last_value(b IGNORE NULLS) over (order by a range between unbounded preceding and current row )
 But I'm on 9i.
Nulls first or any other "order by" fiddling doesn't have an effect on the windowing function.
Any help? 
April     10, 2007 - 10:28 am UTC 
 
ops$tkyte%ORA9IR2> select a, b,
  2         substr(max( case when b is not null then to_char(a,'fm0000000000' ) || b end ) over ( order by a ),11) maxb
  3    from swtest
  4  /
         A B  MAX
---------- -- ---
         3 51 51
         4    51
         6 E9 E9
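The trick in Tom's 9i query is that zero-padding the ordering key makes string order agree with numeric order, so a running MAX of key||value always carries the latest non-null b; the key prefix is then stripped off. A minimal Python sketch of the same idea (illustration only, not part of the original answer):

```python
rows = [(3, '51'), (4, None), (6, 'E9')]

carried, best = [], None
for a, b in rows:
    if b is not None:
        # to_char(a,'fm0000000000') || b : ordering key glued to the value
        best = max(best or '', f"{a:010d}{b}")
    carried.append(best[10:] if best else None)
```

This reproduces the MAXB column: 51, 51, E9.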
 
 
 
 
Saurabh, April     10, 2007 - 12:39 am UTC
 
 
select a, b, 
max(decode(a,pm, b,null)) over(order by a rows between unbounded preceding and current row)
from (
select a, b,
max(decode(b,null,null,a)) over(order by a rows between unbounded preceding and current row) pm
from swtest 
) x
order by a
         A B  MA
---------- -- --
         3 51 51
         4    51
         5    51
         6 E9 E9
  
 
 
close...
Scott, April     10, 2007 - 1:26 am UTC
 
 
Ahh, but if you add
insert into swtest values (7,'60'); 
insert into swtest values (8,NULL);
 It gives the wrong answer for my problem: it relies on the ordering of the character set.
I have managed this:
select a, first_value(b) over (partition by r) b 
from (
   select a, b, max(r) over (order by a) r 
   from (
      select a, b, case when b is not null then row_number() over (order by a)  end r
      from swtest
   )
)
order by a;
But I was hoping for something with fewer inline views, as of course it's ultimately part of a larger query. 
 
Most elegant... that's going in the 9i notebook.
Scott, April     10, 2007 - 7:56 pm UTC
 
 
 
 
thanks muchly again
Kevin, April     11, 2007 - 4:42 pm UTC
 
 
thanks very much for your timely response.  I'll look it over in depth.
I am amazed every time I ask you something, how easily you seem to grasp my problem, and how hard it is for me to grasp your solution.  hmm.. all of a sudden I am thinking Jethro Tull: you are the wise man and I feel "thick as a brick".
Kevin 
 
analytic function in union query
wawan, April     12, 2007 - 12:49 am UTC
 
 
dear Tom , dear All
as per the subject,
create table SCORING1 
(Name  varchar2(5), 
Amount number(10,2)); 
create table SCORING2 
(Name  varchar2(5), 
Amount number(10,2)); 
insert into SCORING1 values('AA',200); 
insert into SCORING1 values('BB',400); 
insert into SCORING1 values('CC',-200); 
insert into SCORING2 values('DD',-300); 
insert into SCORING2 values('EE',400); 
insert into SCORING2 values('FF',500); 
commit; 
compute sum of Amount on report 
break on report 
select name,Amount , 
ratio_to_report(amount) over() prctg 
from SCORING1
/
NAME      AMOUNT      PRCTG
----- ---------- ----------
AA           200         .5
BB           400          1
CC          -200        -.5
      ----------
sum          400
select name,Amount , 
ratio_to_report(amount) over() prctg 
from SCORING2
/
NAME      AMOUNT      PRCTG
----- ---------- ----------
DD          -300        -.5
EE           400 .666666667
FF           500 .833333333
      ----------
sum          600
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING1
union
select name,Amount ,
ratio_to_report(amount) over() prctg
from SCORING2
/
NAME      AMOUNT      PRCTG
----- ---------- ----------
AA           200         .5
BB           400          1
CC          -200        -.5
DD          -300        -.5
EE           400 .666666667
FF           500 .833333333
      ----------
sum         1000
what is the correct query so that the % is
calculated from 1000 for each row? 
April     13, 2007 - 11:41 am UTC 
 
ops$tkyte%ORA9IR2> compute sum of amount on report
ops$tkyte%ORA9IR2> break on report
ops$tkyte%ORA9IR2> select name, amount, ratio_to_report(amount) over () prctg
  2    from (select * from scoring1 UNION ALL select * from scoring2)
  3  /
NAME                               AMOUNT      PRCTG
------------------------------ ---------- ----------
AA                                    200         .2
BB                                    400         .4
CC                                   -200        -.2
DD                                   -300        -.3
EE                                    400         .4
FF                                    500         .5
                               ----------
sum                                  1000
6 rows selected. 
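The point of the fix is that ratio_to_report must see the combined row set before it computes, which is why the UNION ALL goes inside an inline view. A quick Python check of the arithmetic (illustration only):

```python
scoring1 = [('AA', 200), ('BB', 400), ('CC', -200)]
scoring2 = [('DD', -300), ('EE', 400), ('FF', 500)]

combined = scoring1 + scoring2            # the UNION ALL inline view
total = sum(amt for _, amt in combined)   # 1000, the report-level sum
prctg = {name: amt / total for name, amt in combined}
```

Each ratio is now amount/1000, matching the .2/.4/-.2/... column above, rather than being computed per table.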
 
 
 
User defined analytic function
Miki, April     16, 2007 - 3:55 pm UTC
 
 
Dear Tom,
I have a difficult query: identify rows that are members of a continuous run of a constant value (for example, 0 traffic for 7 days).
I solved this problem using a couple of analytic functions and subqueries. The idea was to separate the non-0 and the 0 rows and assign a virtual partition number to each row... But I found this method too difficult to understand, and it generated a lot of code.
I rethought my solution and tried to write a user-defined analytic function.
CREATE TABLE "T" 
   ( "TRAFFIC_DATE" DATE, 
 "MC_ID" NUMBER(*,0), 
 "TRAFFIC" NUMBER(*,0)
   );
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('04-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('05-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('06-04-2007', 'dd-mm-yyyy'), 1, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('07-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('08-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('09-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('10-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('11-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('12-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('13-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('14-04-2007', 'dd-mm-yyyy'), 1, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('15-04-2007', 'dd-mm-yyyy'), 1, 3);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('04-04-2007', 'dd-mm-yyyy'), 2, 2);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('05-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('06-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('07-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('08-04-2007', 'dd-mm-yyyy'), 2, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('09-04-2007', 'dd-mm-yyyy'), 2, 100);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('10-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('11-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('12-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('13-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('14-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('15-04-2007', 'dd-mm-yyyy'), 2, 0);
insert into T (TRAFFIC_DATE, MC_ID, TRAFFIC)
values (to_date('16-04-2007', 'dd-mm-yyyy'), 2, 2);
commit;
My desired output:
SELECT a.*
      ,f_normal_day_fg(a.traffic) over(PARTITION BY a.mc_id ORDER BY a.traffic_date ROWS BETWEEN 6 preceding AND 6 following)
  FROM t a;
TRAFFIC_DATE MC_ID TRAFFIC F_NORMAL_DAY_FG(A.TRAFFIC)OVER
2007. 04. 04. 1 0 1
2007. 04. 05. 1 0 1
2007. 04. 06. 1 100 1
2007. 04. 07. 1 0 0
2007. 04. 08. 1 0 0
2007. 04. 09. 1 0 0
2007. 04. 10. 1 0 0
2007. 04. 11. 1 0 0
2007. 04. 12. 1 0 0
2007. 04. 13. 1 0 0
2007. 04. 14. 1 0 0
2007. 04. 15. 1 3 1
2007. 04. 04. 2 2 1
2007. 04. 05. 2 0 1
2007. 04. 06. 2 0 1
2007. 04. 07. 2 0 1
2007. 04. 08. 2 100 1
2007. 04. 09. 2 100 1
2007. 04. 10. 2 0 1
2007. 04. 11. 2 0 1
2007. 04. 12. 2 0 1
2007. 04. 13. 2 0 1
2007. 04. 14. 2 0 1
2007. 04. 15. 2 0 1
2007. 04. 16. 2 2 1
My function:
create or replace type F_Normal_Day as object
 (
    NemNullaForgDB  NUMBER,
    cnt  NUMBER,
    static function
         ODCIAggregateInitialize(sctx IN OUT F_Normal_Day )
         return number,
    member function
         ODCIAggregateIterate(self IN OUT F_Normal_Day ,
                              value IN NUMBER )
         return number,
    member function
         ODCIAggregateTerminate(self IN F_Normal_Day,
                                returnValue OUT  NUMBER,
                                flags IN NUMBER)
         return NUMBER,
    member function
         ODCIAggregateMerge(self IN OUT F_Normal_Day,
                            ctx2 IN F_Normal_Day)
         return NUMBER
 )
;
create or replace type body F_Normal_Day
  is
  static function ODCIAggregateInitialize(sctx IN OUT F_Normal_Day)
  return number
  is
  begin
      sctx := F_Normal_Day( 0,0 );
      return ODCIConst.Success;
  end;
  member function ODCIAggregateIterate(self IN OUT F_Normal_Day,
                                       value IN NUMBER )
  return number
  is
  begin
      IF VALUE=0 THEN
        self.NemNullaForgDB:=self.NemNullaForgDB+1;
      ELSE 
        self.NemNullaForgDB:=0;
      END IF;
      self.cnt:=greatest(SELF.cnt,SELF.NemNullaForgDB);
      
      RETURN ODCIConst.Success;
  end;
  member function ODCIAggregateTerminate(self IN F_Normal_Day,
                                         returnValue OUT NUMBER,
                                         flags IN NUMBER)
  return number
  is
  BEGIN
  
      IF SELF.cnt>=7 THEN
        returnValue := 0;
      ELSE 
        returnValue := 1;
      END IF;
   
--   returnValue:=SELF.cnt;
 --    returnValue:=self.NemNullaForgDB;
      return ODCIConst.Success;
  end;
  member function ODCIAggregateMerge(self IN OUT F_Normal_Day,
                                     ctx2 IN F_Normal_Day)
  return number
  is
  BEGIN
      return ODCIConst.Success;
  end;
  end;
CREATE OR REPLACE FUNCTION f_normal_day_fg  (input NUMBER) RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING f_normal_day;
My question is: how can I (if it is possible) pass the number 7 as a parameter to the F_Normal_Day object? I mean, I'd like to have an analytic function like lag() that takes 2 parameters. I cannot add another parameter to f_normal_day_fg, because it raises an error...
Thanks in advance,
Miki
 
April     17, 2007 - 9:39 am UTC 
 
aggregate functions - which you are building - take precisely one argument.
You could do something like:
SELECT a.*
    ,f_normal_day_fg(a.traffic || ',7' ) over(PARTITION BY a.m
perhaps and substr it out - or use an application context and call dbms_session.set_context to the value prior to calling the function from SQL. 
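Independently of how the 7 gets passed in, the window logic itself is easy to state. A minimal Python sketch (an illustration, not Oracle code), assuming the intent shown in the desired output: a row is flagged 0 when, within its 6-preceding/6-following window, the longest run of consecutive zero-traffic days reaches the threshold:

```python
def normal_day_flags(traffic, preceding=6, following=6, threshold=7):
    # flag = 0 when the window around row i contains a run of at least
    # `threshold` consecutive zero-traffic days; otherwise flag = 1
    flags = []
    for i in range(len(traffic)):
        window = traffic[max(0, i - preceding): i + following + 1]
        longest = run = 0
        for v in window:
            run = run + 1 if v == 0 else 0
            longest = max(longest, run)
        flags.append(0 if longest >= threshold else 1)
    return flags
```

Run against the per-mc_id traffic series from the test data, this reproduces the desired F_NORMAL_DAY_FG column for both mc_id 1 and mc_id 2.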
 
 
calculate % of sum
wawan, April     17, 2007 - 6:14 am UTC
 
 
drop table scoring1;
drop table scoring2;
create table SCORING1
(Name  varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING2
(Name  varchar2(5),
Location number(2),
Amount number(10,2));
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,-400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-200);
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,-400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-200);
commit;
col prctg1 format 99.99
col prctg2 format 99.99
compute sum of AAA on name
compute sum of BBB on name
compute sum of prctg1 on name
compute sum of prctg2 on name
break on name skip 1
select A.name,A.location,sum(A.amount) AAA,
case when sum(A.amount)>0 then sum(A.amount) end BBB,
ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
 over(partition by A.name) prctg2
from
(
select name,location,amount
from SCORING1
union all
select name,location,amount
from SCORING2
) A
group by A.name, A.location
/
NAME    LOCATION        AAA        BBB PRCTG2
----- ---------- ---------- ---------- ------
AA             1        400        400    .40
               2        600        600    .60
               3       -800
*****            ---------- ---------- ------
sum                     200       1000   1.00
CC             1        400        400    .29
               2       -400
               3       1000       1000    .71
*****            ---------- ---------- ------
sum                    1000       1400   1.00
how do I add a new column which calculates 
PRCTG2 * the sum for each name: .40*200 and .60*200 for AA,
and .29*1000 and .71*1000 for CC?
like this :
NAME    LOCATION        AAA        BBB PRCTG2  NEW_COLUMN
----- ---------- ---------- ---------- ------  ----------
AA             1        400        400    .40      80
               2        600        600    .60     120
               3       -800
*****            ---------- ---------- ------  ----------
sum                     200       1000   1.00     200
CC             1        400        400    .29     290
               2       -400
               3       1000       1000    .71     710
*****            ---------- ---------- ------  ----------
sum                    1000       1400   1.00    1000 
it's too complicated for me :)
thanks before 
April     17, 2007 - 10:13 am UTC 
 
(please discover the CODE button!!! very hard to read)
...
PRCTG2 * sum of each name : .40* 200 and .60*200 for AA 
......
where does 200 come from? 
 
 
calculate query question
wawan, April     17, 2007 - 9:04 pm UTC
 
 
col prctg1 format 99.99 
col prctg2 format 99.99 
compute sum of AAA on name 
compute sum of BBB on name 
compute sum of prctg1 on name 
compute sum of prctg2 on name 
break on name skip 1 
select A.name,A.location,sum(A.amount) AAA, 
case when sum(A.amount)>0 then sum(A.amount) end BBB, 
ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end) 
over(partition by A.name) prctg2 
from 
( 
select name,location,amount 
from SCORING1 
union all 
select name,location,amount 
from SCORING2 
) A 
group by A.name, A.location 
/ 
NAME  LOCATION    AAA        BBB       PRCTG2 
----- ---------- ---------- ---------- ------ 
AA        1      400        400        .40 
          2      600        600        .60 
          3      -800 
*****            ---------- ---------- ------ 
sum              200        1000       1.00 
CC        1      400        400        .29 
          2     -400 
          3     1000        1000       .71 
*****            ---------- ---------- ------ 
sum             1000        1400       1.00 
how do I add a new column which calculates 
PRCTG2 * the sum for each name: .40*200 and .60*200 for AA, 
and .29*1000 and .71*1000 for CC? 
like this : 
NAME  LOCATION    AAA       BBB        PRCTG2 NEW_COLUMN 
----- ---------- ---------- ---------- ------ ---------- 
AA        1      400        400         .40    80 
          2      600        600         .60   120 
          3     -800 
*****            ---------- ---------- ------ ---------- 
sum              200        1000        1.00  200 
CC        1      400         400         .29  290 
          2     -400 
          3     1000        1000         .71  710 
*****            ---------- ---------- ------ ---------- 
sum             1000        1400        1.00  1000 
it's too complicated for me :) 
thanks before 
the 200 comes from the sum of amount for each name:
sum AA is 200
sum CC is 1000
 
April     18, 2007 - 11:56 am UTC 
 
ops$tkyte%ORA10GR2> select name,
  2         location,
  3             aaa,
  4             bbb,
  5             prctg2,
  6             prctg2*sum(aaa) over (partition by name) sum_aaa
  7    from (
  8  select A.name,
  9         A.location,
 10             sum(A.amount) AAA,
 11         case when sum(A.amount)>0 then sum(A.amount) end BBB,
 12         ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
 13                 over(partition by A.name) prctg2
 14  from
 15  (
 16  select name,location,amount
 17  from SCORING1
 18  union all
 19  select name,location,amount
 20  from SCORING2
 21  ) A
 22  group by A.name, A.location
 23         )
 24  /
NAME                             LOCATION        AAA        BBB     PRCTG2    SUM_AAA
------------------------------ ---------- ---------- ---------- ---------- ----------
AA                                      1        400        400         .4         80
AA                                      2        600        600         .6        120
AA                                      3       -800
CC                                      1        400        400 .285714286 285.714286
CC                                      2       -400
CC                                      3       1000       1000 .714285714 714.285714
6 rows selected.
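As a sanity check on the arithmetic, the new column is simply each row's ratio times the partition total of AAA; in Python, for name AA (illustration only):

```python
aa = [400, 600, -800]                # AAA per location for name AA
positive = [v for v in aa if v > 0]  # BBB: only positive sums count
ratios = [v / sum(positive) for v in positive]  # prctg2: .4 and .6
new_col = [r * sum(aa) for r in ratios]         # prctg2 * sum(AAA) over the name
```

Locations 1 and 2 get 80 and 120, i.e. the ratios applied to AA's net total of 200, as in the output above.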
 
 
 
 
asking query
wawan, April     23, 2007 - 2:18 am UTC
 
 
Tom,
drop table scoring1;
drop table scoring2;
drop table scoring3;
create table SCORING1
(Name  varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING2
(Name  varchar2(5),
Location number(2),
Amount number(10,2));
create table SCORING3
(Name  varchar2(5),
Location number(2),
Amount number(10,2));
insert into SCORING1 values('AA',1,-500);
insert into SCORING1 values('AA',2,300);
insert into SCORING1 values('AA',3,400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,200);
insert into SCORING1 values('CC',2,-600);
insert into SCORING1 values('AA',1,200);
insert into SCORING1 values('AA',2,500);
insert into SCORING1 values('AA',3,400);
insert into SCORING1 values('CC',3,500);
insert into SCORING1 values('CC',1,400);
insert into SCORING1 values('CC',2,-200);
insert into SCORING3 values('AA',4,500);
insert into SCORING3 values('AA',5,500);
insert into SCORING3 values('CC',6,600);
insert into SCORING3 values('CC',7,300);
insert into SCORING3 values('CC',8,300);
commit;
col prctg1 format 99.99
col prctg2 format 99.99
compute sum of AAA on name
compute sum of BBB on name
compute sum of prctg1 on name
compute sum of prctg2 on name
break on name skip 1
select name,
        location,
            aaa,
            bbb,
            prctg2,
            prctg2*sum(aaa) over (partition by name) sum_aaaXprctg2
   from (
 select A.name,
        A.location,
            sum(A.amount) AAA,
        case when sum(A.amount)>0 then sum(A.amount) end BBB,
        ratio_to_report(case when sum(A.amount)>0 then sum(A.amount) end)
                over(partition by A.name) prctg2
 from
 (
 select name,location,amount
 from SCORING1
 union all
 select name,location,amount
 from SCORING2
 ) A
 group by A.name, A.location
        )
/
select name,location,ratio_to_report(amount)
 over(partition by name) prctg3 from scoring3
/
Commit complete.
NAME    LOCATION        AAA        BBB PRCTG2 SUM_AAAXPRCTG2
----- ---------- ---------- ---------- ------ --------------
AA             1       -300
               2        800        800    .50            650
               3        800        800    .50            650
*****            ---------- ---------- ------
sum                    1300       1600   1.00
CC             1        600        600    .38            300
               2       -800
               3       1000       1000    .63            500
*****            ---------- ---------- ------
sum                     800       1600   1.00
6 rows selected.
NAME    LOCATION     PRCTG3
----- ---------- ----------
AA             4         .5
               5         .5
CC             6         .5
               7        .25
               8        .25
how do I make a query with a result like this?
NAME  LOCATION    AAA     BBB PRCTG2 SUM_AAA  new_col prctg3
----- -------- ------ ------- ------ -------  ------- ------
AA           1   -300
             2    800     800    .50     650
             3    800     800    .50     650
             4                                 325     .5
             5                                 325     .5
*****          ------ ------- ------ -------   ------
sum              1300    1600   1.00
CC           1    600     600    .38     300
             2   -800
             3   1000    1000    .63     500
             6                                  250    .5
             7                                  125    .25
             8                                  125    .25
*****          ------ ------- ------
sum               800    1600   1.00
the new column is computed from the amount in location 3 * prctg3,
for locations 4, 5 and so on.
regards 
 
Analytics
Vamsi Krishna, April     27, 2007 - 2:31 am UTC
 
 
Tom you are the BEST.
I am exploring the world of Oracle with your invaluable support.
Thanks TOM Keep going........ 
 
Using SQL to convert nulls to the value
ht, May       01, 2007 - 12:37 am UTC
 
 
Hello Tom,
I've searched your site and am not sure if this is the correct place to ask this question so my apologies up front.  I've been stuck on this concept for some time.  All I want to do is to use sql (as opposed to pl/sql) to display (and not store) a row's previous value if it is null.
For instance, in the snippet below, I provide a script that adds the emp.week column.  I then populate the column using PL/SQL.  Then, I query the table for empno=7902.
The resulting output is close to what I want but instead of:
40       7902            8092
41
42       7902            8094
43
44       7902            8096
45
46       7902            8098
I would like:
40       7902            8092
41       7902            8092
42       7902            8094
43       7902            8094
44       7902            8096
45       7902            8096
46       7902            8098
The idea is that I would like to save "space" in my database by not storing rows that, when checked, had the same value as the previous row (the value could also be less than the previous one).  But I would still like to provide a report like the one above to compare empno commissions.  The actual output is just a sample; I'm really storing the growth of extents over time in multiple databases.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Connected.
desc emp
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 EMPNO                                     NOT NULL NUMBER(4)
 ENAME                                              VARCHAR2(10)
 JOB                                                VARCHAR2(9)
 MGR                                                NUMBER(4)
 HIREDATE                                           DATE
 SAL                                                NUMBER(7,2)
 COMM                                               NUMBER(7,2)
 DEPTNO                                             NUMBER(2)
alter table emp disable constraint pk_emp;
Table altered.
alter table emp add week number;
Table altered.
declare
  2  x       number;
  3  y       number;
  4  z       number;
  5  begin
  6  for i in 1 .. 52 loop
  7          for j in (select distinct empno from emp) loop
  8                  -- every other week for some empnos
  9                  y:=mod(j.empno,2);
 10                  x:=mod(i,2);
 11                  if y=0 then
 12                          if x=0 then
 13                                  insert into emp (empno,comm,week)values(j.empno,j.empno+150+i,i);
 14                          end if;
 15                  end if;
 16                  -- every week for some empnos
 17                  z:=mod(j.empno,2);
 18                  if z!=0 then
 19                          insert into emp (empno,comm,week)values(j.empno,j.empno-175-i,i);
 20                  end if;
 21          end loop;
 22  end loop;
 23  end;
 24  /
PL/SQL procedure successfully completed.
--select week,empno,comm from emp where week is not null order by week,empno;
/*
select x.week,x.empno,sum(comm) commission4week from emp x where week is not null
group by x.week,x.empno
order by x.week,x.empno;
*/
select distinct e.week,xyz.empno,xyz.commission4week from emp e ,
  2  (select x.week,x.empno,sum(comm) commission4week from emp x where week is not null
  3  and x.empno=7902
  4  group by x.week,x.empno) xyz
  5  where e.week=xyz.week(+)
  6  order by week,empno;
      WEEK      EMPNO COMMISSION4WEEK
---------- ---------- ---------------
         1
         2       7902            8054
         3
         4       7902            8056
         5
         6       7902            8058
         7
         8       7902            8060
         9
        10       7902            8062
        11
        12       7902            8064
        13
        14       7902            8066
        15
        16       7902            8068
        17
        18       7902            8070
        19
        20       7902            8072
        21
        22       7902            8074
        23
        24       7902            8076
        25
        26       7902            8078
        27
        28       7902            8080
        29
        30       7902            8082
        31
        32       7902            8084
        33
        34       7902            8086
        35
        36       7902            8088
        37
        38       7902            8090
        39
        40       7902            8092
        41
        42       7902            8094
        43
        44       7902            8096
        45
        46       7902            8098
        47
        48       7902            8100
        49
        50       7902            8102
        51
        52       7902            8104
53 rows selected.
desc emp
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 EMPNO                                              NUMBER(4)
 ENAME                                              VARCHAR2(10)
 JOB                                                VARCHAR2(9)
 MGR                                                NUMBER(4)
 HIREDATE                                           DATE
 SAL                                                NUMBER(7,2)
 COMM                                               NUMBER(7,2)
 DEPTNO                                             NUMBER(2)
 WEEK                                               NUMBER
delete emp where week is not null;
468 rows deleted.
alter table emp drop column week;
Table altered.
desc emp
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 EMPNO                                              NUMBER(4)
 ENAME                                              VARCHAR2(10)
 JOB                                                VARCHAR2(9)
 MGR                                                NUMBER(4)
 HIREDATE                                           DATE
 SAL                                                NUMBER(7,2)
 COMM                                               NUMBER(7,2)
 DEPTNO                                             NUMBER(2)
alter table emp enable constraint pk_emp;
Table altered.
exit;
 
May       01, 2007 - 9:52 am UTC 
 
last_value analytic function with the new in 10g option of "ignore nulls" is what you want:
scott%ORA10GR2> select empno, comm, last_value( comm ignore nulls ) over (order by empno) carry_down
  2  from emp
  3  order by empno;
     EMPNO       COMM CARRY_DOWN
---------- ---------- ----------
      7369
      7499        300        300
      7521        500        500
      7566                   500
      7654       1400       1400
      7698                  1400
      7782                  1400
      7788                  1400
      7839                  1400
      7844          0          0
      7876                     0
      7900                     0
      7902                     0
      7934                     0
14 rows selected.
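The carry-down that IGNORE NULLS performs is just "remember the last non-NULL value as you walk the ordered rows." A minimal Python sketch of that logic (data abbreviated from the output above; None stands in for SQL NULL):

```python
def carry_down(rows):
    # rows: (empno, comm) pairs already sorted by empno.
    # Mimics LAST_VALUE(comm IGNORE NULLS) OVER (ORDER BY empno):
    # remember the most recent non-NULL comm and repeat it for NULL rows.
    last_seen = None
    out = []
    for empno, comm in rows:
        if comm is not None:
            last_seen = comm
        out.append((empno, comm, last_seen))
    return out

rows = [(7369, None), (7499, 300), (7521, 500),
        (7566, None), (7654, 1400), (7698, None)]
print(carry_down(rows))
# the 7566 and 7698 rows pick up 500 and 1400, matching the output above
```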
 
 
 
Works great!
ht, May       01, 2007 - 1:42 pm UTC
 
 
Tom,
It's amazing that we can be stuck on a problem for a few days, "give up", submit it to you, go to bed, and wake up with a solution that not only avoids pl/sql but is quicker than the flawed sql I had settled on.
This solution will save me gigabytes of storage since I don't have to store a value if it hasn't changed since the last check.  It also saves storage because I don't have to create a table that would have periodically stored my summarized data (using slower pl/sql) just so queries would run acceptably.
Thank you. 
 
reader
A reader, May       08, 2007 - 4:12 pm UTC
 
 
Using an analytic function, is it possible to match two adjacent rows and print just the columns and values that changed during an update?
For example, if I use CDC (Change Data Capture), it prints out the before and after image of a row that has been updated. Ex:
UU  123 456 789 101 102
UN  123 555 789 101 102
In the above example UU is the before image and UN is the after image. The row has four columns.
This row has been updated, but only one column value has changed, i.e. column2 changed from 456 to 555.
Is there a way to scan the subscriber view to identify the column and value that has changed and print it?
May       11, 2007 - 8:36 am UTC 
 
do you have
a) a 'primary key'
b) a flag that says "this is before, this is after"
if so, 
select pk, 
       decode( old_c1, new_c1, cast( null as <datatype> ), new_c1 ) c1,
       ....
  from (
select pk, 
       max( decode( flag, 'before', c1 ) ) old_c1,
       max( decode( flag, 'after',  c1 ) ) new_c1,
       ....
  from t
 group by pk
       ) 
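The same pivot-then-compare idea can be sketched in plain Python (the four-column row shape and the 'before'/'after' flag values are assumptions borrowed from the example above): collapse each primary key's before/after pair into one record, then blank out every column whose value did not change.

```python
def changed_columns(rows):
    # rows: (pk, flag, col1, col2, ...) tuples, flag in ('before', 'after').
    # Step 1: pivot -- group the before/after images by primary key.
    by_pk = {}
    for pk, flag, *cols in rows:
        by_pk.setdefault(pk, {})[flag] = cols
    # Step 2: compare -- keep the new value only where it differs, like the
    # DECODE(old_c1, new_c1, NULL, new_c1) in the query above; None models NULL.
    return {pk: [n if o != n else None
                 for o, n in zip(pair['before'], pair['after'])]
            for pk, pair in by_pk.items()}

rows = [(123, 'before', 456, 789, 101, 102),
        (123, 'after',  555, 789, 101, 102)]
print(changed_columns(rows))   # only the first column changed, to 555
```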
 
 
Avoiding a full table scan when using outer joins?
ht, May       10, 2007 - 5:45 am UTC
 
 
Tom,
Is it possible to avoid the full table scan occurring below?  After implementing your carry_over technique above, it takes over 5 hours to run this query for a specific instance using my "production data".
I believe the predicate causing the issue is:
 WHERE e.batch_id = xyz.batch_id(+)
Is there a way to get the output desired and prevent a full table scan?
Thanks again.
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 10 02:35:41 2007
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Connected.
SELECT DISTINCT
  2          e.batch_id,
  3          sum(xyz.total)size_mb,
  4          last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
  5  FROM
  6          testeextent e,
  7          (
  8          SELECT
  9                   x.batch_id,
 10                  SUM(x.size_mb) total
 11          FROM
 12                  testeextent x
 13          WHERE
 14                  x.instance_id =7383
 15          GROUP BY
 16                  x.batch_id
 17          )xyz
 18  WHERE e.batch_id = xyz.batch_id(+)
 19  group by e.batch_id
 20  ORDER BY e.batch_id;
  BATCH_ID    SIZE_MB CARRY_OVER
---------- ---------- ----------
         1          4          4
         2         12         12
         3                    12
         4                    12
         5         40         40
         6         60         60
         7                    60
         8         68         68
8 rows selected.
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 137322112
--------------------------------------------------------------------------------
----------
| Id  | Operation             | Name             | Rows  | Bytes | Cost (%CPU)|
Time     |
--------------------------------------------------------------------------------
----------
|   0 | SELECT STATEMENT      |                  |     5 |    85 |    17  (12)|
00:00:01 |
|   1 |  WINDOW BUFFER        |                  |     5 |    85 |    17  (12)|
00:00:01 |
|   2 |   SORT GROUP BY       |                  |     5 |    85 |    17  (12)|
00:00:01 |
|*  3 |    HASH JOIN OUTER    |                  |     8 |   136 |    16   (7)|
00:00:01 |
|   4 |     TABLE ACCESS FULL | TESTEEXTENT      |     8 |    16 |    14   (0)|
00:00:01 |
|   5 |     VIEW              |                  |     5 |    75 |     1   (0)|
00:00:01 |
|   6 |      HASH GROUP BY    |                  |     5 |    35 |     1   (0)|
00:00:01 |
|*  7 |       INDEX RANGE SCAN| TESTEEXTENT_IDX1 |     5 |    35 |     1   (0)|
00:00:01 |
--------------------------------------------------------------------------------
----------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
   7 - access("X"."INSTANCE_ID"=7383)
       filter("X"."INSTANCE_ID"=7383)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         23  consistent gets
          0  physical reads
          0  redo size
        783  bytes sent via SQL*Net to client
        469  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          8  rows processed
 
May       11, 2007 - 10:38 am UTC 
 
umm, think about this please..
how would anything short of a full scan be even remotely efficient here?
  5  FROM
  6          testeextent e,
  7          (
  8          SELECT
  9                   x.batch_id,
 10                  SUM(x.size_mb) total
 11          FROM
 12                  testeextent x
 13          WHERE
 14                  x.instance_id =7383
 15          GROUP BY
 16                  x.batch_id
 17          )xyz
 18  WHERE e.batch_id = xyz.batch_id(+)
 19  group by e.batch_id
 20  ORDER BY e.batch_id;
you are getting - by definition - every thing from testeextent - E is retrieved in FULL
tell me, what do you think should happen in this query? 
 
 
Analytic question
A reader, May       11, 2007 - 9:49 am UTC
 
 
Hi Tom,
Thanks for the book Expert One-on-One Oracle, which I've bought. I read Chapter 12 (Analytic Functions) carefully and maybe I've found an error in the explanation (page 557):
----------------------------------
Here, if we look at CLARK again, since we understand........ These are the values of CLARK's salary and the rows preceding (shouldn't that be "following" instead?)
-----------------
Anyway, I have a question
CREATE TABLE T1
(
  TYP_TAR          VARCHAR2(3),
  FAG_SLC          VARCHAR2(1),
  TYP_CAT          VARCHAR2(2),
  PCT_FEE          NUMBER(6,4),
  PCT_RED          NUMBER(6,4),
  AMT_MAX_CND_FEE  NUMBER(38,17),
  AMT_MIN_CND_FEE  NUMBER(38,17),
  DAT_START_CND    DATE,
  DAT_END_CND      DATE,
  SLC_LWR          NUMBER(38,17),
  SLC_UPR          NUMBER(38,17),
  TYP_XTR_FAC      VARCHAR2(3),
  XTR_FAC_VAL      VARCHAR2(250),
  FAG_STD          VARCHAR2(1),
  FAG_DONE         VARCHAR2(1),
  PRC_CTT_BKA_VAL  VARCHAR2(1),
  AMT_MAX_TAR_FEE  NUMBER(38,17),
  FAG_APE          VARCHAR2(1),
  IDE              NUMBER(38),
  TPER_TYP_PDI     VARCHAR2(2)
)
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('02', 'N', '01', 1, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             999999, 25000, '02', 'Br/PB', 'Y', 'N',
             NULL, 999999, 'Y', 1, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('02', 'N', '01', 14.5, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             999999, 25000, '02', 'Br/PB', 'Y', 'N',
             NULL, 999999, 'Y', 1, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('02', 'N', '01', 1, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             25000, 999999, '02', 'Br/PB', 'Y', 'N',
             NULL, 999999, 'Y', 2, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('02', 'N', '02', 0.5, NULL, 999999,
             30,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             25000, 999999, '02', 'HB', 'Y', 'N',
             NULL, 999999, 'Y', 8, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('01', 'N', '08', 0.0011, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             999999, 999999, '###', '######', 'Y', 'N',
             NULL, 999999, 'Y', 20, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('01', 'N', '08', 0.0012, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             999999, 999999, '###', '######', 'Y', 'N',
             NULL, 999999, 'Y', 20, '00'
            );
INSERT INTO t1
            (typ_tar, fag_slc, typ_cat, pct_fee, pct_red, amt_max_cnd_fee,
             amt_min_cnd_fee,
             dat_start_cnd,
             dat_end_cnd,
             slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val, fag_std, fag_done,
             prc_ctt_bka_val, amt_max_tar_fee, fag_ape, ide, tper_typ_pdi
            )
     VALUES ('01', 'N', '08', 0.0009, NULL, 999999,
             40,
             TO_DATE ('01/01/1400 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             TO_DATE ('01/01/2900 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'),
             999999, 999999, '###', '######', 'Y', 'Y',
             NULL, 999999, 'Y', 20, '00'
            );
I would like to select several rows plus the min(pct_fee) as in the below select
select ide  
      ,prc_ctt_bka_val
   ,amt_min_cnd_fee
   ,fag_slc      
   ,dat_start_cnd
   ,dat_end_cnd
   ,fag_ape  
      ,typ_tar
   ,typ_cat
   ,slc_lwr
   ,slc_upr
   ,typ_xtr_fac
   ,xtr_fac_val
   ,min(pct_fee) over( partition by typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val) min_pct_fee
from t1
where fag_done = 'N'
My problem is that I am getting two rows for ide 1 and 20. I would like to get only one row when there exists more than one ide. Distinct is not what I want.
Is it possible to do this using analytics?
Thanks
 
May       11, 2007 - 11:38 am UTC 
 
i don't understand what you are looking for.  
if you want one row by some key - you are looking for GROUP BY, not analytics. 
 
 
Maybe a summary table can be used to prevent the full table scan?
ht, May       11, 2007 - 12:01 pm UTC
 
 
Tom,
Thanks for the info.  I was hoping there was a way to avoid the full table scan the outer join seems to be causing.  The query takes over 5 hours so I was hoping I could somehow tune the query.
I'm going to test retrieving distinct batch_ids into another table to see if that speeds up the query.  That will probably not help though.  Please let me know if you know of a more efficient method to retrieve this data.
Thanks again.
ht 
May       11, 2007 - 12:07 pm UTC 
 
no, look at your query
the outer join has NOTHING AT ALL to do with the full scan
you need each and every row from that table, all of them, each and every single one.
the outer join is NOT causing the full scan.
Your question mandates a full scan.
why do you think, and how do you think, a full scan can be avoided?
could a summary table that grouped up the table by batch_id be used?  sure, but now you have to create it, maintain it.
are you sure you have the right QUERY here?  do you understand what this query you are showing us does?  
remove the outer join, just query testeextent itself once.  Now, you have
select 
from table
group by col
please, explain how you would "avoid a full scan"  
 
 
Avoid scanning 823k recs?
ht, May       11, 2007 - 2:21 pm UTC
 
 
Tom,
You are absolutely correct.  I should not have focused on the full table scan but on the actual query elapsed time.  I guess my question should be "Is there any other way (besides inserting batch_ids into another table when a batch_id is created) to improve the query elapsed time below?".
select count(*) from testextent2;
  COUNT(*)
----------
    823205
1 row selected.
Elapsed: 00:00:00.66
select count(*) from testbatches2;
  COUNT(*)
----------
        51
1 row selected.
Elapsed: 00:00:00.01
set autotrace on;
SELECT DISTINCT
  2          e.batch_id,
  3          sum(xyz.total)size_mb,
  4          last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
  5  FROM
  6          testbatches2 e,
  7          (
  8          SELECT
  9                   x.batch_id,
 10                  SUM(x.size_mb) total
 11          FROM
 12                  testextent2 x
 13          WHERE
 14                  x.instance_id =7383
 15          GROUP BY
 16                  x.batch_id
 17          )xyz
 18  WHERE e.batch_id = xyz.batch_id(+)
 19  group by e.batch_id
 20  ORDER BY e.batch_id;
  BATCH_ID    SIZE_MB CARRY_OVER
---------- ---------- ----------
       800      78545      78545
       820      47232      47232
       840      34587      34587
       860      40894      40894
       880      28310      28310
       900      22759      22759
       920      36445      36445
       921      26179      26179
       940      29027      29027
       941      36768      36768
       960      35733      35733
       980      35377      35377
      1000      25364      25364
      1020      39610      39610
      1040      31137      31137
      1041      35676      35676
      1060      34423      34423
      1061      29731      29731
      1080      26837      26837
      1100      34240      34240
      1120      40619      40619
      1140      18641      18641
      1160      45161      45161
      1180      23378      23378
      1200      17084      17084
      1220      19251      19251
      1240      25816      25816
      1260      17200      17200
      1280      20338      20338
      1300                 20338
      1320      36871      36871
      1340      33504      33504
      1360      33514      33514
      1380      47218      47218
      1400      28894      28894
      1420      34509      34509
      1440      36182      36182
      1460      30872      30872
      1480      34677      34677
      1500      30393      30393
      1520       5841       5841
      1540      10515      10515
      1560      10644      10644
      1561       2503       2503
      1562      23232      23232
      1580      25329      25329
      1581      45908      45908
      1600      36838      36838
      1620      43275      43275
      1640      42380      42380
      1660      37154      37154
51 rows selected.
<b>Elapsed: 00:00:00.06</b>
Execution Plan
----------------------------------------------------------
Plan hash value: 575510713
--------------------------------------------------------------------------------
----------
| Id  | Operation             | Name             | Rows  | Bytes | Cost (%CPU)|
Time     |
--------------------------------------------------------------------------------
----------
|   0 | SELECT STATEMENT      |                  |    51 |  1479 |    53   (4)|
00:00:01 |
|   1 |  WINDOW BUFFER        |                  |    51 |  1479 |    53   (4)|
00:00:01 |
|   2 |   SORT GROUP BY       |                  |    51 |  1479 |    53   (4)|
00:00:01 |
|*  3 |    HASH JOIN OUTER    |                  |    51 |  1479 |    52   (2)|
00:00:01 |
<b>|   4 |     TABLE ACCESS FULL | TESTBATCHES2     |    51 |   663 |     3   (0)|
00:00:01 |</b>
|   5 |     VIEW              |                  |    51 |   816 |    48   (0)|
00:00:01 |
|   6 |      HASH GROUP BY    |                  |    51 |   408 |    48   (0)|
00:00:01 |
|*  7 |       INDEX RANGE SCAN| TESTEXTENT2_IDX1 |  5488 | 43904 |    48   (0)|
00:00:01 |
--------------------------------------------------------------------------------
----------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
   7 - access("X"."INSTANCE_ID"=7383)
       filter("X"."INSTANCE_ID"=7383)
Note
-----
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
         28  recursive calls
          0  db block gets
        123  consistent gets
          1  physical reads
          0  redo size
       2055  bytes sent via SQL*Net to client
        502  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         51  rows processed
set autotrace off;
--------------------------------------------------------------
--------------------------------------------------------------
-- test #1
--------------------------------------------------------------
select count(*) from testextent2;
  COUNT(*)
----------
    823205
1 row selected.
Elapsed: 00:00:00.53
set autotrace on;
SELECT DISTINCT
  2          e.batch_id,
  3          sum(xyz.total)size_mb,
  4          last_value(sum(xyz.total) ignore nulls) over(ORDER BY e.batch_id) carry_over
  5  FROM
  6          testextent2 e,
  7          (
  8          SELECT
  9                   x.batch_id,
 10                  SUM(x.size_mb) total
 11          FROM
 12                  testextent2 x
 13          WHERE
 14                  x.instance_id =7383
 15          GROUP BY
 16                  x.batch_id
 17          )xyz
 18  WHERE e.batch_id = xyz.batch_id(+)
 19  group by e.batch_id
 20  ORDER BY e.batch_id;
  BATCH_ID    SIZE_MB CARRY_OVER
---------- ---------- ----------
       800 6.2573E+10 6.2573E+10
       820   52474752   52474752
       840   40017159   40017159
       860   13290550   13290550
       880    7219050    7219050
       900    4005584    4005584
       920    7216110    7216110
       921    3926850    3926850
       940    9259613    9259613
       941    8493408    8493408
       960    8790318    8790318
       980   35978409   35978409
      1000    7380924    7380924
      1020    9466790    9466790
      1040    6351948    6351948
      1041    7741692    7741692
      1060    8261520    8261520
      1061   10584236   10584236
      1080    3918202    3918202
      1100   25816960   25816960
      1120   19294025   19294025
      1140    4063738    4063738
      1160   25606287   25606287
      1180   27819820   27819820
      1200   11958800   11958800
      1220   17961183   17961183
      1240    7512456    7512456
      1260    8514000    8514000
      1280    4738754    4738754
      1300               4738754
      1320   11872462   11872462
      1340    8778048    8778048
      1360    8646612    8646612
      1380   69032716   69032716
      1400    8148108    8148108
      1420    7453944    7453944
      1440   10963146   10963146
      1460   75018960   75018960
      1480   12206304   12206304
      1500    6230565    6230565
      1520    4485888    4485888
      1540   10262640   10262640
      1560    2182020    2182020
      1561    1056266    1056266
      1562   15542208   15542208
      1580   46605360   46605360
      1581   17307316   17307316
      1600   16282396   16282396
      1620   58377975   58377975
      1640   14027780   14027780
      1660    7802340    7802340
51 rows selected.
<b>Elapsed: 00:00:03.90</b>
Execution Plan
----------------------------------------------------------
Plan hash value: 698401733
--------------------------------------------------------------------------------
-------------------
| Id  | Operation              | Name             | Rows  | Bytes |TempSpc| Cost
 (%CPU)| Time     |
--------------------------------------------------------------------------------
-------------------
|   0 | SELECT STATEMENT       |                  |    51 |   969 |       |  547
1  (12)| 00:01:06 |
|   1 |  WINDOW BUFFER         |                  |    51 |   969 |       |  547
1  (12)| 00:01:06 |
|   2 |   SORT GROUP BY NOSORT |                  |    51 |   969 |       |  547
1  (12)| 00:01:06 |
|   3 |    MERGE JOIN OUTER    |                  |    39M|   712M|       |  547
1  (12)| 00:01:06 |
|   4 |     SORT JOIN          |                  |   823K|  2411K|    18M|  490
8   (3)| 00:00:59 |
<b>|   5 |      TABLE ACCESS FULL | TESTEXTENT2      |   823K|  2411K|       |  262
8   (2)| 00:00:32 |</b>
|*  6 |     SORT JOIN          |                  |    51 |   816 |       |    4
9   (3)| 00:00:01 |
|   7 |      VIEW              |                  |    51 |   816 |       |    4
8   (0)| 00:00:01 |
|   8 |       HASH GROUP BY    |                  |    51 |   408 |       |    4
8   (0)| 00:00:01 |
|*  9 |        INDEX RANGE SCAN| TESTEXTENT2_IDX1 |  5488 | 43904 |       |    4
8   (0)| 00:00:01 |
--------------------------------------------------------------------------------
-------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
       filter("E"."BATCH_ID"="XYZ"."BATCH_ID"(+))
   9 - access("X"."INSTANCE_ID"=7383)
       filter("X"."INSTANCE_ID"=7383)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      13427  consistent gets
          0  physical reads
          0  redo size
       2162  bytes sent via SQL*Net to client
        502  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          3  sorts (memory)
          0  sorts (disk)
         51  rows processed
set autotrace off;
--------------------------------------------------------------
exit;
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
 
May       11, 2007 - 3:41 pm UTC 
 
what is unacceptable here? 
 
 
Takes 5 hours.
ht, May       11, 2007 - 3:57 pm UTC
 
 
Tom,
The scan of 800k+ recs in the second example causes the superset of data to take 5 hours to return.  I'm just wondering if there is a method to avoid the 800k+ rec scan, or if I should just create a table that stores the batch_ids upon batch_id creation.
May       11, 2007 - 4:33 pm UTC 
 
if it takes 5 hours to full scan 800k records, you have a serious problem.
How about a TKPROF of that there query with wait events enabled so we can see what it is doing, how it is doing it, how long it is waiting for what. 
 
 
Analytics to generate multiple types of date dimensions
David, May       11, 2007 - 6:40 pm UTC
 
 
Tom,
I'm trying to write a query that joins
my CONFIG table with ALL_OBJECTS and
produces as many records per config item as
the NUM_RECORDS field indicates:
--Generate dynamic date dimension
create table CONFIG
(
   NAME    VARCHAR2(50),
   NUM_RECORDS  NUMBER
);
insert into CONFIG
values ('TEST_2_rec',2);
insert into CONFIG
values ('TEST_3_rec',3);
insert into CONFIG
values ('TEST_5_rec',5);
e.g. there should be 3 groups of records returned,
one group with 2 records, another with 3, and the last one with 5.
I only came up with lame sql such as this
one, which apparently doesn't work:
col name format a12
SELECT rownum as rnum,name,NUM_RECORDS
from ALL_OBJECTS JOIN CONFIG ON (1=1)
where rownum<= NUM_RECORDS
;
      RNUM NAME         NUM_RECORDS
---------- ------------ -----------
         1 TEST_2_rec             2
         2 TEST_2_rec             2
         3 TEST_3_rec             3
         4 TEST_5_rec             5
         5 TEST_5_rec             5
I tried to produce this with analytics,
but results are too scary to post :)
      RNUM NAME         NUM_RECORDS
---------- ------------ -----------
         1 TEST_2_rec             2
         2 TEST_2_rec             2
         3 TEST_3_rec             3
         4 TEST_3_rec             3
         5 TEST_3_rec             3
                           
         6 TEST_5_rec             5
         7 TEST_5_rec             5
         8 TEST_5_rec             5
         9 TEST_5_rec             5
        10 TEST_5_rec             5
        
Any help or advice would be appreciated :)
Thank you for your time
David
 
May       14, 2007 - 12:42 pm UTC 
 
 
 
To David
martina, May       12, 2007 - 1:35 am UTC
 
 
Hi, is that what you want?
INFO/INFO_PROD> r
  1  select name,count(*)
  2  from ( select c.*
  3         from
  4         mpa_config c
  5         ,
  6         (select level l from dual
  7           connect by level < 10 -- Here you need the maximum num_records
  8         ) d
  9        where c.num_records >= d.l
 10       )
 11* group by name
NAME                                                 COUNT(*)
-------------------------------------------------- ----------
TEST_2_rec                                                  2
TEST_3_rec                                                  3
TEST_5_rec                                                  5
INFO/INFO_PROD>
martina 
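martina's row-generator (CONNECT BY LEVEL against DUAL) is Oracle-specific; the same "join against a generated number list" idea can be sketched portably with a recursive CTE. Here it is in SQLite via Python, a stand-in only, with the table and column names taken from the CONFIG example above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table config(name text, num_records integer)")
con.executemany("insert into config values (?, ?)",
                [("TEST_2_rec", 2), ("TEST_3_rec", 3), ("TEST_5_rec", 5)])

# nums plays the role of CONNECT BY LEVEL: a 1..9 number list
# (the limit must be >= the maximum num_records, as martina notes).
sql = """
with recursive nums(l) as (
    select 1 union all select l + 1 from nums where l < 9
)
select c.name, count(*) as cnt
  from config c join nums d on c.num_records >= d.l
 group by c.name
 order by c.name
"""
print(con.execute(sql).fetchall())
# [('TEST_2_rec', 2), ('TEST_3_rec', 3), ('TEST_5_rec', 5)]
```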
 
A reader, May       12, 2007 - 9:27 am UTC
 
 
HI Tom,
Back to the question above
CREATE TABLE T1 
( 
TYP_TAR   VARCHAR2(3), 
FAG_SLC   VARCHAR2(1), 
TYP_CAT   VARCHAR2(2), 
PCT_FEE   NUMBER(6,4), 
PCT_RED   NUMBER(6,4), 
AMT_MAX_CND_FEE         NUMBER(38,17), 
AMT_MIN_CND_FEE  NUMBER(38,17), 
DAT_START_CND  DATE, 
DAT_END_CND   DATE, 
SLC_LWR   NUMBER(38,17), 
SLC_UPR   NUMBER(38,17), 
TYP_XTR_FAC   VARCHAR2(3), 
XTR_FAC_VAL   VARCHAR2(250), 
FAG_STD   VARCHAR2(1), 
FAG_DONE   VARCHAR2(1), 
PRC_CTT_BKA_VAL  VARCHAR2(1), 
AMT_MAX_TAR_FEE  NUMBER(38,17), 
FAG_APE   VARCHAR2(1), 
IDE    NUMBER(38), 
TPER_TYP_PDI   VARCHAR2(2) 
) 
../..
One of my colleagues wrote the following very non-performant code:
cursor c1 is
select ,prc_ctt_bka_val 
 ,amt_min_cnd_fee 
 ,fag_slc 
 ,dat_start_cnd 
 ,dat_end_cnd 
 ,pct_fee,fag_ape 
 ,typ_tar 
 ,typ_cat 
 ,slc_lwr 
 ,slc_upr 
 ,typ_xtr_fac 
 ,xtr_fac_val
from t1 
where fag_done = 'N' ;
cursor c2 is
select ,prc_ctt_bka_val 
 ,amt_min_cnd_fee 
 ,fag_slc 
 ,dat_start_cnd 
 ,dat_end_cnd 
 ,pct_fee,fag_ape 
 ,typ_tar 
 ,typ_cat 
 ,slc_lwr 
 ,slc_upr 
 ,typ_xtr_fac 
 ,xtr_fac_val
from t1 
where typ_tar  = p_typ_tar
and   typ_cat     = p_typ_cat
and   slc_lwr     = p_slc_lwr
and   slc_upr     = p_slc_upr
and   typ_xtr_fac = p_typ_xtr_fac
and   xtr_fac_val = p_xtr_fac_val
and   fag_done = 'N' 
and   pct_fee = (select min(pct_fee) from t1
                  where typ_tar       = p_typ_tar
                    and   typ_cat     = p_typ_cat
      and   slc_lwr     = p_slc_lwr
      and   slc_upr     = p_slc_upr
      and   typ_xtr_fac = p_typ_xtr_fac
      and   xtr_fac_val = p_xtr_fac_val
      and   fag_done = 'N' );
r2 c2%rowtype;
BEGIN
for r1 in c1 LOOP
  open c2(r1.typ_tar, r1.typ_cat, r1.slc_lwr, r1.slc_upr, r1.typ_xtr_fac, r1.xtr_fac_val);
  fetch c2 into r2;
  if c2%notfound then
      insert into table t2 values coming from cursor r1;
  else
     insert into table t2 values coming from cursor r2;
  end if;
  close c2;
END LOOP;
END;
I would like to replace the above code by this one
 BEGIN
   insert into t2
   select prc_ctt_bka_val 
 ,amt_min_cnd_fee 
 ,fag_slc 
 ,dat_start_cnd 
 ,dat_end_cnd 
 ,pct_fee,fag_ape 
 ,typ_tar 
 ,typ_cat 
 ,slc_lwr 
 ,slc_upr 
 ,typ_xtr_fac 
 ,xtr_fac_val 
       ,min(pct_fee) over( PARTITION BY typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val) min_pct_fee  
from t1 
where fag_done = 'N' ;
END;
1. Knowing that table t1 has no unique constraint (PK or UK) implemented, how can I ensure that the above analytic select gives only one row (the one with the min(pct_fee)) when there are duplicates on (typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val)?
2. Could you please advise something else if you see that we are both wrong?
Thanks very much 
May       14, 2007 - 12:59 pm UTC 
 
if you want a single record by x,y,z,..... you use group by (aggregation), not analytics.
analytics specifically do not "squish out" records; you seem to want one observation per key - hence, you want aggregation.
something like:
ops$tkyte%ORA10GR2> select id,
  2         min(fnm) KEEP (dense_rank first order by rowid) fnm,
  3         min(lnm) KEEP (dense_rank first order by rowid) lnm
  4    from test
  5   group by id;
        ID FNM                            LNM
---------- ------------------------------ ------------------------------
         1 Tom                            Jones
         2 Sue                            Snue
         3 Robert                         Smith
but you group by your big list of columns, not just id 
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:228182900346230020  
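Applied to the poster's t1, that suggestion might look something like this (a hedged sketch - only a few of the non-key columns are shown, the rest follow the same pattern):

```sql
-- Sketch: group by the full key; carry each remaining column from the row
-- with the lowest pct_fee via KEEP (DENSE_RANK FIRST ORDER BY pct_fee).
select typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val,
       min(pct_fee) min_pct_fee,
       min(prc_ctt_bka_val) keep (dense_rank first order by pct_fee) prc_ctt_bka_val,
       min(amt_min_cnd_fee) keep (dense_rank first order by pct_fee) amt_min_cnd_fee,
       min(fag_slc)         keep (dense_rank first order by pct_fee) fag_slc
  from t1
 where fag_done = 'N'
 group by typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac, xtr_fac_val;
```

If several rows tie on the minimum pct_fee, KEEP (DENSE_RANK FIRST) still collapses them to one value per column via the outer MIN.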
 
 
martina, May       13, 2007 - 1:57 am UTC
 
 
hi, 
i think the 2 cursors you show insert all rows of t1 into t2, fetching them in a complicated way. if you only wanted the c2 values, c2 alone would be sufficient? 
This might be what you are looking for:
insert into t2
select .....
from
( select d.*
        ,row_number() over (partition by  typ_tar, typ_cat, slc_lwr, slc_upr, typ_xtr_fac,xtr_fac_val
             order by pct_fee) rd
              from t1 d ) z
WHERE RD = 1
;
if it is not, please clarify WHAT you want to achieve, not HOW you want to achieve it.
regards, martina 
 
Thank you 
David, May       14, 2007 - 3:34 pm UTC
 
 
Thank you very much to Tom and Martina 
:) 
 
A reader, May       15, 2007 - 5:46 am UTC
 
 
Thanks Tom, Thanks Martina,
What I would like is
select a,
       b,
       c,
       d,
       x,
       y,
       z,       
      min(r)
from t
group by
       x,
       y,
       z
This will not compile because I am also selecting a, b, c and d.
If there were a way to select everything I want using group by (x,y,z), then I would use group by.
Since I've not succeeded in doing that, I am using an analytic function, which allows me to write a select like this:
select a,
       b,
       c,
       d,
       x,
       y,
       z,       
      min(r) over (partition by x,y,z)
from t
 
May       15, 2007 - 4:35 pm UTC 
 
answer is already here:
ops$tkyte%ORA10GR2> select id,
  2         min(fnm) KEEP (dense_rank first order by rowid) fnm,
  3         min(lnm) KEEP (dense_rank first order by rowid) lnm
  4    from test
  5   group by id;
id = x,y,z in your case
you'll use min(a) keep (dense_rank first order by r) a - and likewise for b, c and d
and just min(r) 
 
 
Finding Previous Item within range
Jonty, May       15, 2007 - 5:51 pm UTC
 
 
Hi,
I want to find out the last Item No used in the previous year for each item - the catch is, if it is the same item as the current one, take whichever item came before it.
I have given code below which creates the sample table TB_TRANS, insert statements to populate data, the TB_TRANS_OUT output table, and how the results should look.
Scenario # 
1. for Tran_no 1 - the item is 10390; the latest item used last year was also 10390, which is the same, so go beyond that and take the 2nd last, which is 17365.
2. for Tran_no 2 - same as 1.
3. for Tran_no 3 - the item is 17365; the latest item used last year was 10390, so the previous item no will be 10390.
I hope this gives a good idea of what I am looking for.
I tried using analytic functions but somehow I am not getting the same results. I can get the last item in the previous year, but I am not able to handle scenario 1.
Any help will be highly appreciated.
CREATE TABLE TB_TRANS
(TRAN_DATE DATE,
TRAN_NO NUMBER,
CUST_NO NUMBER,
ITEM_NO NUMBER,
QTY NUMBER)
/
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '07/29/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 6, 101, 15448, 40); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '07/04/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 5, 101, 10390, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '07/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 4, 101, 17365, 20); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '07/01/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 3, 101, 17365, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '05/20/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 2, 101, 10390, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '03/04/2006 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 1, 101, 10390, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '12/24/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 1000, 101, 10390, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '12/14/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 999, 101, 17365, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '12/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 998, 101, 10390, 30); 
INSERT INTO TB_TRANS ( TRAN_DATE, TRAN_NO, CUST_NO, ITEM_NO, QTY ) VALUES ( 
 TO_Date( '11/04/2005 12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), 997, 101, 15448, 30); 
COMMIT;
SELECT * FROM TB_TRANS
ORDER BY tran_date, TRaN_NO
/
CREATE TABLE TB_TRANS_OUT
(TRAN_DATE DATE,
TRAN_NO NUMBER,
CUST_NO NUMBER,
ITEM_NO NUMBER,
QTY NUMBER,
PREV_ITEM_NO NUMBER
)
/
/* EXPECTED RESULT SET */
TRAN_DATE            TRAN_NO CUST_NO ITEM_NO QTY 
11/4/2005            997     101     15448   30  
12/4/2005            998     101     10390   30  
12/14/2005           999     101     17365   30  
12/24/2005           1000    101     10390   30  
3/4/2006             1       101     10390   30  
5/20/2006            2       101     10390   30  
7/1/2006             3       101     17365   30  
7/1/2006             4       101     17365   20  
7/4/2006             5       101     10390   30  
7/29/2006            6       101     15448   40  
SELECT * FROM TB_TRANS_OUT
ORDER BY tran_date, TRaN_NO
/
TRAN_DATE            TRAN_NO CUST_NO ITEM_NO QTY PREV_ITEM_NO 
11/4/2005            997     101     15448   30               
12/4/2005            998     101     10390   30               
12/14/2005           999     101     17365   30               
12/24/2005           1000    101     10390   30               
3/4/2006             1       101     10390   30  17365        
5/20/2006            2       101     10390   30  17365        
7/1/2006             3       101     17365   30  10390        
7/1/2006             4       101     17365   20  10390        
7/4/2006             5       101     10390   30  17365        
7/29/2006            6       101     15448   40  10390 
Thanks a lot in advance.
Jonty 
May       16, 2007 - 10:18 am UTC 
 
give us more details - will this always be for JUST two years?
are the two years INPUTS into the query itself?
analytics will probably not be used here. 
 
 
To: Jonty
Narendra, May       16, 2007 - 6:08 am UTC
 
 
Following is something I could think of:
Assumptions:
1. Your year starts on 1st January and ends on 31st December.
2. CUST_NO remains same for all the rows. (If it changes the query will change a bit).
SQL> select a.tran_date, a.tran_no, a.cust_no, a.item_no, a.qty,
  2  ( select b.item_no from ( select item_no, tran_date from tb_trans order by tran_date desc) b
  3    where b.item_no <> a.item_no and b.tran_date < trunc(a.tran_date, 'YYYY')
  4    and rownum = 1 ) prev_item_no
  5  from tb_trans a
  6  ORDER BY tran_date, TRaN_NO ;
TRAN_DATE      TRAN_NO    CUST_NO    ITEM_NO        QTY PREV_ITEM_NO
----------- ---------- ---------- ---------- ---------- ------------
04-Nov-05          997        101      15448         30 
04-Dec-05          998        101      10390         30 
14-Dec-05          999        101      17365         30 
24-Dec-05         1000        101      10390         30 
04-Mar-06            1        101      10390         30        17365
20-May-06            2        101      10390         30        17365
01-Jul-06            3        101      17365         30        10390
01-Jul-06            4        101      17365         20        10390
04-Jul-06            5        101      10390         30        17365
29-Jul-06            6        101      15448         40        10390
10 rows selected
But I sincerely feel there might be a better approach. 
 
Follw-up 
Jonty, May       16, 2007 - 11:33 am UTC
 
 
Hi,
Thanks for looking into it.
Unfortunately this will not be a good solution.
1. I have to do it on high-volume data (1 to 300 million records), and there will be many more calculations on the same records. I want to avoid using the table twice in the query. I am thinking of using an analytic function instead.
2. Years will vary, but to make it easier let's assume the year is fixed.
3. Customer No. - there will be many more customers; for each customer and item I have to do this.
4.  
May       17, 2007 - 10:46 am UTC 
 
2) no, how about you give the real requirement.
I don't see the opportunity for analytics here.  You would partition by year, but you want last year's stuff in this year - analytics work within a partition - not across partitions
how does customer fit into all of this??  you just mentioned "customer and item"
please be amazingly precise in your specification and if customer is part of the equation, make sure your example includes at least two customers to demonstrate with! 
 
 
Richard, May       16, 2007 - 4:13 pm UTC
 
 
In our data warehouse environment, we are receiving updated sales orders on a nightly basis from a 3rd party system via ODBC.
The process is prebuilt and we cannot modify. The only solutions we have available are to rebuild the entire interface process, or add additional processing on top of the delivered functionality.
Management decided we are better off adding customized processes on top of the delivered product.
We can only add new interfaces for additional data needed from the 3rd party app beyond the scope of the delivered interfaces. 
SQL*Plus: Release 10.1.0.2.0 - Production on Wed May 16 14:25:12 2007
Copyright (c) 1982, 2004, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
SQL> drop table historical_txns;
Table dropped.
SQL> create table historical_txns (
  2    order_date date not null,
  3    order_no   integer not null,
  4    order_line integer not null,
  5    txn_date   date not null,
  6    blocked_flag varchar2(1) not null,
  7    constraint pk_historical_txns
  8      primary key (order_date, order_no, order_line, txn_date));
Table created.
SQL> drop table order_status_hist;
Table dropped.
SQL> create table order_status_hist (
  2    order_date date not null,
  3    order_no   integer not null,
  4    order_line integer not null,
  5    effective_dt date not null,
  6    expired_dt   date default to_date('99991231','yyyymmdd') not null,
  7    constraint ck_eff_exp_dates
  8      check (effective_dt <= expired_dt),
  9    constraint pk_order_status_hist
 10      primary key (order_date, order_no, order_line, effective_dt));
Table created.
SQL> alter session set nls_date_format='yyyymmdd';
Session altered.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070201'),'Y');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070203'),'N');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070210'),'Y');
1 row created.
SQL> insert into historical_txns values (to_date('20070201'),100,1,to_date('20070211'),'Y');
1 row created.
SQL> commit;
Commit complete.
SQL> select * from historical_txns;
ORDER_DA   ORDER_NO ORDER_LINE TXN_DATE B
-------- ---------- ---------- -------- -
20070201        100          1 20070201 Y
20070201        100          1 20070203 N
20070201        100          1 20070210 Y
20070201        100          1 20070211 Y

ASSUMPTIONS:

A given order line may be blocked and released multiple times, depending on item availability, customer credit, etc.
In the order_status_hist table, only one record for a given order line may be "active" at any given time. For example,
given the above data, the following is the expected output:
SQL> select * from order_status_hist;
ORDER_DA   ORDER_NO ORDER_LINE EFFECTIV EXPIRED_
-------- ---------- ---------- -------- --------
20070201        100          1 20070201 20070203
20070201        100          1 20070210 99991231
QUESTION:
To do the initial load of data from historical_txns into order_status_hist, do we need analytic SQL or a PL/SQL procedure?
Note that the transaction dated 11-Feb-2007 did not change the flag.
Going forward, as inserts and updates are processed on the ORDERS table which contains a BLOCKED column with Y or N values, we want to capture changes to the BLOCKED column via a trigger.
If the updated record is blocked and the order was previously not blocked, a new effective date record is added to the order_status_hist table.
If the updated record unblocks a previously blocked order, the currently active record is given the current date as the expired_date.
If the BLOCKED field did not change, do nothing to the history table.
The following trigger should perform the required ongoing maintenance, correct?
CREATE OR REPLACE TRIGGER order_blocked_status_trg
  BEFORE DELETE or INSERT OR UPDATE OF blocked
  ON ORDERS
  FOR EACH ROW
BEGIN
  IF INSERTING
     and :new.blocked='Y'
  THEN
    -- when adding new records that are blocked, also add
    -- records to the blocked status table for tracking
    INSERT INTO order_status_hist (order_date, order_no, order_line, effective_dt)
    VALUES (:new.order_date, :new.order_no, :new.order_line, trunc(SYSDATE));
  ELSIF UPDATING
        AND :OLD.blocked <> :NEW.blocked
  THEN
    -- if the line was previously blocked and is now unblocked,
    -- expire the currently blocked status record
    UPDATE order_status_hist
       SET expired_dt = trunc(sysdate)
     where order_date = :old.order_date
       and order_no = :old.order_no
       and order_line = :old.order_line
       and :old.blocked = 'Y'
       and trunc(sysdate) between effective_dt and expired_dt;
    -- If the line was previously unblocked and is now blocked,
    -- create a blocked status record.
    if :new.blocked = 'Y' then
    INSERT INTO order_status_hist (order_date, order_no, order_line, effective_dt)
    values (:new.order_date, :new.order_no, :new.order_line, trunc(sysdate));
    end if;
  ELSIF DELETING
  then
    -- if the record being deleted has blocked status records,
    -- also delete these blocked status records for data integrity
    delete from order_status_hist
     where order_date = :old.order_date
       and order_no = :old.order_no
       and order_line = :old.order_line;
  END IF;
END order_blocked_status_trg;
/ 
May       17, 2007 - 10:55 am UTC 
 
please, I'm begging you not to use that expired date in the year 9999 - please, use null.  you'll whack out the optimizer's ability to estimate card= values.
I would not use a trigger - hate triggers.
If I can give you a query (using analytics) from the one table to provide you with the desired output, is that sufficient? 
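Such a query might look like this (a hedged sketch against the posted historical_txns data, using NULL instead of the year-9999 date as Tom suggests):

```sql
-- Sketch: keep only the transactions where blocked_flag changes (the start of
-- each run), pair each with the date of the NEXT change via LEAD, then keep
-- the 'Y' rows.  The still-open final period gets a NULL expired_dt.
select order_date, order_no, order_line, effective_dt, expired_dt
  from (select order_date, order_no, order_line,
               txn_date effective_dt, blocked_flag,
               lead(txn_date) over (partition by order_date, order_no, order_line
                                        order by txn_date) expired_dt
          from (select t.*,
                       lag(blocked_flag) over (partition by order_date, order_no, order_line
                                                   order by txn_date) prev_flag
                  from historical_txns t)
         where prev_flag is null or blocked_flag <> prev_flag)
 where blocked_flag = 'Y';
```

On the sample data, the 11-Feb row is dropped by the inner filter (it did not change the flag), leaving the 01-Feb/03-Feb pair and the open 10-Feb period - matching the expected output.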
 
 
Richard, May       17, 2007 - 12:27 pm UTC
 
 
Analytic sql would be the ideal solution for me. I can update the process to use nvl(expired_dt,sysdate) instead of 9999/12/31 easily enough. 
 
TO Richard
Kevin, May       17, 2007 - 12:54 pm UTC
 
 
> I can update the process to use nvl(expired_dt,sysdate) instead of 9999/12/31 easily enough. 
Don't do this:
SYSDATE <= NVL(expired_dt,sysdate)
Instead, use:
(EXPIRED_DT IS NULL OR SYSDATE <= EXPIRED_DT)
Yes, we can see the equivalence, and yes, yours is more "succinct".  But the NVL() on EXPIRED_DT will prevent the CBO from using expired_dt column stats when estimating cardinalities.  In fact, regardless of the distribution of EXPIRED_DT, the CBO will use the default 5% selectivity estimate for your version of the predicate. 
 
GROUP BY order date
A reader, May       21, 2007 - 1:05 pm UTC
 
 
tom, I have a query built out of 3-4 subqueries.
SELECT   date,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
        FROM     (SELECT   TRUNC(date) date,product_id,product_structure, side, COUNT (*) cnt
                  FROM     (SELECT order_id, product_id, side, last_user, date,
                                   NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price,  price,
                                   NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
                                   NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
                            FROM   order  o,user u
                            where   date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY')  AND     date <=  TO_DATE ('16-May-2007','DD-Mon-YYYY')
                            AND    last_user = u.code ) o, products p
                  WHERE    1 = 1
                  AND      product_id = p.pid(+)
                  and      status = 2 
                  and   ( nvl(previous_status,0) <>2 or   price <> nvl(previous_price,0))
                  GROUP BY   TRUNC(date),product_id, product_structure,side
                  )
GROUP BY date,product_id, product_structure
the problem is, if I take the date off the group by clause I get a total of askprice + price = 1656,
and if I put the date clause in (as above) I get 
askprice + price = 1510.
Looking at the query, do you see why this happens? 
May       21, 2007 - 1:27 pm UTC 
 
I don't understand.  If you take date off of the group by - the query no longer FUNCTIONS.
nvl( x, null ) is 'redundant' 
 
 
GROUP BY DATE
A reader, May       21, 2007 - 1:31 pm UTC
 
 
SELECT  product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
    FROM  (SELECT  product_id,product_structure, side, COUNT (*) cnt
            FROM  (SELECT order_id, product_id, side, last_user, date,
                      NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
                      NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
                      NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
                  FROM  order o,user u
                  where  date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND  date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
                  AND  last_user = u.code ) o, products p
            WHERE  1 = 1
            AND    product_id = p.pid(+)
            and    status = 2
            and  ( nvl(previous_status,0) <>2 or  price <> nvl(previous_price,0))
            GROUP BY  product_id, product_structure,side
            )
GROUP BY product_id, product_structure 
but now, if you sum askprice + price is not same as before
(
even though we have
                  where  date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND  date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
                  AND  last_user = u.code ) o, products p
 )
 
May       21, 2007 - 1:55 pm UTC 
 
I'm not understanding it - I don't see any outputs, so I'm not sure what you mean really.
the CODE button is really good for things CODE related (fixed width font) 
 
 
GROUP BY DATE
A reader, May       21, 2007 - 1:32 pm UTC
 
 
by the way, where do you see nvl(x,null)? 
May       21, 2007 - 1:56 pm UTC 
 
NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, 
x = lag(price) over (partition by order_id order by date asc)
nvl( x, null ) previous_price 
 
 
GROUP BY DATE
A reader, May       21, 2007 - 2:17 pm UTC
 
 
code 1:
SELECT  date,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
    FROM  (SELECT  TRUNC(date) date,product_id,product_structure, side, COUNT (*) cnt
            FROM  (SELECT order_id, product_id, side, last_user, date,
                      NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
                      NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
                      NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
                  FROM  order o,user u
                  where  date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND  date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
                  AND  last_user = u.code ) o, products p
            WHERE  1 = 1
            AND    product_id = p.pid(+)
            and    status = 2
            and  ( nvl(previous_status,0) <>2 or  price <> nvl(previous_price,0))
            GROUP BY  TRUNC(date),product_id, product_structure,side
            )
GROUP BY date,product_id, product_structure
code 2:
SELECT product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
  FROM (SELECT product_id,product_structure, side, COUNT (*) cnt
        FROM (SELECT order_id, product_id, side, last_user, date,
              NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
              NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
              NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
            FROM order o,user u
            where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
            AND last_user = u.code ) o, products p
        WHERE 1 = 1
        AND  product_id = p.pid(+)
        and  status = 2
        and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
        GROUP BY product_id, product_structure,side
        )
GROUP BY product_id, product_structure
now, of the two queries above, the first has date in the outer group by clause and the second does not, and both have the same where clause.
Q: DO YOU see any reason why, if I sum up the counts of askprice + price, the counts will be different? 
May       22, 2007 - 9:18 am UTC 
 
it would be so cool to see the outputs you say are different just to, well, sort of see what it is you are comparing.
 
 
 
 
GROUP BY DATE -- worked ....
A reader, May       21, 2007 - 5:15 pm UTC
 
 
SELECT date,product_id,product_structure, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
  FROM (SELECT TRUNC(date) product_id,product_structure, side, COUNT (*) cnt
        FROM (SELECT order_id, product_id, side, last_user, date,
              NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
              NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
              NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
            FROM order o,user u
            where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
            AND last_user = u.code ) o, products p
        WHERE 1 = 1
        AND  product_id = p.pid(+)
        and  status = 2
        and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
        GROUP BY TRUNC(date),product_id, product_structure,side
        )
GROUP BY date,product_id, product_structure 
did not work..
but after changing the place of the date column it produced the same results...
SELECT product_id,product_structure,date, MAX (DECODE (side, 0, cnt, 0)) askprice, MAX (DECODE (side, 1, cnt, 0)) price
  FROM (SELECT product_id,product_structure, side, TRUNC(date) date, COUNT (*) cnt
        FROM (SELECT order_id, product_id, side, last_user, date,
              NVL (LAG (price) OVER (PARTITION BY order_id ORDER BY date ASC), NULL ) previous_price, price,
              NVL (LAG (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) previous_status, status,
              NVL (LEAD (status) OVER (PARTITION BY order_id ORDER BY date ASC),NULL ) next_status
            FROM order o,user u
            where date >= TO_DATE ('15-Mar-2007','DD-Mon-YYYY') AND date <= TO_DATE ('16-May-2007','DD-Mon-YYYY')
            AND last_user = u.code ) o, products p
        WHERE 1 = 1
        AND  product_id = p.pid(+)
        and  status = 2
        and ( nvl(previous_status,0) <>2 or price <> nvl(previous_price,0))
        GROUP BY product_id, product_structure,side,TRUNC(date)
        )
GROUP BY product_id, product_structure,date 
 
Analytics and PLSQL
A reader, May       23, 2007 - 8:13 am UTC
 
 
Hi Tom,
I have the following PL/SQL code in a procedure:
Execute immediate 'insert into t1 (aa, bb, cc, dd)
                   select a, b, c,
                          min(d) over (partition by a, b)
                     from t
                    where fag_done = ''N''';
The trace file, when run through tkprof, shows a parse/execute ratio of 1 for this select, which needs to be corrected.
Could you please help me reduce this high ratio of parse count to execute count?
Thanks a lot 
 
Any Analytic function of HELP
Hitesh, June      14, 2007 - 2:48 am UTC
 
 
Hi Tom,
I am trying to calculate the % variation between the number of records inserted in the past hour and the (per-hour) number inserted over the past week.
Is there an analytic function to do that?
Select 'T' "Table Name", A.Hrly_Cnt "Past Hour Statistics", B.Weekly_Cnt "Past Week Statistics", Ceil((B.Weekly_Cnt-A.Hrly_Cnt)* 100 /(B.Weekly_Cnt)) "%Variation"
from
(
Select /*+Parallel(X,4) */ Count(*) Hrly_cnt
from
t X
Where 
Insert_date >= Sysdate-1/24 And Insert_date < Sysdate) A,
(
Select /*+Parallel(Y,4) */ Ceil(Count(*)/7/24) Weekly_cnt 
from 
t Y 
Where 
Insert_date >= Sysdate-7-1/24 And Insert_date < Sysdate-1/24) B
/
Thanks
Hitesh  
June      14, 2007 - 7:24 am UTC 
 
no analytics needed, but a single pass would more than do this
select count( case when insert_date between sysdate-1/24 and sysdate then 1 end),
    count( case when insert_date between sysdate-7-1/24 and sysdate then 1 end)
  from t
where insert_date between sysdate-7-1/24 and sysdate
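Wrapped back into the poster's %Variation calculation, that single pass might look like this (a sketch; the weekly count keeps the original /7/24 per-hour scaling and the original non-overlapping ranges, and the inline-view aliases are mine):

```sql
-- Single pass over t: one conditional count per window, then the % variation.
select hrly_cnt   "Past Hour Statistics",
       weekly_cnt "Past Week Statistics",
       ceil((weekly_cnt - hrly_cnt) * 100 / weekly_cnt) "%Variation"
  from (select count(case when insert_date >= sysdate-1/24 then 1 end) hrly_cnt,
               ceil(count(case when insert_date <  sysdate-1/24 then 1 end)/7/24) weekly_cnt
          from t
         where insert_date >= sysdate-7-1/24
           and insert_date <  sysdate);
```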
 
 
 
Can we avoid multiple passes of table T1
Ritesh, June      16, 2007 - 4:52 am UTC
 
 
Hi Tom,
Can we avoid the multiple passes of table T1 in the SQL below? T1 is the parent table based on which certain records get invalid status and are inserted into table T. We need to know the % of invalid activity relative to table T1.
Select 
'T'        "Table Name", 
Cnt            "Base Count",
Invalid_Match_Cnt   "Invalid Match Count",
Invalid_Match_Cnt*100/Cnt  "%Invalid"
from 
(Select /*+Parallel(X 16) */ Count(*) Cnt 
from 
T1 X 
WHERE Insert_date Between Sysdate-4/24 AND Sysdate-3/24) A,
(Select /*+ Parallel(X 16) Parallel(Y 16) Use_Hash(X Y) */ Count(*) Invalid_Match_Cnt
from 
T X, T1 Y 
WHERE 
X.Insert_date Between Sysdate-4/24 AND Sysdate  AND 
Y.Insert_date Between Sysdate-4/24 AND Sysdate-3/24     AND
X.tag = Y.tag) B
/
Thanks
Ritesh 
 
Last_Value vs First_Value - Please explain
Jay, July      19, 2007 - 10:56 am UTC
 
 
Quote:
"it didn't work for me.  I had to change it to first_value(master_record) over (partition by sub_record order by moddate desc)
Is there a reason for that? 
Followup   December 16, 2003 - 2pm US/Eastern:
doh, default window clause is current row and unbounded preceding
i would have needed a window clause that looks forwards rather then backwards (reason #1 why I 
should always set up a test case instead of just answering on the fly)
your solution of reversing the data works just fine. "
---------------------
Hi tom,
I had the same doubt. I read your explanation but couldn't fully comprehend what you were saying. 
Why would,
Last_Value() over (partition by .. order by ... )
be different from..
First_Value() over (partition by .. order by ... desc) ??
The latter case provides the correct result; the former doesn't. I'm confused. Thanks for your time..
Jay 
July      19, 2007 - 12:20 pm UTC 
 
give us the example where the latter is correct and the former is not - and then we'll explain what happened.
NULLS would definitely affect that (unless you use nulls first/nulls last to control their placement) 
 
 
Thanks..
Jay, July      19, 2007 - 1:41 pm UTC
 
 
-----------------------------case 1 query
with 
population as
(
 select 123 as test_id, 'Alpha' as test1, 1 as test2 from dual
 union all
 select 123 as test_id, 'Beta' as test1, 2 as test2 from dual
 union all
 select 123 as test_id, 'Gamma' as test1, 3 as test2 from dual       
 union all
 select 123 as test_id, 'Delta' as test1, 4 as test2 from dual
 union all 
 select 123 as test_id, 'Pi' as test1, 5 as test2 from dual
)     
select test_id,
       test1, 
       test2,
       first_value(test1) over (partition by test_id
                                    order by test2 desc) as "Correct" 
  from population p
-----------------------------case 2 query
with 
population as
(
 select 123 as test_id, 'Alpha' as test1, 1 as test2 from dual
 union all
 select 123 as test_id, 'Beta' as test1, 2 as test2 from dual
 union all
 select 123 as test_id, 'Gamma' as test1, 3 as test2 from dual       
 union all
 select 123 as test_id, 'Delta' as test1, 4 as test2 from dual
 union all 
 select 123 as test_id, 'Pi' as test1, 5 as test2 from dual
)     
select test_id,
       test1, 
       test2, 
       last_value(test1) over (partition by test_id
                                   order by test2) as "Incorrect.. why??"
  from population p 
Case 1 is correct and returns 'Pi', which is expected.
Case 2 should return 'Pi' as well, since I am ordering by test2 ascending and partitioning by the ID.
Could you please explain why this is happening?
Thanks for your time as always!  
July      19, 2007 - 2:53 pm UTC 
 
oh, ok - easy, the default window is:
default window clause is current row and unbounded preceding
so, first_value/last_value works from the current record BACKWARDS, up to the top of the result set.
first_value will always be - the first record.
last_value (in this case) will always be - the CURRENT RECORD
 14  select test_id,
 15         test1,
 16         test2,
 17         first_value(test1) over (partition by test_id
 18                                      order by test2 desc) as "Correct" ,
 19         last_value(test1) over (partition by test_id
 20                                     order by test2) as "Incorrect.. why??"
 21    from population p
 22    order by test2
 23  /
   TEST_ID TEST1      TEST2 Corre Incor
---------- ----- ---------- ----- -----
       123 Alpha          1 Pi    Alpha
       123 Beta           2 Pi    Beta
       123 Gamma          3 Pi    Gamma
       123 Delta          4 Pi    Delta
       123 Pi             5 Pi    Pi
here, when processing row 1 - last value has to be alpha, because if you look at the current record and all of the preceding ones, the last value of test1 is alpha.
when you go to row 2, the last value has to be beta - because when you look at the last row in the current window (which is from the current row and all preceding), the last value is Beta - and so on...
 14  select test_id,
 15         test1,
 16         test2,
 17         first_value(test1) over (partition by test_id
 18                                      order by test2 desc) as "Correct" ,
 19         last_value(test1) over (partition by test_id
 20                                     order by test2
 21                                     range between current row and unbounded following ) as "Incorrect.. why??"
 22    from population p
 23    order by test2
 24  /
   TEST_ID TEST1      TEST2 Corre Incor
---------- ----- ---------- ----- -----
       123 Alpha          1 Pi    Pi
       123 Beta           2 Pi    Pi
       123 Gamma          3 Pi    Pi
       123 Delta          4 Pi    Pi
       123 Pi             5 Pi    Pi
Here, we changed the window so that last_value looks at the current row and everything AFTER it. Now, Pi is the answer: no matter which row we are on, the window is the current row plus unbounded following, so the last row in that window is always the same row - and its value is Pi.
 
 
Thank you very much..
Jay, July      19, 2007 - 3:03 pm UTC
 
 
Hello Tom,
Thanks a lot for explaining the concept. It makes complete sense now. Thank you so much!!
Jay 
 
last_value
Sokrates, July      20, 2007 - 1:55 am UTC
 
 
now, even me, I got it !
 
 
LAST_VALUE, unbounded frustration
Duke Ganote, August    24, 2007 - 11:19 am UTC
 
 
That's why I never use LAST_VALUE.  FIRST_VALUE lets me intuitively use the default window clause (current row and unbounded preceding).  
FIRST_VALUE behaves as FIFO (first inline, first out).
The default clauses turn LAST_VALUE into LINF (last inline, never found).  
 
Analytics, configurable date window
Phil, September 02, 2007 - 7:54 am UTC
 
 
Hello Tom
I struggle with Analytics, have you finished writing a book that explains it from the start?
I need to produce a report based on time and split the data into chunks. Initially each minute but the user needs to be able to select the time "window" for example 5 minutes.
Is there any easy way of doing this using analytics please?
here are 2 hours of random data I've been working with...
create table tom
(when date)
/
insert into tom(select sysdate+((dbms_random.value)-.5)/14 from all_objects)
/ 
September 05, 2007 - 1:27 pm UTC 
 
I did,
years ago - expert one on one Oracle - the chapter on Analytics.  Covers it from start to finish.
Also, the Data Warehousing guide does a really good job!
you need to define your problem a little better - normally to split things into 1 or N minutes slices - you do NOT need analytics at all.  eg: one minute slices, just group by:
to_char( dt, 'yyyymmddhhmm' )
everything in the same minute will be together - you need to be a little more precise as to what you mean here though before I can say anything. 
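For an N-minute window, the usual trick is date arithmetic against a fixed origin (a sketch only; the table name t, date column dt, and bind :n are made-up names standing in for your own):

```sql
-- bucket_start = origin + floor(minutes-since-origin / n) * n minutes;
-- every row in the same n-minute slice maps to the same bucket_start.
select to_date('2000-01-01','yyyy-mm-dd')
         + trunc( (dt - to_date('2000-01-01','yyyy-mm-dd')) * 24 * 60 / :n )
           * :n / (24*60)  as bucket_start,
       count(*)            as cnt
  from t
 group by to_date('2000-01-01','yyyy-mm-dd')
         + trunc( (dt - to_date('2000-01-01','yyyy-mm-dd')) * 24 * 60 / :n )
           * :n / (24*60)
```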
 
 
Can LAG do what I need? Version 9.2.0.6.0
Peter, September 13, 2007 - 7:16 am UTC
 
 
Dear Tom,
Thanks for the site - I have learned many things and use Analytics a lot now - very performant in many cases. If you can spare the time to look at my problem I will be most grateful.
I have a table like:
SQL> r
  1  create table lag_test (
  2  claim varchar2 (10) not null,
  3  event varchar2 (10),
  4  period number (6,0) not null,
  5  amount number (10,2) not null
  6* )
Table created.
With this data:
SQL> insert into lag_test values ('1', 'A', 200701, 50);
1 row created.
SQL> insert into lag_test values ('1', null, 200702, 75);
1 row created.
SQL> insert into lag_test values ('1', 'A', 200703, 100);
1 row created.
SQL> insert into lag_test values ('1', 'A', 200704, 1000);
1 row created.
SQL> commit;
Commit complete.
And I am using the following query to compute movements in amount:
SQL> edit
Wrote file afiedt.buf
  1  select
  2  event,
  3  period,
  4  amount - lag (amount, 1, 0) over (partition by claim order by claim, period) amt
  5  from
  6* lag_test
SQL> /
EVENT          PERIOD        AMT
---------- ---------- ----------
A              200701         50
               200702         25
A              200703         25
A              200704        900
But I was expecting the following output:
EVENT          PERIOD        AMT
---------- ---------- ----------
A              200701         50
A              200702        -50
               200702         75
A              200703        100
               200703        -75
A              200704        900
Because at period 200702 "A" has gone away and been replaced by "null". And at period 200703 "null" has gone away and been replaced by "A".
I have scoured Chapter 12 in your book but maybe my understanding of how LAG works is flawed - or is there some way I can get my desired output.
Would really appreciate any help you can give me, and apologise if this is off topic.
Kind Regards,
Peter                
September 15, 2007 - 7:27 pm UTC 
 
I can only tell you if your desired output is possible if.....
You state in a factual, requirementy sort of way what it is.  Saying "I expect 42 to be displayed" is insufficient.
You have 4 rows, I fail to see how a query against a simple 4 row table would return more than 4 in this case.
'A' has nothing to do with this query - you partition by claim (always '1') you order by claim and period - other than you selecting event, it is not part of the picture here. 
 
 
Thanks for the tip above
Phil, September 15, 2007 - 5:57 pm UTC
 
 
Hi Tom Re:
Hello Tom
I struggle with Analytics, have you finished writing a book that explains it from the start?
I need to produce a report based on time and split the data into chunks. Initially each minute but 
the user needs to be able to select the time "window" for example 5 minutes.
Is there any easy way of doing this using analytics please?
here are 2 hours of random data I've been working with...
create table tom
(when date)
/
insert into tom(select sysdate+((dbms_random.value)-.5)/14 from all_objects)
/
OK, I understand the to_char(t.when, 'yyyymmddhhmm'), which makes perfect sense, but is this an efficient way of searching for a max frequency in a certain window? I thought analytics would offer me some performance advantages. I'm confident you'll improve on the following, which gives me a top frequency per period based on the above seed data...
select five_sec_count, five_sec_period, seconds secfrom, seconds + 5 secto
  from (select count(*) five_sec_count,
               to_char(t.when, 'yyyymmddhhmm') five_sec_period,
               case
                 when to_char(when, 'ss') between 1 and 5 then
                  5
                 when to_char(when, 'ss') between 6 and 10 then
                  10
                 when to_char(when, 'ss') between 11 and 15 then
                  15
                 when to_char(when, 'ss') between 16 and 20 then
                  20
                 when to_char(when, 'ss') between 21 and 25 then
                  25
                 when to_char(when, 'ss') between 26 and 30 then
                  30
                 when to_char(when, 'ss') between 31 and 35 then
                  35
                 when to_char(when, 'ss') between 36 and 40 then
                  40
                 when to_char(when, 'ss') between 41 and 45 then
                  45
                 when to_char(when, 'ss') between 46 and 50 then
                  50
                 when to_char(when, 'ss') between 51 and 55 then
                  55
                 else
                  60
               end seconds
          from tom t
         group by to_char(t.when, 'yyyymmddhhmm'),
                  case
                    when to_char(when, 'ss') between 1 and 5 then
                     5
                    when to_char(when, 'ss') between 6 and 10 then
                     10
                    when to_char(when, 'ss') between 11 and 15 then
                     15
                    when to_char(when, 'ss') between 16 and 20 then
                     20
                    when to_char(when, 'ss') between 21 and 25 then
                     25
                    when to_char(when, 'ss') between 26 and 30 then
                     30
                    when to_char(when, 'ss') between 31 and 35 then
                     35
                    when to_char(when, 'ss') between 36 and 40 then
                     40
                    when to_char(when, 'ss') between 41 and 45 then
                     45
                    when to_char(when, 'ss') between 46 and 50 then
                     50
                    when to_char(when, 'ss') between 51 and 55 then
                     55
                    else
                     60
                  end
         order by 1 desc)
 where rownum = 1
Thanks for everything you've provided over the years.
 
September 15, 2007 - 10:03 pm UTC 
 
well, the case isn't needed - a simple divide and trunc could do that - otherwise - looks fine. 
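For instance, the whole CASE might collapse to a single expression (a sketch against the posted table; note that second 00 lands in a ceil-based bucket of its own rather than being lumped in with 60 the way the original ELSE branch did):

```sql
-- ceil(ss/5)*5 maps seconds 1..5 -> 5, 6..10 -> 10, ... 56..59 -> 60,
-- replacing the twelve-branch CASE with one divide-and-round expression.
select to_char(t.when, 'yyyymmddhhmm')                 five_sec_period,
       ceil( to_number(to_char(t.when,'ss')) / 5 ) * 5 seconds,
       count(*)                                        five_sec_count
  from tom t
 group by to_char(t.when, 'yyyymmddhhmm'),
          ceil( to_number(to_char(t.when,'ss')) / 5 ) * 5
 order by 3 desc
```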
 
 
Can LAG do what I need? Version 9.2.0.6.0
Peter, September 19, 2007 - 11:35 am UTC
 
 
>>I can only tell you if your desired output is possible if..... 
>>You state in a factual, requirementy sort of what what it is. Saying "I expect 42 to be displayed" is insufficient. 
Sorry for being unclear Tom. My example was cut down too much and didn't really represent what I was trying to do. 
I have found the answer in your article 
https://asktom.oracle.com/Misc/oramag/on-uniqueness-space-and-numbers.html where you mention partitioned "sparse" outer joins. Unfortunately we are not yet on 10g, and my cartesian product will be extremely large.
Many thanks anyway for your response.
Kind Regards,
Peter 
 
Combine row data for date event
Robert, September 28, 2007 - 4:57 pm UTC
 
 
Hello Tom,
I have a small module that tracks employee activity within a department. However, the tracker registers the activity as a single event (marked by 'A' - active or 'I' - inactive). Chronologically, for the same employee/department, an event of type 'A' can be followed either by an 'I' or by nothing (in which case the emp is still active).
I need to get the activity as a period, where the start date/time is when the emp became active at a particular time and the end date/time is when the emp became inactive. There's a way to accomplish this using table joins but I'm pretty sure there's a (simpler) way to accomplish it using analytics. How can I do this? 
Here's the scripts (Oracle 10g).
create table emp_activity
(
      dept_id number,
      emp_id number,
      dt date,
      activity char(1)
);
One set of sample data:
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(15, 973, to_Date('23-AUG-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(15, 973, to_Date('12-APR-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(15, 973, to_Date('02-OCT-06 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
the result for this would be
dept | emp | start_date | end_date
----------------------------------------
15 | 973 | 23-AUG-07 | null
15 | 973 | 02-OCT-06 | 12-APR-07
Another set of data:
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('16-AUG-01 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('25-APR-03 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('30-JUL-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('22-AUG-07 11.08.54 AM', 'DD-MON-RR HH12.MI.SS AM'), 'I');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('22-AUG-07 11.35.00 AM', 'DD-MON-RR HH12.MI.SS AM'), 'A');
insert into emp_activity(dept_id,emp_id,dt,activity) 
values(10, 100, to_Date('22-AUG-07 12.00.00 AM', 'DD-MON-RR HH12.MI.SS PM'), 'I');
in this case, the result should be
dept | emp | start_date | end_date
----------------------------------------
10 | 100 | 22-AUG-07 | 22-AUG-07
10 | 100 | 30-JUL-07 | 22-AUG-07
10 | 100 | 16-AUG-01 | 25-APR-03
I left out the time fields in the results, for clarity. Thank you. 
October   03, 2007 - 1:10 pm UTC 
 
your last output looks wrong to me, this is what I think it should be:
ops$tkyte%ORA10GR2> select *
  2    from (
  3  select dept_id, emp_id,
  4         dt, lead(dt) over (partition by dept_id, emp_id order by dt) last_dt,
  5         activity, lead(activity) over (partition by dept_id, emp_id order by dt) last_activity
  6    from emp_activity
  7         )
  8   where activity = 'A'
  9   order by dept_id, emp_id, dt
 10  /
   DEPT_ID     EMP_ID DT        LAST_DT   A L
---------- ---------- --------- --------- - -
        10        100 16-AUG-01 25-APR-03 A I
        10        100 30-JUL-07 22-AUG-07 A I
        10        100 22-AUG-07           A
        15        973 02-OCT-06 12-APR-07 A I
        15        973 23-AUG-07           A
 
 
 
Re: Combine row data for date event
Robert, October   02, 2007 - 8:54 am UTC
 
 
I actually solved the problem myself after playing with lag and lead!
select dept_id, emp_id, start_date, end_date
from (
    select dept_id, emp_id, dt start_date
      ,lag(dt) over(partition by dept_id, emp_id order by dt desc) end_date
      ,activity
    from emp_activity
) where activity='A'
order by dept_id, emp_id, start_date desc
 
 
window clause on time
Ryan, October   10, 2007 - 1:31 pm UTC
 
 
This might be on here already, but I couldn't find it. I think I need an analytic function with a window clause to do this:
mytimestamp is a timestamp field
mymetric is a number field
select max(mytimestamp) as time, sum(mymetric) as bytes
from metrictable
where mymetric > ( select max(mymetric) from metrictable) - (1/96)
I want to get this rolled up by 15 minute chunks. do i use a group by, analytic function?
So it is essentially a set of buckets or chunks
first 15 minute period
second 15 minute period
The "where" clause is just me trying to get just the last 15 minutes.  
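A group-by sketch for the 15-minute buckets (hypothetical and untested; it casts the timestamp to a date so the trunc date arithmetic applies):

```sql
-- each row lands in the 15-minute bucket its timestamp falls into:
-- truncate to the hour, then add back whole 15-minute multiples
select trunc(cast(mytimestamp as date),'hh24')
         + trunc( to_number(to_char(mytimestamp,'mi')) / 15 ) * 15 / (24*60) as bucket_start,
       max(mytimestamp)                                                      as time,
       sum(mymetric)                                                         as bytes
  from metrictable
 group by trunc(cast(mytimestamp as date),'hh24')
         + trunc( to_number(to_char(mytimestamp,'mi')) / 15 ) * 15 / (24*60)
 order by 1
```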
 
analytic functions
RAJ, November  08, 2007 - 4:16 pm UTC
 
 
Hi Tom,
I have the following scenario and have given the expected output as well.
I am trying to achieve it with the query below. 
SELECT DISTINCT a.emp_id, CASE  WHEN a.operation$ IN ('UU', 'UN') THEN 'U'
                   ELSE a.operation$  END operation,
                CASE
                   WHEN a.operation$ = 'I' THEN NULL
                   ELSE FIRST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
                END prev_amount,
                LAST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
                                                               AS curr_amount,
                CASE  WHEN a.operation$ = 'I'
                      THEN NULL ELSE FIRST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
                END prev_timestamp,
                LAST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
                                                            AS curr_timestamp
           FROM DEPT a
       ORDER BY emp_id
Result -- 3 rows, but expected is 1 row. Below I have given the expected output as well.  I am not sure what I am missing in the above query..
Here is the scenario..
create table dept (
OPERATION$         CHAR(2)
,CSCN$              NUMBER
,COMMIT_TIMESTAMP$  DATE
,RSID$              NUMBER
,USERNAME$          VARCHAR2(30)
,TIMESTAMP$         DATE
,SOURCE_COLMAP$     RAW(128)
,TARGET_COLMAP$     RAW(128)
,EMP_ID             NUMBER
,TIMESTAMP          DATE
,AMOUNT             NUMBER
)
INSERT INTO 
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
 VALUES
('UU ',1234,SYSDATE,40264,'TEST',SYSDATE,'08','08',104,SYSDATE,90000);
INSERT INTO 
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
 VALUES
('UN ',1234,SYSDATE,40265,'TEST',SYSDATE,'08','08',104,SYSDATE,100000);
INSERT INTO 
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
 VALUES
('UU ',1234,SYSDATE,40270,'TEST',SYSDATE,'08','08',104,SYSDATE,100000);
INSERT INTO 
(OPERATION$,CSCN$,COMMIT_TIMESTAMP$,RSID$,USERNAME$,TIMESTAMP$,SOURCE_COLMAP$,TARGET_COLMAP$,EMP_ID,TIMESTAMP,AMOUNT)
 VALUES
('UN ',1234,SYSDATE,40271,'TEST',SYSDATE,'08','08',104,SYSDATE,30000);
UU,UN -- 'U'
Expected result set
EMP_ID, OPERATION$, PREV_AMOUNT, PREV_TIMESTAMP, CURRENT_AMOUNT, CURR_TIMESTAMP
104   , U            90000       SYSDATE       ,  30000          SYSDATE
 
November  09, 2007 - 12:10 pm UTC 
 
funny, your example doesn't work AT ALL.  operation$ is 2 characters, you have 3.  there is no table name.  sigh.
Funnier still, And no clue why you expect a SINGLE ROW - when there is clearly three distinct rows.
ops$tkyte%ORA10GR2> SELECT DISTINCT a.emp_id, CASE  WHEN a.operation$ IN ('UU', 'UN') THEN 'U'
  2                     ELSE a.operation$  END operation,
  3                  CASE
  4                     WHEN a.operation$ = 'I' THEN NULL
  5                     ELSE FIRST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
  6                  END prev_amount,
  7                  LAST_VALUE (a.amount) OVER (PARTITION BY a.emp_id ORDER BY rsid$)
  8                                                                 AS curr_amount,
  9                  CASE  WHEN a.operation$ = 'I'
 10                        THEN NULL ELSE FIRST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
 11                  END prev_timestamp,
 12                  LAST_VALUE (a.TIMESTAMP$) OVER (PARTITION BY a.emp_id ORDER BY a.rsid$ ASC)
 13                                                              AS curr_timestamp
 14             FROM DEPT a
 15         ORDER BY emp_id
 16  /
    EMP_ID OP PREV_AMOUNT CURR_AMOUNT PREV_TIME CURR_TIME
---------- -- ----------- ----------- --------- ---------
       104 U        90000       30000 09-NOV-07 09-NOV-07
       104 U        90000       90000 09-NOV-07 09-NOV-07
       104 U        90000      100000 09-NOV-07 09-NOV-07
It appears you are misunderstanding how the analytic functions work in this case - so presenting your "answer" without presenting "your question" (phrased in text) is not useful. 
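If the intent really is one row per emp_id carrying the first and last values by rsid$, that is a job for plain aggregation rather than analytics. A sketch against the posted columns (assuming the table is named dept as in the create statement):

```sql
-- KEEP (DENSE_RANK FIRST/LAST ... ORDER BY rsid$) picks the value from
-- the lowest/highest rsid$ row while GROUP BY collapses to one row.
select emp_id,
       'U'                                                      operation,
       min(amount)     keep (dense_rank first order by rsid$)   prev_amount,
       min(timestamp$) keep (dense_rank first order by rsid$)   prev_timestamp,
       min(amount)     keep (dense_rank last  order by rsid$)   curr_amount,
       min(timestamp$) keep (dense_rank last  order by rsid$)   curr_timestamp
  from dept
 group by emp_id
```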
 
 
how do you carry lag to more than one row?
ee, December  05, 2007 - 6:11 pm UTC
 
 
How can you carry forward lag analytic results to multiple rows? Given table t, based on a view of two Cartesian-joined tables that uses analytics, the dataset is as follows:
eg.
CREATE TABLE T(
cert_number varchar2(10),
inspection_date DATE,
risk_assign_date DATE, 
risk_number NUMBER, 
derived_risk NUMBER
);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('11-dec-2006'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('12-dec-2006'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) VALUES(1,TO_DATE('01-jan-2007'),TO_DATE('06-jan-2007'),3,3); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('25-jan-2007'),NULL,NULL,3); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('27-jan-2007'),TO_DATE('27-jan-2007'),8,8); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('13-feb-2007'),NULL,NULL,3); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('17-feb-2007'),TO_DATE('17-feb-2007'),2,2); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('18-apr-2007'),NULL,NULL,2); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('23-may-2007'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('17-aug-2007'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('22-aug-2007'),TO_DATE('22-aug-2007'),1,1);
The analytic code ...DECODE (risk_number, NULL, first_value(risk_number) over (ORDER BY cert_id_number, inspectiondate ASC ROWS 1 preceding), risk_number) derived_risk... within the view correctly determines the risk number for the preceding rows where the risk number is null, but it leaves several rows without a derived risk_number value. How can I carry the lagged derived_risk value of 2 forward to more than just one row, so that the rows with inspection dates of 5/23/2007 and 8/17/2007 also get a derived_risk value of 2 assigned?
The desired output would look like:
cert_  inspection_  risk_assign_ risk_  derived_
number date         date         number risk
======  ==========  =====        ====== ========
1       12/11/2006  null         null   null
1       12/12/2006  null         null   null
1       1/1/2007    1/6/2007     3      3
1       1/25/2007   null         null   3
1       1/27/2007   1/27/2007    8      8
1       2/13/2007   null         null   3
1       2/17/2007   2/17/2007    2      2
1       4/18/2007   null         null   2
1       5/23/2007   null         null   2
1       8/17/2007   null         null   2
1       8/22/2007   8/22/2007    1      1 
December  10, 2007 - 9:52 am UTC 
 
ops$tkyte%ORA10GR2> select t.*,
  2         last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
  3    from t;
CERT_NUMBE INSPECTIO RISK_ASSI RISK_NUMBER DERIVED_RISK         DR
---------- --------- --------- ----------- ------------ ----------
1          11-DEC-06
1          12-DEC-06
1          01-JAN-07 06-JAN-07           3            3          3
1          25-JAN-07                                  3          3
1          27-JAN-07 27-JAN-07           8            8          8
1          13-FEB-07                                  3          8
1          17-FEB-07 17-FEB-07           2            2          2
1          18-APR-07                                  2          2
1          23-MAY-07                                             2
1          17-AUG-07                                             2
1          22-AUG-07 22-AUG-07           1            1          1
11 rows selected.
I don't know why your example has
null, null, 3,3,8,3 <<<===???? why 3 - i think it should be 8. 
 
 
 
how do you carry lag values forward to more than one row?
ee, December  06, 2007 - 8:55 am UTC
 
 
How can you carry forward lag analytic results to multiple rows? The derived risk value carries forward the most recent risk number and only changes when a new risk_assign_date arrives with a new risk number. Given table t, based on a view of two Cartesian-joined tables that uses analytics, the dataset is as follows: 
CREATE TABLE T(
cert_number varchar2(10),
inspection_date DATE,
risk_assign_date DATE, 
risk_number NUMBER, 
derived_risk NUMBER
);
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('11-dec-2006'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('12-dec-2006'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('01-jan-2007'),TO_DATE('06-jan-2007'),3,3); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('25-jan-2007'),NULL,NULL,3); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('27-jan-2007'),TO_DATE('27-jan-2007'),8,8); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('13-feb-2007'),NULL,NULL,8); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('17-feb-2007'),TO_DATE('17-feb-2007'),2,2); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('18-apr-2007'),NULL,NULL,2); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('23-may-2007'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('17-aug-2007'),NULL,NULL,NULL); 
INSERT INTO T (cert_number,inspection_date,risk_assign_date,risk_number,derived_risk ) 
VALUES(1,TO_DATE('22-aug-2007'),TO_DATE('22-aug-2007'),1,1);
The analytic code ...DECODE (risk_number, NULL, first_value(risk_number) over (ORDER BY cert_id_number, inspectiondate ASC ROWS 1 preceding), risk_number) derived_risk... within the view correctly determines the risk number for the preceding rows where the risk number is null, but it leaves several rows without a derived risk_number value. How can I carry the lagged derived_risk value of 2 forward to more than just one row, so that the rows with inspection dates of 5/23/2007 and 8/17/2007 also get a derived_risk value of 2 assigned?
The desired output would look like:
cert_  inspection_  risk_assign_ risk_  derived_
number date         date         number risk
======  ==========  =====        ====== ========
1       12/11/2006  null         null   null
1       12/12/2006  null         null   null
1       1/1/2007    1/6/2007     3      3
1       1/25/2007   null         null   3
1       1/27/2007   1/27/2007    8      8
1       2/13/2007   null         null   8
1       2/17/2007   2/17/2007    2      2
1       4/18/2007   null         null   2
1       5/23/2007   null         null   2
1       8/17/2007   null         null   2
1       8/22/2007   8/22/2007    1      1 
 
10g only?
Jay, December  10, 2007 - 10:05 am UTC
 
 
Hi Tom,
Thanks! The questioner corrected his output in the following statement. I had a question, is this 'ignore nulls' a 10g thing? Because, it does not seem to be working for me and I use 9i. I get a ORA-00907 Missing right parenthesis when I use the 'ignore nulls' statement. 
Thanks for your help!
Jay 
December  10, 2007 - 10:16 am UTC 
 
ignore nulls is new with 10g - yes.
before that you can use this "carry down" technique:
ops$tkyte%ORA9IR2> select t.*,
  2         last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
  3    from t;
       last_value( risk_number ignore nulls ) over (order by cert_number, inspection_date ) dr
                               *
ERROR at line 2:
ORA-00907: missing right parenthesis
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select cert_number,inspection_date,risk_assign_date,risk_number,derived_risk,
  2         to_number( substr( max(data) over (order by cert_number, inspection_date), 11 ) ) dr
  3    from (
  4  select t.*,
  5         case when risk_number is not null
  6                  then to_char(row_number() over
  7                                       (order by cert_number, inspection_date),'fm0000000000') ||
  8                                   risk_number
  9                   end data
 10    from t
 11         ) x
 12  /
CERT_NUMBE INSPECTIO RISK_ASSI RISK_NUMBER DERIVED_RISK         DR
---------- --------- --------- ----------- ------------ ----------
1          11-DEC-06
1          12-DEC-06
1          01-JAN-07 06-JAN-07           3            3          3
1          25-JAN-07                                  3          3
1          27-JAN-07 27-JAN-07           8            8          8
1          13-FEB-07                                  3          8
1          17-FEB-07 17-FEB-07           2            2          2
1          18-APR-07                                  2          2
1          23-MAY-07                                             2
1          17-AUG-07                                             2
1          22-AUG-07 22-AUG-07           1            1          1
11 rows selected.
 
 
 
WOW!
Jay, December  10, 2007 - 10:24 am UTC
 
 
Thanks Tom. You are simply amazing. I get to learn so much from your site!
Thanks a ton and have a wonderful day.
Jay 
 
carrying lag value forward to more than one row
ee, December  11, 2007 - 12:40 pm UTC
 
 
Tom, as always you find the answers to difficult questions and provide succinct code from which we can follow along and learn. I struggled for several days to find a way to carry lag values forward to multiple rows, and you have delivered once again. As I stated to you once at an RMOUG meeting, I thank the maker above for your insight and willingness to share your talents and knowledge with others. 
Thanks again. 
 
carry multiple lag values forward to multple rows
ee, December  13, 2007 - 12:00 pm UTC
 
 
TOM: I am still trying to grasp the capabilities of using analytics and have a follow up question:
as you stated: your code.....
ops$tkyte%ORA9IR2> 
  1 select cert_number,inspection_date,risk_assign_date,risk_number,derived_risk,
  2         to_number( substr( max(data) over (order by cert_number, inspection_date), 11 ) ) dr
  3    from (
  4  select t.*,
  5         case when risk_number is not null
  6                  then to_char(row_number() over
  7                                       (order by cert_number, inspection_date),'fm0000000000') ||
  8                                   risk_number
  9                   end data
 10    from t
 11         ) x
 12  /
RETURNS this output:
cert_  inspection_  risk_assign_ risk_  derived_
number date         date         number risk
======  ==========  =====        ====== ========
1       12/11/2006  null         null   null
1       12/12/2006  null         null   null
1       1/6/2007    1/6/2007     3      3
1       1/25/2007   null         null   3
1       1/27/2007   1/27/2007    8      8
1       2/13/2007   null         null   8
1       2/17/2007   2/17/2007    2      2
1       4/18/2007   null         null   2
1       5/23/2007   null         null   2
1       8/17/2007   null         null   2
1       8/22/2007   8/22/2007    1      1
But how would I code the query to return this output?
The desired output would look like:
cert_  inspection_  risk_assign_ risk_  derived_   derived_
number date         date         number risk       risk_date
======  ==========  =====        ====== ========   =========
1       12/11/2006  null         null   null       null
1       12/12/2006  null         null   null       null
1       1/6/2007    1/6/2007     3      3          1/6/2007
1       1/25/2007   null         null   3          1/6/2007
1       1/27/2007   1/27/2007    8      8          1/27/2007
1       2/13/2007   null         null   8          1/27/2007
1       2/17/2007   2/17/2007    2      2          2/17/2007
1       4/18/2007   null         null   2          2/17/2007
1       5/23/2007   null         null   2          2/17/2007
1       8/17/2007   null         null   2          2/17/2007
1       8/22/2007   8/22/2007    1      1          8/22/2007
Thank you so very much for your help !!! 
December  13, 2007 - 12:37 pm UTC 
 
using the same technique on risk_assign_date
Look at what I did bit by bit here....  This is a technique applicable to anything...
carry down bit by bit...
ops$tkyte%ORA10GR2> select x, y
  2    from t
  3   order by x
  4  /
X                    Y
-------------------- --------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40
14-dec-2007 00:00:00
we start with that, we want to carry down Y.  Now, Y is not 'sortable' right now - the 13th comes before the 12th.  In general, this will be true - that the attribute we want is not sortable directly... We need something to add to Y to make it "sortable" - row_number will help:
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select x, y,
  2         case when y is not null then row_number() over (order by x) end rn
  3    from t
  4   order by x
  5  /
X                    Y                            RN
-------------------- -------------------- ----------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40          2
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40          4
14-dec-2007 00:00:00
we want row_number in a field whenever Y is not null (for the MAX() trick later!)  We need it to sort - and it would - it is a number, but we need to piggy back onto this RN field our Y value - so we need a string:
ops$tkyte%ORA10GR2> select x, y,
  2         case when y is not null then to_char(row_number() over (order by x),'fm000000') end rn
  3    from t
  4   order by x
  5  /
X                    Y                    RN
-------------------- -------------------- -------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 000002
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40 000004
14-dec-2007 00:00:00
Now, we need to add our value to this RN field - it is a date, we convert date into something neutral for our string:
ops$tkyte%ORA10GR2> select x, y,
  2         case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
  3             to_char(y,'yyyymmddhh24miss') rn
  4    from t
  5   order by x
  6  /
X                    Y                    RN
-------------------- -------------------- ---------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 00000220071213123940
12-dec-2007 00:00:00
13-dec-2007 00:00:00 12-dec-2007 12:39:40 00000420071212123940
14-dec-2007 00:00:00
Now, we can apply the MAX trick to 'carry' down:
ops$tkyte%ORA10GR2> select x, y, MAX(rn) over (order by x) max_rn
  2    from (
  3  select x, y,
  4         case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
  5             to_char(y,'yyyymmddhh24miss') rn
  6    from t
  7         )
  8   order by x
  9  /
X                    Y                    MAX_RN
-------------------- -------------------- ---------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 00000220071213123940
12-dec-2007 00:00:00                      00000220071213123940
13-dec-2007 00:00:00 12-dec-2007 12:39:40 00000420071212123940
14-dec-2007 00:00:00                      00000420071212123940
substr gets our bit out again:
ops$tkyte%ORA10GR2> select x, y, substr(MAX(rn) over (order by x),7) max_rn
  2    from (
  3  select x, y,
  4         case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
  5             to_char(y,'yyyymmddhh24miss') rn
  6    from t
  7         )
  8   order by x
  9  /
X                    Y                    MAX_RN
-------------------- -------------------- ---------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 20071213123940
12-dec-2007 00:00:00                      20071213123940
13-dec-2007 00:00:00 12-dec-2007 12:39:40 20071212123940
14-dec-2007 00:00:00                      20071212123940
we are almost done - a to_date and voila!
ops$tkyte%ORA10GR2> select x, y, to_date( substr(MAX(rn) over (order by x),7),'yyyymmddhh24miss') max_rn
  2    from (
  3  select x, y,
  4         case when y is not null then to_char(row_number() over (order by x),'fm000000') end ||
  5             to_char(y,'yyyymmddhh24miss') rn
  6    from t
  7         )
  8   order by x
  9  /
X                    Y                    MAX_RN
-------------------- -------------------- --------------------
10-dec-2007 00:00:00
11-dec-2007 00:00:00 13-dec-2007 12:39:40 13-dec-2007 12:39:40
12-dec-2007 00:00:00                      13-dec-2007 12:39:40
13-dec-2007 00:00:00 12-dec-2007 12:39:40 12-dec-2007 12:39:40
14-dec-2007 00:00:00                      12-dec-2007 12:39:40
so, apply that technique to your data - I purposely am not giving the final final solution :) I want you to code it. 
 
 
carry lag values forward to multiple rows
ee, December  13, 2007 - 4:06 pm UTC
 
 
Thank you sooooo very much. I studied your example and was successful in applying it to my data situation.
PS.
During the daytime, I pore through your books and your askTom website and carefully study the wonderful examples.... and then at night, I place your books under my pillow, hoping that through osmosis all of the wonderful content within the pages will be absorbed and understood when I awake. It is much better than counting sheep.
 
 
Single query to replace 'Union All'
Matthew, January   08, 2008 - 7:44 am UTC
 
 
Given the following script (v10.2):
DROP TABLE issue_category;
DROP TABLE category_type;
DROP TABLE issue_class;
DROP TABLE fd_scorecard;
DROP TABLE class_type;
DROP TABLE fd;
CREATE TABLE fd(
    fd_id VARCHAR2(20),
    fd_name VARCHAR2(100));
CREATE TABLE class_type(
    cl_id NUMBER(6),
    description VARCHAR2(100));
CREATE TABLE fd_scorecard(
    fd_id VARCHAR2(20),
    scorecard_id NUMBER(10),
    date_of_meeting DATE,
    status VARCHAR2(100));
CREATE TABLE category_type(
    ct_id NUMBER(6),
    class_type_id NUMBER(6),
    description VARCHAR2(100),
    weighting NUMBER(5,2));
CREATE TABLE issue_class(
    fd_id VARCHAR2(20),
    scorecard_id NUMBER(10),
    class_type_id NUMBER(6),
    ic_name VARCHAR2(100),
    num_grade NUMBER(1),
    weighting NUMBER(5,2));
CREATE TABLE issue_category(
    fd_id VARCHAR2(20),
    scorecard_id NUMBER(10),
    category_type_id NUMBER(10),
    class_type_id NUMBER(6),
    num_grade NUMBER(1),
    weighting NUMBER(5,2),
    date_created DATE,
    created_by VARCHAR2(100),
    notes VARCHAR2(4000));
    
/* Create the data */
DECLARE
TYPE class_type_tab IS TABLE OF class_type%ROWTYPE;
t_class_type class_type_tab := class_type_tab();
TYPE category_type_tab IS TABLE OF category_type%ROWTYPE;
t_category_type category_type_tab := category_type_tab();
indx PLS_INTEGER;
PROCEDURE ins(
            p_id class_type.cl_id%TYPE,
            p_desc class_type.description%TYPE) IS
BEGIN
indx := t_class_type.NEXT(indx);
t_class_type(indx).cl_id := p_id;
t_class_type(indx).description := p_desc;
END ins;
PROCEDURE ins(
            p_class_type_id category_type.class_type_id%TYPE,
            p_id category_type.ct_id%TYPE,
            p_desc category_type.description%TYPE,
            p_weighting category_type.weighting%TYPE) IS
BEGIN
indx := t_category_type.NEXT(indx);
t_category_type(indx).ct_id := p_id;
t_category_type(indx).class_type_id := p_class_type_id;
t_category_type(indx).description := p_desc;
t_category_type(indx).weighting := p_weighting;
END ins;
BEGIN
INSERT INTO fd(
        fd_id,
        fd_name)
VALUES('FD000001', 'FD000001');
indx := 0;
/* Class_Type data */
t_class_type.EXTEND(12);
ins(1,'Class Type 1');
ins(2,'Class Type 2');
ins(3,'Class Type 3');
ins(4,'Class Type 4');
ins(5,'Class Type 5');
ins(6,'Class Type 6');
ins(7,'Class Type 7');
ins(8,'Class Type 8');
ins(9,'Class Type 9');
ins(10,'Class Type 10');
ins(11,'Class Type 11');
ins(12,'Class Type 12');
FORALL i IN 1..t_class_type.COUNT
    INSERT INTO class_type
    VALUES t_class_type(i);
INSERT INTO fd_scorecard(
        fd_id,
        scorecard_id,
        date_of_meeting,
        status)
SELECT fd_id,
       ROWNUM Scorecard_ID,
       SYSDATE Meeting_Date,
       CASE WHEN MOD(ROWNUM,2) = 0 THEN 'Draft'
       ELSE 'Final'
       END Status
FROM fd;
INSERT INTO issue_class(
        fd_id,
        scorecard_id,
        class_type_id,
        ic_name,
        num_grade,
        weighting)
SELECT fs.fd_id,
       fs.scorecard_id,
       cl.cl_id Class_Type_ID,
       'Issue Class Name '||TO_CHAR(ROWNUM) Ic_Name,
       CASE MOD(ROWNUM, 5)
            WHEN 0 THEN 5
            WHEN 1 THEN 4
            WHEN 2 THEN 3
            WHEN 3 THEN 2
            ELSE 1 END Num_Grade,
       ROUND(100 / ROWNUM, 2) Weighting
FROM fd_scorecard fs,
     class_type cl;
/* Category_Type data */
t_category_type.EXTEND(56);
indx := 0;
ins(1,1,'Category Type 11',150);
ins(1,2,'Category Type 12',30);
ins(1,3,'Category Type 13',60);
ins(1,4,'Category Type 14',40);
ins(1,5,'Category Type 15',10);
ins(1,6,'Category Type 16',5);
ins(1,7,'Category Type 17',5);
ins(2,1,'Category Type 21',150);
ins(2,2,'Category Type 22',25);
ins(2,3,'Category Type 23',20);
ins(2,4,'Category Type 24',10);
ins(2,5,'Category Type 25',10);
ins(2,6,'Category Type 26',10);
ins(3,1,'Category Type 31',50);
ins(3,2,'Category Type 32',25);
ins(3,3,'Category Type 33',25);
ins(4,1,'Category Type 41',25);
ins(4,2,'Category Type 42',75);
ins(4,3,'Category Type 43',50);
ins(5,1,'Category Type 51',30);
ins(5,2,'Category Type 52',30);
ins(5,3,'Category Type 53',30);
ins(5,4,'Category Type 54',10);
ins(6,1,'Category Type 61',50);
ins(6,2,'Category Type 62',25);
ins(6,3,'Category Type 63',50);
ins(6,4,'Category Type 64',50);
ins(6,5,'Category Type 65',25);
ins(6,6,'Category Type 66',25);
ins(6,7,'Category Type 67',25);
ins(7,1,'Category Type 71',100);
ins(7,2,'Category Type 72',50);
ins(7,3,'Category Type 73',20);
ins(7,4,'Category Type 74',5);
ins(7,5,'Category Type 75',25);
ins(7,6,'Category Type 76',20);
ins(7,7,'Category Type 77',25);
ins(7,8,'Category Type 78',40);
ins(7,9,'Category Type 79',10);
ins(7,10,'Category Type 710',5);
ins(8,1,'Category Type 81',25);
ins(8,2,'Category Type 82',25);
ins(9,1,'Category Type 91',50);
ins(9,2,'Category Type 92',25);
ins(9,3,'Category Type 93',50);
ins(9,4,'Category Type 94',25);
ins(9,5,'Category Type 95',25);
ins(9,6,'Category Type 96',25);
ins(10,1,'Category Type 101',100);
ins(10,2,'Category Type 102',100);
ins(10,3,'Category Type 103',75);
ins(10,4,'Category Type 104',25);
ins(11,1,'Category Type 111',30);
ins(11,2,'Category Type 112',30);
ins(11,3,'Category Type 113',20);
ins(11,4,'Category Type 114',20);
FORALL i IN 1..t_category_type.COUNT
    INSERT INTO category_type
    VALUES t_category_type(i);
    
INSERT INTO issue_category(
        fd_id,
        scorecard_id,
        category_type_id,
        class_type_id,
        num_grade,
        weighting,
        date_created,
        created_by,
        notes)
SELECT ic.fd_id,
       ic.scorecard_id,
       ct.ct_id Category_Type_ID,
       ct.class_type_id Class_Type_ID,
       CASE MOD(ROWNUM, 5)
            WHEN 0 THEN 5
            WHEN 1 THEN 4
            WHEN 2 THEN 3
            WHEN 3 THEN 2
            ELSE 1 END Num_Grade,
       DBMS_RANDOM.VALUE(1,999) Weighting,
       SYSDATE Date_Created,
       USER Created_By,
       NULL Notes
FROM issue_class ic,
     category_type ct
WHERE ic.class_type_id = ct.class_type_id;
COMMIT;
END;
/
/* Constraints */
ALTER TABLE fd ADD(
    CONSTRAINT fd_pk
    PRIMARY KEY
    (fd_id));
ALTER TABLE class_type ADD(
    CONSTRAINT class_type_pk
    PRIMARY KEY
    (cl_id));
ALTER TABLE category_type ADD(
    CONSTRAINT category_type_pk
    PRIMARY KEY
    (ct_id, class_type_id));
ALTER TABLE issue_class ADD(
    CONSTRAINT issue_class_pk
    PRIMARY KEY
    (fd_id, scorecard_id, class_type_id));
ALTER TABLE issue_category ADD(
    CONSTRAINT issue_class_category_pk
    PRIMARY KEY
    (fd_id, scorecard_id, category_type_id, class_type_id));
ALTER TABLE fd_scorecard ADD(
    CONSTRAINT fd_scorecard_pk
    PRIMARY KEY
    (fd_id, scorecard_id));
ALTER TABLE fd_scorecard ADD(
    CONSTRAINT fd_scorecard_fd_fk 
    FOREIGN KEY (fd_id) 
    REFERENCES fd(fd_id));
ALTER TABLE issue_class ADD(
    CONSTRAINT issue_class_fd_scorecard_fk 
    FOREIGN KEY (fd_id,scorecard_id) 
    REFERENCES fd_scorecard(fd_id,scorecard_id));
ALTER TABLE issue_class ADD(
    CONSTRAINT issue_class_class_type_fk 
    FOREIGN KEY (class_type_id) 
    REFERENCES class_type(cl_id));
ALTER TABLE issue_class ADD(
    CHECK (num_grade BETWEEN 1 AND 5));
ALTER TABLE category_type ADD(
    CONSTRAINT category_type_class_type_fk 
    FOREIGN KEY (class_type_id) 
    REFERENCES class_type(cl_id));
ALTER TABLE issue_category ADD(
    CONSTRAINT issue_category_category_typ_fk 
    FOREIGN KEY (category_type_id, class_type_id)
    REFERENCES category_type(ct_id, class_type_id));
The following query produces the required results (i.e. what in my COBOL days I'd have called: File Header, then (Batch Header, Details, Batch Trailer) times n, then File Trailer):
SELECT 0 Class_ID,
       f.fd_name Class_Description,
       0 Class_Type_ID,
       NULL Catg_ID,
       fs.status Catg_Description,
       TO_NUMBER(NULL) Num_Grade,
       TO_NUMBER(NULL) Weighting,
       fs.date_of_meeting Date_Created,
       NULL Created_By
FROM fd f,
     fd_scorecard fs
WHERE fs.fd_id = f.fd_id
AND f.fd_id = 'FD000001'
UNION ALL
SELECT cl_id                                                                    Class_ID,
       cl.description                                                           Class_Description,
       0                                                                     Class_Type_ID,
       NULL                                                                     Catg_ID,
       NULL                                                                     Catg_Description,
       TO_NUMBER(NULL)                                                          Num_Grade,
       TO_NUMBER(NULL)                                                          Weighting,
       TO_DATE(NULL)                                                            Date_Created,
       NULL                                                                     Created_By
FROM class_type cl
UNION ALL
SELECT cl.cl_id                                                                     Class_ID,
       NULL                                                                     Class_Description,
       ic.class_type_id,
       CASE ic.class_type_id
            WHEN 1 THEN 'A'
            WHEN 2 THEN 'B'
            WHEN 3 THEN 'C'
            WHEN 4 THEN 'D'
            WHEN 5 THEN 'E'
            WHEN 6 THEN 'F'
            WHEN 7 THEN 'G'
            WHEN 8 THEN 'H'
            WHEN 9 THEN 'I'
            WHEN 10 THEN 'J'
            WHEN 11 THEN 'K'
            WHEN 12 THEN 'L'
       END||
       TO_CHAR(ct.ct_id)                                                        Catg_ID,
       CASE GROUPING (ct.ct_id)
            WHEN 0 THEN ct.description
            ELSE
                CASE GROUPING(ic.class_type_id)
                    WHEN 0 THEN 'Grade'
                ELSE 'Overall Grade And Conclusion'
                END
       END                                                                      Catg_Description,
       ROUND(SUM(ic.num_grade * ic.weighting) / SUM(ic.weighting))              Num_Grade,
       ROUND(AVG(ct.weighting))                                                 Weighting,
       ic.date_created,
       ic.created_by
FROM fd f,
     issue_class icl,
     class_type cl,
     category_type ct,
     issue_category ic
WHERE f.fd_id = icl.fd_id
AND icl.class_type_id = cl.cl_id
AND ct.class_type_id = cl.cl_id
AND ic.class_type_id = icl.class_type_id
AND ic.scorecard_id = icl.scorecard_id
AND ic.fd_id = icl.fd_id
AND ic.category_type_id = ct.ct_id
AND f.fd_id = 'FD000001'
GROUP BY GROUPING SETS((ic.class_type_id),(cl.cl_id,cl.description, ic.class_type_id, ct.ct_id, ct.description, ic.date_created, ic.created_by),())
ORDER BY class_id, class_type_id, catg_id;
Can this be done with a single select? 
 
Single query to replace 'Union All'   
Matthew, January   08, 2008 - 9:14 am UTC
 
 
Apologies, the query I gave you was leaving the 'batch trailer' rows till the end. This one is OK:
SELECT 0 Class_ID,
       f.fd_name                                                              Class_Description,
       0                                                                        Class_Type_ID,
       0                                                                        Ct_ID,
       NULL                                                                     Catg_ID,
       fs.status                                                                Catg_Description,
       0                                                                        Batch_Trailer,
       TO_NUMBER(NULL)                                                          Num_Grade,
       TO_NUMBER(NULL)                                                          Weighting,
       fs.date_of_meeting                                                       Date_Created,
       NULL                                                                     Created_By
FROM fd f,
     fd_scorecard fs
WHERE fs.fd_id = f.fd_id
AND f.fd_id = :p_fd_id
UNION ALL
SELECT cl_id                                                                    Class_ID,
       cl.description                                                           Class_Description,
       0                                                                        Class_Type_ID,
       0                                                                        Ct_ID,
       NULL                                                                     Catg_ID,
       NULL                                                                     Catg_Description,
       0                                                                        Batch_Trailer,
       TO_NUMBER(NULL)                                                          Num_Grade,
       TO_NUMBER(NULL)                                                          Weighting,
       TO_DATE(NULL)                                                            Date_Created,
       NULL                                                                     Created_By
FROM class_type cl
UNION ALL
SELECT NVL(cl.cl_id, ic.class_type_id)                                          Class_ID,
       NULL                                                                     Class_Description,
       ic.class_type_id,
       NVL(ct.ct_id,0)                                                          Ct_ID,
       CASE ic.class_type_id
            WHEN 1 THEN 'A'
            WHEN 2 THEN 'B'
            WHEN 3 THEN 'C'
            WHEN 4 THEN 'D'
            WHEN 5 THEN 'E'
            WHEN 6 THEN 'F'
            WHEN 7 THEN 'G'
            WHEN 8 THEN 'H'
            WHEN 9 THEN 'I'
            WHEN 10 THEN 'J'
            WHEN 11 THEN 'K'
            WHEN 12 THEN 'L'
       END||
       TO_CHAR(ct.ct_id)                                                        Catg_ID,
       CASE GROUPING (ct.ct_id)
            WHEN 0 THEN ct.description
            ELSE
                CASE GROUPING(ic.class_type_id)
                    WHEN 0 THEN 'Grade'
                ELSE 'Overall Grade And Conclusion'
                END
       END                                                                      Catg_Description,
       GROUPING_ID(ct.description)                                              Batch_Trailer,
       ROUND(SUM(ic.num_grade * ic.weighting) / SUM(ic.weighting))              Num_Grade,
       ROUND(AVG(ct.weighting))                                                 Weighting,
       ic.date_created,
       ic.created_by
FROM fd f,
     issue_class icl,
     class_type cl,
     category_type ct,
     issue_category ic
WHERE f.fd_id = icl.fd_id
AND icl.class_type_id = cl.cl_id
AND ct.class_type_id = cl.cl_id
AND ic.class_type_id = icl.class_type_id
AND ic.scorecard_id = icl.scorecard_id
AND ic.fd_id = icl.fd_id
AND ic.category_type_id = ct.ct_id
AND f.fd_id = 'FD000001'
GROUP BY GROUPING SETS((ic.class_type_id),(cl.cl_id,cl.description, ic.class_type_id, ct.ct_id, 
ct.description, ic.date_created, ic.created_by),())
ORDER BY class_id, class_type_id, batch_trailer, ct_id;
 
January   08, 2008 - 9:49 am UTC 
 
I sort of doubt it based on the from lists and the group by grouping sets.
SELECT
  FROM fd f,
       fd_scorecard fs
 UNION ALL
SELECT
  FROM class_type cl
 UNION ALL
SELECT
  FROM fd f,
       issue_class icl,
       class_type cl,
       category_type ct,
       issue_category ic
if the union alls were subsets of each other - maybe. But here you have fd+fd_scorecard - yet fd_scorecard isn't present in the last query - and fd is 1:M with fd_scorecard, so adding it would tend to "add rows" to the last query (assuming an OUTER JOIN was there, in case the relationship is 1:M OPTIONAL - each fd row could have 0, 1 or MORE entries in fd_scorecard) 
 
 
Single query to replace 'Union All'
Matthew, January   08, 2008 - 9:57 am UTC
 
 
I won't waste any more time on it then. Many thanks. 
 
Using analytics for getting hourly intervals
Thiru, January   14, 2008 - 10:45 am UTC
 
 
Tom,
What is a good way using analytics for getting the result for:
create table t1 ( id number, sid number, qty number, stime date );
insert into t1 values(1,10,100,to_date('01-15-2008 12:00','MM-DD-YYYY HH24:MI'));
insert into t1 values(1,10,200,to_date('01-15-2008 12:05','MM-DD-YYYY HH24:MI'));
insert into t1 values(1,10,200,to_date('01-15-2008 13:05','MM-DD-YYYY HH24:MI'));
insert into t1 values(1,10,200,to_date('01-15-2008 13:10','MM-DD-YYYY HH24:MI'));
insert into t1 values(1,10,200,to_date('01-15-2008 14:10','MM-DD-YYYY HH24:MI'));
the output for sum (qty) needs to be grouped by id and sid and hourly time with the top of the hour being shown.
ID SID  TOP_OF_HOUR        TOTAL_QTY
1  10    01-15-2008 12:00     300
1  10    01-15-2008 13:00     400
1  10    01-15-2008 14:00     200
Thanks for the time  
January   14, 2008 - 3:43 pm UTC 
 
you don't want analytics, you just want to trunc the time to the HOUR level and group by id, sid, date.... sum(qty) by that. 
 
 
code..
Jay, January   14, 2008 - 5:45 pm UTC
 
 
Tom refers to code like this:
select id,
       sid, 
       trunc(stime,'hh'),
       sum(qty)
  from t1 
group by id, sid, trunc(stime,'hh') 
Thanks! 
 
Selecting columns that do not change
Thiru, January   15, 2008 - 8:51 am UTC
 
 
Thanks for the input. If I have, let's say, 15 columns in the above t1 table and all the columns except the ones being grouped on are the same, is there a way I can avoid putting all the columns in the group by clause?
If the columns are c1....c15 and columns c4 to c15 hold constant values within each group, how do I avoid doing the following?
select c1, c2, sum(c3), ....c15
  from t1
 group by c1, c2, c4....c15
Only c1 and c2 are the actual grouping columns.
Thanks for the time again.
 
January   15, 2008 - 1:02 pm UTC 
 
you only have to type it once
and it is mandatory to be typed
You will not avoid the above - but feel free to hide it in a view, by all means. 
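For example, a view can carry the long group by once and for all (a sketch against the hypothetical c1..c15 columns from the question above; the elided columns are abbreviated in comments):

```sql
create or replace view t1_summary
as
select c1, c2, sum(c3) sum_c3,
       c4, c5, /* ... c6 through c14 ... */ c15
  from t1
 group by c1, c2,
       c4, c5, /* ... c6 through c14 ... */ c15
/
```

After that, every consumer just types "select * from t1_summary" - the full group by key is written exactly once.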
 
 
do a max...
Jay, January   15, 2008 - 9:04 am UTC
 
 
select c1, c2, sum(c3), max(c4), max(c5)..., max(c15)
  from t1
 group by c1, c2
Thanks,
Jay 
January   15, 2008 - 1:10 pm UTC 
 
do not do that.
the goal in programming is not to write obfuscated code for the ease of the current human typing.
the goal is to generate correct code that is maintainable, understandable and correct.
That query would be so misleading as to the intent, the purpose.  I would look at it later and say "what sort of goofy person did this, this is so wrong, don't they understand this data model - we'll have to review all of their code now, they don't know what this data is"
Please don't do something like this to avoid a few keystrokes once.  This is as simple as CUT AND PASTE 
select a, b, c, d, .... z, sum(xx) 
  from t
group by <yank that select list and paste it here>
 
 
 
max worked.
Thiru, January   15, 2008 - 10:45 am UTC
 
 
Thanks. That did it. Is MAX more efficient than MIN in this case, given that the other column is doing a SUM()? 
January   15, 2008 - 2:59 pm UTC 
 
stop it, please don't be lazy.
Your group by key is your group by key.
Your aggregates are your aggregates.
Please do not confuse the two.  You will confuse someone down the road or they'll be convinced you had no idea what you were doing.  
I'll never, in a billion years never, be able to understand the fear of typing. 
 
 
Jay, January   15, 2008 - 11:02 am UTC
 
 
I don't understand your question. Both MAX and MIN would work in this case. No pros or cons... it's the same banana!
Thanks!
Jay 
January   15, 2008 - 3:23 pm UTC 
 
yeah, they are both WRONG and INAPPROPRIATE to use.
And that they "asked" implies they don't understand what it is actually doing and that is even worse. 
 
 
Thiru, January   15, 2008 - 11:33 am UTC
 
 
Though the max function did the trick, don't you feel that when working with huge sets of data the function would be an expensive operation, as all these other constant-value columns are not going to be indexed? I would like to just pick up one value without using the aggregate function call, or doing a group by on all 15 constant-value columns.
Hope I am making myself clear here. It's like this:
c1 c2 c3 c4 c5 c6
1   2  1  1  1  1
1   5  1  1  1  1
2   10 2  2  2  2
2   20 2  2  2  2
As you see: select c1, sum(c2), max(c3), max(c4), max(c5), max(c6) from t1 group by c1
this has got a) the grouping operation and b) the max operation for each of the other columns.
I would like to avoid the max to get the result:
c1 c2 c3 c4 c5 c6
1   7  1  1  1  1
2   30 2  2  2  2
Thanks again. 
January   15, 2008 - 3:34 pm UTC 
 
hey - just use GROUP BY !!!!! (come on, this is a trivial CUT AND PASTE people)
think about this - using MAX() is harder for you to code.  *harder*
It is not only wrong
It is *harder*
In your attempt to shave a millisecond off your time, you add many seconds to your time
<b>
select c1, c2, c3, c4, c5, c6, sum(x) 
from t
group by </b> c1, c2, c3, c4, c5, c6
The code in bold is what you have to type; the stuff not in bold is what you copy and paste.
Now, do it with max and you are - typing MORE
heck, way more - because you probably have to alias the columns
select max(c1) c1, max(c2) c2, .......
all in a misguided attempt to "type less"...
and yeah, you don't want to max all over the place for performance reasons. 
 
 
Jay, January   15, 2008 - 1:06 pm UTC
 
 
Did you test run the query using the max function? Are you having any performance issues? Tom might talk to you more about this. But, I feel that you should evaluate your query and measure the performance before making a 'generic' comment without any basis.
Now, if you really "hate" doing 'group by', one way to do this via analytics would be something like this.
select distinct c1,
                sum_c2,
                c3,
                c4
  from
       (select c1,
               c2,
               sum(c2) over (partition by c1) as sum_c2,
               c3,
               c4
          from t1)
But, again, I don't understand why you would do this! 
Thanks,
Jay 
 
January   15, 2008 - 3:37 pm UTC 
 
it will take a little more cpu to max lots of columns than to group by... 
this was a quick and dirty 10g test:
SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, OBJECT_TYPE, CREATED,
  LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY,
  SUM(DATA_OBJECT_ID)
FROM
 T GROUP BY OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, OBJECT_TYPE,
  CREATED, LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY, GENERATED, SECONDARY
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     5000      7.68      13.12       4800       6900          0      499010
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     5011      7.69      13.13       4800       6900          0      499010
********************************************************************************
SELECT MAX(OWNER), MAX( OBJECT_NAME), MAX( SUBOBJECT_NAME), OBJECT_ID,
  MAX(OBJECT_TYPE), MAX( CREATED), MAX( LAST_DDL_TIME), MAX( TIMESTAMP), MAX(
  STATUS), MAX(TEMPORARY), MAX( GENERATED), MAX( SECONDARY),
  SUM(DATA_OBJECT_ID)
FROM
 T GROUP BY OBJECT_ID
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          1          0           0
Execute     10      0.00       0.00          0          0          0           0
Fetch     5000      9.27      16.05       7800       6900          0      499010
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     5011      9.29      16.08       7800       6901          0      499010
but the right answer is to group by that which you want to group by. 
 
 
Thanks Tom
Jay, January   15, 2008 - 1:29 pm UTC
 
 
Quote -- 
"... the goal is to generate correct code that is maintainable, understandable and correct......"
Tom, thanks for the comment. I get your point!
Jay 
 
Conditional sorting
Reader, January   22, 2008 - 7:03 pm UTC
 
 
Sample data
create table t
(
num_col number,
var_col varchar2(3)
);
insert into t values(1,'abc');
insert into t values(1,'def');
insert into t values(1,'pqr');
insert into t values(2,'cde');
insert into t values(2,'def');
insert into t values(2,'rst');
insert into t values(2,'xyz');
insert into t values(3,'bcd');
insert into t values(3,'def');
insert into t values(3,'pqr');
Expected Output
1 abc <== This num_col group comes first as abc is the least value among all var_col values in the table
1 def
1 pqr
3 bcd <== This num_col group comes next as bcd will be the next sorted value after abc among all var_col values in the table
3 def
3 pqr
2 cde <== This num_col group comes last as cde will be the next sorted value after bcd among all var_col values in the table
2 def
2 rst
2 xyz
Is it possible to achieve the above expected output in a single SQL query, maybe using Analytics?
 
January   22, 2008 - 7:13 pm UTC 
 
ops$tkyte%ORA10GR2> select num_col, var_col, min(var_col) over (partition by num_col) min_var_col
  2    from t
  3   order by 3
  4  /
   NUM_COL VAR MIN
---------- --- ---
         1 pqr abc
         1 abc abc
         1 def abc
         3 bcd bcd
         3 def bcd
         3 pqr bcd
         2 rst cde
         2 xyz cde
         2 cde cde
         2 def cde
10 rows selected.
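The analytic min(var_col) over (partition by num_col) attaches each group's smallest value to every row, so an ordinary ORDER BY on that column keeps groups contiguous. Not Oracle SQL, but the same logic sketched in plain Python with the sample data from above, for illustration:

```python
# Order whole num_col groups by the smallest var_col within each group,
# mirroring: order by min(var_col) over (partition by num_col)
rows = [
    (1, 'abc'), (1, 'def'), (1, 'pqr'),
    (2, 'cde'), (2, 'def'), (2, 'rst'), (2, 'xyz'),
    (3, 'bcd'), (3, 'def'), (3, 'pqr'),
]

# Compute each group's minimum var_col (the analytic min per partition).
group_min = {}
for num_col, var_col in rows:
    if num_col not in group_min or var_col < group_min[num_col]:
        group_min[num_col] = var_col

# Sort rows by their group's minimum; Python's sort is stable, so rows
# within a group keep their relative order.
ordered = sorted(rows, key=lambda r: group_min[r[0]])
```

This yields the groups in the order 1, 3, 2, just like the query output above.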
 
 
 
 
Fantastic
Reader, January   22, 2008 - 7:22 pm UTC
 
 
 
 
s devarshi, January   29, 2008 - 8:59 am UTC
 
 
i have table t.
create table t( x number(3),y varchar2(3),z number(3));
the rows are
insert into t values (1,'a',5);
insert into t values (1,'a',10);
insert into t values (1,'b',25);
insert into t values (1,'b',12);
insert into t values (1,'c',5);
insert into t values (1,'c',5);
insert into t values (2,'a',5);
insert into t values (2,'a',15);
insert into t values (2,'c',5);
insert into t values (2,'c',15);
insert into t values (3,'a',5);
insert into t values (3,'a',5);
insert into t values (3,'b',5);
insert into t values (4,'a',5);
insert into t values (4,'a',35);
insert into t values (5,'a',55);
insert into t values (5,'a',5);
insert into t values (5,'b',55);
insert into t values (6,'b',35);
insert into t values (6,'c',35);
i need an output like this
         x     y     z    
         -     -     -
         5     a    60
         5     b    55
         5    tot  115
         6     b    35
         6     c    35
         6    tot   70
         4     a    40
         4    tot   40
the sorting is on z (tot), grouped by x and y.
i can get the output 
 
x,y1,y2,y3,tot  
format using decode but not this.
 
January   30, 2008 - 8:59 am UTC 
 
the sorting is on X,Y, and then SUM(z)
you missed X,Y in your "specification", you must order by X,Y and then sum(z).  You cannot sort by Z as that would cause X,Y to not be "grouped" in the result set of course.
ops$tkyte%ORA10GR2> select x, decode( grouping(y), 1, 'tot', y ), sum(z)
  2    from t
  3   group by grouping sets ((x,y),(x))
  4   order by x, y, sum(z)
  5  /
         X DEC     SUM(Z)
---------- --- ----------
         1 a           15
         1 b           37
         1 c           10
         1 tot         62
         2 a           20
         2 c           20
         2 tot         40
         3 a           10
         3 b            5
         3 tot         15
         4 a           40
         4 tot         40
         5 a           60
         5 b           55
         5 tot        115
         6 b           35
         6 c           35
         6 tot         70
18 rows selected.
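What grouping sets ((x,y),(x)) produces is the union of two aggregations: one at the (x,y) level and one at the (x) level, with the latter labeled 'tot'. A small Python sketch of that semantic on the same t data (not Oracle code, just an illustration of what the query computes):

```python
from collections import defaultdict

rows = [
    (1, 'a', 5), (1, 'a', 10), (1, 'b', 25), (1, 'b', 12), (1, 'c', 5), (1, 'c', 5),
    (2, 'a', 5), (2, 'a', 15), (2, 'c', 5), (2, 'c', 15),
    (3, 'a', 5), (3, 'a', 5), (3, 'b', 5),
    (4, 'a', 5), (4, 'a', 35),
    (5, 'a', 55), (5, 'a', 5), (5, 'b', 55),
    (6, 'b', 35), (6, 'c', 35),
]

detail = defaultdict(int)   # sums per (x, y) -- the (x,y) grouping set
total = defaultdict(int)    # sums per x      -- the (x) grouping set
for x, y, z in rows:
    detail[(x, y)] += z
    total[x] += z

# Interleave the detail rows and the per-x 'tot' row, ordered by x, y.
result = []
for x in sorted(total):
    for (gx, gy), s in sorted(detail.items()):
        if gx == x:
            result.append((x, gy, s))
    result.append((x, 'tot', total[x]))
```

The result has the same 18 rows as the SQL output above.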
 
 
 
devarshi, January   31, 2008 - 12:58 am UTC
 
 
superb 
 
analytics and 8.1.7
devarshi, February  05, 2008 - 12:59 am UTC
 
 
i have two more queries.
1. how to do it in 8.1.7 ?
2. can analytic function be used here ?
thanks 
February  05, 2008 - 8:13 am UTC 
 
ops$tkyte@ORA817DEV> select decode( grouping(x), 1, 'tot', x ) x,
  2             decode( grouping(y), 1, 'tot', y ),
  3             sum(z),
  4             grouping(x) gx, grouping(y) gy
  5    from t
  6   group by rollup(x,y)
  7  /
X   DEC     SUM(Z)         GX         GY
--- --- ---------- ---------- ----------
1   a           15          0          0
1   b           37          0          0
1   c           10          0          0
1   tot         62          0          1
2   a           20          0          0
2   c           20          0          0
2   tot         40          0          1
3   a           10          0          0
3   b            5          0          0
3   tot         15          0          1
4   a           40          0          0
4   tot         40          0          1
5   a           60          0          0
5   b           55          0          0
5   tot        115          0          1
6   b           35          0          0
6   c           35          0          0
6   tot         70          0          1
tot tot        342          1          1
19 rows selected.
you can use a where clause to get rid of the tot/tot row if you want.
analytics (the over () functions) would not be appropriate, they do not "make up" rows like grouping sets, rollup and cube do. 
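For completeness, rollup(x,y) is grouping sets plus a grand-total row, and the grouping() flags are what let you tell the levels apart (and filter the tot/tot row out, as Tom suggests). A plain-Python sketch of that, on the same t data:

```python
from collections import defaultdict

rows = [
    (1, 'a', 5), (1, 'a', 10), (1, 'b', 25), (1, 'b', 12), (1, 'c', 5), (1, 'c', 5),
    (2, 'a', 5), (2, 'a', 15), (2, 'c', 5), (2, 'c', 15),
    (3, 'a', 5), (3, 'a', 5), (3, 'b', 5),
    (4, 'a', 5), (4, 'a', 35),
    (5, 'a', 55), (5, 'a', 5), (5, 'b', 55),
    (6, 'b', 35), (6, 'c', 35),
]

by_xy = defaultdict(int)
by_x = defaultdict(int)
grand = 0
for x, y, z in rows:
    by_xy[(x, y)] += z
    by_x[x] += z
    grand += z

# Each tuple is (x, y, sum, grouping(x), grouping(y)); rollup(x, y)
# produces all three levels of aggregation.
rollup = [(x, y, s, 0, 0) for (x, y), s in sorted(by_xy.items())]
rollup += [(x, 'tot', s, 0, 1) for x, s in sorted(by_x.items())]
rollup += [('tot', 'tot', grand, 1, 1)]

# "you can use a where clause to get rid of the tot/tot row":
no_grand = [r for r in rollup if not (r[3] == 1 and r[4] == 1)]
```

This gives the same 19 rows as the 8.1.7 output (row ordering aside), with the grand total of 342.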
 
 
 
Grouped Pagination
reader, February  07, 2008 - 8:26 am UTC
 
 
Hi Tom,
I have a view like this (Scott EMP table).
select min(job) over (partition by deptno) mnjb, deptno dnm , e.* from emp e
   order by mnjb,deptno;
This view returns an ordered result set. On this ordered result set I want to generate a sequential number against each distinct DEPTNO, which I will use for pagination.
The expected output is:
1 ANALYST 20 7369 SMITH CLERK 7902 12/17/1980 00:00:00 800  20
1 ANALYST 20 7876 ADAMS CLERK 7788 01/12/1983 00:00:00 1100  20
1 ANALYST 20 7788 SCOTT ANALYST 7566 12/09/1982 00:00:00 3000  20
1 ANALYST 20 7902 FORD ANALYST 7566 12/03/1981 00:00:00 3000  20
1 ANALYST 20 7566 JONES MANAGER 7839 04/02/1981 00:00:00 2975  20
2 CLERK  10 7782 CLARK MANAGER 7839 06/09/1981 00:00:00 2450  10
2 CLERK  10 7839 KING PRESIDENT  11/17/1981 00:00:00 5000  10
2 CLERK  10 7934 MILLER CLERK 7782 01/23/1982 00:00:00 1300  10
3 CLERK  30 7499 ALLEN SALESMAN 7698 02/20/1981 00:00:00 1600 300 30
3 CLERK  30 7698 BLAKE MANAGER 7839 05/01/1981 00:00:00 2850  30
3 CLERK  30 7654 MARTIN SALESMAN 7698 09/28/1981 00:00:00 1250 1400 30
3 CLERK  30 7900 JAMES CLERK 7698 12/03/1981 00:00:00 950  30
3 CLERK  30 7844 TURNER SALESMAN 7698 09/08/1981 00:00:00 1500 0 30
3 CLERK  30 7521 WARD SALESMAN 7698 02/22/1981 00:00:00 1250 500 30
Please refer to the first column. The sequence changes based on the ordered distinct DEPTNO as received from the view.
Is it possible to get the sequences using Analytics or any other method?
 
February  07, 2008 - 9:07 am UTC 
 
ops$tkyte%ORA10GR2> create or replace view v
  2  as
  3  select dense_rank() over (order by mnjb, deptno) rnk,
  4         x.*
  5    from (
  6  select min(job) over (partition by deptno) mnjb,
  7         deptno dnm ,
  8             e.*
  9    from emp e
 10         ) x
 11   order by mnjb,deptno;
View created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select * from v;
       RNK MNJB             DNM      EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- --------- ---------- ---------- ---------- --------- ---------- --------- ---------- ---------- ----------
         1 ANALYST           20       7876 ADAMS      CLERK           7788 12-JAN-83       1100                    20
         1 ANALYST           20       7566 JONES      MANAGER         7839 02-APR-81       2975                    20
         1 ANALYST           20       7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
         1 ANALYST           20       7788 SCOTT      ANALYST         7566 09-DEC-82   20182.53                    20
         1 ANALYST           20       7369 SMITH      CLERK           7902 17-DEC-80        800                    20
         2 CLERK             10        123 HelloWorld                                                              10
         2 CLERK             10       7839 KING       PRESIDENT            17-NOV-81       5000                    10
         2 CLERK             10       7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
         2 CLERK             10       7934 MILLER     CLERK           7782 23-JAN-82       1300                    10
         3 CLERK             30       7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
         3 CLERK             30       7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
         3 CLERK             30       7900 JAMES      CLERK           7698 03-DEC-81        950                    30
         3 CLERK             30       7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
         3 CLERK             30       7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
         3 CLERK             30       7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
15 rows selected.
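The reason dense_rank fits this pagination requirement: equal ordering keys share one number and there are no gaps, so each distinct (mnjb, deptno) pair gets the next sequential value. A small Python sketch of that semantic (the helper function is hypothetical, not an Oracle API):

```python
def dense_rank(keys):
    """Emulate dense_rank() over (order by key): keys must already be
    sorted; equal keys share a rank and ranks have no gaps."""
    rank, ranks, prev = 0, [], object()
    for k in keys:
        if k != prev:
            rank += 1
            prev = k
        ranks.append(rank)
    return ranks

# The 15 rows above collapse to these (mnjb, deptno) ordering keys:
keys = [('ANALYST', 20)] * 5 + [('CLERK', 10)] * 4 + [('CLERK', 30)] * 6
ranks = dense_rank(keys)
```

ranks comes back as 1 for all dept-20 rows, 2 for dept 10, and 3 for dept 30, matching the RNK column in the view output.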
 
 
 
 
Fantastic...superb....stunning
reader, February  07, 2008 - 9:42 am UTC
 
 
When will I be able to think like you?...just thinking out loud.... 
 
Ordering Rows by specific values
Humble, February  19, 2008 - 10:49 am UTC
 
 
Tom I have the below SQL
SELECT LPAD(NVL(pm.item,'0'),8,'0')                item,
       '00000000000'                               supplier,
       TO_CHAR(LPAD(NVL(pm.dept,0),3,0))           dept,
       TO_CHAR(LPAD(NVL(pm.class,0),4,'0'))        class,
       RPAD('R',1,'0')                             dummy,
       TO_CHAR(LPAD((ROUND(NVL(pm.new_price,0),2) * 100),7,'0')) new_price,  --izp.unit_retail,
       '0000000'                                   unit_cost,
       ' '                                         tax_category,
       RPAD(NVL(pm.item_short_desc,' '),18,' ')    item_short_desc,
       TO_CHAR(LPAD(NVL(pm.old_multi_units,0),3,'0')) old_multi_units,
       TO_CHAR(LPAD((ROUND(NVL(pm.old_multi_unit_retail,0),2) * 100),7,'0')) old_multi_unit_retail,
       NVL(RPAD(DECODE(pm.tran_type,1,'A',10,'M',13,'M',21,'D'),1,' '),' ')  tran_type
  FROM nex_pos_mods pm
 WHERE pm.store = TO_NUMBER(p_StoreNo)
   AND tran_type IN (1,10,13,21)
 GROUP BY pm.item,
           '00000000000',
    pm.dept,
    pm.class,
    pm.new_price,
           item_short_desc,
    pm.old_multi_units,
    pm.old_multi_unit_retail,
           '0000000',
    pm.tran_type;
I would like to order the results by specific tran_types.
I would like all 'D' values first,
all              'A' values second
all              'M' values third
what could I do using SQL analytics? 
February  19, 2008 - 5:03 pm UTC 
 
why does everyone say "how can I do this with analytics" when faced with any "i cannot figure out how to write this in sql"?
I know analytics are pretty amazing, but they are not magic.
Analytics do not sort result sets.
Analytics do not aggregate.
Analytics do not do many things.
this is a very simple order by requirement:
order by decode( tran_type, 'D', 1, 'A', 2, 'M', 3 )
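The decode-in-the-ORDER-BY idiom is just a custom sort key: map each code to its desired position and sort on the mapped value. In Python terms (the sample tran_type values here are made up for illustration):

```python
# decode(tran_type, 'D',1, 'A',2, 'M',3) as a sort key.
order = {'D': 1, 'A': 2, 'M': 3}

tran_types = ['M', 'D', 'A', 'M', 'D']
sorted_types = sorted(tran_types, key=lambda t: order[t])
```

All 'D' rows come first, then 'A', then 'M', exactly as the requester asked.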
 
 
 
Most popular price
Thomas, February  25, 2008 - 11:36 am UTC
 
 
Hi Tom,
I have a table containing articles and a table containing 
the current price for each article in each store.
For each article, I'd like to know the most popular price,
that is, the price used in the most number of markets. In 
case of a tie (several prices being the most popular for an 
article), I'd like to have the lowest of these most popular 
prices.
Here is some simple test data:
CREATE TABLE article (article INTEGER PRIMARY KEY);
CREATE TABLE price (article INTEGER REFERENCES article,store INTEGER,price NUMBER);
INSERT INTO article VALUES(1);
INSERT INTO article VALUES(2);
INSERT INTO price VALUES(1,1,4.5);
INSERT INTO price VALUES(1,2,4.5);
INSERT INTO price VALUES(1,3,3);
INSERT INTO price VALUES(1,4,5);
INSERT INTO price VALUES(2,1,6);
INSERT INTO price VALUES(2,2,6.5);
INSERT INTO price VALUES(2,3,6);
INSERT INTO price VALUES(2,4,6.5);
INSERT INTO price VALUES(2,5,6);
INSERT INTO price VALUES(2,6,6.5);
INSERT INTO price VALUES(2,6,5);
I've tried it like this (without analytics) for an 
individual article, which works fine:
SELECT * FROM (SELECT price FROM price p WHERE p.article=1 
GROUP BY p.price ORDER BY COUNT(*) DESC,price)
WHERE ROWNUM=1
However, when I try to use this as a subquery like this, 
Oracle 10gR2 won't let me:
SELECT a.article,(SELECT * FROM (SELECT price FROM price p 
WHERE p.article=a.article GROUP BY p.price
ORDER BY COUNT(*) DESC,price) WHERE ROWNUM=1)
FROM article a;
Oracle tells me it doesn't know the alias a. The alias a 
seems to be present in the first level of the subquery, but 
not in the second level, i.e., where I want to use it.
Questions:
1. Why does Oracle tell me it doesn't know about a? To me, 
this looks like a valid query. Is there anything I can do to 
allow me access to a from the inner level of the subquery?
2. Can this be done better using a subquery with analytics?
 
February  25, 2008 - 2:06 pm UTC 
 
1) the correlation name goes one level down. period.
2) 
ops$tkyte@ORA920> select article, price
  2    from (
  3  select article, price,
  4             row_number() over (partition by article order by cnt desc, price ) rn
  5    from (
  6  select article, price,
  7             count(*) over (partition by article, price) cnt
  8    from price
  9             )
 10             )
 11   where rn = 1
 12   order by article
 13  /
   ARTICLE      PRICE
---------- ----------
         1        4.5
         2          6
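The logic of Tom's nested query, stated procedurally: count occurrences of each price per article, then pick the price with the highest count, breaking ties by the lowest price. A plain-Python sketch using the same test data:

```python
from collections import Counter

# article -> list of store prices, from the INSERTs above
prices = {
    1: [4.5, 4.5, 3, 5],
    2: [6, 6.5, 6, 6.5, 6, 6.5, 5],
}

def most_popular(vals):
    counts = Counter(vals)
    # Highest count first (hence the negated count); ties broken by
    # the lowest price -- the "order by cnt desc, price" of the SQL.
    return min(counts, key=lambda p: (-counts[p], p))

result = {article: most_popular(v) for article, v in prices.items()}
```

Article 2 has a tie (6 and 6.5 each appear three times), and the lower price 6 wins, matching the query output.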
 
 
 
analytics
A reader, February  26, 2008 - 6:08 pm UTC
 
 
   Tom:
   
   If I have a table
   
   Mag_no, contract, stage, create_date
   1       ABC        5       1/1/2008
   1       ABC        5       1/15/2008
   1       ABC        6        2/2/2008
   1       ABC        6       3/3/3008
   ...............
   
   How do I have a query to display the last create date record with highest stage
   
   1       ABC        6       3/3/3008 
February  27, 2008 - 2:26 am UTC 
 
The lack of specificity here is huge....
apparently "stage" is not relevant, you want to collapse over that.
but what about mag_no, contract....
and where is the create table, the insert intos
 
 
 
Partition is detached from Order by?
Marat, April     12, 2008 - 4:47 am UTC
 
 
Hi Tom,
lets see a table
CREATE TABLE TTT
(
  F1  VARCHAR2(10),
  F2  NUMBER
);
Insert into TTT (F1, F2) Values ('1', 1);
Insert into TTT (F2) Values (2);
Insert into TTT (F2) Values (3);
Insert into TTT (F1, F2)Values ('2', 4);
Insert into TTT (F2) Values(5);
Insert into TTT (F2) Values(6);
Insert into TTT (F2) Values(7);
select ttt.*,
       row_number() over ( partition by f1 order by f2) rn,
       first_value(f2) over ( partition by f1 order by f2)fv
  from ttt
 order by f2;     
F1 F2 RN FV
1 1 1 1
  2 1 2
  3 2 2
2 4 1 4
  5 3 2
  6 4 2
  7 5 2
Note the values in RN - they continue from the first partition (f1=1), and FIRST_VALUE points to the second row, where the nulls in F1 begin.
So, how can I partition so that the partitions follow the ORDER BY?
Actually, I need the result like
F1 F2 RN FV
1 1 1 1
1 2 1 2
1 3 2 2
2 4 1 4
2 5 1 5
2 6 2 5
2 7 3 5
that is, I need null values to be filled with previous non-null value.
Thank you. 
April     13, 2008 - 8:32 am UTC 
 
... Note the values in RN - they continue from the first partition (f1=1), and FIRST_VALUE points to the second row, where nulls in F1 begin.  ...
no, they don't:
F1    F2    RN    FV
1      1    1    1      <<<<= partition 1
2      4    1    4      <<<<= partition 2
       2    1    2      <<<<= partition 3
       3    2    2
       5    3    2
       6    4    2
       7    5    2
Now, since ROWS IN A TABLE HAVE NO ORDER WHATSOEVER, I have to presume you want to carry down the last non-null value of F1 after sorting the entire set by F2 - F2 is what "sequences" or "orders" this data - right?
So, your requirement is simply stated as "carry down the last non-null value of F1 after sorting all data by F2"
ops$tkyte%ORA10GR2> select last_value(f1 ignore nulls) over (order by f2) f1_new,
  2         f2
  3    from ttt
  4   order by f2
  5  /
F1_NEW             F2
---------- ----------
1                   1
1                   2
1                   3
2                   4
2                   5
2                   6
2                   7
7 rows selected.
that works in 10g, in earlier releases, assuming f2 is NOT NULL and POSITIVE - the following would cut it:
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select substr( max( case when f1 is not null then to_char(f2,'fm0000000') || f1 end ) over (order by f2), 8, 10 ) f1_new,
  2         f2
  3    from ttt
  4   order by f2
  5  /
F1_NEW             F2
---------- ----------
1                   1
1                   2
1                   3
2                   4
2                   5
2                   6
2                   7
7 rows selected.
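What last_value(f1 ignore nulls) over (order by f2) does, stated procedurally: sort the rows by f2, then carry the most recent non-null f1 forward into the null rows. The same carry-down in plain Python, on the TTT data:

```python
# (f1, f2) pairs from the INSERTs above; None plays the role of NULL.
rows = [('1', 1), (None, 2), (None, 3), ('2', 4),
        (None, 5), (None, 6), (None, 7)]
rows.sort(key=lambda r: r[1])  # order by f2 -- f2 "sequences" the data

filled, last = [], None
for f1, f2 in rows:
    if f1 is not None:
        last = f1              # remember the last non-null f1 seen
    filled.append((last, f2))  # carry it down into the null rows
```

filled reproduces the F1_NEW column of Tom's 10g answer.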
 
 
 
How it works?
A reader, April     14, 2008 - 3:30 am UTC
 
 
Excellent! Thank you, Tom! It works fine.
But... I still can't understand why it works NOT like this:
F1    F2    RN    FV
1    1    1    1   => Partition 1
     2    1    2   => Partition 2
     3    2    2
2    4    1    4   => Partition 3
     5    3    2   => Partition 4
     6    4    2
     7    5    2
 
April     16, 2008 - 1:50 pm UTC 
 
umm, because F1 has three values in your table
f1 = 1
f1 = 2
f1 is null
therefore, if you partition by f1, you will have three partitions. 
 
 
Does PARTITION BY work before ORDER BY?
Marat, April     15, 2008 - 5:25 am UTC
 
 
Well, now I realize that PARTITION BY works before ORDER BY, doesn't it? 
April     16, 2008 - 2:42 pm UTC 
 
the data is partitioned by your partition by
THEN, the data in that partition is sorted.
THEN, the window is "set up" (range or row window), the default window is typically "current row and all preceding rows" with an order by.
THEN the function is applied. 
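Tom's four steps (partition, sort within the partition, set up the default window of all preceding rows plus the current row, apply the function) can be sketched as a running sum per partition in plain Python. The function and data here are illustrative, not any Oracle internal:

```python
from collections import defaultdict

def running_sum(rows, part_key, order_key, val):
    parts = defaultdict(list)
    for i, r in enumerate(rows):                       # 1. partition
        parts[part_key(r)].append(i)
    out = [None] * len(rows)
    for idxs in parts.values():
        idxs.sort(key=lambda i: order_key(rows[i]))    # 2. sort the partition
        acc = 0
        for i in idxs:     # 3. default window: preceding rows + current row
            acc += val(rows[i])
            out[i] = acc                               # 4. apply the function
    return out

# (group, seq, amount) sample rows
rows = [('a', 1, 10), ('a', 2, 20), ('b', 1, 5), ('a', 3, 30), ('b', 2, 5)]
sums = running_sum(rows, lambda r: r[0], lambda r: r[1], lambda r: r[2])
```

Each row gets the sum of its own partition up to and including itself, which is what sum(val) over (partition by grp order by seq) would return.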
 
 
Query
Tim, April     17, 2008 - 2:12 pm UTC
 
 
N1 N2
1 345
2 645
3 378
4 95
5 2557
6 95
7 111
8 8756
9 40
I need to write SQL for below algorithm:
As you see N1 is sequential. So, algorithm starts from end (MAX N1).
1. Get current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
Regarding above data, query must return;
N1 N2
1 345 -> return this (345 is less than 645)
2 645
3 378
4 95 -> return this (95 is less than 2557)
5 2557
6 95
7 111 -> return this (111 is less than 8756)
8 8756
9 40
Could you pls help to write this query? 
April     17, 2008 - 4:33 pm UTC 
 
no create table
no insert into table
no look
this is trivial with lag or lead - read up on them, you'll probably be able to figure it out.  lag lets you look "back N records" in a result set.  Lead - look forward N records... 
 
 
create table
Tim, April     18, 2008 - 12:05 am UTC
 
 
Sorry Tom.
Pls see below for table creation.
create table test (N1 int, N2 int);
insert into test values (1,345);
insert into test values (2,645);
insert into test values (3,378);
insert into test values (4,95);
insert into test values (5,2557);
insert into test values (6,95);
insert into test values (7,111);
insert into test values (8,8756);
insert into test values (9,40);
Thanks; 
April     18, 2008 - 8:31 am UTC 
 
did you even try, sigh.... the answer was given, hundreds of examples exist on this site...
did you notice your logic doesn't match your picture?
1. Get current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
Regarding above data, query must return;
N1    N2
1    345 -> return this (345 is less than 645)
2    645
3    378
4    95 -> return this (95 is less than 2557)
5    2557
6    95
7    111 -> return this (111 is less than 8756)
8    8756
9    40
1. Get current N1 and N2 (let's keep them as N1_1 and N2_1)
2. Go to the previous record and get the new N1 and N2 (let's keep them as N1_2 and N2_2)
3. If N2_2 < N2_1 then return this record, else go to the previous record and start from step 1
1) assume current = N1=7, N2_1=111
2) previous record is N1=6, N2_2=95
3) n2_2 < n2_1, but you wrote "111 is less than 8756"
Ok, play with lag and lead:
ops$tkyte%ORA10GR2> select n1, n2,
  2         lead(n2) over (order by n1) next_n2,
  3         lag(n2) over (order by n1) prior_n2
  4    from test
  5  /
        N1         N2    NEXT_N2   PRIOR_N2
---------- ---------- ---------- ----------
         1        345        645
         2        645        378        345
         3        378         95        645
         4         95       2557        378
         5       2557         95         95
         6         95        111       2557
         7        111       8756         95
         8       8756         40        111
         9         40                  8756
9 rows selected.
and tell us what you really meant - if you cannot get it yourself given this example. 
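lag and lead over (order by n1) are just a shift of the ordered n2 values by one position, padding with NULL at the ends. The same shift in plain Python, on the test data (this only reproduces Tom's helper columns; the final filter still depends on the requester fixing the specification):

```python
# (n1, n2) pairs from the INSERTs into test
rows = [(1, 345), (2, 645), (3, 378), (4, 95), (5, 2557),
        (6, 95), (7, 111), (8, 8756), (9, 40)]
rows.sort()                     # order by n1
n2 = [r[1] for r in rows]

# lead(n2): the next row's value; lag(n2): the previous row's value.
# None stands in for Oracle's NULL at the boundaries.
lead_n2 = n2[1:] + [None]
lag_n2 = [None] + n2[:-1]
```

These two lists match the NEXT_N2 and PRIOR_N2 columns in Tom's output.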
 
 
keen relationship
Ken, April     23, 2008 - 6:29 pm UTC
 
 
If IDs are available then writing the query gets easy, but what could be used if there are only name-matching patterns?
 For instance, the combination is that the second name of a son will be the first name of his father, the third name of the son will be the first name of his grandfather, and the fourth name will be the name of his great-grandfather. This rule applies to all names; that's why we have 4 name fields other than the family name. 
I have a table with following fields
CREATE TABLE KEEN_RELATIONSHIP
   ( "TESTID" NUMBER, 
 "AGE" NUMBER(3,0), 
 "FIRSTNAME" VARCHAR2(25 BYTE), 
 "SECONDNAME" VARCHAR2(25 BYTE), 
 "THIRDNAME" VARCHAR2(25 BYTE), 
 "FORTHNAME" VARCHAR2(25 BYTE), 
 "FAMILYNAME" VARCHAR2(25 BYTE));
And have 39 following records in this table
-- INSERTING into KEEN_RELATIONSHIP
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961527492,47,'Mishal','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961562389,47,'Waleed','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962101220,42,'Khalid','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964338909,30,'Faisal','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961746095,44,'Abdullah','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960785299,56,'Suliman','Salah','Suliman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960986913,55,'Abdul Aziz','Salah','Suliman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962749770,38,'Ali','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962579046,39,'Maman','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961747704,45,'Bader','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962145270,43,'Khalid','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960714745,66,'Ibrahim','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960792751,57,'Mohd','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960111460,82,'Ali','Abdul Mohsin','Mohd',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961886076,44,'Abdul Latif','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964355027,33,'Abdul Aziz','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963164687,36,'Abdul Mohsin','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960519063,72,'Baderia','Ali',null,null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963027715,36,'Shekha','Ibrahim','Abdul Rehman','Salah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960985516,52,'Noora','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961282916,50,'Fozia','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962082670,64,'Naja','Rashid','Abdul Rehman',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961971436,45,'Mana','Ali','Salah',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962094798,82,'Munira','Abdul Mohsin','Mohd',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (964127289,35,'Alia','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961377460,48,'Nawal','Abdul Aziz','Abdul Mohsin',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (967763537,23,'Noora','Abdullah','Mohd','Ali Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (967775024,23,'Shekha','Bader','Abdul Aziz','Abdul Mohsin','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962214823,41,'Khlood','Mohd','Ali','Abdullah','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960958623,54,'Baderia','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961775619,62,'Wasam','Ali','Abdullah',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961541876,51,'Maryaam','Mohd','Ibrahim','Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (962343452,61,'Husa','Al Salah','Al Mansoor',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (966099168,26,'Afnan','Ibrahim','Mohd','Ibrahim Al Awada','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961669035,45,'Mana','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (961076755,51,'Shekha','Ali','Abdul Mohsin','Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (960791757,91,'Lulu','Ibrahim','Al Awada',null,'Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (966364107,30,'Ghzlan','Mohd','Ali','Abdul Mohsin Mohd','Jaan');
Insert into KEEN_RELATIONSHIP (TESTID,AGE,FIRSTNAME,SECONDNAME,THIRDNAME,FORTHNAME,FAMILYNAME) values (963587153,73,'Latifa','Abdul Rehman','Salah',null,'Jaan'); 
April     28, 2008 - 9:37 am UTC 
 
... If Ids are available then the query writing gets easy  ...
no it doesn't; in fact, it can be easier to get wrong - but we digress.  
there is no question here as far as I can see - at all.
and if the rule were "my name is derived from this rule", I would have a hierarchy - NOT a flat table like this.
Meaning, I have my name, my father has his name and so on - and if you want to see my full name, we retrieve the data on the fly. 
 
 
A reader, May       20, 2008 - 7:03 pm UTC
 
 
create table fas157_mref_inp_load_status_b
(Load_id integer,
 Load_Status char(1),
 month_id integer,
 updated_dt date default sysdate,
 updated_by varchar2(4000) default sys_context('USERENV','OS_USER'));
alter table fas157_mref_inp_load_status add constraint fas157_mref_inp_load_status_pk 
primary key(Load_id);
create table fas157_mref_out_load_status_b
(Load_id integer,
 Load_Status char(1),
 month_id integer,
 updated_dt date default sysdate,
 updated_by varchar2(4000) default sys_context('USERENV','OS_USER'));
alter table fas157_mref_out_load_status add constraint fas157_mref_out_load_status_pk 
primary key(Load_id);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (2, null, 200804, to_date('16-05-2008 15:46:52', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (3, 'P', 200804, to_date('09-05-2008 15:58:44', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (1, 'P', 200804, to_date('09-05-2008 15:12:21', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (4, 'p', 200804, to_date('20-05-2008 09:12:21', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (5, 'P', 200804, to_date('20-05-2008 09:11:21', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_INP_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (6, 'P', 200804, to_date('09-05-2008 15:59:44', 'dd-mm-yyyy hh24:mi:ss'), null);
commit;
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (86, 'P', 200804, to_date('14-05-2008 09:56:51', 'dd-mm-yyyy hh24:mi:ss'), 'mamarti');
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (87, 'P', 200804, to_date('20-05-2008 11:56:51', 'dd-mm-yyyy hh24:mi:ss'), null);
insert into FAS157_MREF_OUT_LOAD_STATUS_B (LOAD_ID, LOAD_STATUS, MONTH_ID, UPDATED_DT, UPDATED_BY)
values (88, 'P', 200804, to_date('19-05-2008 11:56:51', 'dd-mm-yyyy hh24:mi:ss'), null);
commit;
 DESC fas157_mref_INP_load_status_b
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 LOAD_ID                                            NUMBER(38)
 LOAD_STATUS                                        CHAR(1 CHAR)
 MONTH_ID                                           NUMBER(38)
 UPDATED_DT                                         DATE
 UPDATED_BY                                         VARCHAR2(4000 CHAR)
desc FAS157_MREF_OUT_LOAD_STATUS_B
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 LOAD_ID                                            NUMBER(38)
 LOAD_STATUS                                        CHAR(1 CHAR)
 MONTH_ID                                           NUMBER(38)
 UPDATED_DT                                         DATE
 UPDATED_BY                                         VARCHAR2(4000 CHAR)
For all load_ids in output table(FAS157_MREF_OUT_LOAD_STATUS_B
) with load_status='P', I would like to get the load_id in input table (FAS157_MREF_INP_LOAD_STATUS_B) whose updated_dt is nearest to and less than the updated_dt in output table. Can we do this in a single SQL ?
thanks
Christian
 
May       20, 2008 - 10:00 pm UTC 
 
ops$tkyte%ORA10GR2> select out.*,
  2         (select max( load_id ) keep ( dense_rank first order by inp.updated_dt desc )
  3                from FAS157_MREF_INP_LOAD_STATUS_B inp
  4                   where inp.updated_dt <= out.updated_dt ) that_load_id
  5    from FAS157_MREF_OUT_LOAD_STATUS_B out
  6   where out.load_status = 'P'
  7  /
   LOAD_ID L   MONTH_ID UPDATED_DT           UPDATED_BY THAT_LOAD_ID
---------- - ---------- -------------------- ---------- ------------
        86 P     200804 14-may-2008 09:56:51 mamarti               6
        87 P     200804 20-may-2008 11:56:51                       4
        88 P     200804 19-may-2008 11:56:51                       2
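Tom's KEEP (DENSE_RANK FIRST ...) syntax is Oracle-specific. As a rough portability sketch (SQLite via Python, with a simplified, hypothetical two-column version of each table), the same "latest input row at or before each output row" lookup can be written as a correlated ORDER BY ... LIMIT 1 scalar subquery:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inp (load_id INTEGER, updated_dt TEXT);
CREATE TABLE out (load_id INTEGER, load_status TEXT, updated_dt TEXT);
INSERT INTO inp VALUES (2, '2008-05-19 10:00:00'),
                       (4, '2008-05-20 09:00:00'),
                       (6, '2008-05-14 08:00:00');
INSERT INTO out VALUES (86, 'P', '2008-05-14 09:56:51'),
                       (87, 'P', '2008-05-20 11:56:51'),
                       (88, 'P', '2008-05-19 11:56:51');
""")
rows = conn.execute("""
SELECT o.load_id,
       -- latest input row whose updated_dt is at or before this output row's
       (SELECT i.load_id
          FROM inp i
         WHERE i.updated_dt <= o.updated_dt
         ORDER BY i.updated_dt DESC
         LIMIT 1) AS that_load_id
  FROM out o
 WHERE o.load_status = 'P'
 ORDER BY o.load_id
""").fetchall()
print(rows)  # [(86, 6), (87, 4), (88, 2)] - same pairing as Tom's output
```

The ISO-8601 text dates compare correctly as strings, which is what makes the correlated `<=` work here.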
 
 
 
 
A reader, May       20, 2008 - 7:06 pm UTC
 
 
Correction above
alter table fas157_mref_inp_load_status_b add constraint fas157_mref_inp_load_status_pk 
primary key(Load_id);
alter table fas157_mref_out_load_status_b add constraint fas157_mref_out_load_status_pk 
primary key(Load_id);
 
 
Reader, May       21, 2008 - 12:59 pm UTC
 
 
create table test
(id varchar2(10),
 dt date,
 amt number);
 
insert into test
values
('ABC', to_date('01/02/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/02/2008','mm/dd/yyyy'),40);
insert into test
values
('ABC', to_date('01/03/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/04/2008','mm/dd/yyyy'),50);
insert into test
values
('ABC', to_date('01/05/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/06/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/07/2008','mm/dd/yyyy'),0);
--
insert into test
values
('PQR', to_date('01/02/2008','mm/dd/yyyy'),30);
insert into test
values
('PQR', to_date('01/03/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/04/2008','mm/dd/yyyy'),80);
insert into test
values
('PQR', to_date('01/05/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/06/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/07/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/08/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/09/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/10/2008','mm/dd/yyyy'),30);
select * from test;
ID         DT               AMT
---------- --------- ----------
ABC        02-JAN-08         20
ABC        02-JAN-08         40
ABC        03-JAN-08          0
ABC        04-JAN-08         50
ABC        05-JAN-08          0
ABC        06-JAN-08         20
ABC        07-JAN-08          0
PQR        02-JAN-08         30
PQR        03-JAN-08          0
PQR        04-JAN-08         80
PQR        05-JAN-08          0
PQR        06-JAN-08         10
PQR        07-JAN-08          0
PQR        08-JAN-08         10
PQR        09-JAN-08          0
PQR        10-JAN-08         30
16 rows selected.
I need to find sum for the last three days for each id 
For example 
if today is 1/4/2008, I need to find the sum of 1/2/2008, 1/3/2008 and 1/4/2008
if today is 1/5/2008, I need to find the sum of 1/5/2008, 1/4/2008 and 1/3/2008
Can you please help me do this using analytic functions? 
May       21, 2008 - 3:14 pm UTC 
 
that would not involve analytics at all.  It is a rather simple "where" and group by.
ops$tkyte%ORA10GR2> variable dt varchar2(20)
ops$tkyte%ORA10GR2> exec :dt := '1/4/2008'
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> select id, min(dt), max(dt), sum(amt)
  2    from test where dt between to_date( :dt, 'mm/dd/yyyy')-2 and to_date( :dt, 'mm/dd/yyyy')
  3   group by id;
ID         MIN(DT)   MAX(DT)     SUM(AMT)
---------- --------- --------- ----------
PQR        02-JAN-08 04-JAN-08        110
ABC        02-JAN-08 04-JAN-08        110
ops$tkyte%ORA10GR2> exec :dt := '1/5/2008'
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> /
ID         MIN(DT)   MAX(DT)     SUM(AMT)
---------- --------- --------- ----------
PQR        03-JAN-08 05-JAN-08         80
ABC        03-JAN-08 05-JAN-08         50
 
 
 
A reader, May       21, 2008 - 1:46 pm UTC
 
 
thanks 
Christian 
 
Reader, May       21, 2008 - 3:56 pm UTC
 
 
Tom,
Sorry, I missed mentioning this in my previous question.
We have data since 2004. I need to get the sum for all dates. I need to create a view for this, so that I can get data for any date whenever I query.
If the date is 05/07/2004, I should sum the data for 05/05/2004, 05/06/2004 and 05/07/2004.
If the date is 05/06/2004, I should sum the data for 05/04/2004, 05/05/2004 and 05/06/2004, and so on. I am not sure if this is called a moving sum. Please advise. 
 
Reader, May       21, 2008 - 4:28 pm UTC
 
 
With regard to the above question -
If I do dt-2, it will include weekends and holidays as well. In the test table, I do not store data for weekends and holidays.  
May       21, 2008 - 5:21 pm UTC 
 
ok, so that is different - it would/could be an analytic with a range window - but there is no such thing as "a holiday" to the database - what do you need to have happen here, and what existing data do you have that will help make it happen? 
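Since only business days are stored, a physical ROWS window (the last three stored rows per id) yields the 3-business-day moving sum without tripping over weekend gaps the way a date-based RANGE window would. A minimal sketch, using SQLite window functions through Python on a cut-down version of the test table (one row per day here, so no inner GROUP BY is needed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (id TEXT, dt TEXT, amt REAL);
INSERT INTO test VALUES
 ('ABC','2008-01-11',0),('ABC','2008-01-14',0),
 ('ABC','2008-01-15',0),('ABC','2008-01-16',20),
 ('ABC','2008-01-18',0);
""")
rows = conn.execute("""
SELECT id, dt,
       -- last three *stored* rows, i.e. last three business days
       SUM(amt) OVER (PARTITION BY id ORDER BY dt
                      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sum_3d
  FROM test
 ORDER BY id, dt
""").fetchall()
print(rows)  # sum_3d runs 0, 0, 0, 20, 20
```

The same ROWS BETWEEN 2 PRECEDING AND CURRENT ROW frame is what the original Oracle query used; the frame counts rows, not days, which is exactly the behaviour wanted when non-business days are absent from the table.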
 
 
Vaibhav, May       22, 2008 - 12:39 am UTC
 
 
hi tom, 
I had to write a query that would give me the number of errors that occurred for a particular error code over the past 3 hours, split into 15-minute intervals.
e.g.: if it is 12 o'clock now, then I would want the number of errors that occurred between 9:00 and 9:15, 9:15 and 9:30, ... 11:45 and 12:00 for each error code. I went through your site and could write the following query:
SELECT *
FROM   (SELECT * 
 FROM   (SELECT  me.error_code,
  COUNT (CASE when me.error_timestamp   BETWEEN (p_error_date - 3/24) AND (p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') THEN 1 ELSE NULL END) in1,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') THEN 1 ELSE NULL END) in2,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') THEN 1 ELSE NULL END) in3,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') THEN 1 ELSE NULL END) in4,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') THEN 1 ELSE NULL END) in5,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') THEN 1 ELSE NULL END) in6,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') THEN 1 ELSE NULL END) in7,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') THEN 1 ELSE NULL END) in8,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') THEN 1 ELSE NULL END) in9,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') THEN 1 ELSE NULL END) in10,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') THEN 1 ELSE NULL END) in11,
  COUNT (CASE when me.error_timestamp BETWEEN (p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') AND (p_error_date - 3/24) + NUMTODSINTERVAL(180, 'MINUTE') THEN 1 ELSE NULL END) in12,
  SUM (COUNT(1)) OVER (PARTITION BY error_code ORDER BY error_code) total_count
  FROM mhs_errors me, mhs_transaction_status mts
  WHERE me.transaction_id = mts.transaction_id
  AND me.error_timestamp BETWEEN p_error_date - 3/24 AND p_error_date
  AND     status_type = p_status_type
  GROUP BY me.error_code)
 ORDER BY total_count desc)
WHERE ROWNUM <= NVL(p_batchsize, 10)
I originally used a function that was called 12 times for each error code; then I went through your queries, learned about analytic functions, and only then could I write the above query.
I want to know if there is a better way to write this query?
Please help. 
May       22, 2008 - 7:10 am UTC 
 
I have no idea what to do here.  No create tables, no inserts...
I don't know why you have so many columns for such a simple return - seems like it would have three columns in the output:
time (rounded to 15 minute intervals)
error code
count(*)
grouped by time, error code
no analytics.  
Given that is what I think - you must not be presenting your requirements.... sorry. 
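The three-column shape Tom describes (15-minute bucket, error code, count) can be sketched by flooring epoch seconds to a 900-second boundary. A minimal illustration in SQLite via Python, with a simplified, hypothetical version of mhs_errors (the real query would anchor the buckets at the user-supplied end time rather than at clock boundaries):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mhs_errors (error_code TEXT, error_timestamp TEXT);
INSERT INTO mhs_errors VALUES
 ('err1','2008-05-22 11:50:00'),
 ('err1','2008-05-22 11:52:00'),
 ('err2','2008-05-22 12:10:00');
""")
rows = conn.execute("""
SELECT -- floor the timestamp to the containing 15-minute (900 s) bucket
       datetime((CAST(strftime('%s', error_timestamp) AS INTEGER) / 900) * 900,
                'unixepoch') AS bucket_start,
       error_code,
       COUNT(*) AS cnt
  FROM mhs_errors
 GROUP BY bucket_start, error_code
 ORDER BY bucket_start, error_code
""").fetchall()
print(rows)
# [('2008-05-22 11:45:00', 'err1', 2), ('2008-05-22 12:00:00', 'err2', 1)]
```

One GROUP BY row per (bucket, error code) replaces the twelve hand-written CASE columns; pivoting to columns, if the UI needs it, is then a separate presentation step.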
 
 
Can try this
Ayan, May       22, 2008 - 3:39 am UTC
 
 
You can write a view as follows:
select sysdate,
       (trunc(sysdate,'hh')-3/24)+r*15/(24*60) start_dt,
       (trunc(sysdate,'hh')-3/24)+(r+1)*15/(24*60) end_dt 
from(
select sysdate,rownum r              
from user_objects
where rownum<13
)
and then compare your data as 
error_timestamp >= start_dt and error_timestamp<end_dt
and then do the grouping.
I don't see a use for analytic functions here, but Tom is the best one to answer 
 
Reader, May       22, 2008 - 10:01 am UTC
 
 
create table test
(id varchar2(10),
 dt date,
 amt number);
insert into test
values
('ABC', to_date('01/10/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/11/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/14/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/15/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/16/2008','mm/dd/yyyy'),20);
insert into test
values
('ABC', to_date('01/18/2008','mm/dd/yyyy'),0);
insert into test
values
('ABC', to_date('01/22/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/11/2008','mm/dd/yyyy'),30);
insert into test
values
('PQR', to_date('01/14/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/15/2008','mm/dd/yyyy'),80);
insert into test
values
('PQR', to_date('01/16/2008','mm/dd/yyyy'),0);
insert into test
values
('PQR', to_date('01/18/2008','mm/dd/yyyy'),10);
insert into test
values
('PQR', to_date('01/22/2008','mm/dd/yyyy'),0);
commit;
01/15/2008 is a Tuesday
01/14/2008 is a Monday
01/11/2008 is a Friday
I have zero for the amount on all the above days for id = ABC. I do not store data for weekends and holidays in the test table.
When I do the 3-day sum using the windowing option, I get 1E-34 instead of zero. I am not sure why I am getting 1E-34. Please advise.
select dt
      ,id
      ,sum(amt) as sum_daily
      ,sum(sum(amt)) OVER (PARTITION BY id order by dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as SUM_3d
from test        
group by dt, id
order by dt;
DT        ID          SUM_DAILY     SUM_3D
--------- ---------- ---------- ----------
11-JAN-08 ABC                 0          0
11-JAN-08 PQR                30         30
14-JAN-08 ABC                 0          0
14-JAN-08 PQR                 0         30
15-JAN-08 PQR                80        110
15-JAN-08 ABC                 0         1E-34
16-JAN-08 ABC                20         20
16-JAN-08 PQR                 0         80
18-JAN-08 PQR                10         90
18-JAN-08 ABC                 0         20
22-JAN-08 PQR                 0         10
22-JAN-08 ABC                 0         20
12 rows selected. 
May       23, 2008 - 8:01 am UTC 
 
I cannot reproduce.
Is there any other information - like platform and precise version? 
 
 
Vaibhav, May       23, 2008 - 12:52 am UTC
 
 
Hi
Thanks for that quick reply.
I apologise for not providing any create and insert scripts.
The reason I am using PARTITION BY is that I want the total count as well for that particular error code.
Kindly see below the kind of output that I am looking for.
Also, Ayan mentioned using TRUNC(error_date, 'HH').
I cannot use this because it will round the error timestamp, and I don't want that.
What I want is: if the user has passed the error date as 2008/05/22 12:35:56, then
I should give him the error count as
ERROR_CODE IN1 IN2 IN3 IN4 IN5 IN6 IN7 IN8 IN9 IN10 IN11 IN12 TOTAL_COUNT
err1  0 0 0 0 0 0 0 0 0 2 0 5 7
err2  0 0 0 0 0 0 0 0 0 0 0 5 5
err3  0 0 0 0 0 0 0 0 0 0 1 5 6
err4  0 0 0 0 0 0 0 1 0 0 0 8 9
here IN1 = 09:35:56 to 09:50:56
     IN2 = 09:50:56 to 10:05:56
     IN3 = 10:05:56 to 10:20:56
and so on...the last interval would be
     IN12 = 12:20:56 to 12:35:56
the last column TOTAL_COUNT would give me the total count for that error code for that 3 hours ie 09:35:56 to 12:35:56
Here are the scripts:
CREATE TABLE mhs_errors
(
 error_id   NUMBER(10),
 transaction_id   VARCHAR2(36) NOT NULL,
 error_code   VARCHAR2(50) NOT NULL,
 error_severity   VARCHAR2(10),
 error_timestamp   TIMESTAMP,
 error_text   VARCHAR2(700)
);
ALTER TABLE mhs_errors
ADD CONSTRAINT pk_mhs_errors PRIMARY KEY(error_id);
Insert into MHS_ERRORS Values (2, 'tran1', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (3, 'tran2', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (4, 'tran3', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (5, 'tran4', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (6, 'tran5', 'err1', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (7, 'tran6', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (8, 'tran7', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (9, 'tran8', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (10, 'tran9', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (11, 'tran10', 'err2', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (12, 'tran11', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (13, 'tran12', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (14, 'tran13', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (15, 'tran14', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (16, 'tran15', 'err3', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (17, 'tran16', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (18, 'tran17', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (19, 'tran18', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (20, 'tran19', 'err4', 'fatal', systimestamp, 'dummy error text');
Insert into MHS_ERRORS Values (21, 'tran20', 'err4', 'fatal', systimestamp, 'dummy error text');
COMMIT;
SELECT *
FROM   (SELECT * 
 FROM   (SELECT  me.error_code,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') 
                 THEN 1 ELSE NULL END) in1,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(15, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') 
                 THEN 1 ELSE NULL END) in2,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(30, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE')
                 THEN 1 ELSE NULL END) in3,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(45, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE')
                 THEN 1 ELSE NULL END) in4,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(60, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE')
                 THEN 1 ELSE NULL END) in5,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(75, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE')
                 THEN 1 ELSE NULL END) in6,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(90, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE')
                 THEN 1 ELSE NULL END) in7,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(105, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE')
                 THEN 1 ELSE NULL END) in8,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(120, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE')
                 THEN 1 ELSE NULL END) in9,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(135, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE')
                 THEN 1 ELSE NULL END) in10,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(150, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE')
                 THEN 1 ELSE NULL END) in11,
            COUNT (CASE when me.error_timestamp BETWEEN (&&p_error_date - 3/24) + NUMTODSINTERVAL(165, 'MINUTE') AND (&&p_error_date - 3/24) + NUMTODSINTERVAL(180, 'MINUTE')
                 THEN 1 ELSE NULL END) in12,
            SUM (COUNT(1)) OVER (PARTITION BY error_code ORDER BY error_code) total_count
            FROM    mhs_errors me
            WHERE    me.error_timestamp BETWEEN &&p_error_date - 3/24 AND &&p_error_date
            GROUP BY me.error_code)
 ORDER BY total_count desc)
WHERE ROWNUM <= NVL(:p_batchsize, 10)
I get the following output after executing the above query:
ERROR_CODE IN1 IN2 IN3 IN4 IN5 IN6 IN7 IN8 IN9 IN10 IN11 IN12 TOTAL_COUNT
err1  0 0 0 0 0 0 0 0 0 0 0 5 5
err2  0 0 0 0 0 0 0 0 0 0 0 5 5
err3  0 0 0 0 0 0 0 0 0 0 0 5 5
err4  0 0 0 0 0 0 0 0 0 0 0 5 5
Kindly guide me how to write this in a better way
thanks in advance 
May       23, 2008 - 9:02 am UTC 
 
I see nothing wrong with your existing query.
You need a function / column you want in the output - your counts are correct as is.
You are done. (well, use a bind variable in real life, no &&p_error_date of course and always - ALWAYS use to-date on a string to convert to a date - using the right date format as well) 
 
 
Vaibhav, May       23, 2008 - 12:44 pm UTC
 
 
Hi
Thanks for that piece of advice...
i will make sure i am using to date function henceforth...
now i will tell you what the actual buisness scenario would be...
on the screen, user has an option to select the time interval...
i.e. last 3 hours, last 6 hours, last 12 hours, last 24 hours
the only difference is: for 3 hours the interval is 15 minutes, i.e. for each error code we get 4 error counts per hour, making 12 error counts for 3 hours (12 cols + 1 total count = 13 cols in the output)
for 6 hours, the interval is again 15 minutes, so for each error code we have 6*4=24 error counts (24 cols in the output)
for 12 hours, the interval is 30 minutes, so for each error code we have 12*2=24 error counts (24 cols in the output)
for 24 hours, the interval is 1 hour, so for each error code we have 24*1=24 error counts (24 cols in the output)
what I have done is use an IF condition, rewriting the same query 4 times with different time comparisons...
is there a way by which i can write just one select query and still get the desired output...
kindly help me with these..
thanks a million again for all the help that you provide to people...
and I have been reading throughout this thread that you have written a book on analytics...
where can I purchase it from?
let me know about this 
May       23, 2008 - 6:19 pm UTC 
 
... is there a way by which i can write just one select query and still get the 
desired output...
 ...
a query has a fixed number of columns - unless you want to include nulls in the other queries, no.  the single query would have a SINGLE set of columns 
 
 
Vaibhav, May       24, 2008 - 1:52 am UTC
 
 
Thanks...
I was just wondering if there is any way to filter the number of columns dynamically... it's not possible, as I don't want null values in those undesired columns... thanks for confirming this...
TOM... I asked you about your book Expert One-on-One.
I surfed some sites like amazon.com...
but I am from India... how do I purchase the book here? 
May       24, 2008 - 7:00 pm UTC 
 
for books: take ISBN to any reputable bookstore and say "I want this book"
expert one on one is out of print, effective oracle by design and expert oracle database architecture are not.
a sql query has a FIXED NUMBER OF COLUMNS, period. 
 
 
Vaibhav, May       25, 2008 - 2:27 am UTC
 
 
thanks...
i will grab a copy of the available books asap... 
 
Min And Max of date ranges
Shivdeep Modi, May       27, 2008 - 8:16 am UTC
 
 
Hi,
I've got something like
create table tsreport_date ( report_date date);
insert into tsreport_date values(to_date('09-MAY-2008 17:52:22','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('10-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('11-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('12-MAY-2008 10:00:06','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('13-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('14-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('15-MAY-2008 10:00:05','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('16-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('17-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('18-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('19-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('20-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('21-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('22-MAY-2008 10:00:04','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('23-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('23-MAY-2008 14:46:27','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('24-MAY-2008 10:00:02','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('25-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('26-MAY-2008 10:00:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('27-MAY-2008 10:00:03','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('01-JUN-2008 11:03:16','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('02-JUN-2008 11:03:21','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('03-JUN-2008 11:03:23','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('26-JUN-2008 11:05:01','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('01-JUL-2008 11:05:10','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('02-JUL-2008 11:16:25','DD-MON-YYYY HH24:MI:SS'));
insert into tsreport_date values(to_date('03-JUL-2008 11:16:27','DD-MON-YYYY HH24:MI:SS'));
select to_char(report_date,'MON-YYYY'), min(report_date), max(report_date)
  from tsreport_date
 group by to_char(report_Date,'MON-YYYY')
/
TO_CHAR( MIN(REPORT_DATE)     MAX(REPORT_DATE)
-------- -------------------- --------------------
JUL-2008 01-JUL-2008 11:05:10 03-JUL-2008 11:16:27
MAY-2008 09-MAY-2008 17:52:22 27-MAY-2008 10:00:03
JUN-2008 01-JUN-2008 11:03:16 26-JUN-2008 11:05:01
3 rows selected.
To get the min and max dates of a month, is it possible to get the result using analytics?
If yes which one of them would be the best?
Regards,
Shivdeep 
May       27, 2008 - 8:32 am UTC 
 
ops$tkyte%ORA10GR2> select to_char(trunc(report_date,'mm'),'MON-YYYY'), min(report_date), max(report_date)
  2    from tsreport_date
  3   group by trunc(report_Date,'mm')
  4   order by trunc(report_date,'mm')
  5  /
TO_CHAR( MIN(REPORT_DATE)     MAX(REPORT_DATE)
-------- -------------------- --------------------
MAY-2008 09-may-2008 17:52:22 27-may-2008 10:00:03
JUN-2008 01-jun-2008 11:03:16 26-jun-2008 11:05:01
JUL-2008 01-jul-2008 11:05:10 03-jul-2008 11:16:27
that would be the more efficient way - do not convert a date into a string for operations on it - use TRUNC whenever possible.  
Analytics would not be appropriate here - you want to aggregate - you only want three records output - analytics do not "aggregate", aggregation does.
trunc(date,'mm') is a lot less cpu intensive than to_char(date,'mon-yyyy') - does the same thing and is SORTABLE (mon-yyyy isn't sortable at all, you'd have to to_date that string to make it a date again in order to sort!!!)
use to_char on a date only for formatting in a report - not as a way to truncate. 
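For comparison, a portable sketch of the same aggregation: SQLite has no TRUNC(date,'mm'), so strftime('%Y-%m', ...) plays the month-truncation role here, and the ISO 'YYYY-MM' key sorts chronologically as a string (in Oracle, prefer TRUNC, as discussed, because it stays a DATE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tsreport_date (report_date TEXT);
INSERT INTO tsreport_date VALUES
 ('2008-05-09 17:52:22'), ('2008-05-27 10:00:03'),
 ('2008-06-01 11:03:16'), ('2008-06-26 11:05:01'),
 ('2008-07-01 11:05:10'), ('2008-07-03 11:16:27');
""")
rows = conn.execute("""
SELECT strftime('%Y-%m', report_date) AS month,   -- month-level grouping key
       MIN(report_date), MAX(report_date)
  FROM tsreport_date
 GROUP BY month
 ORDER BY month
""").fetchall()
print(rows)  # months come out in chronological order: 2008-05, 2008-06, 2008-07
```

Either way, this is aggregation, not analytics: one result row per month, not one per input row.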
 
 
 
Min And Max of date ranges
Shivdeep Modi, May       27, 2008 - 11:30 am UTC
 
 
Thanks for the advice to use trunc. I was not aware that to_char was not the best option.
Regards,
Shivdeep 
 
Vaibhav, May       28, 2008 - 12:17 am UTC
 
 
Hey Tom,
I found both the books,
Expert One-on-One Oracle and Effective Oracle by Design.
They will arrive in 10 days.
I am very excited 
 
Karteek, May       28, 2008 - 9:51 am UTC
 
 
Tom,
col1 is a grouping column, col2 is actually the result of the row_number() analytic function on an inner query (not shown), and col3 is something like a flag that tells about col2.
create table test(col1 varchar2(2), col2 number(5), col3 number(5));
insert into test values('A', 1, NULL);
insert into test values('A', 2, 2);
insert into test values('A', 3, 3);
insert into test values('A', 4, 4);
insert into test values('A', 5, NULL);
insert into test values('A', 6, 6);
insert into test values('A', 7, NULL);
insert into test values('A', 8, 8);
insert into test values('A', 9, 9);
A 1 
A 2 2
A 3 3
A 4 4
A 5
A 6 6
A 7
A 8 8
A 9 9
Output should be like...
A 2
A 8
The condition here is: I need to consider only the first occurrence of each sequence of at least 2 consecutive numbers (2,3,4 is the first such run; 8,9 the second). Hope I am clear!
Thanks Tom!
 
May       28, 2008 - 10:11 am UTC 
 
ops$tkyte%ORA10GR2> select *
  2    from (
  3  select col1, col2, col3,
  4         lag(col3) over (partition by col1 order by col2) last_col3,
  5         lead(col3) over (partition by col1 order by col2) next_col3
  6    from test
  7         )
  8   where last_col3 is null and next_col3 = col3+1
  9  /
CO       COL2       COL3  LAST_COL3  NEXT_COL3
-- ---------- ---------- ---------- ----------
A           2          2                     3
A           8          8                     9
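The lag/lead pattern above is standard SQL and runs unchanged in any engine with window functions; a self-contained sketch in SQLite via Python reproducing the 2-and-8 result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (col1 TEXT, col2 INTEGER, col3 INTEGER);
INSERT INTO test VALUES
 ('A',1,NULL),('A',2,2),('A',3,3),('A',4,4),('A',5,NULL),
 ('A',6,6),('A',7,NULL),('A',8,8),('A',9,9);
""")
rows = conn.execute("""
SELECT col1, col2
  FROM (SELECT col1, col2, col3,
               LAG(col3)  OVER (PARTITION BY col1 ORDER BY col2) AS last_col3,
               LEAD(col3) OVER (PARTITION BY col1 ORDER BY col2) AS next_col3
          FROM test)
 -- keep the first row of a run: nothing flagged before it,
 -- and the next flagged value is consecutive
 WHERE last_col3 IS NULL AND next_col3 = col3 + 1
 ORDER BY col2
""").fetchall()
print(rows)  # [('A', 2), ('A', 8)]
```

Note how the lone value 6 is excluded naturally: its LEAD(col3) is NULL, so the `next_col3 = col3 + 1` predicate fails.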
 
 
 
 
Analytics
Shivdeep Modi, June      09, 2008 - 12:34 pm UTC
 
 
Hi,
I need to formulate drop statement for the package/package bodies.
I've got something like:
select owner,object_name,object_type
 from dba_objects
where object_type like 'PACKAGE%'
  and owner = 'NCLDBA'
order by object_name
/
Owner           Object Name                Object Type
--------------- -------------------------- ---------------
NCLDBA          TEST                       PACKAGE BODY
NCLDBA          TEST                       PACKAGE
NCLDBA          TEST2                      PACKAGE
NCLDBA          TEST3                      PACKAGE BODY
4 rows selected.
Now I want to formulate drop statements for the above.
The result set should be:
drop package test;
drop package test2;
drop package body test3;
I think this can be done using analytic functions. However, I get the same rank for the package/package body combination. What is the mistake here?
select 'drop '||object_type||' '||owner||'.'||object_name||';' Statement,
       object_name,
       object_type,
       rank() over (partition by object_name order by object_name) as rank
  from dba_objects
 where object_type like 'PACKAGE%'
   and owner = 'NCLDBA'
 order by object_name
/
STATEMENT                        Object Name  Object Type    RANK
-------------------------------- ------------ ------------- -----
drop PACKAGE BODY NCLDBA.TEST;   TEST         PACKAGE BODY      1
drop PACKAGE NCLDBA.TEST;        TEST         PACKAGE           1
drop PACKAGE NCLDBA.TEST2;       TEST2        PACKAGE           1
drop PACKAGE BODY NCLDBA.TEST3;  TEST3        PACKAGE BODY      1
 
June      09, 2008 - 2:19 pm UTC 
 
ops$tkyte%ORA10GR2> select * from (
  2  select 'drop '||object_type||' '||owner||'.'||object_name||';' Statement,
  3         object_name,
  4         object_type,
  5         rank() over (partition by object_name order by object_type) as rank
  6    from dba_objects
  7   where object_type like 'PACKAGE%'
  8     and owner = user
  9     ) where rank = 1
 10   order by object_name
 11  /
STATEMENT                           OBJECT_NAME                    OBJECT_TYPE               RANK
----------------------------------- ------------------------------ ------------------- ----------
drop PACKAGE OPS$TKYTE.TEST;        TEST                           PACKAGE                      1
drop PACKAGE OPS$TKYTE.TEST2;       TEST2                          PACKAGE                      1
drop PACKAGE BODY OPS$TKYTE.TEST3;  TEST3                          PACKAGE BODY                 1
In an analytic, if what you order by isn't UNIQUE within the partition, neither row is "first" - they are considered the same, a tie - so order by object_type to make the ordering unique. 
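The tie behaviour is easy to see in isolation. A small sketch (SQLite via Python, with a simplified two-column stand-in for dba_objects) computes both the tied ranking (ORDER BY the partition key itself, a constant within each partition) and the corrected one (ORDER BY object_type):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objs (object_name TEXT, object_type TEXT);
INSERT INTO objs VALUES
 ('TEST','PACKAGE BODY'), ('TEST','PACKAGE'),
 ('TEST2','PACKAGE'), ('TEST3','PACKAGE BODY');
""")
rows = conn.execute("""
SELECT object_name, object_type,
       -- ordering by the partition key: every row ties at rank 1
       RANK() OVER (PARTITION BY object_name ORDER BY object_name) AS tie_rank,
       -- ordering by object_type: 'PACKAGE' ranks ahead of 'PACKAGE BODY'
       RANK() OVER (PARTITION BY object_name ORDER BY object_type) AS real_rank
  FROM objs
 ORDER BY object_name, object_type
""").fetchall()
print(rows)
```

Filtering the second ranking with `WHERE real_rank = 1` keeps one row per name, preferring the spec over the body, which is exactly Tom's fix.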
 
 
 
Analytics
Shivdeep Modi, June      10, 2008 - 4:31 am UTC
 
 
Thanks for the solution.
Regards,
Shivdeep 
 
Is there a way to do this using Analytics ?
Rajesh Srivastava, September 17, 2008 - 11:55 pm UTC
 
 
Hi Tom,
I have a Table with the following structure
Table : TAB1
---------------------------------------------------------
Client   HQ-Country     Engagement-Country     Revenue
---------------------------------------------------------
C1       US             UK                     100
C1       US             US                     200  
C1       US             Japan                  50
C2       US             India                  100
C2       US             Japan                  50
-------------------------------------------------------
What we want to do is to select all records for those clients where NO record exists with HQ_COUNTRY = ENGAGEMENT_COUNTRY. So the three records of C1 should not be selected and the two records for C2 should be selected.
I can think of :
select * from TAB1
where client_id NOT IN
(select distinct client_id from TAB1
 where HQ_COUNTRY = ENGAGEMENT_COUNTRY)
but it looks clunky and have a feeling will perform badly
(TAB1 is large and is a complex view)
Since the query is about checking a condition within a partition, I think analytics with the PARTITION clause should have a magic bullet - but I can't come up with it...
Any pointers would be much appreciated
 
September 18, 2008 - 7:58 am UTC 
 
there are no silver bullets.
this can be done with analytics, but if tab1 is a large complex view that takes a long time to execute - well - it'll still take a long time to execute
select * 
  from (select tab1.*, 
               count( case when hq_country = engagement_country then 1 end)
               over (partition by client_id) cnt
          from tab1 )
 where cnt = 0; 
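The conditional COUNT(CASE ...) OVER (PARTITION BY ...) counts offending rows per client while keeping every detail row, and the outer filter then keeps the clients where that count is zero. A runnable sketch of the question's sample data, in SQLite via Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab1 (client_id TEXT, hq_country TEXT,
                   engagement_country TEXT, revenue INTEGER);
INSERT INTO tab1 VALUES
 ('C1','US','UK',100), ('C1','US','US',200), ('C1','US','Japan',50),
 ('C2','US','India',100), ('C2','US','Japan',50);
""")
rows = conn.execute("""
SELECT client_id, engagement_country, revenue
  FROM (SELECT tab1.*,
               -- per-client count of rows where HQ = engagement country
               COUNT(CASE WHEN hq_country = engagement_country THEN 1 END)
                 OVER (PARTITION BY client_id) AS cnt
          FROM tab1)
 WHERE cnt = 0
 ORDER BY client_id, engagement_country
""").fetchall()
print(rows)  # only C2's rows survive; C1 has a US/US row, so cnt > 0
```

Compared with the NOT IN formulation, the source (here the large, complex view in the question) is scanned once rather than twice.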
 
 
Above logic did not work for me..
Ashish, September 25, 2008 - 11:10 am UTC
 
 
Hi Tom,
  Thank you for your time! The above query did not work for me; please help!
SELECT pc_no,av_cd,jcd,rga,rgm,
          from (
                select pc_no,av_cd,substr(jct,1,5)||'00' jcd,rga,rgm,
                       count (case when (nvl(rga,0)>0 OR nvl(rgm,0)>0) then 1 end)
                       over (partition by av_cd, pc_no, SUBSTR(jct,1,5)||'00') cnt
                  from ca_cd ca 
                 where av_cd = 'T2'
               )
         where cnt=1
           and avs = '0000422' 
September 25, 2008 - 3:38 pm UTC 
 
well, short of giving you another random piece of logic - how about you actually sit down and tell us exactly what it is you are trying to do??????????
Seriously - look at this for a second and ask yourself "how could any reasonable person read what I wrote and know what I need, I didn't tell them what I needed - I only told them that the answer to someone else's question does not work for me"
Basically, all you have done is ruled out one solution from an infinite set of all solutions to all problems.  We could be here a while throwing out logic/ algorithms and never get to what you need.
So, it will go a lot better if.... you stated precisely what you are trying to do
and remember - create table, insert into - a test case - would be necessary if you want sql that actually compiles. 
 
 
Lag/Lead Windowing...
J, September 25, 2008 - 2:02 pm UTC
 
 
Hello Tom,
Here's a situation that we find ourselves up against.  We're hopelessly stuck and are hoping you can advise us on how best to proceed (thanks!)
The setup:
==========
create table jt
  (cust_cd   varchar2(20),
   visit_id1 number(20),
   visit_id2 number(8,2),
   visit_dt  date,
   c_flag    varchar2(1),
   l_flag    varchar2(1));
insert into jt values ('P1',1,60961,to_date('11/27/2007','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',2,60964.02,to_date('11/30/2007','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',3,61055,to_date('02/29/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',4,61055.01,to_date('02/29/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',5,61072,to_date('03/03/2008','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',6,61100,to_date('04/14/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',7,61121,to_date('05/15/2008','MM/DD/YYYY'),'Y','N');
insert into jt values ('P1',8,61163,to_date('06/16/2008','MM/DD/YYYY'),'Y','Y');
insert into jt values ('P1',9,61163.01,to_date('06/16/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P1',10,61163.02,to_date('06/16/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P2',11,61105,to_date('03/18/2008','MM/DD/YYYY'),'N','Y');
insert into jt values ('P2',12,62100,to_date('03/25/2008','MM/DD/YYYY'),'Y','N');
commit;
========
Here's the objective: For each Customer's "C" visit (C_Flag = 'Y'), find the *nearest* "L" visit(s) (L_Flag = 'Y') by Visit_Dt - backwards two weeks or forwards two weeks in time and associate the Visit_Id1's.
========
So the "C" visits are the pivot points. The sorting order is: Cust_Cd, Visit_Dt, Visit_Id2.   Just to make it more complicated, in the sample data, Visit_Id1 # 8 is a C-Visit.  It's also an L-Visit.  And... Visits 9 and 10 are on the same day as #9.  The number of associated L-Visits for a given C-Visit is three.
The resulting set should look like:
Cust ID  C-Visit ID  L-Visit ID1  L-Visit ID2  L-Visit ID3
-------  ----------  -----------  -----------  -----------
P1       1           2            (null)       (null)
P1       3           3            4            (null)
P1       5           3            4            (null)
P1       6           6            (null)       (null)
P1       7           (null)       (null)       (null)
P1       8           8            9            10
P2       12          11           (null)       (null)
Can this be answered through some nifty analytics?
Three of us have been wrestling with this all week long and can't figure out an approach - Any assistance or advice would be most appreciated.
Best Regards,
- J 
September 25, 2008 - 4:35 pm UTC 
 
here was my thought process on this:
ops$tkyte%ORA10GR2> select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
  2         decode( l_flag, 'Y', visit_dt ) curr_l,
  3         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
  4         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
  5    from jt;
CUST_CD               VISIT_ID1  VISIT_ID2 VISIT_DT  C L CURR_L    LAST_L    NEXT_L
-------------------- ---------- ---------- --------- - - --------- --------- ---------
P1                            1      60961 27-NOV-07 Y N                     30-NOV-07
P1                            2   60964.02 30-NOV-07 N Y 30-NOV-07 30-NOV-07 30-NOV-07
P1                            3      61055 29-FEB-08 Y Y 29-FEB-08 29-FEB-08 29-FEB-08
P1                            4   61055.01 29-FEB-08 N Y 29-FEB-08 29-FEB-08 29-FEB-08
P1                            5      61072 03-MAR-08 Y N           29-FEB-08 14-APR-08
P1                            6      61100 14-APR-08 Y Y 14-APR-08 14-APR-08 14-APR-08
P1                            7      61121 15-MAY-08 Y N           14-APR-08 16-JUN-08
P1                            8      61163 16-JUN-08 Y Y 16-JUN-08 16-JUN-08 16-JUN-08
P1                            9   61163.01 16-JUN-08 N Y 16-JUN-08 16-JUN-08 16-JUN-08
P1                           10   61163.02 16-JUN-08 N Y 16-JUN-08 16-JUN-08 16-JUN-08
P2                           11      61105 18-MAR-08 N Y 18-MAR-08 18-MAR-08 18-MAR-08
P2                           12      62100 25-MAR-08 Y N           18-MAR-08
12 rows selected.
Get the current row L-flag date if it exists, get the immediate prior one and get the immediate next one - one of those three (if there) will be the l-date of interest....
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select cust_cd, visit_id1, visit_id2, visit_dt, curr_l, last_l, next_l,
  2         case when curr_l is not null then curr_l
  3                  when last_l is not null and next_l is null and visit_dt-last_l <= 14 then last_l
  4                          when last_l is null and next_l is not null and next_l-visit_dt <= 14 then next_l
  5                          else decode( least( (visit_dt-last_l), (next_l-visit_dt), 14),
  6                                       (visit_dt-last_l), last_l,
  7                                                   (next_l-visit_dt), next_l )
  8                  end interesting_l
  9    from (
 10  select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
 11         decode( l_flag, 'Y', visit_dt ) curr_l,
 12         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
 13         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
 14    from jt
 15         )
 16   where c_flag = 'Y'
 17  /
CUST_CD               VISIT_ID1  VISIT_ID2 VISIT_DT  CURR_L    LAST_L    NEXT_L    INTERESTI
-------------------- ---------- ---------- --------- --------- --------- --------- ---------
P1                            1      60961 27-NOV-07                     30-NOV-07 30-NOV-07
P1                            3      61055 29-FEB-08 29-FEB-08 29-FEB-08 29-FEB-08 29-FEB-08
P1                            5      61072 03-MAR-08           29-FEB-08 14-APR-08 29-FEB-08
P1                            6      61100 14-APR-08 14-APR-08 14-APR-08 14-APR-08 14-APR-08
P1                            7      61121 15-MAY-08           14-APR-08 16-JUN-08
P1                            8      61163 16-JUN-08 16-JUN-08 16-JUN-08 16-JUN-08 16-JUN-08
P2                           12      62100 25-MAR-08           18-MAR-08           18-MAR-08
7 rows selected.
So, we can take those three dates and for the C records - find the interesting l date....
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> with A
  2  as
  3  (
  4  select cust_cd, visit_id1, visit_id2, visit_dt, curr_l, last_l, next_l,
  5         case when curr_l is not null then curr_l
  6                  when last_l is not null and next_l is null and visit_dt-last_l <= 14 then last_l
  7                          when last_l is null and next_l is not null and next_l-visit_dt <= 14 then next_l
  8                          else decode( least( (visit_dt-last_l), (next_l-visit_dt), 14),
  9                                       (visit_dt-last_l), last_l,
 10                                                   (next_l-visit_dt), next_l )
 11                  end interesting_l
 12    from (
 13  select cust_cd, visit_id1, visit_id2, visit_dt, c_flag, l_flag,
 14         decode( l_flag, 'Y', visit_dt ) curr_l,
 15         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt ) last_l,
 16         last_value( case when l_flag = 'Y' then visit_dt end ignore nulls) over ( partition by cust_cd order by visit_dt DESC ) next_l
 17    from jt
 18         )
 19   where c_flag = 'Y'
 20  ),
 21  B
 22  as
 23  (select * from jt where l_flag = 'Y' )
 24  select a.cust_cd, a.visit_dt, a.visit_id1, b.visit_id1
 25    from A left join B
 26      on (a.cust_cd = b.cust_cd and a.interesting_l = b.visit_dt)
 27   order by a.cust_cd, a.visit_dt, a.visit_id1
 28  /
CUST_CD              VISIT_DT   VISIT_ID1  VISIT_ID1
-------------------- --------- ---------- ----------
P1                   27-NOV-07          1          2
P1                   29-FEB-08          3          3
P1                   29-FEB-08          3          4
P1                   03-MAR-08          5          3
P1                   03-MAR-08          5          4
P1                   14-APR-08          6          6
P1                   15-MAY-08          7
P1                   16-JUN-08          8          8
P1                   16-JUN-08          8          9
P1                   16-JUN-08          8         10
P2                   25-MAR-08         12         11
11 rows selected.
and then outer join back to pick up the records of interest.... You could pivot this if you want - but basically this is your set of data. 
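Oracle's IGNORE NULLS has no direct SQLite equivalent, but because the window is ordered by visit_dt, the "last L date so far" is simply a running MAX of the L dates and the "next L date" a running MIN looking forward. A sketch of that equivalence against an abbreviated copy of the jt data (dates stored as ISO strings so they sort correctly):

```python
import sqlite3

# Emulate last_value(... ignore nulls) over (order by visit_dt) with a
# running MAX, and the DESC variant with a forward-looking running MIN.
con = sqlite3.connect(":memory:")
con.executescript("""
create table jt (cust_cd text, visit_id1 int, visit_dt text,
                 c_flag text, l_flag text);
insert into jt values
 ('P1',1,'2007-11-27','Y','N'), ('P1',2,'2007-11-30','N','Y'),
 ('P1',5,'2008-03-03','Y','N'), ('P1',6,'2008-04-14','Y','Y');
""")
rows = con.execute("""
select visit_id1,
       max(case when l_flag='Y' then visit_dt end)
         over (partition by cust_cd order by visit_dt
               rows between unbounded preceding and current row) last_l,
       min(case when l_flag='Y' then visit_dt end)
         over (partition by cust_cd order by visit_dt
               rows between current row and unbounded following) next_l
  from jt
 order by visit_id1
""").fetchall()
print(rows)
```

The running MAX/MIN trick only works because the column being carried (the date) is the same column driving the window's sort order.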
 
 
RE: Lag/Lead Windowing
J, September 25, 2008 - 6:25 pm UTC
 
 
Thank you Tom!
This is a *completely* different approach from what we were trying.  We will dissect this to try to tease out the nuances of your "how to apply analytics" mindset.
All the best,
- J 
 
Analytics
Ashish, October   07, 2008 - 6:05 am UTC
 
 
Tom,
  First of all, I apologize for not providing 100% of the input needed for you to help me resolve the problem.
Here I go...Code is:
SELECT pc_no,av_cd,jcd,rga,rgm
      from ( 
           select pc_no,av_cd,avs,substr(jct,1,5)||'00' jcd,rga,rgm, 
               count (case when (nvl(rga,0)>0 OR nvl(rgm,0)>0) then 1 end) 
               over (partition by av_cd, pc_no, SUBSTR(jct,1,5)||'00') cnt 
             from ca_cd ca 
           where av_cd = 'T2' 
           ) 
       where cnt=1 
       and avs = '0000424'
If I run the above query I get the output below:
pc_no av_cd jcd rga rgm
0000424   T2 7166500   0   0
0000424   T2 7166500   6   6
0000424   T2 7166500   0   0
However, I only want the (rga>0 or rgm>0) data, i.e. only one row out of the above 3 rows from the table.
Please help!
Thanks, 
October   08, 2008 - 9:13 pm UTC 
 
add a where clause?
where cnt = 1 and (rga >0 or rgm>0) 
 
 
what about your book on Analytical functions
Vijay'S, April     15, 2009 - 6:49 am UTC
 
 
Dear Tom,
it's been a long time since you said you were thinking of writing a book on analytical functions - have you started on it already, or is it going to be announced soon? I know I am too optimistic here, but after reading a few more of your posts on analytics I can't wait to have your book on this subject to understand them better. 
Please make this happen it will be great help for everyone as your other books have helped a lot.
Regards,
Vijay'S 
 
can this be done in analytics -or single sql
A reader, July      28, 2009 - 9:40 am UTC
 
 
Hi Tom,
I have a table like below and the sample data as given below.
drop table T ;
create table T ( ordernumber number,stockid number, side char(1),quantity number,price number) ;
sample data :
insert into t values (1, 1, 'S', 100, 50.00) ;
insert into t values (2, 5, 'B',  25, 12.50) ;
insert into t values (3, 1, 'S', 100, 50.00) ;
insert into t values (4, 1, 'B', 150, 51.00) ;
insert into t values (5, 1, 'B',  50, 49.00) ;
insert into t values (6, 8, 'B', 100, 16.75) ;
insert into t values (7, 5, 'S', 50, 12.25) ;
insert into t values (8, 8, 'S', 100, 17.00) ; 
my sql should be able to give me output for all the possible records where I can sell a stock to someone who wants to buy.  the ordernumber is a system generated number and each row is typically one order.  the sell and buy match off should have 2 restrictions 
the buy price must be >= the sell price.
and the stockid should be the same for the sell and buy records.
here "sell" records are the ones with side = 'S' and buy records are the ones with side = 'B' ... so the expected output of the sql should be
stockId  sell_order buy_order sell_qty buy_quty 
1          1        4         100     100
1          3        4         50      50
5          7        2         25      25 
in case of order #7 the remaining qty to sell should be updated as part of this process ... so the table should be updated to show that order number 7 still has 25 stocks left, and they can be considered for the next run..
same is true for order #3.
so far I have taken this approach - but have not been able to crack it...
column diff format 99999
select  'T'
       , buy.ordernumber buy_ord
       ,sell.ordernumber sell_ord 
       ,sell.stockid
       ,sell.quantity sell_qty
      -- ,sell.tot_sell_qty
      -- ,buy.tot_buy_qty 
       ,buy.quantity buy_qty
       ,sell.price sell_price
       ,buy.price buy_price
       ,lag(sell.quantity - buy.quantity) over (   order by buy.price desc) diff
from
(select t.* , sum(quantity) over (partition by side,price) tot_sell_qty from t where side ='S' and quantity >0 and price > 0
   order by ordernumber,stockid,price ) sell,
(select t.* ,sum(quantity) over (partition by side,price) tot_buy_qty from t where side ='B' and quantity >0 and price > 0
   order by ordernumber,stockid,price ) buy
where  sell.stockid = buy.stockid and buy.price >= sell.price ;
i would really appreciate any help on this.
 
 
Is analytics better in this case?
Joe, July      29, 2009 - 7:50 pm UTC
 
 
Tom, 
I have become a huge fan of analytics and I find myself trying to use them whenever possible, even if there may be an easier or better way to get the result.  So I figured I had better get your opinion as to whether the fictional example below is better with or without the analytics.
declare
  s1 varchar2(30);
  s2 varchar2(30);
  function get_first_index_loop(t in varchar2) return varchar2 is
    result varchar2(30) := null;
  begin
    for c in (select 1               priority,
                     constraint_name index_name
                from user_constraints
               where table_name = t
                 and constraint_type = 'P'
              union
              select 2 priority,
                     index_name
                from user_indexes
               where table_name = t
               order by priority)
    loop
      result := c.index_name;
      exit;
    end loop;
    return result;
  end;
  function get_first_index_analytics(t in varchar2) return varchar2 is
    result varchar2(30) := null;
  begin
    select distinct first_value(index_name) over(order by priority)
      into result
      from (select 1 priority,
                   constraint_name index_name
              from user_constraints
             where constraint_type = 'P'
               and table_name = t
            union
            select 2 priority,
                   index_name
              from user_indexes
             where table_name = t
             order by priority);
  
    return result;
  
  exception
    when no_data_found then
      return null;
  end;
begin
  for t in (select table_name
              from user_tables
             order by table_name)
  loop
    s1 := get_first_index_loop(t.table_name);
    s2 := get_first_index_analytics(t.table_name);
  end loop;
end;
I know that "if it can be done in SQL then do it, else use as little pl/sql as possible".  But in the above example there are other factors like the exception handler that could affect performance.
Is one way better than the other (performance, etc.) and if not, could you share your opinion as to which way you prefer?
Thanks,
Joe 
August    03, 2009 - 5:10 pm UTC 
 
I would not use procedural code for this - no.  You shouldn't have any code at all - forget comparing "procedural code - no analytics" vs "procedural code with analytics"
There should never have been a function - ever.
ops$tkyte%ORA10GR2> select table_name,
  2             nvl(  ( select constraint_name index_name
  3                   from user_constraints
  4                  where constraint_type = 'P'
  5                    and table_name = t.table_name
  6                    and rownum = 1 ),
  7                ( select index_name
  8                    from user_indexes
  9                   where table_name = t.table_name
 10                     and rownum = 1 ) ) iname
 11    from user_tables t
 12   order by table_name
 13  /
TABLE_NAME                     INAME
------------------------------ ------------------------------
C
CASE                           CASE_SEARCH1
DEPT                           DEPT_PK
DOCUMENTS                      T_IOT_PK
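The shape of this no-PL/SQL approach can be sketched portably. In this SQLite version, COALESCE stands in for NVL and LIMIT 1 for `rownum = 1`; the three tables are tiny invented stand-ins for the Oracle dictionary views, not the real ones:

```python
import sqlite3

# For each table: prefer the primary-key constraint name, else fall back
# to any index name, else NULL -- two scalar subqueries, no function needed.
con = sqlite3.connect(":memory:")
con.executescript("""
create table user_tables (table_name text);
create table user_constraints (table_name text, constraint_name text,
                               constraint_type text);
create table user_indexes (table_name text, index_name text);
insert into user_tables values ('DEPT'), ('DOCS'), ('C');
insert into user_constraints values ('DEPT','DEPT_PK','P');
insert into user_indexes values ('DOCS','DOCS_IDX');
""")
rows = con.execute("""
select table_name,
       coalesce( (select constraint_name from user_constraints
                   where constraint_type = 'P'
                     and table_name = t.table_name limit 1),
                 (select index_name from user_indexes
                   where table_name = t.table_name limit 1) ) iname
  from user_tables t
 order by table_name
""").fetchall()
print(rows)
```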
 
 
 
more details on question asked on 28th Jul 2009
A reader, July      31, 2009 - 1:04 am UTC
 
 
Hi Tom,
Please consider these details as more accurate.
Hi Tom,
I have a table like below and the sample data as given below.
drop table T ;
create table T ( ordernumber number,stockid number, side char(1),quantity number,price number) ;
sample data :
insert into t values (1, 1, 'S', 100, 50.00) ;
insert into t values (2, 5, 'B',  25, 12.50) ;
insert into t values (3, 1, 'S', 100, 50.00) ;
insert into t values (4, 1, 'B', 150, 51.00) ;
insert into t values (5, 1, 'B',  50, 49.00) ;
insert into t values (6, 8, 'B', 100, 16.75) ;
insert into t values (7, 5, 'S', 50, 12.25) ;
insert into t values (8, 8, 'S', 100, 17.00) ; 
my sql should be able to give me output for all the possible records where I can sell a stock to 
someone who wants to buy.  the ordernumber is a system generated number and each row is typically one 
order.  the sell and buy match off should have 3 restrictions 
1) the buy price must be >= the sell price.
2) the stockid should be the same for the sell and buy records - that is, a sell/buy match can only be done for the same stockid; we cannot sell stockid 1 to someone who wants to buy stockid 2, even if he has the required pricing etc.
3) for the same stockid, if there are two or more buy orders then the one with the maximum buy price should buy first.
if there are two or more buyers with the same price then the allotment should happen by ordernumber.
here "sell" records are the one for which side = 'S' and buy records are the one with side = 'B' 
...so the expected output of the sql should be
stockId  sell_order buy_order sell_qty buy_quty 
1          1        4         100     100
1          3        4         50      50
5          7        2         25      25 
in case of order #7 the remaining qty to sell should be updated as part of this process ... so the 
table should be updated to show that order number 7 still has 25 stocks left, and they can be 
considered for the next run..
same is true for order #3.
so far I have taken this approach - but have not been able to crack it...
column diff format 99999
select  'T'
       , buy.ordernumber buy_ord
       ,sell.ordernumber sell_ord 
       ,sell.stockid
       ,sell.quantity sell_qty
      -- ,sell.tot_sell_qty
      -- ,buy.tot_buy_qty 
       ,buy.quantity buy_qty
       ,sell.price sell_price
       ,buy.price buy_price
       ,lag(sell.tot_sell_qty - buy.quantity) over ( partition by sell.stockid order by buy.price desc) diff
from
(select t.* , sum(quantity) over (partition by stockid, side,price) tot_sell_qty from t where side ='S' and quantity >0 and price > 0
   order by ordernumber,stockid,price ) sell,
(select t.* ,sum(quantity) over (partition by stockid, side,price) tot_buy_qty from t where side ='B' and quantity >0 and price > 0
   order by ordernumber,stockid,price ) buy
where  sell.stockid = buy.stockid and buy.price >= sell.price ; 
really appreciate any help on this.
Thanks  
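This question never got a SQL answer in the thread. Purely as an illustration of the three matching rules stated above, here is a procedural sketch in Python; the names and data layout are assumptions, and a real matching engine would need far more care (partial fills across runs, ties, persistence):

```python
from collections import defaultdict

# (ordernumber, stockid, side, quantity, price) -- the sample data above
orders = [
    (1, 1, 'S', 100, 50.00), (2, 5, 'B',  25, 12.50),
    (3, 1, 'S', 100, 50.00), (4, 1, 'B', 150, 51.00),
    (5, 1, 'B',  50, 49.00), (6, 8, 'B', 100, 16.75),
    (7, 5, 'S',  50, 12.25), (8, 8, 'S', 100, 17.00),
]

def match(orders):
    sells, buys = defaultdict(list), defaultdict(list)
    for o in orders:
        (sells if o[2] == 'S' else buys)[o[1]].append(list(o))
    fills = []
    for stock in sells:                       # rule 2: same stockid only
        # rule 3: buys served by highest price, ties broken by ordernumber
        bb = sorted(buys.get(stock, []), key=lambda b: (-b[4], b[0]))
        for s in sorted(sells[stock], key=lambda s: s[0]):
            for b in bb:
                if s[3] == 0:
                    break
                if b[3] > 0 and b[4] >= s[4]:  # rule 1: buy price >= sell price
                    qty = min(s[3], b[3])
                    fills.append((stock, s[0], b[0], qty))
                    s[3] -= qty                # leftover qty stays for next run
                    b[3] -= qty
    return fills

fills = match(orders)
print(fills)
```

Running this against the sample data reproduces the expected output rows: sell 1 and sell 3 filled by buy 4, and sell 7 partially filled by buy 2, with 25 left on order 7.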
 
How to give group id  for the particular group of rows
Suji, January   27, 2010 - 3:20 pm UTC
 
 
Ex:
CREATE TABLE TAB_A(A_ID NUMBER, A_DT DATE, A_TTL VARCHAR2(20));
BEGIN
INSERT INTO TAB_A VALUES(1,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(2,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(3,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABCD');
INSERT INTO TAB_A VALUES(4,TO_DATE('01/01/2009','MM/DD/YYYY'),'ABCD');
INSERT INTO TAB_A VALUES(5,TO_DATE('02/01/2009','MM/DD/YYYY'),'ABC');
INSERT INTO TAB_A VALUES(6,TO_DATE('02/01/2009','MM/DD/YYYY'),'ABCDE');
INSERT INTO TAB_A VALUES(6,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
INSERT INTO TAB_A VALUES(7,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
INSERT INTO TAB_A VALUES(8,TO_DATE('03/01/2009','MM/DD/YYYY'),'ABCDEF');
END;
SELECT DISTINCT A.A_ID, A.A_DT, A.A_TTL
FROM   TAB_A A
WHERE   (A.A_DT, A.A_TTL) IN
(SELECT A.A_DT, A.A_TTL
FROM   TAB_A A 
GROUP BY A.A_DT, A.A_TTL
HAVING COUNT(DISTINCT A.A_ID) > 1)
ORDER BY A.A_ID  
Answer is 
A_ID        A_DT   A_TTL
1  1/1/2009 ABC
2  1/1/2009 ABC
3  1/1/2009 ABCD
4  1/1/2009 ABCD
6  3/1/2009 ABCDEF
7  3/1/2009 ABCDEF
8  3/1/2009 ABCDEF
I just want a 4th column which generates a unique number for each group of A_DT and A_TTL values, like 
all the 1/1/2009 and ABC rows should be displayed as 1
all the 1/1/2009 and ABCD rows should be 2
all the 3/1/2009 and ABCDEF rows should be 3 
etc..........
How do I generate this query?
Thanks in advance
 
January   29, 2010 - 3:23 pm UTC 
 
ops$tkyte%ORA11GR2> select a.*, dense_rank() over (order by a_ttl)
  2  FROM   TAB_A A
  3  WHERE   (A.A_DT, A.A_TTL) IN
  4  (SELECT A.A_DT, A.A_TTL
  5  FROM   TAB_A A
  6  GROUP BY A.A_DT, A.A_TTL
  7  HAVING COUNT(DISTINCT A.A_ID) > 1)
  8  ORDER BY A.A_ID
  9  /
      A_ID A_DT      A_TTL                DENSE_RANK()OVER(ORDERBYA_TTL)
---------- --------- -------------------- ------------------------------
         1 01-JAN-09 ABC                                               1
         2 01-JAN-09 ABC                                               1
         3 01-JAN-09 ABCD                                              2
         4 01-JAN-09 ABCD                                              2
         6 01-MAR-09 ABCDEF                                            3
         7 01-MAR-09 ABCDEF                                            3
         8 01-MAR-09 ABCDEF                                            3
7 rows selected.
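The same dense_rank() idea runs unchanged in SQLite. This sketch orders the ranking by both a_dt and a_ttl so that groups sharing a title on different dates stay distinct (with this particular data, ordering by a_ttl alone happens to give the same group numbers):

```python
import sqlite3

# Number each (a_dt, a_ttl) group with dense_rank(), keeping only groups
# that contain more than one distinct a_id.
con = sqlite3.connect(":memory:")
con.executescript("""
create table tab_a (a_id int, a_dt text, a_ttl text);
insert into tab_a values
 (1,'2009-01-01','ABC'),   (2,'2009-01-01','ABC'),
 (3,'2009-01-01','ABCD'),  (4,'2009-01-01','ABCD'),
 (5,'2009-02-01','ABC'),
 (6,'2009-03-01','ABCDEF'),(7,'2009-03-01','ABCDEF'),
 (8,'2009-03-01','ABCDEF');
""")
rows = con.execute("""
select a_id, dense_rank() over (order by a_dt, a_ttl) grp
  from tab_a a
 where (a.a_dt, a.a_ttl) in
       (select a_dt, a_ttl from tab_a
         group by a_dt, a_ttl
        having count(distinct a_id) > 1)
 order by a_id
""").fetchall()
print(rows)
```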
 
 
 
 
Excellent
suji, February  01, 2010 - 12:59 pm UTC
 
 
My problem is solved.  Excellent.  Thank you very much.
Thanks 
 
A reader, February  11, 2010 - 5:58 am UTC
 
 
> create table test (id number, token varchar2(10));
 > insert into test values (1,'ABC');
 > insert into test values (2,'XYZ');
 > insert into test values (3,'DEF');
 > insert into test values (3,'DEF');
 > insert into test values (4,'GHI');
 > insert into test values (5,'RST');
 > insert into test values (6,'JKL');
 > insert into test values (6,'JKL');
 > insert into test values (7,'MNO');
 > insert into test values (8,'PQR');
 > insert into test values (9,'DEF');
 > insert into test values (10,'RST');
 > SELECT * FROM TEST;
        ID TOKEN
---------- ----------
         1 ABC
         2 XYZ
         3 DEF
         3 DEF
         4 GHI
         5 RST
         6 JKL
         6 JKL
         7 MNO
         8 PQR
         9 DEF
        10 RST
I want a query that will return all duplicate Token values unless the duplicates all belong to the same Id. For each duplicate returned the Id should also be displayed. Based on the above data the query should return:
 ID TOKEN
--- ------
 10 RST
  5 RST
  3 DEF
  9 DEF
 
February  16, 2010 - 9:43 am UTC 
 
  1  select distinct *
  2    from (
  3  select id, token, count(distinct id) over (partition by token) cnt
  4    from test
  5         )
  6*  where cnt > 1
ops$tkyte%ORA10GR2> /
        ID TOKEN             CNT
---------- ---------- ----------
        10 RST                 2
         9 DEF                 2
         5 RST                 2
         3 DEF                 2
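One portability note on this answer: SQLite does not allow COUNT(DISTINCT ...) as a window function, but "more than one distinct id per token" is equivalent to min(id) differing from max(id) within the token partition. A sketch of that variant:

```python
import sqlite3

# Tokens spanning more than one id have min(id) <> max(id) per partition.
con = sqlite3.connect(":memory:")
con.executescript("""
create table test (id int, token text);
insert into test values (1,'ABC'),(2,'XYZ'),(3,'DEF'),(3,'DEF'),
 (4,'GHI'),(5,'RST'),(6,'JKL'),(6,'JKL'),(7,'MNO'),(8,'PQR'),
 (9,'DEF'),(10,'RST');
""")
rows = con.execute("""
select distinct id, token
  from (select id, token,
               min(id) over (partition by token) mn,
               max(id) over (partition by token) mx
          from test)
 where mn <> mx
 order by token, id
""").fetchall()
print(rows)
```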
 
 
 
 
Great work Tom
Dominic, February  25, 2010 - 6:27 am UTC
 
 
Tom,
First of all thanks for all the work you've done for the Oracle development community.
create table test1
(object_no number,
 abi_code  varchar2(3),
 priority_over varchar2(3))
/
 
insert into test1 values(1,'G07','Z99');
insert into test1 values(1,'J01','A22');
insert into test1 values(1,'J01','G06');
insert into test1 values(1,'J01','G07');
insert into test1 values(1,'J01','Z99');
insert into test1 values(1,'J01','P01');
insert into test1 values(2,'P01','Z99');
 
commit;
SQL> select *
  2  from test1
  3  /
 
 OBJECT_NO ABI PRI
---------- --- ---
         1 G07 Z99
         1 J01 A22
         1 J01 G06
         1 J01 G07
         1 J01 Z99
         1 J01 P01
         2 P01 Z99
 
7 rows selected.
I need to flag those rows where the abi_code exists in the priority_over column for another abi_code for that object. The end output I would be after would be 
 OBJECT_NO ABI
---------- ---
         1 J01
         2 P01
which I can get with 
select distinct object_no, abi_code
from test1 t
where abi_code not in (select priority_over
                         from test1
                        where object_no = t.object_no)
but this is 2 full table scans so I was wondering if there was an analytic function I could use to flag those rows that I want to exclude
I tried 
select object_no, 
       abi_code, 
       priority_over,
       first_value(priority_over) 
         over (partition by object_no
                   order by decode(priority_over,
                                   abi_code,1,2)) match
  from test1
but that failed miserably.
Can I do this with analytics or am I stuck with the correlated subquery
Thanks in advance
Dom
 
 
March     01, 2010 - 10:15 am UTC 
 
even with the analytics - you would have to make two passes -
pass1: identify
pass2: update
because you cannot update the result of an analytic directly.
You are not stuck with a correlated subquery, you could use not in (select object_no, priority_code ...), but I would suggest this instead:
ops$tkyte%ORA10GR2> exec dbms_stats.set_table_stats( user, 'TEST1', numrows=> 1000000, numblks => 10000);
PL/SQL procedure successfully completed.
ops$tkyte%ORA10GR2> set autotrace traceonly explain
ops$tkyte%ORA10GR2> select object_no, abi_code
  2    from test1 t1
  3   where NOT EXISTS
  4     (select null
  5        from test1 t2
  6       where t2.object_no = t1.object_no
  7         and t2.priority_over = t1.abi_code)
  8  /
Execution Plan
----------------------------------------------------------
Plan hash value: 2822440888
------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |   999K|    30M|       |  1981   (4)| 00:00:15 |
|*  1 |  HASH JOIN ANTI    |       |   999K|    30M|    26M|  1981   (4)| 00:00:15 |
|   2 |   TABLE ACCESS FULL| TEST1 |  1000K|    15M|       |   709   (3)| 00:00:06 |
|   3 |   TABLE ACCESS FULL| TEST1 |  1000K|    15M|       |   711   (4)| 00:00:06 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T2"."OBJECT_NO"="T1"."OBJECT_NO" AND
              "T2"."PRIORITY_OVER"="T1"."ABI_CODE")
ops$tkyte%ORA10GR2> set autotrace off
looking for that nice juicy hash anti-join.   
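The NOT EXISTS anti-join pattern translates directly to other engines; a runnable sketch with the same test1 data (SQLite will not show Oracle's HASH JOIN ANTI plan, but the query shape and result are identical):

```python
import sqlite3

# Keep (object_no, abi_code) pairs whose abi_code never appears in the
# priority_over column for the same object.
con = sqlite3.connect(":memory:")
con.executescript("""
create table test1 (object_no int, abi_code text, priority_over text);
insert into test1 values
 (1,'G07','Z99'),(1,'J01','A22'),(1,'J01','G06'),
 (1,'J01','G07'),(1,'J01','Z99'),(1,'J01','P01'),
 (2,'P01','Z99');
""")
rows = con.execute("""
select distinct object_no, abi_code
  from test1 t1
 where not exists (select 1 from test1 t2
                    where t2.object_no = t1.object_no
                      and t2.priority_over = t1.abi_code)
 order by object_no
""").fetchall()
print(rows)
```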
 
 
Thanks Tom
Dominic, March     01, 2010 - 11:40 am UTC
 
 
Excellent as usual.
The actual problem was complicated by the fact that test1 is actually an inline view, which initially caused me problems, but when I build the view in a WITH clause, that hash anti-join comes up lovely.
  1  with il as (
  2  select hi.object_no,
  3         hi.item_description,
  4         hfm.abi_code,
  5         himp.priority_over
  6    from obj_hitem hi,
  7         hp_freetext_map hfm,
  8         hp_item_map_priority himp
  9   where instr(upper(hi.item_description),hfm.rule) > 0
 10     and himp.abi_code = hfm.abi_code)
 11  select il.object_no,
 12         il.item_description,
 13         il.abi_code,
 14         il.priority_over
 15    from il
 16   where not exists (select null
 17                       from il il2
 18                      where il2.object_no = il.object_no
 19*                       and il2.priority_over = il.abi_code)
SQL> set autotrace traceonly explain
SQL> /
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=21 Card=376 Bytes=60
          912)
 
   1    2     RECURSIVE EXECUTION OF 'SYS_LE_2_0'
   2    0   TEMP TABLE TRANSFORMATION
   3    2     HASH JOIN (ANTI) (Cost=21 Card=376 Bytes=60912)
   4    3       VIEW (Cost=4 Card=388 Bytes=56648)
   5    4         TABLE ACCESS (FULL) OF 'SYS_TEMP_0FD9D6602_77604101'
           (Cost=4 Card=388 Bytes=5432)
 
   6    3       VIEW (Cost=4 Card=388 Bytes=6208)
   7    6         TABLE ACCESS (FULL) OF 'SYS_TEMP_0FD9D6602_77604101'
           (Cost=4 Card=388 Bytes=5432)
Once more you allow me to build a reputation of competence based on your hard work. It's solutions like this that are allowing me to turn 14000 rows of slow by slow pl/sql processing into one 3500 line insert statement :-)
Seriously though, I plug your website in every interview I attend, or site I work on. Who knows, if management finally learn that there is such a thing as good SQL, they might start hiring good developers to write it. Won't hold my breath.
Thanks again for all your work, just don't retire before I do.
Dom
 
 
 
CezarN, March     01, 2010 - 2:57 pm UTC
 
 
@Dominic:
I thought about analytics as a solution to your question:
select object_no, abi_code, priority_over
  from (select object_no,
               cod,
               src,
               count(distinct src) over(partition by cod) as csrc,
               count(distinct object_no) over(partition by cod) as cobj,
               abi_code,
               priority_over
          from (select object_no,
                       case d.src
                         when 1 then abi_code
                         else priority_over
                       end as cod,
                       d.src,
                       abi_code,
                       priority_over
                  from test1
                  join (select 1 as src from dual
                        union all
                        select 2 as src from dual) d
                    on 1 = 1))
 where csrc = 2 and cobj > 1
 order by cod, src, object_no; 
 
@CezarN
Dominic, March     02, 2010 - 3:53 am UTC
 
 
Nice way of looking at the problem, I'm pretty impressed.
Unfortunately, as it is at the moment, it relies on the existence of the row where abi_code = 'J01' and priority_over = 'P01', but I tweaked it a bit and got this
select distinct object_no, abi_code
from (
select object_no,
               cod,
               src,
               count(distinct src) over(partition by object_no,
                                                     cod) as csrc,
               abi_code,
               priority_over
          from (select object_no,
                       case d.src
                         when 1 then abi_code
                         else priority_over
                       end as cod,
                       d.src,
                       abi_code,
                       priority_over
                  from test1
                  join (select 1 as src from dual
                        union all
                        select 2 as src from dual) d
                    on 1 = 1))
where csrc = 1
and src = 1
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   SORT (UNIQUE)
   2    1     VIEW
   3    2       WINDOW (SORT)
   4    3         NESTED LOOPS
   5    4           VIEW
   6    5             UNION-ALL
   7    6               TABLE ACCESS (FULL) OF 'DUAL'
   8    6               TABLE ACCESS (FULL) OF 'DUAL'
   9    4           TABLE ACCESS (FULL) OF 'TEST1'
I'll have to do some benchmarking to see which method performs best, and also which one can be more easily merged into the main query but it's always nice to have several options.
Thanks
Dom 
March     02, 2010 - 8:28 am UTC 
 
I'd lay odds on a nice juicy pair of full scans and a hash operation. 
 
 
Always bet on Tom
Dominic, March     02, 2010 - 10:52 am UTC
 
 
SQL> get get_stats
  1  declare
  2    l_start number;
  3  begin
  4    insert into run_stats
  5    select 'before',
  6           stats.*
  7      from stats;
  8    l_start := dbms_utility.get_time;
  9    for i in 1 .. 100
 10    loop
 11      for x in (with il as (select hi.object_no,
 12                                   hi.item_description,
 13                                   hfm.abi_code,
 14                                   himp.priority_over
 15                              from obj_hitem hi,
 16                                   hp_freetext_map hfm,
 17                                   hp_item_map_priority himp
 18                             where instr(upper(hi.item_description),
 19                                         hfm.rule(+)) > 0
 20                               and himp.abi_code(+) = hfm.abi_code)
 21                select il.object_no,
 22                       il.item_description,
 23                       il.abi_code
 24                  from il
 25                 where not exists (select null
 26                                     from il il2
 27                                    where il2.object_no = il.object_no
 28                                      and il2.priority_over = il.abi_code))
 29      loop
 30        null;
 31      end loop;
 32    end loop;
 33    dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 34    insert into run_stats select 'after 1', stats.* from stats;
 35    l_start := dbms_utility.get_time;
 36    for i in 1 .. 100
 37    loop
 38      for x in (with test1 as (select hi.object_no,
 39                                   hi.item_description,
 40                                   hfm.abi_code,
 41                                   himp.priority_over
 42                              from obj_hitem hi,
 43                                   hp_freetext_map hfm,
 44                                   hp_item_map_priority himp
 45                             where instr(upper(hi.item_description),
 46                                         hfm.rule(+)) > 0
 47                               and himp.abi_code(+) = hfm.abi_code)
 48                select distinct object_no,
 49                       item_description,
 50                       abi_code
 51                  from (select object_no,
 52                               item_description,
 53                               cod,
 54                               src,
 55                               count(distinct src) over(partition by object_no,
 56                                                                     cod) as csrc,
 57                               abi_code,
 58                               priority_over
 59                          from (select object_no,
 60                                       item_description,
 61                                       case d.src
 62                                         when 1 then abi_code
 63                                         else priority_over
 64                                       end as cod,
 65                                       d.src,
 66                                       abi_code,
 67                                       priority_over
 68                                  from test1
 69                                  join (select 1 as src from dual
 70                                        union all
 71                                        select 2 as src from dual) d
 72                                    on 1 = 1))
 73                 where csrc = 1
 74                   and src = 1)
 75      loop
 76        null;
 77      end loop;
 78    end loop;
 79    dbms_output.put_line( (dbms_utility.get_time-l_start) || ' hsecs' );
 80    insert into run_stats select 'after 2', stats.* from stats;
 81* end;
 82  /
1240 hsecs
1480 hsecs
SQL> get extract_stats
  1  select a.name, b.value-a.value run1, c.value-b.value run2,
  2         ( (c.value-b.value)-(b.value-a.value)) diff
  3    from run_stats a, run_stats b, run_stats c
  4   where a.name = b.name
  5     and b.name = c.name
  6     and a.runid = 'before'
  7     and b.runid = 'after 1'
  8     and c.runid = 'after 2'
  9     and (c.value-a.value) > 0
 10     and (c.value-b.value) <> (b.value-a.value)
 11*  order by abs( (c.value-b.value)-(b.value-a.value))
SQL> /
 
NAME                                           RUN1       RUN2       DIFF
---------------------------------------- ---------- ---------- ----------
LATCH.FOB s.o list latch                          1          0         -1
LATCH.lgwr LWN SCN                                4          5          1
LATCH.session timer                               4          5          1
STAT...calls to kcmgcs                            6          7          1
LATCH.mostly latch-free SCN                       4          5          1
LATCH.active checkpoint queue latch               8         10          2
LATCH.channel operations parent latch           708        710          2
LATCH.Consistent RBA                              4          1         -3
LATCH.redo writing                               25         22         -3
STAT...active txn count during cleanout           0          4          4
STAT...cleanout - number of ktugct calls          0          4          4
STAT...consistent gets - examination              0          4          4
LATCH.messages                                   35         40          5
LATCH.simulator hash latch                     4372       4348        -24
LATCH.multiblock read objects                  9554       9502        -52
LATCH.sort extent pool                          700        800        100
STAT...parse count (total)                      106          6       -100
STAT...sorts (disk)                               0        100        100
STAT...opened cursors cumulative                106          6       -100
STAT...execute count                            205        105       -100
LATCH.session idle bit                          102          1       -101
LATCH.simulator lru latch                       561        672        111
STAT...db block gets                            759        637       -122
STAT...prefetched blocks                      26031      25893       -138
STAT...free buffer requested                  30972      30775       -197
LATCH.dml lock allocation                       202          4       -198
STAT...calls to kcmgas                          200          0       -200
STAT...sorts (memory)                             4        204        200
LATCH.session allocation                        204          3       -201
STAT...session uga memory max                     0        216        216
LATCH.checkpoint queue latch                   1536       1920        384
LATCH.undo global data                         3807       3408       -399
LATCH.loader state object freelist              200        600        400
STAT...consistent changes                       400          0       -400
LATCH.redo allocation                           455         47       -408
STAT...redo entries                             445         31       -414
STAT...calls to get snapshot scn: kcmgss      20607      20105       -502
STAT...enqueue releases                         708        202       -506
STAT...enqueue requests                         709        202       -507
LATCH.enqueues                                 4727       4021       -706
LATCH.cache buffers lru chain                   900        100       -800
STAT...db block changes                         870         53       -817
LATCH.enqueue hash chains                      5028       4026      -1002
LATCH.shared pool                              1428        233      -1195
LATCH.library cache pin                        1501        305      -1196
LATCH.library cache pin allocation             1284         88      -1196
LATCH.row cache objects                        9630       8254      -1376
LATCH.row cache enqueue latch                  9630       8244      -1386
STAT...buffer is not pinned count             81716      80316      -1400
STAT...table scan blocks gotten               81700      80300      -1400
STAT...no work - consistent read gets         81708      80308      -1400
STAT...consistent gets                       101917     100320      -1597
STAT...session logical reads                 102676     100957      -1719
LATCH.cache buffers chains                   243285     241440      -1845
LATCH.library cache                            3893        497      -3396
STAT...physical writes                          800       4900       4100
STAT...physical writes direct                   800       4900       4100
STAT...physical writes non checkpoint           800       4900       4100
STAT...physical reads                         30858      35669       4811
STAT...physical reads direct                      0       5000       5000
STAT...redo size                              82956      29868     -53088
STAT...recursive calls                        66331       7731     -58600
STAT...table scan rows gotten               8852600    8717200    -135400
STAT...sorts (rows)                            3084     201984     198900
 
64 rows selected.
 
I'm guessing it's the sort that does for the analytic version?
Dom
 
 
March     02, 2010 - 12:47 pm UTC 
 
I'd use get_cpu_time instead of just get_time.  
hash (anti/non-anti) joins are just brutally efficient - one of the reasons we started using hash operations for many more things in 10g (over sort operations) 
 
 
@Dominic
CezarN, March     03, 2010 - 12:22 am UTC
 
 
Your where condition:
where csrc = 1
and src = 1
means that we have to have only ONE src and that src must be abi_code, right?
So we can reduce the amount of sorting the analytics imply by working around that COUNT(DISTINCT src). I changed the coding set for src from (1,2) to (0,1). This means that a SUM of ZERO on src means we have only ONE src and that src must be abi_code. No more DISTINCT:
select distinct object_no, abi_code
from (
select object_no,
               cod,
               src,
               sum(src) over(partition by object_no, cod) as csrc,
               abi_code,
               priority_over
          from (select object_no,
                       case d.src
                         when 0 then abi_code
                         else priority_over
                       end as cod,
                       d.src,
                       abi_code,
                       priority_over
                  from test1
                  join (select 0 as src from dual
                        union all
                        select 1 as src from dual) d
                    on 1 = 1)
)
where csrc = 0
Hope this does better than with DISTINCT. 
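For anyone following along outside the database, the 0/1 encoding can be sketched in plain Python. The sample rows below are a hypothetical subset shaped like the thread's test1 table; this is an illustration of the trick, not the Oracle execution:

```python
from collections import defaultdict

# Hypothetical rows shaped like test1: (object_no, abi_code, priority_over)
rows = [
    (1, 'G07', 'Z99'),
    (1, 'J01', 'A22'),
    (1, 'J01', 'G07'),
    (2, 'P01', 'Z99'),
]

# Mirror the cross join against src in (0, 1): each row contributes its
# abi_code with src = 0 and its priority_over with src = 1.
expanded = []
for object_no, abi, prio in rows:
    expanded.append((object_no, abi, 0, abi))
    expanded.append((object_no, prio, 1, abi))

# sum(src) over (partition by object_no, cod): a partition sum of zero
# means the code only ever appeared as an abi_code, never as something
# another code has priority over.
sums = defaultdict(int)
for object_no, cod, src, _ in expanded:
    sums[(object_no, cod)] += src

result = sorted({(object_no, abi)
                 for object_no, cod, src, abi in expanded
                 if src == 0 and sums[(object_no, cod)] == 0})
print(result)
```

With these rows only (1, 'J01') and (2, 'P01') survive: G07 drops out because J01 has priority over it, matching the behaviour of the csrc = 0 filter.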
 
@CezarN
Dominic, March     03, 2010 - 5:30 am UTC
 
 
Actually you can have more than 1 src as shown below, but fortunately because we are partitioning on cod, multiple values are still picked up. Having said that, I do like the sum method more, seems sharper to me somehow.
SQL> select *
  2  from test1
  3  /
 
 OBJECT_NO ABI PRI
---------- --- ---
         1 G07 Z99
         1 J01 A22
         1 J01 G06
         1 J01 G07
         1 J01 Z99
         2 P01 Z99
 
6 rows selected.
 
SQL> insert into test1
  2  values(1,'H01','Z99')
  3  /
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
select distinct object_no, 
       abi_code
  from (select object_no,
               cod,
               src,
               sum(src) over(partition by object_no, cod) as csrc,
               abi_code,
               priority_over
          from (select object_no,
                       case d.src
                         when 0 then abi_code
                         else priority_over
                       end as cod,
                       d.src,
                       abi_code,
                       priority_over
                  from test1
                  join (select 0 as src from dual
                        union all
                        select 1 as src from dual) d
                    on 1 = 1)
)
where csrc = 0
SQL> /
 
 OBJECT_NO ABI
---------- ---
         1 H01
         1 J01
         2 P01
The run times don't change much to be honest (sorry Tom, should have mentioned it before, I'm on version 9.2.0.7.0 so no get_cpu_time) and the stats seem to be the same, so the anti hash join looks to be the winner over analytics in this case. 
(looks like that's another set of pre-conceptions for the bin) :-) 
Thanks
Dom
 
 
 
Reader, June      10, 2010 - 12:17 pm UTC
 
 
Tom,
In what cases should we use analytics and in what cases group by should be used.
Ranking can be done without using the rank function. Can you tell which one is better: ranking using the rank analytic function, or just using group by and getting a row number?
thanks 
June      10, 2010 - 2:49 pm UTC 
 
that is like asking "when should I join and when should I group by"
Show me an example of an analytic (that does not ever aggregate, they do not by definition) that can be replaced by group by (that always aggregates - in general returns FEWER rows than a non-group by) and vice versa 
when you do that, I'll let you know the answer.
Use rank for example, do ranking with group by and ranking with analytics and then we can have a discussion. 
 
 
Analytical Query
NS, June      22, 2010 - 3:49 pm UTC
 
 
I’ll really appreciate your help to solve a particular scenario using analytical query.
I'm working on Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit. 
Following are the create and insert statements:
create table ntest(a number, pdt varchar2(10),sec_pdt varchar2(10),htime number);               
 
 insert into ntest values(1,'RE','ST',1000); 
 insert into ntest values(2,'FF','RE',1100);  
 insert into ntest values(3,'RE','NR',1400); 
 insert into ntest values(4,'NR','FF',1450); 
 insert into ntest values(5,'RE','NR',1700);
 insert into ntest values(6,'NR','FF',2100);
 insert into ntest values(7,'TT','FA',2105);  
 insert into ntest values(8,'MM','FJ',2107);
 insert into ntest values(9,'RE','TT',2115);  
 insert into ntest values(10,'RE','FF',2500);
 insert into ntest values(11,'MN','RT',2510); 
 insert into ntest values(12,'MN','NT',2600); 
 insert into ntest values(13,'ZZ','FF',3000); 
 insert into ntest values(14,'ZN','FF',3100); 
 
commit;
I want to select column a for those rows where the pdt_cd in that row matches the pdt_cd in another row whose htime is within +150 or -150 of this row's htime.
e.g. if pdt_cd is 'NS' in the row with htime 1000 and there is another row with htime between 850 and 1150 with the same pdt_cd, then I want to select that a value.
similar is the case with sec_pdt column.
I'm able to tackle this situation by writing following query :
select a from
(               
select a,   
count(*) OVER(Partition by pdt
              order by htime RANGE BETWEEN 150 PRECEDING AND 150 FOLLOWING) pdtcnt,
count(*) OVER(Partition by sec_pdt
              order by htime RANGE BETWEEN 150 PRECEDING AND 150 FOLLOWING) secpdtcnt                         
from ntest   
) n
where pdtcnt > 1 
or secpdtcnt > 1;
I want to add 2 more OR conditions to this query 
1. I want to select column a for those values where pdt_cd in that row matches with the sec_pdt_cd in another row which is within the range of +150 or -150 of htime
2. I want to select column a for those values where sec_pdt_cd in that row matches with the pdt_cd in another row which is within the range of +150 or -150 of htime
I'm not able to figure out how to add these 2 conditions in the above query or is there any better way to do this  in single query?
 
June      24, 2010 - 6:18 am UTC 
 
it would have been interesting to have at least one observation such that:
where pdt_cd in that row matches 
with the sec_pdt_cd in another row which is within the range of +150 or -150 of 
htime
in the example wouldn't it?
I added:
insert into ntest values(42,'RE','NR',1800);
to create one.  Then, if you add:
count(*) over (partition by pdt, sec_pdt order by htime range between 150 preceding and 150 following)
you just want to keep where that is greater than one.
and isn't:
I want to select column a for those values where sec_pdt_cd in that row 
matches with the pdt_cd in another row which is within the range of +150 or 
-150 of htime
the same as the prior - if pdt_cd has a sec_pdt_cd within +-150 - then that same thing is true in reverse already. 
 
 
Analytics Question
NS, June      24, 2010 - 2:53 pm UTC
 
 
When I said
"where pdt_cd in that row matches 
with the sec_pdt_cd in another row which is within the range of +150 or -150 of htime  " I meant that pdt_cd in first row matches with the sec_pdt_cd in other rows, like 2,3 .... which is within the range of +150 or -150 of first row.
In my example RE(pdt_cd) from row 1 matches with RE(sec_pdt_cd) of row 2 and is within +150 or -150 of 1000(htime of row 1)
The other scenario is reverse
NR(sec_pdt_cd) from row 3 matches with NR(pdt_cd) of row 4 which is within +150 or -150 of 1400(htime of row 3)
From someone help I'm able to come up to the following query which brings the output I'm looking for.
with t as
 (select a, pdt combined_pdt, htime,pdt,sec_pdt
  from ntest
  union
  select a, sec_pdt combined_pdt, htime,pdt,sec_pdt
  from ntest)
select  distinct a,pdt,sec_pdt,htime
from (select a,
             combined_pdt,
             pdt,
             sec_pdt,
             htime,
             count(*) over(partition by combined_pdt order by htime range between 150 preceding and 150 following) pdt_cnt
      from t)
where pdt_cnt > 1
and combined_pdt is not null
order by htime;
Here is the result.
A PDT SEC_PDT HTIME
1 RE ST 1000
2 FF RE 1100
3 RE NR 1400
4 NR FF 1450
5 RE NR 1700
42 RE NR 1800
7 TT FA 2105
9 RE TT 2115
11 MN RT 2510
12 MN NT 2600
13 ZZ FF 3000
14 ZN FF 3100
Thanks. 
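The union-then-window idea generalizes beyond SQL. As a sanity check, here is a small Python sketch of the same logic on a hypothetical subset of the ntest rows, with a brute-force count standing in for count(*) over (... range between 150 preceding and 150 following):

```python
# Hypothetical subset of ntest: (a, pdt, sec_pdt, htime)
rows = [
    (1, 'RE', 'ST', 1000),
    (2, 'FF', 'RE', 1100),
    (3, 'RE', 'NR', 1400),
    (4, 'NR', 'FF', 1450),
    (13, 'ZZ', 'FF', 3000),
]

# UNION both code columns into one combined stream, keeping the row id,
# so pdt-vs-sec_pdt matches fall out of a single partition.
combined = [(a, pdt, h) for a, pdt, sec, h in rows] + \
           [(a, sec, h) for a, pdt, sec, h in rows]

# Brute-force stand-in for the windowed count: a row qualifies when some
# OTHER entry with the same code lies within +-150 of its htime.
matched = sorted({a for a, code, h in combined
                  if sum(1 for _, c2, h2 in combined
                         if c2 == code and abs(h2 - h) <= 150) > 1})
print(matched)
```

Rows 1 through 4 all qualify, including row 4, whose pdt NR matches row 3's sec_pdt within the window; row 13 does not qualify until a neighbour like row 14 is present.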
 
reader
A reader, July      12, 2010 - 12:03 pm UTC
 
 
On Toad Grid output,
break on column
does not work
Is there a way to simulate that function with analytics so that I can just execute a SQL query without "break on table_name" at the top and get the same result?
Thanks 
July      19, 2010 - 10:53 am UTC 
 
ops$tkyte%ORA10GR2> select deptno, ename from scott.emp order by deptno, ename;
    DEPTNO ENAME
---------- ----------
        10 CLARK
        10 KING
        10 MILLER
        20 ADAMS
        20 FORD
        20 JONES
        20 SCOTT
        20 SMITH
        30 ALLEN
        30 BLAKE
        30 JAMES
        30 MARTIN
        30 TURNER
        30 WARD
14 rows selected.
ops$tkyte%ORA10GR2> break on deptno
ops$tkyte%ORA10GR2> select deptno, ename from scott.emp order by deptno, ename;
    DEPTNO ENAME
---------- ----------
        10 CLARK
           KING
           MILLER
        20 ADAMS
           FORD
           JONES
           SCOTT
           SMITH
        30 ALLEN
           BLAKE
           JAMES
           MARTIN
           TURNER
           WARD
14 rows selected.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select decode( row_number() over (partition by deptno order by ename), 1, deptno ) new_deptno,
  2         ename
  3    from scott.emp
  4    order by deptno, ename
  5  /
NEW_DEPTNO ENAME
---------- ----------
        10 CLARK
           KING
           MILLER
        20 ADAMS
           FORD
           JONES
           SCOTT
           SMITH
        30 ALLEN
           BLAKE
           JAMES
           MARTIN
           TURNER
           WARD
14 rows selected.
it will not be "as efficient" of course - it takes extra work to compute row_number 
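The decode(row_number() ...) trick is simply "print the group key only on the first row of its group". A minimal Python sketch of the same idea over pre-sorted rows (sample data hypothetical, shaped like scott.emp):

```python
# Pre-sorted (deptno, ename) rows, as the query's ORDER BY guarantees.
rows = [(10, 'CLARK'), (10, 'KING'), (20, 'ADAMS'),
        (20, 'FORD'), (30, 'ALLEN')]

# Emit the group key only when it changes, blanking the repeats --
# the same effect as SQL*Plus "break on deptno".
out = []
prev = object()  # sentinel that equals nothing
for deptno, ename in rows:
    out.append((deptno if deptno != prev else None, ename))
    prev = deptno
print(out)
```

The None entries play the role of the blanked-out deptno values in the SQL*Plus output.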
 
 
 
Analytics Stumper
Kevin Kirkpatrick, August    27, 2010 - 4:55 pm UTC
 
 
Hey Tom, 
I'm presently stumped on this analytics problem - and am about to bite the bullet and code it out in some very ugly PL/SQL.  But before throwing in the towel, I figured I'd let a pro take a stab.
2 Tables, DEBTS and ASSETS, defined as
WITH DEBTS AS 
(
SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL 
) ,
ASSETS AS
(SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
 SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
 SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL) 
-- insert query here.
/*
DEBTS can be interpreted as: "Transaction <TRANSACTION_ID> must deduct <UNITS> units from 
assets of type <ATYPE>."
ASSETS can be interpreted as: "Asset <ASSET_ID> is of type <ATYPE> and has <UNITS> available"
Desired results:  Show the required series of steps to properly deduct DEBTS units from ASSETS units of the same ASSET TYPE.  
Rules: 
1) Process DEBTS in order of TRANSACTION_ID.  
2) Debt units must be removed from ASSETS with a matching asset type, and must be 
deducted from assets in order of ASSET_ID*.
3) Uncovered Debts of a given transaction are not associated with any ASSET, but 
are available in the result set for error handling.  
So, for TRANSACTION_ID 1, we need to remove 4 units from ATYPE = 'X'.  We first remove 3 units 
from ASSET_ID A, then 1 more unit for ASSET_ID B.  Transaction 2 must remove an additional 5 
units from ATYPE='X'. It will take the remaining 2 units from ASSET_ID B, and then 3 more units
from ASSET_ID C.  No units will be taken from ASSET_ID D.
The total activity for ATYPE=X would thus appear as follows in the query results:
TRANSACTION_ID  UNITS_REMOVED ASSET_ID
1               3             A
1               1             B
2               2             B
2               3             C
Following this logic, transaction ID 3 is a new asset type, Y, and would thus have the following:
TRANSACTION_ID  UNITS_REMOVED ASSET_ID
3               2             E
The special case is if there are insufficient UNITS in ASSETS to cover a DEBT.  In that case
the remaining units need to be left as <null> ASSET_ID.  So Transaction ID 4 would give:
TRANSACTION_ID  UNITS_REMOVED ASSET_ID
4               2             F  
4               1             <null>
Anyway, to recap: can you substitute for <insert query here> such that this result set would be obtained:
WITH DEBTS AS 
(
SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL ) ,
ASSETS AS
(SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
 SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
 SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
 SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL) 
-- insert query here.
TRANSACTION_ID  UNITS_REMOVED ASSET_ID
1               3             A
1               1             B
2               2             B
2               3             C
3               2             E
4               2             F  
4               1             <null>
Or do I have to bite the bullet and cram out some PL/SQL? (the main reason I want to avoid that is performance - I've got to process hundreds of thousands of DEBTS from millions of ASSETS, and need it to happen quick, e.g. PARALLEL FTS of ASSETS, PARALLEL FTS of DEBTS, hash join, and crunch).   
 
Solved
Kevin Kirkpatrick, August    30, 2010 - 10:30 am UTC
 
 
Tom,
Feel free to ignore the above - I've come up with the following solution (spoiler alert - to those who like to work through a neatly packaged SQL puzzler, I'd put this as a prime example...)
WITH 
DEBTS AS (
    SELECT 1 TRANSACTION_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
    SELECT 2 TRANSACTION_ID, 'X' ATYPE, 5 UNITS FROM DUAL UNION ALL
    SELECT 3 TRANSACTION_ID, 'Y' ATYPE, 2 UNITS FROM DUAL UNION ALL
    SELECT 4 TRANSACTION_ID, 'Z' ATYPE, 3 UNITS FROM DUAL )
,ASSETS AS (
    SELECT 'A' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
    SELECT 'B' ASSET_ID, 'X' ATYPE, 3 UNITS FROM DUAL UNION ALL
    SELECT 'C' ASSET_ID, 'X' ATYPE, 4 UNITS FROM DUAL UNION ALL
    SELECT 'D' ASSET_ID, 'X' ATYPE, 2 UNITS FROM DUAL UNION ALL
    SELECT 'E' ASSET_ID, 'Y' ATYPE, 3 UNITS FROM DUAL UNION ALL
    SELECT 'F' ASSET_ID, 'Z' ATYPE, 2 UNITS FROM DUAL)
-- Solution:
-- Asset ranges define the number of units each asset_id will contribute
-- to a given asset type.  Define into existence the "null" asset, which
-- effectively contributes infinite units to each asset type.
,ASSET_RANGES AS (
    SELECT ASSET_ID, ATYPE, 
           NVL(LAG(RUNNING_UNITS) OVER (PARTITION BY ATYPE ORDER BY ASSET_ID),0)+1 MIN_UNIT,
           RUNNING_UNITS MAX_UNIT 
    FROM 
        (SELECT SUM(UNITS) OVER (PARTITION BY ATYPE ORDER BY ASSET_ID) RUNNING_UNITS, 
                A.* 
         FROM ASSETS A) 
    UNION ALL
    SELECT NULL ASSET_ID, ATYPE, SUM(UNITS)+1 MIN_UNIT, 1000000000000 MAX_UNIT
    FROM   ASSETS
    GROUP BY ATYPE)
-- Debt ranges define the number of units each transaction_id will deduct
-- from a given asset type.  
,DEBT_RANGES AS (
    SELECT  TRANSACTION_ID, ATYPE,
            NVL(LAG(RUNNING_UNITS) OVER (PARTITION BY ATYPE ORDER BY TRANSACTION_ID),0)+1 MIN_UNIT,
            RUNNING_UNITS MAX_UNIT 
    FROM    
       (SELECT  SUM(UNITS) OVER (PARTITION BY ATYPE ORDER BY TRANSACTION_ID) RUNNING_UNITS, 
                D.* 
        FROM DEBTS D))
-- Using the ranges defined above, I get my desired "HASH JOIN AND CRUNCH" performance         
SELECT  TRANSACTION_ID,
        ASSET_ID, 
        LEAST(A.MAX_UNIT, D.MAX_UNIT) - GREATEST(A.MIN_UNIT,D.MIN_UNIT) +1
FROM    DEBT_RANGES D, ASSET_RANGES A
WHERE   D.ATYPE = A.ATYPE AND D.MIN_UNIT <= A.MAX_UNIT AND D.MAX_UNIT >= A.MIN_UNIT
ORDER BY TRANSACTION_ID, ASSET_ID 
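The interval trick is worth restating outside SQL: turn both running sums into unit ranges per asset type, and an ordinary overlap join performs the FIFO allocation. A Python sketch of the same algorithm on the thread's sample data (a sketch of the technique, not the production query):

```python
from collections import defaultdict

debts  = [(1, 'X', 4), (2, 'X', 5), (3, 'Y', 2), (4, 'Z', 3)]
assets = [('A', 'X', 3), ('B', 'X', 3), ('C', 'X', 4),
          ('D', 'X', 2), ('E', 'Y', 3), ('F', 'Z', 2)]

def unit_ranges(items):
    # Map each (key, atype, units) to an inclusive unit range [lo, hi]
    # within its type -- the lag(sum(units) over ...) trick.
    out = defaultdict(list)
    for key, atype, units in items:
        prev = out[atype][-1][2] if out[atype] else 0
        out[atype].append((key, prev + 1, prev + units))
    return out

INF = 10 ** 12  # the "null asset", soaking up uncovered debt units
a_ranges = unit_ranges(assets)
for lst in a_ranges.values():
    lst.append((None, lst[-1][2] + 1, INF))
d_ranges = unit_ranges(debts)

# Overlap join: LEAST/GREATEST become min/max.
result = []
for atype, dlst in d_ranges.items():
    for tid, dlo, dhi in dlst:
        for aid, alo, ahi in a_ranges[atype]:
            if dlo <= ahi and dhi >= alo:
                result.append((tid, min(dhi, ahi) - max(dlo, alo) + 1, aid))
result.sort(key=lambda r: (r[0], r[2] or '~'))  # None sorts last per txn
print(result)
```

This reproduces the requested result set, including (4, 1, None) for the uncovered unit of transaction 4, and like the SQL it needs only one pass over each side plus the join.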
 
 
analytics
A reader, October   15, 2010 - 9:09 am UTC
 
 
Hi Tom,
you always say that the analytical functions *opens up NEW * path .
Quote from your book :
"This new set of functionality holds some exciting possibilities. It opens up a whole new way of looking
at the data. It will remove a lot of procedural code and complex (or inefficient) queries that would have
taken a long time to develop, to achieve the same result".
What is the new path? What is the 'mechanism' behind it? How does it happen? Without doing a 'self join', how does it apply the analytic function and calculate AVG (for example)?
Thanks 
 
October   15, 2010 - 9:25 am UTC 
 
not sure how to answer this.
close your eyes, envision a table of data.  Now, envision a sliding range of data in that big table of data - a range of rows.  Analytics have access to the rows in that range and can reference them.  The "new path" is "new code" that was added to the database kernel to open up access to that range - what we call a "window" - and added steps to plans like "window buffer sort" and the like.  
 
 
 
A reader, October   15, 2010 - 9:32 am UTC
 
 
Many thanks Tom,
"The "new path" is "new code" that was added to the database kernel"
Yes, understood now - one more doubt: the Oracle engine has to access the FULL table first before applying ANY analytic function (assume we are applying analytics on a TABLE) - correct? 
October   15, 2010 - 10:04 am UTC 
 
... Oracle engine has to access the 
FULL table first before applying ANY analytical function ...
of course the oracle engine has access to the full table - that is because we "own" the table, we manage all of the data.
However, it is not a REQUIREMENT to access the full table before applying ANY analytical function - it would depend on the schema (indexing), the query, the partitioning clause and so on.  
Sometimes the entire query is "executed" before the first rows are returned and sometimes the first rows may be returned immediately without hitting the entire table.  "it depends" - for all queries "it depends".  It is not a requirement. 
 
 
Slow analytic function
lalu, October   17, 2010 - 4:17 am UTC
 
 
HiTom,
I have a table with contents as below:
id notes
1 abcdefghijklmnopqrstuvwxyz
I want the output as below
id line notes
1 1 abcd (varchar of 4 characters)
1 2 efgh
1 3 ijkl
.....
....
1 7 yz
I use the below query to get the data:
SELECT DISTINCT A.ID, LEVEL LINE, SUBSTR(A.T, (LEVEL - 1) * 4 + 1, 4) NOTES
FROM (SELECT ID, CONTENTS T FROM MY_TABLE) A
CONNECT BY LENGTH(SUBSTR(A.T, (LEVEL - 1) * 4 + 1, (LEVEL - 1) * 4)) >= 1
ORDER BY ID, LINE
In my production system the table has got 50000 rows and the notes field is 1000char.
I want the data in chunks of 80 chars (notes) in the output.
I tried the above query replacing 4 with 80. The query runs for 3-4 hours but still no output.
Is there any way to rewrite the query to get the data faster?
Thanks.
lalu.
 
October   25, 2010 - 8:17 am UTC 
 
ops$tkyte%ORA10GR2> create table t
  2  as
  3  select rownum id,
  4         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
  5         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
  6         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
  7         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
  8         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
  9         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
 10         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
 11         rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) || rpad( object_name, 30, 'x' ) ||
 12             rpad('y',81,'y') data
 13    from all_objects
 14   where rownum <= 5000
 15  /
Table created.
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> set autotrace traceonly
ops$tkyte%ORA10GR2> set timing on
ops$tkyte%ORA10GR2> select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, 5 ) new_data
  2    from t, table( cast( multiset( select to_char( rownum, 'fm0009' ) || substr( data, 1+(rownum-1)*80, 80 )
  3                                     from dual
  4                                                                  connect by level <= ceil(length(data)/80) ) as sys.odciVarchar2List ) )
  5  /
70000 rows selected.
Elapsed: 00:00:01.29
Execution Plan
----------------------------------------------------------
Plan hash value: 442444467
----------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |    50M|    25G| 59040 |
|   1 |  NESTED LOOPS                       |      |    50M|    25G| 59040 |
|   2 |   TABLE ACCESS FULL                 | T    |  6130 |  3202K|    84 |
|   3 |   COLLECTION ITERATOR SUBQUERY FETCH|      |       |       |       |
|   4 |    COUNT                            |      |       |       |       |
|*  5 |     CONNECT BY WITHOUT FILTERING    |      |       |       |       |
|   6 |      FAST DUAL                      |      |     1 |       |     2 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter(LEVEL<=CEIL(LENGTH(:B1)/80))
Note
-----
   - cpu costing is off (consider enabling it)
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
         68  recursive calls
          0  db block gets
       4903  consistent gets
        834  physical reads
          0  redo size
    6447972  bytes sent via SQL*Net to client
      51726  bytes received via SQL*Net from client
       4668  SQL*Net roundtrips to/from client
       5000  sorts (memory)
          0  sorts (disk)
      70000  rows processed
ops$tkyte%ORA10GR2> set autotrace off
seems to work fairly well. 
 
 
 
Tried but got error
A reader, October   25, 2010 - 9:18 am UTC
 
 
Hi Tom,
I tried your example.
Created table T, then ran:
select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, ) new_data
      from t, table( cast( multiset( select to_char( rownum, 'fm0009' ) || substr( data, 
1+(rownum-1)*80, 80 ) from dual  connect by level <= ceil(length(data)/80) )as sys.odciVarchar2List ) )
returns:
select id, to_number( substr( column_value, 1, 4 ) ) id2, substr( column_value, ) new_data
                                                                                *
ERROR at line 1:
ORA-00936: missing expression
 
 
Working
lalu, October   25, 2010 - 9:33 am UTC
 
 
Hi Tom,
Thanks a lot.
I was missing a very small char from the script.
It's running now. Will check the performance.
Thanks. 
 
Achieve this with analytics or pivot query?
A reader, January   20, 2011 - 3:35 pm UTC
 
 
Hi
I have following data:
phone     state    state_mod_date    service_id
--------  -------  ----------------  ------------
9087624   online   2010/02/22        1
9087624   online   2010/02/22        2
9087624   online   2010/02/23        6
9087624   offline  2010/02/23        2
9087624   offline  2010/02/24        6
9200921   online   2010/01/25        1
9200921   online   2010/03/12        3
9200921   online   2010/04/21        7
9200921   offline  2010/09/11        3
9760293   online   2010/01/02        1
I need to transform it to
phone     service_id  start_date  end_date
--------  ----------- ----------- ----------
9087624   1           2010/02/22
9087624   2           2010/02/22  2010/02/23
9087624   6           2010/02/23  2010/02/24
9200921   1           2010/01/25
9200921   3           2010/03/12  2010/09/11
9200921   7           2010/04/21
9760293   1           2010/01/02
I have been doing this with PL/SQL, but the amount of data has been increasing in the last few months and the PL/SQL code is running slower and slower.
I have been looking at analytic functions and pivot queries but can't find a way of doing it. Can you please shed some light?
I am running 10.2.0.3
Thanks a lot
Pedro 
January   24, 2011 - 6:58 am UTC 
 
you would need to 
a) supply create tables
b) insert into tables (so we have data)
and most importantly:
c) explain your requirement, sometimes a picture is worth a thousand words - but we need an explanation of your inputs and outputs here and the processing that needs to take place.
It looks like just getting the min(state_mod_date) and max(state_mod_date) (where max<>min) by phone and service_id, it is not clear what the state column is used for 
 
 
A reader, January   24, 2011 - 11:39 am UTC
 
 
use group by as below...
select  phone, service_id, min(case when state='online' then state_mod_date END) as start_date   , max(case when state='offline' then state_mod_date END) as end_date  
from data 
Group by phone, service_id
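Applied to the sample data from the question (table and column names assumed from the posting; an ORDER BY added for display), this approach would look like:

```sql
-- Pivot each (phone, service_id) pair onto one row:
-- the 'online' row supplies start_date, the 'offline' row (if any) end_date.
-- Assumes each service goes online at most once per phone.
select phone, service_id,
       min(case when state = 'online'  then state_mod_date end) as start_date,
       max(case when state = 'offline' then state_mod_date end) as end_date
  from t1
 group by phone, service_id
 order by phone, service_id;
```

With the posted data this should return one row per phone/service_id, with end_date NULL where the service never went offline.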
 
 
Achieve this with analytics or pivot query?
Ravi Kumar Pilla, January   27, 2011 - 9:00 am UTC
 
 
Dear Tom,
I'm a silent reader of your site and learned a lot and appreciate your work.
I don't know whether it is the best way, but it's working for me. Correct me if there is any other way which works faster, for the person who asked.
 SELECT PHONE,SERVICEID,MAX(DECODE(status,'ONLINE',START_date))  startdate,
 MAX(decode(status,'OFFLINE',START_DATE)) ENDDATE FROM PHONE GROUP BY PHONE,
 SERVICEID
 
February  01, 2011 - 3:16 pm UTC 
 
as I said above:
It looks like just getting the min(state_mod_date) and max(state_mod_date) (where max<>min) by phone and service_id, it is not clear what the state column is used for 
I cannot validate your query as I don't know what the question is  
 
 
Combine dates
Bob C, January   28, 2011 - 3:12 pm UTC
 
 
I have a similar challenge as above, except I want to combine rows to show continuous date range when rate plan and customer are the same.  Records are not supposed to overlap, so that does not need to be handled.
drop table t1;
create table t1 ( key_col char(1), d_begin date, d_end date, cust char(10), rate_plan char(10) );
insert into t1 values ('a',TO_DATE('01/01/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/02/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1'); 
insert into t1 values ('a',TO_DATE('01/03/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/04/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/05/2011 01:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/06/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/07/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/08/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/09/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/10/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('a',TO_DATE('01/11/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/12/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r2');
insert into t1 values ('b',TO_DATE('01/01/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/02/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/03/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/04/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/05/2011 01:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/06/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/07/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/08/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r1');
insert into t1 values ('b',TO_DATE('01/09/2011 00:00:00', 'MM/DD/YYYY HH24:MI:SS'),TO_DATE('01/10/2011 23:59:59', 'MM/DD/YYYY HH24:MI:SS'), 'c1', 'r2');
commit;
Would like to get results:
a, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
a, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
a, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
b, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
b, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
b, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
The following query is close, but it does not account for the gap between rows 2 and 3:
select distinct key_col, cust, rate_plan,
        first_value(d_begin) over (partition by key_col, cust, rate_plan order by d_begin rows unbounded preceding) as min_eff,
        last_value(d_end) over (partition by key_col, cust, rate_plan order by d_begin ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as max_term
        from t1
        order by 1,2,4
KEY_COL CUST RATE_PLAN  MIN_EFF  MAX_TERM
a c1         r1         1/1/2011 1/10/2011 11:59:59 PM
a c1         r2         1/11/2011 1/12/2011 11:59:59 PM
b c1         r1         1/1/2011 1/8/2011 11:59:59 PM
b c1         r2         1/9/2011 1/10/2011 11:59:59 PM
 
February  01, 2011 - 3:31 pm UTC 
 
I don't understand your output
a, 01/01/2011 00:00:00, 01/04/2011 23:59:59, c1, r1
a, 01/05/2011 01:00:00, 01/10/2011 23:59:59, c1, r1
a, 01/11/2011 00:00:00, 01/12/2011 23:59:59, c1, r2
I see why the third row is there, but I don't get why the first two are not merged.
I don't see any "gap" there. 
 
 
@Bob C re:combine dates
Stew Ashton, January   29, 2011 - 5:04 am UTC
 
 
Hi Bob,
Here's a solution based on a contribution by Steve from Pleasanton CA: 
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1594948200346140013#1596748700346101469

select key_col, cust, rate_plan, min(d_begin) d_begin, max(d_end) d_end from (
  select key_col, cust, rate_plan, d_begin, d_end,
  sum(new_start) over(partition by key_col,cust,rate_plan order by d_begin) grp from (
    select key_col, cust, rate_plan, d_begin, d_end,
    case when 
      lag(d_end + 1/24/60/60) over(partition by key_col,cust,rate_plan order by d_begin)
      >= d_begin
      then 0 else 1 end new_start 
    from t1
  )
)
group by key_col, cust, rate_plan, grp
order by key_col, cust, rate_plan, d_begin;

The innermost query sets NEW_START to 0 when there is no date gap with the previous record, and to 1 when you have to start afresh. The middle query sets GRP as a running total, so all records within the same continuous date range belong to the same GRP. The outer query uses GROUP BY to get one record per date range.
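To see the technique at work, here is a sketch (using Bob C's t1 table above) that exposes the intermediate NEW_START and GRP columns instead of collapsing them:

```sql
-- Sketch: show the start-of-group flag and running group number.
select key_col, cust, rate_plan, d_begin, d_end,
       new_start,
       sum(new_start) over (partition by key_col, cust, rate_plan
                            order by d_begin) grp
  from (select key_col, cust, rate_plan, d_begin, d_end,
               -- 1 = this row starts a new continuous range,
               -- 0 = it continues the previous row's range
               case when lag(d_end + 1/24/60/60)
                         over (partition by key_col, cust, rate_plan
                               order by d_begin) >= d_begin
                    then 0 else 1 end new_start
          from t1)
 order by key_col, cust, rate_plan, d_begin;
```

Rows whose NEW_START is 0 inherit the GRP of the row before them, which is what lets the outer GROUP BY collapse each continuous range into a single row.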
 
Perfect answer thanks
A reader, February  01, 2011 - 10:24 am UTC
 
 
Works great, thanks. 
 
Your Feb. 1 followup
Stew Ashton, February  01, 2011 - 4:35 pm UTC
 
 
 
Tom, you said 'I don't see any "gap" there.'
The second row starts at 1 A.M., not midnight.  I didn't see that at first either.  Or did you mean something else? 
February  01, 2011 - 6:30 pm UTC 
 
indeed, the gap was too subtle for my eyes :) I missed it entirely.  After putting my glasses on:
ops$tkyte%ORA11GR2> select key_col, cust, rate_plan, min(d_begin), max(d_end)
  2    from (
  3  select key_col, cust, rate_plan, d_begin, d_end, last_d_end,
  4         last_value(rn IGNORE NULLS) over (partition by key_col, cust, rate_plan order by d_end) grp
  5    from (
  6  select key_col, cust, rate_plan, d_begin, d_end,
  7         lag(d_end) over (partition by key_col, cust, rate_plan order by d_end) last_d_end,
  8         case when lag(d_end) over (partition by key_col, cust, rate_plan order by d_end) = d_begin-1/24/60/60 then null
  9                  else row_number() over (partition by key_col, cust, rate_plan order by d_end )
 10                  end rn
 11    from t1
 12         )
 13             )
 14   group by key_col, cust, rate_plan, grp
 15   order by key_col, cust, rate_plan
 16  /
K CUST       RATE_PLAN  MIN(D_BEGIN)         MAX(D_END)
- ---------- ---------- -------------------- --------------------
a c1         r1         01-jan-2011 00:00:00 04-jan-2011 23:59:59
a c1         r1         05-jan-2011 01:00:00 10-jan-2011 23:59:59
a c1         r2         11-jan-2011 00:00:00 12-jan-2011 23:59:59
b c1         r1         01-jan-2011 00:00:00 04-jan-2011 23:59:59
b c1         r1         05-jan-2011 01:00:00 08-jan-2011 23:59:59
b c1         r2         09-jan-2011 00:00:00 10-jan-2011 23:59:59
6 rows selected.
 
 
 
for Pedro from Spain
CezarN, February  02, 2011 - 3:36 am UTC
 
 
/*--Here are create/insert statements you have to supply whenever you ask something
--drop table t1;
--create table t1  (phone varchar2(10), state varchar2(10), state_mod_date date, service_id number(2));
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9087624',   'online',   to_date('2010/02/22','yyyy/mm/dd'),  1);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9087624',   'online',   to_date('2010/02/22','yyyy/mm/dd'),  2);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9087624',   'online',   to_date('2010/02/23','yyyy/mm/dd'),  6);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9087624',   'offline',   to_date('2010/02/23','yyyy/mm/dd'),  2);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9087624',   'offline',   to_date('2010/02/24','yyyy/mm/dd'),  6);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9200921',   'online',   to_date('2010/01/25','yyyy/mm/dd'),  1);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9200921',   'online',   to_date('2010/03/12','yyyy/mm/dd'),  3);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9200921',   'online',   to_date('2010/04/21','yyyy/mm/dd'),  7);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9200921',   'offline',   to_date('2010/09/11','yyyy/mm/dd'),  3);
insert into t1 (phone,  state,  state_mod_date,  service_id) values ('9760293',   'online',   to_date('2010/01/02','yyyy/mm/dd'),  1);
commit;
select * from t1;
*/
--Hope this is what you want
select phone, service_id, start_date, end_date 
from (
  select phone, service_id, start_date, last_value(end_date) over (partition by phone, service_id) as end_date
  from (
    select 
      phone, 
      service_id, 
      case when state='online' then state_mod_date else null end as start_date,  
      case when state='offline' then state_mod_date else null end as end_date,
      state
    from t1
    order by phone, service_id, start_date
    )
  )
where start_date is not null
order by phone, service_id 
 
to_date function
Nitesh Jyotirmay, April     14, 2011 - 6:16 am UTC
 
 
SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'DD:MM:YYYY HH24:MI:SS';
Session altered.
SQL> SELECT SYSDATE,to_date('13:14:00', 'HH24:MI:SS') FROM DUAL;
SYSDATE             to_date('13:14:00',
------------------- -------------------
14:04:2011 11:17:43 01:04:2011 13:14:00
Why does the TO_DATE function give output like that?
I am using Oracle 9i.
 
 
April     14, 2011 - 10:00 am UTC 
 
because you told us to.
you are running a query that selected two dates out.  When you select a date in sqlplus, it converts it to a string for display.  It uses your NLS date format for that.
If you want to control what the STRING representation of a date looks like - you would have to TO_CHAR it with a format - the format being the format you want. 
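For example, a minimal sketch:

```sql
-- TO_CHAR with an explicit format mask controls the string
-- representation directly, independent of the session's NLS_DATE_FORMAT.
select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') right_now
  from dual;
```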
 
 
TO_DATE with only TIME elements in the string
Kim Berg Hansen, April     14, 2011 - 10:23 am UTC
 
 
Hi, Tom
A followup to the review from Nitesh above.
Nitesh did this:
SQL> SELECT SYSDATE,to_date('13:14:00', 'HH24:MI:SS') FROM DUAL;
SYSDATE             to_date('13:14:00',
------------------- -------------------
14:04:2011 11:17:43 01:04:2011 13:14:00
That is, he used TO_DATE on a string that only contains TIME elements. Since he did not specify any day, month or year, Oracle has to "make up" something, just as when you use TO_DATE with only DD-MM-YYYY elements and Oracle "makes up" that the time must be midnight, 00:00:00.
The curious question is why Oracle picks the "first day of the current month". It could just as well have picked "today" or "January first, Year 1"?
(Maybe Nitesh should consider using an INTERVAL datatype if he is only concerned with a time period? :-)
Anyway - just curious :-) 
April     14, 2011 - 10:27 am UTC 
 
because we documented it to be that way.
If you do not specify a year, the year is the current year.
If you do not specify a month, the month is the current month.
If you do not specify a day, the day is 1.
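A quick sketch that makes those defaults visible (the exact values depend on when you run it):

```sql
alter session set nls_date_format = 'YYYY-MM-DD HH24:MI:SS';

-- No day/month/year in the first string: day defaults to 1,
-- month and year default to the current month and year.
-- The second string has only a year: month defaults to the
-- current month, day to 1, time to midnight.
select to_date('13:14:00', 'HH24:MI:SS') time_only,
       to_date('2011', 'YYYY')           year_only
  from dual;
```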
 
 
 
Thanks Tom
Kim Berg Hansen, April     14, 2011 - 10:33 am UTC
 
 
Thanks - I just couldn't find that bit in the documentation when I searched :-)
 
 
OK Great :-)
Kim Berg Hansen, April     14, 2011 - 11:42 am UTC
 
 
I was looking in the documentation of TO_DATE and in the "date formats" and those parts of the documentation. It did not occur to me to look at the datatype documentation :-)
Thanks for your patience... 
April     14, 2011 - 5:25 pm UTC 
 
I'll admit, it is a bit odd to be there - since it is really to_date related more so than data type related. 
 
 
SQL query needs to be modified
A reader, April     17, 2011 - 1:02 pm UTC
 
 
SQL query needs to be modified according to the date-range parameters provided.
Question: the query below is for sales compensation based on a date range. I have to modify it so that it will change the dates automatically. Can anyone please tell me how to go about it?
--Use following query for total tab of excel. To make sure, for current three month change OSR DATE FOR THAT
select decode(to_char(osr_date),
                                            '12/10/2010 00:00:00',    '12/2010',
                                            '12/22/2010 00:00:00',    '12/2010',
                                            '01/04/2011 00:00:00',    '12/2010',
                                            '01/10/2011 00:00:00',    '01/2011',
                                            '01/22/2011 00:00:00',    '01/2011',
                                            '02/04/2011 00:00:00',    '01/2011',
                                            '02/10/2011 00:00:00',    '02/2011',
                                            '02/22/2011 00:00:00',    '02/2011',
                                            '03/04/2011 00:00:00',    '02/2011',
                                            '03/10/2011 00:00:00',    '03/2011',
                                            '03/22/2011 00:00:00',    '03/2011',
                                            '04/04/2011 00:00:00',    '03/2011',
                                             '1') osr_run,
       lst_nm,sum(less_usage) "LESS_USAGE"
from ( select a.LST_NM  , round(sum(comp_amt),2)less_usage,
                         decode (to_char(trunc(a.end_dt)),
                                            '12/10/2010 00:00:00',    '1',
                                            '12/22/2010 00:00:00',    '1',
                                            '01/04/2011 00:00:00',    '1',
                                            '01/10/2011 00:00:00',    '2',
                                            '01/22/2011 00:00:00',    '2',
                                            '02/04/2011 00:00:00',    '2',
                                            '02/10/2011 00:00:00',    '3',
                                            '02/22/2011 00:00:00',    '3',
                                            '03/04/2011 00:00:00',    '3',
                                            '03/10/2011 00:00:00',    '4',
                                            '03/22/2011 00:00:00',    '4',
                                            '04/04/2011 00:00:00',    '4',
                                             to_char(a.end_dt)) osr_run,
                                             to_char(trunc(a.end_dt))osr_date
                             from amk_arm_comp_2011 a
                             where lst_nm <>'House Account - CRM '                             
                             and a.end_dt  between to_date('12/10/2010','mm/dd/yyyy') and to_date('04/04/2011','mm/dd/yyyy')  --update each month by 1.  4 cycle month range for ARMS.
                             and a.sub_id <>2002003611
                          group by a.LST_NM, decode (to_char(trunc(a.end_dt)),
                                            '12/10/2010 00:00:00',    '1',
                                            '12/22/2010 00:00:00',    '1',
                                            '01/04/2011 00:00:00',    '1',
                                            '01/10/2011 00:00:00',    '2',
                                            '01/22/2011 00:00:00',    '2',
                                            '02/04/2011 00:00:00',    '2',
                                            '02/10/2011 00:00:00',    '3',
                                            '02/22/2011 00:00:00',    '3',
                                            '03/04/2011 00:00:00',    '3',
                                            '03/10/2011 00:00:00',    '4',
                                            '03/22/2011 00:00:00',    '4',
                                            '04/04/2011 00:00:00',    '4',
                                             to_char(a.end_dt)), 
                                             to_char(trunc(a.end_dt))
                                              )
group by decode(to_char(osr_date),
                                            '12/10/2010 00:00:00',    '12/2010',
                                            '12/22/2010 00:00:00',    '12/2010',
                                            '01/04/2011 00:00:00',    '12/2010',
                                            '01/10/2011 00:00:00',    '01/2011',
                                            '01/22/2011 00:00:00',    '01/2011',
                                            '02/04/2011 00:00:00',    '01/2011',
                                            '02/10/2011 00:00:00',    '02/2011',
                                            '02/22/2011 00:00:00',    '02/2011',
                                            '03/04/2011 00:00:00',    '02/2011',
                                            '03/10/2011 00:00:00',    '03/2011',
                                            '03/22/2011 00:00:00',    '03/2011',
                                            '04/04/2011 00:00:00',    '03/2011',
                                             '1'), lst_nm
                              order by lst_nm, osr_run desc;
 
April     18, 2011 - 10:27 am UTC 
 
You'll need to be a tad more clear on this
"i have to modify it so automatically it will change the date"
I presume you mean you'd like what are now hard coded dates to be replaced with some sort of function that figures out what the dates should be.  BUT in order to do that - you would sort of need to fill us in on WHAT THE RULES ARE - what the LOGIC is.
I'm looking at what appear to be a bunch of random dates to me.  Consider:
ops$tkyte%ORA11GR2> select dt, lag(dt) over (order by dt) ,
  2         dt-lag(dt) over (order by dt)
  3  from
  4  (select to_date( '12/10/2010 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
  5  select to_date( '12/22/2010 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
  6  select to_date( '01/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
  7  select to_date( '01/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
  8  select to_date( '01/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
  9  select to_date( '02/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 10  select to_date( '02/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 11  select to_date( '02/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 12  select to_date( '03/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 13  select to_date( '03/10/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 14  select to_date( '03/22/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual union all
 15  select to_date( '04/04/2011 00:00:00', 'mm/dd/yyyy hh24:mi:ss') dt from dual)
 16  /
DT        LAG(DT)OV DT-LAG(DT)OVER(ORDERBYDT)
--------- --------- -------------------------
10-DEC-10
22-DEC-10 10-DEC-10                        12
04-JAN-11 22-DEC-10                        13
10-JAN-11 04-JAN-11                         6
22-JAN-11 10-JAN-11                        12
04-FEB-11 22-JAN-11                        13
10-FEB-11 04-FEB-11                         6
22-FEB-11 10-FEB-11                        12
04-MAR-11 22-FEB-11                        10
10-MAR-11 04-MAR-11                         6
22-MAR-11 10-MAR-11                        12
04-APR-11 22-MAR-11                        13
12 rows selected.
what is the "logic" behind these dates that are 13, 12, sometimes 6, other times 10 days apart???!?!?!?!?
I will say - DO NOT DO NOT DO NOT code this:
(to_char(trunc(a.end_dt)),
                                            '12/10/2010 00:00:00',    '1',...
DO code DO code DO code this:
(to_char(trunc(a.end_dt)),
   to_date('12/10/2010 00:00:00','mm/dd/yyyy hh24:mi:ss'),    '1', ....
NEVER NEVER NEVER rely on DEFAULTS.
Also - do not DO NOT do not code this:
decode (to_char(trunc(a.end_dt)),
                                            '12/10/2010 00:00:00',    '1',
you are once again relying on IMPLICIT CONVERSIONS - do not do that - ALWAYS (repeat ALWAYS) use an explicit format!@!!!!!!!!!!!!!!!! 
 
 
a reader, April     18, 2011 - 1:19 pm UTC
 
 
The dates on the 10th, 22nd and 4th (of the following month) are considered one "cycle".. so
12/10/2010, 12/22/2010 & 04/04/2011 fall under Dec-10 cycle. 01/10/2011, 01/22/2011 & 
02/04/2011 fall under Jan-11 cycle... & so on. These cycle dates are being inserted from another table. 
What I'm looking to do is get rid of the hard-coded values from the 2 sets of the decode.
So I want to run this SQL for a given date range, say 12/05/2010 to 04/05/2011; it should pick up all "cycle" dates and name
them with the right month in order to figure out commissions for the sales rep.
Believe me, I could not agree more with you on the implicit conversions. But this is what we are dealing with for now. 
 
April     18, 2011 - 1:51 pm UTC 
 
keep going, explain the number system and dates in these too:
                                            '12/10/2010 00:00:00',    '12/2010',
                                            '12/22/2010 00:00:00',    '12/2010',
                                            '01/04/2011 00:00:00',    '12/2010',
                                            '01/10/2011 00:00:00',    '01/2011',
                                            '01/22/2011 00:00:00',    '01/2011',
                                            '02/04/2011 00:00:00',    '01/2011',
                                            '02/10/2011 00:00:00',    '02/2011',
                                            '02/22/2011 00:00:00',    '02/2011',
                                            '03/04/2011 00:00:00',    '02/2011',
                                            '03/10/2011 00:00:00',    '03/2011',
                                            '03/22/2011 00:00:00',    '03/2011',
                                            '04/04/2011 00:00:00',    '03/2011',
                                             '1') osr_run,
       lst_nm,sum(less_usage) "LESS_USAGE"
from ( select a.LST_NM  , round(sum(comp_amt),2)less_usage,
                         decode (to_char(trunc(a.end_dt)),
                                            '12/10/2010 00:00:00',    '1',
                                            '12/22/2010 00:00:00',    '1',
                                            '01/04/2011 00:00:00',    '1',
                                            '01/10/2011 00:00:00',    '2',
                                            '01/22/2011 00:00:00',    '2',
                                            '02/04/2011 00:00:00',    '2',
                                            '02/10/2011 00:00:00',    '3',
                                            '02/22/2011 00:00:00',    '3',
                                            '03/04/2011 00:00:00',    '3',
                                            '03/10/2011 00:00:00',    '4',
                                            '03/22/2011 00:00:00',    '4',
                                            '04/04/2011 00:00:00',    '4',
                                             to_char(a.end_dt)) osr_run,

o is the first week of a month always to be for the last month?
o are the numbers always 111,222,333,444 in order from first date to last date, or are they assigned based on the date, or what?

"Believe me, I could not agree more with you on the implicit conversions. But this is what we are dealing with for now."

you are obviously in the midst of fixing things; this should have been the first fix!
 
 
A reader, April     27, 2011 - 12:53 pm UTC
 
 
Yes. The first week of a month is always to be for the last month & the numbers are in order (111,222,333) from first date to last date. 
The result of the above query produces results in this fashion - 
OSR_RUN  LST_NM    COMP_AMT
Dec-10   Jennings  495129.66
Jan-11   Jennings  484153.76
Feb-11   Jennings  472022.35
Mar-11   Jennings  481284.15
Dec-10   Parker    512221.55
Jan-11   Parker    519161.53
Feb-11   Parker    518672.03
Mar-11   Parker    518291.82
Dec-10   Scott     473346.21
Jan-11   Scott     478488.66
Feb-11   Scott     483854.59
Mar-11   Scott     480323.2
Dec-10   Spencer   501227.09
Jan-11   Spencer   512199.9
Feb-11   Spencer   511897.21
Mar-11   Spencer   506145.24
  
April     27, 2011 - 1:55 pm UTC 
 
so, to be clear, the first three lowest dates are assigned one (even if they are not in the same time period - eg, if the first three dates we hit are:
22-dec, 04-jan, 10-jan 
they will be assigned one? 
 
 
A reader, April     28, 2011 - 9:54 am UTC
 
 
The first three dates we should be hitting are - 
10-dec, 22-dec, 04-jan. These will be assigned 1.
10-jan, 22-jan, 04-feb. These will be assigned 2.
10-feb, 22-feb, 04-mar. These will be assigned 3.  
April     28, 2011 - 10:20 am UTC 
 
so, you are saying there is an implied constraint here that your input date ranges will ALWAYS be on the tenth of some month.
Ok, what if the date range I put in is
10-JUN-2010 through 04-DEC-2010
does june get one?
In any case - use whichever function below that works for you:
ops$tkyte%ORA11GR2> variable start_date varchar2(20)
ops$tkyte%ORA11GR2> variable end_date varchar2(20)
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec :start_date := '10-dec-2010'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec :end_date := '04-apr-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> with four as
  2  (select decode(level,1,4,2,22,10) dy from dual connect by level <= 3),
  3  twelve as
  4  (select level mm from dual connect by level <= 12),
  5  one_year as
  6  (select to_date( dy||'/'||mm, 'dd/mm' ) dt from four, twelve),
  7  days as
  8  (select decode( l, 1, dt, add_months(dt,-12) ) dt
  9     from one_year, (select level l from dual connect by level <= 2)
 10  )
 11  select dt,
 12         to_char( dt-5, 'mm/yyyy' ),
 13             to_char( add_months(dt-5,1), 'fmmm' ) first,
 14             to_char( add_months(dt-5,1), 'fmmm' )-to_char( add_months(to_date(:start_date, 'dd-mon-yyyy')-5,1),'fmmm') + 1
 15             second
 16    from days
 17   where dt between to_date( :start_date, 'dd-mon-yyyy' ) and to_date( :end_date, 'dd-mon-yyyy' )
 18   order by dt;
DT        TO_CHAR FI     SECOND
--------- ------- -- ----------
10-DEC-10 12/2010 1           1
22-DEC-10 12/2010 1           1
04-JAN-11 12/2010 1           1
10-JAN-11 01/2011 2           2
22-JAN-11 01/2011 2           2
04-FEB-11 01/2011 2           2
10-FEB-11 02/2011 3           3
22-FEB-11 02/2011 3           3
04-MAR-11 02/2011 3           3
10-MAR-11 03/2011 4           4
22-MAR-11 03/2011 4           4
04-APR-11 03/2011 4           4
12 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec :start_date := '10-feb-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> exec :end_date := '04-jul-2011'
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> /
DT        TO_CHAR FI     SECOND
--------- ------- -- ----------
10-FEB-11 02/2011 3           1
22-FEB-11 02/2011 3           1
04-MAR-11 02/2011 3           1
10-MAR-11 03/2011 4           2
22-MAR-11 03/2011 4           2
04-APR-11 03/2011 4           2
10-APR-11 04/2011 5           3
22-APR-11 04/2011 5           3
04-MAY-11 04/2011 5           3
10-MAY-11 05/2011 6           4
22-MAY-11 05/2011 6           4
04-JUN-11 05/2011 6           4
10-JUN-11 06/2011 7           5
22-JUN-11 06/2011 7           5
04-JUL-11 06/2011 7           5
15 rows selected.
 
 
 
Divide into 3 groups
A reader, May       03, 2011 - 2:59 pm UTC
 
 
Table has following data
TEAM     member
team1      2
team2      5
team3      3
team4      1
team5      10
team6      4
team7      38
...
How can I divide these teams relatively evenly into 3 groups?
May       04, 2011 - 1:45 pm UTC 
 
read about ntile() or just simply add a column "mod(rownum,3) grp" to your select list.
I would have given you an example, but you didn't give us a create table/insert intos to work with! 
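For illustration only, a minimal sketch of both suggestions; since no DDL was supplied, the table and column names here are assumptions:

```sql
-- ntile(3) deals the ordered rows into 3 buckets of near-equal row counts;
-- mod(rownum, 3) assigns rows round-robin to groups 0, 1 and 2.
select team, member,
       ntile(3) over (order by member) grp_ntile,
       mod(rownum, 3)                  grp_mod
  from t;
```

Note that both split by row count, not by the magnitude of the member column.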
 
 
Divide into 3 groups   
A reader, May       05, 2011 - 8:49 am UTC
 
 
Tom,
I have tried with mod(rownum,3).  But the result is not that even.
create table t (team varchar2(8), member_num number, group_no number);
insert into t values ('team1',3, null); 
insert into t values ('team2',5, null);
insert into t values ('team3',8, null);
insert into t values ('team4',9, null);
insert into t values ('team5',2, null);
insert into t values ('team6',7, null);
insert into t values ('team7',4, null);
insert into t values ('team8',3, null);
insert into t values ('team9',3, null);
insert into t values ('team10',25, null);
insert into t values ('team11',33, null);
commit;
update t
set t.group_no = (select r.rg from (select REPLACE(mod(rownum,3),0,3) rg, team 
                                     from t order by member_num) r 
where t.team = r.team
);
SQL > select group_no, sum(member_num) from t group by group_no;
  GROUP_NO SUM(MEMBER_NUM)
---------- -----------
         1          41
         2          43
         3          18
Thanks. 
 
May       06, 2011 - 10:06 am UTC 
 
When you said "How to divide these teams relative evenly into 3 groups?", what immediately pops into my mind?
Well, to break them into three groups of equal size....
I see 11 teams.  I would put 
1,4,7,10 into a group of 4
2,5,8,11 into a group of 4
3,6,9    into a group of 3
there I have three groups of equal sizes.
You did not say anything like:
given that member_num column, can you split the values in the above list up into 3 groups such that the sum of member_num within each group is about the same magnitude.
Which is what you are apparently asking to do.
That is a 'bin fitting' type of problem and does not lend itself easily to SQL based solutions.
see  
http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22bin+fitting%22 for past discussions on this  
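That said, a rough approximation that often gets close enough is a "snake draft": sort the teams by member_num descending and deal them out in serpentine order (1,2,3,3,2,1,...) so the largest values spread across the groups. A sketch against the t table posted above - an approximation only, not true bin fitting, and untested here:

```sql
-- Approximate balancing: sort descending, then assign groups in
-- serpentine order (1,2,3,3,2,1,...) so large values spread out.
select team, member_num,
       case when mod(trunc((rn - 1) / 3), 2) = 0
            then mod(rn - 1, 3) + 1          -- forward pass: 1,2,3
            else 3 - mod(rn - 1, 3)          -- reverse pass: 3,2,1
       end grp
  from (select team, member_num,
               row_number() over (order by member_num desc) rn
          from t);
```

This tends to give closer group sums than mod(rownum,3) over unordered rows, but a genuinely optimal split still needs a procedural bin-fitting approach.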
 
 
Similar to the topic
Yar, June      01, 2011 - 4:17 am UTC
 
 
CREATE TABLE GRTAB
(
 ID NUMBER,
 INS_DATE DATE
)
/
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-25 23:13:32','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 02:14:19','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 04:15:30','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 05:14:31','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 07:15:19','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 10:15:50','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 13:44:46','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 15:14:54','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 16:15:01','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 17:14:38','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(1,TO_DATE('2011-05-26 19:15:36','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(2,TO_DATE('2011-05-30 11:30:17','YYYY-MM-DD HH24:MI:SS'));
INSERT INTO GRTAB VALUES(2,TO_DATE('2011-05-30 14:30:22','YYYY-MM-DD HH24:MI:SS'));
I'd like to group INS_DATE every 6 hours from the first INS_DATE of every ID. And on each group, set the BASE_DATE on the first ins_date per group, like:
ID INS_DATE            BASE_DATE 
-- ------------------- ------------------- 
 1 2011-05-25 23:13:32 2011-05-25 23:13:32  
 1 2011-05-26 02:14:19 2011-05-25 23:13:32  
 1 2011-05-26 04:15:30 2011-05-25 23:13:32  
 1 2011-05-26 05:14:31 2011-05-26 05:14:31  
 1 2011-05-26 07:15:19 2011-05-26 05:14:31  
 1 2011-05-26 10:15:50 2011-05-26 05:14:31  
 1 2011-05-26 13:44:46 2011-05-26 13:44:46  
 1 2011-05-26 15:14:54 2011-05-26 13:44:46  
 1 2011-05-26 16:15:01 2011-05-26 13:44:46  
 1 2011-05-26 17:14:38 2011-05-26 13:44:46  
 1 2011-05-26 19:15:36 2011-05-26 13:44:46  
 2 2011-05-30 11:30:17 2011-05-30 11:30:17  
 2 2011-05-30 14:30:22 2011-05-30 11:30:17  
I've been struggling with this; any help is appreciated 
June      01, 2011 - 2:29 pm UTC 
 
please look at the other place you already asked this and please do not do that anymore, just ask in ONE PLACE 
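For reference, the 6-hour regrouping described above is a "restartable window" problem: each row either inherits the current BASE_DATE or, when it falls more than 6 hours after that base, starts a new window from itself. Because each row's result depends on the previous row's computed result, plain LAG cannot do it; the MODEL clause (Oracle 10g and up) can carry the computed value forward. A sketch against the GRTAB table above, untested here:

```sql
SELECT id, ins_date, base_date
  FROM grtab
 MODEL
   PARTITION BY (id)
   DIMENSION BY (Row_Number() OVER (PARTITION BY id ORDER BY ins_date) rn)
   MEASURES (ins_date, ins_date base_date)
   RULES (
     -- rn = 1 keeps base_date = its own ins_date; later rows either
     -- inherit the running base or restart the 6-hour window
     base_date[rn > 1] ORDER BY rn =
       CASE WHEN ins_date[cv()] - base_date[cv() - 1] <= 6/24
            THEN base_date[cv() - 1]
            ELSE ins_date[cv()]
       END
   )
 ORDER BY id, ins_date;
```

Tracing the sample rows by hand (23:13:32 holds through 04:15:30; 05:14:31 is 6h01m later so it restarts, and so on), this reproduces the expected BASE_DATE column shown in the question.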
 
 
Lead/lag or connect by prior..
Jayadevan, June      08, 2011 - 3:55 am UTC
 
 
Hello Tom,
I have the following scenario - 
  CREATE TABLE "HR"."ACTIVITY" 
   ( "ACTIVITY_ID" NUMBER(*,0), 
 "PERSON_ID" NUMBER(*,0), 
 "START_DATE" DATE, 
 "END_DATE" DATE, 
 "ACTIVITY_NAME" VARCHAR2(10 BYTE)
   )
       Sample data -   
 Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (1,1,to_date('08-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'LEAVE');
Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (2,1,to_date('06-JUN-11','DD-MON-RR'),to_date('07-JUN-11','DD-MON-RR'),'LEAVE');
Insert into HR.ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (3,1,to_date('01-JUN-11','DD-MON-RR'),to_date('04-JUN-11','DD-MON-RR'),'LEAVE');
select * from activity order by start_date;
ACTIVITY_ID  PERSON_ID START_DAT END_DATE  ACTIVITY_N
----------- ---------- --------- --------- ----------
          3          1 01-JUN-11 04-JUN-11 LEAVE
          2          1 06-JUN-11 07-JUN-11 LEAVE
          1          1 08-JUN-11 18-JUN-11 LEAVE    
I will get one date as input - say sysdate. Now I have to get the activity a person is assigned for that date (i.e. start date <= sysdate and end_date>= sysdate) and walk back through all the data till there is a blank date, and walk forward till there is a blank date, and display all those records.
For example, if the user enters '10-JUN-2011' as input, activity id 1 is the 'current' activity. Then I step back to the previous activity (activity id 2, with end date 07-Jun). I step back again - now the end date of the previous activity is 04-Jun. This implies a break, so I stop there and stop going back. Similarly, I have to go forward. I hope the situation is clear. There is another break condition - if the previous activity is not the same as the current one, that implies a stop. But that should be easy enough if I can figure out the first case. Any suggestions? 
June      08, 2011 - 10:42 am UTC 
 
are you doing this for a single person, for a set of people, or for all people in the table?
A more interesting test case would have more than one person in it to ensure the answer actually works too.
can a person have more than one activity assigned at the same time?
do the start/end times overlap?
"this implies a break" 
why? and what does that mean?  how did you know that was a 'break'.  
There is another break condition - if previous activity is not the same as current, that implies stop.
if that implied a stop, then we would have stopped on the first record???? it had activity 2, which is not 1. 
 
 
Answers
Jayadevan, June      08, 2011 - 11:43 pm UTC
 
 
Thanks for the reply. 
1) are you doing this for a single person or for a set of people or for all people in the table. 
This can happen for a number of persons. The maximum number of persons is, as of now, 6000. I am sure it can go up. So we could select up to 6000 persons. Since we have to maintain the data for these persons for a few years, the number of records is already in the millions. We already have a temp table which will store the 'filtered' persons - filtered on various conditions the user will choose: all persons from a location, or all persons with some designation.... and that will be used for first-level filtering. Now we are iterating through a loop to get the 'break', and as the number of persons/activities goes up, the iterations go up - not very scalable, I feel.
A more interesting test case would have more than one person in it to ensure the answer actually works too. 
2)can a person have more than one activity assigned at the same time? 
No
3) do the start/end times overlap? 
Not to the minute/second level. But end time can be today 10 AM and next start time can be tonight 11 or something like that
this implies a break" 
why? and what does that mean? how did you know that was a 'break'. 
If a person was not assigned any activity, we do not have to display what he was doing before that day.
There is another break condition - if previous activity is not the same as current, that implies stop. 
4)if that implied a stop, then we would have stopped on the first record???? it had activity 2, which is not 1. 
I mean the type of activity - which is 'LEAVE' in all the cases here. If the activity was, say, 'TRAINING', yes, we would not have taken the record with activity id 2 (i.e. we take only activity 1). Activity 2's name is different, so we do not take it, and do not traverse further.
I hope it is clear. 
The column activity_id is just a column I created as PK. It is the activity_name that I need to check for 'is it different or same'. 
 
June      09, 2011 - 9:56 am UTC 
 
provide me, as requested, a better test case.
Give me an employee that has a date based break.
Give me another one that has an activity based break.
Give me another that has no breaks - all data would be returned.
Give me one that has both a date based break and an activity based break.
Make your test case actually test all of the different cases 
 
 
SQL Queries
Raghu, June      10, 2011 - 5:08 am UTC
 
 
Hi Tom,
I am an Oracle developer and in my work I rarely get requirements to write complex SQL queries where I can make use of analytic functions and other features.
So I was wondering if you had any set of questions (On EMP/DEPT tables), which would test our query writing ability.
Or what queries you might ask in an interview.
There are daily PL/SQL quizzes happening; I wish we had a SQL quiz too.
Thanks for all the help, I am a big fan of yours. 
 
Activities and breaks
Jayadevan, June      11, 2011 - 1:05 am UTC
 
 
Hello Tom,
Apologies for not providing enough sample data...
CREATE TABLE "ACTIVITY" ("ACTIVITY_ID" NUMBER(*,0), "PERSON_ID" NUMBER(*,0), "START_DATE" DATE, "END_DATE" DATE, "ACTIVITY_NAME" VARCHAR2(10))
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (4,1,to_date('19-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'TRAINING');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (2,1,to_date('05-JUN-11','DD-MON-RR'),to_date('07-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (3,1,to_date('01-JUN-11','DD-MON-RR'),to_date('04-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (5,1,to_date('27-MAY-11','DD-MON-RR'),to_date('29-MAY-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (1,1,to_date('08-JUN-11','DD-MON-RR'),to_date('18-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (14,2,to_date('01-JUN-11','DD-MON-RR'),to_date('02-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (12,2,to_date('03-JUN-11','DD-MON-RR'),to_date('06-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (13,2,to_date('09-JUN-11','DD-MON-RR'),to_date('10-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (15,2,to_date('11-JUN-11','DD-MON-RR'),to_date('12-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (11,2,to_date('07-JUN-11','DD-MON-RR'),to_date('08-JUN-11','DD-MON-RR'),'GROUND');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (20,3,to_date('07-JUN-11','DD-MON-RR'),to_date('09-JUN-11','DD-MON-RR'),'LEAVE');
Insert into ACTIVITY (ACTIVITY_ID,PERSON_ID,START_DATE,END_DATE,ACTIVITY_NAME) values (21,3,to_date('13-JUN-11','DD-MON-RR'),to_date('19-JUN-11','DD-MON-RR'),'TRAINING');
select * from activity order by person_id, start_date, end_date ;
          5          1 27-MAY-11 29-MAY-11 LEAVE
          3          1 01-JUN-11 04-JUN-11 LEAVE
          2          1 05-JUN-11 07-JUN-11 LEAVE
          1          1 08-JUN-11 18-JUN-11 LEAVE
          4          1 19-JUN-11 18-JUN-11 TRAINING
         14          2 01-JUN-11 02-JUN-11 GROUND
         12          2 03-JUN-11 06-JUN-11 GROUND
         11          2 07-JUN-11 08-JUN-11 GROUND
         13          2 09-JUN-11 10-JUN-11 GROUND
         15          2 11-JUN-11 12-JUN-11 GROUND
         20          3 07-JUN-11 09-JUN-11 LEAVE
         21          3 13-JUN-11 19-JUN-11 TRAINING
The input filter is '07-JUN-11'.
For pers_id = 1 , I have to go back till activity_id 3. The start_date of the previous activity (id = 5) activity is not the same as the end_date of activity_id 5. So we stop there. This is a date-based break. While going forward, I have to go till activity_id 1. The activity_name for activity_id 4 is 'TRAINING', which is not the same as the ACTIVITY_NAME for person_id for the date '07-JUN-11'. This is a change of activity-based break.
For pers_id 3, there is an activity and date based_break on activity_id 21 and we stop there (actually, if one break condition is met, we do not have to check for the other condition)
For pers_id 2, I have to fetch all the records - going back from '07-JUN-11' and forward from '07-JUN-11', there is no 'break' as far as one activity's end_date and the next activity's start_dates are concerned. The activity names are also same. In  the end, the output I have to provide is
PERSON_ID         START_DAT     END_DATE          ACTIVITY_NAME
----------      ---------      ---------          ----------
    1             01-JUN-11     18-JUN-11         LEAVE
    2             01-JUN-11     12-JUN-11         GROUND
    3             07-JUN-11     09-JUN-11         LEAVE  
 
Correction
Jayadevan, June      11, 2011 - 1:14 am UTC
 
 
A correction in the sentence 
"The start_date of the previous activity (id = 5) activity is not the same as the end_date of activity_id 5. So we stop there. "
It should be 
"The end_date of the previous activity (id = 5) is not the same as the start_date of activity_id 3. It is not the previous date as compared to the start_date of activity 3 either So we stop there. ". In effect, there is some gap between the assignments- there is one or more days when the employee is not assigned any activity. That is a break 
 
@Jayadevan: Generic Analytic Solution
Brendan, June      11, 2011 - 8:06 am UTC
 
 
The requirement is expressed rather procedurally in terms of going forward and back from a given record, then stopping at the break points. If you want a solution by analytics, I would view it rather as a simple extension of a general class of problems where you want to group records primarily by contiguity of ranges (usually date ranges). I wondered whether it might not be interesting to present the class generically, propose a generic analytic solution, then apply it to your specific case.
I would describe the class for your problem as follows. Consider the fields in a record set to divide into the following categories:
key                     - partition by fields
range_start, range_end  - range fields
attr_brk                - break fields
attr_oth                - any other fields
The problem is then to obtain for each record a group_start, group_end pair that are the range_start and range_end values for the records that respectively start and end the break group of the current record. The records are to be ordered by range_start, range_end within the partitioning key, and a new break group starts when, between successive records, either there is a gap between range_end and range_start fields, or any of the attr_brk fields change value. In this case no overlaps are allowed in the ranges within a key.
It's worth noting that if you drop the break fields overlapping ranges can easily be accommodated, making a related second class that can be solved by similar methods.
I have written a kind of pseudo-SQL solution for both classes, and have also implemented them for your particular table, with my own test sets. I could post these if there is any interest, and TK has no objection. However, I will just state the approach used for now to keep it brief.
1. Within an inline view, use Lag and Lead functions with CASE expressions to set group_start and group_end values on the respective start and end records of the break groups, leaving other values null.
2. Select all the original fields from the inline view, as well as the new fields within First_Value, Last_Value functions with the IGNORE NULLS option
3. The output from step 2 solves the problems as defined, but if necessary, can be used within another inline view to restrict the output to certain groups only (e.g. a 'current' group) 
 
Correction
Brendan, June      12, 2011 - 12:45 pm UTC
 
 
The second class actually needs an additional prior step. I've put an article on Scribd on this. 
 
Query
Jayadevan, June      13, 2011 - 9:09 am UTC
 
 
Hello,
If you do have a working query, please do share it. I hope some amount of filtering/elimination will happen in the first query that gets executed, since the table in question has millions of records. I think Tom too would be interested in seeing that query. I considered a simple lead/lag inner query with filters applied in the outer query. But that would mean a full scan of this huge table, and hence I thought I should seek Tom's help.
Regards,
Jayadevan 
 
Code as requested
Brendan, June      14, 2011 - 2:33 am UTC
 
 
Hi Jayadevan
Here is the code:
SELECT /* NO_OVERLAP */
       person_id, start_date, end_date, activity_name, activity_id id,
       Last_Value (group_start IGNORE NULLS) OVER (PARTITION BY person_id ORDER BY start_date) group_start,
       First_Value (group_end IGNORE NULLS) OVER (PARTITION BY person_id ORDER BY start_date RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) group_end
  FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id,
       CASE WHEN (start_date > Nvl (Lag (end_date) OVER (PARTITION BY person_id ORDER BY start_date), start_date-1)) OR 
                 (activity_name != Lag (activity_name) OVER (PARTITION BY person_id ORDER BY start_date)) THEN start_date END group_start,
       CASE WHEN (Nvl (Lead (start_date) OVER (PARTITION BY person_id ORDER BY start_date), end_date+1) > end_date) OR
                 (activity_name != Lead (activity_name) OVER (PARTITION BY person_id ORDER BY start_date)) THEN end_date END  group_end
  FROM activity
 WHERE person_id IN (1, 2)
)
 ORDER BY person_id, start_date
The shape of the data matters as much as the size - is it wide and shallow or deep and narrow, for example, in terms of persons and time. I've only tested on 30 records. For wide and shallow, it may be efficient.
However, if you have a very deep data set but with only a few records in the 'current' group, you might be better off using a more procedural approach. Assuming you've indexed start and end times, you might 'anchor' from the current records, then execute indexed selects to pass from one record to the adjacent until you hit a break.
You could also use the method mentioned in the recent thread on '3 second...' where a recursive subquery approach is mentioned; you'd have to put it in a union to go both ways, I think - I'd have tried that myself but have not had time to do it. Alternatively, within the analytic approach, would it help to do an 'Exists current record' subquery? I don't know your data so I can't say. 
 
re: Activities and breaks
Stew Ashton, June      14, 2011 - 11:13 am UTC
 
 
 This seems like a job for procedural logic. Analytics let you look forward and backward in the data you are selecting, but they don't let you look backward in the result set you are creating.  Brendan used the MODEL clause in another thread and I think it does the trick here.  The MODEL clause carries forward the start date as long as there is not a "break", then the GROUP BY clause gives us one record with the latest end date:
select person_id, start_date, max(end_date) end_date, activity_name from (
  SELECT person_id, start_date, end_date, activity_name
  FROM activity
  MODEL
    PARTITION BY (person_id)
    DIMENSION BY (Row_Number() OVER (PARTITION BY person_id ORDER BY start_date) rn)
    MEASURES (start_date, end_date, activity_name)
    RULES (
      start_date[rn>1] order by rn = 
        case when end_date[cv()-1] >= start_date[cv()]-1
        and activity_name[cv()-1] = activity_name[cv()] 
        then start_date[cv()-1]
        else start_date[cv()] end
  )
) 
where to_date('07-JUN-11', 'DD-MON-RR') between start_date and end_date
group by person_id, start_date, activity_name
order by 1,2,4
/
PERSON_ID  START_DATE   END_DATE   ACTIVITY_NAME
---------- ------------ ---------- -------------
1          01-JUN-11    18-JUN-11  LEAVE
2          01-JUN-11    12-JUN-11  GROUND
3          07-JUN-11    09-JUN-11  LEAVE 
 
 
Recursive Solution
Brendan, June      14, 2011 - 5:56 pm UTC
 
 
Some analytics sneaked in though, basic testing done...
WITH rsq_f (person_id, start_date, end_date, activity_name, activity_id) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
  FROM activity
 WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
   AND person_id IN (1, 2)
 UNION ALL
SELECT  act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id 
  FROM rsq_f
  JOIN activity act
    ON act.start_date      = rsq_f.end_date
   AND act.person_id       = rsq_f.person_id
   AND act.activity_name   = rsq_f.activity_name
   AND act.person_id IN (1, 2)
), rsq_b (person_id, start_date, end_date, activity_name, activity_id) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
  FROM activity
 WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
   AND person_id IN (1, 2)
 UNION ALL
SELECT  act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id 
  FROM rsq_b
  JOIN activity act
    ON act.end_date         = rsq_b.start_date
   AND act.person_id        = rsq_b.person_id
   AND act.activity_name    = rsq_b.activity_name
)
SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
        Min (start_date) OVER (PARTITION BY person_id) grp_start, Max (end_date) OVER (PARTITION BY person_id) grp_end
  FROM (
SELECT  person_id, start_date, end_date, activity_name, activity_id
  FROM rsq_b
UNION
SELECT person_id, start_date, end_date, activity_name, activity_id
  FROM rsq_f
)
ORDER BY person_id, start_date
/
 PERSON_ID START_DAT END_DATE  ACTIVITY_N ACTIVITY_ID GRP_START GRP_END
---------- --------- --------- ---------- ----------- --------- ---------
         1 08-JUN-11 09-JUN-11 LEAVE                4 08-JUN-11 14-JUN-11
           09-JUN-11 14-JUN-11 LEAVE                5 08-JUN-11 14-JUN-11
         2 09-JUN-11 14-JUN-11 TRAINING            11 09-JUN-11 14-JUN-11
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |             |       |       |     2 (100)|          |
|   1 |  WINDOW SORT                                  |             |     8 |   408 |     2 (100)| 00:00:01 |
|   2 |   VIEW                                        |             |     8 |   408 |     2 (100)| 00:00:01 |
|   3 |    SORT UNIQUE                                |             |     8 |   408 |     2 (100)| 00:00:01 |
|   4 |     UNION-ALL                                 |             |       |       |            |          |
|   5 |      VIEW                                     |             |     4 |   204 |     0   (0)|          |
|   6 |       UNION ALL (RECURSIVE WITH) BREADTH FIRST|             |       |       |            |          |
|   7 |        INLIST ITERATOR                        |             |       |       |            |          |
|*  8 |         TABLE ACCESS BY INDEX ROWID           | ACTIVITY    |     2 |   102 |     0   (0)|          |
|*  9 |          INDEX RANGE SCAN                     | ACTIVITY_U1 |     1 |       |     0   (0)|          |
|  10 |        NESTED LOOPS                           |             |       |       |            |          |
|  11 |         NESTED LOOPS                          |             |     2 |   160 |     0   (0)|          |
|  12 |          RECURSIVE WITH PUMP                  |             |       |       |            |          |
|* 13 |          INDEX UNIQUE SCAN                    | ACTIVITY_U2 |     1 |       |     0   (0)|          |
|* 14 |         TABLE ACCESS BY INDEX ROWID           | ACTIVITY    |     1 |    51 |     0   (0)|          |
|  15 |      VIEW                                     |             |     4 |   204 |     0   (0)|          |
|  16 |       UNION ALL (RECURSIVE WITH) BREADTH FIRST|             |       |       |            |          |
|  17 |        INLIST ITERATOR                        |             |       |       |            |          |
|* 18 |         TABLE ACCESS BY INDEX ROWID           | ACTIVITY    |     2 |   102 |     0   (0)|          |
|* 19 |          INDEX RANGE SCAN                     | ACTIVITY_U1 |     1 |       |     0   (0)|          |
|  20 |        NESTED LOOPS                           |             |       |       |            |          |
|  21 |         NESTED LOOPS                          |             |     2 |   160 |     0   (0)|          |
|  22 |          RECURSIVE WITH PUMP                  |             |       |       |            |          |
|* 23 |          INDEX UNIQUE SCAN                    | ACTIVITY_U1 |     1 |       |     0   (0)|          |
|* 24 |         TABLE ACCESS BY INDEX ROWID           | ACTIVITY    |     1 |    51 |     0   (0)|          |
-------------------------------------------------------------------------------------------------------------
 
 
Recursion Refined
Brendan, June      15, 2011 - 1:57 am UTC
 
 
This version might be more efficient; it does half the table accesses...
DEFINE TODAY='11-JUN-2011'
PROMPT Recursive Subquery (OR) for '&TODAY'
WITH   act AS (
SELECT person_id, start_date, end_date, activity_name, activity_id
  FROM activity
 WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
   AND person_id IN (1, 2)
),     rsq (person_id, start_date, end_date, activity_name, activity_id, go_back) AS (
SELECT person_id, start_date, end_date, activity_name, activity_id, go_back
  FROM (
SELECT person_id, start_date, end_date, activity_name, activity_id, 'N' go_back
  FROM act
 UNION ALL
SELECT person_id, start_date, end_date, activity_name, activity_id, 'Y'
  FROM act
)
 UNION ALL
SELECT  act.person_id, act.start_date, act.end_date, act.activity_name, act.activity_id, CASE WHEN act.start_date = rsq.end_date THEN 'N' ELSE 'Y' END
  FROM rsq
  JOIN activity act
    ON ((act.start_date     = rsq.end_date AND go_back = 'N') OR
        (act.end_date        = rsq.start_date AND go_back = 'Y'))
   AND act.person_id       = rsq.person_id
   AND act.activity_name   = rsq.activity_name
   AND act.person_id IN (1, 2)
)
SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
        Min (start_date) OVER (PARTITION BY person_id) grp_start, Max (end_date) OVER (PARTITION BY person_id) grp_end
  FROM (
SELECT  person_id, start_date, end_date, activity_name, activity_id
  FROM rsq
UNION
SELECT person_id, start_date, end_date, activity_name, activity_id
  FROM rsq
)
ORDER BY person_id, start_date
/
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                           |       |       |    14 (100)|          |
|   1 |  TEMP TABLE TRANSFORMATION                 |                           |       |       |            |          |
|   2 |   LOAD AS SELECT                           |                           |       |       |            |          |
|   3 |    INLIST ITERATOR                         |                           |       |       |            |          |
|*  4 |     TABLE ACCESS BY INDEX ROWID            | ACTIVITY                  |     2 |   102 |     0   (0)|          |
|*  5 |      INDEX RANGE SCAN                      | ACTIVITY_U1               |     1 |       |     0   (0)|          |
|   6 |   LOAD AS SELECT                           |                           |       |       |            |          |
|   7 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|                           |       |       |            |          |
|   8 |     VIEW                                   |                           |     4 |   216 |     4   (0)| 00:00:01 |
|   9 |      UNION-ALL                             |                           |       |       |            |          |
|  10 |       VIEW                                 |                           |     2 |   102 |     2   (0)| 00:00:01 |
|  11 |        TABLE ACCESS FULL                   | SYS_TEMP_0FD9D661A_BC16D2 |     2 |   102 |     2   (0)| 00:00:01 |
|  12 |       VIEW                                 |                           |     2 |   102 |     2   (0)| 00:00:01 |
|  13 |        TABLE ACCESS FULL                   | SYS_TEMP_0FD9D661A_BC16D2 |     2 |   102 |     2   (0)| 00:00:01 |
|  14 |     NESTED LOOPS                           |                           |       |       |            |          |
|  15 |      NESTED LOOPS                          |                           |     1 |    92 |     4   (0)| 00:00:01 |
|  16 |       RECURSIVE WITH PUMP                  |                           |       |       |            |          |
|* 17 |       INDEX RANGE SCAN                     | ACTIVITY_U1               |     2 |       |     0   (0)|          |
|* 18 |      TABLE ACCESS BY INDEX ROWID           | ACTIVITY                  |     1 |    51 |     0   (0)|          |
|  19 |   WINDOW SORT                              |                           |    10 |   510 |     6  (34)| 00:00:01 |
|  20 |    VIEW                                    |                           |    10 |   510 |     6  (34)| 00:00:01 |
|  21 |     SORT UNIQUE                            |                           |    10 |   510 |     6  (67)| 00:00:01 |
|  22 |      UNION-ALL                             |                           |       |       |            |          |
|  23 |       VIEW                                 |                           |     5 |   255 |     2   (0)| 00:00:01 |
|  24 |        TABLE ACCESS FULL                   | SYS_TEMP_0FD9D661B_BC16D2 |     5 |   270 |     2   (0)| 00:00:01 |
|  25 |       VIEW                                 |                           |     5 |   255 |     2   (0)| 00:00:01 |
|  26 |        TABLE ACCESS FULL                   | SYS_TEMP_0FD9D661B_BC16D2 |     5 |   270 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
Output...
Recursive Subquery (OR) for '11-JUN-2011'
old   4:  WHERE '&TODAY' BETWEEN start_date AND Nvl(end_date, '&TODAY')
new   4:  WHERE '11-JUN-2011' BETWEEN start_date AND Nvl(end_date, '11-JUN-2011')
old  25: SELECT /* RSQ '&TODAY' */ person_id, start_date, end_date, activity_name, activity_id,
new  25: SELECT /* RSQ '11-JUN-2011' */ person_id, start_date, end_date, activity_name, activity_id,
 PERSON_ID START_DAT END_DATE  ACTIVITY_N ACTIVITY_ID GRP_START GRP_END                                                 
---------- --------- --------- ---------- ----------- --------- ---------                                               
         1 08-JUN-11 09-JUN-11 LEAVE                4 08-JUN-11 14-JUN-11                                               
           09-JUN-11 14-JUN-11 LEAVE                5 08-JUN-11 14-JUN-11                                               
                                                                                                                        
         2 09-JUN-11 14-JUN-11 TRAINING            11 09-JUN-11 14-JUN-11                                               
 
 
Analytical question
John, June      15, 2011 - 7:24 pm UTC
 
 
I have a table with this data:
col1  col2
1      2
5      6
8      9
2      3
10     12
3      20
6      25
I want to display the data as:
col1  col2
1      2
2      3
3      20
5      6
6      25
8      9
10     12
Thanks in advance 
June      17, 2011 - 1:25 pm UTC 
 
if I had such tables, I could show you how to do that.
But I don't, so I cannot.
If you provide me with a create table and a bunch of inserts, I could
no create
no inserts
no look 
 
 
I was wrong about "Activities and breaks"
Stew Ashton, June      16, 2011 - 8:58 am UTC
 
 
 I thought analytics couldn't do this, but I was wrong. In this solution,
Q1 adds a "break" column that is set to 1 when a break occurs.
Q2 does a running total on the "break" column; this assigns each set of lines a rank that increments at each break.
Finally, the main select gets the earliest start date and latest end date for each rank.
This solution is based on  
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1594948200346140013#1596748700346101469
By the way, in my tests the MODEL-based solution was somewhat faster :)
with q1 as (
  select person_id, start_date, end_date, activity_name
  , case when nvl(
    lag(activity_name)
     over (partition by person_id order by start_date, activity_name)
      , activity_name
    ) = activity_name 
    and nvl(
      lag(end_date)
       over (partition by person_id order by start_date, activity_name)
      , start_date-1
    ) >= start_date-1  
    then 0 else 1 end as break
  from activity
), q2 as (
  select person_id, start_date, end_date, activity_name
  , sum(break)
   over(partition by person_id order by start_date, activity_name)
   + 1 as rank
  from q1
)
select person_id, min(start_date) start_date, max(end_date) end_date, activity_name
from q2
group by person_id, rank, activity_name
having to_date('07-JUN-11', 'DD-MON-RR') between min(start_date) and max(end_date)
order by 1, 2
/
PERSON_ID  START_DATE           END_DATE             ACTIVITY_NAME
---------- -------------------- -------------------- -------------
1          2011/06/01 00:00:00  2011/06/18 00:00:00  LEAVE
2          2011/06/01 00:00:00  2011/06/12 00:00:00  GROUND
3          2011/06/07 00:00:00  2011/06/09 00:00:00  LEAVE 
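Outside SQL, the break-flag / running-total grouping Stew describes (Q1 flags breaks, Q2 turns the running total of flags into a group id, the outer query aggregates per group) can be sketched procedurally. A minimal Python illustration - hypothetical simplified rows with dates as integer day numbers, not the actual activity table:

```python
from itertools import accumulate

# Rows: (person_id, start, end, activity), ordered by person, start.
# Dates simplified to integer day numbers.
rows = [
    (1, 1, 5, "LEAVE"), (1, 5, 9, "LEAVE"),   # contiguous -> one group
    (1, 12, 14, "LEAVE"),                      # gap -> new group
    (1, 15, 16, "TRAINING"),                   # activity change -> new group
]

# Q1: break = 1 when the activity changes or a gap appears (the prior
# row ends more than one day before this row starts), else 0.
breaks = []
for i, (pid, s, e, act) in enumerate(rows):
    if i == 0 or rows[i - 1][0] != pid:
        breaks.append(1)
    else:
        _, ps, pe, pact = rows[i - 1]
        breaks.append(0 if pact == act and pe >= s - 1 else 1)

# Q2: running total of the break flags = group id (the "rank" column).
grp = list(accumulate(breaks))

# Main query: min(start), max(end) per (person, group, activity).
merged = {}
for (pid, s, e, act), g in zip(rows, grp):
    k = (pid, g, act)
    lo, hi = merged.get(k, (s, e))
    merged[k] = (min(lo, s), max(hi, e))

for (pid, g, act), (s, e) in sorted(merged.items()):
    print(pid, act, s, e)
```

The running total of break flags is what turns "a break happened here" into a stable group identifier - exactly the role of Q2's rank column.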
 
 
Analytics
John, June      17, 2011 - 9:56 pm UTC
 
 
Please see the following 
create table t_hierarchy( col1 number(10), col2 number(10));
insert into  t_hierarchy values(1,80);
insert into  t_hierarchy values(60,10);
insert into  t_hierarchy values(80,60);
insert into  t_hierarchy values(65,200);
insert into  t_hierarchy values(10,100);
insert into  t_hierarchy values(200,55);
insert into  t_hierarchy values(101,500);
The data should be displayed as follows:
col1 col2
----    ----
1 80
80 60
60 10
10 100
65 200
200 55
101 500
Thanks in advance
  
June      20, 2011 - 9:44 am UTC 
 
neat, I can do that by:
select 1 oc, t.* from t where col1 = 1 and col2 = 80
UNION ALL
select 2 oc, t.* from t where col1 = 80 and col2 = 60
UNION ALL
...
order by 1;
there you go - now, if you would care to explain the logic behind your requested output and ALL of the rules governing your data (I can probably see what you are trying to do here) - such as 
what describes the parent child relation?
can there be more than one child of a given parent?
if yes, what then?
is there only one root - only one "master parent"?
if not, what then?
can there be loops?
etc etc etc
In other words, write a SPECIFICATION.
I completely fail to understand the questions with:
a) here is my data
b) here is my output
now please write a query.  Without providing ANY OTHER INFORMATION.
If your data always conforms to the rules I've made up in my head, then this might work (and it also might be overkill for you - you don't give me ANYTHING to work with)
ops$tkyte%ORA11GR2> select *
  2    from t_hierarchy t1
  3  start with rowid in (select rowid from t_hierarchy t2 where not exists (select null from t_hierarchy t3 where t3.col2 = t2.col1))
  4  connect by prior t1.col2 = t1.col1
  5  /
      COL1       COL2
---------- ----------
         1         80
        80         60
        60         10
        10        100
        65        200
       200         55
       101        500
7 rows selected.
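The chain-walking the connect-by does - start at rows whose col1 never appears as a col2, then follow each row's col2 to the row whose col1 matches it - can be sketched procedurally too, under the same assumptions Tom states (one child per parent, no loops). A minimal Python sketch over the t_hierarchy data:

```python
rows = [(1, 80), (60, 10), (80, 60), (65, 200),
        (10, 100), (200, 55), (101, 500)]

by_col1 = {c1: (c1, c2) for c1, c2 in rows}  # assumes col1 is unique
col2s = {c2 for _, c2 in rows}

# START WITH: rows whose col1 never appears as anyone's col2 (the roots).
roots = [r for r in rows if r[0] not in col2s]

# CONNECT BY PRIOR col2 = col1: walk each chain to its end.
ordered = []
for root in sorted(roots):
    node = root
    while node is not None:
        ordered.append(node)
        node = by_col1.get(node[1])

print(ordered)
```

Note that sorting the roots here is the procedural analogue of ORDER SIBLINGS BY, which (as discussed below in the thread) you need if the root order matters.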
 
 
 
About the t_hierarchy question
Stew Ashton, June      20, 2011 - 12:52 pm UTC
 
 
 
Tom, I see that the "root" values are returned in ascending order.  Is this guaranteed, or must one add the "order siblings by" clause? 
June      20, 2011 - 1:55 pm UTC 
 
hierarchical queries will return the children of children of children .... in order
but the children in a level will not come out in any order in particular.
You don't need an order siblings by in this case because (in the example, and given one of my assumptions) there is ONLY ONE sibling per level - there is nothing to sort at that level 
 
 
Quick followup
Stew Ashton, June      20, 2011 - 2:08 pm UTC
 
 
 There are 3 "root" values: 1, 65 and 101. Those are the "siblings" I'm talking about. They happen to come back in ascending order, but can I count on that without an "order siblings by"?
select *
from t_hierarchy t1
start with rowid in (
  select rowid from T_HIERARCHY T2 
  where not exists (select null from T_HIERARCHY T3 where T3.COL2 = T2.COL1)
)
connect by prior T1.COL2 = T1.COL1
order siblings by T1.COL1 desc
COL1                   COL2                   
---------------------- ---------------------- 
101                    500                    
65                     200                    
200                    55                     
1                      80                     
80                     60                     
60                     10                     
10                     100
 
 
June      20, 2011 - 3:11 pm UTC 
 
ah, I see.
yes, they would have to be ordered to get them ordered.
else they could come back in any order.
 
 
 
am jealous
venkata, July      26, 2011 - 7:39 am UTC
 
 
Tom,
i don't know anything about the MODEL, dimension by, measures, or partition by clauses. can you give me some pointers on where i can learn about these? i stumbled upon the following url, which i didn't understand at all - please help.  
http://www.sqlsnippets.com/en/topic-11663.html  
July      28, 2011 - 6:30 pm UTC 
 
 
 
Activity dates
Jayadevan, July      29, 2011 - 5:06 am UTC
 
 
Hi Stew,
I got back to working on the activity/date query. The second query you provided -
with q1 as (
  select person_id, start_date, end_date, activity_name
  , case when nvl(
          lag(activity_name)
              over (partition by person_id order by start_date, activity_name)
      , activity_name
    ) = activity_name 
.......
works great.
Thank you.
 
 
help
Venkata, August    05, 2011 - 6:29 am UTC
 
 
Hi,
to learn about data modelling i have created a table and inserted data as shown below.
create table xtest(cust_id number(10), sale_val number(20,4), asoflastday date)
insert into xtest select trunc(dbms_random.value*100000,0) c,trunc(dbms_random.value*100000000,4) amt,last_day(add_months(trunc(sysdate,'yyyy'),rownum-1)) last
 from all_objects where rownum < 13
now when you do
select * from xtest 
you will get data which is different from what i have (the values are random)
what i need is:
sales as on feb = (febsale+Marsale)/2
sales as on Mar = (Marsale+Aprsale)/2
hope you are getting what i am trying to say :)
i need your help in solving this using only data modelling let me know if you can use analytics also
 
August    05, 2011 - 8:23 am UTC 
 
how can you solve a question like that with "data modeling".
Data modeling is the art of coming up with your schema.  It doesn't involve actually answering any business questions.
No, I don't know what you are trying to say. 
 
 
with analytics
Venkata, August    05, 2011 - 3:00 pm UTC
 
 
can you please get me a solution with analytics please  
August    05, 2011 - 3:05 pm UTC 
 
ops$tkyte%ORA11GR2> select cust_id, sale_val, asoflastday, (sale_val+lead(sale_val) over (order by asoflastday))/2 what_you_asked_for
  2    from xtest
  3   order by asoflastday
  4  /
   CUST_ID   SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
     50333 59710311.4 31-JAN-11         44289384.5
     43525 28868457.6 28-FEB-11         40566135.3
      2827   52263813 31-MAR-11         72036300.6
     44790 91808788.3 30-APR-11         82621668.8
     10819 73434549.3 31-MAY-11         67047508.9
        35 60660468.5 30-JUN-11         31365013.1
     93257 2069557.62 31-JUL-11         17826510.8
     90286   33583464 31-AUG-11         28697602.1
     59528 23811740.1 30-SEP-11         18998234.4
     88679 14184728.7 31-OCT-11         8633988.97
     48826 3083249.24 30-NOV-11         17567744.3
     55996 32052239.3 31-DEC-11
12 rows selected.
seems to be what you asked for - but hard to say since it was pretty darn vague. 
 
 
 
little to tweak 
Venkata, August    05, 2011 - 3:39 pm UTC
 
 
Tom,
first, thank you for the reply; i will elaborate in detail. here i was waiting for your reply so that i can explain it a little more
you gave this 
   CUST_ID   SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
     50333 59710311.4 31-JAN-11         44289384.5
     43525 28868457.6 28-FEB-11         40566135.3
      2827   52263813 31-MAR-11         72036300.6
     44790 91808788.3 30-APR-11         82621668.8
     10819 73434549.3 31-MAY-11         67047508.9
        35 60660468.5 30-JUN-11         31365013.1
     93257 2069557.62 31-JUL-11         17826510.8
     90286   33583464 31-AUG-11         28697602.1
     59528 23811740.1 30-SEP-11         18998234.4
     88679 14184728.7 31-OCT-11         8633988.97
     48826 3083249.24 30-NOV-11         17567744.3
     55996 32052239.3 31-DEC-11
i need something like this
   CUST_ID   SALE_VAL ASOFLASTD WHAT_YOU_ASKED_FOR
---------- ---------- --------- ------------------
     50333 59710311.4 31-JAN-11         
     43525 28868457.6 28-FEB-11        
     ---------------------------------------------
           88578769.0/2  <-----  sale for feb 
                ------^ here (2) is processing month number
     ---------------------------------------------
            88578769.0 28-feb-11 
      2827   52263813 31-MAR-11       
     ---------------------------------------------
            140842582/3 <------- avg sale for 31-Mar-2011
                ------^ here 3 is processing month number
the result should be like this for other months also. Will that be possible. 
August    06, 2011 - 8:01 am UTC 
 
"here i was waiting for your reply so that i can explain it a little more"
why - why OH WHY would you do something like that?  Just to waste both of our time?  Think about this please.  
When you ask a question, you are asking for something from someone - that something is "their time".  Their time is valuable to them.  Do not waste it by asking incomplete, vague questions (in short - by asking the WRONG question the first N times around).  Rather - use some of your precious time to precisely and concisely ask what you really meant to ask the first time - get it right the first time around. 
There is nothing, repeat nothing, more frustrating than getting a response "ok, that was cool - but what I *really* meant to ask was....."
I cannot read your stuff - you did not use the code button - therefore,  your text is using a proportional font - it is not readable.
Additionally, it makes no sense.  
I *think*, I'm GUESSING, that you want a running total - divided by the number of observations - in short, an average.
You do not want it divided by the month number (what if the months didn't run from 1..12 or there where more than 12 or a month was missing etc...).  You want an average plain and simple.
Now, since you used random numbers.... I cannot reproduce your example exactly - but here:
ops$tkyte%ORA11GR2> select cust_id, sale_val, asoflastday, avg(sale_val) over (order by asoflastday) what_you_PROBABLY_asked_for
  2    from xtest
  3   order by asoflastday
  4  /
   CUST_ID   SALE_VAL ASOFLASTD WHAT_YOU_PROBABLY_ASKED_FOR
---------- ---------- --------- ---------------------------
     44944 58494945.3 31-JAN-11                  58494945.3
     72912 47251551.8 28-FEB-11                  52873248.6
     17580 99228681.8 31-MAR-11                  68325059.6
     56824 76260729.8 30-APR-11                  70308977.2
     24469 11107307.6 31-MAY-11                  58468643.3
     20531 10186073.3 30-JUN-11                  50421548.3
     14909 42877408.5 31-JUL-11                    49343814
     11919 32108787.5 31-AUG-11                  47189435.7
     79820 4212567.56 30-SEP-11                  42414228.1
     28397 53316727.6 31-OCT-11                  43504478.1
     27438 96206453.3 30-NOV-11                  48295566.7
     22665 65783689.2 31-DEC-11                  49752910.3
12 rows selected.
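As a sanity check on what avg(sale_val) over (order by asoflastday) computes: the running average at row i is simply the running sum divided by the number of rows so far. A minimal Python sketch with hypothetical values (not the random data above):

```python
# Hypothetical monthly sales, already in date order.
vals = [100.0, 200.0, 300.0, 400.0]

# avg(val) over (order by ...) = cumulative sum / rows seen so far.
running_avg = []
total = 0.0
for i, v in enumerate(vals, start=1):
    total += v
    running_avg.append(total / i)

print(running_avg)
```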
 
 
 
 
Analytics
A reader, August    08, 2011 - 2:37 pm UTC
 
 
Hi Tom,
I have a question.
create table demo_1(col1 number, col2 varchar2(10));
insert into demo_1 values(1,'P');
insert into demo_1 values(2,'P');
insert into demo_1 values(3,'P');
insert into demo_1 values(4,'P');
insert into demo_1 values(5,'P');
create table demo(col1 number, col2 number, col3 varchar2(10));
insert into demo values(1,1,'A');
insert into demo values(2,1,'A');
insert into demo values(3,1,'A');
insert into demo values(4,2,'C');
insert into demo values(5,2,'B');
insert into demo values(6,3,'B');
insert into demo values(7,3,'B');
insert into demo values(8,4,'B');
insert into demo values(9,4,'C');
insert into demo values(10,5,'X');
We have to join the two tables demo and demo_1 where demo_1.col1 = demo.col2
The query should select the items based on the following criteria:
Select the col2 value if all of its rows have the same col3 value.
Select the col2 value if it has only one row.
The result will be
1 A
3 B
5 X
The col2=2 and 4 will be discarded as it has multiple col3 values in demo table.
Can we achieve this using Analytic function? 
August    13, 2011 - 3:36 pm UTC 
 
ops$tkyte%ORA11GR2> select t1.col1, count( distinct t2.col3), max(t2.col3)
  2    from demo_1 t1, demo t2
  3   where t1.col1 = t2.col2
  4   group by t1.col1
  5  having count(distinct t2.col3) = 1
  6  /
      COL1 COUNT(DISTINCTT2.COL3) MAX(T2.COL
---------- ---------------------- ----------
         1                      1 A
         5                      1 X
         3                      1 B
no analytics this time, just old fashioned aggregates. 
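The aggregate's logic - keep a key only when it maps to exactly one distinct value - can be sketched in a few lines of Python using the (col2, col3) pairs from the question's demo table:

```python
# (col2, col3) pairs from the demo table.
demo = [(1, "A"), (1, "A"), (1, "A"), (2, "C"), (2, "B"),
        (3, "B"), (3, "B"), (4, "B"), (4, "C"), (5, "X")]

# Group the distinct col3 values per col2 key.
groups = {}
for col2, col3 in demo:
    groups.setdefault(col2, set()).add(col3)

# HAVING count(distinct col3) = 1: keep keys with a single distinct value.
result = sorted((k, vals.pop()) for k, vals in groups.items() if len(vals) == 1)
print(result)
```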
 
 
 
consider using oracle forum
chris227, August    11, 2011 - 8:57 am UTC
 
 
In my opinion, the Oracle forums are a good place for these kinds of questions.
Anyway, you may try this:
with a as (
select 1 a,'P' b from dual
union all
select 2,'P' from dual
union all
select 3,'P' from dual
union all
select 4,'P' from dual
union all
select 5,'P' from dual
),
b as(
select 1 a,1 b,'A' c from dual
union all
select 2,1,'A' from dual
union all
select 3,1,'A' from dual
union all
select 4,2,'C' from dual
union all
select 5,2,'B' from dual
union all
select 6,3,'B' from dual
union all
select 7,3,'B' from dual
union all
select 8,4,'B' from dual
union all
select 9,4,'C' from dual
union all
select 10,5,'X' from dual
)
select distinct * from (
select b.b,b.c, count(distinct b.c) over (partition by b.b) num from a,b
where
a.a=b.b)
where num = 1
 
 
pre-Analytics
A reader, August    11, 2011 - 4:35 pm UTC
 
 
I've not been "fortunate" to use Oracle pre-Analytics, just wonder what you could do WITHOUT analytics for this;
select *
from   (select t1.*,
               row_number() over (partition by t1.val order by t2.priority desc) rn
        from   t1, t2
        where  t2.id = t1.t2_id
       )
where  rn = 1
The above joins t1 (with a FK to t2) to t2 and returns, for each t1.val, the row with the highest t2.priority.
What would be a good equivalent without Analytics?
Just been having a chat about the good old days with colleagues and we agree Analytics "rock", but this would make work more challenging :) 
 
answer to pre-analytics
chris227, August    12, 2011 - 3:40 am UTC
 
 
consider subqueries.
slightly modified example from question before:
with t2 as (
select 1 id,5 priority from dual
union all
select 2,1 from dual
union all
select 3,2 from dual
union all
select 4,3 from dual
union all
select 5,4 from dual
),
t1 as(
select 1 val,1 t2_id,'A' c from dual
union all
select 2,1,'A' from dual
union all
select 3,1,'A' from dual
union all
select 4,2,'C' from dual
union all
select 5,2,'B' from dual
union all
select 6,3,'B' from dual
union all
select 7,3,'B' from dual
union all
select 8,4,'B' from dual
union all
select 9,4,'C' from dual
union all
select 10,5,'X' from dual
),
j1 as (
select *
        from   t1, t2
        where  t2.id = t1.t2_id
)
select val, t2_id, c
from   j1
where
priority >= (
    select max(priority) from j1 j2
    where
    j1.val = j2.val
    )
       VAL      T2_ID C
---------- ---------- -
         1          1 A
         2          1 A
         3          1 A
         4          2 C
         5          2 B
         6          3 B
         7          3 B
         8          4 B
         9          4 C
        10          5 X
10 rows selected.
Same with but now old school
select val, t2_id, c
from   (select val, t2_id, c,
               row_number() over (partition by j1.val order by j1.priority desc) rn
        from j1
       )
where  rn = 1
       VAL      T2_ID C
---------- ---------- -
         1          1 A
         2          1 A
         3          1 A
         4          2 C
         5          2 B
         6          3 B
         7          3 B
         8          4 B
         9          4 C
        10          5 X
10 rows selected.     
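Both formulations solve the same "top row per group" problem. A minimal Python sketch of that logic (hypothetical data; note that, like row_number but unlike the correlated max() subquery, it keeps exactly one row per group even when priorities tie):

```python
# (group, priority, payload) rows; pick the highest-priority row per group.
rows = [("g1", 5, "A"), ("g1", 9, "B"), ("g2", 3, "C"), ("g2", 1, "D")]

best = {}
for grp, prio, payload in rows:
    # Strict > keeps the first row seen on a tie, one row per group,
    # mirroring row_number() ... where rn = 1 (rank() would keep both).
    if grp not in best or prio > best[grp][0]:
        best[grp] = (prio, payload)

print(sorted(best.items()))
```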
 
 
answer to pre-analytics 2
chris227, August    12, 2011 - 3:52 am UTC
 
 
sorry, slip of the pen - the two statements should be swapped.
Notice that the window of the analytic function corresponds to the predicate of the where clause in the subquery.
 
One question
A reader, August    18, 2011 - 3:16 pm UTC
 
 
How do I select the rows whose count equals the number of rows in the master table?
Let's say we have three different flags in rows for an employee id.
In my second table I have object_ids assigned to that employee; I will only select those object_ids whose row count equals the number of rows in the first table.
table_1:
emp_id flag
1      XX
1      YY
1      ZZ
2      PP
2      XX
3      RR
table_2
emp_id   object_id
1        10
1        10
1        10
1        20
1        20
1        45
1        45
1        45
2        50
2        50
2        45
3        25
3        45
So my expected output should be
emp_1 : 10, 45 (since these occur 3 times, equal to the rows in table_1); we discard 20 as it occurs only 2 times
emp_2 : 50 (since it occurs 2 times, equal to the rows in table_1); we discard 45 as it occurs only 1 time
emp_3 : 25, 45 (since these occur 1 time, equal to the rows in table_1) 
Could you please suggest a solution.
Oracle 10G R1 .. Estimated rows in table_1 : 1 million
table_2 : 500 MB 
August    30, 2011 - 12:54 pm UTC 
 
no creates
no inserts 
no look
if you want a query, provide create tables and inserts. 
 
 
21 days sql question
Muhammet, August    26, 2011 - 4:41 pm UTC
 
 
Hi, 
 Think of a table with patients and inspection dates.
The business rule says that there can be no more than one inspection within 21 days, but the table has dirty data.
They want a report with patient and inspection date, with the data cleaned: for every 21-day window starting
with a kept inspection date, take only the first inspection into the report.
  
Sample
Patient    Inspection_Date
------     --------------
Ahmet      01/01/2011   ==> DD/MM/YYYY
Ahmet      15/01/2011
Ahmet      01/02/2011
Ahmet      15/02/2011
Ahmet      01/06/2011
Ahmet      15/06/2011
Ahmet      01/07/2011
So the report will be
Ahmet 01/01/2011
Ahmet 01/02/2011
Ahmet 01/06/2011
Ahmet 01/07/2011
create table T (patient varchar2(20),inspection_date date);
insert into T values('Ahmet',to_date('01/01/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/01/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/02/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/02/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/06/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('15/06/2011','DD/MM/YYYY'));
insert into T values('Ahmet',to_date('01/07/2011','DD/MM/YYYY'));
Thanks
 
 
re: 21 days sql question
Stew Ashton, August    29, 2011 - 11:41 am UTC
 
 
 Hello Muhammet,
This solution may not be to everyone's taste, but the only alternative I know is worse:
SELECT distinct patient, keep_date
FROM t
MODEL
  PARTITION BY (patient)
  DIMENSION BY (Row_Number() OVER (PARTITION BY patient ORDER BY inspection_date) rn)
  MEASURES (inspection_date, inspection_date keep_date)
  RULES (
    keep_date[rn>1] ORDER BY rn = 
      CASE WHEN inspection_date[cv()] < keep_date[cv()-1]+21
      then keep_date[cv()-1]
      ELSE inspection_date[cv()] END
)
order by 1, 2;
PATIENT              KEEP_DATE                 
-------------------- ------------------------- 
Ahmet                01-JAN-11 00.00.00        
Ahmet                01-FEB-11 00.00.00        
Ahmet                01-JUN-11 00.00.00        
Ahmet                01-JUL-11 00.00.00 
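The effect of the MODEL rule - carry the last kept date forward, and only start a new keep_date once an inspection falls 21 or more days past it - reduces to a single ordered pass. A minimal Python sketch, with Muhammet's dates converted to day-of-year numbers:

```python
# Muhammet's inspection dates as day-of-year numbers:
# 01/01=1, 15/01=15, 01/02=32, 15/02=46, 01/06=152, 15/06=166, 01/07=182
inspections = [1, 15, 32, 46, 152, 166, 182]

keep = []
for d in inspections:
    if keep and d < keep[-1] + 21:
        pass                # within 21 days of the last kept date: drop
    else:
        keep.append(d)      # 21+ days out: this inspection is kept

print(keep)
```

The kept day numbers 1, 32, 152, 182 correspond to 01/01, 01/02, 01/06 and 01/07 - the same rows the MODEL query returns.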
 
 
Re: One Question: Tables for Analytics question
A reader, August    30, 2011 - 4:29 pm UTC
 
 
Hi Tom,
Please find the create and inserts:
create table table_1(emp_id number, flag varchar2(10));
insert into table_1 values(1,'XX');
insert into table_1 values(1,'YY');
insert into table_1 values(1,'ZZ');
insert into table_1 values(2,'PP');
insert into table_1 values(2,'XX');
insert into table_1 values(3,'RR');
create table table_2(emp_id number, object_id varchar2(10));
insert into table_2 values(1,'10');
insert into table_2 values(1,'10');
insert into table_2 values(1,'10');
insert into table_2 values(1,'20');
insert into table_2 values(1,'20');
insert into table_2 values(1,'45');
insert into table_2 values(1,'45');
insert into table_2 values(1,'45');
insert into table_2 values(2,'50');
insert into table_2 values(2,'50');
insert into table_2 values(2,'45');
insert into table_2 values(3,'25');
insert into table_2 values(3,'45');
Thanks for the time. 
 
To A Reader
Michel Cadot, August    31, 2011 - 2:21 am UTC
 
 
You have no need of analytic functions for this:
SQL> select b.emp_id, b.object_id
  2  from (select emp_id, count(*) nb from table_1 group by emp_id) a,
  3       (select emp_id, object_id, count(*) nb from table_2 group by emp_id, object_id) b
  4  where a.emp_id = b.emp_id and a.nb = b.nb
  5  order by 1, 2
  6  /
    EMP_ID OBJECT_ID
---------- ----------
         1 10
         1 45
         2 50
         3 25
         3 45
5 rows selected.

Regards
Michel 
 
 
To Muhammet 
Michel Cadot, August    31, 2011 - 6:54 am UTC
 
 
With one more row in your table, here's a query with no model clause:
SQL> select patient, inspection_date
  2  from t
  3  order by 1, 2
  4  /
PATIENT              INSPECTION_
-------------------- -----------
Ahmet                01-JAN-2011
Ahmet                15-JAN-2011
Ahmet                01-FEB-2011
Ahmet                10-FEB-2011
Ahmet                15-FEB-2011
Ahmet                01-JUN-2011
Ahmet                15-JUN-2011
Ahmet                01-JUL-2011
8 rows selected.
SQL> with
  2    data as (
  3      select patient, inspection_date,
  4             min(inspection_date)
  5               over (partition by patient order by inspection_date
  6                     range between 22 following and unbounded following)
  7              next_21
  8      from t a
  9      ),
 10    compute (patient, inspection_date, next_21) as (
 11      select patient, min(inspection_date), min(next_21)
 12      from data
 13      group by patient
 14      union all
 15      select data.patient, data.inspection_date, data.next_21
 16      from data, compute c
 17      where data.patient = c.patient
 18        and data.inspection_date = c.next_21
 19    )
 20  select patient, inspection_date
 21  from compute
 22  order by 1, 2
 23  /
PATIENT              INSPECTION_
-------------------- -----------
Ahmet                01-JAN-2011
Ahmet                01-FEB-2011
Ahmet                01-JUN-2011
Ahmet                01-JUL-2011
4 rows selected.
Regards
Michel 
 
 
Analytic Question
A reader, August    31, 2011 - 11:04 am UTC
 
 
Is it possible to rewrite something like this?
SELECT
  CASE
    WHEN a.status IN (1,2,3,5)
    AND a.pm_id         = b.pm_id
    AND c.ac_id    =1
    AND b.ho_id=1
    THEN MIN(a.col2) keep (dense_rank FIRST
    ORDER BY (
      CASE
    WHEN a.status IN (1,2,3,5)
    AND a.pm_id         = b.pm_id
    AND c.ac_id    =1
    AND b.ho_id=1
        THEN 0
        ELSE 1
      END), a.col2) over (partition BY a.col1)
    ELSE NULL
  END pp1_account_id,
  CASE
   WHEN a.status IN (1,2,3,5)
    AND a.pm_id         = b.pm_id
        AND c.ac_id    =1
    AND b.ho_id=1
   THEN MIN(b.col3) keep (dense_rank FIRST
    ORDER BY (
      CASE
       WHEN a.status IN (1,2,3,5)
        AND a.pm_id         = b.pm_id
       AND c.ac_id    =1
    AND b.ho_id=1
     THEN 0
        ELSE 1
      END), b.col3) over (partition BY a.col1)
    ELSE NULL
    FROM table_a a,
      table_b b,
      table_c c
    WHERE a.col1 = b.col1(+)
    AND b.col1   = c.col1
    AND c.col1   = a.col1 
August    31, 2011 - 2:11 pm UTC 
 
maybe - I don't have the specification for the question being asked, I don't have the schema and constraints, and I don't know if the query itself as stated is correct.
just for readability I'd be using an inline view at the very least - so I didn't have to keep reproducing 
 a.status IN (1,2,3,5)
    AND a.pm_id         = b.pm_id
    AND c.ac_id    =1
    AND b.ho_id=1
over and over again
 
 
 
@Michel re:Muhammet's question
Stew Ashton, August    31, 2011 - 2:10 pm UTC
 
 
 
Hi Michel,
Yes, that was the alternative I was thinking of, although I used the old-fashioned hierarchical query syntax. Amusing that we probably live near each other but meet only on a server in Austin :) 
 
Restriction on analytic function result on fly
Sonal, September 01, 2011 - 1:44 pm UTC
 
 
Hi Tom, 
Is it possible to restrict the query result based on cumulative data while processing?
i.e. the table has the following data
empno sal
10 100
11 200
12 300
13 400
by using analytic function, we can get 
empno   sal cummulative sal
10 100 100
11 200 300
12 300 600
13 400 1000
select empno, sal, sum(sal) over (order by empno) cummulative_sal from emp;
now if i want to get the employee where the cumulative salary first reaches 500, I will do 
select * from (
select a.*, rank() over (order by cummulative_sal) rnk from (
select empno, sal, sum(sal) over (order by empno) cummulative_sal from emp) a 
where cummulative_sal >= 500 
) where rnk = 1 
If I want this dept-wise or by some other grouping, I will add a partition clause. So here I have used the FROM clause twice to get the result. I am wondering if there is any function which tells Oracle to stop processing more rows once my cumulative total reaches the limit. If the inner query returns 100K rows and I am only interested in the first 3, why should I process the other rows? Maybe there is no performance benefit, but just to know - is there any way of doing this? 
Regards
Sonal
 
September 01, 2011 - 2:36 pm UTC 
 
analytics is not able to do this (yet), no 
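For what it's worth, the early-exit behavior Sonal describes is natural in procedural code, since nothing there requires materializing the whole window before filtering. A minimal Python sketch with hypothetical rows:

```python
def first_to_reach(rows, limit):
    """rows: (empno, sal) pairs in order. Returns (empno, sal, cum_sal)
    for the row where the running total first reaches `limit`, reading
    no further rows; returns None if the limit is never reached."""
    total = 0
    for empno, sal in rows:
        total += sal
        if total >= limit:
            return empno, sal, total   # stop scanning here
    return None

print(first_to_reach([(10, 100), (11, 200), (12, 300), (13, 400)], 500))
```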
 
 
A reader, September 03, 2011 - 6:12 pm UTC
 
 
If you are on 11gR2 you can do this with a recursive WITH clause as below. I assume you want to restart the running total each time it reaches 500. 
with t(deptno, empno, sal) as
( 
select 1, 10,    100 from dual union all
select 1, 11,    200 from dual union all
select 1, 12,    300 from dual union all
select 1, 13,    400 from dual 
),
data as 
(
  select t.*,row_number()over(partition by deptno order by empno) as rno,
             sum(sal)over(partition by deptno order by empno) as Run_Tot
  from t 
)
,
rec (rno, deptno, empno, sal, Run_Tot, Total, flg) as 
(
  select rno, deptno, empno, sal, Run_Tot, sal, case when sal >=500 Then 1 ELSE 0 END
  from data 
  where rno=1
  
  union all
  select d.rno, d.deptno, d.empno, d.sal, d.Run_Tot, case when r.Total + d.sal >=500 Then d.sal ELSE r.Total + d.sal END,
  case when r.Total + d.sal >=500 Then 1 ELSE 0 END
  from rec r, data d 
  where r.rno+1=d.rno and r.deptno=d.deptno 
)
select * from rec 
where flg=1 
hth...
Thanks 
 
 
@Sonal on "Restriction on analytic function result on fly"
Stew Ashton, September 04, 2011 - 4:01 am UTC
 
 
 Hi Sonal,
You can remove a FROM in your query if you say 
SELECT empno, sal, cum_sal FROM (
  SELECT empno, sal,
  sum(sal) OVER (ORDER BY empno) cum_sal,
  nvl(sum(sal) OVER (
    ORDER BY empno ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
  ),0) prev_cum_sal
  FROM emp
) WHERE cum_sal >=500 AND prev_cum_sal < 500;
You could get down to one FROM with the MODEL clause
SELECT empno, sal, cum_sal FROM emp
MODEL
RETURN UPDATED ROWS
DIMENSION BY (
  sum(sal) OVER (ORDER BY empno) cum_sal,
  nvl(sum(sal) OVER (
    ORDER BY empno ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
  ),0) prev_cum_sal
)
MEASURES (empno, sal)
RULES (
  empno[cum_sal>=500, prev_cum_sal<500] = empno[cv(),cv()]
);
Neither would do what you are looking for, but the execution plans have fewer steps. (The MODEL clause is just for fun.) 
 
 
21 days sql question
CezarN, November  01, 2011 - 7:17 am UTC
 
 
I have modified Michel Cadot's solution so as not to use the recursive CTE, which works only on 11g R2 and above (waiting for this "above" :D):
select patient, inspection_date from
(
  select patient, inspection_date,
                 min(inspection_date)
                   over (partition by patient order by inspection_date
                         range between 22 following and unbounded following)
                  next_21,
                  row_number() over (partition by patient order by inspection_date) as rn
          from t a
)
connect by prior next_21 = inspection_date
start with rn = 1
;
PATIENT              INSPECTION_DATE
-------------------- ---------------
Ahmet                01/01/2011
Ahmet                01/02/2011
Ahmet                01/06/2011
Ahmet                01/07/2011
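The chained selection this query performs — start at the first inspection, then repeatedly jump to the earliest inspection at least 22 days after the current one — can be sketched in Python. This is a hypothetical re-implementation with made-up dates; the real query does the jump with MIN(...) over a RANGE window plus CONNECT BY:

```python
from datetime import date

def chain(dates, gap=22):
    dates = sorted(dates)
    out = [dates[0]]                    # "start with rn = 1"
    while True:
        # earliest date at least `gap` days after the current one, like
        # MIN(...) RANGE BETWEEN 22 FOLLOWING AND UNBOUNDED FOLLOWING
        nxt = [d for d in dates if (d - out[-1]).days >= gap]
        if not nxt:
            break
        out.append(min(nxt))
    return out

visits = [date(2011, 1, 1), date(2011, 1, 10), date(2011, 2, 1),
          date(2011, 2, 15), date(2011, 6, 1), date(2011, 7, 1)]
print(chain(visits))
```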
Regards,
Cezar 
 
Counting partners who appeared together with each other
Ramis, November  27, 2011 - 8:58 am UTC
 
 
Hi Tom
I am posting here because I think it can perhaps be solved by analytics. 
I have a table t 
create table T
(player varchar2(40),
for_team varchar2(10),
Vs_team varchar2(10),
matchid number,
points number)
/
INSERT INTO T VALUES ('John','A','B',1,2);
INSERT INTO T VALUES ('Fulton','A','B',1,10);
INSERT INTO T VALUES ('Sarah','A','B',1,9);
INSERT INTO T VALUES ('Peter','B','A',1,7);
INSERT INTO T VALUES ('Carlos','B','A',1,9);
INSERT INTO T VALUES ('Jose','B','A',1,6);
INSERT INTO T VALUES ('Joe','A','B',2,8);
INSERT INTO T VALUES ('Peter','A','B',2,9);
INSERT INTO T VALUES ('Carlos','A','B',2,1);
INSERT INTO T VALUES ('Rubben','B','A',2,10);
INSERT INTO T VALUES ('John','B','A',2,0);
INSERT INTO T VALUES ('Fulton','B','A',2,1);
INSERT INTO T VALUES ('Marcelo','A','B',3,7);
INSERT INTO T VALUES ('Daniela','A','B',3,1);
INSERT INTO T VALUES ('John','A','B',3,2);
INSERT INTO T VALUES ('Jose','B','A',3,5);
INSERT INTO T VALUES ('Abrao','B','A',3,3);
INSERT INTO T VALUES ('Carlos','B','A',3,10);
Select * from t
order by matchid, for_team
/
Player For_team vs_team  matchid   Points
John       A       B       1       2
Fulton     A       B       1       10
Sarah      A       B       1       9
Carlos     B       A       1       9
Peter      B       A       1       7
Jose       B       A       1       6
Peter      A       B       2       9
Carlos     A       B       2       1
Joe        A       B       2       8
Fulton     B       A       2       1
John       B       A       2       0
Rubben     B       A       2       10
Daniela    A       B       3       1
John       A       B       3       2
Marcelo    A       B       3       7
Jose       B       A       3       5
Abrao      B       A       3       3
Carlos     B       A       3       10

Note: Each player can appear for more than one team in different matches. For example, John appeared for team A in matchid = 1 and then in matchid = 2 he appeared for team B. The same can apply to other players.
Requirement:
For each player, I want the total number of matches and total points (which is easy using SUM), along with the total number of different teammates he played with across all the matches he appeared in [for his team(s)], the total number of opposition players, and lastly the total number of different players he played with in all the matches he appeared in.
Just to clarify some terms, in case of any doubt:
Here teammates = all players who appeared (for_team) in a match for the same team for which the respective player also appeared in that match.
opposition players = all players who appeared for the vs_team faced by each player.
total different players = all unique players who appeared for the for_team or vs_team in the matches in which each player appeared.
Here is my desired output:
Player    Total matches    Sum(points)   Different teammates    Different           Total different
                                         played with            opposition players  players
John      3                4             5                      5                   10
Fulton    2                11            3                      4                   6  
Sarah     1                9             2                      3                   5
Peter     2                16            3                      4                   7
Carlos    3                20            4                      6                   10
Jose      2                11            3                      5                   8
Joe       1                8             2                      3                   5
Rubben    1                10            2                      3                   5
Marcelo   1                7             2                      3                   5
Daniela   1                1             2                      3                   5
Abrao     1                3             2                      3                   5       
 I want one simple, short query to achieve the above output, since the data in my actual table is huge.
thanks in advance.
regards
Asim 
November  29, 2011 - 7:09 am UTC 
 
I don't think analytics are appropriate, but an aggregation would be:
ops$tkyte%ORA11GR2> select t1_player,
  2         count(case when t1_player = t2_player then matchid end) "total matches",
  3         sum(case when t1_player = t2_player then points end) "total points",
  4         count(distinct case when t1_for_team = t2_for_team and t1_player <> t2_player then t2_player end) "diff teammates",
  5         count(distinct case when t1_for_team <>t2_for_team then t2_player end) "diff opp"
  6    from (
  7  select t1.player t1_player, t1.for_team t1_for_team, t2.player t2_player, t2.for_team t2_for_team, t1.matchid, t1.points
  8    from t t1, t t2
  9   where t1.matchid = t2.matchid
 10         )
 11   group by t1_player
 12   order by t1_player
 13  /
T1_PLAYER  total matches total points diff teammates   diff opp
---------- ------------- ------------ -------------- ----------
Abrao                  1            3              2          3
Carlos                 3           20              4          6
Daniela                1            1              2          3
Fulton                 2           11              3          4
Joe                    1            8              2          3
John                   3            4              5          5
Jose                   2           11              3          5
Marcelo                1            7              2          3
Peter                  2           16              3          4
Rubben                 1           10              2          3
Sarah                  1            9              2          3
11 rows selected.
we cartesian product each matchid group - that'll join every player with every other player - including themselves - by matchid.
then - if we count the rows where the players are the same - that is the number of matches that player played in.
if we sum the points where the players are the same - that is the sum of their points.
if we count the number of distinct player names where the teams are the same (but the player name is different) that is the total number of teammates.
if we count the number of distinct player names where the teams are different - that is the total number of opposing players...
Now, if this table is "HUGE", remember that this will make it temporarily 16 times "more huge".
I put airquotes on "HUGE" because some people think a few million rows is "HUGE" - but it really isn't.
You do not want a single index to be used - you want two full scans and a nice big hash join. 
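Tom's self-join aggregation can be sketched in Python to make visible what the cartesian product per matchid is doing. This is a hypothetical re-implementation over the sample rows above (the function name and data layout are mine, not part of the original answer):

```python
# (player, for_team, vs_team, matchid, points) -- the sample data
rows = [
    ('John','A','B',1,2), ('Fulton','A','B',1,10), ('Sarah','A','B',1,9),
    ('Peter','B','A',1,7), ('Carlos','B','A',1,9), ('Jose','B','A',1,6),
    ('Joe','A','B',2,8), ('Peter','A','B',2,9), ('Carlos','A','B',2,1),
    ('Rubben','B','A',2,10), ('John','B','A',2,0), ('Fulton','B','A',2,1),
    ('Marcelo','A','B',3,7), ('Daniela','A','B',3,1), ('John','A','B',3,2),
    ('Jose','B','A',3,5), ('Abrao','B','A',3,3), ('Carlos','B','A',3,10),
]

def stats():
    res = {}
    # the nested loop is the cartesian product per matchid
    # (from t t1, t t2 where t1.matchid = t2.matchid)
    for p1, team1, _, m1, pts1 in rows:
        rec = res.setdefault(p1, [0, 0, set(), set()])
        for p2, team2, _, m2, _pts2 in rows:
            if m1 != m2:
                continue
            if p1 == p2:
                rec[0] += 1          # count of matches played
                rec[1] += pts1       # sum of own points
            elif team1 == team2:
                rec[2].add(p2)       # distinct teammates
            else:
                rec[3].add(p2)       # distinct opposing players
    return res

s = stats()
print('John:', s['John'][0], s['John'][1], len(s['John'][2]), len(s['John'][3]))
```

For John this prints 3 matches, 4 points, 5 distinct teammates and 5 distinct opponents, matching Tom's result set.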
 
 
Thanks a lot
Ramis, December  01, 2011 - 8:34 am UTC
 
 
Dear Tom
This is really amazing. Thank you so much. I didn't expect that it would be so simple. On my actual table it takes just 11 seconds to give me output for 2600+ players and 3000+ matches. I appreciate your time and help. Thanks again.
regards
Ramis 
 
Requirement to capture boundary values
Rajeshwaran, Jeyabal, December  07, 2011 - 8:06 pm UTC
 
 
Tom:
I have a requirement to calculate the values below (table 'T' in the application is an external table that reads from text files):
1) ST_START_ID should be calculated as value of x when y = 'ST'
2) ST_END_ID should be calculated as value of x-1 when y ='CLM'
3) CLM_START_ID should be calculated as value of x when y ='CLM'
4) CLM_END_ID should be calculated as value of x-1 when y ='SE' or y ='CLM'
drop table t purge;
create table t(x number,y varchar2(10));
insert into t values(1,'ISA');
insert into t values(2,'ST');
insert into t values(3,'D1');
insert into t values(4,'D2');
insert into t values(5,'CLM1');
insert into t values(6,'X1');
insert into t values(7,'X2');
insert into t values(8,'X3');
insert into t values(9,'SE');
insert into t values(10,'ST');
insert into t values(11,'D1');
insert into t values(12,'D2');
insert into t values(13,'D3');
insert into t values(14,'CLM1');
insert into t values(15,'X1');
insert into t values(16,'CLM2');
insert into t values(17,'X1');
insert into t values(18,'X2');
insert into t values(19,'CLM3');
insert into t values(20,'X1');
insert into t values(21,'SE');
insert into t values(22,'IEA');
commit;
I need output like this:
ST_START_ID  ST_END_ID  CLM_START_ID  CLM_END_ID
          2          4             5           8
         10         13            14          15
         10         13            16          18
         10         13            19          20
 
December  08, 2011 - 12:37 pm UTC 
 
umm, this doesn't make sense at all.
the pivoting is unexplained and doesn't make sense.
You wrote: 
1) ST_START_ID should be calculated as value of x when y = 'ST' 
2) ST_END_ID should be calculated as value of x-1 when y ='CLM' 
3) CLM_START_ID should be calculated as value of x when y ='CLM' 
4) CLM_END_ID should be calculated as value of x-1 when y ='SE' or y ='CLM' 
I see nothing where y = 'CLM'
going further on, assuming you meant "y like'CLM%'", how did you know to "go on" when you hit CLM1 with x=5.  you used it twice for st_end_id and clm_start_id, why didn't you use it a third time for clm_end_id
also, why did you use it twice, if that is always going to be true, you don't really need the st_end_id do you - it is rather obvious what it is - you don't need to compute it in the result set.
and what happens if in between an 'ST' record and the next 'ST' record - there are no clm records.
and you had best explain those last two lines of output - they make no sense to me.
Ask yourself - honestly and truly ask yourself - have you given enough information in the form of a specification here that anyone could write code from it without ambiguity.  I could code up something that gets your result set - that is easy enough, but I have no idea if it is what you really meant.  I won't take my time until you give us a lot more detail.
 
 
 
Requirement to capture boundary values
Rajeshwaran, Jeyabal, December  07, 2011 - 8:08 pm UTC
 
 
Tom, sorry I missed this in the previous post: the size of the file mapped to external table 'T' will be around 54MB to 80MB. 
December  08, 2011 - 12:37 pm UTC 
 
so it is very small, ok, post more details from above please... 
 
 
Requirement to capture boundary values 
Rajeshwaran, Jeyabal, December  08, 2011 - 1:07 pm UTC
 
 
Tom:
Here are more details:
1) Within ST and SE (that is, where x is between 2 and 9) there will be multiple CLM% segments (at least one CLM will be available between ST and SE).
2) I need to output the boundary for each CLM segment with its ST and SE combination.
 a) So I have one CLM% between x >= 2 and x <= 9, and I need one output row like this:
 ST_START_ID  ST_END_ID  CLM_START_ID CLM_END_ID
 2             4            5             8
 b) Now I have three CLM% between x >= 10 and x <= 21, so I need three output rows, one for each CLM% value:
 
 ST_START_ID  ST_END_ID  CLM_START_ID CLM_END_ID
 10             13            14             15 
 10             13            16             18
 10             13            19             20 
  
December  09, 2011 - 4:00 pm UTC 
 
this is as close as I can get you - YOU will take it from here.
ops$tkyte%ORA11GR2> select *
  2    from (
  3  select x, y, grp,
  4             max(st_x) over (partition by grp) new_st_x,
  5             max(se_x) over (partition by grp) new_se_x
  6    from (
  7  select x, y,
  8         count( case when y = 'ST' then 'ST' end ) over (order by x) grp,
  9             case when y = 'ST' then x end st_x,
 10             case when y = 'SE' then x end se_x
 11    from t
 12         )
 13             )
 14   where y like 'CLM_'
 15   order by x
 16  /
         X Y                 GRP   NEW_ST_X   NEW_SE_X
---------- ---------- ---------- ---------- ----------
         5 CLM1                1          2          9
        14 CLM1                2         10         21
        16 CLM2                2         10         21
        19 CLM3                2         10         21
your description leaves something to be - well, I'll leave it at that.
Anyway - what you have there is every CLM_ record "joined" with its prior ST record and post SE record.  That should be everything you need.
If you need the prior CLM_ record "joined" to each existing CLM_ record - please use LAG()  
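The running-count grouping trick Tom uses (a count of 'ST' rows so far acts as the group key; the group's ST and SE positions are then attached to every CLM% row inside it) can be sketched in Python. A hypothetical re-implementation over the same sample rows:

```python
rows = [(1,'ISA'), (2,'ST'), (3,'D1'), (4,'D2'), (5,'CLM1'), (6,'X1'),
        (7,'X2'), (8,'X3'), (9,'SE'), (10,'ST'), (11,'D1'), (12,'D2'),
        (13,'D3'), (14,'CLM1'), (15,'X1'), (16,'CLM2'), (17,'X1'),
        (18,'X2'), (19,'CLM3'), (20,'X1'), (21,'SE'), (22,'IEA')]

def clm_bounds(rows):
    grp, groups, tagged = 0, {}, []
    for x, y in sorted(rows):             # "order by x"
        if y == 'ST':
            grp += 1                      # running count of 'ST' rows = group
        g = groups.setdefault(grp, {'st': None, 'se': None})
        if y == 'ST':
            g['st'] = x                   # like max(st_x) over (partition by grp)
        elif y == 'SE':
            g['se'] = x                   # like max(se_x) over (partition by grp)
        tagged.append((x, y, grp))
    # keep only the CLM% rows, joined to their group's ST and SE positions
    return [(x, y, groups[g]['st'], groups[g]['se'])
            for x, y, g in tagged if y.startswith('CLM')]

for r in clm_bounds(rows):
    print(r)
```

This reproduces Tom's four rows: each CLM row paired with the x of its prior ST record and its following SE record.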
 
 
Requirement to capture boundry values
Rajeshwaran, Jeyabal, December  10, 2011 - 2:41 pm UTC
 
 
Thanks Tom.
I think I got it.
rajesh@ORA11GR2> select st_x as st_start_id,
  2        st_end_id - 1 as st_end_id,
  3        min(x) as clm_start_id,
  4        max(x) as clm_end_id
  5  from (
  6  select x,y,st_x,st_grp,last_value(clm_grp ignore nulls) over(order by x) as clm_grp,
  7         min(case when y like 'CLM%' then x end) over(partition by st_grp order by x) as st_end_id
  8  from (
  9  select x,y,
 10       case when y like 'CLM%' then row_number() over(order by x)
 11       when y like 'SE' then 1 end as clm_grp,
 12       last_value(case when y ='ST' then x end ignore nulls) over(order by x) as st_x,
 13      count(case when y='ST' then 'ST' end) over(order by x) as st_grp
 14  from t
 15      )
 16  )
 17  where clm_grp > 1
 18  group by st_x,st_end_id,clm_grp
 19  order by clm_grp
 20  /
ST_START_ID  ST_END_ID CLM_START_ID CLM_END_ID
----------- ---------- ------------ ----------
          2          4            5          8
         10         13           14         15
         10         13           16         18
         10         13           19         20
Elapsed: 00:00:00.11
rajesh@ORA11GR2>
rajesh@ORA11GR2> 
 
Alternate approach
Matt McPeak, December  14, 2011 - 2:56 pm UTC
 
 
Another way to do it:
select *
  from (
select t.*,
       max(case when y = 'ST' then x end) over
           (order by x range between unbounded preceding and current row) st_start_id,
       max(case when y = 'CLM1' then x-1 end) over
           (order by x range between unbounded preceding and current row) st_end_id,
       max(case when y like 'CLM%' then x end) over
           (order by x range between unbounded preceding and current row) clm_start_id,
       min(case when y like 'CLM%' or y in ('SE','IEA') then x-1 end) over
           (order by x range between 1 following and unbounded following) clm_end_id
  from t
       )
 where y like 'CLM%'
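The pattern here is to carry the most recent marker forward (a running MAX over everything seen so far) and, for the end id, look backward from the end (a MIN over everything still to come). A hypothetical Python sketch of that carry-forward/carry-backward idea over the first ST..SE group:

```python
rows = [(1,'ISA'), (2,'ST'), (3,'D1'), (4,'D2'), (5,'CLM1'), (6,'X1'),
        (7,'X2'), (8,'X3'), (9,'SE')]

def tag(rows):
    out, st_start, clm_start = [], None, None
    for x, y in rows:                     # forward pass: running MAXes
        if y == 'ST':
            st_start = x                  # latest ST position so far
        if y.startswith('CLM'):
            clm_start = x                 # latest CLM position so far
        out.append([x, y, st_start, clm_start, None])
    nxt = None
    for rec in reversed(out):             # backward pass: nearest following marker
        rec[4] = nxt                      # clm_end = next CLM/SE/IEA position - 1
        x, y = rec[0], rec[1]
        if y.startswith('CLM') or y in ('SE', 'IEA'):
            nxt = x - 1
    return [r for r in out if r[1].startswith('CLM')]

print(tag(rows))
```

(This sketch omits the st_end_id column for brevity; it is the same forward-carry applied to the first CLM of each group.)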
 
 
Analytic function
A reader, January   13, 2012 - 9:55 pm UTC
 
 
Hello Tom,
I have a table TEST which contains one column, id.
And it has values from 99 to 1 in desc manner.
ID
99
98
..
..
1
Please help me get an output as below:
id1 id2 id3 id4 id5 id6 id7
99  98  97  96  95  94  93
86  87  88  89  90  91  92
85  84 83   ..  ..
Thanks.
lalu.
 
January   17, 2012 - 11:23 am UTC 
 
And it has values from 99 to 1 in desc manner.
No, I'm sorry, it does not.
It might well have 1 to 99 in it, but they are not in any "order".  there is no order to any set of rows unless and until you add an order by.
your output doesn't make sense to me.
you go 99 down to 93
then from 86 up to 92
then from 85 down to ....
do you want to go up, or down, or did you really mean to interleave it
AND WHY DOESN'T ANYONE SEEM TO SAY STUFF LIKE THIS ANYMORE.  where are the specifications, where is the explanation of what you really mean?  pictograms like this don't tell the whole story.
pick the one you want
ops$tkyte%ORA11GR2> with data(r)
  2  as
  3  (select 1 r from dual
  4   union all
  5   select r+1 from data where r < 99
  6  )
  7  select max(decode(m,6,r)) id1,
  8         max(decode(m,5,r)) id2,
  9         max(decode(m,4,r)) id3,
 10         max(decode(m,3,r)) id4,
 11         max(decode(m,2,r)) id5,
 12         max(decode(m,1,r)) id6,
 13         max(decode(m,0,r)) id7
 14    from (
 15  select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
 16    from data
 17         )
 18   group by d
 19   order by d desc;
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        99         98         97         96         95         94         93
        92         91         90         89         88         87         86
        85         84         83         82         81         80         79
        78         77         76         75         74         73         72
        71         70         69         68         67         66         65
        64         63         62         61         60         59         58
        57         56         55         54         53         52         51
        50         49         48         47         46         45         44
        43         42         41         40         39         38         37
        36         35         34         33         32         31         30
        29         28         27         26         25         24         23
        22         21         20         19         18         17         16
        15         14         13         12         11         10          9
         8          7          6          5          4          3          2
         1
15 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> with data(r)
  2  as
  3  (select 1 r from dual
  4   union all
  5   select r+1 from data where r < 99
  6  )
  7  select max(decode(m,0,r)) id1,
  8         max(decode(m,1,r)) id2,
  9         max(decode(m,2,r)) id3,
 10         max(decode(m,3,r)) id4,
 11         max(decode(m,4,r)) id5,
 12         max(decode(m,5,r)) id6,
 13         max(decode(m,6,r)) id7
 14    from (
 15  select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
 16    from data
 17         )
 18   group by d
 19   order by d desc
 20  /
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        93         94         95         96         97         98         99
        86         87         88         89         90         91         92
        79         80         81         82         83         84         85
        72         73         74         75         76         77         78
        65         66         67         68         69         70         71
        58         59         60         61         62         63         64
        51         52         53         54         55         56         57
        44         45         46         47         48         49         50
        37         38         39         40         41         42         43
        30         31         32         33         34         35         36
        23         24         25         26         27         28         29
        16         17         18         19         20         21         22
         9         10         11         12         13         14         15
         2          3          4          5          6          7          8
                                                                           1
15 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> with data(r)
  2  as
  3  (select 1 r from dual
  4   union all
  5   select r+1 from data where r < 99
  6  )
  7  select max(decode(m,decode( mod(d,2), 1, 0, 6) ,r)) id1,
  8         max(decode(m,decode( mod(d,2), 1, 1, 5) ,r)) id2,
  9         max(decode(m,decode( mod(d,2), 1, 2, 4) ,r)) id3,
 10         max(decode(m,decode( mod(d,2), 1, 3, 3) ,r)) id4,
 11         max(decode(m,decode( mod(d,2), 1, 4, 2) ,r)) id5,
 12         max(decode(m,decode( mod(d,2), 1, 5, 1) ,r)) id6,
 13         max(decode(m,decode( mod(d,2), 1, 6, 0) ,r)) id7
 14    from (
 15  select r, trunc((r+7-1.99)/7) d, mod(r-2+7, 7) m
 16    from data
 17         )
 18   group by d
 19   order by d desc
 20  /
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        99         98         97         96         95         94         93
        86         87         88         89         90         91         92
        85         84         83         82         81         80         79
        72         73         74         75         76         77         78
        71         70         69         68         67         66         65
        58         59         60         61         62         63         64
        57         56         55         54         53         52         51
        44         45         46         47         48         49         50
        43         42         41         40         39         38         37
        30         31         32         33         34         35         36
        29         28         27         26         25         24         23
        16         17         18         19         20         21         22
        15         14         13         12         11         10          9
         2          3          4          5          6          7          8
         1
15 rows selected.
ops$tkyte%ORA11GR2> 
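The bucketing arithmetic used above — d = trunc((r+7-1.99)/7) for the row and m = mod(r-2+7, 7) for the column — can be checked with a small Python sketch. This is a hypothetical re-implementation of the first (each row descending) variant:

```python
import math

def grid_desc(n=99, width=7):
    rows = {}
    for r in range(1, n + 1):
        d = math.trunc((r + width - 1.99) / width)   # row bucket, top row has the highest d
        m = (r - 2 + width) % width                  # position within the bucket
        # id1 corresponds to m = width-1, id7 to m = 0 (descending layout)
        rows.setdefault(d, [None] * width)[width - 1 - m] = r
    # order rows by d descending, as the SQL's ORDER BY d DESC does
    return [rows[d] for d in sorted(rows, reverse=True)]

g = grid_desc()
print(g[0])    # top row of the grid
print(g[-1])   # bottom row of the grid
```

The top row comes out as 99 down to 93, and the bottom row holds the lone 1, matching the first result set.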
 
 
 
 
Ouch
Chuck, January   17, 2012 - 1:28 pm UTC
 
 
To me #3 wins the "Least Likely to be Useful" award hands down.
That that 'with' clause passes the compiler, let alone produces more than one row, simply makes my head hurt.
SQL fun at its best. Thanks. 
 
Analytics and Tom both rock.
A reader, January   18, 2012 - 7:32 am UTC
 
 
Thanks Tom.
I could not explain my requirement, but you have provided me the output. 
January   18, 2012 - 7:44 am UTC 
 
what did analytics have to do with anything?  We didn't use any in this particular problem. 
 
 
Alternative approach using pivot
AndyP, January   30, 2012 - 5:10 am UTC
 
 
Tom has of course provided the exact answer to the question, and it probably isn't worth pursuing for its own sake, but I just wondered how to achieve the same results using the PIVOT syntax, which might be of interest to someone.
In doing so I found that the solution provided relies on knowing the starting value of the dataset being used, so I've made it work for any highest number, not just 99.
This method seems to require more work generating the data in the right format, but then the presentation using PIVOT is straightforward.
(I too found the WITH clause a surprising formulation, so I used a more explicit one.)
col rowval noprint
prompt This one has each row descending
with 
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
prompt This one has each row ascending
with
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(6 as id1,5 as id2,4 as id3,3 as id4,2 as id5,1 as id6,0 as id7))
order by rowval
/
prompt This one has each row ascending too, achieved by reversing the data rather than altering the pivot column sequence
with
data as(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as(select r dataval,trunc((rownum-0.1)/7) rowval,6-mod((rownum-1),7) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
prompt This one has rows alternately descending and ascending, achieved by ordering the data like that before pivoting
with
data as
(select r from (select level r from dual connect by level <=25) order by r desc),
datasets as
(select r dataval,trunc((rownum-0.1)/7) rowval,decode(mod(trunc((rownum-0.1)/7),2),0,mod((rownum-1),7),6-mod((rownum-1),7)) colval from data)
select * from datasets pivot(max(dataval) for colval in(0 as id1,1 as id2,2 as id3,3 as id4,4 as id5,5 as id6,6 as id7))
order by rowval
/
SQL > @pivdown
This one has each row descending
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        25         24         23         22         21         20         19
        18         17         16         15         14         13         12
        11         10          9          8          7          6          5
         4          3          2          1
This one has each row ascending
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        19         20         21         22         23         24         25
        12         13         14         15         16         17         18
         5          6          7          8          9         10         11
                                          1          2          3          4
This one has each row ascending too, achieved by reversing the data rather than altering the pivot column sequence
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        19         20         21         22         23         24         25
        12         13         14         15         16         17         18
         5          6          7          8          9         10         11
                                          1          2          3          4
This one has rows alternately descending and ascending, achieved by ordering the data like that before pivoting
       ID1        ID2        ID3        ID4        ID5        ID6        ID7
---------- ---------- ---------- ---------- ---------- ---------- ----------
        25         24         23         22         21         20         19
        12         13         14         15         16         17         18
        11         10          9          8          7          6          5
                                          1          2          3          4
 
 
January   31, 2012 - 5:48 pm UTC 
 
;) 
thanks 
 
 
RANGE BETWEEN UNBOUNDED PRECEDING confusion
biswaranjan., April     05, 2012 - 6:29 am UTC
 
 
Hi Tom,
I am confused by the output of the query below.
The emp table has the 14 rows below.
7369 SMITH CLERK 7902 17-DEC-80 800  20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975  20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850  30
7782 CLARK MANAGER 7839 09-JUN-81 2450  10
7788 SCOTT ANALYST 7566 19-APR-87 3000  20
7839 KING PRESIDENT  17-NOV-81 5000  10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100  20
7900 JAMES CLERK 7698 03-DEC-81 950  30
7902 FORD ANALYST 7566 03-DEC-81 3001  20
7934 MILLER CLERK 7782 23-JAN-82 1300  10
.........
SELECT deptno, empno, sal,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal
 DEPTNO  EMPNO   SAL CNT_LT_HALF CNT_MT_HALF
------- ------ ----- ----------- -----------
     20   7369   800           0           3
     20   7876  1100           0           3
     20   7566  2975           2           0
     20   7788  3000           2           0
     20   7902  3000           2           0
     30   7900   950           0           3
     30   7521  1250           0           1
     30   7654  1250           0           1
     30   7844  1500           0           1
     30   7499  1600           0           1
     30   7698  2850           3           0
11 rows selected.
Can you please explain the output of the CNT_LT_HALF and CNT_MT_HALF columns? I thought it through myself and had no trouble with the cnt_lt_half column's output, but I am confused by the cnt_mt_half column's.
Could you please explain the logic?
thanks and regards,
Biswaranjan. 
April     06, 2012 - 10:17 am UTC 
 
does this make more sense:
ops$tkyte%ORA11GR2> break on deptno skip 1
ops$tkyte%ORA11GR2> SELECT deptno, empno, sal,
  2         Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
  3           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) cnt1,
  4         first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
  5           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) fv1,
  6         last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
  7           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) lv1,
  8         COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
  9           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) cnt2,
 10         first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
 11           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) fv2,
 12         last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
 13           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) lv2
 14  FROM emp
 15  WHERE deptno IN (20, 30)
 16  ORDER BY deptno, sal
 17  /
DEPTNO EMPNO   SAL  CNT1   FV1   LV1  CNT2   FV2   LV2
------ ----- ----- ----- ----- ----- ----- ----- -----
    20  7369   800     0                 3  7566  7902
        7876  1100     0                 3  7566  7902
        7566  2975     2  7369  7876     0
        7788  3000     2  7369  7876     0
        7902  3000     2  7369  7876     0
    30  7900   950     0                 3  7844  7698
        7521  1250     0                 1  7698  7698
        7654  1250     0                 1  7698  7698
        7844  1500     0                 1  7698  7698
        7499  1600     0                 1  7698  7698
        7698  2850     3  7900  7654     0
11 rows selected.
see what the empno ranges are now - you can see exactly which records are included in the range... 
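The two RANGE frames can also be sketched procedurally. This hypothetical re-implementation makes the frame boundaries explicit for the dept 20/30 sample rows: the "preceding" frame holds rows with sal <= current sal - sal/2, and the "following" frame holds rows with sal >= current sal + sal/2, within the same deptno:

```python
emp = [  # (deptno, empno, sal) -- the dept 20/30 rows from the example
    (20, 7369,  800), (20, 7876, 1100), (20, 7566, 2975),
    (20, 7788, 3000), (20, 7902, 3000),
    (30, 7900,  950), (30, 7521, 1250), (30, 7654, 1250),
    (30, 7844, 1500), (30, 7499, 1600), (30, 7698, 2850),
]

def frame_counts(rows):
    out = []
    for deptno, empno, sal in rows:
        part = [s for d, _, s in rows if d == deptno]   # the deptno partition
        # RANGE BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING
        cnt_lt_half = sum(1 for s in part if s <= sal - sal / 2)
        # RANGE BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING
        cnt_mt_half = sum(1 for s in part if s >= sal + sal / 2)
        out.append((deptno, empno, sal, cnt_lt_half, cnt_mt_half))
    return out

for row in frame_counts(emp):
    print(row)
```

For empno 7369 (sal 800) this gives 0 rows at or below 400 and 3 rows at or above 1200, matching CNT_LT_HALF = 0 and CNT_MT_HALF = 3 above.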
 
 
A reader, April     06, 2012 - 12:24 pm UTC
 
 
Dear Sir,
The range for PRECEDING is from the unlimited lower bound to sal/2, and for FOLLOWING it is from sal/2 to the unlimited upper bound.
My understanding above matches the output for the PRECEDING clause, but not for FOLLOWING.
For example, for the record with deptno=20 and empno=7369, the range should be 400 to unlimited.
Can you please clarify this?
Thanks in Advance
 
April     08, 2012 - 5:44 pm UTC 
 
does this help:
ops$tkyte%ORA11GR2> SELECT deptno, empno, sal,
  2         'btwn 1st and ' || (sal-sal/2) txt1,
  3         Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
  4           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) cnt1,
  5         first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
  6           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) fv1,
  7         last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
  8           BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) lv1,
  9             'btwn ' || (sal+sal/2) || ' and last' txt2,
 10         COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
 11           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) cnt2,
 12         first_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
 13           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) fv2,
 14         last_value(empno) OVER (PARTITION BY deptno ORDER BY sal RANGE
 15           BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) lv2
 16  FROM emp
 17  WHERE deptno IN (20, 30)
 18  ORDER BY deptno, sal
 19  /
DEPTNO EMPNO   SAL TXT1               CNT1   FV1   LV1 TXT2                CNT2   FV2   LV2
------ ----- ----- ----------------- ----- ----- ----- ------------------ ----- ----- -----
    20  7369   800 btwn 1st and 400      0             btwn 1200 and last     3  7566  7902
        7876  1100 btwn 1st and 550      0             btwn 1650 and last     3  7566  7902
        7566  2975 btwn 1st and 1487     2  7369  7876 btwn 4462.5 and la     0
                   .5                                  st
        7788  3000 btwn 1st and 1500     2  7369  7876 btwn 4500 and last     0
        7902  3000 btwn 1st and 1500     2  7369  7876 btwn 4500 and last     0
    30  7900   950 btwn 1st and 475      0             btwn 1425 and last     3  7844  7698
        7521  1250 btwn 1st and 625      0             btwn 1875 and last     1  7698  7698
        7654  1250 btwn 1st and 625      0             btwn 1875 and last     1  7698  7698
        7844  1500 btwn 1st and 750      0             btwn 2250 and last     1  7698  7698
        7499  1600 btwn 1st and 800      0             btwn 2400 and last     1  7698  7698
        7698  2850 btwn 1st and 1425     3  7900  7654 btwn 4275 and last     0
11 rows selected.
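For anyone who wants to sanity-check the window arithmetic above outside the database, here is a small Python sketch (not from the thread; the deptno 30 salaries are copied from the output) that recomputes cnt1 and cnt2 by hand:

```python
# Recompute the RANGE-window counts from the output above by hand.
# With ORDER BY sal, RANGE bounds are inclusive and sal-based:
#   cnt1 = rows in the partition with sal <= current sal - sal/2
#   cnt2 = rows in the partition with sal >= current sal + sal/2
dept30 = [950, 1250, 1250, 1500, 1600, 2850]  # deptno 30 salaries

def range_counts(sals):
    out = []
    for s in sals:
        cnt1 = sum(1 for x in sals if x <= s - s / 2)  # UNBOUNDED PRECEDING .. (sal/2) PRECEDING
        cnt2 = sum(1 for x in sals if x >= s + s / 2)  # (sal/2) FOLLOWING .. UNBOUNDED FOLLOWING
        out.append((s, cnt1, cnt2))
    return out

for sal, c1, c2 in range_counts(dept30):
    print(sal, c1, c2)
```

The printed counts match the CNT1 and CNT2 columns for deptno 30 in the SQL output above.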
 
 
 
unbounded following
Biswaranjan, April     07, 2012 - 12:28 am UTC
 
 
Hi Tom,
First of all, thanks for your quick reply.
I finally got it after thinking deeply about your query's output.
@reader
You were thinking the way I was thinking, so you ran into the same confusion.
For your example, for the record deptno=20 and empno=7369, the range runs from 400 ahead of the current sal to unlimited.
So the lower limit will be 800 + 800/2 = 1200 and the upper limit will be unlimited, so obviously the count for that record will be 3.
thanks again tom.
regards,
Biswaranjan 
 
A reader, April     07, 2012 - 4:20 pm UTC
 
 
@Biswaranjan - Thanks, I understand that from the output for FOLLOWING, but the same does not work for PRECEDING; maybe I am missing something.
For example, when processing deptno=20 and empno=7876 for PRECEDING, the range would run from unbounded preceding up to 800 (the previous record's salary) + 800/2 = 1200 as the upper limit,
so shouldn't the count in this case be 1 rather than 0?
Is it that for PRECEDING the range is from unlimited to sal/2 of the previous record, and for FOLLOWING it is from sal*1.5 to unlimited following?
@Tom can you please clarify?
Thanks in advance
Thanks in Advance 
 
range window
biswaranjan, April     09, 2012 - 1:06 am UTC
 
 
@reader 
For PRECEDING the range should be unbounded preceding to sal - sal/2, and for FOLLOWING
the range should be sal + sal/2 to unbounded following.
You got the wrong output because for PRECEDING you added sal/2 to sal, where you should have subtracted it from sal.
thanks,
Biswaranjan. 
 
range window with example
biswaranjan, April     09, 2012 - 1:15 am UTC
 
 
@reader
Hope the example below clarifies your doubt completely.
SELECT deptno, empno, sal,sal-sal/4,sal+sal/4,
Count(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/4) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/4) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal;
DEPTNO EMPNO  SAL SAL-SAL/4 SAL+SAL/4 PRE_CNT FOLLOW_CNT
    20  7369  800       600      1000       0          4
    20  7876 1100       825      1375       1          3
    20  7566 2975   2231.25   3718.75       2          0
    20  7788 3000      2250      3750       2          0
    20  7902 3000      2250      3750       2          0
    30  7900  950     712.5    1187.5       0          5
    30  7521 1250     937.5    1562.5       0          2
    30  7654 1250     937.5    1562.5       0          2
    30  7844 1500      1125      1875       1          1
    30  7499 1600      1200      2000       1          1
    30  7698 2850    2137.5    3562.5       5          0
The PRE_CNT column counts rows with sal at or below (sal - sal/4), and the FOLLOW_CNT column counts rows with sal at or above (sal + sal/4).
thanks,
Biswaranjan.
 
 
A reader, April     09, 2012 - 10:53 am UTC
 
 
Thank you very much Biswaranjan!! 
 
analytic performance
Biswaranjan, April     18, 2012 - 4:04 am UTC
 
 
Hi Tom,
hope you are doing fine :).
Can you please tell which of the two queries below is better for performance?
select ename,deptno,(
select max(sal) from emp e1 
where emp.deptno = e1.deptno ) max_sal_deptwise 
from emp 
order by deptno;
select ename,deptno,max(sal) over(partition by deptno) max_sal_deptwise from emp;
thanks,
Biswaranjan 
April     18, 2012 - 8:01 am UTC 
 
In almost every case, the analytic would be superior to the scalar subquery.
An exception might be if you wanted to get the first row from the query as fast as possible - the scalar subquery might have a slight advantage there.
ops$tkyte%ORA11GR2> set linesize 1000
ops$tkyte%ORA11GR2> set echo on
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> /*
ops$tkyte%ORA11GR2> drop table t;
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create table t
ops$tkyte%ORA11GR2> as
ops$tkyte%ORA11GR2> select object_name ename, object_id empno,
ops$tkyte%ORA11GR2>        mod(object_id,1000) deptno, mod(object_id,1000) sal,
ops$tkyte%ORA11GR2>            a.*
ops$tkyte%ORA11GR2>   from all_objects a
ops$tkyte%ORA11GR2> /
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> create index t_idx on t(deptno,sal);
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'T' );
ops$tkyte%ORA11GR2> */
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> variable n number
ops$tkyte%ORA11GR2> set autotrace traceonly
ops$tkyte%ORA11GR2> exec  :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select ename,deptno,(
  2  select max(sal) from t e1
  3  where emp.deptno = e1.deptno ) max_sal_deptwise
  4  from t  emp
  5  order by deptno;
72889 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4042721663
----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       | 72889 |  2064K|       |   978   (1)| 00:00:12 |
|   1 |  SORT AGGREGATE              |       |     1 |     8 |       |            |          |
|   2 |   FIRST ROW                  |       |     1 |     8 |       |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T_IDX |     1 |     8 |       |     2   (0)| 00:00:01 |
|   4 |  SORT ORDER BY               |       | 72889 |  2064K|  2872K|   978   (1)| 00:00:12 |
|   5 |   TABLE ACCESS FULL          | T     | 72889 |  2064K|       |   395   (1)| 00:00:05 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("E1"."DEPTNO"=:B1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      17441  consistent gets
          0  physical reads
          0  redo size
    2767962  bytes sent via SQL*Net to client
      53869  bytes received via SQL*Net from client
       4861  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      72889  rows processed
ops$tkyte%ORA11GR2> exec dbms_output.put_line( (dbms_utility.get_cpu_time-:n) || ' hsecs' );
37 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> exec  :n := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> select ename,deptno,max(sal) over(partition by deptno) max_sal_deptwise from t;
72889 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2335850315
-----------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 72889 |  2348K|       |  1049   (1)| 00:00:13 |
|   1 |  WINDOW SORT       |      | 72889 |  2348K|  3160K|  1049   (1)| 00:00:13 |
|   2 |   TABLE ACCESS FULL| T    | 72889 |  2348K|       |   395   (1)| 00:00:05 |
-----------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1422  consistent gets
          0  physical reads
          0  redo size
    2767962  bytes sent via SQL*Net to client
      53869  bytes received via SQL*Net from client
       4861  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      72889  rows processed
ops$tkyte%ORA11GR2> exec dbms_output.put_line( (dbms_utility.get_cpu_time-:n) || ' hsecs' );
32 hsecs
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> set autotrace off
 
 
 
Thanks.
Biswaranjan, April     18, 2012 - 11:34 pm UTC
 
 
Hi Tom,
Thanks for a nice and simple explanation.
You simply rock!
One little request: can you please tell me which book is good for a developer who is a beginner at performance tuning? (Sorry if I am not supposed to post these requests.)
thanks,
Biswaranjan. 
April     19, 2012 - 5:44 am UTC 
 
You can post these requests - but I always say the same thing:
I don't know of a single book - or even a set of books - that can teach a developer "the 15 steps to performance". The optimizer is that checklist.
My approach to performance has always been:
a) understand how it (the database) works
b) know as much as you can about your options
c) understand how those options work in general
d) put a, b, c together logically to say "these are probably the best ways"
e) benchmark them.
Just like I did above.
and then with experience - like 5-10-15 years - you'll be able to recognize general patterns/problems and be able to solve them without even thinking about it. 
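Step (e), benchmarking, is the one people skip most often. The idea applies outside the database too; here is a minimal, illustrative Python sketch of "benchmark the candidates" (all names and data here are made up for the example - they are not from the thread):

```python
import time

def benchmark(fn, runs=5):
    """Time fn over several runs and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best

# Two candidates for "max value per group", mirroring the scalar-subquery
# vs. analytic comparison earlier in this thread (as pure-Python stand-ins).
data = [("a", i % 7) for i in range(5000)] + [("b", i % 11) for i in range(5000)]

def per_row_max():
    # re-scan the whole data set for every row (scalar-subquery style)
    return [max(v for g2, v in data if g2 == g) for g, _ in data[:200]]

def precomputed_max():
    # compute each group's max once, then look it up (analytic style)
    maxes = {}
    for g, v in data:
        maxes[g] = max(maxes.get(g, v), v)
    return [maxes[g] for g, _ in data[:200]]

assert per_row_max() == precomputed_max()   # same answer first...
print(benchmark(per_row_max), benchmark(precomputed_max))  # ...then compare times
```

The point is the shape of the exercise, not the numbers: verify the candidates agree, then measure them under the same conditions.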
 
 
answer says it all!
Ravi B, April     19, 2012 - 3:50 pm UTC
 
 
 
 
WOW nice
Biswaranjan, April     19, 2012 - 11:36 pm UTC
 
 
Thanks a lot.
I will follow the vital points you mentioned.
A long way to go for me (3.5 years in Oracle :)).
I will try to resolve performance issues myself; if I am not able to, I will surely post on this site.
thanks again,
Biswaranjan 
 
scnhealthcheck.sql script for Oracle version 9.2.0.1.0
Menaka, May       16, 2012 - 6:54 am UTC
 
 
Hi Tom
We're searching for the scnhealthcheck.sql script for Oracle version 9.2.0.1.0.
Could you please help us download this?
Thanks 
May       17, 2012 - 2:16 am UTC 
 
please utilize support for something like this. 
 
 
Grouping moving set
Sanji, May       24, 2012 - 1:07 pm UTC
 
 
Tom,
I'm trying to write a query that would return one record for each consecutive group from a set of entries coming into the system.
CREATE TABLE EQP_TEST (SEQ_NUM INT, JTYPE CHAR(1),EQP_ID INT, EQP_AMT INT, DATE_TIME TIME);
INSERT INTO EQP_TEST VALUES(1,'A',10,10,CAST ('10:00:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(2,'A',10,10,CAST ('12:00:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(3,'A',10,10,CAST ('12:30:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(4,'B',10,20,CAST ('12:35:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(5,'B',10,20,CAST ('13:00:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(6,'C',20,40,CAST ('13:30:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(7,'C',20,40,CAST ('14:00:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(8,'A',10,10,CAST ('14:30:00' AS  TIME));
INSERT INTO EQP_TEST VALUES(9,'C',20,40,CAST ('15:00:00' AS  TIME));
What i need is 
SEQ_NUM JTYPE EQP_ID EQP_AMT DATE_TIME
      3 A         10      10 12:30:00
      5 B         10      20 13:00:00
      7 C         20      40 14:00:00
      8 A         10      10 14:30:00
      9 C         20      40 15:00:00
I know that max(date_time) over (partition by jtype rows unbounded following) would also include seq_num 8 for JTYPE 'A', which is what I do not want.
I am unable to come up with a way of grouping sequence numbers 1, 2, 3 as one group, 4 and 5 as another, and so on.
Any help is much appreciated.
Thanks 
May       24, 2012 - 3:10 pm UTC 
 
I'm sorry but did you mean to ask "askbill" or someone else this question?
CREATE TABLE EQP_TEST (SEQ_NUM INT, JTYPE CHAR(1),EQP_ID INT, EQP_AMT INT, 
DATE_TIME TIME);
doesn't really compute in Oracle - and any answer I gave you probably wouldn't work in whatever else you are actually using. 
 
 
Grouping moving set
Sanji, May       24, 2012 - 4:10 pm UTC
 
 
My apologies. I should have known better.
Could you take the time to explain how this requirement can be achieved in Oracle?
CREATE TABLE EQP_TEST (SEQ_NUM NUMBER, JTYPE CHAR(1), EQP_ID NUMBER, EQP_AMT NUMBER, DATE_TIME DATE);
INSERT INTO EQP_TEST VALUES(1,'A',10,10, TO_DATE('2012-05-01:10:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(2,'A',10,10, TO_DATE('2012-05-01:10:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(3,'A',10,10, TO_DATE('2012-05-01:11:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(4,'B',10,20, TO_DATE('2012-05-01:12:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(5,'B',10,20, TO_DATE('2012-05-01:13:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(6,'C',20,40, TO_DATE('2012-05-01:14:00:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(7,'C',20,40, TO_DATE('2012-05-01:14:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(8,'A',10,10, TO_DATE('2012-05-01:15:30:00','YYYY-MM-DD:HH24:MI:SS'));
INSERT INTO EQP_TEST VALUES(9,'D',20,40, TO_DATE('2012-05-01:15:45:00','YYYY-MM-DD:HH24:MI:SS'));
ALTER SESSION SET NLS_DATE_FORMAT = 'HH24:MI:SS';
SELECT * FROM EQP_TEST;
   SEQ_NUM J     EQP_ID    EQP_AMT DATE_TIM
---------- - ---------- ---------- --------
         1 A         10         10 10:00:00
         2 A         10         10 10:30:00
         3 A         10         10 11:30:00
         4 B         10         20 12:30:00
         5 B         10         20 13:00:00
         6 C         20         40 14:00:00
         7 C         20         40 14:30:00
         8 A         10         10 15:30:00
         9 D         20         40 15:45:00
I'm looking for 
         3 A         10         10 11:30:00
         5 B         10         20 13:00:00
         7 C         20         40 14:30:00
         8 A         10         10 15:30:00
         9 D         20         40 15:45:00
Appreciate you input.
Thanks
Sanji 
May       25, 2012 - 8:51 am UTC 
 
I made the assumption you wanted to keep the eqp_* values from the last row in each group
ops$tkyte%ORA11GR2> select *
  2    from (
  3  select seq_num,
  4         jtype,
  5         eqp_id,
  6         eqp_amt,
  7             decode( lead(jtype) over (order by date_time), jtype, null, '<=' ) end_of_grp,
  8         date_time
  9    from eqp_test
 10         )
 11   where end_of_grp is not null
 12   order by date_time
 13  /
   SEQ_NUM J     EQP_ID    EQP_AMT EN DATE_TIM
---------- - ---------- ---------- -- --------
         3 A         10         10 <= 11:30:00
         5 B         10         20 <= 13:00:00
         7 C         20         40 <= 14:30:00
         8 A         10         10 <= 15:30:00
         9 D         20         40 <= 15:45:00
ops$tkyte%ORA11GR2> 
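The procedural equivalent of that decode/lead trick - keep only the last row of each consecutive run of jtype - can be sketched in a few lines of Python (illustrative only; the rows are Sanji's test data, already ordered by date_time):

```python
from itertools import groupby

# rows: (seq_num, jtype, eqp_id, eqp_amt, date_time), ordered by date_time
rows = [
    (1, 'A', 10, 10, '10:00:00'),
    (2, 'A', 10, 10, '10:30:00'),
    (3, 'A', 10, 10, '11:30:00'),
    (4, 'B', 10, 20, '12:30:00'),
    (5, 'B', 10, 20, '13:00:00'),
    (6, 'C', 20, 40, '14:00:00'),
    (7, 'C', 20, 40, '14:30:00'),
    (8, 'A', 10, 10, '15:30:00'),
    (9, 'D', 20, 40, '15:45:00'),
]

def last_of_each_run(rows):
    # group consecutive rows sharing the same jtype, keep the last row of each run
    return [list(grp)[-1] for _, grp in groupby(rows, key=lambda r: r[1])]

for r in last_of_each_run(rows):
    print(r)
```

itertools.groupby only groups adjacent equal keys, which is exactly why the second run of 'A' (seq_num 8) stays separate from the first - the same behavior the lead(jtype) comparison achieves in SQL.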
 
 
 
RE: Grouping moving set 
Duke Ganote, May       24, 2012 - 8:39 pm UTC
 
 
Tom has a name for this technique, although it escapes my memory at the moment. Just use the LEAD analytic function:
select e.*
  from (
select e.seq_num
     , e.jtype
     , e.eqp_id
     , e.eqp_amt
     , e.date_time
     , case LEAD(Jtype) OVER (order by seq_num)
       when Jtype then NULL
       else seq_num
        end as jagged
  from eqp_test e
) e
 where jagged IS NOT NULL
 order by seq_num
/
SEQ_NUM J     EQP_ID    EQP_AMT DATE_TIM     JAGGED
------- - ---------- ---------- -------- ----------
      3 A         10         10 11:30:00          3
      5 B         10         20 13:00:00          5
      7 C         20         40 14:30:00          7
      8 A         10         10 15:30:00          8
      9 D         20         40 15:45:00          9 
 
Grouping moving set 
Sanji, May       25, 2012 - 9:26 am UTC
 
 
Brilliant... Really appreciate all your help Tom/ Duke.
Thanks much...
Sanji 
 
Query
A reader, May       26, 2012 - 10:42 am UTC
 
 
Hi Tom,
I need to derive the PK_REF_COL column. It represents a value of the PK column, derived based on the PARENT_FK and DIR columns.
For each record we must find the next record (within the same PARENT_FK) whose DIR value differs from that of the current record. The order of the records is indicated by the ORDER_NO column.
PK  PARENT_FK  ORDER_NO  DIR  PK_REF_COL
10        100         1  N            12
11        100         2  N            12
12        100         3  S            13
13        100         4  N            14
14        100         5  S            15
15        100         6  N          NULL
16        200         1  N            20
17        200         2  N            20
18        200         3  N            20
19        200         4  N            20
20        200         5  S            21
21        200         6  N          NULL
22        300         1  S            23
23        300         2  N          NULL
24        300         3  N          NULL
25        300         4  N          NULL
 
query for PK_REF_COL for reader
Biswaranjan, May       26, 2012 - 9:36 pm UTC
 
 
Hi,
@reader 
############Input
create table table2(PK number,PARENT_FK number,ORDER_NO number,DIR char(1)); 
insert into table2 values(10, 100, 1, 'N');
insert into table2 values(11, 100, 2, 'N');
insert into table2 values(12, 100, 3, 'S');
insert into table2 values(13, 100, 4, 'N');
insert into table2 values(14, 100, 5, 'S');
insert into table2 values(15, 100, 6, 'N');
insert into table2 values(16, 200, 1, 'N');
insert into table2 values(17, 200, 2, 'N');
insert into table2 values(18, 200, 3, 'N');
insert into table2 values(19, 200, 4, 'N');
insert into table2 values(20, 200, 5, 'S');
insert into table2 values(21, 200, 6, 'N');
insert into table2 values(22, 300, 1, 'S');
insert into table2 values(23, 300, 2, 'N');
insert into table2 values(24, 300, 3, 'N');
insert into table2 values(25, 300, 4, 'N');
############ sql query
SELECT pk,parent_fk,order_no,dir
,decode(x,null,pk+to_number(last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY ROWNUM)-order_no),to_char(pk+1)) PK_REF_COL
FROM(
     SELECT table2.*
            ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir 
                  or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
                      then null
                  when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null 
                      then null
                  else order_no 
                  end   X 
      FROM table2) ask_all order by parent_fk,order_no;
########################output/result
        PK  PARENT_FK   ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
        10        100          1 N         12
        11        100          2 N         12
        12        100          3 S         13
        13        100          4 N         14
        14        100          5 S         15
        15        100          6 N
        16        200          1 N         20
        17        200          2 N         20
        18        200          3 N         20
        19        200          4 N         20
        20        200          5 S         21
        21        200          6 N
        22        300          1 S         23
        23        300          2 N
        24        300          3 N
        25        300          4 N
16 rows selected.
@reader
Please always provide the create table and insert statements so Tom or others can make use of them without wasting time writing their own.
If you are new to AskTom that is OK, but please make a habit of doing that. :)
Thanks & Regards,
Biswaranjan.
 
 
continuation to last posted result.
Biswaranjan, May       26, 2012 - 10:00 pm UTC
 
 
Hi,
@reader
I just removed the to_number call from the query I posted above.
#############
SELECT pk,parent_fk,order_no,dir
,decode(x,null,pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY ROWNUM)-order_no,pk+1) PK_REF_COL
FROM(
     SELECT table2.*
            ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir 
                  or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
                      then null
                  when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null 
                      then null
                  else order_no 
                  end   X 
      FROM table2)  order by parent_fk,order_no;
#########
regards,
Biswaranjan 
 
query
A reader, May       26, 2012 - 10:11 pm UTC
 
 
Thanks, I will.. 
 
Biswaranjan, May       27, 2012 - 7:38 am UTC
 
 
I was just checking my above posted query for the pk_ref_col result with different inputs, and found I had got the logic slightly wrong.
Test cases with different inputs ########
SQL> select * from table2;
        PK  PARENT_FK   ORDER_NO D
---------- ---------- ---------- -
        10        100          1 N
        11        100          2 N
        12        100          3 S
        13        100          4 N
        14        100          5 N
        15        100          6 S
        16        200          1 N
        17        200          2 N
        18        200          3 N
        19        200          4 N
        20        200          5 S
        21        200          6 N
        22        300          1 S
        23        300          2 N
        24        300          3 S
        25        300          4 N
16 rows selected.
SQL> SELECT pk,parent_fk,order_no,dir
  2  ,decode(x,null,pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY 
  3  ROWNUM)-order_no,pk+1) PK_REF_COL
  4  FROM(
  5       SELECT table2.*
  6              ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir 
  7                    or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
  8                        then null
  9                    when lag(dir) over (PARTITION BY PARENT_FK order by ORDER_NO desc) is null 
 10                        then null
 11                    else order_no 
 12                    end   X 
 13        FROM table2)  order by parent_fk,order_no;
        PK  PARENT_FK   ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
        10        100          1 N         12
        11        100          2 N         12
        12        100          3 S         13
        13        100          4 N
        14        100          5 N
        15        100          6 S
        16        200          1 N         20
        17        200          2 N         20
        18        200          3 N         20
        19        200          4 N         20
        20        200          5 S         21
        21        200          6 N
        22        300          1 S         23
        23        300          2 N         24
        24        300          3 S         25
        25        300          4 N
16 rows selected.
So the above run was not giving the expected result.
So I rewrote the query; I tried it with many inputs and found no errors.
@reader Please ignore the last posted query. Make use of the following query instead.
###############
SQL> SELECT pk,parent_fk,order_no,dir
  2  ,case when x is null 
  3       then  pk+last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY order_no desc)-order_no
  4       when s =1
  5       then null
  6       else 
  7       pk+1
  8       end pk_ref_col
  9  FROM(
 10       SELECT table2.*
 11              ,row_number() over (PARTITION BY PARENT_FK order by ORDER_NO desc) s
 12              ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir 
 13                    or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
 14                    then null
 15                    else order_no 
 16                    end   X 
 17        FROM table2)  order by parent_fk,order_no;
        PK  PARENT_FK   ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
        10        100          1 N         12
        11        100          2 N         12
        12        100          3 S         13
        13        100          4 N         15
        14        100          5 N         15
        15        100          6 S
        16        200          1 N         20
        17        200          2 N         20
        18        200          3 N         20
        19        200          4 N         20
        20        200          5 S         21
        21        200          6 N
        22        300          1 S         23
        23        300          2 N         24
        24        300          3 S         25
        25        300          4 N
16 rows selected.
Thanks & Regards,
Biswaranjan.
 
 
 
query
A reader, May       27, 2012 - 1:22 pm UTC
 
 
Hi Biswaranjan,
The pk column is not a consecutive number; it is random, but it is unique.
 
query
A reader, May       27, 2012 - 1:24 pm UTC
 
 
Biswaranjan,
I mean, in the example it is consecutive, but it could be random.
 
query for PK_REF_COL for reader for consecutive or non consecutive pk 
Biswaranjan, May       27, 2012 - 3:59 pm UTC
 
 
SQL> select * from table1;
        PK  PARENT_FK   ORDER_NO D
---------- ---------- ---------- -
        10        100          1 N
        11        100          2 N
        13        100          3 S
        14        100          4 N
        15        100          5 S
        16        100          6 N
        16        200          1 N
        17        200          2 N
        18        200          3 N
        19        200          4 N
        20        200          5 S
        21        200          6 N
        22        300          1 S
        24        300          2 N
        25        300          3 N
        28        300          4 N
16 rows selected.
SQL> SELECT pk,parent_fk,order_no,dir
  2      ,case when x is null 
  3           then last_value(X ignore nulls) OVER(partition by parent_fk ORDER BY order_no desc) 
  4           when s =1
  5           then null
  6           else 
  7           lag(pk) over(partition by parent_fk order by order_no desc)
  8           end pk_ref_col
  9      FROM(
 10          SELECT table1.*
 11                 ,row_number() over (PARTITION BY PARENT_FK order by ORDER_NO desc) s
 12                 ,case when lead(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir 
 13                       or lag(DIR) over (PARTITION BY PARENT_FK order by ORDER_NO desc)=dir
 14                       then null
 15                       else pk
 16                       end   X 
 17           FROM table1)  order by parent_fk,order_no;
        PK  PARENT_FK   ORDER_NO D PK_REF_COL
---------- ---------- ---------- - ----------
        10        100          1 N         13
        11        100          2 N         13
        13        100          3 S         14
        14        100          4 N         15
        15        100          5 S         16
        16        100          6 N
        16        200          1 N         20
        17        200          2 N         20
        18        200          3 N         20
        19        200          4 N         20
        20        200          5 S         21
        21        200          6 N
        22        300          1 S         24
        24        300          2 N
        25        300          3 N
        28        300          4 N
16 rows selected.
SQL> 
@reader, you can make use of the above query. Please check it with
various inputs and let us know whether it works (I tested it and it worked fine).
Thanks & Regards,
Biswaranjan.
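For completeness, the rule being implemented here - PK_REF_COL is the pk of the first later row in the same parent_fk whose DIR differs from the current row's - can be cross-checked with a short Python sketch (illustrative only, using the reader's original data):

```python
def pk_ref_col(rows):
    # rows: (pk, parent_fk, order_no, dir), assumed sorted by (parent_fk, order_no).
    # For each row, find the pk of the first following row in the same
    # parent_fk whose dir differs; None if there is no such row.
    out = []
    for i, (pk, fk, _, d) in enumerate(rows):
        ref = next((r[0] for r in rows[i + 1:] if r[1] == fk and r[3] != d), None)
        out.append((pk, ref))
    return out

rows = [
    (10, 100, 1, 'N'), (11, 100, 2, 'N'), (12, 100, 3, 'S'),
    (13, 100, 4, 'N'), (14, 100, 5, 'S'), (15, 100, 6, 'N'),
    (16, 200, 1, 'N'), (17, 200, 2, 'N'), (18, 200, 3, 'N'),
    (19, 200, 4, 'N'), (20, 200, 5, 'S'), (21, 200, 6, 'N'),
    (22, 300, 1, 'S'), (23, 300, 2, 'N'), (24, 300, 3, 'N'),
    (25, 300, 4, 'N'),
]
for pk, ref in pk_ref_col(rows):
    print(pk, ref)
```

This reproduces the PK_REF_COL values the reader asked for, and works whether or not pk is consecutive, since it never does arithmetic on pk itself.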
 
 
results that never changed over a certain period of time
Ashrf, June      18, 2012 - 5:32 am UTC
 
 
create table emps_salaries (emp_id number, r_year number(4), r_month number(2), r_code number, r_value number);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 8, 102, 10187.097);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 8, 221, 516.129);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 8, 597, 6000);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 8, 598, 3783.871);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 9, 102, 840);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 9, 221, 46.667);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 10, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 10, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 11, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 11, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 12, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2010, 12, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 1, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 1, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 2, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 2, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 3, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 3, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 4, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 4, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 5, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 5, 199, 918.333);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 5, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 6, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 6, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 6, 599, 918.333);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 7, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 7, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 8, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 8, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 9, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 9, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 10, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 10, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 11, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 11, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 12, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2011, 12, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 1, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 1, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 2, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 2, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 3, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 3, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 4, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 4, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 5, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 5, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 6, 102, 900);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 6, 221, 50);
insert into emps_salaries
   (emp_id, r_year, r_month, r_code, r_value)
 values
   (1000000, 2012, 6, 233, 150);
commit;
I want to write a SQL statement that decides whether a salary changed over a period of time supplied as bind variables (e.g. did the salary change in the period 10/2010 through 04/2011)?  THANKS  
June 18, 2012 - 9:09 am UTC 
 
you could have made the example a tad bit shorter.  
but this is trivial:
select count(distinct r_value)
  from emps_salaries
 where r_code = :code
   and r_year*100 + r_month between :x and :y;
if the count is > 1, it changed. 
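Spelled out against the emps_salaries table above - a sketch, assuming the period arrives as YYYYMM-style bind values (:from_ym, :to_ym) and that a single record code (:code, e.g. 102 for the base salary) identifies the component being checked; these bind names are illustrative, not from the original:

```sql
-- sketch: did the bound salary component change in the bound period?
-- :from_ym / :to_ym are assumed numbers such as 201010 and 201104
select case when count(distinct r_value) > 1
            then 'changed'
            else 'unchanged'
       end verdict
  from emps_salaries
 where emp_id = :emp_id
   and r_code = :code
   and r_year * 100 + r_month between :from_ym and :to_ym;
```

Filtering on r_code matters here: each month stores several codes (102, 221, ...), so a distinct count taken across all of them would always report a change.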
 
 
Help/guidance writing analytical function SQL
A reader, June 28, 2012 - 8:03 am UTC
 
 
DROP TABLE SK_T;
CREATE TABLE SK_T
(ITEM                VARCHAR2(25) NOT NULL,
 START_DATE          DATE       NOT NULL,
 END_DATE            DATE,
 PRICE_TYPE          VARCHAR2(20),
 PRICE               NUMBER(10,2));
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.5);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.75);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/15/2012', 'MM/DD/RRRR'), TO_DATE('07/22/2012', 'MM/DD/RRRR'), 'SALE', 1.75);
Is it possible to generate the following using just SQL. The Table can have 3-4 million rows.
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE
123456     07/01/2012      07/14/2012  REGULAR       2.00
123456     07/15/2012      07/22/2012  SALE          1.75
123456     07/23/2012      07/31/2012  REGULAR       2.00
123456     08/01/2012      08/31/2012  REGULAR       2.50
123456     09/01/2012      NULL        REGULAR       1.75
In the table Regular Priced items will only have the Start Date and Sale Priced will have a Start Date and End Date. 
If the Regular Priced Item's date range falls between a Sale priced item's date range then I have to generate a line for Start date and End Date and then a new row after the sale ends. 
 
June 29, 2012 - 9:17 am UTC 
 
explain your logic thoroughly - precisely.
and if this has to work for multiple items, by all means expand your test case to include multiple items so one can check for edge conditions.
but most importantly - explain this to death.  where did that Jul 23 - Jul 31 record magically appear from?  where did the 1.75 come from for September?  detail your logic.
 
 
 
Help/guidance writing analytical function SQL
A reader, June 28, 2012 - 8:43 am UTC
 
 
Continuation of my previous question.........
Using LEAD() I can generate the End Date value for regular items; my problem is generating new rows for the overlapping ranges
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE
123456     07/01/2012      07/31/2012  REGULAR       2.00
123456     07/15/2012      07/22/2012  SALE          1.75
123456     07/26/2012      07/30/2012  SALE          1.70
Should be converted to 
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE
123456     07/01/2012      07/31/2012  REGULAR       2.00
123456     07/15/2012      07/22/2012  SALE          1.75
123456     07/23/2012      07/25/2012  REGULAR       2.00
123456     07/26/2012      07/30/2012  SALE          1.70
123456     07/31/2012      07/31/2012  REGULAR       2.00 
June 29, 2012 - 9:22 am UTC 
 
why do you have overlapping ranges?  what if you have multiple overlapping ranges?  It doesn't make sense to store anything like that.  Are you basically saying "sale overrides regular", and do you guarantee that ONLY ONE sale will override any given regular??? 
 
 
Help/guidance writing analytical function SQL
A reader, June 29, 2012 - 3:46 pm UTC
 
 
The overlapping date range in the above example was a typing error. Sorry for the confusion.
We have a table that stores the regular prices of an item. It has item_id, a Start Date (effective date) for the price, and the Price. 
Example
ITEM       START_DATE      PRICE   
123456     08/01/2012      2.50
123456     09/01/2012      2.75
The regular price is valid until the next Start/Effective Date.
That means from Aug 01, 2012 through Aug 31, 2012 the price is $2.50, and from Sep 01 the price will be $2.75. Sep 01, 2012's price does not have an end date because there is no Start Date greater than Sep 01, 2012.
There is another table that has Sale/Special Prices on that item. That table has Item Id, Start date, End date and the Price.
Start date and End date are NOT NULL, and there won't be an overlap in the date ranges for the same item.
Example
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE   
123456     08/01/2012      08/06/2012  SALE          1.50
123456     08/21/2012      08/28/2012  SALE          1.75
That means the price of the item is $1.50 from Aug 01, 2012 through Aug 06, 2012, and then reverts back to the regular price after the sale period ends.
I need to combine the two tables and generate the price for the item with its corresponding date ranges.
Please note that after the sale price period ends, the price should revert back to the regular price.
For the above example we should get the following output
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE   
123456     08/01/2012      08/06/2012  SALE          1.50
123456     08/07/2012      08/20/2012  REGULAR       2.50
123456     08/21/2012      08/28/2012  SALE          1.75
123456     08/29/2012      08/31/2012  REGULAR       2.50
123456     09/01/2012      NULL        REGULAR       2.75
I am resending the test script with more data, for two items. 
Also, I am attaching a SQL I was able to come up with that gives the output in the above format. I don't know whether it's efficient or not.
DROP TABLE SK_T;
CREATE TABLE SK_T
(ITEM                VARCHAR2(25) NOT NULL,
 START_DATE          DATE       NOT NULL,
 END_DATE            DATE,
 PRICE_TYPE          VARCHAR2(20),
 PRICE               NUMBER(10,2));
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/15/2012', 'MM/DD/RRRR'), TO_DATE('07/22/2012', 'MM/DD/RRRR'), 'SALE', 1.50);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/23/2012', 'MM/DD/RRRR'), TO_DATE('07/26/2012', 'MM/DD/RRRR'), 'SALE', 1.25);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), TO_DATE('08/06/2012', 'MM/DD/RRRR'), 'SALE', 1.50);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/21/2012', 'MM/DD/RRRR'), TO_DATE('08/28/2012', 'MM/DD/RRRR'), 'SALE', 1.75);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/21/2012', 'MM/DD/RRRR'), TO_DATE('09/28/2012', 'MM/DD/RRRR'), 'SALE', 1.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('07/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.00);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('08/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.5);
INSERT INTO SK_T
VALUES
('123456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 2.75);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('07/17/2012', 'MM/DD/RRRR'), TO_DATE('07/24/2012', 'MM/DD/RRRR'), 'SALE', 7.50);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('09/01/2012', 'MM/DD/RRRR'), TO_DATE('09/07/2012', 'MM/DD/RRRR'), 'SALE', 8);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('06/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 11.50);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('07/25/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 10.75);
INSERT INTO SK_T
VALUES
('23456', TO_DATE('10/01/2012', 'MM/DD/RRRR'), NULL, 'REGULAR', 11.25);
COMMIT;
WITH a AS --Using this to generate Regular Price row after the Sale Ends.
(SELECT LEVEL L
 FROM DUAL
 CONNECT BY LEVEL <= 2),
I AS 
(SELECT ITEM, 
        START_DATE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 END_DATE
              WHEN END_DATE > LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE) THEN
                 LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE)-1
              ELSE
                 END_DATE
         END) END_DATE,
        LAG(PRICE_TYPE,1) OVER (PARTITION BY  ITEM  ORDER BY START_DATE) PREV_PRICE_TYPE,
        LEAD(PRICE_TYPE,1) OVER (PARTITION BY  ITEM  ORDER BY START_DATE) NEXT_PRICE_TYPE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN END_DATE END),1) IGNORE NULLS OVER (PARTITION BY  ITEM  ORDER BY  START_DATE,ORD_TYPE)
              ELSE
                 END_DATE
         END) ORIG_END_DATE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN PRICE END),1) IGNORE NULLS OVER (PARTITION BY  ITEM  ORDER BY START_DATE, ORD_TYPE)
         END) ORIG_PRICE,
        PRICE_TYPE,
        PRICE,
        ORD_TYPE,
        LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE) NEXT_SALE_START_DATE
 FROM (SELECT ITEM, 
              START_DATE, 
              NVL((CASE WHEN END_DATE IS NOT NULL THEN
                       END_DATE
                    ELSE
                       (LEAD(START_DATE,1) IGNORE NULLS 
                           OVER (PARTITION BY ITEM, PRICE_TYPE 
                                 ORDER BY START_DATE))-1
               END), TO_DATE('31-DEC-5099', 'DD-MON-YYYY')) END_DATE,
              PRICE_TYPE,
              PRICE,
              DECODE(PRICE_TYPE, 'REGULAR', 1, 2) ORD_TYPE
       FROM SK_T
        ORDER BY 1,2)),
items AS
(SELECT ITEM, START_DATE, END_DATE, PRICE_TYPE, PRICE, ORIG_END_DATE, PREV_PRICE_TYPE, NEXT_PRICE_TYPE, ORIG_PRICE, L,ORD_TYPE, NEXT_SALE_START_DATE
 FROM A, I
 WHERE (L=1
        OR (PRICE_TYPE='SALE' 
             AND END_DATE < ORIG_END_DATE 
      AND ((PREV_PRICE_TYPE ='REGULAR' AND NEXT_PRICE_TYPE='REGULAR') 
             OR (PREV_PRICE_TYPE ='SALE' AND NEXT_PRICE_TYPE='REGULAR')
             OR  NEXT_SALE_START_DATE-1 != END_DATE
             OR  NEXT_PRICE_TYPE IS NULL)))),
qry AS  
(SELECT ITEM, 
        (CASE WHEN L= 2 THEN
          (LAST_VALUE(END_DATE) OVER (PARTITION BY  ITEM  ORDER BY START_DATE,ORD_TYPE, LEVEL))+1
       ELSE
          START_DATE
  END) START_DATE, 
 (CASE WHEN L= 2 THEN
          (CASE WHEN (NEXT_SALE_START_DATE-1) < ORIG_END_DATE THEN
                   NEXT_SALE_START_DATE-1
                ELSE
                   ORIG_END_DATE
           END)
       ELSE
          END_DATE
  END) END_DATE,  
 (CASE WHEN L= 2 THEN
          'REGULAR'
       ELSE
          PRICE_TYPE
  END) PRICE_TYPE, 
 (CASE WHEN L= 2 THEN
          ORIG_PRICE
       ELSE
          PRICE
  END) PRICE 
 FROM items)
SELECT item, start_date, end_date, price_type, price
FROM qry
WHERE start_date <= NVL(end_date,start_date)
ORDER BY 1,2;
 
July 02, 2012 - 6:52 am UTC 
 
There is another table that has Sale/Special Prices on that item. 
then why the heck does your example have but one table?  why would you do that - why would you say "i have two" and then give me but one?  How can someone develop an answer for you if you do stuff like that?  Your real life problem has nothing to do with your test case???
I doubt your query works if there are two sales in the window of a regular price - that is, if you have a regular price for the month of August, but it goes on sale on the 5th-10th and the 20th-25th.  You only produce two rows of a regular price - you might need N rows.
when you run this query:
a) will you do this for an item or for N items (where N is generally what number) or all items?
b) how large do you anticipate the date ranges you have to deal with being?  weeks, months, years?
c) please fix your example to model REALITY, if you have two tables - please use that in your example - it would be a waste of my time to provide a solution that doesn't work in real life. 
 
 
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 8:37 am UTC
 
 
The reason I didn't give the example with two tables was that I thought it's only a UNION ALL statement between the two tables, and I didn't want to confuse you with a lot of details
a) will you do this for an item or for N items (where N is generally what number) or all items?
The query is for all items in the tables (SK_REG_TEMP and SK_SALE_TEMP).
There can be items in SK_REG_TEMP and not in SK_SALE_TEMP and vice versa.
b) how large do you anticipate the date ranges you have to deal with being? weeks, months, years?
Most of the date ranges will be weeks, but there will be a few that span months (4-6 months)
c) please fix your example to model REALITY, if you have two tables - please use that in your example - it would be a waste of my time to provide a solution that doesn't work in real life. 
Please find the updated test case below.
And sorry for the confusion.
DROP TABLE SK_REG_TEMP;
DROP TABLE SK_SALE_TEMP;
CREATE TABLE SK_REG_TEMP
(ITEM                VARCHAR2(25) NOT NULL,
 START_DATE          DATE       NOT NULL,
 PRICE               NUMBER(10,2));
 
CREATE TABLE SK_SALE_TEMP
(ITEM                VARCHAR2(25) NOT NULL,
 START_DATE          DATE       NOT NULL,
 END_DATE            DATE       NOT NULL,
 PRICE               NUMBER(10,2));
 
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/01/2012', 'MM/DD/RRRR'), 2.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('08/15/2012', 'MM/DD/RRRR'), 2.50);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('12/15/2012', 'MM/DD/RRRR'), 2.75);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('07/15/2012', 'MM/DD/RRRR'), 25.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('01/15/2013', 'MM/DD/RRRR'), 24.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM3', TO_DATE('07/23/2012', 'MM/DD/RRRR'), 15.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM4', TO_DATE('07/12/2012', 'MM/DD/RRRR'), 31.00);
INSERT INTO SK_REG_TEMP
(ITEM, START_DATE, PRICE)
VALUES
('ITEM4', TO_DATE('01/15/2013', 'MM/DD/RRRR'), 33.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/08/2012', 'MM/DD/RRRR'), TO_DATE('07/15/2012', 'MM/DD/RRRR'),  1.50);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('07/23/2012', 'MM/DD/RRRR'), TO_DATE('07/30/2012', 'MM/DD/RRRR'),  1.75);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM1', TO_DATE('08/15/2012', 'MM/DD/RRRR'), TO_DATE('01/15/2013', 'MM/DD/RRRR'),  1.90);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('07/19/2012', 'MM/DD/RRRR'), TO_DATE('07/26/2012', 'MM/DD/RRRR'),  22.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM2', TO_DATE('09/23/2012', 'MM/DD/RRRR'), TO_DATE('09/30/2012', 'MM/DD/RRRR'),  24.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM3', TO_DATE('08/14/2012', 'MM/DD/RRRR'), TO_DATE('08/21/2012', 'MM/DD/RRRR'),  12.00);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM5', TO_DATE('10/20/2012', 'MM/DD/RRRR'), TO_DATE('10/27/2012', 'MM/DD/RRRR'),  15.70);
INSERT INTO SK_SALE_TEMP
(ITEM, START_DATE, END_DATE, PRICE)
VALUES
('ITEM5', TO_DATE('12/21/2012', 'MM/DD/RRRR'), TO_DATE('02/15/2013', 'MM/DD/RRRR'),  14.70);
COMMIT;
Desired Output........
ITEM       START_DATE      END_DATE    PRICE_TYPE    PRICE   
ITEM1      07/01/2012      07/07/2012  REG           2.00  
ITEM1      07/08/2012      07/15/2012  SALE          1.50 
ITEM1      07/16/2012      07/22/2012  REG           2.00  
ITEM1      07/23/2012      07/30/2012  SALE          1.75
ITEM1      07/31/2012      08/14/2012  REG           2.00 
ITEM1      08/15/2012      01/15/2013  SALE          1.90
ITEM1      01/16/2013      NULL        REG           2.75
ITEM2      07/15/2012      07/18/2012  REG           25.00
ITEM2      07/19/2012      07/26/2012  SALE          22.00
ITEM2      07/27/2012      09/22/2012  REG           25.00
ITEM2      09/23/2012      09/30/2012  SALE          24.00
ITEM2      10/01/2012      01/14/2013  REG           25.00
ITEM2      01/15/2013      NULL        REG           24.00
ITEM3      07/23/2012      08/13/2012  REG           15.00
ITEM3      08/14/2012      08/21/2012  SALE          12.00
ITEM3      08/22/2012      NULL        REG           15.00
ITEM4      07/12/2012      01/14/2013  REG           31.00
ITEM4      01/15/2013      NULL        REG           33.00
ITEM5      10/20/2012      10/27/2012  SALE          15.70
ITEM5      12/21/2012      02/15/2013  SALE          14.70
 
July 02, 2012 - 12:52 pm UTC 
 
you might consider a data model more suited to representing this data relationally....
It isn't pretty, but I'm pretty sure it works.
The concept is to turn the ranges into rows.  for the regular price we have to figure out the end date - which is easy for all of the rows except the last row in each item's set.  That "end date" will be the greatest of a) the day after the last observed regular price start, or b) the day after the last observed sale price end for that item.
Once we have the end date - it is easy to turn the single row into a row per day per record.  
Same with the sale dates.
then just join them (by_date)
and then do a bit of grouping to get the ranges back again, and a fancy case at the end to figure out whether to print out the end date or leave it null
ops$tkyte%ORA11GR2> with reg
  2  as
  3  (
  4  select item, start_date,
  5         nvl( lead(start_date-1) over (partition by item order by start_date),
  6              greatest( r.start_date+1,
  7                        (select max(s.end_date+1)
  8                           from sk_sale_temp s
  9                          where s.item = r.item)) ) end_date,
 10         price
 11    from sk_reg_temp r
 12  ),
 13  reg_by_date
 14  as
 15  (
 16  select item, start_date+column_value curr_date, start_date, end_date,
 17         price reg_price, cast( null as number ) sale_price, 'R' price_type
 18     from reg,
 19     TABLE( cast(multiset(select level-1 from dual
 20             connect by level <= end_date-start_date+1) as sys.odciNumberList) )
 21  ),
 22  sale_by_date
 23  as
 24  (select item, start_date+column_value curr_date, start_date, end_date,
 25          cast( null as number ) reg_price, price sale_price, 'S' price_type
 26     from sk_sale_temp,
 27     TABLE( cast(multiset(select level-1 from dual
 28             connect by level <= end_date-start_date+1) as sys.odciNumberList) )
 29  ),
 30  by_date
 31  as
 32  (
 33  select item, curr_date,
 34         case when max(price_type) = 'R'
 35              then max(reg_price)
 36              else max(sale_price)
 37          end the_price,
 38          max(price_type) the_price_type
 39    from (
 40  select *
 41    from reg_by_date
 42   union all
 43  select *
 44    from sale_by_date
 45         )
 46   group by item, curr_date
 47  )
 48  select item, sdate,
 49         case when the_price_type <> 'R' or edate <> last_date
 50              then edate
 51          end edate, the_price_type, the_price
 52    from (
 53  select item, min(curr_date) sdate, max(curr_date) edate, the_price, the_price_type,
 54         max( max(curr_date) ) over (partition by item) last_date
 55    from (
 56  select item, curr_date, the_price, the_price_type, count(flag) over (partition by item order by curr_date) grp
 57    from (
 58  select item, curr_date, the_price, the_price_type,
 59         case when lag(curr_date) over (partition by item order by curr_date) <> curr_date-1
 60                or lag(the_price) over (partition by item order by curr_date) <> the_price
 61                or lag(the_price_type) over (partition by item order by curr_date) <> the_price_type
 62              then 1
 63          end flag
 64    from by_date
 65         )
 66         )
 67   group by item, the_price, the_price_type, grp
 68         )
 69   order by 1, 2
 70  /
ITEM                      SDATE     EDATE     T  THE_PRICE
------------------------- --------- --------- - ----------
ITEM1                     01-JUL-12 07-JUL-12 R          2
                          08-JUL-12 15-JUL-12 S        1.5
                          16-JUL-12 22-JUL-12 R          2
                          23-JUL-12 30-JUL-12 S       1.75
                          31-JUL-12 14-AUG-12 R          2
                          15-AUG-12 15-JAN-13 S        1.9
                          16-JAN-13           R       2.75
ITEM2                     15-JUL-12 18-JUL-12 R         25
                          19-JUL-12 26-JUL-12 S         22
                          27-JUL-12 22-SEP-12 R         25
                          23-SEP-12 30-SEP-12 S         24
                          01-OCT-12 14-JAN-13 R         25
                          15-JAN-13           R         24
ITEM3                     23-JUL-12 13-AUG-12 R         15
                          14-AUG-12 21-AUG-12 S         12
                          22-AUG-12           R         15
ITEM4                     12-JUL-12 14-JAN-13 R         31
                          15-JAN-13           R         33
ITEM5                     20-OCT-12 27-OCT-12 S       15.7
                          21-DEC-12 15-FEB-13 S       14.7
20 rows selected.
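The row-expansion trick at the heart of this answer - one input row per date range becomes one row per day - can be seen in isolation. This is just the TABLE(CAST(MULTISET(...))) idiom from the reg_by_date and sale_by_date CTEs above, shown on a literal five-day range:

```sql
-- expand the range 01-JUL-2012 .. 05-JUL-2012 into one row per day
select date '2012-07-01' + column_value as curr_date
  from TABLE( cast( multiset(
           select level - 1
             from dual
          connect by level <= 5
       ) as sys.odciNumberList ) );
-- produces five rows: Jul 1 through Jul 5, 2012
```

In the full query the `5` is replaced by `end_date - start_date + 1`, so each range row drives its own CONNECT BY expansion.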
 
 
 
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 9:14 am UTC
 
 
The following query looks like it's working for the above test data. I have two sale rows in the regular price window for item ITEM1 (07/08/2012 to 07/15/2012 and 07/23/2012 to 07/30/2012). I am producing two rows for every sale row. 
It looks ugly and I don't know if it's efficient.
WITH a AS --Using this to generate Regular Price row after the Sale Ends.
(SELECT LEVEL L
 FROM DUAL
 CONNECT BY LEVEL <= 2),
STG AS
(SELECT ITEM, START_DATE, TO_DATE(NULL) END_DATE, 'REGULAR' PRICE_TYPE, PRICE
 FROM SK_REG_TEMP
 UNION ALL
 SELECT ITEM, START_DATE, END_DATE, 'SALE' PRICE_TYPE, PRICE
 FROM SK_SALE_TEMP),
I AS 
(SELECT ITEM,  
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 START_DATE
              WHEN RN > 1 AND START_DATE < LAG((CASE WHEN PRICE_TYPE='SALE' THEN END_DATE +1 END),1, SYSDATE-3000) OVER (PARTITION BY ITEM ORDER BY END_DATE) THEN
                 LAG((CASE WHEN PRICE_TYPE='SALE' THEN END_DATE +1 END),1) OVER (PARTITION BY ITEM ORDER BY END_DATE)
              ELSE
                 START_DATE
         END) START_DATE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 END_DATE
              WHEN END_DATE > LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE-1 END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE) THEN
                 LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE-1 END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE)
              ELSE
                 END_DATE
         END) END_DATE,
        LAG(PRICE_TYPE,1) OVER (PARTITION BY  ITEM  ORDER BY START_DATE) PREV_PRICE_TYPE,
        LEAD(PRICE_TYPE,1) OVER (PARTITION BY  ITEM  ORDER BY START_DATE) NEXT_PRICE_TYPE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN END_DATE END),1) IGNORE NULLS OVER (PARTITION BY  ITEM  ORDER BY  START_DATE,ORD_TYPE)
              ELSE
                 END_DATE
         END) ORIG_END_DATE,
        (CASE WHEN PRICE_TYPE='SALE' THEN 
                 LAG((CASE WHEN PRICE_TYPE='REGULAR' THEN PRICE END),1) IGNORE NULLS OVER (PARTITION BY  ITEM  ORDER BY START_DATE, ORD_TYPE)
         END) ORIG_PRICE,
        PRICE_TYPE,
        PRICE,
        ORD_TYPE,
        LEAD((CASE WHEN PRICE_TYPE='SALE' THEN START_DATE END),1) OVER (PARTITION BY ITEM ORDER BY START_DATE,ORD_TYPE) NEXT_SALE_START_DATE
 FROM (SELECT ITEM, 
              START_DATE, 
              NVL((CASE WHEN END_DATE IS NOT NULL THEN
                       END_DATE
                    ELSE
                       (LEAD(START_DATE,1) IGNORE NULLS 
                           OVER (PARTITION BY ITEM, PRICE_TYPE 
                                 ORDER BY START_DATE))-1
               END), TO_DATE('31-DEC-5099', 'DD-MON-YYYY')) END_DATE,
              PRICE_TYPE,
              PRICE,
              DECODE(PRICE_TYPE, 'REGULAR', 1, 2) ORD_TYPE,
              ROW_NUMBER() OVER (PARTITION BY ITEM ORDER BY START_DATE) RN
       FROM STG
       ORDER BY 1,2)),
items AS
(SELECT ITEM, START_DATE, END_DATE, PRICE_TYPE, PRICE, ORIG_END_DATE, PREV_PRICE_TYPE, NEXT_PRICE_TYPE, ORIG_PRICE, L,ORD_TYPE, NEXT_SALE_START_DATE
 FROM A, I
 WHERE (L=1
        OR (PRICE_TYPE='SALE' 
             AND END_DATE < ORIG_END_DATE 
         AND ((PREV_PRICE_TYPE ='REGULAR' AND NEXT_PRICE_TYPE='REGULAR') 
                OR (PREV_PRICE_TYPE ='SALE' AND NEXT_PRICE_TYPE='REGULAR')
                OR  NEXT_SALE_START_DATE-1 != END_DATE
                OR  NEXT_PRICE_TYPE IS NULL)))),
qry AS  
(SELECT ITEM, 
        (CASE WHEN L= 2 THEN
             (LAST_VALUE(END_DATE) OVER (PARTITION BY  ITEM  ORDER BY START_DATE,ORD_TYPE, LEVEL))+1
          ELSE
             START_DATE
     END) START_DATE, 
    (CASE WHEN L= 2 THEN
             (CASE WHEN (NEXT_SALE_START_DATE-1) < ORIG_END_DATE THEN
                      NEXT_SALE_START_DATE-1
                   ELSE
                      ORIG_END_DATE
              END)
          ELSE
             END_DATE
     END) END_DATE,  
    (CASE WHEN L= 2 THEN
             'REGULAR'
          ELSE
             PRICE_TYPE
     END) PRICE_TYPE, 
    (CASE WHEN L= 2 THEN
             ORIG_PRICE
          ELSE
             PRICE
     END) PRICE 
 FROM items)
SELECT item, start_date, end_date, price_type, price
FROM qry
WHERE start_date <= NVL(end_date,start_date)
ORDER BY 1,2;
 
 
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 2:58 pm UTC
 
 
Thanks a lot for the SQL.
The SQL I wrote was not dividing the regular and sale rows by date, but trying to find the end_date+1 from the sale row to get the next regular price row.
Your SQL is much cleaner and easier to understand. Thanks again for the nice SQL.
 
Is there any difference between CAST(NULL AS NUMBER) and TO_NUMBER(NULL)? 
July 03, 2012 - 8:04 am UTC 
 
cast(null as number) and to_number(null) lead to pretty much the same effect.  I just personally find cast( null as number ) to be cleaner to read. 
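A quick way to convince yourself of that (a sketch; the view name is illustrative): project both expressions through a view and describe it - both columns come back as plain NUMBER:

```sql
-- both expressions yield a NULL typed as NUMBER
create or replace view null_number_demo as
select cast(null as number) c1,
       to_number(null)      c2
  from dual;

-- DESCRIBE null_number_demo reports both C1 and C2 as NUMBER
```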
 
 
Help/guidance writing analytical function SQL
A reader, July 02, 2012 - 3:13 pm UTC
 
 
You had mentioned earlier
"you might consider a data model more suited to representing this data relationally.... "
It's not possible for me to change the data model right now.
If you were to design the data model, how would you have done it?
Requirement
1) Maintain Regular Price of the Items with effective/start date. 
2) Maintain a sale price. The sale price will have a start and end date. 
   The sale price can be a fixed price($2.25) or a percentage off the retail(25% off retail) or fixed dollars off the retail($0.50 off the retail)
 
July 03, 2012 - 8:19 am UTC 
 
The requirement can be more readily stated as:
maintain the correct price of something during a period of time.
I probably would have gone with:
item, start_date, end_date, price, description
where description is 'regular' or 'sale'.  A single table, non-overlapping ranges.
In short, the table would have been your report; your report would have been "select * from table order by item, start_date"
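A sketch of that single-table model (table and constraint names are illustrative, not from the original). Note the non-overlap rule between ranges cannot be expressed as a simple declarative constraint, so it would have to be enforced procedurally or at load time:

```sql
create table item_price
( item        varchar2(25)  not null,
  start_date  date          not null,
  end_date    date,                        -- null = open-ended current price
  price       number(10,2)  not null,
  description varchar2(10)  not null
              check (description in ('REGULAR','SALE')),
  constraint item_price_pk primary key (item, start_date)
);
```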
 
 
 
Analytics or not
Tony Fernandez, August 01, 2012 - 3:05 pm UTC
 
 
Dear Tom,
We have this query we are trying to implement using analytic functions, but the cost and statistics increase dramatically; we tried a WITH clause in the select statement to "prepare data", but no gain at all.  We need your help:
(the data provided is a very small set; the tables go to millions of rows each)
Here is query:
select  case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
             else dic.distributor_id
        end distributor_id
       ,case when dic.din_mapping_grp_flag = 1 then do.distributor_name
            else dc.distributor_name
        end distributor_name
       ,dic.din
       ,dic.active_flag
       ,min ( dic.dist_item_source ) dist_item_source
       ,max ( dic.brand )            brand
       ,max ( dic.description )      description
 from  dist_item_catalog dic
      ,distributor       do
      ,distributor       dc
where dic.dist_org_id    = do.distributor_id
  and dic.distributor_id = dc.distributor_id
  and dic.active_flag    = 1
                                  
group by case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
              else dic.distributor_id
         end,
         case when dic.din_mapping_grp_flag = 1 then do.distributor_name
              else dc.distributor_name
         end,
         dic.din,
         dic.active_flag;
Table distributor is:
CREATE TABLE DISTRIBUTOR
( DISTRIBUTOR_ID              VARCHAR2(13 BYTE) NOT NULL,
  DISTRIBUTOR_NAME            VARCHAR2(60 BYTE) NOT NULL );
SET DEFINE OFF;
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00729', 'Sysco Sacramento');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00705', 'AFI Foodservice INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00699', 'US Foods Detroit INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00697', 'Hansen Sales - VEND');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00696', 'Fox River');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00703', 'DiCarlo Distributors');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00701', 'US Foods Cincinnati INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00700', 'US Foods Knoxville 2270 6H');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00730', 'US Foods Paris (formerly PYA ) / 5Q INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00718', 'Upper Lakes Foods Inc');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00719', 'Fleming INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00720', 'Camellia INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00724', 'Reinhart Foodservice Twin Cities');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00725', 'Indianhead Foodservice ');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00726', 'Martin Preferred Foods');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00727', 'Wenning and Sons INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00706', 'J Kings Food Service Professionals - Pro Act');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00707', 'Texoma Meat Company INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00710', 'Q S I Foods INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00711', 'Sysco Jamestown INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00712', 'City Meats INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00713', 'Merit Day Food Service INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00714', 'US Foods Milwaukee 248 (formerly Alliant) 3D');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2001-00715', 'Sysco Pittsburgh');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2002-01122', 'Sysco Hampton Roads');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2002-01106', 'Martin Brothers');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2003-00231', 'Dean Foods - Pet Dairy Johnson City TN (Land O Sun)');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2003-00232', 'Dean Foods - Pet Dairy Spartanburg SC INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2003-00233', 'Dean Foods - Schepps Dairy Houston INACTIVE');
Insert into GPO.DISTRIBUTOR
   (DISTRIBUTOR_ID, DISTRIBUTOR_NAME)
 Values
   ('DO-2003-00234', 'Dean Foods - Shenandoah''s Pride Dairy');
COMMIT;
and table dist_item_catalog:
CREATE TABLE DIST_ITEM_CATALOG
(
DIN                   VARCHAR2(20 BYTE)       NOT NULL,
DIST_ITEM_SOURCE      VARCHAR2(1 BYTE)        NOT NULL,
DISTRIBUTOR_ID        VARCHAR2(13 BYTE),
DIST_ORG_ID           VARCHAR2(13 BYTE),
DIN_MAPPING_GRP_FLAG  NUMBER(1),
ACTIVE_FLAG           NUMBER,
DESCRIPTION           VARCHAR2(100 BYTE),
BRAND                 VARCHAR2(40 BYTE)
);
SET DEFINE OFF;
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '17253', 1, 
    'D', 'PACTIV', 'TRAY FOAM SCHL 5COMP WHT 8X10');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '55659', 1, 
    'D', 'SUNKSTS', 'DRINK CRANBERRY CKTL 10% 4+1');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '59859', 1, 
    'D', 'SUBWAY', 'TURKEY BRST CKD SLI .5 OZ');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '83737', 1, 
    'D', 'PRIMOS', 'BOX BAKERY 10X10X4 PRINT PRIMO');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '83790', 1, 
    'D', 'C&M', 'LID PLAS DOME F/9" DEEP');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '100628', 1, 
    'D', 'SYS IMP', 'JUICE ORANGE 100% NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '101675', 1, 
    'D', 'SYS IMP', 'WATER LEMON NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '102020', 1, 
    'D', 'SYS IMP', 'MILK 2% DAIRY HONEY THICK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '117341', 1, 
    'D', 'BASICAM', 'POTATO MASHED NATURE S OWN');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '127672', 1, 
    'D', 'SUBWAY', 'FILM PVC RL 18"X2000 W/CUTTER');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '138792', 1, 
    'D', 'PACKER', 'SUGAR COARSE CON AA');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '143356', 1, 
    'D', 'SYS CLS', 'POTATO H/BRN DEHY GLDN GRL');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '156901', 1, 
    'D', 'HELLMAN', 'DRESSING 1000 ISL FF PKT');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '172623', 1, 
    'D', 'CHFMXWL', 'PEA GREEN FRZN');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '213452', 1, 
    'D', 'SUBWAY', 'DOUGH COOKIE RASP CHSCAK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '218962', 1, 
    'D', 'SYS CLS', 'CHICKEN WING BUFF GLZE 1&2 PCK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '236964', 1, 
    'D', 'SLIKRIK', 'MIX SPICE CHILI ORGANIC');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '237018', 1, 
    'D', 'SLIKRIK', 'MIX SPICE RED HOT ORGANIC');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '284137', 1, 
    'D', 'BRILL', 'CAKE RED VELVET DBL LYR 8"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '322376', 1, 
    'D', 'BRKBUSH', 'CHICKEN WING BRD BRBN WILD');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '325223', 1, 
    'D', 'CORNER', 'BACON SLI APLWD SMKD 13/15');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '349245', 1, 
    'D', 'THOMPSN', 'STEAK PORTERHOUSE SEL ANGUS');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '350514', 1, 
    'D', 'THOMPSN', 'BEEF ROUND INSIDE DENUDE');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '353237', 1, 
    'D', 'AREZZIO', 'CHEESE STRING MOZZARELLA');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '354219', 1, 
    'D', 'THOMPSN', 'PORK CHOP FRCHD 8-10 OZ');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '356024', 1, 
    'D', 'FARMLND', 'BACON CANADIAN SLI .75 OZ FRZN');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '356857', 1, 
    'D', 'SCHWANS', 'ICE CREAM STWBRY');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '359453', 1, 
    'D', 'CORNER', 'SAUCE BASIL PSTO');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '361028', 1, 
    'D', 'SUBWAY', 'CUP PLAS CLD 40 OZ "GRN LEAF"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '366904', 1, 
    'D', 'MSGROWN', 'POTATO SWEET FRSH');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '384495', 1, 
    'D', 'KRSPY K', 'DOUGH BISCUIT BTRMLK DROP');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '387318', 1, 
    'D', 'SEABEST', 'SNAPPER FIL SKON SCAR 8/10 IDN');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '389280', 1, 
    'D', 'BKRSCLS', 'BATTER MUFFIN CHOC CHOC CHIP');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '392490', 1, 
    'D', 'COMET', 'CUP PLAS CAR 32TV OZ WHT');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '395802', 1, 
    'D', 'PRAIRIE', 'STRAW WRPD TRANS GNT 10.25"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '429274', 1, 
    'D', 'SYS IMP', 'MILK 2% DAIRY NECTAR THICK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '435046', 1, 
    'D', 'PLDRHNO', 'SAUCE BBQ');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '449280', 1, 
    'D', 'GILSTER', 'PASTA SPAGHETTI 10"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '455099', 1, 
    'D', 'PACKER', 'PIE LEMON MERNGE');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '471922', 1, 
    'D', 'SUBWAY', 'BAG PLAS SNDW "GRN LEAF"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '476224', 1, 
    'D', 'CABBHNP', 'STEAK FILET C\C CH EXCLUSIVE');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '477513', 1, 
    'D', 'TODDS', 'SAUCE TERIYAKI SWEET THICK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '491567', 1, 
    'D', 'KINGCHS', 'CAKE BUNDT ASST 9"');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '496578', 1, 
    'D', 'BHB/NPM', 'STEAK STRIP E\E 1"TL PR FRZN');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '501957', 1, 
    'D', 'MARY B', 'DOUGH BISCUIT BUTTERMILK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '504169', 1, 
    'D', 'AEP INC', 'LINER TRASH 38X58 1.5M BLK');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '533517', 1, 
    'D', 'BTRBALL', 'TURKEY BRGR SAVRY WHT RTC');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '557575', 1, 
    'D', 'RUFINOS', 'APTZR MUSHROOM STFD W/CRABMEAT');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '566824', 1, 
    'D', 'FIRECLS', 'BEEF GROUND BULK FINE 73/27');
Insert into GPO.DIST_ITEM_CATALOG
   (DIN_MAPPING_GRP_FLAG, DIST_ORG_ID, DISTRIBUTOR_ID, DIN, ACTIVE_FLAG, 
    DIST_ITEM_SOURCE, BRAND, DESCRIPTION)
 Values
   (1, 'DO-2004-00404', 'DO-2001-00589', '570273', 1, 
    'D', 'CYCREEK', 'ALLIGATOR MEAT FARM RAISED');
COMMIT;
 
August    01, 2012 - 3:26 pm UTC 
 
I do not see a single analytic function in that entire query.
???? 
 
 
?
Tony Fernandez, August    01, 2012 - 3:50 pm UTC
 
 
That is correct Tom, no analytic function is displayed, but the ones we attempted did not help with the tuning; they made the query more costly.
Each aggregate was given an analytic clause like "over ( partition by case
               when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
               else dic.distributor_id
            end,
            case
               when dic.din_mapping_grp_flag = 1 then do.distributor_name
               else dc.distributor_name
            end,
            dic.din,
            dic.active_flag )"
We need these same items either in the "group by" or in the "over ( partition by )".
Thanks, 
August    01, 2012 - 3:52 pm UTC 
 
??? I don't understand at all.  
I don't see how analytics would fit into this query at all, you appear to want to aggregate.  
I'm not sure where you are trying to go with analytics, I don't see them as being useful in this query - not if your need is to AGGREGATE. 
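For illustration, a minimal sketch of the distinction, assuming the classic SCOTT.EMP demo table:

```sql
-- aggregation collapses rows: one output row per department
select deptno, sum(sal) total_sal
  from emp
 group by deptno;

-- an analytic keeps every row: each employee row survives,
-- carrying its department total alongside it
select deptno, ename, sal,
       sum(sal) over (partition by deptno) total_sal
  from emp;
```

If the goal is fewer rows, only the first form applies; the analytic form never reduces the row count.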
 
 
continued
Tony Fernandez, August    01, 2012 - 4:04 pm UTC
 
 
Sorry Tom,
Here is the analytic version:
select  case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
             else dic.distributor_id
        end distributor_id
       ,case when dic.din_mapping_grp_flag = 1 then do.distributor_name
            else dc.distributor_name
        end distributor_name
       ,dic.din
       ,dic.active_flag
       ,min ( dic.dist_item_source ) over ( partition by case when dic.din_mapping_grp_flag = 1 then dic.dist_org_id
                                                              else dic.distributor_id
                                                          end,
                                                          dic.din,
                                                          dic.active_flag )
       ,max ( dic.brand ) -- analytic, same over() clause as the column directly above
       ,max ( dic.description ) -- idem
 from  dist_item_catalog dic
      ,distributor       do
      ,distributor       dc
where dic.dist_org_id    = do.distributor_id
  and dic.distributor_id = dc.distributor_id
  and dic.active_flag    = 1
;                                  
Please keep in mind that analytics were only one alternative we tried in order to speed up the query; any other suggestion, even outside analytics, would be fine.
Bottom line: on the whole data set, the query behind a GUI transaction takes up to 20 minutes, and we would like the GUI user to wait seconds.
Thanks, 
August    01, 2012 - 4:10 pm UTC 
 
but that gives the wrong answer doesn't it.
you need to aggregate, you want to GET RID of rows.  analytics do not do that.
have you considered a materialized view. 
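For reference, a heavily simplified sketch of that suggestion, using table and column names from the thread. Whether this exact aggregate satisfies Oracle's fast-refresh restrictions (the DECODE in the GROUP BY, MIN/MAX maintenance after deletes) is an assumption that would need checking with DBMS_MVIEW.EXPLAIN_MVIEW:

```sql
-- a materialized view log on the base table is required for fast refresh
create materialized view log on dist_item_catalog
  with rowid, sequence
  ( din, din_mapping_grp_flag, dist_org_id, distributor_id,
    active_flag, dist_item_source, brand, description )
  including new values;

-- the aggregate, maintained incrementally as transactions commit
create materialized view dic_summary_mv
  refresh fast on commit
as
select decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id ) distid,
       din,
       active_flag,
       min( dist_item_source ) min_source,
       max( brand )            max_brand,
       max( description )      max_desc,
       count(*)                cnt   -- count(*) is required for fast refresh
  from dist_item_catalog
 group by decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id ),
          din, active_flag;
```

The join to DISTRIBUTOR for the names could then be done at query time against the much smaller materialized view.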
 
 
Mat view
Tony Fernandez, August    01, 2012 - 4:15 pm UTC
 
 
That is correct, it returns more rows than it should.
A materialized view is out of the question: this is a financial system, and so much happens per second that the materialized view would be stale almost instantly.
Bottom line: there are many aggregates sharing the same group by, and we noticed that dropping just one aggregated column gives a great gain in time. Unfortunately, all of the columns in the query are needed.
 
One more attempt
Tony Fernandez, August    06, 2012 - 9:10 am UTC
 
 
Tom,
Here is query written with no analytics.  Understood that analytics will not help in this case.  Can you please point any other direction yet to be explored?  My DBA group all agree that mat view was not solution before due to high OLTP and we want to continue with that norm.
Here is the query:
with dic2  as ( select  min( dist_item_catalog_id ) dicid
                       ,din
                       ,decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id ) distid
                  from  dist_item_catalog 
                  group by  din
                           ,decode( din_mapping_grp_flag, 1, dist_org_id, distributor_id )
               )
select  'Available' type
       ,decode( dic.din_mapping_grp_flag, 1, dic.dist_org_id, dic.distributor_id ) distributor_id 
       ,decode( dic.din_mapping_grp_flag, 1, do.distributor_name, dc.distributor_name ) distributor_name
       ,dic.din
       ,dic.active_flag
       ,dic.dist_item_source
       ,dic.brand
       ,dic.description
       ,dic.dist_category_id
       ,dic.category_name
       ,dic.dist_sub_category_id
       ,dic.subcategory_name
       ,dic.pack_size
       ,dic.dist_mfr_name
       ,dic.dist_mfr_id 
  from  dist_item_catalog dic
       ,dic2
       ,distributor do      
       ,distributor dc
where dic2.dicid         = dic.dist_item_catalog_id
   and dic.dist_org_id    = do.distributor_id
   and dic.distributor_id = dc.distributor_id
   and dic.active_flag    = 1
   and not exists ( select 1
                      from dist_term_set_item dtsi
                     where dtsi.dist_item_catalog_id =
                            dic.dist_item_catalog_id ); 
August    17, 2012 - 12:20 pm UTC 
 
do you have a plan?
and are the estimated cardinalities in the plan close to being correct?  use your knowledge of the data to answer that. 
 
 
Book?
Nelson, August    10, 2012 - 12:08 pm UTC
 
 
So Tom... 10 years later.. did you finally get a chance to publish a book on analytics? 
 
Plan
Tony Fernandez, August    17, 2012 - 1:47 pm UTC
 
 
Tom,
Here is the explain plan for my query from two reviews ago in this thread:
Execution Plan
----------------------------------------------------------
-----------------------------------------------------------------------------------
| Id  | Operation                | Name                   | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                        |  1652K|   623M| 86663 |
|   1 |  HASH JOIN               |                        |  1652K|   623M| 86663 |
|   2 |   TABLE ACCESS FULL      | DISTRIBUTOR            |  5673 |   238K|    58 |
|   3 |   HASH JOIN              |                        |  1652K|   556M| 86585 |
|   4 |    TABLE ACCESS FULL     | DISTRIBUTOR            |  5673 |   238K|    58 |
|   5 |    HASH JOIN             |                        |  1652K|   488M| 86506 |
|   6 |     VIEW                 |                        |  1652K|    20M| 17601 |
|   7 |      HASH GROUP BY       |                        |  1652K|    85M| 17601 |
|   8 |       TABLE ACCESS FULL  | DIST_ITEM_CATALOG      |  1652K|    85M| 17418 |
|   9 |     HASH JOIN RIGHT ANTI |                        |  1652K|   467M| 42682 |
|  10 |      INDEX FAST FULL SCAN| DIST_TERM_SET_ITEM_IDX |  1056K|  6192K|   476 |
|  11 |      TABLE ACCESS FULL   | DIST_ITEM_CATALOG      |  1652K|   458M| 17517 |
-----------------------------------------------------------------------------------
and Statistics:
Statistics
----------------------------------------------------------
        107  recursive calls
          0  db block gets
      47967  consistent gets
       5140  physical reads
          0  redo size
      25155  bytes sent via SQL*Net to client
        507  bytes received via SQL*Net from client
         15  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
        209  rows processed
Cardinalities are fine.
Thanks,
 
August    17, 2012 - 3:37 pm UTC 
 
no they aren't.
we estimated 1.6 MILLION ROWS as output.
you got 209
that sure doesn't add up.
can you run the query with gather_plan_statistics in sqlplus: 
http://jonathanlewis.wordpress.com/2006/11/09/dbms_xplan-in-10g/ or get a tkprof and merge the actual row counts with the estimated and give us a report that shows the plan - then estimated - then actual rows. 
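For reference, a sketch of the mechanics: add the hint, run the query to completion, then in the same session pull the cursor statistics. The E-Rows and A-Rows columns give the estimated-versus-actual comparison directly.

```sql
-- the query under test, otherwise unchanged
select /*+ gather_plan_statistics */ ...
  from dist_item_catalog dic, ... ;

-- show the last execution's plan with estimated vs actual row counts
select *
  from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST' ) );
```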
 
 
Analytics - considering a flag in both senarios
ARC, October   02, 2012 - 3:59 am UTC
 
 
Hi Tom,
Please help me with analytics.
Table Data:
SNO FLG AMT
1 NPI 1000
2 NPI 2000
3 AFM 1400
4 AFM 1500
5 AFM 1700
6 NFM 500
7 NFM 700
8 NPI 8000
9 NPI 6000
10 OTH 9000
In the above data, the NFM rows must be counted under both NPI and AFM: when I select NPI, the NPI and NFM amounts should sum together; when I select AFM, the AFM and NFM amounts should sum together.
I tried the SQL below, but it only works for one condition, either NPI or AFM. I'm trying to achieve this in a single query, without any filter on the flag.
========== 
SELECT case when flg in ('NPI','NFM') then 'NPI' when flg in ('AFM','NFM') then 'AFM' else FLG end SSS,
       sum(amt) amt 
FROM TEST_FLG
group by case when flg in ('NPI','NFM') then 'NPI' when flg in ('AFM','NFM') then 'AFM' else FLG end
==========
My desired out put is
NPI 18200
AFM 5800
OTH     9000
Please help me correct the query. Thanks in advance.
-ARC 
 
October   09, 2012 - 11:34 am UTC 
 
no create
no inserts
no look
no promises either, i really didn't look at this 
 
 
rebisco, October   10, 2012 - 2:29 am UTC
 
 
Hello ARC,
I'm not so sure if this is what you wanted, but it can be solved using UNION ALL and a few filters:
SQL> CREATE TABLE TEST AS
  2  SELECT 1 SNO, 'NPI' FLG, 1000 AMT FROM DUAL
  3  UNION ALL
  4  SELECT 2 SNO, 'NPI' FLG, 2000 AMT FROM DUAL
  5  UNION ALL
  6  SELECT 3 SNO, 'AFM' FLG, 1400 AMT FROM DUAL
  7  UNION ALL
  8  SELECT 4 SNO, 'AFM' FLG, 1500 AMT FROM DUAL
  9  UNION ALL
 10  SELECT 5 SNO, 'AFM' FLG, 1700 AMT FROM DUAL
 11  UNION ALL
 12  SELECT 6 SNO, 'NFM' FLG, 500 AMT FROM DUAL
 13  UNION ALL
 14  SELECT 7 SNO, 'NFM' FLG, 700 AMT FROM DUAL
 15  UNION ALL
 16  SELECT 8 SNO, 'NPI' FLG, 8000 AMT FROM DUAL
 17  UNION ALL
 18  SELECT 9 SNO, 'NPI' FLG, 6000 AMT FROM DUAL
 19  UNION ALL
 20  SELECT 10 SNO, 'OTH' FLG, 9000 AMT FROM DUAL;
Table created
SQL> SELECT 'NFM' flag, SUM(amt)amt FROM TEST WHERE flg IN ('NPI','AFM')
  2  UNION ALL
  3  SELECT 'NPI' flag, SUM(amt)amt FROM TEST WHERE flg IN ('NPI','NFM')
  4  UNION ALL
  5  SELECT 'AFM' flag, SUM(amt)amt FROM TEST WHERE flg IN ('AFM','NFM')
  6  UNION ALL
  7  SELECT 'OTH' flag, SUM(amt) amt FROM TEST WHERE flg NOT IN ('AFM','NPI','NFM');
FLAG                                    AMT
-------------------------------- ----------
NFM                                   21600
NPI                                   18200
AFM                                    5800
OTH                                    9000 
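One single-pass alternative (a sketch against the TEST table created above; the mapping rows below are an assumption about which flags roll up where) is to join to a small flag-to-group map, so each NFM row is duplicated into both the NPI and AFM groups before aggregating:

```sql
-- map each flag to the group(s) it should be counted under;
-- NFM appears twice, once for each group that includes it
select m.grp, sum(t.amt) amt
  from test t,
       ( select 'NPI' flg, 'NPI' grp from dual union all
         select 'NFM' flg, 'NPI' grp from dual union all
         select 'AFM' flg, 'AFM' grp from dual union all
         select 'NFM' flg, 'AFM' grp from dual union all
         select 'OTH' flg, 'OTH' grp from dual ) m
 where m.flg = t.flg
 group by m.grp;
```

Against the data above this returns NPI 18200, AFM 5800, OTH 9000, matching the desired output.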
 
Printing range with comma separated values
Arvind, December  06, 2012 - 12:01 am UTC
 
 
Dear Tom,
In this question the user wants the data categorised into continuous ranges. I have a somewhat similar, but slightly different, requirement, as given below.
I have the following table and data:
create table ag_t1
(col1  varchar2(5),
 col2  varchar2(10)
 );
 
 insert into ag_t1 values ('01.01', 'v1');
 insert into ag_t1 values ('01.02', 'v1');
 insert into ag_t1 values ('01.03', 'v1');
 insert into ag_t1 values ('01.04', 'v2');
 insert into ag_t1 values ('01.05', 'v1');
 insert into ag_t1 values ('01.06', 'v2');
 insert into ag_t1 values ('02.01', 'v1');
 insert into ag_t1 values ('02.02', 'v1');
 insert into ag_t1 values ('02.03', 'v2');
 insert into ag_t1 values ('02.04', 'v2');
 insert into ag_t1 values ('02.05', 'v1');
 insert into ag_t1 values ('02.06', 'v2');
 insert into ag_t1 values ('08.01', 'v1');
 insert into ag_t1 values ('08.02', 'v1');
 insert into ag_t1 values ('08.03', 'v1');
 insert into ag_t1 values ('08.04', 'v2');
 insert into ag_t1 values ('08.05', 'v1');
 insert into ag_t1 values ('08.06', 'v2');
I want the output as given below.
Output 
 v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06   
December  14, 2012 - 1:57 pm UTC 
 
why isn't 2.03, 2.04 using "to" like 2.01 to 2.02 did.
define this better.
always 2 digit number "." 2 digit number?  
always collapse out consecutive numbers using "to"? 
 
 
Printing range with comma separated values   
Rajeshwaran, Jeyabal, December  15, 2012 - 7:19 am UTC
 
 
drop table t purge;
create table t
(col1  number(5,2),
 col2  varchar2(10)
 );
 
 insert into t values (01.01, 'v1');
 insert into t values (01.02, 'v1');
 insert into t values (01.03, 'v1');
 insert into t values (01.04, 'v2');
 insert into t values (01.05, 'v1');
 insert into t values (01.06, 'v2');
 insert into t values (02.01, 'v1');
 insert into t values (02.02, 'v1');
 insert into t values (02.03, 'v2');
 insert into t values (02.04, 'v2');
 insert into t values (02.05, 'v1');
 insert into t values (02.06, 'v2');
 insert into t values (08.01, 'v1');
 insert into t values (08.02, 'v1');
 insert into t values (08.03, 'v1');
 insert into t values (08.04, 'v2');
 insert into t values (08.05, 'v1');
 insert into t values (08.06, 'v2');
 commit;
rajesh@ORA11G> select col2,min(col1) min_col1,max(col1) max_col1,
  2      case when nullif(min(col1),max(col1)) is not null
  3              then min(col1)||' to '||max(col1)
  4     else to_char(min(col1)) end reslt
  5  from (
  6  select col1,col2,max(val) over(order by col1,col2) grp
  7  from (
  8  select t.*,
  9    case when lag(col2) over(order by col1,col2) is null
 10    or lag(col2) over(order by col1,col2) <> col2 then
 11    row_number() over(order by col1,col2) end as val
 12  from t
 13       )
 14       )
 15  group by col2,grp
 16  order by col2
 17  /
COL2         MIN_COL1   MAX_COL1 RESLT
---------- ---------- ---------- --------------------
v1               1.01       1.03 1.01 to 1.03
v1               1.05       1.05 1.05
v1               2.01       2.02 2.01 to 2.02
v1               2.05       2.05 2.05
v1               8.01       8.03 8.01 to 8.03
v1               8.05       8.05 8.05
v2               1.04       1.04 1.04
v2               1.06       1.06 1.06
v2               2.03       2.04 2.03 to 2.04
v2               2.06       2.06 2.06
v2               8.04       8.04 8.04
v2               8.06       8.06 8.06
12 rows selected.
Elapsed: 00:00:00.01
rajesh@ORA11G> select col2,listagg(reslt,',') within group(order by min_col1) as listed_values
  2  from (
  3  select col2,min(col1) min_col1,max(col1) max_col1,
  4      case when nullif(min(col1),max(col1)) is not null
  5              then min(col1)||' to '||max(col1)
  6     else to_char(min(col1)) end reslt
  7  from (
  8  select col1,col2,max(val) over(order by col1,col2) grp
  9  from (
 10  select t.*,
 11    case when lag(col2) over(order by col1,col2) is null
 12    or lag(col2) over(order by col1,col2) <> col2 then
 13    row_number() over(order by col1,col2) end as val
 14  from t
 15       )
 16       )
 17  group by col2,grp
 18      )
 19  group by col2
 20  /
COL2       LISTED_VALUES
---------- ------------------------------------------------------------
v1         1.01 to 1.03,1.05,2.01 to 2.02,2.05,8.01 to 8.03,8.05
v2         1.04,1.06,2.03 to 2.04,2.06,8.04,8.06
Elapsed: 00:00:00.03
rajesh@ORA11G>
 
 
December  18, 2012 - 7:41 am UTC 
 
you did not produce their requested output, they did not provide sufficient information to actually answer the question....
 
 
 
Muhammad Bilal, December  31, 2012 - 4:44 am UTC
 
 
I have the following dataset:
CREATE TABLE RoomGroups
(group_id NUMBER,
 group_name VARCHAR2(30),
 group_capacity NUMBER);
INSERT INTO RoomGroups VALUES(1, 'SEECS UG Block', 100);
INSERT INTO RoomGroups VALUES(2, 'IAEC Building', 70);
INSERT INTO RoomGroups VALUES(3, 'RIMMS Building', 90);
Commit;
CREATE TABLE DegreeBatches
(batch_id NUMBER, batch_name VARCHAR2(30), batch_strength NUMBER);
INSERT INTO DegreeBatches VALUES(10,'BIT-10',35);
INSERT INTO DegreeBatches VALUES(20,'BIT-11',40);
INSERT INTO DegreeBatches VALUES(30,'BSCS-2',35);
INSERT INTO DegreeBatches VALUES(40,'BSCS-3',40);
INSERT INTO DegreeBatches VALUES(50,'BEE-4',50);
INSERT INTO DegreeBatches VALUES(60,'BICSE-7',25);
INSERT INTO DegreeBatches VALUES(70,'BESE-3',30);
Commit;
I want to achieve the following through single SQL statement:
1. Get all possible sets of DegreeBatches for each group - in such a way that for each set the SUM(batch_strength) <= Capacity of that group, as shown below. 
GroupId Batches    StrengthStr StrengthTotal Capacity   
1. {BIT-10, BIT-11, BICSE-7}  {35, 40, 25} 100  100
1. {BSCS-2, BSCS-3, BICSE-7}  {35, 40, 25} 100  100
1. {BEE-4, BSCS-3}   {50, 40}  90  100
...
...
2. {BIT-10, BSCS-2}   {35, 35}  70  70
2. {BSCS-3, BESE-3}   {40, 30}  70  70
2. {BIT-11, BICSE-7}   {40, 25}  65  70
and so on...
2. Next I want to retrieve only non-overlapping sets based on minimum value of Capacity-StrengthTotal for each group
GroupId Batches    StrengthStr StrengthTotal Capacity   
1. {BIT-10, BIT-11, BICSE-7}  {35, 40, 25} 100  100
2. {BSCS-3, BESE-3}   {40, 30}  70  70
3. {BSCS-2, BEE-4}   {35, 50}  85  90
3. The final result must have all the batches in it... 
Which in this case is true...
Any help will be highly appreciated...
Thanks
Bilal
 
 
How to squash/combine rows around current row
Matt, January   08, 2013 - 11:30 am UTC
 
 
Hi Tom
I am trying to figure out how to combine similar, connected rows.  I think I need analytics, but all my attempts have failed.
drop table mtemp;
create table mtemp (startdt date,enddt date, ra number, nqc number);
insert into mtemp values ('1-JAN-2013','2-JAN-2013',1,2);
insert into mtemp values ('2-JAN-2013','3-JAN-2013',1,2);
insert into mtemp values ('3-JAN-2013','4-JAN-2013',1,2);
insert into mtemp values ('4-JAN-2013','5-JAN-2013',1000,2000);
insert into mtemp values ('5-JAN-2013','6-JAN-2013',1,2);
commit;
What I need is a report that "squashes" the first three records, where the ra and nqc values are the same, into one row; then the next record, where the values change to 1000 and 2000; and then a third record where the values go back to their initial values.
The output I would like is:
startdt  enddt    ra      nqc
-------- -------- ------- --------
1-1-2013 1-4-2013 1       2     <three records "squashed"
1-4-2013 1-5-2013 1000    2000
1-5-2013 1-6-2013 1       2     <back to initial vals, need as SEPARATE record
I thought this was a simple analytic function using first_value and last_value.  But the fact that the partition by values "return" to something previous is the troublesome portion.
select first_value(startdt) over (partition by ra,nqc order by startdt) minstart
,      last_value(enddt) over (partition by ra,nqc order by startdt) maxend
,      ra
,      nqc
from mtemp
But with that I get 
MINSTART  MAXEND    RA    NQC
1/1/2013  1/2/2013  1     2
1/1/2013  1/3/2013  1     2
1/1/2013  1/4/2013  1     2
1/1/2013  1/6/2013  1     2
1/4/2013  1/5/2013  1000  2000
I have also tried various windowing with RANGE BETWEEN and ROWS BETWEEN but I am not finding good examples and I am failing.  
Thanks so much for your help over the years.
matt
 
January   14, 2013 - 11:53 am UTC 
 
ops$tkyte%ORA11GR2> select min(startdt), max(enddt), min(ra), min(nqc)
  2    from (
  3  select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
  4    from (
  5  select startdt, enddt, ra, nqc,
  6         case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
  7                       and
  8                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
  9                          then row_number() over (order by startdt)
 10                  end tag
 11    from mtemp
 12         )
 13             )
 14   group by grp
 15   order by grp
 16  /
MIN(START MAX(ENDDT    MIN(RA)   MIN(NQC)
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13          1          2
04-JAN-13 05-JAN-13       1000       2000
05-JAN-13 06-JAN-13          1          2
is one approach.
the innermost query tags the beginning of each group - based on whether the previous row's ra/nqc are the same:
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc,
  2                   lag(ra) over (order by startdt) lrq,
  3                   lag(nqc) over (order by startdt) lnqc,
  4                   decode( lag(ra) over (order by startdt), ra, 1,0 ) dra,
  5                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) dnqc,
  6         case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
  7                       and
  8                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
  9                          then row_number() over (order by startdt)
 10                  end tag
 11    from mtemp
 12  /
STARTDT   ENDDT             RA        NQC        LRQ       LNQC        DRA       DNQC        TAG
--------- --------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
01-JAN-13 02-JAN-13          1          2                                0          0          1
02-JAN-13 03-JAN-13          1          2          1          2          1          1
03-JAN-13 04-JAN-13          1          2          1          2          1          1
04-JAN-13 05-JAN-13       1000       2000          1          2          0          0          4
05-JAN-13 06-JAN-13          1          2       1000       2000          0          0          5
the second level carries down this value for us:
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
  2    from (
  3  select startdt, enddt, ra, nqc,
  4         case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
  5                       and
  6                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
  7                          then row_number() over (order by startdt)
  8                  end tag
  9    from mtemp
 10         )
 11  /
STARTDT   ENDDT             RA        NQC        GRP
--------- --------- ---------- ---------- ----------
01-JAN-13 02-JAN-13          1          2          1
02-JAN-13 03-JAN-13          1          2          1
03-JAN-13 04-JAN-13          1          2          1
04-JAN-13 05-JAN-13       1000       2000          4
05-JAN-13 06-JAN-13          1          2          5
and then we just aggregate. 
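The "carry down" technique above (tag the first row of each run, carry the tag forward, then aggregate) can be sketched outside Oracle as well. Here is a rough Python illustration of the same three steps, using the five mtemp rows from the example; the function name squash is made up for illustration:

```python
# A sketch of the "carry down" technique in plain Python, mirroring the SQL:
# 1) tag a row when (ra, nqc) differs from the previous row,
# 2) carry the last non-null tag down (last_value ... ignore nulls),
# 3) take min(startdt)/max(enddt) per carried tag.
rows = [  # (startdt, enddt, ra, nqc), ordered by startdt
    ("2013-01-01", "2013-01-02", 1, 2),
    ("2013-01-02", "2013-01-03", 1, 2),
    ("2013-01-03", "2013-01-04", 1, 2),
    ("2013-01-04", "2013-01-05", 1000, 2000),
    ("2013-01-05", "2013-01-06", 1, 2),
]

def squash(rows):
    # step 1: tag group starts (None elsewhere), like the CASE/lag expression
    tags = [i if i == 0 or rows[i][2:] != rows[i - 1][2:] else None
            for i in range(len(rows))]
    # step 2: carry the last non-null tag forward down the ordered rows
    grp, grouped = None, []
    for tag, row in zip(tags, rows):
        grp = tag if tag is not None else grp
        grouped.append((grp, row))
    # step 3: aggregate per group (rows are already ordered by startdt)
    result = []
    for g in dict.fromkeys(g for g, _ in grouped):  # preserves first-seen order
        members = [r for gg, r in grouped if gg == g]
        result.append((members[0][0], members[-1][1], members[0][2], members[0][3]))
    return result
```

Calling squash(rows) yields the same three squashed rows as the SQL output above.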
 
 
 
Question asked by Matt - bottom of page
Artieboy, January   15, 2013 - 12:33 am UTC
 
 
As you can tell I'm a novice, but would this work, Oracle?
select unique startdt_new, enddt_new, ra, nqc
from
(--3
select startdt, enddt, ra, nqc, startdt_new,
last_value(enddt) over (partition by startdt_new) enddt_new from
(--2
select startdt, enddt, ra, nqc,
case when ra = lead(ra) over(partition by ra, nqc order by startdt) and nqc = lead(nqc) over(partition by ra, nqc order by startdt)
then c2 else startdt end startdt_new
from
(--1
select startdt, enddt, ra, nqc,
min(startdt) over(partition by ra, nqc) c2
from mtemp
)--1
)--2
)--3
order by startdt_new
CHEERS! 
January   15, 2013 - 10:34 am UTC 
 
explain your logic, like I did mine.  reverse engineering what someone was thinking can be error prone ;)
it doesn't work in general, no
ops$tkyte%ORA11GR2> select * from mtemp order by startdt;
STARTDT   ENDDT             RA        NQC
--------- --------- ---------- ----------
01-JAN-13 02-JAN-13          1          2
02-JAN-13 03-JAN-13          1          2
03-JAN-13 04-JAN-13          1          2
04-JAN-13 05-JAN-13       1000       2000
05-JAN-13 06-JAN-13          1          2
07-JAN-13 08-JAN-13       5000       6000
09-JAN-13 10-JAN-13          1          2
7 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select unique startdt_new, enddt_new, ra, nqc
  2  from
  3  (--3
  4  select startdt, enddt, ra, nqc, startdt_new,
  5  last_value(enddt) over (partition by startdt_new) enddt_new from
  6  (--2
  7  select startdt, enddt, ra, nqc,
  8  case when ra = lead(ra) over(partition by ra, nqc order by startdt) and nqc =
  9  lead(nqc) over(partition by ra, nqc order by startdt)
 10  then c2 else startdt end startdt_new
 11  from
 12  (--1
 13  select startdt, enddt, ra, nqc,
 14  min(startdt) over(partition by ra, nqc) c2
 15  from mtemp
 16  )--1
 17  )--2
 18  )--3
 19  order by startdt_new
 20  /
STARTDT_N ENDDT_NEW         RA        NQC
--------- --------- ---------- ----------
01-JAN-13 06-JAN-13          1          2
04-JAN-13 05-JAN-13       1000       2000
07-JAN-13 08-JAN-13       5000       6000
09-JAN-13 10-JAN-13          1          2
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select min(startdt), max(enddt), min(ra), min(nqc)
  2    from (
  3  select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by startdt) grp
  4    from (
  5  select startdt, enddt, ra, nqc,
  6         case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
  7                       and
  8                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
  9                          then row_number() over (order by startdt)
 10                  end tag
 11    from mtemp
 12         )
 13             )
 14   group by grp
 15   order by grp
 16  /
MIN(START MAX(ENDDT    MIN(RA)   MIN(NQC)
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13          1          2
04-JAN-13 05-JAN-13       1000       2000
05-JAN-13 06-JAN-13          1          2
07-JAN-13 08-JAN-13       5000       6000
09-JAN-13 10-JAN-13          1          2
 
 
 
Approach to How to squash/combine rows
Matt, January   15, 2013 - 12:13 pm UTC
 
 
Tom,
I like your approach.  We ended up solving a different way that appears to work, though.  It relies on two "counters" using row_number().  The first creates a running counter and the other a counter for like rows - rows with the same RA and NQC.  Subtract one from the other and you get a number that is unique for the grouping.
Do you see anything wrong with this:
-- first the inner query showing the counters:
SELECT startdt, enddt, ra, nqc
     , ROW_NUMBER() OVER(ORDER BY startdt, enddt) grpa
     , ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grpb
     , ROW_NUMBER() OVER(ORDER BY startdt, enddt) - ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grp
  FROM mtemp;
STARTDT   ENDDT             RA        NQC       GRPA       GRPB        GRP
--------- --------- ---------- ---------- ---------- ---------- ----------
01-JAN-13 02-JAN-13          1          2          1          1          0
02-JAN-13 03-JAN-13          1          2          2          2          0
03-JAN-13 04-JAN-13          1          2          3          3          0
04-JAN-13 05-JAN-13       1000       2000          4          1          3
05-JAN-13 06-JAN-13          1          2          5          4          1
Now we aggregate and remove the counting/grouping
SELECT   MIN(startdt) startdt, MAX(enddt) enddt, ra, nqc
    FROM (SELECT startdt, enddt, ra, nqc
               , ROW_NUMBER() OVER(ORDER BY startdt, enddt) grpa
               , ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grpb
               , ROW_NUMBER() OVER(ORDER BY startdt, enddt) - ROW_NUMBER() OVER(PARTITION BY ra, nqc ORDER BY startdt, enddt) grp
            FROM mtemp)
GROUP BY ra, nqc, grp
ORDER BY startdt, enddt;
STARTDT   ENDDT             RA        NQC
--------- --------- ---------- ----------
01-JAN-13 04-JAN-13          1          2
04-JAN-13 05-JAN-13       1000       2000
05-JAN-13 06-JAN-13          1          2
My colleague found this while researching this as a "Gaps and Islands" problem.
Thanks again.
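The counter-subtraction ("gaps and islands") idea can be sketched in plain Python too: a global row counter minus a per-(ra, nqc) counter is constant within each run of identical values, so the difference serves as a grouping key. A rough illustration on the five sample rows (the function name islands is made up):

```python
# Gaps-and-islands sketch: grpa (global row_number) minus grpb (row_number
# within each (ra, nqc) partition) is constant inside each run, so it works
# as a grouping key - the same trick as the SQL above.
from collections import defaultdict

rows = [  # (startdt, enddt, ra, nqc), ordered by startdt
    ("2013-01-01", "2013-01-02", 1, 2),
    ("2013-01-02", "2013-01-03", 1, 2),
    ("2013-01-03", "2013-01-04", 1, 2),
    ("2013-01-04", "2013-01-05", 1000, 2000),
    ("2013-01-05", "2013-01-06", 1, 2),
]

def islands(rows):
    per_key = defaultdict(int)           # grpb: counter within each (ra, nqc)
    agg = {}                             # (ra, nqc, grp) -> [min_start, max_end]
    for grpa, (start, end, ra, nqc) in enumerate(rows, start=1):
        per_key[(ra, nqc)] += 1
        grp = grpa - per_key[(ra, nqc)]  # constant inside each island
        key = (ra, nqc, grp)
        if key not in agg:
            agg[key] = [start, end]
        else:
            agg[key][0] = min(agg[key][0], start)
            agg[key][1] = max(agg[key][1], end)
    return sorted((s, e, ra, nqc) for (ra, nqc, _), (s, e) in agg.items())
```

islands(rows) returns the three merged intervals in start-date order, matching the aggregated SQL result.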
 
January   15, 2013 - 2:33 pm UTC 
 
interesting approach - it looks to be roughly equivalent work-wise: the same number of window sort operations, and they both do full scan, window sort, window sort, aggregate.
I'll have to file that one away in my head for future reference :)
I've used what I call the "carry down" technique (my approach above) so so so many times now - it just types itself anymore...
 
 
 
Re: Approach to How to squash/combine rows
Narendra, January   17, 2013 - 3:19 am UTC
 
 
 
..question
A Reader, February  05, 2013 - 6:21 am UTC
 
 
Hello Tom,
  
SQL> create table t1 ( x  varchar2(31));
Table created.
SQL>insert into t1 values('zzz90000(Zh900000)xyz~0001234');
SQL>insert into t1 values('zzz90000(7i900000)xyz~0001234');
SQL>insert into t1 values('zzz90000(2i900000)xyz~0001234');
SQL>insert into t1 values('zzz80000(7i900000)xyz~0001234');
SQL>insert into t1 values('zzz70000(Ve900000)xyz~0001234');
SQL>insert into t1 values('zzz70000(2f900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Zh900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Ve900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(ui900000)xyz~0001234');
SQL>insert into t1 values('zzz60000(Ui900000)xyz~0001234');
man@ora10g:rac1> select * from t1 order by 1;
X
-------------------------------
zzz60000(Ui900000)xyz~0001234
zzz60000(Ve900000)xyz~0001234
zzz60000(Zh900000)xyz~0001234
zzz60000(ui900000)xyz~0001234
zzz70000(2f900000)xyz~0001234
zzz70000(Ve900000)xyz~0001234
zzz80000(7i900000)xyz~0001234
zzz90000(2i900000)xyz~0001234
zzz90000(7i900000)xyz~0001234
zzz90000(Zh900000)xyz~0001234
10 rows selected.
man@ora10g:rac1> select max(x) from t1;
MAX(X)
-------------------------------
zzz90000(Zh900000)xyz~0001234
man@ora10g:rac1> select * from t1 where x>'zzz70000(Ve900000)xyz~0001234';
X
-------------------------------
zzz90000(Zh900000)xyz~0001234
zzz90000(7i900000)xyz~0001234
zzz90000(2i900000)xyz~0001234
zzz80000(7i900000)xyz~0001234
 
 Question.
 
 I am interested in finding the result for top 4 values  with the given info of max(x).
 
 Had x been a number data type, I would have written:
 select * from t1 where x > ( select max(x) - 4 from t1 );
 I know it can be done using analytics, but for some reason I want to use max(x).
 
 How do I select something similar in the above case (when x is varchar2, with the 10 rows shown)?
  select * from t1 where x > ( select ( max(x)  - ??)  from t1);
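There is no varchar2 analogue of max(x) - 4, so the usual route is a ranking function, e.g. dense_rank() over (order by x desc) <= 4. A rough Python sketch of that top-N idea, assuming plain lexicographic (byte-wise) ordering, which is what the varchar2 comparison above uses; the helper name top_n is made up:

```python
# Sketch: "top 4" for string-ordered values, akin to
#   select * from (select x, dense_rank() over (order by x desc) rnk from t1)
#   where rnk <= 4;
# Lexicographic (code-point) ordering is assumed here.
def top_n(values, n):
    cutoff = sorted(set(values), reverse=True)[:n]  # top n distinct values
    keep = set(cutoff)
    return sorted((v for v in values if v in keep), reverse=True)
```

On the ten sample values this keeps the same four rows the "where x > ..." query returned above.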
 
February  06, 2013 - 8:24 am UTC 
 
 
 
Help with this one
Doug, February  06, 2013 - 10:21 pm UTC
 
 
Hi Tom,
Here is my query
With tbs_name as (select tablespace_name 
                  from dba_tablespaces 
                  where tablespace_name = UPPER ('ADMIN_INDEX') )
SELECT   'free space' owner, '      ' OBJECT, file_id, block_id, blocks,
         (blocks * 8192) / (1024 * 1024) meg
    FROM dba_free_space a, Tbs_name
   WHERE a.tablespace_name = tbs_name.tablespace_name
UNION
SELECT   SUBSTR (owner, 1, 20), SUBSTR (segment_name, 1, 32), file_id,
         block_id, blocks,
         (blocks * 8192) / (1024 * 1024) meg
    FROM dba_extents a, tbs_name
   WHERE a.tablespace_name = tbs_name.tablespace_name
ORDER BY 3, 4;
I would like to take the output of this and summarise or consolidate it into contiguous space.
ie if output was
OWNER     OBJECT  FILE_ID   BLOCK_ID   BLOCKS MEG
---------------------------------------------------
USER       IDX_1      2         9        128      1
USER       IDX_2      2       137        128      1
USER       IDX_2      2       265        128      1
USER       IDX_2      2       393        128      1
free space            2       521        128      1
free space            2       649        128      1
USER       IDX_2      2       777        128      1
Then I would want to roll up block IDs 137, 265 and 393
as contiguous space for IDX_2, and also roll up the free space to report it as a total of contiguous space as well.
I know defrags are normally a waste of time; my goal is
to identify indexes I can rebuild to free up space.
I dropped 400Gb of indexes that were not being used and the space is available within the index tablespace, but I need it
for other purposes.
Regards
Doug 
February  07, 2013 - 6:46 am UTC 
 
I dropped 400Gb of indexes that were not being used and the space is available within the index tablespace but I need it for other purposes

great, so use it - why do you need to do anything to use this space? it is already 100% usable!
ops$tkyte%ORA11GR2> with tbs_name
  2  as
  3  (select tablespace_name
  4     from dba_tablespaces
  5    where tablespace_name = UPPER ('USERS')
  6  ),
  7  space
  8  as
  9  (
 10  SELECT 'free space' owner,
 11         '      ' OBJECT,
 12             file_id,
 13             block_id,
 14             blocks
 15    FROM dba_free_space a, Tbs_name
 16   WHERE a.tablespace_name = tbs_name.tablespace_name
 17   UNION ALL
 18  SELECT SUBSTR (owner, 1, 20),
 19         SUBSTR (segment_name, 1, 32), file_id,
 20         block_id,
 21             blocks
 22    FROM dba_extents a, tbs_name
 23   WHERE a.tablespace_name = tbs_name.tablespace_name
 24  )
 25  select owner, object, file_id, min(block_id), sum(blocks), count(*) cnt
 26    from (
 27  select owner,
 28         object,
 29             file_id,
 30             block_id,
 31             blocks,
 32             last_value( grp ignore nulls ) over (partition by owner, object, file_id order by block_id ) lgrp
 33    from (
 34  select owner,
 35         object,
 36             file_id,
 37             block_id,
 38             blocks,
 39             case when nvl( lag(block_id+blocks) over (partition by owner, object, file_id order by block_id), block_id+blocks) <> block_id
 40                  then row_number() over (partition by owner, object, file_id order by block_id)
 41                  end grp
 42    from space
 43         )
 44             )
 45   group by owner, object, file_id, lgrp
 46   order by owner, object, file_id
 47  /
OWNER      OBJECT                  FILE_ID MIN(BLOCK_ID) SUM(BLOCKS)        CNT
---------- -------------------- ---------- ------------- ----------- ----------
OPS$TKYTE  FY_CAL                        4           160           8          1
OPS$TKYTE  SYS_C0046904                  4           200           8          1
OPS$TKYTE  SYS_C0046904                  4          8872           8          1
OPS$TKYTE  SYS_C0046905                  4           248           8          1
OPS$TKYTE  SYS_C0046905                  4           280           8          1
OPS$TKYTE  SYS_C0046905                  4           328           8          1
OPS$TKYTE  SYS_C0046905                  4          1232           8          1
OPS$TKYTE  SYS_C0046905                  4          1264           8          1
OPS$TKYTE  SYS_C0046905                  4          8840           8          1
OPS$TKYTE  SYS_C0046905                  4          8856           8          1
OPS$TKYTE  SYS_C0046905                  4          9088         128          1
OPS$TKYTE  SYS_C0046905                  4         16640           8          1
OPS$TKYTE  SYS_C0046905                  4         16656           8          1
OPS$TKYTE  SYS_C0046905                  4         16672           8          1
OPS$TKYTE  SYS_C0046905                  4         18944           8          1
OPS$TKYTE  SYS_C0046905                  4         18960           8          1
OPS$TKYTE  SYS_C0046905                  4         18976           8          1
OPS$TKYTE  SYS_C0046905                  4         19032           8          1
OPS$TKYTE  SYS_C0046905                  4         19048           8          1
OPS$TKYTE  SYS_C0046905                  4         19064           8          1
OPS$TKYTE  SYS_C0046907                  4           272           8          1
OPS$TKYTE  SYS_C0046907                  4           296           8          1
OPS$TKYTE  SYS_C0046907                  4           344           8          1
OPS$TKYTE  SYS_C0046907                  4          1240           8          1
OPS$TKYTE  SYS_C0046907                  4          1272           8          1
OPS$TKYTE  SYS_C0046907                  4          8832           8          1
OPS$TKYTE  SYS_C0046907                  4          8848           8          1
OPS$TKYTE  SYS_C0046907                  4          8864           8          1
OPS$TKYTE  SYS_C0046907                  4          9216         128          1
OPS$TKYTE  SYS_C0046907                  4         16648           8          1
OPS$TKYTE  SYS_C0046907                  4         16664           8          1
OPS$TKYTE  SYS_C0046907                  4         16680           8          1
OPS$TKYTE  SYS_C0046907                  4         18952           8          1
OPS$TKYTE  SYS_C0046907                  4         18984           8          1
OPS$TKYTE  SYS_C0046907                  4         19024           8          1
OPS$TKYTE  SYS_C0046907                  4         19040           8          1
OPS$TKYTE  SYS_C0046907                  4         19056           8          1
OPS$TKYTE  T                             4          8880           8          1
OPS$TKYTE  T1                            4           152           8          1
OPS$TKYTE  T2                            4           136           8          1
OPS$TKYTE  TBL_HISTORY                   4         29960           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4           144           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4           168          16          2
OPS$TKYTE  TDETAIL_REF_MANY              4           232          16          2
OPS$TKYTE  TDETAIL_REF_MANY              4           256           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4           288           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4           336           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4          1184           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4          1200           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4          1216           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4          1248           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4          2176         128          1
OPS$TKYTE  TDETAIL_REF_MANY              4          6272         128          1
OPS$TKYTE  TDETAIL_REF_MANY              4          8704         128          1
OPS$TKYTE  TDETAIL_REF_MANY              4         10496         128          1
OPS$TKYTE  TDETAIL_REF_MANY              4         16384         128          1
OPS$TKYTE  TDETAIL_REF_MANY              4         16752           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4         18968           8          1
OPS$TKYTE  TDETAIL_REF_MANY              4         19000          16          2
OPS$TKYTE  TDETAIL_REF_ONE               4           184           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4           216          16          2
OPS$TKYTE  TDETAIL_REF_ONE               4           264           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4           304          24          3
OPS$TKYTE  TDETAIL_REF_ONE               4          1176           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4          1192           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4          1208           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4          1224           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4          1256           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4          2304         128          1
OPS$TKYTE  TDETAIL_REF_ONE               4          8576         128          1
OPS$TKYTE  TDETAIL_REF_ONE               4          8960         128          1
OPS$TKYTE  TDETAIL_REF_ONE               4         16256         128          1
OPS$TKYTE  TDETAIL_REF_ONE               4         16744           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4         16760           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4         18992           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4         19016           8          1
OPS$TKYTE  TDETAIL_REF_ONE               4         27776         128          1
OPS$TKYTE  TMASTER                       4           192           8          1
OPS$TKYTE  TMASTER                       4           208           8          1
OPS$TKYTE  TMASTER                       4           352         824         13
OPS$TKYTE  TMASTER                       4          1280         896          7
OPS$TKYTE  TMASTER                       4          2432        3840         30
OPS$TKYTE  TMASTER                       4          6400        2176          3
OPS$TKYTE  TMASTER                       4          9344        1152          2
OPS$TKYTE  TMASTER                       4         10624        5632          9
OPS$TKYTE  TMASTER                       4         16512         128          1
OPS$TKYTE  TMASTER                       4         16688          56          7
OPS$TKYTE  TMASTER                       4         16768        2176          3
OPS$TKYTE  TMASTER                       4         19072        8704         19
OPS$TKYTE  TMASTER                       4         27904        2048          2
free space                               4           128           8          1
free space                               4          8888          72          9
free space                               4         29952           8          1
free space                               4         29968        1488          1
94 rows selected.
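The contiguity test in the query above - a new group starts whenever the previous extent's block_id + blocks does not land exactly on the current block_id - can be sketched like this. The extents below are illustrative sample values, and the function name merge_extents is made up:

```python
# Sketch of the extent-merging rule from the SQL above: within one
# (owner, object, file_id), extents sorted by block_id are contiguous
# when previous block_id + blocks == current block_id.
def merge_extents(extents):
    """extents: [(block_id, blocks), ...] sorted by block_id.
    Returns [(first_block_id, total_blocks, extent_count), ...]."""
    runs = []
    for block_id, blocks in extents:
        if runs and runs[-1][0] + runs[-1][1] == block_id:
            first, total, cnt = runs[-1]          # extends the current run
            runs[-1] = (first, total + blocks, cnt + 1)
        else:
            runs.append((block_id, blocks, 1))    # gap: start a new run
    return runs
```

For example, extents at blocks 168 and 176 (8 blocks each) merge into one 16-block run, while a jump to block 232 starts a new run.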
 
 
 
 
Printing range with comma separated values
Arvind, February  06, 2013 - 11:36 pm UTC
 
 
Dear Tom,
This is further information, as asked by you in my earlier post. In that question, the user wanted the data categorised into continuous ranges. I have a somewhat similar, but slightly different, requirement, as given below.
I have a table and data as given below:
create table ag_t1
(col1  varchar2(5),
 col2  varchar2(10)
 );
 
 insert into ag_t1 values ('01.01', 'v1');
 insert into ag_t1 values ('01.02', 'v1');
 insert into ag_t1 values ('01.03', 'v1');
 insert into ag_t1 values ('01.04', 'v2');
 insert into ag_t1 values ('01.05', 'v1');
 insert into ag_t1 values ('01.06', 'v2');
 insert into ag_t1 values ('02.01', 'v1');
 insert into ag_t1 values ('02.02', 'v1');
 insert into ag_t1 values ('02.03', 'v2');
 insert into ag_t1 values ('02.04', 'v2');
 insert into ag_t1 values ('02.05', 'v1');
 insert into ag_t1 values ('02.06', 'v2');
 insert into ag_t1 values ('08.01', 'v1');
 insert into ag_t1 values ('08.02', 'v1');
 insert into ag_t1 values ('08.03', 'v1');
 insert into ag_t1 values ('08.04', 'v2');
 insert into ag_t1 values ('08.05', 'v1');
 insert into ag_t1 values ('08.06', 'v2');
I want the output as given below.
Output 
 v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06 
col1 values can be in three formats
a) 01.01
b) 01.001
c) 1 
February  11, 2013 - 7:58 am UTC 
 
you answered nothing that I asked above actually???
sigh.
if there is a 01.001 why isn't it in your example, where does it fit in.  Would it be "contiguous" with 01.01?  what does it mean.
if there is a 1, why isn't it in your example where does it fit in.  Would it be contiguous with 01?  
be precise, pretend you were explaining this to your mom - be that detailed.
 
 
 
arvind's analytic problem.
Biswaranjan, February  07, 2013 - 1:13 pm UTC
 
 
Hi Arvind,
I just analyzed your request deeply (initially it was very difficult to understand the logic behind it, but finally I got it ;)).
Please run the query below and let us know whether it works.
I tested it with your input and some random inputs and it works fine, but please test it yourself.
#############(tested in 11g) I have just modified Rajesh's query to fit your output #######
select col2,listagg(reslt,',') within group(order by min_col1) as listed_values
    from (
    select col2,min(col1) min_col1,max(col1) max_col1,
        case when nullif(min(col1),max(col1)) is not null
                then min(col1)||' to '||max(col1)
       else to_char(min(col1)) end reslt
    from (
    select col1,col2,case when col2<> 'v2' then max(val) over(order by rownum) else val end grp
    from (
   select ag_t1.*,
     case when lag(col2) over(order by rownum) is null
     or lag(col2) over(order by rownum) <> col2  or col2<>'v1'  then
     row_number() over(order by rownum) end as val
   from ag_t1 
        )
        )
   group by col2,grp
   )
   group by col2;
############################
For your data I got the result below.
 v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06 
But I know Tom can write it more elegantly (and with better performance).
regards,
Biswaranjan. 
February  11, 2013 - 8:50 am UTC 
 
you got it totally?
where do 1 and 01.001 fit into the scheme of things?  what would your code do with those?
it is not possible for you to have gotten it totally, it was not specified at a level of detail for anyone to get it right without guessing at what should happen with those values.
 
 
 
slightly different
A reader, February  07, 2013 - 4:22 pm UTC
 
 
Hi Tom,
I would like to reclaim the space because the system does not need 400Gb of free space in the index tablespace.
It does, however, need disk space to turn on archive log mode, and currently the system is at full capacity, with no place to add more disks.
I don't think I explained my requirement very well, sorry, but I liked seeing your reply.
What I am trying to do is work out which indexes I should
rebuild to try to get the free space towards the end of a data file, so I can shrink it.
Regards Doug
 
February  11, 2013 - 9:04 am UTC 
 
oh, the way to do this would be to rebuild all of the indexes in that tablespace into a new tablespace and then drop the old, now empty, tablespace.
the new tablespace will be 100% contiguous (use an autoextend datafile that starts "small" and grows as you put more in there) and the old tablespace can just be dropped.
If you do it by identifying (via dba_extents) which indexes are at the "end" of your datafile - you can move the furthest-out index first, shrinking the old datafile as you are growing the new one.
and if you want, just rename the new tablespace to the old name after you are done. 
 
 
Arvind's question logic.
Biswaranjan., February  08, 2013 - 12:51 am UTC
 
 
for Arvinda's question logic.
('01.01', 'v1') 
('01.02', 'v1')  
('01.03', 'v1')
('01.04', 'v2')
('01.05', 'v1')
('01.06', 'v2')
('02.01', 'v1')
('02.02', 'v1')
('02.03', 'v2')
('02.04', 'v2')
('02.05', 'v1')
('02.06', 'v2')
('08.01', 'v1')
('08.02', 'v1')
('08.03', 'v1')
('08.04', 'v2')
('08.05', 'v1')
('08.06', 'v2')
v1: 01.01 to 01.03 (because it is continuous up to 01.03, we skip 01.02), then 01.05
(though it has no v1 run above or below it, we keep it on its own).
Moving down: 02.01 to 02.02 (because this run of v1 values starts at 02.01 and is continuous up to 02.02), then
02.05 (again kept on its own, between commas).
Further down: 08.01 to 08.03 (this run starts at 08.01 and is continuous up to 08.03),
and finally 08.05 (kept on its own).
So, without the comments, for 'v1' it would be:
01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
And for v2 we do not collapse anything; we place each and every v2 value comma separated, so simply:
01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
That is all the logic, as I understood it.
thanks & regards,
Biswaranjan.
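The walkthrough above boils down to: collapse runs of consecutive col1 values into "a to b" for v1, and list v2 values individually. A rough Python sketch of that compression, assuming "consecutive" means the same major part with the minor part one higher (my reading of the examples; the thread never pins down 01.001 or bare 1, so those cases are left out):

```python
# Sketch of the range compression described above: collapse consecutive
# "MM.NN" values into "a to b" when collapse=True (the v1 case), or just
# list them all when collapse=False (the v2 case).
def compress(values, collapse):
    if not collapse:                       # v2: every value, comma separated
        return ", ".join(values)
    def key(v):
        major, minor = v.split(".")
        return int(major), int(minor)
    out, start, prev = [], values[0], values[0]
    for v in values[1:]:
        pm, pn = key(prev)
        m, n = key(v)
        if (m, n) == (pm, pn + 1):         # extends the current run
            prev = v
        else:                              # run broken: emit and restart
            out.append(start if start == prev else start + " to " + prev)
            start = prev = v
    out.append(start if start == prev else start + " to " + prev)
    return ", ".join(out)
```

Applied to the ordered v1 and v2 values from the sample data, this reproduces the two output lines shown above.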
 
 
:) you opened my eyes further.
Biswaranjan, February  11, 2013 - 9:28 am UTC
 
 
Frankly speaking, I just guessed at that logic (as you know :)). I should not have written "got it totally".
But I know guessing can be disastrous and a waste of time in many cases.
Thank you Tom.
By the way,
I have been using this site for 3 years and never
faced an error opening it (I mean clicking any link on your homepage).
But now I am getting the error below:
######
ORA-01654: unable to extend index ASKTOM.ATE_QUESTION_LOG_IDX by 8192 in tablespace FLOW_19246
##############
I even tried from a Linux machine but the same thing happened.
After trying several times I was able to open the site, and I am writing now :).
Is it an issue with your site or with my system? (I am confused.)
Thanks again.
 
February  11, 2013 - 10:12 am UTC 
 
sorry about that, I ran out of space on apex.oracle.com and had to up my quota. 
 
 
Continuation from Matt’s request on 1/15/2013
Artieboy, February  19, 2013 - 6:19 pm UTC
 
 
Hey Tom:
Thanks for letting me know my query did not work.  So how about this?  It’s kind of like your carry-down approach, but it looks for the first value instead.
First I added some extra rows after Matt’s original records and the two records you added:
INSERT INTO mtemp VALUES('11-jan-2013',   '12-jan-2013',   1,   2);
INSERT INTO mtemp VALUES('13-jan-2013',   '14-jan-2013',   1,   2);
INSERT INTO mtemp VALUES('15-jan-2013',   '16-jan-2013',   5000,   6000);
INSERT INTO mtemp VALUES('17-jan-2013',   '18-jan-2013',   1,  2);
Query:
select min(startdt), max(enddt), ra, nqc
from
(--3
select startdt, enddt, ra, nqc,
case when c1 is null then first_value(c1 ignore nulls)
over(order by startdt rows between current row and unbounded following)-1 else c1 end grp
from
(--2
select startdt, enddt, ra, nqc, lag_ra, lead_ra,
case when (ra=lag_ra  or ra=lead_ra) AND (nqc=lag_nqc or nqc=lead_nqc) 
then null else startdt end c1
from
(--1
select startdt, enddt, ra,
lag(ra) over(order by startdt)lag_ra,
lead(ra) over (order by startdt) lead_ra,
nqc,
lag(nqc) over (order by startdt)lag_nqc,
lead(nqc) over (order by startdt)lead_nqc
from mtemp
)--1
)--2
)--3
group by grp, ra, nqc
order by 1
Explanation:
Where ra does not equal the ra of the next or previous record (and the same for nqc), that is a stand-alone record.  For those records return the startdt of that row; for all others return null.
Now I have all the “break points”: places where a continuous set is broken.  For each null value, starting from the current row and going down, take the first non-null date seen in column c1, minus 1 day.  This creates a grouping based on the date.  Then aggregate.
In addition:
I added one more record:
INSERT INTO mtemp VALUES('30-dec-2012',   '31-dec-2012',   5000, 6000);
This gives the first row of column c1 a date entry and the query still works.
Is this a viable solution, Oracle?  Just seems easier to group records that have a date field (min(date), max(date)) with a date field.
Cheers!
 
February  25, 2013 - 8:27 am UTC 
 
.. Just seems easier to group records that have a 
date field (min(date), max(date)) with a date field.
 ...
how so?  Your query has more levels of nesting - and seems more complex to me.
ops$tkyte%ORA11GR2> select startdt, enddt, ra, nqc, last_value(tag ignore nulls) over (order by 
startdt) grp
  2    from (
  3  select startdt, enddt, ra, nqc,
  4         case when decode( lag(ra) over (order by startdt), ra, 1,0 ) = 0
  5                       and
  6                   decode( lag(nqc) over (order by startdt), nqc, 1,0 ) = 0
  7                          then row_number() over (order by startdt)
  8                  end tag
  9    from mtemp
 10         )
 11  /
two layers of nesting.
I didn't prove your query works (or not) - I'm just stating that I find your approach much less obvious. 
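Tom's two-level query can also be traced procedurally. The following Python sketch (illustrative only, not the Oracle SQL above) mimics it: tag a row with its row number whenever the (ra, nqc) pair changes from the previous row, and carry the last non-null tag forward, which is what last_value(tag ignore nulls) over (order by startdt) does.

```python
# Sketch of the "tag on change, carry forward" grouping technique.
# Sample rows are hypothetical (startdt, enddt, ra, nqc) tuples, already
# ordered by startdt.
rows = [
    ("11-jan", "12-jan", 1, 2),
    ("13-jan", "14-jan", 1, 2),
    ("15-jan", "16-jan", 5000, 6000),
    ("17-jan", "18-jan", 1, 2),
]

def group_rows(rows):
    grouped = []
    grp = None
    prev = None
    for i, (startdt, enddt, ra, nqc) in enumerate(rows, start=1):
        if prev is None or (ra, nqc) != prev:   # value changed: new tag
            grp = i                             # row_number() acts as the tag
        prev = (ra, nqc)
        # grp is carried forward unchanged for rows whose tag would be null
        grouped.append((startdt, enddt, ra, nqc, grp))
    return grouped

for r in group_rows(rows):
    print(r)
```

A final min(startdt)/max(enddt) per (grp, ra, nqc) then collapses each run into one row, as in the outer query.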
 
 
arvind's analytic problem
A reader, February  21, 2013 - 10:17 pm UTC
 
 
Dear Tom,
This is further information, as asked by you in my earlier post. In that question, the user wants the
data to be categorised into continuous ranges. I have a somewhat
similar, but slightly different, requirement, as given below.
I have table and data as  given below
create table ag_t1
(col3 varchar2(4)
col1  varchar2(5),
 col2  varchar2(10)
 );
here my primary key is col3, and col1
Data set 1 -----
 
 insert into ag_t1 values ('4000','01.01', 'v1');
 insert into ag_t1 values ('4000','01.02', 'v1');
 insert into ag_t1 values ('4000','01.03', 'v1');
 insert into ag_t1 values ('4000','01.04', 'v2');
 insert into ag_t1 values ('4000','01.05', 'v1');
 insert into ag_t1 values ('4000','01.06', 'v2');
 insert into ag_t1 values ('4000','02.01', 'v1');
 insert into ag_t1 values ('4000','02.02', 'v1');
 insert into ag_t1 values ('4000','02.03', 'v2');
 insert into ag_t1 values ('4000','02.04', 'v2');
 insert into ag_t1 values ('4000','02.05', 'v1');
 insert into ag_t1 values ('4000','02.06', 'v2');
 insert into ag_t1 values ('4000','08.01', 'v1');
 insert into ag_t1 values ('4000','08.02', 'v1');
 insert into ag_t1 values ('4000','08.03', 'v1');
 insert into ag_t1 values ('4000','08.04', 'v2');
 insert into ag_t1 values ('4000','08.05', 'v1');
 insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output 
 v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06 
Data set 2 -----
 insert into ag_t1 values ('4001','1', 'v3');
 insert into ag_t1 values ('4001','2', 'v3');
 insert into ag_t1 values ('4001','3', 'v3');
 insert into ag_t1 values ('4001','4', 'v4');
 insert into ag_t1 values ('4001','5', 'v3');
 insert into ag_t1 values ('4001','6', 'v4');
for above data set, I want the output as given below.
Output 
 v3 - 1 to 3, 5
 v4 - 4, 6
Data set 3 -----
 
 insert into ag_t1 values ('4002','01.001', 'v1');
 insert into ag_t1 values ('4002','01.002', 'v1');
 insert into ag_t1 values ('4002','01.003', 'v1');
 insert into ag_t1 values ('4002','01.004', 'v2');
 insert into ag_t1 values ('4002','01.005', 'v1');
 insert into ag_t1 values ('4002','01.006', 'v2');
 insert into ag_t1 values ('4002','02.001', 'v1');
 insert into ag_t1 values ('4002','02.002', 'v1');
 insert into ag_t1 values ('4002','02.003', 'v2');
 insert into ag_t1 values ('4002','02.004', 'v2');
 insert into ag_t1 values ('4002','02.005', 'v1');
 insert into ag_t1 values ('4002','02.006', 'v2');
 insert into ag_t1 values ('4002','08.001', 'v1');
 insert into ag_t1 values ('4002','08.002', 'v1');
 insert into ag_t1 values ('4002','08.003', 'v1');
 insert into ag_t1 values ('4002','08.004', 'v2');
 insert into ag_t1 values ('4002','08.005', 'v1');
 insert into ag_t1 values ('4002','08.006', 'v2');
for above data set, I want the output as given below.
Output 
 v1 - 01.001 to 01.003, 01.005, 02.001 to 02.002, 02.005, 08.001 to 08.003, 08.005
 v2 - 01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006  
February  25, 2013 - 11:11 am UTC 
 
ops$tkyte%ORA11GR2> select c1, listagg( label, ',' ) within group (order by max_grp) str
  2    from (
  3  select c1, max_grp, to_char(min(c2),'fm00.00') || case when min(c2) <> max(c2)
  4                                                         then ' to ' || to_char(max(c2),'fm00.00')
  5                                                     end label
  6    from (
  7  select c1, c2, max(grp) over (partition by c1 order by c2) max_grp
  8    from (
  9  select c1, c2,
 10         case when nvl(lag(c2) over (partition by c1 order by c2), c2) <> c2-0.01
 11              then row_number() over (partition by c1 order by c2)
 12          end grp
 13    from (select col2 C1, to_number(col1) C2
 14            from ag_t1)
 15         )
 16         )
 17   group by c1, max_grp
 18         )
 19   group by c1
 20   order by c1
 21  /
C1         STR
---------- ------------------------------------------------------------------------
v1         01.01 to 01.03,01.05,02.01 to 02.02,02.05,08.01 to 08.03,08.05
v2         01.04,01.06,02.03 to 02.04,02.06,08.04,08.06
now you see the technique (break the query out layer by layer, running the innermost query, then the next layer and so on so you can understand what is going on)
and then you can easily do it for the other data sets. 
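The layered query can also be traced in a small Python sketch (illustrative only, not the SQL above) of the same gap-detection idea: within each label, a new group starts whenever the previous value is not exactly one 0.01 step below the current one; each group is then rendered as a single value or a "lo to hi" range. The sample data is a subset of data set 1.

```python
# Sketch of the lag-based gap detection used in the query above.
from collections import defaultdict

data = [("01.01", "v1"), ("01.02", "v1"), ("01.03", "v1"),
        ("01.04", "v2"), ("01.05", "v1"), ("01.06", "v2")]

def ranges(data, step=0.01):
    by_label = defaultdict(list)
    for col1, label in sorted(data):
        by_label[label].append(round(float(col1), 2))
    out = {}
    for label, vals in by_label.items():
        groups, cur = [], [vals[0]]
        for v in vals[1:]:
            if abs(v - cur[-1] - step) < 1e-9:  # exactly one step: extend run
                cur.append(v)
            else:                               # gap: start a new group
                groups.append(cur)
                cur = [v]
        groups.append(cur)
        out[label] = ",".join(
            f"{g[0]:05.2f}" if len(g) == 1
            else f"{g[0]:05.2f} to {g[-1]:05.2f}"
            for g in groups)
    return out

print(ranges(data))
```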
 
 
 
Analytical Query
Arvind, February  28, 2013 - 3:13 am UTC
 
 
Dear Tom,
Thanks for your response to my query. I have gone through the query. Sometimes the data comes with letters appended, like 01.01A, 01.01B, 01.02 etc. In that case it may not be possible to convert the value to a number.
Thanks & Regards
Arvind 
February  28, 2013 - 8:02 am UTC 
 
you didn't specify that.
get us a FULL specification that covers 100% of the issue and maybe we can take a look at it.
we can only deal with what we are given.  even here you don't say what to do with these values,  you give no example (creates, inserts) to work with. 
 
 
Arvind query
Stew Ashton, February  28, 2013 - 9:51 am UTC
 
 
 I'm being a bad boy. Tom said to specify more fully, so I should wait for Arvind to answer, but I'm not.
My variant is a bit "generic", but it only works if two assumptions are true:
- first, that the col1 values are formatted so that a simple ORDER BY works, without having to convert to numbers;
- and second, that any two successive col1 values are "consecutive" as long as the col3 and col2 values are the same.
These assumptions fit the datasets given. If they are wrong, then Arvind has a lot more to specify.
One other thing, it appears that Arvind only wants " to " when there are at least 3 consecutive values.
P.S. This is another example of the "Tabibitosan method" as explained at  
https://forums.oracle.com/forums/thread.jspa?threadID=1005478  select col3, col2, listagg(
  min_col1 || decode(cnt, 1, null, 2, ', ' || max_col1, ' to ' || max_col1)
  , ', '
) within group(order by min_col1) col1
from (
  select col3, col2, count(*) cnt, min(col1) min_col1, max(col1) max_col1
  from (
    select col3, col2, col1,
    row_number() over(partition by col3 order by col1) -
    row_number() over(partition by col3, col2 order by col1) grp
    from ag_t1 t
  )
  group by col3, col2, grp
)
group by col3, col2
order by col3, col2;
COL3  COL2  COL1
----- ----- --------------------------------------------------------------------------
4000  v1    01.01 to 01.03, 01.05, 02.01, 02.02, 02.05, 08.01 to 08.03, 08.05
4000  v2    01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06
4001  v3    1 to 3, 5
4001  v4    4, 6
4002  v1    01.001 to 01.003, 01.005, 02.001, 02.002, 02.005, 08.001 to 08.003, 08.005
4002  v2    01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006 
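The row_number-difference trick Stew uses (the "Tabibitosan method") can be sketched in Python as follows (illustrative only, not the SQL above): for rows ordered by col1, the overall row number minus the row number within the label stays constant across a run of adjacent same-label rows, so that difference serves as the group key. The sample is data set 2.

```python
# Sketch of the Tabibitosan (row_number difference) grouping method.
from collections import defaultdict

rows = [("1", "v3"), ("2", "v3"), ("3", "v3"),
        ("4", "v4"), ("5", "v3"), ("6", "v4")]

def tabibitosan(rows):
    seen = defaultdict(int)            # row number within each label
    groups = defaultdict(list)         # (label, grp) -> col1 values
    for rn, (col1, label) in enumerate(sorted(rows, key=lambda r: r[0]), 1):
        seen[label] += 1
        # constant within a run of adjacent rows sharing the same label
        groups[(label, rn - seen[label])].append(col1)
    return groups

for (label, grp), vals in sorted(tabibitosan(rows).items()):
    print(label, vals)
```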
 
February  28, 2013 - 11:52 am UTC 
 
nope, doesn't work - assumption 2 cannot be true.  they want values to be consecutive only if they differ by one unit - 0.01 in this case.
for example, if you remove a row:
-- insert into ag_t1 values ('4000','01.04', 'v2');
then your result becomes:
ops$tkyte%ORA11GR2> select col3, col2, listagg(
  2    min_col1 || decode(cnt, 1, null, 2, ', ' || max_col1, ' to ' || max_col1)
  3    , ', '
  4  ) within group(order by min_col1) col1
  5  from (
  6    select col3, col2, count(*) cnt, min(col1) min_col1, max(col1) max_col1
  7    from (
  8      select col3, col2, col1,
  9      row_number() over(partition by col3 order by col1) -
 10      row_number() over(partition by col3, col2 order by col1) grp
 11      from ag_t1 t
 12    )
 13    group by col3, col2, grp
 14  )
 15  group by col3, col2
 16  order by col3, col2;
COL3 COL2       COL1
---- ---------- ------------------------------------------------------------------------
4000 v1         01.01 to 01.05, 02.01, 02.02, 02.05, 08.01 to 08.03, 08.05
4000 v2         01.06, 02.03, 02.04, 02.06, 08.04, 08.06
but we don't have 1.01 to 1.05 - 1.04 is missing...
they have not specified this adequately to answer - but I know that assumption #2 cannot be correct - removing a row for V2 should not affect the answer of V1. 
 
 
Analytical Query
Arvind, March     03, 2013 - 11:38 pm UTC
 
 
This sample includes all cases
This is further information, as asked by you in my earlier post. In that question, the user wants the
data to be categorised into continuous ranges. I have a somewhat
similar, but slightly different, requirement, as given below.
I have table and data as  given below
create table ag_t1
(col3 varchar2(4)
col1  varchar2(5),
 col2  varchar2(10)
 );
here my primary key is col3, and col1
Data set 1 -----
 
 insert into ag_t1 values ('4000','01.01', 'v1');
 insert into ag_t1 values ('4000','01.02', 'v1');
 insert into ag_t1 values ('4000','01.03', 'v1');
 insert into ag_t1 values ('4000','01.04', 'v2');
 insert into ag_t1 values ('4000','01.05', 'v1');
 insert into ag_t1 values ('4000','01.06', 'v2');
 insert into ag_t1 values ('4000','02.01', 'v1');
 insert into ag_t1 values ('4000','02.02', 'v1');
 insert into ag_t1 values ('4000','02.03', 'v2');
 insert into ag_t1 values ('4000','02.04', 'v2');
 insert into ag_t1 values ('4000','02.05', 'v1');
 insert into ag_t1 values ('4000','02.06', 'v2');
 insert into ag_t1 values ('4000','08.01', 'v1');
 insert into ag_t1 values ('4000','08.02', 'v1');
 insert into ag_t1 values ('4000','08.03', 'v1');
 insert into ag_t1 values ('4000','08.04', 'v2');
 insert into ag_t1 values ('4000','08.05', 'v1');
 insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output 
 v1 - 01.01 to 01.03, 01.05, 02.01 to 02.02, 02.05, 08.01 to 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06 
Data set 2 -----
 insert into ag_t1 values ('4001','1', 'v3');
 insert into ag_t1 values ('4001','2', 'v3');
 insert into ag_t1 values ('4001','3', 'v3');
 insert into ag_t1 values ('4001','4', 'v4');
 insert into ag_t1 values ('4001','5', 'v3');
 insert into ag_t1 values ('4001','6', 'v4');
for above data set, I want the output as given below.
Output 
 v3 - 1 to 3, 5
 v4 - 4, 6
Data set 3 -----
 
 insert into ag_t1 values ('4002','01.001', 'v1');
 insert into ag_t1 values ('4002','01.002', 'v1');
 insert into ag_t1 values ('4002','01.003', 'v1');
 insert into ag_t1 values ('4002','01.004', 'v2');
 insert into ag_t1 values ('4002','01.005', 'v1');
 insert into ag_t1 values ('4002','01.006', 'v2');
 insert into ag_t1 values ('4002','02.001', 'v1');
 insert into ag_t1 values ('4002','02.002', 'v1');
 insert into ag_t1 values ('4002','02.003', 'v2');
 insert into ag_t1 values ('4002','02.004', 'v2');
 insert into ag_t1 values ('4002','02.005', 'v1');
 insert into ag_t1 values ('4002','02.006', 'v2');
 insert into ag_t1 values ('4002','08.001', 'v1');
 insert into ag_t1 values ('4002','08.002', 'v1');
 insert into ag_t1 values ('4002','08.003', 'v1');
 insert into ag_t1 values ('4002','08.004', 'v2');
 insert into ag_t1 values ('4002','08.005', 'v1');
 insert into ag_t1 values ('4002','08.006', 'v2');
for above data set, I want the output as given below.
Output 
 v1 - 01.001 to 01.003, 01.005, 02.001 to 02.002, 02.005, 08.001 to 08.003, 08.005
 v2 - 01.004, 01.006, 02.003, 02.004, 02.006, 08.004, 08.006
Data set 4 -----
 
 insert into ag_t1 values ('4000','01.01A', 'v1');
 insert into ag_t1 values ('4000','01.01B', 'v1');
 insert into ag_t1 values ('4000','01.01C', 'v1');
 insert into ag_t1 values ('4000','01.02', 'v1');
 insert into ag_t1 values ('4000','01.03', 'v1');
 insert into ag_t1 values ('4000','01.04', 'v2');
 insert into ag_t1 values ('4000','01.05', 'v1');
 insert into ag_t1 values ('4000','01.06', 'v2');
 insert into ag_t1 values ('4000','02.01A', 'v1');
 insert into ag_t1 values ('4000','02.01B', 'v1');
 insert into ag_t1 values ('4000','02.01C', 'v1');
 insert into ag_t1 values ('4000','02.02', 'v1');
 insert into ag_t1 values ('4000','02.03', 'v2');
 insert into ag_t1 values ('4000','02.04', 'v2');
 insert into ag_t1 values ('4000','02.05', 'v1');
 insert into ag_t1 values ('4000','02.06', 'v2');
 insert into ag_t1 values ('4000','08.01A', 'v1');
 insert into ag_t1 values ('4000','08.01B', 'v1');
 insert into ag_t1 values ('4000','08.01C', 'v1');
 insert into ag_t1 values ('4000','08.02', 'v1');
 insert into ag_t1 values ('4000','08.03', 'v1');
 insert into ag_t1 values ('4000','08.04', 'v2');
 insert into ag_t1 values ('4000','08.05', 'v1');
 insert into ag_t1 values ('4000','08.06', 'v2');
for above data set, I want the output as given below.
Output 
 v1 - 01.01A to 01.01C, 01.02, 01.03, 01.05, 02.01A to 02.01C, 02.02, 02.05, 08.01A to 08.01C, 08.02, 08.03, 08.05
 v2 - 01.04, 01.06, 02.03, 02.04, 02.06, 08.04, 08.06  
March     04, 2013 - 4:32 pm UTC 
 
why did you post all of the redundant stuff ...
and a create table that does not work
with inserts that fail after you fix the create table...
                                  *
ERROR at line 1:
ORA-12899: value too large for column "OPS$TKYTE"."AG_T1"."COL1" (actual: 6,
maximum: 5)
ugh....
if we are allowed to assume:
all data will either be in the form
NN.NN   where N is a digit between 0 and 9
or 
NN.NNC  where N is a digit between 0 and 9 and C is a character between A and Z
then, a simple variation on a theme can work - we'll convert NN.NN into a number and NN.NNC into a number by taking NN.NN + ascii-code(C)/10000
then the original query just works - albeit with a little more complex formatting of the output.
If these assumptions are not correct - FIX YOUR SPECIFICATION (actually, provide one, you never ever have...)
ops$tkyte%ORA11GR2> select c1, listagg( label, ',' ) within group (order by max_grp) str
  2    from (
  3  select c1, max_grp,
  4         to_char( trunc(min(c2),2) ,'fm00.00') ||
  5                      case when (min(c2)-trunc(min(c2),2)) <> 0
  6                                       then chr( (min(c2)-trunc(min(c2),2))*10000 )
  7                                   end ||
  8                      case when min(c2) <> max(c2)
  9                  then ' to ' || to_char( trunc(max(c2)),'fm00.00') ||
 10                            case when (max(c2)-trunc(max(c2),2)) <> 0
 11                                             then chr( (max(c2)-trunc(max(c2),2))*10000 )
 12                                         end
 13                  end label
 14    from (
 15  select c1, c2, max(grp) over (partition by c1 order by c2) max_grp
 16    from (
 17  select c1, c2,
 18         case when nvl(lag(c2) over (partition by c1 order by c2), c2) <> c2-0.0001
 19              then row_number() over (partition by c1 order by c2)
 20          end grp
 21    from (select col2 C1, to_number(substr( col1, 1, 5 ))+ nvl(ascii(substr(col1,6,1))/10000,0) C2
 22            from ag_t1)
 23         )
 24         )
 25   group by c1, max_grp
 26         )
 27   group by c1
 28   order by c1
 29  /
C1
----------
STR
-------------------------------------------------------------------------------
v1
01.01A to 01.00C,01.02,01.03,01.05,02.01A to 02.00C,02.02,02.05,08.01A to 08.00
C,08.02,08.03,08.05
v2
01.04,01.06,02.03,02.04,02.06,08.04,08.06
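The NN.NNC-to-number mapping above can also be sketched outside SQL. This Python sketch (illustrative only) scales into integer units of 1/10000, i.e. round(value*10000) plus the letter's ASCII code, rather than Tom's exact value + ascii(C)/10000, to sidestep floating-point truncation issues; the ordering idea is the same, and the mapping is reversible for the output formatting step.

```python
# Sketch of encoding "NN.NN" / "NN.NNC" values as sortable numbers.
def encode(col1):
    key = round(float(col1[:5]) * 10000)   # NN.NN in units of 1/10000
    if len(col1) > 5:
        key += ord(col1[5])                # A..Z = 65..90, stays below 100
    return key

def decode(key):
    letter = key % 100                     # 0 means no trailing letter
    return f"{(key - letter) / 10000:05.2f}" + (chr(letter) if letter else "")

# Order is preserved: 01.01A < 01.01B < 01.02
print(encode("01.01A"), encode("01.01B"), encode("01.02"))
print(decode(encode("01.01A")))
```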
 
 
 
problem understanding statement
preet, March     20, 2013 - 4:56 pm UTC
 
 
Hi Tom,
I am going through ORACLE documentation for understanding Analytic function. Unfortunately, I couldn't get this statement "To filter the results of a query based on an analytic function, nest these functions within the parent query, and then filter the results of the nested subquery."
Could you please provide some example on this statement? I couldn't understand it.
Thank you 
March     25, 2013 - 3:46 pm UTC 
 
see my original answer, it does just this.
I cannot write:
where lead(station) over (partition by order# order by close_date) <> station
so I have to alias lead(station) over (partition by order# order by close_date) as "lead_station" and in another level of the query use
where lead_station <> station
Basically you cannot "where" on the analytics, you have to create an inline view (or with factored subquery) and then where on the results of that inline view.  
 
 
Anand, March     25, 2013 - 6:57 am UTC
 
 
Hi Tom,
I have table as below :
DROP TABLE test 
/
CREATE TABLE test (branch_code NUMBER,tot_cnt NUMBER ,process_id NUMBER )
/
INSERT INTO test VALUES (1,5000,99)
/
INSERT INTO test VALUES (2,2500,99)
/
INSERT INTO test VALUES (3,2500,99)
/
INSERT INTO test VALUES (4,3000,99)
/
INSERT INTO test VALUES (5,2000,99)
/
INSERT INTO test VALUES (6,1000,99)
/
INSERT INTO test VALUES (7,4000,99)
/
INSERT INTO test VALUES (8,1200,99)
/
Now I want to update process_id from the default of 99 to values 1 through 6. The rows should be
divided so that the total of tot_cnt is as close to equal as possible for each process_id. I know it is not possible
to make the totals exactly equal.
If I use the query below, it gives me a higher total on some process_ids and a lower total on others:
 MERGE INTO test a
    USING (SELECT branch_code,
                  ntile(6) over(order by branch_code) stream
             FROM test) b
    ON (a.branch_code = b.branch_code)
    WHEN MATCHED THEN
      UPDATE SET a.process_id = b.stream;
Result :
SELECT SUM(TOT_CNT), PROCESS_ID FROM TEST GROUP BY PROCESS_ID
SUM(TOT_CNT) PROCESS_ID
7500             1  ---> Higher count
5500             2
2000             3
1000             4  ---> Lower count
4000             5
1200             6   
 
Anand, April     23, 2013 - 1:27 pm UTC
 
 
Hi Tom,
Waiting for your reply on above question. 
April     23, 2013 - 2:56 pm UTC 
 
search for 
bin fitting 
on this site (and others).  there are approaches with the model clause, many demonstrated. 
 
 
Anand, April     25, 2013 - 1:47 pm UTC
 
 
Hi Tom,
Thanks for your reply. I never knew about this.
You suggested the site below in one of the forums:
http://www.jlcomp.demon.co.uk/faq/Bin_Fitting.html
Now my question: what if I want to divide the records into 15 or 20 buckets? Do I have to modify the query and add buckets 3, 4, 5 ... 20 (as per the example)? 
April     25, 2013 - 7:49 pm UTC 
 
left as an exercise for the reader... we've pointed you to the "technique".  Learn the technique and you'll be able to use it in thousands of places over your career ;)
 
 
 
Recursive Subquery Factors...
Brendan, April     26, 2013 - 2:40 pm UTC
 
 
I had a look at that link to a Model (approximate) solution and you have to rewrite the query to change the number of buckets. I implemented a similar algorithm using the v11.2 feature Recursive Subquery Factoring in which I parametrise the number of buckets. It's not quite identical as it distributes the items in batches across the set of buckets, but gave the same answers on my test problems, which involved 10,001 items of random values.
Insert test data:
INSERT INTO items
SELECT 'item' || n, DBMS_Random.Value (0, 10000) FROM  
(SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <10002)
Set the variable for number of buckets:
VAR N_BUCKETS NUMBER
EXECUTE :N_BUCKETS := 3
Recursive Subquery Factor query:
WITH buckets AS (
       SELECT LEVEL bucket, :N_BUCKETS n_buckets FROM DUAL CONNECT BY LEVEL <= :N_BUCKETS
), items_desc AS (
       SELECT item_name, item_value, Row_Number () OVER (ORDER BY item_value DESC) rn
         FROM items
), rsf (bucket, item_name, item_value, bin_value, lev, bin_rank, n_buckets) AS (
SELECT b.bucket,
       i.item_name, 
       i.item_value, 
       i.item_value,
       1,
       b.n_buckets - i.rn + 1,
       b.n_buckets
  FROM buckets b
  JOIN items_desc i
    ON i.rn = b.bucket
 UNION ALL
SELECT r.bucket,
       i.item_name, 
       i.item_value, 
       r.bin_value + i.item_value,
       r.lev + 1,
       Row_Number () OVER (ORDER BY r.bin_value + i.item_value),
       r.n_buckets
  FROM rsf r
  JOIN items_desc i
    ON (i.rn - r.lev * r.n_buckets) = r.bin_rank
)
SELECT r.item_name,
       r.bucket, r.item_value, r.bin_value, r.lev, r.bin_rank
  FROM rsf r
 ORDER BY item_value DESC
Output extract for the linked Model for 3 buckets, and RSF for 3 and 20 buckets:
Model 3 buckets
NAME                                VALUE BUCKET_NAME         B1         B2         B3
------------------------------ ---------- ----------- ---------- ---------- ----------
item8937                             9999           1       9999
item4315                             9999           2       9999       9999
item3367                             9999           3       9999       9999       9999
...
item5523                                1           2   16734557   16734557   16734557
item145                                 1           1   16734558   16734557   16734557
item3670                                0           2   16734558   16734557   16734557
10001 rows selected.
Elapsed: 00:03:00.79
RSF 3 buckets
ITEM_NAME                          BUCKET ITEM_VALUE  BIN_VALUE        LEV   BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item3367                                3       9999       9999          1          1
item8937                                1       9999       9999          1          3
item4315                                2       9999       9999          1          2
...
item5523                                3          1   16734557       3333          2
item145                                 2          1   16734558       3334          2
item3670                                3          0   16734557       3334          1
10001 rows selected.
Elapsed: 00:02:37.65
RSF 20 buckets
ITEM_NAME                          BUCKET ITEM_VALUE  BIN_VALUE        LEV   BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item8937                                1       9999       9999          1         20
item3367                                3       9999       9999          1         18
item4315                                2       9999       9999          1         19
...
item5523                               20          1    2510177        500          1
item145                                19          1    2510178        500          3
item3670                               20          0    2510177        501          1
10001 rows selected.
Elapsed: 00:02:25.73
 
 
...and Plain Old SQL
Brendan, April     26, 2013 - 3:53 pm UTC
 
 
On further thought, you might consider a very simple solution where you just assign items sequentially to buckets in value order. This won't give quite as good an answer but often it might be good enough and it's much simpler and faster. Here are the results for 3 buckets, showing model and rsf giving the same very good results, with the simple solution a bit worse, but much quicker:
Model 3 buckets
NAME                                VALUE BUCKET_NAME         B1         B2         B3
------------------------------ ---------- ----------- ---------- ---------- ----------
item906                              9999           1       9999
item1343                             9998           2       9999       9998
item2819                             9995           3       9999       9998       9995
...
item3158                                7           3   16701010   16701010   16701010
item814                                 5           1   16701015   16701010   16701010
item5304                                4           2   16701015   16701014   16701010
10001 rows selected.
Elapsed: 00:03:01.91
RSF 3 buckets
ITEM_NAME                          BUCKET ITEM_VALUE  BIN_VALUE        LEV   BIN_RANK
------------------------------ ---------- ---------- ---------- ---------- ----------
item906                                 1       9999       9999          1          3
item1343                                2       9998       9998          1          2
item2819                                3       9995       9995          1          1
...
item3158                                1          7   16701010       3333          2
item814                                 2          5   16701015       3334          2
item5304                                1          4   16701014       3334          1
10001 rows selected.
Elapsed: 00:02:37.84
POS 3 buckets
ITEM_NAME                          BUCKET ITEM_VALUE BUCKET_TOTAL
------------------------------ ---------- ---------- ------------
item906                                 2       9999     16704330
item1343                                3       9998     16701019
item2819                                1       9995     16697690
...
item3158                                1          7     16697690
item814                                 2          5     16704330
item5304                                3          4     16701019
10001 rows selected.
Elapsed: 00:00:03.00
Here is the simple SQL:
WITH items_desc AS (
       SELECT item_name, item_value, Mod (Row_Number () OVER (ORDER BY item_value DESC), :N_BUCKETS) + 1 bucket
         FROM items
)
SELECT item_name, bucket, item_value, Sum (item_value) OVER (PARTITION BY bucket) bucket_total
  FROM items_desc
 ORDER BY item_value DESC 
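The round-robin idea in the simple SQL can be sketched in Python (illustrative only): sort the items by value descending and deal them out to the buckets like cards, bucket = rank mod N, which is the analog of Mod(Row_Number() OVER (ORDER BY item_value DESC), :N_BUCKETS) + 1.

```python
# Sketch of round-robin bucket assignment by descending value.
def round_robin(values, n_buckets):
    totals = [0.0] * n_buckets
    for rank, v in enumerate(sorted(values, reverse=True)):
        totals[rank % n_buckets] += v   # deal items out like cards
    return totals

# With values of similar magnitude the bucket totals end up close together.
print(round_robin([9999, 9998, 9995, 7, 5, 4], 3))
```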
 
Improved MODEL 
Brendan, April     28, 2013 - 2:18 pm UTC
 
 
And, for MODEL aficionados, here I've taken the same underlying algorithm as in the link above, but used MODEL in a slightly different way that avoids embedding the number of buckets in the query structure, and also avoids the expensive repeated summing. I use the first :N_BINS rows to store the running totals, rather than columns. It runs in about 50s on my PC compared with around 110s for both the original MODEL and my RSF (it's a different PC from above). As a bonus it's quite a lot simpler too.
SELECT item_name, bin, item_value, CASE WHEN rn_m <= :N_BINS THEN bin_value END bin_value
  FROM items
  MODEL 
    DIMENSION BY (Row_Number() OVER (ORDER BY item_value DESC) rn)
    MEASURES (item_name, 
              item_value,
              0 bin,
              item_value bin_value,
              Row_Number() OVER (ORDER BY item_value DESC) rn_m,
              0 min_bin,
              Count(*) OVER () - :N_BINS - 1 n_iters
    )
    RULES ITERATE(10000) UNTIL (ITERATION_NUMBER >= n_iters[1]) (
      bin[rn <= :N_BINS] = CV(rn),
      min_bin[1] = Min(rn_m) KEEP (DENSE_RANK FIRST ORDER BY bin_value)[rn <= :N_BINS],
      bin[ITERATION_NUMBER + :N_BINS + 1] = min_bin[1],
      bin_value[min_bin[1]] = bin_value[CV()] + item_value[ITERATION_NUMBER + :N_BINS + 1]
    )
 ORDER BY item_value DESC 
 
@Brendan re: Improved MODEL
Stew Ashton, April     28, 2013 - 7:57 pm UTC
 
 
Brendan,
Your last MODEL solution is brilliant. Studying it, I noticed that it doesn't quite work when the number of buckets is greater than or equal to the number of items. I suggest this slight adjustment to handle the edge case. It adds an nvl() near the end and filters out a possible extra row with a null item_name.
SELECT item_name, bin, item_value, CASE WHEN rn_m <= :N_BINS THEN bin_value END bin_value
FROM (
  select * from items
  MODEL 
    DIMENSION BY (Row_Number() OVER (ORDER BY item_value DESC) rn)
    MEASURES (item_name, 
              item_value,
              0 bin,
              item_value bin_value,
              Row_Number() OVER (ORDER BY item_value DESC) rn_m,
              0 min_bin,
              Count(*) OVER () - :N_BINS - 1 n_iters
    )
    RULES ITERATE(10000) UNTIL (ITERATION_NUMBER >= n_iters[1]) (
      bin[rn <= :N_BINS] = CV(rn),
      min_bin[1] =
        Min(rn_m) KEEP (DENSE_RANK FIRST ORDER BY bin_value)[rn <= :N_BINS],
      bin[ITERATION_NUMBER + :N_BINS + 1] = min_bin[1],
      bin_value[min_bin[1]] =
        bin_value[CV()] + nvl(item_value[ITERATION_NUMBER + :N_BINS + 1],0)
    )
)
where item_name is not null
ORDER BY item_value DESC 
 
@Brendan: another suggestion
Stew Ashton, April     28, 2013 - 8:53 pm UTC
 
 
 In the MEASURES clause, put
Row_Number() OVER (ORDER BY item_value DESC) bin,
instead of
0 bin,
then remove the first RULE. In my tests, it cuts elapsed time by about one third. 
 
 
@Stew: Thanks
Brendan, April     29, 2013 - 11:16 am UTC
 
 
Thanks, Stew
You are correct on both points. I hadn't tested the special case of more buckets than items, and I get a reduction from 68s to 39s by initialising bin to its desired value instead of setting it in the rules.
 
 
Bin fitting: PL/SQL seems better
Stew Ashton, April     29, 2013 - 12:16 pm UTC
 
 
 Further testing indicates that my second suggestion is no help. However, I did try a pipelined table function for comparison and it runs one or two orders of magnitude faster than the MODEL solution. I suspect it scales better too, assuming the MODEL solution keeps the whole table in memory.
> truncate table items
/
table ITEMS truncated.
> insert /*+ append */ into items
select level, dbms_random.value(0,10000) from dual connect by level <= 10002
/
10,002 rows inserted.
> commit
/
committed.
> create or replace package bin_fit as
cursor cur_out is select item_name, item_value, 0 num_bin from items;
type tt_out is table of cur_out%rowtype;
function do(p_numbins number) return tt_out pipelined;
end bin_fit;
/
PACKAGE BIN_FIT compiled
> create or replace package body bin_fit as
function do(p_numbins number) return tt_out pipelined is
  l_bins sys.odcinumberlist := sys.odcinumberlist();
  min_bin number := 1;
begin
  l_bins.extend(p_numbins);
  for i in 1..l_bins.count loop
    l_bins(i) := 0;
  end loop;
  for rec in (
    select item_name, item_value, 0 num_bin
    from items order by item_value desc
  ) loop
    l_bins(min_bin) := l_bins(min_bin) + rec.item_value;
    rec.num_bin := min_bin;
    pipe row (rec);
    for i in 1..l_bins.count loop
      if l_bins(i) < l_bins(min_bin) then
        min_bin := i;
      end if;
    end loop;
  end loop;
  return;
end do;
end bin_fit;
/
PACKAGE BODY BIN_FIT compiled
> select num_bin, sum(item_value) from (
  select * from table(bin_fit.do(20))
)
group by num_bin
order by num_bin
/
   NUM_BIN SUM(ITEM_VALUE)
---------- ---------------
         1     2506121.265 
         2     2506121.058 
         3     2506126.829 
         4     2506126.745 
         5     2506128.011 
         6     2506123.159 
         7     2506125.575 
         8      2506126.78 
         9     2506120.774 
        10     2506120.798 
        11     2506120.623 
        12      2506123.07 
        13     2506120.905 
        14       2506123.8 
        15     2506128.144 
        16     2506126.882 
        17     2506120.592 
        18     2506124.556 
        19      2506128.09 
        20     2506125.791 
 20 rows selected.
Elapsed: 00:00:00.130 
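For experimenting with the algorithm outside the database, here is the same greedy strategy sketched in Python (an illustration of the logic, not a translation of the PL/SQL): sort the items by value descending and drop each one into whichever bin currently holds the least.

```python
def bin_fit(items, n_bins):
    """Greedy bin fitting: largest items first, each into the lightest bin."""
    bins = [0.0] * n_bins              # running total per bin
    placement = []                     # (item_name, item_value, bin#) per item
    for name, value in sorted(items, key=lambda it: it[1], reverse=True):
        b = min(range(n_bins), key=bins.__getitem__)   # index of lightest bin
        bins[b] += value
        placement.append((name, value, b + 1))         # 1-based, like the PL/SQL
    return placement, bins
```

Sorting descending first means the small items placed last can only nudge the totals by small amounts, which is why the bin sums come out so close together in the run above.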
 
 
PL/SQL
Brendan, April     29, 2013 - 12:26 pm UTC
 
 
Yes, you'd expect PL/SQL to be much faster as it's an essentially procedural algorithm. 
 
multiple record matching using self join or analytics
santosh, June      12, 2013 - 3:16 am UTC
 
 
We have a requirement to find matching rows (matching on 3 or 4 columns) within the same table. A row may match one other row, one row may match many, and many may match many.
Do we have an efficient way to do this if the data in scope could be in the range of millions of rows? 
June      18, 2013 - 2:50 pm UTC 
 
the database is brutally efficient at doing joins.  Just join. 
 
 
Anand, June      20, 2013 - 11:47 am UTC
 
 
Hi Tom,
I have a table like below :
create table test (limit_id NUMBER,stream NUMBER)
/
insert into test values (123,1)
/
insert into test values (123,2)
/
insert into test values (456,3)
/
insert into test values (678,2)
/
Now I want to use an analytic function to get the limit_ids that have 2 different streams. 
June      20, 2013 - 2:52 pm UTC 
 
no analytics, just good old aggregates:
ops$tkyte%ORA11GR2> select limit_id from test group by limit_id having count(distinct stream) = 2;
  LIMIT_ID
----------
       123
 
 
 
Anand, June      20, 2013 - 4:06 pm UTC
 
 
Thanks tom.... 
 
nth_value 
Rajeshwaran, June      21, 2013 - 1:26 pm UTC
 
 
Tom:
What additional information do I need to provide to the nth_value function to get the output 2975 rather than 3000? I know I can easily achieve this using dense_rank, but I would like to know how it can be handled using nth_value.
rajesh@ORA11G> select empno,deptno,sal,
  2    nth_value(sal,2) over(partition by
  3    deptno order by sal desc) nth_sal
  4  from emp
  5  where deptno = 20
  6  order by deptno,sal desc ;
     EMPNO     DEPTNO        SAL    NTH_SAL
---------- ---------- ---------- ----------
      7788         20       3000       3000
      7902         20       3000       3000
      7566         20       2975       3000
      7876         20       1100       3000
      7369         20        800       3000   
July      01, 2013 - 4:50 pm UTC 
 
nth value gives you the nth value in a window - that is 3,000 in this case.  that is simply what this function does.  It isn't looking for the nth DISTINCT value, just the nth value after ordering and partitioning.
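To make the distinction concrete (Python used just for illustration): the 2nd row after ordering is 3000 because of the tie, while the 2nd *distinct* value is 2975.

```python
sals = [3000, 3000, 2975, 1100, 800]   # deptno 20, ordered by sal desc

nth_row_value = sals[1]                              # nth_value(sal, 2): 2nd row, ties included -> 3000
nth_distinct = sorted(set(sals), reverse=True)[1]    # dense_rank-style 2nd distinct value -> 2975
```

So to get 2975 you need a function that ranks distinct values (dense_rank), not one that counts rows.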
 
 
 
Need a help on formulating report query
kumar, July      10, 2013 - 8:01 am UTC
 
 
Hi Tom,
I need to generate a weekly (as of Monday 00:01 AM) and monthly (as of the 1st of the month)
report for the following set of data, which is a modification history of a product.
I need to produce a count of products for each latest status (max of seq_id within each prod_id) as of every week or month.
Let's take an example for the weekly report:
I need to generate a report for a given period - start date: 27-May-2013; end date: 09-Jul-2013.
The report is generated as of each week of the year (a week starts Monday 00:01 am).
The week numbers for the given dates are 22/2013 (Mon 27-05-2013 00:01 am), 23/2013, ... 28/2013.
Here is the query for each weekly report, but I would like your help in combining them all into a single query.
CREATE TABLE PRODUCT_TEST
(
 PROD_SEQ_ID NUMBER PRIMARY KEY,
 PROD_ID NUMBER,
 PRD_CHNG_DTTM DATE,  
 PROD_STTS VARCHAR2(25)
);
Insert into PRODUCT_TEST values (1, 101, to_date('27-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (2, 101, to_date('28-MAY-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (3, 101, to_date('30-MAY-2013','DD-MON-YYYY'),'Error');
Insert into PRODUCT_TEST values (4, 102, to_date('31-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (5, 102, to_date('04-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (6, 102, to_date('08-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (7, 103, to_date('12-JUN-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (8, 103, to_date('19-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (9, 103, to_date('26-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (10, 104, to_date('10-JUL-2013','DD-MON-YYYY'),'New');
COMMIT;
First we need to find all the week numbers, i.e., Mondays at 00:01 AM, so between 27th May and 9th Jul
we will have the weeks
22/2013(date will be 27/05/2013 0:01 am)
23/2013(date will be 03/06/2013 0:01 am)
24/2013(date will be 10/06/2013 0:01 am) and so on....till end date
For weekly report
For week 22/2013=======PROD_CHNG_DTTM < 27-May-2013 00:01==========
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
                       THEN 1
                   END ) cnt_new,
       COUNT( CASE WHEN PROD_STTS = 'Open'
                       THEN 1
                   END ) cnt_open,
       COUNT( CASE WHEN PROD_STTS = 'Error'
                       THEN 1
                   END ) cnt_error,
       COUNT( CASE WHEN PROD_STTS = 'Closed'
                       THEN 1
                   END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
For week 23/2013========PROD_CHNG_DTTM < 03-Jun-2013 00:01==========
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
                       THEN 1
                   END ) cnt_new,
       COUNT( CASE WHEN PROD_STTS = 'Open'
                       THEN 1
                   END ) cnt_open,
       COUNT( CASE WHEN PROD_STTS = 'Error'
                       THEN 1
                   END ) cnt_error,
       COUNT( CASE WHEN PROD_STTS = 'Closed'
                       THEN 1
                   END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('03-Jun-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
............. and so on for every week till the end date, producing a
weekly report like:
For the period 27th May to 9th Jul --
As of       New_Count  Open_Count  Error_Count  Closed_Count
2013 FW22           1           0            0             0
2013 FW23           1           0            1             0
2013 FW24           0           0            1             1
.......................... and so on
Thanks in advance..
 
 
Can you please help me formulate the above query?
kumar, July      13, 2013 - 2:55 pm UTC
 
 
Need your help Tom.... 
 
help needed
kumar, July      16, 2013 - 5:11 pm UTC
 
 
Hi Tom,
please help me formulate the above requirement by combining all the queries into one....
Anxiously waiting for your response.. 
 
Need your help on reporting query.
kumar, July      17, 2013 - 10:58 am UTC
 
 
Hi Tom,
I need to prepare a monthly, weekly and yearly count report from a modification history based on status.
CREATE TABLE PRODUCT_TEST
(
 PROD_SEQ_ID NUMBER PRIMARY KEY,
 PROD_ID NUMBER,
 PRD_CHNG_DTTM DATE,  
 PROD_STTS VARCHAR2(25)
);
Insert into PRODUCT_TEST values (1, 101, to_date('27-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (2, 101, to_date('28-MAY-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (3, 101, to_date('30-MAY-2013','DD-MON-YYYY'),'Error');
Insert into PRODUCT_TEST values (4, 102, to_date('31-MAY-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (5, 102, to_date('04-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (6, 102, to_date('08-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (7, 103, to_date('12-JUN-2013','DD-MON-YYYY'),'New');
Insert into PRODUCT_TEST values (8, 103, to_date('19-JUN-2013','DD-MON-YYYY'),'Open');
Insert into PRODUCT_TEST values (9, 103, to_date('26-JUN-2013','DD-MON-YYYY'),'Closed');
Insert into PRODUCT_TEST values (10, 104, to_date('10-JUL-2013','DD-MON-YYYY'),'New');
COMMIT;
If the user wants the report from 26-May-2013 to 09-Jul-2013,
let's just take a weekly report example:
First we need to find all the week numbers, i.e., all the Mondays at 00:01 AM, so for the 26th May to 9th Jul range we will have the weeks
21/2013 (date will be 20/05/2013 0:01 am)
22/2013 (date will be 27/05/2013 0:01 am)
......
28/2013 (date will be 08/07/2013 0:01 am)
Now I have to produce a trend report which will have records as of each week:
For 21/2013 my query will have PROD_CHNG_DTTM < 20-May-2013 00:01, i.e., take all records from the beginning till 20/05/13
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
                       THEN 1
                   END ) cnt_new,
       COUNT( CASE WHEN PROD_STTS = 'Open'
                       THEN 1
                   END ) cnt_open,
       COUNT( CASE WHEN PROD_STTS = 'Error'
                       THEN 1
                   END ) cnt_error,
       COUNT( CASE WHEN PROD_STTS = 'Closed'
                       THEN 1
                   END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
For 22/2013 my query will have PROD_CHNG_DTTM < 27-May-2013 00:01 i.e., take all records from beginning till 27/05/13
SELECT COUNT( CASE WHEN PROD_STTS = 'New'
                       THEN 1
                   END ) cnt_new,
       COUNT( CASE WHEN PROD_STTS = 'Open'
                       THEN 1
                   END ) cnt_open,
       COUNT( CASE WHEN PROD_STTS = 'Error'
                       THEN 1
                   END ) cnt_error,
       COUNT( CASE WHEN PROD_STTS = 'Closed'
                       THEN 1
                   END ) cnt_closed
FROM
(
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
)
WHERE rn = 1;
One way is to write queries for all the weeks in the user's requested date range and combine them with UNION ALL, but I think that will be the slowest option if the user asks for a date range covering over 100 weeks.
I am sure you must have a faster solution for combining all the queries into one (I mean the week-generating query plus all the report queries for the latest record as of each varying to-date).
Hope I have made myself clear.
Thanks a lot in advance
 
July      17, 2013 - 6:08 pm UTC 
 
I don't really have anything to add - this is not really a followup question - maybe when I have time to take a new question.
since you seem to use 
SELECT PROD_SEQ_ID, PROD_ID, PROD_STTS
,ROW_NUMBER() OVER (PARTITION BY PROD_ID ORDER BY PROD_SEQ_ID DESC) rn
FROM PRODUCT_TEST
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
multiple times, perhaps in your union all you can
with data
as
( that select )
select .. from data ..
union all
select ... from data. ...
union all
select ... from data ..
we'll build the result once (data) and use it over and over. 
 
 
Thanks a lot TOM
kumar, July      30, 2013 - 9:12 am UTC
 
 
Thank you very much Tom.
For the time being I have built my query as per your suggestion, but I would still like to see some good or flexible options from your pen, just to learn and to enhance my analytic ability.
Maybe I will try to put this into the new-question section whenever I can see the new question button...... 
 
@kumar on reporting query
Stew Ashton, July      31, 2013 - 1:02 pm UTC
 
 
 Hi kumar,
You may not ever read this, but I think you misled Tom. You repeated
WHERE PRD_CHNG_DTTM < to_date('27-May-2013 00:01', 'DD-MON-YYYY hh24:mi')
when you said you needed two different dates.
I think you want a weekly summary of product status. For that, you need:
- the "as of" dates you want the summary for
- the range of dates the status applies to
- join status to "as of" dates based on the date range
- then group by "as of" date.
with as_of_dates as (
  select min_as_of_date + level*7 as_of_date
  from (
    select trunc(min(prd_chng_dttm)-1/24/60/60, 'IW') min_as_of_date
    from product_test
  )
  connect by min_as_of_date + level*7 <= sysdate
), data as (
  select prd_chng_dttm,
  lead(prd_chng_dttm-1/24/60/60,1,sysdate)
    over(partition by prod_id order by prd_chng_dttm) prd_chng_dttm_end,
  CASE WHEN PROD_STTS = 'New' THEN 1 else 0 END cnt_new,
  CASE WHEN PROD_STTS = 'Open' THEN 1 else 0 END cnt_Open,
  CASE WHEN PROD_STTS = 'Error' THEN 1 else 0 END cnt_Error,
  CASE WHEN PROD_STTS = 'Closed' THEN 1 else 0 END cnt_Closed
  from product_test
)
select as_of_date, sum(cnt_new) cnt_new, sum(cnt_open) cnt_open,
sum(cnt_error) cnt_error, sum(cnt_closed) cnt_closed
from data a, as_of_dates b
where b.as_of_date between a.prd_chng_dttm and a.prd_chng_dttm_end
group by as_of_date
order by as_of_date;
AS_OF_DATE              CNT_NEW   CNT_OPEN  CNT_ERROR CNT_CLOSED
-------------------- ---------- ---------- ---------- ----------
27-MAY-13 00.00.00            1          0          0          0 
03-JUN-13 00.00.00            1          0          1          0 
10-JUN-13 00.00.00            0          0          1          1 
17-JUN-13 00.00.00            1          0          1          1 
24-JUN-13 00.00.00            0          1          1          1 
01-JUL-13 00.00.00            0          0          1          2 
08-JUL-13 00.00.00            0          0          1          2 
15-JUL-13 00.00.00            1          0          1          2 
22-JUL-13 00.00.00            1          0          1          2 
29-JUL-13 00.00.00            1          0          1          2  
 
 
using partition
A, August    01, 2013 - 6:57 pm UTC
 
 
Hello Tom,
I was wondering: can we use analytics to display "Multiple" for Col3 when its value varies within a group?
If the data is like this,
Col1 Col2 Col3
111  AA   NY
222  BB   LON
111  AA   PAR
333  AA   MUM
111  AA   MUM
222  BB   LON
444  XX   LON
We want the output like this:
Col1 Col2 Col3
111  AA   Multiple
222  BB   Multiple
333  AA   MUM
444  XX   LON
If there are multiple values of Col3 for a combination of Col1 and Col2, then Col3 should display "Multiple"; otherwise it should show the value.
Thanks 
August    02, 2013 - 7:26 pm UTC 
 
ops$tkyte%ORA11GR2> with data
  2  as
  3  (select 111 c1, 'aa' c2, 'NY' c3 from dual union all
  4   select 111 c1, 'aa' c2, 'XX' c3 from dual union all
  5   select 222 c1, 'aa' c2, 'YY' c3 from dual union all
  6   select 222 c1, 'aa' c2, 'YY' c3 from dual
  7  )
  8  select c1, c2, case when count(distinct c3) > 1 then 'multiple' else max(c3) end c3
  9    from data
 10   group by c1, c2
 11   order by c1, c2
 12  /
        C1 C2 C3
---------- -- --------
       111 aa multiple
       222 aa YY
 
 
 
 
Thanks
A, August    05, 2013 - 7:14 am UTC
 
 
Hello Tom,
Thanks for the reply. This looks hard-coded to only 2 distinct rows. There are more than 500 rows in the table, in any combination of these,
something like ...
Col1 Col2 Col3
111  AA   NY
222  BB   LON
111  AA   PAR
333  AA   MUM
111  AA   MUM
222  BB   LON
444  XX   LON
100  AQ   JJ
101  BB   PAR
...
...
...
How to write a query for this?
  
August    08, 2013 - 4:26 pm UTC 
 
there is no hard coded limit?  what are you talking about?
my query works for any number of rows. 
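To see that the aggregate approach is not limited to any row count, here is the count-distinct rule applied to the original 7-row example, sketched in Python for illustration. (Note that under this rule 222/BB yields 'LON', not 'Multiple', because its two rows carry the same value.)

```python
from collections import defaultdict

rows = [(111, 'AA', 'NY'),  (222, 'BB', 'LON'), (111, 'AA', 'PAR'),
        (333, 'AA', 'MUM'), (111, 'AA', 'MUM'), (222, 'BB', 'LON'),
        (444, 'XX', 'LON')]

# collect the distinct Col3 values per (Col1, Col2) group
values = defaultdict(set)
for c1, c2, c3 in rows:
    values[(c1, c2)].add(c3)

# more than one distinct Col3 in the group -> 'Multiple', else the single value
result = {k: 'Multiple' if len(v) > 1 else next(iter(v))
          for k, v in values.items()}
```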
 
 
Response
Raj B, August    05, 2013 - 12:46 pm UTC
 
 
Tom has already answered the query! All you need to do is just try it. 
SQL> select * from t3;
      COL1 COL COL
---------- --- ---
       222 BB  ZZZ
       345 MM  YYY
       111 AA  CAL
       222 BB  IAD
       222 BB  BLT
       111 AA  LON
       123 AA  MUM
7 rows selected.
SQL> select col1,col2, case when count(distinct col3) > 1 then 'MUL' else max(col3) end col3
    from t3
    group by col1, col2 order by col1, col2;
      COL1 COL COL
---------- --- ---
       111 AA  MUL
       123 AA  MUM
       222 BB  MUL
       345 MM  YYY
 
 
 
tough query
Bhavesh Ghodasara, August    21, 2013 - 1:27 pm UTC
 
 
Hi Tom,
I was a regular at your blog 7-8 years ago, and I learned a lot from you at the beginning of my career; it helped a lot. Ironically, I became a Teradata DBA :).. Thanks for all your help and wonderful solutions.
So I am back with a query again. 
create table t
(
Artcl integer,
store integer,
yr         integer,
week  integer,
Inventory integer);
insert into t values( 1,1,2011,50,2);
insert into t values( 1,1,2011,51,-5);
insert into t values( 1,1,2011,52,4);
insert into t values( 1,1,2012,1,-5);
insert into t values( 1,1,2012,4,3);
insert into t values( 1,1,2012,7,2);
insert into t values( 1,1,2012,8,4);
insert into t values( 1,1,2012,10,-2);
insert into t values( 1,1,2012,15,7);
insert into t values( 2,1,2011,52,-5);
insert into t values( 2,1,2012,4,2);
insert into t values( 2,1,2012,6,-1);
insert into t values( 2,1,2012,50,5);
insert into t values( 2,1,2012,52,2);
insert into t values( 2,1,2013,1,-2);
insert into t values( 2,1,2013,2,-7);
insert into t values( 2,1,2013,3,-2);
insert into t values( 2,1,2013,4,4);
insert into t values( 2,1,2013,5,-2);
insert into t values( 3,2,2012,6,1);
insert into t values( 3,2,2012,50,2);
insert into t values( 3,2,2012,52,3);
insert into t values( 3,2,2013,1,-2);
insert into t values( 3,2,2013,2,-1);
insert into t values( 3,2,2013,3,-1);
insert into t values( 3,2,2013,4,5);
insert into t values( 3,2,2013,5,-2);
Now the data will look like this (except the last column, which is what I need):
Artcl store year week  Inventory Rolling total
1 1 2011 50 2 2
1 1 2011 51 -5 0
1 1 2011 52 4 4
1 1 2012 1 -5 0
1 1 2012 4 3 3
1 1 2012 7 2 5
1 1 2012 8 4 9
1 1 2012 10 -2 7
1 1 2012 15 7 14
2 1 2011 52 -5 0
2 1 2012 4 2 2
2 1 2012 6 -1 1
2 1 2012 50 5 6
2 1 2012 52 2 8
2 1 2013 1 -2 6
2 1 2013 2 -7 0
2 1 2013 3 -2 0
2 1 2013 4 4 4
2 1 2013 5 -2 2
3 2 2012 6 1 1
3 2 2012 50 2 3
3 2 2012 52 3 6
3 2 2013 1 -2 4
3 2 2013 2 -1 3
3 2 2013 3 -1 2
3 2 2013 4 5 7
3 2 2013 5 -2 5
I need the rolling total column as in the report above.
1) partition by artcl, store, year , week
2) rolling total of inventory with the following condition:
   a) if the total becomes 0 or negative, we have to treat it as 0 and start all over.
I tried hard to get a solution but to no avail. I am even starting to think it is probably not possible
with plain SQL and will need a function/procedure.
Please help. Left with no other option, I came back to my childhood hero for help :) 
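The requested rule is an order-dependent running computation - the total is clamped at zero and restarts - which is why a plain windowed SUM cannot express it directly. The per-partition logic, sketched in Python for illustration:

```python
def resettable_totals(values):
    """Running total that restarts at 0 whenever it would drop to 0 or below."""
    total, out = 0, []
    for v in values:
        total = max(total + v, 0)   # clamp: 0-or-negative resets the running sum
        out.append(total)
    return out
```

Applied to article 1's inventory values in week order, this reproduces the expected rolling-total column.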
 
String Diff
TCS, September 17, 2013 - 10:44 am UTC
 
 
I want to do a string diff between two strings. For example, if I have two input strings, say ABCDCAFGV and ABDCAF, then my output will be: CGV.
Basically I want to do a positional match between the two strings and output the difference.
Can it be done through an Oracle SQL query only, or do we need to write a UDF for that? 
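A longest-match alignment is the natural way to compute this kind of positional diff, and it is procedural by nature, so a UDF is the practical route in Oracle. A sketch of the idea using Python's difflib, for illustration only (a PL/SQL function would follow the same alignment logic):

```python
import difflib

def string_diff(a, b):
    """Characters of a that do not line up with b under a longest-match alignment."""
    sm = difflib.SequenceMatcher(None, a, b)
    return ''.join(a[i1:i2]
                   for op, i1, i2, j1, j2 in sm.get_opcodes()
                   if op in ('delete', 'replace'))
```

For the example strings this aligns AB and DCAF, leaving C, G and V as the difference.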
 
reply to Bhavesh's question
Ranjan., September 19, 2013 - 5:19 pm UTC
 
 
Hi Bhavesh,
For your scenario, here is code that will produce the desired result.
################################
CREATE TYPE typ_obj_breaksum AS OBJECT
    (
      sum    NUMBER,
  
      STATIC FUNCTION ODCIAggregateInitialize (
                      sctx IN OUT typ_obj_breaksum
                      ) RETURN NUMBER,
  
      MEMBER FUNCTION ODCIAggregateIterate (
                     self  IN OUT typ_obj_breaksum,
                     value IN     NUMBER
                     ) RETURN NUMBER,
 
     MEMBER FUNCTION ODCIAggregateTerminate (
                     self   IN  typ_obj_breaksum,
                     retval OUT NUMBER,
                     flags  IN  NUMBER
                     ) RETURN NUMBER,
 
     MEMBER FUNCTION ODCIAggregateMerge (
                     self IN OUT typ_obj_breaksum,
                     ctx2 IN     typ_obj_breaksum
                     ) RETURN NUMBER
   );
   
/ 
CREATE TYPE BODY typ_obj_breaksum IS
  
       STATIC FUNCTION ODCIAggregateInitialize (
                       sctx IN OUT typ_obj_breaksum
                      ) RETURN NUMBER IS
      BEGIN
         sctx := typ_obj_breaksum(0);
         RETURN ODCIConst.Success;
       End;
      MEMBER FUNCTION ODCIAggregateIterate (
                      self  IN OUT typ_obj_breaksum,
                      value IN     NUMBER
                      ) RETURN NUMBER IS
      BEGIN
        self.sum := CASE
                       WHEN value >= 0
                       OR  (value < 0 AND self.sum + value > 0)
                       THEN self.sum + value
                       ELSE 0
                    END;
        Return Odciconst.Success;
      END;
     MEMBER FUNCTION ODCIAggregateTerminate (
                     self   IN  typ_obj_breaksum,
                      retval OUT NUMBER,
                      flags  IN  NUMBER
                      ) RETURN NUMBER IS
      BEGIN
        retval := self.sum;
        RETURN ODCIConst.Success;
      End;
 
      MEMBER FUNCTION ODCIAggregateMerge (
                      self IN OUT typ_obj_breaksum,
                      ctx2 IN     typ_obj_breaksum
                      ) RETURN NUMBER IS
      BEGIN
        self.sum := CASE
                       WHEN self.sum + ctx2.sum > 0
                       THEN self.sum + ctx2.sum
                       ELSE 0
                    END;
        RETURN ODCIConst.Success;
      END;
 
   End;
   
/ 
CREATE FUNCTION breaksum (input NUMBER) RETURN NUMBER
       PARALLEL_ENABLE
       AGGREGATE USING typ_obj_breaksum;
    
/ 
SELECT artcl , store,yr,week,inventory, BREAKSUM (inventory) over(partition by artcl order by rownum) AS rolling_total
    From   T
    ; 
And here is the output:
##################################
Artcl    store    year    week     Inventory    Rolling total
1    1    2011    50    2    2
1    1    2011    51    -5    0
1    1    2011    52    4    4
1    1    2012    1    -5    0
1    1    2012    4    3    3
1    1    2012    7    2    5
1    1    2012    8    4    9
1    1    2012    10    -2    7
1    1    2012    15    7    14
2    1    2011    52    -5    0
2    1    2012    4    2    2
2    1    2012    6    -1    1
2    1    2012    50    5    6
2    1    2012    52    2    8
2    1    2013    1    -2    6
2    1    2013    2    -7    0
2    1    2013    3    -2    0
2    1    2013    4    4    4
2    1    2013    5    -2    2
3    2    2012    6    1    1
3    2    2012    50    2    3
3    2    2012    52    3    6
3    2    2013    1    -2    4
3    2    2013    2    -1    3
3    2    2013    3    -1    2
3    2    2013    4    5    7
3    2    2013    5    -2    5
############################
Hope that helps you, and now you can create your own user-defined aggregate functions. :)
 
 
cont to my above post for  Bhavesh
Ranjan., September 19, 2013 - 5:53 pm UTC
 
 
Hi Bhavesh,
Looking at your expected result, your data needs to be partitioned by artcl, store and ordered by yr, week,
because for "2    1    2013    1    -2    6"
the 6 came from the year 2012 balance, which is 8, plus the "-2" of the year 2013 row.
So the query should be as below.
################
SELECT artcl , store,yr,week,inventory, BREAKSUM (inventory) over(partition by artcl,store order by yr,week) AS rolling_total
    From   T ; 
########################
Enjoy :). 
 
thanks
Bhavesh Ghodasara, September 24, 2013 - 11:50 am UTC
 
 
Thank you so much, Ranjan. 
I really appreciate your effort.
I would like to know whether it is possible to get this particular output with just a query; I don't want to use any function.
 
 
cont to my last post for @Bhavesh
Ranjan, September 26, 2013 - 8:43 pm UTC
 
 
Hi Bhavesh,
For your requirement, it is not that easy to produce the desired report directly with a SELECT query.
A while back I read about "creating our own aggregate function" on the site below.
http://www.oracle-developer.net/display.php?id=215 I suggest you spend 10 minutes reading that page and
you will see why it is not that easy :).
I only adapted the logic from that site in my last post.
Regards,
Ranjan. 
 
Rich, October   29, 2013 - 3:26 pm UTC
 
 
Tom,
Thanks for providing the "Old Carry Forward Trick." It worked well for me!
"One of my (current) favorite analytic tricks -- the old "carry forward".  We mark rows such that 
the preceding row was different -- subsequent dup rows would have NULLS there for grp.  
Then, we use max(grp) to "carry" that number down....
Now we have something to group by -- we've divided the rows up into groups we can deal with."
 
 
selecting selective subset of rows
Ravi B, April     05, 2014 - 12:00 am UTC
 
 
Hi Tom,
Could you please help me with this problem.
CREATE TABLE TEST_VERSION(VERSION_ID NUMBER,SUBVERSION VARCHAR2(100));
insert into TEST_VERSION VALUES(1,null);
insert into TEST_VERSION VALUES(1,'Alpha');
insert into TEST_VERSION VALUES(1,'Internal Beta');
insert into TEST_VERSION VALUES(1,'Technical Preview');
insert into TEST_VERSION VALUES(1,'RTM');
insert into TEST_VERSION VALUES(2,null);
insert into TEST_VERSION VALUES(2,'Alpha');
insert into TEST_VERSION VALUES(3,'Internal Beta');
insert into TEST_VERSION VALUES(4,'Technical Preview');
SELECT * FROM TEST_VERSION;
select rules:
for given version_id:
1) if there is a SUBVERSION='RTM' row in the set and also a row with SUBVERSION = NULL, ignore the row with SUBVERSION = NULL and select all the other rows, including RTM
2) if there is a row with SUBVERSION = NULL and there are no rows with SUBVERSION='RTM', then pick all the rows, including the SUBVERSION = NULL row
In essence, a row with SUBVERSION = NULL cannot co-exist with SUBVERSION='RTM' for the same version_id. 
April     16, 2014 - 5:43 pm UTC 
 
I think #2 should be phrased as:
2) if there are no rows with RTM, select them all.
ops$tkyte%ORA11GR2> select *
  2    from (
  3  select version_id, subversion,
  4         max( case when subversion = 'RTM' then 1 else 0 end ) over (partition by version_id ) has_rtm
  5    from test_version
  6         )
  7   where (has_rtm = 1 and subversion is not null)
  8      or (has_rtm = 0)
  9   order by version_id, subversion
 10  /
VERSION_ID SUBVERSION              HAS_RTM
---------- -------------------- ----------
         1 Alpha                         1
         1 Internal Beta                 1
         1 RTM                           1
         1 Technical Preview             1
         2 Alpha                         0
         2                               0
         3 Internal Beta                 0
         4 Technical Preview             0
8 rows selected.
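The query above is a two-pass idea: compute a group-level flag with a window function, then filter on it. The same logic sketched in Python, for illustration:

```python
rows = [(1, None), (1, 'Alpha'), (1, 'Internal Beta'), (1, 'Technical Preview'),
        (1, 'RTM'), (2, None), (2, 'Alpha'), (3, 'Internal Beta'),
        (4, 'Technical Preview')]

# pass 1: which version_ids contain an RTM row (the MAX(CASE...) OVER (...) flag)
has_rtm = {vid for vid, sub in rows if sub == 'RTM'}

# pass 2: drop the NULL-subversion row only where RTM exists in the same group
kept = [(vid, sub) for vid, sub in rows if not (vid in has_rtm and sub is None)]
```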
 
 
 
selecting selective subset of rows
A reader, April     17, 2014 - 3:51 am UTC
 
 
Brilliant!! 
 
alternatively
Igor, April     17, 2014 - 4:02 pm UTC
 
 
SQL> select * from test_version
  2  minus
  3  select version_id, decode(subversion,'RTM',null) from test_version where subversion='RTM'
  4  order by 1,2;
VERSION_ID SUBVERSION
---------- -----------------
         1 Alpha
         1 Internal Beta
         1 RTM
         1 Technical Preview
         2 Alpha
         2
         3 Internal Beta
         4 Technical Preview
8 rows selected. 
 
April     17, 2014 - 5:40 pm UTC 
 
depends, we know nothing about their primary keys, unique constraints and so on.  don't forget that MINUS adds a "distinct" to the query.  That means that in general, you cannot substitute MINUS for what I did above.
just re-run the inserts and:
ops$tkyte%ORA11GR2> select *
  2    from (
  3  select version_id, subversion,
  4         max( case when subversion = 'RTM' then 1 else 0 end ) over (partition by version_id ) has_rtm
  5    from test_version
  6         )
  7   where (has_rtm = 1 and subversion is not null)
  8      or (has_rtm = 0)
  9   order by version_id, subversion
 10  /
VERSION_ID SUBVERSION              HAS_RTM
---------- -------------------- ----------
         1 Alpha                         1
         1 Alpha                         1
         1 Internal Beta                 1
         1 Internal Beta                 1
         1 RTM                           1
         1 RTM                           1
         1 Technical Preview             1
         1 Technical Preview             1
         2 Alpha                         0
         2 Alpha                         0
         2                               0
         2                               0
         3 Internal Beta                 0
         3 Internal Beta                 0
         4 Technical Preview             0
         4 Technical Preview             0
16 rows selected.
ops$tkyte%ORA11GR2> 
ops$tkyte%ORA11GR2> select * from test_version
  2  minus
  3  select version_id, decode(subversion,'RTM',null) from test_version where subversion='RTM'
  4  order by 1,2;
VERSION_ID SUBVERSION
---------- --------------------
         1 Alpha
         1 Internal Beta
         1 RTM
         1 Technical Preview
         2 Alpha
         2
         3 Internal Beta
         4 Technical Preview
8 rows selected.
and you don't need your decode, just select version_id, null - since you are only getting RTM subversions - the decode is an extra step you would not need.  
a non-analytic query that would work could be:
ops$tkyte%ORA11GR2> select *
  2    from test_version a
  3   where not exists (select null
  4                       from test_version b
  5                      where a.version_id = b.version_id
  6                        and b.subversion = 'RTM'
  7                        and a.subversion is null)
  8   order by 1, 2
  9  /
but be careful trying to substitute MINUS, UNION, UNION ALL, INTERSECT, etc - they have "set based side effects" that can and will change the answer you receive - unless there are constraints in place that tell you "that side effect cannot happen"
but yes, there are a huge number of equivalent ways to get this answer... 
 
 
Igor, April     17, 2014 - 9:03 pm UTC
 
 
Tom, you are right, decode was not needed in my query, thanks for pointing at this. And yes, these set operators (union, minus etc.) may produce different results, depending on the data (for example the duplicate records will be lost). Thanks! 
April     17, 2014 - 10:52 pm UTC 
 
thanks for the ideas though :) it is always good to see alternatives... 
 
 
A reader, August    11, 2015 - 7:51 pm UTC
 
 
In this example, is there a way to get the values without doing the join twice? I need to get the values from table N where the MAX date (column d) is less than or equal to the date in table G.
create table g(a int, b varchar2(2), d date); -- a, b and d are together unique
insert into g values (1, 'ER', to_date('01012005','mmddyyyy'));
insert into g values (1, 'ER', to_date('06012005','mmddyyyy'));
insert into g values (1, 'ER', to_date('01012015','mmddyyyy'));
insert into g values (2, 'PR', to_date('01012005','mmddyyyy'));
insert into g values (2, 'PR', to_date('01012015','mmddyyyy'));
insert into g values (2, 'PR', to_date('06012015','mmddyyyy'));
insert into g values (3, 'FR', to_date('06012015','mmddyyyy'));
create table n(a int, b varchar2(2), d date, e varchar2(3), f varchar2(3)); -- a, b and d are unique
insert into n values (1, 'ER', to_date('01012005','mmddyyyy'), 'FSC', 'COL');
insert into n values (1, 'ER', to_date('01012014','mmddyyyy'), 'FOO', 'CLL');
insert into n values (1, 'PR', to_date('01012014','mmddyyyy'), 'FOI', 'COL');
insert into n values (1, 'PR', to_date('06012014','mmddyyyy'), 'FIT', 'CLL');
select g.a, g.b, g.d,
       (select n.e from n
         where n.a = g.a and n.b = g.b
           and n.d = (select max(n2.d) from n n2
                       where n2.a = g.a and n2.b = g.b and n2.d <= g.d)) e,
       (select n.f from n
         where n.a = g.a and n.b = g.b
           and n.d = (select max(n2.d) from n n2
                       where n2.a = g.a and n2.b = g.b and n2.d <= g.d)) f
  from g;
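One way to avoid repeating the correlated lookup (a sketch, not an answer from the thread; SQLite via Python stands in for Oracle, and the data is trimmed to two rows per table) is to join g to n once, rank each g row's candidate n rows with ROW_NUMBER(), and keep only the newest match. The same ROW_NUMBER() pattern applies in Oracle with real DATEs.

```python
import sqlite3

# Sketch: fetch e and f with a single join instead of four correlated
# subqueries. Rank each g row's candidate n rows (n.d <= g.d) by date
# descending and keep only the newest one. Dates are ISO strings here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table g(a int, b text, d text);
create table n(a int, b text, d text, e text, f text);
insert into g values (1, 'ER', '2005-01-01'), (1, 'ER', '2015-01-01');
insert into n values (1, 'ER', '2005-01-01', 'FSC', 'COL'),
                     (1, 'ER', '2014-01-01', 'FOO', 'CLL');
""")
rows = conn.execute("""
select a, b, d, e, f from (
  select g.a, g.b, g.d, n.e, n.f,
         row_number() over (partition by g.a, g.b, g.d
                            order by n.d desc) rn
  from g join n on n.a = g.a and n.b = g.b and n.d <= g.d
)
where rn = 1
order by d
""").fetchall()
for r in rows:
    print(r)
# (1, 'ER', '2005-01-01', 'FSC', 'COL')
# (1, 'ER', '2015-01-01', 'FOO', 'CLL')
```

Note the inner join drops g rows with no matching n row at all; an outer join with a lateral/apply construct would be needed to keep them.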
 
Analytics based on Logical offset
Rajeshwaran, Jeyabal, July      13, 2016 - 11:24 am UTC
 
 
set feedback off
drop table t purge;
create table t(
HYB_PROJECT_KEY    NUMBER,
SAMP_CNFG_KEY      NUMBER,
SAMPLE_RESULT_KEY  NUMBER,
RESULT_SOURCE      VARCHAR2(1),
RESULT_SOURCE_CDC1 VARCHAR2(1),
RESULT_DATE        DATE,
RESULT_VALUE_TYPE  VARCHAR2(10),
RESULT_VALUE       NUMBER,
IS_ACTIVE          NUMBER);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'A','A',to_date('09-JUN-2014','DD-MON-YYYY'),'TEST',7.2,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'A','A',to_date('14-JUN-2014','DD-MON-YYYY'),'TEST',8,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('01-JUN-2014','DD-MON-YYYY'),'TEST',4,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('03-JUN-2014','DD-MON-YYYY'),'TEST',5,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('06-JUN-2014','DD-MON-YYYY'),'TEST',4,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('07-JUN-2014','DD-MON-YYYY'),'TEST',3,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('15-JUN-2014','DD-MON-YYYY'),'TEST',10,1);
Insert into T (HYB_PROJECT_KEY,SAMP_CNFG_KEY,SAMPLE_RESULT_KEY,RESULT_SOURCE,RESULT_SOURCE_CDC1,RESULT_DATE,RESULT_VALUE_TYPE,RESULT_VALUE,IS_ACTIVE) 
 values (84,114,1365970,'H','H',to_date('17-JUN-2014','DD-MON-YYYY'),'TEST',0,1);
commit;
set feedback on

Team,
1) for each sample_result_key and result_date, retain only the minimum non-zero result_value.
2) for each sample_result_key, if the maximum result_date has result_value = 0, then go back seven days and pick the minimum result_value along with its corresponding result_source and result_source_cdc1.
I was able to come up with a query like this, but do you have a better way of doing this? 
demo@ORA11G>
demo@ORA11G> select hyb_project_key, samp_cnfg_key , sample_result_key,
  2          max(result_date) result_date,
  3          min(case when result_value > 0 then result_value end) result_value,
  4          min(result_source) keep(dense_rank first order by case when result_value > 0 then result_value end asc nulls last) result_source,
  5          min(result_source_cdc1) keep(dense_rank first order by case when result_value > 0 then result_value end asc nulls last) result_source_cdc1
  6  from (
  7  select hyb_project_key, samp_cnfg_key,
  8          sample_result_key,result_date, result_value,rnk,
  9          result_source,result_source_cdc1
 10  from (
 11  select hyb_project_key, samp_cnfg_key,
 12          sample_result_key,result_date, result_value,rnk,
 13          result_source,result_source_cdc1,nxt_value,
 14          last_value(nxt_value ignore nulls) over(partition by sample_result_key order by rnk) nxt_value2
 15  from (
 16  select hyb_project_key, samp_cnfg_key,
 17          sample_result_key,result_date,
 18          min(case when result_value > 0 then result_value end) result_value,
 19          min(result_source) keep(dense_rank first order by case when result_value > 0 then result_value end nulls last) result_source,
 20          min(result_source_cdc1) keep(dense_rank first order by case when result_value > 0 then result_value end nulls last) result_source_cdc1 ,
 21          row_number() over(partition by sample_result_key order by result_date desc) rnk,
 22          case when row_number() over(partition by sample_result_key order by result_date desc) = 1
 23                and min(case when result_value > 0 then result_value end) is null
 24                then last_value( result_date ) over( partition by sample_result_key order by result_date desc
 25                    range between current row and interval '7' day following) end nxt_value
 26  from t
 27  group by hyb_project_key, samp_cnfg_key,
 28          sample_result_key,result_date
 29  order by result_date desc
 30       )
 31       )
 32  where rnk =1 or nxt_value2 <= result_date
 33       ) t2
 34  group by hyb_project_key, samp_cnfg_key , sample_result_key ;
HYB_PROJECT_KEY SAMP_CNFG_KEY SAMPLE_RESULT_KEY RESULT_DATE RESULT_VALUE R R
--------------- ------------- ----------------- ----------- ------------ - -
             84           114           1365970 17-JUN-2014            8 A A
1 row selected.
   
 
July      13, 2016 - 1:24 pm UTC 
 
I'm not sure I understand the requirement. Based on the example, do you mean:
"Return the maximum result date for each sample. If this has a value of zero, return the lowest (non-zero) result in the seven previous days. Also show the source and CDC that gave this minimum"
?
If so, the following does that:
select hyb_project_key, samp_cnfg_key, sample_result_key, mx,
       min(case when result_value > 0 then result_value end) res_value,
       min(result_source) keep (
         dense_rank first order by case when result_value > 0 then result_value end nulls last
       ) score,
       min(result_source_cdc1) keep (
         dense_rank first order by case when result_value > 0 then result_value end
       ) cdc
from (
  select t.*,
         max(result_date) over (partition by sample_result_key) mx
  from   t
)
where  mx - result_date <= 7
group  by hyb_project_key, samp_cnfg_key, sample_result_key, mx;
HYB_PROJECT_KEY  SAMP_CNFG_KEY  SAMPLE_RESULT_KEY  MX                    RES_VALUE  SCORE  CDC  
84               114            1,365,970          17-JUN-2014 00:00:00  8          A      A       
Better is always a bit subjective. But this is far less code!
Obviously the where clause (mx - result_date <= 7) will need adjusting if there are multiple result keys in the data. 
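The shape of that query (window MAX, filter to the trailing seven days, then aggregate for the minimum non-zero value) can be sanity-checked outside Oracle. A sketch in Python with SQLite using the same sample data: julianday() stands in for Oracle date arithmetic, and ROW_NUMBER() with a "non-zero values first" sort key stands in for KEEP (DENSE_RANK FIRST), which SQLite lacks.

```python
import sqlite3

# Mirrors the answer above: find the partition's latest result_date (mx),
# keep only rows within 7 days of it, then pick the row holding the
# minimum non-zero result_value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table t(sample_result_key int, result_source text,
               result_date text, result_value real);
insert into t values
 (1365970,'A','2014-06-09',7.2), (1365970,'A','2014-06-14',8),
 (1365970,'H','2014-06-01',4),   (1365970,'H','2014-06-03',5),
 (1365970,'H','2014-06-06',4),   (1365970,'H','2014-06-07',3),
 (1365970,'H','2014-06-15',10),  (1365970,'H','2014-06-17',0);
""")
row = conn.execute("""
select sample_result_key, mx, result_value, result_source from (
  select x.*,
         row_number() over (
           partition by sample_result_key
           order by case when result_value > 0 then 0 else 1 end,
                    result_value) rn
  from (
    select t.*,
           max(result_date) over (partition by sample_result_key) mx
    from t
  ) x
  where julianday(mx) - julianday(result_date) <= 7
)
where rn = 1
""").fetchone()
print(row)  # (1365970, '2014-06-17', 8.0, 'A')
```

The seven-day window keeps only the 14-JUN (8), 15-JUN (10) and 17-JUN (0) rows, and the "non-zero first" ordering picks 8 from source 'A', matching the output above.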
 
 
Analytics based on Logical offset
Rajeshwaran, Jeyabal, July      13, 2016 - 2:36 pm UTC
 
 
Thanks, this helps. 
 
On "Analytics based on Logical offset"
Stew Ashton, January   09, 2017 - 11:03 am UTC
 
 
In 12c there is a MATCH_RECOGNIZE solution:
select * from t
match_recognize(
  partition by HYB_PROJECT_KEY, SAMP_CNFG_KEY, sample_result_key
  order by result_date desc
  measures a.RESULT_DATE RESULT_DATE,
    ab.RESULT_SOURCE RESULT_SOURCE, 
    ab.RESULT_SOURCE_CDC1 RESULT_SOURCE_CDC1, 
    ab.RESULT_VALUE_TYPE RESULT_VALUE_TYPE, 
    ab.RESULT_VALUE RESULT_VALUE, 
    ab.IS_ACTIVE IS_ACTIVE
  pattern (^a (b|c)*)
  subset bc = (b, c), ab = (a, b)
  define
    b as a.result_value = 0 and result_date >= a.result_date - 7
      and result_value = min(bc.result_value),
    c as a.result_value = 0 and result_date >= a.result_date - 7
); 
January   10, 2017 - 4:04 am UTC 
 
Love your work :-)