Skip to Main Content

Breadcrumb

May 4th

Question and Answer

Tom Kyte

Thanks for the question, Saminathan.

Asked: March 13, 2004 - 12:46 pm UTC

Last updated: October 08, 2009 - 7:07 am UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

Dear Tom,

Thanks for taking new questions. My question is related to Analytical Functions.

Table Data
==========
The table data(or inner select statement's output) is like below.

UserID RefDate GroupID Flag Activity Count
----- ------------- ----- ------ ------------ ----
U1 6/3/03 2:27 A 0 'D' 1
U1 6/9/03 7:47 B 0 'D' 1

U2 6/2/03 15:34 C 0 'D' 1
U2 6/2/03 15:37 D 0 'D' 1

U3 9/9/02 18:40 C 0 'D' 1
U3 9/9/02 18:45 E 0 'D' 1

U4 10/29/02 18:05 C 0 'R' 1
U4 10/29/02 19:59 E 0 'R' 1
U4 10/29/02 21:23 D 0 'R' 1



I have to group them by Activity and **FIRST** GroupID of the UserID.
U1's First GroupID is A (based on RefDate or sort by refDate).
So U1's both records count (here each records couunt is 1 -- last column) will be reported under GROUPID "A"

GroupID Flag Activity Count
----- ----- --------- -----
A 0 D 2

Expected output for the above table should be..

GroupID Flag Activity Count
----- ----- --------- -----
A 0 D 2
C 0 D 4
C 0 R 3


Is there any Analytical Functions to make this job easier?
Kindly give me some hint on this.



and Tom said...

ops$tkyte@ORA9IR2> select first_value(groupid)
2 over (partition by userid order by refdate) groupid,
3 flag,
4 activity,
5 cnt
6 from t a
7 /

G FLAG A CNT
- ---------- - ----------
A 0 D 1
A 0 D 1
C 0 D 1
C 0 D 1
C 0 D 1
C 0 D 1
C 0 R 1
C 0 R 1
C 0 R 1

9 rows selected.

ops$tkyte@ORA9IR2> select groupid, flag, activity, sum(cnt)
2 from (
3 select first_value(groupid)
4 over (partition by userid order by refdate) groupid,
5 flag,
6 activity,
7 cnt
8 from t a
9 )
10 group by groupid, flag, activity
11 /

G FLAG A SUM(CNT)
- ---------- - ----------
A 0 D 2
C 0 D 4
C 0 R 3



Rating

  (21 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

OK

Jacob, March 14, 2004 - 7:15 am UTC

Dear Sir,
Could you please specify a simple example for "How lag or
lead function can be used to replace a * self join*?Please
do reply.
Bye!


Tom Kyte
March 14, 2004 - 9:58 am UTC

report requirement:

show employees sorted by hiredate and display the time between the current employees hiredate and the prior/next hiredates by deptno (or think of an audit trail for example -- show the period of times between accesses):





scott@ORA9IR2> select deptno,
2 ename,
3 hiredate,
4 hiredate-lag(hiredate)
5 over (partition by deptno order by hiredate) last_hire,
6 lead(hiredate) over (partition by deptno
7 order by hiredate) - hiredate next_hire
8 from emp
9 order by deptno, hiredate
10 /

DEPTNO ENAME HIREDATE LAST_HIRE NEXT_HIRE
---------- ---------- --------- ---------- ----------
10 CLARK 09-JUN-81 161
10 KING 17-NOV-81 161 67
10 MILLER 23-JAN-82 67
20 SMITH 17-DEC-80 106
20 JONES 02-APR-81 106 245
20 FORD 03-DEC-81 245 371
20 SCOTT 09-DEC-82 371 34
20 ADAMS 12-JAN-83 34
30 ALLEN 20-FEB-81 2
30 WARD 22-FEB-81 2 68
30 BLAKE 01-MAY-81 68 130
30 TURNER 08-SEP-81 130 20
30 MARTIN 28-SEP-81 20 66
30 JAMES 03-DEC-81 66

14 rows selected.

scott@ORA9IR2>
scott@ORA9IR2> select deptno,
2 ename,
3 hiredate,
4 (select a.hiredate-max(b.hiredate)
5 from emp b
6 where b.deptno = a.deptno
7 and b.hiredate <= a.hiredate
8 and b.empno <> a.empno ) last_hire,
9 (select min(b.hiredate)-a.hiredate
10 from emp b
11 where b.deptno = a.deptno
12 and b.hiredate >= a.hiredate
13 and b.empno <> a.empno ) next_hire
14 from emp a
15 order by deptno, hiredate
16 /

DEPTNO ENAME HIREDATE LAST_HIRE NEXT_HIRE
---------- ---------- --------- ---------- ----------
10 CLARK 09-JUN-81 161
10 KING 17-NOV-81 161 67
10 MILLER 23-JAN-82 67
20 SMITH 17-DEC-80 106
20 JONES 02-APR-81 106 245
20 FORD 03-DEC-81 245 371
20 SCOTT 09-DEC-82 371 34
20 ADAMS 12-JAN-83 34
30 ALLEN 20-FEB-81 2
30 WARD 22-FEB-81 2 68
30 BLAKE 01-MAY-81 68 130
30 TURNER 08-SEP-81 130 20
30 MARTIN 28-SEP-81 20 66
30 JAMES 03-DEC-81 66

14 rows selected.

scott@ORA9IR2>



Excellent Tom!!!

Sami, March 15, 2004 - 7:01 am UTC


Sequence Number reset - based on category

Sami, March 23, 2004 - 12:09 pm UTC

Dear Tom,

Table Data is like below

UserID RefDate
----- -------------
U1 6/3/03 2:27
U1 8/9/03 7:47
U1 3/14/03 7:47
U1 6/17/03 7:47

U2 6/2/03 15:34
U2 9/2/03 15:37

U3 8/9/02 18:40
U3 9/9/02 18:45

U4 10/29/02 18:05
U4 10/29/02 19:59
U4 10/29/02 21:23


I need fetch the above data and insert into another temporary reporting table like below.

Seq# UserID RefDate
---- ----- -------------
1 U1 6/3/03 2:27
2 U1 8/9/03 7:47
3 U1 3/14/03 7:47
4 U1 6/17/03 7:47

1 U2 6/2/03 15:34
2 U2 9/2/03 15:37

1 U3 8/9/02 18:40
2 U3 9/9/02 18:45

1 U4 10/29/02 18:05
2 U4 10/29/02 19:59
3 U4 10/29/02 21:23

Is there any Analytical Function can be used to achive this?


Tom Kyte
March 23, 2004 - 5:26 pm UTC

row_number() over (partition by userid order by refdate)

Can this be re written using Analyical functions?

Miguel, June 01, 2004 - 2:55 pm UTC

SELECT a.tcode
,a.sequence
,a.mtype
,a.mcode
,CASE WHEN a.mcode > 0
THEN NVL(SUBSTR(LTRIM(RTRIM(b.FstNa) || ' ' || LTRIM(RTRIM(b.LstNa))),1,45),' ')
ELSE NVL(SUBSTR(LTRIM(RTRIM(c.FstNa) || ' ' || LTRIM(RTRIM(c.LstNa))),1,45),' ')
END as Fullname
,CASE WHEN a.mcode > 0
THEN NVL(SUBSTR(LTRIM(b.FstNa),1,30),' ')
ELSE NVL(SUBSTR(LTRIM(c.FstNa),1,30),' ')
END as FstNa
,CASE WHEN a.mcode > 0
THEN NVL(SUBSTR(LTRIM(b.LstNa),1,30),' ')
ELSE NVL(SUBSTR(LTRIM(c.LstNa),1,30),' ')
END as LstNa
,NVL((SELECT i.share_
FROM prep_user.tblEntShare i
WHERE a.tcode=i.tcode
AND a.sequence=i.sequence
AND i.ShrTyp = '1'
AND SYSDATE BETWEEN i.EffDt AND i.EndDt),0) as "SHARE"
,NVL((SELECT i.share_
FROM prep_user.tblEntShare i
WHERE a.tcode=i.tcode
AND a.sequence=i.sequence
AND i.ShrTyp = '2'
AND SYSDATE BETWEEN i.EffDt AND i.EndDt),0) as "FSHARE"
,a.pub_type
,a.pay_status
,CASE WHEN a.mcode > 0
THEN SUBSTR(LTRIM(NVL(b.society_code,' ') || NVL(a.EntSocCde,' ')),1,3)
ELSE SUBSTR(LTRIM(NVL(c.society_code,' ') || NVL(a.EntSocCde,' ')),1,3)
END as EntSocCde
,CASE WHEN a.mcode > 0
THEN NVL(b.cae_cde,0)
ELSE NVL(c.cae_cde,0)
END as CaeNr
from prep_user.tblEntitled a
,prep_user.tblMember b
,prep_user.tblEntNonMbr c
where b.mcode(+) = a.mcode
and c.tcode(+) = a.tcode
and c.sequence(+) = a.sequence
/


Tom Kyte
June 01, 2004 - 3:58 pm UTC

where do you think they would apply? you don't seem to be looking at any rows but the current row (no self joins, no correlated subqueries back to the same table).

I don't see them applying here, no.

count of consecutive records with same sessionid and pageid

Sami, July 14, 2004 - 5:10 pm UTC

Dear Tom,

How do I accomplish this using analytical function?

SQL> select * from test;

      SLNO  SESSIONID      PGEID ACCESSTIME
---------- ---------- ---------- ------------------------
         1          1          1 14-JUL-04 04:13:38
         2          1          3 14-JUL-04 04:13:46
         3          1          8 14-JUL-04 04:13:51
         4          1          8 14-JUL-04 04:13:52
         5          1          8 14-JUL-04 04:14:05
         6          1          5 14-JUL-04 04:14:10
         7          1          9 14-JUL-04 04:14:17
         8          1         10 14-JUL-04 04:14:21
         9          2          1 14-JUL-04 04:14:29
        10          3          4 14-JUL-04 04:14:36
        11          1          5 14-JUL-04 04:14:46
        12          4          5 14-JUL-04 04:14:51
        13          2          5 14-JUL-04 04:14:57
        14          2          5 14-JUL-04 04:15:03
        15          7          5 14-JUL-04 04:15:07
        16          7          1 14-JUL-04 04:15:09
        17          7          5 14-JUL-04 04:15:12
        18          8          5 14-JUL-04 04:15:15
        19          1          1 14-JUL-04 04:15:20
        20          1          2 14-JUL-04 04:15:23
        21          3          4 14-JUL-04 04:16:10
        22          1          2 14-JUL-04 04:17:33

22 rows selected.



If sessionid and pageid combination repeats consecutively then the count has to be reported.
For example ,


         3          1          8 14-JUL-04 04:13:51
         4          1          8 14-JUL-04 04:13:52
         5          1          8 14-JUL-04 04:14:05


        13          2          5 14-JUL-04 04:14:57
        14          2          5 14-JUL-04 04:15:03
        
        10          3          4 14-JUL-04 04:14:36          
        21          3          4 14-JUL-04 04:16:10 

<== Between SLNO 10 and 21 there is no sessionid 3 with different PGEID
        
        20          1          2 14-JUL-04 04:15:23
        22          1          2 14-JUL-04 04:17:33

I need an output like below

SESSIONID     PGEID     COUNT
    1      8    3
    2      5    2
    3      4    2
    1      2    2

Tom, your help would be really appreciated. 

Tom Kyte
July 15, 2004 - 11:45 am UTC

no create table. no insert intos. hmmm

</code> https://asktom.oracle.com/Misc/oramag/on-format-negation-and-sliding.html <code>

same solution as the last article in that column.

Tried as you suggested. It works fine

Sami, July 15, 2004 - 3:22 pm UTC

SQL> desc demo
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------

 SLNO                                               NUMBER
 SESSIONID                                          NUMBER
 PGEID                                              NUMBER
 ACCESSTIME                                         DATE


SQL> select * from demo
  2  /

      SLNO  SESSIONID      PGEID ACCESSTIME
---------- ---------- ---------- ------------------------
         1          1          1 14-JUL-04 04:13:38
         2          1          3 14-JUL-04 04:13:46
         3          1          8 14-JUL-04 04:13:51
         4          1          8 14-JUL-04 04:13:52
         5          1          8 14-JUL-04 04:14:05
         6          1          5 14-JUL-04 04:14:10
         7          1          9 14-JUL-04 04:14:17
         8          1         10 14-JUL-04 04:14:21
         9          2          1 14-JUL-04 04:14:29
        10          3          4 14-JUL-04 04:14:36
        11          1          5 14-JUL-04 04:14:46
        12          4          5 14-JUL-04 04:14:51
        13          2          5 14-JUL-04 04:14:57
        14          2          5 14-JUL-04 04:15:03
        15          7          5 14-JUL-04 04:15:07
        16          7          1 14-JUL-04 04:15:09
        17          7          5 14-JUL-04 04:15:12
        18          8          5 14-JUL-04 04:15:15
        19          1          1 14-JUL-04 04:15:20
        20          1          2 14-JUL-04 04:15:23
        21          4          5 15-JUL-04 01:24:06
        22          3          4 15-JUL-04 01:24:34

22 rows selected.

SQL> select slno,
    sessionid,
    pgeid,
    to_number(sessionid||pgeid) sp,
    to_number(lag(sessionid||pgeid) over(order by sessionid,accesstime)) lag,
    to_number(sessionid||pgeid)-to_number(lag(sessionid||pgeid) over(order by sessionid,accesstime)) lagdiff 
    from demo;

      SLNO  SESSIONID      PGEID         SP        LAG    LAGDIFF
---------- ---------- ---------- ---------- ---------- ----------
         1          1          1         11
         2          1          3         13         11          2
         3          1          8         18         13          5
         4          1          8         18         18          0
         5          1          8         18         18          0
         6          1          5         15         18         -3
         7          1          9         19         15          4
         8          1         10        110         19         91
        11          1          5         15        110        -95
        19          1          1         11         15         -4
        20          1          2         12         11          1
         9          2          1         21         12          9
        13          2          5         25         21          4
        14          2          5         25         25          0
        10          3          4         34         25          9
        22          3          4         34         34          0
        12          4          5         45         34         11
        21          4          5         45         45          0
        15          7          5         75         45         30
        16          7          1         71         75         -4
        17          7          5         75         71          4
        18          8          5         85         75         10

22 rows selected.


SQL> select sessionid,
    pgeid,
    count(*) refresh_count 
    from (select     slno,
            sessionid,
            pgeid,
            to_number(sessionid||pgeid) sp,
            to_number(lag(sessionid||pgeid) over(order by sessionid,accesstime)) lag,
            to_number(sessionid||pgeid)-to_number(lag(sessionid||pgeid) over(order by sessionid,accesstime)) lagdiff from demo 
        ) 
    where lagdiff=0 
    group by sessionid,pgeid;

 SESSIONID      PGEID REFRESH_COUNT
---------- ---------- ---------------
         1          8          2
         2          5          1
         3          4          1
         4          5          1

SQL> 

clarification

A reader, April 25, 2005 - 9:20 am UTC

Tom
This is about analytical function.I am sorry if this is considered as a new question..
Say i have a query like

select
name,
ssn,
dob,
first_value(salary) over (partition by name,ssn order by hire_date desc
from sf.emp
where emp_no <>99
and tick =0
and month = 'APRIL'

In the above case will the FIRST_VALUE function use all the rows in sf.emp to compute the salary or only the rows
that pass the conditions mentioned in the where clause
(i.e.,)where emp_no <>99
and tick =0
and month = 'APRIL'..
Thanks


Tom Kyte
April 25, 2005 - 9:40 am UTC

analytics are done AFTER the where clause, they see only what the WHERE CLAUSE sees.

what would you like to retrieve...

Thanks

A reader, April 25, 2005 - 4:19 pm UTC

Thanks a lot for the response

execellent Discussion

A reader, April 24, 2006 - 9:36 am UTC

Tom,

Can I apply analytical function to this query. It is taking
a long time any thoughts?

select all student, duration,
trunc(least(nvl(date_1,to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')))),
nvl(date_3,to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')),
to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS') - trunc(greatest(date_2, to_date('01-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')) +1)
from student_lane
where ( nvl(date_1,to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')) > to_date('01-MAR-2006','DD-MM-YYYY'))
and date_2 < to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')
and ( student_code = 'Q' or (student_code = 'G'
and date_3 is not null))
and (status_4 in ('1','2','3') or status_5 in ('1','2','3'))
and date_6 is null
and school_code = 'WYZ'
order by student, duration



Tom Kyte
April 24, 2006 - 11:28 am UTC

why do you think analytics would apply?

You have a simple "apply this where clause to each row in turn" sort of query here.



Thanks

A reader, April 24, 2006 - 11:30 am UTC

I just thought I would apply since I am trying to make better...it is taking a longg time tho...any thoughts??

Tom Kyte
April 24, 2006 - 1:20 pm UTC

where ( nvl(date_1,to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')) > to_date('01-MAR-2006','DD-MM-YYYY'))

and date_2 < to_date('31-MAR-2006 23:59:59','DD-MON-YYYY HH24:MI:SS')
and ( student_code = 'Q' or (student_code = 'G' and date_3 is not null))
and (status_4 in ('1','2','3') or status_5 in ('1','2','3'))
and date_6 is null
and school_code = 'WYZ'

you would need to help us out here and describe what "might be selective" if anything

follow up...

A reader, April 24, 2006 - 2:52 pm UTC


Tom,

I am not sure what you mean by "what might be selective" but if you are referring to which keys are selective, I would say it is school code and student_code.
Is there any way we can replace the OR? just a thought...


Analytical Function

Maulesh Jani, September 30, 2006 - 6:22 am UTC

Hi ,

I am having below base table and trying to get the result table out of it through analytical function.


Base Table

cust Item Price Promo_nbr seq-nbr Promo-Amt Promo-Perct New-price Promo-calc

001 001 100 01 1 10 0
001 001 100 02 2 5


Result :

cust Item Price promo-nbr seq-nbr Promo-Amt Promo-Perct New-price Promo-calc

001 001 100 01 1 10 0 90 10
001 001 100 02 2 5 4.5 85.5 14.5

I also have to take-care about the seq-nbr , means while calculating for the second Promotion I have to apply it to the deducted Price of first Promo for same Product-customer

Regards
Maulesh

Tom Kyte
September 30, 2006 - 8:09 am UTC

that is a neat result.


I wonder though... what the heck the numbers mean. where they came from. how they were computed. what the LOGIC WAS.

Short of a mind meld - don't know how you would expect anyone to answer this.

oh, and make it not wrap too when you provide somewhat MORE DETAILED SPECS

and don't forget! no create table, no insert into, no look.

Cant figure a way for this?

Sushil, October 01, 2006 - 12:27 am UTC

Tom

Can this be done by a analytic function.
Table
Id Frozen Amount
1 1 10
2 1 20
3 1 30
4 0 40
5 0 50
6 0 60
7 0 70
8 0 80
9 1 90
10 1 100

My Query
select id, amount from t
where id =(select min(id) from t where frozen=0)
union
select id,amount from t
where id =(select min(id)+1 from t where frozen=0)
union
select id,amount from t
where id =(select min(id)+2 from t where frozen=0)

Output Required

Output

Id Amount
4 40
5 50
6 60

Thanks


Tom Kyte
October 01, 2006 - 12:54 am UTC

I'll never understand how people think we are just supposed to look at "here is my table", "here is my report". No, I won't tell you the logic at all - I want you to reverse engineer my query, assume it is correct and figure it all out. Oh, and you create the table and populate it too"

I look at your query and frankly, I cannot tell really what the query is, but it seems "take the min frozen id and find the records that are that id, id+1, id+2, then sort distinct that set"

with data
as
(select min(id) min_id from t where frozen = 0)
select distinct id, amount
from t
where id between min_id and min_id+2;

no clue if the distinct is really needed, but it is there because your example had it

that is why posting the QUESTION - not your attempted solution (along with table creates and inserts - I've no idea if my sql actually compiles!)

Apologies if not clear

Sushil, October 01, 2006 - 1:27 am UTC

Tom
That made you really yell out at this time of night, well apologies if i wasnt clear enough.I dont create the table ,infact i want to achieve this with number of tables i just showed you the example that i created as it was not feasible to post everything here.
This is what i did
SQL> create table t(id number,frozen number,amount number) tablespace rman;

Table created.

SQL> insert into t values (1,1,10);

1 row created.

SQL> insert into t values (2,1,20);

1 row created.

SQL> insert into t values (3,1,30);

1 row created.

SQL> insert into t values (4,0,40);

1 row created.

SQL> insert into t values (5,0,50);

1 row created.

SQL> insert into t values (6,0,60);

1 row created.

SQL> insert into t values (7,0,70);

1 row created.

SQL> insert into t values (8,1,80);

1 row created.

SQL> insert into t values (9,1,90);

1 row created.

SQL> insert into t values (10,0,100);

1 row created.

SQL> commit;

Commit complete.


The column frozen=0 with minimum of id is my base point to start with (and it had thoousands of records not just ten).After this point to next 3 months i want to display the amounts as below.
        ID     AMOUNT
---------- ----------
         4         40
         5         50
         6         60

i achieve this by Union, but just wanted to confirm is its possible by analytics.My eg query never had a distinct its just that table has unique sequence values
And by the way your query does not work .

Thanks for your time.

 

Tom Kyte
October 01, 2006 - 3:04 am UTC

analytics are not "end all be all"

analytics are good to analyze across rows.  You are not doing that.  


ops$tkyte%ORA10GR2> with data
  2  as
  3  (select min(id) min_id from t where frozen = 0)
  4  select distinct id, amount
  5    from t, data
  6   where id between (select min_id from data) and (select min_id+2 from data);

        ID     AMOUNT
---------- ----------
         5         50
         6         60
         4         40

the reason my query did not work - NOTHING TO TEST AGAINST.  That and I wasn't really sure what the heck you were after...  



Your query did have a distinct, UNION implies clearly a DISTINCT, UNION ALL (which is perhaps what you meant to use) does not.

try this:

select * from dual union select * from dual;
select * from dual union ALL select * from dual;

and compare the results...

and the "fact" (which is only now fact since you just now mention it) that ID is unique (I presume, assume, hypothesize that when you say "unique sequence" you were referring to ID because even NOW we see nothing unique in the table by definition....) was not a 'fact' back then.

The union should definitely be a union all therefore.


<b>one thing I thought of on the plane yesterday....

You said:

...
eg query never had a distinct its just that table has unique  sequence values
.....

that makes me worried about your approach, you are going id, id+1, id+2

That - that just likely is WRONG.  It sounds like you want "three records starting with a specific one (based on the max() subquery) and getting the next two after sorting by ID"

That is the problem when the problem statement is not clearly specified, I have to take your query and presume it is correct.

Anyway - sequences ARE NOT GAP FREE, id, id+1, id+2 is probably entirely wrong.


if so, this is PROBABLY what you really want, but - without really clear specifications - it is sort of "hard".  Presumes index on t(frozen,id) and primary key on id...

ops$tkyte%ORA10GR2> select *
  2    from
  3  (select t.*, row_number() over (order by id) rn
  4     from t
  5    where id >= (select min(id) min_id from t where frozen = 0)
  6  )
  7   where rn <= 3
  8  /

        ID     FROZEN     AMOUNT         RN
---------- ---------- ---------- ----------
         4          0         40          1
         5          0         50          2
         6          0         60          3


Execution Plan
----------------------------------------------------------
Plan hash value: 2681708210

------------------------------------------------------------------------
| Id  | Operation                        | Name   | Rows  | Bytes | Cost
------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |        |     1 |    52 |
|*  1 |  VIEW                            |        |     1 |    52 |
|*  2 |   WINDOW NOSORT STOPKEY          |        |     1 |    39 |
|   3 |    TABLE ACCESS BY INDEX ROWID   | T      |     1 |    39 |
|*  4 |     INDEX RANGE SCAN             | T_PK   |     1 |       |
|   5 |      SORT AGGREGATE              |        |     1 |    26 |
|   6 |       FIRST ROW                  |        |     5 |   130 |
|*  7 |        INDEX RANGE SCAN (MIN/MAX)| T_IDX1 |     5 |   130 |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"<=3)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "ID")<=3)
   4 - access("ID">= (SELECT MIN("ID") FROM "T" "T" WHERE "FROZEN"=0))
       filter("ID">= (SELECT MIN("ID") FROM "T" "T" WHERE "FROZEN"=0))
   7 - access("FROZEN"=0)
</b> 

analytics count by diff. group by

karma, June 16, 2007 - 12:12 pm UTC

I need 2 diff counts from same table which is group by diff. columns. Table is huge and i wrote sql that makes 2 passes to the table. Is there more efficient or diff. ways to re-write SQL such that it makes just one pass to the table? Analytics ?
In following example i need to sum counts where i group by first_region_rep_rrep_c and then approved_region_rrep_c. I have 2 queries and then union but would like to avoid that.

SELECT account_num, rep_code, enter_d,
SUM (count_of_trades) AS num_of_sales
FROM (SELECT
account_num,
first_region_rrep_c AS rep_code, enter_d,
COUNT (*) AS count_of_sales
FROM sales_reg aa
WHERE product_c <>'Z##'
AND enter_d >= TO_DATE ('2007-06-14', 'YYYY-MM-DD')
AND enter_d <= TO_DATE ('2007-06-14', 'YYYY-MM-DD')
GROUP BY account_num,
first_region_rrep_c,
enter_d
UNION
SELECT account_num,
approved_region_rrep_c AS rep_code, enter_d,
COUNT (*) AS count_of_sales
FROM sales_reg aa
WHERE product_c <> 'Z##'
AND enter_d >= TO_DATE ('2007-06-14', 'YYYY-MM-DD')
AND enter_d <= TO_DATE ('2007-06-14', 'YYYY-MM-DD')
GROUP BY account_num,
approved_region_rrep_c AS rep_code,
enter_d)
GROUP BY account_num, rep_code, enter_d

strange exception

James Su, July 04, 2008 - 3:52 pm UTC

hi Tom,
I got a very strange exception, please see below.

create table test (a varchar2(10),b varchar2(10),c NUMBER(10));

insert into test values ('1','1',1);

commit;

SQL> select count(*) over (partition by a order by B RANGE between unbounded preceding and 1 preceding) as cnt from test;
select count(*) over (partition by a order by B RANGE between unbounded preceding and 1 preceding) as cnt from test
                                              *
ERROR at line 1:
ORA-00902: invalid datatype

But when I replace column b with c, it works. The difference is their data types. It doesn't work with VARCHAR2.

SQL> select count(*) over (partition by a order by C RANGE between unbounded preceding and 1 preceding) as cnt from test;

       CNT
----------
         0

When I replace RANGE with ROWS, both B and C work fine.

Tom Kyte
July 07, 2008 - 11:40 am UTC

you cannot 'range' on a varchar2

it doesn't make sense, range subtracts values from the current rows values


rows - fine with all types.
range - numbers and dates as you have to be able to add/subtract a number to/from them.

but your windows don't seem to make sense there anyway....

Lag and Lead (analytic function)

Daphnia, September 25, 2009 - 4:38 am UTC

Tom,

I need your help.

Kindly see the below data.
-------------------------------------
REC_ID YEAR_MONTH YR VAL
-------------------------------------
10 200101 2001 null --> ignored
10 200102 2001 null --> ignored
10 200103 2001 2356
10 200104 2001 7898 ----> upper value(uv)
10 200105 2001 null->1+(uv+lv/count)= 7239
10 200106 2001 null->2+(uv+lv/count)= 7240
10 200107 2001 6578 -----> lower value(lv)
10 200108 2001 6878 ----> upper value(uv)
10 200109 2001 null
10 200110 2001 null
10 200111 2001 null
10 200112 2001 8078 -----> lower value(lv)
---------------------------------------
i need to calculate the null values by using the upper value(the value just above the null) and lower value(the value just after the null) and total null count. the logic goes like,n+(uv+lv/count) and fill the result in the null sets.. the count is the inbetween count of the uv and lv.Moreover nulls before the upper value must be ignored for the red_id 10.how can we achieve this? kindly help me. the query which i tried is,

SELECT YEAR_MONTH,REC_ID,val,
last_VALUE(val IGNORE NULLS) over (PARTITION BY REC_ID ORDER BY YEAR_MONTH) last_val,
CASE WHEN (val IS NOT NULL AND previous_value IS NOT NULL AND
next_value IS NULL) THEN 'BEFORE_NULL'
WHEN (val IS NOT NULL AND previous_value IS NULL AND
next_value IS NOT NULL) THEN 'AFTER_NULL'
END AS record_position ,
previous_value,
next_value
FROM (SELECT ROWNUM row_num,
REC_ID,
YEAR_MONTH,
YR,
VAL,
lag(val) over(PARTITION BY REC_ID ORDER BY YEAR_MONTH) previous_value,
lead(val) over(PARTITION BY REC_ID ORDER BY YEAR_MONTH) next_value
FROM TABLE_X
WHERE YR = 2001
AND REC_ID = 10
ORDER BY YEAR_MONTH);

but, how to calculate and update it over there in the null. kindly help me.
---------------------------------------------------------
scripts
---------------------------------------------------------
CREATE TABLE TABLE_X
(REC_ID NUMBER,
YEAR_MONTH NUMBER,
YR NUMBER,
VAL NUMBER);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200101, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200102, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200103, 2001, 7898);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200104, 2001, 2356);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200105, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200106, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200107, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200108, 2001, 6578);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200109, 2001, 6878);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200110, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200111, 2001, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (10, 200112, 2001, 8078);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200202, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200203, 2002, 14225);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200204, 2002, 20554);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200205, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200206, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200207, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200208, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200209, 2002, 52199);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200210, 2002, 58528);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200211, 2002, null);

insert into TABLE_X (REC_ID, YEAR_MONTH, YR, VAL)
values (20, 200212, 2002, 71186);

--------------------------------------------------------

Awaiting your reply

Tom Kyte
September 29, 2009 - 9:26 am UTC

anytime you see this construct:

SELECT <b>ROWNUM row_num,</b>
  FROM ..
 WHERE ....<b>
 ORDER BY ....</b>



You can be SURE that someone messed up. Rownum is assigned PRIOR to sorting - it would be useless, the construct would always have to be:

select rownum, a.*
  from (select .... from .... where ... order by ....) A


assign rownum post order by.


and when I see a create table like this:

CREATE TABLE TABLE_X
(REC_ID NUMBER,
YEAR_MONTH NUMBER,
YR NUMBER,
VAL NUMBER);



I really sigh, out loud. year_month, yr - what kind of garbage is that!!! should just be a single attribute with a DATE in it!!!!!!


and I cannot begin to tell you how annoying it is to get an example that doesn't MATCH THE TEST CASE

Here is the data you said I would have:


Kindly see the below data.
-------------------------------------
REC_ID    YEAR_MONTH    YR    VAL
-------------------------------------
10    200101           2001    null --> ignored
10    200102           2001    null --> ignored
10    200103           2001    2356
10    200104           2001    7898 ----> upper value(uv)
10    200105           2001    null->1+(uv+lv/count)= 7239
10    200106           2001    null->2+(uv+lv/count)= 7240
10    200107           2001    6578 -----> lower value(lv)
10    200108           2001    6878 ----> upper value(uv)
10    200109           2001    null 
10    200110           2001    null
10    200111           2001    null
10    200112           2001    8078 -----> lower value(lv)
---------------------------------------


and here is what you gave me:

ops$tkyte%ORA11GR2> select * from t order by year_month;

    REC_ID YEAR_MONTH         YR        VAL
---------- ---------- ---------- ----------
        10     200101       2001
        10     200102       2001
        10     200103       2001       7898
        10     200104       2001       2356  <<=== well, i was expecting these
        10     200105       2001                   to be reversed AND ONLY 
        10     200106       2001                   TWO null records...     
        10     200107       2001                   SIGH
        10     200108       2001       6578
        10     200109       2001       6878
        10     200110       2001
        10     200111       2001
        10     200112       2001       8078
        20     200202       2002
        20     200203       2002      14225
        20     200204       2002      20554
        20     200205       2002
        20     200206       2002
        20     200207       2002
        20     200208       2002
        20     200209       2002      52199
        20     200210       2002      58528
        20     200211       2002
        20     200212       2002      71186

23 rows selected.




here you go:

ops$tkyte%ORA11GR2> select rec_id, year_month, val,
  2         to_number(substr(lv,1,10)) rn1,
  3             to_number(substr(fv,1,10)) rn2,
  4             to_number(substr(lv,11)) last_val,
  5             to_number(substr(fv,11)) first_val
  6    from (
  7  select x.*,
  8         last_value( case when val is not null then to_char(rn,'fm0000000000')||val end ignore nulls) over (order by year_month) lv,
  9         first_value( case when val is not null then to_char(rn,'fm0000000000')||val end ignore nulls) over (order by year_month
 10                                                                     rows between current row and unbounded following) fv
 11    from (
 12  select t.*, row_number() over (order by year_month) rn
 13    from t
 14         ) x
 15             )
 16   order by year_month;

    REC_ID YEAR_MONTH        VAL        RN1        RN2   LAST_VAL  FIRST_VAL
---------- ---------- ---------- ---------- ---------- ---------- ----------
        10     200101                                3                  7898
        10     200102                                3                  7898
        10     200103       7898          3          3       7898       7898
        10     200104       2356          4          4       2356       2356
        10     200105                     4          8       2356       6578
        10     200106                     4          8       2356       6578
        10     200107                     4          8       2356       6578
        10     200108       6578          8          8       6578       6578
        10     200109       6878          9          9       6878       6878
        10     200110                     9         12       6878       8078
        10     200111                     9         12       6878       8078
        10     200112       8078         12         12       8078       8078
        20     200202                    12         14       8078      14225
        20     200203      14225         14         14      14225      14225
        20     200204      20554         15         15      20554      20554
        20     200205                    15         20      20554      52199
        20     200206                    15         20      20554      52199
        20     200207                    15         20      20554      52199
        20     200208                    15         20      20554      52199
        20     200209      52199         20         20      52199      52199
        20     200210      58528         21         21      58528      58528
        20     200211                    21         23      58528      71186
        20     200212      71186         23         23      71186      71186

23 rows selected.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> select rec_id, year_month, rn,
  2         case when val is null
  3                  then (rn-rn1) + (last_val+first_val)/(rn2-rn1-1)
  4                          else val
  5                  end new_val,
  6         case when val is null
  7                  then '('||rn||'-'||rn1||') + ('||last_val||'+'||first_val||')/('||rn2||'-'||rn1||'-1)'
  8                  end new_valstr
  9    from (
 10  select rec_id, year_month, val,
 11         rn,
 12         to_number(substr(lv,1,10)) rn1,
 13             to_number(substr(fv,1,10)) rn2,
 14             to_number(substr(lv,11)) last_val,
 15             to_number(substr(fv,11)) first_val
 16    from (
 17  select x.*,
 18         last_value( case when val is not null then to_char(rn,'fm0000000000')||val end ignore nulls) over (order by year_month) lv,
 19         first_value( case when val is not null then to_char(rn,'fm0000000000')||val end ignore nulls) over (order by year_month
 20                                                                     rows between current row and unbounded following) fv
 21    from (
 22  select t.*, row_number() over (order by year_month) rn
 23    from t
 24         ) x
 25             )
 26             )
 27   order by year_month;

    REC_ID YEAR_MONTH         RN    NEW_VAL NEW_VALSTR
---------- ---------- ---------- ---------- ----------------------------------------
        10     200101          1            (1-) + (+7898)/(3--1)
        10     200102          2            (2-) + (+7898)/(3--1)
        10     200103          3       7898
        10     200104          4       2356
        10     200105          5       2979 (5-4) + (2356+6578)/(8-4-1)
        10     200106          6       2980 (6-4) + (2356+6578)/(8-4-1)
        10     200107          7       2981 (7-4) + (2356+6578)/(8-4-1)
        10     200108          8       6578
        10     200109          9       6878
        10     200110         10       7479 (10-9) + (6878+8078)/(12-9-1)
        10     200111         11       7480 (11-9) + (6878+8078)/(12-9-1)
        10     200112         12       8078
...


Lag/Lead

Daphnia, September 30, 2009 - 1:16 am UTC

Hi Tom,

Thank you so much for your help. I tried the below query. Just wanted to know whether this query is right ?

SELECT row_num,
rec_id,
YEAR_MONTH,
YR,
VAL,
upper_val,
lower_val,
row_number() over(PARTITION BY upper_val, lower_val ORDER BY YEAR_MONTH) inbet_count,
COUNT(nvl(VAL, 1)) over(PARTITION BY upper_val, lower_val) tot_inbet_count
FROM (SELECT rownum row_num,
rec_id,
YEAR_MONTH,
YR,
VAL,
last_value(VAL ignore NULLS) over(PARTITION BY rec_id ORDER BY YEAR_MONTH) upper_val,
last_value(VAL ignore NULLS) over(PARTITION BY rec_id ORDER BY YEAR_MONTH DESC) lower_val
FROM table_x
WHERE YR = 2001
);
Tom Kyte
October 07, 2009 - 7:55 am UTC

you tell me?

does it give you the right answer? does it answer the question you are asking?

I mean - you should be able to verbalize - in a descriptive form - exactly what this SQL is doing (if you cannot, stop everything until you can - you must be able to do that)


Not sure what your column row_num buys you - it is basically a randomly assigned sequential number in your example. If you run that query twice - you could have different row_num values assigned to your rows each time, It doesn't make sense to me.


Lag/Lead

Daphnia, October 01, 2009 - 6:49 am UTC

Hi Tom,

I used your query to validate and update records(approx. 903,234 records) in the table using a procedure. An outline of the procedure is shown below.
-----------------------------------------------------
create or replace procedure proc is

your query..
begin

for loop
update..
end loop

end proc
---------------------------------------------------------
unable to succeed as the cpu limits exeeds and out of 903,234 , only 34,000 records has been crossed.
any suggestions. Will bulk collect help? i need to ignore exception and carry out the updation for all the records, will bulk collect be useful in this case?
Tom Kyte
October 07, 2009 - 9:16 am UTC

what have YOU done to determine where the bottleneck is?

have you even run the query? to see if it completes in your lifetime?

have you traced it?

yes, bulk processing is going to help, in fact, a single statement is probably all you need, you can probably do this with just a single MERGE and no procedural code at all, but you don't supply any details here

Lag and Lead (analytic function)

Daphnia, October 06, 2009 - 7:10 am UTC

Hi Tom,

With regard to my above query, i need the below output with few more modifications. Please help me.
I have the below data in my table.

txID txmon_year txyear
----- ------------- --------
10 200101
10 200102
10 200103 7898
10 200201 2356
10 200203
10 200301
10 200302
10 200303 6578
10 200401 6878
10 200402
10 200403
10 200405 8078
20 200112
40 200101
40 200102 71186
40 200103 77515
40 200201
40 200202
40 200203 96502

And i need the below output.
where uv = upper value
lv = lower value
tot_cnt = total count
bet_cnt = inbetween count.


txID txmon_year txyear uv lv tot_cnt bet_cnt
---- ---------- ------ --- --- ------- -------
10 200101 ignore
10 200102 ignore
10 200103 7898
10 200201 2356
10 200203 null 2356 6578 3 1
10 200301 null 2356 6578 3 2
10 200302 null 2356 6578 3 3
10 200303 6578
10 200401 6878
10 200402 null 6878 8078 2 1
10 200403 null 6878 8078 2 2
10 200405 8078
20 200112 ignore
40 200101 ignore
40 200102 71186
40 200103 77515
40 200201 null 77515 96502 2 1
40 200202 null 77515 96502 2 2
40 200203 96502
please help me to get the above output.



Tom Kyte
October 08, 2009 - 7:07 am UTC

no create
no insert

I'm not even going to bother looking, we've been here before.


I've given you a template, an approach - a method. I suggest (strongly) that you

a) understand that approach so...
b) you can modify that approach and ...
c) apply it to other similar problems (such as this)

you can do this, just give it some thought.

Useful info

Prabhat, October 15, 2009 - 6:17 am UTC

Thanks,
Tom this topic was very helpful for me as i was also looking for some help on analytical function and i got this.

Thanks alot for your support.

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.